TY - JOUR
T1 - Noncanonical links in generalized linear models - when is the effort justified?
AU - Czado, Claudia
AU - Munk, Axel
N1 - Funding Information:
Parts of this paper were written while C. Czado was visiting the Sonderforschungsbereich 386 ‘Statistische Analyse diskreter Strukturen’ at the Ludwig-Maximilians-Universität München, Germany. The authors would like to thank L. Fahrmeir and the LMU München for their hospitality. C. Czado was supported by research grant OGP0089858 of the Natural Sciences and Engineering Research Council of Canada. The work of A. Munk was partially supported by the Deutsche Forschungsgemeinschaft.
PY - 2000/6/1
Y1 - 2000/6/1
N2 - Generalized linear models (GLMs) allow for a wide range of statistical models for regression data. In particular, the logistic model is usually applied for binomial observations. Canonical links for GLMs, such as the logit link in the binomial case, are often used because in this case minimal sufficient statistics for the regression parameter exist which allow for simple interpretation of the results. However, in some applications, the overall fit as measured by the p-values of goodness-of-fit statistics (such as the residual deviance) can be improved significantly by the use of a noncanonical link. In this case, the interpretation of the influence of the covariables is more complicated compared to GLMs with canonical link functions. It will be illustrated through simulation that the p-value associated with the common goodness-of-link tests is not appropriate to quantify the changes to mean response estimates and other quantities of interest when switching to a noncanonical link. In particular, the rate of misspecifications becomes considerably large when the inverse information value associated with the underlying parametric link model increases. This shows that the classical tests are often too sensitive, in particular when the number of observations is large. The consideration of a generalized p-value function is proposed instead, which allows the exact quantification of a suitable distance to the canonical model at a controlled error rate. Corresponding tests for validating or discriminating the canonical model can easily be performed by means of this function. Finally, it is indicated how this method can be applied to the problem of overdispersion.
AB - Generalized linear models (GLMs) allow for a wide range of statistical models for regression data. In particular, the logistic model is usually applied for binomial observations. Canonical links for GLMs, such as the logit link in the binomial case, are often used because in this case minimal sufficient statistics for the regression parameter exist which allow for simple interpretation of the results. However, in some applications, the overall fit as measured by the p-values of goodness-of-fit statistics (such as the residual deviance) can be improved significantly by the use of a noncanonical link. In this case, the interpretation of the influence of the covariables is more complicated compared to GLMs with canonical link functions. It will be illustrated through simulation that the p-value associated with the common goodness-of-link tests is not appropriate to quantify the changes to mean response estimates and other quantities of interest when switching to a noncanonical link. In particular, the rate of misspecifications becomes considerably large when the inverse information value associated with the underlying parametric link model increases. This shows that the classical tests are often too sensitive, in particular when the number of observations is large. The consideration of a generalized p-value function is proposed instead, which allows the exact quantification of a suitable distance to the canonical model at a controlled error rate. Corresponding tests for validating or discriminating the canonical model can easily be performed by means of this function. Finally, it is indicated how this method can be applied to the problem of overdispersion.
KW - 62F03
KW - 62F04
KW - 62J12
KW - Generalized linear models
KW - Goodness-of-link tests
KW - Link function
KW - Logistic regression
KW - Model validation and discrimination
KW - Overdispersion
KW - Parametric links
KW - p-value curve
UR - http://www.scopus.com/inward/record.url?scp=0042916377&partnerID=8YFLogxK
U2 - 10.1016/S0378-3758(99)00195-0
DO - 10.1016/S0378-3758(99)00195-0
M3 - Article
AN - SCOPUS:0042916377
SN - 0378-3758
VL - 87
SP - 317
EP - 345
JO - Journal of Statistical Planning and Inference
JF - Journal of Statistical Planning and Inference
IS - 2
ER -