We consider Wald tests and likelihood ratio tests, introducing the deviance statistic.
The Wald test follows immediately from the fact that the information matrix for generalized linear models is given by
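$$\mathcal{I}(\beta) = X'WX/\phi, \tag{B.9}$$
so that in large samples the maximum likelihood estimator $\hat\beta$ is approximately multivariate normal,
$$\hat\beta \sim N_p\big(\beta,\ (X'WX)^{-1}\phi\big). \tag{B.10}$$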
Tests for subsets of $\beta$ are based on the corresponding marginal normal distributions.
Example: In the case of normal errors with identity link we have $W = I$ (where $I$ denotes the identity matrix), $\phi = \sigma^2$, and the exact distribution of $\hat\beta$ is multivariate normal with mean $\beta$ and variance-covariance matrix $(X'X)^{-1}\sigma^2$.
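The normal-errors example can be sketched numerically. The following is a minimal illustration, not part of the original notes: it assumes a simple regression with an intercept and one predictor, treats $\sigma^2$ as known, and uses a hypothetical helper name (`wald_simple_regression`) and made-up data. It forms $(X'X)^{-1}\sigma^2$ directly and divides each estimate by its standard error to obtain a Wald $z$ statistic.

```python
import math

def wald_simple_regression(x, y, sigma2):
    """Wald z statistics for intercept and slope, assuming W = I and
    a known scale parameter phi = sigma2 (an assumption of this sketch)."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))

    # X'X = [[n, sx], [sx, sxx]]; invert the 2x2 matrix in closed form.
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]

    # beta_hat = (X'X)^{-1} X'y
    beta = [inv[0][0] * sy + inv[0][1] * sxy,
            inv[1][0] * sy + inv[1][1] * sxy]

    # Standard errors from the diagonal of (X'X)^{-1} * sigma2.
    se = [math.sqrt(inv[j][j] * sigma2) for j in range(2)]

    # Wald statistic for H0: beta_j = 0.
    z = [beta[j] / se[j] for j in range(2)]
    return beta, se, z

# Illustrative data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
beta, se, z = wald_simple_regression(x, y, sigma2=1.0)
```

When $\sigma^2$ is estimated rather than known, the same ratio is usually referred to a $t$ distribution instead of the standard normal.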
We will show how the likelihood ratio criterion for comparing any two nested models, say $\omega_1 \subset \omega_2$, can be constructed in terms of a statistic called the deviance and an unknown scale parameter $\phi$.
Consider first comparing a model of interest $\omega$ with a saturated model $\Omega$ that provides a separate parameter for each observation.
Let $\hat\mu_i$ denote the fitted values under $\omega$ and let $\hat\theta_i$ denote the corresponding estimates of the canonical parameters. Similarly, let $\tilde\mu_i = y_i$ and $\tilde\theta_i$ denote the corresponding estimates under $\Omega$.
The likelihood ratio criterion to compare these two models in the exponential family has the form
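$$-2\log\lambda = 2\sum_{i=1}^{n} \frac{y_i(\tilde\theta_i - \hat\theta_i) - b(\tilde\theta_i) + b(\hat\theta_i)}{a_i(\phi)}.$$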
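Assume as usual that $a_i(\phi) = \phi/p_i$ for known prior weights $p_i$. Then we can write the likelihood-ratio criterion as follows:
$$-2\log\lambda = \frac{D(y,\hat\mu)}{\phi}, \tag{B.11}$$
where the numerator is the deviance,
$$D(y,\hat\mu) = 2\sum_{i=1}^{n} p_i\left[\,y_i(\tilde\theta_i - \hat\theta_i) - b(\tilde\theta_i) + b(\hat\theta_i)\right]. \tag{B.12}$$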
The likelihood ratio criterion $-2\log\lambda$ is the deviance divided by the scale parameter $\phi$, and is called the scaled deviance.
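Example: Recall that for the normal distribution we had $\theta_i = \mu_i$, $b(\theta_i) = \tfrac{1}{2}\theta_i^2$, and $a_i(\phi) = \sigma^2$, so the prior weights are $p_i = 1$. Thus, the deviance is
$$\begin{aligned}
D(y,\hat\mu) &= 2\sum_{i=1}^{n}\left\{ y_i(y_i - \hat\mu_i) - \tfrac{1}{2}y_i^2 + \tfrac{1}{2}\hat\mu_i^2 \right\} \\
&= 2\sum_{i=1}^{n}\left\{ \tfrac{1}{2}y_i^2 - y_i\hat\mu_i + \tfrac{1}{2}\hat\mu_i^2 \right\} \\
&= \sum_{i=1}^{n}(y_i - \hat\mu_i)^2,
\end{aligned}$$
the familiar residual sum of squares.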
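As a numerical check on the normal example, here is a minimal sketch that evaluates the deviance as twice the weighted sum of $y_i(\tilde\theta_i - \hat\theta_i) - b(\tilde\theta_i) + b(\hat\theta_i)$ and compares it with the residual sum of squares. The function name `deviance`, its arguments, and the data are illustrative assumptions, not part of the original notes. The caller supplies the map from a mean to the canonical parameter (`theta`) and the cumulant function (`b`).

```python
def deviance(y, mu_hat, theta, b, weights=None):
    """Deviance D(y, mu_hat) for a one-parameter exponential family.

    theta: function mapping a mean to the canonical parameter
    b:     cumulant function b(theta)
    weights: prior weights p_i (defaults to 1 for every observation)
    """
    if weights is None:
        weights = [1.0] * len(y)
    total = 0.0
    for yi, mi, pi in zip(y, mu_hat, weights):
        t_sat = theta(yi)  # saturated model: fitted mean equals y_i
        t_fit = theta(mi)  # model of interest: fitted mean is mu_hat_i
        total += pi * (yi * (t_sat - t_fit) - b(t_sat) + b(t_fit))
    return 2.0 * total

# Normal case: theta = mu and b(theta) = theta^2 / 2.
y = [1.0, 2.0, 4.0]
mu_hat = [1.5, 2.5, 3.0]  # illustrative fitted values
d_normal = deviance(y, mu_hat, theta=lambda m: m, b=lambda t: 0.5 * t * t)

# Residual sum of squares computed directly, for comparison.
rss = sum((yi - mi) ** 2 for yi, mi in zip(y, mu_hat))
```

For these values both `d_normal` and `rss` come out to 1.5, in line with the algebra in the example.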
Let us now return to the comparison of two nested models $\omega_1$, with $p_1$ parameters, and $\omega_2$, with $p_2$ parameters, such that $\omega_1 \subset \omega_2$ and $p_2 > p_1$.
The log of the ratio of maximized likelihoods under the two models can be written as a difference of deviances, since the maximized log-likelihood under the saturated model cancels out. Thus, we have
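$$-2\log\lambda = \frac{D(\omega_1) - D(\omega_2)}{\phi}, \tag{B.13}$$
where $D(\omega_j)$ denotes the deviance of model $\omega_j$.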
Large sample theory tells us that the asymptotic distribution of this criterion under the usual regularity conditions is $\chi^2_\nu$ with $\nu = p_2 - p_1$ degrees of freedom.
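Example: In the linear model with normal errors we estimate the unknown scale parameter $\phi$ using the residual sum of squares of the larger model, so the criterion becomes
$$-2\log\lambda = \frac{\mathrm{RSS}(\omega_1) - \mathrm{RSS}(\omega_2)}{\mathrm{RSS}(\omega_2)/(n - p_2)},$$
where $n$ is the number of observations.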
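The linear-model criterion can be illustrated with a small sketch. It assumes, purely for illustration, that $\omega_1$ is the intercept-only model ($p_1 = 1$) and $\omega_2$ adds a single predictor ($p_2 = 2$); the function names and data are hypothetical. The scale parameter is estimated by the residual sum of squares of the larger model divided by $n - p_2$. The large-sample $\chi^2_1$ tail probability is obtained from the standard identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math

def rss_intercept_only(y):
    """Residual sum of squares for the model with a constant mean."""
    ybar = sum(y) / len(y)
    return sum((v - ybar) ** 2 for v in y)

def rss_simple_regression(x, y):
    """Residual sum of squares for least squares with intercept and slope."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

def lr_criterion(rss1, rss2, n, p2):
    """Difference of deviances divided by the estimated scale parameter."""
    phi_hat = rss2 / (n - p2)  # scale estimated from the larger model
    return (rss1 - rss2) / phi_hat

def chi2_sf_1df(x):
    """Upper tail probability of a chi-squared variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))

# Illustrative data, not taken from the notes.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.5, 5.5, 6.5, 9.5]
rss1 = rss_intercept_only(y)         # smaller model, p1 = 1
rss2 = rss_simple_regression(x, y)   # larger model, p2 = 2
crit = lr_criterion(rss1, rss2, n=len(y), p2=2)
p_value = chi2_sf_1df(crit)          # nu = p2 - p1 = 1
```

With normal errors and an estimated scale, dividing the criterion by $p_2 - p_1$ gives the usual $F$ statistic with $p_2 - p_1$ and $n - p_2$ degrees of freedom, which is preferable to the $\chi^2$ approximation in a sample as small as this one.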
In Sections B.4 and B.5 we will construct likelihood ratio tests for binomial and Poisson data. In those cases $\phi = 1$ (unless one allows over-dispersion and estimates $\phi$, but that's another story) and the deviance is the same as the scaled deviance. All our tests will be based on asymptotic $\chi^2$ statistics.
Copyright © Germán Rodríguez, 1993-2000.
Please send feedback to grodri@princeton.edu
Conversion from LaTeX was done using TTH, version 2.34.