![]() |
|
![]() | ||
|
|
||||
We use data from Long (1990) on the number of publications produced by Ph.D. biochemists.
The variables in the dataset are
art: articles in last three years of Ph.D.
fem: coded one for females
mar: coded one if married
kid5: number of children under age six
phd: prestige of Ph.D. program
ment: articles by mentor in last three years
These data have also been analyzed by Long and Freese (2001), and are available from the Stata website:
. use http://www.stata-press.com/data/lf2/couart2
(Academic Biochemists / S Long)
. summarize art
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
art | 915 1.692896 1.926069 0 19
The mean number of articles is 1.69 and the variance is 3.71, a bit more than twice the mean. The data are over-dispersed, but of course we haven't considered any covariates yet.
Let us fit the model used by Long and Freese(2001), a simple
additive model using all five predictors.
We could use poisson to obtain the estimates and
then estat goff to obtain the deviance.
. poisson art fem mar kid5 phd ment
Iteration 0: log likelihood = -1651.4574
Iteration 1: log likelihood = -1651.0567
Iteration 2: log likelihood = -1651.0563
Iteration 3: log likelihood = -1651.0563
Poisson regression Number of obs = 915
LR chi2(5) = 183.03
Prob > chi2 = 0.0000
Log likelihood = -1651.0563 Pseudo R2 = 0.0525
------------------------------------------------------------------------------
art | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fem | -.2245942 .0546138 -4.11 0.000 -.3316352 -.1175532
mar | .1552434 .0613747 2.53 0.011 .0349512 .2755356
kid5 | -.1848827 .0401272 -4.61 0.000 -.2635305 -.1062349
phd | .0128226 .0263972 0.49 0.627 -.038915 .0645601
ment | .0255427 .0020061 12.73 0.000 .0216109 .0294746
_cons | .3046168 .1029822 2.96 0.003 .1027755 .5064581
------------------------------------------------------------------------------
. estat gof
Goodness-of-fit chi2 = 1634.371
Prob > chi2(909) = 0.0000
Alternatively, we can use glm, which gives us
not only the estimates and the deviance but Pearson's chi-squared
statistic as well
. glm art fem mar kid5 phd ment, family(poisson)
Iteration 0: log likelihood = -1670.3221
Iteration 1: log likelihood = -1651.1048
Iteration 2: log likelihood = -1651.0563
Iteration 3: log likelihood = -1651.0563
Generalized linear models No. of obs = 915
Optimization : ML Residual df = 909
Scale parameter = 1
Deviance = 1634.370984 (1/df) Deviance = 1.797988
Pearson = 1662.54655 (1/df) Pearson = 1.828984
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
AIC = 3.621981
Log likelihood = -1651.056316 BIC = -4564.031
------------------------------------------------------------------------------
| OIM
art | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fem | -.2245942 .0546138 -4.11 0.000 -.3316352 -.1175532
mar | .1552434 .0613747 2.53 0.011 .0349512 .2755356
kid5 | -.1848827 .0401272 -4.61 0.000 -.2635305 -.1062349
phd | .0128226 .0263972 0.49 0.627 -.038915 .0645601
ment | .0255427 .0020061 12.73 0.000 .0216109 .0294746
_cons | .3046168 .1029822 2.96 0.003 .1027755 .5064581
------------------------------------------------------------------------------
We see that the model obviously doesn't fit the data. The five-percent critical value for a chi-squared with 909 d.f. is
. display invchi2tail(909,0.05) 980.25178
and both the deviance and Pearson's chi-squared are in the 1600's.
Before we proceed further we will store the coefficients and their variances in matrices for later use:
. mat bp = e(b) . mat vp = vecdiag(e(V))
We now assume that the variance is proportional to the mean and estimate the scale parameter by dividing Pearson's chi-squared by its d.f. Instead of copying numbers from the output we use the stored results
. di e(deviance_p)/e(df) 1.8289841
The variance is about 83% larger than the mean. We can adjust
the standard errors multiplying by 1.35, the square root of the
scale parameter. We can do this with the scale
option of the glm command, which takes either a
numeric value (in our case 1.828984) or simply x2
to indicate that the adjustment should be based on Pearson's
chi-squared.
. glm art fem mar kid5 phd ment, family(poisson) scale(x2)
Iteration 0: log likelihood = -1670.3221
Iteration 1: log likelihood = -1651.1048
Iteration 2: log likelihood = -1651.0563
Iteration 3: log likelihood = -1651.0563
Generalized linear models No. of obs = 915
Optimization : ML Residual df = 909
Scale parameter = 1
Deviance = 1634.370984 (1/df) Deviance = 1.797988
Pearson = 1662.54655 (1/df) Pearson = 1.828984
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
AIC = 3.621981
Log likelihood = -1651.056316 BIC = -4564.031
------------------------------------------------------------------------------
| OIM
art | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fem | -.2245942 .0738596 -3.04 0.002 -.3693564 -.079832
mar | .1552434 .0830031 1.87 0.061 -.0074397 .3179265
kid5 | -.1848827 .054268 -3.41 0.001 -.291246 -.0785194
phd | .0128226 .0356995 0.36 0.719 -.0571472 .0827924
ment | .0255427 .002713 9.41 0.000 .0202253 .0308602
_cons | .3046168 .139273 2.19 0.029 .0316468 .5775869
------------------------------------------------------------------------------
(Standard errors scaled using square root of Pearson X2-based dispersion)
You can verify that these standard errors are about 35% larger than before. We will save the variances in a matrix for later use
. mat vo = vecdiag(e(V))
Let us now fit a negative binomial model with the same predictors:
. nbreg art fem mar kid5 phd ment
Fitting Poisson model:
Iteration 0: log likelihood = -1651.4574
Iteration 1: log likelihood = -1651.0567
Iteration 2: log likelihood = -1651.0563
Iteration 3: log likelihood = -1651.0563
Fitting constant-only model:
Iteration 0: log likelihood = -1625.4242
Iteration 1: log likelihood = -1609.9746
Iteration 2: log likelihood = -1609.9368
Iteration 3: log likelihood = -1609.9367
Fitting full model:
Iteration 0: log likelihood = -1565.6652
Iteration 1: log likelihood = -1561.0095
Iteration 2: log likelihood = -1560.9583
Iteration 3: log likelihood = -1560.9583
Negative binomial regression Number of obs = 915
LR chi2(5) = 97.96
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -1560.9583 Pseudo R2 = 0.0304
------------------------------------------------------------------------------
art | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fem | -.2164184 .0726724 -2.98 0.003 -.3588537 -.0739832
mar | .1504895 .0821063 1.83 0.067 -.0104359 .3114148
kid5 | -.1764152 .0530598 -3.32 0.001 -.2804105 -.07242
phd | .0152712 .0360396 0.42 0.672 -.0553652 .0859075
ment | .0290823 .0034701 8.38 0.000 .0222811 .0358836
_cons | .256144 .1385604 1.85 0.065 -.0154294 .5277174
-------------+----------------------------------------------------------------
/lnalpha | -.8173044 .1199372 -1.052377 -.5822318
-------------+----------------------------------------------------------------
alpha | .4416205 .0529667 .3491069 .5586502
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 180.20 Prob>=chibar2 = 0.000
Stata's alpha is the variance of the unobservable;
it is estimated to be 0.44 and is highly significant (non zero).
Twice the difference in log-likelihoods between this model and
the (nested) Poisson model gives a likelihood ratio statistic
of 180.2. Stata reports this as chibar2 because
it doesn't have a standard chi-squared distribution with one d.f.
Treating it as a chi-squared one anyway gives a conservative test.
We have overwhelming evidence of overdispersion.
Note that the parameter estimates are not very different from those based on a Poisson regression model. Let us compare them side by side.
. mat coef = e(b)
. mat bnb = coef[1,1..6]
. mat b = bp' , bnb'
. mat list b
b[6,2]
y1 y1
art:fem -.22459423 -.21641842
art:mar .15524338 .15048945
art:kid5 -.1848827 -.17641524
art:phd .01282258 .01527116
art:ment .02554275 .02908234
art:_cons .30461683 .25614402
Both sets of estimates would lead to the same conclusions.
We now compare the variances of the estimated coefficients based on the Poisson, over-dispersed Poisson, and negative binomial models:
. mat d = vecdiag(e(V))
. mat vnb = d[1,1..6]
. mat v = vp' , vo' , vnb'
. mat list v
v[6,3]
r1 r1 r1
art:fem .00298266 .00545524 .00528127
art:mar .00376685 .00688951 .00674144
art:kid5 .00161019 .00294501 .00281534
art:phd .00069681 .00127446 .00129885
art:ment 4.024e-06 7.360e-06 .00001204
art:_cons .01060532 .01939697 .01919898
Note that the Poisson model underestimates standard errors, and that both approaches to over-dispersion lead to very similar estimates.
One way to compute the deviance of the negative binomial model
is to feed the estimate of the variance into glm,
which can estimate these models for a fixed value of the scale
parameter
. glm art fem mar kid5 phd ment, family(nb .4416205)
Iteration 0: log likelihood = -1563.5534
Iteration 1: log likelihood = -1560.9585
Iteration 2: log likelihood = -1560.9583
Iteration 3: log likelihood = -1560.9583
Generalized linear models No. of obs = 915
Optimization : ML Residual df = 909
Scale parameter = 1
Deviance = 1004.281481 (1/df) Deviance = 1.10482
Pearson = 944.5494396 (1/df) Pearson = 1.039108
Variance function: V(u) = u+(.4416205)u^2 [Neg. Binomial]
Link function : g(u) = ln(u) [Log]
AIC = 3.425046
Log likelihood = -1560.958338 BIC = -5194.12
------------------------------------------------------------------------------
| OIM
art | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fem | -.2164184 .0726706 -2.98 0.003 -.3588501 -.0739867
mar | .1504895 .0821062 1.83 0.067 -.0104358 .3114147
kid5 | -.1764152 .0530587 -3.32 0.001 -.2804084 -.0724221
phd | .0152712 .0360382 0.42 0.672 -.0553624 .0859047
ment | .0290823 .0034657 8.39 0.000 .0222896 .0358751
_cons | .256144 .1385256 1.85 0.064 -.0153613 .5276493
------------------------------------------------------------------------------
We see that the negative binomial model fits much better than the Poisson, but still has a deviance (just) above the five percent critical value.
Exercise: Overdispersion may be due to specification errors in the linear predictor. Check to see if we are missing a significant interaction term. Would including that term be enough to get the deviance below the five-percent critical value?