Random-Coefficient Models
We continue our analysis of the Snijders and Bosker data. This time we will consider verbal IQ as a predictor of language scores. Here are the data.
. infile schoolnr pupilnr iq_verb iq_perf sex minority repeatgr /// > aritpret classnr aritpost langpret langpost ses denomina schoolses /// > satiprin natitest meetings currmeet mixedgra percmino aritdiff /// > homework classsiz groupsiz using /// > http://data.princeton.edu/wws509/datasets/snijders.dat (2287 observations read)
To simplify interpretation we will center verbal IQ on the overall mean. We also compute the number of observations per schol and create a flag for the first observation in each school (as before).
. sum iq_verb
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
iq_verb | 2287 11.83406 2.06889 4 18
. gen iqvc = iq_verb-r(mean)
. egen sn = count(langpost), by(schoolnr)
. bysort schoolnr : gen first=_n==1
School Regressions
Our first step wll be to run a separate regression for each school.
We take advantage of Stata's stabsby command to run the
regression for each school and save the intercept and slope as
new variables sa and sb in a new Stata
dataset called "ols". We then merge that dataset with the current data
and plot the regression lines.
(For an alternative way of doing this sort of thing using looping
see
my panel data log.)
. statsby sa=_b[_cons] sb=_b[iqvc], by(schoolnr) saving(ols, replace) ///
> : regress langpost iqvc
(running regress on estimation sample)
command: regress langpost iqvc
sa: _b[_cons]
sb: _b[iqvc]
by: schoolnr
Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
...............................
. sort schoolnr
. merge schoolnr using ols
variable schoolnr does not uniquely identify observations in the master data
. drop _merge
. gen yhat = sa+sb*iqvc
. sort schoolnr iqvc
. line yhat iqvc, connect(ascending) title(School Regressions)
. graph export fig1lang2.png, width(400) replace
(file fig1lang2.png written in PNG format)
We will compare these lines with the Bayesian estimates based on random intercept and random slope models.
Random Intercepts
We now consider a model where each school has its onw intercept but these
are drawn from a normal distribution with mean a
and standard deviation sa.
This model can be fit using xtreg, but we will use
xtmixed so we can get BLUPS.
. xtmixed langpost iqvc || schoolnr: , mle
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0: log likelihood = -7625.8865
Iteration 1: log likelihood = -7625.8865
Computing standard errors:
Mixed-effects ML regression Number of obs = 2287
Group variable: schoolnr Number of groups = 131
Obs per group: min = 4
avg = 17.5
max = 35
Wald chi2(1) = 1261.42
Log likelihood = -7625.8865 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
langpost | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
iqvc | 2.488094 .0700548 35.52 0.000 2.350789 2.625399
_cons | 40.60937 .3068551 132.34 0.000 40.00794 41.21079
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Identity |
sd(_cons) | 3.081719 .2552304 2.619967 3.624851
-----------------------------+------------------------------------------------
sd(Residual) | 6.498244 .0991428 6.306804 6.695495
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 225.92 Prob >= chibar2 = 0.0000
The expected language score for a kid with average verbal IQ averages 4.6 across all schools, but shows substantial variation, with a standard deviation of 3.1. The common slope is estimated as a gain of 2.5 points in language score per point of verbal IQ. (The standard deviation of verbal IQ is 2.5, so one standard deviation of verbal IQ is associated to 5.1 points in language score. This gives us a good idea of the relative importance of observed nd unobserved effects.)
Next we use predict to compute fitted lines and
estimate the random effects
. predict yhat1, fitted // y-hat for model 1
. predict ra1, reffects // residual intercept for model 1
. gen check = (_b[_cons]+ra1) + _b[iqvc]*iqvc
. list yhat1 check in 1/10
+---------------------+
| yhat1 check |
|---------------------|
1. | 20.74202 20.74202 |
2. | 24.47416 24.47416 |
3. | 26.96226 26.96226 |
4. | 30.6944 30.6944 |
5. | 33.1825 33.1825 |
|---------------------|
6. | 34.42654 34.42654 |
7. | 34.42654 34.42654 |
8. | 34.42654 34.42654 |
9. | 34.42654 34.42654 |
10. | 35.67059 35.67059 |
+---------------------+
We used the fitted option of predict to obtain fitted values
given the random effects, and the reffects option to get BLUPS.
As a check we verify that we can reproduce the fitted values using the
fixed and random coefficients.
Next we plot the fited regression lines and the two estimates of the school intercepts
. line yhat1 iqvc, connect(ascending) title(Random Intercept)
. graph export fig2lang2.png, width(400) replace
(file fig2lang2.png written in PNG format)
. gen sa1 = _b[_cons]+ra1 // school intercept based no model 1
. twoway (scatter sa1 sa ) (function y=x, range(16 48)) ///
> , legend(off) ytitle(BLUP) xtitle(ML) title(School Intercepts)
. graph export fig3lang2.png, width(400) replace
(file fig3lang2.png written in PNG format)
. list schoolnr sn if sa < 30 & first
+---------------+
| schoolnr sn |
|---------------|
32. | 2 7 |
337. | 47 8 |
841. | 103 4 |
2282. | 258 7 |
+---------------+
We see that all the lines are parallel as one would expect. We also note that the intercepts have been shrunk, particularly for the four small schools with very low language scores.
Random Slopes
Our next model treats the intercept and slope as observations from a
bivariate normal distribution with mean a,b
and variance-covariance matrix with elements
s2a,
s2b, and
sab.
To let Stata know that we want the covariance between the intercept and
slope to be estimated we specify covariance(unstructured).
(The default assumes that the covariance is zero.
. xtmixed langpost iqvc || schoolnr: iqvc, mle covariance(unstructured)
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0: log likelihood = -7615.9951
Iteration 1: log likelihood = -7615.3893
Iteration 2: log likelihood = -7615.3887
Computing standard errors:
Mixed-effects ML regression Number of obs = 2287
Group variable: schoolnr Number of groups = 131
Obs per group: min = 4
avg = 17.5
max = 35
Wald chi2(1) = 962.03
Log likelihood = -7615.3887 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
langpost | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
iqvc | 2.52637 .0814522 31.02 0.000 2.366726 2.686013
_cons | 40.70956 .3042423 133.81 0.000 40.11325 41.30586
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Unstructured |
sd(iqvc) | .4583713 .1100965 .286264 .7339526
sd(_cons) | 3.058354 .2491357 2.607043 3.587791
corr(iqvc,_cons) | -.8168636 .1743621 -.9744848 -.1196644
-----------------------------+------------------------------------------------
sd(Residual) | 6.44051 .1004244 6.246659 6.640377
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 246.91 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.
The expected language score for a child with average IQ now averages 40.7 across schools, with a standard dviation of 3.1. The expected gain in language score per point of IQ averages 2.5, almost the same as before, with a standard devaition of 0.48. The intercept and slope have a negative correlation of -0.82 across schools, so schools with higher language scores for a kid ith average verbal IQ tend to show smaller average gains.
The next step is to predict fitted values as well as the random effects. We verify that we can reproduce the fitted values
. predict yhat2, fitted // yhat for model 2
. predict rb2 ra2, reffects //residual slope and intercept for model 2
. capture drop check
. gen check = (_b[_cons]+ra2) + (_b[iqvc]+rb2)*iqvc
. list yhat2 check in 1/10
+---------------------+
| yhat2 check |
|---------------------|
1. | 20.78043 20.78043 |
2. | 24.53934 24.53934 |
3. | 27.04527 27.04527 |
4. | 30.80418 30.80418 |
5. | 33.31012 33.31012 |
|---------------------|
6. | 34.56309 34.56309 |
7. | 34.56309 34.56309 |
8. | 34.56309 34.56309 |
9. | 34.56309 34.56309 |
10. | 35.81606 35.81606 |
+---------------------+
. line yhat2 iqvc, connect(ascending)
. graph export fig4lang2.png, width(400) replace
(note: file fig4lang2.png not found)
(file fig4lang2.png written in PNG format)
The graph of fitted lines shows clearly how school differences are more pronounced at lower than at higher verbal IQs.
Next we comparing the latest Bayesian estimates with the within-school regressions to note the shrinking
. gen sa2 = _b[_cons] + ra2 . gen sb2 = _b[iqvc] + rb2 . twoway scatter sa2 sa , title(Intercepts) name(a, replace) . scatter sb2 sb, title(Slopes) name(b, replace) . graph combine a b, title(Bayesian and Within-School Estimates) /// > xsize(4) ysize(2) . graph export fig5lang2.png, width(400) replace (note: file fig5lang2.png not found) (file fig5lang2.png written in PNG format)
We can also check how much individual schools are affected. Try 40 for little change, 2, 47, 103 or 258 for substantial change:
. gen use = schoolnr==2 . twoway (scatter langpost iqvc if use) /// scatterplot > (line yhat iqvc if use, lpat(dot)) /// within school > (line yhat2 iqvc if use, lpat(dash)) /// mixed model > ( function y=_b[_cons]+_b[iqvc]*x, range(-5 5) ) , /// average > legend(order(2 "within" 3 "EB" 4 "G") cols(1) ring(0) pos(5)) . graph export fig6lang2.png, width(400) replace (note: file fig6lang2.png not found) (file fig6lang2.png written in PNG format)
We continue with a level-2 predictor of intercepts and slopes
