Modelling Intercetps and Slopes
We conclude our analysis of the Snijders and Bosker data by letting the intercept and slopes depend on a school-level predictor. Here are the data one last time, this time read from a Stata file. We center verbal IQ.
. use http://data.princeton.edu/pop510/snijders, clear
(Scores in language test from Snijders and Noskers (1999))
. sum iq_verb
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
iq_verb | 2287 11.83406 2.06889 4 18
. gen iqvc = iq_verb - r(mean)
A Level-2 Predictor: School SES
A useful exploratory step is to plot the estimated intercept and slope residuals obtained in the last model against potential predictors at the school level.
An obvious candidate is school SES. As usual we will center this variable. We also store the school-level standard deviation for later use. In order to let both the intercept and IQ slope depend on school SES we need to create a cross-level interaction:
. bysort schoolnr: gen first = _n==1
. sum schoolses if first // just one observation per school!
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
schoolses | 131 18.40458 4.433749 10 29
. gen sesc = schoolses - r(mean)
. scalar sesc_sd = r(sd)
. gen iqvcXsesc = iqvc * sesc
We are now ready to fit the model. We specify all the fixed effects first and then the random part, which consists of just the constant and the slope for verbal IQ:
. xtmixed langpost iqvc sesc iqvcXsesc || schoolnr: iqvc ///
> , mle covariance(unstructured)
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0: log likelihood = -7608.5763
Iteration 1: log likelihood = -7607.8425
Iteration 2: log likelihood = -7607.8383
Iteration 3: log likelihood = -7607.8383
Computing standard errors:
Mixed-effects ML regression Number of obs = 2287
Group variable: schoolnr Number of groups = 131
Obs per group: min = 4
avg = 17.5
max = 35
Wald chi2(3) = 1059.58
Log likelihood = -7607.8383 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
langpost | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
iqvc | 2.515011 .0789096 31.87 0.000 2.360351 2.669671
sesc | .2410466 .0658997 3.66 0.000 .1118856 .3702077
iqvcXsesc | -.0470625 .0174818 -2.69 0.007 -.0813263 -.0127988
_cons | 40.71326 .2910762 139.87 0.000 40.14276 41.28376
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Unstructured |
sd(iqvc) | .3859035 .1208045 .2089373 .7127571
sd(_cons) | 2.86771 .2384242 2.436495 3.375242
corr(iqvc,_cons) | -.8008473 .2294731 -.9821521 .1519006
-----------------------------+------------------------------------------------
sd(Residual) | 6.443356 .1005435 6.249277 6.643462
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 215.75 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.
We find that the relationship between language scores and verbal IQ varies substantially from school to school, depending on the school's SES and unobserved factors.
For the average school at mean SES, the mean language score for a child with average verbal IQ is 40.71 and increases an average of 2.52 points per point (or 5.20 points per standard deviation) of verbal IQ.
The expected score for an average kid increases 0.24 points per point (or about one point per standard deviation) of school SES, but this effect has a standard deviation across schools of 2.87 points, so there remain substantial unobserved school effects.
The average gain in language scores per point of verbal IQ decreases 0.05 points per point (or about 0.2 points per standard deviation) of school SES. This effect has a standard deviation of 0.39 across schools. Comparatively speaking, SES explains a bigger share of the variation in slopes than in intercepts.
Finally, we note that the random intercept and slope have a negative correlation of -0.80. This means that schools than tend to show higher language scores for average kids also tend to show smaller gains in language scores per point of IQ.
Next we will compute the predicted random effects and regression lines and plot the results:
. predict yhat3, fitted . sort schoolnr iqvc . line yhat3 iqvc, connect(ascending) /// > title(Random Coefficient Model with SES) /// > xtitle(Verbal IQ (centered)) ytitle(Language Score) . graph export fig1lang3.png, width(400) replace (file fig1lang3.png written in PNG format)

A better way to compare the effects of school SES and unobserved school characteristics is to plot the regression lines for 4 schools which represent different scenarios. We set two to have school SES one standard deviation above and one standard deviation below the mean. We then combine these with school effects on the intercept and slope also set one standard deviation above and below the mean.
Because of the high negative correlation between the slope and intercept, it makes more sense to combine a positive intercept residual with a negative slope residual, reflecting a "better" school where scores are higher and less dependent on verbal IQ. The contrast is provided by a negative intercept residual combined with a positive slope residual, reflecting a "worse" school where scores are lower and more dependent on verbal IQ.
Here are the four scenarios, using the standard deviations saved earlier and the parameter estimates from the last run.
. scalar a0 = _b[_cons] . scalar b0 = _b[iqvc] . scalar af = _b[sesc] * sesc_sd . scalar bf = _b[iqvcXsesc] * sesc_sd . scalar ar = exp(_b[lns1_1_2:_cons]) . scalar br = -exp(_b[lns1_1_1:_cons]) // negative sign to pair +ve with -ne . capture drop arit* // to prevent ambiguity with ar . local range range(-7.8 6.1) . twoway /// > function y=(a0+af+ar) + (b0+bf+br)*x, lpat(solid) lc(green) `range' /// > || function y=(a0+af-ar) + (b0+bf-br)*x, lpat(solid) lc(red) `range' /// > || function y=(a0-af+ar) + (b0-bf+br)*x, lpat(dash) lc(green) `range' /// > || function y=(a0-af-ar) + (b0-bf-br)*x, lpat(dash) lc(red) `range' /// > , title(Predicted Regression Lines Given School SES and RE) /// > xtitle(verbal IQ (centered)) ytitle(Language score) /// > legend(order(1 "+ses +re" 2 "+ses -re" 3 "-ses +re" 4 "-ses -re")) . graph export fig2lang3.png, width(400) replace (file fig2lang3.png written in PNG format)

Compare first the top two lines, which are schools with SES one sd above (solid) and below (dashed) the mean, which happen to have favorable unobserved characteristics leading to higher scores and a shallower slope. We see differences in language scores by school SES at low verbal IQs, but these become smaller at higher verbal IQs and vanish at the top end.
Compare now the bottom two lines, which are schools with SES one sd above (solid) and below (dashed) the mean, which happen to have adverse unobserved characteristics leading to lower scores and a steeper slope. We see essentially the same story, with differences by school SES declining with child's verbal IQ.
Compare next the two solid lines, which represent schools with high SES with favorable (green) or adverse (red) conditions. The differences in language scores are larger than the differences by SES, and although they decline with verbal IQ, they never dissappear.
The same is true for the two dashed lines, which correspond to schools with low SES and favorable or adverse unobserved characteristics, which tell the same story but with generally lwoer language scores.
Graphs like these are essential to sort out the effects of observed and unobserved characteristics. They are a key component of my analysis of infant and child mortality in Kenya in the multilevel handbook.

