Home | GLMs | Multilevel | Survival | Demography | Stata | R
Home 2 Levels 3 Levels Logit winBUGS MLwiN

Random-Coefficient Models

We continue our analysis of the Snijders and Bosker data. This time we will consider verbal IQ as a predictor of language scores. Here are the data.

. infile schoolnr  pupilnr iq_verb  iq_perf sex  minority  repeatgr ///
>      aritpret  classnr aritpost langpret langpost ses denomina schoolses ///
>      satiprin natitest meetings currmeet mixedgra percmino aritdiff ///
>      homework classsiz groupsiz using ///
>      http://data.princeton.edu/wws509/datasets/snijders.dat
(2287 observations read)

To simplify interpretation we will center verbal IQ on the overall mean. We also compute the number of observations per schol and create a flag for the first observation in each school (as before).

. sum iq_verb
 
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     iq_verb |      2287    11.83406     2.06889          4         18
 
. gen iqvc = iq_verb-r(mean)
 
. egen sn = count(langpost), by(schoolnr)
 
. bysort schoolnr : gen first=_n==1

School Regressions

Our first step wll be to run a separate regression for each school. We take advantage of Stata's stabsby command to run the regression for each school and save the intercept and slope as new variables sa and sb in a new Stata dataset called "ols". We then merge that dataset with the current data and plot the regression lines. (For an alternative way of doing this sort of thing using looping see my panel data log.)

. statsby sa=_b[_cons] sb=_b[iqvc], by(schoolnr) saving(ols, replace) ///
>         : regress langpost iqvc
(running regress on estimation sample)
 
      command:  regress langpost iqvc
           sa:  _b[_cons]
           sb:  _b[iqvc]
           by:  schoolnr
 
Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
..................................................    50
..................................................   100
...............................
 
. sort schoolnr
 
. merge schoolnr using ols
variable schoolnr does not uniquely identify observations in the master data
 
. drop _merge
 
. gen yhat = sa+sb*iqvc
 
. sort schoolnr iqvc
 
. line yhat iqvc, connect(ascending) title(School Regressions)
 
. graph export fig1lang2.png, width(400) replace 
(file fig1lang2.png written in PNG format)

We will compare these lines with the Bayesian estimates based on random intercept and random slope models.

Random Intercepts

We now consider a model where each school has its onw intercept but these are drawn from a normal distribution with mean a and standard deviation sa. This model can be fit using xtreg, but we will use xtmixed so we can get BLUPS.

. xtmixed langpost iqvc || schoolnr: , mle
 
Performing EM optimization: 
 
Performing gradient-based optimization: 
 
Iteration 0:   log likelihood = -7625.8865  
Iteration 1:   log likelihood = -7625.8865  
 
Computing standard errors:
 
Mixed-effects ML regression                     Number of obs      =      2287
Group variable: schoolnr                        Number of groups   =       131
 
                                                Obs per group: min =         4
                                                               avg =      17.5
                                                               max =        35
 
                                                Wald chi2(1)       =   1261.42
Log likelihood = -7625.8865                     Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
    langpost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        iqvc |   2.488094   .0700548    35.52   0.000     2.350789    2.625399
       _cons |   40.60937   .3068551   132.34   0.000     40.00794    41.21079
------------------------------------------------------------------------------
 
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Identity           |
                   sd(_cons) |   3.081719   .2552304      2.619967    3.624851
-----------------------------+------------------------------------------------
                sd(Residual) |   6.498244   .0991428      6.306804    6.695495
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =   225.92 Prob >= chibar2 = 0.0000

The expected language score for a kid with average verbal IQ averages 4.6 across all schools, but shows substantial variation, with a standard deviation of 3.1. The common slope is estimated as a gain of 2.5 points in language score per point of verbal IQ. (The standard deviation of verbal IQ is 2.5, so one standard deviation of verbal IQ is associated to 5.1 points in language score. This gives us a good idea of the relative importance of observed nd unobserved effects.)

Next we use predict to compute fitted lines and estimate the random effects

. predict yhat1, fitted // y-hat for model 1
 
. predict ra1, reffects // residual intercept for model 1
 
. gen check = (_b[_cons]+ra1) + _b[iqvc]*iqvc 
 
. list yhat1 check in 1/10
 
     +---------------------+
     |    yhat1      check |
     |---------------------|
  1. | 20.74202   20.74202 |
  2. | 24.47416   24.47416 |
  3. | 26.96226   26.96226 |
  4. |  30.6944    30.6944 |
  5. |  33.1825    33.1825 |
     |---------------------|
  6. | 34.42654   34.42654 |
  7. | 34.42654   34.42654 |
  8. | 34.42654   34.42654 |
  9. | 34.42654   34.42654 |
 10. | 35.67059   35.67059 |
     +---------------------+

We used the fitted option of predict to obtain fitted values given the random effects, and the reffects option to get BLUPS. As a check we verify that we can reproduce the fitted values using the fixed and random coefficients.

Next we plot the fited regression lines and the two estimates of the school intercepts

. line yhat1 iqvc, connect(ascending) title(Random Intercept)
 
. graph export fig2lang2.png, width(400) replace
(file fig2lang2.png written in PNG format)
 
. gen sa1 = _b[_cons]+ra1 // school intercept based no model 1
 
. twoway (scatter sa1 sa ) (function y=x, range(16 48)) ///
>         , legend(off) ytitle(BLUP) xtitle(ML) title(School Intercepts)
 
. graph export fig3lang2.png, width(400) replace
(file fig3lang2.png written in PNG format)
 
. list schoolnr sn if sa < 30 & first
 
      +---------------+
      | schoolnr   sn |
      |---------------|
  32. |        2    7 |
 337. |       47    8 |
 841. |      103    4 |
2282. |      258    7 |
      +---------------+

<

We see that all the lines are parallel as one would expect. We also note that the intercepts have been shrunk, particularly for the four small schools with very low language scores.

Random Slopes

Our next model treats the intercept and slope as observations from a bivariate normal distribution with mean a,b and variance-covariance matrix with elements s2a, s2b, and sab. To let Stata know that we want the covariance between the intercept and slope to be estimated we specify covariance(unstructured). (The default assumes that the covariance is zero.

. xtmixed langpost iqvc || schoolnr: iqvc, mle covariance(unstructured)
 
Performing EM optimization: 
 
Performing gradient-based optimization: 
 
Iteration 0:   log likelihood = -7615.9951  
Iteration 1:   log likelihood = -7615.3893  
Iteration 2:   log likelihood = -7615.3887  
 
Computing standard errors:
 
Mixed-effects ML regression                     Number of obs      =      2287
Group variable: schoolnr                        Number of groups   =       131
 
                                                Obs per group: min =         4
                                                               avg =      17.5
                                                               max =        35
 
                                                Wald chi2(1)       =    962.03
Log likelihood = -7615.3887                     Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
    langpost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        iqvc |    2.52637   .0814522    31.02   0.000     2.366726    2.686013
       _cons |   40.70956   .3042423   133.81   0.000     40.11325    41.30586
------------------------------------------------------------------------------
 
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Unstructured       |
                    sd(iqvc) |   .4583713   .1100965       .286264    .7339526
                   sd(_cons) |   3.058354   .2491357      2.607043    3.587791
            corr(iqvc,_cons) |  -.8168636   .1743621     -.9744848   -.1196644
-----------------------------+------------------------------------------------
                sd(Residual) |    6.44051   .1004244      6.246659    6.640377
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(3) =   246.91   Prob > chi2 = 0.0000
 
Note: LR test is conservative and provided only for reference.

The expected language score for a child with average IQ now averages 40.7 across schools, with a standard dviation of 3.1. The expected gain in language score per point of IQ averages 2.5, almost the same as before, with a standard devaition of 0.48. The intercept and slope have a negative correlation of -0.82 across schools, so schools with higher language scores for a kid ith average verbal IQ tend to show smaller average gains.

The next step is to predict fitted values as well as the random effects. We verify that we can reproduce the fitted values

. predict yhat2, fitted           // yhat for model 2
 
. predict rb2 ra2, reffects       //residual  slope and intercept for model 2
 
. capture drop check
 
. gen check = (_b[_cons]+ra2) + (_b[iqvc]+rb2)*iqvc
 
. list yhat2 check in 1/10
 
     +---------------------+
     |    yhat2      check |
     |---------------------|
  1. | 20.78043   20.78043 |
  2. | 24.53934   24.53934 |
  3. | 27.04527   27.04527 |
  4. | 30.80418   30.80418 |
  5. | 33.31012   33.31012 |
     |---------------------|
  6. | 34.56309   34.56309 |
  7. | 34.56309   34.56309 |
  8. | 34.56309   34.56309 |
  9. | 34.56309   34.56309 |
 10. | 35.81606   35.81606 |
     +---------------------+
 
. line yhat2 iqvc, connect(ascending)
 
. graph export fig4lang2.png, width(400) replace
(note: file fig4lang2.png not found)
(file fig4lang2.png written in PNG format)

The graph of fitted lines shows clearly how school differences are more pronounced at lower than at higher verbal IQs.

Next we comparing the latest Bayesian estimates with the within-school regressions to note the shrinking

. gen sa2 = _b[_cons] + ra2
 
. gen sb2 = _b[iqvc]  + rb2
 
. twoway scatter sa2 sa , title(Intercepts) name(a, replace)
 
. scatter sb2 sb, title(Slopes) name(b, replace)
 
. graph combine a b, title(Bayesian and Within-School Estimates) ///
>         xsize(4) ysize(2)
 
. graph export fig5lang2.png, width(400) replace
(note: file fig5lang2.png not found)
(file fig5lang2.png written in PNG format)

We can also check how much individual schools are affected. Try 40 for little change, 2, 47, 103 or 258 for substantial change:

. gen use = schoolnr==2
 
. twoway  (scatter langpost iqvc if use) ///      scatterplot
>         (line yhat  iqvc if use, lpat(dot)) ///         within school
>         (line yhat2 iqvc if use, lpat(dash)) ///                mixed model
>         ( function y=_b[_cons]+_b[iqvc]*x, range(-5 5) ) , /// average
>         legend(order(2 "within" 3 "EB" 4 "G") cols(1) ring(0) pos(5))
 
. graph export fig6lang2.png, width(400) replace
(note: file fig6lang2.png not found)
(file fig6lang2.png written in PNG format)

We continue with a level-2 predictor of intercepts and slopes.