Germán Rodríguez
Multilevel Models Princeton University

## Random Intercepts and Slopes

We continue our analysis of the Snijders and Bosker data. This time we will consider verbal IQ as a predictor of language scores. Here are the data.

```. use https://data.princeton.edu/pop510/snijders
(Scores in language test from Snijders and Bosker, 1999)
```
```> library(foreign)

```

To simplify interpretation we will center verbal IQ on the overall mean. We also compute the number of observations per school and flag the first, as we did before.

```. sum iq_verb

Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
iq_verb |      2,287    11.83406     2.06889          4         18

. gen iqvc = iq_verb - r(mean)

. egen sn = count(langpost), by(schoolnr)

. bysort schoolnr : gen first=_n==1
```
```> snijders <- mutate(snijders,
+   iqvc = iq_verb - mean(iq_verb))
```

### School Regressions

Our first step will be to run a separate regression for each school, saving the intercept and slope. This is easy to do with `statsby`, creating variables `sa` and `sb` in a new Stata dataset called "ols", which we then merge with the current dataset. The final step is to plot the school-specific regression lines To do this we take advantage of `dplyr`'s `do()` to fit the models, extract the coefficients, join them with the data, and plot the lines.

```. statsby sa=_b[_cons] sb=_b[iqvc], by(schoolnr) saving(ols, replace) ///
>         : regress langpost iqvc
(running regress on estimation sample)

command:  regress langpost iqvc
sa:  _b[_cons]
sb:  _b[iqvc]
by:  schoolnr

Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
...............................

. sort schoolnr

. merge m:1 schoolnr using ols

Result                           # of obs.
-----------------------------------------
not matched                             0
matched                             2,287  (_merge==3)
-----------------------------------------

. drop _merge

. gen yhat = sa + sb * iqvc

. sort schoolnr iqvc

. line yhat iqvc, connect(ascending) title(School Regressions)

. graph export fig1lang2.png, width(500) replace
(file fig1lang2.png written in PNG format)
``` ```> fits <- group_by(snijders, schoolnr) %>% do( lf = lm(langpost ~ iqvc, data = .) )

> ols <- data.frame(id = fits[], t(sapply(fits[],coef)))

> names(ols) <- c("schoolnr", "sa", "sb")

> snijders <- left_join(snijders, ols, by = "schoolnr") %>%
+   mutate(fv = sa + sb * iqvc)

> ggplot(snijders, aes(iqvc, fv, group= schoolnr)) + geom_line() +
+   ggtitle("School Regressions")

> ggsave("fig1lang2r.png", width = 500/72, height = 400/72, dpi = 72)
``` We will compare these lines with the Bayesian estimates based on random intercept and random slope models.

### Random Intercepts

We now consider a model where each school has its onw intercept but these are drawn from a normal distribution with mean α and standard deviation σa. We will use `xtmixed` instead of `xtreg` so we can get BLUPS.

```. xtmixed langpost iqvc || schoolnr: , mle

Performing EM optimization:

Iteration 0:   log likelihood = -7625.8865
Iteration 1:   log likelihood = -7625.8865

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =      2,287
Group variable: schoolnr                        Number of groups  =        131

Obs per group:
min =          4
avg =       17.5
max =         35

Wald chi2(1)      =    1261.42
Log likelihood = -7625.8865                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
langpost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
iqvc |   2.488094   .0700548    35.52   0.000     2.350789    2.625399
_cons |   40.60937   .3068552   132.34   0.000     40.00794    41.21079
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Identity           |
sd(_cons) |    3.08172   .2552305      2.619968    3.624853
-----------------------------+------------------------------------------------
sd(Residual) |   6.498244   .0991428      6.306804    6.695494
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 225.92        Prob >= chibar2 = 0.0000
```
```> library(lme4)

> ri <- lmer(langpost ~ iqvc + (1 | schoolnr), data = snijders, REML = FALSE)

> ri
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: langpost ~ iqvc + (1 | schoolnr)
Data: snijders
AIC       BIC    logLik  deviance  df.resid
15259.773 15282.713 -7625.886 15251.773      2283
Random effects:
Groups   Name        Std.Dev.
schoolnr (Intercept) 3.082
Residual             6.498
Number of obs: 2287, groups:  schoolnr, 131
Fixed Effects:
(Intercept)         iqvc
40.609        2.488
```

The expected language score for a kid with average verbal IQ averages 40.6 across all schools, but shows substantial variation from one school to another, with a standard deviation of 3.1. The common slope is estimated as a gain of 2.49 points in language score per point of verbal IQ. (The standard deviation of verbal IQ is 2.07, so one standard deviation of verbal IQ is associated with 5.15 points in language score. This gives us a good idea of the relative importance of observed and unobserved effects.)

Next we compute fitted lines and estimate the random effects. As a check we verify that we can reproduce the fitted values "by hand" using the fixed and random coefficients.

```. predict yhat1, fitted // y-hat for model 1

. predict ra1, reffects // residual intercept for model 1

. gen check = (_b[_cons] + ra1) + _b[iqvc]*iqvc

. list yhat1 check in 1/5

+---------------------+
|    yhat1      check |
|---------------------|
1. | 20.74202   20.74202 |
2. | 24.47416   24.47416 |
3. | 26.96226   26.96226 |
4. |  30.6944    30.6944 |
5. |  33.1825    33.1825 |
+---------------------+
```
```> snijders <- mutate(snijders, fv1 = predict(ri))

> b <- fixef(ri)

> re <- ranef(ri)\$schoolnr[,1]

> check <- b + re[snijders\$schoolnr] + b * snijders\$iqvc

> cbind(snijders\$fv1[1:5], check[1:5])
[,1]     [,2]
1 48.11106 48.11106
2 46.86701 46.86701
3 34.42654 34.42654
4 38.15868 38.15868
5 30.69440 30.69440
```

Next we plot the fitted regression lines and the two estimates of the school intercepts

```. line yhat1 iqvc, connect(ascending) title(Random Intercept)

. graph export fig2lang2.png, width(500) replace
(file fig2lang2.png written in PNG format)

. gen sa1 = _b[_cons] + ra1 // school intercept based no model 1

. twoway (scatter sa1 sa ) (function y=x, range(30 48)) ///
>    , legend(off) ytitle(BLUP) xtitle(ML) title(School Intercepts)

. graph export fig3lang2.png, width(500) replace
(file fig3lang2.png written in PNG format)

. list schoolnr sn if sa < 30 & first

+---------------+
| schoolnr   sn |
|---------------|
31. |        2    7 |
337. |       47    8 |
841. |      103    4 |
2282. |      258    7 |
+---------------+
``` ```> ggplot(snijders, aes(iqvc, fv1, group = factor(schoolnr))) + geom_line() +
+   ggtitle("Random Intercept")

> ggsave("fig2lang2r.png", width = 500/72, height = 400/72, dpi = 72)

> ols <- mutate(ols, sab = b + re)

> ggplot(ols, aes(sa, sab)) + geom_point() +
+   geom_abline(intercept=0, slope=1) + ggtitle("School Intercepts")

>  ggsave("fig3lang2r.png", width = 500/72, height = 400/72, dpi = 72)
``` We see that all the lines are parallel as one would expect.  We also note that the intercepts have shrunk, particularly for the four small schools with very low language scores.

### Random Slopes

Our next model treats the intercept and slope as observations from a bivariate normal distribution with mean α,β and variance-covariance matrix with elements σ2a, σ2b, and σab. In Stata you must specify `covariance(unstructured)`.

```. xtmixed langpost iqvc || schoolnr: iqvc, mle covariance(unstructured)

Performing EM optimization:

Iteration 0:   log likelihood = -7615.9951
Iteration 1:   log likelihood = -7615.3896
Iteration 2:   log likelihood = -7615.3887
Iteration 3:   log likelihood = -7615.3887

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =      2,287
Group variable: schoolnr                        Number of groups  =        131

Obs per group:
min =          4
avg =       17.5
max =         35

Wald chi2(1)      =     962.02
Log likelihood = -7615.3887                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
langpost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
iqvc |   2.526371   .0814526    31.02   0.000     2.366727    2.686015
_cons |   40.70956   .3042429   133.81   0.000     40.11325    41.30586
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
schoolnr: Unstructured       |
sd(iqvc) |   .4583786   .1100923      .2862759    .7339457
sd(_cons) |   3.058361   .2491362      2.607049      3.5878
corr(iqvc,_cons) |  -.8168454   .1743483     -.9744734   -.1197812
-----------------------------+------------------------------------------------
sd(Residual) |   6.440508   .1004239      6.246657    6.640373
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 246.91                Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.
```
```> rs <- lmer(langpost ~ iqvc + (iqvc | schoolnr), data=snijders, REML=FALSE)

> rs
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: langpost ~ iqvc + (iqvc | schoolnr)
Data: snijders
AIC       BIC    logLik  deviance  df.resid
15242.777 15277.187 -7615.389 15230.777      2281
Random effects:
Groups   Name        Std.Dev. Corr
schoolnr (Intercept) 3.0584
iqvc        0.4584   -0.82
Residual             6.4405
Number of obs: 2287, groups:  schoolnr, 131
Fixed Effects:
(Intercept)         iqvc
40.710        2.526
```

The expected language score for a child with average IQ now averages 40.7 across schools, with a standard deviation of 3.1. The expected gain in language score per point of IQ averages 2.5, almost the same as before, with a standard deviation of 0.46. The intercept and slope have a negative correlation of -0.82 across schools, so schools with higher language scores for a kid with average verbal IQ tend to show smaller average gains.

The next step is to predict fitted values as well as the random effects. We verify that we can reproduce the fitted values "by hand" and the plot the fitted lines

```. predict yhat2, fitted           // yhat for model 2

. predict rb2 ra2, reffects       //residual  slope and intercept for model 2

. capture drop check

. gen check = (_b[_cons] + ra2) + (_b[iqvc] + rb2)*iqvc

. list yhat2 check in 1/5

+---------------------+
|    yhat2      check |
|---------------------|
1. | 20.78046   20.78046 |
2. | 24.53936   24.53936 |
3. | 27.04529   27.04529 |
4. | 30.80419   30.80419 |
5. | 33.31012   33.31012 |
+---------------------+

. line yhat2 iqvc, connect(ascending)

. graph export fig4lang2.png, width(500) replace
(file fig4lang2.png written in PNG format)
``` ```> snijders <- mutate(snijders, fv2 = predict(rs))

> re <- ranef(rs)\$schoolnr

> b <- fixef(rs)

> map <- snijders\$schoolnr

> check <- (b + re[map, 1]) + (b + re[map ,2]) * snijders\$iqvc

> cbind(snijders\$fv2[1:5], check[1:5])
[,1]     [,2]
1 48.34571 48.34571
2 47.09274 47.09274
3 34.56309 34.56309
4 38.32198 38.32198
5 30.80419 30.80419

> ggplot(snijders, aes(iqvc, fv2, group=schoolnr)) + geom_line() +
+   ggtitle("Random Slopes")

> ggsave("fig4lang2r.png", width = 500/72, height = 400/72, dpi = 72)
``` The graph of fitted lines shows clearly how school differences are more pronounced at lower than at higher verbal IQs.

Next we compare the latest Bayesian estimates with the within-school regressions to note the shrinking typical of Bayes methods

```. gen sa2 = _b[_cons] + ra2

. gen sb2 = _b[iqvc]  + rb2

. twoway scatter sa2 sa , title(Intercepts) name(a, replace)

. scatter sb2 sb, title(Slopes) name(b, replace)

. graph combine a b, title(Bayesian and Within-School Estimates) ///
>    xsize(6) ysize(3)

. graph export fig5lang2.png, width(720) replace
(file fig5lang2.png written in PNG format)
``` ```> ols <- mutate(ols, sa2 = b + re[,1], sb2 = b + re[,2])

> g1 <- ggplot(ols, aes(sa, sa2)) + geom_point() + ggtitle("Intercepts")

> g2 <- ggplot(ols, aes(sb, sb2)) + geom_point() + ggtitle("Slopes")

> g <- arrangeGrob(g1, g2, ncol=2)

> ggsave("fig5lang2r.png", width = 10, height = 5, dpi = 72)
``` We can also check how much individual schools are affected. Try school 40 for little change, or schools 2, 47, 103 or 258 for substantial change. Here is a plot of school 2:

```. gen use = schoolnr==2

. twoway  (scatter langpost iqvc if use) ///      scatterplot
>    (line yhat  iqvc if use, lpat(dot)) ///         within school
>    (line yhat2 iqvc if use, lpat(dash)) ///                mixed model
>    ( function y=_b[_cons] + _b[iqvc]*x, range(-5 1) ) , /// average
>    legend(order(2 "within" 3 "EB" 4 "Avg") cols(1) ring(0) pos(5))

. graph export fig6lang2.png, width(500) replace
(file fig6lang2.png written in PNG format)
``` ```> school2 <- filter(snijders, schoolnr ==2) %>%
+   mutate(avg = b + b * iqvc)

> ggplot(school2, aes(iqvc, langpost)) + geom_point() +
+   geom_line(aes(iqvc, fv,  color="Within")) +
+   geom_line(aes(iqvc, fv2, color="EB")) +
+   geom_line(aes(iqvc, avg, color="Average")) +
+   guides(color=guide_legend(title="Line"))

> ggsave("fig6lang2r.png", width = 500/72, height = 400/72, dpi = 72)
``` The next step will be to consider a contextual or school-level predictor.

Continue with