Germán Rodríguez
Multilevel Models Princeton University

Immunization in Guatemala

This is a collection of runs reproducing some of the results in Rodriguez and Goldman (2001). The data used are documented here. The code is no longer needed in light of the excellent implementation of adaptive Gaussian quadrature in Stata, but has been kept here for archival purposes.

Gaussian Quadrature: lr3

I wrote S-Plus/R functions for fitting multilevel generalized linear models with random intercepts. The S-Plus functions for 2- and 3-level logistic regression are documented in some detail at http://data.princeton.edu/multilevel/lr3.html, including examples.

Here are the commands needed to reproduce the Guatemala results on immunization (assuming thall needed files are in the local chapter) using 12 quadrature points (the default):

 
source("lr3.s")
dll.load("lr3.dll",lr3.entry.points)
guimr = read.table("GuImmun.dat", header=T)
model = immun~kid2p+mom25p+order23+order46+order7p+
    indNoSpa+indSpa+momEdPri+momEdSec+husEdPri+husEdSec+husEdDK+momWork+
     rural+pcInd81
g12 = lr3(model,mom,cluster,data=guimr)

 
&gr; summary(g12)
3-level logit model
observations:  2159 1595 161 
logL =  -1323.9618222914 
                 Coef Std. Error  t value 
(Intercept) -1.233400     0.4832 -2.55267
      kid2p  1.716657     0.2173  7.89901
     mom25p -0.215995     0.2316 -0.93250
    order23 -0.260719     0.2319 -1.12418
    order46  0.179374     0.2945  0.60912
    order7p  0.431083     0.3720  1.15873
   indNoSpa -0.172474     0.4894 -0.35244
     indSpa -0.083649     0.3633 -0.23025
   momEdPri  0.433026     0.2224  1.94679
   momEdSec  0.420005     0.4844  0.86712
   husEdPri  0.539349     0.2323  2.32187
   husEdSec  0.505110     0.4143  1.21929
    husEdDK -0.007241     0.3569 -0.02029
    momWork  0.390044     0.2027  1.92442
      rural -0.888257     0.3062 -2.90067
    pcInd81 -1.149777     0.5004 -2.29781
log(sigma2)  0.839785     0.1129  7.43514
log(sigma3)  0.024317     0.1563  0.15559
     sigma2  2.315870     0.2616  8.85361
     sigma3  1.024615     0.1601  6.39856

These results are within 0.01 of the estimates published in our paper. If you use 20 quadrature points the results coincide exactly.

Adaptive Gaussian Quadrature: gllamm

Here's how to reproduce the results using Stata's gllamm, which can be installed from SSC and is described in detail at http://www.gllamm.org. See also A. Skrondal and S. Rabe-Hesketh (2004), Generalized Latent Variable Modeling, Boca Raton: Chapman & Hall/CRC.

We read the data from the website using insheet specifying that we use a single blank as delimitor. We then specify the model using a fairly simple syntax. If you use all defaults you get estimates within 0.01 of the results in the paper. If you specify the option adapt to use adaptive quadrature, which is more precise, you reproduce the published results exactly.

 
insheet using http://data.princeton.edu/multilevel/GuImmun.dat, delim(" ")
desc,short
gllamm immun kid2p-pcind81, i(mom cluster) family(binomial) adapt

Here are the main results. (Note that the procedure is relatively slow.)

 
gllamm model
 
log likelihood = -1323.959
 
------------------------------------------------------------------------------
       immun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       kid2p |   1.719531   .2186059     7.87   0.000     1.291072    2.147991
      mom25p |  -.2148891   .2320367    -0.93   0.354    -.6696727    .2398945
     order23 |  -.2631825   .2324459    -1.13   0.258     -.718768     .192403
     order46 |   .1761471   .2950018     0.60   0.550    -.4020458    .7543401
     order7p |   .4295824   .3726656     1.15   0.249    -.3008287    1.159994
    indnospa |  -.1735924   .4911932    -0.35   0.724    -1.136313    .7891286
      indspa |  -.0831608   .3641192    -0.23   0.819    -.7968213    .6304997
    momedpri |    .432602    .222846     1.94   0.052     -.004168    .8693721
    momedsec |    .419121   .4848783     0.86   0.387     -.531223    1.369465
    husedpri |   .5412021   .2330428     2.32   0.020     .0844465    .9979576
    husedsec |   .5067223   .4149988     1.22   0.222    -.3066605    1.320105
     huseddk |  -.0062265   .3577698    -0.02   0.986    -.7074425    .6949894
     momwork |   .3898511   .2030766     1.92   0.055    -.0081718     .787874
       rural |  -.8849883   .3048577    -2.90   0.004    -1.482498   -.2874781
     pcind81 |  -1.150408   .5009737    -2.30   0.022    -2.132299   -.1685179
       _cons |  -1.237557    .483198    -2.56   0.010    -2.184608   -.2905063
------------------------------------------------------------------------------
 
 
Variances and covariances of random effects
------------------------------------------------------------------------------
 
 
***level 2 (mom)
 
    var(1): 5.399757 (1.2482465)
 
***level 3 (cluster)
 
    var(1): 1.0499332 (.32836955)
------------------------------------------------------------------------------

These results essentially confirm our estimates.

Adaptive Gaussian Quadrature: Stata 13

These are the results with melogit in Stata 13 using all the defaults, which means 7-point Gauss-Hermite adaptive quadrature using posterior means and variance

. melogit immun kid2p-pcind81 || cluster: || mom: 

... iteration log supressed ...

Mixed-effects logistic regression               Number of obs      =      2159

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
        cluster |      161          1       13.4         55
            mom |     1595          1        1.4          3
-----------------------------------------------------------

Integration method: mvaghermite                 Integration points =         7

                                                Wald chi2(15)      =     86.66
Log likelihood = -1324.1183                     Prob > chi2        =    0.0000
------------------------------------------------------------------------------
       immun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       kid2p |    1.70202   .2131658     7.98   0.000     1.284222    2.119817
      mom25p |  -.2113769   .2294292    -0.92   0.357    -.6610497     .238296
     order23 |  -.2548676   .2299495    -1.11   0.268    -.7055603    .1958251
     order46 |   .1807318   .2920347     0.62   0.536    -.3916456    .7531092
     order7p |   .4290456   .3693368     1.16   0.245    -.2948412    1.152932
    indnospa |  -.1811467   .4852798    -0.37   0.709    -1.132278    .7699844
      indspa |  -.0864748   .3606235    -0.24   0.810    -.7932838    .6203342
    momedpri |   .4278608   .2201934     1.94   0.052    -.0037103    .8594318
    momedsec |   .4155336   .4795519     0.87   0.386    -.5243708    1.355438
    husedpri |   .5371582   .2301506     2.33   0.020     .0860713    .9882452
    husedsec |   .5050997   .4104378     1.23   0.218    -.2993436    1.309543
     huseddk |  -.0050446   .3532848    -0.01   0.989    -.6974702    .6873809
     momwork |   .3873643   .2010269     1.93   0.054    -.0066412    .7813698
       rural |  -.8833379   .3042309    -2.90   0.004    -1.479619   -.2870563
     pcind81 |  -1.144626   .4978997    -2.30   0.022    -2.120491   -.1687603
       _cons |  -1.225093   .4787145    -2.56   0.010    -2.163356   -.2868294
-------------+----------------------------------------------------------------
cluster      |
   var(_cons)|   1.046991     .32473                      .5700828     1.92286
-------------+----------------------------------------------------------------
cluster>mom  |
   var(_cons)|   5.165571   1.095497                      3.408775    7.827776
------------------------------------------------------------------------------
LR test vs. logistic regression:     chi2(2) =   150.94   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

The procedure is not only accurate but also quite fast. Increasing the number of integration points to 12 comes even closer to our published results.

The Laplace Approximation: lme4 in R

The R package lme4 implements adaptive Gaussian quadrature but only for 2-level models (or models with just one grouping level). For 3-level models like the immunization model only the Laplace approximation is available.

> gu = 
> pql=glmer(immun ~ kid2p+mom25p+order23+order46+order7p+indNoSpa+indSpa+momEdPri+
+ husEdPri+husEdSec+momWork+rural+pcInd81+(1|cluster) + (1|mom),data=gu,family=binomial)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 4.66041 (tol = 0.001)
  
> pql
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
 Family: binomial ( logit )
Formula: immun ~ kid2p + mom25p + order23 + order46 + order7p + indNoSpa +  
    indSpa + momEdPri + husEdPri + husEdSec + momWork + rural +  
    pcInd81 + (1 | cluster) + (1 | mom)
   Data: gu
      AIC       BIC    logLik  deviance  df.resid 
 2744.753  2835.591 -1356.376  2712.753      2143 
Random effects:
 Groups  Name        Std.Dev.
 mom     (Intercept) 1.0803  
 cluster (Intercept) 0.7126  
Number of obs: 2159, groups: mom, 1595; cluster, 161
Fixed Effects:
(Intercept)        kid2p       mom25p      order23      order46      order7p     indNoSpa  
    -0.7137       1.2306      -0.1279      -0.1804       0.1371       0.2090      -0.2452  
     indSpa     momEdPri     husEdPri     husEdSec      momWork        rural      pcInd81  
    -0.1440       0.2280       0.3539       0.3934       0.2204      -0.7026      -0.7929  

Unfortunately the procedule fails to converge.