Germán Rodríguez
Multilevel Models Princeton University

Random Effects Logit Models

The Stata manual has data on union membership from the NLS for 4434 women who were 14-24 in 1968 and were observed between 1 and 12 times.

We read the data from the web and compute southXt, an interaction term between south and year centered on 70.

. webuse union, clear
(NLS Women 14-24 in 1968)

. gen southXt = south * (year-70)

Logit Estimates

We first compute logit estimates for later comparison, fitting the same model as in [R] xtlogit, first with conventional and then with clustered standard errors.

. logit union age grade not_smsa south southXt
 
Iteration 0:   log likelihood =  -13864.23
Iteration 1:   log likelihood = -13550.511
Iteration 2:   log likelihood =  -13545.74
Iteration 3:   log likelihood = -13545.736
 
Logistic regression                               Number of obs   =      26200
                                                  LR chi2(5)      =     636.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -13545.736                       Pseudo R2       =     0.0230
 
------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0099931   .0026737     3.74   0.000     .0047527    .0152335
       grade |   .0483487   .0064259     7.52   0.000     .0357541    .0609432
    not_smsa |  -.2214908   .0355831    -6.22   0.000    -.2912324   -.1517493
       south |  -.7144461   .0612145   -11.67   0.000    -.8344244   -.5944678
     southXt |   .0068356   .0052258     1.31   0.191    -.0034067    .0170779
       _cons |  -1.888256    .113141   -16.69   0.000    -2.110009   -1.666504
------------------------------------------------------------------------------
 
. estimates store logit
 
. logit union age grade not_smsa south southXt, cluster(idcode)
 
Iteration 0:   log pseudolikelihood =  -13864.23
Iteration 1:   log pseudolikelihood = -13550.511
Iteration 2:   log pseudolikelihood =  -13545.74
Iteration 3:   log pseudolikelihood = -13545.736
 
Logistic regression                               Number of obs   =      26200
                                                  Wald chi2(5)    =     161.37
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -13545.736                 Pseudo R2       =     0.0230
 
                              (Std. Err. adjusted for 4434 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0099931   .0039563     2.53   0.012     .0022389    .0177473
       grade |   .0483487    .013936     3.47   0.001     .0210346    .0756628
    not_smsa |  -.2214908   .0713568    -3.10   0.002    -.3613475   -.0816341
       south |  -.7144461   .0919865    -7.77   0.000    -.8947362   -.5341559
     southXt |   .0068356   .0070241     0.97   0.330    -.0069314    .0206026
       _cons |  -1.888256    .207871    -9.08   0.000    -2.295676   -1.480837
------------------------------------------------------------------------------
 
. estimates store cluster

Random Intercepts

The next step is to fit a random-intercepts model and compare the results with the logit estimates.

. xtlogit union age grade not_smsa south southXt, i(idcode)
 
Fitting comparison model:
 
Iteration 0:   log likelihood =  -13864.23
Iteration 1:   log likelihood = -13550.511
Iteration 2:   log likelihood =  -13545.74
Iteration 3:   log likelihood = -13545.736
 
Fitting full model:
 
tau =  0.0     log likelihood = -13545.736
tau =  0.1     log likelihood = -12926.225
tau =  0.2     log likelihood = -12419.526
tau =  0.3     log likelihood = -12003.162
tau =  0.4     log likelihood = -11656.844
tau =  0.5     log likelihood =  -11367.53
tau =  0.6     log likelihood = -11129.716
tau =  0.7     log likelihood = -10947.266
tau =  0.8     log likelihood = -10845.532
 
Iteration 0:   log likelihood = -10947.312  
Iteration 1:   log likelihood = -10557.296  
Iteration 2:   log likelihood = -10540.582  
Iteration 3:   log likelihood = -10540.367  
Iteration 4:   log likelihood = -10540.367  
Iteration 5:   log likelihood = -10540.366  
 
Random-effects logistic regression              Number of obs      =     26200
Group variable: idcode                          Number of groups   =      4434
 
Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       5.9
                                                               max =        12
 
                                                Wald chi2(5)       =    227.30
Log likelihood  = -10540.366                    Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0093936    .004454     2.11   0.035      .000664    .0181232
       grade |   .0867878   .0176345     4.92   0.000     .0522247    .1213508
    not_smsa |  -.2519379    .082334    -3.06   0.002    -.4133095   -.0905663
       south |  -1.163769   .1114164   -10.45   0.000    -1.382141    -.945397
     southXt |    .023245   .0078497     2.96   0.003     .0078599    .0386302
       _cons |  -3.360131   .2586306   -12.99   0.000    -3.867038   -2.853225
-------------+----------------------------------------------------------------
    /lnsig2u |   1.749534   .0469964                      1.657423    1.841645
-------------+----------------------------------------------------------------
     sigma_u |   2.398317   .0563561                      2.290366    2.511356
         rho |   .6361486   .0108779                      .6145729    .6571902
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =  6010.74 Prob >= chibar2 = 0.000
 
. estimates store xtlogit
 
. estimates table  logit xtlogit, eq(1:1)
 
----------------------------------------
    Variable |   logit       xtlogit    
-------------+--------------------------
#1           |                          
         age |  .00999311    .00939361  
       grade |  .04834865    .08678776  
    not_smsa | -.22149081   -.25193788  
       south | -.71444608   -1.1637691  
     southXt |   .0068356    .02324502  
       _cons | -1.8882564   -3.3601312  
-------------+--------------------------
lnsig2u      |                          
       _cons |               1.7495341  
----------------------------------------

Except for age, we see that the subject-specific estimates are larger in magnitude than in the marginal logit model.

The odds of being in a union in 1970 are 69% lower for a woman who lives in the south than for one who doesn't, everything else being equal. The effect declined over time, and by 1988 the odds of belonging to a union were 53% lower in the south than elsewhere.
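These percentages come from exponentiating the south effect at the year of interest. A minimal sketch, assuming the random-intercept fit is still the active estimation result and that year is coded in two digits, so that 1988 corresponds to year - 70 = 18:

```stata
* Subject-specific odds ratio for south in 1970 (year - 70 = 0)
lincom south, or

* Subject-specific odds ratio for south in 1988 (year - 70 = 18)
lincom south + 18*southXt, or
```

With the coefficients reported above, the two odds ratios should be close to exp(-1.164) = 0.31 and exp(-1.164 + 18 × 0.0232) = 0.47, that is, odds 69% and 53% lower.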

In contrast, the logit model estimates the effect as 51% lower odds in 1970 and 45% lower odds in 1988. These estimates can be interpreted as population average effects.
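The same calculation can be repeated for the ordinary logit fit. This sketch assumes the results stored earlier under the names logit and xtlogit are still in memory. It restores the random-intercept results at the end, so that later post-estimation commands still see them:

```stata
* Make the ordinary logit fit the active estimation result
estimates restore logit

* Population-average odds ratios for south in 1970 and in 1988
lincom south, or
lincom south + 18*southXt, or

* Return to the random-intercept fit
estimates restore xtlogit
```

With the logit coefficients reported above, the odds ratios should be close to exp(-0.714) = 0.49 and exp(-0.714 + 18 × 0.0068) = 0.55.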

Given how different the coefficients are, it doesn't make much sense to compare standard errors. Generally speaking, though, they increase as one moves from the logit model to robust standard errors to the estimates based on the random-intercept model.
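One way to see the three sets of standard errors side by side is sketched below. It assumes the stored results logit, cluster and xtlogit are still in memory; the display formats are arbitrary choices.

```stata
* Coefficients and standard errors for the three fits,
* matching the first equation of each model
estimates table logit cluster xtlogit, eq(1) b(%9.4f) se(%9.4f)
```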

Intra-class Correlation

Stata reports the intraclass correlation as 0.636. This coefficient pertains to a latent variable reflecting propensity to belong to a union, rather than manifest union membership. The correlation between this propensity in any two years for the same individual is 0.64. We can also say that 64% of the variance in the propensity to belong to a union can be attributed to individuals.
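The reported value can be checked by hand. This assumes the usual latent-variable formulation, in which the individual-level error is standard logistic with variance pi^2/3. The sketch relies on the results e(sigma_u) and e(rho) stored by xtlogit, and assumes the random-intercept fit is still the active estimation result:

```stata
* Latent intraclass correlation: sigma_u^2 / (sigma_u^2 + pi^2/3)
display e(sigma_u)^2 / (e(sigma_u)^2 + _pi^2/3)

* Value stored by xtlogit, for comparison
display e(rho)
```

With sigma_u = 2.398, this is 5.752/(5.752 + 3.290) = 0.636.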

Using the xtrho command we can compute the correlation in actual union membership in any two years for a woman with a median linear predictor:

. xtrho
 
Measures of intra-class manifest association in random-effects logit
Evaluated at median linear predictor
 
Measure          |    Estimate     [95% Conf.Interval]
-----------------+------------------------------------
Marginal prob.   |     .225847     .218973     .232833
Joint prob.      |     .125072     .117026     .133375
Odds ratio       |     8.29305     7.64627     9.00295
Pearson's r      |     .423617       .4039     .443193
Yule's Q         |     .784785     .768686     .800059

We estimate a probability of 23% of belonging to a union in any given year and of 12.5% of belonging in both of two years, much more than the 5% one would expect under independence. The correlation is reflected in an odds ratio of 8.3, so for women at the median the odds of belonging to a union at t_2 are 8.3 times as high for those who belonged to a union at t_1 as for those who didn't. Pearson's r is 0.42 and Yule's Q is 0.78.
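The association measures follow from the marginal and joint probabilities alone, because membership in the two years defines a 2×2 table with equal margins. The sketch below types in the two estimates from the output above. The scalar names are made up for illustration and chosen to avoid clashing with variable names in the dataset.

```stata
* Estimates at the median linear predictor, copied from the xtrho output
scalar pmarg  = .225847                   // Pr(member in a given year)
scalar pjoint = .125072                   // Pr(member in both years)

* Remaining cells of the 2x2 table
scalar pdisc  = pmarg - pjoint            // member in one year only
scalar pnone  = 1 - 2*pmarg + pjoint      // member in neither year

* Joint probability expected under independence
display "Independence: " pmarg^2

* Odds ratio, Pearson's r and Yule's Q
scalar oddsr = pjoint*pnone/pdisc^2
display "Odds ratio:   " oddsr
display "Pearson's r:  " (pjoint - pmarg^2)/(pmarg*(1 - pmarg))
display "Yule's Q:     " (oddsr - 1)/(oddsr + 1)
```

Up to rounding of the inputs, the results should agree with the odds ratio, Pearson's r and Yule's Q reported by xtrho.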

These measures can be computed for women whose observed characteristics make them more or less likely to belong to a union by using the detail option:

. xtrho, detail
 
Measures of intra-class manifest association in random-effects logit
Evaluated with linear predictor set at selected percentiles
 
Measure          |          p1         p25         p50         p75         p99
-----------------+------------------------------------------------------------
Marginal prob.   |      .10807     .161144     .225847     .251047     .309243
Joint prob.      |     .046724     .079682     .125072      .14408     .190535
Odds ratio       |     10.3122     9.09452     8.29305     8.08413     7.73468
Pearson's r      |      .36357      .39737     .423617     .431096     .444279
Yule's Q         |       .8232     .801873     .784785     .779836     .771028

The correlation as measured by the odds ratio or Yule's Q is higher when women are less likely to belong to a union, but the opposite is true if one uses Pearson's r.

For a more detailed discussion of this post-estimation command see my paper with Elo in the Stata Journal 3(1):32-46 (2003).