Germán Rodríguez
Generalized Linear Models Princeton University

8.3 Longitudinal Logits

We use a dataset on union membership that appears in the Stata manuals and in my own paper on intra-class correlation for binary data. It is a subsample of the National Longitudinal Survey of Youth (NLSY) with union membership information from 1970-88 for 4,434 women aged 14-26 in 1968. The data are available from the Stata and OPR websites:

. clear
 
. use http://data.princeton.edu/wws509/datasets/union
(NLS Women 14-24 in 1968)

Logits

Here is an ordinary logit model, which ignores the fact that we have repeated observations on each woman:

. logit union age grade not_smsa south southXt
 
Iteration 0:   log likelihood =  -13864.23
Iteration 1:   log likelihood = -13550.511
Iteration 2:   log likelihood =  -13545.74
Iteration 3:   log likelihood = -13545.736
 
Logit estimates                                   Number of obs   =      26200
                                                  LR chi2(5)      =     636.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -13545.736                       Pseudo R2       =     0.0230
 
------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0099931   .0026737     3.74   0.000     .0047527    .0152335
       grade |   .0483487   .0064259     7.52   0.000     .0357541    .0609432
    not_smsa |  -.2214908   .0355831    -6.22   0.000    -.2912324   -.1517493
       south |  -.7144461   .0612145   -11.67   0.000    -.8344244   -.5944678
     southXt |   .0068356   .0052258     1.31   0.191    -.0034067    .0170779
       _cons |  -1.888256    .113141   -16.69   0.000    -2.110009   -1.666504
------------------------------------------------------------------------------
 
. estimates store logit
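
As a quick check on how to read the header of this output, the two summary statistics can be recomputed by hand from the log likelihoods in the iteration log. This is only a sketch: it assumes that iteration 0 is the constant-only model and that the pseudo R2 is McFadden's measure, which is how Stata reports them for logit.

```stata
* LR chi2(5): twice the gain in log likelihood over the constant-only model
display 2*(-13545.736 - (-13864.23))

* Pseudo R2: proportional reduction in the log likelihood
display 1 - (-13545.736)/(-13864.23)
```

The results agree with the reported 636.99 and 0.0230 up to rounding.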

Fixed-Effects

Let us try a fixed-effects model first:

. xtlogit union age grade not_smsa south southXt, i(id) fe
 
note: multiple positive outcomes within groups encountered.
note: 2744 groups (14165 obs) dropped due to all positive or
      all negative outcomes.
Iteration 0:   log likelihood = -4541.9044
Iteration 1:   log likelihood = -4511.1353
Iteration 2:   log likelihood = -4511.1042
 
Conditional fixed-effects logistic regression   Number of obs      =     12035
Group variable (i): idcode                      Number of groups   =      1690
 
                                                Obs per group: min =         2
                                                               avg =       7.1
                                                               max =        12
 
                                                LR chi2(5)         =     78.16
Log likelihood  = -4511.1042                    Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0079706   .0050283     1.59   0.113    -.0018848    .0178259
       grade |   .0811808   .0419137     1.94   0.053    -.0009686    .1633302
    not_smsa |   .0210368    .113154     0.19   0.853    -.2007411    .2428146
       south |  -1.007318   .1500491    -6.71   0.000    -1.301409   -.7132271
     southXt |   .0263495   .0083244     3.17   0.002      .010034    .0426649
------------------------------------------------------------------------------
 
. estimates store fixed

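Note how we lost 62% of our sample (2744 of the 4434 women, who account for 14,165 of the 26,200 observations). These are women who didn't have variation in union membership, so they contribute no information to the conditional likelihood. We will compare the estimates later.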
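
The size of the loss can be worked out from the notes printed by xtlogit, and the dropped women can be counted directly. The following is a minimal sketch that assumes the union dataset is still in memory; mu and first are hypothetical helper variables introduced only for this illustration.

```stata
* share of women and share of observations dropped
display 2744/4434
display 14165/26200

* mean of union for each woman: exactly 0 or 1 means her status never changes
bysort idcode: egen mu = mean(union)

* flag one record per woman so each is counted once
egen first = tag(idcode)
count if first & (mu == 0 | mu == 1)
```

The two ratios come to roughly 0.62 and 0.54, and the count should match the number of groups that xtlogit reports as dropped.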

Random-Effects

Now we fit a random-effects model:

. xtlogit union age grade not_smsa south southXt, i(id)
 
Fitting comparison model:
 
Iteration 0:   log likelihood =  -13864.23
Iteration 1:   log likelihood = -13550.511
Iteration 2:   log likelihood =  -13545.74
Iteration 3:   log likelihood = -13545.736
 
Fitting full model:
 
tau =  0.0     log likelihood = -13545.736
tau =  0.1     log likelihood = -12926.225
tau =  0.2     log likelihood = -12419.526
tau =  0.3     log likelihood = -12003.162
tau =  0.4     log likelihood = -11656.844
tau =  0.5     log likelihood =  -11367.53
tau =  0.6     log likelihood = -11129.716
tau =  0.7     log likelihood = -10947.266
tau =  0.8     log likelihood = -10845.532
Iteration 0:   log likelihood = -10947.266
Iteration 1:   log likelihood = -10604.628
Iteration 2:   log likelihood = -10557.905
Iteration 3:   log likelihood = -10556.297
Iteration 4:   log likelihood = -10556.294
 
Random-effects logistic regression              Number of obs      =     26200
Group variable (i): idcode                      Number of groups   =      4434
 
Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       5.9
                                                               max =        12
 
                                                Wald chi2(5)       =    221.95
Log likelihood  = -10556.294                    Prob > chi2        =    0.0000
 
------------------------------------------------------------------------------
       union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0092401   .0044368     2.08   0.037     .0005441    .0179361
       grade |   .0840066   .0181622     4.63   0.000     .0484094    .1196038
    not_smsa |  -.2574574   .0844771    -3.05   0.002    -.4230294   -.0918854
       south |  -1.152854   .1108294   -10.40   0.000    -1.370075   -.9356323
     southXt |   .0237933   .0078548     3.03   0.002     .0083982    .0391884
       _cons |   -3.25016   .2622898   -12.39   0.000    -3.764238   -2.736081
-------------+----------------------------------------------------------------
    /lnsig2u |   1.669888   .0430016                      1.585607     1.75417
-------------+----------------------------------------------------------------
     sigma_u |   2.304685   .0495526                      2.209582    2.403882
         rho |   .6175213   .0101565                      .5974278    .6372209
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =  5978.89 Prob >= chibar2 = 0.000
 
. estimates store random
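
The ancillary rows of this output are linked by simple transformations, which can be verified with display. This sketch copies the numbers from the output above; it assumes only that lnsig2u is the log of the random-effect variance and that the comparison model is the ordinary logit fitted earlier.

```stata
* sigma_u is the square root of exp(lnsig2u)
display exp(1.669888/2)

* chibar2(01): twice the gap between the random-effects and
* comparison-model log likelihoods
display 2*(-10556.294 - (-13545.736))
```

These reproduce the reported 2.304685 and 5978.89 up to rounding.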

Comparisons

Here's a table comparing the estimates (we use the equation option so Stata can find the correct estimates).

. estimates table logit random fixed, equation(1)
 
-----------------------------------------------------
    Variable |   logit        random       fixed     
-------------+---------------------------------------
#1           |                                       
         age |  .00999311    .00924011    .00797058  
       grade |  .04834865    .08400659    .08118077  
    not_smsa | -.22149081   -.25745741    .02103677  
       south | -.71444608   -1.1528539   -1.0073178  
     southXt |   .0068356    .02379331    .02634948  
       _cons | -1.8882564   -3.2501596               
-------------+---------------------------------------
lnsig2u      |                                       
       _cons |               1.6698883               
-----------------------------------------------------

The main change is in the coefficient of not_smsa. You might think this indicates something wrong with the logit and random-effects models, but note that only women who have moved between standard metropolitan statistical areas and other places contribute to the fixed-effects estimate. It seems reasonable to believe that these women differ from the rest.

With the exception of age, the random-effects coefficients are larger in magnitude than the ordinary logit coefficients. This is almost always the case: omitting the random effect biases the coefficients towards zero.
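
A rough way to gauge the size of this attenuation is the approximation of Zeger, Liang and Albert (1988), which divides a subject-specific coefficient by sqrt(1 + c^2 sigma_u^2), with c = 16 sqrt(3)/(15 pi), to obtain its population-average counterpart. This is not part of the analysis above and is offered only as a sketch; atten is a hypothetical scalar name, and the numbers are copied from the random-effects output.

```stata
* approximate attenuation factor, using sigma_u = 2.304685
scalar atten = sqrt(1 + (16*sqrt(3)/(15*_pi))^2 * 2.304685^2)
display scalar(atten)

* random-effects coefficients of south and grade, scaled down
display -1.152854/scalar(atten)
display .0840066/scalar(atten)
```

The factor is about 1.68, and the rescaled coefficients of south and grade come to roughly -0.68 and 0.050, not far from the ordinary logit estimates of -0.714 and 0.048. The approximation works less well for not_smsa.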

Intra-class correlation

The random-effects estimate shows an intra-class correlation of 0.6175, indicating a high correlation between a woman's latent propensity to be a union member in different years after controlling for age, education and residence.
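
In the latent-variable formulation of the logit model the individual-level error has a standard logistic distribution with variance pi^2/3, so the intra-class correlation is sigma_u^2/(sigma_u^2 + pi^2/3). A minimal sketch using the estimate of lnsig2u from the random-effects output (s2u is a hypothetical scalar name):

```stata
* variance of the random effect
scalar s2u = exp(1.669888)

* latent intra-class correlation
display scalar(s2u)/(scalar(s2u) + _pi^2/3)
```

This reproduces the value of rho reported by xtlogit.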

My paper with Elo in the Stata Journal (2003) shows how this can be interpreted in terms of an odds ratio and translated into measures of manifest correlation using xtrho. The command is available from the Stata Journal website; in Stata type findit xtrho, or net describe st0031, from(http://www.stata-journal.com/software/sj3-1). For the average woman the correlation between actual union membership in any two years is 0.408 using Pearson's r and 0.769 using Yule's Q.
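
Yule's Q is a simple function of the odds ratio in a two-by-two table, Q = (OR - 1)/(OR + 1), so the value above can be turned back into an odds ratio by inverting that formula. The sketch below starts from the rounded Q quoted in the text, not from xtrho output, so the result is approximate.

```stata
* odds ratio implied by Yule's Q = 0.769
display (1 + .769)/(1 - .769)
```

This gives an odds ratio of about 7.7 for union membership in one year given membership in another year.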