### 8.3 Longitudinal Logits

This section uses a dataset on union membership that appears in the Stata manuals and in my own paper on intra-class correlation for binary data. It is a subsample of the National Longitudinal Survey of Youth (NLSY) with union membership information from 1970-88 for 4,434 women aged 14-26 in 1968. The data are available on the Stata and OPR websites.

    . clear
    . use http://data.princeton.edu/wws509/datasets/union
    (NLS Women 14-24 in 1968)

#### Logits

Here is an ordinary logit model, which ignores the fact that each woman contributes several observations:

    . logit union age grade not_smsa south southXt

    Iteration 0:   log likelihood =  -13864.23
    Iteration 1:   log likelihood = -13550.511
    Iteration 2:   log likelihood =  -13545.74
    Iteration 3:   log likelihood = -13545.736

    Logit estimates                                   Number of obs   =      26200
                                                      LR chi2(5)      =     636.99
                                                      Prob > chi2     =     0.0000
    Log likelihood = -13545.736                       Pseudo R2       =     0.0230

    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0099931   .0026737     3.74   0.000     .0047527    .0152335
           grade |   .0483487   .0064259     7.52   0.000     .0357541    .0609432
        not_smsa |  -.2214908   .0355831    -6.22   0.000    -.2912324   -.1517493
           south |  -.7144461   .0612145   -11.67   0.000    -.8344244   -.5944678
         southXt |   .0068356   .0052258     1.31   0.191    -.0034067    .0170779
           _cons |  -1.888256    .113141   -16.69   0.000    -2.110009   -1.666504
    ------------------------------------------------------------------------------

    . estimates store logit

#### Fixed-Effects

Turning to models for panel data, let us try a fixed-effects model first:

    . xtlogit union age grade not_smsa south southXt, i(id) fe
    note: multiple positive outcomes within groups encountered.
    note: 2744 groups (14165 obs) dropped due to all positive or
          all negative outcomes.

    Iteration 0:   log likelihood = -4541.9044
    Iteration 1:   log likelihood = -4511.1353
    Iteration 2:   log likelihood = -4511.1042

    Conditional fixed-effects logistic regression   Number of obs      =     12035
    Group variable (i): idcode                      Number of groups   =      1690

                                                    Obs per group: min =         2
                                                                   avg =       7.1
                                                                   max =        12

                                                    LR chi2(5)         =     78.16
    Log likelihood  = -4511.1042                    Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0079706   .0050283     1.59   0.113    -.0018848    .0178259
           grade |   .0811808   .0419137     1.94   0.053    -.0009686    .1633302
        not_smsa |   .0210368    .113154     0.19   0.853    -.2007411    .2428146
           south |  -1.007318   .1500491    -6.71   0.000    -1.301409   -.7132271
         southXt |   .0263495   .0083244     3.17   0.002      .010034    .0426649
    ------------------------------------------------------------------------------

    . estimates store fixed

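Note how we lost 62% of the women in our sample (2744 out of 4434), who account for 54% of the observations (14165 out of 26200). These are women who didn't have variation in union membership over the period, so they contribute nothing to the conditional likelihood. We will compare the estimates later.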
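As a rough check on the note printed by `xtlogit`, the women without variation in the outcome can be counted directly. This is only a sketch: it assumes the data are still in memory, that `union` has no missing values, and that the helper variables `mean_union` and `one_per_woman` (names made up for this illustration) do not already exist.

```stata
* per-woman mean of the 0/1 outcome: a mean of exactly 0 or 1 means no change over time
bysort idcode: egen mean_union = mean(union)

* flag one record per woman so that each woman is counted once
egen one_per_woman = tag(idcode)

* women who were never, or always, union members
count if one_per_woman & (mean_union == 0 | mean_union == 1)

* shares of women and of observations dropped, using the counts in the note
display 2744/4434
display 14165/26200
```

If the data are as described, the count should agree with the number of groups dropped in the note, and the two ratios come to roughly 0.62 and 0.54.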

#### Random-Effects

Now we fit a random-effects model:

    . xtlogit union age grade not_smsa south southXt, i(id)

    Fitting comparison model:

    Iteration 0:   log likelihood =  -13864.23
    Iteration 1:   log likelihood = -13550.511
    Iteration 2:   log likelihood =  -13545.74
    Iteration 3:   log likelihood = -13545.736

    Fitting full model:

    tau =  0.0     log likelihood = -13545.736
    tau =  0.1     log likelihood = -12926.225
    tau =  0.2     log likelihood = -12419.526
    tau =  0.3     log likelihood = -12003.162
    tau =  0.4     log likelihood = -11656.844
    tau =  0.5     log likelihood =  -11367.53
    tau =  0.6     log likelihood = -11129.716
    tau =  0.7     log likelihood = -10947.266
    tau =  0.8     log likelihood = -10845.532

    Iteration 0:   log likelihood = -10947.266
    Iteration 1:   log likelihood = -10604.628
    Iteration 2:   log likelihood = -10557.905
    Iteration 3:   log likelihood = -10556.297
    Iteration 4:   log likelihood = -10556.294

    Random-effects logistic regression              Number of obs      =     26200
    Group variable (i): idcode                      Number of groups   =      4434

    Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                                   avg =       5.9
                                                                   max =        12

                                                    Wald chi2(5)       =    221.95
    Log likelihood  = -10556.294                    Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0092401   .0044368     2.08   0.037     .0005441    .0179361
           grade |   .0840066   .0181622     4.63   0.000     .0484094    .1196038
        not_smsa |  -.2574574   .0844771    -3.05   0.002    -.4230294   -.0918854
           south |  -1.152854   .1108294   -10.40   0.000    -1.370075   -.9356323
         southXt |   .0237933   .0078548     3.03   0.002     .0083982    .0391884
           _cons |   -3.25016   .2622898   -12.39   0.000    -3.764238   -2.736081
    -------------+----------------------------------------------------------------
        /lnsig2u |   1.669888   .0430016                      1.585607     1.75417
    -------------+----------------------------------------------------------------
         sigma_u |   2.304685   .0495526                      2.209582    2.403882
             rho |   .6175213   .0101565                      .5974278    .6372209
    ------------------------------------------------------------------------------
    Likelihood-ratio test of rho=0: chibar2(01) =  5978.89 Prob >= chibar2 = 0.000

    . estimates store random

#### Comparisons

Here's a table comparing the estimates (we use the `equation` option so Stata can find the correct estimates).

    . estimates table logit random fixed, equation(1)

    -----------------------------------------------------
        Variable |   logit        random       fixed
    -------------+---------------------------------------
    #1           |
             age |  .00999311    .00924011    .00797058
           grade |  .04834865    .08400659    .08118077
        not_smsa | -.22149081   -.25745741    .02103677
           south | -.71444608   -1.1528539   -1.0073178
         southXt |   .0068356    .02379331    .02634948
           _cons | -1.8882564   -3.2501596
    -------------+---------------------------------------
    lnsig2u      |
           _cons |               1.6698883
    -----------------------------------------------------

The main change is in the coefficient of `not_smsa`. You might think this indicates something wrong with the logit and random-effects models, but note that only women who have *moved* between standard metropolitan statistical areas and other places contribute to the fixed-effects estimate. It seems reasonable to believe that these women differ from the rest.
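To get a feel for how many women carry the fixed-effects estimate of `not_smsa`, one can count those whose residence indicator changes over time. The sketch below assumes the data are in memory with no missing values on `union` or `not_smsa`; the variable names `mean_nsmsa`, `mean_u` and `first_rec` are hypothetical helpers introduced only for this illustration.

```stata
* per-woman means of the two 0/1 indicators; a mean strictly between 0 and 1
* means the indicator changed at least once
bysort idcode: egen mean_nsmsa = mean(not_smsa)
bysort idcode: egen mean_u     = mean(union)

* flag one record per woman so that each woman is counted once
egen first_rec = tag(idcode)

* women who moved between SMSA and non-SMSA residence
count if first_rec & mean_nsmsa > 0 & mean_nsmsa < 1

* of those, women who also changed union status and so remain in the
* fixed-effects sample
count if first_rec & mean_nsmsa > 0 & mean_nsmsa < 1 & mean_u > 0 & mean_u < 1
```

The second count is the relevant one here, because a woman must vary in both the outcome and the predictor to inform the conditional estimate.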

With the exception of `age`, the random-effects coefficients are larger in magnitude than the ordinary logit coefficients. This is almost always the case: omitting the random effect biases the coefficients towards zero.
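The size of this attenuation can be gauged with a well-known approximation that is not part of the analysis above: a coefficient that ignores a normal random effect is roughly the subject-specific coefficient divided by sqrt(1 + c² σ²), where c = 16√3/(15π) and σ is the standard deviation of the random effect. The sketch below plugs in the reported `sigma_u`; the scalar names `c2` and `shrink` are made up for this illustration.

```stata
* squared constant in the logistic-normal approximation
scalar c2 = (16*sqrt(3)/(15*_pi))^2

* attenuation factor implied by sigma_u = 2.304685
scalar shrink = sqrt(1 + scalar(c2)*2.304685^2)
display scalar(shrink)

* random-effects coefficients of south and grade, shrunk towards zero
display -1.152854/scalar(shrink)
display  .0840066/scalar(shrink)
```

The factor comes to about 1.68, so the shrunken values for `south` and `grade` land close to the ordinary logit estimates in the table. The approximation is rough, and it does not account for every coefficient equally well.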

#### Intra-class correlation

The random-effects estimate shows an intra-class correlation of 0.6175 (reported as `rho` in the output), indicating a high correlation between a woman's latent propensity to be a union member in different years after controlling for age, education and residence.
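The value of `rho` can be reproduced by hand from the output: on the latent scale the logistic error has variance π²/3, so the intra-class correlation is σ²ᵤ / (σ²ᵤ + π²/3). A minimal check, typing in the `/lnsig2u` estimate from the table:

```stata
* variance of the random effect, recovered from its log (/lnsig2u)
display exp(1.669888)

* latent-scale intra-class correlation: sigma_u^2 / (sigma_u^2 + pi^2/3)
display exp(1.669888) / (exp(1.669888) + _pi^2/3)
```

The first line gives the square of `sigma_u`, and the second should agree with the reported `rho` up to rounding.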

My paper with Elo in the Stata Journal, 2003, shows how this can be interpreted in terms of an odds ratio and translated into measures of manifest correlation using `xtrho`. The command is available from the Stata Journal website; in Stata type `findit xtrho`, or `net describe st0031, from(http://www.stata-journal.com/software/sj3-1)`. For the average woman the correlation between actual union membership in any two years is 0.408 using Pearson's r and 0.769 using Yule's Q.