## Random Effects Logit Models

The Stata manual has data on union membership from the NLS for 4434 women who were 14-24 in 1968 and were observed between 1 and 12 times.

We read the data from the web and compute `southXt`, an interaction term between `south` and `year` centered on 70.

    . webuse union, clear
    (NLS Women 14-24 in 1968)

    . gen southXt = south * (year-70)

#### Logit Estimates

We first compute logit estimates for later comparison, fitting the same model as in [R] xtlogit, first with conventional and then with clustered standard errors:

    . logit union age grade not_smsa south southXt

    Iteration 0:   log likelihood =  -13864.23
    Iteration 1:   log likelihood = -13550.511
    Iteration 2:   log likelihood =  -13545.74
    Iteration 3:   log likelihood = -13545.736

    Logistic regression                               Number of obs   =      26200
                                                      LR chi2(5)      =     636.99
                                                      Prob > chi2     =     0.0000
    Log likelihood = -13545.736                       Pseudo R2       =     0.0230

    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0099931   .0026737     3.74   0.000     .0047527    .0152335
           grade |   .0483487   .0064259     7.52   0.000     .0357541    .0609432
        not_smsa |  -.2214908   .0355831    -6.22   0.000    -.2912324   -.1517493
           south |  -.7144461   .0612145   -11.67   0.000    -.8344244   -.5944678
         southXt |   .0068356   .0052258     1.31   0.191    -.0034067    .0170779
           _cons |  -1.888256    .113141   -16.69   0.000    -2.110009   -1.666504
    ------------------------------------------------------------------------------

    . estimates store logit

    . logit union age grade not_smsa south southXt, cluster(idcode)

    Iteration 0:   log pseudolikelihood =  -13864.23
    Iteration 1:   log pseudolikelihood = -13550.511
    Iteration 2:   log pseudolikelihood =  -13545.74
    Iteration 3:   log pseudolikelihood = -13545.736

    Logistic regression                               Number of obs   =      26200
                                                      Wald chi2(5)    =     161.37
                                                      Prob > chi2     =     0.0000
    Log pseudolikelihood = -13545.736                 Pseudo R2       =     0.0230

                                  (Std. Err. adjusted for 4434 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0099931   .0039563     2.53   0.012     .0022389    .0177473
           grade |   .0483487    .013936     3.47   0.001     .0210346    .0756628
        not_smsa |  -.2214908   .0713568    -3.10   0.002    -.3613475   -.0816341
           south |  -.7144461   .0919865    -7.77   0.000    -.8947362   -.5341559
         southXt |   .0068356   .0070241     0.97   0.330    -.0069314    .0206026
           _cons |  -1.888256    .207871    -9.08   0.000    -2.295676   -1.480837
    ------------------------------------------------------------------------------

    . estimates store cluster

#### Random Intercepts

The next step is to fit a random-intercepts model and compare the results with the logit estimates:

    . xtlogit union age grade not_smsa south southXt, i(idcode)

    Fitting comparison model:

    Iteration 0:   log likelihood =  -13864.23
    Iteration 1:   log likelihood = -13550.511
    Iteration 2:   log likelihood =  -13545.74
    Iteration 3:   log likelihood = -13545.736

    Fitting full model:

    tau =  0.0     log likelihood = -13545.736
    tau =  0.1     log likelihood = -12926.225
    tau =  0.2     log likelihood = -12419.526
    tau =  0.3     log likelihood = -12003.162
    tau =  0.4     log likelihood = -11656.844
    tau =  0.5     log likelihood =  -11367.53
    tau =  0.6     log likelihood = -11129.716
    tau =  0.7     log likelihood = -10947.266
    tau =  0.8     log likelihood = -10845.532

    Iteration 0:   log likelihood = -10947.312
    Iteration 1:   log likelihood = -10557.296
    Iteration 2:   log likelihood = -10540.582
    Iteration 3:   log likelihood = -10540.367
    Iteration 4:   log likelihood = -10540.367
    Iteration 5:   log likelihood = -10540.366

    Random-effects logistic regression              Number of obs      =     26200
    Group variable: idcode                          Number of groups   =      4434

    Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                                   avg =       5.9
                                                                   max =        12

                                                    Wald chi2(5)       =    227.30
    Log likelihood = -10540.366                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           union |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0093936    .004454     2.11   0.035      .000664    .0181232
           grade |   .0867878   .0176345     4.92   0.000     .0522247    .1213508
        not_smsa |  -.2519379    .082334    -3.06   0.002    -.4133095   -.0905663
           south |  -1.163769   .1114164   -10.45   0.000    -1.382141    -.945397
         southXt |    .023245   .0078497     2.96   0.003     .0078599    .0386302
           _cons |  -3.360131   .2586306   -12.99   0.000    -3.867038   -2.853225
    -------------+----------------------------------------------------------------
        /lnsig2u |   1.749534   .0469964                      1.657423    1.841645
    -------------+----------------------------------------------------------------
         sigma_u |   2.398317   .0563561                      2.290366    2.511356
             rho |   .6361486   .0108779                      .6145729    .6571902
    ------------------------------------------------------------------------------
    Likelihood-ratio test of rho=0: chibar2(01) =  6010.74 Prob >= chibar2 = 0.000

    . estimates store xtlogit

    . estimates table logit xtlogit, eq(1:1)

    ----------------------------------------
        Variable |    logit       xtlogit
    -------------+--------------------------
    #1           |
             age |  .00999311    .00939361
           grade |  .04834865    .08678776
        not_smsa | -.22149081   -.25193788
           south | -.71444608   -1.1637691
         southXt |   .0068356    .02324502
           _cons | -1.8882564   -3.3601312
    -------------+--------------------------
    lnsig2u      |
           _cons |                1.7495341
    ----------------------------------------

Except for age, we see that the subject-specific estimates are larger in magnitude than in the marginal logit model.

The odds of being in a union in 1970 are 69% lower for a woman who lives in the south than for one who doesn't, everything else being equal. The effect declined over time, and by 1988 the odds of belonging to a union were 53% lower in the south than elsewhere.

In contrast, the logit model estimates the effect as 51% lower odds in 1970 and 45% lower in 1988. These estimates can be interpreted as population-average effects.
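The percentages quoted above follow from exponentiating the `south` coefficient plus the interaction times the number of years since 1970. The sketch below redoes that arithmetic with the coefficients typed in by hand from the output; the scalar names are made up for illustration and are not part of the original analysis.

```stata
* Coefficients copied from the xtlogit and logit output above
* (scalar names are hypothetical)
scalar b_south_re = -1.163769
scalar b_sxt_re   =  .023245
scalar b_south_lg = -.7144461
scalar b_sxt_lg   =  .0068356

* southXt = south*(year-70), so the log-odds ratio for south in a given
* year is b_south + b_southXt*(year-70): 0 years in 1970, 18 in 1988
scalar or_re70 = exp(b_south_re)
scalar or_re88 = exp(b_south_re + 18*b_sxt_re)
scalar or_lg70 = exp(b_south_lg)
scalar or_lg88 = exp(b_south_lg + 18*b_sxt_lg)

* Percent reduction in the odds is 100*(1 - odds ratio)
display "xtlogit 1970: " %4.1f 100*(1-or_re70) "% lower odds"
display "xtlogit 1988: " %4.1f 100*(1-or_re88) "% lower odds"
display "logit   1970: " %4.1f 100*(1-or_lg70) "% lower odds"
display "logit   1988: " %4.1f 100*(1-or_lg88) "% lower odds"
```

Rounded to whole percentages, these should agree with the 69%, 53%, 51% and 45% figures in the text.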

Given how different the coefficients are, it doesn't make much sense to compare standard errors. Generally speaking, though, they increase as one moves from the logit model to robust standard errors to the estimates based on the random-intercept model.

#### Intra-class Correlation

Stata reports the intraclass correlation as 0.636. This coefficient pertains to a latent variable reflecting propensity to belong to a union, rather than manifest union membership. The correlation between this propensity in any two years for the same individual is 0.64. We can also say that 64% of the variance in the propensity to belong to a union can be attributed to individuals.
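As a check on where the 0.636 comes from, the sketch below recomputes it from the reported `/lnsig2u`, assuming the usual latent-variable formulation in which the individual-level error is standard logistic with variance π²/3. The scalar names are illustrative only.

```stata
* Log of the random-intercept variance, copied from the xtlogit output
scalar lnsig2u = 1.749534

* Back-transform to the variance and standard deviation of u_i
scalar sigma2_u = exp(lnsig2u)
scalar sigma_u  = sqrt(sigma2_u)

* Latent intra-class correlation: between-woman variance over total,
* with pi^2/3 as the variance of the standard logistic error
scalar rho = sigma2_u / (sigma2_u + _pi^2/3)

display "sigma_u = " %8.6f sigma_u "   rho = " %8.6f rho
```

Up to rounding, the two values should match the `sigma_u` and `rho` rows of the output.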

Using the `xtrho` command we can compute the correlation in actual union membership in any two years for a woman with a median linear predictor:

    . xtrho

    Measures of intra-class manifest association in random-effects logit
    Evaluated at median linear predictor

    Measure          |  Estimate     [95% Conf.Interval]
    -----------------+------------------------------------
    Marginal prob.   |   .225847     .218973     .232833
    Joint prob.      |   .125072     .117026     .133375
    Odds ratio       |   8.29305     7.64627     9.00295
    Pearson's r      |   .423617       .4039     .443193
    Yule's Q         |   .784785     .768686     .800059

We estimate a probability of 23% of belonging to a union in any given year and 13% of belonging in both of two years, much more than the 5% one would expect under independence. The correlation is reflected in an odds ratio of 8.3, so for women at the median the odds of belonging to a union at t_2 are 8.3 times as high for those who belonged to a union at t_1 as for those who didn't. Pearson's r is 0.42 and Yule's Q is 0.78.
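The association measures can be recovered from the marginal and joint probabilities alone. The sketch below builds the 2×2 table for membership in two years and applies the standard definitions, assuming the same marginal probability in both years (which holds when the linear predictor is fixed at its median). It illustrates the arithmetic and is not the internals of `xtrho`; the scalar names are hypothetical.

```stata
* Values copied from the xtrho output above (median linear predictor)
scalar pmarg  = .225847
scalar pjoint = .125072

* Remaining cells of the 2x2 table: member in one year only,
* and member in neither year
scalar p10 = pmarg - pjoint
scalar p00 = 1 - 2*pmarg + pjoint

* Joint probability if the two years were independent
scalar p_indep = pmarg^2

* Standard association measures for a 2x2 table
scalar oddsr   = (pjoint*p00)/(p10^2)
scalar pearson = (pjoint - pmarg^2)/(pmarg*(1-pmarg))
scalar yuleq   = (oddsr-1)/(oddsr+1)

display "joint prob. under independence = " %6.4f p_indep
display "odds ratio  = " %7.4f oddsr
display "Pearson's r = " %7.4f pearson
display "Yule's Q    = " %7.4f yuleq
```

The results should agree with the `xtrho` estimates to about three decimal places, with small differences due to the rounding of the inputs.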

These measures can be computed for women whose observed characteristics make them more or less likely to belong to a union by using the `detail` option:

    . xtrho, detail

    Measures of intra-class manifest association in random-effects logit
    Evaluated with linear predictor set at selected percentiles

    Measure          |        p1        p25        p50        p75        p99
    -----------------+------------------------------------------------------------
    Marginal prob.   |    .10807    .161144    .225847    .251047    .309243
    Joint prob.      |   .046724    .079682    .125072     .14408    .190535
    Odds ratio       |   10.3122    9.09452    8.29305    8.08413    7.73468
    Pearson's r      |    .36357     .39737    .423617    .431096    .444279
    Yule's Q         |     .8232    .801873    .784785    .779836    .771028

The correlation as measured by the odds ratio or Yule's Q is
higher when women are *less* likely to belong to a union,
but the opposite is true if one uses Pearson's r.

For a more detailed discussion of this post-estimation command see my paper with Elo in the Stata Journal 3(1):32--46 (2003).