Table of Contents The Comparison of Several Groups

3  Logit Models in Stata

Stata has several commands that can be used to fit logistic regression models by maximum likelihood. The basic commands are logit for individual data and blogit for grouped data. There is also a logistic command that presents the results in terms of odd-ratios instead of log-odds and can produce a variety of summary and diagnostic statistics. Finally, one can fit a logistic regression model as a special case of a generalized linear model with Bernoulli or binomial errors and link logit, using the glm command. We will have occassion to use all commands, but we will emphasize the first two which are simpler.

3.3  The Comparison of Two Groups

Following the lecture notes we will consider comparing two groups and then move on to more than two.

3.3.1  A 2-by-2 Table

Consider the data on contraceptive use by desire for more children on Table 3.2 (page 14 of the notes). We can read these data into Stata as 2 observations. To make life easier I will enter desire for more children as a dummy variable that takes the value 1 for women who want no more children

. input nomore users n

        nomore      users          n
  1.       0   219   972
  2.       1   288   635
  3. end

3.3.2  Testing Homegeneity

Fitting the null model is easy. Note that with blogit you specify the outcome in terms of the number of 'successes' and the binomial denominator, here users and n.

. blogit users n

Logit estimates                                   Number of obs   =       1607
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood = -1001.8468                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
    _outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -.7745545   .0536794   -14.43   0.000    -.8797641   -.6693448
------------------------------------------------------------------------------

The estimate of the constant is simply the logit of the overall proportion using contraception, say y/n, and the standard error is the square root of 1/y + 1/(n-y). You may want to check these results by hand.

Stata's blogit does not calculate the model deviance, but we can obtain it 'by hand' using predict to obtain fitted counts:

. predict yhat
(option n assumed; predicted no. of cases)

. gen di = 2*( users*log(users/yhat) + (n-users)*log((n-users)/(n-yhat)) )

. gen DI = sum(di)

. display "Deviance = " DI[_N]
Deviance = 91.674393

Try a similar method to calculate Pearson's chi-squared. It's 92.64. Alternatively, you can fit the model using glm, which reports both the deviance and Pearson's chi-squared by default. Try glm users, family(binomial n).

3.3.3  The Odds Ratio

Let us now fit the (saturated) model with want no more children as a predictor.

. blogit users n nomore

Logit estimates                                   Number of obs   =       1607
                                                  LR chi2(1)      =      91.67
                                                  Prob > chi2     =     0.0000
Log likelihood = -956.00957                       Pseudo R2       =     0.0458

------------------------------------------------------------------------------
    _outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nomore |   1.048629    .110672     9.48   0.000      .831716    1.265542
       _cons |  -1.234993   .0767739   -16.09   0.000    -1.385468   -1.084519
------------------------------------------------------------------------------

Note the estimate of the effect of wanting no more children, about one in the logit scale. Thus, the odds of using contraception among women who want no more children are triple those of women who want more.

The z-statistic is as reported on page 16 of the notes. Let us square it:

. di (_b[nomore]/_se[nomore])^2
89.777635

This is Wald's chi-squared statistic. Another way to get this value is using Stata's test command. Try test nomore.

The chi2 statistic reported by Stata on the second line of output is the likelihood ratio chi-squared comparing the current model with the null model. Can you explain why we get the same 91.67 we had before? Hint: What's the deviance for this model?

A third test of the effect of want no more is given by Pearson's chi-squared, which is 92.64. Note that all three statistics are different, although close in value.

Stata is kind enough to give us a 95% confidence interval for the log-odds. Can you convert that into a 95% CI for the odds-ratio? Simple, we just exponentiate the bounds:

. di exp(0.8315523) "-" exp(1.265706)
2.2968814-3.545595

An even easier way is to type blogit, or. The blogit command without any variables retrieves the results of the last fit. The option or is short for odds-ratios. Either way, the odds of using contraception among women who want no more children are 2.3 to 3.5 times as high as for women who desire more children.


Continue with The Comparison of Several Groups
Copyright © Germán Rodríguez, 1993-2002. Please send feedback to grodri@princeton.edu