3.3 The Comparison of Two Groups
We start our applications of logit regression with the simplest possible example: a two-by-two table. We study a binary outcome in two groups, and introduce the odds ratio and the logit analogue of the two-sample t test.
3.3.1 A 2-by-2 Table
We will use the contraceptive use data classified by desire for more children, as summarized in Table 3.2.
Desires (i)  Using (y_{i})  Not Using (n_{i}-y_{i})  All (n_{i})
Yes (1)  219  753  972
No (2)  288  347  635
All  507  1100  1607
We treat the counts of users y_{i} as realizations of independent random variables Y_{i} having binomial distributions B(n_{i},π_{i}) for i = 1,2, and consider models for the logits of the probabilities π_{i}.
3.3.2 Testing Homogeneity
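There are only two possible models we can entertain for these data. The first one is the null model. This model assumes homogeneity, so the two groups have the same probability and therefore the same logit
logit(π_{i}) = η,
where π_{i} is the probability of using contraception in group i and η is the common logit.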
The deviance for the null model happens to be 91.7 on one d.f. (two groups minus one parameter). This value is highly significant, indicating that this model does not fit the data, i.e. the two groups classified by desire for more children do not have the same probability of using contraception.
The value of the deviance is easily verified by hand. The estimated common probability (the pooled proportion 507/1607, roughly 0.316), applied to the sample sizes in Table 3.2, leads us to expect 306.7 and 200.3 users of contraception in the two groups, and therefore 665.3 and 434.7 nonusers. Comparing the observed and expected numbers of users and nonusers in the two groups using Equation 3.13 gives 91.7.
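The hand calculation can also be sketched in a few lines of Python. This is only an illustration: it assumes that Equation 3.13 is the usual binomial deviance, twice the sum of observed times log(observed/expected) over users and nonusers, and the function name is ours.

```python
import math

# Counts from Table 3.2: (users y_i, group size n_i) for each group.
groups = [(219, 972), (288, 635)]

def null_model_deviance(groups):
    """Deviance of the common-probability (null) model for grouped binomial data."""
    total_users = sum(y for y, _ in groups)
    total_n = sum(n for _, n in groups)
    p_hat = total_users / total_n  # pooled proportion, 507/1607
    deviance = 0.0
    for y, n in groups:
        expected_users = n * p_hat
        expected_nonusers = n - expected_users
        # Assumed form of Equation 3.13: 2 * sum(observed * log(observed/expected)).
        deviance += 2 * (y * math.log(y / expected_users)
                         + (n - y) * math.log((n - y) / expected_nonusers))
    return deviance

deviance = null_model_deviance(groups)
print(round(deviance, 1))  # 91.7
```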
You can also compare the observed and expected frequencies using Pearson's chi-squared statistic from Equation 3.14. The result is 92.6 on one d.f., and provides an alternative test of the goodness of fit of the null model.
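A similar sketch reproduces Pearson's statistic. It assumes that Equation 3.14 is equivalent to the familiar sum of (observed - expected)^2/expected over the four cells of the table; the function name is illustrative.

```python
# Observed counts from Table 3.2: (users, nonusers) for each group.
observed = [(219, 753), (288, 347)]

def pearson_chi_squared(observed):
    """Pearson's statistic for the hypothesis of a common probability."""
    total_users = sum(users for users, _ in observed)
    total = sum(users + nonusers for users, nonusers in observed)
    p_hat = total_users / total  # pooled proportion under the null model
    statistic = 0.0
    for users, nonusers in observed:
        n = users + nonusers
        expected = (n * p_hat, n * (1 - p_hat))
        for obs, exp in zip((users, nonusers), expected):
            statistic += (obs - exp) ** 2 / exp
    return statistic

chi2 = pearson_chi_squared(observed)
print(round(chi2, 1))  # 92.6
```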
3.3.3 The Odds Ratio
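The other model that we can entertain for the two-by-two table is the one-factor model, where we write
logit(π_{i}) = η + α_{i},
where α_{i} is the effect of group i on the logit. To identify the model we treat the first group (women who desire more children) as the reference cell and set α_{1} = 0, so that η is the logit for the reference cell and α_{2} is the effect of wanting no more children. The parameter estimates are as follows.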
Parameter  Symbol  Estimate  Std. Error  z-ratio
Constant  η  -1.235  0.077  -16.09
Desire  α_{2}  1.049  0.111  9.48
The estimate of η is, as you might expect, the logit of the observed proportion using contraception among women who desire more children, logit(219/972) = -1.235. The estimate of α_{2} is the difference between the logits of the two groups, logit(288/635) - logit(219/972) = 1.049.
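Because the model is fitted to only two groups, both estimates can be checked directly from the observed proportions. The following sketch does so; the variable names are ours.

```python
import math

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))

# Observed proportions using contraception in each group (Table 3.2).
logit_want_more = logit(219 / 972)  # women who desire more children
logit_no_more = logit(288 / 635)    # women who want no more children

eta_hat = logit_want_more                    # constant: logit of the reference group
alpha2_hat = logit_no_more - logit_want_more  # effect of wanting no more children

print(round(eta_hat, 3), round(alpha2_hat, 3))  # -1.235 1.049
```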
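Exponentiating the additive logit model we obtain a multiplicative model for the odds:
π_{i}/(1-π_{i}) = e^{η} e^{α_{i}},
so that e^{η} is the odds for the reference cell and e^{α_{i}} is the ratio of the odds for group i to the odds for the reference cell.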
In our example, the effect of 1.049 in the logit scale translates into an odds ratio of 2.85. Thus, the odds of using contraception among women who want no more children are nearly three times as high as the odds for women who desire more children.
From the estimated logit effect of 1.049 and its standard error we can calculate a 95% confidence interval with boundaries (0.831, 1.267). Exponentiating these boundaries we obtain a 95% confidence interval for the odds ratio of (2.30, 3.55). Thus, we conclude with 95% confidence that the odds of using contraception among women who want no more children are between two and three-and-a-half times the corresponding odds for women who want more children.
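The interval is easy to reproduce. The sketch below starts from the rounded estimate and standard error quoted in the text and assumes the usual normal critical value of 1.96; the variable names are ours.

```python
import math

estimate, std_error = 1.049, 0.111  # logit effect and its standard error
z_crit = 1.96                       # assumed 97.5th percentile of the standard normal

# Confidence interval on the logit scale.
lower = estimate - z_crit * std_error
upper = estimate + z_crit * std_error

# Exponentiate the estimate and the boundaries to move to the odds-ratio scale.
odds_ratio = math.exp(estimate)
or_lower, or_upper = math.exp(lower), math.exp(upper)

print(round(lower, 3), round(upper, 3))
print(round(odds_ratio, 2), round(or_lower, 2), round(or_upper, 2))
```

Note that the interval is symmetric around the estimate on the logit scale but not on the odds-ratio scale.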
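The estimate of the odds ratio can be calculated directly as the cross-product ratio of the frequencies in the two-by-two table. If we let f_{ij} denote the frequency in cell (i,j), with rows indexed by desire for more children (1 = yes, 2 = no) and columns by contraceptive use (1 = using, 2 = not using), then the estimated odds ratio for women who want no more children relative to women who want more is
\hat{θ} = (f_{21} f_{12})/(f_{11} f_{22}) = (288 × 753)/(219 × 347) = 2.85.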
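As a quick check, the cross-product ratio can be computed from the four cell counts of Table 3.2 and compared with the exponentiated logit effect. The orientation used here (odds for women who want no more children divided by odds for women who want more) is chosen to match the odds ratio of 2.85 quoted in the text; the variable names are ours.

```python
import math

# Cell counts from Table 3.2.
users_want_more, nonusers_want_more = 219, 753
users_no_more, nonusers_no_more = 288, 347

# Odds of using contraception in each group.
odds_want_more = users_want_more / nonusers_want_more
odds_no_more = users_no_more / nonusers_no_more

# Cross-product ratio: odds for "no more" relative to "want more".
cross_product_ratio = (users_no_more * nonusers_want_more) / (
    users_want_more * nonusers_no_more)

print(round(cross_product_ratio, 2))            # 2.85
print(round(math.log(cross_product_ratio), 3))  # 1.049, the logit effect
```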
The deviance of this model is zero, because the model is saturated: it has two parameters to represent two groups, so it has to do a perfect job. The reduction in deviance of 91.7 from the null model down to zero can be interpreted as a test of
H_{0}: α_{2} = 0.
An alternative test of this effect is obtained from the m.l.e. of 1.049 and its standard error of 0.111, and gives a z-ratio of 9.48. Squaring this value we obtain a chi-squared of 89.8 on one d.f. This is a Wald test. Note that it is similar, but not identical, to the likelihood ratio test. Recall that in linear models the two tests were identical. In logit models they are only asymptotically equivalent.
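The logit of the observed proportion p_{i} = y_{i}/n_{i} has large-sample variance
var(logit(p_{i})) = 1/y_{i} + 1/(n_{i}-y_{i}).
Because the two groups are independent, the variance of the difference between their logits is the sum of the two variances, 1/219 + 1/753 + 1/288 + 1/347 = 0.0122, and its square root is the standard error of 0.111 reported above.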
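The standard error and the Wald statistic can both be obtained from the cell counts alone. The sketch below assumes the large-sample variance of a sample logit is 1/y_{i} + 1/(n_{i}-y_{i}) and that the two groups are independent, so the variances add; the variable names are ours.

```python
import math

# The four cell counts of Table 3.2: users and nonusers in each group.
cells = [219, 753, 288, 347]

# Variance of the difference of two independent sample logits:
# the sum of the reciprocals of the four cell counts.
variance = sum(1 / f for f in cells)
std_error = math.sqrt(variance)

# Estimated logit effect (log of the cross-product ratio).
log_odds_ratio = math.log((288 * 753) / (219 * 347))

z_ratio = log_odds_ratio / std_error
wald_chi_squared = z_ratio ** 2

print(round(std_error, 3))         # 0.111
print(round(wald_chi_squared, 1))  # 89.8
```

The Wald statistic of 89.8 is close to, but smaller than, the likelihood ratio statistic of 91.7.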
3.3.4 The Conventional Analysis
It might be instructive to compare the results obtained here with the conventional analysis of this type of data, which focuses on the sample proportions and their difference. In our example, the proportions using contraception are 0.225 among women who want another child and 0.453 among those who do not. The difference of 0.228 has a standard error of 0.024 (calculated using the pooled estimate of the proportion). The corresponding z-ratio is 9.62 and is equivalent to a chi-squared of 92.6 on one d.f.
Note that the result coincides with the Pearson chi-squared statistic testing the goodness of fit of the null model. In fact, Pearson's chi-squared and the conventional test for equality of two proportions are one and the same.
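The conventional calculation can be sketched as follows, assuming the standard two-sample test of proportions with a pooled estimate of the common proportion under the null hypothesis. The variable names are ours. Squaring the z-ratio recovers the Pearson chi-squared of 92.6.

```python
import math

# Users and group sizes from Table 3.2.
y1, n1 = 219, 972  # women who want another child
y2, n2 = 288, 635  # women who want no more children

p1, p2 = y1 / n1, y2 / n2
difference = p2 - p1

# Standard error of the difference under the null hypothesis,
# using the pooled proportion for both groups.
pooled = (y1 + y2) / (n1 + n2)
std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

z_ratio = difference / std_error

print(round(difference, 3), round(std_error, 3))
print(round(z_ratio ** 2, 1))  # 92.6, the Pearson chi-squared statistic
```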
In the case of two samples it is debatable whether the group effect is best measured in terms of a difference in probabilities, the odds ratio, or even some other measure such as the relative difference proposed by Sheps (1961). For arguments on all sides of this issue see Fleiss (1973).