3.5 Models with Two Predictors
We now turn to logistic regression models with two or more predictors.
The Contraceptive use Data
The rest of the analyses in this chapter will use a small dataset showing contraceptive use by age (<25, 35-29, 30-39 and 40-49), education (none, some) and desire for more children (yes, no). The dataset is described on page 31 of the notes and is available in the datasets section of the course website, so we'll read it directly from the web
. clear
. use http://data.princeton.edu/wws509/datasets/cuse
(Contraceptive Use Data (Fiji, 1976))
. list, clean nolabel
age educ desire cuse n
1. 1 0 0 0 53
2. 1 0 0 1 6
3. 1 0 1 0 10
4. 1 0 1 1 4
5. 1 1 0 0 212
6. 1 1 0 1 52
7. 1 1 1 0 50
8. 1 1 1 1 10
9. 2 0 0 0 60
10. 2 0 0 1 14
11. 2 0 1 0 19
12. 2 0 1 1 10
13. 2 1 0 0 155
14. 2 1 0 1 54
15. 2 1 1 0 65
16. 2 1 1 1 27
17. 3 0 0 0 112
18. 3 0 0 1 33
19. 3 0 1 0 77
20. 3 0 1 1 80
21. 3 1 0 0 118
22. 3 1 0 1 46
23. 3 1 1 0 68
24. 3 1 1 1 78
25. 4 0 0 0 35
26. 4 0 0 1 6
27. 4 0 1 0 46
28. 4 0 1 1 48
29. 4 1 0 0 8
30. 4 1 0 1 8
31. 4 1 1 0 12
32. 4 1 1 1 31
I omitted the labels so you can see the actual numeric codes used. Age is coded 1-4 for the four groups, education is coded as a dummy variable with 1 for 'Some', desire is coded as a dummy variable with 1 for 'Wants no more', and contraceptive use is coded 1 for 'Yes'.
The layout simulates individual data, with one row for each combination of values of the predictors and outcome. The last column is the number of women in that combination of categories, and can be used as a frequency weight in Stata.
For clarity and consistency with earlier work I will create a
new variable called nomore.
. gen nomore = desire == 1
The Age Model
Let us start by fitting a model treating age as a factor, building the dummies 'by hand', and saving the estimates for later use under the name "age"
. gen age2529 = age == 2
. gen age30s = age == 3
. gen age40s = age == 4
. logit cuse age2529 age30s age40s [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -962.76243
Iteration 2: log likelihood = -962.25132
Iteration 3: log likelihood = -962.25091
Iteration 4: log likelihood = -962.25091
Logistic regression Number of obs = 1607
LR chi2(3) = 79.19
Prob > chi2 = 0.0000
Log likelihood = -962.25091 Pseudo R2 = 0.0395
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age2529 | .4606758 .1727255 2.67 0.008 .1221401 .7992116
age30s | 1.048293 .1544406 6.79 0.000 .7455952 1.350991
age40s | 1.424638 .1939574 7.35 0.000 1.044488 1.804788
_cons | -1.507159 .1302529 -11.57 0.000 -1.76245 -1.251868
------------------------------------------------------------------------------
. estimates store age
. display -2*e(ll)
1924.5018
Note that the parameter estimates, standard errors, and model chi-squared are exactly the same as in Section 3.4, when we worked with only four binomial observations, showing that grouping the data does not affect any of these statistics.
The model deviance, however, is different. In Section 3.4 the deviance was zero because the model fitted exactly the four groups. In the lecture notes I work with eight groups and get a deviance of 66.48. Here we are working essentially with individual data and the deviance is simply negative twice the log-likelihood, in this case 1924.5 on 1603 d.f. With individual data the deviance can no longer be used as a goodness of fit test, but we will consider alternative approaches later.
You may want to verify that we also get the same results for the model with 'wants no more childre' as the sole predictor.
The Additive Model
We are now ready to consider a model with both age and desire for no more children:
. logit cuse age2529 age30s age40s nomore [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -938.32767
Iteration 2: log likelihood = -937.40556
Iteration 3: log likelihood = -937.40449
Iteration 4: log likelihood = -937.40449
Logistic regression Number of obs = 1607
LR chi2(4) = 128.88
Prob > chi2 = 0.0000
Log likelihood = -937.40449 Pseudo R2 = 0.0643
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age2529 | .3678306 .1753677 2.10 0.036 .0241163 .711545
age30s | .8077888 .1597537 5.06 0.000 .4946773 1.1209
age40s | 1.022618 .2039339 5.01 0.000 .6229153 1.422322
nomore | .824092 .1171128 7.04 0.000 .5945551 1.053629
_cons | -1.693933 .1352307 -12.53 0.000 -1.958981 -1.428886
------------------------------------------------------------------------------
. di exp(_b[nomore])
2.2798098
. estimates store additive
Compare the results with Table 3.9 on page 26 of the notes. Exponentiating the coefficient of "nomore" we get an odds ratio of 2.28. This means that the odds of using contraception among women who want no more children are double the odds among women in the same age group who do want more children. The model assumes that the odds ratio is the same for every age group, an assumption we will need to test.
To test the significance of the odds ratio we can use the Wald test given in the output, a z-statistic of 7.04 (which can be squared to obtain a chi-squared statistic on one d.f.) The likelihood ratio test would compare the additive model with the age model, which we saved just so we could do this test. Here are the two tests:
. test nomore
( 1) [cuse]nomore = 0
chi2( 1) = 49.52
Prob > chi2 = 0.0000
. lrtest age .
Likelihood-ratio test LR chi2(1) = 49.69
(Assumption: age nested in additive) Prob > chi2 = 0.0000
We obtain chi-squared statistics of 49.5 and 49.7 on one d.f., so there is no doubt that the odds of using contraception in any given age group vary by whether the women want more children.
We verify quickly that we could obtain exactly the same results using factor variables:
. logit cuse i.age nomore [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -938.32767
Iteration 2: log likelihood = -937.40556
Iteration 3: log likelihood = -937.40449
Iteration 4: log likelihood = -937.40449
Logistic regression Number of obs = 1607
LR chi2(4) = 128.88
Prob > chi2 = 0.0000
Log likelihood = -937.40449 Pseudo R2 = 0.0643
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
2 | .3678306 .1753677 2.10 0.036 .0241163 .711545
3 | .8077888 .1597537 5.06 0.000 .4946773 1.1209
4 | 1.022618 .2039339 5.01 0.000 .6229153 1.422322
|
nomore | .824092 .1171128 7.04 0.000 .5945551 1.053629
_cons | -1.693933 .1352307 -12.53 0.000 -1.958981 -1.428886
------------------------------------------------------------------------------
You could use i.nomore, and Stata would recognize that
'no more' is a dummy variable and report the coefficient as
1.more, but leaving out the i. produces
cleaner output.
A Model with an Interaction
We now add an interaction between age and desire for no more children. As usual we compute our own dummies and then fit the model:
. gen nom_age2529 = nomore * age2529
. gen nom_age30s = nomore * age30s
. gen nom_age40s = nomore * age40s
. logit cuse age2529 age30s age40s nomore ///
> nom_age2529 nom_age30s nom_age40s [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -930.01061
Iteration 2: log likelihood = -929.01056
Iteration 3: log likelihood = -929.01009
Iteration 4: log likelihood = -929.01009
Logistic regression Number of obs = 1607
LR chi2(7) = 145.67
Prob > chi2 = 0.0000
Log likelihood = -929.01009 Pseudo R2 = 0.0727
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age2529 | .3681565 .2009279 1.83 0.067 -.0256549 .7619679
age30s | .4506554 .1949898 2.31 0.021 .0684824 .8328283
age40s | .397144 .3401461 1.17 0.243 -.26953 1.063818
nomore | .0639996 .3303183 0.19 0.846 -.5834125 .7114116
nom_age2529 | .2672319 .4091443 0.65 0.514 -.5346762 1.06914
nom_age30s | 1.090493 .3732853 2.92 0.003 .3588674 1.822119
nom_age40s | 1.367148 .4834193 2.83 0.005 .4196637 2.314632
_cons | -1.519287 .1449654 -10.48 0.000 -1.803414 -1.23516
------------------------------------------------------------------------------
I could have used nom_age* as short-hand for the three
dummies, but using wild cards can be dangerous if we later add a
variable with a smilar prefix.
The estimates show a difference by desire for more children of 0.06 in the logit scale for women under age 25, with additional differences of 0.27 for women 25-29, 1.09 for women in their thirties and 1.38 for women in their forties. Clearly the difference is larger the older the age group. (You may want to be careful about saying "increases with age" as this is a cross-section, but maybe that is clear enough from the context.)
Exponentiating these numbers we get an odds ratio of 1.07 (or 7% higher odds of using contraception for those who want no more children) among women under 25, which gets multiplied by 1.3 for women 25-29, by about 3 for women in their thirties and by about 4 for women in their forties. Thus, the odds ratio among women in their forties is four times the odds ratio for women under 25.
We can test the significance of the interaction term using a Wald test or computing a likelihood ratio test that compares this model with the additive model saved earlier:
. test nom_age2529 nom_age30s nom_age40s
( 1) [cuse]nom_age2529 = 0
( 2) [cuse]nom_age30s = 0
( 3) [cuse]nom_age40s = 0
chi2( 3) = 16.03
Prob > chi2 = 0.0011
. lrtest additive .
Likelihood-ratio test LR chi2(3) = 16.79
(Assumption: additive nested in .) Prob > chi2 = 0.0008
we find the interaction to be significant, with a P-value < 0.001, so the ratio of the odds of using contraception among women who do and do not want another child varies by age.
This is not the same as testing for the effect of preferences
for more children. A test for the "main" effect of nomore
would just test for women under age 25 and would find no difference.
To test at all ages together we can compare this model with the age
model, or do a Wald test for all terms involving preferences:
. test nomore nom_age2529 nom_age30s nom_age40s
( 1) [cuse]nomore = 0
( 2) [cuse]nom_age2529 = 0
( 3) [cuse]nom_age30s = 0
( 4) [cuse]nom_age40s = 0
chi2( 4) = 62.45
Prob > chi2 = 0.0000
. lrtest age .
Likelihood-ratio test LR chi2(4) = 66.48
(Assumption: age nested in .) Prob > chi2 = 0.0000
Before leaving this section we note that we can obtain exacly the
same results using factor variables. Obviously we need to use the
i. prefix for age. If you don't say anything about
'no more' but include the hash for the interaction Stata treats
it as a factor. I think the output looks a bit cleaner if we remind
Stata that this is a dummy using the c. prefix:
. logit cuse i.age##c.nomore [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -930.01061
Iteration 2: log likelihood = -929.01056
Iteration 3: log likelihood = -929.01009
Iteration 4: log likelihood = -929.01009
Logistic regression Number of obs = 1607
LR chi2(7) = 145.67
Prob > chi2 = 0.0000
Log likelihood = -929.01009 Pseudo R2 = 0.0727
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
2 | .3681565 .2009279 1.83 0.067 -.0256549 .7619679
3 | .4506554 .1949898 2.31 0.021 .0684824 .8328283
4 | .397144 .3401461 1.17 0.243 -.26953 1.063818
|
nomore | .0639996 .3303183 0.19 0.846 -.5834125 .7114116
|
age#c.nomore |
2 | .2672319 .4091443 0.65 0.514 -.5346762 1.06914
3 | 1.090493 .3732853 2.92 0.003 .3588674 1.822119
4 | 1.367148 .4834193 2.83 0.005 .4196637 2.314632
|
_cons | -1.519287 .1449654 -10.48 0.000 -1.803414 -1.23516
------------------------------------------------------------------------------
We can test the significance of the interaction using our old friend
testparm with just one hash:
. testparm i.age#c.nomore
( 1) [cuse]2.age#c.nomore = 0
( 2) [cuse]3.age#c.nomore = 0
( 3) [cuse]4.age#c.nomore = 0
chi2( 3) = 16.03
Prob > chi2 = 0.0011
Can you reproduce the Wald test for all terms involving preferences?
Reparametrizing Interactions
It may be easier to present the result for this model in terms of odds ratios by desire for more children in the different age groups, as discussed on page 27 of the notes, see also Table 3.10.
To this end we add a dummy variable for the interaction at age < 25, and omit the 'main' effect of wanting no more:
. gen nom_agelt25 = nomore * (age==1)
. logit cuse age2529 age30s age40s ///
> nom_agelt25 nom_age2529 nom_age30s nom_age40s [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -930.01061
Iteration 2: log likelihood = -929.01056
Iteration 3: log likelihood = -929.01009
Iteration 4: log likelihood = -929.01009
Logistic regression Number of obs = 1607
LR chi2(7) = 145.67
Prob > chi2 = 0.0000
Log likelihood = -929.01009 Pseudo R2 = 0.0727
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age2529 | .3681565 .2009279 1.83 0.067 -.0256549 .7619679
age30s | .4506554 .1949898 2.31 0.021 .0684824 .8328283
age40s | .397144 .3401461 1.17 0.243 -.26953 1.063818
nom_agelt25 | .0639996 .3303183 0.19 0.846 -.5834125 .7114116
nom_age2529 | .3312314 .2414308 1.37 0.170 -.1419642 .8044271
nom_age30s | 1.154493 .1738727 6.64 0.000 .8137085 1.495277
nom_age40s | 1.431148 .3529646 4.05 0.000 .7393498 2.122945
_cons | -1.519287 .1449654 -10.48 0.000 -1.803414 -1.23516
------------------------------------------------------------------------------
. mata exp( st_matrix("e(b)")[4..7] )
1 2 3 4
+---------------------------------------------------------+
1 | 1.066091954 1.392682073 3.172413793 4.183497537 |
+---------------------------------------------------------+
The parameters now represent differences in the logit scale between
women who do not want more children and those who do in each of the
four age groups. Exponentiating the coefficients (which you can also
do with the or option) we find that the odds of using
contraception for women who want no more children, compared to those
who do, are 7% higher at age under 25, 39% higher at ages 25-29,
three times as high at age 30-39, and four times as high at age 40-49.
The parametrization used here effectively combines the 'main'
effect of wanting no more children with the 'additional' effect
as one moves to older age groups in a single number for each age
group. This leads to a more direct presentation of the results.
A Deviance Table
The deviances in Table 3.8 in the lecture notes are based on a tabulation of contraceptive use by age and desire which has eight groups. The analyses in this log are based on a table that also breaks use by education, with a total of sixteen groups. However, we can easily compute those deviances by comparing each model of interest with the model we just fit.
. estimates store ageXnom . quietly logit cuse [fw=n] . lrtest . ageXnom Likelihood-ratio test LR chi2(7) = 145.67 (Assumption: . nested in ageXnom) Prob > chi2 = 0.0000 . quietly logit cuse i.age [fw=n] . lrtest . ageXnom Likelihood-ratio test LR chi2(4) = 66.48 (Assumption: . nested in ageXnom) Prob > chi2 = 0.0000 . quietly logit cuse nomore [fw=n] . lrtest . ageXnom Likelihood-ratio test LR chi2(6) = 54.00 (Assumption: . nested in ageXnom) Prob > chi2 = 0.0000 . quietly logit cuse i.age nomore [fw=n] . lrtest . ageXnom Likelihood-ratio test LR chi2(3) = 16.79 (Assumption: . nested in ageXnom) Prob > chi2 = 0.0008
You can also verify these results from the log-likelihoods printed for the various models. You should be able to reproduce the tests for the gross effect of age, the net effect of wanting no more children given age, and the interaction term, from these deviances.
Analysis of Covariance Models
We now treat age as a covariate, using the mid-points of the age groups just as we did before:
. recode age 1=20 2=27.5 3=35 4=45, gen(agem) (32 differences between age and agem)
The first model of interest has a linear effect of age, and is analogous to simple linear regression:
. logit cuse agem [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -963.7238
Iteration 2: log likelihood = -963.45262
Iteration 3: log likelihood = -963.45258
Logistic regression Number of obs = 1607
LR chi2(1) = 76.79
Prob > chi2 = 0.0000
Log likelihood = -963.45258 Pseudo R2 = 0.0383
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agem | .060671 .0071034 8.54 0.000 .0467486 .0745934
_cons | -2.672667 .2332492 -11.46 0.000 -3.129827 -2.215507
------------------------------------------------------------------------------
. lrtest . ageXnom
Likelihood-ratio test LR chi2(6) = 68.88
(Assumption: . nested in ageXnom) Prob > chi2 = 0.0000
. estimates store agem
The estimated slope shows that the odds of using contraception increase about six percent per year of age.
The second model of interest includes additive effects of age and desire for no more children, and is analogous to an analysis of covariance model:
. logit cuse agem nomore [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -939.19023
Iteration 2: log likelihood = -938.50429
Iteration 3: log likelihood = -938.50406
Iteration 4: log likelihood = -938.50406
Logistic regression Number of obs = 1607
LR chi2(2) = 126.69
Prob > chi2 = 0.0000
Log likelihood = -938.50406 Pseudo R2 = 0.0632
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agem | .0441062 .007529 5.86 0.000 .0293497 .0588627
nomore | .8258978 .11711 7.05 0.000 .5963664 1.055429
_cons | -2.516654 .2365293 -10.64 0.000 -2.980243 -2.053065
------------------------------------------------------------------------------
. di exp(_b[nomore])
2.2839303
. lrtest . ageXnom
Likelihood-ratio test LR chi2(5) = 18.99
(Assumption: . nested in ageXnom) Prob > chi2 = 0.0019
We see that the odds of using contraception are 128% higher among women who want no more children than among women who want more and have the same age. The estimated difference in log-odds when we adjust linearly for age is very similar to that obtained by treating age as a factor (0.826 vs. 0.824).
None of these models fits the data very well, so we add an interaction between age and desire for no more children. As usual, we center age before constructing the interaction
. gen agec = agem - 30.6
. gen nomXagec = nomore * agec
. logit cuse agec nomore nomXagec [fw=n]
Iteration 0: log likelihood = -1001.8468
Iteration 1: log likelihood = -934.311
Iteration 2: log likelihood = -933.57774
Iteration 3: log likelihood = -933.57756
Iteration 4: log likelihood = -933.57756
Logistic regression Number of obs = 1607
LR chi2(3) = 136.54
Prob > chi2 = 0.0000
Log likelihood = -933.57756 Pseudo R2 = 0.0681
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
agec | .0218229 .0103662 2.11 0.035 .0015055 .0421403
nomore | .757509 .121842 6.22 0.000 .518703 .996315
nomXagec | .0479913 .015438 3.11 0.002 .0177334 .0782493
_cons | -1.194371 .0785969 -15.20 0.000 -1.348418 -1.040323
------------------------------------------------------------------------------
. mata exp(st_matrix("e(b)"))[1..3]
1 2 3
+-------------------------------------------+
1 | 1.022062801 2.132956375 1.049161578 |
+-------------------------------------------+
. test nomXagec
( 1) [cuse]nomXagec = 0
chi2( 1) = 9.66
Prob > chi2 = 0.0019
The estimates agree with the results in Table 3.12. We see that the odds of using contraception among women who want more children are about two percent higher per year of age. The odds among women who want no more children are double those of women who want more at the mean age (30.6). This ratio is about five percent higher per year of age above the mean (and of course five percent lower per year below the mean).
Another way to look at this result is to note that the odds of using contraception among women who want no more children are about 7 percent higher per year of age. This may be seen more clearly if we parametrize the model using two constants and two slopes
. gen more = 1-nomore
. gen moreXagec = more*agec
. logit cuse more moreXagec nomore nomXagec [fw=n], noconstant
Iteration 0: log likelihood = -1113.8875
Iteration 1: log likelihood = -933.72582
Iteration 2: log likelihood = -933.57757
Iteration 3: log likelihood = -933.57756
Logistic regression Number of obs = 1607
Wald chi2(4) = 300.69
Log likelihood = -933.57756 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
cuse | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
more | -1.194371 .0785969 -15.20 0.000 -1.348418 -1.040323
moreXagec | .0218229 .0103662 2.11 0.035 .0015055 .0421403
nomore | -.4368615 .0931021 -4.69 0.000 -.6193383 -.2543847
nomXagec | .0698143 .01144 6.10 0.000 .0473923 .0922362
------------------------------------------------------------------------------
which is the main body of Table 3.12. Testing for equality of slopes is equivalent to testing the interaction term in the previous specification
. test moreXagec = nomXagec
( 1) [cuse]moreXagec - [cuse]nomXagec = 0
chi2( 1) = 9.66
Prob > chi2 = 0.0019
Plotting Observed and Fitted Logits
Time for a plot. We will reproduce Figure 3.3 in the notes, comparing the analysis of covariance model we just fitted, a model with a quadratic effect of age where the curvature is assumed to be the same for the two groups defined by preferences, and the anova type model which was saturated for the age by preferences table and thus represents observed logits.
. predict lfit, xb . gen agecsq = agec^2 . quietly logit cuse agec agecsq nomore nomXagec[fw=n] . predict qfit, xb . estimates restore nomXage (results nomXage are active now) . predict obs, xb . graph twoway (scatter obs agem) /// > (line lfit agem if more) (lin lfit agem if nomore) /// > (mspline qfit agem if more, bands(4)) /// > (mspline qfit agem if nomore, bands(4)) /// > , title(Figure 3.3: Contraceptive Use by Age and Desire) /// > xtitle(age) ytitle(logit) legend(off) . graph export fig33.png, width(500) replace (file fig33.png written in PNG format)

Visually the model with some curvature provides a better fit, but we have no evidence that it is in fact better than the model with two straight lines with different slopes.
For purposes of presentation you may consider doing an
equivalent plot in the probability scale. Don't forget that
a linear relationship in the logit scale will be nonlinear
in the probability scale, so you should probably use the
function or mspline plot types
to reflect the curvature.
Continue with 3.6 Multi-factor Models: Model Selection

