Eco572: Research Methods in Demography
We start by reading the data from the website.
. infile age n using ///
> http://data.princeton.edu/eco572/datasets/sipp01w3741.dat, clear
(31 observations read)
(a) Since these are cohort data and we are only interested in the experience up to age 37, which is complete for women aged 37-41 at interview, we can compute dx directly, just dividing the frequencies by the total number of women. The other life table functions follow directly. The only time we need to make an assumption about the distribution of events in a year of age is in computing Lx, and we assume a uniform distribution.
. quietly summarize n
. gen dx = n/r(sum) if age < 37
(6 missing values generated)
. gen lx = 1
. replace lx = lx[_n-1] - dx[_n-1] if _n > 1
(30 real changes made, 5 to missing)
. gen qx = dx/lx
(6 missing values generated)
. gen Lx = (lx + lx[_n+1])/2
(6 missing values generated)
. format %8.6f dx qx lx Lx
. list age dx qx lx Lx if age <= 37
+-------------------------------------------------+
| age dx qx lx Lx |
|-------------------------------------------------|
1. | 12 0.000654 0.000654 1.000000 0.999673 |
2. | 13 0.000654 0.000655 0.999346 0.999018 |
3. | 14 0.002618 0.002621 0.998691 0.997382 |
4. | 15 0.008181 0.008213 0.996073 0.991983 |
5. | 16 0.025851 0.026168 0.987893 0.974967 |
|-------------------------------------------------|
6. | 17 0.040903 0.042517 0.962042 0.941590 |
7. | 18 0.077552 0.084192 0.921139 0.882363 |
8. | 19 0.081152 0.096199 0.843586 0.803010 |
9. | 20 0.070353 0.092275 0.762435 0.727258 |
10. | 21 0.062500 0.090307 0.692081 0.660831 |
|-------------------------------------------------|
11. | 22 0.068390 0.108628 0.629581 0.595386 |
12. | 23 0.064463 0.114869 0.561191 0.528959 |
13. | 24 0.050393 0.101449 0.496728 0.471531 |
14. | 25 0.051374 0.115103 0.446335 0.420648 |
15. | 26 0.040903 0.103563 0.394961 0.374509 |
|-------------------------------------------------|
16. | 27 0.035013 0.098891 0.354058 0.336551 |
17. | 28 0.031414 0.098462 0.319045 0.303338 |
18. | 29 0.022906 0.079636 0.287631 0.276178 |
19. | 30 0.017016 0.064277 0.264725 0.256217 |
20. | 31 0.022579 0.091149 0.247709 0.236420 |
|-------------------------------------------------|
21. | 32 0.018652 0.082849 0.225131 0.215805 |
22. | 33 0.013416 0.064976 0.206479 0.199771 |
23. | 34 0.013089 0.067797 0.193063 0.186518 |
24. | 35 0.012762 0.070909 0.179974 0.173593 |
25. | 36 0.010144 0.060665 0.167212 0.162140 |
|-------------------------------------------------|
26. | 37 . . 0.157068 . |
+-------------------------------------------------+
. quietly sum Lx
. di 12 + r(sum)
25.715641
. drop if age > 37
(5 observations deleted)
The average time lived in the single state by age 37.0 is 25.7 years.
(b) To answer these questions we need l20, l25, and l37. I'll store these in scalars for clarity.
. scalar lx20 = lx[9]
. scalar lx25 = lx[14]
. scalar lx37 = lx[26]
. di 1 - lx20
.23756546
. di 1 - lx25
.55366492
. di (lx20-lx25)/lx20
.41459227
. di (lx20-lx25)/(lx20-lx37)
.52216214
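The row subscripts above work because the data are sorted by age starting at 12, so row 9 is age 20. A minimal sketch that looks up the same values by age instead, which may be more robust if rows are dropped or reordered. The loop and the display labels are mine, not part of the original solution, and the labels reflect my reading of the four ratios:

```stata
* Sketch: pick l(x) by age rather than by row position
foreach a in 20 25 37 {
    summarize lx if age == `a', meanonly
    scalar lx`a' = r(mean)
}
* Same four quantities as above, with descriptive labels
display "Married by 20:                          " %6.4f (1 - lx20)
display "Married by 25:                          " %6.4f (1 - lx25)
display "Marry at 20-24, given single at 20:     " %6.4f ((lx20 - lx25)/lx20)
display "Marry before 25, given marry at 20-36:  " %6.4f ((lx20 - lx25)/(lx20 - lx37))
```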
(c) We fit a Hernes model following the suggested procedure.
. gen y = log(qx/(1-lx))
(2 missing values generated)
. gen am = (age+0.5)
. gen am15 = am - 15
. reg y am15
Source | SS df MS Number of obs = 24
-------------+------------------------------ F( 1, 22) = 196.19
Model | 26.4196017 1 26.4196017 Prob > F = 0.0000
Residual | 2.96252171 22 .134660078 R-squared = 0.8992
-------------+------------------------------ Adj R-squared = 0.8946
Total | 29.3821234 23 1.27748363 Root MSE = .36696
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
am15 | -.1515703 .0108211 -14.01 0.000 -.1740119 -.1291288
_cons | .2402122 .131607 1.83 0.082 -.032724 .5131485
------------------------------------------------------------------------------
. scalar r = _b[am15]
. gen z = logit(1-lx)
(1 missing value generated)
. gen x = exp(r*(age-15))
. reg z x
Source | SS df MS Number of obs = 25
-------------+------------------------------ F( 1, 23) = 5546.13
Model | 169.889412 1 169.889412 Prob > F = 0.0000
Residual | .704536948 23 .030632041 R-squared = 0.9959
-------------+------------------------------ Adj R-squared = 0.9957
Total | 170.593949 24 7.10808121 Root MSE = .17502
------------------------------------------------------------------------------
z | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | -7.001133 .0940098 -74.47 0.000 -7.195607 -6.806658
_cons | 1.828685 .0497742 36.74 0.000 1.725719 1.93165
------------------------------------------------------------------------------
. di invlogit(_b[_cons])
.86160494
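The value just displayed has a direct reading. Because r is negative, x = exp(r*(age-15)) goes to zero as age increases, so the fitted logit of the proportion married tends to _b[_cons], and invlogit(_b[_cons]) is the proportion the model expects to marry eventually. A minimal sketch that sets this asymptote next to the proportion observed married by age 37, assuming it is run right after reg z x and that row 26 is still age 37 (the scalar name pinf is mine):

```stata
* Sketch: model asymptote versus observed proportion married by age 37
scalar pinf = invlogit(_b[_cons])
display "Implied proportion ever marrying: " %6.4f pinf
display "Observed married by age 37:       " %6.4f (1 - lx[26])
```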
We then predict the survival function, difference it to obtain marriage frequencies, and construct the plot.
. predict zfit
(option xb assumed; fitted values)
. gen lfit = 1 - invlogit(zfit)
. format %3.1f lx dx
. scatter lx age || line lfit age, ylabels(0(0.2)1) ///
> title(Proportion Single) legend(off) name(A,replace)
. gen dfit = lfit - lfit[_n+1]
(1 missing value generated)
. scatter dx am || line dfit am, xtitle(age) ///
> title(Marriage Frequency) legend(off) name(B,replace)
. graph combine A B, title(Hernes Fit) subtitle(U.S. Women 37-41 in SIPP)
. graph export ps3fig1.png, replace
(file ps3fig1.png written in PNG format)

The fit is not great, suggesting that forecasts based on the model would not work very well.
(d) We fit a Coale-McNeil model. For simplicity we work with marriages by age 37, but you could, if you wanted, adjust the parameters slightly to allow for marriages after age 37.0. This doesn't make a huge difference.
. gen w = dx/(1-lx[26])
(1 missing value generated)
. sum am [aw=w]
Variable | Obs Weight Mean Std. Dev. Min Max
-------------+-----------------------------------------------------------------
am | 25 .999999968 23.61297 5.077051 12.5 36.5
. egen pcm = pnupt(age), mean(23.613) stdev(5.077) pem(.8429)
. gen lcm = 1-pcm
. gen dcm = lcm - lcm[_n+1]
(1 missing value generated)
. scatter lx age || line lcm age, ylabels(0(0.2)1) ///
> title(Proportion Single) legend(off) name(A,replace)
. scatter dx am || line dcm am, xtitle(age) ///
> title(Marriage Frequency) legend(off) name(B,replace)
. graph combine A B, xsize(6) ysize(3) ///
> title(Coale-McNeil Fit) subtitle(U.S. Women 37-41 in SIPP)
. graph export ps3fig2.png, replace
(file ps3fig2.png written in PNG format)

The fit of the Coale-McNeil model is comparable to that of the Hernes model. Note that we could improve the fit of both models by estimating the parameters using maximum likelihood.
We have data from two surveys. I have stored the commands needed to do the analysis in a do file which follows the handout on rates. (See it here.) Note that I pass the year as a parameter to the do file.
. quietly do do\DrFertRates 75
. poisson // replay last estimates
Poisson regression Number of obs = 2254
LR chi2(3) = 146.31
Prob > chi2 = 0.0000
Log likelihood = -2101.5858 Pseudo R2 = 0.0336
------------------------------------------------------------------------------
births | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dur | -.0212533 .0048206 -4.41 0.000 -.0307015 -.0118051
urban | -.1565696 .0860627 -1.82 0.069 -.3252495 .0121102
urbanDur | -.0309201 .008294 -3.73 0.000 -.047176 -.0146641
_cons | -2.509833 .0558191 -44.96 0.000 -2.619237 -2.40043
os | (offset)
------------------------------------------------------------------------------
. display exp(_b[_cons]),exp(_b[_cons]+_b[urban])
.08128177 .06950177
. display 1-exp(_b[dur]), 1-exp(_b[dur]+_b[urbanDur])
.02102907 .05083573
The estimates show similar levels of natural fertility (at duration 0) in rural and urban areas (.0813 and .0695, or 14% lower in urban), but substantially higher control of fertility in urban than rural areas, with declines of 2.10% and 5.08% per year of marriage, respectively. The figure illustrates the duration profile (not required). To obtain predicted levels one would need to specify age as well.
Now for 1980:
. quietly do do\DrFertRates 80
. poisson // replay last estimates
Poisson regression Number of obs = 3600
LR chi2(3) = 287.43
Prob > chi2 = 0.0000
Log likelihood = -3078.2992 Pseudo R2 = 0.0446
------------------------------------------------------------------------------
births | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dur | -.0374255 .0042582 -8.79 0.000 -.0457714 -.0290795
urban | -.1565444 .0719618 -2.18 0.030 -.2975869 -.0155019
urbanDur | -.0282096 .00734 -3.84 0.000 -.0425958 -.0138235
_cons | -2.537071 .0458957 -55.28 0.000 -2.627025 -2.447117
os | (offset)
------------------------------------------------------------------------------
. display exp(_b[_cons]),exp(_b[_cons]+_b[urban])
.07909777 .06763599
. display 1-exp(_b[dur]), 1-exp(_b[dur]+_b[urbanDur])
.0367338 .06352749

Again we see similar levels of natural fertility (at duration 0) in rural and urban areas (.0791 and .0676, or 14% lower in urban), but substantially higher control of fertility in urban than rural areas, with declines of 3.67% and 6.35% per year of marriage, respectively. The figure illustrates the duration profile for 1980.
The main change between 1975 and 1980 is increased control of fertility, particularly in rural areas which went from duration slopes of 2.1 to 3.67 percent per year, compared to 5.08 to 6.35 percent per year in urban areas. So the gap has narrowed somewhat.
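The display commands give point estimates only. A minimal sketch of how one might also get standard errors and confidence intervals for the duration effects, assuming the Poisson estimates for the year of interest are still in memory and the coefficient names are dur and urbanDur as in the output shown:

```stata
* Sketch: run right after -poisson- for a given year
* Rural rate ratio per year of marriage duration
lincom dur, irr
* Urban rate ratio per year: main effect plus interaction
lincom dur + urbanDur, irr
* Proportionate decline per year in urban areas, delta-method standard error
nlcom 1 - exp(_b[dur] + _b[urbanDur])
```

The first two lines report rate ratios, so the proportionate decline is one minus the reported value. The nlcom line reports the decline itself.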
Again I have stored the commands needed to do the analysis in a do file, so I can call it for each survey. (See it here.) I store the quintums, quartiles, and trimean in a matrix. Here are the results for 1975:
. quietly do do\DrBirthInts 75
. mat list q75
q75[3,5]
Quintum Trimean Q1 Q2 Q3
Rural .88741291 19.783598 12.757429 19.440124 27.496717
Town .81248641 21.40552 13.979024 20.189366 31.264324
City .7389999 20.049581 13.131694 18.600382 29.865864

We see a monotonic decrease in quantum as we move from women who grew up in rural areas to those who grew up in small towns and cities, with very small differences in tempo. Curiously, it is women from towns who have slightly longer birth intervals, with no difference between cities and rural areas. Note, however, that only 74% of women who grew up in cities move on from parity two to three; it's just that those who do don't wait any longer than their rural counterparts.
We now look at the 1980 survey.
. quietly do do\DrBirthInts 80
. mat list q80
q80[3,5]
Quintum Trimean Q1 Q2 Q3
Rural .84864891 20.051037 13.164762 20.100345 26.838696
Town .74025041 21.003418 12.467427 20.180363 31.18552
City .54947275 23.025648 13.383775 21.120355 36.478106

We see much larger differences in 1980. The proportion who are moving on to have a third child is now only 55% for women who grew up in cities, and for those who grew up in small towns it is now 74%, the value we saw for city folk five years earlier. We also see the emergence of differences in tempo, with women who grew up in small towns and particularly in cities showing somewhat longer birth intervals than women of rural origin.
. mat q7580 = q75[1..3,1],q80[1..3,1],q75[1..3,2],q80[1..3,2]
. mat colnames q7580 = Q75 Q80 T75 T80
. mat list q7580
q7580[3,4]
Q75 Q80 T75 T80
Rural .88741291 .84864891 19.783598 20.051037
Town .81248641 .74025041 21.40552 21.003418
City .7389999 .54947275 20.049581 23.025648
When we compare results over time we see very large changes in quantum in towns and particularly in cities, and a modest increase in birth interval length that is largely confined to cities.
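The trimean in these matrices agrees with Tukey's trimean of the three quartiles, (Q1 + 2*Q2 + Q3)/4. For rural women in 1975, for example, (12.757 + 2*19.440 + 27.497)/4 = 19.78. A minimal Mata sketch of that check and of the change in quintum between surveys, assuming the Stata matrices q75 and q80 listed above are still in memory:

```stata
mata:
q75 = st_matrix("q75")
q80 = st_matrix("q80")
// trimean from the quartiles in columns 3-5, shown next to stored column 2
(q75[., 3] + 2*q75[., 4] + q75[., 5]) / 4, q75[., 2]
// change in quintum (column 1) between surveys: Rural, Town, City
q80[., 1] - q75[., 1]
end
```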
We use the reports for the DHS surveys in the Philippines in 1993, 1998 and 2003. I will use Mata for my calculations, but they can all be done in Excel.
Cm. This is usually estimated as the ratio of the TFR to the TMFR, but the DHS doesn't publish marital fertility rates by age. We can get a rough estimate of these rates, however, by dividing the age-specific fertility rates by the proportion married at each age, assuming that all births occur within marriage. (An alternative would be to use the proportion of time spent within marriage.)
. mata:
------------------------------------------------- mata (type end to exit) -------------------------------------------
: asfr = ((50\190\217\181\120\51\8) , // 1993 Tab 3.1 p 26
> (46\177\210\155\111\40\7) , // 1998 Tab 3.1 p 32
> (53\178\191\142\ 95\43\5) ) // 2003 Tab 4.1 p 41
: tfr = J(1,7,.005) * asfr
: tfr
1 2 3
+-------------------------+
1 | 4.085 3.73 3.535 |
+-------------------------+
: m = (
> (.047\.384\.663\.779\.816\.81\.773) // 1993 Tab 5.1 p 59
> :+ (.027\.067\.063\.058\.059\.054\.056) ,
> (.048\.345\.643\.769\.797\.775\.783) // 1998 Tab 5.1 p 75
> :+ (.036\.076\.075\.072\.073\.064\.041) ,
> (.039\.369\.664\.771\.800\.790\.787) // 2003 Tab 6.1 p 79
> :+ (.051\.127\.097\.080\.072\.068\.066) )
: m
1 2 3
+----------------------+
1 | .074 .084 .09 |
2 | .451 .421 .496 |
3 | .726 .718 .761 |
4 | .837 .841 .851 |
5 | .875 .87 .872 |
6 | .864 .839 .858 |
7 | .829 .824 .853 |
+----------------------+
: round(asfr :/ m)
1 2 3
+-------------------+
1 | 676 548 589 |
2 | 421 420 359 |
3 | 299 292 251 |
4 | 216 184 167 |
5 | 137 128 109 |
6 | 59 48 50 |
7 | 10 8 6 |
+-------------------+
: tmfr = J(1,7,.005) * (asfr :/ m)
: tmfr
1 2 3
+-------------------------------------------+
1 | 9.089645504 8.142936331 7.652655428 |
+-------------------------------------------+
: end
---------------------------------------------------------------------------------------------------------------------
Unfortunately the age-specific marital fertility rates for the two youngest age groups, where a large proportion of births may in fact occur outside marriage, are just not credible. Taking these rates at face value would overestimate the effect of marriage and would require implausibly high estimates of natural fertility to match the observed TFRs. Clearly some adjustment is necessary.
Fortunately the proportions married have not changed much over time (if anything they may have increased slightly), so the choice of weights is not likely to obscure trends, even if the level of the marriage index may be off. While fairly sophisticated adjustments are possible, I'll simply set the marital fertility rates for the first two age groups equal to the rate for ages 25-29. I also show the unweighted average proportions married for comparison.
. mata
------------------------------------------------- mata (type end to exit) -------------------------------------------
: mfr = asfr :/ m
: mfr[1,] = mfr[3,]
: mfr[2,] = mfr[3,]
: cm = tfr :/ (J(1,7,.005) * mfr)
: cm
1 2 3
+-------------------------------------------+
1 | .6195197068 .5989567799 .6517676938 |
+-------------------------------------------+
: J(1,7,1/7) * m
1 2 3
+-------------------------------------------+
1 | .6651428571 .6567142857 .683 |
+-------------------------------------------+
: end
---------------------------------------------------------------------------------------------------------------------
Cc. We estimate a measure of contraceptive use u as the average of the proportions using in each age group, and a measure of efficiency e as the average effectiveness of the methods used by married women. Below I store proportions using contraception among married women in each age group, ready to average. I also store the method mix and efficiencies borrowed from Bongaarts's paper.
. mata:
------------------------------------------------- mata (type end to exit) -------------------------------------------
: cuse = ((.172\.319\.391\.458\.482\.431\.272) , // 1993 Tab 4.4 p 43
> (.183\.374\.486\.521\.541\.486\.343) , // 1998 Tab 4.4 p 54
> (.256\.427\.513\.534\.566\.499\.377) ) // 2003 Tab 5.4 p 59
: u = J(1,7,1/7) * cuse
: u
1 2 3
+-------------------------------------------+
1 | .3607142857 .4191428571 .4531428571 |
+-------------------------------------------+
: // methods are pill iud inj condom fster mster nat with other
: eff = (.98\.96\.98\.91\1\1\.82\.8\.9) // From Bongaarts p 112
: mix = ((.085\.030\.001\.010\.119\.040\.073\.074\.040), // 1993 Tab 4.4 p 43
> (.099\.037\.024\.016\.103\.001\.089\.089\.008), // 1998 Tab 4.4 p 54
> (.132\.041\.031\.019\.105\.001\.068\.082\.009)) // 2003 Tab 5.4 p 59
: pmix = mix :/ (J(1,9,1)*mix) // divide by sums
: e = J(1,9,1) * (pmix :* eff) // average effectiveness
: e
1 2 3
+-------------------------------------------+
1 | .9242372881 .9141630901 .9259221311 |
+-------------------------------------------+
: cc = 1 :- 1.18 * u :* e
: cc
1 2 3
+-------------------------------------------+
1 | .606605 .5478653832 .5049015 |
+-------------------------------------------+
: end
---------------------------------------------------------------------------------------------------------------------
We see that contraceptive use has increased markedly while efficiency has essentially remained constant.
Ci. The DHS has direct estimates of the length of the post-partum non-susceptible period, so we use those.
. mata:
------------------------------------------------- mata (type end to exit) -------------------------------------------
: i = (8.8, // 1993 Tab 5.8 p 66
> 9.0, // 1998 Tab 5.7 p 83
> 10.0) // 2003 Tab 6.8 p 86
: ci = 20 :/ (18.5 :+ i)
: ci
1 2 3
+-------------------------------------------+
1 | .7326007326 .7272727273 .701754386 |
+-------------------------------------------+
: end
---------------------------------------------------------------------------------------------------------------------
We see that post-partum insusceptibility has increased slightly over time.
As a check on the model we can divide the TFR by the product of the indices. The result is an estimate of total natural fertility assuming no abortion, and should be in the neighborhood of 15.3, or less if there is abortion.
. mata tfr :/ (cm :* cc :* ci)
1 2 3
+-------------------------------------------+
1 | 14.83759801 15.62939561 15.30751845 |
+-------------------------------------------+
The values are not unreasonable, except perhaps for 1993, unless there was some abortion which was replaced by contraceptive use. I would not put much credence on these absolute levels anyway, considering our difficulty estimating total marital fertility.
It should be clear, however, that the fertility decline has been driven almost entirely by increased contraceptive use, as the only index that shows substantial change over time is the index of contraception.
. mata cm\cc\ci
1 2 3
+-------------------------------------------+
1 | .6195197068 .5989567799 .6517676938 |
2 | .606605 .5478653832 .5049015 |
3 | .7326007326 .7272727273 .701754386 |
+-------------------------------------------+
The marriage indices show, if anything, more exposure in 2003 than ten years earlier, while post-partum insusceptibility shows only a modest three-point change.
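One way to put the three indices on a common footing is to look at the proportionate change in each between the first and last survey, alongside the change in the TFR. A minimal Mata sketch, assuming the row vectors tfr, cm, cc and ci computed above are still in Mata's memory (the name idx is mine):

```stata
mata:
idx = cm \ cc \ ci
// proportionate change in each index, 1993 (column 1) to 2003 (column 3)
idx[., 3] :/ idx[., 1] :- 1
// proportionate change in the TFR over the same period
tfr[3] / tfr[1] - 1
end
```

Because the model is multiplicative, the ratios of the three indices, together with the ratio of the implied natural fertility levels, multiply to the ratio of the TFRs.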
Copyright © 2006, Germán Rodríguez, Office of Population Research, Princeton University