Current Status Life Tables
We will consider applications of current status life tables to nuptiality and breastfeeding.
Proportions Married
This is the example on page 90 of the textbook, showing proportions single by age for Turkish men in 1990. I start at age 15 rather than 0 because the proportion ever married is zero in the younger age groups.
. clear
. input a n s
a n s
1. 15 3165061 3030203
2. 20 2581153 1853222
3. 25 2435765 629077
4. 30 2096899 180767
5. 35 1784121 77134
6. 40 1418784 43412
7. 45 1111113 26267
8. 50 980115 22527
9. end
Next I compute the proportion single in each age group, estimate the
proportion single at age 50 (s50) as the average of
age groups 45-49 and 50-54, and compute SMAM.
The sum of the proportions single is multiplied by 5, the width of each age group.
I add 15 because that's the age at which I started,
and multiply s50 by 35 (which is 50-15).
. gen p = s/n
. scalar s50 = (p[7]+p[8])/2
. quietly sum p in 1/7 // exclude 50+
. scalar smam = 15 + (5*r(sum)-35*s50)/(1-s50)
. display smam
25.003883
SMAM is 25 years and corresponds to the lighter area in the
figure below scaled by 1-s50. (I create two
variables to split the proportion single into the fraction
that will remain single at age 50 and the rest, so I can
show them in different colors.)
. gen p50=s50
. gen pr = p-p50
. graph bar p50 pr, stack over(a, gap(0) ) legend(off) ///
> bar(2, color(51 102 153)) ///
> title(Proportions Never Married by Age) ///
> subtitle(Turkey Males 1990)
. graph export smamtk90.png, replace
(file smamtk90.png written in PNG format)

SMAM is often interpreted as the mean age at marriage conditional on marrying by age 50. This interpretation applies to real cohorts only if we are willing to assume that nuptiality has been constant over the last 35 years. However, the measure is a useful synthetic summary of period nuptiality even without the assumption of stationarity. (In fact, the unscaled version has a direct interpretation as time lived in the single state by Turkish men in 1990.)
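As a cross-check outside Stata, the SMAM calculation above can be sketched in Python using the same counts. The function name and its arguments are my own, and the sketch assumes that the last two age groups are the ones on either side of age 50. The result should agree with the value displayed above up to rounding, since Stata stores p as a float by default.

```python
# Counts for Turkish men in 1990, as entered above.
n = [3165061, 2581153, 2435765, 2096899, 1784121, 1418784, 1111113, 980115]
s = [3030203, 1853222, 629077, 180767, 77134, 43412, 26267, 22527]


def smam(n, s, start_age=15, width=5, end_age=50):
    """Singulate mean age at marriage from counts by age group.

    Assumes the last two groups straddle end_age (45-49 and 50-54 here).
    """
    p = [si / ni for si, ni in zip(s, n)]      # proportion single per group
    s_end = (p[-2] + p[-1]) / 2                # proportion single at end_age
    single_years = width * sum(p[:-1])         # years lived single up to end_age
    span = end_age - start_age                 # 35 years in this example
    return start_age + (single_years - span * s_end) / (1 - s_end)


print(round(smam(n, s), 4))
```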
Duration of Breastfeeding
In an earlier handout on smoothing I showed that retrospective reports of duration of breastfeeding in Bangladesh (as elsewhere) show very substantial heaping at multiples of 12 (and to a lesser extent 6) months. We now use a current status life table to obtain more reliable estimates.
We start by reading a short extract from the Bangladesh WFS survey.
. use bdbrfeed, clear
(BDSR03 extract)
. desc
Contains data from bdbrfeed.dta
  obs:         3,850                          BDSR03 extract
 vars:            11                          24 Mar 2006 16:58
 size:       103,950 (99.7% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
v000            long   %12.0g
birth           byte   %10.0g
v007            int    %8.0g                  Date of interview
b1              byte   %8.0g       b011       Order within mult birth <1-5>
b2              int    %8.0g       b012       Date of birth <1-5>
b3              byte   %9.0g       b013       Sex of child <1-5>
b4              byte   %20.0g      b014       Age at death <1-5>
b5              byte   %20.0g      b015       Age at death <1-5>
v205            byte   %8.0g       v205       Duration of current pregenancy
v301            byte   %8.0g       v301       Breastfed in open int
last            double %10.0g
-------------------------------------------------------------------------------
Sorted by:  v000
The key variable here is v301,
duration of breastfeeding in the open birth interval,
not so much because we are interested in the actual duration, but
because it tells us whether the child is still being breastfed at
the time of interview.
The extract has all births in the three years before the interview (months -1 to -36), so we get a representative sample of births in a period. (Analyses based on only open or closed intervals are known to be seriously biased.)
I also kept an indicator of whether a birth was the last and whether the
respondent was pregnant (duration of current pregnancy > 0).
These two pieces of information determine whether the birth
starts the open interval, in which case the information in v301
pertains to that child. Here are the retrospective durations, so you
can see the usual evidence of heaping:
. tab v301 if last & v205==0
  Breastfed in open |
                int |      Freq.     Percent        Cum.
--------------------+-----------------------------------
                  0 |          4        0.14        0.14
                  1 |         11        0.38        0.52
                  2 |          7        0.24        0.76
                  3 |          8        0.28        1.04
                  4 |          4        0.14        1.17
                  5 |          6        0.21        1.38
                  6 |         12        0.41        1.80
                  7 |          4        0.14        1.93
                  8 |          3        0.10        2.04
                  9 |          2        0.07        2.11
                 10 |          4        0.14        2.24
                 12 |         25        0.86        3.11
                 16 |          3        0.10        3.21
                 18 |         19        0.66        3.87
                 20 |          1        0.03        3.90
                 21 |          1        0.03        3.94
                 22 |          1        0.03        3.97
                 24 |         71        2.45        6.42
                 25 |          1        0.03        6.46
                 28 |          3        0.10        6.56
                 29 |          1        0.03        6.60
                 30 |         10        0.35        6.94
                 32 |          1        0.03        6.98
                 34 |          2        0.07        7.04
                 36 |          7        0.24        7.29
                 48 |          1        0.03        7.32
Still breastfeeding |      2,416       83.43       90.75
   Until child died |        211        7.29       98.03
 Did not breastfeed |         50        1.73       99.76
         Not stated |          7        0.24      100.00
--------------------+-----------------------------------
              Total |      2,896      100.00
With these data we can obtain current status information for
all births in the sample. If the birth is the last and the mother is
not pregnant, we obtain the status information from v301.
Otherwise we assume that the child has been weaned. (Because we are
using only current status information, we don't need to know the duration.)
. gen age = v007 - b2 // date of interview - date of birth
. gen stillbr = v301 == 96
. replace stillbr = 0 if !last | v205 > 0
(454 real changes made)
. collapse (sum) stillbr (count) n=v000, by(age)
. gen lx = stillbr/n
. scatter lx age || line lx age, legend(off)
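The same tabulation can be sketched in Python to make the logic explicit. This is only an illustration: the record layout (a dict per birth with keys age, last, pregnant and v301) is my own, and the records at the end are made up to exercise the function, not taken from the survey. The code 96 for "still breastfeeding" is the one used in the commands above.

```python
from collections import defaultdict

STILL_BREASTFEEDING = 96  # v301 code used in the Stata commands above


def current_status_lx(births):
    """Proportion still breastfeeding by age (months) from current status data."""
    still = defaultdict(int)
    total = defaultdict(int)
    for b in births:
        # v301 only describes the child who opens the open interval:
        # the last birth of a woman who is not currently pregnant.
        opens_interval = b["last"] and not b["pregnant"]
        is_still = opens_interval and b["v301"] == STILL_BREASTFEEDING
        still[b["age"]] += int(is_still)
        total[b["age"]] += 1
    return {age: still[age] / total[age] for age in sorted(total)}


# Hypothetical records, invented purely for illustration.
toy = [
    {"age": 3, "last": True, "pregnant": False, "v301": 96},
    {"age": 3, "last": True, "pregnant": True, "v301": 96},
    {"age": 3, "last": False, "pregnant": False, "v301": 12},
    {"age": 3, "last": True, "pregnant": False, "v301": 96},
    {"age": 20, "last": True, "pregnant": False, "v301": 18},
    {"age": 20, "last": True, "pregnant": False, "v301": 96},
]
print(current_status_lx(toy))  # {3: 0.5, 20: 0.5}
```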
One slight problem with this "lx" function (shown below) is that it is not monotone. There is an algorithm to force monotonicity known as "pool adjacent violators"; essentially you need to backtrack each time you find a violation of monotonicity and combine adjacent categories until the problem disappears, at which time you resume working forward.
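For readers curious about the algorithm, here is a minimal Python sketch of pool adjacent violators for a non-increasing sequence. It assumes each proportion is weighted by its sample size, which is a natural choice here but my own assumption, and the example values are hypothetical rather than the survey estimates.

```python
def pava_nonincreasing(values, weights):
    """Weighted pool-adjacent-violators fit of a non-increasing sequence."""
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Backtrack: while the newest block is higher than the one before it,
        # pool the two into a single block at their weighted mean.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            v2, w2, k2 = blocks.pop()
            v1, w1, k1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, k1 + k2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)  # every pooled point gets the block mean
    return fitted


# Hypothetical proportions and sample sizes, not taken from the survey.
lx = [0.9, 0.8, 0.85, 0.6, 0.7, 0.5]
n = [100, 100, 100, 50, 150, 100]
print(pava_nonincreasing(lx, n))
```

In this example the second and third points are pooled, as are the fourth and fifth, and the other points are left unchanged.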
A simpler approach is to use a smoother, and the figure shows a regression spline. I fitted it using logistic regression to take into account the fact that the underlying data are binary, but otherwise the basic idea is the same as before.
. bspline, xvar(age) knots(0 12 24 36) p(3) gen(bs)
. quietly blogit still n bs*, noconstant
. predict fit, pr
. twoway scatter lx age || line fit age, legend(off) ///
> xlabels(0 6 12 18 24 30 36) ///
> title(Proportion Still Breastfeeding by Age) ///
> subtitle(Bangladesh WFS 1976)
. graph export bdbrfeed.png, replace
(file bdbrfeed.png written in PNG format)

An important point to note is that there is no evidence of precipitous declines in proportions still breastfeeding after 12 or 24 months, so the heaping of retrospective reports on these values was probably bad data rather than a real phenomenon. This is why current status life tables are the method of choice for duration data on breastfeeding, post-partum amenorrhea, and post-partum abstinence, which are usually poorly reported.
It is clear that upwards of 90% of all children are breastfed; our original tabulation suggests it may be 95%, but our first current status data point is at exact age one month. Moreover, 50% of all children are still breastfeeding at age 26 months. Strictly speaking, the median is defined as the age at which 50% of the children who were ever breastfed are still breastfeeding, and is probably around 29 months.
Note also that almost 20% of the kids are still breastfeeding at age 3 exact years. The time spent breastfeeding in the first three years, computed as the area under the "lx" curve, is
. quietly sum lx
. di r(sum)
22.963591
or almost 23 months. (Using the spline gives 22.8 months.) The mean, of course, is probably a bit higher, depending on how far the 20% upper tail extends.
The incidence-prevalence method estimates duration in a state (such as an illness) as the ratio of prevalence to incidence, or of existing to new cases. To estimate the mean this way we compute the overall proportion still breastfeeding at durations 1-36 and divide by 0.95/36, an estimate of the proportion of these births that start breastfeeding in any one month (95% of births, spread evenly over 36 months):
. sum lx[fw=n]

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          lx |      3850    .6275325    .2118189   .1864407   .9298246

. di r(mean)/(0.95/36)
23.780178
The incidence-prevalence estimate is 23.8 months, consistent with the other results.
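The arithmetic behind this estimate can be written as a small Python function. The function and argument names are my own; the inputs are the weighted mean of lx shown above, the assumed 95% who start breastfeeding, and the 36-month window of births.

```python
def incidence_prevalence_duration(prevalence, initiation, window_months):
    """Mean duration in months as prevalence divided by monthly incidence."""
    incidence_per_month = initiation / window_months  # new cases per month
    return prevalence / incidence_per_month


# 0.6275325 is the weighted mean of lx reported in the output above.
print(round(incidence_prevalence_duration(0.6275325, 0.95, 36), 2))  # 23.78
```

Because the result is inversely proportional to the initiation rate, a lower assumed rate gives a longer estimated duration.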
Note in closing that with current status data all observations are censored:
- Children still breastfeeding are right censored at their current age (all we know is that they will breastfeed longer than that), and
- Children weaned, or who never breastfed, are left censored at their current age (all we know is that they breastfed less than that).
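One way to picture the two cases is as bounds on each child's unobserved duration. The Python sketch below is only an illustration, with a hypothetical function name, and returns the interval implied by a single current status observation.

```python
import math


def censoring_interval(age, still_breastfeeding):
    """Bounds (lower, upper) in months on a child's breastfeeding duration."""
    if still_breastfeeding:
        return (age, math.inf)  # right censored: duration exceeds current age
    return (0, age)             # left censored: duration is below current age


print(censoring_interval(14, True))   # (14, inf)
print(censoring_interval(14, False))  # (0, 14)
```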
