Eco572: Research Methods in Demography

Problem Set 4
Due May 10, 2006

[1] Marriage in France

The datasets section has data on first marriage frequencies in France obtained from last year's course blackboard site. The data are available in an Excel spreadsheet called FranceAfmPeriodAge.xls and as an ASCII file called FranceAfmPeriodAge.dat. In both datasets the rows represent years from 1968 to 2000 and the columns represent ages from 15 to 50. Each entry is the number of first marriages per 10,000 women at the start of the year.

Note that these are marriage frequencies, not hazard rates; the denominator includes married as well as single women. Also, age is defined as the age a woman turns in a calendar year, and can be taken to represent on average exact rather than completed years (so there's no need to add 0.5).

(a) Compute and plot period total marriage rates (the sum of the marriage frequencies at all ages) and period mean ages at marriage. You should find that the proportion who would ever marry according to these period estimates declined from 88% to a low below 50% in the early nineties, but has recovered a bit since. Period mean age at first marriage as estimated from these frequencies has increased steadily over time.
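
As a rough illustration of the two period summaries, here is a minimal Python sketch for a single calendar year. The function name and the toy numbers are made up for illustration; it assumes the frequencies are per 10,000 women and, as noted above, treats age as exact so that no 0.5 is added.

```python
def period_summary(freq_row, ages, per=10_000):
    """Return (total marriage rate, mean age at marriage) for one year.

    freq_row: first marriages per `per` women, one entry per age.
    ages:     the matching ages, treated as exact ages.
    """
    total = sum(freq_row)
    tmr = total / per  # proportion who would ever marry
    mean_age = sum(a * f for a, f in zip(ages, freq_row)) / total
    return tmr, mean_age


# Toy data with three ages only (not the French series).
ages = [20, 21, 22]
row = [1000, 2000, 1000]
tmr, mean_age = period_summary(row, ages)
```

Applying the function to each row of the data gives the two series to plot against year.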

(b) Compute the Bongaarts-Feeney tempo-adjusted total first marriage rates. Use numerical derivatives of mean age at first marriage as recommended by B-F, i.e. using years t-1 and t+1 for t. Plot the adjusted and unadjusted TMRs against year and comment. (This plot should be combined with part a.)
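
The adjustment can be sketched as follows, assuming the usual form in which the observed rate is divided by one minus the annual change in mean age, with that change estimated by a central difference over years t-1 and t+1. The function name is hypothetical and the numbers are toy values; the first and last years are left missing because the central difference is undefined there.

```python
def bf_adjust(tmr, mean_age):
    """Tempo-adjusted total rates: tmr[t] / (1 - r[t]),
    with r[t] = (mean_age[t+1] - mean_age[t-1]) / 2.
    """
    n = len(tmr)
    adjusted = [None] * n  # endpoints stay missing
    for t in range(1, n - 1):
        r = (mean_age[t + 1] - mean_age[t - 1]) / 2.0
        adjusted[t] = tmr[t] / (1.0 - r)
    return adjusted


# Toy series: mean age rising 0.1 years per year.
adjusted = bf_adjust([0.8, 0.7, 0.6], [22.0, 22.1, 22.2])
```

With mean age rising, the adjusted rate exceeds the observed one, which is the direction of effect to look for in the plot.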

(c) The series is only 33 years long, so we don't have the complete experience for any cohort! Fortunately most marriages occur well before age 50. Construct an array of cohort marriage frequencies for the cohorts turning 15 each year from 1968 to 2000, using missing values for all values that lie in the future. Accumulate these frequencies and plot them by age for the first 20 cohorts. You should find that each successive cohort has followed a lower marriage curve than its predecessors. (We will discuss in class how to go about re-organizing the data by cohort.)
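
One possible way to reorganize the data, sketched in Python under the assumption that the period data are held as a list of rows (one per year) with one column per age. Because age is the age a woman turns in the calendar year, the cohort turning 15 in year index c is observed at age index a in year index c + a, so each cohort is a diagonal of the period array. The function names are hypothetical, and None stands in for a missing value.

```python
def to_cohorts(period):
    """Rearrange a year-by-age array into a cohort-by-age array.

    Cells that would fall after the last observed year are set to None.
    """
    n_years = len(period)
    n_ages = len(period[0])
    return [
        [period[c + a][a] if c + a < n_years else None for a in range(n_ages)]
        for c in range(n_years)
    ]


def cumulate(row):
    """Running sum of a cohort row, keeping None where data are missing."""
    out, total = [], 0.0
    for f in row:
        if f is None:
            out.append(None)
        else:
            total += f
            out.append(total)
    return out


# Toy 3-year by 3-age array (not the French data).
period = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
cohorts = to_cohorts(period)
```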

(d) Divide the cohort marriage frequencies by the complement of the cumulative cohort frequencies at the start of each year of age to estimate proper hazard rates. Plot these for the first 20 cohorts. (All these plots look better if you omit the legends.) A notable feature of your results should be that the hazard for ages below 30 has declined and moved to older ages with only a slight increase in hazards above 30. (One could use these hazard rates to construct period life tables of age at marriage. That would be my preferred way to produce period estimates.)
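
A minimal sketch of the conversion for one cohort, assuming frequencies per 10,000 women and None for ages not yet observed. The function name is hypothetical. The denominator at each age is one minus the proportion already married at the start of that age.

```python
def cohort_hazards(freq_row, per=10_000):
    """Convert cohort marriage frequencies into hazard rates."""
    hazards = []
    ever_married = 0.0  # proportion married at the start of the age
    for f in freq_row:
        if f is None:
            hazards.append(None)
            continue
        p = f / per
        hazards.append(p / (1.0 - ever_married))
        ever_married += p
    return hazards


# Toy cohort: 10% marry at the first age, 18% at the second.
h = cohort_hazards([1000, 1800, None])
```

In the toy example the second hazard is 0.18 / 0.90, which is larger than the frequency because only 90% of the cohort is still single.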

(e) What do you think the future holds for the cohort that turned 15 in 1987 (the most recent one in the last two plots)? They turned 28 in 2000 and only 41.2% had married.

[2] Mortality in the U.S.

File usm3301.dat has mortality rates for the U.S., both sexes combined, from 1933 to 2001, obtained from the Human Mortality Database. We will use the data from 1989 to 2001 to examine how well the Lee-Carter forecasts are doing and to make our own forecast with a 50-year horizon.

(a) Find the forecast value of k for 2001, and its standard deviation. Estimate the observed value of k by simple linear regression: take the log of the 2001 rates, subtract ax, and regress the result on bx, forcing the regression through the origin. (The a's and b's are in ASCII form in LeeCarterAb.dat, and as a Stata file LeeCarterAb.dta, both in the datasets section of the website.) Compare the observed value with the forecast interval.
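
For a regression through the origin the slope has a closed form, the sum of bx times (log mx - ax) divided by the sum of bx squared. A minimal sketch, with a hypothetical function name and toy parameters rather than the values in LeeCarterAb.dat:

```python
import math


def fit_k(rates, a, b):
    """Least-squares k in log(m_x) = a_x + b_x * k, with no intercept."""
    y = [math.log(m) - ax for m, ax in zip(rates, a)]
    return sum(bx * yx for bx, yx in zip(b, y)) / sum(bx * bx for bx in b)


# Toy check: build rates from a known k and recover it.
a = [-5.0, -3.0]
b = [0.6, 0.4]
true_k = -2.0
rates = [math.exp(ax + bx * true_k) for ax, bx in zip(a, b)]
k_hat = fit_k(rates, a, b)
```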

(b) Use the ax and bx values together with the forecast value of kt and its standard deviation to produce a 95% confidence band for the age-specific mortality rates in 2001, and plot the observed values. Work only with ages up to 80-84, omitting the special procedures needed for ages above 85. (You may try this for 2000 to check your method against the published forecasts.)
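
One simple way to build the band is to push the interval for k through the model, as in the sketch below. This is an assumption about the intended method: it reflects only the uncertainty in k and ignores errors in ax and bx and the age-specific residuals. The function name and the numbers are illustrative.

```python
import math


def rate_band(a, b, k_hat, k_sd, z=1.96):
    """Return a (lower, upper) pair of rates for each age group,
    from exp(a_x + b_x * (k_hat -/+ z * k_sd))."""
    band = []
    for ax, bx in zip(a, b):
        r1 = math.exp(ax + bx * (k_hat - z * k_sd))
        r2 = math.exp(ax + bx * (k_hat + z * k_sd))
        band.append((min(r1, r2), max(r1, r2)))  # safe for either sign of b_x
    return band


# Toy values, not the Lee-Carter estimates.
band = rate_band([-5.0], [0.5], k_hat=-2.0, k_sd=1.0)
lower, upper = band[0]
```

Plotting the band and the observed rates on a log scale makes departures at individual ages easier to see.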

(c) A problem with using the model for forecasting is that it doesn't fit exactly in the jumpoff year. An alternative is to set a to the log of the rates for the last year and k to 0, thereby ensuring a perfect fit at the start of the projection. Lee and Carter worried that this might extrapolate idiosyncratic features of the last year and preferred using the average pattern. In retrospect they think this was a mistake, as the 0.6 year error in life expectancy in 1989 caused bias in the forecasts for the next ten years. Check how much better the fit for 2001 would be using a correct jumpoff. (For comparison with part b stick to ages below 85.)
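
With a set to the log of the last observed rates and k rebased to 0, the point forecast h years ahead reduces to the last rates times exp(bx * drift * h), assuming k follows a random walk with constant drift. A minimal sketch with a hypothetical function name and toy inputs (the drift shown is not the published estimate):

```python
import math


def jumpoff_forecast(last_rates, b, drift, horizon):
    """Point forecast of rates `horizon` years after the jumpoff year,
    with a_x = log(last rates) and k = 0 in the jumpoff year."""
    return [m * math.exp(bx * drift * horizon) for m, bx in zip(last_rates, b)]


# Toy example: one age group, k falling 0.4 per year, 5 years ahead.
forecast = jumpoff_forecast([0.01], [0.5], drift=-0.4, horizon=5)
```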

(d) Produce a new forecast for 2050 using 2000 as a jumpoff and assuming that the other parameters (b and the drift) hold. This time use all ages up to 105 and forecast expectation of life. In constructing the life table assume that risk is constant in each age interval except for the first three age groups, where nax values of 0.25, 1.5 and 2.2 are more suitable. Provide 95% confidence bands for life expectancy at birth in 2050.
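
A minimal sketch of the life table step, assuming the rates are ordered by age with an open-ended last group. Constant risk gives nqx = 1 - exp(-n * mx) and nLx = ndx / mx. Where an nax value is supplied, the sketch uses the standard conversion nqx = n * mx / (1 + (n - nax) * mx). The function name and argument layout are hypothetical.

```python
import math


def life_expectancy(rates, widths, nax=None):
    """Life expectancy at birth from age-specific rates.

    rates:  one rate per age group; the last group is open-ended.
    widths: widths of the closed groups (one fewer than rates).
    nax:    optional {group index: a} for groups not using constant risk.
    """
    nax = nax or {}
    lx, person_years = 1.0, 0.0
    last = len(rates) - 1
    for i, m in enumerate(rates):
        if i == last:
            person_years += lx / m  # open interval: L = l / m
            break
        n = widths[i]
        if i in nax:
            a = nax[i]
            q = n * m / (1.0 + (n - a) * m)
            d = lx * q
            L = n * (lx - d) + a * d
        else:
            q = 1.0 - math.exp(-n * m)
            d = lx * q
            L = d / m
        person_years += L
        lx -= d
    return person_years  # radix is 1, so this is e0


# Toy check: a constant rate of 0.02 at all ages implies e0 = 50.
e0_const = life_expectancy([0.02] * 4, [1, 4, 5])
e0_nax = life_expectancy([0.02] * 4, [1, 4, 5], nax={0: 0.25, 1: 1.5})
```

One way to get a band for life expectancy is to compute it at the lower and upper limits of the forecast k, which again captures only the uncertainty in k.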

(e) Find another set of projections, preferably made in the last ten years, and compare their forecast with part d. The projections made by the Social Security Administration may be of particular interest.

[3] The Population of Malaysia

File Malaysia85rev.dat in the datasets section has data for Malaysian women in 1985. I found the data in Wachter's textbook (page 275) but he credits Keyfitz and Flieger. The counts of population and deaths are for women, but births are for both sexes. Wachter notes that there were 208,422 boy babies and 198,384 girl babies born in Malaysia in 1985.

To save you some time I computed a life table using standard methods with nax values of 0.2 and 1.5 for the first two age groups, and added the person-years lived in each age to the dataset. You need to know that life expectancy at age 85 is 6.803 more years. Time lived at ages 85 and above is 1.50744.

(a) Use this information to project the female population of Malaysia to 2000 and 2015 assuming constant fertility and mortality. How well do we do for 2000? (Note that you will need to combine data for ages 0 and 1-4 to obtain equal-width intervals.)
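
One projection step of the cohort-component method can be sketched as below, assuming five-year age groups, person-years per newborn (radix 1), an open-ended last group whose person-years entry is the time lived above its lower bound, and fertility rates that count births of both sexes. The function name and the toy inputs are hypothetical, not the Malaysian data.

```python
def project_step(pop, L, fert, frac_female, width=5):
    """Project a female population forward one period of `width` years.

    pop:  women by age group, last group open-ended.
    L:    person-years lived per newborn in each group.
    fert: annual births (both sexes) per woman in each group.
    """
    n = len(pop)
    new = [0.0] * n
    for i in range(n - 2):  # closed groups survive into the next one
        new[i + 1] = pop[i] * L[i + 1] / L[i]
    # the open group receives survivors from the last two groups
    new[n - 1] = (pop[n - 2] + pop[n - 1]) * L[n - 1] / (L[n - 2] + L[n - 1])
    # births: rates times the average of start and end populations;
    # the youngest group is skipped because it has no fertility
    births = sum(
        width * fert[i] * (pop[i] + new[i]) / 2.0 for i in range(1, n)
    )
    new[0] = births * frac_female * L[0] / width
    return new


# Toy example with three age groups.
new_pop = project_step(
    pop=[100.0, 100.0, 100.0],
    L=[4.8, 4.5, 3.0],
    fert=[0.0, 0.2, 0.0],
    frac_female=0.5,
)
```

For this problem the fraction female at birth would come from the counts quoted above, 198,384 / (208,422 + 198,384), or about 0.488. Three steps take the population from 1985 to 2000.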

(b) Compute the intrinsic rate of growth and the stable equivalent age distribution (i) from the Leslie matrix, (ii) using the methods in Box 7.1 of the textbook.
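
As a cross-check on either method, the intrinsic rate can also be found by solving the discrete form of Lotka's equation numerically: the sum over age groups of exp(-r * (x + 2.5)) times person-years per newborn times the maternity rate (female births only) equals 1. The sketch below uses bisection. The function name, the search bounds and the toy inputs are assumptions for illustration, and this is not necessarily the procedure in Box 7.1.

```python
import math


def intrinsic_rate(ages, L, maternity, lo=-0.1, hi=0.1, tol=1e-12):
    """Solve sum(exp(-r * (x + 2.5)) * L_x * m_x) = 1 for r by bisection.

    ages:      lower bounds of the five-year age groups.
    L:         person-years lived per newborn in each group.
    maternity: female births per woman per year in each group.
    """
    def excess(r):
        return sum(
            math.exp(-r * (x + 2.5)) * Lx * mx
            for x, Lx, mx in zip(ages, L, maternity)
        ) - 1.0

    # excess(r) decreases in r, so a positive value means r is too low
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Toy check: all childbearing at ages 25-29 with a net reproduction rate of 2.
r = intrinsic_rate([25], [4.0], [0.5])
```

From the Leslie matrix, the corresponding figure is the log of the dominant eigenvalue divided by the length of the projection interval.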

(c) Suppose fertility dropped to replacement level in 1985. How much would the (female) population of Malaysia still grow? Answer this question (i) by carrying out a projection for 100 years (20 projection periods) under the new regime, (ii) using the Preston-Guillot method described in the textbook, (iii) using Keyfitz's approximation (either variant).

(d) Plot the current age distribution and the stationary equivalent and comment on the implications for population momentum.