The U.S. 2006 Life Table

The file us2006s.dat has two columns, age and the survival function, by single years of age for ages 0 to 110. The data come from the latest period life table for the U.S. I downloaded them from http://www.mortality.org, whose site seems more user-friendly than that of the National Center for Health Statistics, the original source.

Here's how to read and plot the survival function, and how to compute and plot the (log of the) hazard function in Stata. I also include a command to check visually that the Gompertz model provides a good fit to mortality above age 30.

// survival
infile age S using us2006s.dat, clear
twoway line S age, title("Survival Function, U.S. 2006") name(a,replace)

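// hazard
gen H = - log(S)               // cumulative hazard
gen h = H[_n] - H[_n-1]        // average hazard over the year of age; missing in the first row
list in 1/5
gen logh = log(h)
gen agem = age - 0.5 if h < .  // midpoint of the age interval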
twoway line logh agem,  xtitle("age") ///
	title("Hazard Function, U.S. 2006") name(b,replace)

// graph
graph combine a b, xsize(7) ysize(3)
graph export us2006s.png, width(600) replace
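
The hazard calculation above rests on the relation between the survival function and the cumulative hazard. As a short sketch in the notation of the code (S, H and h, with x standing for exact age, a label introduced here), differencing the cumulative hazard gives the integrated hazard over each one-year age interval, which equals the average hazard there and is why it is plotted at the midpoint x - 0.5:

```latex
H(x) = -\log S(x) = \int_0^x h(u)\,du
\quad\Longrightarrow\quad
H(x) - H(x-1) = \int_{x-1}^{x} h(u)\,du = \log\frac{S(x-1)}{S(x)} \approx h\!\left(x - \tfrac{1}{2}\right)
```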

The Gompertz distribution provides a remarkably close fit to the hazard at adult ages, say above 30. Here is a graph with the log hazard and a simple OLS fit:

And these are the commands used to produce the graph:

// Gompertz
twoway (line logh agem if age > 30) ///
       (lfit logh agem if age > 30) ///
	, title("Hazard and Gompertz Fit for Adults, U.S. 2006") ///
	   xtitle("age") legend(off)
graph export us2006g.png, width(400) replace
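
A Gompertz hazard has the form h(x) = a exp(bx), so log h(x) = log a + bx is linear in age, which is why a straight line fitted by OLS works as a visual check. As a rough sketch that is not part of the original commands, the same line can be estimated with regress to look at the implied parameters. It assumes the variables logh and agem created above are still in memory.

```stata
// Gompertz parameters from the OLS line (sketch; same sample as the graph)
regress logh agem if age > 30

// the slope estimates b, the yearly increase in the log hazard
display "hazard ratio per year of age: " exp(_b[agem])
display "years for the hazard to double: " log(2)/_b[agem]
```

The constant estimates log a. This is a descriptive fit to the log hazard curve rather than a maximum likelihood estimate of the Gompertz model.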