Competing Risks
Let us work through the example in the textbook on multiple decrement and associated single decrement life tables, see Boxes 4.1 and 4.2 on pages 77 and 85.
Rather than just repeat all the calculations, however, I will illustrate a simpler approach to cause-deleted tables based on the assumption that cause-specific hazards are constant within each age interval. This approach gives almost the same results as the textbook, yet allows us to concentrate on the underlying concepts.
We start by reading the data, which have deaths for all causes and from neoplasms for U.S. females in 1991. I also include the lx and nax columns.
. set type double . infile age D Di lx a using /// > http://data.princeton.edu/eco572/datasets/prestonBox41.dat (19 observations read)
Multiple Decrements
The first task is to estimate mortality due to neoplasms in the presence of other causes. We start by estimating overall survival using conventional life table techniques. Then we compute conditional probabilities of death. (These usually would have been computed on the way to lx.)
. gen q = 1-lx[_n+1]/lx (1 missing value generated) . replace q = 1 in -1 (1 real change made)
The conditional probability of dying of a given cause given survival to the age group is obtained multiplying by the ratio of deaths of a given cause to all deaths:
. gen qi = q * Di/D
The unconditional probabilities of dying of any cause and of a given cause are obtained multiplying by the probability of the condition, which is lx
. gen d = lx * q . gen di = lx * qi
Let us print our results
. format %8.5f q qi
. format %8.0fc l d di
. list age l q qi d di
+----------------------------------------------------+
| age lx q qi d di |
|----------------------------------------------------|
1. | 0 100,000 0.00783 0.00003 783 3 |
2. | 1 99,217 0.00168 0.00015 167 14 |
3. | 5 99,050 0.00092 0.00015 91 15 |
4. | 10 98,959 0.00090 0.00012 89 12 |
5. | 15 98,870 0.00236 0.00019 233 19 |
|----------------------------------------------------|
6. | 20 98,637 0.00262 0.00025 258 24 |
7. | 25 98,379 0.00314 0.00041 309 41 |
8. | 30 98,070 0.00425 0.00087 417 85 |
9. | 35 97,653 0.00584 0.00171 570 167 |
10. | 40 97,083 0.00818 0.00314 794 304 |
|----------------------------------------------------|
11. | 45 96,289 0.01330 0.00582 1,281 561 |
12. | 50 95,008 0.02095 0.00963 1,990 915 |
13. | 55 93,018 0.03371 0.01547 3,136 1,439 |
14. | 60 89,882 0.05155 0.02191 4,633 1,969 |
15. | 65 85,249 0.07669 0.02920 6,538 2,489 |
|----------------------------------------------------|
16. | 70 78,711 0.11552 0.03696 9,093 2,909 |
17. | 75 69,618 0.17427 0.04439 12,132 3,091 |
18. | 80 57,486 0.27363 0.05012 15,730 2,881 |
19. | 85 41,756 1.00000 0.10212 41,756 4,264 |
+----------------------------------------------------+
. quietly sum di
. di r(sum)
21204.543
We obtain a total of 21,205 life table deaths due to neoplasms. Thus, the probability that a female will die of this cause under the age-cause-specific rates prevailing in the U.S. in 1991 is 21.2%. Can you compute the probability that a woman who survives to age 50 will die of neoplasms?
In preparation for the next part note that if we had nmx and we were willing to assume that the hazard is constant in each age group we would have had a slightly different estimate of the survival function. Let us "back out" the rates:
. gen n = age[_n+1]-age (1 missing value generated) . gen m = q/(n-q*(n-a)) (1 missing value generated) . replace m = 1/a in -1 (1 real change made)
With these we compute the hazard and survival as
. gen H = sum(n*m) . gen S = 1 . replace S=exp(-H[_n-1]) in 2/-1 (18 real changes made)
The result differs from the survival function in the text by less than 0.1% in all age groups except the last two, where the error is 0.2 and 0.6% respectively.
Cause-Deleted Life Tables
The next question is how female mortality would look if deaths due to neoplasms could be avoided. The honest answer, of course, is that we don't know. But assuming independence of the underlying risks we can do some calculations.
Let me start with the simpler approach here. We compute cause-specific rates dividing deaths of a given cause by person-years of exposure, which is equivalent to multiplying the overall rate by the ratio of deaths of a given cause to the total. Here we want deaths for causes other than neoplasms. I will use the d subscript for deleted:
. gen Rd = (D-Di)/D . gen md = m * R
Then we construct a survival function just as before, but treating this hazard as if it was the only one operating:
. gen Hd = sum(n*md) . gen Sd = 1 . replace Sd = exp(-Hd[_n-1]) in 2/-1 (18 real changes made)
That's it. Really. We take a hazard that represents a subset of causes and build a life table from it. If the hazard is constant within each age group we can also compute person-years lived at each age quite easily
. gen Pd = (Sd-Sd[_n+1])/md (1 missing value generated) . replace Pd = Sd/md in -1 (1 real change made) . quietly sum Pd . di r(sum) 82.391779
So if neoplasm deaths could be avoided female life expectancy at birth would be 82.4 years, under the assumption of independence.
You might be interested to see the shape of the overall, cause-specific, and cause-deleted hazards:
. gen agem = age + n/2
(1 missing value generated)
. gen mi = m - md // rate for neoplasms
. line m md mi agem, xtitle(age) yscale(log) ///
> lp(solid dash solid) ///
> title("Hazards for U.S. Females 1991") ///
> legend(order(1 "All Causes" 2 "No Neoplasms" 3 "Neoplasms") ///
> ring(0) pos(5) cols(1))
. graph export usf91neo.png, replace
(file usf91neo.png written in PNG format)

Now let's review the alternative approach used in the textbook. This uses Chiang's method, which assumes proportionality of cause-specific hazards within an age group (weaker than our constant risk assumption)
. gen pd = (1-q)^Rd . gen ld = 100000 . replace ld = ld[_n-1] * pd[_n-1] in 2/-1 (18 real changes made)
The cause-deleted survival functions computed by the two methods differ by less than 0.1% at all ages except the last two, where the error is 0.2 and 0.5%, respectively.
To compute time lived we need the nax factors. The text uses the Taylor series method for ages 0, 1, 5, and 80, and a quadratic interpolation formula from 10 to 75. Let's do that:
. gen dd = ld-ld[_n+1] // deaths after cause-deletion (1 missing value generated) . replace dd = ld in -1 (1 real change made) . gen qd = dd/ld . gen ad = a/Rd in -1 (18 missing values generated) . // Equation 4.8: . replace ad = n + Rd*(q/qd)*(a-n) /// > if age < 10 | age==80 (4 real changes made) . // Equation 4.6: . replace ad =( (-5/24)*dd[_n-1] + 2.5*dd + (5/24)*dd[_n+1] )/dd /// > if age >= 10 & age <= 75 (14 real changes made)
We can then compute time lived and life expectancy:
. gen Ld = dd*ad + (ld-dd)*n (1 missing value generated) . replace Ld = ld*ad in -1 (1 real change made) . quietly summarize Ld . di r(sum) 8245748.3
We get 82.46 years, in agreement with the text, and very close to the 82.39 computed using the simpler approach.
We now compute life expectancy at every age and print the "official" cause-deleted life table.
. gen Td=r(sum) - sum(Ld) + Ld
. gen ed = Td/ld
. format %8.6f Rd
. format %5.3f a ad
. format %5.2f ed
. format %8.0fc lx ld
. list age Rd lx a ld ad ed
+------------------------------------------------------------+
| age Rd lx a ld ad ed |
|------------------------------------------------------------|
1. | 0 0.996002 100,000 0.152 100,000 0.152 82.46 |
2. | 1 0.913222 99,217 1.605 99,220 1.605 82.10 |
3. | 5 0.835985 99,050 2.275 99,068 2.275 78.23 |
4. | 10 0.862047 98,959 2.843 98,992 2.875 73.29 |
5. | 15 0.919595 98,870 2.657 98,915 2.653 68.34 |
|------------------------------------------------------------|
6. | 20 0.905618 98,637 2.547 98,700 2.548 63.48 |
7. | 25 0.868125 98,379 2.550 98,467 2.577 58.63 |
8. | 30 0.795927 98,070 2.616 98,198 2.585 53.78 |
9. | 35 0.706327 97,653 2.677 97,866 2.582 48.96 |
10. | 40 0.616676 97,083 2.685 97,462 2.637 44.15 |
|------------------------------------------------------------|
11. | 45 0.562189 96,289 2.681 96,969 2.672 39.36 |
12. | 50 0.540143 95,008 2.655 96,242 2.695 34.64 |
13. | 55 0.541050 93,018 2.647 95,148 2.703 30.00 |
14. | 60 0.574919 89,882 2.646 93,399 2.695 25.51 |
15. | 65 0.619308 85,249 2.631 90,600 2.696 21.22 |
|------------------------------------------------------------|
16. | 70 0.680059 78,711 2.628 86,231 2.686 17.16 |
17. | 75 0.745246 69,618 2.618 79,325 2.676 13.42 |
18. | 80 0.816833 57,486 2.570 68,776 2.637 10.07 |
19. | 85 0.897875 41,756 6.539 52,969 7.283 7.28 |
+------------------------------------------------------------+
How much would a woman's life expectancy at 40 change if neoplasm deaths were eliminated?
