Home | GLMs | Multilevel | Survival | Demography | Stata | R

Competing Risks

Singer and Willet (2003) Applied Longitudinal Data Analysis, Oxford, analyze data on the length of service of U.S. Supreme Court Justices, treating death and retirement as competing risks and age at appointment and calendar year of appointment as predictors.

We will conduct a similar analysis using an updated dataset that I created by extracting data from Wikipedia in February 2012. I saved the file in Stata 11 format as justices.dta, so you can read it from the course website.

. use http://data.princeton.edu/pop509/justices.dta

. stset tenure, fail(event)

     failure event:  event != 0 & event < .
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
      103  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. tab event

      event |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          9        8.04        8.04
          1 |         49       43.75       51.79
          2 |         54       48.21      100.00
------------+-----------------------------------
      Total |        112      100.00

The 'event' variable is coded 0 for censored (9 justices currently serving), 1 for those who left the court due to death (49), and 2 for justices who retired or resigned (54).

I ignore the return engagements by John Rutledge and Charles Evans Hughes, who resigned and later were reappointed as Chief Justice.

Overall Survival

The first thing we calculate is the Kaplan-Meier estimator of overall survival

. sts graph

         failure _d:  event
   analysis time _t:  tenure

. graph export justices-km.png, width(400) replace
(file justices-km.png written in PNG format)

The median length of service is about 15.5 years. We generate the kaplan-Meier survival function for later use.

. sts gen km = s

Incidence Functions

The command stcompet, written by V. Coviello and M. Boggess (2004) "Cumulative incidence estimation in the presence of competing risks" Stata Journal, 4(2):103-112, can estimate incidence functions. To install this command type findit stcompet. We specify death as the event of interest and retirement as the competing event.

. streset, fail(event==1)
-> stset tenure, failure(event==1)

     failure event:  event == 1
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       49  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. stcompet ci=ci, compet(2) // you just specify the code

The command generates the cumulative incidence function CIF for the cause of interest and each named competing cause. To plot them you need to select the appropriate event code. Here's the incidence of death

. twoway line ci _t if event==1, c(J) sort ///
>   title(Cumulative incidence of death)

. graph export justices-cid.png, width(400) replace  
(file justices-cid.png written in PNG format)

And here's the incidence of retirement

. twoway line ci _t if event==2, c(J) sort ///
>   title(Cumulative incidence of retirement)

. graph export justices-cir.png, width(400) replace  
(file justices-cir.png written in PNG format)

The Status Plot

I now stack all these estimates together, using the fact that S(t) + I1(t) + I2(t) = 1, so the plot has boundaries 1, 1-I1(t), S(t) and 0.

First we need to extract the CIF for death and fill in the blanks for other types of events.

. sort _t

. gen cid = ci if event==1
(63 missing values generated)

. replace cid = cid[_n-1] if missing(cid)
(60 real changes made)

. replace cid = 0 if missing(cid)
(3 real changes made)

We also add a dummy observation so the plot actually starts at time 0

. expand 2 in -1 // new observation is at the end
(1 observation created)

. replace _d = . in -1
(1 real change made, 1 to missing)

. replace _t = 0 in -1
(1 real change made)

.replace km = 1 in -1
(1 real change made)

. replace cid = 0 in -1
(1 real change made)

We can now compute the boundaries and plot them

. gen b1 = 1

. gen b2 = 1 - cid

. gen b3 = km

. gen b4 = 0

. sort _t

. twoway rarea b1 b2 _t, color("160 160 255") c(J) ///
>   ||   rarea b2 b3 _t, color("255 160 160") c(J) ///
>   ||   rarea b3 b4 _t, color("160 255 160") c(J) ///
>    , legend(off)  xtitle(Years) text(.2 15 "Active") ///
>      text(.5 25 "Died") text(.8 30 "Retired") ///
>      title("Status of justices by years since appointment")     

. graph export justices-status.png, width(400) replace 
(file justices-status.png written in PNG format)

. drop in -1 //remove dummy
(1 observation deleted)

The green area is the survival function, showing the probability of remaining in the court, and the two incidence functions tells us that exits by death and retirement are approximately equally likely.

Cause-Specific Hazards

We now introduce two covariates into our analysis: age, the age of the justice in years when he or she was appointed, and year, the calendar year of the appointment.

We obviously expect survival to improve over time but decline with the age of the justice at the start of his or her tenure. Age should also increase the risk of retirement, but is not clear whether the trend over time is to retire earlier or later.

We start with a Cox regression that looks at overall tenure in the court

. stset tenure, fail(event) // event > 0

     failure event:  event != 0 & event < .
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
      103  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. stcox age year

         failure _d:  event
   analysis time _t:  tenure

Iteration 0:   log likelihood = -388.83948
Iteration 1:   log likelihood = -376.25794
Iteration 2:   log likelihood = -376.18995
Iteration 3:   log likelihood = -376.18993
Refining estimates:
Iteration 0:   log likelihood = -376.18993

Cox regression -- Breslow method for ties

No. of subjects =          112                     Number of obs   =       112
No. of failures =          103
Time at risk    =         1831
                                                   LR chi2(2)      =     25.30
Log likelihood  =   -376.18993                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    1.08164   .0188029     4.51   0.000     1.045408    1.119128
        year |   .9946444   .0017573    -3.04   0.002      .991206    .9980946
------------------------------------------------------------------------------

We see that the risk of leaving the court at any length of service is 8% higher for every year of age at appointment, and about half a percent lower per calendar year. This, of course, confounds trends in mortality with trends in retirement.

To look at the risk of death we reset the data with only death as the event of interest, and everything else treated as censoring

. streset , fail(event==1) // just death
-> stset tenure, failure(event==1)

     failure event:  event == 1
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       49  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. stcox age year

         failure _d:  event == 1
   analysis time _t:  tenure

Iteration 0:   log likelihood = -185.71543
Iteration 1:   log likelihood = -174.87416
Iteration 2:   log likelihood =  -174.8357
Iteration 3:   log likelihood =  -174.8357
Refining estimates:
Iteration 0:   log likelihood =  -174.8357

Cox regression -- Breslow method for ties

No. of subjects =          112                     Number of obs   =       112
No. of failures =           49
Time at risk    =         1831
                                                   LR chi2(2)      =     21.76
Log likelihood  =    -174.8357                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.069978   .0254353     2.85   0.004      1.02127     1.12101
        year |   .9886699    .002668    -4.22   0.000     .9834545     .993913
------------------------------------------------------------------------------

We see that the risk of death while in the court is about 7% higher per year of age at appointment, and declines just over one percent per calendar year of appointment, reflecting general improvements in mortality over time.

To look at the risk of retirement we reset the data with retirement as the event of interest

. streset , fail(event==2) // just retirement
-> stset tenure, failure(event==2)

     failure event:  event == 2
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       54  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. stcox age year

         failure _d:  event == 2
   analysis time _t:  tenure

Iteration 0:   log likelihood = -203.12406
Iteration 1:   log likelihood =  -194.9564
Iteration 2:   log likelihood = -194.70836
Iteration 3:   log likelihood = -194.70805
Refining estimates:
Iteration 0:   log likelihood = -194.70805

Cox regression -- Breslow method for ties

No. of subjects =          112                     Number of obs   =       112
No. of failures =           54
Time at risk    =         1831
                                                   LR chi2(2)      =     16.83
Log likelihood  =   -194.70805                     Prob > chi2     =    0.0002

------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.099232   .0279497     3.72   0.000     1.045794      1.1554
        year |   1.000135    .002477     0.05   0.957      .995292    1.005002
------------------------------------------------------------------------------

We see an increase in the risk of retirement of about 10% per year of age at appointment, and essentially no trend over time, so the risk of retirement depends only on age and not on calendar year of appointment.

The Fine and Gray Model

The Fine and Gray model looks at the sub-hazard of each risk, and can be fit using the stcrreg command. Here we focus on the risk of death and treat retirement as the competing event

. streset , fail(event==1)
-> stset tenure, failure(event==1)

     failure event:  event == 1
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       49  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. stcrreg age year, compete(event==2)

         failure _d:  event == 1
   analysis time _t:  tenure

Iteration 0:   log pseudolikelihood = -210.97156  
Iteration 1:   log pseudolikelihood = -207.63221  
Iteration 2:   log pseudolikelihood = -207.56478  
Iteration 3:   log pseudolikelihood = -207.56476  

Competing-risks regression                        No. of obs       =       112
                                                  No. of subjects  =       112
Failure event  : event == 1                       No. failed       =        49
Competing event: event == 2                       No. competing    =        54
                                                  No. censored     =         9

                                                  Wald chi2(2)     =     15.23
Log pseudolikelihood = -207.56476                 Prob > chi2      =    0.0005

------------------------------------------------------------------------------
             |               Robust
          _t |        SHR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    1.01186   .0182642     0.65   0.514     .9766882    1.048298
        year |   .9911023   .0023471    -3.77   0.000     .9865127    .9957132
------------------------------------------------------------------------------

We see that the probability that a judge will die while in the court doesn't appear to vary with age at appointment but decreases with calendar year of appointment, so judges appointment more recently are less likely to die on the job.

To find out how much more likely we need to translate the risk ratios into something easier to understand.

Predicted Cumulative Incidence

We use stcurve to predict the CIF for death for a justice appointed at age 55 in 1950 and in 2000

. stcurve, cif at1(age=55 year=1950) at2(age=55 year=2000) ///
>         title("Cumulative incidence of death") xtitle(Tenure) ///
>         subtitle("Justice appointed at age 55 in 1950 and 2000") ///
>         text(.20 38 "2000") text(.30 38 "1950") legend(off)

. graph export justices-c55y.png, width(400) replace
(file justices-c55y.png written in PNG format)

Aside: It is instructive to compute these curves 'by hand'. Because predict sets the covariates to zero, it will be advantageous to center age on 55 and year on 1950, so the predicted survival is our first scenario.

. gen cage = age-55

. gen cyear = year - 1950

. stcrreg cage cyear, compete(event==2)

         failure _d:  event == 1
   analysis time _t:  tenure

Iteration 0:   log pseudolikelihood = -210.97156  
Iteration 1:   log pseudolikelihood = -207.63221  
Iteration 2:   log pseudolikelihood = -207.56478  
Iteration 3:   log pseudolikelihood = -207.56476  

Competing-risks regression                        No. of obs       =       112
                                                  No. of subjects  =       112
Failure event  : event == 1                       No. failed       =        49
Competing event: event == 2                       No. competing    =        54
                                                  No. censored     =         9

                                                  Wald chi2(2)     =     15.23
Log pseudolikelihood = -207.56476                 Prob > chi2      =    0.0005

------------------------------------------------------------------------------
             |               Robust
          _t |        SHR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cage |    1.01186   .0182642     0.65   0.514     .9766882    1.048298
       cyear |   .9911023   .0023471    -3.77   0.000     .9865127    .9957132
------------------------------------------------------------------------------

. predict cif1950 , basecif

We know that the model is linear in the c-log-log scale, so we transform the CIF for 1950, add 50 times the year ccoefficient, and then transform back to probabilities.

. gen cif2000= invcloglog( cloglog(cif1950) + 50*_b[cyear] )
(2 missing values generated)

. // line cif1950 cif2000 _t
. list _t cif1950 cif2000 in -1

     +--------------------------+
     | _t    cif1950    cif2000 |
     |--------------------------|
112. | 36   .2967684   .2016365 |
     +--------------------------+

We see that the probability of dying while serving is declining over time. For a justice appointed at age 55, the probability would be 29.7% if appointed in 1950 and 20.2 if appointed in 2000, assuming of course that current trends continue, so the model applies.

Incidence from Cox Regressions

We now show how one can calculate the incidence function after fitting Cox models to the cause-specific hazards. For convenience we center age on 55 and calendar year on 1950. We start by fitting a Cox model to the risk of death

. gen cage = age-55

. gen cyear = year-1950

. stset tenure, fail(event==1)

     failure event:  event == 1
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       49  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. quietly stcox cage cyear, efron

Next we predict the baseline hazard, representing a justice appointed at age 55 in 1950, and the hazard for a justice appointed at the same age in 2000. Because we want just one hazard contribution for each distinct failure time we use by to set the superflous values to missing

. predict h0_d, basehc
(63 missing values generated)

. gsort _t-_d

. by _t: replace h0_d = . if _n > 1 // set to missing for ties
(24 real changes made, 24 to missing)

. gen h1_d = h0_d * exp(50*_b[cyear])
(87 missing values generated)

The next step is to do the exact same calculations for the risk of retirement

. streset, fail(event==2)
-> stset tenure, failure(event==2)

     failure event:  event == 2
obs. time interval:  (0, tenure]
 exit on or before:  failure

------------------------------------------------------------------------------
      112  total obs.
        0  exclusions
------------------------------------------------------------------------------
      112  obs. remaining, representing
       54  failures in single record/single failure data
     1831  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        36

. quietly stcox cage cyear, efron

. predict h0_r, basehc
(58 missing values generated)

. gsort _t-_d

. by _t: replace h0_r = . if _n > 1 // set to missing for ties
(27 real changes made, 27 to missing)

. gen h1_r = h0_r * exp(50*_b[cyear])
(85 missing values generated)

The overall risk is just the sum of the risks of death and retirement. The survival function can be estimated as the product of the conditional survival probabilities for each failure time

To compute a product in Stata we exponentiate the sum of the logs, after setting all missing values to zero. Observations where both hazards are missing don't contribute and can be dropped

. drop if missing(h0_d) & missing(h0_r)
(60 observations deleted)

. foreach h of varlist h0_d h1_d h0_r h1_r {
  2.         quietly replace `h'=0 if missing(`h')
  3. }

. gen S0 = exp(sum(log(1 - h0_d - h0_r)))

. gen S1 = exp(sum(log(1 - h1_d - h1_r)))

We can then calculate the CIF by computing the product of the cause-specific hazard at each failure time by the survival function just before that time, and summing.

. gen cif0 = sum(S0[_n-1] * h0_d )

. gen cif1 = sum(S1[_n-1] * h1_d )

. sort _t

. list _t cif0 cif1 in -1

     +--------------------------+
     | _t       cif0       cif1 |
     |--------------------------|
 52. | 36   .2717366   .1687914 |
     +--------------------------+

Finally we plot the curves

. line cif0 cif1 _t, connect(J J) ///
>         title("Cumulative incidence of death") xtitle(Tenure) ///
>         subtitle("Justice appointed at age 55 in 1950 and 2000") ///
>         note(Based on Cox model of cause-specific hazards) ///
>         text(.17 38 "2000") text(.27 38 "1950") legend(off)

. graph export cif_h.png, width(400) replace
(file cif_h.png written in PNG format)

The general appearance of the graph is very similar to the estimate obtained from the Fine and Gray model, but the overall probabilities of death by the longest observed tenure are estimated as 27.2% and 16.9% instead of 29.7% and 20.2%.

The main difference between the models is that the Cox regressions assume that the log-relative risks of death and retirement are linear on age and year, whereas the Fine-Gray model assumes that the complementary log-log transformation of the CIF is linear on age and year. They agree, however, in estimating a 10 percentage point reduction over the last 50 years in the probability that a justice will die before retiring.