Germán Rodríguez
Generalized Linear Models Princeton University

## Piecewise Exponential Models

This is an illustration of piecewise exponential survival using Stata, relying on the commands `stset` and `stsplit` to create pseudo-observations and `poisson` to fit the model using the Poisson equivalence. Stata can also fit this model using `streg` with `distribution(exponential)` on the split data.

### The Data

The dataset we will consider is analyzed in Wooldridge (2002) and credited to Chung, Schmidt and Witte (1991). The data pertain to a random sample of convicts released from prison between July 1, 1977 and June 30, 1978. Of interest is the time until they return to prison. The information was collected retrospectively by looking at records in April 1984, so the maximum length of observation is 81 months. The data are available from the Stata website in Stata format.

```. use http://www.stata.com/data/jwooldridge/eacsap/recid.dta, clear

. desc, s

Contains data from http://www.stata.com/data/jwooldridge/eacsap/recid.dta
obs:         1,445
vars:            18                          28 Sep 1998 13:28
size:        33,235
Sorted by:
```

The file has a censoring indicator, which we subtract from 1 to get a failure indicator. We also create an id variable and list observation number 9, which goes back to prison after 54 months.

```. gen fail = 1 - cens

. gen id = _n

. list id durat fail in 9

id   durat   fail
9.   9      54      1

```

### Creating Pseudo-Observations

To create pseudo-observations for survival analysis using the piecewise exponential model we `stset` the data making sure we specify an id variable, and then use `stsplit` to split the data into single-year intervals of duration from 0-12 to 48-60 with an open-ended category 60+. The first command generates the built-in variables `_t0` for entering time, `_t` for exit time and `_d` for failure. These are adjusted after the split to reflect what happens in each interval. We compute exposure as the difference between the exit and entering times. I also create a variable for the number of events, but this is not necessary as `_d` would serve the same purpose. We list these variables for individual 9 before and after the split to illustrate how the episodes are created.

```. stset durat, fail(fail) id(id)

id:  id
failure event:  fail != 0 & fail < .
obs. time interval:  (durat[_n-1], durat]
exit on or before:  failure

1445  total observations
0  exclusions
1445  observations remaining, representing
1445  subjects
552  failures in single-failure-per-subject data
80013  total analysis time at risk and under observation
at risk from t =         0
earliest observed entry t =         0
last observed exit t =        81

. list id _t0 _t _d if id==9

id   _t0   _t   _d
9.   9     0   54    1

. stsplit year, at(0 12 24 36 48 60 100)
(5273 observations (episodes) created)

. gen exposure = _t - _t0

. gen events = _d

. list id _t0 _t _d year exposure events if id==9

id   _t0   _t   _d   year   exposure   events
41.   9     0   12    0      0         12        0
42.   9    12   24    0     12         12        0
43.   9    24   36    0     24         12        0
44.   9    36   48    0     36         12        0
45.   9    48   54    1     48          6        1

```

The sample observation going back to prison after 54 months contributes five episodes or pseudo-observations, one each for years one to four with 12 months of exposure and no events, and another for year five with 6 months of exposure and one event. Note also that the variable generated by Stata to identify episodes, here `year`, reflects the time at which the interval starts, the same as `_t0`.

### A PWE Proportional Hazards Model

We are now ready to fit a proportional hazards model with a piecewise exponential baseline where the hazard changes from year to year. We use the same model as Wooldridge(2002), involving ten predictors, all fixed covariates. I specify the offset using the `exposure()` option. I could, of course, take logs and then use the `offset()` option.

```. local predictors workprg priors tserved felon alcohol drugs black married educ age

. poisson events i.year `predictors', exposure(exposure)

Iteration 0:   log likelihood = -2244.0023
Iteration 1:   log likelihood = -2241.5492
Iteration 2:   log likelihood = -2241.5353
Iteration 3:   log likelihood = -2241.5353

Poisson regression                                Number of obs   =       6718
LR chi2(15)     =     312.36
Prob > chi2     =     0.0000
Log likelihood = -2241.5353                       Pseudo R2       =     0.0651

events       Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
year
12      .036532   .1093659     0.33   0.738    -.1778212    .2508851
24    -.3738156   .1296172    -2.88   0.004    -.6278607   -.1197706
36    -.8115436   .1564067    -5.19   0.000    -1.118095   -.5049921
48    -.9382311   .1683272    -5.57   0.000    -1.268146   -.6083159
60    -1.547178   .2033594    -7.61   0.000    -1.945755   -1.148601
workprg    .0838291   .0907983     0.92   0.356    -.0941323    .2617906
priors    .0872458   .0134763     6.47   0.000     .0608327     .113659
tserved    .0130089   .0016863     7.71   0.000     .0097039    .0163139
felon   -.2839252   .1061534    -2.67   0.007     -.491982   -.0758684
alcohol    .4324425   .1057254     4.09   0.000     .2252245    .6396605
drugs    .2747141   .0978667     2.81   0.005     .0828989    .4665293
black     .433556   .0883658     4.91   0.000     .2603622    .6067497
married   -.1540477   .1092154    -1.41   0.158    -.3681059    .0600104
educ   -.0214162   .0194453    -1.10   0.271    -.0595283     .016696
age     -.00358   .0005223    -6.85   0.000    -.0046037   -.0025563
_cons   -3.830127    .280282   -13.67   0.000     -4.37947   -3.280785
ln(exposure)           1  (exposure)

```

The results are exactly the same as in the sister page using R. We see that the risk of recidivism is about the same in the first two years, but then decreases substantially with duration since release. At any given duration felons have 25% lower risk of recidivism than non-felons with the same observed characteristics. Subjects imprisoned for alcohol or drug related offenses have much higher risk of recidivism, everything else being equal.

### Survival Probabilities

We now illustrate the calculation of survival probabilities, starting with the baseline hazard, which requires setting all predictors to zero. There are different ways to do these calculations in Stata, but I will proceed from first principles using Mata. An alternative is to use or create a dummy variable with the required predictor values and then use `predict`. If you go this way make sure to specify the options `xb` and `nooffset` to predict the log-hazard.

Stata stores the constant as the last coefficient, here with index 17 . We add that to the year coefficients to obtain the log-hazard for each year, exponentiate to obtain hazards, and multiply by 12 and sum to obtain the cumulative or integrated hazard. Note that we only need the first five years.

```. mata
// mata (type end to exit)
: b = st_matrix("e(b)")

: logh = b[17] :+ b[1..5]

: h = exp(logh)

: H = runningsum(12 * h)

: S = exp(-H)

: S
1             2             3             4             5
1    .7706798852   .5882188201   .4916958096   .4379748047   .3955311901

: end
```

These calculations apply to the reference cell and are not very meaningful because they set age to zero (and age, by the way, is measured in months).

We will now estimate the probability of staying out of prison for five years given average values of the predictors. In calculating the mean of each predictor we have to be careful to include only one observation per person, so we restrict the calculation to the first interval. The easiest way to compute means in Stata is to `collapse`

```. keep if year == 0
(5273 observations deleted)

. save recid1, replace
file recid1.dta saved

. collapse `predictors'
```

Now that we have the means we multiply each by the corresponding coefficient to obtain the linear predictor `xb`

```. scalar xb = 0

. foreach var of local predictors {
2.         scalar xb = xb + `var' * _b[`var']
3. }

. di "xb=" xb
-.79219682

. mata exp( -H[5] * exp(st_numscalar("xb")) )
.6570278064
```

Thus, the probability of staying out of prison for the average person is 65.7%. We can easily calculate this probability for felons and non-felons keeping all other variables at their means. All we need to do is subtract the coefficient of felon times the mean to get the linear predictor for a non-felon, and then add the coefficient to get the linear predictor for felons:

```. scalar xb0 = xb - felon*_b[felon]

. scalar xb1 = xb0 + _b[felon]

. mata exp( -H[5] * exp(st_numscalar("xb0")) )
.6317763048

. mata exp( -H[5] * exp(st_numscalar("xb1")) )
.7077167906
```

The predicted probability is 70.8% for felons and 63.2% for non-felons when all other characteristics are set to the mean, a difference of 7.6 percentage points.

An alternative calculation sets every person to be a felon or non-felon leaving all other characteristics as they are, and then averages the predicted probability of surviving five years without returning to prison. To do this we need the file with the first episode for each person, which conveniently I saved. I'll also store the cumulative hazard at duration 60 in scalar `H5`

```. use recid1, clear

. scalar drop _all // to reuse names xb0 and xb1

. mata st_numscalar("H5",H[5])

. gen xb0 = 0

. local predictors workprg priors tserved felon alcohol drugs black married educ age

. foreach var of local predictors {
2.         if "`var'" == "felon" continue
3.         quietly replace xb0 = xb0 + `var' * _b[`var']
4. }

. gen xb1 = xb0 + _b[felon]

. gen S0 = exp(-H5 * exp(xb0) )

. gen S1 = exp(-H5 * exp(xb1) )

. sum S0 S1

Variable        Obs        Mean    Std. Dev.       Min        Max
S0       1445    .6118797    .1549424   .0021267   .9595686
S1       1445    .6857928    .1392872   .0097329   .9694076

```

The average probability of staying out of prison for five years is 68.6% if a felon and 61.2% if not, a difference of 7.4 percentage points. This can be interpreted as a discrete marginal effect.