7.6 Discrete Time Models
We discuss briefly two extensions of the proportional hazards model
to discrete time, starting with a definition of the hazard and
survival functions in discrete time and then proceeding to models
based on the logit and the complementary log-log transformations.
7.6.1 Discrete Hazard and Survival
Let $T$ be a discrete random variable that takes the values
$t_1 < t_2 < \cdots$ with probabilities
$$ f_j = \Pr\{T = t_j\}. $$
We define the survivor function at time $t_j$ as
the probability that the survival time $T$ is at least $t_j$:
$$ S(t_j) = S_j = \Pr\{T \ge t_j\} = \sum_{k=j}^{\infty} f_k. $$
Next, we define the hazard at time $t_j$ as
the conditional probability of dying at that time given
that one has survived to that point, so that
$$ \lambda(t_j) = \lambda_j = \Pr\{T = t_j \mid T \ge t_j\} = \frac{f_j}{S_j}. \tag{7.17} $$
Note that in discrete time the hazard is a conditional probability
rather than a rate. However, the general result expressing the
hazard as a ratio of the density to the survival function is
still valid.
A further result of interest in discrete time
is that the survival function at time $t_j$ can be written in terms of
the hazard at all prior times $t_1, \ldots, t_{j-1}$, as
$$ S_j = (1-\lambda_1)(1-\lambda_2)\cdots(1-\lambda_{j-1}). \tag{7.18} $$
In words, this result states that in order to survive to
time $t_j$ one must first survive $t_1$, then one must survive
$t_2$ given that one survived $t_1$, and so on, finally
surviving $t_{j-1}$ given survival up to that point.
This result is analogous to the result linking the survival
function in continuous time to the integrated or cumulative
hazard at all previous times.
An example of a survival process that takes place in discrete time
is time to conception measured in menstrual cycles. In this case
the possible values of $T$ are the positive integers, $f_j$ is
the probability of conceiving in the $j$-th cycle, $S_j$ is the
probability of conceiving in the $j$-th cycle or later, and $\lambda_j$
is the conditional probability of conceiving in the $j$-th cycle
given that conception had not occurred earlier.
The result relating the survival function to the hazard states
that in order to get to the j-th cycle without conceiving, one has
to fail in the first cycle, then fail in the second given that
one didn't succeed in the first, and so on, finally failing in the
(j-1)-st cycle given that one hadn't succeeded yet.
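The definitions in this subsection can be checked numerically. The following is a minimal sketch, assuming made-up per-cycle conception probabilities (the list `f` and all function names are hypothetical, chosen only for illustration). It computes the survival function from its definition, the hazard from Equation 7.17, and then recovers the survival function from the hazards using Equation 7.18.

```python
# Hypothetical conception probabilities f_j for cycles 1..5 (illustrative only).
# The probability mass not listed here belongs to cycles after the fifth.
f = [0.30, 0.20, 0.12, 0.08, 0.05]
tail = 1.0 - sum(f)  # Pr{T > t_5}


def survival(f, tail):
    """S_j = Pr{T >= t_j}: sum of f_k for k >= j, plus the unlisted tail."""
    S = []
    remaining = sum(f) + tail
    for fj in f:
        S.append(remaining)
        remaining -= fj
    return S


def hazard(f, S):
    """Discrete hazard f_j / S_j, as in Equation 7.17."""
    return [fj / Sj for fj, Sj in zip(f, S)]


def survival_from_hazard(lam):
    """S_j as the product of (1 - hazard) over all earlier times (Equation 7.18)."""
    S, prod = [], 1.0
    for lj in lam:
        S.append(prod)
        prod *= 1.0 - lj
    return S


S = survival(f, tail)
lam = hazard(f, S)
S_check = survival_from_hazard(lam)

for j, (Sj, lj, Sc) in enumerate(zip(S, lam, S_check), start=1):
    print(f"cycle {j}: S = {Sj:.4f}  hazard = {lj:.4f}  S via 7.18 = {Sc:.4f}")
```

The two survival columns should agree up to rounding, which is all that Equation 7.18 asserts.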
7.6.2 Discrete Survival and Logistic Regression
Cox (1972) proposed an extension of the proportional hazards model
to discrete time by working with the conditional odds of dying at each
time $t_j$ given survival up to that point. Specifically, he
proposed the model
$$ \frac{\lambda(t_j|x_i)}{1-\lambda(t_j|x_i)} = \frac{\lambda_0(t_j)}{1-\lambda_0(t_j)} \exp\{x_i'\beta\}, $$
where $\lambda(t_j|x_i)$ is the hazard at time $t_j$ for an
individual with covariate values $x_i$, $\lambda_0(t_j)$ is
the baseline hazard at time $t_j$, and $\exp\{x_i'\beta\}$
is the relative risk associated with covariate values $x_i$,
measured here as a ratio of conditional odds.
Taking logs, we obtain a model on the logit of the
hazard or conditional probability of dying at $t_j$ given survival
up to that time,
$$ \operatorname{logit} \lambda(t_j|x_i) = \alpha_j + x_i'\beta, \tag{7.19} $$
where $\alpha_j = \operatorname{logit} \lambda_0(t_j)$ is the logit of the
baseline hazard and $x_i'\beta$ is the effect of the covariates on
the logit of the hazard.
Note that the model essentially treats time as a discrete
factor by introducing one parameter $\alpha_j$ for each
possible time of death $t_j$.
Interpretation of the parameters $\beta$ associated with
the other covariates follows along the same
lines as in logistic regression.
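To make the interpretation concrete, here is a small sketch under assumed values: three made-up baseline hazards and a single binary covariate with a hypothetical coefficient of 0.7. Under the logit model the ratio of conditional odds between the two groups is the same at every time point and equals the exponentiated coefficient, whereas the ratio of the hazards themselves changes with the baseline.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values, for illustration only.
baseline_hazard = [0.05, 0.10, 0.20]  # baseline hazard at t_1, t_2, t_3
beta = 0.7                            # coefficient of a binary covariate x

odds_ratios, hazard_ratios = [], []
for j, lam0 in enumerate(baseline_hazard, start=1):
    alpha_j = logit(lam0)              # time-specific intercept
    lam1 = inv_logit(alpha_j + beta)   # hazard when x = 1
    odds_ratio = (lam1 / (1.0 - lam1)) / (lam0 / (1.0 - lam0))
    hazard_ratio = lam1 / lam0
    odds_ratios.append(odds_ratio)
    hazard_ratios.append(hazard_ratio)
    print(f"t_{j}: odds ratio = {odds_ratio:.4f}  hazard ratio = {hazard_ratio:.4f}")

print(f"exp(beta) = {math.exp(beta):.4f}")
```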
In fact, the analogy with logistic regression goes further:
we can fit the discrete-time proportional-hazards model
by running a logistic regression on a set of pseudo observations
generated as follows.
Suppose individual $i$ dies or is censored at time point $t_{j(i)}$.
We generate death indicators $d_{ij}$
that take the value one if individual $i$ died at time $t_j$ and
zero otherwise, generating one for each discrete time from
$t_1$ to $t_{j(i)}$.
To each of these indicators we associate a copy of the covariate
vector $x_i$ and a label $j$ identifying the time point.
The proportional hazards model in Equation 7.19 can then be fit
by treating the $d_{ij}$ as independent Bernoulli observations
with probability given by the hazard $\lambda_{ij}$ for
individual $i$ at time point $t_j$.
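The construction of pseudo-observations can be sketched as follows. This is an illustrative implementation only: the record layout, the function name `expand_person_period`, and the four example individuals are assumptions, not part of the notes. Each individual contributes one row per time point from the first through the last one observed, with the death indicator equal to one only in the last row of someone who died.

```python
def expand_person_period(records):
    """Expand one record per individual into one row per time point at risk.

    records: iterable of (ident, last_period, died, x), where last_period is
    the 1-based index j(i) of the time point at which the individual died or
    was censored, died is True for a death and False for censoring, and x is
    the covariate value (a scalar here, for simplicity).
    """
    rows = []
    for ident, last_period, died, x in records:
        for j in range(1, last_period + 1):
            d = 1 if (died and j == last_period) else 0
            rows.append({"id": ident, "period": j, "d": d, "x": x})
    return rows


# Hypothetical data: (id, last time point, died?, covariate).
records = [
    ("a", 3, True, 1.0),
    ("b", 4, False, 0.0),
    ("c", 1, True, 0.0),
    ("d", 2, False, 1.0),
]

rows = expand_person_period(records)
for row in rows:
    print(row)
```

The resulting rows can be passed to any logistic regression routine, with `d` as the response, `period` entered as a factor, and `x` as a covariate.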
More generally, we can group pseudo-observations with identical
covariate values.
Let $d_{ij}$ denote the number of deaths and $n_{ij}$ the total
number of individuals with covariate values $x_i$ observed at
time point $t_j$. Then we can treat $d_{ij}$ as binomial with
parameters $n_{ij}$ and $\lambda_{ij}$, where the latter
satisfies the proportional hazards model.
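Grouping amounts to counting deaths and individuals at risk within each combination of time point and covariate pattern. A minimal sketch, assuming pseudo-observations stored as hypothetical `(period, x, d)` tuples:

```python
from collections import defaultdict


def group_pseudo_observations(rows):
    """Collapse Bernoulli pseudo-observations into binomial cells.

    rows: iterable of (period, x, d) tuples, with d equal to 0 or 1.
    Returns a dict mapping (period, x) to a (deaths, n) tuple.
    """
    cells = defaultdict(lambda: [0, 0])
    for period, x, d in rows:
        cells[(period, x)][0] += d  # deaths in the cell
        cells[(period, x)][1] += 1  # individuals observed in the cell
    return {key: tuple(counts) for key, counts in cells.items()}


# Hypothetical pseudo-observations: (time point, covariate value, death indicator).
rows = [
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 0),
    (2, 0, 0), (2, 1, 0), (2, 1, 1),
]

cells = group_pseudo_observations(rows)
for (period, x), (deaths, n) in sorted(cells.items()):
    print(f"period {period}, x = {x}: {deaths} deaths out of {n}")
```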
The proof of this result runs along the same lines as the
proof of the equivalence of the Poisson likelihood and the
likelihood for piece-wise exponential survival data under
non-informative censoring in Section 7.4.3,
and relies on Equation 7.18, which writes the probability
of surviving to time $t_j$ as a product of the complements of the
conditional hazards at all previous times.
It is important to note that we do not assume that the
pseudo-observations are independent and have a Bernoulli or binomial
distribution. Rather, we note that the likelihood function for the
discrete-time survival model under non-informative censoring
coincides with the binomial likelihood that would be obtained
by treating the death indicators as independent Bernoulli or
binomial.
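The coincidence of the two likelihoods can be verified numerically. The sketch below uses hypothetical parameter values and four made-up individuals. It computes the log-likelihood of the discrete-time survival model directly, using the probability of dying at the observed time for deaths and the probability of surviving past the observed time for censored cases, and compares it with the Bernoulli log-likelihood of the corresponding pseudo-observations under the logit model.

```python
import math


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical parameters: logit baseline hazards at t_1..t_4 and one coefficient.
alpha = [-2.0, -1.5, -1.0, -0.5]
beta = 0.5

# Hypothetical data: (last time point j(i), died?, covariate x).
data = [(3, True, 1.0), (4, False, 0.0), (1, True, 0.0), (2, False, 1.0)]


def hazards(x):
    """Hazard at each time point for covariate value x under the logit model."""
    return [inv_logit(a + beta * x) for a in alpha]


def survival_loglik(data):
    total = 0.0
    for last, died, x in data:
        lam = hazards(x)
        if died:
            # Pr{T = t_j}: survive all earlier time points, then die at t_j.
            prob = lam[last - 1] * math.prod(1.0 - l for l in lam[:last - 1])
        else:
            # Pr{T > t_j}: survive every time point up to and including t_j.
            prob = math.prod(1.0 - l for l in lam[:last])
        total += math.log(prob)
    return total


def bernoulli_loglik(data):
    total = 0.0
    for last, died, x in data:
        lam = hazards(x)
        for j in range(1, last + 1):
            d = 1 if (died and j == last) else 0
            total += d * math.log(lam[j - 1]) + (1 - d) * math.log(1.0 - lam[j - 1])
    return total


ll_survival = survival_loglik(data)
ll_bernoulli = bernoulli_loglik(data)
print(ll_survival, ll_bernoulli)
```

The two numbers agree for any choice of parameter values, which is why maximizing the binomial likelihood yields the estimates for the survival model even though the pseudo-observations are not a genuine independent sample.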
Time-varying covariates and time-dependent effects can be
introduced in this model along the same lines as before.
In the case of time-varying covariates, note that only the values
of the covariates at the discrete times $t_1 < t_2 < \cdots$ are relevant.
Time-dependent effects are introduced as interactions between
the covariates and the discrete factor (or set of dummy variables)
representing time.
7.6.3 Discrete Survival and the C-Log-Log Link
An alternative extension of the proportional hazards model
to discrete time starts from the survival function, which in
a proportional hazards framework can be written as
$$ S(t_j|x_i) = S_0(t_j)^{\exp\{x_i'\beta\}}, $$
where $S(t_j|x_i)$ is the probability that an individual with
covariate values $x_i$ will survive up to time point $t_j$,
and $S_0(t_j)$ is the baseline survival function.
Recalling Equation 7.18 for the discrete survival
function, we obtain a similar relationship for the
complement of the hazard function, namely
$$ 1-\lambda(t_j|x_i) = [1-\lambda_0(t_j)]^{\exp\{x_i'\beta\}}, $$
so that solving for the hazard for individual $i$ at time point $t_j$
we obtain the model
$$ \lambda(t_j|x_i) = 1 - [1-\lambda_0(t_j)]^{\exp\{x_i'\beta\}}. $$
The transformation that makes the right hand side a linear
function of the parameters is the complementary log-log.
Applying this transformation we obtain the model
$$ \log(-\log(1-\lambda(t_j|x_i))) = \alpha_j + x_i'\beta, \tag{7.20} $$
where $\alpha_j = \log(-\log(1-\lambda_0(t_j)))$ is the complementary
log-log transformation of the baseline hazard.
This model can be fitted to discrete
survival data by generating pseudo-observations as before and
fitting a generalized linear model with binomial error structure and
complementary log-log link.
In other words, the equivalence between the binomial likelihood and the
discrete-time survival likelihood under non-informative censoring
holds both for the logit and complementary log-log links.
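As a quick numerical check of the algebra in this subsection, the sketch below (with made-up baseline hazards and a hypothetical coefficient) computes the hazard in two ways: directly, as one minus the baseline complement raised to the exponentiated linear predictor, and by adding the linear predictor on the complementary log-log scale and transforming back.

```python
import math


def cloglog(p):
    """Complementary log-log transformation of a probability."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse of the complementary log-log transformation."""
    return 1.0 - math.exp(-math.exp(eta))


# Hypothetical values, for illustration only.
baseline_hazard = [0.05, 0.10, 0.20]
beta, x = 0.7, 1.0

direct, via_link = [], []
for lam0 in baseline_hazard:
    direct.append(1.0 - (1.0 - lam0) ** math.exp(x * beta))
    via_link.append(inv_cloglog(cloglog(lam0) + x * beta))

for lam0, a, b in zip(baseline_hazard, direct, via_link):
    print(f"baseline {lam0:.2f}: direct = {a:.6f}  via c-log-log = {b:.6f}")
```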
It is interesting to note that this model can be obtained by
grouping time in the continuous-time proportional-hazards model.
To see this point let us assume that time is continuous and we are
really interested in the standard proportional hazards model
$$ \lambda(t|x) = \lambda_0(t) \exp\{x'\beta\}. $$
Suppose, however, that time is grouped into intervals with boundaries
$0 = t_0 < t_1 < \cdots < t_J = \infty$, and that
all we observe is whether an individual survives
or dies in an interval. Note that this construction imposes
some constraints on censoring. If an individual is censored
at some point inside an interval, we do not know whether it would
have survived the interval or not. Therefore we must censor
it at the end of the previous interval, which is the last point
for which we have complete information. Unlike the piece-wise
exponential set-up, here we cannot use information about
exposure to part of an interval. On the other hand, it turns out that
we do not need to assume that the hazard is constant in each interval.
Let $\lambda_{ij}$ denote the discrete hazard or conditional probability that
individual $i$ will die in interval $j$ given that it was alive at the start
of the interval. This probability is the same as the complement of the conditional
probability of surviving the interval given that one was alive at the
start, and can be written as
$$
\begin{aligned}
\lambda_{ij} &= 1 - \Pr\{T_i > t_j \mid T_i > t_{j-1}\} \\
&= 1 - \exp\Big\{-\int_{t_{j-1}}^{t_j} \lambda(t|x_i)\,dt\Big\} \\
&= 1 - \Big[\exp\Big\{-\int_{t_{j-1}}^{t_j} \lambda_0(t)\,dt\Big\}\Big]^{\exp\{x_i'\beta\}} \\
&= 1 - (1-\lambda_j)^{\exp\{x_i'\beta\}},
\end{aligned}
$$
where
$\lambda_j$ is the baseline probability of dying in interval $j$ given survival
to the start of the interval.
The second line follows from
Equation 7.4 relating the survival function to the integrated hazard,
the third line follows from the proportional hazards assumption,
and the last line defines $\lambda_j$.
As noted by Kalbfleisch and Prentice (1980, p. 37), "this discrete
model is then the uniquely appropriate one for grouped data
from the continuous proportional hazards model".
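The grouping argument can be illustrated numerically. The sketch below assumes a Weibull-type baseline cumulative hazard with arbitrary shape and scale, a hypothetical coefficient of 0.7 for a binary covariate, and two arbitrary sets of interval boundaries; none of these values come from the notes. For every interval, the difference between the two groups on the complementary log-log scale equals the coefficient exactly, whereas the difference on the logit scale depends on how time is grouped.

```python
import math


def cloglog(p):
    return math.log(-math.log(1.0 - p))


def logit(p):
    return math.log(p / (1.0 - p))


def H0(t, shape=1.5, scale=10.0):
    """Hypothetical baseline cumulative hazard (Weibull form, illustration only)."""
    return (t / scale) ** shape


def interval_hazard(a, b, xb):
    """Probability of dying in (a, b] given survival to a, under proportional hazards."""
    return 1.0 - math.exp(-(H0(b) - H0(a)) * math.exp(xb))


beta = 0.7  # hypothetical coefficient of a binary covariate
groupings = {"fine": [0, 1, 2, 5, 10], "coarse": [0, 5, 10]}

cloglog_diffs, logit_diffs = [], []
for name, cuts in groupings.items():
    for a, b in zip(cuts[:-1], cuts[1:]):
        lam0 = interval_hazard(a, b, 0.0)   # x = 0
        lam1 = interval_hazard(a, b, beta)  # x = 1
        c_diff = cloglog(lam1) - cloglog(lam0)
        l_diff = logit(lam1) - logit(lam0)
        cloglog_diffs.append(c_diff)
        logit_diffs.append(l_diff)
        print(f"{name} ({a}, {b}]: c-log-log diff = {c_diff:.4f}  logit diff = {l_diff:.4f}")
```

The logit differences are close to the coefficient when the interval hazards are small and drift away from it as the intervals become wider.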
In practice, however, the model with a logit link is used much
more often than the model with a c-log-log link, probably because
logistic regression is better known than generalized linear models
with c-log-log links, and because software for the former is more
widely available than for the latter.
In fact, the logit model is often used in cases where the
piece-wise exponential model would be more appropriate, probably because
logistic regression is better known than Poisson regression.
In closing, it may be useful to provide some suggestions regarding
the choice of approach to survival analysis using generalized
linear models:
- If time is truly discrete, then one should probably use the
discrete model with a logit link, which has a direct interpretation
in terms of conditional odds, and is easily implemented using
standard software for logistic regression.
- If time is continuous but one only observes it in grouped form,
then the complementary log-log link would seem more appropriate.
In particular, results based on the c-log-log link should be more
robust to the choice of categories than results based on the
logit link. However, one cannot take into account partial
exposure in a discrete time context, no matter which link is used.
- If time is continuous and one is willing to assume that the hazard
is constant in each interval, then the piecewise exponential approach
based on the Poisson likelihood is preferable. This approach is
reasonably robust to the choice of categories and is unique in
allowing the use of information from cases that have partial exposure.
Finally, if time is truly continuous and one wishes to estimate the
effects of the covariates without making any assumptions about
the baseline hazard, then Cox's (1972) partial likelihood is
a very attractive approach.
Copyright © Germán Rodríguez, 1993-2000.