7.2 Censoring and The Likelihood Function
The second distinguishing feature of the field of survival analysis
is censoring: the fact that for some units the event of interest
has occurred and therefore we know the exact waiting time,
whereas for others it has not occurred, and all we know is that
the waiting time exceeds the observation time.
7.2.1 Censoring Mechanisms
There are several mechanisms that can lead to censored data.
Under censoring of Type I, a sample of n units is followed
for a fixed time t. The number of units experiencing the
event, or the number of `deaths', is random, but the total
duration of the study is fixed. The fact that the duration is
fixed may be an important practical advantage in designing a
follow-up study.
In a simple generalization of this scheme, called fixed censoring,
each unit has a potential maximum observation time ti
for i = 1,,n which may differ from one case to the next but
is nevertheless fixed in advance. The probability that unit i will
be alive at the end of her observation time is S(ti), and the
total number of deaths is again random.
Under censoring of Type II, a sample of n units is followed
as long as necessary until d units have experienced the event.
In this design the number of deaths d, which determines the
precision of the study, is fixed in advance and can be used as
a design parameter. Unfortunately, the total duration of the
study is then random and cannot be known with certainty in advance.
In a more general scheme called random censoring, each unit has
associated with it a potential censoring time Ci and a potential
lifetime Ti, which are assumed to the independent random variables.
We observe Yi = min{Ci,Ti}, the minimum of the censoring
and life times, and an indicator variable, often called di or
di, that tells us whether observation terminated by death or
by censoring.
All these schemes have in common the fact that the censoring
mechanism is non-informative
and they all lead to essentially the same
likelihood function. The weakest assumption required to obtain
this common likelihood is that the censoring of an observation
should not provide any information regarding the prospects of
survival of that particular unit beyond the censoring time.
In fact, the basic assumption that we will make is simply this:
all we know for an observation censored at duration t is that
the lifetime exceeds t.
7.2.2 The Likelihood Function for Censored Data
Suppose then that we have n units with lifetimes governed by a
survivor function S(t) with associated density f(t) and
hazard l(t). Suppose unit i is observed for a time
ti.
If the unit died at ti, its contribution to the
likelihood function is the density at that duration, which
can be written as the product of the survivor and hazard functions
|
Li = f(ti) = S(ti) l(ti). |
|
If the unit is still alive at ti, all we know under non-informative
censoring is that the lifetime exceeds ti. The probability of this
event is
which becomes the contribution of a censored observation to the
likelihood.
Note that both types of contribution share the survivor function
S(ti), because in both cases the unit lived up to time ti.
A death multiplies this contribution by the hazard l(ti),
but a censored observation does not.
We can write the two contributions in a single expression.
To this end, let di be a death indicator, taking the value one if
unit i died and the value zero otherwise. Then the likelihood
function may be written as follows
|
L = |
n
i = 1
|
Li = |
i
|
l(ti)di S(ti). |
|
Taking logs, and recalling the expression linking the survival
function S(t) to the cumulative hazard function L(t),
we obtain the log-likelihood function for censored survival data
|
logL = |
n
i = 1
|
{ di logl(ti) - L(ti) }. |
| (7.7) |
We now consider an example to reinforce these ideas.
Example: Suppose we have a sample of n censored
observations from an exponential distribution. Let ti be the
observation time and di the death indicator for unit i.
In the exponential distribution l(t) = l for all t.
The cumulative risk turns out to be the integral of a constant
and is therefore L(t) = lt. Using these two
results on Equation 7.7 gives the log-likelihood
function
Let D = di denote the total number of deaths, and let
T = ti denote the total observation (or exposure) time.
Then we can rewrite the log-likelihood as a function of these
totals to obtain
Differentiating this expression with respect to l we obtain
the score function
and setting the score to zero gives the maximum likelihood
estimator of the hazard
the total number of deaths divided by the total exposure time.
Demographers will recognize this expression as the general definition
of a death rate. Note that the estimator is optimal (in a maximum
likelihood sense) only if the risk is constant and does not depend
on age.
We can also calculate the observed information by taking minus
the second derivative of the score, which is
To obtain the expected information we need to calculate the
expected number of deaths, but this depends on the censoring
scheme. For example under Type I censoring with fixed duration t,
one would expect n(1-S(t)) deaths. Under Type II censoring
the number of deaths would have been fixed in advance. Under
some schemes calculation of the expectation may be fairly
complicated if not impossible.
A simpler alternative is to use the observed information,
estimated using the m.l.e. of l given in Equation 7.9.
Using this approach, the large sample variance of the
m.l.e. of the hazard rate may be estimated as
a result that leads to large-sample tests of hypotheses
and confidence intervals for l.
If there are no censored cases, so that di = 1 for all i and D = n,
then the results obtained here reduce to standard maximum likelihood
estimation for the exponential distribution,
and the m.l.e. of l turns out to be
the reciprocal of the sample mean.
It may be interesting to note in passing that the log-likelihood
for censored exponential data given in Equation 7.8
coincides exactly (except for constants) with the log-likelihood that
would be obtained by treating D as a Poisson random variable with
mean lT. To see this point, you should write the Poisson
log-likelihood when D ~ P(lT),
and note that it differs from Equation 7.8
only in the presence of a term Dlog(T), which is a constant
depending on the data but not on the parameter l.
Thus, treating the deaths as Poisson conditional on exposure time
leads to exactly the same estimates (and standard errors) as
treating the exposure times as censored observations from an
exponential distribution. This result will be exploited below
to link survival models to generalized linear models with Poisson
error structure.
Continue with 7.3. Approaches to Survival Modeling
Copyright © Germán Rodríguez, 1993-2000.
Please send feedback to grodri@princeton.edu
Conversion from LaTeX was done using TTH, version 2.34.