Germán Rodríguez
Survival Analysis Princeton University

This is the home page of Pop 509: Survival Analysis, as offered in the Spring of 2018 (Session I). The course registrar's page is here. For Pop 510: Multilevel Models click here. For my research on multilevel models click here.

Course Description

This half-course, offered in the first half of the spring term, focuses on the statistical analysis of time-to-event or survival data. We introduce the hazard and survival functions, censoring mechanisms, parametric and non-parametric estimation, and comparison of survival curves. We cover continuous and discrete-time regression models, with emphasis on Cox's proportional hazards model and partial likelihood estimation. We discuss competing risk models, unobserved heterogeneity, and multivariate survival models, including event history analysis. The course emphasizes basic concepts and techniques as well as applications in social science research using R or Stata. Prerequisite: WWS509 or equivalent.

Course Materials

The course is organized around five topics. The materials for each topic usually include a handout in PDF format and one or more computing logs in HTML format showing how to do the relevant calculations in R and Stata. For a more detailed description of the course, including a list of topics to be covered each week, see the syllabus, available also in printer-friendly PDF.

The Beamer slides used in class in the Spring of 2018 are available here. A bundle with all slides arranged four to a page in a 2x2 layout is available here.

1. Parametric Models

Materials for week 1 include a handout on Parametric Survival Models, a plot of the 2013 U.S. survival and hazard functions, and a computing log fitting parametric models to recidivism data.
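As a rough illustration of the quantities involved in parametric models, the sketch below codes the Weibull hazard and survival functions and the maximum likelihood estimate of a constant (exponential) hazard under right censoring. It is written in Python rather than R or Stata, and the parametrization S(t) = exp(-(lam*t)^p), the function names, and the toy data are assumptions for illustration, not taken from the course handout.

```python
import math

def weibull_hazard(t, lam, p):
    """Weibull hazard h(t) = p * lam * (lam * t)^(p - 1), for t > 0."""
    return p * lam * (lam * t) ** (p - 1)

def weibull_survival(t, lam, p):
    """Weibull survival S(t) = exp(-(lam * t)^p); p = 1 gives the exponential."""
    return math.exp(-((lam * t) ** p))

def exponential_mle(times, events):
    """MLE of a constant hazard with right censoring:
    number of observed events divided by total exposure time.
    events[i] is 1 if the event was observed, 0 if censored."""
    return sum(events) / sum(times)

# Toy data (hypothetical): three subjects, the second one censored.
rate = exponential_mle([2.0, 3.0, 5.0], [1, 0, 1])
print(rate, weibull_survival(1.0, rate, 1.0))
```

With p = 1 the hazard is constant at lam, so the exponential estimate can be plugged directly into the Weibull functions.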

2. Non-Parametric Estimation

Weeks 2 and 3 are devoted to Non-parametric Estimation in Survival Models. Materials include a computing log applying Kaplan-Meier and Mantel-Haenszel, and a log fitting Cox's proportional hazards model to a two-group comparison. See also this application of Cox Regression to the recidivism data. We compare flexible discrete and continuous time models fit to the same data. Addendum: an example using splines in a piecewise exponential model.
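For readers who want to see the arithmetic behind the Kaplan-Meier estimator, here is a minimal sketch in plain Python. The function name and the toy data are hypothetical, and the sketch follows the usual convention that observations censored at a given time are still at risk for events at that time. The course computing logs use R and Stata instead.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    estimate = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            surv *= 1 - deaths / n_at_risk
            estimate.append((t, surv))
        n_at_risk -= removed
    return estimate

# Toy data (hypothetical): five subjects, two of them censored.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Each factor is one minus the number of events divided by the number at risk at that time, so censored observations contribute only through the risk set.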

3. Competing Risks

Week 4 deals with Competing Risks, the analysis of survival time when there are multiple causes of failure. Additional materials include a discussion of cumulative incidence, including estimation of the cumulative incidence function (CIF) and Fine and Gray's competing risk model, and an expanded computing log fitting competing risk models to the tenure of U.S. Supreme Court justices. A competing risk simulation is here.
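The non-parametric estimate of the cumulative incidence function can be sketched as follows: at each failure time, the increment for a given cause is the overall Kaplan-Meier survival just before that time multiplied by the cause-specific events over the number at risk. The function name, the coding of causes (0 for censored), and the toy data are assumptions for illustration. This is not the course's R or Stata code, and it does not cover the Fine and Gray regression model.

```python
def cumulative_incidence(times, causes, cause):
    """Non-parametric cumulative incidence function for one cause.

    times  : observed times
    causes : 0 if censored, otherwise an integer code for the cause of failure
    cause  : the cause of interest
    Returns a list of (time, CIF) pairs at each distinct failure time.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0  # overall Kaplan-Meier survival just before the current time
    cif = 0.0
    estimate = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_all = d_cause = removed = 0
        # Collect all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            if c != 0:
                d_all += 1
            if c == cause:
                d_cause += 1
            removed += 1
            i += 1
        if d_all > 0:
            cif += surv * d_cause / n_at_risk   # increment for this cause
            surv *= 1 - d_all / n_at_risk       # update overall survival
            estimate.append((t, cif))
        n_at_risk -= removed
    return estimate

# Toy data (hypothetical): two causes of failure and one censored case.
times, causes = [1, 2, 3, 4], [1, 2, 0, 1]
print(cumulative_incidence(times, causes, 1))
print(cumulative_incidence(times, causes, 2))
```

The cause-specific incidences add up to one minus the overall Kaplan-Meier survival, which is a handy check on any implementation.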

4. Unobserved Heterogeneity

In week 5 we tackle Unobserved Heterogeneity, discussing univariate frailty models and the identification problem, including very useful formulas for converting back and forth between subject-specific and population-average hazards. Illustrations include two Shiny apps: one shows frailty acting on proportional hazards, and the other shows how heterogeneity can undo a mortality crossover.
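To give a flavor of such conversions, the sketch below assumes multiplicative gamma frailty with mean one and variance sigma2, a common choice that may differ from the cases treated in the handout. Under that assumption a subject-specific baseline with hazard h(t) and cumulative hazard H(t) implies a population-average survival of (1 + sigma2*H(t))^(-1/sigma2) and a population-average hazard of h(t)/(1 + sigma2*H(t)). The function names are hypothetical.

```python
def population_survival(H, sigma2):
    """Population-average survival given the subject-specific cumulative
    hazard H, under gamma frailty with mean 1 and variance sigma2 > 0."""
    return (1 + sigma2 * H) ** (-1 / sigma2)

def population_hazard(h, H, sigma2):
    """Population-average hazard given the subject-specific hazard h and
    cumulative hazard H at the same time point."""
    return h / (1 + sigma2 * H)

def subject_cumulative_hazard(S_pop, sigma2):
    """Invert population_survival: recover the subject-specific cumulative
    hazard from the population-average survival."""
    return (S_pop ** (-sigma2) - 1) / sigma2

# Example with a constant subject-specific hazard of 1 evaluated at t = 2,
# so that h = 1 and H = 2, and a frailty variance of 0.5.
S = population_survival(2.0, 0.5)
print(S, population_hazard(1.0, 2.0, 0.5), subject_cumulative_hazard(S, 0.5))
```

Note how the population hazard falls below the subject-specific hazard as H grows, reflecting the selection of more robust individuals over time.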

5. Multivariate Survival

Week 6 is devoted to Multivariate Survival, where we review various approaches to the analysis of multiple-spell survival data, focusing on shared-frailty models. Don't miss the computing handouts fitting shared frailty models to child survival data from Guatemala: we fit a piecewise exponential model using Stata and a Cox model using R. We also have a discussion of model interpretation via post-estimation, including computation of survival probabilities.
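As a small illustration of how survival probabilities follow from a piecewise exponential model, the sketch below accumulates hazard times exposure over the intervals up to time t and exponentiates the negative of the total. The cut points, rates, function name, and the optional relative-risk multiplier are hypothetical, not estimates from the Guatemalan data. In a shared-frailty model the result would be conditional on a given value of the frailty.

```python
import math

def piecewise_exponential_survival(t, cuts, rates, rel_risk=1.0):
    """Survival probability at time t under a piecewise constant hazard.

    cuts     : interval start points, beginning at 0 (last interval is open-ended)
    rates    : baseline hazard in each interval, same length as cuts
    rel_risk : multiplier applied to the baseline hazard (hypothetical argument,
               e.g. exp of a linear predictor, or a frailty value)
    """
    cum_hazard = 0.0
    for j, rate in enumerate(rates):
        start = cuts[j]
        end = cuts[j + 1] if j + 1 < len(cuts) else math.inf
        # Time spent in interval j before t (zero if t falls before it starts).
        exposure = max(0.0, min(t, end) - start)
        cum_hazard += rate * exposure
    return math.exp(-rel_risk * cum_hazard)

# Hypothetical baseline: hazard 0.10 on [0,1), 0.05 on [1,5), 0.02 thereafter.
cuts, rates = [0, 1, 5], [0.10, 0.05, 0.02]
print(piecewise_exponential_survival(3, cuts, rates))
print(piecewise_exponential_survival(3, cuts, rates, rel_risk=2.0))
```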