Survival Analysis Methods for Representing Complex Interdependencies in Health Data

  • Bindu Vekaria

Student thesis: PhD


Survival analysis is a branch of statistics used to model the time from an index event (e.g., hospital admission) to an event of interest (e.g., hospital discharge). These methods can be used for association studies (i.e., how do certain covariates increase or decrease the risk of the event) or in prognostic research (i.e., can a set of covariates predict the risk and timing of the event). While these methods are well studied, many areas of health-based research can benefit from further exploration. This thesis considers two main areas of investigation: (1) the application of survival analysis methods in developing models to predict outcomes of people within a closed system, and within a target population, and (2) methodological investigations into improving how absolute risk predictions are made, to assist in prognosis research. Specifically, the thesis had the following aims: (i) to apply survival analysis methods to develop risk models for system-level predictions in (a) COVID-19 bed occupancy planning in Manchester, and (b) suicide risk; (ii) to understand the impact of sample size on the performance of multi-state models for individual- and system-level predictions, especially when data are scarce, such as in an emerging pandemic; and (iii) to investigate the use of Gaussian processes for modelling the baseline hazard of survival models for absolute risk prediction in prognostic modelling. Chapter 3 develops an open-source multi-state survival model to estimate the length of stay (LoS) for COVID-19 patients in England. Modelling within a given hospital system allows for predictions that are specific to local patient cohorts and highlights the potential for survival analysis to facilitate operational research. Chapter 4 uses simulation methods to investigate how estimates from a multi-state survival model are affected by changes in sample size.
We find that average LoS is a stable measure even when data are sparse, but high levels of variability make it unsuitable for individual-level predictions. Chapter 5 applies survival analysis methods to suicide data in England and Wales, taking a principled modelling approach to provide evidence in support of existing hypotheses in the field concerning birth cohort effects, and shows how survival analysis can be used to quantify these effects. Chapter 6 explores the use of Gaussian processes as an advancement on current approaches to absolute risk prediction. In particular, we highlight how they are a potential solution to challenges presented in Chapters 3 and 5, and discuss the difficulties in their implementation. Overall, this thesis offers approaches to diversifying the use of survival analysis across health research domains, using both existing and novel methodology. We provide both theoretical and practical examples in support of this.
Date of Award: 31 Dec 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Pauline Turnbull (Supervisor), Glen Martin (Supervisor) & Thomas House (Supervisor)