Informative Presence and Observation in Routinely Collected Health Data: Methods to Support the Development of Clinical Prediction Models

  • Rose Sisk

Student thesis: PhD


The development and implementation of clinical prediction models using routinely collected health data is a challenging yet promising avenue of research. When data are collected opportunistically through routine healthcare contacts, information is recorded only when clinically indicated or when a patient or clinician has a concern. The patterns with which data are observed are therefore potentially informative about a patient's condition: so-called “informative presence” and “informative observation”. In this thesis, we aim to assess the extent to which informative presence and observation have been considered in the methodological prediction modelling literature, and to summarise the available methods for handling them to help applied researchers. We then perform simulations and empirical analyses to quantify the impact of allowing clinical prediction models to learn from informative observation processes, and study the challenges associated with using (informatively) missing data in the development and implementation of clinical prediction models. We provide guidance for applied and methodological researchers on how to approach informative presence and observation, and set out an agenda for further research. We find that simple ways of harnessing informative measurement patterns (such as including missing indicators, or measures that summarise the observation process, as model predictors) can offer gains in predictive performance, especially within our clinical exemplar, where one of the key outcomes can be difficult to predict. The findings and implications of this thesis have the potential to improve the development and implementation of clinical prediction models using routinely collected health data.
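As a rough illustration of the "missing indicators, or measures that summarise the observation process" mentioned in the abstract, the following minimal Python sketch derives such predictors from a single patient record. It is not the thesis's implementation. The function name `add_presence_features`, the variable names `lactate` and `creatinine`, the zero placeholder for unobserved values, and the simple count of observed measurements are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def add_presence_features(
    record: Dict[str, Optional[float]],
    variables: List[str],
) -> Dict[str, float]:
    """Build model predictors that expose the observation pattern.

    For each variable, emit the value itself plus a 0/1 missing indicator,
    and add a summary of the observation process (here, simply the number
    of variables that were actually measured).
    """
    features: Dict[str, float] = {}
    n_observed = 0
    for name in variables:
        value = record.get(name)
        observed = value is not None
        # Missing indicator: 1 if the measurement was not taken, else 0.
        features[f"{name}_missing"] = 0 if observed else 1
        # Placeholder fill (assumption): 0.0 stands in for unobserved values
        # so that a downstream model always receives a number.
        features[name] = value if observed else 0.0
        n_observed += int(observed)
    # Summary measure of the observation process (assumption: a plain count).
    features["n_observed"] = n_observed
    return features


# Hypothetical record: lactate was never measured, creatinine was.
patient = {"lactate": None, "creatinine": 88.0}
features = add_presence_features(patient, ["lactate", "creatinine"])
print(features)
```

The resulting dictionary can be passed to any regression or machine-learning model as a row of predictors. Whether the indicators improve performance depends on how informative the observation process is in the data at hand.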
Date of Award: 1 Aug 2022
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Matthew Sperrin (Supervisor), Niels Peek (Supervisor) & Glen Martin (Supervisor)


  • Informative presence
  • Routinely collected data
  • Informatively missing data
  • Electronic health records
  • Informative observation
  • Clinical prediction model
  • Missing data
