Harnessing repeated measurements of predictor variables to enhance prediction models in medicine

  • Lucy Bull

Student thesis: PhD

Abstract

Background: Longitudinal health data, widely available from observational cohorts, electronic health records and randomised controlled trials, provides an opportunity to incorporate repeatedly measured predictors into clinical prediction models (CPMs). Despite growing applied and methodological interest in the literature, strategies for incorporating longitudinal measurements into CPMs remain unclear and under-researched. The aim of this thesis was, therefore, to explore how to model repeatedly measured predictors in the development of CPMs given available methods, and to identify key implementation challenges.

Methods: Three research questions were considered using a methodological review (Chapter 2), an empirical investigation (Chapter 3), and a simulation study (Chapter 4), respectively: (1) What methods are available to model repeatedly measured predictors in the development of a clinical prediction model, and what are their properties? (2) How do these methods compare, to each other and to methods that ignore repeated observations, in terms of suitability, practicality and performance when applied to a real-world example? (3) How do different data features (sample size, event prevalence, follow-up sparsity, and dimensionality) interplay to influence the performance, stability and optimism of existing methods? The British Society for Rheumatology Biologics Register for Rheumatoid Arthritis (BSRBR-RA) was used to illustrate the implementation of methods, and for the simulation of biologically plausible longitudinal health data.

Results: Chapter 2 confirmed that a wide array of methods for longitudinal data analysis have already been applied in the context of clinical risk prediction. The implementation of methods, reported in Chapter 3, suggested that the predictive gain of modelling longitudinal data could lie in the agreement between observed and predicted risks post-baseline in a dynamic prediction setting, and highlighted the need to consider method-specific development decisions. Chapter 4 indicated that follow-up sparsity of longitudinal data has little influence on performance, optimism and prediction stability. Both Chapters 3 and 4 highlighted challenges in quantifying optimism-adjusted performance over time, and in implementing the joint modelling approach, which brought increased computational demand and convergence problems.

Concluding remarks: Overall, this thesis has identified key challenges when implementing and validating methods for modelling repeatedly measured predictors in the development of CPMs, and has highlighted priority areas for future methodological research in the field.
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Mark Lunt, Kimme Hyrich, Jamie Sergeant & Glen Martin

Keywords

  • Time-dependent covariates
  • Prediction models
  • Joint models
  • Electronic health records
  • Personalised medicine
  • Survival analysis
  • Dynamic prediction
  • Clinical prediction models
  • Longitudinal data
  • Repeated observations
