Evaluating generalisability and clinical utility of risk prediction model developed from routinely collected electronic health records and longitudinal cohort using Cardiovascular disease as exemplar

Student thesis: PhD


Risk prediction models are mathematical formulas that take a disease as the outcome variable and an individual's characteristics or risk factors as predictors, and predict the individual's risk of developing the disease in the future. They are used in healthcare systems to help clinicians make treatment decisions for patients. For example, healthcare guidelines such as those from NICE recommend prescribing statins to patients whose QRISK3-predicted risk is above 10%. These models are developed from routinely collected electronic health records or longitudinal cohorts. However, they are validated at the population level but used at the individual level, for individual patients. Generalisability and clinical utility reflect whether a model developed in one setting can be generalised to other settings and still be clinically useful. The current statistical framework for model development and validation does not consider reporting, or even minimally assessing, the generalisability and clinical utility of risk prediction models, especially at the individual level. The objective of this PhD is to assess the generalisability and clinical utility of risk prediction models in different settings, with particular attention to high-risk patients who are missed by the model, using cardiovascular disease (CVD) risk prediction as an exemplar.
This PhD has six main chapters. Chapter 2 evaluated the generalisability of QRISK3 by assessing the effects of practice variability on individual risk predictions. Chapter 3 assessed the effects of data quality, and of variation in the association between disease outcome and predictors, on the risk predictions of individual patients. Chapter 4 assessed the clinical utility of machine learning models and Cox models at both the population and the individual level. Chapter 5 implemented ClinRisk's QRISK3 algorithm in R. Chapter 6 assessed whether a new individual-level measurement may improve the clinical utility of risk prediction models. Chapter 7 discussed the generalisability and clinical utility issues identified for current models, and possible solutions.
This PhD found that risk prediction models may perform well at the population level but have limited generalisability and clinical utility, especially at the individual level. The reason is that prediction models based on different techniques or modelling decisions can yield inconsistent results for individual patients. Risk prediction models should therefore be used in conjunction with additional clinical tests and clinical judgement.
Date of Award: 31 Dec 2020
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Matthew Sperrin & Tjeerd Van Staa


Keywords
  • EHR; generalisability; clinical utility; statistical models; machine learning models; CVD risk prediction; practice variability
