Statistical Primer: sample size considerations for developing and validating clinical prediction models

Glen Martin, Richard D. Riley, Joie Ensor, Stuart Grant

Research output: Contribution to journal › Article › peer-review

Abstract

Clinical prediction models are statistical models or machine learning algorithms that combine information on a set of predictor variables for an individual to estimate that individual's risk of a given clinical outcome. It is crucial to ensure that the sample size of the data used to develop or validate a clinical prediction model is large enough. If the data are inadequate, developed models can be unstable and estimates of predictive performance imprecise. This can lead to models that are unfit or even harmful for clinical practice. Recently, a series of sample size formulae has been developed to estimate the minimum required sample size for prediction model development or external validation. The aim of this statistical primer is to give an overview of these criteria, describe what information is required to make the calculations, and illustrate their implementation through worked examples. The software available to implement the sample size criteria is reviewed, and code is provided for all the worked examples.
Original language: English
Journal: European Journal of Cardio-Thoracic Surgery
Publication status: Accepted/In press - 14 Jan 2025

Keywords

  • Risk prediction
  • Sample size
  • Development
  • Validation
  • Overfitting
  • Evaluation
