Penalisation and shrinkage methods can produce unreliable clinical prediction models especially when sample size is small

Richard Riley, Kym I E Snell, Glen Martin, Rebecca Whittle, Lucinda Whittle, Matthew Sperrin, Gary S Collins

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives
When developing a clinical prediction model, penalisation techniques are recommended to address overfitting, as they shrink predictor effect estimates towards the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms (‘tuning parameters’) are estimated with uncertainty from the development dataset. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.

Study design and setting
Applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.
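As an illustration of the closed-form approach, the sketch below computes a heuristic uniform shrinkage factor for a logistic model, S = (LR - p) / LR, where LR is the model's likelihood ratio statistic and p the number of predictor parameters, and then bootstraps the development data to show how variable that estimate can be. It is a minimal sketch under stated assumptions, not necessarily the procedure used in the article: the simulated data, the sample size of 100, the coefficient values, the 200 bootstrap replicates, and the helper names `fit_logistic` and `heuristic_shrinkage` are all hypothetical choices made for this example.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    X must already contain an intercept column.
    Returns (coefficients, log-likelihood).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)
        prob = 1.0 / (1.0 + np.exp(-eta))
        weights = prob * (1.0 - prob)
        grad = X.T @ (y - prob)
        # Tiny diagonal term added only for numerical stability (an assumption
        # of this sketch, not a penalisation method from the article).
        hess = X.T @ (X * weights[:, None]) + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    eta = np.clip(X @ beta, -30, 30)
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    return beta, loglik


def heuristic_shrinkage(X, y):
    """Closed-form uniform shrinkage estimate S = (LR - p) / LR."""
    n_pred = X.shape[1] - 1                   # predictors, excluding intercept
    _, ll_model = fit_logistic(X, y)
    _, ll_null = fit_logistic(X[:, :1], y)    # intercept-only model
    lr = 2.0 * (ll_model - ll_null)           # likelihood ratio statistic
    return (lr - n_pred) / lr


# Hypothetical small development dataset: 100 individuals, 5 predictors.
rng = np.random.default_rng(2020)
n, n_pred = 100, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, n_pred))])
true_beta = np.array([-1.0, 0.5, 0.5, 0.3, 0.0, 0.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

s_hat = heuristic_shrinkage(X, y)

# Re-estimate the shrinkage factor in bootstrap samples to gauge its uncertainty.
boot = []
for _ in range(200):
    idx = rng.integers(0, n, size=n)
    boot.append(heuristic_shrinkage(X[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"estimated shrinkage factor: {s_hat:.2f}")
print(f"bootstrap 95% percentile interval: {lo:.2f} to {hi:.2f}")
```

A wide percentile interval here would indicate that the shrinkage factor itself is poorly estimated from a dataset of this size; rerunning with a larger `n` should narrow it.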

Results
In a particular model development dataset, penalisation methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development datasets have a small effective sample size and the model’s Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
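One way to see why sample size and the Cox-Snell R² matter together is through the heuristic closed-form shrinkage factor, rewritten in terms of the apparent Cox-Snell R². The relationship below is a sketch based on the standard definition R² = 1 - exp(-LR/n); the symbols n (sample size), p (number of predictor parameters) and LR (likelihood ratio statistic) are notation introduced here, and the numbers are hypothetical.

```latex
% Heuristic uniform shrinkage factor and Cox-Snell R^2 (apparent values)
S = \frac{LR - p}{LR},
\qquad
R^2_{CS} = 1 - \exp\!\left(-\frac{LR}{n}\right)
\;\Longrightarrow\;
S = 1 + \frac{p}{n \,\ln\!\left(1 - R^2_{CS}\right)}

% Hypothetical example with p = 10 and R^2_{CS} = 0.1:
%   n = 200:   S = 1 + 10 / (200  \times \ln 0.9) \approx 0.53
%   n = 1000:  S = 1 + 10 / (1000 \times \ln 0.9) \approx 0.91
```

Under these assumptions, a small n or a low R² pushes the expected shrinkage factor well below 1, which is the setting in which its estimate is also most uncertain.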

Conclusions
Penalisation methods are not a ‘carte blanche’; they do not guarantee that a reliable prediction model is developed. They are least reliable when needed most (i.e. when overfitting may be large). We recommend applying them with large effective sample sizes, as identified from recent sample size calculations that aim to minimise the potential for model overfitting and to estimate key parameters precisely.
Original language: English
Journal: Journal of Clinical Epidemiology
Publication status: Accepted/In press - 3 Dec 2020
