Using machine learning to predict anticoagulation control in atrial fibrillation: A UK Clinical Practice Research Datalink study

Jason Gordon, Thomas Mason, Max Norman, Michael Hurst, Carissa Dickerson, Belinda Sandler, Kevin Pollock, Usman Farooqui, Lara Groves, Carmen Tsang, David Clifton, Ameet Bakhai, Nathan Hill

Research output: Contribution to journal › Article › peer-review


Objective: To investigate the predictive performance of machine learning (ML) algorithms for estimating anticoagulation control in patients with atrial fibrillation (AF) who are treated with warfarin.

Methods: This was a retrospective cohort study of adult patients (≥18 years) between 2007 and 2016 using linked primary and secondary care data (Clinical Practice Research Datalink GOLD and Hospital Episode Statistics). Various ML techniques were explored to predict suboptimal anticoagulation control, defined as time in therapeutic range (TTR) < 70% based on an International Normalised Ratio (INR) of 2.0–3.0. Models were applied to baseline data (linear and non-linear support vector machines; random forests; stochastic gradient boosting [XGBoost]; neural networks [NN]) and to time-varying data at 6-week intervals up to 30 weeks (long short-term memory [LSTM] NN). Patient records depicting unique lines of warfarin therapy (LOT) were separated into training (70%) and holdout (30%) sets for model training and testing, respectively.

Results: 35,479 patients were eligible for inclusion, of whom 24,684 and 10,795 were assigned to the training (32,683 unique LOTs) and holdout (14,218 unique LOTs) sets, respectively. Across all models, depression (diagnosis and/or prescription of antidepressant medication) was a significant driver in predicting anticoagulation control. At baseline, XGBoost was the best-performing model (area under the curve [AUC]: 0.624) due to its ability to identify non-linear associations such as those with age and weight (greater probability of suboptimal control: <65 and >80 years and <70 kg, respectively). Addition of time-varying data to the LSTM NN improved predictive performance, plateauing at an AUC of 0.830 at 30 weeks.

Conclusion: ML algorithms displayed a clinically useful ability to predict which patients are at greater risk of suboptimal control. The addition of time-varying data to the algorithm, especially prior INR measurements, improved predictive performance. These algorithms provide improved predictive tools for identifying patients who may benefit from more frequent INR monitoring or switching to alternative therapies.
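The outcome definition in the Methods (TTR < 70% for an INR target of 2.0–3.0) can be illustrated with a short sketch. The abstract does not state how TTR was calculated; the example below assumes the widely used Rosendaal linear-interpolation method, and the function names (`fraction_in_range`, `ttr_rosendaal`, `is_suboptimal`) and the sample INR values are hypothetical, chosen only for illustration.

```python
from datetime import date


def fraction_in_range(v0, v1, lo=2.0, hi=3.0):
    """Fraction of a straight line from INR v0 to INR v1 that lies within [lo, hi]."""
    if v0 == v1:
        return 1.0 if lo <= v0 <= hi else 0.0
    a, b = min(v0, v1), max(v0, v1)
    overlap = max(0.0, min(b, hi) - max(a, lo))
    return overlap / (b - a)


def ttr_rosendaal(measurements, lo=2.0, hi=3.0):
    """Time in therapeutic range (%) from (date, INR) pairs.

    Assumption: INR changes linearly between consecutive measurements
    (Rosendaal method); the study's actual calculation is not given in the abstract.
    """
    points = sorted(measurements)
    total_days = 0
    days_in_range = 0.0
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        days = (d1 - d0).days
        total_days += days
        days_in_range += days * fraction_in_range(v0, v1, lo, hi)
    if total_days == 0:
        raise ValueError("need at least two measurements on different days")
    return 100.0 * days_in_range / total_days


def is_suboptimal(ttr_percent, threshold=70.0):
    """Label used as the prediction target in the abstract: TTR below 70%."""
    return ttr_percent < threshold


# Hypothetical INR series for one line of warfarin therapy.
example = [
    (date(2015, 1, 1), 2.5),
    (date(2015, 1, 15), 2.5),
    (date(2015, 1, 29), 3.5),
]
example_ttr = ttr_rosendaal(example)
print(example_ttr, is_suboptimal(example_ttr))
```

In this made-up series the first 14 days are fully in range and half of the next 14 days are, giving 21 of 28 days in range, so the line of therapy would not be labelled as suboptimally controlled.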

Original language: English
Article number: 100688
Journal: Informatics in Medicine Unlocked
Publication status: Published - 1 Aug 2021


  • Anticoagulation control
  • Atrial fibrillation
  • International normalised ratio
  • Machine learning
  • Unsupervised learning
  • Warfarin

