TY - JOUR
T1 - A deep clustering-based state-space model for improved disease risk prediction in personalized healthcare
AU - Niu, Shuai
AU - Ma, Jing
AU - Yin, Qing
AU - Bai, Liang
AU - Li, Chen
AU - Yang, Xian
PY - 2024/2/1
Y1 - 2024/2/1
N2 - Decision support systems are being developed to assist clinicians in complex decision-making processes by leveraging information from clinical knowledge and electronic health records (EHRs). One typical application is disease risk prediction, which can be challenging due to the complexity of modelling longitudinal EHR data, including unstructured medical notes. To address this challenge, we propose a deep state-space model (DSSM) that simulates the patient’s state transition process and formally integrates latent states with risk observations. A typical DSSM consists of three parts: a prior module that generates the distribution of the current latent state based on previous states; a posterior module that approximates the latent states using up-to-date medical notes; and a likelihood module that predicts disease risks using latent states. To efficiently and effectively encode raw medical notes, our posterior module uses an attentive encoder to better extract information from unstructured high-dimensional medical notes. Additionally, we couple a predictive clustering algorithm into our DSSM to learn clinically useful representations of patients’ latent states. The latent states are clustered into multiple groups, and the weighted average of the cluster centres is used for prediction. We demonstrate the effectiveness of our deep clustering-based state-space model using two real-world EHR datasets, showing that it not only generates better risk prediction results than other baseline methods but also clusters similar patient health states into groups.
KW - Deep state-space model
KW - Disease risk prediction
KW - Modelling longitudinal medical notes
KW - Predictive clustering
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85183849516&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/305fd4de-6d6f-3be4-b6d7-85a972a13c1c/
U2 - 10.1007/s10479-023-05817-1
DO - 10.1007/s10479-023-05817-1
M3 - Article
SN - 0254-5330
JO - Annals of Operations Research
JF - Annals of Operations Research
ER -