TY - GEN
T1 - Learning disentangled representations for explainable chest X-ray classification using Dirichlet VAEs
AU - Harkness, Rachael
AU - Frangi, Alejandro F.
AU - Zucker, Kieran
AU - Ravikumar, Nishant
N1 - Funding Information:
This study was partially supported by cloud computing credits awarded by Amazon Web Services through the Diagnostic Development Initiative. Kieran Zucker is funded by the NIHR for this research project. The views expressed in this publication are those of the author(s) and not necessarily those of the NIHR, NHS or the UK Department of Health and Social Care.
Publisher Copyright:
© 2023 SPIE.
PY - 2023
Y1 - 2023
N2 - This study explores the use of the Dirichlet Variational Autoencoder (DirVAE) for learning disentangled latent representations of chest X-ray (CXR) images. Our working hypothesis is that distributional sparsity, as facilitated by the Dirichlet prior, will encourage disentangled feature learning for the complex task of multi-label classification of CXR images. The DirVAE is trained using CXR images from the CheXpert database, and the predictive capacity of the multi-modal latent representations learned by DirVAE models is investigated through an auxiliary multi-label classification task, with a view to enforcing separation of latent factors according to class-specific features. The predictive performance and explainability of the latent space learned using the DirVAE were assessed quantitatively and qualitatively, respectively, and compared with those of a standard Gaussian-prior VAE (GVAE). We introduce a new approach for explainable multi-label classification in which we conduct gradient-guided latent traversals for each class of interest. Study findings indicate that the DirVAE is able to disentangle latent factors into class-specific visual features, a property not afforded by the GVAE, and to achieve a marginal increase in predictive performance relative to the GVAE. We generate visual examples to show that our explainability method, when applied to the trained DirVAE, is able to highlight regions in CXR images that are clinically relevant to the class(es) of interest and, additionally, can identify cases where classification relies on spurious feature correlations.
AB - This study explores the use of the Dirichlet Variational Autoencoder (DirVAE) for learning disentangled latent representations of chest X-ray (CXR) images. Our working hypothesis is that distributional sparsity, as facilitated by the Dirichlet prior, will encourage disentangled feature learning for the complex task of multi-label classification of CXR images. The DirVAE is trained using CXR images from the CheXpert database, and the predictive capacity of the multi-modal latent representations learned by DirVAE models is investigated through an auxiliary multi-label classification task, with a view to enforcing separation of latent factors according to class-specific features. The predictive performance and explainability of the latent space learned using the DirVAE were assessed quantitatively and qualitatively, respectively, and compared with those of a standard Gaussian-prior VAE (GVAE). We introduce a new approach for explainable multi-label classification in which we conduct gradient-guided latent traversals for each class of interest. Study findings indicate that the DirVAE is able to disentangle latent factors into class-specific visual features, a property not afforded by the GVAE, and to achieve a marginal increase in predictive performance relative to the GVAE. We generate visual examples to show that our explainability method, when applied to the trained DirVAE, is able to highlight regions in CXR images that are clinically relevant to the class(es) of interest and, additionally, can identify cases where classification relies on spurious feature correlations.
KW - chest X-ray images
KW - Dirichlet distribution
KW - disentanglement
KW - explainability
KW - multi-label classification
KW - variational autoencoders
UR - http://www.scopus.com/inward/record.url?scp=85159660538&partnerID=8YFLogxK
U2 - 10.1117/12.2654345
DO - 10.1117/12.2654345
M3 - Conference contribution
AN - SCOPUS:85159660538
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2023: Image Processing
A2 - Colliot, Olivier
A2 - Išgum, Ivana
PB - SPIE
T2 - Medical Imaging 2023: Image Processing
Y2 - 19 February 2023 through 23 February 2023
ER -