Deep Bayesian Experimental Design for Drug Discovery

Muhammad Arslan Masood, Tianyu Cui, Samuel Kaski

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

In drug discovery, prioritizing compounds for testing is an important task. Active learning can assist by prioritizing molecules for label acquisition based on their estimated potential to improve in-silico models. However, in specialized cases such as toxicity modeling, limited dataset sizes can hinder both the training of modern neural networks for representation learning and active learning itself. In this study, we leverage a transformer-based BERT model pretrained on millions of SMILES strings to perform active learning. Additionally, we explore different acquisition functions to assess their compatibility with the pretrained BERT model. Our results demonstrate that pretrained models enhance active learning outcomes. Furthermore, we observe that active learning selects a higher proportion of positive compounds than random acquisition, an important advantage, especially when dealing with imbalanced toxicity datasets. Through a comparative analysis, we find that both the BALD and EPIG acquisition functions outperform random acquisition, with EPIG performing slightly better than BALD. In summary, our study highlights the effectiveness of active learning in conjunction with pretrained models for tackling data scarcity.
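
The abstract names two acquisition functions without defining them. As a rough illustration only, the sketch below computes the standard BALD score (mutual information between a candidate's label and the model parameters) and the standard EPIG score (expected mutual information between a candidate's label and the labels of target inputs) from Monte Carlo samples of class probabilities. It is not taken from the paper: the function names, the array shapes, and the assumption that the samples come from something like MC dropout or an ensemble are all hypothetical choices made for this example.

```python
import numpy as np

EPS = 1e-12  # guards log(0); an illustrative choice, not from the paper


def entropy(p, axis=-1):
    """Shannon entropy (in nats) along the class axis."""
    p = np.clip(p, EPS, 1.0)
    return -(p * np.log(p)).sum(axis=axis)


def bald_scores(probs_pool):
    """BALD score per pool candidate.

    probs_pool: array of shape (K, N, C) holding K posterior samples of
    class probabilities for N candidates and C classes (assumed layout).
    Returns an array of shape (N,).
    """
    mean_probs = probs_pool.mean(axis=0)                # (N, C)
    expected_entropy = entropy(probs_pool).mean(axis=0)  # (N,)
    return entropy(mean_probs) - expected_entropy


def epig_scores(probs_pool, probs_target):
    """EPIG score per pool candidate, averaged over M target inputs.

    probs_pool:   (K, N, C) samples for the candidates.
    probs_target: (K, M, C) samples for inputs drawn from the target
                  distribution, using the same K parameter samples.
    Returns an array of shape (N,).
    """
    k = probs_pool.shape[0]
    # Joint predictive over (y, y*): average of per-sample outer products.
    joint = np.einsum("knc,kmd->nmcd", probs_pool, probs_target) / k
    # Product of the two marginal predictives, broadcast to (N, M, C, C).
    marg_pool = probs_pool.mean(axis=0)[:, None, :, None]
    marg_target = probs_target.mean(axis=0)[None, :, None, :]
    indep = marg_pool * marg_target
    # Mutual information = KL(joint || product of marginals).
    log_ratio = np.log(np.clip(joint, EPS, 1.0)) - np.log(np.clip(indep, EPS, 1.0))
    mutual_info = (joint * log_ratio).sum(axis=(2, 3))   # (N, M)
    return mutual_info.mean(axis=1)


def select_top(scores, batch_size):
    """Indices of the batch_size highest-scoring candidates."""
    return np.argsort(-scores)[:batch_size]
```

In this sketch a candidate on which all posterior samples agree scores zero under both functions, while a candidate on which the samples disagree scores higher. Random acquisition, the baseline mentioned in the abstract, would instead draw indices uniformly from the pool.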

Original language: English
Title of host publication: AI in Drug Discovery - 1st International Workshop, AIDD 2024, Held in Conjunction with ICANN 2024, Proceedings
Publisher: Springer London
Pages: 149-159
Number of pages: 11
ISBN (Electronic): 9783031723810
ISBN (Print): 9783031723803
DOIs
Publication status: Published - 20 Sept 2024
Event: International Workshop on AI in Drug Discovery, Switzerland
Duration: 19 Sept 2024 → 19 Sept 2024

Conference

Conference: International Workshop on AI in Drug Discovery
Country/Territory: Switzerland
Period: 19/09/24 → 19/09/24

Keywords

  • Active learning
  • Bayesian
  • BERT
  • Drug Discovery
