TY - JOUR
T1 - Pre-trained language models with domain knowledge for biomedical extractive summarization
AU - Xie, Qianqian
AU - Bishop, Jennifer
AU - Tiwari, Prayag
AU - Ananiadou, Sophia
PY - 2022/7/19
Y1 - 2022/7/19
N2 - Biomedical text summarization is a critical task for comprehension of an ever-growing amount of biomedical literature. Pre-trained language models (PLMs) with transformer-based architectures have been shown to greatly improve performance in biomedical text mining tasks. However, existing methods for text summarization generally fine-tune PLMs on the target corpora directly and do not consider how fine-grained domain knowledge, such as PICO elements used in evidence-based medicine, can help to identify the context needed for generating coherent summaries. To fill the gap, we propose KeBioSum, a novel knowledge infusion training framework, and experiment using a number of PLMs as bases, for the task of extractive summarization on biomedical literature. We investigate generative and discriminative training techniques to fuse domain knowledge (i.e., PICO elements) into knowledge adapters and apply adapter fusion to efficiently inject the knowledge adapters into the basic PLMs for fine-tuning the extractive summarization task. Experimental results from the extractive summarization task on three biomedical literature datasets show that existing PLMs (BERT, RoBERTa, BioBERT, and PubMedBERT) are improved by incorporating the KeBioSum knowledge adapters, and our model outperforms the strong baselines.
AB - Biomedical text summarization is a critical task for comprehension of an ever-growing amount of biomedical literature. Pre-trained language models (PLMs) with transformer-based architectures have been shown to greatly improve performance in biomedical text mining tasks. However, existing methods for text summarization generally fine-tune PLMs on the target corpora directly and do not consider how fine-grained domain knowledge, such as PICO elements used in evidence-based medicine, can help to identify the context needed for generating coherent summaries. To fill the gap, we propose KeBioSum, a novel knowledge infusion training framework, and experiment using a number of PLMs as bases, for the task of extractive summarization on biomedical literature. We investigate generative and discriminative training techniques to fuse domain knowledge (i.e., PICO elements) into knowledge adapters and apply adapter fusion to efficiently inject the knowledge adapters into the basic PLMs for fine-tuning the extractive summarization task. Experimental results from the extractive summarization task on three biomedical literature datasets show that existing PLMs (BERT, RoBERTa, BioBERT, and PubMedBERT) are improved by incorporating the KeBioSum knowledge adapters, and our model outperforms the strong baselines.
KW - Extractive Summarisation
KW - Biomedical text mining
KW - PICO elements
KW - Pre-trained language models
U2 - https://www.sciencedirect.com/science/article/pii/S0950705122007328?via%3Dihub
DO - https://www.sciencedirect.com/science/article/pii/S0950705122007328?via%3Dihub
M3 - Article
SN - 0950-7051
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -