Bert-Based Chinese Medical Keyphrase Extraction Model Enhanced with External Features

Liangping Ding, Zhixiong Zhang, Yang Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Keyphrase extraction is a key natural language processing task that is widely used in information retrieval and text mining applications. In this paper, we construct nine Bert-based Chinese medical keyphrase extraction models enhanced with external features and present a thorough empirical evaluation of the impact of feature types and feature fusion methods. The results show that encoding the part-of-speech (POS) feature and the lexicon feature generated from descriptive keyphrase metadata into the word embedding space improves on the baseline Bert-SoftMax model by 4.82%, indicating that incorporating external features benefits Chinese medical keyphrase extraction. Furthermore, the comparative evaluation experiments show that model performance is sensitive to both feature types and feature fusion methods, so both factors should be considered in feature-enhanced tasks. Our study also provides a feasible approach to employing metadata, aiming to help digital library stakeholders take full advantage of large quantities of metadata resources to advance scholarly knowledge discovery.
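
The abstract does not spell out how the lexicon feature is built or fused, so the following is only a minimal sketch under stated assumptions: characters are tagged B/I/O by forward maximum matching against a keyphrase lexicon (for example, one collected from author-keyword metadata), and the tag is fused with a token vector by concatenating a one-hot encoding. The function names `lexicon_tags` and `fuse_by_concatenation`, the tagging scheme, and the concatenation strategy are all illustrative choices, not the authors' implementation.

```python
from typing import List, Sequence, Set


def lexicon_tags(text: str, lexicon: Set[str]) -> List[str]:
    """Tag each character B/I/O by forward maximum matching (assumed scheme)."""
    tags = ["O"] * len(text)
    max_len = max((len(term) for term in lexicon), default=0)
    i = 0
    while i < len(text):
        matched = 0
        # Try the longest candidate first so longer terms win over their prefixes.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                matched = length
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


def fuse_by_concatenation(
    token_vectors: Sequence[Sequence[float]],
    tags: Sequence[str],
    tag_set: Sequence[str] = ("O", "B", "I"),
) -> List[List[float]]:
    """Append a one-hot lexicon feature to each token vector (one possible fusion)."""
    fused = []
    for vec, tag in zip(token_vectors, tags):
        one_hot = [1.0 if tag == t else 0.0 for t in tag_set]
        fused.append(list(vec) + one_hot)
    return fused


# Hypothetical lexicon and sentence, used only for illustration.
example_lexicon = {"糖尿病", "血糖"}
example_tags = lexicon_tags("糖尿病患者的血糖控制", example_lexicon)
```

In a real model the token vectors would come from the Bert encoder and the feature would more likely be a learned embedding than a one-hot vector; concatenation is only one of several fusion methods that a comparison like the one in the paper could cover.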

Original language: English
Title of host publication: Towards Open and Trustworthy Digital Societies - 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings
Editors: Hao-Ren Ke, Chei Sian Lee, Kazunari Sugiyama
Publisher: Springer Nature
Pages: 167-176
Number of pages: 10
ISBN (Print): 9783030916688
Publication status: Published - 2021
Event: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021 - Virtual, Online
Duration: 1 Dec 2021 to 3 Dec 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13133 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021
City: Virtual, Online
Period: 1/12/21 to 3/12/21

Keywords

  • Digital library
  • Feature fusion
  • Keyphrase extraction
  • Metadata
  • Pretrained language model
  • Scholarly text mining

Research Beacons, Institutes and Platforms

  • Manchester Institute of Innovation Research
