Design and Implementation of Keyphrase Extraction Engine for Chinese Scientific Literature

Liangping Ding, Zhixiong Zhang*, Huan Liu, Yang Zhao

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Accurate keyphrases summarize the main topics of a document and are important for information retrieval and many other natural language processing tasks. In this paper, we construct a keyphrase extraction engine for Chinese scientific literature to help researchers work more efficiently. Building the engine raises four key technical problems: how to select a keyphrase extraction algorithm, how to build a large-scale training set that achieves application-level performance, how to adjust and optimize the model for better results in application, and how to make the engine convenient for researchers to invoke. We propose a solution to each of these problems. The engine automatically recommends four to five keyphrases for a Chinese scientific abstract supplied by the user, and the response time is generally within 3 seconds. The engine is built on advanced deep learning algorithms, a large-scale training set, and high-performance computing capacity, and might be an effective tool for researchers and publishers to quickly capture the key points of scientific text.

Original language: English
Pages (from-to): 26-35
Number of pages: 10
Journal: CEUR Workshop Proceedings
Volume: 3004
Publication status: Published - 2021
Event: 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, EEKE 2021 - Virtual, Online
Duration: 30 Sept 2021 → …

Keywords

  • Artificial intelligence engine
  • Chinese scientific literature
  • Keyphrase extraction

Research Beacons, Institutes and Platforms

  • Manchester Institute of Innovation Research
  • Institute for Data Science and AI
