Long Document Text Summarisation

  • Jennifer Bishop

Student thesis: Master of Philosophy

Abstract

Text summarisation is the task of converting a longer piece of text into a shorter text whilst communicating the same key points as the original document. The value of automatic text summarisation derives from the efficiency gained by distilling long documents into shorter text. Despite this, most research on text summarisation, and on the automated metrics used to assess its efficacy, has focused on short documents. Recently, Pretrained Language Models (PLMs) have been used to improve the performance of text summarisation. However, PLMs are limited by their need for large corpora for pretraining (ideally in the domain of any anticipated downstream tasks), by their need for labelled training data for fine-tuning, and by their attention mechanism, which often makes them unsuitable for use on long documents: its computational complexity means that long documents generally must be truncated to be feasible to process. This work develops methods which adapt PLMs to make them suitable for the summarisation of long documents. Three main novel contributions are proposed.

Firstly, GenCompareSum, a hybrid, unsupervised abstractive-extractive method, is developed. It cycles through a document generating salient textual fragments and uses these to guide an unsupervised extractive summarisation. This hybrid approach can easily be extended to any document length and outperforms existing unsupervised methods, as well as state-of-the-art supervised methods, despite not needing labelled training data for the summarisation task.

Secondly, since most long document data sets are highly domain-specific, KeBioSum, a framework for injecting domain knowledge into PLMs, is proposed. Evaluation of this method shows that using an adapter-based framework to inject domain knowledge into PLMs improves the performance of text summarisation.

Lastly, maintaining factual consistency is a critical issue in abstractive text summarisation, but it cannot be assessed by traditional metrics such as ROUGE scores. Recent efforts have been devoted to developing improved metrics for measuring factual consistency using PLMs; however, there is a lack of research on automatic metrics which can assess the factual consistency of long document summarisation. To this end, LongDocFACTScore (LDFACTS) is proposed. This metric extends an existing evaluation metric, BARTScore, by comparing each sentence in the generated summary with the most similar sections of the source document. It is designed to be extendable to any document length and demonstrates a strong correlation with human judgements of factual consistency on long document summarisation data sets.

In addition to these three main novel contributions, both intrinsic and extrinsic evaluations of different methods for the abstractive summarisation of long documents are conducted and discussed.
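To make the generate-then-extract idea behind GenCompareSum concrete, the sketch below shows one way such a pipeline could look in Python. The model checkpoints (t5-small, all-MiniLM-L6-v2), the naive full-stop sentence splitting, and the similarity-sum scoring are illustrative assumptions, not the exact configuration used in the thesis.

```python
# A minimal sketch of a GenCompareSum-style pipeline (assumed models/config).
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

generator = pipeline("summarization", model="t5-small")  # fragment generator (assumed)
embedder = SentenceTransformer("all-MiniLM-L6-v2")       # similarity model (assumed)

def gencomparesum_sketch(sections, k=5):
    """Generate salient fragments per section, then extract the k source
    sentences most similar to those fragments."""
    # 1. Cycle through the document section by section, generating a short
    #    salient fragment for each; this sidesteps the PLM's length limit.
    fragments = [generator(sec, max_length=32, truncation=True)[0]["summary_text"]
                 for sec in sections]

    # 2. Pool all source sentences (naive split, for illustration only).
    sentences = [s.strip() for sec in sections for s in sec.split(".") if s.strip()]

    # 3. Score each source sentence by its summed cosine similarity to the
    #    generated fragments.
    sent_emb = embedder.encode(sentences, convert_to_tensor=True)
    frag_emb = embedder.encode(fragments, convert_to_tensor=True)
    scores = util.cos_sim(frag_emb, sent_emb).sum(dim=0)  # shape: (num_sentences,)

    # 4. Return the top-k sentences, restored to document order.
    top = sorted(scores.topk(k=min(k, len(sentences))).indices.tolist())
    return [sentences[i] for i in top]
```

Because generation runs section by section and extraction is unsupervised, nothing in this scheme ever requires the whole document to fit inside the PLM's attention window, which is what makes the approach extendable to arbitrary document lengths.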
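The adapter-based knowledge injection of KeBioSum can be sketched with the AdapterHub `adapters` package, as below. The base checkpoint, the tagging head, and the token-classification training objective are hypothetical stand-ins for the thesis's actual knowledge-injection setup.

```python
# A minimal sketch of adapter-based domain-knowledge injection: a small
# adapter is trained on a domain task while the PLM's weights stay frozen.
# Checkpoint, head, and label count are illustrative assumptions.
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")  # base PLM (assumed)

# Add a new adapter to hold the injected domain knowledge, plus a task head
# (here a hypothetical PICO-style tagging head over biomedical text).
model.add_adapter("domain_knowledge")
model.add_tagging_head("domain_knowledge", num_labels=5)

# Freeze the PLM and route gradients only into the adapter parameters.
model.train_adapter("domain_knowledge")

# ... train on the knowledge-injection task with a standard training loop ...

# At summarisation time, activate the trained adapter so the frozen PLM's
# representations are enriched with the injected domain knowledge.
model.set_active_adapters("domain_knowledge")
```

The design choice here mirrors the abstract's claim: because the domain knowledge lives in a small adapter rather than in the full PLM weights, it can be injected without costly in-domain pretraining of the whole model.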
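Finally, the scoring loop behind an LDFACTS-style metric can be sketched as follows: each summary sentence is matched to its most similar region of the source via sentence embeddings, then scored against only that local window with a BARTScore-style log-likelihood. The checkpoints, the window size, and the single-direction, single-window scoring are simplifying assumptions for illustration, not the metric's exact definition.

```python
# A minimal sketch of an LDFACTS-style factual-consistency score
# (assumed checkpoints and window size).
import torch
from transformers import BartTokenizer, BartForConditionalGeneration
from sentence_transformers import SentenceTransformer, util

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
bart = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn").to(device).eval()
embedder = SentenceTransformer("all-MiniLM-L6-v2")

@torch.no_grad()
def bartscore(src: str, hyp: str) -> float:
    """Mean token log-likelihood of hyp given src (BARTScore-style)."""
    enc = tok(src, return_tensors="pt", truncation=True).to(device)
    labels = tok(hyp, return_tensors="pt", truncation=True).input_ids.to(device)
    out = bart(**enc, labels=labels)
    return -out.loss.item()  # negative cross-entropy = average log-probability

def ldfacts_sketch(source_sents, summary_sents, window=3):
    """Score each summary sentence against the source window centred on its
    most similar source sentence; average over the summary."""
    src_emb = embedder.encode(source_sents, convert_to_tensor=True)
    scores = []
    for sent in summary_sents:
        # Locate the source sentence most similar to this summary sentence.
        sims = util.cos_sim(embedder.encode(sent, convert_to_tensor=True), src_emb)[0]
        centre = int(sims.argmax())
        # Score against only the local window, so the source never needs to
        # fit inside the scoring model's attention window.
        span = " ".join(source_sents[max(0, centre - window): centre + window + 1])
        scores.append(bartscore(span, sent))
    return sum(scores) / len(scores)
```

Since the expensive likelihood computation only ever sees a short local window of the source, the metric's cost grows with summary length rather than document length, which is what makes it extendable to documents of any size.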
Date of Award: 31 Dec 2023
Original language: English
Awarding Institution:
  • The University of Manchester
Supervisors: Sophia Ananiadou (Supervisor) & Junichi Tsujii (Supervisor)

Keywords

  • evaluation metrics
  • extractive summarisation
  • natural language processing
  • long document
  • text summarisation
  • abstractive summarisation
