Whose Story Is It Anyway? Automatic Extraction of Accounts from News Articles

Research output: Contribution to journal › Article › peer-review

Abstract

Narratives are comprised of stories that provide insight into social processes. To
facilitate the analysis of narratives in a more efficient manner, natural language
processing (NLP) methods have been employed in order to automatically extract
information from textual sources, e.g., newspaper articles. Existing work
on automatic narrative extraction, however, has ignored the nested character
of narratives. In this work, we argue that a narrative may contain multiple accounts given by different actors. Each individual account provides insight into
the beliefs and desires underpinning an actor's actions. We present a pipeline for
automatically extracting accounts, consisting of NLP methods for: (1) named
entity recognition, (2) event extraction, and (3) attribution extraction. Machine
learning-based models for named entity recognition were trained based on
a state-of-the-art neural network architecture for sequence labelling. For event
extraction, we developed a hybrid approach combining the use of semantic role
labelling tools, the FrameNet repository of semantic frames, and a lexicon of
event nouns. Meanwhile, attribution extraction was addressed with the aid of a
dependency parser and Levin's verb classes. To facilitate the development and
evaluation of these methods, we constructed a new corpus of news articles, in
which named entities, events and attributions have been manually marked up
following a novel annotation scheme that covers over 20 event types relating to
socio-economic phenomena. Evaluation results show that relative to a baseline
method underpinned solely by semantic role labelling tools, our event extraction
approach optimises recall by 12.22-14.20 percentage points (reaching as high as 92.60% on one data set). Meanwhile, the use of Levin's verb classes in attribution extraction obtains optimal performance in terms of F-score, outperforming a baseline method by 7.64-11.96 percentage points. Our proposed approach was applied to news articles focused on industrial regeneration cases. This facilitated the generation of accounts of events that are attributed to specific actors.
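
The idea of grouping attributed statements into per-actor accounts can be illustrated with a minimal sketch. This is not the authors' pipeline: it replaces the dependency parser with a simple regular expression, and it uses a small hand-picked set of reporting verbs as a stand-in for Levin's verb classes. The names `REPORTING_VERBS`, `extract_attribution` and `build_accounts`, as well as the example sentences, are hypothetical and serve only as illustration.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked cue list, standing in for Levin's verb classes.
REPORTING_VERBS = {"said", "claimed", "stated", "announced", "argued"}

# Assumption: a regex replaces the dependency parser used in the paper.
# Shape matched: <Capitalised Source> <cue verb> [that] <content>[.]
ATTRIBUTION = re.compile(
    r"(?P<source>[A-Z]\w*(?: [A-Z]\w*)*) "
    r"(?P<cue>[a-z]+) "
    r"(?:that )?"
    r"(?P<content>.+?)\.?$"
)


def extract_attribution(sentence):
    """Return (source, cue, content), or None if no reporting verb is found."""
    match = ATTRIBUTION.match(sentence.strip())
    if match is None or match.group("cue") not in REPORTING_VERBS:
        return None
    return match.group("source"), match.group("cue"), match.group("content")


def build_accounts(sentences):
    """Group attributed content by source, giving one account per actor."""
    accounts = defaultdict(list)
    for sentence in sentences:
        attribution = extract_attribution(sentence)
        if attribution is not None:
            source, _cue, content = attribution
            accounts[source].append(content)
    return dict(accounts)


if __name__ == "__main__":
    example = [
        "Tata Steel announced that the plant will close.",
        "Unite claimed the closure was avoidable.",
        "The plant closed in 2015.",
    ]
    for actor, statements in build_accounts(example).items():
        print(actor, "->", statements)
```

The sketch only handles sentences that begin with a capitalised source followed directly by the cue verb. Handling free word order, nested attributions and non-verbal cues is where a dependency parser and a fuller verb lexicon become necessary.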
Original language: English
Journal: Information Processing & Management
Early online date: 1 Mar 2019
Publication status: Published - 2019

Keywords

  • Narrative Analysis
  • Named Entity Recognition
  • Event Extraction
  • Attribution Extraction
  • Corpus Annotation

Research Beacons, Institutes and Platforms

  • Manchester Institute of Innovation Research
  • Sustainable Consumption Institute
