Parliamentary debate speeches provide access to the opinions and policy positions that politicians express on many important topics. This information is of interest to citizens who wish to monitor the activities of their political representatives. However, owing to the quantity, complexity, and specialised, esoteric language of the debates, they are not straightforward for human readers to process.

In prior work, sentiment analysis of legislative debates has been approached in much the same way as in other domains. However, debate speeches differ in that their targets, the debate motions (proposals), (a) are non-neutral, which has a polarity-shifting effect on the content of speeches, and therefore on sentiment classifiers; and (b) are themselves sources of important topic information, without which the output of analysis of the speeches is arguably uninformative. I therefore examine the extraction of speaker sentiment with respect to the topics of the motions under debate.

I evaluate state-of-the-art NLP approaches to (1) sentiment polarity classification, (2) topic identification, and (3) topic-centric stance detection. These include transformer-based language models, which I apply to this domain for (as far as I am aware) the first time. I compare approaches to class labelling for supervised classification, language representation, debate structure modelling, and machine learning methods and paradigms.

The main contributions of this thesis are as follows:

- Sentiment polarity classification: I evaluate approaches to this task, optimised for the domain of UK parliamentary debate speeches. I examine the validity of vote-derived sentiment class labels, finding that, to a large extent, they appear to align with the judgements of human readers. I propose a motion-dependent framework for dealing with the discourse structure of the debates, finding that this considerably boosts performance over motion-independent systems.
- Topic identification: Topic modelling yields overly broad outputs, which tend not to be the targets of speech sentiment. Proposing instead a supervised approach, I evaluate labelling schemes for this task. I explore two labelling frameworks: debate motions labelled by crowdsourced annotators, and a schema designed by political scientists for the annotation of party-political documents.
- Policy preference-focused stance detection: To answer the question 'what is the position of speaker X on topic Y?', I formulate this task as a form of topic-centric sentiment analysis. I evaluate a range of approaches to the task.

This work advances the state of the art in sentiment analysis for the legislative debate domain. It provides insights into the nature of the task and the challenges that remain, such as validation of commonly applied assumptions about ground-truth speech sentiment, and analysis of sentiment-bearing parliamentary language.
- civic technology
- natural language processing
- parliamentary debates
- stance detection
- sentiment analysis
- topic detection
Topic-centric sentiment analysis of UK parliamentary debates
Abercrombie, G. (Author). 1 Aug 2021
Student thesis: PhD