Transcripts of UK parliamentary debates provide access to the opinions of politicians towards many important topics, but due to the large quantity of textual data and the specialised language used, they are not straightforward for human readers to process. We apply opinion mining methods to these transcripts to classify the sentiment polarity of speakers as being either positive or negative towards the motions proposed in the debates. We compare classification performance on a novel corpus using both manually annotated sentiment labels and labels derived from the speakers’ votes (‘aye’ or ‘no’). We introduce a two-step classification model, and evaluate the performance of both one- and two-step models, as well as the use of a range of textual and contextual features. Results suggest that textual features are more indicative of manually annotated class labels. Conversely, in addition to boosting performance, contextual metadata features are particularly indicative of vote labels. Use of the two-step debate model results in performance gains and appears to capture some of the complexity of the debate format. Optimum performance on this data is achieved using all features to train a multi-layer neural network, indicating that such models may be most able to exploit the relationships between textual and contextual cues in parliamentary debate speeches.
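The feature-combination approach described above can be sketched in a minimal, illustrative form. This is not the authors' code: the speeches, labels, and metadata features below are invented stand-ins, and the specific feature choices (TF-IDF text vectors, a party-alignment flag, a debate-position value) are assumptions made for the example. It only shows the general pattern of concatenating textual and contextual features and training a multi-layer neural network on the result, here using scikit-learn.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
import numpy as np

# Toy stand-ins for debate speeches and polarity labels (invented for illustration).
speeches = [
    "I support this motion wholeheartedly and urge the house to agree",
    "This motion is welcome and I commend it to the house",
    "I must oppose this motion as it is deeply flawed",
    "The house should reject this damaging and misguided motion",
]
labels = ["positive", "positive", "negative", "negative"]

# Hypothetical contextual metadata per speech: e.g. whether the speaker's
# party proposed the motion, and a normalised position of the speech in the
# debate. These feature definitions are assumptions, not the paper's.
metadata = np.array([[1, 0.1], [1, 0.3], [0, 0.6], [0, 0.9]])

# Textual features: TF-IDF vectors over the speech transcripts.
X_text = TfidfVectorizer().fit_transform(speeches).toarray()

# Concatenate textual and contextual features into one input matrix.
X = np.hstack([X_text, metadata])

# Multi-layer neural network trained on the combined feature space.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X, labels)
pred = clf.predict(X)
```

In practice the classifier would be evaluated on held-out speeches against either manually annotated labels or vote-derived labels, rather than on its own training data as in this toy sketch.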
Title of host publication: LREC 2018, Eleventh International Conference on Language Resources and Evaluation
Publisher: European Language Resources Association
Published: 7 May 2018
- Hansard transcripts
- parliamentary debates
- sentiment analysis