Public Opinion Without Polls: Investigating the feasibility of Twitter-based election forecasts

  • Niklas Loynes

Student thesis: PhD


This thesis investigates whether, and if so how, digital trace data taken from the social media platform Twitter can be used as a valid and reliable basis for measuring public opinion. While Twitter data has been used in a range of spheres to measure public opinion, work in political research has typically centred on forecasting the outcome of elections. This functions as a predictive exercise in its own right, but can also be framed as a test of the robustness of measures derived from Twitter data as a new proxy for voter opinion: election outcomes offer a ‘ground truth’ against which bias in these data can be assessed and the extent of error measured. Results to date have been mixed, with studies showing varying levels of accuracy in matching their estimates of vote intention against polls and/or electoral outcomes. So far, however, none have consistently achieved the standards of accuracy, reproducibility and reliability that would make them an unquestioned alternative to surveys. This leaves ample scope for new research, to which this thesis contributes. The research documented in this thesis is guided by the assumption that the sheer amount of political content on Twitter provides enough pertinent and varied information to measure public opinion, but that the tools necessary for achieving this reliably and at scale do not yet exist. The core goals of this thesis are to explore which methods are required and to contribute to their development. This is approached in three papers. First, ‘Finding Friends’ demonstrates a new approach to estimating any given Twitter user’s home location.
Second, ‘Understanding Political Sentiment’ benchmarks existing approaches to measuring public opinion by extracting sentiment from tweets, as well as a new approach to estimating public opinion metrics derived from hand-labelled and machine-propagated sentiment scores, while applying twelve aggregate vote share prediction models in a comparative framework in three 2016 US presidential primaries. This paper and the third paper employ samples of Twitter users generated using the geo-locating algorithm outlined in paper 1. Paper 3, ‘Listening in on the noise’, introduces a novel approach to estimating individual-level political preferences using distant supervision and machine learning. This is applied to two distinct, geo-located samples of users, one selected based on indications of previous voting and one selected at random, in order to trace public opinion in the 2018 US midterm elections, both nationally and in the four most populous states. Each of my papers marks a significant contribution to its particular issue-area in the field of Twitter-based public opinion research. Paper 1 adds a reproducible geo-locating pipeline with an open-source software package. Paper 2 adds evidence to the evaluation of the usefulness of sentiment analysis on tweets for public opinion measurement. Paper 3 introduces an approach to estimating individual-level political preferences and highlights the impact that sampling decisions have on research outcomes. The best models for predicting vote shares in papers 2 and 3 achieve mean errors on par with opinion polling and, through their application in diverse election scenarios and by following a theory-grounded and reproducible framework, offer a strong contribution to existing practice in the field. Furthermore, the ability to estimate individual users’ home locations significantly improves sampling capabilities for researchers employing Twitter data across fields.
Date of Award: 31 Dec 2021
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Mark Elliot, Rachel Gibson & Marta Cantijoch Cunill


  • computational social science
  • forecasting
  • American politics
  • elections
  • Twitter
  • public opinion
