Goodbye Human Annotators? Content Analysis of Social Policy Debates using ChatGPT

Erwin Gielens*, Jakub Sowula, Philip Leifeld

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Content analysis is a valuable tool for analysing policy discourse, but annotation by humans is costly and time-consuming. ChatGPT is a potentially valuable tool for partially automating content analysis of policy debates, largely replacing human annotators. We evaluate ChatGPT’s ability to classify documents using pre-defined argument descriptions, comparing its performance with human annotators for two policy debates: the Universal Basic Income debate on Dutch Twitter (2014–2016) and the pension reform debate in German newspapers (1993–2001). We use the API (GPT-4 Turbo) and the user interface version (GPT-4) and evaluate multiple performance metrics (accuracy, precision and recall). ChatGPT is highly reliable and accurate in classifying pre-defined arguments across datasets. However, precision and recall are much lower and vary strongly between arguments. These results hold for both datasets, despite differences in language and media type. Moreover, the cut-off method proposed in this paper may aid researchers in navigating the trade-off between detection and noise. Overall, we do not (yet) recommend blind application of ChatGPT to classify arguments in policy debates. Those interested in adopting this tool should manually validate bot classifications before using them in further analyses. At least for now, human annotators are here to stay.
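
To make the workflow concrete, the sketch below illustrates the kind of pipeline the abstract describes: zero-shot classification of documents against a pre-defined argument via the OpenAI API, then agreement with human annotations scored via accuracy, precision, and recall. This is a minimal sketch under stated assumptions, not the authors' pipeline; the model name, prompt wording, example argument, documents, gold labels, and the classify helper are all illustrative.

```python
# Hypothetical sketch of ChatGPT-based argument classification.
# Model, prompt, argument description, and data are illustrative
# assumptions, not the setup used in the paper.

from openai import OpenAI
from sklearn.metrics import accuracy_score, precision_score, recall_score

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ARGUMENT = "UBI reduces bureaucracy in the welfare system."  # example argument

def classify(text: str) -> int:
    """Ask the model whether the document contains the pre-defined argument."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system",
             "content": "You are a content-analysis annotator. Answer only 'yes' or 'no'."},
            {"role": "user",
             "content": f"Does the following text contain the argument "
                        f"'{ARGUMENT}'?\n\nText: {text}"},
        ],
        temperature=0,  # deterministic output aids reliability checks
    )
    answer = response.choices[0].message.content.strip().lower()
    return 1 if answer.startswith("yes") else 0

# Illustrative documents with human gold labels (1 = argument present)
docs = ["Basic income would end the stigma of means testing.",
        "Pension reform dominated the Bundestag agenda in 1997."]
gold = [1, 0]

pred = [classify(d) for d in docs]
print("accuracy: ", accuracy_score(gold, pred))
print("precision:", precision_score(gold, pred, zero_division=0))
print("recall:   ", recall_score(gold, pred, zero_division=0))
```

Setting the temperature to 0 makes repeated runs more reproducible, which is one plausible way to probe the reliability the abstract reports; comparing predictions per argument, rather than pooled, would expose the between-argument variation in precision and recall.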
Original language: English
Pages (from-to): 1-20
Journal: Journal of Social Policy
DOIs
Publication status: Published - 3 Jan 2025
