Abstract
The focus of this report is the use of automated content-based tools – in particular those that use artificial intelligence (AI) and machine learning – to detect terrorist content online. In broad terms, such tools follow either a matching-based or a classification-based approach. Matching-based approaches rely on a technique known as hashing. The report explains the distinction between cryptographic hashing and perceptual hashing, noting that tech companies have tended to rely on the latter for the purposes of content moderation. Classification-based approaches typically involve using a large corpus of texts, which have been manually annotated by human reviewers, to train algorithms to predict whether a new item of content belongs to a particular category (e.g., terrorist content). This approach raises its own important issues, including the difficulty of compiling a dataset to train the algorithms, the temporal, contextual and cultural limitations of machine learning algorithms, and the resultant danger of incorrect outcomes. In the light of this discussion, the report concludes that human input remains necessary and that oversight mechanisms are essential to correct errors and ensure accountability. It also considers capacity-building measures, including off-the-shelf content moderation solutions and collaborative initiatives, as well as potential future development of AI to address some of the challenges identified.
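The distinction between the two hashing families can be illustrated with a short sketch. The code below is a toy illustration only, not any platform's actual system: `average_hash` implements a simple perceptual (average) hash over a tiny synthetic grayscale image, and the pixel values and Hamming-distance threshold are hypothetical.

```python
# Toy contrast between cryptographic hashing (exact match only) and a simple
# perceptual hash (tolerant of small edits). Illustrative sketch only; the
# 4x4 "images" and the matching threshold below are hypothetical values.

import hashlib


def cryptographic_hash(data: bytes) -> str:
    """Exact-match hashing: changing a single byte yields a different digest."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """A simple perceptual hash: each bit records whether a pixel is brighter
    than the image's mean, so near-duplicates get similar hashes."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests the same content."""
    return bin(a ^ b).count("1")


# Toy grayscale images: `altered` is `original` with slight brightness changes.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
altered = [[12, 198, 11, 201], [199, 12, 202, 9],
           [10, 203, 12, 198], [201, 11, 199, 12]]

# The cryptographic digests differ completely after a trivial edit...
print(cryptographic_hash(bytes(sum(original, []))) ==
      cryptographic_hash(bytes(sum(altered, []))))   # False

# ...while the perceptual hashes stay within a small Hamming distance,
# so a lenient threshold still treats the two items as a match.
dist = hamming_distance(average_hash(original), average_hash(altered))
print(dist <= 3)  # True for this toy example
```

A classification-based approach can be sketched in a similarly hedged way, assuming scikit-learn is available; the annotated corpus here is synthetic placeholder data standing in for the large, human-reviewed datasets the report describes.

```python
# Minimal sketch of a classification-based approach: train a probabilistic
# text classifier on a (synthetic, placeholder) human-annotated corpus, then
# score a new item. Not any vendor's actual moderation pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical annotated training corpus: 1 = flagged, 0 = benign.
texts = [
    "placeholder example of prohibited propaganda text",
    "placeholder example of violent incitement text",
    "ordinary news report about local events",
    "everyday conversation about sports results",
]
labels = [1, 1, 0, 0]

# Vectorise the text and fit a simple probabilistic classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Predict a class and a confidence score for a new, unseen item.
new_item = ["placeholder text resembling propaganda"]
print(model.predict(new_item))        # e.g. [1]
print(model.predict_proba(new_item))  # class probabilities, not a verdict:
# low-confidence outputs are exactly where human review and oversight
# mechanisms remain essential, as the report argues.
```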
Original language | English
---|---
Type | Research Report
Media of output | Online
Publisher | Tech Against Terrorism Europe
Number of pages | 32
Publication status | Published - 2024
Keywords
- Content Moderation
- Artificial Intelligence
- Online Extremism
- Small to Medium Platforms
- Digital Services Act
- Terrorist Content Online Regulation