Student thesis: Master of Philosophy


A primary goal of artificial intelligence is to understand human mental states. One direction aims at emotionally coherent and empathetic machine systems. Because emotion is often expressed in natural language, emotion recognition from text has become an important research topic in the Natural Language Processing (NLP) community. For example, Emotion Recognition in Conversations (ERC) aims to identify the emotion of each utterance within a dialogue, a task that has attracted growing research interest due to its wide applications in real-world scenarios. In another line of work, interdisciplinary researchers have put much effort into automatic mental health analysis, which devises NLP techniques to detect and analyse mental health conditions (e.g. depression, stress and bipolar disorder). In particular, mental health analysis from social media posts has developed rapidly with the growing availability of large-scale data from social networks. This thesis aims to push the boundary of these two tasks from the perspective of representation learning, which is the core of modern deep learning and NLP techniques. Firstly, we explore the application of contrastive learning. Though previous works mainly perform contrastive learning in an unsupervised manner, we focus on supervised contrastive learning, as both tasks are modelled as text classification and rich labelled data are available. For ERC, we propose a low-dimensional Supervised Cluster-level Contrastive Learning (SCCL) method. SCCL first reduces the high-dimensional contrastive learning space to a three-dimensional affect (emotion) representation space, Valence-Arousal-Dominance (VAD), then performs cluster-level contrastive learning to incorporate measurable emotion prototypes from a human-labelled VAD sentiment lexicon. For stress and depression detection, we also introduce contrastive learning to fully leverage label information for capturing class-specific features. Secondly, we propose new knowledge infusion methods to enhance the representations.
For ERC, we leverage pre-trained knowledge adapters to infuse linguistic and factual knowledge in a plug-in manner. To explicitly model the speakers' mental states and enhance the mentalisation ability for stress and depression detection, we extract mental state knowledge from a commonsense knowledge base and infuse it explicitly into the representations. We then propose a knowledge-aware mentalisation module that attends to the most relevant knowledge aspects. Experiments show that our methods achieve new state-of-the-art results on three ERC datasets and three stress and depression detection datasets. The analysis also shows that the VAD space is not only suitable for ERC but also interpretable, and that the VAD prototypes enhance ERC performance and stabilise the training of SCCL. In addition, the pre-trained knowledge adapters benefit the performance of both the utterance encoder and SCCL. Finally, factor-specific analysis and visualisation are performed to demonstrate the effectiveness of all proposed modules.
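As a rough illustration of the supervised contrastive objective referred to in the abstract, the following is a minimal sketch of a generic instance-level supervised contrastive loss in NumPy. It is not the thesis's SCCL formulation: the cluster-level step and the VAD lexicon prototypes are not reproduced, and the function name, temperature value and toy three-dimensional vectors are assumptions made purely for illustration.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over one batch (illustrative only).

    features: (n, d) array of embeddings; labels: (n,) array of class labels.
    Anchors with no same-label partner in the batch are skipped.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n = features.shape[0]

    not_self = ~np.eye(n, dtype=bool)
    # positives[i, j] is True when j shares i's label and j != i
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0

    # L2-normalise so that dot products are cosine similarities
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    # Exclude each anchor from its own softmax denominator
    logits = np.where(not_self, logits, -np.inf)

    # Numerically stable row-wise log-softmax
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Average log-probability of the positives for each anchor
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / n_pos[has_pos]
    return float(per_anchor.mean())


# Hypothetical 3-D embeddings (toy values, not taken from any VAD lexicon)
toy_features = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
])
loss_clustered = supervised_contrastive_loss(toy_features, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_features, [0, 1, 0, 1])
```

In this sketch the loss is lower when same-label embeddings lie close together (`loss_clustered`) than when labels cut across the two groups (`loss_mixed`), which is the property a supervised contrastive objective is meant to reward.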
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution: The University of Manchester
Supervisors: Sophia Ananiadou & Junichi Tsujii
