Affective Human-Robot Interaction with Multimodal Explanations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Facial expressions are one of the most practical and straightforward ways to communicate emotions. Facial expression recognition has been used in many fields, such as human behaviour understanding and health monitoring. Deep learning models can achieve excellent performance in facial expression recognition tasks, but because these deep neural networks have very complex nonlinear structures, it is not easy for human users to understand the basis for a model’s prediction. Specifically, we do not know which facial units contribute more or less to the classification. Developing affective computing models that provide more explainable and transparent feedback to human interactors is essential for trustworthy human-robot interaction. Compared to “white-box” approaches, “black-box” approaches using deep neural networks have advantages in overall accuracy but lack reliability and explainability. In this work, we introduce a multimodal affective human-robot interaction framework with visual-based and verbal-based explanations, produced by Layer-Wise Relevance Propagation (LRP) and Local Interpretable Model-Agnostic Explanations (LIME). The proposed framework has been tested on the KDEF dataset and in human-robot interaction experiments with the Pepper robot. This experimental evaluation shows the benefits of linking deep learning emotion recognition systems with explainable strategies.
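The abstract describes explaining a black-box expression classifier with LIME. A minimal sketch of LIME's core idea, on a hypothetical toy model (the region layout, weights, and all names here are illustrative assumptions, not the paper's actual pipeline): perturb interpretable regions of the input on and off, query the black box, and fit a locally weighted linear surrogate whose coefficients act as per-region importances.

```python
import numpy as np

# Hypothetical toy setup: an image split into 4 facial regions
# (e.g. brows, eyes, nose, mouth). The black-box "classifier"
# here is a stand-in whose score depends mostly on region 3.
rng = np.random.default_rng(0)

def black_box(mask):
    # mask: binary vector saying which regions are visible.
    # Region 3 drives the prediction; region 1 helps a little.
    return 0.8 * mask[3] + 0.2 * mask[1]

n_regions, n_samples = 4, 200
# 1. Perturb: random binary masks turning regions on/off.
Z = rng.integers(0, 2, size=(n_samples, n_regions)).astype(float)
# 2. Query the black box on each perturbed input.
y = np.array([black_box(z) for z in Z])
# 3. Weight samples by proximity to the original (all-regions-on) input.
dist = (n_regions - Z.sum(axis=1)) / n_regions
w = np.exp(-(dist ** 2) / 0.25)
# 4. Fit a weighted linear surrogate; its coefficients are the
#    per-region importances the explanation reports.
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
print(np.argmax(coef))  # region 3 dominates the explanation
```

In practice one would use the `lime` package's image explainer over superpixels rather than hand-rolling the surrogate; this sketch only illustrates why the resulting weights can be read as "which facial regions contributed to the classification".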
Original language: English
Title of host publication: International Conference on Social Robotics 2022
Publication status: Accepted/In press - 14 Oct 2022


