Speech-driven Robot Face Action Generation with Deep Generative Model for Social Robots

Chuang Yu, Heng Zhang, Zhegong Shangguan, Xiaoxuan Hei, Angelo Cangelosi, Adriana Tapus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Natural co-speech facial action, a kind of non-verbal behavior, plays an essential role in human communication and also contributes to natural and friendly human-robot interaction. However, much previous work on speech-based robot behaviour generation relies on rule-based or handcrafted methods, which are time-consuming to build and achieve only limited synchronization between speech and facial action. Based on a Generative Adversarial Network (GAN) model, this paper develops an effective speech-driven facial action synthesizer: given an acoustic speech signal, it generates a synchronous and realistic 3D facial action sequence. In addition, a mapping from the 3D human facial action to the real robot facial action, which regulates the Zeno robot's facial expressions, is also completed. The evaluation results show that the model has potential for natural human-robot interaction.
Original language: English
Title of host publication: International Conference on Social Robotics 2022
Publication status: Accepted/In press - 14 Oct 2022


  • Social Robot
  • Face Action
  • Human-Robot Interaction

