Zero-Shot Human-Object Interaction Recognition via Affordance Graphs

Alessio Sarullo, Tingting Mu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


We propose a new approach for Zero-Shot Human-Object Interaction Recognition in the challenging setting that involves interactions with unseen actions (as opposed to just unseen combinations of seen actions and objects). Our approach makes use of knowledge external to the image content in the form of a graph that models affordance relations between actions and objects, i.e., whether an action can be performed on the given object or not. We propose a loss function with the aim of distilling the knowledge contained in the graph into the model, while also using the graph to regularise learnt representations by imposing a local structure on the latent space. We evaluate our approach on several datasets (including the popular HICO and HICO-DET) and show that it outperforms the current state of the art.
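To make the two ideas in the abstract concrete, the following is a minimal sketch of how an affordance graph could be encoded and used to impose local structure on a latent space. It is not the authors' loss function, which the abstract does not specify. The function names, the toy action and object lists, the cosine-similarity weighting, and the Laplacian-style penalty are all assumptions chosen for illustration.

```python
import numpy as np

def affordance_matrix(actions, objects, pairs):
    """Binary matrix A where A[i, j] = 1 if action i can be performed on object j."""
    a_idx = {a: i for i, a in enumerate(actions)}
    o_idx = {o: j for j, o in enumerate(objects)}
    A = np.zeros((len(actions), len(objects)))
    for action, obj in pairs:
        A[a_idx[action], o_idx[obj]] = 1.0
    return A

def action_similarity(A):
    """Cosine similarity between actions, based on the objects they share.

    Assumption: actions that afford similar objects should be close in latent space.
    """
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    unit = A / np.maximum(norms, 1e-12)  # guard against actions with no objects
    return unit @ unit.T

def graph_regulariser(Z, S):
    """Laplacian-style penalty: 0.5 * sum_ij S[i, j] * ||Z[i] - Z[j]||^2.

    Z holds one embedding per action (rows); S is a symmetric similarity matrix.
    The penalty is small when graph-similar actions have nearby embeddings.
    """
    diff = Z[:, None, :] - Z[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return 0.5 * float((S * sq_dist).sum())

# Hypothetical toy graph, not taken from HICO or HICO-DET.
actions = ["ride", "drive", "eat"]
objects = ["bicycle", "car", "apple"]
pairs = [("ride", "bicycle"), ("ride", "car"), ("drive", "car"), ("eat", "apple")]

A = affordance_matrix(actions, objects, pairs)
S = action_similarity(A)

# Hypothetical 2-D action embeddings: "ride" and "drive" are close, "eat" is far away.
Z = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
penalty = graph_regulariser(Z, S)
```

In this sketch an unseen action can still receive a meaningful embedding, because the penalty pulls it towards seen actions that afford the same objects. In a real model such a term would presumably be added to a classification loss with a weighting coefficient.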
Original language: English
Title of host publication: AAAI'21 workshop on Commonsense Knowledge Graphs
Publication status: Accepted/In press - 2 Dec 2020

