Learning Disentangled Representations for Natural Language Definitions

Danilo Silva De Carvalho, Giangiacomo Mercatali, Yingji Zhang, Andre Freitas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Disentangling the encodings of neural models is fundamental to improving interpretability, semantic control, and downstream task performance in Natural Language Processing. Currently, most disentanglement methods are unsupervised or rely on synthetic datasets with known generative factors. We argue that recurrent syntactic and semantic regularities in textual data can provide models with both structural biases and generative factors. We leverage the semantic structures present in a representative and semantically dense category of sentence types, definitional sentences, to train a Variational Autoencoder that learns disentangled representations. Our experimental results show that the proposed model outperforms unsupervised baselines on several qualitative and quantitative benchmarks for disentanglement, and that it also improves results on the downstream task of definition modeling.
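The page does not include the paper's model details. As a hedged, minimal sketch of the core Variational Autoencoder machinery the abstract refers to (assuming a diagonal-Gaussian posterior; all names and dimensions here are illustrative, not from the paper), the following NumPy snippet implements the reparameterization trick and the KL-divergence term of the VAE objective:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    # Sample z = mu + sigma * eps with eps ~ N(0, I): the reparameterization
    # trick, which keeps sampling differentiable in a real training setup.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ),
    # summed over the latent dimensions.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# Illustrative batch: 2 sentence encodings, 4 latent dimensions.
rng = np.random.default_rng(0)
mu = np.zeros((2, 4))
log_var = np.zeros((2, 4))
z = reparameterize(mu, log_var, rng)
kl = kl_divergence(mu, log_var)  # zero when the posterior equals the prior
```

In a full model, an encoder network would predict `mu` and `log_var` per sentence, and the KL term would be combined with a reconstruction loss; disentanglement methods typically add further constraints on the latent dimensions.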
Original language: English
Title of host publication: Findings of the European Chapter of the Association for Computational Linguistics (Findings of EACL)
Number of pages: 8
Publication status: Accepted/In press - 20 Jan 2023
Event: European Chapter of the Association for Computational Linguistics - Dubrovnik, Croatia
Duration: 2 May 2023 – 6 May 2023
Conference number: 17


Conference: European Chapter of the Association for Computational Linguistics
Abbreviated title: EACL 2023


Keywords

  • Definition Semantic Roles
  • Disentanglement
  • Language Models
  • Variational Autoencoders


