Multilingual Semantic Relatedness using Lightweight Machine Translation

Siamak Barzegar, Brian Davis, Siegfried Handschuh, Andre Freitas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Distributional semantic models are strongly dependent on the size and quality of their reference corpora, which embed the commonsense knowledge necessary to build comprehensive models. While English has high-quality, large-scale texts containing commonsense information, such as Wikipedia, other languages may lack sufficient textual support to build distributional models. This paper proposes combining a lightweight (sloppy) machine translation model with an English Distributional Semantic Model (DSM) to provide higher-quality word vectors for languages other than English. Results show that the lightweight MT model introduces significant improvements over language-specific distributional models. Additionally, the lightweight MT outperforms more complex MT methods on the task of word-pair translation.
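
For intuition, the approach described in the abstract can be read as a translate-then-relate pipeline: translate a non-English word pair into English with a lightweight word-level MT step, then score the pair with the English distributional model. The sketch below is illustrative only and is not the paper's released code; the helper names `translate_word` and `english_vectors` (a dict-like English DSM) are assumptions introduced for the example.

```python
# Minimal sketch of a translate-then-relate pipeline, assuming a word-level
# translation function and a pre-trained English word-vector lookup.
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def relatedness(word_a, word_b, translate_word, english_vectors):
    """Translate a non-English word pair into English, then score the
    pair with the English distributional model."""
    ea, eb = translate_word(word_a), translate_word(word_b)
    return cosine(english_vectors[ea], english_vectors[eb])

if __name__ == "__main__":
    # Toy stand-ins for the English DSM and the lightweight MT step.
    toy_vectors = {"car": np.array([0.9, 0.1]), "street": np.array([0.7, 0.3])}
    toy_translate = {"auto": "car", "strasse": "street"}.get  # German -> English
    print(relatedness("auto", "strasse", toy_translate, toy_vectors))
```

In practice the English vectors would come from a large pre-trained model (e.g. word2vec or similar), and the translation step would be the lightweight word-pair MT model evaluated in the paper.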
Original language: English
Title of host publication: 2018 IEEE 12th International Conference on Semantic Computing (ICSC)
Publisher: IEEE
Pages: 108-114
Number of pages: 7
ISBN (Electronic): 978-1-5386-4408-9
ISBN (Print): 978-1-5386-4409-6
DOIs
Publication status: Published - 12 Apr 2018

Keywords

  • Machine Translation
  • Multilingual Distributional Semantic Models
  • Semantic Relatedness
  • Semantic Similarity

