cushLEPOR uses LABSE distilled knowledge to improve correlation with human translation evaluations

Gleb Erofeev, Irina Sorokina, Lifeng Han, Serge Gladkoff

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

Automatic MT evaluation metrics are indispensable for MT research. Augmented metrics such as hLEPOR include broader evaluation factors (recall and position difference penalty) in addition to the factors used in BLEU (sentence length, precision), and have demonstrated higher accuracy. However, two obstacles have prevented the wide use of hLEPOR: the lack of an easily portable Python package, and empirical weighting parameters that had to be tuned manually. This project addresses both issues by offering a Python implementation of hLEPOR and automatic tuning of its parameters. We use existing translation memories (TM) as the reference set and distillation modeling with LaBSE (Language-Agnostic BERT Sentence Embedding) to calibrate the parameters of a custom hLEPOR (cushLEPOR). cushLEPOR maximizes the correlation between hLEPOR and the distilling model's similarity score with respect to the reference. It can be used quickly and precisely to evaluate MT output from different engines, with no need for manual weight tuning. In this session you will learn how to tune hLEPOR to obtain an automatically custom-tuned cushLEPOR metric that is far more precise than BLEU. The method does not require costly human evaluations: an existing TM is taken as the reference translation set, and cushLEPOR is created to select the best MT engine for the reference data set.
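
The tuning idea in the abstract (choose metric weights so that the metric's scores correlate as strongly as possible with a similarity score from a distilling model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `toy_metric` is a simplified stand-in that only combines unigram precision and recall (it omits hLEPOR's length and position penalties), `TARGET_SCORES` are made-up placeholder numbers standing in for LaBSE similarity scores, and plain grid search is used because the abstract does not say which optimizer is applied.

```python
from collections import Counter
from itertools import product
from math import sqrt


def precision_recall(hyp, ref):
    """Unigram precision and recall of a hypothesis against a reference."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(hyp_tokens) if hyp_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    return precision, recall


def toy_metric(hyp, ref, alpha, beta):
    """Weighted harmonic mean of recall (weight alpha) and precision (weight beta).

    Hypothetical stand-in for a tunable metric; not the full hLEPOR formula.
    """
    p, r = precision_recall(hyp, ref)
    if p == 0.0 or r == 0.0:
        return 0.0
    return (alpha + beta) / (alpha / r + beta / p)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def tune_weights(pairs, target_scores, grid):
    """Grid-search (alpha, beta) to maximize correlation with the target scores."""
    best = None
    for alpha, beta in product(grid, repeat=2):
        scores = [toy_metric(hyp, ref, alpha, beta) for hyp, ref in pairs]
        corr = pearson(scores, target_scores)
        if best is None or corr > best[0]:
            best = (corr, alpha, beta)
    return best


# (MT output, reference) pairs; in the setting described above the references
# would come from a translation memory.
REFERENCE = "the cat sat on the mat"
PAIRS = [
    ("the cat sat on the mat", REFERENCE),
    ("a cat on mat", REFERENCE),
    ("the dog sat", REFERENCE),
    ("the the the the the the the cat", REFERENCE),
]
# Placeholder values only: NOT real LaBSE similarity scores.
TARGET_SCORES = [1.0, 0.8, 0.4, 0.3]
GRID = [1.0, 2.0, 3.0, 5.0, 9.0]

best_corr, best_alpha, best_beta = tune_weights(PAIRS, TARGET_SCORES, GRID)
print(f"correlation={best_corr:.3f} alpha={best_alpha} beta={best_beta}")
```

In a real setup the target scores would come from a sentence-embedding model comparing each MT output with its reference, and the search would cover all of the metric's weighting parameters, not just the two used here.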
Original language: English
Title of host publication: Proceedings of Machine Translation Summit XVIII
Subtitle of host publication: Users and Providers Track
Editors: Ben Huyck, Stephen Larocca, Janice Campbell, Jay Marciano, Konstantin Savenkov, Alex Yanishevsky, Stephen Richardson
Publisher: Association for Machine Translation in the Americas
Pages: 421–439
Number of pages: 19
Volume: 2
Publication status: Published - Aug 2021
Event: 18th Biennial Conference of the International Association of Machine Translation: MT Summit 2021 - Online, United States
Duration: 1 Aug 2021 → …
https://mtsummit2021.amtaweb.org/

Conference

Conference: 18th Biennial Conference of the International Association of Machine Translation
Abbreviated title: IAMT
Country/Territory: United States
Period: 1/08/21 → …
