cushLEPOR: customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSE

Lifeng Han, Irina Sorokina, Gleb Erofeev, Serge Gladkoff

Research output: Chapter in Book/Conference proceedingConference contributionpeer-review

Abstract

Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python version we developed (ported) which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses Optuna hyper-parameter optimisation framework to fine-tune hLEPOR weighting parameters towards better agreement to pre-trained language models (using LaBSE) regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards professional human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU. Official results show that our submissions win three language pairs including English-German and Chinese-English on News domain via cushLEPOR(LM) and English-Russian on TED domain via hLEPOR. (data available at https://github.com/poethan/cushLEPOR)
Original languageEnglish
Title of host publicationProceedings of the Sixth Conference on Machine Translation
EditorsLoic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
Place of PublicationStroudsburg, PA
PublisherAssociation for Computational Linguistics
Pages1014–1023
Number of pages10
ISBN (Print)9781954085947
Publication statusPublished - Nov 2021
Event6th Conference on Machine Translation - Online
Duration: 10 Nov 202111 Nov 2021
Conference number: WMT 2021

Conference

Conference6th Conference on Machine Translation
Period10/11/2111/11/21

Fingerprint

Dive into the research topics of 'cushLEPOR: customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSE'. Together they form a unique fingerprint.

Cite this