Abstract
In conventional machine translation evaluation metrics, considering too little information about the translations often yields unreasonable results and low correlation with human judgments. On the other hand, relying on many external linguistic resources and tools (e.g. part-of-speech tagging, morphemes, stemming, and synonyms) makes a metric complicated, time-consuming, and not universal, because different languages have different linguistic features. This paper proposes a novel evaluation metric that employs rich and augmented factors without relying on any additional resource or tool. Experiments show that this novel metric yields state-of-the-art correlation with human judgments compared with the classic metrics BLEU, TER, and Meteor-1.3 and two recent metrics (AMBER and MP4IBM1), demonstrating that its feature-rich, model-independent approach makes it a robust metric.
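The abstract names precision, recall, and a modified length penalty as core factors of the metric. The sketch below is a rough illustration only, not the paper's actual formulas: it combines unigram precision and recall through a weighted harmonic mean and scales the result by a simple exponential length penalty. All function names and parameter defaults here are hypothetical.

```python
# Illustrative sketch of a resource-free MT metric built from unigram
# precision, recall, and a length penalty. This is NOT the paper's
# exact metric; names and weights are hypothetical.
import math
from collections import Counter

def length_penalty(cand_len: int, ref_len: int) -> float:
    """Penalize candidates whose length deviates from the reference."""
    if cand_len == ref_len:
        return 1.0
    shorter, longer = sorted((cand_len, ref_len))
    return math.exp(1.0 - longer / shorter)

def precision_recall(candidate: list, reference: list) -> tuple:
    """Unigram precision and recall via clipped token overlap."""
    cand_counts, ref_counts = Counter(candidate), Counter(reference)
    overlap = sum((cand_counts & ref_counts).values())
    return overlap / len(candidate), overlap / len(reference)

def score(candidate: str, reference: str,
          alpha: float = 9.0, beta: float = 1.0) -> float:
    """Length penalty times a weighted harmonic mean of recall and precision."""
    cand, ref = candidate.split(), reference.split()
    p, r = precision_recall(cand, ref)
    if p == 0.0 or r == 0.0:
        return 0.0
    harmonic = (alpha + beta) / (alpha / r + beta / p)
    return length_penalty(len(cand), len(ref)) * harmonic

# Example: score one candidate against one reference translation.
print(round(score("the cat sat on the mat", "the cat is on the mat"), 3))
```

The paper additionally uses a context-dependent n-gram alignment (see Keywords); extending the token overlap above to position-aware n-gram matching would be the natural next step, but the exact alignment procedure is not reproduced here.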
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012) |
| Place of Publication | Mumbai, India |
| Publisher | The COLING 2012 Organizing Committee |
| Pages | 441-450 |
| Number of pages | 10 |
| Publication status | Published - 2012 |
Keywords
- Machine translation
- evaluation metric
- context-dependent n-gram alignment
- modified length penalty
- precision
- recall