Phrase Tagset Mapping for French and English Treebanks and Its Application in Machine Translation Evaluation

Lifeng Han, Derek F. Wong, Lidia S. Chao, Liangye He, Shuo Li, Ling Zhu

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review


Many treebanks have been developed in recent years for different languages, but they usually employ different syntactic tag sets. This is an obstacle for researchers who want to take full advantage of them, especially in multilingual research. To address this problem and to facilitate future research in unsupervised induction of syntactic structures, some researchers have developed a universal POS tag set. However, the mismatch between phrase tag sets remains unsolved. To bridge the phrase-level tag sets of multilingual treebanks, this paper designs a phrase mapping between the French Treebank and the English Penn Treebank. Furthermore, one potential application of this mapping is explored in the machine translation evaluation task. This novel evaluation model, developed without using reference translations, yields promising results compared with state-of-the-art evaluation metrics.
Original language: English
Title of host publication: Language Processing and Knowledge in the Web
Subtitle of host publication: 25th International Conference, GSCL 2013, Darmstadt, Germany, September 25-27, 2013, Proceedings
Editors: Iryna Gurevych, Chris Biemann, Torsten Zesch
Place of Publication: Heidelberg
Publisher: Springer Berlin
Number of pages: 13
ISBN (Electronic): 9783642407222
ISBN (Print): 9783642407215
Publication status: Published - 21 Aug 2013

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


  • Natural language processing
  • Phrase tagset mapping
  • Multilingual treebanks
  • Machine translation evaluation

