Paraphrasing with bilingual parallel corpora

Colin Bannard*, Chris Callison-Burch

*Corresponding author for this work

Research output: Chapter in Book/Conference proceeding > Conference contribution > peer-review

Abstract

Previous work has used monolingual parallel corpora to extract and generate paraphrases. We show that this task can be done using bilingual parallel corpora, a much more commonly available resource. Using alignment techniques from phrase-based statistical machine translation, we show how paraphrases in one language can be identified using a phrase in another language as a pivot. We define a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and show how it can be refined to take contextual information into account. We evaluate our paraphrase extraction and ranking methods using a set of manual word alignments, and contrast the quality with paraphrases extracted from automatic alignments.
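The pivoting idea in the abstract can be illustrated with a small sketch. It assumes the usual pivot formulation, in which the score of a candidate paraphrase e2 for a phrase e1 is the sum over foreign pivot phrases f of p(f | e1) * p(e2 | f); the abstract itself does not spell out the formula, so treat this as an assumption. The function name `paraphrase_probs`, the table layout, and all probabilities below are hypothetical and made up for illustration; they are not taken from the paper's data.

```python
from collections import defaultdict


def paraphrase_probs(e1, p_f_given_e, p_e_given_f):
    """Rank candidate paraphrases of e1 by pivoting through foreign phrases.

    p_f_given_e: {english phrase: {foreign phrase: p(f | e)}}  (assumed layout)
    p_e_given_f: {foreign phrase: {english phrase: p(e | f)}}  (assumed layout)
    """
    scores = defaultdict(float)
    # Marginalize over every foreign phrase f aligned with e1.
    for f, p_f in p_f_given_e.get(e1, {}).items():
        # Each English phrase aligned with f is a candidate paraphrase.
        for e2, p_e2 in p_e_given_f.get(f, {}).items():
            if e2 != e1:  # a phrase is not counted as its own paraphrase
                scores[e2] += p_f * p_e2
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy translation tables with invented probabilities (illustration only).
p_f_given_e = {"under control": {"unter kontrolle": 0.8, "im griff": 0.2}}
p_e_given_f = {
    "unter kontrolle": {"under control": 0.7, "in check": 0.3},
    "im griff": {"under control": 0.5, "in check": 0.25, "in hand": 0.25},
}

ranked = paraphrase_probs("under control", p_f_given_e, p_e_given_f)
for phrase, score in ranked:
    print(f"{phrase}\t{score:.3f}")
```

In this toy setup "in check" outranks "in hand" because it is reachable through both pivot phrases. The contextual refinement mentioned in the abstract is not modeled here.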

Original language: English
Title of host publication: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics
Pages: 597-604
Number of pages: 8
ISBN (Print): 1932432515, 9781932432510
Publication status: Published - 2005
Event: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05 - Ann Arbor, MI, United States
Duration: 25 Jun 2005 to 30 Jun 2005

Publication series

Name: ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05
Country/Territory: United States
City: Ann Arbor, MI
Period: 25/06/05 to 30/06/05
