Abstract
We propose an efficient method for phrase alignment on parse forests for paraphrase detection. Unlike previous studies, our method identifies syntactic paraphrases under a linguistically motivated grammar. In addition, it allows phrases to align non-compositionally, which handles paraphrases with non-homographic phrase correspondences. We also create a dataset that provides gold parse trees and their phrase alignments. The experimental results confirm that the proposed method conducts highly accurate phrase alignment compared to human performance.
Original language | English |
---|---|
Title of host publication | Proceedings of EMNLP 2017 |
Pages | 1-11 |
Number of pages | 11 |
Publication status | Published - Sept 2017 |