Abstract
The aim here is to create a dependency treebank from a phrase-structure treebank for Arabic. Arabic has a number of characteristics, described below, which make it particularly challenging for natural language processing (NLP) applications. We describe an encouraging semi-automatic technique for converting phrase-structure trees to dependency trees by using a head percolation table. One of the most significant challenges here is determining the head of each subtree. We therefore examined different versions of the head percolation table to find the best priority list for each entry in the table. Given that there is no absolute measure of the 'correctness' of a conversion of a phrase-structure tree to dependency form, we tested the various transformations by seeing how well a state-of-the-art dependency parser learnt the generalisations embodied by the converted trees.
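
The conversion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `HEAD_TABLE` entries, the search directions, and the toy English tree are all assumptions made for illustration. Each table entry gives a priority list of child labels; the first matching child becomes the head of the subtree, and the lexical heads of the remaining children become its dependents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str                       # phrase label or part-of-speech tag
    word: Optional[str] = None       # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


# Hypothetical head percolation table:
# parent label -> (search direction, priority list of child labels).
HEAD_TABLE = {
    "S": ("left", ["VP", "NP"]),
    "VP": ("left", ["VB", "VP"]),
    "NP": ("left", ["NN", "NP"]),
}


def head_child(node: Node) -> Node:
    """Pick the head child of a subtree using the priority list for its label."""
    direction, priorities = HEAD_TABLE.get(node.label, ("left", []))
    kids = node.children if direction == "left" else node.children[::-1]
    for label in priorities:
        for kid in kids:
            if kid.label == label:
                return kid
    return kids[0]  # fallback when no listed label matches


def to_dependencies(node: Node, deps: List[Tuple[str, str]]) -> str:
    """Return the lexical head of `node`, appending (dependent, head) pairs to `deps`.

    Words stand in for tokens here; a real converter would use token
    positions so that repeated words stay distinct.
    """
    if node.word is not None:
        return node.word
    head = head_child(node)
    head_word = to_dependencies(head, deps)
    for kid in node.children:
        if kid is not head:
            deps.append((to_dependencies(kid, deps), head_word))
    return head_word


# Toy tree: (S (NP (NN boy)) (VP (VB wrote) (NP (NN letter))))
tree = Node("S", children=[
    Node("NP", children=[Node("NN", "boy")]),
    Node("VP", children=[
        Node("VB", "wrote"),
        Node("NP", children=[Node("NN", "letter")]),
    ]),
])

deps: List[Tuple[str, str]] = []
root = to_dependencies(tree, deps)
```

Reordering a priority list in `HEAD_TABLE` changes which child is chosen as head and therefore changes the resulting dependency tree, which is why the choice of priority list for each entry matters.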
Original language | English |
---|---|
Title of host publication | host publication |
Editors | Jan Hajic, Koenraad De Smedt, Marko Tadic, Antonio Branco |
Place of Publication | META-RESEARCH Workshop on Advanced Treebanking Advanced Treebanking 2012 |
Pages | 61-68 |
Number of pages | 8 |
Publication status | Published - May 2012 |
Event | META-RESEARCH Workshop on Advanced Treebanking, Language Resources and Evaluation Conference - Istanbul. Duration: 21 May 2012 → 27 May 2012. http://www.lrec-conf.org/proceedings/lrec2012/index.html |
Conference
Conference | META-RESEARCH Workshop on Advanced Treebanking, Language Resources and Evaluation Conference |
---|---|
City | Istanbul |
Period | 21/05/12 → 27/05/12 |
Keywords
- part of speech tagging