• Sardar Jaf

Student thesis: Phd


The process of determining the structural relationships between words in both natural and machine languages is known as parsing. Parsers are used as core components in a number of Natural Language Processing (NLP) applications such as online tutoring applications, dialogue-based systems and textual entailment systems. They have been used widely in the development of machine languages. In order to understand the way parsers work, we will investigate and describe a number of widely used parsing algorithms. These algorithms have been utilised in a range of different contexts such as dependency frameworks and phrase structure frameworks. We will investigate and describe some of the fundamental aspects of each of these frameworks, which can function in various ways including grammar-driven approaches and data-driven approaches. Grammar-driven approaches use a set of grammatical rules for determining the syntactic structures of sentences during parsing. Data-driven approaches use a set of parsed data to generate a parse model which is used for guiding the parser during the processing of new sentences. A number of state-of-the-art parsers have been developed that use such frameworks and approaches. We will briefly highlight some of these in this thesis. There are three specific important features that it is important to integrate into the development of parsers. These are efficiency, accuracy, and robustness. Efficiency is concerned with the use of as little time and computing resources as possible when processing natural language text. Accuracy involves maximising the correctness of the analyses that a parser produces. Robustness is a measure of a parser's ability to cope with grammatically complex sentences and produce analyses of a large proportion of a set of sentences.In this thesis, we present a parser that can efficiently, accurately, and robustly parse a set of natural language sentences. Additionally, the implementation of the parser presented here allows for some trading-off between different levels of parsing performance. For example, some NLP applications may emphasise efficiency/robustness over accuracy while some other NLP systems may require a greater focus on accuracy. In dialogue-based systems, it may be preferable to produce a correct grammatical analysis of a question, rather than incorrectly analysing the grammatical structure of a question or quickly producing a grammatically incorrect answer for a question. Alternatively, it may be desirable that document translation systems translate a document into a different language quickly but less accurately, rather than slowly but highly accurately, because users may be able to correct grammatically incorrect sentences manually if necessary. The parser presented here is based on data-driven approaches but we will allow for the application of constraint rules to it in order to improve its performance.
Date of Award31 Dec 2015
Original languageEnglish
Awarding Institution
  • The University of Manchester
SupervisorAllan Ramsay (Supervisor)


  • syntactic parsing
  • grammar extraction
  • non-projective parsing
  • parsing Arabic
  • constraint rules
  • data-driven parsing
  • constraint rules for parsing
  • dependency parsing
  • natural language parsing
  • parsing

Cite this