The aim of this thesis is to investigate a number of factors that could affect the performance of an Arabic automatic speech understanding (ASU) system. The work described in this thesis belongs to the automatic speech recognition (ASR) phase, but the fact that it is part of an ASU project rather than a stand-alone piece of work on ASR influences the way in which it will be carried out. Our main concern in this work is to determine the best way to exploit the phonological properties of the Arabic language in order to improve the performance of the speech recogniser. One of the main challenges facing the processing of Arabic is the effect of the local context, which induces changes in the phonetic representation of a given text, thereby causing the recognition engine to misclassify it. The proposed solution is to develop a set of language-dependent grapheme-to-allophone rules that can predict such allophonic variations and eventually provide a phonetic transcription that is sensitive to the local context for the ASR system. The novel aspect of this method is that the pronunciation of each word is extracted directly from a context-sensitive phonetic transcription rather than a predefined dictionary that typically does not reflect the actual pronunciation of the word. Besides investigating the boundary effect on pronunciation, the research also seeks to address the problem of Arabic's complex morphology. Two solutions are proposed to tackle this problem, namely, using underspecified phonetic transcription to build the system, and using phonemes instead of words to build the Hidden Markov Models (HMMs). The research also seeks to investigate several technical settings that might have an effect on the system's performance. These include training on the sub-population to minimise the variation caused by training on the main undifferentiated population, as well as investigating the correlation between training size and performance of the ASR system.
|Date of Award|1 Aug 2014|
- The University of Manchester
|Supervisor|Allan Ramsay|
- Modern standard Arabic, speech processing, automatic speech recognition, phonological rules, sound-spelling correspondences