Effects of intrinsic and imposed modulation masking on speech perception

Student thesis: PhD


The temporal envelope of speech provides fundamental cues for speech recognition. However, when two spoken sentences are presented simultaneously, the intelligibility of the target speech may be degraded because the temporal envelope of the interferer masks that of the target, a phenomenon known as modulation masking (MM). MM may arise from the natural properties of the competing signals (inherent MM) or from signal processing techniques that operate in the modulation domain, such as dynamic range compression (DRC), which generate imposed MM. Existing competing-speech investigations overlook the effects of inherent MM, and little is known about the perceptual effects of MM imposed by the application of DRC. The present research investigates the role of both inherent and imposed MM in the perception of two competing speech signals, and demonstrates that both forms of MM degrade speech perception. In a first experiment, older listeners with normal hearing identified a set of keywords in a two-talker competing-speech task. The sentences used in the task were produced by different-sex talkers and subjected to varying degrees of DRC. Results revealed that the individual characteristics of the talkers affected the relative intelligibility of the competing speech signals, obscuring the predicted effects of DRC. This empirical finding was supported by theoretical models of MM, which predicted significant variability in inherent MM between the talkers. To limit MM differences across stimuli while preserving fundamental-frequency cues to intelligibility, pitch-shifted sentences spoken by the same talker were used as target and masker in the final (online) experiment. The use of pitch-shifted sentences revealed that aggressive DRC schemes can hinder speech perception and increase listening effort for older listeners with normal hearing. The work presented here demonstrates that both inherent and imposed MM play a critical role in speech perception.
Significant variability in inherent MM can be found in widely used speech corpora and may confound competing-speech investigations. This thesis contributes further insight to recent discussions of the perceptual effects of MM, and lays the foundation for integrating listening-effort assessments into evaluations of the side effects of hearing-aid signal processing on speech perception.
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Patrick Gaydecki (Supervisor), Michael Stone (Supervisor) & Rebecca Millman (Supervisor)


  • intelligibility
  • dynamic range compression
  • listening effort
  • modulation masking
