Lip Reading for robust speech recognition on embedded devices

Jesús F. Guitarte Pérez, Alejandro F. Frangi, Eduardo Lleida Solano, Klaus Lukas

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This article presents a complete audio-visual speech recognition system suitable for embedded devices. Active Shape Models (ASM) and the Discrete Cosine Transform (DCT) were investigated as visual feature extraction algorithms and are discussed with a view to an embedded implementation. The integration of audio and visual information was also designed with device limitations in mind. It is well known that visual cues improve recognition results, especially in scenarios with high levels of acoustic noise. We compare the performance of Lip Reading and of conventional Noise Reduction systems in these degraded scenarios, as well as the combination of both kinds of solution. Substantial improvements are obtained, especially for non-stationary background noises such as voice interference, car acceleration, or indicator clicks. For these kinds of noise, Lip Reading outperforms conventional Noise Reduction technologies.

Original language: English
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: IEEE
Pages: 473-476
Number of pages: 4
ISBN (Print): 0780388747, 9780780388741
Publication status: Published - 2005
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: 18 Mar 2005 - 23 Mar 2005

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (Print): 1520-6149

Conference

Conference: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Country/Territory: United States
City: Philadelphia, PA
Period: 18/03/05 - 23/03/05
