Visual speech recognition and utterance segmentation based on mouth movement

Yau, W, Weghorn, H and Kumar, D 2007, 'Visual speech recognition and utterance segmentation based on mouth movement', in M. Bottema, A. Maeder, N. Redding and A. van den Hengel (eds.) Digital Image Computing : Techniques and Applications (DICTA 2007), Glenelg, Australia, 3-5 December 2007, pp. 7-14.


Document type: Conference Paper
Collection: Conference Papers

Title Visual speech recognition and utterance segmentation based on mouth movement
Author(s) Yau, W
Weghorn, H
Kumar, D
Year 2007
Conference name Digital Image Computing : Techniques and Applications (DICTA 2007)
Conference location Glenelg, Australia
Conference dates 3-5 December 2007
Proceedings title Digital Image Computing : Techniques and Applications (DICTA 2007)
Editor(s) M. Bottema
A. Maeder
N. Redding
A. van den Hengel
Publisher IEEE
Place of publication Piscataway, USA
Start page 7
End page 14
Total pages 8
Abstract This paper presents a vision-based approach to recognizing speech without evaluating the acoustic signals. The proposed technique combines motion features and support vector machines (SVMs) to classify utterances. Segmentation of utterances is important in a visual speech recognition system. This research proposes a video segmentation method to detect the start and end frames of isolated utterances in an image sequence. Frames that correspond to 'speaking' and 'silence' phases are identified based on mouth movement information. The experimental results demonstrate that the proposed visual speech recognition technique yields high accuracy in a phoneme classification task. Potential applications of such a system include human-computer interfaces (HCI) for mobility-impaired users, lip-reading mobile phones, in-vehicle systems, and improvement of speech-based computer control in noisy environments.
Subjects Biomedical Engineering not elsewhere classified
DOI - identifier 10.1109/DICTA.2007.4426769
Copyright notice © 2007 IEEE
ISBN 0-7695-3067-2