Visual speech recognition using dynamic features and support vector machines

Yau, W, Kumar, D and Poosapadi Arjunan, S 2008, 'Visual speech recognition using dynamic features and support vector machines', International Journal of Image and Graphics, vol. 8, no. 3, pp. 419-437.


Document type: Journal Article
Collection: Journal Articles

Title Visual speech recognition using dynamic features and support vector machines
Author(s) Yau, W
Kumar, D
Poosapadi Arjunan, S
Year 2008
Journal name International Journal of Image and Graphics
Volume number 8
Issue number 3
Start page 419
End page 437
Total pages 19
Publisher World Scientific Publishing Co Pte Ltd
Abstract This paper presents a vision-based technique for identifying unspoken phones using a small camera mounted on the speaker's headset. The system temporally integrates the video data to generate a motion history image (MHI). The paper proposes using global features to classify the MHI and compares image moments with the Discrete Cosine Transform (DCT). A comparison of Zernike moments (ZM) and DCT indicates that the classification accuracy of the two techniques is comparable (96% for ZM and 94% for DCT) when there is no relative motion between the camera and the mouth. ZM, however, is resilient to rotation of the camera and continues to give good results under rotation, whereas DCT is sensitive to it. Based on the accuracy of the system and its resilience to movement artefacts such as rotation, the authors propose using such a system as a human-computer interface. Such a system could be invaluable when it is important to communicate without making a sound, for example when giving a password in an open office or a public space.
Subject Signal Processing
Keyword(s) visual speech recognition
motion segmentation
zernike moments
discrete cosine transforms
support vector machines
hidden markov models
DOI - identifier 10.1142/S0219467808003167
ISSN 0219-4678
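
As an illustration of the pipeline the abstract describes, the sketch below builds a motion history image by temporally integrating thresholded frame differences, then takes low-order 2-D DCT coefficients as global features. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the integration rule, so the "most recent motion is brightest" scheme, the function names (`motion_history_image`, `dct2`, `dct_features`), the `threshold` value, and the number of coefficients kept are all assumptions. The Zernike moment features and the support vector machine classifier are omitted.

```python
import math


def motion_history_image(frames, threshold=10):
    """Collapse a grayscale frame sequence into one motion history image.

    Each pixel stores the normalised index of the most recent frame in which
    its intensity changed by more than `threshold` (an assumed rule);
    0.0 means no motion was seen at that pixel.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    rows, cols = len(frames[0]), len(frames[0][0])
    mhi = [[0.0] * cols for _ in range(rows)]
    last = len(frames) - 1
    for t in range(1, len(frames)):
        prev, curr = frames[t - 1], frames[t]
        for r in range(rows):
            for c in range(cols):
                if abs(curr[r][c] - prev[r][c]) > threshold:
                    # Later motion overwrites earlier motion.
                    mhi[r][c] = t / last
    return mhi


def dct2(image):
    """Orthonormal 2-D DCT-II, computed separably (rows, then columns)."""

    def dct1(vec):
        n = len(vec)
        out = []
        for k in range(n):
            total = sum(
                v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, v in enumerate(vec)
            )
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            out.append(scale * total)
        return out

    rows_done = [dct1(row) for row in image]
    cols_done = [dct1(list(col)) for col in zip(*rows_done)]
    # Transpose back so the result is indexed [vertical][horizontal] frequency.
    return [list(row) for row in zip(*cols_done)]


def dct_features(image, size=2):
    """Keep the top-left size x size block of DCT coefficients as features."""
    coeffs = dct2(image)
    return [coeffs[u][v] for u in range(size) for v in range(size)]
```

A feature vector for one utterance would then be `dct_features(motion_history_image(frames))`, which could be passed to any classifier. The pure-Python loops are chosen only to keep the sketch dependency-free; they are far too slow for real video, where an array library would be the usual choice.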
Created: Wed, 22 Dec 2010, 10:15:00 EST by Catalyst Administrator