Visual speech recognition using image moments and multiresolution wavelet

Yau, W, Kumar, D, Poosapadi Arjunan, S and Kumar, S 2006, 'Visual speech recognition using image moments and multiresolution wavelet', in M. Piccardi (ed.) International Conference on Computer Graphics, Imaging and Visualisation 2006, Sydney, Australia, 26-28 July 2006, pp. 194-199.


Document type: Conference Paper
Collection: Conference Papers

Title Visual speech recognition using image moments and multiresolution wavelet
Author(s) Yau, W
Kumar, D
Poosapadi Arjunan, S
Kumar, S
Year 2006
Conference name International Conference on Computer Graphics, Imaging and Visualisation
Conference location Sydney, Australia
Conference dates 26-28 July 2006
Proceedings title International Conference on Computer Graphics, Imaging and Visualisation 2006
Editor(s) M. Piccardi
Publisher IEEE
Place of publication Sydney, Australia
Start page 194
End page 199
Total pages 6
Abstract This paper describes a new technique for recognizing speech from visual speech information. The video data of the speaker’s mouth is represented as grayscale images called motion history images (MHIs). An MHI is generated by applying accumulative image differencing to the frames of the video, so that it implicitly represents the temporal information of the mouth movement. The MHIs are decomposed into wavelet sub-images using the discrete stationary wavelet transform (SWT). Three types of moment-based features (geometric moments, Zernike moments and Hu moments) are extracted from the SWT approximation sub-images. A multilayer perceptron (MLP) artificial neural network (ANN) trained with the back-propagation learning algorithm is used to classify the moment features. This paper evaluates and compares the image representation ability of the different moments. The initial experiments show that this method can classify English consonants with an error rate of less than 5%.
Subjects Signal Processing
Keyword(s) visual speech recognition
motion history image
image moments
discrete stationary wavelet transform
DOI - identifier 10.1109/CGIV.2006.92
Copyright notice © 2006 IEEE
ISBN 0-7695-2606-3
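
The first two stages named in the abstract (an MHI built by accumulative image differencing, followed by geometric moments) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one common MHI formulation in which each thresholded frame difference is weighted by its frame index and the per-pixel maximum is kept, so that more recent motion appears brighter. The function names, the `threshold` value and the scaling to 0..255 are hypothetical choices made for illustration. The SWT decomposition, the Zernike and Hu moments, and the MLP classifier are left out.

```python
import numpy as np


def motion_history_image(frames, threshold=25):
    """Build a grayscale MHI from a sequence of grayscale frames.

    frames: array-like of shape (T, H, W) with T >= 2, intensities in 0..255.
    threshold: assumed cut-off for deciding that a pixel has changed.
    """
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    if n < 2:
        raise ValueError("at least two frames are needed")
    mhi = np.zeros(frames.shape[1:], dtype=np.float64)
    for t in range(1, n):
        # Binarised difference between consecutive frames.
        moving = np.abs(frames[t] - frames[t - 1]) >= threshold
        # Weight by the frame index and keep the per-pixel maximum,
        # so later motion overwrites earlier motion with a larger value.
        mhi = np.maximum(mhi, moving * t)
    # Scale so that the most recent motion maps to 255.
    return (255.0 * mhi / (n - 1)).astype(np.uint8)


def geometric_moment(image, p, q):
    """Raw geometric moment m_pq = sum over x, y of x^p * y^q * I(x, y)."""
    image = np.asarray(image, dtype=np.float64)
    y, x = np.mgrid[: image.shape[0], : image.shape[1]]
    return float(np.sum((x ** p) * (y ** q) * image))


if __name__ == "__main__":
    # Three tiny synthetic 2x2 frames: one pixel lights up, then another.
    f0 = np.zeros((2, 2))
    f1 = f0.copy()
    f1[0, 0] = 255
    f2 = f1.copy()
    f2[1, 1] = 255
    mhi = motion_history_image([f0, f1, f2])
    print(mhi)
    print(geometric_moment(mhi, 0, 0), geometric_moment(mhi, 1, 0))
```

In this sketch the pixel that changed last receives the highest intensity, which is how a single grayscale image can carry temporal information about the mouth movement. In the pipeline the abstract describes, the moments would be computed on the SWT approximation sub-image rather than directly on the MHI as done here.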
Created: Wed, 08 Apr 2009, 09:42:32 EST by Catalyst Administrator