Visual recognition of speech consonants using facial movement features

Yau, W, Kumar, D and Poosapadi Arjunan, S 2007, 'Visual recognition of speech consonants using facial movement features', Integrated Computer-Aided Engineering, vol. 14, no. 1, pp. 49-61.


Document type: Journal Article
Collection: Journal Articles

Title: Visual recognition of speech consonants using facial movement features
Author(s): Yau, W; Kumar, D; Poosapadi Arjunan, S
Year: 2007
Journal name: Integrated Computer-Aided Engineering
Volume number: 14
Issue number: 1
Start page: 49
End page: 61
Total pages: 13
Publisher: IOS Press
Abstract: This paper presents a visual speech recognition technique that uses video of facial movement. Because the acoustic signals of consonants are easily confused in noisy environments, the paper focuses on identifying consonants from visual information and investigates the feasibility of using facial movements to identify phonemes. The proposed approach adopts a visual speech model based on the viseme model of the Moving Picture Experts Group 4 (MPEG-4) standard. It is a movement-based system: the facial movements are segmented from the video using an accumulative image subtraction method that yields a 2-D grayscale motion history image (MHI). The MHI is represented using a combination of the discrete stationary wavelet transform (SWT) and image moments (Hu moments, geometric moments and Zernike moments). Feedforward multilayer perceptron (MLP) neural networks trained with the backpropagation (BPN) learning algorithm classify the features, which allows the performance of the three moment features to be compared. The experimental results indicate that Zernike moments have better representation ability and provide rotation invariance for the proposed application. The results also demonstrate that the proposed technique can identify consonants reliably using the viseme model of the MPEG-4 standard, with a recognition rate of 85%.
Subject: Artificial Intelligence and Image Processing not elsewhere classified
Copyright notice: © 2007 - IOS Press and the author(s). All rights reserved.
ISSN: 1069-2509
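
The accumulative image subtraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes a common MHI formulation: consecutive frames are differenced, the difference is thresholded, and each pixel stores a gray level proportional to the time index of its most recent motion. The function name `motion_history_image`, the `threshold` default of 25, and the scaling to 0..255 are illustrative assumptions, not details taken from the paper.

```python
def motion_history_image(frames, threshold=25):
    """Collapse a grayscale frame sequence into one motion history image.

    frames: list of equally sized 2-D lists of intensities in 0..255.
    Returns a 2-D list in 0..255 where brighter means more recent motion.
    The threshold and the linear time-to-gray scaling are assumptions.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    rows, cols = len(frames[0]), len(frames[0][0])
    n_diffs = len(frames) - 1
    mhi = [[0] * cols for _ in range(rows)]
    for t in range(1, len(frames)):
        prev, curr = frames[t - 1], frames[t]
        for r in range(rows):
            for c in range(cols):
                # Binarised difference of frames: did this pixel change?
                if abs(curr[r][c] - prev[r][c]) >= threshold:
                    # Later motion overwrites earlier motion, so the final
                    # value encodes the most recent time the pixel moved.
                    mhi[r][c] = (255 * t) // n_diffs
    return mhi


# Three tiny 1x3 frames: pixel 0 changes first, pixel 1 changes last,
# pixel 2 never changes.
frames = [
    [[0, 0, 0]],
    [[100, 0, 0]],
    [[100, 100, 0]],
]
print(motion_history_image(frames))
```

In this toy sequence the pixel that moved last receives the brightest value and the static pixel stays at zero, which is the property that lets a single grayscale image summarise where and when the mouth moved.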
Citation counts: Cited 17 times in Thomson Reuters Web of Science; cited 18 times in Scopus
Created: Mon, 06 Dec 2010, 14:11:00 EST by Catalyst Administrator