Using spatial audio cues from speech excitation for meeting speech segmentation

Cheng, E and Burnett, I 2006, 'Using spatial audio cues from speech excitation for meeting speech segmentation', in 8th International Conference on Signal Processing (ICSP '06), Beijing, China, 16-20 November 2006, pp. 3067-3070.


Document type: Conference Paper
Collection: Conference Papers

Title Using spatial audio cues from speech excitation for meeting speech segmentation
Author(s) Cheng, E
Burnett, I
Year 2006
Conference name ICSP 2006
Conference location Beijing, China
Conference dates 16-20 November 2006
Proceedings title 8th International Conference on Signal Processing (ICSP '06)
Publisher IEEE
Place of publication USA
Start page 3067
End page 3070
Total pages 4
Abstract Multiparty meetings generally involve stationary participants. Participant location information can thus be used to segment the recorded meeting speech into each speaker's 'turn' for meeting 'browsing'. To represent speaker location information from speech, previous research showed that the most reliable time delay estimates are extracted from the Hilbert envelope of the linear prediction residual signal. The authors' past work has proposed the use of spatial audio cues to represent speaker location information. This paper proposes extracting spatial audio cues from the Hilbert envelope of the speech residual for indicating changing speaker location for meeting speech segmentation. Experiments conducted on recordings of a real acoustic environment show that spatial cues from the Hilbert envelope are more consistent across frequency subbands and can clearly distinguish between spatially distributed speakers, compared to spatial cues estimated from the recorded speech or residual signal.
Subjects Communications Technologies not elsewhere classified
Electrical and Electronic Engineering not elsewhere classified
DOI - identifier 10.1109/ICOSP.2006.346086
Copyright notice © IEEE 2006
ISBN 0780397371
Created: Mon, 18 Mar 2013, 14:42:00 EST by Catalyst Administrator