Document Type
Conference Proceeding
Language
eng
Format of Original
4 p.
Publication Date
4-6-2003
Publisher
Institute of Electrical and Electronics Engineers
Source Publication
Acoustics, Speech, and Signal Processing, 2003
Source ISSN
1520-6149
Original Item ID
doi: 10.1109/ICASSP.2003.1198716
Abstract
The paper presents a novel method for speech recognition by utilizing nonlinear/chaotic signal processing techniques to extract time-domain based phase space features. By exploiting the theoretical results derived in nonlinear dynamics, a processing space called a reconstructed phase space can be generated where a salient model (the natural distribution of the attractor) can be extracted for speech recognition. To discover the discriminatory power of these features, isolated phoneme classification experiments were performed using the TIMIT corpus and compared to a baseline classifier that uses MFCC (Mel frequency cepstral coefficient) features. The results demonstrate that phase space features contain substantial discriminatory power, even though MFCC features outperformed the phase space features on direct comparisons. The authors conjecture that phase space and MFCC features used in combination within a classifier may yield increased accuracy for various speech recognition tasks.
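As an illustration of the reconstructed phase space named in the abstract, the following is a minimal sketch of a time-delay embedding, the standard construction from nonlinear dynamics. The function names, the embedding dimension `dim`, and the time lag `lag` are illustrative assumptions and are not taken from the paper. The occupancy histogram is only a simple stand-in for "the natural distribution of the attractor"; the abstract does not say how the authors estimate it.

```python
import numpy as np


def reconstruct_phase_space(signal, dim, lag):
    """Time-delay embedding of a 1-D signal.

    Row i of the result is the point
    [x[n], x[n - lag], ..., x[n - (dim - 1) * lag]] with n = (dim - 1) * lag + i.
    `dim` and `lag` are hypothetical parameters chosen by the caller.
    """
    x = np.asarray(signal, dtype=float)
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag
    if x.size <= start:
        raise ValueError("signal is too short for this dim and lag")
    n_points = x.size - start
    # Column k holds the signal delayed by k * lag samples.
    columns = [x[start - k * lag: start - k * lag + n_points] for k in range(dim)]
    return np.column_stack(columns)


def occupancy_histogram(points, bins=10):
    """Normalized histogram of phase space points (assumed stand-in for the
    attractor's natural distribution, not the paper's stated estimator)."""
    hist, edges = np.histogramdd(points, bins=bins)
    return hist / hist.sum(), edges


if __name__ == "__main__":
    # Toy signal in place of a speech waveform.
    t = np.linspace(0.0, 1.0, 400)
    toy = np.sin(2 * np.pi * 5 * t)
    points = reconstruct_phase_space(toy, dim=3, lag=6)
    density, _ = occupancy_histogram(points, bins=8)
    print(points.shape, density.shape)
```

Each row of `points` is one location on the reconstructed trajectory, so a classifier could in principle compare such distributions across phoneme classes; how the paper actually models and compares them is not stated in this record.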
Recommended Citation
Lindgren, Andrew C.; Johnson, Michael T.; and Povinelli, Richard J., "Speech Recognition using Reconstructed Phase Space Features" (2003). Electrical and Computer Engineering Faculty Research and Publications. 122.
https://epublications.marquette.edu/electric_fac/122
Comments
Accepted version. Published as part of the proceedings of the conference, Acoustics, Speech, and Signal Processing, 2003, Vol. 1: 60-63. DOI: 10.1109/ICASSP.2003.1198716. © 2003 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.