Date of Award

Spring 2003

Document Type

Thesis - Restricted

Degree Name

Master of Science (MS)

Department

Electrical and Computer Engineering

First Advisor

Johnson, Michael T.

Second Advisor

Povinelli, Richard J.

Third Advisor

Heinen, James A.

Abstract

A novel method for speech recognition is presented that uses nonlinear/chaotic signal processing techniques to extract time-domain features derived from a reconstructed phase space. By exploiting theoretical results from nonlinear dynamics, a distinct signal processing space, called a reconstructed phase space, can be generated, from which salient features (the natural distribution and trajectory of the attractor) can be extracted for speech recognition. These nonlinear methodologies differ strongly from the linear signal processing techniques traditionally employed for speech recognition. To measure the discriminatory strength of the reconstructed phase space features, isolated phoneme classification experiments are run on the TIMIT corpus and compared against a baseline classifier that uses Mel frequency cepstral coefficient features, the typical benchmark. The features are modeled with statistical methods such as Gaussian Mixture Models and Hidden Markov Models. The results demonstrate that reconstructed phase space features contain substantial discriminatory power, even though the Mel frequency cepstral coefficient features outperformed them in direct comparisons. When the two feature sets are combined, accuracy improves over the baseline, suggesting that the features extracted with the nonlinear techniques carry discriminatory information different from that of the features extracted with linear approaches. These nonlinear methods are particularly interesting because they attack the speech recognition problem in a radically different manner, and they offer an attractive research opportunity for improving speech recognition accuracy.
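
For readers unfamiliar with the term, a reconstructed phase space is commonly built by time-delay embedding: each point collects the current sample together with samples taken at fixed lags. The sketch below illustrates that general construction only. The function name `reconstruct_phase_space`, the embedding dimension `dim`, the lag `lag`, and the sine-wave test signal are all illustrative assumptions; the abstract does not state the parameters or the implementation used in the thesis.

```python
import math


def reconstruct_phase_space(signal, dim=3, lag=2):
    """Time-delay embedding of a 1-D signal.

    Point n is (x[n], x[n + lag], ..., x[n + (dim - 1) * lag]).
    dim and lag are assumed values, not taken from the thesis.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_points = len(signal) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal is too short for this dim and lag")
    return [
        tuple(signal[n + k * lag] for k in range(dim))
        for n in range(n_points)
    ]


# Toy signal standing in for one frame of speech samples (assumption).
signal = [math.sin(0.3 * n) for n in range(50)]
points = reconstruct_phase_space(signal, dim=3, lag=2)
```

Each tuple in `points` is one location on the reconstructed trajectory. A statistical model such as a Gaussian Mixture Model could then be fit to these points to capture their distribution, which is the kind of feature the abstract describes.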

Restricted Access Item