Date of Award
Spring 2004
Document Type
Thesis - Restricted
Degree Name
Master of Science (MS)
Department
Electrical and Computer Engineering
First Advisor
Johnson, Michael T.
Second Advisor
Povinelli, Richard J.
Third Advisor
Yaz, Edwin E.
Abstract
A speech recognition system automatically transcribes speech into text. As computing power has advanced and sophisticated tools have become available, there has been significant progress in this field. However, a large gap still exists between the performance of Automatic Speech Recognition (ASR) systems and that of human listeners. In this thesis, a novel signal analysis technique using Reconstructed Phase Spaces (RPS) is presented for speech recognition. The most widely used techniques for acoustic modeling are currently derived from frequency domain feature extraction. The reconstructed phase space modeling technique, drawn from dynamical systems methods, addresses the acoustic modeling problem in the time domain instead. Such a method has the potential to capture nonlinear information usually ignored by the traditional linear human speech production model. The features from this time domain approach can be used for speech recognition when combined with statistical modeling techniques such as Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM). Issues associated with this RPS approach are discussed, and experiments are conducted using the TIMIT database. Most of this work focuses on isolated phoneme classification, with some extended work presented on continuous speech recognition. Direct statistical modeling of the RPS can be used for isolated phoneme recognition. The Singular Value Decomposition (SVD) is used to extract frame-based features from the RPS, and can be applied to both isolated phoneme recognition and continuous speech recognition.
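As a rough illustration of the two steps named in the abstract, the sketch below builds a reconstructed phase space by time-delay embedding and then takes an SVD of each frame's embedded points. It is not the thesis's implementation. The embedding dimension, time lag, frame length, hop size, the use of singular values as the feature vector, and the function names `reconstruct_phase_space` and `svd_frame_features` are all assumptions chosen for this example; the abstract does not specify any of them.

```python
import numpy as np

def reconstruct_phase_space(signal, dim=3, lag=6):
    """Time-delay embedding: row n is [x[n], x[n+lag], ..., x[n+(dim-1)*lag]].

    dim and lag are illustrative defaults, not values taken from the thesis.
    """
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for the chosen dim and lag")
    return np.column_stack([x[i * lag : i * lag + n_points] for i in range(dim)])

def svd_frame_features(signal, frame_len=400, hop=160, dim=3, lag=6):
    """Return one feature vector per frame.

    Assumption: the feature vector is the set of singular values of the
    mean-centered RPS matrix of that frame. The thesis may use the SVD
    differently.
    """
    x = np.asarray(signal, dtype=float)
    feats = []
    for start in range(0, len(x) - frame_len + 1, hop):
        rps = reconstruct_phase_space(x[start:start + frame_len], dim, lag)
        rps = rps - rps.mean(axis=0)  # center the point cloud
        feats.append(np.linalg.svd(rps, compute_uv=False))
    return np.array(feats)

# Toy input: 0.1 s of a 200 Hz sine sampled at 16 kHz (a stand-in for speech).
t = np.arange(1600) / 16000.0
toy = np.sin(2 * np.pi * 200 * t)

rps = reconstruct_phase_space(toy)   # point cloud in the embedded space
features = svd_frame_features(toy)   # one row of singular values per frame
```

Frame-level vectors of this kind could then be passed to an HMM or GMM back end in the same way as conventional frequency domain features.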
Recommended Citation
Ye, Jinjin, "Speech Recognition Using Time Domain Features from Phase Space Reconstructions" (2004). Master's Theses (1922-2009) Access restricted to Marquette Campus. 4355.
https://epublications.marquette.edu/theses/4355