Date of Award

Spring 2004

Document Type

Thesis - Restricted

Degree Name

Master of Science (MS)

Department

Electrical and Computer Engineering

First Advisor

Johnson, Michael T.

Second Advisor

Povinelli, Richard J.

Third Advisor

Yaz, Edwin E.

Abstract

A speech recognition system automatically transcribes speech into text. As computing power has advanced and sophisticated tools have become available, the field has made significant progress. However, a large gap still exists between the performance of Automatic Speech Recognition (ASR) systems and that of human listeners. In this thesis, a novel signal analysis technique using Reconstructed Phase Spaces (RPS) is presented for speech recognition. The most widely used techniques for acoustic modeling are currently derived from frequency domain feature extraction. The reconstructed phase space modeling technique, taken from dynamical systems methods, addresses the acoustic modeling problem in the time domain instead. Such a method has the potential to capture nonlinear information usually ignored by the traditional linear model of human speech production. The features from this time domain approach can be used for speech recognition when combined with statistical modeling techniques such as Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM). Issues associated with the RPS approach are discussed, and experiments are performed using the TIMIT database. Most of this work focuses on isolated phoneme classification, with some extended work presented on continuous speech recognition. Direct statistical modeling of the RPS can be used for isolated phoneme recognition. The Singular Value Decomposition (SVD) is used to extract frame-based features from the RPS, and can be applied to both isolated phoneme recognition and continuous speech recognition.
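
As an illustration of the two steps named in the abstract, the following minimal sketch builds a reconstructed phase space by time-delay embedding and then takes the singular values of one frame's embedded points as a feature vector. It is not the thesis's implementation: the function names, the embedding dimension `dim`, the time lag `lag`, the mean-centering, and the normalization of the singular values are all assumptions chosen for the example.

```python
import numpy as np

def reconstruct_phase_space(signal, dim=3, lag=6):
    """Time-delay embedding: row n is [x[n], x[n+lag], ..., x[n+(dim-1)*lag]].

    dim and lag are illustrative defaults, not values taken from the thesis.
    """
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for the requested dim and lag")
    return np.column_stack([x[i * lag : i * lag + n_points] for i in range(dim)])

def svd_frame_features(frame, dim=3, lag=6):
    """Hypothetical frame-level feature: normalized singular values of the RPS."""
    rps = reconstruct_phase_space(frame, dim, lag)
    rps = rps - rps.mean(axis=0)                 # center the trajectory (assumption)
    s = np.linalg.svd(rps, compute_uv=False)     # singular values, descending
    total = s.sum()
    return s / total if total > 0 else s         # guard against a constant frame

if __name__ == "__main__":
    # Synthetic stand-in for one speech frame (not TIMIT data).
    t = np.arange(400)
    frame = np.sin(2 * np.pi * t / 50) + 0.5 * np.sin(2 * np.pi * t / 17)
    print(svd_frame_features(frame))
```

In a recognizer, one such vector per frame would be passed to a statistical model such as a GMM or HMM, in place of or alongside frequency domain features.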

Restricted Access Item