Date of Award
Fall 1997
Document Type
Dissertation - Restricted
Degree Name
Doctor of Philosophy (PhD)
Department
Electrical and Computer Engineering
First Advisor
Heinen, James
Second Advisor
Brown, Ronald H.
Third Advisor
Richie, James E.
Abstract
The primary focus of this dissertation is the problem of formant tracking of speech embedded in noise. Accordingly, the objective of this work is a formant tracking algorithm that is a significant improvement over existing algorithms. A rather broad approach based upon moments and cumulants was adopted. The rationale for this approach came from studies of autocorrelation, which demonstrated clear signal peak enhancement in the frequency domain. Further, an initial literature search revealed that higher-order spectra are valuable tools, particularly in spectral analysis of signals embedded in noise. The second moment, second cumulant, and autocorrelation are all closely related, and their Fourier transform is the familiar power spectrum. It is well known that signal phase information is lost in the power spectrum. Less well known are the properties of higher-order moments and cumulants. The third moment and third cumulant retain the signal phase information. In the literature, the two-dimensional Fourier transform of the third cumulant is called the bispectrum. A growing number of applications of the bispectrum is being reported in the literature, including application to speech processing. Fourth-order spectra, called trispectra, have received less attention, due in part to the computational complexity of the fourth cumulant and its spectrum. A particularly salient point is that the application of higher-order spectra to formant tracking has not yet been thoroughly investigated. An in-depth investigation of the application of higher-order spectra to formant tracking is the intent of the work reported in this dissertation. A brief introduction to the properties of voiced speech and a discussion of the general problem of formant tracking of noise-corrupted speech are presented in Chapter 1. The formal theoretical foundation of moments and cumulants is established in Chapter 2.
The effective enhancement of signal-to-noise ratio is made clear in this context, as is the retention of signal phase information for the third moment. A fact of central importance in this work is that cuts of the third and fourth moments contain the essential signal information and are discrete-time sequences that can be used in either Fourier spectral estimation or parametric spectral analysis. This fundamental fact is the basis of the formant tracking algorithms investigated. In view of the foregoing, to find the most effective formant tracking algorithm for speech in noise, it was necessary to investigate, in the context of moments and cumulants, classical (Fourier-based) spectral estimation methods. Well-known methods have been applied by many researchers to the second moment and cumulant, i.e., power spectral estimation. Much less work has been done on application of the higher-order moments and cumulants. Accordingly, Fourier-based spectral analysis is reviewed in Chapter 3, and extended, in Chapter 4, to include the bispectrum and trispectrum...
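
The idea that a cut of the third moment is itself a discrete-time sequence suitable for Fourier spectral estimation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the function names `third_moment_diagonal_cut` and `cut_spectrum`, the choice of the diagonal cut m3(tau, tau) = E[x(n) x(n + tau)^2] over non-negative lags only, the Hann window, and the synthetic test signal (two phase-coupled cosines at 500 Hz and 1000 Hz in white Gaussian noise, standing in for harmonically related voiced-speech components).

```python
import numpy as np

def third_moment_diagonal_cut(x, max_lag):
    """Estimate the diagonal cut m3(tau, tau) = E[x(n) * x(n + tau)**2]
    for tau = 0..max_lag by time averaging (illustrative estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # for zero-mean data the third moment equals the third cumulant
    n = len(x)
    return np.array(
        [np.mean(x[: n - tau] * x[tau:] ** 2) for tau in range(max_lag + 1)]
    )

def cut_spectrum(cut, fs, nfft=1024):
    """Fourier-based magnitude spectrum of a moment cut (Hann window assumed)."""
    windowed = (cut - cut.mean()) * np.hanning(len(cut))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, np.abs(np.fft.rfft(windowed, n=nfft))

# Hypothetical test signal: phase-coupled harmonics plus white Gaussian noise.
fs = 8000.0
n = np.arange(4096)
rng = np.random.default_rng(0)
clean = np.cos(2 * np.pi * 500 * n / fs) + np.cos(2 * np.pi * 1000 * n / fs)
noisy = clean + rng.normal(0.0, 1.0, size=n.size)  # roughly 0 dB SNR

cut = third_moment_diagonal_cut(noisy, max_lag=127)
freqs, mag = cut_spectrum(cut, fs)
peak_hz = freqs[np.argmax(mag)]  # location of the strongest spectral peak
```

Because the third moment of zero-mean Gaussian noise vanishes in expectation, the noise contributes only estimation variance to the cut, which is one way to read the signal-to-noise enhancement mentioned in the abstract. For this particular synthetic signal the ideal diagonal cut works out to 0.5 cos(w tau) + 0.25 cos(2 w tau), so the spectrum of the cut is expected to peak near the 500 Hz component.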