Date of Award
Spring 2003
Document Type
Thesis - Restricted
Degree Name
Master of Science (MS)
Department
Electrical and Computer Engineering
First Advisor
Johnson, Michael T.
Second Advisor
Ropella, Kristina
Third Advisor
Heinen, James
Abstract
In many speech processing applications, speech must be processed in the presence of undesirable background noise, such as white Gaussian noise, colored noise, multi-talker babble, and car interior noise. Various methods have been applied to suppress ambient noise while minimizing speech distortion. The bionic wavelet transform (BWT) is a time-frequency analysis method recently proposed in the biomedical engineering field (Yao 2001). It is based on an active cochlear model and has proven valuable in cochlear implant research. The standard wavelet transform resembles, to some degree, the way the front end of the human auditory system, mainly the passive cochlea, processes speech; the BWT promises closer similarity to the active cochlea and hence more flexibility and efficiency. Applying this method to speech enhancement is therefore a promising direction. Spectral subtraction methods have been widely used in speech enhancement, but all are notorious for musical tone artifacts. Wiener filtering and Ephraim-Malah filtering have achieved good performance on white Gaussian noise at different SNR levels. Wavelet-based methods using thresholding techniques are promising for coping with real-life noise of various kinds, but many improvements are still needed to make this approach more flexible and robust. The BWT was proposed in 2001, and to date it has not been applied to speech enhancement outside cochlear implant research. Because it integrates a model of the human auditory system into the wavelet transform, the BWT has great potential in speech enhancement and may open a new path in wavelet-based speech processing. In this thesis, basic spectral subtraction, iterative Wiener filtering, Ephraim-Malah filtering, and traditional wavelet thresholding are used as baseline methods for speech enhancement tests.
Segmental signal-to-noise ratio (SSNR) and signal-to-noise ratio (SNR) are adopted for objective speech quality evaluation, and the mean opinion score (MOS) is employed as a subjective measure. Bionic wavelet-based thresholding is then implemented to enhance speech quality, and the performance of the five enhancement methods is compared. Bionic wavelet-based thresholding showed a significant advantage over traditional thresholding, and Ephraim-Malah filtering performed best among all five algorithms.
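For readers unfamiliar with the two objective measures named in the abstract, the following is a minimal sketch of how SNR and segmental SNR are commonly computed from a clean reference signal and an enhanced signal. It is not taken from the thesis: the function names, the frame length of 256 samples, and the per-frame clamping range of -10 dB to 35 dB are all assumptions chosen for illustration, and the thesis may use different settings.

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((s - y) ** 2 for s, y in zip(clean, enhanced))
    if noise_energy == 0.0:
        return math.inf   # enhanced signal matches the reference exactly
    if signal_energy == 0.0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)


def segmental_snr_db(clean, enhanced, frame_len=256,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs over non-overlapping frames.

    Each frame SNR is clamped to [floor_db, ceil_db] so that silent or
    near-perfect frames do not dominate the average. The frame length and
    clamping limits are illustrative assumptions, not values from the thesis.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        frame_clean = clean[start:start + frame_len]
        frame_enh = enhanced[start:start + frame_len]
        value = snr_db(frame_clean, frame_enh)
        frame_snrs.append(min(max(value, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("signals are shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    # Hypothetical toy signals: a sine wave and a slightly attenuated copy.
    reference = [math.sin(2 * math.pi * 5 * t / 1024) for t in range(1024)]
    estimate = [0.95 * s for s in reference]
    print(snr_db(reference, estimate))
    print(segmental_snr_db(reference, estimate))
```

Because segmental SNR averages frame-level values in dB rather than pooling energy over the whole utterance, it weights quiet speech segments more heavily than global SNR does, which is one reason the two measures are often reported together.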
Recommended Citation
Yuan, Xiaolong, "Auditory Model-Based Bionic Wavelet Transform for Speech Enhancement" (2003). Master's Theses (1922-2009) Access restricted to Marquette Campus. 4430.
https://epublications.marquette.edu/theses/4430