Format of Original
Institute of Engineering and Technology
IET Signal Processing
Original Item ID
In this study, the authors propose multichannel weighted Euclidean (WE) and weighted cosh (WCOSH) cost function estimators for speech enhancement in the distributed microphone scenario. The goal of the work is to illustrate the advantages of utilising additional microphones and modified cost functions for improving signal-to-noise ratio (SNR) and segmental SNR (SSNR) along with log-likelihood ratio (LLR) and perceptual evaluation of speech quality (PESQ) objective metrics over the corresponding single-channel baseline estimators. As with their single-channel counterparts, the perceptually-motivated multichannel WE and WCOSH estimators are functions of a weighting law parameter, which influences attention of the noisy spectral amplitude through a spectral gain function, emphasises spectral peak (formant) information, and accounts for auditory masking effects. Based on the simulation results, the multichannel WE and WCOSH cost function estimators produced gains in SSNR improvement, LLR output and PESQ output over the single-channel baseline results and unweighted cost functions with the best improvements occurring with negative values of the weighting law parameter across all input SNR levels and noise types.
Trawicki, Marek B. and Johnson, Michael T., "Distributed Multichannel Speech Enhancement Based on Perceptually-Motivated Bayesian Estimators of the Spectral Amplitude" (2013). Electrical and Computer Engineering Faculty Research and Publications. 50.
ADA Accessible Version