Document Type

Article

Language

eng

Format of Original

8 p.

Publication Date

6-2013

Publisher

Institution of Engineering and Technology

Source Publication

IET Signal Processing

Source ISSN

1751-9683

Original Item ID

doi: 10.1049/iet-spr.2012.0167

Abstract

In this study, the authors propose multichannel weighted Euclidean (WE) and weighted cosh (WCOSH) cost function estimators for speech enhancement in the distributed microphone scenario. The goal of the work is to illustrate the advantages of utilising additional microphones and modified cost functions for improving signal-to-noise ratio (SNR) and segmental SNR (SSNR), along with log-likelihood ratio (LLR) and perceptual evaluation of speech quality (PESQ) objective metrics, over the corresponding single-channel baseline estimators. As with their single-channel counterparts, the perceptually motivated multichannel WE and WCOSH estimators are functions of a weighting law parameter, which influences attenuation of the noisy spectral amplitude through a spectral gain function, emphasises spectral peak (formant) information, and accounts for auditory masking effects. Based on the simulation results, the multichannel WE and WCOSH cost function estimators produced gains in SSNR improvement, LLR output and PESQ output over the single-channel baseline results and unweighted cost functions, with the best improvements occurring with negative values of the weighting law parameter across all input SNR levels and noise types.

Comments

Accepted version. IET Signal Processing, Vol. 7, No. 4 (June 2013): 337-344. DOI: 10.1049/iet-spr.2012.0167. © 2013 The Institute of Electrical and Electronics Engineers. Used with permission.
