Distributed multichannel processing for signal enhancement

Marek B Trawicki, Marquette University

Abstract

The goal of this work is to generalize speech enhancement methods from single-channel microphones, dual-channel microphones, and microphone arrays to distributed microphones. The focus has been on developing and implementing robust, optimal time-domain and frequency-domain estimators of the true source signal in this configuration and on measuring the resulting performance improvement with both objective (e.g., signal-to-noise ratio) and subjective (e.g., listening tests) metrics. Statistical estimation techniques (e.g., minimum mean-square error, or MMSE) with Gaussian speech priors and Gaussian noise likelihoods have been used to derive solutions for five basic classes of estimators: (1) time domain; (2) spectral amplitude; (3) perceptually motivated spectral amplitude; (4) spectral phase; and (5) complex real and imaginary spectral components. Experimental work using different attenuation factors on the true source signal (e.g., unity, linear, and logarithmic) demonstrates significant gains in segmental signal-to-noise ratio (SSNR) as the number of microphones increases. Of particular importance is the combination of the optimal MMSE spectral phase estimator with the spectral amplitude estimators. Overall, the statistical estimators show strong promise for distributed-microphone speech enhancement of noisy acoustic signals, with applications to consumer, industrial, and military products operating in severely noisy environments.
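
To illustrate the flavor of the approach, the sketch below shows, under stated assumptions, the MMSE estimate of a single complex spectral coefficient observed at K distributed microphones with known per-channel attenuation factors, independent zero-mean Gaussian noise, and a zero-mean Gaussian speech prior, together with a segmental SNR (SSNR) routine of the kind used for objective evaluation. The function names (mmse_complex_estimate, segmental_snr), the frame length, hop size, SSNR clamping limits, and the toy attenuation and noise values are illustrative assumptions, not values or code taken from this work.

import numpy as np

def mmse_complex_estimate(Y, a, noise_var, speech_var):
    """
    MMSE estimate of a complex spectral coefficient S from K microphones,
    assuming Y[k] = a[k] * S + N[k] with independent zero-mean Gaussian
    noise N[k] of variance noise_var[k] and a zero-mean Gaussian speech
    prior of variance speech_var. Under these assumptions the posterior
    mean is a Wiener-style weighted combination of the channels.
    """
    Y = np.asarray(Y, dtype=complex)
    a = np.asarray(a, dtype=float)
    noise_var = np.asarray(noise_var, dtype=float)
    numerator = np.sum(a * Y / noise_var)
    denominator = 1.0 / speech_var + np.sum(a ** 2 / noise_var)
    return numerator / denominator

def segmental_snr(clean, enhanced, frame_len=256, hop=128,
                  floor_db=-10.0, ceil_db=35.0):
    """
    Segmental SNR (SSNR) in dB: per-frame SNR between the clean reference
    and the enhanced signal, clamped to [floor_db, ceil_db] and averaged.
    Frame and clamp settings are common defaults, not the paper's values.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    snrs = []
    for start in range(0, len(clean) - frame_len + 1, hop):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        err = c - e
        snr = 10.0 * np.log10((np.sum(c ** 2) + 1e-12) /
                              (np.sum(err ** 2) + 1e-12))
        snrs.append(np.clip(snr, floor_db, ceil_db))
    return float(np.mean(snrs))

# Toy usage: one frequency bin observed at three microphones with
# decreasing attenuation factors and increasing noise power.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_S = 1.0 + 0.5j
    a = np.array([1.0, 0.5, 0.3])          # per-microphone attenuation
    noise_var = np.array([0.1, 0.2, 0.4])  # per-microphone noise power
    noise = (rng.normal(scale=np.sqrt(noise_var / 2), size=3)
             + 1j * rng.normal(scale=np.sqrt(noise_var / 2), size=3))
    Y = a * true_S + noise
    S_hat = mmse_complex_estimate(Y, a, noise_var, speech_var=2.0)
    print("MMSE estimate:", S_hat)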

This paper has been withdrawn.