Format of Original
Acoustical Society of America
Journal of the Acoustical Society of America
Original Item ID
Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures.