Document Type

Article

Language

eng

Format of Original

15 p.

Publication Date

7-2006

Publisher

Elsevier

Source Publication

Speech Communication

Source ISSN

0167-6393

Original Item ID

doi: 10.1016/j.specom.2004.12.002

Abstract

A novel method combining filter banks and reconstructed phase spaces is proposed for the modeling and classification of speech. Reconstructed phase spaces, which are based on dynamical systems theory, have advantages over spectral-based analysis methods in that they can capture nonlinear or higher-order statistics. Recent work has shown that the natural measure of a reconstructed phase space can be used for modeling and classification of phonemes. In this work, sub-banding of speech, which has been examined for recognition of noise-corrupted speech, is studied in combination with phase space reconstruction. This sub-banding, which is motivated by empirical psychoacoustical studies, is shown to dramatically improve the phoneme classification accuracy of reconstructed phase space-based approaches. Experiments that examine the performance of fused sub-banded reconstructed phase spaces for phoneme classification are presented. Comparisons against a cepstral-based classifier show that the proposed approach is competitive with state-of-the-art methods for modeling and classification of phonemes. Combination of cepstral-based features and the sub-band RPS features shows improvement over a cepstral-only baseline.
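As a rough illustration of the reconstructed phase space idea mentioned in the abstract, the sketch below builds a time-delay embedding of a one-dimensional signal. It is a minimal sketch, not the authors' implementation: the function name `delay_embed`, the forward-lag convention, the embedding dimension, the lag, and the toy input are all assumptions chosen for illustration. In a sub-band setting, the same embedding would presumably be applied to the output of each filter in the bank.

```python
from typing import List, Sequence, Tuple


def delay_embed(signal: Sequence[float], dim: int, lag: int) -> List[Tuple[float, ...]]:
    """Return a time-delay embedding of a 1-D signal.

    Point n is (x[n], x[n + lag], ..., x[n + (dim - 1) * lag]).
    The forward-lag convention used here is an assumption; a backward
    lag gives the same set of points in a different coordinate order.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    span = (dim - 1) * lag         # samples needed beyond the first coordinate
    n_points = len(signal) - span  # number of complete embedded points
    if n_points <= 0:
        return []                  # signal too short for this dim and lag
    return [
        tuple(signal[n + k * lag] for k in range(dim))
        for n in range(n_points)
    ]


# Toy input standing in for one sub-band of a speech frame (hypothetical values).
frame = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
points = delay_embed(frame, dim=3, lag=2)
```

Each element of `points` is one point in the reconstructed phase space; a statistical model of how such points are distributed could then serve as the basis for classification.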

Comments

Accepted version. Speech Communication, Vol. 48, No. 7 (July 2006): 760-774. DOI: 10.1016/j.specom.2004.12.002. © 2006 Elsevier. Used with permission.
