Grant Title

Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations

Document Type

Conference Proceeding

Publication Date

4-2007

Source Publication

IEEE International Conference on Acoustics, Speech and Signal Processing, 2007: ICASSP; Honolulu, Hawaii, April 15-20, 2007

Source ISBN/ISSN

ISBN: 1-4244-0727-3; ISSN: 1520-6149

Abstract

In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency cepstral coefficients (MFCCs) for human speech and Greenwood function cepstral coefficients (GFCCs) for animal vocalizations. Hidden Markov models (HMMs) with Gaussian mixture model (GMM) state distributions are used for classification. The appended jitter and shimmer features result in an increase in classification accuracy for several illustrative datasets, including the SUSAS dataset for human speaking styles as well as vocalizations labeled by arousal level for African elephant and Rhesus monkey species.
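The abstract describes appending jitter and shimmer, derived from the fundamental frequency contour, to spectral features (MFCCs or GFCCs) before HMM/GMM classification. The snippet below is a minimal sketch of that feature-construction step only, assuming librosa for F0 estimation (pyin) and MFCC extraction; the paper's exact jitter and shimmer definitions, its GFCC front end, and its HMM/GMM back end are not reproduced here.

import numpy as np
import librosa


def jitter_shimmer(y, sr, fmin=60.0, fmax=400.0):
    """Estimate relative jitter and shimmer for one utterance.

    Jitter: mean absolute difference of consecutive pitch periods,
    normalized by the mean period. Shimmer: the analogous measure on
    per-frame peak amplitudes (an approximation of cycle-peak shimmer).
    """
    f0, voiced, _ = librosa.pyin(y, fmin=fmin, fmax=fmax, sr=sr)
    f0 = f0[voiced & np.isfinite(f0)]
    if f0.size < 3:
        return 0.0, 0.0

    periods = 1.0 / f0
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    # Per-frame peak amplitude as a stand-in for per-cycle peak amplitude.
    frames = librosa.util.frame(y, frame_length=2048, hop_length=512)
    amps = np.max(np.abs(frames), axis=0)
    shimmer = np.mean(np.abs(np.diff(amps))) / (np.mean(amps) + 1e-12)
    return float(jitter), float(shimmer)


def mfcc_with_jitter_shimmer(y, sr, n_mfcc=13):
    """Return an (n_mfcc + 2) x T feature matrix: baseline MFCCs with
    utterance-level jitter and shimmer appended to every frame."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    j, s = jitter_shimmer(y, sr)
    extra = np.tile(np.array([[j], [s]]), (1, mfcc.shape[1]))
    return np.vstack([mfcc, extra])

The resulting per-frame feature matrix could then be fed to an HMM classifier with GMM state distributions, one model per speaking style or arousal level, as in the paper.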

Document Rights and Citation of Original

Accepted version. Published as part of the proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007: ICASSP; Honolulu, Hawaii, April 15-20, 2007, 1081-1084. DOI. © 2007 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.
