Grant Title
Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations
Document Type
Conference Proceeding
Publication Date
4-2007
Source Publication
IEEE International Conference on Acoustics, Speech and Signal Processing, 2007: ICASSP; Honolulu, Hawaii, April 15-20, 2007
Source ISBN
1-4244-0727-3
Source ISSN
1520-6149
Abstract
In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency cepstral coefficients (MFCCs) for human speech and Greenwood function cepstral coefficients (GFCCs) for animal vocalizations. Hidden Markov models (HMMs) with Gaussian mixture model (GMM) state distributions are used for classification. The appended jitter and shimmer features result in an increase in classification accuracy for several illustrative datasets, including the SUSAS dataset for human speaking styles as well as vocalizations labeled by arousal level for African elephant and Rhesus monkey species.
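Jitter and shimmer are standard voice-quality measures of cycle-to-cycle variation in pitch period and amplitude, respectively. As a minimal sketch, the commonly used "local" variants can be computed from a sequence of pitch periods and peak amplitudes as follows (an illustrative assumption; the paper's exact extraction from the fundamental frequency contour may differ):

```python
def local_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    normalized by the mean period (the 'local' jitter definition)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """Same measure applied to consecutive peak amplitudes."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))

# Example: a perfectly periodic voice has zero jitter;
# alternating periods of 9 ms and 11 ms give jitter of 0.2 (20%).
print(local_jitter([0.010, 0.010, 0.010]))
print(local_jitter([0.009, 0.011, 0.009, 0.011]))
```

In the paper's setup these scalar measures are appended per frame to the baseline MFCC or GFCC feature vectors before HMM/GMM classification.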
Recommended Citation
Li, Xi; Tao, Jidong; Johnson, Michael T.; Soltis, Joseph; Savage, Anne; Leong, Kirsten; and Newman, John D., "Stress and Emotion Classification Using Jitter and Shimmer Features" (2007). Dr. Dolittle Project: A Framework for Classification and Understanding of Animal Vocalizations. 9.
https://epublications.marquette.edu/data_drdolittle/9
Document Rights and Citation of Original
Accepted version. Published as part of the proceedings of the conference, IEEE International Conference on Acoustics, Speech and Signal Processing, 2007: ICASSP; Honolulu, Hawaii, April 15-20, 2007, 1081-1084. DOI. © 2007 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.