Automatic Song-Type Classification and Speaker Identification of Norwegian Ortolan Bunting Emberiza Hortulana Vocalizations
Document Type
Conference Proceeding
Language
eng
Format of Original
6 p.
Publication Date
9-2005
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Source Publication
2005 IEEE Workshop on Machine Learning for Signal Processing
Source ISSN
1551-2541
Original Item ID
DOI: 10.1109/MLSP.2005.1532913
Abstract
This paper presents an approach to song-type classification and speaker identification of Norwegian Ortolan Bunting (Emberiza Hortulana) vocalizations using traditional human speech processing methods. Hidden Markov models (HMMs) are used for both tasks, with features including mel-frequency cepstral coefficients (MFCCs), log energy, and delta (velocity) and delta-delta (acceleration) coefficients. Vocalizations were tested using leave-one-out cross-validation. Classification accuracy for 5 song-types is 92.4%, dropping to 63.6% as the number and similarity of the songs increase. Song-type dependent speaker identification rates peak at 98.7%, with typical accuracies of 80-95% and a low end at 76.2% as the number of speakers increases. These experiments fit into a larger framework of research working towards methods for acoustic censusing of endangered species populations and more automated bioacoustic analysis methods.
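The delta and delta-delta features and the leave-one-out protocol named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the regression window, the edge handling, or the number of coefficients, so the window of 2 frames, the edge-frame replication, and the function names `delta`, `add_dynamics`, and `leave_one_out` are all assumptions made for illustration. The delta formula is the standard regression form commonly used in speech front ends.

```python
def delta(frames, window=2):
    """Regression-based delta coefficients for a list of feature frames.

    frames: list of equal-length lists (one list of coefficients per frame).
    window: number of frames on each side (assumed value, not from the paper).
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                # Replicate the first/last frame at the edges (assumption).
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamics(frames, window=2):
    """Append delta (velocity) and delta-delta (acceleration) to each frame."""
    velocity = delta(frames, window)
    acceleration = delta(velocity, window)
    return [s + v + a for s, v, a in zip(frames, velocity, acceleration)]


def leave_one_out(items):
    """Yield (training_items, held_out_item) pairs, one per item."""
    for i in range(len(items)):
        yield items[:i] + items[i + 1:], items[i]
```

In a setup like the one described, each held-out vocalization would be scored against models trained on the remaining ones; the model training itself (the HMMs) is outside the scope of this sketch.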
Recommended Citation
Trawicki, Marek B.; Johnson, Michael T.; and Osiejuk, Tomasz S., "Automatic Song-Type Classification and Speaker Identification of Norwegian Ortolan Bunting Emberiza Hortulana Vocalizations" (2005). Electrical and Computer Engineering Faculty Research and Publications. 144.
https://epublications.marquette.edu/electric_fac/144
Comments
Published as part of the proceedings of the 2005 IEEE Workshop on Machine Learning for Signal Processing (2005): 277-282. DOI: 10.1109/MLSP.2005.1532913.