Document Type

Conference Proceeding

Language

eng

Format of Original

5 p.

Publication Date

2012

Publisher

Institute of Electrical and Electronics Engineers

Source Publication

Proceedings of the Third International Conference on Audio, Language, and Image Processing

Source ISBN

978-1-4673-0173-2

Original Item ID

doi: 10.1109/ICALIP.2012.6376788

Abstract

This paper describes a unique cross-phoneme speaker identification experiment that uses deliberately mismatched phoneme sets for training and testing. The underlying goal is to identify features that represent broad, individually unique characteristics rather than phonetic differences, the latter being more typical of the features used in modern speaker identification and verification systems. A wide range of features is proposed and evaluated within this context using a Gaussian Mixture Model framework. The results show that log-area ratio has better phonetic independence than MFCCs and that residual phase carries substantial speaker information, and they identify several other features that are also useful for speaker identification independent of phonetic content.

Comments

Accepted version. Published as part of the proceedings of the conference, Third International Conference on Audio, Language, and Image Processing, 2012: 1141-1145. DOI: 10.1109/ICALIP.2012.6376788. © 2012 IEEE. Used with permission.
