Electrical and Computer Engineering Faculty Research and Publications

Homogenous Ensemble Phonotactic Language Recognition Based on SVM Supervector Reconstruction

Wei-Wei Liu, Tsinghua University
Wei-Qiang Zhang, Tsinghua University
Michael T. Johnson, Marquette University
Jia Liu, Tsinghua University

Document Type

Article

Language

eng

Publication Date

12-2014

Publisher

Springer

Source Publication

EURASIP Journal on Audio, Speech, and Music Processing

Source ISSN

1687-4722

Original Item ID

doi: 10.1186/s13636-014-0042-5

Abstract

Currently, acoustic and phonotactic spoken language recognition (SLR) systems are widely used for language recognition. To achieve better performance, researchers combine multiple subsystems, often with results much better than those of a single SLR system. Phonotactic SLR subsystems may vary in their acoustic feature vectors or include multiple language-specific phone recognizers and different acoustic models. These methods achieve good performance but usually at high computational cost. In this paper, a new diversification for phonotactic language recognition systems is proposed using vector space models by support vector machine (SVM) supervector reconstruction (SSR). In this architecture, the subsystems share the same feature extraction, decoding, and N-gram counting preprocessing steps, but model in different vector spaces by using the SSR algorithm, without significant additional computation. We term this a homogeneous ensemble phonotactic language recognition (HEPLR) system. The system integrates three different SVM supervector reconstruction algorithms: relative SVM supervector reconstruction, functional SVM supervector reconstruction, and perturbing SVM supervector reconstruction. All of the algorithms are combined using a linear discriminant analysis-maximum mutual information (LDA-MMI) backend to improve language recognition evaluation (LRE) accuracy. Evaluated on the National Institute of Standards and Technology (NIST) LRE 2009 task, the proposed HEPLR system achieves better performance than a baseline phone recognition-vector space modeling (PR-VSM) system with minimal extra computational cost. The HEPLR system yields equal error rates (EER) of 1.39%, 3.63%, and 14.79% for the 30-, 10-, and 3-s test conditions, respectively, representing relative improvements of 6.06%, 10.15%, and 10.53% over the baseline system.
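
The relationship between the reported EERs and the relative improvements can be checked with a short calculation. The sketch below is not from the paper: it assumes the usual definition of relative improvement, (baseline - new) / baseline, and inverts it to estimate the baseline PR-VSM EER that each pair of figures implies. The function name `implied_baseline_eer` is hypothetical, and the paper's own results tables remain the authority for the actual baseline values.

```python
# Figures quoted in the abstract: (HEPLR EER in %, relative improvement in %)
# for each test duration. Nothing else here comes from the paper.
reported = {
    "30 s": (1.39, 6.06),
    "10 s": (3.63, 10.15),
    "3 s": (14.79, 10.53),
}


def implied_baseline_eer(eer, rel_improvement_pct):
    """Invert rel = (baseline - eer) / baseline to estimate the baseline EER.

    Assumes the standard definition of relative improvement; this is an
    assumption, since the abstract does not state the formula.
    """
    return eer / (1.0 - rel_improvement_pct / 100.0)


# Estimated baseline EER (%) per test condition, derived only from the abstract.
implied = {
    cond: implied_baseline_eer(eer, rel) for cond, (eer, rel) in reported.items()
}

for cond, base in implied.items():
    print(f"{cond}: implied baseline EER of about {base:.2f}%")
```

Because the abstract rounds both quantities to two decimals, the estimates are only approximate.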

Comments

Published version. EURASIP Journal on Audio, Speech, and Music Processing, Vol. 42 (December 2014). DOI. © 2014 SpringerOpen. This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0.

Recommended Citation

Liu, Wei-Wei; Zhang, Wei-Qiang; Johnson, Michael T.; and Liu, Jia, "Homogenous Ensemble Phonotactic Language Recognition Based on SVM Supervector Reconstruction" (2014). Electrical and Computer Engineering Faculty Research and Publications. 83.
https://epublications.marquette.edu/electric_fac/83

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons
