Electrical and Computer Engineering Faculty Research and Publications

Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model

Wen-Lin Zhang, Zhengzhou Information Science and Technology Institute
Wei-Qiang Zhang, Tsinghua University
Bi-Cheng Li, Zhengzhou Information Science and Technology Institute
Dan Qu, Zhengzhou Information Science and Technology Institute
Michael T. Johnson, Marquette University

Document Type

Article

Language

eng

Format of Original

14 p.

Publication Date

9-2012

Publisher

Institute of Electrical and Electronics Engineers

Source Publication

IEEE Transactions on Audio, Speech, and Language Processing

Source ISSN

1558-7916

Original Item ID

doi: 10.1109/TASL.2012.2193390

Abstract

In this paper, a new hierarchical Bayesian speaker adaptation method called HMAP is proposed. It combines the advantages of three conventional algorithms, maximum a posteriori (MAP), maximum-likelihood linear regression (MLLR), and eigenvoice, resulting in excellent performance across a wide range of adaptation conditions. The new method efficiently utilizes intra-speaker and inter-speaker correlation information by modeling phone and speaker subspaces in a consistent hierarchical Bayesian way. The phone variations for a specific speaker are assumed to lie in a low-dimensional subspace. The phone coordinate, which is shared among different speakers, implicitly contains the intra-speaker correlation information. For a specific speaker, the phone variations, represented by speaker-dependent eigenphones, are concatenated into a supervector. The eigenphone supervector space is also a low-dimensional speaker subspace, which contains inter-speaker correlation information. Using principal component analysis (PCA), a new hierarchical probabilistic model for the generation of the speech observations is obtained. Speaker adaptation based on the new hierarchical model is derived using the maximum a posteriori criterion in a top-down manner. Both batch adaptation and online adaptation schemes are proposed. With tuned parameters, the new method can handle varying amounts of adaptation data automatically and efficiently. Experimental results on a Mandarin Chinese continuous speech recognition task show good performance under all testing conditions.
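For readers unfamiliar with the MAP criterion named in the abstract, the following is a minimal sketch of the conventional MAP update for a single Gaussian mean, not the HMAP method itself. It assumes every adaptation frame is already assigned to that Gaussian, and the function name `map_adapt_mean` and the prior weight `tau` are hypothetical choices made for illustration. It shows how a MAP estimate stays close to the prior when little adaptation data is available and approaches the data average as more frames arrive.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """Conventional MAP update of one Gaussian mean (illustrative, not HMAP).

    prior_mean: speaker-independent mean, a list of floats
    frames:     adaptation frames assumed to belong to this Gaussian
    tau:        hypothetical prior weight; larger values trust the prior more
    """
    n = len(frames)
    dim = len(prior_mean)
    # Per-dimension sum of the adaptation frames.
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    # Weighted combination of the prior mean and the observed data.
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


prior = [0.0, 0.0]
speaker_frames = [[1.0, 2.0]] * 1000  # synthetic data, for illustration only

few = map_adapt_mean(prior, speaker_frames[:2])   # little adaptation data
many = map_adapt_mean(prior, speaker_frames)      # plenty of adaptation data
print(few, many)
```

With two frames the estimate remains near the prior mean, while with a thousand frames it moves close to the data average of the synthetic speaker.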

Comments

Accepted version. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 7 (September 2012): 2002-2015. DOI: 10.1109/TASL.2012.2193390. © 2012 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.

Recommended Citation

Zhang, Wen-Lin; Zhang, Wei-Qiang; Li, Bi-Cheng; Qu, Dan; and Johnson, Michael T., "Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model" (2012). Electrical and Computer Engineering Faculty Research and Publications. 51.
https://epublications.marquette.edu/electric_fac/51

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons
