Electrical and Computer Engineering Faculty Research and Publications

Exploiting Contextual Information for Prosodic Event Detection Using Auto-Context

Junhong Zhao, University of the Chinese Academy of Sciences
Wei-Qiang Zhang, Tsinghua University
Hua Yang, Tsinghua University
Michael T. Johnson, Marquette University
Jia Liu, Tsinghua University
Shanhong Xia, University of the Chinese Academy of Sciences

Document Type

Article

Language

eng

Publication Date

12-2013

Publisher

Springer

Source Publication

EURASIP Journal on Audio, Speech, and Music Processing

Source ISSN

1687-4722

Original Item ID

doi: 10.1186/1687-4722-2013-30

Abstract

Prosody and prosodic boundaries carry significant linguistic and paralinguistic information and are important aspects of speech. In the field of prosodic event detection, many local acoustic features have been investigated; however, contextual information has not yet been thoroughly exploited. The main difficulty lies in learning long-distance contextual dependencies effectively and efficiently. To address this problem, we introduce the use of an algorithm called auto-context. In this algorithm, a classifier is first trained on a set of local acoustic features, after which the generated probabilities are used along with the local features as contextual information to train new classifiers. By iteratively using updated probabilities as the contextual information, the algorithm can accurately model contextual dependencies and improve classification ability. The advantages of this method include its flexible structure and its ability to capture contextual relationships. When using the auto-context algorithm based on a support vector machine, in combination with the acoustic context, we improve the detection accuracy by about 3% and the F-score by more than 7% on both two-way and four-way pitch accent detection. For boundary detection, the accuracy improvement is about 1% and the F-score improvement reaches 12%. The new algorithm outperforms conditional random fields, especially on boundary detection in terms of F-score. It also outperforms an n-gram language model on the task of pitch accent detection.
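
The iterative procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names a support vector machine as the base classifier, whereas this sketch swaps in a small logistic regression written in plain Python so that it runs without external libraries. The function names, the context window `radius`, the padding value 0.5, the treatment of the data as one continuous sequence, and the toy data are all assumptions made for illustration.

```python
import math
import random


def predict_proba(w, x):
    """Probability of the positive class under logistic regression (bias is w[-1])."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit binary logistic regression by batch gradient descent.

    Stand-in for the SVM named in the abstract (an assumption of this sketch).
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, v in enumerate(xi):
                grad[j] += err * v
            grad[-1] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def context_features(probs, radius):
    """For each position, collect the probabilities at offsets -radius..+radius.

    Positions outside the sequence are padded with 0.5 (uninformative).
    """
    n = len(probs)
    rows = []
    for i in range(n):
        row = []
        for off in range(-radius, radius + 1):
            j = i + off
            row.append(probs[j] if 0 <= j < n else 0.5)
        rows.append(row)
    return rows


def auto_context(X, y, iterations=3, radius=2):
    """Train a first classifier on local features, then `iterations` further
    classifiers on local features plus neighbouring probabilities."""
    w = train_logreg(X, y)                       # step 1: local features only
    probs = [predict_proba(w, x) for x in X]
    models = [w]
    for _ in range(iterations):
        ctx = context_features(probs, radius)    # probabilities as context
        Z = [x + c for x, c in zip(X, ctx)]      # local + contextual features
        w = train_logreg(Z, y)
        probs = [predict_proba(w, z) for z in Z] # updated probabilities
        models.append(w)
    return models, probs


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: labels come in runs of five, with one noisy local feature.
    y = [1 if (i // 5) % 2 else 0 for i in range(60)]
    X = [[yi + rng.gauss(0.0, 1.0)] for yi in y]
    models, probs = auto_context(X, y, iterations=3, radius=2)
    acc = sum((p >= 0.5) == bool(yi) for p, yi in zip(probs, y)) / len(y)
    print(f"training accuracy after auto-context: {acc:.2f}")
```

At prediction time the stored models would be applied in the same order, each one consuming the probabilities produced by the previous one. A realistic setup would also build context windows per utterance rather than across the whole data set, and would evaluate on held-out data.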

Recommended Citation

Zhao, Junhong; Zhang, Wei-Qiang; Yang, Hua; Johnson, Michael T.; Liu, Jia; and Xia, Shanhong, "Exploiting Contextual Information for Prosodic Event Detection Using Auto-Context" (2013). Electrical and Computer Engineering Faculty Research and Publications. 58.
https://epublications.marquette.edu/electric_fac/58

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons