Document Type

Conference Proceeding

Language

eng

Format of Original

5 p.

Publication Date

9-2014

Publisher

International Speech Communication Association

Source Publication

INTERSPEECH 2014 15th Annual Conference of the International Speech Communication Association

Source ISSN

1990-9770

Abstract

The selection of effective articulatory features is an important component of tasks such as acoustic-to-articulator inversion and articulatory synthesis. Although it is common to use direct articulatory sensor measurements as feature variables, this approach fails to incorporate important physiological information such as palate height and shape, and thus is not as representative of the vocal tract cross section as desired. We introduce a set of articulator feature variables that are palate-referenced and normalized with respect to the articulatory working space in order to improve the quality of the vocal tract representation. These features include normalized horizontal positions plus the normalized palatal height of two midsagittal tongue sensors and one lateral tongue sensor, as well as normalized lip separation and lip protrusion. The quality of the feature representation is evaluated subjectively by comparing the variances and vowel separation in the working space, and quantitatively through measurement of acoustic-to-articulator inversion error. Results indicate that the palate-referenced features have reduced variance, increased separation between vowel spaces, and substantially lower inversion error than direct sensor measures.

Comments

Published version. Published as part of the proceedings of the conference, INTERSPEECH 2014 15th Annual Conference of the International Speech Communication Association, 2014: 721-725. © 2014 International Speech Communication Association. Used with permission.
