Document Type
Article
Language
eng
Format of Original
4 p.
Publication Date
5-2005
Publisher
Institute of Electrical and Electronics Engineers
Source Publication
IEEE Signal Processing Letters
Source ISSN
1070-9908
Original Item ID
doi: 10.1109/LSP.2005.845598
Abstract
The ability of a standard hidden Markov model (HMM) or expanded state HMM (ESHMM) to accurately model duration distributions of phonemes is compared with that of specific duration-focused approaches such as semi-Markov models or variable transition probabilities. It is demonstrated that either a three-state ESHMM or a standard HMM with an increased number of states is capable of closely matching both Gamma distributions and duration distributions of phonemes from the TIMIT corpus, as measured by Bhattacharyya distance to the true distributions. Standard HMMs are easily implemented with off-the-shelf tools, whereas duration models require substantial algorithmic development and have higher computational costs when implemented, suggesting that a simple adjustment to HMM topologies may be a more efficient solution to the problem of duration than more complex approaches.
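The comparison described in the abstract can be illustrated with a minimal sketch. It is not the article's experimental setup: it assumes a left-to-right chain of states that all share one self-loop probability (so the total duration follows a negative binomial distribution), picks that probability by matching the mean of the target, and uses a hypothetical Gamma target with shape 4 and scale 2.5 frames. The function names and the truncation at 200 frames are also illustrative choices.

```python
import math

def chain_duration_pmf(n_states, self_loop, max_d):
    """PMF over d = 1..max_d of the total time spent in a left-to-right
    chain of n_states states sharing one self-loop probability.
    The sum of n geometric stays is negative binomial."""
    exit_p = 1.0 - self_loop
    pmf = []
    for d in range(1, max_d + 1):
        if d < n_states:
            pmf.append(0.0)  # each state must be occupied at least once
        else:
            ways = math.comb(d - 1, n_states - 1)
            pmf.append(ways * exit_p ** n_states * self_loop ** (d - n_states))
    return pmf

def discretized_gamma_pmf(shape, scale, max_d):
    """Gamma density sampled at d = 1..max_d and renormalized to sum to 1."""
    weights = [
        math.exp((shape - 1.0) * math.log(d) - d / scale
                 - math.lgamma(shape) - shape * math.log(scale))
        for d in range(1, max_d + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def bhattacharyya_distance(p, q):
    """-ln of the Bhattacharyya coefficient sum_d sqrt(p[d] * q[d])."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient)

# Hypothetical target: Gamma with mean shape * scale = 10 frames.
SHAPE, SCALE, MAX_D = 4.0, 2.5, 200
target = discretized_gamma_pmf(SHAPE, SCALE, MAX_D)
target_mean = SHAPE * SCALE

distances = {}
for n in (1, 2, 3, 4, 5):
    # Mean of the chain duration is n / (1 - a); solve for a to match the target mean.
    self_loop = 1.0 - n / target_mean
    model = chain_duration_pmf(n, self_loop, MAX_D)
    distances[n] = bhattacharyya_distance(model, target)
    print(f"states={n}  self_loop={self_loop:.2f}  distance={distances[n]:.4f}")
```

With a single state the duration distribution is geometric, so its mode is always at one frame, while this Gamma target peaks well away from one frame. Adding states moves the mode outward, which is why the distance for the multi-state chains is expected to be smaller than for the single-state case under these assumptions.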
Recommended Citation
Johnson, Michael T., "Capacity and Complexity of HMM Duration Modeling Techniques" (2005). Electrical and Computer Engineering Faculty Research and Publications. 41.
https://epublications.marquette.edu/electric_fac/41
Comments
Accepted version. IEEE Signal Processing Letters, Vol. 12, No. 5 (May 2005): 407-410. DOI: 10.1109/LSP.2005.845598. © 2005 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.