Document Type
Article
Language
eng
Format of Original
5 p.
Publication Date
8-2009
Publisher
Institute of Electrical and Electronics Engineers
Source Publication
IEEE Transactions on Consumer Electronics
Source ISSN
0098-3063
Original Item ID
doi: 10.1109/TCE.2009.5278018
Abstract
Automatic speech recognition (ASR) for a very large vocabulary of isolated words is a difficult task on a resource-limited embedded device. This paper presents a novel fast decoding algorithm for a Mandarin speech recognition system that can simultaneously process hundreds of thousands of items while maintaining high recognition accuracy. The proposed algorithm constructs a semi-tree search network based on Mandarin pronunciation rules to avoid duplicate syllable matching and reduce redundant memory use. Building on a two-stage fixed-width beam-search baseline system, the algorithm employs a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce recognition time. The algorithm targets an in-car navigation system in China and was simulated on a standard PC workstation. The experimental results show that, compared to the baseline system, the proposed method reduces recognition time nearly 6-fold and memory size nearly 2-fold while causing less than 1% accuracy degradation on a 200,000-word recognition task.
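The idea of sharing syllable prefixes across vocabulary entries, together with beam pruning, can be illustrated with a minimal sketch. This is not the paper's semi-tree network or its variable beam-width strategy: it is a plain prefix tree and a fixed top-k prune, and the names `Node`, `build_prefix_tree`, `count_nodes`, `prune_beam`, and the toy lexicon are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One syllable position in a shared-prefix search network."""
    children: dict = field(default_factory=dict)  # syllable -> Node
    words: list = field(default_factory=list)     # entries that end at this node


def build_prefix_tree(lexicon):
    """Build a prefix tree from a mapping of entry name -> syllable sequence."""
    root = Node()
    for word, syllables in lexicon.items():
        node = root
        for syl in syllables:
            # Entries with a common syllable prefix reuse the same nodes.
            node = node.children.setdefault(syl, Node())
        node.words.append(word)
    return root


def count_nodes(node):
    """Count syllable nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def prune_beam(hypotheses, beam_width):
    """Keep the `beam_width` highest-scoring (score, state) hypotheses."""
    return sorted(hypotheses, key=lambda h: h[0], reverse=True)[:beam_width]


# Hypothetical toy lexicon: pinyin syllables without tones (assumption).
LEXICON = {
    "Beijing": ["bei", "jing"],
    "Beihai": ["bei", "hai"],
    "Nanjing": ["nan", "jing"],
}

tree = build_prefix_tree(LEXICON)
flat_syllables = sum(len(s) for s in LEXICON.values())  # one model per syllable per entry
shared_syllables = count_nodes(tree)                     # syllable nodes after prefix sharing
```

In this toy case the shared network holds fewer syllable nodes than the flat list because "Beijing" and "Beihai" share the node for "bei". A variable beam width, as the abstract describes, would correspond to changing the `beam_width` argument from frame to frame; the rule for choosing it is not given here and is left out of the sketch.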
Recommended Citation
Qian, Yanmin; Liu, Jia; and Johnson, Michael T., "Efficient Embedded Speech Recognition for Very Large Vocabulary Mandarin Car-Navigation Systems" (2009). Electrical and Computer Engineering Faculty Research and Publications. 52.
https://epublications.marquette.edu/electric_fac/52
Comments
Accepted version. IEEE Transactions on Consumer Electronics, Vol. 55, No. 3 (August 2009): 1496-1500. DOI: 10.1109/TCE.2009.5278018. © 2009 Institute of Electrical and Electronics Engineers (IEEE). Used with permission.