Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/4534

Utterance verification of Chinese long and short keywords

Authors Lam, Kwok Leung
Issue Date 2000
Summary In Mandarin speech recognition, initial-final subword units are commonly used. According to the Frequency Dictionary of Modern Chinese[4], among the top 9000 most frequent words, 26.7% are unigrams, 69.8% are bigrams, 2.7% are trigrams, 0.0007% 4-grams, and 0.0002% 5-grams. Another study[19] showed that, in general, 75% of Chinese words are bigrams, 14% are trigrams, and 6% are n-grams with n > 3. Each character is monosyllabic. With initial-final segmentation, each Chinese word therefore consists of only two to six units. This is short compared with English words, which contain about seven phonemes on average. For this reason, utterance verification performs worse for Chinese keywords than for English ones, particularly for short Chinese utterances. In this thesis, we propose three methods that improve overall performance for both long and short Chinese keyword utterances. To improve confidence scoring for keyword verification, we propose a state-independent Log Likelihood Ratio (LLR) that discriminates between the scores of correct recognitions and misrecognitions. The state-independent LLR gives a 13% improvement at a 10% false rejection rate. Moreover, to set the optimal rejection threshold, we propose a dynamic threshold setting method in which each keyword has its own threshold. This method improves the false acceptance rate by up to 10%. Initial-final HMMs are popular for Chinese speech recognition. However, since most Chinese keywords are very short, keyword recognition accuracy suffers because each keyword contains only a few initials and finals. We propose using higher-resolution subword units for HMM-based Chinese keyword verification: in addition to the initial unit, each final is split into two segments. With the false rejection rate fixed at 25%, this gives a 10% error reduction over the baseline system for short utterances.
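The summary does not give the formula for the state-independent LLR or the rule used to set each keyword's threshold. As a rough illustration only, the general idea of LLR-based verification with per-keyword thresholds can be sketched as below. The function names, the frame-averaged form of the ratio, the alternative ("anti") model scores, and the default threshold are all assumptions made for this sketch and are not taken from the thesis.

```python
from typing import Dict, Sequence


def average_llr(target_loglik: Sequence[float],
                anti_loglik: Sequence[float]) -> float:
    """Frame-averaged log likelihood ratio (a generic form, assumed here).

    target_loglik: per-frame log likelihoods under the keyword model.
    anti_loglik:   per-frame log likelihoods under an alternative model.
    """
    if len(target_loglik) != len(anti_loglik) or not target_loglik:
        raise ValueError("need two non-empty sequences of equal length")
    # Sum the per-frame log ratios, then normalise by the number of frames
    # so that long and short utterances are scored on the same scale.
    total = sum(t - a for t, a in zip(target_loglik, anti_loglik))
    return total / len(target_loglik)


def accept(keyword: str, score: float,
           thresholds: Dict[str, float], default: float = 0.0) -> bool:
    """Accept a hypothesis if its score reaches that keyword's threshold.

    thresholds maps each keyword to its own rejection threshold
    (hypothetical values); unseen keywords fall back to `default`.
    """
    return score >= thresholds.get(keyword, default)
```

A hypothesis would then be scored with `average_llr` and passed to `accept` together with a table of per-keyword thresholds. Raising one keyword's threshold trades more false rejections for fewer false acceptances for that keyword alone.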
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2000
Language English
Format Thesis