Please use this identifier to cite or link to this item:
Title: Sub-phonetic polynomial segment model for large vocabulary continuous speech recognition
Authors: Au Yeung, Siu-Kei
Li, Chak Fai
Siu, Man Hung
Keywords: Segment model
Speech recognition
Issue Date: Mar-2005
Abstract: The Polynomial Segment Model (PSM) has opened up an alternative research direction for acoustic modeling. In our previous papers \cite{ref:Fai1,ref:Fai2}, we proposed efficient incremental likelihood evaluation and EM training algorithms for PSM, making it possible to train and recognize using PSM alone. In this paper, we shift our focus to making PSM feasible for large vocabulary recognition. First, we used a sub-phonetic PSM that represents a phoneme as multiple independent segmental units. Second, we derived and compared different top-down mixture growing approaches that are orders of magnitude more efficient than previously proposed agglomerative clustering techniques. Experimental results show that top-down clustering performs better than the bottom-up approach. Recognition via N-best re-scoring shows that PSM models outperformed HMMs by 7% to 19% on the 5k closed-vocabulary Wall Street Journal Nov 92 test set. Our best PSM model achieves a 7.15% WER, compared with 7.81% for a 16-mixture HMM.
Appears in Collections: ECE Preprints

Files in This Item:

File: psmlvcsr.pdf
Size: 67Kb
Format: Adobe PDF

All items in this Repository are protected by copyright, with all rights reserved.