Title: High-density discrete HMM with the use of scalar quantization indexing
Authors: Mak, Brian Kan-Wing
Au Yeung, Siu-Kei
Lai, Yiu-Pong
Siu, Man Hung
Keywords: Automatic speech recognition
Discrete hidden Markov model
Scalar quantization indexing
Speech corpora
Issue Date: Sep-2005
Citation: Proceedings of 9th European Conference on Speech Communication and Technology, Interspeech 2005-Eurospeech, Lisbon, Portugal, 4-8 September 2005, p. 2121-2124
Abstract: With advances in semiconductor memory and the availability of very large speech corpora (hundreds to thousands of hours of speech), we revisit the use of the discrete hidden Markov model (DHMM) in automatic speech recognition. To estimate the discrete density of a DHMM state, the acoustic space is divided into bins, and one simply counts the relative number of observations falling into each bin. With a very large speech corpus, we believe that the number of bins can be greatly increased to obtain a much higher density than before; we call the new model the high-density discrete hidden Markov model (HDDHMM). Our HDDHMM differs from the traditional DHMM in two respects: first, the codebook has a size in the thousands or even tens of thousands; second, we propose a method based on scalar quantization indexing so that, for a d-dimensional acoustic vector, the discrete codeword can be determined in O(d) time. During recognition, the state probability computation is reduced to an O(1) table look-up. The new HDDHMM was tested on WSJ0 with a 5K vocabulary. Compared with a baseline 4-stream continuous-density HMM system, which has a WER of 9.71%, a 4-stream HDDHMM system converted from it achieves a WER of 11.60%, with no distance or Gaussian computation.
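Illustration: The abstract does not spell out the indexing scheme, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each dimension has its own small scalar quantizer and that the per-dimension levels are combined into a single integer codeword. The names `sq_index`, `estimate_state_density`, and `boundaries`, along with all numeric values, are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def sq_index(vector, boundaries):
    """Map a d-dimensional vector to a single integer codeword.

    Assumption: boundaries[i] is a sorted list of cut points for dimension i.
    Each dimension is quantized independently, and the levels are combined
    as digits of a mixed-radix number. With a small fixed number of levels
    per dimension, the total work grows linearly with d.
    """
    index = 0
    for x, cuts in zip(vector, boundaries):
        level = bisect_right(cuts, x)           # scalar level in 0..len(cuts)
        index = index * (len(cuts) + 1) + level
    return index


def estimate_state_density(observations, boundaries):
    """Relative frequency of each codeword among one state's observations."""
    counts = Counter(sq_index(o, boundaries) for o in observations)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


# Toy 2-dimensional example: 4 levels x 2 levels = 8 possible codewords.
boundaries = [[-1.0, 0.0, 1.0], [0.0]]
train = [(-1.5, -0.2), (0.3, 0.4), (0.5, 0.9), (2.0, -1.0)]
density = estimate_state_density(train, boundaries)

# At recognition time the state probability is a single table look-up.
prob = density.get(sq_index((0.4, 0.1), boundaries), 0.0)
```

In this toy setup the state probability comes from a dictionary look-up on the codeword, with no distance or Gaussian evaluation. A real system would also need to handle unseen codewords (for example by smoothing), which this sketch simply maps to zero.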
Appears in Collections: CSE Conference Papers

Files in This Item:

File: interspeech2005hddhmm.pdf
Description: pre-published version
Size: 82Kb
Format: Adobe PDF

All items in this Repository are protected by copyright, with all rights reserved.