Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/2626
Title: A comparative study of two kernel eigenspace-based speaker adaptation methods on large vocabulary continuous speech recognition
Authors: Hsiao, Roger
Mak, Brian Kan-Wing
Keywords: Kernel eigenspace-based speaker adaptation
Speech recognition
Large vocabularies
Issue Date: Sep-2005
Citation: Proceedings 9th European Conference on Speech Communication and Technology, Interspeech 2005-Eurospeech, Lisbon, Portugal, 4-8 September 2005, p. 1797-1800
Abstract: Eigenvoice (EV) speaker adaptation has been shown effective for fast speaker adaptation when the amount of adaptation data is scarce. In the past two years, we have been investigating the application of kernel methods to improve EV speaker adaptation by exploiting possible nonlinearity in the speaker space, and two methods were proposed: embedded kernel eigenvoice (eKEV) and kernel eigenspace-based MLLR (KEMLLR). In both methods, kernel PCA is used to derive eigenvoices in the kernel-induced high-dimensional feature space, and they differ mainly in the representation of the speaker models. Both had been shown to outperform all other common adaptation methods when the amount of adaptation data is less than 10s. However, in the past, only small-vocabulary speech recognition tasks were tried since we were not familiar with the behaviour of these kernelized methods. As we gain more experience, we are now ready to tackle larger vocabularies. In this paper, we show that both methods continue to outperform MAP and MLLR when only 5s or 10s of adaptation data are available on the WSJ0 5K-vocabulary task. Compared with the speaker-independent model, the two methods reduce the recognition word error rate by 13.4%–21.1%.
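Note: The methods in the abstract build on classical linear eigenvoice adaptation, in which a new speaker's model is constrained to a low-dimensional space spanned by eigenvoices obtained via PCA over training-speaker supervectors; eKEV and KEMLLR replace this PCA with kernel PCA. The following is a minimal sketch of that linear baseline only, not of the paper's kernel methods: the toy data, array shapes, and the least-squares weight estimate are illustrative assumptions (the paper estimates eigenvoice weights by maximum likelihood from the adaptation data).

# Minimal sketch of linear eigenvoice (EV) adaptation (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

# Training-speaker supervectors: N speakers, each a stacked vector of
# Gaussian mean parameters of dimension D (toy sizes).
N, D, K = 50, 200, 10                       # speakers, supervector dim, eigenvoices

speaker_supervectors = rng.normal(size=(N, D))

# Derive eigenvoices by PCA on the training-speaker supervectors.
mean_sv = speaker_supervectors.mean(axis=0)
centered = speaker_supervectors - mean_sv
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenvoices = vt[:K]                        # (K, D): top-K eigenvoices

# A rough supervector estimate for a new speaker from a few seconds of
# adaptation data (assumed given here).
new_speaker_estimate = rng.normal(size=D)

# Estimate eigenvoice weights; a least-squares projection stands in for the
# maximum-likelihood estimation used in practice.
w, *_ = np.linalg.lstsq(eigenvoices.T, new_speaker_estimate - mean_sv, rcond=None)

# Adapted speaker model, constrained to the low-dimensional eigenspace.
adapted_supervector = mean_sv + eigenvoices.T @ w
print(adapted_supervector.shape, w.shape)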
URI: http://hdl.handle.net/1783.1/2626
Appears in Collections: CSE Conference Papers

Files in This Item:

File: interspeech2005kadapt.pdf
Description: pre-published version
Size: 85Kb
Format: Adobe PDF

All items in this Repository are protected by copyright, with all rights reserved.