Kernel eigenspace-based MLLR adaptation

Authors Hsiao, Roger Wend Huu
Issue Date 2004
Summary Kernel methods have been applied to improve the performance of existing eigenvoice-based adaptation methods. Several such methods, including kernel eigenvoice adaptation (KEV) and embedded kernel eigenvoice adaptation (eKEV), have reported promising results. The basic idea of these kernel-based adaptation methods is to exploit possible nonlinearity among different speakers. In this thesis, a variant of eigenspace-based maximum likelihood linear regression (EMLLR) adaptation, named kernel eigenspace-based MLLR adaptation (KEMLLR), is proposed. It uses kernel methods to map the MLLR transformation matrices to a kernel-induced high-dimensional feature space, and kernel principal component analysis (KPCA) is used to derive a set of kernel eigenmatrices in that feature space. A new speaker is then represented by a linear combination of the leading kernel eigenmatrices, and the eigenmatrix weights are optimized using the BFGS quasi-Newton algorithm. In the Resource Management adaptation task using 10-mixture monophone hidden Markov models, KEMLLR gives encouraging results in rapid speaker adaptation. When only 5s of adaptation speech is available, EMLLR reduces the word error rate (WER) by 7.82%, and KEMLLR reduces the WER by 11.4%. When 10s of adaptation speech is provided, MLLR using full transformation (MLLR-F) becomes effective and matches the performance of KEMLLR. EMLLR, eKEV and MLLR using diagonal transformation (MLLR-D) do not perform as well as KEMLLR in the experiments. The time complexities of various adaptation methods, including EV, MLLR, EMLLR, KEV, eKEV, and KEMLLR, are also analyzed under some mild assumptions. Comparisons are made, and the analysis helps in understanding the efficiency of the various methods.
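The KPCA step described in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: the Gaussian kernel, the value of `beta`, the matrix dimensions, the random stand-in data, and the function names `gaussian_kernel_matrix` and `kernel_pca` are all assumptions made for illustration. The sketch flattens each speaker's transformation matrix into a vector, builds and centres the kernel matrix, and extracts the leading eigenvectors. The likelihood-based optimization of the eigenmatrix weights with BFGS is not shown.

```python
import numpy as np

def gaussian_kernel_matrix(X, beta):
    """K[i, j] = exp(-beta * ||x_i - x_j||^2) for the rows of X (assumed kernel)."""
    sq_norms = np.sum(X * X, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-beta * np.maximum(sq_dists, 0.0))

def kernel_pca(K, num_components):
    """Leading eigenvalues and coefficient vectors of the centred kernel matrix."""
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    # Centre the data in feature space without computing the mapping explicitly.
    K_centred = K - ones @ K - K @ ones + ones @ K @ ones
    eigvals, eigvecs = np.linalg.eigh(K_centred)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Rescale so each feature-space eigenvector has unit norm (alpha^T K alpha = 1).
    alphas = eigvecs / np.sqrt(eigvals)
    return eigvals, alphas, K_centred

# Hypothetical stand-in data: one small transformation matrix per training speaker.
rng = np.random.default_rng(0)
num_speakers, rows, cols = 20, 3, 4
transforms = rng.normal(size=(num_speakers, rows, cols))

X = transforms.reshape(num_speakers, -1)  # one flattened matrix per row
K = gaussian_kernel_matrix(X, beta=0.05)
eigvals, alphas, K_centred = kernel_pca(K, num_components=5)

# Coordinates of each training speaker along the leading kernel eigenvectors.
projections = K_centred @ alphas
```

In this sketch each column of `alphas` plays the role of one kernel eigenmatrix expressed through its expansion coefficients over the training speakers. A new speaker would then be described by a small weight vector over these components, which is why so few seconds of adaptation speech can suffice.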
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2004
Language English
Format Thesis
Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.