A new eigenvoice approach to speaker adaptation

Chih Hsien Huang*, Jen-Tzung Chien, Hsin Min Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

In this paper, we present two approaches to improving eigenvoice-based speaker adaptation. First, we present maximum a posteriori eigen-decomposition (MAPED), in which the linear combination coefficients of the eigenvector decomposition are estimated according to the MAP criterion. MAPED incorporates prior knowledge about the decomposition coefficients, modeled here by a Gaussian distribution. When little adaptation data is available, MAPED achieves better performance than maximum likelihood eigen-decomposition (MLED). Second, we exploit the adaptation of the covariance matrices of the hidden Markov model (HMM) within the eigenvoice framework. Our method uses principal component analysis (PCA) to project the speaker-specific HMM parameters onto a smaller orthogonal feature space. We then reliably calculate the HMM covariance matrices using the observations in the reduced feature space. The adapted HMM covariance matrices are obtained by transforming the covariance matrices from the reduced feature space back to the original feature space. The experimental results show that eigenvoice speaker adaptation using MAPED together with covariance adaptation can improve the performance of the original eigenvoice adaptation in Mandarin speech recognition.
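To make the MAP-versus-ML distinction concrete, the following is a minimal sketch and not the paper's exact formulation, which the abstract does not give. It assumes the adapted mean supervector is modeled as `e0 + E @ w`, that the adaptation data are summarized per supervector dimension by an occupancy count `gamma`, a sample mean `xbar`, and a variance `var`, and that the prior on the weights `w` is Gaussian. The function name `eigen_decompose`, all parameter names, and the toy data are hypothetical.

```python
import numpy as np


def eigen_decompose(E, e0, xbar, gamma, var, prior_mean=None, prior_cov=None):
    """Estimate eigenvoice weights w for the mean model mu = e0 + E @ w.

    Without a prior this is a weighted least-squares (ML-style) estimate.
    With a Gaussian prior N(prior_mean, prior_cov) on w it is the MAP estimate.
    All names here are illustrative assumptions, not the paper's notation.
    """
    precision = gamma / var                    # data precision per supervector dimension
    A = E.T @ (precision[:, None] * E)         # K x K normal-equation matrix
    b = E.T @ (precision * (xbar - e0))        # K-dimensional right-hand side
    if prior_cov is not None:
        prior_prec = np.linalg.inv(prior_cov)  # prior precision matrix
        A = A + prior_prec                     # prior regularizes the system
        b = b + prior_prec @ prior_mean        # and pulls w toward prior_mean
    return np.linalg.solve(A, b)


# Hypothetical toy setup: D-dimensional supervector, K eigenvoices.
rng = np.random.default_rng(0)
D, K = 12, 3
E, _ = np.linalg.qr(rng.standard_normal((D, K)))  # orthonormal eigenvoice basis
e0 = rng.standard_normal(D)                       # mean supervector
w_true = np.array([1.0, -0.5, 0.25])
var = np.ones(D)                                  # unit variances for simplicity
gamma = np.full(D, 2.0)                           # very little adaptation data
xbar = e0 + E @ w_true + 0.3 * rng.standard_normal(D)

w_ml = eigen_decompose(E, e0, xbar, gamma, var)
w_map = eigen_decompose(E, e0, xbar, gamma, var,
                        prior_mean=np.zeros(K), prior_cov=np.eye(K))
```

Under these assumptions the prior only adds its precision to the normal equations, so the MAP estimate is pulled toward `prior_mean` when the counts in `gamma` are small and approaches the ML-style estimate as the counts grow. This is the qualitative behavior the abstract attributes to MAPED when little adaptation data is available.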

Original language: English
Title of host publication: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings
Pages: 109-112
Number of pages: 4
DOI: https://doi.org/10.1109/CHINSL.2004.1409598
State: Published - 1 Dec 2004
Event: 2004 International Symposium on Chinese Spoken Language Processing - Hong Kong, China, Hong Kong
Duration: 15 Dec 2004 to 18 Dec 2004

Publication series

Name: 2004 International Symposium on Chinese Spoken Language Processing - Proceedings

Conference

Conference: 2004 International Symposium on Chinese Spoken Language Processing
Country: Hong Kong
City: Hong Kong, China
Period: 15/12/04 to 18/12/04

Cite this

Huang, C. H., Chien, J-T., & Wang, H. M. (2004). A new eigenvoice approach to speaker adaptation. In 2004 International Symposium on Chinese Spoken Language Processing - Proceedings (pp. 109-112). [L5.4] (2004 International Symposium on Chinese Spoken Language Processing - Proceedings). https://doi.org/10.1109/CHINSL.2004.1409598