Improved acoustics modeling for speech recognition using transformation techniques

Carrson Fung, Oscar C. Au, Wanggen Wan, Chi H. Yim, Cyan L. Keung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In statistical speech recognition, misclassification often occurs when there is a mismatch between the incoming signal and the acoustic model inside the recognizer. To combat this problem, techniques such as Cepstral Mean Subtraction, Vocal Tract Normalization, adaptation, and pronunciation modeling can be used. In this paper, we propose a new approach based on a transformation technique in which the output distribution function of the HMM, a Gaussian probability density function, is transformed to match the estimated distribution of the incoming signal by means of a memoryless invertible nonlinear function. Since the new density still has a Gaussian form, the function can be completely characterized using the Expectation Maximization (EM) algorithm.
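
As a rough illustration of how a memoryless invertible nonlinearity reshapes a Gaussian density, the sketch below applies the standard change-of-variables rule. The choice of `sinh` as the nonlinearity and the names `gaussian_pdf`, `transformed_pdf`, and `integrate` are assumptions made for this example only; the abstract does not state which function the authors use or how its parameters are estimated.

```python
import math


def gaussian_pdf(x, mean=0.0, std=1.0):
    """Density of a univariate Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def transformed_pdf(y, mean=0.0, std=1.0):
    """Density of y = sinh(x) when x ~ N(mean, std^2).

    sinh is a hypothetical stand-in for the paper's nonlinearity.
    Change of variables: p_Y(y) = p_X(g^{-1}(y)) * |d g^{-1}(y) / dy|.
    """
    x = math.asinh(y)                         # inverse of the nonlinearity
    jacobian = 1.0 / math.sqrt(1.0 + y * y)   # derivative of asinh at y
    return gaussian_pdf(x, mean, std) * jacobian


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule numerical integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


# The transformed density should still integrate to (approximately) one.
total_mass = integrate(transformed_pdf, -200.0, 200.0)
```

Because the nonlinearity is invertible, mapping an observation back through its inverse recovers a Gaussian-distributed variable, which is consistent with the abstract's remark that the new density keeps a Gaussian form.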

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
State: Published - 1 Jan 2000
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 16 Oct 2000 to 20 Oct 2000

Publication series

Name: 6th International Conference on Spoken Language Processing, ICSLP 2000

Conference

Conference: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country: China
City: Beijing
Period: 16/10/00 to 20/10/00
