Linear regression based Bayesian predictive classification for speech recognition

Jen-Tzung Chien*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

20 Scopus citations


Uncertainty in parameter estimation caused by adverse environments degrades classification performance in speech recognition. It is therefore crucial to incorporate parameter uncertainty into the decision rule so that classification robustness can be assured. In this paper, we propose a novel linear regression based Bayesian predictive classification (LRBPC) for robust speech recognition. The framework is constructed under the paradigm of linear regression adaptation of speech hidden Markov models (HMMs). Because the regression mapping between HMMs and adaptation data is ill-posed, we characterize the uncertainty of the regression parameters using a joint Gaussian distribution. A closed-form predictive distribution can then be derived to set up the LRBPC decision for speech recognition. This decision is more robust than the plug-in maximum a posteriori (MAP) decision adopted in maximum likelihood linear regression (MLLR) and MAP linear regression (MAPLR). Since the specified distribution belongs to the conjugate prior family, the evolutionary hyperparameters are established. With the statistically rich hyperparameters, the LRBPC achieves decision robustness. In the experiments, we find that the LRBPC decision, for both general linear regression and single-variable linear regression, attains significantly better recognition performance than MLLR and MAPLR adaptation.
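The contrast between a plug-in decision and a predictive decision can be illustrated with a minimal sketch. It is not the paper's LRBPC formulation: it assumes a toy one-dimensional Gaussian class model (standing in for an HMM state) whose mean `mu` is adapted as `a * mu + b`, with the regression parameters `(a, b)` following a joint Gaussian with mean `m` and covariance `V`. Under that assumption, integrating out `(a, b)` gives a closed-form Gaussian predictive density whose variance is inflated by `xi^T V xi`, where `xi = [mu, 1]`. All names and numbers below are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def plug_in_loglik(x, mu, var, m, V):
    """Plug-in score: treat the point estimate m = (a, b) as exact; V is ignored."""
    a, b = m
    return gaussian_logpdf(x, a * mu + b, var)


def predictive_loglik(x, mu, var, m, V):
    """Predictive score: integrate (a, b) ~ N(m, V) out of N(a*mu + b, var).

    With xi = [mu, 1], the result is N(xi^T m, var + xi^T V xi).
    """
    a, b = m
    extra_var = V[0][0] * mu * mu + 2.0 * V[0][1] * mu + V[1][1]
    return gaussian_logpdf(x, a * mu + b, var + extra_var)


def classify(x, classes, score):
    """Return the class name with the highest score for observation x."""
    return max(classes, key=lambda name: score(x, *classes[name]))


# Hypothetical two-class setup: (mu, var, m, V) per class.
# Class "A" has well-determined regression parameters; class "B" does not.
classes = {
    "A": (0.0, 1.0, (1.0, 0.0), [[0.0, 0.0], [0.0, 0.0]]),
    "B": (3.0, 1.0, (1.0, 0.0), [[2.0, 0.0], [0.0, 2.0]]),
}

x = 1.55
plug_in_choice = classify(x, classes, plug_in_loglik)
predictive_choice = classify(x, classes, predictive_loglik)
```

In this toy setup the two rules disagree on `x`, because the predictive score discounts the class whose regression parameters are poorly determined. When `V` is zero the predictive score reduces to the plug-in score.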

Original language: English
Pages (from-to): 70-79
Number of pages: 10
Journal: IEEE Transactions on Speech and Audio Processing
Issue number: 1
State: Published - 1 Jan 2003


  • Bayesian predictive classification
  • Conjugate prior distribution
  • Joint Gaussian distribution
  • Linear regression model
  • Speech recognition
