Group sparse hidden Markov models for speech recognition

Jen-Tzung Chien, Cheng Chun Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

This paper presents group sparse hidden Markov models (GS-HMMs), in which a sequence of acoustic features is driven by a Markov chain and each feature vector is represented by two groups of basis vectors. The group of common bases represents the features across states within an HMM. The group of individual bases compensates for the intra-state residual information. Importantly, the sparse prior for the sensing weights is controlled by the Laplacian scale mixture (LSM) distribution, which is obtained by multiplying a Laplacian variable by an inverse Gamma variable. The scale mixture parameter in the LSM makes the distribution even sparser. This parameter serves as an automatic relevance determination mechanism for selecting the relevant bases from the two groups. The weights and the two sets of bases in GS-HMMs are estimated via Bayesian learning. We apply this framework to acoustic modeling and show the robustness of GS-HMMs for speech recognition in the presence of different noise types and SNRs.
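
The LSM construction described in the abstract (a Laplacian variable multiplied by an inverse Gamma variable) can be illustrated with a small sampling sketch. This is not the authors' estimation procedure: the function names, the shape and rate values, and the thresholds used for comparison are all assumptions chosen for illustration, and the code only draws from the prior rather than learning any weights or bases.

```python
import random


def sample_laplace(rng, scale=1.0):
    """Draw one zero-mean Laplacian sample with the given scale."""
    # A Laplacian is an exponential magnitude with a random sign.
    magnitude = rng.expovariate(1.0 / scale)
    return magnitude if rng.random() < 0.5 else -magnitude


def sample_inverse_gamma(rng, shape, rate):
    """Draw one inverse Gamma sample as the reciprocal of a Gamma draw."""
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)


def sample_lsm(rng, shape=3.0, rate=2.0):
    """Draw one LSM sample: a unit Laplacian times an inverse Gamma scale.

    The shape and rate defaults are illustrative assumptions, picked so
    that the mixing variable has mean rate / (shape - 1) = 1.
    """
    return sample_laplace(rng) * sample_inverse_gamma(rng, shape, rate)


def fraction_within(samples, threshold):
    """Fraction of samples whose magnitude is below the threshold."""
    return sum(abs(x) < threshold for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n = 20000
laplace_draws = [sample_laplace(rng) for _ in range(n)]
lsm_draws = [sample_lsm(rng) for _ in range(n)]

# Compare mass near zero and mass in the tails (thresholds are arbitrary).
near_zero_laplace = fraction_within(laplace_draws, 0.1)
near_zero_lsm = fraction_within(lsm_draws, 0.1)
tail_laplace = 1.0 - fraction_within(laplace_draws, 5.0)
tail_lsm = 1.0 - fraction_within(lsm_draws, 5.0)

print("near zero: Laplace", near_zero_laplace, "LSM", near_zero_lsm)
print("in tails:  Laplace", tail_laplace, "LSM", tail_lsm)
```

Under these assumed settings the mixing variable has mean 1, so the two sets of draws are on a comparable scale. The LSM draws should put more mass both very close to zero and far out in the tails than the plain Laplacian, which is one way to read the abstract's statement that the scale mixture makes the distribution sparser.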

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 2645-2648
Number of pages: 4
State: Published - 1 Dec 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 9 Sep 2012 to 13 Sep 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 3

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: 9/09/12 to 13/09/12

Keywords

  • Bayesian learning
  • Group sparsity
  • Hidden Markov model
  • Speech recognition

  • Cite this

    Chien, J-T., & Chiang, C. C. (2012). Group sparse hidden Markov models for speech recognition. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (pp. 2645-2648). (13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Vol. 3).