Latent Dirichlet language model for speech recognition

Jen-Tzung Chien*, Chuang Hua Chueh

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Scopus citations

Abstract

Latent Dirichlet allocation (LDA) has been successfully applied to document modeling and classification. LDA computes the document probability based on a bag-of-words representation without considering the sequence of words. This model discovers the topic structure at the document level, which is different from the concern of word prediction in speech recognition. In this paper, we present a new latent Dirichlet language model (LDLM) for modeling of word sequences. A new Bayesian framework is introduced by using the Dirichlet priors to characterize the uncertainty of latent topics of n-gram events. The robust topic-based language model is established accordingly. In the experiments, we implement LDLM for continuous speech recognition and obtain better performance than the probabilistic latent semantic analysis (PLSA) based language modeling method.

Original language: English
Title of host publication: 2008 IEEE Workshop on Spoken Language Technology, SLT 2008 - Proceedings
Pages: 201-204
Number of pages: 4
State: Published - 1 Dec 2008
Event: 2008 IEEE Workshop on Spoken Language Technology, SLT 2008 - Goa, India
Duration: 15 Dec 2008 → 19 Dec 2008

Publication series

Name: 2008 IEEE Workshop on Spoken Language Technology, SLT 2008 - Proceedings

Conference

Conference: 2008 IEEE Workshop on Spoken Language Technology, SLT 2008
Country: India
City: Goa
Period: 15/12/08 → 19/12/08

Keywords

  • Bayes procedures
  • Clustering methods
  • Natural languages
  • Smoothing methods
  • Speech recognition
