Bayesian recurrent neural network language model

Jen-Tzung Chien, Yuan Chu Ku

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

This paper presents a Bayesian approach to constructing the recurrent neural network language model (RNN-LM) for speech recognition. Our idea is to regularize the RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in the Bayesian RNN (BRNN) is formed as a regularized cross-entropy error function. The regularized model is constructed not only by training the parameters according to the maximum a posteriori (MAP) criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed by selecting a small set of salient outer-products and is shown to be effective for the BRNN-LM. The BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show promising improvements when applying the BRNN-LM with different amounts of training data.
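To make the abstract's pipeline concrete, here is a minimal NumPy sketch of the three ingredients it names: the regularized cross-entropy (MAP) objective under a zero-mean Gaussian prior with precision alpha, an outer-product approximation to the Hessian, and a marginal-likelihood (evidence-framework) update of the Gaussian hyperparameter. This is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the top-k-by-gradient-norm rule is only a stand-in for the paper's "salient outer-products" selection, and the alpha update is the standard MacKay-style evidence update rather than the paper's exact procedure.

```python
import numpy as np

def map_objective(ce_loss, w, alpha):
    # Regularized cross-entropy E(w) = CE(w) + (alpha/2) * ||w||^2,
    # i.e. MAP estimation with a zero-mean Gaussian prior of precision
    # alpha on the flattened RNN-LM parameters w.
    return ce_loss + 0.5 * alpha * np.dot(w, w)

def outer_product_hessian(per_example_grads, k):
    # Outer-product (Gauss-Newton-style) approximation H ~= sum_t g_t g_t^T,
    # keeping only the k outer products with the largest gradient norms
    # (assumed stand-in for the paper's salient outer-product selection).
    norms = np.linalg.norm(per_example_grads, axis=1)
    salient = per_example_grads[np.argsort(norms)[-k:]]
    return salient.T @ salient

def update_alpha(w, hessian, alpha):
    # Evidence-framework (type-II maximum likelihood) update:
    # gamma = sum_i lambda_i / (lambda_i + alpha) is the effective number
    # of well-determined parameters; then alpha_new = gamma / ||w||^2.
    lam = np.clip(np.linalg.eigvalsh(hessian), 0.0, None)
    gamma = np.sum(lam / (lam + alpha))
    return gamma / np.dot(w, w)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, T = 50, 200                    # parameter dim, number of examples
    w = rng.normal(size=D)            # flattened RNN-LM parameters (toy)
    grads = rng.normal(size=(T, D))   # per-example cross-entropy gradients
    alpha = 1.0
    H = outer_product_hessian(grads, k=20)
    for _ in range(5):                # alternate evidence updates for alpha
        alpha = update_alpha(w, H, alpha)
    print("estimated prior precision:", alpha)
```

Restricting the Hessian to a small set of outer products keeps the approximation cheap: only k rank-one terms are accumulated instead of one per training example, which is what makes the hyperparameter update tractable for an RNN-LM.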

Original language: English
Title of host publication: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 206-211
Number of pages: 6
ISBN (Electronic): 9781479971299
State: Published - 1 Apr 2014
Event: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - South Lake Tahoe, United States
Duration: 7 Dec 2014 - 10 Dec 2014

Publication series

Name: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings

Conference

Conference: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014
Country: United States
City: South Lake Tahoe
Period: 7/12/14 - 10/12/14

Keywords

  • Bayesian learning
  • Hessian matrix
  • Language model
  • Recurrent neural network
