Markov recurrent neural networks

Che Yu Kuo, Jen-Tzung Chien

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

7 Scopus citations

Abstract

Deep learning has achieved great success in many real-world applications. For speech and language processing, recurrent neural networks are trained to characterize sequential patterns and extract temporal information through dynamic states that evolve over time and are stored as an internal memory. However, the traditional simple transition function, based only on input-to-hidden and hidden-to-hidden weights, is often insufficient. To strengthen the learning capability, it is crucial to explore the diversity of latent structure in sequential signals and to learn the stochastic trajectory of signal transitions, thereby improving sequential prediction. This paper proposes stochastic modeling of transitions in deep sequential learning. Our idea is to enhance the latent variable representation by discovering Markov state transitions in sequential data with a K-state long short-term memory (LSTM) model. Such a latent state machine is capable of learning the complicated latent semantics in highly structured and heterogeneous sequential data. The Gumbel-softmax is introduced to implement the stochastic learning procedure with discrete states. Experimental results on visual and text language modeling illustrate the merit of the proposed stochastic transitions in sequential prediction with a limited number of parameters.
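To make the mechanism concrete, the following is a minimal sketch of one step of a K-state Markov recurrent cell in PyTorch. The naming (MarkovRNNCell, transition, tau) is our own illustration of the idea in the abstract, not the authors' released implementation: K parallel LSTM cells propose candidate states, a transition network produces logits over the K states, and the Gumbel-softmax gives a (relaxed) discrete choice of which candidate to keep.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MarkovRNNCell(nn.Module):
        # One step of a hypothetical K-state Markov RNN: K candidate LSTM
        # transitions, with the active state sampled via Gumbel-softmax.
        def __init__(self, input_size, hidden_size, num_states, tau=1.0):
            super().__init__()
            self.tau = tau  # Gumbel-softmax temperature
            # One LSTM cell per latent state (K candidate transitions).
            self.cells = nn.ModuleList(
                nn.LSTMCell(input_size, hidden_size) for _ in range(num_states)
            )
            # Transition network: logits over the K states given input and h.
            self.transition = nn.Linear(input_size + hidden_size, num_states)

        def forward(self, x, hc):
            h, c = hc
            # Candidate hidden/cell states from each of the K LSTM cells.
            candidates = [cell(x, (h, c)) for cell in self.cells]
            h_cand = torch.stack([hk for hk, _ in candidates], dim=1)  # (B, K, H)
            c_cand = torch.stack([ck for _, ck in candidates], dim=1)  # (B, K, H)
            # Stochastic discrete state selection (straight-through estimator).
            logits = self.transition(torch.cat([x, h], dim=-1))        # (B, K)
            z = F.gumbel_softmax(logits, tau=self.tau, hard=True)      # (B, K)
            h_next = torch.einsum('bk,bkh->bh', z, h_cand)
            c_next = torch.einsum('bk,bkh->bh', z, c_cand)
            return h_next, c_next

At each time step the cell keeps exactly one candidate (hard=True), so the latent trajectory behaves like a Markov chain over K discrete states, while gradients still flow through the softmax relaxation during training.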

Original language: English
Title of host publication: 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings
Editors: Nelly Pustelnik, Zheng-Hua Tan, Zhanyu Ma, Jan Larsen
Publisher: IEEE Computer Society
ISBN (Electronic): 9781538654774
DOI: 10.1109/MLSP.2018.8517074
State: Published - 31 Oct 2018
Event: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Aalborg, Denmark
Duration: 17 Sep 2018 - 20 Sep 2018

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2018-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018
Country: Denmark
City: Aalborg
Period: 17/09/18 - 20/09/18

Keywords

  • Deep learning
  • Discrete latent structure
  • Recurrent neural network
  • Stochastic transition


Cite this

    Kuo, C. Y., & Chien, J-T. (2018). Markov recurrent neural networks. In N. Pustelnik, Z-H. Tan, Z. Ma, & J. Larsen (Eds.), 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings [8517074] (IEEE International Workshop on Machine Learning for Signal Processing, MLSP; Vol. 2018-September). IEEE Computer Society. https://doi.org/10.1109/MLSP.2018.8517074