Stochastic fusion for multi-stream neural network in video classification

Yu Min Huang*, Huan Hsin Tsengt, Jen-Tzung Chien

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Spatial images and optical flow provide complementary information for video representation and classification. Traditional methods encode the two streams separately and fuse them only at the end of the streams. This paper presents a new multi-stream recurrent neural network in which the streams are tightly coupled at each time step. Importantly, we propose a stochastic fusion mechanism for multiple streams of video data, based on Gumbel samples, to increase the prediction power. A stochastic backpropagation algorithm is implemented to train the multi-stream neural network with stochastic fusion through a joint optimization of the convolutional encoder and the recurrent decoder. Experiments on the UCF101 dataset illustrate the merits of the proposed stochastic fusion in recurrent neural networks in terms of interpretation and classification performance.
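
The abstract does not spell out how the Gumbel samples enter the fusion step, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a Gumbel-softmax style relaxation: at one time step, each stream gets a score (the hypothetical `logits`), Gumbel noise is added, and a temperature-scaled softmax yields random fusion weights used to mix the spatial and optical-flow feature vectors. The names `gumbel_softmax_weights`, `fuse_streams`, and `temperature` are illustrative and do not come from the paper.

```python
import numpy as np

def gumbel_softmax_weights(logits, temperature, rng):
    """Sample relaxed one-hot weights over streams (assumed Gumbel-softmax form)."""
    # Gumbel(0, 1) noise by inverse transform; bounds keep both logs finite.
    u = rng.uniform(low=1e-10, high=1.0 - 1e-10, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    scaled = (logits + gumbel) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

def fuse_streams(spatial_feat, flow_feat, logits, temperature=0.5, rng=None):
    """Mix two per-time-step feature vectors with sampled fusion weights."""
    rng = np.random.default_rng() if rng is None else rng
    weights = gumbel_softmax_weights(logits, temperature, rng)  # shape (2,)
    stacked = np.stack([spatial_feat, flow_feat], axis=0)       # shape (2, D)
    return weights @ stacked, weights                           # fused: shape (D,)

# Hypothetical usage: one time step with 4-dimensional stream features.
rng = np.random.default_rng(0)
spatial = np.ones(4)
flow = np.zeros(4)
fused, w = fuse_streams(spatial, flow, np.array([0.0, 0.0]), 0.5, rng)
```

In this relaxed form the weights are differentiable with respect to the stream scores, which is what would allow a sampled fusion step to be trained by backpropagation. A lower `temperature` pushes each sample closer to selecting a single stream.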

Original language: English
Title of host publication: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 69-74
Number of pages: 6
ISBN (Electronic): 9781728132488
DOIs: https://doi.org/10.1109/APSIPAASC47483.2019.9023327
State: Published - Nov 2019
Event: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 - Lanzhou, China
Duration: 18 Nov 2019 - 21 Nov 2019

Publication series

Name: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Conference

Conference: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Country: China
City: Lanzhou
Period: 18/11/19 - 21/11/19

Cite this

Huang, Y. M., Tsengt, H. H., & Chien, J-T. (2019). Stochastic fusion for multi-stream neural network in video classification. In 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019 (pp. 69-74). [9023327] (2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/APSIPAASC47483.2019.9023327