Multi-View and Multi-Modal Action Recognition with Learned Fusion

Sandy Ardianto, Hsueh-Ming Hang

Research output: Conference contribution › peer-review

2 Citations (Scopus)

Abstract

In this paper, we study a multi-modal and multi-view action recognition system based on deep-learning techniques. We extended the Temporal Segment Network with an additional data fusion stage to combine information from different sources. In this research, we use multiple types of information from different modalities, such as RGB, depth, and infrared data, to detect predefined human actions. We tested various combinations of these data sources to examine their impact on the final detection accuracy. We designed three information fusion methods to generate the final decision; the most interesting is our Learned Fusion Net. It turns out that the Learned Fusion structure achieves the best results but requires more training.
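The abstract does not spell out how the Learned Fusion Net combines the per-modality streams, so the following is only a minimal sketch, assuming each Temporal Segment Network stream (RGB, depth, infrared) emits a class-score vector and a small trainable network mixes them into the final decision; the class count, layer sizes, and all names here are illustrative, not taken from the paper.

import torch
import torch.nn as nn

class LearnedFusionNet(nn.Module):
    """Hypothetical learned fusion: mixes per-modality class scores into one decision."""
    def __init__(self, num_modalities=3, num_classes=60):
        super().__init__()
        # A small fully connected network learns how to weight and combine
        # the concatenated per-modality scores (sizes are illustrative).
        self.fuse = nn.Sequential(
            nn.Linear(num_modalities * num_classes, num_classes),
            nn.ReLU(),
            nn.Linear(num_classes, num_classes),
        )

    def forward(self, modality_scores):
        # modality_scores: list of (batch, num_classes) tensors, one per stream.
        return self.fuse(torch.cat(modality_scores, dim=1))

# Usage: fuse scores from three hypothetical modality streams.
rgb, depth, ir = (torch.randn(8, 60) for _ in range(3))
fused_logits = LearnedFusionNet()([rgb, depth, ir])  # shape (8, 60)

Unlike fixed averaging or max-score fusion, this learned variant introduces trainable parameters, which is consistent with the abstract's observation that the Learned Fusion structure gives the best results but requires more training.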

Original language: English
Title of host publication: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1601-1604
Number of pages: 4
ISBN (electronic): 9789881476852
DOIs
Publication status: Published - 4 March 2019
Event: 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Honolulu, United States
Duration: 12 November 2018 - 15 November 2018

Publication series

Name: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings

Conference

Conference: 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018
Country: United States
City: Honolulu
Period: 12/11/18 - 15/11/18
