Robust action recognition via borrowing information across video modalities

Nick C. Tang, Yen-Yu Lin, Ju Hsuan Hua, Shih En Wei, Ming Fang Weng, Hong Yuan Mark Liao

Research output: Contribution to journal > Article > peer-review

9 Scopus citations


Recent advances in imaging devices have opened new opportunities for video content analysis and understanding. Next-generation cameras, such as depth or binocular cameras, capture diverse information that complements conventional 2D RGB cameras, so the multimodal videos they yield generally benefit related applications. However, the limitations of these emerging cameras, such as short effective distances, high costs, or long response times, reduce their applicability and currently make them unavailable for online use in practice. In this paper, we provide an alternative scenario to address this problem and illustrate it with the task of recognizing human actions. In particular, we aim to improve the accuracy of action recognition in RGB videos with the aid of one additional RGB-D camera. Since RGB-D cameras, such as Kinect, are typically not applicable in a surveillance system because of their short effective distance, we instead collect a database offline in which not only the RGB videos but also the depth maps and the skeleton data of actions are jointly available. The proposed approach can adapt to the interdatabase variations and enable the borrowing of visual knowledge across different video modalities. Each action to be recognized in RGB representation is then augmented with the borrowed depth and skeleton features. Our approach is comprehensively evaluated on five benchmark data sets for action recognition. The promising results show that the borrowed information leads to a remarkable boost in recognition accuracy.
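The idea of augmenting an RGB representation with borrowed depth and skeleton features can be illustrated with a toy example. The sketch below is not the method of the paper: it assumes a plain k-nearest-neighbor lookup in RGB feature space and skips the adaptation to interdatabase variations that the abstract mentions. The function name `borrow_features` and all of its parameters are hypothetical, chosen only for illustration.

```python
import math


def borrow_features(query_rgb, aux_rgb, aux_extra, k=3):
    """Augment an RGB feature vector with features borrowed from an auxiliary set.

    Illustrative assumption: borrowing is done by averaging over the k nearest
    auxiliary videos; the paper's actual borrowing scheme is not reproduced here.

    query_rgb : list of floats, RGB feature of the video to recognize.
    aux_rgb   : list of RGB feature vectors from the offline auxiliary database.
    aux_extra : list of depth/skeleton feature vectors paired with aux_rgb.
    """
    if len(aux_rgb) != len(aux_extra) or not aux_rgb:
        raise ValueError("auxiliary features must be paired and non-empty")

    # Rank auxiliary videos by Euclidean distance in the shared RGB space.
    order = sorted(
        range(len(aux_rgb)),
        key=lambda i: math.dist(query_rgb, aux_rgb[i]),
    )
    nearest = order[:k]

    # Average the depth/skeleton features of the k nearest auxiliary videos.
    dim = len(aux_extra[0])
    borrowed = [
        sum(aux_extra[i][d] for i in nearest) / len(nearest)
        for d in range(dim)
    ]

    # Concatenate the original RGB feature with the borrowed feature.
    return list(query_rgb) + borrowed


# Toy data: three auxiliary videos with 2-D RGB and 1-D "depth" features.
aux_rgb = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
aux_extra = [[1.0], [3.0], [100.0]]
augmented = borrow_features([0.2, 0.1], aux_rgb, aux_extra, k=2)
print(augmented)  # [0.2, 0.1, 2.0]
```

In the toy data, the two closest auxiliary videos contribute depth values 1.0 and 3.0, so the borrowed feature is their mean, 2.0, appended to the original RGB feature. A classifier would then be trained on such augmented vectors instead of on the RGB features alone.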

Original language: English
Article number: 6996025
Pages (from-to): 709-723
Number of pages: 15
Journal: IEEE Transactions on Image Processing
Issue number: 2
State: Published - 1 Feb 2015


  • Action recognition
  • feature borrowing
  • next-generation cameras
  • transfer learning
