Semantic high-level event recognition of videos is one of most interesting issues for multimedia searching and indexing. Since low-level features are semantically distinct from high-level events, a hierarchical video analysis framework is needed, i.e., using mid-level features to provide clear linkages between low-level audio-visual features and high-level semantics. Therefore, this paper presents a framework for video event classification using temporal context of mid-level interval-based multimodal features. In the framework, a co-occurrence symbol transformation method is proposed to explore full temporal relations among multiple modalities in probabilistic HMM event classification. The results of our experiments on baseball video event classification demonstrate the superiority of the proposed approach.
|頁（從 - 到）||285-295|
|期刊||Journal of Visual Communication and Image Representation|
|出版狀態||Published - 1 二月 2014|