Towards segmenting consumer stereo videos: Benchmark, baselines and ensembles

Wei-Chen Chiu*, Fabio Galasso, Mario Fritz

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Are we ready to segment consumer stereo videos? The amount of this type of data is rapidly increasing, and it carries rich appearance, motion and depth cues. However, the segmentation of such data is still largely unexplored. First, we therefore propose a new benchmark: videos, annotations and metrics to measure progress on this emerging challenge. Second, we evaluate several state-of-the-art segmentation methods and propose a novel ensemble method based on recent spectral theory. This combines existing image and video segmentation techniques in an efficient scheme. Finally, we propose and integrate into this model a novel regressor, learnt to optimize the stereo segmentation performance directly via a differentiable proxy. The regressor makes our segmentation ensemble adaptive to each stereo video and outperforms the individual segmentations in the ensemble as well as a recent RGB-D segmentation technique.

Original language: English
Title of host publication: Computer Vision - 13th Asian Conference on Computer Vision, ACCV 2016, Revised Selected Papers
Editors: Ko Nishino, Shang-Hong Lai, Vincent Lepetit, Yoichi Sato
Publisher: Springer Verlag
Pages: 378-395
Number of pages: 18
ISBN (Print): 9783319541921
State: Published - 1 Jan 2017
Event: 13th Asian Conference on Computer Vision, ACCV 2016 - Taipei, Taiwan
Duration: 20 Nov 2016 to 24 Nov 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10115 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th Asian Conference on Computer Vision, ACCV 2016
Country: Taiwan
City: Taipei
Period: 20/11/16 to 24/11/16
