Publication | Closed Access
Video Instance Segmentation with a Propose-Reduce Paradigm
Citations: 77
References: 37
Year: 2021
Engineering, Machine Learning, Redundant Sequences, Sequence Propagation Head, Video Processing, Video Summarization, Image Sequence Analysis, Image Analysis, Data Science, Pattern Recognition, Video Content Analysis, Machine Vision, Video Generation, Computer Science, Video Understanding, Computer Vision, Video Analysis, Video Instance Segmentation, Image Segmentation
Prior methods typically segment individual frames or clips first and then merge the incomplete results through tracking or matching. The study proposes a Propose-Reduce paradigm that generates complete instance sequences for a video in a single step. The method builds a sequence propagation head on an image-level instance segmentation network, proposes multiple sequences, and reduces redundant sequences of the same instance to achieve robust, high-recall long-term propagation. The approach reports state-of-the-art results: 47.6% AP on the YouTube-VIS validation set and 70.4% J&F on the DAVIS-UVOS validation set.
Video instance segmentation (VIS) aims to segment and associate all instances of predefined classes in every frame of a video. Prior methods usually obtain segmentation for a frame or clip first, and then merge the incomplete results by tracking or matching. These methods may accumulate errors in the merging step. In contrast, we propose a new paradigm, Propose-Reduce, that generates complete sequences for input videos in a single step. We further build a sequence propagation head on an existing image-level instance segmentation network for long-term propagation. To ensure robustness and high recall, our framework proposes multiple sequences and then reduces redundant sequences of the same instance. We achieve state-of-the-art performance on two representative benchmark datasets: 47.6% AP on the YouTube-VIS validation set and 70.4% J&F on the DAVIS-UVOS validation set.
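The reduce step can be illustrated with a minimal sketch. The abstract does not specify how redundant sequences are detected, so the code below assumes a greedy, sequence-level non-maximum suppression: proposals are visited in descending score order, and a proposal is dropped when its video-level mask IoU with an already kept proposal exceeds a threshold. The names `sequence_iou`, `reduce_sequences`, `box`, and the threshold value 0.5 are hypothetical and chosen only for illustration; masks are represented as sets of pixel coordinates to keep the example dependency-free.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]
# One proposed sequence: frame index -> set of foreground pixels (assumed layout).
Sequence = Dict[int, Set[Pixel]]


def sequence_iou(a: Sequence, b: Sequence) -> float:
    """Video-level IoU: intersection summed over frames / union summed over frames."""
    inter = 0
    union = 0
    for t in set(a) | set(b):
        mask_a = a.get(t, set())
        mask_b = b.get(t, set())
        inter += len(mask_a & mask_b)
        union += len(mask_a | mask_b)
    return inter / union if union else 0.0


def reduce_sequences(
    proposals: List[Sequence],
    scores: List[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[int]:
    """Greedy sequence-level NMS; returns indices of the kept proposals."""
    order = sorted(range(len(proposals)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a proposal only if it does not heavily overlap any kept one.
        if all(sequence_iou(proposals[i], proposals[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


def box(y0: int, x0: int, y1: int, x1: int) -> Set[Pixel]:
    """Helper for the toy example: a filled rectangle as a pixel set."""
    return {(y, x) for y in range(y0, y1) for x in range(x0, x1)}


# Toy example: seq_a and seq_b cover the same instance, seq_c a different one.
seq_a = {0: box(0, 0, 4, 4), 1: box(0, 1, 4, 5)}
seq_b = {0: box(0, 0, 4, 4), 1: box(0, 1, 4, 4)}
seq_c = {0: box(10, 10, 14, 14), 1: box(10, 10, 14, 14)}

kept = reduce_sequences([seq_a, seq_b, seq_c], [0.9, 0.8, 0.7])
print(kept)  # [0, 2]: seq_b is suppressed as a duplicate of seq_a
```

In this toy case the two overlapping proposals have a video-level IoU of 0.875, so only the higher-scoring one survives, while the non-overlapping proposal is kept as a separate instance.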