Publication | Closed Access
Segmentation of Moving Objects by Long Term Video Analysis
Citations: 594
References: 62
Year: 2014
Point Trajectories, Scene Analysis, Engineering, Machine Learning, Freiburg-Berkeley Motion Segmentation, Image Sequence Analysis, Image Analysis, Pattern Recognition, Strong Cue, Video Content Analysis, Machine Vision, Computer Science, Video Understanding, Computer Vision, Video Segmentation, Video Analysis, Scene Interpretation, Eye Tracking, Scene Understanding, Motion Analysis
Motion, especially in the form of long-term point trajectories spanning hundreds of frames, provides a robust cue for unsupervised object-level grouping because it is less sensitive to the short-term variations that challenge two-frame optical flow. The study exploits motion over extended time windows through a paradigm that first uses semi-dense motion cues and then fills textureless regions based on color. The authors also release the Freiburg-Berkeley motion segmentation (FBMS) dataset, comprising 59 sequences with pixel-accurate ground truth. The resulting groupings are temporally consistent across entire video shots, a property that most other methods obtain only through tedious post-processing.
Motion is a strong cue for unsupervised object-level grouping. In this paper, we demonstrate that motion is exploited most effectively when it is considered over larger time windows. In contrast to classical two-frame optical flow, point trajectories that span hundreds of frames are less susceptible to short-term variations that hinder separating different objects. As a positive side effect, the resulting groupings are temporally consistent over a whole video shot, a property that requires tedious post-processing in the vast majority of existing approaches. We suggest working with a paradigm that starts with semi-dense motion cues and afterwards fills in textureless areas based on color. This paper also contributes the Freiburg-Berkeley motion segmentation (FBMS) dataset, a large, heterogeneous benchmark with 59 sequences and pixel-accurate ground truth annotation of moving objects.
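The claim that longer time windows separate objects better can be illustrated with a toy example. The sketch below is not the paper's affinity measure; `trajectory_distance` is a hypothetical helper that simply takes the maximum per-frame velocity difference between two synthetic point trajectories. Two points that are both static in the first frames look identical to a two-frame comparison, but become distinguishable once the whole trajectory is considered.

```python
import numpy as np

def trajectory_distance(a, b):
    """Maximum per-frame velocity difference between two trajectories.

    a, b: arrays of shape (T, 2) holding (x, y) positions over the same
    T frames. This is an illustrative measure, assumed for this sketch.
    """
    va = np.diff(a, axis=0)  # frame-to-frame displacement of point a
    vb = np.diff(b, axis=0)  # frame-to-frame displacement of point b
    return float(np.max(np.linalg.norm(va - vb, axis=1)))

# Synthetic data: both points are static for 50 frames, then point b
# moves right by one pixel per frame for the remaining 50 frames.
T = 100
a = np.zeros((T, 2))
b = np.zeros((T, 2))
b[50:, 0] = np.arange(1, 51)

two_frame = trajectory_distance(a[:2], b[:2])  # only the first two frames
long_term = trajectory_distance(a, b)          # the whole 100-frame window

print(two_frame, long_term)
```

With only the first two frames, the two points have identical (zero) motion and cannot be separated; over the full window the difference in motion becomes visible.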