Publication | Closed Access
SwiftNet: Real-time Video Object Segmentation
Citations: 161
References: 28
Year: 2021
Unknown Venue
Scene Analysis, Engineering, Video Processing, Validation Dataset, Video Object Segmentation, Image Analysis, Pattern Recognition, Present Swiftnet, Video Content Analysis, Computational Imaging, Video Transformer, Machine Vision, Computer Science, Video Understanding, Deep Learning, Computer Vision, Video Segmentation, Video Analysis, Scene Understanding, Video Hallucination
SwiftNet is a real-time, semi-supervised (one-shot) video object segmentation framework intended as a strong, efficient baseline for mobile vision applications. The method compresses spatiotemporal redundancy in matching-based segmentation using Pixel-Adaptive Memory (PAM): temporally, memory is updated only on frames with significant inter-frame changes; spatially, memory update and matching are restricted to dynamic pixels. A light-aggregation encoder built on reversed sub-pixel operations keeps reference encoding efficient. On the DAVIS 2017 validation set, SwiftNet reports 77.8% J&F at 70 FPS, which the authors describe as leading the solutions available at the time in overall accuracy and speed. The source code is available at https://github.com/haochenheheda/SwiftNet.
In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% $\mathcal{J}\&\mathcal{F}$ and 70 FPS on the DAVIS 2017 validation set, leading all present solutions in overall accuracy and speed performance. We achieve this by carefully compressing spatiotemporal redundancy in matching-based VOS via Pixel-Adaptive Memory (PAM). Temporally, PAM adaptively triggers memory updates on frames where objects display noteworthy inter-frame variations. Spatially, PAM selectively performs memory update and matching on dynamic pixels while ignoring the static ones, significantly reducing redundant computation on segmentation-irrelevant pixels. To promote efficient reference encoding, SwiftNet also introduces a light-aggregation encoder that deploys reversed sub-pixel operations. We hope SwiftNet can set a strong and efficient baseline for real-time VOS and facilitate its application in mobile vision. The source code of SwiftNet can be found at https://github.com/haochenheheda/SwiftNet.
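The two forms of selectivity that the abstract attributes to PAM (a temporal trigger per frame and a spatial update per pixel) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the thresholds `pixel_thresh` and `frame_thresh`, and the use of raw intensity differences on nested lists are all assumptions made for illustration. The actual model operates on learned features, and its variation measure may differ.

```python
# Toy sketch of selective memory updates; all names and thresholds are hypothetical.

def dynamic_pixel_mask(prev_frame, curr_frame, pixel_thresh=0.1):
    """Mark pixels whose absolute intensity change exceeds pixel_thresh."""
    return [
        [abs(c - p) > pixel_thresh for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def should_update_memory(mask, frame_thresh=0.2):
    """Temporal trigger: update only if enough pixels changed in this frame."""
    total = sum(len(row) for row in mask)
    changed = sum(sum(row) for row in mask)
    return total > 0 and changed / total > frame_thresh


def update_memory(memory, features, mask):
    """Spatial selection: overwrite memory at dynamic pixels, keep static ones."""
    return [
        [f if m else old for old, f, m in zip(mem_row, feat_row, mask_row)]
        for mem_row, feat_row, mask_row in zip(memory, features, mask)
    ]


# Usage: one of four pixels changes, which exceeds the assumed 20% frame threshold.
prev = [[0.0, 0.0], [0.0, 0.0]]
curr = [[0.0, 0.9], [0.0, 0.0]]
mask = dynamic_pixel_mask(prev, curr)
memory = [[1, 1], [1, 1]]
if should_update_memory(mask):
    memory = update_memory(memory, [[5, 6], [7, 8]], mask)
```

In this sketch a static frame never triggers an update, and a triggered update touches only the pixels flagged as dynamic, which is the source of the computational savings the abstract describes.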