Publication | Closed Access
Hierarchical Memory Matching Network for Video Object Segmentation
Citations: 116
References: 34
Year: 2021
Scene Analysis, Engineering, Machine Learning, Video Retrieval, Memory Reading, Video Interpretation, Video Object Segmentation, Image Analysis, Data Science, Pattern Recognition, Video Transformer, Machine Vision, Object Detection, Computer Science, Video Understanding, Deep Learning, Computer Vision, Memory Read, Hierarchical Memory
We present the Hierarchical Memory Matching Network (HMMN) for semi-supervised video object segmentation. Building on a recent memory-based method [33], we propose two advanced memory read modules that enable memory reading at multiple scales while exploiting temporal smoothness. We first propose a kernel-guided memory matching module that replaces the non-local dense memory read commonly adopted in previous memory-based methods. The module imposes a temporal smoothness constraint on the memory read, leading to accurate memory retrieval. More importantly, we introduce a hierarchical memory matching scheme and propose a top-k guided memory matching module in which the memory read at a fine scale is guided by that at a coarse scale. With this module, we perform memory reads at multiple scales efficiently and leverage both high-level semantic and low-level fine-grained memory features to predict detailed object masks. Our network achieves state-of-the-art performance on the validation sets of DAVIS 2016/2017 (90.8% and 84.7%) and YouTube-VOS 2018/2019 (82.6% and 82.5%), and on the test-dev set of DAVIS 2017 (78.6%). The source code and model are available online: https://github.com/Hongje/HMMN.
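The coarse-to-fine idea behind top-k guided memory matching can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a 1D layout in which coarse position `c` covers fine positions `scale * c` to `scale * c + scale - 1`, uses plain dot-product affinity with a softmax, and omits the kernel-guided module entirely. All function names (`coarse_topk`, `fine_read`, and the helpers) are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topk_indices(scores, k):
    """Indices of the k largest scores (ties go to the lower index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def coarse_topk(query_coarse, mem_keys_coarse, k):
    """For each coarse query position, pick the k best-matching memory positions."""
    return [
        topk_indices([dot(q, m) for m in mem_keys_coarse], k)
        for q in query_coarse
    ]


def fine_read(query_fine, mem_keys_fine, mem_vals_fine, coarse_picks, scale=2):
    """Read fine-scale memory, restricted to children of the coarse top-k picks.

    Assumption: fine position p has coarse parent p // scale, and coarse
    memory index c expands to fine indices scale*c .. scale*c + scale - 1.
    """
    dim = len(mem_vals_fine[0])
    out = []
    for p, q in enumerate(query_fine):
        parent = p // scale
        # Candidate fine memory positions: children of the selected coarse ones.
        cands = [scale * c + o for c in coarse_picks[parent] for o in range(scale)]
        # Softmax only over the candidates, not over the whole fine memory.
        weights = softmax([dot(q, mem_keys_fine[j]) for j in cands])
        out.append([
            sum(w * mem_vals_fine[j][d] for w, j in zip(weights, cands))
            for d in range(dim)
        ])
    return out


if __name__ == "__main__":
    # Made-up toy features: 2 coarse query positions, 3 coarse memory positions.
    q_coarse = [[1.0, 0.0], [0.0, 1.0]]
    m_coarse = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
    picks = coarse_topk(q_coarse, m_coarse, k=1)

    q_fine = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    mk_fine = [[2.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
    mv_fine = [[1.0 if i == j else 0.0 for i in range(6)] for j in range(6)]
    print(picks)
    print(fine_read(q_fine, mk_fine, mv_fine, picks))
```

In this sketch each fine query position compares against only `k * scale` memory candidates rather than every fine memory position, which is the source of the efficiency the abstract refers to; the real network operates on 2D feature maps across multiple memory frames.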