Publication | Closed Access
Siamese Network for RGB-D Salient Object Detection and Beyond
Citations: 229
References: 130
Year: 2021
Siamese Network, Engineering, Machine Learning, Feature Extraction, Image Analysis, Data Science, Pattern Recognition, Fusion Learning, Video Transformer, Vision Recognition, Machine Vision, Feature Learning, Object Detection, Computer Science, Deep Learning, Feature Fusion, DCF Module, Computer Vision, Scene Understanding
Existing RGB-D salient object detection (SOD) models usually treat RGB and depth as independent information and design separate networks to extract features from each. Such schemes are easily constrained by limited training data or become over-reliant on an elaborately designed training process. Inspired by the observation that the RGB and depth modalities share certain commonality in distinguishing salient objects, we design a novel joint learning and densely cooperative fusion (JL-DCF) architecture that learns from both RGB and depth inputs through a shared network backbone, known as a Siamese architecture. The architecture has two components: joint learning (JL) and densely cooperative fusion (DCF). The JL module provides robust saliency feature learning by exploiting cross-modal commonality via a Siamese network, while the DCF module is introduced for complementary feature discovery. Comprehensive experiments using 5 popular metrics show that the designed framework yields a robust RGB-D saliency detector with good generalization. JL-DCF significantly advances the state of the art by an average of ~2.0% (F-measure) across 7 challenging datasets. In addition, we show that JL-DCF is readily applicable to other related multi-modal detection tasks, including RGB-T SOD and video SOD, achieving comparable or better performance.
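The shared-backbone idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `SharedBackbone`, the helpers `depth_to_three_channels`, `siamese_forward`, and `fuse`, the per-pixel linear layer, the replication of depth into three channels, and the element-wise sum used as fusion are all assumptions made for illustration. The only point it tries to show is that one set of weights processes both modalities.

```python
import random


class SharedBackbone:
    """Toy per-pixel feature extractor (hypothetical stand-in for a CNN backbone).

    Maps each 3-channel pixel to n_features values with a linear layer + ReLU.
    """

    def __init__(self, n_features=4, seed=0):
        rng = random.Random(seed)
        # One weight matrix, shared by every input that passes through.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(3)]
                        for _ in range(n_features)]

    def __call__(self, image):
        # image: list of pixels, each pixel a 3-tuple of floats
        return [[max(0.0, sum(w * c for w, c in zip(row, px)))
                 for row in self.weights]
                for px in image]


def depth_to_three_channels(depth):
    # Assumption: replicate the single depth channel so it matches the RGB shape.
    return [(d, d, d) for d in depth]


def siamese_forward(backbone, rgb, depth):
    # Both modalities go through the SAME backbone object (same weights).
    batch = [rgb, depth_to_three_channels(depth)]
    rgb_feats, depth_feats = (backbone(x) for x in batch)
    return rgb_feats, depth_feats


def fuse(rgb_feats, depth_feats):
    # Placeholder fusion (element-wise sum); NOT the paper's DCF module.
    return [[a + b for a, b in zip(fr, fd)]
            for fr, fd in zip(rgb_feats, depth_feats)]
```

For example, calling `siamese_forward(SharedBackbone(), rgb, depth)` with `rgb` as a list of 3-tuples and `depth` as a list of floats of the same length returns two feature lists of identical shape, which `fuse` then combines. Because the weights are shared, a gray RGB image whose channels equal the depth values produces exactly the same features as the depth map.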