Publication | Closed Access
Meta R-CNN: Towards General Solver for Instance-Level Low-Shot Learning
Citations: 584
References: 44
Year: 2019
Unknown Venue
Few-shot Learning, Convolutional Neural Network, Engineering, Machine Learning, Meta-learning, Image Analysis, Zero-shot Learning, Pattern Recognition, Video Transformer, Machine Vision, Object Detection, Low-shot Object Detection, Computer Science, Rapid Learning Capability, Deep Learning, Computer Vision, Object Recognition, Towards General Solver, Low-shot Object Detection/segmentation
Low‑shot learning enables vision systems to learn new concepts from few samples, yet existing meta‑learning methods struggle with complex backgrounds and multi‑object images. This work proposes a flexible, general methodology to address low‑shot object detection and segmentation. The authors extend Faster/Mask R‑CNN by applying meta‑learning to RoI features and introducing a Predictor‑head Remodeling Network that uses class‑attentive vectors to remodel predictor heads via channel‑wise soft‑attention. Meta R‑CNN achieves state‑of‑the‑art performance on low‑shot object detection and improves low‑shot segmentation compared to Mask R‑CNN. Code is available at https://yanxp.github.io/metarcnn.html.
Resembling the rapid learning capability of humans, low-shot learning empowers vision systems to understand new concepts by training with few samples. Leading approaches are derived from meta-learning on images that contain a single visual object. Confronted with a complex background and multiple objects in one image, they struggle to advance research on low-shot object detection/segmentation. In this work, we present a flexible and general methodology to achieve these tasks. Our work extends Faster/Mask R-CNN by proposing meta-learning over RoI (Region-of-Interest) features instead of a full image feature. This simple idea disentangles multi-object information merged with the background, without bells and whistles, enabling Faster/Mask R-CNN to turn into a meta-learner that achieves the tasks. Specifically, we introduce a Predictor-head Remodeling Network (PRN) that shares its main backbone with Faster/Mask R-CNN. PRN receives images containing low-shot objects with their bounding boxes or masks to infer their class-attentive vectors. The vectors apply channel-wise soft-attention to RoI features, remodeling the R-CNN predictor heads to detect or segment the objects consistent with the classes these vectors represent. In our experiments, Meta R-CNN yields the new state of the art in low-shot object detection and improves low-shot object segmentation over Mask R-CNN. Code: https://yanxp.github.io/metarcnn.html.
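The channel-wise soft-attention step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`class_attentive_vector`, `remodel_roi_features`), the averaging of support embeddings, and the sigmoid squashing are all assumptions made for illustration; the abstract only states that class-attentive vectors apply channel-wise soft-attention to RoI features.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def class_attentive_vector(support_embeddings):
    """Build one attention weight per channel for a single class.

    Assumption: the per-object embeddings of a class are averaged and
    then squashed with a sigmoid, so each weight lies in (0, 1).
    """
    n = len(support_embeddings)
    channels = len(support_embeddings[0])
    means = [sum(e[c] for e in support_embeddings) / n for c in range(channels)]
    return [sigmoid(m) for m in means]


def remodel_roi_features(roi_features, attentive_vector):
    """Channel-wise soft-attention: scale channel c of every RoI
    feature by attentive_vector[c]."""
    return [[f * a for f, a in zip(roi, attentive_vector)] for roi in roi_features]


# Hypothetical toy data: two support objects and two RoIs, two channels each.
support = [[1.0, -1.0], [-1.0, 1.0]]
rois = [[2.0, 4.0], [6.0, 8.0]]

attention = class_attentive_vector(support)
attended = remodel_roi_features(rois, attention)
```

In this sketch the attended features would then be passed to the detection or segmentation head once per candidate class, so that each class-attentive vector conditions the shared head on its own class.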