Publication | Open Access
APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation
Citations: 45
References: 34
Year: 2022
Few-shot Learning, Engineering, Machine Learning, Metric Learning Framework, Natural Language Processing, Segment Novel-class Objects, Image Analysis, Zero-shot Learning, Data Science, Pattern Recognition, Semantic Segmentation, Machine Vision, Feature Learning, Vision Language Model, Few-shot Semantic Segmentation, Computer Science, Deep Learning, Computer Vision, Scene Interpretation, Scene Understanding
Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images. Most advanced solutions exploit a metric learning framework that performs segmentation by matching each query feature to a learned class-specific prototype. However, this framework suffers from biased classification due to incomplete feature comparisons. To address this issue, we present an adaptive prototype representation that introduces class-specific and class-agnostic prototypes, and thus construct complete sample pairs for learning semantic alignment with query features. This complementary feature learning scheme effectively enriches feature comparison and helps yield an unbiased segmentation model in the few-shot setting. It is implemented with a two-branch end-to-end network (i.e., a class-specific branch and a class-agnostic branch), which generates prototypes and then combines them with query features to perform comparisons. In addition, the proposed class-agnostic branch is simple yet effective. In practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner. Extensive experiments on PASCAL-5$^{i}$ and COCO-20$^{i}$ demonstrate the superiority of our method. Without sacrificing inference efficiency, our model achieves state-of-the-art results in both 1-shot and 5-shot settings for semantic segmentation.
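The prototype-matching idea that the abstract builds on can be illustrated with a minimal sketch. This is not the APANet implementation. It assumes masked average pooling for building prototypes and cosine similarity for matching, both common choices in few-shot segmentation that the abstract does not confirm. The background prototype is a simple stand-in computed from the support mask, not the paper's adaptive class-agnostic branch. All function names and the toy features are hypothetical.

```python
import math

def masked_average_pooling(features, mask):
    """Average the feature vectors at positions where mask == 1.

    features: list of C-dimensional vectors, one per spatial position.
    mask: list of 0/1 labels with the same length as features.
    """
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no positions")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two vectors (eps avoids division by zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def segment_query(query_features, prototypes):
    """Label each query position with the index of its most similar prototype."""
    return [
        max(range(len(prototypes)),
            key=lambda k: cosine_similarity(q, prototypes[k]))
        for q in query_features
    ]

# Toy support image flattened to 4 positions with 2-D features (assumed values).
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]  # 1 = target class, 0 = background

# Class-specific prototype from the masked support features.
fg_prototype = masked_average_pooling(support_features, support_mask)
# Stand-in background prototype (assumption, not the paper's class-agnostic branch).
bg_prototype = masked_average_pooling(
    support_features, [1 - m for m in support_mask]
)

# Toy query image flattened to 2 positions.
query_features = [[0.8, 0.2], [0.2, 0.8]]
prediction = segment_query(query_features, [bg_prototype, fg_prototype])
print(prediction)  # index 1 = target class, index 0 = background
```

Matching against a single class-specific prototype alone would leave the background without any explicit representation to compare against. Including at least one non-target prototype gives each query feature something to be compared with on both sides, which is the kind of more complete comparison the abstract argues for.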