PANet: Few-Shot Image Semantic Segmentation With Prototype Alignment

TLDR

Despite advances in deep CNNs for image semantic segmentation, these models require many densely-annotated images and generalize poorly to unseen categories, which motivates few-shot segmentation. This paper addresses few-shot segmentation from a metric-learning perspective by proposing PANet, a prototype alignment network that makes better use of support-set information. PANet learns class-specific prototypes from a few support images in an embedding space and segments query images by matching each pixel to these prototypes. Non-parametric metric learning yields prototypes that are representative of each class and discriminative between classes, and a prototype alignment regularization between support and query improves generalization. PANet achieves 48.1% mIoU on PASCAL-5i in the 1-shot setting and 55.7% in the 5-shot setting, surpassing the state-of-the-art method by 1.8% and 8.6% respectively.

Abstract

Despite the great progress made by deep CNNs in image semantic segmentation, they typically require a large number of densely-annotated images for training and generalize poorly to unseen object categories. Few-shot segmentation has thus been developed to learn to perform segmentation from only a few annotated examples. In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network that better utilizes the information in the support set. PANet learns class-specific prototype representations from a few support images within an embedding space and then segments the query images by matching each pixel to the learned prototypes. With non-parametric metric learning, PANet offers high-quality prototypes that are representative of each semantic class and discriminative between different classes. Moreover, PANet introduces a prototype alignment regularization between support and query. With this, PANet fully exploits knowledge from the support set and generalizes better on few-shot segmentation. Notably, our model achieves mIoU scores of 48.1% and 55.7% on PASCAL-5i in the 1-shot and 5-shot settings respectively, surpassing the state-of-the-art method by 1.8% and 8.6%.
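
The pipeline described in the abstract (build prototypes from the support set, label query pixels by their nearest prototype, then check the reverse direction from query to support) can be illustrated with a minimal NumPy sketch. Several details here are assumptions that the abstract does not state: prototypes are computed by masked average pooling, similarity is cosine similarity, the alignment step is shown as a plain re-segmentation of the support image rather than a training loss, and the feature maps are hand-made toy arrays instead of CNN embeddings. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-8  # avoids division by zero for empty masks or zero vectors


def masked_average_prototype(features, mask):
    """Average the (C, H, W) features over the pixels where mask is true."""
    mask = mask.astype(features.dtype)
    return (features * mask[None]).sum(axis=(1, 2)) / (mask.sum() + EPS)


def prototypes_from_labels(features, label_map, num_classes):
    """One prototype per class (assumption: class 0 is background)."""
    return [masked_average_prototype(features, label_map == c)
            for c in range(num_classes)]


def cosine_similarity_map(features, prototype):
    """Cosine similarity between every pixel feature and one prototype."""
    dots = np.einsum("chw,c->hw", features, prototype)
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype)
    return dots / (norms + EPS)


def segment(features, prototypes):
    """Assign each pixel the index of its most similar prototype."""
    sims = np.stack([cosine_similarity_map(features, p) for p in prototypes])
    return sims.argmax(axis=0)


def toy_features(mask):
    """Toy embedding: channel 0 fires on foreground, channel 1 on background."""
    return np.stack([mask, 1 - mask]).astype(float)


# One support image with its mask, and one query image (1-way, 1-shot).
support_mask = np.array([[0, 0, 0, 0],
                         [0, 1, 1, 0],
                         [0, 1, 1, 0],
                         [0, 0, 0, 0]])
query_mask = np.array([[1, 1, 0, 0],
                       [1, 1, 0, 0],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0]])
support_feats = toy_features(support_mask)
query_feats = toy_features(query_mask)

# Support -> query: prototypes from the annotated support image.
support_protos = prototypes_from_labels(support_feats, support_mask, 2)
pred_query = segment(query_feats, support_protos)

# Query -> support: prototypes from the *predicted* query mask are used to
# segment the support image, whose ground truth is known.
query_protos = prototypes_from_labels(query_feats, pred_query, 2)
pred_support = segment(support_feats, query_protos)

print((pred_query == query_mask).all(), (pred_support == support_mask).all())
```

In this sketch the reverse pass only checks that query-derived prototypes recover the support mask; a trainable version would presumably turn that disagreement into a loss term, which is the role the abstract assigns to the alignment regularization.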
