Publication | Closed Access
SIGN: Spatial-information Incorporated Generative Network for Generalized Zero-shot Semantic Segmentation
Citations: 43
References: 47
Year: 2021
Few-shot Learning, Image Analysis, Machine Learning, Machine Vision, Spatial Information, Pattern Recognition, Relative Positional Encoding, Self-supervised Learning, Engineering, Scene Interpretation, Scene Understanding, Zero-shot Learning, Deep Learning, Zero-shot Semantic Segmentation, Semi-supervised Learning, Scene Modeling, Image Segmentation, Computer Vision
Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction with surrounding context motivates us to incorporate spatial information using positional encoding. We improve standard positional encoding by introducing the concept of Relative Positional Encoding, which integrates spatial information at the feature level and can handle arbitrary image sizes. Furthermore, while self-training is widely used in zero-shot semantic segmentation to generate pseudo-labels, we propose a new knowledge-distillation-inspired self-training strategy, namely Annealed Self-Training, which can automatically assign different importance to pseudo-labels to improve performance. We systematically study the proposed Relative Positional Encoding and Annealed Self-Training in a comprehensive experimental evaluation, and our empirical results confirm the effectiveness of our method on three benchmark datasets.
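The abstract notes that the proposed Relative Positional Encoding injects spatial information at the feature level and handles arbitrary image sizes. As a rough intuition for how feature-level, size-agnostic positional encoding can work (this is only an illustrative sketch, not the paper's actual formulation), one can append coordinate channels normalized to a fixed range, so the encoding stays valid regardless of the input resolution:

```python
import numpy as np

def add_normalized_coords(features):
    """Append normalized (y, x) coordinate channels to a feature map.

    Illustrative sketch only: the paper's Relative Positional Encoding
    is a more elaborate mechanism. Here, coordinates are normalized to
    [-1, 1], so the encoding adapts to any H x W.

    features: array of shape (H, W, C)
    returns:  array of shape (H, W, C + 2)
    """
    h, w, _ = features.shape
    ys = np.linspace(-1.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)
    xs = np.linspace(-1.0, 1.0, w).reshape(1, w).repeat(h, axis=0)
    coords = np.stack([ys, xs], axis=-1)            # (H, W, 2)
    return np.concatenate([features, coords], axis=-1)

feats = np.zeros((4, 6, 8))
out = add_normalized_coords(feats)
print(out.shape)  # (4, 6, 10)
```

Because the coordinate channels are resolution-independent, the same function applies unchanged to feature maps of any size, which is the property the abstract highlights for its encoding.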