Concepedia

Abstract

Earth observation data has huge potential to enrich our knowledge about our planet. An important step in many Earth observation tasks is semantic segmentation. Generally, a large number of pixelwise labeled images are required to train deep models for supervised semantic segmentation. However, strong inter-sensor and geographic variations impede the availability of annotated training data in Earth observation. In practice, most Earth observation tasks use only the target scene, without assuming the availability of any additional scene, labeled or unlabeled. Keeping such constraints in mind, we propose a semantic segmentation method that learns to segment a single scene without using any annotation. Earth observation scenes are generally larger than those encountered in typical computer vision datasets. Exploiting this, the proposed method samples smaller unlabeled patches from the scene. For each patch, an alternate view is generated by simple transformations, e.g., the addition of noise. Both views are then processed through a two-stream network, and the weights are iteratively refined using deep clustering, spatial consistency, and contrastive learning in the pixel space. The proposed model automatically segregates the major classes present in the scene and produces the segmentation map. Extensive experiments on four Earth observation datasets collected by different sensors show the effectiveness of the proposed method. The implementation is available at https://gitlab.lrz.de/ai4eo/cd/-/tree/main/unsupContrastiveSemanticSeg.
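The pipeline described above — sampling unlabeled patches from one large scene, generating a second view by adding noise, and contrasting pixel embeddings of the two views — can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: the function names, the Gaussian-noise transform, and the stand-in `embed` projection are assumptions, and the InfoNCE-style pixel loss is one common form of pixel-space contrastive learning, which the abstract does not specify in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_patches(scene, patch_size, n_patches, rng):
    """Sample random square patches from a large scene of shape (H, W, C)."""
    H, W, _ = scene.shape
    patches = []
    for _ in range(n_patches):
        y = rng.integers(0, H - patch_size + 1)
        x = rng.integers(0, W - patch_size + 1)
        patches.append(scene[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)

def make_second_view(patches, rng, noise_std=0.05):
    """Alternate view via additive Gaussian noise (one simple transformation)."""
    return patches + rng.normal(0.0, noise_std, size=patches.shape)

def pixel_contrastive_loss(f1, f2, temperature=0.1):
    """InfoNCE-style loss over pixel embeddings of the two views.
    f1, f2: (N, D) L2-normalised pixel features; pixel i of view 1 is
    positive with pixel i of view 2, all other pixels act as negatives."""
    logits = f1 @ f2.T / temperature              # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # positives on the diagonal

def embed(patch):
    """Stand-in for the two-stream network: flatten pixels, L2-normalise."""
    flat = patch.reshape(-1, patch.shape[-1])
    return flat / np.linalg.norm(flat, axis=1, keepdims=True)

# Toy run on a random "scene"; a real scene would be a multispectral image.
scene = rng.random((256, 256, 3)).astype(np.float32)
patches = sample_patches(scene, patch_size=32, n_patches=4, rng=rng)
views = make_second_view(patches, rng)
loss = pixel_contrastive_loss(embed(patches[0]), embed(views[0]))
```

In the actual method the embeddings come from the trained two-stream network rather than raw pixels, and this contrastive term is combined with deep-clustering and spatial-consistency objectives during the iterative refinement.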
