Concepedia

Publication | Closed Access

Distilling Segmenters From CNNs and Transformers for Remote Sensing Images’ Semantic Segmentation

Citations: 36

References: 49

Year: 2023

Abstract

Semantic segmentation is a crucial task in remote sensing and has been predominantly performed using convolutional neural networks (CNNs) for the past decade. Recently, transformers with self-attention mechanisms have demonstrated superior performance compared to CNNs. However, because of the locality of CNNs and the high computational complexity and massive data requirements of transformers, neither can be applied well in resource-constrained practical remote sensing scenarios. To address the limitations of using either model alone for semantic segmentation of remote sensing images, this paper proposes a novel cross-model knowledge distillation framework, named distilling segmenters from CNNs and transformers (DSCT), that harnesses the complementary advantages of both. The framework uses a channel-weighted attention-guided feature distillation (CAFD) module to condense the features from the teacher model and strengthen the student model’s focus on the regions the teacher attends to. In addition, a target-nontarget knowledge distillation (TNKD) module decouples logit distillation into target and nontarget knowledge distillation, guiding the student model to learn the underlying representations and decision boundaries from the teacher model. By learning complementary knowledge from the teacher, the proposed DSCT framework improves the student’s segmentation performance without adding trainable parameters. Experiments on four available remote sensing datasets (ISPRS Potsdam, Vaihingen, GID, and LoveDA) indicate that DSCT outperforms state-of-the-art knowledge distillation methods, demonstrating its effectiveness and robustness.
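The abstract does not give the TNKD equations, so the following is only a rough illustration of what decoupling logit distillation into a target part and a nontarget part can look like for a single pixel. It follows the generic decoupled knowledge distillation idea (a binary target-vs-rest term plus a term over the renormalised nontarget classes); treating TNKD as similar to this is an assumption, and the names `decoupled_kd`, `alpha`, `beta`, and `temperature` are hypothetical, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def decoupled_kd(teacher_logits, student_logits, target,
                 alpha=1.0, beta=1.0, temperature=1.0, eps=1e-12):
    """Hypothetical per-pixel loss: alpha * target term + beta * nontarget term.

    `target` is the index of the ground-truth class for this pixel.
    This is an illustrative assumption, not the paper's TNKD definition.
    """
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities

    # Target term: compare the binary split "target class vs. everything else".
    target_term = kl_divergence([p[target], 1.0 - p[target]],
                                [q[target], 1.0 - q[target]])

    # Nontarget term: drop the target class and renormalise the remainder.
    p_rest_mass = max(1.0 - p[target], eps)
    q_rest_mass = max(1.0 - q[target], eps)
    p_rest = [pi / p_rest_mass for i, pi in enumerate(p) if i != target]
    q_rest = [qi / q_rest_mass for i, qi in enumerate(q) if i != target]
    nontarget_term = kl_divergence(p_rest, q_rest)

    loss = alpha * target_term + beta * nontarget_term
    return loss, target_term, nontarget_term


# Example with made-up logits for a 4-class pixel whose true class is 0.
teacher = [2.0, 0.5, -1.0, 0.1]
student = [1.0, 0.8, -0.5, 0.0]
loss, target_term, nontarget_term = decoupled_kd(teacher, student, target=0)
```

In this formulation the ordinary distillation loss KL(teacher || student) equals the target term plus the nontarget term scaled by the teacher's probability mass outside the target class. Separating the two terms lets them be weighted independently, so the nontarget term is no longer suppressed when the teacher is very confident. A segmentation loss would average this quantity over all pixels.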
