Publication | Closed Access
AAFormer: Attention-Attended Transformer for Semantic Segmentation of Remote Sensing Images
Citations: 44
References: 13
Year: 2024
Remote Sensing Images, Engineering, Machine Learning, Attention Gate, Image Analysis, Visual Grounding, Data Science, Pattern Recognition, Semantic Segmentation, Video Transformer, ISPRS Potsdam, Machine Vision, Object Detection, Vision Language Model, Computer Science, Deep Learning, Computer Vision, Scene Interpretation, Scene Understanding, Remote Sensing, Image Segmentation
Rapid advances in remote sensing technology have made fine-resolution remote sensing images (RSIs) widely available, offering rich spatial detail and semantics. Transformers are applicable and scalable for semantic segmentation of RSIs because they learn pairwise contextual affinity, but in doing so they inevitably introduce irrelevant context, which hinders accurate inference of patch semantics. To address this, we propose a novel multi-head attention-attended module (AAM) that refines the multi-head self-attention mechanism. The AAM filters out irrelevant context while highlighting informative context by considering the relevance between the self-attention maps and the query vector. It generates an attention gate that complements the contextual affinity and, at the same time, assigns a higher weight to useful context. Using the multi-head AAM as the core unit, we construct a lightweight attention-attended transformer block (ATB). We then devise AAFormer, a pure transformer with a mask transformer decoder, for semantic segmentation of RSIs. We extensively evaluate our approach on the ISPRS Potsdam and LoveDA datasets, where it demonstrates compelling performance compared with mainstream methods. Additionally, we conduct evaluations to analyze the effects of the AAM.
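The abstract does not give the AAM equations, so the following is only a rough single-head sketch of the general idea of gating self-attention output by its relevance to the query. It assumes a sigmoid gate computed from the concatenation of the attended context and the query, which is one common gating design and not necessarily the paper's formulation. All names (`gated_self_attention`, `w_gate`, and the helper functions) are hypothetical, and the weights are random rather than learned.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def gated_self_attention(x, w_q, w_k, w_v, w_gate):
    """Single-head self-attention followed by a query-conditioned gate.

    ASSUMPTION: the gate form below is illustrative, not taken from the paper.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])

    # Standard scaled dot-product attention: pairwise contextual affinity.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    context = matmul(attn, v)

    # Gate in (0, 1) computed from [context ; query] for each patch.
    joint = [c_row + q_row for c_row, q_row in zip(context, q)]
    gate = [[sigmoid(g) for g in row] for row in matmul(joint, w_gate)]

    # Element-wise gating scales down context judged less relevant to the query.
    out = [[g * c for g, c in zip(g_row, c_row)] for g_row, c_row in zip(gate, context)]
    return attn, gate, out


rng = random.Random(0)
n_patches, dim = 4, 3
x = random_matrix(n_patches, dim, rng)
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))
w_gate = random_matrix(2 * dim, dim, rng)
attn, gate, out = gated_self_attention(x, w_q, w_k, w_v, w_gate)
```

In this sketch the gate only rescales the attended context channel by channel; a multi-head version would apply the same steps per head and concatenate the results.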