
Publication | Open Access

Attention mechanisms in computer vision: A survey

Citations: 2.2K

References: 175

Year: 2022

TLDR

Attention mechanisms in computer vision mimic the human ability to locate salient regions by dynamically weighting image features. This survey reviews and categorizes attention mechanisms in computer vision and proposes future research directions. The authors systematically review attention mechanisms, grouping them into channel, spatial, temporal, and branch categories, and provide a repository of related work. Attention mechanisms have achieved significant success across a wide range of visual tasks, from classification to multi‑modal learning.

Abstract

Humans can naturally and effectively find salient regions in complex scenes. Motivated by this observation, attention mechanisms were introduced into computer vision with the aim of imitating this aspect of the human visual system. Such an attention mechanism can be regarded as a dynamic weight adjustment process based on features of the input image. Attention mechanisms have achieved great success in many visual tasks, including image classification, object detection, semantic segmentation, video understanding, image generation, 3D vision, multi-modal tasks and self-supervised learning. In this survey, we provide a comprehensive review of various attention mechanisms in computer vision and categorize them according to approach, such as channel attention, spatial attention, temporal attention and branch attention; a related repository https://github.com/MenghaoGuo/Awesome-Vision-Attentions is dedicated to collecting related work. We also suggest future directions for attention mechanism research.
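To make the "dynamic weight adjustment" idea concrete, here is a minimal sketch of channel attention, one of the categories named in the abstract. It is an illustration only and is not taken from the survey or its repository. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy input are all hypothetical, and the pooling, ReLU and sigmoid steps are one common choice assumed here for the example.

```python
import math


def channel_attention(features, w1, w2):
    """Reweight each channel of `features` by a weight computed from the input.

    features: list of C channels, each an H x W list of lists of floats.
    w1: hidden_size x C weight matrix (hypothetical, normally learned).
    w2: C x hidden_size weight matrix (hypothetical, normally learned).
    """
    # Summarize every channel with one number (global average pooling).
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Small two-layer network: linear + ReLU, then linear + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # The weights depend on the input itself, so the scaling is dynamic.
    out = [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(features, weights)
    ]
    return out, weights


# Toy input: two 2 x 2 channels; the second has larger activations.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[4.0, 4.0], [4.0, 4.0]],
]
w1 = [[0.5, 0.5]]        # assumed values, chosen only for the demo
w2 = [[-1.0], [1.0]]     # assumed values, chosen only for the demo

out, weights = channel_attention(features, w1, w2)
print(weights)
```

With these assumed weights the second channel receives the larger attention weight, so its features are largely kept while the first channel is suppressed. Spatial, temporal and branch attention follow the same pattern but compute weights over positions, time steps or network branches instead of channels.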
