Publication | Open Access
A Comprehensive Survey of Transformers for Computer Vision
Citations: 85
References: 87
Year: 2023
Convolutional Neural Network, Engineering, Machine Learning, Vision Transformers, Image Sequence Analysis, Image Classification, Image Analysis, Pattern Recognition, Video Transformer, Geometric Modeling, Machine Vision, Comprehensive Survey, Object Detection, Image Super-resolution, Computer Engineering, Computer Science, Structure From Motion, Deep Learning, Computer Vision, Natural Sciences, Computer Stereo Vision, Various Computer Vision
As a special type of transformer, vision transformers (ViTs) can be used for various computer vision (CV) applications. Convolutional neural networks (CNNs) have several potential problems that can be resolved with ViTs. For image coding tasks such as compression, super-resolution, segmentation, and denoising, different variants of ViTs are used. In our survey, we identified the many CV applications to which ViTs are applicable. The CV applications reviewed include image classification, object detection, image segmentation, image compression, image super-resolution, image denoising, anomaly detection, and drone imagery. We reviewed the state of the art, compiled a list of available models, and discussed the pros and cons of each model.