Publication | Closed Access
DHT: Dynamic Vision Transformer Using Hybrid Window Attention for Industrial Defect Images Classification
Citations: 13
References: 21
Year: 2023
Convolutional Neural Network, Engineering, Feature Detection, Machine Learning, Image Classification, Image Analysis, Pattern Recognition, Video Transformer, Machine Vision, Feature Learning, Object Detection, Computer Engineering, Computer Science, Industrial Defect Detection, Industrial Product Quality, Deep Learning, Automated Inspection, Computer Vision, Efficient Defect Detection
Industrial defect detection is becoming increasingly important for controlling industrial product quality. Because industrial defect types are complex and variable, detecting them accurately and efficiently is an interesting but challenging problem. Vision transformers have been highly successful in a variety of computer vision tasks because they can capture global information in images. Nevertheless, capturing only global information is problematic. On the one hand, transformers lack the inductive biases of Convolutional Neural Networks (CNNs), so they have difficulty focusing on the local features of defects in industrial defect image inspection tasks. On the other hand, global computation incurs excessive memory and computational cost. To mitigate these issues, we propose a new vision transformer architecture that contains Hybrid Window Attention (HWA) and Dynamic Token Normalization (DTN). HWA combines pooling attention and window attention to reduce computational complexity and improve efficiency. DTN enables the transformer to focus on both the global information and the local features of defects, thus improving the accuracy of industrial surface defect detection. Extensive experiments demonstrate that our Dynamic Vision Transformer (DHT) achieves 96.8% and 98.5% classification accuracy on the NEU dataset and the DAGM dataset, respectively, with low computational complexity.
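The abstract does not say how HWA combines its two ingredients, so the following is only a rough sketch of why window attention and pooling attention are each cheaper than global attention. It counts query-key pairs, the dominant cost term in self-attention, on a token grid. The function name, the 56 x 56 grid, the 7 x 7 window, and the pooling stride of 4 are illustrative assumptions, not values taken from the paper.

```python
def attention_pair_counts(height, width, window, pool_stride):
    """Count query-key pairs for three attention patterns on a token grid.

    All arguments are hypothetical illustration values, not settings
    reported for DHT.
    """
    if height % window or width % window:
        raise ValueError("grid must be divisible by the window size")
    if height % pool_stride or width % pool_stride:
        raise ValueError("grid must be divisible by the pooling stride")

    tokens = height * width

    # Global attention: every token attends to every token.
    global_pairs = tokens * tokens

    # Window attention: every token attends only to the tokens
    # inside its own non-overlapping window.
    window_pairs = tokens * window * window

    # Pooling attention: keys and values are spatially pooled, so every
    # token attends to a reduced set of pooled tokens.
    pooled_keys = (height // pool_stride) * (width // pool_stride)
    pooled_pairs = tokens * pooled_keys

    return {"global": global_pairs, "window": window_pairs, "pooled": pooled_pairs}


counts = attention_pair_counts(height=56, width=56, window=7, pool_stride=4)
for name, pairs in counts.items():
    print(f"{name:>6}: {pairs:>10,d} query-key pairs")
```

Under these assumed sizes, the windowed pattern needs 64 times fewer pairs than global attention, and the pooled pattern 16 times fewer, while the pooled pattern still lets every token see a coarse summary of the whole image.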