Publication | Closed Access
Vision Transformer With Contrastive Learning for Hyperspectral Image Classification
Citations: 27 | References: 19 | Year: 2023
Topics: Engineering, Machine Learning, Feature Extraction, Image Classification, Image Analysis, Data Science, Pattern Recognition, Computational Imaging, Machine Vision, Image Classification (Visual Culture Studies), Feature Learning, Spectral Imaging, Computer Science, Deep Learning, Computer Vision, Hyperspectral Imaging, Vision Transformer, Medicine, Transformer Blocks, Image Classification (Electrical Engineering)
The vision transformer (ViT) has become a hot topic in image processing due to its global feature extraction capabilities. However, the ViT suffers from over-smoothing in feature extraction and over-fitting during training, so it is hard to achieve satisfactory performance in hyperspectral image (HSI) classification. To address these issues, we propose a ViT with contrastive learning (CViT). The network architecture includes a patch embedding module, transformer blocks, and a classifier. The training of CViT can be cast as an optimization problem with a supervised contrastive loss, an unsupervised contrastive loss, and an ℓ1-regularizer on the linear self-attention weights. Specifically, the supervised contrastive loss is proposed to alleviate the negative effects of HSI features' spectral variability and spatial diversity by increasing intra-class consistency. On the other hand, the unsupervised contrastive loss is exploited to reduce redundancy by reconstructing global structural information. In particular, regularizing the linear self-attention weights mitigates the over-smoothing issue. Extensive experimental results on three HSI datasets demonstrate that the proposed CViT achieves competitive performance.
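The training objective described above combines a supervised contrastive term, an unsupervised contrastive term, and an ℓ1 penalty on attention weights. As an illustrative sketch only (the abstract does not give the paper's exact formulation), the supervised contrastive term can be implemented in the standard SupCon style (Khosla et al., 2020), with a generic ℓ1 penalty; the combination weights `alpha` and `beta` are assumed hyperparameters, not values from the paper:

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive loss over L2-normalized features.

    NOTE: an illustrative sketch; the CViT paper's exact loss may differ.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                       # pairwise cosine similarities
    n = len(labels)
    logits_mask = 1.0 - np.eye(n)                     # exclude self-pairs
    pos_mask = (labels[:, None] == labels[None, :]).astype(float) * logits_mask
    exp_sim = np.exp(sim) * logits_mask
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0                            # skip anchors with no positives
    mean_log_prob_pos = (pos_mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()

def l1_penalty(attn_weights):
    """ℓ1-norm of the (linear self-attention) weight matrix."""
    return np.abs(attn_weights).sum()

# Hypothetical combined objective: alpha/beta and the unsupervised term
# are assumptions for illustration, not values from the paper.
def total_loss(features, labels, attn_weights, unsup_loss, alpha=1.0, beta=1e-4):
    return (supervised_contrastive_loss(features, labels)
            + alpha * unsup_loss
            + beta * l1_penalty(attn_weights))
```

The SupCon term is non-negative by construction (each anchor's positive log-probability is at most zero), so it can only be reduced by pulling same-class embeddings together, which is the intra-class-consistency effect the abstract describes.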