Concepedia

Publication | Closed Access

Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation

Year: 2023 · Citations: 53 · References: 55

TLDR

Deep neural networks have advanced significantly, yet they are biased toward majority classes when training on long‑tailed data, and existing mixture‑of‑experts methods use equally deep experts, ignoring class‑specific depth preferences. This work introduces Self‑Heterogeneous Integration with Knowledge Excavation (SHIKE) to address these limitations. SHIKE employs Depth‑wise Knowledge Fusion to combine shallow and deep features within each expert, and Dynamic Knowledge Transfer to mitigate the impact of hard negative classes on tail performance. The method yields substantial gains, achieving state‑of‑the‑art accuracies of 56.3%, 60.3%, 75.4%, and 41.9% on CIFAR100‑LT, ImageNet‑LT, iNaturalist 2018, and Places‑LT, respectively. Source code is available at https://github.com/jinyan-06/SHIKE.

Abstract

Deep neural networks have made huge progress in the last few decades. However, as real-world data often exhibits a long-tailed distribution, vanilla deep models tend to be heavily biased toward the majority classes. To address this problem, state-of-the-art methods usually adopt a mixture of experts (MoE) to focus on different parts of the long-tailed distribution. Experts in these methods share the same model depth, which neglects the fact that different classes may prefer to be fit by models of different depths. To this end, we propose a novel MoE-based method called Self-Heterogeneous Integration with Knowledge Excavation (SHIKE). We first propose Depth-wise Knowledge Fusion (DKF) to fuse features between different shallow parts and the deep part in one network for each expert, which makes experts more diverse in terms of representation. Based on DKF, we further propose Dynamic Knowledge Transfer (DKT) to reduce the influence of the hardest negative class, which has a non-negligible impact on the tail classes in our MoE framework. As a result, the classification accuracy on long-tailed data can be significantly improved, especially for the tail classes. SHIKE achieves state-of-the-art performance of 56.3%, 60.3%, 75.4%, and 41.9% on CIFAR100-LT (IF100), ImageNet-LT, iNaturalist 2018, and Places-LT, respectively. The source code is available at https://github.com/jinyan-06/SHIKE.
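The core idea behind Depth-wise Knowledge Fusion, as the abstract describes it, is that each expert combines features from shallow network stages with the deep feature, rather than classifying from the deepest feature alone. A minimal NumPy sketch of this fusion idea follows; the stage shapes, the global-average-pooling choice, and the linear expert head are all illustrative assumptions for exposition, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps from three stages of one backbone, for a
# single image: (channels, height, width). Shapes are illustrative only.
stage_feats = [
    rng.standard_normal((64, 32, 32)),   # shallow stage
    rng.standard_normal((128, 16, 16)),  # middle stage
    rng.standard_normal((256, 8, 8)),    # deep stage
]

def global_avg_pool(fmap):
    """Collapse each channel's spatial map to a single scalar."""
    return fmap.mean(axis=(1, 2))

# Depth-wise fusion: pool every stage and concatenate into one vector,
# so the expert sees shallow and deep representations jointly.
fused = np.concatenate([global_avg_pool(f) for f in stage_feats])
assert fused.shape == (64 + 128 + 256,)

# A hypothetical expert head: a linear classifier over the fused vector.
# Different experts could fuse different subsets of stages, making the
# ensemble heterogeneous in effective depth.
num_classes = 100
W = rng.standard_normal((num_classes, fused.size)) * 0.01
logits = W @ fused
print(logits.shape)  # (100,)
```

Giving each expert a different mix of shallow and deep features is one way to realize the "self-heterogeneous" ensemble from a single shared backbone, which is the property the abstract attributes to DKF.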
