ViM: Out-Of-Distribution with Virtual-logit Matching

TLDR

Existing OOD detection methods rely on a single input source (the feature, the logit, or the softmax probability), which makes them fragile: some OOD samples are easy to detect in feature space but hard to detect in logit space, and vice versa. We introduce Virtual-logit Matching (ViM), an OOD scoring method that fuses a class-agnostic feature score with the ID class-dependent logits, and we release a large, human-annotated OOD dataset for ImageNet1K. ViM generates a virtual OOD logit from the residual of the feature against the principal space, scales it by a constant to match the original logits, and uses its softmax probability as the OOD indicator. We validate the method on both CNNs and vision transformers. On four challenging OOD benchmarks, ViM attains an average AUROC of 90.91% with the BiT-S model, 4% higher than the best baseline. The code and dataset are publicly available.

Abstract

Most existing Out-Of-Distribution (OOD) detection algorithms depend on a single input source: the feature, the logit, or the softmax probability. However, the immense diversity of OOD examples makes such methods fragile. Some OOD samples are easy to identify in the feature space but hard to distinguish in the logit space, and vice versa. Motivated by this observation, we propose a novel OOD scoring method named Virtual-logit Matching (ViM), which combines the class-agnostic score from the feature space with the In-Distribution (ID) class-dependent logits. Specifically, an additional logit representing the virtual OOD class is generated from the residual of the feature against the principal space, and is then matched with the original logits by a constant scaling. The softmax probability of this virtual logit is the indicator of OOD-ness. To facilitate the evaluation of large-scale OOD detection in academia, we create a new OOD dataset for ImageNet1K, which is human-annotated and is 8.8× the size of existing datasets. We conducted extensive experiments on both CNNs and vision transformers to demonstrate the effectiveness of the proposed ViM score. In particular, using the BiT-S model, our method achieves an average AUROC of 90.91% on four difficult OOD benchmarks, which is 4% ahead of the best baseline. Code and dataset are available at https://github.com/haoqiwang/vim.
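
The scoring pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the released implementation: the helper names fit_vim and vim_score are hypothetical, features are centered on the training mean, the principal space is taken from an eigendecomposition of the training feature covariance, and the constant scaling is chosen so that the average virtual logit on training data equals the average maximum logit. The abstract specifies only that a constant scaling is used, so these particular choices are assumptions.

```python
import numpy as np

def fit_vim(train_feats, train_logits, principal_dim):
    """Estimate the residual subspace and the scaling constant (hypothetical helper).

    train_feats:  (N, D) penultimate-layer features of ID training samples
    train_logits: (N, C) logits of the same samples
    """
    mean = train_feats.mean(axis=0)              # assumption: center on the training mean
    centered = train_feats - mean
    cov = centered.T @ centered / len(centered)
    _, eigvecs = np.linalg.eigh(cov)             # eigenvalues come back in ascending order
    # Keep the directions outside the top-`principal_dim` principal directions.
    residual_basis = eigvecs[:, : train_feats.shape[1] - principal_dim]
    residual_norm = np.linalg.norm(centered @ residual_basis, axis=1)
    # Assumption: match the mean virtual logit to the mean maximum ID logit.
    alpha = train_logits.max(axis=1).mean() / residual_norm.mean()
    return mean, residual_basis, alpha

def vim_score(feats, logits, mean, residual_basis, alpha):
    """Return the softmax probability of the virtual OOD class (higher = more OOD)."""
    virtual_logit = alpha * np.linalg.norm((feats - mean) @ residual_basis, axis=1)
    all_logits = np.concatenate([logits, virtual_logit[:, None]], axis=1)
    all_logits -= all_logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(all_logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, -1]
```

A sample whose feature lies mostly in the principal space gets a small virtual logit and is scored by its ID logits, while a sample with a large residual is flagged even when its logits look confident. That is the sense in which the score combines the feature-space and logit-space signals.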
