Publication | Closed Access
Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need
Citations: 43
References: 43
Year: 2023
Venue: Unknown
Keywords: Engineering, Feature Detection, Machine Learning, Biometrics, Information Forensics, Image Manipulation, Detection Technique, Image Forensics, OOD Samples, Image Classification, Image Analysis, Image Modeling, Data Science, Pattern Recognition, Out-of-distribution Detection, OOD Detection Framework, Vision Recognition, Machine Vision, Feature Learning, Object Detection, Computer Science, Deep Learning, Signal Processing, Computer Vision, OOD Detection, Object Recognition
The core of out-of-distribution (OOD) detection is to learn an in-distribution (ID) representation that is distinguishable from OOD samples. Previous work applied recognition-based methods to learn ID features, but these tend to learn shortcuts instead of comprehensive representations. In this work, we find, surprisingly, that simply using reconstruction-based methods can boost OOD detection performance significantly. We explore the main contributors to OOD detection in depth and find that reconstruction-based pretext tasks have the potential to provide a generally applicable and efficacious prior, which helps the model learn the intrinsic data distribution of the ID dataset. Specifically, we take Masked Image Modeling as the pretext task for our OOD detection framework (MOOD). Without bells and whistles, MOOD outperforms the previous SOTA of one-class OOD detection by 5.7%, multi-class OOD detection by 3.0%, and near-distribution OOD detection by 2.1%. It even defeats 10-shot-per-class outlier-exposure OOD detection, although we do not use any OOD samples for our detection. Code is available at https://github.com/lijingyao20010602/MOOD.
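The abstract describes a two-stage recipe: learn ID representations with a masked-reconstruction pretext task, then score test samples by their distance from the ID feature distribution. The sketch below is an illustrative toy version of that pipeline, not the authors' implementation: `mask_patches` mimics the patch masking used in masked image modeling, and a Mahalanobis distance on features stands in for the detection metric. All names, dimensions, and the synthetic data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(image, patch=4, mask_ratio=0.75):
    """Split a square image into patches and zero out a random subset,
    as in masked image modeling; the model would be trained to
    reconstruct the masked regions. Returns (masked image, boolean mask)."""
    h, w = image.shape
    ph, pw = h // patch, w // patch
    n_mask = int(ph * pw * mask_ratio)
    idx = rng.choice(ph * pw, size=n_mask, replace=False)
    masked, mask = image.copy(), np.zeros((h, w), dtype=bool)
    for i in idx:
        r, c = divmod(i, pw)
        sl = (slice(r * patch, (r + 1) * patch),
              slice(c * patch, (c + 1) * patch))
        masked[sl] = 0.0
        mask[sl] = True
    return masked, mask

def mahalanobis_ood_score(feat, id_mean, id_cov_inv):
    """OOD score as the Mahalanobis distance of a feature vector from
    the fitted ID feature distribution; larger means more likely OOD."""
    d = feat - id_mean
    return float(d @ id_cov_inv @ d)

# Stage 1 (toy): mask patches of an input for reconstruction pretraining.
img = rng.normal(size=(16, 16))
masked_img, mask = mask_patches(img)

# Stage 2 (toy): fit ID feature statistics, then score held-out features.
# Here "features" are synthetic; in MOOD they come from the pretrained encoder.
id_feats = rng.normal(0.0, 1.0, size=(500, 8))
mu = id_feats.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(id_feats, rowvar=False) + 1e-6 * np.eye(8))

in_score = mahalanobis_ood_score(rng.normal(0.0, 1.0, size=8), mu, cov_inv)
out_score = mahalanobis_ood_score(rng.normal(5.0, 1.0, size=8), mu, cov_inv)
```

Because the OOD feature is drawn far from the ID mean, its score is much larger than the ID sample's, which is the separation the detector relies on.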