Publication | Closed Access
Isolation-Based Anomaly Detection
Citations: 1.9K
References: 38
Year: 2012
Anomaly Detection, Machine Learning, Engineering, Information Security, Image Analysis, Isolation-based Anomaly Detection, Data Science, Data Mining, Pattern Recognition, Management, Statistics, Isolation Forest, iForest, Predictive Analytics, Outlier Detection, Knowledge Discovery, Data Privacy, Computer Science, Data Security, Novelty Detection, Classifier System, Data Points
Anomalies are data points that are few and different. The article proposes Isolation Forest, a method that detects anomalies solely through isolation, avoiding distance or density measures. Isolation Forest builds random trees that isolate anomalies via subsampling, achieving low linear time complexity and small memory requirements. Empirical results show that Isolation Forest outperforms ORCA, one‑class SVM, LOF, and Random Forests in AUC and speed, remains robust to masking and swamping, and performs well on high‑dimensional data even without training anomalies.
Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, which makes it fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF, and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.
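The isolation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_tree`, `fit`, `anomaly_score`), the dictionary-based tree layout, and the toy data are all assumptions made for illustration. The defaults of 100 trees and a subsample of 256 points, and the path-length normalisation c(n), follow the commonly cited formulation of Isolation Forest. Each tree splits a random subsample on a random attribute at a random value, so points that are few and different tend to be separated after fewer splits and receive a higher score.

```python
import math
import random


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-parallel splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # Simplification (assumption): stop if the chosen attribute is constant.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Number of splits needed to reach a leaf, plus an adjustment for unsplit points."""
    if "size" in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def fit(points, n_trees=100, sample_size=256, seed=0):
    """Build a forest from random subsamples; assumes at least two points."""
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    forest = [
        build_tree(rng.sample(points, psi), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return forest, psi


def anomaly_score(x, forest, psi):
    """Score in (0, 1); values closer to 1 indicate easier isolation."""
    mean_path = sum(path_length(x, tree) for tree in forest) / len(forest)
    return 2.0 ** (-mean_path / c(psi))


# Toy data (assumption): a Gaussian cluster around the origin plus one distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

forest, psi = fit(data)
outlier_score = anomaly_score((8.0, 8.0), forest, psi)
inlier_score = anomaly_score((0.0, 0.0), forest, psi)
print(outlier_score, inlier_score)
```

In this sketch the distant point should score higher than a point at the centre of the cluster, because it needs fewer random splits to be isolated. No distance or density computation is involved, and each tree sees only a small subsample, which is where the low time and memory cost mentioned above comes from.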