Concepedia

Publication | Closed Access

Angle-based outlier detection in high-dimensional data

Citations: 802

References: 32

Year: 2008

TLDR

Outlier detection in large data sets aims to find the different mechanisms responsible for different groups of objects, but distance-based methods deteriorate in high dimensions because of the curse of dimensionality. The paper proposes ABOD (Angle-Based Outlier Detection), which, together with some variants, scores each point by the variance of the angles between its difference vectors to the other points. ABOD is compared with the distance-based method LOF on several artificial data sets and one real-world data set, and performs especially well on high-dimensional data. The approach alleviates the effects of the curse of dimensionality and does not depend on any parameter selection that influences the quality of the resulting ranking.
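The degradation of distance-based methods in high dimensions can be illustrated with a small experiment. The sketch below is not taken from the paper: the function name `relative_contrast`, the uniform random data, and the sample size are assumptions chosen only for illustration. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance, as the dimensionality grows.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Hypothetical helper for illustration: data and query are drawn
    uniformly from the unit hypercube of the given dimensionality.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n_points, dim))
    query = rng.uniform(size=dim)
    d = np.linalg.norm(X - query, axis=1)  # Euclidean distances to the query
    return (d.max() - d.min()) / d.min()

# The contrast between nearest and farthest neighbour shrinks as dim grows.
for dim in (2, 20, 200, 2000):
    print(dim, relative_contrast(dim))
```

When the contrast approaches zero, nearest and farthest neighbours become hard to tell apart by distance alone, which is the effect the summary refers to.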

Abstract

Detecting outliers in a large set of data objects is a major data mining task aiming at finding different mechanisms responsible for different groups of objects in a data set. All existing approaches, however, are based on an assessment of distances (sometimes indirectly by assuming certain distributions) in the full-dimensional Euclidean data space. In high-dimensional data, these approaches are bound to deteriorate due to the notorious "curse of dimensionality". In this paper, we propose a novel approach named ABOD (Angle-Based Outlier Detection) and some variants assessing the variance in the angles between the difference vectors of a point to the other points. This way, the effects of the "curse of dimensionality" are alleviated compared to purely distance-based approaches. A main advantage of our new approach is that our method does not rely on any parameter selection influencing the quality of the achieved ranking. In a thorough experimental evaluation, we compare ABOD to the well-established distance-based method LOF on various artificial data sets and a real-world data set and show ABOD to perform especially well on high-dimensional data.
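The core idea in the abstract, scoring a point by the variance of the angles between its difference vectors to the other points, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact scoring function: it uses the plain, unweighted variance of the cosines over all pairs of other points, and the published method may differ (for example by weighting with distances). The name `angle_variance_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def angle_variance_scores(X):
    """Per-point variance of cosines between difference vectors.

    Simplified, unweighted sketch (an assumption, not the paper's formula).
    A low score means the other points lie in similar directions as seen
    from this point, which suggests an outlier. Needs at least three
    distinct points.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    scores = np.empty(n)
    for i in range(n):
        diffs = np.delete(X, i, axis=0) - X[i]   # difference vectors from point i
        norms = np.linalg.norm(diffs, axis=1)
        keep = norms > 0                         # skip exact duplicates of point i
        unit = diffs[keep] / norms[keep][:, None]
        cos = unit @ unit.T                      # cosines for all pairs of others
        iu = np.triu_indices(cos.shape[0], k=1)  # count each unordered pair once
        scores[i] = np.var(cos[iu])
    return scores

# Synthetic example: a 20-dimensional Gaussian cluster plus one distant point.
rng = np.random.default_rng(0)
cluster = rng.normal(size=(50, 20))
outlier = np.full((1, 20), 8.0)
X = np.vstack([cluster, outlier])
scores = angle_variance_scores(X)
print("index with the lowest angle variance:", int(np.argmin(scores)))
```

The sketch takes no neighbourhood size or other tuning parameter, in line with the parameter-free property claimed in the abstract. It examines every pair of other points for each point, so its cost grows cubically with the number of points and it is practical only for small data sets.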
