Abstract

This paper compares and contrasts the most widely used network security datasets, evaluating how well they serve as benchmarks for intrusion and anomaly detection systems. The antiquated nature of some of these datasets, along with their inadequacies, is examined and used as a basis for discussing a new approach to analyzing network traffic data. Live network traffic is collected, consisting of real normal traffic together with attack data from both real attacks and penetration testing. The attack data is then inspected and labeled through manual analysis. While network attacks and anomaly features vary widely, they share some commonalities that are examined here. Among these are self-similarity convergence, periodicity, and repetition. Further, the knowledge inherent in the definition of network boundaries and advertised services can provide crucial context that allows the network analyst to consider self-aware attributes when examining network traffic sessions. To these ends, the Session Aggregation for Network Traffic Analysis (SANTA) dataset is proposed. The motivation and the methodology for collecting, aggregating, and evaluating the raw data are presented, as well as the conceptualization of the SANTA attributes and the advantages provided by this approach.
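As a rough illustration of what session-level aggregation with boundary context can look like, the following minimal sketch groups packet records into sessions and derives a few per-session attributes. Everything in it is an assumption made for illustration: the record layout, the session key, the `INTERNAL_NET` and `ADVERTISED_SERVICES` values, the attribute names, and the use of the coefficient of variation of inter-arrival gaps as a crude periodicity indicator. It is not the SANTA attribute set or the authors' method.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network
from statistics import mean, pstdev

# Assumed network boundary and advertised services (hypothetical values).
INTERNAL_NET = ip_network("10.0.0.0/24")
ADVERTISED_SERVICES = {("10.0.0.80", 80)}

# Hypothetical packet records:
# (timestamp_s, src_ip, dst_ip, dst_port, protocol, size_bytes)
PACKETS = [
    (0.0, "198.51.100.7", "10.0.0.80", 80, "tcp", 60),
    (1.0, "203.0.113.9", "10.0.0.22", 22, "tcp", 40),
    (2.0, "203.0.113.9", "10.0.0.22", 22, "tcp", 40),
    (9.0, "203.0.113.9", "10.0.0.22", 22, "tcp", 40),
    (10.0, "198.51.100.7", "10.0.0.80", 80, "tcp", 60),
    (20.0, "198.51.100.7", "10.0.0.80", 80, "tcp", 60),
    (30.0, "198.51.100.7", "10.0.0.80", 80, "tcp", 60),
]


def aggregate_sessions(packets):
    """Group packets by (src, dst, dst_port, protocol) and summarize each group."""
    grouped = defaultdict(list)
    for ts, src, dst, dport, proto, size in packets:
        grouped[(src, dst, dport, proto)].append((ts, size))

    sessions = {}
    for key, records in grouped.items():
        records.sort()  # order by timestamp
        times = [ts for ts, _ in records]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        avg_gap = mean(gaps) if gaps else 0.0
        src, dst, dport, _ = key
        sessions[key] = {
            "packets": len(records),
            "bytes": sum(size for _, size in records),
            "duration": times[-1] - times[0],
            # Coefficient of variation of inter-arrival gaps: values near 0
            # suggest regular timing. None when it cannot be computed.
            "gap_cv": pstdev(gaps) / avg_gap if avg_gap > 0 else None,
            # Context derived from the assumed boundary and service list.
            "inbound": ip_address(src) not in INTERNAL_NET
            and ip_address(dst) in INTERNAL_NET,
            "to_advertised": (dst, dport) in ADVERTISED_SERVICES,
        }
    return sessions


SESSIONS = aggregate_sessions(PACKETS)
for session_key, attributes in SESSIONS.items():
    print(session_key, attributes)
```

In this toy data, the evenly spaced requests to the advertised web service and the irregular inbound traffic to a non-advertised port end up as two separate sessions, each carrying both timing and boundary context that an analyst could inspect together.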
