Towards effective feature selection in machine learning-based botnet detection approaches

Abstract

Botnets, as one of the most formidable cyber security threats, are becoming more sophisticated and resistant to detection. In spite of specific behaviors each botnet has, there exist adequate similarities inside each botnet that separate its behavior from benign traffic. Several botnet detection systems have been proposed based on these similarities. However, offering a solution for differentiating botnet traffic (even those using same protocol, e.g. IRC) from normal traffic is not trivial. Extraction of features in either host or network level to model a botnet has been one of the most popular methods in botnet detection. A subset of features, usually selected based on some intuitive understanding of botnets, is used by the machine learning algorithms to classify/ cluster botnet traffic. These approaches, tested against two or three botnet traces, have mostly showed satisfactory detection results. Even though, their effectiveness in detection of other botnets or real traffic remains in doubt. Additionally, effectiveness of different combination of features in terms of providing more detection coverage has not been fully studied. In this paper we revisit flow-based features employed in the existing botnet detection studies and evaluate their relative effectiveness. To ensure a proper evaluation we create a dataset containing a diverse set of botnet traces and background traffic.

References

Page 1

	Year	Citations

Page 1