Publication | Open Access
A Consolidated Decision Tree-Based Intrusion Detection System for Binary and Multiclass Imbalanced Datasets
Citations: 153
References: 45
Year: 2021
High-class Imbalanced Dataset, Anomaly Detection, Engineering, Information Security, Information Forensics, Detection Technique, Mining Methods, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Management, Decision Tree Learning, Multiple Classifier System, Intrusion Detection System, Threat Detection, Knowledge Discovery, Computer Science, Multiclass Imbalanced Datasets, Data Classification, Intrusion Detection, Relative Random Sampling, Widespread Acceptance, Cost-sensitive Learning
The rapid expansion of internet and mobile technologies has amplified cybercrime, making intrusion detection a critical challenge for security experts. This study introduces a host-based intrusion detection system that combines a C4.5 decision tree with the Consolidated Tree Construction algorithm to handle class-imbalanced data. The system employs a Supervised Relative Random Sampling technique to balance highly skewed datasets and a multi-class feature-selection filter to identify the most suitable features, and it is benchmarked against leading IDS models. On the NSL-KDD and CICIDS2017 datasets, the proposed IDS achieved 99.96% and 99.95% accuracy, respectively, using 34 features.
The widespread adoption and growth of the Internet and mobile technologies have revolutionized our lives. At the same time, the world is witnessing, and suffering from, technologically aided crime. These threats, including but not limited to hacking and intrusions, are the main concern of security experts. Nevertheless, the challenges facing effective intrusion detection methods remain an active research interest. This paper's main contribution is a host-based intrusion detection system that uses a C4.5-based detector on top of the popular Consolidated Tree Construction (CTC) algorithm, which works efficiently in the presence of class-imbalanced data. An improved version of the random sampling mechanism, called Supervised Relative Random Sampling (SRRS), is proposed to generate a balanced sample from a highly class-imbalanced dataset at the detector's pre-processing stage. Moreover, an improved multi-class feature selection mechanism has been designed and developed as a filter component to select the most suitable features of the IDS datasets for efficient intrusion detection. The proposed IDS has been validated against state-of-the-art intrusion detection systems. The results show accuracies of 99.96% and 99.95% on the NSL-KDD and CICIDS2017 datasets, respectively, using 34 features.