Publication | Closed Access
Big Data Analytics for Classification of Network Enabled Devices
Citations: 21
References: 27
Year: 2016
Unknown Venue
Engineering, Machine Learning, Big Data Analytics, Network Analysis, Information Forensics, Big Data Model, Network Analytics, Support Vector Machine, Classification Method, Data Science, Data Mining, Pattern Recognition, Decision Tree Learning, Internet Of Things, Predictive Analytics, Knowledge Discovery, Intelligent Classification, Mobile Computing, Computer Science, Data Classification, Random Forest Classifier, Business, Classifier System, Network Traffic Measurement, Big Data
As information technology (IT) and telecommunication systems continue to grow in size and complexity, especially with the Internet of Things (IoT) gaining popularity, maintaining a secure and seamless exchange of information between devices becomes a challenging task. A large number of devices connected over the Internet leads to an increase in vulnerabilities and security threats, which makes the identification of critical assets necessary. Asset identification helps organizations to identify and respond quickly to security breaches. In this paper, machine learning based techniques are used to identify assets based on their connectivity, i.e., as servers or endpoints. Four machine learning algorithms are used for the analysis: K-Nearest Neighbor, Naive Bayes, Support Vector Machines, and Random Forest. The performance of each algorithm is assessed in terms of its F-score. Results show that, for the given dataset, the Random Forest classifier achieved the highest accuracy of the four algorithms in identifying the assets correctly. However, the Random Forest algorithm is computationally intensive and may not work for large datasets. The Naive Bayes algorithm yielded the worst performance, and K-Nearest Neighbor's performance was very close to that achieved by Support Vector Machines. Our results show that, for the given dataset, the Support Vector Machine based classifier was a good compromise between accuracy and computational cost.
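The abstract compares the classifiers by F-score but does not spell out how the score is computed. The sketch below shows the standard F-beta calculation for a two-class server/endpoint labelling. Several things here are assumptions rather than details from the paper: the use of beta = 1 (the common F1 variant), the choice of "server" as the positive class, the function name `f_score`, and the example labels, which are made up purely for illustration.

```python
def f_score(y_true, y_pred, positive="server", beta=1.0):
    """F-beta score for one class; beta=1 gives the usual F1 score.

    Assumption: 'server' is treated as the positive class. The paper's
    abstract does not say which class or which beta the authors used.
    """
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels, invented for illustration (not data from the paper).
y_true = ["server", "server", "endpoint", "endpoint", "server", "endpoint"]
y_pred = ["server", "endpoint", "endpoint", "server", "server", "endpoint"]

score = f_score(y_true, y_pred)
print(f"F1 for the 'server' class: {score:.3f}")
```

In a comparison like the one described, the same function would be applied to each classifier's predictions on a held-out test set, so that the four algorithms are ranked on an identical footing.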