Publication | Closed Access
Email Classification and Forensics Analysis using Machine Learning
Citations: 23
References: 26
Year: 2021
Unknown Venue
Abuse Detection, Engineering, Machine Learning, Information Forensics, Text Mining, Spam Filtering, Classification Method, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Forensic Medicine, Automatic Classification, Knowledge Discovery, Intelligent Classification, Computer Science, Computer Forensics, Logistic Regression Performs, Business, Digital Forensics, Logistic Regression, Classification, Email Classification, Random Forest
Email has long been used as a reliable, secure, and formal mode of communication. With fast and secure communication technologies, reliance on email has increased as well. The massive increase in email data has made managing emails a major challenge. So far, emails can be classified and grouped by sender, size, and date. However, there is a need to detect and classify emails based on their content. Several approaches have been used in the past for content-based classification of emails as spam or non-spam. In this paper, we propose a multi-label email classification approach to organize emails. An efficient classification method is proposed for forensic investigations of massive email data (e.g., a disk image of an email server). This method would help investigators in email-related crime investigations. A comparative study of machine learning algorithms identified Logistic Regression as the method that achieves the highest accuracy compared to Naive Bayes, Stochastic Gradient Descent, Random Forest, and Support Vector Machine. Experiments conducted on benchmark data sets showed that Logistic Regression performs best, with an accuracy of 91.9% using bi-gram features.
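To make the idea of bi-gram features feeding a logistic regression classifier concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the function names (`bigram_features`, `train_logistic_regression`, `predict_proba`), the toy emails, and the two labels are all hypothetical, and the example is binary rather than multi-label. A multi-label setup could, as one assumption, train one such classifier per label.

```python
import math
from collections import Counter


def bigram_features(text):
    """Count word bi-grams (adjacent token pairs) in a lower-cased text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def predict_proba(weights, bias, feats):
    """Probability of class 1 under the logistic (sigmoid) model."""
    z = bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(samples, labels, epochs=200, lr=0.5):
    """Fit binary logistic regression on sparse bi-gram counts with plain SGD."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            # Gradient of the log-loss with respect to the linear score z.
            err = predict_proba(weights, bias, feats) - y
            bias -= lr * err
            for f, v in feats.items():
                weights[f] = weights.get(f, 0.0) - lr * err * v
    return weights, bias


# Hypothetical toy corpus: label 1 = "financial", label 0 = "other".
emails = [
    ("please wire the funds today", 1),
    ("wire the funds to this account", 1),
    ("meeting agenda for next week", 0),
    ("lunch plans for next week", 0),
]
samples = [bigram_features(text) for text, _ in emails]
labels = [label for _, label in emails]
weights, bias = train_logistic_regression(samples, labels)

p_financial = predict_proba(weights, bias, bigram_features("wire the funds now"))
p_other = predict_proba(weights, bias, bigram_features("agenda for next week"))
```

On a real corpus one would normally rely on an established library, hold out a test set, and compare several classifiers, as the comparative study in the abstract does. The sketch only shows how bi-gram counts become the inputs of a logistic model.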