Publication | Closed Access
PLELog: Semi-Supervised Log-Based Anomaly Detection via Probabilistic Label Estimation
Citations: 54
References: 72
Year: 2021
Venue: Unknown
Natural Language Processing, Probabilistic Label Estimation, Anomaly Detection, Machine Learning, Data Science, Data Mining, Information Retrieval, Engineering, Log Analysis, Outlier Detection, Knowledge Discovery, Novelty Detection, Open-source Log Data, Log Sequences, Computer Science, Log Events, Log Management, Text Mining
PLELog is a novel approach for log-based anomaly detection via probabilistic label estimation. It is designed to detect anomalies in unlabeled logs effectively while avoiding the manual labeling effort needed to generate training data. We represent the semantic information within log events as fixed-length vectors and apply HDBSCAN to automatically cluster log sequences. We then propose a Probabilistic Label Estimation approach to reduce the noise introduced by labeling errors, and feed the "labeled" instances into an attention-based GRU network for training. We conducted an empirical study to evaluate the effectiveness of PLELog on two open-source log datasets (i.e., HDFS and BGL). The results demonstrate the effectiveness of PLELog. In addition, PLELog has been applied to two real-world systems from a university and a large corporation, further demonstrating its practicability.
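The label-estimation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `estimate_soft_labels`, the small set of known-normal sequences, and the scoring formula are all assumptions made for illustration. The sketch takes cluster assignments (with -1 marking noise, as HDBSCAN implementations conventionally do) and a per-sequence membership strength in [0, 1], then turns hard cluster-based guesses into soft anomaly probabilities so that weakly clustered sequences contribute less confident labels.

```python
def estimate_soft_labels(cluster_ids, membership, known_normal):
    """Return an illustrative P(anomalous) for each log sequence.

    cluster_ids  -- cluster id per sequence; -1 means noise (assumed convention)
    membership   -- strength of membership in the assigned cluster, in [0, 1]
    known_normal -- indices of sequences already known to be normal (assumption)
    """
    # Clusters that contain at least one known-normal sequence are
    # treated as "normal" clusters; every other cluster is suspicious.
    normal_clusters = {
        cluster_ids[i] for i in known_normal if cluster_ids[i] != -1
    }

    soft_labels = []
    for i, (cluster, strength) in enumerate(zip(cluster_ids, membership)):
        if i in known_normal:
            # Ground truth is available, so no estimation is needed.
            soft_labels.append(0.0)
        elif cluster in normal_clusters:
            # Guessed normal: weak membership pulls the label toward 0.5.
            soft_labels.append(0.5 * (1.0 - strength))
        else:
            # Guessed anomalous: weak membership (or noise) stays near 0.5.
            soft_labels.append(0.5 + 0.5 * strength)
    return soft_labels


# Hypothetical toy input: six sequences, one of them known to be normal.
labels = estimate_soft_labels(
    cluster_ids=[0, 0, 0, 1, 1, -1],
    membership=[1.0, 0.8, 0.4, 0.9, 0.6, 0.0],
    known_normal={0},
)
print([round(p, 2) for p in labels])
```

Soft labels of this kind could then serve as training targets for a sequence classifier in place of hard 0/1 labels, so that likely mislabeled sequences exert less influence on the loss.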