Publication | Closed Access
Robust Failure Diagnosis of Microservice System Through Multimodal Data
Citations: 77
References: 45
Year: 2023
Engineering, Machine Learning, Intelligent Diagnostics, Diagnosis, Robust Failure Diagnosis, Fault Forecasting, Intelligent Systems, Mining Methods, Reliability Engineering, Data Science, Data Mining, Pattern Recognition, Systems Engineering, Failure Detection, Reliability, Microservices Design, Knowledge Discovery, Root Cause Instance, Computer Science, Log Analysis, Automatic Failure Diagnosis, Failure Type, Failure Prediction, Data Modeling
Automatic failure diagnosis is crucial for large microservice systems. Currently, most failure diagnosis methods rely on single-modal data (i.e., metrics, logs, or traces alone). We conduct an empirical study of real-world failure cases and show that combining these data sources (multimodal data) leads to more accurate diagnosis. However, effectively representing these data and handling imbalanced failure types remain challenging. To tackle these issues, we propose DiagFusion, a robust failure diagnosis approach that uses multimodal data. It leverages embedding techniques and data augmentation to represent the multimodal data of service instances, combines deployment data and traces to build a dependency graph, and uses a graph neural network to localize the root cause instance and determine the failure type. Our evaluations on real-world datasets show that DiagFusion outperforms existing methods in root cause instance localization (improving by 20.9% to 368%) and failure type determination (improving by 11.0% to 169%).
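The dependency-graph step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how DiagFusion constructs its graph, so everything here is an assumption for illustration: the function name `build_dependency_graph`, the input shapes (caller/callee pairs taken from traces, an instance-to-host mapping taken from deployment data), and the rule that co-located instances are linked in both directions are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_dependency_graph(trace_calls, deployment):
    """Build an adjacency map over service instances (illustrative only).

    trace_calls: iterable of (caller, callee) instance pairs seen in traces.
    deployment:  dict mapping each instance name to the host it runs on.
    Both input shapes are assumptions, not the paper's actual data format.
    """
    graph = defaultdict(set)

    # Register every deployed instance as a node, even if it has no edges.
    for instance in deployment:
        graph[instance]

    # Directed edges from traces: the caller depends on the callee.
    for caller, callee in trace_calls:
        graph[caller].add(callee)
        graph[callee]  # make sure the callee exists as a node

    # Assumed rule: instances on the same host may affect each other,
    # so link every co-located pair in both directions.
    instances_by_host = defaultdict(list)
    for instance, host in deployment.items():
        instances_by_host[host].append(instance)
    for instances in instances_by_host.values():
        for a, b in combinations(instances, 2):
            graph[a].add(b)
            graph[b].add(a)

    # Return plain, sorted lists so the result is deterministic.
    return {node: sorted(neighbors) for node, neighbors in graph.items()}


if __name__ == "__main__":
    calls = [("frontend-1", "cart-1"), ("cart-1", "db-1")]
    hosts = {"frontend-1": "node-a", "cart-1": "node-a", "db-1": "node-b"}
    print(build_dependency_graph(calls, hosts))
```

A graph of this kind could then be passed, together with per-instance feature vectors, to a graph neural network; that part is omitted because the abstract gives no detail on the model.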