Publication | Closed Access
Monitoring multi-tier clustered systems with invariant metric relationships
Citations: 25
References: 17
Year: 2008
Venue: Unknown
Cluster Computing, Fault Diagnosis, Availability, Engineering, Service Monitoring, Invariant Metric Relationships, Network Analysis, Fault Tolerance, System Metric, System Diagnosis, Accidental Correlations, Reliability Engineering, Data Science, Data Mining, Systems Engineering, Data Management, Failure Detection, Reliability, Computer Science, Fault Management, Monitoring, System Monitoring, Industrial Informatics, System Metrics
To ensure high availability, self-managing systems require self-monitoring and a system model against which to analyze monitoring data. Characterizing relationships between system metrics has been shown to model simple multi-tier transaction systems effectively, enabling failure detection and fault diagnosis. In this paper we show how to extend this invariant metric-relationships approach to clustered multi-tier systems. We show through analysis and experimentation that naive application of the approach increases cost dramatically while reducing diagnosis accuracy. We demonstrate that randomization at the load balancer during the invariant-identification phase will improve diagnosis accuracy, though it neither completely eliminates the problem nor reduces the cost; indeed, it may increase the cost, as this approach will require a long learning phase to remove all accidental correlations. Finally, we argue that knowing the system structure is necessary to effectively apply invariants to the clustered environment.
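As a rough illustration of the accidental-correlation problem described in the abstract, the following toy sketch simulates one load balancer in front of two peer replicas and measures how strongly their request counts correlate. It is not the authors' method: the abstract does not specify the invariant model, so Pearson correlation is used here as a stand-in, and the names `pearson`, `simulate`, and `peer_correlation`, the load range, and the per-window random routing share are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(windows, randomized, seed=0):
    """Return per-window request counts (total, replica_a, replica_b).

    Hypothetical setup: with randomized=False the balancer splits load
    evenly; with randomized=True it draws a new routing share per window.
    """
    rng = random.Random(seed)
    total, rep_a, rep_b = [], [], []
    for _ in range(windows):
        load = rng.randint(100, 1000)      # requests arriving this window
        share = rng.uniform(0.1, 0.9) if randomized else 0.5
        a = int(load * share)              # requests routed to replica A
        total.append(load)
        rep_a.append(a)
        rep_b.append(load - a)             # the remainder goes to replica B
    return total, rep_a, rep_b


def peer_correlation(randomized, windows=2000):
    """Correlation between the two replicas' request counts."""
    _, a, b = simulate(windows, randomized)
    return pearson(a, b)


if __name__ == "__main__":
    print("even split:      ", peer_correlation(randomized=False))
    print("randomized split:", peer_correlation(randomized=True))
```

In this toy setup the even split makes the two replicas' metrics track each other almost perfectly even though neither depends on the other, which is the kind of relationship a naive invariant search would pick up. Randomizing the share weakens that peer-to-peer correlation, while the structural relationship between the total load and the sum of the replica loads holds either way.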