Publication | Closed Access
Stardust
Citations: 41
References: 15
Year: 2006
Unknown Venue
Cluster Computing, Engineering, Service Monitoring, Software Analysis, Parallel Computing, Data Management, Profiling Tool, Distributed Systems, Computer Science, Performance Metrics, Performance Analysis Tool, Performance Monitoring, Program Analysis, Software Testing, Cloud Computing, Minimal Guidance, Storage System, System Monitoring, System Software
Performance monitoring in most distributed systems provides minimal guidance for tuning, problem diagnosis, and decision making. Stardust is a monitoring infrastructure that replaces traditional performance counters with end-to-end traces of requests and allows for efficient querying of performance metrics. Such traces better inform key administrative performance challenges by enabling, for example, extraction of per-workload, per-resource demand information and per-workload latency graphs. This paper reports on our experience building and using end-to-end tracing as an on-line monitoring tool in a distributed storage system. Using diverse system workloads and scenarios, we show that such fine-grained tracing can be made efficient (less than 6% overhead) and is useful for on- and off-line analysis of system behavior. These experiences make a case for having other systems incorporate such an instrumentation framework.
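To make the abstract's claim concrete, the following is a minimal sketch of how per-workload, per-resource demand and per-workload latencies might be derived from end-to-end trace records. It is not Stardust's actual schema or query interface: the `TraceRecord` fields, the function names, and the units are all hypothetical, chosen only to illustrate why request-tagged records answer questions that aggregate performance counters cannot.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """One hypothetical trace record; field names are assumptions, not Stardust's schema."""
    request_id: int   # ties the record to a single end-to-end request
    workload: str     # workload (client or application) that issued the request
    resource: str     # resource the request touched, e.g. "cpu", "disk", "network"
    demand: float     # amount of the resource consumed (unit is illustrative)
    timestamp: float  # time at which the record was emitted


def demand_by_workload(records):
    """Sum resource demand per (workload, resource) pair."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.workload, r.resource)] += r.demand
    return dict(totals)


def latencies_by_workload(records):
    """Approximate each request's latency as the span between its first and last record."""
    first, last, owner = {}, {}, {}
    for r in records:
        owner[r.request_id] = r.workload
        first[r.request_id] = min(first.get(r.request_id, r.timestamp), r.timestamp)
        last[r.request_id] = max(last.get(r.request_id, r.timestamp), r.timestamp)
    result = defaultdict(list)
    for request_id, workload in owner.items():
        result[workload].append(last[request_id] - first[request_id])
    return dict(result)
```

Because every record carries a request identifier and a workload label, both questions reduce to a group-by over the trace. A plain counter such as total disk operations would have discarded the attribution needed to separate one workload's demand from another's.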