Publication | Closed Access
WAP5
Citations: 125
References: 19
Year: 2006
Unknown Venue
Software Maintenance, Cluster Computing, Engineering, Software Engineering, Fault-tolerant Messaging, Software Analysis, Data Science, Distributed Environment, Application Traces, Data Management, Source Code, Distributed Systems, Computer Science, New Algorithm, Software Design, Distributed Computing, Program Analysis, Cloud Computing, Event-driven Monitoring, Network Monitoring, System Software
Wide-area distributed applications are challenging to debug, optimize, and maintain. We present Wide-Area Project 5 (WAP5), which aims to make these tasks easier by exposing the causal structure of communication within an application and by exposing delays that imply bottlenecks. These bottlenecks might not otherwise be obvious, with or without the application's source code. Previous research projects have presented algorithms to reconstruct application structure and the corresponding timing information from black-box message traces of local-area systems. In this paper we present (1) a new algorithm for reconstructing application structure in both local- and wide-area distributed systems, (2) an infrastructure for gathering application traces in PlanetLab, and (3) our experiences tracing and analyzing three systems: CoDeeN and Coral, two content-distribution networks in PlanetLab; and Slurpee, an enterprise-scale incident-monitoring system.