Publication | Closed Access
Packet-Level Telemetry in Large Datacenter Networks
Citations: 79
References: 23
Year: 2015
Cluster Computing, Engineering, High Performance Computer Network, Present Everflow, Network Analysis, Large Volume, Data Center Network, Systems Engineering, Network Management, Advanced Networking, Data Center System, Computer Engineering, Guided Probes, Computer Science, Data Center Networks, Edge Computing, Software Testing, Cloud Computing, Packet-level Telemetry, Network Traffic Measurement, Network Monitoring
Debugging faults in complex networks often requires capturing and analyzing traffic at the packet level. For this task, datacenter networks (DCNs) present unique challenges because of their scale, traffic volume, and diversity of faults. To troubleshoot faults in a timely manner, DCN administrators must a) identify affected packets within a large volume of traffic; b) track them across multiple network components; c) analyze traffic traces for fault patterns; and d) test or confirm potential causes. To our knowledge, no tool today achieves both the specificity and the scale required for this task. We present Everflow, a packet-level network telemetry system for large DCNs. Everflow traces specific packets by implementing a powerful packet filter on top of the "match and mirror" functionality of commodity switches. It shuffles captured packets to multiple analysis servers using load balancers built on switch ASICs, and it sends "guided probes" to test or confirm potential faults. We present experiments that demonstrate Everflow's scalability, and we share experiences of troubleshooting network faults gathered from running it for over six months in Microsoft's DCNs.
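
The "match and mirror" filter and the shuffling of captured packets across analysis servers can be illustrated with a small software model. The sketch below is not Everflow's implementation, which runs on switch hardware. It only shows the general idea under stated assumptions: the `Packet` fields, the two example rules (`tcp-control` and `debug-bit`), and the choice to hash the flow 5-tuple so that all mirrored packets of a flow reach the same analyzer are hypothetical and chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet header used only for this sketch."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    tcp_flags: frozenset = frozenset()
    debug_bit: bool = False  # assumed marker for packets an operator wants traced


@dataclass(frozen=True)
class MirrorRule:
    """A named predicate standing in for one switch match rule."""
    name: str
    matches: Callable[[Packet], bool]


# Illustrative rule set (an assumption, not taken from the abstract):
# mirror TCP control packets and any packet carrying the debug marker.
RULES: List[MirrorRule] = [
    MirrorRule(
        "tcp-control",
        lambda p: p.protocol == "tcp" and bool(p.tcp_flags & {"SYN", "FIN", "RST"}),
    ),
    MirrorRule("debug-bit", lambda p: p.debug_bit),
]


def first_match(packet: Packet, rules: List[MirrorRule] = RULES) -> Optional[str]:
    """Return the name of the first rule that matches, or None if none does."""
    for rule in rules:
        if rule.matches(packet):
            return rule.name
    return None


def pick_analyzer(packet: Packet, num_analyzers: int) -> int:
    """Map a packet to an analyzer index by hashing its flow 5-tuple."""
    key = "|".join(
        [packet.src_ip, packet.dst_ip, str(packet.src_port),
         str(packet.dst_port), packet.protocol]
    )
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_analyzers


def mirror(packets: List[Packet], num_analyzers: int) -> List[List[Packet]]:
    """Keep only matched packets and spread them across analyzer buckets."""
    buckets: List[List[Packet]] = [[] for _ in range(num_analyzers)]
    for packet in packets:
        if first_match(packet) is not None:
            buckets[pick_analyzer(packet, num_analyzers)].append(packet)
    return buckets
```

Calling `mirror(packets, 4)` on a list of `Packet` objects returns four buckets that contain only the packets matched by some rule. Because the analyzer index depends only on the 5-tuple, the SYN and FIN of one connection land in the same bucket, which lets a single analyzer reconstruct that flow's trace.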