Publication | Closed Access

Resilient Datacenter Load Balancing in the Wild

Citations: 213

References: 32

Year: 2017

TLDR

Production datacenters face traffic dynamics, topology asymmetry, and failures, and existing load‑balancing schemes such as Presto, DRB, CONGA, and CLOVE either ignore path conditions, reroute only at flowlet granularity, or fail to detect failures, limiting their effectiveness. This work proposes that datacenter load‑balancing schemes must be resilient, accurately sensing path conditions and reacting promptly to mitigate uncertainties.

Abstract

Production datacenters operate under various uncertainties such as traffic dynamics, topology asymmetry, and failures. Datacenter load balancing schemes must therefore be resilient to these uncertainties; i.e., they should accurately sense path conditions and react in a timely manner to mitigate the fallout. Despite significant efforts, prior solutions have important drawbacks. On the one hand, solutions such as Presto and DRB are oblivious to path conditions and blindly reroute at a fixed granularity. On the other hand, solutions such as CONGA and CLOVE can sense congestion, but they can reroute only when flowlets emerge; thus, they cannot always react to uncertainties in time. To make things worse, these solutions fail to detect or handle failures such as blackholes and random packet drops, which greatly degrades their performance.
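
The limitation of flowlet-based rerouting can be illustrated with a minimal sketch. It assumes the common definition of a flowlet boundary as an inter-packet gap longer than some threshold. The function name, the 150 µs threshold, and the arrival times below are hypothetical values chosen for illustration; they are not taken from CONGA, CLOVE, or the paper.

```python
from typing import List


def count_reroute_opportunities(arrival_times_us: List[int], flowlet_gap_us: int) -> int:
    """Count the points at which a flowlet-based scheme could switch paths.

    A new flowlet is assumed to start whenever the gap between two
    consecutive packets of the same flow exceeds flowlet_gap_us.
    """
    opportunities = 0
    for prev, curr in zip(arrival_times_us, arrival_times_us[1:]):
        if curr - prev > flowlet_gap_us:
            opportunities += 1
    return opportunities


# Hypothetical flows; times are packet arrivals in microseconds.
steady_flow = [i * 10 for i in range(100)]   # one packet every 10 us, no pauses
bursty_flow = [0, 10, 20, 500, 510, 1200]    # two pauses longer than the threshold

GAP_US = 150  # illustrative threshold, not a value from the paper

print(count_reroute_opportunities(steady_flow, GAP_US))
print(count_reroute_opportunities(bursty_flow, GAP_US))
```

Under these assumptions the steady flow never produces a gap above the threshold, so a scheme that reroutes only at flowlet boundaries has no opportunity to move it off a congested or failed path, while the bursty flow offers two such opportunities.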
