Publication | Closed Access
A Fundamental Tradeoff Between Computation and Communication in Distributed Computing
473
Citations
53
References
2017
Year
Cluster ComputingEngineeringDistributed ProgrammingDistributed AlgorithmsFundamental TradeoffComputational ComplexityDistributed Data ProcessingMap-reduceDistributed Data AnalyticsComputing SystemsDistributed EnvironmentParallel ComputingDistributed ModelComputer EngineeringOptimal Computation-communication TradeoffDistributed SystemsComputer ScienceDistributed ProcessingScalable ComputingDistributed ComputingCloud ComputingParallel ProgrammingResource Optimization
A general distributed computing framework, inspired by MapReduce, decomposes overall computation into distributed “Map” and “Reduce” functions across multiple nodes. The authors aim to determine how additional computing power can be traded for reduced communication load in distributed computing. They propose a coded distributed computing (CDC) scheme that increases Map computation load by a factor r to create coding opportunities that reduce communication load by the same factor, and apply it to Hadoop TeraSort to develop CodedTeraSort. The results show that computation and communication loads are inversely proportional, an information‑theoretic lower bound matches the CDC scheme, thereby exactly characterizing the optimal tradeoff, and CodedTeraSort speeds up Hadoop TeraSort by 1.97×–3.39×.
How can we optimally trade extra computing power to reduce the communication load in distributed computing? We answer this question by characterizing a fundamental tradeoff between computation and communication in distributed computing, i.e., the two are inversely proportional to each other. More specifically, a general distributed computing framework, motivated by commonly used structures like MapReduce, is considered, where the overall computation is decomposed into computing a set of “Map” and “Reduce” functions distributedly across multiple computing nodes. A coded scheme, named “coded distributed computing” (CDC), is proposed to demonstrate that increasing the computation load of the Map functions by a factor of r (i.e., evaluating each function at r carefully chosen nodes) can create novel coding opportunities that reduce the communication load by the same factor. An information-theoretic lower bound on the communication load is also provided, which matches the communication load achieved by the CDC scheme. As a result, the optimal computation-communication tradeoff in distributed computing is exactly characterized. Finally, the coding techniques of CDC is applied to the Hadoop TeraSort benchmark to develop a novel CodedTeraSort algorithm, which is empirically demonstrated to speed up the overall job execution by 1.97× -3.39×, for typical settings of interest.
| Year | Citations | |
|---|---|---|
Page 1
Page 1