Concepedia

Publication | Closed Access

Adaptive MapReduce using situation-aware mappers

67

Citations

32

References

2012

Year

Abstract

We propose new adaptive runtime techniques for MapReduce that improve performance and simplify job tuning. We implement these techniques by breaking a key assumption of MapReduce that mappers run in isolation. Instead, our mappers communicate through a distributed meta-data store and are aware of the global state of the job. However, we still preserve the fault-tolerance, scalability, and programming API of MapReduce. We utilize these "situation-aware mappers" to develop a set of techniques that make MapReduce more dynamic: (a) Adaptive Mappers dynamically take multiple data partitions (splits) to amortize mapper start-up costs; (b) Adaptive Combiners improve local aggregation by maintaining a cache of partial aggregates for the frequent keys; (c) Adaptive Sampling and Partitioning sample the mapper outputs and use the obtained statistics to produce balanced partitions for the reducers. Our experimental evaluation shows that adaptive techniques provide up to 3x performance improvement, in some cases, and dramatically improve performance stability across the board.

References

YearCitations

Page 1