Publication | Closed Access
A Heterogeneity-aware Data Distribution and Rebalance Method in Hadoop Cluster
Citations: 16
References: 6
Year: 2012
Unknown Venue
Cluster Computing, Engineering, Big Data Analytics, Map-reduce, Distributed Data Analytics, Data Science, Data Mining, Data Integration, Parallel Computing, Data Management, Statistics, Current Hadoop Implementation, Computer Science, Data-intensive Computing, Scalable Computing, Data Distribution, Cloud Computing, Dynamic Data Migration, Heterogeneity-aware Data Distribution, Data Heterogeneity, Massive Data Processing, Big Data
The current Hadoop implementation assumes that the computing nodes in a cluster are homogeneous. Because the input data are split into data blocks of a predefined size, Hadoop suffers performance degradation during the Map phase in heterogeneous clusters. To solve this problem, we propose a heterogeneity-aware data distribution and rebalance method for heterogeneous Hadoop clusters. The method consists of two parts: 1) performance-aware data distribution, and 2) dynamic data migration. The experimental results indicate that our method improves Map performance in heterogeneous clusters and also enhances the data locality of Map tasks.
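To make the two parts concrete, the following is a minimal sketch and not the authors' algorithm; the abstract does not say how node performance is measured or how migrations are chosen. It assumes each node has a single relative speed score, that the target share of blocks is proportional to that score, and that rebalancing moves blocks from over-loaded to under-loaded nodes. The names `distribute_blocks`, `plan_migrations`, and `speeds` are hypothetical.

```python
def distribute_blocks(num_blocks, speeds):
    """Assumed policy: give each node a block count proportional to its speed.

    Uses largest-remainder rounding so the counts sum to num_blocks.
    """
    total = sum(speeds.values())
    quotas = {node: num_blocks * s / total for node, s in speeds.items()}
    counts = {node: int(q) for node, q in quotas.items()}
    leftover = num_blocks - sum(counts.values())
    # Hand the remaining blocks to the nodes with the largest fractional quota.
    by_fraction = sorted(quotas, key=lambda n: quotas[n] - counts[n], reverse=True)
    for node in by_fraction[:leftover]:
        counts[node] += 1
    return counts


def plan_migrations(current, speeds):
    """Assumed policy: move blocks so the layout matches the proportional target.

    Returns a list of (source, destination, block_count) moves.
    """
    target = distribute_blocks(sum(current.values()), speeds)
    surplus = {n: current[n] - target[n] for n in current if current[n] > target[n]}
    deficit = {n: target[n] - current[n] for n in current if current[n] < target[n]}
    moves = []
    for src in sorted(surplus):
        for dst in sorted(deficit):
            if surplus[src] == 0:
                break
            k = min(surplus[src], deficit[dst])
            if k > 0:
                moves.append((src, dst, k))
                surplus[src] -= k
                deficit[dst] -= k
    return moves


if __name__ == "__main__":
    speeds = {"slow": 1.0, "medium": 2.0, "fast": 3.0}  # hypothetical scores
    print(distribute_blocks(12, speeds))
    print(plan_migrations({"slow": 4, "medium": 4, "fast": 4}, speeds))
```

Under these assumptions, a uniformly loaded cluster is rebalanced by shifting blocks from the slowest node to the fastest one, so that each node's share of the data tracks its share of the cluster's processing capacity.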