Publication | Closed Access
Optimizing load balancing and data-locality with data-aware scheduling
Citations: 132
References: 43
Year: 2014
Unknown Venue
Cluster Computing, Load Balancing (Computing), Engineering, Computer Architecture, Cloud Load Balancing, Data Science, Parallel Computing, Data Management, Job Scheduler, Load Balancing, Cloud Scheduling, Computer Engineering, Task Parallelism, Scheduling (Computing), Computer Science, Data-intensive Computing, Edge Computing, Cloud Computing, Parallel Programming, Task Metadata, Task Data Size, Work Stealing
Load balancing techniques such as work stealing are important for obtaining the best performance from distributed task scheduling systems in which multiple schedulers make scheduling decisions. In work stealing, tasks are randomly migrated from heavily loaded schedulers to idle ones. However, for data-intensive applications in which tasks are dependent and task execution involves processing large amounts of data, migrating tasks blindly yields poor data locality and incurs significant data-transfer overhead. This work improves work stealing by using both dedicated and shared queues, with tasks organized into queues based on task data size and location. We implement our technique in MATRIX, a distributed task scheduler for many-task computing. We leverage a distributed key-value store to organize and scale the task metadata, task dependencies, and data-locality information. We evaluate the improved work stealing technique with both applications and micro-benchmarks structured as directed acyclic graphs. Results show that the proposed data-aware work stealing technique performs well.
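The queue organization described in the abstract can be illustrated with a minimal sketch. It is not the MATRIX implementation: the names `Task`, `Scheduler`, `submit`, and `steal_from`, the byte-size `threshold`, and the steal-half policy are all assumptions made for illustration. The sketch assumes that a task whose input data is large and local to a scheduler goes into that scheduler's dedicated queue, where it cannot be stolen, while every other task goes into a shared queue that idle schedulers may steal from.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    data_size: int      # size of the task's input data, in bytes
    data_location: str  # name of the scheduler that holds the data


@dataclass
class Scheduler:
    name: str
    threshold: int  # hypothetical cutoff separating "large" from "small" data
    dedicated: deque = field(default_factory=deque)  # never stolen from
    shared: deque = field(default_factory=deque)     # open to work stealing

    def submit(self, task: Task) -> None:
        # Assumption: pin a task only when its data is both local and large,
        # since moving it would incur the data-transfer cost.
        if task.data_location == self.name and task.data_size > self.threshold:
            self.dedicated.append(task)
        else:
            self.shared.append(task)

    def steal_from(self, victim: "Scheduler") -> list:
        # Assumption: take half of the victim's shared tasks, from the tail.
        count = len(victim.shared) // 2
        stolen = [victim.shared.pop() for _ in range(count)]
        self.shared.extend(stolen)
        return stolen


busy = Scheduler(name="A", threshold=100)
idle = Scheduler(name="B", threshold=100)
for task in [
    Task("t1", 500, "A"),  # large and local: pinned to A
    Task("t2", 10, "A"),
    Task("t3", 500, "B"),  # large but remote: left stealable
    Task("t4", 50, "A"),
    Task("t5", 20, "A"),
]:
    busy.submit(task)

stolen = idle.steal_from(busy)
```

In this sketch the pinned task `t1` stays with scheduler `A` regardless of load, so stealing only moves tasks that are cheap to relocate or whose data already lives elsewhere.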