Publication | Closed Access
On the Scheduling of Checkpoints in Desktop Grids
Citations: 30
References: 18
Year: 2011
Unknown Venue
Cluster Computing, Engineering, Desktop Grids, Check Pointing Costs, Operations Research, Systems Engineering, Parallel Computing, Job Scheduler, Cloud Scheduling, Computer Engineering, Scheduling (Computing), Distributed Systems, Computer Science, High Availability Software, Distributed Computing, Scheduling Problem, Edge Computing, Frequent Resources Failures, Cloud Computing, Automation, Grid Computing, Parallel Programming
Frequent resource failures are a major challenge for the rapid completion of batch jobs. Checkpointing and migration is one approach to accelerating job completion while avoiding deadlock. We study the problem of scheduling checkpoints of sequential jobs in the context of Desktop Grids, which consist of volunteered, distributed resources. We craft a checkpoint scheduling algorithm that is provably optimal in discrete time when failures obey any general probability distribution. Using simulations with parameters based on real-world systems, we show that this optimal strategy scales and significantly outperforms other strategies in terms of checkpointing costs and batch completion times.
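
To give a concrete feel for what scheduling checkpoints in discrete time involves, here is a minimal sketch. It is not the algorithm from the paper: it assumes a much simpler memoryless failure model (each time step fails independently with probability `p`), whereas the abstract covers any general failure distribution. It also assumes that a failure rolls the job back to its last checkpoint with no extra recovery delay, and that the final segment is checkpointed too. The names `expected_steps` and `plan_checkpoints` and all parameters are hypothetical.

```python
def expected_steps(n: int, p: float) -> float:
    """Expected number of time steps needed to obtain n consecutive
    failure-free steps, when each step fails independently with
    probability p and a failure restarts the run of n steps."""
    if p == 0.0:
        return float(n)
    q = 1.0 - p
    # Classic result for the waiting time until n consecutive successes.
    return (q ** -n - 1.0) / p


def plan_checkpoints(work: int, ckpt_cost: int, p: float):
    """Dynamic program over the remaining work (in time steps).

    best[w] is the minimum expected time to finish w units of work,
    where each segment of k work units is followed by a checkpoint
    that takes ckpt_cost steps and can itself be hit by a failure.
    Returns (expected total time, list of segment lengths).
    """
    best = [0.0] * (work + 1)
    first = [0] * (work + 1)  # length of the first segment for each w
    for w in range(1, work + 1):
        best[w], first[w] = min(
            (expected_steps(k + ckpt_cost, p) + best[w - k], k)
            for k in range(1, w + 1)
        )
    # Walk the stored choices to recover the checkpoint schedule.
    segments = []
    w = work
    while w > 0:
        segments.append(first[w])
        w -= first[w]
    return best[work], segments
```

For example, `plan_checkpoints(10, 1, 0.2)` returns the expected completion time together with the segment lengths between checkpoints. With no failures (`p = 0.0`) the plan collapses to a single segment, since every extra checkpoint only adds cost. The dynamic program is valid here only because the failure model is memoryless; an age-dependent failure distribution would need a richer state than the remaining work alone.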