Publication | Closed Access
Evaluation of a Workflow Scheduler Using Integrated Performance Modelling and Batch Queue Wait Time Prediction
Citations: 32
References: 15
Year: 2006
Unknown Venue
Cluster Computing, Engineering, Industrial Engineering, Computer Architecture, Operations Research, Systems Engineering, Parallel Computing, Workflow Scheduler, Performance Prediction, Job Scheduler, Cloud Scheduling, Computer Engineering, Batch Queues, Scheduling (Computing), Distributed Systems, Computer Science, Queueing Systems, Distributed Processing, Workflow Execution, HPC Users, Cloud Computing, Performance Modeling, Parallel Programming, Scheduling (Project Management)
Large-scale distributed systems offer computational power at unprecedented levels. In the past, HPC users typically had access to relatively few individual supercomputers and, in general, would assign a one-to-one mapping of applications to machines. Modern HPC users have simultaneous access to a large number of individual machines and are beginning to make use of all of them for single-application execution cycles. One method that application developers have devised to take advantage of such systems is to organize an entire application execution cycle as a workflow. The scheduling of such workflows has been the topic of a great deal of research in the past few years. Although very sophisticated algorithms have been devised, one specific aspect of these distributed systems has so far been omitted from consideration, presumably because it is difficult to model accurately: most supercomputing resources employ batch queue scheduling software. In this work, we augment an existing workflow scheduler with methods that accurately predict both the performance of the application on specific hardware and the amount of time individual workflow tasks would spend waiting in batch queues. Our results show that although a workflow scheduler alone may choose correct task placement based on data locality or network connectivity, this benefit is often compromised because most jobs submitted to current systems must wait in overcommitted batch queues for a significant amount of time. Incorporating the enhancements we describe, however, improves workflow execution time in settings where batch queues impose significant delays on constituent workflow tasks.
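The placement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' scheduler: the `Resource` fields, the function names, and the choice to simply add the three time estimates are all assumptions made for illustration. The sketch compares a placement driven only by data locality with one that also accounts for predicted queue wait and predicted run time.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Resource:
    """A candidate machine with hypothetical, precomputed estimates in seconds."""
    name: str
    predicted_runtime: float     # assumed output of a performance model for this hardware
    predicted_queue_wait: float  # assumed output of a batch queue wait time predictor
    transfer_time: float         # assumed time to stage the task's input data


def estimated_turnaround(resource: Resource) -> float:
    """Sum staging, queue wait, and run time (assumes the phases do not overlap)."""
    return (resource.transfer_time
            + resource.predicted_queue_wait
            + resource.predicted_runtime)


def place_task(candidates: Sequence[Resource]) -> Resource:
    """Pick the resource with the smallest estimated turnaround time."""
    if not candidates:
        raise ValueError("no candidate resources")
    return min(candidates, key=estimated_turnaround)


def place_task_locality_only(candidates: Sequence[Resource]) -> Resource:
    """Baseline: pick the resource with the cheapest data staging, ignoring queues."""
    if not candidates:
        raise ValueError("no candidate resources")
    return min(candidates, key=lambda r: r.transfer_time)


# Illustrative numbers only; they are not taken from the paper.
candidates = [
    Resource("near", predicted_runtime=100.0, predicted_queue_wait=3600.0, transfer_time=10.0),
    Resource("far", predicted_runtime=120.0, predicted_queue_wait=60.0, transfer_time=300.0),
]
queue_aware_choice = place_task(candidates)
locality_choice = place_task_locality_only(candidates)
```

With these made-up inputs the locality-only baseline selects `near`, whose data is close but whose queue is congested, while the queue-aware placement selects `far`. A real scheduler would also have to handle prediction error and dependencies between workflow tasks, which this sketch ignores.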