Publication | Open Access
Merge or Separate?
Citations: 34
References: 25
Year: 2017
Venue: unknown
Cluster Computing, Heterogeneous Computing, Engineering, Computer Architecture, System Integration, GPU Computing, Compute Kernel, High-performance Architecture, Data Integration, Parallel Computing, OpenCL Kernels, GPU Accelerators, Computer Engineering, Scheduling (Computing), Computer Science, Coordinated Effects, Edge Computing, Partition (Database), Data Exchange, Cloud Computing, Parallel Programming, System Software, Runtime Framework
Computer systems are increasingly heterogeneous, with nodes consisting of CPUs and GPU accelerators. As such systems become mainstream, they move away from specialized, high-performance, single-application platforms to a more general setting with multiple concurrent application jobs. Determining how best to schedule jobs dynamically on heterogeneous devices is non-trivial. In some cases, performance is maximized if jobs are allocated to a single device; in others, sharing is preferable. In this paper, we present a runtime framework that schedules multi-user OpenCL tasks to their most suitable device in a CPU/GPU system. We use a machine learning-based predictive model at runtime to decide whether to merge OpenCL kernels or schedule them separately on the most appropriate devices, without the need for ahead-of-time profiling. We evaluate our approach over a wide range of workloads on two separate platforms. We consistently show significant performance and turn-around time improvements over the state of the art across programs, workloads, and platforms.
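The merge-or-separate decision described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `KernelFeatures` fields, the hand-written thresholds in `predict_device` (a toy stand-in for the learned predictive model), and the rule that two kernels are merged when they prefer the same device.

```python
from dataclasses import dataclass


@dataclass
class KernelFeatures:
    """Hypothetical per-kernel features; the paper's real feature set is not given here."""
    name: str
    compute_ops: int   # estimated arithmetic operations
    memory_bytes: int  # bytes moved to and from the device
    work_items: int    # global work size


def predict_device(k: KernelFeatures) -> str:
    """Toy stand-in for a trained model: fixed thresholds instead of learned weights."""
    intensity = k.compute_ops / max(k.memory_bytes, 1)
    # Assumption: compute-heavy, highly parallel kernels favour the GPU.
    return "GPU" if intensity >= 4.0 and k.work_items >= 1024 else "CPU"


def schedule(a: KernelFeatures, b: KernelFeatures) -> dict:
    """Merge two kernels if they prefer the same device, else place them separately."""
    dev_a, dev_b = predict_device(a), predict_device(b)
    if dev_a == dev_b:
        return {"action": "merge", "device": dev_a, "kernels": [a.name, b.name]}
    return {"action": "separate", "placement": {a.name: dev_a, b.name: dev_b}}


if __name__ == "__main__":
    matmul = KernelFeatures("matmul", 8_000_000, 1_000_000, 4096)
    copy = KernelFeatures("copy", 1_000, 1_000_000, 256)
    print(schedule(matmul, copy))
```

In a real system the thresholds would be replaced by a model trained offline and queried at runtime, and the decision would drive actual OpenCL command queues rather than return a dictionary.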