Publication | Closed Access
Parallelism-Aware Batch Scheduling
Year: 2008 | Citations: 469 | References: 41
Cluster Computing, Engineering, Computer Architecture, Multithreading (Computer Architecture), Hardware Systems, DRAM System, High-performance Architecture, Computing Systems, Systems Engineering, Parallel Computing, Manycore Processor, Job Scheduler, Computer Engineering, Parallelism-aware Batch Scheduling, Task Parallelism, Scheduling (Computing), Computer Science, Parallelism-aware DRAM Scheduling, DRAM Controller, Parallel Performance Evaluation, Many-core Architecture, Parallel Programming, Asynchronous Systems
In chip-multiprocessor systems the cores share the DRAM system, so one thread's requests can cause bank, bus, and row-buffer conflicts that serialize other threads' requests and destroy their bank-level parallelism, degrading both fairness and throughput. This work proposes a new shared DRAM controller that delivers quality of service to threads while also improving system throughput. The proposed parallelism-aware batch scheduler (PAR-BS) processes requests in batches to ensure fairness and avoid starvation, and applies a parallelism-aware policy that services each thread's requests in parallel across the DRAM banks to reduce that thread's memory stall time. It also supports system-level thread priorities and differentiated service levels. PAR-BS is compared against four prior schedulers on 4-, 8-, and 16-core systems; averaged over 100 4-core workloads, it improves fairness by 1.11× and system throughput by 8.3% over the best prior scheduler, Stall-Time Fair Memory (STFM) scheduling, and is simpler to implement.
In a chip-multiprocessor (CMP) system, the DRAM system is shared among cores. In a shared DRAM system, requests from a thread can not only delay requests from other threads by causing bank/bus/row-buffer conflicts but they can also destroy other threads' DRAM-bank-level parallelism. Requests whose latencies would otherwise have been overlapped could effectively become serialized. As a result both fairness and system throughput degrade, and some threads can starve for long time periods.

This paper proposes a fundamentally new approach to designing a shared DRAM controller that provides quality of service to threads, while also improving system throughput. Our parallelism-aware batch scheduler (PAR-BS) design is based on two key ideas. First, PAR-BS processes DRAM requests in batches to provide fairness and to avoid starvation of requests. Second, to optimize system throughput, PAR-BS employs a parallelism-aware DRAM scheduling policy that aims to process requests from a thread in parallel in the DRAM banks, thereby reducing the memory-related stall-time experienced by the thread. PAR-BS seamlessly incorporates support for system-level thread priorities and can provide different service levels, including purely opportunistic service, to threads with different priorities.

We evaluate the design trade-offs involved in PAR-BS and compare it to four previously proposed DRAM scheduler designs on 4-, 8-, and 16-core systems. Our evaluations show that, averaged over 100 4-core workloads, PAR-BS improves fairness by 1.11X and system throughput by 8.3% compared to the best previous scheduling technique, Stall-Time Fair Memory (STFM) scheduling. Based on simple request prioritization rules, PAR-BS is also simpler to implement than STFM.
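
The abstract names the two key ideas (batching and parallelism-aware scheduling) but does not spell out the exact rules. The following is a minimal Python sketch of how they could fit together, under stated assumptions: a batch is formed by marking up to `marking_cap` of the oldest requests per thread per bank, threads are ranked shortest-job-first within a batch, and each bank prefers marked requests, then row-buffer hits, then higher-ranked threads, then older requests. The cap value, the ranking heuristic, the row-hit preference, and all names (`Request`, `form_batch`, `rank_threads`, `pick_next`) are illustrative assumptions, not details taken from the text above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    thread: int           # issuing thread (core) id
    bank: int             # target DRAM bank
    row: int              # target row within the bank
    arrival: int          # arrival time; smaller means older
    marked: bool = False  # True once the request belongs to the current batch


def form_batch(queue, marking_cap=5):
    """Idea 1 (batching): mark the oldest requests as the current batch.

    Assumption: at most `marking_cap` requests per (thread, bank) are marked,
    so no single thread can fill a batch on its own.
    """
    counts = defaultdict(int)
    for req in sorted(queue, key=lambda r: r.arrival):
        key = (req.thread, req.bank)
        if counts[key] < marking_cap:
            req.marked = True
            counts[key] += 1


def rank_threads(queue):
    """Idea 2 (parallelism awareness): one thread ranking shared by all banks.

    Assumption: shortest job first. A thread whose marked requests put the
    least load on any single bank ranks highest; ties go to the smaller total.
    Returns a dict mapping thread id to rank, where 0 is the highest rank.
    """
    load = defaultdict(lambda: defaultdict(int))
    for req in queue:
        if req.marked:
            load[req.thread][req.bank] += 1

    def job_length(thread):
        per_bank = load[thread].values()
        return (max(per_bank), sum(per_bank), thread)

    ordered = sorted(load, key=job_length)
    return {thread: rank for rank, thread in enumerate(ordered)}


def pick_next(queue, bank, open_row, ranks):
    """Choose the next request for one bank using simple priority rules."""
    candidates = [r for r in queue if r.bank == bank]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (
            not r.marked,                     # 1. marked (batched) requests first
            r.row != open_row,                # 2. then row-buffer hits
            ranks.get(r.thread, len(ranks)),  # 3. then higher-ranked threads
            r.arrival,                        # 4. then oldest first
        ),
    )
```

Because every bank consults the same thread ranking, the requests of a highly ranked thread tend to be serviced in different banks at about the same time, which is the overlap the abstract describes. Unmarked requests always lose to marked ones, so a request that arrives late cannot indefinitely delay the current batch.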