Concepedia

TLDR

Power consumption is a primary concern in HPC. DVFS and DCT are common software knobs for reducing dynamic power, but few studies have explored their synergistic integration, and to the authors' knowledge none has developed application-aware simultaneous controllers in real systems. The study introduces a multi-dimensional online performance predictor to simultaneously optimize DVFS and DCT at runtime on multi-core systems. The predictor is implemented as a runtime library linked to Intel OpenMP and tested on a dual-processor quad-core system. The framework derives near-optimal DVFS/DCT settings, yielding mean reductions of 19% in energy and 40% in ED2, through 6% power savings and 14% performance gains. It outperforms prior approaches that adapt only DVFS or only DCT, as well as one that applies DCT then DVFS sequentially, and it compares favorably with heuristic search methods.

Abstract

Power has become a primary concern for HPC systems. Dynamic voltage and frequency scaling (DVFS) and dynamic concurrency throttling (DCT) are two software tools (or knobs) for reducing the dynamic power consumption of HPC systems. To date, few works have considered the synergistic integration of DVFS and DCT in performance-constrained systems, and, to the best of our knowledge, no prior research has developed application-aware simultaneous DVFS and DCT controllers in real systems and parallel programming frameworks. We present a multi-dimensional, online performance predictor, which we deploy to address the problem of simultaneous runtime optimization of DVFS and DCT on multi-core systems. We present results from an implementation of the predictor in a runtime library linked to the Intel OpenMP environment and running on an actual dual-processor quad-core system. We show that our predictor derives near-optimal settings of the power-aware program adaptation knobs that we consider. Our overall framework achieves significant reductions in energy (19% mean) and ED2 (40% mean), through simultaneous power savings (6% mean) and performance improvements (14% mean). We also find that our framework outperforms earlier solutions that adapt only DVFS or DCT, as well as one that sequentially applies DCT then DVFS. Further, our results indicate that prediction-based schemes for runtime adaptation compare favorably and typically improve upon heuristic search-based approaches in both performance and energy savings.
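As a rough illustration of what simultaneous optimization of the two knobs means, the sketch below scores every (frequency, thread count) pair jointly and picks the one with the lowest ED2, taken here as energy times delay squared. It is not the paper's predictor: `best_config`, `toy_time`, and `toy_power` are hypothetical names, and the toy models and their coefficients are made-up assumptions standing in for whatever a real online predictor would supply.

```python
from itertools import product


def energy(power_w, time_s):
    """Energy in joules for a phase running at constant power."""
    return power_w * time_s


def ed2(power_w, time_s):
    """Energy-delay-squared: E * D^2, which equals P * T^3."""
    return energy(power_w, time_s) * time_s ** 2


def best_config(frequencies_ghz, thread_counts, predict_time, predict_power,
                metric=ed2):
    """Search the joint DVFS x DCT space and return the (freq, threads)
    pair that minimizes the metric under the supplied predictors."""
    candidates = product(frequencies_ghz, thread_counts)
    return min(
        candidates,
        key=lambda cfg: metric(predict_power(*cfg), predict_time(*cfg)),
    )


# Hypothetical stand-in models (assumptions, not measured data).
def toy_time(freq_ghz, threads):
    serial = 2.0 / freq_ghz                 # part that does not parallelize
    parallel = 8.0 / (freq_ghz * threads)   # part that scales with threads
    contention = 0.15 * threads             # shared-resource penalty
    return serial + parallel + contention


def toy_power(freq_ghz, threads):
    static = 40.0                               # frequency-independent power
    dynamic = 3.0 * threads * freq_ghz ** 3     # crude cubic scaling
    return static + dynamic


if __name__ == "__main__":
    freqs = [1.6, 2.0, 2.4]
    threads = [1, 2, 4, 8]
    print(best_config(freqs, threads, toy_time, toy_power))
```

Because both knobs are scored together, the search can settle on combinations that a sequential scheme (fix the thread count first, then tune frequency) might never visit. An exhaustive scan is only practical here because the configuration space is small.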
