
Abstract

Modern high-performance computing (HPC) systems, including supercomputers, routinely suffer from substantial performance variation. The same application run with the same input can vary in performance by more than 100%, and such variation reduces efficiency and wastes resources. Recent studies have examined performance variability and designed automated methods for diagnosing the "anomalies" that cause it. These studies either observe data collected from HPC systems or rely on synthetic reproduction of performance variability scenarios. However, there is no standardized way of creating synthetic anomalies that induce performance variability, so researchers resort to ad-hoc methods for reproducing it.
