Concepedia

Publication | Closed Access

Exploring the relationship between parallel application run-time variability and network performance in clusters

Citations: 26

References: 25

Year: 2004

Abstract

Highly variable parallel application execution time is a persistent problem in cluster computing environments, and it can be particularly acute in systems composed of networks of workstations (NOWs). We examine this problem in terms of consistency, focusing in particular on network performance. As a step toward using techniques from fault management to attain consistency, this paper presents our preliminary analysis of run-time variability from logs and experiments, exposing important issues related to systemic inconsistency in NOW clusters. A characterization of application sensitivity can be used to set network performance goals, thereby defining operational requirements. Network performance depends on the virtual topology imposed by the scheduler's allocation of nodes and on the communication patterns of the set of running applications. It is therefore important to treat both the network and the cluster's centralized node mapper (scheduler) as critical subsystems.
