Publication | Closed Access
Exploring the relationship between parallel application run-time variability and network performance in clusters
Citations: 26
References: 25
Year: 2004
Venue: Unknown
Cluster Computing, Engineering, Computer Architecture, Network Analysis, Fault Tolerance, Performance Issue, Cluster Technology, Application Sensitivity, Systems Engineering, Network Performance, Parallel Computing, Systemic Inconsistency, Computer Science, Performance Analysis Tool, Distributed Processing, Performance Scalability, High Availability Software, Distributed Computing, Edge Computing, Parallel Performance Evaluation, Cloud Computing, Parallel Programming
Highly variable parallel application execution time is a persistent issue in cluster computing environments, and can be particularly acute in systems composed of networks of workstations (NOWs). We examine this issue in terms of consistency, focusing in particular on network performance. As a step toward using techniques from fault management to attain consistency, this paper presents our preliminary analysis of run-time variability from logs and experiments, exposing important issues related to systemic inconsistency in NOW clusters. The characterization of application sensitivity can be used to set network performance goals, thereby defining operational requirements. Network performance depends on the virtual topology imposed by the scheduler's allocation of nodes and on the communication patterns of the set of running applications. Therefore, it is important to look at both the network and the cluster's centralized node mapper (scheduler) as critical subsystems.