Publication | Closed Access
DataStager
Citations: 154
References: 28
Year: 2009
Unknown Venue
Cluster Computing, Engineering, Computer Architecture, High Performance Computing, Data Science, Data-intensive Platform, Data Integration, System Software, Parallel Computing, Data Management, Computer Engineering, Computer Science, Data-intensive Computing, Storage Virtualization, Petascale Machines, Cloud Computing, Parallel Programming, Petascale Machine, In-storage Computing, Flexible 'Datastager
Two known challenges for petascale machines are that (1) the costs of I/O for high performance applications can be substantial, especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible 'DataStager' framework for data staging, along with alternative services within it that jointly address (1) and (2). Data staging services, which move output data from compute nodes to staging or I/O nodes prior to storage, reduce the I/O overheads on applications' total processing times, and explicit management of data staging reduces perturbation when extracting output data from a petascale machine's compute partition. Experimental evaluations of DataStager on the Cray XT machine at Oak Ridge National Laboratory, using the GTC fusion modeling code and benchmarks running on 1000+ processors, establish both the necessity of intelligent data staging and the high performance of our approach.
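The general idea of staging output before storage can be illustrated with a minimal sketch. This is not the DataStager implementation or its API: the `StagingArea` class, the `run_simulation` function, and the in-memory `storage` list are hypothetical names chosen for illustration, and a background thread stands in for a separate staging node. The compute loop hands a snapshot of its output to a bounded buffer and continues, while the buffer is drained to storage asynchronously.

```python
import queue
import threading


class StagingArea:
    """Hypothetical stand-in for a staging node (not DataStager's API).

    Buffers output handed over by the compute loop and drains it to
    'storage' on a background thread.
    """

    def __init__(self, capacity=4):
        # Bounded buffer: the compute loop only blocks when staging is full.
        self._queue = queue.Queue(maxsize=capacity)
        self.storage = []  # stands in for a parallel file system
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def stage(self, step, data):
        # Snapshot the buffer so the compute loop can keep mutating its state.
        self._queue.put((step, bytes(data)))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: no more output
                break
            self.storage.append(item)  # stands in for a slow write

    def close(self):
        # Signal the end of output and wait until everything is drained.
        self._queue.put(None)
        self._worker.join()


def run_simulation(steps, staging):
    """Toy compute loop that emits one output record per step."""
    state = bytearray(8)
    for step in range(steps):
        idx = step % len(state)
        state[idx] = (state[idx] + 1) % 256  # placeholder for real computation
        staging.stage(step, state)
    staging.close()
    return staging.storage


if __name__ == "__main__":
    for step, snapshot in run_simulation(5, StagingArea()):
        print(step, snapshot.hex())
```

In this sketch the bounded queue is the only point where the compute loop can stall, which loosely mirrors the goal of keeping I/O costs and perturbation away from compute nodes. How DataStager actually schedules and manages transfers is described in the paper itself, not here.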