Publication | Closed Access
Fault Tolerance in Message Passing Interface Programs
Citations: 131
References: 18
Year: 2004
Keywords: Engineering; Verification; Software Engineering; Fault Tolerance; Fault-tolerant Messaging; Software Analysis; Standard MPI Programs; Formal Verification; MPI Standard; Systems Engineering; MPI Semantics; Message Passing; Computer Engineering; Computer Science; Software Design; Fault-tolerant Network; Program Analysis; Formal Methods; Fault Injection; System Software
In this paper we examine the topic of writing fault-tolerant Message Passing Interface (MPI) applications. We discuss the meaning of fault tolerance in general and what the MPI Standard has to say about it. We survey several approaches to this problem, namely checkpointing, restructuring a class of standard MPI programs, modifying MPI semantics, and extending the MPI specification. We conclude that, within certain constraints, MPI can provide a useful context for writing application programs that exhibit significant degrees of fault tolerance.