Publication | Closed Access
On the use and implementation of message logging
Citations: 103
References: 37
Year: 2002
Venue: Unknown
Cluster Computing, Engineering, Fault Tolerance, Communication, Software Analysis, Formal Verification, Parallel Computing, Log Management, Data Management, Message Logging Protocols, Message Passing, Networked Computer Systems, Distributed Systems, Computer Science, New Protocols, Data Security, Log Analysis, Operating Systems, Distributed Computing, Program Analysis, Formal Methods, Parallel Programming, Asynchronous Systems, System Software, Message Logging
We present a number of experiments showing that for compute-intensive applications executing in parallel on clusters of workstations, message logging has higher failure-free overhead than coordinated checkpointing. Message logging protocols, however, result in much shorter output latency than coordinated checkpointing. Therefore, message logging should be used for applications involving substantial interactions with the outside world, while coordinated checkpointing should be used otherwise. We also present an unorthodox message logging design that uses coordinated checkpointing with message logging, departing from the conventional approaches that use independent checkpointing. This combination of message logging and coordinated checkpointing offers several advantages, including improved failure-free performance, bounded recovery time, simplified garbage collection, and reduced complexity. Meanwhile, the new protocols retain the advantages of the conventional message logging protocols with respect to output commit. Finally, we discuss three "lessons learned" from an implementation of various message logging protocols.
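The abstract does not spell out the protocol, so the following is only a minimal sketch of the general idea of pairing message logging with coordinated checkpointing. It is not the authors' implementation. The `Process` class, its fields, and the `coordinated_checkpoint` helper are hypothetical names chosen for illustration, and the sketch assumes deterministic processes whose checkpoint and log survive a crash (for example, on stable storage). It illustrates why garbage collection can be simple and recovery time bounded under these assumptions: once every process has checkpointed, each log can be discarded, and recovery replays only the messages received since the last checkpoint.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    """Toy deterministic process whose state is the sum of received payloads."""
    state: int = 0
    # State saved at the last coordinated checkpoint (assumed to survive a crash).
    checkpoint: int = 0
    # Messages received since that checkpoint (assumed to survive a crash).
    log: List[int] = field(default_factory=list)

    def receive(self, payload: int) -> None:
        self.log.append(payload)  # log the message, then apply it
        self.state += payload

    def take_checkpoint(self) -> None:
        self.checkpoint = self.state
        self.log.clear()  # messages before the checkpoint are no longer needed

    def recover(self) -> None:
        # Restore the checkpoint, then replay only the logged messages.
        self.state = self.checkpoint
        for payload in self.log:
            self.state += payload


def coordinated_checkpoint(processes: List[Process]) -> None:
    """Hypothetical helper: every process checkpoints in the same round."""
    for p in processes:
        p.take_checkpoint()
```

In this toy model, the amount of work done by `recover` is limited by the number of messages logged since the last call to `coordinated_checkpoint`, which is the sense in which recovery time is bounded here. A real protocol would also have to handle in-transit messages, nondeterminism, and output commit, none of which this sketch attempts.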