
Publication | Closed Access

The performance of consistent checkpointing

Citations: 355

References: 27

Year: 2003

Abstract

Consistent checkpointing provides transparent fault tolerance for long-running distributed applications. Performance measurements of an implementation of consistent checkpointing are described. The measurements show that consistent checkpointing performs remarkably well. Eight computation-intensive distributed applications were executed on a network of 16 diskless Sun-3/60 workstations, and the performance without checkpointing was compared to the performance with consistent checkpoints taken at two-minute intervals. For six of the eight applications, the running time increased by less than 1% as a result of the checkpointing. The highest overhead measured was 5.8%. Incremental checkpointing and copy-on-write checkpointing were the most effective techniques in lowering the running time overhead. It is argued that these measurements show that consistent checkpointing is an efficient way to provide fault tolerance for long-running distributed applications.
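The abstract names incremental checkpointing as one of the two techniques that most reduced overhead. As a rough illustration of the general idea only (not the implementation measured in the paper), the sketch below saves just the pages modified since the previous checkpoint. The class name, the page model, and the in-memory dictionary standing in for stable storage are all assumptions made for this example.

```python
# Hypothetical sketch of incremental checkpointing.
# Assumptions: memory is modelled as a list of fixed-size pages, a set of
# "dirty" page indices stands in for hardware dirty bits, and a dict stands
# in for stable storage. None of these names come from the paper.

class IncrementalCheckpointer:
    def __init__(self, num_pages, page_size=4):
        self.pages = [bytes(page_size) for _ in range(num_pages)]
        # Every page is dirty at the start, so the first checkpoint is a full one.
        self.dirty = set(range(num_pages))
        self.stable = {}  # page index -> most recently saved contents

    def write(self, index, data):
        """Application write: update the page and mark it as modified."""
        self.pages[index] = data
        self.dirty.add(index)

    def checkpoint(self):
        """Save only the pages modified since the previous checkpoint."""
        saved = sorted(self.dirty)
        for index in saved:
            self.stable[index] = self.pages[index]
        self.dirty.clear()
        return saved

    def restore(self):
        """Rebuild memory from the latest saved copy of each page."""
        return [self.stable[i] for i in range(len(self.pages))]


ckpt = IncrementalCheckpointer(num_pages=8)
first = ckpt.checkpoint()    # full checkpoint: all 8 pages are written
ckpt.write(2, b"abcd")
ckpt.write(5, b"wxyz")
second = ckpt.checkpoint()   # incremental checkpoint: only pages 2 and 5
```

In this toy model the second checkpoint writes two pages instead of eight, which is the kind of saving that lowers running-time overhead when an application modifies only a small part of its memory between checkpoints.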
