Concepedia

Abstract

The authors demonstrate a methodology for evaluating the fault-tolerance characteristics of operational software and illustrate it through case studies of three operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, software error characteristics are investigated by analyzing error distributions and correlation. Two levels of models are developed to analyze the error and recovery processes inside an operating system and the interactions among multiple copies of an operating system running in a distributed environment. Reward analysis is used to evaluate the loss of service due to software errors and the effect of fault-tolerant techniques implemented in the systems.< <ETX xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">&gt;</ETX>

References

YearCitations

Page 1