Publication | Closed Access
Model for Transient and Permanent Error-Detection and Fault-Isolation Coverage
Year: 1982 | Citations: 37 | References: 18
Engineering, Verification, Software Analysis, Formal Verification, Hardware Security, Reliability Engineering, Fault Analysis, Systems Engineering, Computer Technologies, Failure Detection, Reliability, Dynamic Error Detection, Computer Engineering, Fault-isolation Coverage, Computer Science, IBM 3081, Fault Management, Smart Grid, Software Testing, Industrial Informatics, Fault Detection, Fault Injection
As computer technologies advance to achieve higher performance and density, intermittent failures become more dominant than solid failures, with the result that the effectiveness of any diagnostic procedure which relies on reproducing failures is greatly reduced. This problem is solved at the system level by a new strategy of dynamic error detection and fault isolation based on error checking and analysis of captured information. The model developed in this paper allows the system designer to project the dynamic error-detection and fault-isolation coverages of the system as a function of the failure rates of components and the types and placement of error checkers, which has resulted in significant improvements to both detection and isolation in the IBM 3081 Processor Unit. The model has also resulted in new probabilistic isolation strategies based on the likelihood of failures. Our experiences with this model on several IBM products, including the 3081, show good correlation between the model and practical experiments.
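The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea, not the paper's formulation. It assumes that detection coverage can be approximated as the failure-rate-weighted fraction of components observed by at least one checker, and that probabilistic isolation ranks the components under a fired checker by their relative failure rates. The names `Component`, `detection_coverage`, `isolation_ranking`, and all component names, checker names, and numeric rates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    failure_rate: float  # failures per unit time; values below are made up


def detection_coverage(components, checkers):
    """Failure-rate-weighted share of components seen by at least one checker.

    `checkers` maps a checker name to the set of component names it observes
    (an assumed, simplified stand-in for checker type and placement).
    """
    total = sum(c.failure_rate for c in components)
    if total == 0:
        return 0.0
    covered = set().union(*checkers.values())
    return sum(c.failure_rate for c in components if c.name in covered) / total


def isolation_ranking(components, checkers, fired):
    """Rank the components under a fired checker by relative failure rate."""
    domain = checkers[fired]
    candidates = [c for c in components if c.name in domain]
    total = sum(c.failure_rate for c in candidates)
    if total == 0:
        return []
    ranked = [(c.name, c.failure_rate / total) for c in candidates]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical system: three components, two checkers.
components = [
    Component("alu", 3.0),
    Component("regs", 1.0),
    Component("bus", 1.0),
]
checkers = {
    "parity_a": {"alu", "regs"},
    "parity_b": {"regs"},
}

coverage = detection_coverage(components, checkers)
ranking = isolation_ranking(components, checkers, "parity_a")
```

In this toy setup the bus is unobserved, so moving or adding a checker that covers it would raise the projected detection coverage; the ranking suggests which unit to suspect first when `parity_a` fires.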