Publication | Closed Access
SWIFT: Software Implemented Fault Tolerance
Citations: 706
References: 32
Year: 2005
Unknown Venue
Software Maintenance, Engineering, Computer Architecture, Software Engineering, Fault Tolerance, Software Analysis, Formal Verification, Hardware Security, Reliability Engineering, Processor Designers, Systems Engineering, Fault Recovery, Itanium 2, Exceptional Fault Coverage, Runtime Verification, Computer Engineering, Computer Science, Program Analysis, Software Testing, Fault Attack, Fault Injection, System Software
This paper introduces SWIFT, a software-only transient-fault-detection technique. SWIFT manages redundancy by reclaiming unused instruction-level resources and adds an enhanced control-flow checking mechanism. An implementation on an Itanium 2 shows high fault coverage at a modest performance cost. Compared to the best known single-threaded approach that relies on an ECC memory system, SWIFT achieves a 51% average speedup.
To improve performance and reduce power, processor designers employ advances that shrink feature sizes, lower voltage levels, reduce noise margins, and increase clock rates. However, these advances make processors more susceptible to transient faults that can affect correctness. While reliable systems typically employ hardware techniques to address soft errors, software techniques can provide a lower-cost and more flexible alternative. This paper presents a novel, software-only, transient-fault-detection technique called SWIFT. SWIFT efficiently manages redundancy by reclaiming unused instruction-level resources present during the execution of most programs. SWIFT also provides a high level of protection and performance with an enhanced control-flow checking mechanism. We evaluate an implementation of SWIFT on an Itanium 2, which demonstrates exceptional fault coverage with a reasonable performance cost. Compared to the best known single-threaded approach utilizing an ECC memory system, SWIFT demonstrates a 51% average speedup.
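The abstract does not spell out the mechanism, so the following is only an illustrative sketch of the general duplicate-and-compare idea behind software-managed redundancy. It is not SWIFT's actual compiler transformation, which operates on machine instructions rather than Python code. The names `FaultDetected`, `checked_store`, and `redundant_sum`, and the `flip_at` fault-injection hook, are hypothetical and introduced here for illustration.

```python
class FaultDetected(Exception):
    """Raised when the two redundant copies of a value disagree."""


def checked_store(memory, address, value, shadow_value):
    # Compare the primary value with its redundant copy before the result
    # becomes visible in memory; a mismatch signals a transient fault.
    if value != shadow_value:
        raise FaultDetected(
            f"mismatch before store to {address!r}: {value} != {shadow_value}"
        )
    memory[address] = value


def redundant_sum(data, memory, address, flip_at=None):
    """Sum `data` in two independent accumulators and store the checked result.

    `flip_at` is a hypothetical fault-injection hook: when set, bit 0 of the
    primary accumulator is flipped right after element `flip_at` is added.
    """
    total = 0         # primary computation
    total_shadow = 0  # redundant copy of the same computation
    for i, x in enumerate(data):
        total += x
        total_shadow += x
        if flip_at is not None and i == flip_at:
            total ^= 1  # simulate a single-bit transient fault
    checked_store(memory, address, total, total_shadow)
    return total
```

In a fault-free run the two accumulators agree and the store proceeds. With an injected bit flip the comparison fails before memory is updated, so the corrupted value never leaves the computation. This sketch assumes memory itself is protected by other means, in line with the abstract's mention of an ECC memory system.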