Publication | Closed Access
Understanding the propagation of hard errors to software and implications for resilient system design
Citations: 241
References: 55
Year: 2008
Unknown Venue
Keywords: Software Maintenance, Resilient System Design, Engineering, Survivable System, Expensive Redundancy, Computer Architecture, Robustness Testing, Software Engineering, Fault Tolerance, Dependable System Architecture, Software Analysis, Formal Verification, Hardware Security, Reliability Engineering, Systems Engineering, Hard Errors, Failure Detection, Software System Safety, Hardware Reliability Solution, Computer Engineering, Computer Science, Software Design, Continued CMOS Scaling, Program Analysis, Software Testing, Formal Methods, Fault Attack, Fault Injection, System Software
With continued CMOS scaling, future shipped hardware will be increasingly vulnerable to in-the-field faults. To be broadly deployable, the hardware reliability solution must incur low overheads, precluding the use of expensive redundancy. We explore a cooperative hardware-software solution that watches for anomalous software behavior to indicate the presence of hardware faults. Fundamental to such a solution is a characterization of how hardware faults in different microarchitectural structures of a modern processor propagate through the application and OS.