
Abstract

This paper presents a new software technique for detecting transient hardware errors. The technique aims to guarantee data integrity in the presence of transient errors while also reducing energy consumption. The basic approach is to duplicate computations, and there are three ways to do so: (1) duplicating every statement in the program and comparing the results, (2) re-executing procedures through duplicated procedure calls and comparing their results, and (3) re-executing the whole program and comparing the final results. Our technique combines choices (1) and (2): given a program, it analyzes the program's procedure-call behavior and determines which procedures can have duplicated statements (choice (1)) and which procedure calls can be duplicated (choice (2)), so as to minimize energy consumption while keeping error detection latency reasonable. We simulated our technique with benchmark programs and found that it reduces energy consumption by over 25% on average compared to previous techniques that do not take energy consumption into consideration.
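
The difference between choices (1) and (2) can be illustrated with a minimal sketch. This is not the paper's implementation: the names `checked`, `dot_statement_dup`, `dot`, `call_dup`, and `TransientErrorDetected` are hypothetical, the example procedure (a dot product) is chosen only for illustration, and the sketch assumes the duplicated procedure has no side effects. It shows only the structural difference between the two forms of duplication, not the call-behavior analysis or the energy model described above.

```python
class TransientErrorDetected(Exception):
    """Raised when two redundant computations disagree (hypothetical name)."""


def checked(a, b):
    """Compare two redundant results and return the value if they agree."""
    if a != b:
        raise TransientErrorDetected(f"mismatch: {a!r} != {b!r}")
    return a


# Choice (1): each statement in the procedure is duplicated and compared.
def dot_statement_dup(xs, ys):
    total, total_dup = 0, 0
    for x, y in zip(xs, ys):
        total = total + x * y          # original statement
        total_dup = total_dup + x * y  # duplicated statement
        checked(total, total_dup)      # compared right away: short detection latency
    return total


# Choice (2): the procedure body is unmodified; the call itself is duplicated.
def dot(xs, ys):
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def call_dup(proc, *args):
    """Run a procedure twice and compare the two return values.

    Assumes proc has no side effects, so running it twice is safe.
    A mismatch is only detected after both calls have returned.
    """
    return checked(proc(*args), proc(*args))
```

In this sketch, statement-level duplication checks after every step, so a mismatch surfaces quickly but every iteration pays for a comparison. Call-level duplication performs a single comparison per call, at the cost of detecting the error only once the second call has finished. This is the kind of trade-off between overhead and detection latency that the abstract refers to.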
