Publication | Closed Access
dsReliM: Power-constrained reliability management in Dark-Silicon many-core chips under process variations
Citations: 22
References: 27
Year: 2015
Unknown Venue
Engineering, Computer Architecture, Dsrelim System, Hardware Security, Process Variations, Reliability Engineering, Systems Engineering, Parallel Computing, Manycore Processor, Dsrelim Leverages, Power-aware Design, Reliability, Electrical Engineering, Power-aware Computing, Hardware Reliability, Computer Engineering, Dark-silicon Many-core Chips, Device Reliability, Microelectronics, Power-constrained Reliability Management, Many-core Architecture, Circuit Reliability, Power-efficient Computing, Tight Power Envelope
Because of the tight power envelope, it is envisaged that in future technology nodes not all cores in a many-core chip can be powered on simultaneously at full performance level. The power-gated cores are referred to as Dark Silicon. At the same time, growing reliability issues due to process variations and soft errors challenge the cost-effective deployment of future technology nodes. This paper presents a reliability management system for Dark Silicon chips (dsReliM) that optimizes the reliability of on-chip systems while jointly accounting for soft errors, process variations, and the thermal design power (TDP) constraint. For TDP-constrained reliability optimization, dsReliM leverages multiple reliable versions of each application. These versions can execute on different cores, whose frequencies vary with process variations and which support different voltage-frequency levels, thus enabling distinct power, reliability, and performance tradeoffs at run time. Experiments show that dsReliM provides up to 20% reliability improvement under different TDP constraints when compared to a state-of-the-art technique. Compared to an ideal-case optimal solution, dsReliM deviates by up to 2.5% in terms of reliability efficiency, but speeds up the reliability management decision time by a factor of up to 3100.
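The kind of tradeoff described above, choosing one configuration per application under a power budget, can be illustrated with a small sketch. This is not the dsReliM algorithm, which the abstract does not specify; it is a generic greedy heuristic compared against exhaustive search. The `Config` type, the function names, the power and reliability numbers, and the use of a product of per-task reliabilities as the system metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import prod


@dataclass(frozen=True)
class Config:
    """Hypothetical option for one task: an application version on a core at a V-f level."""
    name: str
    power_w: float      # assumed power draw in watts
    reliability: float  # assumed probability of correct completion


def system_reliability(choice):
    """Assumed system metric: product of the per-task reliabilities."""
    return prod(c.reliability for c in choice)


def greedy_select(tasks, tdp_w):
    """Pick one config per task under the power budget (illustrative heuristic)."""
    # Start every task at its lowest-power configuration.
    choice = [min(cfgs, key=lambda c: c.power_w) for cfgs in tasks]
    used = sum(c.power_w for c in choice)
    if used > tdp_w:
        raise ValueError("even the lowest-power configurations exceed the budget")
    while True:
        best = None  # (reliability gain per extra watt, task index, config)
        for i, cfgs in enumerate(tasks):
            for c in cfgs:
                extra = c.power_w - choice[i].power_w
                gain = c.reliability - choice[i].reliability
                if gain <= 0 or used + extra > tdp_w:
                    continue  # no improvement, or it would break the budget
                score = gain / extra if extra > 0 else float("inf")
                if best is None or score > best[0]:
                    best = (score, i, c)
        if best is None:
            return choice  # no feasible upgrade is left
        _, i, c = best
        used += c.power_w - choice[i].power_w
        choice[i] = c


def exhaustive_select(tasks, tdp_w):
    """Reference search over all combinations; cost grows exponentially with task count."""
    feasible = (combo for combo in product(*tasks)
                if sum(c.power_w for c in combo) <= tdp_w)
    return max(feasible, key=system_reliability, default=None)


if __name__ == "__main__":
    # Made-up numbers: two tasks, each with a low-power and a high-reliability option.
    tasks = [
        [Config("A-v1@low", 1.0, 0.90), Config("A-v2@high", 3.0, 0.99)],
        [Config("B-v1@low", 1.0, 0.80), Config("B-v2@high", 2.0, 0.95)],
    ]
    picked = greedy_select(tasks, tdp_w=4.0)
    print([c.name for c in picked], system_reliability(picked))
```

The exhaustive search plays the role of an ideal-case reference here: it is exact but enumerates every combination, while the greedy pass only rescans the candidate list once per upgrade, which is the general reason a heuristic can trade a small loss in solution quality for a much shorter decision time.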