Concepedia

Abstract

Load instructions diminish processor performance in two ways. First, due to the continuously widening gap between CPU and memory speed, the relative latency of load instructions grows constantly and the slows program execution. Next, memory reads limit the available instruction-level parallelism as instructions that use the result of a load must wait for the memory access to complete before they can start executing. Load-value predictors alleviate both problems by allowing the CPU to speculatively continue processing without having to wait for load instructions, which can significantly improve the execution speed. In this paper, we investigate the performance of all hybrids that can be built out of a register value, a last value, a stride 2-delta, the last four values, and a finite context method predictor. Our analysis shows that hybrids can deliver 25 percent more speedup than the best single-component predictors. Our hybridization study identified the register value + stride 2-delta predictor as one of the best two-component hybrids. It matches or exceeds the speedup of two-component hybrids from the literature in spite of its substantially smaller and simpler design. Of all the predictors we studied, the register value + stride 2-delta + last four value hybrid performs best.

References

YearCitations

Page 1