The performance of runtime data cache prefetching in a dynamic optimization system
Citations: 84
References: 26
Year: 2003
Cluster Computing, Engineering, Traditional Software, Computer Architecture, Software Engineering, High-performance Architecture, Management, Runtime Cache Miss, Systems Engineering, Performance Tuning, Parallel Computing, Compilers, Performance Prediction, Data Cache, Computer Engineering, Caching, Computer Science, Program Optimization, Performance Analysis Tool, Program Analysis, Dynamic Optimization System, Parallel Performance Evaluation, Parallel Programming, System Software
Traditional software-controlled data cache prefetching is often ineffective because it lacks runtime cache miss and miss address information. To overcome this limitation, we implement runtime data cache prefetching in the dynamic optimization system ADORE (ADaptive Object code Reoptimization). We compare its performance with static software prefetching on the SPEC2000 benchmark suite, and runtime prefetching performs better: on an Itanium 2-based Linux workstation, it improves performance by more than 20% over static prefetching on some benchmarks. For benchmarks that do not benefit from prefetching, the runtime optimization system adds only 1%-2% overhead. We have also collected cache miss profiles to guide static data cache prefetching in the ORC compiler. With that information, the compiler can effectively avoid generating prefetches for loops that hit well in the data cache.
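The profile-guided idea in the last two sentences can be illustrated with a minimal sketch. It is not the ORC compiler's actual logic: the `LoopProfile` record, the `loops_to_prefetch` helper, and the 5% miss-rate threshold are all hypothetical, chosen only to show how a per-loop cache miss profile might be used to skip loops that already hit well in the data cache.

```python
from dataclasses import dataclass


@dataclass
class LoopProfile:
    """Hypothetical per-loop cache profile; field names are illustrative."""
    name: str
    loads: int   # data cache accesses observed in the loop
    misses: int  # data cache misses observed in the loop

    @property
    def miss_rate(self) -> float:
        # Loops that never ran during profiling are treated as miss-free.
        return self.misses / self.loads if self.loads else 0.0


def loops_to_prefetch(profiles, min_miss_rate=0.05):
    """Return names of loops whose miss rate meets the assumed threshold."""
    return [p.name for p in profiles if p.miss_rate >= min_miss_rate]


profiles = [
    LoopProfile("loop_a", loads=1_000_000, misses=120_000),  # misses often
    LoopProfile("loop_b", loads=1_000_000, misses=500),      # hits well
    LoopProfile("loop_c", loads=0, misses=0),                # never executed
]
selected = loops_to_prefetch(profiles)
```

Under these assumed numbers only `loop_a` is selected, so no prefetch instructions would be generated for the loops that rarely miss. A real system would likely weigh miss latency and loop trip counts as well; the single threshold here is a simplification.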