Publication | Closed Access
The Design Space of Data-Parallel Memory Systems
Citations: 18
References: 14
Year: 2006
Unknown Venue
Throughput Degradation, Engineering, Parallel Software, Shared Memory, High-performance Architecture, Computer Engineering, Computer Architecture, Parallel Storage, Parallel Programming, Computer Science, Design Space, Parallel Computing, Data-level Parallelism, Memory Architecture, Data-parallel Memory Systems, DRAM Bandwidth
Data-parallel memory systems must maintain a large number of outstanding memory references to fully use increasing DRAM bandwidth in the presence of rising latencies. Additionally, throughput is increasingly sensitive to reference patterns due to the rising latency of issuing DRAM commands, switching between reads and writes, and precharging/activating internal DRAM banks. We study the design space of data-parallel memory systems in light of these trends of increasing concurrency, latency, and sensitivity to access patterns. We perform a detailed performance analysis of scientific and multimedia applications and micro-benchmarks, varying DRAM parameters and the memory-system configuration. We identify the interference between concurrent read and write memory-access threads, and bank conflicts, both within a single thread and across multiple threads, as the most critical factors affecting performance. We then develop hardware techniques to minimize throughput degradation. We advocate either relying on multiple concurrent accesses from a single memory-reference thread only, while sacrificing load-balance, or introducing new hardware to maintain both locality of reference and load-balance between multiple DRAM channels with multiple threads. We show that a low-cost configuration with only 16 channel-buffer entries achieves over 80% of peak throughput in most cases.
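
As a rough illustration of why bank conflicts across threads can degrade throughput, the following is a minimal sketch of a toy open-row DRAM model. It is not the paper's simulator: the function name `simulate`, the address mapping, and all parameters (`num_banks`, `row_size`, `t_hit`, `t_miss`) are assumptions chosen only for illustration, and the model serializes accesses and ignores bank-level parallelism and read/write turnaround.

```python
def simulate(addresses, num_banks=8, row_size=64, t_hit=1, t_miss=10):
    """Return total cycles to serve `addresses` in order under a toy open-row model.

    Hypothetical mapping: consecutive rows are interleaved across banks.
    An access to the row already open in its bank costs t_hit cycles;
    any other access costs t_miss (precharge + activate) and opens that row.
    """
    open_rows = {}  # bank index -> row currently open in that bank
    cycles = 0
    for addr in addresses:
        row_global = addr // row_size
        bank = row_global % num_banks
        row = row_global // num_banks
        if open_rows.get(bank) == row:
            cycles += t_hit
        else:
            cycles += t_miss
            open_rows[bank] = row
    return cycles


# Two hypothetical reference threads, each sequential within one row,
# but mapped to different rows of the same bank (bank 0).
thread_a = list(range(0, 64))
thread_b = list(range(512, 576))

# One thread at a time: each thread pays one row miss, then hits.
one_at_a_time = simulate(thread_a + thread_b)

# Fine-grained interleaving: the threads evict each other's row on every access.
interleaved = simulate([x for pair in zip(thread_a, thread_b) for x in pair])

print(one_at_a_time, interleaved)
```

In this toy setting the interleaved order pays the row-miss cost on every access, while serving one thread at a time preserves row locality. That loosely mirrors the trade-off described above between using a single memory-reference thread and adding hardware to keep locality with multiple threads.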