Concepedia

Abstract

To feed the high degrees of parallelism in modern graphics processors and manycore CPU designs, DRAM manufacturers have created new DRAM architectures that deliver high bandwidth. This paper presents a simulation-based study of the most common forms of DRAM today: DDR3, DDR4, and LPDDR4 SDRAM; GDDR5 SGRAM; and two recent 3D-stacked architectures, High Bandwidth Memory (HBM1, HBM2) and Hybrid Memory Cube (HMC1, HMC2). Our simulations give both time and power/energy results and reveal several things: (a) current multi-channel DRAM technologies have succeeded in translating bandwidth into better execution time for all applications, turning memory-bound applications into compute-bound ones; (b) the inherent parallelism in the memory system is the critical enabling factor (high bandwidth alone is insufficient); (c) while all current DRAM architectures have addressed the memory-bandwidth problem, the memory-latency problem remains, dominated by queuing delays arising from lack of parallelism; and (d) the majority of power and energy is spent in the I/O interface, driving bits across the bus; DRAM-specific overhead beyond bandwidth has been reduced significantly, which is great news (an ideal memory technology would dissipate power only in bandwidth; all else would be free).
