Publication | Closed Access
MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing
Citations: 24
References: 183
Year: 2024
Venue: Unknown
Engineering, Computer Architecture, DRAM Array, Parallel Algorithms, Multi-channel Memory Architecture, Hardware Security, High-performance Architecture, SIMD Parallelism, Parallel Computing, Compilers, Computer Engineering, Computer Science, Memory Architecture, External-memory Algorithm, Source MIMDRAM, Program Analysis, End-to-end Processing-using-DRAM System, Many-core Architecture, Parallel Programming, Data-level Parallelism
Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a DRAM array's massive internal parallelism to execute very wide (e.g., 16,384- to 262,144-bit-wide) data-parallel operations in a single-instruction multiple-data (SIMD) fashion. However, the large and rigid granularity of DRAM rows limits the effectiveness and applicability of PUD in three ways. First, since applications have varying degrees of SIMD parallelism (which is often smaller than the DRAM row granularity), PUD execution often leads to underutilization, throughput loss, and energy waste. Second, due to the high area cost of implementing interconnects that connect columns in a wide DRAM row, most PUD architectures are limited to the execution of parallel map operations, where a single operation is performed over equally-sized input and output arrays. Third, the need to feed the wide DRAM row with tens of thousands of data elements, combined with the lack of adequate compiler support for PUD systems, creates a programmability barrier, since programmers need to manually extract SIMD parallelism from an application and map computation to the PUD hardware. Our goal is to design a flexible PUD system that overcomes the limitations caused by the large and rigid granularity of PUD. To this end, we propose MIMDRAM, a hardware/software co-designed PUD system that introduces new mechanisms to allocate and control only the necessary resources for a given PUD operation. The key idea of MIMDRAM is to leverage fine-grained DRAM (i.e., the ability to independently access smaller segments of a large DRAM row) for PUD computation. MIMDRAM exploits this key idea to enable a multiple-instruction multiple-data (MIMD) execution model in each DRAM subarray (and SIMD execution within each DRAM row segment). We evaluate MIMDRAM using twelve real-world applications and 495 multi-programmed application mixes.
Our evaluation shows that MIMDRAM provides 34× the performance, 14.3× the energy efficiency, 1.7× the throughput, and 1.3× the fairness of a state-of-the-art PUD framework, along with 30.6× and 6.8× the energy efficiency of a high-end CPU and GPU, respectively. MIMDRAM adds small area cost to a DRAM chip (1.11%) and CPU die (0.6%). We hope and believe that MIMDRAM's ideas and results will help to enable more efficient and easy-to-program PUD systems. To this end, we open-source MIMDRAM at https://github.com/CMU-SAFARI/MIMDRAM.
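The first limitation above (underutilization when an application's SIMD parallelism is smaller than a DRAM row) can be illustrated with a rough back-of-the-envelope sketch. The code below is not taken from the MIMDRAM artifact. It assumes, for simplicity, that each DRAM column acts as one SIMD lane holding one data element, and the 512-column segment width and the function names are hypothetical values chosen only for illustration.

```python
import math


def row_utilization(num_elements: int, row_columns: int) -> float:
    """Fraction of SIMD lanes doing useful work when whole rows are activated.

    Assumption: one data element per DRAM column (one lane per column).
    """
    if num_elements <= 0 or row_columns <= 0:
        raise ValueError("num_elements and row_columns must be positive")
    rows_needed = math.ceil(num_elements / row_columns)
    return num_elements / (rows_needed * row_columns)


def segmented_utilization(num_elements: int, row_columns: int,
                          segment_columns: int) -> float:
    """Utilization when only the row segments that hold data are activated.

    segment_columns is a hypothetical segment width, not a figure from the paper.
    """
    if num_elements <= 0 or segment_columns <= 0:
        raise ValueError("num_elements and segment_columns must be positive")
    if row_columns % segment_columns != 0:
        raise ValueError("segment width must evenly divide the row width")
    segments_needed = math.ceil(num_elements / segment_columns)
    return num_elements / (segments_needed * segment_columns)


def free_segments(num_elements: int, row_columns: int,
                  segment_columns: int) -> int:
    """Segments of one row left over for other, independent operations."""
    if row_columns % segment_columns != 0:
        raise ValueError("segment width must evenly divide the row width")
    total = row_columns // segment_columns
    used = math.ceil(num_elements / segment_columns)
    return max(total - used, 0)


if __name__ == "__main__":
    elements = 1_000        # SIMD parallelism exposed by a hypothetical loop
    row = 65_536            # row width in columns, within the range quoted above
    segment = 512           # hypothetical segment width
    print(f"whole-row utilization: {row_utilization(elements, row):.2%}")
    print(f"segmented utilization: {segmented_utilization(elements, row, segment):.2%}")
    print(f"segments left for other operations: {free_segments(elements, row, segment)}")
```

Under these assumptions, a 1,000-element operation occupies 1,000 of 65,536 lanes (about 1.5%) when a whole row is activated, but 1,000 of 1,024 lanes (about 97.7%) when only two 512-column segments are activated. The remaining 126 segments are what a MIMD execution model could, in principle, assign to other operations.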