Publication | Closed Access
An 8-b-Precision 6T SRAM Computing-in-Memory Macro Using Segmented-Bitline Charge-Sharing Scheme for AI Edge Chips
Citations: 47
References: 40
Year: 2022
Engineering, VLSI Design, Energy Efficiency, Emerging Memory Technology, Computer Architecture, Integrated Circuits, Hardware Systems, Multi-channel Memory Architecture, Hardware Security, Computing Systems, Memory Devices, Parallel Computing, Novel SRAM-CIM Structure, 8-b-Precision 6T, Electrical Engineering, Computer Engineering, Computer Science, Microelectronics, Memory Architecture, MAC Operations, VLSI Architecture, Edge Computing, AI Edge Chips, In-memory Computing
Advances in static random access memory (SRAM) computing-in-memory (CIM) devices aim to increase capacity while improving energy efficiency (EF) and reducing computing latency ($T_{\mathrm{AC}}$). This work presents a novel SRAM-CIM structure using: 1) a segmented-bitline charge-sharing (SBCS) scheme for multiply-and-accumulate (MAC) operations with low energy consumption and a consistently high signal margin across MAC values; 2) a bitline-combining (BL-CMB) scheme to reduce the number of analog-to-digital converters (ADCs) and thereby provide options in determining a tradeoff between EF and inference accuracy; 3) a source-injection local-multiplication cell (SILMC) connected to two types of global-bitline switch to support the SBCS and BL-CMB schemes with a consistent signal margin against process variation in transistors; and 4) a prioritized-hybrid ADC to suppress area and power overhead for analog readout operations. We fabricated a 28-nm 384-kb SRAM-CIM macro using foundry-provided compact-6T cells supporting MAC operations with 16 accumulations of 8-b input and 8-b weight with near-full-precision output (20 b). This macro achieved a $T_{\mathrm{AC}}$ of 7.2 ns and an EF of 22.75 TOPS/W when performing 8-b MAC operations.
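The figures quoted in the abstract can be sanity-checked with a little arithmetic. The sketch below is a plain digital reference model, not the macro's analog charge-sharing datapath. It assumes unsigned operands, which the abstract does not specify, and all function names are hypothetical. It shows how many bits a 16-term sum of 8-b by 8-b products can occupy under that assumption, and converts the reported 22.75 TOPS/W into energy per operation.

```python
def mac(inputs, weights):
    """Reference digital multiply-and-accumulate: sum of input * weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(i * w for i, w in zip(inputs, weights))


def unsigned_mac_bits(input_bits, weight_bits, accumulations):
    """Bits needed for the largest possible MAC result.

    Assumption: operands are unsigned, so the worst case is every
    input and every weight at its maximum value.
    """
    max_input = (1 << input_bits) - 1
    max_weight = (1 << weight_bits) - 1
    max_value = max_input * max_weight * accumulations
    return max_value.bit_length()


def femtojoules_per_op(tops_per_watt):
    """Convert TOPS/W to energy per operation in femtojoules.

    1 TOPS/W is 1e12 operations per joule, so the energy per
    operation is 1 / (tops_per_watt * 1e12) joules.
    """
    return 1e15 / (tops_per_watt * 1e12)


if __name__ == "__main__":
    print(unsigned_mac_bits(8, 8, 16))           # worst-case output width
    print(mac([255] * 16, [255] * 16))           # worst-case MAC value
    print(round(femtojoules_per_op(22.75), 2))   # fJ per operation
```

Under the unsigned assumption, the worst-case sum needs 20 bits, which matches the 20-b output width quoted in the abstract. How the macro counts an "operation" for its TOPS/W figure is not stated here, so the energy-per-operation value is only a unit conversion.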