Publication | Closed Access
In-Memory Computation of a Machine-Learning Classifier in a Standard 6T SRAM Array
Citations: 505
References: 15
Year: 2017
Engineering, Machine Learning, VLSI Design, Hardware Algorithm, Computer Architecture, Integrated Circuits, In-memory Computation, Machine-learning Model, Hardware Systems, Multi-channel Memory Architecture, Pattern Recognition, Computing Systems, Parallel Computing, SRAM Array, Computer Engineering, Machine-learning Classifier, Computer Science, Memory Architecture, Semiconductor Memory, In-memory Computing
This paper presents a machine-learning classifier that performs computations directly within a standard 6T SRAM array, storing the model in the memory cells. The design uses peripheral circuits to realize mixed-signal weak classifiers in SRAM columns, and a boosting-based training algorithm combines multiple columns into a strong classifier while mitigating circuit nonidealities. A 128 × 128 prototype implemented in 130-nm CMOS achieves 90% accuracy on ten-way MNIST classification, runs at up to 300 MHz in SRAM mode and at 50 MHz in classify mode, and consumes 630 pJ per decision. That is 113× less energy than a discrete system using a standard training algorithm and 13× less than a discrete system using the proposed training algorithm.
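The energy ratios quoted above imply rough per-decision energies for the two discrete baselines. The following is a minimal arithmetic sketch; the implied values are derived here from the rounded figures in the text and are not reported in the source, and the function and variable names are illustrative.

```python
# Figures quoted in the text.
ENERGY_PER_DECISION_PJ = 630.0    # in-memory classifier, ten-way decision
RATIO_STANDARD_TRAINING = 113.0   # discrete system, standard training algorithm
RATIO_PROPOSED_TRAINING = 13.0    # discrete system, proposed training algorithm


def implied_discrete_energy_nj(ratio: float) -> float:
    """Per-decision energy implied for a discrete baseline, in nJ.

    Derived value (assumption): in-memory energy multiplied by the quoted ratio.
    """
    return ENERGY_PER_DECISION_PJ * ratio / 1000.0


standard_nj = implied_discrete_energy_nj(RATIO_STANDARD_TRAINING)
proposed_nj = implied_discrete_energy_nj(RATIO_PROPOSED_TRAINING)

print(f"discrete, standard training: ~{standard_nj:.1f} nJ per decision")
print(f"discrete, proposed training: ~{proposed_nj:.1f} nJ per decision")
```

Because the ratios are rounded, these should be read as approximate magnitudes (roughly 71 nJ and 8 nJ) rather than measured values.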
This paper presents a machine-learning classifier in which computations are performed in a standard 6T SRAM array that also stores the machine-learning model. Peripheral circuits implement mixed-signal weak classifiers via columns of the SRAM, and a training algorithm combines multiple columns through boosting to form a strong classifier while also overcoming circuit nonidealities. A prototype 128 × 128 SRAM array, implemented in a 130-nm CMOS process, demonstrates ten-way classification of MNIST images, using image-pixel features downsampled from 28 × 28 = 784 to 9 × 9 = 81, which yields a baseline accuracy of 90%. In SRAM mode (bit-cell read/write), the prototype operates at up to 300 MHz; in classify mode, it operates at 50 MHz, generating a classification every cycle. With accuracy equivalent to that of a discrete SRAM/digital-MAC system, the system achieves ten-way classification at an energy of 630 pJ per decision, 113× lower than a discrete system with a standard training algorithm and 13× lower than a discrete system with the proposed training algorithm.
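The idea of combining several column-level weak classifiers into a strong classifier through boosting can be sketched in software. The following is a minimal illustration using textbook AdaBoost on synthetic two-class data, not the paper's training algorithm. It assumes each column holds one ±1 weight per feature (one bit per cell), ignores circuit nonidealities entirely, and uses a hypothetical weak learner (`fit_column`) and a hypothetical column count (`N_COLUMNS`).

```python
import numpy as np

N_FEATURES = 81   # 9 x 9 downsampled pixels, as in the abstract
N_COLUMNS = 16    # hypothetical number of columns combined by boosting


def fit_column(X, y, sample_w):
    """Hypothetical weak learner: one column of +/-1 weights.

    Each weight is the sign of the sample-weighted correlation between a
    feature and the label. Illustrative heuristic only (assumption).
    """
    corr = (sample_w * y) @ X
    return np.where(corr >= 0, 1.0, -1.0)


def column_decisions(columns, X):
    """+/-1 decision of each column: sign of its +/-1-weighted feature sum."""
    return np.where(X @ columns.T >= 0, 1.0, -1.0)


def boost(X, y, n_columns):
    """Textbook AdaBoost over column weak classifiers."""
    sample_w = np.full(len(y), 1.0 / len(y))
    columns, alphas = [], []
    for _ in range(n_columns):
        w = fit_column(X, y, sample_w)
        pred = column_decisions(w[None, :], X)[:, 0]
        # Weighted error, clipped to keep the log finite.
        err = np.clip(sample_w[pred != y].sum(), 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        # Emphasize the samples this column got wrong.
        sample_w = sample_w * np.exp(-alpha * y * pred)
        sample_w /= sample_w.sum()
        columns.append(w)
        alphas.append(alpha)
    return np.array(columns), np.array(alphas)


def strong_predict(columns, alphas, X):
    """Strong classifier: sign of the alpha-weighted vote of all columns."""
    return np.where(column_decisions(columns, X) @ alphas >= 0, 1.0, -1.0)


# Synthetic two-class data standing in for image features (assumption).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, N_FEATURES))
true_w = rng.choice([-1.0, 1.0], size=N_FEATURES)
y = np.where(X @ true_w >= 0, 1.0, -1.0)

columns, alphas = boost(X, y, N_COLUMNS)
accuracy = float(np.mean(strong_predict(columns, alphas, X) == y))
print(f"training accuracy with {N_COLUMNS} columns: {accuracy:.3f}")
```

The sketch handles a single two-class decision. Extending it to ten classes (for example with one-vs-rest groups of columns) would be a further assumption that the abstract does not spell out.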