Publication | Open Access
A highly parameterizable framework for Conditional Restricted Boltzmann Machine based workloads accelerated with FPGAs and OpenCL
Citations: 11
References: 13
Year: 2019
Engineering, Machine Learning, Neural Networks (Machine Learning), Advanced Computing, Hardware Algorithm, Computer Architecture, Social Sciences, Data Science, High-performance Architecture, Sparse Neural Network, Parameterizable Framework, Computing Systems, Embedded Machine Learning, Workload Characterization, Parallel Computing, Computer Engineering, Computer Science, Neural Networks (Computational Neuroscience), Neural Architecture Search, Model Compression, Batch Sizes, Hardware Acceleration, Parallel Programming, Artificial Neural Network
The Conditional Restricted Boltzmann Machine (CRBM) is a promising candidate for multidimensional system modeling that can learn a probability distribution over a set of data. It is a specific type of artificial neural network with one input (visible) layer and one output (hidden) layer. Recently published works demonstrate that the CRBM is a suitable mechanism for modeling multidimensional time series such as human motion, workload characterization, and city traffic analysis. Learning and inference in these systems rely on linear algebra functions such as matrix–matrix multiplication, and for larger data sets they are very compute-intensive. In this paper, we present a configurable framework for CRBM-based workloads for arbitrarily large models. We show how to accelerate the CRBM learning process with FPGAs and OpenCL, and we conduct an extensive scalability study for different model sizes and system configurations. Compared to the state-of-the-art CPU solution, acceleration with FPGAs and OpenCL yields a significant improvement in performance/Watt for large models and batch sizes (from 1.51x up to 5.71x, depending on the host configuration) and limited benefits for small models.
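To illustrate why CRBM learning is dominated by matrix–matrix multiplication, the following is a minimal NumPy sketch of one contrastive-divergence (CD-1) update on a mini-batch. It is not the paper's framework or its OpenCL kernels. It assumes a common CRBM formulation with real-valued (unit-variance Gaussian) visible units, binary hidden units, and a history vector of past visible frames that shifts both biases. The names `init_params`, `cd1_step`, `W`, `A`, `B`, `b`, `c`, and the learning rate are hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_vis, n_hid, n_hist, rng):
    """Small random weights; all names here are illustrative assumptions."""
    return {
        "W": 0.01 * rng.standard_normal((n_vis, n_hid)),   # visible <-> hidden
        "A": 0.01 * rng.standard_normal((n_hist, n_vis)),  # history -> visible
        "B": 0.01 * rng.standard_normal((n_hist, n_hid)),  # history -> hidden
        "b": np.zeros(n_vis),                              # static visible bias
        "c": np.zeros(n_hid),                              # static hidden bias
    }


def cd1_step(params, v0, hist, lr=1e-3, rng=None):
    """One CD-1 update. v0: (batch, n_vis), hist: (batch, n_hist)."""
    rng = np.random.default_rng(0) if rng is None else rng
    W, A, B, b, c = (params[k] for k in ("W", "A", "B", "b", "c"))
    batch = v0.shape[0]

    # Dynamic biases: the history conditions both layers (two matmuls).
    b_dyn = b + hist @ A
    c_dyn = c + hist @ B

    # Positive phase: hidden probabilities given the data (one matmul).
    h0_prob = sigmoid(v0 @ W + c_dyn)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(v0.dtype)

    # Negative phase: mean-field reconstruction, then hidden again (two matmuls).
    v1 = h0 @ W.T + b_dyn
    h1_prob = sigmoid(v1 @ W + c_dyn)

    # Gradient estimates: four more matmuls, each scaling with the batch size.
    grads = {
        "W": (v0.T @ h0_prob - v1.T @ h1_prob) / batch,
        "A": hist.T @ (v0 - v1) / batch,
        "B": hist.T @ (h0_prob - h1_prob) / batch,
        "b": (v0 - v1).mean(axis=0),
        "c": (h0_prob - h1_prob).mean(axis=0),
    }
    new_params = {k: params[k] + lr * grads[k] for k in params}
    recon_err = float(np.mean((v0 - v1) ** 2))
    return new_params, recon_err
```

In this sketch every step except the sigmoid and the sampling is a dense matrix product whose cost grows with the batch size and the layer widths, which is consistent with the abstract's observation that acceleration pays off mainly for large models and batch sizes.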