Publication | Closed Access
Tensaurus: A Versatile Accelerator for Mixed Sparse-Dense Tensor Computations
Citations: 117
References: 52
Year: 2020
Unknown Venue
Array Computing, Engineering, Machine Learning, Data Science, Tensor Operations, Matrix Factorization, Hardware Acceleration, Sparse Tensor Factorizations, Computer Engineering, Computer Architecture, Versatile Accelerator, Parallel Programming, Computer Science, Parallel Computing, Tensor Factorizations, GPU Cluster, GPU Computing, Big Data
Tensor factorizations are powerful tools in many machine learning and data analytics applications. Tensors are often sparse, which makes sparse tensor factorizations memory bound. In this work, we propose a hardware accelerator that can accelerate both dense and sparse tensor factorizations. We co-design the hardware and a sparse storage format, which allows the sparse data to be accessed in a vectorized, streaming fashion and maximizes the utilization of the memory bandwidth. We extract a common computation pattern that is found in numerous matrix and tensor operations and implement it in the hardware. By designing the hardware around this common compute pattern, we can accelerate not only tensor factorizations but also mixed sparse-dense matrix operations. We show significant speedup and energy benefit over state-of-the-art CPU and GPU implementations of tensor factorizations, and over CPU, GPU, and accelerator implementations of matrix operations.
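To make the notion of a shared mixed sparse-dense pattern concrete, the sketch below shows two standard kernels in plain Python: sparse-matrix times dense-matrix (SpMM) and the matricized tensor times Khatri-Rao product (MTTKRP) used in CP factorization. It is an illustration under stated assumptions, not the accelerator's design: the coordinate (COO) layout, the function names `spmm_coo` and `mttkrp_coo`, and the example data are hypothetical, and the abstract specifies neither the co-designed storage format nor the exact compute pattern.

```python
# Illustrative sketch only. The COO layout, function names, and data below are
# assumptions made for this example; they are not taken from the paper.

def spmm_coo(entries, B, num_rows):
    """Sparse matrix (COO) times dense matrix.

    out[i][r] = sum over nonzeros (i, j, v) of v * B[j][r]
    """
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for i, j, v in entries:              # stream over the nonzeros
        row_b, row_out = B[j], out[i]    # fetch one dense row
        for r in range(rank):            # scale and accumulate
            row_out[r] += v * row_b[r]
    return out


def mttkrp_coo(entries, B, C, num_rows):
    """Mode-1 MTTKRP for a sparse 3-way tensor stored in COO form.

    out[i][r] = sum over nonzeros (i, j, k, v) of v * B[j][r] * C[k][r]
    """
    rank = len(B[0])
    out = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, v in entries:                   # stream over the nonzeros
        row_b, row_c, row_out = B[j], C[k], out[i]  # fetch two dense rows
        for r in range(rank):                    # elementwise product, scale, accumulate
            row_out[r] += v * row_b[r] * row_c[r]
    return out


# Small dense factor matrices (2 rows, rank 2).
B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]

# A 2x2 sparse matrix and a 2x2x2 sparse tensor, nonzeros only.
matrix_entries = [(0, 1, 2.0), (1, 0, 3.0)]
tensor_entries = [(0, 0, 0, 1.0), (0, 1, 1, 2.0), (1, 0, 1, 3.0)]

spmm_result = spmm_coo(matrix_entries, B, num_rows=2)
mttkrp_result = mttkrp_coo(tensor_entries, B, C, num_rows=2)
```

Both loops have the same shape: stream the sparse nonzeros, gather the dense rows they index, and accumulate a scaled row into the output. The irregular row gathers are what tend to make such kernels memory bound on sparse inputs.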