Publication | Closed Access
Area and time efficient implementations of matrix multiplication on FPGAs
Citations: 57
References: 5
Year: 2003
Unknown Venue
Hardware Security, Configurable Hardware, Array Computing, Engineering, Hardware Acceleration, VLSI Architecture, High-performance Architecture, Hardware Algorithm, Computer Architecture, Computer Engineering, Matrix Multiplication, Parallel Programming, Computer Science, Reconfigurable Architecture, Parallel Computing, FPGA Design, Same Latency
We develop new algorithms and architectures for matrix multiplication on configurable hardware. These designs significantly reduce both latency and area. Our designs improve on the previous designs in terms of the area/speed metric, where speed denotes the maximum achievable running frequency. For 4 × 4 matrix multiplication, the area/speed metrics for the two previous designs and our design are 14.45, 4.93, and 2.35, respectively. The latency of one of the previous designs is 0.57 μs, while our design takes 0.15 μs using 18% less area. For matrices of sizes 3 × 3 to 12 × 12, the area of our designs is 11% to 46% smaller than that of the best known systolic designs with the same latency. The performance improvements tend to grow with the problem size.
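To put the quoted figures in proportion, the following minimal Python sketch computes the relative improvements implied by the abstract's numbers. The helper name `improvement_factor` and the constant names are illustrative assumptions, not taken from the paper; the abstract does not state the units of the area/speed metric, so only ratios are computed.

```python
def improvement_factor(previous: float, ours: float) -> float:
    """Return previous / ours for a lower-is-better metric (hypothetical helper)."""
    if ours <= 0:
        raise ValueError("metric value must be positive")
    return previous / ours


# Figures quoted in the abstract for 4 x 4 matrix multiplication.
AREA_SPEED_PREVIOUS = (14.45, 4.93)  # the two previous designs
AREA_SPEED_OURS = 2.35               # the proposed design
LATENCY_PREVIOUS_US = 0.57           # one previous design, in microseconds
LATENCY_OURS_US = 0.15               # the proposed design, in microseconds

# Ratio of each previous design's area/speed metric to the proposed design's.
area_speed_factors = [
    improvement_factor(prev, AREA_SPEED_OURS) for prev in AREA_SPEED_PREVIOUS
]
# Ratio of the previous design's latency to the proposed design's.
latency_factor = improvement_factor(LATENCY_PREVIOUS_US, LATENCY_OURS_US)

for prev, factor in zip(AREA_SPEED_PREVIOUS, area_speed_factors):
    print(f"area/speed {prev} vs {AREA_SPEED_OURS}: {factor:.2f}x lower")
print(f"latency {LATENCY_PREVIOUS_US} us vs {LATENCY_OURS_US} us: {latency_factor:.2f}x lower")
```

Under these figures, the proposed design's area/speed metric is roughly 6.1 times and 2.1 times lower than those of the two previous designs, and its latency is about 3.8 times lower than that of the compared design.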