Publication | Open Access
Automatic generation of specialized direct convolutions for mobile GPUs
Citations: 15
References: 11
Year: 2020
Venue: Unknown
Convolutional Neural Network, Engineering, Machine Learning, Computer Architecture, GPU Computing, Image Analysis, Parallel Computing, Machine Vision, Automatic Generation, Computer Engineering, Computer Science, Deep Learning, Neural Architecture Search, GPU Cluster, Computer Vision, GPU Architecture, Hardware Acceleration, Arm Compute Library, Convolutional Neural Networks, Parallel Programming, Machine-learning Frameworks
Convolutional Neural Networks (CNNs) are a powerful and versatile tool for performing computer vision tasks in both resource-constrained settings and server-side applications. Most GPU hardware vendors provide highly tuned libraries for CNNs, such as NVIDIA's cuDNN or the ARM Compute Library. Such libraries are the basis for higher-level, commonly used machine-learning frameworks such as PyTorch or Caffe, abstracting them away from vendor-specific implementation details. However, writing optimized parallel code for GPUs is far from trivial. This places a significant burden on hardware-specific library writers, who must continually play catch-up with rapid hardware and network evolution.