μLayer
Publication | Closed Access
Year: 2019 | Citations: 93 | References: 40 | Venue: unknown
Topics: Engineering, Machine Learning, Data Science, Advanced Computing, Edge Computing, Mobile NN, Machine Learning Model, Sparse Neural Network, Computer Architecture, Computer Engineering, Embedded Machine Learning, Computer Science, Mobile Computing, Deep Learning, Neural Architecture Search, Mobile Services, Exploit Data Parallelism
Emerging mobile services heavily utilize Neural Networks (NNs) to improve user experiences. Such NN-assisted services depend on fast NN execution for high responsiveness, requiring mobile devices to minimize NN execution latency by efficiently utilizing their underlying hardware resources. To better utilize those resources, existing mobile NN frameworks either employ CPU-friendly optimizations (e.g., vectorization, quantization) or exploit data parallelism using heterogeneous processors such as GPUs and DSPs. However, their performance remains bounded by that of a single target processor, so real-time services such as voice-driven search often fail to react to user requests in time. This problem will only become more serious as more demanding NN-assisted services are introduced.
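As an illustration of one of the CPU-friendly optimizations the abstract names, the sketch below shows a minimal affine 8-bit quantization of a weight tensor. This is a generic, hypothetical example for intuition only, not the paper's actual quantization scheme; the function names and the choice of uint8 are assumptions.

```python
import numpy as np

def quantize_uint8(w):
    """Affine-quantize a float32 tensor to uint8.

    Illustrative sketch of a common CPU-friendly optimization,
    not the scheme used by the paper itself.
    """
    lo, hi = float(w.min()), float(w.max())
    # Guard against a constant tensor (hi == lo) to avoid division by zero.
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover an approximate float32 tensor from its quantized form."""
    return q.astype(np.float32) * scale + lo

# Round-trip a small tensor; the per-element reconstruction error
# is bounded by half the quantization step (scale / 2).
w = np.linspace(-1.0, 1.0, 16, dtype=np.float32).reshape(4, 4)
q, scale, lo = quantize_uint8(w)
w_hat = dequantize(q, scale, lo)
print(float(np.abs(w - w_hat).max()))
```

Quantized weights shrink memory traffic fourfold versus float32 and map well onto SIMD integer instructions, which is why such schemes help on mobile CPUs.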