Publication | Closed Access
Joint DNN Partition Deployment and Resource Allocation for Delay-Sensitive Deep Learning Inference in IoT
Citations: 119
References: 41
Year: 2020
Topics: Engineering, Machine Learning, Partition Deployment, Data Science, Embedded Machine Learning, Internet of Things, Parallel Computing, Edge Intelligence, DNN Partition Deployment, Computer Engineering, Low Latency, Computer Science, Mobile Computing, Deep Learning, Neural Architecture Search, Edge Architecture, Deep Neural Network, Edge Computing, Business, Multi-access Edge Computing, Resource Allocation, Resource Optimization
Widely used Internet-of-Things (IoT) mobile devices (MDs) generate huge volumes of data, which must be analyzed in real time by compute-intensive deep learning (DL) inference tasks to extract accurate information. Because of its multilayer structure, a deep neural network (DNN) is well suited to the mobile-edge computing (MEC) environment: DL tasks can be offloaded to DNN partitions deployed on MEC servers (MECSs) to speed up inference. In this article, we first model the arrival of DL tasks as a Poisson process and develop a tandem queueing model to evaluate the end-to-end (E2E) inference delay of DL tasks across multiple DNN partitions. To minimize the E2E delay, we formulate a joint optimization problem of partition deployment and resource allocation in MECSs (JPDRA). Since JPDRA is a mixed-integer nonlinear programming (MINLP) problem, we decompose it into a computing resource allocation (CRA) problem with a fixed partition deployment decision and a DNN partition deployment (DPD) problem that minimizes the optimal-delay function obtained from the CRA problem. Next, we design a CRA algorithm based on Markov approximation and a low-complexity DPD algorithm to obtain a near-optimal solution in polynomial time. The simulation results demonstrate that the proposed algorithms are more efficient and can reduce the average E2E delay by 25.7% with better convergence performance.
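To make the tandem-queue idea in the abstract concrete, here is a minimal sketch of how a mean E2E delay could be computed. It is not the paper's model. It assumes each DNN partition behaves as an M/M/1 queue (Poisson arrivals, exponential service times), so that by Burke's theorem every stage sees the same arrival rate and the mean E2E delay is the sum of the per-stage sojourn times 1/(mu_i - lambda). The function names, the `cpu_alloc / workload` service-rate formula, and all numbers are hypothetical.

```python
from typing import List, Sequence


def service_rates(cpu_alloc: Sequence[float], workloads: Sequence[float]) -> List[float]:
    """Hypothetical service-rate model: tasks/s = allocated cycles/s divided by cycles/task."""
    return [f / c for f, c in zip(cpu_alloc, workloads)]


def tandem_e2e_delay(arrival_rate: float, rates: Sequence[float]) -> float:
    """Mean E2E delay through a tandem of M/M/1 stages (an assumption, see lead-in).

    Each stage contributes a mean sojourn time of 1 / (mu - lambda), which is
    only finite when the stage is stable (mu > lambda).
    """
    total = 0.0
    for mu in rates:
        if mu <= arrival_rate:
            raise ValueError("unstable stage: service rate must exceed arrival rate")
        total += 1.0 / (mu - arrival_rate)
    return total


if __name__ == "__main__":
    lam = 2.0                 # task arrival rate (tasks/s), illustrative value
    workloads = [1.0, 1.0]    # cycles per task for each partition, illustrative values
    # Two ways of splitting the same total capacity of 10 cycles/s across two partitions.
    for alloc in ([4.0, 6.0], [5.0, 5.0]):
        delay = tandem_e2e_delay(lam, service_rates(alloc, workloads))
        print(alloc, round(delay, 4))
```

Under these assumptions the two allocations give different mean delays for the same total capacity, which is the kind of trade-off a resource allocation step has to resolve once the partition deployment is fixed.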