Edge Intelligence - Concepedia

TLDR

Deep neural networks are the backbone of machine learning, yet executing them on resource‑constrained mobile devices incurs high performance and energy overhead, and offloading to the cloud suffers unpredictable latency. This paper proposes Edgent, a collaborative on‑demand DNN co‑inference framework that synergizes device and edge resources. Edgent achieves this by adaptively partitioning DNN computation between device and edge and by applying early‑exit right‑sizing at intermediate layers to reduce inference latency. Prototype experiments on a Raspberry Pi demonstrate that Edgent delivers low‑latency edge intelligence in real‑time scenarios.

Abstract

As the backbone technology of machine learning, deep neural networks (DNNs) have quickly ascended to the spotlight. Running DNNs on resource-constrained mobile devices is, however, by no means trivial, since it incurs high performance and energy overhead, while offloading DNNs to the cloud for execution suffers from unpredictable performance due to uncontrolled, long wide-area network latency. To address these challenges, we propose Edgent, a collaborative and on-demand DNN co-inference framework with device-edge synergy. Edgent pursues two design knobs: (1) DNN partitioning, which adaptively partitions DNN computation between device and edge in order to leverage hybrid computation resources in proximity for real-time DNN inference; and (2) DNN right-sizing, which accelerates DNN inference through early exit at a proper intermediate DNN layer to further reduce the computation latency. The prototype implementation and extensive evaluations based on Raspberry Pi demonstrate Edgent's effectiveness in enabling on-demand low-latency edge intelligence.
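
The two design knobs can be illustrated with a minimal Python sketch. It is not Edgent's actual implementation. It assumes that per-layer latencies on the device and on the edge, per-layer output sizes, and the available bandwidth are already known, that the device runs the first p layers and the edge runs the rest (the abstract does not say which side runs first), and that the cost of returning the final result is negligible. The names best_partition and choose_exit_and_partition, and all parameters, are hypothetical.

```python
from typing import List, Optional, Tuple

# One candidate branch of the network: (exit name, per-layer device latency in ms,
# per-layer edge latency in ms, per-layer output size in KB).
Branch = Tuple[str, List[float], List[float], List[float]]


def best_partition(device_ms: List[float], edge_ms: List[float],
                   out_kb: List[float], input_kb: float,
                   bandwidth_kb_per_ms: float) -> Tuple[int, float]:
    """Return (partition point p, end-to-end latency in ms) for one branch.

    Assumption: layers [0, p) run on the device and layers [p, n) on the edge.
    p == 0 uploads the raw input; p == n keeps everything on the device.
    """
    n = len(device_ms)
    best: Optional[Tuple[int, float]] = None
    for p in range(n + 1):
        if p == n:
            transfer = 0.0                      # nothing is sent to the edge
        elif p == 0:
            transfer = input_kb / bandwidth_kb_per_ms
        else:
            transfer = out_kb[p - 1] / bandwidth_kb_per_ms
        total = sum(device_ms[:p]) + transfer + sum(edge_ms[p:])
        if best is None or total < best[1]:
            best = (p, total)
    assert best is not None
    return best


def choose_exit_and_partition(branches: List[Branch], input_kb: float,
                              bandwidth_kb_per_ms: float,
                              deadline_ms: float) -> Optional[Tuple[str, int, float]]:
    """Right-sizing sketch: branches are ordered from deepest (assumed most
    accurate) to shallowest; return the first one whose best partition meets
    the latency deadline, or None if no branch does."""
    for name, device_ms, edge_ms, out_kb in branches:
        p, latency = best_partition(device_ms, edge_ms, out_kb,
                                    input_kb, bandwidth_kb_per_ms)
        if latency <= deadline_ms:
            return name, p, latency
    return None


# Made-up numbers for illustration only; they are not measurements from the paper.
BRANCHES: List[Branch] = [
    ("full",  [10.0, 20.0, 30.0], [1.0, 2.0, 3.0], [100.0, 50.0, 1.0]),
    ("exit1", [10.0, 20.0],       [1.0, 2.0],      [100.0, 1.0]),
]

if __name__ == "__main__":
    print(choose_exit_and_partition(BRANCHES, input_kb=200.0,
                                    bandwidth_kb_per_ms=10.0, deadline_ms=24.0))
```

With these illustrative numbers, a looser deadline lets the full network be used, while a tighter one forces an earlier exit. That is the accuracy-for-latency trade that right-sizing is meant to make.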
