
Publication | Closed Access

A Distributed In-Situ CNN Inference System for IoT Applications

Citations: 15

References: 22

Year: 2020

Abstract

CNNs are a popular deep learning structure that can provide intelligent processing in IoT applications. Instead of deploying resource-hungry CNN inference workloads in the cloud, it is promising to use local IoT devices for in-situ processing. Because a single IoT device has only limited resources, distributing inference over multiple local devices becomes a potential solution, especially for high-accuracy and time-sensitive tasks. However, it is non-trivial to distribute the inference of existing CNN models efficiently, because they are inherently tightly coupled structures. In this paper, we propose a distributed in-situ CNN inference system for IoT applications that combines a loosely-coupled CNN structure (LCS), synchronization-oriented partitioning (SOP), and decentralized asynchronous communication (DAC). LCS is based on two novel design ideas: the homogeneous group and the intermittent shuffle. Experiments on ImageNet classification show that LCS achieves the highest accuracy among the compared structures under a given computation budget. SOP and DAC aim to convert the loosely-coupled nature of LCS into practical performance improvements. SOP seeks to partition LCS with fewer synchronization points, and DAC reduces communication overhead by overlapping communications. When the number of IoT devices increases from 1 to 4, our system achieves a speedup of up to 3.85× and reduces the memory footprint on each device by 70%, outperforming other approaches.
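To put the reported scaling figures in context, the short sketch below converts them into a parallel efficiency and a per-device memory fraction. It uses only the numbers stated in the abstract (4 devices, up to 3.85× speedup, 70% memory reduction per device). The function names are hypothetical, and the "even split" baseline of 1/4 is an assumption added here for comparison, not a figure from the paper.

```python
def parallel_efficiency(speedup: float, devices: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 would be perfect scaling)."""
    return speedup / devices


def remaining_fraction(reduction: float) -> float:
    """Memory left on each device after a fractional reduction."""
    return 1.0 - reduction


# Figures taken from the abstract.
devices = 4
speedup = 3.85          # best-case ("up to") speedup
mem_reduction = 0.70    # per-device memory footprint reduction

eff = parallel_efficiency(speedup, devices)
per_device = remaining_fraction(mem_reduction)

# Assumed comparison point: a perfectly even split with no duplicated state.
even_split = 1.0 / devices

print(f"parallel efficiency: {eff:.4f}")
print(f"per-device memory: {per_device:.2f} of the single-device footprint")
print(f"even-split baseline (assumption): {even_split:.2f}")
```

Read this way, the best-case speedup corresponds to roughly 96% of linear scaling on four devices, and each device holds about 30% of the single-device footprint, slightly above the 25% that a perfectly even split would give.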
