DeeperThings: Fully Distributed CNN Inference on Resource-Constrained Edge Devices

Rafael Stahl, Alexander Hoffman, Daniel Mueller-Gritschneder, Andreas Gerstlauer, Ulf Schlichtmann

Research output: Contribution to journal, Article, peer-review

27 Scopus citations


Performing inference of Convolutional Neural Networks (CNNs) on Internet of Things (IoT) edge devices keeps input data private and can reduce run time compared to a cloud solution. As most edge devices are memory- and compute-constrained, they cannot store and execute complex CNNs. Partitioning and distributing layer information across multiple edge devices to reduce the amount of computation and data on each device presents a solution to this problem. In this article, we propose DeeperThings, an approach that supports a full distribution of CNN inference tasks by partitioning fully-connected as well as both feature- and weight-intensive convolutional layers. Additionally, we jointly optimize memory, computation and communication demands. This is achieved by combining feature and weight partitioning with a communication-aware layer fusion method, enabling holistic optimization across layers. For a given number of edge devices, the schemes are applied jointly using Integer Linear Programming (ILP) formulations to minimize data exchanged between devices, to optimize run times and to find the entire model’s minimal memory footprint. Experimental results from a real-world hardware setup running four different CNN models confirm that the scheme is able to evenly balance the memory footprint between devices. For six devices on 100 Mbit/s connections, the integration of layer fusion additionally reduces communication demands by up to 28.8%. This results in a run time speed-up of the inference task of up to 1.52x compared to layer partitioning without fusing.
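As a rough illustration of what weight partitioning of a fully-connected layer can look like, the sketch below splits the layer's output neurons into near-equal contiguous ranges, one per device, so that each device holds only its slice of the weight matrix. This is a minimal sketch under stated assumptions, not the DeeperThings implementation: the function names (`partition_rows`, `fc_forward`, `distributed_fc_forward`) are hypothetical, the devices are simulated by a loop, and the ILP-based optimization and layer fusion described in the abstract are not modeled.

```python
def partition_rows(n_rows, n_devices):
    """Split n_rows output neurons into n_devices contiguous, near-equal ranges."""
    base, extra = divmod(n_rows, n_devices)
    ranges, start = [], 0
    for device in range(n_devices):
        # The first `extra` devices take one additional row each.
        size = base + (1 if device < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def fc_forward(weights, bias, x):
    """Plain fully-connected layer: one dot product plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def distributed_fc_forward(weights, bias, x, n_devices):
    """Simulate weight partitioning: each (hypothetical) device computes one slice."""
    outputs = []
    for lo, hi in partition_rows(len(weights), n_devices):
        # A device would store only weights[lo:hi] and bias[lo:hi],
        # but still needs the full input vector x.
        outputs.extend(fc_forward(weights[lo:hi], bias[lo:hi], x))
    return outputs


if __name__ == "__main__":
    weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 1], [0, 2, 0]]
    bias = [1, 0, -1, 2, 3]
    x = [1, 2, 3]
    print(partition_rows(len(weights), 3))
    print(distributed_fc_forward(weights, bias, x, 3) == fc_forward(weights, bias, x))
```

In this simplified picture, per-device weight memory shrinks roughly in proportion to the number of devices, while every device still receives the full input vector and the partial outputs must be gathered afterwards. That communication cost is the kind of demand the abstract says is optimized jointly with memory and computation.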

Original language: English
Pages (from-to): 600-624
Number of pages: 25
Journal: International Journal of Parallel Programming
Issue number: 4
State: Published - Aug 2021


  • Deep learning
  • Distributed computing
  • IoT


