TY - JOUR
T1 - Leveraging Temporal Patterns: Automated Augmentation to Create Temporal Early Exit Networks for Efficient Edge AI
AU - Sponner, Max
AU - Servadei, Lorenzo
AU - Waschneck, Bernd
AU - Wille, Robert
AU - Kumar, Akash
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024
Y1 - 2024
N2 - In embedded systems, efficient deep learning solutions are crucial to balance accuracy and resource constraints. Early Exit Neural Networks offer a promising solution, but their manual configuration and optimization hinder widespread adoption. To overcome this hurdle, we propose a novel, fully automated flow that transforms traditional neural networks into Temporal Decision Early Exit models, optimizing their temporal decision mechanism for efficient inference. Temporal decisions adapt the inference workload by monitoring changes in early predictions over time. This enables them to significantly improve latency and efficiency when operating on streaming data. Our approach enables significant latency reductions without the need for expert knowledge while maintaining prediction quality, making it ideal for resource-constrained embedded systems. We demonstrate the effectiveness of our approach on three representative tasks: ECG classification, wake word detection, and video-based human presence detection. Our results show latency reductions of up to 28.6%, 42.5%, and 24.5% compared to the traditional inference execution, respectively, with minimal accuracy loss. By enabling broader adoption of Temporal Decision Early Exit Neural Networks, our method has the potential to transform the field of embedded deep learning and unlock new possibilities for edge AI.
KW - Conditional Deep Learning
KW - Early Exit Neural Networks
KW - Edge AI
KW - Embedded Deep Learning
KW - Embedded systems
UR - http://www.scopus.com/inward/record.url?scp=85209633683&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3497158
DO - 10.1109/ACCESS.2024.3497158
M3 - Article
AN - SCOPUS:85209633683
SN - 2169-3536
JO - IEEE Access
JF - IEEE Access
ER -