Analysis of Neural Network Inference Response Times on Embedded Platforms

Patrick Huber, Ulrich Gohner, Mario Trapp, Jonathan Zender, Rabea Lichtenberg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The response time of Artificial Neural Network (ANN) inference is of utmost importance in embedded applications, particularly in continual stream processing. Predictive maintenance applications require timely predictions of state changes. This study aims to enable the reader to estimate the response time of a given model on a given underlying platform, and it emphasizes the relevance of benchmarking generic ANN applications on edge devices. We analyze the influence of network parameters, activation functions, and single- and multithreading on execution times. Potential side effects, such as clock rate variations and other hardware-related influences, are outlined and accounted for. The results underline the complexity of task-partitioning and scheduling strategies and emphasize that the parameters must be precisely coordinated to achieve optimal performance on any platform. The study also shows that cutting-edge frameworks do not necessarily perform this coordination automatically for all configurations, which may negatively impact performance.

Original language: English
Title of host publication: 2024 Asian Conference on Communication and Networks, ASIANComNet 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350367003
State: Published - 2024
Event: 1st Asian Conference on Communication and Networks, ASIANComNet 2024 - Hybrid, Bangkok, Thailand
Duration: 24 Oct 2024 to 27 Oct 2024

Publication series

Name: 2024 Asian Conference on Communication and Networks, ASIANComNet 2024

Conference

Conference: 1st Asian Conference on Communication and Networks, ASIANComNet 2024
Country/Territory: Thailand
City: Hybrid, Bangkok
Period: 24/10/24 to 27/10/24

Keywords

  • ANN inference
  • benchmarking
  • embedded systems
  • response times
  • TensorFlow Lite
