Weight-Oriented Approximation for Energy-Efficient Neural Network Inference Accelerators

Zois-Gerasimos Tasoulas, Georgios Zervakis, Iraklis Anagnostopoulos, Hussam Amrouch, Jörg Henkel

Research output: Contribution to journal › Article › peer-review

60 Scopus citations

Abstract

Current research in the area of Neural Networks (NNs) has resulted in performance advancements for a variety of complex problems. In particular, embedded-system applications increasingly rely on convolutional NNs to provide services such as image/audio classification and object detection. The core arithmetic computation performed during NN inference is the multiply-accumulate (MAC) operation. To meet increasingly tight throughput constraints, NN accelerators integrate thousands of MAC units, resulting in a significant increase in power consumption. Approximate computing is an established design alternative that improves the efficiency of computing systems by trading computational accuracy for high energy savings. In this work, we bring approximate computing principles and NN inference together by designing NN-specific approximate multipliers that feature multiple accuracy levels at run-time. We propose a time-efficient automated framework for mapping the NN weights to the accuracy levels of the approximate reconfigurable accelerator. The proposed weight-oriented approximation mapping satisfies tight accuracy-loss thresholds while significantly reducing energy consumption, without any need for intensive NN retraining. Our approach is evaluated against several NNs, demonstrating that it delivers high energy savings (17.8% on average) with a minimal loss in inference accuracy (0.5%).
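The idea of a multiplier with run-time accuracy levels and a weight-oriented mapping can be illustrated with a minimal sketch. The approximation mechanism below (truncating low-order bits of the activation operand) and the error-budget mapping rule are illustrative assumptions for exposition, not the paper's actual circuit design or framework; the function names `approx_mul`, `map_weight_to_mode`, and `mac` are hypothetical.

```python
def approx_mul(a: int, w: int, mode: int) -> int:
    """Approximate multiply: zero out the low `mode` bits of the
    activation `a` before multiplying (mode 0 = exact)."""
    truncated = (a >> mode) << mode  # drop low-order bits, an illustrative approximation
    return truncated * w


def map_weight_to_mode(w: int, max_mode: int, abs_err_budget: int) -> int:
    """Pick the deepest truncation mode whose worst-case product error
    stays within the budget. Truncating `m` bits of the activation loses
    at most (2^m - 1), so the product error is bounded by |w| * (2^m - 1):
    large-magnitude weights get accurate modes, small ones aggressive modes."""
    best = 0
    for m in range(max_mode + 1):
        if abs(w) * ((1 << m) - 1) <= abs_err_budget:
            best = m
    return best


def mac(activations, weights, modes):
    """Multiply-accumulate over a vector, each lane using its assigned mode."""
    acc = 0
    for a, w, m in zip(activations, weights, modes):
        acc += approx_mul(a, w, m)
    return acc
```

For example, `approx_mul(13, 5, 2)` truncates 13 to 12 and returns 60 instead of the exact 65, while `map_weight_to_mode` assigns a weight of magnitude 1 a deeper truncation mode than a weight of magnitude 5 under the same error budget, mirroring the weight-oriented intuition that small weights contribute less output error per bit of approximation.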

Original language: English
Article number: 9186830
Pages (from-to): 4670-4683
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers
Volume: 67
Issue number: 12
DOIs
State: Published - Dec 2020
Externally published: Yes

Keywords

  • Approximate computing
  • low-power
  • neural network inference
  • reconfigurable approximate multipliers

