Weight-Oriented Approximation for Energy-Efficient Neural Network Inference Accelerators

Zois-Gerasimos Tasoulas, Georgios Zervakis, Iraklis Anagnostopoulos, Hussam Amrouch, Jörg Henkel

Research output: Contribution to journal › Article › peer-review

60 Scopus citations


Current research in the area of Neural Networks (NN) has resulted in performance advancements for a variety of complex problems. Embedded system applications in particular rely increasingly on convolutional NNs to provide services such as image/audio classification and object detection. The core arithmetic computation performed during NN inference is the multiply-accumulate (MAC) operation. To meet ever tighter throughput constraints, NN accelerators integrate thousands of MAC units, resulting in a significant increase in power consumption. Approximate computing is an established design alternative that improves the efficiency of computing systems by trading computational accuracy for high energy savings. In this work, we bring approximate computing principles and NN inference together by designing NN-specific approximate multipliers that feature multiple accuracy levels at run-time. We propose a time-efficient automated framework for mapping the NN weights to the accuracy levels of the approximate reconfigurable accelerator. The proposed weight-oriented approximation mapping satisfies tight accuracy-loss thresholds while significantly reducing energy consumption, without any need for intensive NN retraining. Our approach is evaluated on several NNs, demonstrating that it delivers high energy savings (17.8% on average) with a minimal loss in inference accuracy (0.5%).
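To illustrate the idea described in the abstract, the sketch below models a reconfigurable approximate multiplier and a weight-oriented mapping of weights to accuracy levels. This is a minimal illustrative assumption, not the paper's actual multiplier design or mapping framework: the approximation here is simple operand truncation, and the error criterion (worst-case relative error against a fixed activation) is hypothetical.

```python
# Illustrative sketch only (NOT the paper's design): a "reconfigurable"
# approximate multiplier modeled as truncation of low-order operand bits,
# plus a simple weight-oriented mapping of weights to accuracy modes.

def approx_mul(a, b, mode):
    """Multiply small unsigned integers; higher mode = more truncation.
    Mode 0 is exact; mode k zeroes the k low-order bits of each operand."""
    mask = ~((1 << mode) - 1)
    return (a & mask) * (b & mask)

def map_weights_to_modes(weights, err_threshold, modes=(2, 1, 0)):
    """Assign each weight the most aggressive (energy-saving) mode whose
    worst-case relative error on that weight stays under the threshold.
    The worst-case 8-bit activation value 127 is an assumption here."""
    mapping = {}
    for w in weights:
        chosen = 0
        for m in sorted(modes, reverse=True):  # try most approximate first
            exact = w * 127
            approx = approx_mul(w, 127, m)
            rel_err = abs(exact - approx) / max(abs(exact), 1)
            if rel_err <= err_threshold:
                chosen = m
                break
        mapping[w] = chosen
    return mapping

weights = [3, 17, 64, 120]
print(map_weights_to_modes(weights, err_threshold=0.05))
```

In this toy model, small weights keep the exact mode (truncation hurts them proportionally more), while large weights tolerate aggressive truncation, mirroring the paper's premise that per-weight accuracy assignment can save energy within a global accuracy-loss threshold.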

Original language: English
Article number: 9186830
Pages (from-to): 4670-4683
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers
Issue number: 12
State: Published - Dec 2020
Externally published: Yes


Keywords

  • Approximate computing
  • low-power
  • neural network inference
  • reconfigurable approximate multipliers


