TY - GEN
T1 - Logic Design of Neural Networks for High-Throughput and Low-Power Applications
AU - Xu, Kangwei
AU - Zhang, Grace Li
AU - Schlichtmann, Ulf
AU - Li, Bing
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Neural networks (NNs) have been successfully deployed in various fields. In NNs, a large number of multiply-accumulate (MAC) operations need to be performed. Most existing digital hardware platforms rely on parallel MAC units to accelerate these MAC operations. However, under a given area constraint, the number of MAC units in such platforms is limited, so MAC units have to be reused to perform MAC operations in a neural network. Accordingly, the throughput in generating classification results is not high, which prevents the application of traditional hardware platforms in extreme-throughput scenarios. Besides, the power consumption of such platforms is also high, mainly due to data movement. To overcome this challenge, in this paper, we propose to flatten and implement all the operations at neurons, e.g., MAC and ReLU, in a neural network with their corresponding logic circuits. To improve the throughput and reduce the power consumption of such logic designs, the weight values are embedded into the MAC units to simplify the logic, which can reduce the delay of the MAC units and the power consumption incurred by weight movement. The retiming technique is further used to improve the throughput of the logic circuits for neural networks. In addition, we propose a hardware-aware training method to reduce the area of logic designs of neural networks. Experimental results demonstrate that the proposed logic designs can achieve high throughput and low power consumption for several high-throughput applications.
UR - https://www.scopus.com/pages/publications/85189363034
U2 - 10.1109/ASP-DAC58780.2024.10473844
DO - 10.1109/ASP-DAC58780.2024.10473844
M3 - Conference contribution
AN - SCOPUS:85189363034
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 902
EP - 907
BT - ASP-DAC 2024 - 29th Asia and South Pacific Design Automation Conference, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 29th Asia and South Pacific Design Automation Conference, ASP-DAC 2024
Y2 - 22 January 2024 through 25 January 2024
ER -