TY - JOUR
T1 - Approximation- and Quantization-Aware Training for Graph Neural Networks
AU - Novkin, Rodion
AU - Klemme, Florian
AU - Amrouch, Hussam
N1 - Publisher Copyright:
© 1968-2012 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
N2 - Graph Neural Networks (GNNs) are among the best-performing models for processing graph data. They are known to have considerable computational complexity, despite having fewer parameters than traditional Deep Neural Networks (DNNs). The operations-to-parameters ratio for GNNs can be tens to hundreds of times higher than for DNNs, depending on the input graph size. This complexity underlines the importance of optimizing arithmetic operations within GNNs through model quantization and approximation. In this work, for the first time, we combine both approaches and implement quantization- and approximation-aware training for GNNs to sustain their accuracy under the errors induced by inexact multiplications. We employ a matrix multiplication CUDA kernel to speed up the simulation of approximate multiplication within GNNs. Further, we demonstrate the execution speed, accuracy, and energy efficiency of GNNs with approximate multipliers in comparison with quantized low-bit GNNs. We evaluate the performance of state-of-the-art GNN architectures (i.e., GIN, SAGE, GCN, and GAT) on various datasets and tasks (i.e., Reddit-Binary and Collab for graph classification, and Cora and PubMed for node classification) with a wide range of approximate multipliers. Our framework is available online: https://github.com/TUM-AIPro/AxC-GNN.
AB - Graph Neural Networks (GNNs) are among the best-performing models for processing graph data. They are known to have considerable computational complexity, despite having fewer parameters than traditional Deep Neural Networks (DNNs). The operations-to-parameters ratio for GNNs can be tens to hundreds of times higher than for DNNs, depending on the input graph size. This complexity underlines the importance of optimizing arithmetic operations within GNNs through model quantization and approximation. In this work, for the first time, we combine both approaches and implement quantization- and approximation-aware training for GNNs to sustain their accuracy under the errors induced by inexact multiplications. We employ a matrix multiplication CUDA kernel to speed up the simulation of approximate multiplication within GNNs. Further, we demonstrate the execution speed, accuracy, and energy efficiency of GNNs with approximate multipliers in comparison with quantized low-bit GNNs. We evaluate the performance of state-of-the-art GNN architectures (i.e., GIN, SAGE, GCN, and GAT) on various datasets and tasks (i.e., Reddit-Binary and Collab for graph classification, and Cora and PubMed for node classification) with a wide range of approximate multipliers. Our framework is available online: https://github.com/TUM-AIPro/AxC-GNN.
KW - Graph neural network
KW - approximate computing
KW - deep learning
KW - quantization
UR - http://www.scopus.com/inward/record.url?scp=85179086950&partnerID=8YFLogxK
U2 - 10.1109/TC.2023.3337319
DO - 10.1109/TC.2023.3337319
M3 - Article
AN - SCOPUS:85179086950
SN - 0018-9340
VL - 73
SP - 599
EP - 612
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 2
ER -