TY - JOUR
T1 - Training Multi-Bit Quantized and Binarized Networks with a Learnable Symmetric Quantizer
AU - Pham, Phuoc
AU - Abraham, Jacob A.
AU - Chung, Jaeyong
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - Quantizing weights and activations of deep neural networks is essential for deploying them in resource-constrained devices or in cloud platforms for at-scale services. While binarization is a special case of quantization, this extreme case often leads to several training difficulties and necessitates specialized models and training methods. As a result, recent quantization methods do not provide binarization, thus losing the most resource-efficient option, and quantized and binarized networks have remained distinct research areas. We examine binarization difficulties in a quantization framework and find that all we need to enable binary training is a symmetric quantizer, good initialization, and careful hyperparameter selection. These techniques also lead to substantial improvements in multi-bit quantization. We demonstrate our unified quantization framework, denoted as UniQ, on the ImageNet dataset with various architectures such as ResNet-18, ResNet-34, and MobileNetV2. For multi-bit quantization, UniQ outperforms existing methods and achieves state-of-the-art accuracy. In binarization, the achieved accuracy is comparable to that of existing state-of-the-art methods, even without modifying the original architectures.
AB - Quantizing weights and activations of deep neural networks is essential for deploying them in resource-constrained devices or in cloud platforms for at-scale services. While binarization is a special case of quantization, this extreme case often leads to several training difficulties and necessitates specialized models and training methods. As a result, recent quantization methods do not provide binarization, thus losing the most resource-efficient option, and quantized and binarized networks have remained distinct research areas. We examine binarization difficulties in a quantization framework and find that all we need to enable binary training is a symmetric quantizer, good initialization, and careful hyperparameter selection. These techniques also lead to substantial improvements in multi-bit quantization. We demonstrate our unified quantization framework, denoted as UniQ, on the ImageNet dataset with various architectures such as ResNet-18, ResNet-34, and MobileNetV2. For multi-bit quantization, UniQ outperforms existing methods and achieves state-of-the-art accuracy. In binarization, the achieved accuracy is comparable to that of existing state-of-the-art methods, even without modifying the original architectures.
KW - binarization
KW - learnable quantizer
KW - machine learning
KW - model compression
KW - neuromorphic computing
KW - quantization
UR - http://www.scopus.com/inward/record.url?scp=85103262927&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3067889
DO - 10.1109/ACCESS.2021.3067889
M3 - Article
AN - SCOPUS:85103262927
SN - 2169-3536
VL - 9
SP - 47194
EP - 47203
JO - IEEE Access
JF - IEEE Access
M1 - 9383003
ER -