Efficient Adaptive Activation Rounding for Post-Training Quantization

Li, Zhengyi; Guo, Cong; Zhu, Zhanda; Zhou, Yangjie; Qiu, Yuxian; Gao, Xiaotian; Leng, Jingwen; Guo, Minyi


arXiv:2208.11945 (cs)

[Submitted on 25 Aug 2022 (v1), last revised 24 Aug 2023 (this version, v3)]

Abstract: Post-training quantization is attracting increasing attention because it makes deploying quantized neural networks convenient. Although rounding-to-nearest remains the prevailing method for DNN quantization, prior research has shown that it is suboptimal for weight quantization and has proposed optimizing weight rounding schemes against output error rather than the traditional weight quantization error. Our study reveals that similar rounding challenges also extend to activation quantization. Although the idea generalizes easily, the difficulty lies in the dynamic nature of activations: the rounding must adapt to varying activations, and doing so incurs runtime overhead. To tackle this, we propose the AQuant quantization framework, which takes a novel perspective and reduces output error by adjusting the rounding schemes of activations. Instead of using the constant rounding border 0.5 of the rounding-to-nearest operation, we make the border a function of the activation value, so that the adaptive border changes how each activation is rounded. To limit the runtime overhead, we use a coarse-grained version of the border function. Finally, we introduce our framework to optimize the border function. Extensive experiments show that AQuant achieves notable improvements compared to state-of-the-art works and pushes the accuracy of ResNet-18 up to 60.31% under 2-bit weight and activation quantization.
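
The difference between a constant rounding border and a value-dependent one can be illustrated with a minimal sketch. This is not the AQuant implementation: the function names, the unsigned uniform quantizer, and the lookup table in `example_border` are all hypothetical, and the piecewise-constant border is only one assumed reading of "coarse-grained". In the paper the border function is optimized, whereas here its values are hand-picked.

```python
import math


def round_with_border(v, border):
    """Round v up when its fractional part reaches `border`, otherwise down.

    With border = 0.5 this is rounding-to-nearest (ties rounded up).
    """
    base = math.floor(v)
    frac = v - base
    return base + (1 if frac >= border else 0)


def quantize(x, scale, border_fn, n_bits=2):
    """Quantize one activation onto an unsigned n_bits grid (assumed quantizer).

    `border_fn` maps the scaled activation to the rounding border used for it.
    """
    v = x / scale
    q = round_with_border(v, border_fn(v))
    q = max(0, min(2 ** n_bits - 1, q))  # clamp to the representable range
    return q * scale


def nearest_border(v):
    """Constant border of rounding-to-nearest."""
    return 0.5


def example_border(v):
    """Hypothetical coarse-grained border: one hand-picked value per integer bin.

    These numbers are illustrative assumptions, not learned or taken from the paper.
    """
    table = {0: 0.4, 1: 0.5, 2: 0.6}
    return table.get(math.floor(v), 0.5)
```

Under these assumptions, an activation of 0.45 with scale 1.0 rounds down to 0 under `nearest_border` but up to 1 under `example_border`, because the border in its bin is lowered to 0.4. The table lookup costs far less than evaluating an arbitrary function per activation, which is the kind of trade-off the coarse-grained border is meant to address.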

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2208.11945 [cs.LG]
	(or arXiv:2208.11945v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2208.11945

Submission history

From: Zhengyi Li
[v1] Thu, 25 Aug 2022 09:02:32 UTC (586 KB)
[v2] Mon, 6 Feb 2023 14:36:58 UTC (1,149 KB)
[v3] Thu, 24 Aug 2023 01:53:24 UTC (751 KB)
