Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers

Xu, Junhao; Chen, Xie; Hu, Shoukang; Yu, Jianwei; Liu, Xunying; Meng, Helen

Computer Science > Machine Learning

arXiv:2111.14836 (cs)

[Submitted on 29 Nov 2021]

Abstract: The high memory consumption and computational cost of recurrent neural network language models (RNNLMs) limit their wider application on resource-constrained devices. In recent years, neural network quantization techniques capable of producing extremely low-bit compression, for example binarized RNNLMs, have attracted increasing research interest. Directly training quantized neural networks is difficult. By formulating quantized RNNLM training as an optimization problem, this paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM). This method can also flexibly adjust the trade-off between compression rate and model performance using tied low-bit quantization tables. Experiments on two tasks, Penn Treebank (PTB) and Switchboard (SWBD), suggest that the proposed ADMM quantization achieves a model size compression factor of up to 31 times over the full-precision baseline RNNLMs. Model training also converged 5 times faster than the baseline binarized RNNLM quantization.

Index Terms: Language models, Recurrent neural networks, Quantization, Alternating direction methods of multipliers.
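The abstract does not give the update equations, so the following is only a minimal sketch of the general ADMM pattern for quantized training, not the paper's implementation. It assumes a toy least-squares objective in place of the RNNLM loss and a single binary quantization table {-alpha, +alpha}; the names `project_binary` and `admm_quantize`, the penalty `rho`, and the iteration counts are all hypothetical. The pattern alternates three steps: gradient updates of the full-precision weights `w` with a penalty pulling them toward the quantized copy `g`, a projection of `w + lam` onto the table, and an update of the scaled dual variable `lam`.

```python
import numpy as np


def project_binary(v):
    """Project v onto {-alpha, +alpha}, choosing alpha to minimise squared error.

    For a fixed sign pattern sign(v), the least-squares optimal scale is mean(|v|).
    """
    alpha = np.abs(v).mean()
    signs = np.where(v >= 0, 1.0, -1.0)
    return alpha * signs


def admm_quantize(x, y, rho=1.0, outer_iters=50, inner_iters=20, seed=0):
    """Toy ADMM loop: minimise 0.5 * mean ||x w - y||^2 subject to w = g, g binary.

    Assumption: a least-squares loss stands in for the language-model loss.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    w = rng.normal(scale=0.1, size=(x.shape[1], y.shape[1]))
    g = project_binary(w)
    lam = np.zeros_like(w)  # scaled dual variable

    # Safe gradient step size from the Lipschitz constant of the penalised loss.
    step = 1.0 / (np.linalg.norm(x, 2) ** 2 / n + rho)

    for _ in range(outer_iters):
        # Step 1: update full-precision weights on loss + (rho / 2) * ||w - g + lam||^2.
        for _ in range(inner_iters):
            grad = x.T @ (x @ w - y) / n + rho * (w - g + lam)
            w -= step * grad
        # Step 2: project onto the quantization table.
        g = project_binary(w + lam)
        # Step 3: dual update accumulates the remaining constraint violation.
        lam += w - g
    return w, g


# Usage on synthetic data whose true weights are already binary (hypothetical setup).
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 8))
true_w = 0.5 * np.where(rng.normal(size=(8, 4)) >= 0, 1.0, -1.0)
y = x @ true_w
w_full, g_quant = admm_quantize(x, y)
print("quantized loss:", 0.5 * np.mean((x @ g_quant - y) ** 2))
print("distinct magnitudes in g:", np.unique(np.abs(g_quant)))
```

In this sketch each quantized weight needs only one sign bit plus a shared scale, which is the kind of storage saving low-bit tables aim for; a tied multi-bit table would replace `project_binary` with a projection onto a larger set of shared values.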

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2111.14836 [cs.LG]
	(or arXiv:2111.14836v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2111.14836

Submission history

From: Junhao Xu
[v1] Mon, 29 Nov 2021 09:30:06 UTC (155 KB)
