Kernel Quantization for Efficient Network Compression

Yu, Zhongzhi; Shi, Yemin; Huang, Tiejun; Yu, Yizhou

arXiv:2003.05148 (cs)

[Submitted on 11 Mar 2020]

Abstract: This paper presents Kernel Quantization (KQ), a novel network compression framework that aims to efficiently convert any pre-trained full-precision convolutional neural network (CNN) model into a low-precision version without significant performance loss. Unlike existing methods, which struggle to reduce the bit-length of individual weights, KQ has the potential to improve the compression ratio by treating the convolution kernel as the quantization unit. Inspired by the evolution from weight pruning to filter pruning, we propose to quantize at both the kernel level and the weight level. Instead of representing each weight parameter with a low-bit index, we learn a kernel codebook and replace every kernel in the convolution layer with its corresponding low-bit index. KQ can thus represent the weight tensor of a convolution layer with low-bit indexes and a kernel codebook of limited size, which enables it to achieve a significant compression ratio. We then apply 6-bit parameter quantization to the kernel codebook to further reduce redundancy. Extensive experiments on the ImageNet classification task show that KQ needs 1.05 and 1.62 bits on average in VGG and ResNet18, respectively, to represent each parameter in the convolution layer, and achieves the state-of-the-art compression ratio with little accuracy loss.
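The two-level scheme described in the abstract (a kernel codebook with low-bit indexes, followed by 6-bit quantization of the codebook entries) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the codebook is learned, so plain k-means is used here as a stand-in, the 6-bit step is assumed to be uniform min-max quantization, and all function names (`learn_codebook`, `quantize_codebook`, `average_bits_per_weight`) are hypothetical.

```python
import random


def nearest(kernel, codebook):
    """Index of the codebook entry closest to `kernel` (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(kernel, codebook[j])),
    )


def learn_codebook(kernels, codebook_size, iters=5, seed=0):
    """ASSUMPTION: k-means stands in for the paper's codebook learning."""
    rng = random.Random(seed)
    codebook = [list(k) for k in rng.sample(kernels, codebook_size)]
    for _ in range(iters):
        assign = [nearest(k, codebook) for k in kernels]
        for j in range(codebook_size):
            members = [k for k, a in zip(kernels, assign) if a == j]
            if members:  # keep the old entry if a cluster is empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def quantize_codebook(codebook, bits=6):
    """ASSUMPTION: uniform min-max quantization of every codebook weight."""
    flat = [w for k in codebook for w in k]
    lo, hi = min(flat), max(flat)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels or 1.0  # avoid a zero step for constant input
    return [[lo + round((w - lo) / step) * step for w in k] for k in codebook]


def average_bits_per_weight(num_kernels, kernel_size, codebook_size, codebook_bits=6):
    """Storage per weight: one index per kernel plus the shared codebook."""
    index_bits = (codebook_size - 1).bit_length()
    total = num_kernels * index_bits + codebook_size * kernel_size * codebook_bits
    return total / (num_kernels * kernel_size)


# Toy layer: 200 random 3x3 kernels, flattened to 9 weights each.
rng = random.Random(1)
kernels = [[rng.gauss(0.0, 1.0) for _ in range(9)] for _ in range(200)]

codebook = quantize_codebook(learn_codebook(kernels, codebook_size=16))
indexes = [nearest(k, codebook) for k in kernels]  # one 4-bit index per kernel

print(average_bits_per_weight(len(kernels), 9, len(codebook)))
```

The cost model in `average_bits_per_weight` shows why the kernel is an attractive quantization unit: the index cost is shared by all weights in a kernel, so for a layer with many kernels the average can approach `index_bits / kernel_size`, while the codebook overhead is amortized across the layer. The figures reported in the abstract (1.05 and 1.62 bits) come from the paper's own experiments and are not reproduced by this toy example.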

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2003.05148 [cs.LG]
	(or arXiv:2003.05148v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2003.05148

Submission history

From: Zhongzhi Yu
[v1] Wed, 11 Mar 2020 08:00:04 UTC (1,338 KB)
