research-article

Deep image compression with multi-stage representation

Authors:

Fan Li

Volume 79, Issue C

https://doi.org/10.1016/j.jvcir.2021.103226

Published: 01 August 2021

Abstract

Although deep learning-based image compression methods have shown impressive coding performance, most existing methods still suffer from two limitations: (1) the compression efficiency gain is unpredictable when convolutional neural networks of different depths are adopted, and (2) there is no accurate model for estimating the entropy during training. To address these two problems, this paper proposes a deep multi-stage representation based image compression (MSRIC) method. With this architecture, both the detail information of shallow stages and the compact information of deep stages can be used for image reconstruction. Furthermore, a data-dependent channel-wised factorized probability model (DCFPM) is adopted to increase the accuracy of entropy estimation. Experimental results indicate that the proposed method achieves better perceptual performance over a wide range of bit rates. Ablation studies are also carried out to validate the above-mentioned techniques.
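As a rough illustration of why a channel-wise factorized model can tighten the entropy estimate, the sketch below compares the ideal code length of a quantized latent under one probability table per channel against a single table shared by all channels. It is not the paper's DCFPM, whose details are not given in this abstract. The function names (`channel_bits`, `factorized_bits`, `pooled_bits`) and the toy latent are assumptions made for illustration, empirical histograms stand in for a learned probability model, and the cost of signalling the per-channel tables is ignored.

```python
import math
from collections import Counter


def channel_bits(symbols):
    """Ideal code length (bits) of a symbol list under its own empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    # Sum of -log2 p(s) over all symbols, grouped by distinct value.
    return -sum(n * math.log2(n / total) for n in counts.values())


def factorized_bits(latent):
    """Code length when every channel gets its own probability table.

    `latent` is a list of channels, each a flat list of quantized integers.
    """
    return sum(channel_bits(channel) for channel in latent)


def pooled_bits(latent):
    """Code length when one probability table is shared by all channels."""
    return channel_bits([s for channel in latent for s in channel])


# Hypothetical toy latent: two channels with clearly different statistics.
latent = [
    [0, 0, 0, 0, 0, 0, 1, 1],  # mostly zeros
    [4, 4, 5, 5, 4, 5, 4, 5],  # values concentrated elsewhere
]

print("per-channel tables:", round(factorized_bits(latent), 2), "bits")
print("single shared table:", round(pooled_bits(latent), 2), "bits")
```

When the channels have different statistics, the per-channel estimate is never larger than the shared-table estimate, which is the general motivation for modelling each channel separately.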

Highlights

• Extracting multi-stage representation of input images improves the compression efficiency.
• Data-dependent channel-wised factorized probability model improves the accuracy of entropy estimation.
• Both more efficient deep network architecture and more accurate entropy estimation improve the performance of deep image compression.
• Proper setting strategy of network architecture parameters maximizes the performance.

Cited By

Rajasoundaran S, Kumar S V N S, Selvi M, Ganapathy S, Kannan A (2022). Multi-tier block truncation coding model using genetic auto encoders for gray scale images. Multimedia Tools and Applications, 10.1007/s11042-022-13475-x, 81:29, (42621-42647). Online publication date: 1-Dec-2022
https://dl.acm.org/doi/10.1007/s11042-022-13475-x

Index Terms

Deep image compression with multi-stage representation
1. Computing methodologies
2. Information systems
  1. Data management systems
    1. Data structures
      1. Data layout
        Data compression

Index terms have been assigned to the content through auto-classification.

Published In

Journal of Visual Communication and Image Representation Volume 79, Issue C

Aug 2021

491 pages

ISSN:1047-3203

Copyright © 2021.

Publisher

Academic Press, Inc.

United States
