research-article

A Universal Optimization Framework for Learning-based Image Codec

Published: 25 August 2023
    Abstract

    Recently, machine learning-based image compression has attracted increasing interest and is approaching the state-of-the-art compression ratio. Unlike traditional codecs, however, it lacks a universal optimization method for finding an efficient representation of each individual image. In this paper, we develop a plug-and-play optimization framework for achieving a higher compression ratio, which can be flexibly applied to existing and potential future compression networks. To make the latent representation more efficient, we propose a novel latent optimization algorithm that adaptively removes the redundancy for each image. Additionally, inspired by the potential of side information in traditional codecs, we introduce side information into our framework and integrate side information optimization with latent optimization to further enhance the compression ratio. In particular, with joint side information and latent optimization, we can achieve fine rate control using only a single model instead of training different models for different rate-distortion trade-offs, which significantly reduces the training and storage cost of supporting multiple bit rates. Experimental results demonstrate that our proposed framework can remarkably boost the machine learning-based compression ratio, achieving more than 10% additional bit rate saving on three different representative network structures. With the proposed optimization framework, we achieve a 7.6% bit rate saving against the latest traditional coding standard, VVC, on the Kodak dataset, yielding the state-of-the-art compression ratio.
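
    The general idea of per-image latent optimization can be illustrated with a toy sketch. This is not the algorithm proposed in the paper, whose details are not given in the abstract. It assumes a frozen linear "decoder" W, an L1 norm as a stand-in for the bit rate, mean squared error as the distortion, and plain gradient descent on the latent y; it also ignores quantization. All names (decode, rd_cost, refine_latent, lam) are hypothetical.

```python
# Toy per-image latent refinement. NOT the paper's algorithm.
# Assumptions: frozen linear decoder W, L1 rate proxy, MSE distortion,
# no quantization, plain gradient descent on the latent only.

def decode(W, y):
    """Reconstruct x_hat = W @ y with a fixed (frozen) linear decoder."""
    return [sum(w * v for w, v in zip(row, y)) for row in W]


def rd_cost(W, x, y, lam):
    """Return (J, R, D) with J = R + lam * D, R = sum|y_j|, D = MSE."""
    x_hat = decode(W, y)
    dist = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    rate = sum(abs(v) for v in y)
    return rate + lam * dist, rate, dist


def refine_latent(W, x, y0, lam, steps=200, lr=0.01):
    """Minimize J over y, starting from y0; W is never updated."""
    y = list(y0)
    best_y, best_cost = list(y), rd_cost(W, x, y, lam)[0]
    for _ in range(steps):
        x_hat = decode(W, y)
        err = [b - a for a, b in zip(x, x_hat)]  # x_hat - x
        grad = []
        for j in range(len(y)):
            # Gradient of the MSE term with respect to y_j.
            d_dist = 2.0 * sum(W[i][j] * err[i] for i in range(len(x))) / len(x)
            # Subgradient of |y_j| (the rate proxy): sign of y_j.
            d_rate = (y[j] > 0) - (y[j] < 0)
            grad.append(d_rate + lam * d_dist)
        y = [v - lr * g for v, g in zip(y, grad)]
        cost = rd_cost(W, x, y, lam)[0]
        if cost < best_cost:  # keep the best latent seen so far
            best_y, best_cost = list(y), cost
    return best_y


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up decoder
    x = [1.0, 2.0, 3.0]                        # made-up "image"
    y0 = [1.5, 2.5]                            # made-up initial latent
    for lam in (1.0, 10.0):
        y = refine_latent(W, x, y0, lam)
        print(lam, rd_cost(W, x, y, lam))
```

    In this toy setting, changing only lam moves the refined latent along a rate-distortion trade-off while the decoder stays fixed, which loosely mirrors the abstract's point about rate control with a single model. A real learned codec would replace W with the trained synthesis network and the L1 proxy with the entropy model's estimated bit cost.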

    Cited By

    • (2024) Fair and Robust Federated Learning via Decentralized and Adaptive Aggregation based on Blockchain. ACM Transactions on Sensor Networks. DOI: 10.1145/3673656. Online publication date: 17-Jun-2024
    • (2024) Scene Graph Lossless Compression with Adaptive Prediction for Objects and Relations. ACM Transactions on Multimedia Computing, Communications, and Applications 20:7 (1-23). DOI: 10.1145/3649503. Online publication date: 27-Mar-2024

      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 20, Issue 1
      January 2024
      639 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3613542
      Editor: Abdulmotaleb El Saddik

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 25 August 2023
      Online AM: 03 July 2023
      Accepted: 04 January 2023
      Revised: 14 October 2022
      Received: 04 May 2022
      Published in TOMM Volume 20, Issue 1

      Author Tags

      1. Image compression
      2. rate distortion optimization
      3. universal optimization framework
      4. machine learning

      Qualifiers

      • Research-article
