Learned HDR Image Compression for Perceptually Optimal Storage and Display

Cao, Peibei; Chen, Haoyu; Ma, Jingzhe; Yuan, Yu-Chieh; Xie, Zhiyong; Xie, Xin; Bai, Haiqing; Ma, Kede

doi:10.1007/978-3-031-72967-6_7

Peibei Cao ORCID: orcid.org/0000-0001-7463-0409¹³,
Haoyu Chen ORCID: orcid.org/0000-0001-8093-9648¹³,
Jingzhe Ma¹⁴,
Yu-Chieh Yuan¹⁴,
Zhiyong Xie¹⁴,
Xin Xie¹⁴,
Haiqing Bai¹⁴ &
Kede Ma ORCID: orcid.org/0000-0001-8608-1128¹³,¹⁵

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15107)

Included in the following conference series:

European Conference on Computer Vision

Abstract

High dynamic range (HDR) capture and display have seen significant growth in popularity driven by the advancements in technology and increasing consumer demand for superior image quality. As a result, HDR image compression is crucial to fully realize the benefits of HDR imaging without suffering from large file sizes and inefficient data handling. Conventionally, this is achieved by introducing a residual/gain map as additional metadata to bridge the gap between HDR and low dynamic range (LDR) images, making the former compatible with LDR image codecs but offering suboptimal rate-distortion performance. In this work, we initiate efforts towards end-to-end optimized HDR image compression for perceptually optimal storage and display. Specifically, we learn to compress an HDR image into two bitstreams: one for generating an LDR image to ensure compatibility with legacy LDR displays, and another as side information to aid HDR image reconstruction from the output LDR image. To measure the perceptual quality of output HDR and LDR images, we use two recently proposed image distortion metrics, both validated against human perceptual data of image quality and with reference to the uncompressed HDR image. Through end-to-end optimization for rate-distortion performance, our method dramatically improves HDR and LDR image quality at all bit rates. The code is available at https://github.com/cpb68/EPIC-HDR/.
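
The abstract describes two bitstreams (one for the LDR image, one as side information for HDR reconstruction) optimized jointly for rate-distortion performance. A minimal sketch of what such a joint objective could look like follows. It assumes a simple weighted sum of total rate and the two distortion terms; the function name `joint_rd_loss`, the weights `lam_ldr` and `lam_hdr`, and the bits-per-pixel normalization are illustrative assumptions, not taken from the paper or its released code.

```python
# Hypothetical sketch of a joint rate-distortion objective for a
# two-bitstream codec. Names and weights are illustrative assumptions,
# not the authors' implementation.

def joint_rd_loss(bits_ldr, bits_side, num_pixels, d_ldr, d_hdr,
                  lam_ldr=1.0, lam_hdr=1.0):
    """Return total rate in bits per pixel plus weighted distortions.

    bits_ldr  : estimated bits of the LDR bitstream
    bits_side : estimated bits of the side-information bitstream
    d_ldr     : distortion of the decoded LDR image (any scalar metric)
    d_hdr     : distortion of the reconstructed HDR image (any scalar metric)
    """
    if num_pixels <= 0:
        raise ValueError("num_pixels must be positive")
    # Both bitstreams count toward the storage cost.
    rate_bpp = (bits_ldr + bits_side) / num_pixels
    # Each distortion term gets its own trade-off weight.
    return rate_bpp + lam_ldr * d_ldr + lam_hdr * d_hdr


if __name__ == "__main__":
    loss = joint_rd_loss(bits_ldr=1000, bits_side=500, num_pixels=1000,
                         d_ldr=0.1, d_hdr=0.2)
    print(f"joint loss: {loss:.3f}")
```

In this sketch, raising `lam_hdr` relative to `lam_ldr` would favor HDR reconstruction quality over LDR quality at a given rate; how the paper actually balances the two terms is not stated in the abstract.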

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62071407), the Hong Kong RGC Early Career Scheme (2121382), and the Hong Kong ITC Innovation and Technology Fund (9440379 and 9440390).

Author information

Authors and Affiliations

Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
Peibei Cao, Haoyu Chen & Kede Ma
Xellar Biosystems, Boston, USA
Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie & Haiqing Bai
Shenzhen Research Institute, City University of Hong Kong, Kowloon, Hong Kong
Kede Ma

Corresponding author

Correspondence to Kede Ma.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

About this paper

Cite this paper

Cao, P. et al. (2025). Learned HDR Image Compression for Perceptually Optimal Storage and Display. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15107. Springer, Cham. https://doi.org/10.1007/978-3-031-72967-6_7

DOI: https://doi.org/10.1007/978-3-031-72967-6_7
Published: 03 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72966-9
Online ISBN: 978-3-031-72967-6
eBook Packages: Computer Science (R0)
