Learned HDR Image Compression for Perceptually Optimal Storage and Display

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

High dynamic range (HDR) capture and display have seen significant growth in popularity driven by the advancements in technology and increasing consumer demand for superior image quality. As a result, HDR image compression is crucial to fully realize the benefits of HDR imaging without suffering from large file sizes and inefficient data handling. Conventionally, this is achieved by introducing a residual/gain map as additional metadata to bridge the gap between HDR and low dynamic range (LDR) images, making the former compatible with LDR image codecs but offering suboptimal rate-distortion performance. In this work, we initiate efforts towards end-to-end optimized HDR image compression for perceptually optimal storage and display. Specifically, we learn to compress an HDR image into two bitstreams: one for generating an LDR image to ensure compatibility with legacy LDR displays, and another as side information to aid HDR image reconstruction from the output LDR image. To measure the perceptual quality of output HDR and LDR images, we use two recently proposed image distortion metrics, both validated against human perceptual data of image quality and with reference to the uncompressed HDR image. Through end-to-end optimization for rate-distortion performance, our method dramatically improves HDR and LDR image quality at all bit rates. The code is available at https://github.com/cpb68/EPIC-HDR/.
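As a rough illustration of the end-to-end rate-distortion optimization described in the abstract, the sketch below combines the rates of the two bitstreams with the two distortion terms into a single training objective. The weighted-sum form, the names `RDTerms` and `joint_rd_loss`, and the trade-off weights `lam_ldr` and `lam_hdr` are assumptions made for illustration; the abstract does not specify the actual loss, and the distortion values here simply stand in for the two perceptual metrics it mentions.

```python
from dataclasses import dataclass


@dataclass
class RDTerms:
    """Per-image quantities entering a joint rate-distortion objective (hypothetical)."""
    rate_ldr_bpp: float   # bits per pixel of the LDR bitstream
    rate_side_bpp: float  # bits per pixel of the side-information bitstream
    dist_ldr: float       # perceptual distortion of the decoded LDR image
    dist_hdr: float       # perceptual distortion of the reconstructed HDR image


def joint_rd_loss(t: RDTerms, lam_ldr: float = 1.0, lam_hdr: float = 1.0) -> float:
    """Total rate plus weighted distortions; lower is better.

    Assumed form: R_ldr + R_side + lam_ldr * D_ldr + lam_hdr * D_hdr.
    """
    total_rate = t.rate_ldr_bpp + t.rate_side_bpp
    return total_rate + lam_ldr * t.dist_ldr + lam_hdr * t.dist_hdr


if __name__ == "__main__":
    terms = RDTerms(rate_ldr_bpp=0.5, rate_side_bpp=0.1, dist_ldr=0.2, dist_hdr=0.3)
    print(joint_rd_loss(terms, lam_ldr=2.0, lam_hdr=4.0))
```

Under this assumed form, raising `lam_ldr` or `lam_hdr` favors quality over file size, so sweeping the weights would trace out different points on a rate-distortion curve.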

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62071407), the Hong Kong RGC Early Career Scheme (2121382), and the Hong Kong ITC Innovation and Technology Fund (9440379 and 9440390).

Author information

Corresponding author

Correspondence to Kede Ma.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 50184 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cao, P. et al. (2025). Learned HDR Image Compression for Perceptually Optimal Storage and Display. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15107. Springer, Cham. https://doi.org/10.1007/978-3-031-72967-6_7

  • DOI: https://doi.org/10.1007/978-3-031-72967-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72966-9

  • Online ISBN: 978-3-031-72967-6

  • eBook Packages: Computer Science, Computer Science (R0)
