research-article

Self-supervised High Dynamic Range Imaging: What Can Be Learned from a Single 8-bit Video?

Published: 23 March 2024

Abstract

    Recently, deep learning-based methods for inverse tone mapping standard dynamic range (SDR) images to obtain high dynamic range (HDR) images have become very popular. These methods manage to fill over-exposed areas convincingly in terms of both detail and dynamic range. To be effective, deep learning-based methods need to learn from large datasets and transfer this knowledge to the network weights. In this work, we tackle this problem from a completely different perspective: what can we learn from a single SDR 8-bit video? With the presented self-supervised approach, we show that, in many cases, a single SDR video is sufficient to generate an HDR video of the same or better quality than other state-of-the-art methods.

    Supplementary Material

    3648570.supp (3648570.supp.pdf)
    Supplementary material
    Supplementary_Video (supplementary_video.mp4)
    Supplementary video

    Cited By

    • (2024) GDUI: Guided Diffusion Model for Unlabeled Images. Algorithms 17:3 (125). DOI: 10.3390/a17030125. Online publication date: 18-Mar-2024

        Published In

        ACM Transactions on Graphics, Volume 43, Issue 2
        April 2024
        199 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/3613549
        Editor: Carol O'Sullivan

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 23 March 2024
        Online AM: 20 February 2024
        Accepted: 26 January 2024
        Revised: 11 January 2024
        Received: 08 November 2022
        Published in TOG Volume 43, Issue 2

        Author Tags

        1. High dynamic range imaging
        2. inverse tone mapping
        3. deep learning
        4. computational photography

        Qualifiers

        • Research-article

        Article Metrics

        • Downloads (last 12 months): 438
        • Downloads (last 6 weeks): 88
