DOI: 10.1145/3503161.3548259
research-article

Best of Both Worlds: See and Understand Clearly in the Dark

Published: 10 October 2022

Abstract

The perception of low-light scenes has been gaining widespread attention as intelligent vision technology develops. However, existing techniques usually focus on a single task (e.g., enhancement) and neglect the others (e.g., detection), which makes it difficult to perform all of them well at the same time. To overcome this limitation, we propose a new method that handles visual quality enhancement and semantic-related tasks (e.g., detection, segmentation) simultaneously in a unified framework. Specifically, we build a cascaded architecture to meet the task requirements. To strengthen the coupling between the two tasks and achieve mutual guidance, we develop a new contrastive-alternative learning strategy for learning the model parameters, which largely improves the representational capacity of the cascaded architecture. Notably, the contrastive learning mechanism establishes communication between the two objective tasks, which extends the capability of contrastive learning to some extent. Finally, extensive experiments validate the advantages of our method over other state-of-the-art works in enhancement, detection, and segmentation. A series of analytical evaluations further demonstrates the effectiveness of the method. The code is available at https://github.com/k914/contrastive-alternative-learning.
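
The abstract does not state the loss or the update schedule, so the following is only a minimal sketch of the two generic ingredients it names: a contrastive objective and alternating parameter updates. It uses the standard InfoNCE loss as a stand-in, not the authors' formulation. The function names, the temperature value, and the labels "enhancer" and "task_head" are hypothetical choices made for illustration.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (assumed stand-in, not the paper's loss).

    The loss is small when the anchor is closer (in cosine similarity)
    to the positive than to any of the negatives.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # The positive logit comes first, followed by one logit per negative.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]


def alternating_schedule(num_steps):
    """Hypothetical schedule: update the two cascaded modules in turn."""
    return ["enhancer" if step % 2 == 0 else "task_head"
            for step in range(num_steps)]


if __name__ == "__main__":
    # Anchor aligned with the positive: loss close to zero.
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    # Anchor aligned with a negative instead: loss is large.
    print(info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]]))
    print(alternating_schedule(4))
```

In a cascaded setup of this kind, one would presumably compute such a loss on features from the enhancement stage and the semantic stage, and freeze one module while updating the other; the released code at the URL above is the authoritative reference for what the authors actually do.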

Supplementary Material

MP4 File (MM22-fp2165.mp4)
Presentation video of "Best of Both Worlds: See and Understand Clearly in the Dark".

    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. contrastive alternative learning
    2. dark face detection
    3. low-light image enhancement
    4. nighttime semantic segmentation

    Qualifiers

    • Research-article

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 125
    • Downloads (last 6 weeks): 9
    Reflects downloads up to 03 Feb 2025

    Cited By

    • (2024) Learning With Constraint Learning: New Perspective, Solution Strategy and Various Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:7 (5026-5043). DOI: 10.1109/TPAMI.2024.3364157. Online publication date: Jul-2024
    • (2024) Detection-Driven Exposure-Correction Network for Nighttime Drone-View Object Detection. IEEE Transactions on Geoscience and Remote Sensing 62 (1-14). DOI: 10.1109/TGRS.2024.3351134. Online publication date: 2024
    • (2024) Latent domain knowledge distillation for nighttime semantic segmentation. Engineering Applications of Artificial Intelligence 132 (107940). DOI: 10.1016/j.engappai.2024.107940. Online publication date: Jun-2024
    • (2024) Meta-Learning Based Knowledge Distillation for Domain Adaptive Nighttime Segmentation. Pattern Recognition and Computer Vision (31-45). DOI: 10.1007/978-981-97-8490-5_3. Online publication date: 7-Nov-2024
    • (2023) Learning Non-Uniform-Sampling for Ultra-High-Definition Image Enhancement. Proceedings of the 31st ACM International Conference on Multimedia (1412-1421). DOI: 10.1145/3581783.3611836. Online publication date: 26-Oct-2023
    • (2023) Empowering Low-Light Image Enhancer through Customized Learnable Priors. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (12525-12535). DOI: 10.1109/ICCV51070.2023.01154. Online publication date: 1-Oct-2023
    • (2023) FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Under Low-Light Vision. 2023 IEEE/CVF International Conference on Computer Vision (ICCV) (6702-6712). DOI: 10.1109/ICCV51070.2023.00619. Online publication date: 1-Oct-2023
