research-article

Deep bilateral learning for real-time image enhancement

Published: 20 July 2017

    Abstract

    Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
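The runtime path described in the abstract (slice a low-resolution grid of affine coefficients with a full-resolution guide, then apply the resulting per-pixel transforms) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the grid layout `(gh, gw, gd, 3, 4)`, the function names `slice_grid`, `apply_affine`, and `enhance`, and the use of the channel mean as the guidance map are all hypothetical choices made for the example. The coefficient grid is built by hand here, whereas in the paper it is predicted by the network from a low-resolution copy of the input.

```python
import numpy as np

def slice_grid(grid, guide):
    """Trilinearly sample a bilateral grid of affine coefficients.

    grid:  (gh, gw, gd, 3, 4) array, one 3x4 affine color matrix per cell
           (assumed layout for this sketch).
    guide: (H, W) array in [0, 1], the full-resolution guidance map.
    Returns an (H, W, 3, 4) array of per-pixel affine matrices.
    """
    gh, gw, gd = grid.shape[:3]
    H, W = guide.shape
    # Continuous grid coordinates of every full-resolution pixel.
    ys = (np.arange(H) + 0.5) / H * gh - 0.5
    xs = (np.arange(W) + 0.5) / W * gw - 0.5
    gy = np.broadcast_to(ys[:, None], (H, W))
    gx = np.broadcast_to(xs[None, :], (H, W))
    gz = guide * gd - 0.5  # the guide value selects the depth coordinate

    y0 = np.floor(gy).astype(int)
    x0 = np.floor(gx).astype(int)
    z0 = np.floor(gz).astype(int)

    out = np.zeros((H, W) + grid.shape[3:], dtype=np.float64)
    for dy in (0, 1):
        for dx in (0, 1):
            for dz in (0, 1):
                yi, xi, zi = y0 + dy, x0 + dx, z0 + dz
                # Linear (tent) weight of this corner along each axis.
                w = ((1.0 - np.abs(gy - yi))
                     * (1.0 - np.abs(gx - xi))
                     * (1.0 - np.abs(gz - zi)))
                # Clamp out-of-range corners to the nearest valid cell.
                yi = np.clip(yi, 0, gh - 1)
                xi = np.clip(xi, 0, gw - 1)
                zi = np.clip(zi, 0, gd - 1)
                out += w[..., None, None] * grid[yi, xi, zi]
    return out

def apply_affine(coeffs, image):
    """Apply per-pixel 3x4 affine color transforms to an (H, W, 3) image."""
    linear = np.einsum("hwck,hwk->hwc", coeffs[..., :3], image)
    return linear + coeffs[..., 3]

def enhance(grid, image):
    """Slice the grid with a guide, then transform the full-resolution image."""
    guide = image.mean(axis=-1)  # assumption: channel mean as guidance map
    return apply_affine(slice_grid(grid, guide), image)

# Usage: a grid holding the identity transform in every cell leaves the
# image unchanged, which is a quick sanity check of the slicing weights.
rng = np.random.default_rng(0)
image = rng.random((12, 16, 3))
identity_grid = np.zeros((4, 4, 8, 3, 4))
identity_grid[..., :3] = np.eye(3)
print(np.allclose(enhance(identity_grid, image), image))  # True
```

Because the depth coordinate comes from the guide, two neighboring pixels with very different guide values read from different grid cells, which is what lets the upsampled transforms follow image edges even though the grid itself is coarse.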

    Supplementary Material

    MP4 File (papers-0027.mp4)



    Published In

    ACM Transactions on Graphics, Volume 36, Issue 4
    August 2017
    2155 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/3072959
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 July 2017
    Published in TOG Volume 36, Issue 4


    Author Tags

    1. convolutional neural networks
    2. data-driven methods
    3. deep learning
    4. real-time image processing

    Qualifiers

    • Research-article

    Funding Sources

    • Toyota

    Article Metrics

    • Downloads (Last 12 months): 363
    • Downloads (Last 6 weeks): 45
    Reflects downloads up to 06 Aug 2024

    Cited By

    • (2024) Depth-Guided Bilateral Grid Feature Fusion Network for Dehazing. Sensors 24:11 (3589). DOI: 10.3390/s24113589. Online publication date: 2-Jun-2024
    • (2024) Improving Computer-Aided Thoracic Disease Diagnosis through Comparative Analysis Using Chest X-ray Images Taken at Different Times. Sensors 24:5 (1478). DOI: 10.3390/s24051478. Online publication date: 24-Feb-2024
    • (2024) Bilateral Guided Radiance Field Processing. ACM Transactions on Graphics 43:4 (1-13). DOI: 10.1145/3658148. Online publication date: 19-Jul-2024
    • (2024) MMDCP: An Image Enhancement Algorithm Incorporating Multi-Channel Phase Activation and Multi-Constrained Dark Channel Prior. International Journal of Pattern Recognition and Artificial Intelligence 38:04. DOI: 10.1142/S0218001424540053. Online publication date: 25-Apr-2024
    • (2024) Airport runway foreign object detection adapted to low-light environments. Third International Conference on Advanced Manufacturing Technology and Electronic Information (AMTEI 2023) (8). DOI: 10.1117/12.3025709. Online publication date: 1-Apr-2024
    • (2024) 4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (1576-1586). DOI: 10.1109/WACV57701.2024.00161. Online publication date: 3-Jan-2024
    • (2024) PhISH-Net: Physics Inspired System for High Resolution Underwater Image Enhancement. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (1495-1505). DOI: 10.1109/WACV57701.2024.00153. Online publication date: 3-Jan-2024
    • (2024) Differentiable Image Data Augmentation and Its Applications: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:2 (1148-1164). DOI: 10.1109/TPAMI.2023.3330862. Online publication date: 1-Feb-2024
    • (2024) Learning Deep Representations for Photo Retouching. IEEE Transactions on Multimedia 26 (3153-3163). DOI: 10.1109/TMM.2023.3307903. Online publication date: 1-Jan-2024
    • (2024) A Semi-Supervised Underexposed Image Enhancement Network With Supervised Context Attention and Multi-Exposure Fusion. IEEE Transactions on Multimedia 26 (1229-1243). DOI: 10.1109/TMM.2023.3278380. Online publication date: 1-Jan-2024
