research-article
Open access

Differentiable programming for image processing and deep learning in Halide

Published: 30 July 2018
    Abstract

    Gradient-based optimization has enabled dramatic advances in computational imaging through techniques like deep learning and nonlinear optimization. These methods require gradients not just of simple mathematical functions, but of general programs which encode complex transformations of images and graphical data. Unfortunately, practitioners have traditionally been limited to either hand-deriving gradients of complex computations, or composing programs from a limited set of coarse-grained operators in deep learning frameworks. At the same time, writing programs with the level of performance needed for imaging and deep learning is prohibitively difficult for most programmers.
    We extend the image processing language Halide with general reverse-mode automatic differentiation (AD), and the ability to automatically optimize the implementation of gradient computations. This enables automatic computation of the gradients of arbitrary Halide programs, at high performance, with little programmer effort. A key challenge is to structure the gradient code to retain parallelism. We define a simple algorithm to automatically schedule these pipelines, and show how Halide's existing scheduling primitives can express and extend the key AD optimization of "checkpointing."
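As a rough illustration of the ideas in this paragraph, the sketch below differentiates a tiny two-stage pipeline (a 3-tap blur followed by a sum of squares) in reverse mode. It is plain Python, not Halide, and every name in it (`KERNEL`, `blur`, `loss`, `grad_loss`, `finite_difference`) is a hypothetical choice made for this example rather than anything taken from the paper. It assumes zero boundary conditions. The backward pass recomputes the blurred intermediate instead of storing it, which is the basic trade-off behind checkpointing, and it writes the adjoint of the blur as a gather so that each output element is independent.

```python
# Sketch only: plain Python, not Halide. All names here are hypothetical.

KERNEL = [0.25, 0.5, 0.25]  # 3-tap blur weights, assumed for illustration


def blur(x):
    """Forward stage: y[i] = sum_k KERNEL[k] * x[i + k - 1], zero outside bounds."""
    n = len(x)
    return [
        sum(KERNEL[k] * x[i + k - 1] for k in range(3) if 0 <= i + k - 1 < n)
        for i in range(n)
    ]


def loss(x):
    """Forward pipeline: blur, square each sample, then sum."""
    return sum(b * b for b in blur(x))


def grad_loss(x):
    """Reverse-mode gradient of loss with respect to every x[j]."""
    # Checkpointing-style choice: only the input x is kept from the forward
    # pass, so the blurred intermediate is recomputed here rather than stored.
    b = blur(x)
    # Adjoint of the square-and-sum stage: d loss / d b[i] = 2 * b[i].
    d_b = [2.0 * v for v in b]
    # Adjoint of the blur. Reversing the forward loop literally would scatter
    # d_b[i] into x[i-1], x[i], x[i+1]. Written as a gather instead, each
    # d_x[j] depends only on reads, so the outputs could be computed in parallel.
    n = len(x)
    return [
        sum(KERNEL[k] * d_b[j - k + 1] for k in range(3) if 0 <= j - k + 1 < n)
        for j in range(n)
    ]


def finite_difference(x, j, eps=1e-6):
    """Central-difference estimate of d loss / d x[j], used as a sanity check."""
    hi = list(x)
    lo = list(x)
    hi[j] += eps
    lo[j] -= eps
    return (loss(hi) - loss(lo)) / (2.0 * eps)
```

Comparing `grad_loss(x)` against `finite_difference(x, j)` for each index `j` is a simple way to check the hand-written adjoint on a small input.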
    Using this new tool, we show how to easily define new neural network layers which automatically compile to high-performance GPU implementations, and how to solve nonlinear inverse problems from computational imaging. Finally, we show how differentiable programming enables dramatically improving the quality of even traditional, feed-forward image processing algorithms, blurring the distinction between classical and deep methods.
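To make the phrase "nonlinear inverse problems" concrete, here is a minimal sketch of one: recovering an unknown exponent `gamma` from observations of `x ** gamma` by gradient descent. This toy problem is an assumption chosen for illustration and does not come from the paper. The names `TRUE_GAMMA`, `XS`, `YS`, `objective`, `objective_grad`, and `fit_gamma` are hypothetical, and the gradient is derived by hand here, standing in for what an automatic differentiation tool would generate.

```python
import math

# Hypothetical setup: known inputs in (0, 1) and observations y = x ** gamma.
TRUE_GAMMA = 2.2
XS = [0.2, 0.4, 0.6, 0.8]
YS = [x ** TRUE_GAMMA for x in XS]


def objective(gamma):
    """Sum of squared residuals between the model x ** gamma and the data."""
    return sum((x ** gamma - y) ** 2 for x, y in zip(XS, YS))


def objective_grad(gamma):
    """Hand-derived d objective / d gamma, using d(x ** g)/dg = x ** g * ln(x)."""
    return sum(
        2.0 * (x ** gamma - y) * (x ** gamma) * math.log(x)
        for x, y in zip(XS, YS)
    )


def fit_gamma(start=1.0, step=0.5, iterations=5000):
    """Plain gradient descent on the single unknown gamma."""
    gamma = start
    for _ in range(iterations):
        gamma -= step * objective_grad(gamma)
    return gamma
```

Calling `fit_gamma()` starts from `gamma = 1.0` and should move toward `TRUE_GAMMA`. The step size and iteration count are ad hoc values picked for this small data set, not recommendations.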

    Supplementary Material

    ZIP File (139-526.zip)
    Supplemental files.
    MP4 File (a139-li.mp4)

          Published In

          ACM Transactions on Graphics  Volume 37, Issue 4
          August 2018
          1670 pages
          ISSN:0730-0301
          EISSN:1557-7368
          DOI:10.1145/3197517
          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. automatic differentiation
          2. deep learning
          3. image processing

          Qualifiers

          • Research-article

          Funding Sources

          • Google
          • ADEPT Lab industrial sponsors
          • Intel Science and Technology Center for Agile Design
          • NSF/Intel Partnership on Computer Assisted Programming for Heterogeneous Architectures
          • Toyota
          • Siemens
          • SK Hynix

          Cited By

          • (2024) Aperture-Aware Lens Design. ACM SIGGRAPH 2024 Conference Papers, 1-10. DOI: 10.1145/3641519.3657398. Online publication date: 13-Jul-2024
          • (2024) Differentiable Image Data Augmentation and Its Applications: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:2, 1148-1164. DOI: 10.1109/TPAMI.2023.3330862. Online publication date: Feb-2024
          • (2024) DTDeMo: A Deep Learning-Based Two-Stage Image Demosaicing Model With Interpolation and Enhancement. IEEE Transactions on Computational Imaging 10, 1026-1039. DOI: 10.1109/TCI.2024.3426360. Online publication date: 2024
          • (2024) TapeFlow: Streaming Gradient Tapes in Automatic Differentiation. 2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 81-92. DOI: 10.1109/CGO57630.2024.10444805. Online publication date: 2-Mar-2024
          • (2024) SlidingConv: Domain-Specific Description of Sliding Discrete Cosine Transform Convolution for Halide. IEEE Access 12, 7563-7583. DOI: 10.1109/ACCESS.2023.3345660. Online publication date: 2024
          • (2023) A no-API approach to massive-parallel architectures. Keldysh Institute Preprints, 1-54. DOI: 10.20948/prepr-2023-58. Online publication date: 2023
          • (2023) SLANG.D: Fast, Modular and Differentiable Shader Programming. ACM Transactions on Graphics 42:6, 1-28. DOI: 10.1145/3618353. Online publication date: 5-Dec-2023
          • (2023) Grape: Practical and Efficient Graphed Execution for Dynamic Deep Neural Networks on GPUs. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 1364-1380. DOI: 10.1145/3613424.3614248. Online publication date: 28-Oct-2023
          • (2023) LAGrad: Statically Optimized Differentiable Programming in MLIR. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 228-238. DOI: 10.1145/3578360.3580259. Online publication date: 17-Feb-2023
          • (2023) Optimizing DNNs With Partially Equivalent Transformations and Automated Corrections. IEEE Transactions on Computers 72:12, 3546-3560. DOI: 10.1109/TC.2023.3307795. Online publication date: Dec-2023