research-article

ExtraNet: real-time extrapolated rendering for low-latency temporal supersampling

Published: 10 December 2021

Abstract

Both frame rate and latency are crucial to the performance of real-time rendering applications such as video games. Spatial supersampling methods, such as Deep Learning SuperSampling (DLSS), have proven successful at decreasing the rendering time of each frame by rendering at a lower resolution. But temporal supersampling methods that directly aim at producing more frames on the fly are still not practically available. This is mainly due to both their own computational cost and the latency introduced by interpolating frames from the future. In this paper, we present ExtraNet, an efficient neural network that predicts accurate shading results on an extrapolated frame, to minimize both the performance overhead and the latency. With the help of the rendered auxiliary geometry buffers of the extrapolated frame, and the temporally reliable motion vectors, we train our ExtraNet to perform two tasks simultaneously: irradiance in-painting for regions that cannot find historical correspondences, and accurate ghosting-free shading prediction for regions where temporal information is available. We present a robust hole-marking strategy to automate the classification of these tasks, as well as the data generation from a series of high-quality production-ready scenes. Finally, we use lightweight gated convolutions to enable fast inference. As a result, our ExtraNet is able to produce plausibly extrapolated frames without easily noticeable artifacts, delivering a 1.5× to nearly 2× increase in frame rates with minimized latency in practice.
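
The split into regions with and without historical correspondences can be illustrated with a minimal Python sketch. It is not the hole-marking strategy from the paper, which the abstract does not describe. It assumes integer per-pixel motion vectors that point from each pixel of the new frame to its source in the previous frame, and it treats only out-of-frame sources as holes. The name `warp_with_holes` and the data layout are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]             # shading values, indexed [y][x]
Motion = List[List[Tuple[int, int]]]  # assumed (dx, dy) offsets into the previous frame


def warp_with_holes(
    prev: Frame, motion: Motion
) -> Tuple[List[List[Optional[float]]], List[List[bool]]]:
    """Backward-warp `prev` into the new frame and mark pixels without history."""
    height, width = len(prev), len(prev[0])
    warped: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    holes = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = motion[y][x]
            sx, sy = x + dx, y + dy  # source position in the previous frame
            if 0 <= sx < width and 0 <= sy < height:
                warped[y][x] = prev[sy][sx]  # history available: reuse shading
            else:
                holes[y][x] = True           # no correspondence: needs in-painting
    return warped, holes


# Toy example: every pixel takes its value from one pixel to the right,
# so the rightmost column has no source inside the previous frame.
prev_frame = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
motion_vectors = [[(1, 0)] * 3 for _ in range(2)]
warped_frame, hole_mask = warp_with_holes(prev_frame, motion_vectors)
print(warped_frame)
print(hole_mask)
```

In this toy setup the hole mask is what would separate the two tasks named in the abstract: marked pixels would be handed to in-painting, and the remaining pixels would carry warped history into shading prediction. A real renderer would also have to detect disocclusions inside the frame, which this sketch ignores.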

Supplementary Material

MP4 File (a278-guo.mp4)

      Published In

      ACM Transactions on Graphics, Volume 40, Issue 6
      December 2021
      1351 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/3478513
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. extrapolation
      2. low-latency
      3. supersampling
      4. temporal

      Qualifiers

      • Research-article

      Funding Sources

      • NSFC

      Article Metrics

      • Downloads (Last 12 months): 622
      • Downloads (Last 6 weeks): 65
      Reflects downloads up to 10 Nov 2024

      Cited By

      • (2024) DHyper: A Recurrent Dual Hypergraph Neural Network for Event Prediction in Temporal Knowledge Graphs. ACM Transactions on Information Systems 42:5 (1-23). DOI: 10.1145/3653015. Online publication date: 29-Apr-2024
      • (2024) Towards Unified Representation Learning for Career Mobility Analysis with Trajectory Hypergraph. ACM Transactions on Information Systems 42:4 (1-28). DOI: 10.1145/3651158. Online publication date: 26-Apr-2024
      • (2024) Deep Fourier-based Arbitrary-scale Super-resolution for Real-time Rendering. ACM SIGGRAPH 2024 Conference Papers (1-11). DOI: 10.1145/3641519.3657439. Online publication date: 13-Jul-2024
      • (2024) Mob-FGSR: Frame Generation and Super Resolution for Mobile Real-Time Rendering. Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24 (1-11). DOI: 10.1145/3641519.3657424. Online publication date: 13-Jul-2024
      • (2024) A Survey on Graph Representation Learning Methods. ACM Transactions on Intelligent Systems and Technology 15:1 (1-55). DOI: 10.1145/3633518. Online publication date: 16-Jan-2024
      • (2024) DHMAE: A Disentangled Hypergraph Masked Autoencoder for Group Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (914-923). DOI: 10.1145/3626772.3657699. Online publication date: 10-Jul-2024
      • (2024) MNSS: Neural Supersampling Framework for Real-Time Rendering on Mobile Devices. IEEE Transactions on Visualization and Computer Graphics 30:7 (4271-4284). DOI: 10.1109/TVCG.2023.3259141. Online publication date: Jul-2024
      • (2024) PopGR: Popularity reweighting for debiasing in group recommendation. World Wide Web 27:4. DOI: 10.1007/s11280-024-01272-5. Online publication date: 17-May-2024
      • (2024) GroupMO: a memory-augmented meta-optimized model for group recommendation. World Wide Web 27:3. DOI: 10.1007/s11280-024-01267-2. Online publication date: 18-Apr-2024
      • (2024) Group-to-group recommendation with neural graph matching. World Wide Web 27:2. DOI: 10.1007/s11280-024-01250-x. Online publication date: 5-Mar-2024
