research-article

Real-time, High-resolution Depth Upsampling on Embedded Accelerators

Published: 27 March 2021

    Abstract

    High-resolution, low-latency applications in computer vision are ubiquitous in today’s world of mixed-reality devices. These innovations provide a platform that can leverage the improving technology of depth sensors and embedded accelerators to enable higher-resolution, lower-latency processing of 3D scenes using depth-upsampling algorithms. This research demonstrates that filter-based upsampling algorithms are feasible for mixed-reality applications on low-power hardware accelerators. We parallelized and evaluated a depth-upsampling algorithm on two devices: a reconfigurable-logic FPGA embedded within a low-power SoC, and a fixed-logic embedded graphics processing unit (GPU). We demonstrate that both accelerators can meet the real-time requirement of 11 ms latency for mixed-reality applications.
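
    To make "filter-based upsampling" concrete, the sketch below implements joint bilateral upsampling in the style of Kopf, Cohen, Lischinski, and Uyttendaele (ACM Transactions on Graphics, 2007). It is an illustration only: the abstract does not say which filter the authors accelerated, and the function name, parameters (`scale`, `radius`, `sigma_s`, `sigma_r`), and list-of-lists image layout are assumptions made for this example. Each high-resolution output pixel is a weighted average of nearby low-resolution depth samples, where the weight combines spatial distance with similarity in a high-resolution guide image.

```python
import math


def joint_bilateral_upsample(depth_lo, guide_hi, scale,
                             radius=2, sigma_s=1.0, sigma_r=0.1):
    """Upsample a low-res depth map to the resolution of a guide image.

    depth_lo : h x w list of lists of floats (low-resolution depth)
    guide_hi : H x W list of lists of floats in [0, 1] (high-res intensity)
    scale    : integer upsampling factor (assumed H ~= h * scale)
    All names and defaults here are illustrative assumptions.
    """
    h, w = len(depth_lo), len(depth_lo[0])
    H, W = len(guide_hi), len(guide_hi[0])
    out = [[0.0] * W for _ in range(H)]

    for y in range(H):
        for x in range(W):
            # Position of this output pixel in low-res coordinates
            # (fractional in general).
            ly, lx = y / scale, x / scale
            cy = min(h - 1, int(round(ly)))
            cx = min(w - 1, int(round(lx)))

            total, norm = 0.0, 0.0
            for qy in range(max(0, cy - radius), min(h, cy + radius + 1)):
                for qx in range(max(0, cx - radius), min(w, cx + radius + 1)):
                    # Guide pixel that corresponds to the low-res sample.
                    gy = min(H - 1, qy * scale)
                    gx = min(W - 1, qx * scale)

                    # Spatial weight: distance measured in low-res pixels.
                    d2 = (qy - ly) ** 2 + (qx - lx) ** 2
                    w_s = math.exp(-d2 / (2.0 * sigma_s ** 2))

                    # Range weight: intensity difference in the guide.
                    r = guide_hi[y][x] - guide_hi[gy][gx]
                    w_r = math.exp(-(r * r) / (2.0 * sigma_r ** 2))

                    total += w_s * w_r * depth_lo[qy][qx]
                    norm += w_s * w_r

            # Fall back to the nearest sample if every weight underflowed.
            out[y][x] = total / norm if norm > 0.0 else depth_lo[cy][cx]
    return out


# Small example: a depth step that coincides with an intensity edge.
depth_lo = [[1.0, 1.0, 5.0, 5.0],
            [1.0, 1.0, 5.0, 5.0]]
guide_hi = [[0.0] * 4 + [1.0] * 4 for _ in range(4)]
depth_hi = joint_bilateral_upsample(depth_lo, guide_hi, scale=2)
```

    Every output pixel is computed independently from a small fixed window, which is the property that makes filters of this kind natural candidates for parallel hardware such as FPGAs and GPUs. This pure-Python loop is for readability and makes no claim about meeting the 11 ms budget.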

    Cited By

    • (2022) A CNN Hardware Accelerator Using Triangle-based Convolution. ACM Journal on Emerging Technologies in Computing Systems 18:4 (1-23). DOI: 10.1145/3544975. Online publication date: 27-Jun-2022

    Published In

    ACM Transactions on Embedded Computing Systems, Volume 20, Issue 3
    May 2021
    217 pages
    ISSN:1539-9087
    EISSN:1558-3465
    DOI:10.1145/3458920
    Editor: Tulika Mitra
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 March 2021
    Accepted: 01 November 2020
    Revised: 01 October 2020
    Received: 01 June 2020
    Published in TECS Volume 20, Issue 3

    Author Tags

    1. FPGA
    2. GPU
    3. Real time
    4. depth upsampling
    5. high-level synthesis
    6. image processing
    7. time-of-flight sensor

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Science Foundation
