RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices

Published: 23 September 2014

Abstract

We present RayCore, a mobile ray-tracing hardware architecture. RayCore facilitates high-quality rendering effects, such as reflection, refraction, and shadows, on mobile devices by performing real-time Whitted ray tracing. RayCore consists of two major components: ray-tracing units (RTUs) based on a unified traversal and intersection pipeline, and a tree-building unit (TBU) for dynamic scenes. The overall RayCore architecture offers considerable benefits in terms of die area, memory access, and power consumption. We assess our architecture through FPGA and ASIC evaluations and demonstrate its performance on different benchmarks. According to the results, our architecture achieves high performance per unit area and per unit energy, making it highly suitable for use in mobile devices.

Supplementary Material

nah (nah.zip)
Supplemental movie and image files for RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices
MP4 File (a162-sidebyside.mp4)

      Published In

      ACM Transactions on Graphics  Volume 33, Issue 5
      August 2014
      152 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/2672594
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 23 September 2014
      Accepted: 01 March 2014
      Received: 01 December 2013
      Published in TOG Volume 33, Issue 5

      Author Tags

      1. Ray tracing
      2. global illumination
      3. kd-tree
      4. ray-tracing hardware

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Bibliometrics & Citations

      Article Metrics

      • Downloads (Last 12 months): 61
      • Downloads (Last 6 weeks): 8
      Reflects downloads up to 03 Jan 2025

      Cited By

      • (2024) QuickTree: A Fast Hardware BVH Construction Engine. Proceedings of the 21st ACM International Conference on Computing Frontiers, 294-297. DOI: 10.1145/3649153.3649292. Online publication date: 7-May-2024
      • (2024) MPRTA: An Efficient Multilevel Parallel Mobile Accelerator for High-Performance Ray Tracing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 32:2, 396-400. DOI: 10.1109/TVLSI.2023.3334711. Online publication date: 1-Feb-2024
      • (2024) Software-Hardware Codesign of Ray-Tracing Accelerator for Edge AR/VR With Viewpoint-Focused 3D Construction and Efficient Data Structure. 2024 IEEE 67th International Midwest Symposium on Circuits and Systems (MWSCAS), 267-271. DOI: 10.1109/MWSCAS60917.2024.10658949. Online publication date: 11-Aug-2024
      • (2024) Extending GPU Ray-Tracing Units for Hierarchical Search Acceleration. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), 1027-1040. DOI: 10.1109/MICRO61859.2024.00079. Online publication date: 2-Nov-2024
      • (2024) Real-Time Monte Carlo Denoising With Adaptive Fusion Network. IEEE Access 12, 29154-29165. DOI: 10.1109/ACCESS.2024.3369588. Online publication date: 2024
      • (2024) Highly immersive imaging: Depth of field effect implemented through ray tracing with multiple samples. Journal of Physics: Conference Series 2711:1, 012017. DOI: 10.1088/1742-6596/2711/1/012017. Online publication date: 1-Feb-2024
      • (2023) A survey on generative 3D digital humans based on neural networks: representation, rendering, and learning. SCIENTIA SINICA Informationis 53:10, 1858. DOI: 10.1360/SSI-2022-0319. Online publication date: 13-Oct-2023
      • (2023) An Architecture and Implementation of Real-Time Sound Propagation Hardware for Mobile Devices. SIGGRAPH Asia 2023 Conference Papers, 1-9. DOI: 10.1145/3610548.3618237. Online publication date: 10-Dec-2023
      • (2023) MMsRT: A Hardware Architecture for Ray Tracing in the Mobile Domain. Journal of Circuits, Systems and Computers 32:11. DOI: 10.1142/S021812662350192X. Online publication date: 2-Feb-2023
      • (2023) Post0-VR: Enabling Universal Realistic Rendering for Modern VR via Exploiting Architectural Similarity and Data Sharing. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 390-402. DOI: 10.1109/HPCA56546.2023.10071097. Online publication date: Feb-2023
