GShuttle: Optimizing Memory Access Efficiency for Graph Convolutional Neural Network Accelerators

  • Regular Paper
  • Published:
Journal of Computer Science and Technology

Abstract

Graph convolutional neural networks (GCNs) have emerged as an effective approach to extending deep learning to graph data analytics, but they are computationally challenging because real-world graphs are irregular and contain large numbers of nodes. GCNs involve chained sparse-dense matrix multiplications with six loops, which results in a large design space for GCN accelerators. Prior work on GCN acceleration either employs limited loop optimization techniques or determines the design variables by random sampling, and therefore can hardly exploit data reuse efficiently, degrading system efficiency. To overcome this limitation, this paper proposes GShuttle, a GCN acceleration scheme that maximizes memory access efficiency to achieve high performance and energy efficiency. GShuttle systematically explores loop optimization techniques for GCN acceleration and quantitatively analyzes the design objectives (e.g., the required numbers of DRAM and SRAM accesses) through analytical calculation based on multiple design variables. GShuttle further employs two approaches, pruned search space sweeping and greedy search, to find the optimal design variables under given design constraints. We demonstrate the efficacy of GShuttle by evaluation on five widely used graph datasets. The experimental simulations show that GShuttle reduces the number of DRAM accesses by a factor of 1.5 and saves energy by a factor of 1.7 compared with state-of-the-art approaches.
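
The chained sparse-dense multiplications and the search over design variables described above can be made concrete with a small sketch. The Python code below is an illustration only, not GShuttle's actual analytical model or search procedure: it pairs a toy DRAM-traffic estimate for a tiled O = A(XW) GCN layer with a greedy search over tile sizes under an assumed on-chip SRAM budget. The tile names (tn, tk, tc), the cost formulas, the SRAM budget, and the matrix dimensions are all hypothetical.

# Illustrative sketch only (assumed model, not GShuttle's): a GCN layer computes
# O = A @ (X @ W), two chained matrix multiplications whose tiled form spans the
# six loops mentioned in the abstract. dram_accesses() is a toy traffic estimate;
# greedy_tile_search() grows whichever tile dimension helps most, mimicking the
# spirit of a greedy search over design variables under an SRAM constraint.

def dram_accesses(N, K, C, tn, tk, tc, sram_words):
    """Toy DRAM-traffic estimate for a tiled N x K by K x C multiplication.
    Returns None if the three operand tiles do not fit in on-chip SRAM."""
    if tn * tk + tk * tc + tn * tc > sram_words:
        return None
    col_tiles = -(-C // tc)          # ceil(C / tc)
    row_tiles = -(-N // tn)          # ceil(N / tn)
    x_traffic = N * K * col_tiles    # X re-read once per output column tile
    w_traffic = K * C * row_tiles    # W re-read once per output row tile
    o_traffic = N * C                # partial sums held on chip, written once
    return x_traffic + w_traffic + o_traffic

def greedy_tile_search(N, K, C, sram_words, step=16):
    """Greedy search: repeatedly enlarge the tile dimension that lowers the
    estimated DRAM traffic, stopping when no single step improves it."""
    tiles = [step, step, step]                        # (tn, tk, tc)
    best = dram_accesses(N, K, C, *tiles, sram_words)
    if best is None:
        return tiles, None   # initial tiles do not fit; caller should shrink step
    improved = True
    while improved:
        improved = False
        for d in range(3):
            cand = tiles[:]
            cand[d] += step
            cost = dram_accesses(N, K, C, *cand, sram_words)
            if cost is not None and cost < best:
                tiles, best, improved = cand, cost, True
    return tiles, best

# Example with Cora-like dimensions (2708 nodes, 1433 features, 16 hidden units)
# and an assumed 64K-word SRAM budget.
print(greedy_tile_search(N=2708, K=1433, C=16, sram_words=64 * 1024))

In the paper's actual scheme, the analytical model counts both DRAM and SRAM accesses and the search also sweeps a pruned design space; the single toy formula above only shows how an access-count objective can drive the choice of tile sizes.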


Author information

Corresponding author

Correspondence to Jia-Jun Li.

Supplementary Information

ESM 1

(PDF 911 kb)

ESM 2

(MKV 6322 kb)


About this article


Cite this article

Li, JJ., Wang, K., Zheng, H. et al. GShuttle: Optimizing Memory Access Efficiency for Graph Convolutional Neural Network Accelerators. J. Comput. Sci. Technol. 38, 115–127 (2023). https://doi.org/10.1007/s11390-023-2875-9


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-023-2875-9

Keywords