Abstract
Graph convolutional neural networks (GCNs) have emerged as an effective approach to extending deep learning to graph data analytics, but they are computationally challenging because of irregular graph structures and the large number of nodes in a graph. GCNs involve a chain of sparse-dense matrix multiplications with six loops, which results in a large design space for GCN accelerators. Prior work on GCN acceleration either employs limited loop optimization techniques or determines the design variables by random sampling, both of which can hardly exploit data reuse efficiently and thus degrade system efficiency. To overcome this limitation, this paper proposes GShuttle, a GCN acceleration scheme that maximizes memory access efficiency to achieve high performance and energy efficiency. GShuttle systematically explores loop optimization techniques for GCN acceleration and quantitatively analyzes the design objectives (e.g., the required numbers of DRAM accesses and SRAM accesses) through analytical calculation based on multiple design variables. GShuttle further employs two approaches, pruned search space sweeping and greedy search, to find the optimal design variables under given design constraints. We demonstrate the efficacy of GShuttle through evaluation on five widely used graph datasets. The experimental simulations show that GShuttle reduces the number of DRAM accesses by a factor of 1.5 and saves energy by a factor of 1.7 compared with state-of-the-art approaches.
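For readers unfamiliar with the computation pattern referred to above, the minimal NumPy sketch below (not part of GShuttle itself, and written with dense arrays purely for clarity) illustrates why one GCN layer O = A·X·W expands into six nested loops when expressed as two chained matrix multiplications. The execution order shown here, X·W first, is only one point in the design space; the loop ordering, tiling, and fusion of these six loops, together with the sparse storage of A, are the design variables an accelerator must choose.

```python
import numpy as np

def gcn_layer_six_loops(A, X, W):
    """Reference six-loop form of one GCN layer, O = A @ (X @ W).

    A: (N, N) adjacency matrix (sparse in practice, dense here for clarity)
    X: (N, F) node feature matrix
    W: (F, G) layer weight matrix
    Loops 1-3 compute B = X @ W; loops 4-6 compute O = A @ B.
    """
    N, F = X.shape
    G = W.shape[1]

    B = np.zeros((N, G))
    for i in range(N):                # loop 1: output rows of B
        for j in range(G):            # loop 2: output columns of B
            for k in range(F):        # loop 3: reduction over features
                B[i, j] += X[i, k] * W[k, j]

    O = np.zeros((N, G))
    for i in range(N):                # loop 4: output rows of O
        for j in range(G):            # loop 5: output columns of O
            for k in range(N):        # loop 6: reduction over neighbors
                O[i, j] += A[i, k] * B[k, j]
    return O
```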
Cite this article
Li, JJ., Wang, K., Zheng, H. et al. GShuttle: Optimizing Memory Access Efficiency for Graph Convolutional Neural Network Accelerators. J. Comput. Sci. Technol. 38, 115–127 (2023). https://doi.org/10.1007/s11390-023-2875-9