research-article
Open access

PimPam: Efficient Graph Pattern Matching on Real Processing-in-Memory Hardware

Published: 30 May 2024

Abstract

Graph pattern matching is powerful and applies to many application domains. Despite recent algorithmic advances, matching patterns in large-scale real-world graphs still faces a memory access bottleneck on conventional computing systems. Processing-in-memory (PIM) is an emerging hardware architecture paradigm that places computing cores inside memory devices to alleviate the memory wall. Real PIM hardware has recently become commercially available. In this work, we leverage a real PIM hardware platform to build a graph pattern matching framework, PimPam, that benefits from the platform's abundant computation and memory bandwidth resources. We propose four key optimizations in PimPam to improve its efficiency: (1) load-aware task assignment to ensure load balance, (2) space-efficient and parallel data partitioning to prepare input data for PIM cores, (3) adaptive multi-threading collaboration to automatically select the best parallelization strategy during processing, and (4) dynamic bitmap structures that accelerate set intersection, the key operation. When evaluated on five patterns and six real-world graphs, PimPam outperforms the state-of-the-art CPU baseline system by 22.5x on average and up to 71.7x.
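
The abstract names its optimizations without giving their details, so the following is only a generic illustration of two of the ideas, not PimPam's implementation. It sketches load-aware task assignment as a greedy heaviest-first heuristic and set intersection as a bitwise AND over neighbor bitmaps, applied to triangle counting. All function names, the use of a per-task cost estimate, and the use of Python integers as bitmaps are assumptions made for the example.

```python
import heapq


def assign_tasks(costs, num_cores):
    """Greedy load-aware assignment (assumed heuristic, not PimPam's):
    hand the heaviest remaining task to the currently least-loaded core."""
    heap = [(0, core) for core in range(num_cores)]  # (load, core id)
    assignment = [[] for _ in range(num_cores)]
    order = sorted(range(len(costs)), key=lambda t: costs[t], reverse=True)
    for task in order:
        load, core = heapq.heappop(heap)
        assignment[core].append(task)
        heapq.heappush(heap, (load + costs[task], core))
    return assignment


def to_bitmap(neighbors):
    """Encode a neighbor list as an integer with bit v set for each neighbor v."""
    bits = 0
    for v in neighbors:
        bits |= 1 << v
    return bits


def count_triangles(adj):
    """Count triangles u < v < w by intersecting neighbor bitmaps per edge."""
    bitmaps = {u: to_bitmap(nbrs) for u, nbrs in adj.items()}
    total = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v:
                common = bitmaps[u] & bitmaps[v]  # set intersection as one AND
                # Keep only common neighbors w > v so each triangle counts once.
                total += bin(common >> (v + 1)).count("1")
    return total


if __name__ == "__main__":
    # Hypothetical per-task cost estimates, for example derived from vertex degree.
    print(assign_tasks([8, 7, 6, 5, 4], num_cores=2))
    # Complete graph on four vertices.
    k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
    print(count_triangles(k4))
```

A bitmap turns intersection into word-wide AND operations, which pays off for dense neighbor sets; sparse sets are usually cheaper to intersect as sorted lists, which is one plausible reason a system would choose between the two representations dynamically.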



Published In

Proceedings of the ACM on Management of Data  Volume 2, Issue 3
SIGMOD
June 2024
1953 pages
EISSN:2836-6573
DOI:10.1145/3670010
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph pattern matching
  2. processing in memory

Qualifiers

  • Research-article
