DOI: 10.1145/3394885.3431548
research-article

A reduced-precision streaming SpMV architecture for Personalized PageRank on FPGA

Published: 29 January 2021

Abstract

Sparse matrix-vector multiplication (SpMV) is employed in many data-analytic workloads in which low latency and high throughput are more valuable than exact numerical convergence. FPGAs provide quick execution times while offering precise control over the accuracy of the results thanks to reduced-precision fixed-point arithmetic. In this work, we propose a novel streaming implementation of Coordinate Format (COO) sparse matrix-vector multiplication and study its effectiveness when applied to the Personalized PageRank algorithm, a common building block of recommender systems in e-commerce websites and social networks. Our implementation achieves speedups of up to 6x over a reference floating-point FPGA architecture and a state-of-the-art multi-threaded CPU implementation on 8 different datasets, while preserving the numerical fidelity of the results and reaching up to 42x higher energy efficiency than the CPU implementation.
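
To make the idea in the abstract concrete, the following is a minimal software sketch of Personalized PageRank built on a COO SpMV that uses unsigned fixed-point arithmetic. It is not the authors' FPGA architecture. The function names, the 24 fractional bits, the damping factor of 0.85, and the assumption that every vertex has at least one outgoing edge are all illustrative choices that do not come from the paper.

```python
def to_fixed(x, frac_bits):
    """Quantize a value in [0, 1] to an unsigned fixed-point integer (truncating)."""
    return int(x * (1 << frac_bits))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def coo_spmv_fixed(rows, cols, vals, vec, n, frac_bits):
    """Compute y = A @ vec by streaming over the COO entries exactly once.

    rows, cols, vals hold one nonzero of A per position; vals and vec are
    fixed-point integers with frac_bits fractional bits.
    """
    out = [0] * n
    for r, c, v in zip(rows, cols, vals):
        # The product has 2 * frac_bits fractional bits; shift back to frac_bits.
        out[r] += (v * vec[c]) >> frac_bits
    return out


def ppr_fixed(edges, n, source, alpha=0.85, iters=30, frac_bits=24):
    """Personalized PageRank for one source vertex, in fixed point.

    Assumes (hypothetically) that the graph has no dangling vertices.
    """
    outdeg = [0] * n
    for src, _ in edges:
        outdeg[src] += 1
    # Column-stochastic transition matrix in COO form: entry (dst, src) = 1/outdeg(src).
    rows = [dst for _, dst in edges]
    cols = [src for src, _ in edges]
    vals = [to_fixed(1.0 / outdeg[src], frac_bits) for src, _ in edges]
    a = to_fixed(alpha, frac_bits)
    one = 1 << frac_bits
    p = [0] * n
    p[source] = one
    for _ in range(iters):
        y = coo_spmv_fixed(rows, cols, vals, p, n, frac_bits)
        p = [(a * yi) >> frac_bits for yi in y]
        p[source] += one - a  # teleport back to the personalization vertex
    return [from_fixed(q, frac_bits) for q in p]


def ppr_float(edges, n, source, alpha=0.85, iters=30):
    """Floating-point reference for the same iteration."""
    outdeg = [0] * n
    for src, _ in edges:
        outdeg[src] += 1
    p = [0.0] * n
    p[source] = 1.0
    for _ in range(iters):
        y = [0.0] * n
        for src, dst in edges:
            y[dst] += p[src] / outdeg[src]
        p = [alpha * yi for yi in y]
        p[source] += 1.0 - alpha
    return p


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
    print(ppr_fixed(edges, 3, source=0))
    print(ppr_float(edges, 3, source=0))
```

Because PageRank values lie in [0, 1], an unsigned fixed-point format with no integer bits beyond the leading one is enough in this sketch. Lowering `frac_bits` trades accuracy for narrower arithmetic, which is the kind of trade-off the abstract refers to.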


Published In

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference
January 2021
930 pages
ISBN:9781450379991
DOI:10.1145/3394885
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Approximate Computing
  2. FPGA
  3. Graph Algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASPDAC '21
Acceptance Rates

ASPDAC '21 Paper Acceptance Rate 111 of 368 submissions, 30%;
Overall Acceptance Rate 466 of 1,454 submissions, 32%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 48
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 02 Feb 2025

Cited By

  • (2024) Machine Learning-Based Kernel Selector for SpMV Optimization in Graph Analysis. ACM Transactions on Parallel Computing 11:2 (1-25). DOI: 10.1145/3652579. Online publication date: 8-Jun-2024
  • (2024) Toward Energy-efficient STT-MRAM-based Near Memory Computing Architecture for Embedded Systems. ACM Transactions on Embedded Computing Systems 23:3 (1-24). DOI: 10.1145/3650729. Online publication date: 7-Mar-2024
  • (2024) Tuning high-level synthesis SpMV kernels in Alveo FPGAs. Microprocessors & Microsystems 110:C. DOI: 10.1016/j.micpro.2024.105104. Online publication date: 1-Oct-2024
  • (2024) SpChar: Characterizing the sparse puzzle via decision trees. Journal of Parallel and Distributed Computing (104941). DOI: 10.1016/j.jpdc.2024.104941. Online publication date: Jun-2024
  • (2023) HedgeRank: Heterogeneity-Aware, Energy-Efficient Partitioning of Personalized PageRank at the Edge. Micromachines 14:9 (1714). DOI: 10.3390/mi14091714. Online publication date: 31-Aug-2023
  • (2023) MITra: A Framework for Multi-Instance Graph Traversal. Proceedings of the VLDB Endowment 16:10 (2551-2564). DOI: 10.14778/3603581.3603594. Online publication date: 1-Jun-2023
  • (2023) A Survey of Accelerating Parallel Sparse Linear Algebra. ACM Computing Surveys 56:1 (1-38). DOI: 10.1145/3604606. Online publication date: 28-Aug-2023
  • (2023) Multi-Mode SpMV Accelerator for Transprecision PageRank With Real-World Graphs. IEEE Access 11 (6261-6272). DOI: 10.1109/ACCESS.2023.3237079. Online publication date: 2023
  • (2023) Research on FPGA Accelerator Optimization Based on Graph Neural Network. Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery (536-542). DOI: 10.1007/978-3-031-20738-9_61. Online publication date: 30-Jan-2023
  • (2022) Software-defined floating-point number formats and their application to graph processing. Proceedings of the 36th ACM International Conference on Supercomputing (1-17). DOI: 10.1145/3524059.3532360. Online publication date: 28-Jun-2022
