H2Pack: High-performance H2 Matrix Package for Kernel Matrices Using the Proxy Point Method

Published: 17 December 2020

Abstract

Dense kernel matrices represented in H2 matrix format typically require less storage and have faster matrix-vector multiplications than when these matrices are represented in the standard dense format. In this article, we present H2Pack, a high-performance, shared-memory library for constructing and operating with H2 matrix representations for kernel matrices defined by non-oscillatory, translationally invariant kernel functions. Using a hybrid analytic-algebraic compression method called the proxy point method, H2Pack can efficiently construct an H2 matrix representation with linear computational complexity. Storage and matrix-vector multiplication also have linear complexity. H2Pack also introduces the concept of “partially admissible blocks” for H2 matrices to make H2 matrix-vector multiplication mathematically identical to the fast multipole method (FMM) if analytic expansions are used. We optimize H2Pack from both the algorithm and software perspectives. Compared to existing FMM libraries, H2Pack generally has much faster H2 matrix-vector multiplications, since the proxy point method is more effective at producing block low-rank approximations than the analytic methods used in FMM. As a tradeoff, H2 matrix construction in H2Pack is typically more expensive than the setup cost in FMM libraries. Thus, H2Pack is ideal for applications that need a large number of matrix-vector multiplications for a given configuration of data points.
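The block low-rank idea that the abstract relies on can be illustrated with a small sketch. This is not H2Pack's API, and it uses an analytic expansion of the kind FMM relies on rather than the proxy point method. The 1D kernel 1/(x - y), the point sets, the expansion center, the rank, and all function names are assumptions chosen only for illustration. The sketch compares a dense product with one well-separated kernel block against a truncated-expansion product.

```python
import random

def kernel(x, y):
    """Illustrative non-oscillatory, translationally invariant kernel in 1D."""
    return 1.0 / (x - y)

def dense_block_matvec(xs, ys, q):
    """Reference O(m*n) product of the kernel block K(xs, ys) with the vector q."""
    return [sum(kernel(x, y) * qj for y, qj in zip(ys, q)) for x in xs]

def lowrank_block_matvec(xs, ys, q, center, rank):
    """O((m+n)*rank) product using the truncated expansion
    1/(x - y) = sum_k (y - c)^k / (x - c)^(k+1), valid when |y - c| < |x - c|."""
    # Step 1: compress the sources into `rank` moments about the center c.
    moments = [sum(qj * (y - center) ** k for y, qj in zip(ys, q))
               for k in range(rank)]
    # Step 2: evaluate the truncated expansion at every target point.
    return [sum(m / (x - center) ** (k + 1) for k, m in enumerate(moments))
            for x in xs]

random.seed(0)
ys = [random.random() for _ in range(200)]        # sources in [0, 1]
xs = [3.0 + random.random() for _ in range(200)]  # targets in [3, 4], well separated
q = [random.uniform(-1.0, 1.0) for _ in range(200)]

exact = dense_block_matvec(xs, ys, q)
approx = lowrank_block_matvec(xs, ys, q, center=0.5, rank=12)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
print(f"max abs error with rank 12: {max_err:.3e}")
```

Because the sources lie within 0.5 of the center and the targets at least 2.5 away, the expansion terms decay at least as fast as 0.2^k, so a rank of 12 keeps the error below 1e-6 for this setup while the work per product scales with (m + n) * rank instead of m * n.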

    Information & Contributors

    Information

    Published In

    ACM Transactions on Mathematical Software  Volume 47, Issue 1
    March 2021
    219 pages
    ISSN:0098-3500
    EISSN:1557-7295
    DOI:10.1145/3441641
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 December 2020
    Accepted: 01 July 2020
    Revised: 01 June 2020
    Received: 01 December 2019
    Published in TOMS Volume 47, Issue 1

    Author Tags

    1. H2 matrix
    2. N-body problem
    3. Rank-structured matrix
    4. fast multipole method
    5. high-performance computing
    6. proxy point method

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Bibliometrics & Citations

    Article Metrics

    • Downloads (Last 12 months): 38
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 26 Sep 2024

    Cited By

    • (2024) An inherently parallel ℋ2-ULV factorization for solving dense linear systems on GPUs. International Journal of High Performance Computing Applications 38:4 (314-336). DOI: 10.1177/10943420241242021. Online publication date: 1-Jul-2024
    • (2024) An Adaptive Factorized Nyström Preconditioner for Regularized Kernel Matrices. SIAM Journal on Scientific Computing 46:4 (A2351-A2376). DOI: 10.1137/23M1565139. Online publication date: 17-Jul-2024
    • (2024) An explicitly-sparse representation for oscillatory kernels with wave atom-like functions. Journal of Computational Physics 497 (112620). DOI: 10.1016/j.jcp.2023.112620. Online publication date: Jan-2024
    • (2023) Data-Driven Construction of Hierarchical Matrices With Nested Bases. SIAM Journal on Scientific Computing 46:2 (S24-S50). DOI: 10.1137/22M1500848. Online publication date: 13-Jul-2023
    • (2022) H2Opus: a distributed-memory multi-GPU software package for non-local operators. Advances in Computational Mathematics 48:3. DOI: 10.1007/s10444-022-09942-6. Online publication date: 10-May-2022
    • (2021) Sparse Cholesky Factorization by Kullback--Leibler Minimization. SIAM Journal on Scientific Computing 43:3 (A2019-A2046). DOI: 10.1137/20M1336254. Online publication date: 1-Jan-2021
    • (2021) Parallel Skeletonization for Integral Equations in Evolving Multiply-Connected Domains. SIAM Journal on Scientific Computing 43:3 (A2320-A2351). DOI: 10.1137/20M1316330. Online publication date: 24-Jun-2021
    • (2021) A hierarchical matrix approach for computing hydrodynamic interactions. Journal of Computational Physics (110761). DOI: 10.1016/j.jcp.2021.110761. Online publication date: Oct-2021
