H2Pack: High-performance H² Matrix Package for Kernel Matrices Using the Proxy Point Method

Authors:

Edmond Chow

ACM Transactions on Mathematical Software (TOMS), Volume 47, Issue 1

Article No.: 3, Pages 1 - 29

https://doi.org/10.1145/3412850

Published: 17 December 2020

Abstract

Dense kernel matrices represented in H² matrix format typically require less storage and have faster matrix-vector multiplications than when these matrices are represented in the standard dense format. In this article, we present H2Pack, a high-performance, shared-memory library for constructing and operating with H² matrix representations for kernel matrices defined by non-oscillatory, translationally invariant kernel functions. Using a hybrid analytic-algebraic compression method called the proxy point method, H2Pack can efficiently construct an H² matrix representation with linear computational complexity. Storage and matrix-vector multiplication also have linear complexity. H2Pack also introduces the concept of “partially admissible blocks” for H² matrices to make H² matrix-vector multiplication mathematically identical to the fast multipole method (FMM) if analytic expansions are used. We optimize H2Pack from both the algorithm and software perspectives. Compared to existing FMM libraries, H2Pack generally has much faster H² matrix-vector multiplications, since the proxy point method is more effective at producing block low-rank approximations than the analytic methods used in FMM. As a tradeoff, H² matrix construction in H2Pack is typically more expensive than the setup cost in FMM libraries. Thus, H2Pack is ideal for applications that need a large number of matrix-vector multiplications for a given configuration of data points.
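As a rough illustration of why block low-rank compression pays off for such kernels, the sketch below builds one dense kernel block for two well-separated point clusters and measures its numerical rank with a plain SVD. It is not H2Pack code and does not implement the proxy point method. The Coulomb kernel 1/|x - y|, the cluster geometry, the tolerance, and the helper names `coulomb_block` and `numerical_rank` are all assumptions chosen for the example.

```python
import numpy as np

def coulomb_block(targets, sources):
    """Dense kernel block with entries 1 / ||x_i - y_j|| (assumed example kernel)."""
    diff = targets[:, None, :] - sources[None, :, :]
    return 1.0 / np.linalg.norm(diff, axis=2)

def numerical_rank(block, rel_tol):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
n = 400
X = rng.random((n, 3))                               # cluster inside [0, 1]^3
Y = rng.random((n, 3)) + np.array([5.0, 0.0, 0.0])   # same-size cluster shifted along x
K = coulomb_block(X, Y)                              # block coupling the two clusters

rank = numerical_rank(K, rel_tol=1e-6)
dense_entries = n * n               # storage for the full block
lowrank_entries = 2 * n * rank      # storage for two thin factors of width `rank`
print(f"numerical rank {rank} of {n}; "
      f"{lowrank_entries} vs {dense_entries} stored entries")
```

A block whose numerical rank is far below its dimension can be stored as two thin factors, which is the kind of saving that hierarchical formats exploit block by block.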

Index Terms

H2Pack: High-performance H² Matrix Package for Kernel Matrices Using the Proxy Point Method
1. Mathematics of computing
  1. Mathematical software

Published In

ACM Transactions on Mathematical Software Volume 47, Issue 1

March 2021

219 pages

ISSN:0098-3500

EISSN:1557-7295

DOI:10.1145/3441641

Editors: Zhaojun Bai (University of California at Davis, USA), Wolfgang Bangerth (Colorado State University, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 December 2020

Accepted: 01 July 2020

Revised: 01 June 2020

Received: 01 December 2019

Published in TOMS Volume 47, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Science Foundation
