DOI: 10.1145/3673038.3673119
research-article
Open access

Optimizing a Super-Fast Eigensolver for Hierarchically Semiseparable Matrices

Published: 12 August 2024

Abstract

In this paper, we consider the efficient computation of all eigenvalues and eigenvectors of symmetric Hierarchically Semiseparable (HSS) matrices, whose off-diagonal blocks have low rank and hierarchical bases. The state of the art is SuperDC, a divide-and-conquer algorithm that computes eigenvalues and eigenvectors an order of magnitude faster than popular and commercial solvers. We improve on the state of the art and present novel shared- and distributed-memory parallel algorithms for computing eigenvalues of HSS matrices. We exploit the recursive divide-and-conquer approach employed in SuperDC to parallelize the eigenvalue computation, present a span and available-parallelism analysis, and optimize the original SuperDC algorithm to reduce the storage requirement from O(N^2) to O(N) in the case of banded matrices. We systematically evaluate different parallel programming paradigms, scheduling policies, and scalability configurations. In the shared-memory parallel implementations, we observe that OpenMP implementations perform better than Cilk versions and that work stealing offers no significant performance advantage; in the distributed-memory implementations, asynchronous communication yields better performance than barrier-based communication. We find the optimal input decomposition at which the parallel implementations provide the best speedup. For input symmetric matrices of different sparsity structures and sizes ranging from 4096 to 256k rows, on up to 512 cores, the implementations scale well and show a significant speedup of up to 147× compared to the available SuperDC implementation.
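
As a rough illustration of what a span and available-parallelism analysis measures for a recursive divide-and-conquer scheme, the sketch below computes the total work, the critical-path length (span), and their ratio for a balanced binary recursion. It is not the paper's analysis: the function names, the cubic leaf cost, the linear merge cost, and the assumption that each merge runs sequentially after both children finish are all hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cost:
    work: int  # total operations if run on one core
    span: int  # length of the longest dependency chain


def dc_cost(n: int, leaf_size: int,
            leaf_cost: Callable[[int], int],
            merge_cost: Callable[[int], int]) -> Cost:
    """Work and span of a balanced binary divide-and-conquer over n rows.

    Assumption (hypothetical): the two halves are independent and can run
    in parallel, while the merge at each node is sequential.
    """
    if n <= leaf_size:
        c = leaf_cost(n)
        return Cost(work=c, span=c)
    half = n // 2
    left = dc_cost(half, leaf_size, leaf_cost, merge_cost)
    right = dc_cost(n - half, leaf_size, leaf_cost, merge_cost)
    m = merge_cost(n)
    return Cost(
        work=left.work + right.work + m,       # children add up
        span=max(left.span, right.span) + m,   # children overlap in time
    )


def parallelism(cost: Cost) -> float:
    """Available parallelism: upper bound on useful speedup (work / span)."""
    return cost.work / cost.span


# Hypothetical cost model: dense O(n^3) solve at leaves, O(n) merge.
example = dc_cost(4096, 64, leaf_cost=lambda n: n ** 3, merge_cost=lambda n: n)
example_parallelism = parallelism(example)
```

Under this toy model, shrinking `leaf_size` shortens the span contributed by the leaves but adds merge levels, which is one way to see why a particular input decomposition can give the best speedup.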

Published In

ICPP '24: Proceedings of the 53rd International Conference on Parallel Processing
August 2024
1279 pages
ISBN:9798400717932
DOI:10.1145/3673038
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. SuperDC
  2. Symmetric eigenvalue problems
  3. divide-and-conquer
  4. hierarchically semi-separable matrix
  5. parallel

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Department of Science and Technology India, National Supercomputing Mission

Conference

ICPP '24

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
