Abstract
Covariance matrices are ubiquitous in computational science and engineering. In particular, large covariance matrices arise from multivariate spatial data sets, for instance in climate/weather modeling applications, where statistical methods and spatial data are used to improve prediction. One of the most time-consuming computational steps is the Cholesky factorization of the symmetric, positive-definite covariance matrix. Such covariance matrices are also often data-sparse, in other words, effectively of low rank, though formally dense. While not typically globally of low rank, covariance matrices in which correlation decays with distance are nearly always hierarchically of low rank. While symmetry and positive definiteness should be, and nearly always are, exploited for performance purposes, exploiting low-rank character in this context is very recent, and will be key to solving these challenging problems at large scale. The authors design a new and flexible tile low rank Cholesky factorization and propose a high-performance implementation using the OpenMP task-based programming model on various leading-edge manycore architectures. Performance comparisons and memory footprint savings on covariance matrices of size up to \(200K\times 200K\) show a gain of more than an order of magnitude for both metrics against state-of-the-art open-source and vendor-optimized numerical libraries, while preserving the numerical accuracy of the original model. This research represents an important milestone in enabling large-scale simulations for covariance-based scientific applications.
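The data sparsity described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a one-dimensional grid of locations and an exponential covariance kernel with correlation length `ell`. The tile size `nb`, the truncation tolerance `tol`, and all function names are hypothetical choices made for this example. Each off-diagonal tile is compressed by a truncated SVD, while diagonal tiles are kept dense.

```python
import numpy as np

def exp_covariance(x, ell):
    """Dense exponential covariance matrix exp(-|x_i - x_j| / ell)."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / ell)

def compress_tile(tile, tol):
    """Truncated SVD of one tile: returns U (nb x k) and V (k x nb)."""
    u, s, vt = np.linalg.svd(tile, full_matrices=False)
    # Keep singular values above a tolerance relative to the largest one.
    k = max(1, int(np.sum(s > tol * s[0])))
    return u[:, :k] * s[:k], vt[:k, :]

# Assumed problem setup (illustrative values only).
n, nb, ell, tol = 512, 64, 0.1, 1e-8
x = np.linspace(0.0, 1.0, n)          # sorted 1D locations
A = exp_covariance(x, ell)

nt = n // nb                          # number of tiles per dimension
ranks = np.zeros((nt, nt), dtype=int)
approx = np.zeros_like(A)
stored = 0                            # number of stored floating-point values

for i in range(nt):
    for j in range(nt):
        rows = slice(i * nb, (i + 1) * nb)
        cols = slice(j * nb, (j + 1) * nb)
        tile = A[rows, cols]
        if i == j:
            # Diagonal tiles stay dense.
            approx[rows, cols] = tile
            ranks[i, j] = nb
            stored += nb * nb
        else:
            # Off-diagonal tiles are stored as a low-rank product U @ V.
            U, V = compress_tile(tile, tol)
            approx[rows, cols] = U @ V
            ranks[i, j] = U.shape[1]
            stored += U.size + V.size

rel_err = np.linalg.norm(A - approx) / np.linalg.norm(A)
max_offdiag_rank = int(ranks[~np.eye(nt, dtype=bool)].max())

print("max off-diagonal tile rank:", max_offdiag_rank)
print("stored values vs dense:", stored, "vs", n * n)
print("relative Frobenius error:", rel_err)
```

For this particular kernel and sorted one-dimensional locations, every off-diagonal tile factorizes exactly into a rank-one product, so the compression is extreme. Covariance matrices from two- or three-dimensional spatial data generally yield higher, though still small, tile ranks that depend on the kernel, the ordering of the locations, and the chosen tolerance.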
Acknowledgment
We would like to thank R. Kriemann from the Max Planck Institute for Mathematics in the Sciences and M. Genton, A. Litvinenko, Y. Sun, and G. Turkiyyah from KAUST for fruitful discussions. We would also like to thank A. Heinecke from Intel for helping us tune the codes on KNL. This work has been partially funded by the Intel Parallel Computing Center Award.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Akbudak, K., Ltaief, H., Mikhalev, A., Keyes, D. (2017). Tile Low Rank Cholesky Factorization for Climate/Weather Modeling Applications on Manycore Architectures. In: Kunkel, J.M., Yokota, R., Balaji, P., Keyes, D. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10266. Springer, Cham. https://doi.org/10.1007/978-3-319-58667-0_2
DOI: https://doi.org/10.1007/978-3-319-58667-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58666-3
Online ISBN: 978-3-319-58667-0
eBook Packages: Computer Science (R0)