
TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Published: 01 June 2020

Abstract

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed TuckerMPI, a new parallel C++/MPI software package for compressing distributed data. The approach is based on treating the data as a tensor, i.e., a multidimensional array, and computing its truncated Tucker decomposition, a higher-order analogue to the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5 and 6.7 terabyte data sets distributed across hundreds of nodes (thousands of MPI processes), achieving compression ratios between 100× and 200,000×, which equates to 99--99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same data set from a parallel file system. Moreover, we show that our method also allows for reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, for example, when reconstructing or visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
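
The relationship between a compression ratio and the quoted percentages can be made concrete with a small calculation. The sketch below is illustrative only and is not part of TuckerMPI: it assumes the standard storage count for a Tucker decomposition of an I_1 × ... × I_N tensor with ranks R_1, ..., R_N (a core of R_1·...·R_N values plus N factor matrices of size I_n × R_n), and the function names, tensor dimensions, and ranks are hypothetical choices made for this example.

```python
from math import prod


def tucker_storage(dims, ranks):
    """Values stored by a Tucker decomposition: core tensor plus factor matrices."""
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same length")
    core = prod(ranks)                                   # R_1 * ... * R_N
    factors = sum(i * r for i, r in zip(dims, ranks))    # sum of I_n * R_n
    return core + factors


def compression_ratio(dims, ranks):
    """Size of the full tensor divided by the size of its Tucker representation."""
    return prod(dims) / tucker_storage(dims, ranks)


def percent_reduction(ratio):
    """Percentage of storage saved for a given compression ratio."""
    return 100.0 * (1.0 - 1.0 / ratio)


if __name__ == "__main__":
    # Hypothetical 5-way tensor (three spatial modes, variables, time steps)
    # and hypothetical truncation ranks; neither is taken from the article.
    dims = (500, 500, 500, 11, 400)
    ranks = (30, 30, 30, 5, 20)
    ratio = compression_ratio(dims, ranks)
    print(f"compression ratio: {ratio:.1f}x")
    print(f"storage reduction: {percent_reduction(ratio):.4f}%")
```

Under this formula a ratio of 100× corresponds to a 99% reduction, and a ratio of 200,000× to roughly 99.9995%, which is consistent with the range quoted in the abstract.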

      Published In

      ACM Transactions on Mathematical Software  Volume 46, Issue 2
      June 2020
      274 pages
      ISSN:0098-3500
      EISSN:1557-7295
      DOI:10.1145/3401021
      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 June 2020
      Online AM: 07 May 2020
      Accepted: 01 January 2020
      Revised: 01 January 2020
      Received: 01 January 2019
      Published in TOMS Volume 46, Issue 2

      Author Tags

      1. Tucker decomposition
      2. higher-order singular value decomposition (HOSVD)
      3. tensor decomposition

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program

      Article Metrics

      • Downloads (Last 12 months): 88
      • Downloads (Last 6 weeks): 14
      Reflects downloads up to 15 Jan 2025

      Cited By

      • (2024) Hybrid Parallel Tucker Decomposition of Streaming Data. Proceedings of the Platform for Advanced Scientific Computing Conference, 1-12. DOI: 10.1145/3659914.3659934. Online publication date: 3-Jun-2024
      • (2024) Parallel Randomized Tucker Decomposition Algorithms. SIAM Journal on Scientific Computing 46:2, A1186-A1213. DOI: 10.1137/22M1540363. Online publication date: 2-Apr-2024
      • (2024) Communication Lower Bounds and Optimal Algorithms for Multiple Tensor-Times-Matrix Computation. SIAM Journal on Matrix Analysis and Applications 45:1, 450-477. DOI: 10.1137/22M1510443. Online publication date: 6-Feb-2024
      • (2024) A General Framework for Progressive Data Compression and Retrieval. IEEE Transactions on Visualization and Computer Graphics 30:1, 1358-1368. DOI: 10.1109/TVCG.2023.3327186. Online publication date: 1-Jan-2024
      • (2024) SZOps: Scalar Operations for Error-bounded Lossy Compressor for Scientific Data. SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, 260-269. DOI: 10.1109/SCW63240.2024.00042. Online publication date: 17-Nov-2024
      • (2024) Error-controlled Progressive Retrieval of Scientific Data under Derivable Quantities of Interest. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 1-16. DOI: 10.1109/SC41406.2024.00092. Online publication date: 17-Nov-2024
      • (2024) Tucker Tensor Approach for Accelerating Fock Exchange Computations in a Real-Space Finite-Element Discretization of Generalized Kohn–Sham Density Functional Theory. Journal of Chemical Theory and Computation 20:9, 3566-3579. DOI: 10.1021/acs.jctc.4c00019. Online publication date: 25-Apr-2024
      • (2024) GPUTucker: Large-Scale GPU-Based Tucker Decomposition Using Tensor Partitioning. Expert Systems with Applications 237, 121445. DOI: 10.1016/j.eswa.2023.121445. Online publication date: Mar-2024
      • (2024) RA-HOOI: Rank-adaptive higher-order orthogonal iteration for the fixed-accuracy low multilinear-rank approximation of tensors. Applied Numerical Mathematics 201, 290-300. DOI: 10.1016/j.apnum.2024.03.004. Online publication date: Jul-2024
      • (2023) Algorithm 1036: ATC, An Advanced Tucker Compression Library for Multidimensional Data. ACM Transactions on Mathematical Software 49:2, 1-25. DOI: 10.1145/3585514. Online publication date: 15-Jun-2023
