DOI: 10.1145/3205289.3205315
On Optimizing Distributed Tucker Decomposition for Sparse Tensors

Published: 12 June 2018

Abstract

The Tucker decomposition generalizes the Singular Value Decomposition (SVD) to tensors, the higher-dimensional analogues of matrices. We study the problem of constructing the Tucker decomposition of sparse tensors on distributed-memory systems via HOOI, a popular iterative procedure. The scheme used for distributing the input tensor among the processors (MPI ranks) critically influences the HOOI execution time. Prior work has proposed different distribution schemes: an offline scheme based on sophisticated hypergraph partitioning, and simple, lightweight alternatives that can be used in real time. While the hypergraph-based scheme typically results in faster HOOI execution, its complexity means that the time taken to determine the distribution is an order of magnitude higher than the execution time of a single HOOI iteration. Our main contribution is a lightweight distribution scheme that achieves the best of both worlds. We show that the scheme is near-optimal on certain fundamental metrics associated with the HOOI procedure and, as a result, near-optimal on the computational load (FLOPs). Though the scheme may incur higher communication volume, computation time is the dominant factor, so the scheme achieves better overall HOOI execution time. Our experimental evaluation on large real-life tensors (with up to 4 billion elements) shows that the scheme outperforms the prior schemes on HOOI execution time by a factor of up to 3x. Meanwhile, its distribution time is comparable to that of the prior lightweight schemes and is typically less than the execution time of a single HOOI iteration.
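For readers unfamiliar with the HOOI procedure the abstract refers to, the following is a minimal illustrative sketch in numpy (dense, single-node, and hypothetical — not the paper's distributed sparse implementation). Each HOOI iteration updates one factor matrix at a time by projecting the tensor onto the other modes' factors and taking leading singular vectors of the resulting unfolding:

```python
import numpy as np

def unfold(T, mode):
    # Mode-n matricization: bring `mode` to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def ttm_all_but(T, factors, skip):
    # Multiply T by each factor's transpose along every mode except `skip`.
    Y = T
    for m, U in enumerate(factors):
        if m == skip:
            continue
        # Contract U^T (r_m x I_m) with mode m of Y, then restore mode order.
        Y = np.moveaxis(np.tensordot(U.T, Y, axes=(1, m)), 0, m)
    return Y

def hooi(T, ranks, iters=10):
    # Initialize factors via HOSVD: leading left singular vectors per mode.
    factors = [np.linalg.svd(unfold(T, n))[0][:, :r]
               for n, r in enumerate(ranks)]
    for _ in range(iters):
        for n in range(T.ndim):
            # Project onto all other factors, then update mode n's factor.
            Y = ttm_all_but(T, factors, skip=n)
            factors[n] = np.linalg.svd(unfold(Y, n))[0][:, :ranks[n]]
    # Core tensor: project T onto all factors.
    core = ttm_all_but(T, factors, skip=-1)
    return core, factors
```

In the distributed setting the paper studies, the costly steps are the tensor-times-matrix chain (`ttm_all_but`) and the per-mode SVDs, which is why the distribution of the sparse tensor's nonzeros across MPI ranks dominates performance.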




    Published In

    ICS '18: Proceedings of the 2018 International Conference on Supercomputing
    June 2018
    407 pages
    ISBN:9781450357838
    DOI:10.1145/3205289


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. Tensor decompositions
    2. tensor distribution schemes

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

ICS '18

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%


    Article Metrics

    • Downloads (Last 12 months)34
    • Downloads (Last 6 weeks)3
    Reflects downloads up to 03 Feb 2025


    Cited By

    • (2023)A Survey of Next-generation Computing Technologies in Space-air-ground Integrated NetworksACM Computing Surveys10.1145/360601856:1(1-40)Online publication date: 28-Aug-2023
    • (2023)A Survey of Accelerating Parallel Sparse Linear AlgebraACM Computing Surveys10.1145/360460656:1(1-38)Online publication date: 28-Aug-2023
    • (2023)Machine Unlearning: A SurveyACM Computing Surveys10.1145/360362056:1(1-36)Online publication date: 28-Aug-2023
    • (2023)The Evolution of Distributed Systems for Graph Neural Networks and Their Origin in Graph Processing and Deep Learning: A SurveyACM Computing Surveys10.1145/359742856:1(1-37)Online publication date: 28-Aug-2023
    • (2023)Security Aspects of Cryptocurrency Wallets—A Systematic Literature ReviewACM Computing Surveys10.1145/359690656:1(1-31)Online publication date: 28-Aug-2023
    • (2023)A Taxonomy and Analysis of Misbehaviour Detection in Cooperative Intelligent Transport Systems: A Systematic ReviewACM Computing Surveys10.1145/359659856:1(1-38)Online publication date: 28-Aug-2023
    • (2023)Performance Implication of Tensor Irregularity and Optimization for Distributed Tensor DecompositionACM Transactions on Parallel Computing10.1145/358031510:2(1-27)Online publication date: 20-Jun-2023
    • (2023)Distributed non-negative RESCAL with automatic model selection for exascale dataJournal of Parallel and Distributed Computing10.1016/j.jpdc.2023.04.010179(104709)Online publication date: Sep-2023
    • (2023)Analysis of mobility patterns for urban taxi ridership: the role of the built environmentTransportation10.1007/s11116-023-10372-651:4(1409-1431)Online publication date: 22-Feb-2023
    • (2022)GSpTC: High-Performance Sparse Tensor Contraction on CPU-GPU Heterogeneous Systems2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys)10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00080(380-387)Online publication date: Dec-2022
