research-article

Static and Streaming Tucker Decomposition for Dense Tensors

Published: 27 February 2023

Abstract

Given a dense tensor, how can we efficiently discover hidden relations and patterns in static and online streaming settings? Tucker decomposition is a fundamental tool for analyzing multidimensional arrays in the form of tensors. However, existing Tucker decomposition methods in both static and online streaming settings are inefficient because they operate directly on large dense tensors to compute the decomposition. In the static setting, although a few methods have tried to reduce time cost through tensor sampling, tensor sketching, and efficient matrix operations, there remains a need for an efficient method. Moreover, streaming versions of Tucker decomposition are still slow at handling newly arrived tensors.
We propose D-Tucker and D-TuckerO, efficient Tucker decomposition methods for large dense tensors in static and online streaming settings, respectively. By decomposing a given large dense tensor with randomized singular value decomposition (SVD), avoiding reconstruction from the SVD results, and carefully determining the order of operations, D-Tucker and D-TuckerO efficiently obtain the factor matrices and core tensor. Experimental results show that D-Tucker runs up to 38.4× faster and requires up to 17.2× less space than existing methods while achieving similar accuracy. Furthermore, D-TuckerO is up to 6.1× faster than existing streaming methods for each newly arrived tensor, and its running time is proportional to the size of the newly arrived tensor, not the accumulated tensor.
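
To make the terms "factor matrices", "core tensor", and "randomized SVD" concrete, the following is a minimal NumPy sketch of a generic Tucker decomposition. It is not the authors' D-Tucker algorithm: it is a plain truncated higher-order SVD in which each mode unfolding is factored with a basic Gaussian range-finder randomized SVD, and it does not include the slice-wise compression or the operation reordering that the abstract describes. All function names (`randomized_svd`, `unfold`, `mode_product`, `tucker_hosvd`, `reconstruct`) and the `oversample` parameter are hypothetical choices made for this illustration.

```python
import numpy as np


def randomized_svd(A, rank, oversample=5, seed=0):
    """Approximate rank-`rank` SVD of A using a Gaussian range finder."""
    rng = np.random.default_rng(seed)
    k = min(rank + oversample, min(A.shape))
    # Sample the column space of A and orthonormalize the sample.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], k)))
    # Exact SVD of the small projected matrix Q^T A.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]


def unfold(X, mode):
    """Mode-`mode` unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def mode_product(X, M, mode):
    """Multiply tensor X by matrix M along axis `mode`."""
    return np.moveaxis(np.tensordot(M, X, axes=(1, mode)), 0, mode)


def tucker_hosvd(X, ranks):
    """Truncated HOSVD: one orthonormal factor per mode plus a core tensor."""
    factors = [randomized_svd(unfold(X, n), r)[0] for n, r in enumerate(ranks)]
    core = X
    for n, U in enumerate(factors):
        core = mode_product(core, U.T, n)  # project onto each factor's span
    return core, factors


def reconstruct(core, factors):
    """Rebuild the (approximate) tensor from its core and factor matrices."""
    X = core
    for n, U in enumerate(factors):
        X = mode_product(X, U, n)
    return X
```

Calling `tucker_hosvd(X, (3, 4, 2))` on a 20 × 15 × 10 array returns a 3 × 4 × 2 core and three factor matrices with orthonormal columns; `reconstruct` then gives the low-rank approximation of `X`. Note that this sketch unfolds and multiplies the full dense tensor in every mode, which is exactly the kind of cost the abstract says D-Tucker is designed to avoid.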


Published In

ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 5
June 2023
386 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3583066

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 February 2023
Online AM: 20 October 2022
Accepted: 23 September 2022
Revised: 01 August 2022
Received: 09 December 2021
Published in TKDD Volume 17, Issue 5

Author Tags

  1. Dense tensor
  2. Tucker decomposition
  3. static setting
  4. online streaming setting
  5. efficiency

Qualifiers

  • Research-article

Funding Sources

  • National Research Foundation of Korea (NRF) funded by MSIT
  • Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by MSIT
  • Artificial Intelligence Graduate School Program (Seoul National University)
  • Artificial Intelligence Innovation Hub (Artificial Intelligence Institute, Seoul National University)

Article Metrics

  • Downloads (last 12 months): 295
  • Downloads (last 6 weeks): 17
Reflects downloads up to 28 Jan 2025

Cited By

  • (2024) Fast and Accurate Domain Adaptation for Irregular Tensor Decomposition. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1383-1394. DOI: 10.1145/3637528.3671670. Online publication date: 25-Aug-2024.
  • (2024) Fast and Accurate PARAFAC2 Decomposition for Time Range Queries on Irregular Tensors. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 962-972. DOI: 10.1145/3627673.3679735. Online publication date: 21-Oct-2024.
  • (2024) Sparse Geomagnetic Time-Series Sensing Data Completion Leveraging Improved Tensor Correlated Total Variation. IEEE Sensors Journal 24:24, 41484-41495. DOI: 10.1109/JSEN.2024.3486313. Online publication date: 15-Dec-2024.
  • (2024) Accurate Coupled Tensor Factorization with Knowledge Graph. 2024 IEEE International Conference on Big Data (BigData), 1009-1018. DOI: 10.1109/BigData62323.2024.10825614. Online publication date: 15-Dec-2024.
  • (2024) Scale-variant structural feature construction of EEG stream via component-increased Dynamic Tensor Decomposition. Knowledge-Based Systems 294, 111747. DOI: 10.1016/j.knosys.2024.111747. Online publication date: Jul-2024.
  • (2023) A Guide to the Tucker Tensor Decomposition for Data Mining: Exploratory Analysis, Clustering and Classification. Transactions on Large-Scale Data- and Knowledge-Centered Systems LIV, 56-88. DOI: 10.1007/978-3-662-68014-8_3. Online publication date: 22-Sep-2023.
  • (2022) Accurate PARAFAC2 Decomposition for Temporal Irregular Tensors with Missing Values. 2022 IEEE International Conference on Big Data (Big Data), 982-991. DOI: 10.1109/BigData55660.2022.10020667. Online publication date: 17-Dec-2022.
