Research Article (Open Access)

A New Sparse Data Clustering Method Based On Frequent Items

Published: 30 May 2023

Abstract

Large, sparse categorical data is a natural way to represent complex data like sequences, trees, and graphs. Such data is prevalent in many applications; e.g., Criteo released a terabyte-scale click log of 4 billion records with millions of dimensions. While most existing clustering algorithms, such as k-Means, work well on dense, numerical data, relatively few algorithms can cluster sets of sparse categorical features.
In this paper, we propose a new method called k-FreqItems that performs scalable clustering over high-dimensional, sparse data. To make clustering results easily interpretable, k-FreqItems is built upon a novel sparse center representation called FreqItem, which chooses a set of high-frequency, non-zero dimensions to represent each cluster. Unlike most existing clustering algorithms, which adopt Euclidean distance as the similarity measure, k-FreqItems uses the popular Jaccard distance for comparing sets.
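To make these two ingredients concrete, the sketch below shows, in plain Python, a FreqItem-style center built from high-frequency dimensions together with the Jaccard distance. The majority-style frequency cutoff (`threshold`) and the function names are illustrative assumptions for exposition, not necessarily the paper's exact construction.

```python
def jaccard_distance(a: set, b: set) -> float:
    """Jaccard distance between two sets of non-zero dimensions."""
    if not a and not b:
        return 0.0
    inter = len(a & b)
    return 1.0 - inter / (len(a) + len(b) - inter)

def freqitem_center(cluster: list, threshold: float = 0.5) -> set:
    """FreqItem-style center: keep each dimension that appears in at
    least a `threshold` fraction of the cluster's points (illustrative
    rule; the paper's exact thresholding may differ)."""
    counts = {}
    for point in cluster:
        for dim in point:
            counts[dim] = counts.get(dim, 0) + 1
    cutoff = threshold * len(cluster)
    return {dim for dim, c in counts.items() if c >= cutoff}

# Toy example: dimensions 2 and 3 occur in all three points, so they
# survive the 50% cutoff while the rest are dropped.
cluster = [{1, 2, 3}, {2, 3, 4}, {2, 3, 5}]
center = freqitem_center(cluster)
print(center, [round(jaccard_distance(p, center), 2) for p in cluster])
```

Running this prints `{2, 3}` as the center, with each point at Jaccard distance 0.33 from it, which illustrates why such a center is easy to interpret: it is itself a small set of representative dimensions.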
Since the efficiency and effectiveness of k-FreqItems depend heavily on an initial set of representative seeds, we introduce a new randomized initialization method, SILK, to deal with the seeding problem of k-FreqItems. SILK uses locality-sensitive hash (LSH) functions for oversampling and identifies frequently co-occurring data in LSH buckets to determine a set of promising seeds, allowing k-FreqItems to converge swiftly in an iterative process. Experimental results over seven real-world sparse data sets show that SILK seeding is around 1.1× to 3.2× faster yet more effective than state-of-the-art seeding methods. Notably, SILK scales well to a billion data objects on a commodity machine with 4 GPUs. The code is available at https://github.com/HuangQiang/k-FreqItems.
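For intuition about the seeding step, here is a toy Python sketch of LSH-based seed selection under loudly stated assumptions: a simple seeded modular hash stands in for proper MinHash permutations, table and signature sizes are fixed constants, and a plain co-occurrence count ranks candidates. It only illustrates the bucket-and-count idea behind SILK, not the actual algorithm.

```python
import random
from collections import defaultdict

PRIME = (1 << 31) - 1  # Mersenne prime used as the toy hash modulus

def minhash_signature(point, seeds):
    """One MinHash-style value per seed: the minimum of a seeded
    modular hash over the point's non-zero dimensions (assumes a
    non-empty set; real MinHash uses random permutations)."""
    return tuple(min((dim * s + 1) % PRIME for dim in point) for s in seeds)

def lsh_cooccurrence_seeds(data, k, num_tables=8, sig_len=2, seed=42):
    """Hash every set into buckets across several LSH tables, score
    each point by how many bucket-mates it accumulates, and return
    the k highest-scoring points as candidate seeds."""
    rng = random.Random(seed)
    score = defaultdict(int)
    for _ in range(num_tables):
        seeds = [rng.randrange(1, PRIME) for _ in range(sig_len)]
        buckets = defaultdict(list)
        for i, point in enumerate(data):
            buckets[minhash_signature(point, seeds)].append(i)
        for members in buckets.values():
            for i in members:  # points in crowded buckets score high
                score[i] += len(members) - 1
    return sorted(score, key=score.get, reverse=True)[:k]

# Toy usage: two groups of mutually similar sets. Similar sets collide
# often under MinHash-style hashing, so the returned seeds tend to lie
# inside the dense groups rather than on isolated points.
data = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {7, 8, 9}, {7, 8, 10}, {8, 9, 10}]
print(lsh_cooccurrence_seeds(data, k=2))
```

The design intuition this sketch tries to capture is that MinHash collision probability grows with Jaccard similarity, so bucket co-occurrence counts act as a cheap density signal for picking seeds without any pairwise distance computation.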

Supplemental Material

MP4 File
In this presentation video, we introduce a novel sparse center representation, FreqItem, and a new partitioning-based method, k-FreqItems, for sparse data clustering with Jaccard distance. Like most partitioning-based methods, k-FreqItems depends critically on its initial cluster centers, so we further present SILK, an effective, distributed, and rapidly converging seeding method for k-FreqItems.


Cited By

  • Window Function Expression: Let the Self-Join Enter. Proceedings of the VLDB Endowment 17(9), 2162–2174. DOI: 10.14778/3665844.3665848. Published 6 Aug 2024.
  • Proximity Queries on Point Clouds using Rapid Construction Path Oracle. Proceedings of the ACM on Management of Data 2(1), 1–26. DOI: 10.1145/3639261. Published 26 Mar 2024.
  • The Effect of K-Means Clustering on Collaborative Filtering in Book Recommendation. 2024 12th International Conference on Information and Education Technology (ICIET), 41–45. DOI: 10.1109/ICIET60671.2024.10542774. Published 18 Mar 2024.


Published In

Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 1
May 2023, 2807 pages
EISSN: 2836-6573
DOI: 10.1145/3603164
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 May 2023
Published in PACMMOD Volume 1, Issue 1

Author Tags

  1. Jaccard distance
  2. frequent items
  3. locality-sensitive hashing
  4. minhash
  5. seeding
  6. sparse data clustering

Qualifiers

  • Research-article

Funding Sources

  • National University of Singapore Centre for Trusted Internet and Community Industry Collaborative Projects
  • National Research Foundation, Singapore under its Strategic Capability Research Centres Funding Initiative


Article Metrics

  • Downloads (last 12 months): 865
  • Downloads (last 6 weeks): 67

Reflects downloads up to 03 Oct 2024.

