Abstract
Recently, a “Purchase Tree” data structure was proposed to compress customer transaction data, and a local PurTree spectral clustering method was proposed to recover the cluster structure from purchase trees. However, the PurTree distance assigns equal weights to all children of a parent node, so it does not distinguish the importance of different nodes. In this paper, we propose a Structured PurTree Subspace Spectral (SPSS) clustering algorithm for PurTree data. The new method introduces a PurTree subspace similarity to compute the similarity between two trees, in which a set of sparse and structured node weights distinguishes the importance of different nodes in a purchase tree. A new clustering model learns a structured graph with an explicit cluster structure, and an iterative optimization algorithm simultaneously learns the structured graph and the node weights. We also propose a balanced cover tree for fast k-NN search when building affinity matrices. SPSS was compared with six clustering algorithms on 10 benchmark data sets, and the experimental results show the superiority of the new method.
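To make the role of node weights concrete, here is a minimal sketch of how unequal weights can change the similarity between two purchase trees. It is not the PurTree subspace similarity defined in the paper: the weighted Jaccard overlap, the toy product tree, and the names `equal_child_weights` and `weighted_similarity` are all assumptions made for illustration only.

```python
def equal_child_weights(children):
    """Give every child of a parent the same weight, so siblings sum to 1.

    `children` maps each parent node to the list of its child nodes
    (a hypothetical encoding of the product tree).
    """
    return {child: 1.0 / len(kids) for kids in children.values() for child in kids}


def weighted_similarity(tree_a, tree_b, weights):
    """Weighted Jaccard overlap between two trees given as sets of node names.

    This is an illustrative stand-in, not the similarity used by SPSS.
    """
    union = tree_a | tree_b
    if not union:
        return 1.0
    shared = sum(weights[n] for n in tree_a & tree_b)
    total = sum(weights[n] for n in union)
    return shared / total if total > 0 else 0.0


# Toy product tree (assumed): root -> {food, toys}, food -> {milk, bread}, toys -> {lego}
product_tree = {"root": ["food", "toys"], "food": ["milk", "bread"], "toys": ["lego"]}

# Each customer's purchase tree is the set of product-tree nodes they bought from.
customer_a = {"food", "milk", "bread"}
customer_b = {"food", "milk", "toys", "lego"}

# Equal sibling weights, as in the baseline setting the abstract criticizes.
equal_weights = equal_child_weights(product_tree)
sim_equal = weighted_similarity(customer_a, customer_b, equal_weights)

# Hand-picked sparse weights (assumed, not learned): "bread" is switched off
# and "milk" carries all the weight among the children of "food".
sparse_weights = {"food": 0.5, "toys": 0.5, "milk": 1.0, "bread": 0.0, "lego": 1.0}
sim_sparse = weighted_similarity(customer_a, customer_b, sparse_weights)

print(sim_equal, sim_sparse)
```

In this toy case the two customers look more alike once the shared node is weighted more heavily than the unshared sibling, which is the kind of effect that learned, sparse node weights are meant to capture. In SPSS the weights are learned jointly with the graph rather than set by hand.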
Acknowledgment
This research was supported by the National Key R&D Program of China 2018YFB1003201, NSFC under Grant nos. 61773268, 61502177, and U1636202, and Guangdong Key Laboratory project 2017B030314073.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, X., Guo, C., Fang, Y., Mao, R. (2019). Structured Spectral Clustering of PurTree Data. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_29
DOI: https://doi.org/10.1007/978-3-030-18579-4_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18578-7
Online ISBN: 978-3-030-18579-4
eBook Packages: Computer Science, Computer Science (R0)