research-article

An efficient topological-based clustering method on spatial data in network space

Published: 01 April 2023

Highlights

Proposes a novel algorithm, Network Space Topological-Based Clustering (NS-TBC), that applies topological relations.
NS-TBC overcomes the limitations of ACUTE for real-life problems.
NS-TBC can detect clusters in arbitrarily small, remotely located areas.
NS-TBC outperforms the previously proposed algorithms ACUTE and iNS-DBSCAN.

Abstract

In recent years, with the rapid growth of location-based information, clustering algorithms for spatial data have been widely applied to data insight and knowledge discovery. Existing clustering techniques usually depend on user-supplied parameters and mostly operate on the plane with Euclidean distance. This work proposes the Network Space Topological-Based Clustering (NS-TBC) algorithm for clustering network-constrained objects using a topological-based framework. The approach replaces distance measures with topological relations for spatial clustering in network space, which reduces the number of parameters that must be tuned. The proposed method builds on the advantages of the ACUTE algorithm, proposed in 2016, but targets network-constrained objects. NS-TBC is applied to six datasets from OpenStreetMap to demonstrate its effectiveness. On the Davies–Bouldin internal validation index, it outperforms both ACUTE and iNS-DBSCAN, the latter being our state-of-the-art efficient clustering algorithm for network-constrained spatial data. Runtime is also carefully measured. The runtime evaluation covers only NS-TBC and iNS-DBSCAN, because ACUTE was not designed to work in network space. The experimental results show that NS-TBC uses less than 50% of the computation time needed by iNS-DBSCAN. In short, NS-TBC reduces the number of parameters required by iNS-DBSCAN while significantly improving execution time and cluster quality.
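
For readers unfamiliar with the Davies–Bouldin index used above for internal validation, the following is a minimal sketch of how it can be computed. It is not the authors' evaluation code: the function name `davies_bouldin`, the 2-D point representation, and the use of plain Euclidean distance are all assumptions made for illustration, whereas a network-space evaluation would presumably measure distance along the network. Lower values indicate more compact, better separated clusters.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Davies-Bouldin index for 2-D points (lower is better).

    Assumption: Euclidean distance and arithmetic-mean centroids.
    Assumes no two clusters share the same centroid.
    """
    # Group points by cluster label.
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centroids = {}
    scatter = {}  # mean distance from cluster members to their centroid
    for lab, members in clusters.items():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        centroids[lab] = (cx, cy)
        scatter[lab] = sum(math.dist(p, (cx, cy)) for p in members) / len(members)

    # For each cluster, take its worst (largest) similarity ratio to any other
    # cluster, then average those worst cases over all clusters.
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(davies_bouldin(pts, [0, 0, 1, 1]))  # well separated grouping: 0.2
    print(davies_bouldin(pts, [0, 1, 0, 1]))  # poor grouping: 5.0
```

In the toy example, grouping the two left points and the two right points gives a much lower index than grouping by row, which matches the intuition that a lower score means better cluster quality.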

Cited By

  • (2024) Efficient strategies for spatial data clustering using topological relations. Applied Intelligence 55:2. DOI: 10.1007/s10489-024-05927-8. Online publication date: 23-Dec-2024

Published In

Expert Systems with Applications: An International Journal, Volume 215, Issue C
Apr 2023
1634 pages

Publisher

Pergamon Press, Inc.

United States

Author Tags

  1. Spatial clustering
  2. Topological-based clustering
  3. Network spatial analysis
  4. Topological relations
  5. Geographic information system (GIS)

Qualifiers

  • Research-article
