Abstract
In this study, we propose the semi-supervised particle swarm optimisation (ssPSO) algorithm for data clustering. The algorithm combines the strengths of semi-supervised fuzzy c-means (ssFCM) and particle swarm optimisation (PSO) to allow a more informed search using labelled data across a small number of iterations while maintaining diversity in the search process. ssFCM algorithms can find meaningful clusters by using available labelled data to guide the learning process. PSOs are often chosen to solve clustering problems because of their versatility in problem representation and their exploration capabilities. To assess the effectiveness of ssPSOs and provide practical insights to researchers, the clustering performance and clustering behaviour of ssPSOs are investigated and compared with those of PSO variants and ssFCMs. Two ssPSO approaches were studied, one applied at initialisation only and the other applied throughout the learning process. The ssPSO, PSO and ssFCM algorithms were evaluated on accuracy and quantisation error (QE) using 13 UCI datasets that differ in size, dimensionality, number of classes and distribution, exploring several swarm size and maximum iteration settings over 100 runs. Biplots and convergence graphs were also examined visually. ssPSOs were found to be competitive with ssFCM on most datasets in terms of accuracy and to outperform ssFCM in terms of QE using a swarm size of 20 and a maximum of 20 iterations. The results demonstrate that ssPSOs perform particularly well on sparsely distributed datasets with overlapping clusters and produce clusters with better structure in terms of QE. Furthermore, ssPSOs were shown to be competitive with ssFCM on datasets with more than three clusters, while quantum-behaved PSO (QPSO) performed poorly on such datasets.
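The abstract does not define QE. As a rough illustration, the sketch below uses a definition that is common in the PSO clustering literature: each point is assigned to its nearest centroid, the mean point-to-centroid distance is taken per cluster, and these means are averaged over the clusters. The function name `quantisation_error`, the Euclidean distance and the skipping of empty clusters are assumptions made for this example and may differ from the measure used in the article.

```python
import math


def quantisation_error(points, centroids):
    """Average, over non-empty clusters, of the mean distance to the centroid.

    Assumed definition for illustration; not taken from the article itself.
    """
    # Assign each point to its nearest centroid (Euclidean distance)
    # and record that distance under the winning cluster.
    clusters = [[] for _ in centroids]
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        nearest = dists.index(min(dists))
        clusters[nearest].append(dists[nearest])

    # Mean distance within each cluster; empty clusters are skipped
    # (an assumption, other treatments are possible).
    means = [sum(d) / len(d) for d in clusters if d]
    return sum(means) / len(means)


# Toy data: two well-separated groups in two dimensions.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centroids = [(0.0, 1.0), (10.0, 2.0)]
qe = quantisation_error(points, centroids)
```

In this toy case the within-cluster mean distances are 1 and 2, so `qe` is their average. Lower values indicate more compact clusters, which is the sense in which the abstract describes ssPSOs as producing clusters with better structure.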
Acknowledgements
The authors would like to thank Dr. Mikiko Sato from Tokai University for her feedback. This work is partially supported by Universiti Brunei Darussalam under Grant UBD/PNC2/2/RG/1(311). D. T. C. Lai is funded by Hosei University under the Hosei International Fund Foreign Scholars Fellowship.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lai, D.T.C., Miyakawa, M. & Sato, Y. Semi-supervised data clustering using particle swarm optimisation. Soft Comput 24, 3499–3510 (2020). https://doi.org/10.1007/s00500-019-04114-z
DOI: https://doi.org/10.1007/s00500-019-04114-z