Semi-supervised data clustering using particle swarm optimisation

  • Methodologies and Application
Soft Computing

Abstract

In this study, we propose the semi-supervised particle swarm optimisation (ssPSO) algorithm for data clustering. The algorithm combines the strengths of semi-supervised fuzzy c-means (ssFCM) and particle swarm optimisation (PSO) to allow a more informed search, guided by labelled data, over a small number of iterations while maintaining diversity in the search process. ssFCM algorithms can find meaningful clusters by using available labelled data to guide the learning process. PSOs are often chosen to solve clustering problems because of their versatility in problem representation and their exploration capabilities. To assess the quality of ssPSOs and provide practical insights to researchers, the clustering performance and clustering behaviour of ssPSOs are investigated and compared with those of PSO variants and ssFCMs. Two ssPSO approaches were studied: in one, the semi-supervised component is applied at initialisation only; in the other, it is applied throughout the learning process. The ssPSO, PSO and ssFCM algorithms were evaluated on accuracy and quantisation error (QE) using 13 UCI datasets that differ in size, dimensionality, number of classes and distribution, across several swarm size and maximum iteration settings over 100 runs. Biplots and convergence graphs were also examined visually. ssPSOs were found to perform competitively with ssFCM on most datasets in terms of accuracy and to outperform ssFCM in terms of QE using a swarm size of 20 and a maximum of 20 iterations. The results demonstrate that ssPSOs perform particularly well on sparsely distributed datasets with overlapping clusters and produce clusters with better structure in terms of QE. Furthermore, ssPSOs were shown to perform competitively with ssFCM on datasets with more than three clusters, while quantum-behaved PSO (QPSO) performed poorly on such datasets.
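
As a rough illustration of the ideas in the abstract, the sketch below shows a plain global-best PSO that clusters data by minimising QE, with an optional particle seeded from labelled data. It is not the authors' ssPSO. The abstract does not define QE or say how labelled data enter the search, so several things here are assumptions: QE is taken in the form commonly used in PSO clustering (the mean, over non-empty clusters, of the average distance between each point and its nearest centroid), the "initialisation only" variant is imitated by placing the class means of the labelled points into one particle, and the function names and the values of w, c1 and c2 are hypothetical choices. Only the swarm size of 20 and the maximum of 20 iterations come from the abstract.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantisation_error(data, centroids):
    # Collect, per centroid, the distances of the points assigned to it.
    dists = [[] for _ in centroids]
    for point in data:
        d = [euclidean(point, c) for c in centroids]
        nearest = d.index(min(d))
        dists[nearest].append(d[nearest])
    # Assumption: empty clusters are skipped rather than penalised.
    means = [sum(ds) / len(ds) for ds in dists if ds]
    return sum(means) / len(means)


def class_means(labelled):
    # labelled: list of (point, label) pairs; one mean per label, sorted by label.
    groups = {}
    for point, label in labelled:
        groups.setdefault(label, []).append(point)
    return [[sum(col) / len(col) for col in zip(*groups[y])] for y in sorted(groups)]


def pso_cluster(data, k, swarm_size=20, max_iter=20, seed_centroids=None,
                w=0.72, c1=1.49, c2=1.49, rng=None):
    # w, c1, c2 are commonly used PSO settings, assumed here for illustration.
    rng = rng or random.Random(0)
    dim = len(data[0])

    def unflatten(x):
        # A particle is a flat list holding k centroids of length dim.
        return [x[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(x):
        return quantisation_error(data, unflatten(x))

    # Initialise every particle from k randomly chosen data points.
    swarm = [[v for p in rng.sample(data, k) for v in p] for _ in range(swarm_size)]
    if seed_centroids is not None:
        # Assumed form of supervision: one particle starts at the labelled class means.
        swarm[0] = [v for c in seed_centroids for v in c]
    vel = [[0.0] * (k * dim) for _ in range(swarm_size)]
    pbest = [x[:] for x in swarm]
    pbest_fit = [fitness(x) for x in swarm]
    g = pbest_fit.index(min(pbest_fit))
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(max_iter):
        for i, x in enumerate(swarm):
            for d in range(k * dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - x[d])
                             + c2 * r2 * (gbest[d] - x[d]))
                x[d] += vel[i][d]
            f = fitness(x)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[:], f
                if f < gbest_fit:
                    gbest, gbest_fit = x[:], f
    return unflatten(gbest), gbest_fit
```

Calling `pso_cluster(data, k, seed_centroids=class_means(labelled))` returns the best centroids found and their QE. Because the seeded particle is part of the initial swarm and the global best only ever improves, the returned QE is never worse than the QE of the labelled class means themselves.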

Acknowledgements

The authors would like to thank Dr. Mikiko Sato from Tokai University for her feedback. This work is partially supported by Universiti Brunei Darussalam under Grant UBD/PNC2/2/RG/1(311). D. T. C. Lai is funded by Hosei University under the Hosei International Fund Foreign Scholars Fellowship.

Author information

Corresponding author

Correspondence to Daphne T. C. Lai.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lai, D.T.C., Miyakawa, M. & Sato, Y. Semi-supervised data clustering using particle swarm optimisation. Soft Comput 24, 3499–3510 (2020). https://doi.org/10.1007/s00500-019-04114-z

  • DOI: https://doi.org/10.1007/s00500-019-04114-z

Keywords