
Soft dimensionality reduction for reinforcement data clustering


Abstract

The standard Euclidean distance gives every feature of each pair of data samples an equal contribution when computing the similarity matrix, whereas the features of real-world datasets differ in importance. This paper proposes a new clustering method based on reinforcement learning and soft feature selection, built on three ideas. First, a novel distance metric based on the importance of features is introduced, which can also approximately eliminate irrelevant features. Second, a new soft weighting mechanism is defined on this distance to determine the effect of the neighborhood probability in the similarity matrix; since training data may contain noisy and redundant features, a sparsity regularization term is applied to suppress them and to emphasize feature selection. Third, after these dimensionality reduction steps, a new clustering method is developed based on reinforcement learning, which treats the resulting low-dimensional data points as the states of the learning agents. Until convergence, it applies actions that transfer the worst-placed points, those with the greatest scattering, from one cluster to another, producing coherent clusters while keeping them balanced. The proposed method achieves high within-cluster consistency. Experimental results on several real-world datasets show the good performance and efficiency of the proposed method, and statistical analysis, parameter sensitivity analysis, and time complexity analysis all confirm the appropriateness of the results obtained.
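To make the three steps concrete, the sketch below illustrates the pipeline in Python. It is an illustrative reconstruction, not the authors' implementation: the function names (`weighted_distances`, `soft_similarity`, `update_weights`, `rl_cluster`) are hypothetical, the exponential feature-reweighting step stands in for the paper's sparsity-regularized optimization, and the Q-learning loop is a toy version of the reinforcement clustering stage that transfers the worst-scattered point between clusters.

```python
import numpy as np

def weighted_distances(X, w):
    """Pairwise squared distances in which feature j contributes with weight
    w[j]; features whose weight is near zero are softly removed."""
    diff = X[:, None, :] - X[None, :, :]               # (n, n, d)
    return np.einsum('ijk,k->ij', diff ** 2, w)        # (n, n)

def soft_similarity(X, w, sigma=1.0):
    """Neighborhood-probability similarity matrix built from the weighted
    distance: each row is a probability distribution over neighbors."""
    S = np.exp(-weighted_distances(X, w) / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    return S / S.sum(axis=1, keepdims=True)

def update_weights(X, S, lam=0.5):
    """One reweighting step: features whose neighbors are tightly packed get
    large weight, scattered features decay toward zero.  An exponential-weight
    heuristic standing in for the paper's sparsity-regularized objective
    (assumption, not the published update rule)."""
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2       # (n, n, d)
    scatter = np.einsum('ij,ijk->k', S, diff2) / len(X)   # mean per-feature scatter
    w = np.exp(-(scatter - scatter.min()) / lam)       # numerically stable weights
    return w / w.sum()

def rl_cluster(Z, k, episodes=300, eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    """Toy Q-learning clustering loop: each embedded point is a state, an
    action reassigns it to one of k clusters, and the reward is the drop in
    within-cluster scatter, so the worst-placed points migrate first."""
    rng = np.random.default_rng(seed)
    n = len(Z)
    labels = rng.integers(k, size=n)
    Q = np.zeros((n, k))

    def scatter(lbl):
        return sum(((Z[lbl == c] - Z[lbl == c].mean(0)) ** 2).sum()
                   for c in range(k) if (lbl == c).any())

    for _ in range(episodes):
        centers = np.stack([Z[labels == c].mean(0) if (labels == c).any()
                            else Z[rng.integers(n)] for c in range(k)])
        # the "worst" point: farthest from its own cluster center
        i = int(np.argmax(((Z - centers[labels]) ** 2).sum(axis=1)))
        a = rng.integers(k) if rng.random() < eps else int(np.argmax(Q[i]))
        old, before = labels[i], scatter(labels)
        labels[i] = a
        reward = before - scatter(labels)              # positive if the move helps
        Q[i, a] += alpha * (reward + gamma * Q[i].max() - Q[i, a])
        if reward < 0:
            labels[i] = old                            # reject moves that hurt coherence
    return labels
```

Alternating the similarity and weight updates a few times drives the weights of noisy features toward zero before the re-weighted points are handed to the clustering loop. The run below is on synthetic data and purely illustrative:

```python
X = np.random.default_rng(1).normal(size=(120, 8))
w = np.full(8, 1.0 / 8)                 # start from uniform feature weights
for _ in range(5):                      # alternate similarity / weight updates
    w = update_weights(X, soft_similarity(X, w))
Z = X * w                               # soft-selected, re-weighted features
labels = rl_cluster(Z, k=3)
```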




Availability of data and materials

All datasets used in the experiments of this paper are available from the UCI and KEEL public data repositories.


Funding

This research was partially funded by the Iran National Science Foundation (INSF) under grant number 98002017.

Author information


Contributions

F. Fathinezhad and P. Adibi provided the main methodology and conceptualization of the paper. F. Fathinezhad, P. Adibi, and H. Baradaran Kashani carried out the experiments and data analysis. F. Fathinezhad and P. Adibi wrote the original draft. Revision of the original material, and the review and editing of the final manuscript text, were performed by all authors.

Corresponding author

Correspondence to Peyman Adibi.

Ethics declarations

Ethics approval

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Fatemeh Fathinezhad, Bijan Shoushtarian, Hamidreza Baradaran Kashani and Jocelyn Chanussot contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fathinezhad, F., Adibi, P., Shoushtarian, B. et al. Soft dimensionality reduction for reinforcement data clustering. World Wide Web 26, 3027–3054 (2023). https://doi.org/10.1007/s11280-023-01158-y

