Abstract
Our study proposes a dimensionality reduction approach to efficiently process a service monitoring application's high-dimensional, unlabeled time-series dataset. The approach aims to improve data quality while reducing the feature space as much as possible. Because the dataset is vast and the reduction process is resource-intensive, we divide it into several weekly sub-datasets. Using clustering methods and metrics, we thoroughly evaluate the approach's efficacy on each sub-dataset and show that information loss after the data transformation is minimal. Moreover, we assess each sub-dataset's trustworthiness and similarity to verify that the transformed data acquire the same cluster labels as the original data. Since the experiments reveal high data quality, our industrial partner can use the transformed data in their decision-making tasks.
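The evaluation pipeline the abstract describes (reduce each weekly sub-dataset, re-cluster, then check trustworthiness and label agreement) can be illustrated with a minimal scikit-learn sketch. This is an assumption-laden illustration, not the authors' exact pipeline: the kernel PCA reducer, the number of components and clusters, and the synthetic data are all hypothetical placeholders.

```python
# Hedged sketch of the reduce-then-validate workflow; all parameters are illustrative.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.cluster import KMeans
from sklearn.manifold import trustworthiness
from sklearn.metrics import silhouette_score, adjusted_rand_score

rng = np.random.default_rng(0)
# Stand-in for one weekly sub-dataset: 500 samples, 100 monitoring features.
X_week = rng.normal(size=(500, 100))

# Reduce the feature space (here with kernel PCA as one possible nonlinear reducer).
reducer = KernelPCA(n_components=10, kernel="rbf", random_state=0)
X_low = reducer.fit_transform(X_week)

# Cluster both the original and the reduced data with identical settings.
labels_orig = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_week)
labels_low = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_low)

# Validate: cluster quality in the reduced space, neighborhood preservation,
# and whether the reduced data acquire the same cluster labels.
print("silhouette (reduced):", silhouette_score(X_low, labels_low))
print("trustworthiness:", trustworthiness(X_week, X_low, n_neighbors=5))
print("label agreement (ARI):", adjusted_rand_score(labels_orig, labels_low))
```

Running the same three checks on every weekly sub-dataset is one way to confirm, as the abstract claims, that information loss stays minimal across the whole dataset.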
Acknowledgements
We want to express our gratitude to Global AI Accelerator (GAIA) Ericsson, Montreal, for collaborating with us on this research work and the Observability team for allowing us access to the data.
Funding
This research is supported by MITACS Accelerate Canada (project number: IT16751).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that no conflict of interest is related to this publication.
Ethics approval
Not applicable.
Consent to participate
All authors have read and approved the manuscript and agreed to the authorship.
Consent for publication
All authors have given consent for publishing the manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the topical collection: “Advances on Agents and Artificial Intelligence” guest-edited by Jaap van den Herik, Ana Paula Rocha and Luc Steels.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Anowar, F., Sadaoui, S. & Dalal, H. Dimensionality Reduction of Service Monitoring Time-Series: An Industrial Use Case. SN COMPUT. SCI. 4, 23 (2023). https://doi.org/10.1007/s42979-022-01428-y