research-article

Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection

Authors:

Chunjiong Zhang

Volume 59, Issue 2

https://doi.org/10.1016/j.ipm.2021.102844

Published: 01 March 2022

Abstract

Previous studies have adopted unsupervised machine learning with dimension reduction for cyberattack detection, but these approaches are limited in their ability to perform robust anomaly detection on high-dimensional and sparse data. Most of them assume homogeneous parameters with a specific Gaussian distribution for each domain and do not test robustness to data skewness. This paper proposes unsupervised ensemble autoencoders connected to a Gaussian mixture model (GMM) to adapt to multiple domains regardless of the skewness of each domain. In the hidden space of the ensemble autoencoder, the attention-based latent representation and the reconstructed features with minimum error are used. The expectation maximization (EM) algorithm is used to estimate the sample density in the GMM. When the estimated sample density exceeds the threshold learned in the training phase, the sample is identified as an outlier related to an attack anomaly. Finally, the ensemble autoencoder and the GMM are jointly optimized, which transforms the optimization of the objective function into a Lagrangian dual problem. Experiments conducted on three public data sets show that the proposed model is competitive with the selected anomaly detection baselines.
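The EM-plus-threshold step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a single scalar feature per sample (for example, a reconstruction error), whereas the paper feeds the GMM a latent representation together with reconstructed features. It also scores samples by negative log-density ("energy"), a common convention in autoencoder-GMM detectors, which may differ from the paper's exact scoring rule. The function names, the quantile-based threshold, and the synthetic data are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(xs, k=2, iters=50, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Assumed initialisation: means at evenly spaced quantiles, shared variance.
    means = [xs_sorted[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate mixture weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


def energy(x, weights, means, variances):
    """Negative log-density under the mixture; higher means more unusual."""
    density = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))


def energy_threshold(train_xs, params, quantile=0.95):
    """Assumed rule: threshold at a high quantile of the training energies."""
    scores = sorted(energy(x, *params) for x in train_xs)
    return scores[min(int(quantile * len(scores)), len(scores) - 1)]


# Synthetic "normal traffic" scores drawn from two modes (made-up data).
random.seed(0)
train = ([random.gauss(0.1, 0.02) for _ in range(200)]
         + [random.gauss(0.3, 0.03) for _ in range(200)])
params = fit_gmm_em(train, k=2)
threshold = energy_threshold(train, params)

for sample in (0.1, 2.0):
    flagged = energy(sample, *params) > threshold
    print(f"sample={sample} flagged_as_anomaly={flagged}")
```

In this sketch a sample is flagged when its energy exceeds a threshold taken from the training scores, so the threshold is learned from data and not fixed by hand. Joint optimization of the autoencoder and the GMM, which the abstract describes, is outside the scope of this example.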

Highlights

• An ensemble framework for multichannel network anomaly detection that combines deep autoencoders and the GMM.

• A robust optimization version of EM³ for multiple domains, which transforms the optimization of the objective function into a Lagrangian dual problem.

• We derive the formulas, analyze convergence, and prove that our model is stable and robust.

• To the best of our knowledge, this is the first work that applies algorithms to both differentiated data domains and data distributions.

Index Terms

Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
    2. Machine learning approaches
      1. Neural networks
2. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Intrusion detection systems

Index terms have been assigned to the content through auto-classification.

Published In

Information Processing and Management: an International Journal Volume 59, Issue 2

Mar 2022

970 pages

ISSN:0306-4573

Copyright © 2021.

Publisher

Pergamon Press, Inc.

United States

Qualifiers

Research-article

