A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification
Abstract
:1. Introduction
2. Materials and Methods
3. Brief Overview of Modern SCADA Architecture
4. Supervised Learning for SCADA Security
4.1. Datasets Generation Mechanism Overview
4.2. Feature Engineering and Optimization Mechanism
4.3. Classification Mechanism
4.3.1. Support Vector Machine (SVM)
4.3.2. Artificial Neural Network (ANN)
4.3.3. Decision Tree (DT)
4.3.4. Random Forest (RF)
4.3.5. Bayesian Networks (BN)
4.3.6. K-Nearest Neighbors (k-NN)
5. Suggestions and Recommendations for Future Works
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Tariq, N.; Asim, M.; Khan, F.A. Securing SCADA-based Critical Infrastructures: Challenges and Open Issues. Procedia Comput. Sci. 2019, 155, 612–617. [Google Scholar] [CrossRef]
- Cifranic, N.; Hallman, R.A.; Romero-Mariona, J.; Souza, B.; Calton, T.; Coca, G. Decepti-SCADA: A cyber deception framework for active defense of networked critical infrastructures. Internet Things 2020, 12, 100320. [Google Scholar] [CrossRef]
- Upadhyay, D.; Sampalli, S. SCADA (Supervisory Control and Data Acquisition) systems: Vulnerability assessment and security recommendations. Comput. Secur. 2020, 89, 101666. [Google Scholar] [CrossRef]
- Phillips, B.; Gamess, E.; Krishnaprasad, S. An evaluation of machine learning-based anomaly detection in a SCADA system using the modbus protocol. In Proceedings of the 2020 ACM Southeast Conference, Tampa, FL, USA, 2–4 April 2020; pp. 188–196. [Google Scholar]
- Alimi, A.M.; Ouahada, K.; Abu-Mahfouz, A.M. A Review of Machine Learning Approaches to Power System Security and Stability. IEEE Access 2020, 8, 113512–113531. [Google Scholar] [CrossRef]
- Ahmad, Z.; Durad, M.H. Development of SCADA simulator using omnet. In Proceedings of the 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST), Islamabad, Pakistan, 8–12 January 2019; pp. 676–680. [Google Scholar]
- Yadav, G.; Paul, K. Architecture and security of SCADA systems: A review. Int. J. Crit. Infrastruct. Prot. 2021, 34, 100433. [Google Scholar] [CrossRef]
- Asghar, M.R.; Hu, Q.; Zeadally, S. Cybersecurity in industrial control systems: Issues, technologies, and challenges. Comput. Netw. 2019, 165, 106946. [Google Scholar] [CrossRef]
- Shlomo, A.; Kalech, M.; Moskovitch, R. Temporal pattern-based malicious activity detection in SCADA systems. Comput. Secur. 2021, 102, 102153. [Google Scholar] [CrossRef]
- Rezai, A.; Keshavarzi, P.; Moravej, Z. Key management issue in SCADA networks: A review. Eng. Sci. Technol. Int. J. 2017, 20, 354–363. [Google Scholar] [CrossRef] [Green Version]
- Yang, Y.; McLaughlin, K.; Sezer, S.; Littler, T.; Im, E.G.; Pranggono, B.; Wang, H.F. Multiattribute SCADA-Specific Intrusion Detection System for Power Networks. IEEE Trans. Power Deliv. 2014, 29, 1092–1102. [Google Scholar] [CrossRef] [Green Version]
- Moon, D.; Im, H.; Kim, I.; Park, J.H. DTB-IDS: An intrusion detection system based on decision tree using behavior analysis for preventing APT attacks. J. Supercomput. 2017, 73, 2881–2895. [Google Scholar] [CrossRef]
- Junejo, K.N.; Goh, J. Behaviour-based attack detection and classification in cyber physical systems using machine learning. In Proceedings of the 2nd ACM International Workshop on Cyber-Physical System Security, Xi’an, China, 30 May 2016; pp. 34–43. [Google Scholar]
- Hink, R.C.B.; Beaver, J.M.; Buckner, M.A.; Morris, T.; Adhikari, U.; Pan, S. Machine learning for power system disturbance and cyber-attack discrimination. In Proceedings of the 2014 7th International Symposium on Resilient Control Systems (ISRCS), Denver, CO, USA, 19–21 August 2014; pp. 1–8. [Google Scholar]
- Miller, B.; Rowe, D. A survey SCADA of and critical infrastructure incidents. In Proceedings of the 1st Annual Conference on Research in Information Technology, Calgary, AB, Canada, 11–13 October 2012; pp. 51–56. [Google Scholar]
- Rakas, S.V.B.; Stojanovic, M.D.; Markovic-Petrovic, J.D. A Review of Research Work on Network-Based SCADA Intrusion Detection Systems. IEEE Access 2020, 8, 93083–93108. [Google Scholar] [CrossRef]
- el Kalam, A.A. Securing SCADA and critical industrial systems: From needs to security mechanisms. Int. J. Crit. Infrastruct. Prot. 2021, 32, 100394. [Google Scholar] [CrossRef]
- Kabore, R.; Kouassi, A.; N’Goran, R.; Asseu, O.; Kermarrec, Y.; Lenca, P. Review of Anomaly Detection Systems in Industrial Control Systems Using Deep Feature Learning Approach. Enginerring 2021, 13, 30–44. [Google Scholar] [CrossRef]
- Yadav, G.; Paul, K. Assessment of SCADA System Vulnerabilities. In Proceedings of the 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Zaragoza, Spain, 10–13 September 2019; pp. 1737–1744. [Google Scholar]
- Yaacoub, J.-P.A.; Salman, O.; Noura, H.N.; Kaaniche, N.; Chehab, A.; Malli, M. Cyber-physical systems security: Limitations, issues and future trends. Microprocess. Microsyst. 2020, 77, 103201. [Google Scholar] [CrossRef]
- Fortinet, Independent Study on SCADA/ICS Security Risks. Available online: https://www.fortinet.com/content/dam/fortinet/assets/white-papers/WP-Independent-Study-Pinpoints-Significant-Scada-ICS-Cybersecurity-Risks.pdf (accessed on 19 May 2021).
- Trend Micro Zero Day Initiative. Available online: https://www.trendmicro.com/en_no/about/newsroom/press-releases/2019/2019-12-03-trend-micros-zero-day-initiative-leads-vulnerability-disclosure-landscape-in-independent-research.html (accessed on 22 May 2021).
- Ahmed, M.; Anwar, A.; Mahmood, A.N.; Shah, Z.; Maher, M.J. An Investigation of Performance Analysis of Anomaly Detection Techniques for Big Data in SCADA Systems. EAI Endorsed Trans. Ind. Netw. Intell. Syst. 2015, 2, 5. [Google Scholar] [CrossRef] [Green Version]
- Microsoft Academic. Available online: https://academic.microsoft.com/ (accessed on 26 June 2021).
- Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20. [Google Scholar] [CrossRef]
- Ferrag, M.A.; Babaghayou, M.; Yazici, M.A. Cyber security for fog-based smart grid SCADA systems: Solutions and challenges. J. Inf. Secur. Appl. 2020, 52, 102500. [Google Scholar] [CrossRef]
- Cherdantseva, Y.; Burnap, P.; Blyth, A.; Eden, P.; Jones, K.; Soulsby, H.; Stoddart, K. A review of cyber security risk assessment methods for SCADA systems. Comput. Secur. 2016, 56, 1–27. [Google Scholar] [CrossRef] [Green Version]
- Ahmim, A.; Ferrag, M.A.; Maglaras, L.; Derdour, M.; Janicke, H.; Drivas, G. Taxonomy of Supervised Machine Learning for Intrusion Detection Systems. Sustain. Transp. Dev. Innov. Technol. 2020, 619–628. [Google Scholar] [CrossRef]
- Samdarshi, R.; Sinha, N.; Tripathi, P. A triple layer intrusion detection system for SCADA security of electric utility. In Proceedings of the 2015 Annual IEEE India Conference (INDICON), New Delhi, India, 17–20 December 2015; pp. 1–5. [Google Scholar]
- Alimi, A.M.; Ouahada, K. Security Assessment of the Smart Grid: A Review focusing on the NAN Architecture. In Proceedings of the 2018 IEEE 7th International Conference on Adaptive Science & Technology (ICAST), Accra, Ghana, 22–24 August 2018; pp. 1–8. [Google Scholar]
- Reuter, L.; Jung, O.; Magin, J. Neural network based anomaly detection for SCADA systems. In Proceedings of the 2020 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), Paris, France, 24–27 February 2020; pp. 194–201. [Google Scholar]
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S. Power system events classification using genetic algorithm based feature weighting technique for support vector machine. Heliyon 2021, 7, e05936. [Google Scholar] [CrossRef]
- Paramkusem, K.M.; Aygun, R.S. Classifying Categories of SCADA Attacks in a Big Data Framework. Ann. Data Sci. 2018, 5, 359–386. [Google Scholar] [CrossRef]
- Zhu, B.; Joseph, A.D.; Sastry, S. A Taxonomy of Cyber Attacks on SCADA Systems. In Proceedings of the 2011 International Conference on Internet of Things and 4th International Conference on Cyber, Physical and Social Computing, Dalian, Liaoning, China, 19–22 October 2011; pp. 380–388. [Google Scholar]
- Maglaras, L.A.; Jiang, J.; Cruz, T.J. Combining ensemble methods and social network metrics for improving accuracy of OCSVM on intrusion detection in SCADA systems. J. Inf. Secur. Appl. 2016, 30, 15–26. [Google Scholar] [CrossRef] [Green Version]
- Ranganathan, G.; Rocha, A. Inventive Communication and Computational Technologies. In Proceedings of the 4th International Conference on Inventive Communication and Computational Technologies (ICICCT 2020), Tamil Nadu, India, 28–29 May 2020. [Google Scholar]
- Shakarami, A.; Ghobaei-Arani, M.; Shahidinejad, A. A survey on the computation offloading approaches in mobile edge computing: A machine learning-based perspective. Comput. Netw. 2020, 182, 107496. [Google Scholar] [CrossRef]
- Özgür, A.; Erdem, H. A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015. PeerJ Prepr. 2016, 4, e1954v1. [Google Scholar]
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015; pp. 1–6. [Google Scholar]
- Goh, J.; Adepu, S.; Junejo, K.N.; Mathur, A. A Dataset to Support Research in the Design of Secure Water Treatment Systems. In Proceedings of the International Conference on Critical Information Infrastructures Security, Paris, France, 10–12 October 2016; pp. 88–99. [Google Scholar]
- Singapore University of Technology and Design. iTrust, Centre for Research in Cyber Security. Available online: https://itrust.sutd.edu.sg/itrust-labs-home/itrust-labs_swat (accessed on 15 July 2021).
- Nazir, S.; Patel, S.; Patel, D. Assessing and augmenting SCADA cyber security: A survey of techniques. Comput. Secur. 2017, 70, 436–454. [Google Scholar] [CrossRef] [Green Version]
- Alimi, K.O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S. A Survey on the Security of Low Power Wide Area Networks: Threats, Challenges, and Potential Solutions. Sensors 2020, 20, 5800. [Google Scholar] [CrossRef]
- Shitharth, S. An enhanced optimization based algorithm for intrusion detection in SCADA network. Comput. Secur. 2017, 70, 16–26. [Google Scholar] [CrossRef]
- Wang, C.; Fang, L.; Dai, Y. A Simulation Environment for SCADA Security Analysis and Assessment. In Proceedings of the 2010 International Conference on Measuring Technology and Mechatronics Automation, Changsha, China, 13–14 March 2010; pp. 342–347. [Google Scholar]
- Queiroz, C.; Mahmood, A.; Tari, Z. SCADASim—A Framework for Building SCADA Simulations. IEEE Trans. Smart Grid 2011, 2, 589–597. [Google Scholar] [CrossRef]
- Mathioudakis, K.; Frangiadakis, N.; Merentitis, A.; Gazis, V. Towards generic SCADA simulators: A survey of existing multi-purpose co-simulation platforms, best practices and use-cases. AGT Group (R D) GmbH Hilpertstrasse 2013, 35, 64295. [Google Scholar]
- Gao, J.; Gan, L.; Buschendorf, F.; Zhang, L.; Liu, H.; Li, P.; Dong, X.; Lu, T. Omni SCADA Intrusion Detection Using Deep Learning Algorithms. IEEE Internet Things J. 2021, 8, 951–961. [Google Scholar] [CrossRef]
- Yang, D.; Usynin, A.; Hines, J.W. Anomaly-based intrusion detection for SCADA systems. In Proceedings of the 5th International Topical Meeting on Nuclear Plant Instrumentation, Control and Human Machine Interface Technologies (Npic&hmit 05), Knoxville, TN, USA, 12–16 November 2006; pp. 12–16. [Google Scholar]
- Linda, O.; Vollmer, T.; Manic, M. Neural Network based Intrusion Detection System for critical infrastructures. In Proceedings of the 2009 International Joint Conference on Neural Networks, Atlanta, GA, USA, 14–19 June 2009; pp. 1827–1834. [Google Scholar]
- Branisavljević, N.; Kapelan, Z.; Prodanovic, D. Improved real-time data anomaly detection using context classification. J. Hydroinform. 2011, 13, 307–323. [Google Scholar] [CrossRef]
- MIT Lincoln Laboratory. 1998 Darpa Intrusion Detection Evaluation Dataset. Available online: https://www.ll.mit.edu/r-d/datasets/1998-darpa-intrusion-detection-evaluation-dataset (accessed on 26 July 2021).
- Zhang, Y.; Wang, L.; Sun, W.; Ii, R.C.G.; Alam, M. Distributed Intrusion Detection System in a Multi-Layer Network Architecture of Smart Grids. IEEE Trans. Smart Grid 2011, 2, 796–808. [Google Scholar] [CrossRef] [Green Version]
- Poojitha, G.; Kumar, K.N.; Reddy, P.J. Intrusion Detection using Artificial Neural Network. In Proceedings of the 2010 Second International Conference on Computing, Communication and Networking Technologies, Karur, India, 29–31 July 2010; pp. 1–7. [Google Scholar]
- Al-Daweri, M.S.; Abdullah, S.; Ariffin, K.A.Z. A homogeneous ensemble based dynamic artificial neural network for solving the intrusion detection problem. Int. J. Crit. Infrastruct. Prot. 2021, 34, 100449. [Google Scholar] [CrossRef]
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6. [Google Scholar]
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S.; Adefemi Alimi, K.O. Intrusion Detection for Water Distribution Systems based on an Hybrid Particle Swarm Optimization with Back Propagation Neural Network. IEEE Africon 2021, accepted. [Google Scholar]
- Inoue, J.; Yamagata, Y.; Chen, Y.; Poskitt, C.; Sun, J. Anomaly Detection for a Water Treatment System Using Unsupervised Machine Learning. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 1058–1065. [Google Scholar]
- Shalyga, D.; Filonov, P.; Lavrentyev, A. Anomaly detection for water treatment system based on neural network with automatic architecture optimization. arXiv 2018, arXiv:1807.07282. [Google Scholar]
- Zizzo, G.; Hankin, C.; Maffeis, S.; Jones, K. Intrusion detection for industrial control systems: Evaluation analysis and adversarial attacks. arXiv 2019, arXiv:1911.04278. [Google Scholar]
- Li, D.; Chen, D.; Jin, B.; Shi, L.; Goh, J.; Ng, S.-K. MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks. In Lecture Notes in Computer Science; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019; pp. 703–716. [Google Scholar]
- Ferrag, M.A.; Maglaras, L.; Moschoyiannis, S.; Janicke, H. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. J. Inf. Secur. Appl. 2020, 50, 102419. [Google Scholar] [CrossRef]
- Choi, S.; Yun, J.-H.; Kim, S.-K. A Comparison of ICS Datasets for Security Research Based on Attack Paths. In Proceedings of the International Conference on Critical Information Infrastructures Security, Kaunas, Lithuania, 24–26 September 2018; pp. 154–166. [Google Scholar]
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the ICISSP 2018, Madeira, Portugal, 22–24 January 2018; pp. 108–116. [Google Scholar]
- Lin, Q.; Verwer, S.; Kooij, R.; Mathur, A. Using Datasets from Industrial Control Systems for Cyber Security Research and Education. In Proceedings of the International Conference on Critical Information Infrastructures Security, Linköping, Sweden, 23–25 September 2019; pp. 122–133. [Google Scholar]
- Conti, M.; Donadel, D.; Turrin, F. A Survey on Industrial Control System Testbeds and Datasets for Security Research. arXiv 2021, arXiv:2102.05631. [Google Scholar]
- Kilincer, I.F.; Ertam, F.; Sengur, A. Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Comput. Netw. 2021, 188, 107840. [Google Scholar] [CrossRef]
- Choudhary, S.; Kesswani, N. Analysis of KDD-Cup’99, NSL-KDD and UNSW-NB15 Datasets using Deep Learning in IoT. Procedia Comput. Sci. 2020, 167, 1561–1573. [Google Scholar] [CrossRef]
- Sonule, A.R.; Kalla, M.; Jain, A.; Chouhan, D.S. UNSWNB15 Dataset and Machine Learning Based Intrusion Detection Systems. Int. J. Eng. Adv. Technol. 2020, 9, 2638–2648. [Google Scholar]
- Song, J.; Takakura, H.; Okabe, Y. Description of Kyoto University Benchmark Data. Available online: http://Www.Takakura.Com/Kyoto_data/BenchmarkData-Description-V5.Pdf (accessed on 26 June 2021).
- Suman, C.; Tripathy, S.; Saha, S. Building an effective intrusion detection system using unsupervised feature selection in multi-objective optimization framework. arXiv 2019, arXiv:1905.06562. [Google Scholar]
- Kim, J.; Kim, J.; Kim, H.; Shim, M.; Choi, E. CNN-Based Network Intrusion Detection against Denial-of-Service Attacks. Electronics 2020, 9, 916. [Google Scholar] [CrossRef]
- Waghmare, S.; Kazi, F.; Singh, N. Data driven approach to attack detection in a cyber-physical smart grid system. In Proceedings of the 2017 Indian Control Conference (ICC), Guwahati, India, 4–6 January 2017; pp. 271–276. [Google Scholar]
- Mansouri, A.; Majidi, B.; Shamisa, A. Anomaly detection in industrial control systems using evolutionary-based optimization of neural networks. Commun. Adv. Comput. Sci. Appl. 2017, 2017, 49–55. [Google Scholar] [CrossRef] [Green Version]
- Khan, I.A.; Pi, D.; Khan, Z.U.; Hussain, Y.; Nawaz, A. HML-IDS: A Hybrid-Multilevel Anomaly Prediction Approach for Intrusion Detection in SCADA Systems. IEEE Access 2019, 7, 89507–89521. [Google Scholar] [CrossRef]
- Kalech, M. Cyber-attack detection in SCADA systems using temporal pattern recognition techniques. Comput. Secur. 2019, 84, 225–238. [Google Scholar] [CrossRef]
- Wang, H.; Lu, T.; Dong, X.; Li, P.; Xie, M. Hierarchical Online Intrusion Detection for SCADA Networks. arXiv 2016, arXiv:1611.09418, 2016. [Google Scholar]
- Ullah, I.; Mahmoud, Q.H. A hybrid model for anomaly-based intrusion detection in SCADA networks. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 2160–2167. [Google Scholar]
- Ali, M.H.; Fadlizolkipi, M.; Firdaus, A.; Khidzir, N.Z. A hybrid Particle swarm optimization -Extreme Learning Machine approach for Intrusion Detection System. In Proceedings of the 2018 IEEE Student Conference on Research and Development (SCOReD), Bangi, Selangor, Malaysia, 26–28 November 2018; pp. 1–4. [Google Scholar]
- Shang, W.; Zeng, P.; Wan, M.; Li, L.; An, P. Intrusion detection algorithm based on OCSVM in industrial control system. Secur. Commun. Netw. 2015, 9, 1040–1049. [Google Scholar] [CrossRef] [Green Version]
- Tamy, S.; Belhadaoui, H.; Rabbah, M.A.; Rabbah, N.; Rifi, M. An Evaluation of Machine Learning Algorithms to Detect Attacks in Scada Network. In Proceedings of the 7th Mediterranean Congress of Telecommunications (CMT), Fes, Morocco, 24–25 October 2019; pp. 1–5. [Google Scholar]
- Robles-Durazno, A.; Moradpoor, N.; McWhinnie, J.; Russell, G. Real-time anomaly intrusion detection for a clean water supply system, utilising machine learning with novel energy-based features. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
- Alhaidari, F.A.; Al-Dahasi, E.M. New Approach to Determine DDoS Attack Patterns on SCADA System Using Machine Learning. In Proceedings of the 2019 International Conference on Computer and Information Sciences (ICCIS), Aljouf, Saudi Arabia, 10–11 April 2019; pp. 1–6. [Google Scholar]
- Alimi, A.M.; Ouahada, K.; Abu-Mahfouz, A.M. Real Time Security Assessment of the Power System Using a Hybrid Support Vector Machine and Multilayer Perceptron Neural Network Algorithms. Sustainability 2019, 11, 3586. [Google Scholar] [CrossRef] [Green Version]
- Wang, Y.; Wu, C.; Wan, L.; Liang, Y. A study on SVM with feature selection for fault diagnosis of power systems. In Proceedings of the 2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE), Singapore, 26–28 February 2010; Volume 2, pp. 173–176. [Google Scholar]
- Alam, S.; Sonbhadra, S.K.; Agarwal, S.; Nagabhushan, P. One-class support vector classifiers: A survey. Knowl. Based Syst. 2020, 196, 105754. [Google Scholar] [CrossRef]
- Turkoz, M.; Kim, S.; Son, Y.; Jeong, M.K.; Elsayed, E.A. Generalized support vector data description for anomaly detection. Pattern Recognit. 2020, 100, 107119. [Google Scholar] [CrossRef]
- Schuster, F.; Paul, A.; Rietz, R.; Koenig, H. Potentials of Using One-Class SVM for Detecting Protocol-Specific Anomalies in Industrial Networks. In Proceedings of the 2015 IEEE Symposium Series on Computational Intelligence, Cape Town, South Africa, 7–10 December 2015; pp. 83–90. [Google Scholar]
- Yasakethu, S.L.P.; Jiang, J.; Graziano, A. Intelligent risk detection and analysis tools for critical infrastructure protection. Eurocon 2013, 52–59. [Google Scholar] [CrossRef]
- Jiang, J.; Yasakethu, L. Anomaly Detection via One Class SVM for Protection of SCADA Systems. In Proceedings of the International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, Beijing, China, 10–12 October 2013; Volume 10–12, pp. 82–88. [Google Scholar]
- Maglaras, L.A.; Jiang, J. OCSVM model combined with K-means recursive clustering for intrusion detection in SCADA systems. In Proceedings of the 10th International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness, Rhodes, Greece, 18–20 August 2014; pp. 133–134. [Google Scholar]
- Maglaras, L.; Jiang, J. Intrusion detection in SCADA systems using machine learning techniques. In Proceedings of the Science and Information Conference, London, UK, 27–29 August 2014; pp. 626–631. [Google Scholar]
- Cruz, T.; Rosa, L.; Proenca, J.; Maglaras, L.; Aubigny, M.; Lev, L.; Jiang, J.; Simoes, P. A Cybersecurity Detection Framework for Supervisory Control and Data Acquisition Systems. IEEE Trans. Ind. Inform. 2016, 12, 2236–2246. [Google Scholar] [CrossRef]
- Lee, S.; Lee, S.; Yoo, H.; Kwon, S.; Shon, T. Design and implementation of cybersecurity testbed for industrial IoT systems. J. Supercomput. 2018, 74, 4506–4520. [Google Scholar] [CrossRef]
- Prisco, A.F.S.; Duitama, M.J.F. Intrusion detection system for SCADA platforms through machine learning algorithms. In Proceedings of the 2017 IEEE Colombian Conference on Communications and Computing (COLCOM), Cartagena, Colombia, 16–18 August 2017; pp. 1–6. [Google Scholar]
- Fang, R.; Wang, Y.; Shang, R.; Liang, Y.; Wang, L.; Peng, C. The ultra-short term power prediction of wind farm considering operational condition of wind turbines. Int. J. Hydrogen Energy 2016, 41, 15733–15739. [Google Scholar] [CrossRef]
- Terai, A.; Abe, S.; Kojima, S.; Takano, Y.; Koshijima, I. Cyber-Attack Detection for Industrial Control System Monitoring with Support Vector Machine Based on Communication Profile. In Proceedings of the 2017 IEEE European Symposium on Security and Privacy Workshops (EuroS & PW), Paris, France, 26–28 April 2017; pp. 132–138. [Google Scholar]
- Qu, H.; Qin, J.; Liu, W.; Chen, H. Instruction Detection in SCADA/Modbus Network Based on Machine Learning. In Proceedings of the International Conference on Machine Learning and Intelligent Communications, Weihai, China, 5–6 August 2017; pp. 437–454. [Google Scholar]
- Perez, R.L.; Adamsky, F.; Soua, R.; Engel, T. Machine Learning for Reliable Network Attack Detection in SCADA Systems. In Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, New York, NY, USA; 2018; pp. 633–638. [Google Scholar]
- Da Silva, E.G.; Da Silva, A.S.; Wickboldt, J.; Smith, P.; Granville, L.Z.; Filho, A.E.S. A One-Class NIDS for SDN-Based SCADA Systems. In Proceedings of the 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), Atlanta, GA, USA, 10–14 June 2016; Volume 1, pp. 303–312. [Google Scholar]
- Beauseroy, P.; Honeine, P.; Nader, P. Intrusion Detection in Scada Systems Using One-Class Classification. In Proceedings of the 21st European Signal Processing Conference (EUSIPCO 2013), Marrakech, Morocco, 9–13 September 2013; pp. 9–13. [Google Scholar]
- Nader, P.; Honeine, P.; Beauseroy, P. l_p-norms in One-Class Classification for Intrusion Detection in SCADA Systems. IEEE Trans. Ind. Inform. 2014, 10, 2308–2317. [Google Scholar] [CrossRef]
- Boonprong, S.; Cao, C.; Chen, W.; Ni, X.; Xu, M.; Acharya, B.K. The Classification of Noise-Afflicted Remotely Sensed Data Using Three Machine-Learning Techniques: Effect of Different Levels and Types of Noise on Accuracy. ISPRS Int. J. Geo-Inf. 2018, 7, 274. [Google Scholar] [CrossRef] [Green Version]
- Neha, N.; Raman, M.R.G.; Somu, N.; Senthilnathan, R.; Sriram, V.S. An Improved Feedforward Neural Network Using Salp Swarm Optimization Technique for the Design of Intrusion Detection System for Computer Network. In Advances in Intelligent Systems and Computing; Springer: Berlin/Heidelberg, Germany, 2020; pp. 867–875. [Google Scholar]
- Demertzis, K.; Iliadis, L.; Spartalis, S. A Spiking One-Class Anomaly Detection Framework for Cyber-Security on Industrial Control Systems. In Proceedings of the International Conference on Engineering Applications of Neural Networks, Athens, Greece, 25–27 August 2017; pp. 122–134. [Google Scholar]
- Li, H.; Yang, J.; Zhang, M.; Guo, S.; Lv, W.; Liu, Z.; Hui, L. A method based on artificial neural network to estimate the health of wind turbine. In Proceedings of the 27th Chinese Control and Decision Conference (2015 CCDC), Qingdao, China, 23–25 May 2015; pp. 919–922. [Google Scholar]
- Zhang, Z. Automatic Fault Prediction of Wind Turbine Main Bearing Based on SCADA Data and Artificial Neural Network. Open J. Appl. Sci. 2018, 8, 211–225. [Google Scholar] [CrossRef] [Green Version]
- Kosek, A.M.; Gehrke, O. Ensemble regression model-based anomaly detection for cyber-physical intrusion detection in smart grids. In Proceedings of the 2016 IEEE Electrical Power and Energy Conference (EPEC), Ottawa, ON, Canada, 12–14 October 2016; pp. 1–7. [Google Scholar]
- Yan, X.; Jin, Y.; Xu, Y.; Li, R. Wind Turbine Generator Fault Detection Based on Multi-Layer Neural Network and Random Forest Algorithm. In Proceedings of the IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Chengdu, China, 21–24 May 2019; pp. 4132–4136. [Google Scholar]
- Rakhra, M.; Soniya, P.; Tanwar, D.; Singh, P.; Bordoloi, D.; Agarwal, P.; Takkar, S.; Jairath, K.; Verma, N. Crop Price Prediction Using Random Forest and Decision Tree Regression: A review. Mater. Today Proc. 2021, in press. [Google Scholar]
- McNabb, P.; Wilson, D.; Bialek, J. Classification of mode damping and amplitude in power systems using synchrophasor measurements and classification trees. IEEE Trans. Power Syst. 2013, 28, 1988–1996. [Google Scholar] [CrossRef]
- Upadhyay, D.; Manero, J.; Zaman, M.; Sampalli, S. Gradient Boosting Feature Selection with Machine Learning Classifiers for Intrusion Detection on Power Grids. IEEE Trans. Netw. Serv. Manag. 2021, 18, 1104–1116. [Google Scholar] [CrossRef]
- El Mrabet, Z.; Selvaraj, D.F.; Ranganathan, P. Adaptive Hoeffding Tree with Transfer Learning for Streaming Synchrophasor Data Sets. In Proceedings of the 2019 IEEE International Conference on Big Data, Los Angeles, CA, USA, 9–12 December 2019; pp. 5697–5704. [Google Scholar]
- Al-Asiri, M.; El-Alfy, E.-S.M. On Using Physical Based Intrusion Detection in SCADA Systems. Procedia Comput. Sci. 2020, 170, 34–42. [Google Scholar] [CrossRef]
- A Siddavatam, I.; Satish, S.; Mahesh, W.; Kazi, F. An ensemble learning for anomaly identification in SCADA system. In Proceedings of the 7th International Conference on Power Systems (ICPS), Pune, India, 21–23 December 2017; pp. 457–462. [Google Scholar]
- Swetha, R.B.S.; Meena, K.G. Smart grid-A network-based intrusion detection system. Int. J. Comput. Appl. 2015, 975, 8887. [Google Scholar]
- Choubineh, A.; Wood, D.A.; Choubineh, Z. Applying separately cost-sensitive learning and Fisher’s discriminant analysis to address the class imbalance problem: A case study involving a virtual gas pipeline SCADA system. Int. J. Crit. Infrastruct. Prot. 2020, 29, 100357. [Google Scholar] [CrossRef]
- Beaver, J.M.; Hink, R.B.; Buckner, M. An Evaluation of Machine Learning Methods to Detect Malicious SCADA Communications. In Proceedings of the 2013 12th International Conference on Machine Learning and Applications, Miami, FL, USA, 4–7 December 2013; Volume 2, pp. 54–59. [Google Scholar]
- Borujeni, S.E.; Nannapaneni, S.; Nguyen, N.H.; Behrman, E.C.; Steck, J.E. Quantum circuit representation of Bayesian networks. Expert Syst. Appl. 2021, 176, 114768. [Google Scholar] [CrossRef]
- Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian Network Classifiers. Mach. Learn. 1997, 29, 131–163. [Google Scholar] [CrossRef] [Green Version]
- Huang, K.; Zhou, C.; Tian, Y.-C.; Tu, W.; Peng, Y. Application of Bayesian network to data-driven cyber-security risk assessment in SCADA networks. In Proceedings of the 2017 27th International Telecommunication Networks and Applications Conference (ITNAC), Melbourne, Australia, 22–24 November 2017; pp. 1–6. [Google Scholar]
- Shin, J.; Son, H.; Heo, G. Cyber Security Risk Evaluation of a Nuclear I&C Using BN and ET. Nucl. Eng. Technol. 2017, 49, 517–524. [Google Scholar] [CrossRef]
- Zhang, Y.; Xiang, Y.; Wang, L. Reliability analysis of power grids with cyber vulnerability in SCADA system. In Proceedings of the 2014 IEEE PES General Meeting Conference & Exposition, National Harbor, MD, USA, 27–31 July 2014; pp. 1–5. [Google Scholar]
- Zhang, Y.; Wang, L.; Xiang, Y.; Ten, C.-W. Power System Reliability Evaluation with SCADA Cybersecurity Considerations. IEEE Trans. Smart Grid 2015, 6, 1707–1721. [Google Scholar] [CrossRef]
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Alimi, K.O.A. Empirical Comparison of Machine Learning Algorithms for Mitigating Power Systems Intrusion Attacks. In Proceedings of the 2020 International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, 20–22 October 2020; pp. 1–5. [Google Scholar]
- Mokhtari, S.; Abbaspour, A.; Yen, K.; Sargolzaei, A. A Machine Learning Approach for Anomaly Detection in Industrial Control Systems Based on Measurement Data. Electronics 2021, 10, 407. [Google Scholar] [CrossRef]
- Arora, P.; Kaur, B.; Teixeira, M.A. Evaluation of Machine Learning Algorithms Used on Attacks Detection in Industrial Control Systems. J. Inst. Eng. Ser. B 2021, 102, 605–616. [Google Scholar] [CrossRef]
- Gumaei, A.; Hassan, M.M.; Huda, S.; Hassan, R.; Camacho, D.; Del Ser, J.; Fortino, G. A robust cyberattack detection approach using optimal features of SCADA power systems in smart grids. Appl. Soft Comput. 2020, 96, 106658. [Google Scholar] [CrossRef]
Dataset | Published Year | Developer | Brief Description and Comparison: Features and Scenarios | Attacks Type |
---|---|---|---|---|
DARPA (KDD98) | 1998 | MIT Lincoln Laboratory | Recognized as the earliest open source dataset for intrusion detection studies. It is made up of network traffic and audit log files, which were collected from several internet-linked computers. The training and testing dataset contains seven and two weeks, respectively of network-based attacks in the midst of normal background data. | U2R, R2L, Probing, and DoS attacks |
KDD99 | 1999 | University of California | It is an upgraded version of KDD98 dataset. It is made up of approximately 4,900,000 vectors with 41 features, which are categorized into basic, traffic and content features. The dataset generation involves three weeks and two weeks of training and testing respectively. The secondnd week of training data contains several attacks. The testing dataset involves network-based attacks in the midst of normal background data. It has 201 instances of about 56 types of attacks distributed across the testing weeks. The dataset is heavily criticized for having too many duplicate feature instances. | U2R, R2L, Probing, and DoS attacks |
NSL-KDD | 2009 | University of California | The dataset is developed to solve the issues of huge duplicate and redundant packets that is attributed to the KDD99 dataset. As a result of removing duplicate and redundant packets, the dataset contains approximately 150,000 records. The dataset has similar properties and classes as KDD99 dataset i.e it also has 41 features. The training and testing dataset includes 24 and 38 attack types, respectively. | U2R, R2L, Probing, and DoS attacks |
UNSW-NB15 | 2015 | Cyber Range Lab of UNSW Canberra, Australia | Unlike previously developed open source datasets, the UNSW-NB15 dataset present a depiction of modern-day network traffic and attack scenarios. The dataset packets were created using tools such as IXIA Perfect-Storm, etc. The dataset contains a variety of normal and attacked activities with class labels of 2,540,044 records with 49 features. The dataset is heavily criticized for having too many duplicates in the training set. | Fuzzers, Analysis, Backdoors, DoS, worm, Exploits, Generic, Reconnaissance, Shellcode. |
KYOTO | 2006–2009 | Kyoto University | The dataset is created using tools, such as honeypots, darknet sensors, e-mail server and web crawler. The dataset has 24 statistical features, whereby 14 features were extracted based on the KDD99 dataset and 10 additional features. The additional 10 features allows effective investigation on the network status. | |
CSE-CIC-IDS 2017 | 2017 | Communications Security Establishments (CSE) & the Canadian Institute for Cybersecurity (CIC). | The dataset is an improvement on the earlier developed ISCX2012 dataset. The dataset has the attributes of practical real-world dataset and it is labelled based on the timestamp, source and destination IPs, source and destination ports, protocols and attacks. The dataset has 80 network flow features with 2,830,743 instances, with attack traffic making up approximately 20% of the total number. CICFlowMeter tool is used to extract the 80 features. | Benign behavior, Brute Force FTP, Brute Force SSH, DoS, Heartbleed, Web Attack, Infiltration, Botnet |
CSE-CIC-IDS 2018 | 2018 | CSE & CIC | The dataset has the same features as the 2017 dataset variant. However, the dataset was modelled using larger network of simulated client targets and attack devices. The attack devices include 50 machines while the victim devices have 420 machines with 30 servers. The dataset involves logs from individual machines, along with 80 features extractions from captured traffic done by CICFlowMeter-V3. The dataset contains 16,233,002 instances with about 17% of the instances being attack traffic. | Brute-force, Heartbleed, Botnet, DoS, DDoS, Web attacks, and infiltration of the network from inside. |
SWAT | 2016 | iTrust Centre for Research in Cyber Security, SUTD | The testbed for the dataset generation is a water distribution system that imitates an actual water treatment facility. For the dataset generation, operation was run for 11 consecutive days, 7 days out of the 11 days were run as normal operation while the remaining 4 days were run under attack scenarios involving 41 attacks. | 41 different attacks were simulated during 4 days of attack events. |
Morris | Power System-2014, Gas Pipeline-2013, Gas Pipeline & Water-2014 New Gas Pipeline-2015 EMS-2017 | Oak Ridge National Laboratories, MSU. | Five datasets were developed: Power system datasets, Gas pipeline datasets, Energy Management System (EMS) dataset, New gas pipeline datasets and Gas pipeline and Water storage tank datasets. The three Power system datasets comprises electric transmission system normal, disturbance, control, cyber-attack behaviors data. The EMS dataset consist of a voluminous anonymized log file that are recorded over 30 days interval. The Gas pipeline, Gas pipeline and water storage tank and New gas pipeline datasets is made up of packets captured from control devices and the HMI in a gas pipeline testbed. The Gas Pipeline and Water Storage Tank datasets has additional packet data captured from a water storage tank. | Some of the attacks include data injection, relay setting change and remote tripping command injection. |
Refs. | Dataset/ Testbed | Protocol | CI Domain | Feature Engineering/Optimization Technique | Algorithm(S) Used | Method Strength /Drawback |
---|---|---|---|---|---|---|
[31] | (1) CICIDS2017 (2) IEC dataset | (1) Modbus (2) IEC 60870-5-104 | Electric power grid | Synthetic Minority Oversampling Technique | Feed forward neural network | Good result but high false positives from the CICIDS2017 evaluation |
[44] | SCADA network simulation using Ns-2 simulator | Modbus | Weighted Particle based Cuckoo Search Optimization | ANN | Accuracy of 95%. Low precision rate when tested on ADFA-LD dataset. | |
[35] | SCADA testbed simulation | TCP/FIN | Electric power delivery infrastructure | Mapping symbolic-valued attributes to numeric valued attributes and scaling. | OCSVM, | Accuracy of 96.3%. high false alarm and model was not tested on big testbed. |
[100] | SDN-based SCADA system simulation | Modbus /TCP | Power systems | OCSVM and SVDD | Approximate accuracy of 98%. | |
[75] | MSU gas pipeline dataset | Modbus | Gas pipeline | Various techniques Including Bloom filter, PCA, CCA, ICA, and AllKNN. | KNN | Accuracy of 97%. Low detection rate |
[99] | MSU gas pipeline dataset | Modbus | Gas pipeline | Gaussian Mixture Model, K-means cluster, Zero imputation and indicators | SVM, RF and Bidirectional Long Short Term Memory | Best results achieved from RF and BLSTM |
[95] | MSU gas pipeline dataset | Modbus | Gas pipeline | SVDD and KPCA. | Good result achieved. | |
[57] | SWAT dataset | Water treatment facility | PSO for optimizing the parameters for ANN | Back-propagation neural network | Precision and F1 score of 98.7% & 90.4% respectively. | |
[13] | SWAT dataset | Water treatment facility | Normalization of all the attributes in the interval between 0 and 1 | SVM, ANN, RF, J48, DT, BN, etc. | DT presented the best accuracy with 99.72%, followed closely by RF, SVM, ANN with 99.65%, 98.71% and 98.24% respectively. SVM presented the longest computational time among the models. | |
[59] | SWAT dataset | Water treatment facility | GA for optimization | ANN | Precision % F1 score of 96.7% and 81.2%. | |
[83] | KDD99 dataset. | J48, Naive Bayes, RF | DT presented the best result with 99.99% accuracy. | |||
[118] | MSU gas pipeline dataset | Modbus | Gas pipeline | RF, SVM, Naive Bayes, OneR, J48, NNge | RF and NNge presented the best results. | |
[114] | MSU gas pipeline dataset | Modbus | Gas pipeline | DT | Specific-type accuracy of 98.6%. | |
[81] | MSU gas pipeline dataset | Modbus | Gas pipeline | Naïve Bayes, SVM, J48 adn RF | RF presented the best accuracy of 99.3% and it took the longest time to build compared to the toher models. It is followed closely by SVM. | |
[115] | Modelled testbed | Modbus /TCP | Power system | DT and RF | RF presented the better results. | |
[121] | Modelled chemical reaction process | UDP | Chemical reactor | - | BN | Good performance. |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S.; Alimi, K.O.A. A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification. Sustainability 2021, 13, 9597. https://doi.org/10.3390/su13179597
Alimi OA, Ouahada K, Abu-Mahfouz AM, Rimer S, Alimi KOA. A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification. Sustainability. 2021; 13(17):9597. https://doi.org/10.3390/su13179597
Chicago/Turabian StyleAlimi, Oyeniyi Akeem, Khmaies Ouahada, Adnan M. Abu-Mahfouz, Suvendi Rimer, and Kuburat Oyeranti Adefemi Alimi. 2021. "A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification" Sustainability 13, no. 17: 9597. https://doi.org/10.3390/su13179597