Abstract
Real-world cyber security datasets are essential for developing and evaluating new techniques to counter cyber attacks. Ideally, these datasets should represent modern network infrastructures and up-to-date cyber attacks. However, the datasets commonly used by researchers are synthetic, unscalable, or quickly outdated, because network infrastructure is dynamic and cyber attacks keep evolving. In this paper, we introduce a security dataset generator (SDGen), a scalable, reproducible and flexible approach to generating real-world datasets for detecting and responding to cyber attacks. We implement SDGen in a virtual environment using DetectionLab, the ELK (Elasticsearch, Logstash, Kibana) stack with Beats, and AttackIQ (a security control validation platform). This implementation serves as a proof of concept (POC) of SDGen, demonstrating dataset generation for an organisation compromised by several types of ransomware. We show that SDGen provides scalability, reproducibility and flexibility in generating cyber security datasets by modifying the configurations in DetectionLab and the Vagrantfiles, and by launching different types of attacks in AttackIQ.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Koay, A.M.Y., Xie, M., Ko, R.K.L., Sterner, C., Choi, T., Dong, N. (2022). SDGen: A Scalable, Reproducible and Flexible Approach to Generate Real World Cyber Security Datasets. In: Wang, G., Choo, KK.R., Ko, R.K.L., Xu, Y., Crispo, B. (eds) Ubiquitous Security. UbiSec 2021. Communications in Computer and Information Science, vol 1557. Springer, Singapore. https://doi.org/10.1007/978-981-19-0468-4_8
DOI: https://doi.org/10.1007/978-981-19-0468-4_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0467-7
Online ISBN: 978-981-19-0468-4
eBook Packages: Computer Science (R0)