survey

What Are the Attackers Doing Now? Automating Cyberthreat Intelligence Extraction from Text on Pace with the Changing Threat Landscape: A Survey

Published: 02 March 2023

Abstract

Cybersecurity researchers have contributed to the automated extraction of cyberthreat intelligence (CTI) from textual sources, such as threat reports and online articles describing cyberattack strategies, procedures, and tools. The goal of this article is to aid cybersecurity researchers in understanding the current techniques used for CTI extraction from text through a survey of relevant studies in the literature. Our work finds 11 types of extraction purposes and 7 types of textual sources for CTI extraction. We observe technical challenges associated with obtaining clean, labeled data that is available for replication, validation, and further extension of the studies. We advocate for building upon the current CTI extraction work to help cybersecurity practitioners use knowledge from past cybersecurity incidents in proactive decision-making, such as threat prioritization and mitigation strategy formulation.

Supplementary Material

3571726-supp (3571726-supp.pdf)
Supplementary material




    Published In

    ACM Computing Surveys  Volume 55, Issue 12
    December 2023
    825 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/3582891

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 March 2023
    Online AM: 22 November 2022
    Accepted: 17 October 2022
    Revised: 12 July 2022
    Received: 04 June 2021
    Published in CSUR Volume 55, Issue 12

    Author Tags

    1. Cyberthreat intelligence
    2. CTI extraction
    3. CTI mining
    4. IoC extraction
    5. TTPs extraction
    6. attack pattern extraction
    7. threat reports
    8. tactical threat intelligence
    9. technical threat intelligence

    Qualifiers

    • Survey

    Funding Sources

    • NSA Science of Security award

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 941
    • Downloads (Last 6 weeks): 92
    Reflects downloads up to 14 Oct 2024

    Cited By

    • (2024) Generative AI for Threat Intelligence and Information Sharing. Utilizing Generative AI for Cyber Defense Strategies, 191-234. DOI: 10.4018/979-8-3693-8944-7.ch006. Online publication date: 13-Sep-2024
    • (2024) You Might Have Known It Earlier: Analyzing the Role of Underground Forums in Threat Intelligence. Proceedings of the 27th International Symposium on Research in Attacks, Intrusions and Defenses, 368-383. DOI: 10.1145/3678890.3678930. Online publication date: 30-Sep-2024
    • (2024) Applied Machine Learning for Information Security. Digital Threats: Research and Practice 5:1, 1-5. DOI: 10.1145/3652029. Online publication date: 5-Apr-2024
    • (2024) Sharing Is Caring: Hurdles and Prospects of Open, Crowd-Sourced Cyber Threat Intelligence. IEEE Transactions on Engineering Management 71, 6854-6873. DOI: 10.1109/TEM.2023.3279274. Online publication date: 2024
    • (2024) Generic Quantum Blockchain-Envisioned Security Framework for IoT Environment: Architecture, Security Benefits and Future Research. IEEE Open Journal of the Computer Society 5, 248-267. DOI: 10.1109/OJCS.2024.3397307. Online publication date: 2024
    • (2024) Evolving techniques in cyber threat hunting: A systematic review. Journal of Network and Computer Applications 232, 104004. DOI: 10.1016/j.jnca.2024.104004. Online publication date: Dec-2024
    • (2024) OSTIS: A novel Organization-Specific Threat Intelligence System. Computers & Security 145, 103990. DOI: 10.1016/j.cose.2024.103990. Online publication date: Oct-2024
    • (2024) Relation Extraction Techniques in Cyber Threat Intelligence. Natural Language Processing and Information Systems, 348-363. DOI: 10.1007/978-3-031-70239-6_24. Online publication date: 25-Jun-2024
    • (2024) An Analysis of Topic Modeling Approaches for Unlabeled Dark Web Data Classification. Innovations and Advances in Cognitive Systems, 150-162. DOI: 10.1007/978-3-031-69201-7_12. Online publication date: 25-Sep-2024
    • (2023) HackMentor: Fine-Tuning Large Language Models for Cybersecurity. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 452-461. DOI: 10.1109/TrustCom60117.2023.00076. Online publication date: 1-Nov-2023
