DOI: 10.1145/3098954.3098992
research-article

SONAR: Automatic Detection of Cyber Security Events over the Twitter Stream

Published: 29 August 2017

Abstract

Every day, security experts face a growing number of security events that affect people's well-being, their information systems, and sometimes critical infrastructure. The sooner experts can detect and understand these threats, the better they can mitigate and forensically investigate them. They therefore need situational awareness of existing security events and their possible effects. However, given the large number of events, it can be difficult for security analysts and researchers to handle this flow of information adequately and to answer the following questions in near-real time: what are the current security events? How long do they last? In this paper, we try to answer these questions by leveraging social networks, which contain a massive amount of valuable information on many topics. Because of the very high volume, however, extracting meaningful information can be challenging. For this reason, we propose SONAR: an automatic, self-learned framework that can detect, geolocate and categorize cyber security events in near-real time over the Twitter stream. SONAR is based on a taxonomy of cyber security events and a set of seed keywords describing the types of events that we want to follow in order to start detecting events. Using these seed keywords, it automatically discovers new relevant keywords, such as malware names, to broaden the range of detection while staying in the same domain. Using a custom taxonomy describing all types of cyber threats, we demonstrate the capabilities of SONAR on a dataset of approximately 47.8 million tweets related to cyber security from the last 9 months. SONAR could efficiently and effectively detect, categorize and monitor cyber security related events before they appeared in the security news, and it could automatically discover new security terminology along with the associated events. Additionally, SONAR is highly scalable and customizable by design; we could therefore adapt the SONAR framework to virtually any type of event that experts are interested in.
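
The abstract does not say how new keywords are discovered from the seeds, although the author tags mention word embedding. As a rough illustration only, and not SONAR's actual method, the sketch below expands a seed set by adding every vocabulary word whose vector is close to some seed under cosine similarity. The function names, the toy three-dimensional vectors, and the 0.9 threshold are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_keywords(seeds, embeddings, threshold=0.9):
    """Return the seeds plus every word whose vector is within
    `threshold` cosine similarity of at least one seed.

    Hypothetical helper: the threshold and the single-pass strategy
    are illustrative choices, not taken from the paper.
    """
    known_seeds = [s for s in seeds if s in embeddings]
    expanded = set(seeds)
    for word, vec in embeddings.items():
        if word in expanded:
            continue
        if any(cosine(vec, embeddings[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded


# Toy vectors invented for the example; real embeddings would be
# learned from a tweet corpus and have far more dimensions.
TOY_EMBEDDINGS = {
    "ransomware": [0.9, 0.1, 0.0],
    "wannacry": [0.85, 0.15, 0.05],
    "ddos": [0.1, 0.9, 0.0],
    "mirai": [0.15, 0.85, 0.1],
    "football": [0.0, 0.1, 0.95],
}

if __name__ == "__main__":
    print(sorted(expand_keywords(["ransomware", "ddos"], TOY_EMBEDDINGS)))
```

With these toy vectors, the malware names sit close to their seed categories and are added, while the unrelated word stays out, which is the "staying in the same domain" behaviour the abstract describes.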

Published In

ARES '17: Proceedings of the 12th International Conference on Availability, Reliability and Security
August 2017
853 pages
ISBN:9781450352574
DOI:10.1145/3098954
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cyber security events detection
  2. Twitter
  3. framework
  4. security awareness
  5. social media
  6. word embedding

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES '17: International Conference on Availability, Reliability and Security
August 29 - September 1, 2017
Reggio Calabria, Italy

Acceptance Rates

ARES '17 Paper Acceptance Rate 100 of 191 submissions, 52%;
Overall Acceptance Rate 228 of 451 submissions, 51%

Article Metrics

  • Downloads (last 12 months): 51
  • Downloads (last 6 weeks): 2
Reflects downloads up to 03 Oct 2024

Cited By

  • (2024) ‘We Do Not Have the Capacity to Monitor All Media’: A Design Case Study on Cyber Situational Awareness in Computer Emergency Response Teams. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-16. DOI: 10.1145/3613904.3642368. Online publication date: 11-May-2024
  • (2024) OSTIS: A novel Organization-Specific Threat Intelligence System. Computers & Security 145, 103990. DOI: 10.1016/j.cose.2024.103990. Online publication date: Oct-2024
  • (2023) Açık Kaynaklardan Test Otomasyon Araçlarıyla Siber Tehdit İstihbaratı Çıkarılması (Extracting Cyber Threat Intelligence with Test Automation Tools from Open Sources). Fırat Üniversitesi Mühendislik Bilimleri Dergisi 35:1, 283-290. DOI: 10.35234/fumbd.1217219. Online publication date: 28-Mar-2023
  • (2023) Detection of Cyber Security Threats through Social Media Platforms. 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 820-823. DOI: 10.1109/IPDPSW59300.2023.00137. Online publication date: May-2023
  • (2023) Early Malware Characterization based on Online Social Networks. 2023 6th International Conference on Advanced Communication Technologies and Networking (CommNet), 1-8. DOI: 10.1109/CommNet60167.2023.10365252. Online publication date: 11-Dec-2023
  • (2023) Enhancing Cyber Threat Identification in Open-Source Intelligence Feeds Through an Improved Semi-Supervised Generative Adversarial Learning Approach With Contrastive Learning. IEEE Access 11, 84440-84452. DOI: 10.1109/ACCESS.2023.3299604. Online publication date: 2023
  • (2023) Automated Emerging Cyber Threat Identification and Profiling Based on Natural Language Processing. IEEE Access 11, 58915-58936. DOI: 10.1109/ACCESS.2023.3260020. Online publication date: 2023
  • (2023) Multi-level fine-tuning, data augmentation, and few-shot learning for specialized cyber threat intelligence. Computers and Security 134:C. DOI: 10.1016/j.cose.2023.103430. Online publication date: 1-Nov-2023
  • (2023) ATDG: An Automatic Cyber Threat Intelligence Extraction Model of DPCNN and BIGRU Combined with Attention Mechanism. Web Information Systems Engineering – WISE 2023, 189-204. DOI: 10.1007/978-981-99-7254-8_15. Online publication date: 21-Oct-2023
  • (2023) Detection and Classification of Cyber Threats in Tweets Toward Prevention. Intelligent Systems and Human Machine Collaboration, 83-98. DOI: 10.1007/978-981-19-8477-8_8. Online publication date: 30-Mar-2023
