DOI: 10.1145/3359789.3359822

Improving intrusion detectors by crook-sourcing

Published: 09 December 2019

    Abstract

    Conventional cyber defenses typically respond to detected attacks by rejecting them as quickly and decisively as possible; but aborted attacks are missed learning opportunities for intrusion detection. A method of reimagining cyber attacks as free sources of live training data for machine learning-based intrusion detection systems (IDSes) is proposed and evaluated. Rather than aborting attacks against legitimate services, adversarial interactions are selectively prolonged to maximize the defender's harvest of useful threat intelligence. Enhancing web services with deceptive attack-responses in this way is shown to be a powerful and practical strategy for improved detection, addressing several perennial challenges for machine learning-based IDS in the literature, including scarcity of training data, the high labeling burden for (semi-)supervised learning, encryption opacity, and concept differences between honeypot attacks and those against genuine services. By reconceptualizing software security patches as feature extraction engines, the approach conscripts attackers as free penetration testers, and coordinates multiple levels of the software stack to achieve fast, automatic, and accurate labeling of live web data streams.
    Prototype implementations are showcased for two feature set models to extract security-relevant network- and system-level features from servers hosting enterprise-grade web applications. The evaluation demonstrates that the extracted data can be fed back into a network-level IDS for exceptionally accurate, yet lightweight attack detection.
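
The labeling idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a patched vulnerability check marks a session as an attack instead of rejecting it, that a marked session is then served by a decoy, and that every request in a marked session inherits the attack label. The names `Session`, `looks_like_traversal`, `handle_request`, and `labeled_stream` are hypothetical and are not taken from the paper's prototype; the path-traversal test is only a toy stand-in for a real patch check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Session:
    """One client session; every name here is hypothetical."""
    session_id: str
    requests: List[str] = field(default_factory=list)
    label: str = "benign"  # assumed default until an exploit attempt is seen


def looks_like_traversal(request: str) -> bool:
    """Toy stand-in for the check a security patch would add."""
    return "../" in request


def handle_request(session: Session, request: str,
                   patch_check: Callable[[str], bool]) -> str:
    """Record the request and say which backend should answer it."""
    session.requests.append(request)
    if patch_check(request):
        # A conventional patch would reject here; this sketch instead
        # marks the session and keeps the interaction going.
        session.label = "attack"
    return "decoy" if session.label == "attack" else "real"


def labeled_stream(sessions: List[Session]) -> List[Tuple[str, str]]:
    """Flatten sessions into (request, label) pairs for a classifier."""
    return [(req, s.label) for s in sessions for req in s.requests]


attacker = Session("s1")
visitor = Session("s2")
handle_request(attacker, "GET /index.html", looks_like_traversal)
handle_request(attacker, "GET /../../etc/passwd", looks_like_traversal)
handle_request(visitor, "GET /products", looks_like_traversal)

for request, label in labeled_stream([attacker, visitor]):
    print(f"{label:7s} {request}")
```

Labeling whole sessions rather than single requests is a design choice of this sketch: it lets traffic that preceded the exploit attempt carry the same label without manual annotation.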

    Published In

    ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference
    December 2019
    821 pages
    ISBN:9781450376280
    DOI:10.1145/3359789
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. datasets
    2. honeypots
    3. intrusion detection
    4. neural networks

    Qualifiers

    • Research-article

    Conference

    ACSAC '19
    ACSAC '19: 2019 Annual Computer Security Applications Conference
    December 9 - 13, 2019
    San Juan, Puerto Rico, USA

    Acceptance Rates

    ACSAC '19 Paper Acceptance Rate 60 of 266 submissions, 23%;
    Overall Acceptance Rate 104 of 497 submissions, 21%

    Cited By

    • (2023) Advanced Persistent Threat Detection Using Data Provenance and Metric Learning. IEEE Transactions on Dependable and Secure Computing 20:5 (3957-3969). DOI: 10.1109/TDSC.2022.3221789. Online publication date: 1-Sep-2023
    • (2023) SoK: Pragmatic Assessment of Machine Learning for Network Intrusion Detection. 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P) (592-614). DOI: 10.1109/EuroSP57164.2023.00042. Online publication date: Jul-2023
    • (2023) Deep learning techniques to detect cybersecurity attacks: a systematic mapping study. Empirical Software Engineering 28:3. DOI: 10.1007/s10664-023-10302-1. Online publication date: 9-May-2023
    • (2023) SDN-Based Cyber Deception Deployment for Proactive Defense Strategy Using Honey of Things and Cyber Threat Intelligence. Intelligence of Things: Technologies and Applications (269-278). DOI: 10.1007/978-3-031-46749-3_26. Online publication date: 20-Oct-2023
    • (2023) RAKSHAM: Responsive approach to Knock-off scavenging hackers and attack mitigation. Transactions on Emerging Telecommunications Technologies 35:1. DOI: 10.1002/ett.4904. Online publication date: 2-Dec-2023
    • (2021) Crook-sourced intrusion detection as a service. Journal of Information Security and Applications 61:C. DOI: 10.1016/j.jisa.2021.102880. Online publication date: 1-Sep-2021
    • (2021) Autoencoder-based deep metric learning for network intrusion detection. Information Sciences 569 (706-727). DOI: 10.1016/j.ins.2021.05.016. Online publication date: Aug-2021
    • (2020) AI-Powered Honeypots for Enhanced IoT Botnet Detection. 2020 3rd World Symposium on Communication Engineering (WSCE) (64-68). DOI: 10.1109/WSCE51339.2020.9275581. Online publication date: 9-Oct-2020
    • (2020) Evolving Advanced Persistent Threat Detection using Provenance Graph and Metric Learning. 2020 IEEE Conference on Communications and Network Security (CNS) (1-9). DOI: 10.1109/CNS48642.2020.9162264. Online publication date: Jun-2020
