research-article

Automating Privilege Escalation with Deep Reinforcement Learning

Authors:

Kalle Kujanpää,

Alexander Ilin

AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security

Pages 157 - 168

https://doi.org/10.1145/3474369.3486877

Published: 15 November 2021

Abstract

AI-based defensive solutions are necessary to defend networks and information assets against intelligent automated attacks. Gathering enough realistic data for training machine learning-based defenses is a significant practical challenge. An intelligent red teaming agent capable of performing realistic attacks can alleviate this problem. However, there is little scientific evidence demonstrating the feasibility of fully automated attacks using machine learning. In this work, we exemplify the potential threat of malicious actors using deep reinforcement learning to train automated agents. We present an agent that uses a state-of-the-art reinforcement learning algorithm to perform local privilege escalation. Our results show that the autonomous agent can escalate privileges in a Windows 7 environment using a wide variety of different techniques depending on the environment configuration it encounters. Hence, our agent is usable for generating realistic attack sensor data for training and evaluating intrusion detection systems.

Information & Contributors

Information

Published In

AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security

November 2021

210 pages

ISBN:9781450386579

DOI:10.1145/3474369

Program Chairs:
Nicholas Carlini (Google Brain),
Ambra Demontis (University of Cagliari),
Yizheng Chen (University of California, Berkeley)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Business Finland / S2ERC

Conference

CCS '21

Sponsor:

SIGSAC

CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security

November 15, 2021

Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%

Article Metrics

Total Citations: 8
Total Downloads: 505
Downloads (Last 12 months): 113
Downloads (Last 6 weeks): 16

Reflects downloads up to 09 Nov 2024

Cited By

Liu H, Liu C, Wu X, Qu Y, Liu H. (2024). An Automated Penetration Testing Framework Based on Hierarchical Reinforcement Learning. Electronics, 13:21 (4311). Online publication date: 2-Nov-2024.
https://doi.org/10.3390/electronics13214311
Simon R, Mees W. (2024). SoK: A Comparison of Autonomous Penetration Testing Agents. Proceedings of the 19th International Conference on Availability, Reliability and Security (1-10). Online publication date: 30-Jul-2024.
https://dl.acm.org/doi/10.1145/3664476.3664484
Xu D, Chen K, Lin M, Lin C, Wang X. (2024). AutoPwn: Artifact-Assisted Heap Exploit Generation for CTF PWN Competitions. IEEE Transactions on Information Forensics and Security, 19 (293-306). Online publication date: 1-Jan-2024.
https://dl.acm.org/doi/10.1109/TIFS.2023.3322319
Ozkan-Okay M, Akin E, Aslan Ö, Kosunalp S, Iliev T, Stoyanov I, Beloev I. (2024). A Comprehensive Survey: Evaluating the Efficiency of Artificial Intelligence and Machine Learning Techniques on Cyber Security Solutions. IEEE Access, 12 (12229-12256). Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3355547
Pham V, Hoang H, Trung P, Quoc V, To T, Duy P. (2024). Raiju: Reinforcement learning-guided post-exploitation for automating security assessment of network systems. Computer Networks, 253 (110706). Online publication date: Nov-2024.
https://doi.org/10.1016/j.comnet.2024.110706
Venturi A, Andreolini M, Marchetti M, Colajanni M. (2024). Assessing generalizability of Deep Reinforcement Learning algorithms for Automated Vulnerability Assessment and Penetration Testing. Array, 24 (100365). Online publication date: Dec-2024.
https://doi.org/10.1016/j.array.2024.100365
Samrouth K, Nassar M, Harb H. (2023). Revisiting Attack Trees for Modeling Machine Pwning in Training Environments. 2023 3rd Intelligent Cybersecurity Conference (ICSC) (46-53). Online publication date: 23-Oct-2023.
https://doi.org/10.1109/ICSC60084.2023.10349984
Colvett C, Petty M, Bland J. (2023). Impact of computer users on cyber defense strategies. Systems Engineering, 27:3 (532-555). Online publication date: 28-Nov-2023.
https://doi.org/10.1002/sys.21737
