research-article

Open access

Offensive AI: Enhancing Directory Brute-forcing Attack with the Use of Language Models

Authors:

Alberto Castagnaro,

Luca Pajola

AISec '24: Proceedings of the 2024 Workshop on Artificial Intelligence and Security

Pages 184 - 195

https://doi.org/10.1145/3689932.3694770

Published: 22 November 2024

Abstract

Web Vulnerability Assessment and Penetration Testing (Web VAPT) is a comprehensive cybersecurity process that uncovers a range of vulnerabilities which, if exploited, could compromise the integrity of web applications. A VAPT commonly includes a directory brute-forcing attack, which aims to identify the accessible directories of a target website. Current commercial solutions are inefficient because they rely on wordlist-based brute-forcing strategies, which require an enormous number of trials for few successes.
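To make the inefficiency of wordlist-based enumeration concrete, the following is a minimal sketch of the baseline strategy described above, not the tooling evaluated in the paper. The site is simulated as a set of paths, so no network requests are made; the names `brute_force`, `site`, and `wordlist`, and all values, are hypothetical and chosen only for illustration.

```python
def brute_force(existing_paths, wordlist):
    """Try every wordlist entry against a simulated site.

    Returns the entries that exist on the site and the number of trials made.
    """
    hits = [word for word in wordlist if word in existing_paths]
    return hits, len(wordlist)


# Hypothetical simulated site: the set of directories that actually exist.
site = {"admin", "login", "images", "research/projects"}

# Hypothetical generic wordlist, unaware of the kind of site being tested.
wordlist = ["admin", "backup", "cgi-bin", "images", "old", "test", "tmp", "uploads"]

hits, trials = brute_force(site, wordlist)
success_rate = len(hits) / trials  # fraction of trials that found a directory

print(f"{len(hits)} hits in {trials} trials (success rate {success_rate:.0%})")
```

Every entry costs one trial whether or not it matches, and a generic wordlist never proposes site-specific paths such as `research/projects`. The success rate therefore depends entirely on how well the wordlist happens to match the target.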

Offensive AI is a recent paradigm that integrates AI-based technologies into cyber attacks. In this work, we explore whether AI can enhance the directory enumeration process and propose a novel Language Model-based framework. Our experiments, conducted in a testbed consisting of 1 million URLs from different web application domains (universities, hospitals, government, companies), demonstrate the superiority of the LM-based attack, with an average performance increase of 969%.
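As a reading aid for the reported figure, a relative increase of 969% corresponds to roughly 10.69 times the baseline value. The sketch below shows the arithmetic. The helper names and the baseline value of 100 are hypothetical, and the abstract does not state which metric "performance" refers to.

```python
def relative_increase_pct(baseline, improved):
    """Relative increase of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


def improved_from_increase(baseline, increase_pct):
    """Value reached when `baseline` grows by `increase_pct` percent."""
    return baseline * (1.0 + increase_pct / 100.0)


# Hypothetical baseline: 100 successful discoveries for a fixed request budget.
baseline_hits = 100
lm_hits = improved_from_increase(baseline_hits, 969)

print(f"baseline={baseline_hits}, with a 969% increase={lm_hits:.0f}")
print(f"ratio to baseline={lm_hits / baseline_hits:.2f}")
```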

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation
2. Security and privacy
  1. Systems security
    1. Vulnerability management
      1. Penetration testing

Published In

AISec '24: Proceedings of the 2024 Workshop on Artificial Intelligence and Security

November 2024

225 pages

ISBN: 9798400712289

DOI: 10.1145/3689932

Program Chairs: Maura Pintor (University of Cagliari), Xinyun Chen (Google DeepMind), Matthew Jagielski (Google DeepMind)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

European Commission

Conference

CCS '24: ACM SIGSAC Conference on Computer and Communications Security

Sponsor: SIGSAC

October 14 - 18, 2024

Salt Lake City, UT, USA

Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%
