Generalized vulnerability extrapolation using abstract syntax trees

Authors: Fabian Yamaguchi, Markus Lottmann, Konrad Rieck

ACSAC '12: Proceedings of the 28th Annual Computer Security Applications Conference

Pages 359 - 368

https://doi.org/10.1145/2420950.2421003

Published: 03 December 2012

Abstract

The discovery of vulnerabilities in source code is key to securing computer systems. While specific types of security flaws can be identified automatically, in the general case the process of finding vulnerabilities cannot be automated, and vulnerabilities are mainly discovered by manual analysis. In this paper, we propose a method for assisting a security analyst during the auditing of source code. Our method proceeds by extracting abstract syntax trees from the code and determining structural patterns in these trees, such that each function in the code can be described as a mixture of these patterns. This representation enables us to decompose a known vulnerability and extrapolate it to a code base, such that functions potentially suffering from the same flaw can be suggested to the analyst. We evaluate our method on the source code of four popular open-source projects: LibTIFF, FFmpeg, Pidgin and Asterisk. For three of these projects, we are able to identify zero-day vulnerabilities by inspecting only a small fraction of the code bases.
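
The idea of extrapolating a known vulnerability to structurally similar functions can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified stand-in that assumes Python's own `ast` module in place of a parser for the audited language (the evaluated projects are written in C), represents each function as a bag of parent/child node-type pairs, and ranks candidates by plain cosine similarity. The step that decomposes functions into a mixture of structural patterns is omitted here for brevity. The names `function_features`, `cosine`, `extrapolate` and the toy `CODE` sample are hypothetical.

```python
import ast
import math
from collections import Counter

# Toy code base (hypothetical): two structurally identical copy loops and one
# unrelated function.
CODE = '''
def read_tag(buf, n):
    out = [0] * 16
    for i in range(n):
        out[i] = buf[i]
    return out

def read_strip(data, count):
    dst = [0] * 32
    for j in range(count):
        dst[j] = data[j]
    return dst

def version():
    return "1.0"
'''


def function_features(source):
    """Map each function name to a bag of (parent type, child type) AST edges."""
    features = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            bag = Counter()
            for parent in ast.walk(node):
                for child in ast.iter_child_nodes(parent):
                    bag[(type(parent).__name__, type(child).__name__)] += 1
            features[node.name] = bag
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def extrapolate(features, known_vulnerable):
    """Rank all other functions by structural similarity to the known flaw."""
    seed = features[known_vulnerable]
    scored = [
        (cosine(seed, bag), name)
        for name, bag in features.items()
        if name != known_vulnerable
    ]
    return [name for _, name in sorted(scored, reverse=True)]


features = function_features(CODE)
# Suppose an analyst has confirmed a missing bounds check in read_tag.
print(extrapolate(features, "read_tag"))  # ['read_strip', 'version']
```

Because identifiers are ignored and only node types are counted, `read_strip` is ranked first even though it shares no variable names with `read_tag`. An analyst would then review the top of the ranking by hand rather than the whole code base.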

Published In

ACSAC '12: Proceedings of the 28th Annual Computer Security Applications Conference

December 2012

464 pages

ISBN:9781450313124

DOI:10.1145/2420950

Conference Chair:
Robert H'obbes' Zakon
Zakon Group LLC

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Sponsors

ACSA: Applied Computing Security Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ACSAC '12

Sponsor:

ACSA

ACSAC '12: Annual Computer Security Applications Conference

December 3 - 7, 2012

Orlando, Florida, USA

Acceptance Rates

ACSAC '12 Paper Acceptance Rate 44 of 231 submissions, 19%;

Overall Acceptance Rate 104 of 497 submissions, 21%
