TRACER: Signature-based Static Analysis for Detecting Recurring Vulnerabilities

Authors:

Kihong Heo

CCS '22: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security

Pages 1695 - 1708

https://doi.org/10.1145/3548606.3560664

Published: 07 November 2022

Abstract

Similar software vulnerabilities recur because developers reuse existing vulnerable code or make similar mistakes when implementing the same logic. Recently, various analysis techniques have been proposed to find syntactically recurring vulnerabilities that arise from code reuse. However, limited attention has been devoted to semantically recurring ones, which share the same vulnerable behavior in different code structures. In this paper, we present a general analysis framework, called TRACER, for detecting such recurring vulnerabilities. TRACER is based on a taint analysis that can detect various types of vulnerabilities. For a given set of known vulnerabilities, the taint analysis extracts vulnerable traces and builds a signature database from them. When a new, unseen program is analyzed, TRACER compares all potentially vulnerable traces reported by the analysis with the known vulnerability signatures. Then, TRACER reports a list of potential vulnerabilities ranked by similarity score. We evaluate TRACER on 273 Debian packages written in C/C++. Our experimental results demonstrate that TRACER is able to find 281 previously unknown vulnerabilities, for which 6 CVE identifiers have been assigned.
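The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how traces are represented or which similarity measure is used, so everything below is an assumption made for illustration: each trace is a list of operation names, a signature is a bag-of-operations count vector, and similarity is cosine similarity. The names `featurize`, `cosine`, and `rank_alarms`, as well as the example traces, are hypothetical and are not taken from TRACER.

```python
import math
from collections import Counter


def featurize(trace):
    """Turn a trace (list of operation names) into a bag-of-operations vector.

    Assumption: this representation is illustrative only.
    """
    return Counter(trace)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_alarms(signatures, candidates):
    """Score each candidate trace against its closest known signature.

    Returns (candidate, best_signature, score) tuples, highest score first.
    """
    sig_vecs = {name: featurize(trace) for name, trace in signatures.items()}
    if not sig_vecs:
        return []
    ranked = []
    for cand_name, trace in candidates.items():
        vec = featurize(trace)
        best_sig, best_score = max(
            ((name, cosine(vec, sig_vec)) for name, sig_vec in sig_vecs.items()),
            key=lambda pair: pair[1],
        )
        ranked.append((cand_name, best_sig, best_score))
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked


# Hypothetical data: one known signature and two traces from an unseen program.
signatures = {"sig-overflow-alloc": ["fread", "mul", "malloc"]}
candidates = {
    "alarm-1": ["fread", "shift", "mul", "malloc"],
    "alarm-2": ["getenv", "strcpy"],
}

for name, sig, score in rank_alarms(signatures, candidates):
    print(f"{name}: closest signature {sig}, similarity {score:.3f}")
```

In this sketch a candidate trace that shares most of its operations with a known signature is ranked first even though its structure differs, which mirrors the idea of ranking potential vulnerabilities by similarity rather than requiring an exact syntactic match.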

Index Terms

1. Security and privacy
  1. Software and application security
    1. Software security engineering

Published In

CCS '22: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security

November 2022

3598 pages

ISBN:9781450394505

DOI:10.1145/3548606

General Chairs: Heng Yin (University of California, Riverside), Angelos Stavrou (Virginia Tech)

Program Chairs: Cas Cremers (CISPA Helmholtz Center for Information Security), Elaine Shi (Carnegie Mellon University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Research Foundation of Korea (NRF)
Institute for Information & Communications Technology Planning & Evaluation (IITP)

Conference

CCS '22: 2022 ACM SIGSAC Conference on Computer and Communications Security

Sponsor: SIGSAC

November 7 - 11, 2022

Los Angeles, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
