DOI: 10.5555/3698900.3699137
Racing on the negative force: efficient vulnerability root-cause analysis through reinforcement learning on counterexamples

Published: 12 August 2024

Abstract

Root-Cause Analysis (RCA) is crucial for discovering security vulnerabilities from fuzzing outcomes. Automating this process by triaging the crashes observed during fuzzing, however, is considered challenging. In particular, today's statistical RCA approaches are known to be exceedingly slow, often taking tens of hours or even a week to analyze a single crash. This problem stems from the biased sampling such approaches perform. More specifically, given an input that induces a crash in a program, these approaches sample around the input by mutating it to generate new test cases; these cases are used to fuzz the program, in the hope that the program elements (blocks, instructions, or predicates) on the execution path of the original input can be adequately sampled, so their correlations with the crash can be determined. This process, however, tends to generate input samples that are more likely to cause the crash, whose execution paths involve a similar set of elements; these elements remain hard to distinguish until a large number of samples have been collected. We found that this problem can be effectively addressed by sampling around "counterexamples": inputs that cause a significant change to the current estimates of the correlations. Such inputs, though still covering the elements, often do not lead to the crash, and they prove effective in differentiating program elements, thereby accelerating the RCA process. Based on this understanding, we designed and implemented a reinforcement learning (RL) technique that rewards the mutation operations yielding counterexamples. By balancing random sampling with exploitation of the counterexamples, our new approach, called RACING, is shown to substantially improve both the scalability and the accuracy of today's statistical RCA, outperforming the state of the art by more than an order of magnitude.
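The balance the abstract describes, between random sampling and exploiting mutations that yield counterexamples, is naturally framed as a multi-armed bandit over mutation operators. The following is a hypothetical sketch (not RACING's actual implementation): a UCB1-style scheduler in which each operator's reward is the magnitude of the shift its generated test case caused in the crash-correlation estimates. All names (`MutationScheduler`, `correlation_shift`) are illustrative.

```python
import math

class MutationScheduler:
    """UCB1-style bandit over mutation operators (illustrative sketch).

    Operators whose outputs were counterexamples -- inputs that shifted the
    correlation estimates -- accrue higher reward and get picked more often,
    while the exploration bonus keeps some random sampling alive.
    """

    def __init__(self, operators, exploration=1.4):
        self.operators = list(operators)
        self.c = exploration
        self.pulls = {op: 0 for op in self.operators}
        self.reward = {op: 0.0 for op in self.operators}
        self.total = 0

    def select(self):
        # Try every operator once before exploiting.
        for op in self.operators:
            if self.pulls[op] == 0:
                return op

        # UCB1 score: mean reward plus an exploration bonus that grows
        # for rarely tried operators.
        def ucb(op):
            mean = self.reward[op] / self.pulls[op]
            bonus = self.c * math.sqrt(math.log(self.total) / self.pulls[op])
            return mean + bonus

        return max(self.operators, key=ucb)

    def update(self, op, correlation_shift):
        # Reward = how much the new test case changed the current
        # correlation estimates, clipped to [0, 1]; a large shift marks
        # the generated input as a counterexample worth exploiting.
        self.total += 1
        self.pulls[op] += 1
        self.reward[op] += min(1.0, max(0.0, correlation_shift))
```

Under this sketch, an operator that keeps producing crash-reproducing inputs along the same path earns little reward (the estimates barely move), while one that surfaces non-crashing inputs covering the same elements earns more, mirroring the "racing on the negative force" intuition.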



Published In

SEC '24: Proceedings of the 33rd USENIX Conference on Security Symposium
August 2024
7480 pages
ISBN:978-1-939133-44-1

Sponsors

  • Bloomberg Engineering
  • Google Inc.
  • NSF
  • Futurewei Technologies
  • IBM

Publisher

USENIX Association

United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 40 of 100 submissions, 40%
