Research article. DOI: 10.1145/3368089.3409748

Boosting fuzzer efficiency: an information theoretic perspective

Published: 08 November 2020

Abstract

In this paper, we take the fundamental perspective of fuzzing as a learning process. Suppose that, before fuzzing, we know nothing about the behaviors of a program P: What does it do? Executing the first test input, we learn how P behaves for this input. Executing the next input, we either observe the same behavior or discover a new one. As such, each execution reveals "some amount" of information about P’s behaviors. A classic measure of information is Shannon’s entropy. Measuring entropy allows us to quantify how much is learned from each generated test input about the behaviors of the program. Within a probabilistic model of fuzzing, we show how entropy also measures fuzzer efficiency. Specifically, it measures the general rate at which the fuzzer discovers new behaviors. Intuitively, efficient fuzzers maximize information.
From this information-theoretic perspective, we develop Entropic, an entropy-based power schedule for greybox fuzzing that assigns more energy to seeds that maximize information. We implemented Entropic in the popular greybox fuzzer LibFuzzer. Our experiments with more than 250 open-source programs (60 million LoC) demonstrate substantially improved efficiency and confirm our hypothesis that an efficient fuzzer maximizes information. Entropic has been independently evaluated and invited for integration into main-line LibFuzzer. Entropic now runs on more than 25,000 machines fuzzing hundreds of security-critical software systems simultaneously and continuously.
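To make the notion of "information learned per execution" concrete, the following is a minimal sketch of the standard plug-in estimate of Shannon entropy over observed program behaviors. The behavior labels and the function name `shannon_entropy` are hypothetical and chosen only for illustration; the paper's own estimator may differ.

```python
import math
from collections import Counter


def shannon_entropy(observations):
    """Plug-in estimate of Shannon entropy, in bits, from observed behavior labels."""
    counts = Counter(observations)
    total = sum(counts.values())
    # H = -sum(p_i * log2(p_i)) over the empirical distribution of behaviors.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical behavior labels for eight executions (illustrative data only).
skewed = ["A"] * 7 + ["B"]           # almost every input shows the same behavior
uniform = ["A", "B", "C", "D"] * 2   # behaviors are observed equally often

print(shannon_entropy(skewed))
print(shannon_entropy(uniform))
```

Under this estimate, the evenly spread observations carry more information per execution than the skewed ones, which matches the intuition that repeatedly observing the same behavior teaches little about P.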
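The idea of an entropy-based power schedule can be sketched as follows. This is an illustrative simplification, not the Entropic implementation in LibFuzzer: it assumes each seed keeps a count of the behaviors observed among the inputs generated from it, and it assigns selection probability in proportion to the plug-in entropy of those counts. The names `entropy_bits`, `assign_energy`, `seed_stats`, and the `floor` smoothing constant are all hypothetical.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """Plug-in Shannon entropy (bits) of a behavior-count table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def assign_energy(seed_stats, floor=1e-3):
    """Return selection probabilities proportional to each seed's entropy.

    `floor` is an assumed smoothing constant so that seeds with zero
    estimated entropy still receive a small share of the fuzzing effort.
    """
    weights = {seed: entropy_bits(c) + floor for seed, c in seed_stats.items()}
    norm = sum(weights.values())
    return {seed: w / norm for seed, w in weights.items()}


# Hypothetical per-seed behavior counts (illustrative data only).
seed_stats = {
    "seed_a": Counter({"f1": 50, "f2": 50}),  # mutants spread across behaviors
    "seed_b": Counter({"f1": 99, "f2": 1}),   # mutants mostly repeat one behavior
}
energy = assign_energy(seed_stats)
print(energy)
```

In this sketch, `seed_a` receives most of the energy because fuzzing it has so far revealed a more even mix of behaviors, so further fuzzing is expected to be more informative.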

Supplementary Material

Auxiliary Teaser Video (fse20main-p597-p-teaser.mp4)
This is the presentation video for our ESEC/FSE'20 paper "Boosting Fuzzer Efficiency: An Information Theoretic Perspective" by Marcel Böhme, Valentin J. M. Manès, and Sang Kil Cha.
Auxiliary Presentation Video (fse20main-p597-p-video.mp4)


Published In

ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2020
1703 pages
ISBN:9781450370431
DOI:10.1145/3368089
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. efficiency
  2. entropy
  3. fuzzing
  4. information theory
  5. software testing

Qualifiers

  • Research-article

Conference

ESEC/FSE '20

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Article Metrics

  • Downloads (last 12 months): 127
  • Downloads (last 6 weeks): 19
Reflects downloads up to 09 Nov 2024

Cited By

  • (2024) DCGFuzz: An Embedded Firmware Security Analysis Method with Dynamically Co-Directional Guidance Fuzzing. Electronics 13:8 (1433). DOI: 10.3390/electronics13081433. Online publication date: 10-Apr-2024
  • (2024) Visualization Task Taxonomy to Understand the Fuzzing Internals (Registered Report). Proceedings of the 3rd ACM International Fuzzing Workshop (13-22). DOI: 10.1145/3678722.3685530. Online publication date: 13-Sep-2024
  • (2024) Graphuzz: Data-driven Seed Scheduling for Coverage-guided Greybox Fuzzing. ACM Transactions on Software Engineering and Methodology 33:7 (1-36). DOI: 10.1145/3664603. Online publication date: 26-Aug-2024
  • (2024) Dodrio: Parallelizing Taint Analysis Based Fuzzing via Redundancy-Free Scheduling. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (244-254). DOI: 10.1145/3663529.3663844. Online publication date: 10-Jul-2024
  • (2024) DiPri: Distance-based Seed Prioritization for Greybox Fuzzing. ACM Transactions on Software Engineering and Methodology. DOI: 10.1145/3654440. Online publication date: 26-Mar-2024
  • (2024) Planning to Guide LLM for Code Coverage Prediction. Proceedings of the 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (24-34). DOI: 10.1145/3650105.3652292. Online publication date: 14-Apr-2024
  • (2024) Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (1463-1479). DOI: 10.1145/3634737.3637639. Online publication date: 1-Jul-2024
  • (2024) Extrapolating Coverage Rate in Greybox Fuzzing. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3639198. Online publication date: 20-May-2024
  • (2024) Curiosity-Driven Testing for Sequential Decision-Making Process. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-14). DOI: 10.1145/3597503.3639149. Online publication date: 20-May-2024
  • (2024) Marco: A Stochastic Asynchronous Concolic Explorer. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (1-12). DOI: 10.1145/3597503.3623301. Online publication date: 20-May-2024
