Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3359789.3359796acmotherconferencesArticle/Chapter ViewAbstractPublication PagesacsacConference Proceedingsconference-collections
research-article

Systematic comparison of symbolic execution systems: intermediate representation and its generation

Published: 09 December 2019 Publication History

Abstract

Symbolic execution has become a popular technique for software testing and vulnerability detection. Most implementations transform the program under analysis to some intermediate representation (IR), which is then used as a basis for symbolic execution. There is a multitude of available IRs, and even more approaches to transform target programs into a respective IR.
When developing a symbolic execution engine, one needs to choose an IR, but it is not clear which influence the IR generation process has on the resulting system. What are the respective benefits for symbolic execution of generating IR from source code versus lifting machine code? Does the distinction even matter? What is the impact of not using an IR, executing machine code directly? We feel that there is little scientific evidence backing the answers to those questions. Therefore, we first develop a methodology for systematic comparison of different approaches to symbolic execution; we then use it to evaluate the impact of the choice of IR and IR generation. We make our comparison framework available to the community for future research.

References

[1]
Roberto Baldoni, Emilio Coppa, Daniele Cono D'elia, Camil Demetrescu, and Irene Finocchi. 2018. A survey of symbolic execution techniques. ACM Computing Surveys (CSUR) 51, 3 (2018), 50.
[2]
Clark Barrett, Aaron Stump, and Cesare Tinelli. 2010. The SMT-LIB standard: Version 2.0. In Proceedings of the 8th International Workshop on Satisfiability Modulo Theories (Edinburgh, England), Vol. 13. 14.
[3]
Fabrice Bellard. 2005. QEMU, a fast and portable dynamic translator. In USENIX Annual Technical Conference, FREENIX Track, Vol. 41. 46.
[4]
Ella Bounimova, Patrice Godefroid, and David Molnar. 2013. Billions and Billions of Constraints: Whitebox Fuzz Testing in Production. In Proceedings of the 2013 International Conference on Software Engineering (ICSE '13). IEEE Press, Piscataway, NJ, USA, 122--131. http://dl.acm.org/citation.cfm?id=2486788.2486805
[5]
David Brumley, Ivan Jager, Thanassis Avgerinos, and Edward J Schwartz. 2011. BAP: A binary analysis platform. In International Conference on Computer Aided Verification. Springer, 463--469.
[6]
Robert Brummayer and Armin Biere. 2009. Boolector: An efficient SMT solver for bit-vectors and arrays. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 174--177.
[7]
Stefan Bucur, Vlad Ureche, Cristian Zamfir, and George Candea. 2011. Parallel symbolic execution for automated real-world software testing. In Proceedings of the 6th ACM SIGOPS/EuroSys Conference on Computer Systems. ACM, 183--198.
[8]
Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. 2008. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs. In OSDI, Vol. 8. 209--224.
[9]
Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, and Dawson R. Engler. 2008. EXE: automatically generating inputs of death. ACM Transactions on Information and System Security (TISSEC) 12, 2 (2008), 10.
[10]
Sang Kil Cha, Thanassis Avgerinos, Alexandre Rebert, and David Brumley. 2012. Unleashing mayhem on binary code. In 2012 IEEE Symposium on Security and Privacy. IEEE, 380--394.
[11]
Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: A platform for in-vivo multi-path analysis of software systems. In ACM SIGARCH Computer Architecture News, Vol. 39. ACM, 265--278.
[12]
Cristina Cifuentes and K. John Gough. 1995. Decompilation of binary programs. Software: Practice and Experience 25, 7 (1995), 811--829.
[13]
Peter Collingbourne, Cristian Cadar, Paul H.J. Kelly, et al. 2011. Symbolic crosschecking of floating-point and SIMD code. In European Conference on Computer Systems (EuroSys 2011).
[14]
Nassim Corteggiani, Giovanni Camurati, and Aurélien Francillon. 2018. Inception: system-wide security testing of real-world embedded systems software. In 27th USENIX Security Symposium (USENIX Security 18). 309--326.
[15]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An efficient SMT solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 337--340.
[16]
Artem Dinaburg and Andrew Ruef. 2014. McSema: Static translation of x86 instructions to LLVM. In ReCon 2014 Conference, Montreal, Canada.
[17]
Joe W. Duran and Simeon Ntafos. 1981. A Report on Random Testing. In Proceedings of the 5th International Conference on Software Engineering (ICSE '81). IEEE Press, Piscataway, NJ, USA, 179--183. http://dl.acm.org/citation.cfm?id=800078.802530
[18]
E. Allen Emerson and Edmund M. Clarke. 1980. Characterizing correctness properties of parallel programs using fixpoints. In International Colloquium on Automata, Languages, and Programming. Springer, 169--181.
[19]
Free Software Foundation. 2016. Coreutils - GNU core utilities. https://www.gnu.org/software/coreutils/. Accessed: 2019-02-04.
[20]
Patrice Godefroid, Michael Y. Levin, and David Molnar. 2012. SAGE: whitebox fuzzing for security testing. Commun. ACM 55, 3 (2012), 40--44.
[21]
Soomin Kim, Markus Faerevaag, Minkyu Jung, SeungIl Jung, DongYeop Oh, JongHyup Lee, and Sang Kil Cha. 2017. Testing intermediate representations for binary analysis. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering. IEEE Press, 353--364.
[22]
James C. King. 1976. Symbolic execution and program testing. Commun. ACM 19, 7 (1976), 385--394.
[23]
Gergely Kovásznai, Helmut Veith, Andreas Fröhlich, and Armin Biere. 2014. On the complexity of symbolic verification and decision problems in bit-vector logic. In International Symposium on Mathematical Foundations of Computer Science. Springer, 481--492.
[24]
Volodymyr Kuznetsov, Johannes Kinder, Stefan Bucur, and George Candea. 2012. Efficient state merging in symbolic execution. Acm Sigplan Notices 47, 6 (2012), 193--204.
[25]
Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization. IEEE Computer Society, 75.
[26]
Lixin Li and Chao Wang. 2013. Dynamic analysis and debugging of binary code for security applications. In International Conference on Runtime Verification. Springer, 403--423.
[27]
Tianhai Liu, Mateus Araújo, Marcelo d'Amorim, and Mana Taghdiri. 2014. A comparative study of incremental constraint solving approaches in symbolic execution. In Haifa Verification Conference. Springer, 284--299.
[28]
Nicholas Nethercote and Julian Seward. 2007. Valgrind: a framework for heavyweight dynamic binary instrumentation. In ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation (PLDI 2007), Vol. 42. ACM, 89--100.
[29]
Hristina Palikareva and Cristian Cadar. 2013. Multi-solver support in symbolic execution. In International Conference on Computer Aided Verification. Springer, 53--68.
[30]
Xiao Qu and Brian Robinson. 2011. A case study of concolic testing tools and their limitations. In 2011 International Symposium on Empirical Software Engineering and Measurement. IEEE, 117--126.
[31]
Jean-Pierre Queille and Joseph Sifakis. 1982. Specification and verification of concurrent systems in CESAR. In International Symposium on Programming. Springer, 337--351.
[32]
Nguyen Anh Quynh and Dang Hoang Vu. 2015. Unicorn - The ultimate CPU emulator. https://www.unicorn-engine.org/. Accessed: 2019-02-26.
[33]
David A. Ramos and Dawson Engler. 2015. Under-constrained symbolic execution: Correctness checking for real code. In 24th USENIX Security Symposium (USENIX Security 15). 49--64.
[34]
Eric F Rizzi, Sebastian Elbaum, and Matthew B. Dwyer. 2016. On the techniques we create, the tools we build, and their misalignments: a study of KLEE. In 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE). IEEE, 132--143.
[35]
S2E issue tracker. 2019. Add support for symbolic MMX registers. https://github.com/S2E/s2e-env/issues/144. Accessed: 2019-06-04.
[36]
Florent Saudel and Jonathan Salwan. 2015. Triton Dynamic Binary Analysis Framework. https://github.com/JonathanSalwan/Triton. Accessed: 2019-04-02.
[37]
Edward J Schwartz, Thanassis Avgerinos, and David Brumley. 2010. All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In 2010 IEEE Symposium on Security and Privacy. IEEE, 317--331.
[38]
Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario Polino, Andrew Dutcher, John Grosen, Siji Feng, Christophe Hauser, Christopher Kruegel, et al. 2016. SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis. In 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 138--157.
[39]
Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting Fuzzing Through Selective Symbolic Execution. In NDSS, Vol. 16. 1--16.
[40]
Trail of Bits. 2017. Manticore: Symbolic execution for humans. https://blog.trailofbits.com/2017/04/27/manticore-symbolic-execution-for-humans/. Accessed: 2019-02-27.
[41]
Hui Xu, Zirui Zhao, Yangfan Zhou, and Michael R. Lyu. 2018. Benchmarking the Capability of Symbolic Execution Tools with Logic Bombs. IEEE Transactions on Dependable and Secure Computing (2018).
[42]
Hui Xu, Yangfan Zhou, Yu Kang, and Michael R. Lyu. 2017. Concolic execution on small-size binaries: challenges and empirical study. In 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 181--188.
[43]
Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing. In 27th USENIX Security Symposium (USENIX Security 18). 745--761.
[44]
Lintao Zhang, Conor F. Madigan, Matthew H. Moskewicz, and Sharad Malik. 2001. Efficient conflict driven learning in a boolean satisfiability solver. In Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design. IEEE Press, 279--285.

Cited By

View all
  • (2024)Fuzzing and Symbolic Execution for Multipath Malware Tracing: Bridging Theory and Practice via Survey and ExperimentsDigital Threats: Research and Practice10.1145/3700147Online publication date: 11-Oct-2024
  • (2024)Plankton: Reconciling Binary Code and Debug InformationProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640382(912-928)Online publication date: 27-Apr-2024
  • (2023)sem2vec: Semantics-aware Assembly Tracelet EmbeddingACM Transactions on Software Engineering and Methodology10.1145/356993332:4(1-34)Online publication date: 27-May-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference
December 2019
821 pages
ISBN:9781450376280
DOI:10.1145/3359789
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 December 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. intermediate representation
  2. symbolic execution

Qualifiers

  • Research-article

Funding Sources

Conference

ACSAC '19
ACSAC '19: 2019 Annual Computer Security Applications Conference
December 9 - 13, 2019
Puerto Rico, San Juan, USA

Acceptance Rates

ACSAC '19 Paper Acceptance Rate 60 of 266 submissions, 23%;
Overall Acceptance Rate 104 of 497 submissions, 21%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)90
  • Downloads (Last 6 weeks)4
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Fuzzing and Symbolic Execution for Multipath Malware Tracing: Bridging Theory and Practice via Survey and ExperimentsDigital Threats: Research and Practice10.1145/3700147Online publication date: 11-Oct-2024
  • (2024)Plankton: Reconciling Binary Code and Debug InformationProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640382(912-928)Online publication date: 27-Apr-2024
  • (2023)sem2vec: Semantics-aware Assembly Tracelet EmbeddingACM Transactions on Software Engineering and Methodology10.1145/356993332:4(1-34)Online publication date: 27-May-2023
  • (2022)So Many Fuzzers, So Little Time✱Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556946(1-12)Online publication date: 10-Oct-2022
  • (2022)SymFusion: Hybrid Instrumentation for Concolic ExecutionProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556928(1-12)Online publication date: 10-Oct-2022
  • (2022)SoK: Demystifying Binary Lifters Through the Lens of Downstream Applications2022 IEEE Symposium on Security and Privacy (SP)10.1109/SP46214.2022.9833799(1100-1119)Online publication date: May-2022
  • (2021)LeanSym: Efficient Hybrid Fuzzing Through Conservative Constraint DebloatingProceedings of the 24th International Symposium on Research in Attacks, Intrusions and Defenses10.1145/3471621.3471852(62-77)Online publication date: 6-Oct-2021
  • (2020)Symbolic execution with SYMCCProceedings of the 29th USENIX Conference on Security Symposium10.5555/3489212.3489223(181-198)Online publication date: 12-Aug-2020
  • (2020)Exposing cache timing side-channel leaks through out-of-order symbolic executionProceedings of the ACM on Programming Languages10.1145/34282154:OOPSLA(1-32)Online publication date: 13-Nov-2020
  • (2020)SymJEx: symbolic execution on the GraalVMProceedings of the 17th International Conference on Managed Programming Languages and Runtimes10.1145/3426182.3426187(63-72)Online publication date: 4-Nov-2020

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media