research-article

Systematic comparison of symbolic execution systems: intermediate representation and its generation

Authors:

Sebastian Poeplau,

Aurélien Francillon

ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference

Pages 163 - 176

https://doi.org/10.1145/3359789.3359796

Published: 09 December 2019

Abstract

Symbolic execution has become a popular technique for software testing and vulnerability detection. Most implementations transform the program under analysis into an intermediate representation (IR), which then serves as the basis for symbolic execution. Many IRs are available, and there are even more approaches to transforming target programs into a given IR.

When developing a symbolic execution engine, one needs to choose an IR, but it is not clear what influence the IR generation process has on the resulting system. What are the respective benefits for symbolic execution of generating IR from source code versus lifting machine code? Does the distinction even matter? What is the impact of not using an IR and executing machine code directly? We feel that there is little scientific evidence backing the answers to those questions. Therefore, we first develop a methodology for the systematic comparison of different approaches to symbolic execution; we then use it to evaluate the impact of the choice of IR and IR generation. We make our comparison framework available to the community for future research.
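
To make the idea of symbolic execution over an IR concrete, the following is a minimal sketch and not the comparison framework described in the abstract. It assumes a hypothetical three-address toy IR (the opcodes `const`, `add`, `br_eq`, and `halt` are invented for illustration) and replaces the SMT solver with brute-force enumeration over a small input domain.

```python
from itertools import product

# Hypothetical toy IR, invented for illustration; not the IR of any real tool.
PROGRAM = [
    ("add", "t0", "x", "y"),    # t0 = x + y
    ("const", "t1", 10),        # t1 = 10
    ("br_eq", "t0", "t1", 4),   # if t0 == t1 goto instruction 4
    ("halt", "ok"),
    ("halt", "bug"),
]

def explore(program, inputs):
    """Walk every path of the IR; return a list of (halt label, path constraints)."""
    results = []
    # Each state: (program counter, symbolic environment, path constraints).
    work = [(0, {name: name for name in inputs}, [])]
    while work:
        pc, env, path = work.pop()
        op = program[pc]
        if op[0] == "const":
            work.append((pc + 1, {**env, op[1]: op[2]}, path))
        elif op[0] == "add":
            expr = ("+", env[op[2]], env[op[3]])
            work.append((pc + 1, {**env, op[1]: expr}, path))
        elif op[0] == "br_eq":
            cond = ("==", env[op[1]], env[op[2]])
            # Fork: one state per branch outcome, each with its own constraint.
            work.append((op[3], env, path + [cond]))
            work.append((pc + 1, env, path + [("not", cond)]))
        elif op[0] == "halt":
            results.append((op[1], path))
    return results

def evaluate(expr, assignment):
    """Evaluate a symbolic expression under a concrete input assignment."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return assignment[expr]
    if expr[0] == "not":
        return not evaluate(expr[1], assignment)
    lhs, rhs = evaluate(expr[1], assignment), evaluate(expr[2], assignment)
    return lhs + rhs if expr[0] == "+" else lhs == rhs

def solve(path, inputs, domain=range(16)):
    """Brute-force stand-in for an SMT solver (assumption: tiny input domain)."""
    for values in product(domain, repeat=len(inputs)):
        assignment = dict(zip(inputs, values))
        if all(evaluate(c, assignment) for c in path):
            return assignment
    return None

if __name__ == "__main__":
    for label, path in explore(PROGRAM, ["x", "y"]):
        print(label, solve(path, ["x", "y"]))
```

In this sketch the executor only ever sees IR instructions, so whether `PROGRAM` came from source code or from lifted machine code changes the instructions and constraints it handles, which is the kind of effect the paper sets out to measure.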

Cited By

Botacin M (2024). Fuzzing and Symbolic Execution for Multipath Malware Tracing: Bridging Theory and Practice via Survey and Experiments. Digital Threats: Research and Practice. Online publication date: 11-Oct-2024.
https://doi.org/10.1145/3700147
Zhou A, Ye C, Huang H, Cai Y, Zhang C, Tsafrir D, Musuvathi M, Gupta R, Abu-Ghazaleh N (2024). Plankton: Reconciling Binary Code and Debug Information. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, 912-928. Online publication date: 27-Apr-2024.
https://dl.acm.org/doi/10.1145/3620665.3640382
Wang H, Ma P, Wang S, Tang Q, Nie S, Wu S (2023). sem2vec: Semantics-aware Assembly Tracelet Embedding. ACM Transactions on Software Engineering and Methodology 32:4, 1-34. Online publication date: 27-May-2023.
https://dl.acm.org/doi/10.1145/3569933

Index Terms

Systematic comparison of symbolic execution systems: intermediate representation and its generation
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published In

ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference

December 2019

821 pages

ISBN:9781450376280

DOI:10.1145/3359789

Conference Chair:
David Balenson
SRI International

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Agence Nationale de la Recherche

Conference

ACSAC '19

ACSAC '19: 2019 Annual Computer Security Applications Conference

December 9 - 13, 2019

San Juan, Puerto Rico, USA

Acceptance Rates

ACSAC '19 Paper Acceptance Rate 60 of 266 submissions, 23%;

Overall Acceptance Rate 104 of 497 submissions, 21%

Cited By

Poncelet C, Sagonas K, Tsiftes N (2022). So Many Fuzzers, So Little Time✱. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 1-12. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3556946
Coppa E, Yin H, Demetrescu C (2022). SymFusion: Hybrid Instrumentation for Concolic Execution. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 1-12. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3556928
Liu Z, Yuan Y, Wang S, Bao Y (2022). SoK: Demystifying Binary Lifters Through the Lens of Downstream Applications. 2022 IEEE Symposium on Security and Privacy (SP), 1100-1119. Online publication date: May-2022.
https://doi.org/10.1109/SP46214.2022.9833799
Mi X, Rawat S, Giuffrida C, Bos H, Bilge L, Dumitras T (2021). LeanSym: Efficient Hybrid Fuzzing Through Conservative Constraint Debloating. Proceedings of the 24th International Symposium on Research in Attacks, Intrusions and Defenses, 62-77. Online publication date: 6-Oct-2021.
https://dl.acm.org/doi/10.1145/3471621.3471852
Poeplau S, Francillon A, Capkun S, Roesner F (2020). Symbolic execution with SYMCC. Proceedings of the 29th USENIX Conference on Security Symposium, 181-198. Online publication date: 12-Aug-2020.
https://dl.acm.org/doi/10.5555/3489212.3489223
Guo S, Chen Y, Yu J, Wu M, Zuo Z, Li P, Cheng Y, Wang H (2020). Exposing cache timing side-channel leaks through out-of-order symbolic execution. Proceedings of the ACM on Programming Languages 4:OOPSLA, 1-32. Online publication date: 13-Nov-2020.
https://dl.acm.org/doi/10.1145/3428215
Kloibhofer S, Pointhuber T, Heisinger M, Mössenböck H, Stadler L, Leopoldseder D, Marr S (2020). SymJEx: symbolic execution on the GraalVM. Proceedings of the 17th International Conference on Managed Programming Languages and Runtimes, 63-72. Online publication date: 4-Nov-2020.
https://dl.acm.org/doi/10.1145/3426182.3426187
