DOI: 10.1145/3442381.3450002
Research article

On the Feasibility of Automated Built-in Function Modeling for PHP Symbolic Execution

Published: 03 June 2021

Abstract

Symbolic execution is widely used to detect vulnerabilities in web applications, and modeling language-specific built-in functions is essential to it. Because built-in functions tend to be complicated and are typically implemented in low-level languages, a common strategy is to translate them manually into the SMT-LIB language for constraint solving. Such translation requires excessive human effort and a deep understanding of each function's behavior, and an incorrect translation can invalidate the final results. The problem is aggravated in PHP applications by their cross-language nature: the built-in functions are written in C, while the rest of the code is in PHP.
In this paper, we explore the feasibility of automating the modeling of PHP built-in functions for symbolic execution. We synthesize C programs by transforming the constraint-solving task in PHP symbolic execution into a C-compliant format and integrating it with the C implementations of the built-in functions. We then apply symbolic execution to each synthesized C program to find a feasible path, which yields a solution that can be applied to the original PHP constraints. In this way, we automate the modeling of built-in functions in PHP applications.
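
To make the synthesis step concrete, the following is a minimal sketch rather than the authors' implementation. It turns a hypothetical PHP constraint of the form `builtin($x) === "expected"` into the text of a C harness. It assumes KLEE as the C symbolic executor (the abstract does not name one) and assumes that each built-in is reachable through a hypothetical wrapper named `<builtin>_c`; the function name `synthesize_harness` and the buffer size are likewise illustrative.

```python
import re


def synthesize_harness(builtin: str, expected: str, buf_len: int = 16) -> str:
    """Return C source that asks a symbolic executor for an input x
    such that builtin(x) equals `expected` (illustrative sketch only)."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", builtin):
        raise ValueError("builtin must be a valid C identifier")
    if len(expected) >= buf_len:
        raise ValueError("expected output does not fit in the buffer")
    # Escape characters that would otherwise break the C string literal.
    literal = expected.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "#include <string.h>",
        "#include <klee/klee.h>",
        "",
        "/* Hypothetical wrapper around the built-in's C implementation. */",
        f"extern void {builtin}_c(const char *in, char *out, size_t out_len);",
        "",
        "int main(void) {",
        f"    char x[{buf_len}], out[{buf_len}];",
        '    klee_make_symbolic(x, sizeof(x), "x");  /* x models the PHP input */',
        f"    klee_assume(x[{buf_len - 1}] == '\\0');  /* keep x NUL-terminated */",
        f"    {builtin}_c(x, out, sizeof(out));",
        f'    if (strcmp(out, "{literal}") == 0)',
        "        return 1;  /* feasible path: x solves the PHP constraint */",
        "    return 0;",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical PHP constraint: strtoupper($x) === "ADMIN"
    print(synthesize_harness("strtoupper", "ADMIN"))
```

In this sketch, a concrete test case that the symbolic executor generates for the `return 1` path would supply a value of `x` that satisfies the original PHP constraint, so no hand-written SMT-LIB model of the built-in is needed.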
We thoroughly compare our automated method with the state-of-the-art manual modeling tool. The evaluation results show that our automated method is more accurate, achieves higher function coverage, and can exploit a similar number of vulnerabilities. Our empirical analysis also shows that the manual and automated methods have different strengths and complement each other in certain scenarios. The best practice is therefore to combine both to optimize the accuracy, correctness, and coverage of symbolic execution.

Published In

WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN:9781450383127
DOI:10.1145/3442381
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Constraint solving
  2. PHP
  3. Symbolic execution

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21: The Web Conference 2021
April 19 - 23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 44
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 13 Jan 2025

Cited By

  • (2025) Blending Static and Dynamic Analysis for Web Application Vulnerability Detection: Methodology and Case Study. IEEE Access 13, 3139-3153. DOI: 10.1109/ACCESS.2024.3522094. Online publication date: 2025
  • (2024) SWIDE: A Semantic-aware Detection Engine for Successful Web Injection Attacks. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, 540-554. DOI: 10.1145/3658644.3670304. Online publication date: 2-Dec-2024
  • (2024) FuzzCache: Optimizing Web Application Fuzzing Through Software-Based Data Cache. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, 511-524. DOI: 10.1145/3658644.3670278. Online publication date: 2-Dec-2024
  • (2024) URadar: Discovering Unrestricted File Upload Vulnerabilities via Adaptive Dynamic Testing. IEEE Transactions on Information Forensics and Security 19, 1251-1266. DOI: 10.1109/TIFS.2023.3335885. Online publication date: 1-Jan-2024
  • (2024) Holistic Concolic Execution for Dynamic Web Applications via Symbolic Interpreter Analysis. 2024 IEEE Symposium on Security and Privacy (SP), 222-238. DOI: 10.1109/SP54263.2024.00197. Online publication date: 19-May-2024
  • (2024) Undefined-oriented Programming: Detecting and Chaining Prototype Pollution Gadgets in Node.js Template Engines for Malicious Consequences. 2024 IEEE Symposium on Security and Privacy (SP), 4015-4033. DOI: 10.1109/SP54263.2024.00121. Online publication date: 19-May-2024
  • (2024) Methods to Improve API Performance in PHP Programming Language. Contributions Presented at The International Conference on Computing, Communication, Cybersecurity and AI, July 3–4, 2024, London, UK, 464-477. DOI: 10.1007/978-3-031-74443-3_28. Online publication date: 20-Dec-2024
  • (2023) VulPathsFinder: A Static Method for Finding Vulnerable Paths in PHP Applications Based on CPG. Applied Sciences 13:16, 9240. DOI: 10.3390/app13169240. Online publication date: 14-Aug-2023
  • (2023) An Enhanced Static Taint Analysis Approach to Detect Input Validation Vulnerability. Journal of King Saud University - Computer and Information Sciences 35:2, 682-701. DOI: 10.1016/j.jksuci.2023.01.009. Online publication date: 1-Feb-2023
  • (2022) HiddenCPG: Large-Scale Vulnerable Clone Detection Using Subgraph Isomorphism of Code Property Graphs. Proceedings of the ACM Web Conference 2022, 755-766. DOI: 10.1145/3485447.3512235. Online publication date: 25-Apr-2022
