A decision procedure for subset constraints over regular languages

Published: 15 June 2009

Abstract

Reasoning about string variables, in particular program inputs, is an important aspect of many program analyses and testing frameworks. Program inputs invariably arrive as strings, and are often manipulated using high-level string operations such as equality checks, regular expression matching, and string concatenation. It is difficult to reason about these operations because they are not well-integrated into current constraint solvers.
We present a decision procedure that solves systems of equations over regular language variables. Given such a system of constraints, our algorithm finds satisfying assignments for the variables in the system. We define this problem formally and render a mechanized correctness proof of the core of the algorithm. We evaluate its scalability and practical utility by applying it to the problem of automatically finding inputs that cause SQL injection vulnerabilities.
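To make the idea of a satisfying assignment concrete, here is a minimal sketch of the simplest case only: one variable v constrained by v ⊆ c1 and v ⊆ c2, where c1 and c2 are regular languages given as DFAs. The largest language satisfying both constraints is the intersection of c1 and c2, and the system has a non-empty solution exactly when that intersection accepts some word. This is not the paper's algorithm, which also has to handle concatenation; the DFA class, the helper names, and the example languages below are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DFA:
    """Partial DFA; a missing transition means the word is rejected."""
    start: object
    accepting: frozenset
    delta: dict  # maps (state, symbol) -> next state

    def accepts(self, word):
        state = self.start
        for symbol in word:
            if (state, symbol) not in self.delta:
                return False
            state = self.delta[(state, symbol)]
        return state in self.accepting


def intersect(a, b):
    """Product construction: accepts exactly the words accepted by both."""
    start = (a.start, b.start)
    delta, accepting = {}, set()
    seen, queue = {start}, deque([start])
    while queue:
        p, q = queue.popleft()
        if p in a.accepting and q in b.accepting:
            accepting.add((p, q))
        for (src, symbol), p_next in a.delta.items():
            if src != p or (q, symbol) not in b.delta:
                continue
            nxt = (p_next, b.delta[(q, symbol)])
            delta[((p, q), symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return DFA(start, frozenset(accepting), delta)


def shortest_witness(m):
    """Breadth-first search for a shortest accepted word; None if empty."""
    seen, queue = {m.start}, deque([(m.start, "")])
    while queue:
        state, word = queue.popleft()
        if state in m.accepting:
            return word
        for (src, symbol), nxt in m.delta.items():
            if src == state and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None


# Hypothetical constant languages over the alphabet {a, b}.
contains_a = DFA(0, frozenset({1}),
                 {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1})
ends_in_b = DFA(0, frozenset({1}),
                {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1})
only_b = DFA(0, frozenset({0}), {(0, "b"): 0})

# v ⊆ contains_a and v ⊆ ends_in_b: the maximal assignment for v.
solution = intersect(contains_a, ends_in_b)
witness = shortest_witness(solution)

# v ⊆ contains_a and v ⊆ only_b: only the empty language satisfies both.
unsat = intersect(contains_a, only_b)
```

In the SQL injection setting the abstract mentions, one constant would plausibly describe the strings an input filter lets through and the other the strings that form an attack, so a witness from the intersection is a candidate exploit input. That reading is an interpretation of the abstract, not a description of the paper's implementation.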

Information

Published In

ACM SIGPLAN Notices  Volume 44, Issue 6
PLDI '09
June 2009
478 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1543135
  • PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2009
    492 pages
    ISBN:9781605583921
    DOI:10.1145/1542476
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2009
Published in SIGPLAN Volume 44, Issue 6

Author Tags

  1. decision procedure
  2. regular language

Qualifiers

  • Research-article

Cited By

  • (2019) Segate. Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, pages 200-212. DOI: 10.1109/ASE.2019.00028. Online publication date: 10-Nov-2019
  • (2017) Automated String Constraints Solving for Programs Containing String Manipulation Functions. Journal of Computer Science and Technology, 32:6, pages 1125-1135. DOI: 10.1007/s11390-017-1787-y. Online publication date: 8-Dec-2017
  • (2015) Automata-Based Model Counting for String Constraints. Computer Aided Verification, pages 255-272. DOI: 10.1007/978-3-319-21690-4_15. Online publication date: 16-Jul-2015
  • (2015) Constraint Solving on Bounded String Variables. Integration of AI and OR Techniques in Constraint Programming, pages 375-392. DOI: 10.1007/978-3-319-18008-3_26. Online publication date: 16-Apr-2015
  • (2014) S3. Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 1232-1243. DOI: 10.1145/2660267.2660372. Online publication date: 3-Nov-2014
  • (2014) Towards testing of full-scale SQL applications using relational symbolic execution. Proceedings of the 6th International Workshop on Constraints in Software Testing, Verification, and Analysis, pages 12-17. DOI: 10.1145/2593735.2593738. Online publication date: 31-May-2014
  • (2024) Mata: A Fast and Simple Finite Automata Library. Tools and Algorithms for the Construction and Analysis of Systems, pages 130-151. DOI: 10.1007/978-3-031-57249-4_7. Online publication date: 5-Apr-2024
  • (2023) Reasoning About Regular Properties: A Comparative Study. Automated Deduction – CADE 29, pages 286-306. DOI: 10.1007/978-3-031-38499-8_17. Online publication date: 2-Sep-2023
  • (2021) A Survey on String Constraint Solving. ACM Computing Surveys, 55:1, pages 1-38. DOI: 10.1145/3484198. Online publication date: 23-Nov-2021
  • (2018) Counting Algorithms for Recognizable and Algebraic Series. IEICE Transactions on Information and Systems, E101.D:6, pages 1479-1490. DOI: 10.1587/transinf.2017FOP0003. Online publication date: 1-Jun-2018
