research-article
Open access

A Constraint Solving Approach to Parikh Images of Regular Languages

Published: 29 April 2024

Abstract

A common problem in string constraint solvers is computing the Parikh image, a linear arithmetic formula that describes all possible combinations of character counts in strings of a given language. Automata-based string solvers frequently need to compute the Parikh image of products (or intersections) of finite-state automata, in particular when solving string constraints that also include the integer data type due to operations like string length and indexing. In this context, the computation of Parikh images often turns out to be both prohibitively slow and memory-intensive. This paper contributes a new understanding of how reasoning about Parikh images can be cast as a constraint solving problem, and how questions about Parikh images can be answered without explicitly computing the product automaton or the exact Parikh image. The paper shows how this formulation can be efficiently implemented as a calculus, PC*, embedded in an automated theorem prover supporting Presburger logic. The resulting standalone tool Catra is evaluated on constraints produced by the Ostrich+ string solver when solving standard string constraint benchmarks involving integer operations. The experiments show that PC* strictly outperforms the standard approach by Verma et al. to extract Parikh images from finite-state automata, outperforms the over-approximating method recently described by Janků and Turoňová by a wide margin, and, for realistic timeouts (under 60 s), also outperforms the nuXmv model checker. When added as the Parikh image backend of Ostrich+ to the Ostrich string constraint solver’s portfolio, it boosts its results on the quantifier-free strings with linear integer algebra track of SMT-COMP 2023 (QF_SLIA) enough to solve the most Unsat instances in that track of all competitors.
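
To make the notion of a Parikh image concrete, the following minimal Python sketch enumerates the Parikh vectors of all words up to a bounded length accepted by a tiny automaton. It is an illustration only: it is not the PC* calculus, it does not use Catra or Ostrich+, and the names `parikh_vector`, `bounded_parikh_image`, and the example automaton for (ab)* are hypothetical choices made here. It also relies on exactly the kind of explicit enumeration that the paper sets out to avoid.

```python
from collections import Counter


def parikh_vector(word, alphabet):
    """Count how often each letter of `alphabet` occurs in `word`."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def bounded_parikh_image(transitions, initial, accepting, alphabet, max_len):
    """Parikh vectors of all accepted words of length <= max_len.

    `transitions` maps (state, letter) to a set of successor states.
    This is a bounded under-approximation, not the exact Parikh image.
    """
    image = set()
    # Frontier of (state, vector) pairs; the order of letters is forgotten.
    frontier = {(initial, tuple(0 for _ in alphabet))}
    for _ in range(max_len + 1):
        image |= {vec for state, vec in frontier if state in accepting}
        next_frontier = set()
        for state, vec in frontier:
            for i, letter in enumerate(alphabet):
                for succ in transitions.get((state, letter), ()):
                    bumped = vec[:i] + (vec[i] + 1,) + vec[i + 1:]
                    next_frontier.add((succ, bumped))
        frontier = next_frontier
    return image


# Hypothetical automaton for the language (ab)*: q0 -a-> q1 -b-> q0.
ALPHABET = ("a", "b")
AB_STAR = {("q0", "a"): {"q1"}, ("q1", "b"): {"q0"}}

image = bounded_parikh_image(AB_STAR, "q0", {"q0"}, ALPHABET, max_len=6)
```

Within the length bound, every vector in `image` has equal counts of `a` and `b`. The exact Parikh image of (ab)* would express this for all lengths as a linear arithmetic constraint stating that the two counts are equal.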

References

[1]
Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Bui Phi Diep, Lukáš Holík, Ahmed Rezine, and Philipp Rümmer. 2017. Flatten and conquer: a framework for efficient analysis of string constraints. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). Association for Computing Machinery, New York, NY, USA. 602–617. isbn:9781450349888 https://doi.org/10.1145/3062341.3062384
[2]
Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Lukáš Holík, Ahmed Rezine, Philipp Rümmer, and Jari Stenman. 2015. Norn: An SMT Solver for String Constraints. In Computer Aided Verification, Daniel Kroening and Corina S. Păsăreanu (Eds.). Springer International Publishing, Cham. 462–469. isbn:978-3-319-21690-4 https://doi.org/10.1007/978-3-319-21690-4_29
[3]
Haniel Barbosa, Clark Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. 2022. cvc5: A Versatile and Industrial-Strength SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems, Dana Fisman and Grigore Rosu (Eds.). Springer International Publishing, Cham. 415–442. isbn:978-3-030-99524-9 https://doi.org/10.1007/978-3-030-99524-9_24
[4]
Murphy Berzish, Vijay Ganesh, and Yunhui Zheng. 2017. Z3str3: A String Solver with Theory-aware Heuristics. In 2017 Formal Methods in Computer Aided Design (FMCAD). IEEE, Vienna, Austria. 55–59. https://doi.org/10.23919/FMCAD.2017.8102241
[5]
Murphy Berzish, Mitja Kulczynski, Federico Mora, Florin Manea, Joel D. Day, Dirk Nowotka, and Vijay Ganesh. 2021. An SMT Solver for Regular Expressions and Linear Arithmetic over String Length. In Computer Aided Verification, Alexandra Silva and K. Rustan M. Leino (Eds.). Springer International Publishing, Cham. 289–312. isbn:978-3-030-81688-9 https://doi.org/10.1007/978-3-030-81688-9_14
[6]
Tevfik Bultan, Fang Yu, Muath Alkhalaf, and Abdulbaki Aydin. 2017. String Analysis for Software Verification and Security. Springer, Cham. isbn:978-3-319-68668-4 https://doi.org/10.1007/978-3-319-68670-7
[7]
Michaël Cadilhac, Alain Finkel, and Pierre McKenzie. 2011. On the expressiveness of Parikh automata and related models. arxiv:1101.1547.
[8]
Roberto Cavada, Alessandro Cimatti, Michele Dorigatti, Alberto Griggio, Alessandro Mariotti, Andrea Micheli, Sergio Mover, Marco Roveri, and Stefano Tonetta. 2014. The nuXmv Symbolic Model Checker. In Computer Aided Verification, Armin Biere and Roderick Bloem (Eds.). Springer International Publishing, Cham. 334–342. isbn:978-3-319-08867-9 https://doi.org/10.1007/978-3-319-08867-9_22
[9]
Taolue Chen, Matthew Hague, Jinlong He, Denghang Hu, Anthony Widjaja Lin, Philipp Rümmer, and Zhilin Wu. 2020. A Decision Procedure for Path Feasibility of String Manipulating Programs with Integer Data Type. In Automated Technology for Verification and Analysis, Dang Van Hung and Oleg Sokolsky (Eds.). Springer International Publishing, Cham. 325–342. isbn:978-3-030-59152-6 https://doi.org/10.1007/978-3-030-59152-6_18
[10]
Taolue Chen, Matthew Hague, Anthony W. Lin, Philipp Rümmer, and Zhilin Wu. 2019. Decision procedures for path feasibility of string-manipulating programs with complex operations. Proc. ACM Program. Lang., 3, POPL (2019), Article 49, jan, 30 pages. https://doi.org/10.1145/3290362
[11]
Leonardo de Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems, C. R. Ramakrishnan and Jakob Rehof (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 337–340. isbn:978-3-540-78800-3 https://doi.org/10.1007/978-3-540-78800-3_24
[12]
Javier Esparza, Pierre Ganty, Stefan Kiefer, and Michael Luttenberger. 2011. Parikh’s theorem: A simple and direct automaton construction. Inform. Process. Lett., 111, 12 (2011), 614–619. issn:0020-0190 https://doi.org/10.1016/j.ipl.2011.03.019
[13]
Diego Figueira and Leonid Libkin. 2015. Path Logics for Querying Graphs: Combining Expressiveness and Efficiency. In 2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science. IEEE, Kyoto, Japan. 329–340. https://doi.org/10.1109/LICS.2015.39
[14]
Melvin C. Fitting. 1996. First-Order Logic and Automated Theorem Proving (2nd ed.). Springer-Verlag, New York, NY, USA. isbn:978-1-4612-7515-2
[15]
John Harrison. 2009. Handbook of Practical Logic and Automated Reasoning. Cambridge University Press, Shaftesbury Road, Cambridge, UK. isbn:978-0-521-89957-4
[16]
Lukáš Holík, Petr Janků, Anthony W. Lin, Philipp Rümmer, and Tomáš Vojnar. 2017. String constraints with concatenation and transducers solved efficiently. Proc. ACM Program. Lang., 2, POPL (2017), Article 4, dec, 32 pages. https://doi.org/10.1145/3158092
[17]
Petr Janků and Lenka Turoňová. 2020. Solving String Constraints with Approximate Parikh Image. In Computer Aided Systems Theory – EUROCAST 2019, Roberto Moreno-Díaz, Franz Pichler, and Alexis Quesada-Arencibia (Eds.). Springer International Publishing, Cham. 491–498. isbn:978-3-030-45093-9 https://doi.org/10.1007/978-3-030-45093-9_59
[18]
Juhani Karhumäki. 1980. Generalized Parikh mappings and homomorphisms. Information and Control, 47, 3 (1980), 155–165. issn:0019-9958 https://doi.org/10.1016/S0019-9958(80)90493-3
[19]
Felix Klaedtke and Harald Rueß. 2002. Parikh Automata and Monadic Second-Order Logics with Linear Cardinality Constraints. Albert-Ludwigs-Universität Freiburg. https://tr.informatik.uni-freiburg.de/reports/report177/report00177.ps.gz
[20]
Dexter C. Kozen. 1997. Automata and computability. Springer, New York. isbn:0387949070
[21]
Giovanna J. Lavado, Giovanni Pighizzini, and Shinnosuke Seki. 2013. Converting nondeterministic automata and context-free grammars into Parikh equivalent one-way and two-way deterministic automata. Information and Computation, 228-229 (2013), 1–15. issn:0890-5401 https://doi.org/10.1016/j.ic.2013.06.003
[22]
Michael Luby, Alistair Sinclair, and David Zuckerman. 1993. Optimal speedup of Las Vegas algorithms. Inform. Process. Lett., 47, 4 (1993), 173–180. issn:0020-0190 https://doi.org/10.1016/0020-0190(93)90029-9
[23]
Kimbal Marriott and Peter Stuckey. 1998. Programming with Constraints: An Introduction. The MIT Press, Cambridge, MA, USA. isbn:9780262279161 https://doi.org/10.7551/mitpress/5625.001.0001
[24]
Federico Mora, Murphy Berzish, Mitja Kulczynski, Dirk Nowotka, and Vijay Ganesh. 2021. Z3str4: A Multi-armed String Solver. In Formal Methods, Marieke Huisman, Corina Păsăreanu, and Naijun Zhan (Eds.). Springer International Publishing, Cham. 389–406. isbn:978-3-030-90870-6 https://doi.org/10.1007/978-3-030-90870-6_21
[25]
Rohit J. Parikh. 1966. On Context-Free Languages. J. ACM, 13, 4 (1966), oct, 570–581. issn:0004-5411 https://doi.org/10.1145/321356.321364
[26]
Rodrigo Raya. 2023. Decision Procedures for Power Structures. Ph. D. Dissertation. EPFL. Lausanne. https://doi.org/10.5075/epfl-thesis-10546
[27]
Andrew Reynolds, Maverick Woo, Clark Barrett, David Brumley, Tianyi Liang, and Cesare Tinelli. 2017. Scaling Up DPLL(T) String Solvers Using Context-Dependent Simplification. In Computer Aided Verification, Rupak Majumdar and Viktor Kunčak (Eds.). Springer International Publishing, Cham. 453–474. isbn:978-3-319-63390-9 https://doi.org/10.1007/978-3-319-63390-9_24
[28]
Philipp Rümmer. 2008. A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic. In Logic for Programming, Artificial Intelligence, and Reasoning, Iliano Cervesato, Helmut Veith, and Andrei Voronkov (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 274–289. isbn:978-3-540-89439-1 https://doi.org/10.1007/978-3-540-89439-1_20
[29]
Prateek Saxena, Devdatta Akhawe, Steve Hanna, Feng Mao, Stephen McCamant, and Dawn Song. 2010. A Symbolic Execution Framework for JavaScript. In 2010 IEEE Symposium on Security and Privacy. IEEE, Oakland, CA, USA. 513–528. https://doi.org/10.1109/SP.2010.38
[30]
Helmut Seidl, Thomas Schwentick, Anca Muscholl, and Peter Habermehl. 2004. Counting in Trees for Free. In Automata, Languages and Programming, Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 1136–1149. isbn:978-3-540-27836-8 https://doi.org/10.1007/978-3-540-27836-8_94
[31]
Rani Siromoney and V. Rajkumar Dare. 1985. A generalization of the Parikh vector for finite and infinite words. In Foundations of Software Technology and Theoretical Computer Science, S. N. Maheshwari (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg. 290–302. isbn:978-3-540-39722-9
[32]
SMT-COMP. 2023. SMT-COMP 2023 Results. https://smt-comp.github.io/2023/results.html
[33]
Daniel Stan and Anthony W. Lin. 2021. Regular Model Checking Approach to Knowledge Reasoning over Parameterized Systems. In Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS ’21). International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC. 1254–1262. isbn:9781450383073 https://doi.org/10.5555/3463952.3464097
[34]
Caleb Stanford, Margus Veanes, and Nikolaj Bjørner. 2021. Symbolic Boolean Derivatives for Efficiently Solving Extended Regular Expression Constraints. In PLDI 21. Association for Computing Machinery, New York, NY, USA. 620–635. https://doi.org/10.1145/3453483.3454066
[35]
Amanda Stjerna and Philipp Rümmer. 2024. Reproduction Package for ‘A Constraint Solving Approach to Parikh Images of Regular Languages’. https://doi.org/10.5281/zenodo.10796555
[36]
Kumar Neeraj Verma, Helmut Seidl, and Thomas Schwentick. 2005. On the Complexity of Equational Horn Clauses. In Automated Deduction – CADE-20, Robert Nieuwenhuis (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg. 337–352. isbn:978-3-540-31864-4 https://doi.org/10.1007/11532231_25
[37]
Yunhui Zheng, Xiangyu Zhang, and Vijay Ganesh. 2013. Z3-str: a z3-based string solver for web application analysis. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2013). Association for Computing Machinery, New York, NY, USA. 114–124. isbn:9781450322379 https://doi.org/10.1145/2491411.2491456

Published In

Proceedings of the ACM on Programming Languages  Volume 8, Issue OOPSLA1
April 2024
1492 pages
EISSN:2475-1421
DOI:10.1145/3554316
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 April 2024
Published in PACMPL Volume 8, Issue OOPSLA1

Author Tags

  1. Parikh images
  2. model checking
  3. string solvers

Qualifiers

  • Research-article

Funding Sources

  • Vetenskapsrådet
  • Swedish Foundation for Strategic Research
  • Wallenberg Science Foundation
