DOI: 10.1145/3691620.3695014
research-article

LLM Meets Bounded Model Checking: Neuro-symbolic Loop Invariant Inference

Published: 27 October 2024

Abstract

Loop invariant inference, a key component of program verification, is challenging because the problem is undecidable in general and loops behave in complex ways in practice. Recently, machine-learning-based techniques have demonstrated impressive performance in generating loop invariants automatically. However, these methods rely heavily on labeled training data and are intrinsically random and uncertain, which leads to unstable performance. In this paper, we investigate a synergy of large language models (LLMs) and bounded model checking (BMC) to address these issues. The key observation is that, although an LLM may not return the correct loop invariant in a single response, it can usually provide all the individual predicates of the correct loop invariant across multiple responses. To this end, we propose a "query-filter-reassemble" strategy. We first leverage the language generation power of LLMs to produce a set of candidate invariants, which requires no training data. We then employ BMC to identify the valid predicates among these candidates; the valid predicates are assembled into new candidate invariants, which are checked by off-the-shelf SMT solvers. The feedback is incorporated into the prompt for the next round of LLM querying. We expand the existing benchmark from 133 programs to 316 programs, providing a more comprehensive testing ground. Experimental results demonstrate that our approach significantly outperforms state-of-the-art techniques, successfully generating loop invariants for 309 of the 316 programs, whereas the existing baseline methods handle 219 programs at best. The code is publicly available at https://github.com/SoftWiser-group/LaM4Inv.git.
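The "query-filter-reassemble" idea can be illustrated with a small Python sketch. It is not the authors' implementation: the toy program, the predicate table, and the two hard-coded "LLM responses" are hypothetical, the BMC step is approximated by unrolling the loop from its initial state up to a bound, and the SMT check is replaced by brute-force enumeration over a small integer box. The feedback round to the LLM is omitted.

```python
from itertools import product

# Hypothetical toy program (not from the paper's benchmark):
#   x = y = 0; while x < 5: x += 1; y += 2; assert y == 10
INIT = (0, 0)

def guard(s):
    return s[0] < 5

def step(s):
    return (s[0] + 1, s[1] + 2)

def post(s):
    return s[1] == 10

# Assumed predicate vocabulary; each name maps to a check on a state (x, y).
PREDICATES = {
    "y == 2*x": lambda s: s[1] == 2 * s[0],
    "x <= 3": lambda s: s[0] <= 3,
    "x <= 5": lambda s: s[0] <= 5,
    "y >= x + 1": lambda s: s[1] >= s[0] + 1,
}

# Stand-ins for two LLM responses, each a conjunction of predicates.
# Neither is a correct invariant as a whole.
RESPONSES = [["y == 2*x", "x <= 3"], ["x <= 5", "y >= x + 1"]]

def bounded_states(bound):
    """States reachable within `bound` loop iterations (BMC-style unrolling)."""
    states, s = [INIT], INIT
    for _ in range(bound):
        if not guard(s):
            break
        s = step(s)
        states.append(s)
    return states

def filter_predicates(responses, bound):
    """Filter: keep predicates that hold on every state within the bound."""
    reachable = bounded_states(bound)
    kept = []
    for response in responses:
        for name in response:
            if name not in kept and all(PREDICATES[name](s) for s in reachable):
                kept.append(name)
    return kept

def is_invariant(names, box=range(-20, 21)):
    """Stand-in for the SMT check: initiation, consecution, and
    postcondition, tested exhaustively over a small box of states."""
    def inv(s):
        return all(PREDICATES[n](s) for n in names)
    if not inv(INIT):
        return False
    for s in product(box, repeat=2):
        if inv(s) and guard(s) and not inv(step(s)):
            return False
        if inv(s) and not guard(s) and not post(s):
            return False
    return True

if __name__ == "__main__":
    survivors = filter_predicates(RESPONSES, bound=10)
    print("surviving predicates:", survivors)
    print("reassembled conjunction is an invariant:", is_invariant(survivors))
```

In this sketch neither response passes the check on its own, but each contributes one predicate that survives the bounded filter, and the conjunction of the survivors passes all three conditions. A real pipeline would replace `is_invariant` with a sound solver query, since enumeration over a finite box proves nothing about states outside it.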


    Published In

    ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering
    October 2024
    2587 pages
    ISBN:9798400712487
    DOI:10.1145/3691620

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. loop invariant
    2. program verification
    3. large language model

    Acceptance Rates

    Overall Acceptance Rate 82 of 337 submissions, 24%
