Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Abstract semantic differencing via speculative correlation

Published: 15 October 2014 Publication History

Abstract

We address the problem of computing semantic differences between a program and a patched version of the program. Our goal is to obtain a precise characterization of the difference between program versions, or establish their equivalence. We focus on infinite-state numerical programs, and use abstract interpretation to compute an over-approximation of program differences.
Computing differences and establishing equivalence under abstraction requires abstracting relationships between variables in the two programs. Towards that end, we use a correlating abstract domain to compute a sound approximation of these relationships which captures semantic difference. This approximation can be computed over any interleaving of the two programs. However, the choice of interleaving can significantly affect precision. We present a speculative search algorithm that aims to find an interleaving of the two programs with minimal abstract semantic difference. This method is unique as it allows the analysis to dynamically alternate between several interleavings.
We have implemented our approach and applied it to real-world examples including patches from Git, GNU Coreutils, as well as a few handpicked patches from the Linux kernel and the Mozilla Firefox web browser. Our evaluation shows that we compute precise approximations of semantic differences, and report few false differences.

References

[1]
Github has surpassed sourceforge and google code in popularity. http://readwrite.com/2011/06/02/github-has-passed-sourceforge.
[2]
D. Amit, N. Rinetzky, T. W. Reps, M. Sagiv, and E. Yahav. Comparison under abstraction for verifying linearizability. In CAV, pages 477--490, 2007.
[3]
R. Bagnara, P. M. Hill, and E. Zaffanella. Widening operators for powerset domains. STTT, 8(4-5):449--466, 2006.
[4]
M. Barnett, B.-Y. E. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, pages 364--387, 2005.
[5]
N. Benton. Simple relational correctness proofs for static analyses and program transformations. In POPL, pages 14--25, 2004.
[6]
D. Brumley, P. Poosankam, D. X. Song, and J. Zheng. Automatic patch-based exploit generation is possible: Techniques and implications. In IEEE Symposium on Security and Privacy, pages 143--157, 2008.
[7]
C. Cadar, D. Dunbar, and D. R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, pages 209--224, 2008.
[8]
S. Chaki, A. Gurfinkel, and O. Strichman. Regression verification for multi-threaded programs. In VMCAI, pages 119--135, 2012.
[9]
E. M. Clarke and D. Kroening. Hardware verification using ansi-c programs as a reference. In ASP-DAC, pages 308--311, 2003.
[10]
E. M. Clarke, D. Kroening, N. Sharygina, and K. Yorav. Predicate abstraction of ansi-c programs using sat. Formal Methods in System Design, 25(2-3):105--127, 2004.
[11]
P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238--252, 1977.
[12]
P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among variables of a program. In POPL, pages 84--96, 1978.
[13]
Y. David and E. Yahav. Tracelet-based code search in executables. In PLDI, page 37, 2014.
[14]
P. Godefroid, N. Klarlund, and K. Sen. Dart: directed automated random testing. In PLDI, pages 213--223, 2005.
[15]
B. Godlin and O. Strichman. Regression verification. In DAC, pages 466--471, 2009.
[16]
C. Hawblitzel, S. K. Lahiri, K. Pawar, H. Hashmi, S. Gokbulut, L. Fernando, D. Detlefs, and S. Wadsworth. Will you still compile me tomorrow? static cross-version compiler validation. In ESEC/FSE 2013, 2013.
[17]
C. A. R. Hoare. An axiomatic basis for computer programming. Commun. ACM, 12(10):576--580, 1969.
[18]
S. Horwitz. Identifying the semantic and textual differences between two versions of a program. In PLDI, pages 234--245, 1990.
[19]
S. Horwitz, J. Prins, and T. W. Reps. Integrating noninterfering versions of programs. ACM Trans. Program. Lang. Syst., 11(3):345--387, 1989.
[20]
J. W. Hunt and M. D. McIlroy. An algorithm for differential file comparison. Technical report, Bell Laboratories, 1975.
[21]
D. Jackson and D. A. Ladd. Semantic diff: A tool for summarizing the effects of modifications. In ICSM, pages 243--252, 1994.
[22]
W. Jin, A. Orso, and T. Xie. Bert: a tool for behavioral regression testing. In SIGSOFT FSE, pages 361--362, 2010.
[23]
A. Kuehlmann and F. Krohm. Equivalence checking using cuts and heaps. In DAC, pages 263--268, 1997.
[24]
S. K. Lahiri, K. Vaswani, and C. A. R. Hoare. Differential static analysis: opportunities, applications, and challenges. In FoSER, pages 201--204, 2010.
[25]
S. K. Lahiri, C. Hawblitzel, M. Kawaguchi, and H. Rebêlo. Symdiff: A language-agnostic semantic diff tool for imperative programs. In CAV, pages 712--717, 2012.
[26]
S. K. Lahiri, K. L. McMillan, R. Sharma, and C. Hawblitzel. Differential assertion checking. In ESEC/FSE 2013, 2013.
[27]
A. Miné. The octagon abstract domain. Higher-Order and Symbolic Computation, 19(1):31--100, 2006.
[28]
A. Mishchenko, S. Chatterjee, R. K. Brayton, and N. Eén. Improvements to combinational equivalence checking. In ICCAD, pages 836--843, 2006.
[29]
G. C. Necula. Translation validation for an optimizing compiler. In PLDI, pages 83--94, 2000.
[30]
N. Partush and E. Yahav. Abstract semantic differencing for numerical programs. In SAS, pages 238--258, 2013.
[31]
D. Peled. All from one, one for all: on model checking using representatives. In CAV, pages 409--423, 1993.
[32]
S. Person, M. B. Dwyer, S. G. Elbaum, and C. S. Pasareanu. Differential symbolic execution. In SIGSOFT FSE, pages 226--237, 2008.
[33]
A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In TACAS, pages 151--166, 1998.
[34]
D. A. Ramos and D. R. Engler. Practical, low-effort equivalence verification of real code. In CAV, pages 669--685, 2011.
[35]
X. Rival and L. Mauborgne. The trace partitioning abstract domain. ACM Trans. Program. Lang. Syst., 29(5), 2007.
[36]
R. Sharma, E. Schkufza, B. R. Churchill, and A. Aiken. Data-driven equivalence checking. In OOPSLA, pages 391--406, 2013.
[37]
Y. Song, Y. Zhang, and Y. Sun. Automatic vulnerability locating in binary patches. In CIS (2), pages 474--477, 2009.
[38]
T. Terauchi and A. Aiken. Secure information flow as a safety problem. In SAS, pages 352--367, 2005.
[39]
A. Valmari. Stubborn sets for reduced state space generation. In Applications and Theory of Petri Nets, pages 491--515, 1989.
[40]
P. Wolper and P. Godefroid. Partial-order methods for temporal verification. In CONCUR, pages 233--246, 1993.
[41]
L. D. Zuck, A. Pnueli, Y. Fang, B. Goldberg, and Y. Hu. Translation and run-time validation of optimized code. Electr. Notes Theor. Comput. Sci., 70(4):179--200, 2002.

Cited By

View all
  • (2024)Assembly Function Recognition in Embedded Systems as an Optimization ProblemMathematics10.3390/math1205065812:5(658)Online publication date: 23-Feb-2024
  • (2022)PEQtest: Testing Functional EquivalenceFundamental Approaches to Software Engineering10.1007/978-3-030-99429-7_11(184-204)Online publication date: 29-Mar-2022
  • (2022)SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android ApplicationsMobile and Ubiquitous Systems: Computing, Networking and Services10.1007/978-3-030-94822-1_18(322-346)Online publication date: 8-Feb-2022
  • Show More Cited By

Index Terms

  1. Abstract semantic differencing via speculative correlation

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 49, Issue 10
      OOPSLA '14
      October 2014
      907 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/2714064
      • Editor:
      • Andy Gill
      Issue’s Table of Contents
      • cover image ACM Conferences
        OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
        October 2014
        946 pages
        ISBN:9781450325851
        DOI:10.1145/2660193
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 15 October 2014
      Published in SIGPLAN Volume 49, Issue 10

      Check for updates

      Author Tags

      1. abstract interpretation
      2. differential analysis
      3. equivalence checking
      4. numerical domains
      5. patches
      6. semantic diff

      Qualifiers

      • Research-article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)14
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 04 Oct 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Assembly Function Recognition in Embedded Systems as an Optimization ProblemMathematics10.3390/math1205065812:5(658)Online publication date: 23-Feb-2024
      • (2022)PEQtest: Testing Functional EquivalenceFundamental Approaches to Software Engineering10.1007/978-3-030-99429-7_11(184-204)Online publication date: 29-Mar-2022
      • (2022)SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android ApplicationsMobile and Ubiquitous Systems: Computing, Networking and Services10.1007/978-3-030-94822-1_18(322-346)Online publication date: 8-Feb-2022
      • (2019)Analysis of Software Patches Using Numerical Abstract InterpretationStatic Analysis10.1007/978-3-030-32304-2_12(225-246)Online publication date: 8-Oct-2019
      • (2018)Incremental inference for probabilistic programsACM SIGPLAN Notices10.1145/3296979.319239953:4(571-585)Online publication date: 11-Jun-2018
      • (2018)DeepSim: deep learning code functional similarityProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3236024.3236068(141-151)Online publication date: 26-Oct-2018
      • (2018)Incremental inference for probabilistic programsProceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation10.1145/3192366.3192399(571-585)Online publication date: 11-Jun-2018
      • (2017)Verifiable semantic difference languagesProceedings of the 19th International Symposium on Principles and Practice of Declarative Programming10.1145/3131851.3131870(73-84)Online publication date: 9-Oct-2017
      • (2016)Qlose: Program Repair with Quantitative ObjectivesComputer Aided Verification10.1007/978-3-319-41540-6_21(383-401)Online publication date: 13-Jul-2016
      • (2015)A Mechanically Checked Generation of Correlating Programs Directed by Structured Syntactic DifferencesAutomated Technology for Verification and Analysis10.1007/978-3-319-24953-7_6(64-79)Online publication date: 22-Nov-2015
      • Show More Cited By

      View Options

      Get Access

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media