research-article

Abstract semantic differencing via speculative correlation

Authors:

Nimrod Partush,

Eran Yahav

ACM SIGPLAN Notices, Volume 49, Issue 10

Pages 811 - 828

https://doi.org/10.1145/2714064.2660245

Published: 15 October 2014

Abstract

We address the problem of computing semantic differences between a program and a patched version of the program. Our goal is to obtain a precise characterization of the difference between program versions, or establish their equivalence. We focus on infinite-state numerical programs, and use abstract interpretation to compute an over-approximation of program differences.

Computing differences and establishing equivalence under abstraction requires abstracting relationships between variables in the two programs. Towards that end, we use a correlating abstract domain to compute a sound approximation of these relationships which captures semantic difference. This approximation can be computed over any interleaving of the two programs. However, the choice of interleaving can significantly affect precision. We present a speculative search algorithm that aims to find an interleaving of the two programs with minimal abstract semantic difference. This method is unique as it allows the analysis to dynamically alternate between several interleavings.
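As a rough illustration of why the choice of interleaving can affect precision, the following Python sketch analyzes two identical toy programs under two different interleavings and keeps the one with the smaller abstract difference. It is not the paper's algorithm or its correlating domain: the interval abstraction, the toy programs, the exhaustive comparison of two fixed candidates, and all names (`assign`, `analyze`, `width`, `candidates`) are assumptions made for illustration only.

```python
import math

INF = math.inf
TOP = (-INF, INF)  # interval carrying no information


def join(a, b):
    """Least upper bound of two intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def assign(state, var, c):
    """Abstract effect of `var = c`; var is "x" (original) or "x'" (patched)."""
    x, xp, _ = state
    if var == "x":
        x = (c, c)
    else:
        xp = (c, c)
    # d over-approximates x' - x and is recomputed from the two intervals.
    d = (xp[0] - x[1], xp[1] - x[0])
    return (x, xp, d)


def analyze(blocks):
    """Run every path of each block from the current state, then join."""
    # Assumption: both versions start from the same unknown input value.
    state = (TOP, TOP, (0, 0))
    for paths in blocks:
        results = []
        for path in paths:
            s = state
            for var, c in path:
                s = assign(s, var, c)
            results.append(s)
        state = results[0]
        for s in results[1:]:
            state = tuple(join(a, b) for a, b in zip(state, s))
    return state


def width(interval):
    """Size of an interval; 0 means the difference is known exactly."""
    return interval[1] - interval[0]


# Both versions compute `if c: v = 1 else: v = 2` over the same condition c.
candidates = {
    # Analyze all of the original program, join, then the patched one.
    "sequential": [
        [[("x", 1)], [("x", 2)]],
        [[("x'", 1)], [("x'", 2)]],
    ],
    # Step both programs together inside each branch, then join once.
    "lockstep": [
        [[("x", 1), ("x'", 1)], [("x", 2), ("x'", 2)]],
    ],
}

# Pick the candidate interleaving with the smallest abstract difference.
best = min(candidates, key=lambda name: width(analyze(candidates[name])[2]))
for name, blocks in candidates.items():
    print(name, analyze(blocks)[2])
print("chosen:", best)
```

In this toy setting the sequential interleaving reports a difference in the interval (-1, 1), a false difference caused by joining before the two versions are related, while the lockstep interleaving reports (0, 0) and is the one selected.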

We have implemented our approach and applied it to real-world examples including patches from Git, GNU Coreutils, as well as a few handpicked patches from the Linux kernel and the Mozilla Firefox web browser. Our evaluation shows that we compute precise approximations of semantic differences, and report few false differences.

Index Terms

Abstract semantic differencing via speculative correlation
1. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program analysis
    2. Program semantics

Published In

ACM SIGPLAN Notices Volume 49, Issue 10

OOPSLA '14

October 2014

907 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/2714064

Editor:
Andy Gill
University of Kansas, Lawrence, KS

OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
October 2014
946 pages
ISBN:9781450325851
DOI:10.1145/2660193
General Chair:
Andrew Black
Portland State University, USA
Program Chair:
Todd Millstein
University of California, Los Angeles, USA

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States
