Stratified synthesis: automatically learning the x86-64 instruction set

Published: 02 June 2016 Publication History

Abstract

The x86-64 ISA sits at the bottom of the software stack of most desktop and server software. Because of its importance, many software analysis and verification tools depend, either explicitly or implicitly, on correct modeling of the semantics of x86-64 instructions. However, formal semantics for the x86-64 ISA are difficult to obtain and often written manually through great effort. We describe an automatically synthesized formal semantics of the input/output behavior for a large fraction of the x86-64 Haswell ISA’s many thousands of instruction variants. The key to our results is stratified synthesis, where we use a set of instructions whose semantics are known to synthesize the semantics of additional instructions whose semantics are unknown. As the set of formally described instructions increases, the synthesis vocabulary expands, making it possible to synthesize the semantics of increasingly complex instructions. Using this technique we automatically synthesized formal semantics for 1,795 instruction variants of the x86-64 Haswell ISA. We evaluate the learned semantics against manually written semantics (where available) and find that they are formally equivalent with the exception of 50 instructions, where the manually written semantics contain an error. We further find the learned formulas to be largely as precise as manually written ones and of similar size.
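
The stratification idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the machine has a single 8-bit register, the instruction names (`inc`, `not`, `neg`, `dec`, `sub2`) are hypothetical, candidate programs are found by brute-force enumeration, and exhaustive testing over all 256 inputs stands in for whatever search and equivalence checking a real system would use. The point it shows is that a target out of reach of the initial base set becomes reachable once earlier targets have been learned and added to the vocabulary.

```python
from itertools import product

MASK = 0xFF   # toy machine: a single 8-bit register (assumption)
MAX_LEN = 2   # longest candidate program the search will try

# Base set: instructions whose semantics are taken as known.
base = {
    "inc": lambda x: (x + 1) & MASK,
    "not": lambda x: ~x & MASK,
}

# Targets: black boxes that can only be executed, standing in for
# running an instruction with unknown semantics on real hardware.
targets = {
    "sub2": lambda x: (x - 2) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "neg": lambda x: -x & MASK,
}

def run(program, vocab, x):
    """Execute a sequence of instruction names on register value x."""
    for name in program:
        x = vocab[name](x)
    return x

def synthesize(oracle, vocab):
    """Return a shortest program over vocab that matches oracle on all
    256 inputs, or None if none exists within MAX_LEN instructions."""
    for length in range(1, MAX_LEN + 1):
        for program in product(sorted(vocab), repeat=length):
            if all(run(program, vocab, x) == oracle(x)
                   for x in range(MASK + 1)):
                return program
    return None

def stratified_synthesis(base, targets):
    """Learn targets in strata; each learned target joins the vocabulary."""
    vocab = dict(base)
    pending = dict(targets)
    learned = {}
    stratum = 0
    while pending:
        stratum += 1
        found = {}
        for name, oracle in pending.items():
            program = synthesize(oracle, vocab)
            if program is not None:
                found[name] = program
        if not found:
            break  # no progress: remaining targets are out of reach
        for name, program in found.items():
            learned[name] = (stratum, program)
            vocab[name] = pending.pop(name)  # expand the vocabulary
    return learned, pending

def expand(program, learned):
    """Inline learned instructions until only base instructions remain."""
    out = []
    for name in program:
        if name in learned:
            out.extend(expand(learned[name][1], learned))
        else:
            out.append(name)
    return tuple(out)

learned, pending = stratified_synthesis(base, targets)
for name, (stratum, program) in sorted(learned.items(), key=lambda kv: kv[1][0]):
    print(stratum, name, program, expand(program, learned))
```

With programs capped at two instructions, only `neg` is found from the base set alone; `dec` is found once `neg` is available, and `sub2` once `dec` is available. Expanding a learned program back to base instructions gives its semantics purely in terms of the instructions that were known at the start.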



      Published In

      ACM SIGPLAN Notices, Volume 51, Issue 6
      PLDI '16
      June 2016
      726 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2980983
      Editor: Andy Gill

      PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
      June 2016
      726 pages
      ISBN: 9781450342612
      DOI: 10.1145/2908080
      General Chair: Chandra Krintz
      Program Chair: Emery Berger

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. ISA specification
      2. program synthesis
      3. x86-64


      Cited By

      • (2024) Synthetiq: Fast and Versatile Quantum Circuit Synthesis. Proceedings of the ACM on Programming Languages 8:OOPSLA1 (55-82). DOI: 10.1145/3649813. Online publication date: 29-Apr-2024
      • (2023) A High-Coverage and Efficient Instruction-Level Testing Approach for x86 Processors. IEEE Transactions on Computers 72:11 (3203-3217). DOI: 10.1109/TC.2023.3288762. Online publication date: 1-Nov-2023
      • (2021) Adaptive restarts for stochastic synthesis. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (696-709). DOI: 10.1145/3453483.3454071. Online publication date: 19-Jun-2021
      • (2020) Cognification of Program Synthesis—A Systematic Feature-Oriented Analysis and Future Direction. Computers 9:2 (27). DOI: 10.3390/computers9020027. Online publication date: 12-Apr-2020
      • (2020) Sound C Code Decompilation for a Subset of x86-64 Binaries. Software Engineering and Formal Methods (247-264). DOI: 10.1007/978-3-030-58768-0_14. Online publication date: 8-Sep-2020
      • (2020) Highly Automated Formal Proofs over Memory Usage of Assembly Code. Tools and Algorithms for the Construction and Analysis of Systems (98-117). DOI: 10.1007/978-3-030-45237-7_6. Online publication date: 17-Apr-2020
      • (2019) Semantic program alignment for equivalence checking. Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation (1027-1040). DOI: 10.1145/3314221.3314596. Online publication date: 8-Jun-2019
      • (2019) On the verification of system-level information flow properties for virtualized execution platforms. Journal of Cryptographic Engineering. DOI: 10.1007/s13389-019-00216-4. Online publication date: 25-May-2019
      • (2019) Formal Semantics Extraction from Natural Language Specifications for ARM. Formal Methods – The Next 30 Years (465-483). DOI: 10.1007/978-3-030-30942-8_28. Online publication date: 23-Sep-2019
      • (2018) Cross-Architecture Lifter Synthesis. Software Engineering and Formal Methods (155-170). DOI: 10.1007/978-3-319-92970-5_10. Online publication date: 30-May-2018
