Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/2908080.2908081acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article
Public Access

Into the depths of C: elaborating the de facto standards

Published: 02 June 2016 Publication History

Abstract

C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.

References

[1]
Programming Languages — C. 2011.
[2]
ISO/IEC 9899:2011. A non-final but recent version is available at www.open-std.org/jtc1/sc22/wg14/docs/n1539.pdf.
[3]
ANSI. Programming Languages – C: ANSI X3.159-1989. 1989.
[4]
M. Batty, S. Owens, S. Sarkar, P. Sewell, and T. Weber. Mathematizing C++ concurrency. In Proc. POPL, 2011.
[5]
P. Becker, editor. Programming Languages — C++. 2011. ISO/IEC 14882:2011.
[6]
A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C. Henri-Gros, A. Kamsky, S. McPeak, and D. Engler. A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM, 53(2), 2010.
[7]
F. Besson, S. Blazy, and P. Wilke. A precise and abstract memory model for C using symbolic values. In APLAS, 2014.
[8]
F. Besson, S. Blazy, and P. Wilke. A concrete memory model for CompCert. In Proc. ITP, 2015.
[9]
H.-J. Boehm and S. Adve. Foundations of the C++ concurrency memory model. In Proc. PLDI, 2008.
[10]
G. Canet, P. Cuoq, and B. Monate. A value analysis for C programs. In Proc. 9th IEEE Intl. Working Conference on Source Code Analysis and Manipulation, SCAM ’09, 2009.
[11]
D. Chisnall, J. Matthiesen, K. Memarian, K. Nienhuis, P., and R.N.M. Watson. C memory object and value semantics: the space of de facto and ISO standards, 2016. http://www.cl.cam.ac.uk/~pes20/cerberus/.
[12]
D. Chisnall, C. Rothwell, R. N. M. Watson, J. Woodruff, M. Vadera, S. W. Moore, M. Roe, B. Davis, and P. G. Neumann. Beyond the PDP-11: Architectural support for a memory-safe C abstract machine. In Proc. ASPLOS, 2015.
[13]
E. Cohen, M. Moskal, S. Tobies, and W. Schulte. A precise yet efficient memory model for C. Electron. Notes Theor. Comput. Sci. (SSV 2009), 254:85–103, October 2009.
[14]
J. Cook and S. Subramanian. A formal semantics for C in Nqthm. Technical Report 517D, Trusted Information Systems, October, 1994.
[15]
J. Criswell, N. Geoffray, and V. Adve. Memory safety for low-level software/hardware interactions. In Proceedings of the Eighteenth Usenix Security Symposium, August 2009.
[16]
J. Devietti, C. Blundell, M. M. K. Martin, and S. Zdancewic. Hardbound: Architectural support for spatial safety of the C programming language. In Proc. ASPLOS, 2008.
[17]
C. Ellison and G. Ro¸su. An executable formal semantics of C with applications. In Proc. POPL, 2012.
[18]
Y. Gurevich and J. K. Huggins. The semantics of the C programming language. In Proc. CSL ’92, 1993.
[19]
C. Hathhorn, C. Ellison, and G. Rosu. Defining the undefinedness of C. In Proc. PLDI, 2015.
[20]
Charles McEwen Ellison III. A Formal Semantics of C with Applications. PhD thesis, University of Illinois at Urbana-Champaign, 2012.
[21]
Runtime Verification Inc. Rv-match v0.1. https://runtimeverification.com/match/download/, 2016. Downloaded 2016-03-11.
[22]
Intel Plc. Introduction to Intel memory protection extensions, July 2013.
[23]
T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In Proc. USENIX ATC, 2002.
[24]
J. Kang, C.-K. Hur, W. Mansky, D. Garbuzov, S. Zdancewic, and V. Vafeiadis. A formal C memory model supporting integer-pointer casts. In Proc. PLDI, 2015.
[25]
B. W. Kernighan and D. M. Ritchie. The C Programming Language (ANSI C). Prentice Hall, 2nd edition, 1988.
[26]
R. Krebbers. Aliasing restrictions of C11 formalized in Coq. In Proc. CPP, LNCS 8307, 2013.
[27]
R. Krebbers. An operational and axiomatic semantics for nondeterminism and sequence points in C. In Proc. POPL, 2014.
[28]
R. Krebbers. The C standard formalized in Coq. PhD thesis, Radboud University Nijmegen, December 2015.
[29]
R. Krebbers and F. Wiedijk. Separation logic for non-local control flow and block scope variables. In FoSSaCS, 2013.
[30]
R. Krebbers and F. Wiedijk. A typed C11 semantics for interactive theorem proving. In Proc. CPP, 2015.
[31]
X. Leroy, A. W. Appel, S. Blazy, and G. Stewart. The CompCert memory model, version 2. Research report RR-7987, INRIA, June 2012.
[32]
X. Leroy and S. Blazy. Formal verification of a C-like memory model and its uses for verifying program transformations. Journal of Automated Reasoning, 41(1):1–31, 2008.
[33]
K. Memarian, J. Matthiesen, K. Nienhuis, V. B. F. Gomes, J. Lingard, D. Chisnall, R. N. M. Watson, and P. Sewell. Cerberus, 2016.
[34]
www.cl.cam.ac.uk/~pes20/cerberus, www.repository.cam.ac.uk/handle/1810/255730.
[35]
D. P. Mulligan, S. Owens, K. E. Gray, T. Ridge, and P. Sewell. Lem: reusable engineering of real-world semantics. In Proc. ICFP, 2014.
[36]
S. Nagarakatte, J. Zhao, M. M.K. Martin, and S. Zdancewic. SoftBound: highly compatible and complete spatial memory safety for C. In Proc. PLDI, 2009.
[37]
G. C. Necula, S. McPeak, and W. Weimer. CCured: type-safe retrofitting of legacy code. In Proc. POPL, 2002.
[38]
N. Nethercote and J. Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In PLDI, 2007.
[39]
K. Nienhuis, K. Memarian, and P. Sewell. An operational semantics for C/C++11 concurrency, 2016. Draft available at www.cl.cam.ac.uk/~pes20/cerberus.
[40]
M. Norrish. C formalised in HOL. Technical Report UCAMCL-TR-453, U. Cambridge, Computer Laboratory, 1998.
[41]
M. Norrish. Deterministic expressions in C. In ESOP, 1999.
[42]
N. S Papaspyrou. A formal semantics for the C programming language. PhD thesis, National Technical University of Athens, 1998.
[43]
F. Pottier and Y. Régis-Gianas. Menhir. gallium.inria.fr/~fpottier/menhir/.
[44]
J. Regehr. blog.regehr.org/archives/721, 2012.
[45]
J. Regehr, April 2015. blog.regehr.org/archives/1234.
[46]
TrustInSoft. trust-in-soft.com/tis-interpreter, 2015.
[47]
H. Tuch, G. Klein, and M. Norrish. Types, bytes, and separation logic. In Proc. POPL, 2007.
[48]
J. Ševˇcík, V. Vafeiadis, F. Zappa Nardelli, S. Jagannathan, and P. Sewell. CompCertTSO: A verified compiler for relaxedmemory concurrency. J. ACM, 60(3), June 2013.
[49]
X. Wang, H. Chen, A. Cheung, Z. Jia, N. Zeldovich, and M. F. Kaashoek. Undefined behavior: what happened to my code? In Proc. APSYS, 2012.
[50]
X. Wang, N. Zeldovich, M. F. Kaashoek, and A. Solar-Lezama. Towards optimization-safe systems: Analyzing the impact of undefined behavior. In Proc. SOSP, 2013.
[51]
R. N. M. Watson, P. G. Neumann, J. Woodruff, M. Roe, J. Anderson, D. Chisnall, B. Davis, A. Joannou, B. Laurie, S. W. Moore, S. J. Murdoch, and R. Norton. Capability hardware enhanced RISC instructions: CHERI instruction-set architecture. Technical Report UCAM-CL-TR-864, University of Cambridge, Computer Laboratory, November 2015.
[52]
R. N. M. Watson, J. Woodruff, P. G. Neumann, S. W. Moore, J. Anderson, D. Chisnall, N. H. Dave, B. Davis, K. Gudka, B. Laurie, S. J. Murdoch, R. Norton, M. Roe, S. Son, and M. Vadera. CHERI: A hybrid capability-system architecture for scalable software compartmentalization. In IEEE Symposium on Security and Privacy, SP, 2015.
[53]
WG14. Defect report summary for ISO/IEC 9899:1999.
[54]
WG14. ISO/IEC 9899:2011.
[55]
WG14. Defect report 260, September 2004.
[56]
www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm.
[57]
J. Woodruff, R. N. M. Watson, D. Chisnall, S. W. Moore, J. Anderson, B. Davis, B. Laurie, P. G. Neumann, R. Norton, and M. Roe. The CHERI capability model: revisiting RISC in an age of risk. In ISCA, 2014.
[58]
X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in C compilers. In Proc. PLDI, 2011.
[59]
J. Zhao, S. Nagarakatte, M. M.K. Martin, and S. Zdancewic. Formalizing the LLVM intermediate representation for verified program transformations. In Proc. POPL, 2012.

Cited By

View all
  • (2025)Fulminate: Testing CN Separation-Logic Specifications in CProceedings of the ACM on Programming Languages10.1145/37048799:POPL(1260-1292)Online publication date: 9-Jan-2025
  • (2024)Understanding and Reusing Test Suites Across Database SystemsProceedings of the ACM on Management of Data10.1145/36988292:6(1-26)Online publication date: 20-Dec-2024
  • (2024)Iris-MSWasm: Elucidating and Mechanising the Security Invariants of Memory-Safe WebAssemblyProceedings of the ACM on Programming Languages10.1145/36897228:OOPSLA2(304-332)Online publication date: 8-Oct-2024
  • Show More Cited By

Index Terms

  1. Into the depths of C: elaborating the de facto standards

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2016
    726 pages
    ISBN:9781450342612
    DOI:10.1145/2908080
    • General Chair:
    • Chandra Krintz,
    • Program Chair:
    • Emery Berger
    • cover image ACM SIGPLAN Notices
      ACM SIGPLAN Notices  Volume 51, Issue 6
      PLDI '16
      June 2016
      726 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/2980983
      • Editor:
      • Andy Gill
      Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 June 2016

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tag

    1. C

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    PLDI '16
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 406 of 2,067 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)939
    • Downloads (Last 6 weeks)76
    Reflects downloads up to 12 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Fulminate: Testing CN Separation-Logic Specifications in CProceedings of the ACM on Programming Languages10.1145/37048799:POPL(1260-1292)Online publication date: 9-Jan-2025
    • (2024)Understanding and Reusing Test Suites Across Database SystemsProceedings of the ACM on Management of Data10.1145/36988292:6(1-26)Online publication date: 20-Dec-2024
    • (2024)Iris-MSWasm: Elucidating and Mechanising the Security Invariants of Memory-Safe WebAssemblyProceedings of the ACM on Programming Languages10.1145/36897228:OOPSLA2(304-332)Online publication date: 8-Oct-2024
    • (2024)A Two-Phase Infinite/Finite Low-Level Memory Model: Reconciling Integer–Pointer Casts, Finite Space, and undef at the LLVM IR Level of AbstractionProceedings of the ACM on Programming Languages10.1145/36746528:ICFP(789-817)Online publication date: 15-Aug-2024
    • (2024)Full Spatial and Temporal Memory Safety for CIEEE Security and Privacy10.1109/MSEC.2024.336314222:4(30-39)Online publication date: 1-Jul-2024
    • (2024)SPP: Safe Persistent Pointers for Memory Safety2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)10.1109/DSN58291.2024.00019(37-52)Online publication date: 24-Jun-2024
    • (2023)CREATING MULTITHREADED PTOGRAMS IN C++Computer Design Systems. Theory and Practice10.23939/cds2023.01.0195:1(19-27)Online publication date: 2023
    • (2023)CN: Verifying Systems C Code with Separation-Logic Refinement TypesProceedings of the ACM on Programming Languages10.1145/35711947:POPL(1-32)Online publication date: 11-Jan-2023
    • (2023)Divergent Representations: When Compiler Optimizations Enable Exploitation2023 IEEE Security and Privacy Workshops (SPW)10.1109/SPW59333.2023.00035(337-348)Online publication date: May-2023
    • (2022)The Right Kind of Non-Determinism: Using Concurrency to Verify C Programs with Underspecified SemanticsElectronic Proceedings in Theoretical Computer Science10.4204/EPTCS.365.1365(1-16)Online publication date: 9-Aug-2022
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media