Journal of Automated Reasoning
https://doi.org/10.1007/s10817-018-9496-y

CompCertS: A Memory-Aware Verified C Compiler Using a Pointer as Integer Semantics

Frédéric Besson (1) · Sandrine Blazy (1) · Pierre Wilke (2)

Received: 26 February 2018 / Accepted: 31 October 2018
© Springer Nature B.V. 2018

Abstract The CompCert C compiler provides the formal guarantee that the observable behaviour of the compiled code improves on the observable behaviour of the source code. In this paper, we present a formally verified C compiler, CompCertS, which is essentially the CompCert compiler, albeit with a stronger formal guarantee: it gives a semantics to more programs and ensures that the memory consumption is preserved by the compiler. CompCertS is based on an enhanced memory model where, unlike CompCert but like GCC, the binary representation of pointers can be manipulated much like integers and where, unlike CompCert, allocation may fail if no memory is available. The whole proof of CompCertS is a significant proof effort and we highlight the crux of the novel proofs of 12 passes of the back-end and a challenging proof of an essential optimising pass of the front-end.

Keywords Verified compilation · Low-level code · Optimisations · Pointer as integer

Corresponding author: Pierre Wilke, pierre.wilke@yale.edu; pierre.wilke@centralesupelec.fr
Frédéric Besson: frederic.besson@inria.fr
Sandrine Blazy: sandrine.blazy@irisa.fr
(1) Inria, Univ Rennes, CNRS, IRISA, Rennes, France
(2) Yale University, New Haven, USA

1 Introduction

Over the past decade, the CompCert compiler has established a milestone in compiler verification. CompCert is a formally verified C compiler written with the Coq proof assistant, which initially targeted safety-critical embedded software. The compiler comes with a machine-checked proof that it does not introduce bugs during compilation [1]. This semantic preservation proof relies on the formal semantics of the source and target languages of the compiler, and requires that the source program has a defined semantics. Therefore, CompCert only provides formal guarantees for programs that do not exhibit undefined behaviours, a property that is in general undecidable.

CompCert's memory model is a central component of the compiler. In this paper, we show how to adapt CompCert to a more expressive memory model which lifts two main limitations. First, memory allocation in CompCert always succeeds, therefore modelling infinite memory. As a consequence, the compiler does not guarantee anything about the memory consumption of the compiled program. In particular, the compiled program may exhibit a stack overflow. Second, CompCert's memory model limits pointer arithmetic: implementation-defined operations on pointers such as arbitrary comparison or bitwise operations result in an undefined behaviour of the memory model. This may seem restrictive but it is compliant with the C standard.

In previous work [3], we proposed a more concrete memory model inspired by CompCert where memory is finite and pointers can be used as integers. On that basis, we adapted the proof of 3 passes of CompCert's front-end [4]. In this work, we present a fully verified CompCert compiler where the 12 remaining passes have been ported to our new memory model. This compiler is called CompCertS (for CompCert with Symbolic values). CompCertS gives much stronger guarantees about the behaviour of arbitrary pointer arithmetic, thus avoiding the miscompilation of programs performing bit-level manipulation of pointers.
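As a quick illustration of what bit-level manipulation of pointers means in practice, consider the following C fragment (an illustrative example of ours, not taken from the paper): checking the alignment of a pointer and hashing a pointer into a bucket index both combine a pointer with integer operators, operations that CompCert's original memory model leaves undefined but that CompCertS defines.

#include <stdint.h>

#define NBUCKETS 64u

/* Is p aligned on an 8-byte boundary?  The bitwise & on the pointer's
   representation is undefined in CompCert's memory model. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/* Hash a pointer into a bucket index, as hash tables keyed by addresses do. */
static unsigned bucket_of(const void *p)
{
    return (unsigned)(((uintptr_t)p >> 3) % NBUCKETS);
}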
CompCertS also provides strong guarantees about the relative memory usage of the source and target programs. This is challenging because it is unclear how to even define the memory usage at the C level. We tackle this challenge by first defining the memory usage of individual functions directly at the C level, and then proving that compiled programs use no more memory than source programs. In particular, this ensures that the absence of memory overflow is preserved by compilation.

All the results presented in this paper have been mechanically verified using the Coq proof assistant. The development is available online [2]. Additionally, we include links to the online documentation for several definitions and theorems in this paper; the documentation index is at http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/../index.html.

Our contribution is CompCertS, which is stronger than CompCert in the following sense: (1) CompCertS offers guarantees for a wider class of programs; (2) CompCertS also offers guarantees about the memory usage of the compiled program. More precisely, we make the following technical contributions:
– We present the proof of the compiler back-end (12 compiler passes) including constant propagation, common sub-expression elimination and dead-code elimination. In particular, we detail how the existing alias analyses of CompCert [19] benefit from our more defined semantics.
– We show how to instrument the C semantics with oracles specifying the memory usage of functions, so that the compiler only reduces the memory usage of the program. We thus ensure that the absence of memory overflow is preserved by compilation.

The rest of the paper is organised as follows. First, Sect. 2 gives background information on CompCert and the symbolic memory model of our previous work [4]. Section 3 gives an overview of the proof effort required to port the majority of the compiler, and of the proof challenges related to treating pointers as integers. Section 4 describes how we deal with an early pass of CompCert which relies on a subtle memory injection. Section 5 explains the impact of the symbolic memory model on optimisations. Section 6 shows how we ensure that the compiler reduces the memory usage of programs and proves that the absence of memory overflows is preserved. Section 7 mentions related work and finally, Sect. 8 concludes.

2 Background on CompCert

This section describes the architecture of the CompCert compiler [14]. It also summarises the main features and properties of our memory model [3,4]. Our work is based on version 2.4 of CompCert.

2.1 Architecture of the CompCert Compiler

CompCert compiles C programs into assembly code, through 8 other intermediate languages. The same memory model is shared by all the languages of the compiler. Each language is given a formal semantics in the form of a state transition system. The semantics observe behaviours that are either defined behaviours, with a trace of I/O events (this trace is finite for terminating programs, or infinite for diverging programs), or undefined behaviours. Every transformation from one language to another is proved to be semantics preserving using simulation relations, relating the states of the source and target programs with some matching relation. In particular, the trace of I/O events that they emit must be the same.
The proof technique most commonly used in CompCert is forward simulations, where every step in the source language is matched with a number of steps in the target language. The heart of a forward simulation proof is captured by Theorem 1.

Theorem 1 (Forward simulation) Given a source program and a target program represented by their state transition systems →S and →T, there is a forward simulation between those programs through the simulation relation ∼ if and only if for any states S1 and S2 related by ∼, any step taken from S1 can be simulated by a (sequence of) step(s) from S2 such that the resulting states are still related by ∼. Mathematically,

  ∀ S1 ∼ S2, ∀ S1′, S1 −e→S S1′ ⇒ ∃ S2′, S2 −e→+T S2′ ∧ S1′ ∼ S2′,

where e is the trace of emitted I/O events and + denotes a sequence of one or more steps.

The final compiler correctness theorem is about behaviour preservation. Behaviours are built on (possibly infinite) traces of events, in the following way, where t ranges over finite traces and τ over infinite traces:

  beh ::= Terminates(t) | Diverges(t) | Reacts(τ) | Wrong(t).

The behaviour Terminates(t) corresponds to an execution that terminates normally after emitting the trace of events t. The behaviour Diverges(t) is the execution of a program that emits t, and then loops silently (i.e. without emitting events) forever. Reacts(τ) is the behaviour of an execution that never terminates but still emits messages, resulting in the infinite trace τ. Finally, the behaviour Wrong(t) corresponds to a program that goes wrong (i.e. triggers undefined behaviour) after having emitted the finite trace t.

Given some hypotheses about the determinism of the target language, we can transform the forward simulation proofs into a behaviour preservation theorem stating that every behaviour of the compiled program is a behaviour of the source program, i.e. the compiler has not introduced bugs. The composition of the simulation lemmas for all the compiler passes forms the compiler semantic preservation theorem given below.

Fig. 1 Run-time and memory values

Theorem 2 (CompCert's semantic preservation) Suppose that tp is the result of the successful compilation of the program p. If bh′ is a behaviour of tp then there exists a behaviour bh such that bh is a behaviour of p and bh′ improves on the behaviour bh.

  bh′ ∈ ASem(tp) ⇒ ∃ bh. bh ∈ CSem(p) ∧ bh ⊆ bh′

In the theorem, CSem gives the semantics of C programs and ASem gives the semantics of assembly programs. Moreover, a behaviour bh′ improves on a behaviour bh (written bh ⊆ bh′) if either bh and bh′ are the same, or undefined behaviours in bh are replaced by defined behaviours in bh′.

2.2 The Memory Model of CompCert

The memory model of CompCert is the cornerstone of the semantics of all the intermediate languages. It consists of a collection of separated blocks, where blocks are arrays of a given size. A value v ∈ val (see Fig. 1) can be either a 32-bit integer int(i), a pointer or the token undef. A pointer is a pair ptr(b, o) consisting of a block identifier b and an offset o. CompCert also features 64-bit integers, single and double precision floating-point numbers, which we ignore in this paper for the sake of simplicity.

To allow fine-grained access to the memory, CompCert does not store values directly in the memory. Rather, values are encoded as sequences of byte-sized memory values called memval that describe the content of a memory block.
They are either concrete 8-bit integers Byte(x), a special Undef byte that represents uninitialised memory, or a byte-sized fragment of a symbolic pointer value Pointer(b, o, n) (read: n-th byte of pointer ptr(b, o)). Therefore, a pointer ptr(b, o) is encoded in memory as a sequence of 4 memvals, from Pointer(b, o, 0) to Pointer(b, o, 3). (The version of CompCert that this work builds upon, v2.4, only supports 32-bit pointers, hence 4 memvals. More recent versions support 64-bit pointers, made of 8 memvals.)

The memory model exports four main operations: load reads values from the memory at a given address (a block and an offset), store writes values into the memory at a given address, alloc allocates a new block and free frees a given block.

2.3 A Symbolic Memory Model for CompCert

In previous work [3,4], we extended CompCert's memory model and gave semantics to pointer operations by replacing the value domain val by a more expressive domain sval of symbolic values. This low-level memory model enables reasoning about the bit-level encoding of pointers within CompCert. In this section, we first give a motivating example; then we recall the principles of symbolic values and their normalisation.

2.3.1 Motivation for Pointers as Integers

Figure 2 shows an example of C code that benefits from our low-level memory model. This is an implementation of red-black trees which belongs to the Linux kernel. A node in a red-black tree (type rb_node) contains an integer rb_parent_color and two pointers to its children nodes. The integer rb_parent_color encodes both the color of the node and a pointer to the parent node. The rationale for this encoding is as follows: (1) pointers to rb_nodes are at least 4-byte aligned, therefore the two trailing bits are zeros; and (2) the color of a node can be encoded with a single bit. Retrieving each piece of information from this encoding is implemented by the two macros rb_color and rb_parent shown in Fig. 2. To get the parent pointer, the macro clears the two trailing bits using a bitwise and with ∼3 (i.e. 0b1…100). In CompCert, these operations are undefined because of the bitwise operations on pointers. In CompCertS, these operations are defined and therefore this kernel code can be safely compiled without fear of any miscompilation.

Fig. 2 Red-black tree implementation in Linux
Fig. 3 Symbolic run-time and memory values

2.3.2 Symbolic Values

A symbolic value sv ∈ sval (see Fig. 3) is either a value v or an expression built from unary and binary C operators over symbolic values. Memory values memval are also generalised into symbolic memory values smemval, which have a single constructor Symbolic(sv, n), denoting the n-th byte of a symbolic value sv. This constructor is inspired from the Pointer(·, ·, ·) constructor of CompCert (see Fig. 1) and subsumes the three existing cases.

Building symbolic values instead of the token undef for undefined operations delays the challenge of giving more semantics to C expressions. However, symbolic values cannot be kept symbolic indefinitely. To perform memory accesses at an address represented by the symbolic value addr, the address addr must be normalised into a genuine pointer ptr(b, o). Similarly, the condition cond of a conditional statement must be normalised into an integer int(i) to decide which branch to follow.
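For instance, in code following the pattern of Fig. 2, both kinds of normalisation occur. The C sketch below is only illustrative (the field and macro names follow the description of Sect. 2.3.1, not the verbatim kernel code); the comments indicate where CompCertS has to normalise a symbolic value into a pointer or into an integer.

#include <stdint.h>
#include <stddef.h>

/* Node layout in the spirit of the Linux red-black trees of Fig. 2 (illustrative). */
struct rb_node {
    unsigned long rb_parent_color;   /* parent pointer with the colour in bit 0 */
    struct rb_node *rb_left;
    struct rb_node *rb_right;
};

#define RB_RED   0ul
#define RB_BLACK 1ul

/* Clear the two trailing bits to recover the 4-byte-aligned parent pointer. */
#define rb_parent(n) ((struct rb_node *)((n)->rb_parent_color & ~3ul))
/* The colour is stored in the least significant bit. */
#define rb_color(n)  ((n)->rb_parent_color & 1ul)

static void rb_set_parent_color(struct rb_node *n, struct rb_node *p, unsigned long color)
{
    /* the pointer is turned into an integer expression; CompCertS keeps it symbolic */
    n->rb_parent_color = (unsigned long)p | color;
}

static int parent_is_black(const struct rb_node *n)
{
    struct rb_node *p = rb_parent(n);
    if (p == NULL)                 /* the condition must be normalised into int(i) */
        return 1;
    /* the symbolic address (rb_parent_color & ~3) must be normalised into ptr(b, o)
       before the load of p->rb_parent_color can be performed */
    return rb_color(p) == RB_BLACK;
}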
The normalisation is specified as a function normalise which takes as input a memory state m and a symbolic value sv, and outputs a value v. Its specification relies on the notions of concrete memories valid for a memory state m, and of evaluation of expressions, which we recall below. An intuitive way to think about symbolic values is as intermediate values that do not make sense immediately, but can be soundly used to later produce regular values, just like complex numbers were first introduced in mathematics as intermediate values to solve cubic equations.

2.3.3 Concrete Memories and Evaluation

A concrete memory is a mapping from blocks to concrete addresses, represented as 32-bit integers. In addition to the permissions and memory contents associated to blocks in CompCert, we also associate with each memory block b a size size and an alignment constraint al. We say a pointer ptr(b, o) is valid if the offset o is within the bounds [0, size[, written valid(m, b, o). The size and alignment of a block b can be retrieved with the accessors size(m, b) and align(m, b).

Definition 1 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/NormaliseSpec.html#compat) A concrete memory cm is valid for a memory state m (cm ⊢ m) if the following conditions hold:
1. Valid addresses lie within the address space, i.e. ∀ b o, valid(m, b, o) ⇒ cm(b) + o ∈ ]0; 2^32 − 1[.
2. Valid pointers from distinct blocks do not overlap, i.e. ∀ b b′ o o′, b ≠ b′ ∧ valid(m, b, o) ∧ valid(m, b′, o′) ⇒ cm(b) + o ≠ cm(b′) + o′.
3. Addresses are properly aligned, i.e. ∀ b, 2^align(m,b) | cm(b).

We exclude the address 0 from valid addresses because it represents the NULL pointer and is therefore invalid. We also exclude the address 2^32 − 1 so that weakly-valid pointers, i.e. pointers one past the end of an object, are also valid. (See the C standard [10], section 6.5.8.5 (Relational operators) for a discussion of pointers one past the end.)

The evaluation of a symbolic value sv in a concrete memory cm (written ⟦sv⟧cm) consists in replacing pointers with their integer value (according to cm) and then evaluating the resulting expression with standard integer operations.

Example 1 Consider for example a concrete memory cm that maps a block b to the address 32. The evaluation of the symbolic value sv = ptr(b, 5) & int(1) results in int(1) because ⟦sv⟧cm = (cm(b) + 5) & 1 = (32 + 5) & 1 = 37 & 1 = 1.

2.3.4 Specification of the Normalisation

Rather than defining an algorithm for the normalisation, we specify its behaviour through a relation is_norm m sv v, where m is a memory state, sv is a symbolic value and v is a value. This predicate is defined as follows.

Definition 2 (Sound normalisation) A value v is a sound normalisation of sv in m if v and sv evaluate identically in every concrete memory cm valid for m:

  is_norm m sv v ≜ ∀ cm, cm ⊢ m ⇒ ⟦sv⟧cm = ⟦v⟧cm.

We prove that this relation is deterministic, i.e. any two values v1 and v2 that are sound normalisations of the same symbolic value sv in the same memory state m are necessarily the same. However, this is only true if at least one valid concrete memory exists for any memory state m, otherwise any value v would be a sound normalisation of any symbolic value sv. We enforce this property by restricting the allocation operation of the memory model of CompCert so that it fails if no concrete memory can be constructed. This is explained in great detail in [5]. We discuss additional aspects related to finite memory in Sect. 6 of this article.
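The three conditions of Definition 1 are simple enough to be checked by a small piece of code. The following C sketch is purely illustrative (not part of the Coq development): it checks whether a candidate concrete memory, given as arrays of base addresses, sizes and alignment exponents, is valid in the sense of Definition 1, with the 2^32 bound replaced by an arbitrary address-space size.

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative check of Definition 1: is the concrete memory `base` valid for a
   memory state described by `size` and `align_exp`?
   Block i occupies addresses base[i] .. base[i] + size[i] - 1. */
static bool valid_concrete_memory(const uint64_t *base, const uint64_t *size,
                                  const unsigned *align_exp, size_t nblocks,
                                  uint64_t addr_space /* e.g. 1ull << 32 */)
{
    for (size_t b = 0; b < nblocks; b++) {
        /* 1. valid addresses lie within ]0; addr_space - 1[ */
        if (size[b] > 0 && (base[b] == 0 || base[b] + size[b] - 1 >= addr_space - 1))
            return false;
        /* 3. addresses are properly aligned: 2^align_exp[b] divides base[b] */
        if (base[b] % (1ull << align_exp[b]) != 0)
            return false;
        /* 2. valid pointers from distinct blocks do not overlap */
        for (size_t b2 = b + 1; b2 < nblocks; b2++) {
            bool disjoint = base[b] + size[b] <= base[b2]
                         || base[b2] + size[b2] <= base[b];
            if (size[b] > 0 && size[b2] > 0 && !disjoint)
                return false;
        }
    }
    return true;
}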
For convenience, in the rest of this article, we refer to the normalisation as the function normalise, which returns a value v that is a sound normalisation when such a value exists, and undef otherwise.

Example 2 Consider a program which stores information in the 2 least significant bits of a 4-byte aligned pointer (cf. Fig. 2). The symbolic value after setting the last 2 bits of a pointer ptr(b, 0) is sv = ptr(b, 0) | 3. To recover the original pointer, the last two bits can be cleared by the following bitwise manipulation: sv′ = sv & ∼3. We have that sv′ normalises into the pointer ptr(b, 0) because for any valid concrete memory cm:

  ⟦sv′⟧cm = ⟦(ptr(b, 0) | 3) & ∼3⟧cm = (cm(b) | 3) & ∼3 = cm(b)

The last rewriting step is justified by the alignment constraints of block b. Since ⟦ptr(b, 0)⟧cm = cm(b) for any cm, sv′ normalises into ptr(b, 0).

Fig. 4 Injecting several blocks into one

2.4 Memory Injections

Memory injections are CompCert's central notion to formalise the effect of merging blocks together; they are used to specify the passes that transform the memory layout. The stereotypical example is the construction of stack frames, which happens during the transformation from C♯minor to Cminor. At the C♯minor level, each local variable is allocated in its own block. In Cminor, a single block contains all the local variables, stored at different offsets. This mapping from local variable blocks in C♯minor to offsets in the stack block in Cminor is captured by a memory injection.

A memory injection is characterised by an injection function f : block → ⌊block × Z⌋ that optionally associates with each block a new block and an offset within that block (⌊·⌋ denotes the option type; we write ⌊v⌋ for Some(v) and ∅ for None). For example, in Fig. 4, the blocks b1, b2 and b3 are injected by f into the single block b′, at different offsets.

In addition to reflecting the structural relation between memory states, injections also relate the contents of the memory states. Values that are stored at corresponding locations are required to be in injection. Two values v1 and v2 are in injection if (1) v1 is undef, or (2) v1 and v2 are the same non-pointer value, or (3) v1 is ptr(b, o), v2 is ptr(b′, o + δ) and f(b) = ⌊(b′, δ)⌋. For example, in Fig. 4, the pointer ptr(b2, o) is in injection with the pointer ptr(b′, o + δ1). Two symbolic values are in injection (see [4]) if they have the same structure (the same operators are applied) and the values at the leaves of each symbolic value are in injection.

In [4], we proved a central result that relates injections and normalisations, recalled in Theorem 3.

Theorem 3 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/Memory.html#Mem.norm_inject) For any total injection f, for any memory states m1 and m2 in injection by f, for any symbolic values sv1 and sv2 in injection by f, the normalisations of sv1 in m1 and of sv2 in m2 are in injection by f.

This theorem has the precondition that f must be a total injection, i.e. all non-empty blocks must be injected (i.e. f(b) ≠ ∅). In this paper, one of our contributions is a generalisation of Theorem 3, which covers the case of more general injections. As we shall see in Sect. 4, it is required to prove the SimplLocals pass of CompCert.
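As a reading aid, the following C sketch transcribes the three cases of the definition of values in injection given above. It is an illustrative encoding of CompCert's values and injection functions, not the Coq definitions.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative encodings of CompCert values and injection functions. */
typedef enum { VAL_UNDEF, VAL_INT, VAL_PTR } val_kind;
typedef struct { val_kind kind; uint32_t i; uint32_t block; uint32_t ofs; } val;

/* f(b) = Some (inj[b].block, inj[b].delta) when inj[b].some holds, None otherwise. */
typedef struct { bool some; uint32_t block; int32_t delta; } inj_entry;

/* Two values v1 and v2 are in injection by f (cases (1)-(3) of Sect. 2.4). */
static bool val_inject(const inj_entry *f, val v1, val v2)
{
    if (v1.kind == VAL_UNDEF)                                  /* (1) */
        return true;
    if (v1.kind == VAL_INT)                                    /* (2) same non-pointer value */
        return v2.kind == VAL_INT && v1.i == v2.i;
    /* (3) v1 = ptr(b, o), v2 = ptr(b', o + delta) and f(b) = Some(b', delta) */
    return f[v1.block].some
        && v2.kind == VAL_PTR
        && v2.block == f[v1.block].block
        && v2.ofs == v1.ofs + (uint32_t)f[v1.block].delta;
}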
Fig. 5 Overview of the compiler passes and the simulation relations used. Front-end passes: Cstrategy (equality), SimplExpr (equality), SimplLocals (injection), C♯minorgen (extension), Cminorgen (injection). Back-end passes: Selection (extension), RTLgen (extension), Tailcall (extension), Inlining (injection), Renumber (equality), Constprop (extension), CSE (extension), Deadcode (extension), Allocation (extension), Tunneling (equality), Linearize (equality), CleanupLabels (equality), Stacking (injection), Asmgen (extension). The intermediate languages are, in order: C, Cstrategy, Clight, C♯minor, Cminor, CminorSel, RTL, LTL, Linear, Mach, Assembly.

3 Overview of the Compiler Proof

This paper addresses the challenge of porting the CompCert compiler to our semantics with symbolic values, where pointer operations behave as integer operations (e.g. bitwise operators are defined on pointers) and where memory is bounded. Figure 5 gives an overview of the 19 compiler passes of CompCert, together with the kind of simulation relations that are used to prove them. Three such relations between memory states are defined: memory equalities, memory extensions and memory injections. They share a common basis, the notion of memory embeddings, defined in [15]. Memory equalities are used by passes that do not modify the memory at all, neither its structure nor its contents. Memory extensions are used by passes that do not modify the structure of the memory, but are allowed to specialise the values stored in the memory (e.g., transform an undef value into any other value). Finally, as explained in the previous section, memory injections are used for passes that modify the structure of the memory.

Our changes to the semantics of the individual languages consist mainly in inserting normalisations before memory accesses and conditionals. These changes are reflected in the semantic preservation proofs, where we now have to account for the preservation of normalisations. The compiler passes that are proved based on the equality simulation relation are the simplest to port. The passes based on memory extensions and memory injections require additional lemmas about the preservation of normalisations with respect to these memory relations, and the passes based on memory injections operate the most difficult memory transformations of the compiler.

In the rest of this paper, we focus on three particular aspects of our proof effort. First, in Sect. 4 we address the problems raised by the SimplLocals pass of CompCert, which modifies the structure of the memory and uses a kind of memory injection that is not covered by our previous work [5]. Then, in Sect. 5 we explain the challenges related to optimisations, and in particular the notion of pointer provenance. The existing pointer analysis in CompCert needs to be refined, so that it is correct in our symbolic setting. Finally, in Sect. 6 we describe the implications of having a bounded memory model in CompCert. In particular, we need every compiler pass to reduce the memory usage of programs, and we show how we ensure this is in fact the case in CompCertS.

Fig. 6 Concrete memories and partial injections: (a) before injection; (b) after injection

4 Proving the Correctness of SimplLocals

The SimplLocals compiler pass is one of the earliest in CompCert. Its source language is Clight, a stripped-down dialect of C where expressions are side-effect-free.
The purpose of this pass is to pull out of memory the local scalar variables that do not need to reside in memory: those whose address is never taken. Those variables are transformed into temporaries, i.e. pseudo-registers, upon which most subsequent optimisations operate.

4.1 Arguments for the Correctness of SimplLocals

In CompCert, the correctness of this compiler pass relies on memory injections. The blocks corresponding to variables that are not transformed into temporaries are injected into themselves (i.e. f(b) = ⌊(b, 0)⌋), while the blocks corresponding to variables that are transformed into temporaries are not injected (i.e. f(b) = ∅).

The core difficulty of porting the proof of SimplLocals to the symbolic setting resides in proving that normalisations are preserved by injections. In previous work, we established Theorem 3, which proves this preservation for total injections. Here, the injection is partial (i.e. some blocks are not injected) and therefore Theorem 3 does not apply. The following example illustrates the challenge of dealing with partial injections.

Example 3 For the sake of simplicity, consider a memory size of 32 bytes and a memory state m1 with two blocks b and b′ which are both 4-byte aligned: b of size 8 and b′ of size 16. We show in Fig. 6a the only two possible concrete memories, where b is the darker block and b′ is the lighter one. Note that no block can be assigned the address 0 nor the address 28, as per Definition 1. Consider the symbolic value sv = (ptr(b, 0) != int(16)). It normalises into 1 in m1, because b is never allocated at address 16 in any concrete memory valid for m1. Indeed, this address is always occupied by block b′. Now consider a memory state m2 where the block b′ has been pulled out of memory. Fig. 6b shows that in m2 it is, of course, still possible to allocate block b at addresses 4 and 20. However, there is a new possible configuration where block b can be allocated at address 16. The normalisation of sv is now undefined because sv evaluates to different values (1 or 0) depending on the concrete memory used. This contradicts Theorem 3, which we are trying to prove.

The essence of the problem illustrated by the above example is that blocks may have more allowed positions after the injection than before, meaning that the set of valid concrete memories is larger after the injection. Therefore, the normalisation may be less defined after a partial injection, and Theorem 3 cannot be generalised to arbitrary partial injections.

4.2 Well-Behaved Injections

We identify a restricted class of well-behaved injection functions f, for which we show that blocks that are injected by f (those for which f(b) ≠ ∅) do not gain new valid concrete addresses after the injection. The criterion for well-behavedness of injection functions f is stated in Definition 3.

Definition 3 (Well-behaved injection) (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/InjectWellBehaved.html#inject_well_behaved) An injection function f is said to be well-behaved if the blocks that are forgotten by f are at most 8-byte wide and at most 8-byte aligned. Formally,

  well_behaved(f, m) ≜ ∀ b, f(b) = ∅ ⇒ size(m, b) ≤ 8 ∧ align(m, b) ≤ 8.

The injection used for the correctness proof of SimplLocals satisfies this constraint because only scalar variables may be removed from the memory, i.e. the largest are long-typed variables that are 8-byte wide and 8-byte aligned. Using such well-behaved injections, we can prove Lemma 1, from which a generalised version of Theorem 3 can be derived, as we explain at the end of this section.
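Before stating that lemma, we note that the counting argument behind Example 3 can be replayed by brute force. The following C sketch is only illustrative: the 4-byte alignment of both blocks and the exclusion of the extreme addresses are assumptions of ours that mimic Definition 1 at a 32-byte scale, so the exact conventions of Fig. 6 may differ slightly.

#include <stdbool.h>
#include <stdio.h>

#define MEM 32   /* 32-byte address space, addresses 0 and MEM-1 excluded */

static bool fits(unsigned base, unsigned size)
{
    return base > 0 && base % 4 == 0 && base + size - 1 < MEM - 1;
}

static bool disjoint(unsigned a, unsigned sa, unsigned b, unsigned sb)
{
    return a + sa <= b || b + sb <= a;
}

int main(void)
{
    /* With both blocks present (memory state m1), list the addresses b can get. */
    for (unsigned b = 0; b < MEM; b += 4)
        for (unsigned bp = 0; bp < MEM; bp += 4)
            if (fits(b, 8) && fits(bp, 16) && disjoint(b, 8, bp, 16))
                printf("m1: b at %u (b' at %u)\n", b, bp);

    /* After b' has been forgotten (memory state m2), b gains new positions,
       in particular address 16, so ptr(b,0) != 16 no longer normalises. */
    for (unsigned b = 0; b < MEM; b += 4)
        if (fits(b, 8))
            printf("m2: b at %u\n", b);
    return 0;
}

Under these assumptions the enumeration lists b at addresses 4 and 20 only for m1, and additional placements, including 16, for m2, matching the discussion of Example 3.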
Lemma 1 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/InjectWellBehaved.html#forget_compat) Let f be a well-behaved injection function. Let m1 and m2 be memory states in injection by f. For every concrete memory cm2 valid for m2, there is a corresponding concrete memory cm1 valid for m1, such that every non-forgotten block has the same address in cm1 and cm2. Formally,

  ∀ f, well_behaved f ⇒ ∀ m1 m2, mem_inject f m1 m2 ⇒ ∀ cm2 ⊢ m2, ∃ cm1 ⊢ m1 ∧ cm1 ≡f cm2

  where cm1 ≡f cm2 ≜ ∀ b b′, f(b) = ⌊(b′, 0)⌋ ⇒ cm1(b) = cm2(b′).

The problem that Lemma 1 solves can be thought of as follows: for every concrete memory cm2 valid for m2 (cm2 ⊢ m2), it is possible to insert back all the blocks that have been forgotten by f, without moving the others. In other words, all block positions that are allowed in m2 were already allowed in m1, therefore we avoid the problems illustrated by Example 3.

The proof of Lemma 1 goes by counting 8-byte wide and 8-byte aligned regions of memory that we call boxes, delimited by dashed lines in Fig. 7. Our allocation algorithm [4] entails that for every memory state m, there exists a concrete memory cm that we call the canonical concrete memory of m and write canon_cm(m), built by allocating all the blocks of m at maximally-aligned, i.e. 8-byte aligned, addresses. We call nbox(cm) the number of used boxes for a given concrete memory cm. For example, in Fig. 7 we have nbox(cm2) = 2 and nbox(canon_cm(m2)) = 3.

In general, thanks to alignment constraints, we have that for any memory m and any concrete memory cm valid for m, cm uses no more boxes than canon_cm(m), i.e. nbox(cm) ≤ nbox(canon_cm(m)). This is a direct consequence of the relation between the size and the alignment properties of blocks. More precisely, a block b of size s has an alignment a such that s ≤ 2^a, when b is a small block (smaller than 8 bytes, i.e. those that are likely to be forgotten). Due to this alignment property, a properly aligned small block cannot span over more than 1 box. Larger blocks are 8-byte aligned and therefore use as many boxes in any valid concrete memory. This reasoning could be extended to slightly different definitions of well-behaved injections where larger blocks can be forgotten, hence considering larger boxes, so that properly aligned forgotten blocks never span over more than 1 box.

Fig. 7 Inverting partial injections

Consider now two memory states m1 and m2 in injection by some well-behaved injection function f, such that m2 is the result of forgetting F blocks from m1. We have that nbox(canon_cm(m2)) = nbox(canon_cm(m1)) − F. This can be verified on Fig. 7, where F = 2 blocks have been forgotten, nbox(canon_cm(m1)) = 5 and nbox(canon_cm(m2)) = 3, indeed satisfying the equation. Starting from a concrete memory cm2 ⊢ m2, we derive that nbox(cm2) + F ≤ nbox(canon_cm(m1)). In other words, it is possible to find F free boxes in cm2. In our example, those 2 boxes can be, for example, the boxes [8; 16[ and [24; 32[. Because the blocks we forgot each fit in a box, all we have to do at this point is fill each of these F boxes in cm2 with the F forgotten variables. The result is the concrete memory cm1 shown in the last line of Fig. 7.

Theorem 4 is the generalised version of Theorem 3 for well-behaved injections.
Theorem 4 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/InjectWellBehaved.html#forget_norm) For any well-behaved injection f, for any memory states m1 and m2 in injection by f, for any symbolic values sv1 and sv2 in injection by f, the normalisations of sv1 in m1 and of sv2 in m2 are in injection by f.

Proof The proof is performed in two steps.
– First, we exhibit some value v such that the normalisation of sv1 injects into v. This shows that if the normalisation of sv1 is a pointer, then this pointer is injected by f. This is a consequence of the fact that sv1 is injected into another symbolic value.
– Then, we show that this v is necessarily the normalisation of sv2 in m2. This boils down to showing that ∀ cm2 ⊢ m2, ⟦v⟧cm2 = ⟦sv2⟧cm2. Using Lemma 1 and the specification of the normalisation, we conclude this proof. □

This theorem is a central piece of the proof of the SimplLocals pass, which is now fully proved in CompCertS. It is worth noting that we did not modify the behaviour of the SimplLocals pass. The work we have done here is simply to strengthen the proof so that the original SimplLocals pass is still correct with our more defined semantics, in particular with respect to the set of valid concrete memories across memory injections.

5 Optimisations

CompCert features several standard optimisations. Among them, constant propagation, strength reduction and common sub-expression elimination exploit the result of a dataflow analysis computing the combination of a numeric analysis and an alias analysis. In this section, we explain why the existing dataflow transfer functions are not sound for CompCertS and how to fix them. This demonstrates that the semantics of CompCertS is a provably strong safeguard preventing the miscompilation of low-level pointer arithmetic. For the sake of explanation, we present a simplified version of CompCert's abstract domains and transfer functions that is sufficient for our needs. A more thorough description can be found in [19].

5.1 The Abstract Value Domain of CompCert

The abstract value domain of CompCert is made of a pointer domain and a numeric domain. The purpose of the pointer domain is to infer aliasing information and get an abstract model of memory reads and writes. In particular, if the current stack pointer does not escape through global variables or arguments of functions, the compiler gets the valuable information that the content of the current stack frame cannot be modified by function calls. A representative but simplified abstract domain of pointers, aptr, is given below.

  aptr ::= ⊥ | Stk ofs | Stack | ¬Stack | ⊤

Its semantics is given by its concretisation function γsb, where sb stands for the memory block of the current stack frame. The empty set of pointers is denoted by ⊥. Stk o represents the stack pointer ptr(sb, o). The set of all pointers to the current stack frame (block sb at any offset) is captured by Stack. All pointers to blocks different from the stack block sb are abstracted by ¬Stack. Finally, ⊤ is the set of all pointers.

  γsb(⊥) = {}
  γsb(Stk o) = {ptr(sb, o)}
  γsb(Stack) = {ptr(sb, o) | o ∈ int}
  γsb(¬Stack) = {ptr(b, o) | b ≠ sb ∧ o ∈ int}
  γsb(⊤) = {ptr(b, o) | b ∈ blocks ∧ o ∈ int}

The numeric domain anum tracks constant values and intervals of the form [0; 2^n − 1] and [−2^n; 2^n − 1].

  anum ::= ⊥ | Cst c | [0; 2^n − 1] | [−2^n; 2^n − 1] | ⊤

Conceptually, the domain of abstract values is of the form aval = aptr × anum such that γsb(ap, an) = γsb(ap) ∪ γn(an).
The union of concretisations is relevant because a value can be either a pointer or an integer but not both. Moreover, as certain operators may return the value undef, undef belongs to every concretisation of the numeric domain, i.e. undef ∈ γn(⊥).

According to the original semantics of CompCert, the bitwise conjunction & between a pointer ptr(b, o) and an integer int(i) returns undef. As a result, the most precise transfer function for the bitwise & is such that

  (p, ⊤) & (⊥, ⊤) = (⊥, ⊤)

For the pointer part, it returns ⊥ because a bitwise & with a pointer argument returns undef (it cannot be a pointer). For the integer part, it returns ⊤ because a bitwise & between arbitrary integers is still an arbitrary integer. This formulation is semantically sound. Yet, as shown by Example 4, this aggressive transfer function can be responsible for miscompilation.

Fig. 8 Aggressive dataflow analysis for red-black trees

Example 4 Consider the red-black tree code of Fig. 8. The code is annotated by the result of a sound dataflow analysis using the previous domain. At function entry, the current stack frame has just been created and is therefore free of aliases. As a result, the parameter r and the local variable rpc can be abstracted by (¬Stack, ⊤). At line 6, the aggressive analysis uses the previous transfer function for the bitwise & and obtains (⊥, ⊤) for the abstraction of p. This amounts to reasoning that p can only be an integer. As the dereference of an integer has no semantics, the aggressive analysis infers that the rest of the code is not reachable. At line 8, this is encoded by ∅. Based on this information, a live-variable analysis and an aggressive dead-code removal could replace the whole function body by a no-op, which is obviously a miscompilation.

To avoid such dramatic effects, the transfer functions of CompCert are written with prudence, with the objective of preventing miscompilations and "[track] leakage of pointers through arithmetic operations" (see https://github.com/AbsInt/CompCert/blob/a968152051941a0fc50a86c3fc15e90e22ed7c47/backend/ValueDomain.v#L707). This is done by computing carefully crafted transfer functions which are purposely non-optimal in order to prevent aggressive optimisations (which would be sound by relying on undefined behaviours of the CompCert semantics). For instance, the transfer function for the bitwise & becomes:

  (p, ⊤) & (⊥, ⊤) = (prov(p), ⊤)

where prov(p) reads as the provenance of the pointer p, has the informal meaning that the result is some value derived from the pointer p, and is defined by:

  prov(p) = if p = Stk o then Stack else p.

This formulation is semantically sound and prudent. Yet, it is not completely satisfactory because it is not grounded on any palpable semantic notion.

Fig. 9 CompCertS concretisation for the pointer domain

5.2 A Formally Prudent Dataflow Analysis

With our semantics, the program of Fig. 8 may have a defined semantics, hence the aggressive dataflow analysis of Example 4 is not sound and therefore no such miscompilation can occur. The reason is that, for our semantics, arithmetic operations (e.g. the bitwise &) are always defined and compute symbolic values.

To adapt the existing abstract domains to our semantics, we need to adapt the concretisations so that they denote symbolic values instead of values. A direct lifting consists in using the evaluation of symbolic values.
This approach is effective for the numeric domain and we get:

  γn*(an) = {sv | ∀ cm, ⟦sv⟧cm ∈ γn(an)}.

For the pointer domain, the same lifting is such that the concretisation of the Stack element represents any symbolic value whose evaluation has value sb + o for some o. As o is unrestricted, this concretisation captures any symbolic expression and collapses with the ⊤ element. A more restricted lifting could be based on the normalise function. This appealing option is however too restrictive because it rules out symbolic values which may not have a normalisation.

Interestingly, we eventually noticed that, to get a concretisation that is both sound and robust to syntactic variations, what was needed was a formal account of pointer tracking. It is formalised, using Definition 4, by a notion of pointer dependence of a symbolic value sv with respect to a set S of memory blocks.

Definition 4 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/ExprEval.html#depends_on_blocks) A symbolic value sv depends at most on the set of blocks S if sv evaluates identically in concrete memories that are identical for all the blocks in S. Formally, we have:

  dep(sv, S) ≜ ∀ cm ≡S cm′, ⟦sv⟧cm = ⟦sv⟧cm′
  where cm ≡S cm′ ≜ ∀ b ∈ S, cm(b) = cm′(b).

Note that, for any other block b ∉ S, the memory may differ arbitrarily. The concretisation function γsb*, where sb is the current stack block, is defined in Fig. 9 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/ValueDomain.html#epmatch). Intuitively, Cst represents any symbolic value which always evaluates to the same value whatever the concrete memory (i.e., it does not depend on pointers); Stack represents any symbolic value which depends at most on the current stack block sb; and ¬Stack represents any symbolic value which may depend on any block except the current stack block sb.

Our abstract domain is still a pair of values (ap, an) ∈ aptr × anum but it represents a (reduced) product of domains. For symbolic values, there is no syntactic distinction between pointer and integer values. Hence, the concretisation is given by an intersection of concretisations (instead of a union): γsb(ap, an) = γsb(ap) ∩ γn(an).

In CompCert, a prudent transfer function for the pointer domain is defined by p1 ⊔ p2. Theorem 5 gives the formal guarantee that this transfer function is sound for our semantics.

Theorem 5 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/ValueDomain.html#epmatch_binop_lub) Suppose that sv1 is modelled by the abstract pointer p1 and sv2 is modelled by the abstract pointer p2. For any binary operator ⋈, the symbolic value sv1 ⋈ sv2 is modelled by the least upper bound of the provenances of p1 and p2, i.e.

  sv1 ∈ γsb(p1) ∧ sv2 ∈ γsb(p2) ⇒ sv1 ⋈ sv2 ∈ γsb(p1 ⊔ p2)

Depending on the operator, the transfer function can sometimes be specialised using additional information from the numeric domain. In particular, for bitwise operators, we have the following transfer functions.

  p1 & p2 = if p1 = p2 = Stk o then Stk o else p1 ⊔ p2
  p1 | p2 = if p1 = p2 = Stk o then Stk o else p1 ⊔ p2
  p1 ^ p2 = if p1 = p2 = Stk o then Cst else p1 ⊔ p2

When the pointer is known to be a constant of the form ptr(sb, o), the transfer functions exploit numeric properties of bitwise operators. In particular, they exploit the property that bitwise & and bitwise | are idempotent, i.e.

  ptr(sb, o) & ptr(sb, o) = ptr(sb, o) | ptr(sb, o) = ptr(sb, o)

For bitwise ^, we have that ptr(sb, o) ^ ptr(sb, o) = int(0). In the pointer domain, the most precise abstraction is Cst.
This is however an example where the pointer domain may refine the numeric domain, as we have:

  (Stk o, ⊤) ^ (Stk o, ⊤) = (Cst, [0; 0])

While adapting the proof, we found and fixed several minor but subtle bugs in CompCert related to pointer tracking, where the existing transfer functions were unsound for our low-level memory model. Though unlikely, each of them could potentially be responsible for a miscompilation. For instance, the right shift operator x >> y ignores the leak of information that would be due to the shift amount y. Though it makes little sense to pass a pointer as a shift amount, there is nonetheless some form of information flow that is captured by our semantics and forces our transfer function to include the dependence on y. Using its more conservative dataflow analysis, CompCertS forbids program transformations that are otherwise valid for CompCert but may result in miscompilations. In this particular case, we generate the right code not because our optimisations are designed with prudence but because our more defined semantics provides a formal safeguard.

5.3 Instruction Selection and Symbolic Values

For dataflow analysis, our semantics makes optimisations more conservative. Yet, a more defined semantics may also enable new optimisations that would be unsound for a less defined semantics. This phenomenon has already been observed, e.g. by Mullen et al. [17], in the context of peephole optimisations for CompCert. The motivating example of Mullen et al. essentially transforms the expression y − x − 1 into y + ~x where ~ is bitwise negation. In CompCert, the transformation is unsound because when x and y are pointers to the same block, e.g. ptr(b, o) and ptr(b, o′), the expression y − x − 1 evaluates to int(o′ − o − 1) but the expression y + ~x evaluates to undef because of the bitwise negation that is undefined for pointers. With our semantics, both expressions have the same evaluation, ⟦y − x − 1⟧cm = ⟦y + ~x⟧cm, and therefore the transformation is sound. We have introduced it in the instruction selection pass, which performs strength reduction over the subtraction operator (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/SelectOp.html#sub).

There are nonetheless standard transformations that our semantics is unable to validate. For instance, an efficient way of setting a register r to 0 consists in performing a bitwise ^ with itself. Unfortunately, we cannot prove that the symbolic values 0 and sv ^ sv always have the same evaluation. A counterexample is when sv evaluates to undef, because ⟦int(0)⟧cm = 0 whereas ⟦sv ^ sv⟧cm = undef ^ undef = undef. For our semantics, this is a corner case because the optimised expression depends on more variables than the original expression. In order to perform this optimisation, CompCert introduces, at assembly level, a pseudo-instruction which has the semantics of setting a register r to 0 and is assembled as a genuine bitwise ^. This approach also works for our semantics.

6 Preservation of Memory Consumption

The C standard does not impose a model of memory consumption. In particular, there is no requirement that a conforming implementation should make a disciplined use of memory. A striking consequence is that the possibility of stack overflow is not mentioned. From a formal point of view, CompCert models an unbounded memory and therefore, like the C standard, does not impose any limit on the stack consumption of the binary code.
As a result, the existing CompCert theorem is oblivious to the memory consumption of the assembly code. Though CompCert makes a wise usage of memory, this is not explicit in the correctness statement and can only be assessed by a thorough inspection of the code.

Our memory model is finite and memory allocation fails when no more memory is available. As a consequence, in order to prove the forward simulations for each compiler pass, we now also need to show the preservation of memory allocation steps. This means there is more proof effort required, but also that CompCertS provides a stronger formal guarantee about memory consumption than CompCert. It ensures that if the source code does not exhaust the memory, then neither does the assembly code. In other words, the compilation ensures that the assembly code consumes no more memory than the source code does. Although this memory consumption preservation could exist in its own right (without symbolic values and normalisation), the converse is not true: we need to have a finite memory so that at least one concrete memory exists for every memory state, and we need to preserve a bound on the memory across compiler passes.

6.1 Evolution of Stack Memory Usage Throughout Compilation

The memory is split into three distinct uses in CompCert: global variables, dynamically allocated memory (e.g. through malloc) and stack memory. The memory for global variables is statically known and dynamically allocated memory does not change throughout the compilation passes. Only the stack memory is deeply impacted by the compiler.

Fig. 10 Evolution of the size of stack frames (across SimplLocals, Cminorgen and Stacking)

Figure 10 shows the evolution of the size of the stack frame of one given function across compiler passes. We define the size of a stack frame as the sum of the maximally aligned sizes of its blocks. Formally, if a stack frame is composed of blocks {b1, …, bn}, its size is defined as follows (a small C sketch of this computation is given after the list below):

  size_frame({b1, …, bn}, m) ≜ Σ_{i=1..n} next_aligned(size(m, bi), 8)

where next_aligned(x, a) returns the smallest integer larger than or equal to x which is divisible by a. Reasoning about maximally aligned sizes of blocks is consistent with our allocation algorithm (see [5]) and will be important in the following. Three passes are distinguished, which modify the memory usage:
– First, the SimplLocals pass introduces pseudo-registers for certain variables, which are pulled out of memory. This pass reduces the memory usage of functions and therefore satisfies the requirement that compilation should reduce memory usage.
– Then, the Cminorgen pass allocates a unique stack frame containing all the remaining variables of a function. This pass may introduce some padding to ensure proper alignment properties. However, the size of the frames always decreases, thanks to the fact that we are considering maximally aligned sizes, therefore we have already accounted for the maximal amount of padding necessary. It might even be the case that we have counted too much padding and the global size of the frame decreases. Hence, this pass preserves the memory usage.
– Finally, the remaining problematic pass is the Stacking pass, which builds activation records from stack frames. This pass makes explicit some low-level data (e.g. the return address or the space for spilled locals) and is responsible for an increase of the memory usage.
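As announced above, here is a minimal C sketch of the size computation (illustrative only; the Coq development defines these functions over its own memory-state type):

#include <stdint.h>
#include <stddef.h>

/* Smallest integer >= x that is divisible by a
   (a is assumed to be a power of two here, 8 in the paper). */
static uint64_t next_aligned(uint64_t x, uint64_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/* size_frame({b1, ..., bn}, m): sum of the maximally aligned (8-byte) sizes of the
   blocks of a stack frame, given here simply as an array of block sizes. */
static uint64_t size_frame(const uint64_t *block_size, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += next_aligned(block_size[i], 8);
    return total;
}

For instance, two 8-byte blocks give a frame size of 16 bytes; together with the 20 reserved bytes of Example 5 below, this accounts for the 36-byte total mentioned there.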
In the following, we explain how we resolve this discordance and nonetheless ensure a decreasing usage of memory across the compiler passes.

6.2 The Stacking Compiler Pass

The Stacking pass transforms Linear programs into Mach code. The Linear stack frame consists of a single block containing the local variables of the function. The Mach stack frame embeds the Linear stack frame together with additional data, namely the return address of the function, the spilled pseudo-registers that could not be allocated in machine registers, the callee-save registers, and the outgoing arguments to function calls.

6.2.1 Provisioning Memory

In order to fit the Stacking pass into the decreasing memory usage framework, our solution is to provision memory from the beginning of the compilation chain, i.e. from the C language. Hence, we parameterise the semantics of all intermediate languages, from C to Linear, with an oracle ns which specifies, for each function f, the additional space that is needed. The semantics therefore include special operations that reserve some space at function entry and release it at function exit. Below are the relevant rules for the RTL language (other languages have similar, if not identical, rules).

FunEntry (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/RTL.html#exec_function_internal)
  alloc m1 0 (stacksize f) = ⌊(m2, stk)⌋    reserve_boxes m2 (ns f) = ⌊m3⌋
  ─────────────────────────────────────────────────────────────────────────
  Callstate s f args m1 → State s f stk (entrypoint f) (init_rs f args) m3

FunExit (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/RTL.html#exec_Ireturn)
  f!pc = ⌊Ireturn r⌋    free m1 stk 0 (stacksize f) = ⌊m2⌋    release_boxes m2 (ns f) = ⌊m3⌋
  ────────────────────────────────────────────────────────────────────────────────────────────
  State s f stk pc rs m1 → Returnstate s (rs r) m3

The FunEntry rule describes the transition from a Callstate, with a call stack s (which represents the stack of program points in parent functions where the execution should return afterwards, i.e. a continuation), where we are just about to enter a function f with arguments args in memory state m1, to a regular State with the appropriate stack block stk, program counter entrypoint f, register state init_rs f args initialised from the arguments, and memory m3 set up. In CompCert, the end memory is simply the result of allocating the stack block with the alloc operation of the memory model. In CompCertS, we also reserve a number of boxes (the same notion of boxes that was defined in Sect. 4) with the reserve_boxes operation (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/MemReserve.html#reserve_boxes), for the additional space that will be needed to concretely lay out the stack frame of the function at the Mach and assembly levels.

Symmetrically, the FunExit rule describes the transition from a regular State where the program counter points to an Ireturn r instruction (return with the value stored in register r). In this case, the resulting state is a Returnstate with an updated memory state m3. In CompCert, m3 is simply the result of freeing (deallocating) the stack block stk. In CompCertS, we also release the appropriate number of boxes with the release_boxes operation (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/MemReserve.html#release_boxes). In the Mach and assembly languages, no more boxes are reserved or released because the stack is completely laid out and no extra memory will be needed in the future.

These boxes that we reserve and release are just abstract information that we keep in the memory state; they are not related to actual memory blocks. We maintain the invariant that the size of all blocks plus the size of all reserved boxes does not exceed some predetermined threshold (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/Memory.html#Mem.wfm_alloc_ok). For most compiler passes, the amount of boxes reserved for a function call does not change and these reserve and release operations are easy to preserve across these passes.
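A rough intuition for this box accounting can be given by the following C sketch (a simplification for illustration; in the Coq development, reserve_boxes and release_boxes are partial operations over the memory state):

#include <stdbool.h>
#include <stdint.h>

/* Simplified box accounting: the memory state tracks how many 8-byte boxes are used
   by allocated blocks and how many are merely reserved; their sum may never exceed
   a fixed threshold. */
typedef struct {
    uint64_t boxes_for_blocks;   /* boxes occupied by allocated blocks */
    uint64_t boxes_reserved;     /* boxes reserved by reserve_boxes (oracle ns) */
    uint64_t threshold;          /* maximum number of boxes available */
} mem_state;

/* reserve_boxes m n: succeeds (returns true) only if the invariant is maintained. */
static bool reserve_boxes(mem_state *m, uint64_t n)
{
    if (m->boxes_for_blocks + m->boxes_reserved + n > m->threshold)
        return false;            /* allocation failure: the program goes wrong */
    m->boxes_reserved += n;
    return true;
}

/* release_boxes m n: gives the reserved boxes back, e.g. at function exit. */
static void release_boxes(mem_state *m, uint64_t n)
{
    m->boxes_reserved -= n;      /* assumes n boxes were previously reserved */
}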
For the Stacking pass, we leverage the boxes associated with a Linear function call to justify the larger stack block in Mach. Consider the example in the following picture. On the left, the stack frame for Linear is represented, together with 2 additional boxes. On the right, the stack frame for Mach is represented: no additional boxes are reserved, but the stack block is larger to accommodate the outgoing arguments to function calls, spilled variables, or the return address. The oracle ns is correct if the amount of boxes that is reserved is sufficient to hold this extra space in Mach. In such a case, we maintain that the memory usage of a Linear function is not smaller than the memory usage of the corresponding Mach function, and therefore preserve that the memory usage of the whole program in Mach does not exceed the maximum memory size we allow.

(Picture: the Linear stack frame plus 2 reserved boxes is at least as large (≥) as the Mach stack frame, which additionally contains the outgoing arguments, the spilled variables, the pointer to the parent frame and the return address.)

The question of how to compute such a correct oracle ns remains to be discussed. It may be possible to derive an over-approximation of the needed stack space for each function from a static analysis. However, the estimate would probably be very rough as, for instance, it seems unlikely that the impact of register allocation could be modelled accurately. Instead, as the exact amount of additional memory space is known during the Stacking pass, we construct the oracle ns as a byproduct of the compilation. In other words, the compiler returns not only an assembly program but also a mapping that associates with each function of the program the quantity of additional stack space required. Note that the construction is not circular since the oracle is only needed for the correctness proof of the compiler and not by the compiler itself. CompCertS' final theorem takes the form of Theorem 6.

Theorem 6 (Coq: http://www.cs.yale.edu/homes/wilke-pierre/jar18/doc/html/Compiler.html#transf_c_program_correct) Suppose that (tp, ns) is the result of the successful compilation of the program p. If tp has the behaviour bh′, then there exists a behaviour bh such that bh is a behaviour of p with oracle ns and bh′ improves on the behaviour bh.

  bh′ ∈ ASem(tp) ⇒ ∃ bh. bh ∈ CSem(p, ns) ∧ bh ⊆ bh′.

The only difference with CompCert is that the C semantics is instrumented by the oracle ns computed by the compiler. Though not completely explicit, Theorem 6 ensures that the absence of memory overflows is preserved by compilation. The fundamental reason is that the failure to allocate memory results in an observable going-wrong behaviour. Consequently, if the source code does not have a going-wrong behaviour, neither does the assembly. It follows that if the C source succeeds at allocating memory, so does the assembly. Hence, CompCertS ensures that the absence of memory overflows is preserved by compilation.

6.2.2 Recycling Memory

The semantics of function calls now reserve some amount of memory space on top of the space for the stack data.
Since this operation may fail if too much memory is requested, we should strive to make this amount as low as possible so that as many programs as possible have a defined semantics. We have seen that our oracle ns accurately predicts the total amount of stack space that will be needed at the Mach level (by construction); however, some compiler passes, SimplLocals in particular, may forget some blocks and therefore throw away some memory space. We can reuse this freed space and therefore have a weaker requirement on the source semantics. To do so, we introduce another parameter sl (for SimplLocals) that gives for every function the amount of memory space that will be freed by SimplLocals, and that therefore need not be reserved in advance with a reserve_boxes operation.

Fig. 11 Recycling memory: (a) without recycling; (b) with recycling

Example 5 Consider a function with long-integer local variables x and y, as illustrated in Fig. 11, where ns(f) = 20 additional bytes are needed for the Stacking pass. During SimplLocals, y is transformed into a temporary while x is kept and allocated on the stack. The naive first solution that we implemented was to reserve directly from the C level the 20 needed bytes, as shown in Fig. 11a. However, this results in over-provisioning memory because we request 36 bytes in total (2 long-typed variables and 20 reserved bytes), where we need no more than 28 bytes in the next compilation stage. Instead of throwing away the space for the y variable, we can reuse it as additional space (see Fig. 11b). As a result we only require 12 additional bytes at the C level, or 28 bytes in total. This memory consumption then stays the same in the next compilation stage.

The amount of requested stack space is therefore lower at the C level than it would be using the naive approach of requesting the whole amount necessary. For a function f, the amounts of additional space requested in a selection of intermediate languages are: ns(f) − sl(f) in C; ns(f) in Clight and in RTL; and 0 in Mach and in Asm. The parameter sl is also obtained as a byproduct of the compiler, just like the oracle ns discussed above. Using this recycling principle, we slightly relax the requirements for having a defined C semantics, therefore making our formal semantic preservation theorem applicable to more programs.

6.3 About Function Inlining and Tailcall Recognition

Our current implementation of CompCertS does not support compiler optimisations that deeply modify the structure of stack blocks, such as function inlining and tailcall recognition. We briefly explain the difficulties raised by these optimisations and sketch our ideas to deal with them in future work.

Those optimisations change the order in which stack blocks are allocated/freed and additional boxes are reserved/released. Looking only at stack block allocations/deallocations, the inlining of a function g into a function f transforms the sequence of events alloc f; alloc g; free g; free f into the sequence alloc f; free f (as shown in Fig. 12a). If the function call to g gets transformed into a tail-call, the same sequence becomes alloc f; free f; alloc g; free g instead (as shown in Fig. 12b).

Fig. 12 Transformations induced by function inlining and tailcall recognition: (a) memory operations and matching relation for function inlining; (b) memory operations and matching relation for tailcall recognition
Figure 12 pictures the matching relation (with dashed lines) that we should capture between source and target programs. The issue is that all the theorems we have about memory injections and memory allocation/deallocation require that every operation in the source program has a matching operation in the target program, and the transformations induced by function inlining and tailcall recognition do not fit in that setting. Instead, there are allocations and deallocations that have no counterpart in the target program (for the inlined function), or operations are reordered, making it impossible to use the available lemmas. While appropriate lemmas exist in the original CompCert, they are more subtle to prove in CompCertS because allocations and deallocations affect the set of valid concrete memories and therefore the behaviour of normalisations: it is unclear how to preserve the behaviour of normalisations in such cases, and a more thorough study of these transformations is needed to reprove such theorems. We would also need to record a subtle relation between the sizes of the memories in the source and target programs, to capture the fact that the target program has already freed its stack block (and the associated provisioned memory boxes) while the source program has not yet done so (e.g. in the second matching of Fig. 12b).

7 Related Work

Formal semantics for C. The first realistic formal semantics of C is due to Norrish [18]. More recent works [9,12,13] aim at providing a formal account of the subtleties of the C standard. Hathhorn et al. [9] present an executable C semantics within the K framework. They extend the previous work of Ellison and Rosu [8] to precisely characterise the undefined behaviours of C. Krebbers [12,13] gives a formal account of sequence points and non-aliasing, which are probably the most intricate notions of the ISO C standard. Memarian et al. [16] conduct a survey among C experts in which they aim at capturing the de facto semantics of C; they remark that uninitialised values and pointer arithmetic are commonly used. Our work builds upon the CompCert C compiler [14], whose semantics and memory model are close to ISO C. Our previous works [3,4] show how to extend the support for pointer arithmetic and adapt most of the front-end of CompCert to this extended semantics, with the notable exception of the SimplLocals pass, which requires the sophisticated proof argument detailed in the present paper.

CompCert and memory consumption. CompCert observes the I/O behaviour of programs but not their resource usage. Carbonneaux et al. [7] propose a logic for reasoning, at source level, about the resource consumption of target programs compiled by CompCert. They instrument the event traces with resource consumption events that are preserved by compilation, and use the compiler itself to determine the actual size of stack frames. We borrow from them the idea of using a compiler-generated oracle. Their approach to finite memory is more lightweight than ours and does not require modifying the memory model; however, our ambition to reason about symbolic values in CompCert requires more intrusive changes. CompCertTSO [20] is a version of CompCert implementing a TSO relaxed memory model. It also models a finite memory where pointers are pairs of integers.
Their soundness theorem is oblivious to out-of-memory errors. They remark that they could exploit memory bounds computed by the compiler, but do not implement it. In terms of expressiveness, their semantics and ours seem to be incomparable. For instance, CompCertTSO gives a defined semantics to the comparison of arbitrary pointers, whereas we do not: our semantics requires that the evaluation of a symbolic value be the same in every valid concrete memory, and a comparison p1 < p2 may evaluate differently depending on the memory layout if p1 and p2 point to different objects; it therefore results in undefined behaviour, just as in CompCert. Conversely, the example of Sect. 2.3.1 is not handled by the formal semantics of CompCertTSO.

Pointers as integers. Kang et al. [11] propose a hybrid memory model where an abstract pointer is mapped to a concrete address at pointer-to-integer cast time. Their semantics may get stuck at cast time if there is not enough memory available; in our semantics, a cast is a no-op and execution may instead get stuck at allocation time. They study aggressive program optimisations but do not preserve memory consumption. In CompCertS, we consider simpler optimisations, but they are implemented in a working compiler for a real language; moreover, we ensure that memory consumption is preserved by compilation. Mullen et al. [17] present Peek, a framework to certify peephole optimisations within CompCert. Peek leverages a low-level memory model, ASMZ32, for the assembly language of CompCert, where pointers are integers. This more defined semantics makes it possible to validate peephole optimisations that are unsound for the more abstract model of CompCert. They give an axiomatic definition of a memory allocator and prove that, in the absence of memory exhaustion, their low-level memory model simulates the memory model of CompCert. In CompCertS, we provide a stronger guarantee and ensure the preservation of memory usage using a higher-level memory model. In theory, because our normalise function may return undef, our semantics is less defined than ASMZ32. Nonetheless, we believe that most, if not all, of the peephole optimisations presented by Mullen et al. are also sound for our semantics.

8 Conclusion

We present CompCertS, an extension of the CompCert compiler that is based on a more defined semantics and provides additional guarantees about the compiled code. Programs performing low-level bitwise operations on pointers are now covered by the semantics preservation theorem, and can thus be compiled safely.
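The following tagged-pointer idiom illustrates the kind of low-level code we have in mind (a hypothetical sketch of ours, not an example from the development; the name read_tagged and the tagging scheme are arbitrary). The bitwise manipulation is layout-independent, since the recovered pointer does not depend on where the object is placed in memory; by contrast, a comparison between pointers to two distinct objects would still be undefined, as discussed above.

    #include <stdint.h>

    /* Hypothetical tagged-pointer idiom: the low bit of a pointer to a
     * sufficiently aligned object stores a flag and is masked off before
     * the pointer is used again.  Because the recovered pointer does not
     * depend on the memory layout, this bit-level manipulation can be
     * given a layout-independent meaning. */
    long read_tagged(long *p, int flag) {
      uintptr_t tagged = (uintptr_t) p | (uintptr_t) (flag & 1); /* set low bit   */
      long *q = (long *) (tagged & ~(uintptr_t) 1);              /* clear low bit */
      return *q;                    /* q equals p, assuming p is properly aligned */
    }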
CompCertS also guarantees that the compiled program does not require more memory than the source program. This is done by instrumenting the semantics with an oracle providing, for each function, the size of its stack frame. CompCertS compiles down to assembly; compared to CompCert, we adapted all 4 passes of the front-end and 12 out of the 14 passes of the back-end. This whole work amounts to more than 210k lines of Coq code, which is 60k more than the original CompCert 2.4, and is the result of approximately 3 person-years.

CompCertS does not feature the inlining and tailcall optimisations. The inlining optimisation may increase the memory consumption of functions. This disagrees with our decreasing-memory-size policy, but we should be able to provision memory in a similar way as we did for the Stacking pass. Tailcall recognition transforms regular function calls into tail calls when appropriate. Its proof cannot be adapted in a straightforward way because of the additional stack space we introduced for the Stacking pass: the release of those blocks does not happen at the same place before and after the transformation. We need to investigate the proof of this optimisation further and come up with a more complex invariant on memory states.

As future work, we shall investigate how security-related program transformations would benefit from the increased expressiveness of CompCertS. Recently, Blazy and Trieu [6] pioneered the integration of state-of-the-art obfuscations within CompCert; data obfuscations based on bitwise operations cannot be proved sound for pointers with CompCert. Lastly, every function currently stores its stack frame in a distinct block, even at the assembly level. A final compiler pass that merges these blocks into a concrete stack is possible with our finite memory and would bring even more confidence in CompCertS.

Acknowledgements This work has been partially funded by the ANR project AnaStaSec ANR-14-CE28-0014, NSF Grant 1521523 and DARPA Grant FA8750-12-2-0293.

References

1. Bedin Franca, R., Blazy, S., Favre-Felix, D., Leroy, X., Pantel, M., Souyris, J.: Formally verified optimizing compilation in ACG-based flight control software. In: ERTS 2012: Embedded Real Time Software and Systems (2012)
2. Besson, F., Blazy, S., Wilke, P.: Companion website. http://www.cs.yale.edu/homes/wilke-pierre/jar18/
3. Besson, F., Blazy, S., Wilke, P.: A precise and abstract memory model for C using symbolic values. In: APLAS, LNCS, vol. 8858 (2014)
4. Besson, F., Blazy, S., Wilke, P.: A concrete memory model for CompCert. In: ITP, LNCS, vol. 9236. Springer, Berlin (2015)
5. Besson, F., Blazy, S., Wilke, P.: A verified CompCert front-end for a memory model supporting pointer arithmetic and uninitialised data. J. Autom. Reason. pp. 1–48 (2017). https://doi.org/10.1007/s10817-017-9439-z
6. Blazy, S., Trieu, A.: Formal verification of control-flow graph flattening. In: CPP. ACM, New York (2016)
7. Carbonneaux, Q., Hoffmann, J., Ramananandro, T., Shao, Z.: End-to-end verification of stack-space bounds for C programs. In: PLDI. ACM, New York (2014)
8. Ellison, C., Rosu, G.: An executable formal semantics of C with applications. SIGPLAN Not. 47(1) (2012). https://doi.org/10.1145/2103621.2103719
9. Hathhorn, C., Ellison, C., Rosu, G.: Defining the undefinedness of C. In: PLDI. ACM, New York (2015)
10. ISO: ISO C Standard 2011. Tech. rep. (2011)
11. Kang, J., Hur, C., Mansky, W., Garbuzov, D., Zdancewic, S., Vafeiadis, V.: A formal C memory model supporting integer-pointer casts. In: PLDI (2015)
12. Krebbers, R.: Aliasing restrictions of C11 formalized in Coq. In: CPP, LNCS, vol. 8307. Springer, Berlin (2013). https://doi.org/10.1007/978-3-319-03545-1_4
13. Krebbers, R.: An operational and axiomatic semantics for non-determinism and sequence points in C. In: POPL. ACM, New York (2014)
14. Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009). http://gallium.inria.fr/~xleroy/publi/compcert-CACM.pdf
15. Leroy, X., Blazy, S.: Formal verification of a C-like memory model and its uses for verifying program transformations. J. Autom. Reason. 41(1), 1–31 (2008)
16. Memarian, K., Matthiesen, J., Lingard, J., Nienhuis, K., Chisnall, D., Watson, R.N., Sewell, P.: Into the depths of C: elaborating the de facto standards. In: PLDI. ACM, New York (2016)
17. Mullen, E., Zuniga, D., Tatlock, Z., Grossman, D.: Verified peephole optimizations for CompCert. In: PLDI, pp. 448–461. ACM, New York (2016). https://doi.org/10.1145/2908080
18. Norrish, M.: C formalised in HOL. Ph.D. thesis, University of Cambridge, Cambridge (1998)
19. Robert, V., Leroy, X.: A formally-verified alias analysis. In: CPP, LNCS, vol. 7679. Springer, Berlin (2012). http://gallium.inria.fr/~xleroy/publi/alias-analysis.pdf
20. Ševčík, J., Vafeiadis, V., Zappa Nardelli, F., Jagannathan, S., Sewell, P.: CompCertTSO: a verified compiler for relaxed-memory concurrency. J. ACM 60(3), 22:1–22:50 (2013). https://doi.org/10.1145/2487241.2487248