Sandrine Blazy

    • I am a professor of computer science.
    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be exempt from mis-compilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. This article gives an overview of the use of CompCert to gain certification credits for a highly safety-critical industry application, certified according to IEC 60880. We will briefly introduce the target application, illustrate the process of changing the existing compiler infrastructure to CompCert, and discuss performance characteristics. The main part focuses on the tool qualification strategy, in particular on how to take advantage of the formal correctness proof in the certification process.
    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be free from miscompilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. CompCert's intended use is the compilation of safety-critical and mission-critical software meeting high levels of assurance. This article gives an overview of the design of CompCert and its proof concept, summarizes the resulting confidence argument, and gives an overview of relevant tool qualification strategies. We briefly summarize practical experience and give an overview of recent CompCert developments.
    A proof assistant is an interactive software tool that lets its user build proofs semi-automatically while guaranteeing that these proofs are correct. This kind of tool is particularly useful for verifying critical software. This article presents Coq, a proof assistant whose development is coordinated by the Inria research institute. Its use is first presented through a very simple example: the verification of a sorting function. A second part then presents some application domains, notably software safety and research in computer science and mathematics. Coq is considered one of the most reliable tools for software validation, which is explained by its theoretical foundations and its evolution over more than 30 years of research and development.
    This paper describes an approach for reusing design patterns that have been formally specified. Reusing such a pattern means instantiating it, composing it with other patterns, or extending it. Three levels of composition are defined: juxtaposition, composition with inter-patterns links and unification. This paper shows through examples how to define specification patterns in B, how to reuse
    This paper describes a technique and a tool that support partial evaluation of FORTRAN programs, i.e., their specialization for specific values of their input variables. The authors’ aim is to understand old programs, which have become very complex due to numerous extensions. From a given FORTRAN program and these values of its input variables, the tool provides a simplified program, which behaves like the initial program for the specific values. This tool mainly uses constant propagation and simplification of alternatives to one of their branches. The tool is specified in terms of inference rules and operates by induction on the FORTRAN abstract syntax. These rules are compiled into Prolog by the Centaur/FORTRAN programming environment. The completeness and soundness of these rules are proven using rule induction.
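
    The two simplifications named in this abstract, constant propagation and reduction of an alternative to one of its branches, can be illustrated with a minimal sketch. It works on a toy tuple-based AST that is purely hypothetical (it is not FORTRAN and not the Centaur representation), and the function names `simplify_expr` and `specialize` are assumptions made for illustration.

```python
# Toy AST (hypothetical, for illustration only):
#   expressions: ("num", n) | ("var", x) | ("add", e1, e2) | ("lt", e1, e2)
#   statements:  ("assign", x, e) | ("if", cond, then_stmts, else_stmts)

def simplify_expr(expr, env):
    """Replace known variables by their values and fold constant operations."""
    kind = expr[0]
    if kind == "num":
        return expr
    if kind == "var":
        return ("num", env[expr[1]]) if expr[1] in env else expr
    left = simplify_expr(expr[1], env)
    right = simplify_expr(expr[2], env)
    if left[0] == "num" and right[0] == "num":
        if kind == "add":
            return ("num", left[1] + right[1])
        if kind == "lt":
            return ("num", int(left[1] < right[1]))
    return (kind, left, right)


def specialize(stmts, env):
    """Return the residual statements; env (known variable values) is updated in place."""
    residual = []
    for stmt in stmts:
        if stmt[0] == "assign":
            _, name, expr = stmt
            value = simplify_expr(expr, env)
            if value[0] == "num":
                env[name] = value[1]      # the variable is now a known constant
            else:
                env.pop(name, None)       # the variable is no longer known
            residual.append(("assign", name, value))
        else:  # "if"
            _, cond, then_branch, else_branch = stmt
            cond = simplify_expr(cond, env)
            if cond[0] == "num":
                # Known condition: keep only the branch that is taken.
                chosen = then_branch if cond[1] != 0 else else_branch
                residual.extend(specialize(chosen, env))
            else:
                # Unknown condition: specialize both branches on copies of env
                # and keep only the facts on which both branches agree.
                then_env, else_env = dict(env), dict(env)
                new_then = specialize(then_branch, then_env)
                new_else = specialize(else_branch, else_env)
                residual.append(("if", cond, new_then, new_else))
                agreed = {k: v for k, v in then_env.items()
                          if k in else_env and else_env[k] == v}
                env.clear()
                env.update(agreed)
    return residual


program = [
    ("assign", "y", ("add", ("var", "n"), ("num", 1))),
    ("if", ("lt", ("var", "y"), ("num", 10)),
     [("assign", "z", ("add", ("var", "x"), ("var", "y")))],
     [("assign", "z", ("num", 0))]),
]
residual = specialize(program, {"n": 3})
```

    With `n` fixed to 3, the conditional disappears and only the assignment of the taken branch remains, with `y` replaced by its value; `x` stays symbolic because its value is not part of the specialization.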
    This paper describes an approach for reusing specification patterns. Specification patterns are design patterns that are expressed in a formal specification language. Reusing a specification pattern means instantiating it or composing it with other specification patterns. Three levels of composition are defined: juxtaposition, composition with inter-patterns links and unification. This paper shows through examples how to define specification patterns in B, how to reuse them directly in B, and also how to reuse the proofs associated with specification patterns.
    Liveness analysis is a standard compiler analysis, enabling several optimizations such as dead-code elimination. The SSA form is a popular compiler intermediate language allowing for simple and fast optimizations. Boissinot et al. [7] designed a fast liveness analysis by combining the specific properties of SSA with graph-theoretic ideas such as depth-first search and dominance. We formalize their approach in the Coq proof assistant, inside the CompCertSSA verified C compiler. We also compare this approach experimentally with the classic data-flow-based liveness analysis on CompCert’s benchmarks, and observe performance gains.
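
    For reference, the classic data-flow-based liveness analysis used as the baseline can be sketched as a backward fixpoint iteration. This is a minimal illustration of the textbook algorithm, not of the SSA-based approach of Boissinot et al.; the CFG encoding and the names `liveness`, `succ`, `use` and `defs` are assumptions.

```python
def liveness(succ, use, defs):
    """Iterative backward data-flow liveness.

    succ: node -> list of successor nodes
    use:  node -> set of variables read by the node
    defs: node -> set of variables written by the node
    Returns (live_in, live_out), two dicts mapping nodes to sets of variables.
    """
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # A variable is live out of n if it is live into some successor.
            out = set()
            for s in succ[n]:
                out |= live_in[s]
            # It is live into n if n reads it, or if it is live out and not overwritten.
            inn = use[n] | (out - defs[n])
            if out != live_out[n] or inn != live_in[n]:
                live_out[n], live_in[n] = out, inn
                changed = True
    return live_in, live_out


# n1: x = 1;  n2: y = x + 1;  n3: if y < 10 goto n2;  n4: return y
succ = {"n1": ["n2"], "n2": ["n3"], "n3": ["n2", "n4"], "n4": []}
use = {"n1": set(), "n2": {"x"}, "n3": {"y"}, "n4": {"y"}}
defs = {"n1": {"x"}, "n2": {"y"}, "n3": set(), "n4": set()}
live_in, live_out = liveness(succ, use, defs)
```

    In this example `x` stays live around the loop because `n2` reads it on every iteration, whereas nothing is live on entry to `n1`.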
    Static code analysis is increasingly used to guarantee the absence of undesirable behaviors in industrial programs. Designing sound analyses is a continuing trade-off between precision and complexity. Notably, dataflow analyses often perform overly wide approximations when two control-flow paths meet, by merging states from each path. This paper presents a generic abstract-interpretation-based framework to enhance the precision of such analyses on join points. It relies on predicated domains, which preserve and reuse information valid only inside some branches of the code. Our predicates are derived from conditional statements, and postpone the loss of information. The work has been integrated into Frama-C, a C source code analysis platform. Experiments on real generated code show that our approach scales, and significantly improves the precision of the existing analyses of Frama-C. We automatically extend existing abstract domains. The new information reflects the structure of the conditionals of the program. This approach keeps path-sensitive information. Our transfer functions have been designed to scale on real programs. Our technique has been successfully applied to complex generated programs.
    Static analysis of binary code is challenging for several reasons. In particular, standard static analysis techniques operate over control flow graphs, which are not available when dealing with self-modifying programs, which can modify their own code at runtime. We formalize in the Coq proof assistant some key abstract interpretation techniques that automatically extract memory safety properties from binary code. Our analyzer is formally proved correct and has been run on several self-modifying challenges, provided by Cai et al. in their PLDI 2007 paper.
    s Short Papers Advanced Development of Certified OS Kernels
    Software Fault Isolation (SFI) is a security-enhancing program transformation for instrumenting an untrusted binary module so that it runs inside a dedicated isolated address space, called a sandbox. To ensure that the untrusted module cannot escape its sandbox, existing approaches such as Google’s Native Client rely on a binary verifier to check that all memory accesses are within the sandbox. Instead of relying on a posteriori verification, we design, implement and prove correct a program instrumentation phase as part of the formally verified compiler CompCert that enforces a sandboxing security property a priori. This eliminates the need for a binary verifier and, instead, leverages the soundness proof of the compiler to prove the security of the sandboxing transformation. The technical contributions are a novel sandboxing transformation that has a well-defined C semantics and which supports arbitrary function pointers, and a formally verified C compiler that implements SFI. Experiments show that our formally verified technique is a competitive way of implementing SFI.
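
    The sandboxing property can be illustrated with the classic address-masking idea from the SFI literature: before every memory access, the address is forced into the sandbox by bitwise operations. The sketch below only models that arithmetic on integers; the constants `SANDBOX_BASE` and `SANDBOX_SIZE` and both function names are hypothetical, and this is not claimed to be the transformation implemented in CompCert.

```python
SANDBOX_SIZE = 1 << 16        # hypothetical sandbox size, a power of two
SANDBOX_BASE = 0x2000_0000    # hypothetical base address, aligned on SANDBOX_SIZE


def sandbox_address(addr):
    """Force an arbitrary address into the sandbox by masking its high bits."""
    return SANDBOX_BASE | (addr & (SANDBOX_SIZE - 1))


def in_sandbox(addr):
    """Check that an address lies inside the sandbox."""
    return SANDBOX_BASE <= addr < SANDBOX_BASE + SANDBOX_SIZE


# An address already inside the sandbox is left unchanged ...
inside = sandbox_address(SANDBOX_BASE + 0x10)
# ... while an arbitrary address is redirected into the sandbox.
redirected = sandbox_address(0xDEADBEEF)
```

    Because masking is applied unconditionally, no run-time check or later binary verification is needed to know that the resulting address is inside the sandbox; an out-of-sandbox access is silently redirected rather than trapped.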
    A formally verified compiler is a compiler that comes with a machine-checked proof that no bug is introduced during compilation. This correctness property states that the compiler preserves the semantics of programs. Formally verified compilers guarantee the absence of correctness bugs, but do not protect against other classes of bugs, such as security bugs. This limitation partly arises from the traditional form of stating compiler correctness as preservation of semantics, which does not capture non-functional properties such as security. Moreover, proof techniques for compiler correctness, including the traditional notions of simulation, do not immediately apply to secure compilation, and need to be extended accordingly. This talk will address the challenges of secure compilation from the specific angle of turning an existing formally verified compiler into a formally verified secure compiler. Two case studies will illustrate this approach, where each case study addresses a notion of security and uses modular reasoning (first proving correctness, then security) to show that compilation preserves security. Specifically, we consider the problem of secure compilation for CompCert, a formally verified, moderately optimizing compiler for C programs, programmed and verified using the Coq proof assistant [1]. CompCert evolved significantly over the last 15 years, starting as an academic project and now being used in commercial settings [2]. The first case study focuses on software fault isolation and considers a novel security-enhancing sandboxing transformation [3]; it ensures that an untrusted module cannot escape its dedicated isolated address space. The second case study [4] focuses on side-channel protection, and considers cryptographic constant-time, a popular software-based countermeasure against timing-based and cache-based attacks. Informally, an implementation is secure with respect to the cryptographic constant-time policy if its control flow and sequence of memory accesses do not depend on secrets.
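
    The informal definition of cryptographic constant-time can be made concrete with a small leakage model. In the sketch below, each function appends the events an attacker is assumed to observe (here, only branch directions) to a trace; a function satisfies the policy if the trace is the same for all secrets. The function names and the event encoding are assumptions made for illustration, and Python itself offers no timing guarantee.

```python
MASK32 = 0xFFFFFFFF


def select_branching(secret_bit, a, b, trace):
    """Return a if secret_bit is 1, else b, by branching on the secret."""
    trace.append(("branch", bool(secret_bit)))   # the branch direction is observable
    return a if secret_bit else b


def select_constant_time(secret_bit, a, b, trace):
    """Same result, computed with bitwise operations only: no branch, no leak."""
    mask = -secret_bit & MASK32                  # 0xFFFFFFFF if secret_bit is 1, else 0
    return (a & mask) | (b & ~mask & MASK32)


def leakage(select, secret_bit):
    """Run a selection function and return the trace of observable events."""
    trace = []
    select(secret_bit, 7, 9, trace)
    return trace


leaky = leakage(select_branching, 0) != leakage(select_branching, 1)
safe = leakage(select_constant_time, 0) == leakage(select_constant_time, 1)
```

    Both functions compute the same value, but only the second produces a trace that is independent of the secret, which is the property a constant-time-preserving compiler has to maintain in the generated code.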
    Program comprehension is the most tedious and time consuming task of software maintenance, an important phase of the software life cycle. This is particularly true while maintaining scientific application programs that have been written in Fortran for decades and that are still vital in various domains even though more modern languages are used to implement their user interfaces. Very often, programs have evolved as their application domains increase continually and have become very complex due to extensive modifications. This generality in programs is implemented by input variables whose value does not vary in the context of a given application. Thus, it is very interesting for the maintainer to propagate such information, that is to obtain a simplified program, which behaves like the initial one when used according to the restriction. We have adapted partial evaluation for program comprehension. Our partial evaluator performs mainly two tasks: constant propagation and statements simplification. It includes an interprocedural alias analysis. As our aim is program comprehension rather than optimization, there are two main differences with classical partial evaluation. We do not change the original
    Modern Just-in-Time compilers (or JITs) typically interleave several mechanisms to execute a program. For faster startup times and to observe the initial behavior of an execution, interpretation can be initially used. But after a while, JITs dynamically produce native code for parts of the program they execute often. Although some time is spent compiling dynamically, this mechanism makes for much faster times for the remainder of the program execution. Such compilers are complex pieces of software with various components, and greatly rely on a precise interplay between the different languages being executed, including on-stack-replacement. Traditional static compilers like CompCert have been mechanized in proof assistants, but JITs have been scarcely formalized so far, partly due to their impure nature and their numerous components. This work presents a model JIT with dynamic generation of native code, implemented and formally verified in Coq. Although some parts of a JIT cannot be ...
    Observational non-interference (ONI) is a generic information-flow policy for side-channel leakage. Informally, a program is ONI-secure if observing program leakage during execution does not reveal any information about secrets. Formally, ONI is parametrized by a leakage function $\ell$, and different instances of ONI can be recovered through different instantiations of $\ell$. One popular instance of ONI is the cryptographic constant-time (CCT) policy, which is widely used in cryptographic libraries to protect against timing and cache attacks. Informally, a program is CCT-secure if it does not branch on secrets and does not perform secret-dependent memory accesses. Another instance of ONI is the constant-resource (CR) policy, a relaxation of the CCT policy which is used in Amazon's s2n implementation of TLS and in several other security applications. Informally, a program is CR-secure if its cost (modelled by a tick operator over an arbitrary semi-group) does not depend on secrets. In this paper, we consider the problem of preserving ONI by compilation. Prior work on the preservation of the CCT policy develops proof techniques for showing that main compiler optimisations preserve the CCT policy. However, these proof techniques critically rely on the fact that the semi-group used for modelling leakage satisfies the property: \begin{equation*}\ell_{1}+\ell_{1}^{\prime}=\ell_{2}+\ell_{2}^{\prime}\Rightarrow\ell_{1}=\ell_{2}\wedge\ell_{1}^{\prime}=\ell_{2}^{\prime}\end{equation*} Unfortunately, this non-cancelling property fails for the CR policy, because its underlying semi-group is $(\mathbb{N}, +)$, and it is currently not known how to extend existing techniques to policies that do not satisfy non-cancellation. We propose a methodology for proving the preservation of the CR policy during a program transformation. We present an implementation of some elementary compiler passes, and apply the methodology to prove that these passes preserve the policy. Our results have been mechanically verified using the Coq proof assistant.
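
    To see concretely why the non-cancelling property fails in $(\mathbb{N}, +)$, one possible counterexample (chosen here for illustration, not taken from the paper) uses the costs $\ell_{1}=1$, $\ell_{1}^{\prime}=2$, $\ell_{2}=2$ and $\ell_{2}^{\prime}=1$:

```latex
\begin{equation*}
\ell_{1}+\ell_{1}^{\prime} = 1+2 = 3 = 2+1 = \ell_{2}+\ell_{2}^{\prime}
\qquad\text{but}\qquad
\ell_{1}=1\neq 2=\ell_{2}
\quad\text{and}\quad
\ell_{1}^{\prime}=2\neq 1=\ell_{2}^{\prime}.
\end{equation*}
```

    Two executions can thus have the same total cost while distributing it differently over their steps, so equality of total leakage does not imply step-by-step equality of leakage.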
    Habilitation thesis (Habilitation à diriger des recherches), specialty: computer science
    We present a new modular way to structure abstract interpreters. Modular means that new analysis domains may be plugged in. These abstract domains can communicate through different means to achieve maximal precision. First, all abstractions work cooperatively to emit alarms that exclude the undesirable behaviors of the program. Second, the state abstract domains may exchange information through abstractions of the possible value for expressions. Those value abstractions are themselves extensible, should two domains require a novel form of cooperation. We used this approach to design Eva, an abstract interpreter for C implemented within the Frama-C framework. We present the domains that are available so far within Eva, and show that this communication mechanism is able to handle them seamlessly.
    Motivated by applications to security and high efficiency, we propose an automated methodology for validating on low-level intermediate representations the results of a source-level static analysis. Our methodology relies on two main ingredients: a relative-safety checker, an instance of a relational verifier which proves that a program is "safer" than another, and a transformation of programs into defensive form which verifies the analysis results at runtime. We prove the soundness of the methodology, and provide a formally verified instantiation based on the Verasco verified C static analyzer and the CompCert verified C compiler. We experiment with the effectiveness of our approach with client optimizations at RTL level, and static analyses for cache-based timing side-channels and memory usage at pre-assembly levels.
    Just-in-Time compilation consists in interleaving program interpretation and compilation at run-time, to achieve better performance than standard interpretation. While some of the execution time is spent compiling, a JIT compiler can leverage run-time information to make speculative optimizations. These optimizations create optimized versions of functions given some assumptions. While static compilers have been the topic of many formal verification works, few have tackled JIT compilation verification. We present our ongoing work about formal verification of a Just-in-Time compiler.
    This thesis presents several definitions of formal semantics and of program transformations, and discusses the associated design choices. In particular, it describes a program transformation inspired by partial evaluation and dedicated to the comprehension of scientific programs written in Fortran. It also details the front-end of a realistic compiler for the C language, which has been formally verified in C
    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be exempt from mis-compilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. This article gives an overview of the design of CompCert and its proof concept and then focuses on aspects relevant for industrial application. We briefly summarize practical experience and give an overview of recent CompCert development aiming at industrial usage. CompCert's intended use is the compilation of life-critical and mission-critical software meeting high levels of assurance. In this context tool qualification is of paramount importance. We summarize the confidence argument of CompCert and give an overview of relevant qualification strategies.
    We present the contents of a new formal methods course taught to undergraduate students in their third year at the University of Rennes 1 in France. This course aims at introducing students to formal methods, using the Why3 platform for deductive verification. It exposes students to several techniques, ranging from testing specifications, designing loop invariants, building adequate data structures and their type invariants, to the use of ghost code. At the end of the course, most of the students were able to prove non-trivial sorting algorithms correct in an automated way, as well as standard recursive algorithms on binary search trees.
    The insertion of expressions mixing arithmetic operators and bitwise boolean operators is a widespread protection of sensitive data in source programs. This recent advanced obfuscation technique is one of the least studied among program obfuscations, even though it is commonly found in binary code. In this paper, we formally verify in Coq this data obfuscation. It operates over a generic notion of mixed boolean-arithmetic expressions and on properties of bitwise operators operating over machine integers. Our obfuscation performs two kinds of program transformations: rewriting of expressions and insertion of modular inverses. To facilitate its proof of correctness, we define boolean semantic tables, a data structure inspired from truth tables. Our obfuscation is integrated into the CompCert formally verified compiler where it operates over Clight programs. The automatic extraction of our program obfuscator into OCaml yields a program with competitive results.
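
    The two kinds of transformations mentioned in this abstract can be illustrated on 32-bit machine integers. The sketch below uses a well-known mixed boolean-arithmetic identity, x + y = (x xor y) + 2 (x and y), and an affine encoding that is undone with a modular inverse. Both are textbook examples and are not claimed to be the specific rewrite rules verified in the paper; all names are hypothetical, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
MASK32 = 0xFFFFFFFF   # models 32-bit machine integers with wraparound


def add_plain(x, y):
    """Ordinary 32-bit addition."""
    return (x + y) & MASK32


def add_mba(x, y):
    """Same addition, rewritten as a mixed boolean-arithmetic expression."""
    return ((x ^ y) + 2 * (x & y)) & MASK32


def encode(x, a, b):
    """Hide x behind the affine map x -> a*x + b (mod 2**32); a must be odd."""
    return (a * x + b) & MASK32


def decode(y, a, b):
    """Recover x using the modular inverse of a modulo 2**32."""
    a_inv = pow(a, -1, 1 << 32)
    return (a_inv * (y - b)) & MASK32


samples = [(0, 0), (1, 1), (0xFFFFFFFF, 1), (0x12345678, 0x9ABCDEF0)]
rewrites_agree = all(add_mba(x, y) == add_plain(x, y) for x, y in samples)
round_trips = all(decode(encode(x, 0x9E3779B1, 42), 0x9E3779B1, 42) == x
                  for x, _ in samples)
```

    The multiplier must be odd so that it is invertible modulo 2**32; the identity itself holds because the exclusive-or computes the sum without carries and the conjunction computes the carries.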
    Just-in-time compilers for dynamic languages routinely generate code under assumptions that may be invalidated at run-time; this allows for specialization of program code to the common case in order to avoid unnecessary overheads due to uncommon cases. This form of software speculation requires support for deoptimization when some of the assumptions fail to hold. This paper presents a model just-in-time compiler with an intermediate representation that makes explicit the synchronization points used for deoptimization and the assumptions made by the compiler's speculation. We also present several common compiler optimizations that can leverage speculation to generate improved code. The optimizations are proved correct with the help of a proof assistant. While our work stops short of proving native code generation, we demonstrate how one could use the verified optimization to obtain significant speed ups in an end-to-end setting.
    Timing side-channels are arguably one of the main sources of vulnerabilities in cryptographic implementations. One effective mitigation against timing side-channels is to write programs that do not perform secret-dependent branches and memory accesses. This mitigation, known as "cryptographic constant-time", is adopted by several popular cryptographic libraries. This paper focuses on compilation of cryptographic constant-time programs, and more specifically on the following question: is the code generated by a realistic compiler for a constant-time source program itself provably constant-time? Surprisingly, we answer the question positively for a mildly modified version of the CompCert compiler, a formally verified and moderately optimizing compiler for C. Concretely, we modify the CompCert compiler to eliminate sources of potential leakage. Then, we instrument the operational semantics of CompCert intermediate languages so as to be able to capture cryptographic constant-t...
    Abstract interpretation provides advanced techniques to infer numerical invariants on programs. There is an abundant literature about numerical abstract domains that operate on scalar variables. This work deals with lifting these techniques to a realistic C memory model. We present an abstract memory functor that takes as argument any standard numerical abstract domain, and builds a memory abstract domain that finely tracks properties about memory contents, taking into account union types, pointer arithmetic and type casts. This functor is implemented and verified inside the Coq proof assistant with respect to the CompCert compiler memory model. Using the Coq extraction mechanism, it is fully executable and used by the Verasco C static analyzer.
    Code obfuscation is designed to impede the reverse engineering of binary software. Dynamic data tainting is an analysis technique used to identify dependencies between data in software. Performing dynamic data tainting on obfuscated software usually yields hard-to-exploit results, due to over-tainted data. Such results are clearly identifiable as useless: an attacker will immediately discard them and opt for an alternative tool. In this paper, we present a code transformation technique meant to prevent the identification of useless results: a few lines of code are inserted in the obfuscated software, so that the results obtained by the dynamic data tainting approach appear acceptable. These results nevertheless remain wrong and lead an attacker to waste enough time and resources trying to analyze incorrect data dependencies, so that he will usually decide to use less automated and advanced analysis techniques, and maybe give up reverse engineering the current binary software. This improves the security of the software against malicious analysis.
    Code obfuscation is emerging as a key asset in security by obscurity. It aims at hiding sensitive information in programs so that they become more difficult to understand and reverse engineer. Since the results on the impossibility of perfect and universal obfuscation, many obfuscation techniques have been proposed in the literature, ranging from simple variable encoding to hiding the control-flow of a program. In this paper, we formally verify in Coq an advanced code obfuscation called control-flow graph flattening, that is used in state-of-the-art program obfuscators. Our control-flow graph flattening is a program transformation operating over C programs, that is integrated into the CompCert formally verified compiler. The semantics preservation proof of our program obfuscator relies on a simulation proof performed on a realistic language, the Clight language of CompCert. The automatic extraction of our program obfuscator into OCaml yields a program with competitive results.
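
    Control-flow graph flattening can be illustrated on a small function: every basic block becomes a case of a single dispatch loop, and a state variable selects the next block, so the original loop structure is no longer visible in the control-flow graph. The sketch below is a hand-written illustration in Python, not output of the verified obfuscator; the block numbering and function names are hypothetical.

```python
def gcd_structured(a, b):
    """Euclid's algorithm with its natural loop structure."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a, b):
    """Same computation after flattening: one dispatch loop, one state variable."""
    state = 0                      # hypothetical numbering of the basic blocks
    while True:
        if state == 0:             # block 0: loop test
            state = 1 if b != 0 else 2
        elif state == 1:           # block 1: loop body
            a, b = b, a % b
            state = 0
        else:                      # block 2: exit
            return a


pairs = [(48, 18), (17, 5), (0, 7), (7, 0), (100, 75)]
same_results = all(gcd_flattened(a, b) == gcd_structured(a, b) for a, b in pairs)
```

    Semantics preservation, which the paper proves by simulation on Clight, corresponds here to the two functions returning the same result on every input.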