Sandrine Blazy

    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. This article gives an overview of the use of CompCert to gain certification credits for a highly safety-critical industry application, certified according to IEC 60880. We will briefly introduce the target application, illustrate the process of changing the existing compiler infrastructure to CompCert, and discuss performance characteristics. The main part focuses on the tool qualification strategy, in particular on how to take advantage of the formal correctness proof in the certification process.
    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be free from miscompilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. CompCert's intended use is the compilation of safety-critical and mission-critical software meeting high levels of assurance. This article gives an overview of the design of CompCert and its proof concept, summarizes the resulting confidence argument, and gives an overview of relevant tool qualification strategies. We briefly summarize practical experience and give an overview of recent CompCert developments.
    A proof assistant is an interactive piece of software that lets its user build proofs semi-automatically, while guaranteeing that these proofs are correct. This kind of tool is particularly useful for the verification of critical software. This article presents Coq, a proof assistant whose development is coordinated by the Inria research institute. Its use is first presented through a very simple example: the verification of a sorting function. A second part then presents some of its application domains, notably software safety and research in computer science and mathematics. Coq is considered one of the most reliable tools for software validation, which can be explained by the theoretical foundations of the tool and its evolution over more than 30 years of research and development.
    This paper describes an approach for reusing design patterns that have been formally specified. Reusing such a pattern means instantiating it, composing it with other patterns, or extending it. Three levels of composition are defined: juxtaposition, composition with inter-pattern links, and unification. This paper shows through examples how to define specification patterns in B, how to reuse them directly in B, and how to reuse the proofs associated with these patterns.
    This paper describes a technique and a tool that support partial evaluation of FORTRAN programs, i.e., their specialization for specific values of their input variables. The authors’ aim is to understand old programs, which have become very complex due to numerous extensions. From a given FORTRAN program and specific values of its input variables, the tool provides a simplified program, which behaves like the initial program for those values. This tool mainly uses constant propagation and simplification of alternatives to one of their branches. The tool is specified in terms of inference rules and operates by induction on the FORTRAN abstract syntax. These rules are compiled into Prolog by the Centaur/FORTRAN programming environment. The completeness and soundness of these rules are proven using rule induction.
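    As an illustration of the two tasks the tool performs, here is a hypothetical before/after example (written in C rather than FORTRAN, with invented names): the known input value is propagated and the conditional collapses to its live branch.

        #include <stdio.h>

        /* Original, general routine: 'mode' is an input variable that
           never varies in the application considered. */
        double step_general(double x, int mode) {
            if (mode == 0)
                return x * 2.0;
            else
                return x / 2.0;
        }

        /* Residual program after specialization for mode == 0: constant
           propagation removed the test and kept only the live branch. */
        double step_specialized(double x) {
            return x * 2.0;
        }

        int main(void) {
            printf("%f %f\n", step_general(1.5, 0), step_specialized(1.5));
            return 0;
        }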
    This paper describes an approach for reusing specification patterns. Specification patterns are design patterns that are expressed in a formal specification language. Reusing a specification pattern means instantiating it or composing it with other specification patterns. Three levels of composition are defined: juxtaposition, composition with inter-pattern links, and unification. This paper shows through examples how to define specification patterns in B, how to reuse them directly in B, and also how to reuse the proofs associated with specification patterns.
    Liveness analysis is a standard compiler analysis, enabling several optimizations such as dead-code elimination. The SSA form is a popular compiler intermediate language allowing for simple and fast optimizations. Boissinot et al. [7] designed a fast liveness analysis by combining the specific properties of SSA with graph-theoretic ideas such as depth-first search and dominance. We formalize their approach in the Coq proof assistant, inside the CompCertSSA verified C compiler. We also experimentally compare this approach against the classic data-flow-based liveness analysis on CompCert’s benchmarks, and observe performance gains.
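    A minimal sketch, on an invented C fragment, of the optimization that liveness analysis enables: the assignment to t is dead (t has no later use, so it is not live), and dead-code elimination may remove it.

        #include <stdio.h>

        int before(int a, int b) {
            int t = a * b;   /* 't' is not live after this point: no use follows */
            (void)t;         /* cast only silences the compiler warning          */
            return a + b;
        }

        int after(int a, int b) {
            return a + b;    /* same observable behaviour, one multiply fewer */
        }

        int main(void) {
            printf("%d %d\n", before(3, 4), after(3, 4));
            return 0;
        }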
    Static code analysis is increasingly used to guarantee the absence of undesirable behaviors in industrial programs. Designing sound analyses is a continuing trade-off between precision and complexity. Notably, dataflow analyses often perform overly wide approximations when two control-flow paths meet, by merging states from each path. This paper presents a generic abstract interpretation based framework to enhance the precision of such analyses on join points. It relies on predicated domains, that preserve and reuse information valid only inside some branches of the code. Our predicates are derived from conditional statements, and postpone the loss of information. The work has been integrated into Frama-C, a C source code analysis platform. Experiments on real generated code show that our approach scales, and improves significantly the precision of the existing analyses of Frama-C. We automatically extend existing abstract domains. The new information reflects the structure of the conditionals of the program. This approach keeps path-sensitive information. Our transfer functions have been designed to scale on real programs. Our technique has been successfully applied to complex generated programs.
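    A hypothetical C fragment showing the precision problem at a join point that predicated domains address: a plain analysis merging both branches forgets which branch set which value, while a predicate derived from the conditional (ok != 0 implies p == &buf) keeps the later dereference provably safe.

        #include <stdio.h>
        #include <stddef.h>

        int read_guarded(int cond) {
            int buf = 42;
            int *p;
            int ok;
            if (cond) { p = &buf; ok = 1; }
            else      { p = NULL; ok = 0; }
            /* join point: a merging analysis only knows p may be NULL;
               a predicated domain keeps:  ok != 0  ==>  p == &buf      */
            if (ok)
                return *p;   /* safe under the retained predicate */
            return 0;
        }

        int main(void) {
            printf("%d %d\n", read_guarded(1), read_guarded(0));
            return 0;
        }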
    Static analysis of binary code is challenging for several reasons. In particular, standard static analysis techniques operate over control flow graphs, which are not available when dealing with self-modifying programs which can modify their own code at runtime. We formalize in the Coq proof assistant some key abstract interpretation techniques that automatically extract memory safety properties from binary code. Our analyzer is formally proved correct and has been run on several self-modifying challenges, provided by Cai et al. in their PLDI 2007 paper.
    Advanced Development of Certified OS Kernels
    Software Fault Isolation (SFI) is a security-enhancing program transformation for instrumenting an untrusted binary module so that it runs inside a dedicated isolated address space, called a sandbox. To ensure that the untrusted module cannot escape its sandbox, existing approaches such as Google’s Native Client rely on a binary verifier to check that all memory accesses are within the sandbox. Instead of relying on a posteriori verification, we design, implement and prove correct a program instrumentation phase as part of the formally verified compiler CompCert that enforces a sandboxing security property a priori. This eliminates the need for a binary verifier and, instead, leverages the soundness proof of the compiler to prove the security of the sandboxing transformation. The technical contributions are a novel sandboxing transformation that has a well-defined C semantics and which supports arbitrary function pointers, and a formally verified C compiler that implements SFI. Experiments show that our formally verified technique is a competitive way of implementing SFI.
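    A minimal sketch of the classic masking idea behind SFI (a simplified illustration with invented names and constants, not the paper's actual CompCert transformation): every address computed by the untrusted module is masked so that the access stays inside the sandbox.

        #include <stdint.h>
        #include <stdio.h>

        #define SANDBOX_SIZE (1u << 20)   /* 1 MiB sandbox, a power of two */
        static uint8_t sandbox[SANDBOX_SIZE];

        /* Instrumented store: whatever offset the untrusted code computes,
           the mask forces the access inside 'sandbox'. */
        static void sfi_store(uint32_t offset, uint8_t value) {
            sandbox[offset & (SANDBOX_SIZE - 1)] = value;
        }

        int main(void) {
            sfi_store(0x12345678u, 7);   /* out-of-range offset is wrapped */
            printf("%u\n", (unsigned)sandbox[0x12345678u & (SANDBOX_SIZE - 1)]);
            return 0;
        }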
    A formally verified compiler is a compiler that comes with a machine-checked proof that no bug is introduced during compilation. This correctness property states that the compiler preserves the semantics of programs. Formally verified compilers guarantee the absence of correctness bugs, but do not protect against other classes of bugs, such as security bugs. This limitation partly arises from the traditional form of stating compiler correctness as preservation of semantics that do not capture non-functional properties such as security. Moreover, proof techniques for compiler correctness, including the traditional notions of simulation, do not immediately apply to secure compilation, and need to be extended accordingly. This talk will address the challenges of secure compilation from the specific angle of turning an existing formally-verified compiler into a formally-verified secure compiler. Two case studies will illustrate this approach, where each case study addresses a notion of security and uses modular reasoning (first proving correctness then security) to show that compilation preserves security. Specifically, we consider the problem of secure compilation for CompCert, a formally-verified moderately optimizing compiler for C programs, programmed and verified using the Coq proof assistant [1]. CompCert evolved significantly over the last 15 years, starting as an academic project and now being used in commercial settings [2]. The first case study focuses on software fault isolation and considers a novel security-enhancing sandboxing transformation [3]; it ensures that an untrusted module cannot escape its dedicated isolated address space. The second case study [4] focuses on side-channel protection, and considers cryptographic constant-time, a popular software-based countermeasure against timing-based and cache-based attacks. Informally, an implementation is secure with respect to the cryptographic constant-time policy if its control flow and sequence of memory accesses do not depend on secrets.
    Program comprehension is the most tedious and time-consuming task of software maintenance, an important phase of the software life cycle. This is particularly true when maintaining scientific application programs that have been written in Fortran for decades and that are still vital in various domains, even though more modern languages are used to implement their user interfaces. Very often, programs have evolved as their application domains expand continually, and have become very complex due to extensive modifications. This generality in programs is implemented by input variables whose value does not vary in the context of a given application. Thus, it is very useful for the maintainer to propagate such information, that is, to obtain a simplified program which behaves like the initial one when used according to the restriction. We have adapted partial evaluation for program comprehension. Our partial evaluator performs mainly two tasks: constant propagation and statement simplification. It includes an interprocedural alias analysis. As our aim is program comprehension rather than optimization, there are two main differences with classical partial evaluation. We do not change the original
    Modern Just-in-Time compilers (or JITs) typically interleave several mechanisms to execute a program. For faster startup times and to observe the initial behavior of an execution, interpretation can be used initially. But after a while, JITs dynamically produce native code for the parts of the program they execute often. Although some time is spent compiling dynamically, this mechanism makes the remainder of the program execution much faster. Such compilers are complex pieces of software with various components, and rely greatly on a precise interplay between the different languages being executed, including on-stack replacement. Traditional static compilers like CompCert have been mechanized in proof assistants, but JITs have been scarcely formalized so far, partly due to their impure nature and their numerous components. This work presents a model JIT with dynamic generation of native code, implemented and formally verified in Coq. Although some parts of a JIT cannot be ...
    Observational non-interference (ONI) is a generic information-flow policy for side-channel leakage. Informally, a program is ONI-secure if observing program leakage during execution does not reveal any information about secrets. Formally, ONI is parametrized by a leakage function $\ell$, and different instances of ONI can be recovered through different instantiations of $\ell$. One popular instance of ONI is the cryptographic constant-time (CCT) policy, which is widely used in cryptographic libraries to protect against timing and cache attacks. Informally, a program is CCT-secure if it does not branch on secrets and does not perform secret-dependent memory accesses. Another instance of ONI is the constant-resource (CR) policy, a relaxation of the CCT policy which is used in Amazon's s2n implementation of TLS and in several other security applications. Informally, a program is CR-secure if its cost (modelled by a tick operator over an arbitrary semi-group) does not depend on secrets. In this paper, we consider the problem of preserving ONI by compilation. Prior work on the preservation of the CCT policy develops proof techniques for showing that main compiler optimisations preserve the CCT policy. However, these proof techniques critically rely on the fact that the semi-group used for modelling leakage satisfies the following non-cancelling property: $\ell_1 + \ell_1' = \ell_2 + \ell_2' \Rightarrow \ell_1 = \ell_2 \wedge \ell_1' = \ell_2'$. Unfortunately, this property fails for the CR policy, because its underlying semi-group is $(\mathbb{N}, +)$, and it is currently not known how to extend existing techniques to policies that do not satisfy non-cancellation. We propose a methodology for proving the preservation of the CR policy during a program transformation. We present an implementation of some elementary compiler passes, and apply the methodology to prove the preservation of these passes. Our results have been mechanically verified using the Coq proof assistant.
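    To make the gap between the two policies concrete, here is an invented C example: the branch on secret violates the CCT policy, yet under a cost model over $(\mathbb{N}, +)$ where each branch costs the same number of ticks, the total cost is secret-independent, so the program satisfies the CR policy.

        #include <stdio.h>

        int f(int secret, int a, int b) {
            int r;
            if (secret)      /* CCT violation: control flow depends on the secret  */
                r = a + 1;   /* cost: one tick                                     */
            else
                r = b + 1;   /* cost: one tick, so the total cost never leaks      */
            return r;
        }

        int main(void) {
            printf("%d %d\n", f(0, 1, 2), f(1, 1, 2));
            return 0;
        }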
    Habilitation thesis (Mémoire d'Habilitation à Diriger des Recherches), specialty: computer science.
    We present a new modular way to structure abstract interpreters. Modular means that new analysis domains may be plugged in. These abstract domains can communicate through different means to achieve maximal precision. First, all abstractions work cooperatively to emit alarms that exclude the undesirable behaviors of the program. Second, the state abstract domains may exchange information through abstractions of the possible values for expressions. Those value abstractions are themselves extensible, should two domains require a novel form of cooperation. We used this approach to design Eva, an abstract interpreter for C implemented within the Frama-C framework. We present the domains that are available so far within Eva, and show that this communication mechanism is able to handle them seamlessly.
    Motivated by applications to security and high efficiency, we propose an automated methodology for validating on low-level intermediate representations the results of a source-level static analysis. Our methodology relies on two main ingredients: a relative-safety checker, an instance of a relational verifier which proves that a program is "safer" than another, and a transformation of programs into defensive form which verifies the analysis results at runtime. We prove the soundness of the methodology, and provide a formally verified instantiation based on the Verasco verified C static analyzer and the CompCert verified C compiler. We evaluate the effectiveness of our approach on client optimizations at the RTL level, and on static analyses for cache-based timing side channels and memory usage at pre-assembly levels.
    Just-in-Time compilation consists in interleaving program interpretation and compilation at run-time, to achieve better performance than standard interpretation. While some of the execution time is spent compiling, a JIT compiler can leverage run-time information to make speculative optimizations. These optimizations create optimized versions of functions given some assumptions. While static compilers have been the topic of many formal verification works, few have tackled JIT compilation verification. We present our ongoing work about formal verification of a Just-in-Time compiler.
    This thesis presents several definitions of formal semantics and of program transformations, and discusses the associated design choices. In particular, it describes a program transformation inspired by partial evaluation and dedicated to the comprehension of scientific programs written in Fortran. It also details the front-end of a realistic compiler for the C language, which has been formally verified in Coq.
    CompCert is the first commercially available optimizing compiler that is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation. The executable code it produces is proved to behave exactly as specified by the semantics of the source C program. This article gives an overview of the design of CompCert and its proof concept and then focuses on aspects relevant for industrial application. We briefly summarize practical experience and give an overview of recent CompCert developments aiming at industrial usage. CompCert's intended use is the compilation of life-critical and mission-critical software meeting high levels of assurance. In this context tool qualification is of paramount importance. We summarize the confidence argument of CompCert and give an overview of relevant qualification strategies.
    We present the contents of a new formal methods course taught to undergraduate students in their third year at the University of Rennes 1 in France. This course aims at introducing students to formal methods, using the Why3 platform for deductive verification. It exposes students to several techniques, ranging from testing specifications, designing loop invariants, and building adequate data structures and their type invariants, to the use of ghost code. At the end of the course, most of the students were able to prove correct, in an automated way, non-trivial sorting algorithms, as well as standard recursive algorithms on binary search trees.
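    The course uses Why3, but the flavour of one such exercise can be sketched in C with the invariant written as a comment (an invented example, not taken from the course material): summing an array, with the loop invariant a student would have to state.

        #include <stdio.h>

        int sum(const int *a, int n) {
            int s = 0;
            /* loop invariant: 0 <= i <= n  and  s == a[0] + ... + a[i-1] */
            for (int i = 0; i < n; i++)
                s += a[i];
            /* at exit i == n, so s is the sum of the whole array */
            return s;
        }

        int main(void) {
            int a[] = {1, 2, 3, 4};
            printf("%d\n", sum(a, 4));   /* prints 10 */
            return 0;
        }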
    The insertion of expressions mixing arithmetic operators and bitwise boolean operators is a widespread protection of sensitive data in source programs. This recent advanced obfuscation technique is one of the least studied among program obfuscations, even though it is commonly found in binary code. In this paper, we formally verify this data obfuscation in Coq. It operates over a generic notion of mixed boolean-arithmetic expressions and relies on properties of bitwise operators over machine integers. Our obfuscation performs two kinds of program transformations: rewriting of expressions and insertion of modular inverses. To facilitate its proof of correctness, we define boolean semantic tables, a data structure inspired by truth tables. Our obfuscation is integrated into the CompCert formally verified compiler, where it operates over Clight programs. The automatic extraction of our program obfuscator into OCaml yields a program with competitive results.
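    A classic mixed boolean-arithmetic identity of the kind such obfuscations insert (a standard textbook example, not necessarily one of the paper's rewriting rules): x + y can be rewritten as (x ^ y) + 2*(x & y), which holds for machine integers with wrap-around arithmetic.

        #include <stdint.h>
        #include <stdio.h>
        #include <assert.h>

        /* Obfuscated addition: equivalent to x + y on uint32_t. */
        static uint32_t add_obf(uint32_t x, uint32_t y) {
            return (x ^ y) + 2u * (x & y);
        }

        int main(void) {
            uint32_t x = 0xDEADBEEFu, y = 0x12345678u;
            assert(add_obf(x, y) == x + y);
            printf("0x%08x\n", (unsigned)add_obf(x, y));
            return 0;
        }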
    Just-in-time compilers for dynamic languages routinely generate code under assumptions that may be invalidated at run-time; this allows for specialization of program code to the common case in order to avoid unnecessary overheads due to uncommon cases. This form of software speculation requires support for deoptimization when some of the assumptions fail to hold. This paper presents a model just-in-time compiler with an intermediate representation that makes explicit the synchronization points used for deoptimization and the assumptions made by the compiler's speculation. We also present several common compiler optimizations that can leverage speculation to generate improved code. The optimizations are proved correct with the help of a proof assistant. While our work stops short of proving native code generation, we demonstrate how one could use the verified optimizations to obtain significant speedups in an end-to-end setting.
    Timing side-channels are arguably one of the main sources of vulnerabilities in cryptographic implementations. One effective mitigation against timing side-channels is to write programs that do not perform secret-dependent branches and memory accesses. This mitigation, known as "cryptographic constant-time", is adopted by several popular cryptographic libraries. This paper focuses on compilation of cryptographic constant-time programs, and more specifically on the following question: is the code generated by a realistic compiler for a constant-time source program itself provably constant-time? Surprisingly, we answer the question positively for a mildly modified version of the CompCert compiler, a formally verified and moderately optimizing compiler for C. Concretely, we modify the CompCert compiler to eliminate sources of potential leakage. Then, we instrument the operational semantics of CompCert intermediate languages so as to be able to capture cryptographic constant-time ...
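    A standard constant-time idiom of the kind such source programs rely on (illustrative, not taken from the paper): selecting between two values without a secret-dependent branch or memory access, using a mask derived from the secret bit.

        #include <stdint.h>
        #include <stdio.h>

        /* Returns a when secret_bit == 1 and b when secret_bit == 0;
           control flow and memory accesses are independent of the secret. */
        static uint32_t ct_select(uint32_t secret_bit, uint32_t a, uint32_t b) {
            uint32_t mask = (uint32_t)(-(int32_t)secret_bit);  /* 0 or all ones */
            return (a & mask) | (b & ~mask);
        }

        int main(void) {
            printf("%u %u\n", ct_select(1, 7, 9), ct_select(0, 7, 9));
            return 0;
        }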
    Abstract interpretation provides advanced techniques to infer numerical invariants on programs. There is an abundant literature about numerical abstract domains that operate on scalar variables. This work deals with lifting these techniques to a realistic C memory model. We present an abstract memory functor that takes as argument any standard numerical abstract domain, and builds a memory abstract domain that finely tracks properties about memory contents, taking into account union types, pointer arithmetic and type casts. This functor is implemented and verified inside the Coq proof assistant with respect to the CompCert compiler memory model. Using the Coq extraction mechanism, it is fully executable and used by the Verasco C static analyzer.
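    An invented C fragment with the features the abstract mentions (union types, type casts, byte-level access) that such a memory domain must track precisely:

        #include <stdint.h>
        #include <stdio.h>

        union conv { float f; uint32_t bits; };

        int main(void) {
            union conv c;
            c.f = 1.0f;                  /* write through one union member */
            uint32_t b = c.bits;         /* read back through the other    */
            uint8_t *p = (uint8_t *)&c;  /* byte-level view through a cast */
            printf("0x%08x first byte 0x%02x\n", (unsigned)b, (unsigned)p[0]);
            return 0;
        }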
    Code obfuscation is designed to impede the reverse engineering of binary software. Dynamic data tainting is an analysis technique used to identify dependencies between data in a software system. Performing dynamic data tainting on obfuscated software usually yields hard-to-exploit results, due to over-tainted data. Such results are clearly identifiable as useless: an attacker will immediately discard them and opt for an alternative tool. In this paper, we present a code transformation technique meant to prevent the identification of useless results: a few lines of code are inserted in the obfuscated software, so that the results obtained by the dynamic data tainting approach appear acceptable. These results remain however wrong, and lead an attacker to waste enough time and resources trying to analyze incorrect data dependencies that he will usually decide to use less automated and advanced analysis techniques, and maybe give up reverse engineering the current binary software. This improves the security of the software against malicious analysis.
    Code obfuscation is emerging as a key asset in security by obscurity. It aims at hiding sensitive information in programs so that they become more difficult to understand and reverse engineer. Following the results on the impossibility of perfect and universal obfuscation, many obfuscation techniques have been proposed in the literature, ranging from simple variable encoding to hiding the control-flow of a program. In this paper, we formally verify in Coq an advanced code obfuscation called control-flow graph flattening, which is used in state-of-the-art program obfuscators. Our control-flow graph flattening is a program transformation operating over C programs, that is integrated into the CompCert formally verified compiler. The semantics preservation proof of our program obfuscator relies on a simulation proof performed on a realistic language, the Clight language of CompCert. The automatic extraction of our program obfuscator into OCaml yields a program with competitive results.
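    A minimal sketch of control-flow graph flattening on an invented example: the structured control flow of a gcd loop is replaced by a dispatcher loop over a state variable, so that every original block becomes one case of a switch.

        #include <stdio.h>

        int gcd_flat(int a, int b) {
            int state = 0;               /* dispatcher state variable */
            for (;;) {
                switch (state) {
                case 0:                  /* loop header: while (b != 0) */
                    state = (b != 0) ? 1 : 2;
                    break;
                case 1: {                /* loop body */
                    int t = a % b;
                    a = b;
                    b = t;
                    state = 0;
                    break;
                }
                case 2:                  /* exit block */
                    return a;
                }
            }
        }

        int main(void) {
            printf("%d\n", gcd_flat(48, 18));   /* prints 6 */
            return 0;
        }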
    The use of formal methods makes it possible to obtain strong guarantees about the source code of critical software. However, bugs in the compiler used to produce an executable from this source code can invalidate these guarantees. This risk is removed if the compiler is formally verified: one proves that the compiler produces machine code that behaves like the source code it is given. This work is part of the CompCert project, whose goal is the development and formal verification, using the Coq proof assistant, of a realistic compiler potentially usable for critical embedded software. A compiler performs a succession of transformations to generate machine code from source code. Register allocation by graph coloring is one of these transformations, and one of the most difficult to implement. Its goal is to propose an optimal use of registers, that is, one minimizing memory usage during program execution. The CompCert compiler uses one of the best register allocation heuristics, but this heuristic had not been formally specified in Coq. We therefore proposed a coloring algorithm with preferences, whose particularity is that it generalizes two well-known optimization problems: the coloring problem and the minimum multicut problem. Solving this new problem on triangulated (chordal) graphs can be done by testing for the existence of an admissible solution (which we do with a greedy coloring algorithm), followed by a cut algorithm. Our algorithm has been formally specified in Coq, which made it possible to formally verify some of its properties. In particular, we formally specified a greedy coloring algorithm as well as an algorithm searching for a simplicial elimination ordering. We also proved in Coq the validity of the solution computed by our algorithm, as well as its optimality in the case of triangulated graphs. This work is a first step towards the formal verification of algorithms operating on graphs.
    Real life C programs are often written using C dialects which, for the ISO C standard, have undefined behaviours. In particular, according to the ISO C standard, reading an uninitialised variable has an undefined behaviour and low-level pointer operations are implementation defined. We propose a formal semantics which gives a well-defined meaning to those behaviours for the C dialect of the CompCert compiler. Our semantics builds upon a novel memory model leveraging a notion of symbolic values. Symbolic values are used by the semantics to delay the evaluation of operations and are normalised lazily to genuine values when needed. We show that the most precise normalisation is computable and that a slightly relaxed normalisation can be efficiently implemented using an SMT solver. The semantics is executable and our experiments show that the enhancements of our semantics are mandatory to give a meaning to low-level idioms such as those found in the allocation functions of a C standard library.
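    Two invented C idioms of the kind the abstract targets; both are undefined or implementation-defined for ISO C yet common in low-level code such as allocators, and symbolic values let the semantics delay their evaluation and normalise them later.

        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            /* implementation-defined: pointer/integer round-trip used to
               align a pointer to a 16-byte boundary */
            char buf[64];
            uintptr_t n = ((uintptr_t)(buf + 15)) & ~(uintptr_t)15;
            char *aligned = (char *)n;
            printf("%p\n", (void *)aligned);

            /* undefined for ISO C: reading an uninitialised variable; with
               symbolic values both reads denote the same symbol, so this
               expression can plausibly be normalised to 0 */
            int x;
            int y = x ^ x;
            printf("%d\n", y);
            return 0;
        }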
    Static code analysis is increasingly used to guarantee the absence of undesirable behaviors in industrial programs. Designing sound analyses is a continuing trade-off between precision and complexity. Notably, dataflow analyses often perform overly wide approximations when two control-flow paths meet, by merging states from each path. This paper presents a generic abstract interpretation based framework to enhance the precision of such analyses on join points. It relies on predicated domains, that preserve and reuse information valid only inside some branches of the code. Our predicates are derived from conditional statements, and postpone the loss of information. The work has been integrated into Frama-C, a C source code analysis platform. Experiments on real code show that our approach scales, and improves significantly the precision of the existing analyses of Frama-C.
    The problem of computing dominators in a control flow graph is central to numerous modern compiler optimizations. Many efficient algorithms have been proposed in the literature, but mechanizing the correctness of the most sophisticated ones is still considered too hard a problem, and to this date, verified compilers use less optimized implementations. In contrast, production compilers, like GCC or LLVM, implement the classic, efficient Lengauer-Tarjan algorithm [12] to compute dominator trees, and subsequent optimization phases can then determine whether a CFG node dominates another node in constant time by using their respective depth-first search numbers in the dominator tree. In this work, we aim at integrating such techniques in verified compilers. We present a formally verified validator of untrusted dominator trees, on top of which we implement and prove correct a fast dominance test following these principles. We conduct our formal development in the Coq proof assistant, and integrate it in the middle-end of the CompCertSSA verified compiler. We also provide experimental results showing performance improvements over previous formalizations.
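    A minimal sketch (with an invented dominator tree) of the constant-time dominance test the abstract describes: number the dominator tree with DFS entry/exit timestamps; then "a dominates b" reduces to an ancestor test on those numbers.

        #include <stdio.h>

        #define N 6
        static int pre[N], post[N], clk;
        static int first_child[N], next_sibling[N];   /* dominator tree */

        static void dfs(int u) {
            pre[u] = clk++;
            for (int v = first_child[u]; v != -1; v = next_sibling[v])
                dfs(v);
            post[u] = clk++;
        }

        /* a dominates b iff a is an ancestor of b in the dominator tree */
        static int dominates(int a, int b) {
            return pre[a] <= pre[b] && post[b] <= post[a];
        }

        int main(void) {
            for (int i = 0; i < N; i++) first_child[i] = next_sibling[i] = -1;
            /* invented tree: 0 -> {1, 2}, 2 -> {3, 4}, 4 -> {5} */
            first_child[0] = 1; next_sibling[1] = 2;
            first_child[2] = 3; next_sibling[3] = 4;
            first_child[4] = 5;
            dfs(0);
            printf("%d %d\n", dominates(0, 5), dominates(1, 5));   /* 1 0 */
            return 0;
        }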
    The authors describe a technique and a tool supporting partial evaluation of Fortran programs, i.e. their specialization for specific values of their input variables. They aim at understanding old programs, which have become very complex due to numerous extensions. From a given Fortran program and specific values of its input variables, the tool provides a simplified program, which behaves like the initial program for those values.
    Cminor is a mid-level imperative programming language; there are proved-correct optimizing compilers from C to Cminor and from Cminor to machine language. We have redesigned Cminor so that it is suitable for Hoare Logic reasoning and we have designed a Separation Logic for Cminor. In this paper, we give a small-step semantics (instead of the big-step semantics of the proved-correct compiler) ...
    Static analysis – the automatic determination of simple properties of a program – is the basis both for optimizing compilation and for verification of safety properties such as absence of run-time errors. To support the use of static analyses in verified compilers and in high-confidence verification environments, the analyses must be proved to be sound. In this invited talk, I will review some ongoing work in this direction in the CompCert and Verasco projects, in particular the construction and formal verification of a modular static analyzer based on abstract interpretation.
    This paper reports on the design and soundness proof, using the Coq proof assistant, of Verasco, a static analyzer based on abstract interpretation for most of the ISO C 1999 language (excluding recursion and dynamic allocation). Verasco establishes the absence of run-time errors in the analyzed programs. It enjoys a modular architecture that supports the extensible combination of multiple abstract domains, both relational and non-relational. Verasco integrates with the CompCert formally-verified C compiler so that not only the soundness of the analysis results is guaranteed with mathematical certitude, but also the fact that these guarantees carry over to the compiled code.
    Partial evaluation is an optimization technique traditionally used in compilation. We have adapted this technique to the understanding of scientific application programs during their maintenance and we have implemented a tool. This tool analyzes Fortran 90 application programs and performs an interprocedural pointer analysis. This paper presents how we have specified this analysis with different formalisms (inference rules with global definitions and set and relational operators). Then we present the tool implementing these specifications. It has been implemented in a generic programming environment and a graphical interface has been developed to visualize the information computed during the partial evaluation (values of variables, already analyzed procedures, scope of variables, removed statements, ...).
    This paper describes a tool for facilitating the comprehension of general programs using automatic specialization. The goal of this approach was to assist in the maintenance of old programs, which have become very complex due to numerous extensions. This paper explains why this approach was chosen, how the tool's architecture was set up, and how the correctness of the specialization
    The results are given of a series of case studies conducted at different industrial sites in the framework of the ESF/EPSOM (Eureka Software Factory/European Platform for Software Maintenance) project. The approach taken in the case studies was to directly contact software maintainers and obtain their own view of their activity, mainly through the use of interactive methods based on group
    This paper extends the idea of specializing modified interpreters for systematically generating obfuscated code. Using the Coq proof assistant, we specify some elementary obfuscations and prove that the resulting distorted interpreter is correct, namely that it preserves the intended semantics of programs. The paper shows how the semantic preservation proofs generated and verified in Coq can provide a measure of the quality of the obfuscation. In particular, we observe that there is a precise correspondence between the potency of the obfuscation and the complexity of the proof of semantics preservation. Our obfuscation can be easily integrated into the CompCert C compiler, providing the basis for a formally verified obfuscating compiler which can be applied to any C program.
    Static analyzers based on abstract interpretation are complex pieces of software implementing delicate algorithms. Even if static analysis techniques are well understood, their implementation on real languages is still error-prone. This paper presents a formal verification using the Coq proof assistant: a formalization of a value analysis (based on abstract interpretation), and a soundness proof of the value analysis. The formalization relies on generic interfaces. The mechanized proof is facilitated by a translation validation of a Bourdoncle fixpoint iterator. The work has been integrated into the CompCert verified C compiler. Our verified analysis directly operates over an intermediate language of the compiler having the same expressiveness as C. The automatic extraction of our value analysis into OCaml yields a program with competitive results, obtained from experiments on a number of benchmarks and comparisons with the Frama-C tool.
    Program slicing is a well-known program transformation which simplifies a program with respect to a given criterion while preserving its semantics. Since the seminal paper published by Weiser in 1981, program slicing has been widely used in various application domains. State-of-the-art program slicers operate over program dependence graphs (PDGs), a sophisticated data structure combining data and control dependences. In this paper, we follow the a posteriori validation approach to formally verify (in Coq) a general program slicer. Our validator for program slicing is efficient and validates the results of a run of an unverified program slicer. Program slicing is interesting for a posteriori validation because the correctness proof of program slicing requires computing supplementary information from the PDG, thus decoupling the slicing algorithm from its proof. Our semantics-preserving program slicer is integrated into the CompCert formally verified compiler. It operates over an intermediate language of the compiler having the same expressiveness as C. Our experiments show that our formally verified validator scales on large realistic programs.
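    A small invented example of the transformation itself: with the slicing criterion "the value of s at the final print", the statements computing p are irrelevant and disappear from the slice.

        #include <stdio.h>

        void original(int n) {
            int s = 0, p = 1;
            for (int i = 1; i <= n; i++) {
                s += i;   /* in the slice: the printed value depends on it   */
                p *= i;   /* not in the slice: p never reaches the criterion */
            }
            printf("%d\n", s);
        }

        void sliced(int n) {   /* result of slicing 'original' */
            int s = 0;
            for (int i = 1; i <= n; i++)
                s += i;
            printf("%d\n", s);
        }

        int main(void) {
            original(4);   /* prints 10 */
            sliced(4);     /* prints 10 */
            return 0;
        }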
    Worst-case execution time (WCET) estimation tools are complex pieces of software performing tasks such as computation on control flow graphs (CFGs) and bound calculation. In this paper, we present a formal verification in Coq of a loop bound estimation. It relies on program slicing and bound calculation. The work has been integrated into the CompCert verified C compiler. Our verified analyses directly operate on non-structured CFGs. We extend the CompCert RTL intermediate language with a notion of loop nesting (a.k.a. weak topological ordering) on CFGs that is useful for reasoning on CFGs. The automatic extraction of our loop bound estimation into OCaml yields a program with competitive results, obtained from experiments on a reference benchmark for WCET bound estimation tools.
    This paper presents the formal verification of a compiler front-end that translates a subset of the C language into the Cminor intermediate language. The semantics of the source and target languages as well as the translation between them have been written in the ...
    Register allocation is often a two-phase approach: spilling of registers to memory, followed by coalescing of registers. Extreme live-range splitting (i.e. live-range splitting after each statement) enables optimal solutions based on ILP, for both spilling and coalescing. However, while the solutions are easily found for spilling, for coalescing they are more elusive. This difficulty stems from the huge size of interference graphs resulting from live-range splitting. This paper focuses on coalescing in the context of extreme live-range splitting. It presents some theoretical properties that give rise to an algorithm for reducing interference graphs. This reduction consists mainly in finding and removing useless splitting points. It is followed by a graph decomposition based on clique separators. The reduction and decomposition are general enough, so that any coalescing algorithm can be applied afterwards. Our strategy for reducing and decomposing interference graphs preserves the op...
    This experiment is a first bridge between, on the one hand, program proof in the style of Hoare, and on the other hand, the Compcert compiler verification effort.