The logic we use is a variant of pRHL, a probabilistic relational Hoare logic introduced by Barthe et al. [26]. The logic exposes relational judgments that relate two pieces of code \(c_0\) and \(c_1\) by a precondition \(\phi\) and a postcondition \(\psi\); a basic intuition for such judgments is provided in Section 2.2. Formally, \(c_0\) and \(c_1\) denote probabilistic stateful code with return types \(A_0\) and \(A_1\), respectively, and the precondition \(m_0 : \mathsf {mem}, m_1 : \mathsf {mem}\vdash \phi : \mathbb {P}\) is a proposition whose free variables \(m_0\) and \(m_1\) denote the initial states of the memory (before execution of the code). The postcondition \(m_0^{\prime } : \mathsf {mem}, a_0 : A_0, m_1^{\prime } : \mathsf {mem}, a_1 : A_1 \vdash \psi : \mathbb {P}\) is a predicate on the values returned by the executed code, parameterized by the variables \(m_0^{\prime }\) and \(m_1^{\prime }\) representing the final states of the memory (after execution) and by the final values \(a_0\) and \(a_1\). As mentioned before, we sometimes omit the quantifications when they are clear from the context. We also abuse notation and sometimes write, e.g., \(\psi (m_0^{\prime },a_0) (m_1^{\prime },a_1)\) for the substitution of the given memories and values into \(\psi\). The code fragments appearing in a judgment are drawn from the free monad \(\texttt {code}_{\mathcal {L} , I}\) of Section 3.1 and meet the further requirement that no oracle calls \(\texttt {call o x k}\) appear in them (exactly as in Section 3.2). The precondition \(\phi\) is a relation between initial memories (for instance, \(m_0 = m_1\)). Similarly, the postcondition \(\psi\) relates final memories and final results, intuitively obtained after executing \(c_i\) on \(m_i\). We describe how to assign a formal semantics to such probabilistic judgments in Section 5.2. The semantics is based on the notion of probabilistic couplings, already adopted by Barthe et al. [22].

In the remainder of this subsection, we describe a selection of our rules. The presentation contains neither all the rules employed in practice by SSProve nor a canonical presentation of them: some rules overlap, so there are multiple ways to prove the same relational judgment, but this redundancy can make the actual derivations simpler. We return to the organization of the rules after this presentation:
The \(\texttt {reflexivity}\) rule relates the code \(c\) to itself when both copies are executed on identical initial memories.
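Schematically, writing \(\{\phi \}\ c_0 \approx c_1\ \{\psi \}\) for a judgment with precondition \(\phi\) and postcondition \(\psi\), the rule can be rendered roughly as follows (a sketch only; SSProve's actual statement may differ in notation):
\[
\dfrac{}{\{\, m_0 = m_1 \,\}\ \ c \approx c\ \ \{\, m_0^{\prime } = m_1^{\prime } \wedge a_0 = a_1 \,\}}
\]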
The \(\texttt {seq}\) rule relates two sequentially composed commands using \(\texttt {bind}\) by relating each of the sub-commands.
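Eliding side conditions, and using an intermediate postcondition \(\chi\) (a name chosen here for illustration), the rule reads roughly as follows; the exact SSProve formulation may differ:
\[
\dfrac{\{\phi \}\ c_0 \approx c_1\ \{\chi \} \qquad \forall a_0\, a_1.\ \{\chi \,(m_0,a_0)\,(m_1,a_1)\}\ k_0\ a_0 \approx k_1\ a_1\ \{\psi \}}{\{\phi \}\ \texttt {bind}\ c_0\ k_0 \;\approx\; \texttt {bind}\ c_1\ k_1\ \{\psi \}}
\]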
The \(\texttt {swap}\) rule states that if a certain relation \(I\) on memories is invariant with respect to the execution of \(c_0\) and \(c_1\), then the order in which the two commands are executed is irrelevant. We used the \(\texttt {swap}\) rule in Section 2.3 to swap two independent samplings; in that case the invariant \(I\) was simply equality of memories.
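A schematic rendering of the rule, with the precise commutation premises elided, could look as follows (this is only a sketch of the shape of the conclusion):
\[
\dfrac{I \text{ is invariant under the execution of } c_0 \text{ and of } c_1}{\{\, I \,\}\ \ c_0 \: ;\ c_1 \;\approx\; c_1 \: ;\ c_0\ \ \{\, I \,\}}
\]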
The \(\texttt {eqDistrL}\) rule allows us to replace \(c_0\) by \(c_0^{\prime }\) when both code fragments have the same denotational semantics as defined by \(\mathtt {Pr\_code}\), in the sense of Section 3.2.
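Schematically, the rule can be sketched as follows (the exact SSProve statement may differ):
\[
\dfrac{\{\phi \}\ c_0 \approx c_1\ \{\psi \} \qquad \forall m.\ \mathtt {Pr\_code}\ c_0\ m = \mathtt {Pr\_code}\ c_0^{\prime }\ m}{\{\phi \}\ c_0^{\prime } \approx c_1\ \{\psi \}}
\]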
The \(\texttt {symmetry}\) rule simply states that the symmetric judgment holds if the arguments of the pre- and postconditions are swapped accordingly.
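One possible rendering of this rule, using the paper's substitution notation for pre- and postconditions, is the following sketch:
\[
\dfrac{\{\phi \}\ c_0 \approx c_1\ \{\psi \}}{\{\lambda\, m_1\, m_0.\ \phi\ m_0\ m_1\}\ c_1 \approx c_0\ \{\lambda\, (m_1^{\prime },a_1)\, (m_0^{\prime },a_0).\ \psi\ (m_0^{\prime },a_0)\ (m_1^{\prime },a_1)\}}
\]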
The \(\texttt {for-loop}\) rule relates two executions of for-loops with the same number of iterations by maintaining a relational invariant through each step of the iteration.
The \(\texttt {do-while}\) rule relates two bounded do-while loops with bodies \(c_0\) and \(c_1\). Every iteration preserves a relational invariant \(I\) on memories that depends on a pair of Booleans, and the postcondition also stipulates that \(c_0\) and \(c_1\) return the same Boolean, i.e., \(b_0 = b_1\). This rule follows the pattern of the unbounded do-while rule defined for simple imperative programs by Maillard et al. [70]. We believe that, with some additional work, their ideas could be used to support unbounded loops in SSProve as well (see Section 5.5 for details).
The \(\texttt {uniform}\) rule relates sampling from uniform distributions on finite sets \(A\) and \(B\) that are in bijective correspondence. Note how it applies the bijection \(f\) in the continuation on the right-hand side.
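Schematically, writing \(\mathcal {U}_A\) for the uniform distribution on a finite set \(A\) (a notation chosen here for illustration), the rule reads roughly:
\[
\dfrac{f : A \simeq B \qquad \forall x.\ \{\phi \}\ k_0\ x \approx k_1\ (f\ x)\ \{\psi \}}{\{\phi \}\ x \mathbin {\mathtt {\lt \$}} \mathcal {U}_A \: ;\ k_0\ x \;\approx\; y \mathbin {\mathtt {\lt \$}} \mathcal {U}_B \: ;\ k_1\ y\ \{\psi \}}
\]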
The code \(y \mathbin {\mathtt {\lt \$}}D \: ;\:\,c_0\) samples \(y\) from the subdistribution \(D\). If \(y\) is never used in \(c_0\), as indicated by the last premise of the \(\texttt {dead-sample}\) rule, then we would like to argue that the sampling constitutes “dead code” and can be ignored. This intuition only holds if \(D\) is a proper distribution rather than a subdistribution. For instance, if \(D\) is the null distribution, the sampling behaves like “\(\texttt {assert false}\)” and can certainly not be ignored. The premise \(\sum _{x \in |D|} D(x) = 1\) ensures that \(D\) is indeed a proper distribution (also known as a “lossless subdistribution”). A uniform distribution over a non-empty set would, for instance, constitute a proper distribution in this sense.
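The role of losslessness can be illustrated with a small Python sketch, which is not part of SSProve: subdistributions over a finite set are modeled as dictionaries mapping values to probabilities, and the helpers `is_lossless` and `bind` are hypothetical names for the mass check and monadic sequencing.

```python
from fractions import Fraction

def is_lossless(d):
    """A subdistribution is proper ("lossless") if its mass sums to 1."""
    return sum(d.values(), Fraction(0)) == 1

def bind(d, k):
    """Sample x ~ d, then continue with the subdistribution k(x)."""
    out = {}
    for x, p in d.items():
        for y, q in k(x).items():
            out[y] = out.get(y, Fraction(0)) + p * q
    return out

c0 = {"result": Fraction(1)}                 # a program that ignores the sample
die = {i: Fraction(1, 6) for i in range(6)}  # a proper (lossless) distribution
null = {}                                    # the null subdistribution ("assert false")

# Sampling from a lossless distribution really is dead code:
assert is_lossless(die) and bind(die, lambda _: c0) == c0
# Sampling from the null subdistribution is not:
assert bind(null, lambda _: c0) == {}
```

The second assertion shows why the losslessness premise is needed: prefixing \(c_0\) with a sample from the null distribution collapses the whole program to the null subdistribution.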
The \(\texttt {sample-irrelevant}\) rule has a similar flavor to \(\texttt {dead-sample}\), as it too requires \(D\) to be a proper distribution. We assume that \(c_0\ y\) can be related to \(c_1\) for all values of \(y\); in other words, the choice of a particular value for \(y\) is irrelevant for the pre- and postcondition at hand. Therefore, sampling \(y\) from a proper distribution \(D\) likewise allows us to conclude that \(c_0\ y\) is related to \(c_1\).
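A schematic rendering of the rule, eliding side conditions, could be:
\[
\dfrac{\sum _{x \in |D|} D(x) = 1 \qquad \forall y.\ \{\phi \}\ c_0\ y \approx c_1\ \{\psi \}}{\{\phi \}\ y \mathbin {\mathtt {\lt \$}} D \: ;\ c_0\ y \;\approx\; c_1\ \{\psi \}}
\]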
The \(\texttt {assert}\) rule relates two \(\texttt {assert}\) commands, as long as “\(b_0 = b_1\)” holds before the commands. Note that while the precondition is a predicate on initial memories, nothing prevents it from talking about other things, such as the Booleans \(b_0\) and \(b_1\) quantified at the meta-level. The rule guarantees “\(b_0 = \mathtt {true}\wedge b_1 = \mathtt {true}\)” afterwards, ignoring the values \(a_0\) and \(a_1\) of type \(\texttt {unit}\).
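Schematically, the rule can be sketched as follows (the actual SSProve statement may carry further side conditions):
\[
\dfrac{\forall m_0\, m_1.\ \phi\ m_0\ m_1 \Rightarrow b_0 = b_1}{\{\phi \}\ \texttt {assert}\ b_0 \approx \texttt {assert}\ b_1\ \{b_0 = \mathtt {true} \wedge b_1 = \mathtt {true}\}}
\]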
The one-sided \(\texttt {assertL}\) rule specifies the behavior of an \(\texttt {assert}\) with a \(\texttt {true}\) Boolean by relating it with \(\texttt {return ()}\). Note that if a code fragment \(c_0\) is shown to be related to an assertion failure, then \(c_0\) must necessarily contain an assertion failure as well, i.e., correspond to the null subdistribution. Indeed, the (sound) model of our program logic, explained in Section 5, gives rise to a total correctness semantics [70] for assertion failures: assertion failures only relate to other assertion failures.
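One possible sketch of the rule, reducing a true assertion to \(\texttt {return ()}\):
\[
\dfrac{\{\phi \}\ \texttt {return ()} \approx c_1\ \{\psi \}}{\{\phi \}\ \texttt {assert true} \approx c_1\ \{\psi \}}
\]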
The \(\texttt {assertD}\) rule allows reasoning about the dependent version of \(\texttt {assert}\), where the continuation \(κ _i\) is only well-defined if the assertion holds, as described in Section 3.1. As in the \(\texttt {assert}\) rule, the two assertion conditions \(b_0\) and \(b_1\) may a priori be different. The precondition \(\phi\) has to ensure that \(b_0\) and \(b_1\) are either both true or both false. The continuations \(κ _i\) are defined only in case the assertions succeed. Under this assumption, here represented by the hypotheses \(H_0\) and \(H_1\), the continuations \(κ _i\) must be related for the same pre and post as the composite statements “\({\texttt {assert } b_i \texttt { as } h_i} \: ;\:\,κ _i\ h_i\)”. The intuition for the validity of this rule is the following: if \(b_i\) is true, “\({\texttt {assert } b_i \texttt { as } h_i}\)” is defined as \(κ _i \ h_i\), and we appeal to the last premise. If \(b_i\) is false, then both composite statements reduce to \(\texttt {fail}\) and evaluate to the null distribution.
The \(\texttt {put-get}\) rule states that looking up the value at location \(ℓ\) after storing \(v\) at \(ℓ\) results in the value \(v\). We also have a similar rule to remove a \(\texttt {put}\) right before another one at the same location, and one for two \(\texttt {get}\) in a row. More interestingly, we provide one-sided rules for \(\texttt {get}\) and \(\texttt {put}\), which update the pre- or postcondition accordingly.
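The read-after-write behavior can be sketched as an equivalence of judgments (the actual SSProve rule may be stated in only one direction):
\[
\{\phi \}\ \texttt {put}\ ℓ\ v \: ;\ x \leftarrow \texttt {get}\ ℓ \: ;\ k\ x \;\approx\; c_1\ \{\psi \}
\quad\text{iff}\quad
\{\phi \}\ \texttt {put}\ ℓ\ v \: ;\ k\ v \;\approx\; c_1\ \{\psi \}
\]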
More generally, we define handy tactics that apply these rules directly, performing the necessary massaging of the goals so that the rules become applicable. In particular, we have automation for swapping multiple lines at once while checking that the swap is legal. Moreover, these tactics rely on the hints mechanism of Coq and can thus be extended by the user.