1 Introduction

Bertoni et al. [8] introduced the sponge construction as an approach to design hash functions with variable output length (later called extendable output functions (XOFs)). The construction quickly gained traction in light of NIST’s SHA-3 competition, with multiple candidates inspired by the sponge methodology. Keccak, the eventual winner of the competition and now standardized as SHA-3 [27], internally uses the sponge construction. The sponge construction found quick adoption in the area of lightweight hashing [19, 32]. Beyond hash functions, various applications of the sponge construction appeared, such as keystream generation and MAC computation [12], reseedable pseudorandom sequence generation [10, 30], and authenticated encryption [11, 14]. In particular, the ongoing CAESAR competition for the development of a portfolio of authenticated encryption schemes has received about a dozen sponge-based submissions.

At a high level, the sponge construction operates on a state of b bits. This is split into an inner part of size c bits and an outer part of size r bits, where \(b=c+r\). Data absorption and squeezing are done via the outer part, r bits at a time, interleaved with evaluations of a b-bit permutation f. Bertoni et al. [9] proved a bound on the security of the sponge construction in the indifferentiability framework of Maurer et al. [37]. While it was clear from the start that this birthday-type bound in the capacity is tight for the unkeyed use cases, i.e., hashing, for the keyed use cases of the sponge it appeared that a higher level of security could be achieved. This has resulted in an impressive series of papers on the generic security of keyed versions of the sponge, with bounds improving and the construction becoming more efficient.
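The following minimal Python sketch illustrates this absorb-and-squeeze flow. It is an illustration of the data flow only: the permutation f is assumed to be given, blocks are modeled as r-bit integers, and the padding rule is omitted.

```python
def sponge(f, b, r, message_blocks, out_blocks):
    """Minimal sponge sketch over r-bit blocks (padding omitted).

    f              -- a permutation on b-bit integers (assumed given)
    message_blocks -- list of r-bit integers to absorb
    out_blocks     -- number of r-bit output blocks to squeeze
    """
    c = b - r                    # capacity; the inner part sits in the low c bits
    state = 0                    # b-bit state; the outer part is the top r bits
    for m in message_blocks:     # absorbing phase, r bits at a time
        state ^= m << c          # add the block into the outer part
        state = f(state)         # interleaved evaluation of f
    output = []
    for _ in range(out_blocks):  # squeezing phase, r bits at a time
        output.append(state >> c)
        state = f(state)
    return output
```

Any fixed bijection on b-bit integers can stand in for f when experimenting with this sketch.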

1.1 Keyed Sponge and Keyed Duplex

Keyed Sponge. Bertoni et al.’s original keyed sponge [13] was simply the sponge with input \((K\Vert M)\) where K is the key. Chang et al. [21] suggested an alternative where the initial state of the sponge simply contains the key in its inner part. Andreeva et al. [2] generalized and improved the analyses of both the outer- and inner-keyed sponge, and also considered security of these functions in the multi-target setting. In a recent analysis their bounds were improved by Naito and Yasuda in [42]. All of these results, however, stayed relatively close to the (keyless) sponge design that absorbs input in blocks of r bits in the outer part of the state. It turned out that, thanks to the secrecy of part of the state after key injection, one can absorb data over the full state, and therewith achieve maximal compression. Full-state absorbing was first explicitly proposed in a variant of sponge for computing MACs: donkeySponge [14]. It also found application in various recent sponge-inspired designs, such as Chaskey [41].

Nearly tight bounds for the full-state absorbing keyed sponge were given by Gaži et al. [29] but their analysis was limited to the case of fixed output length. Mennink et al. [38] generalized and formalized the idea of the full-state keyed sponge and presented an improved security bound for the general case where the output length is variable.

Keyed Duplex. Whereas the keyed sponge serves message authentication and stream encryption, authenticated encryption is mostly done via the keyed duplex construction [11]. This is a stateful construction that consists of an initialization interface and a duplexing interface. Initialization resets the state and a duplexing call absorbs a data block of at most \(r\,-\,1\) bits, applies the underlying permutation f and squeezes at most r bits. Bertoni et al. [11] proved that the output of duplexing calls can be simulated by calls to a sponge, a fortiori making duplex as strong as sponge.

Mennink et al. [38] introduced the full-state keyed duplex and derived a security bound on this construction with dominating terms:

$$\begin{aligned} \frac{\mu N}{2^{k}} + \frac{M^2}{2^c} . \end{aligned}$$
(1)

Here M is the data complexity (total number of initialization and duplexing calls), N the computational complexity (total number of offline calls to f), \(\mu \le 2M\) is a term called the “multiplicity,” and \(k\) the size of the key. This security bound was derived by describing the full-state keyed duplex in terms of the full-state keyed sponge. A naive bounding of \(\mu \) (to cover the strongest possible adversary) yields a dominating term of the form \(2MN/2^{k}\), implying a security strength erosion of \(\log _2 M\) with respect to exhaustive key search.

The duplex construction finds multiple uses in the CAESAR competition [20] in the embodiment of the authenticated encryption mode SpongeWrap [11] or close variants of it. The recent line of research on improving bounds of sponge-inspired authenticated encryption schemes, set by Jovanovic et al. [35], Sasaki and Yasuda [46], and Reyhanitabar et al. [44], can be seen as an analysis of a specific use case of the keyed duplex. The Full-State SpongeWrap [38], an authenticated encryption scheme designed from the full-state keyed duplex, improves over these results. Particularly, the idea already found application in the Motorist mode of the CAESAR submission Keyak [16].

Trading Sponge for Duplex. As said, the duplex can be simulated by the sponge, but not the other way around. This is the case because duplex pads each input block and cannot simulate sponge inputs with, e.g., long sequences of zeroes. It is therefore natural that Mennink et al. [38] derived a security bound on the full-state keyed duplex by viewing it as an extension to the full-state keyed sponge. However, we observe that the introduction of full-state absorption changes that situation: the full-state keyed duplex can simulate the full-state keyed sponge. All keyed usages of the sponge can be described quite naturally as application of the keyed duplex and it turns out that proving security of keyed duplex is easier than that of keyed sponge. Therefore, in keyed use cases, the duplex is preferred as basic building block over the sponge.

1.2 Multi-target Security

The problem of multi-target security of cryptographic designs has been acknowledged and analyzed for years. Biham [17] considered the security of blockciphers in the multi-target setting and showed that the security strength can erode to half the key size if data processed by sufficiently many keys is available. Various extensions have subsequently appeared [7, 18, 34]. It has been demonstrated (see, e.g., [5] for public key encryption and [22] for message authentication codes) that the security of a scheme in the multi-target setting can be reduced to the security in the single-target setting, at a security loss proportional to the number of keys used.

However, in certain cases, a dedicated analysis in the multi-target setting could render improved bounds. Andreeva et al. [2] considered the security of the outer- and inner-keyed sponge in the multi-target setting, a proof which internally featured a security analysis of the Even-Mansour blockcipher in the multi-target setting. The direction of multi-target security was subsequently popularized by Mouha and Luykx [40], leading to various multi-target security results [4, 33] with security bounds (almost) independent of the number of targets involved.

1.3 Our Contribution

We present a generalization of the full-state keyed duplex that facilitates multiple instances by design (Sect. 2.2). This generalization is realized via the formalization of a state initialization function that has access to a key array \(\mathbf {K}\) consisting of u keys of size \(k\), generated following a certain distribution. Given as input a key index \(\delta \) and an initialization vector \(\mathrm {iv}\), it initializes the state using \(\mathrm {iv}\) and the \(\delta \)th key taken from \(\mathbf {K}\). We capture its functional behavior under the name of an extendable input function (XIF) and explicitly define its idealized instance.

Unlike the approach of Mennink et al. [38], who viewed the full-state keyed duplex as an extension to the full-state keyed sponge, our analysis is a dedicated analysis of the full-state keyed duplex. To accommodate bounds for different use cases, we have applied a re-phasing to the definition of the keyed duplex. In former definitions of the (keyed) duplex, a duplexing call consisted of input injection, applying the permutation f, and then output extraction. In our new definition, the processing is as follows: first the permutation f, then output extraction, and finally input injection. This adjustment reflects a property present in practically all modes based on the keyed duplex, namely that the user (or adversary) must provide the input before knowing the previous output. The re-phasing allows us to prove a bound on the keyed duplex that is tight even for those use cases. The fact that, in previous definitions, an adversary could see the output before providing the input allowed it to force the outer part of the state to a value of its choice, and as such gave rise to a term in the bound of at worst \(MN/2^c\) and at best \(\mu N/2^c\), where \(\mu \) is a term that reflects a property of the transcript that needs to be bounded by out-of-band reasoning.

Alongside the re-phasing, we have eliminated the term \(\mu \) and express the bound purely as a function of the adversary’s capabilities. Next to the total offline complexity N, i.e., the number of queries the adversary can make to f, and the total online complexity M, i.e., the total number of construction queries (to the keyed duplex or ideal XIF), we introduce two metrics: L and \(\varOmega \), both reflecting the ability of the adversary to force the outer part of the state to a value of its choice. The metric L counts the number of construction queries with repeated path (intuitively, a concatenation of all data blocks up to a certain permutation call), which may typically occur in MAC functions and authenticated encryption schemes that do not impose nonce uniqueness. The metric \(\varOmega \) counts the number of construction queries where the adversary can overwrite the outer part of the state. Such a possibility may occur in authenticated encryption schemes that release unverified decrypted ciphertext (cf., [1]). A comparison of the scheme analyzed in this work with those in earlier works is given in Table 1.

Table 1. Comparison of the schemes analyzed in earlier works and this work. By “pure bound” we mean that the derived security bound is expressed purely as a function of the adversary’s capabilities. Differences in bounds are not reflected by the table.

We prove in Sect. 4 a bound on the advantage of distinguishing a full-state keyed duplex from an ideal XIF in a multi-target setting. We here give the bound for several settings, all of which have multiple keys sampled uniformly at random without replacement. For adversaries with the ability to send queries with repeated paths and queries that overwrite the outer part of the state, the dominating terms in our bound are:

$$\begin{aligned} \frac{q_\mathrm {iv}N}{2^k} + \frac{(L+\varOmega )N}{2^c} . \end{aligned}$$
(2)

The metric \(q_\mathrm {iv}\) denotes the maximum number of sessions started with the same \(\mathrm {iv}\) but different keys. For adversaries that cannot send queries with repeated paths or send queries that overwrite the outer part of the state, one of the dominating terms depends on the occurrence of multicollisions via a coefficient \(\nu _{r,c}^{M}\) that is fully determined by the data complexity M and parameters r and c (see Sect. 6.5, and particularly Fig. 4). For wide permutations we can have large rates (i.e., \(r > 2 \log _2(M) + c\)) and the dominating terms in our bound are:

$$\begin{aligned} \frac{q_\mathrm {iv}N}{2^k} + \frac{N}{2^{c-1}} . \end{aligned}$$
(3)

For relatively small rates the data complexity can be such that \(M > 2^{r-1}\) and for that range the dominating terms are upper bounded by (assuming \(\nu _{r,c}^{2M} \le \frac{bM}{2^{r+1}}\)):

$$\begin{aligned} \frac{q_\mathrm {iv}N}{2^k} + \frac{bMN}{2^{b}} + \frac{M^2}{2^b} . \end{aligned}$$
(4)

For the case in-between where M is in the range \(2^{(r-c)/2} < M \le 2^{r-1}\), the bound becomes (assuming \(\nu _{r,c}^{2M} \le \min (b/\log \frac{2^r}{2M}, b/4)\)):

$$\begin{aligned} \frac{q_\mathrm {iv}N}{2^k} + \frac{bN}{\max (4, r-1-\log _2 M)2^{c-1}} . \end{aligned}$$
(5)

This bound is valid for permutation widths of 200 and above. These bounds are significantly better than that of [38].

Concretely, in implementations of duplex-based authenticated encryption schemes that respect the nonce requirement and do not release unverified plaintext, we have \(L=\varOmega =0\). Assuming keys are randomly sampled without replacement, the generic security is governed by (3), (4), or (5). Depending on the parameters, a scheme is either in case (3), or case (4)–(5), where a transition happens for \(M=2^{r-1}\). Table 2 summarizes achievable security strengths for the duplex-based CAESAR contenders.

Table 2. Application of our analysis to Ketje, Ascon, NORX, and Keyak. For the nonce misuse case, we consider \(L+\varOmega =M/2\). A “Strength” equal to s means that it requires a computational effort \(2^s\) to distinguish. Here, \(a=\log _2(Mr)\).

Our general security bound, covering among others a broader spectrum of key sampling distributions, is given in Theorem 1. It is noteworthy that, via the built-in support of multiple targets, we manage to obtain a security bound that is largely independent of the number of users u: the only appearance is in the key guessing part, \(q_\mathrm {iv}N/2^k\), which shows an expected decrease in the security strength of exhaustive key search by a term \(\log _2 q_\mathrm {iv}\). Note that security erosion can be avoided altogether by requiring \(\mathrm {iv}\) to be a global nonce, different for each initialization call (irrespective of the used key).

Our analysis improves over the one of [38] in multiple aspects. First, our security bound shows less security erosion for increasing data complexities. Whereas in (1) security strength erodes to \(k- \log _2 M\), in (2) this is \(c - \log _2(L + \varOmega )\) with \(L + \varOmega < M\). By taking \(c > k+ \log _2 M_{\max }\) with \(M_{\max }\) some upper bound on the amount of data an adversary can get its hands on, one can guarantee that this term does not allow attacks faster than exhaustive key search.

Second, via the use of parameters L and \(\varOmega \) our bound allows for a more flexible interpretation and a wide range of use cases. For example, in stream ciphers, \(L=\varOmega =0\) by design. This also holds for most duplex-based authenticated encryption schemes in the case of nonce-respecting adversaries that cannot obtain unverified decrypted ciphertexts.

Third, even in the general case (with key size taken equal to c bits and no nonce restriction on \(\mathrm {iv}\)), our bound still improves over the one of [38] by replacing the multiplicity metric, which can only be evaluated a posteriori, by the metrics L and \(\varOmega \), which reflect what the adversary can do.

Fourth, in our approach we address the multi-key aspect natively. This allows us to determine the required set of properties on the joint distribution of all keys under attack. Theorem 1 holds for arbitrary key sampling techniques, provided the individual keys have sufficient min-entropy and the probability that two keys in the array collide is small enough, and it demonstrates that the full-state keyed duplex remains secure even if the keys are not independently and uniformly randomly distributed.

Finally, we perform an analysis on the contribution of outer-state multicollisions to the bound that is of independent interest. This analysis strongly contributes to the tightness of our bounds, as we illustrate in the Stairway to Heaven graph in Fig. 4.

1.4 Notation

For an integer \(n\in \mathbb {N}\), we denote \(\mathbb {Z}_n=\{0,\ldots ,n-1\}\) and by \(\mathbb {Z}_2^{n}\) the set of bit strings of length n. \(\mathbb {Z}_2^*\) denotes the set of bit strings of arbitrary length. For two bit strings \(s,t\in \mathbb {Z}_2^n\), their bitwise addition is denoted \(s+t\). The expression \({\lfloor s \rfloor _{\ell }}\) denotes the bitstring s truncated to its first \(\ell \) bits. A random oracle [6] \({\mathcal {RO}}:\mathbb {Z}_2^*\rightarrow \mathbb {Z}_2^n\) is a function that maps bit strings of arbitrary length to bit strings of some length n. In this paper, the value of n is determined by the context. We denote by \((x)_{(y)}\) the falling factorial power \((x)_{(y)}=x(x-1)\cdots (x-y+1)\).

Throughout this work, b denotes the width of the permutation f. The parameters c and r denote the capacity and rate, where \(b=c+r\). For a state value \(s\in \mathbb {Z}_2^b\), we follow the general convention to define its outer part by \(\overline{s}\in \mathbb {Z}_2^r\) and its inner part by \(\widehat{s}\in \mathbb {Z}_2^c\), in such a way that \(s=\overline{s}||\widehat{s}\). The key size is conventionally denoted by \(k\), and the number of users by u. Throughout, we assume that \(u\le 2^k\), and regularly use an encoding function \(\mathsf {Encode}:\mathbb {Z}_u\rightarrow \mathbb {Z}_2^k\), mapping integers from \(\mathbb {Z}_u\) to \(k\)-bit strings in some injective way.

2 Constructions

In Sect. 2.1, we will discuss the key sampling technique used in this work. The keyed duplex construction is introduced in Sect. 2.2, and we present its “ideal equivalent,” the ideal extendable input function, in Sect. 2.3. To suit the security analysis, we will also need an in-between hybrid, the randomized duplex, discussed in Sect. 2.4.

2.1 Key Sampling

Our keyed duplex construction has built-in multi-user support, and we start with a formalization of the key sampling that we consider. At a high level, our formalization is not specific for the keyed duplex, and may be of independent interest for modeling multi-target attacks.

In our formalization the adversary can invoke a keyed object (block cipher, stream cipher, PRF, keyed sponge, ...) with a key selected from a key array \(\mathbf {K}\) containing \({u}\) keys, each of length \(k\) bits:

$$\begin{aligned} \mathbf {K}= (\mathbf {K}[0],\ldots , {\mathbf {K}[u-1])} \in \big (\mathbb {Z}_2^k\big )^u . \end{aligned}$$

These keys are sampled from the space of \(k\)-bit keys according to some distribution \(\mathcal {D}_\mathrm{K}\). This distribution can, in theory, be anything. In particular, the distribution of the key with index \(\delta \) may depend on the values of the \(\delta \) keys sampled before.

Two plausible examples of the key distribution are random sampling with replacement and random sampling without replacement. In the former case, all keys are generated uniformly at random and pairwise independently, but accidental collisions in the key array may cause problems. The latter distribution resolves this by generating all keys uniformly at random from the space of values excluding the ones already sampled. A third, more extreme, example of \(\mathcal {D}_\mathrm{K}\) generates \(\mathbf {K}[0]\) uniformly at random and defines all subsequent keys as \(\mathbf {K}[\delta ] = \mathbf {K}[0] +\mathsf {Encode}(\delta )\).

Different distributions naturally entail different levels of security, and we define two characteristics of a distribution that are relevant for our analysis. Note that the characteristics take u as implicit parameter. The first characteristic is the min-entropy of the individual keys, defined as

$$\begin{aligned} H_{\min }(\mathcal {D}_\mathrm{K}) = - \log _2 \max _{\delta \in \mathbb {Z}_u,a\in \mathbb {Z}_2^k} \Pr (\mathbf {K}[\delta ] = a) , \end{aligned}$$
(6)

or in words, minus the binary logarithm of the probability of the key value to be selected with the highest probability. The three example samplings outlined above have min-entropy \(k\), regardless of the value u.

The second characteristic is related to the maximum collision probability between two keys in the array:

$$\begin{aligned} H_{\text {coll}}(\mathcal {D}_\mathrm{K}) = - \log _2 \max _{\begin{array}{c} \delta ,\delta '\in \mathbb {Z}_u\\ \delta \ne \delta ' \end{array}} \Pr (\mathbf {K}[\delta ] = \mathbf {K}[\delta ']) . \end{aligned}$$
(7)

Uniform sampling with replacement has maximum collision probability equal to \(2^{-k}\) and so \(H_{\text {coll}}(\mathcal {D}_\mathrm{K}) = k\). Sampling without replacement and our third example clearly have collision probability zero, giving \(H_{\text {coll}}(\mathcal {D}_\mathrm{K}) = \infty \).
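To make these characteristics concrete, the following Python sketch exhaustively enumerates toy versions of the three example distributions (tiny parameters, and \(\mathsf {Encode}(\delta )\) taken to be \(\delta \) itself, both assumptions of the sketch) and computes \(H_{\min }\) and \(H_{\text {coll}}\) directly from (6) and (7).

```python
from itertools import product, permutations
from math import log2, inf

K_BITS, U = 3, 2                      # toy parameters: 3-bit keys, 2 users
SPACE = range(2 ** K_BITS)

def with_replacement():
    # every u-tuple of keys equally likely
    for arr in product(SPACE, repeat=U):
        yield arr, 1 / len(SPACE) ** U

def without_replacement():
    # every u-tuple of pairwise distinct keys equally likely
    arrs = list(permutations(SPACE, U))
    for arr in arrs:
        yield arr, 1 / len(arrs)

def derived():
    # K[d] = K[0] + Encode(d), with Encode(d) = d (sketch assumption)
    for k0 in SPACE:
        yield tuple(k0 ^ d for d in range(U)), 1 / len(SPACE)

def characteristics(sampler):
    p_single, p_pair = {}, {}
    for arr, p in sampler():
        for d, a in enumerate(arr):                  # marginals Pr(K[d] = a)
            p_single[(d, a)] = p_single.get((d, a), 0) + p
        for d in range(U):                           # pairwise collision probabilities
            for d2 in range(d + 1, U):
                if arr[d] == arr[d2]:
                    p_pair[(d, d2)] = p_pair.get((d, d2), 0) + p
    h_min = -log2(max(p_single.values()))            # definition (6)
    h_coll = inf if not p_pair else -log2(max(p_pair.values()))  # definition (7)
    return h_min, h_coll

for name, sampler in [("with replacement", with_replacement),
                      ("without replacement", without_replacement),
                      ("derived", derived)]:
    print(name, characteristics(sampler))
```

Running it prints min-entropy 3 for all three samplings, collision entropy 3 for sampling with replacement, and infinite collision entropy for the other two, matching the discussion above.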

2.2 Keyed Duplex Construction

The full-state keyed duplex (KD) construction is defined in Algorithm 1, and it is illustrated in Fig. 1.

Fig. 1. Full-state keyed duplex construction \({\textsc {KD}^{f}_{\mathbf {K}}}\). In this figure, the sequence of calls is \(Z = {\mathrm {KD}{.}\mathrm {Init}(\delta ,\mathrm {iv},\sigma ,\text {false})}\), \(Z = {\mathrm {KD}{.}\mathrm {Duplexing}(\sigma ,\text {true})}\), and \(Z = {\mathrm {KD}{.}\mathrm {Duplexing}(\sigma ,\text {false})}\).

It calls a b-bit permutation f and is given access to an array \(\mathbf {K}\) consisting of \({u}\) keys of size \(k\) bits. A user can make two calls: initialization and duplexing calls.

An initialization call takes as input a key index \(\delta \) and a string \(\mathrm {iv}\in \mathbb {Z}_2^{b-k}\) and initializes the state as \(f(\mathbf {K}[\delta ] ||\mathrm {iv})\). In the same call, the user receives an r-bit output string Z and injects a b-bit string \(\sigma \). A duplexing call just performs the latter part: it updates the state by applying f to it, returns to the user an r-bit output string Z, and injects a user-provided b-bit string \(\sigma \).

Both in initialization and duplexing calls, the output string Z is taken from the state prior to the addition of \(\sigma \) to it, but the user has to provide \(\sigma \) before receiving Z. This is in fact a re-phasing compared to the original definition of the duplex [11] or of the full-state keyed duplex [38], and it aims at better reflecting typical use cases. We illustrate this with the SpongeWrap authenticated encryption scheme [11] and its more recent variants [38]. In this scheme, each plaintext block is typically encrypted by (i) applying f, (ii) fetching a block of key stream, (iii) adding the key stream and plaintext blocks to get a ciphertext block, and (iv) adding the plaintext block to the outer part of the state. Inspection of Algorithm 3 in [11] shows that there is systematically a delay between the production of key stream and its use, requiring a key stream block to be buffered between the (original) duplexing calls. In contrast, our re-phased calls better match the sequence of operations.

The flag in the initialization and duplexing calls is required to implement decryption in SpongeWrap and variants. In that case, the sequence of operations is the same as above, except that step (iii) consists of adding the key stream and ciphertext blocks to get a plaintext block. However, a user would need to see the keystream block before being able to add the plaintext block in step (iv). One can see, however, that step (iv) is equivalent to overwriting the outer part of the state with the ciphertext block. Switching between adding the plaintext block (for encryption) and overwriting with the ciphertext block (for decryption) is the purpose of the flag. The usage of the flag, alongside the re-phasing, is depicted in Fig. 1.

Note that in Algorithm 1, in the case that the flag is true, the outer part of the state is overwritten with \(\overline{\sigma }\). For consistency with the algorithms of constructions we will introduce shortly, this is formalized as bitwise adding Z to \(\overline{\sigma }\) before its addition to the state if flag is true. Alternatively, one could define an authenticated encryption mode that does not allow overwriting the state with the ciphertext block C. For example, encryption would be \(C = P + \mathbf {M} \times Z\), with P the plaintext block and \(\mathbf {M}\) a simple invertible matrix. Upon decryption, the outer part of the state then becomes \(C + (\mathbf {M} + \mathbf {I}) \times Z\). If \(\mathbf {M}\) is chosen such that \(\mathbf {M} + \mathbf {I}\) is invertible, the adversary has no control over the outer part of the state after the duplexing call. This would require changing “\(\overline{\sigma } \leftarrow \overline{\sigma } +Z\)” into “\(\overline{\sigma } \leftarrow \overline{\sigma } +\mathbf {M} \times Z\)” in Algorithm 1.

[Algorithm 1: full-state keyed duplex construction \({\textsc {KD}^{f}_{\mathbf {K}}}\)]
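The following Python sketch makes the initialization and duplexing calls of Algorithm 1, as described above, concrete. The bit-level conventions (states as b-bit integers with the outer part in the top r bits, “+” interpreted as XOR, no padding or encoding details) are assumptions of the sketch, not part of the specification.

```python
class KeyedDuplex:
    """Sketch of the full-state keyed duplex KD of Algorithm 1.

    States are b-bit integers, the outer part is the top r bits, '+' is XOR,
    and Encode/padding details are ignored (assumptions of this sketch).
    """
    def __init__(self, f, b, r, k, keys):
        self.f, self.b, self.r, self.k = f, b, r, k
        self.c = b - r
        self.keys = keys             # key array K of u keys of k bits each
        self.s = 0                   # b-bit state

    def init(self, delta, iv, sigma, flag):
        # state <- f(K[delta] || iv), with iv a (b-k)-bit string
        self.s = self.f((self.keys[delta] << (self.b - self.k)) | iv)
        return self._extract_then_inject(sigma, flag)

    def duplexing(self, sigma, flag):
        self.s = self.f(self.s)      # re-phased call: first apply f ...
        return self._extract_then_inject(sigma, flag)

    def _extract_then_inject(self, sigma, flag):
        Z = self.s >> self.c         # ... then extract the outer r bits ...
        if flag:                     # flag = true: overwrite the outer part
            sigma ^= Z << self.c     # with outer(sigma), via adding Z first
        self.s ^= sigma              # ... and finally inject sigma
        return Z
```

The sequence of calls of Fig. 1 then corresponds to kd.init(delta, iv, sigma, False), followed by kd.duplexing(sigma, True) and kd.duplexing(sigma, False).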

2.3 Ideal Extendable Input Function

We define an ideal extendable input function (IXIF) in Algorithm 2. It has the same interface as KD, but instead it uses a random oracle \({\mathcal {RO}}:\mathbb {Z}_2^*\rightarrow \mathbb {Z}_2^r\) to generate its responses. In particular, every initialization call initializes a \(\mathsf {Path}\) as \(\mathsf {Encode}(\delta ) ||\mathrm {iv}\). In both initialization and duplexing calls, an r-bit output is generated by evaluating \({\mathcal {RO}}(\mathsf {Path})\) and the b-bit input string \(\sigma \) is absorbed by appending it to the \(\mathsf {Path}\). Figure 2 has an illustration of IXIF (at the right).

Note that IXIF properly captures the random equivalent of the full-state keyed duplex: it simply returns random values from \(\mathbb {Z}_2^r\) for every new path, and repeated paths result in identical responses. IXIF is almost equivalent to the duplex as presented by Mennink et al. [38]: when (i) not considering multiple keys for our construction and (ii) avoiding overlap of the \(\mathrm {iv}\) with the key (as is possible in the construction of [38]), the ideal functionalities are the same. In our analysis, we do not consider overlap of the \(\mathrm {iv}\) with the key as (i) it unnecessarily complicates the analysis and (ii) we discourage it as it may be a security risk if the keys in the key array \(\mathbf {K}\) are not independently and uniformly randomly distributed.

[Algorithm 2: ideal extendable input function IXIF]
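Analogously to the sketch of KD above, here is a Python sketch of IXIF consistent with the description of Algorithm 2. The random oracle is modeled by lazy sampling, and Paths are represented as tuples rather than bit strings; both are assumptions of the sketch.

```python
import secrets

class IXIF:
    """Sketch of the ideal extendable input function IXIF of Algorithm 2.

    RO is modeled by lazy sampling: a fresh random r-bit value is drawn for
    every new Path and remembered, so repeated Paths give identical responses.
    Paths are tuples and sigma is a b-bit integer (conventions of this sketch).
    """
    def __init__(self, b, r):
        self.b, self.r, self.c = b, r, b - r
        self.table = {}              # lazily sampled random oracle RO
        self.path = ()

    def _ro(self, path):
        if path not in self.table:
            self.table[path] = secrets.randbits(self.r)
        return self.table[path]

    def init(self, delta, iv, sigma, flag):
        self.path = ((delta, iv),)   # Path <- Encode(delta) || iv
        return self._duplex(sigma, flag)

    def duplexing(self, sigma, flag):
        return self._duplex(sigma, flag)

    def _duplex(self, sigma, flag):
        Z = self._ro(self.path)              # Z <- RO(Path)
        if flag:                             # same flag convention as in KD
            sigma ^= Z << self.c
        self.path = self.path + (sigma,)     # absorb sigma by extending Path
        return Z
```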

2.4 Randomized Duplex Construction

To simplify our security analysis, we introduce a hybrid algorithm lying in-between KD and IXIF: the full-state randomized duplex (RD) construction. It is defined in Algorithm 3. It again has the same interface as KD, but the calls to the permutation f and the access to a key array \(\mathbf {K}\) have been replaced by two primitives: a uniformly random injective mapping \(\phi :\mathbb {Z}_u\times \mathbb {Z}_2^{b-k}\rightarrow \mathbb {Z}_2^b\), and a uniformly random b-bit permutation \(\pi \). The injective mapping \(\phi \) replaces the keyed state initialization by directly mapping an input \((\delta ,\mathrm {iv})\) to a b-bit state value. The permutation \(\pi \) replaces the evaluations of f in the duplexing calls. In our use of RD, \(\phi \) and \(\pi \) will be secret primitives. Figure 2 has an illustration of RD (at the left).

[Algorithm 3: full-state randomized duplex construction RD]
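In the same spirit, a Python sketch of the randomized duplex of Algorithm 3: the secret primitives \(\phi \) and \(\pi \) are lazily sampled, \(\phi \) as a uniformly random injective mapping and \(\pi \) as a uniformly random permutation queried in the forward direction only, which is all the construction needs.

```python
import secrets

class RandomizedDuplex:
    """Sketch of the full-state randomized duplex RD of Algorithm 3.

    phi and pi are lazily sampled secret primitives: phi maps (delta, iv)
    injectively to fresh b-bit states, and pi behaves as a random b-bit
    permutation queried in the forward direction.
    """
    def __init__(self, b, r):
        self.b, self.r, self.c = b, r, b - r
        self.phi, self.pi = {}, {}
        self.phi_range, self.pi_range = set(), set()
        self.s = 0

    def _fresh(self, used):
        while True:                         # rejection-sample an unused output
            v = secrets.randbits(self.b)
            if v not in used:
                used.add(v)
                return v

    def init(self, delta, iv, sigma, flag):
        if (delta, iv) not in self.phi:     # lazily sample the injection phi
            self.phi[(delta, iv)] = self._fresh(self.phi_range)
        self.s = self.phi[(delta, iv)]
        return self._extract_then_inject(sigma, flag)

    def duplexing(self, sigma, flag):
        if self.s not in self.pi:           # lazily sample the permutation pi
            self.pi[self.s] = self._fresh(self.pi_range)
        self.s = self.pi[self.s]
        return self._extract_then_inject(sigma, flag)

    def _extract_then_inject(self, sigma, flag):
        Z = self.s >> self.c
        if flag:
            sigma ^= Z << self.c
        self.s ^= sigma
        return Z
```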

3 Security Setup

The security analysis in this work is performed in the distinguishability framework where one bounds the advantage of an adversary \(\mathcal {A}\) in distinguishing a real system from an ideal system.

Definition 1

Let \(\mathcal {O},\mathcal {P}\) be two collections of oracles with the same interface. The advantage of an adversary \(\mathcal {A}\) in distinguishing \(\mathcal {O}\) from \(\mathcal {P}\) is defined as

$$\begin{aligned} \varDelta _{\mathcal {A}}(\mathcal {O}\;;\;\mathcal {P}) = \left| \Pr \left( \mathcal {A}^{\mathcal {O}}\rightarrow 1\right) - \Pr \left( \mathcal {A}^{\mathcal {P}}\rightarrow 1\right) \right| . \end{aligned}$$

Our proofs in part use the H-coefficient technique from Patarin [43]. We will follow the adaptation of Chen and Steinberger [23]. Consider any information-theoretic deterministic adversary \(\mathcal {A}\) whose goal is to distinguish \(\mathcal {O}\) from \(\mathcal {P}\), with its advantage denoted

$$\begin{aligned} \varDelta _{\mathcal {A}}(\mathcal {O}\;;\;\mathcal {P}) \; . \end{aligned}$$

The interaction of \(\mathcal {A}\) with its oracle, either \(\mathcal {O}\) or \(\mathcal {P}\), will be stored in a transcript \(\tau \). Denote by \(D_{\mathcal {O}}\) (resp. \(D_{\mathcal {P}}\)) the probability distribution of transcripts that can be obtained from interaction with \(\mathcal {O}\) (resp. \(\mathcal {P}\)). Call a transcript \(\tau \) attainable if it can be obtained from interacting with \(\mathcal {P}\), hence if \(\Pr \left( D_{\mathcal {P}}=\tau \right) >0\). Denote by \(\mathcal {T}\) the set of attainable transcripts, and consider any partition \(\mathcal {T}=\mathcal {T}_\mathrm {good}\cup \mathcal {T}_\mathrm {bad}\) of the set of attainable transcripts into “good” and “bad” transcripts. The H-coefficient technique states the following [23].

Lemma 1

(H-coefficient Technique). Consider a fixed information-theoretic deterministic adversary \(\mathcal {A}\) whose goal is to distinguish \(\mathcal {O}\) from \(\mathcal {P}\). Let \(\varepsilon \) be such that for all \(\tau \in \mathcal {T}_\mathrm {good}\):

$$\begin{aligned} \frac{\Pr \left( D_{\mathcal {O}}=\tau \right) }{\Pr \left( D_{\mathcal {P}}=\tau \right) } \ge 1-\varepsilon . \end{aligned}$$
(8)

Then, \(\varDelta _{\mathcal {A}}(\mathcal {O}\;;\;\mathcal {P}) \le \varepsilon + \Pr \left( D_{\mathcal {P}}\in \mathcal {T}_\mathrm {bad}\right) \).

The H-coefficient technique can thus be used to neatly bound a distinguishing advantage in the terminology of Definition 1, and a proof typically goes in four steps: (i) investigate what transcripts look like, which gives a definition for \(\mathcal {T}\), (ii) define the partition of \(\mathcal {T}\) into \(\mathcal {T}_\mathrm {good}\) and \(\mathcal {T}_\mathrm {bad}\), (iii) investigate the fraction of (8) for good transcripts and (iv) analyze the probability that \(D_{\mathcal {P}}\) generates a bad transcript.

4 Security of Keyed Duplex Construction

We prove that the full-state keyed duplex construction (KD) is sound. We do so by proving an upper bound for the advantage of distinguishing the KD calling a random permutation f from an ideal extendable input function (IXIF). Both in the real and ideal world the adversary gets additional query access to f and \(f^{-1}\), simply denoted as \(f\).

The main result is stated in Sect. 4.2, but before doing so, we specify the resources of the adversary in Sect. 4.1.

4.1 Quantification of Adversarial Resources

We will consider information-theoretic adversaries that have two oracle interfaces: a construction oracle, \({\textsc {KD}^{f}_{\mathbf {K}}}\) or \({\textsc {IXIF}^{{\mathcal {RO}}}}\), and a primitive oracle \(f\). For the construction queries, it can make initialization queries or duplexing queries. Note that, when querying \({\textsc {IXIF}^{{\mathcal {RO}}}}\), every query has a path \(\mathsf {Path}\) associated to it. To unify notation, we also associate a \(\mathsf {Path}\) to each query (initialization or duplexing) to \({\textsc {KD}^{f}_{\mathbf {K}}}\). This \(\mathsf {Path}\) is defined the straightforward way: it simply consists of the concatenation of \(\mathsf {Encode}(\delta ),\mathrm {iv}\) of the most recent initialization call and all \(\sigma \)-values that have been queried after the last initialization but before the current query. Using this formalization, every initialization or duplexing call that the adversary makes to \({\textsc {KD}^{f}_{\mathbf {K}}}\) or \({\textsc {IXIF}^{{\mathcal {RO}}}}\) can be properly captured by a tuple

$$\begin{aligned} (\mathsf {Path},Z,\sigma ), \end{aligned}$$

where, intuitively, \(\mathsf {Path}\) is all data that is used to generate the response \(Z\in \mathbb {Z}_2^r\), and \(\sigma \in \mathbb {Z}_2^b\) is the input string (slightly abusing notation: the recorded \(\sigma \) equals the queried \(\sigma \) if \(\text {flag}=\text {false}\) and \(\sigma +(Z||0^c)\) if \(\text {flag}=\text {true}\)).

Following Andreeva et al. [2], we specify adversarial resources that impose limits on the transcripts that any adversary can obtain. The basic resource metrics are quantitative: they specify the number of queries an adversary is allowed to make for each type.

  • N : the number of primitive queries. It corresponds to computations requiring no access to the (keyed) construction. It is usually called the time or offline complexity. In practical use cases, N is only limited by the computing power and time available to the adversary.

  • M : the number of construction queries. It corresponds to the amount of data processed by the (keyed) construction. It is usually called the data or online complexity. In many practical use cases, M is limited.

We remark that identical calls are counted only once. In other words, N only counts the number of primitive queries, and M only counts the number of unique tuples \((\mathsf {Path},\sigma )\).

It is possible to perform an analysis solely based on these metrics, but in order to more accurately cover practical settings that were not covered before (such as the multi-key setting or the nonce-respecting setting), and to eliminate the multiplicity (a metric used in all earlier results in this direction), we define a number of additional metrics.

  • q: the total number of different initialization tuples \((\mathsf {Encode}(\delta ),\mathrm {iv})\). Parameter q corresponds to the number of times an adversary can start a fresh initialization of KD or IXIF.

  • \(q_\mathrm {iv}\): the \(\mathrm {iv}\) multiplicity, i.e., the maximum number of different initialization tuples \((\mathsf {Encode}(\delta ),\mathrm {iv})\) with the same \(\mathrm {iv}\), maximized over all \(\mathrm {iv}\) values.

  • \(\varOmega \): the number of queries with \(\text {flag}= \text {true}\).

  • L: the number of queries M minus the number of distinct paths. It corresponds to the number of construction queries that have the same \(\mathsf {Path}\) as some prior query.

In many practical use cases, q is limited, but as it turns out re-initialization queries give the adversary more power. The metric \(q_\mathrm {iv}\) is relevant in multi-target attacks, where clearly \(q_\mathrm {iv}\le {u}\). The relevance of \(\varOmega \) and L is the following. In every query with flag equal to true, the adversary can force the outer part of the input to f in a later query to a chosen value \(\alpha \) by taking \(\overline{\sigma } = \alpha \). Note that, as discussed in Sect. 2.2, by adopting authenticated encryption schemes with a slightly non-conventional encryption method, \(\varOmega \) can be forced to zero. Similarly, construction queries with the same path return the same value Z, and hence allow an adversary to force the outer part of the input to f in a later query to a chosen value \(\alpha \) by taking \(\sigma \) such that \(\overline{\sigma } = Z + \alpha \). An adversary can use this technique to increase the probability of collisions in \(f(s)+\sigma \) and to speed up inner state recovery. By definition, \(L \le M-1\) but in many cases L is much smaller. In particular, if one considers KD in the nonce-respecting setting, where no \((\mathsf {Encode}(\delta ),\mathrm {iv})\) occurs twice, the adversary never generates a repeating path, and \(L=0\).
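To illustrate how these metrics relate to a transcript, the following Python sketch computes M, q, \(q_\mathrm {iv}\), \(\varOmega \), and L from a list of construction queries. The query encoding (Paths as tuples whose first element is the \((\delta ,\mathrm {iv})\) pair) is an assumption of the sketch.

```python
from collections import Counter

def resource_metrics(queries):
    """Compute (M, q, q_iv, Omega, L) from a list of construction queries.

    Each query is a (path, sigma, flag) triple, where path is a tuple whose
    first element is the (delta, iv) initialization pair -- an illustrative
    encoding, not the paper's bit-string one.  Identical (path, sigma) pairs
    are counted once, as stipulated in Sect. 4.1.
    """
    unique = {(path, sigma): flag for path, sigma, flag in queries}
    M = len(unique)                                    # online complexity
    inits = {path[0] for (path, _sigma) in unique}     # distinct (delta, iv)
    q = len(inits)
    q_iv = max(Counter(iv for (_delta, iv) in inits).values(), default=0)
    Omega = sum(1 for flag in unique.values() if flag)
    L = M - len({path for (path, _sigma) in unique})   # repeated-path queries
    return M, q, q_iv, Omega, L
```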

4.2 Main Result

Our bound uses a function that is defined in terms of a simple balls-into-bins problem.

Definition 2

The multicollision limit function \(\nu _{r,c}^{M}\), with M a natural number, returns a natural number and is defined as follows. Assume we uniformly randomly distribute M balls in \(2^r\) bins. If we call the number of balls in the bin with the highest number of balls \(\mu \), then \(\nu _{r,c}^{M}\) is defined as the smallest natural number x that satisfies:

$$\begin{aligned} \Pr \left( \mu > x\right) \le \frac{x}{2^{c}} \;. \end{aligned}$$

In words, when uniformly randomly sampling M elements from a set of \(2^r\) elements, the probability that there is an element that is sampled more than x times is at most \(x 2^{-c}\).
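Definition 2 does not give a closed form. A simple way to obtain a (possibly pessimistic) value is to replace \(\Pr \left( \mu > x\right) \) by the standard balls-into-bins union bound \(2^r {M \atopwithdelims ()x+1} 2^{-r(x+1)}\) and take the smallest x for which that bound is at most \(x 2^{-c}\). The Python sketch below does exactly that; it returns an upper estimate of \(\nu _{r,c}^{M}\), not the exact value studied in Sect. 6.5.

```python
from math import comb

def nu_upper_estimate(r, c, M):
    """Upper estimate of the multicollision limit function nu_{r,c}^M.

    Replaces Pr(mu > x) by the union bound 2^r * C(M, x+1) / 2^(r*(x+1))
    over the balls-into-bins experiment of Definition 2, so the returned
    value can only overestimate nu (a sketch, not the paper's analysis).
    """
    for x in range(1, M + 1):
        # integer comparison of  2^r * C(M, x+1) / 2^(r(x+1))  <=  x / 2^c
        if comb(M, x + 1) * 2 ** (r + c) <= x * 2 ** (r * (x + 1)):
            return x
    return M
```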

Theorem 1

Let f be a random permutation and \({\mathcal {RO}}\) be a random oracle. Let \(\mathbf {K}\) be a key array generated using a distribution \(\mathcal {D}_\mathrm{K}\). Let \({\textsc {KD}^{f}_{\mathbf {K}}}\) be the construction of Algorithm 1 and \({\textsc {IXIF}^{{\mathcal {RO}}}}\) be the construction of Algorithm 2 and let \(\nu _{r,c}^{M}\) be defined according to Definition 2. For any adversary \(\mathcal {A}\) with resources as discussed in Sect. 4.1, and with \(N + M \le 0.1 \cdot 2^c\),

$$\begin{aligned} \varDelta _{\mathcal {A}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}, {f}) \le&\frac{ (L + \varOmega )N}{2^c} + \frac{ 2\nu _{r,c}^{2(M-L)}(N+1)}{2^c} + \frac{{L \,{+}\, \varOmega \, {+}\, 1 \atopwithdelims ()2}}{2^c} \\&+\, \frac{ (M-q-L)q}{2^b - q} + \frac{ M(M-L-1)}{2^b} \\&+ \,\frac{ (M-q-L)q}{2^{H_{\min }(\mathcal {D}_\mathrm{K})+\min \{c,b-k\}}} + \frac{ q_\mathrm {iv}N}{2^{H_{\min }(\mathcal {D}_\mathrm{K})}} + \frac{ {u\atopwithdelims ()2}}{2^{H_{\text {coll}}(\mathcal {D}_\mathrm{K})}} . \end{aligned}$$

The proof is given in Sect. 4.3.

For the case where \(k+ c \le b-1\), and where \(\mathcal {D}_\mathrm{K}\) corresponds to uniform sampling without replacement, the bound simplifies to

$$\begin{aligned} \varDelta _{\mathcal {A}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}, {f}) \le&\frac{ (L + \varOmega )N}{2^c} + \frac{ 2\nu _{r,c}^{2(M-L)}(N+1)}{2^c} + \frac{{L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^c} \\&+ \,\frac{ q_\mathrm {iv}N}{2^{k}} + \frac{ (M-q-L)q}{2^{k+ c - 1}} + \frac{ M(M-L-1)}{2^b} . \end{aligned}$$
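To get a feel for this simplified bound, the following Python sketch evaluates its terms numerically for given parameters. The value of \(\nu \) must be supplied externally, for instance via the nu_upper_estimate sketch given after Definition 2; the function is an illustration, not part of the formal statement.

```python
from math import comb

def simplified_bound(b, c, k, N, M, q, q_iv, L, Omega, nu):
    """Numeric evaluation of the simplified bound below Theorem 1
    (k + c <= b - 1, keys sampled uniformly without replacement).

    nu stands for nu_{r,c}^{2(M-L)}; an upper estimate of it (e.g. from
    nu_upper_estimate) yields an upper estimate of the bound.
    """
    return ((L + Omega) * N / 2 ** c
            + 2 * nu * (N + 1) / 2 ** c
            + comb(L + Omega + 1, 2) / 2 ** c
            + q_iv * N / 2 ** k
            + (M - q - L) * q / 2 ** (k + c - 1)
            + M * (M - L - 1) / 2 ** b)
```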

The behavior of the function \(\nu _{r,c}^{M}\) is discussed in Sect. 6.5 and illustrated in Fig. 4, which we refer to as the Stairway to Heaven graph.

4.3 Proof of Theorem 1

Let \(\mathcal {A}\) be any information-theoretic adversary that has access to either, in the real world \(({\textsc {KD}^{f}_{\mathbf {K}}}, {f})\), or in the ideal world \(({\textsc {IXIF}^{{\mathcal {RO}}}}, {f})\). Note that, as \(\mathcal {A}\) is information-theoretic, we can without loss of generality assume that it is deterministic, and we can apply the technique of Sect. 3. By the triangle inequality,

$$\begin{aligned}&\varDelta _{\mathcal {A}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}, {f}) \nonumber \\ \le&\varDelta _{\mathcal {B}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {RD}^{\phi ,\pi }_{}}, {f}) + \varDelta _{\mathcal {C}}({\textsc {RD}^{\phi ,\pi }_{}}, {f}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}, {f}) , \end{aligned}$$
(9)

where \({\textsc {RD}^{\phi ,\pi }_{}}\) for random injection function \(\phi \) and random permutation \(\pi \) is the construction of Algorithm 3, and where \(\mathcal {B}\) and \(\mathcal {C}\) have the same resources \((N,M,q,q_\mathrm {iv},L,\varOmega )\) as \(\mathcal {A}\).

In the last term of (9), RD calls an ideal injective function \(\phi \) and a random permutation \(\pi \), both independent of f, and IXIF calls a random oracle \({\mathcal {RO}}\), also independent of f. The oracle access to \(f\) therefore does not “help” the adversary in distinguishing the two, or more formally,

$$\begin{aligned} \varDelta _{\mathcal {C}}({\textsc {RD}^{\phi ,\pi }_{}}, {f}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}, {f}) \le \varDelta _{\mathcal {D}}({\textsc {RD}^{\phi ,\pi }_{}}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}) , \end{aligned}$$
(10)

where \(\mathcal {D}\) is an adversary with the same construction query parameters as \(\mathcal {A}\), but with no access to \(f\).

The two remaining distances, i.e., the first term of (9) and the term of (10), will be analyzed in the next lemmas. The proof of Theorem 1 directly follows.

Lemma 2

For any adversary \(\mathcal {D}\) with resources as discussed in Sect. 4.1 but with no access to \(f\),

$$\begin{aligned} \varDelta _{\mathcal {D}}({\textsc {RD}^{\phi ,\pi }_{}}\;;\;{\textsc {IXIF}^{{\mathcal {RO}}}}) \le \frac{{L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^c} + \frac{ M(M-L-1)}{2^b} . \end{aligned}$$
(11)

Lemma 3

For any adversary \(\mathcal {B}\) with resources as discussed in Sect. 4.1,

$$\begin{aligned} \varDelta _{\mathcal {B}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {RD}^{\phi ,\pi }_{}}, {f}) \le&\frac{ (L + \varOmega )N}{2^c} + \frac{ 2\nu _{r,c}^{2(M-L)}(N+1)}{2^c} + \frac{ (M-q-L)q}{2^b - q} \\&+\, \frac{ (M-q-L)q}{2^{H_{\min }(\mathcal {D}_\mathrm{K})+\min \{c,b-k\}}} + \frac{ q_\mathrm {iv}N}{2^{H_{\min }(\mathcal {D}_\mathrm{K})}} + \frac{ {u\atopwithdelims ()2}}{2^{H_{\text {coll}}(\mathcal {D}_\mathrm{K})}} . \end{aligned}$$
(12)

The proof of Lemma 2 is given in Sect. 5, and the proof of Lemma 3 is given in Sect. 6.

5 Distance Between RD and IXIF

In this section we bound the advantage of distinguishing the randomized duplex from an ideal extendable input function, (11) of Lemma 2. The distinguishing setup is illustrated in Fig. 2. The derivation is performed using the H-coefficient technique.

Fig. 2. Distinguishing experiment of RD and IXIF.

Description of Transcripts. The adversary has only a single interface, \({\textsc {RD}^{\phi ,\pi }_{}}\) or \({\textsc {IXIF}^{{\mathcal {RO}}}}\), but can make both initialization and duplexing queries. Following the discussion of Sect. 4.1, we can unify the two different types of queries, and summarize the conversation of \(\mathcal {D}\) with its oracle in a transcript of the form

$$\begin{aligned} {\tau _\mathrm{C}}= \{(\mathsf {Path}_j,Z_j,\sigma _j)\}_{j=1}^M . \end{aligned}$$

The values \(Z_j\) correspond to the outer part of the state just before \(\sigma _j\) gets injected. To make the analysis easier, we disclose at the end of the experiment, for each query, the inner part of the state at the moment \(Z_j\) is extracted (in the real world). We denote the full state at that moment by \(t_j = \overline{t}_j ||\widehat{t}_j\) with \(\overline{t}_j = Z_j\). In the IXIF, \(\widehat{t}_j\) is a value that is randomly generated for each path \(\mathsf {Path}\) and can be expressed as \({\mathcal {RO}}'(\mathsf {Path})\) for some random oracle \({\mathcal {RO}}'\) with c-bit output. We integrate those values in the transcript, yielding:

$$\begin{aligned} \tau =\{(\mathsf {Path}_j,t_j,\sigma _j)\}_{j=1}^M . \end{aligned}$$

Definition of Good and Bad Transcripts. We define a transcript \(\tau \) as bad if it contains a t-collision or an s-collision, where \(s=t+\sigma \). A t-collision is defined as equal t values despite different \(\mathsf {Path}\) values:

$$\begin{aligned} \exists (\mathsf {Path},t,\sigma ), (\mathsf {Path}',t',\sigma ') \in \tau \text { with } \big ( \mathsf {Path}\ne \mathsf {Path}' \big ) \text { AND } \big ( t = t' \big ) . \end{aligned}$$
(13)

An s-collision is defined as equal s values despite different \((\mathsf {Path},\sigma )\) values:

$$\begin{aligned} \exists (\mathsf {Path},t,\sigma )&, (\mathsf {Path}',t',\sigma ') \in \tau \text { with } \nonumber \\&\big ((\mathsf {Path},\sigma ) \ne (\mathsf {Path}',\sigma ')\big ) \text { AND } \big (t +\sigma = t' +\sigma '\big ) . \end{aligned}$$
(14)

In case the oracle is \({\textsc {RD}^{\phi ,\pi }_{}}\), a t-collision is equivalent to two different inputs to \(\pi \) with identical outputs; an s-collision corresponds to the case of two identical inputs to \(\pi \) where the outputs are expected to be distinct. By excluding these transcripts as bad, all queries in a good transcript properly define input-output tuples for \(\phi \) and \(\pi \).

Bounding the H-coefficient Ratio for Good Transcripts. Denote \(\mathcal {O}={\textsc {RD}^{\phi ,\pi }_{}}\) and \(\mathcal {P}={\textsc {IXIF}^{{\mathcal {RO}}}}\) for brevity. Consider a good transcript \(\tau \in \mathcal {T}_\mathrm {good}\). For the real world \(\mathcal {O}\), the transcript defines exactly q input-output pairs for \(\phi \) and exactly \(M-q-L\) input-output pairs for \(\pi \). It follows that \(\Pr \left( D_{\mathcal {O}}=\tau \right) = 1/((2^b)_{(q)}(2^b)_{(M-q-L)})\). For the ideal world \(\mathcal {P}\), every different \(\mathsf {Path}\) defines exactly one evaluation of \({\mathcal {RO}}(\mathsf {Path})||{\mathcal {RO}}'(\mathsf {Path})\), so \(\Pr \left( D_{\mathcal {P}}=\tau \right) = 2^{-(M-L)b}\). We consequently obtain that \(\displaystyle \frac{\Pr \left( D_{\mathcal {O}}=\tau \right) }{\Pr \left( D_{\mathcal {P}}=\tau \right) } \ge 1\).

Bounding the Probability of Bad Transcripts in the Ideal World. In the ideal world, every t is generated as \({\mathcal {RO}}(\mathsf {Path})||{\mathcal {RO}}'(\mathsf {Path})\). As the number of distinct \(\mathsf {Path}\)’s in \(\tau \) is \(M-L\), there are \({M\,{-}\,L\atopwithdelims ()2}\) possibilities for a t-collision, each occurring with probability \(2^{-b}\). The probability of such a collision is hence \(\frac{{M\,{-}\,L\atopwithdelims ()2}}{2^b}\).

There are \({M \atopwithdelims ()2}\) occasions for an s-collision. Denote by S the size of the subset of these occasions for which the adversary can (in the worst case) force the outer part of \(s=t+\sigma \) to be a value of its choice. Note that \(S \le {L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}\). In the worst case, in these S occasions the outer part of s always has the same value and the s-collision probability is \(2^{-c}\). For the \({M\atopwithdelims ()2}-S\) other occasions the s-collision probability is \(2^{-b}\). Thus, the probability of an s-collision is upper bounded by (using our bound on S):

$$\begin{aligned} \frac{{M \atopwithdelims ()2} - S}{2^b} + \frac{S}{2^c} \le \frac{{M \atopwithdelims ()2} - {L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^b} + \frac{{L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^c} \le \frac{{M \atopwithdelims ()2} - {L \,{+}\, 1 \atopwithdelims ()2}}{2^b} + \frac{{L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^c} . \end{aligned}$$

The total probability of having a bad transcript is hence upper bounded by:

$$\begin{aligned} \frac{{M\,{-}\,L \atopwithdelims ()2}}{2^b} + \frac{{M \atopwithdelims ()2} - {L \,{+}\, 1 \atopwithdelims ()2}}{2^b} + \frac{{L \,{+}\, \varOmega \,{+}\, 1 \atopwithdelims ()2}}{2^c}&= \frac{M(M-L-1)}{2^b} + \frac{{L\,{+}\,\varOmega \,{+}\, 1\atopwithdelims ()2}}{2^c} . \end{aligned}$$

As the H-coefficient ratio is at least 1, this is the bound on the distinguishing advantage, and we have proven Lemma 2.

6 Distance Between KD and RD

In this section we bound the advantage of distinguishing the keyed duplex from a randomized duplex, (12) of Lemma 3. The analysis consists of four steps. In Sect. 6.1, we revisit the KD-vs-RD setup, and exclude the case where the queries made by the adversary result in a forward multiplicity that exceeds a certain threshold \(T_\mathrm {fw}\). Next, in Sect. 6.2 we convert our distinguishing setup to a simpler one, called the permutation setup and illustrated in Fig. 3. In this setup, the adversary has direct query access to the primitives \(\phi \) and \(\pi \) of the randomized duplex, and at the keyed duplex side, we define two constructions on top of f that turn out to be hard to distinguish from \(\phi \) and \(\pi \). We carefully translate the resources of the adversary \(\mathcal {B}\) in the KD-vs-RD setup to those of the adversary \(\mathcal {C}\) in the permutation setup. In Sect. 6.3 we subsequently prove a bound in this setup. This analysis in part depends on a threshold on backward multiplicities \(T_\mathrm {bw}\). In Sect. 6.4 we return to the KD-vs-RD setup and blend all results. Finally, in Sects. 6.5 and 6.6 we analyze the function \(\nu _{r,c}^{M}\) that plays an important role in our analysis.

We remark that forward and backward multiplicity appeared before in Bertoni et al. [10] and Andreeva et al. [2], but we resolve them internally in the proof. There is a specific reason for resolving forward multiplicity before the conversion to the permutation setup and backward multiplicity after this conversion. Namely, in the permutation setup, an adversary could form its queries so that the forward multiplicity equals \(M-q\), leading to a non-competitive bound, while the backward multiplicity cannot be controlled by the adversary as it cannot make inverse queries to the constructions. It turns out that, as discussed in Sect. 6.4, we can bound the thresholds as functions of M, L, and \(\varOmega \).

6.1 The KD-vs-RD Setup

As in Sect. 4.1, we express the conversation that \(\mathcal {B}\) has with \({\textsc {KD}^{f}_{\mathbf {K}}}\) or \({\textsc {RD}^{\phi ,\pi }_{}}\) in a transcript of the form:

$$\begin{aligned} {\tau _\mathrm{C}}= \{(\mathsf {Path}_j,Z_j,\sigma _j)\}_{j=1}^M . \end{aligned}$$

We denote by \(\mu _\mathrm {fw}\) the maximum number of occurrences in this transcript of a value \(Z_j +\overline{\sigma }_j\) over all possible values:

$$\begin{aligned} \mu _\mathrm {fw}= \max _{\alpha } \#\{(\mathsf {Path}_j,Z_j,\sigma _j) \in {\tau _\mathrm{C}}\mid Z_j+\overline{\sigma }_j = {\alpha }\} . \end{aligned}$$
(15)

We now distinguish between two cases: \(\mu _\mathrm {fw}\) above some threshold \(T_\mathrm {fw}\) and below it. Denoting \(\mathcal {O}=({\textsc {KD}^{f}_{\mathbf {K}}}, {f})\) and \(\mathcal {P}=({\textsc {RD}^{\phi ,\pi }_{}}, {f})\), we find using a hybrid argument,

$$\begin{aligned} \varDelta _{\mathcal {B}}(\mathcal {O}\;;\;\mathcal {P}) =&\left| \Pr \left( \mathcal {B}^{\mathcal {O}}\rightarrow 1\right) - \Pr \left( \mathcal {B}^{\mathcal {P}}\rightarrow 1\right) \right| \nonumber \\ \le&\left| \Pr \left( \mathcal {B}^{\mathcal {O}}\rightarrow 1 \wedge \mu _\mathrm {fw}\le T_\mathrm {fw}\right) - \Pr \left( \mathcal {B}^{\mathcal {P}}\rightarrow 1 \wedge \mu _\mathrm {fw}\le T_\mathrm {fw}\right) \right| \nonumber \\&+\, \left| \Pr \left( \mathcal {B}^{\mathcal {O}}\rightarrow 1 \wedge \mu _\mathrm {fw}> T_\mathrm {fw}\right) - \Pr \left( \mathcal {B}^{\mathcal {P}}\rightarrow 1 \wedge \mu _\mathrm {fw}> T_\mathrm {fw}\right) \right| \nonumber \\ \le&\left| \Pr \left( \mathcal {B}^{\mathcal {O}}\rightarrow 1 \wedge \mu _\mathrm {fw}\le T_\mathrm {fw}\right) - \Pr \left( \mathcal {B}^{\mathcal {P}}\rightarrow 1 \wedge \mu _\mathrm {fw}\le T_\mathrm {fw}\right) \right| \nonumber \\&+\, \max \Big \{\Pr \left( \mu _\mathrm {fw}> T_\mathrm {fw}\text { for }\mathcal {O}\right) ,\Pr \left( \mu _\mathrm {fw}> T_\mathrm {fw}\text { for }\mathcal {P}\right) \Big \} . \end{aligned}$$
(16)

As we will find out (and explicitly mention) in Sect. 6.4, the bound we will derive on \(\Pr \left( \mu _\mathrm {fw}> T_\mathrm {fw}\right) \) in fact applies to both \(\mathcal {O}\) and \(\mathcal {P}\), and for brevity we denote the maximum of the two probabilities by \(\Pr _{\mathcal {O},\mathcal {P}}\left( \mu _\mathrm {fw}> T_\mathrm {fw}\right) \).

6.2 Entering the Permutation Setup

To come to our simplified setup we define two constructions: the Even-Mansour construction and a “state initialization construction.” The original Even-Mansour construction builds a b-bit block cipher from a b-bit permutation f and takes two b-bit keys \(K_1\) and \(K_2\) [25, 26], and is defined as \(f(x +K_1)+K_2\). We consider a variant, where \(K_1 = K_2 = 0^r ||\kappa \) with \(\kappa \) a c-bit key, and define

$$\begin{aligned} {E}^{f}_{\kappa }(x) = f(x +(0^r ||\kappa ))+(0^r ||\kappa ) . \end{aligned}$$
(17)

The state initialization construction is a dedicated construction of an injective function that maps an \(\mathrm {iv}\) and a key selected from a key array \(\mathbf {K}\) to a b-bit state and that takes a c-bit key \(\kappa \):

$$\begin{aligned} {I}^{f}_{\kappa ,\mathbf {K}}(\delta ,\mathrm {iv}) = f( \mathbf {K}[\delta ] ||\mathrm {iv}) +(0^r ||\kappa ) . \end{aligned}$$
(18)
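In the conventions of the earlier sketches (inner part in the low c bits, “+” as XOR), constructions (17) and (18) can be written as the following Python sketch; \(\kappa \) is a c-bit integer, and the key and \(\mathrm {iv}\) layouts are assumptions of the sketch.

```python
def even_mansour_variant(f, kappa):
    """E^f_kappa of (17): mask the inner part with kappa before and after f."""
    # kappa is a c-bit integer, i.e. the value 0^r || kappa in the sketch layout
    return lambda x: f(x ^ kappa) ^ kappa

def state_initializer(f, b, k, kappa, keys):
    """I^f_{kappa,K} of (18): (delta, iv) -> f(K[delta] || iv) + (0^r || kappa)."""
    def init(delta, iv):
        return f((keys[delta] << (b - k)) | iv) ^ kappa
    return init
```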

Now, let \(\kappa \xleftarrow {{\scriptscriptstyle \$}}\mathbb {Z}_2^c\) be a uniformly random c-bit key. We call \(\kappa \) the inner masking key. Using the idea of bitwise adding the inner masking key twice in-between every two primitive evaluations [2, 21, 38], we obtain that \({\textsc {KD}^{f}_{\mathbf {K}}} = {\textsc {RD}^{{I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }}_{}}\). We thus obtain for (16), leaving the condition \(\mu _\mathrm {fw}\le T_\mathrm {fw}\) implicit:

$$\begin{aligned} \varDelta _{\mathcal {B}}({\textsc {KD}^{f}_{\mathbf {K}}}, {f}\;;\;{\textsc {RD}^{\phi ,\pi }_{}}, {f})&= \varDelta _{\mathcal {B}}({\textsc {RD}^{{I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }}_{}}, {f}\;;\;{\textsc {RD}^{\phi ,\pi }_{}}, {f})\nonumber \\&\le \varDelta _{\mathcal {C}}({I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }, {f}\;;\;\phi ,\pi , {f}) . \end{aligned}$$
(19)

Clearly an adversary \(\mathcal {B}\) can be simulated by an adversary \(\mathcal {C}\) as any construction query can be simulated by queries to the initialization function \(\mathcal {O}_{\mathrm {i}}\) (\({I}^{f}_{\kappa ,\mathbf {K}}\) in the real world and \(\phi \) in the ideal world) and the duplexing function \(\mathcal {O}_{\mathrm {d}}\) (\({E}^{f}_{\kappa }\) in the real world and \(\pi \) in the ideal world). Hence, we can quantify the resources of adversary \(\mathcal {C}\) in terms of the resources of adversary \(\mathcal {B}\), making use of the threshold \(T_\mathrm {fw}\) on the multiplicity (cf., (16)). This conversion will be formally performed in Sect. 6.4.

6.3 Distinguishing Bound for the Permutation Setup

We now bound \(\varDelta _{\mathcal {C}}({I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }, {f}\;;\;\phi ,\pi , {f})\). The permutation setup is illustrated in Fig. 3. The derivation is performed using the H-coefficient technique.

Fig. 3. Permutation setup.

Description of Transcripts. The adversary has access to either \(({I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }, {f})\) or \((\phi ,\pi , {f})\). The queries of the adversary and their responses are assembled in three transcripts \({\tau _\mathrm{f}}, {\tau _\mathrm{d}}\), and \({\tau _\mathrm{i}}\).

\({\tau _\mathrm{f}}=\{(x_j,y_j)\}_{j=1}^N\) :

The queries to f and \(f^{-1}\). The transcript does not record whether the query was \(y = f(x)\) or \(x = f^{-1}(y)\).

\({\tau _\mathrm{i}}= \{(\delta _i,\mathrm {iv}_i,t_i)\}_{i=1}^{q'}\) :

The queries to the initialization function \(\mathcal {O}_{\mathrm {i}}\), \({I}^{f}_{\kappa ,\mathbf {K}}\) in the real world and \(\phi \) in the ideal world.

\({\tau _\mathrm{d}}=\{(s_i,t_i)\}_{i=1}^{M'}\) :

The queries to the duplexing function \(\mathcal {O}_{\mathrm {d}}\), \({E}^{f}_{\kappa }\) in the real world and \(\pi \) in the ideal world.

The resources of \(\mathcal {C}\) are defined by the number of queries in each transcript: N, \(M'\), and \(q'\), as well as \(q_\mathrm {iv}=\max _\alpha \#\{(\delta ,\mathrm {iv},t) \in {\tau _\mathrm{i}}\mid \mathrm {iv}= \alpha \}\). In addition, the resources of \(\mathcal {C}\) are limited via \({\tau _\mathrm{d}}\), for which the forward multiplicity must be at most the threshold \(T_\mathrm {fw}\):

$$\begin{aligned} \max _\alpha \#\{(s_i,t_i) \in {\tau _\mathrm{d}}\mid \bar{s_i} = \alpha \} \le T_\mathrm {fw}. \end{aligned}$$

To ease the analysis, we will disclose the full key array \(\mathbf {K}\) and the inner masking key \(\kappa \) at the end of the experiment (in the ideal world, \(\kappa \) and the elements of \(\mathbf {K}\) will simply be dummy keys). The transcripts are thus of the form \(\tau =(\mathbf {K},\kappa ,{\tau _\mathrm{f}},{\tau _\mathrm{i}},{\tau _\mathrm{d}})\). Note that it is fair to assume that none of the transcripts contains duplicate elements (i.e., the adversary never queries f twice on the same value). Additionally, as we consider attainable transcripts only and \(\phi ,\pi ,f\) are injective mappings, \(\tau \) does not contain collisions.

We define the backward multiplicity as a characteristic of the transcript \(\tau \):

Definition 3

In the permutation setup, the backward multiplicity \(\mu _\mathrm {bw}\) is defined as:

$$\begin{aligned}&\mu _\mathrm {bw}= \max _\alpha \Big ( \#\{(s_i,t_i) \in {\tau _\mathrm{d}}\mid \bar{t}_i = \alpha \} + \#\{(\delta ,\mathrm {iv},t_i) \in {\tau _\mathrm{i}}\mid \bar{t}_i = \alpha \} \Big ) . \end{aligned}$$
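For illustration, the forward multiplicity restriction on \({\tau _\mathrm{d}}\) and the backward multiplicity \(\mu _\mathrm {bw}\) of Definition 3 can be computed from the transcripts as follows; the transcript encoding (states as b-bit integers with the outer part in the top r bits) is again an assumption of the sketch.

```python
from collections import Counter

def multiplicities(tau_d, tau_i, b, r):
    """Forward multiplicity over tau_d and backward multiplicity mu_bw
    (Definition 3).  tau_d is a list of (s, t) pairs and tau_i a list of
    (delta, iv, t) triples, all states being b-bit integers with the
    outer part in the top r bits (sketch conventions).
    """
    c = b - r
    outer = lambda v: v >> c
    fw = Counter(outer(s) for s, _t in tau_d)
    mu_fw = max(fw.values(), default=0)
    bw = Counter(outer(t) for _s, t in tau_d)
    bw.update(outer(t) for _delta, _iv, t in tau_i)
    mu_bw = max(bw.values(), default=0)
    return mu_fw, mu_bw
```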

Definition of Good and Bad Transcripts. In the real world, every tuple in \(({\tau _\mathrm{f}},{\tau _\mathrm{i}},{\tau _\mathrm{d}})\) defines exactly one evaluation of f. We define a transcript \(\tau \) as bad if it contains an input or output collision of f or if the backward multiplicity is above some limit \(T_\mathrm {bw}\). In other words, \(\tau \) is bad if one of the following conditions is satisfied. Input collisions between:

$$\begin{aligned}&{\tau _\mathrm{f}}\text { and } {\tau _\mathrm{i}}: \exists (x,y)\in {\tau _\mathrm{f}}, (\delta ,\mathrm {iv},t) \in {\tau _\mathrm{i}}\text { with } \big (x=\mathbf {K}[\delta ] ||\mathrm {iv}\big ) ; \end{aligned}$$
(20)
$$\begin{aligned}&{\tau _\mathrm{f}}\text { and } {\tau _\mathrm{d}}: \exists (x,y)\in {\tau _\mathrm{f}}, (s,t)\in {\tau _\mathrm{d}}\text { with } \big (x=s+0^r ||\kappa \big ) ; \end{aligned}$$
(21)
$$\begin{aligned}&{\tau _\mathrm{i}}\text { and } {\tau _\mathrm{d}}: \exists (\delta ,\mathrm {iv},t)\in {\tau _\mathrm{i}}, (s',t')\in {\tau _\mathrm{d}}\text { with } \big (\mathbf {K}[\delta ] ||\mathrm {iv}=s'+0^r ||\kappa \big ) ; \end{aligned}$$
(22)
$$\begin{aligned}&\text {within } {\tau _\mathrm{i}}: \exists (\delta ,\mathrm {iv},t), (\delta ',\mathrm {iv}',t')\in {\tau _\mathrm{i}}\text { with } \big (\delta \ne \delta '\big ) \text { AND } \big (\mathbf {K}[\delta ] ||\mathrm {iv}=\mathbf {K}[\delta '] ||\mathrm {iv}'\big ) . \end{aligned}$$
(23)

Output collisions between:

$$\begin{aligned}&{\tau _\mathrm{f}}\text { and } {\tau _\mathrm{i}}: \exists (x,y)\in {\tau _\mathrm{f}}, (\delta ,\mathrm {iv},t) \in {\tau _\mathrm{i}}\text { with } \big (y=t+0^r ||\kappa \big ) ; \end{aligned}$$
(24)
$$\begin{aligned}&{\tau _\mathrm{f}}\text { and } {\tau _\mathrm{d}}: \exists (x,y)\in {\tau _\mathrm{f}}, (s,t)\in {\tau _\mathrm{d}}\text { with } \big (y=t+0^r ||\kappa \big ) ; \end{aligned}$$
(25)
$$\begin{aligned}&{\tau _\mathrm{i}}\text { and } {\tau _\mathrm{d}}: \exists (\delta ,\mathrm {iv},t)\in {\tau _\mathrm{i}}, (s',t')\in {\tau _\mathrm{d}}\text { with } \big (t+0^r ||\kappa =t'+0^r ||\kappa \big ) . \end{aligned}$$
(26)

Finally, \(\tau \) is bad if the backward multiplicity \(\mu _\mathrm {bw}\) is above the threshold \(T_\mathrm {bw}\):

$$\begin{aligned} \mu _\mathrm {bw}> T_\mathrm {bw}. \end{aligned}$$
(27)

Note that output collisions within \({\tau _\mathrm{i}}\) are excluded by attainability of transcripts. Similarly, collisions (input or output) within \({\tau _\mathrm{f}}\) as well as collisions within \({\tau _\mathrm{d}}\) are excluded by attainability of transcripts.

Bounding the H-coefficient Ratio for Good Transcripts. Denote \(\mathcal {O}=({I}^{f}_{\kappa ,\mathbf {K}},{E}^{f}_{\kappa }, {f})\) and \(\mathcal {P}=(\phi ,\pi , {f})\) for brevity. Consider a good transcript \(\tau \in \mathcal {T}_\mathrm {good}\).

In the real world \(\mathcal {O}\), the transcript defines exactly \(q'+M'+N\) input-output pairs of f, so \(\Pr \left( D_{\mathcal {O}}=\tau \right) = 1/(2^b)_{(q'+M'+N)}\). In the ideal world \(\mathcal {P}\), the tuples in \({\tau _\mathrm{f}}\) define exactly N input-output pairs for f, the tuples in \({\tau _\mathrm{i}}\) define exactly \(q'\) input-output pairs for \(\phi \), and the tuples in \({\tau _\mathrm{d}}\) define exactly \(M'\) input-output pairs for \(\pi \). It follows that \(\Pr \left( D_{\mathcal {P}}=\tau \right) = 1/((2^b)_{(N)}(2^b)_{(q')}(2^b)_{(M')})\). We consequently obtain that \(\displaystyle \frac{\Pr \left( D_{\mathcal {O}}=\tau \right) }{\Pr \left( D_{\mathcal {P}}=\tau \right) } \ge 1\).
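In more detail, using that \((2^b)_{(n+m)} = (2^b)_{(n)}\,(2^b-n)_{(m)} \le (2^b)_{(n)}\,(2^b)_{(m)}\) for all \(n,m\ge 0\), the ratio expands to

$$\begin{aligned} \frac{\Pr \left( D_{\mathcal {O}}=\tau \right) }{\Pr \left( D_{\mathcal {P}}=\tau \right) } = \frac{(2^b)_{(N)}\,(2^b)_{(q')}\,(2^b)_{(M')}}{(2^b)_{(q'+M'+N)}} \ge 1 . \end{aligned}$$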

Bounding the Probability of Bad Transcripts in the Ideal World. In the ideal world, \(\kappa \) is generated uniformly at random. The key array \(\mathbf {K}\) is generated according to distribution \(\mathcal {D}_\mathrm{K}\), cf., Sect. 2.1. We will use the min-entropy and maximum collision probability definitions of (6) and (7).

For (20), fix any \((x,y) \in {\tau _\mathrm{f}}\). There are at most \(q_\mathrm {iv}\) tuples in \({\tau _\mathrm{i}}\) with \(\mathrm {iv}\) equal to the last \(b-k\) bits of x. For any of those tuples, the probability that the first \(k\) bits of x are equal to \(\mathbf {K}[\delta ]\) is at most \(2^{-H_{\min }(\mathcal {D}_\mathrm{K})}\), cf., (6). The collision probability is hence at most \(q_\mathrm {iv}N/2^{H_{\min }(\mathcal {D}_\mathrm{K})}\).

For (21), fix any \((x,y) \in {\tau _\mathrm{f}}\). There are at most \(T_\mathrm {fw}\) tuples in \({\tau _\mathrm{d}}\) with \(\overline{x}=\overline{s}\). For any of those tuples, the probability that \(\widehat{x}=\widehat{s} +\kappa \) is \(2^{-c}\). The collision probability is hence at most \(T_\mathrm {fw}N/2^c\).

For (24) or (25), we will assume \(\lnot \)(27). Fix any \((x,y) \in {\tau _\mathrm{f}}\). There are at most \(T_\mathrm {bw}\) tuples in \({\tau _\mathrm{i}}\cup {\tau _\mathrm{d}}\) with \(\overline{y}=\overline{t}\). For any of those tuples, the probability that \(\widehat{y}=\widehat{t}+\kappa \) is \(2^{-c}\). The collision probability is hence at most \(T_\mathrm {bw}N/2^c\).

For (22), fix any \((\delta ,\mathrm {iv},t) \in {\tau _\mathrm{i}}\) and any \((s',t')\in {\tau _\mathrm{d}}\). Any such combination satisfies (22) if \(0^{k}||\mathrm {iv}+s' = \mathbf {K}[\delta ]||0^{b-k} +0^r ||\kappa \). Note that the randomness of \(\mathbf {K}[\delta ]\) may overlap with that of \(\kappa \). If \(k+c\le b\), the two queries satisfy the condition with probability at most \(2^{-(H_{\min }(\mathcal {D}_\mathrm{K})+c)}\), cf., (6). On the other hand, if \(k>b-c\), the first \(b-c\) bits of \(\mathbf {K}[\delta ]\) have a min-entropy of at least \(H_{\min }(\mathcal {D}_\mathrm{K})-(k-(b-c))\). In this case, the two queries satisfy the condition with probability at most

$$\begin{aligned} 2^{-(H_{\min }(\mathcal {D}_\mathrm{K})-(k-(b-c))+c)}=2^{-(H_{\min }(\mathcal {D}_\mathrm{K})+b-k)} . \end{aligned}$$

The collision probability is hence at most \(\frac{M'q'}{2^{H_{\min }(\mathcal {D}_\mathrm{K})+\min \{c,b-k\}}}\), using that \({\tau _\mathrm{i}}\) contains \(q'\) elements and \({\tau _\mathrm{d}}\) contains \(M'\) elements.

For (26), fix any \((\delta ,\mathrm {iv},t)\in {\tau _\mathrm{i}}\) and any \((s',t')\in {\tau _\mathrm{d}}\). As \(\phi \) and \(\pi \) are only evaluated in forward direction, and \(\phi \) is queried at most \(q'\) times, the probability that \(t=t'\) for these two tuples is at most \(1/(2^b-q')\). The collision probability is hence at most \(M'q'/(2^b-q')\).

For (23), a collision of this form implies the existence of two distinct \(\delta ,\delta '\) such that \(\mathbf {K}[\delta ]=\mathbf {K}[\delta ']\). This happens with probability at most \({u\atopwithdelims ()2}/2^{H_{\text {coll}}(\mathcal {D}_\mathrm{K})}\), cf., (7).

The total probability of having a bad transcript is at most:

$$\begin{aligned} \Pr \left( D_{\mathcal {P}}\in \mathcal {T}_\mathrm {bad}\right) \le \frac{T_\mathrm {fw}N}{2^c} + \frac{T_\mathrm {bw}N}{2^c} + {\Pr }_{\mathcal {P}}\left( \mu _\mathrm {bw}> T_\mathrm {bw}\right) + \frac{M'q'}{2^b - q'} + \frac{M'q'}{2^{H_{\min }(\mathcal {D}_\mathrm{K})+\min \{c,b-k\}}} + \frac{q_\mathrm {iv}N}{2^{H_{\min }(\mathcal {D}_\mathrm{K})}} + \frac{{u\atopwithdelims ()2}}{2^{H_{\text {coll}}(\mathcal {D}_\mathrm{K})}} \, . \end{aligned}$$
(28)

As the H-coefficient ratio is at least 1, Eq. (28) is also a bound on the distinguishing advantage.

6.4 Returning to the KD-vs-RD Setup

The resources of \(\mathcal {C}\) can be computed from those of \(\mathcal {B}\) (see Sect. 4.1) in the following way:

  • \(q' \le q\): for every query to \(\mathcal {O}_{\mathrm {i}}\) there must be at least one initialization query.

  • \(M' \le M-q-L\): the subtraction of L is there because queries with repeated paths just give duplicate queries to \(\mathcal {O}_{\mathrm {i}}\), and the q initialization queries do not give queries to \(\mathcal {O}_{\mathrm {d}}\).

The remaining resources have the same meaning for \(\mathcal {B}\) and \(\mathcal {C}\). Filling in these values in Eq. (28) and combining with Eq. (16) yields:

$$\begin{aligned} \varDelta _{\mathcal {B}}(\mathcal {O}\;;\;\mathcal {P}) \le&\left( \frac{T_\mathrm {fw}N}{2^c} + {\Pr }_{\mathcal {O},\mathcal {P}}\left( \mu _\mathrm {fw}> T_\mathrm {fw}\right) \right) \end{aligned}$$
(29a)
$$\begin{aligned}&+\,\left( \frac{T_\mathrm {bw}N}{2^c} + {\Pr }_{\mathcal {P}}\left( \mu _\mathrm {bw}> T_\mathrm {bw}\right) \right) \end{aligned}$$
(29b)
$$\begin{aligned}&+\,\frac{ (M-q-L)q}{2^b - q} + \frac{ (M-q-L)q}{2^{H_{\min }(\mathcal {D}_\mathrm{K})+\min \{c,b-k\}}} \end{aligned}$$
(29c)
$$\begin{aligned}&+\,\frac{ q_\mathrm {iv}N}{2^{H_{\min }(\mathcal {D}_\mathrm{K})}} + \frac{ {u\atopwithdelims ()2}}{2^{H_{\text {coll}}(\mathcal {D}_\mathrm{K})}} \,. \end{aligned}$$
(29d)

Clearly \(\mu _\mathrm {fw}\le M-q-L\) and \(\mu _\mathrm {bw}\le M-L\). So by taking \(T_\mathrm {fw}= T_\mathrm {bw}= M-L\), lines (29a)–(29b) reduce to \(2(M-L)N/2^c\). However, much better bounds can be obtained by carefully tuning \(T_\mathrm {fw}\) and \(T_\mathrm {bw}\).

Although the probabilities on \(\mu _\mathrm {fw}\) and \(\mu _\mathrm {bw}\) are defined differently (the former in the KD-vs-RD setup, the latter in the permutation setup), they are in essence highly related and we can rely on the multicollision limit function of Definition 2 for their analysis. There is one caveat. Definition 2 considers balls thrown uniformly at random into the \(2^r\) bins, hence a bin is hit with probability \(1/2^r\). In Lemma 6 in the upcoming Sect. 6.6, we will prove that for non-uniform bin allocation where the probability that a ball hits any particular bin is upper bounded by \(y2^{-r}\), the multicollision limit function is at most \(\nu _{r,c}^{yM}\). In our case the states are generated from a set of size at least \(2^b-M-N\) (for both \(\mathcal {O}\) and \(\mathcal {P}\)), and thus their outer part lands in any particular bin with probability at most \(2^c/(2^b-M-N)\). Using the fact that \(\nu _{r,c}^{M}\) is monotonically increasing in M and that \(2^c/(2^b-M-N) \le 2\cdot 2^{-r}\) since \(M+N\le 2^{b-1}\), we upper bound the multicollision limit function by \(\nu _{r,c}^{2(M-L)}\).

We first look at (29b) and treat \(\mu _\mathrm {bw}\). As it is a metric of the responses to queries to \(\pi \) and \(\phi \), it is a stochastic variable. It corresponds to the multicollision limit function of Definition 2, where \(M-L\) balls are distributed over \(2^r\) bins, and each bin is hit with probability at most \(2/2^r\). Using the above observation, we take \(T_\mathrm {bw}= \nu _{r,c}^{2(M-L)}\), and (29b) becomes

$$\begin{aligned} \frac{\nu _{r,c}^{2(M-L)} N}{2^c} + \frac{\nu _{r,c}^{2(M-L)}}{2^c} = \frac{\nu _{r,c}^{2(M-L)} (N+1)}{2^c} . \end{aligned}$$

The case of \(\mu _\mathrm {fw}\) in (29a) is slightly more complex. As discussed in Sect. 4.1, the adversary can enforce the outer part \(Z_j +\overline{\sigma }_j\) to match a value \(\alpha \) in case \(\mathsf {Path}_j\) is a repeating path. Moreover, for queries with \(\text {flag}= \text {true}\), it can also enforce the outer part to any chosen value. These total to \(L + \varOmega \) queries. For the remaining queries, whose number we upper bound by \(M-L\) for simplicity, the adversary has no control over the outer part. Therefore, if we take \(T_\mathrm {fw}= L + \varOmega + \nu _{r,c}^{2(M-L)}\), we have \(\Pr \left( \mu _\mathrm {fw}> T_\mathrm {fw}\right) \le \frac{\nu _{r,c}^{2(M-L)}}{2^c}\). Namely, this is the probability that among the (at most) \(M-L\) queries where the adversary has no control over the outer part, the multiplicity is above \(\nu _{r,c}^{2(M-L)}\), assuming that the \(L + \varOmega \) queries are manipulated to hit the same outer value as those \(\nu _{r,c}^{2(M-L)}\) queries. Eq. (29a) now becomes:

$$\begin{aligned} \frac{(L + \varOmega + \nu _{r,c}^{2(M-L)})N}{2^c} + \frac{\nu _{r,c}^{2(M-L)}}{2^c} = \frac{(L + \varOmega )N}{2^c} + \frac{\nu _{r,c}^{2(M-L)}(N+1)}{2^c} . \end{aligned}$$

Plugging these two bounds into (29a)–(29b) yields the bound of Lemma 3.

Fig. 4. Stairway to Heaven graph: \(\nu _{r,c}^{M}\) computed with (33) for \(r+c = 256\), with upper bounds and asymptote for \(M \rightarrow \infty \).

6.5 Bounds on \(\nu _{r,c}^{M}\)

We will upper bound \(\nu _{r,c}^{M}\) by approximating the term \(\Pr (\mu > x)\) in Definition 2 by simpler expressions that are strictly larger.

In Definition 2, \(\mu \) is the maximum of the number of balls over all \(2^r\) bins. If we model the number of balls in a particular bin as a stochastic variable \(X_i\) with distribution function \(D_i(x) = \Pr (X_i \le x)\), and if these variables were independent, the distribution function of the maximum over all bins would be the product of the individual distribution functions: \(D_{\max }(x) = \prod _i D_i(x)\). Assuming in addition that all variables have the same distribution, we obtain:

$$\begin{aligned} \Pr (\mu > x) = 1 - \Pr (\mu \le x)&= 1 - \left( \Pr (X \le x) \right) ^{2^r} . \end{aligned}$$
(30)

The distributions of interest here are those of the number of balls in a bin, and they are not independent as the ball counts must sum to M. This means that if one bin contains many balls, the other bins contain somewhat fewer balls than they would if the counts were independent. As a result, the value obtained by taking the product of the factors \(\Pr (X \le x)\) slightly underestimates the probability \(\Pr (\mu \le x)\). Using the inequality \((1-\epsilon )^y \ge 1 - \epsilon y\), we obtain

$$\begin{aligned} \left( \Pr (X \le x) \right) ^{2^r} = \left( 1 - \Pr (X> x) \right) ^{2^r} \ge 1 - 2^r \Pr (X > x), \end{aligned}$$

and we obtain for (30):

$$\begin{aligned} \Pr (\mu> x) < 2^r \Pr (X > x) \, . \end{aligned}$$
(31)

We will now upper bound \(\Pr (X > x)\). The number of balls in any particular bin has a binomial distribution. If the number of bins \(2^r\) and the total number of balls M are large enough, for \(x > \lambda \) this is (tightly) upper bounded by a Poisson distribution with \(\lambda = M2^{-r}\). The probability that a Poisson-distributed variable X is larger than x satisfies:

$$\begin{aligned} \Pr (X > x) \le \sum _{i \ge x} \frac{e^{-\lambda } \lambda ^i}{i!} = \frac{e^{-\lambda } \lambda ^x}{x!} \sum _{i \ge 0} \frac{\lambda ^i}{(i+x)_{(i)}} < \frac{e^{-\lambda } \lambda ^x}{x!} \sum _{i \ge 0} \frac{\lambda ^i}{x^i} = \frac{x e^{-\lambda } \lambda ^x}{(x - \lambda )x!} \, . \end{aligned}$$

This yields for (31):

$$\begin{aligned} \Pr (\mu > x) < 2^r \frac{x e^{-\lambda } \lambda ^x}{(x - \lambda )x!} . \end{aligned}$$

From Definition 2, we obtain that \(\nu _{r,c}^{M}\) is upper bounded by the smallest value x that satisfies

$$\begin{aligned} \frac{2^b e^{-\lambda } \lambda ^x}{(x - \lambda )x!} \le 1 , \end{aligned}$$
(32)

with \(\lambda = M2^{-r}\). Remarkably, the dependence of \(\nu _{r,c}^{M}\) on r, c and M is only via \(b = r+c\) and \(\lambda = M2^{-r}\). Hence, it is a function in two variables b and \(\lambda \) rather than three. Taking the logarithm of (32), applying Stirling's approximation (\(\ln (x!)\ge \frac{1}{2}\ln (2\pi x) + x(\ln (x)-1)\)) and rearranging the terms gives:

$$\begin{aligned} x \left( \ln (x) - \ln (\lambda ) - 1 \right) + \ln (x-\lambda ) + \frac{1}{2}\ln (2\pi x) + \lambda \ge \ln (2)b . \end{aligned}$$
(33)
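As an aside, the smallest x satisfying (32) is easy to find numerically by evaluating (32) in logarithmic form; the sketch below does so, with math.lgamma standing in for \(\ln (x!)\). It is only meant to illustrate how values such as those in Fig. 4 can be recomputed, not as part of the proof.

```python
import math

def nu_upper_bound(b: int, lam: float) -> int:
    """Smallest integer x > lam with 2^b e^{-lam} lam^x / ((x - lam) x!) <= 1, cf. (32)."""
    x = math.floor(lam) + 1                      # need x > lam so that ln(x - lam) is defined
    while True:
        log_lhs = (b * math.log(2) - lam + x * math.log(lam)
                   - math.log(x - lam) - math.lgamma(x + 1))
        if log_lhs <= 0:                         # (32) holds, so x upper bounds nu_{r,c}^{M}
            return x
        x += 1

# Example: b = r + c = 256 and lam = M / 2^r = 1, i.e. M = 2^r
print(nu_upper_bound(256, 1.0))
```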

We will now derive expressions from (32) and (33) that give insight into the behavior of this function for the full range of \(\lambda \).

Case \(\varvec{\lambda }<\) 1. Consider Eq. (33) with the value of x given, and look for the maximum value of \(\lambda \) such that it holds. This gives the value of \(\lambda \) where \(\nu _{r,c}^{M}\) transitions from \(x-1\) to x. We can now prove the following lemma.

Lemma 4

The value of \(\lambda \) where \(\nu _{r,c}^{M}\) transitions from \(x\,-\,1\) to x is lower bounded by \(2^{-b/x}\).

Proof

We need to prove that for \(\lambda =2^{-b/x}\), inequality (33) holds:

$$\begin{aligned} x(\ln (x)-1) + \ln (x-2^{-b/x}) + \frac{1}{2}\ln (2\pi x) + 2^{-b/x} \ge 0 . \end{aligned}$$

For \(x > e\) all terms in the left-hand side of this inequality are positive and hence the inequality is satisfied. The only other relevant value is \(x=2\), and it can be verified by hand that the inequality is satisfied for all b.    \(\square \)

If we substitute \(\lambda \) by \(M2^{-r}\), this gives bounds on M for which \(\nu _{r,c}^{M}\) achieves a certain value. If we denote by \(M_x\) the value where \(\nu _{r,c}^{M}\) transitions from \(x-1\) to x, we have \(M_x \ge 2^{r-b/x} = 2^{((x-1)r-c)/x}\). In particular \(M_2 \ge 2^{(r-c)/2}\). It follows that \(\nu _{r,c}^{M}\) is 1 for \(M \le 2^{(r-c)/2}\). Clearly, M must be an integer value, so \(\nu _{r,c}^{M}\) may already be above 1 for \(M=2\) if \(r < c+2\).
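As a concrete illustration of this bound, with \(r=c=128\) (so \(b=256\)) it gives \(M_2 \ge 2^{0}=1\), \(M_4 \ge 2^{64}\), and \(M_8 \ge 2^{96}\).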

Case \({\varvec{\lambda }}\) = 1. Equation (33) for \(\lambda = 1\) reads

$$\begin{aligned} x \left( \ln (x) - 1 \right) + \ln (x-1) + \frac{1}{2}\ln (2\pi x) + 1 \ge \ln (2)b , \end{aligned}$$

and \(\nu _{r,c}^{M}\) is upper bounded by the smallest x such that this inequality holds, or equivalently, such that

$$\begin{aligned} x \ge \frac{\ln (2)b - 1 - \ln (x-1) - \frac{1}{2}\ln (2\pi x)}{\ln (x) - 1 } . \end{aligned}$$

The right hand side of this equation is upper bounded by \(\frac{\ln (2)b}{\ln (x) - 1 }\). Therefore, \(\nu _{r,c}^{M}\) is certainly upper bounded by the smallest x such that

$$\begin{aligned} x \ge \frac{\ln (2)b}{\ln (x) - 1 } . \end{aligned}$$

This expression can be efficiently evaluated for all values of b, and it turns out that the value of \(\nu _{r,c}^{2^r}\) ranges from about b / 4 for values of b close to 200 to about b / 6 for values of b close to 2000.
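For concreteness, this smallest x can be found with a simple upward scan; the following sketch (our own, purely illustrative) performs that evaluation.

```python
import math

def nu_bound_lambda_one(b: int) -> int:
    """Smallest x with x >= ln(2)*b / (ln(x) - 1); an upper bound on nu_{r,c}^{2^r}."""
    x = 3                                        # ln(x) - 1 must be positive, so start at x = 3
    while x < math.log(2) * b / (math.log(x) - 1):
        x += 1
    return x

print(nu_bound_lambda_one(256), nu_bound_lambda_one(2048))
```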

Case \({\varvec{\lambda }}>\) 1. For large \(\lambda \), Eq. (33) becomes numerically unstable. We derive a formula for integer values of \(\lambda \), or equivalently values of M that are a multiple of \(2^r\) (w.l.o.g.). By a change of variable from x to \(x = \lambda + y\) we obtain for the left hand side of (32):

$$\begin{aligned} \frac{2^b e^{-\lambda } \lambda ^x}{(x - \lambda )x!} = \frac{2^b e^{-\lambda } \lambda ^{\lambda +y}}{y(\lambda +y)!} = \frac{2^b \lambda ^y }{y (\lambda +y)_y} \frac{(\lambda /e)^{\lambda }}{\lambda !} \le \frac{2^b \lambda ^y }{y \sqrt{2\pi \lambda } (\lambda +y)_y} \end{aligned}$$

using Stirling’s approximation. Now (32) holds provided that

$$\begin{aligned} \frac{2^b \lambda ^y }{y \sqrt{2\pi \lambda } (\lambda +y)_y} = \frac{2^b}{y \sqrt{2\pi \lambda } \prod _{i=1}^{y} (1 + \frac{i}{\lambda })} \le 1 . \end{aligned}$$

Taking the logarithm:

$$\begin{aligned} \sum _{i=1}^{y} \ln \left( 1 + \frac{i}{\lambda }\right) + \ln (y) + \frac{1}{2}\ln (2\pi \lambda ) \ge \ln (2)b . \end{aligned}$$
(34)

This equation allows us to efficiently compute \(\nu _{r,c}^{M}\) for \(M>2^r\) and also to prove a simple upper bound for the range \(\lambda > 1\).
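The following sketch evaluates (34) directly: it returns \(\lambda + y\) for the smallest y satisfying (34), which upper bounds \(\nu _{r,c}^{M}\) for \(M = \lambda 2^r\) with \(\lambda \) a positive integer. Again, this is illustrative only.

```python
import math

def nu_upper_bound_int_lambda(b: int, lam: int) -> int:
    """For lam = M / 2^r a positive integer, return lam + y for the smallest y satisfying (34)."""
    target = math.log(2) * b
    const = 0.5 * math.log(2 * math.pi * lam)
    acc, y = 0.0, 0
    while True:
        y += 1
        acc += math.log(1 + y / lam)             # running sum over i = 1..y of ln(1 + i/lam)
        if acc + math.log(y) + const >= target:
            return lam + y

# Example: b = 256 and lam = 4, i.e. M = 4 * 2^r; the result can be compared
# against the bound of the upcoming Lemma 5 as a sanity check.
print(nu_upper_bound_int_lambda(256, 4))
```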

Lemma 5

For M a nonzero integer multiple of \(2^r\), we have

$$\begin{aligned} \nu _{r,c}^{M} \le \frac{M}{2^r} + \nu _{r,c}^{2^r} \left\lceil \sqrt{\frac{M}{2^r}} \right\rceil \, . \end{aligned}$$

Proof

First of all, note that for \(\lambda =1\), (34) is satisfied for \(y = \nu _{r,c}^{2^r}-1\). Therefore, we have

$$\begin{aligned} \Xi := \sum _{i=1}^{\nu _{r,c}^{2^r} - 1} \ln (1 + i) + \ln (\nu _{r,c}^{2^r}-1) + \frac{1}{2}\ln (2\pi ) - \ln (2)b \ge 0 . \end{aligned}$$

Our goal is to prove that (34) holds for \(y = \nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil \). Since \(\Xi \ge 0\), we will in fact prove that

$$\begin{aligned} \sum _{i=1}^{\nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i}{\lambda }\right) + \ln (\nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil ) + \frac{1}{2}\ln (2\pi \lambda ) - \ln (2)b \ge \Xi . \end{aligned}$$

Note that

$$\begin{aligned}&\;\sum _{i=1}^{\nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i}{\lambda }\right) + \ln (\nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil ) + \frac{1}{2}\ln (2\pi \lambda ) - \ln (2)b - \Xi \\ \ge&\; \sum _{i=1}^{\nu _{r,c}^{2^r} \big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i}{\lambda }\right) - \sum _{i=1}^{\nu _{r,c}^{2^r} - 1} \ln (1 + i) . \end{aligned}$$

This can be rewritten as

$$\begin{aligned} \sum _{i=0}^{\nu _{r,c}^{2^r}-1} \left( \sum _{j=1}^{\big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i \big \lceil \sqrt{\lambda } \big \rceil + j}{\lambda } \right) - \ln (1 + i) \right) , \end{aligned}$$

and our claim holds if we can prove that the summand is at least 0 for all \(i=0,\ldots ,\nu _{r,c}^{2^r}-1\). This is easily verified as

$$\begin{aligned} \sum _{j=1}^{\big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i \big \lceil \sqrt{\lambda } \big \rceil + j}{\lambda } \right)&\ge \sum _{j=1}^{\big \lceil \sqrt{\lambda } \big \rceil } \ln \left( 1 + \frac{i \big \lceil \sqrt{\lambda } \big \rceil }{\lambda } \right) = \ln \left( \left( 1 + \frac{i \big \lceil \sqrt{\lambda } \big \rceil }{\lambda } \right) ^{\big \lceil \sqrt{\lambda } \big \rceil }\right) , \end{aligned}$$

which is at least

$$\begin{aligned} \ln \left( 1 + \frac{i \big \lceil \sqrt{\lambda } \big \rceil ^2}{\lambda } \right) \ge \ln (1+i), \end{aligned}$$

as in general \((1+x)^y \ge 1 + xy\).    \(\square \)

Clearly, for large M, \(\nu _{r,c}^{M}\) asymptotically converges to \(M/2^r\).

6.6 Dealing with Non-uniform Sampling

In this section we address the non-uniform balls-and-bins problem. We consider the balls-and-bins problem for some values r and c where the probability that a ball hits a particular bin (of the \(2^r\) bins) is not \(2^{-r}\). In other words, the distribution is not uniform. In general the probability distribution for the n-th ball depends on how the previous \(n-1\) balls were distributed. We denote this distribution by D and define \(D(i \mid s)\) as the probability that a ball falls in bin i given the sequence s of bins in which the previous \(n-1\) balls fell. We denote by \(\nu _{r,c}^{D,M}\) the variant of the multicollision limit function for this distribution D.

Definition 4

The multicollision limit function for some distribution D, \(\nu _{r,c}^{D,M}\), with M a natural number, returns a natural number and is defined as follows. Assume we distribute M balls in \(2^r\) bins according to the distribution D. If we call the number of balls in the bin with the highest number of balls \(\mu \), then \(\nu _{r,c}^{D,M}\) is defined as the smallest natural number x that satisfies:

$$\begin{aligned} \Pr \left( \mu > x\right) < \frac{x}{2^{c}} . \end{aligned}$$
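To make the quantity \(\mu \) concrete, here is a small Monte Carlo sketch for toy parameters; uniform throws correspond to Definition 2, and passing a bias function models a history-dependent distribution D as in Definition 4. The parameters and the estimator are hypothetical and far too small to be cryptographically meaningful.

```python
import random

def max_bin_load(num_balls: int, r: int, bias=None) -> int:
    """Throw num_balls balls into 2^r bins and return the maximum load mu."""
    bins = [0] * (1 << r)
    history = []
    for _ in range(num_balls):
        if bias is None:                         # uniform case, as in Definition 2
            i = random.randrange(1 << r)
        else:                                    # history-dependent distribution D(i | s), as in Definition 4
            weights = [bias(j, history) for j in range(1 << r)]
            i = random.choices(range(1 << r), weights=weights)[0]
        bins[i] += 1
        history.append(i)
    return max(bins)

# Crude estimate of Pr(mu > x) for 2^8 bins, M = 2^8 balls, x = 6, uniform throws
trials = 2000
print(sum(max_bin_load(256, 8) > 6 for _ in range(trials)) / trials)
```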

We can now prove the following lemma.

Lemma 6

If for every bin, according to the distribution D the probability for a ball to end up in that bin satisfies \(|D(i \mid s)-2^{-r}|\le y2^{-r}\) for some \(y\le 0.1\) and any i and s, then \(\nu _{r,c}^{D,M} \le \nu _{r,c}^{2M}\), provided \(M\le y2^c\) and \(r\ge 5\).

Before proving Lemma 6, note that in our application of the lemma in Sect. 6.4, the ith ball hits a certain bin with probability

$$\begin{aligned} \frac{2^c-(i-1)}{2^b-(i-1)} \le p \le \frac{2^c}{2^b-(i-1)} . \end{aligned}$$

Assuming that \(i-1\le y2^c\) and \(y\le 0.1\), we obtain that \((1-y)\cdot 2^{-r}\le p \le (1+y)\cdot 2^{-r}\), and that the condition imposed by Lemma 6 is satisfied. As in our setup there are in total \(M+N\) queries to f, this is satisfied if \(M+N \le 0.1 \times 2^c\).

Proof

Consider the following two experiments:

  • Experiment 1: we drop 2M balls into \(2^r\) bins and the distribution is uniform.

  • Experiment 2: we drop M balls into \(2^r\) bins and the probability for a ball to land in any particular bin is between \((1-y)\cdot 2^{-r}\) and \((1+y)\cdot 2^{-r}\).

We need to prove that \(\nu _{r,c}^{2M}\) of the first experiment is at least \(\nu _{r,c}^{D,M}\) of the second experiment. The general strategy is as follows. First, we prove that \(\nu _{r,c}^{2M}\) is lower bounded by some threshold t. Then, if for all \(x \ge t\), we have \(\Pr \left( \mu ^{\text {exp 1}}> x\right) \ge \Pr \left( \mu ^{\text {exp 2}} > x\right) \), then \(\nu _{r,c}^{D,M} \le \nu _{r,c}^{2M}\) because \(x=\nu _{r,c}^{2M}\) satisfies the equation \(\Pr \left( \mu ^{\text {exp 2}} > x\right) < \frac{x}{2^c}\). Clearly, the condition above is satisfied if for all \(x \ge t\) and for all bins i, we have \(\Pr \left( X_i^{\text {exp 1}}> x\right) \ge \Pr \left( X_i^{\text {exp 2}} > x\right) \), where \(X_i\) is the number of balls in bin i. And in turn, it is satisfied if for all \(x \ge t\) and for all bins i, we have \(\Pr \left( X_i^{\text {exp 1}} = x\right) \ge \Pr \left( X_i^{\text {exp 2}} = x\right) \).

First, by the pigeonhole principle, in experiment 1 there is always a bin with at least \(\max \{2M/2^r,1\}\) balls, and \(\nu _{r,c}^{2M}\) is at least this value: \(\nu _{r,c}^{2M}\ge \max \{2M/2^r,1\}\). Then, consider any bin and the probability that it contains exactly x balls. In experiment 1, the bin contains exactly x balls if in the sequence of 2M balls, x balls fall into the particular bin and \(2M-x\) fall into other bins, and thus:

$$\begin{aligned} \Pr \left( X_i^{\text {exp 1}} = x\right) = {2M\atopwithdelims ()x}\cdot (2^{-r})^x\cdot (1-2^{-r})^{2M-x} . \end{aligned}$$

For experiment 2 we likewise obtain, using the fact that the ith ball ends in the bin with probability \((1-y)\cdot 2^{-r}\le p \le (1+y)\cdot 2^{-r}\) for any i:

$$\begin{aligned} \Pr \left( X_i^{\text {exp 2}} = x\right) \le {M\atopwithdelims ()x}\cdot ((1+y)\cdot 2^{-r})^x\cdot (1-(1-y)\cdot 2^{-r})^{M-x} . \end{aligned}$$

Using that \({2M\atopwithdelims ()x}/{M\atopwithdelims ()x} \ge 2^x\) and \((1-2^{-r})^2\ge 1-2\cdot 2^{-r}\), the condition certainly holds if

$$\begin{aligned} \left( \frac{2}{1+y}\frac{1-(1-y)\cdot 2^{-r}}{1-2^{-r}}\right) ^x \ge \left( \frac{1-(1-y)\cdot 2^{-r}}{1-2\cdot 2^{-r}}\right) ^M , \end{aligned}$$

which in turn certainly holds if

$$\begin{aligned} \left( \frac{2}{1+y}\right) ^x \ge \left( 1 + \frac{1+y}{2^r-2}\right) ^M , \end{aligned}$$

which in turn certainly holds if

$$\begin{aligned} \left( \frac{2}{1+y}\right) ^x \ge \left( 1 + \frac{1+y}{2^r-2}\right) ^{\max \{M,2^{r-1}\}} . \end{aligned}$$
(35)

We have to prove that this condition holds for all \(x\ge \max \{2M/2^r,1\}\). The left hand side is increasing in x, whereas the right hand side is constant in x, and we therefore only have to prove it for \(x=\max \{2M/2^r,1\}\) (w.l.o.g., assuming that x is integral). Therefore, our goal now is to prove that

$$\begin{aligned} \frac{2}{1+y} \ge \left( 1 + \frac{1+y}{2^r-2}\right) ^{2^{r-1}} . \end{aligned}$$

Using that \(1+a\le e^a\) and \(r\ge 5\), the condition is satisfied if

$$\begin{aligned} \frac{2}{1+y} \ge e^{(1+y)\frac{16}{30}} , \end{aligned}$$

which in turn holds for \(y\le 0.1\).    \(\square \)