
1 Introduction

The design of block ciphers is a well-researched area. An overwhelming majority of modern block ciphers fall into one of two categories: Substitution-Permutation Networks (SPN) and Feistel Networks (FN). Examples of those two structures are the block ciphers standardized by the American National Institute of Standards and Technology, respectively the AES [1] and the DES [2]. However, since block ciphers are simply keyed permutations, the same design strategies can be applied to the design of so-called S-Boxes.

S-Boxes are “small” functions, usually operating on at most 8 bits of data, which are used to ensure a high non-linearity. For instance, both the DES and the AES use S-Boxes, respectively mapping 6 bits to 4 and permuting 8 bits. If a bijective S-Box is needed, it can be built like a small unkeyed block cipher. For example, the S-Box of Khazad [3] is a 3-round SPN and the S-Box of Zorro [4] is a 3-round FN. These \(8 \times 8\)-bit S-Boxes are built from smaller \(4 \times 4\) ones to reduce memory requirements.

Keeping the design process of an S-Box secret might be necessary in the context of white-box cryptography, as described e.g. in [5]. In that paper, Biryukov et al. describe a memory-hard white-box encryption scheme relying on an SPN with large S-Boxes built like smaller SPNs. Their security claim relies on the fact that an adversary cannot decompose these S-Boxes into their different linear and non-linear layers. Such memory-hard white-box implementations may also be used to design proofs-of-work such that one party has an advantage over the others. Knowing the decomposition of a memory-hard function would effectively allow a party to bypass this memory-hardness. Such functions can have many use cases including password hashing and crypto-currency design.

Decomposing SPNs into their components is possible for up to 3 S-Box layers when the S-Boxes are small using the multi-set attack on SASAS described in [6]. A more general strategy for reverse-engineering of unknown S-Boxes was proposed recently in [7]. Our work pursues the same line of research but targets FN specifically: what is the complexity of recovering all Feistel functions of an R-round FN? Our results are different depending on whether the Feistel Network attacked uses an exclusive-or (\(\oplus \)) or a modular addition (\(\boxplus \)). Thus, we refer to a Feistel Network using XOR as a \(\oplus \)-Feistel and to one based on modular addition as a \(\boxplus \)-Feistel.

This work also has implications for the analysis of format-preserving encryption schemes, which are designed to encrypt data with a small plaintext set, for instance credit-card numbers (16-digit decimal numbers). In particular, the BPS [8] and FFX [9] constructions are Feistel schemes with small blocks; BPS uses 8 rounds, while FFX uses between 12 and 36 rounds (with more rounds for smaller domains). When these schemes are instantiated with small block sizes, recovering the full round functions might be easier than recovering the master key and provides an equivalent key.

Previous Work. Lampe et al. [10], followed by Dinur et al. [11], studied Feistel Networks where the Feistel function at round i consists in \(x \mapsto F_{i}(x \oplus k_{i})\), with \(F_{i}\) being public but \(k_{i}\) being kept secret. If the subkeys are independent then it is possible to recover all of them for a 5-round (respectively 7-round) FN in time \(\text {O}(2^{2n})\) (resp. \(\text {O}(2^{3n})\)) using only 4 known plaintexts with the optimised Meet-in-the-Middle attack described in [11]. However, we consider the much more complex case where the Feistel functions are completely unknown.

A first theoretical analysis of the Feistel structure and the first generic attacks were proposed in the seminal paper by Luby and Rackoff [12]. Since then, several cryptanalyses have been identified with the aim to either distinguish a Feistel Network from a random permutation or to recover the Feistel functions. Differential distinguishers against up to 5 rounds in the usual setting and 6 rounds in a multi-key setting are presented in [13], although they assume that the Feistel functions are random functions and thus have inner-collisions. Conversely, an impossible differential covering 5 rounds in the case where the Feistel functions are permutations is described in [14] and used to attack DEAL, a block cipher based on a 6-round Feistel Network. Finally, a method relying on a SAT-solver was recently shown in [7]. It is capable of decomposing Feistel Networks with up to \(n=7\) in at most a couple of hours. How this time scales for larger n is unclear. These attacks, their limitations and their efficiency are summarized in Table 1. A short description of some of them is given in Sect. 2 for the sake of completeness.

Table 1. Generic attacks against Feistel Networks.

Our Contribution. We present attacks against generic 5-round Feistel Networks which recover all Feistel functions efficiently instead of only distinguishing them from random. Furthermore, unlike distinguishers from the literature, our attacks do not make any assumptions about whether the Feistel functions are bijective or not. Our attack against \(\oplus \)-Feistel uses the yoyo game, a tool introduced in [15] which we improve by providing a more general theoretical framework for it and leveraging particular cycle structures to diminish its cost. The principle of the yoyo game is introduced in Sect. 3 and how to use cycles to improve it is described in Sect. 4. We also present an optimized guess-and-determine attack which, unlike yoyo cryptanalysis, works against \(\boxplus \)-Feistel. It exploits a boomerang-like property related to the one used in our yoyo game to quickly explore the implications of the guess of an S-Box entry, see Sect. 5. Finally, an integral attack is given in Sect. 6.

We note that several of our attacks have a double exponential complexity, and can only be used in practice for small values of n, as used for 8-bit S-Boxes (\(n=4\)) or format-preserving encryption with a small domain.

Notation. We introduce some notation for the different states during encryption (see Fig. 1). Each of the values is assigned a letter, e.g. the left side of the input is in position “A”. When we look at 5-round Feistel Networks, the input is fed in positions A and B and the output is read in GF. For 6 rounds, the input is the same but the output is read in HG with \(H = S_{5}(G) + F\). If we study a \(\boxplus \)-Feistel then “\(+\)” denotes modular addition (\(\boxplus \)); it denotes exclusive-or (\(\oplus \)) if we attack a \(\oplus \)-Feistel. Concatenation is denoted “||” and encryption is denoted \(\mathcal {E}\) (the number of rounds being clear from the context). For example, \(\mathcal {E}(a||b) = g||f\) for a 5-round Feistel Network. The bit-length of a branch of the Feistel Network is equal to n.

In addition we remark that, for an R-round Feistel, we can fix one entry of the last \(R-2\) Feistel functions (or the first \(R-2\) ones) arbitrarily. For example, the output of the 5-round Feistel Network described in Fig. 2 does not depend on \(\alpha _{0}\), \(\alpha _{1}\) or \(\alpha _{2}\).
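The notation above can be made concrete with a minimal Python sketch of an R-round Feistel Network on two n-bit branches. The function name `make_feistel`, the `op` parameter and the random test instance are assumptions introduced for illustration only; the round functions are stored as lookup tables and the output is read newest value first, matching the G||F (5 rounds) and H||G (6 rounds) convention.

```python
import random

def make_feistel(sboxes, n, op="xor"):
    """Return (encrypt, decrypt) for a Feistel Network using tables `sboxes`."""
    mask = (1 << n) - 1
    if op == "xor":
        add = sub = lambda x, y: x ^ y
    else:  # modular addition on n-bit branches
        add = lambda x, y: (x + y) & mask
        sub = lambda x, y: (x - y) & mask

    def encrypt(a, b):
        left, right = a, b
        for s in sboxes:
            # Round i maps (left, right) to (right, left + S_i(right)).
            left, right = right, add(left, s[right])
        return right, left  # newest value first: (g, f) for 5 rounds

    def decrypt(g, f):
        right, left = g, f
        for s in reversed(sboxes):
            left, right = sub(right, s[left]), left
        return left, right

    return encrypt, decrypt

# Hypothetical 5-round instance with n = 4 and random Feistel functions.
rng = random.Random(0)
n = 4
sboxes = [[rng.randrange(1 << n) for _ in range(1 << n)] for _ in range(5)]
encrypt, decrypt = make_feistel(sboxes, n, op="xor")
g, f = encrypt(3, 7)
```

Passing `op="add"` gives the corresponding \(\boxplus \)-Feistel with the same tables.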

Fig. 1. Internal state notation.

Fig. 2. Equivalent Feistel Networks.

2 Previous Attacks Against 5- and 6-Round Feistel Networks

2.1 Differential Distinguishers

In [13], Patarin shows a differential distinguisher against 5-round Feistel Networks. However, it only works if the Feistel functions have inner-collisions. It is based on the following observation. Let \((g_{i}||f_{i})\) be the image of \((a_{i}||b_{i})\) by a permutation and let \(b_{i}\) be constant. Then for \(i \ne j\), such that \(f_{i} = f_{j}\), count how many times \(a_{i} \oplus a_{j} = g_{i} \oplus g_{j}\). This number is roughly twice as high for a 5-round Feistel Network as for a random permutation.

In the same paper, Patarin suggests two distinguishers against 6-round \(\oplus \)-Feistel Networks. However, these do not target a single permutation but a generator of permutations. This can be interpreted as a multi-key attack: the attacker has black-box access to several permutations, either none or all of which are 6-round \(\oplus \)-Feistel Networks. The first attack uses the fact that the signature of a \(\oplus \)-Feistel Network is always even. The second attack exploits a statistical bias too weak to be reliably observable using one codebook but usable when several permutations are available. It works by counting all quadruples of encryptions \((a_{i} || b_{i}) \rightarrow (g_{i} || h_{i})\), \(i=1..4\) satisfying this system:

$$\begin{aligned} {\left\{ \begin{array}{ll} b_{1} = b_{3}, ~b_{2} = b_{4} \\ g_{1} = g_{2}, ~g_{3} = g_{4} \\ a_{1} \oplus a_{3} = a_{2} \oplus a_{4} = g_{1} \oplus g_{3} \\ h_{1} \oplus h_{2} = h_{3} \oplus h_{4} = b_{1} \oplus b_{2}. \end{array}\right. } \end{aligned}$$

If there are \(\lambda \) black-boxes to distinguish and if m queries are performed for each then we expect to find about \(\lambda m^{4} 2^{-8n}\) solutions for a random permutation and \(2 \lambda m^{4} 2^{-8n}\) for 6-round Feistel Networks, i.e. twice as many.

2.2 Impossible Differential

Knudsen described in [14] an impossible differential attack against his AES proposal, DEAL, a 6-round Feistel Network using the DES [2] as a round function. This attack is made possible by the existence of a 5-round impossible differential caused by the Feistel functions being permutations. In this case, an input difference \((\alpha ||0)\) cannot be mapped to a difference of \((\alpha || 0)\) after 5 rounds. This would imply that the non-zero difference which has to appear in D, as the image of \(\alpha \) by \(S_{1}\), is mapped to 0 by \(S_{2}\), which is impossible.

To distinguish such a 5-round FN from a random permutation we need to generate \(\lambda \cdot 2^{2n}\) pairs with input difference \((\varDelta || 0)\). Among those, about \(\lambda \) should have an output difference equal to \((\varDelta || 0)\) if the permutation is a random permutation, while it is impossible to observe it for a 5-round FN with bijective Feistel functions. Note that while the time complexity is \(\text {O}(2^{2n})\), the data complexity can be brought down to \(\text {O}(2^{n})\) using structures.
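The impossible differential can be checked exhaustively on a toy instance. The sketch below assumes a 5-round \(\oplus \)-Feistel with \(n=4\) and five random bijective Feistel functions (a hypothetical test instance, not one from the literature) and counts the pairs with input difference \((\alpha ||0)\) whose output difference is also \((\alpha ||0)\).

```python
import random

n = 4
N = 1 << n
rng = random.Random(0)
# Five random *bijective* Feistel functions (hypothetical test instance).
S = [rng.sample(range(N), N) for _ in range(5)]

def encrypt(a, b):
    """5-round XOR-Feistel; returns (g, f)."""
    left, right = a, b
    for s in S:
        left, right = right, left ^ s[right]
    return right, left

def count_impossible_events():
    """Count pairs with input and output difference both equal to (alpha||0)."""
    hits = 0
    for a in range(N):
        for b in range(N):
            g, f = encrypt(a, b)
            for alpha in range(1, N):
                g2, f2 = encrypt(a ^ alpha, b)
                if (g ^ g2, f ^ f2) == (alpha, 0):
                    hits += 1
    return hits

hits = count_impossible_events()
```

For bijective Feistel functions the counter stays at zero, whereas a random permutation would be expected to produce some hits over the same set of pairs.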

An attack on 6 rounds uses this property by identifying pairs of encryptions with difference \((\alpha || 0)\) in the input and \((\alpha || \varDelta )\) for the output for any \(\varDelta \ne 0\). A pair has a correct output difference with probability \(2^{-n}(1-2^{-n})\) since \(\alpha \) is fixed and \(\varDelta \) can take any value except 0. We repeat this process for the whole codebook and all \(\alpha \ne 0\) to obtain \(2^{n+(2n-1)} \cdot 2^{-n}(1-2^{-n}) = 2^{2n-1}-2^{n-1}\) pairs. Each of them gives an impossible equation for \(S_{5}\): if \(\{(a||b) \rightarrow (g||h), (a \oplus \alpha || b) \rightarrow (g \oplus \alpha || h \oplus \varDelta )\}\) is a pair of encryptions then it is impossible that \(S_{5}(g) \oplus S_{5}(g \oplus \alpha ) = \varDelta \) as it would imply the impossible differential. In the end, we have a system of about \(2^{2n-1}-2^{n-1}\) impossible equations, a random Feistel function satisfying an impossible equation with probability \((1-2^{-n})\). Thus, this attack filters out all but the following fraction of candidates for \(S_{5}\):

$$\begin{aligned} \text {Impossible differential filter} ~=~ \big ( 1-2^{-n} \big )^{2^{2n-1}-2^{n-1}} \approx 2^{0.72 - 1.443 \cdot 2^{n-1}}. \end{aligned}$$

3 Yoyo Game and Cryptanalysis

3.1 The Original Yoyo Game

Several cryptanalyses have been proposed in the literature that rely on encrypting a plaintext, performing an operation on the ciphertext and then decrypting the result. For example, the “double-swiping” used against newDES [16] in the related-key setting relies on encrypting a pair of plaintexts using two related-keys and decrypting the result using two different related-keys. Another example is the boomerang attack introduced by Wagner [17] in the single-key setting. A pair with input difference \(\delta \) is encrypted. Then, a difference \(\varDelta \) is added to the ciphertexts and the results are decrypted, hopefully yielding two plaintexts with a difference of \(\delta \).

The yoyo game was introduced by Biham et al. in [15] where it was used to attack the 16 center rounds of Skipjack [18], a block cipher operating on four 16-bit words. We describe this attack using slightly different notation and terminology to be coherent with the rest of our paper. In this paragraph, \(\mathcal {E}_{k}\) denotes an encryption using round-reduced Skipjack under key k.

It was noticed that if the difference between two encryptions at round 5 is \((0,\varDelta ,0,0)\) where \(\varDelta \ne 0\) then the other three words have difference 0 between rounds 5 and 12. Two encryptions satisfying this truncated differential are said to be connected. The key observation is the following. Consider two plaintexts \(x\) and \(x'\) in which the word \(x_{2}\) is constant. If they are connected, then the pair obtained by applying a function \(\phi \) to them is connected as well (see [15] for the definition of \(\phi \) and a detailed explanation of why this is the case). Furthermore, consider the ciphertexts of \(x\) and \(x'\) under \(\mathcal {E}_{k}\). We can form two new ciphertexts \(z\) and \(z'\) by swapping their first words. If we decrypt them to obtain \((u, u') = (\mathcal {E}_{k}^{-1}(z), \mathcal {E}_{k}^{-1}(z'))\), then u and \(u'\) are connected. If we denote \(\psi (x,x')\) the function which encrypts x and \(x'\), swaps the first words of the ciphertexts obtained and decrypts the result then \(\psi \) preserves connection, just like \(\phi \). It is thus possible to iterate \(\phi \) and \(\psi \) to obtain many connected pairs, this process being called the yoyo game.

In this section, we present other definitions of the connection and of the functions \(\phi \) and \(\psi \) which allow us to play a similar yoyo game on 5-round Feistel Networks.

3.2 Theoretical Framework for the Yoyo Game

Consider two plaintexts a||b and \(a'||b'\) such that the difference between their encryptions in positions (CD) is equal to \((\gamma ,0)\) with \(\gamma \ne 0\). Then the difference in position E is equal to \(\gamma \). Conversely, the difference in (ED) being \((\gamma , 0)\) implies that the difference in C is \(\gamma \). When this is the case, the two encryptions satisfy the systems of equations and the trail described in Fig. 3.

Fig. 3. The equations defining connection in \(\gamma \) and the corresponding differential trail.

Definition 1

If the encryptions of a||b and \(a'||b'\) follow the trail in Fig. 3 then they are said to be connected in \(\gamma \).

This connection is an “exclusive” relation: if (a||b) and \((a'||b')\) are connected, then neither (a||b) nor \((a'||b')\) can be connected to anything else. Furthermore, we can replace \((a, a')\) by \((a \oplus \gamma , a' \oplus \gamma )\) in the top equations and still have them being true. Indeed, the two \(\gamma \) cancel each other in the first one. In the second, the values input to each call to \(S_1\) are simply swapped as a consequence of the first equation. Similarly, we can replace \((g, g')\) by \((g \oplus \gamma , g' \oplus \gamma )\) in the bottom equations (Footnote 1). As a consequence of these observations, we state the following lemma.

Lemma 1

We define the following two involutions

$$ \phi _{\gamma } (a || b) = (a \oplus \gamma ) || b,~ \psi _{\gamma } = \mathcal {E}^{-1} \circ \phi _{\gamma } \circ \mathcal {E}. $$

If a||b and \(a'||b'\) are connected then, with probability 1:

  • \(\phi _{\gamma }(a || b)\) and \(\phi _{\gamma } (a' || b')\) are connected,

  • \(\psi _{\gamma }(a || b)\) and \(\psi _{\gamma } (a' || b')\) are connected.

By repeatedly applying \(\phi _{\gamma }\) and \(\psi _{\gamma }\) component-wise on a pair of plaintexts \((x,x')\), we can play a yoyo game which preserves connection in \(\gamma \). This process is defined formally below.

Definition 2

Let \(\big (x_0 = (a_{0}||b_{0}), x'_0 = (a'_{0}||b'_{0}) \big )\) be a pair of inputs. The yoyo game in \(\gamma \) starting in \((x_0, x'_0)\) is defined recursively as follows:

$$\begin{aligned} \big ( x_{i+1}, x'_{i+1} \big ) = {\left\{ \begin{array}{ll} \big ( \phi _{\gamma }(x_i), \phi _{\gamma }(x'_i) \big ) &{} \text {if } i \text { is even}, \\ \big ( \psi _{\gamma }(x_i), \psi _{\gamma }(x'_i) \big ) &{} \text {if } i \text { is odd}. \end{array}\right. } \end{aligned}$$

Lemma 2

If \((x_0, x'_0)\) is connected in \(\gamma \) then all pairs in the game starting in \((x_0, x'_0)\) are connected in \(\gamma \). In other words, either all pairs within the game played using \(\phi _{\gamma }\) and \(\psi _{\gamma }\) are connected in \(\gamma \) or none of them are.
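Lemmas 1 and 2 can be illustrated with a small experiment. The sketch below assumes a toy 5-round \(\oplus \)-Feistel with \(n=4\) and random (not necessarily bijective) Feistel functions; the helper names `state_cd`, `plaintext_from_cd` and `yoyo_game` are hypothetical. A connected pair is built directly from the internal state in positions (CD), which an attacker could not do, and the yoyo game of Definition 2 is then played from it.

```python
import random

n = 4
N = 1 << n
rng = random.Random(1)
S = [[rng.randrange(N) for _ in range(N)] for _ in range(5)]  # random functions

def encrypt(x):
    left, right = x
    for s in S:
        left, right = right, left ^ s[right]
    return right, left  # (g, f)

def decrypt(y):
    right, left = y
    for s in reversed(S):
        left, right = right ^ s[left], left
    return left, right  # (a, b)

def state_cd(x):
    """Internal state in positions (C, D) for plaintext x = (a, b)."""
    a, b = x
    c = a ^ S[0][b]
    return c, b ^ S[1][c]

def plaintext_from_cd(c, d):
    """Invert the first two rounds (uses the secret S-Boxes)."""
    b = d ^ S[1][c]
    return c ^ S[0][b], b

def connected(x, xp, gamma):
    (c, d), (cp, dp) = state_cd(x), state_cd(xp)
    return (c ^ cp, d ^ dp) == (gamma, 0)

def phi(x, gamma):
    return x[0] ^ gamma, x[1]

def psi(x, gamma):
    return decrypt(phi(encrypt(x), gamma))

def yoyo_game(x, xp, gamma, steps):
    """Apply phi on even steps and psi on odd steps, component-wise."""
    game = [(x, xp)]
    for i in range(steps):
        step = phi if i % 2 == 0 else psi
        x, xp = step(x, gamma), step(xp, gamma)
        game.append((x, xp))
    return game

gamma = 5
x0 = plaintext_from_cd(3, 9)
x0p = plaintext_from_cd(3 ^ gamma, 9)
game = yoyo_game(x0, x0p, gamma, steps=20)
all_connected = all(connected(x, xp, gamma) for x, xp in game)
```

Every pair in `game` satisfies the connection test, in line with Lemma 2.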

3.3 The Yoyo Cryptanalysis Against 5-Round \(\oplus \)-Feistel Networks

Given a yoyo game connected in \(\gamma \), it is easy to recover Feistel functions \(S_0\) and \(S_4\) provided that the yoyo game is long enough, i.e. that it contains enough connected pairs to be able to recover all \(2^{n}\) entries of both S-Boxes. If the yoyo game is not connected in \(\gamma \) then yoyo cryptanalysis (Algorithm 1) identifies it as such very efficiently.

It is a differential cryptanalysis exploiting the fact that all pairs in the game are (supposed to be) right pairs for the differential trail defining connection in \(\gamma \). If it is not the case, \(S_{0}\) or \(S_{4}\) will end up requiring contradictory entries, e.g. \(S_{0}(0)=0\) and \(S_{0}(0)=1\). In this case, the game is not connected in \(\gamma \) and must be discarded. Yoyo cryptanalysis is described in Algorithm 1 (Footnote 2). It only takes as inputs a (possible) yoyo game and the value of \(\gamma \). Algorithm 2 describes AddEntry, a subroutine handling some linear equations. Note that one entry can be set arbitrarily (here, \(S_{0}(0) = 0\)) as summarized in Fig. 2.

Let \(\mathcal {Y}\) be a (supposed) yoyo game of size \(| \mathcal {Y} |\). For each pair in it, either an equation is added to the list, FAIL is returned or AddEntry is called. While the recursive calls to AddEntry may lead to a worst-case time complexity quadratic in \(| \mathcal {Y} |\) if naively implemented, this problem can be mitigated by using a hash table indexed by the inputs of the Feistel functions instead of a list. Furthermore, since already solved equations are removed, the total time complexity is \(\text {O}(| \mathcal {Y} |)\).

Algorithm 1. Yoyo cryptanalysis.

Algorithm 2. The AddEntry subroutine.
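The idea behind the recovery of \(S_0\) can be sketched as follows. This is not a transcription of Algorithms 1 and 2 but a simplified illustration under stated assumptions: each pair \((a||b), (a'||b')\) connected in \(\gamma \) gives the linear equation \(S_0(b) \oplus S_0(b') = a \oplus a' \oplus \gamma \), and the equations are propagated from the arbitrary choice \(S_0(0)=0\) with a breadth-first search. The name `recover_s0` is hypothetical, and the connected pairs are generated from the secret internal state only so that the sketch can check itself; in the attack they would come from a yoyo game. The recovery of \(S_4\) from the ciphertext side is analogous and omitted.

```python
import random
from collections import defaultdict, deque

n = 4
N = 1 << n
rng = random.Random(2)
S = [[rng.randrange(N) for _ in range(N)] for _ in range(5)]

def plaintext_from_cd(c, d):
    """Invert the first two rounds (uses the secret S-Boxes)."""
    b = d ^ S[1][c]
    return c ^ S[0][b], b

def recover_s0(pairs, gamma):
    """Return a partial table for S0 (with S0(0) = 0), or None on contradiction."""
    edges = defaultdict(list)
    for (a, b), (ap, bp) in pairs:
        delta = a ^ ap ^ gamma  # equals S0(b) ^ S0(b') for a connected pair
        edges[b].append((bp, delta))
        edges[bp].append((b, delta))
    s0 = {0: 0}  # one entry can be fixed arbitrarily
    queue = deque([0])
    while queue:
        b = queue.popleft()
        for bp, delta in edges[b]:
            value = s0[b] ^ delta
            if bp not in s0:
                s0[bp] = value
                queue.append(bp)
            elif s0[bp] != value:
                return None  # contradictory entries: pairs were not connected
    return s0

gamma = 5
pairs = [(plaintext_from_cd(c, d), plaintext_from_cd(c ^ gamma, d))
         for c in range(N) for d in range(N)]
s0 = recover_s0(pairs, gamma)
```

Every recovered entry equals the true \(S_0\) up to the constant \(S_0(0)\), which reflects the freedom of fixing one entry.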

3.4 On the Infeasibility of Our Yoyo Game Against a \(\boxplus \)-Feistel

Assume that the following equations hold:

$$\begin{aligned} {\left\{ \begin{array}{ll} \big ( S_0(b) + a \big ) - \big ( S_0(b') + a' \big ) = \gamma \\ \big ( S_1(S_0(b) + a) + b \big ) - \big (S_1(S_0(b') + a') + b'\big ) = 0. \end{array}\right. } \end{aligned} \qquad (1)$$

In order to be able to play a yoyo game against the corresponding \(\boxplus \)-Feistel, we need to be able to replace a by \(a+\gamma \) and \(a'\) by \(a'+\gamma \) in System (1) and still have it hold. In other words, we need that Eq. (1) holding implies that the following equations hold as well:

$$\begin{aligned} {\left\{ \begin{array}{ll} \big ( S_0(b) + a + \gamma \big ) - \big ( S_0(b') + a' + \gamma \big ) = \gamma \\ \big ( S_1(S_0(b) + a + \gamma ) + b \big ) - \big (S_1(S_0(b') + a' + \gamma ) + b'\big ) = 0. \end{array}\right. } \end{aligned} \qquad (2)$$

The first one trivially does. Using it, we note that \(S_{0}(b) + a + \gamma = S_{0}(b') + a' + 2 \gamma \). Let \(X = S_{0}(b')+a'\). Then the left-hand side of the second equation in System (2) can be re-written as \(S_1(X + 2\gamma ) - S_1(X + \gamma ) + b - b'\). Furthermore, the second equation in System (1), which is assumed to hold, implies that \(S_{1}(X+\gamma ) - S_{1}(X) = b' - b\). Thus, the left-hand side of the second equation in System (2) is equal to

$$\begin{aligned} S_{1}(X + 2\gamma ) - (b' - b + S_{1}(X)) + b - b' ~=~ S_{1}(X + 2\gamma ) - S_{1}(X) - 2(b'-b). \end{aligned}$$

The term \(S_{1}(X + 2\gamma ) - S_{1}(X)\) has an unknown value unless \(\gamma = 2^{n-1}\). Nevertheless, in this case, we would need \(2(b'-b) = 0\) which does not have a probability equal to 1. However both \(S_{1}(X + 2\gamma ) - S_{1}(X)\) and \(2(b'-b)\) are always equal to 0 in characteristic 2 which is why our yoyo game can always be played against a \(\oplus \)-Feistel.
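The derivation above can be checked numerically. The sketch below assumes two random tables `S0` and `S1` on \(n=4\) bits (hypothetical test data), enumerates every tuple \((a, b, a', b')\) satisfying System (1), and compares the left-hand side of the second equation of System (2) with the expression \(S_{1}(X + 2\gamma ) - S_{1}(X) - 2(b'-b)\), all arithmetic being modulo \(2^{n}\).

```python
import random

n = 4
N = 1 << n
MASK = N - 1
rng = random.Random(3)
S0 = [rng.randrange(N) for _ in range(N)]
S1 = [rng.randrange(N) for _ in range(N)]

def solutions_of_system1(gamma):
    """Yield all (a, b, a', b') satisfying System (1), built from (X, d)."""
    for X in range(N):            # X = S0(b') + a'
        for d in range(N):        # common value in position D
            c = (X + gamma) & MASK
            b = (d - S1[c]) & MASK
            a = (c - S0[b]) & MASK
            bp = (d - S1[X]) & MASK
            ap = (X - S0[bp]) & MASK
            yield a, b, ap, bp

def lhs_system2(a, b, ap, bp, gamma):
    """Left-hand side of the second equation of System (2)."""
    u = S1[(S0[b] + a + gamma) & MASK] + b
    v = S1[(S0[bp] + ap + gamma) & MASK] + bp
    return (u - v) & MASK

def predicted(b, ap, bp, gamma):
    """S1(X + 2*gamma) - S1(X) - 2*(b' - b) with X = S0(b') + a'."""
    X = (S0[bp] + ap) & MASK
    return (S1[(X + 2 * gamma) & MASK] - S1[X] - 2 * (bp - b)) & MASK

gamma = 3
checks = [(lhs_system2(a, b, ap, bp, gamma), predicted(b, ap, bp, gamma))
          for a, b, ap, bp in solutions_of_system1(gamma)]
# Number of solutions of System (1) that also satisfy System (2).
still_connected = sum(1 for lhs, _ in checks if lhs == 0)
```

The two expressions agree on every solution of System (1), while `still_connected` indicates how many of the \(2^{2n}\) solutions survive the replacement of \((a, a')\) by \((a+\gamma , a'+\gamma )\) for this particular instance.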

4 An Improvement: Using Cycles

4.1 Cycles and Yoyo Cryptanalysis

A yoyo game is a cycle of \(\psi _{\gamma }\) and \(\phi _{\gamma }\) applied iteratively component-wise on a pair of elements. Thus, it can be decomposed into two cycles, one for each “side” of the game: \((x_0, x_1, x_2, ...)\) and \((x_0',x'_1, x'_2, ...)\). This means that both cycles must have the same length, otherwise the game would imply that \(x_0\) is connected to \(x'_j\) for \(j \ne 0\), which is impossible. Since both \(\phi _{\gamma }\) and \(\psi _{\gamma }\) are involutions, the cycle can be iterated through in both directions. Therefore, finding one cycle gives us two directed cycles.

In order to exploit yoyo games, we could generate pairs \((x_0,x'_0)\) at random, generate the yoyo game starting at this pair and then try to recover \(S_0\) and \(S_4\), but this endeavour would only work with probability \(2^{-2n}\) (the probability for two random points to be connected). Instead, we can use the link between cycles, yoyo games and connection in \(\gamma \), as described in this section. Note that the use of cycles in cryptography is not new; in fact it was used in the first cryptanalyses of ENIGMA. More recently, particular distributions of cycle sizes were used to distinguish round-reduced PRINCE-core [19] from random [20] and to attack involutional ciphers [21].

4.2 Different Types of Cycles

Let \(\mathcal {C} = (x_i)_{i=0}^{\ell -1}\) be a cycle of length \(\ell \) of \(\psi _{\gamma }\) and \(\phi _{\gamma }\), with \(x_{2i} = \psi _{\gamma }(x_{2i-1})\) and \(x_{2i+1} = \phi _{\gamma }(x_{2i})\). We denote the point connected to \(x_i\) as \(y_i\), where all indices are taken modulo \(\ell \). Since \(x_i\) and \(y_i\) are connected, and the connection relation is one-to-one, we also have \(y_{2i} = \psi _{\gamma }(y_{2i-1})\) and \(y_{2i+1} = \phi _{\gamma }(y_{2i})\). Therefore, \(\mathcal {C}' = (y_i)_{i=0}^{\ell -1}\) is also a cycle of length \(\ell \).

We now classify the cycles according to the relationship between \(\mathcal {C}\) and \(\mathcal {C}'\).

  • If \(\mathcal {C}\) and \(\mathcal {C}'\) are Distinct, \(\mathcal {C}\) is a Type-D cycle. A representation is given in Fig. 4a. Otherwise, there exists k such that \(y_0 = x_k\).

  • If k is even, we have \(x_{k+1} = \phi _{\gamma }(x_k)\). Since \(x_k = y_0\) is connected to \(x_0\), \(x_{k+1} = \phi _{\gamma }(x_k)\) is connected to \(\phi _{\gamma }(x_0) = x_1\), i.e. \(y_1 = x_{k+1}\). Further, \(x_{k+2} = \psi _{\gamma }(x_{k+1})\) is connected to \(\psi _{\gamma }(x_1) = x_2\), i.e. \(y_2 = x_{k+2}\). By induction, we have \(y_i = x_{k+i}\). Therefore \(x_0\) is connected to \(x_k\) and \(x_k\) is connected to \(x_{2k}\). Since the connection relation is one-to-one, this implies that \(2k = \ell \).

    We denote this setting as a Type-S cycle. Each element \(x_i\) is connected to \(x_{i+\ell /2}\). Thus, if we represent the cycle as a circle, the connections between the elements would all cross in its center, just like Spokes, as can be seen in Fig. 4b.

  • If k is odd, we have \(x_{k-1} = \phi _{\gamma }(x_k)\). Since \(x_k = y_0\) is connected to \(x_0\), \(x_{k-1} = \phi _{\gamma }(x_k)\) is connected to \(\phi _{\gamma }(x_0) = x_1\), i.e. \(y_1 = x_{k-1}\). Further, \(x_{k-2} = \psi _{\gamma }(x_{k-1})\) is connected to \(\psi _{\gamma }(x_1) = x_2\), i.e. \(y_2 = x_{k-2}\). By induction, we have \(y_i = x_{k-i}\).

    We denote this setting as a Type-P cycle. If we represent the cycle as a circle, the connections between the elements would all be Parallel to each other as can be seen in Fig. 4c.

    In particular, there are exactly two pairs \((x_{i}, x_{i+1})\) such that \(x_{i}\) and \(x_{i+1}\) are connected. Indeed, we have \(x_{i+1} = y_i\) if and only if \(i+1 \equiv k-i \ \text {mod}\ {\ell }\) i.e. \(i \equiv (k-1)/2 \ \text {mod}\ {\ell /2}\). As a consequence, the existence of w connected pairs \((x,x')\) with \(x'=\phi _{\gamma }(x)\) or \(x'=\psi _{\gamma }(x)\) implies the existence of w / 2 Type-P cycles.

    In addition, Type-P cycles can only exist if either \(S_1\) or \(S_3\) are not bijections. Indeed, if (a||b) and \((a \oplus \gamma ||b)\) are connected then the difference in position D cannot be zero unless \(S_1\) can map a difference of \(\gamma \) to zero. If it is a permutation, this is impossible. The situation is identical for \(S_{3}\). Furthermore, each value c such that \(S_{1}(c) = S_{1}(c \oplus \gamma )\) implies the existence of \(2^{n}\) values (a||b) connected to \(\phi _{\gamma }(a||b)\) as b can be chosen arbitrarily and a computed from b and c. Again, the situation is identical for \(S_{3}\). Thus, if \(S_{1}(x) = S_{1}(x \oplus \gamma )\) has \(w_{1}\) solutions and if \(S_{3}(x) = S_{3}(x \oplus \gamma )\) has \(w_{3}\) solutions then there are \((w_{1}+w_{3}) \cdot 2^{n-2}\) Type-P cycles.

Fig. 4. All the types of cycles that can be encountered. \(\phi _{\gamma }\) is a blue line, \(\psi _{\gamma }\) is a red one and connection is a green one (remember that \(\phi _{\gamma }\) and \(\psi _{\gamma }\) are involutions).
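The classification can be reproduced on a toy instance. The sketch below assumes a 5-round \(\oplus \)-Feistel with \(n=3\) and random bijective Feistel functions; the helper names `partner`, `cycle_from` and `classify` are hypothetical. It walks every cycle of \(\phi _{\gamma }\) and \(\psi _{\gamma }\) over the full codebook and labels it by the position k of the point connected to \(x_0\), which is computed from the secret S-Boxes and is therefore only available in an experiment, not to an attacker.

```python
import random

n = 3
N = 1 << n
rng = random.Random(4)
S = [rng.sample(range(N), N) for _ in range(5)]  # bijective Feistel functions

def encrypt(x):
    left, right = x
    for s in S:
        left, right = right, left ^ s[right]
    return right, left

def decrypt(y):
    right, left = y
    for s in reversed(S):
        left, right = right ^ s[left], left
    return left, right

def phi(x, gamma):
    return x[0] ^ gamma, x[1]

def psi(x, gamma):
    return decrypt(phi(encrypt(x), gamma))

def partner(x, gamma):
    """The unique point connected to x in gamma (uses the secret S-Boxes)."""
    a, b = x
    c = a ^ S[0][b]
    d = b ^ S[1][c]
    c ^= gamma
    bp = d ^ S[1][c]
    return c ^ S[0][bp], bp

def cycle_from(x0, gamma):
    """x_{2i+1} = phi(x_{2i}) and x_{2i} = psi(x_{2i-1}), until back at x0."""
    cycle, x, i = [], x0, 0
    while True:
        cycle.append(x)
        x = phi(x, gamma) if i % 2 == 0 else psi(x, gamma)
        i += 1
        if x == x0:
            return cycle

def classify(gamma):
    seen, counts, type_s_ok = set(), {"S": 0, "P": 0, "D": 0}, True
    for a in range(N):
        for b in range(N):
            if (a, b) in seen:
                continue
            cyc = cycle_from((a, b), gamma)
            seen.update(cyc)
            index = {x: i for i, x in enumerate(cyc)}
            k = index.get(partner(cyc[0], gamma))
            if k is None:
                counts["D"] += 1
            elif k % 2 == 0:
                counts["S"] += 1
                type_s_ok = type_s_ok and len(cyc) == 2 * k
            else:
                counts["P"] += 1
    return counts, len(seen), type_s_ok

counts, total, type_s_ok = classify(gamma=1)
```

With bijective Feistel functions no Type-P cycle is found, and every Type-S cycle has length \(\ell = 2k\). Replacing `rng.sample` by tables with collisions in \(S_1\) or \(S_3\) is one way to observe Type-P cycles.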

4.3 The Cycle-Based Yoyo Cryptanalysis

Exploiting a Type-S cycle is a lot easier than exploiting a Type-P or a pair of Type-D cycles. Indeed, the connected pairs \((x_i,x_{i+\ell /2})\) can be immediately derived from the length \(\ell \) of the cycle, while we have to guess a shift amount for connected pairs in a Type-P cycle, or between two Type-D cycles. Thus, it makes sense to target those specifically, for instance by implementing Algorithm 3.

Algorithm 3. The cycle-based yoyo cryptanalysis.

Let \(q_S(n)\) be the probability that a Type-S cycle exists for the chosen \(\gamma \) for a 5-round Feistel Network built out of bijective Feistel functions. When averaged over all such Feistel Networks, this probability does not depend on \(\gamma \). The full version of this paper [22] contains a more detailed discussion of this probability in Sect. 3.4, and examples of functional graphs in the Appendix.

This attack requires \(\text {O}(2^{2n}/n)\) blocks of memory to store which plaintexts were visited and \(\text {O}(2^{2n})\) time. Indeed, at most all elements of the codebook will be evaluated and inspected a second time when attempting a yoyo cryptanalysis on each cycle large enough. Even though the attack must be repeated about \(1/q_S(n)\) times to be able to obtain a large enough Type-S cycle, \(q_S(n)\) increases with n so that \(1/q_S(n)\) can be upper-bounded by a constant independent of n (Footnote 3). Note also that special points can be used to obtain a time-memory tradeoff: instead of storing whether all plaintexts were visited or not, we only do so for those with, say, the first \(\mathcal {B}\) bits equal to 0. In this case, the time complexity becomes \(\text {O}(\mathcal {B} \cdot 2^{2n})\) and the memory complexity \(\text {O}\big (2^{2n}/ (n \cdot \mathcal {B}) \big )\). Access to the hash table storing whether an element has been visited or not is a bottleneck in practice, so special points actually give a “free” memory improvement in the sense that memory complexity is decreased without increasing time. In fact, wall clock time may actually decrease. An attack against a \(\oplus \)-Feistel with \(n=14\) on a regular desktop computer (Footnote 4) takes about 1 hour to recover both \(S_0\) and \(S_4\).

4.4 Attacking 6 and 7 Rounds

An Attack on 6 Rounds. A naive approach could consist in guessing all of the entries of \(S_{5}\) and, for each guess, trying to run a cycle-based yoyo cryptanalysis. If it fails then the guess is discarded. Such an attack would run in time \(\text {O}\big (2^{n2^{n} + 2n} \big )\). However, it is possible to run such an attack at a cost similar to that of guessing only half of the entries of \(S_{5}\), namely \(\text {O}(2^{n2^{n-1}+2n})\) which corresponds to a gain of \(2^{n2^{n-1}}\).

Instead of guessing all the entries, this attack requires guessing the values of \(\varDelta _5(x, \gamma ) = S_{5}(x) \oplus S_{5}(x \oplus \gamma )\). Once these are known, we simply need to replace \(\psi _{\gamma }\) by \(\psi '_{\gamma }\) with

$$\begin{aligned} \big ( \mathcal {E}\circ \psi '_{\gamma } \circ \mathcal {E}^{-1} \big )(g || h) = \big (g \oplus \gamma ~||~ h \oplus \varDelta _5(x, \gamma ) \big ). \end{aligned}$$

The cycle-based yoyo cryptanalysis can then be run as previously because, again, both \(\phi _{\gamma }\) and \(\psi '_{\gamma }\) preserve connection in \(\gamma \). Once it succeeds, the top S-Box is known which means that it can be peeled off. The regular attack is then performed on the remaining 5 rounds. Note that if the yoyo cryptanalysis fails because of inner collisions in \(S_{1}\) or \(S_{3}\) then we can still validate a correct guess by noticing that there are \(\text {O}(2^{n})\) cycles instead of \(\text {O}(2n)\) as would be expected (Footnote 5).

In this algorithm, \(2^{n-1}\) values of \([0,2^{n}-1]\) must be guessed and for each of those an attack with running time \(\text {O}(2^{2n})\) must be run. Hence, the total running time is \(\text {O}\big ( 2^{n2^{n-1}+2n} \big )\). The time necessary to recover the remainder of the Feistel functions is negligible.

An Attack on 7 Rounds. A \(\oplus \)-Feistel with 7 rounds can be attacked in a similar fashion by guessing both \(\varDelta _0(x, \gamma )\) and \(\varDelta _6(x, \gamma )\) for all x. These guesses allow the definition of \(\phi ''_{\gamma }\) and \(\psi ''_{\gamma }\), as follows:

$$\begin{aligned} \phi ''_{\gamma }(a || b) ~&=~ \big (a \oplus \varDelta _{0}(x, \gamma ) ~||~ b \oplus \gamma \big ) \\ \big ( \mathcal {E}\circ \psi ''_{\gamma } \circ \mathcal {E}^{-1} \big )(g || h) ~&=~ \big ( h \oplus \varDelta _6(x, \gamma ) ~||~ g \oplus \gamma \big ). \end{aligned}$$

For each complete guess \(\big ( (\varDelta _0(x, \gamma _{0}), \forall x), (\varDelta _6(x, \gamma _{0}), \forall x) \big )\), we run a yoyo cryptanalysis. If it succeeds, we repeat the attack for a new difference \(\gamma _{1}\). In this second step, we do not need to guess \(2^{n-1}\) values for each \(\varDelta _{0}(x, \gamma _{1})\) and \(\varDelta _{6}(x, \gamma _{1})\) but only \(2^{n-2}\) as \(\varDelta _{i}(x \oplus \gamma _{0}, \gamma _{1}) = \varDelta _{i}(x, \gamma _{0}) \oplus \varDelta _{i}(x \oplus \gamma _{1}, \gamma _{0}) \oplus \varDelta _{i}(x, \gamma _{1})\). We run again a cycle-based yoyo cryptanalysis to validate our guesses. The process is repeated \(n-1\) times in total so as to have \(\sum _{k=0}^{n-1}2^{k} = 2^{n}-1\) independent linear equations connecting the entries of \(S_{0}\) and another \(2^{n}-1\) for the entries of \(S_{6}\). Solving those equations gives the two outer Feistel functions, meaning that they can be peeled off. We then run a regular yoyo-cryptanalysis on the 5 inner rounds to recover the remainder of the structure.
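The relation between the differences used in the second step follows by expanding the definition \(\varDelta _{i}(x, \gamma ) = S_{i}(x) \oplus S_{i}(x \oplus \gamma )\), assumed here for \(i \in \{0, 6\}\) by analogy with \(\varDelta _5\):

$$\begin{aligned} \varDelta _{i}(x, \gamma _{0}) \oplus \varDelta _{i}(x \oplus \gamma _{1}, \gamma _{0}) \oplus \varDelta _{i}(x, \gamma _{1}) ~&=~ S_{i}(x) \oplus S_{i}(x \oplus \gamma _{0}) \oplus S_{i}(x \oplus \gamma _{1}) \oplus S_{i}(x \oplus \gamma _{0} \oplus \gamma _{1}) \oplus S_{i}(x) \oplus S_{i}(x \oplus \gamma _{1}) \\ ~&=~ S_{i}(x \oplus \gamma _{0}) \oplus S_{i}(x \oplus \gamma _{0} \oplus \gamma _{1}) ~=~ \varDelta _{i}(x \oplus \gamma _{0}, \gamma _{1}). \end{aligned}$$

The terms \(S_{i}(x)\) and \(S_{i}(x \oplus \gamma _{1})\) each appear twice and cancel, which is why half of the values for \(\gamma _{1}\) are already determined by the guesses made for \(\gamma _{0}\).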

Since \(\sum _{k=0}^{n-1}2^{n2^{k}+2n} = \text {O}\big ( 2^{n2^{n} + 2n} \big )\), the total time complexity of this attack is \(\text {O}\big ( 2^{n2^{n} + 2n} \big )\), which is roughly the complexity of a naive 6-round attack based on guessing a complete Feistel function and running a cycle-based yoyo cryptanalysis on the remainder.

5 Guess and Determine Attack

Since the yoyo game is only applicable to an \(\oplus \)-Feistel, we now describe a different attack that works for any group operation \(+\). This guess and determine attack is based on a well-known boomerang-like distinguisher for 3-round Feistel Networks, initially described by Luby and Rackoff [12]. This is used to attack Feistel Networks with 4 or 5 rounds using a guess and determine approach: we guess entries of \(S_3\) and \(S_4\) in order to perform partial encryption/decryption for some values, and we use the distinguisher on the first three rounds in order to verify the consistency of the guesses, and to recover more values of \(S_3\) and \(S_4\).

Fig. 5. Distinguisher for a 3-round Feistel

5.1 Three-Round Property

The distinguisher is illustrated by Fig. 5, and works as follows:

  • Select arbitrary values \(a, b, \delta \) (\(\delta \ne 0\));

  • Query \((e,d) = \mathcal {E}(a,b)\) and \((e', d') = \mathcal {E}(a + \delta ,b)\);

  • Query \((a',b') = \mathcal {E}^{-1}(e+\delta ,d)\);

  • If \(\mathcal {E}\) is a three-round Feistel, then \(d - b' = d' - b\).

The final equation is always true for a 3-round Feistel Network because the input to the second Feistel function \(S_1\) is \(c+\delta \) for both queries \(\mathcal {E}(a + \delta ,b)\) and \(\mathcal {E}^{-1}(e+\delta ,d)\). Therefore the output of \(S_1\) is the same in both cases. On the other hand, the relation only holds with probability \(2^{-n}\) for a random permutation.
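The property can be illustrated with a toy experiment. This is a minimal sketch assuming \(n = 4\), modular addition as the group operation and randomly drawn Feistel functions; `make_feistel` and `relation_holds` are hypothetical helper names, and the round and output ordering follow the notation \(c = a + S_0(b)\), \(d = b + S_1(c)\), \(e = c + S_2(d)\) with output \((e,d)\).

```python
import random

N = 4                 # toy branch width (assumption)
M = 1 << N            # arithmetic is modulo 2^N

def make_feistel(funcs):
    """Return (enc, dec) oracles for a Feistel network with the given round tables."""
    def enc(a, b):
        x, y = a, b
        for s in funcs:
            x, y = y, (x + s[y]) % M      # one round: new branch = x + S(y)
        return y, x                        # e.g. (e, d) after three rounds
    def dec(u, v):
        x, y = v, u
        for s in reversed(funcs):
            x, y = (y - s[x]) % M, x       # undo one round
        return x, y
    return enc, dec

def relation_holds(enc, dec, a, b, delta):
    """Evaluate d - b' == d' - b using three oracle queries."""
    e, d = enc(a, b)
    _, d2 = enc((a + delta) % M, b)
    _, b2 = dec((e + delta) % M, d)
    return (d - b2) % M == (d2 - b) % M

rng = random.Random(1)
tables = [[rng.randrange(M) for _ in range(M)] for _ in range(3)]
enc3, dec3 = make_feistel(tables)
always = all(relation_holds(enc3, dec3, a, b, delta)
             for a in range(M) for b in range(M) for delta in range(1, M))
```

For the 3-round construction the relation holds for every choice of \(a\), \(b\) and \(\delta \ne 0\), regardless of whether the tables are bijective.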

Fig. 6. Attack against 4-round Feistel

5.2 Four-Round Attack

We now explain how to use this property to decompose a four-round Feistel network. We first fix \(S_3(0) = 0\), and guess the value \(S_3(1)\). Then we use known values of \(S_3\) to iteratively learn new values as follows (see Fig. 6):

  • Select e and \(\delta \) such that \(S_3(e)\) and \(S_3(e+ \delta )\) are known, with \(\delta \ne 0\).

  • For every \(d \in \mathbb {F}_{2}^{n}\), we set \(f = d + S_3(e)\) and \(f' = d + S_3(e + \delta )\); we query \((a,b) = \mathcal {E}^{-1}(f,e)\) and \((a',b') = \mathcal {E}^{-1}(f', e + \delta )\).

  • Then we query \((f'',e') = \mathcal {E}(a+\delta ,b)\). Using the three-round property, we know that:

    $$ d - b' = d' - b, \quad \text {where}\; d' = f'' - S_3(e'). $$

    This gives the value of \(S_3(e')\) as \(f'' - d + b' - b\).

We iterate the deduction algorithm until we either detect a contradiction (if the guess of \(S_3(1)\) is wrong), or we recover the full \(S_3\). Initially, we select \(e = 0, \delta = 1\), or \(e=1, \delta = -1\), with \(2^n\) choices of d: this allows \(2^{n+1}\) deductions. If the guess of \(S_3(1)\) is wrong, we expect to find a contradiction after about \(2^{n/2}\) deductions. If the guess is correct, almost all entries of \(S_3\) will be deduced with a single choice of e and \(\delta \), and we will have many options for further deduction. Therefore, the complexity of this attack is about \(2^{3n/2}\).
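The deduction loop can be sketched as follows for a toy \(\boxplus \)-Feistel with \(n = 4\). This is an illustrative implementation under stated assumptions, not the code used for the experiments: the helper names, the seed and the exhaustive order in which pairs \((e, \delta )\) are processed are all choices made here. Since the attack fixes \(S_3(0) = 0\), the recovered table is compared with the secret \(S_3\) shifted by the constant \(S_3(0)\).

```python
import random

N = 4                 # toy branch width (assumption)
M = 1 << N

def make_feistel(funcs):
    """Return (enc, dec) oracles; a 4-round network outputs (f, e)."""
    def enc(a, b):
        x, y = a, b
        for s in funcs:
            x, y = y, (x + s[y]) % M
        return y, x
    def dec(u, v):
        x, y = v, u
        for s in reversed(funcs):
            x, y = (y - s[x]) % M, x
        return x, y
    return enc, dec

def deduce_s3(enc, dec, guess):
    """Deduction loop for one guess of S_3(1); None on contradiction or stall."""
    s3 = {0: 0, 1: guess}
    done = set()                       # (e, e + delta) pairs already processed
    progress = True
    while progress:
        progress = False
        for e in list(s3):
            for e2 in list(s3):
                if e2 == e or (e, e2) in done:
                    continue
                done.add((e, e2))
                progress = True
                delta = (e2 - e) % M
                for d in range(M):
                    a, b = dec((d + s3[e]) % M, e)
                    _, b2 = dec((d + s3[e2]) % M, e2)
                    f3, e3 = enc((a + delta) % M, b)
                    value = (f3 - d + b2 - b) % M   # candidate for S_3(e')
                    if s3.setdefault(e3, value) != value:
                        return None                 # contradiction: wrong guess
    return [s3[x] for x in range(M)] if len(s3) == M else None

def recover_s3(enc, dec):
    """Try every guess of S_3(1) and keep the first consistent full table."""
    for guess in range(M):
        table = deduce_s3(enc, dec, guess)
        if table is not None:
            return table
    return None

rng = random.Random(2015)
secret = [[rng.randrange(M) for _ in range(M)] for _ in range(4)]
enc, dec = make_feistel(secret)
recovered = recover_s3(enc, dec)
expected = [(v - secret[3][0]) % M for v in secret[3]]   # S_3 normalised to S_3(0) = 0
```

The sketch processes every available pair until no new pair remains, which is more work than the attack needs but keeps the control flow short; a real implementation would stop as soon as the table is complete.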

Fig. 7. Attack against 5-round Feistel

5.3 Five-Round Attack

The extension from 4 rounds to 5 rounds is similar to the extension from a three-round distinguisher to a four-round attack. First, we guess some entries of the last S-Box, so that we can invert the last round for a subset of the outputs. Then, we use those pairs to perform an attack on a reduced version so as to test whether the guess was valid. However, we need to guess a lot more entries in this context. The deductions are performed as follows (see Fig. 7):

  • Select d, e and \(\delta \) such that \(S_3(e)\), \(S_3(e + \delta )\), \(S_4(d + S_3(e))\) and \(S_4(d + S_3(e + \delta ))\) are known.

  • Let \((f,g) = (d+S_3(e), e+S_4(f))\) and \((f',g') = (d+S_3(e+\delta ), e+\delta +S_4(f'))\), then query \((a,b) = \mathcal {E}^{-1}(g,f)\) and \((a',b') = \mathcal {E}^{-1}(g', f')\).

  • Finally, query \((g'',f'') = \mathcal {E}(a+\delta ,b)\). Assuming that \(S_4(f'')\) is known, we can use the three-round property and deduce:

    $$ d - b' = d' - b, \quad \text {where}\; d' = f'' - S_3(g'' - S_4(f'')) $$

    This gives the value of \(S_3(g'' - S_4(f''))\) as \(f'' - d + b' - b\).
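A single deduction step can be written as a small function over partially known tables. The sketch below assumes a toy \(\boxplus \)-Feistel with \(n = 4\); `deduce` and `make_feistel` are hypothetical names, and the tables `s3` and `s4` are dictionaries holding only the entries known so far. As a sanity check of the formula, the step is evaluated here with the complete secret tables, so every attempt succeeds and must return a correct entry of \(S_3\).

```python
import random

N = 4                 # toy branch width (assumption)
M = 1 << N

def make_feistel(funcs):
    """Return (enc, dec) oracles; a 5-round network outputs (g, f)."""
    def enc(a, b):
        x, y = a, b
        for s in funcs:
            x, y = y, (x + s[y]) % M
        return y, x
    def dec(u, v):
        x, y = v, u
        for s in reversed(funcs):
            x, y = (y - s[x]) % M, x
        return x, y
    return enc, dec

def deduce(enc, dec, s3, s4, d, e, delta):
    """One deduction attempt; returns (index, value) for S_3 or None if it fails."""
    e2 = (e + delta) % M
    if e not in s3 or e2 not in s3:
        return None
    f, f2 = (d + s3[e]) % M, (d + s3[e2]) % M
    if f not in s4 or f2 not in s4:
        return None
    g, g2 = (e + s4[f]) % M, (e2 + s4[f2]) % M
    a, b = dec(g, f)
    _, b2 = dec(g2, f2)
    g3, f3 = enc((a + delta) % M, b)
    if f3 not in s4:
        return None                     # S_4(f'') unknown: the attempt fails
    return (g3 - s4[f3]) % M, (f3 - d + b2 - b) % M

rng = random.Random(3)
secret = [[rng.randrange(M) for _ in range(M)] for _ in range(5)]
enc, dec = make_feistel(secret)
full_s3 = dict(enumerate(secret[3]))
full_s4 = dict(enumerate(secret[4]))
results = [deduce(enc, dec, full_s3, full_s4, d, e, delta)
           for d in range(M) for e in range(M) for delta in range(1, M)]
all_correct = all(r is not None and secret[3][r[0]] == r[1] for r in results)
```

In the attack itself the dictionaries would contain only guessed and deduced entries, and a returned pair that disagrees with an already known entry of \(S_3\) signals a wrong guess.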

Guessing Strategy. The order in which we guess entries of \(S_3\) and \(S_4\) is very important in order to obtain a low complexity attack. We first guess the values of \(S_3(i)\) and \(S_4(S_3(i))\) for \(i < \ell \), with \(\ell > 2^{n/2}\). This allows us to try deductions with \(d=0\) and any \(e, e+\delta < \ell \), i.e. \(\ell ^2\) attempts. Since \(\ell \) entries of \(S_4\) are known, each attempt succeeds with probability \(\ell 2^{-n}\), and we expect to deduce about \(\ell ^3 2^{-n}\) new values of \(S_3\). With \(\ell > 2^{n/2}\), a wrong guess will lead to a contradiction with high probability.

When an initial guess is non-contradictory, we select x such that \(S_3(x)\) has been deduced earlier, we guess the corresponding value \(S_4(S_3(x))\), and run the deduction again. The new guess allows us to make \(\ell \) new deduction attempts with \(d=0\), \(e < \ell \) and \(e' = x\). We expect about \(\ell ^2 2^{-n}\) successful new deductions. With \(\ell = 2^{3n/4+\varepsilon }\) for a small \(\varepsilon > 0\), the probability of finding a contradiction is higher than \(2^{-n}\), and the size of the search tree decreases.

The attack will also work if we start with \(2^{n/4}\) entries in \(S_3\) and \(2^{3n/4}\) entries in \(S_4\): the first step will deduce \(2^{3n/4}\) values in \(S_3\). Therefore, we have to make only \(2^{n/4}+2^{3n/4} \approx 2^{3n/4}\) guesses, and the total complexity is about \(2^{n2^{3n/4}}\).

Application to n = 4. We now explain the attack in more detail with \(n=4\). We first set \(S_4(0) = S_3(0) = 0\) and we guess the values of \(S_3(1)\), \(S_3(2)\), \(S_4(S_3(1))\), and \(S_4(S_3(2))\). In particular this allows us to compute the last two rounds for the following \((e,d)\) values:

$$\begin{aligned} (0, 0)\qquad \;\;\; \qquad (1, 0)\qquad \;\;\;\qquad (2, 0) \end{aligned}$$

This gives 6 candidates \((e,d),\delta \) for the deduction algorithm:

$$\begin{aligned} (0, 0), \delta \in \{1,2\} \qquad (1, 0), \delta \in \{1,-1\} \qquad (2, 0), \delta \in \{-1,-2\} \end{aligned}$$

Each candidate gives a deduction with probability 3/16 because three entries are known in \(S_4\). Therefore there is a good probability to get one deduction \(S_3(x)\). In this case, we guess the value \(S_4(S_3(x))\), so that we can also compute the last two rounds for \((e,d) = (x,0)\). We have at least 6 new candidates \((e,d),\delta \) for the deduction algorithm:

$$\begin{aligned} (x, 0)&, \delta \in \{ -x, 1-x, 2-x \}&(0, 0)&, \delta = x&(1, 0)&, \delta = x-1&(2, 0)&, \delta = x-2 \end{aligned}$$

In total, we have 12 candidates, and each of them gives a deduction with probability 4/16, including the deduction made in the first step. We expect about 3 deductions in total, which leads to 7 known values in \(S_3\). Since \(7 > 2^{n/2}\), there is already a good chance to detect a contradiction. For the remaining cases, we have to make further guesses of \(S_4\) entries, and repeat the deduction procedure.

Since we had to make five guesses for most branches of the guess and determine algorithm, the complexity is about \(2^{20}\). In practice, this attack takes less than one second on a single core with \(n=4\) (on a 3.4 GHz Haswell CPU).

6 Integral Attack

Finally, we present an integral attack against 5-round Feistels, which was shown to us by one of the anonymous reviewers. This attack has a complexity of \(2^{3n}\), and works for any group operation \(+\), but it requires \(S_1\) or \(S_3\) to be a permutation.

The attack is based on an integral property, as introduced in the cryptanalysis of Square [23, 24]. In the following, we assume that \(S_1\) is a permutation; if \(S_3\) is a permutation instead, the attack is performed against the decryption oracle. An attacker uses a set of \(2^n\) plaintexts \((a_i,b)\) where \(a_i\) takes all possible values, and b is fixed to a constant value. She can then trace the evolution of this set of plaintexts through the Feistel structure:

  • \(c_i = a_i + S_0(b)\) takes all possible values once;

  • \(d_i = b + S_1(c_i)\) takes all possible values once, since \(S_1\) is a permutation;

  • \(e_i = c_i + S_2(d_i)\) has a fixed sum:

    $$\begin{aligned} \sum _i e_i = \sum _{x \in \{0 \ldots 2^n-1\}} x + \sum _{x \in \{0 \ldots 2^n-1\}} S_2(x) = S. \end{aligned}$$

    The first term is 0 for a \(\oplus \)-Feistel, and \(2^{n-1}\) for a \(\boxplus \)-Feistel, while the second term is equal to the first if \(S_2\) is a permutation, but otherwise unknown.

After collecting the \(2^n\) ciphertexts \((g_i, f_i)\) corresponding to the set of plaintexts, she can express \(e_i = g_i - S_4(f_i)\). The fixed sum \(\sum e_i = S\) gives a linear equation between the values \(S_4(f_i)\) and S. This can be repeated with \(2^n\) different sets of plaintexts, in order to build \(2^{n}\) linear equations. Solving the equations recovers the values \(S_4(f_i)\), i.e. the full \(S_4\) box.
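The fixed-sum property is easy to confirm on a toy instance. The sketch below assumes a 5-round \(\boxplus \)-Feistel with \(n = 4\), random Feistel functions and a permutation for \(S_1\); the names `encrypt`, `integral_sum` and the seed are illustrative. The secret \(S_4\) is used here only to verify the property, whereas in the attack its entries are the unknowns of the linear system.

```python
import random

N = 4                 # toy branch width (assumption)
M = 1 << N

def encrypt(funcs, a, b):
    """5-round Feistel with modular addition; returns the ciphertext (g, f)."""
    x, y = a, b
    for s in funcs:
        x, y = y, (x + s[y]) % M
    return y, x

rng = random.Random(7)
funcs = [[rng.randrange(M) for _ in range(M)] for _ in range(5)]
funcs[1] = rng.sample(range(M), M)          # S_1 is required to be a permutation

# Predicted constant: sum of all x (2^(n-1) modulo 2^n) plus sum of S_2(x).
S = (M // 2 + sum(funcs[2])) % M

def integral_sum(b):
    """Sum of e_i = g_i - S_4(f_i) over the 2^n plaintexts (a_i, b)."""
    total = 0
    for a in range(M):
        g, f = encrypt(funcs, a, b)
        total += g - funcs[4][f]
    return total % M

sums = [integral_sum(b) for b in range(M)]   # one linear equation per value of b
```

Each value of \(b\) yields one equation with the same right-hand side \(S\), which matches the \(2^n\) equations used above.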

When \(S_2\) is a permutation, S is known, and the system has a single solution with high probability. However, when S is unknown, the system has \(2^n\) equations and \(2^n+1\) unknowns; with high probability it has rank \(2^n\) and \(2^n\) solutions. Therefore, an attacker has to explore the set of solutions, and to use a 4-round distinguisher to verify the guess. Using the attack of Sect. 5.2, this has complexity \(2^{5n/2}\).

For a \(\oplus \)-Feistel, the cost of solving the linear system is \(2^{3n}\) with Gaussian elimination, but can be improved to \(O(2^{2.81n})\) with Strassen’s Algorithm (the currently best known algorithm [25] has complexity only \(O(2^{2.3729n})\) but is probably more expensive for practical values of n).

For a \(\boxplus \)-Feistel, solving a linear system over \(\mathbb {Z}/2^n\mathbb {Z}\) is harder. However, we can solve the system bit-by-bit using linear algebra over \(\mathbb {F}_2\). We first consider the equations modulo 2, and recover the least significant bit of \(S_4\). Next, we consider the equations modulo 4. Since the least significant bits are known, this also turns into linear equations over \(\mathbb {F}_2\), with the second bits as unknowns. We repeat this technique from the least significant bit to the most significant. At each step, we might have to consider a few different candidates if the system is not full rank.
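As an illustration of the first two steps, write \(S_4(x) = s_0(x) + 2 s_1(x) + \dots + 2^{n-1} s_{n-1}(x)\) with bits \(s_j(x) \in \{0,1\}\) (a notation introduced here only for this example) and assume for simplicity that S is known. One equation \(\sum _i \big (g_i - S_4(f_i)\big ) = S\) then gives, modulo 2 and modulo 4:

$$\begin{aligned} \sum _i s_0(f_i)&\equiv \sum _i g_i - S \pmod 2, \\ \sum _i s_1(f_i)&\equiv \frac{1}{2}\Big ( \sum _i g_i - S - \sum _i s_0(f_i) \Big ) \pmod 2. \end{aligned}$$

In the second line the sums are taken over the integers; the bracket is even once the first line is satisfied, so the division by 2 is well defined. Both lines are linear over \(\mathbb {F}_2\) in the unknown bits \(s_0\) and \(s_1\) respectively. When S is unknown, its bits simply appear as additional unknowns at each step.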

In total, this attack has a time complexity of about \(2^{2.81n}\).

7 Conclusion

We presented new generic attacks against Feistel Networks capable of recovering the full description of the Feistel functions without making any assumptions regarding whether they are bijective or not. To achieve this, we have improved the yoyo game proposed in [15] using cycles and found an efficient guessing strategy to be used if modular addition is used instead of exclusive-or. We implemented our attacks to check our claims. We finally described an integral attack suggested by an anonymous reviewer.

Our attacks allow an efficient recovery of the Feistel functions for 5-round Feistel Networks, and cycle-based yoyo cryptanalysis can be pushed to attack 6-round (respectively 7-round) \(\oplus \)-Feistel at a cost similar to guessing half (resp. all) of the entries of a Feistel function.

Our results differ significantly between \(\oplus \)-Feistel and \(\boxplus \)-Feistel. It remains an open problem to find a more efficient attack against \(\boxplus \)-Feistel or to theoretically explain such a difference.