Des. Codes Cryptogr. (2012) 62:1–18
DOI 10.1007/s10623-011-9484-2

Memoryless near-collisions via coding theory

Mario Lamberger · Florian Mendel · Vincent Rijmen · Koen Simoens

Received: 2 November 2009 / Revised: 21 January 2011 / Accepted: 24 January 2011 / Published online: 13 February 2011
© Springer Science+Business Media, LLC 2011

Abstract We investigate generic methods to find near-collisions in cryptographic hash functions. We introduce a new generic approach based on methods to find cycles in the space of codewords of a code with low covering radius. We give an analysis of our approach and demonstrate it on the SHA-3 candidate TIB3.

Keywords Hash functions · Near-collisions · Cycle finding algorithms · Covering codes

Mathematics Subject Classification (2000) 11T71 · 94A60

Communicated by S. D. Galbraith.

M. Lamberger (B) · F. Mendel: IAIK, Graz University of Technology, Graz, Austria. e-mail: mario.lamberger@iaik.tugraz.at
V. Rijmen · K. Simoens: IAIK, Graz University of Technology; ESAT/COSIC, K. U. Leuven and IBBT, Heverlee, Belgium

1 Introduction

After the publication of the attacks on MD5, SHA-1 and several other modern cryptographic hash functions by Wang et al. [29,30], there has been a renewed interest in the design and cryptanalysis of these important cryptographic primitives. Cryptographic hash functions have to satisfy many requirements, among which the properties of preimage resistance, second preimage resistance and collision resistance are cited most often. While designers of practical proposals usually try to make their design satisfy some additional properties, most theoretical constructions and their accompanying proofs of security consider these three properties only.

In this paper, we are concerned with a somewhat less popular property, namely near-collision resistance. We want to investigate ways to efficiently find such near-collisions and to compare them with the generic approaches used to find collisions.

2 Background and motivation

2.1 Hash and compression function collisions

In general, a cryptographic hash function maps a message m of arbitrary length to a hash value of fixed length, H : {0, 1}* → {0, 1}^n. In this paper, we consider iterative hash functions H that split the message into blocks of equal size. Upon input of a message m, we apply an injective padding such that the result consists of l blocks m_i, i = 0, 1, ..., l − 1, of q bits each, and then process one block m_i at a time to update the n-bit internal state x_i. The compression function is the same function in every iteration, and is denoted by h:

x_{i+1} = h(x_i, m_i),  i = 0, 1, ..., l − 1.    (1)

Here x_0 is a pre-defined initial state value. The output of the hash function is defined as the final state value: H(m) = x_l.

The strengthened Merkle–Damgård construction (further abbreviated to MD-construction) of cryptographic hash functions [6,16], which is basically (1) where the message padding also includes the bit length of m, is very popular in practical designs like MD5, SHA-1 and the SHA-2 family [18,24]. This is mostly because of its property of preserving the collision resistance of the compression function:

Theorem 1 Let H be a hash function based on the MD-construction and let m ≠ m* be two messages. Then

H(m) = H(m*) ⇒ ∃i : h(x_i, m_i) = h(x_i*, m_i*).    (2)

A solution {(x_i, m_i), (x_i*, m_i*)} to h(x_i, m_i) = h(x_i*, m_i*) where (x_i, m_i) ≠ (x_i*, m_i*) is called a collision for the compression function h.
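The iteration (1) is perhaps easiest to see in code. The following is a minimal sketch, not the construction of any concrete design discussed in this paper: the compression function is a placeholder built from truncated SHA-256, and the block and state sizes are toy values chosen only for illustration.

```python
# Minimal sketch of the iteration (1): a strengthened Merkle-Damgard
# construction built around a toy compression function.  The compression
# function (SHA-256 truncated to n bits) is only a stand-in so that the
# iteration structure is runnable; it is not a real design.
import hashlib

N_BITS = 64            # output/state size n (toy value)
Q_BITS = 128           # message block size q (toy value)

def h(x: bytes, m: bytes) -> bytes:
    """Toy compression function h(x_i, m_i) -> x_{i+1} (n bits)."""
    return hashlib.sha256(x + m).digest()[: N_BITS // 8]

def pad(msg: bytes) -> bytes:
    """Injective MD-strengthening padding: 0x80, zeros, 64-bit bit length."""
    q = Q_BITS // 8
    padded = msg + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % q)
    return padded + (8 * len(msg)).to_bytes(8, "big")

def H(msg: bytes, iv: bytes = b"\x00" * (N_BITS // 8)) -> bytes:
    """Iterate h over the q-bit blocks of the padded message, cf. (1)."""
    x = iv
    padded = pad(msg)
    for i in range(0, len(padded), Q_BITS // 8):
        x = h(x, padded[i : i + Q_BITS // 8])
    return x                      # H(m) = x_l
```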
One early result in the field of hash function cryptanalysis was found by den Boer and Bosselaers, namely that collisions for the compression function of the widely used MD5 hash function can be found easily [7]. Although the methods of [7] cannot be used to construct collisions for MD5, this early result already implied that Theorem 1 cannot be used to prove the security of MD5.

2.2 Near-collisions

In all of the following we work with binary values, where we identify {0, 1}^n with Z_2^n. We denote the standard basis vectors of Z_2^n by e_j, j = 1, ..., n. Let "+" denote the n-bit exclusive-or operation. The Hamming weight of a vector v ∈ Z_2^n is denoted by w(v) = #{ j | v_j = 1} and the Hamming distance of two vectors by d(u, v) = w(u + v). The Handbook of Applied Cryptography defines near-collision resistance as follows:

Definition 1 (Near-Collision Resistance [15, p. 331]) It should be hard to find any two inputs m, m* with m ≠ m* such that H(m) and H(m*) differ in only a small number of bits:

d(H(m), H(m*)) ≤ ε.    (3)

For ease of later use we also give the following definition:

Definition 2 A message pair m, m* with m ≠ m* is called an ε-near-collision for H if (3) holds.

Intuitively speaking, a hash function for which an efficient algorithm is known to construct near-collisions can no longer be considered ideal. A practically more relevant consequence is that for several designs, near-collisions for the compression function can be converted to collisions for the hash function, see Sect. 2.3.

Let δ denote an n-bit vector, possibly of low Hamming weight. In the strict near-collision problem, we want to find two messages m, m* such that H(m*) + H(m) = δ. If δ is fixed beforehand, the strict near-collision problem is not necessarily significantly easier than finding collisions.

2.3 Combining a near-collision and a collision for h to a collision for H

As a motivation for why we should care about near-collisions, we give the following example. Although collisions for the compression function of MD5 can be constructed easily [7], these collisions require a special difference in the state input. To date, there is no algorithm known that can produce collisions for the compression function of MD5 without having a difference in the state input. Since there is no algorithm known to construct message blocks resulting in states with this difference, the collisions for the compression function cannot be converted into collisions for MD5.

A significant contribution of [29] was the description of an efficient algorithm to find collisions for MD5, see Algorithm 1 for a simplified description. The first phase of the algorithm consists of the construction of a near-collision for h, where the output difference is such that in the second phase of the algorithm, it is feasible to construct a collision for h with this difference in the state input.

Algorithm 1 Wang et al.'s algorithm to create collisions for H (simplified).
Input: Initial value x_0 and hash function H with compression function h
  Find m_0, m_0* such that x_1 + x_1* = h(x_0, m_0) + h(x_0, m_0*) = Δ
  Find m_1, m_1* such that h(x_1, m_1) = h(x_1*, m_1*)
Output: m_0, m_0*, m_1, m_1*

For many hash functions using a Davies–Meyer mode iteration function [13], like SHA-1, HAVAL, and reduced variants of SHA-256, it turns out that it is relatively easy to find collisions for the compression function (with differences in the state input).
Hence, if we can also find good methods to construct near-collisions of the right form, then we can use Algorithm 1 to construct collisions. This will also be illustrated for the SHA-3 candidate TIB3 in Sect. 4.

3 Efficiently finding near-collisions

Although collision resistance and (second) preimage resistance are the properties of a hash function that have attracted the most attention in cryptanalysis, the question of near-collision resistance is also of significant importance. Examples are applications that truncate the hash value, in which case a near-collision might be sufficient to thwart a security goal. In Sect. 4, we will also show an example where near-collisions for the compression function can be used to construct collisions for the full hash function.

In this section, we present a new generic method to find near-collisions. First we give a short discussion of generic methods to find collisions. Since hash functions were our main motivation for the underlying research, we will mainly use them to formulate the results that follow. Note that hash functions could just as well be replaced with compression functions or arbitrary random functions.

3.1 Generic collision finding

The generic method for finding collisions for a given hash function is based on the birthday paradox. The basic principle of this attack is that when randomly drawing elements from a set of size 2^n, with high probability a repeated element will be encountered after about √(2^n) drawings [15]. Because of this complexity, these generic birthday attacks are also often called square-root attacks.

When implementing such a square-root attack there are usually several possibilities. The simplest approach is to randomly select messages y_j, compute H(y_j), and store the results in a table until a collision is detected. This approach, which is usually attributed to Yuval [31], requires approximately 2^{n/2} hash function computations and a table of the same size.

If a birthday attack is implemented and run, the memory requirements usually form the bottleneck. Therefore, collision attacks are often implemented by means of cycle finding algorithms. Consider the process where we start with an arbitrary n-bit value y_0 and repeatedly apply H:

y_0 →(H) y_1 →(H) y_2 →(H) y_3 →(H) ···    (4)

This can be seen as a walk on a graph with 2^n nodes, induced by a (pseudo-)random map, namely the hash function. Since the output space is finite, we eventually arrive at the situation that y_{i+1} = H(y_i) = H(y_j) = y_{j+1}. But then, due to the definition of our walk, also y_{i+ℓ} = y_{j+ℓ} for ℓ ≥ 1, i.e., the walk runs into a cycle. Put differently, we can consider y_0, y_1, y_2, ... as an eventually periodic sequence. For this sequence there exist two unique smallest integers µ and λ such that y_i = y_{i+λ} for all i ≥ µ. Here, λ is the cycle length and µ is the tail length. Under the assumption that H behaves like a random mapping, Harris [10] showed that the expected values for λ and µ are about √(π · 2^{n−3}).

There are two well-known techniques based on the above ideas to identify collisions for (pseudo-)random mappings, due to Floyd [12, p. 7] and Brent [2]. Floyd's cycle finding algorithm is widely known and cited. It is based on the observation that for an eventually periodic sequence y_0, y_1, ..., there exists an index i such that y_i = y_{2i}, and the smallest such i satisfies µ ≤ i ≤ µ + λ. Floyd's algorithm only needs a small constant amount of memory, and again under the assumption that H behaves like a random mapping, it can be shown that the expected number of iterations is about 0.94 · 2^{n/2}.
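As an illustration, here is a minimal sketch of Floyd's cycle finding applied to an iterated map, reusing the toy H from Sect. 2.1 (any n-bit-to-n-bit function could be substituted). The helper name and the truncation in the usage example are our own illustrative choices, not part of the original algorithm description.

```python
# Floyd's cycle finding ("tortoise and hare") applied to y_{i+1} = f(y_i).
# Phase 1 uses the observation y_i = y_{2i} for some mu <= i <= mu + lambda;
# phase 2 recovers two distinct inputs that collide under f.  A sketch only:
# f is assumed to map n-bit values to n-bit values, and the tail length mu
# is assumed to be >= 1 (the expected case for a random mapping).

def floyd_collision(f, y0):
    """Return (a, b) with a != b and f(a) == f(b), using O(1) memory."""
    # Phase 1: find an index i >= 1 with y_i == y_{2i}.
    tortoise, hare = f(y0), f(f(y0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    # Phase 2: restart the tortoise at y_0 and advance both in lockstep;
    # just before they meet at the cycle entry y_mu, the two current
    # values form a collision for f.
    tortoise = y0
    ft, fh = f(tortoise), f(hare)
    while ft != fh:
        tortoise, hare = ft, fh
        ft, fh = f(tortoise), f(hare)
    return tortoise, hare

# Example (toy parameters, so that it finishes quickly): collide a 32-bit
# truncation of the toy H from Sect. 2.1.
# f = lambda y: H(y)[:4]
# a, b = floyd_collision(f, b"\x00" * 4)
# assert a != b and f(a) == f(b)
```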
Brent [2] improved on Floyd's algorithm by introducing an auxiliary variable z which stores the value y_{ℓ(i)−1}, where i is the index of the random walk and ℓ(i) is the largest power of 2 less than or equal to i. In other words, ℓ(i) = 2^{⌊log_2(i)⌋}. Brent's method is described in Algorithm 2. On average, Brent found that his algorithm needs twice as many iterations as Floyd's algorithm, that is, 1.9828 · 2^{n/2}. The actual improvement of Brent is that in each iteration only one hash evaluation is necessary instead of three in the case of Floyd.

Algorithm 2 Brent's cycle finding algorithm
Input: Starting point y_0 and hash function H
  z ← y_0, w ← y_0, i ← 0, ℓ ← 1
  while true do
    w ← H(w), i ← i + 1
    if w = z then break end if
    if i ≥ (2ℓ − 1) then z ← w, ℓ ← 2ℓ end if
  end while
  j ← ℓ − 1
Output: (m, m*) with m = H^{i−1}(y_0) and m* = H^{j−1}(y_0)

Apart from these classical examples of cycle finding algorithms, there are various parallelization techniques available, but all result in higher memory requirements [26,27]. Furthermore, we also want to mention a more recent technique due to Nivasch [20], which can be seen as the best technique on average in terms of (hash) function evaluations. A lot of impulses in the development of cycle finding techniques have come from the computation of discrete logarithms in finite groups (e.g. Pollard's rho-method [22]). For a nice treatise we refer to [5, Sect. 19.5.1].

3.2 Generic near-collision attacks

Obviously, Definition 2 includes collisions as well, so the task of finding near-collisions is easier than finding collisions. The goal is now to find a generic method to construct near-collisions more efficiently than the generic methods to find collisions. In all of the following, let

B_r(x) = {y ∈ Z_2^n | d(x, y) ≤ r}

denote the Hamming sphere around x of radius r. Furthermore, we denote by

V(n, r) := |B_r(x)| = \sum_{i=0}^{r} \binom{n}{i}    (5)

the cardinality of any n-dimensional Hamming sphere of radius r. With this notation, B_ε(0) is the set of all vectors having Hamming weight ≤ ε.

A first approach to find ε-near-collisions is a simple extension of the table-based birthday attack, which leads to Algorithm 3.

Algorithm 3 Birthday-like ε-near-collision search
Input: Hash function H
  T = {}
  while true do
    Randomly select a message m and compute H(m)
    if (H(m) + δ, m*) ∈ T for some δ ∈ B_ε(0) and arbitrary m* then
      return (m, m*)
    else
      Add (H(m), m) to T
    end if
  end while
Output: m, m* such that d(H(m), H(m*)) ≤ ε
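A short sketch of Algorithm 3 follows. The random-message source, the toy parameters and the reuse of the toy H from Sect. 2.1 are illustrative assumptions; the helper names are ours.

```python
# Sketch of Algorithm 3: a table-based search for epsilon-near-collisions.
# Every new digest is looked up at all V(n, eps) points of the Hamming
# ball around it, so an epsilon-near-collision with any stored digest is
# detected immediately.
import itertools, os

def hamming_ball(value: int, n: int, radius: int):
    """Yield all n-bit values within Hamming distance `radius` of `value`."""
    for r in range(radius + 1):
        for positions in itertools.combinations(range(n), r):
            delta = 0
            for p in positions:
                delta |= 1 << p
            yield value ^ delta

def near_collision_birthday(H, n=64, eps=2):
    """Return messages (m, m_star) with d(H(m), H(m_star)) <= eps."""
    table = {}                                  # digest (as int) -> message
    while True:
        m = os.urandom(16)                      # random message
        digest = int.from_bytes(H(m), "big")
        for candidate in hamming_ball(digest, n, eps):
            if candidate in table and table[candidate] != m:
                return m, table[candidate]
        table[digest] = m

# Toy run on a 32-bit truncation of the toy H from Sect. 2.1:
# m, m_star = near_collision_birthday(lambda x: H(x)[:4], n=32, eps=2)
```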
Lemma 1 If we assume that H acts like a random mapping, the average number of messages that we need to hash and store in Algorithm 3 before we find an ε-near-collision is

≈ 2^{n/2} / √(V(n, ε)).    (6)

Proof Consider a set C = {y_0, ..., y_{L−1}} of L independent, uniformly distributed random variables with values in Z_2^n. We consider the random variables d(y_i, y_j) and let furthermore χ be the characteristic function of the event d(y_i, y_j) ≤ ε, that is,

χ(d(y_i, y_j) ≤ ε) = 1 if d(y_i, y_j) ≤ ε, and 0 otherwise.

Now, for i ≠ j we consider the number N_C(ε) of pairs (y_i, y_j) from C which have d(y_i, y_j) ≤ ε (that is, the number of ε-near-collisions):

N_C(ε) = \sum_{i=0}^{L−1} \sum_{j=0}^{i−1} χ(d(y_i, y_j) ≤ ε).

The expected value of this sum of pairwise independent random variables can be computed as

E(N_C(ε)) = \binom{L}{2} V(n, ε) 2^{−n}.

Therefore, if we choose L such that L(L − 1) > 2^{n+1}/V(n, ε), the expected number of ε-near-collisions is at least 1. ⊓⊔

Remark 1 The proof of the previous lemma stems in part from the proof of [1, Th. 2.1], where similar arguments apply in the context of random codes.

We see that, depending on ε, finding ε-near-collisions is clearly easier than finding collisions. The question that now arises is whether or not we can find a memoryless algorithm for the search for ε-near-collisions. Unfortunately, we cannot use cycle finding methods like Algorithm 2 directly, because a cycle only occurs when there is a full collision. There is no such thing as a "near-cycle".

A first approach to find near-collisions in a memoryless way is as follows. Let I = {j_1, ..., j_ε} ⊆ {1, ..., n} be a set of mutually distinct indices. Let p_I denote the linear projection map on the space Z_2^n which sets the bits of its argument to zero at the positions in I, that is, p_I : Z_2^n → Z_2^n with

p_I(x)_j = x_j if j ∉ I, and 0 if j ∈ I.

It follows that if p_I(H(m)) = p_I(H(m*)), then H(m) and H(m*) can differ only in the bits determined by I; hence m, m* form an ε-near-collision. Now we are again in the position to apply a cycle finding algorithm. Since we know that dim(Im(p_I)) = n − ε, the expected number of iterations in Algorithm 2 is reduced to about 2^{(n−ε)/2}. This is a performance improvement factor of 2^{ε/2} compared to the search for full collisions. On the negative side, this approach can only find a fraction of all possible ε-near-collisions, namely

2^ε / V(n, ε).    (7)

We can generalize this approach by replacing the projection p_I by a more general map g. Ideally, we would like to have a one-to-one correspondence between ε-near-collisions (ε ≥ 1) for H and collisions for g ∘ H:

d(H(m), H(m*)) ≤ ε ⇔ g(H(m)) = g(H(m*)).    (8)

However, we can show a negative result in this direction.

Lemma 2 Let ε ≥ 1, let H be a hash function and let g be a function such that (8) holds. Then g is a constant map and d(H(m), H(m*)) ≤ ε for all m, m*.

Proof For an arbitrary message m, we have

d(H(m), H(m) + e_j) = w(e_j) = 1 ≤ ε

and thus g(H(m)) = g(H(m) + e_j) for all j ∈ {1, ..., n}. It is easy to see that g is constant on the span of the e_j, which is all of Z_2^n. But this implies d(H(m), H(m*)) ≤ ε for all m, m*, which is clearly not the case for the interesting hash functions H. ⊓⊔

3.3 An approach using coding theory

The memoryless method to find near-collisions based on the projection map p_I introduced in the previous section suffers from the fact that only a small fraction (7) of near-collisions can be detected. Our solution to improve upon this approach is to make use of the theory of covering codes. Let C be a binary code of length n with K codewords. Note that in the rest of the paper, the length of a code and the output length of a hash function will both be denoted by n, because they coincide in all our applications.

Definition 3 ([21]) The covering radius ρ of a binary code C is the smallest integer ρ such that every vector in Z_2^n is at a distance of at most ρ from a codeword of C, i.e.,

ρ(C) = max_{x ∈ Z_2^n} min_{c ∈ C} d(x, c).    (9)
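To make Definition 3 concrete, the following sketch computes ρ(C) by exhaustive search for a small code. The generator matrix is one standard (and here assumed) choice for the [7,4] Hamming code; the search is exponential in n and is meant only to illustrate the definition, not to be run at cryptographic sizes.

```python
# Brute-force evaluation of the covering radius (9) for a small binary
# code, here the [7,4] Hamming code given by a generator matrix.
from itertools import product

def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def covering_radius(codewords, n: int) -> int:
    """rho(C) = max over x in Z_2^n of min over c in C of d(x, c)."""
    return max(min(hamming_distance(x, c) for c in codewords)
               for x in range(2 ** n))

# Rows of a generator matrix of a [7,4] Hamming code, written as 7-bit
# integers (one of several equivalent choices).
G = [0b1000011, 0b0100101, 0b0010110, 0b0001111]
codewords = set()
for bits in product([0, 1], repeat=4):
    c = 0
    for bit, row in zip(bits, G):
        if bit:
            c ^= row
    codewords.add(c)

print(len(codewords), covering_radius(codewords, 7))   # expect 16 and rho = 1
```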
In general, if the minimum distance of a code is the parameter of interest, we speak of error-correcting codes, whereas if the emphasis is on the covering radius, we speak of covering codes. For a thorough introduction to covering codes we refer to the monograph [4].

Let H be a hash function of output size n. Let C be a code of the same length n, size K and covering radius ρ(C), and assume there exists an efficiently computable map g satisfying

g : Z_2^n → C,  x ↦ c  with  d(x, c) ≤ ρ(C).    (10)

In other words, g maps every vector of Z_2^n to a codeword at distance ρ(C) or less. For example, the map g can be the decoding map of C, if decoding can be done efficiently. Note however that it is not necessary that g maps every vector to the closest codeword; any function satisfying (10) would serve our purpose. This weaker requirement may allow to replace the decoding map by a faster alternative. The approach outlined above is now summarized in Algorithm 4.

Algorithm 4 Main algorithm to find memoryless near-collisions for H.
Input: Starting point y_0 and hash function H of length n, a code C of length n, size K, covering radius ρ and decoding function g satisfying (10).
  Apply Algorithm 2 to (g ∘ H) and y_0
Output: Pair (m, m*) such that m ≠ m* and d(H(m), H(m*)) ≤ 2ρ

We are now in the position to state our main result:

Theorem 2 If we assume that g ∘ H acts like a random mapping, in the sense that the expected cycle and tail lengths are the same as for the iteration of a truly random mapping on a space of size K, then Algorithm 4 finds 2ρ(C)-near-collisions for H with a complexity of O(√K) and with virtually no memory requirements.

Proof Assume that we have two inputs m, m* to the hash function H with m ≠ m*. If g(H(m)) = g(H(m*)) is satisfied, we can deduce that

d(H(m), H(m*)) ≤ 2ρ(C),

that is, every collision for g ∘ H corresponds to an ε-near-collision for H with ε = 2ρ(C). In order to find these messages m, m* we can apply Algorithm 2 to the function g ∘ H to find indices i ≠ j such that (g ∘ H)^i(y_0) = (g ∘ H)^j(y_0) for some starting point y_0, or in other words, we get m, m* with m ≠ m* such that g(H(m)) = g(H(m*)). Since g has an output space of size K, the expected complexity of the described method will be O(√K) and there are virtually no memory requirements. ⊓⊔

Remark 2 We see that in our setting, the length of the code is determined by the size of the hash digest and the covering radius ρ is determined by the maximum weight that the near-collisions may have. The efficiency of our approach is therefore determined by the size of the code. The task is thus to find a code C with K as small as possible. However, the computability of the function g defined in (10) also plays a crucial role. An evaluation of g should be efficient when compared to a hash function call. The actual task is thus to find a code C with given length n and covering radius ρ, such that the size of C is as small as possible and decoding can be done efficiently.

The task pointed out in the previous remark is a central problem in the field of covering codes. In all of the following we denote by K(n, ρ) the minimum size of a binary code of length n and covering radius ρ, and by k(n, ρ) the smallest dimension of a binary linear code of length n and covering radius ρ (that is, for binary linear codes we have K(n, ρ) = 2^{k(n,ρ)}).
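Before turning to the size bounds, note that Algorithm 4 is, in code, just the composition of a covering-code decoder with the cycle finding of Sect. 3.1. A minimal sketch, reusing the hypothetical floyd_collision helper from above (the paper itself uses Brent's Algorithm 2, which needs fewer hash evaluations per step); a concrete choice of g is sketched in Sect. 3.4.

```python
# Sketch of Algorithm 4: a memoryless 2*rho-near-collision search for H,
# realized as a collision search for g o H.  `g` is any map satisfying
# (10); `floyd_collision` is the sketch from Sect. 3.1 and could be
# replaced by Brent's Algorithm 2.

def algorithm4(H, g, y0):
    """Return (m, m_star), m != m_star, with d(H(m), H(m_star)) <= 2*rho(C)."""
    f = lambda y: g(H(y))            # iterate g o H instead of H
    return floyd_collision(f, y0)
```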
A well-known bound with respect to the size problem is the Sphere Covering Bound, which states that K(n, ρ) satisfies

K(n, ρ) ≥ 2^n / V(n, ρ),    (11)

where V(n, ρ) is as in (5). Another general bound is due to van Wee (see [4, Theorem 6.4.4]), which states that for n > ρ

K(n, ρ) ≥ (n − ρ + µ) 2^n / ((n − ρ) V(n, ρ) + µ V(n, ρ − 1)),    (12)

with

µ = (ρ + 1) ⌈(n + 1)/(ρ + 1)⌉ − (n + 1).

Whenever µ ≠ 0, (12) improves over (11). An extensive amount of work in the theory of covering codes is devoted to the improvement of upper and lower bounds on the size of covering codes and to ways to construct codes meeting these bounds (see [25,28], [4, Chap. 6] or [11]). We will discuss some possible constructions in the next section. Finally, we also want to note that covering codes have been mentioned before in the context of (keyed) hash functions in [3]. Furthermore, during the preparation of this manuscript we learned about the interesting paper [8], which treats the connection between covering codes and locality sensitive hashing.

3.4 Hamming codes and ε-near-collisions with ε = 2 and ε = 4

An important class of codes are the Hamming codes H_r. Hamming codes are linear codes of length n = 2^r − 1, dimension k = 2^r − 1 − r = n − r, minimum distance d = 3 and covering radius ρ = 1. They correct every 1-bit error, decoding can be done very efficiently, and they are perfect codes. Note that for perfect codes we have equality in (11), that is, they are optimal covering codes for their respective lengths. There is only one other non-trivial binary perfect code known, namely the Golay code, which has length 23, dimension 12 and covering radius 3. In the following, we will write [n, k] code for a linear code of length n and dimension k, i.e., a binary code having 2^k codewords.

Decoding a Hamming code is done via syndrome decoding. In general, syndrome decoding of a binary linear code makes use of a table of size 2^{n−k} which stores for every syndrome s the corresponding coset leader c_s, i.e., the vector of smallest Hamming weight in the coset of all vectors having syndrome s. A vector y ∈ Z_2^n is then decoded to y + c_s. For Hamming codes, the table of size 2^{n−k} can be omitted, since the coset leaders are unique and known for a given syndrome s.

Thus, using a Hamming code in Theorem 2, we get a memoryless algorithm to find near-collisions of weight at most 2 in words of size n = 2^r − 1. The performance improvement factor is

2^{r/2} = 2^{log_2(n+1)/2} ≈ √n,

compared to a generic collision search algorithm.

In practice, most common hash functions have an output size of n = 2^r. Unfortunately, for such lengths, no perfect codes are known. In view of (11), we therefore have to search for codes leading to the smallest possible bounds for K(n, ρ). In [28] it is shown that

K(2^r, 1) = 2^{2^r − r}  for all r ≥ 1.    (13)

The following lemma [9] will be used frequently throughout the remainder of this section:

Lemma 3 For linear codes C_1 = [n_1, k_1] with ρ(C_1) = ρ_1 and C_2 = [n_2, k_2] with ρ(C_2) = ρ_2, the direct sum of C_1 and C_2 is defined as

C_1 ⊕ C_2 = {(c_1, c_2) | c_1 ∈ C_1, c_2 ∈ C_2}.

Then C_1 ⊕ C_2 is a linear [n_1 + n_2, k_1 + k_2] code with covering radius ρ = ρ_1 + ρ_2.

To construct a code meeting the bound (13), one can start with a Hamming code H_r and extend it. In terms of direct sums of codes, this can be realized by H_r ⊕ U_1, where U_ℓ = Z_2^ℓ is the trivial code of length ℓ. In U_ℓ, every word is a codeword, and therefore ρ(U_ℓ) = 0. By Lemma 3 we end up with a code of length 2^r, dimension 2^r − r and covering radius 1. By (13) we know that no smaller code with this property exists. Furthermore, the code C = H_r ⊕ U_1 from above is as easy to decode as H_r, since decoding can be done componentwise in the direct sum. We can summarize:

Proposition 1 Let H be a hash function of output length n = 2^r for r ≥ 1. Then, for the approach outlined in Theorem 2, i.e., a generic method capable of finding 2-near-collisions for H, the choice of the code C = H_r ⊕ U_1 is optimal.
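A concrete map g satisfying (10) for n = 2^r can be sketched as follows, using the standard syndrome-decoding trick for Hamming codes (positions numbered so that the syndrome equals the index of the flipped bit). The bit layout and the parameter r = 3 in the example are our own illustrative assumptions.

```python
# Syndrome decoding for the code C = H_r (+) U_1 of Proposition 1.
# The Hamming code acts on the first 2^r - 1 bit positions (numbered
# 1..2^r - 1); the trailing bit is the U_1 component and stays untouched.

def hamming_decode_g(y: int, r: int) -> int:
    """Map an n-bit integer (n = 2^r) to a codeword of H_r (+) U_1
    at Hamming distance at most rho = 1, cf. (10)."""
    n_h = 2 ** r - 1                     # length of the Hamming component
    u_bit = y & 1                        # U_1 bit, kept as is
    word = y >> 1                        # Hamming component, bit j at 1 << (j-1)
    syndrome = 0
    for j in range(1, n_h + 1):
        if (word >> (j - 1)) & 1:
            syndrome ^= j                # column j of the parity-check matrix
    if syndrome != 0:
        word ^= 1 << (syndrome - 1)      # flip the indicated position
    return (word << 1) | u_bit

# Example with r = 3 (n = 8): every 8-bit value maps to a codeword at
# distance <= 1, so collisions of g o H are 2-near-collisions of H.
# codewords = {hamming_decode_g(y, 3) for y in range(256)}
# assert len(codewords) == 2 ** (2**3 - 3)   # 2^(2^r - r) = 32 codewords
```

Composed with a hash function of output length 2^r bits and the cycle-finding sketch of Sect. 3.1 (converting between bytes and integers as needed), this g would realize the memoryless 2-near-collision search of Proposition 1.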
We can also aim for near-collisions of weight ≤ 4, i.e., we take ρ = 2. The sphere covering bound (11) in the case n = 2^r then implies

K(2^r, 2) ≥ 2^{2^r + 1} / (2 + 2^r + 2^{2r}) ≥ 2^{2^r − 2r}.

We do not have an exact formula like (13), but we can use a result from [9] about linear codes with covering radius 2.

Lemma 4 Let k(n, 2) be the smallest dimension k such that a binary linear [n, k] code with ρ = 2 exists. Then for n ≥ 28,

1. if 2^{j+1} − 4 ≤ n < 3 · 2^j − 4, then k(n, 2) ∈ {n − 2j − 2, n − 2j − 1, n − 2j};
2. if 3 · 2^j − 4 ≤ n < 2^{j+2} − 4, then k(n, 2) ∈ {n − 2j − 2, n − 2j − 1}.

When applying Lemma 4 to the special case n = 2^r, we end up with

k(2^r, 2) ∈ {2^r − 2r, 2^r − 2r + 1, 2^r − 2r + 2}.

In order to come as close as possible to the above bounds for ρ = 2, we again use the direct sum construction to build the following code:

C = H_{r−1} ⊕ H_{r−1} ⊕ U_2,    (14)

that is, the direct sum of two Hamming codes of length 2^{r−1} − 1 and the trivial code of length 2. From Lemma 3 we get that this code has covering radius 2, length n = 2^r and dimension k = 2^r − 2r + 2. We have no construction reaching one of the two lower possible dimensions k = 2^r − 2r + 1 or k = 2^r − 2r (for n ≥ 128).

Remark 3 Note that for special cases, better constructions than the direct sum might be available. One example is the amalgamated direct sum construction, also introduced in [9].

3.5 Near-collisions for general n and higher weight

The last construction can be further generalized. One way to go is to consider a larger covering radius ρ, and another is to consider output lengths n that are not a power of 2. The construction principle we propose does not claim to be optimal; our objective is to use only simple, well-understood codes with nice properties. We can safely assume ρ < ⌊n/4⌋ when talking about "near"-collisions.

Let ρ be a given covering radius and consider a hash function H with arbitrary output length n. The idea is now to construct a code of length n by using the direct sum construction with ρ suitable Hamming codes and filling the remaining space with U_ℓ codes. This renders a code which can be decoded efficiently, and we will now prove a result on the size of codes based on the above construction. For this, we define for i = 1, 2, ... the numbers N_i = 2^i − 1 to be the possible lengths of the Hamming codes H_i. Let D = {0, 1, ..., ρ} be the set of digits. We are interested in digital expansions x = \sum_{i≥1} d_i N_i with d_i ∈ D and d_i ≠ 0 for only finitely many i. For ease of notation, we will denote by d · H_i the direct sum of d copies of H_i. Now we can prove the following:

Construction 1 Let n be given and let ρ < ⌊n/4⌋. We consider digital expansions \sum_{i≥1} d_i N_i with d_i ∈ D which are smaller than or equal to n. For a given expansion (d_i)_{i≥1}, let s((d_i)_{i≥1}) denote the difference n − \sum_{i≥1} d_i N_i. We assume that additionally the following holds:

\sum_{i≥1} d_i = ρ,  and  \sum_{i≥1} d_i · i is maximal.

Then, the code

C = \bigoplus_{i≥1} d_i · H_i ⊕ U_{s((d_i)_{i≥1})}    (15)

has length n, covering radius ρ, and the dimension of the code is

k = n − \sum_{i≥1} d_i · i.
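Before discussing the properties of Construction 1, here is an illustrative search sketch for the digit expansion it uses. Function names and the exhaustive strategy are our own; the examples in the comments are the codes appearing later in the paper.

```python
# Search for the digit expansion of Construction 1: choose rho Hamming-code
# lengths N_i = 2^i - 1 (with repetition) whose sum is at most n, maximizing
# sum(d_i * i); the remaining s positions are covered by the trivial code U_s.
from itertools import combinations_with_replacement

def construction_1(n: int, rho: int):
    """Return (d, k): d maps i -> d_i, and k = n - sum(d_i * i)."""
    assert rho < n // 4
    max_i = n.bit_length()                       # indices with N_i possibly <= n
    best = None
    for choice in combinations_with_replacement(range(1, max_i + 1), rho):
        length = sum(2 ** i - 1 for i in choice)
        if length <= n:
            gain = sum(choice)                   # equals sum over i of d_i * i
            if best is None or gain > best[0]:
                best = (gain, choice)
    gain, choice = best
    d = {i: choice.count(i) for i in set(choice)}
    return d, n - gain

# Examples: construction_1(128, 2) gives {6: 2}, i.e. H_6 (+) H_6 (+) U_2
# with k = 116; construction_1(128, 3) gives H_5 (+) H_5 (+) H_6 (+) U_3
# with k = 112 (both used in Sect. 4).
```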
Properties of Construction 1: Since the construction is based on Hamming codes and the trivial codes U_ℓ = Z_2^ℓ, we certainly need ρ Hamming codes in the direct sum construction, since the covering radius ρ(U_ℓ) = 0. Multiple copies of H_i are allowed, so we choose a combination where the lengths N_i = 2^i − 1 of the Hamming codes satisfy \sum_{i≥1} d_i N_i ≤ n. In general, we cannot reach n exactly because of the requirement \sum_{i≥1} d_i = ρ, that is, the sum of the digits must be exactly ρ. Let s((d_i)_{i≥1}) be the error we make when approximating n. From Lemma 3 we know that a code constructed as in (15) has length

\sum_{i≥1} d_i N_i + s((d_i)_{i≥1}) = n,

and that the covering radius is exactly ρ. For the dimension of the code we get

k = \sum_{i≥1} d_i (N_i − i) + s((d_i)_{i≥1}) = \sum_{i≥1} d_i (N_i − i) + n − \sum_{i≥1} d_i N_i = n − \sum_{i≥1} d_i · i,

which is smallest possible if \sum_{i≥1} d_i · i is maximal.

Remark 4 Since the direct sum of perfect codes is no longer perfect, we obviously get further away from the sphere covering bound (11). Nevertheless, since our construction also has the requirement of being easy to decode, we see no construction that is more efficient when subject to the same restrictions. When restricting Theorem 2 to codes constructed as direct sums of Hamming codes H_r and Z_2^ℓ, Construction 1 provides the optimal solution.

Table 1 For given ρ = 1, ..., 5, the table compares the base-2 logarithms of the complexity of the standard table-based approach with ε = 2ρ (6), the complexity induced by the van Wee bound (12), and the complexity of our construction (15), for n = 128, 160 and 512

          n = 128               n = 160               n = 512
ρ      (6)   (12)   (15)     (6)   (12)   (15)     (6)    (12)    (15)
1     57.5   60.5   60.5    73.2   76.3   76.5    247.5  251.5   251.5
2     52.3   57.5   58.0    67.7   73.2   74.0    240.3  247.5   248.0
3     47.8   54.8   56.0    62.8   70.3   71.5    233.8  243.8   245.0
4     43.8   52.3   54.0    58.5   67.7   69.5    227.7  240.3   242.0
5     40.1   50.0   52.5    54.4   65.2   67.5    221.9  237.0   239.5

Of course, we also have to note the drawback that our construction so far only allows us to search for ε-near-collisions with even ε (since ε = 2ρ). This issue will be addressed in the following section.

3.6 Additional thoughts and probabilistic considerations

We want to conclude this section with two observations. First, we note that the projection-based approach described in Sect. 3.2 can also be seen in the context of our coding-based solution. To be more precise, in the projection case we cover Z_2^n with 2^{n−ε} sets which all have size 2^ε. The representative of each set is the vector having zeros in the ε positions of the predefined set I, and p_I then maps a given vector to its representative. Basically, we have a "code" that does not take into account its parity bits. On the other hand, when we look at Hamming spheres of radius ε around these representatives, we clearly observe significant overlap.

There is, however, a simple idea to improve the projection-based approach, namely by enlarging the set of indices I that are then set to zero. Assume we want to find ε-near-collisions. It is not forbidden to take a set of indices I with |I| > ε. A cycle finding method applied to p_I ∘ H has an expected complexity of 2^{(n−|I|)/2} and clearly finds two messages m, m* such that d(H(m), H(m*)) ≤ |I|. The probability that these two messages m, m* satisfy d(H(m), H(m*)) ≤ ε can be computed to be

2^{−|I|} \sum_{i=0}^{ε} \binom{|I|}{i}.    (16)
If we set for example |I| = 2ε + 1, (16) implies that a collision for p_I ∘ H is an ε-near-collision with probability 1/2. For a truly memoryless approach, we can treat multiple runs of the cycle-finding algorithm as independent events. Then, the expected complexity to find an ε-near-collision is obtained by multiplying the expected complexity to find a cycle by the expected number of times that we have to run the cycle-finding algorithm, i.e., one over the probability that a single run finds an ε-near-collision. In other words, we end up with an expected complexity of

2^{(n+|I|)/2} \left( \sum_{i=0}^{ε} \binom{|I|}{i} \right)^{−1}.    (17)

Finding the best trade-off for this probabilistic approach corresponds to finding the minimum value of (17) with respect to |I| for given ε. In Table 2 we give some optimal combinations of ε and |I|.

Table 2 Optimal values of |I| to minimize (17) for small values of ε

ε     1   2   3   4   5   6   7   8   9  10
|I|   2   5   8  11  15  18  21  25  28  32

A similar approach can also be taken for the coding-based method to find near-collisions. Assume that Algorithm 4 produces two messages m, m* such that g(y) = g(y′) with y = H(m) and y′ = H(m*). Now we know for sure that d(y, y′) ≤ 2ρ(C), but what do we know about the distribution of d(y, y′)?

For general codes, this question is difficult to answer, but for a Hamming code H_r of length n = 2^r − 1 it is fairly easy. Since ρ(H_r) = 1, and g(y) = g(y′) implies that both y and y′ lie in the same Hamming sphere of radius 1 around some codeword, the distance d(y, y′) must be either 0, 1 or 2 by the triangle inequality. For an ideal hash function, we consider y, y′ to be uniformly distributed in Z_2^n. The Hamming sphere B_1(c) contains n + 1 elements, namely the codeword c and c + e_i for i ∈ {1, ..., n}.

Proposition 2 Let y, y′ be taken independently and uniformly from B_1(c) for some codeword c ∈ H_r of length n = 2^r − 1. Then

d(y, y′) = 0 with probability (n + 1)/(n + 1)^2,
d(y, y′) = 1 with probability 2n/(n + 1)^2,    (18)
d(y, y′) = 2 with probability n(n − 1)/(n + 1)^2.

Proof There are (n + 1)^2 possibilities to choose a pair (y, y′) in B_1(c). In n + 1 cases y′ = y and thus d(y, y′) = 0. In order to have d(y, y′) = 1 we need to have either y = c and y′ ≠ c or vice versa. This explains the second probability. The probability for d(y, y′) = 2 results from y ≠ c, y′ ≠ c and y ≠ y′. ⊓⊔

These probabilities can also be used for the codes coming from Construction 1. The distribution of d(y, y′) can then be described as the convolution of ρ distribution functions of the form (18). For example, the probability that a 2ρ(C)-near-collision found by a code C as in (15) is in fact a (2ρ(C) − 1)-near-collision can be computed as follows. The probability that d(y, y′) = 2ρ(C) is the product of the probabilities that the restriction of y, y′ to every independent Hamming code contributing to the direct sum C has distance 2. Hence, the probability that d(y, y′) ≤ 2ρ(C) − 1 is

1 − \prod_{i≥1} \left( \frac{(2^i − 1)(2^i − 2)}{2^{2i}} \right)^{d_i}.    (19)

More generally, we can play the same game as with the projection approach and fix the near-collision parameter ε. Afterwards, we successively increase the covering radius ρ, follow Construction 1, and compute again the product of the complexity of the cycle finding algorithm in the constructed code and the reciprocal of the probability that a resulting 2ρ(C)-near-collision is in fact an ε-near-collision.
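The trade-offs above are easy to evaluate numerically. The following sketch computes the minimizer of (17) for small ε, which should match Table 2, and the probability (19) for a code given by its digit expansion; the function names are ours.

```python
# Numerical evaluation of the probabilistic trade-offs: the minimizer of
# (17) for the enlarged-projection approach, and the probability (19) for
# a Construction-1 code given as a digit map d: i -> d_i.
from math import comb, log2

def projection_cost_log2(n: int, eps: int, size_i: int) -> float:
    """log2 of (17): cycle finding on p_I o H times expected repetitions."""
    return (n + size_i) / 2 - log2(sum(comb(size_i, j) for j in range(eps + 1)))

def best_projection_size(n: int, eps: int) -> int:
    """|I| minimizing (17); the optimum does not depend on n."""
    return min(range(eps, 4 * eps + 4),
               key=lambda size_i: projection_cost_log2(n, eps, size_i))

def prob_leq(d: dict) -> float:
    """Probability (19) that a 2*rho-near-collision found via (15) already
    has weight <= 2*rho - 1."""
    prod = 1.0
    for i, d_i in d.items():
        prod *= (((2 ** i - 1) * (2 ** i - 2)) / 2 ** (2 * i)) ** d_i
    return 1 - prod

# print([best_projection_size(128, e) for e in range(1, 11)])  # cf. Table 2
# print(prob_leq({5: 2, 6: 1}))   # about 0.2134, used again in Sect. 4
```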
Determining a closed expression for the complexity, like (17) in the projection case, seems out of reach, since the dependence on the tuneable covering radius is too involved. Numerical experiments for relevant values of n and ε indicate, however, that increasing the covering radius rarely brings an advantage. We refer to Sect. 4 for a concrete example.

4 Illustration: the SHA-3 candidate TIB3

In this section, we demonstrate the novel approach introduced above on a practical example, in this case a recently proposed hash function. In Sect. 2, we discussed Theorem 1 about the MD-construction, which stated that collisions for the hash function H imply collisions for the compression function h. Sometimes we are also able to convert collisions for h into collisions for H.

4.1 Overview of the SHA-3 candidate TIB3

TIB3 [17] is an iterated hash function based on the Merkle–Damgård design principle and was proposed as a candidate for the NIST SHA-3 competition [19]. TIB3 comes in two flavors: TIB3-256 and TIB3-512. TIB3-256 processes message blocks of 512 bits and a 256-bit state in order to produce 224 or 256 bits of hash output, whereas TIB3-512 operates on message blocks of 1024 bits and a state of size 512 bits and produces hash values of 384 or 512 bits.

Let m = M_1 ∥ M_2 ∥ ··· ∥ M_t be a t-block message (after padding). Then, the hash value h = H(m) is computed as follows:

H_0 = IV_H,  M_0 = IV_M
H_i = h_T(H_{i−1}, M_i ∥ M_{i−1})  for 1 ≤ i ≤ t
H_{t+1} = h_T(H_t, 0 ∥ H_t ∥ M_t) = h

where IV_H and IV_M are predefined initial values. Note that each message block is used in two compression function calls. The compression function h_T is used in Davies–Meyer mode [13] and consists of two parts: the key schedule and the state update transformation. The state update of the compression function has 16 rounds, consisting of additions modulo 2^64, bitwise exclusive-ORs, bitwise parallel nonlinear functions and fixed rotations. We refer to [17] for a complete description of the hash function.

In the following, we focus on TIB3-512 in order to demonstrate our ideas. Basically, our illustration relies on the attacks presented by Mendel and Schläffer in [14]. We note that the 512-bit and the 256-bit versions of TIB3 are closely related, since TIB3-512 is more or less a parallel invocation of two TIB3-256 instances. In all of the following, the 512-bit state and hash value are considered as values in (Z_2^128)^4.

In [14] it was shown that the compression function of TIB3-512 exhibits a similar weakness as the MD5 compression function, see Sect. 2.3. Namely, it is relatively easy to find internal state values and message blocks such that

h_T(x_i, m_i) = h_T(x_i + Δ, m_i),    (20)

where Δ is one of a set of special difference vectors of the form

Δ = (0, δ, δ, δ) ∈ (Z_2^128)^4.    (21)

Take one application of the compression function as the unit operation. Then, according to [14], the complexity Q to find a solution m_i for (20) is

Q = 2^{24} if δ = e_j and j ∈ {64, 128},
Q = 2^{48} if δ = e_j and j ∈ {1, ..., 63, 65, ..., 127},

and we have Q ≤ 2^{240} if δ = e_{j_1} + e_{j_2} + e_{j_3} + e_{j_4} + e_{j_5} with j_1, j_2, j_3, j_4, j_5 ∈ {1, ..., 128}. For δ vectors with a higher Hamming weight, the complexity becomes larger than 2^{256}, hence worse than the generic complexity of finding a collision in a 512-bit hash function. Note that by this construction we get \sum_{i=0}^{5} \binom{128}{i} ≈ 2^{28} different Δ vectors.
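A quick numeric check of this counting argument (the quoted birthday complexity is the figure from [14] discussed in the next paragraph):

```python
# Number of admissible difference vectors Delta = (0, delta, delta, delta)
# with w(delta) <= 5, and the resulting birthday complexity from [14].
from math import comb, log2

num_deltas = sum(comb(128, i) for i in range(6))
print(log2(num_deltas))            # about 28
print((512 - 28) / 2)              # 242
```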
In [14], near-collisions having a difference vector Δ from this set are constructed by means of a classical birthday attack, hence with a complexity of 2^{(512−28)/2} = 2^{242} compression function evaluations. The memory requirements were reduced to about 2^{100} using the method of distinguished points [23].

4.2 Memoryless near-collisions for h_T

We will now show how we can construct near-collisions for h_T with suitable output difference Δ in a memoryless way. The combination of these near-collisions with the collisions for h_T will then result in full collisions. We first observe that the difference vectors Δ in (21) are of a special form: the nonzero bits occur three times in the same positions. The probability that a randomly found near-collision with weight 15 shows this structure is negligible. Therefore, we introduce an additional linear output transformation. We define

h_RT(x_i, m_i) = h_T(x_i, m_i) × ⎛ 0 0 1 0 ⎞
                                ⎜ 1 1 0 0 ⎟
                                ⎜ 0 1 1 1 ⎟
                                ⎝ 0 0 1 1 ⎠.

It can easily be verified that

h_T(x_i, m_i) + h_T(x_i, m_i*) = (0, δ, δ, δ)    (22)

if and only if

h_RT(x_i, m_i) + h_RT(x_i, m_i*) = (δ, 0, 0, 0).    (23)

Now we want to show how to use the approach of Sect. 3.3 to efficiently find near-collisions of the form (23) for the function h_RT. Since w(δ) = 5, and we have found an almost optimal code of length 2^r with covering radius 2, we choose the code from (14),

C = H_6 ⊕ H_6 ⊕ U_2,

with length n = 128 and k = 116. Let g be the decoding function of C satisfying (10), i.e., g efficiently maps a 128-bit vector to a codeword at distance at most 2. With g, we define the function

f : (Z_2^128)^4 → (Z_2^128)^4,  (A, B, C, D) ↦ (g(A), B, C, D),    (24)

and finally, we can apply a cycle finding method to the function f ∘ h_RT. Theorem 2, together with the generic collision complexity in the last three 128-bit words, leads to a complexity of about

2^{(116 + 3·128)/2} = 2^{250},    (25)

but with virtually no memory requirements. Compared to the attack of [14], at the cost of a higher complexity by a factor of 2^8, we can eliminate the memory requirements of 2^{100}. Although an attack with a complexity of 2^{250} is not feasible, NIST stated explicitly in its requirements that a result like this should not exist for SHA-3 [19].

Remark 5 Obviously, by taking ρ = 2 we cannot find near-collisions with Hamming weight equal to 5. We can now make use of the discussion in Sect. 3.6. When using ρ = 3, we can choose the code

C = H_5 ⊕ H_5 ⊕ H_6 ⊕ U_3

having length 128, dimension k = 112 and ρ = 3. Then, following (19), the probability of finding a near-collision of weight ≤ 5 is

p = 1 − ((2^5 − 1)(2^5 − 2)/2^{10})^2 · ((2^6 − 1)(2^6 − 2)/2^{12}) ≈ 0.2134.

Replacing k = 116 in (25) with k = 112 leads to a complexity of 2^{248}; however, due to the probability p, we have to repeat the attack p^{−1} times and thus end up again with ≈ 2^{250} compression function computations. However, now we can also find near-collisions of weight 5. As suggested in Sect. 3.6, increasing the covering radius further while still looking for 5-near-collisions would have the effect that the dimension of the code, and thus the complexity of the cycle-finding part, decreases. The probability that the found 2ρ-near-collision has weight ≤ 5, however, decreases even faster, and therefore the overall complexity increases.
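These figures can be checked numerically. The sketch below reuses the hypothetical construction_1 helper from Sect. 3.5 and evaluates, for each covering radius, the cycle-finding cost 2^{(k+3·128)/2} divided by the probability (obtained by convolving the per-component distribution (18)) that the resulting 2ρ-near-collision has weight at most 5; the values printed should correspond to Table 3.

```python
# Numerical check of Remark 5 and Table 3 (reuses construction_1 from the
# Sect. 3.5 sketch).
from math import log2

def weight_at_most(d: dict, eps: int) -> float:
    """P(d(y, y') <= eps) for a Construction-1 code with digits d,
    via the convolution of the per-component distribution (18)."""
    dists = [1.0]                                   # distribution of the weight
    for i, d_i in d.items():
        n_i = 2 ** i - 1
        p = [(n_i + 1) / (n_i + 1) ** 2,            # (18): distance 0
             2 * n_i / (n_i + 1) ** 2,              #       distance 1
             n_i * (n_i - 1) / (n_i + 1) ** 2]      #       distance 2
        for _ in range(d_i):
            new = [0.0] * (len(dists) + 2)
            for w, pw in enumerate(dists):
                for dw in range(3):
                    new[w + dw] += pw * p[dw]
            dists = new
    return sum(dists[: eps + 1])

for rho in range(2, 10):
    d, k = construction_1(128, rho)
    cost = (k + 3 * 128) / 2 - log2(weight_at_most(d, 5))
    print(rho, k, round(cost, 2))      # cf. Table 3: 250, 250.23, 251.24, ...
```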
Table 3 displays the numerical values of the complexities when using a code of length n = 128 and covering radius ρ ∈ {2, 3, ..., 9} in the same manner as above.

Table 3 Base-2 logarithm of the complexity of finding 5-near-collisions with codes (15) of length n = 128 with increasing covering radii ρ

ρ        2       3       4       5       6       7       8       9
Compl.  250   250.23  251.24  252.20  253.24  254.34  255.46  256.48

5 Conclusion

In this paper, we have proposed a new memoryless method to search for near-collisions. Our approach is based on the decoding operation of covering codes. The efficiency of our algorithm depends on the size of the underlying code, and we gave constructions to find near-collisions of small weight which are optimal, or close to optimal. One merit of our approach is that we do not have to impose any conditions on what the near-collisions look like. We demonstrated our approach on the SHA-3 candidate TIB3, where we showed how to completely eliminate the memory requirements of 2^{100} at a small loss in efficiency. Our method marks the first general approach which makes cycle finding algorithms applicable to the search for near-collisions.

Acknowledgments The authors wish to thank the anonymous referees, Gaëtan Leurent and Kazumaro Aoki for valuable comments and discussions. The work in this paper has been supported in part by the Research Fund K. U. Leuven, project OT/08/027, in part by the European Commission under contract ICT-2007-216646 (ECRYPT II), in part by the Austrian Science Fund (FWF), project P21936, and in part by the IAP Programme P6/26 BCRYPT of the Belgian State (Belgian Science Policy).

References

1. Barg A., Forney G.D. Jr.: Random codes: minimum distances and error exponents. IEEE Trans. Inf. Theory 48(9), 2568–2573 (2002).
2. Brent R.P.: An improved Monte Carlo factorization algorithm. BIT 20(2), 176–184 (1980).
3. Canetti R., Rivest R.L., Sudan M., Trevisan L., Vadhan S.P., Wee H.: Amplifying collision resistance: a complexity-theoretic treatment. In: Menezes A. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 4622, pp. 264–283. Springer, Heidelberg (2007).
4. Cohen G., Honkala I., Litsyn S., Lobstein A.: Covering Codes. North-Holland Mathematical Library, vol. 54. North-Holland Publishing Co., Amsterdam (1997).
5. Cohen H., Frey G., Avanzi R., Doche C., Lange T., Nguyen K., Vercauteren F. (eds.): Handbook of Elliptic and Hyperelliptic Curve Cryptography. Discrete Mathematics and its Applications. Chapman & Hall/CRC, Boca Raton, FL (2006).
6. Damgård I.: A design principle for hash functions. In: Brassard G. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 435, pp. 416–427. Springer, Heidelberg (1989).
7. den Boer B., Bosselaers A.: Collisions for the compression function of MD5. In: Goos G., Hartmanis J. (eds.) EUROCRYPT, Lecture Notes in Computer Science, vol. 765, pp. 293–304. Springer, Heidelberg (1993).
8. Gordon D., Miller V., Ostapenko P.: Optimal hash functions for approximate matches on the n-cube. IEEE Trans. Inform. Theory 56(3), 984–991 (2010).
9. Graham R.L., Sloane N.J.A.: On the covering radius of codes. IEEE Trans. Inform. Theory 31(3), 385–401 (1985).
10. Harris B.: Probability distributions related to random mappings. Ann. Math. Stat. 31, 1045–1062 (1960).
11. Kéri G.: Tables for bounds on covering codes. http://www.sztaki.hu/~keri/codes/. Accessed 17 May 2010.
12. Knuth D.E.: The Art of Computer Programming, vol. 2: Seminumerical Algorithms, 3rd edn. Addison-Wesley Series in Computer Science and Information Processing. Addison-Wesley Publishing Co., Reading, MA (1997).
13. Matyas S.M., Meyer C.H., Oseas J.: Generating strong one-way functions with cryptographic algorithm. IBM Tech. Discl. Bull. 27(10A), 5658–5659 (1985).
14. Mendel F., Schläffer M.: On free-start collisions and collisions for TIB3. In: Samarati P., Yung M., Martinelli F., Ardagna C.A. (eds.) ISC, Lecture Notes in Computer Science, vol. 5735, pp. 95–106. Springer, Heidelberg (2009).
15. Menezes A., van Oorschot P.C., Vanstone S.A.: Handbook of Applied Cryptography. CRC Press, Boca Raton (1996).
16. Merkle R.C.: One way hash functions and DES. In: Brassard G. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 435, pp. 428–446. Springer, Heidelberg (1989).
17. Montes M., Penazzi D.: The TIB3 Hash. Submission to NIST (2008).
18. National Institute of Standards and Technology (NIST): FIPS-180-2: Secure Hash Standard. http://www.itl.nist.gov/fipspubs/ (2002).
19. National Institute of Standards and Technology (NIST): Cryptographic Hash Project. http://www.nist.gov/hash-competition (2007).
20. Nivasch G.: Cycle detection using a stack. Inf. Process. Lett. 90(3), 135–140 (2004).
21. Pless V.: Introduction to the Theory of Error-Correcting Codes, 3rd edn. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, New York (1998).
22. Pollard J.M.: Monte Carlo methods for index computation (mod p). Math. Comp. 32(143), 918–924 (1978).
23. Quisquater J.-J., Delescaille J.-P.: How easy is collision search. New results and applications to DES. In: Brassard G. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 435, pp. 408–413. Springer, Heidelberg (1989).
24. Rivest R.: RFC 1321—The MD5 Message-Digest Algorithm (1992).
25. Struik R.: An improvement of the Van Wee bound for binary linear covering codes. IEEE Trans. Inform. Theory 40(4), 1280–1284 (1994).
26. van Oorschot P.C., Wiener M.J.: Improving implementable meet-in-the-middle attacks by orders of magnitude. In: Koblitz N. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 1109, pp. 229–236. Springer, Heidelberg (1996).
27. van Oorschot P.C., Wiener M.J.: Parallel collision search with cryptanalytic applications. J. Cryptol. 12(1), 1–28 (1999).
28. van Wee G.J.M.: Improved sphere bounds on the covering radius of codes. IEEE Trans. Inform. Theory 34(2), 237–245 (1988).
29. Wang X., Yu H.: How to break MD5 and other hash functions. In: Cramer R. (ed.) EUROCRYPT, Lecture Notes in Computer Science, vol. 3494, pp. 19–35. Springer, Heidelberg (2005).
30. Wang X., Yin Y.L., Yu H.: Finding collisions in the full SHA-1. In: Shoup V. (ed.) CRYPTO, Lecture Notes in Computer Science, vol. 3621, pp. 17–36. Springer, Heidelberg (2005).
31. Yuval G.: How to swindle Rabin? Cryptologia 3(3), 187–191 (1979).