05 HashFunctions
Chester Rebeiro
IIT Madras
CR STINSON : chapter4
Issues with Integrity
How can Bob ensure that Alice’s message has not been modified?
Bob re-computes a message hash and verifies the digest with Alice’s message digest.
Integrity with Hashes
[Figure: Alice computes the message digest y = h(x) of “Attack at Dawn!!” and sends y to Bob over a secure channel. The message x itself travels over an insecure channel. Bob recomputes h on the received message and compares the result with y.]
Mallory does not have access to the digest y. Her task (to modify Alice’s message) is much more difficult.
If she modifies x to x’, the modification can be detected unless h(x) = h(x’).
MACs allow the message and the digest to be sent over an insecure channel
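A minimal sketch of this idea, using Python's standard hmac module. The key and messages are illustrative assumptions, and HMAC-SHA256 is only one possible MAC, not necessarily the construction these slides go on to describe.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for message under key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"shared secret between Alice and Bob"  # hypothetical key
msg = b"Attack at Dawn!!"
tag = make_tag(key, msg)  # sent next to msg over the insecure channel
```

Bob accepts the message only if verify(key, msg, tag) is true; without the key, Mallory cannot produce a matching tag for a modified message.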
Avalanche Effect
[Figure: a message M is fed to the hash function, which outputs a short, fixed-length digest, also called the ‘hash’.]
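The avalanche effect means that a small change in the message changes the digest substantially. The sketch below flips a single input bit and counts how many digest bits change. SHA-256 and the sample message are assumptions chosen for illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


m1 = b"Attack at Dawn!!"
m2 = bytes([m1[0] ^ 0x01]) + m1[1:]  # flip the lowest bit of the first byte

d1 = hashlib.sha256(m1).digest()
d2 = hashlib.sha256(m2).digest()
changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} digest bits differ")
```

For a well-behaved hash function, roughly half of the digest bits are expected to differ.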
Hash functions in Security
• Digital signatures
• Random number generation
• Key updates and derivations
• One way functions
• MAC
• Detect malware in code
• User authentication (storing passwords)
Hash Family
hK : X → Y
• The hash family is a 4-tuple defined by (X, Y, K, H)
• X is a set of messages
(may be infinite; we assume |X| ≥ 2|Y|)
• Y is a finite set of message digests (aka authentication tags)
• K is a finite set of keys
• Each key K in the key set defines a keyed hash function hK ∈ H
Hash Family : some definitions
hK : X → Y
• Valid pair under K : (x, y) ∈ X × Y such that y = hK(x)
• Number of functions possible from set X to set Y:
if |Y| = M and |X| = N, then the number of mappings possible is M^N
(an upper bound on the size of the hash family)
Unkeyed Hash Function
h : X → Y
• The hash family is a 4-tuple defined by (X,Y,K,H)
• X is a set of messages
(may be infinite, we assume the minimum size is at least 2|Y| )
• Y is a finite set of message digests
• In an unkeyed hash function : |K | = 1
• We thus have only one mapping function in the family
Hash function Requirement
Preimage Resistant
• Also known as the one-wayness problem
• If Mallory happens to know the message digest, she should not be able to determine the message
• Problem: given a hash function h : X → Y and an element y ∈ Y, find any x ∈ X such that h(x) = y
Hash function Requirement
(Second Preimage)
• Mallory has x and can compute h(x); she should not be able to find another message x’ which produces the same hash.
– It would be easy to forge new digital signatures from old signatures if the hash function used weren’t second preimage resistant
• Problem: given a hash function h : X → Y and an element x ∈ X, find x’ ∈ X with x’ ≠ x such that h(x) = h(x’)
Hash Function Requirement
(Collision Resistant)
• Mallory should not be able to find two messages x and x’ which produce the same hash
• Problem: given a hash function h : X → Y, find x, x’ ∈ X with x ≠ x’ such that h(x) = h(x’)
• There is no collision-free hash function: since |X| > |Y|, some digests must have more than one preimage
Hash Function Requirement
(No shortcuts)
• For a message m, the only way to compute its hash is to evaluate the function h(m)
• This should remain true irrespective of how many hashes we compute
– Even if we have computed h(m1), h(m2), h(m3), ..., h(m1000), there should not be a shortcut to compute h(m1001)
– An example where this is not true:
e.g. consider h(x) = ax mod n
If h(x1) and h(x2) are known, then h(x1 + x2) can be calculated as (h(x1) + h(x2)) mod n
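The shortcut in the linear example can be checked numerically. The values of a, n, x1 and x2 below are arbitrary assumptions chosen for illustration.

```python
def h(x: int, a: int, n: int) -> int:
    """The linear 'hash' from the example: h(x) = a*x mod n."""
    return (a * x) % n


a, n = 7, 101      # hypothetical parameters
x1, x2 = 12, 45    # hypothetical messages

# Shortcut: combine two known digests without evaluating h on x1 + x2.
shortcut = (h(x1, a, n) + h(x2, a, n)) % n
direct = h(x1 + x2, a, n)
print(shortcut, direct)
```

Both values agree, so an observer who has seen h(x1) and h(x2) learns h(x1 + x2) for free, which is exactly what the no-shortcuts requirement forbids.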
The Random Oracle Model
• For an ideal hash function, the only way to obtain h(x) should be to apply h to the message x.
• The RO model was developed by Bellare and Rogaway for analysis of ideal hash functions
• Let F(X,Y) be the set of all functions mapping X to Y.
• The random oracle O picks a random function h from F(X,Y). Only the oracle has the capability of executing the hash function.
• All other entities can invoke the oracle with a message x ∈ X. The oracle will return y = h(x).
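A random oracle can be simulated by lazy sampling: instead of choosing the whole function up front, each answer is drawn at random the first time a message is queried and remembered afterwards. The class name and the digest size parameter are assumptions for this sketch.

```python
import secrets


class RandomOracle:
    """Lazily sampled random function from byte strings to digest_bits-bit values."""

    def __init__(self, digest_bits: int):
        self.digest_bits = digest_bits
        self._table = {}  # answers handed out so far

    def query(self, x: bytes) -> int:
        # A fresh uniform digest is drawn the first time x is seen;
        # repeated queries return the stored value, so h stays a function.
        if x not in self._table:
            self._table[x] = secrets.randbits(self.digest_bits)
        return self._table[x]


oracle = RandomOracle(digest_bits=16)
y = oracle.query(b"Attack at Dawn!!")
```

Callers can only learn h(x) by calling query, which mirrors the rule that only the oracle can execute the hash function.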
Independence Property
• Let h be a randomly chosen hash function from the set F(X,Y)
• If x1 ∈ X and a different x2 ∈ X, then
Pr[h(x1) = h(x2)] = 1/M
where M = |Y|
This means the hash digests occur with uniform probability
Complexity of Problems
in the RO model
• 3 problems : First pre-image, Second pre-image, Collision resistance
• We study the complexity of breaking these problems
– Use Las Vegas randomized algorithms
• A Las Vegas algorithm may succeed or fail
• If it succeeds, the answer returned is always correct
– Worst case success probability
– Average case success probability (ε)
• Probability that the algorithm returns success, averaged over all problem instances, is at least ε
– (ε, Q) Las Vegas algorithm:
• Is an algorithm which can make Q queries and has an average success probability of ε
Las Vegas Algorithm Example
• Find a person who has a birthday today in at-most Q queries
BirthdayToday(){
    X = set of Q randomly chosen people
    for x in X{
        if (birthday(x) == today) return x
    }
    return FAILURE;
}
First_PreImage_Attack(h, y, Q){
    choose Q distinct values from X (say x1, x2, …., xQ)
    for(i=1; i<=Q; ++i){
        if (h(xi) == y) return xi
    }
    return FAIL
}
The ideal hash function h is queried using the RO access.
With |Y| = M:
$\Pr[\text{success in Q trials, on average}] = 1 - \left(1 - \frac{1}{M}\right)^{Q}$
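The attack and its success probability can be sketched on a small random function. The random oracle is modelled here as a randomly filled lookup table, and the sizes M, Q, N and the seed are assumptions chosen so the experiment runs quickly.

```python
import random


def first_preimage_attack(h, y, candidates):
    """Query h on each candidate; return a preimage of y, or None on failure."""
    for x in candidates:
        if h(x) == y:
            return x
    return None


def success_probability(M: int, Q: int) -> float:
    """Average-case success probability 1 - (1 - 1/M)^Q."""
    return 1 - (1 - 1 / M) ** Q


def estimate(M: int, Q: int, N: int, trials: int, seed: int = 0) -> float:
    """Empirical success rate over freshly sampled random functions X -> Y."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        table = [rng.randrange(M) for _ in range(N)]  # random function, |X| = N
        y = rng.randrange(M)                          # target digest
        xs = rng.sample(range(N), Q)                  # Q distinct queries
        if first_preimage_attack(table.__getitem__, y, xs) is not None:
            wins += 1
    return wins / trials


print(success_probability(16, 8), estimate(16, 8, 64, 2000))
```

The two printed numbers should be close, since each of the Q distinct queries independently hits y with probability 1/M.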
Second Preimage Attack
Problem : Given an x, find an x’ (≠ x) such that h(x’) = h(x)
Second_PreImage_Attack(h, x, Q){
    choose Q-1 distinct values from X \ {x} (say x1, x2, …., xQ-1)
    query y = h(x)        (one extra oracle query)
    for(i=1; i<=Q-1; ++i){
        if (h(xi) == y) return xi
    }
    return FAIL
}
$\Pr[\text{success in Q trials, on average}] = 1 - \left(1 - \frac{1}{M}\right)^{Q-1}$
Finding Collisions
Find_Collisions(h, Q){
    choose Q distinct values from X (say x1, x2, …., xQ)
    for(i=1; i<=Q; ++i) yi = h(xi)
    if there exists (yj == yk) for j ≠ k then return (xj, xk)
    return FAIL
}
The success probability ε is
$\varepsilon = 1 - \prod_{i=1}^{Q-1}\left(1 - \frac{i}{M}\right)$
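A minimal sketch of collision finding on a deliberately weak hash. The toy hash (SHA-256 truncated to 16 bits, so M = 2^16) and the choice of inputs are assumptions. Unlike the pseudocode, this version stops at the first repeated digest instead of hashing all Q inputs first.

```python
import hashlib


def h(x: bytes, digest_bits: int = 16) -> int:
    """Toy hash: SHA-256 truncated to digest_bits bits (M = 2**digest_bits)."""
    full = int.from_bytes(hashlib.sha256(x).digest(), "big")
    return full >> (256 - digest_bits)


def find_collisions(hash_fn, inputs):
    """Return (xj, xk) with equal digests, or None if all digests are distinct."""
    seen = {}  # digest -> first input that produced it
    for x in inputs:
        y = hash_fn(x)
        if y in seen:
            return seen[y], x
        seen[y] = x
    return None


inputs = [i.to_bytes(4, "big") for i in range(5000)]  # Q = 5000 distinct values
result = find_collisions(h, inputs)
```

With Q = 5000 queries against M = 65536 possible digests, a collision is practically certain, even though a preimage search with the same Q would usually fail.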
Birthday Paradox
• Find the probability that at least two people in a room have the same birthday
Event A : at least two people in the room have the same birthday
Event A' : no two people in the room have the same birthday
$\Pr[A] = 1 - \Pr[A']$
$\Pr[A'] = 1 \times \left(1 - \frac{1}{365}\right) \times \left(1 - \frac{2}{365}\right) \times \left(1 - \frac{3}{365}\right) \cdots \left(1 - \frac{Q-1}{365}\right) = \prod_{i=1}^{Q-1}\left(1 - \frac{i}{365}\right)$
$\Pr[A] = 1 - \prod_{i=1}^{Q-1}\left(1 - \frac{i}{365}\right)$
Birthday Paradox
• If there are 23 people in a room, then the probability that two birthdays collide is slightly more than 1/2 (about 0.507)
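The birthday probability can be evaluated directly from the product Pr[A] = 1 − ∏ (1 − i/365) for i = 1, ..., Q−1. The function name and the default of 365 days are choices made for this sketch.

```python
def birthday_collision_probability(q: int, m: int = 365) -> float:
    """Probability that at least two of q people share one of m equally likely birthdays."""
    p_distinct = 1.0
    for i in range(1, q):
        p_distinct *= 1 - i / m
    return 1 - p_distinct


for q in (10, 22, 23, 40):
    print(q, round(birthday_collision_probability(q), 4))
```

The probability first exceeds 1/2 at Q = 23, which is far fewer people than intuition suggests.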
Collisions in Birthdays
to Collisions in Hash Functions
For Find_Collisions(h, Q) with |Y| = M, the success probability is
$\varepsilon = 1 - \prod_{i=1}^{Q-1}\left(1 - \frac{i}{M}\right)$
Relationship between Q, M, and success:
$Q \approx \sqrt{2M \ln\frac{1}{1-\varepsilon}}$
• Q is always proportional to the square root of M
• ε only affects the constant factor
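The approximation Q ≈ sqrt(2M ln(1/(1−ε))) is easy to evaluate. The function name is an assumption; the two sample values of M are the birthday setting (365) and a 160-bit digest.

```python
import math


def queries_needed(m: int, eps: float) -> float:
    """Approximate number of queries Q for collision success probability eps."""
    return math.sqrt(2 * m * math.log(1 / (1 - eps)))


print(queries_needed(365, 0.5))               # birthday setting
print(queries_needed(2 ** 160, 0.5) / 2 ** 80)  # constant factor for a 160-bit digest
```

For ε = 1/2 the constant factor is about 1.18, so a 160-bit hash offers only about 80 bits of collision resistance against this generic attack.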
Comparing Security Criteria
• Finding collisions is easier than solving pre-
image or second preimage
• Do reductions exist between the three
problems?
Collision resistance ⇒ second preimage resistance
Find_Collisions2(h, Q){
    choose x randomly from X
    y = h(x)
    x’ = PreImage_Attack(h, y, Q-1)
    if (x ≠ x’)
        return (x, x’)
    else
        return FAIL
}
[Figure: h partitions X into classes, X = X1 ∪ X2 ∪ X3 ∪ X4]
Each Xi is an equivalence class (all messages with the same digest). The number of such Xi formed is |Y|.
Assume PreImage_Attack always finds a pre-image of y in Q-1 queries to the Oracle; then Find_Collisions2 is a (1/2, Q) Las Vegas algorithm.
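The reduction can be sketched on a small random function stored as a table. The perfect preimage finder below is hypothetical (it simply searches the table), and the sizes M = 8, N = 16 and the seed are assumptions. Because the table is small, the success probability over all choices of x is computed exactly rather than sampled.

```python
import random


def preimage_oracle(table, y):
    """Hypothetical perfect preimage finder: the smallest x with h(x) = y."""
    return table.index(y)


def find_collisions2(table, x):
    """Turn a preimage finder into a collision finder for the starting point x."""
    y = table[x]
    x_prime = preimage_oracle(table, y)
    return (x, x_prime) if x != x_prime else None


def exact_success_probability(table):
    """Fraction of starting points x for which a collision is returned."""
    wins = sum(find_collisions2(table, x) is not None for x in range(len(table)))
    return wins / len(table)


rng = random.Random(1)
M, N = 8, 16  # |Y| = M, |X| = N with N >= 2M
table = [rng.randrange(M) for _ in range(N)]  # table[x] plays the role of h(x)
p = exact_success_probability(table)
print(p)
```

The attack fails only when x is the one element of its class that the oracle returns, so it succeeds for N minus (number of non-empty classes) starting points, which is at least N − M.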
Proof
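Each y ∈ Y defines a class of X as follows:
$X_y = \{x \in X : h(x) = y\}$
The number of classes is |Y| = M (assume $M \le N/2$, where N = |X|).
$$\begin{aligned}
\Pr[\text{success}] = \Pr[x \ne x'] &= \frac{1}{N}\sum_{y}\sum_{x \in X_y}\left(1 - \frac{1}{|X_y|}\right) \\
&= \frac{1}{N}\sum_{y}|X_y|\left(1 - \frac{1}{|X_y|}\right) \\
&= \frac{1}{N}\sum_{y}\left(|X_y| - 1\right) = \frac{1}{N}(N - M) \\
&\ge \frac{N - N/2}{N} \quad (\text{use } N \ge 2M) \\
&= \frac{1}{2}
\end{aligned}$$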
Iterated Hash Functions
• So far, we’ve looked at hash functions where the message was picked from a finite set X
• What if the message can be arbitrarily long (X is infinite)?
– We use an iterated hash function
• The core in an iterated hash function is a function called compress
– compress hashes from m+t bits to m bits
$\text{compress} : \{0,1\}^{m+t} \to \{0,1\}^{m}, \quad t \ge 1$
Iterated Hash Function (Principle, given m and t)
[Figure: the input message x is preprocessed into y; starting from the IV, each step concatenates the previous m-bit output with the next t-bit block and applies compress; an optional function g produces the final hash h(y).]
• The input message x may be of any length, but must be at least m+t+1 bits long
• Preprocessing: the message is padded, and the number of bits in the pad is appended
• Concatenate the previous m-bit output with the next t-bit block (the IV is used only during initialization)
• The compress function is invoked iteratively for each t-bit block in the message. For the first operation, an initialization vector is used
• After all t-bit blocks are processed, there is a post-processing step (g), and finally the hash is obtained. This step is optional.
Iterated Hash Function (Principle)
• Another perspective
Merkle-Damgard Iterated Hash Function
[Figure: the input message x (of any length) is padded and the pad length is appended, giving y. Each step concatenates the previous m-bit output, one bit r and the next (t-1)-bit block, then applies compress. IV = 0; r = 0 for the first iteration, else r = 1. After k steps the output is h(y).]
$h : \{0,1\}^{m+t} \to \{0,1\}^{m}$
$X = \bigcup_{i=m+t+1}^{\infty} \{0,1\}^{i}$
• An iterated hash function construction that uses a compress function h
• If h is collision resistant then the Merkle-Damgard construction is collision resistant
Merkle-Damgard Iterated Hash Function
[Algorithm figure not recovered; its annotations follow.]
• n : message length
• k : number of blocks in x. Each block has length t-1 (note that t cannot be 1)
• Apply padding: d is the amount of padding required to make the message length a multiple of t-1
• Append d as a final block
• IV is 0^m
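The construction annotated above can be sketched on bit strings. This follows the usual textbook form (blocks of t−1 bits, zero padding of length d, d appended as a final block, IV = 0^m, flag bit 0 in the first step and 1 afterwards). The compress function here is a stand-in built from truncated SHA-256, and m = 32, t = 9 are arbitrary assumptions (m must be at most 256 for this stand-in).

```python
import hashlib
import math


def compress(bits: str, m: int) -> str:
    """Toy compress {0,1}^(m+t) -> {0,1}^m: SHA-256 of the bit string, truncated."""
    digest = hashlib.sha256(bits.encode("ascii")).digest()
    as_bits = "".join(f"{byte:08b}" for byte in digest)
    return as_bits[:m]


def merkle_damgard(x: str, m: int = 32, t: int = 9) -> str:
    """Hash a bit string x (characters '0'/'1') using blocks of t-1 bits."""
    if t < 2 or not x:
        raise ValueError("need t >= 2 and a non-empty message")
    n = len(x)
    k = math.ceil(n / (t - 1))   # number of message blocks
    d = k * (t - 1) - n          # amount of zero padding
    padded = x + "0" * d
    blocks = [padded[i:i + t - 1] for i in range(0, len(padded), t - 1)]
    blocks.append(format(d, "b").zfill(t - 1))   # final block encodes d
    g = compress("0" * m + "0" + blocks[0], m)   # IV = 0^m, flag bit r = 0
    for y in blocks[1:]:
        g = compress(g + "1" + y, m)             # flag bit r = 1
    return g


print(merkle_damgard("1011"))
```

Appending d is what separates messages that differ only in trailing zeros, such as "1" and "10", which would otherwise pad to the same blocks.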
On Merkle-Damgard Construction
Theorem: If the compress function is collision
resistant then the Merkle-Damgard
construction is collision resistant
Merkle-Damgard Construction is Collision Resistant (Proof)
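Case 1: |x| ≢ |x'| (mod t − 1)
• This means that the padding (resp. d and d’) applied to x and x’ is different (i.e. d ≠ d’)
• The last step in hashing applies compress to (previous output || 1 || d) for x and to (previous output || 1 || d’) for x’
• Since h(x) = h(x’) while d ≠ d’, these two inputs differ but give the same output: a collision in compress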
Case 1 formally : |x| ≢ |x'| (mod t − 1)
Case 2a : |x| ≡ |x'| (mod t − 1) and |x| = |x'|
[Figure: x and x’ are processed block by block. Each step concatenates the previous output, the bit 1 and the next block (yk-1, yk, yk+1), then applies compress. The final outputs h(x) and h(x’) are equal.]
• The final outputs are equal (h(x) = h(x’)), so look at the two inputs to the last compress step
• If these inputs differ, we are done : we have shown a collision in compress
• If they are identical, the outputs of the previous iteration are equal as well, and we look at the previous iteration in the same way
• We continue this backtracking until we find a collision. We will definitely find a collision at some point because x ≠ x’
Case 2a formally : |x| ≡ |x'| (mod t − 1) and |x| = |x'|
[Figure: one iteration of the construction: gi, the bit 1 and a block yi are concatenated and passed to compress, giving gi+1.]
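Case 2b : |x| ≡ |x'| (mod t − 1) and |x| ≠ |x'|
[Figure: x and x’ have different numbers of blocks but the same padding d. The first step applies compress to 0^m || 0 || y1; every later step applies compress to (previous output || 1 || next block), up to yk+1 = d.]
• Backtrack from the equal final outputs as in Case 2a
• Either a collision in compress is found on the way, or we reach the first step of the shorter message. Its input has the bit 0 after the IV, while the corresponding input for the longer message has the bit 1, so the inputs differ and we again have a collision in compress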
Case 2b formally : |x| ≡ |x'| (mod t − 1) and |x| ≠ |x'|
Merkle-Damgard-2
(for the case when t=1)
Hash Functions in Practice
• MD5
• NIST specified “secure hash algorithm”
– SHA0 : published in 1993. 160 bit hash.
• There were unpublished weaknesses in this algorithm
• The first published weakness was in 1998, where a collision attack was discovered with complexity 2^61
– SHA1 : published in 1995. 160 bit hash.
• SHA0 replaced with SHA1 which resolved several of the weaknesses
• SHA1 used in several applications until 2005, when an algorithm to find collisions with a complexity of 2^69 was developed
• In 2010, SHA1 was no longer supported. All applications that used SHA1 needed to be migrated to SHA2
– SHA2 : published in 2001. Supports 6 functions: 224, 256, 384, 512, and two truncated versions of 512 bit hashes
• No collision attacks on SHA2 as yet. The best attack so far assumes reduced rounds of the algorithm (46 rounds)
– SHA3 : published in 2015. Also known as Keccak
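Several of these functions are available in Python's standard hashlib module, which gives a quick way to compare digest lengths. The message is an arbitrary example, and the selection of algorithm names is a choice made for this sketch (MD5 or SHA-1 may be disabled on some restricted builds).

```python
import hashlib

message = b"Attack at Dawn!!"
names = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512", "sha3_256")

# Map each algorithm name to its digest length in bits.
digest_bits = {
    name: hashlib.new(name, message).digest_size * 8 for name in names
}

for name, bits in digest_bits.items():
    print(f"{name:9s} {bits:4d} bits  {hashlib.new(name, message).hexdigest()[:16]}...")
```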
MD5
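[Figure: MD5 processing of the input message x: append pad, append pad length, then process 512-bit blocks (32 bits x 16) through four rounds. Other figure labels: Round 4, ΔH = 0.]
• The message is appended with a 1 and then 0s so that its length is congruent to 512 – 64 = 448 modulo 512
• Message length appended (in 64 bits) and split into blocks of 512 bits
• The state consists of four limbs A, B, C, D; each limb is of 32 bits
• Each round has 16 similar operations of this modified Feistel form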
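The padding rule can be sketched at byte granularity. The function name is an assumption; the single 1 bit followed by zeros becomes the byte 0x80 followed by zero bytes, and MD5 stores the 64-bit message length in little-endian order.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad a message the way MD5 does before splitting it into 512-bit blocks."""
    bit_length = (8 * len(message)) % 2 ** 64   # length field is 64 bits
    padded = message + b"\x80"                  # a 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)  # reach 448 bits mod 512
    return padded + bit_length.to_bytes(8, "little")


padded = md5_pad(b"abc")
blocks = [padded[i:i + 64] for i in range(0, len(padded), 64)]
print(len(padded), len(blocks))
```

A message that already ends 448 bits into a block still receives a full extra block of padding, because at least the 1 bit must always be appended.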
[Figure residue: input message (x) and IV; each word is 32 bits (512/16 = 32); expand to 79 words.]
[Figure labels: bit rate, security parameter.]
Message Authentication Codes
(Keyed Hash Functions)
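Constructing a MAC (First Attempt)
[Figure: the iterated hash construction with a secret IV: the input message x (of any length) is padded and the pad length appended, giving y; each of the k steps concatenates the previous m-bit output with the next block and applies compress; the output is h(y).]
• Won’t work if there is no preprocessing step
– attackers could append blocks to a message and compute a valid digest without knowing the secret IV
– given x and hK(x), the digest of x || x’ is compress(hK(x) || x’)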
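The extension attack can be sketched with a toy iterated hash that uses a secret IV and no padding. The compress function (truncated SHA-256), the 8-byte block size, the key and the messages are all assumptions made for illustration.

```python
import hashlib

BLOCK = 8  # toy block and state size in bytes (assumption)


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compress function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:BLOCK]


def naive_mac(secret_iv: bytes, message: bytes) -> bytes:
    """Iterated hash with a secret IV and no padding or length field."""
    if len(message) % BLOCK:
        raise ValueError("message length must be a multiple of BLOCK")
    state = secret_iv
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state


secret_iv = b"K-secret"          # known only to Alice and Bob
x = b"pay Bob 10 coins"          # two 8-byte blocks
tag = naive_mac(secret_iv, x)    # observed by Mallory along with x

# Mallory extends the message without knowing secret_iv.
x_ext = b" x 1000!"
forged_tag = compress(tag, x_ext)
print(forged_tag == naive_mac(secret_iv, x + x_ext))
```

The forged tag verifies because the tag of x is exactly the internal state the legitimate computation would have reached before processing the extra block.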
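Constructing a MAC (First Attempt)
• Won’t work if preprocessing step present
[Figure: the input message x (of any length) is split into blocks that are chained through the keyed function eK starting from an IV (eK, eK, eK, eK), producing hK(p0||p1||…p4).]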
Authenticated Encryption
• Achieves Confidentiality, Integrity, and Authentication
• EtM (encrypt then MAC)
• E&M (encrypt and MAC)
• MtE (MAC then Encrypt)
Using CBC-MAC for Authenticated
Encryption
1. Consider a message p = (p0, p1, p2, p3) that Alice sends to Bob
   a. She encrypts it with CBC as follows:
      c0 = Ek(p0) ; c1 = Ek(p1 ⊕ c0); c2 = Ek(p2 ⊕ c1); c3 = Ek(p3 ⊕ c2)
   b. She computes mac = CBC-MACk(p)
   She transmits (c, mac) to Bob, where c = (c0, c1, c2, c3)
2. Mallory modifies one or more of the ciphertexts (c0, c1, c2) to (c0’, c1’, c2’)
3. Bob will
   a. Decrypt (c0’, c1’, c2’, c3) to (p0’, p1’, p2’, p3’)
   b. And use it to compute the MAC mac’
Using CBC-MAC for Authenticated
Encryption
Alice’s side (encryption) and Bob’s side (decryption), assuming IV = 0:
$c_0 = E_k(p_0)$, $\quad p'_0 = D_k(c'_0)$
$c_1 = E_k(p_1 \oplus c_0)$, $\quad p'_1 = D_k(c'_1) \oplus c'_0$
$c_2 = E_k(p_2 \oplus c_1)$, $\quad p'_2 = D_k(c'_2) \oplus c'_1$
$c_3 = E_k(p_3 \oplus c_2)$, $\quad p'_3 = D_k(c_3) \oplus c'_2$
Bob then computes:
$$\begin{aligned}
mac' &= \text{CBC-MAC}_k(p') \\
&= E_k(p'_3 \oplus E_k(p'_2 \oplus E_k(p'_1 \oplus E_k(p'_0)))) \\
&= E_k(p'_3 \oplus c'_2) \\
&= E_k(D_k(c_3) \oplus c'_2 \oplus c'_2) \\
&= E_k(D_k(c_3)) \\
&= c_3
\end{aligned}$$
Without modifying the final ciphertext, Mallory can change any other ciphertext as she pleases. The CBC-MAC will not be altered.
Moral of the story: Never use CBC-MAC with CBC encryption!!
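The failure can be reproduced with a toy block cipher. The 4-round Feistel cipher below (round function from truncated SHA-256, 16-byte blocks) is only a stand-in assumption so that the sketch is self-contained; the key and plaintext blocks are likewise illustrative. As in the derivation, the same key and IV = 0 are used for CBC encryption and CBC-MAC.

```python
import hashlib

BLOCK = 16  # toy block size in bytes (assumption)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel encryption (stand-in for a real block cipher)."""
    l, r = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        l, r = r, xor(l, _round(key, i, r))
    return l + r


def D(key: bytes, block: bytes) -> bytes:
    """Inverse of E: undo the Feistel rounds in reverse order."""
    l, r = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        l, r = xor(r, _round(key, i, l)), l
    return l + r


def cbc_encrypt(key, blocks):
    prev, out = bytes(BLOCK), []  # IV = 0
    for p in blocks:
        prev = E(key, xor(p, prev))
        out.append(prev)
    return out


def cbc_decrypt(key, blocks):
    prev, out = bytes(BLOCK), []  # IV = 0
    for c in blocks:
        out.append(xor(D(key, c), prev))
        prev = c
    return out


def cbc_mac(key, blocks):
    return cbc_encrypt(key, blocks)[-1]  # last chaining value is the tag


key = b"k" * 16                                 # hypothetical shared key
p = [bytes([i]) * BLOCK for i in range(4)]      # p0..p3
c, mac = cbc_encrypt(key, p), cbc_mac(key, p)   # what Alice transmits

# Mallory changes c0, c1, c2 but leaves c3 alone.
tampered = [xor(blk, b"\xff" * BLOCK) for blk in c[:3]] + [c[3]]
p_forged = cbc_decrypt(key, tampered)
print(p_forged != p, cbc_mac(key, p_forged) == mac)
```

Bob recovers a different plaintext, yet the MAC he recomputes still equals the one Alice sent, so the tampering goes undetected.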
Counter Mode + CBC-MAC for
Authenticated Encryption
Consider p = (p0, p1, p2, p3) is a message Alice sends to Bob