Cryptographic Hash Functions - Data Integrity Applications: Dhiren Patel
1
HF Applications
• Hash functions are very popular tools for cryptographic
applications such as message integrity checks,
message authentication, digital signatures, and
protection schemes for pass-phrases or
passwords
• They are also used in electronic funds transfer, data storage,
software distribution, and other applications
where data integrity is very important
2
Block Ciphers v/s Hash Functions
A block cipher is a function that maps n-bit plaintext blocks to n-bit
ciphertext blocks under a k-bit key; n is called the block length.
– E: {0,1}^n × {0,1}^k → {0,1}^n
To allow unique decryption, the encryption function must be one-to-one
(i.e., invertible)
Hash Function - takes input (of variable-length) and returns a fixed size
output string h (usually much smaller than input)
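The fixed-size output property can be checked with a short sketch. It uses SHA-256 from Python's standard hashlib module purely as an example; the choice of algorithm and the sample messages are assumptions, not part of the slides.

```python
import hashlib

# Inputs of very different lengths: 0 bytes, 3 bytes, 1,000,000 bytes.
messages = [b"", b"abc", b"a" * 1_000_000]

# SHA-256 always returns 32 bytes (256 bits), whatever the input length.
digest_sizes = [len(hashlib.sha256(m).digest()) for m in messages]
print(digest_sizes)  # [32, 32, 32]
```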
3
Building Cryptographic Hash Function
• Operability:
• H() should work on any input length
• H() should produce output of fixed size
• H() should be easy to compute
4
Building a hash function
• CheckSum (CRC) --- too weak…
• Simple Hash Function - based on XOR of message blocks
5
not secure as manipulation is easy for any message
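A minimal sketch of why an XOR-of-blocks hash is easy to manipulate: XOR is commutative, so reordering the blocks of a message leaves the hash unchanged. The 4-byte block size, the zero padding, and the sample message are assumptions chosen for readability.

```python
from functools import reduce

BLOCK = 4  # toy block size in bytes (assumption)

def xor_hash(message: bytes) -> bytes:
    """XOR all fixed-size blocks of the message together (zero-padded)."""
    if len(message) % BLOCK:
        message += b"\x00" * (BLOCK - len(message) % BLOCK)
    blocks = [message[i:i + BLOCK] for i in range(0, len(message), BLOCK)]
    return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)),
                  blocks, bytes(BLOCK))

original = b"PAY 0100 TO BOB."
# Swap the first two 4-byte blocks: a different message, same hash.
swapped = original[4:8] + original[0:4] + original[8:]

print(original != swapped, xor_hash(original) == xor_hash(swapped))
```

Any permutation of blocks, or any pair of changes that cancel under XOR, yields a collision in the same way.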
6
HF
Merkle-Damgard:
iterative application of compression function
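The iteration can be sketched as follows. This is only an illustration of the chaining structure: the compression function here is a stand-in built from SHA-256, and the block size, state size, all-zero IV, and padding layout are assumptions rather than any standardized design.

```python
import hashlib

BLOCK_SIZE = 64   # bytes per message block (assumption)
STATE_SIZE = 32   # bytes of chaining variable (assumption)

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function f(state, block) -> new state."""
    return hashlib.sha256(state + block).digest()

def md_pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the 8-byte message bit length."""
    bit_len = (len(message) * 8).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK_SIZE
    return message + b"\x80" + b"\x00" * zeros + bit_len

def md_hash(message: bytes, iv: bytes = bytes(STATE_SIZE)) -> bytes:
    """Feed the padded message block by block through compress()."""
    state = iv
    padded = md_pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state  # the final chaining value is the hash output
```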
7
HF construction – using Block Cipher
• Block cipher (standard or dedicated) in CBC mode
• Hash function H: {0,1}* → {0,1}^n
• Block cipher encryption E: {0,1}^n × {0,1}^k → {0,1}^n
• H_i = E_{H_{i-1}}(M_i)
• Initial vector: H_0, n bits long
• The output of each encryption is used as a chaining variable,
which serves as the key for the next
encryption
11
Using Block cipher
12
HF construction
• Dedicated function with optimized
performance
• Operates on iterative compression function
• based on 32-bit Registers/buffers, S-Boxes
• with multiple rounds of computations
13
Essential parameters
• Message pre-processing
• Chaining variable and hash output
• Collision-resistance of the compression function
• Word-orientation - Little-endian or big-endian conversion
• Sequential structure
• Message expansion
14
Hash Functions Security
• cryptanalytic attacks exploit structure
• analytic attacks on iterated hash functions
– typically focus on collisions in compression
function f
– like block ciphers, HF is often composed of rounds
– attacks exploit properties of round functions
15
Some Cryptographic hash functions
• MD5 (128-bit digest) – Rivest
• SHA-1 (160-bit) – NIST FIPS 180-1
• SHA-256, SHA-384, SHA-512 – NIST (National Institute of Standards
and Technology) FIPS 180-2
• SHA-3 (covered in later slides)
• RIPEMD-160 (160-bit) – (Dobbertin, Preneel)
• MASH (based on modular arithmetic) – (Girault)
• Whirlpool (up to 512-bit, based on a dedicated block cipher),
project NESSIE (Barreto, Rijmen)
• NESSIE: New European Schemes for Signatures, Integrity and
Encryption
• HMAC (with an embedded hash function such as MD5 or SHA-1 and a
secret key K), FIPS
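As a small illustration of the last item, Python's standard hmac module computes a keyed tag over a message. The key, the message, and the use of SHA-256 as the embedded hash are assumptions made for this sketch.

```python
import hashlib
import hmac

key = b"shared-secret-key"               # hypothetical key
message = b"transfer 100 to account 42"  # hypothetical message

# Sender: compute the authentication tag over the message.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, received_tag: str) -> bool:
    """Receiver: recompute the tag and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_tag)

print(verify(key, message, tag))         # True
print(verify(key, message + b"0", tag))  # False
```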
16
MD-5 Hash function
[Rivest 1992 – RFC 1321]
• MD5 Produces hash of length 128 bits
17
MD-5
• Padding: the message is "padded" (extended) so that its length
(in bits) is 64 bits short of a multiple of 512 bits, i.e., congruent
to 448 modulo 512. The remaining 64 bits hold the original message length.
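The padding rule can be sketched in a few lines, following RFC 1321: a single 1 bit (byte 0x80), then zero bytes up to 56 bytes modulo 64, then the 64-bit message length with the low-order byte first. The function name md5_pad is a hypothetical helper for illustration.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad a message to a multiple of 64 bytes (512 bits), MD5 style."""
    bit_len = (len(message) * 8) % (1 << 64)  # length is taken modulo 2^64
    padded = message + b"\x80"                # a single 1 bit, then zeros
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + bit_len.to_bytes(8, "little")

for n in (0, 3, 55, 56, 64):
    print(n, len(md5_pad(b"a" * n)))
```

Padding is always applied, so a 56-byte message grows to two full blocks (128 bytes).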
18
MD-5
• After padding, the message has a length that is an exact
multiple of 16 (32-bit) words.
• Let M[0 ... N-1] denote the words of the resulting message,
where N is a multiple of 16.
• A four-word buffer (A,B,C,D) is used to compute the message
digest. Here each of A, B, C, D is a 32-bit register.
• These registers are initialized to the following values (in
hexadecimal, low-order bytes first):
• word A: 01 23 45 67
• word B: 89 ab cd ef
• word C: fe dc ba 98
• word D: 76 54 32 10
19
MD5 - Message Digest Computation
• Four auxiliary functions (F, G, H, I) are used; each takes
three 32-bit words as input and produces one 32-bit word as
output.
• The functions G, H, and I are similar to the function F, in that they
act in "bitwise parallel" to produce their output from the bits of
X, Y, and Z.
• Note that the function H is the bit-wise "xor" or "parity" function.
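The slide does not spell out the four functions; the definitions below are taken from RFC 1321 and written as a Python sketch on 32-bit words. The MASK constant is an implementation detail needed because Python integers are unbounded.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits

def F(x, y, z):
    return ((x & y) | (~x & z)) & MASK   # "if x then y else z", bit by bit

def G(x, y, z):
    return ((x & z) | (y & ~z)) & MASK   # "if z then x else y", bit by bit

def H(x, y, z):
    return (x ^ y ^ z) & MASK            # bitwise parity

def I(x, y, z):
    return (y ^ (x | ~z)) & MASK
```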
20
MD5 – 1 operation
• F – nonlinear function (one of F, G, H, I, depending on the round)
• Total such 64 operations
• Grouped in 4 rounds (of 16 each)
21
MD-5
• Let T[i] denote the i-th element of the table, computed using sine
function.
22
64 values – from sine function
• for i from 0 to 63
•     k[i] := floor(abs(sin(i + 1)) × 2^32)    (i + 1 in radians)
• end for
23
64 values – from sine function
k[ 0.. 3] := { 0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee }
k[ 4.. 7] := { 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501 }
k[ 8..11] := { 0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be }
k[12..15] := { 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821 }
k[16..19] := { 0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa }
k[20..23] := { 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8 }
k[24..27] := { 0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed }
k[28..31] := { 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a }
k[32..35] := { 0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c }
k[36..39] := { 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70 }
k[40..43] := { 0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05 }
k[44..47] := { 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665 }
k[48..51] := { 0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039 }
k[52..55] := { 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1 }
k[56..59] := { 0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1 }
k[60..63] := { 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391 }
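The table can be regenerated directly from the sine formula. This sketch assumes ordinary double-precision floating point, which is the usual way the table is computed in Python.

```python
import math

# k[i] = floor(|sin(i + 1)| * 2^32), with i + 1 measured in radians.
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

print(hex(K[0]), hex(K[63]))
```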
24
MD-5 – round 1 – F function
• /* Round 1. */ /* Let [abcd k s i] denote the operation
a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
A B C D are updated……
25
Round 2 – G function
• /* Round 2. */
• /* Let [abcd k s i] denote the operation
• a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
26
Round 3 – H function
• /* Round 3. */
• /* Let [abcd k s i] denote the operation
a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
27
Round 4 – I function
• /* Round 4. */
• /* Let [abcd k s i] denote the operation
a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
• /* Do the following 16 operations. */
[ABCD 0 6 49] [DABC 7 10 50] [CDAB 14 15 51] [BCDA 5 21 52]
[ABCD 12 6 53] [DABC 3 10 54] [CDAB 10 15 55] [BCDA 1 21 56]
[ABCD 8 6 57] [DABC 15 10 58] [CDAB 6 15 59] [BCDA 13 21 60]
[ABCD 4 6 61] [DABC 11 10 62] [CDAB 2 15 63] [BCDA 9 21 64]
28
MD-5 - result
• /* Then perform the additions: add to each register the
value it had before this block was processed */
• A = A + AA, B = B + BB, C = C + CC, D = D + DD
• /* end of loop on i */
• Output = ABCD /* concatenated, 128 bits */
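Putting the padding, the sine table, the four rounds, and the final additions together gives the following compact sketch. It follows RFC 1321 but folds the four rounds into a single 64-step loop, so the variable names and loop layout are choices made here, not the slide's notation. The result is compared against hashlib's MD5 as a sanity check.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]
# Left-rotation amounts per round (RFC 1321), each pattern used 4 times.
S = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
     [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)

def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def md5(message: bytes) -> str:
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    bit_len = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack("<Q", bit_len)          # length, low byte first
    for off in range(0, len(message), 64):
        X = struct.unpack("<16I", message[off:off + 64])
        a, b, c, d = a0, b0, c0, d0                # working copy of registers
        for i in range(64):
            if i < 16:                             # round 1: F
                f, k = (b & c) | (~b & d), i
            elif i < 32:                           # round 2: G
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:                           # round 3: H
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:                                  # round 4: I
                f, k = c ^ (b | ~d), (7 * i) % 16
            f = (f + a + T[i] + X[k]) & MASK
            a, d, c = d, c, b                      # rotate the register roles
            b = (b + rotl(f, S[i])) & MASK
        # Add the block's result to the values saved before the block.
        a0, b0 = (a0 + a) & MASK, (b0 + b) & MASK
        c0, d0 = (c0 + c) & MASK, (d0 + d) & MASK
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5(b"abc") == hashlib.md5(b"abc").hexdigest())
```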
29
Secure Hash Algorithm (SHA)
• SHA was originally published by NIST (National Institute of Standards and
Technology) as a Federal Information Processing
Standard (FIPS 180) in 1993.
• Revised in 1995 as FIPS 180-1 and referred to as SHA-1; also specified in Internet
RFC 3174
• The algorithm is SHA; the standard is SHS (Secure Hash Standard)
30
SHA-1 (Secure Hash Algorithm)
• An iterated hash function with a 160-bit message digest (five 32-bit registers)
• Padding is the same as in MD5, except that the 64-bit length is stored high-order byte first
• SHA-1 is built from word-oriented operations on bitstrings, where a
word consists of 32 bits (eight hexadecimal digits, i.e., eight nibbles).
• The operations used in SHA-1 are as follows:
• X ∧ Y bitwise “and” of X and Y
• X ∨ Y bitwise “or” of X and Y
• X xor Y bitwise “xor” of X and Y
• ¬X bitwise complement of X
• X + Y integer addition modulo 2^32
• ROTL^s(X) circular left shift of X by s positions (0 ≤ s ≤ 31)
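In a language with unbounded integers, the last three operations need explicit masking to 32 bits. A minimal Python sketch follows; the helper names add32, rotl, and not32 are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def add32(x: int, y: int) -> int:
    """Integer addition modulo 2^32."""
    return (x + y) & MASK32

def rotl(x: int, s: int) -> int:
    """Circular left shift of a 32-bit word by s positions (0 <= s <= 31)."""
    return ((x << s) | (x >> (32 - s))) & MASK32

def not32(x: int) -> int:
    """Bitwise complement restricted to 32 bits."""
    return ~x & MASK32
```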
31
Padding and Blocks
• Same as MD5
• y = M1 || M2 || …. || Mn.
32
MD-5 and SHA-1
33
SHA-1
• Four constants are used :
– Kt = 0x5a827999, for t = 0 to 19
– Kt = 0x6ed9eba1, for t = 20 to 39
– Kt = 0x8f1bbcdc, for t = 40 to 59
– Kt = 0xca62c1d6, for t = 60 to 79
34
SHA-1 functions
• Define the functions f_0, …, f_79 as follows:
f_t(B,C,D) =
• (B ∧ C) ∨ ((¬B) ∧ D) if 0 ≤ t ≤ 19
• B xor C xor D if 20 ≤ t ≤ 39
• (B ∧ C) ∨ (B ∧ D) ∨ (C ∧ D) if 40 ≤ t ≤ 59
• B xor C xor D if 60 ≤ t ≤ 79.
• Each function ft takes three words B, C and D as input, and
produces one word as output.
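The round-dependent function can be written as a single Python function of t. This is a sketch of the definition above; the name f and the 32-bit masking are implementation choices.

```python
MASK = 0xFFFFFFFF

def f(t: int, b: int, c: int, d: int) -> int:
    """SHA-1 logical function f_t applied to three 32-bit words."""
    if 0 <= t <= 19:
        return ((b & c) | (~b & d)) & MASK          # choose: b ? c : d
    if 20 <= t <= 39 or 60 <= t <= 79:
        return (b ^ c ^ d) & MASK                   # parity
    if 40 <= t <= 59:
        return ((b & c) | (b & d) | (c & d)) & MASK  # majority
    raise ValueError("t must be in 0..79")
```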
35
SHA-1
• H0 67452301
• H1 EFCDAB89
• H2 98BADCFE
• H3 10325476
• H4 C3D2E1F0
• Define the word constants K0,…,K79, which are used in the computation of
SHA-1(x), as follows:
• Kt =
• 5A827999 if 0 ≤ t ≤ 19
• 6ED9EBA1 if 20 ≤ t ≤ 39
• 8F1BBCDC if 40 ≤ t ≤ 59
• CA62C1D6 if 60 ≤ t ≤ 79
36
SHA-1
• for i ← 1 to n
• do
• denote M_i = W_0 || W_1 || … || W_15, where each W_j is a 32-bit word
• for t ← 16 to 79
• do W_t ← ROTL^1(W_{t-3} xor W_{t-8} xor W_{t-14} xor W_{t-16})
• A ← H0
• B ← H1
• C ← H2
• D ← H3
• E ← H4
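The message expansion step for one 512-bit block can be sketched as below. The function name message_schedule is hypothetical; words are read high-order byte first, as SHA-1 specifies.

```python
import struct

def rotl(x: int, s: int) -> int:
    return ((x << s) | (x >> (32 - s))) & 0xFFFFFFFF

def message_schedule(block: bytes) -> list:
    """Expand one 64-byte block into the 80 words W_0 .. W_79."""
    if len(block) != 64:
        raise ValueError("block must be exactly 64 bytes")
    w = list(struct.unpack(">16I", block))   # W_0 .. W_15 from the block
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w
```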
37
Description of SHA-1
39
One SHA-1 operation
For t=0 to 79
TEMP = (a <<< 5) + ft(b,c,d) + e + Wt + Kt
e=d
d=c
c=b <<< 30
b=a
a = TEMP
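The pieces above (padding, message expansion, the 80 steps, and the constants K_t) can be combined into a compact sketch of the whole algorithm. The final addition of the working registers into H0..H4 after each block is taken from the SHA-1 specification, since the slide shows only the step loop. The output is compared against hashlib's SHA-1 as a sanity check.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def sha1(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    bit_len = len(message) * 8
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack(">Q", bit_len)          # length, high byte first
    for off in range(0, len(message), 64):
        w = list(struct.unpack(">16I", message[off:off + 64]))
        for t in range(16, 80):                    # message expansion
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + w[t] + k) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        # Add this block's result into the chaining variables.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h).hex()

print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
```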
40
Whirlpool – 512 bit HF – using AES internals - endorsed by
European NESSIE project – updating 512 bit buffer
41
SHA-3 <evolving standard>
• In 2007, NIST initiated a public competition to develop
one or more additional hash algorithms
• Goal: a new cryptographic hash algorithm
that converts a variable-length message into a
short "message digest" that can be used in
generating digital signatures, message
authentication codes, and many other security
applications in the information infrastructure
42
SHA-3
• a tunable security parameter, such as the
number of rounds, which would allow the
selection of a range of possible
security/performance tradeoffs
• a recommended value for each digest size
43
SHA-3
• First round – 64 submissions received, 51 accepted as
candidates – Dec 2008
• Second round – 14 candidates – July 2009
• Third round – 5 finalists – Dec 2010
• Winner announced – Oct 2012
44
5 finalists
• BLAKE
• Grøstl
• JH
• Keccak
• Skein
45
SHA3
• Performance, Security, Cryptanalysis, Diversity
• Winner – Keccak (based on sponge functions)
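For experimentation, the standardized SHA-3 functions derived from Keccak are available in Python's hashlib module. The sketch below assumes Python 3.6 or later; the sample input is arbitrary. The SHAKE functions illustrate one consequence of the sponge design: the output length is chosen by the caller.

```python
import hashlib

# Fixed-length SHA-3 digest (256 bits = 64 hex characters).
digest = hashlib.sha3_256(b"abc").hexdigest()

# SHAKE128 is an extendable-output function: the caller picks the length,
# and a longer output starts with the shorter one.
xof_16 = hashlib.shake_128(b"abc").hexdigest(16)
xof_32 = hashlib.shake_128(b"abc").hexdigest(32)

print(len(digest), xof_32.startswith(xof_16))
```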
46
HF design and security issues
• Known Answer Tests (KATs) and Monte Carlo Tests
(MCTs)
• Known Answer Test values must be provided with
submissions; they demonstrate operation of the SHA-3
candidate algorithm with varying-length inputs, for
each of the minimum required hash length values (224,
256, 384, and 512 bits).
• There are three types of KATs that are required for all
submissions: 1) Short Message Test, 2) Long Message
Test, and 3) Extremely Long Message Test (2^32 bits).
47
MCT
• The Monte Carlo Test provides a way to stress the
internal components of a candidate algorithm.
• A seed message will be provided. This seed is
used by a pseudorandom function together with
the candidate algorithm to generate 100,000
message digests.
• 100 of these 100,000 message digests, i.e. once
every 1,000 hashes, are recorded as checkpoints
to the operation of the candidate algorithm
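The checkpoint idea can be sketched as follows. This is a simplified illustration, not the NIST procedure: here each digest is simply fed back in as the next message, whereas the official test specifies exactly how each new message is formed. The function name, the seed, and the use of SHA-256 are assumptions.

```python
import hashlib

def monte_carlo_checkpoints(seed: bytes, total: int = 100_000,
                            every: int = 1_000, algo: str = "sha256") -> list:
    """Chain `total` hashes from a seed; record every `every`-th digest."""
    checkpoints = []
    msg = seed
    for count in range(1, total + 1):
        msg = hashlib.new(algo, msg).digest()  # next message = last digest
        if count % every == 0:
            checkpoints.append(msg.hex())
    return checkpoints

points = monte_carlo_checkpoints(b"seed message")
print(len(points))  # 100
```

Two implementations of the same algorithm must produce identical checkpoint lists, so a mismatch localizes a fault to a window of 1,000 hashes.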
48
HF Security issues
• collision-finding, first-preimage-finding, second-preimage-
finding, length-extension attack, multicollision attack
49
Security tests - compression function
• 1-bit sensitivity test – change only a single bit in an
input block (one bit after another) and look for output
collisions
• 2-bit sensitivity test (change two bits – balancing)
• Exhaustive collision detection test (take input message
of 1 block, no padding, compute hash, modify input
block exhaustively and compute hash, look for
collisions)
• Regular-pattern inputs (any regular pattern at the input)
50
Tests
• Statistical tests (correlations between input and output): is any hash
bit
• influenced by the number of 0’s or 1’s in the input?
• influenced by any particular group of input bits?
• Avalanche effect (changing 1 bit at the input changes how many
bits in the output?)
• Security – input discovery, collisions, any functional weakness
– specific input patterns (e.g., all 0’s, all 1’s)
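The avalanche test can be sketched by flipping each input bit in turn and counting how many output bits change. For a well-behaved hash, about half of the output bits should flip on average. The helper names, the sample message, and the use of SHA-256 are assumptions for illustration.

```python
import hashlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which a and b differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche_profile(message: bytes, algo: str = "sha256") -> list:
    """For each input bit, count the output bits changed by flipping it."""
    base = hashlib.new(algo, message).digest()
    return [hamming_distance(base, hashlib.new(algo, flip_bit(message, i)).digest())
            for i in range(len(message) * 8)]

profile = avalanche_profile(b"hash functions!!")
print(min(profile), sum(profile) / len(profile), max(profile))
```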
51
Commitment Protocol
• Goal: A and B wish to play “odd or even” over the network
• Naive Commitment Protocol
• A picks a number X and sends it to B
• B picks a number Y and sends it to A
• A wins if X+Y is odd
• B wins if X+Y is even
• Problem: How can we guarantee that B doesn’t cheat?
52
Commitment Protocols with Hash
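One common way to prevent cheating in the "odd or even" game is for A to send a hash commitment to X first and reveal X only after receiving Y. The sketch below assumes SHA-256, a random 32-byte nonce to keep small values from being guessed, and hypothetical function names commit and verify.

```python
import hashlib
import hmac
import secrets

def commit(value: int):
    """A: commit to a value without revealing it."""
    nonce = secrets.token_bytes(32)            # hides low-entropy values
    digest = hashlib.sha256(nonce + str(value).encode()).hexdigest()
    return digest, nonce                       # send digest now, keep nonce

def verify(commitment: str, value: int, nonce: bytes) -> bool:
    """B: check that the revealed value matches the earlier commitment."""
    expected = hashlib.sha256(nonce + str(value).encode()).hexdigest()
    return hmac.compare_digest(expected, commitment)

# Protocol run: A commits, B answers in the clear, A reveals, B verifies.
x = 7
commitment, nonce = commit(x)
y = 4
print(verify(commitment, x, nonce), (x + y) % 2 == 1)
```

B cannot learn X from the commitment before choosing Y, and A cannot change X afterwards without finding a collision.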
53
Summary
• Hash function generates a small identifier of a large object
(number, message, file, document, etc.)
• Speed of computation – fast (simple mathematical functions are used)
• Computational Complexity increases linearly with digest size and
input length (no. of blocks processed)
• Memory requirement (for storing temporary values)
• Published material (IV, Padding technique (bit sequence, byte
count))
• Scalability (by increasing block length)
• Security issues – more rounds of computation remove predictable
output behavior for specific input patterns
54