Lecture8 AES - The Advanced Encryption Standard
Lecture8 AES - The Advanced Encryption Standard
February 4, 2017
7:16am
c
2017 Avinash Kak, Purdue University
Goals:
Python and Perl implementations for creating the lookup tables for the
byte substitution steps in encryption and decryption.
Python implementation of the Key Expansion Algorithms for the 128 bit,
192 bit, and 256 bit AES.
2
Computer and Network Security by Avi Kak Lecture 8
AES allows for three different key lengths: 128, 192, or 256 bits.
Most of our discussion will assume that the key length is 128
bits. [With regard to using a key length other than 128 bits,
the main thing that changes in AES is how you generate the key
schedule from the key an issue I address at the end of Section
8.8.1. The notion of key schedule in AES is explained in Sections
8.2 and 8.8.]
Except for the last round in each case, all other rounds are iden-
tical.
3
Computer and Network Security by Avi Kak Lecture 8
step, and the addition of the round key. The order in which these
four steps are executed is different for encryption and decryption.
Therefore, the first four bytes of a 128-bit input block occupy the
first column in the 4 4 matrix of bytes. The next four bytes
occupy the second column, and so on.
print statearray
print statearray[0]
print statearray[2][3]
block = range(128)
for i in range(4):
for j in range(4):
statearray[j][i] = block[32*i + 8*j:32*i + 8*(j+1)]
for i in range(4):
for j in range(4):
print statearray[i][j], " ",
5
Computer and Network Security by Avi Kak Lecture 8
RSA Security; Serpent by Ross Anderson, Eli Biham, and Lars Knudsen; and Twofish by a team led by the
always-in-the-news cryptographer Bruce Schneier. Rijndael was selected from these five after extensive testing
Whereas AES requires the block size to be 128 bits, the original
Rijndael cipher works with any block size (and any key size) that
is a multiple of 32 as long as it exceeds 128. The state array for
the different block sizes still has only four rows in the Rijndael
cipher. However, the number of columns depends on size of the
block. For example, when the block size is 192, the Rijndael
cipher requires a state array to consist of 4 rows and 6 columns.
6
Computer and Network Security by Avi Kak Lecture 8
round involves a key that is specific to that round this key is known as the round key. Round keys are
7
Computer and Network Security by Avi Kak Lecture 8
4-bits each. And how we prepended each 4-bit segment with the last bit of the previous 4-bit segment and
appended to each 4-bit segment the first bit of the next 4-bit segment. Subsequently, in order to find the
substitution 4-bits for an incoming 4-bit segment, we used the first and the last bit thus acquired for indexing
into the four rows of a 4 16 S-box, while using the 4-bit segment itself for indexing into the columns of the
the block coming into a round. On the other hand, all operations
in AES are purely byte-level, which makes for convenient and fast
software implementation of AES.
be an example of a theoretical attack since it is beyond the realm of any practical implementation. There is a
meet-in-the-middle attack called the biclique attack that very marginally improves upon this time complexity to
around 2126 which is still just a theoretical attack. The biclique attack was presented by Andrey Bogdanov,
Dmitry Khovratovich, and Christian Rechberger in their 2011 publication Biclique Cryptanalysis of the Full
AES. ]
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
9
Computer and Network Security by Avi Kak Lecture 8
The four column words of the key matrix are expanded into a
schedule of 44 words. (As to how exactly this is done, we will
explain that later in Section 8.8.) Each round consumes four
words from the key schedule.
10
Computer and Network Security by Avi Kak Lecture 8
k k k k
0 4 8 12
k k k k
1 5 9 13
k k k k
2 6 10 14
k k k k
3 7 11 15
w w w w w w w w
0 1 2 3 4 5 42 43
11
Computer and Network Security by Avi Kak Lecture 8
The number of rounds shown in Figure 2, 10, is for the case when
the encryption key is 128 bit long. (As mentioned earlier, the
number of rounds is 12 when the key is 192 bits, and 14 when
the key is 256.)
12
Computer and Network Security by Avi Kak Lecture 8
The last round for encryption does not involve the Mix columns
step. The last round for decryption does not involve the Inverse
mix columns step.
13
Computer and Network Security by Avi Kak Lecture 8
Key Schedule
Round 1 w w Round 9
w w
4 7 4 7
Round 2 w w Round 8
w w
8 11 8 11
14
Computer and Network Security by Avi Kak Lecture 8
Figure 3 shows the different steps that are carried out in each round
except the last one. [See the end of the previous section as to what steps are not allowed in the last
round.]
The entries in the lookup table are created by using the notions
of multiplicative inverses in GF (28) and bit scrambling to
destroy the bit-level correlations inside each byte. [See Lecture
7 for what is meant by the notation GF (28).]
STEP 2: (called ShiftRows for shifting the rows of the state array
during the forward process) (The corresponding transformation
15
Computer and Network Security by Avi Kak Lecture 8
Round Key
Shift Rows Add Round Key
Round Key
Add Round Key Inverse Shift Rows
16
Computer and Network Security by Avi Kak Lecture 8
17
Computer and Network Security by Avi Kak Lecture 8
the average there will be many bits of the ciphertext that will be identical to the
plaintext bits in the same positions after you have changed one bit of the plaintext.
However, again on the average, when you change one bit of the plaintext, you will
see its effect spanning all of the 128 bits of the ciphertext block. On the other
hand, with DES, changing one bit of the plaintext affects only 31 bit positions on
the average.]
18
Computer and Network Security by Avi Kak Lecture 8
20
Computer and Network Security by Avi Kak Lecture 8
21
Computer and Network Security by Avi Kak Lecture 8
We first fill each cell of the 16 16 table with the byte obtained
by joining together its row index and the column index. [The row
index of this table runs from hex 0 through hex F . Likewise, the column index runs
from hex 0 through hex F .]
For example, for the cell located at row index 2 and column
indexed 7, we place hex 0x27 in the cell. So at this point the
table will look like
0 1 2 3 4 5 6 7 8 9 ....
----------------------------------------------------
0 | 00 01 02 03 04 05 06 07 08 09 ....
|
1 | 10 11 12 13 14 15 16 17 18 19 ....
|
2 | 20 21 22 23 24 25 26 27 28 29 ....
|
.........
.........
After the above step, lets represent a byte stored in a cell of the
table by b7b6b5b4b3b2b1b0 where b7 is the MSB and b0 the LSB.
For example, the byte stored in the cell (9, 5) of the above table
is the multiplicative inverse (MI) of 0x95, which is 0x8A.
Therefore, at this point, the bit pattern stored in the cell with
row index 9 and column index 5 is 10001010, implying that b7
is 1 and b0 is 0. [Verify the fact that the MI of 0x95 is indeed 0x8A. The
polynomial representation of 0x95 (bit pattern: 10010101) is x7 + x4 + x2 + 1, and the
same for 0x8A (bit pattern: 10001010) is x7 + x3 + x. Now show that the product of
these two polynomials modulo the polynomial x8 + x4 + x3 + x + 1 is indeed 1.]
1 0 0 0 1 1 1 1
b0
1
1 1 0 0 0 1 1 1
b1
1
1 1 1 0 0 0 1 1
b2
0
1 1 1 1 0 0 0 1
b3
0
1 1 1 1 1 0 0 0
b4
0
0 1 1 1 1 1 0 0
b5
1
0 0 1 1 1 1 1 0
b6
1
0 0 0 1 1 1 1 1 b7 0
For bit scrambling for decryption, you carry out the following
bit-level transformation in each cell of the table:
bi = b(i+2) mod 8 b(i+5) mod 8 b(i+7) mod 8 di
constant d. No that does not happen. For decryption, the goal of bit scrambling is to reverse the effect of bit
scrambling on the encryption side. The bit scrambling operation for decryption maps 0x00 to 0x52. ]
The bytes c and d are chosen so that the S-box has no fixed points.
That is, we do not want S box(a) = a for any a. Neither do we
want S box(a) = a where a is the bitwise complement of a.
26
Computer and Network Security by Avi Kak Lecture 8
Section 8.5 and the Subsection 8.5.1 presented two different ways
of implementing the AES byte substitution step. As stated earlier
in Section 8.5, both these explanations are equivalent in the
sense that either will result in the same substitution byte for a
given input byte.
27
Computer and Network Security by Avi Kak Lecture 8
#!/usr/bin/env python
## gen_tables.py
## Avi Kak (February 15, 2015)
import sys
from BitVector import *
AES_modulus = BitVector(bitstring=100011011)
subBytesTable = [] # SBox for encryption
invSubBytesTable = [] # SBox for decryption
def genTables():
c = BitVector(bitstring=01100011)
d = BitVector(bitstring=00000101)
for i in range(0, 256):
# For the encryption SBox
a = BitVector(intVal = i, size=8).gf_MI(AES_modulus, 8) if i != 0 else BitVector(intVal=0)
# For bit scrambling for the encryption SBox entries:
a1,a2,a3,a4 = [a.deep_copy() for x in range(4)]
a ^= (a1 >> 4) ^ (a2 >> 5) ^ (a3 >> 6) ^ (a4 >> 7) ^ c
subBytesTable.append(int(a))
# For the decryption Sbox:
b = BitVector(intVal = i, size=8)
# For bit scrambling for the decryption SBox entries:
b1,b2,b3 = [b.deep_copy() for x in range(3)]
b = (b1 >> 2) ^ (b2 >> 5) ^ (b3 >> 7) ^ d
check = b.gf_MI(AES_modulus, 8)
b = check if isinstance(check, BitVector) else 0
invSubBytesTable.append(int(b))
genTables()
print "SBox for Encryption:"
print subBytesTable
print "\nSBox for Decryption:"
print invSubBytesTable
And shown below is the Perl implementation for doing the same
thing:
28
Computer and Network Security by Avi Kak Lecture 8
#!/usr/bin/env perl
## gen_tables.pl
## Avi Kak (February 16, 2015)
use strict;
use warnings;
use Algorithm::BitVector;
sub genTables {
my $c = Algorithm::BitVector->new(bitstring => 01100011);
my $d = Algorithm::BitVector->new(bitstring => 00000101);
foreach my $i (0..255) {
# For the encryption SBox:
my $a = $i == 0 ? Algorithm::BitVector->new(intVal => 0) :
Algorithm::BitVector->new(intVal => $i, size => 8)->gf_MI($AES_modulus, 8);
# For bit scrambling for the encryption SBox entries:
my ($a1,$a2,$a3,$a4) = map $a->deep_copy(), 0 .. 3;
$a ^= ($a1 >> 4) ^ ($a2 >> 5) ^ ($a3 >> 6) ^ ($a4 >> 7) ^ $c;
push @subBytesTable, int($a);
# For the decryption Sbox:
my $b = Algorithm::BitVector->new(intVal => $i, size => 8);
# For bit scrambling for the decryption SBox entries:
my ($b1,$b2,$b3) = map $b->deep_copy(), 0 .. 2;
$b = ($b1 >> 2) ^ ($b2 >> 5) ^ ($b3 >> 7) ^ $d;
my $check = $b->gf_MI($AES_modulus, 8);
$b = ref($check) eq Algorithm::BitVector ? $check : 0;
push @invSubBytesTable, int($b);
}
}
genTables();
print "SBox for Encryption:\n";
print "@subBytesTable\n";
print "\nSBox for Decryption:\n";
print "@invSubBytesTable\n";
29
Computer and Network Security by Avi Kak Lecture 8
99 124 119 123 242 107 111 197 48 1 103 43 254 215 171 118
202 130 201 125 250 89 71 240 173 212 162 175 156 164 114 192
183 253 147 38 54 63 247 204 52 165 229 241 113 216 49 21
4 199 35 195 24 150 5 154 7 18 128 226 235 39 178 117
9 131 44 26 27 110 90 160 82 59 214 179 41 227 47 132
83 209 0 237 32 252 177 91 106 203 190 57 74 76 88 207
208 239 170 251 67 77 51 133 69 249 2 127 80 60 159 168
81 163 64 143 146 157 56 245 188 182 218 33 16 255 243 210
205 12 19 236 95 151 68 23 196 167 126 61 100 93 25 115
96 129 79 220 34 42 144 136 70 238 184 20 222 94 11 219
224 50 58 10 73 6 36 92 194 211 172 98 145 149 228 121
231 200 55 109 141 213 78 169 108 86 244 234 101 122 174 8
186 120 37 46 28 166 180 198 232 221 116 31 75 189 139 138
112 62 181 102 72 3 246 14 97 53 87 185 134 193 29 158
225 248 152 17 105 217 142 148 155 30 135 233 206 85 40 223
140 161 137 13 191 230 66 104 65 153 45 15 176 84 187 22
82 9 106 213 48 54 165 56 191 64 163 158 129 243 215 251
124 227 57 130 155 47 255 135 52 142 67 68 196 222 233 203
84 123 148 50 166 194 35 61 238 76 149 11 66 250 195 78
8 46 161 102 40 217 36 178 118 91 162 73 109 139 209 37
114 248 246 100 134 104 152 22 212 164 92 204 93 101 182 146
108 112 72 80 253 237 185 218 94 21 70 87 167 141 157 132
144 216 171 0 140 188 211 10 247 228 88 5 184 179 69 6
208 44 30 143 202 63 15 2 193 175 189 3 1 19 138 107
58 145 17 65 79 103 220 234 151 242 207 206 240 180 230 115
150 172 116 34 231 173 53 133 226 249 55 232 28 117 223 110
71 241 26 113 29 41 197 137 111 183 98 14 170 24 190 27
252 86 62 75 198 210 121 32 154 219 192 254 120 205 90 244
31 221 168 51 136 7 199 49 177 18 16 89 39 128 236 95
96 81 127 169 25 181 74 13 45 229 122 159 147 201 156 239
160 224 59 77 174 42 245 176 200 235 187 60 131 83 153 97
23 43 4 126 186 119 214 38 225 105 20 99 85 33 12 125
The Python and Perl scripts in this section can be downloaded from
the link associated with Lecture 8 at the Lecture Notes website.
30
Computer and Network Security by Avi Kak Lecture 8
===> 1.1
s s2,1 s2,2 s2,3 s2.2 s2,3 s2,0 s2,1
2.0
s3.0 s3,1 s3,2 s3,3 s3.3 s3,0 s3,1 s3,2
32
Computer and Network Security by Avi Kak Lecture 8
For the bytes in the first row of the state array, this operation
can be stated as
s0,j = (0x02 s0,j ) (0x03 s1,j ) s2,j s3,j
33
Computer and Network Security by Avi Kak Lecture 8
For the bytes in the second row of the state array, this operation
can be stated as
s1,j = s0,j (0x02 s1,j ) (0x03 s2,j ) s3,j
For the bytes in the third row of the state array, this operation
can be stated as
s2,j = s0,j s1,j (0x02 s2,j ) (0x03 s3,j )
And, for the bytes in the fourth row of the state array, this
operation can be stated as
where, on the left hand side, when a row of the leftmost matrix
multiples a column of the state array matrix, additions involved
are meant to be XOR operations.
34
Computer and Network Security by Avi Kak Lecture 8
s0,1 s0,2 s0,3
0E 0B 0D 09 s0.0 s0,1 s0,2 s0,3 s0.0
09 0E 0B 0D s1.0 s1,1 s1,2 s1,3 s s1,1 s1,2 s1,3
= 1.0
0D 09 0E 0B s2.0 s2,1 s2,2 s2,3 s2.0 s2,1 s2,2 s2,3
s3.0 s3,1 s3,2 s3,3
0B 0D 09 0E s3.0 s3,1 s3,2 s3,3
35
Computer and Network Security by Avi Kak Lecture 8
Each round has its own round key that is derived from the original
128-bit encryption key in the manner described in this section.
One of the four steps of each round, for both encryption and
decryption, involves XORing of the round key with the state
array.
36
Computer and Network Security by Avi Kak Lecture 8
k0 k4 k8 k12
k1 k5 k9 k13
k k6 k10 k14
2
k3 k7 k11 k15
[ w0 w1 w2 w3 ]
The first four bytes of the encryption key constitute the word w0,
the next four bytes the word w1 , and so on.
Of these, the words [w0, w1, w2, w3] are bitwise XORed with the
input block before the round-based processing begins.
The remaining 40 words of the key schedule are used four words
at a time in each of the 10 rounds.
The above two statements are also true for decryption, except for
the fact that we now reverse the order of the words in the key
37
Computer and Network Security by Avi Kak Lecture 8
Now comes the difficult part: How does the Key Expansion
Algorithm expand four words w0, w1 , w2, w3 into the 44 words
w0, w1, w2, w3 , w4, w5, ........, w43 ?
38
Computer and Network Security by Avi Kak Lecture 8
w0 w1 w2 w3 g
w4 w5 w6 w7 g
w8 w9 w10 w11
39
Computer and Network Security by Avi Kak Lecture 8
Lets say that we have the four words of the round key for the ith
round:
wi wi+1 wi+2 wi+3
For these to serve as the round key for the ith round, i must be
a multiple of 4. These will obviously serve as the round key for
the (i/4)th round. For example, w4, w5, w6 , w7 is the round key
for round 1, the sequence of words w8, w9 , w10, w11 the round key
for round 2, and so on.
40
Computer and Network Security by Avi Kak Lecture 8
Note that except for the first word in a new 4-word grouping, each
word is an XOR of the previous word and the corresponding word
in the previous 4-word grouping.
That is, the first word of the new 4-word grouping is to be ob-
tained by XORing the first word of the last grouping with what
is returned by applying a function g() to the last word of the
previous 4-word grouping.
41
Computer and Network Security by Avi Kak Lecture 8
XOR the bytes obtained from the previous step with what is
known as a round constant. The round constant is a
word whose three rightmost bytes are always zero. Therefore,
XORing with the round constant amounts to XORing with
just its leftmost byte.
The only non-zero byte in the round constants, RC[i], obeys the
following recursion:
RC[1] = 0x01
RC[j] = 0x02 RC[j 1]
42
Computer and Network Security by Avi Kak Lecture 8
lecture, each round will still use only 4 words of the key schedule. Just as we organized the 128-bit key in the
form of 4 key words for the purpose of key expansion, we organize the 192 bit key in the form of six words.
The key expansion algorithm will take us from six words to six words for a total of nine key-expansion steps
43
Computer and Network Security by Avi Kak Lecture 8
with each step looking the same as what we described at the beginning of this section. Yes, it is true that
the key expansion will now generate a total of 54 words while we need only 52 we simply ignore the last two
words of the key schedule. With regard to the details of going from the six words of the j th key-expansion step
to the six words of the (j + 1)th key expansion step, lets focus on going from the initial (w0 , w1 , w2 , w3 , w4 ,
w5 ) to (w6 , w7 , w8 , w9 , w10 , w11 ). We generate the last five words of the latter from the last five words of
the former through straightforward XORing as was the case earlier in this section. As for the first word of the
latter, we generate it from the first and the last words of the former through the g function again as described
The cool thing about the 128-bit key is that you can think of the
key expansion being in one-one correspondence with the rounds.
However, that is no longer the case with, say, the 192-bit keys.
Now you have to think of key expansion as something that is
divorced even conceptually from round-based processing of the
input block.
44
Computer and Network Security by Avi Kak Lecture 8
45
Computer and Network Security by Avi Kak Lecture 8
With regard to key expansion, the main focus of the previous sub-
section was the 128-bit AES. Toward the end, I briefly described
the modifications needed for the case of 192-bit and 256-bit AES.
The goal of the implementation shown in this section is to clarify
the various steps for all three cases.
When you execute the code shown below, it will prompt you for
AES key size obviously, the number you enter must be one of
128, 192, and 256.
Subsequently, it will prompt you for the key. You are allowed to
enter any number of characters for the key. If the length of the
key you enter is shorter than what is need to fill the full width
of the AES key size, the script appends the character 0 to your
key to bring it up to the required size. On the other hand, if
you enter a key longer than what is needed, it will only use the
number of characters it needs.
46
Computer and Network Security by Avi Kak Lecture 8
#!/usr/bin/env python
## gen_key_schedule.py
## Avi Kak (April 10, 2016, bug fix: January 27, 2017)
## This script is for demonstrating the AES algorithm for generating the
## key schedule.
## It will prompt you for the key size, which must be one of 128, 192, 256.
## It will also prompt you for a key. If the key you enter is shorter
## than what is needed for the AES key size, we add zeros on the right of
## the key so that its length is as needed by the AES key size.
import sys
from BitVector import *
AES_modulus = BitVector(bitstring=100011011)
def main():
key_words = []
keysize, key_bv = get_key_from_user()
if keysize == 128:
key_words = gen_key_schedule_128(key_bv)
elif keysize == 192:
key_words = gen_key_schedule_192(key_bv)
elif keysize == 256:
key_words = gen_key_schedule_256(key_bv)
else:
sys.exit("wrong keysize --- aborting")
key_schedule = []
print("\nEach 32-bit word of the key schedule is shown as a sequence of 4 one-byte integers:")
for word_index,word in enumerate(key_words):
keyword_in_ints = []
for i in range(4):
keyword_in_ints.append(word[i*8:i*8+8].intValue())
if word_index % 4 == 0: print("\n")
print("word %d: %s" % (word_index, str(keyword_in_ints)))
key_schedule.append(keyword_in_ints)
# print("\n\nkey schedule: %s" % str(key_schedule))
num_rounds = None
if keysize == 128: num_rounds = 10
if keysize == 192: num_rounds = 12
if keysize == 256: num_rounds = 14
round_keys = [None for i in range(num_rounds+1)]
for i in range(num_rounds+1):
round_keys[i] = (key_words[i*4] + key_words[i*4+1] + key_words[i*4+2] +
key_words[i*4+3]).get_bitvector_in_hex()
print("\n\nRound keys in hex (first key for input block):\n")
for round_key in round_keys:
print(round_key)
47
Computer and Network Security by Avi Kak Lecture 8
def gen_key_schedule_128(key_bv):
byte_sub_table = gen_subbytes_table()
# We need 44 keywords in the key schedule for 128 bit AES. Each keyword is 32-bits
# wide. The 128-bit AES uses the first four keywords to xor the input block with.
# Subsequently, each of the 10 rounds uses 4 keywords from the key schedule. We will
# store all 44 keywords in the following list:
key_words = [None for i in range(44)]
round_constant = BitVector(intVal = 0x01, size=8)
for i in range(4):
key_words[i] = key_bv[i*32 : i*32 + 32]
for i in range(4,44):
if i%4 == 0:
kwd, round_constant = gee(key_words[i-1], round_constant, byte_sub_table)
key_words[i] = key_words[i-4] ^ kwd
else:
key_words[i] = key_words[i-4] ^ key_words[i-1]
return key_words
def gen_key_schedule_192(key_bv):
byte_sub_table = gen_subbytes_table()
# We need 52 keywords (each keyword consists of 32 bits) in the key schedule for
# 128 bit AES. The 128-bit AES uses the first four keywords to xor the input
# block with. Subsequently, each of the 12 rounds uses 4 keywords from the key
# schedule. We will store all 52 keywords in the following list:
key_words = [None for i in range(52)]
round_constant = BitVector(intVal = 0x01, size=8)
for i in range(6):
key_words[i] = key_bv[i*32 : i*32 + 32]
for i in range(6,52):
if i%6 == 0:
kwd, round_constant = gee(key_words[i-1], round_constant, byte_sub_table)
key_words[i] = key_words[i-6] ^ kwd
else:
key_words[i] = key_words[i-6] ^ key_words[i-1]
return key_words
def gen_key_schedule_256(key_bv):
byte_sub_table = gen_subbytes_table()
# We need 60 keywords (each keyword consists of 32 bits) in the key schedule for
# 128 bit AES. The 128-bit AES uses the first four keywords to xor the input
# block with. Subsequently, each of the 14 rounds uses 4 keywords from the key
# schedule. We will store all 60 keywords in the following list:
48
Computer and Network Security by Avi Kak Lecture 8
def gen_subbytes_table():
subBytesTable = []
c = BitVector(bitstring=01100011)
for i in range(0, 256):
a = BitVector(intVal = i, size=8).gf_MI(AES_modulus, 8) if i != 0 else BitVector(intVal=0)
a1,a2,a3,a4 = [a.deep_copy() for x in range(4)]
a ^= (a1 >> 4) ^ (a2 >> 5) ^ (a3 >> 6) ^ (a4 >> 7) ^ c
subBytesTable.append(int(a))
return subBytesTable
def get_key_from_user():
key = keysize = None
if sys.version_info[0] == 3:
keysize = int(input("\nAES Key size: "))
assert any(x == keysize for x in [128,192,256]), \
"keysize is wrong (must be one of 128, 192, or 256) --- aborting"
key = input("\nEnter key (any number of chars): ")
else:
keysize = int(raw_input("\nAES Key size: "))
assert any(x == keysize for x in [128,192,256]), \
"keysize is wrong (must be one of 128, 192, or 256) --- aborting"
key = raw_input("\nEnter key (any number of chars): ")
key = key.strip()
key += 0 * (keysize//8 - len(key)) if len(key) < keysize//8 else key[:keysize//8]
key_bv = BitVector( textstring = key )
return keysize,key_bv
main()
Each 32-bit word of the key schedule is shown as a sequence of 4 one-byte integers:
50
Computer and Network Security by Avi Kak Lecture 8
68656c6c6f3030303030303030303030
6d616868025158583261686802515858
be0b021fbc5a5a478e3b322f8c6a6a77
b809f77b0453ad3c8a689f130602f564
c7efb414c3bc192849d4863b4fd6735f
21607b90e2dc62b8ab08e483e4de97dc
1ce8fdf9fe349f41553c7bc2b1e2ec1e
c4268f313a1210706f2e6bb2decc87ac
0f311e2c35230e5c5a0d65ee84c1e242
6ca93273598a3c2f038759c18746bb83
0043de6459c9e24b5a4ebb8add080009
51
Computer and Network Security by Avi Kak Lecture 8
by Biham and Shamir in a paper titled Differential Cryptanalysis of DES-like Cryptosystems that appeared
in the Journal of Cryptology in 1991. The linear attack was firs described by Matsui in a publication titled
Linear Cryptanalysis Method for DES Ciphers, in Lecture Notes in Computer Science, no. 764. Finally,
the interpolation attack by first described by Jakobsen and Knudsen in a publication titled The Interpolation
Attack on Block Ciphers that appeared in Lecture Notes in Computer Science, Haifa, 1997. ]
The rest of this section reviews these three attacks briefly. [You will
get more out of this section if you first read the tutorial A Tutorial on Linear and Differential Cryptanalysis
by Howard Heys of the Memorial University of Newfoundland. Googling that authors name will take you
53
Computer and Network Security by Avi Kak Lecture 8
Y as the output differential. The fact that the propagation of a differential is NOT
affected by the round keys can be established in the following manner: Consider just
one round and lets say that K is the round key. Lets further say that the output of
the round is what is produced by the SBox XORed with the round key. For two dif-
ferent inputs X1 and X2 to the round, let Y1 and Y2 denote the outputs of the SBox
and and let Y1 and Y2 denote the final output of the round. We have Y1 = K Y1
and Y2 = K Y2 . The differential Y = Y1 Y2 for the output after key mixing is
related to the other differentials by Y = Y1 Y2 = K Y1 K Y2 = Y1 Y2 .
Therefore, the mapping between the input and the output differentials
of a round is not a function of the round key.]
operation of the SBox. If you use the statement in line (B8), the
byte substitutions will consist of replacing each byte by its MI in
GF (28) that is based on the AES modulus. On the other hand, if
you use the currently commented-out statement in line (B9), the
byte substitution will take place according to the lookup table
supplied through line (A9). [Yes, to be precise, the MI based byte substitution could also
be carried out through a lookup table. That is, just because one SBox is based on MI calculations and the
other on looking up a table is NOT the fundamental difference between the two. The lookup table supplied
through line (A9) was arrived at by experimenting with several such choices made possible by the commented
out statements in lines (A7) and (A8). The call to shuffle() in line (A7) gives a pseudorandom permutation
of the 256 one-byte words. Based on a dozen runs of the script, the permutation shown in line (A9) yielded
the best inhomogeneous histogram for the input/output differentials. The reader may wish to carry out such
]
experiments on his/her own and possibly make a different choice for the lookup table in line (A9).
The portion of the script starting with line (F1) is just for display-
ing the histogram of the input/output differentials and, therefore,
not central to understand what we mean by the differentials here
and the correlations between the input differentials and the out-
put differentials.
#!/usr/bin/perl -w
## find_differentials_correlations.pl
## This script creates a histogram of the mapping between the input differentials
## and the output differentials for an SBox. You have two choices for the SBox ---
## as reflected by lines (B8) and (B9) of the script. For a given input byte, the
## statement in line (B8) returns the MI (multiplicative inverse) of the byte in
## GF(2^8) based on the AES modulus. And the statement in line (B8) returns a byte
## through a user-specified table lookup. The table for this is specified in line
## (A9). More generally, such a table can be created by a random permutation
55
Computer and Network Security by Avi Kak Lecture 8
use strict;
use Algorithm::BitVector;
use Graphics::GnuplotIF;
$|++;
my $debug = 1;
my $AES_modulus = Algorithm::BitVector->new(bitstring => 100011011); #(A1)
# When SBox is based on lookup, we will use the "table" created by randomly
# permuting the the number from 0 to 255:
#my $lookuptable = shuffle([0..255]); #(A7)
#my @lookuptable = @$lookuptable; #(A8)
my @lookuptable = qw(213 170 104 116 66 14 76 219 200 42 22 17 241 197 41 216 85 140
183 244 235 6 118 208 74 218 99 44 1 89 11 205 195 125 47 236 113
237 131 109 102 9 21 220 59 154 119 148 38 120 13 217 16 100 191 81
240 196 122 83 177 229 142 35 88 48 167 0 29 153 163 146 166 77 79
43 10 194 232 189 238 164 204 111 69 51 126 62 211 242 70 214 247 55
202 78 239 114 184 112 228 84 152 187 45 49 175 58 253 72 95 19 37
73 145 87 198 71 159 34 91 168 250 255 8 121 96 50 141 181 67 26 243
130 68 61 24 105 210 172 139 136 128 157 133 80 93 39 2 143 161 186 33
144 178 30 92 138 169 86 249 252 155 193 63 223 203 245 129 4 171
115 3 40 151 7 188 231 174 25 23 207 180 56 46 206 215 227 162 199
97 147 182 149 108 36 132 5 12 103 110 209 160 137 53 224 185 173
20 222 246 28 179 134 75 254 57 60 234 52 165 225 248 31 230 156
124 233 158 27 18 94 65 32 54 106 192 221 190 101 98 251 212 150
201 117 127 107 176 226 135 123 82 15 64 90); #(A9)
# The call shown below will show that part of the histogram for which both
# the input and the output differentials are in the range (32, 63).
display_portion_of_histogram(32, 64); #(A11)
56
Computer and Network Security by Avi Kak Lecture 8
# Fisher-Yates shuffle:
sub shuffle { #(E1)
my $arr_ref = shift; #(E2)
my $i = @$arr_ref; #(E3)
while ( $i-- ) { #(E4)
my $j = int rand( $i + 1 ); #(E5)
@$arr_ref[ $i, $j ] = @$arr_ref[ $j, $i ]; #(E6)
} #(E7)
return $arr_ref; #(E8)
}
# Displays in your terminal window the bin counts in the two-dimensional histogram
# for the input/output mapping of the differentials. You can control the portion of
# the 2D histogram that is output by using the first argument to set the lower bin
# index and the second argument the upper bin index along both dimensions.
# Therefore, what you see is always a square portion of the overall histogram.
sub display_portion_of_histogram { #(F1)
my $lower = shift; #(F2)
my $upper = shift; #(F3)
foreach my $i ($lower .. $upper - 1) { #(F4)
print "\n"; #(F5)
foreach my $j ($lower .. $upper - 1) { #(F6)
print "$differential_hist->[$i][$j] "; #(F7)
}
57
Computer and Network Security by Avi Kak Lecture 8
}
}
# Displays with a 3-dimensional plot a square portion of the histogram. Along both
# the X and the Y directions, the lower bound on the bin index is supplied by the
# SECOND argument and the upper bound by the THIRD argument. The last argument is
# needed only if you want to make a hardcopy of the plot. The last argument is set
# to the number of second the plot will be flashed in the terminal screen before it
# is dumped into a .png file.
sub plot_portion_of_histogram {
my $hist = shift; #(G1)
my $lower = shift; #(G2)
my $upper = shift; #(G3)
my $pause_time = shift; #(G4)
my @plot_points = (); #(G5)
my $bin_width = my $bin_height = 1.0; #(G6)
my ($x_min, $y_min, $x_max, $y_max) = ($lower, $lower, $upper, $upper); #(G7)
foreach my $y ($y_min..$y_max-1) { #(G8)
foreach my $x ($x_min..$x_max-1) { #(G9)
push @plot_points, [$x, $y, $hist->[$y][$x]]; #(G10)
}
}
@plot_points = sort {$a->[0] <=> $b->[0]} @plot_points; #(G11)
@plot_points = sort {$a->[1] <=> $b->[1] if $a->[0] == $b->[0]} @plot_points; #(G12)
my $temp_file = "__temp.dat"; #(G13)
open(OUTFILE , ">$temp_file") or die "Cannot open temporary file: $!"; #(G14)
my ($first, $oldfirst); #(G15)
$oldfirst = $plot_points[0]->[0]; #(G16)
foreach my $sample (@plot_points) { #(G17)
$first = $sample->[0]; #(G18)
if ($first == $oldfirst) { #(G19)
my @out_sample; #(G20)
$out_sample[0] = $sample->[0]; #(G21)
$out_sample[1] = $sample->[1]; #(G22)
$out_sample[2] = $sample->[2]; #(G23)
print OUTFILE "@out_sample\n"; #(G24)
} else { #(G25)
print OUTFILE "\n"; #(G26)
}
$oldfirst = $first; #(G27)
}
print OUTFILE "\n";
close OUTFILE;
my $argstring = <<"END"; #(G28)
set xrange [$x_min:$x_max]
set yrange [$y_min:$y_max]
set view 80,15
set hidden3d
splot "$temp_file" with lines
END
unless (defined $pause_time) { #(G29)
my $hardcopy_name = "output_histogram.png"; #(G30)
my $plot1 = Graphics::GnuplotIF->new(); #(G31)
$plot1->gnuplot_cmd( set terminal png, "set output \"$hardcopy_name\""); #(G32)
$plot1->gnuplot_cmd( $argstring ); #(G33)
58
Computer and Network Security by Avi Kak Lecture 8
If you run the script with the SBox as specified in line (B8),
you will end up with a display of numbers as shown below for the
portion of the differentials histogram that is bounded by bin index
values ranging from 32 to 63 in both directions. To understand
these values, lets look at the first nonzero entry in the first row,
which happens to be in the column indexed 40. Recognizing
that the first row corresponds to the bin index 32, that nonzero
count of 2 means that in all of the runs of the loop in lines (B1)
through (B11) of the script, there were 2 cases when the input
differential was X = 00100000 (integer value = 32) and the
output differential was Y = 00101000 (integer value = 40).
0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0
0 0 2 0 0 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 2 0 2 2 0 0 0
59
Computer and Network Security by Avi Kak Lecture 8
0 0 0 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 2 0
2 2 0 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2
2 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 0 2 0 2 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0
0 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0
0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 4 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0
0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 2
0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0
2 0 0 0 0 2 0 0 0 0 2 0 0 0 2 0 0 2 0 0 0 2 0 0 0 0 0 0 2 0 0 0
0 0 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 2 0 0
0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 0 0 0 2 0
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 0 2 4 0 0 0 0 0 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0
0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 2 0 0
0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0
2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
0 2 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0
0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0
0 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0
Note that some of the bin counts in the portion of the histogram
shown above are as large as 4. A 3D plot of this portion of the
histogram as shown in Figure 8.5 is meant to make it easier to see
how many such bins exist in the portion of the histogram plotted.
If you comment out line (B8) and uncomment line (B9) to change
to the second option for the SBox, the same histogram will look
60
Computer and Network Security by Avi Kak Lecture 8
like what is shown in Figure 8.6. Recall that the second option
consists of doing byte substitution through the lookup table in
line (A9).
As you can see in Figure 8.6, the second option for the SBox
generates a more non-uniform histogram for the input/output
differentials. Ideally, for any input differential X, you would
want the output differential Y to be distributed uniformly with
a probability of 1/2n for an n-bit block cipher. (This would
translate into a count of 1 in every bin except for the (0, 0) bin
for reasons explained in the small-font note at the end of this
bullet.) That way, the output differentials will give an attacker
no clues regarding either the plaintext or the encryption key.
61
Computer and Network Security by Avi Kak Lecture 8
such a histogram: (1) If we had shown the entire histogram, especially if the cell at (0, 0) was included in the
histogram display, you would see the largest peak located over the (0, 0) bin and the bin count associated with
this peak would be 256. This is a result of the fact that, in the double loop in lines (B1) through (B11) of the
script find differentials correlations.pl, as we scan through 256 different values for the first input byte,
and, for each input byte, as we scan through 256 values for the second input byte, there will be 256 cases when
the first and the second bytes are identical. Therefore, for these 256 cases, we will have X = 0, and also
Y = 0. This would give us a count of 256 over the (0, 0) bin. (2) Another feature that all such histograms
possess is that every non-zero bin count is even. This is on account of the fact that in the double loop in lines
(B1) through (B11) of the script, the same X occurs in multiples of 2 since X = Xi Xj = Xj Xi . (3)
The sum of all the counts in each row and each column must add up to 256. That is because, every differential
in the input must map one of 256 possible differentials at the output. (4) Therefore, for best such histograms
62
Computer and Network Security by Avi Kak Lecture 8
(that is, histograms with no biases in how the input and the output differentials are related), half the randomly
located bins in each row would contain a count of 2 (this would not apply to the bins in the topmost row or
the leftmost column). (5) For all the reasons stated here, the ideal of having a count of 1 in each bin of the
256 256 bins of the histogram is clearly not achievable even theoretically. ]
As the value of the variable $M in line (A2) of the script find differen
tials correlations.pl approaches 256, with the MI option for the
SBox in line (B8), you will see more and more bins showing the
best possible count of 2 in the histogram of Figure 8.5. On the
other hand, with the table lookup option in line (B9) for the
SBox, you will see the histogram in Figure 8.6 staying just as
non-uniform as it is now with the max peaks becoming some-
what larger.
table lookup is XORed with the round key. The round key is
shifted circularly by one bit position to the right for the next
round key. [For a more realistic simple example of a differential attack that involves
both an SBox and permutations in each round, the reader should look up the previously
mentioned tutorial A Tutorial on Linear and Differential Cryptanalysis by Howard
Heys. The block size in that tutorial is 4 bits.]
clean_db_files.pl
to clear out the DBM files in your directory. The script clean db files.pl
is included in the code you can download from the lectures notes
website.
64
Computer and Network Security by Avi Kak Lecture 8
65
Computer and Network Security by Avi Kak Lecture 8
#!/usr/bin/perl -w
## differential_attack_toy_example.pl
## We assume that our block size is one byte and the SBox consists of finding a
## substitute byte by table lookup. We further assume that each round consists of
## one byte substitution step followed by xoring the substituted byte with the
## round key. The round key is the encryption key that is circularly shifted to the
## right by one position for each round.
## Since you are likely to run this script repeatedly as you experiment with
## different strategies for estimating the subkey used in the last round, the script
## makes it easy to do so by writing the information that is likely to stay constant
## from one run to the next to disk-based DBM files. The script creates the
## following DBM files:
##
## worst_differentials.dir and worst_differentials.pag -- See Line (A14)
##
## These DBM files are created the very first time you run this script. Your
## subsequent runs of this script will be much faster since this DBM database
## would not need to be created again. Should there be a need to run the script
## starting from ground zero, you can clear the DBM files created in your directory
## by calling the script:
##
## clean_db_files.pl
##
## Finally, if you set the number of tries in Line (A10) to a large number and you
## are tired of waiting, you can kill the script at any time you wish. To see the
## vote counts accumulated up to that point for the different possible candidates
## for the last round key, just run the script:
##
## get_vote_counts.pl
##
## The scripts clean_db_files.pl and get_vote_counts.pl are in the gzipped archive
## that goes with Lecture 8 at the lecture notes web site.
use strict;
use Algorithm::BitVector;
$|++;
my $debug = 1;
my $AES_modulus = Algorithm::BitVector->new(bitstring => 100011011); #(A1)
my $number_of_rounds = 1; #(A2)
my $encryption_key = Algorithm::BitVector->new(bitstring => 10001011); #(A3)
my $differential_hist; #(A4)
my %decryption_inverses; #(A5)
my %worst_differentials; #(A6)
66
Computer and Network Security by Avi Kak Lecture 8
my @worst_input_differentials; #(A7)
my @worst_output_differentials; #(A8)
my $hist_threshold = 8; #(A9)
my $tries = 500; #(A10)
# This lookup table is used for the byte substituion step during encryption in the
# subroutine defined in lines (C1) through (C14). By experimenting with the script
# differentials_frequency_calculator.pl this lookup table was found to yield a good
# non-uniform histogram for the plaintext/ciphertext differentials.
my @lookuptable = qw(213 170 104 116 66 14 76 219 200 42 22 17 241 197 41 216 85 140
183 244 235 6 118 208 74 218 99 44 1 89 11 205 195 125 47 236 113
237 131 109 102 9 21 220 59 154 119 148 38 120 13 217 16 100 191 81
240 196 122 83 177 229 142 35 88 48 167 0 29 153 163 146 166 77 79
43 10 194 232 189 238 164 204 111 69 51 126 62 211 242 70 214 247 55
202 78 239 114 184 112 228 84 152 187 45 49 175 58 253 72 95 19 37
73 145 87 198 71 159 34 91 168 250 255 8 121 96 50 141 181 67 26 243
130 68 61 24 105 210 172 139 136 128 157 133 80 93 39 2 143 161 186 33
144 178 30 92 138 169 86 249 252 155 193 63 223 203 245 129 4 171
115 3 40 151 7 188 231 174 25 23 207 180 56 46 206 215 227 162 199
97 147 182 149 108 36 132 5 12 103 110 209 160 137 53 224 185 173
20 222 246 28 179 134 75 254 57 60 234 52 165 225 248 31 230 156
124 233 158 27 18 94 65 32 54 106 192 221 190 101 98 251 212 150
201 117 127 107 176 226 135 123 82 15 64 90); #(A13)
# In what follows, we first check if the worst_differentials DBM files were created
# previously by this script. If they are already on the disk, create the disk-based
# hash %worst_differentials_db from the data in those files. If not, create the DBM
# files so that they can subsequently be populated by the call in line (A18).
# [IMPORTANT: In a more realistic attack logic, you will need to create a more
# general version of the code in lines (A14) through (A21) so that you find the
# histogram for the plaintext/ciphertext differentials not for just one round, but
# for all the rounds involved. See the tutorial by Howard Heys for this important
# point.]
dbmopen my %worst_differentials_db, "worst_differentials", 0644
or die "Cant open DBM file: $!"; #(A14)
unless (keys %worst_differentials_db) { #(A15)
foreach my $i (0..255) { #(A16)
foreach my $j (0..255) { #(A17)
$differential_hist->[$i][$j] = 0; #(A18)
}
}
gen_differential_histogram(); #(A19)
# The call shown below will show that part of the histogram for which both
# the input and the output differentials are in the range (32, 63).
display_portion_of_histogram(32, 64) if $debug; #(A20)
# From the 2D input/output histogram for the differentials, now represent that
# information has a hash in which the keys are the plaintext differentials and
# the value associated with each key the ciphertext differential whose histogram
# count exceeds the supplied threshold:
find_most_probable_differentials($hist_threshold); #(A21)
67
Computer and Network Security by Avi Kak Lecture 8
}
%worst_differentials = %worst_differentials_db; #(A22)
die"no candidates for differentials: $!" if keys %worst_differentials == 0; #(A23)
@worst_input_differentials = sort {$a <=> $b} keys %worst_differentials; #(A24)
@worst_output_differentials = @worst_differentials{@worst_input_differentials}; #(A25)
if ($debug) {
print "\nworst input differentials: @worst_input_differentials\n"; #(A26)
print "\nworst output differentials: @worst_output_differentials\n"; #(A27)
}
# The following call makes a hash that does the opposite of what is achieved by
# indexing into the lookup table of line (A13). It fills the hash
# %decryption_inverses with <key,value> pairs, with the keys being the ciphertext
# bytes and the values being the corresponding plaintext bytes.
find_inverses_for_decryption(); #(A28)
estimate_last_round_key(); #(A29)
# Now print out the ten most voted for keys. To see the votes for all possible keys,
# execute the script get_vote_counts.pl separately after running this script.
print "no votes for any candidates for the last round key\n"
if keys %votes_for_keys == 0; #(A30)
if (scalar keys %votes_for_keys) { #(A31)
my @vote_sorted_keys =
sort {$votes_for_keys{$b} <=> $votes_for_keys{$a}} keys %votes_for_keys; #(A32)
print "\nDisplaying the keys with the largest number of votes: @vote_sorted_keys[0..9]\n";
#(A33)
}
68
Computer and Network Security by Avi Kak Lecture 8
# Since the SubBytes step in encryption involves taking the square of a byte in
# GF(2^8) based on AES modulus, for invSubBytes step for decryption will involve
# taking square-roots of the bytes in GF(2^8). This subroutine calculates these
# square-roots.
sub find_inverses_for_decryption { #(D1)
foreach my $i (0 .. @lookuptable - 1) {
$decryption_inverses{$lookuptable[$i]} = $i;
}
}
# This subroutine generates a 2D histogram in which one axis stands for the
# plaintext differentials and the other axis the ciphertext differentials. The
69
Computer and Network Security by Avi Kak Lecture 8
# count in each bin is the number of times that particular relationship is seen
# between the plaintext differentials and the ciphertext differentials.
sub gen_differential_histogram { #(G1)
foreach my $i (0 .. 255) { #(G2)
print "\ngen_differential_hist: i=$i\n" if $debug; #(G3)
foreach my $j (0 .. 255) { #(G4)
print ". " if $debug; #(G5)
my ($a, $b) = (Algorithm::BitVector->new(intVal => $i, size => 8),
Algorithm::BitVector->new(intVal => $j, size => 8)); #(G6)
my $input_differential = int($a ^ $b); #(G7)
my ($c, $d) = (get_sbox_output_lookup($a), get_sbox_output_lookup($b)); #(B9)
my $output_differential = int($c ^ $d); #(G9)
$differential_hist->[$input_differential][$output_differential]++; #(G10)
}
}
}
# Displays in your terminal window the bin counts in the two-dimensional histogram
# for the input/output mapping of the differentials. You can control the portion of
# the 2D histogram that is output by using the first argument to set the lower bin
# index and the second argument the upper bin index along both dimensions.
# Therefore, what you see is always a square portion of the overall histogram.
sub display_portion_of_histogram { #(J1)
my $lower = shift; #(J2)
my $upper = shift; #(J3)
foreach my $i ($lower .. $upper - 1) { #(J4)
print "\n"; #(J5)
foreach my $j ($lower .. $upper - 1) { #(J6)
print "$differential_hist->[$i][$j] "; #(J7)
}
}
}
When you run the script for just one round, that is, when you set
the value of the variable $number_of_rounds to 1 in line (A2),
you should get the following answer for the ten encryption keys
that received the largest number of notes (in decreasing order of
the number of votes received):
139 51 200 225 108 216 161 208 26 140
This answer is how youd expect it to be since the decimal 139
70
Computer and Network Security by Avi Kak Lecture 8
get_vote_counts.pl
If you run the attack script with the number of rounds set to 2
in line (A2), you should see the following answer for the ten keys
that received the largest number of votes:
82 180 214 20 72 44 109 105 52 174
This answer says that the most likely key used in the second round
is the integer 82, which translates into the binary 01010010. If
you examine the logic of encryption in lines (C1) through (C14)
especially if you focus on how the round key is set in line
(C11) the answer returned by the script is incorrect. However,
that is not surprising since our input/output histogram of the
differentials is based on just one round. As explained in the
previously mentioned tutorial by Howard Heys, we would need
to construct a more elaborate model of the differentials propagate
through multiple rounds for the script to do better in those cases.
71
Computer and Network Security by Avi Kak Lecture 8
below average. For an ideal SBox, such relationships would be true (and, therefore, false) with a
probability of 0.5 for all sets of bit positions at the input and the output. ]
X i1 X i2 . . . X im Y i1 Y i2 . . . Y in = 0 (5)
X i1 X i2 . . . X im = Y i1 Y i2 . . . Y in (6)
72
Computer and Network Security by Avi Kak Lecture 8
73
Computer and Network Security by Avi Kak Lecture 8
X0 X1 = Y 2 Y 3
What follows is a Perl script that can calculate LAT for two
different choices of the SBox, depending on which of the two lines,
(B4) or (B5), is left uncommented. The statement in line (B4)
gives you an SBox that replaces a byte with its MI in GF (28)
based on the AES modulus. On the other hand, the statement in
line (B5) gives an SBox that is based on the lookup table defined
74
Computer and Network Security by Avi Kak Lecture 8
#!/usr/bin/perl -w
## linear_approximation_table_generator.pl
## This script demonstrates how to generate the Linear Approximation Table that is
## needed for mounting a Linear Attack on a block cipher.
use strict;
use Algorithm::BitVector;
use Graphics::GnuplotIF;
$|++;
my $debug = 1;
my $AES_modulus = Algorithm::BitVector->new(bitstring => 100011011); #(A1)
# Initialize LAT:
my $linear_approximation_table; #(A2)
foreach my $i (0..255) { #(A3)
foreach my $j (0..255) { #(A4)
$linear_approximation_table->[$i][$j] = 0; #(A5)
}
}
# When SBox is based on lookup, we will use the "table" created by randomly
# permuting the the number from 0 to 255:
#my $lookuptable = shuffle([0..255]); #(A6)
#my @lookuptable = @$lookuptable; #(A7)
my @lookuptable = qw(213 170 104 116 66 14 76 219 200 42 22 17 241 197 41 216 85 140
183 244 235 6 118 208 74 218 99 44 1 89 11 205 195 125 47 236 113
237 131 109 102 9 21 220 59 154 119 148 38 120 13 217 16 100 191 81
240 196 122 83 177 229 142 35 88 48 167 0 29 153 163 146 166 77 79
43 10 194 232 189 238 164 204 111 69 51 126 62 211 242 70 214 247 55
202 78 239 114 184 112 228 84 152 187 45 49 175 58 253 72 95 19 37
73 145 87 198 71 159 34 91 168 250 255 8 121 96 50 141 181 67 26 243
130 68 61 24 105 210 172 139 136 128 157 133 80 93 39 2 143 161 186 33
144 178 30 92 138 169 86 249 252 155 193 63 223 203 245 129 4 171
115 3 40 151 7 188 231 174 25 23 207 180 56 46 206 215 227 162 199
97 147 182 149 108 36 132 5 12 103 110 209 160 137 53 224 185 173
20 222 246 28 179 134 75 254 57 60 234 52 165 225 248 31 230 156
124 233 158 27 18 94 65 32 54 106 192 221 190 101 98 251 212 150
201 117 127 107 176 226 135 123 82 15 64 90); #(A8)
gen_linear_approximation_table(); #(A9)
75
Computer and Network Security by Avi Kak Lecture 8
# The call shown below will show that part of the LAT for which both the input and
# the output bit grouping integers are in the range (0, 32):
display_portion_of_LAT(0, 32); #(A10)
# This call makes a graphical plot for a portion of the LAT. The bit grouping index
# ranges for both the input and the output bytes are 32 to 64:
plot_portion_of_LAT($linear_approximation_table, 32, 64); #(A11)
## The following call makes a hardcopy of the plot:
plot_portion_of_LAT($linear_approximation_table, 32, 64, 3); #(A12)
# You have two choices for the SBox in lines (B4) and (B5). The one is line (B4) is
# uses MI in GF(2^8) based on the AES modulus. And the one in line (B5) uses the
# lookup table defined above in line (A8). Comment out the one you do not want.
sub gen_linear_approximation_table {
foreach my $x (0 .. 255) { # specify a byte for the input to the SBox #(B1)
print "\input byte = $x\n" if $debug; #(B2)
my $a = Algorithm::BitVector->new(intVal => $x, size => 8); #(B3)
# Now get the output byte for the SBox:
my $c = get_sbox_output_MI($a); #(B4)
# my $c = get_sbox_output_lookup($a); #(B5)
my $y = int($c); #(B6)
foreach my $bit_group_from_x (0 .. 255) { #(B7)
my @input_bit_positions; #(B8)
foreach my $pos (0..7) { #(B9)
push @input_bit_positions, $pos if ($bit_group_from_x >> $pos) & 1; #(B10)
} #(B11)
my $input_linear_sum = 0; #(B12)
foreach my $pos (@input_bit_positions) { #(B13)
$input_linear_sum ^= (($x >> $pos) & 1); #(B14)
}
foreach my $bit_group_from_y (0 .. 255) { #(B15)
my @output_bit_positions; #(B16)
foreach my $pos (0..7) { #(B17)
push @output_bit_positions, $pos if ($bit_group_from_y >> $pos) & 1; #(B18)
}
my $output_linear_sum = 0; #(B19)
foreach my $pos (@output_bit_positions) { #(B20)
$output_linear_sum ^= (($y >> $pos) & 1); #(B21)
}
$linear_approximation_table->[$bit_group_from_x][$bit_group_from_y]++ #(B22)
if $input_linear_sum == $output_linear_sum; #(B23)
}
}
}
foreach my $i (0 .. 255) { #(B24)
foreach my $j (0 .. 255) { #(B25)
$linear_approximation_table->[$i][$j] -= 128; #(B26)
}
}
}
76
Computer and Network Security by Avi Kak Lecture 8
# Fisher-Yates shuffle:
sub shuffle { #(E1)
my $arr_ref = shift; #(E2)
my $i = @$arr_ref; #(E3)
while ( $i-- ) { #(E4)
my $j = int rand( $i + 1 ); #(E5)
@$arr_ref[ $i, $j ] = @$arr_ref[ $j, $i ]; #(E6)
} #(E7)
return $arr_ref; #(E8)
}
# Displays in your terminal window the bin counts (minus 128) in the LAT calculated
# in lines (B1) through (B26). You can control the portion of the display by using
# the first argument to set the lower bin index and the second argument the upper
# bin index along both dimensions. Therefore, what you see is always a square
# portion of the LAT.
sub display_portion_of_LAT { #(F1)
my $lower = shift; #(F2)
my $upper = shift; #(F3)
foreach my $i ($lower .. $upper - 1) { #(F4)
print "\n"; #(F5)
foreach my $j ($lower .. $upper - 1) { #(F6)
print "$linear_approximation_table->[$i][$j] "; #(F7)
}
}
}
# Displays with a 3-dimensional plot a square portion of the LAT. Along both the X
# and the Y directions, the lower bound on the bin index is supplied by the SECOND
# argument and the upper bound by the THIRD argument. The last argument is needed
# only if you want to make a hardcopy of the plot. The last argument is set to the
# number of second the plot will be flashed in the terminal screen before it is
# dumped into a .png file.
sub plot_portion_of_LAT { #(G1)
my $hist = shift; #(G2)
my $lower = shift; #(G3)
my $upper = shift; #(G4)
my $pause_time = shift; #(G5)
my @plot_points = (); #(G6)
my $bin_width = my $bin_height = 1.0; #(G7)
my ($x_min, $y_min, $x_max, $y_max) = ($lower, $lower, $upper, $upper); #(G8)
foreach my $y ($y_min..$y_max-1) { #(G9)
foreach my $x ($x_min..$x_max-1) { #(G10)
push @plot_points, [$x, $y, $hist->[$y][$x]]; #(G11)
}
}
77
Computer and Network Security by Avi Kak Lecture 8
For the case when you run script with the SBox based on MI
calculations in GF (28), shown below is a small portion of the LAT
constructed by the script. [The portion of the LAT shown below was dictated by the page
width constraints.] In keeping with the explanation provided earlier,
you can see that the topmost row and the leftmost column values
78
Computer and Network Security by Avi Kak Lecture 8
are all zero, as we expect them to be. The entries at the other
locations tell us how much positive and negative bias there exists
in the linear relationships corresponding to those cells. Looking
at the seventh entry (of column index 6) in the second row (of
row index 1), we can say that the relationship X1 X2 Y1 = 0 is
true with a probability of 12/256, and so on. Note that the full
table that is calculated by the Perl script is 256 256. Theory
dictates that the sum of the entries in each row or each column
must be either 128 or -128.
128 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 -6 8 -14 4 6 12 6 -2 12 -2 -4 -6 -8 2 -8 -12 -2 12 -2 -8 -14 0 -6 -2
0 8 12 4 -8 -8 -4 12 -12 -12 8 8 -8 0 12 -12 -14 -6 -6 -6 -6 10 10 2 2
0 -14 4 6 12 -2 -8 2 -2 12 -14 0 -2 -12 10 8 -2 8 -2 8 2 4 -6 4 -12
0 4 -8 12 -12 8 0 12 -6 -6 10 10 2 -6 -2 -2 2 -6 -2 14 14 6 -2 6 -8
0 6 -8 -2 8 6 -12 -14 -4 10 -12 -14 -8 -2 4 -6 -14 4 6 -8 -6 -4 2 4 -2
0 12 -4 -8 0 -12 8 -12 -6 -6 -2 -2 -6 10 -6 -14 12 -12 8 0 12 12 4 -4 14
0 6 12 2 12 -14 -12 10 8 -2 4 -6 -12 -6 4 10 12 -2 0 10 -8 2 8 -6 16
0 -2 -12 -2 -6 -4 -6 8 -8 -6 4 -6 -2 -12 -2 8 8 -10 4 14 -6 12 -14 16 -12
0 12 -12 12 -6 10 -6 -2 -6 -2 6 -2 -8 8 0 12 0 4 4 12 -14 2 2 -2 10
0 -2 8 -14 10 -12 -2 4 4 6 -4 2 -2 12 2 4 -2 12 2 4 0 -6 0 -2 2
0 -4 8 0 10 -14 -2 -6 -6 -2 2 -6 12 4 8 12 6 -14 2 10 0 0 -8 -4 -4
0 -6 -8 -2 2 -8 -6 -12 -2 -8 -2 12 16 14 16 -6 -2 -12 10 -12 0 2 -4 10 12
0 -8 0 -12 -6 -2 10 -6 -12 8 12 4 14 -2 -2 2 -6 14 -2 -2 -4 -12 0 4 -14
0 2 12 10 -2 4 -6 4 -2 0 2 8 16 -2 12 6 -12 10 8 10 10 4 -2 -4 14
0 -8 -12 8 -2 -6 -14 10 8 12 4 12 -6 2 6 -14 -8 12 -12 12 -2 -2 -6 -2 0
0 -12 -14 -2 2 -14 12 12 8 0 -2 6 -2 -6 -12 -8 12 -8 10 6 -2 14 12 -12 -12
0 -2 -6 8 -6 4 -12 -2 -10 4 12 -14 -12 14 10 12 -8 -2 -2 4 10 -4 8 10 -14
0 12 -6 -2 -2 6 8 0 4 4 2 2 10 -2 8 -12 10 -2 -4 -8 0 -8 10 2 2
0 -2 -6 8 14 -8 0 10 14 12 4 10 -12 -2 10 12 6 4 -8 -2 12 -10 14 0 12
0 -8 -6 2 14 -6 12 -8 -6 -14 0 0 0 -4 10 -2 -2 10 0 12 4 -4 2 10 12
0 -14 10 4 6 -4 12 2 12 2 -6 0 2 -12 4 -2 14 -4 -8 -10 -4 -2 10 4 -6
0 0 10 -6 -2 2 4 8 -14 2 0 -8 -4 0 -2 -6 12 8 10 14 2 10 4 -12 6
0 -6 2 4 6 4 -4 -6 16 -2 -2 -4 10 4 -4 -2 -12 10 2 0 10 4 -12 14 8
0 -2 2 -12 -8 -2 14 16 -12 10 2 -4 12 -14 14 0 -12 -14 2 12 12 -6 6 8 -4
0 -12 -6 -6 -12 8 -6 -6 6 14 12 0 -6 -14 4 8 12 8 -6 2 8 4 -6 -14 10
0 6 -6 -12 4 2 -14 -4 -8 -2 -2 -8 8 -2 -6 -4 -10 -4 0 -6 2 -8 -8 10 6
0 -12 -6 -6 8 12 -2 -2 14 -2 -12 0 -14 2 4 0 6 -6 -8 16 6 -6 -4 4 0
79
Computer and Network Security by Avi Kak Lecture 8
If you comment out line (B3) and uncomment line (B4) so that
the SBox would be based on the lookup table in line (A8), the
portion of the plot shown in Figure 8.7 becomes what is shown
in Figure 8.8. Note that the largest peaks in the LAT of Figure
8.8 are larger than the largest peaks in the LAT of Figure 8.7.
That implies that the SBox based on the lookup table of line (A8)
results in larger biases for some of the linear equations compared
to the SBox that is based on MI in GF (28).
80
Computer and Network Security by Avi Kak Lecture 8
the type shown earlier across the rounds but not including the
last round and estimate the probabilistic biases associated with
the linear forms (these can also be affine forms). (2) Considering
different possible candidate keys for the last round, you partially
decrypt the ciphertext. For each candidate key for the last round,
this gives you candidate output bits for the last-round Sbox. (3)
Using these candidate output bits for the last round, you accu-
mulate votes for the different candidates for the last-round key
depending on the extent to which candidate SBox output bits for
the last round are consistent with the linear forms constructed
from the first n 1 rounds.
82
Computer and Network Security by Avi Kak Lecture 8
where we have used the fact that the output of each SBox, after
the addition of the round key, becomes the input to the next
round. That is, Yr,i Kr,i becomes Xr+1,i. The above linear
form may be expressed in the following form:
where K is the linear form that involves only the key bits from
the first n 1 rounds.
83
Computer and Network Security by Avi Kak Lecture 8
84
Computer and Network Security by Avi Kak Lecture 8
2. What are the steps that go into the construction of the 1616 S-
box lookup table?
3. What is rationale for the bit scrambling step that is used for
finding the replacement byte that goes into each cell of the S-box
table?
4. The second step in each round permutes the bytes in each row of
the state array. What is the permutation formula that is used?
85
Computer and Network Security by Avi Kak Lecture 8
6. Lets now talk about the Key Expansion Algorithm of AES. This
algorithm starts with how many words of the key matrix and
expands that into how many words?
7. Lets say the first four words of the key schedule are w0, w1, w2 , w3.
How do we now obtain the next four words w4 , w5, w6 , w7?
9. Programming Assignment:
Note that the variable modulus is set to the BitVector that corre-
sponds to the AES irreducible polynomial. The variable a can be
set to any arbitrary BitVector whose multiplicative inverse you
are interested in.
The other BitVector method that should prove particularly useful
for this homework is:
gf_multiply_modular
This method lets you multiply two bit patterns in GF (2n). To
multiply two bit patterns a and b, both instances of the BitVector
87
Computer and Network Security by Avi Kak Lecture 8
88