Lecture Notes
Dr Konstantin Blyuss
1 Introduction to Cryptography
Cryptography has a long and fascinating history, starting with the Egyptians some 4000 years ago,
all the way up into the twentieth century where it played a crucial role in the outcome of both world
wars. The predominant practitioners of the art have been those associated with the military, the
diplomatic service and government in general. Cryptography was used as a tool to protect national
secrets and strategies.
Cryptography (from Greek: κρυπτός, "hidden, secret"; and γράφειν, "writing") literally means
encryption ('secret writing'), but modern cryptography involves much more than just encrypting
written messages. In fact, it is used in many aspects of our everyday lives, from credit cards and
TV remotes to email and the internet.
Bearing in mind a wide range of possible application areas, there are several possible definitions
of cryptography. A useful and insightful definition of Cryptography can be found in the Handbook
of Applied Cryptography:
"Cryptography is the study of mathematical techniques related to aspects of information security
such as confidentiality, data integrity, entity authentication, and data origin authentication."
On the one hand, this definition means that cryptography is a part of mathematics (in particular,
algebra and number theory). On the other hand, it highlights the importance of information
security. Over the centuries, security of information was mainly achieved through physical or legal
means, but with the majority of information now being transmitted and processed electronically,
a new set of tools and methods is required. It is important to remember that in practice, besides
cryptography, there are many other aspects that have to be taken into account, such as procedural
measures, abidance by laws, good practical implementation, physical security mechanisms etc.
Let us now consider the relevant aspects of information security that pertain to both paper and
electronic information security.
1. Confidentiality is responsible for ensuring that the content of information is kept secret from
anyone who is not authorized to have access to it. There are two ways in which confidentiality can be
achieved. The first one is a physical protection mechanism. In this case, the message is physically
protected from being intercepted, an example being courier mail. Another way of achieving confi-
dentiality is by using cryptographic encryption mechanisms. In this case, even if an unauthorized
party intercepts a message, they would be unable to decrypt it, and therefore the message would
be secure.
2. Data integrity addresses the issue of unauthorized alteration or manipulation of data. In particu-
lar, it concerns preventing and identifying instances of data insertion, deletion and substitution.
3. Authentication is related to identification, and applies both to communicating parties and to
the information itself. Entity authentication means corroborating the identity of a party, for example a user
seeking access to a computer network; in this case the server should be able to authenticate the user
prior to granting them access to the internet. Common techniques for proving entity authentication
are PINs, passwords etc.
Another aspect of authentication is that it should be possible to authenticate the information
being transmitted in terms of its origin, date of origin, data content, time sent etc. Data origin
authentication applies to data that is sent over a communication channel. In this case, the receiver
needs to authenticate that the data has actually come from the sender. In order to guarantee data
origin authentication, one must also have data integrity, as if the information has actually come
from the sender, then it cannot have been altered in transit. At the same time, data integrity does
not imply data origin authentication.
4. Non-repudiation prevents an entity from denying previous commitments or actions, for example
denying having sent a particular message.
In order to illustrate how these different aspects on information security come into play, we
consider the following example.
Example. Alice wants to buy something over the internet from Bob. Alice sends her credit card
number and payment details over the internet (an unsecured channel) to Bob.
Alice is the sender, Bob is the receiver, and Eve is the attacker or adversary. When discussing
the exchange of information between two or more parties, we use the following concepts.
Channel: a means of conveying information from one party to another. Channels can refer both
to physical and electronic means of information exchange.
Unsecured channel: a channel on which an attacker can read, re-order, delete or insert data.
Secured channel: a channel on which an attacker cannot read, re-order, delete or insert data.
A secured channel is physically or cryptographically protected.
Let us consider how different aspects of information security can be interpreted within this
example. First of all, consider confidentiality: Alice requires her instruction to be confidential,
so that no other person would know her credit card details or what she is buying. Both Alice
and Bob require that the integrity of the message is preserved, so that the amount of transaction
would not be altered in transit. Bob requires data origin authentication, so that he would know
the instruction has actually come from Alice, and not from an impostor. Bob is also interested in
non-repudiation, so that Alice would not be able to deny that she had sent an instruction to Bob
and therefore refuse payment.
Cryptography provides mathematical techniques to address the four objectives of information
security. These mathematical techniques or cryptographic tools are sometimes called cryptographic
primitives. We will be primarily concerned with the following examples of cryptographic primitives:
Signature algorithms (e.g. DSA, RSA, ElGamal) can provide data origin authentication, data
integrity and non-repudiation.
Hash functions (e.g. SHA-1, SHA-256, MD5) can provide data integrity.
Message Authentication Codes (MACs) can provide data integrity and data origin authenti-
cation.
2 Classical cryptography
Up until about forty years ago, most information was stored and transmitted on paper. However,
nowadays most information is stored and transmitted electronically. The change from paper to
electronic data has had a profound effect on cryptography. Before the digital age, the function
of cryptography was simply to provide data confidentiality using encryption algorithms. Data
integrity was achieved by physically safeguarding paper documents. It is noteworthy that it is
much harder to alter a paper document than an electronic file. Similarly, authentication and non-
repudiation were usually provided by hand-written signatures, not by cryptography. Since it is
much easier to copy and alter digital data than data that is written on paper, new techniques
were required for protecting digital data. Therefore, cryptographic techniques for providing data
integrity, authentication and non-repudiation are comparatively recent. This means that all historic
examples of cryptography are encryption algorithms.
Lysander of Sparta (404 BC) used the 'Skytale', a simple encryption device, in a war against
the Athenians. Indirect evidence from Archilochus (≈ 7th century BC) suggests it existed even earlier,
but the first clear description comes from Plutarch (50-120 AD). This is the first recorded military use
of a cryptographic device.
The Skytale is an example of a simple transposition cipher or permutation cipher. In such ciphers,
the symbols stay the same and just get permuted.
A Transposition cipher splits up the plaintext message M into blocks of fixed length t called
the period or order of the cipher. There is a permutation P of the t positions, which is called the
'encryption key'. There are t! possible keys. Decryption is the inverse permutation P⁻¹. For the
example of the Skytale, the period of the cipher is determined by the diameter of the cylinder.
M = UNIVE|RSITY| OF S|USSEX
We work on each 5-letter block independently and transpose the characters according to the rules
specified in P . Then the ciphertext C is
C=VEUIN|TYRIS| S FO|EXUSS
Decryption is given by the inverse permutation
P⁻¹ = (1 4 3)(2 5).
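To make the mechanics concrete, here is a minimal Python sketch of this period-5 transposition cipher (the helper names are ours, not part of the notes); the list PERM records, for each ciphertext position, which plaintext position it takes its character from:

# Period-5 transposition: in 1-indexed cycle notation this is P = (134)(25),
# whose inverse is P^-1 = (143)(25), as in the example above.
PERM = [3, 4, 0, 2, 1]   # 0-indexed: ciphertext position i <- plaintext position PERM[i]

def encrypt_block(block):
    return "".join(block[p] for p in PERM)

def decrypt_block(block):
    out = [""] * len(PERM)
    for i, p in enumerate(PERM):
        out[p] = block[i]          # invert the permutation
    return "".join(out)

message = "UNIVERSITY OF SUSSEX"
blocks = [message[i:i+5] for i in range(0, len(message), 5)]
print("|".join(encrypt_block(b) for b in blocks))   # VEUIN|TYRIS| S FO|EXUSS
print("".join(decrypt_block(encrypt_block(b)) for b in blocks))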
Since the symbols in the transposition ciphers remain the same and only their order changes,
transposition ciphers are easy to recognize and cryptanalyse. In particular, if an attacker has some
plaintext corresponding to ciphertext, it is very easy to find the permutation. This is called a
known-plaintext attack.
Caesar cipher (believed to be used by Julius Caesar) works by shifting the alphabet by a
certain number of places, e.g. shifting by 3 places gives this table
ABCDEFGHIJKLMNOPQRSTUVWXYZ
DEFGHIJKLMNOPQRSTUVWXYZABC
Then we substitute using the table. For example, the plaintext DOG gives the ciphertext GRJ.
Another way to think of this cipher is to assign a number to each letter of the alphabet
(A = 0, B = 1, ..., Z = 25). Then a shift of 3 places corresponds to 'adding' D to every letter. Here, D is the 'key' - this is the
shift that tells both parties how the alphabet has to be modified for encryption/decryption. To
encrypt BOY using D=3 as the key, we proceed as follows:
B + D = 1 + 3 = 4 = E
O + D = 14 + 3 = 17 = R
Y + D = 24 + 3 = 27 ≡ 1 (mod 26) = B
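As a small illustration (a sketch, not part of the original notes), the Caesar cipher with the A = 0, ..., Z = 25 numbering can be written in Python as follows:

def caesar(text, key, decrypt=False):
    # 'Add' the key letter to every letter, working modulo 26.
    shift = -key if decrypt else key
    return "".join(chr((ord(ch) - ord('A') + shift) % 26 + ord('A')) for ch in text)

print(caesar("BOY", 3))                            # ERB (B->E, O->R, Y->B)
print(caesar(caesar("BOY", 3), 3, decrypt=True))   # BOY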
Caesar cipher is an example of a simple substitution cipher or a mono-alphabetic substi-
tution cipher. The key for a mono-alphabetic substitution cipher is a permutation of the letters
of the alphabet. So, there are 26! keys (≈ 4 · 10^26) using the English alphabet (without a 'space'
character). For an alphabet with q symbols, there will be q! keys. With a mono-alphabetic cipher,
each letter of the alphabet always gets encrypted in the same way. This means that this type of
cipher is very easy to cryptanalyse using frequency analysis, and hence an attacker will be able to
find the permutation key by analysing quite a small amount of ciphertext.
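A frequency analysis attack can be sketched in a few lines of Python (the ciphertext below is a made-up placeholder): count the letter frequencies in the ciphertext and match the most common symbols against the most common English letters (E, T, A, ...).

from collections import Counter

ciphertext = "GRYYB JBEYQ GRYYB"   # hypothetical mono-alphabetic ciphertext
freq = Counter(ch for ch in ciphertext if ch.isalpha())
print(freq.most_common(3))         # candidates for E, T, A, ... in the plaintext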
A more sophisticated type of substitution cipher is the so-called Vigenère cipher. It was
originally described by Giovan Battista Bellaso in 1553, but was later misattributed to Blaise de
Vigenère in the 19th century. Historically, it was known as the chiffre indéchiffrable (undecipherable
cipher) for centuries, until it was cracked by Charles Babbage in 1854 and, in a more general form, by
Kasiski in 1863. The Vigenère cipher is actually a generalization of the Caesar cipher. The key of the
Vigenère cipher is a repeated block of letters of length t (for Caesar cipher, t = 1). The cipher works
by splitting the plaintext into blocks of length t and then adding the key to each block individually.
For example, let us choose CBK for the key (so the period is 3). Then we encrypt the word
AUTOMATIC as CVDQNKVJM: writing the key repeatedly under the plaintext (CBKCBKCBK) and adding
the letters pairwise modulo 26 gives A+C=C, U+B=V, T+K=D, O+C=Q, M+B=N, A+K=K, T+C=V, I+B=J, C+K=M.
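A minimal Python sketch of Vigenère encryption (the function name is ours) reproduces this example:

def vigenere_encrypt(plaintext, key):
    out = []
    for i, ch in enumerate(plaintext):
        k = ord(key[i % len(key)]) - ord('A')   # key letter for this position
        out.append(chr((ord(ch) - ord('A') + k) % 26 + ord('A')))
    return "".join(out)

print(vigenere_encrypt("AUTOMATIC", "CBK"))     # CVDQNKVJM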
Despite the fact that the Vigenère cipher was not broken for 300 years, it is actually quite easy
to cryptanalyse and recover the key that has been used. The first step is to determine the period
t, which can be done by trial and error. Once the block length is known, the letters in the ciphertext
can be divided into t groups, and then frequency analysis can be performed on each group.
An improvement on a Vigenère cipher is a one-time pad, invented by Gilbert Vernam in 1917.
It provides perfect secrecy as it is theoretically unbreakable. For this reason, it has been used by
military, diplomats and politicians across the globe for much of the 20th century. The one-time
pad itself is a random sequence of letters (or bits). The main aspect of this approach is that the
pad is the same length as the plaintext.
Because the one-time pad is random, it is impossible to recover the plaintext from the ciphertext,
as every plaintext is equally likely. Using bits instead of letters, the one-time pad looks like this:
pi ⊕ ki = ci (encryption), ci ⊕ ki = pi (decryption),
where ⊕ denotes addition modulo 2 (XOR):
0 ⊕ 0 = 0, 1 ⊕ 0 = 1, 0 ⊕ 1 = 1, 1 ⊕ 1 = 0.
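In Python, a one-time pad over bytes is just a bitwise XOR with a random pad of the same length (a sketch; secrets.token_bytes is Python's cryptographically strong random source):

import secrets

message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))              # truly random, same length as message
cipher = bytes(p ^ k for p, k in zip(message, pad))  # c_i = p_i XOR k_i
recovered = bytes(c ^ k for c, k in zip(cipher, pad))
assert recovered == message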
The one-time pad offers unconditional security or perfect secrecy since observation of the cipher-
text provides no information whatsoever about the plaintext or the one-time pad to the adversary.
The question is then: why don’t we always use a one-time pad?
Despite its theoretical superiority, there are several problems with the practical implementation of
one-time pads. First of all, each pad has to be as long as the plaintext message it encrypts. The
sender should also be able to somehow physically transport the pad to the
receiver. Another implementation issue is how to generate a truly random pad. Finally, to ensure
unbreakability every pad should be used only once, and it should never be used to encrypt two
different plaintext messages.
3 Symmetric-key encryption
In the examples considered above, in each case Alice uses a key to encrypt her plaintext messages.
So far, we have seen the following examples of keys: a permutation of the positions of the plaintext
characters (transposition ciphers), a permutation of characters in the alphabet (simple substitution
ciphers), one-time pads.
In all of the encryption algorithms that we shall consider from now on, a key will simply be a
very large integer number. Since encryption algorithms are implemented on computers, we usually
think of a key as a long string of bits. The idea is that the key is so large that it is not feasible
for an attacker to guess what this key is. The range of all possible values of the key is called a
keyspace.
In order to encrypt the plaintext message M , Alice uses a key K. The result is the ciphertext
message C. We write
EK (M ) = C.
For decryption of the ciphertext C, Bob uses the same key K as follows:
DK (C) = M.
This is called symmetric-key encryption or single key encryption. The main feature of
this scheme is that the same key is used both for encryption, and for decryption. This is different to
asymmetric-key encryption (e.g. RSA), which uses different keys for encryption and decryption.
The security of symmetric-key encryption relies on the so-called Kerckhoffs' assumption:
the security of an encryption algorithm is provided by keeping the key secret, not by keeping the
algorithm secret. In fact, the majority of encryption algorithms are publicly available, so everyone
knows the details of how those algorithms work. There are some secret algorithms, but one cannot
rely on keeping the algorithm secret to provide security.
There are two types of symmetric-key encryption algorithms: stream ciphers (RC4, SEAL,
etc.) and block ciphers (DES, Triple-DES, AES, etc.).
4 Stream ciphers
Stream ciphers operate on the plaintext a single bit (or byte, or symbol) at a time. We shall
consider binary stream ciphers in this section (i.e. they work on bits of plaintext), but the same
general principles apply to non-binary case. The plaintext is combined with a keystream to
produce the ciphertext. Motivation for stream ciphers is a one-time pad we discussed earlier.
With a stream cipher, the sender and receiver generate the same keystream of ’random bits’ ki .
We no longer have perfect security, since the keystream is generated from a short key. The two
related observations are: the keystream is no longer truly statistically random, and the entropy
(randomness) of the keystream cannot exceed the entropy of the short key.
There are two types of stream ciphers: synchronous stream ciphers and self-synchronizing
(or asynchronous) stream ciphers. In the case of a synchronous stream cipher, the keystream
is generated as a function of the key only, completely independently of the plaintext and the
ciphertext. In contrast, for the self-synchronizing stream cipher, the keystream is generated as
a function of the key and a fixed number t of the previous ciphertext bits.
The synchronous stream cipher works as follows
Encrypt: pi ⊕ ki = ci .
Decrypt: ci ⊕ ki = pi ⊕ ki ⊕ ki = pi .
Properties of a synchronous stream cipher:
• Sender and receiver must be synchronized. To give a proper decryption, the keystream generators
for the sender and receiver must be in the same state. If some part of ciphertext is lost or inserted
during the transmission, then the decryption will fail.
• There is no error propagation, i.e. even if a single bit is corrupted, this has no bearing on the
correct decryption of subsequent bits.
• Such ciphers are vulnerable to ’bit-toggling’ or ’bit-flipping’. An attacker can change ciphertext
bits and, thereby, alter the encrypted plaintext.
Properties of a self-synchronizing stream cipher:
• Limited error propagation. Since the state of the keystream generator depends on t preceding
ciphertext bits, if a single ciphertext bit is modified in transmission, then at most t bits of decrypted
data will be lost.
For any stream cipher, it is essential that the keystream never gets re-used, i.e. keystream
should only be used to encrypt one set of plaintext. Alternatively, one has to change the key
before the keystream starts to repeat itself (if keystreams are periodic). The reason for this is the
following: if the keystream repeats itself, one has ki+h = kj+h for all h ≥ 0. By XOR-ing the
corresponding ciphertext, we find
ci+h ⊕ cj+h = pi+h ⊕ ki+h ⊕ pj+h ⊕ kj+h = pi+h ⊕ pj+h .
Therefore, by XOR-ing two sections of ciphertext, we obtain the XOR of two sections of plaintext.
From this it is then possible to recover both sections of plaintext. This is one reason why keystream
generators are required to have a long period. If a generator has a short period, then one has to
update the key more frequently.
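The danger of keystream re-use can be demonstrated directly (a sketch with made-up messages): XOR-ing two ciphertexts produced with the same keystream cancels the key and leaves the XOR of the plaintexts.

import secrets

p1 = b"TRANSFER 100 POUNDS"
p2 = b"TRANSFER 999 POUNDS"
k = secrets.token_bytes(len(p1))                # the SAME keystream used twice
c1 = bytes(a ^ b for a, b in zip(p1, k))
c2 = bytes(a ^ b for a, b in zip(p2, k))
leak = bytes(a ^ b for a, b in zip(c1, c2))     # equals p1 XOR p2: the key has dropped out
assert leak == bytes(a ^ b for a, b in zip(p1, p2))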
5 Linear feedback shift registers
For symmetric-key encryption, keystream generators produce a sequence of bits that appears ’ran-
dom’. The actual entropy of the keystream comes from a short key. A basic building block of
many keystream generators is the so-called Linear Feedback Shift Register (LFSR). Linear
Feedback Shift Registers are useful and convenient for the following reasons: they are well-suited
to hardware implementation, they produce sequences of large period with good statistical properties,
and they are readily analysed algebraically.
An LFSR of length L consists of:
1) L registers (stages) RL−1 , ..., R1 , R0 , each holding one bit;
2) a clock which controls the movement of data;
3) the initial state [sL−1 , ..., s1 , s0 ], where si ∈ {0, 1} denotes the bit in Ri .
At each clock tick:
• the bit in R0 is output and forms part of the LFSR's output sequence;
• the bit in Ri is moved to Ri−1 for each i, 1 ≤ i ≤ L − 1;
• the new content of RL−1 is formed by adding modulo 2 the contents of a fixed subset of the L registers.
The subset of registers that gives the new content of RL−1 is determined by L feedback taps
c1 , c2 , ..., cL . Here, ci ∈ {0, 1} and cL = 1. For j ≥ L, we have
sj = (c1 sj−1 + c2 sj−2 + ... + cL sj−L ) mod 2.
Therefore,
sL = (c1 sL−1 + c2 sL−2 + ... + cL s0 ) mod 2
sL+1 = (c1 sL + c2 sL−1 + ... + cL s1 ) mod 2
...
A particular register Ri is included in the sum if and only if cL−i = 1.
The output sequence of the LFSR s = s0 , s1 , s2 , s3 , ... is uniquely determined by the above
recursion. Feedback taps may be summarised by the characteristic polynomial c(x) of the
LFSR:
c(x) = cL x^L + cL−1 x^(L−1) + ... + c1 x + 1.
One should note that sometimes the characteristic polynomial is written in reverse order. If the charac-
teristic polynomial is a 'primitive' polynomial, then the output of the LFSR will have the maximum
period:
Maximum period of LFSR = 2^L − 1.
For now, we do not worry about the definition of the ’primitive polynomial’. We call this a
maximum - length LFSR. The output of a maximum-length LFSR is called an m-sequence.
We will denote an LFSR by <L, c(x)>, where L is the length and c(x) is the characteristic
polynomial.
Example. Consider the LFSR given by <4, x^4 + x + 1>.
Here, x^4 + x + 1 is a primitive polynomial, which means that for any non-zero initial state, the
LFSR will generate an m-sequence of period 2^4 − 1 = 15.
N.B. The all-zero initial state will produce only zeroes, for any LFSR.
From c(x) = x^4 + x + 1 we read off c1 = 1, c2 = c3 = 0, c4 = 1, so the recursion is sj = sj−1 ⊕ sj−4 .
Taking the initial state [s3 , s2 , s1 , s0 ] = [1, 1, 1, 0], we find
s4 = s3 ⊕ s0 = 1 ⊕ 0 = 1;
s5 = s4 ⊕ s1 = 1 ⊕ 1 = 0;
s6 = s5 ⊕ s2 = 0 ⊕ 1 = 1;
s7 = s6 ⊕ s3 = 1 ⊕ 1 = 0.
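The recursion is easy to check with a short Python sketch (the function below is our own helper, not from the notes):

def lfsr_output(taps, state, n):
    # taps = [c1, ..., cL]; state = [s_{L-1}, ..., s1, s0]; returns s0, ..., s_{n-1}.
    L = len(taps)
    s = list(reversed(state))                    # s[0] = s0, s[1] = s1, ...
    for j in range(L, n):
        s.append(sum(taps[i - 1] * s[j - i] for i in range(1, L + 1)) % 2)
    return s[:n]

# <4, x^4 + x + 1> with initial state [1, 1, 1, 0]: one full period of the m-sequence.
print(lfsr_output([1, 0, 0, 1], [1, 1, 1, 0], 15))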
LFSRs are never used by themselves as keystream generators for stream ciphers as their output
is too linear and therefore predictable. For this reason, LFSRs are insecure against a known-
plaintext attack: if an attacker knows 2L consecutive bits of plaintext and the corresponding ciphertext, then
the attacker can easily find all the feedback taps c1 , c2 , ..., cL (by solving a system of linear
equations, as in the example below). Once this is done, the attacker can
generate the keystream.
At the same time, LFSRs are frequently used as components of keystream generators, because
they are very efficient and have low implementation costs. Usually, the outputs of several LFSRs
are combined using a nonlinear combining function f .
Example. Find an LFSR of minimal length that produces the output sequence (1011100)∞.
Solution. (1011100)∞ is an infinite sequence of bits, which is periodic with period 7. First of all,
we have to find how many registers there should be in the LFSR. We know that the maximum
period of an LFSR of size L is 2^L − 1. Since we have a period of 7 here, this means that it is
sufficient to have 3 registers to produce the desired sequence of bits, as 2^3 − 1 = 7.
Now that we know our LFSR has 3 registers, we have to identify the coefficients c1 , c2 and c3
that control which of the 3 registers take part in producing the output. To find these coefficients,
we use the first three outputs of the LFSR as given in the sequence. From the sequence (1011100)∞
it follows that the initial sequence for the LFSR is [s2 , s1 , s0 ] = [1, 0, 1]. The first three outputs are:
1, 1 and 0. Substituting the initial state into the LFSR recursion gives
s3 = c1 s2 ⊕ c2 s1 ⊕ c3 s0 = c1 · 1 ⊕ c2 · 0 ⊕ c3 · 1 = c1 ⊕ c3 = 1,
s4 = c1 s3 ⊕ c2 s2 ⊕ c3 s1 = c1 · 1 ⊕ c2 · 1 ⊕ c3 · 0 = c1 ⊕ c2 = 1,
s5 = c1 s4 ⊕ c2 s3 ⊕ c3 s2 = c1 · 1 ⊕ c2 · 1 ⊕ c3 · 1 = c1 ⊕ c2 ⊕ c3 = 0.
Substituting c1 ⊕ c2 = 1 into the third equation gives c1 ⊕ c2 ⊕ c3 = 1 ⊕ c3 = 0 =⇒ c3 = 1.
From the first equation, c1 ⊕ 1 = 1 =⇒ c1 = 0.
From the second equation, 0 ⊕ c2 = 1 =⇒ c2 = 1.
Hence c(x) = c3 x^3 + c2 x^2 + c1 x + 1 = x^3 + x^2 + 1, and the required LFSR is <3, x^3 + x^2 + 1>.
Example. Consider the linear feedback shift register <3, x^3 + x + 1> with the initial sequence
100. What is the maximum period of this LFSR? Find the output of this LFSR.
Solution. Since the LFSR has length L = 3, the maximum period of the sequence it can
produce is 2^L − 1 = 2^3 − 1 = 7. Therefore, it is sufficient to calculate at most the first seven bits
of output of the LFSR in order to have the complete information about LFSR output.
By looking at the characteristic polynomial c(x) = x^3 + x + 1, one can identify the coefficients c1 = 1,
c2 = 0, c3 = 1. Substituting the initial sequence [s2 , s1 , s0 ] = [0, 0, 1], we find
s3 = s2 ⊕ s0 = 0 ⊕ 1 = 1,
s4 = s3 ⊕ s1 = 1 ⊕ 0 = 1,
s5 = s4 ⊕ s2 = 1 ⊕ 0 = 1,
s6 = s5 ⊕ s3 = 1 ⊕ 1 = 0.
Hence, we have the output sequence 1001110, which then repeats (since the LFSR has period 7).
7 Block ciphers
It has already been noted that symmetric-key algorithms rely on the same key to be used both for
encryption and for decryption. So far, we have considered stream ciphers that operate on plaintext
one bit at a time. The requirement for stream ciphers is that the same keystream should be used
in order to encrypt and decrypt data. To achieve this, the two keystreams have to be generated
from the same key.
Next, we look at block ciphers, in which the plaintext is encrypted in blocks of fixed length
n bits. This gives a ciphertext block, which is also of length n bits. n is called the block length or
block size of the algorithm, which itself is then called an n-bit algorithm. The majority of existing
block ciphers have a block length of either 64 or 128 bits.
Most well-known symmetric-key encryption algorithms are block ciphers, the most popular
being DES (also called DEA when referring to the actual algorithm rather than a standard), its
more secure modification Triple-DES (also called TDEA), and AES. Similar to stream ciphers, the
security of block ciphers derives from the key, and not the algorithm itself. To this end, it is usually
assumed that the adversary will know the algorithm being used, and their goal is to discover the
key.
8 Attacks on encryption schemes
Obviously, there are many different kinds of attacks that can be mounted to break various encryp-
tion schemes. They can be classified into several major groups, most notable being
Ciphertext-only attack — the attacker tries to deduce the key by only observing the ciphertext;
Known-plaintext attack — the attacker has a quantity of plaintext and the corresponding ci-
phertext;
Chosen-plaintext attack — the attacker chooses plaintext and is given the corresponding ci-
phertext.
These attacks are listed in order of difficulty, with a ciphertext-only attack being the hardest
for an attacker to perform. Cryptographers always try to devise algorithms that are resistant to
chosen-plaintext attacks. By definition, algorithms that are resistant to chosen-plaintext attacks
are also resistant to known-plaintext and ciphertext-only attacks. There are other, more theoret-
ical types of attacks, for example, a chosen-ciphertext attack, where the attacker chooses the
ciphertext and is given the corresponding plaintext.
Exhaustive key search is a known-plaintext attack. In this type of attack on the encryption
algorithm, the attacker has a small number of plaintext-ciphertext pairs encrypted using the key
K. For a block cipher with a k-bit key K, K can be recovered by exhaustive search in an expected
time of 2^(k−1) encryption operations. The way this type of attack operates is as follows:
Encrypt a plaintext block (or decrypt a ciphertext block) using all keys in the keyspace;
Discard keys that do not give the correct ciphertext (or plaintext) block;
The correct key will (on average) be found after checking half the keyspace (Handbook, Fact 7.26).
If the key length is greater than the block length, then it might be necessary to check that the correct
key has been found using additional plaintext/ciphertext pairs. For a k-bit key and an n-bit
algorithm, we require
⌈(k + 4)/n⌉
plaintext/ciphertext pairs, where ⌈·⌉ denotes the ceiling function, i.e. the smallest integer greater
than or equal to the argument.
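A toy Python sketch of exhaustive key search (the 16-bit 'cipher' below is entirely made up for illustration; real attacks target DES and similar algorithms):

def exhaustive_search(encrypt, known_pairs, k):
    # Try every key in the 2^k keyspace; keep the first one consistent with all pairs.
    for key in range(2 ** k):
        if all(encrypt(key, p) == c for p, c in known_pairs):
            return key
    return None

def toy_encrypt(key, block):          # stand-in 16-bit 'block cipher', NOT a real one
    return (block * 31 + key) % 65536

pairs = [(4660, toy_encrypt(4242, 4660))]
print(exhaustive_search(toy_encrypt, pairs, 16))   # recovers the key 4242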
Exhaustive key search is the most basic type of attack against encryption schemes, and it
can be applied to any encryption algorithm. It has already been mentioned that the security
of encryption algorithms is taken to stem from the size of the key. With current computational
capabilities, a keyspace of size 2^40 (40-bit key) is considered insecure. A keyspace of size 2^64
(64-bit key) is considered borderline — currently it is secure, but probably not for much longer
(Moore's Law suggests that electronic chip performance doubles every two to three years). A keyspace
of size 2^128 (128-bit key) is very secure. A very large key is a necessary condition for achieving
security. However, large keys cannot guarantee security on their own, as there may be other less
obvious types of attacks that would compromise the security of an algorithm.
9 DES - Data Encryption Standard
In 1973-74, after consulting with the National Security Agency (NSA), the American National
Bureau of Standards (NBS), later renamed into the National Institute of Standards and Technology
(NIST), issued a call for proposals for a cipher that could be used for encryption of unclassified but
sensitive information. The scheme that was adopted by the government in 1977 had been proposed by
a team from IBM, and it became known as DES, which stands for Data Encryption Standard.
This cipher was described in full in FIPS PUB 46, but in 2005 it was officially withdrawn due
to security issues. This standard was replaced by AES in 2002, and a version of the cipher
known as Triple-DES has been approved by the NIST for use until 2030 for sensitive government
information.
DES is a block cipher, and now we specify its characteristic features. It has a block size
of 64 bits, and the key size is 56 bits (in fact, the key occupies 64 bits, but 8 of these are parity
bits, so the effective key size is 56 bits).
DES is a Feistel cipher with two major characteristics: a) plaintext input block is divided
into two halves, and the encryption works separately on them; b) there is a repeated function. In
DES there are 16 rounds of repeated function. Generally, if a block cipher has a repeated round
function, it is called an iterated block cipher. Each round uses its own subkey Ki , 1 ≤ i ≤ 16, and
each subkey Ki has a length of 48 bits. The subkeys are derived from the 56-bit key using the
DES key schedule defined as part of the algorithm.
Here, IP is an Initial Permutation defined as part of the algorithm, and IP⁻¹ is the In-
verse Permutation. IP and IP⁻¹ do not add to the security of the encryption, as they are just
transposition ciphers.
The Initial Permutation IP is given by the following table:
IP =
58 50 42 34 26 18 10 2
60 52 44 36 28 20 12 4
62 54 46 38 30 22 14 6
64 56 48 40 32 24 16 8
57 49 41 33 25 17 9 1
59 51 43 35 27 19 11 3
61 53 45 37 29 21 13 5
63 55 47 39 31 23 15 7
This matrix shows how the bits of plaintext input are shuffled around: bit in position 58 goes
to position 1, bit in position 52 goes to position 10 etc.
IP⁻¹ =
40 8 48 16 56 24 64 32
39 7 47 15 55 23 63 31
38 6 46 14 54 22 62 30
37 5 45 13 53 21 61 29
36 4 44 12 52 20 60 28
35 3 43 11 51 19 59 27
34 2 42 10 50 18 58 26
33 1 41 9 49 17 57 25
Again, we see that bit in position 1 goes to position 58, bit in position 7 goes to position 10 etc.
The actual security of DES comes from the repeated round function. To describe how this func-
tion operates, we label the left and right halves of the data Li and Ri respectively, 0 ≤ i ≤ 16.
Round 1 has inputs L0 and R0 and outputs L1 and R1 ; L16 and R16 are the outputs from Round
16, which are then permuted using IP−1 .
All rounds are exactly the same, except for the fact that a different subkey Ki is used in each
round.
Round i looks like this
Let us now describe in detail each of the individual operations on data that take place during
each such round. The first operation is the Expansion Permutation E, which expands Ri−1 from
32 to 48 bits as follows
32 1 2 3 4 5
4 5 6 7 8 9
8 9 10 11 12 13
12 13 14 15 16 17
16 17 18 19 20 21
20 21 22 23 24 25
24 25 26 27 28 29
28 29 30 31 32 1
E not only changes the order of bits, but it also repeats some of the bits (which is the only
way to achieve the desired length). For example, the bit in position 1 of the input is duplicated in
positions 2 and 48 of the output. E changes the right half of the input so that it would have the
same length as Ki so that they could be XOR-ed. The main cryptographic purpose of the expansion
permutation is to provide diffusion: we want every bit of the ciphertext to depend upon every bit
of the plaintext. Effectively, the expansion permutation serves to ‘spread out’ the influence of bits
of the input.
Once the right half Ri−1 has been expanded from 32 to 48 bits through the expansion permu-
tation E, at the next step it is XOR-ed with the corresponding subkey Ki . The 48-bit result is put
through a substitution process S.
Substitutions are performed by eight substitution boxes known as S-boxes. Each S-box has
a 6-bit input and a 4-bit output. To this end, the input of substitution, which is 48 bits long,
is divided into eight 6-bit sub-blocks, and then each of those 6-bit blocks is put through its own
S-box, as illustrated in the diagram below
Each S-box is a table with 4 rows and 16 columns. To show how they operate, suppose a
particular S-box receives the 6-bit input abcdef . Two outer bits af specify the row (0 to 3), and
the four inner bits bcde specify the column of the S-box (0 to 15) for the look-up. For example,
below is S-box 5:
2 12 4 1 7 10 11 6 8 5 3 15 13 0 14 9
14 11 2 12 4 7 13 1 5 0 15 10 3 9 8 6
4 2 1 11 10 13 7 8 15 9 12 5 6 3 0 14
11 8 12 7 1 14 2 13 6 15 0 9 10 4 5 3
Suppose, this S-box receives the input 011011. Now, the outer bits are 01, which is a binary
representation of 1, so we look at the second row of the above table. The inner bits are 1101,
which is a binary representation of 13, so we look at the fourteenth column of the table. At the
intersection of second row and fourteenth column we find number 9, which, when converted into
binary, gives 1001, and this is taken as an output of this S-box for the input of 011011.
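The look-up convention is easy to express in Python (a sketch using the S-box 5 table above; the helper name is ours):

S5 = [
    [2, 12, 4, 1, 7, 10, 11, 6, 8, 5, 3, 15, 13, 0, 14, 9],
    [14, 11, 2, 12, 4, 7, 13, 1, 5, 0, 15, 10, 3, 9, 8, 6],
    [4, 2, 1, 11, 10, 13, 7, 8, 15, 9, 12, 5, 6, 3, 0, 14],
    [11, 8, 12, 7, 1, 14, 2, 13, 6, 15, 0, 9, 10, 4, 5, 3],
]

def sbox_lookup(bits):                  # bits is a 6-character string such as '011011'
    row = int(bits[0] + bits[5], 2)     # outer bits a, f select the row
    col = int(bits[1:5], 2)             # inner bits b, c, d, e select the column
    return format(S5[row][col], '04b')  # 4-bit output

print(sbox_lookup('011011'))            # '1001', as in the worked example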
S-box substitution is the most critical security element in DES. The reason why S-boxes
provide security is because they are nonlinear. This means that if we are given two 6-bit inputs x
and y, we have
S-box(x ⊕ y) ≠ S-box(x) ⊕ S-box(y).
The issue here is that linear operations are easy to analyse, hence any block cipher algorithm must
contain some nonlinearity to ensure its security. In the case of DES, all nonlinearity comes from
the S-boxes.
After the S-box look-ups are completed, the next stage is a permutation P, given by
16 7 20 21
29 12 28 17
1 15 23 26
5 18 31 10
2 8 24 14
32 27 3 9
19 13 30 6
22 11 4 25
This shows that bit 16 goes to position 1 etc. Similar to IP, permutation P is a straightforward
permutation (or transposition). It provides more diffusion. Finally, the result of permutation P is
XOR-ed with Li−1 (the left half of the input) to give Ri . One should note that the new Li is just
Ri−1 without any changes to it.
It is possible to write a sequence of operations inside each round as follows
Li = Ri−1
Ri = Li−1 ⊕ f (Ri−1 , Ki ),
where the repeated round function is f (Ri−1 , Ki ) = P (S(E(Ri−1 ) ⊕ Ki )). A Feistel cipher can be
defined as an iterated block cipher satisfying the above two equations. If one changes the function
f , this would give a different Feistel cipher. It is also possible to change the number of rounds, and
a general observation is that the higher number of rounds provides a higher level of security, but
this results in a slower speed of operation.
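The two equations above translate directly into code. Below is a toy Feistel cipher in Python: the structure is exactly the one just described, but the round function f is a made-up stand-in (DES's real f is P(S(E(R) XOR K))).

def f(r, k):                             # hypothetical 32-bit round function, NOT DES's f
    return (r * 2654435761 + k) & 0xFFFFFFFF

def feistel(left, right, keys):
    # L_i = R_{i-1};  R_i = L_{i-1} XOR f(R_{i-1}, K_i)
    for k in keys:
        left, right = right, left ^ f(right, k)
    return right, left                   # final swap, as in DES before IP^-1

keys = [3, 141, 59, 26]                  # arbitrary example subkeys
C = feistel(0x01234567, 0x89ABCDEF, keys)
P = feistel(C[0], C[1], list(reversed(keys)))   # decrypt = same scheme, reversed subkeys
print([hex(x) for x in P])               # ['0x1234567', '0x89abcdef']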
Shannon identified two basic properties of a secure cipher: diffusion and confusion.
Diffusion. It dissipates redundancy in the plaintext by spreading it out over the ciphertext. This
is usually achieved by permutations (transpositions). In the particular case of DES, this is done by
the permutations E and P in each DES round.
Confusion. Makes the relationship between the key and the ciphertext as complicated as possible,
obscuring the relationship between plaintext and ciphertext. This is normally done by substitutions.
In the case of DES, substitutions are done by S-boxes.
Each round of DES encryption algorithm provides both confusion and diffusion. The iterated
round structure means that the confusion and diffusion increase with each round. This is sometimes
called the ‘avalanche effect’.
One interesting and surprising property of Feistel ciphers is that decryption is the same as en-
cryption (see, e.g. Handbook Note 7.84). In order to decrypt a given ciphertext, one simply has
to use exactly the same scheme as for encryption and just reverse the order of subkeys: the first
round uses K16 , the second round uses K15 etc. This is very useful from a practical perspective, as
it implies that it is not necessary to implement a separate algorithm for decryption.
Two desirable properties of any block cipher are the following:
• Each bit of the ciphertext should depend on all bits of the key and all bits of the plaintext.
• Altering any single plaintext bit or key bit should alter each ciphertext bit with probability 1/2.
DES appears to satisfy all of these properties, so in many respects it is a very good algorithm.
There are, however, some problems with DES. Most importantly, the key size (56 bits) is too small
for most applications. In fact, the small key size was criticised from the outset (W. Diffie & M.
Hellman, "Exhaustive cryptanalysis of the NBS Data Encryption Standard", 1977).
DES was 'cracked' many times in the late 1990s using exhaustive key search. On average, only
2^55 guesses are required to find the key. For example, the Electronic Frontier Foundation (EFF) has
built a DES cracker, which tests about 9.2 × 10^10 keys per second and is able to perform parallel
searches using the Internet. In this way, a DES key was recovered in less than a day in 1999, and with
current computing power this can be done in a matter of hours.
Problems with security of DES have led the NIST to take an appropriate action. The last
version of FIPS PUB 46-3 said
"... exhaustion of the DES (i.e. breaking a DES encrypted ciphertext by trying all possible keys)
has become increasingly more feasible with technology advances. Following a recent hardware based
DES key exhaustion attack, NIST can no longer support the use of single DES for many applications.
Therefore, Governmental agencies with legacy single DES systems are encouraged to transition to
Triple DES. Agencies are advised to implement Triple DES when building new systems".
DES was finally completely removed from the list of FIPS-approved algorithms on 19th May
2005.
There are two other main types of attack on block cipher algorithms: differential cryptanalysis
(Biham, Shamir, 1990) and linear cryptanalysis (Matsui, 1993). Differential cryptanalysis (DC)
is a chosen-plaintext attack, which compares the XOR of two inputs to the XOR of the correspond-
ing outputs. A significant limitation of this type of attack is that it requires a large amount of
chosen plaintext. Interestingly, DES is very resistant to DC, since the S-boxes and the permutation P
were specifically designed to counter DC attacks. This is quite remarkable, bearing in mind that
DES had been designed some 15 years before DC attacks were published. To break DES
using DC, one would require 2^47 chosen plaintext-ciphertext pairs, which makes the attack
theoretical rather than practical.
Linear cryptanalysis (LC) is a known-plaintext attack, which uses linear approximations to
describe the action of the block cipher. Although DES was not specifically designed to withstand
LC, an LC attack on DES requires approximately 2^43 known plaintext-ciphertext pairs, so this type
of attack is also impractical. Therefore, the best cryptanalytic attack on DES is the exhaustive key
search. From this point of view, DES is a very good algorithm, although we have seen that the key
size is far too small.
Another problem with the DES is that the block size is too small. This limits the amount
of data that can be encrypted with a single key: if the attacker intercepts two ciphertext blocks
that are equal, then this gives information about the plaintext (this is called information leakage).
Hence, one would not want a block cipher to produce any repeating ciphertext blocks. If the block
length is too short (e.g. 64 bits), then we expect to see repeating ciphertext blocks quite regularly if
encrypting at high data rates. For a 64-bit algorithm, there are 264 ciphertext blocks, and assuming
random inputs, we expect to see a repeat after encrypting only 232 inputs (this is known as the
’birthday paradox’). With the encryption rate of 100Mbps, we expect a repeating ciphertext block
after only about 45 minutes. For this reason, it is more customary now to use algorithms with the
block size of 128 bits.
11 Structural properties of DES
DES has a feature known as the complementarity property: if C = DESK (P ), then complementing
every bit of both the key and the plaintext complements every bit of the ciphertext, i.e. DESK̄ (P̄ ) = C̄,
where x̄ denotes the bitwise complement of x.
DES has four weak keys - these are the keys that cause the encryption mode of DES to act
identically to the decryption mode of DES:
DESK (DESK (P )) = P for all plaintexts P .
Besides the 4 weak keys, DES also has 6 semi-weak key pairs (K1 , K2 ), for which encryption with
one key of the pair acts as decryption with the other:
DESK1 (DESK2 (P )) = P.
A practical drawback of DES is that the algorithm contains a lot of bit-oriented operations (2 per-
mutations per round), and these are inconvenient to implement in software, which results in poor
performance of the cipher.
Although DES has now been withdrawn as an algorithm due to its security issues, it survives
as a component of Triple-DES. Understanding the workings of DES is therefore crucial for
understanding Triple-DES and Feistel ciphers, on which many modern block cipher algorithms are
based.
12 Triple-DES
A more secure version of the same algorithm is known as Triple-DES or 3-DES. Triple-DES
was, in fact, defined in the same FIPS standard as DES itself (FIPS PUB 46-3), which has
now been withdrawn. It is now called the Triple Data Encryption Algorithm (TDEA) and is
defined in the NIST Special Publication SP 800-67, published in 2004. We shall use the original
name Triple-DES.
Triple-DES came to life as a result of research in the late 1970s into how the DES algorithm
could be improved. Effectively, Triple-DES is a DES performed 3 times on a given plaintext. Two
alternative versions of the Triple-DES were proposed.
Triple-DES, as defined in the standard, is the E-D-E version (encrypt-decrypt-encrypt):
C = EK3 (DK2 (EK1 (P ))). If all the keys are the same, i.e. K1 = K2 = K3 , then the E-D-E scheme
is functionally equivalent to a single DES encryption, and hence Triple-DES provides backward
compatibility with DES. An
important implication is that it is sufficient to implement Triple-DES, and there is then no need for
a separate DES implementation. However, this particular ‘keying’ option is no longer considered
to be secure because of the exhaustive search attacks on DES.
Triple-DES Decryption: P = DK1 (EK2 (DK3 (C))).
• If K1 , K2 and K3 are all different, then the key length becomes 56 × 3 = 168 bits. This option
is known as a 3-key Triple-DES.
• Another ’keying’ option is to have K1 = K3 . In this case, the key length is 56 × 2 = 112 bits,
and this option is known as a 2-key Triple-DES.
At the same time, there is a performance cost associated with the Triple-DES. For a Triple-DES
to be as efficient as DES in terms of data throughput, three separate DES engines are required,
and this requires more ’space’. Alternatively, it would be possible to implement Triple-DES with a
single DES engine, but this would be three times as slow as DES. Also, Triple-DES inherits the small
block size of DES, which means that one cannot encrypt too much data using the same key: as for
the standard DES algorithm, one expects to get repeating ciphertext blocks after 2^32 encryptions.
One might ask: why is it not reasonable to use Double-DES rather than Triple-DES?
This looks like a viable alternative, as it increases the key-length to 112 bits, and is simpler to
implement than Triple-DES.
13 ‘Meet-in-the-middle’ attack
Double-DES is vulnerable to a so-called ‘meet-in-the-middle’ attack, which is a version of a known-
plaintext attack. The way it works is as follows (a toy Python sketch is given after the list):
• For a given pair (P, C), pre-compute all values of EK1 (P ) for all keys K1 .
• Then compute all values of DK2 (C) for all keys K2 and look for a match in the table.
• A match gives a possible solution pair of keys (K1 , K2 ), which can be checked using another
(P, C) pair.
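The sketch below makes the trade-off concrete; the 8-bit-key 'cipher' is invented purely for illustration (with DES the same idea needs 2^56-entry tables):

def toy_encrypt(key, block):             # stand-in cipher with an 8-bit key
    return (block + key * 17) % 256

def toy_decrypt(key, block):
    return (block - key * 17) % 256

P, K1, K2 = 42, 200, 95
C = toy_encrypt(K2, toy_encrypt(K1, P))  # double encryption

table = {toy_encrypt(k1, P): k1 for k1 in range(256)}           # forward half
matches = [(table[toy_decrypt(k2, C)], k2) for k2 in range(256)
           if toy_decrypt(k2, C) in table]                      # meet in the middle
print((K1, K2) in matches)               # True; extra (P, C) pairs weed out false matches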
This attack reduces the number of operations that need to be performed, but requires consid-
erable storage. For comparison, using a simple exhaustive key search for a 112-bit key, we would
expect to have to perform about 2^111 decrypt (or encrypt) operations. On the other hand, the 'meet-
in-the-middle' attack on Double-DES requires:
• 2^56 pre-computed encryptions. Since these are pre-computed, we can ignore them when con-
sidering the 'cost' of the attack.
• Storage for 2^56 64-bit ciphertexts, which is about 10^17 bytes - an enormous amount.
• 2^56 decryptions.
It is usually said that the 'meet-in-the-middle' attack requires '2^56 time and 2^56 memory'. In fact,
exactly the same type of 'meet-in-the-middle' attack can be made against Triple-DES. The best attack
on 3-key Triple-DES takes 2^112 time and 2^56 memory.
This means that, in fact, 3-key Triple-DES is as secure against exhaustive key search as one
would naively expect 2-key Triple-DES to be. Although the key length is 168 bits, the ‘meet-in-
the-middle’ attack reduces the key strength to 112 ‘bits of security’. However, 112 bits of security
is still considered to be very secure.
There is also a known-plaintext 'meet-in-the-middle' attack against 2-key Triple-DES: with
2^40 known plaintext/ciphertext pairs, this attack requires 2^80 operations. For this reason, 2-key
Triple-DES is considered to offer only 80 bits of security. This is currently secure, but is close to
the borderline. The 2-key Triple-DES should be avoided for encrypting data that has to remain
secure for a long period of time. We finish the analysis of a Triple-DES scheme by noting that
Triple-DES is secure against differential and linear cryptanalysis. Triple-DES inherits resistance
to these attacks from the underlying DES algorithm.
14 AES - Advanced Encryption Standard
In 1997, NIST announced an open competition for a new block cipher standard to replace DES.
The candidate algorithms were judged on the following criteria:
• Resistance to cryptanalysis.
• Ease of implementation.
• Speed.
Another aim of the AES selection process was a public design for the algorithm, so that the
algorithm would be available royalty-free.
In 2000, Rijndael, designed by two Belgian cryptographers, Rijmen and Daemen, was announced
as the winner. The complete description of this algorithm is given in the standard (FIPS PUB 197),
published in 2002; it is available from http://csrc.nist.gov, where
all FIPS PUBs and Special Publications can be downloaded. Rijndael was effectively renamed and
is now called AES.
The block length of the AES algorithm is 128 bits (16 bytes). There are three possible options
for the key size: 128, 192 or 256 bits. There are no 'meet-in-the-middle' attacks on AES.
AES is an iterated block cipher, which means that, similar to DES, it has repeated rounds.
The AES state contains 16 bytes, and operations are performed on bytes rather than bits. The state
is shown as a 4 × 4 array of bytes.
As with DES, each round has its own subkey or ‘round key’. Round keys are all 128-bit (the
same round key length for AES-128, AES-192 and AES-256). In addition to the round keys, there
is also an extra key that is ‘added’ to the plaintext input. So, a key expansion routine derives the
following numbers of subkeys from the main key:
- AES-128: 11 subkeys (10 rounds)
- AES-192: 13 subkeys (12 rounds)
- AES-256: 15 subkeys (14 rounds)
A typical AES round has the following components
• SubBytes
• ShiftRows
• MixColumns
• AddRoundKey
There is also an initial AddRoundKey, and there is no MixColumns in the final round. Let us
consider all these operations in order.
SubBytes is a byte-by-byte substitution: it replaces a0 → b0 , a1 → b1 , etc. The same S-box
look-up table is used for each of the 16 bytes. The SubBytes substitution is a nonlinear transformation.
ShiftRows is a permutation of bytes: it cyclically shifts the last three rows of the state (by one,
two and three byte positions, respectively):
• MixColumns operates on the state array column by column. MixColumns corresponds to mul-
tiplying each column by a matrix
( c0 )   ( 02 03 01 01 ) ( b0 )
( c1 ) = ( 01 02 03 01 ) ( b1 )
( c2 )   ( 01 01 02 03 ) ( b2 )
( c3 )   ( 03 01 01 02 ) ( b3 )
The same matrix is used for every column. The matrix entries 01, 02 and 03 are chosen because
they are easy to implement, requiring only shifts and XORs in GF(2^8).
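For instance, multiplication by 02 in GF(2^8) ('xtime') is a left shift followed by a conditional XOR with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), and multiplication by 03 is xtime plus one more XOR. A Python sketch:

def xtime(b):                   # multiply a byte by 02 in GF(2^8)
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b

def mul03(b):                   # 03 . b = (02 . b) XOR b
    return xtime(b) ^ b

print(hex(xtime(0x57)))         # 0xae
print(hex(mul03(0x57)))         # 0xf9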
ShiftRows and MixColumns together provide a very good diffusion. Multiple rounds lead to
the avalanche effect – after only a few rounds one bit of input spreads out over the whole state space.
AddRoundKey. The subkeys for each round are stored as arrays of 16 bytes, and AddRoundKey
adds the corresponding bytes of the state and the round key using bitwise XOR:
d0 = c0 ⊕ k0
d1 = c1 ⊕ k1
...
d15 = c15 ⊕ k15
SubBytes provides confusion, ShiftRows and MixColumns give diffusion, and AddRoundKey gives
a key input.
Despite all its successes, AES possesses certain disadvantages.
- Encryption and decryption are not the same. We recall that for DES decryption simply
involves reversing the order of the keys. For AES, inverse S-boxes and inverse MixColumns opera-
tion are required for decryption. This means that for AES separate implementations are required
for encryption and decryption.
- There are some security concerns. The very simple nature of the algorithm might make some
forms of analysis easier. Also, the number of rounds is quite low; this improves
performance, but, perhaps, at the cost of security. At present, however, these concerns are
still very theoretical, as no practical attack on AES has been found.
15 Block cipher modes of operation
A block cipher encrypts plaintext in blocks of n bits, so before encryption the plaintext must be
split into n-bit blocks, and the final block must be padded if it is incomplete. The padding scheme
that is used for the final block must be reversible. For this reason, one cannot, for
instance, just add all 0s at the end, as in this case 11 and 110 would have the same padding, and it
would be impossible to reverse this. A common padding scheme is to append a 1 to the end of the
data, and then pad with 0s. If the final block happens to be complete, then it is necessary to add
an entire extra block (1000 · · · 0000) in order to make this process reversible.
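A byte-level version of this '1 then 0s' padding (a sketch; at byte granularity the appended 1-bit becomes the byte 0x80):

def pad(data, block_size=16):
    n = block_size - (len(data) % block_size)   # always append at least one byte
    return data + b'\x80' + b'\x00' * (n - 1)

def unpad(data):
    return data.rstrip(b'\x00')[:-1]            # strip the 0s, then the single 0x80 marker

assert unpad(pad(b'hello')) == b'hello'
assert len(pad(b'0123456789abcdef')) == 32      # a complete block gains a whole extra block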
Once we have padded the final block, we obtain a series of plaintext blocks P1 , P2 , . . ., Pk . If
k > 1, then we need a block cipher mode in order to encrypt these plaintext blocks. A block
cipher mode is a function that uses a block cipher algorithm (such as DES or AES) to encrypt more
than a single plaintext block.
Electronic Codebook (ECB) mode is the simplest mode of block cipher operation. In this mode, each
plaintext block is encrypted separately and independently of the other blocks: Ci = EK (Pi ).
One should note that a single bit of error in a ciphertext block occurring during transmission
will mean that the whole of the corresponding plaintext block will be random when decrypted. This
is an example of error propagation or error extension. Error propagation is the property whereby
a small ciphertext error becomes a larger plaintext error after the ciphertext has been decrypted.
ECB suffers from information leakage: if two ciphertext blocks are the same, then an attacker
knows that the two corresponding plaintext blocks must be the same:
Pi = Pj ⇐⇒ Ci = Cj .
If the plaintext is random, then this will happen very rarely. However, most plaintext is highly
structured, which means that we can expect to get frequent repeats. Repeating ciphertext blocks
will reveal the structure of the plaintext.
This makes ECB mode a very bad choice for most applications. The exception would be if the
plaintext is a single block: for example, AES-192 could be used in ECB mode to encrypt a 128-bit
key. To illustrate the degree to which ECB mode leaves data patterns undisturbed, one can encrypt
an image using ECB mode and a non-ECB mode and compare the results.
Cipher-block Chaining (CBC) mode. CBC mode aims to reduce the information leakage of
ECB mode by masking the plaintext input. To achieve this, each plaintext input block is XOR-ed
with the ciphertext output block from the previous encryption: Ci = EK (Pi ⊕ Ci−1 ), with C0 = IV.
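A sketch of CBC chaining in Python; the 'block cipher' E here is just a fixed byte-substitution over 4-byte blocks (purely illustrative, not a real cipher), but it is enough to show that equal plaintext blocks encrypt differently:

import random

rng = random.Random(1)                      # fixed seed: a toy, keyed byte-substitution
SUB = list(range(256)); rng.shuffle(SUB)
E = lambda block: bytes(SUB[b] for b in block)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, iv):
    out, prev = [], iv
    for p in blocks:
        prev = E(xor(p, prev))              # C_i = E_K(P_i XOR C_{i-1}), C_0 = IV
        out.append(prev)
    return out

print(cbc_encrypt([b'SAME', b'SAME'], b'\x00' * 4))   # two different ciphertext blocks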
The Initialization Vector IV serves to randomise the first plaintext block: if two different
messages are encrypted with the same key, the IV prevents information leakage in the first
block. Unlike the key, the IV is not secret; it has to be sent to the receiver unencrypted, because the
receiver needs it to begin decryption. However, we need an integrity mechanism to ensure that the
IV is not deliberately altered in transit, as an attacker can alter the first decrypted plaintext block
by altering the IV.
The IV causes message expansion: the transmitted data will be one block longer than the plain-
text. This can be an issue if the messages that are being sent are short, as it will reduce throughput
significantly.
A related issue with CBC mode is ‘bit-flipping’ attacks. By altering a ciphertext block, an attacker
is able to make predictable changes to the subsequent plaintext block. An example of this type of
attack is trying to re-route a decrypted IP packet to an address of the attacker's choosing.
To avoid this sort of attack it is essential to use an authentication mechanism, as encryption on its
own does not provide data integrity/authentication.
CBC masks the structure of the plaintext. The feedback mechanism effectively randomises the
inputs. However, this does not completely eliminate the problem of information leakage. If the
attacker sees two identical ciphertext blocks Ci and Cj , then we can say the following:
Ci = Cj
⇒ EK (Pi ⊕ Ci−1 ) = EK (Pj ⊕ Cj−1 ) (definition of CBC)
⇒ Pi ⊕ Ci−1 = Pj ⊕ Cj−1 (decrypt both sides)
⇒ Pi ⊕ Pj = Ci−1 ⊕ Cj−1 (rearranging)
The attacker knows what Ci−1 ⊕ Cj−1 is. This gives them the XOR of the two plaintext blocks Pi
and Pj , from which it is possible to recover both Pi and Pj .
Therefore, one would not want the ciphertext to repeat. For an n-bit cipher, we expect to see
Ci = Cj after encrypting 2^(n/2) random plaintext blocks (birthday paradox). For DES or 3-DES, we
get Ci = Cj after 2^32 blocks; for AES, after 2^64 blocks. Therefore, for 64-bit algorithms,
we need to change the key frequently if using CBC mode at high data rates.
Cipher Feedback (CFB) mode. In CFB mode, data is encrypted in units that are smaller than
(or equal) the block size of the algorithm:
- 1-bit CFB mode
- 8-bit CFB mode
- 64-bit CFB mode
- etc.
These are usually called CFB-1, CFB-8 etc.
CFB-8 encryption (with a 128-bit block size algorithm) works as follows
For decryption all that is required is to reverse the arrows at the bottom of the picture:
CFB mode turns a block cipher into a stream cipher. In fact, CFB is a self-synchronising stream
cipher. Decryption of bit Ci depends on the previous 64 ciphertext bits (for 64-bit algorithm in
CFB-1 mode). So, if there are 64 uncorrupted ciphertext bits, then the next ciphertext bit will
decrypt correctly. CFB-1 mode gives a self-synchronising binary additive stream cipher. CFB mode
is useful for encrypting single bytes, since the underlying algorithm otherwise only encrypts whole
blocks of plaintext: 8 bytes (DES, 3-DES) or 16 bytes (AES). With CFB-8 mode it is possible to
encrypt individual bytes without delay, which is useful for applications where we want to encrypt
single characters.
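A CFB-8 sketch in Python; the block cipher is replaced by a keyed SHA-256 stand-in, an assumption made purely so the example runs:

import hashlib

def E(key, block):                          # stand-in for a 128-bit block encryption
    return hashlib.sha256(key + block).digest()[:16]

def cfb8(key, iv, data, decrypt=False):
    reg, out = iv, bytearray()
    for b in data:
        ks = E(key, reg)[0]                 # leftmost byte of the cipher output
        o = b ^ ks
        out.append(o)
        fb = b if decrypt else o            # feed back the CIPHERTEXT byte
        reg = reg[1:] + bytes([fb])
    return bytes(out)

c = cfb8(b'k', b'\x00' * 16, b'stream of bytes')
assert cfb8(b'k', b'\x00' * 16, c, decrypt=True) == b'stream of bytes'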
An important feature of the CFB mode is that for both encryption and decryption, the block
cipher algorithm only has to encrypt. This is a major advantage, as it means that only an encryption
implementation of the algorithm is required. This is useful for block cipher algorithms where
encryption and decryption operations are different (such as AES). For comparison, in the CBC
mode, both encrypt and decrypt implementations are required.
As in CBC mode, the IV has to be sent to the receiver along with the ciphertext. This
causes message expansion and can be an issue for short messages. CFB mode, like CBC mode,
causes error propagation. Consider the case of a 64-bit algorithm operating in CFB-8 mode. If
only one byte gets corrupted in transmission, then decryption will proceed as follows.
In summary, the first plaintext byte will be corrupted in the same bit positions as the ciphertext.
This means that an attacker can make predictable changes to the first byte (bit-flipping). The next
8 plaintext bytes will also be corrupted (for a 64-bit algorithm operating in CFB-8 mode). This
means that 50% of bits will be in error, typically. So, a single bit error in one byte will effectively
cause 65 corrupted bits (for a 64-bit algorithm operating in CFB-8 mode).
If the size of the plaintext and ciphertext ‘blocks’ is less than the block size of the encryption
algorithm, then the throughput of CFB mode will be reduced compared to CBC mode. For ex-
ample, for a 128-bit algorithm operating in CFB-1 mode, each execution of the encrypt algorithm
will result in a single bit of encrypted data. Throughput is reduced by a factor of 128.
Output Feedback (OFB) mode. This mode is quite similar to Cipher Feedback (CFB) mode.
Instead of 'feeding back' the ciphertext into the register, OFB feeds back the whole output block
of the block cipher algorithm.
OFB Encryption
OFB Decryption
As with CFB mode, only the encrypt implementation of the block cipher algorithm is required
for both encrypting and decrypting. Encryption and decryption in OFB mode are exactly the same
operation. Also, it is not necessary to pad the final plaintext block, as it suffices to use as many
bits as are required from the last cipher output block. As with both CBC and CFB, the message is
expanded by one block (the size of the IV), which is only a problem for short messages.
OFB is a synchronous stream cipher, with the keystream being generated independently of the
plaintext and ciphertext. Unlike CFB, OFB is not self-synchronising: if bits are lost or inserted
in transmission, the decryptor will not be able to recover, as the loss or
insertion of ciphertext destroys the alignment of the decrypting keystream with the ciphertext.
Hence, OFB requires some sort of re-synchronisation mechanism.
In the light of OFB being a synchronous stream cipher, there is no error propagation, so a
single bit error in a ciphertext block only affects one bit at the same location in the corresponding
plaintext block, which is one of the main advantages of OFB cipher mode. For this reason it is
often used on ‘noisy’ communication channels, such as wireless links.
It is absolutely essential never to re-use an IV when running a block cipher in the OFB
mode, as the use of the same IV (with the same key) will generate exactly the same keystream, and
if two messages are encrypted using the same keystream, then it is possible to recover the XOR
difference between the plaintexts from the ciphertexts. An attacker can use this to recover both
plaintexts.
Another security issue is related to the risk that two cipher output blocks might repeat. In
this case, the keystream would ‘cycle’, and one would get portions of plaintext encrypted using the
same keystream. For a 128-bit algorithm, we would expect to get repeating cipher output blocks
after about 2^64 encryptions. The risk of this happening is reduced by changing the key frequently.
It is possible to implement OFB-1, OFB-8 etc. simply by using the leftmost bit or byte of the
block cipher output (although these versions of OFB are not part of the official standard), which
is useful for encrypting bytes without delay. As described in the standard, OFB is a stream cipher
that acts on whole n-bit blocks, where n is the block size of the cipher.
Counter mode (CTR). Counter mode of block cipher operation is a simplification of OFB mode.
The input to the block cipher is updated using a counter rather than using feedback.
CTR Encryption
CTR Decryption
Like OFB, CTR provides a synchronous stream cipher from a block cipher algorithm. This
means that there is no error propagation with the CTR mode. The counters must never repeat.
One approach is to have counters of the following form:
Nonce || i, i = 1, 2, 3, ...
Nonce stands for ’number used only once’. To avoid security issues, one should never use the same
nonce with the same key. The nonce is usually a message number, which guarantees uniqueness.
For a 128-bit algorithm, we could typically use a 64-bit nonce and a 64-bit counter i. This limits
each message to 2^64 blocks (which is not a limitation in practice).
Ensuring that the counter never repeats is really the only concern with CTR mode. One needs to make sure that the implementation checks that the nonce is always unique and that the counter value i does not ‘wrap’ (for a given message).
Only the encrypt implementation of the cipher is required, as the encrypt and decrypt operations are the same for CTR. Also, no padding is required for the final block (same as OFB). Computation of the keystream can be parallelised, or the keystream can be precomputed, which allows for very fast CTR implementations and provides a ‘random access’ property for decrypting ciphertext blocks: decryption of a ciphertext block does not depend on decryption of previous blocks.
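To make the CTR construction concrete, here is a minimal Python sketch. The ‘block cipher’ below is a stand-in PRF built from SHA-256, used purely for illustration (a real implementation would use AES); the key and nonce values are invented for the example.

```python
import hashlib

BLOCK = 16  # 128-bit blocks

def prf(key: bytes, block_in: bytes) -> bytes:
    # Stand-in for the block cipher's encrypt function E_K (illustrative only).
    return hashlib.sha256(key + block_in).digest()[:BLOCK]

def ctr_keystream(key: bytes, nonce: bytes, nblocks: int):
    # Counter input to the cipher: 64-bit nonce || 64-bit counter i, as above.
    for i in range(1, nblocks + 1):
        yield prf(key, nonce + i.to_bytes(8, "big"))

def ctr_xcrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation: XOR with the keystream.
    nblocks = (len(data) + BLOCK - 1) // BLOCK
    stream = b"".join(ctr_keystream(key, nonce, nblocks))
    return bytes(d ^ s for d, s in zip(data, stream))

key, nonce = b"0" * 16, b"msg00001"         # invented 128-bit key, 64-bit nonce
ct = ctr_xcrypt(key, nonce, b"attack at dawn")
assert ctr_xcrypt(key, nonce, ct) == b"attack at dawn"
```

Note that the same function performs both encryption and decryption, and that any keystream block can be computed independently, which is exactly the ‘random access’ property mentioned above.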
To summarize, we have considered five block cipher operation modes: ECB, CBC, CFB, OFB
and CTR, which are all defined in NIST Special Publication 800-38A, ”Recommendation for Block
Cipher Modes of Operation”, which is downloadable from http://csrc.nist.gov/. ECB, CBC,
CFB, OFB and CTR are called the five ‘confidentiality modes’.
16 Hash functions
A Hash Function converts a block of data (of any length) into a fixed length ‘hash-value’ (also
known as the ‘message digest’ or ‘digital fingerprint’). The length of the hash-value is relatively
short. The hash-value serves as a compact representation or fingerprint of the original data.
The property of producing a fixed length output is known as compression. A hash function h
maps an input M of arbitrary finite bitlength to an output h(M ) of fixed bitlength n. Another
property of hash functions is ease of computation. Given a hash function h and input M , h(M ) is
easy to compute.
Hash functions are used in a wide variety of cryptographic applications, such as Data Integrity,
Digital signatures (all signature algorithms use hash functions), Pseudorandom generators, Key
derivation applications etc. - they are extremely important in cryptography.
Well-known hash functions are SHA-1 and MD5. SHA-1 converts a block of data (of any length)
into a 160-bit string. MD5 converts a block of data (of any length) into a 128-bit string. There is
also the SHA-2 family of hash functions: SHA-224, SHA-256, SHA-384, SHA-512, with SHA-224
producing 224-bit hash-values, etc.
There are two main types of hash function: Modification Detection Codes (MDCs) and Message Authentication Codes (MACs).
MDCs do not use keys, and are therefore called unkeyed hash functions. On the other hand, MACs use keys and are therefore called keyed hash functions. The (unkeyed) hash functions approved by NIST are SHA-1 and the four SHA-2 functions; they are all specified in FIPS PUB 180-3,
see http://csrc.nist.gov/publications/PubsFIPS.html. As the name Modification Detection
Code suggests, MDCs are used to provide data integrity (recall the objectives of information secu-
rity from Lecture 1). For example, one can hash a large file once to obtain the hash-value and store
this hash-value securely. One can then check that the file has not been altered by calculating the
hash value again and by comparing it to the stored hash-value.
[Diagram: initial computation of the hash-value, which is then stored securely]
If, however, the file has been modified, it will have a different hash-value, and the alteration will be detected when the recomputed hash-value is compared with the stored one.
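As an illustration of this check, here is a minimal sketch using Python’s hashlib; the data below stands in for the contents of a stored file.

```python
import hashlib

data = b"contents of some large file"            # stands in for the stored file

# Initial computation: hash the file once and store the hash-value securely.
stored_hash = hashlib.sha256(data).hexdigest()

# Later: recompute the hash-value and compare it with the stored one.
current_hash = hashlib.sha256(data).hexdigest()
print("file unchanged" if current_hash == stored_hash else "file modified")

# Any alteration to the data changes the hash-value:
tampered_hash = hashlib.sha256(data + b"!").hexdigest()
print(tampered_hash == stored_hash)              # False
```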
Unkeyed hash functions need to have the following two properties (in addition to compression
and ease of computation):
• Preimage resistance
• Collision resistance
Preimage resistance is defined as follows: given any hash-value y, it is hard to find an input M such that h(M) = y (by ‘hard’ we mean ‘computationally infeasible’). This is also sometimes called the one-way property, because it implies that it is hard to ‘invert’ the hash function.
Collision resistance means it is computationally infeasible to find any two distinct inputs M and M∗ which hash to the same hash-value, i.e. such that
h(M) = h(M∗).
A weaker property than collision resistance (called ‘weak collision resistance’ or 2nd-preimage resistance) is that for a specified input M, it is computationally infeasible to find a different input M∗ such that
h(M) = h(M∗).
Collision resistance as defined earlier is sometimes called ‘strong collision resistance’. Strong colli-
sion resistance implies weak collision resistance. When we talk about collision resistance, we mean
strong collision resistance.
Some MDCs have preimage resistance but do not have collision resistance. These hash functions are called one-way hash functions. However, since MDCs are usually required to have both properties, we shall only consider hash functions that are also collision resistant.
Collision resistance is quite a surprising property: for any hash function, there is an infinite number of inputs and a finite number of outputs. Therefore, any hash function has an infinite number of collisions. Collision resistance means that although an infinite number of collisions exist, it is not practically possible to find one of them.
Hash functions are used in digital signatures - these algorithms hash a message before signing it.
The hashed message is of the size that is appropriate for the signature algorithm. The hash-value
gets signed rather than the message itself. Collision resistance is a necessary requirement for digital
signature applications of hash functions. Collision resistance is associated with the birthday attack.
Question: how many people do you need in a room for the probability to be greater than 1/2 that two people share the same birthday?
Answer: 23. This is surprising - given that there are 365 possible birthdays we would expect that
more people would be needed. Birthday paradox: in a room of 23 randomly selected people the
probability is greater than 1/2 that two selected people share the same birthday.
NB: 23 is approximately √365. The rule holds in general: if we have a quantity that can take N different values, and we generate these values at random, then we expect to have to generate about √N values before we get a repeat (collision).
For a hash function with n-bit output, there are 2^n possible hash-values. This means that we can expect to see a collision after hashing about 2^(n/2) different inputs. This is known as the birthday attack on a hash function. For example, let us consider MD5, which produces 128-bit hash-values. It means it can produce at most 2^128 different hash-values. With the birthday attack, we expect to find a collision after hashing only 2^64 different inputs. For this reason, 128-bit hash-values are now viewed as too small, and hence MD5 is no longer considered secure.
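The birthday numbers can be checked directly. The short calculation below computes the exact collision probability for the birthday setting; the helper function is ours, written for illustration.

```python
import math

def collision_prob(n_values: int, n_samples: int) -> float:
    # P(at least one collision among n_samples draws from n_values equally
    # likely values) = 1 - (N/N) * ((N-1)/N) * ... * ((N-k+1)/N).
    p_no_collision = 1.0
    for i in range(n_samples):
        p_no_collision *= (n_values - i) / n_values
    return 1.0 - p_no_collision

print(collision_prob(365, 23))        # about 0.507: the birthday paradox
print(math.isqrt(2**128) == 2**64)    # True: sqrt(2^n) = 2^(n/2), the birthday bound
```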
SHA-1 produces 160-bit hash-values. For this hash function, there are 2^160 different hash-values, and we would expect to see a collision after hashing 2^80 different inputs, which is still not possible using current computers. However, a paper by Wang et al. (2005) showed that a SHA-1 collision could actually be found in 2^69 hash operations. The same researchers subsequently improved this result, showing that a collision could be found in 2^63 hash operations. Although 2^63 operations are currently still out of reach, this is quite close to being achievable.
These results mean that SHA-1 is being phased out of use for digital signature applications.
SHA-1 is no longer allowed for digital signature applications as from 2010. Hence, SHA-2 algorithms
are used instead, as they have longer outputs. NIST is currently organising a public competition, similar to the AES competition, to develop a replacement algorithm to be called SHA-3. The current schedule will produce SHA-3 by the end of 2012, although the date is likely to change.
MD5, SHA-1 and all the SHA-2 algorithms all have a similar design: they are iterated hash
functions which work by splitting the input into a sequence of fixed-size blocks M1 , M2 , M3 , ...,
Mk , with some padding rule for the last block Mk . Input blocks are then processed in order, using
a compression function, which is a one-way function. This gives a set of intermediate hash-values H0, H1, H2, ..., Hk. H0 is a predefined initialising value (IV) defined in the specification of the hash function. Hk is the hash-value output of the hash function:
Hi = f(Mi, Hi−1), 1 ≤ i ≤ k.
Iterated design involves all message bits in the final hash value Hk .
For MD5, SHA-1, SHA-224 and SHA-256, the padded message is split into message blocks Mi
of length 512 bits, while for SHA-384 and SHA-512, the message blocks are of length 1024 bits.
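To make the iterated construction concrete, here is a minimal Python sketch. The compression function, initialising value and padding rule below are stand-ins invented for illustration (a real design specifies these precisely); only the chaining structure Hi = f(Mi, Hi−1) matches the description above.

```python
import hashlib

BLOCK = 64          # message block size in bytes (512 bits, as for MD5/SHA-1/SHA-256)
H0 = b"\x00" * 32   # predefined initialising value (illustrative only)

def compress(h_prev: bytes, block: bytes) -> bytes:
    # Stand-in compression function f(M_i, H_{i-1}); real designs use a
    # dedicated one-way function rather than another hash.
    return hashlib.sha256(h_prev + block).digest()

def iterated_hash(msg: bytes) -> bytes:
    # Simplistic padding rule: zero-pad, then append the message length
    # (real padding rules are more careful, e.g. a leading 1 bit).
    padded = msg + b"\x00" * ((-len(msg) - 8) % BLOCK) + len(msg).to_bytes(8, "big")
    h = H0
    for i in range(0, len(padded), BLOCK):
        h = compress(h, padded[i:i + BLOCK])   # H_i = f(M_i, H_{i-1})
    return h                                    # H_k: the output hash-value

print(iterated_hash(b"abc").hex())
```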
Iterated hash function design has practical advantages:
• Processing can start on a long message as soon as the first part of the message is available
In order to achieve the one-way property (preimage resistance), we need the hash function to be such that the input bits and output bits are not correlated. With multiple rounds, we get an avalanche effect — every input bit affects every output bit. From block ciphers we are familiar with the concepts of confusion and diffusion. Compression functions include non-linear operations and spread input differences across the output.
It is not yet clear whether new hash function SHA-3 will be based on an iterated design, as
some entries for the NIST competition use another novel construction, see http://csrc.nist.
gov/groups/ST/hash/sha-3.
17 Message Authentication Codes (MACs)
In terms of operation, every MAC function has two inputs: a message M (of arbitrary length) and a key K (of fixed size). The output of a MAC function is a fixed-size ‘MAC value’, which we can
write as MAC(K, M ). When Alice sends the message M , she also sends MAC(K, M ). When Bob
receives the message, he recalculates the MAC value MAC(K, M ) and checks that it is the same
as the MAC value that he has received from Alice. If the MAC values are the same, he accepts
the message he has received as authentic, and if the two MAC values do not match, then Bob will
discard the message he has received.
It is important to note that both Alice and Bob must have the same key for this scheme to
work successfully. Therefore, MAC functions are symmetric-key algorithms (like stream ciphers
and block ciphers).
MACs can also be used by a single user to check whether files on their computer have been
corrupted, perhaps, by a virus. To do this, a user computes the MAC value of stored files using
a secret key K. At a later date, the user can recalculate the MAC and check that it matches
the original value. This is more secure than using an MDC: if an MDC were used, a virus could alter the files and recalculate the hash-value, because there is no secret key with an MDC. By contrast, MAC values can only be computed by someone who has the secret key K. It is noteworthy that
this example is still ‘message authentication’, even though the message has not been sent over a
communication channel.
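This recompute-and-compare flow can be illustrated with Python’s standard hmac module (the HMAC construction itself is discussed below); the key and message values are invented for the example.

```python
import hmac
import hashlib

key = b"shared secret key"                 # K, held by both Alice and Bob
msg = b"transfer 100 pounds to Bob"        # M

# Alice computes MAC(K, M) and sends it along with the message.
tag = hmac.new(key, msg, hashlib.sha256).digest()

# Bob recomputes the MAC value and compares; compare_digest performs a
# constant-time comparison to avoid timing side channels.
recomputed = hmac.new(key, msg, hashlib.sha256).digest()
print("accept" if hmac.compare_digest(recomputed, tag) else "discard")
```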
18 Properties of MACs
Similar to Modification Detection Codes, MACs need to have the following properties:
• Compression – the MAC value has a fixed, short length.
• Ease of computation – given key K and any input message M, it must be easy to compute MAC(K, M).
We have seen that MDCs need to have preimage and collision resistance. Preimage and collision
resistance do not apply to MACs, so collision attacks, such as the recent attacks on SHA-1, are
not relevant for MAC functions. For MACs, the required security property is called computation
resistance: we do not want an attacker to be able to ‘forge’ MAC values. To explain this property,
let us assume that the attacker has a large number of ‘message-MAC pairs’. A message-MAC pair
is a message M with its corresponding MAC value MAC(K, M). Computation resistance means that if the attacker is given another message M′, it should not be possible for them to compute MAC(K, M′).
In summary, we compare and contrast the properties of MDCs and MACs.
Properties of MDCs:
• Compression
• Ease of computation
• Preimage resistance
• Collision resistance
Properties of MACs:
• Compression
• Ease of computation
• Computation resistance.
where || means ‘concatenated with’. Some minor implementation details (to do with padding) are omitted from the above; for full details see FIPS PUB 198-1.
The sequence of steps involved in computing the HMAC value is as follows:
1) Concatenate K and M .
We note that in Step 4, the input to the hash function h will be very short. Therefore, although
HMAC uses the hash function twice, the second use does not involve a lot of processing. So,
HMAC is an efficient algorithm. Because collision attacks are not applicable to MAC functions,
it is acceptable to use SHA-1 as the underlying hash function with HMAC. The official advice on
HMAC can be found on the following NIST link http://csrc.nist.gov/groups/ST/toolkit/
secure_hashing.html
It is also possible to construct a MAC function from a block cipher and a block cipher mode. The most common construction is called CBC-MAC. CBC-MAC uses a block cipher, such as DES or AES, operating in CBC mode (see lecture 3). The message M is split into n blocks M1, M2, ..., Mn.
C0 is the fixed (publicly-known) IV, which is specified to be 0000...0000 (all zeroes). MAC value is
the final block Cn .
Note that CBC-MAC involves the entire message in computing MAC value. Therefore, a single
bit change to any part of the message will result in a total change in the MAC value.
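A minimal Python sketch of the CBC-MAC chaining structure follows. The block cipher is again a stand-in PRF built from SHA-256 (a real CBC-MAC would use DES or AES), and the zero padding is a simplification; key and message are invented for the example.

```python
import hashlib

BLOCK = 16

def encrypt_block(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E_K (illustrative only).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    msg += b"\x00" * (-len(msg) % BLOCK)       # naive zero padding
    c = b"\x00" * BLOCK                         # C0: the fixed all-zero IV
    for i in range(0, len(msg), BLOCK):
        block = bytes(a ^ b for a, b in zip(c, msg[i:i + BLOCK]))
        c = encrypt_block(key, block)           # C_i = E_K(M_i XOR C_{i-1})
    return c                                    # the MAC value is the final block C_n

print(cbc_mac(b"k" * 16, b"some message to authenticate").hex())
```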
CBC-MAC has subsequently been replaced by a MAC function called CMAC, which stands for Cipher-based MAC. CMAC is specified in Special Publication 800-38B, ”The CMAC Mode for Authentication”, which can be downloaded from http://csrc.nist.gov/. CMAC is the same as CBC-MAC, but it incorporates an XOR of the last message block Mn with a subkey derived from the key K.
Here K ∗ is a subkey derived from the key K. The subkey XOR is included in CMAC in order
to address some security problems with CBC-MAC (see ”The Handbook of Applied Cryptography”
(Section 9.5.1) for a discussion of CBC-MAC security issues). Effectively, the subkey XOR serves
to ‘mask’ the final message block.
When Alice sends a message M to Bob using CCM, the following steps are performed.
Step 1. CBC-MAC is applied to the message M using the secret key K to give the MAC value MAC(K, M).
Step 2. The MAC value is appended to the message to give M||MAC(K, M).
Step 3. Encrypt M ||MAC(K, M ) using CTR mode to give the ciphertext C.
Once Bob receives a message from Alice, he performs the following steps for decryption:
Step 1. Decrypt the ciphertext C using CTR mode to recover M||MAC(K, M).
Step 2. Check that the MAC value MAC(K, M) is valid. If the MAC is not valid, then the message is rejected. If the MAC is valid, then the message M is returned.
It is worth noting that CCM calculates the MAC first, and then encrypts the message M and the MAC:
EK (M ||MAC(K, M )).
The alternative order would be to encrypt the message first: EK (M ), and then calculate the MAC
of EK (M ). In this case, the MAC would be sent as plaintext.
The order of CCM (authentication then encryption) is deliberately chosen for security reasons. It means that the MAC value is ‘protected’ by the encryption. In a security system, authentication is more important than confidentiality, so it is best to protect the authentication mechanism as much as possible. CCM is specified for use only with algorithms that have a 128-bit block size. Hence, it must be used with AES, and not DES or Triple-DES.
Like CCM, GCM mode is an authenticated encryption mode. It is specified in Special Publication 800-38D, which can be downloaded from http://csrc.nist.gov/. As with CCM, there are two functions: Authenticated Encryption and Authenticated Decryption.
For Authenticated Encryption, Alice produces two outputs: a ciphertext C, which is the same length as the plaintext P, and an authentication tag. For Authenticated Decryption, Bob checks the authentication tag; if the authentication is successful, then the output is the plaintext P. If, however, authentication fails, then the message is rejected.
Here is a summary of Authenticated Encryption: the plaintext P is encrypted using CTR mode and the key K to give ciphertext C. The ciphertext blocks are combined using a key H derived from K. The key H is derived by encrypting the all-zero block using K:
H = EK(0000...0000).
This gives a single 128-bit block, which is then encrypted to give the authentication tag. Finally,
Alice sends the ciphertext and the authentication tag to Bob.
The diagram illustrating the workings of Authenticated Encryption is shown below.
[Diagram: GCM Authenticated Encryption]
The multiplications take place in a Galois Field. The result of multiplying two 128-bit strings
together is another 128-bit string. The authentication tag protects all of the ciphertext blocks.
Note also that, as with CCM, the authentication tag is encrypted in order to protect it during
transmission.
Authenticated Decryption can be summarized as follows: the ciphertext C is decrypted using the key K to give plaintext P. The authentication tag is calculated from the received ciphertext. If the calculated authentication tag matches the received authentication tag, then the plaintext is accepted as genuine. If the authentication tags do not match, the message is rejected.
A further advantage of GCM is that the authentication tag can be calculated at the same time as the plaintext is being encrypted.
In summary, we have discussed five confidentiality modes of operation for block ciphers: ECB, CBC, CFB, OFB and CTR. We have also analysed CMAC, which provides an authentication mode, and CCM and GCM, which provide authenticated encryption modes.
22 Secure Channels
One of the most important notions of cryptography is that of a secure channel. It allows the creation of a secure connection between Alice and Bob, so that they can send discrete messages (such as emails) to each other. We have already seen two fundamental aspects of secure channels:
confidentiality and authentication. The third vital aspect is message numbering, which is
responsible for ensuring that each message must have a unique message number. This implies that
no two messages can have the same number. Furthermore, the message numbers must increase
monotonically, so that a message sent later would have a larger message number.
For example, Alice numbers first message 1, second message 2, etc. Bob will only accept
messages with message numbers that are larger than the message number of the previous message.
With unique monotonically increasing message numbers, Bob can reject replayed messages, and this protects against replay attacks. Bob also knows which messages have been lost in transmission and the order in which the messages were sent.
It is necessary to protect the authenticity of the message numbers, as otherwise, the attacker
could alter the number to a later number. To this end, the message number should be a part of
the input to the authentication mechanism. Thus, a secure channel can be implemented by using
GCM or CCM mode, with the message numbers being protected by the authentication mechanism.
This provides: confidentiality, authentication, and replay protection. These are the three essential
ingredients of a secure channel.
Let us recall the definition of a one-way function f: given x, it is easy to compute f(x), but given f(x), it is computationally infeasible to recover x. In public-key cryptography, the public key is derived from the private key via a one-way function.
This means that public keys can be made public, as the knowledge of the public key does not
reveal private key. So, public keys can, for example, be sent over the Internet without compromising
security.
Let n be a positive integer. We say that a number a is congruent to a number b modulo n if there is some integer k such that a = b + kn. One can write this as
a ≡ b mod n.
To give an example:
5 ≡ 3 mod 2,
11 ≡ 4 mod 7,
6 ≡ 28 mod 11.
Given a non-negative integer a (i.e. a ≥ 0), we can always write a ≡ b mod n, where 0 ≤ b ≤
n − 1. Here, b is called a residue of a modulo n. For example,
7 ≡ 2 mod 5.
The set of numbers {0, 1, ..., n − 1} is called a complete set of residues modulo n. For every
integer a, its residue modulo n is a number from 0 to n − 1. We write
9 mod 7 = 2,
45 mod 6 = 3,
etc.
• Modular addition: (5 + 9) mod 11 = 14 mod 11 = 3.
• Modular multiplication: (5 × 9) mod 11 = 45 mod 11 = 1.
• Modular exponentiation: 5^3 mod 11 = 125 mod 11 = 4.
These operations are illustrated in the short sketch below.
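All three operations are available directly in Python; pow with three arguments performs modular exponentiation efficiently.

```python
print((5 + 9) % 11)    # modular addition: 3
print((5 * 9) % 11)    # modular multiplication: 1
print(pow(5, 3, 11))   # modular exponentiation: 5^3 mod 11 = 4
```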
A prime number is a number greater than 1 whose only factors are 1 and itself. A number that is not prime is called composite. Examples of prime numbers: 2, 3, 17, 2521, 2^756839 − 1, etc. In fact,
there are infinitely many prime numbers. We can also easily give examples of composite numbers:
4, 27, 100, etc., and there are also infinitely many composite numbers. Public-key cryptography
uses prime numbers that are extremely large. These are often taken to be 1024 bits long, sometimes
even larger.
With real numbers, any number x has an inverse, apart from 0. The inverse of x is 1/x. The inverse of a number is the number that you have to multiply it by to get 1:
x × (1/x) = 1 for x ≠ 0.
The same idea applies when using modular arithmetic. For example, take n = 5. First, we note
that the complete set of residues modulo 5 is {0, 1, 2, 3, 4}. The inverse of 3 is the residue that we
have to multiply 3 by to get 1:
3 × x ≡ 1 mod 5.
Trying all possible different residues, we find that x = 2, since
(3 × 2) mod 5 = 6 mod 5 = 1.
As with real numbers, 0 never has an inverse. This is because the equation
0 × x ≡ 1 mod n
has no solutions. Notice that (for any integer n) the inverse of 1 is itself:
1 × 1 ≡ 1 mod n.
Similarly, (n − 1) × (n − 1) = n^2 − 2n + 1 ≡ 1 mod n. Hence, 1 and n − 1 are both self-inverse. For example, if n = 6, the non-zero residues are 1, 2, 3, 4, 5: here 1 and 5 are self-inverse, while 2, 3 and 4 have no inverses modulo 6.
• If n is a composite number, then there exist some non-zero residues that do not have inverses.
• If n is a prime number p, then every non-zero residue has an inverse modulo p. For example, take p = 7. The set of non-zero residues is {1, 2, 3, 4, 5, 6}, and each of them has an inverse (a brute-force check is sketched below).
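The difference between composite and prime moduli can be checked by brute force; the helper function below is ours, written for illustration.

```python
def inverses(n: int):
    # For each non-zero residue x, find an inverse modulo n by trial (if any).
    table = {}
    for x in range(1, n):
        table[x] = next((y for y in range(1, n) if (x * y) % n == 1), None)
    return table

print(inverses(6))  # {1: 1, 2: None, 3: None, 4: None, 5: 5} - composite modulus
print(inverses(7))  # every non-zero residue has an inverse - prime modulus
```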
25 Finite fields
We have seen that for any prime number p, the set of non-zero residues {1, 2, 3, ..., p − 1} all have inverses modulo p. This means that we can divide one residue by another residue modulo p. Dividing by x is the same as multiplying by x^(-1). For example, consider the case p = 7. We have seen that
1^(-1) ≡ 1 mod 7,
2^(-1) ≡ 4 mod 7,
3^(-1) ≡ 5 mod 7,
4^(-1) ≡ 2 mod 7,
5^(-1) ≡ 3 mod 7,
6^(-1) ≡ 6 mod 7.
Therefore, for example, 3 ÷ 2 ≡ 3 × 2^(-1) ≡ 3 × 4 ≡ 12 ≡ 5 mod 7.
For any prime p, the set of residues {0, 1, 2, 3, ..., p − 1} is a field. This means that you can add,
subtract and multiply modulo p. By excluding 0 you can divide as well, because all of the elements
have an inverse. It is called a finite field because it has a finite number of elements. We call this
field GF(p).
Consider the field GF(5). This is the set of elements {0, 1, 2, 3, 4} with the operations +, −, ×, ÷. The set of non-zero elements is {1, 2, 3, 4}. This set is called the multiplicative group of GF(5), which we call GF(5)∗. Let us recall that a group is a set with an operation. In this case, the set is {1, 2, 3, 4}, and the operation is × (multiplication).
• The set is closed under the operation: the product of two non-zero residues is again a non-zero residue.
• The identity element is 1.
• The operation is associative: (a × b) × c = a × (b × c).
• Every element has an inverse.
So, for any prime p, we have a finite field GF(p), which has a multiplicative subgroup {1, 2, 3, ..., p −
1}. Note that there are p elements in the field GF(p), but only p − 1 elements in GF(p)∗ .
Example. Consider GF(5)∗: this is the set {1, 2, 3, 4}. Let us consider all powers of 2:
2^1 ≡ 2 mod 5,
2^2 = 2 × 2 ≡ 4 mod 5,
2^3 = 2 × 2 × 2 = 8 ≡ 3 mod 5,
2^4 = 2 × 2 × 2 × 2 = 16 ≡ 1 mod 5.
We get 2, 4, 3, 1, i.e. by raising 2 to successive powers we can get all elements in the group. In this case, we say that 2 generates the group, or that 2 is a generator of the group. This also means that the group is cyclic.
Similarly, the powers of 3 give every element of the group:
3^1 ≡ 3 mod 5,
3^2 = 9 ≡ 4 mod 5,
3^3 = 27 ≡ 2 mod 5,
3^4 = 3^2 × 3^2 ≡ 4 × 4 ≡ 1 mod 5,
so 3 is also a generator. By contrast, the powers of 4 only produce 4 and 1:
4^1 ≡ 4 mod 5,
4^2 = 16 ≡ 1 mod 5,
so 4 is not a generator.
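These computations are easy to check in Python (the helper function is ours):

```python
def powers(g: int, p: int):
    # List the successive powers g^1, g^2, ..., g^(p-1) modulo p.
    return [pow(g, k, p) for k in range(1, p)]

print(powers(2, 5))  # [2, 4, 3, 1] - 2 generates GF(5)*
print(powers(3, 5))  # [3, 4, 2, 1] - 3 is also a generator
print(powers(4, 5))  # [4, 1, 4, 1] - 4 is not a generator
```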
26 One-way functions I: modular exponentiation
There are two different one-way functions that are used in public-key cryptography: modular exponentiation (whose inverse is the discrete logarithm) and multiplication of two large primes (whose inverse is integer factorisation).
First we consider modular exponentiation in finite fields. Let p be a large prime, and let g be a
generator of GF(p)∗ . This means that the powers of g generate the whole set {1, 2, 3, ..., p − 1}.
The modular exponentiation function is
f : GF(p)∗ → GF(p)∗,
f(x) = g^x mod p.
This function is
• one-to-one (injective),
• onto (surjective).
One of the main requirements of modular exponentiation is that this function should be computationally ‘easy’ to compute on a computer, even if p is very large. In fact, there exist efficient polynomial-time algorithms that will do this. The inverse of the modular exponentiation function is the discrete logarithm function: the discrete logarithm of g^x mod p to the base g is x.
If p is a small prime, then it is possible to invert this function just by constructing a look-up table. However, if p is a very large prime, then it would not be possible to construct this table because of time and storage limitations. For example, in public-key cryptography p is typically 1024 bits long, so there would be about 2^1024 entries in the table, but there are only about 2^170 atoms in the planet.
Therefore, to find discrete logarithms when p is a very large prime, we cannot use a look-up table. At the moment, there are no known polynomial-time algorithms for computing discrete logarithms. This means that finding discrete logarithms is computationally infeasible. The main implication of this is that modular exponentiation is a one-way function if p is a large prime.
The Discrete Logarithm Problem (DLP) is stated as follows: given a prime p, a generator g of GF(p)∗, and an element y of GF(p)∗, find the integer x such that
g^x ≡ y mod p,
where 1 ≤ x ≤ p − 1.
The second one-way function is multiplication of two large primes p and q: computing n = pq is easy. The inverse process is to be given the number n and have to find the two factors p and q. This is called integer factorisation, and at the moment there is no known polynomial-time algorithm that would be able to do this. Therefore, the function f(p, q) = pq is a one-way function.
All of the well-known public-key algorithms are based on one or the other of these problems.
• RSA
• ...
The question now is: How can Alice and Bob ensure that they have the same key? The
problem is that the only means of communication between Alice and Bob is an unsecured channel.
Therefore, it is not possible to simply send the key, because the key must be kept secret. This is
known as the key distribution problem. It is not possible to solve this problem using symmetric-
key technologies, however, there are solutions using public-key cryptography.
One solution is called Diffie-Hellman key agreement. It was the first public-key system ever to be publicised; it appeared in ”New Directions in Cryptography”, published in 1976. This was a landmark paper, as it introduced the whole idea of public-key cryptography. Diffie-Hellman key
agreement enables Alice and Bob to agree a shared secret key over an unsecured channel. The main
idea is that Alice and Bob send each other messages over the unsecured channel, and then these
messages are used to establish the shared key. Diffie-Hellman key agreement is, therefore, a protocol
rather than an algorithm.
The Diffie-Hellman protocol starts with a Setup stage. Before the protocol starts, Alice and
Bob agree a large prime p and a generator g of GF(p)∗ . The number p is usually taken to be at
least 1024 bits long in order to achieve security. There are efficient algorithms for finding large
prime numbers, and also, given a large prime p, there are efficient algorithms for finding generators
g.
In the Diffie-Hellman key agreement protocol, the numbers p and g are public parameters;
they do not have to be kept secret. Therefore, Alice and Bob can agree these parameters over the
unsecured channel. Setting up the parameters p and g is a one-time operation; subsequently, the
same p and g can be used many times in order to generate different shared secret keys.
The next stages of the Diffie-Hellman key agreement have the following protocol actions:
• Alice generates a random integer x, where x < p. This is her private key.
• Alice calculates X = g^x mod p. This is her public key.
Bob performs exactly the same steps: he generates a random integer y, where y < p — this is his private key. Then, Bob calculates Y = g^y mod p — this is his public key.
The keys x, y, X, Y are all positive integers less than p. In fact, X could even be smaller than x, since
X = g^x mod p,
and similarly, Y could be smaller than y.
Alice and Bob now send each other their public keys. Alice raises Bob’s public key Y to the power of her private key x:
K = Y^x = (g^y)^x = g^xy mod p.
Similarly, Bob raises Alice’s public key X to the power of his private key y:
K = X^y = (g^x)^y = g^xy mod p.
Alice and Bob now have the same number K = g xy mod p. K will be about 1024 bits long (if p is
that long). By selecting a portion of K (say, the first 128 bits) Alice and Bob now have a shared
secret key that they can use with a symmetric encryption algorithm.
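Before turning to a small worked example, here is a compact sketch of the protocol in Python. The prime 2^127 − 1 (a Mersenne prime) and the value g = 3 are chosen purely for illustration; in particular, g is not verified to be a generator here, and a real implementation would use standardised parameters of at least 1024 bits.

```python
import secrets

p = 2**127 - 1        # a Mersenne prime, used purely for illustration
g = 3                 # assumed generator (not verified; illustration only)

x = secrets.randbelow(p - 2) + 1   # Alice's private key
y = secrets.randbelow(p - 2) + 1   # Bob's private key

X = pow(g, x, p)      # Alice's public key, sent to Bob
Y = pow(g, y, p)      # Bob's public key, sent to Alice

# Each side combines the other's public key with its own private key.
assert pow(Y, x, p) == pow(X, y, p)   # both obtain g^(xy) mod p
```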
As an example, let us consider p = 7 and g = 3. First of all, we raise g to different powers:
3^1 = 3 mod 7,
3^2 = 9 = 2 mod 7,
3^3 = 3^2 × 3 = 2 × 3 = 6 mod 7,
3^4 = 3^2 × 3^2 = 2 × 2 = 4 mod 7,
3^5 = 3^2 × 3^3 = 2 × 6 = 12 = 5 mod 7,
3^6 = 3^2 × 3^2 × 3^2 = 2 × 2 × 2 = 8 = 1 mod 7.
This means that 3 generates all of the numbers less than 7, and therefore, 3 is a generator.
As a first step of the protocol, Alice and Bob generate their private keys — these are just
random numbers less than 7. For example,
Alice’s private key: 3.
Bob’s private key: 4.
They now calculate their public keys using the generator 3:
Alice’s public key = 3^3 = 6 mod 7,
Bob’s public key = 3^4 = 4 mod 7.
They then exchange their public keys.
Finally, they calculate the shared session key using the other one’s public key and their own private key:
Alice: 4^3 = 4^2 × 4 = 16 × 4 = 2 × 4 = 8 = 1 mod 7,
Bob: 6^4 = 6^2 × 6^2 = 36 × 36 = 1 × 1 = 1 mod 7.
In the intruder-in-the-middle attack, Eve substitutes these public keys with her own spoof values X′ = g^x′ mod p and Y′ = g^y′ mod p, where x′ and y′ are private keys chosen by Eve.
Alice believes that Y′ is Bob’s public key. Hence, Alice computes
(g^y′)^x = g^xy′ mod p.
Bob believes that X′ is Alice’s public key, and therefore, he computes
(g^x′)^y = g^x′y mod p.
Alice and Bob now think that they have agreed the same number. In fact, they have different numbers: Alice has g^xy′ mod p, and Bob has g^x′y mod p. Eve has both keys.
Alice will now encrypt using the key g^xy′ mod p, which is the key she thinks she shares with Bob. In reality, Eve can decrypt the message and read it. Eve can then encrypt the message again using the key g^x′y mod p (the key that Bob has) and send the encrypted ciphertext to Bob. Bob will then decrypt the message, and both Alice and Bob will not be aware that the confidentiality of the message has been compromised.
The intruder-in-the-middle attack is a good example of why one has to be very careful when
designing cryptographic protocols, as it is very easy to overlook potential attacks. The attack
was not discovered until a number of years after the Diffie-Hellman key agreement protocol was
published.
The intruder-in-the-middle attack is only possible if Eve is able to interfere with and alter the public keys that Alice and Bob send to each other, i.e. it is an active attack. In order to avoid
the attack, we need to have a mechanism that will provide authentication. Alice and Bob need to
know that the public keys they receive are authentic, i.e. that they actually came from each other
and not from Eve. The basic Diffie-Hellman protocol that we have looked at is unauthenticated
since there is no authentication mechanism when the public keys are exchanged. At the same time,
we might want to require authenticated key agreement. The mechanism that is most commonly
used for providing the authentication is called a public-key infrastructure (PKI). In a PKI, the
authentication is provided by a ‘trusted third party’ called a Certificate Authority (CA). The
CA signs Alice and Bob’s public keys using a digital signature algorithm. Alice and Bob both trust
the CA, and therefore, the CA’s digital signature provides the authentication that Alice and Bob
require.
In terms of cryptography, the authentication mechanism is usually provided by a digital signature: the CA digitally signs Alice and Bob’s public keys. An example of a DLP-based cryptographic system is the Digital Signature Algorithm (DSA). This signature algorithm is often used in practical implementations of PKIs.
For signature generation, the signature S is created from the message M using the private key (in this respect a digital signature resembles a hand-written signature): only the person who has possession of the private key is able to create the signature.
For Signature verification, the algorithm checks that the signature S is correct for the message
M . The signature is verified using the public key. It does not matter from a security point of
view how many people verify a signature. This means that a user can make their public key freely
available. Upon checking, the algorithm returns ‘accepted signature’ or ‘rejected signature’.
Signature S is always the same length for any message M . DSA signature is 40 bytes (= 320
bits) long. There is a random input to the signature generation process. This means that if a
message M is signed many times using the same key, the signatures will all be different.
We need to be able to sign messages of any length. Therefore, before we create the signature, we first find the hash-value of the message, using SHA-1. The 160-bit hash-value ‘represents’ the original message. The DSA then signs the hash-value, not the message itself.
Signature generation takes the hash-value and the private key (together with a random input), and outputs the signature S.
The mathematical details of DSA are slightly complicated, see ”The Handbook”, Chapter 11,
for full details. Important points to remember:
• DSA relies on the Discrete Logarithm Problem for security
• DSA signs a SHA-1 hash-value of the message, not the message itself
• DSA signatures are 320 bits long
• DSA signatures for the same message and using the same key are different (because of the random
input).
Because DSA signs a SHA-1 hash value rather than the message itself we need SHA-1 to be
collision resistant. If SHA-1 is not collision resistant, then two colliding messages M1 and M2 would
be signed with the same signature:
This has potentially very serious consequences, particularly, in financial applications. Consider
the following messages:
If these messages collide, then Bob could claim that Alice signed M2 , when in fact she signed M1 .
This is why the recent cryptanalysis relating to SHA-1 is such a concern. NIST have said that SHA-
1 is not to be used for digital signatures after 2010. SHA-2 algorithms should be used with DSA
instead, see http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html. SHA-1 is still
considered secure for other applications: as a component of HMAC, for key derivation applications,
as a random number generator, etc.
We have already discussed the first three of these objectives (confidentiality, data integrity and authentication), and we have already seen examples that show why they are important. Non-repudiation prevents someone from denying a previous action or
commitment. For example, suppose that Alice has bought something from Bob on the Internet.
Alice should not be able to deny at a later date that she authorised the transaction.
Digital signatures are the primary cryptographic tool for providing non-repudiation. The signer
is prevented from signing a document and subsequently being able to deny having done so. The
reason for this is that the private key is used for signing and there is only one copy of the private
key. This means that nobody else is able to create the signer’s signature. The signer cannot claim
that someone else created the signature.
In many respects Message Authentication Codes (MACs) provide the same cryptographic func-
tion as digital signatures:
• MACs are symmetric-key cryptography,
• Digital signatures are public-key cryptography.
Both MACs and digital signatures are used for providing authentication. However, MACs
cannot provide non-repudiation, as there are two copies of the secret key. MACs are symmetric-
key, so both Alice and Bob have a copy of the secret key. Alice could ‘sign’ a document by creating
a MAC value, but she could claim at a later date that in fact Bob signed the document using his
copy of the secret key. In this respect, the fact that digital signatures provide non-repudiation is a
primary advantage of public-key cryptography.
32 RSA encryption
RSA encryption and the RSA signature scheme are based on the Integer Factorisation Problem (IFP). RSA was invented by Rivest, Shamir and Adleman in 1977, and was published in 1978. It is now known that it was actually discovered by Clifford Cocks at GCHQ in 1973. RSA is quite versatile: it can be used both for encryption and for digital signatures.
The security of RSA relies on the Integer Factorisation Problem (IFP): given a number n that is the product of two large primes p and q, find p and q.
The greatest common divisor gcd(a, b) of two numbers can be found with the Euclidean algorithm: repeatedly divide and take remainders until the remainder is 0; the last non-zero remainder is the gcd. For example,
2406 = 3 × 654 + 444,
654 = 1 × 444 + 210,
444 = 2 × 210 + 24,
210 = 8 × 24 + 18,
24 = 1 × 18 + 6,
18 = 3 × 6 + 0.
This means that 6 is the largest number that divides 18, 24, 210, 444, 654 and 2406. In other words, gcd(654, 2406) = 6.
If the final remainder is 1, this means that the two numbers are relatively prime. For example,
gcd(1109, 4999) = 1, because
4999 = 4 × 1109 + 563,
1109 = 1 × 563 + 546,
563 = 1 × 546 + 17,
546 = 32 × 17 + 2,
17 = 8 × 2 + 1.
If the final remainder is 1 (i.e. the numbers are relatively prime), then we can find the inverse
of the small number modulo the large number by ‘working backwards’.
As an example, let us find the inverse of 1109 modulo 4999:
1 = 17 − 8 × 2,
= 17 − 8 × (546 − 32 × 17) = 257 × 17 − 8 × 546,
= 257 × (563 − 1 × 546) − 8 × 546 = −265 × 546 + 257 × 563,
= −265 × (1109 − 1 × 563) + 257 × 563,
= 522 × 563 − 265 × 1109,
= 522 × (4999 − 4 × 1109) − 265 × 1109,
= 522 × 4999 − 2353 × 1109.
Therefore, 522 × 4999 − 2353 × 1109 = 1, i.e. −2353 × 1109 ≡ 1 mod 4999, so −2353 ≡ 2646 mod 4999 is the inverse of 1109 modulo 4999.
In RSA, we can use the Euclidean algorithm for two purposes:
• When we choose the encryption exponent e, we can check whether gcd(e, (p − 1)(q − 1)) = 1. If
so, then e is a valid choice for the encryption exponent. If not, then we need to choose a different
value for e.
• Having chosen a valid encryption exponent e, we can use the Euclidean algorithm, working backwards, to find the corresponding decryption exponent d = e^(-1) mod (p − 1)(q − 1). A short sketch of this computation follows.
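Here is the extended Euclidean algorithm in Python, applied to the example above; the function name is ours, and from Python 3.8 onwards the built-in pow also computes modular inverses directly.

```python
def ext_gcd(a: int, b: int):
    # Returns (g, s, t) with g = gcd(a, b) and s*a + t*b = g.
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

g, s, t = ext_gcd(1109, 4999)
print(g, s, t)               # 1 -2353 522: so -2353*1109 + 522*4999 = 1
print((-2353) % 4999)        # 2646, the inverse of 1109 modulo 4999
print(pow(1109, -1, 4999))   # 2646 again, via the built-in (Python 3.8+)
```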
Example:
p = 47, q = 59, n = 2773, (p − 1)(q − 1) = 2668.
Randomly choose e = 157. For this to be a valid choice for e, we need to show that gcd(157, 2668) = 1:
2668 = 16 × 157 + 156,
157 = 1 × 156 + 1.
Since the final remainder is 1, this shows that 157 and 2668 are relatively prime, so e = 157 is a
valid choice.
We can then calculate d by working backwards:
1 = 157 − 1 × 156,
= 157 − (2668 − 16 × 157),
= (−1) × 2668 + 17 × 157.
Therefore, the decryption exponent is d = 17, since 17 × 157 ≡ 1 mod 2668.
To send Bob an encrypted message, Alice encrypts it using Bob’s public key. So, the first step is for Alice to obtain Bob’s public key (n, e). The message is represented as an integer m, where 0 ≤ m ≤ n − 1. Unless the message is very short, this means that the message will have to be divided into blocks, where each block is an integer smaller than n.
There will be a sequence of message blocks m1 , m2 , m3 , ... , and a corresponding sequence of
ciphertext blocks c1 , c2 , c3 , ....
Alice computes ci using Bob’s public key (n, e) and the following formula:
ci = mi^e mod n.
Encryption is, therefore, an exponentiation modulo n, using the encryption exponent e. Since the ciphertext block ci is another integer less than n, it will be roughly the same length as the input message block mi.
Having encrypted m1 , m2 , m3 , ... using Bob’s public key pair (n, e), Alice sends the ciphertext
blocks c1, c2, c3, ... to Bob. Bob decrypts ci using the following formula:
mi = ci^d mod n,
where d is Bob’s private key, the decryption exponent. It is noteworthy that the formulae for encryption and decryption have the same form:
ci = mi^e mod n,
mi = ci^d mod n.
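Putting the pieces together with the numbers from the example above (m = 920 is an arbitrary message block chosen for illustration; pow(e, -1, phi) needs Python 3.8+):

```python
p, q, e = 47, 59, 157
n, phi = p * q, (p - 1) * (q - 1)   # n = 2773, (p-1)(q-1) = 2668
d = pow(e, -1, phi)                  # decryption exponent: 17, as derived above

m = 920                              # message block, 0 <= m <= n-1 (invented)
c = pow(m, e, n)                     # encryption: c = m^e mod n
assert pow(c, d, n) == m             # decryption: m = c^d mod n
print(c, d)
```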
Since the public key is (n, e), the attacker knows what the modulus n is. If the attacker were
able to factorise n, then they would know p and q, and this would allow them to calculate the
decryption exponent d, since
ed ≡ 1 mod (p − 1)(q − 1).
Therefore, the security of RSA relies on the difficulty of the Integer Factorisation Problem.
In order to resist factoring attacks, we need to ensure that we use primes p and q that are large
enough. Larger primes p and q give better security. 1024 bit primes give a 2048 bit modulus n;
using a modulus of this size is very secure.
In terms of implementation, encryption and decryption with RSA are exponentiations modulo
n. There are polynomial-time algorithms for performing exponentiation modulo n. Encryption can be made more efficient by choosing an encryption exponent e that is small; a common choice is e = 3. One has to check that e and (p − 1)(q − 1) are relatively prime; if they are not, one can try e = 17 or e = 65537 — these values of e have binary representations that contain only two 1s, which leads to efficient exponentiation implementations.
It is noteworthy that although we might choose a ‘special’ encryption exponent e, the decryp-
tion exponent d will not have the same property. From a practical perspective, this means that
decryption will take longer than encryption.
Even using a special encryption exponent such as e = 3, the performance of RSA is still very slow compared to a block cipher algorithm such as DES or AES. For this reason, RSA is never used for encrypting large amounts of data.
In practice, Alice could randomly generate a key K for use with a symmetric block cipher
algorithm (such as AES). She would then obtain Bob’s RSA public key (this can be transferred over
the unsecured channel since it is a public key), encrypt the key K using Bob’s RSA public key, and
send this to Bob. Bob can then decrypt K using his RSA private key. This is called key transport:
one party (Alice) creates the secret key and transfers it to the other party (Bob).
We often use public key cryptography for this purpose. Public-key cryptography is used for
establishing a ‘session key’ that can then be used with a fast symmetric algorithm. It is not possible
to establish session keys using purely symmetric techniques. We need to be able to transfer public
information over the unsecured channel. This is why public-key cryptography is so powerful — it
provides tools for performing various actions that cannot be achieved with symmetric techniques.
By now, we already know two different methods of key establishment:
• Key agreement: (e.g. Diffie-Hellman) is where the secret key is derived from information sup-
plied by both parties.
• Key transport (e.g. using RSA) is where one party creates the secret key and transfers it to
the other party.
Key agreement and key transport are fundamental cryptographic techniques used for producing
a secret key that is shared between two users.
As it has already been mentioned, RSA is very versatile as it can be used both as an encryption
algorithm, and as a signature algorithm. As with any public-key algorithm, each user has their own
public/private key pair: private key is held securely by the user, and the public key can be made
public. In the context of signature algorithms, private key is used for signature generation, and
public key is used for signature verification.
Key generation is exactly the same as for RSA encryption:
If an attacker could factorise the modulus n, they could compute (p − 1)(q − 1) and use this to find the private exponent d using the Euclidean algorithm. This would allow the attacker to forge signatures.
33 Random numbers
Random numbers are needed throughout cryptography, for example:
• Generating secret keys for symmetric-key algorithms, e.g. the master secret (MS) in SSL has to be randomly generated.
There are two types of random number generator:
• ‘True’ random number generators — some unpredictable source that produces entropy. Physical sources: sound from a microphone, sampled thermal noise from a resistor, etc. Software-based sources: system clock, elapsed time between keystrokes or mouse movement, etc.
• Pseudorandom number generators (PRNGs) — deterministic algorithms that expand a short random seed into a long ‘random-looking’ sequence.
These two types are often referred to as non-deterministic and deterministic RNGs respectively.
Usually, non-deterministic RNG provides a ‘seed’ for a deterministic PRNG. The PRNG serves
to ‘de-skew’ any bias in the seed:
[Diagram: a non-deterministic RNG seeding a deterministic PRNG]
Deterministic PRNGs are often hash functions or block cipher algorithms; in fact, SHA-1 is a
commonly used PRNG. Some PRNGs generate a long pseudorandom sequence from the seed. The
sequence will be periodic (i.e. it will start to repeat after a certain point), and the sequence will
only contain as much entropy as the input seed.
Designing good RNGs is very difficult. For this reason, the RNG is frequently the weakest link
in the cryptographic system (recall the Netscape example). The RNG can be an attractive target
for an attacker, as attackers always attack the weakest link.
We have seen that large prime numbers are an essential part of public-key cryptography: both
DLP and IFP based systems use large primes. If it were computationally infeasible to generate
primes of 1024 bits (or longer), then we would not be able to implement public-key cryptosystems.
Therefore, we need an efficient method for testing whether a randomly generated large number is
prime or composite (i.e. not prime).
The best and most commonly used method is the Miller-Rabin test, which is extremely efficient even with very large numbers. The Miller-Rabin test is a probabilistic primality test. It is called probabilistic because it does not return a completely definite result. We test a large odd integer n: if n is prime, then the test will always correctly say that the number is prime. However, if n is composite, then the test might wrongly say that the number is prime.
This might not sound very useful, as one needs to have confidence that we are using genuine prime numbers. However, we can make the error probability as small as we like. The probability that the test declares a composite number to be prime is less than 0.25. By repeating the test t times, the probability becomes less than 0.25^t. For instance, by performing the test 5 times, the probability that the algorithm returns an incorrect result is less than 0.25^5, which is less than 1 in a 1000.
Usually, prime generation is an ‘offline’ operation: primes are usually generated during the
parameter set-up stage (recall, for example, both Diffie-Hellman and RSA schemes). Therefore, we
can usually afford to run the Miller-Rabin test with a large number of iterations. With 50 iterations the test would take several minutes, but the probability that it would return a wrong result would be less than 2^(-100).
The standard n-bit prime generation method works as follows:
• Generate a random n-bit candidate p.
• Set the high-order and low-order bits of p to 1: the high-order bit ensures p is of the required length, and the low-order bit ensures p is odd.
• Check that p is not divisible by any prime less than 2000: this eliminates most odd composite numbers.
• Run the Miller-Rabin test on p; if it fails, start again with a new candidate. A sketch of the test itself follows this list.
34 Efficient exponentiation
In public-key cryptography we need to perform modular exponentiation operations. In the case of DLP-based schemes (e.g. Diffie-Hellman), we need to calculate g^x mod p. For the IFP-based algorithms (e.g. RSA), encryption is C = M^e mod n and decryption is M = C^d mod n. So we need an efficient algorithm to perform these exponentiations.
To compute a^b appears to require b − 1 multiplications, because
a^b = a × a × a × · · · × a.
Since in cryptographic applications the exponent b is likely to be a very large number, this naive method would be very time-consuming. A much better method is to use the square-and-multiply algorithm. For example, we can compute a^32 using just 5 multiplications, rather than 31:
a^2 = a × a,
a^4 = a^2 × a^2,
a^8 = a^4 × a^4,
a^16 = a^8 × a^8,
a^32 = a^16 × a^16.
Other exponents are handled by multiplying together the squarings that correspond to the 1-bits of the exponent; for example, since 43 = 32 + 8 + 2 + 1,
a^43 = a^32 × a^8 × a^2 × a.
In general, to compute ab using the square-and-multiply algorithm requires fewer than 2 log2 b
multiplications. This number is proportional to the number of bits required to represent b rather
than to b itself. This means that the method is efficient even with very large numbers.
It is important to remember that in cryptographic applications the exponentiations will be
modular. This means that after each multiplication, the remainder must be found. For example,
to compute 3^13 mod 7 using the square-and-multiply algorithm:
3^2 = 9 = 2,
3^4 = 3^2 × 3^2 = 2 × 2 = 4,
3^8 = 3^4 × 3^4 = 4 × 4 = 16 = 2.
Therefore,
3^13 = 3^8 × 3^4 × 3 = 2 × 4 × 3 = 24 = 3.
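The same algorithm in Python, processing the exponent bit by bit; Python’s built-in pow uses an equivalent method internally.

```python
def square_and_multiply(a: int, b: int, n: int) -> int:
    # Compute a^b mod n, reducing after every multiplication.
    result = 1
    a %= n
    while b > 0:
        if b & 1:                   # multiply in a^(2^i) when bit i of b is set
            result = (result * a) % n
        a = (a * a) % n             # square
        b >>= 1
    return result

print(square_and_multiply(3, 13, 7))   # 3, matching the worked example
print(pow(3, 13, 7))                   # the built-in gives the same answer
```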
FIPS PUBs can be downloaded from http://csrc.nist.gov. FIPS PUB 140-2 is called ‘Security Requirements for Cryptographic Modules’. It is the standard used for evaluating cryptographic modules that are to be used by the US Federal government, and it is a ‘public-domain’ standard, unlike national government approval schemes, which tend to be classified.
In the US, the information is either: Classified — subject to control under government legis-
lation, or Unclassified. Within the ‘Unclassified’ category, some information is still regarded as
‘sensitive’, and is termed ‘Sensitive But Unclassified’ (SBU). FIPS 140 validation is mandated in
the US for SBU. This is what the standard and accompanying validation program were developed
for. But FIPS 140 is one of the very few recognised and respected standards against which the
cryptographic security of products may be validated. So, although it is written by the US, for the
US, its applicability is much wider than just the US. The European cryptographic security market,
in particular, often requires FIPS.
The Cryptographic Module Validation Program (CMVP) defines the process by which a ‘module’ (product) can be ‘validated’ (certified/evaluated) against FIPS 140. It is managed by NIST and the CSE (Communications Security Establishment).
A product needs to pass all of the requirements in FIPS PUB 140-2 in order to receive a
certificate. There are four levels: Level 1 is the easiest, Level 4 is the hardest. When a module
passes an evaluation, it receives a certificate. The certificate is posted on the CMVP website; all
module certificates can be viewed here: http://csrc.nist.gov/groups/STM/cmvp.
The source for all FIPS 140 documentation is http://csrc.nist.gov. In addition to the FIPS
PUBs, there is a lot of useful security information on this website, see the following link, for exam-
ple, http://csrc.nist.gov/groups/ST/toolkit/index.html. The main document is FIPS PUB
140-2. It is divided into 11 areas:
74
Approved hash functions:
• SHA-1
• The SHA-2 family (SHA-224, SHA-256, SHA-384, SHA-512)
Approved MAC algorithms:
• HMAC
• CMAC
Approved modes of operation:
• ECB
• CBC
• CFB
• OFB
• CTR
• CMAC
• CCM
• GCM
Approved digital signature algorithms:
• DSA
• RSA
• ECDSA (Elliptic Curve DSA)
Approved random number generators — Deterministic, built from:
• SHA-1
• Triple-DES
• AES
Approved key establishment schemes — DLP-based:
• Diffie-Hellman
• MQV
IFP-based:
• RSA key transport
Note: in fact, RSA key transport is not an Approved method, as it has not yet been put into a
standard by NIST. However, NIST ‘allow’ it to be used.
All of these FIPS Approved algorithms are specified either in other FIPS PUBs or in Special Pub-
lications (SPs):
• SP 800-67 — Triple-DES
• etc.
All these standards are downloadable from http://csrc.nist.gov. A new version of FIPS 140
standard is currently under development - this will be FIPS PUB 140-3. FIPS PUB 140-2 is a good
standard, and, in fact, the most useful cryptographic standard in the public domain. FIPS 140-2
validation is a well-defined and well-managed process.
An alternative standard is CAPS — ‘CESG Assisted Products Service’. CESG is the Information Assurance branch of GCHQ (Government Communications Headquarters), based in Cheltenham. CESG evaluates cryptographic products that are to be used by HMG.
All countries have their own schemes for evaluation of cryptographic products. In the UK it is
CAPS; in the US, evaluation of cryptographic products for government/military use is conducted
by the National Security Agency (NSA). Unlike FIPS, CAPS does not publish a public standard. At the same time, CAPS does issue guidance documents that specify (for example) which cryptographic algorithms are acceptable; however, all guidance is classified, and the same applies to all national approval schemes, not just CAPS.
We have also seen that there are both symmetric-key and public-key mechanisms that can be
used to provide authentication: MACs (symmetric-key) or Digital signatures (public-key).
The question we pose now is: Which type of cryptography should we use? In order to answer
this question properly, we should note that both symmetric-key and public-key cryptography have
various advantages and disadvantages, with symmetric-key algorithms being more suitable in cer-
tain circumstances, and vice versa. In practice, therefore, we use a mixture of the two different
types of cryptography.
Let us begin the comparison by considering specific advantages of symmetric-key cryptography:
1. Speed and efficiency
Symmetric-key encryption algorithms (stream cipher and block ciphers) are designed to have
high rates of data throughput. For example, hardware implementations of AES can achieve data
throughput greater than 1 Gbps (109 bits per second). By comparison, public-key encryption meth-
ods, such as RSA are considerably slower. The main reason for this is that public-key cryptography
(both DLP and IFP based) uses modular exponentiation of very large numbers, which is a time-
consuming operation as it involves repeated modular multiplication of large numbers. This means
that we always use symmetric-key encryption for bulk encryption of data, since public-key encryp-
tion is too inefficient to be used for encrypting large amounts of data. Similarly, MAC functions
are faster to compute than digital signature algorithms, and in addition, digital signatures tend to be longer than MAC values (e.g. a DSA signature is 320 bits, compared to 160 bits for a MAC value computed using HMAC with SHA-1).
2. Shorter keys
Key sizes for symmetric-key ciphers are much shorter than for public-key algorithms. The main
reason for this is that the best known attack on a good symmetric-key encryption algorithm is an
exhaustive search. However, there are some DLP and factoring techniques that mean that we need
public-key key sizes to be very large to achieve the same security. For example, a 256-bit AES key is considered equivalent (in terms of security) to RSA using a 15,360-bit modulus n. Here ‘equivalent’ means that it would take an attacker roughly the same amount of time to discover the key. Key sizes are compared in Special Publication 800-57, Part 1, ‘Recommendation for Key Management’, http://csrc.nist.gov/publications/.
Shorter keys are also easier to store and transmit, which makes them better suited for constrained environments such as smartcards.
3. Perceived security
Symmetric-key algorithms are perceived to be secure. For example, the best known attack on
AES is exhaustive key search. Therefore, by using large keys (at least 128 bit), symmetric-key algo-
rithms appear to be secure. By contrast, public-key cryptography relies on the assumed difficulty
of mathematical problems (DLP and IFP). It appears to be impossible to prove that public-key
cryptography is secure — we just have to hope that it is.
Let us now consider the advantages of public-key cryptography. Public-key cryptography can perform functions that cannot be performed with symmetric-key
techniques, such as Key establishment over an unsecured channel (Diffie-Hellman key agreement,
RSA key transport), Non-repudiation, Digital signatures (DSA, RSA), True (single-source) data
origin authentication (DSA, RSA).
With public-key cryptography, we can transmit the public key over an unsecured channel with-
out compromising security. Although we need to make sure that the authenticity of public keys is
maintained, we do not have to worry about confidentiality of the public key. Authentication of the public key is important, as otherwise an attacker can mount an intruder-in-the-middle attack.
So, if Alice needs to encrypt data to send to Bob using a public-key encryption algorithm such as
RSA, she just needs to obtain Bob’s authentic public key. Using a symmetric-key encryption algo-
rithm, Alice and Bob would need to establish a shared secret key using a key establishment protocol.
The fact that public keys do not need to be kept secret results in simplified key management
using public-key cryptography. As an example, consider a network with 6 computers. We want all
6 computers to be able to encrypt information for each other.
With public-key cryptography, each computer generates a public/private key pair; as a result, 6 public keys are made publicly available. If computer A wants to send an encrypted message to computer B, A obtains B’s public key and encrypts the data, which B can then decrypt using B’s private key. Therefore, the number of public/private key pairs is just the number of computers on the network (6 in our example).
However, using symmetric-key encryption, each computer would need to establish a unique key with every other computer, i.e. n(n − 1)/2 keys for n computers. The consequence is that even in this small network, 6 × 5/2 = 15 session keys would need to be established.
In practice, as we have seen, public-key encryption is too slow to be used for bulk encryption of data, but it can be used as a key establishment mechanism for distributing secret keys for use with a symmetric-key algorithm, such as AES. By doing this, we can simplify key management considerably.
Symmetric keys need to be changed on a regular basis to ensure security; typically, the key used
with a symmetric-key algorithm such as AES would be updated daily. However, public-key keys
have much longer lifetimes, usually several years.
The main conclusion that can be drawn from the above analysis is that symmetric-key and
public-key cryptography have complementary advantages: symmetric-key cryptography is very
efficient and uses relatively small keys, while public-key cryptography provides functionality that
symmetric-key cryptography cannot provide, and also facilitates simple key management.
Consequently, many applications use hybrid systems that exploit the advantages of both types
of cryptography. In particular, we use public-key techniques to establish a session key over an
unsecured channel that can be used by a symmetric encryption algorithm. The public/private key
pair can be used to generate multiple session keys, because of the long lifetime of asymmetric keys.
Subsequently we benefit from the high throughput of the symmetric algorithm.
It is noteworthy that in this example of a hybrid scheme, the time spent on public-key cryptography
for key establishment is a small fraction of the time that will be spent encrypting data
using the symmetric-key encryption algorithm: perhaps seconds for the key establishment, but
hours for the data encryption. By limiting the use of public-key cryptography to a minimum, we
get maximum benefit from the efficient performance of symmetric-key algorithms. The majority
of practical applications of cryptography tend to be hybrid schemes of this type, a good example
being the Secure Sockets Layer (SSL) protocol.
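To make the structure of such a hybrid scheme concrete, here is a minimal Python sketch. It
assumes toy-sized Diffie-Hellman parameters and uses a SHA-256 keystream as a stand-in for a real
symmetric cipher such as AES; nothing here is secure as written, it only shows how the pieces fit
together:

    from hashlib import sha256

    p, g = 467, 2                       # public Diffie-Hellman parameters (toy-sized)
    a, b = 123, 57                      # Alice's and Bob's secret exponents

    A, B = pow(g, a, p), pow(g, b, p)   # public values sent over the open channel
    k_alice = pow(B, a, p)              # both sides derive the same session key,
    k_bob = pow(A, b, p)                # namely g^(ab) mod p
    assert k_alice == k_bob

    def keystream_xor(key: int, data: bytes) -> bytes:
        # Toy symmetric cipher: XOR with a hash-derived keystream (stand-in for AES)
        stream = bytearray()
        counter = 0
        while len(stream) < len(data):
            stream += sha256(f"{key}:{counter}".encode()).digest()
            counter += 1
        return bytes(d ^ s for d, s in zip(data, stream))

    message = b"bulk data encrypted under the symmetric session key"
    ciphertext = keystream_xor(k_alice, message)
    assert keystream_xor(k_bob, ciphertext) == message

The expensive modular exponentiations happen only once, at session set-up; every subsequent byte
is protected by the cheap symmetric operation.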
It was discussed earlier that the basic Diffie-Hellman key exchange is not secure because of the
intruder-in-the-middle attack: Eve replaces the exchanged public keys with her own keys and then
establishes keys with both Alice and Bob, which leads to a total compromise of security. The reason
why this attack is possible is that the basic version of Diffie-Hellman key exchange is unauthenticated.
In order to avoid the intruder-in-the-middle attack, we need an authentication mechanism; in other
words, Alice and Bob need to authenticate the public keys when they receive them.
A possible solution to this problem is as follows: Alice and Bob sign their Diffie-Hellman public
keys before they exchange them (using DSA, for example), and send the signature with the Diffie-
Hellman public key.
Alice and Bob can then both verify that the signatures are correct, and these correct signatures
would provide Alice and Bob with data origin authentication. If a signature is not correct, then
this means that the Diffie-Hellman public key has been altered. In this case, the protocol would be
aborted. However, in order to verify the signature on Bob's Diffie-Hellman public key, Alice would
need to have Bob’s DSA public key (recall that with signature algorithms, the private key is used
for signing and the public key is used for verification). Similarly, in order to verify the signature on
Alice’s Diffie-Hellman public key, Bob would need to have Alice’s DSA public key.
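The sign-then-verify step can be sketched in a few lines of Python. We use textbook RSA with
tiny numbers in place of DSA purely for brevity; the numbers and variable names are our own, and
the scheme is not secure as written:

    from hashlib import sha256

    p, g = 467, 2                       # public Diffie-Hellman parameters (toy-sized)
    n_A, e_A, d_A = 3233, 17, 2753      # Alice's toy RSA signing key (n = 61 * 53)

    def h(m: bytes, n: int) -> int:
        # Hash the message down to an integer modulo n
        return int.from_bytes(sha256(m).digest(), "big") % n

    a = 123                             # Alice's secret exponent
    A = pow(g, a, p)                    # Alice's Diffie-Hellman public key

    # Alice signs her Diffie-Hellman public key with her private signing key
    sig_A = pow(h(str(A).encode(), n_A), d_A, n_A)

    # Bob, who holds Alice's authentic verification key (e_A, n_A),
    # checks the signature before accepting A
    assert pow(sig_A, e_A, n_A) == h(str(A).encode(), n_A)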
Alice and Bob cannot simply send their DSA public keys over the unsecured channel, as this
would again be vulnerable to an intruder-in-the-middle attack: Eve could replace their DSA public
keys with her own DSA public key. Hence, we need a further authentication mechanism to protect
the DSA public keys when they are exchanged.
To solve this problem, one can use a Public-Key Infrastructure (PKI). The PKI allows Alice
to be confident that the public key she has received really belongs to Bob, and allows Bob to be
confident that the public key he has received really belongs to Alice; in other words, the PKI ensures
the authenticity of public keys. In a PKI, there is a central authority called the Certificate
Authority (CA), sometimes called a 'Trusted Third Party'.
The CA has its own public/private key pair for a digital signature algorithm, which in nearly
all cases will be either DSA or RSA. Apart from the CA, there are the users of the PKI:
Alice, Bob, etc. All users of the PKI have a copy of the CA's public key. In order to join the
PKI, Alice and Bob get their DSA (or RSA) public keys signed by the CA. The CA first verifies
that Alice and Bob are who they say they are, and then signs Alice's and Bob's public keys (using
the CA's private key). This certifies that the public keys do indeed belong to Alice and Bob, who
now have public-key certificates.
A certificate has two parts: the data part, which includes Alice's name, Alice's public key and,
possibly, some other information; and the signature part, which is the digital signature of the data
part, created using the CA's private key. Alice and Bob exchange their signed DSA public-key
certificates as part of
the Diffie-Hellman key agreement - they verify the signatures on the certificates using the CA’s
public key. If the signature verifies correctly, the public key is accepted; otherwise, the public key
is rejected. The PKI gives an assurance that the public keys are authentic.
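In the same toy style as before, issuing and checking a certificate might look as follows (again
textbook RSA with tiny numbers, and field names of our own choosing; real certificates are far more
elaborate):

    from hashlib import sha256

    n_CA, e_CA, d_CA = 3233, 17, 2753   # CA's toy RSA key pair (n = 61 * 53)

    def h(m: bytes, n: int) -> int:
        return int.from_bytes(sha256(m).digest(), "big") % n

    def issue_certificate(name: str, public_key: int) -> dict:
        # Data part: owner's name and public key; signature part: CA's signature
        data = f"{name}:{public_key}".encode()
        return {"data": data, "sig": pow(h(data, n_CA), d_CA, n_CA)}

    def check_certificate(cert: dict) -> bool:
        # Anyone holding the CA's public key (e_CA, n_CA) can verify the signature
        return pow(cert["sig"], e_CA, n_CA) == h(cert["data"], n_CA)

    cert = issue_certificate("Alice", 1234)
    assert check_certificate(cert)       # a genuine certificate verifies
    cert["data"] = b"Eve:9999"
    print(check_certificate(cert))       # almost certainly False: tampering breaks the signature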
Alice and Bob can now exchange their signed Diffie-Hellman public keys. Alice verifies the signature
using Bob's DSA public key, which she knows is authentic, and Bob verifies the signature using
Alice's DSA public key, which he knows is authentic. In this way, it becomes impossible to mount
an intruder-in-the-middle attack, as the attacker cannot replace the Diffie-Hellman public keys
when they are exchanged.
There exist several versions of authenticated Diffie-Hellman key agreement, one of the simplest
being the Station-to-Station protocol; see the Handbook (Section 12.57) or Stinson (Section 11.2.1).
The fundamental ideas of this approach are to use PKI for distribution of public-key certificates,
and then to exchange signed data in the key exchange, checking the signatures using the public-key
certificates that were distributed using the PKI.
Certificates are the building blocks of PKIs. A certificate ‘binds’ an identity to a public key.
When Bob receives Alice’s certificate, he knows that he is talking to Alice, because the certificate
contains Alice's name. The most common form of certificate is called X.509 v3. An X.509
certificate contains the following fields:
1) Version number
2) Serial number
3) Signature algorithm ID
4) Issuer name
5) Validity period
6) Subject name (i.e. the certificate owner)
7) The certificate owner’s public key
8) Optional fields
9) CA's signature on fields 1-8
Fields 1-8 constitute the data part, and field 9 is the signature part. The internal structure of
these fields in an X.509 certificate is extremely complicated; for details, see the X.509 Style Guide
by Peter Gutmann,
http://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt
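For orientation, the fields listed above can be pictured as a simple record. This is a sketch only,
with field names of our own choosing; a real X.509 v3 certificate is an ASN.1/DER structure, not a
Python object:

    from dataclasses import dataclass, field

    @dataclass
    class X509CertificateSketch:
        version: int                    # 1) version number (3 for v3)
        serial_number: int              # 2) serial number
        signature_algorithm: str        # 3) signature algorithm ID
        issuer: str                     # 4) issuer (CA) name
        validity: tuple                 # 5) validity period (not-before, not-after)
        subject: str                    # 6) subject name (the certificate owner)
        subject_public_key: bytes       # 7) the certificate owner's public key
        extensions: dict = field(default_factory=dict)   # 8) optional fields
        ca_signature: bytes = b""       # 9) CA's signature on fields 1-8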
As practical examples of the use of PKI, we can mention the following situations.
• Electronic Banking
Banks need to allow customers to perform transactions on the bank's website. Here, the bank acts
as the CA and distributes certificates to all customers (users). When a customer wants to
communicate with the bank, the bank will know that the customer is authentic, because the customer
has a certificate that was signed by the bank.
• Virtual Private Network (VPN)
A VPN allows employees of a company to access the company's network from home or while
travelling. In this case, the company’s IT department acts as the CA, and gives certificates to all
employees. The employee and the company’s network can then authenticate themselves at the start
of each communication session.
• Credit Card Organisation
Different banks that are part of the same credit card organisation need to be able to exchange
payments securely. A PKI allows banks to identify each other. The credit card organisation (such
as Visa or Mastercard) will act as a CA, signing the public keys of individual banks.