Cys505 Lecture02
Plain: abcdefghijklmnopqrstuvwxyz
Cipher: DKVQFIBJWPESCXHTMYAUOLRGZN
Plaintext: ifwewishtoreplaceletters
Ciphertext: WIRFRWAJUHYFTSDVFSFUUFYA
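The example above can be reproduced with a short sketch. The function names `mono_encrypt` and `mono_decrypt` are illustrative, not from the lecture; the cipher alphabet is the one shown above.

```python
import math
import string

PLAIN = string.ascii_lowercase
CIPHER = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # cipher alphabet from the example above

# One lookup table per direction: plain letter -> cipher letter and back.
ENCRYPT_TABLE = str.maketrans(PLAIN, CIPHER)
DECRYPT_TABLE = str.maketrans(CIPHER, PLAIN)


def mono_encrypt(plaintext):
    """Replace each plaintext letter with its fixed cipher letter."""
    return plaintext.lower().translate(ENCRYPT_TABLE)


def mono_decrypt(ciphertext):
    """Invert the substitution."""
    return ciphertext.upper().translate(DECRYPT_TABLE)


# Every permutation of the 26 letters is a possible key.
KEY_COUNT = math.factorial(26)

if __name__ == "__main__":
    print(mono_encrypt("ifwewishtoreplaceletters"))  # WIRFRWAJUHYFTSDVFSFUUFYA
    print(f"{KEY_COUNT:.1e} possible keys")
```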
now have a total of 26! ≈ 4 x 10^26 keys
with so many keys, might think it is secure
◦ The simplicity and apparent strength of the monoalphabetic
substitution cipher let it dominate for the first
millennium AD.
but that would be !!!WRONG!!!
◦ First broken by Arabic scientists in 9th century
letters are not equally commonly used
in English E is by far the most common
letter
followed by T,R,N,I,O,A,S
other letters are fairly rare
e.g. Z,J,K,Q,X
have tables of single, double & triple letter
frequencies
key concept - monoalphabetic substitution
ciphers do not change relative letter
frequencies
discovered by Arabian scientists in 9th
century
calculate letter frequencies for ciphertext
compare counts/plots against known values
for a monoalphabetic cipher, must identify each
letter
◦ tables of common double/triple letters help
given ciphertext:
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
count relative letter frequencies
guess P & Z are e and t
guess ZW is th and hence ZWP is the
proceeding with trial and error finally get:
it was disclosed yesterday that several informal but
direct contacts have been made with political
representatives of the viet cong in moscow
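The first step of the attack, counting relative letter frequencies, can be sketched as follows. The function name `relative_frequencies` is an illustrative choice; the ciphertext is the one given above.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)


def relative_frequencies(text):
    """Return (letter, count, percent) tuples, most frequent first."""
    letters = [c for c in text.upper() if c.isalpha()]
    total = len(letters)
    counts = Counter(letters)
    return [(letter, n, 100.0 * n / total) for letter, n in counts.most_common()]


if __name__ == "__main__":
    for letter, n, pct in relative_frequencies(CIPHERTEXT)[:5]:
        print(f"{letter}: {n:2d}  {pct:5.2f}%")
```

P and Z come out on top, which is what motivates guessing that they stand for e and t.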
not even the large number of keys in a
monoalphabetic cipher provides security
one approach to improving security was to
encrypt multiple letters
the Playfair Cipher is an example
invented by Charles Wheatstone in 1854, but
named after his friend Baron Playfair
a 5x5 matrix of letters based on a keyword
fill in letters of keyword (sans duplicates)
fill rest of matrix with other letters (I and J share one cell)
eg. using the keyword MONARCHY:
M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z
plaintext encrypted two letters at a time:
1. the plaintext is split into pairs of letters
(digraphs). If there is an odd number of letters,
a Z is appended to complete the last pair.
2. if a pair is a repeated letter, insert a filler like
'X' between them, eg. "balloon" is split as "ba lx lo on"
3. if both letters fall in the same row, replace each
with the letter to its right (wrapping back to start from
end), eg. "ar" encrypts as "RM"
4. if both letters fall in the same column, replace
each with the letter below it (again wrapping to
top from bottom), eg. "mu" encrypts to "CM"
5. otherwise each letter is replaced by the one in
its row in the column of the other letter of the
pair, eg. "hs" encrypts to "BP", and "ea" to "IM"
or "JM" (as desired)
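The five rules above can be sketched in code as follows. This is a minimal illustration, not a reference implementation: the function names are invented here, J is folded into I, and corner cases such as a doubled filler letter ("xx") are not handled.

```python
def build_matrix(keyword):
    """5x5 Playfair matrix: keyword letters first, then the rest (I/J merged)."""
    seen = []
    for ch in keyword.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def split_digraphs(plaintext, filler="X", pad="Z"):
    """Rules 1 and 2: pair up letters, breaking doubles with a filler."""
    letters = ["I" if c == "J" else c for c in plaintext.upper() if c.isalpha()]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append(a + letters[i + 1])
            i += 2
        elif i + 1 < len(letters):
            pairs.append(a + filler)  # repeated letter: insert filler
            i += 1
        else:
            pairs.append(a + pad)     # odd length: pad the last letter
            i += 1
    return pairs


def encrypt_pair(matrix, pair):
    """Rules 3 to 5 for a single digraph."""
    pos = {matrix[r][c]: (r, c) for r in range(5) for c in range(5)}
    (r1, c1), (r2, c2) = pos[pair[0]], pos[pair[1]]
    if r1 == r2:   # same row: take the letter to the right
        return matrix[r1][(c1 + 1) % 5] + matrix[r2][(c2 + 1) % 5]
    if c1 == c2:   # same column: take the letter below
        return matrix[(r1 + 1) % 5][c1] + matrix[(r2 + 1) % 5][c2]
    return matrix[r1][c2] + matrix[r2][c1]  # rectangle: swap columns


def playfair_encrypt(plaintext, keyword):
    matrix = build_matrix(keyword)
    return "".join(encrypt_pair(matrix, p) for p in split_digraphs(plaintext))


if __name__ == "__main__":
    m = build_matrix("MONARCHY")
    for row in m:
        print(" ".join(row))
    print(encrypt_pair(m, "AR"), encrypt_pair(m, "MU"), encrypt_pair(m, "HS"))
```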
security much improved over monoalphabetic
since have 26 x 26 = 676 digrams
would need a 676-entry frequency table to
analyse (versus 26 for a monoalphabetic)
and correspondingly more ciphertext
was widely used for many years (eg. US &
British military in WW1)
it can be broken, given a few hundred letters
since still has much of plaintext structure
another approach to improving security is
to use multiple cipher alphabets
called polyalphabetic substitution ciphers
makes cryptanalysis harder with more
alphabets to guess and flatter frequency
distribution
use a key to select which alphabet is used
for each letter of the message
use each alphabet in turn
repeat from start after end of key is reached
key: deceptivedeceptivedeceptive
plaintext: wearediscoveredsaveyourself
ciphertext:ZICVTWQNGRZGVTWAVZHCQYGLMGJ
write the plaintext out
write the keyword repeated above it
◦ eg using keyword deceptive
◦ d is the 4th letter, so add 4-1 = 3
◦ e is the 5th letter, so add 5-1 = 4
◦ c is the 3rd letter, so add 3-1 = 2
use each key letter as a Caesar cipher key
encrypt the corresponding plaintext letter
simplest polyalphabetic substitution cipher is
the Vigenère Cipher
effectively multiple Caesar ciphers
key is d letters long: K = k1 k2 ... kd
ith key letter specifies the ith alphabet to use
use each alphabet in turn
repeat from start after d letters in message
decryption simply works in reverse
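A minimal sketch of the Vigenère cipher as described, checked against the "deceptive" example. The function names are illustrative, and the sketch assumes the input contains letters only.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_lowercase


def _vigenere(text, key, sign):
    """Shift each letter by the matching key letter (a = 0, b = 1, ...)."""
    if not key:
        raise ValueError("key must not be empty")
    # The key repeats from the start once its last letter has been used.
    shifts = cycle([ALPHABET.index(k) for k in key.lower()])
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * s) % 26]
        for ch, s in zip(text.lower(), shifts)
    )


def vigenere_encrypt(plaintext, key):
    return _vigenere(plaintext, key, +1).upper()


def vigenere_decrypt(ciphertext, key):
    # Decryption applies the same shifts in the opposite direction.
    return _vigenere(ciphertext, key, -1)


if __name__ == "__main__":
    print(vigenere_encrypt("wearediscoveredsaveyourself", "deceptive"))
    # ZICVTWQNGRZGVTWAVZHCQYGLMGJ
```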
The Modern Vigenère Tableau
have multiple ciphertext letters for each
plaintext letter
hence letter frequencies are obscured
but not totally lost
start with letter frequencies
◦ see if it looks monoalphabetic or not
if not, then need to determine number of
alphabets, since then can attack each alphabet separately
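The text here does not say how the number of alphabets is found. One common approach, used below purely as an illustration, is to look for repeated ciphertext sequences: the distance between two repeats is likely a multiple of the key length. The function name and the choice of three-letter sequences are assumptions of this sketch.

```python
def repeated_sequence_distances(ciphertext, n=3):
    """Map each repeated n-letter sequence to the gaps between its occurrences."""
    positions = {}
    for i in range(len(ciphertext) - n + 1):
        positions.setdefault(ciphertext[i:i + n], []).append(i)
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }


if __name__ == "__main__":
    # Ciphertext from the "deceptive" example, whose key is 9 letters long.
    print(repeated_sequence_distances("ZICVTWQNGRZGVTWAVZHCQYGLMGJ"))
    # {'VTW': [9]}
```

On a text this short a single repeat is weak evidence; with more ciphertext, the common factors of many such distances point to the key length.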
if a truly random key as long as the
message is used, the cipher will be secure
called a One-Time pad
is unbreakable since ciphertext bears no
statistical relationship to the plaintext
◦ No repetition of patterns
since for any plaintext & any ciphertext
there exists a key mapping one to other
can only use the key once though
have problem of safe distribution of key
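The claim that some key maps any plaintext to any ciphertext can be checked with a small sketch. It assumes a letter-shift (Vigenère-style) pad over the 26 letters and letters-only input; the function names are illustrative.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase


def random_key(length):
    """Draw a fresh random key letter for every message letter."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def _shift(text, key, sign):
    if len(key) < len(text):
        raise ValueError("key must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % 26]
        for t, k in zip(text.lower(), key.lower())
    )


def otp_encrypt(plaintext, key):
    return _shift(plaintext, key, +1).upper()


def otp_decrypt(ciphertext, key):
    return _shift(ciphertext, key, -1)


def key_between(plaintext, ciphertext):
    """The unique key that turns this plaintext into this ciphertext."""
    return _shift(ciphertext, plaintext, -1)


if __name__ == "__main__":
    message = "wearediscovered"
    key = random_key(len(message))
    cipher = otp_encrypt(message, key)
    # A different key "decrypts" the same ciphertext to any other message
    # of the same length, so the ciphertext alone reveals nothing.
    fake_key = key_between("attackatsunrise", cipher)
    print(cipher, otp_decrypt(cipher, fake_key))
```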
now consider classical transposition or
permutation ciphers
these hide the message
by rearranging the letter order without
altering the actual letters used
can recognise these since have the same
frequency distribution as the original text
write message letters out diagonally over a
number of rows then read off cipher row by
row
eg. write message “meet me after the toga
party” out as:
m e m a t r h t g p r y
 e t e f e t e o a a t
giving ciphertext
MEMATRHTGPRYETEFETEOAAT
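A short sketch of this scheme (commonly called the rail fence cipher) reproduces the ciphertext above. The function name and the `depth` parameter are illustrative; the example uses two rows.

```python
def rail_fence_encrypt(message, depth=2):
    """Write letters in a zigzag over `depth` rows, then read row by row."""
    letters = [c for c in message.upper() if c.isalpha()]
    if depth < 2:
        return "".join(letters)
    rails = [[] for _ in range(depth)]
    row, step = 0, 1
    for ch in letters:
        rails[row].append(ch)
        # Reverse direction at the top and bottom rows.
        if row == 0:
            step = 1
        elif row == depth - 1:
            step = -1
        row += step
    return "".join("".join(r) for r in rails)


if __name__ == "__main__":
    print(rail_fence_encrypt("meet me after the toga party"))
    # MEMATRHTGPRYETEFETEOAAT
```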
a more complex scheme: write letters of
message out in rows over a specified number
of columns, then read off the columns
one at a time in the order given by
some key
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
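The example can be checked with a minimal sketch. It assumes the key lists, for each column from left to right, the position at which that column is read out; the function name is illustrative, and the message must already be padded to fill the last row.

```python
def row_transposition_encrypt(plaintext, key):
    """Write the message in rows, then read the columns in key order."""
    letters = [c for c in plaintext.upper() if c.isalpha()]
    ncols = len(key)
    if len(letters) % ncols != 0:
        raise ValueError("pad the message to a multiple of the key length")
    rows = [letters[i:i + ncols] for i in range(0, len(letters), ncols)]
    out = []
    for order in range(1, ncols + 1):
        col = key.index(order)  # column labelled with this read position
        out.extend(row[col] for row in rows)
    return "".join(out)


if __name__ == "__main__":
    key = [4, 3, 1, 2, 5, 6, 7]
    print(row_transposition_encrypt("attackpostponeduntiltwoamxyz", key))
    # TTNAAPTMTSUOAODWCOIXKNLYPETZ
```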
ciphers using substitutions or transpositions
are not secure because of language
characteristics
hence consider using several ciphers in
succession to make cryptanalysis harder, but:
◦ two substitutions make a more complex substitution
◦ two transpositions make a more complex transposition
◦ but a substitution followed by a transposition makes a
new much harder cipher
this is the bridge from classical to modern ciphers
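The first point, that two substitutions in a row amount to one substitution, can be illustrated with a small sketch. The two cipher alphabets below are arbitrary choices for the demonstration, and the function names are illustrative.

```python
import string

ALPHABET = string.ascii_lowercase


def substitute(text, cipher_alphabet):
    """Apply a monoalphabetic substitution given as a 26-letter alphabet."""
    return text.lower().translate(str.maketrans(ALPHABET, cipher_alphabet))


def compose(first, second):
    """Single cipher alphabet equivalent to applying `first`, then `second`."""
    return substitute(first, second)


FIRST = "dkvqfibjwpescxhtmyauolrgzn"     # an arbitrary substitution
SECOND = ALPHABET[3:] + ALPHABET[:3]      # a Caesar shift by 3

if __name__ == "__main__":
    msg = "ifwewishtoreplaceletters"
    twice = substitute(substitute(msg, FIRST), SECOND)
    once = substitute(msg, compose(FIRST, SECOND))
    print(twice == once)  # True: still just one monoalphabetic substitution
```

Because the result is again a single fixed alphabet, the same frequency attack applies to it.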
Multiple-stage substitution algorithms
before modern ciphers, rotor machines
were the most common product cipher
were widely used in WW2
◦ German Enigma, Allied Hagelin, Japanese Purple
implemented a very complex, varying
substitution cipher
used a series of cylinders, each giving one
substitution, which rotated and changed
after each letter was encrypted
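A toy single-cylinder sketch shows how stepping after each letter makes the substitution vary. It is a simplification and not a model of Enigma or any real machine: the wiring is an arbitrary permutation, there is only one cylinder, and the function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase
WIRING = "DKVQFIBJWPESCXHTMYAUOLRGZN"  # arbitrary permutation, assumed for the demo


def rotor_encrypt(plaintext, wiring=WIRING, start=0):
    """Encrypt with one cylinder that advances one step per letter."""
    out, offset = [], start
    for ch in plaintext.upper():
        if ch not in ALPHABET:
            continue
        # Enter the cylinder at a rotated contact, pass through the fixed
        # wiring, then undo the rotation on the way out.
        idx = (ALPHABET.index(ch) + offset) % 26
        out.append(ALPHABET[(ALPHABET.index(wiring[idx]) - offset) % 26])
        offset = (offset + 1) % 26  # cylinder rotates after each letter
    return "".join(out)


def rotor_decrypt(ciphertext, wiring=WIRING, start=0):
    out, offset = [], start
    for ch in ciphertext.upper():
        idx = wiring.index(ALPHABET[(ALPHABET.index(ch) + offset) % 26])
        out.append(ALPHABET[(idx - offset) % 26])
        offset = (offset + 1) % 26
    return "".join(out)


if __name__ == "__main__":
    print(rotor_encrypt("AAA"))  # DJT: same letter, different substitution each time
```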
steganography: an alternative to encryption
hides existence of message
◦ using only a subset of letters/words in a longer
message marked in some way
◦ using invisible ink
◦ hiding data in a graphic image or sound file
has drawbacks
◦ high overhead to hide relatively few info bits