
Historical Ciphers and their cryptanalysis
Historical Ciphers and their cryptanalysis
Caesar's cipher
Encryption:
Shift the letters of the alphabet 3 places forward
There is no key because the encryption method is fixed.
The shift cipher
The shift cipher can be viewed as a keyed variant of Caesar's cipher.
Encryption:
The key k is a number between 0 and 25. To encrypt, letters are shifted by k places forward.
Attack:
Try every possible key. This is called a brute-force or exhausitive-search attack.
sufficient key-space principle
Any secure encryption scheme must have a key space that is sufficiently large to make an exhaustive-search attack infeasible.
The mon-alphabetic substitution cipher
Encryption:
The key defines a map on the alphabet, and the map is allowed to be arbitrary subject only to the constraint that it be one-to-one.
Example:
The key defines the following permutation:
a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
X | E | U | A | D | N | B | K | V | M | R | O | C | Q | F | S | Y | H | W | G | L | Z | I | J | P | T |
Message tellhimaboutme
is encrypted to GDOOKVCXEFLGCD
Attack:
Tabulate the frequency distribution of characters in the cipher text. The frequencies are then compared to compared to the known letter frequencies of normal English text.
An improved attack on the shift cipher
It is difficult for a compute to check whether a given plaintext "make sense". But the following attack does not suffer from that.
Let $p_i$ denotes the frequency of the $i$th letter in normal English text.
$$ \sum ^{25} _{i=0} p^2_i \approx 0.065 $$
Now say we are given some ciphertext and let $q_i$ denotes the frequency of $i$th letter of the alphabet in this ciphertext.
$$ I_j \overset{\underset{\mathrm{def}}{}}{=} \sum ^{25}{i=0} p_i \cdot q $$
If we compute for each value of $j \in { 0, \ldots, 25 }$, the we expect to find that $I_k \approx 0.065$, whereas $I_j$ for $j \neq k$ will be different from 0.065.
The Vigen$\grave{e}$re (poly-alphabetic shift) cipher
Encryption:
The key is viewed as a string of letters; encryption is done by shifting each plaintext character by the amount indicated by the next character of the key, wrapping around in the key when necessary.
Example:
Plaintext: | tellhimaboutme |
---|---|
Key (repeated): | cafecafecafeca |
Ciphertext | VEQPJIREDOZXOE |
Attack:
If the length of the key if known:
Say the length of the key is $t$, then an observed cipher $c = c_1c_2\cdots$ can be divided into $t$ parts where each part can be viewed as having been encrypted using the shift cipher.
How to determine the key length from an observed ciphertext ?
The following approach is called the index of coincidence method.
For $\tau = 1,2,\ldots,T$, look at the sequence of ciphertext characters $c_1, c_{1+\tau}, c_{1+2\tau}, \cdots$ and tabulate $q_0, \cdots, q_{25}$ for this sequence. Then compute
$$ S_{\tau} \overset{\underset{\mathrm{def}}{}}{=} \sum ^{25}_{i=0} q_i^2 $$
When $\tau = t$ we expect $S_{\tau} \approx 0.065$
On the other hand, if $\tau$ is not a multiple of $t$ we expect that all characters will occur with roughly equal probability in the sequence $c_1, c_{1+\tau}, c_{1+2\tau}, \cdots$ , and so we expect $q_i \approx 1 / 26$ for all $i$. In this case we will obtain
$$ S_{\tau} \approx \sum^{25}_{i=0} (\frac{1}{26})^2 \approx 0.038 $$