Simple Substitution
Introduction
The simplest cryptosystems merely replace the plaintext letters with other letters, numbers, or, in some cases, arbitrary symbols. Usually only one cryptosymbol is allotted to each plaintext symbol, but in some more complex systems several variant cryptosymbols are allotted to the more common letters of the language in question.
The Caesar cipher
Julius Caesar is said to have used a very simple method to safeguard his communications, the so-called Caesar cipher. In the Caesar cipher each letter of the plaintext is replaced by the letter found three places further down the alphabet (at the end of the alphabet the letters "wrap around", so that A follows Z). The key for Caesar's secret cipher looks like this:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C |
Encrypted with this key, the plaintext Cross the river Rubicon becomes Furvv wkh ulyhu Uxelfrq.
Today, any cryptosystem whose cipher alphabet is the normal alphabet shifted by some number of steps is usually called a Caesar cipher.
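The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name `caesar` and the choice to leave spaces and punctuation untouched are assumptions, not part of the historical system.

```python
def caesar(text: str, shift: int = 3) -> str:
    """Replace each letter by the one `shift` places further down the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # The modulo makes the alphabet wrap around, so A follows Z.
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # spaces and punctuation are passed through
    return "".join(out)


print(caesar("Cross the river Rubicon"))      # Furvv wkh ulyhu Uxelfrq
print(caesar("Furvv wkh ulyhu Uxelfrq", -3))  # Cross the river Rubicon
```

Decryption is simply a shift in the opposite direction, and any value from 1 to 25 gives a Caesar cipher in the modern, wider sense.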
The Checkerboard
The Greek historian Polybios (c. 200 BC) tells us about a signalling system that is said to have been in use in Greece. The 24 letters of the Greek alphabet are written in five rows of five letters each (the last row having only four letters), thus forming a square, or checkerboard. Then, according to Polybios, torches are held up to send a message to a place within sight. First, between one and five torches are held up to indicate the row of the square in which the sought letter stands; then a second set of torches is held up, whose number tells the column in which the letter is found.
Needless to say, this signalling scheme is somewhat slow, but it can be used as a cryptosystem in the following way. We first adapt the system to the Latin alphabet. Since there are 26 letters but only 25 cells in a five-by-five square, one letter must be sacrificed (or we can use, for example, six rows instead of five). Usually the letters I and J are put in the same cell and treated as one, since there is seldom any ambiguity as to which letter is meant. Here is a typical checkerboard with the Latin alphabet:
| | 1 | 2 | 3 | 4 | 5 |
| 1 | a | b | c | d | e |
| 2 | f | g | h | ij | k |
| 3 | l | m | n | o | p |
| 4 | q | r | s | t | u |
| 5 | v | w | x | y | z |
To encrypt a text with this cipher, each letter of the plaintext is replaced by a two-figure number, the first figure giving the row in which the plaintext letter stands and the second figure giving the column. With this checkerboard key, the plaintext Troy has fallen becomes 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33.
A number of variants of the key shown above exist. Letters can be used instead of figures to indicate the rows and columns. In some cases the row and column numbers are written in a different order, or each row and column is given two figures, like this:
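As a minimal sketch, the basic five-by-five checkerboard with rows and columns numbered 1 to 5 can be written in Python as follows. The names `ROWS`, `CELL`, `LETTER` and the two functions are made up for this illustration, as is the convention of separating words with " - ".

```python
ROWS = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]  # j shares the cell of i

# Map every letter to its "row column" number, e.g. t -> "44".
CELL = {ch: f"{r}{c}"
        for r, row in enumerate(ROWS, start=1)
        for c, ch in enumerate(row, start=1)}
LETTER = {number: ch for ch, number in CELL.items()}
CELL["j"] = CELL["i"]


def checkerboard_encrypt(plaintext: str) -> str:
    """Replace each letter by its two-figure number; words are joined by ' - '."""
    words = plaintext.lower().split()
    return " - ".join(" ".join(CELL[ch] for ch in word if ch in CELL)
                      for word in words)


def checkerboard_decrypt(cipher: str) -> str:
    """Inverse of checkerboard_encrypt (a j in the plaintext comes back as i)."""
    return " ".join("".join(LETTER[n] for n in word.split())
                    for word in cipher.split(" - "))


print(checkerboard_encrypt("Troy has fallen"))
# 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33
```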
| | 2,6 | 0,3 | 1,5 | 7,9 | 4,8 |
| 6,8 | a | b | c | d | e |
| 1,4 | f | g | h | ij | k |
| 0,9 | l | m | n | o | p |
| 2,7 | q | r | s | t | u |
| 3,5 | v | w | x | y | z |
For each letter, the user is free to choose either of the two figures for the row and either of the two figures for the column, so every letter has four possible cipher numbers. It is of course possible to choose a different cipher number when the same letter occurs elsewhere in the message, thus hiding repetitions, like this:
| Message: | T | H | E | | B | A | T | T | A | L | I | O | N |
| Cipher: | 77 | 15 | 68 | | 83 | 66 | 29 | 79 | 62 | 02 | 17 | 09 | 91 |
| I | S | | M | O | V | I | N | G | | S | O | U | T | H |
| 19 | 21 | | 00 | 99 | 32 | 47 | 05 | 10 | | 75 | 99 | 24 | 79 | 11 |
Commonly, one would put these numbers together to form standard five-figure groups before transmission, like this:
77156 88366 29796 20217 09911 92100 99324 70510 75992 47911
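The variant with two figures per row and column might be sketched like this in Python. The random choice between the alternative figures, and all of the names used (`ROW_FIGURES`, `COL_FIGURES`, `variant_encrypt`, `variant_decrypt`, `five_figure_groups`), are assumptions made for the illustration; a human operator would of course pick the variants by hand.

```python
import random

LETTERS = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]  # j shares the cell of i
ROW_FIGURES = ["68", "14", "09", "27", "35"]  # two alternatives per row
COL_FIGURES = ["26", "03", "15", "79", "48"]  # two alternatives per column


def variant_encrypt(plaintext: str, rng=random) -> str:
    """Encrypt letters only, picking one of the four variants at random."""
    out = []
    for ch in plaintext.lower():
        ch = "i" if ch == "j" else ch
        for r, row in enumerate(LETTERS):
            c = row.find(ch)
            if c != -1:
                out.append(rng.choice(ROW_FIGURES[r]) + rng.choice(COL_FIGURES[c]))
    return "".join(out)


def variant_decrypt(digits: str) -> str:
    """Read the digits in pairs; spaces between groups are ignored."""
    digits = "".join(digits.split())
    out = []
    for i in range(0, len(digits) - 1, 2):
        r = next(k for k, figs in enumerate(ROW_FIGURES) if digits[i] in figs)
        c = next(k for k, figs in enumerate(COL_FIGURES) if digits[i + 1] in figs)
        out.append(LETTERS[r][c])
    return "".join(out)


def five_figure_groups(digits: str, size: int = 5) -> str:
    """Split a digit string into groups for transmission."""
    return " ".join(digits[i:i + size] for i in range(0, len(digits), size))


print(variant_decrypt("77156 88366 29796 20217 09911 92100 99324 70510 75992 47911"))
# thebattalionismovingsouth
```

Because every figure from 0 to 9 belongs to exactly one row and exactly one column, decryption is unambiguous no matter which variants the sender picked.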
A nifty checkerboard variant exists in which some of the letters, usually the most frequent ones in the language in question, receive single-figure cryptosymbols, while the rest get two-figure combinations just as above. Let's say the key looks like this:
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | A | S | I | N | T | O | E | R | | |
| 6 | B | C | D | F | G | H | J | K | L | M |
| 3 | P | Q | U | V | W | X | Y | Z | . | / |
The first row containing letters is formed by the mnemonic phrase A sin to err, with the last r dropped (the phrase happens to contain the eight most frequent letters of English). The rest of the alphabet is then listed in order in two rows of ten cells each, ending with a period and a slash (the slash may be used to separate words when ambiguity would arise if they were written together). The figures in the top row, and the two figures in the first column, are used as coordinates to refer to the cell of the table containing the letter to be encrypted.
The letters of the first row are encrypted as single figures, the letters of the second row get two-figure numbers beginning with 6, and the letters of the last row get two-figure numbers beginning with 3. As can be seen from the table, the figures 6 and 3 can never stand alone as single-figure numbers; each must begin, or be the second figure of, a two-figure number. Thus there is no danger in running the numbers of a cryptogram together as a string, or in five-figure groups. It is always possible to decrypt such a cryptogram without any ambiguity as to which figures are to be read singly and which are to be treated as parts of two-figure numbers. The string:
645636331016478150
can only be divided in one way, thus:
64-5-63-63-31-0-1-64-7-8-1-5-0
By referring to the table above, the plaintext communication is easily derived.
As can be seen, only five out of a total of thirteen letters are encrypted as two-figure numbers, which substantially shortens the cryptogram and the transmission time needed.
The following cryptogram uses the table above, but with the coordinates in a different order. See if you can break it; the plaintext is English, in military language:
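The reason the digit string can be split in only one way is easy to see in code: a 6 or a 3 always opens a two-figure number, and every other figure stands alone. The following Python sketch uses the table above; the names `TOP`, `ROWS`, `ENCODE`, `DECODE` and the two functions are invented for the illustration.

```python
TOP = "7410852963"  # column figures, left to right
ROWS = {
    "": "ASINTOER  ",   # single-figure row; the cells under 6 and 3 are empty
    "6": "BCDFGHJKLM",  # two-figure numbers beginning with 6
    "3": "PQUVWXYZ./",  # two-figure numbers beginning with 3
}

ENCODE = {ch: prefix + TOP[i]
          for prefix, row in ROWS.items()
          for i, ch in enumerate(row) if ch != " "}
DECODE = {number: ch for ch, number in ENCODE.items()}


def straddle_encrypt(plaintext: str) -> str:
    """Encrypt the symbols found in the table; anything else is skipped."""
    return "".join(ENCODE[ch] for ch in plaintext.upper() if ch in ENCODE)


def straddle_decrypt(digits: str) -> str:
    """Walk the string once: 6 or 3 starts a pair, any other figure is single."""
    digits = "".join(digits.split())
    out, i = [], 0
    while i < len(digits):
        width = 2 if digits[i] in ROWS and digits[i] != "" else 1
        out.append(DECODE[digits[i:i + width]])
        i += width
    return "".join(out)


print(straddle_encrypt("communication"))       # 645636331016478150
print(straddle_decrypt("645636331016478150"))  # COMMUNICATION
```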
13492 09610 41763 07431 46918 65737 67721 86111 11581 71559 14176 30710
Viking Cryptography
The idea of the simpler checkerboard cipher just described, together with a Caesar-like cipher, was used to some extent by the Vikings. To explain the systems used, we must first take a brief look at the rune alphabet:
The Scandinavian Normal Runes
On the famous runestone of Rök in Östergötland, Sweden, both types of cipher are used.
In one passage, the reader gets the text a i r f b f r b n h n, which is as incomprehensible to speakers of Old Norse as it is to you. The trick here is to read, for each rune, the rune immediately following it in the rune alphabet, and the text then becomes s a k u m u k m i n i. Even a scholar fluent in Old Norse will have some trouble understanding this, since the Vikings never bothered to write double runes, even when one word ended in the same letter as the next word began with, and the carver of the Rökstone didn't bother to use word separators either, but ran the text together in one long row. So the text can be read as "sakum ukmini", meaning "I say to the youth" or "I tell the young man", or, if it is read as "sakum mukmini", it will mean "I tell the great memory" (don't ask me why the carver had to encrypt this).
There are several checkerboard ciphers of various kinds found on the Rökstone. One of them looks like this:
One of the Rökstone ciphers
To read this text, one first counts the number of hooks pointing to the left and then the hooks pointing right of each individual vertical stroke, putting them together as two-figure numbers. In the last portion, the same is done, starting with the symbols having a topstroke pointing left and grouping the number of them together with the ones whose topstroke points to the right. When writing the numbers out, it will look like this:
25 24 36 32 13 32 36 13 23 22 23 - 33 32 35
If we consult the rune alphabet above, we can easily decipher this text. The first figure of every number tells which rune family the sought rune belongs to (for some reason the three families are always numbered backwards), and the second figure tells us which rune to read within that family. When the numbers are replaced by the right runes and transliterated into the Latin alphabet, the text reads: s a k u m u k m i n i th u r. The reader may recognize the first part, which is the same as in the cipher explained previously. It is followed by the word Thor, the Norse god of thunderstorms, a very powerful figure in Norse mythology, which perhaps explains the need for encryption.
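Both Rökstone ciphers can be tried out with a small Python sketch. Note that the rune order used here is an assumption: it is the usual 16-rune sequence in three families, written in a common Latin transliteration (with "o" for the fourth rune and "R" for the last one), and it may differ in detail from the figure referred to above. The function names are likewise made up.

```python
# Assumed transliteration of the 16 runes, grouped in their three families.
FAMILIES = [
    ["f", "u", "th", "o", "r", "k"],
    ["h", "n", "i", "a", "s"],
    ["t", "b", "m", "l", "R"],
]
ALPHABET = [rune for family in FAMILIES for rune in family]


def read_next_rune(runes):
    """Caesar-like cipher: read the rune that follows each written rune."""
    return [ALPHABET[(ALPHABET.index(r) + 1) % len(ALPHABET)] for r in runes]


def read_numbers(numbers):
    """Checkerboard cipher: first figure = family, second = place in family."""
    out = []
    for n in numbers:
        family = 3 - int(n[0])  # the families are numbered backwards: 3, 2, 1
        out.append(FAMILIES[family][int(n[1]) - 1])
    return out


print(" ".join(read_next_rune("a i r f b f r b n h n".split())))
# s a k u m u k m i n i
print(" ".join(read_numbers("25 24 36 32 13 32 36 13 23 22 23 33 32 35".split())))
# s a k u m u k m i n i th u r
```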
Simple Substitution using an Unordered Alphabet
In the systems described so far the normal sequence of the alphabet has been used, but one can of course use an unordered sequence of letters or numbers as the cipher alphabet. The classical method uses a keyword to achieve this. Any word or phrase will do, but every repeated letter must be deleted after its first occurrence. If the keyword is RAMSES, the following cipher key, one of several possible, can be constructed:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | R | A | M | S | E | B | C | D | F | G | H | I | J | K | L | N | O | P | Q | T | U | V | W | X | Y | Z |
A major drawback of this system is that, towards the end of the alphabet, the plaintext letters tend to be encrypted as themselves if the keyword doesn't contain, say, an "X", "Y", or "Z". To counter this, the users can agree to start writing the keyword and the rest of the letters at a starting position other than the letter "a". The starting position can even be varied from message to message, and this information can be hidden somewhere in the cryptogram. For instance, when the keyword starts under the letter "f", the result is the following key:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | V | W | X | Y | Z | R | A | M | S | E | B | C | D | F | G | H | I | J | K | L | N | O | P | Q | T | U |
Suppose the starting position is hidden as the third letter of the cryptogram. The following table then illustrates the use of the above key (the asterisk marks the place where the indicator letter is inserted):
| Message: | d | e | * | s | p | e | r | a | t | e | | n | e | e | d |
| Cipher: | Y | Z | F | K | H | Z | J | V | L | Z | | F | Z | Z | Y |
| o | f | | s | u | p | p | l | i | e | s |
| G | R | | K | N | H | H | C | S | Z | K |
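The construction of a keyword alphabet, with an optional starting position and indicator letter, can be sketched in Python as follows. The function names, the `start` and `indicator_index` parameters, and the choice to keep the spaces between words are assumptions made for the illustration.

```python
import string


def keyed_alphabet(keyword: str, start: str = "a") -> dict:
    """Keyword (repeats deleted) plus the unused letters, written from `start`."""
    sequence = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in sequence:
            sequence.append(ch)
    offset = string.ascii_lowercase.index(start.lower())
    # Rotate so that the keyword begins under the plaintext letter `start`.
    rotated = sequence[-offset:] + sequence[:-offset] if offset else sequence
    return dict(zip(string.ascii_lowercase, rotated))


def keyword_encrypt(plaintext, keyword, start="a", indicator_index=None):
    """Encrypt; optionally insert the start letter at a zero-based position."""
    key = keyed_alphabet(keyword, start)
    cipher = "".join(key.get(ch, ch) for ch in plaintext.lower())
    if indicator_index is not None:
        cipher = cipher[:indicator_index] + start.upper() + cipher[indicator_index:]
    return cipher


print("".join(keyed_alphabet("RAMSES").values()))
# RAMSEBCDFGHIJKLNOPQTUVWXYZ
print(keyword_encrypt("desperate need of supplies", "RAMSES", "f", 2))
# YZFKHZJVLZ FZZY GR KNHHCSZK
```

The receiver would remove the third letter, F, rebuild the key with the keyword starting under "f", and decrypt the remaining letters with the inverted table.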
© Torbjörn Andersson. Torbjörn Andersson Fecit