Simple Substitution
Introduction
The simplest cryptosystems merely replace the plaintext letters with other letters, numbers, or, in some cases, arbitrary symbols. Usually only one cryptosymbol is allotted to each plaintext symbol, but in some more complex systems, variant cryptosymbols are allotted to the more common letters of the language in question.
The Caesar cipher
Julius Caesar is said to have used a very simple method to safeguard his communications, the so-called Caesar cipher.
In the Caesar cipher each letter of the plaintext is replaced by the letter found three places further down the alphabet (at the end of the alphabet the letters "wrap around", so that A follows Z). The key for Caesar's secret cipher looks like this:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C |
Encrypted in this key, the plaintext Cross the river Rubicon becomes Furvv wkh ulyhu Uxelfrq.
Today, any cryptosystem that uses the normal alphabet, shifted by any number of steps, as its cipher alphabet is usually called a Caesar cipher.
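The shift can be sketched in a few lines of Python. This is only an illustration: the function name caesar and its shift parameter are made up for the example, and the sketch handles only the 26 unaccented Latin letters, leaving everything else unchanged.

```python
def caesar(text, shift=3):
    """Replace each letter by the one `shift` places further down the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # The modulo makes the alphabet wrap around after Z.
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # spaces and punctuation are left as they are
    return "".join(out)

print(caesar("Cross the river Rubicon"))      # Furvv wkh ulyhu Uxelfrq
print(caesar("Furvv wkh ulyhu Uxelfrq", -3))  # Cross the river Rubicon
```

Decryption is simply the same shift in the opposite direction, which is why a negative shift is used in the second call.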
The Checkerboard
The Greek historian Polybios (c. 200 BC) tells us about a signalling system that is said to have been in use in Greece. The Greek alphabet of 24 letters is written in five rows of five letters each (the last row having only four letters), thus forming a square, or checkerboard.
Then, according to Polybios, torches are held up to send a message to a place within sight. First, between one and five torches are held up to indicate the row in which the sought letter stands in the square; then a second set of one to five torches is held up to indicate the column.
Needless to say, this signalling scheme is somewhat slow, but it can be used as a cryptosystem in the following way. We first adapt the system to the Latin alphabet. Since there are 26 letters but only 25 cells in a five-by-five square, one letter must be sacrificed (or we can use, e.g., six rows instead of five). Usually the letters I and J are put in the same cell and treated as one letter, since ambiguity as to which letter is meant seldom arises. Here is a typical checkerboard with the Latin alphabet:
| | 1 | 2 | 3 | 4 | 5 |
| 1 | a | b | c | d | e |
| 2 | f | g | h | ij | k |
| 3 | l | m | n | o | p |
| 4 | q | r | s | t | u |
| 5 | v | w | x | y | z |
To encrypt a text with this cipher, each letter of the plaintext is replaced by a two-figure number, the first figure giving the row in which the plaintext letter stands and the second figure giving the column. The plaintext Troy has fallen will become 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33 in this checkerboard key.
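The row-and-column lookup can be sketched in Python as follows. The names ROWS, ENCODE, encrypt and decrypt are chosen for this illustration only, and the " - " word separator simply imitates the way the example above is written out.

```python
ROWS = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]  # i and j share a cell

# Map every letter to "<row><column>", both counted from 1.
ENCODE = {ch: f"{r}{c}"
          for r, row in enumerate(ROWS, start=1)
          for c, ch in enumerate(row, start=1)}
ENCODE["j"] = ENCODE["i"]

def encrypt(plaintext):
    words = plaintext.lower().split()
    return " - ".join(
        " ".join(ENCODE[ch] for ch in word if ch in ENCODE) for word in words
    )

def decrypt(cryptogram):
    words = cryptogram.split(" - ")
    return " ".join(
        "".join(ROWS[int(n[0]) - 1][int(n[1]) - 1] for n in word.split())
        for word in words
    )

print(encrypt("Troy has fallen"))  # 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33
```

Note that decryption always yields i for the shared cell; the reader has to decide from context whether j was meant.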
A number of variants of the key shown above exist. It is possible to use letters instead of figures to indicate the rows and columns, if one likes. In some cases the numbers labelling the rows and columns are put in a different order, or each row and column is given two figures, like this:
| | 2,6 | 0,3 | 1,5 | 7,9 | 4,8 |
| 6,8 | a | b | c | d | e |
| 1,4 | f | g | h | ij | k |
| 0,9 | l | m | n | o | p |
| 2,7 | q | r | s | t | u |
| 3,5 | v | w | x | y | z |
For each letter, the user is free to choose either of the two figures for the row and either of the two for the column, and it is of course possible to choose different cipher numbers for the same letter occurring elsewhere in the message, thus hiding repetitions, like this:
| Message: | T | H | E | | B | A | T | T | A | L | I | O | N |
| Cipher: | 77 | 15 | 68 | | 83 | 66 | 29 | 79 | 62 | 02 | 17 | 09 | 91 |
| Message: | I | S | | M | O | V | I | N | G | | S | O | U | T | H |
| Cipher: | 19 | 21 | | 00 | 99 | 32 | 47 | 05 | 10 | | 75 | 99 | 24 | 79 | 11 |
Commonly, one would put these numbers together to form standard five-figure groups before transmission, like this:
77156 88366 29796 20217 09911 92100 99324 70510 75992 47911
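A minimal Python sketch of this variant is given below. The identifiers (ROW_FIGURES, COL_FIGURES, encrypt, decrypt, groups_of_five) are invented for the illustration, and the use of random.choice to pick between the two figures is an assumption about how a user might vary the choice; a human clerk would simply choose by whim.

```python
import random

ROWS = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]  # i and j share a cell
ROW_FIGURES = ["68", "14", "09", "27", "35"]  # two figures for each row
COL_FIGURES = ["26", "03", "15", "79", "48"]  # two figures for each column

def encrypt(plaintext, rng=random):
    figures = []
    for ch in plaintext.lower().replace("j", "i"):
        for r, row in enumerate(ROWS):
            c = row.find(ch)
            if c != -1:
                # Pick either figure for the row and either for the column.
                figures.append(rng.choice(ROW_FIGURES[r]) + rng.choice(COL_FIGURES[c]))
    return "".join(figures)

def decrypt(cryptogram):
    digits = "".join(d for d in cryptogram if d.isdigit())
    row_of = {d: r for r, pair in enumerate(ROW_FIGURES) for d in pair}
    col_of = {d: c for c, pair in enumerate(COL_FIGURES) for d in pair}
    return "".join(ROWS[row_of[a]][col_of[b]]
                   for a, b in zip(digits[0::2], digits[1::2]))

def groups_of_five(digits):
    return " ".join(digits[i:i + 5] for i in range(0, len(digits), 5))

sample = "77156 88366 29796 20217 09911 92100 99324 70510 75992 47911"
print(decrypt(sample))  # thebattalionismovingsouth
```

Because the choice of figures is random, encrypting the same message twice will usually give two different cryptograms, yet both decrypt to the same text.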
Monome-dinome checkerboard
A nifty checkerboard variant exists - the Monome-dinome Checkerboard - where some of the letters, usually the ones occurring most frequently in the language in question, receive single-figure cryptosymbols, and the rest get two-figure combinations just as above. Let's say the key looks like this:
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | A | S | I | N | T | O | E | R | | |
| 6 | B | C | D | F | G | H | J | K | L | M |
| 3 | P | Q | U | V | W | X | Y | Z | . | / |
The first row containing letters is formed by the mnemonic phrase A sin to err, with the last r dropped (the phrase happens to contain eight of the most frequent letters of English). The rest of the alphabet is then listed in order in two rows of ten cells, ending with a period and a slash (the slash may be used to separate words when ambiguity would arise if they were written together). The figures in the top row and at the last two positions of the first column are used as coordinates to refer to the cell in the table containing the letter to be encrypted.
The letters in the first row are encrypted as single figures, the letters in the second row get two-figure numbers beginning with 6, and the letters in the last row get two-figure numbers beginning with 3. As can be seen from the table, the figures 6 and 3 can never stand alone as single-figure numbers; each must either begin a two-figure number or be its second figure. Thus there is no danger in running the numbers of a cryptogram together as a string, or in five-figure groups. It is always possible to decrypt such a cryptogram without any ambiguity as to which figures are to be read as single figures and which as two-figure numbers. The string:
645636331016478150
can only be divided in one way, thus:
64-5-63-63-31-0-1-64-7-8-1-5-0
By referring to the table above, the plaintext communication is easily derived.
As can be seen, only five out of a total of thirteen letters are
encrypted as two-figure numbers, thus shortening the cryptogram and the
transmission time needed substantially.
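The unambiguous division of the figure string can be sketched in Python. The names COLUMNS, BOARD, split_figures and so on are hypothetical, chosen only for this example; the rule they implement is the one described above, namely that a 6 or a 3 in leading position always opens a two-figure number.

```python
COLUMNS = "7410852963"  # column coordinates, left to right
BOARD = {
    "":  "ASINTOER",    # single-figure row (columns 6 and 3 stay empty)
    "6": "BCDFGHJKLM",
    "3": "PQUVWXYZ./",
}
ROW_FIGURES = set(BOARD) - {""}

ENCODE = {ch: row + col
          for row, letters in BOARD.items()
          for col, ch in zip(COLUMNS, letters)}
DECODE = {figures: ch for ch, figures in ENCODE.items()}

def encrypt(plaintext):
    return "".join(ENCODE[ch] for ch in plaintext.upper() if ch in ENCODE)

def split_figures(cryptogram):
    digits = [d for d in cryptogram if d.isdigit()]  # ignore group spacing
    tokens, i = [], 0
    while i < len(digits):
        # A row figure always takes the following figure with it.
        width = 2 if digits[i] in ROW_FIGURES else 1
        tokens.append("".join(digits[i:i + width]))
        i += width
    return tokens

def decrypt(cryptogram):
    return "".join(DECODE[t] for t in split_figures(cryptogram))

print("-".join(split_figures("645636331016478150")))
# 64-5-63-63-31-0-1-64-7-8-1-5-0
```

Reading left to right is enough: no look-ahead or backtracking is ever needed, which is what makes the scheme practical for hand decryption.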
The following cryptogram uses the above table, but with the coordinates in a different order. Try and see if it is breakable; the plaintext is in English, military language:
13492 09610 41763 07431 46918 65737 67721 86111 11581 71559 14176 30710
A variant of the monome-dinome checkerboard for the English language looks like this:
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 |
| | N | O | T | A | R | I | E | S |
| 6 | B | C | D | F | G | H | K | L |
| 3 | M | P | Q | U | V | W | X | Y |
With this type, the figures used as row indicators have only this function - they can never indicate a column - so their frequencies in the resulting cryptograms will be somewhat reduced, and these figures don't stand out quite as much as they would otherwise.
One drawback is that J is missing, so I will have to stand for it; likewise Z is missing and will have to be encoded using some other letter, perhaps X.
Here are a few checkerboards of the monome-dinome type, optimized for other languages:
| Swedish |
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | S | T | R | A | N | D | E | K | | |
| 6 | B | C | F | G | H | I | J | L | M | O |
| 3 | P | Q | U | V | X | Y | Z | Å | Ä | Ö |
| Spanish |
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | E | S | T | A | D | O | Y | | | |
| 9 | B | C | F | G | H | I | J | | | |
| 6 | K | L | M | N | Ñ | P | Q | | | |
| 3 | R | U | V | W | X | Z | | | | |
| German |
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | D | E | I | N | | | S | T | A | R |
| 8 | B | C | F | G | H | J | K | L | M | O |
| 5 | P | Q | U | V | W | X | Y | Z | . | , |
| Esperanto-1 |
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | E | N | | L | I | S | T | O | | |
| 1 | A | B | | C | Ĉ | D | F | G | | |
| 6 | Ĝ | H | | Ĥ | J | Ĵ | K | M | | |
| 3 | P | R | | Ŝ | U | Ŭ | V | Z | | |
| Esperanto-2 |
| | 7 | 4 | 1 | 0 | 8 | 5 | 2 | 9 | 6 | 3 |
| | E | N | | L | I | S | T | O | | A |
| 1 | B | C | Ĉ | D | F | G | Ĝ | H | Ĥ | J |
| 6 | Ĵ | K | M | P | R | Ŝ | U | Ŭ | V | Z |
Viking Cryptography
The idea of the simpler checkerboard cipher just described, together with a Caesar-like cipher, was to some extent used by the old Vikings. To explain the systems used, we must first take a brief look at the Rune Alphabet:
The Scandinavian Normal Runes
On the famous runestone of Rök in Östergötland, Sweden, both cipher types are used.
In one passage, the reader gets the text a i r f b f r b n h n, which is as incomprehensible to speakers of Old Norse as it is to you. The trick here is to read, for each rune, the one immediately following it in the Rune alphabet, and then the text becomes s a k u m u k m i n i. A scholar fluent in Old Norse will still have some trouble understanding this, since the Vikings never bothered to write double runes, even when one word ended in the same letter as the next word started with, and the carver of the Rökstone didn't bother to use word separators either, but ran the text together in a long row. So, the text can be read as "sakum ukmini", meaning "I say to the youth"/"I tell the young man", or, if it is read "sakum mukmini", it will mean "I tell the great memory" (Don't ask me why the carver had to encrypt this).
There are several checkerboard ciphers of various kinds found on the Rökstone. One of them looks like this:
One of the Rökstone ciphers
To read this text, one first counts the number of hooks pointing to the
left and then the hooks pointing right of each individual vertical
stroke, putting them together as two-figure numbers. In the last
portion, the same is done, starting with the symbols having a topstroke
pointing left and grouping the number of them together with the ones
whose topstroke points to the right. When writing the numbers out, it
will look like this:
25 24 36 32 13 32 36 13 23 22 23 - 33 32 35
If we consult the Rune Alphabet above, we can easily decipher this text. The first figure of every number tells us to which rune family the sought rune belongs - for some reason the three families are always numbered backwards - and the second figure tells us which rune to read within that family. When the numbers are replaced by the right runes and transliterated into the Latin alphabet, the text reads: s a k u m u k m i n i th u r.
The reader might recognize the first part, which is the same as in the cipher explained previously. This is followed by the word Thor - the Norse god of thunderstorms, a very powerful figure in Norse mythology, perhaps explaining the need for encryption.
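Both Rök readings can be reproduced with a small Python sketch. It rests on an assumption: the rune order used here is the usual 16-rune sequence in Latin transliteration (f u th o r k, h n i a s, t b m l R), which is taken to match the Rune Alphabet figure. The spellings "o" and "R", the wrap-around after the last rune, and the function names are likewise choices made for this illustration only.

```python
# Assumed transliterated rune order; the three families are numbered
# backwards, so the first family (f u th o r k) carries the number 3.
FAMILIES = {
    3: ["f", "u", "th", "o", "r", "k"],
    2: ["h", "n", "i", "a", "s"],
    1: ["t", "b", "m", "l", "R"],
}
RUNE_ROW = FAMILIES[3] + FAMILIES[2] + FAMILIES[1]

def read_next_rune(cipher_runes):
    """Caesar-like reading: take the rune that follows in the rune row."""
    return [RUNE_ROW[(RUNE_ROW.index(r) + 1) % len(RUNE_ROW)]
            for r in cipher_runes]

def read_family_numbers(numbers):
    """Checkerboard reading: first figure = family, second = place in it."""
    return [FAMILIES[int(n[0])][int(n[1]) - 1] for n in numbers]

print(" ".join(read_next_rune("a i r f b f r b n h n".split())))
# s a k u m u k m i n i
print(" ".join(read_family_numbers(
    "25 24 36 32 13 32 36 13 23 22 23 33 32 35".split())))
# s a k u m u k m i n i th u r
```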
Simple Substitution using an Unordered Alphabet
In the systems described so far the normal sequence of the alphabet has been used, but one can of course use an unordered sequence of letters or numbers as the cipher alphabet. The classical method uses a keyword to achieve this. Any word or phrase will do, but every repeated letter must be deleted after its first occurrence. The keyword is written first, followed by the remaining letters of the alphabet in their normal order. If the keyword is RAMSES, the following cipher key - among several possible - can be constructed:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | R | A | M | S | E | B | C | D | F | G | H | I | J | K | L | N | O | P | Q | T | U | V | W | X | Y | Z |
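The construction of such a key can be sketched in Python. The function names keyword_alphabet and encrypt are made up for this example, and the sketch assumes the plain 26-letter Latin alphabet.

```python
import string

def keyword_alphabet(keyword):
    """Keyword first (repeats dropped), then the unused letters in order."""
    letters = [c for c in keyword.upper() + string.ascii_uppercase
               if c in string.ascii_uppercase]
    # dict.fromkeys keeps the first occurrence of each letter, in order.
    return "".join(dict.fromkeys(letters))

def encrypt(plaintext, keyword):
    table = str.maketrans(string.ascii_lowercase, keyword_alphabet(keyword))
    return plaintext.lower().translate(table)

print(keyword_alphabet("RAMSES"))  # RAMSEBCDFGHIJKLNOPQTUVWXYZ
```

Letters not in the alphabet, such as spaces, pass through translate unchanged.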
A major drawback of this system is that towards the end of the alphabet the plaintext letters tend to be encrypted as themselves if the keyword doesn't contain, say, an "X", "Y", or "Z". To counter this, the users can agree to start writing the keyword and the rest of the letters at a different starting position than the letter "A". The starting position can even be varied from message to message, and this information can be hidden somewhere in the cryptogram. For instance, when the keyword starts under the letter "f", the result will be the following key:
| Plaintext letters: | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
| Cipher letters: | V | W | X | Y | Z | R | A | M | S | E | B | C | D | F | G | H | I | J | K | L | N | O | P | Q | T | U |
Suppose the starting position is hidden as the third letter of the resulting cryptogram; the following table then illustrates the use of the above key:
| Message: | d | e | * | s | p | e | r | a | t | e | | n | e | e | d |
| Cipher: | Y | Z | F | K | H | Z | J | V | L | Z | | F | Z | Z | Y |
| Message: | o | f | | s | u | p | p | l | i | e | s |
| Cipher: | G | R | | K | N | H | H | C | S | Z | K |
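The shifted key and the hidden indicator can be sketched in Python as below. The function names and the position parameter are hypothetical, and the sketch assumes, as in the table above, that the indicator is the letter under which the keyword starts, written in clear as the third character of the cryptogram.

```python
import string

def shifted_keyword_alphabet(keyword, start):
    """Keyword alphabet written so that the keyword begins under `start`."""
    letters = [c for c in keyword.upper() + string.ascii_uppercase
               if c in string.ascii_uppercase]
    mixed = "".join(dict.fromkeys(letters))
    offset = string.ascii_lowercase.index(start.lower())
    if offset == 0:
        return mixed
    # Rotate right: the tail of the mixed alphabet fills the cells before `start`.
    return mixed[-offset:] + mixed[:-offset]

def encrypt(plaintext, keyword, start, position=3):
    key = shifted_keyword_alphabet(keyword, start)
    body = plaintext.lower().translate(str.maketrans(string.ascii_lowercase, key))
    i = position - 1
    return body[:i] + start.upper() + body[i:]  # slip the indicator in

def decrypt(cryptogram, keyword, position=3):
    i = position - 1
    start = cryptogram[i]                       # recover the indicator
    body = cryptogram[:i] + cryptogram[i + 1:]
    key = shifted_keyword_alphabet(keyword, start)
    return body.translate(str.maketrans(key, string.ascii_lowercase))

print(encrypt("desperate need of supplies", "RAMSES", "f"))
# YZFKHZJVLZ FZZY GR KNHHCSZK
```

The sketch counts characters, not letters, when placing the indicator, so it presumes that the first word of the message is at least two letters long.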
© Torbjörn Andersson. Torbjörn Andersson Fecit