(Last update 19-Sep-2010)
Codes are a special kind of cryptosystem, closely related to substitution ciphers. Whereas a cipher substitutes individual letters - or sometimes groups of letters of a fixed length - with other cryptosymbols, a codesystem operates on whole words and/or phrases, substituting these with codegroups. Usually these codegroups are of a fixed length and made up entirely of letters or entirely of figures.
Codesystems require the use of codebooks, or - in the case of a very small code - codecharts, listing the plaintext words and phrases together with the allotted codegroups. Since any language has far more words than an average codebook can list, the book commonly also contains codegroups representing the letters of the alphabet, syllables, numbers, punctuation marks, grammatical terms and so on.
In an ordered code, the plaintext entries are listed alphabetically and the codegroups are allotted to these entries in numerical order - in case the codegroups consist of figures - or in alphabetical order - in case the codegroups consist of letters. A small sample from the beginning of one such code might look something like this:
| Plaintext | Codegroup | Plaintext | Codegroup |
|---|---|---|---|
| A | 00001 | address | 00051 |
| -ab- | 00002 | addressee | 00052 |
| abandon | 00003 | adjacent | 00053 |
| abide | 00004 | adjust | 00054 |
| able | 00005 | adjutant | 00055 |
| ... | ... | ... | ... |
At the turn of the last century, when the telegraph grew more and more popular, non-secret codes began to be manufactured for the public. The purpose of these codes was to reduce the cost of a cable by listing frequently occurring sentences with short codegroups in large codebooks. When sending a cable, one consulted one of the many different codes that were around (after making sure the receiver also held the same code), and with its help tried to reduce the length of the cable, and thus the cost. One of the first pages of one such code - Bentley's Complete Phrase Code - is a good example of what a page of an ordered code can look like.
To encode a message like "Await instructions before taking further action. Acknowledge receipt of this telegram by wire." only takes two codegroups: ADTUR ADKUH. Quite a reduction in cost!
Both encoding and decoding are easily done with the same book, since both plaintext and codegroups follow in their normal order. Unfortunately, this also greatly helps an enemy trying to break a secret ordered code, so to counter this, unordered codes were invented.
In an unordered code, the codegroups are allotted to the plaintext in random fashion. If the code is a big one (i.e. not a small chart), one needs two books or lists. In one, the plaintext is listed in alphabetic order together with the codegroups in their mixed order. In the other, the codegroups are listed in (numerical or alphabetical) order with the plaintext in mixed order. A small sample will make things clearer:
| Encoding section: Plaintext | Encoding section: Codegroup | Decoding section: Codegroup | Decoding section: Plaintext |
|---|---|---|---|
| ... | ... | ... | ... |
| Stop | 7404 | 3729 | Strong |
| Stopped | 4017 | 3730 | A |
| Storm | 2809 | 3731 | Was |
| Strength | 3318 | 3732 | Does not |
| Strike | 5056 | 3733 | Will be |
| Strong | 3729 | 3734 | And |
| Succeed | 0047 | 3735 | Unit |
| Success | 6395 | 3736 | Enemy |
| ... | ... | ... | ... |
A big sample can be viewed here: Swedish Diplomatic Code.
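The two-section layout of an unordered code can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the function name `build_two_part_code`, the four-figure group size and the use of a seeded pseudo-random generator are not taken from any real codebook (a real code would of course not be generated from a guessable seed).

```python
import random


def build_two_part_code(vocabulary, seed=None):
    """Return (encoding, decoding) sections of a hypothetical unordered code."""
    rng = random.Random(seed)
    # Draw distinct four-figure codegroups in random order, one per entry.
    numbers = rng.sample(range(10000), len(vocabulary))
    # Encoding section: plaintext in alphabetical order, codegroups mixed.
    encoding = {word: f"{n:04d}" for word, n in zip(sorted(vocabulary), numbers)}
    # Decoding section: the same pairs, re-sorted by codegroup.
    pairs = sorted(encoding.items(), key=lambda pair: pair[1])
    decoding = {group: word for word, group in pairs}
    return encoding, decoding


words = ["Stop", "Stopped", "Storm", "Strength",
         "Strike", "Strong", "Succeed", "Success"]
encoding, decoding = build_two_part_code(words, seed=2010)

codetext = [encoding[w] for w in ["Storm", "Stopped"]]
plaintext = [decoding[g] for g in codetext]
print(codetext, plaintext)
```

The encoder looks words up in `encoding` and the decoder looks groups up in `decoding`; neither section is convenient to use in the other direction, which is why a large unordered code needs both.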
Unordered codes can pose a difficult problem for the enemy cryptanalyst, especially if the intercepted material is small. The problem facing the legitimate users is that replacing the code with a new edition is no small thing, so a codebook usually sees heavy use and thus provides the enemy with a lot of traffic. The countermeasure to this is to use superencipherment.
Superencipherment of a code can be achieved in a number of ways, all of which serve to hide the actual codegroups from the enemy cryptanalyst. The most common way to do this, if the codegroups consist of figures, is to use an additive.
The additive is a - usually very long - series of figures listed as groups in a table
or book of its own. The user starts somewhere in this series, and allots groups from the
additive to each group of the coded message, and adds them together (almost as in
Gronsfeld's cipher) modulo 10, the sums being the cryptogram to be sent.
The receiver, who must have the same additive series and knowledge of where the sender
started to pick out groups, subtracts these modulo 10 from the received cryptogram,
to get the naked codetext.
Modulo 10 addition/subtraction may puzzle some of you reading this, so here is a quick explanation: when adding two single figures modulo 10, one simply keeps the last figure of the sum if the sum is greater than nine, and forgets everything one learned in school about carrying. For example, 8 + 5 gives 3.
When subtracting modulo 10, one automatically adds 10 to the first number if the result would otherwise be negative. For example, 3 - 5 is worked as 13 - 5 and gives 8.
(N.B. This is the practical explanation, which probably won't please the mathematicians out there, but they hardly need modulo arithmetic explained anyway.)
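For readers who prefer code, the same rule can be written as a minimal Python sketch. The function names `add_mod10` and `sub_mod10` are simply chosen for this illustration.

```python
def add_mod10(a, b):
    """Add two single figures and keep only the last figure of the sum."""
    return (a + b) % 10


def sub_mod10(a, b):
    """Subtract b from a, adding 10 when the result would be negative."""
    return (a - b) % 10


print(add_mod10(7, 6))  # 7 + 6 = 13, keep the 3
print(sub_mod10(3, 6))  # 13 - 6 = 7
```

Python's `%` operator already returns a non-negative result for a positive modulus, so no explicit "add 10" step is needed.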
I'll give an example of the use of the additive superencipherment method using this additive key:
81855 06392 93111 72993 95106 30217 25634 33084 01669 17442
95745 76799 13525 85433 66391 63054 24755 51069 06037 50362
10815 30580 71285 74122 53029 05471 80545 55717 85607 56281
The basic code is ordered and the text Enemy force moving east towards your sector is first coded like this:
| enemy | force | move | -ing | east | towards | your | sector |
|---|---|---|---|---|---|---|---|
| 25348 | 31800 | 55362 | 43915 | 25724 | 94039 | 99151 | 78673 |
Starting with the first group of the additive, reading the following groups off from left to right, and adding them modulo 10 to the above code text gives the following result:
| enemy | force | move | -ing | east | towards | your | sector |
|---|---|---|---|---|---|---|---|
| 25348 | 31800 | 55362 | 43915 | 25724 | 94039 | 99151 | 78673 |
| 81855 | 06392 | 93111 | 72993 | 95106 | 30217 | 25634 | 33084 |
| + | + | + | + | + | + | + | + |
| 06193 | 37192 | 48473 | 15808 | 10820 | 24246 | 14785 | 01657 |
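The worked example can be checked with a short Python sketch. The helper names (`combine`, `encipher`, `decipher`) and the `start` parameter are my own; only the additive groups and the code text are taken from the example.

```python
def combine(group, key, sign=1):
    """Add (sign=1) or subtract (sign=-1) two groups figure by figure, modulo 10."""
    return "".join(str((int(g) + sign * int(k)) % 10) for g, k in zip(group, key))


def encipher(codegroups, additive, start=0):
    """Add successive additive groups, beginning at index `start`, to the code text."""
    return [combine(g, k) for g, k in zip(codegroups, additive[start:])]


def decipher(cryptogroups, additive, start=0):
    """Subtract the same additive groups to recover the naked code text."""
    return [combine(g, k, sign=-1) for g, k in zip(cryptogroups, additive[start:])]


# First row of the additive key and the code text from the example.
ADDITIVE = "81855 06392 93111 72993 95106 30217 25634 33084 01669 17442".split()
CODETEXT = "25348 31800 55362 43915 25724 94039 99151 78673".split()

cryptogram = encipher(CODETEXT, ADDITIVE)
print(" ".join(cryptogram))
# 06193 37192 48473 15808 10820 24246 14785 01657
```

Calling `decipher(cryptogram, ADDITIVE)` returns the original code text, provided the receiver uses the same starting position.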
Together with the final cryptogram in the last row, it is also necessary to communicate where in the additive to start. For this reason, the additive is usually paginated (if it runs over several pages), and the rows and columns are indexed in some way, so that it is possible to form a numeric group giving the page, row, and column of the starting position. This information is usually hidden in the cryptogram as a special group. For instance, it could be inserted as the fourth group from the beginning, after the third cryptogroup has been added to it, modulo 10.
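The hiding scheme given as an instance (fourth group, masked with the third cryptogroup) might be sketched as follows. The indicator layout used here, two figures for the page, two for the row and one for the column, is purely hypothetical, as are the function names.

```python
def combine(group, key, sign=1):
    """Add (sign=1) or subtract (sign=-1) two groups figure by figure, modulo 10."""
    return "".join(str((int(g) + sign * int(k)) % 10) for g, k in zip(group, key))


def hide_indicator(cryptogram, indicator):
    """Mask the indicator with the third cryptogroup and insert it as the fourth group."""
    masked = combine(indicator, cryptogram[2])
    return cryptogram[:3] + [masked] + cryptogram[3:]


def recover_indicator(message):
    """Return (indicator, cryptogram) from a received message."""
    indicator = combine(message[3], message[2], sign=-1)
    return indicator, message[:3] + message[4:]


cryptogram = "06193 37192 48473 15808 10820 24246 14785 01657".split()
indicator = "01011"  # hypothetical layout: page 01, row 01, column 1

message = hide_indicator(cryptogram, indicator)
print(" ".join(message))
print(recover_indicator(message)[0])
```

The receiver first strips the indicator with `recover_indicator`, looks up the starting position, and only then subtracts the additive from the remaining groups.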
If you feel up to it, here is a series of cryptograms superenciphered with an additive, which you can try to break: A Superenciphered Code Exercise.
Another superencipherment system sometimes used involves a substitution chart. In a German WWI code using an ordered code with three-figure codegroups, the first two digits of every such group were superenciphered with a chart of 10 by 10 cells, called a Geheimklappe ("secret flap"). Different charts were used by different divisions, and they were also changed from time to time. One such Geheimklappe looked like this:
| Encoding table | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Decoding table | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 23 | 48 | 60 | 05 | 78 | 35 | 58 | 64 | 29 | 52 | 0 | 87 | 22 | 16 | 60 | 73 | 03 | 44 | 99 | 19 | 36 |
| 1 | 20 | 77 | 33 | 59 | 21 | 70 | 02 | 40 | 63 | 08 | 1 | 48 | 20 | 91 | 84 | 76 | 68 | 65 | 97 | 33 | 41 |
| 2 | 11 | 49 | 01 | 69 | 47 | 41 | 79 | 74 | 22 | 42 | 2 | 10 | 14 | 28 | 00 | 52 | 71 | 80 | 56 | 49 | 08 |
| 3 | 32 | 76 | 39 | 18 | 75 | 30 | 09 | 51 | 80 | 65 | 3 | 35 | 54 | 30 | 12 | 75 | 05 | 93 | 77 | 79 | 32 |
| 4 | 61 | 19 | 43 | 81 | 06 | 56 | 73 | 62 | 10 | 28 | 4 | 17 | 25 | 29 | 42 | 66 | 86 | 95 | 24 | 01 | 21 |
| 5 | 85 | 50 | 24 | 88 | 31 | 84 | 27 | 90 | 55 | 57 | 5 | 51 | 37 | 09 | 63 | 82 | 58 | 45 | 59 | 06 | 13 |
| 6 | 03 | 91 | 96 | 53 | 68 | 16 | 44 | 89 | 15 | 87 | 6 | 02 | 40 | 47 | 18 | 07 | 39 | 88 | 89 | 64 | 23 |
| 7 | 97 | 25 | 71 | 04 | 95 | 34 | 14 | 37 | 93 | 38 | 7 | 15 | 72 | 81 | 46 | 27 | 34 | 31 | 11 | 04 | 26 |
| 8 | 26 | 72 | 54 | 92 | 13 | 83 | 45 | 00 | 66 | 67 | 8 | 38 | 43 | 96 | 85 | 55 | 50 | 90 | 69 | 53 | 67 |
| 9 | 86 | 12 | 98 | 36 | 99 | 46 | 82 | 17 | 94 | 07 | 9 | 57 | 61 | 83 | 78 | 98 | 74 | 62 | 70 | 92 | 94 |
When encrypting, the first figure of a codegroup is used as a row-index to the leftmost table, and the second is used as a column-index. These two figures are substituted by the ones found at the intersection in the encoding table, and the third figure of the original codegroup is appended as it is. To decode a received codegroup, one uses the rightmost table in the same way, since it is the inverse of the encoding table.
The codegroup 153 meaning Gegner geht zurück (="Enemy is retreating") will be superenciphered as 703 using the above keychart.
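The chart lookup can be sketched in Python as below. The encoding rows are copied from the chart above; the names `ENCODE`, `DECODE` and `superencipher` are mine, and the decoding table is derived here by inverting the encoding table rather than typed in separately.

```python
# Rows 0-9 of the encoding table; the position in each row is the column index.
ENCODE_ROWS = [
    "23 48 60 05 78 35 58 64 29 52",
    "20 77 33 59 21 70 02 40 63 08",
    "11 49 01 69 47 41 79 74 22 42",
    "32 76 39 18 75 30 09 51 80 65",
    "61 19 43 81 06 56 73 62 10 28",
    "85 50 24 88 31 84 27 90 55 57",
    "03 91 96 53 68 16 44 89 15 87",
    "97 25 71 04 95 34 14 37 93 38",
    "26 72 54 92 13 83 45 00 66 67",
    "86 12 98 36 99 46 82 17 94 07",
]

# Map "row figure + column figure" to the two figures found at the intersection.
ENCODE = {
    f"{row}{col}": pair
    for row, line in enumerate(ENCODE_ROWS)
    for col, pair in enumerate(line.split())
}
# The decoding table is simply the inverse mapping.
DECODE = {pair: index for index, pair in ENCODE.items()}


def superencipher(group, table=ENCODE):
    """Substitute the first two figures of a codegroup; keep the rest as it is."""
    return table[group[:2]] + group[2:]


print(superencipher("153"))          # Gegner geht zurück
print(superencipher("703", DECODE))  # back to the basic codegroup
```

Passing `DECODE` as the table turns the same function into the decoding step, mirroring the way the two halves of the chart are used.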
A similar superencipherment system was used by the Soviet Baltic Navy during World War II.
The Soviet codegroups were four figures long and these were split up into pairs and then
superenciphered with a chart and recombined into four-figure groups prior to transmission.
The Soviet Baltic Navy four-figure-code was successfully attacked and read by the Swedish
signal intelligence organisation during WWII, probably due to the fact that the basic codebook
was ordered and saw heavy use.
You can read more about the Soviet four-figure-code and other codes on the
"From the Archives" page.
In military situations, small code charts are often used as low-level tactical cryptosystems. A few examples of code charts will be given, the first an Austrian one, probably from WWI or earlier:
|   | 3 | 6 | 0 | 7 | 4 | 8 | 1 | 9 | 5 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | a | ä | ai | au | äu | b | c | ch | ck | d |
| 6 | e | ei | eu | f | ff | g | h | i | ie | j |
| 3 | k | l | ll | m, mm | n, nn | o | ö | p | pp | r |
| 7 | s | sch | sp | spr | ss | st | str | t | tt | u |
| 4 | ü | v | w | x | y | z | 0 | 1 | 2 | 3 |
| 0 | 4 | 5 | 6 | 7 | 8 | 9 | . | , | ; | ? |
| 8 | Ab- | Armee | Artillrie | Ba- | Batterie | Brigade | Brücke | Division | Eisen- | Eskadron |
| 1 | Feld | Flieger | Flugzeug | Genie | Geschütz | Ge- | Gruppe | Infanterie | Jäger | Kanone |
| 9 | Ka- | Kom- | Kom- | Korps | Mann | Mörser | Munition | Offizier | Pferd | Pionier |
| 5 | Regiment | Sanität | Sappeure | Schützen | Stab | Staffel | Train | Truppe | Wache | Zug |
The figures at the left side and top row are used as coordinates to the cells of the chart. To encrypt Zug mit Munition angekommen (="Train with ammunition has arrived."), the first word Zug is found in the lower right hand corner, or row 5, column 2, so the codegroup will be '52'. The next word mit will have to be spelled out as '37 69 79'. The whole message put into code will look like this:
52 37 69 79 91 23 34 68 63 33 38 37 63 34
(Standard radio signalling practice would run the groups together, though, and split the result up into five-figure groups, so that garbles would be easier to spot and correct. The message from above would thus appear in the ether like this: "52376 97991 23346 86333 38376 334". Also, the last group would in most cases be filled out, to make it five figures long as well.)
As can be seen, in some instances whole words are encrypted as single codegroups, but as the chart is small the number of words is limited, so a lot of text will have to be spelled out. Since no variants are given for the more common letters (e is represented by 63 every time), this system is likely to be broken by the enemy (and probably was) if it is used a lot without changing the coordinates.
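The coordinate lookup and the regrouping into five-figure groups can be sketched in Python. Only the chart rows needed for the sample message are copied here, the plaintext is split into chart entries by hand, and all names (`CHART_ROWS`, `LOOKUP`, `encode`, `five_figure_groups`) are my own.

```python
COLUMNS = "3607481952"  # top row of the chart, read left to right

# Row figure -> the ten cells of that row (an excerpt of the chart).
CHART_ROWS = {
    "2": "a ä ai au äu b c ch ck d",
    "6": "e ei eu f ff g h i ie j",
    "3": "k l ll m,mm n,nn o ö p pp r",
    "7": "s sch sp spr ss st str t tt u",
    "9": "Ka- Kom- Kom- Korps Mann Mörser Munition Offizier Pferd Pionier",
    "5": "Regiment Sanität Sappeure Schützen Stab Staffel Train Truppe Wache Zug",
}

LOOKUP = {}
for row, cells in CHART_ROWS.items():
    for col, cell in zip(COLUMNS, cells.split()):
        for entry in cell.split(","):  # a cell such as "m,mm" holds two entries
            # setdefault keeps the first position of an entry listed twice
            LOOKUP.setdefault(entry, row + col)


def encode(entries):
    """Replace each chart entry by its row figure followed by its column figure."""
    return [LOOKUP[entry] for entry in entries]


def five_figure_groups(groups):
    """Run the two-figure groups together and split them into groups of five."""
    figures = "".join(groups)
    return " ".join(figures[i:i + 5] for i in range(0, len(figures), 5))


# "Zug mit Munition angekommen", split by hand into chart entries.
message = ["Zug", "m", "i", "t", "Munition",
           "a", "n", "g", "e", "k", "o", "mm", "e", "n"]
groups = encode(message)
print(" ".join(groups))
print(five_figure_groups(groups))
```

Padding the final short group is left out of this sketch, since the filler to use is not specified.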
If it's known that the type of traffic to be put into code isn't very stereotyped and is therefore likely to use a large vocabulary, a syllabary square can be used. There are - of course - many possible constructions, but one sometimes found in cryptographic literature looks like this:
|   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A | 1 | AL | AN | AND | AR | ARE | AS | AT | ATE |
| 2 | ATI | B | 2 | BE | C | 3 | CA | CE | CO | COM |
| 3 | D | 4 | DA | DE | E | 5 | EA | ED | EN | ENT |
| 4 | ER | ERE | ERS | ES | EST | F | 6 | G | 7 | H |
| 5 | 8 | HAS | HE | I | 9 | IN | ING | ION | IS | IT |
| 6 | IVE | J | 0 | K | L | LA | LE | M | ME | N |
| 7 | ND | NE | NT | O | OF | ON | OR | OU | P | Q |
| 8 | R | RA | RE | RED | RES | RI | RO | S | SE | SH |
| 9 | ST | STO | T | TE | TED | TER | TH | THE | THI | THR |
| 0 | TI | TO | U | V | VE | W | WE | X | Y | Z |
Normally, a two-figure system will double the length of the text (since two figures have to be used to encrypt every individual letter of the plaintext), but in a syllabary system - if it is carefully constructed, like the above one - the resulting cryptogram is in most cases shorter. In the above chart clusters of up to three letters will be encrypted by single two-figure codegroups, and if we encrypt the sample text from above (in English this time) Train with ammunition has arrived, it will look like this:
| t | ra | in | w | it | h | a | m | m | u | n | it | ion | has | ar | ri | ve | d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 93 | 82 | 56 | 06 | 50 | 40 | 11 | 68 | 68 | 03 | 60 | 50 | 58 | 52 | 16 | 86 | 05 | 31 |
This is 18 groups, which is only 2 more than if we had used single codegroups for train and ammunition and spelled the rest of the message letter by letter, as in the Austrian system. If we had used one codegroup per letter, the resulting cryptogram would have been 29 groups long. The two repeated groups, '50' and '68', stand for it and m, which are of much lower frequency than, say, e, and are therefore not the first plaintext guesses that the cryptanalyst would make; not an unimportant feature.
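How the text is cut into chart entries is not prescribed; one simple rule is to take the longest entry that fits at each position. The Python sketch below assumes that greedy rule (my assumption, not a documented procedure), and for this particular message it happens to give the same 18 groups as the hand encoding. The names `SQUARE`, `LOOKUP` and `encode` are mine.

```python
COORDINATES = "1234567890"  # used for both rows and columns

SQUARE = [
    "A 1 AL AN AND AR ARE AS AT ATE",
    "ATI B 2 BE C 3 CA CE CO COM",
    "D 4 DA DE E 5 EA ED EN ENT",
    "ER ERE ERS ES EST F 6 G 7 H",
    "8 HAS HE I 9 IN ING ION IS IT",
    "IVE J 0 K L LA LE M ME N",
    "ND NE NT O OF ON OR OU P Q",
    "R RA RE RED RES RI RO S SE SH",
    "ST STO T TE TED TER TH THE THI THR",
    "TI TO U V VE W WE X Y Z",
]

LOOKUP = {
    cell: row + col
    for row, line in zip(COORDINATES, SQUARE)
    for col, cell in zip(COORDINATES, line.split())
}


def encode(text):
    """Encode text, taking the longest chart entry available at each position."""
    letters = "".join(ch for ch in text.upper() if ch.isalnum())
    groups, i = [], 0
    while i < len(letters):
        for size in (3, 2, 1):
            chunk = letters[i:i + size]
            if chunk in LOOKUP:
                groups.append(LOOKUP[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no chart entry for {letters[i]!r}")
    return groups


groups = encode("Train with ammunition has arrived.")
print(len(groups), " ".join(groups))
```

Spaces and punctuation are dropped before encoding, so entries may span word boundaries in other messages; a careful encoder might prefer to cut the text by hand.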
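Here is a similar code chart, specially constructed for encrypting messages in Esperanto. Each row can be indicated by any one of the letters listed at its left, and each column by any number within the range given at its top, so every cell has many equivalent codegroups.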
|   | 08-14 | 60-65 | 87-99 | 15-28 | 00-07 | 38-52 | 73-86 | 29-37 | 53-59 | 66-72 |
|---|---|---|---|---|---|---|---|---|---|---|
| C,K,R | A | 1 | AL | AN | AND | AR | ARE | AS | AT | ATE |
| L,P,Z | ATI | B | 2 | BE | C | 3 | CA | CE | CO | COM |
| E,W | D | 4 | DA | DE | E | 5 | EA | ED | EN | ENT |
| A,M,X | ER | ERE | ERS | ES | EST | F | 6 | G | 7 | H |
| G,J,Q | 8 | HAS | HE | I | 9 | IN | ING | ION | IS | IT |
| B,U | IVE | J | 0 | K | L | LA | LE | M | ME | N |
| D,F,V | ND | NE | NT | O | OF | ON | OR | OU | P | Q |
| N,S,Y | R | RA | RE | RED | RES | RI | RO | S | SE | SH |
| H,O | ST | STO | T | TE | TED | TER | TH | THE | THI | THR |
| I,T | TI | TO | U | V | VE | W | WE | X | Y | Z |
Using the message from above, "Train with ammunition has arrived.", it could be encoded thus:
O95 S61 Q43 I39 J67 X66 K09 B35 U29 I89 B71 G70 Q33 J64 R38 N39 T02 W14
Now, although the message has become longer, no repetitions at all are present, and the cryptanalyst's task has become significantly harder.
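A Python sketch of this chart shows why the repetitions disappear: the encoder may pick any of a row's letters and any number in a column's range, while the decoder maps all of them back to the same cell. The names are my own, and choosing the variants with Python's `random` module is an assumption made only for illustration.

```python
import random

COLUMN_RANGES = [(8, 14), (60, 65), (87, 99), (15, 28), (0, 7),
                 (38, 52), (73, 86), (29, 37), (53, 59), (66, 72)]
ROW_LETTERS = ["CKR", "LPZ", "EW", "AMX", "GJQ", "BU", "DFV", "NSY", "HO", "IT"]
CELLS = [row.split() for row in [
    "A 1 AL AN AND AR ARE AS AT ATE",
    "ATI B 2 BE C 3 CA CE CO COM",
    "D 4 DA DE E 5 EA ED EN ENT",
    "ER ERE ERS ES EST F 6 G 7 H",
    "8 HAS HE I 9 IN ING ION IS IT",
    "IVE J 0 K L LA LE M ME N",
    "ND NE NT O OF ON OR OU P Q",
    "R RA RE RED RES RI RO S SE SH",
    "ST STO T TE TED TER TH THE THI THR",
    "TI TO U V VE W WE X Y Z",
]]


def decode_group(group):
    """Look up a group such as 'O95': a row letter followed by a column number."""
    row = next(i for i, letters in enumerate(ROW_LETTERS) if group[0] in letters)
    number = int(group[1:])
    col = next(i for i, (lo, hi) in enumerate(COLUMN_RANGES) if lo <= number <= hi)
    return CELLS[row][col]


def encode_entry(entry, rng=random):
    """Pick one of the many equivalent groups for a chart entry at random."""
    for row, cells in enumerate(CELLS):
        if entry in cells:
            lo, hi = COLUMN_RANGES[cells.index(entry)]
            return f"{rng.choice(ROW_LETTERS[row])}{rng.randint(lo, hi):02d}"
    raise KeyError(entry)


cryptogram = ("O95 S61 Q43 I39 J67 X66 K09 B35 U29 "
              "I89 B71 G70 Q33 J64 R38 N39 T02 W14").split()
decoded = "".join(decode_group(group) for group in cryptogram)
print(decoded)
# TRAINWITHAMMUNITIONHASARRIVED
```

Because `encode_entry` chooses at random, encoding the same message twice will normally give two different cryptograms, both of which decode to the same text.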