Codes

(Last update 10-Feb-2001)


Introduction


Codes are a special kind of cryptosystem, closely related to substitution ciphers. Whereas a cipher replaces individual letters - or sometimes groups of letters of a fixed length - with other cryptosymbols, a codesystem operates on whole words and/or phrases, replacing these with codegroups. Usually these codegroups are of a fixed length and made up entirely of letters or entirely of figures.

Codesystems require the use of codebooks, or - in the case of a very small code - codecharts, listing the plaintext words and phrases together with the allotted codegroups. Since any language has far more words than an average codebook can list, the book commonly also contains codegroups representing the letters of the alphabet, syllables, numbers, punctuation marks, grammatical terms and so on, so that unlisted words can be spelled out.


Ordered codes


In an ordered code, the plaintext entries are listed alphabetically and the codegroups are allotted to these entries in numerical order - in case the codegroups consist of figures - or in alphabetical order - in case the codegroups consist of letters. A small sample from the beginning of one such code might look something like this:

Plaintext   Codegroup      Plaintext   Codegroup
A           00001          address     00051
-ab-        00002          addressee   00052
abandon     00003          adjacent    00053
abide       00004          adjust      00054
able        00005          adjutant    00055
...         ...            ...         ...

At the turn of the last century, when the telegraph grew more and more popular, non-secret codes began to be manufactured for the public. The purpose of these codes was to reduce the cost of a cable by listing frequently occurring sentences with short codegroups in large codebooks. When sending a cable, one consulted one of the many different codes that were around (after making sure the receiver also held the same code), and with its help tried to reduce the length of the cable, and thus the cost. One of the first pages of one such code, Bentley's Complete Phrase Code, is a good example of what a page of an ordered code can look like.

Using Bentley's code, a message like "Await instructions before taking further action. Acknowledge receipt of this telegram by wire." takes only two codegroups: ADTUR ADKUH. Quite a reduction in cost!

Both encoding and decoding are easily done with the same book, since both plaintext and codegroups follow in their normal order. Unfortunately, this also greatly helps the enemy trying to break a secret ordered code, so to counter this, unordered codes were invented.
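To make the idea concrete, here is a minimal Python sketch of how an ordered code could be laid out. The function name build_ordered_code and the five-entry vocabulary are assumptions made for illustration (the entries are borrowed from the sample table above); a real codebook would hold thousands of entries.

```python
def build_ordered_code(vocabulary, width=5):
    """Allot ascending numeric codegroups to alphabetically sorted entries."""
    # Sort without regard to case or to the hyphens marking syllables ("-ab-").
    entries = sorted(vocabulary, key=lambda entry: entry.strip("-").lower())
    return {entry: f"{number:0{width}d}"
            for number, entry in enumerate(entries, start=1)}


# Hypothetical mini-vocabulary, taken from the sample table above.
vocabulary = ["able", "abandon", "A", "abide", "-ab-"]
code = build_ordered_code(vocabulary)

for entry, group in code.items():
    print(f"{entry:10} {group}")

# Listing by plaintext and listing by codegroup give the same sequence,
# which is why a single book serves for both encoding and decoding.
by_plaintext = list(code)
by_codegroup = sorted(code, key=code.get)
print(by_plaintext == by_codegroup)
```

The same property is what helps the cryptanalyst: once a few codegroups are identified, the alphabetical position of every group in between is roughly known.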


Unordered codes


In an unordered code, the codegroups are allotted to the plaintext in random fashion. If the code is a big one (i.e. not a small chart), one needs two books or lists. In one, the plaintext is listed in alphabetic order together with the codegroups in their mixed order. In the other, the codegroups are listed in (numerical or alphabetical) order with the plaintext in mixed order. A small sample will make things clearer:

Encoding section            Decoding section
Plaintext   Codegroup       Codegroup   Plaintext
...         ...             ...         ...
Stop        7404            3729        Strong
Stopped     4017            3730        A
Storm       2809            3731        Was
Strength    3318            3732        Does not
Strike      5056            3733        Will be
Strong      3729            3734        And
Succeed     0047            3735        Unit
Success     6395            3736        Enemy
...         ...             ...         ...
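The two-part layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_unordered_code, the seed, and the eight-word vocabulary (borrowed from the sample above) are mine, and the random groups it draws will not match the ones in the table.

```python
import random


def build_unordered_code(vocabulary, width=4, seed=None):
    """Return (encoding, decoding) sections of a small unordered code."""
    rng = random.Random(seed)
    entries = sorted(vocabulary, key=str.lower)
    # Draw distinct codegroups at random from the whole figure range.
    groups = rng.sample(range(10 ** width), len(entries))
    # Encoding section: plaintext in alphabetical order, groups mixed.
    encoding = {entry: f"{group:0{width}d}"
                for entry, group in zip(entries, groups)}
    # Decoding section: the same pairs, re-sorted by codegroup.
    decoding = dict(sorted((group, entry) for entry, group in encoding.items()))
    return encoding, decoding


vocabulary = ["Stop", "Stopped", "Storm", "Strength",
              "Strike", "Strong", "Succeed", "Success"]
encoding, decoding = build_unordered_code(vocabulary, seed=2001)

for entry, group in encoding.items():
    print(f"{entry:10} {group}")
print()
for group, entry in decoding.items():
    print(f"{group}  {entry}")
```

Because the groups are allotted at random, knowing the meaning of one group says nothing about the meaning of its numerical neighbours.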

A big sample can be viewed here: Swedish Diplomatic Code.

Unordered codes can pose a difficult problem for the enemy cryptanalyst, especially if the intercepted material is small. The problem facing the legitimate users is that the codebook usually sees heavy use, since replacing the code with a new edition is no small thing, and this provides the enemy with a lot of traffic. The countermeasure is to use superencipherment.


Superencipherment


Superencipherment of a code can be achieved in a number of ways and serves to hide the actual codegroups from the enemy cryptanalyst. The most common way to do this, if the codegroups consist of figures, is to use an additive.

The additive is a - usually very long - series of figures listed as groups in a table or book of its own. The user starts somewhere in this series, allots groups from the additive to each group of the coded message, and adds them together (almost as in Gronsfeld's cipher) modulo 10, the sums being the cryptogram to be sent.
The receiver, who must have the same additive series and know where the sender started to pick out groups, subtracts these modulo 10 from the received cryptogram to get the naked codetext.

Modulo 10 addition/subtraction may puzzle some of you reading this, so here is a quick explanation: When adding two single figures modulo 10, one simply keeps the last figure of the sum if the sum is greater than nine, and forgets everything one learned in school about carrying.
When subtracting modulo 10, one automatically adds 10 to the first figure if the result would otherwise be negative.
(N.B. This is the practical explanation, which probably won't please the mathematicians out there, but they hardly need modulo arithmetic explained anyway.)
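For those who prefer to see it spelled out, the rule can be written as a small Python sketch. The function names add_mod10 and sub_mod10 are my own; each figure of a group is treated separately, so nothing is ever carried or borrowed between columns.

```python
def add_mod10(a: str, b: str) -> str:
    """Add two equal-length groups of figures column by column, without carry."""
    if len(a) != len(b):
        raise ValueError("groups must have the same length")
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))


def sub_mod10(a: str, b: str) -> str:
    """Subtract b from a column by column, without borrow."""
    if len(a) != len(b):
        raise ValueError("groups must have the same length")
    # Python's % already returns a non-negative result here, which is the
    # same as "adding 10 if the result would otherwise be negative".
    return "".join(str((int(x) - int(y)) % 10) for x, y in zip(a, b))


print(add_mod10("7", "8"))  # 7 + 8 = 15, keep the last figure: 5
print(sub_mod10("5", "8"))  # 5 - 8 is negative, so 15 - 8: 7
```

Subtracting a group that was just added always gives back the original group, which is what lets the receiver strip the additive off again.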

I'll give an example of the use of the additive superencipherment method using this additive key:

81855 06392 93111 72993 95106 30217 25634 33084 01669 17442
95745 76799 13525 85433 66391 63054 24755 51069 06037 50362
10815 30580 71285 74122 53029 05471 80545 55717 85607 56281

The basic code is ordered, and the text "Enemy force moving east towards your sector" is first coded like this:

enemy  force  move   -ing   east   towards  your   sector
25348  31800  55362  43915  25724  94039    99151  78673

Starting with the first group of the additive and proceeding from left to right when reading the rest of the groups off, and then adding them modulo 10 to the above code text, will give the following result:

enemy  force  move   -ing   east   towards  your   sector
25348  31800  55362  43915  25724  94039    99151  78673
81855  06392  93111  72993  95106  30217    25634  33084
+      +      +      +      +      +        +      +
06193  37192  48473  15808  10820  24246    14785  01657
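The worked example above can be reproduced with a short Python sketch. The function names (combine, superencipher, strip_additive) and the idea of giving the starting position as a simple group index are assumptions of mine; the additive key and the codetext are the ones given in the text.

```python
ADDITIVE = (
    "81855 06392 93111 72993 95106 30217 25634 33084 01669 17442 "
    "95745 76799 13525 85433 66391 63054 24755 51069 06037 50362 "
    "10815 30580 71285 74122 53029 05471 80545 55717 85607 56281"
).split()

CODETEXT = "25348 31800 55362 43915 25724 94039 99151 78673".split()


def combine(group, key, sign):
    """Add (sign=+1) or subtract (sign=-1) two groups figure by figure, modulo 10."""
    return "".join(str((int(g) + sign * int(k)) % 10) for g, k in zip(group, key))


def superencipher(codetext, additive, start=0):
    """Add consecutive additive groups, beginning at index start, to the codetext."""
    key = additive[start:start + len(codetext)]
    if len(key) < len(codetext):
        raise ValueError("additive series exhausted")
    return [combine(group, k, +1) for group, k in zip(codetext, key)]


def strip_additive(cryptogram, additive, start=0):
    """Subtract the same additive groups to recover the naked codetext."""
    key = additive[start:start + len(cryptogram)]
    if len(key) < len(cryptogram):
        raise ValueError("additive series exhausted")
    return [combine(group, k, -1) for group, k in zip(cryptogram, key)]


cryptogram = superencipher(CODETEXT, ADDITIVE)
print(" ".join(cryptogram))
print(" ".join(strip_additive(cryptogram, ADDITIVE)))
```

Starting at another group of the additive (say start=10) gives a completely different cryptogram for the same codetext, which is the whole point of the method.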

Together with the final cryptogram in the last row, it is also necessary to communicate where in the additive to start. For this reason, the additive is usually paginated (if it runs over several pages), and the rows and columns are indexed in some way, so that it is possible to form a numeric group giving the page, row, and column of the starting position. This information is usually hidden in the cryptogram as a special group. For instance, it could be inserted as the fourth group from the beginning, after the third cryptogroup has been added to it, modulo 10.
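A minimal Python sketch of that last suggestion follows. The indicator value "01011" and its layout (page 01, row 01, column 1) are purely hypothetical, as are the function names; only the placement rule (fourth group, masked by adding the third cryptogroup modulo 10) comes from the text, and the cryptogram is the one from the example above.

```python
def combine(group, key, sign=1):
    """Add (sign=+1) or subtract (sign=-1) two groups figure by figure, modulo 10."""
    return "".join(str((int(g) + sign * int(k)) % 10) for g, k in zip(group, key))


def hide_indicator(cryptogram, indicator):
    """Insert the indicator as the fourth group, masked by the third cryptogroup."""
    masked = combine(indicator, cryptogram[2])
    return cryptogram[:3] + [masked] + cryptogram[3:]


def recover_indicator(message):
    """Unmask the fourth group and return (indicator, remaining cryptogram)."""
    indicator = combine(message[3], message[2], sign=-1)
    return indicator, message[:3] + message[4:]


cryptogram = "06193 37192 48473 15808 10820 24246 14785 01657".split()
indicator = "01011"  # hypothetical layout: page 01, row 01, column 1

sent = hide_indicator(cryptogram, indicator)
print(" ".join(sent))
print(recover_indicator(sent))
```

The receiver must of course know beforehand which group carries the indicator and which group masks it.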

If you feel up to it, here is a series of cryptograms superenciphered with an additive, which you can try and break: A Superenciphered Code Exercise.


Another superencipherment system sometimes used involves a substitution chart. In a German WWI code using an ordered code with three-figure codegroups, the first two digits of every such group were superenciphered with a chart of 10 by 10 cells, called a Geheimklappe ("Secret flap"). Different charts were used by different divisions, and they also changed from time to time. One such Geheimklappe looked like this:

Encoding table                        Decoding table

    0  1  2  3  4  5  6  7  8  9          0  1  2  3  4  5  6  7  8  9
0  23 48 60 05 78 35 58 64 29 52      0  87 22 16 60 73 03 44 99 19 36
1  20 77 33 59 21 70 02 40 63 08      1  48 20 91 84 76 68 65 97 33 41
2  11 49 01 69 47 41 79 74 22 42      2  10 14 28 00 52 71 80 56 49 08
3  32 76 39 18 75 30 09 51 80 65      3  35 54 30 12 75 05 93 77 79 32
4  61 19 43 81 06 56 73 62 10 28      4  17 25 29 42 66 86 95 24 01 21
5  85 50 24 88 31 84 27 90 55 57      5  51 37 09 63 82 58 45 59 06 13
6  03 91 96 53 68 16 44 89 15 87      6  02 40 47 18 07 39 88 89 64 23
7  97 25 71 04 95 34 14 37 93 38      7  15 72 81 46 27 34 31 11 04 26
8  26 72 54 92 13 83 45 00 66 67      8  38 43 96 85 55 50 90 69 53 67
9  86 12 98 36 99 46 82 17 94 07      9  57 61 83 78 98 74 62 70 92 94

When encrypting, the first figure of a codegroup is used as a row-index to the leftmost table, and the second is used as a column-index. These two figures are substituted by the ones found at the intersection in the encoding table, and the third figure of the original codegroup is appended as it is. To decode a received codegroup, one uses the rightmost table in the same way, since it is the inverse of the encoding table.

The codegroup 153 meaning Gegner geht zurück (="Enemy is retreating") will be superenciphered as 703 using the above keychart.
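The chart lookup is easy to sketch in Python. The encoding table below is copied from the Geheimklappe above; the decoding table is derived from it by inversion rather than typed in, and the function name substitute is my own.

```python
ENCODING_TABLE = [
    "23 48 60 05 78 35 58 64 29 52".split(),
    "20 77 33 59 21 70 02 40 63 08".split(),
    "11 49 01 69 47 41 79 74 22 42".split(),
    "32 76 39 18 75 30 09 51 80 65".split(),
    "61 19 43 81 06 56 73 62 10 28".split(),
    "85 50 24 88 31 84 27 90 55 57".split(),
    "03 91 96 53 68 16 44 89 15 87".split(),
    "97 25 71 04 95 34 14 37 93 38".split(),
    "26 72 54 92 13 83 45 00 66 67".split(),
    "86 12 98 36 99 46 82 17 94 07".split(),
]

# Build the inverse chart: the cell addressed by an enciphered pair
# holds the original pair (row figure followed by column figure).
DECODING_TABLE = [[None] * 10 for _ in range(10)]
for row, cells in enumerate(ENCODING_TABLE):
    for col, pair in enumerate(cells):
        DECODING_TABLE[int(pair[0])][int(pair[1])] = f"{row}{col}"


def substitute(group, table):
    """Replace the first two figures of a three-figure group via the chart."""
    row, col = int(group[0]), int(group[1])
    return table[row][col] + group[2:]  # the third figure is kept as it is


print(substitute("153", ENCODING_TABLE))  # Gegner geht zurück
print(substitute("703", DECODING_TABLE))
```

Since the third figure passes through unchanged, the chart only disguises the first two figures of every group.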

A similar superencipherment system was used by the Soviet Baltic Navy during World War II. The Soviet codegroups were four figures long and these were split up into pairs and then superenciphered with a chart and recombined into four-figure groups prior to transmission.
The Soviet Baltic Navy four-figure-code was successfully attacked and read by the Swedish signal intelligence organisation during WWII, probably due to the fact that the basic codebook was ordered and saw heavy use.
You can read more about the Soviet four-figure-code and other codes on the "From the Archives" page.


Code charts


In military situations, small code charts are often used as low-level tactical cryptosystems. A few examples of code charts will be given, first an Austrian one, probably from WWI or earlier:

    3           6           0           7           4           8           1           9           5           2
2   a           ä           ai          au          äu          b           c           ch          ck          d
6   e           ei          eu          f           ff          g           h           i           ie          j
3   k           l           ll          m, mm       n, nn       o           ö           p           pp          r
7   s           sch         sp          spr         ss          st          str         t           tt          u
4   ü           v           w           x           y           z           0           1           2           3
0   4           5           6           7           8           9           .           ,           ;           ?
8   Abteilung   Armee       Artillerie  Bataillon   Batterie    Brigade     Brücke      Division    Eisenbahn   Eskadron
1   Feld        Flieger     Flugzeug    Genie       Geschütz    Geschwader  Gruppe      Infanterie  Jäger       Kanone
9   Kavallerie  Kompagnie   Kommando    Korps       Mann        Mörser      Munition    Offizier    Pferd       Pionier
5   Regiment    Sanität     Sappeure    Schützen    Stab        Staffel     Train       Truppe      Wache       Zug

The figures at the left side and top row are used as coordinates to the cells of the chart. To encrypt Zug mit Munition angekommen (="Train with ammunition has arrived."), the first word Zug is found in the lower right hand corner, or row 5, column 2, so the codegroup will be '52'. The next word mit will have to be spelled out as '37 69 79'. The whole message put into code will look like this:

52 37 69 79 91 23 34 68 63 33 38 37 63 34

(Standard radio signalling practice would run the groups together though, and split the result up into five figure groups, so garbles would be easier to spot and correct. So, the message from above would appear in the ether like this: "52376 97991 23346 86333 38376 334". Also, the last group would in most cases be filled out, to make it five figures long as well.)
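The encoding of this message can be traced with a rough Python sketch. The chart is the Austrian one from above; the rule I use for spelling (take a whole word if the chart lists it, otherwise always take the longest chart element that fits) is an assumption of mine, not a documented procedure, and the function names are likewise my own.

```python
COLUMNS = "3607481952"
ROWS = {  # row figure -> the ten cells of that row; "/" separates alternatives
    "2": "a ä ai au äu b c ch ck d",
    "6": "e ei eu f ff g h i ie j",
    "3": "k l ll m/mm n/nn o ö p pp r",
    "7": "s sch sp spr ss st str t tt u",
    "4": "ü v w x y z 0 1 2 3",
    "0": "4 5 6 7 8 9 . , ; ?",
    "8": "Abteilung Armee Artillerie Bataillon Batterie Brigade Brücke Division Eisenbahn Eskadron",
    "1": "Feld Flieger Flugzeug Genie Geschütz Geschwader Gruppe Infanterie Jäger Kanone",
    "9": "Kavallerie Kompagnie Kommando Korps Mann Mörser Munition Offizier Pferd Pionier",
    "5": "Regiment Sanität Sappeure Schützen Stab Staffel Train Truppe Wache Zug",
}

LOOKUP = {}
for row_figure, cells in ROWS.items():
    for col_figure, cell in zip(COLUMNS, cells.split()):
        for element in cell.split("/"):
            LOOKUP[element] = row_figure + col_figure


def encode_word(word):
    """Use a whole-word group if the chart has one, otherwise spell the word."""
    if word in LOOKUP:
        return [LOOKUP[word]]
    groups, text, i = [], word.lower(), 0
    while i < len(text):
        for size in (3, 2, 1):  # assumed rule: longest element first
            piece = text[i:i + size]
            if piece in LOOKUP:
                groups.append(LOOKUP[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no chart entry for {text[i]!r}")
    return groups


def encode(message):
    return [group for word in message.split() for group in encode_word(word)]


def five_figure_groups(groups):
    """Run the groups together and split the figures into groups of five."""
    figures = "".join(groups)
    return [figures[i:i + 5] for i in range(0, len(figures), 5)]


groups = encode("Zug mit Munition angekommen")
print(" ".join(groups))
print(" ".join(five_figure_groups(groups)))
```

Padding the short last group is left out here, since the text does not say which figures were used as filler.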

As can be seen, in some instances whole words are encrypted as single codegroups, but as the chart is small the number of words is limited, so a lot of text will have to be spelled out. Since no variants are given for the more common letters (e will be represented by 63 all the time), this system is likely to be broken by the enemy (and probably was), if it is used a lot without changing the coordinates.


If it's known that the type of traffic to be put into code isn't very stereotyped and is therefore likely to use a large vocabulary, a syllabary square can be used. There are - of course - many possible constructions, but one sometimes found in cryptographic literature looks like this:

 

   1    2    3    4    5    6    7    8    9    0
1  A    1    AL   AN   AND  AR   ARE  AS   AT   ATE
2  ATI  B    2    BE   C    3    CA   CE   CO   COM
3  D    4    DA   DE   E    5    EA   ED   EN   ENT
4  ER   ERE  ERS  ES   EST  F    6    G    7    H
5  8    HAS  HE   I    9    IN   ING  ION  IS   IT
6  IVE  J    0    K    L    LA   LE   M    ME   N
7  ND   NE   NT   O    OF   ON   OR   OU   P    Q
8  R    RA   RE   RED  RES  RI   RO   S    SE   SH
9  ST   STO  T    TE   TED  TER  TH   THE  THI  THR
0  TI   TO   U    V    VE   W    WE   X    Y    Z

Normally, a two-figure system will double the length of the text (since two figures have to be used to encrypt every individual letter of the plaintext), but in a syllabary system - if it is carefully constructed, like the above one - the resulting cryptogram is in most cases shorter. In the above chart clusters of up to three letters will be encrypted by single two-figure codegroups, and if we encrypt the sample text from above (in English this time) Train with ammunition has arrived, it will look like this:

t   ra  in  w   it  h   a   m   m   u   n   it  ion has ar  ri  ve  d
93  82  56  06  50  40  11  68  68  03  60  50  58  52  16  86  05  31

This is 18 groups, which is only 2 more than if we had used single codegroups for train and ammunition and spelled the rest of the message letter for letter, as in the Austrian system. If we had used one codegroup per letter, the resulting cryptogram would have been 29 groups long. The two repetitions '50' and '68' stand for it and m, which are of much lower frequency than, say, e, and are therefore not the first guess at the plaintext that the cryptanalyst would make; not an unimportant feature.
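The division of the text into syllables can be sketched in Python as follows. The square is the one shown above; the splitting rule (word by word, always taking the longest element that fits) is my own assumption, though it happens to reproduce the division used in the example. The function names are also mine.

```python
COLUMNS = "1234567890"
SQUARE = {  # row figure -> the ten cells of that row
    "1": "A 1 AL AN AND AR ARE AS AT ATE",
    "2": "ATI B 2 BE C 3 CA CE CO COM",
    "3": "D 4 DA DE E 5 EA ED EN ENT",
    "4": "ER ERE ERS ES EST F 6 G 7 H",
    "5": "8 HAS HE I 9 IN ING ION IS IT",
    "6": "IVE J 0 K L LA LE M ME N",
    "7": "ND NE NT O OF ON OR OU P Q",
    "8": "R RA RE RED RES RI RO S SE SH",
    "9": "ST STO T TE TED TER TH THE THI THR",
    "0": "TI TO U V VE W WE X Y Z",
}

LOOKUP = {
    element: row + col
    for row, cells in SQUARE.items()
    for col, element in zip(COLUMNS, cells.split())
}


def segment(text):
    """Split each word into chart elements, longest element first (assumed rule)."""
    elements = []
    for word in text.upper().split():
        word = "".join(ch for ch in word if ch.isalnum())
        i = 0
        while i < len(word):
            for size in (3, 2, 1):
                piece = word[i:i + size]
                if piece in LOOKUP:
                    elements.append(piece)
                    i += len(piece)
                    break
            else:
                raise ValueError(f"no chart entry for {word[i]!r}")
    return elements


def encode(text):
    return [LOOKUP[element] for element in segment(text)]


message = "Train with ammunition has arrived"
print(" ".join(segment(message)))
print(" ".join(encode(message)))
print(len(encode(message)), "groups for", len(message.replace(" ", "")), "letters")
```

A different splitting rule would give an equally valid but possibly longer cryptogram, since every single letter also has a cell of its own.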

One way to strengthen code charts, which is sometimes seen, is to allot more than one coordinate to choose from for each row and/or column. Building on the syllabary square from above, a system like the one seen below could be constructed. It uses a letter to indicate the intended row of the plaintext element, and it is possible to choose between two or three different letters for each row. The column is indicated by a two figure number, and one can choose any number in the interval given for each column (e.g. the first column is indicated by any of the numbers 08, 09, 10, 11, 12, 13, or 14).

 

       08-14  60-65  87-99  15-28  00-07  38-52  73-86  29-37  53-59  66-72
C,K,R  A      1      AL     AN     AND    AR     ARE    AS     AT     ATE
L,P,Z  ATI    B      2      BE     C      3      CA     CE     CO     COM
E,W    D      4      DA     DE     E      5      EA     ED     EN     ENT
A,M,X  ER     ERE    ERS    ES     EST    F      6      G      7      H
G,J,Q  8      HAS    HE     I      9      IN     ING    ION    IS     IT
B,U    IVE    J      0      K      L      LA     LE     M      ME     N
D,F,V  ND     NE     NT     O      OF     ON     OR     OU     P      Q
N,S,Y  R      RA     RE     RED    RES    RI     RO     S      SE     SH
H,O    ST     STO    T      TE     TED    TER    TH     THE    THI    THR
I,T    TI     TO     U      V      VE     W      WE     X      Y      Z

Using the message from above, "Train with ammunition has arrived.", it could be encoded thus:

O95 S61 Q43 I39 J67 X66 K09 B35 U29 I89 B71 G70 Q33 J64 R38 N39 T02 W14

Now, although the message has become longer, no repetitions at all are present, and the cryptanalyst's task has become significantly harder.
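A Python sketch of the variant system might look like this. The row letters, column intervals and square are those of the chart above; picking the variants with a random number generator and splitting the text longest-element-first are assumptions of mine (a human clerk would simply choose as he pleased), and so are the function names.

```python
import random

ROW_LETTERS = ["CKR", "LPZ", "EW", "AMX", "GJQ", "BU", "DFV", "NSY", "HO", "IT"]
COLUMN_RANGES = [(8, 14), (60, 65), (87, 99), (15, 28), (0, 7),
                 (38, 52), (73, 86), (29, 37), (53, 59), (66, 72)]
SQUARE = [row.split() for row in (
    "A 1 AL AN AND AR ARE AS AT ATE",
    "ATI B 2 BE C 3 CA CE CO COM",
    "D 4 DA DE E 5 EA ED EN ENT",
    "ER ERE ERS ES EST F 6 G 7 H",
    "8 HAS HE I 9 IN ING ION IS IT",
    "IVE J 0 K L LA LE M ME N",
    "ND NE NT O OF ON OR OU P Q",
    "R RA RE RED RES RI RO S SE SH",
    "ST STO T TE TED TER TH THE THI THR",
    "TI TO U V VE W WE X Y Z",
)]
CELL = {element: (r, c) for r, row in enumerate(SQUARE) for c, element in enumerate(row)}


def segment(text):
    """Split each word into chart elements, longest element first (assumed rule)."""
    elements = []
    for word in text.upper().split():
        word = "".join(ch for ch in word if ch.isalnum())
        i = 0
        while i < len(word):
            for size in (3, 2, 1):
                piece = word[i:i + size]
                if piece in CELL:
                    elements.append(piece)
                    i += len(piece)
                    break
            else:
                raise ValueError(f"no chart entry for {word[i]!r}")
    return elements


def encode(text, rng=None):
    """Pick one of the row letters and one number of the column interval at random."""
    rng = rng or random.Random()
    groups = []
    for element in segment(text):
        r, c = CELL[element]
        low, high = COLUMN_RANGES[c]
        groups.append(f"{rng.choice(ROW_LETTERS[r])}{rng.randint(low, high):02d}")
    return groups


def decode(groups):
    elements = []
    for group in groups:
        r = next(i for i, letters in enumerate(ROW_LETTERS) if group[0] in letters)
        number = int(group[1:])
        c = next(i for i, (low, high) in enumerate(COLUMN_RANGES) if low <= number <= high)
        elements.append(SQUARE[r][c])
    return "".join(elements)


sample = "O95 S61 Q43 I39 J67 X66 K09 B35 U29 I89 B71 G70 Q33 J64 R38 N39 T02 W14".split()
print(decode(sample))
print(" ".join(encode("Train with ammunition has arrived.")))
```

Each run of encode gives a different cryptogram, yet decode recovers the same text every time, because every row letter and every number belongs to exactly one row or column.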


© Torbjörn Andersson.