The MAGENTA Blo ck Cipher Algorithm
y
M.J. Jacobson, Jr. and K. Hub er
Deutsche Telekom AG
Am Kavalleriesand 3
64295 Darmstadt
GERMANY
June 8, 1998
Contents
1 Intro duction 2
2 The MAGENTA Algorithm 2
3 Computational Eciency 3
4 Algebraic Prop erties 5
4.1 The Function f x ...... 5
4.2 The Function PEx; y ...... 6
4.3 Two Successive Combinations of ...... 6
4.4 The Function T ...... 7
3
4.5 The Function E ...... 7
4.6 The MAGENTA Algorithm ...... 8
5 Avalanche Prop erties 8
5.1 The Function f x ...... 8
5.2 The Function PEx; y ...... 8
3
5.3 The Function E ...... 9
5.4 The MAGENTA Algorithm ...... 10
6 Statistical Tests 10
7 Di erential Cryptanalysis 12
8 Linear Cryptanalysis 13
9 Palindrome Prop erties 14
10 Conclusions 14
A Assembler Co de 16
y
hub [email protected] 1
1 Intro duction
The developmentofMAGENTAMultifunctional Algorithm for General-purp ose Encryption and Network
Telecommunication b egan in 1990, with the basic design principles explained in the unpublished pap er
[7]. The idea was to apply simple and transparent techniques no magic tables or constants which can b e
eciently implemented in b oth software and hardware. Originally, this was realized by using a butter y
structure to accomplish di usion and discrete exp onentiation in a nite eld for confusion.
In the following years, the idea of hardware realizations was investigated more closely. In co op eration
with hardware sp ecialist S. Wolter the butter y structure of the original prop osal was switched to the FHT
shue structure, which has the advantage of giving identical structures at each stage. Further slightchanges
to the algorithm o ccurred when an analysis of the algorithm done by sp ecialists of the company SIT [5] was
carried out in 1994.
Plans were to develop a chip which would be capable of op erating up to the gigabit/sec range see
[8]. It was envisioned to use suchchips for encryption of ATM connections. Unfortunately the hardware
realization did not pro ceed as planned since the need for such encryption is not yet widely appreciated,
although investigations have shown that it should b e p ossible to achieve [8].
Currently, the MAGENTA algorithm is used within Deutsche Telekom for securing sensitive management
data. In addition, a VHDL design and a FPGA Field Programmable Gate Array realization are in progress.
There are two schemes to mention which essentially used structures of the fast Fourier Transform for
cryptographic purp oses prior to the MAGENTA algorithm. The rst is an invention of Jean Pierre Vaseur
[19] whichwas led on 2nd June 1959. The second is the so-called Comp-128 algorithm which is used by
some GSM providers. The Comp-128 algorithm was designed at the end of the eighties, and was recently
disclosed through the Internet.
In the next section, we describ e the MAGENTA algorithm in detail. As mentioned ab ove, the security
of MAGENTA has b een analyzed in great detail on b ehalf of Deutsche Telekom by SIT GmbH for use with
128-bit keys. In Sections 4 to 9, we highlight the details of the internal rep ort based on this analysis [5] and
extend the analysis to 192 and 256 bit keys where appropriate.
2 The MAGENTA Algorithm
8
Let B = f0; 1g b e the set of all 8-bit binary vectors bytes. For x 2 B; we will write x =x ;x ;:::;x ;
7 6 0
7 6
and we asso ciate eachbyte with an integer in f0; 1;:::;255g via the formula x ;x ;:::;x 7! 2 x +2 x +
7 6 0 7 6
::: + x : The op erator will denote bitwise addition mo dulo 2; i.e., the usual bitwise XOR op eration.
0
The heart of the MAGENTA algorithm is based on the Fast Hadamard Transform FHT [10]. However,
we replace the addition and subtraction at each no de in the shue structure by the following non-linear
8 6
op eration. Let b e a primitive element of the eld GF 256 with generating p olynomial px=X + X +
5 2
X + X + 1 and p =0: For all x 2 B; de ne
x
x 6= 255
f x= 1
0 x = 255 :
2
Then, for all x; y 2 B we de ne
Ax; y =f x f y 2
and
PEx; y =Ax; y ;Ay; x = f x f y ;fy f x : 3
16
For all x ;:::;x 2 B ; our mo di cation of the FHT is given by
0 15
T x ;:::;x = x ;:::;x ; 4
0 15 0 15
where x ;:::;x is de ned as
0 15
x ;:::;x =PEx ;x ;PEx ;x ;:::;PEx ;x : 5
0 15 0 8 1 9 7 15 2
The function T x ;:::;x op erates on a single 128-bit parameter and returns a 128-bit output. Clearly,
0 15
this op eration is very quick, since it can b e implemented entirely with bit op erations.
16
For all X =x ;:::;x 2 B ; de ne
0 15
X =x ;x ;:::;x
e 0 2 14
and
X =x ;x ;:::;x ;
o 1 3 15
i.e., X consists of the bytes of X with even index and X consists of the bytes of X with o dd index. The
e o
function C consists of rep eated applications of our FHT variant, and is recursively de ned for j 1 and all
x ;:::;x by
0 15
j +1 j j
6 C x ;:::;x =T x ;:::;x C ; x ;:::;x C
0 15 0 7 8 15
e o
1
where the initial value C = T x ;:::;x : For a xed numb er of rounds r; we de ne
0 15
r r
E x ;:::;x =C : 7
0 15
e
Originally,MAGENTAwas designed with r =7: However, during analysis by SIT GmbH [5, App endix] it
was discovered that using r = 7 made a chosen plaintext attack p ossible. It was recommended that the
numb er of rounds b e reduced to 3; and the analysis in the following chapters shows that to the b est of our
knowledge this choice do es not result in any signi cant cryptographic weaknesses of the overall blo ck cipher.
Therefore, we xr =3:
The complete MAGENTA blo ck cipher makes use of the well-known Feistel construction [3] using the
3 16 8
function E as the basic cyryptomo dule. For x = x ;:::;x 2 B and y = y ;:::;y 2 B ; one
0 15 0 7
\Feistel-round" is de ned as
3
F x= x ;:::;x ; x ;:::;x E x ;:::;x ;y ;:::;y : 8
y 8 15 0 7 8 15 0 7
16
Let M =x ;:::;x 2 B b e one plaintext blo ck 128 bits. The MAGENTA algorithm supp orts the
0 15
following three key sizes:
128 bit: K =K ;K ;
1 2
192 bit: K =K ;K ;K ;
1 2 3
256 bit: K =K ;K ;K ;K ;
1 2 3 4
where K =y ;:::;y ;K =y ;:::;y ;K =y ;:::;y ; and K =y ;:::;y : The MAGENTA
1 0 7 2 8 15 3 16 23 4 24 31
algorithm makes use of six or eightFeistel rounds, where each round uses a di erent part of the key. The
algorithm is given by
8
16
F F F F F F M if K =K K 2 B
<
K K K K K K 1 2
1 1 2 2 1 1
24
F F F F F F M if K =K K K 2 B
Enc M = : 9
K K K K K K 1 2 3
K
1 2 3 3 2 1
:
32
F F F F F F F F M if K =K K K K 2 B
K K K K K K K K 1 2 3 4
1 2 3 4 4 3 2 1
Due to a palindromic prop erty of Equation 9 given in Section 9, the decryption function can easily be
expressed in terms of the encryption function by
Dec M =V Enc V M ; 10
K K
where V x ;:::;x =x ;x ;:::;x ;x ;x ;:::;x :
0 15 8 9 15 0 1 7
3 Computational Eciency
In this section, we give estimates of the numb er of clo ck cycles required to p erform ve basic op erations of
the MAGENTA algorithm on two architectures, an Intel Pentium Pro pro cessor running at 200 MHz and
the Z80 micropro cessor, a typical 8-bit pro cessor. In particular, we consider the following op erations: 3
set up a key,
change a key,
initialize the algorithm,
encrypt one 128-bit blo ck in ECB mo de,
decrypt one 128-bit blo ck in ECB mo de.
We analyze all three key sizes supp orted byMAGENTA, namely 128; 192; and 256 bits.
In order to obtain eciency estimates on a Pentium Pro pro cessor, we computed the run times in seconds
6
required to p erform eachof the ve op erations 10 times on a given random key-plaintext pair using our
optimized C implementation. These run times were then normalized to obtain an approximation of the
6
number of clo ck cycles required for one iteration by dividing by 10 the total number of iterations and
8
multiplying by the numb er of cycles p er second 2 10 . The results of these calculations are in Table 1.
Op eration 128-bit 192-bit 256-bit
Key Setup 2044 3094 4308
Change Key 2044 3094 4308
Alg. Init 2320 2320 2320
Encrypt Blo ck 23694 23694 31550
Decrypt Blo ck 23884 23884 31744
Table 1: Eciency estimates in clo ck cycles for 200MHz Pentium Pro
Note that in MAGENTA, key setup and changing a key are the same op eration. Furthermore, the
initialization of the algorithm consists of computing a table of values of the function f; i.e., a table containing
256 bytes. This initialization need only b e p erformed once p er session, or not at all if an array representing
f is implemented as a constant.
For computing estimated clo ck cycle requirements on a Z80 pro cessor, we use the architecture and
instruction set sp eci ed in [20]. Wehave written rough assembler programs for parts of MAGENTA using
this instruction set, based on an optimized C implementation. These programs have never b een tested or
even run on a computer, so our clo ck cycle counts should b e considered only as estimates. Samples of this
assembler co de can b e found in App endix A. The clo ck cycle estimates are lo cated in Table 2.
One frequently o ccurring op eration in MAGENTA is copying 8-byte and 16-byte blo cks of memory. Z80
assembler co de for this can b e found in App endix A. According to the cycle requirements p er instruction
given in [20], this op eration as wehave written it requires 72 cycles for 8-byte blo cks and 136 cycles for
16-byte blo cks.
If we assume that the key input is given to us as raw byte data, then setting up the key consists of
recording the size of the key and copying either 2; 3; or 4 8-byte data blo cks, dep ending on the key size.
Assembler co de for this is given in App endix A. According to the cycles p er instruction in [20], this requires
157; 234; or 306 cycles for 128; 192; and 256-bit keys resp ectively. Again, changing a key is the same op eration
as setting a key up. It should b e noted that the clo ck cycle requirements wehave computed for key setup
onaPentium Pro are higher b ecause the C implementation we used includes overhead for converting the
ASCI I character representation of the key to rawbyte format.
Assembler co de for initializing the algorithm i.e., computing a table of values representing the function f
is given in App endix A. According to the cycles p er instruction in [20], this op eration takes 3582 clo ck cycles.
However, as stated ab ove, this table should b e implemented as a constant in an actual implementation, so
this computation should usually not b e necessary.
The function Equation 5 requires 672 clo ck cycles according to the instruction set in [20] and our
assembler co de in App endix A. The function T Equation 4 consists of four consecutive applications of
PI; and hence requires 2688 clo ck cycles. During one Feistel round Equation 8, T is executed three times.
In addition, based on our optimized C implementation, two 8-byte and two 16-bytes memory transfers are 4
required, as well as two iterations of simple computations requiring 144 and 188 cycles eachper iteration
computing X and X and computing the XOR of two 16-byte blo cks and one iteration of a blo ck requiring
e o
108 cycles computing the last 8-byte blo ck XOR sum. Hence, one complete Feistel round requires 9249
clo ck cycles. One encryption of a 128-bit blo ck in ECB mo de requires six Feistel rounds for 128 and 192-
bit keys and eight Feistel rounds for 256-bit keys, plus one 16-byte memory copy copying the plaintext
blo ck. In total, 55630 clo ck cycles are required to encrypt one blo ck using 128 and 192-bit keys, and 74128
clo ck cycles for 256-bit keys. The decrypt op eration is almost identical, except we also need to compute
the function V x ;:::;x = x ;x ;:::;x ;x ;x ;:::;x twice, which costs an additional four 8-byte
0 15 8 9 15 0 1 7
memory copies, resulting in 55918 clo ck cycles for 128 and 192-bit keys, and 74416 clo ck cycles for 256-bit
keys.
Op eration 128-bit 192-bit 256-bit
Key Setup 157 234 306
Change Key 157 234 306
Alg. Init 3582 3582 3582
Encrypt Blo ck 55630 55630 74128
Decrypt Blo ck 55918 55918 74416
Table 2: Eciency estimates in clo ck cycles for 8-bit pro cessor
4 Algebraic Prop erties
4.1 The Function f x
During one application of the MAGENTA algorithm, the function f x Equation 1 is executed a total of
2304 times for 128 and 192-bit keys and 3072 times for 256-bit keys. Since this function is the non-linear
building blo ck of the algorithm, it has sp ecial imp ortance in any analysis of the entire algorithm. The
following prop erties of f are known:
1. It is one-to-one, i.e., it is a p ermutation over the set B:
2. This p ermutation can b e represented as a pro duct of 6 disjoint cycles of lengths 198; 38; 9; 5; 5; and
1: With regards to combinatorial analysis [17], these values are \normal" | there is no signi cant
deviation from the cycle representations of randomly generated p ermutations. The single xed p oint
is the numb er 161:
7 6
3. Considering bytes as integers via the map x ;x ;:::;x 7! 2 x +2 x + ::: + x , for all x 2 B
7 6 0 7 6 0
such that f x 2f1; 2;:::;127g wehave
f x +1=2 f x mo d 255;
2
where by we denote multiplication over the integers. For all x; y 2 B such that f x f y 2
f1; 2;:::;255g; we can generalize this prop ertyby
f x + y mo d 255 = f x f y :
These relatively strong prop erties do not seem to haveany negative cryptographic e ects on the overall
security of the algorithm.
4. If we consider f x as a vector function of the form f x = f x;f x;:::;f x; then each of
7 6 0
the eight Bo olean functions f ;:::;f is non-linear with degree 7: Also, all p ossible non-trivial XOR-
0 7
combinations of these functions are non-linear. In [5], these functions are explicitly given in algebraic
normal form Shegalkin p olynomials. One interesting prop erty is that each function has exactly 128
summands. No other sp ecial prop erties of these functions symmetry prop erties, etc. were found. 5
4.2 The Function PEx; y
The function PE Equation 3 has two striking prop erties:
2
1. For x; y 2 B ; de ne v x; y =y; x: Then, PEx; y has the following symmetry prop erty:
v PEx; y = PEv x; y ; 11
which can simplify certain analyses.
2. PEx; y is not one-to-one. There are
2
24235 elements in B with no pre-image
2
23952 elements in B with exactly one pre-image
2
12028 elements in B with exactly two pre-images
2
4079 elements in B with exactly three pre-images
2
1007 elements in B with exactly four pre-images
2
172 elements in B with exactly ve pre-images
2
47 elements in B with exactly six pre-images
2
11 elements in B with exactly seven pre-images
2
3 elements in B with exactly eight pre-images
2
1 elements in B with exactly nine pre-images
2
1 elements in B with exactly eleven pre-images.
The element 236; 236 has 9 pre-images, namely 18; 18; 38; 141; 67; 205; 141; 38; 146; 146;
205; 67; 213; 224; 224; 213; 236; 236: The element 256; 256 is a xed p oint of PE: The ele-
ment 227; 227 has 11 pre-images, namely 17; 111; 28; 255; 55; 55; 104; 186; 111; 17; 126; 126;
166; 166; 173; 173; 186; 104; 197; 197; 255; 28:
The fact that PE is not one-to-one implies that it is p otentially easier to construct collisions if MAGENTA
3
is used as a hash function. In addition, it also implies that the images of the function E can o ccur with
unequally distributed frequencies. On the other hand, it is p ossible that this may increase the dicultyof
3
inverting E :
4.3 Two Successive Combinations of
The function T consists of 4 successive applications of : Two successive combinations of ; i.e., the function
x ;:::;x consists of four indep endent PE-combinations. Each of these combinations has four input
0 15
bytes x ;x ;x ;x and transforms them into four output bytes y ;y ;y ;y as follows:
0 1 2 3 0 1 2 3
y = f f x f x f f x f x
0 0 1 2 3
y = f f x f x f f x f x
1 2 3 0 1
y = f f x f x f f x f x
2 1 0 3 2
y = f f x f x f f x f x
3 3 2 1 0
As a direct consequence of the prop erties of PE wehave:
1. This function has the following symmetry prop erties:
a The p ermutation :x ;x ;x ;x ! x ;x ;x ;x on the pre-image implies the p ermutation
0 1 2 3 2 3 0 1
:y ;y ;y ;y ! y ;y ;y ;y on the image.
0 1 2 3 1 0 3 2
0
b The p ermutation :x ;x ;x ;x ! x ;x ;x ;x on the pre-image implies the p ermutation
0 1 2 3 1 0 3 2
0
:y ;y ;y ;y ! y ;y ;y ;y on the image.
0 1 2 3 2 3 0 1
00
c The p ermutation :x ;x ;x ;x ! x ;x ;x ;x on the pre-image implies the p ermutation
0 1 2 3 3 2 1 0
00
:y ;y ;y ;y ! y ;y ;y ;y on the image.
0 1 2 3 3 2 1 0 6
16 2 4
2. This function is not one-to-one. At least 2 24235 2 24235 = 2589194695 elements of B more
4 2
than 60 p ercent have no pre-image. There is at least one element in B with 11 = 121 or more
pre-images. In particular, the element 193; 193; 193; 193 has 201 pre-images.
4
The unequal o ccurrences of elements in B as images leads one to ask howmuch this is re ected in the
3 32
function E : The frequencies of the image comp onents y and y were determined with resp ect to all 2
0 2
p ossible pre-images. The variation of the frequencies around the average 65536 was relatively small. The
smallest value was 64393 for 82; 98 and the largest was 67130 for 41; 41:
4.4 The Function T
The shue structure of the function T Equation 4 also exhibits some symmetry. In conjunction with the
16
symmetry prop erties of PE; one can show that for all x ;:::;x 2 B
0 15
^ ^
V T x ;:::;x = T V x ;:::;x
0 15 0 15
where
^
V x ;:::;x =x ;:::;x :
0 15 15 0
Generalizations of this easily lead to further prop erties of T:
From the fact the PE is not one-to-one, it is easy to see that T is not one-to-one either. Since PE is
applied 32 times during one application of T; one exp ects that the image of T is considerably smaller than
16
B : In fact, it follows from the results of the previous subsection that at least 97:4 p ercent of all elements
16
in B have no pre-image with resp ect to T: On the other hand, the element 220; 220;:::;220 has at least
4 9
201 > 1:6 10 pre-images.
The fact that T is so clearly not one-to-one can have a negative e ect on the confusion and di usion prop-
erties of the overall cipher. However, the following prop erty helps rectify the situation. For all x ;:::;x ;
0 15
for all c 2 B nf0g; and for all i 2f0;:::;15g; if
y ;:::;y =T x ;:::;x
0 15 0 15
and
0 0
y ;:::;y =T x ;:::;x ;x c; x ;:::;x
0 i 1 i i+1 15
0 15
0
: This prop erty provides us with an essential fact guaranteeing crypto- then for all j 2f0;:::;15g y 6= y
j
j
3
graphic prop erties of the function E as well as the overall cipher. At the same time, it follows that the
16
function T assumes all 256 p ossible byte-comp onents if all 256 pre-images are applied.
3
4.5 The Function E
3
The function E Equation 7 has the following symmetry prop erties. Let x ^ y denote bit-wise AND.
16
Then, for all x ;:::;x 2 B
0 15
3
1. If x = x = ::: = x ; then for E x ;:::;x =y ;:::;y it follows that y = y = ::: = y : For
0 1 15 0 15 0 7 0 1 7
3
example, E 102;:::;102 = 0;:::;0:
3
2. If x = x = ::: = x and x = x = ::: = x ; then for E x ;:::;x =y ;:::;y it follows that
0 1 7 8 9 15 0 15 0 7
y = y ^ y = y ^ y = y ^ y = y :
0 4 1 5 2 6 3 7
3. If x = x = x = x = x = x = x = x and x = x = x = x = x = x = x = x then for
0 1 2 3 8 9 10 11 4 5 6 7 12 13 14 15
3
E x ;:::;x =y ;:::;y it follows that y = y ^ y = y ^ y = y ^ y = y :
0 15 0 7 0 2 1 3 4 6 5 7
4. If x = x = x = x = x = x = x = x and x = x = x = x = x = x = x = x then for
0 1 4 5 8 9 12 13 2 3 6 7 10 11 14 15
3
E x ;:::;x =y ;:::;y it follows that y = y ^ y = y ^ y = y ^ y = y :
0 15 0 7 0 1 2 3 4 5 6 7
A p ossible cryptographic e ect of these prop erties is that for certain keys and highly structured plaintexts
with many identical bytes the corresp onding ciphertext may exhibit structural prop erties. In this resp ect,
one can designate keys with many identical bytes as \weak keys." However, the probability that suchaweak
key is generated randomly is suciently small. 7
4.6 The MAGENTA Algorithm
An essential requirement for the suitability of Equation 9 as a cipher is ful lled due to the fact that for all
16 24 32 16 16 16
K 2 B ;B ; and B ; and all M 2 B ; the function Enc M is a one-to-one function B ! B : This
K
fact follows from the easily demonstrated fact that F x Equation 8 is also one-to-one.
y
Further algebraic prop erties of Enc M are given in Section 9.
K
5 Avalanche Prop erties
An imp ortant design criterium for any blo ck cipher is the \Avalanche E ect" | by mo difying one input
bit on average one half of the output bits should change. This should ensure that the relationship of the
plaintext to the ciphertext and resp ectively the key to the ciphertext is as irregular as p ossible, i.e., the
ciphertext should seem random and not exhibit any structural dep endencies on the plaintext or the key.
The Avalanche E ect is also imp ortantin the context of hash functions. The smallest mo di cation of
the text should result in considerable mo di cations to the hash value with high probability.
In the following we describ e the results of tests [5], where several of the functions involved in the MA-
GENTA algorithm were tested as to whether they ful ll one of the strictest avalanche criteria [4]. This
criteria is satis ed if the mo di cation of one input bit causes each output bit to change with probability
2
\close" to 1=2: Signi cance levels were computed for these probabilities and -tests were applied in order
to have a precise test. A short presentation of the tests can b e found in [6].
During the course of each test an n n matrix A =a ; 0 i; j i;j numb er of input bits and the numb er of output bits of the function b eing tested. We call A the dep endency matrix. For sp eci c i and j; a is the numberofchanges of bit i in the image vector when bit j of m randomly i;j selected pre-image vectors is changed. One can consider each a as the sum of m random variables A with i;j i;j 0 1 distribution. If each of these is binomially distributed with parameter 0:5; then with high probability p p the value of each a should lie b etween m 3 m=2 and m +3 m=2; the so-called 3 range. i;j The following hyp othesis was tested: Hyp othesis 1 H The function under consideration ful l ls the strict avalanche criterium SAC | for 0 al l i; j 2f0;:::;n 1g 1 P A =1 i;j 2 holds. 2 A -test was constructed with help of the dep endency matrix with test size 2 m X