CRYPTANALYSIS OF THE A5/2 ALGORITHM
Slobodan Petrović and Amparo Fúster-Sabater
Abstract - An attack on the A5/2 stream cipher algorithm is described,
that determines the linear relations among the output sequence bits. The
vast majority of the unknown output bits can be reconstructed. The time
complexity of the attack is proportional to $2^{17}$.
Introduction: A5 is the stream cipher algorithm used to encrypt the link from the
telephone to the base station in the GSM system. According to [1], two versions
of A5 exist: A5/1, the 'stronger' version, and A5/2, the 'weaker' version. The
attacks on A5/1, utilizing the birthday paradox, are described in [2, 3]. The
attack on A5/2 presented here is of algebraic nature.
The scheme of the A5/2 algorithm is given in Fig. 1. The LFSR $R_4$ clocks
the LFSRs $R_1, \dots, R_3$ in the stop/go manner. The feedback polynomials of
the registers are:
$g_1(x) = 1 + x^{14} + x^{17} + x^{18} + x^{19}$,
$g_2(x) = 1 + x^{21} + x^{22}$,
$g_3(x) = 1 + x^{8} + x^{21} + x^{22} + x^{23}$,
$g_4(x) = 1 + x^{12} + x^{17}$.
The function $F$ is the majority function
$F(x_1, x_2, x_3) = x_1 x_2 + x_1 x_3 + x_2 x_3$.
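As a minimal sketch of the building blocks just described, the following Python fragment encodes the exponents of the four feedback polynomials and the majority function. The helper `lfsr_sequence` and its indexing convention (the recurrence $s_t = \sum_e s_{t-e}$ over the exponents $e$ of $g(x)$) are assumptions made for illustration; the text does not fix the bit ordering inside the registers, and the stop/go clocking is not modelled here.

```python
from itertools import product

# Exponents of the feedback polynomials g_1..g_4 as given in the text
# (the constant term 1 is implicit).
FEEDBACK_EXPONENTS = {
    1: (14, 17, 18, 19),
    2: (21, 22),
    3: (8, 21, 22, 23),
    4: (12, 17),
}


def majority(x1, x2, x3):
    """F(x1, x2, x3) = x1*x2 + x1*x3 + x2*x3 over GF(2)."""
    return (x1 & x2) ^ (x1 & x3) ^ (x2 & x3)


def lfsr_sequence(exponents, initial_state, length):
    """Return `length` bits of the sequence s_t = XOR of s_{t-e}, e in exponents.

    `initial_state` holds s_0..s_{L-1}, with L = max(exponents).
    This indexing convention is an assumption of the sketch.
    """
    degree = max(exponents)
    if len(initial_state) != degree:
        raise ValueError("initial state must have as many bits as the register")
    seq = list(initial_state)
    while len(seq) < length:
        t = len(seq)
        bit = 0
        for e in exponents:
            bit ^= seq[t - e]
        seq.append(bit)
    return seq[:length]


# The algebraic form of F agrees with "at least two of three inputs are 1".
for bits in product((0, 1), repeat=3):
    assert majority(*bits) == int(sum(bits) >= 2)

# First 40 bits of a sequence generated with the exponents of g_4.
r4_bits = lfsr_sequence(FEEDBACK_EXPONENTS[4], [1] + [0] * 16, 40)
```

The register lengths 19, 22, 23 and 17 follow directly from the degrees of the polynomials above.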
The communication in the GSM system is performed through frames. Each
frame consists of 228 bits. For every frame to be enciphered, an initialization
procedure takes place that yields the initial state of the LFSRs on the basis of
the 64-bit secret key K and the 22-bit frame number F. During the initialization,
the bits of the secret key are first imposed into all the LFSRs, at every
clock pulse, without the stop/go clocking, starting from the LSB of each key
byte. Then the bits of the frame number are imposed into all the LFSRs in the
same way, starting from the LSB. Finally, the algorithm is run for 100 clock
pulses utilizing the stop/go clocking, but producing no output.

Instituto de Física Aplicada (CSIC), Serrano 144, 28006 Madrid, Spain
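The order in which the key and frame-number bits enter the registers can be sketched as follows. The function name `loading_order` is hypothetical, and the sketch assumes that the key bytes are processed in the order in which they are supplied, which the text does not state. How each bit is combined with the register contents, and the 100 stop/go clock pulses that follow, are not modelled.

```python
def loading_order(key, frame_number):
    """Return the 64 + 22 bits in the order they are imposed into the LFSRs.

    key          -- 8 bytes (the 64-bit secret key K)
    frame_number -- integer below 2**22 (the 22-bit frame number F)

    Assumption: key bytes are taken in the order given; within each byte
    the LSB comes first, as stated in the text.
    """
    if len(key) != 8:
        raise ValueError("the secret key must be 64 bits (8 bytes)")
    if not 0 <= frame_number < 2 ** 22:
        raise ValueError("the frame number must fit in 22 bits")
    bits = []
    for byte in key:
        for i in range(8):          # LSB of each key byte first
            bits.append((byte >> i) & 1)
    for i in range(22):             # then the frame number, LSB first
        bits.append((frame_number >> i) & 1)
    return bits


example = loading_order(bytes([0x01, 0, 0, 0, 0, 0, 0, 0x80]), 0b10)
```

In this example the first bit loaded is 1 (the LSB of the first key byte) and the last key bit loaded is 1 (the MSB of the last key byte).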
Cryptanalytic attack: The attack consists of updating the system of linearized
equations that relate the state variables of the LFSRs $R_1, \dots, R_3$ with the
output bits, on the basis of the clock-control sequence produced by the LFSR $R_4$,
for its initial state picked from the set of $2^{17}$ possible states. The
linearization of the equations is performed by substituting new variables for
the nonlinear terms. Due to the frequent reinitializations, the small number of
skipped bits in the initialization process and the distribution of the feedback
taps, many linearly dependent equations appear, and almost all the unknown output
bits that come after very few known output bits can be reconstructed without
solving the system at all.
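The size of the linearized system can be estimated with a short calculation. The sketch below assumes that every nonlinear term is a product of two distinct state bits of the same register (the majority function is quadratic) and that each such product is replaced by one new variable; under that assumption the count agrees with the value n = 719 used in the analysis of the system. The names are illustrative only.

```python
from math import comb

# Register lengths, read off the degrees of g_1, g_2, g_3 and g_4.
REGISTER_LENGTHS = [19, 22, 23]
R4_LENGTH = 17


def linearized_variable_count(lengths):
    """Original state bits plus one new variable per pair of bits in a register."""
    linear = sum(lengths)
    quadratic = sum(comb(length, 2) for length in lengths)
    return linear + quadratic


n_vars = linearized_variable_count(REGISTER_LENGTHS)   # 64 + 171 + 231 + 253
r4_candidates = 2 ** R4_LENGTH                          # initial states of R_4 to try

print(n_vars, r4_candidates)
```

The second number is the size of the set from which the initial state of $R_4$ is picked, which is where the factor $2^{17}$ in the time complexity comes from.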
For the analysis of the system, we start from the analysis of the rank of a matrix
to which a random last row is added. Namely, we prove the following:
Proposition 1 - Let $W = [w_{i,j}]_{i=1,\dots,m;\; j=1,\dots,n}$ be a matrix over
GF(2), whose rank is $r(W) = m$. Let $U = [u_{i,j}]_{i=1,\dots,m+1;\; j=1,\dots,n}$
be a matrix over GF(2), whose first $m$ rows are respectively equal to the rows
of $W$, and the elements of the last row are generated independently at random,
with the probability $\Pr(u_{m+1,j} = 1) = 0.5$, $1 \le j \le n$. Then the
probability that $r(U) = r(W)$ is

$$\Pr(r(U) = r(W)) = 2^{m-n}. \tag{1}$$
Proof: The first $m$ linearly independent rows of the matrix $U$ span a vector
space whose cardinality is $2^m$. The claim that $r(U) = r(W)$ means that the
last row of $U$ must belong to the vector space spanned by the first $m$ rows.
Since $\Pr(u_{m+1,j} = 1) = 0.5$ independently for each $j$, each of the $2^n$
possible last rows is equally likely, so the required probability is
$2^m / 2^n = 2^{m-n}$. Q.E.D.
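Proposition 1 can be checked exhaustively for small parameters. The sketch below represents each row as an integer whose bits are the matrix entries, computes ranks over GF(2) by elimination, and counts the last rows that leave the rank unchanged. The matrix `W_ROWS` and the function names are arbitrary choices for illustration.

```python
from fractions import Fraction


def gf2_rank(rows):
    """Rank over GF(2) of rows encoded as integers (bit j is column j)."""
    pivots = {}                      # leading-bit position -> reduced row
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row    # new pivot: the row is independent
                break
            row ^= pivots[top]       # clear the leading bit and keep reducing
    return len(pivots)


def prob_rank_unchanged(w_rows, n):
    """Exact Pr(r(U) = r(W)) over all 2**n equally likely last rows."""
    m = gf2_rank(w_rows)
    if m != len(w_rows):
        raise ValueError("W must have full row rank m")
    hits = sum(1 for last in range(2 ** n) if gf2_rank(w_rows + [last]) == m)
    return Fraction(hits, 2 ** n)


# A rank-3 matrix with n = 6 columns.
W_ROWS = [0b000111, 0b011001, 0b101010]
probability = prob_rank_unchanged(W_ROWS, 6)
print(probability)   # 1/8, i.e. 2**(3 - 6)
```

Exactly $2^m = 8$ of the $2^n = 64$ candidate rows lie in the row space of $W$, in agreement with (1).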
Due to the nonlinear order of the majority function, the maximum number of
variables in the system will be $n = 719$. Consider now the process of adding
equations to this system. Suppose that for some clock pulse $c_t$ of the algorithm,
the system consists of $m$ linearly independent equations. If the contribution of
$R_i$ to the new equation does not depend on $k_{i_t}$ state variables, $i = 1, \dots, 3$, then
the number of equations that can be added to the system reduces at least from