Quick viewing(Text Mode)

Pdf 801.15 K

Pdf 801.15 K

The ISC Int'l Journal of Information Security July 2017, Volume 9, Number 2 (pp. 101–110) ISeCure http://www.isecure-journal.org On the Computational Complexity of Finding a Minimal Basis for the Guess and Determine Attack Shahram Khazaei 1, and Farokhlagha Moazami 2,∗ 1Sharif University of Technology, Department of Mathematical Sciences, Iran, Tehran 2Shahid Beheshti University, Cyberspace Research Institute, Iran, Tehran

A R T I C L E I N F O. ABSTRACT

Article history: Guess-and-determine attack is one of the general attacks on stream ciphers. Received: 6 March 2017 Revised: 13 June 2017 It is a common tool for evaluating security of stream ciphers. Accepted: 10 July 2017 The effectiveness of this attack is based on the number of unknown bits Published Online: 12 July 2017 which will be guessed by the attacker to break the . In this Keywords: work, we present a relation between the minimum number of the guessed Guess-and-determine Attack, Computational Complexity, bits and uniquely restricted matching of a graph. This leads us to see that NP-complete, Fixed Parameter finding the minimum number of the guessed bits is NP-complete. Although Tractable, Uniquely Restricted Matching, Alternating Cycle Free fixed parameter tractability of the problem in term of minimum number of Matching, Perfect Matching, Jump the guessed bits remains an open question, we provide some related results. Number, Forcing Number. Moreover, we introduce some closely related graph concepts and problems including alternating cycle free matching, jump number and forcing number of a perfect matching. c 2017 ISC. All rights reserved.

1 Introduction ficient on modern w-bit processors where typically w = 8, 16, 32, 64. These ciphers can be described us- uess-and-determine (GD) attack is a general ing some equations over F2w . GD attack is a basic Gcryptanalysis technique for evaluating security of divide-and-conquer approach to solve the correspond- symmetric . In particular, GD attacks ing equation system. The attacker guesses the values have been a favourite tool for evaluating strength of some unknowns, and then recovers the remaining of many prominent stream ciphers during the past variables using the available relations. If the guessed decade. Examples include, the attacks on RC4 in [1], values are incorrect, the attacker reaches a contradic- on A5/1 in [2], on Snow family in [3–7] on Sober tion. However, if he is lucky and the guessed values family in [8–13], on Sosemanuk in [14–16], on Polar are correct, the remaining unknowns will be deter- Bear in [17, 18] and on in [19]. mined. To mount the attack, all possible values for the set of guessed variables, known as basis, must be Any cryptosystem can essentially be described us- tried. Thus, the effectiveness of the attack is deter- ing a system of nonlinear equations. In many cases, mined by the size of the basis. the system is an over-defined set of equations over some finite field. In particular, many ciphers proposed In this paper, we consider a class of GD attacks on during the last decades have been designed to be ef- a system of m equations over Fq involving n unknown variables where m = poly(n) and q = O(1). We as- ∗ Corresponding author. sume that each equation in this system has the follow- Email addresses: [email protected] (S. Khazaei), ing property: if the equation depends on, let us say, t f [email protected] (F. Moazami) variables and t − 1 of them are known, then the last ISSN: 2008-2045 c 2017 ISC. All rights reserved.

ISeCure 102 Computational Complexity of Finding a Minimal Basis for the Guess and ... — Khazaei and Moazami

variable is uniquely and efficiently determined. This GD attack which leads to improved cryptanalytic at- is a typical property of equations derived for many tacks in practice. Finally, the paper is concluded in ciphers in practice. Assume that a basis of size k ex- Section 7. ists and the attacker knows one. To mount the actual attack, the attacker then needs to try all qk possible 2 Graph Preliminaries ways that the unknowns can be assigned. Since the We assume that the reader is familiar with basic graph correctness of each assignment can be efficiently veri- terminologies. Here we introduce some more advance fied, one can find all satisfying solutions essentially notions. Appendix A includes some illustrating exam- k in time O(q ), ignoring polynomial factors. ples. An isolated vertex is a vertex with degree zero. The naive approach to find a basis of size k, if one An empty graph has no edges, i.e., it consists of only n isolated vertices. A pendant vertex is a vertex with exists, is to examine all k subsets of size k of the set of all unknowns. Checking if a given subset is a basis degree one. A matching in a graph is a set of edges can be done in polynomial time. Therefore, this leads where no two edges share a common vertex. A vertex to the time complexity O(nkpoly(n)) for finding a is matched (or saturated) if it is an endpoint of one of basis of size k. the edges in the matching. A matching of a graph is called perfect if all vertices of the graph are matched. Finding a basis using the naive approach is already unfeasible even for moderate values such as n = Definition 2.1 (uniquely restricted matching). 64 and k = 10 since nk = 260 is considered hardly A matching M in a graph G is called a uniquely re- stricted matching if its matched vertices induce a sub- affordable for cryptanalysis purposes. Therefore, GD graph which has a unique (perfect) matching, namely attacks have often been designed ad-hoc based on the experience of cryptanalysts to find a basis of a small M itself (see Figure A.1a and A.1b). size. There are also some research [5–7] trying to find Definition 2.2 (alternating cycle with respect a (non-optimal) basis in a more systematic way using to a matching). Let M be a matching in graph some heuristics based on greedy algorithms. G. A cycle in G is called alternating with respect to M if its edges appear alternately in M and 1.1 Motivation and Contribution E(G)setminusM (see Figure A.1c). Definition 2.3 (alternating cycle free match- Although finding an optimal basis, i.e., a basis with ing). A matching M in a graph G is called alter- minimum size, has been understood to be a difficult nating cycle free if G has no alternating cycle with problem by cryptanalysts, to the best of our knowl- respect to M. edge, no attempt has been devoted to understand the Definition 2.4 (tree decomposition of a graph). computational complexity aspects of this problem. In A tree decomposition of a graph G is a pair (T, B), this paper, we consider a rigorous treatment of the where T is a tree and B = {B } is a collec- problem of finding a minimum-size basis and show i i∈V (T ) tion of subsets of V called bags, such that (see Fig- that it is NP-complete. We also study the fixed pa- ure A.2): rameter tractability of the minimal basis problem in terms of the size of optimal basis size k; that is, if it • Every vertex of G is contained in at least one has an O(f(k)poly(n))-algorithm. This is very impor- bag Bi, tant both from a theoretical complexity point of view • For each edge of G, some bag Bi contains both and practical cryptanalysis since the naive approach its vertices, has a lower bound complexity Ω(nk). Although we • For all vertices i, j, k ∈ V (T ), if j belongs to the are not able to prove the fixed parameter tractability unique path from i to k in T , then Bi ∩ Bk ⊂ of the problem in terms of k, we show that it is fixed Bj. parameter tractable in terms of the treewidth of a Definition 2.5 (treewidth of a graph). The graph which we associate to the problem. width of a tree decomposition (T, B) is one less than The remainder of this paper is organized as fol- its maximum bag size, i.e., maxi∈V (T )|Bi|−1. The lows. In Section 2, we introduce some graph concepts treewidth of a graph G is defined as the minimum that are used in the next sections. In Section 3, we width over all possible tree decompositions of G. give a precise definition of finding a minimal basis See [20, 21] for more details about treewidth and for GD attack and present an equivalent description tree decomposition. of the problem in graph theory terminology. Then in Section 4, we study computational complexity of the 3 Minimal Basis Problem finding a minimal basis. Section 5 relates our prob- lem to two other mathematical problems from graph In this section, we provide a formal definition of theory. In Section 6, we discuss some aspects of the the problem of finding a minimum-size basis of an

ISeCure July 2017, Volume 9, Number 2 (pp. 101–110) 103

equation set with some special properties. We also has no basis of size one, but has some bases of size present an equivalent description of the problem in two, e.g., B = {x2, x3}. graph theory terminology which is easier to work with. We need to introduce some definitions. Our goal is to study the problem of finding a mini- mal basis of an invertible equations system. Instead, Definition 3.1 (invertible equation). An equa- we will work with an equivalent description of the tion f(x , . . . , x ) = 0 over is called invertible if 1 t Fq problem in terms of bipartite graphs. We first define any variable of the equation is uniquely determined the basis of such graphs. in time O(t log q) when all other variables are known. A preprocessing phase that computes a table of size Definition 3.4 (basis of a bipartite graph). Let O(q) in time O(q) is allowed. G = (F,X,E) be a bipartite graph. A subset B ( X is called a basis of G if at the end of Procedure 2, G See Appendix B for a typical invertible equation becomes empty: that appears during cryptanalysis of symmetric cryp- tosystems. Procedure 2 (used in Definition 3.4) Definition 3.2 (invertible equations system). Input: A bipartite graph G = (F,X,E) and a Let X = {x , . . . , x } be a set of n variables and subset B ( X 1 n Remove every edge in G which is incident to some F = {f1, . . . , fm} be a set of m invertible equations, each depending on a subset of X. We call (F,X) an vertex in B while there is an edge fx ∈ E such invertible equations system. that f ∈ F , x ∈ X and deg(f) = 1 do Remove every edge in G which is incident to Definition 3.3 (basis of an invertible equations x system). Let (F,X) be an invertible equations sys- end tem. A subset B ( X is called a basis for the system if at the end of Procedure 1 we have U = X. We now formally introduce the graph version of Procedure 1 (used in Definition 3.3) minimal basis problem, that we will work with in the rest of this paper. Input: An invertible equations system (F,X) and a subset B ( X Definition 3.5 (minimal basis problem). The U ← B while there is an equation f ∈ F de- minimal basis problem is defined with the following pending on t variables with t−1 of them in U do input instance and question: Let x∈ / U be the (remaining) variable that f • Instance: A connected bipartite graph G = depends on Add x to U (F,X,E) and an integer k end • Question: Does G have a basis of size at most k? Note that any subset B ⊂ X of size |B|= |X|−1 is a basis; also a basis might be empty. Therefore, We ignore to provide a variant of the above def- for a basis B we have 0 ≤ |B|≤ |X|−1. If B is a inition for an invertible equations system. The re- basis for (F,X), any assignment to the variables in lation between the two versions of the problem of B can efficiently be checked in terms of consistency. finding a minimal basis should be clear. Let F = If the whole set of equations are consistent given the {f1, f2, . . . , fm} be a set of invertible equations de- assigned values, the remaining variables are uniquely pending on variables X = {x1, x2, ··· , xn}. We as- determined. Otherwise, a contradiction is reached. sociate a bipartite graph G = (F,X,E) with vertex Below, we present a toy example as well as a real set F ∪ X to the equation set as follows: the equa- example of a set of invertible equations. tion fi ∈ F is adjacent to the variable xj ∈ X, i.e., f x ∈ E, if and only if f effectively depends on x . Example 3.1 (real). Appendix C describes the set i j i j Clearly, a set B X is a basis for the invertible equa- of equations that fully specifies the Snow 2.0. [4, 22] ( tions system (F,X) if and only if it is a basis for the . associated graph G. Notice that the associated graph Example 3.2 (toy). The following set of invertible has no isolated vertex. equations   f (x , x , x ) = 0 4 Computational Complexity of  1 1 2 3  Minimal Basis Problem  f (x , x , x ) = 0  2 2 3 4  4.1 NP-completeness f3(x1, x2, x4) = 0   The following theorem presents a close connection  f4(x1, x3, x4) = 0  between bases and uniquely restricted matchings of   f5(x1, x2, x3, x4) = 0 a bipartite graph.

ISeCure 104 Computational Complexity of Finding a Minimal Basis for the Guess and ... — Khazaei and Moazami

Theorem 1. Let G = (F,X,E) be a non-empty bi- We use a contrapositive argument. Let M = partite graph with no isolated vertex and let 0 ≤ k ≤ {f1x1, . . . , f`x`}, where ` = |X|−k, and all vertices |X|−1 be an integer. The graph G has a basis of size {f1, . . . , f`} have degree at least 2. Notice that M is k if and only if it has a uniquely restricted matching the unique matching of H0 of size `. Let K be the of size |X|−k. subgraph of H0 induced by the vertices of the set V (M) = {x , . . . , x , f , . . . , f }. Since all vertices In the following we propose and prove two lemmas 1 ` 1 ` in the set {f , . . . , f } have degree at least 2, then from which Theorem1 is concluded. 1 ` K has a cycle C. Since K is a bipartite graph, the Lemma 1. Let G = (F,X,E) be a non-empty bipar- number of edges of the cycle C is an even number. tite graph with no isolated vertex and let B X be a ( Also, C has some edges of the matching M alterna- basis of size k for G, where 0 ≤ k ≤ |X|−1. Then G tively. Thus, C is an alternating cycle with respect has a uniquely restricted matching of size |X|−k. to a submatching of M. This is a contradiction since M is a uniquely restricted matching. Hence, K has Proof. Similar to the first step of Procedure 2, re- a pendant vertex f1 ∈ F . Degree of f1 in the graph move every edge in G which is incident to some ver- H0 is also one since every vertex of the graph H0 in the section X is a vertex of the graph K. tex in B. Denote the remaining subgraph by H0. The graph H is non-empty and it has at least one pen- 0 Let x1 ∈ X \ B be the only adjacent vertex to f1 dant vertex f ∈ F . Assume that x ∈ X is the 1 1 in H0. Remove all edges incident to x1 in H0 and let only vertex incident to f and let e = f x . Let 1 1 1 1 H1 denote the resulting graph. H1 be a graph obtained from H0 by deleting every If |M|= 1, then E(H ) = ∅ and hence B is a ba- edge in H0 which is incident to x1. Since B is a 1 basis for the graph G, then (assuming k ≤ |X|−2) sis as required. Otherwise we have E(H1) 6= ∅. Let M = M \{f x }. Clearly, M is the only matching there exists at least one pendant vertex f2 ∈ F in 1 1 1 1 of size ` − 1 for the graph H . Again, we argue that H1. Let x2 ∈ X be the only vertex incident with 1 H has a pendent vertex f ∈ F . By repeating the f2 and let e2 = f2x2. Repeat the previous proce- 1 2 dure until an empty graph is obtained. Note that previous procedure |M| times, we finally obtain an since B is a basis of size k for the graph G, this empty graph. Therefore, B is a basis for the graph procedure stops after |X|−k rounds. In the sequel, G. we show that M = {e1, e2, . . . , e|X|−k} is a uniquely restricted matching for G. Assume that K is the In [23], it has been shown that maximum uniquely induced subgraph of the graph G by the vertices restricted matching problem is NP-complete for bi- {f1, x1, f2, x2, ··· , f|X|−k, x|X|−k}. The degree of the partite graphs. Therefore, we have the following corol- vertex f1 in the graph K is one. Therefore, if K has lary. a matching M 0 of size |X|−k, then all vertices of the Corollary 1. The minimal basis problem is NP- graph K must be matched. The only edge which is in- complete. 0 cident to the vertex f1 is e1. Hence, e1 ∈ M . Clearly, the degree of f2 in the subgraph K \{f1, x1} is one. 4.2 Fixed Parameter Tractability Therefore, if K has a matching M 0 of size |X|−k, 0 A naive algorithm to find a basis of size k for a bipar- then e2 ∈ M . By repeating this procedure, we con- clude that M 0 = M and, therefore, M is a uniquely tite graph G = (F,X,E), if one exists, is to examine n restricted matching. all k subsets B ⊂ X of size k. Checking if a given subset is a basis can be done in time O(|E|+|X|) us- ing Procedure 2. Therefore, this leads to the time Lemma 2. Let G = (F,X,E) be a bipartite graph complexity O(nkpoly(n)) for finding a basis of size and let M E be a uniquely restricted matching of ( k, where |X|= n and |F |= poly(n). This section dis- size |X|−k for G, where 0 ≤ k ≤ |X|−1. Then, B = cusses the fixed parameter tractability of the minimal X \ V (M) is a basis of size k for G. basis problem. Unfortunately, the fixed parameter tractability of the problem in terms of the size of op- Proof. Clearly, B X since |M|≥ 1. We show that timal basis size k—that is, if an O(f(k)poly(n))-time 1 at the end of Procedure 2, on input G and B, the algorithm solves the problem —remains unanswered. graph G becomes empty. The algorithm first removes every edge in G which is incident to some vertex 1 As an example for a fixed parameter tractable problem on graphs, we mention the vertex cover problem. A vertex cover of in B. Denote the remaining subgraph by H0. Since a graph G = (V,E) is a subset of vertices S ⊆ V that includes |M|≥ 1, the graph H0 is non-empty. We show that at least one endpoint of every edge in E. There is a simple there exist some edge fx ∈ E(H0) such that f ∈ F , algorithm that finds a vertex cover of size k, if any exists, in x ∈ X and deg(f) = 1. time O(2k|E|): pick an edge uv ∈ E and recursively check if

ISeCure July 2017, Volume 9, Number 2 (pp. 101–110) 105

Therefore, we draw our attention to studying the follows that x1, . . . , xn is a chain of L where X = fixed parameter tractability of the problem in terms {x1, . . . , xn}. A pair of consecutive elements (xi, xi+1) of the graph treewidth. is a jump of L if xi and xi+1 are incomparable in P . In fact, the jumps partition the chain x , . . . , x of In [23], it has been shown that a matching of a bi- 1 n L into disjoint chains C ,C ,...,C of P such that partite graph is uniquely restricted if and only if it is 1 2 m the maximum element of C is incomparable with alternating cycle free. Therefore, the problem of find- i the minimum element of C . The jump number of ing a maximum alternating cycle free matching in bi- i+1 a partial order P is the minimum number of jumps partite graphs is polynomially equivalent to the prob- taken over all linear extensions of P . lem of finding a minimum-size basis. The Courcelle’s celebrated theorem [24] (also see [25]) can be used A (loopless undirected) graph G = (X,E) is called to show that finding a maximum alternating cycle a comparability graph, if it is possible to find a poset free matching is fixed parameter tractable in terms of P = (X, ≤), called a poset of G, such that for each the graph treewidth. Also notice that the problem of distinct a, b ∈ X if ab ∈ E then a and b are compara- computing treewidth is fixed parameter tractable [26]. ble in P . It is well-known that a graph is a compa- Therefore, we have the following corollary. rability graph if it is transitively oriented; that is, if Corollary 4.1. Finding a minimal basis of a bipar- it can be turned into a directed graph by orienting tite graph is fixed parameter tractable with respect each edge such that for every vertices x, y, z, it holds to the graph treewidth. that if xy and yz are oriented edges then so is xz. The jump number of a comparability graph is defined We remark that as it has been noticed in [25], as the jump number of any of its posets. From the the stated complexity by Courcelle’s theorem con- results of Habib [27] and M¨ohring[28] the jump num- tains towers of exponents in the treewidth parameter, ber of a comparability graph is well-defined since it making it impractical and a purely theoretical result. is invariant with respect to the underlying poset. However, in [25], a significantly faster algorithm with w2+w 3 Orienting the edges of a bipartite graph from one running time O(4 · w · log(w) · n) has been pro- side of the bipartition to the other clearly results in posed for finding a maximum alternating cycle free a transitive orientation. Thus, every bipartite graph matching of a bipartite graph G, where n is the num- is a comparability graph. A result from Chaty and ber of vertices of G and w denotes the width of the Chein [29] shows that for a bipartite graph G = tree decomposition. (F,X,E) with a maximum alternating cycle free 5 Problems Related to Minimal matching M we have Basis Problem |M|+j(G) = |F |+|X|−1 , In this section, we draw reader’s attention to two where j(G) is the graph’s jump number. Therefore, mathematical problems related to minimal basis prob- we have the following theorem: lem. Theorem 2. Finding the jump number of a bipartite graph is polynomially equivalent to the problem of 5.1 Jump Number finding the maximum alternating cycle free matching of G and it is itself equivalent to finding the minimum- We introduce a parameter for a bipartite graph which size basis of G. is related to its minimum-size basis. In the sequel, we give the necessary notations and definitions. 5.2 Smallest Forcing Number Recall that a partially ordered set or a poset is a pair P = (X, ≤) where X is a set and ≤ is a reflexive and This subsection establishes a relationship between transitive binary relation. A pair of elements a, b ∈ X a basis of a bipartite graph and another parameter are called comparable if a ≤ b or b ≤ a; otherwise defined for graphs. To continue, we need some termi- they are called incomparable. We write a < b if a ≤ b nology that follows. and a 6= b. A sequence a1, . . . , as is called a chain of Let G be a graph that admits a perfect matching. P if a1 ≤ · · · ≤ as. A poset without incomparable A forcing set of a perfect matching M in G is a subset elements is called a linear or total order. Let P = S of M contained in no other perfect matchings of G. (X, ≤) be a poset on a finite set X.A linear extension The forcing number of a perfect matching M of G is of P is a total ordering L = (X, ) where the relation defined as between comparable elements of P is preserved in L; f(G, M) = min{|S|: S is a forcing set of M}. that is, if a ≤ b, for a, b ∈ X, then a  b. It then The forcing number of G is defined as G \ u or G \ v has a vertex cover of size k − 1. f(G) = min{f(G, M): M is a perfect matching of G}.

ISeCure 106 Computational Complexity of Finding a Minimal Basis for the Guess and ... — Khazaei and Moazami

Remark 1. Motivated by applications in chemistry, This might lead to improved algorithms for finding a the concept of forcing number was first introduced satisfying solution. in [30]. Later, the forcing number of a perfect match- ing was studied in [31] as a mathematical concept. 6.1 Better Guessing Strategy See also ([32]) for a survey. The computational com- plexity of finding forcing number of perfect matching Suppose that an invertible equations system (F,X) and graph are studied in the literature, see ([33, 34]). over Fq has a minimal basis of size k. We have implic- itly assumed that an attacker first guesses the basis Remark 2. Forcing set can be considered for other and then computes the rest of the variables. This mathematical parameters other than perfect match- of course requires a computation of order O(qk), ig- ings. In fact, this concept has a general definition as noring polynomial factors. However, this might not follows. Let (F,S) be a pair such that F is a fam- necessarily be the best strategy. Bellow, we consider ily of sets and S ∈ F. A set D ⊆ S is a forcing two such cases. set, also known as defining set, of (F,S) if S is the only element of F that contains D as a subset. This 1. Separable equations. One trivial example is concept has been studied in numerous cases, such when the system is separable. More precisely, assume as vertex colorings, perfect matchings, dominating that X and F can respectively be partitioned into sets, block designs, geodetics, orientations, and Latin (X1,X2) and (F1,F2) such that (F1,X1) and (F2,X2) squares [35]. both have unique solutions. If these systems have respectively minimal bases of sizes k1 and k2, then We can prove the following relation between the the original system has a basis of size k = k1 +k2. The forcing number and the minimum basis size of a graph unique solution can then be found in time qk1 + qk2 G. instead of qk1+k2 . Theorem 3. Let G be a bipartite graph that admits a perfect matching. Let b(G) be the size of the minimal 2. Early contradictions. A less trivial case is basis of the graph G. Then, b(G) = f(G). when after guessing some part of a basis an early con- tradiction is reached. To be concrete, assume that (F,X), with |X|= n, has a minimal basis of size k. Proof. Let G = (F,X,E) be a bipartite graph and Furthermore, suppose that by guessing only k0 < k M be a perfect matching of G that f(G) = f(G, M). elements of the basis, m other variables are subse- Let S be a forcing set of the matching M that |S|= quently uniquely determined where m < n. Assume f(G, M). Then by definition of the forcing set, M \S that there is an equation in F that only depends on is the only perfect matching of the graph G \ S. In the determined m variables and it is algebraically in- other words, M \ S is a uniquely restricted matching dependent of all the equations that have so far been of the graph G. used to determine these m unknowns. This “check” k0 By Lemma2, the set B = X\V (M \S) = V (S)∩X equation can be used to reduce the initial q possi- is a basis of the graph G. Also notice that |B|= |S| bilities by a factor of q. Therefore, the overall attack k−1 and b(G) ≤ |B|, since b(G) is the size of a minimum complexity is reduced by a factor of q, i.e., q in- k basis. Consequently, b(G) ≤ |S|= f(G, M) = f(G). stead of q . In [14, 16] this idea has been used to Conversely, let B be a basis of the graph G of size mount a faster attack on the Sosemanuk [36] stream b(G). Then by Lemma1, the graph G has a uniquely cipher. restricted matching M of size |X|−b(G). Consider the subgraph H of the graph G induced by the ver- 6.2 Working With an Equivalent Equations tices F ∪ (X \ (V (M) ∩ X)). Since G admits a per- System fect matching and H is an induced subgraph of the In some cases working with modified equations system graph G, H has a matching M 0 such that every ver- might lead to an improved attack. Bellow we present tex of (X \ (V (M) ∩ X)) is a saturated vertex in M 0. three examples. So, M 0 ∪ M is a perfect matching of the graph G. Since M is a uniquely restricted matching then B 1. Linear equations system. When (F,X) is a 0 is a forcing set of the matching M ∪ M. Therefore, linear system over Fq, say with unique solution, then f(G) ≤ f(G, M) ≤ |B|= b(G). the satisfying solution can be found using Gaussian elimination in polynomial time. 6 Improved Guess and Determine 2. Using redundant equations. In some cases Attacks adding redundant equations to the equation system might lead to a faster attack. One case is when attack- In this section, we discuss some points with respect ing a stream cipher based on LFSRs (Linear Feed- to the effectiveness of guess and determine attack.

ISeCure July 2017, Volume 9, Number 2 (pp. 101–110) 107

back Shift Registers). The output of an LFSR is de- in , pages 37–46, 2002. termined by its feedback polynomial. The feedback [5] Azad Mohammadi-Chambolbol. Cryptanalysis polynomial imposes a linear constraint on the output of word-oriented stream ciphers. Master thesis sequence of the LFSR. Redundant constraints can be (in Persian), Sharif University of Technology., obtained by considering any multiple of the feedback 2004. polynomial. See [37] for an example or refer to the [6] Hadi Ahmadi and Taraneh Eghlidos. Advanced last paragraph of Appendix C for further discussion. Guess and Determine Attacks on Stream Ci- phers. International Symposium on Telecommu- 3. Changing the underlying finite field. In nications (IST), Tehran, Iran, 2005. some cases the equations can be seen over a smaller [7] Hadi Ahmadi and Taraneh Eghlidos. Heuristic finite field. As an example, Sosemanuk [36] stream guess-and-determine attacks on stream ciphers. cipher works with 32-bit words. In contrary to the IET Information Security, 3(2):66–73, 2009. attack in [14, 16] which solves a system of equations [8] Philip Hawkes. An attack on SOBER-II. Tech- over 32 , the authors of [38] take a byte-based ap- F2 nical report, QUALCOMM Australia, Suite 410, proach and solve a system of equations over 8 . Con- F2 Birkenhead Point, Drummoyne NSW 2137, Aus- sequently, the overall attack time is dramatically re- tralia, 1999. duced form 2226 to 2176. [9] S Blackburn, S Murphy, F Piper, and P Wild. 7 Conclusion A SOBERing remark. Unpublished report. Infor- mation Security Group, Royal Holloway Univer- In this paper, we showed that finding the minimum sity of London, Egham, Surrey TW20 0EX, UK, number of the guessed bits in the guess-and-determine 1998. attack is equivalent to finding the maximum uniquely [10] Daniel Bleichenbacher, Sarvar Patel, and Willi restricted matching in a bipartite graph. Using this Meier. Analysis of the SOBER stream cipher. observation we studied the computational complexity TIA contribution TR45. AHAG/99.08, 30, 1999. aspect of this attack. Also, we introduced the relation [11] Daniel Bleichenbacher and Sarvar Patel. SOBER between this problem and some other mathematical Crytanalysis. In FSE, pages 305–316, 1999. problems such as jump number and forcing number [12] Christophe De Canni`ere.Guess and determine of a perfect matching. Exploring the fixed parameter attack on SOBER. NESSIE Public Document, tractability of the problem in terms of the minimum- NES/DOC/KUL/WP5/010/a, 2001. basis size remains an open question. An interesting [13] Steve Babbage, Christophe De Canni`ere, Joseph research problem is to study the approximability as- Lano, , and Joos Vandewalle. Crypt- pects of the minimal basis problem. analysis of SOBER-t32. In FSE, pages 111–128, Acknowledgment 2003. [14] Hadi Ahmadi, Taraneh Eghlidos, and Shahram The first author has been supported by Iranian Na- Khazaei. Improved guess and determine at- tional Science Foundation (INSF) under contract no. tack on SOSEMANUK. eSTREAM, ECRYPT 92027548 and Sharif Industrial Relation Office (SIRO) Stream Cipher Project, Report 2005/085, under grant no. G931223. The authors would like to 2005. http://www.ecrypt.eu.org/stream/ thank the anonymous reviewers for their constructive papersdir/085.pdf. and valuable comments that greatly contributed to [15] Yukiyasu Tsunoo, Teruo Saito, Maki Shigeri, To- improve the quality and presentation of the paper. moyasu Suzaki, Hadi Ahmadi, Taraneh Eghli- dos, and Shahram Khazaei. Evaluation of SOSE- References MANUK with regard to guess-and-determine [1] Lars R. Knudsen, Willi Meier, Bart Preneel, Vin- attacks. SASC (the State of the Art of Stream cent Rijmen, and Sven Verdoolaege. Analysis Ciphers), pages 25–34, 2006. Methods for (Alleged) RC4. In ASIACRYPT, [16] Xiutao Feng, Jun Liu, Zhaocun Zhou, Chuankun pages 327–341, 1998. Wu, and Dengguo Feng. A Byte-Based Guess [2] Jovan Dj. Golic. Cryptanalysis of Alleged A5 and Determine Attack on SOSEMANUK. In Stream Cipher. In EUROCRYPT, pages 239– ASIACRYPT, pages 146–157, 2010. 255, 1997. [17] John Mattsson. A Guess-and-Determine At- [3] Christophe De Canni`ere.Guess and determine tack on the Stream Cipher . eS- attack on SNOW. NESSIE Public Document, TREAM, ECRYPT Stream Cipher Project, Re- NES/DOC/KUL/WP5/011/a, 2001. port 2006/017, 2006. http://www.ecrypt.eu. [4] Philip Hawkes and Gregory G. Rose. Guess-and- org/stream/papersdir/2006/017.pdf. Determine Attacks on SNOW. In Selected Areas [18] Mahdi Hasanzadeh, Elham Shakour, and Shahram Khazaei. Improved Cryptanalysis of Po-

ISeCure 108 Computational Complexity of Finding a Minimal Basis for the Guess and ... — Khazaei and Moazami

lar Bear. eSTREAM, ECRYPT Stream Cipher numbers of bipartite graphs. Discrete Mathe- Project, Report 2005/084, 2005. http://www. matics, 281(1-3):1–12, 2004. ecrypt.eu.org/stream/papersdir/084.pdf. [34] P. Afshani, H. Hatami, and E. S. Mahmoodian. [19] Xiutao Feng, Zhenqing Shi, Chuankun Wu, and On the spectrum of the forced matching number Dengguo Feng. On Guess and Determine Anal- of graphs. Australasian J. Combin., 30, pages ysis of Rabbit. Int. J. Found. Comput. Sci., 147–160, 2004. 22(6):1283–1296, 2011. [35] Diane Donovan, ES Mahmoodian, Colin Ramsay, [20] Hans L. Bodlaender and Arie M. C. A. Koster. and Anne Penfold Street. Defining sets in com- Treewidth computations i. upper bounds. Inf. binatorics: a survey. Surveys in combinatorics, Comput., 208(3):259–275, 2010. pages 115–174, 2003. [21] Ton Kloks. Treewidth, Computations and Ap- [36] CˆomeBerbain, Olivier Billet, Anne Canteaut, proximations, volume 842 of Lecture Notes in Nicolas Courtois, Henri Gilbert, Louis Goubin, Computer Science. Springer, 1994. Aline Gouget, Louis Granboulan, C´edricLau- [22] Patrik Ekdahl and Thomas Johansson. A New radoux, Marine Minier, Thomas Pornin, and Version of the Stream Cipher SNOW. In Selected Herv´e Sibert. Sosemanuk, a Fast Software- Areas in Cryptography, pages 47–61, 2002. Oriented Stream Cipher. In The eSTREAM Fi- [23] Martin Charles Golumbic, Tirza Hirst, and nalists, pages 98–118. 2008. Moshe Lewenstein. Uniquely restricted match- [37] Mohammad Sadegh Nemati Nia and Ali Payan- ings. Algorithmica, 31(2):139–154, 2001. deh. THE NEW HEURISTIC GUESS AND DE- [24] Bruno Courcelle. The monadic second-order TERMINE ATTACK ON SNOW 2.0 STREAM logic of graphs: Definable sets of finite graphs. In CIPHER. IACR Cryptology ePrint Archive, Graph-Theoretic Concepts in Computer Science, 2014:619, 2014. 14th International Workshop, WG ’88, Amster- [38] Xiutao Feng, Jun Liu, Zhaocun Zhou, Chuankun dam, The Netherlands, June 15-17, 1988, Pro- Wu, and Dengguo Feng. A byte-based guess ceedings, pages 30–53, 1988. and determine attack on SOSEMANUK. In Ad- [25] Benjamin A. Burton, Thomas Lewiner, Jo˜ao vances in Cryptology - ASIACRYPT 2010 - 16th Paix˜ao, and Jonathan Spreer. Parameterized International Conference on the Theory and Ap- complexity of discrete morse theory. ACM Trans. plication of Cryptology and Information Secu- Math. Softw., 42(1):6:1–6:24, 2016. rity, Singapore, December 5-9, 2010. Proceedings, [26] Hans L. Bodlaender. A linear-time algorithm for pages 146–157, 2010. finding tree-decompositions of small treewidth. [39] Patrik Ekdahl and Thomas Johansson. SNOW-a SIAM J. Comput., 25(6):1305–1317, 1996. new stream cipher. In Proceedings of First Open [27] Michel Habib. Comparability invariants. In NESSIE Workshop, KU-Leuven, 2000. Orders: description and roles (L’Arbresle, 1982), [40] Joan Daemen and Vincent Rijmen. The Design volume 99 of North-Holland Math. Stud., pages of Rijndael: AES - The Advanced 371–385. North-Holland, Amsterdam, 1984. Standard. Information Security and Cryptogra- [28] Rolf H. M¨ohring.Algorithmic aspects of compa- phy. Springer, 2002. rability graphs and interval graphs. In Graphs and order (Banff, Alta., 1984), volume 147 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 41–101. Reidel, Dordrecht, 1985. [29] G. Chaty and M. Chein. Ordered matching and matchings without alternating cycles in bipartite graphs. Utilitas Math., 16:183–187, 1979. [30] D. J. Klein and M. Randic. Innate degree of freedom of a graph. J. Comput. Chem. 8, pages 516–521, 1987. Shahram Khazaei is an assistant [31] F. Harary, D. J. Klein, and T. P. Zivkovic. Graph- professor at the Department of Math- ical properties of polyhexes: perfect matching ematical Sciences at Sharif Univer- vector and forcing. J. Math. Chem. 6, pages sity of Technology, Iran, since 2012. 295–306, 1991. He received his Ph.D. in computer [32] Z. Che and Z. Chen. Forcing on perfect match- science from EPFL, Switzerland, in ings a survey. MATCH Commun. Math. Com- 2010 and was a postdoctoral re- put. Chem. 66, pages 93–136, 2011. searcher at KTH Royal Institute of Technology, Swe- [33] Peter Adams, Mohammad Mahdian, and Ebadol- den, from 2011 to 2012. His main research interests lah S. Mahmoodian. On the forced matching is theoretical and practical aspects of cryptography.

ISeCure July 2017, Volume 9, Number 2 (pp. 101–110) 109

The binary operation “+” from Fq × Fq to Fq is Farokhlagha Moazami is an assis- defined as follows: tant professor at the Cyber Space Re- search Institute at Shahid Beheshti x + y 7→ z , University, Iran, Tehran, since 2013. She received B.S. and Ph.D. degrees w−1 w−1 where x = P x α , y = P y α , z = in mathematics from Alzahra Univer- i=0 i i i=0 i i Pw−1 z α with x , y , z ∈ and sity, Tehran, Iran, in 2004 and 2012, i=0 i i i i i Fp

respectively and M.S. degree in mathematics from w−1 w−1 w−1 Sharif University of Technology, Iran, Tehran, in 2006. X i X i X i w xip + yip = zip mod p , She was a postdoctoral at Sharif University of Tech- i=0 i=0 i=0 nology, Iran, Tehran, from 2012 to 2013. Her main research interests is theoretical and practical aspects where x , y , z are interpreted as elements of . of cryptography. i i i Zp P The first three ’s are additions over Fq and the A Some Illusterating Graphs last three are over integers. The “−” operation can Examples be defined similarly.

The binary operation “≪” from Fq ×{0, 1, . . . , w− 1} to Fq is defined as follows:

x ≪ r 7→ y ,

where x and y are represented as above and yi = (a) a non-uniquely restricted (b) a uniquely restricted match- xi−r mod w. The binary operation ≫ can be defined matching ([23]). ing ([23]). similarly as yi = xi+r mod w where x ≫ r 7→ y. Example B.1. Let r1, r2 ∈ {0, 1, . . . , w − 1} and β1, β2, β3 ∈ Fq be some constants where β1, β2 6= 0. Let σ : Fq → Fq be a permutation. Then,

f(x, y, z) = (β1(x ≫ r1)⊕β2y)+σ(z ≪ r2)+β3 = 0

is an invertible equation since we have (c) an alternating cycle (of length 4) with respect to a (per-  −1  fect) matching. x = β1 (β2y ⊕ (0 − σ(z ≪ r2) − β3)) ≪ r1 , Figure A.1. Matching examples −1 y = β2 (β1(x ≫ r1) ⊕ (0 − σ(z ≪ r2) − β3)) ,

−1 z = σ (0 − (β1(x ≫ r1) ⊕ β2y) − β3) ≫ r2 .

After computing the inverse permutation σ−1 in time O(q) and saving it in a memory of size O(q), Figure A.2. A graph with a tree decomposition ([20]). the three inversions can be computed in O(log q). C Equations Describing Snow 2.0. B Example of an Invertible Equation Stream Cipher In this section we give an example of a typical invert- The Snow 2.0. [22] stream cipher is an updated version ible equation that appears in analysis of cryposystems. of Snow [39]. Figure C.1 shows a schematic picture of First we introduce some notations. Snow 2.0. It is composed of a Linear Feedback Shift Let p be the characteristic of Fq and denote the Register (LFSR) and a Finite State Machine (FSM). field addition and multiplication operations respec- The LFSR is defined with the following primitive tively by “⊕” and “·” where multiplication is nor- feedback polynomial mally dropped. Let (α0, ··· , αw−1) be a fixed basis 16 14 −1 5 for the field extension Fq over Fp. π(x) = αx ⊕ x ⊕ α x ⊕ 1 ∈ F232 [x],

ISeCure 110 Computational Complexity of Finding a Minimal Basis for the Guess and ... — Khazaei and Moazami

Here, + denotes the modulo 232 addition on 32-bit registers (see also Appendix B). The values of the registers are updated according to

R1t+1 = st+5 + R2t,

R2t+1 = S(R1t).

where t ≥ 0 and S : F232 → F232 is a known permu- tation based on round function of AES [40]. Finally, the output sequence of the stream cipher, also called the running or , is denoted by {zt}t≥0 and computed as follows:

zt = Ft ⊕ st, t ≥ 0

Figure C.1. A schematic of Snow 2.0. stream cipher [22] In an initial state recovery attack, an attacker N−1 is given a piece of keystream, say {zt}t=0 , and where α is some known constant and ⊕ is the his goal is to find the unknown initial state 18 finite field addition (or bit-wise XOR operation on (s15, s14, . . . , s0,R10,R20) ∈ F232 . When N ≥ 18, the 32-bit registers). The initial state of the LFSR is initial state is almost uniquely determined. Clearly, 16 all the involved equations are invertible. In [7], it is denoted by (s15, s14, . . . , s0) ∈ F 32 . Therefore, the 2 claimed that for a certain amount of N (unspecified output sequence of the LFSR, {st}t≥0, is determined according to the following recursion: in [7], but probably 50 − 70), there exists a basis of size 8 for the corresponding equations system. This re- −1 st+16 = α st+11 ⊕ st+2 ⊕ αst. duces the search from 218×32 = 2576 to 28×32 = 2256. The FSM has two 32-bit registers, denoted by R1 and However, an observation of [37] shows that one can R2, and the value of the registers at time t ≥ 0 are do better by utilizing redundant equations such as those imposed on the output sequence of the LFSR respectively denoted by R1t,R2t ∈ F232 . by multiples of the feedback polynomial. In particu- The output of the FSM at time t ≥ 0, denoted lar, the authors of [37] claim that, by incorporating by Ft, is then computed as follows from the initial 2 equations imposed on {st}t≥0 by (x + 1)π(x) and values of the registers, R10 and R20, and the output (x5 + 1)π(x), one can find a basis of size 6, which sequence of the LFSR: shows an attack with complexity 26×32 = 2192. See

Ft = (st+15 + R1t) ⊕ R2t. Section 6 for further discussion.

ISeCure