<<

2018 IX International Conference on Optimization and Applications (OPTIMA 2018) ISBN: 978-1-60595-587-2

Pseudo-Boolean Black-Box Optimization Methods in the Context of Divide-and-Conquer Approach to Solving Hard SAT Instances

Oleg ZAIKIN1,2,* and Stepan KOCHEMAZOV1

1 Matrosov Institute for System Dynamics and Control Theory SB RAS, 664033, Irkutsk, Russia 2 ITMO University, 197101, St Petersburg, Russia * Corresponding author

Keywords: Black-box optimization, Pseudo-Boolean optimization, SAT, Divide-and-conquer, Crypt- analysis, .

Abstract. Solving hard instances of the Boolean satisfiability problem (SAT) in practice is an interestingly nontrivial area. In particular, the heuristic nature of state-of-the-art SAT solvers makes it impossible to know in advance how long it will take to solve any particular SAT instance. One way of coping with this disadvantage is the divide-and-conquer approach when an original SAT instance is decomposed into several simpler subproblems. However, the way it is decomposed plays a crucial role in the resulting effectiveness of solving. In the present study, we reduce the problem of “dividing” a hard SAT instance into many simpler subproblems to a pseudo-Boolean black-box optimization problem. Then we use relevant implementations of corresponding methods to analyze hard SAT instances encoding the of state-of-the-art stream ciphers proposed during the recent e-STREAM competition.

Introduction Boolean satisfiability problem (SAT) [4] is historically the first NP-complete problem. It means that there are numerous practical problems that can be reduced to SAT. Moreover, the progress in SAT-solving algorithms in recent two decades made it possible to reach a stage when state- of-the-art SAT solvers can cope with various industrial and combinatorial problems in reasonable time. One of the areas where SAT solvers allow to obtain quite interesting results is cryptanalysis. The corresponding approach is called SAT-based cryptanalysis. The reason why SAT is useful for cryptographic problems is that they essentially deal with low-level data transformations which are hard to invert. Thus it is natural to reduce them to SAT and attempt to solve them in that form. In this study, we analyze one specific facet of SAT-based cryptanalysis. In particular, when solving hard SAT instances, e.g. the ones encoding the cryptanalysis of some ciphers, one of the commonly used techniques is the so-called divide-and-conquer approach. Informally, it consists in splitting an original problem into a (large) family of simpler subproblems, preferably in such a way that their resulting solving time is less than that for an original problem. An interesting fact is that the latter is not obligatory, because the divide-and-conquer approach enables the use of distributed computing resources compared to launching SAT solver on a target instance on a single processor (core) [45]. The non-trivial issue is that the way a problem is split into a family of subproblems has a direct impact on the time it takes to solve it. In several papers, including

76 [35, 36, 43], it was shown that it is possible to represent the problem of how to split a hard SAT instance into simpler tasks as a problem of pseudo-Boolean black-box optimization. The goal of the present paper is to answer the question whether it is necessary to design specific algorithms for solving the outlined black-box problem or the existing widely available tools are good enough to cope even with splitting very hard SAT instances which encode the cryptanalysis of state-of-the-art stream ciphers. In the role of the latter, we analyzed several finalists of e-STREAM project, a cryptographic competition, aimed at finding new stream ciphers that could replace the existing ones in the nearest future. Let us give a brief outline of the paper. In the next section, we introduce a number of notations, formulate and discuss the objective pseudo-Boolean black-box function that will later be optimized using different algorithms. In the third section, we describe the optimization algorithms that we employ to optimize the objective function and their features. The fourth section contains detailed information on the considered problems and the cryptographic specifics. In the fifth section, we present the results of computational experiments. In the last two sections, we discuss the results and draw conclusions.

Black-box Optimization and Hard SAT Instances Boolean satisfiability problem (SAT) consists in the following: for an arbitrary Boolean formula to decide if there is an assignment of its variables that makes this formula T rue, meaning that a formula is satisfiable, or to prove that there are no such assignments and thus a formula is unsatisfiable. There are many ways to divide a hard SAT instance into a family of simpler subproblems. Some of them were described, for example, in [23]. In the present paper we use the so-called plain partitioning approach (in terms of [23]), that was used in [35, 12, 43, 25], etc. Let us briefly describe it below. Assume that C is a Boolean formula in a conjunctive normal form (CNF) over a set of n Boolean variables X. We choose a subset S of variables from X. Since it contains Boolean variables, it means that there are exactly 2|S| possible ways to assign values ∈ {T rue, F alse} to variables from S in C. When we assign values α = (α1, . . . , αk) to variables from a set S, |S|, let us denote a simplified SAT instance as C[α/S]. Let us refer to a set of subproblems produced by instantiating variables from S in C as to a decomposition of an original SAT instance C and denote it as |B| DS[C] = {C[α/S], α ∈ {0, 1} }. Naturally, it is possible to choose a set S in such a way (in the simplest case by varying its size |S|), that the resulting subproblems can be made as simple for a SAT solver as necessary. The nontrivial moment consists in that different sets of the same size can produce vastly different subproblems complexity-wise. I.e. it is entirely possible that for some S1,S2 ⊆ X, S1 6= S2,

|S1| = |S2| the total time to solve all the subproblems from the decomposition DS1 [C] using some specific SAT solver can be significantly less than that for DS2 [C]. That is why it is important to pick a good set S to decompose a problem. One possible way of doing this is via blackbox optimization. Ideally, the value of an objective function on a set S would for a particular solver G represent the sum of runtimes of the solver on all subproblems from DS. However, this approach is impractical since it implies solving the same problem multiple times. And what if the problem is too complex to be solved? Realistically, the value of an objective function should be computed if not effectively, then at least relatively fast.

77 77 In the remainder of the paper the objective function value is computed using the Monte- Carlo method [30] as follows. Assume there is a SAT instance C, a set S we use to construct a decomposition DS[C] and a solver G we employ to solve simplified subproblems. Our goal is to estimate the time it would take to solve all subproblems from DS[C] with the help of G. For this purpose, in accordance with the Monte Carlo method first construct a random sample |S| RS = {C[β1/S],...,C[βN /S]} of size N by choosing randomly {β1, . . . , βN } from {0, 1} . Then launch the solver G on each subproblem from RS. Assume that TG(C[βi/S]) is the runtime of G on C[βi/S]. Then, the objective function is computed as follows [25]:

N 1 X Runtime Estimation (S) = 2|S| × × T (C[β /S]). (1) C,B,N N G i i=1 Basically, (1) implements in the straightforward manner the Monte Carlo method to com- pute the runtime estimation for solving a particular hard SAT instance by decomposing it via a particular set S. Note, that (1) is in fact a black-box objective function. It is also a pseudo-Boolean objective function, because its value is a real number (an estimation in seconds), while S is a subset of a set of Boolean variables X, |X| = n, so each possible S corresponds to some Boolean vector of length n. Finally, the objective function can be represented as follows:

n Runtime EstimationC,B,N (S):B → R. (2) In the following section, we discuss what requirements should the algorithms for minimization of the objective function (1) satisfy.

Optimization Algorithms Involved Assume that a CNF over a set of Boolean variables X is given. Then in the general case, a search space has 2|X| points, each corresponding to some subset S of set X. Thus, for any of them, it is possible to compute the value of the objective function (1). Hence, a black-box optimization algorithm can be used to traverse a search space. It is important to note, that a search space can be significantly reduced, if, for example, some variables in CNF are dependent on others. When a SAT instance encodes some circuit with explicit inputs and outputs, then instantiating the variables corresponding to inputs usually leads to the derivation of all the other variables in a formula. It makes sense to construct a search space based only on such independent variables. In all problems considered in Sections 4 and 5 such knowledge was available. It is clear that not just any optimization algorithm can be applied to minimization of function (1). To be used in practice, it should satisfy several criteria: 1. It should be able to operate with a black-box objective function; 2. It should be able to work with Boolean (or pseudo-Boolean) variables; 3. There should be a possibility to launch it on a computing cluster equipped with standard compilers and software packages. The first two criteria are quite natural for the objective function (1). The third criterion was chosen because computing the value of (1) at any point is usually quite expensive. For example, depending on the problem and the size of the random sample, it can take up to several minutes to compute (1) for one subset S on a computer equipped with multi-core CPU. Thus, to deal with the

78 78 hard optimization problems that will be further described in Section 4, it is necessary to employ a computing cluster. Since usually on a cluster only standard user rights are allowed, it is quite hard and sometimes impossible to install any nonstandard additional software on it. This restriction, in turn, implies that optimization algorithms must be implemented on widely used programming languages (C++, Python, Java, Fortran, etc.) maintained by modern computing clusters. The vast majority of optimization algorithms cannot operate with black-box objective func- tions, so they do not satisfy the first criterion. A number of black-box optimization algorithms don’t satisfy the second one, because they are designed to deal only with continuous variables. For instance, it holds true for Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [16], Hooke-Jeeves method [19], pseudo-gradient approach [13] and a variety of coordinate descent tech- niques. Finally, almost all black-box discrete optimization algorithms fail the third criterion. In particular, Multilevel Coordinate Search (MSC) [22] has an only Matlab implementation, so it can- not be launched on a standard computing cluster. In the case of Radial Basis Function algorithm for black-box optimization (RBFOpt [11]) there exists its Python implementation, but RBFOpt must be installed on a cluster. The latter is impossible because it cannot be done using ordinary cluster’s users rights. The only general-purpose optimization algorithm we have found so far which satisfies all three criteria is the Sequential Model-based Algorithm Configuration (SMAC) [21]. Also in paper [25] there was described the ALIAS tool which implements a variant of Greedy Best First Search (GBFS) algorithm designed specifically for minimization of function (1). Of course, it is also possible to implement specifically tailored versions of evolutionary or genetic algorithms [41] for this problem, as well as variable neighborhood search [17] or tabu search [15], as well as a number of other heuristics. However, the main goal was to take existing implementations and use them in the study. In the following subsections, we briefly describe both SMAC and the GBFS algorithms we employed in our experiments.

Sequential Model-Based Optimization for General Algorithm Configuration SMAC (sequential model-based algorithm configuration) [21] is a tool that can be used both for black-box optimization with mixed (continuous and discrete) variables, and for solving algorithm configuration problem (to find a set of parameter values which yield the best performance of an algorithm in given conditions). The predictive models SMAC is based on can also capture and exploit important information about the model domain, such as which input variables are most important. SMAC is an implementation of the Sequential Model-based optimization (SMBO) framework [24]. According to SMBO, a regression model is constructed, that predicts values of an objective function and then this model is used for optimization. SMBO has been used for optimizing costly black-box functions. SMAC is based on the random forest machine learning algorithm [6]. In fact, a random forest is a collection of either decision trees or regression trees. In SMAC random forest contains regression trees that have real values (values of an objective function) at their leaves. Every time a new value of an objective function is calculated, a random forest is reconstructed. In fact, random forest recommends in which points an objective function should be calculated. SMAC is implemented in Python and Java.

79 79 Greedy Best First Search in ALIAS As it was mentioned in Section 2, each possible subset S of a set of Boolean variables X, |X| = n of a given CNF, corresponds to a Boolean vector of length n. That is why any black-box local search algorithm that operates with Hamming distances can be employed to minimize the considered objective function. As such algorithm, the Greedy Best First Search (GBFS) algorithm [34] was chosen. The GBFS algorithm described below was implemented as a module of the ALIAS tool [25]. GBFS starts from a starting point, in the role of which it is possible to use the whole set of Boolean variables X, to obtain a baseline runtime estimation (for such a point it can always be computed effectively). Then the algorithm checks all points from the neighborhood of the starting point (a set of points at Hamming distance of 1). If it finds a better point, then it starts checking its neighborhood. If all points from a neighborhood are worse than the current best-known value, then it means that a local minimum is reached. Since the computation of the objective function in an arbitrary point is costly, the points for which the function value was computed are stored in special lists in order to not compute function at any point twice. When a local minimum is reached, in [25] the following simple jump strategy is used: the point with the current best-known value is partially randomly permuted and the obtained point is used as a new starting point. The algorithm stops either if the time limit is exceeded, or if the limit on jumps is reached. It turned out, that this strategy usually does not lead to updating the best-known value. That is why in the present study another jump strategy is used. According to it, 2k Boolean variables are added to a record point. Among them, k variables are the ones, which occurred in the least number of points of the search space, in which the value (1) was computed. In the role of another k variables the ones which participate in the largest number of record points are chosen. In all experiments described in Section 5 k was equal to 4. It turned out, that this strategy shows much better results than the one used in [25].

Considered Problems In the role of hard SAT instances, we considered several instances of cryptanalysis of cryp- tographic generators. A cryptographic keystream generator is a special mathematical design that is used to construct a pseudorandom bit sequence that is later used to cipher known data via bitwise XOR [29]. The generated bit sequence is usually referred to as keystream. Crypto- graphic keystream generators are usually designed in such a way that knowing their inner workings gives attacker little to no leverage without the knowledge of (at least a part of) secret or inordi- nate amounts of keystream. We considered the problem of cryptanalysis of keystream generators in the following form: given a known fragment of keystream to recover the values of a generator reg- isters cells that were used to generate this keystream. The problems of cryptanalysis of keystream generators in SAT form can be viewed as almost a perfect example of hard SAT instances – they are quite compact but at the same time very hard. We analyzed five keystream generators — finalists in the eSTREAM project [8]. This project was organized by European cryptological community and was aimed at identifying new stream ciphers that could in the nearest future replace existing ones. Thus the e-STREAM’s finalists can be considered as state of the art in the area of stream ciphers. The generators analyzed below are called [7], [18], Mickey [1]. [5] and [3]. We will refrain from providing detailed descriptions for these generators because they are not entirely relevant to the

80 80 Table 1. Characteristics of SAT encodings of the considered cryptanalysis problems.

Generator Variables Clauses Size (Mb) Registers size (bits) Keystream size (bits) Trivium 1 887 22 776 0.5 288 300 Grain 1 945 37 542 1.1 160 160 Mickey 72 078 586 080 15.7 200 250 Rabbit 98 449 879 713 23.7 513 512 Salsa20 26 976 989 912 44.8 512 512 paper. In order to construct the SAT encodings we employed the Transalg software system [32]. It uses a C-like language, thus, thanks to the fact that eSTREAM candidates provided C imple- mentations of each generator, it was relatively simple to transform the latter into programs for Transalg and test their correctness. The data on obtained SAT encodings is presented in Table 1.

Computational Experiments For each of 5 considered keystream generators (see Section 4) 1 cryptanalysis instance was considered. The obtained instances were encoded to SAT in a way described in Section 4 resulting in 5 hard SAT instances, which in turn correspond to 5 hard black-box pseudo-Boolean optimization problems. As a starting point for every problem, a set of Boolean variables that encode keystream gen- erator registers’ initial state (see Table 1) was used. This is possible because the corresponding Boolean variables encode the function input, i.e. they can be viewed as the most important. Thus, the search spaces contained 2288, 2160, 2200, 2513, 2512 for Trivium, Grain, Mickey, Rabbit and Salsa20, respectively. Further, when it is said that a point has k variables, it means that a corresponding set used to decompose an original SAT instance contains exactly k Boolean variables. In our experiments to calculate the value of the objective function (1) we used the implemen- tation from the ALIAS tool [25]. It is a multi-threaded program based on three ALIAS modules: a Python script alias., and two C++ programs sampler and genipainterval. ALIAS op- erates with incremental SAT solvers via the IPASIR interface [2]. In all experiments described below the IPASIR-based version of the sequential rokk SAT solver [42] was used because it recently has shown good results on SAT-based cryptanalysis problems. To perform the experiments, the HPC-cluster “Academician V.M. Matrosov” [9] was employed. Each its computational node is equipped with 2 × 18-core Intel Xeon E5-2695 CPUs and 128 Gb of RAM. While the implementation of the objective function is a multi-threaded program, at each moment of time an optimization algorithm operates with exactly one point from a search space. Since SAT is an NP-complete problem and thus the SAT solvers are essentially heuristic al- gorithms which given a sufficiently hard problem in practice can work for very large amounts of time, it makes sense to restrict the calculation of (1) by introducing the time limit on it. It turned out, that this limit is a very important parameter. The following values of the time limit were considered: no limit, 10 000 seconds, 1 000 seconds, 500 seconds and 100 seconds. While both op- timization algorithms implemented in SMAC and ALIAS are stochastic (see Section 3), and the objective function is quite costly, every tool was launched 3 times for 1 day on each configuration (problem, time limit) on one cluster’s computational node (i.e. on 36 CPU cores) to alleviate the

81 81 effect of randomness. Thus, 2 × 3 × 5 × 5 = 150 1-day launches were performed in total. Note, that only 102 of them resulted in any record point. There are two reasons for this phenomenon. First, SMAC cannot deal with Rabbit and Salsa20 at all, because its maximum available objec- tive function value is lower than baseline estimations for these keystream generators. Apparently, SMAC uses double data type in Java to store the function value, thus when it exceeds 1.7e + 308, its behaviour is undefined. Second, in several cases the values at starting points cannot be calcu- lated due to a low time limit and/or an extremely high cost of the objective function: it occurs on Mickey and Rabbit with the time limit of 100 seconds, and on Salsa20 with the time limit of 100, 500 and 1 000 seconds. For every configuration, the best result (with the minimal objective function value) out of 3 random launches was chosen. The results are shown in Table 2.

Table 2. Found record points for all problems.

Value of objective function Time limit (sec) Tool [number of variables/total number of variables] Trivium Grain Mickey Rabbit Salsa20 3.08e+47 2.66e+45 2.4e+58 - - SMAC [153/288] [160/160] [200/200] - 1.73e+44 4.77e+30 1.56e+48 3.16e+145 5.25e+152 ALIAS [147/288] [109/160] [158/200] [481/513] [502/513] 3.54e+46 2.92e+38 6.96e+53 - - SMAC [163/288] [137/160] [183/200] 10 000 3.62e+44 4.84e+30 1.6e+50 6.82e+144 5.5e+152 ALIAS [152/288] [109/160] [169/200] [484/513] [508/513] 2.12e+46 1.98e+33 1.36e+54 - - SMAC [161/288] [117/160] [186/200] 1 000 2.46e+41 4.04e+30 8.18e+50 1.52e+142 - ALIAS [145/288] [109/160] [175/200] [477/513] 2.47e+46 7.61e+33 2.69e+54 - - SMAC [162/288] [121/160] [187/200] - - 500 1.4e+41 2.09e+31 2.26e+50 2.57e+139 - ALIAS [144/288] [111/160] [172/200] [468/513] 6.62e+45 5.42e+34 - - - SMAC [161/288] [129/160] 100 3.84e+47 3.75e+32 - - - ALIAS [167/288] [116/160]

According to the table, ALIAS has shown much better results on the considered problems than SMAC. Also, it was not possible to launch SMAC on Rabbit and Salsa20 due to the reasons discussed above. The time limit had a strong influence. In particular, SMAC did not manage to move from starting points on Grain and Mickey with no time limit, while with the time limit enabled it showed competitive results in several cases. The best results for 3 out of 5 problems were obtained by ALIAS with the time limit enabled: 500 seconds on Trivium (record function value 1.4e+41) and Rabbit (2.57e+139); 1 000 seconds on Grain (4.04e+30). On Mickey and Salsa20 the best results were obtained by ALIAS with no time limit (1.56e+48 and 5.25e+152 respectively). In Figure 1 the progress of those launches that gave the best results on Trivium, Grain and Mickey are shown. Note, that on Trivium and Grain the effect of the jump strategy (see Section 3) can be seen.

82 82 Trivium

1x1080 1x1075 1x1070 1x1065 1x1060 1x1055 1x1050 1x1045 1x1040 50 100 150 200 250 objective function value (in sec.) function value (in objective number of updates of the best known value

Grain 1x1046 1x1044 1x1042 1x1040 1x1038 1x1036 1x1034 1x1032 1x1030 20 40 60 80 100 120 140 objective function value (in sec.) function value (in objective number of updates of the best known value

Mickey

1x1058 1x1056 1x1054 1x1052 1x1050 1x1048 5 10 15 20 25 30 35 40 objective function value (in sec.) function value (in objective number of updates of the best known value

Figure 1. Minimization on Trivium, Grain and Mickey

83 83 Discussion In [35, 38] objective functions were defined by two different SAT-based Monte Carlo algorithm. In the present study, the objective function is defined based on an improved SAT-based Monte Carlo algorithm suggested in [43]. In [35, 38] the GBFS without any jump strategy was used. In [25] it was extended by a random jump strategy. In the present study, the GBFS with an improved jump strategy that employs information about the search progress is used. The considered optimization problems were solved on a computing cluster. Also, they can be solved using the same objective function via cloud computing (e.g., by the Autotuner service [40]) or via distributed computing (e.g., by BNB-solver [14]). Note, that optimization algorithms implemented in these tools do not satisfy the third criterion (see Section 3), that is why we did not use them in the present study. A number of optimization algorithms can be also used to minimize this function, they were mentioned in Section 3. The idea of the SAT-based cryptanalysis was first proposed in [10]. In [28] a reduced variant of the DES was analyzed using SAT. In [31] SAT was used to study cryptographic hash functions of the MD family. In [12, 39, 37, 35, 43, 38] several cryptographic keystream generators (Crypto-1, Hitag2, Alternating step generator, A5/1, Trivium, Grain0) were analyzed using the SAT-based cryptanalysis. It should be noted, that four keystream generators (Grain1, Mickey, Rabbit, Salsa20) have not been analyzed by the SAT-based cryptanalysis before the present paper. From the cryptanalysis viewpoint, our runtime estimations obtained in the previous section for Rabbit and Salsa20 ciphers are not very interesting. Both ciphers are exceptionally resistant to cryptographic attacks in general and to SAT-based cryptanalysis in particular. Since the number of bits that need to be guessed in constructed attacks is comparable with the total number of input bits, it means that the attacks in question are unlikely to be faster than brute force search. Apparently, the best results for cryptanalysis of Trivium using a small amount of keystream were presented in [20]. That attack employs a similar idea to decompose the original hard problem by instantiating the values of Boolean variables, however, the system of Boolean equations is solved not via SAT but by a so-called Characteristic Set Method. The attack from [20] uses about 190 bits of keystream and has an estimated runtime of 2115 (≈ 4.15e + 34) seconds, which is about 3e+6 times better than the attack constructed by ALIAS-based GBFS (see Section 5). It appears that the characteristic set method itself suits better to cryptanalysis of Trivium. The best SAT- based attack on Trivium proposed in the present study (with the estimation of 1.4e+41 seconds) is slightly better than the previous best such attack, described in [38] (2.04e+41). With regard to cryptanalysis of the Mickey cipher, the state of the art results are the ones obtained in [26]. That paper uses the related key attack technique which, essentially, implies having information about several (sometimes several dozens) keys which are related to each other together with keystream fragments obtained using these keys. Thus it is hard to compare our results with that from [26] since the conditions of the attacks are drastically different. Informally, the attack from [26] requires much more data which is quite hard to obtain. However, in optimal conditions, it allows performing cryptanalysis of Mickey in mere minutes. Our attack requires one small fragment of keystream and no specific knowledge about keys, but the runtime estimation is very large. The Grain cipher has two modifications, informally denoted as Grain0 and Grain1. The first one represents the initial version of Grain submitted to the e-Stream project. The second one (Grain1) is the variant with resolved vulnerabilities found in Grain0. To the best of our knowledge

84 84 there are no known SAT-based attacks on Grain1, thus our results are novel in this direction. The other relevant methods for cryptanalysis of the Grain cipher either rely on combinations of weak secret keys and initialization vectors [44] or require inordinate amounts of keystream [27].

Conclusion The novelty of the present research is two-fold. First, apparently for the first time, the SAT- based cryptanalysis of several e-STREAM finalists was considered. While it did not lead to any groundbreaking results, it should be noted that such results usually imply quite a thorough and detailed analysis of underlying cryptographic functions and here we use an automated “number- crunching” scheme. The second contribution of the paper consists in comparing existing widely used algorithms for finding decompositions for hard SAT instances with state-of-the-art tools for blackbox optimization, represented by SMAC tool. Overall, we believe that the present direction of research is promising and can be used to solve hard SAT instances both related and unrelated to .

Acknowledgements The research was partially supported by Council for Grants of the President of the Russian Federation grant no. MK-4155.2018.9, RFBR grant no. 16-07-00155-a. Stepan Kochemazov was additionally supported by Council for Grants of the President of the Russian Federation (stipend no. SP-1829.2016.5).

References [1] Steve Babbage and Matthew Dodd. The MICKEY Stream Ciphers. In Robshaw and Billet [33], pages 191–209. [2] Tom´asBalyo, Armin Biere, Markus Iser, and Carsten Sinz. SAT race 2015. Artif. Intell., 241:45–65, 2016. [3] Daniel J. Bernstein. The Salsa20 Family of Stream Ciphers. In Robshaw and Billet [33], pages 84–97. [4] Armin Biere, Marijn Heule, Hans van Maaren, and Toby Walsh, editors. Handbook of Satisfi- ability, volume 185 of Frontiers in Artificial Intelligence and Applications. IOS Press, 2009. [5] Martin Boesgaard, Mette Vesterager, and Erik Zenner. The Rabbit Stream Cipher. In Rob- shaw and Billet [33], pages 69–83. [6] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001. [7] Christophe De Canni`ereand . Trivium. In Robshaw and Billet [33], pages 244– 266. [8] eSTREAM: the ECRYPT stream cipher project, http://www.ecrypt.eu.org/stream/. [9] Irkutsk supercomputer center of SB RAS, http://hpc.icc.ru. [10] Stephen A. Cook and David G. Mitchell. Finding hard instances of the satisfiability problem: A survey. In Ding-Zhu Du, Jun Gu, and Panos M. Pardalos, editors, Satisfiability Prob- lem: Theory and Applications, volume 35 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 1–18. DIMACS/AMS, 1996.

85 85 [11] Gonzalo I. Diaz, Achille Fokoue-Nkoutche, Giacomo Nannicini, and Horst Samulowitz. An effective algorithm for hyperparameter optimization of neural networks. IBM Journal of Research and Development, 61(4):9, 2017. [12] T Eibach, E Pilz, and G V¨olkel. Attacking Bivium using SAT solvers. In H K B¨uningand X Zhao, editors, Proceedings of the 11th International Conference on Theory and Applications of Satisfiability Testing: 12-15 May 2008; Guangzhou, China, pages 63–76, 2008. [13] Yu. G. Evtushenko, S. A. Lurie, M. A. Posypkin, and Yu. O. Solyaev. Application of opti- mization methods for finding equilibrium states of two-dimensional crystals. Computational Mathematics and Mathematical Physics, 56(12):2001–2010, 2016. [14] Yuri Evtushenko, Mikhail Posypkin, and Israel Sigal. A framework for parallel large-scale global optimization. Computer Science - R&D, 23(3-4):211–215, 2009. [15] Fred Glover and Manuel Laguna. Tabu Search, pages 2093–2229. 1999. [16] Nikolaus Hansen and Andreas Ostermeier. Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In Toshio Fukuda and Takeshi Furuhashi, editors, Proceedings of 1996 IEEE International Conference on Evolutionary Com- putation, Nayoya University, Japan, May 20-22, 1996, pages 312–317. IEEE, 1996. [17] Pierre Hansen and Nenad Mladenovi´c.Variable neighborhood search: Principles and applica- tions. European Journal of Operational Research, 130(3):449 – 467, 2001. [18] Martin Hell, Thomas Johansson, Alexander Maximov, and Willi Meier. The Grain Family of Stream Ciphers. In Robshaw and Billet [33], pages 179–190. [19] Robert Hooke and T. A. Jeeves. “ direct search” solution of numerical and statistical problems. J. ACM, 8(2):212–229, April 1961. [20] Zhenyu Huang and Dongdai Lin. Attacking bivium and trivium with the characteristic set method. In Progress in Cryptology – AFRICACRYPT 2011, pages 77–91, 2011. [21] Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Carlos A. Coello Coello, editor, Learning and Intelligent Optimization - 5th International Conference, LION 5, Rome, Italy, January 17-21, 2011. Selected Papers, volume 6683 of Lecture Notes in Computer Science, pages 507–523. Springer, 2011. [22] Waltraud Huyer and Arnold Neumaier. Global optimization by multilevel coordinate search. J. Global Optimization, 14(4):331–355, 1999. [23] A E J Hyv¨arinen. Grid Based Propositional Satisfiability Solving. PhD thesis, Aalto University, 2011. [24] Donald R. Jones, Matthias Schonlau, and William J. Welch. Efficient global optimization of expensive black-box functions. J. Global Optimization, 13(4):455–492, 1998. [25] Stepan Kochemazov and Oleg Zaikin. ALIAS: A modular tool for finding backdoors for SAT. In Theory and Applications of Satisfiability Testing – SAT 2018, pages 419–427, 2018. [26] Ding Lin and Guan Jie. Cryptanalysis of MICKEY family of stream ciphers. Security and Communication Networks, 6(8):936–941, 2012. [27] Z. Ma, T. Tian, and W. F. Qi. Internal state recovery of grain v1 employing guess-and- determine attack. IET Information Security, 11(6):363–368, 2017. [28] Fabio Massacci and Laura Marraro. Logical cryptanalysis as a SAT problem. J. Autom. Reasoning, 24(1/2):165–203, 2000. [29] Alfred J. Menezes, Scott A. Vanstone, and Paul C. Van Oorschot. Handbook of Applied Cryptography. CRC Press, Inc., Boca Raton, FL, USA, 1st edition, 1996. 86 86 [30] Nicholas Metropolis and Stanislaw Ulam. The Monte Carlo Method. J. Amer. statistical assoc., 44(247):335–341, 1949. [31] Ilya Mironov and Lintao Zhang. Applications of SAT solvers to cryptanalysis of hash functions. In Proceedings of the 9th International Conference on Theory and Applications of Satisfiability Testing, SAT’06, pages 102–115, Berlin, Heidelberg, 2006. Springer-Verlag. [32] Ilya Otpuschennikov, Alexander Semenov, Irina Gribanova, Oleg Zaikin, and Stepan Kochemazov. Encoding cryptographic functions to SAT using TRANSALG system. In ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, volume 285 of Frontiers in Artificial Intelligence and Applications, pages 1594–1595. IOS Press, 2016. [33] Matthew J. B. Robshaw and Olivier Billet, editors. New Stream Cipher Designs - The eS- TREAM Finalists, volume 4986 of Lecture Notes in Computer Science. Springer, 2008. [34] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 3rd edition, 2009. [35] Alexander Semenov and Oleg Zaikin. Algorithm for finding partitionings of hard variants of boolean satisfiability problem with application to inversion of some cryptographic functions. SpringerPlus, 5(1):1–16, 2016. [36] Alexander Semenov and Oleg Zaikin. On the accuracy of statistical estimations of SAT parti- tionings effectiveness in application to discrete function inversion problems. In Supplementary Proceedings of the 9th International Conference on Discrete Optimization and Operations Re- search and Scientific School (DOOR 2016), Vladivostok, Russia, September 19 - 23, 2016., volume 1623 of CEUR Workshop Proceedings, pages 261–275. CEUR-WS.org, 2016. [37] Alexander Semenov, Oleg Zaikin, Dmitry Bespalov, and Mikhail Posypkin. Parallel logical cryptanalysis of the generator A5/1 in BNB-grid system. In 11 International Conference on Parallel Computing Technologies - PaCT 2011, volume 6873 of Lecture Notes in Computer Science, pages 473–483, 2011. [38] Alexander Semenov, Oleg Zaikin, Ilya Otpuschennikov, Stepan Kochemazov, and Alexey Ig- natiev. On cryptographic attacks using backdoors for SAT. In The Thirty-Second AAAI Conference on Artificial Intelligence, AAAI’2018, pages 6641–6648, 2018. [39] M Soos, K Nohl, and C Castelluccia. Extending SAT solvers to cryptographic problems. In O Kullmann, editor, Proceedings of the 12th International Conference on Theory and Appli- cations of Satisfiability Testing: 30 June - 3 July, 2009; Swansea, UK, pages 244–257, 2009. [40] Oleg Sukhoroslov, Sergey Volkov, and Alexander Afanasiev. Program autotuning as a service: opportunities and challenges. In Changjun Jiang, Omer F. Rana, and Nick Antonopoulos, editors, Proceedings of the 9th International Conference on Utility and Cloud Computing, UCC 2016, Shanghai, China, December 6-9, 2016, pages 148–155. ACM, 2016. [41] Darrell Whitley. A genetic algorithm tutorial. Statistics and Computing, 4(2):65–85, Jun 1994. [42] Takeru Yasumoto and Takumi Okuwaga. Rokk 1.0.1. In Anton Belov, Daniel Diepold, Marijn Heule, and Matti J¨arvisalo,editors, SAT Competition 2014, page 70, 2014. [43] Oleg Zaikin and Stepan Kochemazov. An improved SAT-based guess-and-determine attack on the alternating step generator. In ISC 2017, volume 10599 of LNCS, pages 21–38, 2017. [44] Haina Zhang and Xiaoyun Wang. Cryptanalysis of stream cipher grain family. IACR Cryp- tology ePrint Archive, 2009:109, 2009.

87 87