Quick viewing(Text Mode)

Declaratively Solving Tricky Google Code Jam Problems with Prolog

Declaratively Solving Tricky Google Code Jam Problems with Prolog

arXiv:1412.2304v2 [cs.PL] 14 Dec 2014 rmr rbes ouini osdrdcreti it if correct considered poses is and solution long A hours 2–4 problems. usually more quickly is or code round 3 and individual think an must – competitors results, one least good For at solved them of 25,462 participants and task. 50,000 2014, almost in world: registered the in competitions ming Jam Code INTRODUCTION 1. competitions programming programming, linear gramming, p , logic programming, Declarative Keywords etrls ecamta elrtv rgamn with programming declarative con- the that by claim imposed We limit ECL time rules. the within test answers find to l elrtv rgasi ECL in sim- programs relatively present declarative we imper- ple but conventional clever language, a inventing programming in ative problems by them implementing solved These efficiently be and to programming. designed (integer) were linear progra logic and constraint ming techniques: ECL declarative Prolog-based using solving the tem of with Jam examples contest Code Google several programming the demonstrate from problems we algorithmic challenging paper this In ABSTRACT Analysis [ D.3.2 Descriptors Subject and solu- Categories of kinds both with up the come comparing to tions. by required this steps than show mental Jam We Code Google programming. in imperative offered problems programming of tions— 1 https://code.google.com/codejam elrtvl ovn rcyGol oeJmproblems Jam Code Google tricky solving Declaratively i PS rgamn Languages Programming osritadlgclanguages logic and Constraint e :Optimization— ]: sbte utdfrsligcrancmo kinds common certain solving for suited better is [email protected] 1 ihPoo-ae CiS L system CLP ECLiPSe Prolog-based with GJ soeo h igs program- biggest the of one is (GCJ) needn Researcher Independent egiDymchenko Sergii ierprogramming Linear i PS :Lnug Classifica- Language ]: e .. [ G.1.6 ; htaefs enough fast are that Numerical i PS e sys- m- ro- ol! rga.Teagrtmpopsfr1 elnum- real 11 for ( prompts bers and The “Hello, syntac- the basic beyond Knuth program. some E. language World!” D. demonstrate programming a to by of used constructs is proposed tic It algorithm [4]. Pardo simple T. L. a a explain is and TPK present we syntax algorithm. the TPK of a feeling for program a give to so f agaeo ytm(nldn h ECL the (including programming system available freely or any language use 8 can and competitors GCJ input “small” a the within for one). cases “large” minutes the test for (4 given minutes limit all time for certain answers correct produces ist ev sapafr o nertn aiu logic various integrating pr for ECL logic (CLP). platform constraint gramming particular a in as extensions, programming serve to aims ed o sueta h edri aiirwt ECL with familiar is reader the that assume not do We vntiilt ov elrtvl sn ihlvlprogr ECL high-level like using or languages declaratively easy ming solve be to can challeng- estimate trivial be setters’ even problem should the that to problems according GCJ ing some that show We seasoned of crowd this the mostly and algorithmists. their When mind evaluate contests, in and keep competitions. other problems they complexity, the all in design setters across also problem language but GCJ same GCJ, the in use only par- contestants not Many languages ticipate Python). programming C#, popular Java, of C++, pa set (typically restrict limited competitions a other to Most ticipants paper). this in scribed ECL [3] CPLEX) Gu solvers, ‘grasper’ (Gecode, COIN-OR solvers domains, third-party to finite interfaces for and graphs), ( ‘fd’ loops libraries arithmetic, declarative interval programming its (like for constraint in several extensions contains language Prolog [6]), some programming offers Prolog core, the of plementation ECL programming. (integer) linear and programming logic h loih upt ar( pair a outputs algorithm the AG)otherwise. LARGE) 2 http://www.eclipseclp.org/ ( t = ) i PS a i PS 0 p e a . . . | [email protected] L[,1 sa pnsuc otaesse which system software open-source an is 1] CLP[7, t e | needn Researcher Independent mlmnaino P algorithm TPK of implementation 5 + 10 aiaMykhailova Mariia n o each for and ) t 3 fe htfor that After . i PS i PS e e a oenadecetim- efficient and modern a has a n ehiuslk constraint like techniques and i ,b i, computes i i if ) 10 = b i . . . ≤ i b PS i htorder) that (in 0 i 0,o ( or 400, = e2 f ytmde- system ( a i ,where ), i, TOO i PS am- ‘ic’ robi, for o- r- e , 3 1 f(T, Y) :- Triangle Areas 2 Y is sqrt(abs(T)) + 5*T^3. “Triangle Areas” is a problem from the second round of GCJ 3 main :- 2008. The problem statement can be rephrased as follows: 4 read(As), given integer N, M and A, find a triangle with vertices in 5 length(As, N), reverse(As, Rs), integer points with coordinates 0 ≤ xi ≤ N and 0 ≤ yi ≤ M 6 A ( foreach(Ai, Rs), for(I, N - 1, 0, -1) do that has an area equal to 2 , or say that it does not exist. 7 Bi is f(Ai), 8 ( Bi > 400 -> “Triangle Areas” is almost perfect for solving with constraint 9 printf("%w TOO LARGE\n", I) logic programming. Variables are discrete, constraints are 10 ; non-linear (so integer can not be used), 11 printf("%w %w\n", [I, Bi]) and we are looking for any feasible solution. The problem is 12 ) not too hard from purely algorithmic point of view, but cor- 13 ). rect implementation using conventional programming lan- Listing 1: TPK algorithm in ECLiPSe guages is very tricky. Many contestants who solved all other problems (including higher valued problems) failed to solve Lines 1–2 define a predicate to calculate value of f. Lines the large or both inputs for “Triangle Areas”. At the same i e 3–13 define predicate main. The name of this predicate is time, modeling this problem in ECL PS is almost trivial. passed to ECLiPSe translator as a command-line parameter. To come up with an effective model we need to notice that one vertex of the triangle can be chosen arbitrarily. With Line 4 reads a list into variable As. The input is a Prolog list this observation, the most convenient way to calculate the (comma-delimited and bracket-enclosed). We preprocess the doubled triangle area is to place one vertex in (0, 0); then space-separated input with a simple ‘sed’ script to convert 2S = |x2y3 − x3y2|. (The same formula can be used in an it to the Prolog format. imperative solution.)

Line 5 stores the length of the input list in the variable N For this problem we present complete source code of the so- (the program can work with inputs of different lengths, not lution. For subsequent problems we present only interesting necessarily 11 as original TPK program), and stores original parts: ‘model’ plus other problem-specific predicates. input numbers in reverse order in the variable Rs. 1 :- lib(ic). i e 2 model(N, M, A, [X2, Y2, X3, Y3]) :- Line 6 is the head of ECL PS loop: we iterate simultane- 3 [X2, X3] :: 0..N, ously over every number in Rs and over I = N − 1 . . . 0. I, 4 [Y2, Y3] :: 0..M, Ai and Bi are local variables for every loop iteration. Line 5 A #= abs(X2 * Y3 - X3 * Y2). 7 simply assigns the value f(Ai) to Bi. This line demon- i e 6 do_case(Case_num, N, M, A) :- strates ECL PS support of using arithmetic predicates as 7 printf("Case #%w: ", [Case_num]), functions – in many other Prolog implementations less clear 8 ( model(N, M, A, Points), labeling(Points) -> f(Ai, Bi) must be used. Lines 8–10 are Prolog ‘if-then-else’ 9 printf("0 0 %w %w %w %w", Points) construct that outputs “TOO LARGE” or Bi depending of 10 ; the value of Bi. printf is very similar to the corresponding 11 write("IMPOSSIBLE") C function. %w is a “wildcard” control sequence that can be 12 ), used to output values of different types. The second argu- 13 . ment of printf is a value or a list of values for substitution. 14 main :- i e 15 read([C]), Indentation has no syntactic meaning in Prolog (or ECL PS ), 16 ( for(Case_num, 1, C) do we use it for clarity. 17 read([N, M, A]), 18 do_case(Case_num, N, M, A) ). Listing 2: Complete ECLiPSe program for the “Triangle Areas” problem

2. THE PROBLEMS Line 1 loads interval arithmetic constraint programming li- In this section we show how tricky algorithmic problems can brary ‘ic’. Lines 2–5 define the model with input parameters often be easily modeled and efficiently solved using declar- i e N, M, A and a list of output parameters [X2, Y2, X3, Y3]. ative programming techniques and ECL PS . We chose a :: and #= are from ‘ic’ library. With :: we define possi- set of GCJ problems from different tournament stages to i e ble domains for X2, X3, Y2, Y3 variables, and #= from ‘ic’ demonstrate different useful aspects of ECL PS : constraint constraints both left and right parts to be equal and inte- programming library ‘ic’, linear programming library ‘eplex’, ger. After model evaluation X2, X3, Y2, Y3 variables will working with integers and floating point numbers. not necessarily be instantiated to concrete values, but they will have reduced domains with possible delayed constraints To compare our declarative programming solutions with pos- and will be instantiated later with labeling. sible imperative solutions we compare the “mental steps” required to come up with the solution (counting number of Lines 6–13 define the do_case predicate to process a sin- lines of code would not be useful because the main challenge gle input case. Line 7 outputs case number according to is to invent the solution, not to code it.) Our ECLiPSe pro- grams solve both large and small inputs for each problem. 3Problem link: http://goo.gl/enHWlq the problem specification. Lines 8–12 are an ‘if-then-else’ One of the very useful features of ECLiPSe is that similar construct that outputs point coordinates if it is possible to (sometimes identical) models can be used to solve the same satisfy our model predicate and labels (assigns concrete val- problem using different solvers (for example, constraint pro- ues from the domain) all coordinate variables, or “IMPOS- gramming ‘ic’ and linear programming ‘eplex’ [8]). Linear SIBLE” otherwise. Line 13 simply outputs a new line char- (integer) programming is almost always much more effec- acter. tive than constraint programming (especially when looking not just for any feasible, but for the optimal solution), but Lines 14–18 define the main predicate that reads number of constraint programming has more expressive power because test cases C and for each test case reads N, M, A parameters of possible usage of non-linear constraints and objectives. and executes do_case. Some problems can be adequately modeled as linear (in- teger) programming problems, but it may not be obvious We can formulate following mental steps needed to invent how to formulate the model in terms of linear constraints and implement this ECLiPSe solution: and objective – additional variables and specific linearizing 1. Notice that one vertex can be chosen as (0, 0). “tricks” might be required [9, 2, 5]. So a useful technique 2. Recall or look up the formula for an area of a triangle. is to solve small input of a GCJ problem using a constraint 3. Formulate the constraint programming model. programming model (easier to formulate, but less effective), 4. Code the constraint programming model. For this prob- and then convert the constraint programming model into a lem, it requires 4 lines of straightforward code. linear programming model to solve large input using linear (integer) programming solver. Correctness of the less obvi- ous linear programming model can be verified by comparing Let us compare this with a possible imperative solution in its results with results of the constraint programming model a convenient that requires a more on the same (small) input. in-depth analysis of the problem. First observations will be the same. We will also note that it is impossible to find Constraint logic programming model for this particular prob- required triangle if A > M × N, and for A = M × N tri- lem is easy to formulate in ECLiPSe (based on the fact that angle (0, 0), (N, 0), (0, M) is a valid answer. Now, for A < logic values of arithmetic constraints – 0 or 1 – can be used M × N we can represent A as M(A div M)+(A mod M), in other arithmetic constraints as integers): 0 < A div M < N, 0 < A mod M < M. If we match this representation with the area formula, we can see that 1 :- lib(ic). points (0, 0), (1, M), and (−A div M,A mod M) form a tri- 2 :- lib(branch_and_bound). A 3 model(S, P, Points, Triplets, GtP) :- angle with area 2 . If we shift this triangle A div M units in positive direction along the x axis, we will get a trian- 4 length(Points, N), gle (A div M, 0), (A div M + 1, M), (0,A mod M) that will 5 length(Triplets, N), match all the requirements. 6 ( foreach(Triplet, Triplets), 7 foreach(Point, Points), The mental steps for this solution could be: 8 fromto(0, SPrev, SCurr, S), 9 fromto(0, GtPPrev, GtPCurr, GtP), 1. Notice that one vertex can be chosen as (0, 0). 10 param(P) do 2. Recall or look up the formula for an area of a triangle. 11 Triplet = [Min, Med, Max], 3. Figure out that for A > M × N there is no such trian- 12 Triplet :: 0..10, gle. 13 Min #=< Med, Med #=< Max, 4. Find the solution for border case A = M × N. 14 Max - Min #=< 2, 5. Come up with a representation of A that matches the 15 Max + Med + Min #= Point, area formula for a special triangle. 16 SCurr #= SPrev + (Max - Min #= 2), 6. Shift this triangle to fit inside the boundaries. 17 GtPCurr #= GtPPrev + (Max >= P ) ). 7. Code the solution. The code is even easier, check how 18 find(Triplets, GtP) :- A relates to M ×N and output corresponding triangle. 19 flatten(Triplets, Vars), 20 Cost #= -GtP, i e Arguably, declarative solution in ECL PS needs simpler 21 bb_min(labeling(Vars), Cost, steps and leaves less space for a possible mistake. 22 bb_options{strategy: dichotomic}). Listing 3: Constraint programming solution for “Dancing With the Googlers” Dancing With the Googlers4 “Dancing With the Googlers” is a problem from the qual- Lines 1 and 2 load ‘ic’ constraint programming library and ification round of GCJ 2012. In this problem we consider a library for branch-and-bound search [3]. Lines 3–17 de- triplets of integers from 0 to 10 which never contain num- fine constraint programming model with input parameters bers that are more than 2 apart. A triplet is surprising if S (number of surprising triplets), p, and Points (list of it contains numbers that are exactly 2 apart. Given the list triplet sums), and output parameters Triplets (list of possi- of sums of N triplets and the number of surprising triplets ble triplets values) and GtP (number of high triplets). Lines among them S, how many triplets can be high (have the 4 and 5 make Triplets a list of the same length as Points. highest number at least p)? Lines 6–17 form an ECLiPSe loop. Lines 6–10 are loop header; foreach lines loop over triplets and points simul- 4Problem link: http://goo.gl/JpQQYi taneously, and fromto lines collect number of possible sur- prising triplets (constrained to equal S after loop end) and 3. Apply linearization tricks to convert non-linear con- number of possible high triplets (returned as GtP). Lines straints to integer linear. 11–17 are loop body: they provide the representation of a 4. Code integer linear programming model. triplet as minimum, medium and maximum elements to sim- plify calculation of constraint expressions for surprising and For an imperative solution we have to notice that the lowest high triplets. sum of numbers in a high unsurprising triplet is 3p−2 (for a triplet p,p−1,p−1), and in a high surprising triplet is 3p−4 Lines 18–22 define the predicate find that finds concrete val- (for a triplet p,p − 2,p − 2), but only if p >= 2 (otherwise ues of Triplets and GtP using bb_min from the branch_and_bound the triplet won’t be surprising). After this, we count the library. In line 19 Triplets list is flattened to Var, because triplets which are high even when unsurprising Nhigh, and labeling works only with flat lists or arrays. bb_min min- the triplets which can be high if they are surprising Nsurp. imizes the objective, so in line 20 we define the objective The answer is Nhigh + min(Nsurp,S). (Cost) as negative of (GtP) that we want to maximize. Mental steps for this imperative solution could be: An integer linear programming model is very similar, but finds solution much faster and solves large input. 1. Notice that unsurprising triplet is high if its sum is ≥ 3p − 2, and surprising triplet is high if its sum is ≥ 3p − 4 and p ≥ 2. 1 :- lib(eplex). 2. Implement iteration over triplets and calculation of Nhigh 2 model(S, P, Points, Triplets, GtP) :- and Nsurp. 3 integers(GtP), 4 length(Points, N), 5 length(Triplets, N), Compared to our declarative solution, the imperative solu- 6 ( foreach(Triplet, Triplets), tion needs fewer steps, but the first step requires some math 7 foreach(Point, Points), insight into the problem. Besides, our declarative approach 8 fromto(0, SPrev, SCurr, S), provides an alternative solution for the small input which 9 fromto(0, GtPPrev, GtPCurr, GtP), can be used to validate the solution for the large input. 10 param(P) do 11 Triplet = [Min, Med, Max], Star Wars5 12 Triplet $:: 0..10, integers(Triplet), “Star Wars” was one of the harder problems from round 2 of 13 Min $=< Med, Med $=< Max, GCJ 2008, yet it can be almost trivially modeled and solved 14 Max + Med + Min $= Point, as a linear programming problem. 15 Surprise $:: 0..1, integers(Surprise), 16 Max - Min $=< 1 + Surprise, The essence of the problem statement is: you are given a 17 SCurr $= SPrev + Surprise, set of N 4-tuples of integers xi,yi,zi,pi. Find the minimal 18 G $:: 0..1, integers(G), possible Y for which exists a triplet x,y,z such that for each 19 Max $>= G * P, original tuple |xi − x| + |yi − y| + |zi − z|≤ piY . 20 GtPCurr $= GtPPrev + G ). 21 find(GtP) :- Direct translation of the problem statement to a model re- 22 eplex_solver_setup(max(GtP)), sults in non-linear constraints, but these can be easily con- 23 eplex_solve(_), verted to linear constraints using the fact that |X| ≤ Max 24 eplex_var_get(GtP, typed_solution, GtP), is equivalent to a pair of linear constraints X ≤ Max and 25 eplex_cleanup. −X ≤ Max [5]. Listing 4: Integer programming solution for “Dancing With the Googlers” 1 :- lib(eplex). 2 model(Xs, Ys, Zs, Ps, X, Y, Z, P) :- This integer linear programming model uses eplex library 3 P $>= 0, and requires two additional sets of integer variables 0..1 com- 4 ( foreach(Xi, Xs), foreach(Yi, Ys), pared to the constraint programming model. Surprise and 5 foreach(Zi, Zs), foreach(Pi, Ps), G are indicator variables local to loop iteration for “is triplet 6 param(X, Y, Z, P) do surprising” and “is triplet high”, and their use allows to lin- 7 +(Xi - X) +(Yi - Y) +(Zi - Z) $=< Pi * P, earize the constraints. Unlike the ic library, eplex doesn’t 8 +(Xi - X) +(Yi - Y) -(Zi - Z) $=< Pi * P, deduce that variables must be integer from integer domain 9 +(Xi - X) -(Yi - Y) +(Zi - Z) $=< Pi * P, bounds, so variables have to be declared as integers explic- 10 +(Xi - X) -(Yi - Y) -(Zi - Z) $=< Pi * P, itly. This solution doesn’t require branch-and-bound search, 11 -(Xi - X) +(Yi - Y) +(Zi - Z) $=< Pi * P, because integer linear programming solver already produces 12 -(Xi - X) +(Yi - Y) -(Zi - Z) $=< Pi * P, the optimal solution. 13 -(Xi - X) -(Yi - Y) +(Zi - Z) $=< Pi * P, 14 i e -(Xi - X) -(Yi - Y) -(Zi - Z) $=< Pi * P ). Mental steps for the ECL PS solution could be: 15 find(P) :- 1. Formulate triplet representation and constraints for an 16 eplex_solver_setup(min(P)), individual triplet. 17 eplex_solve(_), 2. Code constraint programming model. At this point we 18 eplex_var_get(P, typed_solution, P), can solve the small input and make sure that our im- plementation is correct. 5Problem link: http://goo.gl/DtpEQl 19 eplex_cleanup. is the total number of mines in corresponding square and Listing 5: Linear programming solution for “Star Wars” eight adjacent squares. The task is: given the grid of mine counts, find the maximum possible number of mines in the Line 1 loads linear programming library ‘eplex’. Lines 2–15 middle row of the original grid. define a linear programming model with input arguments Xs, Ys, Zs, Ps (lists of xi, yi, zi, pi), and output param- An efficient integer programming model for “Mine Layer” eter P. Other parameters of the model predicate (X, Y, Z) is relatively straightforward to come up with and code in can be useful for debugging. Line 3 constrains the value of P ECLiPSe: to be non-negative using $>= from ‘eplex’. Lines 4–6 define i e 1 :- lib(eplex). a header of ECL PS loop: execute the loop body for each 2 model(Clues, Mines, MiddleSum) :- xi, yi, zi, pi, and do not treat variables X, Y, Z, P as lo- 3 dim(Clues, [R, C]), cal for the loop iterations. Lines 7–14 specify the linearized 4 dim(Mines, [R, C]), problem constraints. 5 ( foreachelem(Mine, Mines) do 6 Mine $:: 0..1, Lines 15–19 define the auxiliary predicate find that sets up 7 integers(Mine) ), the goal for eplex solver (minimize P ), runs the solver, gets 8 ( multifor([I, J], 1, [R, C]), the typed value of P , and cleans up the solver environment 9 param(R, C, Clues, Mines) do for the following test cases. 10 ( multifor([Di, Dj], -1, 1), 11 fromto(0, Prev, Curr, S), Possible steps to come up with this solution: 12 param(I, J, R, C, Mines) do 1. Express the problem constraints in linear form. 13 ( I + Di > 0, I + Di =< R, 2. Formulate linear programming model. 14 J + Dj > 0, J + Dj =< C -> 3. Code linear programming model. Trivial after the for- 15 Curr = Prev + Mines[I + Di, J + Dj] mulation. 16 ; 17 Curr = Prev The first step towards a traditional algorithmic solution is 18 ) to notice that we can use binary search to find the smallest 19 ), 20 possible solution Ys: all Y >Ys will satisfy the constraints, Clues[I, J] $= S 21 and all Y

Our declarative solution has fewer steps, and they are much easier to perform, as they don’t require much insight into the problem.

8Results were obtained on a 64-bit Linux machine with Intel Core i7-4900MQ CPU @ 2.80GHz and 16GB RAM using ECLiPSe 6.1 #191. For linear (integer) programming free COIN-OR solvers bundled with ECLiPSe were used. 9goo.gl/K2dfPi