Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1027

Constraints for Membership in Formal Languages under Systematic Search and Stochastic Local Search

JUN HE

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2013

ISSN 1651-6214
ISBN 978-91-554-8617-4
urn:nbn:se:uu:diva-196347

Dissertation presented at Uppsala University to be publicly examined in Room 2446, Polacksbacken, Lägerhyddsvägen 2D, Uppsala, Friday, April 26, 2013 at 13:00 for the degree of Doctor of Philosophy. The examination will be conducted in English.

Abstract He, J. 2013. Constraints for Membership in Formal Languages under Systematic Search and Stochastic Local Search. Acta Universitatis Upsaliensis. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1027. 74 pp. Uppsala. ISBN 978-91-554-8617-4.

This thesis focuses on constraints for membership in formal languages under both the systematic search and stochastic local search approaches to constraint programming (CP). Such constraints are very useful in CP for the following three reasons: they provide a powerful tool for user-level extensibility of CP languages; they are very useful for modelling complex work shift regulation constraints, which exist in many shift scheduling problems; and string constraints often arise in the analysis, testing, and verification of string-manipulating programs. We show in this thesis that CP solvers with constraints for membership in formal languages are much more suitable than the existing solvers used in tools that have to solve string constraints. In the stochastic local search approach to CP, we make the following two contributions: we introduce a stochastic method of maintaining the violations of the regular constraint and extend our method to the automaton constraint with counters; and, to improve the usage of constraints for which no constant-time algorithm for neighbour evaluation is known, we introduce a framework of using solution neighbourhoods and give an efficient algorithm for constructing a solution neighbourhood for the regular constraint. In the systematic search approach to CP, we make the following two contributions: we show that there may be unwanted consequences when using a propagator that may underestimate the cost of a soft constraint, as the propagator may guide the search to incorrect (non-optimum) solutions to an over-constrained problem, and we introduce and compare several propagators that correctly compute the cost of the edit-distance based soft-regular constraint; and we show that the context-free grammar constraint is useful and introduce an improved propagator for it.

Keywords: constraint programming, regular constraint, automaton constraint, context-free grammar constraint, solution neighbourhood,

Jun He, Uppsala University, Department of Information Technology, Computing Science, Box 337, SE-751 05 Uppsala, Sweden.

© Jun He 2013

ISSN 1651-6214
ISBN 978-91-554-8617-4
urn:nbn:se:uu:diva-196347 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-196347)

To my family

Acknowledgements

First and foremost I would like to thank my supervisors Professor Pierre Flener and Associate Professor Justin Pearson for their invaluable support and supervision during my PhD research. Thank you for providing me with the opportunity to work with you and for always giving me encouragement and patient guidance, without which I could not have achieved my research goals. I also appreciate your tremendous effort in teaching me how to write scientific papers and your comments on drafts of this thesis.

I would like to thank Professor Weiming Zhang for involving me with the PhD programme of the National University of Defense Technology (NUDT) of China and for supporting my application for a PhD scholarship from the China Scholarship Council (CSC). Thanks also to Professor Xianqing Yi for his supervision during my master's programme at NUDT.

I am grateful to Dr Magnus Ågren, Dr Serdar Kadıoğlu, and Dr Toni Mancini for their help and useful discussions on some aspects of my research. Thanks to Dr George Katsirelos for some useful discussions on the CFG constraint. Thanks to Assistant Professor Willem-Jan van Hoeve and Associate Professor Louis-Martin Rousseau for some useful discussions on the edit-distance based SOFTREGULAR constraint. Thanks to Associate Professor Xiang Fu, Dr Adam Kieżun, and Dr Prateek Saxena for some useful discussions on SUSHI, HAMPI, and KALUZA respectively.

I enjoyed working in the ASTRA group on constraint programming. I would like to thank Karl Sundequist for helping me to write the Swedish summary of this thesis. Thanks to Farshid Hassani Bijarbooneh for some useful discussions on COMET and GECODE, and for kindly sharing with me an apartment and some delicious Persian food in 2010. Thanks to Loïc Blet for proofreading one of my COMET programs.
Thanks also to Dr Jean-Noël Monette, Joseph Scott, María Andreína Francisco Rodríguez, and Farshid Hassani Bijarbooneh for presenting some interesting papers and sharing some delicious cookies in the ASTRA journal club.

Special thanks to CSC and NUDT for providing me with financial support during my PhD research. Thanks to Associate Professor Lars-Henrik Eriksson and the Computing Science Division (CSD) for kindly providing me with financial support after the expiration of my Chinese scholarship and for offering me an English course, and thanks to Maria Njoo for teaching me. Thanks to Ulrika Andersson and Anne-Marie Jalstrand at the IT department for their help and support. Thanks also to Mr Rui Fan and Mr Wei Wang at the Education Section of the Chinese Embassy in Sweden for their help and support.

I would like to thank the Association for Constraint Programming (ACP) for giving me financial support to attend the ACP summer schools 2010 and 2011 and the CP 2009 conference. Thanks also to the ACM Special Interest Group on Applied Computing (SIGAPP) for giving me a student travel award to attend the ACM SAC 2012 conference.

I would also like to thank Professor Wang Yi for his encouragement and help. Special thanks to Minpeng Zhu: thank you for always being a good listener and for joining me for lunch. Thanks to Nan Guan for always being like a big brother, with constant encouragement and suggestions. I am grateful to the Chinese student community at house one of the IT department: Minpeng Zhu, Nan Guan, Cheng Xu, Ran Ji, Xiaoyue Pan, Ping Lu, and Yunyun Zhu. Thank you for inviting me to many interesting games and parties with delicious Chinese food. Thanks also to my NUDT colleagues: Jinping Yuan, Longming Dong, Lidong Cheng, Xin Lu, Chaofang Zhang, Haining Wang, and Fei Cai for their help and encouragement.

Last, but not least, I would like to thank my parents, my wife, and my son for all their love and support.
Without your encouragement and understanding, it would have been impossible for me to finish this thesis.

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Jun He, Pierre Flener, and Justin Pearson. An AUTOMATON Constraint for Local Search. Fundamenta Informaticae, 107(2–3):223–248, 2011. An earlier version was published in Electronic Proceedings in Theoretical Computer Science, 5:13–25, 2009, and also in Proceedings of RCRA'09, the 16th RCRA International Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion.

II Jun He, Pierre Flener, and Justin Pearson. Solution Neighbourhoods for Constraint-Directed Local Search. In: S. Bistarelli, E. Monfroy, and B. O’Sullivan (editors), Proceedings of SAC/CSP’12, the track on Constraint Solving and Programming of the 27th ACM Symposium on Applied Computing, pages 74–79. ACM Press, 2012.

III Jun He, Pierre Flener, and Justin Pearson. Underestimating the Cost of a Soft Constraint is Dangerous: Revisiting the Edit-Distance Based Soft Regular Constraint. Accepted subject to minor revisions by the Journal of Heuristics.

IV Jun He, Pierre Flener, and Justin Pearson. Solving String Constraints: The Case for Constraint Programming. Submitted to a conference.

Reprints were made with permission from the publishers. Papers I, II, and IV are verbatim copies of the original papers but are reformatted to the one-column format of this thesis.

Comments on my Participation

Paper I: I am the principal author of this paper. I proposed the ideas, designed the algorithms, performed the experiments, and participated in writing the paper. My supervisors contributed to the discussions.

Paper II: I am the principal author of this paper. I proposed the ideas, designed the algorithms, performed the experiments, and was the lead writer of the paper. My supervisors contributed to the discussions.

Paper III: I am the principal author of this paper. I proposed the ideas, designed the algorithms, performed the experiments, and was the lead writer of the paper. My supervisors contributed to the discussions.

Paper IV: I am the principal author of this paper. I proposed the ideas, designed the algorithms, performed the experiments, and was the lead writer of the paper. My supervisors contributed to the discussions and wrote the introduction and conclusion of the paper.

Summary in Swedish

Constraint Programming (CP, se t.ex. [2]) är ett deklarativt högnivåparadigm som använder en uppsättning villkor för att modellera och lösa kombinatoriska problem. Om ett CP-språk saknar ett villkor som gör det möjligt att formulera en viss modell av ett kombinatoriskt problem, så har modelleraren traditionellt följande val: 1. att byta till ett annat CP-språk som har alla nödvändiga villkor; 2. att formulera en annan modell som inte kräver de saknade villkoren; 3. att implementera de saknade villkoren i det lågnivåspråk som valts av CP-språket. Det första alternativet är ofta omöjligt i praktiken, eftersom det inte nödvändigtvis finns något CP-språk som innehåller alla nödvändiga villkor; det andra alternativet kan vara svårt eller göra så att problemet löses ineffektivt. Följaktligen har användarnivå-utökbara CP-språk (för att underlätta det tredje alternativet) varit ett viktigt mål i över ett decennium [19, 9, 7, 33, 31, 32, 18, 1]. Formella språk som anges av deterministiska ändliga automater (DFA) och kontextfria grammatiker (CFG) ger en rik formalism för att uttrycka villkor som medlemskap i sådana språk. En viktig faktor är att 120 av de för närvarande 364 globala villkoren i Global Constraint Catalogue [5] beskrivs av DFAer som eventuellt är berikade med räknare [7]. Därför är villkor för medlemskap i formella språk lovande för att ge användarnivå-utökbara CP-språk. Såvitt vi vet har två sådana villkor införts i litteraturen: • REGULAR-villkoret [33] (en generalisering av vilket är känd som AUTOMATON-villkoret [7]) är baserat på reguljära språk. REGULAR-villkoret kräver att en sekvens av beslutsvariabler tillhör ett reguljärt språk som anges av en DFA eller ett reguljärt uttryck; AUTOMATON-villkoret kan även ta en DFA med räknare, samt en icke-deterministisk ändlig automat (NFA). • CFG-villkoret [36, 39] bygger på kontextfria språk, vilket är en rikare klass än reguljära språk.
CFG-villkoret kräver att en sekvens av beslutsvariabler tillhör ett kontextfritt språk, som anges av en CFG. I denna avhandling behandlar vi den viktiga frågan om att förbättra användningen av villkoren för medlemskap i formella språk under både systematisk sökning och stokastisk lokal sökning i CP (kallad constraint-based local search, CBLS, se [41]) och gör följande bidrag: • För CBLS, givet en sekvens X av n beslutsvariabler och en DFA A, föreslår vi den första stokastiska metoden för att definiera och upprätthålla antalet brytningar emot ett villkor, nämligen REGULAR(X,A)-villkoret och alla dess beslutsvariabler, i O(n) tid, och fastställer dess praktiska användbarhet genom att jämföra den experimentellt med den O(n · |A|)-tids deterministiska metoden [35], som såvitt vi vet är det enda existerande arbetet utom vårt angående REGULAR(X,A)-villkoret för CBLS. Vi utökar också vår metod till AUTOMATON-villkoret med räknare. (Artikel I) En förväntad inverkan av detta arbete är att det kan vara lämpligt att definiera och upprätthålla antalet brytningar av andra villkor stokastiskt, om den stokastiska metoden har en lägre komplexitet än en deterministisk metod. • För CBLS löses ett problem genom iterationer av utvärdering av en lösnings grannar. Därför är det viktigt att välja bra grannar under sökningen, eftersom valet av bättre grannar vanligtvis gör att problemet löses mer effektivt. Vi föreslår ett ramverk för att använda solution neighbourhoods, som endast innehåller lösningar till ett valt villkor. Baserat på ramverket introducerar vi en metod för att utforma ett solution neighbourhood för REGULAR-villkoret och visar dess praktiska användbarhet experimentellt. En fördel med att använda solution neighbourhoods är att vi sparar den tid som behövs för att utvärdera grannar till detta villkor, eftersom en flytt till vilken granne som helst ger samma förändring i antalet brytningar av det villkoret.
(Artikel II) En förväntad inverkan av detta arbete är att solution neighbourhoods kan vara användbara speciellt för andra villkor för vilka det inte finns någon känd konstanttidsalgoritm för utvärdering av grannar. • Inom systematisk sökning hävdar vi att CP-lösare med villkor för medlemskap i formella språk är mycket mer lämpade än de befintliga lösare för programverifieringsverktyg som måste lösa strängvillkor, eftersom CP har en rik tradition av villkor för medlemskap i formella språk. (Artikel IV) En förväntad inverkan av detta arbete är att CP kan bli vida använt som ett kraftfullt verktyg för att lösa sådana programverifieringsproblem. • De flesta CP-lösare behandlar villkor över ett fast antal beslutsvariabler. Eftersom varje språk av fast storlek är ändligt och därmed reguljärt, kan varje CFG-villkor för en fast stränglängd omvandlas till ett REGULAR-villkor som i [25]. Inom systematisk sökning behövs O(n · |A|) tid för att uppnå domänkonsistens (även känd som generalised arc consistency) för ett REGULAR-villkor med en automat A av storlek |A|, men O(n³ · |G|) tid för ett CFG-villkor med en grammatik G av storlek |G|. Därför beror behovet av ett CFG-villkor på grammatiken och på propagatorernas komplexitet. Såvitt vi vet har ingen verklig grammatik uppvisats där den dyra CFG-propagatorn utklassar den billiga REGULAR-propagatorn, åtminstone när omformuleringen enligt [25] av en grammatik till en automat för en fast längd n görs off-line; och ingen CP-lösare inkluderar CFG-villkoret. Vi visar att det finns icke-påhittade CFGer där en dyr CFG-propagator slår en billig REGULAR-propagator när CFGn måste omformuleras on-line till en DFA för en fast stränglängd för den senare. (Artikel IV) En förväntad inverkan av detta arbete är att CFG-villkoret kommer att locka mer forskningsintresse och att vissa CP-lösare slutligen inkluderar det.
• Inom systematisk sökning, givet ett CFG-villkor med en grammatik G och en sekvens av n beslutsvariabler, uppnår propagatorn av [23] domänkonsistens i O(n³ · |G|) tid med O(n² · |G|) utrymme, vilket är bättre än den dekompositionsbaserade propagatorn av [37] med O(n³ · |G|) tid och utrymme. Vi förbättrar CFG-propagatorn av [23] genom att utnyttja en idé från [27] för att omformulera en grammatik till ett reguljärt uttryck för en fast stränglängd. Vi kommer att bidra med vår implementation av den nya propagatorn till CP-lösaren Gecode [15], som har öppen källkod. (Artikel IV) En förväntad inverkan av detta arbete är att vår idé även kan användas för att förbättra propagatorerna av [36, 37, 26]. • Många verkliga problem är överbegränsade, så att det inte finns någon lösning som uppfyller alla deras villkor. Mjuka villkor, med kostnader som anger hur mycket villkoren bryts, används för att lösa dessa problem i CP. Inom systematisk sökning har två mjuka versioner av REGULAR-villkoret, nämligen det Hamming-avståndsbaserade och det edit-avståndsbaserade SOFTREGULAR-villkoret med två olika kostnadsmått, samt deras propagatorer introducerats i [42]. Dock kan propagatorn underskatta kostnaden för det edit-avståndsbaserade SOFTREGULAR-villkoret. Vi använder det edit-avståndsbaserade SOFTREGULAR-villkoret som ett exempel för att visa att en propagator som ibland underskattar kostnaden för ett mjukt villkor kan styra sökningen till felaktiga (icke-optimala) lösningar på överbegränsade problem. Därför är det viktigt att propagatorn beräknar kostnaden korrekt. För att beräkna kostnaden korrekt för det edit-avståndsbaserade SOFTREGULAR-villkoret för en sekvens av n beslutsvariabler föreslår vi en O(n²)-tids propagator baserad på dynamisk programmering samt ett bevis på dess korrekthet. Vi ger också en förbättrad propagator baserad på en idé för att beräkna edit-avståndet mellan två strängar [40], som först antar att edit-avståndet är ett och sedan verifierar antagandet: om antagandet är sant, så har det korrekta edit-avståndet beräknats; annars görs ett nytt antagande genom att fördubbla värdet av det tidigare antagna edit-avståndet. Vår förbättrade propagator har samma kvadratiska tidskomplexitet i värsta fall, men bättre resultat i praktiken. (Artikel III) En förväntad inverkan av detta arbete är att vi kan använda idén från [40] för andra villkor med propagatorer baserade på dynamisk programmering.

Contents

1 Introduction ...... 15
1.1 Research Questions ...... 16
1.2 Contributions and Expected Impacts ...... 18
1.3 Outline of the Thesis ...... 19

2 Constraint Programming ...... 21
2.1 Decision Variables and Domains ...... 21
2.2 Constraints ...... 22
2.2.1 Hard Constraints ...... 22
2.2.2 Soft Constraints ...... 22
2.3 Combinatorial Problems ...... 23
2.3.1 Constraint Satisfaction Problems ...... 23
2.3.2 Constrained Optimisation Problems ...... 24
2.4 Search Approaches ...... 25
2.4.1 Systematic Search ...... 25
2.4.2 Stochastic Local Search ...... 30

3 Constraints for Membership in Formal Languages ...... 36
3.1 Formal Languages ...... 36
3.2 The REGULAR Constraint ...... 39
3.2.1 Definition ...... 39
3.2.2 Related Work ...... 41
3.3 The SOFTREGULAR Constraint ...... 42
3.3.1 Definition ...... 42
3.3.2 Related Work ...... 44
3.4 The AUTOMATON Constraint with Counters ...... 45
3.4.1 Definition ...... 45
3.4.2 Related Work ...... 48
3.5 The CFG Constraint ...... 48
3.5.1 Definition ...... 48
3.5.2 Related Work ...... 49

4 Summary of Papers ...... 51
4.1 Paper I: An AUTOMATON Constraint for Local Search ...... 51
4.2 Paper II: Solution Neighbourhoods for Constraint-Directed Local Search ...... 52
4.3 Paper III: Underestimating the Cost of a Soft Constraint is Dangerous: Revisiting the Edit-Distance Based Soft Regular Constraint ...... 53
4.4 Paper IV: Solving String Constraints: The Case for Constraint Programming ...... 54

5 Conclusion ...... 56

References ...... 57

A Omitted Materials from Paper I ...... 61
A.1 Two cDFAs ...... 61
A.2 The St Louis Police problem ...... 61
A.2.1 The Model ...... 63
A.2.2 Experimental Results ...... 63

B Omitted Materials from Paper II ...... 64

C Omitted Materials from Paper IV ...... 66

1. Introduction

Constraint Programming (CP, e.g., see [2]) is a high-level declarative paradigm that uses a set of constraints to model and solve combinatorial problems. If a CP language lacks a constraint that would allow the formulation of a particular model of a combinatorial problem, then the modeller traditionally has the following three choices:
1. switching to another CP language that has all the required constraints;
2. formulating a different model that does not require the lacking constraint;
3. implementing the lacking constraint in the low-level implementation language of the chosen CP language.
The first option is often impossible in practice, as there may be no CP language containing all the required constraints; the second option may be difficult, or may make the problem solvable only inefficiently. Hence, the user-level extensibility of CP languages (for facilitating the third option) has been an important goal for over a decade [19, 9, 7, 33, 31, 32, 18, 1]. Formal languages that are specified by deterministic finite automata (DFA) and context-free grammars (CFG) provide a rich formalism to express constraints as membership in such languages. One important fact is that 120 of the currently 364 global constraints in the Global Constraint Catalogue [5] are described by DFAs that are possibly enriched with counters [7]. Hence, constraints for membership in formal languages are promising for providing user-level extensibility of CP languages. As far as we know, two such constraints have been introduced in the literature: • The REGULAR constraint [33] (a generalisation of which is known as the AUTOMATON constraint [7]) is based on regular languages. The REGULAR constraint requires a sequence of decision variables to belong to a regular language, which is specified by a DFA or a regular expression; the AUTOMATON constraint even takes a DFA with counters, as well as a non-deterministic finite automaton (NFA).
• The CFG constraint [36, 39] is based on context-free languages, which is a richer class than regular languages. The CFG constraint requires a sequence of decision variables to belong to a context-free language, which is specified by a CFG. In CP, two orthogonal search approaches exist: • In the traditional systematic search approach, a problem is solved by ex- ploring a search tree, where all possible variable-value combinations in the domains of all decision variables are intelligently enumerated until a

solution to the problem is found or it is proved that none exists. At each node of the search tree, constraint propagation is performed separately for all constraints in the problem to remove from the current domains of the decision variables some (but not necessarily all) inconsistent values, which cannot be part of a solution to the constraint, and is repeated until no more pruning is possible (a fixpoint). Hence, each constraint is associated with a propagation algorithm, called a propagator. • In the more recent stochastic local search approach (called constraint-based local search, CBLS, see [41]), there are incremental algorithms to compute for each decision variable (or constraint) how many violations are caused by the decision variable (or how much the constraint is violated), where a constraint has zero violation if and only if the constraint is satisfied under the current assignment to the decision variables. A problem is modelled and solved as an optimisation problem to minimise the total amount of constraint violation. From an initial assignment of values to all decision variables, CBLS iteratively moves to another assignment, which ideally decreases the total amount of constraint violation, by exploring a neighbourhood of small changes to the current assignment of some (usually violating) decision variable(s), until an assignment with zero constraint violation is found or until the allocated resources are exhausted. Meta-heuristics are used to escape local optima. In this thesis, we address the core question of improving the usage of constraints for membership in formal languages under both systematic search and stochastic local search in CP.
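To make membership in a regular language concrete for a fixed assignment of the decision variables, the following minimal sketch simulates a DFA over a sequence of values; the parity automaton and all names here are invented for illustration and are not taken from the thesis or its papers.

```python
# Sketch: checking a REGULAR-style constraint on a fixed assignment by
# simulating a DFA over it. The DFA below (an even number of 1s) is an
# invented toy example.

def dfa_accepts(delta, start, accepting, word):
    """Run the DFA on `word`; return True iff it ends in an accepting state."""
    state = start
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:          # missing transition: word rejected
            return False
    return state in accepting

# DFA over {0, 1} accepting sequences with an even number of 1s.
delta = {("even", 0): "even", ("even", 1): "odd",
         ("odd", 0): "odd", ("odd", 1): "even"}

assert dfa_accepts(delta, "even", {"even"}, [1, 0, 1])   # two 1s: accepted
assert not dfa_accepts(delta, "even", {"even"}, [1, 0])  # one 1: rejected
```

Under systematic search, a propagator reasons over all DFA runs consistent with the current domains; under CBLS, such a check is the zero-violation test for one complete assignment.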

1.1 Research Questions

Most research on constraints for membership in formal languages focuses on the systematic search approach. To the best of our knowledge, at the start of my PhD, the only existing work for CBLS was [35], which provides a deterministic method for defining and maintaining the violations of the REGULAR constraint and all its decision variables. Given a REGULAR constraint with a DFA A of size |A| and a sequence of n decision variables, this method requires O(n · |A|) time to maintain the violations, which is too expensive for CBLS. Hence, the first question that came to mind was:

Question 1. Can we improve the method of [35], and find a better way to define and maintain the violations of the REGULAR constraint and all its decision variables? Can we even extend such a method to the AUTOMATON constraint with counters, which is more powerful than the REGULAR constraint?
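For intuition behind this question, one natural violation measure for REGULAR(X, A) is the minimum number of variables in X whose values must change for A to accept. The dynamic program below sketches the O(n · |A|)-time flavour of computing such a measure from scratch; it is an illustrative reading, not the exact violation definition or maintenance algorithm of [35].

```python
# Sketch: violation of REGULAR(X, A) taken as the minimum number of
# variable changes needed for the DFA A to accept the current assignment.
# Invented example, not the method of [35].

def regular_violation(delta, start, accepting, alphabet, word):
    """O(n * |A|)-style DP: dist maps DFA state -> min changes so far."""
    n = len(word)
    INF = n + 1
    dist = {start: 0}
    for symbol in word:
        nxt = {}
        for state, cost in dist.items():
            for a in alphabet:          # try keeping or changing this variable
                s2 = delta.get((state, a))
                if s2 is None:
                    continue
                c2 = cost + (0 if a == symbol else 1)
                if c2 < nxt.get(s2, INF):
                    nxt[s2] = c2
        dist = nxt
    return min((c for s, c in dist.items() if s in accepting), default=INF)

# Even-number-of-1s DFA over {0, 1}, as a toy example.
delta = {("even", 0): "even", ("even", 1): "odd",
         ("odd", 0): "odd", ("odd", 1): "even"}

assert regular_violation(delta, "even", {"even"}, [0, 1], [1, 0, 1]) == 0
assert regular_violation(delta, "even", {"even"}, [0, 1], [1, 0]) == 1
```

Recomputing this after every local move is exactly the per-move cost that a cheaper (e.g., stochastic) maintenance scheme would try to avoid.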

Under CBLS, a problem is solved by iterations of neighbourhood evaluation. Hence, it is important to choose good neighbourhoods during search, as the

choice of better neighbourhoods usually makes the problem solvable more efficiently. Considering the REGULAR constraint, we ask the following question:

Question 2. Can we design a neighbourhood for the REGULAR constraint that takes advantage of the structure of the constraint?

Now, we focus on the systematic search approach to CP. Although constraints for membership in formal languages have already been successfully used to solve many problems, most of them are personnel rostering [12, 6] and car sequencing [43] problems. Hence, we pose the following question:

Question 3. Can we find other problems, where constraints for membership in formal languages can also be successfully used?

Most CP solvers deal with constraints on a fixed number of decision variables. Since every fixed-size language is finite and hence regular, every CFG constraint for a fixed string length can be transformed into a REGULAR constraint, as in [25] for instance. Achieving domain consistency (also known as generalised arc consistency) takes O(n · |A|) time for a REGULAR constraint with an automaton A of size |A|, but O(n³ · |G|) time for a CFG constraint with a grammar G of size |G|. Hence, the need for a CFG constraint depends on the grammar and on the complexities of the propagators. To the best of our knowledge, no real-life grammar has been exhibited where the expensive CFG propagator outperforms the cheap REGULAR propagator, at least when the reformulation of [25] of a grammar into an automaton for a fixed length n is done off-line; and no CP solver includes the CFG constraint. Hence, we pose the following question:

Question 4. Is the CFG constraint useful? Can we find some problem where the expensive CFG propagator outperforms the cheap REGULAR propagator?
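The O(n³ · |G|) figure above reflects CYK-style parsing, which underlies CFG propagation. As a hedged illustration of where that cubic cost comes from (three nested loops over span length, start, and split point), here is a minimal CYK membership test for a grammar in Chomsky normal form; the tiny grammar is an invented example, not one from the thesis.

```python
# Sketch: CYK membership for a CFG in Chomsky normal form, illustrating
# the O(n^3 * |G|) cost of grammar-based reasoning. Toy grammar:
# S -> A B, A -> 'a', B -> 'b' (invented example).

def cyk_accepts(word, unary, binary, start):
    """unary: terminal -> nonterminals; binary: (B, C) -> {A | A -> B C}."""
    n = len(word)
    if n == 0:
        return False
    # table[i][l]: nonterminals deriving the substring word[i : i + l + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(word):
        table[i][0] = set(unary.get(a, ()))
    for length in range(2, n + 1):          # substring length
        for i in range(n - length + 1):     # substring start
            for split in range(1, length):  # where to split in two
                for B in table[i][split - 1]:
                    for C in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary.get((B, C), set())
    return start in table[0][n - 1]

unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "B"): {"S"}}
assert cyk_accepts("ab", unary, binary, "S")
assert not cyk_accepts("ba", unary, binary, "S")
```

A CFG propagator does this style of table computation over variable domains rather than a single word, which is why its per-node cost is so much higher than the REGULAR propagator's O(n · |A|).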

Given a CFG constraint with a grammar G and a sequence of n decision variables, the propagator of [23] achieves domain consistency in O(n³ · |G|) time with O(n² · |G|) space, which is better than the decomposition-based propagator of [37] with O(n³ · |G|) time and space. Since then, as far as we know, no one has improved the propagator of [23], and this may be why Question 4 has so far been answered negatively. Hence, we pose the following question:

Question 5. Can we improve the propagator of [23] for the CFG constraint?

Many real-life problems are over-constrained, so that no solution satisfying all their constraints exists. Soft constraints, with costs denoting how much the constraints are violated, are used to solve these problems in CP. In [42], two softened versions of the REGULAR constraint, namely the Hamming-distance

based and the edit-distance based SOFTREGULAR constraints with two different cost measures, and their propagators have been introduced. However, the propagator for the edit-distance based SOFTREGULAR constraint may underestimate its cost. Hence, we pose the following question:

Question 6. Are there undesirable consequences of using a propagator that may underestimate the cost of a soft constraint? If yes, then is there a propagator that correctly computes the cost of the edit-distance based SOFTREGULAR constraint?

1.2 Contributions and Expected Impacts

By answering the research questions above, we make the following contributions in this thesis: • For CBLS, given a sequence X of n decision variables and a DFA A, we propose the first stochastic method of defining and maintaining the violations of a constraint, namely the REGULAR(X,A) constraint and all its decision variables, in O(n) time, and establish its practicality by comparing it experimentally with the O(n · |A|)-time deterministic method of [35]. We also extend our method to the AUTOMATON constraint with counters. (Question 1 and Paper I) One expected impact of this work is that it may be useful to define and maintain violations of other constraints stochastically, if a stochastic method has a lower complexity than a deterministic method. • For CBLS, we propose a framework of using solution neighbourhoods, which contain only solutions to a chosen constraint. Based on the framework, we introduce a method of designing a solution neighbourhood for the REGULAR constraint, and demonstrate its practicality experimentally. One advantage of using solution neighbourhoods is that we save the time needed for neighbourhood evaluation of that constraint, as moving to any neighbour has the same violation change for that constraint. (Question 2 and Paper II) One expected impact of this work is that solution neighbourhoods may be useful especially for other constraints for which there exists no known constant-time algorithm for neighbour evaluation. • In the systematic search approach, we argue that CP solvers with constraints for membership in formal languages are much more suitable than existing solvers used in program verification tools that have to solve string constraints, as CP has a rich tradition of constraints for membership in formal languages. (Question 3 and Paper IV) One expected impact of this work is that CP may be widely used as a powerful tool for solving such program verification applications.

• In the systematic search approach, we show that there are non-contrived CFGs where an expensive CFG propagator beats a cheap REGULAR propagator, when the CFG must be reformulated on-line for the latter into a DFA for a fixed string length. (Question 4 and Paper IV) One expected impact of this work is that the CFG constraint will attract more research interest and that some CP solvers may finally include it. • In the systematic search approach, we improve the CFG propagator of [23] by exploiting an idea of [27] for reformulating a grammar into a regular expression for a fixed string length. We will contribute our implemented new propagator to the open-source CP solver GECODE [15]. (Question 5 and Paper IV) One expected impact of this work is that our idea may also be used to improve the CFG propagators of [36, 37, 26]. • In the systematic search approach, we use the edit-distance based SOFTREGULAR constraint as an example to show that a propagator that sometimes underestimates the cost of a soft constraint may guide the search to incorrect (non-optimal) solutions to an over-constrained problem. Hence, it is crucial for the propagator to compute the cost correctly. To compute the cost correctly for the edit-distance based SOFTREGULAR constraint on a sequence of n decision variables, we propose an O(n²)-time propagator based on dynamic programming and a proof of its correctness. We also give an improved propagator based on a clever idea in [40] of improving the classical dynamic programming algorithm for computing the edit distance between two strings [45], which first assumes the edit distance is one and then verifies the assumption: if the assumption is true, then the correct edit distance has been computed; otherwise, another assumption is made by doubling the value of the previously assumed edit distance. Our improved propagator has the same quadratic time complexity in the worst case, but better performance in practice.
(Question 6 and Paper III) One expected impact of this work is that we may use that idea of [40] for other constraints with propagators based on dynamic programming.
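The assume-and-double scheme of [40] described above can be sketched on plain strings as follows: assume the edit distance is at most k, run a dynamic program restricted to a band of width k around the diagonal in O(k · n) time, and double k whenever the assumption fails. This is a minimal illustration of the string-level idea only, not the thesis propagator; the function names are mine.

```python
# Sketch of the doubling idea: a banded edit-distance DP that is exact
# whenever its result is at most the assumed bound k.

def banded_distance(s, t, k):
    """Edit distance if it is at most k; otherwise some value > k."""
    n, m = len(s), len(t)
    if abs(n - m) > k:
        return k + 1
    INF = k + 1
    prev = [min(j, INF) for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [INF] * (m + 1)
        if i <= k:
            curr[0] = i
        for j in range(max(1, i - k), min(m, i + k) + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + sub)
        prev = curr
    return prev[m]

def edit_distance(s, t):
    k = 1
    while True:
        d = banded_distance(s, t, k)
        if d <= k:       # assumption verified: d is the exact distance
            return d
        k *= 2           # assumption failed: double the bound and retry

assert edit_distance("kitten", "sitting") == 3
assert edit_distance("abc", "abc") == 0
```

Since the band only excludes alignments, the banded result never undershoots the true distance, so a result within the bound is guaranteed exact; small distances, the common case in practice, are found after only a few cheap rounds.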

1.3 Outline of the Thesis

The rest of this thesis is organised as follows: • Chapter 2 gives the necessary background material on CP for being able to read Papers I to IV. • Chapter 3 gives the necessary background material on constraints for membership in formal languages for being able to read Papers I to IV. • Chapter 4 summarises the papers included in this thesis. • Chapter 5 summarises this thesis.

• Appendices A, B, and C give the materials omitted (due to space limitations of the original publications) from Papers I, II, and IV, respectively.

2. Constraint Programming

This chapter gives the necessary background material on constraint programming (CP, e.g., see [2]) for being able to read Papers I to IV. CP is a declarative paradigm that uses a set of constraints to model and solve combinatorial problems.1 Combinatorial problems arise in many areas of computer science and application domains, such as scheduling, planning, time-tabling, routing, resource allocation, and configuration. Combinatorial problems can be constraint satisfaction (Section 2.3.1) or constrained optimisation (Section 2.3.2) problems. These problems involve finding assignments to a finite sequence of discrete-domain variables that satisfy some constraints, such that some cost (or benefit) expression on these variables takes a minimum (or maximum) value if the problem is an optimisation problem. In CP, a combinatorial problem is modelled with decision variables (Section 2.1) and constraints (Section 2.2), and can be solved by one of two orthogonal search approaches, namely systematic search (Section 2.4.1) and stochastic local search (Section 2.4.2). The modelling and solving of a combinatorial problem are separated. This separation promotes the reuse of models and constraints. In addition, this separation simplifies the revision or extension of a CP application, as it allows a user of a CP system to try different models without changing the search approach, or vice versa.

2.1 Decision Variables and Domains

The decision variables of a combinatorial problem are a sequence of quantities that need to be determined in order to solve the problem. Each decision variable is associated with a domain, which is a finite set of values that can be assigned to the decision variable. Hence, a combinatorial problem is solved by deciding for each decision variable which value in its domain it should be assigned. A domain is called a singleton if it contains exactly one value, and a decision variable is called bound if its domain is a singleton (or undecided if its domain contains more than one value).

1 Actually, there are also CP approaches to solving problems over continuous-domain decision variables; this thesis focuses, however, on combinatorial problems, where the decision variables range over discrete domains.

2.2 Constraints

There are two kinds of constraints, namely hard constraints, which cannot be violated, and soft constraints, which can be violated and are associated with costs for this purpose.

2.2.1 Hard Constraints

In a combinatorial problem, the constraints that must be satisfied are called hard constraints. Let X = ⟨X1,...,Xn⟩ be a sequence of n decision variables, where the domain of decision variable Xi (for all Xi ∈ X) is denoted by D(Xi). A hard constraint C on X is often specified by an intensionally defined subset of the Cartesian product of the domains of all decision variables in X: C ⊆ D(X1) × ··· × D(Xn). An assignment X := ⟨v1,...,vn⟩ ∈ C is called a solution to C.

Example 1. The ALLDIFFERENT constraint is a commonly used constraint in CP. Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables, the ALLDIFFERENT constraint requires that all decision variables of X must take pairwise different values. Hence, we have

ALLDIFFERENT(X) = {⟨v1,...,vn⟩ | ∀i : vi ∈ D(Xi) ∧ ∀j ≠ i : vj ≠ vi}

Given n = 3, where all decision variables of X have the same domain {1,2,3}, we have ALLDIFFERENT(X) = {⟨1,2,3⟩, ⟨1,3,2⟩, ⟨2,1,3⟩, ⟨2,3,1⟩, ⟨3,1,2⟩, ⟨3,2,1⟩}, and the assignment X := ⟨1,2,3⟩ is one of the six solutions to the ALLDIFFERENT(X) constraint.
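For intuition, the extensional solution set of Example 1 can be enumerated by brute force on small instances. A minimal Python sketch (the function name is ours, not from the thesis):

```python
from itertools import product

def alldifferent_solutions(domains):
    """Enumerate all solutions to ALLDIFFERENT over the given domains
    by filtering the Cartesian product of the domains."""
    return [v for v in product(*domains)
            if len(set(v)) == len(v)]  # pairwise different values

# n = 3, all domains {1, 2, 3}: the six permutations are the solutions.
solutions = alldifferent_solutions([{1, 2, 3}] * 3)
```

Such brute-force enumeration is of course exponential; a real propagator prunes the domains without ever enumerating this set.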

2.2.2 Soft Constraints

Many real-life problems are over-constrained, so that no solution satisfying all their constraints exists. In such situations, some constraints need to be relaxed so that we can at least find an assignment that is as close as possible to a solution, instead of finding no solution at all. The constraints that are allowed to be violated are called soft constraints. Given a constraint C on a sequence X = ⟨X1,...,Xn⟩ of n decision variables, a softened version of the constraint is defined as soft_C(X,µ,z), where µ : D(X1) × ··· × D(Xn) → R is a cost measure function with µ(v1,...,vn) = 0 if and only if ⟨v1,...,vn⟩ ∈ C, and µ(v1,...,vn) > 0 otherwise, and z is a non-negative cost variable denoting how much C is violated under the cost measure µ. For the soft_C(X,µ,z) constraint, we use the definition of [42].

Definition 1 (Constraint Softening). Let z be a cost variable with finite domain D(z) and C a constraint on a sequence X = ⟨X1,...,Xn⟩ of n decision variables

with a cost measure µ. Then

soft_C(X,µ,z) = {⟨v1,...,vn,vz⟩ | ∀i : vi ∈ D(Xi) ∧ vz ∈ D(z) ∧ µ(v1,...,vn) ≤ vz}

An assignment ⟨X,z⟩ := ⟨v1,...,vn,vz⟩ ∈ soft_C(X,µ,z) is called a solution to the soft_C(X,µ,z) constraint, as the violation of the constraint, which is µ(v1,...,vn), does not exceed the allowed cost value vz.

Example 2. In [34, 42], a softened version of the ALLDIFFERENT constraint is defined as SOFTALLDIFFERENT(X,µ,z), where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables, µ is a cost measure defined in [34] such that

µ(v1,...,vn) = ∑_{v ∈ ⋃_{i=1}^{n} D(Xi)} max(|{ j ∈ [1,n] | vj = v }| − 1, 0)

for any assignment X := ⟨v1,...,vn⟩, and z is a cost variable. This cost measure µ counts the number of decision variables that need to take another value for the hard constraint to be satisfied. Given n = 3, where all decision variables of X have the same domain {1,2,3} and D(z) = {0,1,2}, we have that µ(1,1,1) = 2, µ(1,2,1) = 1, and µ(1,2,3) = 0. Hence the assignments ⟨X,z⟩ := ⟨1,1,1,2⟩, ⟨X,z⟩ := ⟨1,2,1,1⟩, and ⟨X,z⟩ := ⟨1,2,1,2⟩ are some solutions to the SOFTALLDIFFERENT(X,µ,z) constraint.
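The cost measure µ can be computed directly from its definition. A Python sketch (names ours), reproducing the three values of Example 2:

```python
def mu(values, domains):
    """Cost measure of SOFTALLDIFFERENT from [34]: for every value in the
    union of the domains, count how many decision variables take it beyond
    the first, i.e. how many variables would have to change value."""
    union = set().union(*domains)
    return sum(max(list(values).count(u) - 1, 0) for u in union)

domains = [{1, 2, 3}] * 3  # n = 3, all domains {1, 2, 3}
```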

2.3 Combinatorial Problems

There are two kinds of combinatorial problems, namely constraint satisfaction problems and constrained optimisation problems.

2.3.1 Constraint Satisfaction Problems

A constraint satisfaction problem (CSP) is defined as a triple ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of decision variable Xi, and C is a finite set of constraints on X. We call an assignment to X a solution to the problem if and only if the assignment respects the domains and is a solution to all constraints of C.

Example 3. The n-queens problem is a generalisation of the 8-queens problem, which was first proposed in 1848 by a German chess player in [8]. In this problem, n ≥ 4 non-attacking queens are to be placed on an n × n chessboard so that no two queens are in the same row, column, or diagonal. For example, Figure 2.1 represents a solution to the 4-queens problem. The n-queens problem can be represented by the following CSP:

Figure 2.1. The solution X := ⟨2,4,1,3⟩ to the 4-queens problem

• The decision variables:

X = ⟨X1,...,Xn⟩

• The domains: D(X1) = ··· = D(Xn) = {1,...,n}
• The constraints:

C1 : ALLDIFFERENT(X1,...,Xn)
C2 : ALLDIFFERENT(X1 − 1,X2 − 2,...,Xn − n)
C3 : ALLDIFFERENT(X1 + 1,X2 + 2,...,Xn + n)

where the decision variable Xi denotes the row of the queen placed in the i-th column of the chessboard (hence no two queens are in the same column), and the constraints C1, C2, and C3 require that no two queens are in the same row, north-east to south-west diagonal, and north-west to south-east diagonal, respectively.
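This model can be checked directly: an assignment is a solution if and only if all three ALLDIFFERENT constraints hold. A Python sketch (names ours):

```python
def is_nqueens_solution(rows):
    """Check the CSP model of Example 3: rows[i-1] is the row of the queen
    in column i; the three ALLDIFFERENT constraints forbid shared rows and
    shared diagonals (X_i - i and X_i + i pairwise different)."""
    def all_different(seq):
        s = list(seq)
        return len(set(s)) == len(s)
    return (all_different(rows)                                       # C1
            and all_different(r - i for i, r in enumerate(rows, 1))   # C2
            and all_different(r + i for i, r in enumerate(rows, 1)))  # C3
```

For instance, the assignment of Figure 2.1 passes the check, while placing all queens on the main diagonal violates C2.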

2.3.2 Constrained Optimisation Problems

A constrained optimisation problem (COP) is defined as a quadruple ⟨X,D,C,O⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of decision variable Xi, C is a finite set of constraints on X, and O is the objective function. We call an assignment to X a solution to the problem if and only if the assignment is a solution to the CSP ⟨X,D,C⟩ and the value of the objective function O is minimised (or maximised).

Example 4. The n-queens problem can also be represented by the following COP with soft constraints:
• The decision variables:

X = ⟨X1,...,Xn,Xn+1,Xn+2,Xn+3⟩

• The domains: D(X1) = ··· = D(Xn) = {1,...,n}

D(Xn+1) = D(Xn+2) = D(Xn+3) = {0,...,∞}
• The constraints:

C1 : SOFTALLDIFFERENT(X1,...,Xn,µ,Xn+1)
C2 : SOFTALLDIFFERENT(X1 − 1,X2 − 2,...,Xn − n,µ,Xn+2)
C3 : SOFTALLDIFFERENT(X1 + 1,X2 + 2,...,Xn + n,µ,Xn+3)
• The objective function:

minimise Xn+1 + Xn+2 + Xn+3

where the decision variable Xi with i ≤ n denotes the row of the queen placed in the i-th column of the chessboard, Xn+1, Xn+2, and Xn+3 are three cost variables for the three soft constraints, and the SOFTALLDIFFERENT constraint is the softened version of the ALLDIFFERENT constraint in Example 2. Note that the n-queens problem is satisfiable for all natural numbers n with the exception of 2 and 3 [20]. Hence the example above just gives an intuition of modelling an over-constrained problem with a COP.

2.4 Search Approaches

There are two orthogonal search approaches, namely systematic search and stochastic local search. The systematic search approach is complete, as it guarantees that an (optimum) solution to a combinatorial problem will be found if one exists, but the approach may not work for some problems whose search space is too large; the stochastic local search approach is incomplete, as it cannot guarantee that an (optimum) solution to a combinatorial problem will be found, but the approach scales better than the systematic search approach when the search space grows. As the two search approaches for solving a COP are similar to the corresponding ones for solving a CSP, we only describe the latter here.

2.4.1 Systematic Search

In the systematic search approach, each constraint is associated with a propagator, which achieves a level of consistency for the constraint by removing inconsistent values from the domains of its variables. Propagators alone usually cannot solve a combinatorial problem, and they must be interleaved with a search procedure, which explores a search tree with branching and backtracking. In this section, we first give a short description of levels of consistency and propagators, and then introduce a commonly used search procedure.

Levels of Consistency

To solve a combinatorial problem efficiently, one objective is to construct a small search tree, hence the propagator should remove as many inconsistent values from the domains as possible; the other objective is to design low-complexity propagators, as propagators are called many times during search. However, the two objectives conflict, as a propagator that can remove more values from the domains is usually of higher computational complexity. This motivates the introduction of levels of consistency. We give definitions of the two levels of consistency that are useful in this thesis.

Definition 2 (Domain Consistency). Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables and a constraint C on X, we say that the constraint is domain consistent if for each 1 ≤ i ≤ n and each value vi ∈ D(Xi), there exist values dj ∈ D(Xj) for all j ≠ i such that ⟨d1,...,di−1,vi,di+1,...,dn⟩ ∈ C.

Definition 3 (Bounds Consistency). Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables and a constraint C on X, we say that the constraint is bounds consistent if for each 1 ≤ i ≤ n and each value vi ∈ {min D(Xi), max D(Xi)}, there exist values dj ∈ [min D(Xj), max D(Xj)] for all j ≠ i such that ⟨d1,...,di−1,vi,di+1,...,dn⟩ ∈ C.

Note that domain consistency is a stronger level of consistency than bounds consistency, as domain consistency checks every value in every domain while bounds consistency only checks the lower and upper bound values.

Example 5. Given a sequence X = ⟨X1,X2,X3,X4⟩ of 4 decision variables with D(Xi) = {1,4} (for all Xi ∈ X), we consider the ALLDIFFERENT(X) constraint. For the lower bound 1 (or upper bound 4) in D(X1), we can find a solution X := ⟨1,2,3,4⟩ (or X := ⟨4,1,2,3⟩) to the constraint. Similarly, the same holds for the other decision variables Xi (with i ≠ 1). Hence, the ALLDIFFERENT(X) constraint is bounds consistent under these domains. However, the constraint is not domain consistent under these domains, as four decision variables cannot be pairwise different with only two values ({1,4}).
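The two consistency levels of Example 5 can be verified by brute force, directly following Definitions 2 and 3. A Python sketch (names ours):

```python
from itertools import product

def all_different(t):
    return len(set(t)) == len(t)

def has_support(domains, i, v):
    """Is there a solution to ALLDIFFERENT with variable i fixed to v?"""
    doms = list(domains)
    doms[i] = [v]
    return any(all_different(t) for t in product(*doms))

def domain_consistent(domains):
    # Definition 2: every value of every domain must have a support.
    return all(has_support(domains, i, v)
               for i, d in enumerate(domains) for v in d)

def bounds_consistent(domains):
    # Definition 3: relax every domain to the interval [min, max] and
    # check support only for the two bound values of each domain.
    relaxed = [range(min(d), max(d) + 1) for d in domains]
    return all(has_support(relaxed, i, v)
               for i, d in enumerate(domains) for v in (min(d), max(d)))

doms = [[1, 4]] * 4  # Example 5: four variables, domains {1, 4}
```

With the relaxed interval [1,4], the interior values 2 and 3 provide supports for the bounds, which is exactly why the constraint is bounds but not domain consistent here.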

Propagators

Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables, and a constraint C on X, a propagator pC is used to prune inconsistent values from the domains D to achieve a given level of consistency for C. The propagator must have the following three properties:
• Contracting: the purpose of a propagator is to shrink the domains, hence the domains obtained after propagation cannot be supersets of the original domains. Formally, we have pC(D) ⊆ D, where pC(D) denotes the domains obtained after executing the propagator pC on the domains D, and we say D1 ⊆ D2 if and only if D1(Xi) ⊆ D2(Xi) for all decision variables Xi ∈ X.
• Monotone: D1 ⊆ D2 ⇒ pC(D1) ⊆ pC(D2).
• Correct: the propagator cannot prune consistent values from the domains. Formally, we have C(X,D) = C(X, pC(D)), where C(X,D) denotes the constraint C on X under the domains D.

After propagation, the propagator can be in one of the following states:
• Fix point: the propagator pC declares to be at a fix point.
• Success: the propagator pC succeeds if it observes that every possible assignment to X under the current domains is a solution to the constraint. If the propagator pC succeeds under the current domains D, then we have pC(D) = D, as every value in D is consistent. Hence, pC is also at a fix point under D.
• Failure: the propagator pC fails if any decision variable of X has an empty domain, as no solution to the constraint exists under the current domains.
• Unknown: the propagator pC is at an unknown state otherwise, where pC cannot judge whether a fix point is reached or not. Note that it is always safe for a propagator to declare that it is at an unknown state.

Note that two states, namely fix point and success, are very useful for an efficient systematic search procedure. For example:
• If a propagator is at a fix point under the current domains, then there is no need to re-execute it until the domains of its decision variables have been changed. Hence, in an efficient systematic search procedure, a propagator that reaches domain (or bounds) consistency for a constraint is suspended whenever it declares that it has reached a fix point, and is woken up only if (one bound of) the domain of a decision variable involved in this constraint has been changed.
• If a propagator even succeeds under the current domains, then there is no need to re-execute it at all. Hence, in an efficient systematic search procedure, a constraint is subsumed whenever its propagator succeeds, and the latter is never re-executed.
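As an illustration of these properties and states, the following is a minimal sketch (ours, not from the thesis) of a bounds-style propagator for a binary constraint X < Y over non-empty set domains:

```python
def less_propagator(dx, dy):
    """Propagator for X < Y. Returns the pruned domains and a state.
    Contracting: the returned domains are subsets of the inputs.
    Correct: only unsupported values (x >= max(dy) or y <= min(dx))
    are removed."""
    new_dx = {v for v in dx if v < max(dy)}
    new_dy = {v for v in dy if v > min(dx)}
    if not new_dx or not new_dy:
        return new_dx, new_dy, 'failure'    # an empty domain: no solution
    if max(new_dx) < min(new_dy):
        return new_dx, new_dy, 'success'    # every assignment satisfies X < Y
    return new_dx, new_dy, 'fix point'      # pruned as much as it can

pruned_dx, pruned_dy, state = less_propagator({1, 2, 5}, {2, 3})
```

Here 5 is pruned from D(X) because it has no support in D(Y); the remaining domains still overlap, so the propagator is merely at a fix point rather than succeeding.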

A Commonly Used Search Procedure

In the systematic search approach, a combinatorial problem is solved by using a search procedure that explores a search tree with branching and backtracking. Given a global variable S, which is an initially empty stack (line 1) to store sub-CSPs for the purpose of backtracking, Algorithm 1 describes a commonly used search procedure for finding a solution to a CSP P = ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of decision variable Xi ∈ X, and C is a finite set of constraints with a propagator pC for each constraint C ∈ C. At each node of the search tree, the propagators of all non-subsumed constraints in the problem are executed separately to remove from the current domains some (but not necessarily all) inconsistent values, which cannot be part of a solution to the constraint, and this is repeated until no more pruning is possible, where all propagators are at a common fix point

Algorithm 1 A commonly used systematic search procedure with the first-fail branching heuristic for finding a solution to a CSP P = ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of decision variable Xi ∈ X, and C is a finite set of constraints with a propagator pC for each constraint C ∈ C

 1: global variable: a stack S ← [] for storing sub-CSPs
 2: function systematic_search(P = ⟨X,D,C⟩)
 3:   CA ← C
 4:   while CA ≠ ∅ do
 5:     choose a (possibly random) constraint C from CA and execute its propagator pC
 6:     if the domains D have been changed then
 7:       CA ← CA ∪ {C′ ∈ C | pC′ should be woken up by the domain changes}
 8:     if the propagator pC is at a fix point then
 9:       CA ← CA \ {C}
10:       if the propagator pC succeeds then C ← C \ {C}
11:     else if the propagator pC fails then
12:       if S = [] then return failed
13:       ⟨X,D,C⟩ ← S.pop()
14:       return systematic_search(⟨X,D,C⟩)
15:   if all decision variables Xi of X are bound then
16:     return succeeded, as a solution X := ⟨D(X1),...,D(Xn)⟩ is found
17:   m ← min{|D(Xi)| | i ∈ [1,n] ∧ |D(Xi)| > 1}
18:   j ← min{i | i ∈ [1,n] ∧ |D(Xi)| = m}
19:   v ← min D(Xj)
20:   D′ ← ⟨D(X1),...,D(Xj−1),D(Xj) \ {v},...,D(Xn)⟩
21:   S.push(⟨X,D′,C⟩)
22:   D(Xj) ← {v}
23:   return systematic_search(⟨X,D,C⟩)

(lines 3 to 14). Note that the propagators are executed in an efficient way by using the two useful states of a propagator, namely fix point and success. For this purpose, the search procedure maintains a set CA of active constraints, where the propagator pC for each constraint C ∈ CA possibly needs to be executed under the current domains. In other words, the propagator pC for any constraint C ∈ C \ CA is at a fix point under the current domains. Initially, CA contains all constraints of C (line 3). While there exist active constraints (line 4), one (possibly random) constraint C is chosen from CA and its propagator pC is executed (line 5). Usually, heuristics are used to guide the scheduling of propagators. For example, in the CP solver GECODE [15], the propagator pC with the lowest computational complexity is chosen to be executed first; the purpose is to delay the execution of propagators that have high computational complexity, so that the number of executions of such propagators is minimised. If pC has removed inconsistent values from the current domains during propagation, then any constraint C′ ∈ C whose propagator pC′ should be woken up by the domain changes is inserted into CA (lines 6 and 7). After the propagation: if pC is at a fix point, then the constraint C is removed from CA so that pC is suspended (lines 8 and 9); if pC even succeeds, then C is further removed from C so that pC is deactivated (line 10). The loop above is repeated until a common fix point for all propagators is reached, i.e., CA = ∅. Thereafter, the search procedure checks: if all decision variables of X are bound, then a solution, which is the unique assignment to X under the current domains, is found (lines 15 to 16); otherwise, a branching (with heuristic) of the search tree is needed. Branching heuristics are important, as a good heuristic usually leads to a small search tree so that a solution to the problem is found efficiently. Lines 17 to 19 describe a well-known branching heuristic, namely the first-fail heuristic, which chooses the first undecided decision variable Xj with the minimum domain size and the first value v in the domain of Xj. When using the first-fail branching heuristic, the current domain is split into two parts: one with D(Xj) = D(Xj) \ {v} and the other with D(Xj) = {v}. The first part (an unexplored choice point) is pushed onto the stack S for backtracking (lines 20 to 21), and the second part is used to escape from the current fix point for further propagation (lines 22 to 23). Whenever a propagator fails, the search procedure backtracks to the last choice point stored in S if one exists (lines 13 to 14); otherwise, the whole search tree has been explored and no solution to the problem exists (line 12).
The search procedure is complete, as propagators are correct and contracting, and as the mechanism of branching and backtracking ensures that the whole search space will be explored. Hence, it guarantees that a solution will be found if one exists. Note that backtracking and the efficient scheduling of propagators are automatically supported by most CP solvers (e.g., GECODE [15]), hence a user of such CP solvers only needs to specify the chosen branching heuristic when implementing the search procedure. The search procedure for solving a COP is similar to Algorithm 1. However, in this case an optimum solution to the problem is to be found, with a minimum cost (or maximum benefit) expressed by an objective function on the decision variables. Hence, instead of returning succeeded whenever a solution is found (at line 16), the search procedure computes the value u of the objective function under the current solution, and then adds a betterness constraint, which bounds the objective function to take a value strictly less (or larger) than u, so that a solution with a better objective value is to be found next time.
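The propagate-and-branch loop of Algorithm 1 can be sketched as follows. This is a simplified illustration (ours, not GECODE's implementation): recursion replaces the explicit stack, every propagator is re-run until a common fix point instead of being scheduled by wake-up events, and a weak forward-checking propagator stands in for the ALLDIFFERENT propagators of the 4-queens model of Example 3:

```python
def search(domains, propagators):
    """Run all propagators to a common fix point, then branch with the
    first-fail heuristic (smallest undecided domain, smallest value)."""
    changed = True
    while changed:                          # propagate to a common fix point
        changed = False
        for prop in propagators:
            new = prop(domains)
            if any(len(d) == 0 for d in new):
                return None                 # a propagator failed: backtrack
            if new != domains:
                domains, changed = new, True
    if all(len(d) == 1 for d in domains):
        return [next(iter(d)) for d in domains]   # all variables bound
    j = min((i for i, d in enumerate(domains) if len(d) > 1),
            key=lambda i: len(domains[i]))        # first-fail variable
    v = min(domains[j])
    left = domains[:j] + [{v}] + domains[j + 1:]
    right = domains[:j] + [domains[j] - {v}] + domains[j + 1:]
    return search(left, propagators) or search(right, propagators)

def alldiff_forward_check(offsets):
    """Propagator for ALLDIFFERENT(X_1 + o_1, ..., X_n + o_n): remove the
    (shifted) value of every bound variable from the other domains."""
    def prop(domains):
        new = [set(d) for d in domains]
        for i, d in enumerate(new):
            if len(d) == 1:
                v = next(iter(d)) + offsets[i]
                for k, dk in enumerate(new):
                    if k != i:
                        dk.discard(v - offsets[k])
        return new
    return prop

n = 4  # the three constraints C1, C2, C3 of Example 3
props = [alldiff_forward_check([0] * n),
         alldiff_forward_check([-i for i in range(1, n + 1)]),
         alldiff_forward_check([+i for i in range(1, n + 1)])]
solution = search([set(range(1, n + 1)) for _ in range(n)], props)
```

The search first tries X1 = 1, which fails under propagation and branching, backtracks, and then finds the solution of Figure 2.1 with X1 = 2.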

2.4.2 Stochastic Local Search

Stochastic local search [21] is an efficient technique that aims at finding a high-quality solution to a combinatorial problem. The stochastic local search approach to CP is called constraint-based local search (CBLS, see [41]). In CBLS, constraints are used to guide the search from an initial (possibly random) assignment to an (optimum) solution via iterations of local moves, in which a move from the current assignment is made by re-assigning only a few decision variables. In this section, we first give some important concepts for CBLS, and then describe a commonly used search procedure.

Violations

For CBLS, constraints are not associated with propagators to prune inconsistent values from the domains, but are associated with violation functions to guide local search. Given a CSP P = ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of Xi ∈ X, and C is a finite set of constraints, and given a current assignment a to X, the following violation quantities are useful:
• Constraint violation denotes how much a particular constraint is violated under the assignment a, and is zero if and only if a is a solution to the constraint. Hence, a constraint C in CBLS is modelled as its softened version soft_C with a cost measure function to calculate its violation under the assignment a. We use ViolationC[a] to denote the violation of a constraint C ∈ C under the assignment a.
• Variable violation denotes how much violation of a constraint is caused by a particular decision variable under the assignment a, and is zero if the decision variable need not be reassigned to transform a into a solution to the constraint. We use ViolationC[Xi,a] to denote the variable violation of Xi in a constraint C ∈ C under the assignment a. Usually, if a is a solution to the constraint C, then it is required that ViolationC[Xi,a] = 0 for all decision variables Xi involved in C.
• System violation denotes how much the problem P is violated under the assignment a, and is zero if and only if a is a solution to the problem. Hence, a CSP in CBLS is modelled as a COP with an objective to minimise the system violation. We use Violation[a] to denote the system violation under the assignment a, and it is calculated as Violation[a] = ∑_{C ∈ C} ViolationC[a].
• System variable violation denotes how much system violation is caused by a particular decision variable under the assignment a, and is zero if the decision variable need not be reassigned to transform a into a solution to the problem. We use Violation[Xi,a] to denote the system variable violation of Xi under the assignment a, and it is calculated as Violation[Xi,a] = ∑_{C ∈ C} ViolationC[Xi,a]. Usually, if a is a solution to the problem, then it is required that Violation[Xi,a] = 0 for all decision variables Xi ∈ X.

                                   ViolationCi[a]   ViolationCi[Xi,a]
                                                    X1    X2    X3    X4
C1 : ALLDIFFERENT(2,2,1,3)               1           1     1     0     0
C2 : ALLDIFFERENT(1,0,−2,−1)             0           0     0     0     0
C3 : ALLDIFFERENT(3,4,4,7)               1           0     1     1     0
system (variable) violation              2           1     2     1     0

Table 2.1. Violations for the 4-queens problem of Example 7 under the current assignment a = ⟨2,2,1,3⟩ to a sequence X = ⟨X1,...,X4⟩ of 4 decision variables, where C1, C2, and C3 denote the ALLDIFFERENT(X1,...,X4), ALLDIFFERENT(X1 − 1,...,X4 − 4), and ALLDIFFERENT(X1 + 1,...,X4 + 4) constraints respectively

Example 6. We revisit the ALLDIFFERENT(X) constraint of Example 1, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of Xi ∈ X. In [41], given a current assignment a = ⟨v1,...,vn⟩ to X with each vi ∈ D(Xi), the constraint violation of the ALLDIFFERENT(X) constraint C under the assignment a is defined as

ViolationC[a] = ∑_{v ∈ ⋃_{i=1}^{n} D(Xi)} max(|{ j ∈ [1,n] | vj = v }| − 1, 0)

which counts the number of decision variables that need to take another value for the constraint to be satisfied, and is the same as the cost measure of the SOFTALLDIFFERENT constraint in Example 2. The variable violation of a decision variable Xi is defined as

ViolationC[Xi,a] = |{ j ∈ [1,n] | j ≠ i ∧ vj = vi }|

which counts the number of other decision variables that take the same value as the one assigned to Xi. Given n = 4, D(Xi) = {1,...,4} for all decision variables Xi ∈ X, and the assignment a = ⟨2,2,1,3⟩ to X, we have ViolationC[a] = 1, as X1 and X2 are assigned the same value 2 so that at least one of them needs to be reassigned; ViolationC[X1,a] = ViolationC[X2,a] = 1, as X1 and X2 cause the constraint violation of 1; and ViolationC[X3,a] = ViolationC[X4,a] = 0, as X3 and X4 have nothing to do with the constraint violation.
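These two violation functions for ALLDIFFERENT translate directly into code. A Python sketch (names ours), reproducing the values of Example 6:

```python
def constraint_violation(values):
    """Violation of ALLDIFFERENT from [41]: the number of decision
    variables that must change value for the constraint to hold."""
    return sum(max(values.count(v) - 1, 0) for v in set(values))

def variable_violation(values, i):
    """Number of other decision variables assigned the same value
    as variable i."""
    return sum(1 for j, v in enumerate(values) if j != i and v == values[i])

a = [2, 2, 1, 3]  # the current assignment of Example 6
```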

Example 7. Now, we revisit the n-queens problem of Example 3. Given n = 4, D(Xi) = {1,...,4} for all decision variables Xi ∈ X, and a current assignment a = ⟨2,2,1,3⟩ to X, recall that the problem contains a set C = {C1,C2,C3} of 3 constraints, where C1, C2, and C3 denote the ALLDIFFERENT(X1,...,X4), ALLDIFFERENT(X1 − 1,...,X4 − 4), and ALLDIFFERENT(X1 + 1,...,X4 + 4) constraints respectively. Figure 2.2 represents the current assignment a, where the red solid (or green dashed) line denotes a violation of C1 (or C3), as a makes the two queens X1 and X2 (or X2 and X3) attack each other. Table 2.1 gives the violations for the 4-queens problem under the assignment a.

Figure 2.2. The assignment a = ⟨2,2,1,3⟩ to X for the 4-queens problem

Neighbourhoods

In order to solve a CSP P = ⟨X,D,C⟩ using CBLS, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of Xi ∈ X, the search starts from an initial assignment to X and then iteratively moves to a neighbour assignment. Hence, the key issue of local search is the definition of the neighbourhood.

Definition 4 (Neighbourhood). A neighbourhood N is a function that maps the current assignment to a set of assignments that are reachable from the current assignment. Formally, we have

N : D(X1) × ··· × D(Xn) → 2^(D(X1) × ··· × D(Xn))

where S = D(X1) × ··· × D(Xn) is the set of all possible assignments in the domains and 2^S is the set of all possible subsets of S.

There are at least two ways of constructing neighbourhoods by using the violations defined in the previous section:
• A variable-directed neighbourhood can be constructed as follows: first a most violating decision variable,2 which has the largest system variable violation among all decision variables, is chosen; and then a neighbourhood, which contains all possible assignments obtained by reassigning the chosen decision variable, is constructed. As this decision variable causes most of the system violation of the problem, it is promising to decrease the system violation by reassigning this decision variable. In Example 7, where all decision variables have the same domain {1,...,4} and the current assignment is a = ⟨2,2,1,3⟩, the decision variable X2 is the most violating decision variable, with Violation[X2,a] = 2. The neighbourhood N(a) = {⟨2,1,1,3⟩, ⟨2,3,1,3⟩, ⟨2,4,1,3⟩} is constructed. As the assignment X := ⟨2,4,1,3⟩ in the neighbourhood can

2 In Paper I, we call a decision variable that contributes to the violation of a combinatorial problem a violated decision variable. However, this terminology is not very meaningful. Hence, we have switched to calling it a violating decision variable since Paper II.

maximally decrease the system violation, the search moves to this neighbour, which happens to be a solution to the 4-queens problem.
• A constraint-directed neighbourhood can be constructed as follows: first a most violated constraint, which has the largest constraint violation among all constraints, is chosen; and then a neighbourhood, which contains all possible assignments that ideally decrease the constraint violation, is constructed. As this constraint causes most of the system violation of the problem, it is promising to decrease the system violation by trying to satisfy this constraint. In Example 7, where all decision variables have the same domain {1,...,4} and the current assignment is a = ⟨2,2,1,3⟩, both constraints C1 and C3 have a constraint violation of 1, and we assume C1 is chosen. As ViolationC1[X1,a] = ViolationC1[X2,a] = 1, decision variables X1 and X2 contribute equally to the constraint violation of C1. We assume X2 is chosen to be re-assigned for constructing the neighbourhood N(a) = {⟨2,1,1,3⟩, ⟨2,3,1,3⟩, ⟨2,4,1,3⟩}. As the assignment X := ⟨2,4,1,3⟩ in the neighbourhood can maximally decrease the system violation, the search moves to this neighbour, which happens to be a solution to the 4-queens problem.

When a variable-directed neighbourhood is used for CBLS, we call it variable-directed local search; similarly, when a constraint-directed neighbourhood is used for CBLS, we call it constraint-directed local search.
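One step of variable-directed local search on the 4-queens instance of Example 7 can be sketched as follows (names ours; ties among best moves are broken deterministically here rather than randomly):

```python
def violations(assignment):
    """System violation and system variable violations for the 4-queens
    CSP of Example 7 (C1, C2, C3 as shifted ALLDIFFERENT constraints)."""
    n = len(assignment)
    shifts = [[0] * n, [-(i + 1) for i in range(n)], [i + 1 for i in range(n)]]
    total, per_var = 0, [0] * n
    for s in shifts:
        vals = [v + d for v, d in zip(assignment, s)]
        total += sum(max(vals.count(u) - 1, 0) for u in set(vals))
        for i in range(n):
            per_var[i] += sum(1 for j in range(n)
                              if j != i and vals[j] == vals[i])
    return total, per_var

def variable_directed_move(assignment, domain):
    """One local move: reassign a most violating decision variable to a
    value that maximally decreases the system violation."""
    _, per_var = violations(assignment)
    j = per_var.index(max(per_var))              # most violating variable
    neighbourhood = [assignment[:j] + [v] + assignment[j + 1:]
                     for v in domain if v != assignment[j]]
    return min(neighbourhood, key=lambda b: violations(b)[0])

a = [2, 2, 1, 3]
best = variable_directed_move(a, [1, 2, 3, 4])
```

Starting from a = ⟨2,2,1,3⟩, the per-variable violations match Table 2.1, X2 is chosen, and the best neighbour is the solution ⟨2,4,1,3⟩.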

Invariants and Differentiable Objects

For CBLS, constraints with violations are used to guide local search via iterations of local moves, where a neighbourhood is explored to make a best move that ideally maximally decreases the system violation of a CSP at each step. Invariants and differentiable objects are introduced in CBLS systems (such as the CBLS back-end of COMET [41]) to maintain the violations incrementally and to evaluate a neighbourhood efficiently [18].
• Invariants declaratively specify expressions whose values must be maintained incrementally and automatically under local moves. More precisely, invariants specify what to maintain incrementally, but not how to do so, which is done automatically by a CBLS system that supports invariants. For example, the violations are usually expressed by invariants in a CBLS system, and then a user of the CBLS system can query the values of the violations without writing explicit code for calculating them, as the calculation is done incrementally by the CBLS system. Note that it is crucial for an efficient CBLS system to maintain the violations incrementally, as the violations are updated whenever the search makes a local move. Hence, it is crucial for a CBLS system to support invariants; otherwise, handcrafted incremental algorithms need to be implemented by a user, which may lead to an inefficient or error-prone implementation.

Algorithm 2 A commonly used stochastic tabu search procedure with a variable-directed neighbourhood for trying to find a solution to a CSP P = ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the domain of a decision variable Xi ∈ X, and C is a finite set of constraints

 1: function local_search(P = ⟨X,D,C⟩, imax, tlen)
 2:   i ← 0
 3:   array T[1..n] ← 0
 4:   a ← a (possibly random) initial assignment to the decision variables X
 5:   a∗ ← a
 6:   v∗ ← Violation[a]
 7:   while v∗ > 0 and i < imax do
 8:     select a random decision variable Xj ∈ X maximising Violation[Xj,a]
 9:     N ← {⟨d1,...,dn⟩ | dj ∈ D(Xj) \ {a(Xj)} ∧ ∀j′ ≠ j : dj′ = a(Xj′)}
10:     select a random a′ ∈ N maximising (Violation[a] − Violation[a′])
11:     if Violation[a′] < Violation[a] ∨ T[j] ≤ i then
12:       a ← a′
13:       T[j] ← i + tlen
14:       v ← Violation[a]
15:       if v∗ > v then
16:         a∗ ← a
17:         v∗ ← v
18:     i ← i + 1
19:   return a∗

• A differentiable object is a special invariant, which not only maintains the value of an expression, but also determines how the expression value increases (or decreases) when changing the assignment to some decision variable(s); hence it supports differentiation, such as evaluating the effect of a local move on the expression value. For example, the violations are usually expressed by differentiable objects in a CBLS system. Hence, differentiable objects are crucial for an efficient neighbourhood evaluation, and thus for an efficient CBLS system.
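The following toy sketch (again our own code, not an actual CBLS API) extends the previous idea with differentiation: a candidate move can be evaluated without being committed.

```python
# A sketch of a differentiable object: an invariant that can also
# *evaluate* the effect of a candidate move without committing it.
class DiffSum:
    def __init__(self, xs):
        self.xs = list(xs)
        self.value = sum(self.xs)

    def delta(self, i, new_value):
        # Differentiation: how much would the value change if xs[i] were
        # reassigned to new_value?  O(1), and the assignment is NOT made.
        return new_value - self.xs[i]

    def assign(self, i, new_value):
        self.value += self.delta(i, new_value)
        self.xs[i] = new_value

d = DiffSum([2, 7])
assert d.delta(0, 10) == 8    # probe a candidate move ...
assert d.value == 9           # ... without changing the maintained value
```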

A Commonly Used Search Procedure

When using CBLS to solve a CSP, the search iteratively moves from an initial assignment to another assignment, until a solution is found or until the allocated resources are exhausted. Metaheuristics (such as tabu search [16, 17]) are used to escape local optima. Given two integer parameters imax and tlen, where imax is the maximum number of iterations allowed by the search and tlen is the length of the tabu list, Algorithm 2 describes a commonly used tabu search procedure with a variable-directed neighbourhood of trying to find a solution to a CSP P = ⟨X,D,C⟩, where X = ⟨X1,...,Xn⟩ is a sequence of n decision variables with D(Xi) denoting the

domain of a decision variable Xi ∈ X, and C is a finite set of constraints. It first initialises the iteration counter i (line 2), the tabu list T (line 3), the current assignment a to the decision variables X (line 4), the current best assignment a∗ (line 5), and the current minimum system violation v∗ (line 6). At each iteration, the search checks its termination condition (either a solution has been found or the maximum number imax of iterations has been reached) (line 7). If the termination condition is met, then it terminates and returns the best assignment a∗ found by the search (line 19); otherwise, a most violating decision variable Xj is chosen (line 8), a variable-directed neighbourhood N is constructed by only reassigning Xj (line 9), and a random move that ideally maximally decreases the system violation is picked (line 10). If the move decreases the system violation or is not forbidden by the tabu list (line 11), then the move is made (line 12) and is now forbidden by the tabu list for the next tlen iterations (line 13). If a new best assignment is found upon the move, then v∗ and a∗ are updated (lines 14 to 17). Thereafter, the iteration counter i is increased by one at the end of each iteration (line 18).
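Algorithm 2 can be transcribed almost line by line into Python; the sketch below assumes two violation callbacks (`violation` for the system violation and `var_violation` for a variable's violation), whose names are ours.

```python
import random

def local_search(X_domains, violation, var_violation, i_max, t_len, seed=0):
    """A Python transcription of Algorithm 2. `violation(a)` returns the
    system violation of assignment a (a tuple); `var_violation(j, a)` the
    violation of decision variable j under a. Both are assumed callbacks."""
    rng = random.Random(seed)
    n = len(X_domains)
    T = [0] * n                                          # tabu list
    a = tuple(rng.choice(sorted(d)) for d in X_domains)  # initial assignment
    a_best, v_best = a, violation(a)
    i = 0
    while v_best > 0 and i < i_max:
        # pick a most violating variable (ties broken at random)
        worst = max(var_violation(j, a) for j in range(n))
        j = rng.choice([k for k in range(n) if var_violation(k, a) == worst])
        # variable-directed neighbourhood: reassign only X_j
        N = [a[:j] + (d,) + a[j+1:] for d in sorted(X_domains[j]) if d != a[j]]
        best_gain = max(violation(a) - violation(b) for b in N)
        a2 = rng.choice([b for b in N if violation(a) - violation(b) == best_gain])
        if violation(a2) < violation(a) or T[j] <= i:    # improving or not tabu
            a = a2
            T[j] = i + t_len                             # forbid X_j for t_len moves
            v = violation(a)
            if v_best > v:
                a_best, v_best = a, v
        i += 1
    return a_best

# toy CSP: all three variables must equal 1
viol = lambda a: sum(x != 1 for x in a)
vviol = lambda j, a: int(a[j] != 1)
sol = local_search([{0, 1}] * 3, viol, vviol, i_max=100, t_len=2)
assert viol(sol) == 0
```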

3. Constraints for Membership in Formal Languages

This chapter gives the necessary background material on constraints for membership in formal languages needed to read Papers I to IV.

3.1 Formal Languages

We first give some background material on formal languages (e.g., see [30, 22], upon which excerpts of this section are based). A string w over an alphabet Σ, which is a finite non-empty set of symbols, is a finite sequence of |w| symbols of Σ. A language is a set of strings over an alphabet.

Definition 5. Let x = x1···xm and y = y1···yn be two strings. The concatenation of x and y, denoted by xy, is the string x1···xm y1···yn. Let L1 and L2 be two languages. The concatenation of L1 and L2, denoted by L1L2, is the set {xy | x ∈ L1 ∧ y ∈ L2}.

Definition 6. Let L be a language. We define L^0 = {ε}, where ε is the empty string, and L^i = L L^(i−1) for i ≥ 1. The Kleene closure of L, denoted by L∗, is the language ⋃_{i≥0} L^i. The positive closure of L, denoted by L+, is the language LL∗. An alphabet Σ can be seen as a language of strings of length 1. Hence, Σ∗ (or Σ+) is the Kleene (or positive) closure of Σ. Similarly, given a symbol v ∈ Σ, we say that v∗ (or v+) is the Kleene (or positive) closure of {v}.

A formal language is a language that can be defined by a body of systematic rules, which are usually specified by grammars.

Definition 7. A grammar G is a quadruple ⟨Σ,N,P,S⟩, where: • Σ is a finite alphabet and any symbol v ∈ Σ is called a terminal; • N is a finite set of non-terminals such that N ∩ Σ = ∅; • P is a finite set of productions (or rules) of the form

α → β

where α ∈ (Σ ∪ N)∗ N (Σ ∪ N)∗ and β ∈ (Σ ∪ N)∗, i.e., α is a string of terminals and non-terminals containing at least one non-terminal and β is a string of terminals and non-terminals;

• S ∈ N is the start non-terminal. We use |G| = ∑_{p∈P} |p| to denote the size of G, where |p| is the number of terminals and non-terminals in a production p ∈ P.

Definition 8. Let G = ⟨Σ,N,P,S⟩ be a grammar, and γ1 and γ2 be two strings of terminals and non-terminals. • We say that γ1 directly derives γ2 (denoted by γ1 ⇒ γ2) if there exist four strings of terminals and non-terminals, namely ω1, ω2, α, and β, such that γ1 = ω1αω2, γ2 = ω1βω2, and α → β is a production of P. • We say that γ1 derives γ2 (denoted by γ1 ⇒∗ γ2) if there exists a sequence of m ≥ 1 strings of terminals and non-terminals, namely ω1,...,ωm, such that γ1 ⇒ ω1 ⇒ ··· ⇒ ωm ⇒ γ2.

Definition 9. Let G = ⟨Σ,N,P,S⟩ be a grammar. We use L(G) to denote the formal language defined by G, where L(G) = {w | w ∈ Σ∗ ∧ S ⇒∗ w}.

In [10], grammars are classified into four groups based on the Chomsky hierarchy by gradually increasing the restrictions on the form of the productions. Given a grammar G = ⟨Σ,N,P,S⟩, we say that 1. G is an unrestricted grammar if there is no restriction on the form of the productions. Hence, each production of the grammar has the form α → β, where α ∈ (Σ ∪ N)∗ N (Σ ∪ N)∗ is a string of terminals and non-terminals containing at least one non-terminal, and β ∈ (Σ ∪ N)∗ is a string of terminals and non-terminals. The formal language L(G) is called a recursively enumerable language.

Example 8. Consider the unrestricted grammar G = ⟨Σ,N,P,S⟩, where Σ = {a}, N = {S,L,R,D}, P = {S → LaR, L → LD, Da → aaD, DR → R, L → ε, R → ε}, and ε denotes the empty string. It generates the recursively enumerable language L(G) = {a^(2^n) | n ∈ N}, which can be defined recursively as follows: a) a is a word in L(G); b) for all n ≥ 1, if a^n is a word in L(G), then a^(2·n) is a word in L(G).
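For illustration, membership in this particular L(G) can be decided directly (this check is ours and sidesteps the grammar): a word is in L(G) iff it consists of a's only and its length is a power of two.

```python
# Membership check for L(G) = {a^(2^n) | n in N} of Example 8.
def in_lang(w: str) -> bool:
    m = len(w)
    # m & (m - 1) == 0 is the usual power-of-two test for m >= 1
    return m >= 1 and set(w) == {"a"} and (m & (m - 1)) == 0

assert in_lang("a") and in_lang("aa") and in_lang("aaaa")
assert not in_lang("aaa") and not in_lang("")
```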

2. G is a context-sensitive grammar (CSG) if each production of the grammar has the form α → β

where α ∈ (Σ ∪ N)∗ N (Σ ∪ N)∗ is a string of terminals and non-terminals containing at least one non-terminal, β ∈ (Σ ∪ N)∗ is a string of terminals and non-terminals, and |α| ≤ |β|. The formal language L(G) is called a context-sensitive language (CSL).

Example 9. The grammar of Example 8 is not a CSG, as there are productions, e.g., DR → R, that do not satisfy the conditions above. Consider the CSG G = ⟨Σ,N,P,S⟩, where Σ = {a,b,c}, N = {S,A,B,C}, and P = {S → SABC, S → ABC, BA → AB, CA → AC, CB → BC, A → a, aA → aa, aB → ab, bB → bb, bC → bc, cC → cc}. It generates the CSL L(G) = {a^n b^n c^n | n ∈ N}.

3. G is a context-free grammar (CFG) if each production of the grammar has the form A → α where A ∈ N is a non-terminal and α ∈ (Σ ∪ N)∗ is a string of terminals and non-terminals. The formal language L(G) is called a context-free language (CFL).

Example 10. The grammar of Example 9 is not a CFG, as there are productions, e.g., BA → AB, that do not satisfy the conditions above. Consider the CFG G = ⟨Σ,N,P,S⟩, where Σ = {ℓ,r}, N = {S}, and P = {S → ℓr, S → SS, S → ℓSr}. It generates a CFL of correctly bracketed expressions (e.g., ℓrℓr and ℓℓrr are correctly bracketed expressions), with 'ℓ' denoting the left bracket and 'r' the right one.

4. G is a regular grammar if each production of the grammar has one of the following forms: A → aB, or A → ε, where A,B ∈ N are non-terminals and a ∈ Σ is a terminal. The formal language L(G) is called a regular language.

Example 11. The grammar of Example 10 is not a regular grammar, as there are productions, e.g., S → SS, that do not satisfy the conditions above. Consider the regular grammar G = ⟨Σ,N,P,S⟩, where Σ = {a,b}, N = {S,A,B}, and P = {S → aA, S → ε, A → bB, B → aA, B → ε}. It generates the regular language (ab)∗.

Using the definitions above, we have that every regular language is context-free, every context-free language not containing the empty string is context-sensitive, and every context-sensitive language is recursively enumerable. Figure 3.1 represents this relationship.

Figure 3.1. The Chomsky hierarchy of formal languages

Given a grammar G = ⟨Σ,N,P,S⟩, a parsing algorithm (called a parser) is needed to decide whether a given string over the alphabet Σ belongs to the formal language L(G) or not. As there are polynomial-time parsers for context-free grammars and regular grammars (but not for context-sensitive grammars and unrestricted grammars), the former two grammars are very useful in practice. Hence, when talking about constraints for membership in formal languages in this thesis, we mean constraints for membership in context-free languages and regular languages.

3.2 The REGULAR Constraint

We first give the definition of the REGULAR constraint, and then briefly summarise its related work in CP.

3.2.1 Definition

The REGULAR constraint [33] (a generalisation of which is known as the AUTOMATON constraint [7], see Section 3.4 below) requires a sequence of decision variables to belong to a regular language, specified by a deterministic finite automaton (DFA) or a regular expression.

Using a DFA

The REGULAR constraint can be defined using a DFA.

Definition 10. A DFA is a tuple ⟨Q,Σ,δ,q0,F⟩, where: • Q is a finite set of states; • Σ is an alphabet;

Figure 3.2. A DFA for a work scheduling constraint for one employee

• δ : Q × Σ → Q is a transition function. Given two states q1,q2 ∈ Q and a symbol v ∈ Σ, we use δ(q1,v) = q2 to denote the transition from the state q1 via reading the symbol v to the state q2; • q0 ∈ Q is the start state; • F ⊆ Q is a set of accepting states. We use |A| = |Q| + |δ| to denote the size of a DFA A.

Given a DFA A = ⟨Q,Σ,δ,q0,F⟩ and a string w = w1···wn of n symbols over the alphabet Σ, we say that w is a string accepted by A if there exists a sequence ⟨q1,...,qn⟩ of n states in Q such that: • δ(qi−1,wi) = qi for all 1 ≤ i ≤ n; • qn ∈ F is an accepting state. Hence, the string w of length n can be parsed by the DFA A in O(n) time. Note that the language accepted by A, denoted by L(A), is a regular language, because the DFA A can be transformed into a regular grammar G = ⟨Σ,N,P,S⟩ as follows: • N = Q; • P = {q1 → vq2 | ∀q1,q2 ∈ Q : ∀v ∈ Σ : δ(q1,v) = q2} ∪ {q → ε | ∀q ∈ F}; • S = q0.
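The construction above is mechanical; the following sketch (with a small DFA of our own accepting (ab)^n) builds one production per transition plus one ε-production per accepting state.

```python
# Illustration of the DFA-to-regular-grammar construction above, on a
# two-state DFA of our own that accepts (ab)^n for n >= 0.
delta = {("S0", "a"): "S1", ("S1", "b"): "S0"}   # transition function
accepting = {"S0"}

# N = Q, S = q0, and:
# each transition delta(q1, v) = q2 yields a production q1 -> v q2,
# each accepting state q yields a production q -> epsilon.
productions = [(q1, (v, q2)) for (q1, v), q2 in delta.items()]
productions += [(q, ()) for q in accepting]       # () stands for epsilon

assert ("S0", ("a", "S1")) in productions
assert ("S0", ()) in productions
assert len(productions) == len(delta) + len(accepting)
```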

Example 12. Figure 3.2 gives a DFA A = ⟨Q,Σ,δ,q0,F⟩, where Q = {O,D,G,E,W} contains five states, Σ = {d,e,v} contains three symbols, δ = {δ(O,d) = D, δ(O,e) = E, ...} contains seven transitions, the start state q0 = O is marked by a transition entering from nowhere, and F = {O,G} contains two accepting states, which are marked by double circles. The string ddv belongs to the regular language L(A), as the start state O reaches D by reading the first symbol d, then reaches W from D by reading another d, and finally reaches the accepting state G by reading the last symbol v.
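The O(n) acceptance check can be sketched as follows, using the transitions of the DFA of Example 12 (the dictionary encoding is ours):

```python
# Simulating the DFA of Example 12 (Figure 3.2) to check acceptance
# of a string in O(n) time.
delta = {("O", "d"): "D", ("O", "e"): "E", ("D", "d"): "W",
         ("W", "v"): "G", ("E", "v"): "G", ("G", "d"): "D", ("G", "e"): "E"}
start, accepting = "O", {"O", "G"}

def accepts(w: str) -> bool:
    q = start
    for v in w:
        q = delta.get((q, v))        # None: no transition defined (dead end)
        if q is None:
            return False
    return q in accepting

assert accepts("ddv")                # O -d-> D -d-> W -v-> G, accepting
assert accepts("ddvev") and accepts("evddv")
assert not accepts("dvevd")
```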

Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables and a DFA A = ⟨Q,Σ,δ,q0,F⟩, a REGULAR constraint is defined as REGULAR(X,A), and is satisfied when the sequence of n domain values assigned to X is a string accepted by the DFA A. The REGULAR constraint is very useful; for example, it can be used to model complex work regulation constraints in shift scheduling problems.

Example 13. We want to describe a work scheduling constraint. There are values for two work shifts, namely day (d) and evening (e), as well as a value for enjoying a day of vacation (v). Work shifts are subject to the following conditions: one must start with a work shift and must end with some vacation; one must enjoy some vacation before a change of work shift and cannot enjoy a vacation of more than one day; if one works on a day shift, then one must do so for exactly two consecutive days; and one must enjoy a vacation after working an evening shift. This constraint for one worker can be modelled by a REGULAR(X,A) constraint with a sequence X of decision variables and the DFA A given in Figure 3.2, as each string accepted by A is an acceptable work shift sequence for this constraint.

Using a Regular Expression

The REGULAR constraint can also be defined using a regular expression.

Definition 11. A regular expression over an alphabet Σ is defined inductively as follows: 1. The symbol ε is a regular expression, which represents the language {ε} containing only the empty string ε. 2. For each symbol v ∈ Σ, we have that v is a regular expression, which represents the language {v}. 3. Given a regular expression R, we use L(R) to denote the language represented by R. We say that R∗ is a regular expression, which represents the language L(R)∗. 4. Given two regular expressions R1 and R2, we say that R1 | R2 and R1R2 are regular expressions, which represent the languages L(R1) ∪ L(R2) and L(R1)L(R2) respectively.

Given a sequence X of decision variables and a regular expression R, we can first transform R into a DFA A as in [22], and then use the REGULAR(X,A) constraint.

Example 14. Consider the regular expression R = (ddv | ev)∗: the regular language L(R) is the same as the one accepted by the DFA A of Example 12, and R can be transformed into A, and vice-versa.
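Membership in L(R) can also be checked directly with an off-the-shelf regular-expression engine; the sketch below uses Python's re module, whose fullmatch anchors the pattern to the whole string:

```python
import re

# Example 14 in code: membership in the regular language of the
# expression R = (ddv | ev)*, using a non-capturing group.
R = re.compile(r"(?:ddv|ev)*")

assert R.fullmatch("ddvev") is not None   # also accepted by the DFA A
assert R.fullmatch("evddv") is not None
assert R.fullmatch("dvevd") is None
assert R.fullmatch("") is not None        # epsilon is in L(R)
```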

3.2.2 Related Work

Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables and a DFA A = ⟨Q,Σ,δ,q0,F⟩, in the systematic search approach to CP, a propagator based on dynamic programming that achieves domain consistency for the REGULAR(X,A) constraint in O(n · |A|) time is introduced in [33]. This propagator first unrolls the DFA A into an (n+1)-layered transition graph such that every path of length n represents a solution to the REGULAR(X,A) constraint, and then removes inconsistent values from the domains by removing arcs that do not lie on any path of length n. In [9, 28], the REGULAR(X,A) constraint is represented as a multi-valued decision diagram (MDD), which is argued to be more space-efficient than the layered-graph representation, and MDD-based propagators that achieve domain consistency for the constraint are introduced. In the meantime, several decomposition-based propagators for the REGULAR(X,A) constraint exist. In [36], the REGULAR(X,A) constraint is decomposed into a simple sequence of ternary constraints, which can then be enforced to domain consistency in O(n · |Σ| · |Q|) time. In [4, 37], propositional satisfiability (SAT) based decompositions of the REGULAR(X,A) constraint are introduced, so that advanced features of SAT solvers like fast unit propagation, clause learning, and conflict-based search heuristics can be used. These SAT-decomposition-based propagators also achieve domain consistency for the REGULAR(X,A) constraint.

In this thesis, we only address the usage of the REGULAR(X,A) constraint under the stochastic local search approach to CP. To the best of our knowledge, at the start of my PhD, the only existing work was [35], which provides a deterministic method for defining and maintaining the violations of the REGULAR(X,A) constraint and all its decision variables in O(n · |A|) time. To improve on this deterministic method, we propose a stochastic method that takes O(n) time in Paper I. To further improve the usage of the REGULAR(X,A) constraint, we show how to construct and use a solution neighbourhood for it in Paper II.

3.3 The SOFTREGULAR Constraint

We first give the definition of the SOFTREGULAR constraint, and then briefly summarise its related work in CP.

3.3.1 Definition

To solve over-constrained problems, soft constraints that can be violated with costs are introduced. Consider a REGULAR(X,A) constraint, where X = ⟨X1,...,Xn⟩ is a sequence of decision variables with D(Xi) denoting the domain of Xi ∈ X, and A is a DFA. A softened version of the REGULAR(X,A) constraint is defined as SOFTREGULAR(X,A,µ,z), where µ : D(X1) × ··· × D(Xn) → R is a cost measure function with µ(v1,...,vn) = 0 if and only if ⟨v1,...,vn⟩ is a solution to REGULAR(X,A) and µ(v1,...,vn) > 0 otherwise, and z is a non-negative cost variable denoting how much the constraint is violated under the cost measure µ. In [42], two versions of the SOFTREGULAR constraint are introduced, based on the two cost measure functions µHamming and µedit:

• The Hamming-distance based SOFTREGULAR(X,A,µHamming,z) constraint uses a cost measure function µHamming such that

µHamming(v1,...,vn) = min{Hamming_distance(v1···vn, w) | w ∈ (L(A) ∩ Σ^n)}

where L(A) ∩ Σ^n is the sub-language of strings of fixed length n of the regular language accepted by the DFA A, and the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.

• The edit-distance based SOFTREGULAR(X,A,µedit,z) constraint uses a cost measure function µedit such that

µedit(v1,...,vn) = min{edit_distance(v1···vn, w) | w ∈ (L(A) ∩ Σ^n)}

where the edit distance (also known as the Levenshtein distance) between two strings is the minimum number of non-copying edit operations (namely substitution, insertion, and deletion of a symbol) needed to transform one string into the other.

Example 15. Consider the DFA A of Example 12 and a sequence X = ⟨X1,...,X5⟩ of 5 decision variables under the domains D(Xi) = {d,e,v} for all Xi ∈ X. We have that the sub-language of strings of length 5 of the regular language accepted by A contains two words, namely ddvev and evddv. Consider a sequence ⟨d,v,e,v,d⟩ of 5 values in the domains: • For the SOFTREGULAR(X,A,µHamming,z) constraint, we have that µHamming(d,v,e,v,d) = 4, as the Hamming distance between dvevd and ddvev is 4 and the Hamming distance between dvevd and evddv is also 4. • For the SOFTREGULAR(X,A,µedit,z) constraint, we have that µedit(d,v,e,v,d) = 2, as the edit distance between dvevd and ddvev is 2 and the edit distance between dvevd and evddv is 4. We have that µedit(d,v,e,v,d), which is 2, is smaller than µHamming(d,v,e,v,d), which is 4.
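The distances of Example 15 can be recomputed with the standard textbook algorithms (the code below is ours; the edit distance uses the classic dynamic program):

```python
# Hamming and edit (Levenshtein) distance, as used by the two cost
# measure functions of the SOFTREGULAR constraint.
def hamming(x: str, y: str) -> int:
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

def edit(x: str, y: str) -> int:
    # Classic O(|x|*|y|) dynamic program for the Levenshtein distance.
    m, n = len(x), len(y)
    D = [[i + j if i * j == 0 else 0 for j in range(n + 1)] for i in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(D[i-1][j-1] + (x[i-1] != y[j-1]),  # substitution
                          D[i-1][j] + 1,                     # deletion
                          D[i][j-1] + 1)                     # insertion
    return D[m][n]

# the distances of Example 15:
assert hamming("dvevd", "ddvev") == 4 and hamming("dvevd", "evddv") == 4
assert edit("dvevd", "ddvev") == 2 and edit("dvevd", "evddv") == 4
```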

Note that the edit distance between any two strings of the same length is usually smaller and never larger than their Hamming distance (where only substitution is allowed). Hence, we have that µedit(v1,...,vn) ≤ µHamming(v1,...,vn). This property makes the cost measure µedit more suitable than µHamming for scheduling problems, as argued in [42]. We propose the following two additional reasons: • When using soft constraints to solve an over-constrained problem, the objective is to find a solution with the minimum cost. Hence, a cost measure function computing a smaller cost is preferable to one computing a larger cost, as it may lead to a better solution with a smaller minimum cost.

• In the stochastic local search approach to CP, a constraint is modelled as its softened version with a cost measure function to calculate its violation. Considering the REGULAR(X,A) constraint, its violation under the current assignment a to X can be calculated as µedit(a) (or µHamming(a)), which denotes the minimum number of local moves (non-copying edit operations) needed for changing a into a solution to the constraint. As µedit(a) ≤ µHamming(a), using µedit may lead to a solution with fewer local moves than using µHamming. Hence, the cost measure µedit is preferable.

3.3.2 Related Work

Given a sequence X = ⟨X1,...,Xn⟩ of n decision variables and a DFA A = ⟨Q,Σ,δ,q0,F⟩, in the systematic search approach to CP, the SOFTREGULAR(X,A,µHamming,z) constraint and the SOFTREGULAR(X,A,µedit,z) constraint are represented as flow networks and are enforced to domain consistency in O(n · |A|) time by flow-based propagators in [42]. Note that when enforcing domain consistency on the cost variable of a soft constraint, only the lower bound of the cost variable is considered, as the soft constraint only limits the lower bound of its cost variable (see Definition 1). Hence, enforcing domain consistency for a soft constraint is equivalent to enforcing bounds consistency on its cost variable and domain consistency on the other decision variables. However, the propagator for the edit-distance based SOFTREGULAR constraint may underestimate its cost, and may thus guide the search to incorrect (non-optimal) solutions to an over-constrained problem. In [26], the authors show how to encode the two versions of the SOFTREGULAR constraint into weighted regular grammars G, which can be enforced to domain consistency in O(n² · |G|) time by using the propagator for the WEIGHTEDGRAMMAR constraint. However, if the encoded grammars can accept words of length other than n, then the propagator for the edit-distance based SOFTREGULAR constraint may also underestimate its cost. Hence, we propose several propagators that compute correctly the cost and achieve domain consistency in O(n² · |A|) time for the edit-distance based SOFTREGULAR constraint in Paper III.

In the stochastic local search approach to CP, to the best of our knowledge, the only existing work at the start of my PhD was [35], which introduces a method of computing the violation of the REGULAR(X,A) constraint based on the cost measure function µHamming in O(n · |A|) time.
We propose a method that sometimes overestimates (but never underestimates) the Hamming-distance based violation of the REGULAR(X,A) constraint in O(n) time in Paper I, and show how to compute the edit-distance based violation in O(n² · |A|) time in Paper III.

3.4 The AUTOMATON Constraint with Counters

We first give the definition of the AUTOMATON constraint with counters, and then briefly summarise its related work in CP.

3.4.1 Definition

In [7], the AUTOMATON constraint is introduced. It is a generalisation of the REGULAR constraint, as it also works with a counter DFA (cDFA), which provides a more convenient and generic way of describing constraints than a DFA or regular expression. Given a sequence X of decision variables and a cDFA A, the AUTOMATON(X,A) constraint is satisfied when a sequence of values assigned to X is a string accepted by A. Now, we give our own definition of a cDFA, which is more general than that of the Global Constraint Catalogue [7, 5].

Definition 12. A cDFA is defined just like a DFA, except that the transitions can include assignment statements to counters and that the transitions and accepting states can be guarded by conditions on these counters. The "transition" to the start state may have unguarded initialisations of the counters. Given the transition function δ, a guarded transition δ(q,a,α,β) = t, whose graphical representation is an arc annotated with "a {α → β}" from state q to state t, means that if symbol a is read and guard α holds at state q, then state t is reached, upon also executing the sequence of counter assignment statements β. In principle, a guard α can be any decidable logical expression of comparison and membership atoms among counters and parameters of the constraint. Similarly, a counter assignment statement β can be any computable sequential or conditional composition of arithmetic operations on the counters, or a no-operation (denoted by nop). The guards of transitions on the same symbol from the same state must be mutually exclusive to make the counter automaton deterministic. A guarded accepting state (q,α), whose graphical representation is "q : α" within a double ellipse, means that q is an accepting state only if guard α holds.

As usual, we sometimes abbreviate the graphical representation of several arcs between the same pair of states by a single arc annotated with the set of symbols of those arcs, provided they have the same guards and counter assignment statements. Note that our definition of a cDFA is different from that of [7, 5] in the following three ways: • In [7, 5], it is the counter assignments that are guarded (rather than the transitions). In other words, the transition on a given symbol is fired unconditionally when the symbol is read, and then its counter assignments are executed if its guard holds. However, in our definition, a transition

Figure 3.3. A DFA for a work scheduling constraint for one employee

on a given symbol is fired if the symbol is read and if its guard holds, and its counter assignments are executed unconditionally when the transition is fired. • In [7, 5], the transition on a given symbol from a given state is unique. However, in our definition, several transitions may exist on a given symbol from a given state, under the condition that no two transitions can be fired at the same time. • Using our definition, a cDFA may have guarded accepting states, which is not the case for [7, 5]. In Paper I, we propose two methods of unwinding a cDFA into a DFA. One advantage of our definition is that, when unwinding a cDFA, we can detect failure states, from which no accepting state can be reached, much earlier with guarded transitions than with the definition of [7, 5]. Hence, unwinding a cDFA with our definition is more efficient, as the unwound DFA can be much smaller. We get a powerful tool for modelling constraints by using cDFAs. Indeed, a DFA is often specific to an instance of a constraint, as seen in the following example:

Example 16. Reconsider the work scheduling constraint in Example 13: if we change just one parameter of that constraint, say that one cannot enjoy a vacation of more than two days, instead of exactly one day, then the DFA in Figure 3.2 needs to be changed into the DFA in Figure 3.3. Although the only difference between these two instances of the work constraint is a single parameter, their DFAs have many differences. It would require a lot of modelling work if every constraint instance needed a different DFA. However, cDFAs are often independent of constraint instances. Figure 3.4 gives a cDFA for all instances of the work constraint requiring that one must start with a work shift and must end with some vacation; that one cannot work for fewer than d or more than d̄ consecutive days on the day shift; that one cannot work for fewer than e or more than ē consecutive days on the evening shift; and that between any change of work shifts, one must enjoy at least v and at most v̄ consecutive days of vacation. The counter c maintains the number of consecutive days on the same shift. State V is a guarded accepting state, so that V is an accepting state only if c ≥ v. The guarded transition d {c < d̄ → c := c + 1} from D to itself means that the transition on symbol d fires if c < d̄ and increments c by one. The unguarded transition d {c := 1} from state O to D means that the transition on d always fires and initialises c to 1. Hence, this cDFA with parameters ⟨d, d̄, e, ē, v, v̄⟩ = ⟨2,2,1,1,1,1⟩ represents the DFA of Figure 3.2; and this cDFA with parameters ⟨d, d̄, e, ē, v, v̄⟩ = ⟨2,2,1,1,1,2⟩ represents the DFA of Figure 3.3.

Figure 3.4. A cDFA for the work scheduling constraint in Example 16

Figure 3.5. A cDFA for the context-sensitive language {a^n b^n c^n | n ∈ N}

Note that cDFAs also accept non-regular languages. For example, the context-sensitive language {a^n b^n c^n | n ∈ N} of Example 9 is accepted by the cDFA of Figure 3.5. In practice, we are here only interested in finite languages (of strings of a fixed length), so we will not exploit this additional expressiveness.
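A cDFA can be simulated directly by threading the counter values through the guarded transitions. The sketch below is our own illustrative cDFA for {a^n b^n c^n | n ∈ N} with two counters, and is not a transcription of Figure 3.5:

```python
# A counter-DFA simulator with a cDFA of our own for {a^n b^n c^n}:
# counter k counts a's up, b's count k down while m counts up, and
# c's count m down; the accepting guard checks both counters are spent.
def make_cdfa():
    # (state, symbol) -> list of (guard, action, next_state)
    trans = {
        ("A", "a"): [(lambda k, m: True,   lambda k, m: (k + 1, m),     "A")],
        ("A", "b"): [(lambda k, m: k >= 1, lambda k, m: (k - 1, m + 1), "B")],
        ("B", "b"): [(lambda k, m: k >= 1, lambda k, m: (k - 1, m + 1), "B")],
        ("B", "c"): [(lambda k, m: k == 0, lambda k, m: (k, m - 1),     "C")],
        ("C", "c"): [(lambda k, m: m >= 1, lambda k, m: (k, m - 1),     "C")],
    }
    accept = {"A": lambda k, m: k == 0,            # epsilon (n = 0)
              "C": lambda k, m: k == 0 and m == 0}
    return trans, accept

def accepts(w: str) -> bool:
    trans, accept = make_cdfa()
    q, k, m = "A", 0, 0                            # start state, counters
    for v in w:
        fireable = [(g, act, t) for g, act, t in trans.get((q, v), []) if g(k, m)]
        if not fireable:                           # no fireable transition
            return False
        g, act, t = fireable[0]                    # guards are mutually exclusive
        (k, m), q = act(k, m), t
    return q in accept and accept[q](k, m)

assert accepts("") and accepts("abc") and accepts("aabbcc")
assert not accepts("aabbc") and not accepts("abcabc")
```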

3.4.2 Related Work

In the systematic search approach to CP, the authors of [7] propose to specify constraints by cDFAs and introduce a propagator (that does not always achieve domain consistency) for the AUTOMATON constraint. In this thesis, we only address the usage of the AUTOMATON constraint with counters under the stochastic local search approach to CP. To the best of our knowledge, there is no existing work on it except ours. In Paper I, we extend our method for the REGULAR constraint to the AUTOMATON constraint with counters.

3.5 The CFG Constraint

We first give the definition of the CFG constraint, and then briefly summarise its related work in CP.

3.5.1 Definition

The CFG constraint [36, 39] requires a sequence of decision variables to belong to a context-free language, specified by a context-free grammar (CFG). Given a sequence X of decision variables and a CFG G, the CFG(X,G) constraint is satisfied when a sequence of values assigned to X is a string accepted by G. A CFG ⟨Σ,N,P,S⟩ is said to be in Chomsky normal form (CNF) if and only if P ⊆ N × (Σ ∪ N²), i.e., the right-hand side of each production is either a single terminal or a pair of non-terminals. Every CFG can be converted into an equivalent grammar in CNF.

Example 17. Considering the CFG of Example 10, which defines a language of correctly bracketed expressions, its CNF is G′ = ⟨Σ,N′,P′,S⟩, where N′ = {L,M,R,S} and P′ = {S → LR, S → SS, S → MR, M → LS, L → ℓ, R → r}.

The Cocke-Younger-Kasami (CYK) algorithm [24, 46, 11] is a parser for CFGs in CNF. Given a CFG G = ⟨Σ,N,P,S⟩ in CNF and a string w = w1···wn of n symbols over the alphabet Σ, the CYK parser computes a table V to parse w in O(n³ · |G|) time using dynamic programming, where Vi,j (with 1 ≤ j ≤ n and 1 ≤ i ≤ n + 1 − j) is the set of non-terminals that can derive the substring wi wi+1 ··· wi+j−1 of j symbols:

Vi,j = {W | (W → wi) ∈ P}                                                    if j = 1
Vi,j = ⋃_{k=1}^{j−1} {W | (W → YZ) ∈ P ∧ Y ∈ Vi,k ∧ Z ∈ Vi+k,j−k}            otherwise

Given a string w ∈ Σ^n, w is accepted by a CFG G if and only if S ∈ V1,n.

Figure 3.6. The CYK algorithm parses a string ℓrℓr under the CFG G′ of Example 17.

Example 18. Figure 3.6 gives the CYK table V when parsing a string ℓrℓr of four symbols under the grammar G′ of Example 17. We have V1,1 = {L} and V1,4 = {S}. As S ∈ V1,4, the string ℓrℓr is accepted by the CFG G′.
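The recurrence and Example 18 can be checked with a direct implementation of the CYK parser on the CNF grammar G′ of Example 17 (brackets written as ASCII 'l' and 'r'; the 0-based table layout is ours):

```python
# CYK parser for the CNF grammar G' of Example 17. V[i][j] holds the
# non-terminals deriving the substring of length j starting at position i
# (0-based here, instead of the text's 1-based indices).
P = [("S", ("L", "R")), ("S", ("S", "S")), ("S", ("M", "R")),
     ("M", ("L", "S")), ("L", "l"), ("R", "r")]

def cyk(w: str) -> bool:
    n = len(w)
    V = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i in range(n):                              # base case: length-1 substrings
        V[i][1] = {A for A, rhs in P if rhs == w[i]}
    for j in range(2, n + 1):                       # longer substrings
        for i in range(n - j + 1):
            for k in range(1, j):                   # split point
                for A, rhs in P:
                    if isinstance(rhs, tuple):      # binary production A -> YZ
                        Y, Z = rhs
                        if Y in V[i][k] and Z in V[i + k][j - k]:
                            V[i][j].add(A)
    return "S" in V[0][n]

assert cyk("lrlr") and cyk("llrr") and cyk("lr")   # correctly bracketed
assert not cyk("rl") and not cyk("lrl")            # not correctly bracketed
```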

Given a CFG(X,G) constraint, where X is a sequence of |X| = n decision variables and G is a CFG, as the fixed-length sub-language L(G) ∩ Σ^n of L(G) is finite and hence regular, the CFG(X,G) constraint can be transformed into a REGULAR(X,A) or AUTOMATON(X,A) constraint as in [25], where A is a DFA for the regular language L(G) ∩ Σ^n. Hence, when solving a combinatorial problem where CFG constraints are used, a user of a CP system can try two models of the problem, one with the CFG(X,G) constraint and the other with the REGULAR(X,A) constraint, and choose the one that solves the problem more efficiently, depending on whether the transformation of G into A can be done off-line or not.

3.5.2 Related Work

Given a sequence X of n decision variables and a CFG G, in the systematic search approach to CP, several non-incremental propagators that achieve domain consistency for the CFG(X,G) constraint in O(n³ · |G|) time are introduced in [39, 36]. Thereafter, a decomposition-based incremental propagator, which improves the propagator of [36] and achieves domain consistency for the CFG(X,G) constraint in O(n³ · |G|) time and space, is introduced in [37]. More recently, another incremental propagator, which improves the propagators of [39, 37] and achieves domain consistency for the CFG(X,G) constraint in O(n³ · |G|) time with O(n² · |G|) space, is introduced in [23]. In Paper IV, we propose a new incremental propagator, which improves the propagator of [23] in practice and achieves domain consistency for the CFG(X,G) constraint in the same O(n³ · |G|) time with O(n² · |G|) space. In the stochastic local search approach to CP, to the best of our knowledge, there is no existing work on the CFG constraint. One possible reason for this is that parsing a CFG G for words of length n takes O(n³ · |G|) time, which is too expensive for stochastic local search. However, we can first transform the CFG constraint into the REGULAR constraint, as in [25] for instance, and then use our method for the REGULAR constraint from Paper I.

4. Summary of Papers

This chapter summarises the papers included in this thesis.

4.1 Paper I: An AUTOMATON Constraint for Local Search

In this paper, we explore the idea of using deterministic finite automata (DFAs) to implement new constraints for CBLS, an idea already successfully used in the systematic search approach to constraint programming (CP). Before our paper, the only existing work on the REGULAR constraint for the stochastic local search approach to CP, namely constraint-based local search (CBLS), was [35], which introduces a deterministic method of defining and maintaining the violations of the REGULAR constraint and all its decision variables. Given a sequence X of n decision variables under the current assignment a and a DFA A, this deterministic method takes O(n · |A|) time to maintain the violations for the REGULAR(X,A) constraint by computing the minimum Hamming distance between a and any word of length n accepted by A, and takes Θ(|A|) time to evaluate each candidate move in a neighbourhood. For CBLS, the key issue of efficiency is making cheap local moves; however, the O(n · |A|) time required to make each move and the Θ(|A|) time required to evaluate each candidate move are too expensive. To improve on this method, we propose a stochastic method of defining and maintaining the violations. Our method takes O(n) time to maintain the violations for the REGULAR(X,A) constraint by computing the Hamming distance between the current assignment and a stochastically chosen word of length n accepted by the DFA A, but takes O(n) time to evaluate each candidate move in a neighbourhood. Compared with the O(n · |A|) time required to make each move under the deterministic method, our method is |A| times faster. Although |A| is a constant for a specific instance, |A| can be very large in practice. For example in [6], large-size DFAs are used to model a nurse scheduling problem, where all hard constraints for one nurse over a schedule period are modelled with a product DFA, with |A| up to 53,738.
Compared with the Θ(|A|) time required by the deterministic method to evaluate each candidate move, our method takes O(n) time; hence, there is no clear winner here. However, our experimental results on several real-life combinatorial problems show that our stochastic method has much lower runtimes (but needs more local moves) than the deterministic method of [35], even on all instances where n > |A|.

To the best of our knowledge, we are the first to propose a stochastic method of defining and maintaining the violations of a constraint for CBLS. Hence, one expected impact of this work is that the idea may be useful for other constraints whenever the stochastic method has a lower complexity than a deterministic one. We also extend our method to the AUTOMATON constraint with counters for CBLS. We show that DFAs with counters (cDFAs), which also accept non-regular languages, are more expressive than DFAs. In practice, we are here only interested in finite languages (of words of a given length), so we do not exploit this additional expressiveness. We show that cDFAs are more independent of the constraint instance than DFAs, hence cDFAs provide a powerful tool for modelling new constraints. To use the AUTOMATON constraint with counters for CBLS, we propose the following two methods:
1. We unwind a cDFA into a DFA in an off-line pre-processing step, by mapping each pair of state and counter values of the cDFA into a state of the DFA, so that we can reuse our method for the REGULAR constraint and the cDFA itself is purely a modelling convenience.
2. We generalise our violation algorithm for the REGULAR constraint to work directly on the AUTOMATON constraint with counters. Unlike the first method, which unwinds the whole cDFA first, the second method unwinds the cDFA dynamically and incrementally on demand while maintaining the violations.
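The stochastic method can be illustrated concretely. The sketch below is our own minimal Python rendering of the idea (all names are ours; the thesis targets COMET, not Python): the DFA, given as dfa[state][symbol] = next_state, is unrolled into n+1 layers of useful states, one accepted word of length n is sampled at random along the layers, and the violation of REGULAR(X,A) is the Hamming distance between the current assignment and that word.

```python
import random

def useful_layers(dfa, start, accepting, n):
    """layers[i] = states reachable from start in i steps that can still
    reach an accepting state in the remaining n - i steps."""
    forward = [{start}]
    for _ in range(n):
        forward.append({dfa[q][s] for q in forward[-1] for s in dfa[q]})
    layers = [set(accepting) & forward[n]]
    for i in range(n, 0, -1):
        layers.insert(0, {q for q in forward[i - 1]
                          for s in dfa[q] if dfa[q][s] in layers[0]})
    return layers

def sample_word(dfa, start, accepting, n):
    """Sample one word of length n accepted by the DFA (assumes one exists)."""
    layers = useful_layers(dfa, start, accepting, n)
    word, q = [], start
    for i in range(n):
        symbol = random.choice([s for s in dfa[q]
                                if dfa[q][s] in layers[i + 1]])
        word.append(symbol)
        q = dfa[q][symbol]
    return word

def violation(assignment, word):
    """Hamming distance between the current assignment and the sampled word."""
    return sum(a != w for a, w in zip(assignment, word))
```

Maintaining the sampled word incrementally under local moves is what gives the O(n) bound; resampling from scratch, as above, is only for illustration.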

4.2 Paper II: Solution Neighbourhoods for Constraint-Directed Local Search

When using CBLS to solve a combinatorial problem, the search starts from an initial assignment of values to all decision variables, and iteratively moves to a neighbour after exploring a neighbourhood of the current assignment. Hence, it is important to choose a good neighbourhood, so that a solution to the problem can be found efficiently. A search procedure can be variable-directed or constraint-directed (see Section 2.4.2). In this paper, we focus on constraint-directed local search and propose a solution neighbourhood, which contains only solutions to a constraint. We introduce a framework of using solution neighbourhoods, and provide an efficient algorithm for constructing a solution neighbourhood for the very useful REGULAR constraint. Our experimental results on a library of nurse scheduling instances [44] demonstrate the practicality of our method. The motivation of this work is to improve the usage of constraints for which there exists no known constant-time algorithm for neighbour evaluation in CBLS, e.g., the REGULAR constraint. Given a sequence X of n decision variables and a DFA A, we propose a method in Paper I that shrinks the time complexity of maintaining the violations of the REGULAR(X,A) constraint from

the O(n · |A|) of [35] to O(n). However, even our algorithm takes O(n) time for neighbour evaluation, which is rather expensive. On the other hand, we can encode all solutions to the REGULAR(X,A) constraint by unrolling the DFA A into an (n+1)-layer graph as in [33], as each path of length n in the graph is a solution to the REGULAR(X,A) constraint; and we can avoid such an expensive neighbour evaluation for the REGULAR(X,A) constraint by constructing a solution neighbourhood through finding paths of length n in the layered graph, as all neighbours are solutions and the violation changes of the REGULAR(X,A) constraint are all the same when evaluating the neighbours. We show that the time needed to construct a solution neighbourhood for the REGULAR(X,A) constraint is paid off by the zero-time neighbour evaluation. Interestingly, we find that even some constraints for which there exists a constant-time algorithm for neighbour evaluation, such as the ALLDIFFERENT constraint, can benefit from this idea.
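A tiny, self-contained illustration of this construction (ours, not the algorithm of Paper II, which prunes and samples rather than enumerating exhaustively): walking all paths of length n through the DFA yields exactly the accepted words, i.e. a neighbourhood whose members are all solutions to REGULAR(X,A), so no per-neighbour violation evaluation is needed.

```python
def solution_neighbourhood(dfa, start, accepting, n):
    """Enumerate every word of length n accepted by the DFA, given as
    dfa[state][symbol] = next_state; each word corresponds to a path of
    length n in the unrolled (n+1)-layer graph, hence to a solution."""
    words = []
    def walk(state, prefix):
        if len(prefix) == n:
            if state in accepting:
                words.append(''.join(prefix))
            return
        for symbol, succ in sorted(dfa[state].items()):
            walk(succ, prefix + [symbol])
    walk(start, [])
    return words
```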

4.3 Paper III: Underestimating the Cost of a Soft Constraint is Dangerous: Revisiting the Edit-Distance Based Soft Regular Constraint

The authors of [42] introduce a flow network representation of the edit-distance based SOFTREGULAR constraint, and propose a propagator based on flow theory. Given a sequence X of n decision variables and a DFA A, their propagator measures the cost of the edit-distance based SOFTREGULAR(X,A) constraint by computing the minimum edit distance between any possible assignment of X under the current domains and the whole regular language accepted by A. Hence, their propagator may underestimate the correct cost measure, which is the minimum edit distance between any possible assignment of X under the current domains and the sub-language of words of length n of the regular language accepted by A. This work is inspired by that observation.

We first use the edit-distance based SOFTREGULAR constraint as an example to show that there are unwanted consequences when using a propagator that sometimes underestimates the cost of a soft constraint, as such a propagator may guide the search to incorrect (non-optimum) solutions to an over-constrained problem. To compute the cost of the edit-distance based SOFTREGULAR constraint correctly on a sequence of n decision variables, we propose an O(n²)-time propagator based on dynamic programming, together with a proof of its correctness. We also give an improved propagator based on a clever idea in [40] for improving the classical dynamic programming algorithm that computes the edit distance between two strings [45]: it first assumes the edit distance is one and then verifies the assumption; if the assumption is true, then the correct edit distance has been computed; otherwise, another assumption is made by doubling the previously assumed edit distance. Our improved propagator has the same quadratic time complexity in the worst case, but better performance in practice. We theoretically compare our propagators with two other propagators that we propose, one based on the propagator of [42], the other based on the propagator for the WEIGHTEDGRAMMAR constraint [26]; and we demonstrate the efficiency of our propagators with some experiments. Finally, we show how to adapt our method to the violation measure of an edit-distance based REGULAR constraint for CBLS. One expected impact of this work is that the idea of [40] may be usable for other constraints with propagators based on dynamic programming.
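The assume-and-verify idea of [40] can be sketched on plain strings (an illustrative Python version of ours, not the propagator itself): the Wagner-Fischer dynamic program [45] is restricted to a diagonal band of half-width t; if the banded value is at most t, it is the exact edit distance, otherwise t is doubled and the band recomputed.

```python
def edit_distance(a, b):
    """Classic quadratic-time Wagner-Fischer dynamic program."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1,
                         prev[j - 1] + (a[i - 1] != b[j - 1]))
        prev = cur
    return prev[-1]

def banded_distance(a, b, t):
    """Same dynamic program restricted to the band |i - j| <= t; results are
    capped at t + 1, which signals that the threshold t was exceeded."""
    m, n = len(a), len(b)
    if abs(m - n) > t:
        return t + 1
    cap = t + 1
    prev = [j if j <= t else cap for j in range(n + 1)]
    for i in range(1, m + 1):
        cur = [cap] * (n + 1)
        if i <= t:
            cur[0] = i
        for j in range(max(1, i - t), min(n, i + t) + 1):
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1,
                         prev[j - 1] + (a[i - 1] != b[j - 1]), cap)
        prev = cur
    return prev[-1]

def doubling_distance(a, b):
    """Assume the edit distance is at most t = 1, verify, else double t."""
    t = 1
    while True:
        d = banded_distance(a, b, t)
        if d <= t:
            return d
        t *= 2
```

Both entry points return the same distance; the doubling variant spends only O(d · min(m, n)) time overall, where d is the actual distance, since the band widths 1, 2, 4, ... sum to O(d).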

4.4 Paper IV: Solving String Constraints: The Case for Constraint Programming

In the analysis, testing, and verification of string-manipulating programs, constraints on sequences (strings) of decision variables often arise. The authors of [27] argue that custom string solvers should not be designed any more, for sustainability reasons, since powerful off-the-shelf solvers are available: their tool, called HAMPI, translates a REGULAR or CFG constraint on a fixed-size string into bit-vector constraints so as to solve them using the satisfiability modulo theories (SMT) solver STP [14], much more efficiently than three custom tools and even up to three orders of magnitude faster than the propositional satisfiability (SAT) based CFGAnalyzer tool [3]. The solver Kaluza [38] handles constraints over multiple string variables, unlike HAMPI, which is restricted to one such variable, and it also generates bit-vector constraints that are passed to STP. The authors of [13] argue that it is important to model regular replacement operations, which are not supported by HAMPI and Kaluza, and introduce the custom string solver SUSHI, which models string constraints via automata instead of a bit-vector encoding. So the question arises whether the CP constraints for membership in formal languages are competitive with HAMPI, Kaluza, and SUSHI.

We revisit the CFG(X,G) constraint, where X is a sequence of n decision variables and G a context-free grammar (CFG), and make the following contributions:
• The authors of [23] introduce an incremental propagator that achieves domain consistency for the CFG(X,G) constraint in O(n³ · |G|) time with O(n² · |G|) space, which is better than the decomposition-based propagator of [37] with O(n³ · |G|) time and space. We improve the CFG propagator of [23] by exploiting an idea of [27] for reformulating a grammar into a regular expression for a fixed string length. This idea also applies to the CFG propagators of [36, 37, 26]. Although our new CFG propagator has the same time and space complexity as the one of [23], our experimental results show that our new propagator is much better in practice.
• Since every fixed-size language is finite and hence regular, the CFG(X,G) constraint can be transformed into the REGULAR(X,A) constraint, as in [25] for instance. Hence the need for a CFG constraint depends on the grammar and on the complexities of the propagators. As far as we know, no real-life grammar has been exhibited where an expensive CFG(X,G) propagator, with a time complexity of O(n³ · |G|), outperforms a cheap REGULAR(X,A) propagator, with a time complexity of O(n · |A|), at least when the reformulation of [25] of the CFG G into the DFA A for a fixed length n is done off-line; and no CP solver includes the CFG constraint. However, we show that there are applications and CFGs where our new CFG propagator beats a cheap REGULAR propagator, namely when the CFG must be reformulated on-line into a DFA for a fixed length.
• We show that the CP solver GECODE [15] with our new or the old CFG propagator systematically beats HAMPI, by up to four orders of magnitude, on HAMPI's own benchmark; and GECODE with the built-in REGULAR propagator systematically beats Kaluza and SUSHI, by a factor of up to 130, on SUSHI's benchmark. Hence, we argue that CP solvers are more suitable than existing solvers for program verification tools that have to solve string constraints, as CP has a rich tradition of constraints for membership in formal languages.

5. Conclusion

This thesis focuses on constraints for membership in formal languages under both the systematic search and stochastic local search approaches to CP. Such constraints are very useful in CP for the following reasons:
• Many constraints can be modelled by constraints for membership in formal languages. Hence, constraints for membership in formal languages provide a powerful tool for user-level extensibility of CP languages.
• In many shift scheduling problems, there exist complex work shift regulation constraints, which can be modelled and solved efficiently by constraints for membership in formal languages. Hence, constraints for membership in formal languages are very useful for those applications.
• In the analysis, testing, and verification of string-manipulating programs, string constraints often arise. We have shown that CP solvers with constraints for membership in formal languages are much more suitable than existing solvers used in tools that have to solve string constraints. Hence, constraints for membership in formal languages are very important for solving those applications efficiently.
In the stochastic local search approach to CP, we make the following contributions:
• We introduce a stochastic method of maintaining violations for the REGULAR constraint and extend our method to the AUTOMATON constraint with counters.
• To improve the usage of constraints for which there exists no known constant-time algorithm for neighbour evaluation, we introduce a framework of using solution neighbourhoods, and give an efficient algorithm for constructing a solution neighbourhood for the REGULAR constraint.
In the systematic search approach to CP, we make the following contributions:
• We observe that the propagator in [42] sometimes underestimates the cost of the edit-distance based SOFTREGULAR constraint.
Based on this observation, we show that there may be unwanted consequences when using a propagator that may underestimate the cost of a soft constraint, as the propagator may guide the search to incorrect (non-optimum) solutions to an over-constrained problem. We introduce and compare several propagators that correctly compute the cost of the edit-distance based SOFTREGULAR constraint.
• We show that the CFG constraint is useful and introduce an improved propagator for it.
In addition to the contributions above, we give some expected impacts of this thesis in Section 1.2.

References

[1] Magnus Ågren, Pierre Flener, and Justin Pearson. Generic incremental algorithms for local search. Constraints, 12(3):293–324, September 2007.
[2] Krzysztof Apt. Principles of Constraint Programming. Cambridge University Press, 2003.
[3] Roland Axelsson, Keijo Heljanko, and Martin Lange. Analyzing context-free grammars using an incremental SAT solver. In Proceedings of the 35th International Colloquium on Automata, Languages and Programming, volume 5126 of LNCS, pages 410–422, Reykjavik, Iceland, July 2008. Springer.
[4] Fahiem Bacchus. GAC via unit propagation. In Proceedings of the 13th International Conference on Principles and Practice of Constraint Programming, volume 4741 of LNCS, pages 133–147, Providence, USA, September 2007. Springer.
[5] Nicolas Beldiceanu, Mats Carlsson, Sophie Demassey, and Thierry Petit. Global constraint catalogue: Past, present, and future. Constraints, 12(1):21–62, March 2007. The current working version of the catalogue is at http://www.emn.fr/z-info/sdemasse/aux/doc/catalog.pdf.
[6] Nicolas Beldiceanu, Mats Carlsson, Pierre Flener, and Justin Pearson. On matrices, automata, and double counting in constraint programming. Constraints, 18(1):108–140, January 2013. An early version is published in the Proceedings of the 7th International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, volume 6140 of LNCS, pages 10–24, 2010.
[7] Nicolas Beldiceanu, Mats Carlsson, and Thierry Petit. Deriving filtering algorithms from constraint checkers. In Mark Wallace, editor, Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming, volume 3258 of LNCS, pages 107–122, Toronto, Canada, September 2004. Springer.
[8] M. Bezzel. Proposal of 8-queens problem. Berliner Schachzeitung, 3:363, 1848. Submitted under the author name "Schachfreund".
[9] Kenil C. K. Cheng and Roland H. C. Yap.
An MDD-based generalized arc consistency algorithm for positive and negative table constraints and some global constraints. Constraints, 15(2):265–304, April 2010.
[10] Noam Chomsky. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124, 1956.
[11] John Cocke. Programming languages and their compilers: Preliminary notes. Courant Institute of Mathematical Sciences, New York University, New York, 1969.
[12] Sophie Demassey, Gilles Pesant, and Louis-Martin Rousseau. A cost-regular based hybrid column generation approach. Constraints, 11(4):315–333, December 2006.

[13] Xiang Fu, Michael C. Powell, Michael Bantegui, and Chung-Chih Li. Simple linear string constraints. Formal Aspects of Computing, 2012. Published on-line in January 2012 and available from http://dx.doi.org/10.1007/s00165-011-0214-3. SUSHI is available from http://people.hofstra.edu/Xiang_Fu/XiangFu/projects/SAFELI/SUSHI.php.
[14] Vijay Ganesh and David L. Dill. A decision procedure for bit-vectors and arrays. In Proceedings of the 19th International Conference on Computer Aided Verification, volume 4590 of LNCS, pages 519–531, Berlin, Germany, July 2007. Springer. STP is available from https://sites.google.com/site/stpfastprover/.
[15] Gecode Team. Gecode: A generic constraint development environment, 2006. Available from http://www.gecode.org/.
[16] Fred Glover. Tabu search – Part I. INFORMS Journal on Computing, 1(3):190–206, 1989.
[17] Fred Glover. Tabu search – Part II. INFORMS Journal on Computing, 2(1):4–32, 1990.
[18] Pascal Van Hentenryck and Laurent Michel. Differentiable invariants. In Frédéric Benhamou, editor, Proceedings of the 12th International Conference on Principles and Practice of Constraint Programming, volume 4204 of LNCS, pages 604–619, Nantes, France, September 2006. Springer.
[19] Pascal Van Hentenryck, Vijay Saraswat, and Yves Deville. Design, implementation, and evaluation of the constraint language cc(FD). Technical Report CS-93-02, Brown University, Providence, USA, January 1993.
[20] E. J. Hoffman, J. C. Loessi, and R. C. Moore. Constructions for the solution of the m queens problem. Mathematics Magazine, 42(2):66–72, March 1969.
[21] Holger H. Hoos and Thomas Stützle. Stochastic Local Search: Foundations and Applications. Elsevier / Morgan Kaufmann, 2004.
[22] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, New York, 1979.
[23] Serdar Kadioğlu and Meinolf Sellmann. Grammar constraints. Constraints, 15(1):117–144, January 2010.
An early version is published in the Proceedings of the 23rd AAAI Conference on Artificial Intelligence in 2008.
[24] Tadao Kasami. An efficient recognition and syntax-analysis algorithm for context-free languages. Scientific report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, Massachusetts, USA, 1965.
[25] George Katsirelos, Nina Narodytska, and Toby Walsh. Reformulating global grammar constraints. In Willem-Jan van Hoeve and John N. Hooker, editors, Proceedings of the 6th International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, volume 5547 of LNCS, pages 132–147, Pittsburgh, USA, May 2009. Springer.
[26] George Katsirelos, Nina Narodytska, and Toby Walsh. The weighted GRAMMAR constraint. Annals of Operations Research, 184(1):179–207, April 2011. An early version is published in the Proceedings of the 5th International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, volume 5015 of LNCS, pages

323–327, 2008.
[27] Adam Kieżun, Vijay Ganesh, Philip J. Guo, Pieter Hooimeijer, and Michael D. Ernst. HAMPI: A solver for string constraints. In Proceedings of the 18th International Symposium on Software Testing and Analysis, pages 105–116, Chicago, USA, July 2009. ACM Press. HAMPI is available from http://people.csail.mit.edu/akiezun/hampi/.
[28] Mikael Z. Lagerkvist. Techniques for Efficient Constraint Propagation. PhD thesis, Royal Institute of Technology (KTH), Sweden, 2008.
[29] G. Laporte and G. Pesant. A general multi-shift scheduling system. Journal of the Operational Research Society, 55(11):1208–1217, November 2004. The benchmarks are at www.crt.umontreal.ca/~quosseca/fichiers/9-CyclicSchedBenchs.zip.
[30] John C. Martin. Introduction to Languages and the Theory of Computation. McGraw-Hill, New York, 1991.
[31] Laurent Michel and Pascal Van Hentenryck. Localizer: A modeling language for local search. In Gert Smolka, editor, Proceedings of the 3rd International Conference on Principles and Practice of Constraint Programming, volume 1330 of LNCS, pages 237–251, Linz, Austria, October 1997. Springer.
[32] Laurent Michel, Pascal Van Hentenryck, and Liyuan Liu. Constraint-based combinators for local search. In Mark Wallace, editor, Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming, volume 3258 of LNCS, pages 47–61, Toronto, Canada, September 2004. Springer.
[33] Gilles Pesant. A regular language membership constraint for finite sequences of variables. In Mark Wallace, editor, Proceedings of the 10th International Conference on Principles and Practice of Constraint Programming, volume 3258 of LNCS, pages 482–495, Toronto, Canada, September 2004. Springer.
[34] Thierry Petit, Jean-Charles Régin, and Christian Bessière. Specific filtering algorithms for over-constrained problems.
In Toby Walsh, editor, Proceedings of the 7th International Conference on Principles and Practice of Constraint Programming, volume 2239 of LNCS, pages 451–463, Paphos, Cyprus, November 2001. Springer.
[35] Benoit Pralong. Implémentation de la contrainte Regular en Comet. Master's thesis, École Polytechnique de Montréal, Canada, 2007.
[36] Claude-Guy Quimper and Toby Walsh. Global grammar constraints. In Frédéric Benhamou, editor, Proceedings of the 12th International Conference on Principles and Practice of Constraint Programming, volume 4204 of LNCS, pages 751–755, Nantes, France, September 2006. Springer.
[37] Claude-Guy Quimper and Toby Walsh. Decomposing global grammar constraints. In Christian Bessière, editor, Proceedings of the 13th International Conference on Principles and Practice of Constraint Programming, volume 4741 of LNCS, pages 590–604, Providence, USA, September 2007. Springer.
[38] Prateek Saxena, Devdatta Akhawe, Steve Hanna, Feng Mao, Stephen McCamant, and Dawn Song. A symbolic execution framework for JavaScript. In Proceedings of the 31st IEEE Symposium on Security and Privacy, pages 513–528, California, USA, May 2010. IEEE Press. Kaluza is available from http://webblaze.cs.berkeley.edu/2010/kaluza/.

[39] Meinolf Sellmann. The theory of grammar constraints. In Frédéric Benhamou, editor, Proceedings of the 12th International Conference on Principles and Practice of Constraint Programming, volume 4204 of LNCS, pages 530–544, Nantes, France, September 2006. Springer.
[40] Esko Ukkonen. Algorithms for approximate string matching. Information and Control, 64(1–3):100–118, 1985.
[41] Pascal Van Hentenryck and Laurent Michel. Constraint-Based Local Search. The MIT Press, 2005.
[42] Willem-Jan van Hoeve, Gilles Pesant, and Louis-Martin Rousseau. On global warming: Flow-based soft global constraints. Journal of Heuristics, 12(4–5):347–373, September 2006. An early version is published in the Proceedings of the 6th International Workshop on Preferences and Soft Constraints in 2004.
[43] Willem-Jan van Hoeve, Gilles Pesant, Louis-Martin Rousseau, and Ashish Sabharwal. Revisiting the sequence constraint. In Frédéric Benhamou, editor, Proceedings of the 12th International Conference on Principles and Practice of Constraint Programming, volume 4204 of LNCS, pages 620–634, Nantes, France, September 2006. Springer.
[44] Mario Vanhoucke and Broos Maenhout. On the characterization and generation of nurse scheduling problem instances. European Journal of Operational Research, 196(2):457–467, July 2009.
[45] Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. Journal of the ACM, 21(1):168–173, January 1974.
[46] Daniel H. Younger. Recognition and parsing of context-free languages in time n³. Information and Control, 10(2):189–208, February 1967.

A. Omitted Materials from Paper I

In this appendix, we first give two omitted counter automata (cDFAs) and then give our omitted experiments on a police scheduling problem.

A.1 Two cDFAs

Figure A.1 gives a counter automaton (cDFA) for any STRETCHPATHPARTITION(X, [{d,e,n}], [ℓ], [u]) constraint over an alphabet Σ containing {d,e,n}, which requires that each stretch taking any value of {d,e,n} be of a length between a lower bound ℓ and an upper bound u. For example, given an alphabet Σ = {a,b,d,e,n} and a sequence X of 5 decision variables, the assignment X := ⟨a,d,e,b,n⟩ is a solution to the STRETCHPATHPARTITION(X, [{d,e,n}], [1], [2]) constraint, as there are two stretches taking values of {d,e,n}, namely ⟨d,e⟩ of length 2 and ⟨n⟩ of length 1; however, the assignment X := ⟨a,d,e,e,n⟩ violates the constraint, as there is one stretch taking values of {d,e,n}, namely ⟨d,e,e,n⟩ of length 4. Figure A.2 gives a cDFA for the GCC(X,V,L,U) constraint, where X denotes the decision variables, V is the sequence of symbols constrained by the constraint, L is the sequence of lower bounds on occurrences, and U is the sequence of upper bounds. The constraint requires that the number of occurrences of any value v ∈ V in any assignment to X be between the lower bound Lv and the upper bound Uv. For example, given an alphabet Σ = {a,b,d,e,n} and a sequence X of 5 decision variables, the assignment X := ⟨a,d,e,b,n⟩ is a solution to the GCC(X, [d,e,n], [1,1,1], [2,2,2]) constraint, as the numbers of occurrences of d, e, and n are all 1; however, the assignment X := ⟨a,d,e,e,e⟩ violates the constraint, as the number of occurrences of e is 3, which exceeds its allowed upper bound 2.
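The GCC cDFA of Figure A.2 can be simulated directly. Below is our own Python sketch of such a run (dictionaries stand in for the bound sequences L and U and for the counter sequence C):

```python
def gcc_cdfa_accepts(word, V, L, U):
    """Run the one-state GCC counter automaton: one counter C[v] per value
    v in V; a transition on v is possible only while its guard C[v] < U[v]
    holds; symbols outside V take the unguarded self-loop; the single state
    accepts when C[v] >= L[v] for every v in V."""
    C = {v: 0 for v in V}
    for symbol in word:
        if symbol in C:
            if C[symbol] >= U[symbol]:
                return False  # guard C_v < U_v fails: no transition exists
            C[symbol] += 1
    return all(C[v] >= L[v] for v in V)
```

On the running example, the assignment ⟨a,d,e,b,n⟩ is accepted and ⟨a,d,e,e,e⟩ is rejected.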

A.2 The St Louis Police problem

The St Louis Police problem (the hardest problem described in [29]) is about constructing a rotating schedule of 17 police teams over a seventeen-week cycle. There are day (d), evening (e), and night (n) shifts of work, as well as days off (x). Each team works at most one shift per day. The aim is to construct a 17×7 matrix of values in {d,e,n,x} so that: in the first week, team i (with 1 ≤ i ≤ 17) is assigned to the schedule in row i; in each subsequent week, each

Figure A.1. A cDFA for any STRETCHPATHPARTITION(X,[{d,e,n}],[ℓ],[u]) con- straint over an alphabet Σ containing {d,e,n}, which requires that each stretch taking any value of {d,e,n} be of a length between the lower bound ℓ and the upper bound u; the counter c maintains the length of the current stretch; most transitions and the accepting state 2 are guarded by comparisons between c and the length bounds on the current stretch

Figure A.2. A cDFA for the GCC(X,V,L,U) constraint, where X denotes the decision variables, V is the sequence of symbols constrained by the constraint, L is the sequence of lower bounds on occurrences, and U is the sequence of upper bounds; C is a sequence of |V| counters, with Cv counting the occurrences of any value v ∈ V

team moves down to the next row, while the team on the last row moves up to the first row. It has more constraints than the rotating nurse problem discussed in Section 4.1 of Paper I. It has non-uniform daily workloads: for example, on Mondays, five teams work during the day, five at night, four in the evening, and three enjoy a day off; while on Sundays, three teams work during the day, four at night, four in the evening, and six enjoy a day off. The problem has other column constraints; for example, no team can work the same shift on four consecutive Mondays. Any number of consecutive workdays must be between three and eight, and any transition in work shift can only occur after two to seven days off. There are more complex PATTERN constraints to limit transitions between work shifts: only the patterns (d,x,d), (e,x,e), (n,x,n), (d,x,e), (e,x,n), and (n,x,d) are allowed; if a rest period includes a Saturday or a Sunday, then the patterns (d,x,d) and (n,x,d) are forbidden.

Unrolled DFA         DFS on DFA          DFS on cDFA         [35] on DFA
%S    Sec   Iter     %S    Sec   Iter    %S    Sec   Iter    %S   Sec    Iter
100   1.69  8,436    100   4.12  8,120   100   4.85  8,235   84   14.94  1,538

Table A.1. Benchmark results on the unique instance of the St Louis Police problem

A.2.1 The Model

The non-uniform daily workloads can be enforced by global cardinality (GCC) constraints on the columns; however, our model does not include those GCC constraints, because our search procedure ensures that they are always satisfied. The other column constraint can be enforced by the SEQUENCE constraint. The shift transition and shift length constraints on the rows are modelled by PATTERN and STRETCHPATH constraints. COMET does not have the PATTERN and STRETCHPATH global constraints as built-ins, but it does have the SEQUENCE constraint as a built-in. We have constructed DFAs for the underlying instances of the PATTERN and STRETCHPATH constraints. When using AUTOMATON, our model actually uses the minimised product of these two DFAs, which is more efficient than using the two automata individually, no matter which implementation of the AUTOMATON constraint we deploy. Handcrafting a violation algorithm for the present, more complex PATTERN constraint instance is not as trivial as for the one in Section 4.1 of Paper I. Hence, we have no handcrafted violation algorithm for this problem. Using the differentiable invariants [18] of COMET, as outlined in the more general method of Section 4.1 of Paper I, would yield a very large expression, as we would have to unroll a DFA for 17 · 7 = 119 variables. Indeed, in our experiments, COMET crashes due to huge memory requirements when posting this expression. This difficulty precisely illustrates the point we are trying to make in this paper: automata enable the rapid prototyping of new global constraints!

A.2.2 Experimental Results

Table A.1 gives the average performance over 25 runs of the four discussed ways of implementing the PATTERN and STRETCHPATH constraints on the unique instance of the problem, where %S denotes the average percentage of successful runs without timing out (30 seconds), Sec denotes the average runtime in seconds, and Iter denotes the average number of iterations. We observe that our methods have much lower runtimes (but higher numbers of iterations) than the REGULAR constraint of [35]. The overhead of depth-first search (DFS) appears to be a runtime increase by a factor of two.

B. Omitted Materials from Paper II

In Section 5 of Paper II, we argue that our solution neighbourhoods also work for the ALLDIFFERENT and SUM constraints on the magic square problem, which assigns n² numbers, namely 1, 2, ..., n², to an n × n square so that the sums of the numbers in each row, column, and diagonal are all equal. Table B.1 gives our omitted experimental results on this problem: each row specifies the instance n and gives the average number of iterations and the average runtimes (in seconds) over 15 runs for three search procedures, namely the variable-directed search procedure (denoted by VDS) of the tutorial code magicSquareLS.co in the COMET distribution (version 2.1.1), a constraint-directed search procedure using our solution neighbourhoods for the ALLDIFFERENT and SUM constraints (denoted by SDS), and the constraint-directed search procedure (denoted by CDS) of the tutorial code magicSquareLS-genericConstrDir.co in the COMET distribution (version 2.1.1). Considering the runtimes, SDS is worse than CDS for n ≤ 6 but better than CDS for n ∈ [7,9]. Considering the number of iterations, SDS is much better than CDS for all n. Note that we are not pretending to beat the state of the art on the magic square problem, as VDS is the overall winner and is much better than the others. Our point is that a constraint-directed search procedure may benefit from a solution neighbourhood, even for constraints where constant-time neighbour evaluation algorithms exist.

            iterations                        runtimes
n     VDS    SDS     CDS            VDS    SDS      CDS
5     332    2,411   12,786         0.02   23.99    2.04
6     297    2,658   36,131         0.03   58.09    12.20
7     212    2,042   164,433        0.05   88.24    103.4
8     134    1,966   972,824        0.05   141.24   1,119.45
9     159    5,448   3,158,744      0.09   608.27   5,775.70

Table B.1. Experimental results on the magic square problem: each row specifies the instance n and gives the average number of iterations and runtimes (in seconds) over 15 runs for three search procedures, namely the variable-directed search procedure (denoted by VDS) of the tutorial code magicSquareLS.co in the COMET distribution (version 2.1.1), a constraint-directed search procedure using our solution neighbourhood for the ALLDIFFERENT constraint (denoted by SDS), and the constraint-directed search procedure (denoted by CDS) of the tutorial code magicSquareLS-genericConstrDir.co in the COMET distribution (version 2.1.1)

C. Omitted Materials from Paper IV

Consider the CFG(X, G) constraint, where X is a sequence of n decision variables and G is a context-free grammar (CFG). An incremental propagator for the CFG(X, G) constraint is proposed in [23]; it is based on the Cocke-Younger-Kasami (CYK) parser and computes a table V of non-terminals by parsing G for words of length n.

In Section 3 of Paper IV, we show that there are two dependent opportunities for improving the CFG propagator of [23], namely encoding the support sets space-efficiently and counting their sizes. Given a non-terminal W in any cell V[i, j] of the CYK table, our improved propagator in Paper IV first encodes the low (or high) support set of (or for) W as LS_{i,j}(W) (or HS_{i,j}(W)), and then uses the counter C^LS_{i,j}(W) (or C^HS_{i,j}(W)) to count its low (or high) supports. Due to space reasons, we omitted the procedure rmNoHS of our propagator from Paper IV; it is given here as Algorithm 3. Given an array QHS of queues, where QHS[j] stores all non-terminals without high support in the j-th row of the CYK table V, the procedure rmNoHS handles a non-terminal W without high support as follows: it checks whether W has low supports (line 2) and, if not, terminates; otherwise, it iterates over each low support of W (line 3), decreases the three related counters by one (lines 5, 8, and 11), inserts any related non-terminal left without high support into QHS (lines 6 to 7 and 9 to 10), and terminates once W has no more low supports (lines 12 and 13). In Paper IV, we argue that it is crucial for our propagator to exploit both of the two opportunities above, so that a maximum improvement over the CFG propagator of [23] can be obtained. Now, we give the omitted experimental results to demonstrate that counting with our space-efficient encoding of the support sets (denoted by G++) works better than using only the space-efficient encoding (denoted by G+), which already works better than the propagator of [23] (denoted by G).
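For readers unfamiliar with the CYK table V, here is a minimal membership-checking sketch, under the usual assumption that G is in Chomsky normal form; the rule encoding below (lists of unit and binary productions) is our illustrative choice, not the space-efficient encoding of Paper IV. A non-terminal W is placed in cell V[i][j] exactly when it derives the length-j substring starting at position i, using some split point k, which matches the low supports (W → YZ, k) of the text.

```python
def cyk_table(word, unit_rules, binary_rules):
    """V[i][j] = set of non-terminals deriving the substring of length j
    (1-based lengths) starting at 0-based position i of the word."""
    n = len(word)
    V = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, a in enumerate(word):                    # length-1 substrings
        V[i][1] = {W for (W, t) in unit_rules if t == a}
    for j in range(2, n + 1):                       # substring length
        for i in range(n - j + 1):                  # start position
            for k in range(1, j):                   # split point
                for (W, Y, Z) in binary_rules:      # rule W -> Y Z
                    if Y in V[i][k] and Z in V[i + k][j - k]:
                        V[i][j].add(W)
    return V

# Example grammar: S -> AB | BA, A -> 'a', B -> 'b' accepts "ab" and "ba".
units = [("A", "a"), ("B", "b")]
bins = [("S", "A", "B"), ("S", "B", "A")]
print("S" in cyk_table("ab", units, bins)[0][2])  # True
```

The propagator of [23] maintains such a table incrementally for all words of length n over the current domains, rather than parsing one fixed word.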
Algorithm 3 Given a non-terminal W without high support in the CYK table V, the procedure rmNoHS iterates over each low support of W and decreases the three counters related with this lost low support.

 1: procedure rmNoHS(W, i, j, QHS)
 2:     if C^LS_{i,j}(W) > 0 then
 3:         for all (W → YZ, k) ∈ LS_{i,j}(W) do
 4:             if C^HS_{i,k}(Y) > 0 ∧ C^HS_{i+k,j-k}(Z) > 0 then
 5:                 C^HS_{i,k}(Y) ← C^HS_{i,k}(Y) − 1
 6:                 if C^HS_{i,k}(Y) = 0 then
 7:                     QHS[k].enQ((Y, i))
 8:                 C^HS_{i+k,j-k}(Z) ← C^HS_{i+k,j-k}(Z) − 1
 9:                 if C^HS_{i+k,j-k}(Z) = 0 then
10:                     QHS[j − k].enQ((Z, i + k))
11:                 C^LS_{i,j}(W) ← C^LS_{i,j}(W) − 1
12:                 if C^LS_{i,j}(W) = 0 then
13:                     break from the loop

In Section 4.1 of Paper IV, we describe a real-life shift scheduling problem of [12]. Let w be the number of workers, p the number of periods of the scheduling horizon, and a the number of work activities. The aim is to construct a w × p matrix of values in [1, ..., a + 3] (there are 3 non-work activities, namely break, lunch, and rest) satisfying the work regulation constraints, which can be modelled with a CFG constraint for each worker over the p periods and some global cardinality constraints (GCC). Table C.1 gives the experimental results for this shift scheduling problem with our chosen branching heuristic in Paper IV: each row specifies the instance and gives the runtimes of G++, G+, and G in seconds. We have that G++ always works better than G+, which already always works better than G.

In Section 4.2 of Paper IV, we describe a revised problem of computing a string of length up to n = 50 accepted by both of two given context-free grammars (CFGs) in a benchmark of 100 CFG pairs from [27], where the original problem computes a string of a given length n (with 1 ≤ n ≤ 50) accepted by a given CFG pair. Hence, solving our revised problem is equivalent to running the original problem at most 50 times, with the i-th run trying to find a string of length i accepted by the given CFG pair, until a successful run. This revised (or original) CFG-intersection problem can be modelled by a conjunction of two CFG constraints. Figure C.1 gives the runtimes of G++, G+, and G for all 100 CFG pairs of our revised CFG-intersection problem with our chosen branching heuristic. Each '×' (or '+') denotes the comparison between G++ (or G) and G+. We have that G+ works better than G for all 100 CFG pairs; G++ works better than G+ for 84 CFG pairs (but worse than G+ for one CFG pair), and has an equivalent performance (within a runtime margin of 10 milliseconds) to G+ for the other 15 CFG pairs (which are denoted by possibly overlapping '×' marks on the line y = x). Now, we give the omitted experimental results to demonstrate that G++ and even G systematically beat the string solver HAMPI [27] for all 100 CFG pairs with n from 10 to 50. Figures C.2 to C.6 give the runtimes of G++, G, and HAMPI for all 100 CFG pairs with n ∈ {10, 20, 30, 40, 50} of our revised CFG-intersection problem with our chosen branching heuristic. Each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI.
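As a companion to Algorithm 3, the following is a direct Python transcription of rmNoHS under assumed data structures: plain dictionaries stand in for the counters C^LS and C^HS and for the low-support sets LS (the actual propagator uses the space-efficient encoding of Paper IV), and QHS is an array of queues as in the text.

```python
from collections import deque

def rm_no_hs(W, i, j, LS, C_LS, C_HS, QHS):
    """W in cell V[i,j] has lost its high support: for each low support
    (W -> YZ, k) of W, decrement the three related counters and enqueue
    any non-terminal whose high-support counter drops to zero."""
    if C_LS.get((i, j, W), 0) <= 0:            # line 2: W has no low supports
        return
    for (Y, Z, k) in list(LS[(i, j, W)]):      # line 3: low support W -> YZ at split k
        if C_HS.get((i, k, Y), 0) > 0 and C_HS.get((i + k, j - k, Z), 0) > 0:  # line 4
            C_HS[(i, k, Y)] -= 1               # line 5
            if C_HS[(i, k, Y)] == 0:           # lines 6-7: Y lost all high supports
                QHS[k].append((Y, i))
            C_HS[(i + k, j - k, Z)] -= 1       # line 8
            if C_HS[(i + k, j - k, Z)] == 0:   # lines 9-10: Z lost all high supports
                QHS[j - k].append((Z, i + k))
            C_LS[(i, j, W)] -= 1               # line 11
            if C_LS[(i, j, W)] == 0:           # lines 12-13: no more low supports
                break
```

Calling rm_no_hs on a non-terminal that has just lost its last high support thus cascades the loss downwards, enqueueing in QHS every Y or Z whose own high-support counter reaches zero so that it can be processed in turn.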

benchmark (p = 96)        runtimes in seconds
instance   a   w       G++      G+        G
1_1        1   1      0.24     0.88     3.93
1_2        1   3      0.90     3.40    12.87
1_3        1   4      1.68     6.42    19.49
1_4        1   5      1.18     4.63    20.53
1_5        1   4      0.92     3.66    16.32
1_6        1   5      1.17     4.49    20.17
1_7        1   6      7.87    26.10    47.48
1_8        1   2      0.52     2.05     8.47
1_9        1   1      0.22     0.88     3.94
1_10       1   7     23.31    76.35   100.95
2_1        2   2      0.93     3.26    15.97
2_5        2   4      1.02     3.74    16.41
2_6        2   5      1.35     4.65    21.57
2_7        2   6      1.97     6.71    32.03
2_8        2   2      2.86     9.98    24.09
2_9        2   1      0.63     2.26    11.03
2_10       2   7      7.64    20.40    53.90

Table C.1. Runtimes of G++, G+, and G for the shift scheduling problem with our chosen branching heuristic

Figure C.1. Runtimes of G++, G+, and G for all 100 CFG pairs of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and G+

Figure C.2. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 10 of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

We have that G++ and G work much better than HAMPI for all 100 CFG pairs with all n ∈ {10, 20, 30, 40, 50}. Note that Figure C.6 is different from Figure 2 of Paper IV, which gives the results for only the 55 CFG pairs where HAMPI takes at least one second. Table C.2 gives the average runtime speed-ups of G++ (or G) over HAMPI on all 100 CFG pairs with n ∈ {10, 20, 30, 40, 50}. We have that the runtime speed-ups systematically increase as n rises from 20 to 50. Now, we give the omitted experimental results to demonstrate that G++ and even G also work much better than HAMPI for the original CFG-intersection problem of [27]. Figure C.7 gives the runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 50 of the original CFG-intersection problem with our chosen branching heuristic. Each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI. We have that G++ works much better than HAMPI for 99 CFG pairs and a little worse than HAMPI for only one CFG pair; G works better than HAMPI for 64 (difficult) instances, but worse than HAMPI for the other 36 (easy) instances. Note that 97 CFG pairs are solved at the root of the search tree for our revised (or original) CFG-intersection problem with all n ∈ {10, 20, 30, 40, 50}, for G, G+, and G++ (since they all achieve the same propagation). Hence, the choice of a branching heuristic may only influence the results of the other 3 CFG pairs. Table C.3 gives the runtimes of G++, G, and HAMPI for the 3 CFG pairs with n = 50 of our revised CFG-intersection problem with the default first-fail branching heuristic. We have that G++ and G work worse than HAMPI on one CFG pair, and that G++ and G work better than HAMPI for the other two CFG pairs. Hence, we have that G++ and even G work much better than HAMPI on over 99% of all 100 CFG pairs even with the default first-fail branching heuristic, thereby alleviating any fears that advanced skills with CP branching heuristics are crucial for the performance of CP solvers on this kind of problem.

As we only address combinatorial problems on a fixed number of decision variables, the CFG constraint can be reformulated into the REGULAR constraint, as in [25] for instance. The reformulation of [25] however needs a propagator for the CFG constraint to shrink the initial domains of all decision variables, achieving domain consistency for all constraints at the root of the search tree, so that the obtained deterministic finite automaton (DFA) is much smaller. Hence, this reformulation is instance-specific and must be done on-line. Now, we give the omitted experimental results on the revised CFG-intersection problem to demonstrate that there are CFGs where an expensive CFG propagator beats a cheap REGULAR propagator, when the CFG must be reformulated on-line into a DFA for a fixed length for the latter. For n = 50, 97 CFG pairs are solved at the root of the search tree; hence the reformulation of [25] of the CFG constraint into the REGULAR constraint has runtimes similar to our CFG propagators for those 97 CFG pairs. Table C.4 gives the runtimes of G++, G, HAMPI, DFAG++, and DFAG for the other three CFG pairs with n = 50 of our revised CFG-intersection problem with our chosen branching heuristic, where DFAG++ (or DFAG) denotes the on-line reformulation of [25] with the CFG propagator G++ (or G). For these three CFG pairs, we have that G++ and G always work much better than HAMPI, which always works much better than DFAG++ and DFAG.

Figure C.3. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 20 of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

Figure C.4. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 30 of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

Figure C.5. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 40 of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

Figure C.6. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 50 of our revised CFG-intersection problem with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

average runtime speed-ups on all 100 CFG pairs
 n    G++ over HAMPI    G over HAMPI
10          51.9             22.5
20         114.4              9.8
30         154.0             10.2
40         245.1             12.3
50         357.4             15.7

Table C.2. Average runtime speed-ups of G++ (or G) over HAMPI on all 100 CFG pairs with n ∈ {10,20,30,40,50} of our revised CFG-intersection problem with our chosen branching heuristic

Figure C.7. Runtimes of G++, G, and HAMPI for all 100 CFG pairs with n = 50 of the original CFG-intersection problem in [27] with our chosen branching heuristic, where each '×' (or '+') denotes the comparison between G++ (or G) and HAMPI

                          runtimes in seconds
instance                       G++         G    HAMPI
045.cfg_AND_044.cfg       > 180.00  > 180.00    21.46
053.cfg_AND_053.cfg           0.01      0.06     0.24
019.cfg_AND_019.cfg           0.01      0.17     0.50

Table C.3. Runtimes of G++, G, and HAMPI for the three CFG pairs that cannot be solved at the root of the search tree, with n = 50 of our revised CFG-intersection problem with the default first-fail branching heuristic

                          runtimes in seconds
instance                  G++      G    HAMPI    DFAG++      DFAG
045.cfg_AND_044.cfg      0.28   2.85    21.46  > 180.00  > 180.00
053.cfg_AND_053.cfg      0.01   0.05     0.24      3.25      3.30
019.cfg_AND_019.cfg      0.01   0.16     0.50     65.15     65.39

Table C.4. Runtimes of G++, G, HAMPI, DFAG++, and DFAG for the three CFG pairs that cannot be solved at the root of the search tree, with n = 50 of our revised CFG-intersection problem with our chosen branching heuristic, where DFAG++ (or DFAG) denotes the on-line reformulation of [25] with the CFG propagator G++ (or G)


Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1027 Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology.

ACTA UNIVERSITATIS UPSALIENSIS Distribution: publications.uu.se UPPSALA urn:nbn:se:uu:diva-196347 2013