Week 12

1 Integer Programming

For this last lecture I have chosen to illustrate the subjects by means of problems, because this is a very big and important sub-field of Combinatorial Optimisation.

1.1 Local Search and Simulated Annealing

Let N = {1, . . . , n} be a set of jobs, pj ≥ 0 the processing time of job j, and M = {1, . . . , m} a set of identical parallel machines. A schedule of the jobs on the machines is feasible if each job is processed and each machine processes only one job at a time. The schedule is non-preemptive if each job, once started, is processed to completion without interruption. Given a schedule we let Cj denote the completion time of job j. We always assume in this lecture that no machine is idle as long as there are jobs ready to be processed on it, i.e., we consider so-called non-idling schedules.

Makespan. Find a non-preemptive feasible schedule of the jobs on the machines such that the time at which the last job finishes processing, maxj Cj, is minimised. Start with an arbitrary feasible non-idling schedule.

As an example of a local search algorithm we consider improvement on the so-called Jump neighbourhood. Given a schedule, a neighbouring schedule is one in which exactly one job changes machines. Draw example at the blackboard. The jump neighbourhood therefore contains n(m − 1) neighbours.

Another one is the Swap neighbourhood. Given a schedule, a neighbouring schedule is one in which two jobs exchange their machines. Draw example at the blackboard. The swap-neighbourhood contains (n choose 2) = n(n − 1)/2 neighbours. Notice that a swap is in fact the result of two consecutive jumps; however, neither of the two jumps may in itself be advantageous. Usually, the swap-neighbourhood is therefore defined to contain the jumps as well, and then the set of local optima with respect to the swap-neighbourhood is contained in the set of local optima with respect to the jump-neighbourhood. A small sketch of improvement over the jump neighbourhood follows below.
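As a concrete illustration, here is a minimal Python sketch of iterative improvement over the jump neighbourhood for the makespan problem; the function names and the starting assignment are my own choices, and swaps can be added analogously by exchanging the machine assignments of two jobs.

    def makespan(assign, p, m):
        """Makespan of an assignment: assign[j] is the machine of job j."""
        load = [0.0] * m
        for j, pj in enumerate(p):
            load[assign[j]] += pj
        return max(load)

    def jump_improvement(p, m):
        """Iterative improvement over the jump neighbourhood: move one job
        to another machine as long as the makespan strictly decreases."""
        n = len(p)
        assign = [j % m for j in range(n)]          # arbitrary feasible start
        best = makespan(assign, p, m)
        improved = True
        while improved:
            improved = False
            for j in range(n):                      # scan all n(m-1) jumps
                for h in range(m):
                    if h == assign[j]:
                        continue
                    old = assign[j]
                    assign[j] = h
                    val = makespan(assign, p, m)
                    if val < best:
                        best, improved = val, True  # keep the improving jump
                    else:
                        assign[j] = old             # undo a non-improving jump
        return assign, best

    # e.g. jump_improvement([3, 5, 2, 7, 4, 6], 2) ends in a jump-local optimum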

From this it is straightforward to design an even larger neighbourhood: take any set of at most k jobs and change or exchange their positions. Even for relatively small values of k this neighbourhood becomes very large: more than (n choose k) times the number of permutations of any k-element set (see the small computation below). Clearly, choosing k = n would result in a neighbourhood that contains all feasible schedules and therefore has the optimal schedule as its only local optimum.
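To get a feeling for this growth, a two-line computation (the values of n and k are arbitrary examples of mine):

    from math import comb, factorial

    n, k = 50, 4
    # more than C(n, k) times the k! permutations of a k-element set
    print(comb(n, k) * factorial(k))   # 5527200 neighbours already for n=50, k=4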

Simulated Annealing
- Define a neighbourhood.
- Select a starting point. Make this the current point x.
- Draw at random a solution y in the neighbourhood of the current point x.
- Always accept the new solution if it is an improvement.
- If f(y) > f(x), accept it with a small probability.

In fact the whole procedure can be viewed as a Markov Chain over the set of all feasible solutions of the problem to be solved. The fine-tuning of the procedure is in the selection of the random neighbour, i.e., selecting the probabilities qxy, and in the definition of the probability to accept a worse neighbour. This is usually taken to be Pr{accept y | f(y) > f(x)} = e^(−(f(y)−f(x))/T), with T a positive constant, the so-called temperature. If the neighbourhood is symmetric and the probabilities q have been chosen in such a way that the Markov Chain is ergodic, then it converges to its stationary distribution. If T is chosen small enough then the stationary distribution concentrates almost all its probability mass on the set of global optima (in the limit T → 0 only the global optima carry mass). It is an extremely interesting and important open research question how fast the convergence is, even for specific optimisation problems. So far, the only information we have is that simulated annealing appears to work well on some problems and much less well on others, without any insight as to why.
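A generic Python sketch implementing the acceptance rule above; the constant temperature T and the step budget are arbitrary choices of mine (in practice T is often decreased slowly during the run, whence the name annealing):

    import math
    import random

    def simulated_annealing(x0, f, random_neighbour, T=1.0, steps=100000):
        """Metropolis-style search: always accept improvements, accept a
        worsening move y with probability exp(-(f(y) - f(x)) / T)."""
        x, fx = x0, f(x0)
        best, fbest = x, fx
        for _ in range(steps):
            y = random_neighbour(x)          # draw y from N(x), probabilities q_xy
            fy = f(y)
            if fy <= fx or random.random() < math.exp(-(fy - fx) / T):
                x, fx = y, fy
            if fx < fbest:                   # remember the best solution seen
                best, fbest = x, fx
        return best, fbest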

There is quite a lot of research on fast convergence of Markov Chains in the counting literature, or equivalently in the literature on randomly sampling implicitly defined objects. Theory developed there may help us to find the answers to the above question.

In my opinion the analysis of such (general purpose) simple algorithms is one of the most important topics for research in OR/combinatorial optimisation/algorithms. I welcome very good students to try this topic.

2 The art in linear optimization

If you ever need to solve a problem in practice and you decide that a (mixed integer) linear programming problem is the best model for your problem, then you may just formulate it, buy yourself any of the LP-software packages available, and give your computer instructions to solve it.

However, this is not as easy as it seems. Problems start already with how to enter your data into the computer, i.e., how to provide the input to the LP-package. Here too, solutions have been provided for you by modelling languages like AMPL and GAMS. I know people who nowadays prefer to write their own Java code to communicate with the LP-solvers.

Using modelling languages makes it possible to input millions of constraints and millions of variables with only a few minutes of typing. This may lead to LP-problems that are too big to solve within reasonable time, and dramatically more so for ILP-problems. For LP's one may then give the LP-solver the instruction to do delayed column or row generation. For ILP's one can use a lot of tricks to speed up the solution procedure: apply Lagrangean Relaxation for lower bounds, approximation algorithms (of any kind, including homeopathic methods) for upper bounds, preprocessing, etc. The fact is that all these features have to be communicated to the general (I)LP-solver.

It is a bit boring to present tables showing the effects of all these tricks in a lecture, but I definitely advise you to read Sections 12.1, 12.2 and especially 12.3 to see that smart tools make a difference. This is more the rule than the exception in solving optimisation problems in practice.

Section 12.4 gives an example of a rather huge LP-problem, in the sense that a lot of different variables and constraints are needed to model the problem.

I decided to concentrate on Section 12.5, where it is easy to come up with a straightforward ILP-model of a job-shop scheduling problem, but the experience in practice is that these problems resist any solution by direct application of standard software. Therefore, smart insights into the problems have to be generated and fed into the software packages. I chose this subject because along the way it shows some nice scheduling theory.

A single machine scheduling problem. Use the notation from the beginning: N = {1, . . . , n} a set of jobs, pj ≥ 0 the processing time of job j; now we also have wj ≥ 0, the weight of job j. In this case we have a single machine and we wish to find a feasible non-preemptive non-idling schedule of the jobs on this machine that minimises ∑_{j∈N} wjCj.

A simple exchange argument shows that the following algorithm solves the problem, placing the problem in the complexity class P: Shortest weighted processing time first (SWPT). Schedule the jobs in order of non-decreasing ratio pj/wj (breaking ties arbitrarily).
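In Python the rule is a one-line sort; this sketch (names are mine, and the weights are assumed strictly positive) also computes the objective value:

    def swpt(p, w):
        """Shortest weighted processing time first: sort the jobs by
        non-decreasing p[j] / w[j] and return the order and sum_j w_j C_j."""
        order = sorted(range(len(p)), key=lambda j: p[j] / w[j])
        t = obj = 0.0
        for j in order:
            t += p[j]            # t is now C_j, the completion time of job j
            obj += w[j] * t
        return order, obj

    # e.g. swpt([3, 1, 2], [1, 4, 2]) schedules job 1 first (ratio 1/4)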

To formulate feasible schedules in an LP we notice that the weights are irrelevant; in particular, the constraints are valid if and only if they are valid for wj = pj, for all j.

From this observation the LP-formulation is derived as in the book. There are exponentially many constraints, but they can be separated in polynomial time, which we do not show here.
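For concreteness, the family of constraints meant here, as I reconstruct it from the job-shop version later in this lecture, is

\[
\sum_{j \in S} p_j C_j \;\ge\; \frac{1}{2}\Bigl(\sum_{j \in S} p_j\Bigr)^{2} + \frac{1}{2}\sum_{j \in S} p_j^{2}, \qquad \forall\, S \subseteq N.
\]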

Introducing release times, deadlines and precedence constraints. N = {1, . . . , n} a set of jobs, pj ≥ 0 the processing time of job j, wj ≥ 0 the weight of job j, rj ≥ 0 the release time of job j, dj ≥ 0 the deadline of job j, and G = (N, A) an acyclic directed graph representing precedence constraints between the jobs.

Every job j must be processed between rj and dj, and if (i, j) ∈ A then processing of job i must be completed before processing of job j can be started. The problem of minimising ∑_{j∈N} wjCj under these extra constraints is NP-hard. We extend the LP-formulation with two sets of constraints:

rj ≤ Cj ≤ dj, ∀j,

Cj ≥ Ci + pj, ∀(i, j) ∈ A.
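As a toy illustration, here is how this LP could be set up in Python with the PuLP modelling package (an assumption of mine; any LP interface would do). The instance data are made up, and all exponentially many subset constraints are enumerated explicitly, which is only sensible for very small n; in earnest one would separate them as mentioned above.

    from itertools import combinations
    import pulp  # assumes the PuLP package and its default CBC solver

    p = {1: 3.0, 2: 2.0, 3: 4.0}      # processing times (made-up data)
    w = {1: 2.0, 2: 1.0, 3: 3.0}      # weights
    r = {1: 0.0, 2: 1.0, 3: 0.0}      # release times
    d = {1: 20.0, 2: 20.0, 3: 20.0}   # deadlines
    A = [(1, 3)]                      # job 1 must precede job 3

    jobs = sorted(p)
    lp = pulp.LpProblem("single_machine_LP", pulp.LpMinimize)
    C = {j: pulp.LpVariable(f"C_{j}", lowBound=0) for j in jobs}
    lp += pulp.lpSum(w[j] * C[j] for j in jobs)   # objective sum_j w_j C_j

    # the exponential family of valid inequalities, enumerated for tiny n
    for k in range(1, len(jobs) + 1):
        for S in combinations(jobs, k):
            ps = sum(p[j] for j in S)
            lp += (pulp.lpSum(p[j] * C[j] for j in S)
                   >= 0.5 * ps * ps + 0.5 * sum(p[j] ** 2 for j in S))

    for j in jobs:                    # r_j <= C_j <= d_j
        lp += C[j] >= r[j]
        lp += C[j] <= d[j]
    for i, j in A:                    # C_j >= C_i + p_j for (i, j) in A
        lp += C[j] >= C[i] + p[j]

    lp.solve(pulp.PULP_CBC_CMD(msg=False))
    print({j: C[j].value() for j in jobs})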

The optimal LP-solution may be infeasible, by giving a preemptive schedule, or even simultaneous processing of jobs, i.e., a schedule in which Ci^LP ≤ Cj^LP < Ci^LP + pj.

The LP-solution can be used as a start for branching in a B&B-algorithm. For one subproblem add the restriction Ci + pj ≤ Cj and for the other Cj + pi ≤ Ci. This clearly cuts away the above optimal LP-solution, which does not correspond to a feasible schedule. This can be continued until the optimal solution is found.

Another possibility, in the absence of deadlines, is to sort the jobs on non-decreasing Cj^LP and schedule them in that order, satisfying release times and precedence constraints; this gives a feasible schedule, which is an approximation of an optimal schedule. In the book it is shown that if release times are absent as well, this approximate schedule has a worst-case approximation ratio of 2. A sketch of this list-scheduling step follows below.
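A minimal sketch of the list-scheduling step, assuming the LP values are handed over in a dict C_lp and that every predecessor of a job has a strictly smaller LP completion time (which holds for the LP above when all pj > 0); all names are mine:

    def schedule_by_lp_order(p, r, preds, C_lp):
        """One machine: take the jobs in non-decreasing C^LP order and start
        each as soon as the machine, the release time and all predecessors
        allow it.  preds[j] is the list of predecessors of job j."""
        order = sorted(C_lp, key=C_lp.get)
        C, t = {}, 0.0                     # completion times, machine-free time
        for j in order:
            start = max([t, r[j]] + [C[i] for i in preds.get(j, [])])
            C[j] = start + p[j]
            t = C[j]
        return C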

Job shop scheduling. N = {1, . . . , n} a set of jobs, M = {1, . . . , m} a set of machines.

Now, each job consists of an ordered set of k tasks j1, . . . , jk, which have to be processed in that order, such that the next task can start only if the previous one has been completed.

pji ≥ 0 the processing time of task ji, wj ≥ 0 the weight of job j.

Each task has to be processed on a specific machine. Let Mh be the set of tasks that are to be processed on machine h, and let Cji be the completion time of task ji. Then we wish to minimise ∑_{j∈N} wjCjk, the weighted sum of the completion times of the last tasks.

For each machine h, the tasks seen as jobs have to satisfy the same conditions as if this were the only machine:

∑_{ji∈S} pji Cji ≥ 1/2 (∑_{ji∈S} pji)² + 1/2 ∑_{ji∈S} pji², ∀S ⊆ Mh.

Next to that we add the constraints imposed by the order of the tasks within the jobs:

Cji ≥ Cji−1 + pji, ∀ji with i ≥ 2.

The LP can again be separated in polynomial time, but may fail to give a feasible schedule for the same reason as in the previous problem.

Since we do not have deadlines here, a feasible solution can be obtained by ordering on each machine all tasks in non-decreasing Cji^LP and scheduling them in that order on the machine, as soon as both the machine and the task are available. A sketch of this heuristic follows below.
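A sketch of this step in Python, under the same assumptions as before (the LP values are given; all names and the task encoding (j, i) are mine):

    def jobshop_from_lp(tasks, C_lp):
        """tasks maps (j, i) -> (machine, processing time); task (j, 1) is
        the first task of job j.  Sort each machine's tasks by C^LP and
        start the next task in a machine's list as soon as the machine is
        idle and the preceding task of the same job has finished."""
        machines = {}
        for t, (h, _) in tasks.items():
            machines.setdefault(h, []).append(t)
        for h in machines:
            machines[h].sort(key=C_lp.get)       # non-decreasing C^LP order
        free = {h: 0.0 for h in machines}        # time machine becomes idle
        pos = {h: 0 for h in machines}           # next list position per machine
        done = {}                                # completion time per task
        while any(pos[h] < len(machines[h]) for h in machines):
            progress = False
            for h in machines:
                if pos[h] == len(machines[h]):
                    continue
                j, i = machines[h][pos[h]]
                if i > 1 and (j, i - 1) not in done:
                    continue                     # job predecessor not finished yet
                start = max(free[h], done.get((j, i - 1), 0.0))
                done[(j, i)] = start + tasks[(j, i)][1]
                free[h] = done[(j, i)]
                pos[h] += 1
                progress = True
            if not progress:                     # cannot happen for consistent LP orders
                raise RuntimeError("machine orders conflict with task orders")
        return done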

Also here B&B can be applied based on the infeasibilities in the LP-solution.

Notice from the book that even 20×20 problems take minutes to solve. These job shop scheduling problems are very hard; even randomly drawn instances are hard. There is a famous 10 × 10 problem, 10 jobs of 10 tasks each on 10 machines, which proved extremely hard to solve.

So in this course you have learned the theory of (Integer) LP, and the examples in this section show that knowing the theory is almost indispensable when solving practical problems. In fact you often need more invention than what you have learned here. Moreover, the state of the art of combinatorial optimisation is such that

a lot of its results are ad hoc. This is certainly due to the relatively young age of this research field, hardly half a century old. But it may very well be intrinsic to the subject. Anyway, there are lots of opportunities for young researchers to stake their claim to fame on a huge set of appealing mathematical problems!

Material of Week 12 from [B&T]

Chapter 11, Sections 11.6 and 11.7, and Chapter 12

Exercises of Week 12

No exercises
