Outline

Artificial intelligence 1: informed search

Lecturer: Tom Lenaerts
SWITCH, Vlaams Interuniversitair Instituut voor Biotechnologie

Outline

Informed = use problem-specific knowledge.
Which search strategies?
– Best-first search and its variants
Heuristic functions?
– How to invent them
Local search and optimization
– Hill climbing, local beam search, genetic algorithms,…
Local search in continuous spaces
Online search agents

AI 1 — 6 februari 2008

Previously: tree search

function TREE-SEARCH(problem, fringe) return a solution or failure
  fringe ← INSERT(MAKE-NODE(INITIAL-STATE[problem]), fringe)
  loop do
    if EMPTY?(fringe) then return failure
    node ← REMOVE-FIRST(fringe)
    if GOAL-TEST[problem] applied to STATE[node] succeeds
      then return SOLUTION(node)
    fringe ← INSERT-ALL(EXPAND(node, problem), fringe)

A strategy is defined by picking the order of node expansion.

Best-first search

General approach of informed search:
– Best-first search: a node is selected for expansion based on an evaluation function f(n).
Idea: the evaluation function measures distance to the goal.
– Choose the node which appears best.
Implementation:
– fringe is a queue sorted in decreasing order of desirability.
– Special cases: greedy search, A* search.
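The TREE-SEARCH skeleton with a fringe ordered by f(n) can be sketched in Python (a minimal sketch, not the slides' exact interface; the `goal_test`/`expand` callables stand in for the abstract problem):

```python
import heapq

def best_first_search(initial, goal_test, expand, f):
    """TREE-SEARCH with the fringe kept sorted by an evaluation function f.
    expand(state) yields (step_cost, successor) pairs."""
    fringe = [(f(initial, 0), initial, [initial], 0)]   # (f, state, path, g)
    while fringe:
        _, state, path, g = heapq.heappop(fringe)       # REMOVE-FIRST
        if goal_test(state):
            return path, g                              # SOLUTION(node)
        for cost, succ in expand(state):                # INSERT-ALL(EXPAND(...))
            heapq.heappush(fringe,
                           (f(succ, g + cost), succ, path + [succ], g + cost))
    return None                                         # failure

# With f(n) = g(n) this behaves as uniform-cost search on a toy graph:
graph = {'A': [(1, 'B'), (4, 'C')], 'B': [(2, 'G')], 'C': [(1, 'G')], 'G': []}
path, cost = best_first_search('A', lambda s: s == 'G',
                               lambda s: graph[s], lambda s, g: g)
# path == ['A', 'B', 'G'], cost == 3
```

Plugging in a different f gives the special cases named above: f(n)=h(n) yields greedy search, f(n)=g(n)+h(n) yields A*.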


A heuristic function

[dictionary] "A rule of thumb, simplification, or educated guess that reduces or limits the search for solutions in domains that are difficult and poorly understood."
– h(n) = estimated cost of the cheapest path from node n to a goal node.
– If n is a goal then h(n) = 0.
More information later.

Romania with step costs in km

hSLD = straight-line distance heuristic.
hSLD can NOT be computed from the problem description itself.
In this example f(n) = h(n):
– Expand the node that is closest to the goal.
= Greedy best-first search


Greedy search example

Assume that we want to use greedy search to solve the problem of travelling from Arad to Bucharest.
The initial state = Arad (366).

Greedy search example

The first expansion step produces:
– Sibiu (253), Timisoara (329) and Zerind (374)
Greedy best-first will select Sibiu.


Greedy search example

If Sibiu is expanded we get:
– Arad (366), Fagaras (176), Oradea (380) and Rimnicu Vilcea (193)
Greedy best-first search will select: Fagaras.

Greedy search example

If Fagaras is expanded we get:
– Sibiu (253) and Bucharest (0)
Goal reached!!
– Yet not optimal (see Arad, Sibiu, Rimnicu Vilcea, Pitesti).
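The greedy trace above can be reproduced with a short script (a sketch; the road map below is only the fragment of the Romania map actually touched by the example):

```python
# h_SLD values and the map fragment used in the example.
H = {'Arad': 366, 'Bucharest': 0, 'Fagaras': 176, 'Oradea': 380,
     'Rimnicu Vilcea': 193, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374}
ROADS = {'Arad': {'Sibiu': 140, 'Timisoara': 118, 'Zerind': 75},
         'Sibiu': {'Arad': 140, 'Fagaras': 99, 'Oradea': 151,
                   'Rimnicu Vilcea': 80},
         'Fagaras': {'Sibiu': 99, 'Bucharest': 211}}

def greedy_best_first(start, goal):
    """Always expand the fringe node with the smallest h; f(n) = h(n)."""
    fringe = [(H[start], [start], 0)]          # (h, path, km so far)
    while fringe:
        fringe.sort()                          # smallest h first
        _, path, km = fringe.pop(0)
        state = path[-1]
        if state == goal:
            return path, km
        for succ, step in ROADS.get(state, {}).items():
            if succ not in path:               # avoid trivial loops
                fringe.append((H[succ], path + [succ], km + step))
    return None

path, km = greedy_best_first('Arad', 'Bucharest')
# path: Arad -> Sibiu -> Fagaras -> Bucharest, km == 450
```

The 450 km result illustrates the non-optimality noted on the slide: the route through Rimnicu Vilcea and Pitesti is 418 km.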


Greedy search, evaluation

Completeness: NO (cf. DF-search)
– Minimizing h(n) can result in false starts, e.g. Iasi to Fagaras.
– Check on repeated states.

Time complexity: O(b^m)
– Cf. worst-case DF-search (with m the maximum depth of the search space).
– A good heuristic can give dramatic improvement.


Greedy search, evaluation

Space complexity: O(b^m)
– Keeps all nodes in memory.

Optimality: NO
– Same as DF-search.


A* search

Best-known form of best-first search.
Idea: avoid expanding paths that are already expensive.
Evaluation function f(n) = g(n) + h(n):
– g(n): the cost (so far) to reach the node.
– h(n): estimated cost to get from the node to the goal.
– f(n): estimated total cost of the path through n to the goal.

A* search

A* search uses an admissible heuristic.
– A heuristic is admissible if it never overestimates the cost to reach the goal, i.e. admissible heuristics are optimistic.
Formally:
1. h(n) ≤ h*(n), where h*(n) is the true cost from n.
2. h(n) ≥ 0, so h(G) = 0 for any goal G.
E.g. hSLD(n) never overestimates the actual road distance.
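A* itself is a one-line change to greedy search: order the fringe by g+h instead of h. A minimal sketch on the same Romania fragment (the map data is the subset used in the worked example on the following slides):

```python
import heapq

# h_SLD and the map fragment from the worked example.
H = {'Arad': 366, 'Bucharest': 0, 'Craiova': 160, 'Fagaras': 176,
     'Oradea': 380, 'Pitesti': 100, 'Rimnicu Vilcea': 193, 'Sibiu': 253,
     'Timisoara': 329, 'Zerind': 374}
ROADS = {'Arad': {'Sibiu': 140, 'Timisoara': 118, 'Zerind': 75},
         'Sibiu': {'Arad': 140, 'Fagaras': 99, 'Oradea': 151,
                   'Rimnicu Vilcea': 80},
         'Fagaras': {'Sibiu': 99, 'Bucharest': 211},
         'Rimnicu Vilcea': {'Sibiu': 80, 'Pitesti': 97, 'Craiova': 146},
         'Pitesti': {'Rimnicu Vilcea': 97, 'Bucharest': 101, 'Craiova': 138}}

def astar(start, goal):
    """Expand the fringe node with the smallest f(n) = g(n) + h(n)."""
    fringe = [(H[start], 0, [start])]          # (f, g, path)
    while fringe:
        f, g, path = heapq.heappop(fringe)
        state = path[-1]
        if state == goal:
            return path, g
        for succ, step in ROADS.get(state, {}).items():
            if succ not in path:
                heapq.heappush(fringe,
                               (g + step + H[succ], g + step, path + [succ]))
    return None

path, km = astar('Arad', 'Bucharest')
# path: Arad -> Sibiu -> Rimnicu Vilcea -> Pitesti -> Bucharest, km == 418
```

The nodes come off the fringe in exactly the order of the worked example: Arad (366), Sibiu (393), Rimnicu Vilcea (413), Fagaras (415), Pitesti (417), Bucharest (418).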


Romania example

A* search example

Find Bucharest starting at Arad.
– f(Arad) = c(??,Arad) + h(Arad) = 0 + 366 = 366


A* search example

Expand Arad and determine f(n) for each node:
– f(Sibiu) = c(Arad,Sibiu) + h(Sibiu) = 140 + 253 = 393
– f(Timisoara) = c(Arad,Timisoara) + h(Timisoara) = 118 + 329 = 447
– f(Zerind) = c(Arad,Zerind) + h(Zerind) = 75 + 374 = 449
Best choice is Sibiu.

A* search example

Expand Sibiu and determine f(n) for each node:
– f(Arad) = c(Sibiu,Arad) + h(Arad) = 280 + 366 = 646
– f(Fagaras) = c(Sibiu,Fagaras) + h(Fagaras) = 239 + 176 = 415
– f(Oradea) = c(Sibiu,Oradea) + h(Oradea) = 291 + 380 = 671
– f(Rimnicu Vilcea) = c(Sibiu,Rimnicu Vilcea) + h(Rimnicu Vilcea) = 220 + 193 = 413
Best choice is Rimnicu Vilcea.


A* search example

Expand Rimnicu Vilcea and determine f(n) for each node:
– f(Craiova) = c(Rimnicu Vilcea,Craiova) + h(Craiova) = 366 + 160 = 526
– f(Pitesti) = c(Rimnicu Vilcea,Pitesti) + h(Pitesti) = 317 + 100 = 417
– f(Sibiu) = c(Rimnicu Vilcea,Sibiu) + h(Sibiu) = 300 + 253 = 553
Best choice is Fagaras.

A* search example

Expand Fagaras and determine f(n) for each node:
– f(Sibiu) = c(Fagaras,Sibiu) + h(Sibiu) = 338 + 253 = 591
– f(Bucharest) = c(Fagaras,Bucharest) + h(Bucharest) = 450 + 0 = 450
Best choice is Pitesti!!!


A* search example

Expand Pitesti and determine f(n) for each node:
– f(Bucharest) = c(Pitesti,Bucharest) + h(Bucharest) = 418 + 0 = 418
Best choice is Bucharest!!!
– Optimal solution (only if h(n) is admissible).
Note the values along the optimal path!!

Optimality of A* (standard proof)

Suppose a suboptimal goal G2 is in the queue. Let n be an unexpanded node on a shortest path to the optimal goal G.
f(G2) = g(G2)   since h(G2) = 0
      > g(G)    since G2 is suboptimal
      ≥ f(n)    since h is admissible
Since f(G2) > f(n), A* will never select G2 for expansion.


BUT … graph search

Graph search discards new paths to a repeated state.
– The previous proof breaks down.
Solution:
– Add extra bookkeeping, i.e. remove the more expensive of the two paths.
– Ensure that the optimal path to any repeated state is always followed first.
– Extra requirement on h(n): consistency (monotonicity).

Consistency

A heuristic is consistent if
h(n) ≤ c(n,a,n') + h(n')
If h is consistent, we have
f(n') = g(n') + h(n') = g(n) + c(n,a,n') + h(n') ≥ g(n) + h(n) = f(n)
i.e. f(n) is nondecreasing along any path.
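The consistency condition is a triangle inequality and can be checked mechanically. A quick sketch on the map fragment used in the examples, verifying h(n) ≤ c(n,a,n') + h(n') for every road segment:

```python
# h_SLD and a fragment of the Romania map (values from the worked example).
H = {'Arad': 366, 'Bucharest': 0, 'Fagaras': 176, 'Oradea': 380,
     'Pitesti': 100, 'Rimnicu Vilcea': 193, 'Sibiu': 253,
     'Timisoara': 329, 'Zerind': 374}
ROADS = {'Arad': {'Sibiu': 140, 'Timisoara': 118, 'Zerind': 75},
         'Sibiu': {'Arad': 140, 'Fagaras': 99, 'Oradea': 151,
                   'Rimnicu Vilcea': 80},
         'Fagaras': {'Sibiu': 99, 'Bucharest': 211},
         'Rimnicu Vilcea': {'Sibiu': 80, 'Pitesti': 97},
         'Pitesti': {'Rimnicu Vilcea': 97, 'Bucharest': 101}}

# Straight-line distance satisfies the triangle inequality, so every
# road segment should pass the check h(n) <= c(n,a,n') + h(n').
consistent = all(H[n] <= c + H[m]
                 for n, nbrs in ROADS.items() for m, c in nbrs.items())
# consistent == True
```

The tightest margin is Fagaras–Bucharest: 176 ≤ 211 + 0.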


Optimality of A* (more useful)

A* expands nodes in order of increasing f value.
Contours can be drawn in the state space:
– Uniform-cost search adds circles.
– Contour i contains all nodes with f = f_i, where f_i < f_{i+1}.

A* search, evaluation

Completeness: YES
– Since bands of increasing f are added.
– Unless there are infinitely many nodes with f ≤ f(G).


A* search, evaluation

Time complexity: exponential with path length
– The number of nodes expanded is still exponential in the length of the solution.

Space complexity:
– It keeps all generated nodes in memory.
– Hence space is the major problem, not time.


A* search, evaluation

Completeness: YES
Time complexity: exponential with path length
Space complexity: all nodes are stored
Optimality: YES
– Cannot expand the f_(i+1) contour until the f_i contour is finished.
– A* expands all nodes with f(n) < C*.
– A* expands some nodes with f(n) = C*.
– A* expands no nodes with f(n) > C*.
Also optimally efficient (not including ties).

Memory-bounded heuristic search

Some solutions to A*'s space problems (maintain completeness and optimality):
– Iterative-deepening A* (IDA*): here the cutoff information is the f-cost (g+h) instead of the depth.
– Recursive best-first search (RBFS): a recursive algorithm that attempts to mimic standard best-first search with linear space.
– (Simple) Memory-bounded A* ((S)MA*): drop the worst leaf node when memory is full.


Recursive best-first search

Keeps track of the f-value of the best alternative path available.
– If the current f-value exceeds this alternative f-value, then backtrack to the alternative path.
– Upon backtracking, change the f-value of the abandoned node to the best f-value of its children.
– Re-expansion of this subtree is thus still possible.

Recursive best-first search

function RECURSIVE-BEST-FIRST-SEARCH(problem) return a solution or failure
  return RBFS(problem, MAKE-NODE(INITIAL-STATE[problem]), ∞)

function RBFS(problem, node, f_limit) return a solution or failure and a new f-cost limit
  if GOAL-TEST[problem](STATE[node]) then return node
  successors ← EXPAND(node, problem)
  if successors is empty then return failure, ∞
  for each s in successors do
    f[s] ← max(g(s) + h(s), f[node])
  repeat
    best ← the lowest f-value node in successors
    if f[best] > f_limit then return failure, f[best]
    alternative ← the second-lowest f-value among successors
    result, f[best] ← RBFS(problem, best, min(f_limit, alternative))
    if result ≠ failure then return result
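The pseudocode translates almost line for line to Python. A sketch over an explicit graph (the Romania fragment and hSLD from the A* example play the roles of EXPAND and the heuristic):

```python
import math

# h_SLD and the touched fragment of the Romania map.
H = {'Arad': 366, 'Bucharest': 0, 'Fagaras': 176, 'Oradea': 380,
     'Pitesti': 100, 'Rimnicu Vilcea': 193, 'Sibiu': 253,
     'Timisoara': 329, 'Zerind': 374}
ROADS = {'Arad': {'Sibiu': 140, 'Timisoara': 118, 'Zerind': 75},
         'Sibiu': {'Arad': 140, 'Fagaras': 99, 'Oradea': 151,
                   'Rimnicu Vilcea': 80},
         'Fagaras': {'Sibiu': 99, 'Bucharest': 211},
         'Rimnicu Vilcea': {'Sibiu': 80, 'Pitesti': 97},
         'Pitesti': {'Rimnicu Vilcea': 97, 'Bucharest': 101}}

def rbfs(path, g, f_node, f_limit, goal):
    state = path[-1]
    if state == goal:
        return path, None
    successors = [[max(g + c + H[s], f_node), s, g + c]   # f[s] <- max(g+h, f[node])
                  for s, c in ROADS.get(state, {}).items() if s not in path]
    if not successors:
        return None, math.inf
    while True:
        successors.sort()                                 # lowest f first
        best = successors[0]
        if best[0] > f_limit:
            return None, best[0]                          # failure, new f-cost limit
        alternative = successors[1][0] if len(successors) > 1 else math.inf
        result, best[0] = rbfs(path + [best[1]], best[2], best[0],
                               min(f_limit, alternative), goal)
        if result is not None:
            return result, None

path, _ = rbfs(['Arad'], 0, H['Arad'], math.inf, 'Bucharest')
km = sum(ROADS[a][b] for a, b in zip(path, path[1:]))
# path: Arad -> Sibiu -> Rimnicu Vilcea -> Pitesti -> Bucharest, km == 418
```

Tracing it reproduces the backed-up values on the example slides: Rimnicu Vilcea's f is revised to 417, Fagaras's to 450, and the search then commits to the 418 km path.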


Recursive best-first search, ex.

The path until Rimnicu Vilcea is already expanded.
Above each node: the f-limit for every recursive call.
Below each node: f(n).
The path is followed until Pitesti, which has an f-value worse than the f-limit.

Recursive best-first search, ex.

Unwind the recursion and store the best f-value for the current best leaf Pitesti:
result, f[best] ← RBFS(problem, best, min(f_limit, alternative))
best is now Fagaras. Call RBFS for the new best.
– The best value is now 450.


Recursive best-first search, ex.

Unwind the recursion and store the best f-value for the current best leaf Fagaras:
result, f[best] ← RBFS(problem, best, min(f_limit, alternative))
best is now Rimnicu Vilcea (again). Call RBFS for the new best.
– The subtree is expanded again.
– The best alternative subtree is now through Timisoara.
The solution is found since 447 > 417.

RBFS evaluation

RBFS is a bit more efficient than IDA*.
– Still excessive node generation (mind changes).
Like A*, optimal if h(n) is admissible.
Space complexity is O(bd).
– IDA* retains only one single number (the current f-cost limit).
Time complexity difficult to characterize.
– Depends on the accuracy of h(n) and how often the best path changes.
IDA* and RBFS suffer from using too little memory.


(Simplified) memory-bounded A*

Use all available memory.
– I.e. expand best leaves until the available memory is full.
– When full, SMA* drops the worst leaf node (highest f-value).
– Like RBFS, back up the value of the forgotten node to its parent.
What if all leaves have the same f-value?
– The same node could be selected for expansion and deletion.
– SMA* solves this by expanding the newest best leaf and deleting the oldest worst leaf.
SMA* is complete if a solution is reachable, optimal if the optimal solution is reachable.

Learning to search better

All previous algorithms use fixed strategies.
Agents can learn to improve their search by exploiting the meta-level state space.
– Each meta-level state is an internal (computational) state of a program that is searching in the object-level state space.
– In A* such a state consists of the current search tree.
A meta-level learning algorithm learns from experiences at the meta-level.


Heuristic functions

E.g. for the 8-puzzle:
– Avg. solution cost is about 22 steps (branching factor ± 3).
– Exhaustive search to depth 22: 3.1 × 10^10 states.
– A good heuristic function can reduce the search process.

Heuristic functions

Two commonly used heuristics for the 8-puzzle:
h1 = the number of misplaced tiles
– h1(s) = 8
h2 = the sum of the distances of the tiles from their goal positions (Manhattan distance)
– h2(s) = 3+1+2+2+2+3+3+2 = 18
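Both heuristics are a few lines of code. A sketch, assuming the pictured state is the standard start state from Russell & Norvig (Fig. 3.28), with 0 denoting the blank:

```python
# Assumed example state (read row by row) and the standard goal layout.
state = (7, 2, 4, 5, 0, 6, 8, 3, 1)
goal  = (0, 1, 2, 3, 4, 5, 6, 7, 8)

def h1(s):
    """Number of misplaced tiles (the blank does not count as a tile)."""
    return sum(1 for i, t in enumerate(s) if t != 0 and t != goal[i])

def h2(s):
    """Sum of Manhattan distances of each tile to its goal square."""
    total = 0
    for i, t in enumerate(s):
        if t == 0:
            continue
        gi = goal.index(t)
        total += abs(i // 3 - gi // 3) + abs(i % 3 - gi % 3)
    return total

# h1(state) == 8, h2(state) == 18, matching the slide.
```

Both are admissible: every misplaced tile needs at least one move (h1), and at least its Manhattan distance in moves (h2).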


Heuristic quality

Effective branching factor b*:
– Is the branching factor that a uniform tree of depth d would have in order to contain N+1 nodes:
N + 1 = 1 + b* + (b*)^2 + … + (b*)^d
– The measure is fairly constant for sufficiently hard problems.
– It can thus provide a good guide to the heuristic's overall usefulness.
– A good value of b* is 1.

Heuristic quality and dominance

1200 random problems with solution lengths from 2 to 24.
If h2(n) ≥ h1(n) for all n (both admissible), then h2 dominates h1 and is better for search.
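b* can be computed numerically from N and d. A sketch using bisection; the 52-node, depth-5 example (giving b* ≈ 1.92) is the textbook's illustration, not data from these slides:

```python
def effective_branching_factor(N, d, tol=1e-9):
    """Solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection."""
    def nodes(b):
        return sum(b ** k for k in range(d + 1))
    lo, hi = 1.0, float(N + 1)          # the root lies inside this bracket
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if nodes(mid) < N + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

bstar = effective_branching_factor(52, 5)
# bstar is roughly 1.92
```

Comparing b* for h1 and h2 across the 1200 random problems is exactly how the dominance claim is demonstrated empirically.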


Inventing admissible heuristics

Admissible heuristics can be derived from the exact solution cost of a subproblem of a given problem.
This cost is a lower bound on the cost of the real problem.
Pattern databases store the exact solution cost for every possible subproblem instance.
– The complete heuristic is constructed using the patterns in the DB.

Inventing admissible heuristics

Admissible heuristics can also be derived from the solution cost of a relaxed version of the problem:
– Relaxed 8-puzzle for h1: a tile can move anywhere. As a result, h1(n) gives the length of the shortest solution.
– Relaxed 8-puzzle for h2: a tile can move to any adjacent square. As a result, h2(n) gives the length of the shortest solution.
The optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the real problem.
ABSolver found a useful heuristic for the Rubik's Cube.


Inventing admissible heuristics Local search and optimization

Another way to find an admissible Previously: systematic exploration of search heuristic is through learning from space. experience: – Path to goal is solution to problem – Experience = solving lots of 8-puzzles YET, for some probl ems path is irrel evant . – E.g 8-queens – An inductive learning algorithm can be used to predict costs for other states that arise during search. Different algorithms can be used – Local search


Local search and optimization

Local search = use a single current state and move to neighboring states.
Advantages:
– Uses very little memory.
– Often finds reasonable solutions in large or infinite state spaces.
Also useful for pure optimization problems:
– Find the best state according to some objective function.
– E.g. survival of the fittest as a metaphor for optimization.


Hill-climbing search

"Is a loop that continuously moves in the direction of increasing value."
– It terminates when a peak is reached.
Hill climbing does not look ahead beyond the immediate neighbors of the current state.
Hill-climbing chooses randomly among the set of best successors, if there is more than one.
Hill-climbing is a.k.a. greedy local search.

Hill-climbing search

function HILL-CLIMBING(problem) return a state that is a local maximum
  input: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor


Hill-climbing example

8-queens problem (complete-state formulation).
Successor function: move a single queen to another square in the same column.
Heuristic function h(n): the number of pairs of queens that are attacking each other (directly or indirectly).

a) shows a state with h = 17 and the h-value for each possible successor.
b) A local minimum in the 8-queens state space (h = 1).
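A steepest-descent version of this setup is easy to sketch (a simplification: it takes the first best neighbour rather than a random one among ties, and since h counts conflicts we minimize instead of maximize; as the slides warn, it may halt at a local minimum):

```python
import random

def attacking_pairs(board):
    """h(n): number of pairs of queens attacking each other.
    board[c] is the row of the queen in column c."""
    return sum(1
               for i in range(len(board)) for j in range(i + 1, len(board))
               if board[i] == board[j] or abs(board[i] - board[j]) == j - i)

def hill_climbing(board):
    """Move one queen within its column to the best neighbouring state;
    stop when no neighbour is strictly better (a peak / local minimum)."""
    while True:
        h = attacking_pairs(board)
        neighbours = [board[:c] + (r,) + board[c + 1:]
                      for c in range(len(board)) for r in range(len(board))
                      if r != board[c]]
        best = min(neighbours, key=attacking_pairs)
        if attacking_pairs(best) >= h:
            return board                      # stuck: h may still be > 0
        board = best

random.seed(0)
start = tuple(random.randrange(8) for _ in range(8))
end = hill_climbing(start)
# attacking_pairs(end) <= attacking_pairs(start); a solution has h == 0
```

Running this from many random starts shows the failure rate the next slide quotes: a large fraction of runs end with h > 0.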


Drawbacks

Ridge = a sequence of local maxima difficult for greedy algorithms to navigate.
Plateau = an area of the state space where the evaluation function is flat.
Gets stuck 86% of the time.

Hill-climbing variations

Stochastic hill-climbing
– Random selection among the uphill moves.
– The selection probability can vary with the steepness of the uphill move.
First-choice hill-climbing
– Cf. stochastic hill-climbing, by generating successors randomly until a better one is found.
Random-restart hill-climbing
– Tries to avoid getting stuck in local maxima.


Simulated annealing

Escape local maxima by allowing "bad" moves.
– Idea: allow them, but gradually decrease their size and frequency.
Origin: metallurgical annealing.
Bouncing-ball analogy:
– Shaking hard (= high temperature).
– Shaking less (= lowering the temperature).
If T decreases slowly enough, the best state is reached.
Applied to VLSI layout, airline scheduling, etc.

Simulated annealing

function SIMULATED-ANNEALING(problem, schedule) return a solution state
  input: problem, a problem
         schedule, a mapping from time to temperature
  local variables: current, a node
                   next, a node
                   T, a "temperature" controlling the probability of downward steps
  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← VALUE[next] − VALUE[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
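The pseudocode maps directly onto Python (a sketch; the toy objective and the linear cooling schedule below are illustrative assumptions, not from the slides):

```python
import math
import random

def simulated_annealing(initial, value, neighbour, schedule, max_t=10000):
    """As in the pseudocode: uphill moves are always accepted; a downhill
    move with dE < 0 is accepted with probability e^(dE/T)."""
    current = initial
    for t in range(1, max_t):
        T = schedule(t)
        if T <= 0:
            return current
        nxt = neighbour(current)
        dE = value(nxt) - value(current)
        if dE > 0 or random.random() < math.exp(dE / T):
            current = nxt
    return current

# Toy run: maximise -(x - 3)^2 over x in 0..6 with linear cooling.
random.seed(0)
best = simulated_annealing(
    0,
    lambda x: -(x - 3) ** 2,
    lambda x: min(6, max(0, x + random.choice((-1, 1)))),
    lambda t: max(0.0, 1.0 - t / 1000))
```

As T shrinks, e^(ΔE/T) for a downhill move goes to zero, so the late phase behaves like plain hill climbing, which is exactly the "decrease their size and frequency" idea.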


Local beam search

Keep track of k states instead of one.
– Initially: k random states.
– Next: determine all successors of the k states.
– If any of the successors is a goal → finished.
– Else select the k best from the successors and repeat.
Major difference with random-restart search:
– Information is shared among the k search threads.
Can suffer from lack of diversity.
– Stochastic variant: choose k successors proportionally to state success.

Genetic algorithms

Variant of local beam search with sexual recombination.


Genetic algorithms

function GENETIC-ALGORITHM(population, FITNESS-FN) return an individual
  input: population, a set of individuals
         FITNESS-FN, a function which determines the quality of an individual
  repeat
    new_population ← empty set
    loop for i from 1 to SIZE(population) do
      x ← RANDOM-SELECTION(population, FITNESS-FN)
      y ← RANDOM-SELECTION(population, FITNESS-FN)
      child ← REPRODUCE(x, y)
      if (small random probability) then child ← MUTATE(child)
      add child to new_population
    population ← new_population
  until some individual is fit enough or enough time has elapsed
  return the best individual
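A direct Python rendering of the pseudocode (a sketch; the "OneMax" toy problem, fitness-proportional selection via `random.choices`, and the bit-flip mutation are illustrative assumptions):

```python
import random

def reproduce(x, y):
    """Single-point crossover of two tuples."""
    c = random.randrange(1, len(x))
    return x[:c] + y[c:]

def genetic_algorithm(population, fitness, fit_enough, generations=100,
                      p_mutate=0.1):
    for _ in range(generations):
        weights = [fitness(ind) + 1e-9 for ind in population]  # proportional selection
        new_population = []
        for _ in range(len(population)):
            x, y = random.choices(population, weights=weights, k=2)
            child = reproduce(x, y)
            if random.random() < p_mutate:        # small random probability
                i = random.randrange(len(child))
                child = child[:i] + (1 - child[i],) + child[i + 1:]
            new_population.append(child)
        population = new_population
        if any(fitness(ind) >= fit_enough for ind in population):
            break                                 # some individual is fit enough
    return max(population, key=fitness)

# Toy problem ("OneMax"): 8-bit individuals, fitness = number of 1 bits.
random.seed(0)
population = [tuple(random.randrange(2) for _ in range(8)) for _ in range(20)]
best = genetic_algorithm(population, sum, fit_enough=8)
```

Crossover is the "sexual recombination" that distinguishes this from plain stochastic beam search: a child mixes material from two parents rather than perturbing one.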


Exploration problems

Until now all algorithms were offline.
– Offline = the solution is determined before executing it.
– Online = interleaving computation and action.
Online search is necessary for dynamic and semi-dynamic environments.
– It is impossible to take into account all possible contingencies.
Used for exploration problems:
– Unknown states and actions.
– E.g. any robot in a new environment, a newborn baby,…

Online search problems

Agent knowledge:
– ACTIONS(s): list of allowed actions in state s.
– c(s,a,s'): step-cost function (! only known after s' is determined).
– GOAL-TEST(s).
An agent can recognize previous states.
Actions are deterministic.
Access to an admissible heuristic h(s), e.g. Manhattan distance.


Online search problems The adversary argument

Objective: reach goal with minimal cost – Cost = total cost of travelled path – Competitive ratio=comparison of cost with cost of the solution pppath if search space is known. – Can be infinite in case of the agent accidentally reaches dead ends Assume an adversary who can construct the state space while the agent explores it – Visited states S and A. What next? – Fails in one of the state spaces No algorithm can avoid dead ends in all state spaces.


Online search agents

The agent maintains a map of the environment.
– Updated based on percept input.
– This map is used to decide the next action.

Note the difference with e.g. A*: an online version can only expand the node it is physically in (local order).

Online DF-search

function ONLINE-DFS-AGENT(s') return an action
  input: s', a percept identifying the current state
  static: result, a table indexed by action and state, initially empty
          unexplored, a table that lists, for each visited state, the actions not yet tried
          unbacktracked, a table that lists, for each visited state, the backtracks not yet tried
          s, a, the previous state and action, initially null
  if GOAL-TEST(s') then return stop
  if s' is a new state then unexplored[s'] ← ACTIONS(s')
  if s is not null then do
    result[a,s] ← s'
    add s to the front of unbacktracked[s']
  if unexplored[s'] is empty then
    if unbacktracked[s'] is empty then return stop
    else a ← an action b such that result[b,s'] = POP(unbacktracked[s'])
  else a ← POP(unexplored[s'])
  s ← s'
  return a


Online DF-search, example

Assume a maze problem on a 3×3 grid.
s' = (1,1) is the initial state.
result, unexplored (UX), unbacktracked (UB), … are empty.
s, a are also empty.

Online DF-search, example

GOAL-TEST((1,1))?
– s' = (1,1); s' ≠ G, thus false.
(1,1) a new state?
– True: ACTIONS((1,1)) → UX[(1,1)] = {RIGHT, UP}
s is null?
– True (initially).
UX[(1,1)] empty?
– False.
POP(UX[(1,1)]) → a
– a = UP
s = (1,1); return a.


Online DF-search, example

GOAL-TEST((2,1))?
– s' = (2,1); s' ≠ G, thus false.
(2,1) a new state?
– True: ACTIONS((2,1)) → UX[(2,1)] = {DOWN}
s is null?
– False (s = (1,1)): result[UP,(1,1)] ← (2,1); UB[(2,1)] = {(1,1)}
UX[(2,1)] empty?
– False.
a = DOWN, s = (2,1); return a.

Online DF-search, example

GOAL-TEST((1,1))?
– s' = (1,1); s' ≠ G, thus false.
(1,1) a new state?
– False.
s is null?
– False (s = (2,1)): result[DOWN,(2,1)] ← (1,1); UB[(1,1)] = {(2,1)}
UX[(1,1)] empty?
– False.
a = RIGHT, s = (1,1); return a.


Online DF-search, example

GOAL-TEST((1,2))?
– s' = (1,2); s' ≠ G, thus false.
(1,2) a new state?
– True: UX[(1,2)] = {RIGHT, UP, LEFT}
s is null?
– False (s = (1,1)): result[RIGHT,(1,1)] ← (1,2); UB[(1,2)] = {(1,1)}
UX[(1,2)] empty?
– False.
a = LEFT, s = (1,2); return a.

Online DF-search, example

GOAL-TEST((1,1))?
– s' = (1,1); s' ≠ G, thus false.
(1,1) a new state?
– False.
s is null?
– False (s = (1,2)): result[LEFT,(1,2)] ← (1,1); UB[(1,1)] = {(1,2),(2,1)}
UX[(1,1)] empty?
– True.
UB[(1,1)] empty?
– False: a = b such that result[b,(1,1)] = (1,2), i.e. b = RIGHT
a = RIGHT, s = (1,1) …


Online DF-search

Worst case: each node is visited twice.
An agent can go on a long walk even when it is close to the solution.
An online iterative-deepening approach solves this problem.
Online DF-search works only when actions are reversible.

Online local search

Hill-climbing is already online.
– One state is stored.
Bad performance due to local maxima.
– Random restarts are impossible.
Solution 1: a random walk introduces exploration (but can produce exponentially many steps).


Online local search

Solution 2: add memory to the hill climber.
– Store the current best estimate H(s) of the cost to reach the goal.
– H(s) is initially the heuristic estimate h(s).
– Afterwards updated with experience (see below).
This gives learning real-time A* (LRTA*).

Learning real-time A*

function LRTA*-COST(s,a,s',H) return a cost estimate
  if s' is undefined then return h(s)
  else return c(s,a,s') + H[s']

function LRTA*-AGENT(s') return an action
  input: s', a percept identifying the current state
  static: result, a table indexed by action and state, initially empty
          H, a table of cost estimates indexed by state, initially empty
          s, a, the previous state and action, initially null
  if GOAL-TEST(s') then return stop
  if s' is a new state (not in H) then H[s'] ← h(s')
  unless s is null
    result[a,s] ← s'
    H[s] ← min over b ∈ ACTIONS(s) of LRTA*-COST(s,b,result[b,s],H)
  a ← an action b in ACTIONS(s') that minimizes LRTA*-COST(s',b,result[b,s'],H)
  s ← s'
  return a

