6 Advanced Adversarial Search

6 Advanced Adversarial Search

More Realistic Adversarial Settings Virginia Tech CS4804 Introduction to Artificial Intelligence Outline • Move ordering • Stochastic games • (Partially-observable games) Max Min … MINIMAX(s) = if TERMINAL-TEST(s) then UTILITY(s) if PLAYER(s) = MAX then max of MINIMAX(RESULT(s,a)) for a in ACTIONS(s) if PLAYER(s) = MIN then min of MINIMAX(RESULT(s,a)) for a in ACTIONS(s) 5 ADVERSARIAL SEARCH function MINIMAX-DECISION(state) returns an action IN ALUE ESULT state a return arg maxa ∈ ACTIONS(s) M -V (R ( , )) function MAX-VALUE(state) returns autilityvalue if TERMINAL-TEST(state) then return UTILITY(state) v ←−∞ for each a in ACTIONS(state) do v ← MAX(v,MIN-VALUE(RESULT(s, a))) return v function MIN-VALUE(state) returns autilityvalue if TERMINAL-TEST(state) then return UTILITY(state) v ←∞ for each a in ACTIONS(state) do v ← MIN(v,MAX-VALUE(RESULT(s, a))) return v Figure 5.3 An algorithm for calculating minimax decisions. It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state. The notation argmaxa ∈ S f(a) computes the element a of set S that has the maximum value of f(a). 11 12 Chapter 5. AdversarialSearch function ALPHA-BETA-SEARCH(state) returns an action v ← MAX-VALUE(state, −∞, +∞) return the action in ACTIONS(state)withvaluev function MAX-VALUE(state, α, β) returns autilityvalue if TERMINAL-TEST(state) then return UTILITY(state) v ←−∞ for each a in ACTIONS(state) do v ← MAX(v,MIN-VALUE(RESULT(s,a), α, β)) if v ≥ β then return v α ← MAX(α, v) return v function MIN-VALUE(state, α, β) returns autilityvalue if TERMINAL-TEST(state) then return UTILITY(state) v ← +∞ for each a in ACTIONS(state) do v ← MIN(v,MAX-VALUE(RESULT(s,a),α, β)) if v ≤ α then return v β ← MIN(β, v) return v Figure 5.7 The alpha–beta search algorithm. Notice that these routinesarethesameasthe MINIMAX functions in Figure ??,exceptforthetwolinesineachofMIN-VALUE and MAX-VALUE that maintain α and β (and the bookkeeping to pass these parameters along). Expectiminimax • New player type: Chance Expectinimimax(s)= Utility(s)ifTerminal-Test(s) 8maxa Expectiminimax(Result(s, a)) if Player(s)=Max >mina Expectiminimax(Result(s, a)) if Player(s)=Min <> r Pr(r)Expectiminimax(Result(s, r)) if Player(s)=Chance > :>P MAX CHANCE B . ... ... ... ... 1/36 1/18 1/18 1/36 1,1 1,2 6,5 6,6 MIN . ... ... ... CHANCE C . ... ... ... 1/36 1/18 1/18 1/36 1,1 1,2 6,5 6,6 MAX . ... ... ... TERMINAL 2–1–1 1 1 MAX CHANCE B . ... ... ... ... 1/36 1/18 1/18 1/36 1,1 1,2 6,5 6,6 MIN . ... ... ... CHANCE C . ... ... ... 1/36 1/18 1/18 1/36 1,1 1,2 6,5 6,6 MAX . ... ... ... TERMINAL 2–1–1 1 1 Expectinimimax(s)= Utility(s)ifTerminal-Test(s) 8maxa Expectiminimax(Result(s, a)) if Player(s)=Max >mina Expectiminimax(Result(s, a)) if Player(s)=Min <> r Pr(r)Expectiminimax(Result(s, r)) if Player(s)=Chance > :>P Pruning Chance Nodes • Bounds on true utility function -> bounds on expectation • -10 <= Utility <= 10 • uniform probability die: ? ? ? ? ? ? Pruning Chance Nodes • Bounds on true utility function -> bounds on expectation • -10 <= Utility <= 10 • uniform probability die: 10 ? ? ? ? ? Best case 10 10 10 10 10 10 expectation 10 Worst case 10 -10 -10 -10 -10 -10 expectation -6.67 Pruning Chance Nodes • Bounds on true utility function -> bounds on expectation • -10 <= Utility <= 10 • uniform probability die: 10 10 ? ? ? ? Best case 10 10 10 10 10 10 expectation 10 Worst case 10 10 -10 -10 -10 -10 expectation -3.33 Pruning Chance Nodes • Bounds on true utility function -> bounds on expectation • -10 <= Utility <= 10 • uniform probability die: 10 10 -10 ? ? ? Best case 10 10 -10 10 10 10 expectation 6.67 Worst case 10 10 -10 -10 -10 -10 expectation -3.33 Partial Observations • One approach: simulate perfect information with Chance nodes … … 52! ≈ 8×1067 32! ≈ 3x1035 22! ≈ 4x1026 Summary • Minimax logic works for any move ordering • Expectiminimax adds Chance “player” and uses expected value • Strategy for handling partial information.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    16 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us