![Page 1: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/1.jpg)
Games
Tamara Berg
CS 560 Artificial Intelligence
Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore, Percy Liang, Luke Zettlemoyer
![Page 2: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/2.jpg)
Announcements
• I sent out an email to the class mailing list on Thurs (making the medium and big cheese mazes extra credit).
• HW2 will be released later this week.
![Page 3: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/3.jpg)
Adversarial search (Chapter 5)
World Champion chess player Garry Kasparov is defeated by IBM’s Deep Blue chess-playing computer in a six-game match in May, 1997
© Telegraph Group Unlimited 1997
![Page 4: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/4.jpg)
Why study games?
• Games are a traditional hallmark of intelligence
• Games are easy to formalize
• Games can be a good model of real-world competitive or cooperative activities
– Military confrontations, negotiation, auctions, etc.
![Page 5: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/5.jpg)
Types of game environments
| | Deterministic | Stochastic |
|---|---|---|
| Perfect information (fully observable) | Chess, checkers, go | Backgammon, monopoly |
| Imperfect information (partially observable) | Battleship | Scrabble, poker, bridge |
![Page 6: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/6.jpg)
Alternating two-player zero-sum games
• Players take turns
• Each game outcome or terminal state has a utility for each player (e.g., 1 for win, 0 for loss)
• The sum of both players’ utilities is a constant
![Page 7: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/7.jpg)
Games vs. single-agent search
• We don’t know how the opponent will act
– The solution is not a fixed sequence of actions from start state to goal state, but a strategy or policy (a mapping from state to best move in that state)
• Efficiency is critical to playing well
– The time to make a move is limited
– The branching factor, search depth, and number of terminal configurations are huge
• In chess, branching factor ≈ 35 and depth ≈ 100, giving a search tree of $35^{100} \approx 10^{154}$ nodes
– Number of atoms in the observable universe ≈ $10^{80}$
– This rules out searching all the way to the end of the game
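The size estimate on this slide can be checked with a few lines of arithmetic. The sketch below only reuses the slide's rough figures (b ≈ 35, depth ≈ 100, about 10^80 atoms); the variable names are illustrative and not part of the course material.

```python
import math

# Rough chess figures quoted on the slide.
branching_factor = 35
depth = 100

# A uniform tree with branching factor b has b^m nodes at depth m.
deepest_level_nodes = branching_factor ** depth

# Order of magnitude of the tree: log10(35^100) = 100 * log10(35).
tree_exponent = math.log10(deepest_level_nodes)

# Slide's estimate for the number of atoms in the observable universe.
atoms_exponent = 80

print(f"search tree: about 10^{tree_exponent:.0f} nodes")
print(f"ratio to atom count: about 10^{tree_exponent - atoms_exponent:.0f}")
```

Even one node per atom would fall short by dozens of orders of magnitude, which is why the later slides turn to pruning.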
![Page 8: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/8.jpg)
Search problem components
• Initial state
• Actions
• Transition model
– What state results from performing a given action in a given state?
• Goal state
• Path cost
– Assume that it is a sum of nonnegative step costs
• The optimal solution is the sequence of actions that gives the lowest path cost for reaching the goal
[Figure: an initial state and a goal state]
![Page 9: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/9.jpg)
Deterministic Games
Many possible formulations, e.g.:
• States: S (start at initial state)
• Players: P = {1…N} (usually take turns)
• Actions: A (may depend on player/state)
• Transition Model: S × A → S
• Terminal Test: S → {t, f}
• Terminal Utilities: S × P → R
Solution for a player is a policy: S → A
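To make the formulation concrete, here is a minimal sketch of these components for a hypothetical toy game that does not appear in the slides: two players alternately remove 1 or 2 sticks from a pile, and whoever takes the last stick wins. All names (`State`, `actions`, `result`, and so on) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy game (not from the slides): players alternately remove
# 1 or 2 sticks from a pile; whoever takes the last stick wins.


@dataclass(frozen=True)
class State:
    sticks: int   # sticks left in the pile
    to_move: int  # index of the player whose turn it is (0 or 1)


def initial_state(sticks: int = 5) -> State:
    """States S: the game starts at the initial state."""
    return State(sticks=sticks, to_move=0)


def actions(state: State) -> List[int]:
    """Actions A: the legal moves depend on the state."""
    return [n for n in (1, 2) if n <= state.sticks]


def result(state: State, action: int) -> State:
    """Transition model S x A -> S."""
    return State(state.sticks - action, 1 - state.to_move)


def is_terminal(state: State) -> bool:
    """Terminal test S -> {t, f}."""
    return state.sticks == 0


def utility(state: State, player: int) -> float:
    """Terminal utilities S x P -> R: 1 for a win, 0 for a loss."""
    winner = 1 - state.to_move  # the player who just moved took the last stick
    return 1.0 if player == winner else 0.0
```

In this toy game the two utilities always sum to 1, so it is also an instance of the alternating two-player zero-sum setting described earlier.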
![Page 10: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/10.jpg)
Deterministic Single-Player
Deterministic, single player, perfect information (e.g. 8-puzzle, Rubik’s cube):
• Know the rules, know what actions do, know when you win
… it’s just search!
Slight reinterpretation:
• Each node stores a value: the best outcome it can reach
• This is the maximal value of its children
• Note we don’t have path sums as before (utilities at end)
After search, pick moves that lead to best outcome
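The reinterpretation above can be sketched as a short recursion. The nested-list tree encoding and the example utilities below are assumptions made for illustration; they are not taken from the slides.

```python
from typing import List, Union

# A tree is either a terminal utility (a number) or a list of child subtrees.
Tree = Union[float, List["Tree"]]


def max_value(node: Tree) -> float:
    """Best outcome reachable from this node: the maximum over its children."""
    if not isinstance(node, list):
        return node  # terminal node: the utility is stored at the end
    return max(max_value(child) for child in node)


def best_move(node: List[Tree]) -> int:
    """Index of the child that leads to the best reachable outcome."""
    return max(range(len(node)), key=lambda i: max_value(node[i]))


# Made-up example tree with utilities only at the leaves.
tree = [[2, 5], [8, [1, 3]], [4]]
print(max_value(tree), best_move(tree))
```

Note that no path costs are summed along the way; only the terminal utilities matter.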
![Page 11: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/11.jpg)
![Page 12: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/12.jpg)
![Page 13: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/13.jpg)
Game tree
• A game of tic-tac-toe between two players, “MAX” (X) and “MIN” (O)
![Page 14: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/14.jpg)
![Page 15: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/15.jpg)
A more abstract game tree
A two-ply game
[Figure: two-ply game tree with terminal utilities (for MAX); the MIN nodes have values 3, 2, 2 and the root has value 3]
![Page 16: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/16.jpg)
A more abstract game tree
• Minimax value of a node: the utility (for MAX) of being in the corresponding state, assuming perfect play on both sides
• Minimax strategy: Choose the move that gives the best worst-case payoff
[Figure: the same two-ply game tree; MIN node values 3, 2, 2 and root value 3]
![Page 17: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/17.jpg)
Computing the minimax value of a node
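$$
\text{Minimax}(node) =
\begin{cases}
\text{Utility}(node) & \text{if } node \text{ is terminal} \\
\max_{action} \text{Minimax}(\text{Succ}(node, action)) & \text{if player = MAX} \\
\min_{action} \text{Minimax}(\text{Succ}(node, action)) & \text{if player = MIN}
\end{cases}
$$
[Figure: the two-ply game tree; MIN node values 3, 2, 2 and root value 3]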
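The recursive definition translates almost directly into code. This is a minimal sketch under two assumptions: trees are encoded as nested lists with numeric leaves, and the leaf utilities are those of the standard textbook two-ply example (3, 12, 8 / 2, 4, 6 / 14, 5, 2). The transcript only preserves the backed-up values 3, 2, 2 and 3, so the leaves are assumed rather than quoted.

```python
def minimax(node, maximizing):
    """Minimax value of a node, assuming perfect play on both sides."""
    if not isinstance(node, list):
        return node  # terminal node: Utility(node)
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Assumed leaf utilities (for MAX); one inner list per MIN node.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

min_values = [minimax(subtree, False) for subtree in tree]
root_value = minimax(tree, True)
print(min_values, root_value)
```

With these leaves the MIN nodes back up 3, 2 and 2, and MAX picks the first move for a value of 3, matching the figure.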
![Page 18: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/18.jpg)
Optimality of minimax
• The minimax strategy is optimal against an optimal opponent
• What if your opponent is suboptimal?
– Your utility can never be lower than if you were playing an optimal opponent!
– A different strategy may work better for a sub-optimal opponent, but it will do no better (and generally worse) against an optimal opponent
Example from D. Klein and P. Abbeel
![Page 19: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/19.jpg)
More general games
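• More than two players, non-zero-sum
• Utilities are now tuples
• Each player maximizes their own utility at their node
• Utilities get propagated (backed up) from children to parents
[Figure: game tree with utility tuples. The leaf pair (4,3,2), (7,4,1) backs up to (4,3,2); the leaf pair (1,5,2), (7,7,1) backs up to (1,5,2); the value above them is (4,3,2)]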
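The backup rule for tuple utilities can be sketched as follows. The utility tuples are the ones visible on the slide, but which player moves at which node is an assumption here (player 0 at the top, player 2 just above the leaves), chosen because it reproduces the backed-up values shown. The `Node` class and `backed_up` function are illustrative names.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Utilities = Tuple[int, ...]


@dataclass
class Node:
    player: int  # index of the player who chooses at this node (assumed)
    children: List[Union["Node", Utilities]]


def backed_up(node):
    """Each player picks the child whose tuple maximizes their own component."""
    if isinstance(node, tuple):
        return node  # leaf: one utility per player
    child_values = [backed_up(child) for child in node.children]
    return max(child_values, key=lambda u: u[node.player])


# Tuples from the slide; the player assignment is an assumption.
tree = Node(player=0, children=[
    Node(player=2, children=[(4, 3, 2), (7, 4, 1)]),
    Node(player=2, children=[(1, 5, 2), (7, 7, 1)]),
])

print(backed_up(tree))
```

Under this assignment the two lower nodes back up (4,3,2) and (1,5,2), and the top node keeps (4,3,2) because it gives player 0 the larger first component.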
![Page 20: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/20.jpg)
![Page 21: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/21.jpg)
Alpha-beta pruning
• It is possible to compute the exact minimax decision without expanding every node in the game tree
![Page 22: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/22.jpg)
Alpha-beta pruning (continued)
[Figure: pruning walkthrough; values shown so far: root 3, first MIN node 3]
![Page 23: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/23.jpg)
Alpha-beta pruning (continued)
[Figure: pruning walkthrough; values shown so far: root 3, MIN nodes 3 and 2]
![Page 24: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/24.jpg)
Alpha-beta pruning (continued)
[Figure: pruning walkthrough; values shown so far: root 3, MIN nodes 3, 2 and 14]
![Page 25: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/25.jpg)
Alpha-beta pruning (continued)
[Figure: pruning walkthrough; values shown so far: root 3, MIN nodes 3, 2 and 5]
![Page 26: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/26.jpg)
Alpha-beta pruning (continued)
[Figure: pruning walkthrough; final values: root 3, MIN nodes 3, 2 and 2]
![Page 27: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/27.jpg)
General alpha-beta pruning
• Consider a node n in the tree
• If player has a better choice at:
– Parent node of n
– Or any choice point further up
• Then n will never be reached in play.
• Hence, when that much is known about n, it can be pruned.
![Page 28: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/28.jpg)
Alpha-beta Algorithm
• Depth first search
– only considers nodes along a single path from root at any time
α = highest-value choice found at any choice point of path for MAX (initially, α = −∞)
β = lowest-value choice found at any choice point of path for MIN (initially, β = +∞)
• Pass current values of α and β down to child nodes during search.
• Update values of α and β during search as:
– MAX updates α at MAX nodes
– MIN updates β at MIN nodes
• Prune remaining branches at a node when α ≥ β
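The steps above can be sketched as a depth-first recursion that passes α and β down to children and stops expanding a node once α ≥ β. This is a minimal sketch, not the course's reference implementation: the nested-list tree encoding and the example leaves (the standard textbook two-ply tree) are assumptions.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node`, skipping branches that cannot affect it."""
    if not isinstance(node, list):
        return node  # terminal node: return its utility
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)  # MAX updates alpha at MAX nodes
            if alpha >= beta:
                break  # prune the remaining children
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)  # MIN updates beta at MIN nodes
        if alpha >= beta:
            break  # prune the remaining children
    return value


# Assumed example: one inner list per MIN node below a MAX root.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))
```

Because α and β are passed as arguments, each call only ever sees the bounds along its own path from the root, which is the "single path" property in the first bullet.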
![Page 29: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/29.jpg)
Alpha-Beta Example Revisited
α, β initial values
Do DF-search until first leaf
Root (MAX): α=−∞, β=+∞
First MIN node: α=−∞, β=+∞
α, β passed to kids
![Page 30: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/30.jpg)
Alpha-Beta Example (continued)
MIN updates β, based on kids
Root (MAX): α=−∞, β=+∞
First MIN node: α=−∞, β=3
![Page 31: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/31.jpg)
Alpha-Beta Example (continued)
First MIN node: α=−∞, β=3
MIN updates β, based on kids. No change.
Root (MAX): α=−∞, β=+∞
![Page 32: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/32.jpg)
Alpha-Beta Example (continued)
First MIN node: α=−∞, β=3
MIN updates β, based on kids. No change.
Root (MAX): α=−∞, β=+∞
![Page 33: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/33.jpg)
Alpha-Beta Example (continued)
MAX updates α, based on kids.
Root (MAX): α=3, β=+∞
3 is returned as node value.
![Page 34: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/34.jpg)
Alpha-Beta Example (continued)
Root (MAX): α=3, β=+∞
Second MIN node: α=3, β=+∞
α, β passed to kids
![Page 35: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/35.jpg)
Alpha-Beta Example (continued)
Root (MAX): α=3, β=+∞
Second MIN node: α=3, β=2
MIN updates β, based on kids.
![Page 36: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/36.jpg)
Alpha-Beta Example (continued)
Second MIN node: α=3, β=2
α ≥ β, so prune.
Root (MAX): α=3, β=+∞
![Page 37: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/37.jpg)
Alpha-Beta Example (continued)
2 is returned as node value.
MAX updates α, based on kids. No change.
Root (MAX): α=3, β=+∞
![Page 38: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/38.jpg)
Alpha-Beta Example (continued)
Root (MAX): α=3, β=+∞
Third MIN node: α=3, β=+∞
α, β passed to kids
![Page 39: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/39.jpg)
Alpha-Beta Example (continued)
Third MIN node: α=3, β=14
Root (MAX): α=3, β=+∞
MIN updates β, based on kids.
![Page 40: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/40.jpg)
Alpha-Beta Example (continued)
Third MIN node: α=3, β=5
Root (MAX): α=3, β=+∞
MIN updates β, based on kids.
![Page 41: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/41.jpg)
Alpha-Beta Example (continued)
Root (MAX): α=3, β=+∞
Third MIN node: α=3, β=2
MIN updates β, based on kids.
![Page 42: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/42.jpg)
Alpha-Beta Example (continued)
Root (MAX): α=3, β=+∞
2 is returned as node value.
![Page 43: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/43.jpg)
Alpha-Beta Example (continued)
2
![Page 44: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/44.jpg)
Alpha-beta pruning
• Pruning does not affect final result
• Amount of pruning depends on move ordering
![Page 45: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/45.jpg)
Example
Leaf values (left to right): 3 4 1 2 7 8 5 6
Which nodes can be pruned?
![Page 46: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/46.jpg)
Answer to Example
Leaf values (left to right): 3 4 1 2 7 8 5 6
Tree levels (top to bottom): Max, Min, Max
Which nodes can be pruned?
Answer: NONE! Because the most favorable nodes for both are explored last (i.e., in the diagram, are on the right-hand side).
![Page 47: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/47.jpg)
Second Example (the exact mirror image of the first example)
Leaf values (left to right): 6 5 8 7 2 1 3 4
Which nodes can be pruned?
![Page 48: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/48.jpg)
Answer to Second Example (the exact mirror image of the first example)
Leaf values (left to right): 6 5 8 7 2 1 3 4
Tree levels (top to bottom): Max, Min, Max
Which nodes can be pruned?
Answer: LOTS! Because the most favorable nodes for both are explored first (i.e., in the diagram, are on the left-hand side).
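Both answers can be checked by running alpha-beta and recording which leaves it actually evaluates. The sketch below assumes the trees are binary with Max, Min, Max levels from the root down, which is what eight leaves and the level labels on the slides suggest. The nested-list encoding and the function names are illustrative.

```python
import math


def alphabeta(node, maximizing, alpha, beta, visited):
    """Alpha-beta search that appends every evaluated leaf to `visited`."""
    if not isinstance(node, list):
        visited.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining children are pruned
    return value


def run(tree):
    """Return (root value, leaves evaluated in order)."""
    visited = []
    value = alphabeta(tree, True, -math.inf, math.inf, visited)
    return value, visited


# Assumed shape: Max root, two Min nodes, four Max nodes, eight leaves.
first = [[[3, 4], [1, 2]], [[7, 8], [5, 6]]]
second = [[[6, 5], [8, 7]], [[2, 1], [3, 4]]]

for name, tree in (("first", first), ("second", second)):
    value, visited = run(tree)
    print(name, "value:", value, "leaves evaluated:", visited)
```

Under these assumptions both trees have the same root value, but the first ordering evaluates all eight leaves while the mirrored ordering skips the leaf 7 and the whole subtree holding 3 and 4.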
![Page 49: Games Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore,](https://reader035.vdocuments.net/reader035/viewer/2022062408/56649eb35503460f94bba55c/html5/thumbnails/49.jpg)
Alpha-beta pruning
• Pruning does not affect final result
• Amount of pruning depends on move ordering
– Should start with the “best” moves (highest-value for MAX or lowest-value for MIN)
– For chess, can try captures first, then threats, then forward moves, then backward moves
– Can also try to remember “killer moves” from other branches of the tree
• With perfect ordering, the time to find the best move is reduced to $O(b^{m/2})$ from $O(b^m)$
– Depth of search you can afford is effectively doubled
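The "depth is effectively doubled" claim follows from $b^{m/2} = (\sqrt{b})^m$: with perfect ordering the search behaves as if the branching factor were $\sqrt{b}$. A small sketch of the arithmetic, using the chess branching factor from earlier in the deck and a hypothetical node budget (the budget value and function names are assumptions for illustration):

```python
import math


def effective_branching_factor(b: float) -> float:
    """With perfect ordering, about b^(m/2) = (sqrt(b))^m nodes are examined."""
    return math.sqrt(b)


def reachable_depth(node_budget: float, b: float) -> float:
    """Depth m at which b^m equals the node budget."""
    return math.log(node_budget) / math.log(b)


b = 35                # chess branching factor quoted earlier
node_budget = 1e9     # hypothetical number of nodes we can afford to examine

plain_depth = reachable_depth(node_budget, b)
ordered_depth = reachable_depth(node_budget, effective_branching_factor(b))

print(f"effective branching factor: {effective_branching_factor(b):.2f}")
print(f"depth without pruning: {plain_depth:.1f}")
print(f"depth with perfectly ordered alpha-beta: {ordered_depth:.1f}")
```

The second depth is exactly twice the first for any budget, since halving the exponent of $b$ doubles the $m$ that fits.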