
Page 1: Agents and environments

AGENTS AND ENVIRONMENTS

Page 2: Agents and environments

Environment types

Fully observable (vs. partially observable): An agent's sensors give it access to the complete state of the environment at each point in time.

Deterministic (vs. stochastic): The next state of the environment is completely determined by the current state and the action executed by the agent.

Episodic (vs. sequential): The agent's experience is divided into atomic "episodes" (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself.

Page 3: Agents and environments

Environment types

Static (vs. dynamic): The environment is unchanged while an agent is deliberating.

Discrete (vs. continuous): A limited number of distinct, clearly defined percepts and actions.

Single agent (vs. multiagent): An agent operating by itself in an environment.

Adversarial (vs. benign): There is an opponent in the environment who is actively trying to thwart you.

Page 4: Agents and environments

Example

Some of these descriptions can be ambiguous, depending on your assumptions and interpretation of the domain.

Properties considered: Continuous, Stochastic, Partially Observable, Adversarial

Example games: Chess, Checkers, Robot Soccer, Poker, Cards Solitaire, Hide and Seek, Minesweeper

Page 5: Agents and environments

Environment types

                   Chess with   Chess without   Taxi
                   a clock      a clock         driving
Fully observable   Yes          Yes             No
Deterministic      Yes          Yes             No
Episodic           No           No              No
Static             Semi         Yes             No
Discrete           Yes          Yes             No
Single agent       No           No              No?

The real world is partially observable, stochastic, sequential, dynamic, continuous, and multi-agent.

Page 6: Agents and environments

GAMES (I.E., ADVERSARIAL SEARCH)

Page 7: Agents and environments

Games vs. search problems

Search: only had to worry about your own actions.

Games: the opponent's moves are often interspersed with yours; need to consider the opponent's actions.

Games typically have time limits. Often, an OK decision now is better than a perfect decision later.

Page 8: Agents and environments

Games

Card games, strategy games, FPS games, training games, …

Page 9: Agents and environments

Single Player, Deterministic Games

Page 10: Agents and environments

Two-Player, Deterministic, Zero-Sum Games

Zero-sum: one player's gain (or loss) of utility is exactly balanced by the losses (or gains) of utility of the other player(s). E.g., chess, checkers, rock-paper-scissors.

Page 11: Agents and environments

Two-Player, Deterministic, Zero-Sum Games

S0: the initial state

PLAYER(s): defines which player has the move in state s

ACTIONS(s): defines the set of legal moves in state s

RESULT(s, a): the transition model that defines the result of move a in state s

TERMINAL-TEST(s): returns true if the game is over; in that case s is called a terminal state

UTILITY(s, p): a utility function (objective function) that defines the numeric value of terminal state s for player p
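These components can be read as an interface. A minimal sketch in Python, assuming a hypothetical Game base class (the class and method names below are illustrative, not from the slides):

from abc import ABC, abstractmethod

class Game(ABC):
    """Abstract two-player, deterministic, zero-sum game.

    The methods mirror the components listed above: initial state,
    player to move, legal moves, transition model, terminal test, utility.
    """

    @abstractmethod
    def initial_state(self):
        """Return the initial state of the game."""

    @abstractmethod
    def player(self, state):
        """Return which player has the move in `state` (e.g. 'MAX' or 'MIN')."""

    @abstractmethod
    def actions(self, state):
        """Return the set of legal moves in `state`."""

    @abstractmethod
    def result(self, state, action):
        """Transition model: the state that results from taking `action` in `state`."""

    @abstractmethod
    def is_terminal(self, state):
        """Return True if the game is over; such a state is a terminal state."""

    @abstractmethod
    def utility(self, state, player):
        """Numeric value of a terminal `state` for `player`."""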

Page 12: Agents and environments

Minimax

Page 13: Agents and environments

Game tree (2-player, deterministic, turns)

Page 14: Agents and environments

Minimax

Page 15: Agents and environments

Minimax

"Perfect play" for deterministic games.

Idea: choose the move to the position with the highest minimax value = the best achievable payoff against best play.
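As a rough sketch (assuming the hypothetical Game interface outlined earlier; utilities are taken from MAX's point of view):

def minimax_decision(game, state):
    """Choose the move leading to the position with the highest minimax value."""
    return max(game.actions(state),
               key=lambda a: min_value(game, game.result(state, a)))

def max_value(game, state):
    """Value of a state when it is MAX's turn to move."""
    if game.is_terminal(state):
        return game.utility(state, 'MAX')
    return max(min_value(game, game.result(state, a)) for a in game.actions(state))

def min_value(game, state):
    """Value of a state when it is MIN's turn to move."""
    if game.is_terminal(state):
        return game.utility(state, 'MAX')  # utilities are always from MAX's point of view
    return min(max_value(game, game.result(state, a)) for a in game.actions(state))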

Page 16: Agents and environments

Is minimax optimal?

It depends.

If the opponent is not rational, a different strategy could yield a better payoff.

Yes, under the assumption that both players always make the best move.

Page 17: Agents and environments

Properties of minimax

Complete? Yes (if tree is finite)

Space complexity? O(bd) (depth-first exploration)

Optimal? Yes (against an optimal opponent)

Time complexity? O(b^d)

For chess, b ≈ 35, d ≈ 100 for "reasonable" games, so b^d ≈ 35^100 ≈ 10^154: an exact solution is completely infeasible.
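A quick back-of-the-envelope check of that figure (values of b and d as above):

import math

b, d = 35, 100                  # branching factor and depth for a "reasonable" chess game
print(d * math.log10(b))        # ≈ 154.4, i.e. b**d ≈ 10**154 leaf nodes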

Page 18: Agents and environments

How to handle suboptimal opponents?

Can build a model of opponent behavior and use that to guide search, rather than assuming optimal MIN play.

Reinforcement learning (later in the semester) provides another approach

Page 19: Agents and environments

α-β pruning

Do we need to explore every node in the search tree?

Insight: some moves are clearly bad choices

Page 20: Agents and environments

α-β pruning example

Page 21: Agents and environments

α-β pruning example

Page 22: Agents and environments

What is the value of this node?

Page 23: Agents and environments

And this one?

Page 24: Agents and environments

First option is worth 3, so root is at least that good

Page 25: Agents and environments

Now consider the second option

Page 26: Agents and environments

What is this node worth?

Page 27: Agents and environments

At most 2

Page 28: Agents and environments

But what if we had these values instead?

1   99

It doesn't matter: they won't make any difference, so don't look at them.

Page 29: Agents and environments

α-β pruning example

Page 30: Agents and environments

α-β pruning example

Page 31: Agents and environments

α-β pruning example

Page 32: Agents and environments

Why didn’t we check this node first?

Page 33: Agents and environments

Properties of α-β

Pruning does not affect the final result, i.e. it returns the same best move (caveat: only if we can search the entire tree!).

Good move ordering improves the effectiveness of pruning.

With "perfect ordering," time complexity = O(b^(m/2)). Can come close in practice with various heuristics.

Page 34: Agents and environments

Bounding search

Similar to depth-limited search: don't have to search to a terminal state; search to some depth instead.

Find some way of evaluating non-terminal states.
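One way to sketch that change (the depth and eval_fn parameters are illustrative, not from the slides): cut off the recursion at a depth limit and fall back to an evaluation function for non-terminal states.

def depth_limited_value(game, state, depth, eval_fn, maximizing=True):
    """Minimax value with a depth cutoff: non-terminal states at the
    depth limit are scored by a heuristic evaluation function."""
    if game.is_terminal(state):
        return game.utility(state, 'MAX')
    if depth == 0:
        return eval_fn(state)              # estimate instead of searching deeper
    values = (depth_limited_value(game, game.result(state, a),
                                  depth - 1, eval_fn, not maximizing)
              for a in game.actions(state))
    return max(values) if maximizing else min(values)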

Page 35: Agents and environments

Evaluation function

A way of estimating how good a position is.

Humans consider (relatively) few moves and don’t search very deep

But they can play many games well; the evaluation function is key.

A LOT of possibilities for the evaluation function

Page 36: Agents and environments

A simple function for chess

White = 9 * (# queens) + 5 * (# rooks) + 3 * (# bishops) + 3 * (# knights) + (# pawns)

Black = 9 * (# queens) + 5 * (# rooks) + 3 * (# bishops) + 3 * (# knights) + (# pawns)

Utility = White - Black
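The same material count written out as a small sketch (the piece-count dictionaries are a hypothetical way of representing the board, not from the slides):

# Material weights from the slide: queen 9, rook 5, bishop 3, knight 3, pawn 1.
WEIGHTS = {'queens': 9, 'rooks': 5, 'bishops': 3, 'knights': 3, 'pawns': 1}

def material(counts):
    """Weighted piece count for one side; `counts` maps piece type -> number on the board."""
    return sum(WEIGHTS[piece] * n for piece, n in counts.items())

def evaluate(white_counts, black_counts):
    """Utility estimate from White's point of view: White material minus Black material."""
    return material(white_counts) - material(black_counts)

# Example: the initial chess position evaluates to 0 (material is equal).
start = {'queens': 1, 'rooks': 2, 'bishops': 2, 'knights': 2, 'pawns': 8}
assert evaluate(start, start) == 0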

Page 37: Agents and environments

Other ways of evaluating a game position?

Features:

Spaces you control

How compressed your pieces are

Threat-to-you minus threat-to-opponent

How much it restricts the opponent's options

Page 38: Agents and environments

Interesting ordering

Game      Branching factor    Computer quality
Go        360                 << human
Chess     35                  ≈ human
Othello   10                  >> human

Page 39: Agents and environments

Implications

Game      Branching factor    Computer quality
Go        360                 << human
Chess     35                  ≈ human
Othello   10                  >> human

• A larger branching factor makes a game (relatively) harder for computers

• People rely more on evaluation function than on search

Page 40: Agents and environments

Deterministic games in practice

Othello: human champions refuse to compete against computers, who are too good.

Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997.

Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. In 2007 the developers announced that the program had been improved to the point where it cannot lose a game.

Go: human champions refuse to compete against computers, who are too bad.

Page 41: Agents and environments

More on checkers

Checkers has a branching factor of 10. Why isn't the result like Othello?

Complexity of imagining moves: a single move can change a lot of board positions, a limitation that does not affect computers.

Game      Branching factor    Computer quality
Go        360                 << human
Chess     35                  ≈ human
Othello   10                  >> human

Page 42: Agents and environments

Summary

Games are a core (fun) part of AI.

They illustrate several important points about AI and provide good visuals and demos.

Turn-based games (that can fit in memory) are well addressed

Make many assumptions (optimal opponent, turn-based, no alliances, etc.)

Page 43: Agents and environments

Questions?