A Scene Learning and Recognition Framework for RoboCup Kevin Lam 100215552 Carleton University September 6, 2005 M.A.Sc Thesis Defense

A Scene Learning and Recognition Framework for RoboCup

Kevin Lam 100215552
Carleton University

September 6, 2005

M.A.Sc Thesis Defense

Presentation Overview

RoboCup Overview
Objective and Motivation
Contributions
Methodology
  Scene Representation
  Scene Matching
What We Built
Experimental Results
Future Work

RoboCup Overview

Objective & Motivation

Based on Paul Marlow’s work
Observe other agents instead of humans
  Lots of RoboCup teams to choose from (subject to restrictions)
  Advantages: reduce development time, access existing body of knowledge

Can an agent learn by observing and imitating another, with little or no intervention from designers or domain experts?

Contributions

An extensible framework for research in RoboCup agent imitation
  a conversion process (raw logs to higher-level representation)
  customizable scene recognition and matching using k-nearest-neighbor
  a RoboCup client agent based on this scene recognition algorithm
Semi-automated imitation results
Student contributions (SYSC 5103, Fall 2004)

Can an agent learn by observing and imitating another, with little or no intervention from designers or domain experts?

… Yes! (with caveats)

Current Approaches

Most agent development is:
  Hard coded or scripted (e.g. Krislet)
  High-level behaviour descriptions (Steffens)
  Supervised learning situations (Stone)
Some attempts at learning by observation
  ILP rule inducer (Matsui)
    Results not directly reused; complex rules, “OK” results
  COTS data mining (Weka-based)
    Problems: complex trees, hard to describe complex behaviours - tuning/pruning needed

Methodology

Model agent behaviour as a function from inputs to output:
  y = ƒ(a, b, c)
Assumptions
  Deterministic
  Stateless and memory-less (no memory or internal status kept)

Methodology

Observation of Existing Agents
Scenario Representation
  “Scene” spatial knowledge representation
Scenario Recognition
  K-nearest-neighbor search
  “Distance” metric definition
  Scene and Action selection

Agent Observation

(see 267 ((f c) 19.7 5 -0 -0) ((f l t) 79.8 28)

(turn 85.0)

(sense_body 312 (view_mode high normal)

(see 312 ((f c b) 38.1 12) ((f b 0) 42.5 8)

(dash 70.0)

(see 993 ((p "Canada2" 1 goalie) 7.4 32 )

(dash 70.0)

...

Logged messages describe:
  Objects seen
  Actions sent
  Agent internal states
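
The logged messages are parenthesised lists, so a small recursive reader is enough to turn one line into nested Python lists. This is a minimal sketch, not the thesis's converter: the names `tokenize`, `atom`, `parse` and `parse_message` are hypothetical, and the reader is deliberately tolerant of a missing closing parenthesis because the excerpts on this slide are truncated.

```python
def tokenize(text):
    """Split a logged message into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def atom(token):
    """Convert a token to int or float when possible, else a bare string."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token.strip('"')  # team names are quoted in the log


def parse(tokens):
    """Parse one expression from the front of the token list (consumed in place)."""
    token = tokens.pop(0)
    if token != "(":
        return atom(token)
    items = []
    while tokens and tokens[0] != ")":
        items.append(parse(tokens))
    if tokens:  # drop the closing parenthesis; tolerate truncated input
        tokens.pop(0)
    return items


def parse_message(line):
    """Return a logged message such as '(turn 85.0)' as nested lists."""
    tokens = tokenize(line)
    if not tokens:
        raise ValueError("empty message")
    return parse(tokens)
```

For example, `parse_message('(turn 85.0)')` gives `['turn', 85.0]`, and a `see` message becomes a list whose tail holds one entry per observed object.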

Scene Representation

A “scene” is a snapshot of space at a given time
  Up to 6,000 time slices in typical game
In RoboCup, this means a list of objects:
  Players
  Ball
  Goals
  Lines
  Flags
Each object carries:
  Distance
  Direction
  Velocity
  Attributes (team, uniform number etc.)
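
One possible in-memory shape for such a scene is sketched here. The class and field names (`SceneObject`, `Scene`, `of_kind`) are assumptions made for illustration and are not taken from the framework's scene file format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SceneObject:
    kind: str                          # "player", "ball", "goal", "line" or "flag"
    distance: float                    # distance from the observing player
    direction: float                   # direction relative to the observer, in degrees
    velocity: Optional[tuple] = None   # only present when the log reports it
    attributes: tuple = ()             # e.g. (team, uniform number)


@dataclass
class Scene:
    time: int                                     # simulator time slice
    objects: list = field(default_factory=list)   # what the agent saw
    action: Optional[str] = None                  # what the agent did, e.g. "kick"

    def of_kind(self, kind):
        """Return the objects of one type, e.g. all players."""
        return [o for o in self.objects if o.kind == kind]
```

Pairing each scene with the action logged at the same time slice gives the scene-action pairs that the matching step searches.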

Scene Representation

[Figure: example scene, labelled 2723; in: “see …”, out: “kick”]

Discretized Representation

Can discretize scenes into segments
  Size of “slices” sets degree of generalization
  Logical notions; consistent with simulator
  Reduced storage (or better coverage) vs generalization

[Figure: discretized view. Direction columns: Extreme Left, Left, Center, Right, Extreme Right. Distance rows: Far, Nearby, Close. Objects shown: Goal, Team Mate, Opponent, Ball.]
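
A sketch of how an object's distance and direction could be mapped onto such a grid. The five direction labels and three distance labels come from the slide; the bin edges, the sign convention (negative direction meaning left) and the function names are assumptions chosen only to make the example concrete.

```python
from bisect import bisect_right

DIRECTION_LABELS = ["Extreme Left", "Left", "Center", "Right", "Extreme Right"]
DISTANCE_LABELS = ["Close", "Nearby", "Far"]

# Hypothetical bin edges: the slides do not state the actual values.
DIRECTION_EDGES = [-30.0, -10.0, 10.0, 30.0]  # degrees, negative assumed to mean left
DISTANCE_EDGES = [5.0, 20.0]                  # same units as the logged distance


def discretize(distance, direction):
    """Map a continuous (distance, direction) pair to a (row, column) cell."""
    row = bisect_right(DISTANCE_EDGES, distance)
    col = bisect_right(DIRECTION_EDGES, direction)
    return row, col


def describe(cell):
    """Return the human-readable labels for a cell."""
    row, col = cell
    return DISTANCE_LABELS[row], DIRECTION_LABELS[col]
```

Wider bins make more scenes look alike (more generalization, fewer stored cases); narrower bins do the opposite.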

Scene Recognition

Keep stored scenes from observed agent
Find best match between current situation and stored scenes
Use k-nearest-neighbor search

“What should I do in this situation?”

“What did the observed agent do when faced with a situation like this?”
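
A minimal sketch of that lookup, assuming each scene is reduced to a dictionary from object name to a `(distance, direction)` pair. The placeholder metric `scene_distance`, the `missing_penalty` value and the function names are illustrative assumptions, not the framework's own distance calculation.

```python
import heapq


def scene_distance(a, b, missing_penalty=100.0):
    """Placeholder metric: sum of per-object differences between two scenes.

    a and b map an object name to (distance, direction_in_degrees).
    Objects present in only one scene cost a fixed, assumed penalty.
    """
    total = 0.0
    for name in set(a) | set(b):
        if name in a and name in b:
            (r1, t1), (r2, t2) = a[name], b[name]
            total += abs(r1 - r2) + abs(t1 - t2)
        else:
            total += missing_penalty
    return total


def k_nearest(current, stored, k=1):
    """Return the k stored (scene, action) pairs closest to the current scene."""
    return heapq.nsmallest(
        k, stored, key=lambda pair: scene_distance(current, pair[0])
    )
```

With `k=1` the agent simply reuses the action of the single closest stored scene; larger `k` returns several candidates that still have to be reduced to one action.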

“Distance” Metric

Object Pairing
  a with c, b with d
  Separate by type
Continuous vs. Discrete Distance
  Cosine Law (Continuous)
  Euclidean (Cell-based)
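
The two distance styles can be sketched as follows. Objects are seen in polar form (distance and direction from the observer), so the separation between a paired object in two scenes follows from the law of cosines, d² = r₁² + r₂² − 2·r₁·r₂·cos(θ₁ − θ₂); for discretized scenes, a Euclidean distance between grid cells is used instead. Function names and the degree-based angle convention are assumptions.

```python
import math


def continuous_distance(r1, theta1_deg, r2, theta2_deg):
    """Law-of-cosines distance between two points given in polar form."""
    dtheta = math.radians(theta1_deg - theta2_deg)
    squared = r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(dtheta)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


def cell_distance(cell_a, cell_b):
    """Euclidean distance between two (row, column) grid cells."""
    return math.hypot(cell_a[0] - cell_b[0], cell_a[1] - cell_b[1])
```

A full scene distance would sum these per-pair values over the paired objects, possibly scaled by per-type object weights.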

Action Selection

Get k-nearest matching scene-action pairs
Only one action must be selected
Choices:
  Random
  First available
  Weighted majority
Also attributes (direction, power)
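
A sketch of the weighted-majority choice. The slides do not say how votes are weighted, so inverse-distance weighting is an assumption here, as are the function name and the `eps` guard against division by zero.

```python
from collections import defaultdict


def weighted_vote(neighbours, eps=1e-6):
    """Pick one action from (distance, action) pairs by inverse-distance vote."""
    if not neighbours:
        raise ValueError("no neighbours to vote on")
    totals = defaultdict(float)
    for distance, action in neighbours:
        totals[action] += 1.0 / (distance + eps)  # closer scenes count more
    return max(totals, key=totals.get)
```

Action attributes such as turn direction or dash power would need their own rule (for instance, taking them from the closest neighbour that voted for the winning action); that step is left out of this sketch.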

What We Built

Agent based on Krislet
New Brain algorithm:
  Load scene file at startup
  When new info arrives, convert to scene
  Compare with every stored scene
  Pick the “best match” and reuse action
Validator (cross-validation testing)
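
The validator idea can be sketched as a fold-based check: hold out part of the logged scene-action pairs, predict their actions from the rest, and count agreement. This is an illustrative sketch only. It assumes scenes are already reduced to fixed-length numeric feature tuples and uses a plain 1-nearest-neighbour match; the names `nearest_action` and `cross_validate` are hypothetical.

```python
import math


def nearest_action(query, stored):
    """Reuse the action of the stored scene closest to the query features."""
    best = min(stored, key=lambda pair: math.dist(query, pair[0]))
    return best[1]


def cross_validate(pairs, folds=5):
    """Fraction of held-out scenes whose logged action is predicted correctly."""
    if not 2 <= folds <= len(pairs):
        raise ValueError("folds must be between 2 and the number of pairs")
    correct = 0
    for i in range(folds):
        held_out = pairs[i::folds]  # every folds-th pair, starting at i
        training = [p for j, p in enumerate(pairs) if j % folds != i]
        correct += sum(
            nearest_action(features, training) == action
            for features, action in held_out
        )
    return correct / len(pairs)
```

A score like this measures agreement with the logged actions only; it does not say whether the resulting agent looks like the original on the field.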

Krislet-Scenes Architecture

Selection Algorithms

“Distance” Calculation
  Random Selection
  Continuous Distance Object Calculation
  Discretized Distance Object Calculation
  Discretized Ball-Goal Calculation (student contribution)
Action Selection
  Random Selection
  First Available
  Weighted Vote
  Vote with Randomness
Object Matching
  Simple heuristic (sorted by distance to player)

Experiments

Logged three RoboCup agents
  Krislet, NewKrislet, CMUnited
Experimental Parameters
  Distance calculation algorithms
  Action selection algorithms
  k-value {1, 5, 15}
  Object weights {0, 1}
  Discretization size fixed at (5, 3)
Quantitative (validation, real) vs Qualitative

Experimental Results

Best Statistical Results

Agent | Distance Calculation | Object Weights | k and action selection
Krislet | CellBallGoal (~93%) | Ball, or ball and goal | k=1 (choose first valid)
NewKrislet | CellBallGoal (~70%) | Ball, or ball and goal | k=1 or k=5 (equal vote)
CMUnited | Continuous or CellBallGoal (~44%) | Ball, or flags, lines, players | k=5 (equal vote)

Best Qualitative Results

Agent | Parameters | Description
Krislet | CellBallGoal, k=1, ball and goal weights | Looks like Krislet! Able to score goals, difficult to distinguish from original.
NewKrislet | CellBallGoal, or Continuous, k=1 or k=5/random, ball and goal | Strong resemblance but does not copy “stop and wait” behaviour, instead runs constantly
CMUnited | Cell distance, k=1 or k=5/random, ball and goal weights | Sits and turns frequently, wanders to ball, sometimes tries to kick, shows no sign of other “intelligent” team behaviours

Experimental Results

Best parameters:
  Continuous seems to work better than discretization
  k 5 with random selection, or k=1
  Object weighting is critical!
Can successfully imitate Krislet client (difficult for a human observer to distinguish)
  Slightly less responsive, slower to “decide”
Imitates many aspects of NewKrislet Attacker
Copies only very basic behaviours of CMUnited

Limitations

Simplistic object matching algorithm
  Need way to match detailed objects like players, flags
Works best on stateless, deterministic, reactive agents
  Does not consider memory, field position, internal state
  “Single-layered” logic approach
Performance (speed and coverage)
Limited parameter exploration in our tests
Not yet fully automated

Future Work

Application of a “minimum weight perfect matching” algorithm (David Tudino)

Hierarchical scene storage for improved search and storage performance

State detection and learning (e.g. Hidden Markov Model)

Pattern mining within scenes

Better qualitative evaluation metrics (needed for automation)

Conclusions

We contributed an extensible framework for research in RoboCup agent imitation
  a conversion process (raw logs to higher-level representation)
  customizable scene recognition and matching using k-nearest-neighbor
  a RoboCup client agent based on this scene recognition algorithm
Encouraging imitation results (semi-automated)
Lots of direction for future work

Questions?

Krislet Behaviour

Simple decision tree logic
  Turn and look for ball
  Run toward ball
  Turn and look for goal
  Kick ball to goal
No concept of teams or strategies
Stateless
Deterministic

NewKrislet Behaviour

Implement strategies as state machine
“AntiSimpleton” prepackaged strategy
  Attackers wait near center line until ball is near, then kick the ball toward the goals
  Defenders wait along the field; if the ball comes near, they kick it to the attacker and return to their position
Deterministic

CMUnited Behaviour

World Cup Championship Winner!
Layered-learning model
  Passing, dribbling skills at low level
  Strategies at higher level
Formations
Communications
Coach
Player Modeling

Stateless Behaviour?

Model agent as function f(a, b, c) = x
If inputs (a, b, c) usually result in x, the agent is probably stateless
If inputs (a, b, c) sometimes produce x, but other times produce y, there might be two states involved
Subject to probability modeling
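
A rough way to apply this test to logged data is to group observations by input and measure how often the most common action is taken for each input. The sketch below assumes inputs are hashable (for example, discretized scenes); the function names and the 0.9 threshold are illustrative assumptions, not values from the thesis.

```python
from collections import Counter, defaultdict


def action_consistency(observations):
    """For each distinct input, the share of times its most common action occurs."""
    by_input = defaultdict(Counter)
    for inputs, action in observations:
        by_input[inputs][action] += 1
    return {
        inputs: max(counts.values()) / sum(counts.values())
        for inputs, counts in by_input.items()
    }


def looks_stateless(observations, threshold=0.9):
    """True when every input maps to one dominant action at least `threshold` of the time."""
    return all(
        share >= threshold for share in action_consistency(observations).values()
    )
```

An input whose actions split roughly evenly between two choices is a hint that hidden state, or randomness, is influencing the agent.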

Discretizing Scenes: Issues

Potential problems
  Bias introduced
  Boundary values/edging
  Overfitting
