An Introduction to ABL-Wargus

Ashwin Ram / Michael Mateas / Charles Isbell

GILA Kickoff 30 May 06

The Wargus Domain

• Real-time strategy (RTS)
– No turns, continuous time
– Massive number of game states and actions
• Open-source clone of Warcraft II
– Built on Stratagus game engine
– Written in C/C++

Wargus As A Testbed

• Actively used in AI research
– Wargus Challenge
– Case-Based Plan Selection (Aha et al)
• Commonly used abstraction layer: Lua (TIELT)
– Build, Train, Upgrade, AttackWithForce, Repair, …
• Better: Wargus C abstraction layer
– Enables tactical level decisions
– Supports Fog of War
– Explicit control of resource collection/allocation
– Unit level control: Stop, StandGround, CastSpell, …
– Building level control: Research, Upgrade, …

Step 1: Planning. HTN Planner produces a hierarchical plan (task network). Screenshots show the scenario as the plan is executed.

Example GILA Scenario: Base building in Wargus

[Diagram labels: Task decomposition, Informs, Blame Assignment]

• Enemy unit (catapult) has longer range: the defense tower network is ineffective
• Reasoning failure: HTN task decomposition failure, cannot handle catapults
• New learning goal: how to beat catapults

Step 2: Meta-Reasoning. Plan failure results in blame assignment, which identifies reasoning failure and corresponding learning goals.

Even the best plans can fail

Learning from observation

[Diagram labels: Barracks, DefeatCatapult, PatrolFootmen, BuildFootmen, New, Learned]

Learning Goal: defeat catapult

CBR Learner
- Uses observation trace to learn
- Creates new case plan for killing catapults

New defense strategy is integrated into full plan

How do humans fight catapults?

catapult1 12, 6 [cat1]
<sighted> [cat1]
footman1 attacks [cat1]
footman2 attacks [cat1]
<damage> footman1 23
<damage> footman2 8
footman3 attacks [cat1]
footman4 attacks [cat1]
<damage> [cat1] 7
<damage> [cat1] 9
......

Observation

Step 3: Goal-Driven Learning. Case-based learner “helps out” HTN planner by learning a case for the “knowledge gap”.

Trace Capture

From HTN Planner

From CBR Learner

Step 4: Integrated Reasoning. System can now handle long-range attackers by combining HTN Planning with CBR Planning.

Adapted plan

GILA-on-Wargus: Architecture

[Diagram labels: Wargus, Sensory-motor abstraction, Wargus, Agent Framework (Agent Runtime, Behavior Library), Enemy Agent (AI/Human Player), Network Connection, Socket communication]

GILA perceives world state and produces agent behaviors in “workflow execution language”

Behaviors are executed in real-time Wargus world

Enemy agents may execute behaviors concurrently over the network

GILA-on-Wargus: Interface

[Diagram labels: Wargus, Enemy Agent (AI/Human Player), Network Connection, Wargus, Agent Framework (Agent Runtime, Behavior Library), Socket communication]

ABL: A Behavior Language

Scalable framework for AI agents in real-time, multi-agent domains.

[Diagram labels: Wargus, Enemy Agent (AI/Human Player), Network Connection, Wargus, ABL (ABL Runtime, Behavior Library)]

The interactive drama Façade contains 200,000+ lines of ABL code. Currently at >300,000 downloads.

ABL has been interfaced with Unreal Tournament, Neverwinter Nights, …

ABL: A Behavior Language

A reactive planning language for real-time, complex agents

ABL features:
• Sequential & parallel behaviors
• Demons & continuously monitored conditions
• Multiple simultaneous goal pursuit with reactivity
• Joint goals and behaviors (joint intentions)
• Reflection (meta-behaviors)
• Sensory-motor abstraction

Mateas & Stern (2004)

Open question: Is ABL the “workflow execution language”?

ABL Overview

[Diagram: ABL runtime components]
• Active Behavior Tree: Root behavior; Goal1, Goal2; Seq. Behavior1; Par. Behavior2; Mental Act; Goal3 (label: Available for execution)
• Behavior Library: Behavior1, Behavior2, …, Behavior n
• Working Memory: WME1, WME2
• Sensors: Sensor1

Example Behaviors

    parallel behavior AttackEnemy() {
        context_condition { e = (UnitWME Visibility == VISIBLE) }
        act attack(e);
        subgoal KeepUnitsEngaged();
        subgoal ConcentrateFire(e);
    }

To attack the enemy:
● Wait for an enemy to come
● Send attack command
● Ensure units do not get stuck or wander
● Make range units concentrate fire

    sequential behavior HealUnit() {
        precondition {
            (UnitWME health < CRITICAL unitID::target location::tcoord)
            (UnitWME canHeal==true mana > 0 unitID::healer location::hcoord spellrange::range)
            (tcoord.distanceto(hcoord) < range)
        }
        act castspell(HEAL, healer, target);
    }

To heal unit:
● Check for the corresponding preconditions for healing
● Cast healing spell

Example Meta-Behaviors

    sequential behavior ProtectBase() {
        precondition {
            attackbeh = (GoalStepWME signature == "AttackEnemyBase()")
            (BaseIsUnderAttack)
        }
        mental_act {
            attackbeh.fail();
        }
    }

● We send an attacking force to invade enemy base
● En route, however, our own base comes under attack
● We cancel the invasion branch of the ABT so that forces can be freed to defend the base.

[Diagram: ABT (Active Behavior Tree) with nodes Root behavior, Par. behavior2, Goal2, Seq. behavior1, Mental Act1, Act2, Seq. behavior3, Goal2; annotations: Fail, Success Test, Succeeds, Mental, Remove, Remove Seq. behavior1]

Goals, Actions, Conditions

• Steps: subgoals, primitive acts, mental acts, wait
– Waits are used with conditions to accomplish demons
– All steps succeed or fail
– Success and failure propagate up the ABT
• Continuously monitored conditions
– Success tests: spontaneously succeed a step if the test is satisfied
– Context conditions: spontaneously fail a behavior if the test is no longer satisfied
– Makes behaviors immediately reactive to changes in the world
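The propagation rules in the bullets above can be illustrated with a minimal sketch. This is not the ABL runtime. The names `AbtSketch`, `Status`, `sequential`, `parallel`, `withSuccessTest`, and `withContextCondition` are hypothetical. The sketch assumes that a sequential behavior stops at its first step that has not succeeded, that a parallel behavior succeeds only when all of its steps succeed, and that a context condition must keep holding for its behavior to continue.

```java
import java.util.List;

final class AbtSketch {
    enum Status { SUCCESS, FAILURE, RUNNING }

    /** Sequential behavior: steps run in order, so the first step that is not SUCCESS decides. */
    static Status sequential(List<Status> steps) {
        for (Status s : steps) {
            if (s != Status.SUCCESS) {
                return s; // FAILURE propagates up; RUNNING means later steps have not started
            }
        }
        return Status.SUCCESS;
    }

    /** Parallel behavior (assumed policy): any failure fails it; it succeeds once all steps succeed. */
    static Status parallel(List<Status> steps) {
        boolean running = false;
        for (Status s : steps) {
            if (s == Status.FAILURE) {
                return Status.FAILURE;
            }
            if (s == Status.RUNNING) {
                running = true;
            }
        }
        return running ? Status.RUNNING : Status.SUCCESS;
    }

    /** Success test: a satisfied test succeeds the step whatever its own status is. */
    static Status withSuccessTest(Status step, boolean testSatisfied) {
        return testSatisfied ? Status.SUCCESS : step;
    }

    /** Context condition: the behavior fails as soon as the condition stops holding. */
    static Status withContextCondition(Status behavior, boolean conditionHolds) {
        return conditionHolds ? behavior : Status.FAILURE;
    }
}
```

For example, `AbtSketch.sequential(Arrays.asList(Status.SUCCESS, Status.FAILURE))` yields `FAILURE`, which the parent goal would then see as a failed behavior.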

Joint Goals And Behaviors

Agent teams need to coordinate action with variable coupling

Some approaches
• Coordinate through sensing (but plan recog. hard)
• Explicitly communicate (but ad hoc)
• Build coordination into decision architecture (but no fine-grained authorial control)

Architecture coordinates author-specified joint action

Negotiation

[Diagram: Agent1’s ABT (Root behavior; Goal1; Joint Goal2; Seq. Behavior1; Joint Behavior2; Mental) and Agent2’s ABT (Root behavior; Seq. Behavior5; Joint Goal2; Joint Behavior6); message label: Intention to enter G2]

Conflicting Intentions

Problem: asynchronous agents enter conflicting states

[Diagram: ABTs of Agent1, Agent2 and Agent3, each with a Root behavior and Joint Goal2; other labels: Seq. Behavior1, Joint Behavior2, Seq. Behavior3, Joint Behavior4, Mental, Goal5, Succeeds, Suspends]

Resolution: intentions are precedence ordered

Inconsistent Subtree Execution

Problem: continuing execution leads to ABT inconsistencies

[Diagram labels: Root behavior, Joint Goal1, Seq. Behavior1, Joint Behavior2, Joint Goal2, Succeeds]

Resolution: freeze subtree
• Initiate exit intention at the subtree root
• Remove all leaf steps
• Deactivate all monitored conditions
• Negotiate removal of all joint goals
• Commit to exit intention at subtree root

Variably Coupled Multi-Mind

[Diagram: Agent1’s ABT (Root behavior; Goal1; Joint Goal2; Seq. Behavior1; Joint Behavior2) and Agent2’s ABT (Root behavior; Goal3; Joint Goal2; Seq. Behavior3; Joint Behavior4; Mental; Goal4)]

• Effects propagate across ABTs
• Effects propagate within ABTs

Tunable spectrum between one-mind and many-minds

A2BL: ABL + Machine Learning

• Turn machine learning into a language-level primitive
– Learn local, behavior-specific policies
• Allow hierarchical mixtures of human-authored and machine-learned behaviors
– Actions chosen by local policy may root entire reactive subtrees
• New language constructs for specifying reinforcement
– Behaviors may be adaptive or non-adaptive
• Adaptive sequential behaviors
– Hierarchical reinforcement learning (HRL)
– Temporal decomposition of Q-function
• Adaptive collection behaviors
– Modular reinforcement learning (MRL)
– Concurrent decomposition of Q-function
• Case-based behavior generation
– Case-based reasoning & learning (CBR)
– Automatic creation of new ABL behaviors
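Concurrent decomposition of the Q-function can be sketched as follows. This is a minimal illustration, not A2BL's implementation. It assumes each concurrent module keeps its own Q-values for the current state and that an arbiter picks the action with the largest summed Q-value, which is one common arbitration rule in modular reinforcement learning. The class and method names are hypothetical.

```java
final class ModularQSketch {
    /**
     * moduleQ[m][a] is module m's Q-value for action a in the current state.
     * Returns the index of the action whose summed Q-value is largest
     * (ties go to the lowest index).
     */
    static int selectAction(double[][] moduleQ) {
        int numActions = moduleQ[0].length;
        int best = 0;
        double bestSum = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < numActions; a++) {
            double sum = 0.0;
            for (double[] q : moduleQ) {
                sum += q[a]; // each module votes with its own value estimate
            }
            if (sum > bestSum) {
                bestSum = sum;
                best = a;
            }
        }
        return best;
    }

    /** One tabular Q-learning update for a single module's value of the chosen action. */
    static double qUpdate(double oldQ, double reward, double maxNextQ,
                          double alpha, double gamma) {
        return oldQ + alpha * (reward + gamma * maxNextQ - oldQ);
    }
}
```

Each module can then be updated with its own reward signal through `qUpdate`, while action choice stays with the shared arbiter.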

Example A2BL code

    adaptive sequential behavior AttackEnemy() {
        success_condition { !(EnemyWME health > 0 visibility == VISIBLE) }
        reward {
            10 if { (UnitWME player != SELF id::theID)
                    (DamageWME receiver == theID) }
            -10 if { (UnitWME player == SELF id::theID)
                     (DamageWME receiver == theID) }
        }
        state {
            (ForceWME units::u location::coord)
            return (u, coord);
        }
        subgoal KeepUnitsEngaged();
        subgoal ConcentrateFire();
        subgoal MoveTowardsEnemy();
        act attack();
        subgoal Retreat();
        subgoal AttackLongRange();
        subgoal AttackShortRange();
    }
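The reward block in the A2BL example above can be read as a per-event score. A minimal sketch follows, assuming the reward is summed over the damage events seen in one decision step. `RewardSketch`, `DamageEvent`, `SELF`, and `reward` are hypothetical names, not A2BL or Wargus API.

```java
import java.util.List;

final class RewardSketch {
    static final int SELF = 0; // hypothetical id of our own player

    /** A damage event, reduced to the owner of the unit that received the damage. */
    static final class DamageEvent {
        final int receiverOwner;

        DamageEvent(int receiverOwner) {
            this.receiverOwner = receiverOwner;
        }
    }

    /** +10 for each damaged unit not owned by SELF, -10 for each damaged unit owned by SELF. */
    static int reward(List<DamageEvent> events) {
        int total = 0;
        for (DamageEvent e : events) {
            total += (e.receiverOwner != SELF) ? 10 : -10;
        }
        return total;
    }
}
```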

ABL-Wargus Communications Interface

[Diagram labels: Wargus, Enemy Agent (AI/Human Player), Network Connection, Wargus, ABL (ABL Runtime, Behavior Library)]

ABL talks to Wargus via a Sensory-Motor System

ABL-Wargus Communication Interface

• ProxyBot class implements low-level socket communication with Wargus
• Sensory-Motor System defines new actions and sensors
• Action classes define primitive actions to be performed on the Wargus side, e.g. Attack, Repair, Follow, Dismiss, Cast Spell
• Actions call ProxyBot's corresponding functions to send messages to Wargus (via socket)
• WMEs (Working Memory Elements) are created corresponding to game state information received
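The layering described above, where an action class delegates to a proxy that writes to a socket, can be sketched as follows. This is an illustration only. The slides do not show ProxyBot's methods or the wire format, so the `MessageSink` interface, the `AttackAction` class, and the `ATTACK <unit> <target>` message text are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

final class SensoryMotorSketch {
    /** Stand-in for the socket layer; a real proxy would write to a socket. */
    interface MessageSink {
        void send(String message);
    }

    /** In-memory sink so the sketch runs without a game server. */
    static final class RecordingSink implements MessageSink {
        final List<String> sent = new ArrayList<>();

        @Override
        public void send(String message) {
            sent.add(message);
        }
    }

    /** A primitive action: translates one agent-level act into one message. */
    static final class AttackAction {
        private final MessageSink proxy;

        AttackAction(MessageSink proxy) {
            this.proxy = proxy;
        }

        void execute(int unitId, int targetId) {
            // Hypothetical message format, not the actual Wargus protocol.
            proxy.send("ATTACK " + unitId + " " + targetId);
        }
    }
}
```

Keeping the socket behind a small interface lets action classes be exercised against a recording sink, as in `new AttackAction(sink).execute(3, 7)`, before they are connected to a running game.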

Example Sensors and WMEs

EnemyInfoWME

    public class EnemyInfoWME extends WME {
        // Fields implied by the getters below; the slide does not show them.
        private boolean race;
        private int numUnits;

        public boolean getRace() { return race; }
        public int getNumUnits() { return numUnits; }
    }

EnemyInfoSensor

    public class EnemyInfoSensor extends WargusSensors {
        // Replace the old enemy WMEs with the current game state.
        private void sense() {
            EnemyInfoWME[] enemyinfo = proxy.GetAllEnemyInfo();
            deleteAllOldWME();
            addallEnemyInfoWME(enemyinfo);
        }

        // Add each received WME to the agent's working memory.
        private void addallEnemyInfoWME(EnemyInfoWME[] enemyinfo) {
            for (int i = 0; i < enemyinfo.length; i++) {
                BehavingEntity.getBehavingEntity().addWME(enemyinfo[i]);
            }
        }
    }

Battle Management Example

GILA faces attack by catapults and ogres.

Catapults are destroying defense towers.

GILA sends knights and archers to defend against attack.

Battle Management Example (cont.)

Knights encounter ogres and stop to attack.

In the meantime, enemy catapults destroy defense towers.

Unsupported units are quickly destroyed.

GILA-Wargus Evaluation

• Use internal Wargus AI as opponent
• Show improvement via integrated learning
– Baseline: Play against Wargus AI (with/without learning)
– Observational learning (human studies)
– Integrated learning (lesion studies)
• Metrics
– Percent of wins
– Kill ratio
– Time to accomplish objective
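The first two metrics reduce to simple ratios. A minimal sketch follows. The slides do not define kill ratio, so the definition used here (enemy units killed divided by own units lost) is an assumption, as are the class and method names.

```java
final class EvalMetricsSketch {
    /** Percent of games won, in [0, 100]; 0 when no games were played. */
    static double percentWins(int wins, int games) {
        if (games <= 0) {
            return 0.0;
        }
        return 100.0 * wins / games;
    }

    /** Assumed definition: enemy units killed per own unit lost. */
    static double killRatio(int enemyKilled, int ownLost) {
        if (ownLost == 0) {
            // No losses: unbounded if anything was killed, otherwise zero.
            return enemyKilled > 0 ? Double.POSITIVE_INFINITY : 0.0;
        }
        return (double) enemyKilled / ownLost;
    }
}
```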
