Search Algorithms for Agents
Sachin Kamboj
CISC 886: MultiAgent Systems, Fall 2004


Page 1:

Search Algorithms for Agents

Sachin Kamboj

CISC 886: MultiAgent Systems
Fall 2004

Page 2:

Outline
- Introduction
- Path-Finding Problems
  - Formal Definition
  - Asynchronous Dynamic Programming
  - Learning Real-Time A*
  - Moving Target Search
  - Real-Time Bidirectional Search
- Constraint Satisfaction Problems
  - Formal Definition
  - Filtering Algorithm
  - Hyper-Resolution Based Consistency Algorithm
  - Asynchronous Backtracking
- Distributed Constraint Optimization Problems
  - Adopt (Asynchronous Distributed Optimization)
  - OptAPO (OPTimal Asynchronous Partial Overlay)

Page 3:

Introduction

Search:
- an umbrella term for various problem-solving techniques in AI
- used when the sequence of actions required to solve a problem is not known a priori, so a trial-and-error exploration of the alternatives is required

Search algorithms are designed to solve three classes of problems:
- Path-finding problems
- Constraint satisfaction problems
- Competitive games

Page 4:

Introduction

A whole set of search algorithms exists for single agents:
- they have known properties (such as time and space complexity)
- they have been used effectively to solve a large number of AI problems
- examples: BFS, DFS, Branch and Bound, A*

So, why use multiple agents?
- Agents have limited rationality:
  - search is often intractable
  - an agent may not have a complete picture of the problem
  - an agent may not have the required computational capability
- Agents may be self-interested

Page 5:

Introduction

Approach:
- If we represent the search problem as a graph, we can solve it by accumulating local computations for each node in the graph.
- Local computations can be executed asynchronously and concurrently.

[Figure: a graph whose nodes are divided among Agent 1, Agent 2, and Agent 3]

Page 6:

Introduction

Advantages of asynchronous search algorithms:
- The local computations needed will fit within the limited rationality of the agents.
- The execution order of these algorithms can be highly flexible and arbitrary.

Page 7:

Path-Finding Problems

Page 8:

Example 1: Finding a path through a maze

[Figure: a maze with a marked Start cell and a Goal cell; the task is to find a path between them]

Page 9:

Example 2: Solving the 8-puzzle problem

[Figure: 8-puzzle boards showing an initial state, the successor states generated by sliding tiles into the blank, and the goal state]

Page 10:

Formal Definition

A path-finding problem consists of the following components:
- A set of nodes, N, each representing a state
- A set of directed links, L, each representing an operator available to a problem-solving agent
- A unique start state, S
- A set of goal states, G
- A set of weights, W, associated with each link, representing the cost of applying the operator (the weight is called the "distance" between the nodes)

Neighbors are nodes that have directed links between them.

Page 11:

Principle of Optimality

A path is optimal if and only if every segment of it is optimal.

Page 12:

Asynchronous Dynamic Programming

Let:
- h*(i) = shortest distance from node i to a goal node
- k(i, j) = cost of the link between i and j
- f*(j) = shortest distance from node i to a goal node via the neighboring node j

Then f*(j) = k(i, j) + h*(j), and by the principle of optimality:

  h*(i) = min_j f*(j)

Asynchronous dynamic programming computes h* by repeating the local computations of each node.

Page 13:

Asynchronous Dynamic Programming

Assumes the following situation:
- For each node i, there exists a process corresponding to i.
- Each process records h(i), the estimated value of h*(i). The initial value of h(i) is arbitrary (e.g., 0), except for the goal nodes: for each goal node g, h(g) is 0.
- Each process can refer to the h values of neighboring nodes (via shared memory or message passing).

Page 14:

Asynchronous Dynamic Programming

Each process updates h(i) by the following procedure (see the code sketch below):
- For each neighboring node j, compute f(j) = k(i, j) + h(j), where
  - h(j) is the current estimated distance from j to a goal node
  - k(i, j) is the cost of the link from i to j
- Update h(i) as follows: h(i) ← min_j f(j)
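A minimal sketch of this per-node update, assuming the graph is stored as dictionaries of neighbors and link costs and that the local computations are simply repeated until nothing changes; the names link_cost, neighbors, and update_node are illustrative, not from the slides.

```python
import math

# link_cost[(i, j)] = cost of the directed link from node i to node j
link_cost = {("a", "b"): 2, ("b", "goal"): 1, ("a", "goal"): 5}
neighbors = {"a": ["b", "goal"], "b": ["goal"], "goal": []}

# h[i] is the current estimate of h*(i); goal nodes start (and stay) at 0,
# every other estimate may start at an arbitrary value, e.g. infinity.
h = {"a": math.inf, "b": math.inf, "goal": 0.0}

def update_node(i):
    """One local computation: h(i) <- min_j ( k(i, j) + h(j) ). Returns True if h(i) changed."""
    if not neighbors[i]:
        return False  # goal (or dead-end) nodes keep their value
    new_h = min(link_cost[(i, j)] + h[j] for j in neighbors[i])
    changed = new_h != h[i]
    h[i] = new_h
    return changed

# Repeating the local computations in any order converges to h* on this finite graph.
while any(update_node(i) for i in h):
    pass
print(h)  # {'a': 3.0, 'b': 1.0, 'goal': 0.0}
```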

Page 15:

Asynchronous Dynamic Programming

Example:

[Figure: a small graph with an initial state and a goal state, link costs on the edges, and the current h estimate written next to each node]

Page 16:

Asynchronous Dynamic Programming

Is the algorithm complete? Yes.
Is the algorithm optimal? Yes.
Are there any problems?
- It cannot be used for reasonably large path-finding problems: we cannot afford to have a process for every node.

Page 17:

Learning Real-Time A*

Used when:
- only one agent is present
- it is not possible to perform local computations for all nodes
- planning and execution need to be interleaved

In this algorithm, the agent selectively executes the computation for its current node, repeating the following procedure (see the code sketch below):
1. Lookahead: calculate f(j) = k(i, j) + h(j) for each neighbor j of the current node i.
2. Update: update the estimate of node i as h(i) ← min_j f(j).
3. Action selection: move to the neighbor j that has the minimum f(j) value. Ties are broken randomly.
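A minimal sketch of one LRTA* trial over an explicit graph, using the same kind of neighbors/link_cost/h dictionaries as the asynchronous dynamic programming sketch above; the function name lrta_star_trial and the tiny example graph are illustrative.

```python
import random

def lrta_star_trial(start, goals, neighbors, link_cost, h):
    """One LRTA* trial: interleave lookahead, update, and movement until a goal is reached."""
    current, path = start, [start]
    while current not in goals:
        # Lookahead: f(j) = k(i, j) + h(j) for each neighbor j of the current node i.
        f = {j: link_cost[(current, j)] + h[j] for j in neighbors[current]}
        # Update: h(i) <- min_j f(j).
        h[current] = min(f.values())
        # Action selection: move to a neighbor with the minimum f(j); ties broken randomly.
        best = min(f.values())
        current = random.choice([j for j, v in f.items() if v == best])
        path.append(current)
    return path

# Tiny illustration, with h initialized optimistically to 0 everywhere.
neighbors = {"a": ["b", "goal"], "b": ["a", "goal"], "goal": []}
link_cost = {("a", "b"): 2, ("b", "a"): 2, ("a", "goal"): 5, ("b", "goal"): 1}
h = {"a": 0, "b": 0, "goal": 0}
print(lrta_star_trial("a", {"goal"}, neighbors, link_cost, h))  # e.g. ['a', 'b', 'goal']
```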

Page 18:

Learning Real-Time A*

Requirement: the initial values of h must be optimistic, i.e. h(i) ≤ h*(i).

Is the algorithm complete?
- Yes. In a finite graph with positive link costs, in which there exists a path from every node to a goal node, and starting with non-negative initial estimates, LRTA* will eventually reach a goal node.

Is the algorithm optimal?
- It requires repeated trials for optimality: if the initial estimates are admissible, then over repeated problem-solving trials the values learned by LRTA* will eventually converge to the actual distances along every optimal path to the goal node.

Page 19:

Moving Target Search

Allows the goal state to change during the course of the search. For example, a robot's task is to reach another robot that is itself moving.

The target robot may:
- cooperatively try to reach the problem-solving robot
- actively avoid the problem-solving robot
- move independently of the problem-solving robot

In order to guarantee success, the problem solver must be able to move faster than the target.

Page 20:

Moving Target Search

MTS is a generalization of LRTA*. The algorithm:
- does NOT maintain a single heuristic estimate of the distance to the target
- instead tries to acquire heuristic information for each potential target location

Thus, MTS maintains a matrix of heuristic values, representing the function h(x, y) for all pairs of states x and y. The matrix is updated on each move of the problem solver and of the target.

Page 21:

Moving Target Search

Let xi and xj be the current and neighboring positions of the problem solver, and let yi and yj be the current and neighboring positions of the target. Assume all edges in the graph have unit cost.

When the problem solver moves:
1. Calculate h(xj, yi) for each neighbor xj of xi.
2. Update the value of h(xi, yi) as follows:
   h(xi, yi) ← max( h(xi, yi), min_xj { h(xj, yi) + 1 } )
3. Move to the neighbor xj with the minimum h(xj, yi), i.e. assign the value of xj to xi. Ties are broken randomly.

Page 22:

Moving Target Search

When the target moves (see the code sketch below):
1. Calculate h(xi, yj) for the target's new position yj.
2. Update the value of h(xi, yi) as follows:
   h(xi, yi) ← max( h(xi, yi), h(xi, yj) − 1 )
3. Adopt the target's new position as the new goal of the problem solver, i.e. assign the value of yj to yi.

Is the algorithm complete? Yes: a problem solver executing MTS is guaranteed to eventually reach the target (provided it can move faster than the target).
Is the algorithm optimal? No.
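A minimal sketch of the two MTS update rules, assuming unit edge costs, an explicit neighbors map, and the heuristic matrix h(x, y) stored as a dictionary keyed by (problem solver position, target position); all names here are illustrative.

```python
import random
from collections import defaultdict

# h[(x, y)]: estimated distance from solver position x to target position y,
# lazily initialized to 0, which is an admissible (optimistic) starting value.
h = defaultdict(int)

def solver_move(x, y, neighbors):
    """Problem solver's step: lookahead over neighbors, update h(x, y), then move."""
    f = {xj: h[(xj, y)] + 1 for xj in neighbors[x]}   # unit edge costs
    h[(x, y)] = max(h[(x, y)], min(f.values()))       # h(x,y) <- max(h(x,y), min_xj h(xj,y) + 1)
    best = min(f.values())
    # Move to a neighbor with minimum estimated distance; ties are broken randomly.
    return random.choice([xj for xj, v in f.items() if v == best])

def target_move(x, y, y_new):
    """When the target moves to y_new, update h(x, y) and adopt the new goal position."""
    h[(x, y)] = max(h[(x, y)], h[(x, y_new)] - 1)     # h(x,y) <- max(h(x,y), h(x,y') - 1)
    return y_new

# One round on a 3-node path graph a - b - c, solver at a, target at c.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
x, y = "a", "c"
x = solver_move(x, y, neighbors)   # the solver steps towards the target
y = target_move(x, y, "b")         # the target moves; the solver's goal is updated
```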

Page 23:

Real-Time Bidirectional Search

Two problem solvers, starting from the initial and goal states, physically move towards each other. Planning and execution are interleaved.

The following steps are repeatedly executed until the two problem solvers meet in the problem space:
1. Control strategy: select a forward move (step 2) or a backward move (step 3).
2. Forward move: the problem solver starting from the initial state (the forward problem solver) moves towards the problem solver starting from the goal state.
3. Backward move: the problem solver starting from the goal state (the backward problem solver) moves towards the problem solver starting from the initial state.

Page 24:

Real-Time Bidirectional Search

Can be classified into two categories:
- Centralized RTBS
  - The best action is selected among all possible moves of the two problem solvers.
  - The control strategy selects which of the two problem solvers to run, depending on what the best action is.
  - Two centralized RTBS algorithms (based on LRTA* and RTA*) can be implemented.
- Decoupled RTBS
  - The two problem solvers independently make their own decisions.
  - The control strategy alternately runs the forward and backward problem solvers.
  - MTS can be used for implementing decoupled RTBS.

Page 25:

Constraint Satisfaction Problems

Page 26:

Example 1: Scheduling a set of tasks

A set of exams needs to be scheduled during the last week of December. No more than 5 exams can be scheduled on a Tuesday, no more than 7 exams on any other day, and so on.

Page 27:

Example 2: Graph-Coloring Problem

Objective: to paint the nodes of a graph so that no two nodes connected by a link have the same color. Each node has a finite number of possible colors.

[Figure: four nodes x1, x2, x3, x4 connected by links, each with the domain { red, blue, yellow }]

Page 28:

Formal Definition

A constraint satisfaction problem consists of:
- A set of n variables, V = { x1, x2, …, xn }
- Discrete, finite domains for each of the variables, D = { D1, D2, …, Dn }
- A set of constraints on the values of the variables. The constraints are defined by predicates
  pk(xk1, xk2, …, xkj), where each pk is a function
  pk : Dk1 × Dk2 × … × Dkj → {0, 1}

The problem is to find an assignment of values to the variables such that all the constraints are satisfied.

Constraint satisfaction is NP-complete in general, so a trial-and-error exploration of alternatives is inevitable.
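A minimal sketch of this formulation for the graph-coloring example: three variables with the same color domain, and one binary not-equal predicate per link, returning 1 when the constraint is satisfied and 0 otherwise. The brute-force enumeration at the end illustrates why trial-and-error exploration is unavoidable in general; all names are illustrative.

```python
from itertools import product

# Variables, domains, and binary constraint predicates p_k : D x D -> {0, 1}.
variables = ["x1", "x2", "x3"]
domains = {v: ["red", "blue", "yellow"] for v in variables}
not_equal = lambda a, b: 1 if a != b else 0
constraints = [(("x1", "x2"), not_equal),
               (("x2", "x3"), not_equal),
               (("x1", "x3"), not_equal)]

def satisfies(assignment):
    """True if every constraint predicate evaluates to 1 under the assignment."""
    return all(p(*(assignment[v] for v in scope)) == 1 for scope, p in constraints)

# Brute-force search over the (exponentially large) space of assignments.
solutions = [dict(zip(variables, values))
             for values in product(*(domains[v] for v in variables))
             if satisfies(dict(zip(variables, values)))]
print(len(solutions))  # 6 proper 3-colorings of a triangle
```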

Page 29:

Relation to DAI

We assume that the variables of the CSP are distributed amongst multiple agents. Many application problems in DAI can be formalized as distributed constraint satisfaction problems, for example:
- interpretation problems
- assignment problems
- multiagent truth maintenance problems

For simplicity, we assume one agent per variable in all the algorithms.

Page 30:

Filtering Algorithm

Each agent communicates its domain to its neighbors and then removes values that cannot satisfy constraints from its domain.

More specifically, a process (agent) xi performs the following procedure revise(xi, xj) for each neighbor xj (see the code sketch below):

procedure revise(xi, xj)
  for all vi ∈ Di do
    if there is no value vj ∈ Dj such that vj is consistent with vi
    then delete vi from Di;
    end if;
  end do;

If some value of the domain is removed by performing the procedure revise, process xi sends the new domain to its neighboring processes. If a new domain is received from a neighbor, procedure revise is called again.
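A minimal sketch of the revise step for the graph-coloring case, where "consistent" simply means "the colors differ"; the data below mirrors the example on the next slide (x1's neighbors x2 and x3 have the singleton domains {red} and {blue}), and all names are illustrative.

```python
def revise(domains, xi, xj, consistent=lambda vi, vj: vi != vj):
    """Remove from Di every value that has no consistent partner in Dj.

    Returns True if Di changed, in which case agent xi would resend its new
    domain to its neighbors, possibly triggering further revise calls there.
    """
    removed = {vi for vi in domains[xi]
               if not any(consistent(vi, vj) for vj in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

domains = {"x1": {"red", "blue", "yellow"}, "x2": {"red"}, "x3": {"blue"}}
revise(domains, "x1", "x2")   # removes red from x1's domain
revise(domains, "x1", "x3")   # removes blue from x1's domain
print(domains["x1"])          # {'yellow'}
```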

Page 31:

Filtering Algorithm

For example:

[Figure: the graph-coloring problem above, where x2's domain is { red }, x3's domain is { blue }, and x1 and x4 still have { red, blue, yellow }]

As a result of the filtering algorithm, x1 will remove red and blue from its domain and x4 will remove blue from its domain.

Page 32:

Filtering Algorithm

- If the domain of some variable becomes the empty set: the problem is over-constrained and has no solution.
- If each domain has a unique value: the assignment of those unique values to the variables is a solution.
- If there exist multiple values for some variable: we cannot tell whether the problem has a solution or not, and further trial-and-error search is required to find one.

Filtering algorithms cannot solve CSPs in general; the algorithm is used as a preprocessing procedure before the application of some other method.

Page 33:

Hyper-Resolution Based Consistency Algorithm

All constraints are represented as "nogoods", i.e. prohibited combinations of variable values. For example, in the figure below:

[Figure: three mutually connected nodes x1, x2, x3, each with the domain { red, blue }]

A constraint between x1 and x2 can be represented using two nogoods:
- {x1 = red, x2 = red}
- {x1 = blue, x2 = blue}

The algorithm uses several existing nogoods and the domain of a variable to generate a new nogood.

Page 34:

Hyper-Resolution Based Consistency Algorithm

For example, using the nogoods:
- {x1 = red, x2 = red}
- {x1 = blue, x3 = blue}

and the domain of x1, {red, blue}, a new nogood {x2 = red, x3 = blue} is generated.

The hyper-resolution rule is described as follows:

  A1 ∨ A2 ∨ … ∨ Am
  ¬(A1 ∧ A11 ∧ …)
  ¬(A2 ∧ A21 ∧ …)
    ⋮
  ¬(Am ∧ Am1 ∧ …)
  ----------------------------------
  ¬(A11 ∧ … ∧ A21 ∧ … ∧ Am1 ∧ …)
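A small sketch of how new nogoods could be generated mechanically from this rule, representing each nogood as a dictionary of variable assignments; the function name hyper_resolve and the data layout are illustrative, not from the slides.

```python
from itertools import product

def hyper_resolve(var, domain, nogoods):
    """Resolve existing nogoods against the domain of `var` to produce new nogoods.

    For every way of picking, for each value of `var`, one nogood containing
    var = value, the remaining assignments are merged into a new nogood
    (combinations with clashing assignments are discarded).
    """
    buckets = [[ng for ng in nogoods if ng.get(var) == value] for value in domain]
    if any(not b for b in buckets):
        return []                      # some value of var is not ruled out
    new_nogoods = []
    for combo in product(*buckets):
        merged, consistent = {}, True
        for ng in combo:
            for v, val in ng.items():
                if v == var:
                    continue           # the resolved variable drops out
                if merged.get(v, val) != val:
                    consistent = False # clashing assignments: skip this combination
                    break
                merged[v] = val
            if not consistent:
                break
        if consistent and merged and merged not in new_nogoods:
            new_nogoods.append(merged)
    return new_nogoods

# The slide's example: nogoods {x1=red, x2=red} and {x1=blue, x3=blue}
# resolved against x1's domain {red, blue}.
print(hyper_resolve("x1", ["red", "blue"],
                    [{"x1": "red", "x2": "red"}, {"x1": "blue", "x3": "blue"}]))
# [{'x2': 'red', 'x3': 'blue'}]
```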

Page 35:

Asynchronous Backtracking

An asynchronous version of the backtracking algorithm, the standard method for solving CSPs:
- Each variable/process is assigned a priority, usually based on the alphabetical order of the variable identifiers.
- Each process selects a value from its domain at random.
- Each process communicates its tentative variable assignment to its neighboring processes.
- If the current value of a process is not consistent with the assignments of the higher-priority processes, the process changes its value.
- If no consistent value exists, the process generates a new nogood and sends it to a higher-priority process; on receiving a nogood, the higher-priority process changes its value.
- Each process maintains the current variable assignments of the other processes in its local_view, which may contain obsolete information.

Page 36:

Asynchronous Backtracking

Two main types of messages are communicated:
- ok? messages, to communicate the current value
- nogood messages, to communicate a new nogood

Example:

[Figure: three variables, x1 and x2 with domains { 1, 2 } and { 2 }, both connected to x3 with domain { 1, 2 }. x1 and x2 send (ok? (x1, 1)) and (ok? (x2, 2)) to x3, whose local_view becomes {(x1, 1), (x2, 2)}. Finding no consistent value, x3 sends (nogood {(x1, 1), (x2, 2)}) to x2, whose local_view is {(x1, 1)}; x2 issues an add-neighbor request and in turn sends (nogood {(x1, 1)}) to x1.]
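A heavily simplified sketch of how one asynchronous backtracking agent might react to an ok? message, assuming every constraint with a higher-priority neighbor is a simple not-equal constraint; nogood resolution and the add-neighbor mechanism are omitted, and all names are illustrative.

```python
def check_local_view(agent):
    """Re-check the agent's value against the higher-priority assignments it believes."""
    higher_values = [agent["local_view"][a] for a in agent["higher"]
                     if a in agent["local_view"]]
    if agent["value"] not in higher_values:
        return  # current value is still consistent; nothing to do
    for d in agent["domain"]:
        if d not in higher_values:
            agent["value"] = d
            agent["outbox"].append(("ok?", agent["name"], d))  # announce to lower-priority neighbors
            return
    # No consistent value exists: the inconsistent local_view becomes a new nogood
    # (in full ABT it is sent to the lowest-priority agent appearing in it).
    agent["outbox"].append(("nogood", dict(agent["local_view"])))

def handle_ok(agent, sender, value):
    """On an ok? message, record the sender's value and re-check consistency."""
    agent["local_view"][sender] = value
    check_local_view(agent)

# The example above: x3 with domain {1, 2} receives x1 = 1 and then x2 = 2.
x3 = {"name": "x3", "value": 1, "domain": [1, 2], "higher": ["x1", "x2"],
      "local_view": {}, "outbox": []}
handle_ok(x3, "x1", 1)   # conflicts with x1, so x3 switches to 2
handle_ok(x3, "x2", 2)   # now no consistent value remains and a nogood is produced
print(x3["outbox"])      # [('ok?', 'x3', 2), ('nogood', {'x1': 1, 'x2': 2})]
```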

Page 37:

Distributed Constraint Optimization Problems

DCOPs are a generalization of constraint satisfaction problems. Like a DCSP, a DCOP includes a set of variables, and each variable is assigned to an agent that has control over its value.
- In a DCSP, the agents assign values to the variables so as to satisfy the constraints on them.
- In a DCOP, the agents must coordinate their choice of values so that a global objective function is optimized.

Applications of DCOP:
- multiagent teamwork
- distributed scheduling
- distributed sensor networks

Page 38:

Distributed Constraint Optimization Problems

Formal Definition

A constraint optimization problem consists of:
- A set of n variables, V = { x1, x2, …, xn }
- Discrete, finite domains for each of the variables, D = { D1, D2, …, Dn }
- A set of cost functions, f = { f1, …, fm }, where each fi is a function
  fi : Di1 × Di2 × … × Dij → ℕ ∪ {∞}

The problem is to find an assignment A* = { d1, …, dn | di ∈ Di } such that the global cost, called F, is minimized. F is defined as follows:

  F(A) = Σ_{i=1..m} fi(A)
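A minimal illustration of the global objective, assuming two small cost functions defined over an assignment dictionary; the functions f1 and f2 and the assignment A are invented for the example.

```python
# An assignment A and two illustrative cost functions.
A = {"x1": "red", "x2": "red", "x3": "blue"}
f1 = lambda a: 0 if a["x1"] != a["x2"] else 1   # soft not-equal constraint on (x1, x2)
f2 = lambda a: 0 if a["x2"] != a["x3"] else 1   # soft not-equal constraint on (x2, x3)

# Global cost F(A) = sum_i fi(A); the DCOP asks for an assignment A* minimizing it.
F = lambda a: sum(f(a) for f in (f1, f2))
print(F(A))  # 1: only the (x1, x2) constraint is violated
```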

Page 39:

Distributed Constraint Optimization Problems

Design criteria for DCOP algorithms:
- Agents should be able to optimize a global function in a distributed fashion using only local communication.
- The agents should operate asynchronously: an agent should not sit idle waiting for a particular message from a particular agent.
- The algorithm should provide provable quality guarantees on system performance.

Page 40:

Adopt (Asynchronous Distributed Optimization)

A generalization of Asynchronous Backtracking with a number of performance improvements.

Adopt starts by assigning a priority to the agents based on a depth-first search tree:
- each node has a single parent and possibly multiple children
- parents have higher priority than their children
- hence, Adopt does not require a linear priority ordering on the agents

Constraints are only allowed between a node and its ancestors or descendants:
- there can be no constraints between different subtrees of the DFS tree
- this is not a restriction on the constraint network itself, since a DFS tree of the constraint graph always places constrained nodes in an ancestor-descendant relation

Page 41:

Adopt (Asynchronous Distributed Optimization)

Example:

[Figure: a constraint graph over x1, x2, x3, x4 and the corresponding DFS tree used to order the agents]

Page 42:

Adopt (Asynchronous Distributed Optimization)

The algorithm begins with all agents choosing their values concurrently. It uses three types of messages:
- VALUE messages: send the currently selected value of the variable to the descendants below the node in the DFS tree; similar to ok? messages in ABT.
- THRESHOLD messages: sent only by a parent to its immediate children; contain a single number representing the backtrack threshold.
- COST messages: a generalization of the nogood messages in ABT; contain the current context (as in ABT) together with the bounds lb and ub.

Page 43:

Adopt (Asynchronous Distributed Optimization)

The algorithm calculates the local cost using the formula

  δ(di) = Σ_{(xj, dj) ∈ CurrentContext} fij(di, dj)

where δ(di) is the local cost at xi when xi chooses the value di.
- This formula calculates the cost of a node only on the basis of the constraints that the node shares with its ancestors (NOT its children).
- This is because the current context is built from the VALUE messages received by the node.

The node xi also calculates LB and UB: the idea is that LB and UB are lower and upper bounds on the cost seen so far for the subtree rooted at xi.

Page 44:

Adopt (Asynchronous Distributed Optimization)

For a leaf node:

  lb(di) = ub(di) = δ(di)

For any other node:

  ∀d ∈ Di:  lb(d) = δ(d) + Σ_{xl ∈ Children} lb(d, xl)

For all nodes:

  LB = min_{d ∈ Di} lb(d)

and similarly for UB. By keeping track of LB and UB, the agent knows the current lower and upper bounds on the cost in the subtree rooted at it. The algorithm uses a threshold value to decide when to backtrack.
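A minimal sketch of how one Adopt agent might combine these quantities, assuming its constraints with ancestors are given as cost functions indexed by ancestor name, and that each child's most recently reported lower bound is cached per value of the agent's own variable; all names and the tiny numbers are illustrative.

```python
def local_cost(d, current_context, cost_fns):
    """delta(d): sum of constraint costs with the ancestors present in the current context."""
    return sum(f(d, current_context[xj])
               for xj, f in cost_fns.items() if xj in current_context)

def lower_bound(d, current_context, cost_fns, child_lb):
    """lb(d) = delta(d) + sum over children of the child's reported lb for value d."""
    return local_cost(d, current_context, cost_fns) + sum(
        lbs.get(d, 0) for lbs in child_lb.values())   # children that have not reported count as 0

def best_lower_bound(domain, current_context, cost_fns, child_lb):
    """LB = min over the agent's own values d of lb(d)."""
    return min(lower_bound(d, current_context, cost_fns, child_lb) for d in domain)

# Tiny illustration: an agent with parent x1 (not-equal cost 1) and children x3, x4.
cost_fns = {"x1": lambda d, d1: 0 if d != d1 else 1}
context = {"x1": "red"}
child_lb = {"x3": {"red": 0, "blue": 1}, "x4": {"red": 2, "blue": 0}}
print(best_lower_bound(["red", "blue"], context, cost_fns, child_lb))  # 1, achieved by d = "blue"
```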

Page 45:

OptAPO

OPTimal Asynchronous Partial Overlay:
- aims to improve the efficiency of earlier DCOP algorithms (e.g., Adopt)
- earlier DCOP algorithms were based on a total separation of the agents' knowledge during the problem-solving process
- OptAPO is based on a partial-centralization technique called cooperative mediation, which allows the agents to extend and overlap the context that they use for making their local decisions

Page 46:

OptAPO

When an agent acts as a mediator, it:
- computes a solution to the overall problem
- recommends value changes to the agents involved in the mediation session

Page 47:

Questions?