Search Algorithms for Agents

Page 1: Search Algorithms for Agents

Search Algorithms for Agents

Problems that have been addressed by search algorithms can be divided into three classes:

• path-finding problems

• constraint satisfaction problems (CSP)

• two-player games

Page 2: Search Algorithms for Agents

Two-player games

The study of two-player games is obviously related to DAI/multiagent systems in which agents are competitive.

Page 3: Search Algorithms for Agents

CSP & Path-finding

Most algorithms for these two classes were originally developed for a single agent. Among them, what kinds of algorithms would be useful for cooperative problem solving by multiple agents?

Page 4: Search Algorithms for Agents

Search algorithms: graph representation

A search problem can be represented by using a graph.

Some of the search problems can be solved by accumulating local computations for each node in the graph.

Page 5: Search Algorithms for Agents

Asynchronous search algorithms definition

• An asynchronous search algorithm solves a search problem by accumulating local computations.

• The execution order of these local computations can be arbitrary or highly flexible, and they can be executed asynchronously and concurrently.

Page 6: Search Algorithms for Agents

CSP – a quick reminder

• A CSP consists of n variables x1, …, xn, whose values are taken from finite, discrete domains D1, …, Dn, respectively, and a set of constraints on their values.

• The constraint pk(xk1, …, xkj) is a predicate defined on the Cartesian product Dk1 × … × Dkj. The predicate is true iff the value assignment of these variables satisfies the constraint. (A concrete sketch follows.)
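To make the definition concrete, here is a minimal Python sketch of a binary CSP: domains as sets, and each constraint as a predicate over a pair of variables. The variable names and predicate shapes are illustrative, not from the slides.

# Minimal binary-CSP representation: domains as sets, constraints as
# predicates over ordered pairs of variables (illustrative sketch).
domains = {
    "x1": {"red", "blue"},
    "x2": {"red", "blue"},
    "x3": {"red", "blue"},
}

# Each constraint maps a pair of variables to a predicate that is
# true iff the value assignment satisfies the constraint.
constraints = {
    ("x1", "x2"): lambda v1, v2: v1 == v2,   # x1 = x2
    ("x1", "x3"): lambda v1, v3: v1 != v3,   # x1 != x3
}

def satisfies(assignment):
    """Check a complete assignment against all binary constraints."""
    return all(pred(assignment[a], assignment[b])
               for (a, b), pred in constraints.items())

print(satisfies({"x1": "red", "x2": "red", "x3": "blue"}))  # True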

Page 7: Search Algorithms for Agents

CSP

Since constraint satisfaction is NP-complete in general, a trial-and-error exploration of alternatives is inevitable.

For simplicity, we will focus our attention on binary CSPs, i.e., all the constraints are between two variables.

Page 8: Search Algorithms for Agents

Example: binary CSP graph

The figure shows three variables x1, x2, x3 with the constraints x1 = x2 and x1 ≠ x3.

[Figure: constraint graph with nodes x1, x2, x3; an '=' edge between x1 and x2 and a '≠' edge between x1 and x3.]

Page 9: Search Algorithms for Agents

Distributed CSP

• Assuming that the variables of a CSP are distributed among agents, solving the problem consists of achieving coherence among the agents.

• Problems such as multiagent truth maintenance tasks, interpretation problems, and assignment problems can be formalized as distributed CSPs.

Page 10: Search Algorithms for Agents

CSP and asynchronous algorithms

Each process corresponds to one variable. We assume the following communication model:

• Processes communicate by sending messages.
• The delay in delivering a message is finite.
• Between two processes, messages are received in the order in which they were sent.

The processes that have links to xi are called the neighbors of xi.

Page 11: Search Algorithms for Agents

Filtering Algorithm

A process xi performs the following procedure revise(xi, xj) for each neighboring process xj.

procedure revise(xi, xj)
  for all vi in Di do
    if there is no value vj in Dj such that vj is consistent with vi
      then delete vi from Di; end if;
  end do;

• When a value is deleted, the process sends its new domain to its neighboring processes.
• When xi receives a new domain from a neighbor xj, the procedure revise(xi, xj) is performed again.

The execution order of these processes is arbitrary (a runnable sketch follows).
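Below is a single-threaded Python sketch of the filtering algorithm. It is a simulation under assumptions: in the distributed setting each variable is a process and revise calls are triggered by domain messages, whereas here a plain loop runs to a fixed point (valid because the execution order is arbitrary). Constraints are assumed to be stored in both directions, e.g. both ("x1","x2") and ("x2","x1").

def revise(xi, xj, domains, constraints):
    """Delete from Di every value with no consistent partner in Dj.
    Returns True if Di changed; a real process would then send its
    new domain to all of its neighbors."""
    pred = constraints[(xi, xj)]
    removed = {vi for vi in domains[xi]
               if not any(pred(vi, vj) for vj in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

def filtering(domains, constraints):
    """Apply revise until no domain changes.  The order of the calls
    is arbitrary, mirroring the asynchronous execution model."""
    changed = True
    while changed:
        changed = False
        for (xi, xj) in constraints:
            if revise(xi, xj, domains, constraints):
                changed = True

If some domain becomes empty during filtering, the problem is over-constrained, as noted on a later slide.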

Page 12: Search Algorithms for Agents

Filtering example: 3-Queens

[Figure: 3-queens boards for x1, x2, x3, showing the domains shrinking after Revise(x1,x2), Revise(x2,x3), and Revise(x3,x2).]

Page 13: Search Algorithms for Agents

3-Queens example (continued)

[Figure: the boards after Revise(x1,x3) and the subsequent revise operations.]

Page 14: Search Algorithms for Agents

Filtering Algorithm

• If a domain of some variable becomes an empty set, the problem is over-constrained and has no solution.

• If each domain has a unique value, then the remaining values are a solution.

• If there exist multiple values for some variables, we cannot tell whether the problem has a solution or not, and further search is required.

Filtering should be considered a preprocessing procedure that is invoked before the application of other search methods.

Page 15: Search Algorithms for Agents

K-Consistency

A CSP is k-consistent iff given any instantiation of any k-1 variables satisfying all the constraints among them, it is possible to find an instantiation of any kth variable such that these k variable values satisfy all the constraints among them.

If the problem is k-consistent and j-consistent for all j<k, the problem is called strongly k-consistent.

Next, we’ll see an algorithm that transforms a given problem into an equivalent strongly k-consistent problem.

Page 16: Search Algorithms for Agents

Hyper-Resolution-Based Consistency Algorithm

The hyper-resolution rule is described as follows (Ai is a proposition such as x1=1).

A1 ∨ A2 ∨ … ∨ Am
¬(A1 ∧ A11 ∧ …)
¬(A2 ∧ A21 ∧ …)
⋮
¬(Am ∧ Am1 ∧ …)
────────────────────────────
¬(A11 ∧ … ∧ A21 ∧ … ∧ Am1 ∧ …)

In this algorithm, all constraints are represented as nogoods, where a nogood is a prohibited combination of variable values (see the example on the next slide).

Page 17: Search Algorithms for Agents

Graph coloring example

• The constraints between x1 and x2 can be represented as the two nogoods {x1=red, x2=red} and {x1=blue, x2=blue}.
• By using the hyper-resolution rule, from {x1=red, x2=red} and {x1=blue, x3=blue} we obtain the new nogood {x2=red, x3=blue} (a sketch of this step follows the figure).

[Figure: constraint graph with nodes x1, x2, x3, each with domain {blue, red}.]
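The derivation above can be written as a short Python sketch of one hyper-resolution step. The representation of nogoods as frozensets of (variable, value) pairs is an assumption for illustration: given the domain clause of x1 and one nogood ruling out each of its values, the rule produces a new nogood that no longer mentions x1.

def hyper_resolve(variable, domain, nogoods):
    """One hyper-resolution step: pick, for every value v of `variable`,
    a nogood containing (variable, v), and combine the remainders into
    a new nogood that does not mention `variable`."""
    new_nogood = set()
    for v in domain:
        ng = next(ng for ng in nogoods if (variable, v) in ng)
        new_nogood |= ng - {(variable, v)}
    return frozenset(new_nogood)

nogoods = [frozenset({("x1", "red"), ("x2", "red")}),
           frozenset({("x1", "blue"), ("x3", "blue")})]
print(hyper_resolve("x1", ["red", "blue"], nogoods))
# -> the nogood {(x2, red), (x3, blue)} (element order may vary)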

Page 18: Search Algorithms for Agents

Hyper-Resolution-Based Consistency Algorithm

• Each process represents its constraints as nogoods.
• Each process generates new nogoods by combining the information about its domain and the existing nogoods, using the hyper-resolution rule.
• A newly obtained nogood is communicated to the related processes.
• When a new nogood is communicated, the receiving process tries to generate further new nogoods using it.

Page 19: Search Algorithms for Agents

Hyper-Resolution-Based Consistency Algorithm

• A nogood is a prohibited combination of variable values; therefore, a superset of a nogood cannot be a solution either.

• If the empty set becomes a nogood, the problem is over-constrained and has no solution.

The hyper-resolution rule can generate a very large number of nogoods. If we restrict the application of the rule so that only nogoods whose length is less than k are produced, the problem becomes strongly k-consistent.

Page 20: Search Algorithms for Agents

Asynchronous Backtracking

An asynchronous version of the backtracking algorithm, which is a standard method for solving CSPs. The completeness of the algorithm is guaranteed.

• The processes are ordered by the alphabetical order of the variable identifiers. Each process chooses an assignment.
• Each process maintains the current values of the other processes from its viewpoint (the local view). A process changes its assignment if its current value is not consistent with the assignments of the higher-priority processes.
• If there exists no value that is consistent with the higher-priority processes, the process generates a new nogood and communicates it to a higher-priority process.

Page 21: Search Algorithms for Agents

Asynchronous Backtracking

• The local view may contain obsolete information. Therefore, the receiver of a new nogood must check whether the nogood is actually violated in its own local view.

• The main message types communicated among processes are 'ok?', to communicate the current value, and 'nogood', to communicate a new nogood.

Page 22: Search Algorithms for Agents

Asynchronous Backtracking example

[Figure: x1 with domain {1,2}, x2 with domain {2}, and x3 with domain {1,2}, with constraints x1 ≠ x3 and x2 ≠ x3. x1 sends (ok?, (x1,1)) and x2 sends (ok?, (x2,2)) to x3, whose local view becomes {(x1,1), (x2,2)}.]

Page 23: Search Algorithms for Agents

Asynchronous Backtracking example (continued, 1)

[Figure: x3 has no value consistent with its local view {(x1,1), (x2,2)}, so it sends (nogood, {(x1,1), (x2,2)}) to x2. Since x1 is not a neighbor of x2, a new link is created: x2 requests x1 to add it as a neighbor and obtains x1's value, so x2's local view becomes {(x1,1)}.]

Page 24: Search Algorithms for Agents

Asynchronous Backtracking example (continued, 2)

[Figure: x2 finds no value consistent with (x1,1), so it sends (nogood, {(x1,1)}) to x1, which must then change its value.]

Page 25: Search Algorithms for Agents

Asynchronous Backtracking

when received (ok?, (xj, dj)) do
  add (xj, dj) to local_view;
  check_local_view;
end do;

when received (nogood, nogood) do
  record nogood as a new constraint;
  when (xk, dk) in nogood, where xk is not a neighbor, do
    request xk to add xi to its neighbors;
    add xk to neighbors;
    add (xk, dk) to local_view;
  end do;
  check_local_view;
end do;

Page 26: Search Algorithms for Agents

Asynchronous Backtracking

procedure check_local_view
  when local_view and current_value are not consistent do
    if no value in Di is consistent with local_view then
      resolve a new nogood using the hyper-resolution rule and
      send the nogood to the lowest-priority process in the nogood;
      when an empty nogood is found do
        broadcast to the other processes that there is no solution,
        and terminate this algorithm;
      end do;
    else
      select d in Di where local_view and d are consistent;
      current_value ← d;
      send (ok?, (xi, d)) to neighbors;
    end if;
  end do;

Page 27: Search Algorithms for Agents

Asynchronous Weak-Commitment Search

This algorithm introduces a method for dynamically ordering processes so that a bad decision can be revised without an exhaustive search.

• For each process, the initial priority value is 0.
• If there exists no consistent value for xi, the priority value of xi is changed to k+1, where k is the largest priority value of the related processes.
• The order is defined such that a process with a larger priority value has higher priority. If the priority values of two processes are the same, the order is determined by the alphabetical order of the variable identifiers.

Page 28: Search Algorithms for Agents

Asynchronous Weak-Commitment Search

As in asynchronous backtracking, each process concurrently assigns a value to its variable and sends the value to the other processes.

• The priority value, as well as the current assignment, is communicated through the 'ok?' message.
• If the current value is not consistent with the local view, the agent changes its value using the min-conflict heuristic: it selects a value that is consistent with the local view and minimizes the number of constraint violations with the variables of lower-priority processes (a sketch follows).
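A minimal Python sketch of the min-conflict value selection. The predicate and counter arguments are placeholders for this illustration; the real algorithm derives them from the local view and the constraints.

def min_conflict_value(domain, consistent_with_higher, conflicts_with_lower):
    """Among the values consistent with the higher-priority assignments,
    return one that minimizes the number of constraint violations with
    lower-priority variables; None means a new nogood must be resolved."""
    candidates = [v for v in domain if consistent_with_higher(v)]
    if not candidates:
        return None
    return min(candidates, key=conflicts_with_lower)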

Page 29: Search Algorithms for Agents

Asynchronous Weak-Commitment Search

• Each process records the nogoods that have been resolved.

• When xi cannot find a value consistent with its local view, it sends nogood messages to other processes, and increments its priority value only if it has created a new nogood.

Page 30: Search Algorithms for Agents

Asynchronous Weak-Commitment Search example

[Figure: a 4-queens example, panels (a) and (b). In (a) every agent's priority value is 0; in (b) x4 finds no value consistent with the higher-priority agents, so its priority value is changed to 1 and the queens move accordingly.]

Page 31: Search Algorithms for Agents

Asynchronous Weak-Commitment Search example (continued)

[Figure: panels (c) and (d). x3 in turn finds no consistent value and its priority value is changed to 2; the queens are then rearranged until a consistent placement is reached.]

Page 32: Search Algorithms for Agents

Asynchronous Weak-Commitment Search Completeness

The completeness of the algorithm is guaranteed by the fact that the processes record all nogoods found so far.

Handling a large number of nogoods is time- and space-consuming. We can restrict the number of recorded nogoods so that each process records only the most recently found nogoods. In this case, theoretical completeness is not guaranteed. Yet, when the number of recorded nogoods is reasonably large, an infinite processing loop rarely occurs.

Page 33: Search Algorithms for Agents

Path Finding Problem

A path-finding problem consists of the following components:
• A set of nodes N, each representing a state.
• A set of directed links L, each representing an operator available to a problem-solving agent.
• A unique node s called the start node.
• A set of nodes G, each representing a goal state.

Page 34: Search Algorithms for Agents

Path Finding Problem

More definitions:

• h*(i) is the shortest distance from node i to the goal nodes.

• If j is a neighbor of i, the shortest distance via j is given by f*(j) = k(i,j) + h*(j), where k(i,j) is the cost of the link between i and j.

• If i is not a goal node, then h*(i) = min_j f*(j) holds.

Page 35: Search Algorithms for Agents

Asynchronous Dynamic Programming Algorithm

Let us assume the following situation:
• For each node i there exists a corresponding process.
• Each process records h(i), the estimated value of h*(i). The initial value of h(i) is ∞, except for goal nodes.
• For each goal node g, h(g) is 0.
• Each process can refer to the h values of its neighboring nodes.

The algorithm: each process updates h(i) by the following procedure. For each neighboring node j, compute f(j) = k(i,j) + h(j), and update h(i) as follows: h(i) ← min_j f(j). (A sequential simulation is sketched below.)
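A sequential Python simulation of this update rule. This is a sketch under assumptions: a real deployment runs one process per node, whereas here a plain loop sweeps the nodes, which is valid because the execution order is arbitrary. The example graph is hypothetical, not the one from the next slide.

import math

def async_dp(neighbors, cost, goals, sweeps=100):
    """h starts at infinity except at goal nodes; each sweep applies
    h(i) <- min over neighbors j of k(i,j) + h(j)."""
    h = {i: (0 if i in goals else math.inf) for i in neighbors}
    for _ in range(sweeps):              # update order is arbitrary
        for i in neighbors:
            if i not in goals:
                h[i] = min(cost[(i, j)] + h[j] for j in neighbors[i])
    return h

neighbors = {"s": ["a", "b"], "a": ["s", "g"], "b": ["s", "g"], "g": []}
cost = {("s", "a"): 1, ("a", "s"): 1, ("s", "b"): 2, ("b", "s"): 2,
        ("a", "g"): 3, ("g", "a"): 3, ("b", "g"): 1, ("g", "b"): 1}
print(async_dp(neighbors, cost, goals={"g"}))   # h(s) == 3, via b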

Page 36: Search Algorithms for Agents

Asynchronous Dynamic Programming Example

[Figure: an example graph with start node s, goal node g, and intermediate nodes a, b, c, d; link costs are shown on the edges, and the h value at each node converges toward h* after repeated updates.]

Page 37: Search Algorithms for Agents

Asynchronous Dynamic Programming

• If the costs of all links are positive, it is proved that for each node i, h(i) converges to the true value h*(i).

• In reality, the number of nodes can be huge, and we cannot afford to have processes for all nodes.

Page 38: Search Algorithms for Agents

Learning Real-Time A* Algorithm (LRTA*)

As with asynchronous dynamic programming, each agent records the estimated distance h(i). Each agent repeats the following procedure (a sketch follows the list):

1. Lookahead: calculate f(j) = k(i,j) + h(j) for each neighbor j of the current node i.

2. Update: h(i) ← min_j f(j).

3. Action selection: move to the neighbor j that has the minimum f(j) value.
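A Python sketch of one LRTA* trial under these definitions. The graph encoding follows the async_dp sketch above; that encoding is an assumption for illustration, not from the slides.

def lrta_star_trial(start, goals, neighbors, cost, h):
    """Run one trial from start to a goal.  h is a dict of heuristic
    estimates, updated in place so that later trials improve."""
    path, i = [start], start
    while i not in goals:
        # 1. Lookahead: f(j) = k(i,j) + h(j) for each neighbor j.
        f = {j: cost[(i, j)] + h[j] for j in neighbors[i]}
        j_best = min(f, key=f.get)
        # 2. Update: h(i) <- min_j f(j).
        h[i] = f[j_best]
        # 3. Action selection: move to the neighbor with minimum f(j).
        i = j_best
        path.append(i)
    return path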

Page 39: Search Algorithms for Agents

LRTA*

• The initial values of h are determined using an admissible heuristic function.

• With an admissible heuristic function, on a problem with a finite number of nodes in which all link costs are positive and there exists a path from every node to a goal node, completeness is guaranteed.

• Since LRTA* never overestimates, it learns the optimal solutions through repeated trials.

Page 40: Search Algorithms for Agents

Real-Time A* Algorithm (RTA*)

• Similar to LRTA*, except that the update phase is different:

  - instead of setting h(i) to the smallest value of f(j), the second smallest value is assigned to h(i);

  - as a result, RTA* learns more efficiently than LRTA*, but can overestimate heuristic costs.

In a finite space with positive edge costs, in which there exists a path from every state to a goal, and with non-negative admissible initial heuristic values, RTA* is complete (a sketch of the modified update follows).
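Only the update step changes relative to the LRTA* sketch above; here is a sketch of the modified update (the function name and return shape are illustrative):

def rta_star_update(f):
    """RTA* update: return the best neighbor together with the value to
    store in h(i) -- the *second* smallest f(j), which may overestimate
    but discourages revisiting the current state within the same trial."""
    ranked = sorted(f, key=f.get)
    best = ranked[0]
    stored = f[ranked[1]] if len(ranked) > 1 else f[best]
    return best, stored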

Page 41: Search Algorithms for Agents

Moving Target Search (MTS)

• The MTS algorithm is a generalization of LRTA* to the case where the target can move.

• We assume that the problem solver and the target move alternately, and each can traverse at most one edge in a single move.

• The task is accomplished when the problem solver and the target occupy the same node.

• MTS maintains a matrix of heuristic values, representing the function h(x,y) for all pairs of states x and y.

• The matrix is initialized to the values returned by the static evaluation function.

Page 42: Search Algorithms for Agents

MTS

To simplify the following discussion, we assume that all edges in the graph have unit cost.

When the problem solver moves:

1. Calculate h(xj, yi) for each neighbor xj of xi.

2. Update the value of h(xi, yi) as follows:
   h(xi, yi) ← max{ h(xi, yi), min_xj { h(xj, yi) + 1 } }

3. Move to the neighbor xj with the minimum h(xj, yi).

Page 43: Search Algorithms for Agents

MTS

When the target moves:
1. Calculate h(xi, yj) for the target's new position yj.
2. Update the value of h(xi, yi) as follows:
   h(xi, yi) ← max{ h(xi, yi), h(xi, yj) − 1 }
3. Assign yj to yi; yj is the target's new position.

MTS completeness: in a finite problem space with positive edge costs, in which there exists a path from every state to the goal state, starting with non-negative admissible initial heuristic values, and under the other assumptions mentioned above, the problem solver will eventually reach the target (a sketch of both update rules follows).
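A Python sketch of the two MTS update rules under the unit-edge-cost assumption. Storing h as a dict keyed by (x, y) pairs is an illustrative encoding, not from the slides.

def solver_move(x, y, neighbors, h):
    """Problem solver's move: raise h(x,y) if every neighbor looks at
    least as far from the target, then step to the closest neighbor."""
    best = min(neighbors[x], key=lambda xj: h[(xj, y)])
    h[(x, y)] = max(h[(x, y)], h[(best, y)] + 1)
    return best

def target_move(x, y_old, y_new, h):
    """Target's move: since the target traversed at most one edge,
    h(x, y_new) - 1 remains a valid lower bound for h(x, y_old)."""
    h[(x, y_old)] = max(h[(x, y_old)], h[(x, y_new)] - 1)
    return y_new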

Page 44: Search Algorithms for Agents

Real-Time Bidirectional Search Algorithm (RTBS)

• Two problem solvers, one starting from the initial state and one from the goal state, move toward each other.

• Each of them knows its current location and can communicate with the other.

The following steps are executed until the solvers meet:
1. Control strategy: select a forward or a backward move.
2. Forward move: the forward solver moves toward the other solver.
3. Backward move: the backward solver moves toward the other solver.

Page 45: Search Algorithms for Agents

RTBS

There are two categories of RTBS:

1. Centralized RTBS, where the best action is selected from among all the possible moves of the two solvers.

2. Decoupled RTBS, where the two solvers independently make their own decisions.

The evaluation results show that when the heuristic function returns accurate values, decoupled RTBS performs better than centralized RTBS; otherwise, centralized RTBS is better.

Page 46: Search Algorithms for Agents

Is RTBS better than unidirectional search?

• The number of moves for centralized RTBS is around 1/2 of that for real-time unidirectional search in the 15-puzzle, and around 1/6 in the 24-puzzle.

• In mazes, the number of moves for RTBS is double that for unidirectional search.

The key to understanding these results is that RTBS and unidirectional search differ in their problem spaces.

Page 47: Search Algorithms for Agents

RTBS

• We call a pair of locations (x, y) a p-state.
• We call the problem space consisting of p-states the combined problem space.

• A heuristic depression is a set of connected states whose heuristic values are less than or equal to those of the immediately surrounding states.

• The performance of real-time search is sensitive to the topography of the problem space, especially to heuristic depressions.

Page 48: Search Algorithms for Agents

RTBS

Heuristic depressions of the original problem space have been observed to become large and shallow in the combined problem space:

- if the original heuristic depressions are deep, they become large, which makes the problem harder to solve;

- if the original depressions are shallow, they become very shallow, which makes the problem easier to solve.
