UNIT 2 : AI Problem Solving


Page 1: UNIT 2 : AI Problem Solving

UNIT 2 : AI Problem Solving

Define the problem precisely, including a specification of the initial situation and of the final situation that constitutes a solution of the problem.

Analyze the problem to find a few important features that bear on the appropriateness of a solution technique.

Isolate and represent the knowledge that is necessary to solve the problem.

Select the best problem-solving technique and apply it.

By: Anuj Khanna (Asst. Prof.), www.uptunotes.com

Page 2: UNIT 2 : AI Problem Solving

State Space

The state space of a problem includes:
An initial state.
One or more goal states.
The sequence of intermediate states through which the system makes transitions while applying various rules.

A state space may be a tree or a graph.

The state space for the Water Jug Problem (WJP) can be described as the set of ordered pairs of integers (x, y) such that x = 0, 1, 2, 3, or 4 and y = 0, 1, 2, or 3. The start state is (0, 0) and the goal state is (2, n) for any value of n.

Page 3: UNIT 2 : AI Problem Solving

Rules for Water Jug Problem

1. {(x, y) | x < 4} → (4, y)   Fill the 4-gallon jug.
2. {(x, y) | y < 3} → (x, 3)   Fill the 3-gallon jug.
3. {(x, y) | x > 0} → (0, y)   Empty the 4-gallon jug.
4. {(x, y) | y > 0} → (x, 0)   Empty the 3-gallon jug.
5. {(x, y) | x + y ≥ 4 and y > 0} → (4, x + y − 4)   Pour from the 3-gallon jug into the 4-gallon jug until it is full.
6. {(x, y) | x + y ≥ 3 and x > 0} → (x + y − 3, 3)   Pour from the 4-gallon jug into the 3-gallon jug until it is full.
7. {(x, y) | x + y ≤ 4 and y > 0} → (x + y, 0)   Pour all the water from the 3-gallon jug into the 4-gallon jug.
8. {(x, y) | x + y ≤ 3 and x > 0} → (0, x + y)   Pour all the water from the 4-gallon jug into the 3-gallon jug.
9. (0, 2) → (2, 0)
10. (2, y) → (0, y)
11. {(x, y) | y > 0} → (x, y − d)   Useless rule: pour some water out of the 3-gallon jug.
12. {(x, y) | x > 0} → (x − d, y)   Useless rule: pour some water out of the 4-gallon jug.
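The rules above define a state space that blind search can explore directly. Below is a minimal sketch (an illustrative Python script, not part of the original slides) that applies the non-parameterized rules 1-8 with breadth-first search to find a shortest path from (0, 0) to a state where the 4-gallon jug holds 2 gallons:

```python
from collections import deque

# Production rules 1-8 as functions on states (x, y):
# x = gallons in the 4-gallon jug, y = gallons in the 3-gallon jug.
RULES = [
    lambda x, y: (4, y) if x < 4 else None,                        # 1. fill the 4-gallon jug
    lambda x, y: (x, 3) if y < 3 else None,                        # 2. fill the 3-gallon jug
    lambda x, y: (0, y) if x > 0 else None,                        # 3. empty the 4-gallon jug
    lambda x, y: (x, 0) if y > 0 else None,                        # 4. empty the 3-gallon jug
    lambda x, y: (4, x + y - 4) if x + y >= 4 and y > 0 else None, # 5. pour 3 -> 4 until full
    lambda x, y: (x + y - 3, 3) if x + y >= 3 and x > 0 else None, # 6. pour 4 -> 3 until full
    lambda x, y: (x + y, 0) if x + y <= 4 and y > 0 else None,     # 7. pour all of 3 into 4
    lambda x, y: (0, x + y) if x + y <= 3 and x > 0 else None,     # 8. pour all of 4 into 3
]

def solve_wjp(start=(0, 0)):
    """Breadth-first search: returns a shortest sequence of states to a goal (2, n)."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        x, y = path[-1]
        if x == 2:                       # goal: the 4-gallon jug holds exactly 2 gallons
            return path
        for rule in RULES:
            nxt = rule(x, y)
            if nxt is not None and nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

# A shortest 6-move path, e.g. (0,0) (4,0) (1,3) (1,0) (0,1) (4,1) (2,3)
print(solve_wjp())
```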

Page 4: UNIT 2 : AI Problem Solving

Problem Characteristics

1). Is the problem decomposable?

2). Can solution steps be ignored, or at least undone, if they prove unwise? E.g.: the 8-puzzle problem, the Monkey and Banana problem. In the 8-puzzle we can make a wrong move and, to recover from it, backtrack and undo that move.

Based on this, problems can be:
Ignorable (e.g.: theorem proving)
Recoverable (e.g.: 8-puzzle)
Irrecoverable (e.g.: Chess, playing cards (like the Bridge game))

Note:

** Ignorable problems can be solved using a simple control structure that never backtracks. Such a structure is easy to implement.

Page 5: UNIT 2 : AI Problem Solving

** Recoverable problems can be solved by a slightly more complicated control strategy that is allowed to make mistakes, because solution steps can be undone.

** Irrecoverable problems are solved by a system that expends a great deal of effort making each decision, since every decision must be final (solution steps cannot be undone).

3). Is the universe predictable?

Can we plan an entire move sequence in advance and predict each resulting state? E.g.: in the 8-puzzle every outcome is certain, so a whole sequence of moves can be planned before the first move is made; in a Bridge game the cards held by the other players are unknown, so outcomes are uncertain.

Certain outcomes: 8-puzzle.
Uncertain outcomes: Bridge.
The hardest problems to solve are irrecoverable with uncertain outcomes.

4). Is a good solution absolute or relative?

Page 6: UNIT 2 : AI Problem Solving

5). Is the solution a state or a path?

6). What is the role of knowledge?

7). Does the task require interaction with a person?

8). Problem classification.

Page 7: UNIT 2 : AI Problem Solving

Search Techniques (Blind)

The search strategies that have the two desirable properties (dynamic and systematic) are:
Breadth First Search (BFS)
Depth First Search (DFS)

The problem with BFS is "combinatorial explosion".

The problem with DFS is that it may follow a "blind alley", i.e., a path that must be abandoned because it reaches a dead end, a state that has already been generated, or a length that exceeds the futility value.

Page 8: UNIT 2 : AI Problem Solving

Advantages

Advantages of BFS:
Will not get trapped exploring a blind alley.
Guaranteed to find a solution if one exists. The solution found will also be optimal (in terms of the number of applied rules).

Advantages of DFS:
Requires less memory.
By chance it may find a solution without examining much of the search space.

Page 9: UNIT 2 : AI Problem Solving

Search Strategies

A search strategy is defined by picking the order of node expansion.

Strategies are evaluated along the following dimensions:
completeness: does it always find a solution if one exists?
time complexity: number of nodes generated
space complexity: maximum number of nodes in memory
optimality: does it always find a least-cost solution?

Time and space complexity are measured in terms of:
b: maximum branching factor of the search tree
d: depth of the least-cost solution
m: maximum depth of the state space (may be ∞)

Page 10: UNIT 2 : AI Problem Solving

Classification of Search Strategies

I. Uninformed search strategies use only the information available in the problem definition:
Breadth-first search
Depth-first search
Depth-limited search
Iterative deepening search
Branch and Bound

II. Informed search (heuristic search):
Hill climbing: (i) Simple hill climbing, (ii) Steepest-ascent hill climbing
Best First Search
A*, AO* algorithms
Problem Reduction
Constraint Satisfaction
Means-Ends Analysis, Simulated Annealing

Page 11: UNIT 2 : AI Problem Solving

Breadth-first search

Expand the shallowest unexpanded node.
Implementation: the fringe is a FIFO queue, i.e., new successors go at the end.
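As a concrete companion to the slide (an illustrative sketch, not part of the original deck), breadth-first search over an explicit graph is a few lines of Python once the fringe is a FIFO queue; `graph` is a hypothetical adjacency mapping:

```python
from collections import deque

def bfs(graph, start, goal):
    """Expand the shallowest node first: the fringe is a FIFO queue."""
    fringe = deque([[start]])          # queue of paths; new successors go at the end
    visited = {start}
    while fringe:
        path = fringe.popleft()        # shallowest unexpanded node
        node = path[-1]
        if node == goal:
            return path
        for succ in graph.get(node, []):
            if succ not in visited:    # avoid re-generating states
                visited.add(succ)
                fringe.append(path + [succ])
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'E': ['G']}
print(bfs(graph, 'A', 'G'))            # ['A', 'B', 'E', 'G']
```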


Page 15: UNIT 2 : AI Problem Solving

Properties of breadth-first search

Complete? Yes (if b is finite).
Time? 1 + b + b^2 + b^3 + … + b^d + b(b^d − 1) = O(b^(d+1))
Space? O(b^(d+1)) (keeps every node in memory).
Optimal? Yes (if cost = 1 per step).

Space is the bigger problem (more than time).

Page 16: UNIT 2 : AI Problem Solving

Depth-first search

Expand the deepest unexpanded node.
Implementation: the fringe is a LIFO queue (a stack), i.e., successors go at the front.
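The only change from the BFS sketch above is the fringe discipline: popping from the same end where successors are pushed turns the queue into a stack (again an illustrative sketch, using the same `graph` convention as the BFS example):

```python
def dfs(graph, start, goal):
    """Expand the deepest node first: the fringe is a LIFO stack."""
    fringe = [[start]]                 # a Python list used as a stack
    visited = {start}
    while fringe:
        path = fringe.pop()            # pop from the push end: deepest node first
        node = path[-1]
        if node == goal:
            return path
        for succ in graph.get(node, []):
            if succ not in visited:
                visited.add(succ)
                fringe.append(path + [succ])
    return None
```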


Page 28: UNIT 2 : AI Problem Solving

Properties of depth-first search

Complete? No: it fails in infinite-depth spaces and in spaces with loops. Modified to avoid repeated states along the path, it is complete in finite spaces.
Time? O(b^m): terrible if m is much larger than d, but if solutions are dense it may be much faster than breadth-first.
Space? O(bm), i.e., linear space!
Optimal? No.

Page 29: UNIT 2 : AI Problem Solving

Comparison between DFS and BFS

Depth First Search:
1. Downward traversal in the tree.
2. If the goal is not found up to a leaf node, backtracking occurs.
3. Preferred over BFS when the search tree is known to have a plentiful number of goal states; otherwise DFS may never find the solution.
4. The depth cut-off point leads to problems: if it is too shallow, goals may be missed; if it is set too deep, extra computation of search nodes is required.
5. Since only the path from the initial node to the current node is stored, less space is required. If depth cut-off = d, space complexity = O(d).

Breadth First Search:
1. Performed by exploring all nodes at a given depth before moving to the next level.
2. If the goal is not found, many nodes may need to be expanded before a solution is found, particularly if the tree is very deep.
3. Finds a minimal-path-length solution when one exists.
4. No cut-off problem.
5. Space complexity = O(b^d).

Page 30: UNIT 2 : AI Problem Solving

Depth-limited search

Depth-limited search = depth-first search with a depth limit l, i.e., nodes at depth l are treated as if they have no successors.

Recursive implementation: see the sketch below.
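The recursive implementation referenced above was a figure in the original slides; a minimal Python rendering (an illustrative sketch, using the same `graph` convention as the BFS example) is:

```python
def depth_limited_search(graph, node, goal, limit):
    """DFS that treats nodes at depth `limit` as having no successors.
    Returns a path to goal, 'cutoff' if the limit was hit, or None on failure."""
    if node == goal:
        return [node]
    if limit == 0:
        return 'cutoff'                      # depth limit reached: pretend no successors
    cutoff_occurred = False
    for succ in graph.get(node, []):
        result = depth_limited_search(graph, succ, goal, limit - 1)
        if result == 'cutoff':
            cutoff_occurred = True
        elif result is not None:
            return [node] + result
    return 'cutoff' if cutoff_occurred else None
```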

Page 31: UNIT 2 : AI Problem Solving

Iterative deepening search

Depth-limited search is run repeatedly with increasing depth limits: l = 0, then l = 1, l = 2, l = 3, … until a solution is found.

Page 36: UNIT 2 : AI Problem Solving

Iterative deepening search

Number of nodes generated in a depth-limited search to depth d with branching factor b:
N_DLS = b^0 + b^1 + b^2 + … + b^(d−2) + b^(d−1) + b^d

Number of nodes generated in an iterative deepening search to depth d with branching factor b:
N_IDS = (d+1)b^0 + d·b^1 + (d−1)b^2 + … + 3b^(d−2) + 2b^(d−1) + 1·b^d

For b = 10, d = 5:
N_DLS = 1 + 10 + 100 + 1,000 + 10,000 + 100,000 = 111,111
N_IDS = 6 + 50 + 400 + 3,000 + 20,000 + 100,000 = 123,456

Overhead = (123,456 − 111,111)/111,111 = 11%
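These counts are easy to reproduce for arbitrary b and d; a small illustrative script:

```python
def n_dls(b, d):
    # b^0 + b^1 + ... + b^d
    return sum(b**i for i in range(d + 1))

def n_ids(b, d):
    # (d+1)*b^0 + d*b^1 + ... + 1*b^d : level i is regenerated (d + 1 - i) times
    return sum((d + 1 - i) * b**i for i in range(d + 1))

b, d = 10, 5
print(n_dls(b, d))                                  # 111111
print(n_ids(b, d))                                  # 123456
print((n_ids(b, d) - n_dls(b, d)) / n_dls(b, d))    # ~0.111, i.e. 11% overhead
```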

Page 37: UNIT 2 : AI Problem Solving

Properties of iterative deepening search

Complete? Yes.
Time? (d+1)b^0 + d·b^1 + (d−1)b^2 + … + b^d = O(b^d)
Space? O(bd)
Optimal? Yes, if step cost = 1.

Page 38: UNIT 2 : AI Problem Solving

Difference between Informed & Uninformed Search

Uninformed Search:
1. Nodes in the state space are searched mechanically, until the goal is reached, the time limit is over, or failure occurs.
2. Information about the goal state may not be given.
3. Searching proceeds by blind groping.
4. Search efficiency is low.
5. There are practical limits on the storage available to blind methods.
6. Impractical for solving very large problems.
7. The best solution can be achieved.
E.g.: DFS, BFS, Branch & Bound, Iterative Deepening, etc.

Informed Search:
1. More information about the initial state & operators is available, so search time is less.
2. Some information about the goal is always given.
3. Based on heuristic methods.
4. Searching is fast.
5. Less computation is required.
6. Can handle large search problems.
7. Mostly a good-enough solution is accepted as the optimal solution.
E.g.: Best first search, A*, AO*, hill climbing, etc.

Page 39: UNIT 2 : AI Problem Solving

Heuristic Search

Search strategies like DFS and BFS can find solutions for simple problems. For complex problems, although DFS and BFS are guaranteed to find solutions, those solutions may not be practical ones (for the TSP, time is proportional to N!, or exponential with branch and bound).

Thus, it is better to sacrifice completeness and find an efficient solution.

Heuristic search techniques improve efficiency by sacrificing the claim of completeness, and find a solution which is very close to the optimal solution.

Using the nearest-neighbour heuristic, the TSP can be solved in time proportional to the square of N.

Page 40: UNIT 2 : AI Problem Solving

When more information than the initial state, the operators, and the goal state is available, the size of the search space can usually be constrained: the better the information available, the more efficient the search process. Such methods are called informed search methods; they depend on heuristic information.

Heuristic search improves the efficiency of the search process, possibly by sacrificing claims of completeness.

"Heuristics are like tour guides. They are good to the extent that they point in generally interesting directions, and bad to the extent that they may miss points of interest to a particular individual."

E.g.: a good general-purpose heuristic, useful for a variety of combinatorial problems, is the "Nearest Neighbour Heuristic". It works by selecting the locally superior alternative at each step. Applied to the Travelling Salesman Problem, the following algorithm is used:

Page 41: UNIT 2 : AI Problem Solving

1. Arbitrarily select a starting city, say A.

2. To select the next city, look at all the cities not yet visited and select the one closest to the current city; go to it next.

3. Repeat step 2 until all the cities have been visited.
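These three steps translate directly into Python (an illustrative sketch; the distance matrix and city names are invented for the example):

```python
def nearest_neighbour_tour(dist, start):
    """Greedy TSP tour: always go to the closest unvisited city, then return home."""
    cities = set(dist) - {start}
    tour, current = [start], start
    while cities:
        # Step 2: pick the unvisited city closest to the current one.
        current = min(cities, key=lambda c: dist[current][c])
        cities.remove(current)
        tour.append(current)
    tour.append(start)                      # close the tour
    return tour

# Hypothetical symmetric distances between four cities.
dist = {
    'A': {'B': 2, 'C': 9, 'D': 10},
    'B': {'A': 2, 'C': 6, 'D': 4},
    'C': {'A': 9, 'B': 6, 'D': 8},
    'D': {'A': 10, 'B': 4, 'C': 8},
}
print(nearest_neighbour_tour(dist, 'A'))   # ['A', 'B', 'D', 'C', 'A']
```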

Combinatorial Explosion

The TSP involves n cities with paths connecting the cities. A tour is any path which begins at some starting city, visits each of the other cities exactly once, and returns to the starting city.

If there are n cities, then the number of different paths among them is (n−1)!.
The time to examine a single path is proportional to n.
T(total) = n·(n−1)! = n!, the total search time required.

If n = 10, then 10! = 3,628,800 paths are possible, which is a very large number. This phenomenon of the number of possible paths growing as n increases is called "Combinatorial Explosion".

Page 42: UNIT 2 : AI Problem Solving

Branch and Bound Technique

To overcome the problem of combinatorial explosion, the Branch & Bound technique is used. It begins by generating one path at a time, keeping track of the shortest (best) complete path found so far. This value is used as a bound (threshold) for future paths.

As paths are constructed one city at a time, the algorithm examines each partial path and compares it with the current bound value. We give up exploring any path as soon as its partial length becomes greater than the shortest path (bound value) found so far.

This reduces the search and increases efficiency, but still leaves an exponential number of paths.
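The same tour-construction loop with a bound check gives a compact branch-and-bound sketch (illustrative only; it reuses the hypothetical `dist` matrix from the nearest-neighbour example):

```python
def branch_and_bound_tsp(dist, start):
    """Enumerate tours depth-first, abandoning any partial path that is already
    longer than the best complete tour found so far."""
    best = {'cost': float('inf'), 'tour': None}

    def extend(city, remaining, cost, path):
        if cost >= best['cost']:            # bound: prune this partial path
            return
        if not remaining:                   # complete the tour back to the start
            total = cost + dist[city][start]
            if total < best['cost']:
                best['cost'], best['tour'] = total, path + [start]
            return
        for nxt in remaining:
            extend(nxt, remaining - {nxt}, cost + dist[city][nxt], path + [nxt])

    extend(start, set(dist) - {start}, 0, [start])
    return best['tour'], best['cost']

print(branch_and_bound_tsp(dist, 'A'))      # cost 23, e.g. (['A', 'B', 'D', 'C', 'A'], 23)
```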

Page 43: UNIT 2 : AI Problem Solving

Heuristic Function

A heuristic function is a function that maps problem state descriptions to measures of desirability, usually represented as a number. Well-designed heuristic functions can play an important role in efficiently guiding a search process towards a solution. In mathematical optimization problems the same notion is called the objective function.

A heuristic function estimates the true merit of each node in the search space:
f(n) = g(n) + h(n)
g(n): the cost (so far) to reach the node.
h(n): the estimated cost to get from the node to the goal.
f(n): the estimated total cost of the path through n to the goal.

Page 44: UNIT 2 : AI Problem Solving

Heuristic Search Techniques

Generate and Test
Hill Climbing
  Simple Hill Climbing
  Steepest-Ascent Hill Climbing
Best First Search
Problem Reduction Technique
Constraint Satisfaction Technique
Means-Ends Analysis

Page 45: UNIT 2 : AI Problem Solving

Generate and Test

Generate a possible solution and compare it with the acceptable solution. The comparison is simply a yes/no test: is it an acceptable solution or not?

A systematic generate-and-test can be implemented as depth-first search with backtracking.

Page 46: UNIT 2 : AI Problem Solving

Hill Climbing

Hill climbing is a variation of generate-and-test in which feedback from the test procedure is used to help the generator decide which direction to move in the search space. It is generally used when a good heuristic function is available for evaluating states but no other useful knowledge is available.

Simple Hill Climbing: from the current state, move each time to the first generated state that is better than the current state.

Steepest-Ascent Hill Climbing: at the current state, select the best of the newly generated states, and move to it only if it is better than the current state.

Hill climbing is a local method because it decides what to do next by looking only at the immediate consequences of its choices rather than by exhaustively exploring all the consequences.

Page 47: UNIT 2 : AI Problem Solving

Problems with Hill Climbing

Both simple and steepest-ascent hill climbing may fail to find a solution because of the following:

Local Maximum: a state that is better than all its neighbours but not better than some other states farther away.

A Plateau: a flat area of the search space in which a whole set of neighbouring states have the same value.

A Ridge: a special kind of local maximum; an area of the search space that is higher than the surrounding areas and that itself has a slope.

Page 48: UNIT 2 : AI Problem Solving

Example: Block World Problem

Initial state: a single stack, reading top to bottom: A, H, G, F, E, D, C, B.
Goal state: a single stack, reading top to bottom: H, G, F, E, D, C, B, A.

Page 49: UNIT 2 : AI Problem Solving

Example (continued)

Heuristic functions: the following heuristic functions may be used.

Local: add one point for every block that is resting on the thing it is supposed to be resting on; subtract one point for every block that is sitting on the wrong thing.

Global: for each block that has the correct support structure, add one point for every block in that support structure; for each block that has an incorrect support structure, subtract one point for every block in its existing support structure.

With the local heuristic function the initial state has the value 4 and the goal state has the value 8, whereas with the global heuristic the values are −28 and +28, respectively.

Page 50: UNIT 2 : AI Problem Solving

Example (continued)

From the initial state only one move is possible, giving a new state with local value 6 (global value −21).

From this state three moves are possible, giving three new states with local values 4 (global −28), 4 (global −16), and 4 (global −15).

Thus we see that we have reached a plateau with local evaluation. With global evaluation, the next state to be selected (with steepest-ascent hill climbing) is the one with value −15, which may lead to the solution.

Why are we not able to find the solution with the local function? Is it a deficiency of the search technique or of the heuristic function? Here the fault lies with the poor (local) heuristic function.

Page 51: UNIT 2 : AI Problem Solving

Best-first search

At each step, select the most promising of the nodes generated so far. An implementation of best-first search requires the following:

A node structure containing a description of the problem state, the heuristic value, a parent link, and the list of nodes that were generated from it.

Two lists:
OPEN: nodes that have been generated and had their heuristic value calculated, but that have not been expanded so far. Generally a priority list.
CLOSED: nodes that have already been expanded (required in the case of graph search).

Page 52: UNIT 2 : AI Problem Solving

Best-first search (differences from hill climbing)

In hill climbing, at each step one node is selected and all others are rejected, never to be considered again. In best-first search, one node is selected and all the others are kept around so that they may be revisited later.

In best-first search, the best available state is selected, even if it has a value lower than the value of the currently expanded state.

Page 53: UNIT 2 : AI Problem Solving

Best-first search Algorithm

1. Start with OPEN containing the initial node; set its f value to 0 + h. Set the CLOSED list to empty.
2. Repeat the following until a goal node is found:
   a. If OPEN is empty, report failure. Otherwise pick the node with the lowest f value; call it BESTNODE. Remove it from OPEN and place it on CLOSED. If BESTNODE is a goal node, exit and report a solution. Otherwise, generate the successors of BESTNODE, but do not set BESTNODE to point to them yet. For each SUCCESSOR do the following:
      1. Set SUCCESSOR to point back to BESTNODE.
      2. Compute g(SUCCESSOR) = g(BESTNODE) + the cost of getting from BESTNODE to SUCCESSOR.
      3. Check whether SUCCESSOR is already on OPEN. If so, call that node OLD and see whether it is cheaper to get to OLD via its current path or via BESTNODE, by comparing their g values. If OLD is cheaper, do nothing; if SUCCESSOR is cheaper, reset OLD's parent link to point to BESTNODE, record the new cheaper path in g(OLD), and update f(OLD) accordingly.
      4. If SUCCESSOR is not on OPEN but is on CLOSED, call the node on CLOSED OLD and add it to the list of BESTNODE's successors. Check whether the new or the old path is better, and set the parent link and the g and f values accordingly. If a better path to OLD has been found, propagate this improvement to OLD's successors.
      5. If SUCCESSOR is on neither OPEN nor CLOSED, put it on OPEN and add it to the list of BESTNODE's successors. Compute f(SUCCESSOR) = g(SUCCESSOR) + h(SUCCESSOR).
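Most of this bookkeeping is what Python's `heapq` provides almost for free. Below is a simplified illustrative sketch that omits the path-improvement steps 3-4 (so it is correct when the heuristic is consistent); `goal_test`, `succ`, and `h` are hypothetical problem-specific callables, with `succ(state)` yielding (successor, step_cost) pairs:

```python
import heapq, itertools

def best_first_search(start, goal_test, succ, h):
    """OPEN is a priority queue ordered by f = g + h; CLOSED is a set of expanded states."""
    counter = itertools.count()        # tie-breaker so the heap never compares states
    open_list = [(h(start), next(counter), 0, start, [start])]
    closed = set()
    while open_list:
        f, _, g, state, path = heapq.heappop(open_list)   # lowest f value = BESTNODE
        if goal_test(state):
            return path, g
        if state in closed:
            continue                   # stale entry: already expanded via a cheaper path
        closed.add(state)
        for nxt, step_cost in succ(state):
            if nxt not in closed:
                g2 = g + step_cost
                heapq.heappush(open_list, (g2 + h(nxt), next(counter), g2, nxt, path + [nxt]))
    return None
```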

Page 54: UNIT 2 : AI Problem Solving

Certain Observations

If g = 0: the search gets to a solution somehow; it may be optimal or non-optimal.
If g = constant (1): the search finds the solution with the lowest number of steps.
If g = actual cost: the search finds an optimal solution.
If h = 0: the search is controlled by g alone; in that case, g = 0 gives random search and g = 1 gives BFS.

Since h is not exact:
Underestimated: effort may be wasted, but the solution found is still optimal.
Overestimated: a suboptimal solution may be generated.

Page 55: UNIT 2 : AI Problem Solving

Effect of underestimation of h

The root A has successors B (3+1), C (4+1), and D (5+1), each label showing h + g. B has the lowest f value, so it is expanded to E (3+2), and E to F (3+3). Here h(B) was underestimated: f grows along B's subtree until B's descendants are no longer the cheapest option, at which point the search returns to C.

Effect: wastage of effort, but an optimal solution can still be found.

Page 56: UNIT 2 : AI Problem Solving

Effect of overestimation of h

Again A has successors B (3+1), C (4+1), and D (5+1). Expanding B leads to E (2+2), then F (1+3), then the goal G (0+4), giving a solution of depth 4. If h(D) was overestimated, a shorter path below D is never discovered, because D never looks cheap enough to expand.

Effect: a suboptimal solution can be found.

Page 57: UNIT 2 : AI Problem Solving

Problem Reduction Technique (AND-OR Graph)

AND-OR graphs are used to represent problems that can be solved by decomposing them into a set of smaller problems, all of which must then be solved. Every node may have AND links and OR links emerging from it.

The best-first search technique is not adequate for searching an AND-OR graph. Why?

Page 58: UNIT 2 : AI Problem Solving

Best-first search is not adequate for an AND-OR graph

The choice of which node to expand next must depend not only on the f value of that node but also on whether that node is part of the current best path from the initial node.

[Figure: two AND-OR graphs. In the first (root 9; B = 5, C = 4, D = 3), the best node is part of the best arc but not part of the best path. In the second (root 38; B = 17, C = 27, D = 9; leaves E = 5, F = 3, G = 10, H = 15, I = 4, J = 10), the node to be expanded next according to best-first search leads to a cost of 9 (38), whereas expanding through B (E and F) costs only 6 (18).]

Page 59: UNIT 2 : AI Problem Solving

Romania with step costs in km

h_SLD = straight-line distance heuristic. h_SLD can NOT be computed from the problem description itself.

In this example f(n) = h(n): expand the node that is closest to the goal. This is greedy best-first search.

Page 60: UNIT 2 : AI Problem Solving

Greedy search example

Assume that we want to use greedy search to solve the problem of travelling from Arad to Bucharest. The initial state is Arad (366).

Page 61: UNIT 2 : AI Problem Solving

Greedy search example

The first expansion step produces: Sibiu (253), Timisoara (329), and Zerind (374). Greedy best-first will select Sibiu.

Page 62: UNIT 2 : AI Problem Solving

Greedy search example

If Sibiu is expanded we get: Arad (366), Fagaras (176), Oradea (380), and Rimnicu Vilcea (193). Greedy best-first search will select Fagaras.

Page 63: UNIT 2 : AI Problem Solving

Greedy search example

If Fagaras is expanded we get: Sibiu (253) and Bucharest (0).

The goal is reached!! Yet the path is not optimal (see Arad, Sibiu, Rimnicu Vilcea, Pitesti).

Page 64: UNIT 2 : AI Problem Solving

Greedy search, evaluation

Completeness: NO (cf. DF-search); repeated states must be checked. Minimizing h(n) can result in false starts, e.g. Iasi to Fagaras.

Page 65: UNIT 2 : AI Problem Solving

Greedy search, evaluation

Completeness: NO (cf. DF-search).
Time complexity: O(b^m), cf. worst-case DF-search (with m the maximum depth of the search space). A good heuristic can give dramatic improvement.

Page 66: UNIT 2 : AI Problem Solving

Greedy search, evaluation

Completeness: NO (cf. DF-search).
Time complexity: O(b^m).
Space complexity: O(b^m); keeps all nodes in memory.

Page 67: UNIT 2 : AI Problem Solving

Greedy search, evaluation

Completeness: NO (cf. DF-search).
Time complexity: O(b^m).
Space complexity: O(b^m).
Optimality: NO, same as DF-search.

Page 68: UNIT 2 : AI Problem Solving

A* search

The best-known form of best-first search. Idea: avoid expanding paths that are already expensive.

Evaluation function: f(n) = g(n) + h(n)
g(n): the cost (so far) to reach the node.
h(n): the estimated cost to get from the node to the goal.
f(n): the estimated total cost of the path through n to the goal.

Page 69: UNIT 2 : AI Problem Solving

A* search

A* search uses an admissible heuristic. A heuristic is admissible if it never overestimates the cost to reach the goal; admissible heuristics are optimistic.

Formally:
1. h(n) <= h*(n), where h*(n) is the true cost from n.
2. h(n) >= 0, so h(G) = 0 for any goal G.

E.g., h_SLD(n) never overestimates the actual road distance.

Page 70: UNIT 2 : AI Problem Solving

Romania example


Page 71: UNIT 2 : AI Problem Solving

A* search example

Find Bucharest starting at Arad: f(Arad) = c(??, Arad) + h(Arad) = 0 + 366 = 366.

Page 72: UNIT 2 : AI Problem Solving

A* search example

Expand Arad and determine f(n) for each successor:
f(Sibiu) = c(Arad, Sibiu) + h(Sibiu) = 140 + 253 = 393
f(Timisoara) = c(Arad, Timisoara) + h(Timisoara) = 118 + 329 = 447
f(Zerind) = c(Arad, Zerind) + h(Zerind) = 75 + 374 = 449

The best choice is Sibiu.

Page 73: UNIT 2 : AI Problem Solving

A* search example

Expand Sibiu and determine f(n) for each successor:
f(Arad) = c(Sibiu, Arad) + h(Arad) = 280 + 366 = 646
f(Fagaras) = c(Sibiu, Fagaras) + h(Fagaras) = 239 + 176 = 415
f(Oradea) = c(Sibiu, Oradea) + h(Oradea) = 291 + 380 = 671
f(Rimnicu Vilcea) = c(Sibiu, Rimnicu Vilcea) + h(Rimnicu Vilcea) = 220 + 193 = 413

The best choice is Rimnicu Vilcea.

Page 74: UNIT 2 : AI Problem Solving

A* search example

Expand Rimnicu Vilcea and determine f(n) for each successor:
f(Craiova) = c(Rimnicu Vilcea, Craiova) + h(Craiova) = 366 + 160 = 526
f(Pitesti) = c(Rimnicu Vilcea, Pitesti) + h(Pitesti) = 317 + 100 = 417
f(Sibiu) = c(Rimnicu Vilcea, Sibiu) + h(Sibiu) = 300 + 253 = 553

The best choice now is Fagaras (415).

Page 75: UNIT 2 : AI Problem Solving

A* search example

Expand Fagaras and determine f(n) for each successor:
f(Sibiu) = c(Fagaras, Sibiu) + h(Sibiu) = 338 + 253 = 591
f(Bucharest) = c(Fagaras, Bucharest) + h(Bucharest) = 450 + 0 = 450

The best choice is now Pitesti (417)!!

Page 76: UNIT 2 : AI Problem Solving

A* search example

Expand Pitesti and determine f(n) for each successor:
f(Bucharest) = c(Pitesti, Bucharest) + h(Bucharest) = 418 + 0 = 418

The best choice is Bucharest!! This is the optimal solution (but only because h(n) is admissible). Note the f values along the optimal path!

Page 77: UNIT 2 : AI Problem Solving

Optimality of A* (standard proof)

Suppose a suboptimal goal G2 is in the queue, and let n be an unexpanded node on a shortest path to the optimal goal G.

f(G2) = g(G2)   since h(G2) = 0
      > g(G)    since G2 is suboptimal
      >= f(n)   since h is admissible

Since f(G2) > f(n), A* will never select G2 for expansion.

Page 78: UNIT 2 : AI Problem Solving

BUT … graph search

Graph search discards new paths to a repeated state, so the previous proof breaks down.

Solutions:
Add extra bookkeeping, i.e., remove the more expensive of the two paths.
Ensure that the optimal path to any repeated state is always followed first. This places an extra requirement on h(n): consistency (monotonicity).

Page 79: UNIT 2 : AI Problem Solving

Consistency

A heuristic is consistent if
h(n) <= c(n, a, n') + h(n')

If h is consistent, we have
f(n') = g(n') + h(n')
      = g(n) + c(n, a, n') + h(n')
      >= g(n) + h(n)
      = f(n)

i.e., f(n) is nondecreasing along any path.

Page 80: UNIT 2 : AI Problem Solving

Optimality of A* (more useful view)

A* expands nodes in order of increasing f value, so contours of equal f can be drawn in the state space (uniform-cost search would add circles). f-contours are gradually added: first nodes with f(n) < C*, then some nodes on the goal contour (f(n) = C*). Contour i contains all nodes with f = f_i, where f_i < f_{i+1}.

Page 81: UNIT 2 : AI Problem Solving

A* search, evaluation

Completeness: YES, since bands of increasing f are added, unless there are infinitely many nodes with f < f(G).

Page 82: UNIT 2 : AI Problem Solving

A* search, evaluation

Completeness: YES.
Time complexity: the number of nodes expanded is still exponential in the length of the solution.

Page 83: UNIT 2 : AI Problem Solving

A* search, evaluation

Completeness: YES.
Time complexity: exponential in the path length.
Space complexity: it keeps all generated nodes in memory, hence space is the major problem, not time.

Page 84: UNIT 2 : AI Problem Solving

A* search, evaluation

Completeness: YES.
Time complexity: exponential in the path length.
Space complexity: all nodes are stored.
Optimality: YES, since A* cannot expand f_{i+1} until f_i is finished:
A* expands all nodes with f(n) < C*,
A* expands some nodes with f(n) = C*,
A* expands no nodes with f(n) > C*.
A* is also optimally efficient (not counting ties).

Page 85: UNIT 2 : AI Problem Solving

Memory-bounded heuristic search

Some solutions to A*'s space problems (which maintain completeness and optimality):

Iterative-deepening A* (IDA*): here the cutoff information is the f-cost (g + h) instead of the depth.

Recursive best-first search (RBFS): a recursive algorithm that attempts to mimic standard best-first search with linear space.

(Simple) Memory-bounded A* ((S)MA*): drop the worst leaf node when memory is full.

Page 86: UNIT 2 : AI Problem Solving

Recursive best-first search

function RECURSIVE-BEST-FIRST-SEARCH(problem) returns a solution or failure
  return RBFS(problem, MAKE-NODE(INITIAL-STATE[problem]), ∞)

function RBFS(problem, node, f_limit) returns a solution or failure and a new f-cost limit
  if GOAL-TEST[problem](STATE[node]) then return node
  successors ← EXPAND(node, problem)
  if successors is empty then return failure, ∞
  for each s in successors do
    f[s] ← max(g(s) + h(s), f[node])
  repeat
    best ← the lowest f-value node in successors
    if f[best] > f_limit then return failure, f[best]
    alternative ← the second-lowest f-value among successors
    result, f[best] ← RBFS(problem, best, min(f_limit, alternative))
    if result ≠ failure then return result

Page 87: UNIT 2 : AI Problem Solving

Recursive best-first search

RBFS keeps track of the f-value of the best alternative path available. If the current f-value exceeds this alternative f-value, it backtracks to the alternative path. Upon backtracking it changes the f-value of the abandoned node to the best f-value of its children, so re-expansion of that subtree remains possible later.

Page 88: UNIT 2 : AI Problem Solving

Recursive best-first search, example

The path to Rimnicu Vilcea is already expanded. Above each node, the f-limit for every recursive call is shown on top; below each node, f(n). The path is followed until Pitesti, which has an f-value worse than the f-limit.

Page 89: UNIT 2 : AI Problem Solving

Recursive best-first search, example

Unwind the recursion and store the best f-value for the current best leaf Pitesti:
result, f[best] ← RBFS(problem, best, min(f_limit, alternative))

best is now Fagaras: call RBFS for the new best. The best value is now 450.

Page 90: UNIT 2 : AI Problem Solving

Recursive best-first search, example

Unwind the recursion and store the best f-value for the current best leaf Fagaras:
result, f[best] ← RBFS(problem, best, min(f_limit, alternative))

best is now Rimnicu Vilcea (again): call RBFS for the new best. The subtree is expanded again; the best alternative subtree is now through Timisoara. A solution is found, because 447 > 417.

Page 91: UNIT 2 : AI Problem Solving

RBFS evaluation

RBFS is a bit more efficient than IDA*, but still suffers from excessive node regeneration (mind changes). Like A*, it is optimal if h(n) is admissible. Its space complexity is O(bd), whereas IDA* retains only one single number (the current f-cost limit). Its time complexity is difficult to characterize: it depends on the accuracy of h(n) and on how often the best path changes. Both IDA* and RBFS suffer from using too little memory.

Page 92: UNIT 2 : AI Problem Solving

(Simplified) memory-bounded A*

Use all available memory: i.e., expand the best leaves until the available memory is full. When full, SMA* drops the worst leaf node (the one with the highest f-value) and, like RBFS, backs the forgotten node's value up to its parent.

What if all leaves have the same f-value? The same node could be selected for both expansion and deletion. SMA* solves this by expanding the newest best leaf and deleting the oldest worst leaf.

SMA* is complete if a solution is reachable, and optimal if the optimal solution is reachable.

Page 93: UNIT 2 : AI Problem Solving

Learning to search better

All of the previous algorithms use fixed strategies. Agents can learn to improve their search by exploiting the meta-level state space. Each meta-level state is an internal (computational) state of a program that is searching in the object-level state space; in A*, such a state consists of the current search tree. A meta-level learning algorithm learns from experiences at the meta-level.

Page 94: UNIT 2 : AI Problem Solving

Heuristic functions

E.g., for the 8-puzzle: the average solution cost is about 22 steps (branching factor approximately 3), so an exhaustive search to depth 22 examines about 3.1 × 10^10 states. A good heuristic function can reduce the search process.

Page 95: UNIT 2 : AI Problem Solving

Heuristic functions

The 8-puzzle has two commonly used heuristics:
h1 = the number of misplaced tiles; for the pictured start state, h1(s) = 8.
h2 = the sum of the distances of the tiles from their goal positions (Manhattan distance); for the pictured start state, h2(s) = 3+1+2+2+2+3+3+2 = 18.
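Both heuristics are easy to state in code. An illustrative sketch (states are assumed to be tuples of 9 entries read row by row, with 0 for the blank, and the goal layout below is an assumption for the example):

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout, blank in the last square

def h1(state):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for s, g in zip(state, GOAL) if s != 0 and s != g)

def h2(state):
    """Sum of the Manhattan distances of each tile from its goal position."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = GOAL.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total
```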

Page 96: UNIT 2 : AI Problem Solving

Heuristic quality

Effective branching factor b*: the branching factor that a uniform tree of depth d would need in order to contain N + 1 nodes:
N + 1 = 1 + b* + (b*)^2 + … + (b*)^d

The measure is fairly constant for sufficiently hard problems, and can thus provide a good guide to the heuristic's overall usefulness. A good value of b* is 1.
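Given N and d, b* can be found numerically; a small bisection sketch (illustrative only):

```python
def effective_branching_factor(n_nodes, depth, tol=1e-6):
    """Solve N + 1 = 1 + b* + (b*)^2 + ... + (b*)^depth for b* by bisection."""
    def total(b):
        return sum(b**i for i in range(depth + 1))
    lo, hi = 1.0, float(n_nodes)         # b* lies between 1 and N
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < n_nodes + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A depth-5 solution found after generating 52 nodes gives b* of about 1.92.
print(round(effective_branching_factor(52, 5), 2))
```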

Page 97: UNIT 2 : AI Problem Solving

Heuristic quality and dominance

Comparison on 1200 random problems with solution lengths from 2 to 24: if h2(n) >= h1(n) for all n (both being admissible), then h2 dominates h1 and is better for search.

Page 98: UNIT 2 : AI Problem Solving

Inventing admissible heuristics

Admissible heuristics can be derived from the exact solution cost of a relaxed version of the problem:
Relaxed 8-puzzle for h1: a tile can move anywhere. As a result, h1(n) gives the length of the shortest solution of this relaxed puzzle.
Relaxed 8-puzzle for h2: a tile can move to any adjacent square. As a result, h2(n) gives the length of the shortest solution of this relaxed puzzle.

The optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the real problem.

ABSOLVER found a useful heuristic for the Rubik's Cube this way.

Page 99: UNIT 2 : AI Problem Solving

Inventing admissible heuristics

Admissible heuristics can also be derived from the solution cost of a subproblem of a given problem; this cost is a lower bound on the cost of the real problem. Pattern databases store the exact solution cost for every possible subproblem instance; the complete heuristic is constructed using the patterns in the database.

Page 100: UNIT 2 : AI Problem Solving

Inventing admissible heuristics

Another way to find an admissible heuristic is through learning from experience:
Experience = solving lots of 8-puzzles.
An inductive learning algorithm can be used to predict costs for other states that arise during search.

Page 101: UNIT 2 : AI Problem Solving

Local search and optimization

Previously: systematic exploration of the search space, where the path to the goal is the solution to the problem. Yet for some problems the path is irrelevant, e.g. 8-queens. Different algorithms can then be used: local search.

Page 102: UNIT 2 : AI Problem Solving

Local search and optimization

Local search = use a single current state and move to neighbouring states.

Advantages:
Uses very little memory.
Often finds reasonable solutions in large or infinite state spaces.

Local search is also useful for pure optimization problems: find the best state according to some objective function (e.g. survival of the fittest as a metaphor for optimization).

Page 103: UNIT 2 : AI Problem Solving

Local search and optimization


Page 104: UNIT 2 : AI Problem Solving

Hill-climbing search

Hill climbing "is a loop that continuously moves in the direction of increasing value"; it terminates when a peak is reached. Hill climbing does not look ahead beyond the immediate neighbours of the current state, and it chooses randomly among the set of best successors if there is more than one. Hill climbing is a.k.a. greedy local search.

Page 105: UNIT 2 : AI Problem Solving

Hill-climbing search

function HILL-CLIMBING(problem) returns a state that is a local maximum
  inputs: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor

Page 106: UNIT 2 : AI Problem Solving

Hill-climbing example

8-queens problem (complete-state formulation).
Successor function: move a single queen to another square in the same column.
Heuristic function h(n): the number of pairs of queens that are attacking each other (directly or indirectly). A sketch of this heuristic follows.
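An illustrative Python rendering of that heuristic (a board is assumed to be a tuple where board[col] = row of the queen in that column):

```python
from itertools import combinations

def h(board):
    """Number of pairs of queens attacking each other (same row or same diagonal)."""
    pairs = 0
    for c1, c2 in combinations(range(len(board)), 2):
        same_row = board[c1] == board[c2]
        same_diag = abs(board[c1] - board[c2]) == abs(c1 - c2)
        if same_row or same_diag:
            pairs += 1
    return pairs

print(h((0, 1, 2, 3, 4, 5, 6, 7)))   # all queens on one diagonal: h = 28
print(h((0, 4, 7, 5, 2, 6, 1, 3)))   # a solved board: h = 0
```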

Page 107: UNIT 2 : AI Problem Solving

Hill-climbing example

a) shows a state with h = 17 and the h-value for each possible successor.
b) shows a local minimum in the 8-queens state space (h = 1).

Page 108: UNIT 2 : AI Problem Solving

Drawbacks

Ridge: a sequence of local maxima that is difficult for greedy algorithms to navigate.
Plateau: an area of the state space where the evaluation function is flat.

Starting from a random 8-queens state, hill climbing gets stuck 86% of the time.

Page 109: UNIT 2 : AI Problem Solving

Hill-climbing variations

Stochastic hill climbing: random selection among the uphill moves; the selection probability can vary with the steepness of the uphill move.

First-choice hill climbing: cf. stochastic hill climbing, but generates successors randomly until one better than the current state is found.

Random-restart hill climbing: tries to avoid getting stuck in local maxima by restarting from random initial states.

Page 110: UNIT 2 : AI Problem Solving

Simulated annealing

Escape local maxima by allowing "bad" moves, but gradually decrease their size and frequency. The idea originates in metallurgical annealing. Bouncing-ball analogy: shaking hard (= high temperature), then shaking less (= lowering the temperature). If T decreases slowly enough, the best state is reached. Applied to VLSI layout, airline scheduling, etc.

Page 111: UNIT 2 : AI Problem Solving

Simulated annealing

function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  inputs: problem, a problem
          schedule, a mapping from time to temperature
  local variables: current, a node
                   next, a node
                   T, a "temperature" controlling the probability of downward steps
  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← VALUE[next] − VALUE[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)

Page 112: UNIT 2 : AI Problem Solving

Local beam search

Keep track of k states instead of one. Initially: k random states. Next: determine all successors of the k states; if any successor is a goal, we are finished; else select the k best from the successors and repeat.

Major difference with random-restart search: information is shared among the k search threads, though the beam can suffer from a lack of diversity. Stochastic variant: choose k successors with probability proportional to state success.

Page 113: UNIT 2 : AI Problem Solving

Genetic algorithms

A variant of local beam search with sexual recombination: successor states are generated by combining two parent states rather than by modifying a single state.


Page 115: UNIT 2 : AI Problem Solving

Genetic algorithm

function GENETIC-ALGORITHM(population, FITNESS-FN) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN, a function which determines the quality of an individual
  repeat
    new_population ← empty set
    loop for i from 1 to SIZE(population) do
      x ← RANDOM-SELECTION(population, FITNESS-FN)
      y ← RANDOM-SELECTION(population, FITNESS-FN)
      child ← REPRODUCE(x, y)
      if (small random probability) then child ← MUTATE(child)
      add child to new_population
    population ← new_population
  until some individual is fit enough or enough time has elapsed
  return the best individual

Page 116: UNIT 2 : AI Problem Solving

Exploration problems

Until now all the algorithms were offline: the solution is determined before executing it. Online search interleaves computation and action, and is necessary for dynamic and semi-dynamic environments, where it is impossible to take all possible contingencies into account in advance. It is used for exploration problems with unknown states and actions, e.g. any robot in a new environment, a newborn baby, …

Page 117: UNIT 2 : AI Problem Solving

Online search problems

Agent knowledge:
ACTIONS(s): the list of allowed actions in state s.
c(s, a, s'): the step-cost function (known only after s' is determined).
GOAL-TEST(s).

The agent can recognize previously visited states, actions are deterministic, and the agent has access to an admissible heuristic h(s), e.g. the Manhattan distance.

Page 118: UNIT 2 : AI Problem Solving

Online search problems

Objective: reach the goal with minimal cost, where cost = the total cost of the travelled path. The competitive ratio compares this cost with the cost of the solution path the agent would follow if the search space were known. It can be infinite if the agent accidentally reaches a dead end.

Page 119: UNIT 2 : AI Problem Solving

The adversary argument

Assume an adversary who can construct the state space while the agent explores it. Having visited states S and A, what next? Whatever the agent chooses fails in one of the state spaces the adversary can present, so no algorithm can avoid dead ends in all state spaces.

Page 120: UNIT 2 : AI Problem Solving

Online search agents

The agent maintains a map of the environment, updated based on percept input, and this map is used to decide the next action. Note the difference with e.g. A*: an online version can only expand the node it is physically in (local order of expansion).

Page 121: UNIT 2 : AI Problem Solving

Online DF-search

function ONLINE-DFS-AGENT(s') returns an action
  inputs: s', a percept identifying the current state
  static: result, a table indexed by action and state, initially empty
          unexplored, a table that lists, for each visited state, the actions not yet tried
          unbacktracked, a table that lists, for each visited state, the backtracks not yet tried
          s, a, the previous state and action, initially null

  if GOAL-TEST(s') then return stop
  if s' is a new state then unexplored[s'] ← ACTIONS(s')
  if s is not null then do
    result[a, s] ← s'
    add s to the front of unbacktracked[s']
  if unexplored[s'] is empty then
    if unbacktracked[s'] is empty then return stop
    else a ← an action b such that result[b, s'] = POP(unbacktracked[s'])
  else a ← POP(unexplored[s'])
  s ← s'
  return a

Page 122: UNIT 2 : AI Problem Solving

Online DF-search, example

Assume a maze problem on a 3×3 grid. s' = (1,1) is the initial state. result, unexplored (UX), unbacktracked (UB), … are empty; s and a are also empty.

Page 123: UNIT 2 : AI Problem Solving

Online DF-search, example

GOAL-TEST((1,1))? S is not G, thus false.
(1,1) a new state? True; ACTIONS((1,1)) → UX[(1,1)] = {RIGHT, UP}.
s is null? True (initially).
UX[(1,1)] empty? False.
POP(UX[(1,1)]) → a; a = UP.
s = (1,1); return a. (s' = (1,1))

Page 124: UNIT 2 : AI Problem Solving

Online DF-search, example

GOAL-TEST((2,1))? S is not G, thus false.
(2,1) a new state? True; ACTIONS((2,1)) → UX[(2,1)] = {DOWN}.
s is null? False (s = (1,1)); result[UP, (1,1)] ← (2,1); UB[(2,1)] = {(1,1)}.
UX[(2,1)] empty? False.
a = DOWN, s = (2,1); return a. (s' = (2,1))

Page 125: UNIT 2 : AI Problem Solving

Online DF-search, example

GOAL-TEST((1,1))? S is not G, thus false.
(1,1) a new state? False.
s is null? False (s = (2,1)); result[DOWN, (2,1)] ← (1,1); UB[(1,1)] = {(2,1)}.
UX[(1,1)] empty? False.
a = RIGHT, s = (1,1); return a. (s' = (1,1))

Page 126: UNIT 2 : AI Problem Solving

Online DF-search, example

GOAL-TEST((1,2))? S is not G, thus false.
(1,2) a new state? True; UX[(1,2)] = {RIGHT, UP, LEFT}.
s is null? False (s = (1,1)); result[RIGHT, (1,1)] ← (1,2); UB[(1,2)] = {(1,1)}.
UX[(1,2)] empty? False.
a = LEFT, s = (1,2); return a. (s' = (1,2))

Page 127: UNIT 2 : AI Problem Solving

Online DF-search, example

GOAL-TEST((1,1))? S is not G, thus false.
(1,1) a new state? False.
s is null? False (s = (1,2)); result[LEFT, (1,2)] ← (1,1); UB[(1,1)] = {(1,2), (2,1)}.
UX[(1,1)] empty? True. UB[(1,1)] empty? False.
a = the action b such that result[b, (1,1)] = (1,2), i.e. b = RIGHT.
a = RIGHT, s = (1,1), … (s' = (1,1))

Page 128: UNIT 2 : AI Problem Solving

Online DF-search

In the worst case each node is visited twice. An agent can go on a long walk even when it is close to the solution; an online iterative-deepening approach solves this problem. Online DF-search works only when actions are reversible.

Page 129: UNIT 2 : AI Problem Solving

Online local search

Hill climbing is already online: one state is stored. But its performance is bad due to local maxima, and random restarts are impossible in an online setting. Solution 1: a random walk introduces exploration, but can produce exponentially many steps.

Page 130: UNIT 2 : AI Problem Solving

Online local search

Solution 2: add memory to the hill climber. Store the current best estimate H(s) of the cost to reach the goal; H(s) is initially the heuristic estimate h(s) and is afterwards updated with experience (see below). This gives Learning Real-Time A* (LRTA*).

Page 131: UNIT 2 : AI Problem Solving

Learning real-time A*

function LRTA*-COST(s, a, s', H) returns a cost estimate
  if s' is undefined then return h(s)
  else return c(s, a, s') + H[s']

function LRTA*-AGENT(s') returns an action
  inputs: s', a percept identifying the current state
  static: result, a table indexed by action and state, initially empty
          H, a table of cost estimates indexed by state, initially empty
          s, a, the previous state and action, initially null

  if GOAL-TEST(s') then return stop
  if s' is a new state (not in H) then H[s'] ← h(s')
  unless s is null
    result[a, s] ← s'
    H[s] ← min over b in ACTIONS(s) of LRTA*-COST(s, b, result[b, s], H)
  a ← an action b in ACTIONS(s') that minimizes LRTA*-COST(s', b, result[b, s'], H)
  s ← s'
  return a