Mais conteúdo relacionado Semelhante a Jarrar: Informed Search (20) Mais de Mustafa Jarrar (20) Jarrar: Informed Search 1. Jarrar © 2014 1
Chapter 4
Informed Searching
Dr. Mustafa Jarrar
Sina Institute, University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Lecture Notes on Informed Searching
University of Birzeit, Palestine
1st Semester, 2014
Artificial Intelligence
2. Jarrar © 2014 2
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2011/11/artificial-intelligence-fall-2011.html
Most information based on Chapter 4 of [1]
3. Jarrar © 2014 3
Discussion and Motivation
How to determine the minimum number of coins to give while
making change?
The coin of the highest value first ?
4. Jarrar © 2014 4
Discussion and Motivation
Travel Salesperson Problem
• Any idea how to improve this type of search?
• What type of information we may use to improve our search?
• Do you think this idea is useful: (At each stage visit the unvisited city
nearest to the current city)?
Given a list of cities and their pair wise distances, the task is to find a
shortest possible tour that visits each city exactly once.
5. Jarrar © 2014 5
Best-first search
• Idea: use an evaluation function f(n) for each node
– family of search methods with various evaluation functions (estimate of
"desirability“)
– usually gives an estimate of the distance to the goal
– often referred to as heuristics in this context
Expand most desirable unexpanded node.
A heuristic function ranks alternatives at each branching step based
on the available information (heuristically) in order to make a decision
about which branch to follow during a search.
• Implementation:
Order the nodes in fringe in decreasing order of desirability.
• Special cases:
– greedy best-first search
– A* search
6. Jarrar © 2014 6
Romania with step costs in km
Suppose we can have this info (SLD)
How can we use it to improve our search?
7. Jarrar © 2014 7
Greedy best-first search
• Greedy best-first search expands the node that appears to be closest
to goal.
• Estimate of cost from n to goal ,e.g., hSLD(n) = straight-line distance
from n to Bucharest.
• Utilizes a heuristic function as evaluation function
– f(n) = h(n) = estimated cost from the current node to a goal.
– Heuristic functions are problem-specific.
– Often straight-line distance for route-finding and similar problems.
– Often better than depth-first, although worst-time complexities are
equal or worse (space).
8. Jarrar © 2014 8
Greedy best-first search example
Example from [1]
12. Jarrar © 2014 12
Properties of greedy best-first search
• Complete: No – can get stuck in loops (e.g., Iasi Neamt Iasi
Neamt ….)
• Time: O(bm), but a good heuristic can give significant improvement
• Space: O(bm) -- keeps all nodes in memory
• Optimal: No
b branching factor
m maximum depth of the
search tree
13. Jarrar © 2014 13
Discussion
• Do you think hSLD(n) is admissible?
• Would you use hSLD(n) in Palestine? How? Why?
• Did you find the Greedy idea useful?
Ideas to improve it?
14. Jarrar © 2014 14
A* search
• Idea: avoid expanding paths that are already expensive.
Evaluation function = path cost + estimated cost to the goal
f(n) = g(n) + h(n)
-g(n) = cost so far to reach n
-h(n) = estimated cost from n to goal
-f(n) = estimated total cost of path through n to goal
• Combines greedy and uniform-cost search to find the
(estimated) cheapest path through the current node
– Heuristics must be admissible
• Never overestimate the cost to reach the goal
– Very good search method, but with complexity problems
21. Jarrar © 2014 21
A* Exercise
How will A* get from Iasi to Fagaras?
22. Jarrar © 2014 22
A* Exercise
Node Coordinates SL Distance
A (5,9) 8.0
B (3,8) 7.3
C (8,8) 7.6
D (5,7) 6.0
E (7,6) 5.4
F (4,5) 4.1
G (6,5) 4.1
H (3,3) 2.8
I (5,3) 2.0
J (7,2) 2.2
K (5,1) 0.0
23. Jarrar © 2014 23
Solution to A* Exercise
Node Coordinates SL Distance
A (5,9) 8.0
B (3,8) 7.3
C (8,8) 7.6
D (5,7) 6.0
E (7,6) 5.4
F (4,5) 4.1
G (6,5) 4.1
H (3,3) 2.8
I (5,3) 2.0
J (7,2) 2.2
K (5,1) 0.0
24. Jarrar © 2014 24
Greedy Best-First Exercise
Node Coordinates Distance
A (5,9) 8.0
B (3,8) 7.3
C (8,8) 7.6
D (5,7) 6.0
E (7,6) 5.4
F (4,5) 4.1
G (6,5) 4.1
H (3,3) 2.8
I (5,3) 2.0
J (7,2) 2.2
K (5,1) 0.0
25. Jarrar © 2014 25
Solution to Greedy Best-First Exercise
Node Coordinates Distance
A (5,9) 8.0
B (3,8) 7.3
C (8,8) 7.6
D (5,7) 6.0
E (7,6) 5.4
F (4,5) 4.1
G (6,5) 4.1
H (3,3) 2.8
I (5,3) 2.0
J (7,2) 2.2
K (5,1) 0.0
26. Jarrar © 2014 26
Another Exercise
Node C g(n) h(n)
A (5,10) 0.0 8.0
B (3,8) 2.8 6.3
C (7,8) 2.8 6.3
D (2,6) 5.0 5.0
E (5,6) 5.6 4.0
F (6,7) 4.2 5.1
G (8,6) 5.0 5.0
H (1,4) 7.2 4.5
I (3,4) 7.2 2.8
J (7,3) 8.1 2.2
K (8,4) 7.0 3.6
L (5,2) 9.6 0.0
Do 1) A* Search and 2) Greedy Best-Fit Search
27. Jarrar © 2014 27
Admissible Heuristics
• A heuristic h(n) is admissible if for every node n, h(n) ≤ h*(n), where
h*(n) is the true cost to reach the goal state from n.
•
• An admissible heuristic never overestimates the cost to reach the goal,
i.e., it is optimistic.
• Example: hSLD(n) (never overestimates the actual road distance)
• Theorem-1: If h(n) is admissible, A* using TREE-SEARCH is optimal.
(Ideas to prove this theorem?)
Based on [4]
28. Jarrar © 2014 28
Optimality of A* (proof)
• Recall that f(n) = g(n) + h(n
• Now, suppose some suboptimal goal G2 has been generated and is in
the fringe. Let n be an unexpanded node in the fringe such that n is on
a shortest path to an optimal goal G.
f(G2) = g(G2) since h(G2) = 0
g(G2) > g(G) since G2 is suboptimal
f(G) = g(G) since h(G) = 0
f(G2) > f(G) from above
h(n) ≤ h^*(n) since h is admissible
g(n) + h(n) ≤ g(n) + h*(n)
f(n) ≤ f(G) Hence f(G2) > f(n), and n is expanded contradiction!
thus, A* will never select G2 for expansion
We want to prove:
f(n) < f(G2)
(then A* will prefer n over G2)
Based on [4]
29. Jarrar © 2014 29
Optimality of A* (proof)
• Recall that f(n) = g(n) + h(n
• Now, suppose some suboptimal goal G2 has been generated and is in
the fringe. Let n be an unexpanded node in the fringe such that n is on
a shortest path to an optimal goal G.
We want to prove:
f(n) < f(G2)
(then A* will prefer n over G2)
• In other words:
f(G2) = g(G2) + h(G2) = g(G2) > C*,
since G2 is a goal on a non-optimal path (C* is the optimal cost)
f(n) = g(n) + h(n) <= C*, since h is admissible
f(n) <= C* < f(G2), so G2 will never be expanded
A* will not expand goals on sub-optimal paths
30. Jarrar © 2014 30
Consistent Heuristics
• A heuristic is consistent if for every node n, every successor n' of n
generated by any action a,
h(n) ≤ c(n,a,n') + h(n')
• If h is consistent, we have
f(n') = g(n') + h(n')
= g(n) + c(n,a,n') + h(n')
≥ g(n) + h(n)
= f(n)
• i.e., f(n) is non-decreasing along any path.
• Theorem-2: If h(n) is consistent, A* using GRAPH-SEARCH is optimal.
• consistency is also called monotonicity
Based on [5]
31. Jarrar © 2014 31
Optimality of A*
• A* expands nodes in order of increasing f value
• Gradually adds "f-contours" of nodes
• Contour i has all nodes with f = fi, where fi < fi+1
Based on [6]
32. Jarrar © 2014 32
Complexity of A*
• The number of nodes within the goal contour
search space is still exponential
– with respect to the length of the solution
– better than other algorithms, but still problematic
• Frequently, space complexity is more important
than time complexity
– A* keeps all generated nodes in memory
Based on [2]
33. Jarrar © 2014 33
Properties of A*
• Complete: Yes (unless there are infinitely many nodes with f ≤ f(G) )
• Time: Exponential
• Because all nodes such that f(n) <= C* are expanded!
• Space: Keeps all nodes in memory,
• Fringe is exponentially large
• Optimal: Yes
who can propose an idea to improve the time/space
complexity
34. Jarrar © 2014 34
Memory Bounded Heuristic Search
• How can we solve the memory problem for A* search?
• Idea: Try something like iterative deeping search, but the
cutoff is f-cost (g+h) at each iteration, rather than depth first.
Two types of memory bounded heuristic searches:
Recursive BFS
SMA*
35. Jarrar © 2014 35
Recursive Best First Search (RBFS)
RBFS changes its mind very often
in practice.
This is because the f=g+h become
more accurate (less optimistic) as
we approach the goal. Hence,
higher level nodes have smaller f-
values and will be explored first.
Problem? If we have more
memory we cannot make use of it.
Ay idea to improve this?
best alternative
over fringe nodes,
which are not children:
do I want to back up?
36. Jarrar © 2014 36
Simple Memory Bounded A* (SMA*)
• This is like A*, but when memory is full we delete the worst
node (largest f-value).
• Like RBFS, we remember the best descendent in the branch
we delete.
• If there is a tie (equal f-values) we first delete the oldest
nodes first.
• SMA* finds the optimal reachable solution given the memory
constraint.
• But time can still be exponential.
37. Jarrar © 2014 37
SMA* pseudocode
function SMA*(problem) returns a solution sequence
inputs: problem, a problem
static: Queue, a queue of nodes ordered by f-cost
Queue MAKE-QUEUE({MAKE-NODE(INITIAL-STATE[problem])})
loop do
if Queue is empty then return failure
n deepest least-f-cost node in Queue
if GOAL-TEST(n) then return success
s NEXT-SUCCESSOR(n)
if s is not a goal and is at maximum depth then
f(s)
else
f(s) MAX(f(n),g(s)+h(s))
if all of n’s successors have been generated then
update n’s f-cost and those of its ancestors if necessary
if SUCCESSORS(n) all in memory then remove n from Queue
if memory is full then
delete shallowest, highest-f-cost node in Queue
remove it from its parent’s successor list
insert its parent on Queue if necessary
insert s in Queue
end
Based on [2]
38. Jarrar © 2014 38
Simple Memory-bounded A* (SMA*)
SMA* is a shortest path algorithm based on the A* algorithm.
The advantage of SMA* is that it uses a bounded memory, while the A*
algorithm might need exponential memory.
All other characteristics of SMA* are inherited from A*.
How it works:
• Like A*, it expands the best leaf until memory is full.
• Drops the worst leaf node- the one with the highest f-value.
• Like RBFS, SMA* then backs up the value of the forgotten node to
its parent.
39. Jarrar © 2014 39
Simple Memory-bounded A* (SMA*)
24+0=24
A
B G
C D
E F
H
J
I
K
0+12=12
10+5=15
20+5=25
30+5=35
20+0=20
30+0=30
8+5=13
16+2=18
24+0=24 24+5=29
10 8
10 10
10 10
8 16
8 8
A
12
A
B
12
15
A
B G
13
15 13
H
13
A
G
18
13[15]
A
G
24[]
I
15[15]
24
A
B G
15
15 24
A
B
C
15[24]
15
25
f = g+h
(Example with 3-node memory)
A
B
D
8
20
20[24]
20[]
Progress of SMA*. Each node is labeled with its current f-cost.
Values in parentheses show the value of the best forgotten descendant.
Is given to nodes that the path up to it uses all available memory.
Can tell you when best solution found within memory constraint is optimal or not.
= goal
Search space
41. Jarrar © 2014 41
SMA* Properties [2]
• It works with a heuristic, just as A*
• It is complete if the allowed memory is high enough to store the
shallowest solution.
• It is optimal if the allowed memory is high enough to store the
shallowest optimal solution, otherwise it will return the best solution
that fits in the allowed memory
• It avoids repeated states as long as the memory bound allows it
• It will use all memory available
• Enlarging the memory bound of the algorithm will only speed up the
calculation
• When enough memory is available to contain the entire search tree,
then calculation has an optimal speed
42. Jarrar © 2014 42
Admissible Heuristics
How can you invent a good admissible heuristic function?
E.g., for the 8-puzzle
43. Jarrar © 2014 43
Admissible heuristics
E.g., for the 8-puzzle:
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance
(i.e., no. of squares from desired location of each tile)
• h1(S) = ?
• h2(S) = ?
44. Jarrar © 2014 44
Admissible Heuristics
E.g., for the 8-puzzle:
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance
(i.e., no. of squares from desired location of each tile)
• h1(S) = ? 8
• h2(S) = ? 3+1+2+2+2+3+3+2 = 18
45. Jarrar © 2014 45
Dominance
• If h2(n) ≥ h1(n) for all n, and both are admissible.
• then h2 dominates h1
• h2 is better for search: it is guaranteed to expand less nodes.
• Typical search costs (average number of nodes expanded):
• d=12 IDS = 3,644,035 nodes
A*(h1) = 227 nodes
A*(h2) = 73 nodes
• d=24 IDS = too many nodes
A*(h1) = 39,135 nodes
A*(h2) = 1,641 nodes
• What to do If we have h1…hm, but none dominates the other?
h(n) = max{h1(n), . . .hm(n)}
46. Jarrar © 2014 46
Relaxed Problems
• A problem with fewer restrictions on the actions is called
a relaxed problem
• The cost of an optimal solution to a relaxed problem is
an admissible heuristic for the original problem
• If the rules of the 8-puzzle are relaxed so that a tile can
move anywhere, then h1(n) gives the shortest solution
• If the rules are relaxed so that a tile can move to any
near square, then h2(n) gives the shortest solution
47. Jarrar © 2014 47
Admissible Heuristics
How can you invent a good admissible heuristic function?
Try to relax the problem, from which an optimal solution can be
found easily.
Learn from experience.
Can machines invite an admissible heuristic automatically?
49. Jarrar © 2014 49
Local Search Algorithms
• In many optimization problems, the path to the goal is
irrelevant; the goal state itself is the solution.
• State space = set of "complete" configurations.
• Find configuration satisfying constraints, e.g., n-queens.
• In such cases, we can use local search algorithms.
– keep a single "current" state, try to improve it
according to an objective function
– Advantages:
1. Uses little memory
2. Finds reasonable solutions in large infinite spaces
50. Jarrar © 2014 50
Local Search Algorithms
• Local search can be used on problems that can be formulated as
finding a solution maximizing a criterion among a number of candidate
solutions.
• Local search algorithms move from solution to solution in the space of
candidate solutions (the search space) until a solution deemed optimal
is found or a time bound is elapsed.
• For example, the travelling salesman problem, in which a solution is a
cycle containing all nodes of the graph and the target is to minimize the
total length of the cycle. i.e. a solution can be a cycle and the criterion
to maximize is a combination of the number of nodes and the length of
the cycle.
• A local search algorithm starts from a candidate solution and then
iteratively moves to a neighbor solution.
51. Jarrar © 2014 51
Local Search Algorithms
• If every candidate solution has more than one neighbor solution; the
choice of which one to move to is taken using only information about
the solutions in the neighborhood of the current one.
• hence the name local search.
• Terminate on a time bound or if the situation is not improved after
number of steps.
• Local search algorithms are typically incomplete algorithms, as the
search may stop even if the best solution found by the algorithm is not
optimal.
52. Jarrar © 2014 52
Hill-Climbing Search
• continually moves in the direction of increasing value (i.e., uphill).
• Terminates when it reaches a “peak”, no neighbor has a higher value.
• Only records the state and its objective function value.
• Does not look ahead beyond the immediate
• Problem: can get stuck in local maxima,
• Its success depends very much on the shape of the state-space
land- scape: if there are few local maxima, random-restart hill
climbing will find a good solution very quickly.
Sometimes called
Greedy Local Search
53. Jarrar © 2014 53
Example: n-queens
• Put n queens on an n × n board with no two
queens on the same row, column, or diagonal.
• Move a queen to reduce number of conflicts.
54. Jarrar © 2014 54
Hill-climbing search: 8-queens problem
• h = number of pairs of queens that are attacking each other, either
directly or indirectly (h = 17 for the above state)
Each number indicates h if we move
a queen in its corresponding column
55. Jarrar © 2014 55
Hill-climbing search: 8-queens problem
A local minimum with h = 1
56. Jarrar © 2014 56
Simulated Annealing Search
• To avoid being stuck in a local maxima, It tries randomly (using a
probability function) to move to another state, if this new state is better
it moves into it, otherwise try another move… and so on.
• In other word, at each step, the SA heuristic considers some neighbour
s' of the current state s, and probabilistically decides between moving
the system to state s' or staying in state sit moves.
• Terminates when finding an acceptably good solution in a fixed
amount of time, rather than the best possible solution.
• Locating a good approximation to the global minimum of a given
function in a large search space.
• Widely used in VLSI layout, airline scheduling, etc.
Based on [1]
57. Jarrar © 2014 57
Properties of Simulated Annealing Search
• The problem with this approach is that the neighbors of a state are not
guaranteed to contain any of the existing better solutions which means
that failure to find a better solution among them does not guarantee
that no better solution exists.
• It will not get stuck to a local optimum
• If it runs for an infinite amount of time, the global optimum will be
found.
58. Jarrar © 2014 58
Genetic Algorithms
• Inspired by evolutionary biology such as inheritance.
• Evolves toward better solutions.
• A successor state is generated by combining two parent states.
• Start with k randomly generated states (population).
59. Jarrar © 2014 59
Genetic Algorithms
• A state is represented as a string over a finite alphabet (often a string
of 0s and 1s)
• Evaluation function (fitness function). Higher values for better states.
• Produce the next generation of states by selection, crossover, and
mutation.
• Commonly, the algorithm terminates when either a maximum number
of generations has been produced, or a satisfactory fitness level has
been reached for the population.
60. Jarrar © 2014 60
• Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 = 28)
• 24/(24+23+20+11) = 31%
• 23/(24+23+20+11) = 29% etc
fitness:
#non-attacking queens
probability of being
regenerated
in next generation
Genetic Algorithms
61. Jarrar © 2014 61
Genetic Algorithms
[2,6,9,3,5 0,4,1,7,8] [3,6,9,7,3 8,0,4,7,1]
[2,6,9,3,5,8,0,4,7,1] [3,6,9,7,3,0,4,1,7,8]
62. Jarrar © 2014 62
References
• [1] S. Russell and P. Norvig: Artificial Intelligence: A Modern Approach Prentice Hall,
2003, Second Edition
• [2] http://en.wikipedia.org/wiki/SMA*
• [3] Moonis Ali: Lecture Notes on Artificial Intelligence
http://cs.txstate.edu/~ma04/files/CS5346/SMA%20search.pdf
• [4] Max Welling: Lecture Notes on Artificial Intelligence
https://www.ics.uci.edu/~welling/teaching/ICS175winter12/A-starSearch.pdf
• [5] Kathleen McKeown: Lecture Notes on Artificial Intelligence
http://www.cs.columbia.edu/~kathy/cs4701/documents/InformedSearch-AR-print.ppt
• [6] Franz Kurfess: Lecture Notes on Artificial Intelligence
http://users.csc.calpoly.edu/~fkurfess/Courses/Artificial-Intelligence/F09/Slides/3-
Search.ppt