A “search algorithm” accepts a search problem and returns a search solution
For now, we’re going to use trees to express our state-space graph
Abstract Romania
Tree Romania
The root node is the initial state. We can “expand” a node, populating (or generating) its child nodes via a result function.
That’s the core idea of search: get our options, follow up on one, and set the other options aside for later.
The set of unexpanded nodes is called the frontier
Graph Romania
Rectangular Grid Graph Search
The “interior” and “exterior” nodes are separated by the “frontier” nodes.
A general approach is best-first search where some node \(n\) is chosen that has the minimum value of an evaluation function \(f(n)\) of all available nodes.
If that node’s state is the goal state, return it; otherwise, expand the node and add its child nodes to the frontier (if those states haven’t been reached already). If a state has been reached before but the new path to it is shorter (less costly), the node is re-added with the less expensive path.
This algorithm will return either the goal node or some indication of failure.
By tweaking \(f(n)\), we get different specific algorithms
Best First
Our algorithm requires some data structure to hold our search tree.
Our nodes need four basic components:
node.STATE: what state does our node represent?
node.PARENT: node which generated this node
node.ACTION: the action that was applied to generate this node
node.PATH-COST: total cost of the path from the root node (initial state) to this node ( \(g(node)\) )
If we follow the parent pointers from the goal node back to the root, we get our path!
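A minimal sketch of such a node in Python (the field names mirror the components above; `solution` is a hypothetical helper I'm adding just to show how the path is read back out):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    state: Any                       # the state this node represents
    parent: Optional["Node"] = None  # node that generated this node
    action: Any = None               # action applied to the parent to get here
    path_cost: float = 0.0           # g(node): cost from the root to this node

def solution(node: Node) -> list:
    """Follow parent pointers from the goal node back to the root,
    then reverse to get the sequence of actions (the path)."""
    actions = []
    while node.parent is not None:
        actions.append(node.action)
        node = node.parent
    return list(reversed(actions))
```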
We also need a data structure for the frontier (a queue is a good choice), with the following operations:
IS-EMPTY(frontier): return true if frontier is empty
POP(frontier): remove top node and return it
TOP(frontier): return (but don’t remove) top node of frontier
ADD(node, frontier): insert node into queue
There are different types of queues (three are used for searches):
Priority Queue: first pops node with minimum cost (best-first search)
FIFO queue: first pops the node added first (breadth-first search)
LIFO queue (also called a stack): first pops the node most recently added (depth-first search)
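A rough illustration of the three disciplines with the standard library, reusing the Node sketch above (the `id()` entry is just a tiebreaker so equal-cost nodes are never compared directly):

```python
import heapq
from collections import deque

a = Node(state="A", path_cost=1.0)
b = Node(state="B", path_cost=2.0)

# Priority queue: pops the minimum-cost node first (best-first search)
pq = []
heapq.heappush(pq, (b.path_cost, id(b), b))   # ADD
heapq.heappush(pq, (a.path_cost, id(a), a))
_, _, cheapest = heapq.heappop(pq)            # POP -> a (lowest cost)

# FIFO queue: pops the node added first (breadth-first search)
fifo = deque([a, b])
oldest = fifo.popleft()                       # POP -> a

# LIFO queue (stack): pops the node added most recently (depth-first search)
lifo = [a, b]
newest = lifo.pop()                           # POP -> b
```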
Reached states can be stored in some lookup table (dictionary or hash table), where the key is a state and the value is the node for that state
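Putting these pieces together, here is a hedged sketch of best-first search. It assumes a `problem` object with `initial_state`, `is_goal(state)`, `actions(state)`, `result(state, action)`, and `action_cost(s, a, s2)`; those names are illustrative, not a fixed API.

```python
import heapq
import itertools

def best_first_search(problem, f):
    """Expand the frontier node with minimum f(n); keep a table of reached states."""
    node = Node(state=problem.initial_state)
    counter = itertools.count()                      # tiebreaker for equal f values
    frontier = [(f(node), next(counter), node)]      # priority queue of (f, tie, node)
    reached = {node.state: node}                     # state -> cheapest node found so far
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node.state):              # late goal test, on expansion
            return node
        for action in problem.actions(node.state):
            s2 = problem.result(node.state, action)
            cost = node.path_cost + problem.action_cost(node.state, action, s2)
            if s2 not in reached or cost < reached[s2].path_cost:
                child = Node(state=s2, parent=node, action=action, path_cost=cost)
                reached[s2] = child                  # remember (or replace) the cheaper path
                heapq.heappush(frontier, (f(child), next(counter), child))
    return None                                      # failure
```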
Let’s look at the tree again:
Graph Romania
Notice there’s a path from Arad to Sibiu and back to Arad
Arad is a repeated state in the search tree resulting from a cycle (loopy path)
This means that even with only 20 states, we already have an infinite complete search tree.
A cycle is a special case of a redundant path: reaching the same state again through more costly means
Let’s talk about a \(10 \times 10\) grid world:
We can reach any of the 100 spaces in 9 or fewer moves, but there are ~\(8^9\) paths of length 9 (>100M), an average of about 1M paths per space. So… we can speed up our search by a factor of about 1M by eliminating redundant paths.
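Roughly, the arithmetic behind those numbers (ignoring paths that run into the edge of the grid):

\[
8^9 = 134{,}217{,}728 \approx 1.3 \times 10^8 \text{ paths}, \qquad \frac{1.3 \times 10^8}{100 \text{ spaces}} \approx 1.3 \times 10^6 \text{ paths per space}
\]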
So how can we address this?
A. Remember all reached states (like best-first). This is good when there are many redundant paths, especially when the table of reached states can fit in memory.
B. Don’t worry about repeated states. This works when the problem domain doesn’t have repeated states (or they are rare), or when memory is constrained.
C. Detect cycles but not redundant paths. This costs more computation but requires no extra memory: we can either follow the entire parent chain, or only check a few links (see the sketch below).
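For option C, a minimal sketch of cycle checking by walking the parent chain, with an optional limit on how many links to follow:

```python
def is_cycle(node, depth_limit=None):
    """Return True if node.state already appears among node's ancestors.
    With depth_limit set, only that many parent links are checked."""
    ancestor, checked = node.parent, 0
    while ancestor is not None:
        if ancestor.state == node.state:
            return True                 # loopy path detected
        checked += 1
        if depth_limit is not None and checked >= depth_limit:
            return False                # gave up early: only a few links checked
        ancestor = ancestor.parent
    return False
```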
How might we measure performance? Four ways:
Completeness: Does the algorithm find a solution when one exists, and correctly report failure when none does?
Cost optimality: Does it find the lowest-cost path?
Time complexity: This could be measured in actual time, or just in steps taken
Space complexity: How much memory does it need?
Completeness for finite problems is easier: as long as we account for loops and keep track of paths, we can explore every reachable state.
Infinite state spaces are much harder: we can keep reaching new states forever while leaving other parts of the space entirely unexplored.
This requires a systematic approach… consider the infinite 2D grid again. However, if there is no solution, even a sound algorithm will search forever
(reminder: we don’t know how close to the goal we are)
Breadth-first search is a systematic strategy that works in infinite spaces.
A FIFO queue keeps the nodes in order and allows for early goal testing, as opposed to the late goal testing of best-first search.
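A minimal breadth-first sketch with the early goal test, under the same assumed `problem` interface as before (and assuming unit action costs):

```python
from collections import deque

def breadth_first_search(problem):
    """Expand the shallowest nodes first, testing for the goal as nodes are generated."""
    node = Node(state=problem.initial_state)
    if problem.is_goal(node.state):
        return node
    frontier = deque([node])            # FIFO queue
    reached = {node.state}              # a set is enough: all paths to a state cost the same
    while frontier:
        node = frontier.popleft()
        for action in problem.actions(node.state):
            s2 = problem.result(node.state, action)
            child = Node(state=s2, parent=node, action=action,
                         path_cost=node.path_cost + 1)   # unit action costs assumed
            if problem.is_goal(s2):     # early goal test: check on generation
                return child
            if s2 not in reached:
                reached.add(s2)
                frontier.append(child)
    return None                         # failure
```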
Let’s draw it!
This always finds a solution (should it exist) with a minimal number of actions. However, watch how the states grow…
\[ 1+b+b^2+b^3+...+b^d=O(b^d) \]
YIKES
For example, consider a problem with branching factor \(b=10\), processing 1M nodes/s, and 1 KB/node. Searching to depth \(d=10\) would take < 3h, but 10TB of memory!
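Roughly where those numbers come from, taking the node count as about \(b^d = 10^{10}\) (the lower-order terms of the sum change this only slightly):

\[
\frac{10^{10} \text{ nodes}}{10^{6} \text{ nodes/s}} = 10^{4} \text{ s} \approx 2.8 \text{ h}, \qquad 10^{10} \text{ nodes} \times 1 \text{ KB} = 10 \text{ TB}
\]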
Memory requirements dominate BFS
But at \(d=14\), even assuming infinite memory, the search would take 3.5y.
In general, exponential complexity problems can’t be solved by an uninformed search, unless the space is very small.
When actions have different costs, we can use best-first search with the evaluation function being the path cost to the current node; this is uniform-cost search.
This algorithm checks for goals upon node expansion rather than node generation
Uniform Cost Romania
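In terms of the best-first sketch above, uniform-cost search is just a choice of evaluation function (same assumed `problem` interface):

```python
def uniform_cost_search(problem):
    """Best-first search ordered by f(n) = g(n), the path cost so far."""
    return best_first_search(problem, f=lambda node: node.path_cost)
```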
The complexity of Uniform Cost is in terms of \(C^*\) (cost of optimal solution) and \(\epsilon\) (lower bound on cost per action) with \(\epsilon > 0\)
The worst case time and space complexity is:
\[ O(b^{1+\lfloor C^* / \epsilon \rfloor}) \]
which might be greater than \(b^d\). Uniform-cost search can explore large trees of low-cost actions first, rather than taking a leap to large-cost but optimal actions.
When all costs are equal, it’s just \(O(b^{d+1})\)
Let’s draw!
For finite trees, depth-first search is efficient and complete; for acyclic state spaces, it can re-expand the same state many times but will still explore the whole space; with cycles, it is prone to infinite loops.
For infinite spaces, it can just barrel down an infinite path and never return, so it is incomplete.
Where it is appropriate, it is very memory efficient (no reached table, and the frontier stays small). It can be made even more memory efficient with its backtracking variant.
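A minimal depth-first sketch using a LIFO frontier and the `is_cycle` helper from earlier (same assumed `problem` interface):

```python
def depth_first_search(problem):
    """Expand the deepest node first via a LIFO frontier. No reached table is kept,
    so memory stays roughly proportional to the current path (times branching)."""
    frontier = [Node(state=problem.initial_state)]    # LIFO stack
    while frontier:
        node = frontier.pop()
        if problem.is_goal(node.state):
            return node
        if is_cycle(node):               # catches loopy paths, not all redundant paths
            continue
        for action in problem.actions(node.state):
            s2 = problem.result(node.state, action)
            cost = node.path_cost + problem.action_cost(node.state, action, s2)
            frontier.append(Node(state=s2, parent=node, action=action, path_cost=cost))
    return None                          # failure
```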