Solving Problems by Searching

When the correct action isn’t immediately apparent, an agent may have to plan ahead: that is, consider a hypothetical sequence of actions that reaches a goal. This process is called search.

Problem-solving agents use atomic representations of states, whereas agents that use factored or structured representations are called planning agents.

For now, we’ll consider the simplest environments: episodic, single agent, fully observable, deterministic, static, discrete, and known.

We will still distinguish between informed and uninformed algorithms, depending on whether the agent can estimate how far it is from the goal.

Problem-Solving Agents

Imagine an agent that is touring around Romania

Our agent is currently in Arad and needs to reach Bucharest.

Our agent observes three street signs leading out of Arad: Sibiu, Timisoara, and Zerind

If our environment is unknown then we can only guess (covered in Ch. 4)

However, let’s assume that we have knowledge of the world (a map)

Abstract Romania

With a known environment, we can adopt a four-phase problem solving process

Goal formulation: If we limit our objectives we limit our actions (reach Bucharest)
Problem formulation: Produce a description of states and actions needed to reach our goal, that is an abstract model of the world
Search: Simulate sequences of actions and their resulting states until one reaches the goal; such a sequence is called a solution. The agent may have to try many sequences that don’t reach the goal, but it will eventually find a solution (or find that there is no solution)
Execution: Actually play out the solution

For any fully observable, deterministic, known environment, the solution to any problem is a fixed sequence of actions. Once a solution has been found, we do not have to adjust it while it is being executed; this is called an “open loop” system, because the agent can ignore its percepts during execution.
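The search and execution phases can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the `ROADS` dictionary is a simplified, assumed fragment of the map, `search` uses breadth-first expansion as a stand-in strategy (the notes have not committed to one yet), and `execute_open_loop` is a hypothetical helper that plays the plan out without consulting any percepts.

```python
from collections import deque

# Assumed, simplified fragment of the road map: city -> neighbouring cities.
ROADS = {
    "Arad": ["Sibiu", "Timisoara", "Zerind"],
    "Sibiu": ["Arad", "Fagaras", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Rimnicu Vilcea": ["Sibiu", "Pitesti"],
    "Pitesti": ["Rimnicu Vilcea", "Bucharest"],
    "Timisoara": ["Arad"],
    "Zerind": ["Arad"],
    "Bucharest": ["Fagaras", "Pitesti"],
}


def search(start, goal):
    """Simulate drives breadth-first; return the list of cities to visit, or None."""
    frontier = deque([(start, [])])  # (current city, plan that got us here)
    reached = {start}
    while frontier:
        city, plan = frontier.popleft()
        if city == goal:
            return plan
        for nxt in ROADS[city]:
            if nxt not in reached:
                reached.add(nxt)
                frontier.append((nxt, plan + [nxt]))
    return None  # every reachable city was examined: no solution


def execute_open_loop(start, plan):
    """Play out a fixed plan step by step, never looking at percepts."""
    city = start
    for destination in plan:
        city = destination  # deterministic world: each drive always succeeds
    return city


plan = search("Arad", "Bucharest")
print(plan)
print(execute_open_loop("Arad", plan))
```

Breadth-first expansion finds a plan with the fewest drives, which is not necessarily the shortest in kilometres.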

If there’s even the chance that the environment could be nondeterministic, or the model is incorrect, we’d be better off adopting a “closed loop” approach.

Does this describe our environment (travel agent)?

Search Problem and Solution

Let’s talk about the search problem:

A set of possible states of the environment, called the State Space
The initial state (Arad)
A set of one (or more!) goal states. Or a property that could apply to many states
The actions available to take. \(Actions(Arad)=\{ToSibiu,ToTimisoara,ToZerind\}\)
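A transition model that describes what each action does, that is, the state that results from taking an action in a given state. \(Result(Arad,ToZerind)=Zerind\)
An action cost function \(\text{Action-Cost}(s,a,s')\), which gives the cost of taking action \(a\) in state \(s\) to reach state \(s'\) and should reflect our performance measure. What is an appropriate cost for our Travel Agent’s actions?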
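The components listed above map naturally onto a small class. The following is a sketch under stated assumptions: the class name `RouteProblem`, its method names, and the `ROAD_KM` table are illustrative choices, and the distances (in km) are assumed values for three roads out of Arad, since the notes do not list any.

```python
# Assumed road distances in km for the three roads leaving Arad.
ROAD_KM = {
    ("Arad", "Sibiu"): 140,
    ("Arad", "Timisoara"): 118,
    ("Arad", "Zerind"): 75,
}


class RouteProblem:
    """A search problem: initial state, goal, actions, transitions, costs."""

    def __init__(self, initial, goal, road_km):
        self.initial = initial
        self.goal = goal
        # Roads are two-way, so store each distance in both directions.
        self.km = {}
        for (a, b), dist in road_km.items():
            self.km[(a, b)] = dist
            self.km[(b, a)] = dist

    def actions(self, s):
        """Actions(s): one 'ToCity' action per road leaving s."""
        return sorted("To" + b for (a, b) in self.km if a == s)

    def result(self, s, a):
        """Result(s, a): the city reached by taking action a in s."""
        if a not in self.actions(s):
            raise ValueError(f"{a} is not available in {s}")
        return a[2:]  # strip the leading "To"

    def action_cost(self, s, a, s2):
        """Action-Cost(s, a, s'): here, simply the road length."""
        return self.km[(s, s2)]

    def is_goal(self, s):
        return s == self.goal


problem = RouteProblem("Arad", "Bucharest", ROAD_KM)
print(problem.actions("Arad"))
print(problem.result("Arad", "ToZerind"))
```

Using road length as the cost is only one option; travel time or fuel would be equally valid if they matched the performance measure better.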

A sequence of actions forms a path; a solution is a path that leads from the initial state to a goal state

An optimal solution has the lowest path cost among all solutions (for now, we’ll assume that all action costs are positive).
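To make "lowest path cost" concrete, here is a small worked comparison of two routes from Arad to Bucharest. The distances in `ROAD_KM` are assumed values in km (the notes do not give any), and `path_cost` is a hypothetical helper that sums the action costs along a path.

```python
# Assumed road distances in km (illustrative values, not from the notes).
ROAD_KM = {
    ("Arad", "Sibiu"): 140,
    ("Sibiu", "Fagaras"): 99,
    ("Fagaras", "Bucharest"): 211,
    ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Rimnicu Vilcea", "Pitesti"): 97,
    ("Pitesti", "Bucharest"): 101,
}


def path_cost(path):
    """Sum the cost of each step between consecutive cities on the path."""
    return sum(ROAD_KM[(a, b)] for a, b in zip(path, path[1:]))


via_fagaras = ["Arad", "Sibiu", "Fagaras", "Bucharest"]
via_pitesti = ["Arad", "Sibiu", "Rimnicu Vilcea", "Pitesti", "Bucharest"]

print(path_cost(via_fagaras))  # 450
print(path_cost(via_pitesti))  # 418
```

Under these assumed distances, the route with more steps is the cheaper one, so counting actions and minimizing cost can disagree.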

We can represent state space as a graph in which the vertices are states and the (directed) edges are actions

Formulating Problems

Problem formulation is one of the tricky parts: there is no closed-form recipe for doing it well

Our model is just that: a mathematical model, not the real world.

Removing detail from a representation is called abstraction. What details are we abstracting away?

What happens if we misstep and remove relevant detail?

What’s the appropriate level of detail?

What makes an abstraction valid? What makes it useful?

Example Problems

Let’s talk about some example problems.

A standardized problem (sometimes called a toy problem) is a problem with an exact (and short) description, meant to illustrate a concept and/or test an algorithm

A real-world problem is one that is actually intended to be used and contains non-standard elements (e.g. different sensors produce different data)

Standardized Problems

A Grid World is a 2D array of square cells in which agents move around; cells can contain objects that the agent can interact with. Is Vacuum World a Grid World?

States: How many states can there be?
- With two cells: 2 agent positions, and each cell is either dirty or clean, so \(2*2*2=8\). In general, a vacuum world with \(n\) cells has \(n*2^n\) states
Initial state: Any state could qualify
Actions: Suck, Left, Right (absolute movement), Forward, Backwards, TurnRight, TurnLeft (egocentric movement)
Transition: Suck removes any dirt from the agent’s cell; Forward, Backward, etc. change the agent’s position or heading
Goal states: Any state in which every cell is clean
Action cost: each action costs 1
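The state count can be checked by enumeration. This is a minimal sketch that assumes a state is a pair `(agent_position, dirt)`, where `dirt` is a tuple of booleans with one entry per cell; the function names are illustrative.

```python
from itertools import product


def vacuum_states(n):
    """Every (agent position, dirt configuration) pair for n cells."""
    return [
        (pos, dirt)
        for pos in range(n)
        for dirt in product((False, True), repeat=n)
    ]


def suck(state):
    """Transition for Suck: clean the cell the agent is standing on."""
    pos, dirt = state
    cleaned = dirt[:pos] + (False,) + dirt[pos + 1:]
    return (pos, cleaned)


def is_goal(state):
    """Goal test: no cell is dirty."""
    _, dirt = state
    return not any(dirt)


for n in (2, 3, 4):
    print(n, len(vacuum_states(n)), n * 2 ** n)
```

For the two-cell world, two of the eight states are goal states: everything is clean and the agent stands in either cell.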

What about other grid worlds?

Sokoban, anyone? For a world with \(n\) non-obstacle cells and \(b\) boxes, there are \(n \times n!/(b!(n-b)!)\) states, so for an 8x8 grid with 12 boxes the number of states exceeds 200 trillion!
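The Sokoban figure is easy to verify. The sketch below reads the count as \(n\) agent positions times the number of ways to choose \(b\) box cells out of \(n\), that is \(n \times n!/(b!(n-b)!)\); the function name is an illustrative choice, and the 8x8 grid is assumed to have no obstacle cells.

```python
import math


def sokoban_states(n, b):
    """n agent positions times C(n, b) ways to place b boxes on n cells."""
    return n * math.comb(n, b)


count = sokoban_states(8 * 8, 12)
print(f"{count:,}")
print(count > 200 * 10 ** 12)
```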

Sliding Tile Puzzle

Some number of tiles are arranged on a grid containing blank spaces where some of the tiles can slide into the blank spaces.

8-puzzle: 8 tiles on a 3x3 grid and one blank space

States:
- A state description specifies the arrangement of tiles
Initial State:
- Any state, though a parity property partitions the state space such that any given goal can be reached from exactly half of the possible initial states
Actions:
- The blank space “slides” Left, Right, Up, or Down on the grid; if the blank is on an edge or in a corner, not all slides are available
Transition:
- “Sliding” swaps the blank with the adjacent tile in the direction of the move
Goal:
- Any state can be chosen as the goal (usually the one with the numbers in order)
Cost: 1 per action
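The 8-puzzle formulation above can be sketched as follows. The encoding is an assumption made for illustration: a state is a tuple of nine numbers in row-major order with `0` standing for the blank, and actions are named after the direction the blank moves. `inversion_parity` illustrates the parity property: on a 3x3 board no slide changes it, so a start state can reach a goal only if the two parities match.

```python
# Index offsets for moving the blank in a row-major 3x3 tuple.
OFFSET = {"Up": -3, "Down": 3, "Left": -1, "Right": 1}


def actions(state):
    """Directions the blank (0) can slide from its current position."""
    row, col = divmod(state.index(0), 3)
    moves = []
    if row > 0:
        moves.append("Up")
    if row < 2:
        moves.append("Down")
    if col > 0:
        moves.append("Left")
    if col < 2:
        moves.append("Right")
    return moves


def result(state, action):
    """Swap the blank with the neighbouring tile in the given direction."""
    if action not in actions(state):
        raise ValueError(f"{action} is not available in this state")
    i = state.index(0)
    j = i + OFFSET[action]
    tiles = list(state)
    tiles[i], tiles[j] = tiles[j], tiles[i]
    return tuple(tiles)


def inversion_parity(state):
    """Parity of the number of tile pairs that are out of order (blank ignored)."""
    tiles = [t for t in state if t != 0]
    inversions = sum(
        1
        for a in range(len(tiles))
        for b in range(a + 1, len(tiles))
        if tiles[a] > tiles[b]
    )
    return inversions % 2


goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print(actions(goal))               # blank in a corner: two slides
print(result(goal, "Right"))
print(inversion_parity(goal))
```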

What abstractions did we assume?

Can we have infinite state spaces?

Donald Knuth conjectured in 1964 that, starting with the integer 4, a sequence of square root, floor, and factorial operations can reach any desired positive integer.

\[ \lfloor{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{(4!)!}}}}}}\rfloor = 5 \]

States: Positive real numbers
Initial State: 4
Actions: square root, floor, factorial (integers only)
Transition: Math definitions
Goal: Some desired positive integer
Cost: 1 per action

We have to go through some very large numbers! What is (4!)!?
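The example can be checked numerically. This is a small sketch using only the standard library: the factorial is computed exactly with integers, while the square roots are taken in floating point, which is accurate enough here because the final value is not close to an integer boundary.

```python
import math

# (4!)! = 24!, computed exactly with Python integers.
big = math.factorial(math.factorial(4))
print(big)

# Apply square root five times, then floor.
value = float(big)
for _ in range(5):
    value = math.sqrt(value)

answer = math.floor(value)
print(answer)
```

The intermediate state \((4!)!\) has 24 digits, even though both the start (4) and the target (5) are tiny.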

Infinite state spaces like this come up in tasks involving math expressions, circuits, proofs, programs, and other recursively defined objects.

Real-world problems

We’ve seen our Tour-agent, what about air travel?

States: A location, current time, and retained information about past transitions
Initial State: Home airport
Actions: Take any flight, from current location, in any seat class, leaving after the current time, leaving enough time for intra-airport transfer (if applicable)
Transition: Move from airport and current time to next airport and future time
Goal: Some destination city, maybe other constraints (nonstop flight, not through certain airport)
Cost: Money, waiting time, flight time, customs/immigration (if applicable), seat quality, time of day, type of airplane, airline reward points, etc.

Any sort of routing can function like this (networking, military planning…)

Touring problems: the Traveling Salesman Problem is one variety; school bus routing is another

VLSI Layout: cell layout and channel routing of electronic chips

Robot navigation: self driving vehicles, search and rescue

Automatic Assembly Sequencing: manufacturing and protein design