Perhaps treating states as more than black boxes can lead to a deeper understanding (and more search strategies)
Review:
Chapters 3 and 4: problems can be represented as graphs (edges are actions, vertices are states)
Domain-specific heuristics help the search, but each state is treated as atomic (a single indivisible object with no structure)
Now we’ll look at a factored representation for each state: a set of variables, each of which has a value. A problem is solved when every variable has a value that satisfies all of the constraints on it. We call these Constraint Satisfaction Problems (CSPs).
CSPs use general rather than domain-specific heuristics. We can eliminate parts of the search space that violate our constraints.
Let’s look at the components of a CSP: \(\mathcal{X}\), \(\mathcal{D}\), and \(\mathcal{C}\) (the variables, their domains, and the constraints)
A domain \(D_i\) describes the valid values for a single variable. A boolean variable would have the domain \(\{True, False\}\), for example.
Different variables may have domains of different sizes.
Each problem can have multiple constraints \(\mathcal{C}_i\), each of which consists of a pair \(\langle scope,rel\rangle\)
where scope is the tuple of participating variables, and rel is the relation that defines the values the variables in the scope can take. We can describe the relation explicitly, as a set of valid value tuples, or implicitly, with an expression that can compute whether a tuple is valid.
Consider two variables \(X_1,X_2\), both with the domain \(\{1,2,3\}\). A constraint saying that \(X_1\) must be greater than \(X_2\) may be written as:
\[ \langle (X_1,X_2),X_1\gt X_2\rangle \]
Can you write this constraint the other way, as an explicit set of valid tuples?
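Once you have tried it by hand, the two forms can be compared with a minimal Python sketch. The names `rel_implicit` and `rel_explicit` are illustrative only and do not come from the notes.

```python
from itertools import product

# Domain shared by X1 and X2, as in the example above.
domain = [1, 2, 3]

# Implicit form: the relation is a predicate that tests one pair of values.
def rel_implicit(x1, x2):
    return x1 > x2

# Explicit form: the relation is the set of every allowed (x1, x2) pair.
rel_explicit = {
    (x1, x2) for x1, x2 in product(domain, repeat=2) if rel_implicit(x1, x2)
}

print(sorted(rel_explicit))  # [(2, 1), (3, 1), (3, 2)]
```

The explicit form is only practical when the domains are small and finite; the implicit form works for any domain on which the expression can be evaluated.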
CSPs deal with assignments to variables, \(\{X_i=v_i,X_j=v_j,...\}\).
An assignment that does not violate any constraints is called a consistent or legal assignment. A complete assignment is one where every variable is assigned a value, and a solution to a CSP is a consistent, complete assignment… right?
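These definitions can be sketched in a few lines of Python, using the \(X_1 > X_2\) constraint from above. The function names are hypothetical, and the sketch assumes the usual convention that a constraint is only checked once every variable in its scope has a value.

```python
# A constraint is a (scope, rel) pair: scope is a tuple of variable names,
# rel is a predicate over the values of those variables, in scope order.
variables = ["X1", "X2"]
constraints = [(("X1", "X2"), lambda x1, x2: x1 > x2)]

def is_consistent(assignment):
    """True if no constraint with a fully assigned scope is violated."""
    for scope, rel in constraints:
        if all(var in assignment for var in scope):
            if not rel(*(assignment[var] for var in scope)):
                return False
    return True

def is_complete(assignment):
    """True if every variable has been given a value."""
    return all(var in assignment for var in variables)

def is_solution(assignment):
    """A solution is an assignment that is both complete and consistent."""
    return is_complete(assignment) and is_consistent(assignment)

print(is_consistent({"X1": 3}))         # True: consistent but partial
print(is_solution({"X1": 3}))           # False: not complete
print(is_solution({"X1": 3, "X2": 1}))  # True
print(is_solution({"X1": 1, "X2": 2}))  # False: violates X1 > X2
```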
In general, solving a CSP is an NP-complete problem… but there are several subclasses of CSPs we can solve efficiently!
Let’s consider a famous application of CSP: when drawing and coloring maps, it is standard to color the regions such that no two touching regions are the same color.
Australia Regions and Constraint Graph
We’ll consider each region to be a variable:
\[ \mathcal{X}=\{WA,NT,Q,NSW,V,SA,T\} \]
The domain of each variable is \(D_i=\{red,green,blue\}\)
How many constraints do we need?
Go try!
\[ \mathcal{C}=\{SA\ne WA,SA\ne NT, SA\ne Q,SA\ne NSW, SA\ne V,\\ WA\ne NT, NT\ne Q, Q\ne NSW, NSW\ne V\} \]
Where each of these abbreviations is equivalent to its longer-form counterpart:
\(SA\ne WA \equiv\langle(SA,WA),SA\ne WA\rangle\)
There are many possible solutions to this problem:
\[ \{WA=red,NT=green,Q=red,NSW=green,V=red,SA=blue,T=red\} \]
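As a quick check of the formulation, here is a brute-force sketch in Python that enumerates every complete assignment and keeps the consistent ones. It is meant only to confirm the constraints, not as a practical solver; the variable and function names are illustrative.

```python
from itertools import product

variables = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
colors = ["red", "green", "blue"]

# Each pair stands for the binary constraint "these two regions differ".
neighbors = [
    ("SA", "WA"), ("SA", "NT"), ("SA", "Q"), ("SA", "NSW"), ("SA", "V"),
    ("WA", "NT"), ("NT", "Q"), ("Q", "NSW"), ("NSW", "V"),
]

def is_solution(assignment):
    """A complete assignment is a solution if no constraint is violated."""
    return all(assignment[a] != assignment[b] for a, b in neighbors)

# Try all 3^7 complete assignments and keep the consistent ones.
solutions = []
for values in product(colors, repeat=len(variables)):
    assignment = dict(zip(variables, values))
    if is_solution(assignment):
        solutions.append(assignment)

example = {"WA": "red", "NT": "green", "Q": "red", "NSW": "green",
           "V": "red", "SA": "blue", "T": "red"}
print(len(solutions))        # 18
print(example in solutions)  # True
```

Tasmania appears in no constraint, so each coloring of the mainland yields three solutions, one per color of T.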
It can be helpful to visualize our CSP as a constraint graph: the nodes are the variables, and an edge connects any two variables that participate in a constraint.
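A constraint graph for a binary CSP can be stored as an adjacency structure. The following is a small sketch for the Australia problem; the dictionary-of-sets layout is one possible choice, not something prescribed by the notes.

```python
variables = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]

# Scopes of the nine binary constraints from the map-coloring problem.
constraint_scopes = [
    ("SA", "WA"), ("SA", "NT"), ("SA", "Q"), ("SA", "NSW"), ("SA", "V"),
    ("WA", "NT"), ("NT", "Q"), ("Q", "NSW"), ("NSW", "V"),
]

# Nodes are variables; an edge joins two variables that share a constraint.
graph = {var: set() for var in variables}
for a, b in constraint_scopes:
    graph[a].add(b)
    graph[b].add(a)

for var in variables:
    print(var, sorted(graph[var]))
```

In this graph SA has five neighbors while T has none, which makes it easy to see that T can be colored independently of the mainland.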
Why formulate a problem in this way? Use your experience from the map exercise.
Normally (in an atomic space) we can only ask, “is this a goal state? How about this one?”
Consider \(\{SA=blue\}\): an atomic solver would have to consider \(3^5=243\) assignments for the 5 neighboring variables.
With constraints, blue is ruled out for each neighbor, so it’s only \(2^5=32\), a savings of \(87\%\) (bang!)
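The arithmetic can be reproduced with a short Python sketch that removes blue from each neighbor's domain and counts what is left. The variable names are illustrative.

```python
colors = ["red", "green", "blue"]
sa_neighbors = ["WA", "NT", "Q", "NSW", "V"]

# Without using the constraints, each neighbor may still take any color.
atomic_count = len(colors) ** len(sa_neighbors)

# With SA = blue, the constraint SA != neighbor removes blue from each domain.
pruned_domains = {n: [c for c in colors if c != "blue"] for n in sa_neighbors}
pruned_count = 1
for domain in pruned_domains.values():
    pruned_count *= len(domain)

savings = 1 - pruned_count / atomic_count
print(atomic_count, pruned_count, round(savings * 100))  # 243 32 87
```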
The simplest CSPs involve variables that have discrete, finite domains. Map-coloring and scheduling with time limits are both like this.
Another example is the 8-queens problem (which we looked at in chapter 4).
A discrete domain can also be infinite (like the set of integers or strings). If this is the case, then implicit constraints of the form \(T_1+d_1\leq T_2\) must be used (as opposed to explicit tuples). Special algorithms exist to solve linear constraints on integer variables (as above), but no algorithm exists for solving general nonlinear constraints on integer variables (the problem is undecidable).
CSPs may also have continuous domains, and they are quite common in real life!
Consider the task of scheduling observational experiments on the James Webb Space Telescope: the start and end times of each observation and each maneuver are all continuous-valued variables that are subject to a plethora of astronomical, procedural, and engineering constraints.
The most famous category of these problems is linear programming, where the constraints are linear equalities or inequalities (though other types, like quadratic programming, exist).
Just like the variables themselves, there are different types of constraints as well.
Perhaps the simplest is the unary constraint, which constrains the value of a single variable. For example:
\[ \langle (SA),SA\ne green \rangle \]
A binary constraint relates two variables, as we’ve seen.
A binary CSP is one with only unary and binary constraints, and can be represented as a constraint graph.
It’s also possible to have higher-order constraints, for example \(Between(X,Y,Z)\) may be defined as:
\[ \langle(X,Y,Z),X\lt Y\lt Z\ \mathbf{or}\ X\gt Y\gt Z\rangle \]
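To make the ternary relation concrete, here is a minimal Python sketch that writes \(Between\) as a predicate and lists the tuples it allows. The domain \(\{1,2,3\}\) is assumed purely for illustration.

```python
from itertools import product

def between(x, y, z):
    """Ternary relation from the definition above: X < Y < Z or X > Y > Z."""
    return x < y < z or x > y > z

# The constraint as a <scope, rel> pair over three variables.
between_constraint = (("X", "Y", "Z"), between)

# Assumed domain, used only to enumerate the allowed tuples.
domain = [1, 2, 3]
allowed = [t for t in product(domain, repeat=3) if between(*t)]
print(allowed)  # [(1, 2, 3), (3, 2, 1)]
```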
A constraint involving an arbitrary number of variables is called a global constraint (but it need not involve all variables).
One example of this is \(AllDiff\), which takes any number of variables and is satisfied when every variable has a different value. It is useful in games like Sudoku.
Consider cryptarithmetic puzzles, in which we’re given an arithmetic equation written in letters, where each letter stands for a distinct digit (and no leading zeroes are allowed). Let’s figure out the constraints on solving this problem!
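A minimal Python sketch of \(AllDiff\), alongside the equivalent set of pairwise binary constraints. The function names are illustrative, and the first version assumes the values are hashable.

```python
from itertools import combinations

def all_diff(*values):
    """Global constraint: satisfied when no two values are equal."""
    return len(set(values)) == len(values)

def all_diff_binary(*values):
    """The same condition written as one != constraint per pair of values."""
    return all(a != b for a, b in combinations(values, 2))

row = (5, 3, 4, 6, 7, 8, 9, 1, 2)  # one complete Sudoku row
print(all_diff(*row))              # True
print(all_diff(1, 2, 1))           # False

# A nine-cell row needs 9 * 8 / 2 = 36 pairwise constraints.
print(len(list(combinations(range(9), 2))))  # 36
```

A single \(AllDiff\) over a Sudoku row stands in for 36 binary inequality constraints.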
Cryptarithmetic and Constraint Hypergraph
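The notes do not name a specific puzzle, so the sketch below assumes the common textbook instance TWO + TWO = FOUR. It checks one constraint per column, using hypothetical auxiliary carry variables `c1`, `c2`, `c3`, together with \(AllDiff\) over the letters and the no-leading-zero rule. It is a brute-force check of the constraints, not an efficient solver.

```python
from itertools import permutations, product

letters = ["T", "W", "O", "F", "U", "R"]  # assumed puzzle: TWO + TWO = FOUR

def satisfies_columns(a, c1, c2, c3):
    """Column constraints, right to left, with carries c1, c2, c3."""
    return (
        a["O"] + a["O"] == a["R"] + 10 * c1
        and c1 + a["W"] + a["W"] == a["U"] + 10 * c2
        and c2 + a["T"] + a["T"] == a["O"] + 10 * c3
        and c3 == a["F"]
    )

solutions = []
# permutations() gives each letter a different digit, which enforces AllDiff.
for digits in permutations(range(10), len(letters)):
    a = dict(zip(letters, digits))
    if a["T"] == 0 or a["F"] == 0:
        continue  # no leading zeroes
    # Each carry is 0 or 1; accept if some choice satisfies every column.
    if any(satisfies_columns(a, *carries) for carries in product((0, 1), repeat=3)):
        solutions.append(a)

for a in solutions:
    two = 100 * a["T"] + 10 * a["W"] + a["O"]
    four = 1000 * a["F"] + 100 * a["O"] + 10 * a["U"] + a["R"]
    print(f"{two} + {two} = {four}")
```

Each column constraint involves more than two variables, which is why the figure is drawn as a hypergraph rather than an ordinary graph.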
It is possible to reduce any finite-domain constraint to a set of binary constraints (given enough auxiliary variables).
This means that any CSP can be converted to one which has only binary constraints (which might make algorithm development easier!).
One way to do this is via the dual graph transformation: create a new graph with one variable for each constraint in the original graph, and one binary constraint for each pair of original constraints that share variables.
For example: Consider a CSP with the variables \(\mathcal{X}=\{X,Y,Z\}\), each with the domain \(\{1,2,3,4,5\}\), and the two constraints \(C_1:\langle(X,Y,Z),X+Y=Z\rangle\) and \(C_2:\langle(X,Y),X+1=Y\rangle\).
This would result in a graph with \(\mathcal{X}=\{C_1,C_2\}\), where the domain of \(C_1\) is the set of \(\{(x_i,y_i,z_i)\}\) tuples from our original \(C_1\) constraint. What would be the domain of \(C_2\) ?
The dual graph would also have the binary constraint \(\langle (C_1,C_2),R_1\rangle\), where \(R_1\) is a relation that defines the constraint between \(C_1\) and \(C_2\). Take a minute or two and find this constraint!
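After working through it by hand, the result can be checked with a short Python sketch. It assumes that \(R_1\) requires the two tuples to agree on the shared original variables \(X\) and \(Y\); the names `dom_c1`, `dom_c2`, and `r1` are illustrative.

```python
from itertools import product

domain = range(1, 6)  # {1, 2, 3, 4, 5}

# Domain of dual variable C1: the (x, y, z) tuples that satisfy X + Y = Z.
dom_c1 = [(x, y, z) for x, y, z in product(domain, repeat=3) if x + y == z]

# Domain of dual variable C2: the (x, y) tuples that satisfy X + 1 = Y.
dom_c2 = [(x, y) for x, y in product(domain, repeat=2) if x + 1 == y]

# R1: pairs of tuples that agree on the shared variables X and Y.
r1 = [
    (t1, t2)
    for t1 in dom_c1
    for t2 in dom_c2
    if t1[0] == t2[0] and t1[1] == t2[1]
]

print(len(dom_c1), len(dom_c2))  # 10 4
print(r1)  # [((1, 2, 3), (1, 2)), ((2, 3, 5), (2, 3))]
```

Every pair in `r1` corresponds to one solution of the original CSP, here \((X,Y,Z)=(1,2,3)\) and \((2,3,5)\).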
Why might we want to use global constraints (like \(AllDiff\)) anyway?
Easier and less error prone to write
It’s possible to design special-purpose algorithms for global constraints that are more efficient than working with more primitive constraints
There are also preference constraints, as opposed to the absolute constraints that we’ve seen thus far.
Which features of class scheduling are absolute constraints, and which are preference constraints?
Preference constraints can often be encoded as costs on individual variable assignments, turning the problem into a constrained optimization problem (COP).
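A tiny, entirely hypothetical scheduling example can illustrate the idea: the absolute constraint filters out infeasible assignments, and the preference constraints are summed as costs to pick the best of what remains. The class names, time slots, and cost values below are invented for illustration.

```python
from itertools import product

variables = ["ClassA", "ClassB"]  # hypothetical classes sharing one room
slots = ["8am", "10am", "2pm"]    # hypothetical time slots

def consistent(assignment):
    """Absolute constraint: the classes share a room, so the slots differ."""
    return assignment["ClassA"] != assignment["ClassB"]

# Preference constraints as costs on individual variable assignments.
# Anything not listed costs 0.
cost_table = {("ClassA", "8am"): 2, ("ClassB", "2pm"): 1}

def cost(assignment):
    return sum(cost_table.get(item, 0) for item in assignment.items())

candidates = [dict(zip(variables, vals))
              for vals in product(slots, repeat=len(variables))]
feasible = [a for a in candidates if consistent(a)]

# The COP solution is a feasible assignment of minimum total cost.
best = min(feasible, key=cost)
print(best, cost(best))
```

Exhaustive enumeration only works for toy problems like this one; it is used here just to separate the two kinds of constraint.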