10/7/2014 — Constrainedness of Search, Toby Walsh, NICTA and UNSW
TRANSCRIPT
04/11/23
Constrainedness of Search
Toby Walsh
NICTA and UNSW
http://cse.unsw.edu.au/~tw
Motivation
Will a problem be satisfiable or unsatisfiable?
Will it be hard or easy?
How can we develop heuristics for a new problem?
Take home messages
Hard problems are often associated with a phase transition:
under-constrained, easy
critically constrained, hard
over-constrained, easier
Provide a definition of constrainedness
Predict the location of such phase transitions
Can be measured during search
Observe a beautiful “knife-edge”
Build heuristics to get off this knife-edge
Let’s start with the mother of all NP-complete problems!
3-SAT
Where are the hard 3-SAT problems? Sample randomly generated 3-SAT:
fix the number of clauses, l, and the number of variables, n
by definition, each clause has 3 literals
generate all possible clauses with uniform probability
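The sampling model above can be sketched in a few lines of Python (a minimal illustration; the function name and DIMACS-style signed-literal convention are my own choices, not from the talk):

```python
import random

def random_3sat(n, l, seed=None):
    """Sample a random 3-SAT instance: l clauses over n variables.

    Each clause picks 3 distinct variables uniformly and negates each
    with probability 1/2. Literals are signed integers (DIMACS-style):
    +i means x_i, -i means not x_i.
    """
    rng = random.Random(seed)
    clauses = []
    for _ in range(l):
        vars3 = rng.sample(range(1, n + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vars3))
    return clauses

# An instance at the hard ratio l/n ~ 4.3:
n = 20
instance = random_3sat(n, round(4.3 * n), seed=0)
```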
Random 3-SAT
Which are the hard instances? around l/n = 4.3
What happens with larger problems?
Why are some dots red and others blue?
This is a so-called “phase transition”
Random 3-SAT
Varying problem size, n
The complexity peak appears to be largely invariant of the algorithm:
complete algorithms like Davis-Putnam
incomplete methods like local search
What’s so special about 4.3?
Random 3-SAT
Complexity peak coincides with satisfiability transition
l/n < 4.3: problems under-constrained and SAT
l/n > 4.3: problems over-constrained and UNSAT
l/n = 4.3: problems on the “knife-edge” between SAT and UNSAT
Where did this all start?
At least as far back as the 60s, with Erdős & Rényi thresholds in random graphs
Late 80s: pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg, …
Flood gates burst with Cheeseman, Kanefsky & Taylor’s IJCAI-91 paper
What do we know about this phase transition?
Its shape: a step function in the limit [Friedgut 98]
Its location: theory puts it in the interval 3.42 < l/n < 4.506; experiment puts it at l/n = 4.2
3SAT phase transition
Lower bounds (hard): analyse an algorithm that almost always solves the problem
Backtracking is hard to reason about, so typically analysed without backtracking
Complex branching heuristics are needed to ensure success, but these are complex to reason about
3SAT phase transition
Upper bounds (easier): typically by estimating the count of solutions
E.g. the Markov (or 1st moment) method: for any statistic X,
prob(X >= 1) <= E[X]
No assumptions about the distribution of X except non-negativity!
Let X be the number of satisfying assignments of a 3SAT problem; its expected value is easily calculated:
E[X] = 2^n * (7/8)^l
If E[X] < 1, then prob(X >= 1) = prob(SAT) < 1
E[X] < 1 when:
2^n * (7/8)^l < 1
i.e. n + l log2(7/8) < 0
i.e. l/n > 1/log2(8/7) = 5.19…
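The first-moment calculation can be checked numerically (a small sketch; the function name is illustrative, not from the talk):

```python
import math

def expected_solutions(n, l):
    # E[X] = 2^n * (7/8)^l : each of the 2^n assignments satisfies
    # a random 3-clause with probability 7/8, independently per clause.
    return (2.0 ** n) * (7.0 / 8.0) ** l

threshold = 1.0 / math.log2(8.0 / 7.0)   # = 5.19...

n = 100
# Just above the threshold ratio, the expected count drops below 1,
# so by Markov's inequality prob(SAT) < 1:
assert expected_solutions(n, math.ceil(threshold * n) + 1) < 1
# Well below it, the expected count is huge (though this alone
# does NOT prove satisfiability):
assert expected_solutions(n, 4 * n) > 1
```

Note the asymmetry: a small expected count bounds prob(SAT) from above, but a large expected count says nothing on its own, which is why lower bounds are the hard direction.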
3SAT phase transition
Upper bounds (easier): typically by estimating a count of solutions
To get tighter bounds than 5.19, refine the counting argument
e.g. count not all solutions, but just those maximal under some ordering
Random 2-SAT
2-SAT is in P: there is a linear time algorithm
Random 2-SAT displays the “classic” phase transition:
l/n < 1: almost surely SAT
l/n > 1: almost surely UNSAT
complexity peaks around l/n = 1
x1 v x2, -x2 v x3, -x1 v x3, …
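The tractability of 2-SAT comes from the implication graph: each clause (a v b) gives ¬a → b and ¬b → a, and the instance is satisfiable iff no variable shares a strongly connected component with its negation (Aspvall, Plass & Tarjan). A sketch under those ideas (function names are my own; Kosaraju's two-pass SCC stands in for a strictly linear-time variant):

```python
from collections import defaultdict

def solve_2sat(n, clauses):
    """Decide a 2-SAT instance over variables 1..n.

    Literals are signed ints; node(+i) = 2(i-1), node(-i) = 2(i-1)+1.
    SAT iff no variable is in the same SCC as its negation.
    """
    node = lambda lit: 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)
    neg = lambda v: v ^ 1
    graph = defaultdict(list)
    for a, b in clauses:
        graph[neg(node(a))].append(node(b))  # -a -> b
        graph[neg(node(b))].append(node(a))  # -b -> a
    rgraph = defaultdict(list)
    for u in range(2 * n):
        for v in graph[u]:
            rgraph[v].append(u)
    # Pass 1: record post-order of an iterative DFS on the graph.
    order, visited = [], [False] * (2 * n)
    for start in range(2 * n):
        if visited[start]:
            continue
        visited[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                v = graph[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)
    # Pass 2: label SCCs on the reversed graph in reverse post-order.
    comp, c = [-1] * (2 * n), 0
    for u in reversed(order):
        if comp[u] == -1:
            stack, comp[u] = [u], c
            while stack:
                x = stack.pop()
                for y in rgraph[x]:
                    if comp[y] == -1:
                        comp[y] = c
                        stack.append(y)
            c += 1
    return all(comp[2 * i] != comp[2 * i + 1] for i in range(n))

# The example clauses from the slide are satisfiable:
assert solve_2sat(3, [(1, 2), (-2, 3), (-1, 3)])
```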
Phase transitions in P
2-SAT: l/n = 1
Horn SAT: transition not “sharp”
Arc-consistency: rapid transition in whether the problem can be made AC; peak in (median) checks
Phase transitions above NP
PSpace: QSAT (SAT of QBF), e.g. ∃x1 ∀x2 ∃x3 . x1 v x2 & -x1 v x3
Phase transitions above NP
PSpace-complete: QSAT (SAT of QBF), stochastic SAT, modal SAT
PP-complete: polynomial-time probabilistic Turing machines; counting problems, e.g. #SAT(>= 2^n/2)
Exact phase boundaries in NP
Random 3-SAT is only known within bounds: 3.42 < l/n < 4.506
Are there any NP phase boundaries known exactly?
Yes, some exact NP phase boundaries are known, e.g. 1-in-k SAT at l/n = 2/(k(k-1))
Backbone
Variables which take fixed values in all solutions (alias unit prime implicates)
Let fk be the fraction of variables in the backbone; for random 3-SAT:
l/n < 4.3: fk vanishing (otherwise adding a clause could make the problem unsat)
l/n > 4.3: fk > 0
Discontinuity at the phase boundary!
Backbone
Search cost is correlated with backbone size: if fk is non-zero, we can easily assign a variable the “wrong” value, and such mistakes are costly if made at the top of the search tree
One source of “thrashing” behaviour; can be tackled with randomization and rapid restarts
Can we adapt algorithms to offer more robust performance guarantees?
Backbone
Backbones observed in structured problems, e.g. quasigroup completion problems (QCP)
Backbones also observed in optimization and approximation problems: coloring, TSP, blocks world planning, …
Can we adapt algorithms to identify and exploit the backbone structure of a problem?
2+p-SAT
Morph between 2-SAT and 3-SAT:
fraction p of 3-clauses
fraction (1-p) of 2-clauses
2-SAT is polynomial (linear): phase boundary at l/n = 1, but no backbone discontinuity there!
2+p-SAT maps from P to NP: for p > 0, 2+p-SAT is NP-complete
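The morph is easy to instantiate as a generator (an illustrative sketch with my own function name; here each clause is independently long with probability p, so the fractions hold in expectation):

```python
import random

def random_2p_sat(n, l, p, seed=None):
    """Sample a 2+p-SAT instance: each of the l clauses is a 3-clause
    with probability p and a 2-clause with probability (1-p).
    Literals are signed ints: +i means x_i, -i means not x_i."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(l):
        k = 3 if rng.random() < p else 2
        vs = rng.sample(range(1, n + 1), k)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
    return clauses

# p = 0 recovers random 2-SAT, p = 1 recovers random 3-SAT:
assert all(len(c) == 2 for c in random_2p_sat(10, 20, 0.0, seed=1))
assert all(len(c) == 3 for c in random_2p_sat(10, 20, 1.0, seed=1))
```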
2+p-SAT phase transition
2+p-SAT phase transition
[figure: location of the transition in the (l/n, p) plane]
2+p-SAT phase transition
Lower bound: are the 2-clauses (on their own) UNSAT?
N.B. 2-clauses are much more constraining than 3-clauses
For p <= 0.4, the transition occurs at this lower bound: the 3-clauses are not contributing!
2+p-SAT backbone
fk becomes discontinuous for p > 0.4, but the problem is NP-complete for all p > 0!
search cost shifts from linear to exponential at p = 0.4
similar behavior is seen with local search algorithms
[figure: search cost against n]
2+p-SAT trajectories
Input 3-SAT to a SAT solver like Davis-Putnam:
REPEAT: assign a variable, simplify all unit clauses
leaving a subproblem with a mixture of 2- and 3-clauses
For a number of branching heuristics (e.g. random, …), assume subproblems sample uniformly from the 2+p-SAT space
Can use this to estimate runtimes!
2+p-SAT trajectories
[figure: solver trajectories in the (l/n, p) plane, crossing between the UNSAT and SAT regions]
Beyond 2+p-SAT
Optimization: MAX-SAT
Other decision problems: 2-COL to 3-COL, Horn-SAT to 3-SAT, XOR-SAT to 3-SAT, 1-in-2-SAT to 1-in-3-SAT, NAE-2-SAT to NAE-3-SAT, …
COL
Graph colouring: can we colour the graph so that neighbouring nodes have different colours?
In k-COL, only k colours are allowed
3-COL is NP-complete; 2-COL is in P
Random COL
Sample graphs uniformly: n nodes and e edges
Observe a colourability phase transition:
random 3-COL is “sharp”, at e/n ≈ 2.3
BUT random 2-COL is not “sharp”: as n -> oo,
prob(2-COL @ e/n = 0) = 1
prob(2-COL @ e/n = 0.45) ≈ 0.5
prob(2-COL @ e/n = 1) = 0
2+p-COL
Morph from 2-COL to 3-COL:
fraction p of 3-colourable nodes
fraction (1-p) of 2-colourable nodes
Like 2+p-SAT, maps from P to NP: NP-complete for any fixed p > 0
Unlike 2+p-SAT, maps from a coarse to a sharp transition
2+p-COL
2+p-COL sharpness
p=0.8
2+p-COL search cost
2+p-COL
Sharp transition for p > 0.8
Transition has coarse and sharp regions for 0 < p < 0.8
Problem hardness appears to increase from polynomial to exponential at p = 0.8
2+p-COL behaves like 2-COL for p < 0.8
N.B. sharpness alone is not the cause of complexity, since 2-SAT has a sharp transition!
Location of phase boundary
For sharp transitions, like 2+p-SAT: as n -> oo,
if l/n = c+epsilon, then UNSAT
if l/n = c-epsilon, then SAT
For transitions like 2+p-COL that may be coarse, we identify the start and the finish:
delta_2+p = sup{e/n | prob(2+p-colourable) = 1}
gamma_2+p = inf{e/n | prob(2+p-colourable) = 0}
Basic properties
monotonicity: delta <= gamma
sharp transition iff delta = gamma
simple bounds:
delta_2+p = 0 for all p < 1
gamma_2 <= gamma_2+p <= min(gamma_3, gamma_2/(1-p))
2+p-COL phase boundary
XOR-SAT
Replace or by xor
XOR k-SAT is in P for all k
Phase transition: XOR 3-SAT has a sharp transition
0.8894 <= l/n <= 0.9278 [Creignou et al 2001]
Statistical mechanics gives l/n = 0.918 [Franz et al 2001]
XOR-SAT to SAT
Morph from XOR-SAT to SAT:
fraction (1-p) of XOR clauses
fraction p of OR clauses
NP-complete for all p > 0
Phase transition occurs at: 0.92 <= l/n <= min(0.92/(1-p), 4.3)
Upper bound appears loose for all p > 0: the polynomial subproblem does not dominate!
The 3-SAT contributes (cf. 2+p-SAT, 2+p-COL)
Other morphs between P and NP
NAE 2+p-SAT: NAE = not all equal; NAE 2-SAT is in P, NAE 3-SAT is NP-complete
1-in-2+p-SAT: 1-in-k SAT = exactly one of k literals true; 1-in-2 SAT is in P, 1-in-3 SAT is NP-complete
…
NAE to SAT
Morph between two NP-complete problems:
fraction (1-p) of NAE 3-SAT clauses
fraction p of 3-SAT clauses
Each NAE 3-SAT clause is equivalent to two 3-SAT clauses:
NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c)
The NAE 3-SAT phase transition occurs around l/n = 2.1, tantalisingly close to half of 4.2
Can we ignore many of the correlations that this encoding of NAE SAT into SAT introduces?
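The encoding NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c) can be verified by brute force over all assignments (a small sketch; helper names are my own):

```python
from itertools import product

def nae_to_sat(clause):
    """Encode NAE(a,b,c) as the two SAT clauses or(a,b,c) & or(-a,-b,-c)."""
    a, b, c = clause
    return [(a, b, c), (-a, -b, -c)]

def sat_clause(clause, assign):
    # assign maps variable index -> bool; a clause needs one true literal
    return any(assign[abs(l)] == (l > 0) for l in clause)

def nae_clause(clause, assign):
    # NAE: at least one literal true AND at least one literal false
    vals = [assign[abs(l)] == (l > 0) for l in clause]
    return any(vals) and not all(vals)

# Check the equivalence over all 2^3 assignments:
clause = (1, -2, 3)
for bits in product([False, True], repeat=3):
    assign = {1: bits[0], 2: bits[1], 3: bits[2]}
    encoded = all(sat_clause(cl, assign) for cl in nae_to_sat(clause))
    assert encoded == nae_clause(clause, assign)
```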
NAE to SAT
Compute an “effective” clause size:
consider (1-p)l NAE 3-SAT clauses and pl 3-SAT clauses
these behave like 2(1-p)l 3-SAT clauses plus pl 3-SAT clauses
that is, (2-p)l 3-SAT clauses
Hence the effective clause-to-variable ratio is (2-p)l/n
Plot prob(satisfiable) and search cost against (2-p)l/n
NAE to SAT
The real world isn’t random?
Very true!
Can we identify structural features common in real world problems?
Consider graphs met in real world situations: social networks, electricity grids, neural networks, …
Real versus Random
Real graphs tend to be sparse: dense random graphs contain lots of (rare?) structure
Real graphs tend to have short path lengths, as do random graphs
Real graphs tend to be clustered, unlike sparse random graphs
L, average path length
C, clustering coefficient (fraction of neighbours connected to each other; a cliqueness measure)
mu, proximity ratio: C/L normalized by that of a random graph of the same size and density
Small world graphs
Sparse, clustered, short path lengths
Six degrees of separation: Stanley Milgram’s famous 1967 postal experiment, recently revived by Watts & Strogatz
Shown to apply to: the actors database, the US electricity grid, the neural net of a worm, …
An example
1994 exam timetable at Edinburgh University: 59 nodes, 594 edges, so relatively sparse
But it contains a 10-clique: less than a 10^-10 chance in a random graph of the same size and density
The clique totally dominated the cost of solving the problem
Small world graphs
To construct an ensemble of small world graphs, morph between a regular graph (like a ring lattice) and a random graph:
with probability p include an edge from the ring lattice, with probability 1-p from the random graph
Real problems often contain similar structure and stochastic components?
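The lattice-to-random morph and the clustering coefficient from the earlier slide can be sketched in plain Python (function names and parameter choices are my own; this is an illustration of the construction, not the exact ensemble used in the experiments):

```python
import random

def morph_graph(n, k, p, seed=0):
    """Morph between a ring lattice (each node joined to its k nearest
    neighbours) and a random graph: each lattice edge is kept with
    probability p, otherwise replaced by a uniformly random edge."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, k // 2 + 1):
            a, b = v, (v + d) % n
            if rng.random() >= p:  # replace this lattice edge by a random one
                while True:
                    a, b = rng.randrange(n), rng.randrange(n)
                    if a != b and b not in adj[a]:
                        break
            adj[a].add(b)
            adj[b].add(a)
    return adj

def clustering(adj):
    """Average fraction of a node's neighbour pairs that are connected."""
    total, counted = 0.0, 0
    for v, nbrs in adj.items():
        nbrs = list(nbrs)
        if len(nbrs) < 2:
            continue
        links = sum(1 for i in range(len(nbrs)) for j in range(i + 1, len(nbrs))
                    if nbrs[j] in adj[nbrs[i]])
        total += 2.0 * links / (len(nbrs) * (len(nbrs) - 1))
        counted += 1
    return total / counted

# p = 1 gives the clustered lattice, p = 0 an unclustered random graph:
assert clustering(morph_graph(200, 8, p=1.0)) > clustering(morph_graph(200, 8, p=0.0))
```

Intermediate p gives the small world regime: most of the lattice's clustering survives, while the few random edges shorten paths.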
Small world graphs
The ring lattice is clustered but has long paths; random edges provide shortcuts without destroying clustering
Small world graphs
Colouring small world graphs
Small world graphs
Other bad news: disease spreads more rapidly in a small world
Good news: cooperation breaks out quicker in the iterated Prisoner’s dilemma
Other structural features
It’s not just small world graphs that have been studied
High degree graphs: Barabási et al’s power-law model
Ultrametric graphs: Hogg’s tree-based model
Numbers following Benford’s Law: 1 is much more common than 9 as a leading digit!
prob(leading digit = i) = log10(1 + 1/i)
Such clustering makes number partitioning much easier
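Benford's Law is easy to check numerically; powers of 2 are a classic sequence that follows it closely (a small sketch, my own example choice):

```python
import math
from collections import Counter

# Benford's Law: prob(leading digit = i) = log10(1 + 1/i)
benford = {i: math.log10(1 + 1 / i) for i in range(1, 10)}
assert abs(sum(benford.values()) - 1.0) < 1e-12
assert benford[1] > 0.30 > 0.046 > benford[9]  # 1 far more common than 9

# Leading digits of 2^1 .. 2^999 track the Benford distribution:
digits = Counter(int(str(2 ** k)[0]) for k in range(1, 1000))
assert abs(digits[1] / 999 - benford[1]) < 0.01
```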
High degree graphs
Degree = number of edges connected to a node
Directed graph: edges have a direction; e.g. web pages = nodes, links = directed edges
In-degree = links pointing to a page; out-degree = links pointing out of a page
In-degree of World Wide Web
Power law distribution: Pr(in-degree = k) = a k^-2.1
Some nodes of very high in-degree, e.g. google.com, …
Out-degree of World Wide Web
Power law distribution: Pr(out-degree = k) = a k^-2.7
Some nodes of very high out-degree, e.g. people in SAT
High degree graphs
World Wide Web, electricity grid, citation graph:
633,391 out of 783,339 papers have < 10 citations
64 have > 1000 citations; 1 has 8907 citations
Actors graph: Robert Wagner, Donald Sutherland, …
High degree graphs
Power law in the degree distribution: Pr(degree = k) = a k^-b, where b is typically around 3
Compare this to random graphs:
Gnm model: n nodes, m edges chosen uniformly at random
Gnp model: n nodes, each edge included with probability p
In both, Pr(degree = k) is a Poisson distribution, tightly clustered around the mean
Random v high degree graphs
Generating high-degree graphs
Grow the graph, preferentially attaching new nodes to old nodes according to their degree:
Prob(attach to node j) proportional to the degree of node j
Gives Prob(degree = k) = a k^-3
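The growth process can be sketched in a few lines (an illustration in the style of the Barabási-Albert model; function name and the edge-endpoint sampling trick are my own choices):

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow a graph: each new node adds m edges, choosing old nodes
    with probability proportional to their degree."""
    rng = random.Random(seed)
    # start from a small clique of m+1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # picking an endpoint of a uniformly random edge is exactly
    # degree-proportional sampling
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            targets.extend((new, old))
    return edges

edges = preferential_attachment(2000, 2)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
# heavy tail: a few hubs have degree far above the mean of ~2m = 4
assert max(degree.values()) > 40
```

The oldest nodes accumulate edges fastest, which is what produces the power-law tail rather than a Poisson bulk.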
High-degree = small world?
Preferential attachment model: n=16, mu=1; n=64, mu=1.35; n=256, mu=2.12; …
Thus small world topology for large n!
Search on high degree graphs
Random: uniformly hard
Small world: a few long runs
High degree: more uniform, easier than random
What about numbers?
So far, we’ve looked at structural features of graphs
Many problems contain numbers: do we see phase transitions here too?
Number partitioning
What’s the problem? Dividing a bag of numbers into two so their sums are as balanced as possible
What problem instances? n numbers, each uniformly chosen from (0, l]; other distributions work too (Poisson, …)
Number partitioning
Identify a measure of constrainedness:
more numbers => less constrained
larger numbers => more constrained
could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), …)
Better still, use kappa! An (approximate) theory of constrainedness, based upon some simplifying assumptions, e.g. it ignores structural features that cluster solutions together
Theory of constrainedness
Consider the state space searched: see the 10-d hypercube opposite, of the 2^10 possible partitions of 10 numbers into 2 bags
Compute the expected number of solutions, <Sol>: independence assumptions, often useful and harmless!
Theory of constrainedness
Constrainedness is given by:
kappa = 1 - log2(<Sol>)/n
where n is the dimension of the state space
kappa lies in the range [0, infty):
kappa = 0: <Sol> = 2^n, under-constrained
kappa = infty: <Sol> = 0, over-constrained
kappa = 1: <Sol> = 1, critically constrained, the phase boundary
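The definition is a one-liner, and it reproduces the 3-SAT value from the earlier first-moment calculation (a sketch; the function name is mine):

```python
import math

def kappa(expected_solutions, n):
    """kappa = 1 - log2(<Sol>)/n, with n the dimension of the state space."""
    if expected_solutions == 0:
        return math.inf
    return 1.0 - math.log2(expected_solutions) / n

# under-constrained: every state is a solution
assert kappa(2 ** 20, 20) == 0.0
# critically constrained: a single expected solution
assert kappa(1, 20) == 1.0
# over-constrained: no solutions expected
assert kappa(0, 20) == math.inf

# For random 3-SAT, <Sol> = 2^n (7/8)^l gives kappa = (l/n) log2(8/7),
# i.e. roughly l/5.2n as on the next slide:
n, l = 50, 100
assert abs(kappa((2.0 ** n) * (7 / 8) ** l, n) - (l / n) * math.log2(8 / 7)) < 1e-9
```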
Phase boundary
Markov inequality: prob(Sol) <= <Sol>
Now, kappa > 1 implies <Sol> < 1; hence kappa > 1 implies prob(Sol) < 1
The phase boundary is typically at values of kappa slightly smaller than 1, due to skew in the distribution of solutions (e.g. 3-SAT) and non-independence
Examples of kappa
3-SAT: kappa = l/5.2n, phase boundary at kappa = 0.82
3-COL: kappa = e/2.7n, phase boundary at kappa = 0.84
number partitioning: kappa = log2(l)/n, phase boundary at kappa = 0.96
Number partition phase transition
Prob(perfect partition) against kappa
Finite-size scaling
A simple “trick” from statistical physics: around the critical point, problems are indistinguishable except for a change of scale, given by a simple power law
Define the rescaled parameter:
gamma = (kappa - kappac)/kappac . n^(1/v)
Estimate kappac and v empirically; e.g. for number partitioning, kappac = 0.96, v = 1
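The rescaling is a one-line transformation (a sketch; the function name is mine, the constants are the number-partitioning values from the slide):

```python
def rescale(kappa, n, kappa_c=0.96, v=1.0):
    """Rescaled parameter gamma = (kappa - kappa_c)/kappa_c * n**(1/v),
    with kappa_c and v fitted empirically (0.96 and 1 for number
    partitioning, as on the slide)."""
    return (kappa - kappa_c) / kappa_c * n ** (1.0 / v)

# At the critical point, gamma = 0 for every problem size:
assert rescale(0.96, 100) == 0.0
# Away from it, larger instances map further from the origin, which is
# what collapses prob(perfect partition) curves for different n onto
# one another when plotted against gamma:
assert abs(rescale(1.0, 400)) > abs(rescale(1.0, 100))
```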
Rescaled phase transition
Prob(perfect partition) against gamma
Rescaled search cost
Optimization cost against gamma
Easy-Hard-Easy?
Search cost only easy-hard here? This is optimization, not decision, search cost! Easy if there are a (large number of) perfect partitions; otherwise little pruning (search scales as 2^0.85n)
Phase transition behaviour is less well understood for optimization than for decision: sometimes optimization = a sequence of decision problems (e.g. branch & bound), BUT lots of subtle issues lurking?
Looking inside search
Clauses/variables down search branch
Looking inside search
Clause length down search branch
Constrainedness knife-edge
kappa down search branch
Constrainedness knife-edge
Real-world register allocation (graph colouring) problem
Constrainedness knife-edge
Optimisation problems too (number partitioning)
Exploiting the knife-edge
Get off the knife-edge asap, aka minimize constrainedness
Many existing heuristics can be viewed in this light: e.g. the fail-first heuristic in CSPs, the KK heuristic for number partitioning, …
A good way to design new heuristics: branch into the subproblem with minimal kappa. Challenge: to compute this efficiently!
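The KK (Karmarkar-Karp) differencing heuristic mentioned above fits this picture: replacing the two largest numbers by their difference shrinks the numbers quickly, driving the subproblem off the knife-edge. A minimal sketch (my own function name; note the heuristic is not optimal):

```python
import heapq

def karmarkar_karp(numbers):
    """KK differencing heuristic for number partitioning: repeatedly
    replace the two largest numbers by their difference, committing
    them to opposite bags. Returns the final partition difference."""
    heap = [-x for x in numbers]  # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)
        b = -heapq.heappop(heap)
        heapq.heappush(heap, -(a - b))
    return -heap[0] if heap else 0

# Heuristic, not exact: here KK returns 2, though {8,7} vs {6,5,4}
# is a perfect partition with difference 0.
assert karmarkar_karp([8, 7, 6, 5, 4]) == 2
# Sometimes it is optimal: {6} vs {5,5} is the best split here.
assert karmarkar_karp([5, 5, 6]) == 4
```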
The future?
What open questions remain?
Where to next?
Open questions
Prove that the random 3-SAT transition occurs at l/n = 4.3
random 2-SAT proved to be at l/n = 1
random 3-SAT transition proved to be in the range 3.42 < l/n < 4.506
2+p-COL: prove the problem changes around p = 0.8; what happens to the colouring backbone?
Open questions
Does phase transition behaviour give insights to help answer P=NP? It certainly identifies hard problems! Problems like 2+p-SAT and ideas like the backbone also show promise
But problems away from the phase boundary can be hard to solve:
the over-constrained 3-SAT region has exponential resolution proofs
the under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)
Summary
That’s nearly all from me!
Conclusions
Phase transition behaviour is ubiquitous: decision/optimization/…, NP/PSpace/P/…, random/real
Phase transition behaviour and constrainedness give insight into problem hardness and suggest new branching heuristics; ideas like the backbone help understand branching mistakes
Conclusions
AI becoming more of an experimental science? Theory and experiment complement each other well; increasing use of approximate/heuristic theories keeps theory in touch with rapid experimentation
Phase transition behaviour is FUN: lots of nice graphs, as promised, and it is teaching us lots about complexity and algorithms!
Very partial bibliography
Cheeseman, Kanefsky & Taylor, Where the really hard problems are, Proc. of IJCAI-91
Gent et al, The Constrainedness of Search, Proc. of AAAI-96
Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88:349-358, 1996
Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14(3), 1998
Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99
Gent et al, Morphing: combining structure and randomness, Proc. of AAAI-99
Hogg & Williams (eds), special issue of Artificial Intelligence, 88(1-2), 1996
Mitchell, Selman & Levesque, Hard and Easy Distributions of SAT Problems, Proc. of AAAI-92
Monasson et al, Determining computational complexity from characteristic ‘phase transitions’, Nature, 400, 1999
Walsh, Search in a Small World, Proc. of IJCAI-99
Walsh, Search on High Degree Graphs, Proc. of IJCAI-2001
Walsh, From P to NP: COL, XOR, NAE, 1-in-k, and Horn SAT, Proc. of AAAI-2001
Watts & Strogatz, Collective dynamics of ‘small-world’ networks, Nature, 393, 1998
Some blatant adverts!
2nd International Optimisation Summer School
Jan 12th to 18th, Kioloa, NSW
Will also cover local search
http://go.to/optschool
NICTA
Optimisation Research Group
We love having visitors stop by to give talks, or for longer (a week, a month, or sabbaticals!)