
Polynomial Time

Klaus Sutner
Carnegie Mellon University

2020/01/30

Tractability

Weak Reductions

Sanity Check

Feasible Decision Problems 2

The question arises whether any time complexity class is a good match for our intuitive notion of “feasible computation”.

Note that whatever answer we give, we are in a similar situation as with the Church-Turing thesis: since we are dealing with intuitive notions, there cannot be a formal proof – though one can collect overwhelming evidence.

However, while there is fairly good agreement about what constitutes a computation, there is much less agreement about what constitutes a feasible computation. Some people will tell you that anything beyond n log n is useless.

(Deterministic) Polynomial Time 3

Claim

Deterministic polynomial time,

P = TIME(n^{O(1)}) = ⋃_{k≥0} TIME(n^k),

corresponds surprisingly well to feasible computation.

Note that this is only about decision problems; the arguably more important question of computing functions is left out.

Feasible Search Problems 4

So how about function/search problems?

To deal with function/search problems we need transducers (as opposed to the acceptors that deal with decision problems): there is a read-only input tape, the result appears on the write-only output tape, and the computation uses a work tape for scratch space.

Definition

A function is polynomial time computable if it can be computed by a polynomial time Turing machine with additional input and output tapes.

For example, it is easy (and unimpressive) to construct a transducer that computes the reverse of an input string in linear time.
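To make this concrete, here is a minimal C analogue of such a transducer (an illustrative sketch, not part of the original slides): the input buffer is only read, the output buffer is only written, and the running time is clearly linear in the length of the input.

#include <string.h>

/* Linear-time "transducer": reads the input string in, writes its
   reverse to out; out must have room for strlen(in) + 1 characters. */
void reverse(const char *in, char *out)
{
    size_t n = strlen(in);
    for (size_t i = 0; i < n; i++)
        out[i] = in[n - 1 - i];
    out[n] = '\0';
}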

Aside: History 5

Computing functions (as opposed to just solving decision problems) is very important in the world of real algorithms. We focus on decision problems mostly because they are a bit easier to handle.

Interestingly, the concept of the class of polynomial time computable functions was developed a bit earlier (but still decades after the development of classical recursion theory, CRT):

von Neumann 1953

Gödel 1956

Cobham 1964

Edmonds 1965

Closure 6

One important property of polynomial time computable functions is that one can compose them without falling out of the class:

Lemma

Polynomial time functions are closed under composition.

This means that if we can translate the instances and solutions of a function problem A in polynomial time into instances and solutions of another function problem B, and B has a polynomial time solution, then A also has a polynomial time solution.

Proof 7

Proof. The short version is: polynomials are closed under substitution.

More precisely, suppose y = f(x) is computable in time at most p(n), where n = |x|. Then |y| ≤ p(n), and z = g(y) is computable in time at most q(p(n)) for some polynomial q.

Hence we have a polynomial time bound for z = g(f(x)). □

In fact, if p(n) = O(n^k) and q(n) = O(n^ℓ), then the composition is at most O(n^{kℓ}).
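For a concrete instance (added for illustration): if f runs in time p(n) = n^2 and g in time q(n) = n^3, then the intermediate result y = f(x) has length at most n^2, so the total time for g(f(x)) is bounded by

p(n) + q(p(n)) = n^2 + (n^2)^3 = n^2 + n^6 = O(n^6),

still comfortably polynomial.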

Polynomials are Great 8

Just to be clear: note that this result is not trivial; it depends crucially on our choice of polynomials as the resource bounds.

Closure fails for, say, exponential time EXP_1 = TIME(2^{O(n)}).

But similar claims do hold for

linear time (easy),

and for logarithmic space (hard).

Two Standard Objections 9

The notation T_M(n) = O(n^k) hides two constants:

∃ n_0, c ∀ n ≥ n_0 (T_M(n) ≤ c · n^k)

What if these constants are huge? Something like 1000^1000? Note that there are only around 10^80 particles in the universe.

This would render the asymptotic bounds entirely useless for anything resembling feasible computation. It could even wreck physical computation entirely.

Standard Answer 10

This objection has merit in principle, but in the RealWorld™ it is pointless: for practical problems it is a matter of experience that the constants are easy to determine and are always very reasonable.

In fact, we can even compute the constants by writing down the algorithms very carefully in a low-level language like C.

For practical algorithms this is somewhat of a pain, but not really terribly difficult (as long as we are fairly relaxed about the bounds).

Example: Quadratic 11

Consider a code fragment like

// P
for i = 1 to n do
    for j = 1 to i do
        blahblahblah

Suppose blahblahblah is constant time, a bunch of comparisons and assignments, say.

Clearly, the running time of P is quadratic, O(n^2): the inner body executes 1 + 2 + ... + n = n(n+1)/2 times.

Assembly 12

But if we wanted to, we could write P in assembly,

0: 55                     push ebp
1: 89 e5                  mov  ebp,esp
3: 83 e4 f0               and  esp,0xfffffff0
6: 83 ec 10               sub  esp,0x10
9: c7 04 24 00 00 00 00   mov  DWORD PTR [esp],0x0

Now we can count the number of steps each execution of P takes, and likewise for the control structures. The result might be 55n + O(1).

But it surely won’t be 1000^1000 · n + O(1).

Digression: Knuth 13

D. Knuth (everybody bow three times in the direction of Stanford) actually thinks that pure asymptotic bounds are fairly cheesy.

We should never say “this algorithm has complexity O(n)”; we should explicitly figure out the coefficients of the leading term. Say

... = 7 · n + O(log n)

This is easy to say for Knuth, who just develops the necessary math on the fly, but quite challenging for ordinary mortals.

The Bible 14

D. E. Knuth
The Art of Computer Programming
Addison-Wesley, 1968–now

R. L. Graham, D. E. Knuth, O. Patashnik
Concrete Mathematics
Addison-Wesley, 1989

Aside 15

Knuth is very unhappy about people publishing algorithms without ever implementing them (maybe one really should distinguish between computable function and algorithm).

http://www.informit.com/articles/article.aspx

Truth in Advertising 16

The claim that we can always figure out the multiplicative constant is somewhat of a white lie. There are algorithms in graph theory, based on the Robertson-Seymour theorem for graph minors, where these constants are utterly unknown.

Graph H is a minor of graph G if H is (isomorphic to) a graph obtained from G by

edge deletions,

edge contractions, and

deletion of isolated vertices.

[Figure: an example graph H arising as a minor of a larger graph G.]

Closed Classes 17

A collection 𝒢 of finite graphs is closed under minors if whenever G ∈ 𝒢 and H is a minor of G, then H ∈ 𝒢.

The classical example is the class of planar graphs: every minor of a planar graph is again planar. This produces a famous theorem:

Theorem (Kuratowski-Wagner)

A graph is planar if, and only if, it has neither K_5 nor K_{3,3} as a minor.

Wagner’s Conjecture 18

In 1937, K. Wagner conjectured that a similar theorem holds for every minor-closed class C of finite graphs:

Conjecture

Suppose C is minor-closed. Then there are finitely many obstruction graphs H_1, H_2, ..., H_r such that G is in C iff G does not have any H_i as a minor, i = 1, ..., r.

In other words, all antichains of graphs (wrt the minor order) are finite.

Note that this yields a decision algorithm: we just check if some given graph G has one of the forbidden minors.

Robertson-Seymour 19

Wagner’s conjecture gave rise to the following amazing theorem:

Theorem (Robertson, Seymour)

Every minor-closed family of graphs has a finite obstruction set.

This was proven in a series of 23 papers from 1984 to 2004, an incredible tour de force. The total proof is hundreds of pages long.

It’s probably correct, but putting it through a theorem prover would not hurt.

LaLaLand to Computational Universe 20

The proof of the Robertson-Seymour theorem is brutally non-constructive: the finite obstruction set exists in some set-theoretic universe, but we have no way of constructing it.

Worse, often we don’t even know its (finite) cardinality.

Does it have any computational content? Well, for planar graphs it certainly does: we could check planarity by looking for the minors K_5 and K_{3,3}. Needless to say, there are much better planarity testing algorithms.

Theorem To Algorithm 21

Suppose that graph H is fixed.

Problem: Minor Query
Instance: A ugraph G.
Question: Is H a minor of G?

There is an algorithm that tests whether H is a minor of G in O(n^2) steps, where n = |G|.

Note that H has to be fixed; otherwise we could check whether G contains a cycle of length |G| in quadratic time (aka Hamiltonian cycle).

Doom and Disaster 22

But then we can check all H from a finite list of obstruction graphs in quadratic time.

Hence we can check membership in any minor-closed class 𝒢 in quadratic time.
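In code, the resulting decision procedure is almost embarrassingly short. A sketch, assuming a hypothetical quadratic-time minor test has_minor (supplied by the theorem above), some graph representation, and an obstruction list obs[0..r-1]:

typedef struct Graph Graph;            /* some graph representation */

/* Hypothetical quadratic-time minor test (Robertson-Seymour). */
int has_minor(const Graph *G, const Graph *H);

/* Membership in a minor-closed class with obstruction set obs[0..r-1]:
   G belongs to the class iff no forbidden minor shows up. */
int in_class(const Graph *G, const Graph *obs, int r)
{
    for (int i = 0; i < r; i++)
        if (has_minor(G, &obs[i]))
            return 0;
    return 1;
}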

Alas, there is a glitch: the finite obstruction list is obtained non-constructively; it exists somewhere, and we can prove its existence using sufficiently strong set theory, but we cannot in general determine its elements—and, in fact, not even its cardinality. So we get a quadratic time algorithm, but we cannot bound the multiplicative constant.

Ouch.

Objection Two 23

What if T_M(n) = O(n^1000)?

This is polynomial time, but practically useless. By the time hierarchy theorem, we know that problems exist that can be solved in time O(n^1000) but essentially not in less time.

But note that these problems are constructed by diagonalization techniques, and are thus entirely artificial; they do not correspond to decent RealWorld™ problems.

Weird Empirical Fact 24

If a natural problem is in P at all, then it can actually be solved in time O(n^10) – for some small value of 10.

Take this with ample amounts of salt, but experience has shown so far that there simply are no natural problems where the best known algorithm has running time O(n^1000).

Alas, no one has any idea why this low-degree principle appears to be true. Note the qualifier “natural”. Everyone understands intuitively what this means, but it seems very difficult to give a formal definition.

Close Call 25

In 2002, Agrawal, Kayal and Saxena published a paper that shows that primality testing is in polynomial time.

Amazingly, the algorithm uses little more than high school arithmetic.

The original algorithm had time complexity O(n^12), but has since been improved to O(n^6).

Alas, the AKS algorithm seems useless in the RealWorld™: probabilistic algorithms are much superior.

Tractability

Weak Reductions

Sanity Check

Scaling Down 27

“Small” classes like P or EXP don’t play well with our reductions from CRT. For example

A ∈ EXP implies A ≤T ∅

The problem is that the oracle TM can directly solve A; it has no need to query the oracle. And ≤_T can eviscerate much more than just EXP.

We need to limit the compute power of the reduction mechanism.

Polynomial Time Turing Reductions 28

Perhaps the most obvious attempt would be to use Turing reductions, but with the additional constraint that the Turing machine must have polynomial running time. This is called a polynomial time Turing reduction.

Notation: A ≤^p_T B.

Note that this actually makes sense for any kind of problem, not just decision problems. Also, it really captures nicely the idea that problem A is easier than B, since we could use a fast algorithm for B to construct a fast algorithm for A.

Free Lunch 29

Just to be clear: in a computation with oracle B we only charge for the steps taken by the ordinary part of the Turing machine. For example, it costs to write a query for the oracle on the tape.

But the answer provided by the oracle appears in just one step; we do not care about the internals of the oracle.

Checking 30

Proposition

Polynomial time Turing reducibility is a preorder (reflexive and transitive).

For transitivity, this works since polynomials are closed under substitution. Hence we can form equivalence classes as usual, the polynomial time Turing degrees. We won’t pursue this idea here.

Compatibility 31

Proposition

B ∈ P and A ≤^p_T B implies A ∈ P.

Here we can simply replace the oracle by a real polynomial time algorithm.

Since polynomials are closed under substitution, this will produce a polynomial time algorithm.

Types of Problems 32

Computational problems naturally fall into several classes:

Decision Problems Return a Yes/No answer.

Counting Problems Count objects of a certain kind.

Function Problems Calculate a certain function.

Search Problems Select one particular solution.

In complexity theory we focus on decision problems because they are a bit easier to handle. But the others are at least as important.

Example: Vertex Cover 33

Problem: Vertex Cover (decision)
Instance: A ugraph G, a bound k.
Question: Does G have a vertex cover of size k?

Problem: Vertex Cover (counting)
Instance: A ugraph G.
Solution: Find the size of a minimal vertex cover of G.

Problem: Vertex Cover (function)
Instance: A ugraph G.
Solution: Find the lex-minimal vertex cover of G.

Problem: Vertex Cover (search)
Instance: A ugraph G.
Solution: Find a vertex cover of G of minimal size.

It’s Surprisingly Hard 34

Exercise

Come up with a reasonable algorithm for Vertex Cover.

Comparisons 35

Intuitively, the function version is the hardest: once we have a lex-minimal cover, all other problems are trivial. Clearly, they are all polynomial Turing reducible to the function version.

We did not give a technical definition of what that means for counting/function/search problems, but the idea is clear: for example, we are given a function f : Σ* → ℕ as oracle, and we can ask it to compute m = f(x) for some strings x.

Exercise

Figure out the details.

The Other Way 36

Proposition

The function version of Vertex Cover is polynomial time Turing reducible to the decision version.

This requires some work. Let n be the number of vertices in G, say, V = {v_1, ..., v_n}.

First, do a binary search to find the size k_0 of a minimal cover, using the oracle for the decision problem.

Then find the least vertex v such that G − v has a cover of size k_0 − 1. Place v into C, remove it from G, and decrement k_0.

Recurse.

This is polynomial time, even using Turing machines.
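Here is a sketch of the reduction in C (added for illustration; has_cover is the hypothetical decision oracle, and graphs on at most 64 vertices are stored as adjacency bitmasks):

#include <stdint.h>

enum { MAXN = 64 };

typedef struct {
    int n;
    uint64_t adj[MAXN];   /* adj[v] = bitmask of the neighbors of v */
} Graph;

/* Hypothetical decision oracle: does g have a vertex cover of size k? */
int has_cover(const Graph *g, int k);

static void delete_vertex(Graph *g, int v)
{
    for (int u = 0; u < g->n; u++)
        g->adj[u] &= ~(1ULL << v);
    g->adj[v] = 0;
}

/* Compute a minimum vertex cover (as a bitmask) using only the oracle. */
uint64_t min_cover(Graph g)
{
    /* Step 1: binary search for the minimum cover size k0. */
    int lo = 0, hi = g.n;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (has_cover(&g, mid)) hi = mid; else lo = mid + 1;
    }
    int k0 = lo;

    /* Step 2: find the least v whose removal drops the cover size,
       commit it, and repeat. */
    uint64_t cover = 0;
    while (k0 > 0) {
        for (int v = 0; v < g.n; v++) {
            Graph h = g;
            delete_vertex(&h, v);
            if (has_cover(&h, k0 - 1)) {   /* v lies in some minimum cover */
                cover |= 1ULL << v;
                g = h;
                k0--;
                break;
            }
        }
    }
    return cover;
}

The oracle is queried O(log n) times for the binary search and at most n times per committed vertex, so the reduction makes only polynomially many queries.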

Justification 37

The last proposition is important: it justifies our narrow focus on decision problems: up to a polynomial factor, they are no worse than the fancier, more applicable versions.

Of course, this does not help much when we are trying to find super-efficient algorithms, but it makes life much easier when it comes to showing that something can be done in polynomial time at all.

Tarjan’s strongly connected component algorithm is a nice example of this type of refinement.

Still No Good 38

Unfortunately, while polynomial Turing reductions are compatible with P, they are still too brutish. To wit:

A ∈ P implies A ≤^p_T ∅

The problem again is that the oracle TM can directly solve A; it has no need to query the oracle.

How about the other reductions that we considered in the CRT part?

Weak Reductions 39

As before in classical recursion theory, we can consider weaker reductions instead. In this case, many-one reducibility turns out to be the right notion.

Definition

A ⊆ Σ* is polynomial time (many-one) reducible to B ⊆ Σ* if there is a polynomial time computable function f such that

x ∈ A ⇐⇒ f(x) ∈ B.

Notation: A ≤^p_m B.
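In programming terms (an illustrative sketch, not from the slides), a many-one reduction is just an instance translation: we query B exactly once and pass its answer through unchanged.

/* Hypothetical: f is the polynomial time translation, in_B decides B. */
char *f(const char *x);
int   in_B(const char *y);

int in_A(const char *x)
{
    return in_B(f(x));   /* one query, answer passed through verbatim */
}

Contrast this with a Turing reduction, which may query B many times and post-process the answers arbitrarily.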

Properties 40

Proposition

Polynomial time many-one reducibility is a preorder.

Proof.

Reflexivity is trivial. For transitivity, consider a polynomial time reduction f from A to B and g from B to C.

Obviously h(x) = g(f(x)) is a reduction from A to C.

h is still polynomial time computable since polynomials are closed under substitution.

□

This may seem obvious, but note that it does not work for other classes (e.g., exponential time computable reductions). Incidentally, it does work for linear time computable reductions.

Better Mousetrap? 41

Proposition

≤^p_m is compatible wrt P: if B is in P and A ≤^p_m B, then A is also in P.

This is clear: we can replace the oracle by a polynomial time algorithm.

But is ≤^p_m really a useful reduction for P?

We could use the mediating function to “reduce” a problem in TIME(n^1000) to TIME(n).

Again: we really want the reductions to be lightweight; the oracle should be the place where the heavy lifting takes place.

Log-Space Reductions 42

As it turns out, a better reduction results from restricting f even more: we’ll insist that f can be computed in logarithmic space. More details about space complexity later; for the time being, use your intuition.

Definition

A ⊆ Σ* is log-space reducible to B ⊆ Σ* if there is a log-space computable function f such that

x ∈ A ⇐⇒ f(x) ∈ B.

Notation: A ≤^ℓ B.

Recall: Transducer 43

[Figure: a transducer M with a read-only input tape, a read/write work tape, and a write-only output tape.]

Only the work tape counts.

But Why? 44

Clearly A ≤^ℓ B implies A ≤^p_m B.

But the mediating function f is now much more constrained. Think about graph problems where [n] is the vertex set.

We cannot remember an arbitrary subset S ⊆ [n].

Unfortunately, this wrecks many graph algorithms, such as DFS, that require linear memory; on the other hand, it nicely forces the oracle to do most of the work.

Example: Neighborhood Traversals 45

Here is a typical code fragment in a graph algorithm:

foreach v in V do
    foreach (v,u) in E do
        blahblahblah

This idiom appears in lots of graph algorithms: look at the neighborhoods of all vertices.

Good news: it can be handled in logarithmic space. All we need is a few vertices in [n] (these are written in binary), and we can traverse all the edges (v, u) in any standard representation of a graph (this works fine for both adjacency matrices and adjacency lists, though they may produce different running times).
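To see why this fits into logarithmic space, consider an illustrative C sketch (the read-only adjacency query adjacent is a hypothetical stand-in for the input tape): the only writable state consists of two vertex indices, O(log n) bits each.

/* Hypothetical read-only access to the input graph. */
int adjacent(int v, int u);

void traverse(int n)
{
    /* Work space: just the counters v and u, O(log n) bits each. */
    for (int v = 0; v < n; v++)
        for (int u = 0; u < n; u++)
            if (adjacent(v, u)) {
                /* blahblahblah */
            }
}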

Structuring Polynomial Time 46

Unlike polynomial-time reductions, log-space reductions can be used to study the fine structure of P.

Definition (P-Completeness)

A language B is P-hard if for every language A in P: A ≤^ℓ B.
It is P-complete if it is P-hard and in P.

Building a P-hard set is easy: take an enumeration (M_e) of all polynomial time TMs, and build the analogue of the Halting set:

K_p = { e#x | M_e(x) ≃ 1 }

Of course, there is no reason why K_p should be in P.

More Precisely . . . 47

To obtain (M_e), we start with an arbitrary effective enumeration (N_e) of all Turing machines.

We would like to run a poly-time tester that checks whether N_e runs in polynomial time (if so, keep the machine, otherwise ditch it).

Unfortunately, this tester does not exist: it is undecidable whether a machine runs in polynomial time. Intuitively, we need to check

∃ k ∀ n, x (|x| = n ⇒ T_M(x) ≤ n^k + k)

This looks completely hopeless, something like Σ_2.

Workaround 48

But we can construct a new machine M_e as follows. As always, assume that the enumeration (N_e) lists every machine infinitely often.

attach a clock to N_e, and

stop the computation after at most n^e + e steps;

return the same output as N_e(x) if the computation has converged,

otherwise return 0 (or some other default value).

Note that M_e is total by definition.

Furthermore, if some total machine M runs in polynomial time, then there is an index e such that M(x) ≃ M_e(x) for all x.

If M is partial, we get the same behavior on the support.
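In pseudo-C (an illustrative sketch; the configuration type and the step-wise simulator for N_e are hypothetical primitives):

#include <string.h>

typedef struct Config Config;              /* a configuration of N_e on x */
Config *init_config(int e, const char *x); /* hypothetical simulator API */
int     step(Config *c);                   /* one step; returns 1 iff halted */
int     output(const Config *c);

/* The clocked machine M_e: simulate N_e(x) for at most n^e + e steps. */
int M(int e, const char *x)
{
    long n = (long)strlen(x), p = 1;
    for (int i = 0; i < e; i++)
        p *= n;                            /* p = n^e (overflow ignored) */
    long budget = p + e;

    Config *c = init_config(e, x);
    while (budget-- > 0)
        if (step(c))
            return output(c);              /* N_e(x) converged in time */
    return 0;                              /* default value otherwise */
}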

Circuit Value Problem 49

Problem: Circuit Value Problem (CVP)
Instance: A Boolean circuit C, input value x.
Question: Check if C evaluates to true on input x.

Obviously CVP is solvable in polynomial time (even linear time, given a halfway reasonable representation).

There are several versions of CVP; here is a particularly simple one: compute the value of the “last” variable X_m given

X_0 = 0,  X_1 = 1

X_i = X_{L_i} ◦_i X_{R_i},  i = 2, ..., m

where ◦_i ∈ {∧, ∨} and L_i, R_i < i.
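Evaluating such a straight-line program is a single left-to-right pass. A minimal C sketch (the array encoding of L, R and the operators is my own choice; op[i] is '&' for ∧ and '|' for ∨):

/* Evaluate X_m given X_0 = 0, X_1 = 1 and X_i = X[L[i]] op[i] X[R[i]]
   with L[i], R[i] < i, for i = 2..m. Arrays are indexed 0..m;
   entries 0 and 1 are unused. */
int eval_cvp(int m, const int *L, const int *R, const char *op)
{
    int X[m + 1];                     /* C99 variable-length array */
    X[0] = 0;
    X[1] = 1;
    for (int i = 2; i <= m; i++)
        X[i] = (op[i] == '&') ? (X[L[i]] & X[R[i]])
                              : (X[L[i]] | X[R[i]]);
    return X[m];
}

Each X_i is computed exactly once, so the running time is O(m), matching the linear-time remark above.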

CVP is Hard 50

Theorem (Ladner 1975)

The Circuit Value Problem is P-complete.

Sketch of proof. For hardness, consider any language A accepted by a polynomial time Turing machine M.

We can encode the computation of the Turing machine M on x as a polynomial size circuit (polynomial in n = |x|): use lots of Boolean variables to express tape contents, head position and state.

Constructing this circuit only requires “local” memory; for example, we need O(log n) bits to store a position on the tape.

The circuit evaluates to true iff the machine accepts.

□

Forward pointer: we will give a more careful argument for Cook/Levin.

Alternating Breadth First Search 51

A long list of P-complete problems is known, though they tend to be a bit strange.

For example, consider an undirected graph G = 〈V,E〉 where E is partitioned as E = E_0 ∪ E_1.

Question: Given a start vertex s and a target t, does t get visited along an edge in E_0 if the breadth-first search can only use edges in E_0 at even and E_1 at odd levels?

Theorem

ABFS is P-complete (wrt log-space reductions).

Tractability

Weak Reductions

Sanity Check

Complexity and Algorithmic Analysis 53

In any standard algorithm text you will find a statement along the lines of

Proposition

DFS runs in linear time.

What does that mean in our framework?

First, the size of the input, a ugraph G = 〈V,E〉, is usually given as n + e where n = |V| and e = |E|. This is justified by the standard adjacency list representation and the uniform cost function: in the RealWorld™, no one deals with graphs larger than 2^64 ≈ 1.85 × 10^19, much less 2^128.

If we were to implement DFS on a Turing machine, there would obviously be some slow-down. But still, there is a low-degree polynomial p so that the Turing machine runs in time p(n + e).

Easy versus Hard 54

It is great fun to figure out how small p can be made, but ultimately irrelevant: we are interested in lower bounds, in particular in statements like

Problem such-and-such cannot be solved in polynomial time.

A polynomial increase doesn’t matter in this case: we might as well think about the computation in a luxurious RAM or a C program. If the Turing machine does not run in polynomial time, these won’t either.

This makes it much easier to reason about the problem.

Example 0: Path Existence 55

Given a digraph and two vertices s and t, we can ask whether there is a path from s to t.

The brute-force approach would enumerate all simple paths starting at s, and check if any one of them hits t. Alas, this is exponential.

But we can simply run DFS (or BFS) starting at s, and get the answer in linear time.
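For concreteness, here is a linear-time s-t reachability check by DFS over adjacency lists (my own sketch, not from the slides):

#include <stdlib.h>

/* Is there a path from s to t? adj[v] lists the deg[v] out-neighbors
   of v. Every vertex is stacked at most once and every edge scanned
   at most once: O(n + e) overall. */
int reachable(int n, int **adj, const int *deg, int s, int t)
{
    int  *stack = malloc(n * sizeof *stack);
    char *seen  = calloc(n, 1);
    int top = 0, found = 0;
    seen[s] = 1;
    stack[top++] = s;
    while (top > 0) {
        int v = stack[--top];
        if (v == t) { found = 1; break; }
        for (int i = 0; i < deg[v]; i++) {
            int u = adj[v][i];
            if (!seen[u]) {
                seen[u] = 1;
                stack[top++] = u;
            }
        }
    }
    free(stack);
    free(seen);
    return found;
}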

Example 1: Shortest Path 56

Given a digraph whose edges are labeled by a positive cost, one can ask: what is the shortest path from s to t?

Again, brute force would enumerate all simple paths and require exponential time.

But there are better (and more complicated) algorithms, such as Dijkstra or Bellman-Ford, that work in polynomial time.

Example 2: Long Paths 57

Given a ugraph, we can ask: is there a simple path that has length at least k?

Certainly brute-force enumeration of all simple paths would dismantle this problem, and would require exponential time.

But this time no one knows a computational shortcut; in particular, no polynomial time algorithm for this problem is known.

The problem gets nasty when k = n − 1: we are asking for a simple path using all vertices. This seems to be fundamentally different from anything like a shortest path.

Exponential Search 58

As we will see, there are lots of combinatorial problems like the last one that are easy to solve with an exponential search, while everything else about them is polynomial.

Naturally, one would like to establish a lower bound, say, that every algorithm requires 2^n steps (at least for infinitely many inputs).

Alas, these lower bounds seem to be extremely elusive, to the point of constituting a major open problem in math, not just the theory of algorithms.

Old Example 59

We already talked about one other such problem: Vertex Cover.

We are given a ugraph G with n vertices and a bound k ≤ n, and want to know whether G admits a VC of size k.

Obviously we can perform a brute force check on all subsets of size k, but there are \binom{n}{k} such sets—and that number is not polynomial in n for variable k (though everything is fine for fixed k).

So, on the face of it, it is far from clear that VC can be handled in polynomial time.
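For comparison, the brute-force check in C (an illustrative sketch; subsets are encoded as bitmasks, and __builtin_popcount assumes GCC or Clang):

#include <stdint.h>

/* Does the ugraph on n <= 31 vertices with edge list e[0..m-1] have a
   vertex cover of size k? Tries all subsets of size k -- there are
   binomially many, which is exponential for variable k. */
int has_cover_bruteforce(int n, int m, const int (*e)[2], int k)
{
    for (uint32_t s = 0; s < (1u << n); s++) {
        if (__builtin_popcount(s) != k)
            continue;                       /* not a k-subset */
        int ok = 1;
        for (int i = 0; i < m; i++)
            if (!((s >> e[i][0]) & 1) && !((s >> e[i][1]) & 1)) {
                ok = 0;                     /* edge e[i] is uncovered */
                break;
            }
        if (ok) return 1;
    }
    return 0;
}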
