Derandomizing Space-Bounded Computations

Slides by Elery Pfeffer and Elad Hazan, based on slides by Michael Lewin & Robert Sayegh. Adapted from Oded Goldreich's course lecture notes by Eilon Reshef.


TRANSCRIPT

Page 1: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

1

Slides by Elery Pfeffer and Elad Hazan, based on slides by Michael Lewin & Robert Sayegh.

Adapted from Oded Goldreich's course lecture notes by Eilon Reshef.

Page 2: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

2

Introduction

In this lecture we will show our main result: BPSPACE(log n) ⊆ DSPACE(log² n).

The result will be derived in the following order:
- Formal definitions
- Execution graph representation
- A pseudorandom generator based on universal hash functions (UHF)
- Analysis of the execution graph traversal by the pseudorandom generator

Page 3: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

3

Definition of BPSPACE(·)

BPSPACE(·) is the family of bounded-probability space-bounded complexity classes.

Def: The complexity class BPSPACE(s(·)) is the set of all languages L s.t. there exists a randomized TM M that on input x:

- M uses at most s(|x|) space.
- The running time of M is bounded by exp(s(|x|)).
- x ∈ L ⇒ Pr[M(x) = 1] ≥ 2/3.
- x ∉ L ⇒ Pr[M(x) = 1] ≤ 1/3.

s(·) is any complexity function s.t. s(·) ≥ log(·).

We focus on BPL, namely, BPSPACE(log)

16.2

Note: without the running-time condition in the definition of BPSPACE(·) above, NSPACE(·) = RSPACE(·).

Page 4: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

4

Execution Graphs

We represent the execution of a BPSPACE machine M on input x as a layered directed graph G_{M,x}.

The vertices in the i-th layer (V^i_{M,x}) correspond to all the possible configurations of M after it has used i random bits.

Each vertex has 2 outgoing edges corresponding to reading a “0” or “1” bit from the random tape.

Note: Width of G_{M,x} = |V^i_{M,x}| ≤ 2^{s(n)} · s(n) · n ≤ exp(s(n))

Depth of G_{M,x} = number of layers ≤ exp(s(n))


16.3

Page 5: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

5

Execution Graph Example

[Figure: an example execution graph, with its width and depth indicated.]

Page 6: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

6

Execution Graph Definitions

The set of final vertices is partitioned into:
- V_acc - the set of accepting configurations.
- V_rej - the set of rejecting configurations.

A random walk on G_{M,x} is a sequence of steps emanating from the initial configuration and randomly traversing the directed edges of G_{M,x}.

A guided walk on G_{M,x} (with guide R) is a sequence of steps emanating from the initial configuration and traversing the i-th edge in G_{M,x} according to the i-th bit given by the guide R.

Page 7: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

7

Execution Graph Definitions

Denote by ACC(G_{M,x}, R) the event that the R-guided walk reaches a vertex in V_acc.

Thus, Pr[M accepts x] = Pr[ACC(G_{M,x}, R)].

Summarizing, we learn that for a language L in BPL there exists a (D,W)-graph s.t.

- Width: W(n) = exp(s(n)) = poly(n).
- Depth: D(n) = exp(s(n)) = poly(n).
- For a random guide R:
  - x ∈ L ⇒ Pr_R[ACC(G_{M,x}, R)] ≥ 2/3.
  - x ∉ L ⇒ Pr_R[ACC(G_{M,x}, R)] ≤ 1/3.
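To make the (D,W)-graph abstraction concrete, here is a minimal, hedged sketch in Python. The transition function `step` is an arbitrary toy stand-in for M's real configuration transition (it is not taken from the slides), and the accepting set is hypothetical; the point is only to illustrate guided walks and the quantity Pr_R[ACC(G,R)].

```python
import random

# Toy stand-in for a (D, W)-execution graph: configurations are 0..W-1 and
# step(v, bit) is an arbitrary illustrative transition, NOT M's real one.
def step(v, bit, W):
    return (2 * v + bit + 1) % W

def guided_walk(guide, W, accepting):
    """Start at configuration 0 and take the i-th edge per the i-th guide bit."""
    v = 0
    for bit in guide:
        v = step(v, bit, W)
    return v in accepting          # the event ACC(G, guide)

def estimate_acceptance(D, W, accepting, trials=10_000):
    """Monte Carlo estimate of Pr_R[ACC(G, R)] over truly random guides R."""
    hits = sum(
        guided_walk([random.randrange(2) for _ in range(D)], W, accepting)
        for _ in range(trials)
    )
    return hits / trials

if __name__ == "__main__":
    D, W = 16, 8
    V_acc = {0, 1}                 # hypothetical set of accepting configurations
    print(estimate_acceptance(D, W, V_acc))
```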

Page 8: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

8

Execution Graph Definitions

We note that the following technical step preserves the behavior of a random walk on G_{M,x}:

pruning the layers of G_{M,x} so that only every l-th layer remains, contracting paths into single edges where necessary.

Denote the new pruned graph as G.

This is done to ease the analysis of the pseudorandom generator further on.
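As an illustration of the pruning step, here is a hedged Python sketch (assuming a one-bit transition function such as the `step` of the earlier execution-graph sketch, passed in as a parameter): one edge of the pruned graph simply composes l consecutive one-bit steps.

```python
def block_step(v, block_bits, step):
    """Traverse one edge of the pruned graph, labeled by an l-bit block."""
    for bit in block_bits:
        v = step(v, bit)
    return v

def pruned_walk(guide_bits, l, step, start=0):
    """Walk the pruned graph: consume the guide in chunks of l bits."""
    v = start
    for i in range(0, len(guide_bits), l):
        v = block_step(v, guide_bits[i:i + l], step)
    return v

# Example usage with a toy one-bit transition over W = 8 configurations.
print(pruned_walk([1, 0, 1, 1, 0, 0, 1, 0], l=4, step=lambda v, b: (2 * v + b + 1) % 8))
```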

Page 9: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

9

Execution Graph Definitions

Clearly, a random walk on G_{M,x} is equivalent to a random walk on G.

[Figure: the pruned graph keeps only layers 0, l, 2l, …; each contracted edge is labeled by the block of l random bits it consumes, e.g. 0101…110 or 1100…010.]

Page 10: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

10

Universal Hash Functions

Def: A family of functions H = {h: A → B} is called a universal family of hash functions if for every x_1, x_2 in A with x_1 ≠ x_2 and every y_1, y_2 in B:

Pr_{h∈H}[h(x_1) = y_1 and h(x_2) = y_2] = 1/|B|²

We will use a linear universal family of hash functions seen previously:

H_l = {h_{a,b}: {0,1}^l → {0,1}^l}, where h_{a,b}(x) = ax + b.

This family has a succinct representation (2l bits) and can be computed in linear space.

16.4
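As a concrete, hedged illustration of the linear family (a sketch, not the slides' own code), the following Python snippet implements h_{a,b}(x) = a·x + b with arithmetic in GF(2^l) for l = 4, and checks the universal-hashing property by enumerating all 2^{2l} pairs (a, b): for a fixed x_1 ≠ x_2 and targets y_1, y_2, exactly one pair maps x_1 to y_1 and x_2 to y_2, i.e. the probability is 1/|B|².

```python
L = 4
IRRED = 0b10011                      # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiplication in GF(2^L): carry-less multiply, reduced mod IRRED."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << L):             # keep a inside L bits
            a ^= IRRED
    return res

def h(a, b, x):
    """The linear hash h_{a,b}(x) = a*x + b (addition in GF(2^L) is XOR)."""
    return gf_mul(a, x) ^ b

# Universality check: exactly one (a, b) out of 2^(2L) sends x1->y1 and x2->y2.
x1, x2, y1, y2 = 3, 7, 9, 14
count = sum(1 for a in range(2 ** L) for b in range(2 ** L)
            if h(a, b, x1) == y1 and h(a, b, x2) == y2)
print(count, "out of", 2 ** (2 * L), "functions")   # 1 out of 256 -> 1/|B|^2
```

The pair (a, b) is exactly the 2l-bit succinct representation mentioned above.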

Page 11: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

11

Universal Hash Functions

For every A ⊆ {0,1}^l, denote by m(A) the probability that a uniformly random element hits the set A:

m(A) = |A| / 2^l

Hash Proposition: For every universal family of hash functions H_l and for every two sets A, B ⊆ {0,1}^l, all but a 2^{-l/5} fraction of the functions h ∈ H_l satisfy

|Pr_x[x ∈ A and h(x) ∈ B] - m(A) · m(B)| ≤ 2^{-l/5}

That is, all but a small fraction of the hash functions map A into B with roughly the probability a truly random function would.

Page 12: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

12

Construction Overview

Def: A function H: {0,1}^k → {0,1}^D is a (D,W)-pseudorandom generator if for every (D,W)-graph G:

|Pr_{R∈{0,1}^D}[ACC(G,R)] - Pr_{R'∈{0,1}^k}[ACC(G,H(R'))]| ≤ 1/10

Prop: There exists a (D,W)-pseudorandom generator H(·) with k(n) = O(log D · log W). Further, H(·) is computable in space linear in its input length.

16.5

Page 13: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

13

Construction Overview

Corollary: There exists a (D,W)-pseudorandom generator H(·) with the following parameters:

- s(n) = Θ(log n)
- D(n) ≤ poly(n)
- W(n) ≤ poly(n)
- k(n) = O(log² n)

By trying all seeds of H, it follows that: BPL ⊆ DSPACE(log² n).

This is NOT surprising since: RL ⊆ NL ⊆ DSPACE(log² n).
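The corollary's derandomization is just seed enumeration plus a majority vote. The sketch below is a hedged illustration (the helpers `prg_expand` and `guided_walk` are assumed, e.g. the earlier toy sketches); the space accounting, O(log² n) for the seed counter plus O(log n) for the simulated configuration, is what places the whole procedure in DSPACE(log² n).

```python
# Hedged sketch of the derandomization: try every seed of H and take a majority.
def derandomized_accept(prg_expand, guided_walk, k):
    """prg_expand: seed -> guide bits; guided_walk: guide bits -> accept/reject."""
    accepts = 0
    for seed in range(2 ** k):           # deterministic enumeration of all 2^k seeds
        if guided_walk(prg_expand(seed)):
            accepts += 1
    # H changes the acceptance probability by at most 1/10, so the 2/3 vs 1/3
    # gap survives and a strict majority still decides membership.
    return 2 * accepts > 2 ** k
```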

Page 14: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

14

The Pseudorandom Generator

We will define a (D,W)-pseudorandom generator H. Assume without loss of generality that D ≤ W.

H extends strings of length O(l²) to strings of length D ≤ exp(l), for l = Θ(log W).

The input to H is the tuple: I = (r, <h_1>, <h_2>, …, <h_{l'}>)

where |r| = l, the <h_i> are representations of functions in H_l, and l' = log(D/l).

Obviously, |I| = O(l²).

16.6

Page 15: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

15

The Pseudorandom Generator

The computation of the PRG can be represented as a complete binary tree of depth l'.

The output of H is the concatenation of the binary values of the leaves.

Formally: H(r, <h_i>, …, <h_{l'}>) = H(r, <h_{i+1}>, …, <h_{l'}>) · H(h_i(r), <h_{i+1}>, …, <h_{l'}>), where · denotes concatenation,

starting with H(z) = z.

[Figure: the computation tree for l' = 3. The root is labeled r; each internal node labeled z has a left child labeled z and a right child labeled h_i(z), where h_i is the hash function of that level. The leaves read, left to right: r, h_3(r), h_2(r), h_3(h_2(r)), h_1(r), h_3(h_1(r)), h_2(h_1(r)), h_3(h_2(h_1(r))).]
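A hedged Python sketch of this recursive expansion (not the slides' own code): the hash functions are passed as callables, here toy modular-arithmetic stand-ins; the GF(2^l) family from the earlier sketch could be plugged in instead. The recursion mirrors the tree: the left subtree reuses the seed block, the right subtree starts from its hash.

```python
def expand(r, hash_fns):
    """H(r, <h_i>, ..., <h_l'>): returns 2^(len(hash_fns)) output blocks."""
    if not hash_fns:
        return [r]                            # base case: H(z) = z
    head, rest = hash_fns[0], hash_fns[1:]
    return expand(r, rest) + expand(head(r), rest)   # left subtree ++ right subtree

# Toy usage with l = 4 bit blocks and l' = 3 stand-in hash functions:
hs = [lambda x: (3 * x + 5) % 16,             # stand-in for h_1
      lambda x: (7 * x + 2) % 16,             # stand-in for h_2
      lambda x: (9 * x + 12) % 16]            # stand-in for h_3
print(expand(0b1010, hs))                     # 8 blocks = 32 pseudorandom bits
```

With l' = log(D/l) levels, the 2^{l'} = D/l leaves of l bits each give the full D-bit guide, while the seed is only |r| + l'·2l = O(l²) bits.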

Page 16: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

16

Intuition

[Figure: the leaves of the computation tree (r, h_3(r), h_2(r), h_3(h_2(r)), h_1(r), h_3(h_1(r)), h_2(h_1(r)), h_3(h_2(h_1(r)))) form the resulting pseudorandom sequence, which is used to traverse the execution graph at layers 0, l, 2l, 3l, 4l, 5l, 6l, 7l.]

The output of the pseudorandom generator represents a path in the execution graph.

The sequence is constructed using only l bits for r and l' hash functions.

Page 17: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

17

Analysis

Claim: H is indeed a (D,W)-pseudorandom generator

We will show that a guided walk on G_{M,x} using the guide H(I) behaves almost as a truly random walk.

We will perform a sequence of coarsenings. At each coarsening we will use a new hash function from H_l and reduce the number of truly random bits by a factor of 2. After l' such coarsenings, the only random bits remaining are the l random bits of r and the l' representations of the hash functions.

16.7

Page 18: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

18

Analysis

At the first coarsening we replace the random guide

R = (R_1, R_2, …, R_{D/l})

with the semi-random guide

R' = (R_1, h_{l'}(R_1), R_3, h_{l'}(R_3), …, R_{D/l-1}, h_{l'}(R_{D/l-1})).

We will show that this semi-random guide succeeds in "fooling" G_{M,x}.

[Figure: with l' = 3, the retained blocks r_1, r_3, r_5, r_7 are each followed by their hash: r_1, h_3(r_1), r_3, h_3(r_3), r_5, h_3(r_5), r_7, h_3(r_7).]
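A hedged one-function sketch of this coarsening (again with a toy stand-in for h_{l'}): every other l-bit block of the guide is kept, and each gap is filled by hashing the block before it.

```python
def coarsen(blocks, h_last):
    """R = (R1, R2, ...) -> R' = (R1, h(R1), R3, h(R3), ...)."""
    out = []
    for r in blocks[0::2]:        # keep R1, R3, R5, ...
        out.append(r)
        out.append(h_last(r))     # replace R2, R4, ... by h_l'(R1), h_l'(R3), ...
    return out

print(coarsen([1, 9, 4, 7, 6, 2, 11, 5], lambda x: (5 * x + 3) % 16))
```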

Page 19: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

19

Intuition

[Figure: per level of the tree, the probability of acceptance when the walk is guided by the pseudorandom sequence (seed blocks expanded by the hash functions) is within ε of the probability of acceptance under a truly random guide.]

By summing over all l' levels in the tree we will show that the total difference in the probability to accept is less than 1/10.

Page 20: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

20

Analysis

At the second coarsening we again replace half of the random bits by choosing a new hash function:

R' = (R_1, h_{l'}(R_1), R_3, h_{l'}(R_3), …, R_{D/l-1}, h_{l'}(R_{D/l-1}))

R'' = (R_1, h_{l'}(R_1), h_{l'-1}(R_1), h_{l'-1}(h_{l'}(R_1)), …)

And again we show that this semi-random guide also succeeds in "fooling" G_{M,x}.

[Figure: with l' = 3, the block r_1 expands to r_1, h_3(r_1), h_2(r_1), h_3(h_2(r_1)), and the block r_5 expands to r_5, h_3(r_5), h_2(r_5), h_3(h_2(r_5)).]

Page 21: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

21

Analysis

And so on, until we have performed l' such coarsenings, at which point we have proven that the generator H(I) is indeed a (D,W)-pseudorandom generator.

We recall that the pruned execution graph was denoted G.

Page 22: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

22

Analysis

At the first coarsening we replace the random guide R = (R_1, R_2, …, R_{D/l}) with the semi-random guide

R' = (R_1, h_{l'}(R_1), R_3, h_{l'}(R_3), …, R_{D/l-1}, h_{l'}(R_{D/l-1})).

We show that: |Pr_R[ACC(G,R)] - Pr_{R'}[ACC(G,R')]| < ε

Page 23: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

23

Analysis

We perform preprocessing by removing from G all edges (u,v) whose traversal probability is very small, that is, Pr_R[u→v] < 1/W². Denote the new graph by G'.

Lemma 1: For ε_1 = 2/W,

1. |Pr_R[ACC(G',R)] - Pr_R[ACC(G,R)]| < ε_1

2. |Pr_{R'}[ACC(G',R')] - Pr_{R'}[ACC(G,R')]| < ε_1

Proof: For the first part, the probability that a random walk uses a low-probability edge is at most D·(1/W²) ≤ 1/W < ε_1. For the second part, we consider two consecutive steps. The first step is truly random, and the probability of traversing any particular low-probability edge is less than 1/W². On the second step we use the hash proposition for the set {0,1}^l and the set of low-probability edges.

Page 24: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

24

AnalysisAnalysis

Proof (continued): For all but a 2^{-l/5} fraction of hash functions, the traversal probability is bounded by 1/W² + 2^{-l/5} < 2/W². On the whole, except for a (D/2)·2^{-l/5} < ε_1/2 fraction of the hash functions, the overall probability of traversing a low-probability edge is bounded by D·(2/W²) < ε_1/2. Thus the total probability is bounded by ε_1.

Thus removing low probability edges does not significantly affect the outcome of G.

Page 25: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

25

Analysis

We will show that on G’ the semi-random guide R’ performs similarly to the true random guide R.

Lemma 2: |Pr_R[ACC(G',R)] - Pr_{R'}[ACC(G',R')]| < ε_2

Proof: Consider first 3 consecutive vertices u, v, w and the sets of edges between them, E_{u,v} and E_{v,w}.

The probability that a random walk leaves u and reaches w through v is:

Pr_{u-v-w} = Pr_{R_1,R_2∈{0,1}^l}[R_1 ∈ E_{u,v} and R_2 ∈ E_{v,w}]

Since we removed low-probability edges: Pr_{u-v-w} ≥ 1/W⁴

Page 26: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

26

Analysis

Proof (continued): The probability that a semi-random walk, determined by the hash function h, leaves u and reaches w through v is:

Pr^h_{u-v-w} = Pr_{R∈{0,1}^l}[R ∈ E_{u,v} and h(R) ∈ E_{v,w}]

Using the hash proposition with respect to the sets E_{u,v} and E_{v,w}, we learn that for all but a 2^{-l/5} fraction of the h's:

|Pr^h_{u-v-w} - Pr_{u-v-w}| ≤ 2^{-l/5}

Applying this to all possible triplets, we learn that for all but an ε_3 = W³·2^{-l/5} fraction of the hash functions:

∀ u,v,w: |Pr^h_{u-v-w} - Pr_{u-v-w}| ≤ 2^{-l/5}

Page 27: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

27

Analysis

Proof (continued): Denote by Pr_{u-w} (resp. Pr^h_{u-w}) the probability of reaching w from u for the random (resp. semi-random) guide:

Pr_{u-w} = Σ_v Pr_{u-v-w} and Pr^h_{u-w} = Σ_v Pr^h_{u-v-w}

Consequently if we assume that h is a “good” hash function,

|Pr_{u-w} - Pr^h_{u-w}| ≤ W·2^{-l/5} ≤ W⁴·2^{-l/5}·Pr_{u-w} ≤ ε_4·Pr_{u-w}

For a large enough constant in l = Θ(log W).

Page 28: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

28

Analysis

Proof (continued): Since the probability of traversing any path P in G decomposes into the probabilities of traversing each two-hop u-v-w along it, we learn that:

|Pr[R’ = P] - Pr[R = P]| 4·Pr[R = P]

Summing over all accepting paths: |Pr_R[ACC(G',R)] - Pr_{R'}[ACC(G',R')]| ≤ ε_4·Pr_R[ACC(G',R)] ≤ ε_4

The probability that h is not a good hash function is bounded by ε_3. Therefore, if we define ε_2 = ε_3 + ε_4 we prove the lemma:

|PrR[ACC(G’,R)]-PrR’[ACC(G’,R’)] | 2

Page 29: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

29

Analysis

Applying both lemmas, we prove that the semi-random guide R' behaves well in the original graph G_{M,x}:

|Pr_R[ACC(G_{M,x},R)] - Pr_{R'}[ACC(G_{M,x},R')]| ≤ ε, where ε = 2ε_1 + ε_2.

We have proved that the first coarsening succeeds. To proceed, we contract every two adjacent layers of G and create a single edge for every two-hop path taken by R'. Lemmas 1 and 2 can then be reapplied consecutively, until after l' iterations we are left with a bipartite graph traversed by a truly random guide.

Page 30: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

30

Analysis

All in all, we have shown:

|Pr_{R∈{0,1}^D}[ACC(G,R)] - Pr_{I∈{0,1}^k}[ACC(G,H(I))]| ≤ ε·l' ≤ 1/10

which concludes the proof that H is a (D,W)-pseudorandom generator.
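For reference, here is a consolidated view of the error parameters as they are introduced across the preceding slides (a recap, not an additional claim; the relation ε = 2ε_1 + ε_2 corresponds to applying both parts of Lemma 1 and Lemma 2 once per coarsening):

```latex
\begin{align*}
\varepsilon_1 &= \tfrac{2}{W} && \text{(Lemma 1: removing low-probability edges)}\\
\varepsilon_3 &= W^{3}\,2^{-l/5} && \text{(fraction of ``bad'' hash functions)}\\
\varepsilon_4 &\ge W^{4}\,2^{-l/5} && \text{(relative error for a good hash function)}\\
\varepsilon_2 &= \varepsilon_3 + \varepsilon_4 && \text{(Lemma 2)}\\
\varepsilon   &= 2\varepsilon_1 + \varepsilon_2 && \text{(one coarsening)}\\
\varepsilon \cdot l' &\le \tfrac{1}{10} && \text{(summing over the $l'$ coarsenings, for $l = \Theta(\log W)$ large enough)}
\end{align*}
```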

Page 31: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

31

Analysis

Problem: h is a hash function that depends on O(log n) bits and M is a log-space machine. Why can't M differentiate between a truly random guide and a pseudorandom guide by just looking at four consecutive blocks of the pseudorandom sequence z, h(z), z', h(z'), and fully determining h by solving linear equations in log-space?

Solution: During the analysis we required that l = Θ(log W) be large enough. In the underlying computation model this corresponds to the fact that M cannot even retain a description of the hash function h.

Page 32: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

32

Extensions and Related Results

We have shown that BPL ⊆ DSPACE(log² n), but the running time of the straightforward derandomized algorithm is exp(Θ(log² n)), which is superpolynomial. Here we sketch the following result: BPL ⊆ SC ("Steve's Class"), where SC is the class of all languages that can be recognized simultaneously in poly(n) time and polylog(n) space.

Thm: BPL ⊆ SC.

Proof sketch: Suppose we could guess a "good" set of hash functions h_1, h_2, …, h_{l'}. Then all that would be left to do is to enumerate over r, which takes poly(n) time. We will show that we can efficiently find such a good set of hash functions.

16.8

Page 33: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

33

Extensions and Related Results

Proof sketch (continued): We will incrementally fix the hash functions one at a time, from h_{l'} down to h_1. The important point to notice is that, due to the recursive nature of H, whether h is a good hash function or not depends only on the hash functions fixed before it. Therefore it is enough to incrementally find good hash functions.

In order to check whether h is a good hash function we must test whether Lemmas 1 and 2 hold. This requires creating the proper pruned graph G and checking the probabilities on different subsets of edges. Both of these tasks can be performed in poly(n) time.
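A hedged sketch of the greedy selection loop described above. `is_good` is an assumed poly(n)-time predicate implementing the checks of Lemmas 1 and 2 for a candidate hash function given the ones already fixed (thanks to the recursive structure of H it depends on nothing else).

```python
def fix_hash_functions(candidates, num_levels, is_good):
    """Greedily fix h_l', ..., h_1, one per level, testing each candidate."""
    fixed = []
    for _ in range(num_levels):
        for cand in candidates:
            if is_good(cand, fixed):     # assumed poly(n)-time check of Lemmas 1 and 2
                fixed.append(cand)
                break
        else:
            # Cannot happen: all but a small fraction of the candidates are good,
            # so some candidate must pass the check.
            raise RuntimeError("no good hash function found")
    return fixed
```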

Page 34: Slides by Elery Pfeffer and Elad Hazan, Based on slides by Michael Lewin & Robert Sayegh

34

Extensions and Related Results

Proof sketch (continued): Hence, the total time required is l'·poly(n) = poly(n), and the total space required, O(log² n), is dominated by storing the functions h_1, h_2, …, h_{l'}.

Further results (without proof):

1. BPL ⊆ DSPACE(log^{1.5} n).

2. Every random computation that can be carried out in polynomial time and linear space can also be carried out in polynomial time and linear space using only a linear amount of randomness.