Context Free Pumping Lemma Zeph Grunschlag

Post on 21-Dec-2015



Agenda

Context Free Pumping: Motivation, Theorem, Proof

Proving non-Context Freeness: Examples on slides, Examples on blackboard

Pumping FA’s

Strings of length 3 or more in the DFA above can be pumped because such strings correspond to paths of length 3 or more, which visit at least 4 vertices. Hence, the pigeonhole principle guarantees that some vertex is visited twice, and hence there is a pumpable cycle.

[Figure: DFA with transitions labeled 0 and 1; a path through it is split into parts x, y, z.]
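The pigeonhole argument can be sketched in Python. The 3-state DFA below is a hypothetical stand-in (the DFA drawn on the slide is not reproduced here), and the names `run` and `pump_split` are illustrative only.

```python
# Hypothetical 3-state DFA over {0, 1}; NOT the DFA from the slide.
DELTA = {
    ("A", "0"): "B", ("A", "1"): "A",
    ("B", "0"): "C", ("B", "1"): "A",
    ("C", "0"): "C", ("C", "1"): "A",
}

def run(delta, start, w):
    """Return the list of states visited while reading w (len(w) + 1 entries)."""
    states = [start]
    for ch in w:
        states.append(delta[(states[-1], ch)])
    return states

def pump_split(delta, start, w):
    """Split w = x y z where y labels a cycle; None if no state repeats."""
    first_seen = {}
    for pos, q in enumerate(run(delta, start, w)):
        if q in first_seen:
            i = first_seen[q]
            return w[:i], w[i:pos], w[pos:]
        first_seen[q] = pos
    return None

# A string of length 3 visits 4 states, so with 3 states one must repeat.
x, y, z = pump_split(DELTA, "A", "010")
end_state = run(DELTA, "A", "010")[-1]
# Every pumped string x y^i z ends in the same state as the original string.
pumped_ok = all(run(DELTA, "A", x + y * i + z)[-1] == end_state for i in range(5))
```

Because the pumped strings end in the same state as the original, they are accepted exactly when the original is, whichever states are chosen as accepting.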

Pumping PDA’s

However, regular pumping lemma fails in this example.

Q: Give an example of a pattern that cannot be pumped.

[Figure: PDA state diagram; surviving labels are r, s, q, $ and X.]

Pumping PDA’s

A: (^n )^n can’t be pumped within the first half alone.

However, we could pump two substrings at once. I.e. we could take k more left parens if we also take k more right parens. I.e. we can “tandem pump”.


Tandem Pumping

DEF: A string s in L is said to be tandem pumpable if s can be broken up as s = uvxyz such that for all i ≥ 0 we have uv^i x y^i z ∈ L, with at least one of v, y nonempty.

Q1: Is 00111 tandem pumpable in 0*111?

Q2: Is 00100 tandem pumpable in {0^n 1 0^n}?

Q3: Is 00100100 tandem pumpable in {0^n 1 0^n 1 0^n}?

Tandem Pumping

A1: Yes. Any pumpable string is automatically tandem pumpable by letting y = ε. In our case, let u = ε, v = 00, x = y = ε, z = 111. Then uv^i x y^i z = (00)^i 111 is indeed in 0*111.

A2: Yes. Let u = ε, v = 00, x = 1, y = 00 and z = ε. Then uv^i x y^i z = (00)^i 1 (00)^i is indeed in {0^n 1 0^n}.

A3: NO! Tandem pumping 00100100 leads either to the wrong number of 1’s, or would change at most two of the 0-streaks, without the ability to change the remaining 0-streak.
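The three answers can be checked by brute force. The Python sketch below tries every split s = uvxyz with v or y nonempty and tests uv^i x y^i z for i up to a small bound. The function names and the bound `max_i` are assumptions made for illustration. A returned split has only been checked up to the bound, whereas a result of `None` is conclusive.

```python
import re

def tandem_decompositions(s):
    """Yield every (u, v, x, y, z) with s = uvxyz and v or y nonempty."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    u, v, x, y, z = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if v or y:
                        yield u, v, x, y, z

def find_tandem_pump(s, in_lang, max_i=4):
    """Return a split that stays in the language for i = 0..max_i, else None."""
    for u, v, x, y, z in tandem_decompositions(s):
        if all(in_lang(u + v * i + x + y * i + z) for i in range(max_i + 1)):
            return u, v, x, y, z
    return None

def in_q1(w):
    """Membership in 0*111."""
    return re.fullmatch(r"0*111", w) is not None

def in_q2(w):
    """Membership in {0^n 1 0^n}."""
    m = re.fullmatch(r"(0*)1(0*)", w)
    return m is not None and m.group(1) == m.group(2)

def in_q3(w):
    """Membership in {0^n 1 0^n 1 0^n}."""
    m = re.fullmatch(r"(0*)1(0*)1(0*)", w)
    return m is not None and m.group(1) == m.group(2) == m.group(3)

q1 = find_tandem_pump("00111", in_q1)      # some split exists
q2 = find_tandem_pump("00100", in_q2)      # some split exists
q3 = find_tandem_pump("00100100", in_q3)   # no split survives pumping
```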

Tandem Pumping

In general, since pumping automatically implies tandem pumping, all (infinite) regular languages are tandem pumpable. It turns out that all (infinite) context free languages are as well. But Q3 can be generalized to show that {0^n 1 0^n 1 0^n} does not admit tandem pumping of strings which are past a certain length. This will end up proving that {0^n 1 0^n 1 0^n} is not context free:

Context Free Pumping Lemma

THM: Given a context free language L, there is a number p (the tandem-pumping number) such that any string in L of length ≥ p is tandem-pumpable within a substring of length ≤ p. In other words, for all s ∈ L with |s| ≥ p we can write s = uvxyz such that:

1. |vy| ≥ 1 (some pumpable stuff is non-empty)

2. |vxy| ≤ p (pumping inside a length-p portion)

3. uv^i x y^i z ∈ L for all i ≥ 0 (tandem-pump v and y)

CFPL – Intuition

Intuitively, s = uvxyz is found as follows: only finitely many stack changes are possible at cycles of length n (the number of states) in the graph. Thus if s is long enough, there will have to be some states q, r such that the same string is pushed at q as is popped at r, and such that the path from q to r starts and ends with the same stack configuration. With these assumptions, we can then pump up v and y in tandem, as v pushes the same stuff that y pops off.

[Figure: PDA path through states p, q, r, s reading u, v, x, y, z, with stack contents t1 … tk and s1 … sk.]

CFPL – Proof

The previous intuition can actually be formalized and used to prove the Context Free Pumping Lemma. However, this is actually quite painful compared to a very simple grammar-theoretic proof:

Proof of CFPL: We may assume that the grammar for the language is in CNF (Chomsky normal form). This is not an essential assumption but it makes the proof a little easier.

Consider a derivation tree in which some variable node has itself as an ancestor:

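[Figure: derivation tree of “a chuga and a choo for you”. S derives “a”, A, “for you”; the upper A derives “chuga”, A, “choo”; the lower A derives “and a”.]

CFPL – Proof

Could replace the last appearance of A by its first appearance. I.e., in the tree replace A ⇒* “and a” by A ⇒* “chuga and a choo” to get the following:

[Figure: derivation tree of “a chuga chuga and a choo choo for you”, with three nested A nodes.]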

CFPL – Proof

And again:

[Figure: derivation tree of “a chuga chuga chuga and a choo choo choo for you”, with four nested A nodes.]

CFPL – Proof

Or could replace A ⇒* “chuga and a choo” by A ⇒* “and a”. This is called tandem-pumping down:

[Figure: the original derivation tree of “a chuga and a choo for you”.]

CFPL – Proof

[Figure: pumped-down derivation tree of “a and a for you”, with a single A node deriving “and a”.]

CFPL – Proof

In our particular case, we were able to create any string of the form

a (chuga)^i and a (choo)^i for you

In general, any branch down the derivation tree with a repeated variable gives rise to strings of the form uv^i x y^i z, all of which are in L.

The end of the proof is just a counting argument to see when a repeated variable is guaranteed to occur.
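The chuga/choo example can be replayed in a few lines of Python. The two rules used here, S → a A for you and A → chuga A choo | and a, are read off the trees on the slides and should be treated as an assumption; the function names are illustrative.

```python
def expand_A(times):
    """Apply A -> chuga A choo `times` times, then finish with A -> and a."""
    if times == 0:
        return ["and", "a"]
    return ["chuga"] + expand_A(times - 1) + ["choo"]

def derive(times):
    """Yield of the tree rooted at S -> a A for you."""
    return " ".join(["a"] + expand_A(times) + ["for", "you"])

def pump(i):
    """The same strings written directly as u v^i x y^i z (word by word)."""
    u, v, x, y, z = ["a"], ["chuga"], ["and", "a"], ["choo"], ["for", "you"]
    return " ".join(u + v * i + x + y * i + z)

pumped_down = derive(0)   # "a and a for you"
original = derive(1)      # "a chuga and a choo for you"
pumped_up = derive(2)     # "a chuga chuga and a choo choo for you"
```

Repeating the inner rule i times in the tree and repeating v and y i times in the string produce the same result, which is the content of the tandem-pumping claim.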

CFPL – Proof

Q: If n is the number of variables in the grammar, what tree-height guarantees that a variable is repeated?

(Recall: the height of the trivial tree –just a root– is 0)

CFPL – Proof

A: If n is the number of variables in the grammar, any subtree of height h = n+1 will have a repeated variable. This is because the bottom row of a derivation tree is composed of terminals, so height n+1 (= n+2 levels) guarantees n+1 levels of variables, at least on one branch from the root. Pigeonhole principle guarantees that some variable will be encountered twice!

CFPL – Proof

Q: If the grammar is in CNF, what kind of tree is any derivation tree?

CFPL – Proof

A: A binary tree!

Q: What is the maximum number of leaves that a binary tree of height n may have?

CFPL – Proof

A: 2^n

Q: What is the maximum number of leaves that a CNF derivation-tree of height n+1 may have?

CFPL – Proof

A: Still 2^n! This is because the only way to get a terminal is through a rule of the form A → a, so there is no branching at the final level.

Q: What string length will guarantee a derivation tree of height n+1 ?

CFPL – Proof

A: 2^n. This is because no tree with height < n+1 could generate this many leaves, or terminal letters.

This leads to setting the tandem-pumping number to be p = 2^n.

The rest of the theorem follows from the above considerations. The only fact that needs to be verified is that the pumping can happen within a substring of length at most p. This follows from finding a repeating variable in the last n+2 levels of the tree: the subtree rooted at the upper occurrence then has height at most n+1, so it yields at most 2^n = p terminals.
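The counting argument can be checked numerically. The sketch below is illustrative only (the function names are not from the slides); it computes the maximum number of leaves of a CNF derivation tree by height and compares it with p = 2^n.

```python
def max_cnf_leaves(height):
    """Maximum number of leaves of a CNF derivation tree of given height (>= 1)."""
    if height == 1:
        return 1                              # a single rule A -> a
    return 2 * max_cnf_leaves(height - 1)     # a rule A -> BC doubles the count

def pumping_number(n_variables):
    """The tandem-pumping number p = 2^n chosen on the slide."""
    return 2 ** n_variables

# (n, max leaves at height n, max leaves at height n+1, p)
table = [
    (n, max_cnf_leaves(n), max_cnf_leaves(n + 1), pumping_number(n))
    for n in range(1, 8)
]
```

In each row the height-n bound falls below p while the height-(n+1) bound equals p, so a string of length p cannot come from a tree of height less than n+1.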

Proving Non-Context Freeness

Standard method for applying pumping lemma. Only no. 3 changes from example to example:

1. Suppose that the language is context free.

2. Then it would have a pumping no. p.

3. Find a string s in the language, of length at least p, which isn’t tandem-pumpable within a substring of length p.

4. 2 and 3 contradict, so 1 must have been false and the language is not context free.

Proving Non-Context Freeness

Example 1

L = {1^n 0^n 1^n 0^n | n is non-negative}

Proving Non-Context Freeness

Example 1

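The hard part is number 3!!! Try s = 1^p 0^p 1^p 0^p

There are three cases of where the “sliding window” vxy could be.

[Figure: the string 1…1 0…0 1…1 0…0 with window I over the first half, window II over the middle, and window III over the second half.]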

Proving Non-Context Freeness

Example 1

Case I. Pumping up (or down) would change the number of 0’s and/or 1’s in the first half of the string without affecting the second half. This would violate the language definition.

Proving Non-Context Freeness

Example 1

Cases II and III. The same argument works as in Case I. (Case III would cause the second half to change without affecting the first half. Case II would cause the middle to change without either the first 1^p or the last 0^p changing.) This completes the pf. of no. 3.
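For small p the case analysis of Example 1 can be confirmed exhaustively. The Python sketch below enumerates every split with |vxy| ≤ p and |vy| ≥ 1 and tests pumping for i up to a small bound. The helper names and the bound `max_i` are assumptions for illustration, and checking small p does not replace the proof for general p.

```python
import re

def in_L(w):
    """Membership in {1^n 0^n 1^n 0^n | n >= 0}."""
    m = re.fullmatch(r"(1*)(0*)(1*)(0*)", w)
    return m is not None and len({len(g) for g in m.groups()}) == 1

def window_decompositions(s, p):
    """Yield (u, v, x, y, z) with s = uvxyz, |vxy| <= p and |vy| >= 1."""
    n = len(s)
    for a in range(n + 1):                            # window start
        for d in range(a + 1, min(a + p, n) + 1):     # window end
            for b in range(a, d + 1):                 # end of v
                for c in range(b, d + 1):             # start of y
                    v, y = s[a:b], s[c:d]
                    if v or y:
                        yield s[:a], v, s[b:c], y, s[d:]

def find_pump(s, p, in_lang, max_i=3):
    """Return a split that stays in the language for i = 0..max_i, else None."""
    for u, v, x, y, z in window_decompositions(s, p):
        if all(in_lang(u + v * i + x + y * i + z) for i in range(max_i + 1)):
            return u, v, x, y, z
    return None

# For each small p, search for a valid split of s = 1^p 0^p 1^p 0^p.
results = {}
for p in range(1, 5):
    s = "1" * p + "0" * p + "1" * p + "0" * p
    results[p] = find_pump(s, p, in_L)   # None: no window of length <= p pumps
```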

Proving Non-Context Freeness

Example 2

ADD = { x=y+z | x, y, and z are bit-strings which satisfy the equation }

Proving Non-Context Freeness

Example 2

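The hard part is number 3! Define s by:

1^(p+1)=1^p+10^p

(Exponents denote repetition, so 10^p is a 1 followed by p 0’s; read in binary, the equation says (2^(p+1) − 1) = (2^p − 1) + 2^p.)

There are two cases of where the substring vxy could be. (Sliding p-window approach)

[Figure: the string 1^(p+1)=1^p+10^p with two marked window positions, I and II.]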

Proving Non-Context Freeness

Example 2

Case I. v must occur to the left of “=” while y must occur to the right, as otherwise pumping would give too many =’s, or would affect one side of the equation and not the other. Let k be the length of v and l be the length of y. Pumping down results in the supposed equation 1^(p+1-k)=1^(p-l)+10^p. This is impossible because the RHS is then greater than the LHS: the LHS is at most 2^p − 1, while the RHS is at least 2^p.

Proving Non-Context Freeness

Example 2

Case II. Pumping occurs entirely to the right of “=”: the RHS is affected without affecting the LHS. This is impossible since we want the equation to hold in binary.

This finishes the proof that ADD is not context free.
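The arithmetic behind Case I can be spot-checked in Python. This is a sketch for one sample value of p, with illustrative helper names; it confirms that s itself satisfies the equation and that every pumped-down variant with v left of “=” and y right of it fails.

```python
def ones(m):
    """Value of the bit-string 1^m (m ones), i.e. 2^m - 1."""
    return (1 << m) - 1

def add_holds(lhs_ones, rhs_ones, p):
    """Does 1^lhs_ones = 1^rhs_ones + 10^p hold as a binary equation?"""
    return ones(lhs_ones) == ones(rhs_ones) + (1 << p)

p = 6  # sample value chosen for illustration

# The string s itself: 1^(p+1) = 1^p + 10^p.
original_ok = add_holds(p + 1, p, p)

# Pumping down removes k ones on the left and l ones on the right.
pumped_down_ok = [
    add_holds(p + 1 - k, p - l, p)
    for k in range(1, p)
    for l in range(1, p)
    if k + l + 1 <= p          # v, "=" and y must fit in a window of length p
]
```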

Blackboard Exercises

{1^n | n is prime}

{0^n 1^n 0^n 1^n}

{int x; x = 3; | x is an alphabetic string}. Therefore, the claim that Java is context free is a lie. (If x = 3 occurred, must have declared x somewhere in the past!)

UNIX regex (not regexp): (a*)b\1b\1. The “\n” construct refers to the n’th parenthesized sub-expression.
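Python’s `re` module supports the same backreference construct, so the last exercise can be tried directly. This is a small sketch rather than part of the original exercises; it shows that the pattern accepts exactly strings of the shape a^n b a^n b a^n, the same shape as {0^n 1 0^n 1 0^n} from the earlier slides.

```python
import re

# \1 must repeat whatever the first parenthesized group (a*) matched.
PATTERN = re.compile(r"(a*)b\1b\1")

def matches(w):
    """True if the whole string w has the form a^n b a^n b a^n."""
    return PATTERN.fullmatch(w) is not None

balanced = matches("aabaabaa")     # n = 2 in all three streaks
empty_case = matches("bb")         # n = 0
unbalanced = matches("aabaaba")    # last streak too short
```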