theory of computation (fall 2014): chomsky normal form parse trees & yield string lengths; cfl...

Theory of Computation CNF Parse Trees & Yield String Lengths, CFL Pumping Lemma, Chomsky Hierarchy Revisited Vladimir Kulyukin


Post on 14-Dec-2014


Theory of Computation

CNF Parse Trees & Yield String Lengths, CFL Pumping Lemma, Chomsky Hierarchy Revisited

Vladimir Kulyukin

Outline

Chomsky Normal Form (CNF) Parse Trees & Yield String Lengths

CFL Pumping Lemma & Its Use

Chomsky Hierarchy Revisited

CNF Parse Trees & Yield String Lengths

Review: Chomsky Normal Form (CNF)

A grammar G = (V, T, S, P) is said to be in Chomsky Normal Form (CNF) if each production in P has one of the following forms:

1) A → BC

2) A → a

where A, B, C are in V and a is in T.
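The two production forms can be checked mechanically. Below is a minimal Python sketch, assuming a hypothetical encoding in which the productions are a dictionary from a variable to a list of bodies and each body is a tuple of symbols; the function name `is_cnf` and both sample grammars are illustrative and do not come from the slides.

```python
def is_cnf(variables, terminals, productions):
    """Return True if every production has the form A -> BC or A -> a."""
    for head, bodies in productions.items():
        if head not in variables:
            return False
        for body in bodies:
            # Form 1: exactly two symbols, both variables.
            is_pair = len(body) == 2 and all(s in variables for s in body)
            # Form 2: exactly one symbol, a terminal.
            is_terminal = len(body) == 1 and body[0] in terminals
            if not (is_pair or is_terminal):
                return False
    return True


# Hypothetical sample grammars (assumptions made for illustration).
V = {"S", "A", "B"}
T = {"a", "b"}
good = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}
bad = {"S": [("a", "S", "b")], "A": [("a",)], "B": [("b",)]}

print(is_cnf(V, T, good), is_cnf(V, T, bad))  # True False
```

The second grammar fails because the body aSb has three symbols and mixes terminals with a variable.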

Sample CNF Derivation

G’s Productions:

1. S → AB | BC

2. A → BA | a

3. B → CC | b

4. C → AB | a

Parse tree for the string bab:

        S
       / \
      B   C
      |  / \
      b A   B
        |   |
        a   b

Derivation: S ⇒ BC ⇒ bC ⇒ bAB ⇒ baB ⇒ bab
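The derivation of bab can be replayed step by step. The following Python sketch assumes a hypothetical encoding of G's productions as tuples and an illustrative helper named `leftmost_derivation`; it always rewrites the leftmost variable and rejects any step that is not a production of G.

```python
# Productions of G, encoded as head -> list of bodies (tuples of symbols).
P = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}


def leftmost_derivation(start, steps, productions):
    """Apply each (head, body) step to the leftmost variable.

    Returns the list of sentential forms, starting with the start symbol.
    """
    form = [start]
    forms = ["".join(form)]
    for head, body in steps:
        # The leftmost variable is the first symbol that has productions.
        idx = next(i for i, s in enumerate(form) if s in productions)
        if form[idx] != head or body not in productions[head]:
            raise ValueError(f"step {head} -> {''.join(body)} is not applicable")
        form[idx:idx + 1] = list(body)
        forms.append("".join(form))
    return forms


steps = [
    ("S", ("B", "C")),
    ("B", ("b",)),
    ("C", ("A", "B")),
    ("A", ("a",)),
    ("B", ("b",)),
]
forms = leftmost_derivation("S", steps, P)
print(" => ".join(forms))  # S => BC => bC => bAB => baB => bab
```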

CNF Parse Tree Lemma

Suppose that L is a CFL and G is a CNF grammar for L – {ɛ}. If a string z is in L and the parse tree T for z has no path of length greater than i, then $|z| \le 2^{i-1}$.

Example: Parse Tree & String Length

The parse tree below has no path of length greater than 3 (a longest path, such as S, C, A, a, has three edges).

The length of the string is $|bab| = 3 \le 2^{3-1} = 4$.

        S
       / \
      B   C
      |  / \
      b A   B
        |   |
        a   b
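The bound in this example can be verified directly on the tree. The sketch below assumes a hypothetical representation of a parse tree as nested tuples `(label, child, ...)` with terminal leaves as plain strings; the helper names are illustrative.

```python
def longest_path(tree):
    """Number of edges on the longest root-to-leaf path."""
    if isinstance(tree, str):  # a terminal leaf
        return 0
    return 1 + max(longest_path(child) for child in tree[1:])


def tree_yield(tree):
    """Concatenate the terminal leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(tree_yield(child) for child in tree[1:])


# The parse tree for bab: S -> B C, B -> b, C -> A B, A -> a, B -> b.
tree = ("S", ("B", "b"), ("C", ("A", "a"), ("B", "b")))

i = longest_path(tree)
z = tree_yield(tree)
print(i, z, len(z) <= 2 ** (i - 1))  # 3 bab True
```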

CNF Parse Tree Lemma: Proof

The proof is by induction on the length i of the longest path in the CNF parse tree.

Base case: If i = 1, then the parse tree must be of the form shown below, where a is some terminal symbol. Then $|z| = 1 \le 2^{1-1} = 1$, so the statement of the lemma is true.

      S
      |
      a

CNF Parse Tree Lemma: Proof

Inductive case: If i > 1, then the CNF parse tree must be of the form below: the root S has two children A and B, which are the roots of the sub-trees T1 and T2.

        S
       / \
      A   B
     /_\ /_\
     T1  T2

CNF Parse Tree Lemma: Proof

By the inductive hypothesis, the yield of T1 (the substring covered by the A-rooted tree) has length no greater than $2^{i-2}$, because T1 has no path of length greater than i-1. The same is true for T2. Thus the yield of the S tree has length no greater than $2 \times 2^{i-2} = 2^{i-1}$.

[Figure: the root S with children A and B; the yield of T1 (under A) followed by the yield of T2 (under B) forms the yield of the S tree.]

Equivalent Formulation of CNF Parse Tree Lemma

Suppose that L is a CFL and G is a CNF grammar for L – {ɛ}. If a string z is in L and the length of z is $|z| > 2^{i-1}$, then the parse tree T for z has a path of length greater than i.
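Read as a lower bound, the equivalent formulation says how long the longest path of a CNF parse tree must be for a yield of a given length: it is at least the smallest i with $|z| \le 2^{i-1}$. A small Python sketch of that calculation follows; the function name `min_longest_path` is an assumption made for illustration.

```python
def min_longest_path(length):
    """Smallest i such that length <= 2**(i - 1).

    Any CNF parse tree whose yield has this length must contain
    a path with at least this many edges.
    """
    if length < 1:
        raise ValueError("a CNF grammar derives only non-empty strings")
    # (length - 1).bit_length() equals ceil(log2(length)) for length >= 1.
    return (length - 1).bit_length() + 1


for length in (1, 2, 3, 4, 5, 16):
    print(length, min_longest_path(length))
```

For a yield of length $2^k$ the function returns k+1, which is the path length the CFL Pumping Lemma proof relies on.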

Pumping Lemma for CFLs

Review: Pumping Lemma for Regular Languages

Let $L = L(M)$, where $M$ is a DFA with $n$ states. Let $x \in L$ and $|x| \ge n$. Then we can write $x = uvw$, where $|v| \ge 1$ and $uv^iw \in L$ for $i \ge 0$.
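For regular languages, the pumpable substring is the part of the input read between two visits to the same DFA state. The Python sketch below illustrates this under stated assumptions: the two-state DFA (even number of a's), the transition table `delta`, and the helper names are all hypothetical and do not come from the slides.

```python
def pump_decomposition(delta, start, word):
    """Split word into (u, v, w) where v traces a loop in the DFA.

    Returns None if no state repeats while reading the word.
    """
    seen = {start: 0}  # state -> number of symbols read when first reached
    state = start
    for pos, ch in enumerate(word, start=1):
        state = delta[(state, ch)]
        if state in seen:
            first = seen[state]
            return word[:first], word[first:pos], word[pos:]
        seen[state] = pos
    return None


# Hypothetical DFA: accepts strings over {a, b} with an even number of a's.
delta = {
    ("even", "a"): "odd", ("odd", "a"): "even",
    ("even", "b"): "even", ("odd", "b"): "odd",
}
accepting = {"even"}


def accepts(word):
    state = "even"
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting


u, v, w = pump_decomposition(delta, "even", "aab")
print(repr(u), repr(v), repr(w))  # '' 'aa' 'b'
print(all(accepts(u + v * i + w) for i in range(5)))  # True
```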

The Pumping Lemma for CFLs

Let L be a CFL. There exists a constant n, which depends on the number of variables in a CNF grammar for L, such that if z is in L and |z| ≥ n, then z can be written as z = uvwxy such that:

1) $|vx| \ge 1$

2) $|vwx| \le n$

3) For all $i \ge 0$, $uv^iwx^iy$ is in L

The Pumping Lemma for CFLs: Proof

Suppose that G has k variables, i.e., |V| = k > 1. (If |V| = 1, the grammar is trivial.) Let $n = 2^k$. Suppose that z is in L and $|z| \ge n = 2^k$. Since $2^k > 2^{k-1}$, the equivalent formulation of the CNF Parse Tree Lemma (with i = k) says that the parse tree for z has a path of length greater than k, i.e., of length at least k+1. Such a path must have at least k+2 vertices. Only the last vertex on that path is a terminal symbol, so at least k+1 of its vertices are variables. What does the last statement imply?

The Pumping Lemma for CFLs: Proof

Since G has only k variables and the path has at least k+1 variable vertices, there must be a variable that labels two different vertices on the path.

The Pumping Lemma for CFLs: Proof

Let P be a path that is as long as or longer than any other path in the parse tree for z, so P has length at least k+1. Then the following three statements are true:

1. P has two vertices v1 and v2 that have the same label A.

2. Vertex v1 is closer to the root S than v2.

3. The path from v1 to the leaf is of length at most k+1.

The Pumping Lemma for CFLs: Proof

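Both v1 and v2 can be found as follows: start at the leaf of P and keep going up toward the root. The first k+1 variable vertices encountered cannot all have different labels, because G has only k variables, so two of them, v1 and v2, must have the same label.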
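The bottom-up search for v1 and v2 can be sketched in Python. The code assumes a hypothetical nested-tuple tree representation and a hypothetical CNF grammar for the language of strings $a^m b^m$ with $m \ge 1$ (S → AB | AX, X → SB, A → a, B → b); neither appears in the slides. The sample string aabb is shorter than $2^k$, so a repeated label is not guaranteed by the lemma here; this tree simply happens to contain one.

```python
def longest_path_nodes(tree):
    """Return the sub-trees along one longest root-to-leaf path."""
    if isinstance(tree, str):  # a terminal leaf
        return [tree]
    best = max((longest_path_nodes(c) for c in tree[1:]), key=len)
    return [tree] + best


def repeated_variable(tree):
    """Scan a longest path from the leaf upward.

    Returns (T1, T2): the sub-trees rooted at the first pair of
    vertices found with the same label, or None if all labels differ.
    """
    path = longest_path_nodes(tree)[:-1]  # drop the terminal leaf
    seen = {}  # label -> lowest sub-tree with that label
    for node in reversed(path):
        label = node[0]
        if label in seen:
            return node, seen[label]  # upper vertex v1, lower vertex v2
        seen[label] = node
    return None


def tree_yield(tree):
    if isinstance(tree, str):
        return tree
    return "".join(tree_yield(c) for c in tree[1:])


# Parse tree for aabb in the hypothetical grammar.
inner = ("S", ("A", "a"), ("B", "b"))
tree = ("S", ("A", "a"), ("X", inner, ("B", "b")))

t1, t2 = repeated_variable(tree)
print(t1[0], tree_yield(t1), tree_yield(t2))  # S aabb ab
```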

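The Pumping Lemma for CFLs: Proof

Suppose T1 is the parse tree rooted at v1 and T2 is the parse tree rooted at v2. We know that T2 must be a sub-tree of T1, as shown below. Assume that v1 and v2 are both labeled X.

[Figure: tree T1 rooted at v1 = X, containing the sub-tree T2 rooted at v2 = X; the two vertices are joined by the path from v1 = X to v2 = X.]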

The Pumping Lemma for CFLs: Proof

Suppose z1 is the yield of T1 and z2 is the yield of T2.

[Figure: T1 rooted at the upper X with yield z1; T2 rooted at the lower X with yield z2, a substring of z1; path from v1 = X to v2 = X.]

The Pumping Lemma for CFLs: Proof

z1 = z3z2z4, where z3 and z4 cannot both be empty. Why? Because the first production used in the derivation of z1 must have been of the form X → ZY. Why? Because v1 has the variable vertex v2 below it, so the production applied at v1 cannot be of the form X → a; since G is a CNF grammar, it must be X → ZY, and |z1| > 1. The sub-tree T2 lies entirely under Z or entirely under Y, and the other of the two variables derives a non-empty string, because a CNF grammar has no ɛ-productions. That string is part of z3 or of z4.

[Figure: T1 rooted at the upper X, whose children are Z and Y; T2 rooted at the lower X; the yield z1 = z3z2z4, with z2 the yield of T2, z3 to its left, and z4 to its right; path from v1 = X to v2 = X.]

The Pumping Lemma for CFLs: Proof

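We can pump now! $X \Rightarrow^* z_3 X z_4 \Rightarrow^* z_3 z_3 X z_4 z_4 \Rightarrow^* \cdots \Rightarrow^* z_3^i X z_4^i$, and the lower X derives z2, so $X \Rightarrow^* z_3^i z_2 z_4^i$ for all $i \ge 0$.

Take v = z3, w = z2, x = z4, and let u and y be the parts of z to the left and to the right of z1. Then z = uvwxy and $uv^iwx^iy$ is in L for all $i \ge 0$. Moreover, $|vx| \ge 1$ because z3 and z4 are not both empty, and $|vwx| = |z_1| \le 2^k = n$ by the CNF Parse Tree Lemma, because T1 has no path of length greater than k+1.

[Figure: T1 rooted at the upper X, whose children are Z and Y; T2 rooted at the lower X; the yield z1 = z3z2z4; path from v1 = X to v2 = X.]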
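Once the five pieces are known, pumping is plain string concatenation. The Python sketch below uses a hypothetical example that is not in the slides: the language of strings $a^m b^m$ with $m \ge 1$ and the string aabb, split as u = ɛ, v = a, w = ab, x = b, y = ɛ. The helper names `pump` and `in_anbn` are illustrative.

```python
def pump(u, v, w, x, y, i):
    """Build u v^i w x^i y."""
    return u + v * i + w + x * i + y


def in_anbn(s):
    """Membership test for { a^m b^m | m >= 1 }."""
    m = len(s) // 2
    return m >= 1 and s == "a" * m + "b" * m


# Hypothetical decomposition of z = aabb.
u, v, w, x, y = "", "a", "ab", "b", ""

for i in range(4):
    z_i = pump(u, v, w, x, y, i)
    print(i, z_i, in_anbn(z_i))
```

Every pumped string stays in the language, as condition 3 of the lemma requires; i = 1 gives back the original string aabb.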

Example of the CFL Pumping Lemma Use

Claim: $L = \{a^k b^k c^k \mid k \ge 1\}$ is not a CFL.

Proof: Suppose L is a CFL. Let n be the constant of the CFL Pumping Lemma. Let $z = a^n b^n c^n$. By the CFL Pumping Lemma, z = uvwxy with $|vwx| \le n$ and $|vx| \ge 1$. Since $|vwx| \le n$ and n b’s separate the a’s from the c’s in z, it is impossible for vx to contain both a’s and c’s. There are five cases to consider: vx may contain 1) only a’s; 2) a’s and b’s; 3) b’s and c’s; 4) only b’s; 5) only c’s.

Example of the CFL Pumping Lemma Use

Proof Continued: Let us consider case 1, when vx contains only a’s. Then $uv^0wx^0y = uwy$ has fewer a’s than b’s and c’s, so it is not in L. The same type of argument can be used for cases 4 and 5. Let us consider case 2: if vx contains a’s and b’s, then uwy contains fewer a’s and fewer b’s than c’s. The same type of argument holds for case 3, when vx contains b’s and c’s. In every case uwy is not in L, which contradicts the CFL Pumping Lemma, so L is not a CFL.
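The case analysis can be sanity-checked by brute force for a small value of n. In the Python sketch below, n = 4 is an arbitrary stand-in for the pumping constant (the real constant depends on a grammar that, by the claim, does not exist), and the helper names are illustrative. The search tries every split z = uvwxy with $|vwx| \le n$ and $|vx| \ge 1$ and keeps those whose pumped strings for i = 0 and i = 2 stay in L.

```python
def in_language(s):
    """Membership test for { a^k b^k c^k | k >= 1 }."""
    k = len(s) // 3
    return k >= 1 and s == "a" * k + "b" * k + "c" * k


def pumpable_decompositions(z, n):
    """Yield each (u, v, w, x, y) with |vwx| <= n and |vx| >= 1
    whose pumped strings for i = 0 and i = 2 remain in the language."""
    for start in range(len(z)):
        for end in range(start + 1, min(start + n, len(z)) + 1):
            # vwx = z[start:end]; p and q mark the v|w and w|x boundaries.
            for p in range(start, end + 1):
                for q in range(p, end + 1):
                    u, v, w = z[:start], z[start:p], z[p:q]
                    x, y = z[q:end], z[end:]
                    if not v and not x:
                        continue
                    if all(in_language(u + v * i + w + x * i + y)
                           for i in (0, 2)):
                        yield (u, v, w, x, y)


n = 4  # assumed stand-in for the pumping constant
z = "a" * n + "b" * n + "c" * n
survivors = list(pumpable_decompositions(z, n))
print(survivors)  # []
```

The empty result matches the proof: no admissible decomposition survives even the i = 0 test. A check for one small n is only an illustration, not a substitute for the argument above.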

Two Pumping Lemmas Side By Side

The Pumping Lemma for regular languages states that every sufficiently long string in a given regular set has a non-empty sub-string that can be pumped

The Pumping Lemma for CFLs states that every sufficiently long string in a given CFL has two sub-strings, not both empty, that can be pumped the same number of times

In both cases, new strings obtained through pumping remain in the same language from which the original string comes

Two Pumping Lemmas Side By Side

The Pumping Lemmas are not used to prove that specific languages are regular or CF

The Pumping Lemmas state that if a language is regular/CF, then its sufficiently long strings have specific properties

A typical use is to assume that a language is regular/CF and then show that some sufficiently long string does not satisfy specific properties

Summary

The pumping lemma for regular languages is used to show that there are languages that are not regular (cannot be processed by finite state machines)

The pumping lemma for CFLs is used to show that there are languages that are not CF (cannot be processed by stack machines)

In summary, there are languages that are neither regular nor CF

Chomsky Hierarchy Revisited

Noam Chomsky

http://en.wikipedia.org/wiki/Noam_Chomsky


References & Reading Suggestions

Hopcroft and Ullman. Introduction to Automata Theory, Languages, and Computation, Narosa Publishing House

Moll, Arbib, and Kfoury. An Introduction to Formal Language Theory

Davis, Weyuker, Sigal. Computability, Complexity, and Languages, 2nd Edition, Academic Press

Brooks Webber. Formal Language: A Practical Introduction, Franklin, Beedle & Associates, Inc