
Computational Models - Lecture 3

Non Regular Languages and the Pumping Lemma

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. – p.1


Non Regular Languages and the Pumping Lemma

Algorithmic questions for NDAs

Context Free Grammars

Sipser’s book, 1.4, 2.1, 2.2


If you can’t hear me, you should move closer.

No functioning amplification equipment is expected in the near future (next year or so).


Proved Last Time

Thm.: A language, L, is described by a regular expression, R, if and only if L is regular.

=⇒ Given a regular expression, R, construct an NFA accepting the language of R.

⇐= Given a regular language, L, construct an equivalent regular expression.


Negative Results

We have made a lot of progress understanding what finite automata can do. But what can’t they do?

Is there a DFA that accepts

B = {0^n 1^n | n ≥ 0}
C = {w | w has an equal number of 0’s and 1’s}
D = {w | w has an equal number of occurrences of 01 and 10 substrings}

Consider B:

DFA must “remember” how many 0’s it has seen,

impossible with finite state.

The others are exactly the same...

Question: Is this a proof?

Answer: No, D is regular!??? (see problem set 1)
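
One way to see why D can be regular: reading a string left to right, occurrences of 01 and 10 alternate, so the two counts are equal exactly when the string is empty or begins and ends with the same symbol. That test needs only finite memory. The Python sketch below checks this characterization by brute force on all short binary strings; the function names are illustrative and not part of the lecture.

```python
from itertools import product

def count_sub(w, pat):
    """Count (possibly overlapping) occurrences of a 2-symbol pattern in w."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == pat)

def in_D(w):
    """Membership in D straight from the definition."""
    return count_sub(w, "01") == count_sub(w, "10")

def finite_memory_test(w):
    """A test a DFA could perform: remember only the first and the last symbol."""
    return w == "" or w[0] == w[-1]

def agree_up_to(max_len):
    """True if both tests agree on every binary string of length <= max_len."""
    return all(
        in_D("".join(t)) == finite_memory_test("".join(t))
        for n in range(max_len + 1)
        for t in product("01", repeat=n)
    )
```

Agreement on short strings is evidence, not a proof; the alternation argument above is what justifies the claim in general.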

Pumping Lemma

We will show that all regular languages have a special property.

Suppose L is regular.

If a string in L is at least as long as a certain critical length ℓ (the pumping length),

then it can be “pumped” to a longer string by repeating an internal substring any number of times.

The longer string must be in L too.

This is a powerful technique for showing that a language is not regular.


Pumping Lemma

Theorem: If L is a regular language, then there is an ℓ > 0 (the pumping length), where if s is any string in L of length |s| ≥ ℓ, then s may be divided into three pieces s = xyz such that

for every i ≥ 0, xy^i z ∈ L,

|y| > 0, and

|xy| ≤ ℓ.

Remarks: Without the second condition, the theorem would be trivial.

The third condition is technical and sometimes useful.


Pumping Lemma – Proof

Let M = (Q, Σ, δ, q1, F) be a DFA that accepts L.

Let ℓ be |Q|, the number of states of M.

If s ∈ L has length at least ℓ, consider the sequence of states M goes through as it reads s:

     s1    s2    s3    s4    s5    s6   . . .   sn
  ↑     ↑     ↑     ↑     ↑     ↑     ↑      ↑      ↑
  q1    q20   q9    q17   q12   q13   q9     q2     q5 ∈ F

Since the sequence of states is of length |s| + 1 > ℓ, and there are only ℓ different states in Q, at least one state is repeated (by the pigeonhole principle).


Pumping Lemma – Proof (cont.)

Write down s = xyz

[Figure: the path of M on s. x leads from q1 to q9, y is a loop from q9 back to q9, and z leads from q9 to q5.]

By inspection, M accepts xy^k z for every k ≥ 0.

|y| > 0 because the state (q9 in figure) is repeated.

To ensure that |xy| ≤ ℓ, pick the first state repetition, which must occur no later than ℓ + 1 states into the sequence.
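
The proof is constructive, and the construction can be sketched in Python. The sketch assumes a DFA given as a dictionary `delta` mapping (state, symbol) pairs to states; that representation and the names `pump_split` and `accepts` are illustrative choices, not part of the lecture. The function runs the DFA on s, stops at the first repeated state, and cuts s there.

```python
def pump_split(delta, start, s):
    """Split s into (x, y, z) at the first repeated state on the run of the DFA.

    Returns None if no state repeats (only possible when len(s) < number of states).
    """
    first_seen = {start: 0}          # state -> number of symbols read when first visited
    state = start
    for i, symbol in enumerate(s, start=1):
        state = delta[(state, symbol)]
        if state in first_seen:      # first repetition: the loop is s[j:i]
            j = first_seen[state]
            return s[:j], s[j:i], s[i:]
        first_seen[state] = i
    return None

def accepts(delta, start, accepting, s):
    """Run the DFA on s and report whether it ends in an accepting state."""
    state = start
    for symbol in s:
        state = delta[(state, symbol)]
    return state in accepting

# Example DFA (an assumption for illustration): binary strings with an even number of 1's.
EVEN_ONES = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
x, y, z = pump_split(EVEN_ONES, "e", "1001")
```

Because the cut is made at the first repetition, at most |Q| symbols have been read by then, which is the |xy| ≤ ℓ condition.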


An Application

Theorem: The language B = {0^n 1^n | n ≥ 0} is not regular.

Proof: By contradiction. Suppose B is regular, accepted by DFA M. Let ℓ be the pumping length.

Consider the string s = 0^ℓ 1^ℓ.

By pumping lemma s = xyz, where xy^k z ∈ B for every k.

If y is all 0, then xy^k z has too many 0’s (for k ≥ 2).

If y is all 1, then xy^k z has too many 1’s (for k ≥ 2).

If y is mixed, then xy^k z is not of right form (for k ≥ 2). ♣
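
The case analysis can be checked mechanically for small pumping lengths. The Python sketch below uses illustrative names, and the bound `max_k` on the number of repetitions tried is an assumption of the sketch. It enumerates every split s = xyz with |y| > 0 and |xy| ≤ p and reports whether any split stays inside the language under pumping.

```python
def in_B(w):
    """Membership in B = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * n + "1" * n

def survives_pumping(s, p, member, max_k=3):
    """True if some split s = xyz with |y| > 0 and |xy| <= p keeps
    x y^k z in the language for every k in 0..max_k."""
    for i in range(p + 1):                 # i = |x|
        for j in range(i + 1, p + 1):      # j = |xy|, so y = s[i:j] is non-empty
            x, y, z = s[:i], s[i:j], s[j:]
            if all(member(x + y * k + z) for k in range(max_k + 1)):
                return True
    return False
```

For s = 0^p 1^p no split survives, which is what the contradiction requires. A finite check like this illustrates the argument for specific p; it does not replace the proof.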

Another Application

Theorem: The language C = {w | w has an equal number of 0’s and 1’s} is not regular.

Proof: By contradiction. Suppose C is regular, accepted by DFA M. Let ℓ be the pumping length.

Consider the string s = 0^ℓ 1^ℓ.

By pumping lemma s = xyz, where xy^k z ∈ C for every k.

If y is all 0, then xy^k z has too many 0’s (for k ≥ 2).

If y is all 1 or mixed, then since |xy| ≤ ℓ, y must be all 0’s, contradiction. ♣


Context Switch


Algorithmic Questions for NDAs

Q.: Given an NDA, N, and a string s, is s ∈ L(N)?

Answer: Construct the DFA equivalent to N and run it on s.

Q.: Is L(N) = ∅?

Answer: This is a reachability question in graphs: Is there a path in the states’ graph of N from the start state to some accepting state? There are simple, efficient algorithms for this task.
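
A minimal sketch of the reachability test in Python, assuming the automaton's state graph is given as a dictionary from each state to its successor states. Edge labels, including ε, are irrelevant for emptiness and are dropped. The representation and the name `is_empty` are illustrative.

```python
from collections import deque

def is_empty(successors, start, accepting):
    """True if no accepting state is reachable from the start state."""
    seen = {start}
    queue = deque([start])
    while queue:                      # breadth-first search over the state graph
        state = queue.popleft()
        if state in accepting:
            return False              # found a path to an accepting state
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True
```

Breadth-first search visits each state and edge at most once, so the test runs in time linear in the size of the automaton.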


More Algorithmic Questions for NDAs

Q.: Is L(N) = Σ∗?

Answer: Check if the complement of L(N) is empty, that is, if Σ∗ \ L(N) = ∅.

Q.: Given N1 and N2, is L(N1) ⊆ L(N2)?

Answer: Check if L(N1) ∩ (Σ∗ \ L(N2)) = ∅.

Q.: Given N1 and N2, is L(N1) = L(N2)?

Answer: Check if L(N1) ⊆ L(N2) and L(N2) ⊆ L(N1).

In the future, we will see that for stronger models of computation, many of these problems cannot be solved by any algorithm.
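
The containment test can be sketched in Python under the assumption that both automata have already been converted to DFAs with total transition functions. Each DFA is a triple (delta, start, accepting); this encoding and the function names are illustrative. The search explores the product automaton and looks for a reachable pair in which the first DFA accepts and the second does not, which is a witness that the intersection with the complement is non-empty.

```python
from collections import deque

def dfa_subset(d1, d2, alphabet):
    """True iff L(d1) is contained in L(d2)."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if p in f1 and q not in f2:   # some string is in L(d1) but not in L(d2)
            return False
        for a in alphabet:
            nxt = (t1[(p, a)], t2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

def dfa_equal(d1, d2, alphabet):
    """Equality as containment in both directions."""
    return dfa_subset(d1, d2, alphabet) and dfa_subset(d2, d1, alphabet)
```

Taking d2's complement never has to be done explicitly: checking `q not in f2` plays the role of swapping accepting and non-accepting states.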


Another, More Radical Context Switch

So far we saw

finite automata,
regular languages,
regular expressions,
pumping lemma for regular languages.

We now introduce stronger machines and languages with more expressive power:

pushdown automata,
context-free languages,
context-free grammars,
pumping lemma for context-free languages.


Context-Free Grammars

This is an example of a context-free grammar, G1:

A → 0A1

A → B

B → #

Terminology:

Each line is a substitution rule or production.

Each rule has the form: symbol → string. The left-hand symbol is a variable (usually upper-case).

A string consists of variables and terminals.

One variable is the start variable.


Rules for Generating Strings

Write down the start variable (lhs of top rule).

Pick a variable written down in the current string and a rule that starts with that variable.

Replace that variable with the right-hand side of that rule.

Repeat until no variables remain.

Return final string (concatenation of terminals).


Example

Grammar G1:

A → 0A1

A → B

B → #

Derivation with G1:

A ⇒ 0A1

⇒ 00A11

⇒ 000A111

⇒ 000B111

⇒ 000#111
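
The derivation above can be replayed in a few lines of Python. This is a minimal sketch assuming single-character symbols, a grammar stored as a dictionary from each variable to its list of right-hand sides, and leftmost replacement; the name `derive` and the list of rule choices are illustrative.

```python
def derive(rules, start, choices):
    """Apply one rule per entry of `choices` to the leftmost variable.

    choices[i] is the index of the right-hand side to use at step i.
    Returns the list of sentential forms, beginning with the start variable.
    """
    current = start
    steps = [current]
    for rhs_index in choices:
        # position of the leftmost variable in the current string
        pos = next(i for i, symbol in enumerate(current) if symbol in rules)
        variable = current[pos]
        current = current[:pos] + rules[variable][rhs_index] + current[pos + 1:]
        steps.append(current)
    return steps

G1 = {"A": ["0A1", "B"], "B": ["#"]}
steps = derive(G1, "A", [0, 0, 0, 1, 0])
```

Here the choices 0, 0, 0 apply A → 0A1 three times, then 1 applies A → B, and the final 0 applies B → #.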


A Parse Tree

[Figure: parse tree for the derivation of 000#111 in G1, with A at the root, nested A nodes below it, then B, and leaves 0 0 0 # 1 1 1.]

Question: What strings can be generated in this way from the grammar G1?

Answer: Exactly those of the form 0^n#1^n (n ≥ 0).


Context-Free Languages

The language generated in this way is the language of the grammar.

For example, L(G1) is {0^n#1^n | n ≥ 0}.

Any language generated by a context-free grammar is called a context-free language.


A Useful Abbreviation

Rules with the same variable on the left-hand side

A → 0A1

A → B

are written as:

A → 0A1 | B


Deriving English-like Sentences

A specific derivation in G2:

<SENTENCE> ⇒ <NOUN-PHRASE><VERB>

⇒ <ARTICLE><NOUN><VERB>

⇒ a <NOUN><VERB>

⇒ a boy <VERB>

⇒ a boy sees

More strings in G2:

a flower sees

the girl touches

Derivation and Parse Tree

[Figure: parse tree of "a boy sees". SENTENCE has children NOUN-PHRASE and VERB; NOUN-PHRASE has children ARTICLE and NOUN; the leaves are a, boy, sees.]

Formal Definitions

A context-free grammar is a 4-tuple (V, Σ, R, S) where

V is a finite set of variables,

Σ is a finite set of terminals,

R is a finite set of rules: each rule is a variable and a finite string of variables and terminals.

S is the start symbol.


Formal Definitions

If u and v are strings of variables and terminals, and A → w is a rule of the grammar, then we say uAv yields uwv, written uAv ⇒ uwv.

We write u ⇒∗ v if u = v or

u ⇒ u1 ⇒ . . . ⇒ uk ⇒ v

for some sequence u1, u2, . . . , uk.

Definition: The language of the grammar is {w ∈ Σ∗ | S ⇒∗ w}.


Example

Consider G4 = (V, {a, b}, R, S).

R (Rules): S → aSb | SS | ε.

Some words in the language: aabb, aababb.

Q.: But what is this language?

Hint: Think of parentheses.
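
To experiment with the hint, the sketch below computes every string of G4 up to a length bound and compares the result with the balanced strings, reading a as "(" and b as ")". It is a minimal sketch: the fixed-point iteration mirrors the three rules directly, and the function names are illustrative.

```python
from itertools import product

def g4_language(max_len):
    """All strings of length <= max_len generated by S -> aSb | SS | ε."""
    lang = set()
    while True:
        new = {""}                                                    # S -> ε
        new |= {"a" + w + "b" for w in lang if len(w) + 2 <= max_len}  # S -> aSb
        new |= {u + v for u in lang for v in lang
                if len(u) + len(v) <= max_len}                        # S -> SS
        if new <= lang:               # nothing new: least fixed point reached
            return lang
        lang |= new

def balanced(w):
    """True if w is balanced when a is read as '(' and b as ')'."""
    depth = 0
    for c in w:
        depth += 1 if c == "a" else -1
        if depth < 0:
            return False
    return depth == 0

def all_balanced(max_len):
    """Every balanced string over {a, b} of length <= max_len, by brute force."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product("ab", repeat=n) if balanced("".join(t))}
```

Restricting to short strings is safe here because every substring derived inside a parse tree is no longer than the whole string.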


Arithmetic Example

Consider (V, Σ, R, E) where

V = {E, T, F}
Σ = {a, +, ×, (, )}

Rules:
E → E + T | T
T → T × F | F
F → (E) | a

Strings generated by the grammar: a + a × a and (a + a) × a.

What is the language of this grammar? Hint: arithmetic expressions.

E = expression, T = term, F = factor.


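Parse Tree for a + a × a

E → E + T | T
T → T × F | F
F → (E) | a

[Figure: parse tree for a + a × a. The root E expands to E + T; the left E derives a through T and F; the right T expands to T × F, with T deriving a through F and F deriving a.]

Parse Tree for (a + a) × a

E → E + T | T
T → T × F | F
F → (E) | a

[Figure: parse tree for (a + a) × a. The root E derives T, which expands to T × F; the left T derives F, which expands to (E) with the inner E expanding to E + T; the right F derives a.]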
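
The shape of these two trees, with × binding tighter than +, can be reproduced by a small recursive-descent parser. This is a sketch under stated assumptions: the left-recursive rules E → E + T and T → T × F are handled with loops, which is a standard substitute and not something the lecture covers, and the output is a simplified operator tree of nested tuples that omits the unit steps E → T → F.

```python
def parse(text):
    """Parse an expression over a, +, ×, ( and ) into nested tuples."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def expr():                       # E -> E + T | T
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    def term():                       # T -> T × F | F
        node = factor()
        while peek() == "×":
            eat("×")
            node = ("×", node, factor())
        return node

    def factor():                     # F -> (E) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree
```

Each string has exactly one result, which reflects that this grammar assigns a single parse tree to each expression.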


Designing Context-Free Grammars

No recipe in general, but a few rules-of-thumb:

If the language is the union of several context-free languages, write a CFG for each, rename variables (not terminals) so they are disjoint, and add new rule S → S1 | S2 | . . . | Si.

To construct CFG for a regular language, “follow” a DFA for the language. For initial state q0, make R0 the start variable. For state transition δ(qi, a) = qj add rule Ri → aRj to grammar. For each final state qf, add rule Rf → ε to grammar.

For languages with linked substrings (like {0^n#1^n | n ≥ 0}), a rule of form R → uRv may be helpful, forcing desired relation between substrings.
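
The DFA rule-of-thumb translates directly into code. A minimal Python sketch, assuming the DFA is given as a dictionary from (state, symbol) pairs to states and that right-hand sides are stored as tuples, with the empty tuple standing for ε; the function name and the encoding are illustrative.

```python
def dfa_to_cfg(delta, start, accepting):
    """Build a CFG that follows the DFA: one variable R<q> per state q."""
    rules = {}
    for (q, a), r in delta.items():
        # transition δ(q, a) = r becomes rule R_q -> a R_r
        rules.setdefault(f"R{q}", []).append((a, f"R{r}"))
    for q in accepting:
        # each final state q gets the rule R_q -> ε
        rules.setdefault(f"R{q}", []).append(())
    return rules, f"R{start}"

# Example DFA (an assumption for illustration): even number of 1's, states 0 and 1.
delta = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
rules, start_variable = dfa_to_cfg(delta, 0, {0})
```

For this DFA the result is R0 → 0R0 | 1R1 | ε and R1 → 0R1 | 1R0, with R0 as the start variable.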


Closure Properties

Regular languages are closed under
union
concatenation
star

Context-Free Languages are closed under
union: S → S1 | S2
concatenation: S → S1S2
star: S → ε | S1S


More Closure Properties

Regular languages are also closed under
complement (swap accept/non-accept states of DFA)
intersection (L1 ∩ L2 = Σ∗ \ ((Σ∗ \ L1) ∪ (Σ∗ \ L2))).

What about complement and intersection of context-free languages?

Not clear . . .


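Ambiguity

Grammar: E → E+E | E×E | (E) | a

[Figure: two different parse trees for a × a + a. In one, the root expands to E + E and its left E expands to E × E; in the other, the root expands to E × E and its right E expands to E + E.]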

Ambiguity

We say that a string w is derived ambiguously from grammar G if w has two or more parse trees that generate it from G.

Ambiguity is usually not only a syntactic notion but also a semantic one, implying multiple meanings for the same string.

It is sometimes possible to eliminate ambiguity by finding a different context-free grammar generating the same language. This is true for the grammar above, which can be replaced by the unambiguous grammar from slide (14x2).

Some languages (e.g. {1^i 2^j 3^k | i = j or j = k}) are inherently ambiguous.
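
For the grammar E → E+E | E×E | (E) | a, the number of parse trees of a string can be counted by trying every rule and every split point. The Python sketch below does this with memoization; the function name is illustrative, and the counting scheme is written specifically for this grammar and is not a general parser.

```python
from functools import lru_cache

def count_parse_trees(w):
    """Number of parse trees of w under E -> E+E | E×E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        """Parse trees deriving the substring w[i:j] from E."""
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if w[i] == "a" else 0                 # E -> a
        total = 0
        if w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)                   # E -> (E)
        for k in range(i + 1, j - 1):
            if w[k] in "+×":
                total += count(i, k) * count(k + 1, j)     # E -> E+E or E -> E×E
        return total

    return count(0, len(w))
```

A count of two or more means the string is derived ambiguously; parenthesizing removes the ambiguity for that string but not for the grammar.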


Chomsky Normal Form

A simplified, canonical form of context-free grammars. Every rule has the form

A → BC

A → a

S → ε

where S is the start symbol, A, B and C are any variables, except that B and C may not be the start symbol, and A can be the start symbol.

Theorem

Theorem: Any context-free language is generated by a context-free grammar in Chomsky normal form.

Basic idea:

Add new start symbol S0.

Eliminate all ε rules of the form A → ε.

Eliminate all “unit” rules of the form A → B.

Patch up rules so that grammar generates the same language.

Convert remaining long rules to proper form.


Proof

Add new start symbol S0 and rule S0 → S.

Guarantees that new start symbol does not appear on right-hand-side of a rule.


Proof

Eliminating ε rules.

Repeat:

remove some A → ε.

for each R → uAv, add rule R → uv.

and so on: for R → uAvAw add R → uvAw, R → uAvw, and R → uvw.

for R → A add R → ε, except if R → ε has already been removed.

until all ε-rules not involving the original start variable have been removed.
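
The "and so on" step amounts to dropping every non-empty subset of the occurrences of A from a right-hand side. A minimal Python sketch of that single step, with right-hand sides as tuples of symbols; the function name is illustrative, and the bookkeeping of already-removed ε-rules is left out.

```python
from itertools import product

def drop_nullable(rhs, nullable):
    """All right-hand sides obtained from rhs by deleting at least one
    occurrence of the variable `nullable`."""
    positions = [i for i, symbol in enumerate(rhs) if symbol == nullable]
    variants = set()
    for keep in product([True, False], repeat=len(positions)):
        if all(keep):
            continue                  # deleting nothing gives the original rule
        dropped = {p for p, k in zip(positions, keep) if not k}
        variants.add(tuple(s for i, s in enumerate(rhs) if i not in dropped))
    return variants
```

For example, applying it to the right-hand side ASA with A nullable yields SA, AS and S.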


Proof

Eliminate unit rules.

Repeat:

remove some A → B.

for each B → u, add rule A → u, unless this is a previously removed unit rule. (u is a string of variables and terminals.)

until all unit rules have been removed.


Proof

Finally, convert long rules.

To replace each A → u1u2 . . . uk (for k ≥ 3), introduce new non-terminals

N1, N2, . . . , Nk−2

and rules

A → u1N1

N1 → u2N2

...

Nk−3 → uk−2Nk−2

Nk−2 → uk−1uk

Any terminal ui appearing in a rule with two or more symbols on the right is replaced by a new variable Ui with rule Ui → ui. ♠
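
The chain of rules can be generated mechanically. A minimal Python sketch, assuming the rule body is a list of symbols; the function name and the `prefix` parameter used to name the new non-terminals are illustrative.

```python
def split_long_rule(head, body, prefix="N"):
    """Replace head -> u1 u2 ... uk (k >= 3) by a chain of rules with
    two symbols on each right-hand side. Shorter rules are returned unchanged."""
    k = len(body)
    if k < 3:
        return [(head, tuple(body))]
    rules = []
    left = head
    for i in range(1, k - 1):                  # introduces N1 .. N(k-2)
        new_variable = f"{prefix}{i}"
        rules.append((left, (body[i - 1], new_variable)))
        left = new_variable
    rules.append((left, (body[k - 2], body[k - 1])))   # last rule: N(k-2) -> u(k-1) uk
    return rules
```

A rule with k symbols on the right produces k − 1 rules and k − 2 new non-terminals.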

Conversion Example

Initial Grammar:

S → ASA | aB

A → B | S

B → b | ε

(1) Add new start variable:

S0 → S

S → ASA | aB

A → B | S

B → b | ε


Conversion Example (2)

S0 → S

S → ASA | aB

A → B | S

B → b | ε

(2) Remove ε-rule B → ε:

S0 → S

S → ASA | aB | a

A → B | S | ε

B → b


S0 → S

S → ASA | aB | a

A → B | S | ε

B → b

(3) Remove ε-rule A → ε:

S0 → S

S → ASA | aB | a | AS | SA | S

A → B | S

B → b


S0 → S

S → ASA | aB | a | AS | SA | S

A → B | S

B → b

(4) Remove unit rule S → S:

S0 → S

S → ASA | aB | a | AS | SA

A → B | S

B → b

(5) Remove unit rule S0 → S:

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → B | S

B → b

(6) Remove unit rule A → B:

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → S | b

B → b

(7) Remove unit rule A → S:

S0 → ASA | aB | a | AS | SA

S → ASA | aB | a | AS | SA

A → b | ASA | aB | a | AS | SA

B → b

(8) Final simplification – treat long rules:

S0 → AA1 | UB | a | SA | AS

S → AA1 | UB | a | SA | AS

A → b | AA1 | UB | a | SA | AS

A1 → SA

U → a

B → b √
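
As a sanity check on the conversion, the languages of the initial grammar and of the final grammar can be compared on all strings up to a length bound. The Python sketch below computes, for each variable, the set of short terminal strings it derives by iterating the rules to a fixed point. The encoding (tuples of symbols, empty tuple for ε) and the name `language` are illustrative, and agreement up to a bound is evidence of a correct conversion, not a proof.

```python
from itertools import product

def language(rules, start, max_len):
    """Strings of length <= max_len derivable from `start`.

    rules maps each variable to a list of right-hand sides (tuples of symbols);
    any symbol that is not a key of rules is treated as a terminal.
    """
    lang = {variable: set() for variable in rules}
    changed = True
    while changed:
        changed = False
        for variable, alternatives in rules.items():
            for rhs in alternatives:
                # snapshot of what each symbol can derive so far
                options = [tuple(lang[s]) if s in rules else (s,) for s in rhs]
                for parts in product(*options):
                    w = "".join(parts)
                    if len(w) <= max_len and w not in lang[variable]:
                        lang[variable].add(w)
                        changed = True
    return lang[start]

INITIAL = {
    "S": [("A", "S", "A"), ("a", "B")],
    "A": [("B",), ("S",)],
    "B": [("b",), ()],
}
body = [("A", "A1"), ("U", "B"), ("a",), ("S", "A"), ("A", "S")]
FINAL = {
    "S0": list(body),
    "S": list(body),
    "A": [("b",)] + list(body),
    "A1": [("S", "A")],
    "U": [("a",)],
    "B": [("b",)],
}
```

Comparing `language(INITIAL, "S", n)` with `language(FINAL, "S0", n)` for a small n exercises every step of the conversion at once.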

