CS:4330 Theory of Computation Spring 2018 Context-Free Languages Context-Free Grammars Haniel Barbosa


Source: Context-Free Grammars, Haniel Barbosa, University of Iowa (homepage.divms.uiowa.edu/~hbarbosa/teaching/cs4330/notes/06-ctx-free.pdf)

CS:4330 Theory of Computation
Spring 2018

Context-Free Languages
Context-Free Grammars

Haniel Barbosa

Readings for this lecture

Chapter 2 of [Sipser 1996], 3rd edition. Section 2.1.

Context-Free Grammars (CFG)

• There are languages, such as $\{0^n 1^n \mid n \ge 0\}$, that cannot be described by finite automata (or regular expressions)

• Context-free grammars provide a more powerful mechanism for language specification.

• Context-free grammars can describe features that have a recursive structure, making them useful beyond finite automata.


Formal definition of a CFG

A context-free grammar is a 4-tuple (V , Σ, R, S) in which:

• V is a finite set of symbols called the variables or nonterminals

• Σ is a finite set of symbols, disjoint from V, called terminals

• R is a finite set of rules of the form lhs → rhs, in which lhs ∈ V and rhs ∈ (V ∪ Σ)∗

• S ∈ V is the start nonterminal
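A minimal sketch of how the 4-tuple could be held as data, assuming symbols are short strings and each rule is an `(lhs, rhs)` pair with `rhs` a tuple of symbols. The class name `CFG`, its field names, and `is_well_formed` are illustrative choices, not part of the lecture; the sample grammar is the one with rules A → 0A1, A → B, B → #.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset  # V
    terminals: frozenset  # Sigma
    rules: frozenset      # R, as (lhs, rhs) pairs; rhs is a tuple of symbols
    start: str            # S

    def is_well_formed(self) -> bool:
        """Check the side conditions stated in the 4-tuple definition."""
        symbols = self.variables | self.terminals
        return (
            self.variables.isdisjoint(self.terminals)
            and self.start in self.variables
            and all(
                lhs in self.variables and all(s in symbols for s in rhs)
                for lhs, rhs in self.rules
            )
        )


# Illustrative instance: A -> 0A1 | B, B -> #
G1 = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=frozenset({("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))}),
    start="A",
)
```

An empty tuple `()` as `rhs` would stand for an ε-rule under this encoding.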


Example

• CFG $G_1$ has the following rules:

A → 0A1
A → B
B → #

• Nonterminals of $G_1$ are {A, B} and A is the start symbol

• Terminals of $G_1$ are {0, 1, #}


Language specification

A grammar is used for a language specification by generating each string of the language in the following manner:

1. Write down the start variable; it is the lhs of the first rule, unless specified otherwise

2. Find a variable that is written down and a rule whose lhs is that variable. Replace the written down variable with the rhs of that rule.

3. Repeat step 2 until no variables remain in the string thus generated.

Note

The sequence of substitutions used to obtain a string using a CFG is called a derivation and may be represented by a tree called a derivation tree or a parse tree
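The three steps above can be sketched as a breadth-first search over sentential forms. This is only an illustration under assumptions: rules are stored in a dictionary from variable to alternatives, the leftmost variable is always the one replaced, and forms with more than `max_len` terminals are pruned. That pruning is enough to terminate for the sample grammar A → 0A1 | B, B → #, but not for every grammar (for instance one with ε-rules that lets variables pile up). The names `RULES` and `generate` are made up for this sketch.

```python
from collections import deque

# Sample grammar: A -> 0A1 | B, B -> #
RULES = {"A": [["0", "A", "1"], ["B"]], "B": [["#"]]}


def generate(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    found = set()
    seen = {(start,)}
    queue = deque([(start,)])          # step 1: write down the start variable
    while queue:
        form = queue.popleft()
        # step 2: find a variable that is written down (here: the leftmost one)
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                # step 3: no variables remain
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            # terminals never disappear, so forms with too many can be dropped
            terminal_count = sum(s not in rules for s in new)
            if terminal_count <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found
```

Calling `generate(RULES, "A", 7)` collects the strings of this grammar with at most seven symbols.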


Example derivation tree

The derivation tree of the string 000#111 using CFG $G_1$ is:

A
├─ 0
├─ A
│  ├─ 0
│  ├─ A
│  │  ├─ 0
│  │  ├─ A
│  │  │  └─ B
│  │  │     └─ #
│  │  └─ 1
│  └─ 1
└─ 1


Note

• All strings of terminals generated in this way constitute the language specified by the grammar

• We write $L(G)$ for the language generated by the grammar $G$. Thus, $L(G_1) = \{0^n \# 1^n \mid n \ge 0\}$

• A language generated by a context-free grammar (CFG) is called a Context-Free Language (CFL).


CFG G2

The CFG G2 specifies a fragment of English:

〈SENTENCE〉 → 〈NOUN-PHRASE〉〈VERB-PHRASE〉
〈NOUN-PHRASE〉 → 〈CP-NOUN〉 | 〈CP-NOUN〉〈PREP-PHRASE〉
〈VERB-PHRASE〉 → 〈CP-VERB〉 | 〈CP-VERB〉〈PREP-PHRASE〉
〈PREP-PHRASE〉 → 〈PREP〉〈CP-NOUN〉
〈CP-NOUN〉 → 〈ARTICLE〉〈NOUN〉
〈CP-VERB〉 → 〈VERB〉 | 〈VERB〉〈NOUN-PHRASE〉
〈ARTICLE〉 → a | the
〈NOUN〉 → boy | girl | flower
〈VERB〉 → touches | likes | sees
〈PREP〉 → with


Example derivation with G2

〈SENTENCE〉 ⇒ 〈NOUN-PHRASE〉〈VERB-PHRASE〉
⇒ 〈CP-NOUN〉〈VERB-PHRASE〉
⇒ 〈ARTICLE〉〈NOUN〉〈VERB-PHRASE〉
⇒ a 〈NOUN〉〈VERB-PHRASE〉
⇒ a boy 〈VERB-PHRASE〉
⇒ a boy 〈CP-VERB〉
⇒ a boy 〈VERB〉
⇒ a boy sees


Direct derivation

• If u, v, w ∈ (V ∪ Σ)∗ (i.e. are strings of variables and terminals) and A → w ∈ R (i.e. is a rule of the grammar), then we say that uAv yields uwv, written

uAv ⇒ uwv

• We may also say that uwv is directly derived from uAv using the rule A → w


Derivation

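• We write $u \Rightarrow^{*} v$ if $u = v$ or if a sequence $u_1, \ldots, u_k \in (V \cup \Sigma)^*$ exists, for $k \ge 0$, such that $u \Rightarrow u_1 \Rightarrow \cdots \Rightarrow u_k \Rightarrow v$

• We may also say that $u, u_1, \ldots, u_k, v$ is a derivation of $v$ from $u$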


Language specified by G

If G = (V, Σ, R, S) is a CFG, then the language specified by G (or the language of G) is the CFL

$L(G) = \{w \in \Sigma^* \mid S \Rightarrow^{*} w\}$


More examples of CFGs

• Consider the grammar

$G_3$ = ({S}, {a, b}, {S → aSb | SS | ε}, S)

• $L(G_3)$ contains strings such as

abab, aaabbb, aababb

Note

If one thinks of a and b as the symbols ‘(’ and ‘)’, then we can see that $L(G_3)$ is the language of all strings of properly nested parentheses
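To make the parentheses reading concrete, here is a small sketch that decides membership in the language of S → aSb | SS | ε by trying each rule directly, alongside the usual depth-counting check for balanced parentheses. Both function names are invented for this illustration, and inputs are assumed to be strings over {a, b} only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def in_g3(w: str) -> bool:
    """Decide whether w is generated by S -> aSb | SS | epsilon."""
    if w == "":                                   # S -> epsilon
        return True
    if w[0] == "a" and w[-1] == "b" and in_g3(w[1:-1]):   # S -> aSb
        return True
    # S -> SS; both parts are kept non-empty so the recursion terminates
    return any(in_g3(w[:k]) and in_g3(w[k:]) for k in range(1, len(w)))


def balanced(w: str) -> bool:
    """Read a as '(' and b as ')' and check proper nesting."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0
```

Requiring non-empty halves for S → SS loses nothing: if one half derives ε, the whole string is already derived by the other half.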


Important application

Context-free grammars are used as a basis for compiler design and implementation

• Context-free grammars are used as specification mechanisms for programming languages

• Designers of compilers use such grammars to implement a compiler's components, such as scanners, parsers, code generators, code synthesizers

• The implementation of almost any programming language is preceded by a context-free grammar that specifies it


Example

• Consider the grammar $G_4$ = ({E, T, F}, {a, +, ∗, (, )}, R, E) in which R is:

E → E+T | T
T → T∗F | F
F → (E) | a

• $L(G_4)$ is the language of arithmetic expressions built from the operand a with +, ∗ and parentheses


Application: program synthesis

Consider the grammar $G_S$ = ({E, B}, {0, 1, x, y, +, ite, ≤, ∧, ¬, ‘,’, (, )}, R, E) in which R is:

E → 0 | 1 | x | y | (E+E) | ite(B, E, E)
B → (¬B) | (B∧B) | (E ≤ E)

What is a program generated with this grammar that solves the following problem:

prog(x, y) ≥ x ∧ prog(x, y) ≥ y

A solution is prog(x, y) = ite(x ≤ y, y, x), i.e. the max function.
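One naive way to look for such a program is to enumerate expressions of the grammar in order of nesting depth and test each against the specification. The sketch below does that under several simplifying assumptions that are not part of the slide: only conditions of the form (E ≤ E) with atomic operands are tried (the ¬ and ∧ rules are skipped), and the specification is only tested on a small grid of integers rather than proved. The names `expressions` and `synthesize` are illustrative.

```python
from itertools import product


def expressions(depth):
    """Yield (text, function) pairs for E up to the given nesting depth."""
    atoms = [
        ("0", lambda x, y: 0),
        ("1", lambda x, y: 1),
        ("x", lambda x, y: x),
        ("y", lambda x, y: y),
    ]
    yield from atoms
    if depth == 0:
        return
    smaller = list(expressions(depth - 1))
    # E -> (E+E)
    for (s1, f1), (s2, f2) in product(smaller, repeat=2):
        yield f"({s1}+{s2})", (lambda x, y, f1=f1, f2=f2: f1(x, y) + f2(x, y))
    # E -> ite(B, E, E) with B restricted to (atom <= atom)  [assumption]
    for (c1, g1), (c2, g2), (s1, f1), (s2, f2) in product(atoms, atoms, smaller, smaller):
        yield (
            f"ite(({c1}≤{c2}), {s1}, {s2})",
            (lambda x, y, g1=g1, g2=g2, f1=f1, f2=f2:
                f1(x, y) if g1(x, y) <= g2(x, y) else f2(x, y)),
        )


def synthesize(depth=1, points=range(-3, 4)):
    """Return the first enumerated program meeting the spec on all test points."""
    for text, prog in expressions(depth):
        if all(prog(x, y) >= x and prog(x, y) >= y for x in points for y in points):
            return text
    return None
```

With the default arguments the search reaches the max function from the slide at depth 1. Passing the test grid is evidence, not a proof; a real synthesizer would hand the candidate to a verifier.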


Designing CFGs

• As with the design of automata, the design of CFGs requires creativity

• CFGs are even trickier to construct than finite automata since “we are more accustomed to programming a machine than to specify programming languages.”


Design Techniques

• Many CFGs are unions of simpler CFGs. Hence the suggestion is to construct smaller, simpler grammars first and then to join them into a larger grammar

• The mechanism of grammar combination consists of putting all their rules together and adding the new rule

$S \to S_1 \mid \cdots \mid S_k$

where the nonterminals $S_i$, for $1 \le i \le k$, are the start variables of the individual grammars and $S$ is the new start variable


Example Grammar Design

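Design a grammar for the language $\{0^n 1^n \mid n \ge 0\} \cup \{1^n 0^n \mid n \ge 0\}$

1. Construct the grammar $S_1 \to 0 S_1 1 \mid \varepsilon$ for the language

$\{0^n 1^n \mid n \ge 0\}$

2. Construct the grammar $S_2 \to 1 S_2 0 \mid \varepsilon$ for the language

$\{1^n 0^n \mid n \ge 0\}$

3. Put them together, adding the rule $S \to S_1 \mid S_2$, obtaining

$S \to S_1 \mid S_2$
$S_1 \to 0 S_1 1 \mid \varepsilon$
$S_2 \to 1 S_2 0 \mid \varepsilon$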
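The combination step can be sketched as a small function. It assumes each grammar is given as a `(rules, start)` pair with `rules` a dictionary from variable to a list of right-hand sides (an empty list standing for ε), that the grammars already use distinct variable names, and that the new start variable is unused. The function name `union` and the variable names `S1`, `S2` are choices made for this sketch.

```python
def union(grammars, new_start="S"):
    """Combine grammars by adding new_start -> S_1 | ... | S_k.

    Assumes pairwise distinct variable names across the grammars and a
    fresh new_start; variables would have to be renamed first otherwise.
    """
    combined = {new_start: [[start] for _, start in grammars]}
    for rules, _ in grammars:
        for lhs, alternatives in rules.items():
            assert lhs not in combined, f"variable {lhs} is not fresh"
            combined[lhs] = [list(rhs) for rhs in alternatives]
    return combined, new_start


# The two component grammars from the example
G_a = ({"S1": [["0", "S1", "1"], []]}, "S1")   # 0^n 1^n
G_b = ({"S2": [["1", "S2", "0"], []]}, "S2")   # 1^n 0^n
G_union, start = union([G_a, G_b])
```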


Second Design Technique

• Constructing a CFG for a regular language is easy if one can first construct a DFA for the language

• Conversion procedure:

1. Make a variable $R_i$ for each state $q_i$ of the DFA
2. Add the rule $R_i \to a R_j$ to the CFG if $\delta(q_i, a) = q_j$ is a transition in the DFA
3. Add the rule $R_i \to \varepsilon$ if $q_i$ is an accept state of the DFA
4. If $q_0$ is the start state of the DFA, make $R_0$ the start variable of the CFG.

Theorem

Every regular language is context-free.
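The four conversion steps translate almost line by line into code. The sketch below assumes the DFA is given by a set of states, a transition dictionary keyed by `(state, symbol)`, a start state and a set of accept states; the naming scheme `R_<state>` and the sample DFA (an even number of 1s) are made up for illustration.

```python
def dfa_to_cfg(states, delta, start, accepting):
    """Build CFG rules from a DFA; delta maps (state, symbol) -> state."""
    var = {q: f"R_{q}" for q in states}            # step 1: one variable per state
    rules = {var[q]: [] for q in states}
    for (q, a), q_next in delta.items():           # step 2: R_i -> a R_j
        rules[var[q]].append([a, var[q_next]])
    for q in accepting:                            # step 3: R_i -> epsilon
        rules[var[q]].append([])
    return rules, var[start]                       # step 4: start variable


# Hypothetical DFA over {0, 1}: state "e" = even number of 1s, "o" = odd
delta = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}
rules, start_var = dfa_to_cfg({"e", "o"}, delta, "e", {"e"})
```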


Third Design Technique

• Certain CFLs contain strings with two related substrings, such as $0^n$ and $1^n$ in $\{0^n 1^n \mid n \ge 0\}$

• Example of relationship: to recognize such a language a machine would need to remember an unbounded amount of information about one of the substrings

• A CFG that handles this situation uses a rule of the form R → uRv, which generates strings wherein the portion containing u’s corresponds to the portion containing v’s.


Example Application

Consider the CFG G = ({S, B}, {a, b}, {S → aSB | B | ε, B → bB | b}, S)

• The following are derivations with G:

S ⇒ aSB ⇒ aaSBB ⇒ aaSbBB
S ⇒ aSB ⇒ aaSBB ⇒ aaSBbB
S ⇒ aSB ⇒ aaSBB ⇒ aaBBB
S ⇒ aSB ⇒ aaSBB ⇒ aaBB

which shows that derivations in this grammar can be quite complex

• When rewriting the string aaSBB we can consider further derivations of each of its symbols in isolation

• Derivations from B are $B \Rightarrow bB \Rightarrow bbB \Rightarrow^{*} b^{k-1}B \Rightarrow b^k$, $k \ge 1$

• Therefore $S \Rightarrow aSB \Rightarrow^{*} aSb^k$, $k \ge 1$

• Hence, $L(G) = \{a^n b^m \mid 0 \le n \le m\}$


Ambiguity

• If a CFG generates the same string in several different ways, we say that the string is derived ambiguously in that grammar

• If a CFG generates some string ambiguously, we say that the grammar is ambiguous

Example

The grammar $G_5$, whose rules are

E → E+E | E∗E | (E) | a

generates some arithmetic expressions ambiguously


Ambiguous expressions

Two different derivation trees for a + a ∗ a

Tree 1 (groups + first):

        E
      / | \
     E  ∗  E
   / | \   |
  E  +  E  a
  |     |
  a     a

Tree 2 (groups ∗ first):

      E
    / | \
   E  +  E
   |   / | \
   a  E  ∗  E
      |     |
      a     a
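Ambiguity can be checked mechanically for small inputs by counting derivation trees. The sketch below does this for the single-variable grammar E → E+E | E∗E | (E) | a only; it is not a general CFG algorithm. It writes the multiplication sign as the ASCII character `*`, and the function name `count_trees` is invented here.

```python
from functools import lru_cache


def count_trees(w: str) -> int:
    """Number of derivation trees of w in E -> E+E | E*E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # trees that derive the substring w[i:j] from E
        if i >= j:
            return 0
        total = 1 if w[i:j] == "a" else 0              # E -> a
        if w[i] == "(" and w[j - 1] == ")":            # E -> (E)
            total += count(i + 1, j - 1)
        for k in range(i + 1, j - 1):                  # E -> E+E and E -> E*E
            if w[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(w))
```

A count of 2 or more for some string witnesses that the grammar is ambiguous; a count of 1 for one particular string says nothing about the grammar as a whole.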


Note

• The grammar $G_5$ does not capture the usual precedence relations and so may group + before ∗ or vice versa

• In contrast, the grammar $G_4$ generates the same language, but every generated string has a unique derivation tree. $G_4$ = ({E, T, F}, {a, +, ∗, (, )}, R, E) in which R is:

E → E+T | T
T → T∗F | F
F → (E) | a

• Hence, $G_5$ is ambiguous and $G_4$ is not, i.e. $G_4$ is unambiguous


Note

• When a grammar generates a string ambiguously, it means that the string has at least two different derivation trees

• However, two different derivations may produce the same derivation tree because they may differ in the order in which they replace nonterminals, not in the rules they use

• To concentrate on the structure of the derivations we need to fix the order of rule application


Fixing rule application order

Definition (Leftmost derivation)

A derivation of a string w in a grammar G is a leftmost derivation if at every step the leftmost nonterminal is replaced.

Definition (Rightmost derivation)

A derivation of a string w in a grammar G is a rightmost derivation if at every step the rightmost nonterminal is replaced.


Inherent Ambiguity

• Some CFLs can have both ambiguous and unambiguous grammars.

• Some CFLs, however, can be generated only by an ambiguous grammar.

• A CFL that can be generated only by ambiguous grammars is called inherently ambiguous.

Example of inherently ambiguous language

$\{0^i 1^j 2^k \mid i = j \lor j = k\}$


Chomsky Normal Form

• It is often convenient to simplify CFGs so we can reason about them

• One of the simplest and most useful simplified forms of CFGs is called the Chomsky Normal Form

• Another normal form usually used in algebraic specifications is Greibach normal form


Definition

A context-free grammar G is in Chomsky normal form if every rule is of the form

A → BC
A → a

where a is a terminal, A, B, C are nonterminals, and B, C may not be the start variable

Note

The rule S → ε, where S is the start variable, is also permitted
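The definition can be turned into a direct check on a rule set. This sketch assumes rules are `(lhs, rhs)` pairs with `rhs` a tuple of symbols and the empty tuple standing for ε; the function name `is_cnf` and its parameter names are illustrative.

```python
def is_cnf(rules, variables, terminals, start):
    """Check every rule against the Chomsky normal form conditions."""
    for lhs, rhs in rules:
        if lhs not in variables:
            return False
        if len(rhs) == 0:
            ok = lhs == start                      # only S -> epsilon is permitted
        elif len(rhs) == 1:
            ok = rhs[0] in terminals               # A -> a
        elif len(rhs) == 2:
            # A -> BC with B, C variables other than the start variable
            ok = all(s in variables and s != start for s in rhs)
        else:
            ok = False                             # right-hand side too long
        if not ok:
            return False
    return True
```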


Chomsky Normal Form characterizes CFLs

Theorem

Any context-free language is generated by a context-free grammar in Chomsky normal form

Proof ideas

• Show that any grammar G can be converted into Chomsky normal form

• The conversion procedure has several stages where the rules that violate Chomsky normal form conditions are replaced with equivalent ones that satisfy these conditions

• Order of transformations:
  1. add a new start variable
  2. eliminate all ε-rules
  3. eliminate unit rules
  4. convert the remaining rules

• Check that the obtained grammar defines the same language as the initial one.


Conversion: 1 - introduce new start variable

Add a new start symbol $S_0$ and the rule $S_0 \to S$, where $S$ was the original start symbol

Note

This change guarantees that the start symbol does not occur on the rhs of any rule


Conversion: 2 - eliminate ε-rules

Repeat

1. Eliminate an ε-rule A → ε where A is not the start symbol

2. For each occurrence of A on the rhs of a rule, add a new rule with that occurrence of A deleted
   Example: To delete A → ε, replace B → uAv by B → uAv | uv; replace R → uAvAw by R → uAvAw | uvAw | uAvw | uvw

3. Replace the rule B → A (if it is present) by B → A | ε, unless the rule B → ε has been previously eliminated

until all such ε-rules are eliminated.
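The repeat loop above can be sketched as follows, assuming rules are a set of `(lhs, rhs)` pairs with `rhs` a tuple of symbols and `()` standing for ε. Step 3 is folded into step 2: deleting the only symbol of B → A yields B → ε, which is added unless it was eliminated earlier. The function name `eliminate_epsilon` and the use of `sorted` to pick the next ε-rule are choices of this sketch.

```python
from itertools import combinations


def eliminate_epsilon(rules, start):
    """Remove all rules A -> epsilon with A != start, following the loop above."""
    rules = set(rules)
    removed = set()                       # variables whose epsilon-rule is gone
    while True:
        eps = next((lhs for lhs, rhs in sorted(rules)
                    if rhs == () and lhs != start), None)
        if eps is None:
            return rules
        rules.discard((eps, ()))          # step 1
        removed.add(eps)
        for lhs, rhs in list(rules):      # step 2 (and step 3 when rhs == (eps,))
            positions = [i for i, s in enumerate(rhs) if s == eps]
            # delete every non-empty subset of the occurrences of eps
            for r in range(1, len(positions) + 1):
                for drop in combinations(positions, r):
                    new_rhs = tuple(s for i, s in enumerate(rhs) if i not in drop)
                    if new_rhs == () and lhs in removed:
                        continue          # lhs -> epsilon was already eliminated
                    rules.add((lhs, new_rhs))
```

The start variable may end up with an ε-rule of its own, which Chomsky normal form permits.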


Conversion: 3 - remove unit rules

Repeat

1. Remove a unit rule A → B

2. For each rule B → u that appears, add the rule A → u, unless it was a unit rule previously removed

until all unit rules are eliminated.

Note

u is a string of variables and terminals
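A possible rendering of this loop in code, under the same assumed encoding as before (a set of `(lhs, rhs)` pairs, `rhs` a tuple of symbols). The function name `eliminate_units` is hypothetical, and picking the next unit rule via `sorted` is just one deterministic choice.

```python
def eliminate_units(rules, variables):
    """Remove all unit rules A -> B (B a variable), following the loop above."""
    rules = set(rules)
    removed = set()                         # unit rules removed so far

    def is_unit(rule):
        _, rhs = rule
        return len(rhs) == 1 and rhs[0] in variables

    while True:
        unit = next((r for r in sorted(rules) if is_unit(r)), None)
        if unit is None:
            return rules
        rules.discard(unit)                 # step 1
        removed.add(unit)
        a, (b,) = unit
        for lhs, rhs in list(rules):        # step 2
            if lhs == b and (a, rhs) not in removed:
                rules.add((a, rhs))
```

Each unit rule is removed at most once and never re-added, so the loop terminates.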


Conversion: 4 - Convert all remaining rules

Repeat

1. Replace a rule $A \to u_1 u_2 \cdots u_k$, $k \ge 3$, where each $u_i$, $1 \le i \le k$, is a variable or terminal, by:

$A \to u_1 A_1,\; A_1 \to u_2 A_2,\; \ldots,\; A_{k-2} \to u_{k-1} u_k$

where $A_1, \ldots, A_{k-2}$ are new variables

2. If $k \ge 2$, replace any terminal $u_i$ with a new variable $U_i$ and add the rule $U_i \to u_i$

until no rules of the form $A \to u_1 u_2 \cdots u_k$ with $k \ge 3$ remain and no rule with $k \ge 2$ contains a terminal.
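A sketch of this last stage, again over `(lhs, rhs)` pairs. It departs from the wording above in two harmless ways that should be read as assumptions: terminals are replaced before long rules are split, and every long rule gets its own fresh variables (named `<lhs>_<n>`) instead of sharing them between rules. The names `binarize`, `U_<terminal>` and `<lhs>_<n>` are invented and are assumed not to clash with existing variables.

```python
def binarize(rules, terminals):
    """Split long right-hand sides and lift terminals out of rules of length >= 2."""
    out = set()
    counter = 0
    for lhs, rhs in sorted(rules):
        if len(rhs) >= 2:
            # step 2: replace each terminal t by a variable U_t with U_t -> t
            for t in rhs:
                if t in terminals:
                    out.add((f"U_{t}", (t,)))
            rhs = tuple(f"U_{s}" if s in terminals else s for s in rhs)
        while len(rhs) >= 3:
            # step 1: A -> u1 A1, then continue with A1 -> u2 ... uk
            counter += 1
            fresh = f"{lhs}_{counter}"
            out.add((lhs, (rhs[0], fresh)))
            lhs, rhs = fresh, rhs[1:]
        out.add((lhs, rhs))
    return out
```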


Example CFG conversion

Consider the grammar

$S \to ASA \mid aB$
$A \to B \mid S$
$B \to b \mid \varepsilon$

After the first step of transformation we get

$S_0 \to S$
$S \to ASA \mid aB$
$A \to B \mid S$
$B \to b \mid \varepsilon$


Removing ε rules

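Removing $B \to \varepsilon$:

$S_0 \to S$
$S \to ASA \mid aB \mid a$
$A \to B \mid S \mid \varepsilon$
$B \to b$

Removing $A \to \varepsilon$:

$S_0 \to S$
$S \to ASA \mid aB \mid a \mid SA \mid AS \mid S$
$A \to B \mid S$
$B \to b$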


Removing unit rule

Removing $S \to S$:

$S_0 \to S$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to B \mid S$
$B \to b$

Removing $S_0 \to S$:

$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to B \mid S$
$B \to b$


More unit rules

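Removing $A \to B$:

$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to S \mid b$
$B \to b$

Removing $A \to S$:

$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to b \mid ASA \mid aB \mid a \mid SA \mid AS$
$B \to b$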


Converting the remaining rules

$S_0 \to AA_1 \mid UB \mid a \mid SA \mid AS$
$S \to AA_1 \mid UB \mid a \mid SA \mid AS$
$A \to b \mid AA_1 \mid UB \mid a \mid SA \mid AS$
$A_1 \to SA$
$U \to a$
$B \to b$
