chapter 5 context-free grammars


Upload: suchi

Post on 22-Jan-2016


Page 1: Chapter 5 Context-Free Grammars

Chapter 5: Context-Free Grammars

Page 2: Chapter 5 Context-Free Grammars

Grammar (Review)

A grammar is a 4-tuple G = (NT, T, P, S), where
NT: a finite set of non-terminal symbols
T: a finite set of terminal (alphabet) symbols
S: the start symbol, S ∈ NT
P: a finite set of production rules x → y, with x ∈ NT and y ∈ (NT ∪ T)*

Derivation of w ∈ T*: S ⇒ w1 ⇒ ··· ⇒ wn ⇒ w (written as S ⇒* w)

The language generated by the grammar is L(G) = {w ∈ T* : S ⇒* w}

Page 3: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (1)

Context-free grammars (CFGs) are language-description mechanisms used to generate the strings of a language.

A grammar is said to be context-free if every rule has a single non-terminal on the left-hand side, A → x, where
A ∈ NT: a single non-terminal on the left-hand side
x ∈ (NT ∪ T)*: a string of terminals and non-terminals

A language L is called a context-free language (CFL) if and only if there is a context-free grammar (CFG) G that generates it (i.e., L = L(G)).

Page 4: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (3)

Example 5.1: G = ({S}, {a, b}, S, {S → aSa | bSb | λ})

For aabbaa, a typical derivation might be: S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa

The grammar generates the language L(G) = {ww^R : w ∈ {a, b}*}
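The claim about L(G) can be checked on short strings with a small enumeration. The following is only a sketch: the names RULES and generate are illustrative, the empty Python string stands for the empty-string alternative of the rule, and the pruning assumes a grammar like this one, where every rule application either adds terminals or removes a non-terminal.

```python
from collections import deque

# Example 5.1: S -> aSa | bSb | (empty string).
# Uppercase letters are non-terminals, lowercase letters are terminals.
RULES = {"S": ["aSa", "bSb", ""]}


def generate(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    results = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal, if there is one.
        pos = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if pos is None:
            results.add(form)  # no non-terminals left: a sentence
            continue
        for rhs in rules[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so they bound the final length.
            terminals = sum(1 for ch in new_form if ch.islower())
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


strings = generate(RULES, "S", 4)
# Every generated string should be an even-length palindrome, i.e. of the form w w^R.
print(sorted(strings, key=lambda s: (len(s), s)))
```

Every string produced this way reads the same forwards and backwards and has even length, which matches the description of the language on this slide.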

Page 5: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (5)

What is the language of this grammar?
G = ({S}, {a, b}, S, {S → aSb, S → ab})

L(G) = {a^n b^n | n ≥ 1}

What is the language of this grammar?
G = ({S}, {a, b}, S, {S → aSb, S → λ})

L(G) = {a^n b^n | n ≥ 0}

Page 6: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (6)

What is the language of this grammar?
G = ({S, A}, {a, b}, S, {S → aSa | aAa, A → bA | b})

L(G) = {a^n b^m a^n | n ≥ 1, m ≥ 1}

What is the language of these grammars?
G1 = ({S, A, B}, {a, b}, S, {S → AB, A → aA | a, B → bB | λ})

G2 = ({S, B}, {a, b}, S, {S → aS | aB, B → bB | λ})

L(G1) = L(G2) = a+b* = {a^n b^m | n ≥ 1, m ≥ 0}

Page 7: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (7)

What is the language of this grammar?
G = ({S, B}, {a, b, c}, S, {S → abScB | λ, B → bB | b})

L = {(ab)^n cb^(m1) cb^(m2) … cb^(mn) | n ≥ 0, each mi ≥ 1}
(each B is expanded independently, so the blocks of b's may have different lengths)

What is the grammar of this language?
L = {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}

G = ({S, A}, {a, b, c, d}, S, {S → aSd | aAd, A → bAc | bc})

What is the grammar of this language?
L = {a^n b^m c^m d^(2n) | n ≥ 0, m ≥ 1}

G = ({S, A}, {a, b, c, d}, S, {S → aSdd | A, A → bAc | bc})

Page 8: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (8)

What is the grammar of this language?
L = a*ba*ba*

G1 = ({S, A}, {a, b}, S, {S → AbAbA, A → aA | λ})

G2 = ({S, A, C}, {a, b}, S, {S → aS | bA, A → aA | bC, C → aC | λ})

Page 9: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (9)

What is the grammar of this language?
The language over {a, b} of strings with at least 2 b's

G1 = ({S, A}, {a, b}, S, {S → AbAbA, A → aA | bA | λ})

G2 = ({S, A, C}, {a, b}, S, {S → aS | bA, A → aA | bC, C → aC | bC | λ})

Page 10: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (10)

What is the grammar of this language?
The language over {a, b} of even-length strings

G = ({S, O}, {a, b}, S, {S → aO | bO | λ, O → aS | bS})

What is the grammar of this language?
The language over {a, b} of strings with an even number of b's

G = ({S, A}, {a, b}, S, {S → aS | bA | λ, A → aA | bS})

Page 11: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (11)

What is the grammar of this language?
The language over {a, b, c} of strings that do not contain the substring abc

G = ({S, A, B}, {a, b, c}, S, {S → bS | cS | aA | λ, A → aA | bB | cS | λ, B → aA | bS | λ})

Page 12: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (12)

Write a CFG for the following languages

L1 = {wcw^R | w ∈ {a, b}*}

S → aSa | bSb | c

L2 = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1}

S → XY

X → aXb | ab

Y → cYd | cd

Page 13: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (13)

The following languages are NOT CFLs
L1 = {wcw | w ∈ {a, b}*}
L2 = {a^n b^m c^n d^m | n ≥ 1, m ≥ 1}
L3 = {a^n b^n c^n | n ≥ 0}

G = ({S}, {a, b}, S, {S → aSb | SS | λ}): is this grammar context-free?

Yes, every rule has a single variable on the left-hand side.

Page 14: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (14)

Are regular languages context-free? Why? Yes (see the next slide). But not all context-free grammars are regular.

The class of regular languages is a proper subset of the class of context-free languages. Regular grammars are a proper subset of context-free grammars.

Non-regular languages can be generated by context-free grammars, so the term context-free generally covers both non-regular and regular languages.

Page 15: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (14)

Constructing a (right-linear) CFG from a DFA

1. Each state of the DFA will be represented by a nonterminal.

2. The initial state will correspond to the start nonterminal.

3. For each transition δ(qi, a) = qj, add a rule qi → a qj.

4. For each accepting state qf, add a rule qf → λ.
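The four construction steps can be sketched in a few lines of Python. This is an illustration only: the function names, the dictionary encoding of the transition function, and the example automaton (a partial DFA for a+b* with states q0, q1, q2) are assumptions made for the sketch, not part of the slides. A right-hand side is a pair (symbol, state), and the empty tuple stands for the empty-string rule.

```python
def dfa_to_grammar(transitions, start, accepting):
    """Build right-linear rules from a DFA.

    transitions maps (state, symbol) -> next state.
    Returns (start non-terminal, rules), where rules maps each state
    (steps 1 and 2: states double as non-terminals) to its right-hand sides.
    """
    rules = {}
    for (state, symbol), target in transitions.items():
        rules.setdefault(state, []).append((symbol, target))  # step 3: qi -> a qj
        rules.setdefault(target, [])
    for state in accepting:
        rules.setdefault(state, []).append(())  # step 4: qf -> empty string
    rules.setdefault(start, [])
    return start, rules


def derives(rules, start, word):
    """Check whether the grammar derives word by following rules symbol by symbol."""
    state = start
    for ch in word:
        targets = [rhs[1] for rhs in rules[state] if rhs and rhs[0] == ch]
        if not targets:
            return False  # no rule produces this symbol here
        state = targets[0]  # the DFA is deterministic, so there is one choice
    return () in rules[state]  # finish with the empty-string rule, if present


# Hypothetical example automaton for a+b* (missing transitions are rejecting).
TRANSITIONS = {("q0", "a"): "q1", ("q1", "a"): "q1",
               ("q1", "b"): "q2", ("q2", "b"): "q2"}
START, GRAMMAR = dfa_to_grammar(TRANSITIONS, "q0", {"q1", "q2"})
print(GRAMMAR)
print(derives(GRAMMAR, START, "aab"), derives(GRAMMAR, START, "ba"))
```

Because each rule mirrors exactly one transition, a derivation in the resulting grammar traces the same path through the states as the DFA does on the same input.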

Page 16: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (15)
Leftmost & Rightmost Derivations

Given the grammar S → aAB, A → bBb, B → A | λ

String abbbb can be derived in different ways:

Leftmost derivation: always replace the leftmost NT in each sentential form
S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB ⇒ abbbbB ⇒ abbbb

Rightmost derivation: always replace the rightmost NT in each sentential form
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb ⇒ abbBbb ⇒ abbbb
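Both derivations of abbbb can be replayed mechanically. The sketch below is illustrative: the helper names are made up for this example, uppercase letters are treated as non-terminals, and the empty Python string plays the role of the empty-string alternative of B.

```python
def leftmost_step(form, lhs, rhs):
    """Replace the leftmost non-terminal of form (which must be lhs) by rhs."""
    pos = min(i for i, ch in enumerate(form) if ch.isupper())
    if form[pos] != lhs:
        raise ValueError(f"leftmost non-terminal is {form[pos]}, not {lhs}")
    return form[:pos] + rhs + form[pos + 1:]


def rightmost_step(form, lhs, rhs):
    """Replace the rightmost non-terminal of form (which must be lhs) by rhs."""
    pos = max(i for i, ch in enumerate(form) if ch.isupper())
    if form[pos] != lhs:
        raise ValueError(f"rightmost non-terminal is {form[pos]}, not {lhs}")
    return form[:pos] + rhs + form[pos + 1:]


def derive(start, steps, step_fn):
    """Apply the rules in steps one after another and collect the sentential forms."""
    forms = [start]
    for lhs, rhs in steps:
        forms.append(step_fn(forms[-1], lhs, rhs))
    return forms


# Rules of the grammar S -> aAB, A -> bBb, B -> A | (empty string), in order of use.
LEFT = [("S", "aAB"), ("A", "bBb"), ("B", "A"), ("A", "bBb"), ("B", ""), ("B", "")]
RIGHT = [("S", "aAB"), ("B", ""), ("A", "bBb"), ("B", "A"), ("A", "bBb"), ("B", "")]

print(" => ".join(derive("S", LEFT, leftmost_step)))
print(" => ".join(derive("S", RIGHT, rightmost_step)))
```

The two runs apply the same multiset of rules in a different order and end in the same sentence, which is why they correspond to a single derivation tree.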

Page 17: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (16)

DEFINITION

⇒ : derives in one step

⇒+ : derives in one or more steps

⇒* : derives in zero or more steps

⇒G : indicates that the derivation utilizes the rules of grammar G

Page 18: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (17)

Derivation Order

Leftmost (⇒lm): replace the leftmost non-terminal symbol

Rightmost (⇒rm): replace the rightmost non-terminal symbol

Important notes, for a rule A → y:

If xAz ⇒lm xyz, what is true about x? It consists of terminals only.

If xAz ⇒rm xyz, what is true about z? It consists of terminals only.

Page 19: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (18)

Example
S → S + E | E
E → number | ( S )

Leftmost derivation
S ⇒ S+E ⇒ E+E ⇒ (S)+E ⇒ (S+E)+E ⇒ (S+E+E)+E ⇒ (E+E+E)+E ⇒ (1+E+E)+E ⇒ (1+2+E)+E ⇒ (1+2+(S))+E ⇒ (1+2+(S+E))+E ⇒ (1+2+(E+E))+E ⇒ (1+2+(3+E))+E ⇒ (1+2+(3+4))+E ⇒ (1+2+(3+4))+5

Rightmost derivation
S ⇒ S+E ⇒ S+5 ⇒ E+5 ⇒ (S)+5 ⇒ (S+E)+5 ⇒ (S+(S))+5 ⇒ (S+(S+E))+5 ⇒ (S+(S+4))+5 ⇒ (S+(E+4))+5 ⇒ (S+(3+4))+5 ⇒ (S+E+(3+4))+5 ⇒ (S+2+(3+4))+5 ⇒ (E+2+(3+4))+5 ⇒ (1+2+(3+4))+5

Page 20: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (19)
Derivation (Parsing) Trees

Given the grammar S → aaSB | λ, B → bB | b

A leftmost derivation: S ⇒ aaSB ⇒ aaB ⇒ aab

A rightmost derivation: S ⇒ aaSB ⇒ aaSb ⇒ aab

Both derivations correspond to the same parse (or derivation) tree: the root S has children a, a, S, B; the inner S has the single child λ, and B has the single child b.

Page 21: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (20)

In a parse tree, the nodes labeled with NTs correspond to the left side of a production rule and the children of that node correspond to the right side of the rule, e.g., S → aaSB.

The tree structure shows the rule that is applied to each NT (for a specific derivation), without showing the order of rule application.

Each internal node of the tree corresponds to an NT, and the leaves of the derivation tree represent the string of terminals.

Note that the tree applies to a specific derivation and may not include all rules, e.g., B → bB is not used in the tree for aab.

Page 22: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (21)

Definition 5.3: Let G = (NT, T, S, P) be a context-free grammar. An ordered tree is a derivation tree for G if and only if it has the following properties:

1. The root is labeled S.
2. Every leaf has a label from T ∪ {λ}.
3. Every non-leaf (interior) vertex has a label from NT.
4. If a vertex has label A ∈ NT, and its children are labeled (left to right) a1, a2, ..., an, then P must contain a production of the form A → a1a2…an.
5. A leaf labeled λ has no siblings; that is, a vertex with a child labeled λ can have no other children.

Partial derivation tree: a tree having properties 3–5, but for which property 1 may not hold, and for which property 2 is replaced with:
2a. Every leaf has a label from NT ∪ T ∪ {λ}.

Page 23: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (22)

Derivation ⇒ Parse Tree

Grammar:
S → S + E | E
E → number | ( S )

Derivation:
S ⇒ S+E ⇒ E+E ⇒ (S)+E ⇒ (S+E)+E ⇒ (S+E+E)+E ⇒ (E+E+E)+E ⇒ (1+E+E)+E ⇒ (1+2+E)+E ⇒ … ⇒ (1+2+(3+4))+E ⇒ (1+2+(3+4))+5

[Figure: parse tree for (1+2+(3+4))+5]

Page 24: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (23)

Grammar:
E → E op E | ( E ) | - E | id
op → + | - | * | / |

Derivation:
E ⇒ E op E ⇒ id op E ⇒ id * E ⇒ id * id

[Figure: the partial derivation trees after each step of the derivation]

Page 25: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (24)

A derivation tree for grammar G yields a sentence of the language L(G).

A partial derivation tree yields a sentential form.

The yield of a parse tree is the string of leaf symbols obtained by reading the tree left to right, that is, in the order the leaves are encountered when the tree is traversed in a depth-first manner, always taking the leftmost unexplored branch.
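The depth-first, leftmost-first reading of the leaves can be written as a short recursive function. This is a minimal sketch under assumed conventions: a tree node is a pair (label, list of children), a leaf has an empty child list, and the name LAMBDA marks the empty-string leaf. The two example trees are built for the grammar S → aAB, A → bBb with an empty-string alternative for B, which is used elsewhere in this chapter.

```python
LAMBDA = "λ"  # label used here for an empty-string leaf (an assumed convention)


def leaf(label):
    """Build a leaf node."""
    return (label, [])


def tree_yield(node):
    """Concatenate the leaf labels in depth-first, left-to-right order."""
    label, children = node
    if not children:
        return "" if label == LAMBDA else label
    return "".join(tree_yield(child) for child in children)


# Partial derivation tree: S -> aAB and A -> bBb have been applied.
partial = ("S", [leaf("a"),
                 ("A", [leaf("b"), leaf("B"), leaf("b")]),
                 leaf("B")])

# Complete derivation tree for abbbb.
inner_a = ("A", [leaf("b"), ("B", [leaf(LAMBDA)]), leaf("b")])
full = ("S", [leaf("a"),
              ("A", [leaf("b"), ("B", [inner_a]), leaf("b")]),
              ("B", [leaf(LAMBDA)])])

print(tree_yield(partial))  # a sentential form
print(tree_yield(full))     # a sentence
```

The partial tree still has non-terminal leaves, so its yield is a sentential form, while the complete tree yields a string of terminals only.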

Page 26: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (25)

[Figure: partial derivation tree]

Given the grammar S → aAB, A → bBb, B → A | λ, the yield of the partial derivation tree is the sentential form abBbB.

Page 27: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (26)

[Figure: derivation tree]

Given the grammar S → aAB, A → bBb, B → A | λ, the yield of the derivation tree is the sentence abbbb ∈ L(G).

Page 28: Chapter 5 Context-Free Grammars

5.1: Context-Free Grammars (27)

Theorem 5.1 (Relationship between Sentential Forms & Derivation Trees)

Let G = (NT, T, S, P) be a CFG. Then
for every w ∈ L(G), there exists a derivation tree of G whose yield is w;
the yield, w, of any derivation tree of G is such that w ∈ L(G);
if tG is any partial derivation tree for G whose root is labeled S, then the yield of tG is a sentential form of G.

As a side note, any w ∈ L(G) has a leftmost and a rightmost derivation.

Page 29: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (1)

In practical applications, it is usually not enough to decide whether a string belongs to a language.

It is also important to know how the string is derived from the grammar.

Parsing uncovers the syntactical structure of a string, which is represented by a parse tree.

The syntactical structure is important for assigning semantics to the string -- for example, if it is a computer program.

Page 30: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (2)

Application of parsing (compilers)

Let G be a context-free grammar for the C language. Let the string w be a C program.

One thing a compiler does - in particular, the part of the compiler called the “parser” - is determine whether w is a syntactically correct C program.

It also constructs a parse tree for the program that is used in code generation.

There are many sophisticated and efficient algorithms for parsing. You may study them in more advanced classes (for example, on compilers).

Page 31: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (3)

S-grammars

Definition 5.4: A context-free grammar G = (NT, T, S, P) is said to be a simple grammar or s-grammar if all of its productions are of the form A → ax, where A ∈ NT, a ∈ T, x ∈ NT*, and any pair (A, a) occurs at most once in P.

Examples:
S → aS | bSS | c is an s-grammar.
S → aS | bSS | aSB | c is not an s-grammar, because the pair (S, a) occurs in two productions: S → aS and S → aSB.
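Definition 5.4 translates directly into a check over the list of productions. The sketch below is only illustrative: the function name and the encoding (productions as pairs of strings, uppercase letters as non-terminals, lowercase letters as terminals, each production listed once) are assumptions of this example.

```python
def is_s_grammar(productions):
    """Return True if every production has the form A -> a x with x in NT*
    and no pair (A, a) is used by more than one production."""
    seen = set()
    for lhs, rhs in productions:
        # The right-hand side must start with exactly one terminal ...
        if not rhs or not rhs[0].islower():
            return False
        # ... followed only by non-terminals (possibly none).
        if any(not ch.isupper() for ch in rhs[1:]):
            return False
        # The pair (A, a) may occur at most once.
        if (lhs, rhs[0]) in seen:
            return False
        seen.add((lhs, rhs[0]))
    return True


G_SIMPLE = [("S", "aS"), ("S", "bSS"), ("S", "c")]
G_NOT_SIMPLE = [("S", "aS"), ("S", "bSS"), ("S", "aSB"), ("S", "c")]

print(is_s_grammar(G_SIMPLE))      # the first example grammar
print(is_s_grammar(G_NOT_SIMPLE))  # the second one repeats the pair (S, a)
```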

Page 32: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (4)

Example: given the s-grammar G with productions

S → aS | bSS | c

show the derivation of the string w = abcc.

Since G is an s-grammar,

the only way to get the a up front is via rule S → aS

the only way to get the b is via rule S → bSS

the only way to get each c is via rule S → c

Thus, we have parsed the string in 4 steps as S ⇒ aS ⇒ abSS ⇒ abcS ⇒ abcc
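Because each pair (non-terminal, terminal) selects at most one rule, an s-grammar can be parsed in a single left-to-right pass with one rule application per input symbol. The following sketch assumes a dictionary encoding of the rules and a helper name, s_parse, that are not from the slides; it returns the sentential forms of the leftmost derivation, or None if the string is not in the language.

```python
def s_parse(rules, start, word):
    """Parse word with an s-grammar.

    rules maps (non-terminal, terminal) -> string of non-terminals (the x in A -> a x).
    """
    stack = [start]  # pending non-terminals; the leftmost one is at the end
    forms = [start]
    for i, ch in enumerate(word):
        if not stack:
            return None  # input left over but nothing to expand
        top = stack.pop()
        if (top, ch) not in rules:
            return None  # no rule can produce this symbol here
        stack.extend(reversed(rules[(top, ch)]))
        # Sentential form = consumed terminals + pending non-terminals.
        forms.append(word[:i + 1] + "".join(reversed(stack)))
    return forms if not stack else None


# S -> aS | bSS | c
RULES = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}

print(s_parse(RULES, "S", "abcc"))
print(s_parse(RULES, "S", "abc"))
```

The number of steps equals the length of the input, which is the reason s-grammars are attractive for parsing even though they are quite restrictive.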

Page 33: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (5)

Find an s-grammar for aaa*b | b

We need to start with an a or have only the b.

S → aA | b

A → aB

B → aB | b

Page 34: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (6)

A terminal string may be generated by a number of different derivations.

Let G be a CFG. A string w is in L(G) iff there is a leftmost derivation of w from S (S ⇒*lm w).

Question: Is there a unique leftmost derivation of every sentence (string) in the language of a grammar? NO

Page 35: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (7)

Consider the expression grammar:

E → E+E | E*E | (E) | -E | id

Two different leftmost derivations of id + id * id (first derivation):

E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id

[Figure: the corresponding parse tree, with + at the root]

Page 36: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (8)

Consider the expression grammar:

E → E+E | E*E | (E) | -E | id

Two different leftmost derivations of id + id * id (second derivation):

E ⇒ E * E ⇒ E + E * E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id

[Figure: the corresponding parse tree, with * at the root]

Page 37: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (9)
Ambiguity

Definition 5.5: A grammar G is ambiguous if there is a string with at least two parse trees or, equivalently, with two or more leftmost (or rightmost) derivations.

Example: the CFG G with productions S → aSb | SS | λ is ambiguous because there are two parse trees for w = aabb.

Page 38: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (10)

A CFG is ambiguous if there is a string w ∈ L(G) that can be derived by two distinct leftmost derivations.

A grammar G is ambiguous if there exists a sentence in L(G) with more than one derivation (parsing) tree.

A grammar that is not ambiguous is called unambiguous.

If G is ambiguous, then L(G) is not necessarily inherently ambiguous.

A language L is inherently ambiguous if there is no unambiguous grammar that generates it.

Page 39: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (11)

Let G be S → aS | Sa | a

G is ambiguous since the string aa has 2 distinct leftmost derivations:
S ⇒ aS ⇒ aa
S ⇒ Sa ⇒ aa

L(G) = a+

This language is also generated by the unambiguous grammar S → aS | a

L(G) is not inherently ambiguous. Why?
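For small grammars, ambiguity on a given string can be confirmed by counting its leftmost derivations. The sketch below is an illustration with assumed names and encoding (uppercase letters are non-terminals). It also assumes the grammar has no empty-string rules and no unit rules such as A → B; under that assumption sentential forms never shrink, so pruning by length guarantees that the search terminates.

```python
def count_leftmost(rules, form, word):
    """Count the leftmost derivations of word starting from the sentential form."""
    if len(form) > len(word):
        return 0  # forms never shrink, so this branch cannot reach word
    pos = next((i for i, ch in enumerate(form) if ch.isupper()), None)
    if pos is None:
        return 1 if form == word else 0  # all terminals: compare directly
    if form[:pos] != word[:pos]:
        return 0  # the terminal prefix already disagrees with word
    return sum(
        count_leftmost(rules, form[:pos] + rhs + form[pos + 1:], word)
        for rhs in rules[form[pos]]
    )


AMBIGUOUS = {"S": ["aS", "Sa", "a"]}  # S -> aS | Sa | a
UNAMBIGUOUS = {"S": ["aS", "a"]}      # S -> aS | a

print(count_leftmost(AMBIGUOUS, "S", "aa"))
print(count_leftmost(UNAMBIGUOUS, "S", "aa"))
```

A count above one for some string shows that the grammar is ambiguous; a count of one for the strings tried is consistent with, but does not prove, unambiguity.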

Page 40: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (12)

Let G be S → bS | Sb | a

L(G) = b*ab*

G is ambiguous since the string bab has 2 distinct leftmost derivations:
S ⇒ bS ⇒ bSb ⇒ bab
S ⇒ Sb ⇒ bSb ⇒ bab

The ability to generate the b's in either order must be eliminated to obtain an unambiguous grammar.

This language is also generated by the unambiguous grammars

G1: S → bS | aA
    A → bA | λ

G2: S → bS | A
    A → Ab | a

Page 41: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (13)

Eliminating Ambiguity

Consider the following grammar: E → E + E | E * E | num

Consider the sentence 1 + 2 * 3

Leftmost Derivation 1:
E ⇒ E + E ⇒ 1 + E ⇒ 1 + E * E ⇒ 1 + 2 * E ⇒ 1 + 2 * 3

Leftmost Derivation 2:
E ⇒ E * E ⇒ E + E * E ⇒ 1 + E * E ⇒ 1 + 2 * E ⇒ 1 + 2 * 3

[Figure: the two parse trees; derivation 1 has + at the root, derivation 2 has * at the root]

Page 42: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (14)

Different parse trees correspond to different evaluations (meaning)

LMD-1: 1 + (2 * 3) = 7

LMD-2: (1 + 2) * 3 = 9
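The effect of the tree shape on the value can be reproduced by evaluating the two trees directly. In this sketch a tree is either an integer leaf or a triple (left, operator, right); this encoding and the names are assumptions chosen for the example.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate a parse tree bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


# The two parse trees of 1 + 2 * 3 under the ambiguous grammar.
lmd1 = (1, "+", (2, "*", 3))  # + at the root
lmd2 = ((1, "+", 2), "*", 3)  # * at the root

print(evaluate(lmd1), evaluate(lmd2))
```

The same token sequence gets two different values, which is why a grammar used for defining semantics should be unambiguous.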

Page 43: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (15)

Ambiguous: E → E + E | E * E | number

Both + and * have the same precedence.

To remove the ambiguity, you have to give + and * different precedence.

Let us say * has higher precedence than +:
E → E + T | T
T → T * num | num

Page 44: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (16)

Eliminating Ambiguity
E → E + T | T
T → T * num | num

Consider the sentence 1 + 2 * 3

Leftmost derivation (only one LMD):
E ⇒ E + T ⇒ T + T ⇒ num + T ⇒ 1 + T ⇒ 1 + T * num ⇒ 1 + num * num ⇒ 1 + 2 * num ⇒ 1 + 2 * 3

[Figure: the unique parse tree, with + at the root and 2 * 3 grouped under T]

1 + (2 * 3) = 7
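A small evaluator shows how the layered grammar E → E + T | T, T → T * num | num fixes the grouping. This is a sketch rather than a full parser: the function names are made up, error handling is minimal, and the left-recursive rules are implemented as loops (T as num followed by any number of "* num", E as T followed by any number of "+ T"), which accepts the same strings and keeps the left-to-right grouping.

```python
import re


def tokenize(text):
    """Split the input into number tokens and the operators + and *."""
    return re.findall(r"\d+|[+*]", text)


def parse_t(tokens, pos):
    """T -> T * num | num. Returns (value, next position)."""
    value = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] == "*":
        value *= int(tokens[pos + 1])
        pos += 2
    return value, pos


def parse_e(tokens, pos=0):
    """E -> E + T | T. Returns (value, next position)."""
    value, pos = parse_t(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_t(tokens, pos + 1)
        value += right
    return value, pos


def evaluate_expression(text):
    """Evaluate a sum of products of numbers."""
    tokens = tokenize(text)
    value, pos = parse_e(tokens)
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return value


print(evaluate_expression("1 + 2 * 3"))
```

Because a T can never contain a +, every product is completed before it takes part in a sum, so * binds more tightly than + without any extra precedence table.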

Page 45: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (17)

Let us say + has higher precedence than *:
E → E * T | T
T → T + num | num

Consider the sentence 1 + 2 * 3

Leftmost derivation (only one LMD):
E ⇒ E * T ⇒ T * T ⇒ T + num * T ⇒ num + num * T ⇒ 1 + num * T ⇒ 1 + 2 * T ⇒ 1 + 2 * num ⇒ 1 + 2 * 3

[Figure: the unique parse tree, with * at the root and 1 + 2 grouped under T]

(1 + 2) * 3 = 9

Page 46: Chapter 5 Context-Free Grammars

5.2: Parsing and Ambiguity (18)

A derivation tree
corresponds to exactly one leftmost derivation;
corresponds to exactly one rightmost derivation.

A CFG is ambiguous ⇔ any of the following equivalent statements holds:
∃ a string w with multiple derivation trees.
∃ a string w with multiple leftmost derivations.
∃ a string w with multiple rightmost derivations.

Note: this defines ambiguity of a grammar, not of a language.

Page 47: Chapter 5 Context-Free Grammars

5.3: CFGs & Programming Languages (1)

The syntax of programming languages is context-free, but not regular.

Programming languages have the following features, which require unbounded "stack memory":
matching parentheses in algebraic expressions
nested if .. then .. else statements
nested loops
block structure
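The need for stack memory can be seen in a bracket matcher: each opening bracket is pushed and must be popped by the matching closing bracket, and the nesting depth has no fixed bound. The sketch below is illustrative; the function name and the choice of three bracket kinds are assumptions of this example.

```python
def balanced(text):
    """Return True if all brackets in text are properly nested and matched."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)  # remember the opening bracket
        elif ch in pairs:
            # A closing bracket must match the most recent opening one.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # nothing may remain open


print(balanced("(1+2+(3+4))+5"))
print(balanced("((1+2)"))
```

A finite automaton has only a fixed number of states and therefore cannot track arbitrarily deep nesting, which is the informal reason such constructs are context-free but not regular.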

Page 48: Chapter 5 Context-Free Grammars

5.3: CFGs & Programming Languages (2)

<unsigned constant> → <unsigned number>
<constant> → <unsigned number> | <sign> <unsigned number>
<unsigned number> → <unsigned integer> | <unsigned real>
<unsigned integer> → <digit> <unsigned integer> | <digit>
<unsigned real> → <unsigned integer> . <unsigned integer> |
                  <unsigned integer> . <unsigned integer> E <exp> |
                  <unsigned integer> E <exp>
<exp> → <unsigned integer> | <sign> <unsigned integer>
<digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<letter> → a | b | c | … | y | z | A | B | C | … | Y | Z
<sign> → + | -
<identifier> → <letter> <identifier tail>
<identifier tail> → <letter> <identifier tail> | <digit> <identifier tail> | λ

Page 49: Chapter 5 Context-Free Grammars

5.3: CFGs & Programming Languages (3)

<expression> → <simple expression>
<simple expression> → <term> | <sign> <term> |
                      <simple expression> <adding operator> <term>
<term> → <factor> | <term> <multiplying operator> <factor>
<factor> → <variable> | <unsigned constant> | ( <expression> )
<adding operator> → + | -
<multiplying operator> → * | / | div | mod