Costas Buch - RPI 1 Simplifications of Context-Free Grammars

Post on 20-Dec-2015


Costas Buch - RPI 1

Simplifications of Context-Free Grammars

A Substitution Rule

S → aB
A → aaA
A → abBc
B → aA
B → b

Substitute B → b

Equivalent grammar:

S → aB | ab
A → aaA
A → abBc | abbc
B → aA

A Substitution Rule

S → aB | ab
A → aaA
A → abBc | abbc
B → aA

Substitute B → aA

Equivalent grammar:

S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc

In general:

A → xBz
B → y_1

Substitute B → y_1

Equivalent grammar:

A → xBz | x y_1 z
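
The substitution rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a grammar is a dict from a variable to a list of right-hand sides, every symbol is a single character, and the substituted right-hand side does not itself contain the variable. The names `expand` and `substitute` are made up for this sketch. Each occurrence of the variable is either kept or replaced, so right-hand sides with several occurrences are also covered.

```python
from itertools import product


def expand(body, var, rhs):
    """Return every string obtained by replacing any subset of the
    occurrences of var in body with rhs (the unchanged body comes first)."""
    options = [(sym, rhs) if sym == var else (sym,) for sym in body]
    return ["".join(choice) for choice in product(*options)]


def substitute(grammar, var, rhs):
    """Drop the production var -> rhs and add the substituted alternatives.

    Assumption: rhs does not contain var itself.
    """
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            if head == var and body == rhs:
                continue  # the substituted production is removed
            for candidate in expand(body, var, rhs):
                if candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


# The grammar from the substitution example.
grammar = {"S": ["aB"], "A": ["aaA", "abBc"], "B": ["aA", "b"]}
step1 = substitute(grammar, "B", "b")
step2 = substitute(step1, "B", "aA")
print(step1)
print(step2)
```

After the second step B has no productions left, so the alternatives that still mention B can no longer produce a terminal string.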

Nullable Variables

λ-production: A → λ

Nullable variable: A ⇒ … ⇒ λ

Removing Nullable Variables

Example grammar:

S → aMb
M → aMb
M → λ

Nullable variable: M

S → aMb
M → aMb
M → λ

Substitute M → λ

Final grammar:

S → aMb
S → ab
M → aMb
M → ab
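
A minimal sketch of the same removal in Python, assuming single-character symbols and the empty string "" standing for λ (both are conventions of this sketch, as are the function names). It first collects the nullable variables, then adds every variant of each right-hand side with nullable variables left out, and finally drops the λ-productions. If the start variable is itself nullable, the resulting grammar no longer generates λ.

```python
from itertools import product


def nullable_variables(grammar):
    """Variables that can derive the empty string ("" stands for λ)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body is nullable when all of its symbols are nullable;
            # this is trivially true for the empty body.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_nullable(grammar):
    """Return a grammar without λ-productions (λ itself is no longer derivable)."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            # Each nullable symbol is either kept or dropped.
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


# Example grammar: S -> aMb, M -> aMb | λ
grammar = {"S": ["aMb"], "M": ["aMb", ""]}
print(nullable_variables(grammar))
print(remove_nullable(grammar))
```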

Unit-Productions

Unit production: A → B

(a single variable on both sides)

Removing Unit Productions

Observation:

A → A is removed immediately

Example grammar:

S → aA
A → a
A → B
B → A
B → bb

S → aA
A → a
A → B
B → A
B → bb

Substitute A → B

S → aA | aB
A → a
B → A | B
B → bb

S → aA | aB
A → a
B → A | B
B → bb

Remove B → B

S → aA | aB
A → a
B → A
B → bb

S → aA | aB
A → a
B → A
B → bb

Substitute B → A

S → aA | aB | aA
A → a
B → bb

Remove repeated productions

S → aA | aB | aA
A → a
B → bb

Final grammar:

S → aA | aB
A → a
B → bb
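
For comparison, here is a sketch of a different, commonly used way to remove unit productions. It is not the substitution sequence shown in the slides. For each variable it collects the variables reachable through unit productions alone and copies their non-unit right-hand sides. It assumes single-character symbols with uppercase variables, and the function name is made up for this sketch. On the example grammar it returns S → aA, A → a | bb, B → a | bb, which differs from the final grammar above but generates the same language {aa, abb}.

```python
def remove_unit_productions(grammar):
    """Closure-based removal of unit productions (uppercase = variable)."""

    def is_unit(body):
        return len(body) == 1 and body.isupper()

    result = {}
    for head in grammar:
        # Variables reachable from head using unit productions only.
        closure, stack = {head}, [head]
        while stack:
            current = stack.pop()
            for body in grammar.get(current, []):
                if is_unit(body) and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # head inherits every non-unit right-hand side found in its closure.
        bodies = []
        for var in sorted(closure):
            for body in grammar.get(var, []):
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result


# Example grammar: S -> aA, A -> a | B, B -> A | bb
grammar = {"S": ["aA"], "A": ["a", "B"], "B": ["A", "bb"]}
print(remove_unit_productions(grammar))
```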

Useless Productions

S → aSb
S → λ
S → A
A → aA     (useless production)

Some derivations never terminate...

S ⇒ A ⇒ aA ⇒ aaA ⇒ … ⇒ aa…aA ⇒ …

Another grammar:

S → A
A → aA
A → λ
B → bA     (useless production: not reachable from S)

In general:

if S ⇒ … ⇒ xAy ⇒ … ⇒ w, where w ∈ L(G) (w contains only terminals),
then variable A is useful;
otherwise, variable A is useless

A production A → x is useless if any of its variables is useless

S → aSb
S → λ
S → A      (production useless)
A → aA     (production useless; variable A useless)
B → C      (production useless; variable B useless)
C → D      (production useless; variable C useless)

Removing Useless Productions

Example grammar:

S → aS | A | C
A → a
B → aa
C → aCb

First: find all variables that can produce strings with only terminals

S → aS | A | C
A → a
B → aa
C → aCb

Round 1: {A, B}
Round 2: {A, B, S}     (because S → A)

Keep only the variables that produce terminal symbols: {A, B, S}
(the remaining variables are useless)

S → aS | A | C
A → a
B → aa
C → aCb

Remove useless productions:

S → aS | A
A → a
B → aa

Second: find all variables reachable from S

S → aS | A
A → a
B → aa

Use a dependency graph: there is an edge from S to A; B is not reachable

Keep only the variables reachable from S
(the remaining variables are useless)

S → aS | A
A → a
B → aa

Remove useless productions

Final grammar:

S → aS | A
A → a
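
The two passes can be sketched as follows, again assuming single-character symbols with uppercase variables and lowercase terminals. The name `remove_useless` and the `start` parameter are illustrative. The first loop updates the set in place, so it may need fewer passes than the rounds listed above, but it reaches the same set.

```python
def remove_useless(grammar, start="S"):
    """Drop variables that cannot produce a terminal string or are unreachable."""

    def only_known(body, known):
        # True when every variable in the body already belongs to `known`.
        return all(not sym.isupper() or sym in known for sym in body)

    # First: variables that can produce strings with only terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(only_known(b, generating) for b in bodies):
                generating.add(head)
                changed = True

    pruned = {
        head: [b for b in bodies if only_known(b, generating)]
        for head, bodies in grammar.items()
        if head in generating
    }

    # Second: variables reachable from the start variable (dependency graph walk).
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in pruned:
            continue
        reachable.add(var)
        for body in pruned[var]:
            stack.extend(sym for sym in body if sym.isupper())

    return {head: bodies for head, bodies in pruned.items() if head in reachable}


# Example grammar: S -> aS | A | C, A -> a, B -> aa, C -> aCb
grammar = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
print(remove_useless(grammar))
```

The order of the two passes matters: removing non-generating variables first can make further variables unreachable, which the second pass then removes.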

Removing All

Step 1: Remove Nullable Variables

Step 2: Remove Unit-Productions

Step 3: Remove Useless Variables

Normal Forms for Context-free Grammars

Chomsky Normal Form

Each production has the form:

A → BC     (B and C are variables)

or

A → a      (a is a terminal)

Examples:

Chomsky Normal Form:

S → AS
S → a
A → SA
A → b

Not Chomsky Normal Form:

S → AS
S → AAS
A → SA
A → aa
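
A small checker makes the two allowed shapes concrete. It is a sketch that assumes single-character symbols, uppercase for variables and lowercase for terminals. The name `is_cnf` is made up here.

```python
def is_cnf(grammar):
    """True when every production is A -> BC (two variables) or A -> a (one terminal)."""
    for bodies in grammar.values():
        for body in bodies:
            two_variables = len(body) == 2 and body.isupper()
            one_terminal = len(body) == 1 and body.islower()
            if not (two_variables or one_terminal):
                return False
    return True


in_cnf = {"S": ["AS", "a"], "A": ["SA", "b"]}
not_in_cnf = {"S": ["AS", "AAS"], "A": ["SA", "aa"]}
print(is_cnf(in_cnf), is_cnf(not_in_cnf))
```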

Conversion to Chomsky Normal Form

Example:

S → ABa
A → aab
B → Ac

Not Chomsky Normal Form

S → ABa
A → aab
B → Ac

Introduce variables for terminals: T_a, T_b, T_c

S → A B T_a
A → T_a T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

Introduce intermediate variable: V_1

S → A B T_a
A → T_a T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

becomes

S → A V_1
V_1 → B T_a
A → T_a T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

Introduce intermediate variable: V_2

S → A V_1
V_1 → B T_a
A → T_a T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

becomes

S → A V_1
V_1 → B T_a
A → T_a V_2
V_2 → T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

Final grammar in Chomsky Normal Form:

S → A V_1
V_1 → B T_a
A → T_a V_2
V_2 → T_a T_b
B → A T_c
T_a → a
T_b → b
T_c → c

Initial grammar:

S → ABa
A → aab
B → Ac

In general:

From any context-free grammar (which doesn’t produce λ) not in Chomsky Normal Form

we can obtain: an equivalent grammar in Chomsky Normal Form

The Procedure

First remove:

Nullable variables

Unit productions

Then, for every terminal symbol a:

Add production T_a → a

In productions: replace a with T_a

New variable: T_a

Replace any production A → C_1 C_2 … C_n

with

A → C_1 V_1
V_1 → C_2 V_2
…
V_{n-2} → C_{n-1} C_n

New intermediate variables: V_1, V_2, …, V_{n-2}
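
The last two steps of the procedure can be sketched as below. The sketch assumes that nullable variables and unit productions have already been removed, that right-hand sides are tuples of symbol names, and that terminals are exactly the all-lowercase names. The helper names `T_x` and `V_i` follow the slides' notation and are assumed not to clash with variables already in the grammar. `to_cnf` is an illustrative name.

```python
def to_cnf(grammar):
    """Convert a grammar without λ- and unit productions to Chomsky Normal Form.

    grammar maps a variable name to a list of tuples of symbol names.
    """
    result = {}
    terminal_vars = {}  # terminal -> name of its T_x variable
    counter = 0         # numbering for the V_i intermediate variables

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                # A -> a is already in the required form.
                result.setdefault(head, []).append(tuple(body))
                continue
            # Replace every terminal a in a longer body with T_a.
            symbols = []
            for sym in body:
                if sym.islower():
                    terminal_vars[sym] = "T_" + sym
                    symbols.append("T_" + sym)
                else:
                    symbols.append(sym)
            # Break A -> C_1 C_2 ... C_n into a chain of two-symbol productions.
            current = head
            while len(symbols) > 2:
                counter += 1
                fresh = "V_%d" % counter
                result.setdefault(current, []).append((symbols[0], fresh))
                current = fresh
                symbols = symbols[1:]
            result.setdefault(current, []).append(tuple(symbols))

    for terminal, name in terminal_vars.items():
        result[name] = [(terminal,)]
    return result


# Example grammar: S -> ABa, A -> aab, B -> Ac
grammar = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
for head, bodies in to_cnf(grammar).items():
    for body in bodies:
        print(head, "->", " ".join(body))
```

On the example grammar this yields the same eight productions as the final grammar of the worked example.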

Theorem: For any context-free grammar (which doesn’t produce λ) there is an equivalent grammar in Chomsky Normal Form

Observations

• Chomsky normal forms are good for parsing and proving theorems

• It is very easy to find the Chomsky normal form for any context-free grammar

Greibach Normal Form

All productions have the form:

A → a V_1 V_2 … V_k,   k ≥ 0

where a is a terminal symbol and V_1, …, V_k are variables

Examples:

Greibach Normal Form:

S → cAB
A → aA | bB | b
B → b

Not Greibach Normal Form:

S → abSb
S → aa

Conversion to Greibach Normal Form:

S → abSb
S → aa

becomes

S → a T_b S T_b
S → a T_a
T_a → a
T_b → b

Greibach Normal Form
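
The conversion above is the easy case: every right-hand side already starts with a terminal, so only the later terminals need replacing. The sketch below handles just that case and raises an error otherwise. It is not the general conversion, which must also deal with right-hand sides that start with a variable. Right-hand sides are tuples of symbol names, terminals are the all-lowercase names, and the function name is made up for this sketch.

```python
def replace_trailing_terminals(grammar):
    """Easy case only: every body must already start with a terminal."""
    result, helpers = {}, {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            if not body or not body[0].islower():
                raise ValueError("body does not start with a terminal: %r" % (body,))
            tail = []
            for sym in body[1:]:
                if sym.islower():
                    # Later terminals are replaced by a helper variable T_x -> x.
                    helpers["T_" + sym] = [(sym,)]
                    tail.append("T_" + sym)
                else:
                    tail.append(sym)
            new_bodies.append((body[0],) + tuple(tail))
        result[head] = new_bodies
    result.update(helpers)
    return result


# Example grammar: S -> abSb | aa
grammar = {"S": [("a", "b", "S", "b"), ("a", "a")]}
for head, bodies in replace_trailing_terminals(grammar).items():
    for body in bodies:
        print(head, "->", " ".join(body))
```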

Theorem: For any context-free grammar (which doesn’t produce λ) there is an equivalent grammar in Greibach Normal Form

Observations

• Greibach normal forms are very good for parsing

• It is hard to find the Greibach normal form of any context-free grammar

The CYK Parser

The CYK Membership Algorithm

Input:

• Grammar G in Chomsky Normal Form

• String w

Output:

find if w ∈ L(G)

The Algorithm

Input example:

• Grammar G:

S → AB
A → BB
A → a
B → AB
B → b

• String w: aabbb

Substrings of aabbb:

a        a        b        b        b
aa       ab       bb       bb
aab      abb      bbb
aabb     abbb
aabbb

S → AB
A → BB
A → a
B → AB
B → b

a: A     a: A     b: B     b: B     b: B
aa       ab       bb       bb
aab      abb      bbb
aabb     abbb
aabbb

S → AB
A → BB
A → a
B → AB
B → b

a: A          a: A        b: B      b: B      b: B
aa: (none)    ab: S,B     bb: A     bb: A
aab           abb         bbb
aabb          abbb
aabbb

S → AB
A → BB
A → a
B → AB
B → b

a: A          a: A          b: B          b: B      b: B
aa: (none)    ab: S,B       bb: A         bb: A
aab: S,B      abb: A        bbb: S,B
aabb: A       abbb: S,B
aabbb: S,B

Therefore: aabbb ∈ L(G)

Time complexity: |w|^3

Observation: The CYK algorithm can be easily converted to a parser (bottom-up parser)
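
A compact sketch of the membership test, assuming a grammar in Chomsky Normal Form with single-character symbols. The name `cyk`, the `start` parameter and the table layout are choices made for this sketch. The cell `table[(i, length)]` holds the variables that derive the substring of the given length starting at position i, so the table is filled in the same order as the triangle of substrings above: length 1 first, then length 2, and so on.

```python
def cyk(grammar, word, start="S"):
    """Return (accepted, table) for a grammar in Chomsky Normal Form."""
    n = len(word)
    if n == 0:
        return False, {}  # a grammar in Chomsky Normal Form does not produce λ

    table = {}
    # Substrings of length 1: variables with a production A -> a.
    for i, ch in enumerate(word):
        table[(i, 1)] = {head for head, bodies in grammar.items() if ch in bodies}

    # Longer substrings: try every split into a left and a right part.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = set()
            for split in range(1, length):
                left = table[(i, split)]
                right = table[(i + split, length - split)]
                for head, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            cell.add(head)
            table[(i, length)] = cell

    return start in table[(0, n)], table


# Grammar G: S -> AB, A -> BB | a, B -> AB | b
grammar = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
accepted, table = cyk(grammar, "aabbb")
print(accepted)
print(sorted(table[(0, 5)]))
```

The three nested loops over lengths, start positions and split points account for the |w|^3 factor in the running time. The inner loop over productions adds a factor that depends only on the size of the grammar.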