cs780(prasad)l13bupbasic1 bottom-up parsing basics

12
CS780(Prasad) L13BUPBasic 1 Bottom-Up Parsing Basics

Upload: rudolph-holmes

Post on 18-Jan-2016

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 1

Bottom-Up Parsing Basics

Page 2: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 2

Example: Rightmost Derivation

S -> ABc A -> Aa | aB -> bF | bF -> f F | f

Input String: aaabffc

S

A B c

A a b F

A a f F

a f

S => ABc => AbFc => AbfFc => Abffc => Aabffc =>* aaabffc

Page 3: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 3

Example: Symbol Stack

Input String: aa | abffcSymbol Stack: A

S

A B c

A a b F

A a f F

a f

Input String: aaab | ffcSymbol Stack: Ab

Input String: aaabff | cSymbol Stack: AbF

Input String: aaabff | cSymbol Stack: AB

The symbol stack summarizes the consumed string.

Page 4: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 4

Example: Item (NFA state)

Input String: aa | abffc

NFA State : A -> A.a S

A B c

A a b F

A a f F

a f

Input String: aaab | ffc

NFA State : B -> b.F

Input String: aaabff | c

NFA State : B -> bF. Input String: aaabff | c

NFA State : S -> AB.c

The NFA state shows the part of a rule reachable using the consumed string.

Page 5: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 5

Example: State transitions

Input String: aaa | bffcaaab|ffc aaabf|fc

S

A B c

A a b F

A a f F

a f

S => A . B c

=> A b . F c

=> A b f . F c

S -> A.BcB -> .bFB -> .b …

B -> b.FB -> b.F -> .f FF -> .f …

b

State Transition

Page 6: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 6

Symbol stack and State stack

S -> .ABc

A -> .Aa

A -> .a

Input String: $ aaa b f | fc

Symbol Stack: $ A b f

S -> A . Bc

A -> A.aB -> .bF

B -> .b

B -> b.FB -> b. F -> . f FF -> . f

F -> f.FF -> f.

A b f

State stack

reduce shift

Page 7: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 7

Note 1 : Redundancy

• The state stack can be generated from the sequence of symbols on the symbol stack, by tracing the derivation from the start symbol, using the productions.– The state stack is used for efficiency.– It avoids running the entire DFA

(containing production fragments) to determine the next state each time.

Page 8: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 8

Why trace the entire symbol stack?

S -> [ A ] | { B }A -> a A | aB -> a B | a

It is not possible to determine the “current” production (state)by merely inspecting a fixed number of stack symbols.

Input 1: $ [ a a a | a ]Symbol Stack: $ [ a a a State:

A -> a . AA -> .a

Input 2: $ { a a a | a }Symbol Stack: $ { a a a State:

B -> a . BB -> .a

Page 9: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 9

DFA State : Input strings with eqvt. history

: RHS of rules sharing a prefix

Input String1: ab | ffcInput String2: ab | ge

Symbol Stack: $ A b

State

B -> b.FB -> b. F -> . f FF -> . fD -> b.g …

The DFA state captures all possible rule prefixes reachable from the start symbol using the consumed input.

S -> ABc | ADeA -> Aa | aB -> bF | bD -> bg | dBF -> f F | f

Page 10: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 10

Stacking states is adequate

S => A B c => A b c S => A D e => A d B e => A d b e

The effect of reduction by B -> b depends on the previous state.

S -> A.Bc

B -> .b S -> AB.c

D -> d.BB -> .b

B -> b.

B

b

D -> dB.B

S -> A.De

D -> .dB

D -> .bg

d

b

S -> AD.eD

Page 11: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 11

Note 2 : Redundancy

• The state stack contains enough information to regenerate the symbol stack. Note that all arcs into a DFA state carry the same symbol label.– As a consequence, the symbol stack can be

avoided by making the “reduce”-entry in the parse table specify the number of stack elements (length of RHS) to be popped and the (LHS) nonterminal to be pushed to make the transition.

– In practice, the (synthesized) semantic attribute associated with a stack symbol is allocated and computed on a parallel stack.

Page 12: CS780(Prasad)L13BUPBasic1 Bottom-Up Parsing Basics

CS780(Prasad) L13BUPBasic 12

Relation to Grammars

• The DFA naturally corresponds to a left-linear grammar.

S -> A.Bc

B -> .b

B -> b. b

Grammar Rule: P Q bQ P

• The DFA state is obtained from the NFA states using the standard construction.