cs780(prasad)l13bupbasic1 bottom-up parsing basics
TRANSCRIPT
CS780(Prasad) L13BUPBasic 1
Bottom-Up Parsing Basics
CS780(Prasad) L13BUPBasic 2
Example: Rightmost Derivation
S -> ABc A -> Aa | aB -> bF | bF -> f F | f
Input String: aaabffc
S
A B c
A a b F
A a f F
a f
S => ABc => AbFc => AbfFc => Abffc => Aabffc =>* aaabffc
CS780(Prasad) L13BUPBasic 3
Example: Symbol Stack
Input String: aa | abffcSymbol Stack: A
S
A B c
A a b F
A a f F
a f
Input String: aaab | ffcSymbol Stack: Ab
Input String: aaabff | cSymbol Stack: AbF
Input String: aaabff | cSymbol Stack: AB
The symbol stack summarizes the consumed string.
CS780(Prasad) L13BUPBasic 4
Example: Item (NFA state)
Input String: aa | abffc
NFA State : A -> A.a S
A B c
A a b F
A a f F
a f
Input String: aaab | ffc
NFA State : B -> b.F
Input String: aaabff | c
NFA State : B -> bF. Input String: aaabff | c
NFA State : S -> AB.c
The NFA state shows the part of a rule reachable using the consumed string.
CS780(Prasad) L13BUPBasic 5
Example: State transitions
Input String: aaa | bffcaaab|ffc aaabf|fc
S
A B c
A a b F
A a f F
a f
S => A . B c
=> A b . F c
=> A b f . F c
S -> A.BcB -> .bFB -> .b …
B -> b.FB -> b.F -> .f FF -> .f …
b
State Transition
CS780(Prasad) L13BUPBasic 6
Symbol stack and State stack
S -> .ABc
A -> .Aa
A -> .a
Input String: $ aaa b f | fc
Symbol Stack: $ A b f
S -> A . Bc
A -> A.aB -> .bF
B -> .b
B -> b.FB -> b. F -> . f FF -> . f
F -> f.FF -> f.
A b f
State stack
reduce shift
CS780(Prasad) L13BUPBasic 7
Note 1 : Redundancy
• The state stack can be generated from the sequence of symbols on the symbol stack, by tracing the derivation from the start symbol, using the productions.– The state stack is used for efficiency.– It avoids running the entire DFA
(containing production fragments) to determine the next state each time.
CS780(Prasad) L13BUPBasic 8
Why trace the entire symbol stack?
S -> [ A ] | { B }A -> a A | aB -> a B | a
It is not possible to determine the “current” production (state)by merely inspecting a fixed number of stack symbols.
Input 1: $ [ a a a | a ]Symbol Stack: $ [ a a a State:
A -> a . AA -> .a
Input 2: $ { a a a | a }Symbol Stack: $ { a a a State:
B -> a . BB -> .a
CS780(Prasad) L13BUPBasic 9
DFA State : Input strings with eqvt. history
: RHS of rules sharing a prefix
Input String1: ab | ffcInput String2: ab | ge
Symbol Stack: $ A b
State
B -> b.FB -> b. F -> . f FF -> . fD -> b.g …
The DFA state captures all possible rule prefixes reachable from the start symbol using the consumed input.
S -> ABc | ADeA -> Aa | aB -> bF | bD -> bg | dBF -> f F | f
CS780(Prasad) L13BUPBasic 10
Stacking states is adequate
S => A B c => A b c S => A D e => A d B e => A d b e
The effect of reduction by B -> b depends on the previous state.
S -> A.Bc
B -> .b S -> AB.c
D -> d.BB -> .b
B -> b.
B
b
D -> dB.B
S -> A.De
D -> .dB
D -> .bg
d
b
S -> AD.eD
CS780(Prasad) L13BUPBasic 11
Note 2 : Redundancy
• The state stack contains enough information to regenerate the symbol stack. Note that all arcs into a DFA state carry the same symbol label.– As a consequence, the symbol stack can be
avoided by making the “reduce”-entry in the parse table specify the number of stack elements (length of RHS) to be popped and the (LHS) nonterminal to be pushed to make the transition.
– In practice, the (synthesized) semantic attribute associated with a stack symbol is allocated and computed on a parallel stack.
CS780(Prasad) L13BUPBasic 12
Relation to Grammars
• The DFA naturally corresponds to a left-linear grammar.
S -> A.Bc
B -> .b
B -> b. b
Grammar Rule: P Q bQ P
• The DFA state is obtained from the NFA states using the standard construction.