Discussion #5 1/18
Discussion #5
LL(1) Grammars & Table-Driven Parsing
Topics
• Approaches to Parsing
– Full backtracking
– Deterministic
• Simple LL(1), table-driven parsing
• Improvements to simple LL(1) grammars
Prefix Expression Grammar
• Consider the following grammar (which yields prefix expressions for binary operators):
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
• Here, prefix expressions associate an operator with the next two operands.
* + 2 3 4 is grouped as (* (+ 2 3) 4), that is, (2 + 3) * 4 = 20
* 2 + 3 4 is grouped as (* 2 (+ 3 4)), that is, 2 * (3 + 4) = 14
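The grouping rule can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the name `eval_prefix` and the use of integer division for `/` are assumptions made for illustration.

```python
def eval_prefix(tokens):
    """Evaluate a prefix expression given as a list of one-character tokens."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a // b,  # assumption: integer division
    }
    digits = {"0", "1", "2", "3", "4"}

    def expr(pos):
        # Returns (value, next position); mirrors E -> N | O E E.
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        if tok in ops:                       # E -> O E E
            left, pos = expr(pos + 1)
            right, pos = expr(pos)
            return ops[tok](left, right), pos
        if tok in digits:                    # E -> N
            return int(tok), pos + 1
        raise ValueError(f"unexpected token {tok!r}")

    value, pos = expr(0)
    if pos != len(tokens):
        raise ValueError("trailing input")
    return value


print(eval_prefix("* + 2 3 4".split()))  # 20
print(eval_prefix("* 2 + 3 4".split()))  # 14
```

Each operator consumes exactly the next two complete operands, which is why no parentheses are needed in prefix form.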
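Top-Down Parsing with Backtracking
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
Input: *+342
[Figure: search tree for this input. Starting from E, the parser tries E → N with N → 0, 1, 2, … and then E → OEE with O → +, …, *, backing up after each mismatch.]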
What are the obvious problems?
• We never know what production to try.
• It appears to be terribly inefficient, and it is.
• Are there grammars for which we can always know what rule to choose? Yes!
• Characteristics:
– Only single-symbol look-ahead
– Given a non-terminal and a current symbol, we always know which production rule to apply
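To make the inefficiency concrete, here is a small Python sketch of a full-backtracking recognizer for the prefix grammar. It is an illustration only: the names `derive` and `recognize`, and the counter of attempted productions, are assumptions, not something taken from the slides.

```python
# Prefix-expression grammar; each right-hand side is a string of
# single-character symbols.
GRAMMAR = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}


def derive(symbols, text, pos, counter):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: blindly try every alternative in order.
        for alternative in GRAMMAR[head]:
            counter[0] += 1
            for mid in derive(alternative, text, pos, counter):
                yield from derive(rest, text, mid, counter)
    elif pos < len(text) and text[pos] == head:
        # Terminal: must match the current input character.
        yield from derive(rest, text, pos + 1, counter)


def recognize(text):
    """Return (accepted, number of productions tried)."""
    counter = [0]
    accepted = any(end == len(text) for end in derive("E", text, 0, counter))
    return accepted, counter[0]


accepted, tried = recognize("*+342")
print(accepted, tried)
```

The derivation of `*+342` uses only 10 productions, but the recognizer tries many more, because it has no way to tell in advance which alternative will match.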
LL(1) Parsers
• An LL parser parses the input from Left to right, and constructs a Leftmost derivation of the sentence.
• An LL(k) parser uses k tokens of look-ahead.
• LL(1) parsers, although fairly restrictive, are attractive because they only need to look at the current non-terminal and the next token to make their parsing decisions.
• LL(1) parsers require LL(1) grammars.
Simple LL(1) Grammars
For simple LL(1) grammars all rules have the form
A → a₁α₁ | a₂α₂ | … | aₙαₙ
where
• aᵢ is a terminal, 1 <= i <= n
• aᵢ ≠ aⱼ for i ≠ j, and
• αᵢ is a sequence of terminals and non-terminals or is empty, 1 <= i <= n
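The two conditions on the leading terminals are easy to test in code. The sketch below is a hedged illustration in Python; the function name `is_simple_ll1` and the encoding of right-hand sides as strings of single-character symbols are assumptions.

```python
def is_simple_ll1(grammar, terminals):
    """Check the simple LL(1) form.

    `grammar` maps each non-terminal to a list of right-hand sides, each a
    string of single-character symbols; `terminals` is the set of terminals.
    """
    for alternatives in grammar.values():
        leaders = []
        for rhs in alternatives:
            # Every alternative must begin with a terminal.
            if not rhs or rhs[0] not in terminals:
                return False
            leaders.append(rhs[0])
        # Leading terminals of one non-terminal must be pairwise distinct.
        if len(leaders) != len(set(leaders)):
            return False
    return True


TERMINALS = set("01234+-*/")
flat = {"E": ["0", "1", "2", "3", "4", "+EE", "-EE", "*EE", "/EE"]}
layered = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}
print(is_simple_ll1(flat, TERMINALS))     # True
print(is_simple_ll1(layered, TERMINALS))  # False
```

The second grammar fails only because the alternatives for E begin with non-terminals.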
Creating Simple LL(1) Grammars
• Why is this not a simple LL(1) grammar?
E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4
– The alternatives for E begin with the non-terminals N and O, not with terminals.
• How can we change it to simple LL(1)?
• By making all production rules of the form:
A → a₁α₁ | a₂α₂ | … | aₙαₙ
• Thus,
E → 0 | 1 | 2 | 3 | 4 | +EE | -EE | *EE | /EE
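Example: LL(1) Parsing
E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)-EE | (8)*EE | (9)/EE
Input: * + 2 3 4
E →(8) * E E
  E →(6) + E E
    E →(3) 2
    E →(4) 3
  E →(5) 4
Success! Output = 8 6 3 4 5
Input: - 2 * 3
E →(7) - E E
  E →(3) 2
  E →(8) * E E
    E →(4) 3
    E → ?
Fail! The input ends before the last E can be expanded.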
Simple LL(1) Parse Table
A parse table is defined as follows:
(V ∪ {#}) × (VT ∪ {#}) → {(β, i), pop, accept, error}
where
– β is the right side of production number i
– # marks the end of the input string (# ∉ V)
If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (VT ∪ {#}) is the current input symbol, then:
ACTION(A, a) =
– pop, if A = a for a ∈ VT
– accept, if A = # and a = #
– (aα, i), which means “pop, then push aα and output i” (A → aα is the ith production)
– error, otherwise
Parse Table
E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
Rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}.
     0       1       2       3       +         *         #
E    (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0    pop
1            pop
2                    pop
3                            pop
+                                    pop
*                                              pop
#                                                        accept
All blank entries are error.
              0       1       2       3       +         *         #
E             (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0,1,2,3,+,*   pop     pop     pop     pop     pop       pop
#                                                                 accept
The Stack, Input, and Output columns show the state after each action; Input is the unread part of *+123#.
Action                                Stack   Input    Output
Initialize                            E#      *+123#
ACTION(E,*) = Replace [E,*EE], Out 6  *EE#    *+123#   6
ACTION(*,*) = pop(*,*)                EE#     +123#    6
ACTION(E,+) = Replace [E,+EE], Out 5  +EEE#   +123#    65
ACTION(+,+) = pop(+,+)                EEE#    123#     65
ACTION(E,1) = Replace [E,1], Out 2    1EE#    123#     652
ACTION(1,1) = pop(1,1)                EE#     23#      652
ACTION(E,2) = Replace [E,2], Out 3    2E#     23#      6523
ACTION(2,2) = pop(2,2)                E#      3#       6523
ACTION(E,3) = Replace [E,3], Out 4    3#      3#       65234
ACTION(3,3) = pop(3,3)                #       #        65234
ACTION(#,#) = accept                  Done!
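The trace can be reproduced with a short driver loop. This is a minimal Python sketch under stated assumptions: tokens are single characters, and the names `RULES`, `TABLE`, and `parse` are illustrative rather than taken from the slides.

```python
# Rules of E -> (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
RULES = {1: "0", 2: "1", 3: "2", 4: "3", 5: "+EE", 6: "*EE"}
TERMINALS = set("0123+*")

# In a simple LL(1) grammar the leading terminal selects the rule.
TABLE = {("E", rhs[0]): (rhs, number) for number, rhs in RULES.items()}


def parse(text):
    """Return the list of rule numbers output while parsing `text`."""
    stack = ["#", "E"]              # top of the stack is the end of the list
    tokens = list(text) + ["#"]     # '#' marks the end of the input
    pos, output = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "#" and look == "#":
            return output                      # accept
        if top in TERMINALS and top == look:
            stack.pop()                        # pop: terminal matches input
            pos += 1
        elif (top, look) in TABLE:
            rhs, number = TABLE[(top, look)]
            stack.pop()
            stack.extend(reversed(rhs))        # leftmost symbol ends up on top
            output.append(number)
        else:
            raise SyntaxError(f"ACTION({top},{look}) = error")


print(parse("*+123"))  # [6, 5, 2, 3, 4]
```

Every blank table entry maps to the `SyntaxError` branch, so an input such as `+1` is rejected as soon as the end marker meets a pending E.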
Simple LL(1): More Restrictive than Necessary
• Simple LL(1) grammars are very easy and efficient to parse but also very restrictive.
• The good news: we can achieve the same desirable results without being so restrictive.
• How? We only need to retain the restriction that single-symbol look ahead uniquely determines which rule to use.
Relaxing Simple LL(1) Restrictions
• Consider the following grammar, which is not simple LL(1):
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
• What are the problem rules? (1) & (2)
• Observe that it is possible to distinguish between rules 1 and 2.
– N leads to {0, 1, 2, 3}
– O leads to {+, *}
– {0, 1, 2, 3} ∩ {+, *} = ∅
– Thus, if we see 0, 1, 2, or 3 we choose (1), and if we see + or *, we choose (2).
LL(1) Grammars
• FIRST(α) = {a | α ⇒* aβ and a ∈ VT}
• A grammar is LL(1) if for all rules of the form
A → α₁ | α₂ | … | αₙ
the sets
FIRST(α₁), FIRST(α₂), …, and FIRST(αₙ)
are pair-wise disjoint; that is,
FIRST(αᵢ) ∩ FIRST(αⱼ) = ∅ for i ≠ j
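The disjointness test can be sketched in Python as follows. This is an illustration under assumptions: the grammar has no empty right-hand sides (matching the definition of FIRST used here), symbols are single characters, and the names `first_sets` and `is_ll1` are hypothetical.

```python
def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating to a fixed point.

    Assumes no right-hand side is empty, so FIRST of a right-hand side is
    FIRST of its leading symbol.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def is_ll1(grammar):
    """True if the FIRST sets of each non-terminal's alternatives are disjoint."""
    first = first_sets(grammar)
    for alternatives in grammar.values():
        seen = set()
        for rhs in alternatives:
            head = rhs[0]
            f = first[head] if head in grammar else {head}
            if seen & f:
                return False       # two alternatives share a leading terminal
            seen |= f
    return True


G = {"E": ["N", "OEE"], "O": ["+", "*"], "N": ["0", "1", "2", "3"]}
print(sorted(first_sets(G)["E"]))  # ['*', '+', '0', '1', '2', '3']
print(is_ll1(G))                   # True
```

Grammars with empty productions need FOLLOW sets as well, which this sketch deliberately leaves out.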
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
Rows are the stack symbols V ∪ {#}; columns are the input symbols VT ∪ {#}.
     +         *         0       1       2       3       #
E    (OEE,2)   (OEE,2)   (N,1)   (N,1)   (N,1)   (N,1)
O    (+,3)     (*,4)
N                        (0,5)   (1,6)   (2,7)   (3,8)
+    pop
*              pop
0                        pop
1                                pop
2                                        pop
3                                                pop
#                                                        accept
For (A, a), we select (α, i) if a ∈ FIRST(α) and α is the right-hand side of rule i.
              +         *         0       1       2       3       #
E             (OEE,2)   (OEE,2)   (N,1)   (N,1)   (N,1)   (N,1)
O             (+,3)     (*,4)
N                                 (0,5)   (1,6)   (2,7)   (3,8)
+,*,0,1,2,3   pop       pop       pop     pop     pop     pop
#                                                                 accept
The Stack, Input, and Output columns show the state after each action; Input is the unread part of *+123#.
Action                                Stack   Input    Output
Initialize                            E#      *+123#
ACTION(E,*) = Replace [E,OEE], Out 2  OEE#    *+123#   2
ACTION(O,*) = Replace [O,*], Out 4    *EE#    *+123#   24
ACTION(*,*) = pop(*,*)                EE#     +123#    24
ACTION(E,+) = Replace [E,OEE], Out 2  OEEE#   +123#    242
ACTION(O,+) = Replace [O,+], Out 3    +EEE#   +123#    2423
ACTION(+,+) = pop(+,+)                EEE#    123#     2423
ACTION(E,1) = Replace [E,N], Out 1    NEE#    123#     24231
ACTION(N,1) = Replace [N,1], Out 6    1EE#    123#     242316
ACTION(1,1) = pop(1,1)                EE#     23#      242316
ACTION(E,2) = Replace [E,N], Out 1    NE#     23#      2423161
ACTION(N,2) = Replace [N,2], Out 7    2E#     23#      24231617
ACTION(2,2) = pop(2,2)                E#      3#       24231617
ACTION(E,3) = Replace [E,N], Out 1    N#      3#       242316171
ACTION(N,3) = Replace [N,3], Out 8    3#      3#       2423161718
ACTION(3,3) = pop(3,3)                #       #        2423161718
ACTION(#,#) = accept                  Done!
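The table and the trace can both be generated from the grammar. The following Python sketch assumes single-character symbols, no empty rules, and no left recursion; `first`, `build_table`, and `parse` are hypothetical names chosen for this illustration.

```python
# rule number -> (left side, right side)
GRAMMAR = {
    1: ("E", "N"), 2: ("E", "OEE"),
    3: ("O", "+"), 4: ("O", "*"),
    5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3"),
}
NONTERMINALS = {lhs for lhs, _ in GRAMMAR.values()}


def first(symbol):
    """FIRST of one symbol (assumes no empty rules and no left recursion)."""
    if symbol not in NONTERMINALS:
        return {symbol}
    result = set()
    for lhs, rhs in GRAMMAR.values():
        if lhs == symbol:
            result |= first(rhs[0])
    return result


def build_table():
    """Entry (A, a) = (right side, rule number) whenever a is in FIRST(right side)."""
    table = {}
    for number, (lhs, rhs) in GRAMMAR.items():
        for a in first(rhs[0]):
            if (lhs, a) in table:
                raise ValueError("grammar is not LL(1)")
            table[(lhs, a)] = (rhs, number)
    return table


def parse(text, start="E"):
    """Return the rule numbers output while parsing `text`."""
    table = build_table()
    stack, tokens = ["#", start], list(text) + ["#"]
    pos, output = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "#" and look == "#":
            return output                       # accept
        if top not in NONTERMINALS and top != "#" and top == look:
            stack.pop()                         # pop a matched terminal
            pos += 1
        elif (top, look) in table:
            rhs, number = table[(top, look)]
            stack.pop()
            stack.extend(reversed(rhs))         # leftmost symbol on top
            output.append(number)
        else:
            raise SyntaxError(f"ACTION({top},{look}) = error")


print(parse("*+123"))  # [2, 4, 2, 3, 1, 6, 1, 7, 1, 8]
```

The driver loop is the same as for a simple LL(1) grammar; only the way the table is filled changes.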
What does 2 4 2 3 1 6 1 7 1 8 mean?
E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3
Parse tree, with the rule applied at each node:
E (2) OEE
  O (4) *
  E (2) OEE
    O (3) +
    E (1) N
      N (6) 1
    E (1) N
      N (7) 2
  E (1) N
    N (8) 3
2 4 2 3 1 6 1 7 1 8 defines a parse tree via a preorder traversal.
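Because the output is a preorder listing, the tree can be rebuilt by expanding the leftmost pending non-terminal with each rule number in turn. The Python sketch below illustrates this; the names `build_tree` and `leaves` and the nested-tuple tree shape are assumptions made for the example.

```python
# rule number -> (left side, right side)
RULES = {
    1: ("E", "N"), 2: ("E", "OEE"),
    3: ("O", "+"), 4: ("O", "*"),
    5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3"),
}
NONTERMINALS = {"E", "O", "N"}


def build_tree(numbers, start="E"):
    """Rebuild the parse tree as nested (symbol, children) tuples."""
    remaining = iter(numbers)

    def node(symbol):
        if symbol not in NONTERMINALS:
            return (symbol, [])                 # terminal leaf
        lhs, rhs = RULES[next(remaining)]       # next rule in preorder
        if lhs != symbol:
            raise ValueError(f"rule for {lhs} used where {symbol} expected")
        # Children are built left to right, consuming rules in preorder.
        return (symbol, [node(child) for child in rhs])

    return node(start)


def leaves(tree):
    """Concatenate the terminal leaves from left to right."""
    symbol, children = tree
    if not children:
        return symbol
    return "".join(leaves(child) for child in children)


tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(leaves(tree))  # *+123
```

Reading the leaves back gives the original input, which confirms that the rule sequence encodes the whole parse tree.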