top-down parsing recursive descent & ll(1)cavazos/cisc471-672-spring... · 2018. 4. 10. ·...
TRANSCRIPT
![Page 1: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/1.jpg)
Top-down ParsingRecursive Descent & LL(1)
Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.
![Page 2: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/2.jpg)
1
Roadmap (Where are we?)
• Predictive top-down parsing —The LL(1) Property—First and Follow sets—Simple recursive descent parsers—Table-driven LL(1) parsers
![Page 3: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/3.jpg)
2
LL(1) Parser
• L = scan input left to right
• L = Leftmost derivation
• 1 = lookahead is enough to pick right production rule to use
• No Backtracking
• No Left Recursion
![Page 4: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/4.jpg)
3
Predictive Parsing
Given production rulesA ® aA ® b
the parser should be able to choose between a or b using one lookahead
Predictive Parser is a top-down parser free of backtracking
![Page 5: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/5.jpg)
4
First Sets
For some rhs aÎG
FIRST(a) is set of tokens (terminals) that appear as first symbol in some string deriving from a
x Î FIRST(a) iff a Þ* x g, for some g
Some number of derivations gets us x at the beginning
For SheepNoise:FIRST(Goal) = { baa }FIRST(SN ) = { baa }FIRST(baa) = { baa }
Goal ® SheepNoiseSheepNoise ® SheepNoise baa
| baa
![Page 6: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/6.jpg)
5
LL(1) Property
If A ® a and A ® b both appear in the grammar, we would like
FIRST(a) Ç FIRST(b) = Æ
This would allow the parser to make a correct choice with a lookahead of exactly one symbol !
Almost correct! See the next slide
FIRST(a) FIRST(b)
Does not have LL(1) Property
![Page 7: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/7.jpg)
6
What about e-productions?
If A ® a and A ® b and e Î FIRST(a), then we need to ensure
FOLLOW(A) Ç FIRST(b) = Æwhere,FOLLOW(A) = the set of terminal symbols that can
immediately follow A in a sentential formFormally,
Follow(A) = {t | (t is a terminal and G Þ*aAtb) or (t is eof and GÞ*aA)}
Note: eof if A is at the end of the derived sentence
![Page 8: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/8.jpg)
7
Follow Sets Intuition
![Page 9: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/9.jpg)
8
FIRST+setsDefinition of FIRST+(A®a) if e Î FIRST(a) then
FIRST+(A®a) = FIRST(a) È FOLLOW(A)else
FIRST+(A®a) = FIRST(a)
Grammar is LL(1) iff A ® a and A ® b implies
FIRST+(A®a) Ç FIRST+(A®b) = Æ
FIRST+( A®a) FIRST+(A®b)
![Page 10: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/10.jpg)
9
What If My Grammar Is Not LL(1) ?
Can we transform a non-LL(1) grammar into an LL(1) grammar?
• In general, the answer is no• In some cases, however, the answer is yes
• Perform:—Eliminate left-recursion Previously—Perform left factoring today
![Page 11: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/11.jpg)
10
What If My Grammar Is Not LL(1) ?
Given grammar G with productions
A ® ab1
A ® ab2
if a derives anything other than e andFIRST+(A ® ab1) Ç FIRST+(A ® ab2) ≠ Æ
This grammar is not LL(1)
FIRST+(ab1) FIRST+(ab2)
![Page 12: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/12.jpg)
11
Left FactoringIf we pull the common prefix, a, into a separate
production, we may make the grammar LL(1).
A ® a A’A’ ® b1
| b2
Now, if FIRST+(A’ ®b1) Ç FIRST+(A’ ® b2) = Æ, G may be LL(1)
Create a new Nonterminal
![Page 13: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/13.jpg)
12
Left FactoringFor each nonterminal A
find the longest prefix a common to 2 or more alternatives for Aif a ≠ e then
replace all of the productionsA ® ab1 | a b2 | a b3 | … | a bn | γwithA ® aA’ | γA’® b1 | b2 | b3 | … | bn
Repeat until no NT has rhs’ with a common prefix
NT with common prefix
![Page 14: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/14.jpg)
13
Left FactoringFor each nonterminal A
find the longest prefix a common to 2 or more alternatives for Aif a ≠ e then
replace all of the productionsA ® ab1 | a b2 | a b3 | … | a bn | γwithA ® aA’ | γA’® b1 | b2 | b3 | … | bn
Repeat until no NT has rhs’ with a common prefix
Put common prefix a into a separate production rule
![Page 15: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/15.jpg)
14
Left FactoringFor each nonterminal A
find the longest prefix a common to 2 or more alternatives for Aif a ≠ e then
replace all of the productionsA ® ab1 | a b2 | a b3 | … | a bn | γwithA ® aA’ | γA’® b1 | b2 | b3 | … | bn
Repeat until no NT has rhs’ with a common prefix
Create new Nonterminal (A’ ) with all unique suffixes
![Page 16: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/16.jpg)
15
Left Factoring
Transformation makes some grammars into LL(1) grammars There are languages for which no LL(1) grammar exists
For each nonterminal Afind the longest prefix a common to 2 or more alternatives for Aif a ≠ e then
replace all of the productionsA ® ab1 | a b2 | a b3 | … | a bn | γwithA ® aA’ | γA’® b1 | b2 | b3 | … | bn
Repeat until no NT has rhs’ with a common prefix
![Page 17: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/17.jpg)
16
Left Factoring not possible Here is an example where a programming language fails to be
LL(1) and is not in a form that can be left factored
identifier
FIRST+(assign-stmt) FIRST+(call-stmt)
![Page 18: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/18.jpg)
17
Left Factoring ExampleConsider a simple right-recursive expression grammar
0 Goal ® Expr1 Expr ® Term + Expr2 | Term - Expr3 | Term4 Term ® Factor * Term5 | Factor / Term6 | Factor7 Factor ® number8 | id
To choose between 1, 2, & 3, an LL(1) parser must look past the number or id to the operator.FIRST+(1) = FIRST+(2) = FIRST+(3)
andFIRST+(4) = FIRST+(5) = FIRST+(6)
Let’s left factor this grammar.
![Page 19: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/19.jpg)
18
Left Factoring ExampleAfter Left Factoring, we have
0 Goal ® Expr1 Expr ® Term Expr’2 Expr’ ® + Expr3 | - Expr4 | e
5 Term ® Factor Term’6 Term’ ® * Term7 | / Term8 | e
9 Factor ® number10 | id
Clearly,FIRST+(2), FIRST+(3), & FIRST+(4)
are disjoint, as areFIRST+(6), FIRST+(7), & FIRST+(8)
The grammar now has the LL(1) property
![Page 20: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/20.jpg)
19
FIRST Sets
FIRST(a)For some a Î (T È NT )*, define FIRST(a)
as the set of tokens that appear as the first symbol in some string that derives from a
That is, x Î FIRST(a) iff a Þ* x g, for some g
![Page 21: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/21.jpg)
20
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Set terminals
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 22: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/22.jpg)
21
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Set empty set for First of nonterminals
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 23: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/23.jpg)
22
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Fixed point algorithm; Monotone because we always add to First sets; never delete from sets
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 24: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/24.jpg)
23
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Iterate through each production
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 25: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/25.jpg)
24
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
RHS is some set of T and NT.
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 26: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/26.jpg)
25
Computing FIRST Sets
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Initialize rhs to First of first symbol minus epsilon
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
![Page 27: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/27.jpg)
26
Computing FIRST Sets
for each x Î T, FIRST(x) ¬ { x }for each A Î NT, FIRST(A) ¬ Øwhile (FIRST sets are still changing) do
for each p Î P, of the form A®b doif b is B1B2…Bk then begin;
FS ¬ FIRST(B1) – { e }for i ¬ 1 to k–1 by 1 while e Î FIRST(Bi) do
FS ¬ FS È ( FIRST(Bi+1) – { e } )end // for loop
end // if-then if i = k and e Î FIRST(Bk )
then FS ¬ FS È { e }FIRST(A) ¬ FIRST(A) È FSend // for loop
end // while loop
Outer loop is monotone increasing for FIRSTsets® | T È NT È e | is bounded, so it terminates
Inner loop is bounded by the length of the productions in the grammar
Iterate through symbols in production until have a symbol that does not have epsilon in First set
![Page 28: Top-down Parsing Recursive Descent & LL(1)cavazos/cisc471-672-spring... · 2018. 4. 10. · •Predictive top-down parsing —The LL(1) Property —First and Follow sets —Simple](https://reader033.vdocuments.net/reader033/viewer/2022060914/60a84e9399c42c41c06783a4/html5/thumbnails/28.jpg)
27
Expression Grammar
Symbol FIRSTnum numid id+ +- -* */ /( () )
eof eofe e
Goal num, id, (Expr num, id, (Expr’ +, -, eTerm num, id, (Term’ *, /, eFactor num, id, (
0 Goal ® Expr1 Expr ® Term Expr’2 Expr’ ® + Term Expr’3 | - Term Expr’4 | e5 Term ® Factor Term’6 Term’ ® * Factor Term’7 | / Factor Term’8 | e9 Factor ® number10 | id11 | ( Expr )