Parsing
Front-End: Parser
Checks the stream of words and their parts of speech for grammatical correctness
Determines if the input is syntactically well formed
Guides context-sensitive ("semantic") analysis (type checking)
Builds IR for the source program
[Figure: source code → scanner → tokens → parser → IR; errors reported]
Syntactic Analysis
Natural language analogy: consider the sentence
He        wrote        the       program
noun      verb         article   noun
subject   predicate    object
sentence
Syntactic Analysis
Programming language:
if (  b <= 0   )  a = b
     bool expr    assignment
         if-statement
Syntactic Analysis
Syntax errors:
int* foo(int i, int j))
{
    for(k=0; i j; )
        fi( i > j )
            return j;
}
Compiler Construction
Sohail Aslam
Lecture 11
Syntactic Analysis
int* foo(int i, int j))        <- extra parenthesis
{
    for(k=0; i j; )            <- missing expression
        fi( i > j )            <- not a keyword
            return j;
}
Semantic Analysis
He        wrote        the       computer
noun      verb         article   noun
subject   predicate    object
sentence
Grammatically correct, but semantically (in meaning) wrong!
Semantic Analysis
int* foo(int i, int j)
{
    for(k=0; i < j; j++ )      <- undeclared var
        if( i < j-2 )
            sum = sum+i;
    return sum;                <- return type mismatch
}
Role of the Parser
Not all sequences of tokens are programs.
The parser must distinguish between valid and invalid sequences of tokens.
What we need:
An expressive way to describe the syntax
An acceptor mechanism that determines if the input token stream satisfies the syntax
Study of Parsing
Parsing is the process of discovering a derivation for some sentence.
Mathematical model of syntax: a grammar G.
Algorithm for testing membership in L(G).
Context-Free Grammars
A CFG is a four-tuple G = (S, N, T, P):
S is the start symbol
N is a set of non-terminals
T is a set of terminals
P is a set of productions
Why Not Regular Expressions?
Reason: regular languages do not have enough power to express the syntax of programming languages.
Limitations of Regular Languages
A finite automaton can't remember the number of times it has visited a particular state.
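As an illustration of this limitation, consider nested parentheses, which appear in most programming languages. Checking that they balance needs a counter that can grow without bound, which a fixed, finite set of states cannot provide. The Python sketch below shows the counting involved; the function name `balanced` is ours for illustration and does not come from the lecture.

```python
def balanced(text: str) -> bool:
    """Return True if every '(' in text has a matching ')'."""
    depth = 0  # number of currently unmatched '('; unbounded in general
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '(' before it
                return False
    return depth == 0


print(balanced("((a + b) * c)"))  # True
print(balanced("((a + b) * c"))   # False
```

The variable `depth` is exactly the memory a finite automaton lacks: for any fixed number of states, a deep enough nesting exceeds what the states can distinguish.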
Example of CFG
Context-free syntax is specified with a CFG.
Example:
1  SheepNoise → SheepNoise baa
2             | baa
This CFG defines the set of noises sheep make.
We can use the SheepNoise grammar to create sentences.
We use the productions as rewriting rules.
Rule  Sentential Form
-     SheepNoise
2     baa
Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
2     baa baa
Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
1     SheepNoise baa baa
2     baa baa baa
And so on ...
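The rewriting process for the SheepNoise grammar can be sketched in Python. This is a minimal illustration: the function names `derive_sheep_noise` and `is_sheep_noise` are our own, and the acceptor relies on the observation that the grammar generates exactly the non-empty sequences of `baa`.

```python
def derive_sheep_noise(n: int) -> list[list[str]]:
    """Return the sentential forms of a derivation of n 'baa's (n >= 1).

    Rule 1 (SheepNoise -> SheepNoise baa) is applied n - 1 times,
    then rule 2 (SheepNoise -> baa) ends the derivation.
    """
    if n < 1:
        raise ValueError("the grammar derives at least one 'baa'")
    form = ["SheepNoise"]
    forms = [form]
    for _ in range(n - 1):
        form = ["SheepNoise", "baa"] + form[1:]  # rule 1 on the leading SheepNoise
        forms.append(form)
    form = ["baa"] + form[1:]                    # rule 2 removes the last non-terminal
    forms.append(form)
    return forms


def is_sheep_noise(tokens: list[str]) -> bool:
    """Acceptor: True if tokens is a sentence of the SheepNoise grammar."""
    return len(tokens) >= 1 and all(t == "baa" for t in tokens)


for form in derive_sheep_noise(3):
    print(" ".join(form))
```

Calling `derive_sheep_noise(3)` yields the four sentential forms SheepNoise, SheepNoise baa, SheepNoise baa baa, and baa baa baa.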
While it is cute, this example quickly runs out of intellectual steam.
To explore uses of CFGs, we need a more complex grammar.
More Useful Grammar
1  expr → expr op expr
2       | num
3       | id
4  op   → +
5       | –
6       | *
7       | /
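One possible way to write this grammar down as the four-tuple G = (S, N, T, P) is sketched below in Python. The class name `Grammar`, its field names, and the `is_well_formed` check are assumptions made for illustration, not part of the lecture; the ASCII hyphen `-` stands in for the minus sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    start: str                                            # S
    nonterminals: frozenset[str]                          # N
    terminals: frozenset[str]                             # T
    productions: tuple[tuple[str, tuple[str, ...]], ...]  # P as (lhs, rhs) pairs

    def is_well_formed(self) -> bool:
        """Check that S, N, T and P are consistent with each other."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and self.nonterminals.isdisjoint(self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Productions 1 to 7 of the expression grammar, in order.
EXPR_GRAMMAR = Grammar(
    start="expr",
    nonterminals=frozenset({"expr", "op"}),
    terminals=frozenset({"num", "id", "+", "-", "*", "/"}),
    productions=(
        ("expr", ("expr", "op", "expr")),
        ("expr", ("num",)),
        ("expr", ("id",)),
        ("op", ("+",)),
        ("op", ("-",)),
        ("op", ("*",)),
        ("op", ("/",)),
    ),
)

print(EXPR_GRAMMAR.is_well_formed())  # True
```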
Backus-Naur Form (BNF)
Grammar rules in a similar form were first used in the description of the Algol60 language.
The notation was developed by John Backus and adapted by Peter Naur for the Algol60 report.
Thus the term Backus-Naur Form (BNF).
Derivation
Let us use the expression grammar to derive the sentence
x – 2 * y
Derivation: x – 2 * y
Rule  Sentential Form
-     expr
1     expr op expr
3     <id,x> op expr
5     <id,x> – expr
1     <id,x> – expr op expr
2     <id,x> – <num,2> op expr
6     <id,x> – <num,2> * expr
3     <id,x> – <num,2> * <id,y>
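The derivation of x – 2 * y can be replayed mechanically. The Python sketch below applies a sequence of rule numbers, always rewriting the leftmost non-terminal. It follows the grammar's numbering, in which rule 2 produces num and rule 3 produces id. Tokens are shown by category only (`id`, `num`), the ASCII hyphen `-` stands in for the minus sign, and the names `RULES`, `NONTERMINALS` and `leftmost_derivation` are ours.

```python
# The expression grammar, keyed by rule number.
RULES = {
    1: ("expr", ["expr", "op", "expr"]),
    2: ("expr", ["num"]),
    3: ("expr", ["id"]),
    4: ("op", ["+"]),
    5: ("op", ["-"]),
    6: ("op", ["*"]),
    7: ("op", ["/"]),
}
NONTERMINALS = {"expr", "op"}


def leftmost_derivation(rule_numbers: list[int], start: str = "expr") -> list[list[str]]:
    """Apply each rule to the leftmost non-terminal; return all sentential forms."""
    form = [start]
    forms = [form]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        # Position of the leftmost non-terminal in the current form.
        pos = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[pos] != lhs:
            raise ValueError(f"rule {number} does not rewrite {form[pos]}")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms


for form in leftmost_derivation([1, 3, 5, 1, 2, 6, 3]):
    print(" ".join(form))
```

The last form printed is `id - num * id`, the token-category view of x – 2 * y.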
Derivation
Such a process of rewrites is called a derivation.
The process of discovering a derivation is called parsing.
We denote this derivation as:
expr →* id – num * id
Derivations
At each step, we choose a non-terminal to replace.
Different choices can lead to different derivations.
Two derivations are of interest:
1. Leftmost derivation: replace the leftmost non-terminal (NT) at each step
2. Rightmost derivation: replace the rightmost NT at each step
The example on the preceding slides was a leftmost derivation.
There is also a rightmost derivation.
Rightmost Derivation: x – 2 * y
Rule  Sentential Form
-     expr
1     expr op expr
3     expr op <id,y>
6     expr * <id,y>
1     expr op expr * <id,y>
2     expr op <num,2> * <id,y>
5     expr – <num,2> * <id,y>
3     <id,x> – <num,2> * <id,y>
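A rightmost derivation can be replayed mechanically by always rewriting the rightmost non-terminal. The Python sketch below is an illustration under the same expression grammar: tokens are shown by category only (`id`, `num`), the ASCII hyphen `-` stands in for the minus sign, and the names `RULES`, `NONTERMINALS` and `rightmost_derivation` are ours.

```python
# The expression grammar, keyed by rule number.
RULES = {
    1: ("expr", ["expr", "op", "expr"]),
    2: ("expr", ["num"]),
    3: ("expr", ["id"]),
    4: ("op", ["+"]),
    5: ("op", ["-"]),
    6: ("op", ["*"]),
    7: ("op", ["/"]),
}
NONTERMINALS = {"expr", "op"}


def rightmost_derivation(rule_numbers: list[int], start: str = "expr") -> list[list[str]]:
    """Apply each rule to the rightmost non-terminal; return all sentential forms."""
    form = [start]
    forms = [form]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        # Position of the rightmost non-terminal in the current form.
        pos = max(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[pos] != lhs:
            raise ValueError(f"rule {number} does not rewrite {form[pos]}")
        form = form[:pos] + rhs + form[pos + 1:]
        forms.append(form)
    return forms


for form in rightmost_derivation([1, 3, 6, 1, 2, 5, 3]):
    print(" ".join(form))
```

The rule sequence differs from the leftmost one, yet the final form is again `id - num * id`.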
Derivations
In both cases we have
expr →* id – num * id
The two derivations produce different parse trees.
The parse trees imply different evaluation orders!
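To make the difference concrete: the leftmost derivation of x – 2 * y expands the right operand of the minus into 2 * y, giving the tree x – (2 * y). The rightmost derivation expands the left operand of the multiplication into x – 2, giving (x – 2) * y. The Python sketch below evaluates both trees. The nested-tuple tree encoding, the `evaluate` function, and the values chosen for x and y are assumptions for illustration only.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def evaluate(tree, env):
    """Evaluate a parse tree given as a leaf or a (left, op, right) tuple."""
    if isinstance(tree, tuple):
        left, op, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):  # identifier leaf: look up its value
        return env[tree]
    return tree                # number leaf


# Tree implied by the leftmost derivation: x - (2 * y)
leftmost_tree = ("x", "-", (2, "*", "y"))
# Tree implied by the rightmost derivation: (x - 2) * y
rightmost_tree = (("x", "-", 2), "*", "y")

env = {"x": 10, "y": 3}  # hypothetical values for illustration
print(evaluate(leftmost_tree, env))   # 4
print(evaluate(rightmost_tree, env))  # 24
```

The same token sequence therefore yields two different results, depending on which parse tree is used.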