October 2, 2015, Azusa, CA, Sheldon X. Liang Ph. D....
April 21, 2023, Azusa, CA
Sheldon X. Liang Ph. D.
Computer Science at Azusa Pacific University
Azusa Pacific University, Azusa, CA 91702, Tel: (800) 825-5278 Department of Computer Science, http://www.apu.edu/clas/computerscience/
CS400 Compiler Construction
Syntax Analysis, Part I
Chapter 4
Position of a Parser in the Compiler Model
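[Figure: the source program feeds the lexical analyzer, which returns (token, tokenval) pairs each time the parser and rest of front-end requests "get next token"; the front end produces the intermediate representation. Both components use the symbol table. The lexical analyzer reports lexical errors; the parser and rest of front-end reports syntax errors and semantic errors.]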
Keep in mind the following questions
• Parser: a twofold role
– Checking syntax
– Invoking semantic actions
• Error Handling: two abilities
– Identifying errors
– Locating errors
• Chomsky Hierarchy
– From Turing machine to finite automaton
– From no restriction to more restriction
The Parser
• A parser implements a context-free (C-F) grammar
• The role of the parser is twofold:
1. To check syntax (= string recognizer)
– And to report syntax errors accurately
2. To invoke semantic actions
– For static semantics checking, e.g. type checking of expressions, functions, etc.
– For syntax-directed translation of the source code to an intermediate representation
Syntax-Directed Translation
• One of the major roles of the parser is to produce an intermediate representation (IR) of the source program using syntax-directed translation methods
• Possible IR output:
– Abstract syntax trees (ASTs)
– Control-flow graphs (CFGs) with triples, three-address code, or register transfer list notation
– WHIRL (SGI Pro64 compiler) has 5 IR levels!
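As a minimal sketch of syntax-directed translation, the recursive-descent parser below builds an AST while it recognizes the input, and a second pass lowers the AST to three-address code. The toy grammar (expr → term { + term }, term → factor { * factor }, factor → id | num | ( expr )), the tuple-based AST, and all names (`tokenize`, `Parser`, `parse`, `to_three_address`) are assumptions made for illustration; they are not taken from the course material.

```python
import re


def tokenize(text):
    """Split the input into identifiers, integers, and the symbols + * ( ).
    Assumption: any other character is silently skipped in this sketch."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # expr -> term { '+' term }
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())  # semantic action: build an AST node
        return node

    def term(self):
        # term -> factor { '*' factor }
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # factor -> id | num | '(' expr ')'
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.expr()
            if self.peek() != ")":
                raise SyntaxError(f"expected ')' at token {self.pos}")
            self.pos += 1
            return node
        if tok is not None and tok not in "+*()":
            self.pos += 1
            return tok  # leaf: identifier or number
        raise SyntaxError(f"unexpected token {tok!r} at token {self.pos}")


def parse(text):
    """Check syntax and return the AST; raise SyntaxError on bad input."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r} at token {parser.pos}")
    return tree


def to_three_address(tree):
    """Lower an AST to a list of three-address instructions."""
    code = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        left_name, right_name = walk(left), walk(right)
        temp = f"t{len(code) + 1}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    walk(tree)
    return code
```

For the input `a + b * 2`, `parse` returns the AST `("+", "a", ("*", "b", "2"))`, and `to_three_address` turns it into `t1 = b * 2` followed by `t2 = a + t1`.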
WHIRL
• Winning Hierarchical Intermediate Representation Language
• 5 levels:
– VH, H, M, L, VL (very high, high, mid, low, very low)
– Lowering happens when needed
– Each optimization is performed at the right level
Error Handling
• A good compiler should assist in identifying and locating errors
– Lexical errors: important, compiler can easily recover and continue
– Syntax errors: most important for compiler, can almost always recover
– Static semantic errors: important, can sometimes recover
– Dynamic semantic errors: hard or impossible to detect at compile time, runtime checks are required
– Logical errors: hard or impossible to detect
Viable-Prefix Property
• The viable-prefix property of LL/LR parsers allows early detection of syntax errors
– Goal: detection of an error as soon as possible without further consuming unnecessary input
– How: detect an error as soon as the prefix of the input does not match a prefix of any string in the language
[Figure: two examples, …for (;)… and …DO 10 I = 1;0…, each annotated with the prefix read so far and the point where the error is detected]
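The idea can be illustrated on a much smaller language than C or Fortran. The sketch below assumes the language of balanced parentheses (including the empty string) and a hypothetical function `scan`; it reports an error at the first character that makes the input read so far stop being a prefix of any string in the language, without looking at the rest of the input.

```python
def scan(text):
    """Scan left to right and stop at the first offending character.

    Returns ("error", i) if text[:i + 1] is not a prefix of any balanced
    string, ("incomplete", len(text)) if the input ended while still being
    a valid prefix, and ("ok", None) if the input is balanced.
    """
    depth = 0  # number of currently open parentheses
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        else:
            # No continuation can repair this prefix: report immediately.
            return ("error", i)
    return ("ok", None) if depth == 0 else ("incomplete", len(text))
```

For `())(` the error is reported at index 2, before the final `(` is consumed. For `(()` no character is ever rejected, because every prefix can still be extended to a balanced string; the problem only shows up at the end of the input.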
Error Recovery Strategies
• Panic mode
– Discard input until a token in a set of designated synchronizing tokens is found
• Phrase-level recovery
– Perform local correction on the input to repair the error
• Error productions
– Augment grammar with productions for erroneous constructs
• Global correction
– Choose a minimal sequence of changes to obtain a global least-cost correction
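Panic mode is the simplest of these strategies to sketch. The example below assumes a toy statement grammar, stmt → id = num ;, with `;` as the only synchronizing token; the names `SYNC`, `matches`, and `parse_program` are hypothetical. On an error the parser records a message, discards tokens up to and including the next `;`, and resumes with the next statement.

```python
SYNC = {";"}  # designated synchronizing tokens (an assumption for this toy grammar)


def matches(kind, tok):
    """Check one token against an expected kind: 'id', 'num', or a literal."""
    if kind == "id":
        return tok.isidentifier()
    if kind == "num":
        return tok.isdigit()
    return tok == kind


def parse_program(tokens):
    """Parse a sequence of statements of the form: id = num ;

    Returns (statements, errors), where errors holds (position, message)
    pairs. Parsing continues after each error.
    """
    statements, errors = [], []
    pos = 0
    while pos < len(tokens):
        values = []
        for kind in ("id", "=", "num", ";"):
            tok = tokens[pos] if pos < len(tokens) else None
            if tok is None or not matches(kind, tok):
                errors.append((pos, f"expected {kind}, found {tok!r}"))
                # Panic mode: discard input until a synchronizing token.
                while pos < len(tokens) and tokens[pos] not in SYNC:
                    pos += 1
                pos += 1  # also consume the synchronizing token itself
                break
            values.append(tok)
            pos += 1
        else:
            statements.append((values[0], int(values[2])))
    return statements, errors
```

With the token list `x = 1 ; y = = 2 ; z = 3 ;`, the second statement is reported once (at token 6) and skipped, while the first and third statements are still parsed. The cost of the simplicity is that everything up to the synchronizing token is thrown away, including input that may have been correct.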
Grammars (Recap)
• A grammar is a 4-tuple G = (N, T, P, S) where
– T is a finite set of tokens (terminal symbols)
– N is a finite set of nonterminals
– P is a finite set of productions of the form α → β, where α ∈ (N∪T)* N (N∪T)* and β ∈ (N∪T)*
– S ∈ N is a designated start symbol
• This is the general production form; in a context-free grammar, α is a single nonterminal A ∈ N
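One way to make the 4-tuple concrete is to store it as a small record and check the side conditions of the definition. This is only a sketch: the class name `Grammar`, its field names, and the choice to represent each side of a production as a tuple of symbols are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    productions: tuple       # P: pairs (lhs, rhs), each side a tuple of symbols
    start: str               # S

    def validate(self):
        """Check the conditions stated in the definition of G = (N, T, P, S)."""
        if self.nonterminals & self.terminals:
            raise ValueError("N and T must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("the start symbol must be a nonterminal")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            if not all(sym in symbols for sym in lhs + rhs):
                raise ValueError(f"unknown symbol in production {lhs} -> {rhs}")
            if not any(sym in self.nonterminals for sym in lhs):
                raise ValueError("a left-hand side must contain a nonterminal")
        return True

    def is_context_free(self):
        """True if every left-hand side is a single nonterminal."""
        return all(
            len(lhs) == 1 and lhs[0] in self.nonterminals
            for lhs, _ in self.productions
        )


# Example: S -> a S b | a b
g = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=((("S",), ("a", "S", "b")), (("S",), ("a", "b"))),
    start="S",
)
```

Here `g.validate()` succeeds and `g.is_context_free()` is true, since both left-hand sides consist of the single nonterminal S.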
Notational Conventions Used
• Terminals: a, b, c, … ∈ T; specific terminals: 0, 1, id, +
• Nonterminals: A, B, C, … ∈ N; specific nonterminals: expr, term, stmt
• Grammar symbols: X, Y, Z ∈ (N∪T)
• Strings of terminals: u, v, w, x, y, z ∈ T*
• Strings of grammar symbols: α, β, γ ∈ (N∪T)*
Derivations (Recap)
• The one-step derivation is defined by α A β ⇒ α γ β, where A → γ is a production in the grammar
• In addition, we define
– ⇒ is leftmost (⇒lm) if α does not contain a nonterminal
– ⇒ is rightmost (⇒rm) if β does not contain a nonterminal
– Reflexive-transitive closure ⇒* (zero or more steps)
– Positive closure ⇒+ (one or more steps)
• The language generated by G is defined by L(G) = {w ∈ T* | S ⇒+ w}
Derivation (Example)
Grammar G = ({E}, {+, *, (, ), -, id}, P, E) with productions P =
E → E + E
E → E * E
E → ( E )
E → - E
E → id
Example derivations:
E ⇒ - E ⇒ - id
E ⇒rm E + E ⇒rm E + id ⇒rm id + id
E ⇒* E
E ⇒* id + id
E ⇒+ id * id + id
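The derivations above can be replayed mechanically. The sketch below encodes the five productions for E and applies one production per step to either the leftmost or the rightmost nonterminal of the current sentential form; the names `PRODUCTIONS`, `step`, and `derive` are hypothetical, and sentential forms are represented as tuples of symbols.

```python
# The productions of the example grammar, keyed by nonterminal.
PRODUCTIONS = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("-", "E"), ("id",)],
}


def step(form, rhs, rightmost=False):
    """Apply one derivation step: rewrite the leftmost (or rightmost)
    nonterminal of the sentential form with the right-hand side rhs."""
    indices = range(len(form))
    if rightmost:
        indices = reversed(indices)
    for i in indices:
        if form[i] in PRODUCTIONS:
            if rhs not in PRODUCTIONS[form[i]]:
                raise ValueError(f"{form[i]} -> {' '.join(rhs)} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


def derive(start, choices, rightmost=False):
    """Return every sentential form of the derivation, starting at `start`."""
    forms = [(start,)]
    for rhs in choices:
        forms.append(step(forms[-1], rhs, rightmost))
    return forms


# E =>rm E + E =>rm E + id =>rm id + id
rm = derive("E", [("E", "+", "E"), ("id",), ("id",)], rightmost=True)
# The same choices applied leftmost: E =>lm E + E =>lm id + E =>lm id + id
lm = derive("E", [("E", "+", "E"), ("id",), ("id",)])
```

Both derivations end in the same sentence, id + id; they differ only in the intermediate sentential form (E + id for the rightmost derivation, id + E for the leftmost one).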
Chomsky Hierarchy: Language Classification
Avram Noam Chomsky is an American linguist, philosopher, and lecturer. He is a professor emeritus of linguistics at MIT. Chomsky is well known in the academic and scientific community as the father of modern linguistics. In the 1950s, Chomsky began developing his theory of generative grammar, which has had a profound influence on linguistics. He established the Chomsky hierarchy, a classification of formal languages in terms of their generative power. His naturalistic approach to the study of language has affected the philosophy of language and mind.
Chomsky Hierarchy: Language Classification
• A grammar G is said to be
– Regular if it is right linear, where each production is of the form A → w B or A → w, or left linear, where each production is of the form A → B w or A → w
– Context free if each production is of the form A → α, where A ∈ N and α ∈ (N∪T)*
– Context sensitive if each production is of the form α A β → α γ β, where A ∈ N, α, γ, β ∈ (N∪T)*, |γ| > 0
– Unrestricted
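The four definitions translate almost directly into checks on the shape of each production. The function below is a sketch under stated assumptions: the name `classify` is hypothetical, each production is a pair of tuples of symbols, `productions` is a list, and every symbol not in `nonterminals` is treated as a terminal. It returns the most restrictive class whose definition all productions satisfy.

```python
def classify(nonterminals, productions):
    """Return 'regular', 'context free', 'context sensitive', or 'unrestricted'."""
    N = set(nonterminals)

    def single_nonterminal(lhs):
        return len(lhs) == 1 and lhs[0] in N

    def right_linear(lhs, rhs):
        # A -> w B or A -> w: only the last symbol may be a nonterminal.
        return single_nonterminal(lhs) and all(s not in N for s in rhs[:-1])

    def left_linear(lhs, rhs):
        # A -> B w or A -> w: only the first symbol may be a nonterminal.
        return single_nonterminal(lhs) and all(s not in N for s in rhs[1:])

    def context_sensitive(lhs, rhs):
        # alpha A beta -> alpha gamma beta with |gamma| > 0.
        if len(rhs) < len(lhs):
            return False  # gamma would have to be empty
        for i, sym in enumerate(lhs):
            if sym in N:
                alpha, beta = lhs[:i], lhs[i + 1:]
                if rhs[:len(alpha)] == alpha and rhs[len(rhs) - len(beta):] == beta:
                    return True
        return False

    if all(right_linear(l, r) for l, r in productions) or all(
        left_linear(l, r) for l, r in productions
    ):
        return "regular"
    if all(single_nonterminal(l) for l, _ in productions):
        return "context free"
    if all(context_sensitive(l, r) for l, r in productions):
        return "context sensitive"
    return "unrestricted"
```

For example, S → a S | b is classified as regular, S → a S b | a b as context free, and a grammar containing a B → a b (with α = a, A = B, γ = b) as context sensitive. Note that the function classifies a grammar, not a language: a regular language can also be described by a grammar that is not regular.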
Chomsky Hierarchy
L(regular) ⊂ L(context free) ⊂ L(context sensitive) ⊂ L(unrestricted)
where L(T) = { L(G) | G is of type T }, that is, the set of all languages generated by grammars G of type T
Examples:
L1 = { a^n b^n | n ≥ 1 } is context free
L2 = { a^n b^n c^n | n ≥ 1 } is context sensitive
Every finite language is regular! (construct a FSA for strings in L(G))
[Figure: nested language classes, from outermost to innermost: unrestricted, context sensitive, context free, regular]
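The two example languages can be tested with a few lines of code. For L1 the sketch follows a context-free grammar directly, S → a S b | a b (this grammar is an assumption; the slides only state the language). L2 is not context free, so no such grammar-shaped recognizer exists for it, and the sketch simply compares the input against the expected string. The function names are hypothetical.

```python
def match_S(s, i=0):
    """Match S -> a S b | a b starting at index i.
    Return the index just past the match, or None if there is no match."""
    if i < len(s) and s[i] == "a":
        j = match_S(s, i + 1)  # try S -> a S b first
        if j is not None and j < len(s) and s[j] == "b":
            return j + 1
        if i + 1 < len(s) and s[i + 1] == "b":  # otherwise S -> a b
            return i + 2
    return None


def in_L1(s):
    """Membership in L1 = { a^n b^n | n >= 1 }."""
    return match_S(s) == len(s)


def in_L2(s):
    """Membership in L2 = { a^n b^n c^n | n >= 1 }, checked by direct
    comparison rather than by following a context-free grammar."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n
```

The recursion in `match_S` plays the role of the stack of a pushdown automaton: each pending call remembers one `a` that still needs its `b`. Keeping three counts in step, as L2 requires, is beyond what a single stack can do.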
Parsing
• Universal (any context-free grammar)
– Cocke-Younger-Kasami
– Earley
• Top-down (context-free grammar with restrictions)
– Recursive descent (predictive parsing)
– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (context-free grammar with restrictions)
– Operator precedence parsing
– LR (Left-to-right, Rightmost derivation) methods
– SLR, canonical LR, LALR
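To give a feel for the universal methods, here is a minimal sketch of a Cocke-Younger-Kasami recognizer. It assumes the grammar is already in Chomsky normal form and is passed as two hypothetical lookup tables: `unit_rules` maps a terminal to the nonterminals that produce it, and `pair_rules` maps a pair (B, C) to the nonterminals A with A → B C. The example grammar for { a^n b^n | n ≥ 1 } is likewise an assumption.

```python
def cyk(word, unit_rules, pair_rules, start="S"):
    """Return True if `word` is derivable from `start` (grammar in CNF)."""
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][l - 1] holds the nonterminals deriving word[i : i + l].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unit_rules.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                # Combine a left part of `split` symbols with the remainder.
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= pair_rules.get((b, c), set())
    return start in table[0][n - 1]


# CNF grammar for { a^n b^n | n >= 1 }:
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNIT = {"a": {"A"}, "b": {"B"}}
PAIR = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
```

With these tables, `cyk("aabb", UNIT, PAIR)` is true and `cyk("aab", UNIT, PAIR)` is false. The three nested loops over length, start position, and split point give the cubic running time that makes universal methods less attractive than LL or LR parsing for grammars that satisfy the restrictions.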
Got it? Review the following questions
• Parser: a twofold role
– Checking syntax
– Invoking semantic actions
• Error Handling: two abilities
– Identifying errors
– Locating errors
• Chomsky Hierarchy
– From Turing machine to finite automaton
– From no restriction to more restriction
Thank you very much!
Questions?