Compiler Construction by Muhammad Nadeem, edited by M. Bilal Qureshi
![Page 1: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/1.jpg)
Compiler Construction
By: Muhammad Nadeem
Edited By: M. Bilal Qureshi
![Page 2: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/2.jpg)
Compiler Construction
Lecture 2
![Page 3: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/3.jpg)
Compilation Process
Source Code --> Compilation Process --> Object Code
The compilation process also produces Error Messages.
Source Code: something we can understand easily
Object Code: something that the computer can understand easily
![Page 4: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/4.jpg)
Phases of a Compiler (Structure of Compiler)
Analysis (front end of compiler)
Synthesis (back end of compiler)
![Page 5: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/5.jpg)
Source Code
--> Lexical Analyzer --> Tokens
--> Syntax Analyzer --> Syntax Tree
--> Semantic Analyzer --> Syntax Tree
--> Intermediate Code Generator --> Intermediate Representation
--> Code Optimizer --> Intermediate Representation
--> Code Generator --> Object Code
The Symbol Table Manager and the Error Handler interact with the phases.
The phases are grouped into an Analysis part and a Synthesis part.
![Page 6: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/6.jpg)
The analysis part breaks up the source program into its constituent pieces and checks grammar and syntax. It then uses this structure to generate an intermediate representation of the source program.
If the source program is found to be syntactically incorrect or semantically unsound, error messages are generated so that the user can take corrective action.
The symbol table is a data structure that collects information about the source program and passes it to the synthesis part along with the intermediate representation.
The synthesis part constructs the desired target program from the intermediate representation and the symbol table information.
![Page 7: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/7.jpg)
Example: Z = X + 10;
Token Token_ID
Z 1
= 2
X 1
+ 2
10 3
Symbol table
1 Variable
2 Operator
3 Number
![Page 8: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/8.jpg)
Lexical Analyzer (Scanner)
![Page 9: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/9.jpg)
Lexical Analyzer (Scanner)
It reads a stream of characters and groups the characters into tokens.
Learn by Example
Position = initial + rate*60
Tokens Generated
1. Identifier#1 Position
2. Assignment Operator =
3. Identifier#2 initial
4. Addition Operator +
5. Identifier#3 rate
6. Multiplication Operator *
7. Number 60
Learn by doing
Percentage = Marks_Obtained / Total * 100
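The grouping of characters into tokens can be sketched in Java as follows. This is only a minimal illustration, not the scanner of any real compiler: the class name SimpleScanner, the token spellings (id1, number) and the small operator set are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SimpleScanner {
    // Hypothetical symbol table: identifier name -> index, in order of first appearance.
    final Map<String, Integer> symbolTable = new LinkedHashMap<>();

    // Reads the characters of one statement and groups them into tokens.
    List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // white space only separates tokens
            } else if (Character.isLetter(c) || c == '_') {
                int start = i;
                while (i < source.length()
                        && (Character.isLetterOrDigit(source.charAt(i)) || source.charAt(i) == '_')) {
                    i++;
                }
                String name = source.substring(start, i);
                Integer id = symbolTable.get(name);
                if (id == null) {
                    id = symbolTable.size() + 1; // next free identifier number
                    symbolTable.put(name, id);
                }
                tokens.add("id" + id);
            } else if (Character.isDigit(c)) {
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add("number");
            } else if ("=+-*/".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Illegal symbol '" + c + "' at position " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        SimpleScanner scanner = new SimpleScanner();
        // prints: id1 = id2 + id3 * number
        System.out.println(String.join(" ", scanner.scan("Position = initial + rate*60")));
        // prints: {Position=1, initial=2, rate=3}
        System.out.println(scanner.symbolTable);
    }
}
```

The same scan method can be tried on the "Learn by doing" statement to check a hand-written answer.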
![Page 10: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/10.jpg)
[Figure: phases of a compiler (Source Code, Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generator, Code Optimizer, Code Generator, Object Code, Symbol Table Manager, Error Handler). Output of the Lexical Analyzer: id1 = id2 + id3*number]
![Page 11: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/11.jpg)
Token
The activity of breaking a stream of characters into tokens is called lexical analysis.
The lexical analyzer partitions the input string into substrings, called words, and classifies them according to their role.
Example: Consider
if(b == 0)
    a = b
In the above programming sentence the words are “if”, “(”, “b”, “==”, “0”, “)”, “a”, “=” and “b”.
The roles are keyword, symbol, variable, boolean operator, constant/number and assignment operator.
![Page 12: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/12.jpg)
The pairs made by the lexical analyzer are:
<keyword, if>
<symbol, (>
<variable, b>
<boolean operator, ==>
<constant/number, 0>
<symbol, )>
<variable, a>
<assignment operator, =>
<variable, b>
The pair <role, word> is called a token.
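A token as a <role, word> pair can be modeled directly as a small class. The sketch below is illustrative only: the class names TokenPairs and Token, the keyword list and the handful of symbols it handles are assumptions chosen to cover the if(b == 0) a = b example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TokenPairs {
    // Assumed keyword list for this example only.
    static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "for", "while", "do", "return"));

    // A token is the pair <role, word>.
    static final class Token {
        final String role;
        final String word;

        Token(String role, String word) {
            this.role = role;
            this.word = word;
        }

        @Override
        public String toString() {
            return "<" + role + ", " + word + ">";
        }
    }

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < source.length() && Character.isLetterOrDigit(source.charAt(i))) {
                    i++;
                }
                String word = source.substring(start, i);
                tokens.add(new Token(KEYWORDS.contains(word) ? "keyword" : "variable", word));
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token("constant/number", source.substring(start, i)));
            } else if (source.startsWith("==", i)) {
                // Check the two-character operator before the one-character "=".
                tokens.add(new Token("boolean operator", "=="));
                i += 2;
            } else if (c == '=') {
                tokens.add(new Token("assignment operator", "="));
                i++;
            } else if (c == '(' || c == ')') {
                tokens.add(new Token("symbol", String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException("Illegal symbol: " + c);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("if(b == 0) a = b")) {
            System.out.println(t);
        }
    }
}
```

Testing for "==" before "=" is the design choice worth noting: a scanner normally prefers the longest word that matches.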
![Page 13: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/13.jpg)
Specification/description of tokens
Regular languages are the most popular for specifying tokens.
Regular languages can be described using regular expressions.
Each regular expression is a notation for a regular language (a set of words).
If A is a regular expression, we write L(A) to refer to the language denoted by A.
A regular expression (RE) is defined inductively:
a : an ordinary character from ∑
є : the empty string
R|S : either R or S
RS : R followed by S (concatenation)
R* : concatenation of R zero or more times (R* = є | R | RR | RRR ...)
![Page 14: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/14.jpg)
Here are some REs and the strings of the language denoted by the RE.
RE : Strings in L(R)
a : “a”
ab : “ab”
a|b : “a” “b”
(ab)* : “” “ab” “abab” ...
(a|є)b : “ab” “b”
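The membership claims in this table can be checked with Java's java.util.regex package. This is a sketch under two assumptions: the helper name inLanguage is made up for this example, and the empty string є is written as an empty alternative, so (a|є)b becomes (a|)b. Java patterns support more than the regular expressions defined here, but these five use only the shared operators.

```java
import java.util.regex.Pattern;

class RegexExamples {
    // Returns true when the whole string s belongs to the language L(re).
    static boolean inLanguage(String re, String s) {
        return Pattern.matches(re, s);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage("a", "a"));        // true
        System.out.println(inLanguage("ab", "ab"));      // true
        System.out.println(inLanguage("a|b", "b"));      // true
        System.out.println(inLanguage("(ab)*", ""));     // true
        System.out.println(inLanguage("(ab)*", "abab")); // true
        System.out.println(inLanguage("(ab)*", "aba"));  // false
        System.out.println(inLanguage("(a|)b", "ab"));   // true
        System.out.println(inLanguage("(a|)b", "b"));    // true
    }
}
```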
![Page 15: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/15.jpg)
Role of Lexical Analyzer
1. Removal of white space
2. Removal of comments
3. Recognizes constants
4. Recognizes keywords
5. Recognizes identifiers
6. Correlates error messages with the source program
![Page 16: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/16.jpg)
1. Removal of white space
By white space we mean:
Blanks
Tabs
New lines
Why? White space is generally used only for formatting source code, so the lexical analyzer can discard it.
A = B + C equals A=B+C
![Page 17: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/17.jpg)
1. Removal of white space
Learn by Example
// This is beginning of my code
int A;
int B = 2;
int C = 33;
A = B + C ;
/* This is end of my code */
![Page 18: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/18.jpg)
2. Removal of comments
Why? Comments are user-added strings which do not contribute to the behavior of the program.
Example in Java
// This is beginning of my code
int A;
int B = 2;
int C = 33;
A = B + C ;
/* This is end of my code */
Both comments mean nothing to the program.
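Comment removal can be sketched as a single left-to-right pass over the characters. This is a simplified illustration: the class name CommentStripper is hypothetical, and the sketch assumes the source contains no string or character literals, which a real scanner would have to skip over first.

```java
class CommentStripper {
    // Removes // line comments and /* ... */ block comments.
    // Assumption: the source has no string or character literals.
    static String stripComments(String src) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("//", i)) {
                // Line comment: skip up to, but not including, the newline.
                while (i < src.length() && src.charAt(i) != '\n') {
                    i++;
                }
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated comment");
                }
                out.append(' '); // a comment still separates the words around it
                i = end + 2;
            } else {
                out.append(src.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String code = "// This is beginning of my code\n"
                + "int A;\nint B = 2;\nint C = 33;\nA = B + C ;\n"
                + "/* This is end of my code */\n";
        System.out.println(stripComments(code).trim());
    }
}
```

Replacing a block comment with one blank keeps the words on either side of it from being glued together.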
![Page 19: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/19.jpg)
3. Recognizes constants/numbers
How is recognition done? If the source code contains a run of consecutive digits, it is recognized as a constant.
Example in Java
// This is beginning of my code
int A;
int B = 2 ;
int C = 33 ;
A = B + C ;
/* This is end of my code */
Here 2 and 33 are recognized as constants.
![Page 20: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/20.jpg)
4. Recognizes keywords
Keywords in C and Java: if, else, for, while, do, return, etc.
How is recognition done? By comparing each combination of letters with the keywords predefined in the grammar of the programming language.
Example in Java
int A;
int B = 2 ;
int C = 33 ;
if ( B < C )
    A = B + C ;
else
    A = C - B ;
int: considered a keyword if the character sequence is 1. i 2. n 3. t
if: considered a keyword if the character sequence is 1. i 2. f
else: considered a keyword if the character sequence is 1. e 2. l 3. s 4. e
![Page 21: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/21.jpg)
5. Recognizes identifiers
What are identifiers? Names of variables, functions, arrays, etc.
How is recognition done? If a combination of letters, with or without digits, in the source code is not a keyword, then the compiler considers it an identifier.
Where is the identifier stored? When an identifier is detected, it is entered into the symbol table.
Example in Java
// This is beginning of my code
int A;
int B2 = 2 ;
int C4R = 33 ;
A = B + C ;
/* This is end of my code */
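The keyword-or-identifier decision and the symbol table entry can be sketched as a lookup against a fixed keyword set. The class name WordClassifier and the short keyword list are assumptions for illustration, and the sketch assumes each word passed in already consists of a letter followed by letters or digits.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class WordClassifier {
    // Assumed subset of the language's predefined keywords.
    static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("int", "if", "else", "for", "while", "do", "return"));

    // Hypothetical symbol table: the identifiers seen so far, in order of first appearance.
    final Set<String> symbolTable = new LinkedHashSet<>();

    // Assumes word is a letter followed by letters or digits.
    String classify(String word) {
        if (KEYWORDS.contains(word)) {
            return "keyword";
        }
        symbolTable.add(word); // a non-keyword word is entered as an identifier
        return "identifier";
    }

    public static void main(String[] args) {
        WordClassifier classifier = new WordClassifier();
        for (String word : new String[] {"int", "A", "int", "B2", "int", "C4R", "A"}) {
            System.out.println(word + " -> " + classifier.classify(word));
        }
        // prints: Symbol table: [A, B2, C4R]
        System.out.println("Symbol table: " + classifier.symbolTable);
    }
}
```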
![Page 22: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/22.jpg)
6. Correlates error messages with the source program
How?
Keeps track of the number of newline characters seen in the source code.
Tells the line number when an error message is to be generated.
Example in Java
1. This is beginning of my code
2. int A;
3. int B2 = 2 ;
4. int C4R = 33 ;
5. A = B + C ;
6. /* This is
7. end of
8. my code
9. */
Error message at line 1: no // inserted at the beginning
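Counting newline characters to find the line of an error can be sketched as below. The class name LineTracker, the helper lineOf and the stray '#' used as an illegal symbol are all assumptions made for this illustration.

```java
class LineTracker {
    // Returns the 1-based line number of the character at the given offset,
    // found by counting the newline characters seen before it.
    static int lineOf(String source, int offset) {
        int line = 1;
        for (int i = 0; i < offset && i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        String code = "int A;\nint B2 = 2 ;\nint C4R = 33 #;\nA = B + C ;\n";
        int offset = code.indexOf('#'); // '#' stands in for an illegal symbol
        // prints: Error at line 3: illegal symbol '#'
        System.out.println("Error at line " + lineOf(code, offset) + ": illegal symbol '#'");
    }
}
```

A scanner would normally keep a running counter while reading rather than recount from the start; the recount here only keeps the sketch short.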
![Page 23: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/23.jpg)
Errors generated by Lexical Analyzer
1. Illegal symbols
• E.g., =>
2. Illegal identifiers
• E.g., 2ab
3. Unterminated comments
• E.g., /* This is beginning of my code
![Page 24: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/24.jpg)
Learn by example
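// Beginning of Code
int a char } switch b[2] =;
// end of code
No error is generated by the lexical analyzer.
Why?
Every word is a valid token. Detecting that the tokens appear in an invalid order is the job of the syntax analyzer.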
![Page 25: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/25.jpg)
Terminologies
• Lexeme
– Actual sequence of characters that matches a pattern and has a given token class.
– Examples:
Identifier: Name, Data, x
Integer: 345, 2, 0, 629
• Pattern
– The rules that characterize the set of strings for a token.
– Examples:
Integer: a digit followed by zero or more digits
Identifier: a letter followed by zero or more letters or digits
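The two example patterns can be written as regular expressions and used to pick lexemes out of an input string. This is a sketch only: the class name PatternDemo and the group names Integer and Identifier are assumptions, and the input is assumed to contain words separated by blanks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternDemo {
    // Integer: a digit followed by zero or more digits.
    // Identifier: a letter followed by zero or more letters or digits.
    static final Pattern TOKEN =
            Pattern.compile("(?<Integer>[0-9]+)|(?<Identifier>[A-Za-z][A-Za-z0-9]*)");

    // Lists each lexeme found in the input together with its token class.
    static List<String> lexemes(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String tokenClass = m.group("Integer") != null ? "Integer" : "Identifier";
            result.add(tokenClass + ": " + m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints: [Identifier: Name, Integer: 345, Identifier: x, Integer: 0]
        System.out.println(lexemes("Name 345 x 0"));
    }
}
```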
![Page 26: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/26.jpg)
![Page 27: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/27.jpg)
Syntax Analyzer (Parser)
![Page 28: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/28.jpg)
Syntax Analyzer (Parser)
Uses the tokens produced by the lexical analyzer to create a tree-like intermediate representation.
The parse tree depicts the grammatical structure of the token stream.
Example
Source Code --> Position = initial + rate*60
Lexical Analyzer --> id1 = id2 + id3 * number
[Figure: Parse Tree / Syntax Tree for id1 = id2 + id3 * number]
![Page 29: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/29.jpg)
Syntax Analyzer (Parser)
[Figure: syntax tree. Root: =. Children of =: id1 and +. Children of +: id2 and the subtree for id3 * 60.]
![Page 30: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/30.jpg)
Syntax Analyzer (Parser)
[Figure: syntax tree. Root: =. Children of =: id1 (position) and +. Children of +: id2 (initial) and *. Children of *: id3 (rate) and number (60).]
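A syntax tree for id1 = id2 + id3 * number can be built by a small recursive-descent parser. This is one possible technique, not necessarily the one these slides have in mind. The class names TinyParser and Node and the three grammar rules in the comments are assumptions for this sketch. The tree is printed in prefix form, so (* id3 number) is the subtree with * at its root.

```java
import java.util.Arrays;
import java.util.List;

class TinyParser {
    // A tree node: an operator with two children, or a leaf with none.
    static final class Node {
        final String label;
        final Node left;
        final Node right;

        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return left == null ? label : "(" + label + " " + left + " " + right + ")";
        }
    }

    private final List<String> tokens;
    private int pos = 0;

    TinyParser(List<String> tokens) {
        this.tokens = tokens;
    }

    // assignment -> identifier '=' expression
    Node parseAssignment() {
        Node target = parseOperand();
        if (Character.isDigit(target.label.charAt(0))) {
            throw new IllegalStateException("Cannot assign to a number: " + target.label);
        }
        if (!peekIs("=")) {
            throw new IllegalStateException("Expected '='");
        }
        pos++;
        Node tree = new Node("=", target, parseExpression());
        if (pos != tokens.size()) {
            throw new IllegalStateException("Unexpected token: " + tokens.get(pos));
        }
        return tree;
    }

    // expression -> term (('+' | '-') term)*
    private Node parseExpression() {
        Node left = parseTerm();
        while (peekIs("+") || peekIs("-")) {
            String op = tokens.get(pos++);
            left = new Node(op, left, parseTerm());
        }
        return left;
    }

    // term -> operand (('*' | '/') operand)*
    private Node parseTerm() {
        Node left = parseOperand();
        while (peekIs("*") || peekIs("/")) {
            String op = tokens.get(pos++);
            left = new Node(op, left, parseOperand());
        }
        return left;
    }

    // operand -> identifier | number
    private Node parseOperand() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("Unexpected end of input");
        }
        String token = tokens.get(pos++);
        if (!token.matches("[A-Za-z_][A-Za-z0-9_]*|[0-9]+")) {
            throw new IllegalStateException("Expected identifier or number, found: " + token);
        }
        return new Node(token, null, null);
    }

    private boolean peekIs(String expected) {
        return pos < tokens.size() && tokens.get(pos).equals(expected);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("id1", "=", "id2", "+", "id3", "*", "number");
        // prints: (= id1 (+ id2 (* id3 number)))
        System.out.println(new TinyParser(tokens).parseAssignment());
    }
}
```

Because a term is parsed before the surrounding expression continues, * binds tighter than +, which is why id3 * number ends up lower in the tree than id2.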
![Page 31: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/31.jpg)
Syntax Analyzer (Parser)
Learn by doing
Percentage = Marks_Obtained / Total * 100
![Page 32: Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi](https://reader033.vdocuments.net/reader033/viewer/2022051417/5697bfe31a28abf838cb4e63/html5/thumbnails/32.jpg)
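[Figure: phases of a compiler (Source Code, Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Intermediate Code Generator, Code Optimizer, Code Generator, Object Code, Error Handler). Output of the Syntax Analyzer: the syntax tree for id1 = id2 + id3 * number, where id1 is position, id2 is initial, id3 is rate and number is 60.]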