CSE244 Chapter 1: Introduction to Compiling

Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
[email protected]
http://www.engr.uconn.edu/~steve
(860) 486 - 4818

Dr. Robert LaBarre
United Technologies Research Center
411 Silver Lane
E. Hartford, CT 06018
[email protected]
[email protected]


Post on 18-Jan-2018



Introduction to Compilers

As a Discipline, Involves Multiple CSE Areas
  Programming Languages and Algorithms
  Software Engineering & Theory / Foundations
  Computer Architecture & Operating Systems

But, Has Surprisingly Simplistic Intent:

  Source program -> Compiler -> Target Program
                       |
                       v
                 Error messages

  Diverse & Varied

Classifications of Compilers

Compilers Viewed from Many Perspectives
  Construction: Single Pass, Multiple Pass, Load & Go
  Functional: Debugging, Optimizing

However, All utilize same basic tasks to accomplish their actions

Classifications of Compilers

Also, Broadly Categorized as:
  Analysis: Decompose Source into an intermediate representation
  Synthesis: Target program generation from representation

We Will Discuss Each Category in This Class

Important Notes

In Today’s Technology, Analysis Is Often Performed by Software Tools - This Wasn’t the Case in Early CSE Days
  Structure / Syntax directed editors: Force “syntactically” correct code to be entered
  Pretty Printers: Standardized version for program structure (i.e., blank space, indenting, etc.)
  Static Checkers: A “quick” compilation to detect rudimentary errors
  Interpreters: “real” time execution of code a “line-at-a-time”

Important Notes

Compilation Is Not Limited to Programming Language Applications
  Text Formatters
    LATEX & TROFF Are Languages Whose Commands Format Text
  Silicon Compilers
    Textual / Graphical: Take Input and Generate Circuit Design
  Database Query Processors
    Database Query Languages Are Also Programming Languages
    Input Is “compiled” Into a Set of Operations for Accessing the Database

The Many Phases of a Compiler

  Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
  Target Program

Alongside the Phases: Symbol-table Manager, Error Handler

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

The Analysis Task For Compilation

Three Phases:
  Linear / Lexical Analysis:
    Left-to-Right Scan to Identify Tokens
  Hierarchical Analysis:
    Grouping of Tokens Into Meaningful Collection
  Semantic Analysis:
    Checking to Ensure Correctness of Components

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens which are building blocks

For Example:

  Position := initial + rate * 60 ;
  ________ __ _______ _ ____ _ __ _

All are tokens

Blanks, Line breaks, etc. are scanned out
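The left-to-right scan can be sketched in a few lines of Python. This is only an illustration: the token category names and the regular expressions are assumptions chosen to cover this one statement, not a definition taken from the course.

```python
import re

# Token categories and patterns are assumptions that cover the example only.
TOKEN_SPEC = [
    ("number",     r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign",     r":="),
    ("operator",   r"[+*]"),
    ("semicolon",  r";"),
    ("skip",       r"\s+"),   # blanks and line breaks are scanned out
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan left to right and return a list of (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for kind, text in tokenize("Position := initial + rate * 60 ;"):
        print(f"{kind:<10} {text}")
```

The scanner needs no recursion: it only looks at the next few characters, which is what makes this the easiest of the three analysis phases.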

Phase 2. Hierarchical Analysis
aka Parsing or Syntax Analysis

For previous example, we would have Parse Tree:

  assignment statement
    identifier: position
    :=
    expression
      expression
        identifier: initial
      +
      expression
        expression
          identifier: rate
        *
        expression
          number: 60

Nodes of tree are constructed using a grammar for the language

What is a Grammar?

Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens

  statement             is an  assignment statement, or while statement, or if statement, or ...
  assignment statement  is an  identifier := expression ;
  expression            is an  (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
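As a rough sketch, the assignment-statement rule can be turned into a small recursive-descent parser in Python. Two assumptions are made here that are not on the slide: the tokens arrive as a list of strings already split apart, and the expression rule is layered into expression / term / factor so that * binds tighter than +, since the rule as written above does not say which operator groups first.

```python
def parse_assignment(tokens):
    """Parse `identifier := expression ;` and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(text):
        nonlocal pos
        if peek() != text:
            raise SyntaxError(f"expected {text!r}, found {peek()!r}")
        pos += 1

    def factor():
        # factor: ( expression ) | number | identifier
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            node = expression()
            expect(")")
            return node
        if tok is not None and tok.isdigit():
            pos += 1
            return ("number", tok)
        if tok is not None and tok.isidentifier():
            pos += 1
            return ("identifier", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term: factor { * factor }
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def expression():
        # expression: term { + term }
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = peek()
    if target is None or not target.isidentifier():
        raise SyntaxError(f"expected identifier, found {target!r}")
    pos += 1
    expect(":=")
    value = expression()
    expect(";")
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return (":=", ("identifier", target), value)


if __name__ == "__main__":
    print(parse_assignment("position := initial + rate * 60 ;".split()))
```

The mutually recursive functions mirror the grammar rules one for one, which is the recursion that a lexical scan alone cannot provide.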

Why Have We Divided Analysis in This Manner?

Lexical Analysis - Scans Input & Its Linear Actions Are Not Recursive
  Identify Only Individual “words” that are the Tokens of the Language
Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
  Verify that the “words” are Correctly Assembled into “sentences”
What is Third Phase?
  Determine Whether the Sentences have One and Only One Unambiguous Interpretation
  “John Took Picture of Mary Out on the Patio”

Phase 3. Semantic Analysis

Find More Complicated Semantic Errors and Support Code Generation
Parse Tree Is Augmented With Semantic Actions

Compressed Tree:

  :=
    position
    +
      initial
      *
        rate
        60

Conversion Action:

  :=
    position
    +
      initial
      *
        rate
        inttoreal
          60
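The conversion action can be sketched as a pass over the compressed tree that wraps an integer operand in an inttoreal node whenever the other operand is real. In this Python sketch the nested-tuple tree layout, the `coerce` function name, and the `types` dictionary (standing in for the symbol table) are all assumptions, as is the premise that initial and rate were declared real.

```python
def coerce(node, types):
    """Return (annotated_node, type) for a nested-tuple expression tree.

    `types` maps identifier names to "int" or "real". Integer operands of
    a mixed-type operator are wrapped in an ("inttoreal", operand) node.
    """
    kind = node[0]
    if kind == "number":
        return node, ("real" if "." in node[1] else "int")
    if kind == "identifier":
        return node, types[node[1]]
    if kind in ("+", "*"):
        left, left_type = coerce(node[1], types)
        right, right_type = coerce(node[2], types)
        if left_type == right_type:
            return (kind, left, right), left_type
        # Mixed int/real operands: convert the int side, result is real.
        if left_type == "int":
            left = ("inttoreal", left)
        if right_type == "int":
            right = ("inttoreal", right)
        return (kind, left, right), "real"
    raise ValueError(f"unknown node kind {kind!r}")


if __name__ == "__main__":
    tree = ("+", ("identifier", "initial"),
            ("*", ("identifier", "rate"), ("number", "60")))
    print(coerce(tree, {"initial": "real", "rate": "real"}))
```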

Phase 3. Semantic Analysis

Most Important Activity in This Phase: Type Checking - Legality of Operands
Many Different Situations:

  Real := int + char ;
  A[int] := A[real] + int ;
  while char <> int do
  …. Etc.
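One way to picture legality checking is a table lookup on operator and operand types. The Python sketch below is hypothetical throughout: the rule table, the function names, and the decision about which combinations are legal are assumptions made for illustration, since the slide does not say which of the situations above the language accepts.

```python
# Hypothetical rule table: (operator, left type, right type) -> result type.
# Combinations that are absent are treated as type errors.
RULES = {
    ("+", "int", "int"): "int",
    ("+", "real", "real"): "real",
    ("+", "int", "real"): "real",
    ("+", "real", "int"): "real",
    ("<>", "int", "int"): "bool",
    ("<>", "char", "char"): "bool",
}


def result_type(op, left, right):
    """Look up the result type of `left op right`, or raise TypeError."""
    try:
        return RULES[(op, left, right)]
    except KeyError:
        raise TypeError(f"illegal operands for {op}: {left}, {right}") from None


def check_index(index_type):
    """Array subscripts are assumed here to require an int index."""
    if index_type != "int":
        raise TypeError(f"array index must be int, not {index_type}")
```

Under these assumed rules, `result_type("<>", "char", "int")` and `check_index("real")` both raise `TypeError`, which corresponds to the error message a semantic analyzer would report.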

Analysis in Text Formatting

Simple Commands : LATEX

  \begin{single}
  \end{single}
  \noindent
  \section{Introduction}
  $A_i$
  $A_{i_j}$

Embedded in a stream of text, i.e., a FILE

\ and $ serve as signals to LATEX

Language Commands: begin, single, noindent, section

What are tokens?
What is hierarchical structure?
What kind of semantic analysis is required?

Supporting Phases/ Activities for Analysis

Symbol Table Creation / Maintenance
  Contains Info on Each “Meaningful” Token, Typically Identifiers
  Data Structure Created / Initialized During Lexical Analysis
  Utilized / Updated During Later Analysis & Synthesis
Error Handling
  Detection of Different Errors Which Correspond to All Phases
  What Kinds of Errors Are Found During the Analysis Phase?
  What Happens When an Error Is Found?
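A symbol table of this kind can be sketched as a dictionary keyed by identifier name. In the Python sketch below, the class name, method names, and record fields are assumptions; a real compiler would store more per entry (scope, storage location, and so on).

```python
class SymbolTable:
    """Maps each identifier to a record; ids are assigned in order of first sight."""

    def __init__(self):
        self._entries = {}

    def intern(self, name):
        """Return the record for `name`, creating it on first sight (lexical analysis)."""
        if name not in self._entries:
            self._entries[name] = {"id": len(self._entries) + 1, "type": None}
        return self._entries[name]

    def set_type(self, name, type_name):
        """Fill in attributes learned by later phases."""
        self._entries[name]["type"] = type_name

    def lookup(self, name):
        """Return the record for `name`, or None if it was never entered."""
        return self._entries.get(name)


if __name__ == "__main__":
    table = SymbolTable()
    for name in ["position", "initial", "rate"]:
        print(name, "-> id", table.intern(name)["id"])
```

Entering position, initial and rate in that order gives them ids 1, 2 and 3, the same numbering a lexical analyzer would use when rewriting the statement as `id1 := id2 + id3 * 60`.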


The Synthesis Task For Compilation

Intermediate Code Generation
  Abstract Machine Version of Code - Independent of Architecture
  Easy to Produce and Do Final, Machine Dependent Code Generation
Code Optimization
  Find More Efficient Ways to Execute Code
  Replace Code With More Optimal Statements
  2-approaches: High-level Language & “Peephole” Optimization
Final Code Generation
  Generate Relocatable Machine Dependent Code

Reviewing the Entire Process

  position := initial + rate * 60

[lexical analyzer]

  id1 := id2 + id3 * 60

[syntax analyzer]

  :=
    id1
    +
      id2
      *
        id3
        60

[semantic analyzer]

  :=
    id1
    +
      id2
      *
        id3
        inttoreal
          60

[intermediate code generator]

Symbol Table:
  position ....
  initial ....
  rate ....

Errors

Reviewing the Entire Process

[intermediate code generator]

  temp1 := inttoreal(60)
  temp2 := id3 * temp1
  temp3 := id2 + temp2
  id1 := temp3

[code optimizer]

  temp1 := id3 * 60.0
  id1 := id2 + temp1

[final code generator]

  movf id3, r2
  mulf #60.0, r2
  movf id2, r1
  addf r2, r1
  movf r1, id1

Symbol Table:
  position ....
  initial ....
  rate ....

Errors
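The first two synthesis steps can be sketched in Python as follows. The nested-tuple tree layout and the function names `generate` and `optimize` are assumptions, and the optimizer handles only the two rewrites seen in this example (folding inttoreal of an integer literal, and removing the final copy); it is not a general optimizer.

```python
import re
from itertools import count


def generate(tree):
    """Translate an assignment tree into three-address statements."""
    code = []
    temps = count(1)

    def emit(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return node[1]
        if kind == "inttoreal":
            operand = emit(node[1])
            temp = f"temp{next(temps)}"
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        left, right = emit(node[1]), emit(node[2])
        temp = f"temp{next(temps)}"
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    _, target, value = tree
    code.append(f"{target[1]} := {emit(value)}")
    return code


def optimize(code):
    """Fold inttoreal(<integer literal>) and drop the final copy statement."""
    constants = {}
    kept = []
    for statement in code:
        target, expr = statement.split(" := ")
        folded = re.fullmatch(r"inttoreal\((\d+)\)", expr)
        if folded:
            constants[target] = folded.group(1) + ".0"  # converted at compile time
            continue
        expr = " ".join(constants.get(word, word) for word in expr.split())
        kept.append([target, expr])
    # If the last statement only copies the temporary computed just before it,
    # assign to the final target directly. This is safe here because that
    # temporary has no later use.
    if len(kept) >= 2 and kept[-1][1] == kept[-2][0]:
        kept[-2][0] = kept[-1][0]
        kept.pop()
    return [f"{target} := {expr}" for target, expr in kept]


if __name__ == "__main__":
    tree = (":=", ("identifier", "id1"),
            ("+", ("identifier", "id2"),
             ("*", ("identifier", "id3"), ("inttoreal", ("number", "60")))))
    for line in optimize(generate(tree)):
        print(line)
```

`generate` reproduces the four intermediate statements shown above. `optimize` yields the same two-statement shape as the optimizer output, except that the surviving temporary keeps its original name (temp2) because this sketch does not renumber temporaries.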

Compiler Cousins: Preprocessors

Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

  #define X 3
  #define Y A*B+C
  #define Z getchar()
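The text substitution can be imitated with a short Python sketch. It is deliberately much simpler than the real C preprocessor: it handles only object-like macros, does not rescan replaced text, and does not skip string literals or comments. The function name `expand_macros` is an assumption.

```python
import re


def expand_macros(source):
    """Apply object-like #define lines to the text that follows them."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)(?:\s+(.*))?$", line)
        if match:
            # A macro with no body (e.g. `#define then`) expands to nothing.
            macros[match.group(1)] = (match.group(2) or "").strip()
            continue
        # Replace whole identifiers only, so X does not match inside XY.
        line = re.sub(r"\b\w+\b", lambda m: macros.get(m.group(), m.group()), line)
        output.append(line)
    return "\n".join(output)


if __name__ == "__main__":
    print(expand_macros("#define X 3\n#define Y A*B+C\nint n = 2 * Y + X;"))
```

Because the substitution is purely textual, `2 * Y` becomes `2 * A*B+C`, which the compiler then reads as `(2*A*B)+C`. That is why macro bodies containing operators are usually written with surrounding parentheses.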

2. File Inclusion

#include in C - bring in another file before compiling

  defs.h:
    //////////////////

  main.c:
    #include "defs.h"
    …---…---…---…---…---…---…---…---…---

  After inclusion:
    //////////////////
    …---…---…---…---…---…---…---…---…---
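File inclusion amounts to splicing one text into another. The Python sketch below uses an in-memory dictionary of file names to contents in place of a real file system; that dictionary, the function name `expand_includes`, and the restriction to the quoted `#include "file"` form are assumptions made to keep the example self-contained.

```python
import re

INCLUDE_RE = re.compile(r'\s*#include\s+"([^"]+)"\s*$')


def expand_includes(name, files, active=()):
    """Return the text of `name` with each #include "file" line replaced
    by that file's contents, expanded recursively.

    `files` maps file names to their text. `active` tracks the files
    currently being expanded so that a circular include is reported.
    """
    if name in active:
        raise ValueError(f"circular include of {name}")
    lines = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            lines.append(expand_includes(match.group(1), files, active + (name,)))
        else:
            lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    files = {
        "defs.h": "#define X 3",
        "main.c": '#include "defs.h"\nint n = X;',
    }
    print(expand_includes("main.c", files))
```

The compiler proper never sees the `#include` line; it receives a single stream of text with the contents of defs.h already in place.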

3. Rational Preprocessors

Augment “Old” Languages With Modern Constructs
Add Macros for If - Then, While, Etc.
#define Can Make C Code More Pascal-like

  #define begin {
  #define end }
  #define then

4. Language Extensions for a Database System

EQUEL - Database query language embedded in C

  ## Retrieve (DN=Department.Dnum) where
  ## Department.Dname = ‘Research’

is Preprocessed into:

  ingres_system(“Retr…..Research’”,____,____);

a procedure call in a programming language.

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation
  vs.
Back End : Code Generation + Optimization

Number of Passes:
  Single - Preferred
  Multiple - Easier, but less efficient

Tradeoffs ……..


Compiler Construction Tools

Parser Generators : Produce Syntax Analyzers

Scanner Generators : Produce Lexical Analyzers

Syntax-directed Translation Engines : Generate Intermediate Code

Automatic Code Generators : Generate Actual Code

Data-Flow Engines : Support Optimization