Chapter 1: Introduction to Compiling

Page 1: Chapter 1: Introduction to Compiling

CH1.1

CSE244

Chapter 1: Introduction to Compiling

Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
[email protected]
http://www.engr.uconn.edu/~steve
(860) 486 - 4818

Dr. Robert LaBarre
United Technologies Research Center
411 Silver Lane
E. Hartford, CT 06018
[email protected]
[email protected]

Page 2: Chapter 1: Introduction to Compiling

Introduction to Compilers

As a Discipline, Involves Multiple CSE Areas
  Programming Languages and Algorithms
  Software Engineering & Theory / Foundations
  Computer Architecture & Operating Systems

But, Has a Surprisingly Simple Intent:

  Source Program -> Compiler -> Target Program
                       |
                       v
                 Error messages

  Diverse & Varied

Page 3: Chapter 1: Introduction to Compiling

Classifications of Compilers

Compilers Viewed from Many Perspectives
  Construction: Single Pass, Multiple Pass, Load & Go
  Functional: Debugging, Optimizing

However, All Utilize the Same Basic Tasks to Accomplish Their Actions

Page 4: Chapter 1: Introduction to Compiling

Classifications of Compilers

Also, Broadly Categorized as:
  Analysis: Decompose Source into an intermediate representation
  Synthesis: Target program generation from representation

We Will Discuss Each Category in This Class

Page 5: Chapter 1: Introduction to Compiling

Important Notes

In Today’s Technology, Analysis Is Often Performed by Software Tools - This Wasn’t the Case in Early CSE Days
  Structure / Syntax directed editors: Force “syntactically” correct code to be entered
  Pretty Printers: Standardized version for program structure (e.g., blank space, indenting)
  Static Checkers: A “quick” compilation to detect rudimentary errors
  Interpreters: “real” time execution of code, a “line-at-a-time”

Page 6: Chapter 1: Introduction to Compiling

Important Notes

Compilation Is Not Limited to Programming Language Applications
  Text Formatters
    LATEX & TROFF Are Languages Whose Commands Format Text
  Silicon Compilers
    Textual / Graphical: Take Input and Generate Circuit Design
  Database Query Processors
    Database Query Languages Are Also a Programming Language
    Input Is “compiled” Into a Set of Operations for Accessing the Database

Page 7: Chapter 1: Introduction to Compiling

The Many Phases of a Compiler

Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
Target Program

Symbol-table Manager and Error Handler: connected to all six phases

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

Page 8: Chapter 1: Introduction to Compiling

The Analysis Task For Compilation

Three Phases:
  Linear / Lexical Analysis:
    Left-to-right Scan to Identify Tokens
  Hierarchical Analysis:
    Grouping of Tokens Into Meaningful Collections
  Semantic Analysis:
    Checking to Ensure Correctness of Components

Page 9: Chapter 1: Introduction to Compiling

Phase 1. Lexical Analysis

Easiest Analysis - Identify tokens, which are the building blocks

For Example:

  position := initial + rate * 60 ;
  ________ __ _______ _ ____ _ __ _

All are tokens

Blanks, Line breaks, etc. are scanned out
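The left-to-right scan can be sketched in a few lines of Python. This is only an illustration, not the course's scanner: the token class names (`ID`, `NUMBER`, `ASSIGN`, `OP`, `SEMI`) and the function `tokenize` are assumptions made for the example.

```python
import re

# Token classes for the slide's example statement; names are illustrative.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),  # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for kind, lexeme in tokenize("position := initial + rate * 60 ;"):
    print(kind, lexeme)
```

The loop never looks back or nests, which is the sense in which lexical analysis is "linear": the example statement yields eight tokens, one per underlined group.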

Page 10: Chapter 1: Introduction to Compiling

Phase 2. Hierarchical Analysis (aka Parsing or Syntax Analysis)

For previous example, we would have

Parse Tree:

  assignment statement
    identifier: position
    :=
    expression
      expression
        identifier: initial
      +
      expression
        expression
          identifier: rate
        *
        expression
          number: 60

Nodes of tree are constructed using a grammar for the language

Page 11: Chapter 1: Introduction to Compiling

What is a Grammar?

Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens

  statement is an
    assignment statement, or while statement, or if statement, or ...

  assignment statement is an
    identifier := expression ;

  expression is an
    (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
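The rules for assignment statement and expression can be turned into a small recursive-descent parser. The sketch below is an assumption-laden illustration: the grammar on the slide is ambiguous as written, so the code assumes the usual convention that `*` binds tighter than `+` and that both associate to the left. The names `Parser`, `parse`, and the tuple shape of the tree are hypothetical.

```python
import re

def tokenize(text):
    # Minimal scanner for this sketch; characters it does not know are skipped.
    return re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|:=|[+*();]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, lexeme):
        if self.peek() != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, found {self.peek()!r}")
        self.pos += 1

    def assignment(self):
        # assignment statement is an: identifier := expression ;
        target = self.peek()
        if target is None or not (target[0].isalpha() or target[0] == "_"):
            raise SyntaxError(f"expected identifier, found {target!r}")
        self.pos += 1
        self.expect(":=")
        value = self.expression()
        self.expect(";")
        return (":=", target, value)

    def expression(self):
        # expression + expression (assumed lowest precedence)
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        # expression * expression (assumed to bind tighter than +)
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # (expression), or number, or identifier
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.expression()
            self.expect(")")
            return node
        if tok is not None and (tok[0].isalnum() or tok[0] == "_"):
            self.pos += 1
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.assignment()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at {parser.peek()!r}")
    return tree

print(parse("position := initial + rate * 60 ;"))
```

`expression` calls `term`, which calls `factor`, which can call `expression` again for a parenthesized operand. That mutual recursion is what lets the parser recover nested structure that a linear scan cannot.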

Page 12: Chapter 1: Introduction to Compiling

Why Have We Divided Analysis in This Manner?

Lexical Analysis - Scans Input & Its Linear Actions Are Not Recursive
  Identify Only Individual “words” that are the Tokens of the Language

Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
  Verify that the “words” are Correctly Assembled into “sentences”

What is Third Phase?
  Determine Whether the Sentences have One and Only One Unambiguous Interpretation
    “John Took Picture of Mary Out on the Patio”

Page 13: Chapter 1: Introduction to Compiling

Phase 3. Semantic Analysis

Find More Complicated Semantic Errors and Support Code Generation

Parse Tree Is Augmented With Semantic Actions

Compressed Tree:

  :=
    position
    +
      initial
      *
        rate
        60

Conversion Action:

  :=
    position
    +
      initial
      *
        rate
        inttoreal
          60
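The conversion action can be sketched as a tree walk that wraps an integer operand in an `inttoreal` node whenever the other operand is real. This is a minimal sketch under stated assumptions: trees are nested tuples `(op, left, right)`, only the types `int` and `real` exist, and the identifiers `position`, `initial`, and `rate` are assumed to be declared real. The function name `check` is hypothetical.

```python
def check(node, types):
    """Return (annotated_node, type_name) for a tuple tree.

    `types` maps identifier names to "int" or "real" (assumed declarations).
    """
    if isinstance(node, tuple):
        op, left, right = node
        left, ltype = check(left, types)
        right, rtype = check(right, types)
        if op == ":=":
            if ltype == "int" and rtype == "real":
                raise TypeError("cannot assign a real value to an int target")
            if ltype == "real" and rtype == "int":
                right = ("inttoreal", right)
            return (op, left, right), ltype
        # Arithmetic: convert the int side of a mixed operation.
        if ltype == "int" and rtype == "real":
            left = ("inttoreal", left)
        elif ltype == "real" and rtype == "int":
            right = ("inttoreal", right)
        result = "real" if "real" in (ltype, rtype) else "int"
        return (op, left, right), result
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "real"
    return node, types[node]  # identifier: look up its declared type

tree = (":=", "position", ("+", "initial", ("*", "rate", 60)))
types = {"position": "real", "initial": "real", "rate": "real"}
annotated, _ = check(tree, types)
print(annotated)
```

Only the literal `60` is wrapped, which reproduces the second tree: the multiplication sees a real `rate` and an integer constant, so the constant is converted before the operation.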

Page 14: Chapter 1: Introduction to Compiling

Phase 3. Semantic Analysis

Most Important Activity in This Phase:
  Type Checking - Legality of Operands

Many Different Situations:
  Real := int + char ;
  A[int] := A[real] + int ;
  while char <> int do
  ... etc.
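Whether each of these situations is legal depends on the language's typing rules, which the slide leaves open. The sketch below picks one plausible Pascal-like rule set purely as an assumption: arithmetic needs numeric operands, `<>` needs operands of the same type or two numeric types, and array indices must be `int`. The names `binary_result` and `index_ok` are hypothetical.

```python
NUMERIC = {"int", "real"}

def binary_result(op, left, right):
    """Type of `left op right`, or TypeError if the operands are illegal.

    The rules are assumptions chosen for illustration.
    """
    if op in {"+", "*"}:
        if left in NUMERIC and right in NUMERIC:
            return "real" if "real" in (left, right) else "int"
    elif op == "<>":
        if left == right or (left in NUMERIC and right in NUMERIC):
            return "bool"
    raise TypeError(f"illegal operands for {op}: {left}, {right}")

def index_ok(index_type):
    """Array subscripts are assumed to require an int."""
    return index_type == "int"

print(binary_result("+", "int", "real"))  # mixed numeric arithmetic is accepted
print(index_ok("real"))                   # A[real] would be rejected
```

Under these assumed rules, `int + char` and `char <> int` raise `TypeError`, and `A[real]` fails the index check, so all three situations on the slide would be reported as errors.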

Page 15: Chapter 1: Introduction to Compiling

Analysis in Text Formatting

Simple Commands : LATEX

  \begin{single}
  \end{single}
  \noindent
  \section{Introduction}
  $A_i$
  $A_{i_j}$

Embedded in a stream of text, i.e., a FILE

\ and $ serve as signals to LATEX

Language Commands: begin, single, noindent, section

What are tokens?

What is hierarchical structure?

What kind of semantic analysis is required?
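One possible answer to the token question can be sketched as follows. It is an assumption for illustration, not how LATEX actually scans its input: a backslash followed by letters is treated as a command token, the characters `{ } $ _ ^` as single-character tokens, and everything else as plain text. The function name `tokenize_latex` is hypothetical.

```python
import re

# Command names, structural characters, then runs of ordinary text.
LATEX_TOKEN = re.compile(r"\\[A-Za-z]+|[{}$_^]|[^\\{}$_^]+")

def tokenize_latex(text):
    """Split a LATEX fragment into command, delimiter, and text tokens.

    A backslash not followed by a letter is skipped by this sketch.
    """
    return LATEX_TOKEN.findall(text)

print(tokenize_latex(r"\section{Introduction}"))
print(tokenize_latex(r"$A_{i_j}$"))
```

With tokens like these, the hierarchical structure would be the nesting of braces, of `\begin` and `\end` pairs, and of subscripts within subscripts as in `$A_{i_j}$`.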

Page 16: Chapter 1: Introduction to Compiling

Supporting Phases / Activities for Analysis

Symbol Table Creation / Maintenance
  Contains Info on Each “Meaningful” Token, Typically Identifiers
  Data Structure Created / Initialized During Lexical Analysis
  Utilized / Updated During Later Analysis & Synthesis

Error Handling
  Detection of Different Errors Which Correspond to All Phases
  What Kinds of Errors Are Found During the Analysis Phase?
  What Happens When an Error Is Found?
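A symbol table of this kind can be sketched as a dictionary keyed by identifier name. The class and method names (`SymbolTable`, `insert`, `set_type`, `lookup`) and the stored fields are assumptions made for the example; real compilers keep more attributes, such as scope and storage location.

```python
class SymbolTable:
    """Maps identifier names to entries created during lexical analysis."""

    def __init__(self):
        self._entries = {}

    def insert(self, name):
        """Add `name` if it is new and return its index (1, 2, 3, ...)."""
        if name not in self._entries:
            self._entries[name] = {"index": len(self._entries) + 1, "type": None}
        return self._entries[name]["index"]

    def set_type(self, name, type_name):
        """Later phases fill in attributes the scanner could not know."""
        self._entries[name]["type"] = type_name

    def lookup(self, name):
        return self._entries[name]

table = SymbolTable()
for lexeme in ["position", "initial", "rate", "initial"]:
    print(lexeme, "-> id" + str(table.insert(lexeme)))
table.set_type("rate", "real")
print(table.lookup("rate"))
```

The scanner only creates entries; the type is left empty until a later phase that has seen the declarations can update it.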

Page 17: Chapter 1: Introduction to Compiling

The Many Phases of a Compiler

Source Program
  1. Lexical Analyzer
  2. Syntax Analyzer
  3. Semantic Analyzer
  4. Intermediate Code Generator
  5. Code Optimizer
  6. Code Generator
Target Program

Symbol-table Manager and Error Handler: connected to all six phases

1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis

Page 18: Chapter 1: Introduction to Compiling

The Synthesis Task For Compilation

Intermediate Code Generation
  Abstract Machine Version of Code - Independent of Architecture
  Easy to Produce and Do Final, Machine Dependent Code Generation

Code Optimization
  Find More Efficient Ways to Execute Code
  Replace Code With More Efficient Statements
  2 approaches: High-level Language & “Peephole” Optimization

Final Code Generation
  Generate Relocatable Machine Dependent Code
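Intermediate code generation can be sketched as a post-order walk over the semantic tree that emits one statement per operator and names each result with a fresh temporary. The tuple tree shape, the `tempN` naming, and the function `generate` are assumptions for this illustration.

```python
def generate(tree):
    """Emit three-address statements for a tree of the form (':=', target, value)."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        # Leaves (identifiers, constants) are used directly as operands.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttoreal":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    code.append(f"{target} := {walk(value)}")
    return code

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", 60))))
for statement in generate(tree):
    print(statement)
```

Each statement has at most one operator, so nothing in the output depends on a particular machine's registers or instruction set.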

Page 19: Chapter 1: Introduction to Compiling

Reviewing the Entire Process

position := initial + rate * 60

  lexical analyzer

id1 := id2 + id3 * 60

  syntax analyzer

  :=
    id1
    +
      id2
      *
        id3
        60

  semantic analyzer

  :=
    id1
    +
      id2
      *
        id3
        inttoreal
          60

  intermediate code generator

Symbol Table
  position ....
  initial ....
  rate ....

Errors

Page 20: Chapter 1: Introduction to Compiling

Reviewing the Entire Process

  intermediate code generator

temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3

  code optimizer

temp1 := id3 * 60.0
id1 := id2 + temp1

  final code generator

movf id3, r2
mulf #60.0, r2
movf id2, r1
addf r2, r1
movf r1, id1

Symbol Table
  position ....
  initial ....
  rate ....

Errors
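The optimizer step, from four statements to two, can be sketched as three small passes: fold `inttoreal` of a constant at compile time, merge a final copy into the statement that produced its operand, and renumber the surviving temporaries. This is an illustrative sketch, not a general optimizer: instructions are assumed to be tuples `(dest, op, arg1, arg2)`, each temporary is assumed to be used only once, and the names `optimize` and `show` are hypothetical.

```python
def optimize(code):
    # Pass 1: do int-to-real conversion of constants at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            constants[dest] = float(a)
            continue
        folded.append((dest, op, a, b))

    # Pass 2: turn "t := x op y; id := t" into "id := x op y".
    merged = []
    for dest, op, a, b in folded:
        is_temp = isinstance(a, str) and a.startswith("temp")
        if op == ":=" and is_temp and merged and merged[-1][0] == a:
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber the remaining temporaries from temp1.
    names = {}
    def rename(x):
        if isinstance(x, str) and x.startswith("temp"):
            return names.setdefault(x, f"temp{len(names) + 1}")
        return x
    return [(rename(d), op, rename(a), rename(b)) for d, op, a, b in merged]

def show(instr):
    dest, op, a, b = instr
    if op == ":=":
        return f"{dest} := {a}"
    if b is None:
        return f"{dest} := {op}({a})"
    return f"{dest} := {a} {op} {b}"

code = [
    ("temp1", "inttoreal", 60, None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", ":=", "temp3", None),
]
for instr in optimize(code):
    print(show(instr))
```

On the slide's four statements this produces the two optimized statements shown above. A production optimizer would need liveness information before merging copies, which this sketch sidesteps with the single-use assumption.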

Page 21: Chapter 1: Introduction to Compiling

Compiler Cousins: Preprocessors

Provide Input to Compilers

1. Macro Processing

#define in C: does text substitution before compiling

#define X 3

#define Y A*B+C

#define Z getchar()
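The text-substitution idea can be sketched in Python as below. This only models object-like macros and is not how a real C preprocessor works in detail: it does not rescan replacement text, handle function-like macros, or skip string literals and comments. The function name `expand_macros` is hypothetical.

```python
import re

def expand_macros(source):
    """Record `#define NAME replacement` lines and substitute NAME afterwards."""
    macros = {}
    output = []
    for line in source.splitlines():
        definition = re.match(r"\s*#define\s+(\w+)\s*(.*)", line)
        if definition:
            macros[definition.group(1)] = definition.group(2).strip()
            continue

        def substitute(match):
            word = match.group(0)
            return macros.get(word, word)

        # Replace whole words only, so X does not match inside MAX.
        output.append(re.sub(r"\b\w+\b", substitute, line))
    return "\n".join(output)

program = """#define X 3
#define Y A*B+C
#define Z getchar()
n = X + Z;
r = Y * 2;"""
print(expand_macros(program))
```

The last line becomes `r = A*B+C * 2;`, so only `C` is doubled. Because substitution is purely textual and happens before the compiler sees any expression structure, macro bodies like `Y` are usually written with parentheses.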

Page 22: Chapter 1: Introduction to Compiling

2. File Inclusion

#include in C - bring in another file before compiling

defs.h:
  //////
  //////
  //////

main.c:
  #include "defs.h"
  …---…---…---…---…---…---…---…---…---

main.c after preprocessing:
  //////
  //////
  //////
  …---…---…---…---…---…---…---…---…---
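File inclusion can be sketched as a recursive line-by-line expansion. To keep the example self-contained, files are modeled as an in-memory dictionary rather than read from disk. The function name `expand_includes` and the file contents are assumptions, and only the quoted form `#include "name"` is handled.

```python
import re

INCLUDE = re.compile(r'\s*#include\s+"([^"]+)"\s*$')

def expand_includes(name, files, active=()):
    """Return the lines of `name` with every #include "x" replaced by x's lines.

    `files` maps file names to their text; `active` tracks the include chain
    so a file that includes itself is reported instead of looping forever.
    """
    if name in active:
        raise ValueError(f"circular include of {name}")
    lines = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            lines.extend(expand_includes(match.group(1), files, active + (name,)))
        else:
            lines.append(line)
    return lines

files = {
    "defs.h": "int limit;\nint count;",
    "main.c": '#include "defs.h"\nint main(void) { return 0; }',
}
print("\n".join(expand_includes("main.c", files)))
```

The compiler proper then sees one stream of text in which the `#include` line has been replaced by the contents of `defs.h`.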

Page 23: Chapter 1: Introduction to Compiling

3. Rational Preprocessors

Augment “Old” Languages With Modern Constructs

Add Macros for If - Then, While, Etc.

#define Can Make C Code More Pascal-like

#define begin {

#define end }

#define then

Page 24: Chapter 1: Introduction to Compiling

4. Language Extensions for a Database System

EQUEL - Database query language embedded in C

## Retrieve (DN=Department.Dnum) where

## Department.Dname = ‘Research’

is Preprocessed into:

ingres_system(“Retr…..Research’”,____,____);

a procedure call in a programming language.

Page 25: Chapter 1: Introduction to Compiling

The Grouping of Phases

Front End : Analysis + Intermediate Code Generation
    vs.
Back End : Code Generation + Optimization

Number of Passes:
  Single - Preferred
  Multiple - Easier, but less efficient

Tradeoffs ...

Page 26: Chapter 1: Introduction to Compiling

Compiler Construction Tools

Parser Generators : Produce Syntax Analyzers

Scanner Generators : Produce Lexical Analyzers

Syntax-directed Translation Engines : Generate Intermediate Code

Automatic Code Generators : Generate Actual Code

Data-Flow Engines : Support Optimization