type checking in compiler design

Data representation 1

Chapter 4 of Programming Languages by Ravi Sethi


Types: data representation

• The Role of Types
• Basic Types
• Arrays: Sequences of Elements
• Records: Named Fields
• Unions and Variant Records
• Sets
• Pointers: Efficiency and Dynamic Allocation
• Types and Error Checking


The role of types

• Data object: refers to something meaningful to the application
• Data representation: refers to the organization of values in a program
• Objects in an application have corresponding representations in a program

Example: an application uses “days” as an object (thus, “January 22”, “May 6”, “tomorrow(d)”)

In the program, days are represented as integers (22, 126, n+1)

Avoid confusion between objects and their representation: it is OK to add two days (integer addition), but not OK to multiply two days!


Values and their types

In imperative languages, data representations are built from values that can be manipulated directly by the underlying machine

• Basic types: int, char, float, pointer, …
• Structured types: arrays, records, sets, …


Type expressions

Examples:

    int temp[100];

    typedef struct {
        char name[20];
        char address[64];
    } person;

A type expression describes how a data representation is constructed. It is used to

1. Represent data objects
2. Lay out values in the underlying machine
3. Check that operators are applied properly within expressions


Arrays

• Arrays are sequences of elements, all of the same type (and thus of the same size)
• Efficient access and storage allocation
• Indexed by integers or enumerations
• Array layout: elements appear in consecutive locations in the underlying machine
• Bounds evaluated at:
    Compile time (Pascal, C, …)
    Procedure entry (Algol 60)
    Run time (C++, Java)
• Arrays of arrays: row-major layout, column-major layout
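The two layouts for arrays of arrays differ only in how an index pair is turned into a position in consecutive storage. The following is a minimal sketch in C; the function names are made up for illustration, and indices are assumed to start at 0.

```c
#include <stddef.h>

/* Offset (in elements) of A[i][j] in a rows x cols array stored
   row by row. This is the layout C uses for int A[rows][cols]. */
size_t row_major_offset(size_t i, size_t j, size_t rows, size_t cols)
{
    (void)rows;               /* the row count is not needed here */
    return i * cols + j;
}

/* Offset of the same element when the array is stored column by
   column. This is the layout Fortran uses. */
size_t col_major_offset(size_t i, size_t j, size_t rows, size_t cols)
{
    (void)cols;               /* the column count is not needed here */
    return j * rows + i;
}
```

For a 3 x 4 array, element [1][2] sits at offset 6 in row-major order and at offset 7 in column-major order.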


Records

A record type is a template for grouping together variables that are logically related and hence are treated as a unit

A variable declaration allocates storage

Storage is allocated at compile time according to the definition of the record types


Arrays and records

The layout of arrays and records is known at compile time

For arrays, selection of an element is done at run time, because the index may be an arbitrary expression

For records, selection of a field is resolved at compile time, because the field name fixes the offset
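The contrast between field selection and element selection can be made concrete in C. This is only a sketch: the age field and the helper element_at are hypothetical additions for illustration.

```c
#include <stddef.h>

struct person {
    char name[20];
    char address[64];
    int  age;                  /* hypothetical extra field */
};

/* Field selection: the offset of a field is a compile-time constant,
   so it can be used where C requires a constant expression. */
enum { ADDRESS_OFFSET = offsetof(struct person, address) };

/* Element selection: the offset i * sizeof(int) depends on the
   run-time value of i, so it is computed during execution. */
int element_at(const int *a, size_t i)
{
    return *(const int *)((const char *)a + i * sizeof(int));
}
```

Here ADDRESS_OFFSET is 20, since address follows the 20-byte name array directly.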


Unions and variant records

Eclipsed by the Object-Oriented concepts

Define record types that share common properties

Variant record: a part common to all records of that type and a variant part

Union: a special case of a variant record with an empty common part


Unions and variant records (cont’d)

Layout of variant records:

1. Fixed part
2. Tag field
3. Variant part

Variant Records could compromise type safety
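The three-part layout can be imitated in C with a struct that wraps a union. The sketch below uses invented names (shape, circle, rect); it is one possible encoding, not the layout of any particular Pascal compiler.

```c
enum shape_kind { CIRCLE, RECTANGLE };      /* possible tag values */

struct shape {
    int id;                      /* fixed part: common to all variants */
    enum shape_kind kind;        /* tag field: says which variant is live */
    union {                      /* variant part: members share storage */
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } u;
};

/* Safe use: inspect the tag before touching the variant part. */
double area(const struct shape *s)
{
    switch (s->kind) {
    case CIRCLE:
        return 3.14159265358979 * s->u.circle.radius * s->u.circle.radius;
    case RECTANGLE:
        return s->u.rect.width * s->u.rect.height;
    }
    return 0.0;                  /* unknown tag */
}
```

Nothing in the language stops a program from setting kind to CIRCLE and then reading u.rect, which is the kind of hole in type safety the slide refers to.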




Sets

• Set values: in Pascal, all elements must be of the same simple type
• Set types: the type set of S represents all possible subsets of S
• Example: var S : set of [1..3]
  S can denote one of the following sets: [ ], [1], [2], [3], [1,2], [1,3], [2,3], [1,2,3]
• A set of n elements is implemented as a bit vector of length n
• The basic operation on a set is a membership test
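A bit-vector implementation of small sets takes only a few lines. The sketch below assumes the base type is 1..n with n no larger than the number of bits in an unsigned int; the names set_t, set_add, and so on are illustrative.

```c
typedef unsigned int set_t;    /* one bit per possible element */

/* Bit k-1 stands for element k of the base type 1..n. */
set_t set_add(set_t s, int k)            { return s | (1u << (k - 1)); }

/* Membership test: is bit k-1 set? Returns 1 or 0. */
int   set_member(set_t s, int k)         { return (int)((s >> (k - 1)) & 1u); }

/* Union and intersection are single machine operations. */
set_t set_union(set_t a, set_t b)        { return a | b; }
set_t set_intersection(set_t a, set_t b) { return a & b; }
```

With this encoding the eight subsets of 1..3 are exactly the values 0 through 7, and the empty set is 0.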


Pointers

• Pointers provide indirect access to elements of a known type
• It is more efficient to move or copy a pointer than the data structure it points to
• Necessary to implement dynamic data structures: lists, trees, graphs, …
• The size and layout of storage for pointers are known statically
• Dynamic data structures can grow and shrink at run time by allocating and deallocating fixed-size memory chunks
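A singly linked list shows how a structure of unbounded size is built from fixed-size chunks. This is a minimal sketch with hypothetical names (node, push, free_list).

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;   /* each cell has a fixed size; the list can grow */
};

/* Allocate one cell and put it at the front of the list.
   Returns the new head, or the old head unchanged if allocation fails. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Release every cell of the list, one fixed-size chunk at a time. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Copying the head pointer copies one machine word, however long the list is.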


Dangling pointers, garbage and memory leaks

A pointer that still points to a storage area that has been deallocated is left “dangling”

Storage that is still allocated but is no longer accessible (through a pointer to it) is called “garbage”

Programs that create garbage are said to have “memory leaks”


Types of expressions

• Types extend from values to expressions: the type of an expression x + y can be inferred from the types of x and y
• Types of variable bindings:
  1. Static or early bindings
  2. Dynamic or late bindings
• C, Pascal, … have static bindings of types and dynamic bindings of values to variables.
• Lisp and Smalltalk have dynamic binding of both values and types.


Type systems

• Language design principle: every expression must have a type that is known (at the latest, at run time)
• Type system: a set of rules for associating a type with an expression; it allows one to determine the appropriate use of the operators in an expression
• Basic rule of type checking
  1. Overloading: multiple meanings
  2. Coercion: conversion from one type to another
  3. Polymorphism: parameterized type
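A type system of this kind can be sketched as a recursive function over expression trees. The toy language below (integer and float literals, + and mod) and all names in it are assumptions made for illustration; a real checker would also handle variables, declarations, and polymorphism.

```c
typedef enum { T_INT, T_FLOAT, T_ERROR } type_t;

typedef enum { E_INT_LIT, E_FLOAT_LIT, E_ADD, E_MOD } expr_kind;

typedef struct expr {
    expr_kind kind;
    const struct expr *left, *right;   /* used by E_ADD and E_MOD only */
} expr;

/* Returns the type of e, or T_ERROR if an operator is misapplied. */
type_t type_of(const expr *e)
{
    type_t l, r;
    switch (e->kind) {
    case E_INT_LIT:
        return T_INT;
    case E_FLOAT_LIT:
        return T_FLOAT;
    case E_ADD:                        /* overloaded: int+int and float+float */
        l = type_of(e->left);
        r = type_of(e->right);
        if (l == T_ERROR || r == T_ERROR)
            return T_ERROR;
        if (l == r)
            return l;
        return T_FLOAT;                /* coercion: the int operand is widened */
    case E_MOD:                        /* defined on integers only */
        l = type_of(e->left);
        r = type_of(e->right);
        return (l == T_INT && r == T_INT) ? T_INT : T_ERROR;
    }
    return T_ERROR;
}
```

Under these rules 1 + 2.0 has type float, while 1 mod 2.0 is a type error.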


Types and error checking

Static and dynamic checking

• A type error occurs if an operation is improperly applied
• Static checking: programs are checked before they run
• Dynamic checking is done during program execution
• Strong typing ensures freedom from type errors


Miscellaneous

Short-circuit evaluation of Boolean expressions

Type coercion
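Both items can be seen in a few lines of C. The function names below are hypothetical; the behaviour of && and of mixed int/double arithmetic is standard C.

```c
#include <stddef.h>

/* Short-circuit evaluation: && evaluates its left operand first and
   skips the right operand when the left one is false, so *p is never
   evaluated when p is NULL. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* Coercion: the int operand n is converted to double before the
   division, so the result keeps its fractional part. */
double half(int n)
{
    return n / 2.0;
}
```

Without short-circuit evaluation, is_positive(NULL) would dereference a null pointer.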


Chapter 5 of Programming Languages

Ravi Sethi


Procedures

• Introduction to Procedures
• Parameter Passing Methods
• Scope Rules for Names
• Nested Scope in the Source Text


INTRODUCTION TO PROCEDURES

Procedures are constructs for giving a name to a piece of code (the body)

When the name is called, the body is executed.

• Function procedures: called functions
• Proper procedures: called procedures

Functions return a single value. Procedures have only a side effect, such as setting variables or performing output, and return no value.


A use of a procedure is referred to as a call of the procedure

<procedure-name> ( <parameters> )

The parentheses around the parameters are a syntactic cue to a call

Functions are called from within expressions
example: r * sin(angle)

Procedures are treated as atomic statements
example: read(ch);

In these procedure calls, angle and ch are the actual parameters.


Elements of a procedure

• A name for the declared procedure
• A body consisting of local declarations and statements
• The formal parameters, which are placeholders for actuals
• An optional result type

Example (Pascal)

    function square(x : integer) : integer;
    begin
        square := x * x
    end;

Example (C)

    int square(int x)
    {
        int sq;
        sq = x * x;
        return sq;
    }


RECURSION : MULTIPLE ACTIVATION

Activation - Each execution of a procedure body is referred to as an activation of the procedure

Recursion - A procedure is recursive if it can be activated from within its own procedure body

Example - Factorial function

    function f(n : integer) : integer;
    begin
        if n = 0 then f := 1 else f := n * f(n - 1)
    end;

f(n) is computed in terms of f(n-1), f(n-1) in terms of f(n-2), and so on. For n = 3 the sequence of activations is as follows:

    f(3) = 3 * f(2)
        f(2) = 2 * f(1)
            f(1) = 1 * f(0)
                f(0) = 1
            f(1) = 1
        f(2) = 2
    f(3) = 6
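The same function in C, with a counter added to make the number of activations visible. The counter is a hypothetical addition for illustration and is not part of the factorial itself.

```c
static int activations = 0;   /* hypothetical counter, for illustration only */

int f(int n)
{
    activations++;            /* a new activation of f has begun */
    if (n == 0)
        return 1;
    return n * f(n - 1);      /* this activation waits while f(n-1) runs */
}
```

Starting from a counter of 0, a call f(3) returns 6 and leaves activations at 4, one for each of f(3), f(2), f(1), and f(0).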


5.2 PARAMETER PASSING METHODS

If communication is desired between the caller and the callee, arrangements must be made for passing values back and forth through the procedure's parameters.

Parameter passing refers to the matching of actuals with formals when a procedure call occurs

Different interpretations of what a parameter stands for lead to different parameter passing methods.

• Call by Value
• Call by Reference
• Call by Value-Result


Value Parameter

• Gets its own memory location
• Gets its initial value from the corresponding actual parameter (matched by position)
• Uses but does not change the actual parameter
• Actual parameters can be variables or expressions of the appropriate type

Example

    #include <math.h>

    float future_value(float initial_balance, float p, int nyear)
    {
        p = 1 + p/12/100;    /* changes only the local copy of the rate */
        int n = 12 * nyear;
        float b = initial_balance * pow(p, n);
        return b;
    }

    b = future_value(total/2, rate, year2-year1);

[Figure: in main, the values of the actual expressions total/2, rate, and year2-year1 are copied into the parameter variables initial_balance, p, and nyear of future_value.]


Reference Parameters

• Changes the value of the actual parameter
• Shares the memory location of the actual parameter
• Must match in type
• The actual reference parameter must have a location

Example

    procedure swap(var x : integer; var y : integer);
    var z : integer;
    begin
        z := x; x := y; y := z
    end;
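C itself passes every parameter by value, so the other two methods have to be emulated with pointers. The sketch below puts the three methods side by side on swap; the function names are invented for the comparison.

```c
/* Call by value: x and y are copies, so the caller's variables
   are untouched when the function returns. */
void swap_by_value(int x, int y)
{
    int z = x; x = y; y = z;
    (void)x; (void)y;         /* the swapped copies are simply discarded */
}

/* Call by reference, emulated with pointers: *x and *y are the
   caller's own locations. */
void swap_by_reference(int *x, int *y)
{
    int z = *x; *x = *y; *y = z;
}

/* Call by value-result (copy-in/copy-out), emulated by hand:
   work on local copies, then copy them back on exit. */
void swap_by_value_result(int *x, int *y)
{
    int lx = *x, ly = *y;     /* copy in */
    int z = lx; lx = ly; ly = z;
    *x = lx; *y = ly;         /* copy out */
}
```

With a = 1 and b = 2, swap_by_value(a, b) leaves both unchanged, while the other two versions exchange them. Reference and value-result give the same answer here; they can differ when the two actuals are aliases of the same location.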


OBSERVATIONS

• Program execution always begins in the main program

• Formal parameters (function definition) and actual parameters (function call) are matched by position. Names need not agree

• Data types of parameters do not appear in the function call

• When a function completes the flow of control returns to the place that called it.


SCOPE RULES FOR NAMES

The scope rules of a language determine which declaration of a name x applies to an occurrence of x in a program.

There are two kinds of scope rules, called lexical and dynamic scope rules.

Binding and Scope

Consider the following Pascal procedure

    procedure swap(var x, y : T);
    var z : T;                        { binding occurrence of z }
    begin
        z := x; x := y; y := z        { bound occurrences of z }
    end;

The procedure declaration also contains binding occurrences of the procedure name swap and the formal parameters x and y. The scopes of the formal parameters x and y and the scope of the variable z consist of the procedure body.


LEXICAL AND DYNAMIC SCOPES

Lexical Scope

• Also called static scope
• Binding of name occurrences to declarations is done statically, at compile time
• A variable that is free in a procedure gets its value from the environment in which the procedure is defined, rather than from where the procedure is called
• The binding of a variable is defined by the structure of the program and not by what happens at run time

[Figure: blocks A, B, and C, declaring V, W, X (block A); V, Y (block B); and V, W, Z (block C).]


Dynamic Scope

• The binding of name occurrences to declarations is done dynamically at run time

• A free variable gets its value from the environment from which the procedure is called, rather than from the environment in which the procedure is defined.

• Dynamic binding should not be confused with dynamic variables which are either reference variables or local variables.


    program L;
    var n : char;                      { n declared in L }

    procedure W;
    begin writeln(n) end;              { occurrence of n in W }

    procedure D;
    var n : char;                      { n redeclared in D }
    begin n := 'D'; W end;             { W called within D }

    begin { L }
        n := 'L'; W; D                 { W called from the main program L }
    end.
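Under lexical scope program L prints L twice; under dynamic scope it prints L and then D. C is lexically scoped, so the dynamic behaviour has to be simulated. The sketch below does so with a pointer to the most recent live binding of n; all names are hypothetical, and the functions return the character instead of printing it.

```c
static char n_global;            /* the n declared in L */

/* Lexical scope: W always reads the n of the program L. */
char W_lexical(void) { return n_global; }

char D_lexical(void)
{
    char n = 'D';                /* local n, invisible to W_lexical */
    (void)n;
    return W_lexical();
}

/* Dynamic scope, simulated: current_n points at the most recently
   activated declaration of n that is still live. */
static char *current_n = &n_global;

char W_dynamic(void) { return *current_n; }

char D_dynamic(void)
{
    char n = 'D';
    char *saved = current_n;     /* remember the outer binding */
    current_n = &n;              /* D's n is now the nearest binding */
    char result = W_dynamic();
    current_n = saved;           /* restore the outer binding on exit */
    return result;
}
```

After n_global = 'L', the lexical pair yields 'L' and 'L', and the dynamic pair yields 'L' and 'D'.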


NESTED SCOPES - PROCEDURE DECLARATION IN PASCAL

    program Nested(Input, Output);
    var X, Y : Real;                        { scope of Y: the whole program }

        procedure Outer(var X : Real);
        var M, N : Integer;                 { scope of M: Outer, including Inner }

            procedure Inner(Z : Real);      { scope of Z: Inner }
            var N, O : Integer;
            begin { Inner }
                …
            end; { Inner }

        begin { Outer }
            …
        end; { Outer }

    begin { Nested }
        …
    end. { Nested }


Activation Records

Each execution of the body is called an activation of the body

Associated with each activation of a body is storage for the variables declared in the body, called an activation record


Mapping or Binding Times

• Compile time
• Activation time
• Run time


Compile Time

Binding of name occurrences to declarations is defined in terms of lexical context

    {
        int i;
        {
            int i, j;
            …
        }
        …
    }
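A complete function built on the same nesting shows which declaration each occurrence of i is bound to. The function name and the values are made up for illustration.

```c
int inner_then_outer(void)
{
    int i = 1;                    /* outer declaration of i */
    int seen_inside;
    {
        int i = 2, j = 3;         /* inner i hides the outer i in this block */
        seen_inside = i + j;      /* bound to the inner i: 2 + 3 */
    }
    return seen_inside * 10 + i;  /* bound to the outer i again */
}
```

The compiler settles both bindings from the program text alone; the function returns 51.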


Activation Time

Binding of declarations to locations is done at activation time - this is important in recursive procedures

Name occurrence --(scope)--> Declaration --(activation)--> Location --(state)--> Value


Run Time

The binding of locations to values is done dynamically at run time and can be changed by assignments


Control Flow Between Activations

In a sequential language, one procedure is called at a time

P calls Q: P is put on hold and Q is activated; when Q finishes, execution resumes with P

Coroutines suspend execution, return to the caller, and later resume from where they were suspended. Example: the classic producer-consumer application


Activation trees

Nodes in the tree represent activations

Activation trees and the structure chart are closely related.


Elements of an Activation Record

• Control link: points to the activation record of the caller
• Access link: static link, used to implement lexically scoped languages
• Saved state
• Parameters
• Function result
• Local variables


Results can be different under lexical and dynamic scope

Lexical - pointer to the block that contains declaration

Dynamic - follow the control links for the nearest binding
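The two lookup strategies can be sketched with a stripped-down activation record. This is a simplification under stated assumptions: each record holds a single named local, saved state, parameters, and function result are omitted, and names are searched at run time, whereas a real compiler for a lexically scoped language resolves the number of access links to follow at compile time.

```c
#include <stddef.h>
#include <string.h>

struct activation_record {
    struct activation_record *control_link;  /* record of the caller */
    struct activation_record *access_link;   /* record of the enclosing block */
    const char *local_name;                  /* simplification: one local */
    int local_value;
};

/* Dynamic scope: follow the control links to the nearest binding. */
const int *lookup_dynamic(const struct activation_record *ar, const char *name)
{
    for (; ar != NULL; ar = ar->control_link)
        if (strcmp(ar->local_name, name) == 0)
            return &ar->local_value;
    return NULL;
}

/* Lexical scope: follow the access links to the enclosing declaration. */
const int *lookup_lexical(const struct activation_record *ar, const char *name)
{
    for (; ar != NULL; ar = ar->access_link)
        if (strcmp(ar->local_name, name) == 0)
            return &ar->local_value;
    return NULL;
}
```

Modelling program L with W called from D, the record for W has D as its control link and L as its access link, so the dynamic lookup of n finds 'D' and the lexical lookup finds 'L'.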


Heap

• Storage spot for activation records; the records stay here as long as they are needed
• Pieces are allocated and freed in some relatively unstructured manner
• Problems of storage allocation, recovery, compaction and reuse may be severe
• Garbage collection: a technique to reclaim storage that is no longer needed


Stack

• Activation records are held in a stack
• Storage is reused efficiently
• Storage is allocated when an activation begins and released when it ends
• The stack imposes restrictions on language design: functions as parameters


Memory Layout

• Code
• Static: global data
• Stack: local data
• Heap: dynamic data


Dangling Pointers

A pointer that refers to storage that is being used for another purpose

Example: returning the address of a local variable.
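The hazard and one common way around it can be sketched in C. The broken version is kept inside a comment so that it cannot be called by accident; make_int is a hypothetical helper.

```c
#include <stdlib.h>

/* BROKEN (shown as a comment only): x lives in the activation record
   of bad(), which is released when bad() returns, so the caller would
   receive a dangling pointer.

       int *bad(void) { int x = 42; return &x; }
*/

/* One fix: allocate on the heap, which outlives the activation.
   The caller must free() the result. Returns NULL if allocation fails. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}
```

The heap version trades one hazard for another: if the caller forgets to free the result, the storage becomes garbage.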


Displays

Optimization technique for obtaining faster access to nonlocals

Array of pointers to activation records, indexed by lexical nesting depth
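A display can be sketched as a global array that is updated on procedure entry and exit. The sketch assumes a fixed maximum nesting depth and a simplified record with four integer locals; the names frame, display_enter, display_exit, and load_nonlocal are hypothetical.

```c
#include <stddef.h>

#define MAX_DEPTH 8                       /* assumed limit on nesting depth */

struct frame { int locals[4]; };          /* hypothetical, simplified record */

static struct frame *display[MAX_DEPTH];  /* display[d]: current record at depth d */

/* On entry to a procedure at nesting depth d: install the new record
   and return the old entry so that it can be restored later. */
struct frame *display_enter(size_t d, struct frame *fr)
{
    struct frame *saved = display[d];
    display[d] = fr;
    return saved;
}

/* On exit: restore the entry saved by display_enter. */
void display_exit(size_t d, struct frame *saved)
{
    display[d] = saved;
}

/* Access a nonlocal at (depth, offset) with one index operation,
   instead of following a chain of access links. */
int load_nonlocal(size_t depth, size_t offset)
{
    return display[depth]->locals[offset];
}
```

The cost of reaching a nonlocal is then the same at every nesting depth, at the price of the save and restore on each call.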


Homework

Problem 5.4 – page 199 of textbook

Consider the following procedure parens that reads strings such as []([]){[]} and checks whether the opening parentheses match the closing parentheses.

    void parens(void) {
        for ( ; ; ) {
            switch (lookahead) {
            case '{':
                M('{'); parens(); M('}'); continue;
            case '(':
                M('('); parens(); M(')'); continue;
            case '[':
                M('['); parens(); M(']'); continue;
            default:
                return;
            }
        }
    }

• Complete the program by giving an implementation for procedure M and by supplying an appropriate main program. The program should output the string “OK” iff the input string consists of balanced parentheses.
• How would procedure parens handle strings like abc[a(b+d)f]gh ?
• How would you change procedure parens so that strings like the one above are considered OK?
