type checking in compiler design
Data representation 1
Chapter 4 of Programming Languages by Ravi Sethi
Data representation 2
Types: data representation
The Role of Types
Basic Types
Arrays: Sequences of Elements
Records: Named Fields
Unions and Variant Records
Sets
Pointers: Efficiency and Dynamic Allocation
Types and Error Checking
Data representation 3
The role of types
Data object: refers to something meaningful to the application
Data representation: refers to the organization of values in a program
Objects in an application have corresponding representations in a program
Example: an application uses “days” as an object (thus “January 22”, “May 6”, “tomorrow(d)”)
In the program, days are represented as integers (22, 126, n+1)
Avoid confusion between objects and their representation: it is OK to add two days (integer addition), but not OK to multiply two days!
Data representation 4
Values and their types
In imperative languages, data representations are built from values that can be manipulated directly by the underlying machine
Basic types: int, char, float, pointer, …
Structured types: arrays, records, sets, …
Data representation 5
Type expressions
Examples:
int temp[100];
typedef struct {
    char name[20];
    char address[64];
} person;
A type expression describes how a data representation is constructed. It is used to:
1. Represent data objects
2. Lay out values in the underlying machine
3. Check that operators are applied properly within expressions
Data representation 6
Arrays
Arrays are sequences of elements, all of the same type (and thus of the same size)
Efficient access and storage allocation
Indexed by integers or enumerations
Array layout: elements appear in consecutive locations in the underlying machine
Bounds evaluated at:
Compile time (Pascal, C, …)
Procedure entry (Algol 60)
Run time (C++, Java)
Arrays of arrays: row-major layout, column-major layout
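The two layouts for arrays of arrays can be sketched as offset computations. This is a minimal illustration, assuming zero-based indices and offsets counted in elements; the function names are hypothetical and not taken from the textbook.

```c
#include <stddef.h>

/* Offset (in elements) of A[i][j] when a rows x cols array is stored
   row by row: all of row 0 first, then all of row 1, and so on. */
size_t row_major_offset(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

/* Offset (in elements) of A[i][j] when the same array is stored
   column by column: all of column 0 first, then column 1, and so on. */
size_t column_major_offset(size_t i, size_t j, size_t rows)
{
    return j * rows + i;
}
```

For a 3 x 4 array, element A[2][1] sits at offset 2 * 4 + 1 = 9 in row-major layout and at offset 1 * 3 + 2 = 5 in column-major layout. C itself uses row-major layout for arrays of arrays.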
Data representation 7
Records
A record type is a template for grouping together variables that are logically related and hence are treated as a unit
A variable declaration allocates storage
Storage is allocated at compile time according to the definition of the record types
Data representation 8
Arrays and records
The layout of arrays and records is known at compile time
For arrays selection of an element is done at run time
For records, selection of a field is known at compile time
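The difference between the two kinds of selection can be sketched in C, reusing the person record from the type expressions slide. The helper names are hypothetical; the point is only that a field offset is a constant the compiler knows, while an element offset depends on an index value that exists only at run time.

```c
#include <stddef.h>

struct person {
    char name[20];
    char address[64];
};

/* Field selection: the offset of a field is a compile-time constant. */
size_t address_offset(void)
{
    return offsetof(struct person, address);
}

/* Element selection: the offset into an int array is computed from i,
   whose value is only known when the program runs. */
size_t element_offset(size_t i)
{
    return i * sizeof(int);
}
```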
Data representation 9
Unions and variant records
Eclipsed by the Object-Oriented concepts
Define record types that share common properties
Variant record: a part common to all records of that type and a variant part
Union: a special case of a variant record with an empty common part
Data representation 10
Unions and variant records (cont’d)
Layout of Variant Records
1. Fixed Part
2. Tag Field
3. Variant Part
Variant Records could compromise type safety
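A variant record can be approximated in C with a struct that holds a tag and a union. The sketch below is an illustration under that assumption; the type and field names are hypothetical. It shows the three parts of the layout, and why type safety depends on the programmer consulting the tag.

```c
enum shape_kind { CIRCLE, RECTANGLE };      /* possible tag values */

struct shape {
    char name[16];                          /* fixed part */
    enum shape_kind kind;                   /* tag field */
    union {                                 /* variant part */
        struct { double radius; } circle;
        struct { double width, height; } rect;
    } u;
};

/* Safe use: inspect the tag before touching the variant part. */
double area(const struct shape *s)
{
    switch (s->kind) {
    case CIRCLE:
        return 3.14159265358979 * s->u.circle.radius * s->u.circle.radius;
    case RECTANGLE:
        return s->u.rect.width * s->u.rect.height;
    }
    return 0.0;                             /* unknown tag */
}
```

Nothing in the language stops code from reading u.circle.radius while the tag says RECTANGLE, which is the kind of hole in type safety the slide refers to.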
Data representation 11
Unions and variant records (cont’d)
Data representation 12
Sets
Set values: in Pascal, all elements must be of the same simple type
Set types: the type set of S represents all possible subsets of S
Example: var S : set of 1..3
S can denote one of the following sets: [ ], [1], [2], [3], [1,2], [1,3], [2,3], [1,2,3]
A set of n elements is implemented as a bit vector of length n
The basic operation on a set is a membership test
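The bit-vector implementation can be sketched in C as follows, assuming the base type has at most 32 elements numbered 0 to 31. The type name small_set and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit i is 1 exactly when element i is a member (0 <= i < 32). */
typedef uint32_t small_set;

/* Returns s with element i added. */
small_set set_add(small_set s, unsigned i)
{
    return s | (UINT32_C(1) << i);
}

/* Membership test: a shift and a mask. */
bool set_member(small_set s, unsigned i)
{
    return ((s >> i) & 1u) != 0;
}

/* Union and intersection are single bitwise operations. */
small_set set_union(small_set a, small_set b)     { return a | b; }
small_set set_intersect(small_set a, small_set b) { return a & b; }
```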
Data representation 13
Pointers
Pointers provide indirect access to elements of a known type
It is more efficient to move or copy a pointer than the data structure it points to
Necessary to implement dynamic data structures: lists, trees, graphs, …
Size and layout of storage for pointers are known statically
Dynamic data structures can grow/shrink at run time by allocating/deallocating fixed-size memory chunks
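A singly linked list is a small example of a dynamic data structure built from fixed-size chunks. The sketch below assumes the standard malloc and free routines; the names node, push, pop and length are hypothetical.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;      /* pointer to the rest of the list */
};

/* Grow: allocate one fixed-size node and link it at the front.
   Returns the new head, or the old head if allocation fails. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Shrink: unlink the front node, free it, and return the rest. */
struct node *pop(struct node *head)
{
    if (head == NULL)
        return NULL;
    struct node *rest = head->next;
    free(head);
    return rest;
}

/* Number of nodes reachable from head. */
size_t length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}
```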
Data representation 14
Dangling pointers, garbage and memory leaks
A pointer that still points to a storage area that has been deallocated is left “dangling”
Storage that is still allocated but is no longer accessible (through a pointer to it) is called “garbage”
Programs that create garbage are said to have “memory leaks”
Data representation 15
Types of expressions
Types extend from values to expressions: the type of an expression x + y can be inferred from the types of x and y
Types of variable bindings
1. Static or early bindings
2. Dynamic or late bindings
C, Pascal, … have static bindings of types and dynamic bindings of values to variables.
Lisp, Smalltalk have dynamic binding of both values and types
Data representation 16
Type systems
Language design principle: every expression must have a type that is known (at the latest, at run time)
Type system: a set of rules for associating a type with an expression; it allows one to determine the appropriate use of the operators in an expression
Basic rule of type checking
1. Overloading: multiple meanings
2. Coercion: conversion from one type to another
3. Polymorphism: parameterized type
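How a type checker might infer the type of x + y from the types of x and y can be sketched as a single function. The rules below are assumptions chosen for illustration (they resemble C's treatment of int and float, but are not taken from the textbook), and the enum and function names are hypothetical.

```c
enum type { T_INT, T_FLOAT, T_CHAR, T_ERROR };

/* Type of x + y, given the types of x and y, under the assumed rules:
   - overloading: + means integer addition or floating-point addition;
   - coercion: in a mixed sum the int operand is converted to float;
   - any other combination is reported as a type error. */
enum type type_of_sum(enum type x, enum type y)
{
    if (x == T_INT && y == T_INT)
        return T_INT;
    if (x == T_FLOAT && y == T_FLOAT)
        return T_FLOAT;
    if ((x == T_INT && y == T_FLOAT) || (x == T_FLOAT && y == T_INT))
        return T_FLOAT;
    return T_ERROR;
}
```

A checker for whole expressions would apply such a function bottom-up, from the leaves of the expression tree to its root.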
Data representation 17
Types and error checking
Static and dynamic checking
A type error occurs if an operation is improperly applied
Static checking is done on the program text, before the program runs
Dynamic checking is done during program execution
Strong typing ensures freedom from type errors
Data representation 18
Miscellaneous
Short-circuit evaluation of Boolean expressions
Type coercion
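Both topics can be illustrated with two small C functions. This is only a sketch; the function names are hypothetical.

```c
#include <stddef.h>

/* Short-circuit evaluation: && evaluates its right operand only when
   the left operand is true, so a[i] is never read when i is out of
   bounds. Returns 1 if i is a valid index and a[i] is positive. */
int positive_at(const int *a, size_t n, size_t i)
{
    return i < n && a[i] > 0;
}

/* Coercion: the int operand x is converted to double before the
   division, so the result keeps its fractional part. */
double half(int x)
{
    return x / 2.0;
}
```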
Data representation 19
Chapter 5 of Programming Languages
Ravi Sethi
Data representation 20
Procedures
Introduction to Procedures
Parameter Passing Methods
Scope Rules for Names
Nested Scope in the Source Text
Data representation 21
INTRODUCTION TO PROCEDURES
Procedures are constructs for giving a name to a piece of code (the body)
When the name is called, the body is executed.
Function procedures are called functions
Proper procedures are called procedures
Functions: return a single value
Procedures: have only a side effect, such as setting variables or performing output, and return no value
Data representation 22
Use of a procedure is referred to as a call of the procedure
< Procedure - name > ( < parameters> )
The parentheses around the parameters are a syntactic cue to a call
Functions are called from within expressions
example: r * sin( angle )
Procedures are treated as atomic statements
example: read(ch) ;
The parameters that appear in a procedure call are the actual parameters
Data representation 23
Elements of a procedure
A name for the declared procedure
A body consisting of local declarations and statements
The formal parameters, which are placeholders for actuals
An optional result type
Example (Pascal)
function square(x : integer) : integer;
begin
    square := x * x
end;
Example (C)
int square(int x)
{
    int sq;
    sq = x * x;
    return sq;
}
Data representation 24
RECURSION: MULTIPLE ACTIVATION
Activation - each execution of a procedure body is referred to as an activation of the procedure
Recursion - a procedure is recursive if it can be activated from within its own procedure body
Example - factorial function
function f(n : integer) : integer;
begin
    if n = 0 then f := 1 else f := n * f(n - 1)
end;
f(n) is computed in terms of f(n-1), f(n-1) in terms of f(n-2), and so on. For n = 3 the sequence of activations is as follows:
f(3) = 3 * f(2)
f(2) = 2 * f(1)
f(1) = 1 * f(0)
f(0) = 1
f(1) = 1
f(2) = 2
f(3) = 6
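The same function can be written in C with a counter that makes the multiple activations visible. The global counter named activations is a hypothetical addition for illustration and is not part of the original example.

```c
/* Counts how many times the body of f has been entered. */
int activations = 0;

/* Factorial, following the Pascal version above. */
int f(int n)
{
    activations++;          /* a new activation of f begins here */
    if (n == 0)
        return 1;
    return n * f(n - 1);    /* activates f again before this one ends */
}
```

Calling f(3) produces four activations, f(3), f(2), f(1) and f(0), and all four are alive at the moment f(0) returns.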
Data representation 25
5.2 PARAMETER PASSING METHODS
If communication is desired between the caller and the callee, arrangements must be made for passing values back and forth through the procedure's parameters.
Parameter passing refers to the matching of actuals with formals when a procedure call occurs
Different interpretations of what a parameter stands for lead to different parameter passing methods.
• Call by Value
• Call by Reference
• Call by Value Result
Data representation 26
Value Parameter
• Gets its own memory location
• Gets its initial value from the actual parameter in the corresponding position
• Uses but does not change the actual parameter
• Actual parameters can be variables or expressions of the appropriate type
Example
b = future_value(total/2, rate, year2-year1);
float future_value(float initial_balance, float p, int nyear)
{
    p = 1 + p/12/100;
    int n = 12 * nyear;
    float b = initial_balance * pow(p, n);
    return b;
}
[Figure: the values of the expressions total/2, rate and year2-year1 in main are copied into the parameter variables initial_balance, p and nyear of future_value]
Data representation 27
Reference Parameters
• Changes the value of the actual Parameter
• Shares the memory location of the actual Parameter
• Must match in type
• The actual reference parameter must have a location (it must be a variable, not an arbitrary expression)
Example
procedure swap(var x : integer; var y : integer);
var z : integer;
begin
    z := x; x := y; y := z
end;
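The three parameter passing methods can be compared on swap. C only has call by value, so the sketch below simulates call by reference and call by value-result with pointers; the function names are hypothetical.

```c
/* Call by value: x and y are copies, so the caller's variables
   are unchanged when the function returns. */
void swap_by_value(int x, int y)
{
    int z = x;
    x = y;
    y = z;
    (void)x;                        /* the swapped copies are discarded */
    (void)y;
}

/* Call by reference (simulated): x and y designate the caller's
   own locations, so the swap is visible to the caller. */
void swap_by_reference(int *x, int *y)
{
    int z = *x;
    *x = *y;
    *y = z;
}

/* Call by value-result (simulated): copy in on entry, work on local
   copies, copy out on exit. */
void swap_by_value_result(int *x, int *y)
{
    int lx = *x, ly = *y;           /* copy in */
    int z = lx;
    lx = ly;
    ly = z;
    *x = lx;                        /* copy out */
    *y = ly;
}
```

With a = 1 and b = 2, swap_by_value(a, b) leaves both unchanged, while swap_by_reference(&a, &b) and swap_by_value_result(&a, &b) each exchange them.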
Data representation 28
OBSERVATIONS
• Program execution always begins in the main
• Formal parameters (function definition) and actual parameters (function call) are matched by position. Names need not agree
• Data types of parameters do not appear in the function call
• When a function completes the flow of control returns to the place that called it.
Data representation 29
SCOPE RULES FOR NAMES
The Scope rules of a language determine which declaration of a name x applies to an occurrence of x in a program .
There are two kinds of scope rules, called lexical and dynamic scope rules.
Binding and Scope
Consider the following Pascal procedure
procedure swap(var x, y : T);
var z : T;                       { binding occurrence of z }
begin
    z := x; x := y; y := z       { bound occurrences of z }
end;
The procedure declaration also contains binding occurrences of the procedure name swap and of the formal parameters x and y. The scopes of the formal parameters x and y and the scope of the variable z consist of the procedure body.
Data representation 30
LEXICAL AND DYNAMIC SCOPES
Lexical Scope
• Also called static scope
• Binding of name occurrences to declarations is done statically, at compile time
• A variable that is free in a procedure gets its value from the environment in which the procedure is defined, rather than from where the procedure is called
• The binding of a variable is defined by the structure of the program and not by what happens at run time
[Figure: blocks A (declares V, W, X), B (declares V, Y) and C (declares V, W, Z)]
Data representation 31
Dynamic Scope
• The binding of name occurrences to declarations is done dynamically at run time
• A free variable in a procedure gets its value from the environment from which the procedure is called, rather than from the environment in which the procedure is defined.
• Dynamic binding should not be confused with dynamic variables which are either reference variables or local variables.
Data representation 32
program L;
var n : char;                { n declared in L }
procedure W;
begin
    writeln(n)               { occurrence of n in W }
end;
procedure D;
var n : char;                { n redeclared in D }
begin
    n := 'D';
    W                        { W called within D }
end;
begin { L }
    n := 'L';
    W;                       { W called from the main program L }
    D
end.
Data representation 33
NESTED SCOPES - PROCEDURE DECLARATION IN PASCAL
program Nested(Input, Output);
var X, Y : Real;
procedure Outer(var X : Real);
var M, N : Integer;
    procedure Inner(Z : Real);
    var N, O : Integer;
    begin { Inner }
        ………..
    end; { Inner }
begin { Outer }
    - - - -
end; { Outer }
begin { Nested }
    - - - - - -
end. { Nested }
[Figure: brackets mark the scope of Y, the scope of M and the scope of Z]
Data representation 34
Activation Records
Each execution of the body is called an activation of the body
Associated with each activation of a body is storage for the variables declared in the body, called an activation record
Data representation 35
Mapping or Binding Times
Compile time, activation time, run time
Data representation 36
Compile Time
Binding of name occurrences to declarations is defined in terms of lexical context
{
    int i;
    {
        int i, j;
        …
    }
    …
}
Data representation 37
Activation Time
Binding of declarations to locations is done at activation time - this is important in recursive procedures
[Figure: a name occurrence is bound to a declaration (scope), a declaration to a location (activation), and a location to a value (state)]
Data representation 38
Run Time
The binding of locations to values is done dynamically at run time and can be changed by assignments
Data representation 39
Control Flow Between Activations
In a sequential language, one procedure is called at a time
P calls Q: P is put on hold and Q is activated; when Q finishes, execution resumes in P
Coroutines - suspend execution, return to the caller, and later resume execution from where they were suspended. Example: the classic producer-consumer application
Data representation 40
Activation trees
Nodes in the tree represent activations
activation trees and the structure chart are closely related.
Data representation 41
Elements of an Activation Record
Control link: points to the activation record of the caller
Access link: static link, used to implement lexically scoped languages
Saved state
Parameters
Function result
Local variables
Data representation 42
Results can be different under lexical and dynamic scope
Lexical - pointer to the block that contains declaration
Dynamic - follow the control links for the nearest binding
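The two lookup strategies can be sketched with a simplified activation record that keeps only the two links and a single variable n, modelled on program L from the dynamic scope example. The struct layout and function names are hypothetical; a real activation record also holds saved state, parameters and other locals.

```c
#include <stddef.h>

struct activation {
    const char *proc;                   /* name of the activated procedure */
    struct activation *control_link;    /* activation record of the caller */
    struct activation *access_link;     /* record of the enclosing block */
    int declares_n;                     /* 1 if this procedure declares n */
    char n;                             /* value of n, if declared here */
};

/* Lexical scope: follow access links to the nearest enclosing
   declaration of n. Returns '?' if none is found. */
char lookup_lexical(const struct activation *a)
{
    for (; a != NULL; a = a->access_link)
        if (a->declares_n)
            return a->n;
    return '?';
}

/* Dynamic scope: follow control links to the most recent activation
   that declares n. Returns '?' if none is found. */
char lookup_dynamic(const struct activation *a)
{
    for (; a != NULL; a = a->control_link)
        if (a->declares_n)
            return a->n;
    return '?';
}
```

When W is called within D, its control link points to D and its access link points to L. The lexical lookup of n then yields 'L' and the dynamic lookup yields 'D', so writeln(n) prints different characters under the two rules.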
Data representation 43
Heap
Storage area for activation records
The records stay here as long as they are needed
Pieces are allocated and freed in some relatively unstructured manner
Problems of storage allocation, recovery, compaction and reuse may be severe
Garbage collection - technique to reclaim storage that is no longer needed
Data representation 44
Stack
Activation records are held in a stack
Storage is reused efficiently
Storage is allocated when an activation begins and released when it ends
A stack imposes restrictions on language design - functions as parameters
Data representation 45
Memory Layout
Code
Static: global data
Stack: local data
Heap: dynamic data
Data representation 46
Dangling Pointers
A pointer that refers to storage that is being used for another purpose
Example Returning the address of a local variable.
Data representation 47
Displays
Optimization technique for obtaining faster access to nonlocals
Array of pointers to activation records, indexed by lexical nesting depth
Data representation 48
Homework
Problem 5.4 – page 199 of textbook
Consider the following procedure parens that reads strings such as []([]){[]} and checks whether the opening parentheses match the closing parentheses.
void parens(void) {
    for ( ; ; ) {
        switch (lookahead) {
        case '{':
            M('{'); parens(); M('}'); continue;
        case '(':
            M('('); parens(); M(')'); continue;
        case '[':
            M('['); parens(); M(']'); continue;
        default:
            return;
        }
    }
}
• Complete the program by giving an implementation for procedure M and by supplying an appropriate main program. The program should output the string “OK” iff the input string consists of balanced parentheses.
• How would procedure parens handle strings like abc[a(b+d)f]gh ?
• How would you change procedure parens so that strings like the one above are considered OK?