Declaring and Checking Non-null Types in an Object-Oriented Language

Manuel Fähndrich, K. Rustan M. Leino. OOPSLA ’03

Presented by: Alexey Tsitkin

2 of 22

What’s wrong with this program?

class Accounting
{
    public void RaiseSalary(Employee emp, int amount)
    {
        emp.Salary += amount;
    }
}

Possible NullReferenceException!

3 of 22

Correct Version

class Accounting
{
    public void RaiseSalary(Employee emp, int amount)
    {
        if (emp == null)
        {
            // …handle the case…
        }
        emp.Salary += amount;
    }
}

Code becomes cumbersome…

4 of 22

Exceptions

Used to signal “errors” at run time.

Better to find errors at compile time!

Types of exceptions:

  Bad cast: dynamic type may only be known at run time.

  Division by zero: divisor unknown at compile time.

  NULL dereference: a compile-time checker can be created!

  …

5 of 22

General Idea

Splitting reference types into non-null and possibly-null types.

  Already implemented in ML’s and some other languages’ type systems.

  C++ references?

A non-null field provides a contract:

  At construction, it must be initialized with a non-null value.

  Read access yields a non-null value.

  Write access requires a non-null value.

6 of 22

Simple Solution

Require that an object under construction cannot be accessed until fully constructed.

Possible in some languages, but not in mainstream ones like C#/Java, where "this" may be accessed from the constructor or from methods called by it.

7 of 22

Example

class A
{
    [NotNull] string name;

    public A([NotNull] string s)
    {
        this.name = s;
        this.m(55);
    }

    virtual void m(int x) { … }
}

name initialized before use

8 of 22

Example (cont’d)

class B : A
{
    [NotNull] string path;

    public B([NotNull] string p, [NotNull] string s) : base(s)
    {
        this.path = p;
    }

    override void m(int x)
    {
        … this.path …
    }
}

m() is called from A’s c’tor before B() has initialized path!
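The same hazard can be reproduced in Java, which the slides group with C# here. The following is a minimal sketch, not taken from the slides: the class name RawObjectDemo and the observed list are illustrative assumptions, while A, B, name, path and m mirror the example above.

```java
import java.util.ArrayList;
import java.util.List;

public class RawObjectDemo {
    // Hypothetical log used only to make the behaviour observable.
    static final List<String> observed = new ArrayList<>();

    static class A {
        final String name;

        A(String s) {
            this.name = s;
            this.m(55); // virtual call while the object is only partly constructed
        }

        void m(int x) {
            observed.add("A.m");
        }
    }

    static class B extends A {
        final String path;

        B(String p, String s) {
            super(s);      // A's constructor runs first and calls B.m
            this.path = p; // path is assigned only after that call returns
        }

        @Override
        void m(int x) {
            // During A's constructor, path still holds its default value, null.
            observed.add("path=" + this.path);
        }
    }

    public static void main(String[] args) {
        new B("/tmp", "b");
        System.out.println(observed); // prints [path=null]
    }
}
```

Although path is declared final and is always assigned in B's constructor, the overriding m still observes null, which is exactly the situation a non-null type checker has to rule out or account for.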

9 of 22

A Glance at C++

In C++, the base-class object is created and initialized first (by its c’tor). Then the derived-class object is created and initialized. Virtual functions act as non-virtual when called from within c’tors, so the problem from the previous example is eliminated.

In C#/Java, the whole object, including all superclass parts, is created first; then the c’tors are called. Virtual functions act as virtual even from within c’tors.

10 of 22

Non-null Types

Notations in the article:

  T- : non-null references of type T.

  T+ : possibly-null references of type T.

    Current default in C# / Java.

    Written T+ instead of just T to avoid confusion.

Usage example:

T- t = new T(…);        // new never returns null
int x = t.f;            // t must be non-null
T+ n = t;               // n may be null
if (n != null) t = n;   // here n is of type T-
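Java has no T- / T+ distinction, so the usage example can only be approximated there. The sketch below assumes that java.util.Optional stands in for T+ and that a reference checked with Objects.requireNonNull stands in for T-. The names NonNullSketch, readField and assignIfPresent are hypothetical, and the checks happen at run time rather than at compile time as in the paper's design.

```java
import java.util.Objects;
import java.util.Optional;

public class NonNullSketch {
    static final class T {
        final int f;

        T(int f) {
            this.f = f;
        }
    }

    // Stands in for a parameter of type T-: null is rejected at the call boundary.
    static int readField(T t) {
        Objects.requireNonNull(t, "t must be non-null");
        return t.f;
    }

    // Stands in for "if (n != null) t = n": only a present value replaces t.
    static T assignIfPresent(T t, Optional<T> n) {
        return n.orElse(t);
    }

    public static void main(String[] args) {
        T t = new T(7);                        // new never returns null
        int x = readField(t);                  // t must be non-null
        Optional<T> n = Optional.of(new T(9)); // n may be empty in general
        t = assignIfPresent(t, n);             // n is present, so t is replaced
        System.out.println(x + " " + t.f);     // prints 7 9
    }
}
```

The difference from the slide is the point of the paper: with T- and T+ in the type system, the compiler performs these checks and no wrapper or run-time test is needed.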

11 of 22

Advantages

Documentation of method input parameters, output parameters and return values.

Static (compile-time) check of object invariants such as non-null fields.

Errors are detected at the point where the null is introduced, not later when it is dereferenced.

No need to check for nulls at runtime – boosts performance.

Reduce/eliminate unexpected null reference exceptions.

12 of 22

Construction

Problem: half-baked objects in constructors.

  this.f may be null even though f is declared as non-null.

Notation: Traw- denotes partially-initialized object types.

  Subtyping: T- ≤ Traw-.

Rule: for a T- field in a Craw- object:

  Read: may yield null.

  Write: must be with a T- value.

13 of 22

The Construction Duty

The c’tor must initialize all non-null fields.

  Restricted to the object proper, not including sub- or super-class object fields.

Every path through a c’tor must include an assignment to every non-null field.

When a c’tor is called, all ancestor c’tors have already been called, thus their members are initialized.

The last c’tor called due to new C(…) casts this from Craw- to C-.

The annotation [Raw] allows a method to be called with this of type Craw-.

14 of 22

Example

class A
{
    [NotNull] string name;

    public A([NotNull] string s)
    {
        this.name = s;
        this.m(55);
    }

    [Raw]
    virtual void m(int x) { … }
}

class B : A
{
    [NotNull] string path;

    public B([NotNull] string p, [NotNull] string s) : base(s)
    {
        this.path = p;
    }

    [Raw]
    override void m(int x)
    {
        … this.path …
    }
}

this.path may yield null in the current context.
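What the [Raw] rule demands of the method body can be sketched in Java, again as an illustration rather than the authors' implementation. Java has no [Raw] annotation, so the sketch assumes a comment-level convention; RawMethodSketch and log are hypothetical names. The overriding m treats path as possibly null because it may run before B's constructor body.

```java
import java.util.ArrayList;
import java.util.List;

public class RawMethodSketch {
    // Hypothetical log used only to make the behaviour observable.
    static final List<String> log = new ArrayList<>();

    static class A {
        final String name;

        A(String s) {
            this.name = s;
            this.m(55);
        }

        // Convention assumed here: m may be called on a partially initialized object.
        void m(int x) {
            log.add("A.m");
        }
    }

    static class B extends A {
        final String path;

        B(String p, String s) {
            super(s);
            this.path = p;
        }

        @Override
        void m(int x) {
            // Reads of fields in a "raw" method are treated as possibly null.
            String p = this.path;
            log.add(p == null ? "path not yet set" : "path=" + p);
        }
    }

    public static void main(String[] args) {
        B b = new B("/tmp", "b"); // m runs once from A's constructor
        b.m(1);                   // and once on the fully constructed object
        System.out.println(log);  // prints [path not yet set, path=/tmp]
    }
}
```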

15 of 22

Arrays

Arrays are references themselves and contain references.

The array itself and/or its elements may be non-null or possibly-null:

  T- []-   non-null array of non-null elements

  T+ []-   non-null array of possibly-null elements

  T- []+   possibly-null array of non-null elements

  T+ []+   possibly-null array of possibly-null elements

16 of 22

Arrays (cont’d)

In contrast to objects, there is no “constructor” that initializes all array elements to non-null values after allocation.

new T- [n] returns a reference of type T- []raw-

  Reading a[i] may yield null.

  Writing a[i] requires non-null.

17 of 22

Arrays (cont’d)

The compiler cannot know when an array has finished initialization.

  An explicit cast is required from the programmer.

  The cast validates the non-nullity of the elements.

Usage example:

T- []raw- aTmp = new T- [n];
// initialize the elements of aTmp
T- []- a = (T- []-)aTmp;
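The run-time work done by that cast can be sketched as a helper that scans the array once. This is an assumption about how such a check could look, not the authors' code; ArrayCastSketch and castToNonNull are hypothetical names, and Java's type system does not record the result the way T- []- would.

```java
public class ArrayCastSketch {
    // Plays the role of the (T- []-) cast: succeeds only if every element is non-null.
    static <T> T[] castToNonNull(T[] raw) {
        for (int i = 0; i < raw.length; i++) {
            if (raw[i] == null) {
                throw new IllegalStateException("element " + i + " is null");
            }
        }
        return raw;
    }

    public static void main(String[] args) {
        String[] aTmp = new String[3];      // freshly allocated: all elements are null
        for (int i = 0; i < aTmp.length; i++) {
            aTmp[i] = "item" + i;           // initialize the elements of aTmp
        }
        String[] a = castToNonNull(aTmp);   // the checked "cast"
        System.out.println(a.length);       // prints 3
    }
}
```

Calling castToNonNull on an array that still has an unset slot throws, which corresponds to the cast failing when initialization is incomplete.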

18 of 22

Other Language Constructs

Structs

  The default constructor initializes fields to zero-equivalent values (e.g. null for references).

  Problem: it cannot be overridden!

  Solution: all c’tors for a struct S produce a value of type S, except the default c’tor, which produces Sraw.

    Only for structs that actually contain non-null fields.

19 of 22

Other Language Constructs (cont’d)

Call-by-reference (ref) parameters

  Used for input: the formal parameter type must be a supertype of the actual parameter type.

  Used for output: the formal parameter type must be a subtype of the actual parameter type.

  Both are required in order to maintain conformance, thus ref parameters are invariant.

  Problem: for a raw object, a field f of type T- yields T+ on read and requires T- on write.

  Solution: disallow passing such fields as ref parameters.

20 of 22

Implementation

Possible! Created by the authors.

  Does not (yet) implement the full design.

Implemented at the CIL (a.k.a. MSIL) level.

  Does not modify the compiler or runtime.

  Works with other languages compiled into CIL (VB.NET, Managed C++, etc.).

Tested on a program of ~20,000 lines.

21 of 22

Implementation - Benefits

Catches hard-to-find errors:

  Vacuous initialization: this.foo = foo inside a c’tor. The goal was to initialize a field with a parameter, but there was no parameter named foo.

  Wrong local:

bool m(Q other) {
    T that = other as T;
    if (other == null) return false;   // should have used “that”
    if (this.bar != that.bar) …        // “that” may be null
}

22 of 22

Conclusion

Non-null types allow moving some errors from run time to compile time.

Not only theoretical: an implementation is possible and exists.

Fewer run-time checks, better code.

Backward compatible except in initialization, mainly in constructors.
