Post on 19-Dec-2015
Formalization of Generics for the .NET Common Language Runtime
Dachuan Yu (Yale University)
Andrew Kennedy, Don Syme
(Microsoft Research Cambridge)
2
Introduction
Upcoming revision of the Microsoft .NET platform includes support for parametric polymorphism (“generics”) in:
  Programming languages: C#, Visual Basic, Managed C++
  Common Language Runtime (the “virtual machine”)
  Visual Studio (Integrated Development Environment)
  Libraries
Previous work (PLDI’01) described implementation techniques used in the CLR
Now we formalize the polymorphic intermediate language and aspects of the implementation
3
CLR: The big picture
[Diagram. Front ends: a C# program, a Visual Basic program, and an SML.NET program are each compiled to IL by the C# compiler, the Visual Basic compiler, and the SML.NET compiler. Common Language Runtime: loader & JIT front-end, JIT IL, JIT code-gen, machine code, native binary. Runtime services: garbage collector, native interop, security, remoting, exception handling, threads.]
5
High-level design of generics
Type parameterization for all declarations:
  classes, e.g. class Set<T>
  interfaces, e.g. interface IComparable<T>
  structs, e.g. struct HashBucket<K,D>
  methods, e.g. static void Reverse<T>(T[] arr)
  delegates (“first-class methods”), e.g. delegate void Action<T>(T arg)
6
Good design => Tricky implementation
Unrestricted instantiation
  List<string> ls = new List<string>(); // reference types
  List<double> ld = … // primitive types
  List<Pair<string,double>> lsd = … // struct types
Full support for run-time types
  if (x is Set<string>) { ... } // type-test
  y = (List<T>) z; // checked cast
Recursion in instantiations
  class List<T> : ICloneable<List<T>> // finite
  class C<T> { C<C<T>> fld; } // infinite
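The run-time type features listed above can be observed directly in C#. The following is a minimal sketch, not code from the talk: `Pair<A,B>`, `RuntimeTypesDemo`, and its method names are hypothetical stand-ins introduced for illustration, and `System.Collections.Generic.List<T>` stands in for the slide's `List<T>`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the slide's Pair<string,double> struct type.
public struct Pair<A, B>
{
    public A First;
    public B Second;
}

public static class RuntimeTypesDemo
{
    // Type-test: the exact instantiation of the object is visible at run time.
    public static bool IsStringList(object x) => x is List<string>;

    // Checked cast to an open type; throws InvalidCastException on mismatch.
    public static List<T> CastToList<T>(object z) => (List<T>)z;

    // Distinct instantiations of one generic class are distinct run-time types.
    public static bool SameInstantiation<T, U>() =>
        typeof(List<T>) == typeof(List<U>);
}
```

For instance, `RuntimeTypesDemo.IsStringList(new List<double>())` is false, and `RuntimeTypesDemo.CastToList<double>(new List<string>())` fails with an `InvalidCastException`. An erasure-based implementation could not distinguish these cases, which is why the implementation has to keep exact type information available at run time.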
7
Why formalize?
In previous work (POPL’01, Gordon & Syme) the aim was a type soundness proof for a subset of IL (Baby IL)
Our aims are different:
  Implementation techniques used in the CLR product are subtle and difficult to get right (=> bugs, perhaps security holes)
    We’d like to validate those techniques
  Current JIT- and pre-compilers are not type-preserving
    Our formalization provides a basis for typed compiler intermediate languages for more capable and robust compilers
  It’s also difficult to express and apply optimizations
    Formalization makes this easier
By-product is a generic variant of Baby IL
8
Formalization: the big picture
BILG classes and methods
  BILG = “Baby IL with Generics”
  A tiny subset of MS-IL
From BILG to BILC:
  Specialize generic classes and methods
  Share instantiations w.r.t. data representation
  Introduce types-as-values
BILC classes and methods
  BILC = “Baby IL with Constrained generics”
  A typed intermediate language more suitable for code-generation
  Optimize use of types-as-values
9
Illustrative example, in C#
class ArrayUtils {
  static List<T> ArrayToList<T>(T[] arr) { …new List<T>()… }
}
class List<T> {
  virtual List<T> Append(object obj) { …(List<T>) obj… …new ListCell<T>… }
}
For ArrayToList:
  Want to share generated code for ArrayToList over different instantiations of T
  Pass type parameters at runtime?
  Look up type representations at runtime?
For List<T>:
  Want to share generated code for List over different instantiations of T
  How do we know what T is?
  Look up type representations at runtime?
10
Source Language: BILG
“Baby IL with Generics”
Purely functional, à la Featherweight Java (Igarashi, Pierce, Wadler)
Primitive types & generic classes
Inheritance-based subtyping
Generic methods (static and virtual)
Type-case operation (isinst) inspects run-time type of object
No overloading, no interfaces, no abstract methods, no structs (“value classes”), no delegates, no boxing, no null values, no heap, no bounded polymorphism
Just enough to demonstrate most of the implementation techniques!
Typing rules & big-step semantics in paper
  Easier to work with big-step
  $\neg\exists v.\ e \Downarrow v$ taken as definition of divergence
11
Source language: BILG
(type)        T,U ::= X | int32 | int64 | I
(inst type)   I ::= C<T1,…,Tn>
(class def)   cd ::= class C<X1,…,Xn> : I { T1 f1; …; Tm fm; md1 … mdk }
(method def)  md ::= static T m<X1,…,Xn>(T1,…,Tm) { e; }
                   | virtual T m<X1,…,Xn>(T1,…,Tm) { e; }
(method ref)  M ::= I::m<T1,…,Tn>
(expr)        e ::= ldc.i4 i4 | ldc.i8 i8 | ldarg x
                  | e1 … en newobj I
                  | e ldfld I::f
                  | e1 … en call M
                  | e e1 … en callvirt M
                  | e isinst I or e
12
BILG typing and evaluation for isinst
Typing:
$$\frac{E \vdash e : I' \qquad E \vdash e' : I}{E \vdash e\ \mathsf{isinst}\ I\ \mathsf{or}\ e' : I}$$
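Evaluation:
$$\frac{fr \vdash e \Downarrow I'(f_1 = v_1, \ldots, f_n = v_n) \qquad \vdash I' <: I}{fr \vdash e\ \mathsf{isinst}\ I\ \mathsf{or}\ e' \Downarrow I'(f_1 = v_1, \ldots, f_n = v_n)}$$
$$\frac{fr \vdash e \Downarrow I'(f_1 = v_1, \ldots, f_n = v_n) \qquad \vdash \neg(I' <: I) \qquad fr \vdash e' \Downarrow v'}{fr \vdash e\ \mathsf{isinst}\ I\ \mathsf{or}\ e' \Downarrow v'}$$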
13
BILG typing and evaluation for isinst
Observe:
Types affect evaluation
They cannot be erased
They serve static and dynamic purposes
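The behaviour of `e isinst I or e'` can be approximated in C# with a pattern-matching type test. This is only a sketch under assumptions: the class `IsInst` and the method `Or` are hypothetical names, C# reference types stand in for BILG instantiated types, and C# allows null where BILG has no null values.

```csharp
public static class IsInst
{
    // Analogue of "e isinst I or e'": yields e when its run-time type is a
    // subtype of I, and the alternative otherwise. The outcome depends on the
    // type argument I, so I cannot be erased before evaluation.
    public static I Or<I>(object e, I alternative) where I : class
        => e is I hit ? hit : alternative;
}
```

Calling `IsInst.Or<List<string>>(x, fallback)` returns `x` itself when `x` is a `List<string>` and `fallback` when `x` is, say, a `List<int>`, mirroring the two evaluation rules.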
14
Target Language: BILC
Similar to BILG, but adds
Representation constraints on type parameters
  ref: “must be a reference type”
  i4: “must be a 32-bit integer”
  i8: “must be a 64-bit integer”
Types-as-values
  R_T is a value representing closed type T
  The value R_T has singleton type Rep(T), interpreted as “is a value representing the type T”
  Construct reps for open types: mkrep_{C<T1,…,Tn>}(e1,…,en) creates a type-rep for C<T1,…,Tn> given type-reps for T1,…,Tn
Semantics given by small-step reduction relation
15
Target language: BILC (subset)
(type)           T,U ::= X | int32 | int64 | I
(inst type)      I ::= C<T1,…,Tn>
(extended type)  τ ::= T | Rep(T)
(constraint)     s ::= ref | i4 | i8
(class def)      cd ::= class C<X1:s1,…,Xn:sn> : I { T1 f1; …; Tm fm; md1 … mdk }
(method def)     md ::= static T m<X1:s1,…,Xn:sn>(τ1,…,τk) { e; }
                      | virtual T m<X1:s1,…,Xn:sn>(τ1,…,τk) { e; }
(method ref)     M ::= I::m<T1,…,Tn>
(expr)           e ::= i4 | i8 | x
                     | I(e,e1,…,en)
                     | e ldfld I::f
                     | e1 … en call M
                     | e e1 … en callvirt M
                     | e isinst_I e or e
                     | R_T
                     | mkrep_{C<T1,…,Tn>}(e1,…,en)
16
Some typing and reduction rules
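Typing:
$$\frac{E \vdash C\langle T_1, \ldots, T_n\rangle\ \mathsf{ok} \qquad E \vdash e_1 : \mathsf{Rep}(T_1) \quad \cdots \quad E \vdash e_n : \mathsf{Rep}(T_n)}{E \vdash \mathsf{mkrep}_{C\langle T_1, \ldots, T_n\rangle}(e_1, \ldots, e_n) : \mathsf{Rep}(C\langle T_1, \ldots, T_n\rangle)}$$
$$\frac{E \vdash e : I' \qquad E \vdash e' : \mathsf{Rep}(I) \qquad E \vdash e'' : I}{E \vdash e\ \mathsf{isinst}_{I}\ e'\ \mathsf{or}\ e'' : I}$$
Reduction:
$$\frac{v = I(w, v_1, \ldots, v_n) \qquad w \preceq w'}{\vdash (v\ \mathsf{isinst}_{T}\ w'\ \mathsf{or}\ v') \to v}$$
$$\frac{v = I(w, v_1, \ldots, v_n) \qquad w \not\preceq w'}{\vdash (v\ \mathsf{isinst}_{T}\ w'\ \mathsf{or}\ v') \to v'}$$
“Reflected subtyping”: $R_I \preceq R_{I'}$ iff $I <: I'$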
17
Some typing and reduction rules
Observe:
Types do not affect evaluation
They can be erased
They serve only static purposes
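The point that the BILC type test inspects only values can be sketched in C#. Everything here is an illustrative assumption rather than the paper's implementation: `ClassRep`, `Obj`, `Bilc`, `Precedes`, and `IsInstOr` are hypothetical names, and only non-generic classes with single inheritance are modelled.

```csharp
using System;

// Hypothetical value-level representation of a (non-generic) class type.
public sealed class ClassRep
{
    public string Name { get; }
    public ClassRep Parent { get; }   // null for the root class

    public ClassRep(string name, ClassRep parent = null)
    {
        Name = name;
        Parent = parent;
    }

    // Reflected subtyping: walk this rep's superclass chain looking for other.
    public bool Precedes(ClassRep other)
    {
        for (var c = this; c != null; c = c.Parent)
            if (ReferenceEquals(c, other)) return true;
        return false;
    }
}

// Object value I(w, v1, …, vn): the first component w is the rep of its class.
public class Obj
{
    public ClassRep Rep { get; }
    public Obj(ClassRep rep) { Rep = rep; }
}

public static class Bilc
{
    // Analogue of (v isinst w' or alt): the decision uses only the rep stored
    // in v and the rep passed in, never a static type annotation.
    public static Obj IsInstOr(Obj v, ClassRep target, Obj alt) =>
        v.Rep.Precedes(target) ? v : alt;
}
```

Because `IsInstOr` takes no type argument, deleting every type annotation from a caller would not change its result, which is the intuition behind the erasure property stated above.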
18
Example
Static generic method in BILG:
  static List<T> Conv<T>(object a) { …a isinst List<T>…
Translated to BILC:
  Specialized code for T = int32:
    static List_i Conv_i(object a) { …a isinst_{List_i} R_{List_i}…
  Specialized code for T = int64:
    static List_l Conv_l(object a) { …a isinst_{List_l} R_{List_l}…
  Code shared for reference types, with an extra parameter r representing T; the type rep for List_r<T> is looked up or created at runtime:
    static List_r<T> Conv_r<T:ref>(Rep(T) r, object a) { …a isinst_{List_r<T>} (mkrep_{List_r<T>}(r))…
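The shared, reference-type version of the translation can be imitated in C# by passing an explicit type-rep value. This is a rough sketch under stated assumptions: `TypeRep`, `Obj`, `Translated`, `MkRepList`, and `ConvTest` are hypothetical names, and the test uses exact equality of reps instead of subtyping to keep the example short.

```csharp
using System;
using System.Linq;

// Hypothetical run-time type representation: a class name plus argument reps.
public sealed class TypeRep : IEquatable<TypeRep>
{
    public string Name { get; }
    public TypeRep[] Args { get; }

    public TypeRep(string name, params TypeRep[] args)
    {
        Name = name;
        Args = args;
    }

    // Structural equality: same constructor name and equal argument reps.
    public bool Equals(TypeRep other) =>
        other != null && Name == other.Name && Args.SequenceEqual(other.Args);

    public override bool Equals(object obj) => Equals(obj as TypeRep);
    public override int GetHashCode() => Name.GetHashCode();
}

// Shared object layout: every instance carries the rep of its exact type.
public class Obj
{
    public TypeRep Rep { get; }
    public Obj(TypeRep rep) { Rep = rep; }
}

public static class Translated
{
    public static readonly TypeRep StringRep = new TypeRep("string");
    public static readonly TypeRep ObjectRep = new TypeRep("object");

    // Analogue of mkrep for List: builds the rep for List<T> from the rep for T.
    public static TypeRep MkRepList(TypeRep t) => new TypeRep("List", t);

    // Analogue of the test inside Conv_r<T:ref>(Rep(T) r, object a): one body
    // serves every reference type T because T arrives as the value r.
    public static bool ConvTest(TypeRep r, Obj a) => a.Rep.Equals(MkRepList(r));
}
```

Note that `ConvTest` allocates a fresh rep on every call through `MkRepList`, which illustrates the run-time cost of constructing type-reps on demand.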
19
We need more…
So far: specialization, sharing, and separation of run-time types from static types
But mkrep is a costly operation, requiring type-rep creation at runtime
Idea: instead of passing representations for type parameters, pass representations of the types that we actually need:
  static List_r<T> Conv_r<T:ref>(Rep(List_r<T>) r, object a) { …a isinst_{List_r<T>}(r)…
  Extra parameter r representing List<T>
20
We need more…
In general, we need many type-reps in a single method body
So we pass around dictionaries of type-reps
What type does a dictionary of type-reps have?
  At its simplest, it is just a tuple, e.g. Rep(List<X>) × Rep(Vec<Vec<X>>) is the type of a two-slot dictionary containing type-reps for List<X> and Vec<Vec<X>>
  In general, dictionaries may contain cycles (e.g. for mutually recursive methods), so we need recursive values and their types
  Worse still, polymorphic recursion requires “infinite” dictionaries
Simpler: use name-based types for dictionaries
  reps for methods: Rep(M), R_M, mkrep_M(e1,…,en)
  statically: each Rep-type determines a particular tuple of other Rep-types
  dynamically: each type-rep R_T or method-rep R_M determines a tuple of type-rep/method-rep values
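The two-slot dictionary mentioned above can be sketched in C# as a small tuple of reps that is built once per instantiation and then reused. The names `Rep`, `Dict`, `Dictionaries`, `MkRep`, `For`, and `MkRepCalls` are hypothetical, reps are identified by printed name only, and the caching policy is an assumption made for this illustration rather than something the slides specify.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type-rep, identified here by its printed name only.
public sealed class Rep
{
    public string Name { get; }
    public Rep(string name) { Name = name; }
}

// A dictionary is a fixed tuple of the reps that one method body needs.
public sealed class Dict
{
    public Rep[] Slots { get; }
    public Dict(params Rep[] slots) { Slots = slots; }
}

public static class Dictionaries
{
    // Counts rep constructions, standing in for the cost of mkrep.
    public static int MkRepCalls;

    static readonly Dictionary<string, Dict> cache = new Dictionary<string, Dict>();

    static Rep MkRep(string ctor, Rep arg)
    {
        MkRepCalls++;
        return new Rep(ctor + "<" + arg.Name + ">");
    }

    // Two-slot dictionary for an instantiation X:
    // slot 0 holds the rep for List<X>, slot 1 the rep for Vec<Vec<X>>.
    // It is built on first request and returned from the cache afterwards.
    public static Dict For(Rep x)
    {
        if (!cache.TryGetValue(x.Name, out var d))
        {
            d = new Dict(MkRep("List", x), MkRep("Vec", MkRep("Vec", x)));
            cache[x.Name] = d;
        }
        return d;
    }
}
```

A method body that receives such a dictionary reads the reps it needs by slot index instead of rebuilding them, so the construction cost is paid once per instantiation in this sketch.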
21
Target language: BILC (full)
(type)        T,U ::= X | int32 | int64 | I
(inst type)   I ::= C<T1,…,Tn>
(ext type)    τ ::= T | Rep(T) | Rep(M)
(constraint)  s ::= ref | i4 | i8
(class def)   cd ::= class C<X1:s1,…,Xn:sn> : I { T1 f1; …; Tm fm; md1 … mdk } with τ1,…,τp
(method def)  md ::= static T m<X1:s1,…,Xn:sn>(τ1,…,τk) { e; } with τ1,…,τp
                   | virtual T m<X1:s1,…,Xn:sn>(τ1,…,τk) { e; }
(method ref)  M ::= I::m<T1,…,Tn>
(expr)        e ::= i4 | i8 | x
                  | I(e,e1,…,en)
                  | e ldfld I::f
                  | e1 … en call M
                  | e e1 … en callvirt M
                  | e isinst_I e or e
                  | R_T | R_M
                  | mkrep_{C<T1,…,Tn>}(e1,…,en)
                  | mkrep_{C<T1,…,Tn>::m<U1,…,Uk>}(e1,…,en,e'1,…,e'k)
                  | objdict_i e
                  | mdict_i e
22
Translation scheme
Static generic methods:
  Extra dictionary parameter associated with method
  Accessed using mdict_i(e)
Virtual methods in generic classes:
  Obtain dictionary through type of object
  Accessed using objdict_i(e)
Generic virtual methods:
  Dictionary type not known statically (body could be overridden)
  So pass reps for type parameters and construct type-reps at runtime using mkrep
23
In the paper…
Complete formalization of BILG, BILC, and a translation
Theorems:
  Translation preserves types
  Translation preserves behaviour
And in forthcoming technical report:
  Full proofs
  Type erasure theorem: types in BILC do not affect evaluation
24
Future work
Extend BILG and the translation to cover more features
Value classes (structs)
  Would satisfy representation constraint of form [s1,…,sn] where s1,…,sn are constraints on the fields’ representations
  Now have unbounded number of specializations
  All methods on generic structs whose code is shared take a dictionary parameter
  Need treatment of boxing
Flexible specialization policies
  Less sharing: e.g. full specialization of selected types
  More sharing: e.g. share all instantiations of C<T> by boxing and unboxing appropriately (cf. ML)
25
Future work: structural typing
Flexible specialization interacts badly with run-time types based on name-equivalence
Instead, describe dictionaries using structural typing:
  Products: Rep(List<X>) × Rep(X) is a two-slot dictionary with type-reps for List<X> and X
  Circular dictionaries => Recursive types, e.g. μD. Rep(Vec<X>) × (Rep(Set<X>) × D)
  Polymorphic recursion in code => Higher-kinded recursive types, e.g. (μD. λX. Rep(Vec<X>) × D(Set<X>)) string
26
Related work
Rep(T)
  Crary, Weirich, Morrisett: “Intensional polymorphism in type-erasure semantics”
Dictionary-passing for polymorphism implementation
  Saha and Shao (ML)
  Viroli and Natali (Java)
27
Questions?