[acm press the 2006 acm sigplan workshop - portland, oregon, usa (2006.09.16-2006.09.16)]...



An Object-Oriented Approach to Datatype-Generic Programming

Adriaan Moors, Frank Piessens, Wouter Joosen

Computer Science Department, KU Leuven

{adriaan, frank, wouter}@cs.kuleuven.be

Abstract

Datatype-generic programming (DGP) is the next step beyond abstracting over types using parametric polymorphism, which is often called “genericity” in object-oriented languages. However, unlike genericity, DGP has not received much attention in the OO community. Nonetheless, in the context of functional languages, it has proven to make programs more robust with respect to changes in the type structure, as well as in many other applications, such as type-safe XML processing and marshalling. To carry these strengths over to an OO language, we present an extensible library for lightweight DGP in Scala, based on an existing lightweight approach in Haskell. We discuss the challenges in developing and using our library, and explore ways to overcome them.

Categories and Subject Descriptors D.3.2 [Language Classifications]: Object-oriented languages – Scala; D.3.2 [Language Classifications]: Applicative (functional) languages; D.3.3 [Language Constructs and Features]: Polymorphism

General Terms Languages, Experimentation

Keywords Datatype-Genericity, Polytypic Programming, Scala

1. Introduction

Nowadays, most statically typed object-oriented (OO) languages, such as Java, C#, and Scala, provide support for abstracting over types using parametric polymorphism, which is unfortunately also called “genericity.” This area has received much attention lately [30, 20, 23, 18, 1].

While OO’s “genericity” abstracts over types, DGP abstracts over type constructors. More concretely, in OO languages, genericity is typically used to abstract over the type of a container’s contents, whereas DGP may be used to abstract over the type of the container, a type constructor. Due to this higher level of abstraction, datatype-generic programs become more concise and more robust to changes in the type structure.

As research on DGP is mainly carried out in the context of functional languages, it has not been exposed to challenges more prominent in OO languages. Examples of these complications are: (a) the strict, imperative nature of these languages, (b) encapsulation – in the sense that an object typically has (substantial) private as well as public state, and (c) (nominal) subtyping.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

WGP’06 September 16, 2006, Portland, Oregon, USA. Copyright © 2006 ACM 1-59593-492-6/06/0009...$5.00

The goal of our research is to integrate recent advances in datatype-generic programming in an object-oriented language. In this paper, we present an experiment in which we port a functional library for DGP to Scala, staying as close to the original work as possible. This was a source of inspiration for our current work, which consists of developing a truly object-oriented approach to DGP.

This paper is structured as follows. In section 3 we develop a Scala [29] library for datatype-generic programming, based on a Haskell encoding [14] of lightweight PolyP [19]. We then point out the limitations of our library and discuss solutions in the context of (a conservative extension of) Scala. In section 4, we explore a number of preliminary extensions of Scala to tackle the limitations we could not overcome in section 3, such as applying our library to unmodified user-defined classes. In section 5, we briefly discuss future work, which aims to make DGP libraries more powerful by extending the language with support for typed meta-programming on types. Finally, we conclude in section 6. For those unfamiliar with the language, we begin with a short overview of Scala.

2. A Brief Overview of Scala

In the following, we give a very brief overview of Scala’s features that are relevant to this paper. For more information, consult Scala’s homepage [26], which contains extensive documentation, including a more elaborate overview [12].

2.1 Syntax

Scala’s syntax is reminiscent of Java’s, albeit more concise and more uniform. A definition always starts with a keyword that denotes the kind of definition: class, trait (a mixin), def (a method), val (a readonly “variable”), var (a true variable¹) and type (a type member). The rest of the definition specifies the name being bound, possibly followed by its type and its “value”: for a class, this is the class body; a concrete method consists of a single expression or a block of statements, and so on.

Scala also has convenient syntax for closures and function types. For example, x :Int => x+1 represents the successor function, with type Int=>Int. Furthermore, the Scala compiler performs quite extensive local type inference [35, 31].
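The definition keywords and closure syntax described above can be illustrated with a minimal sketch. It is written in the syntax of later Scala releases, which is an assumption on our part (the paper targets version 2.1.5), and the names SyntaxDemo, succ, counter, tick, twice and four are hypothetical.

```scala
object SyntaxDemo {
  // val: a read-only binding. The closure below is the successor
  // function from the text, with its type Int => Int written out.
  val succ: Int => Int = (x: Int) => x + 1

  // var: a true (mutable) variable; here it is a field of the object.
  var counter: Int = 0

  // def: a method whose body is a block of statements.
  def tick(): Int = { counter += 1; counter }

  // def: a method whose body is a single expression; the result
  // type (Int) is left to local type inference.
  def twice(f: Int => Int, x: Int) = f(f(x))

  // The parameter type of the closure y => y + 1 is inferred
  // from the signature of twice.
  val four = twice(y => y + 1, 2)
}
```

Leaving out the result type of twice and the parameter type of the closure is what the text refers to as local type inference.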

¹ Depending on the enclosing definition, val and var define a local variable or a field of a class.


2.2 Classes

In addition to single inheritance as in Java, Scala supports mixin composition [13] using a special kind of abstract classes, called “traits” [37]. Mixin composition is essentially a degenerate form of multiple inheritance where the common ancestor problem is avoided by forcing the programmer to specify a total order on a class’s ancestors. However, it has its own set of limitations. Mixin composition is written as C with T1 ... with Tn, where members in Ti override members in Tj (j < i).
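The ordering rule for mixin composition can be made concrete with a small sketch. The traits Greeter, Loud and Polite and the class Base are hypothetical names chosen for illustration; the code assumes the syntax of later Scala releases.

```scala
object MixinDemo {
  trait Greeter { def greet: String = "hello" }

  // Each trait overrides greet and delegates to whatever precedes
  // it in the total order fixed by the composition.
  trait Loud extends Greeter {
    override def greet: String = super.greet.toUpperCase
  }
  trait Polite extends Greeter {
    override def greet: String = super.greet + ", please"
  }

  class Base extends Greeter

  // Polite is mixed in last, so its greet overrides Loud's,
  // and super.greet inside Polite refers to Loud's version.
  val a = new Base with Loud with Polite

  // Swapping the order swaps which member overrides which.
  val b = new Base with Polite with Loud
}
```

Here a.greet evaluates to "HELLO, please" while b.greet evaluates to "HELLO, PLEASE", which shows that the programmer-specified order resolves what would otherwise be an ambiguous common ancestor.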

Besides mixins, the main difference between Java and Scala – with respect to classes – is the latter’s support for type members, primary constructors and explicit self types. Furthermore, in addition to nested classes, function definitions may also be nested.

Scala’s type members can be seen as a generalisation of type parameters: a class class C[T <: U] can be “desugared” as class C { type T <: U}; an instance C[Foo] becomes C{type T=Foo}, which uses mixin composition to give T a concrete type. When used in a type definition, mixin composition is called type refinement. Furthermore, Scala supports family polymorphism [9] using path-dependent types and covariant overriding of abstract type members.
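As a rough illustration of the correspondence between type parameters and type members, the following sketch declares the same container both ways. Animal, Cat, BoxP and BoxM are hypothetical names, and the syntax is that of later Scala releases (an assumption).

```scala
object TypeMemberDemo {
  trait Animal { def name: String }
  case class Cat(name: String) extends Animal

  // Type-parameter version: class C[T <: U].
  class BoxP[T <: Animal](val content: T)

  // Type-member version: class C { type T <: U }.
  trait BoxM {
    type T <: Animal
    val content: T
  }

  // The refinement BoxM { type T = Cat } plays the role of BoxP[Cat].
  def describe(b: BoxM { type T = Cat }): String = b.content.name

  // Mixing in an anonymous class body gives T a concrete type.
  val catBox: BoxM { type T = Cat } = new BoxM {
    type T = Cat
    val content: Cat = Cat("Tom")
  }

  val paramBox = new BoxP[Cat](Cat("Felix"))
}
```

The refinement names the member it instantiates (T), which is why this style specifies type arguments by name rather than by position.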

Another application of type refinement is simulating Haskell’s type application: x :: s Int ListF is written as x :s{type a=Int; type b=ListF}. Besides the fact that Scala does not have an explicit notion of higher-kinded types, the difference with Haskell is that type arguments for type application are specified by name rather than position. Naturally, this makes Scala’s notation somewhat more verbose.

A primary constructor provides a concise way of defining a constructor along with the fields it initialises: class Cons[a](val head :a, val tail :List[a]) defines the public fields head and tail, as well as a constructor, which initialises these fields. It is invoked by new Cons(1, new Nil()). This can be abbreviated even further using Scala’s case classes: by defining case class Cons[a](head :a, tail :List[a]), one may write Cons(1, Nil()). More importantly, Scala supports efficient pattern matching on instances of case classes. Note the similarity between type parameters vs. type members and primary constructor arguments vs. abstract value members.
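A short sketch of case classes and pattern matching, assuming the syntax of later Scala releases. To avoid clashing with the standard library, the list type is given the hypothetical names MyList, MyNil and MyCons.

```scala
object CaseClassDemo {
  sealed trait MyList[A]

  // Case classes: the primary constructor arguments become public
  // read-only fields, and instances are created without `new`.
  case class MyNil[A]() extends MyList[A]
  case class MyCons[A](head: A, tail: MyList[A]) extends MyList[A]

  // Pattern matching takes an instance apart by constructor.
  def length[A](xs: MyList[A]): Int = xs match {
    case MyNil()       => 0
    case MyCons(_, tl) => 1 + length(tl)
  }

  val xs: MyList[Int] = MyCons(1, MyCons(2, MyNil[Int]()))
}
```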

Furthermore, our encoding relies on Scala’s support for explicit self types. The primary use for explicitly specifying the self type T of a class C is to express that C is meant to be a part of a class (composition) that conforms to T. In other words, C requires this to be of type T. This is written as class C requires T {...}.

A self type must comply with two rules: (1) it must conform to the self types of the base classes, and (2) the type of an instance of that class must conform to the class’s self type.
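The following sketch shows a self type at work. Note the assumption: later Scala releases write the self type as `self: T =>` inside the class body instead of the `requires T` clause used in this paper. Logger and Service are hypothetical names.

```scala
object SelfTypeDemo {
  trait Logger {
    def log(msg: String): String = "[log] " + msg
  }

  // Service requires `this` to be a Logger, so it may call log
  // without inheriting from Logger itself.
  trait Service { self: Logger =>
    def run(): String = log("running")
  }

  // Rule (2): an instance must conform to the self type, so Service
  // has to be composed with a Logger before it can be instantiated.
  // Writing `new Service {}` on its own would be rejected.
  val s = new Service with Logger {}
}
```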

Finally, a singleton type, such as x.type, is only inhabited by the object referenced by x. Thus, it is a lightweight dependent type. Besides encoding family polymorphism, self types come in handy when used in combination with the this-reference. For example, in a class C, this.type is more precise than the type C, which is an advantage when used covariantly. However, remember that it is a singleton type, so that the type of new C does not conform to this.type.
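One common situation where this.type is more precise than the class type is a method that returns the receiver. The sketch below uses hypothetical classes Counter and NamedCounter and the syntax of later Scala releases.

```scala
object SingletonTypeDemo {
  class Counter {
    var n = 0
    // Returning this.type rather than Counter preserves the
    // precise type of the receiver in subclasses.
    def inc(): this.type = { n += 1; this }
  }

  class NamedCounter extends Counter {
    def name: String = "c"
  }

  val c = new NamedCounter

  // c.inc() has type c.type, so the subclass member `name` is still
  // accessible after chaining. With result type Counter it would not be.
  val label = c.inc().inc().name
}
```

As the text warns, only `this` itself inhabits this.type, so inc could not return a freshly created Counter.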

3. Lightweight PolyP in Scala

Recently, Gibbons has shown [14] how a lightweight version of PolyP [19] can be encoded in plain Haskell. The main strength of this approach to DGP, which comes at the cost of being limited to regular data types of fixed kind, is that it allows for a truly polymorphic version of recursion combinators like fold and unfold.

trait TypeConstructor {type a; type b}

trait Bifunctor[s <: Bifunctor[s]]
    requires s extends TypeConstructor {
  def bimap[c, d](f :a=>c, g :b=>d)
    :s{type a=c; type b=d}
}

case class Fix[s <: Bifunctor[s], fa]
    (out :s{type a=fa; type b=Fix[s,fa]}) {
  def map[mb](f :fa=>mb) :Fix[s,mb]
    = Fix(out.bimap(f, .map(f)))

  def fold[fb](f :s{type a=fa;type b=fb}=>fb) :fb
    = f(out.bimap(id[fa], .fold(f)))
}

def unfold[s <: Bifunctor[s],ua,ub]
    (f :ub=>s{type a=ua;type b=ub})(x :ub) :Fix[s, ua]
  = Fix(f(x).bimap(id[ua], unfold(f)))

def hylo[s <: Bifunctor[s],ha,hb,hc]
    (f :hb=>s{type a=ha;type b=hb},
     g :s{type a=ha;type b=hc}=>hc)(x :hb) :hc
  = g(f(x).bimap(id[ha], hylo[s,ha,hb,hc](f,g)))

trait Builder[s <: Bifunctor[s], ba] {
  final def build() :Fix[s,ba] = bf(Fix[s,ba])
  def bf[_b](f :s{type a=ba; type b=_b}=>_b) :_b
}

Figure 1: Core of Scala encoding of Lightweight PolyP

3.1 A Straightforward Encoding

Our first attempt at encoding a DGP approach in Scala is shown in Fig. 1. It achieves the same level of type safety as the Haskell version by relying on two advanced² features of Scala’s type system, namely (a) abstract type members and type refinement to simulate type application, and (b) explicit self types.

For reference, we have included the corresponding Haskell code – after Gibbons [14] – in Fig. 2, 4, and 6. The full Scala and Haskell sources are available for download from the first author’s homepage³.

² “Advanced” with respect to contemporary OO languages, such as Java and C#, which do not support them.

³ http://www.cs.kuleuven.be/~adriaan/?q=oodgp

class Bifunctor s where
  bimap :: (a->c)->(b->d)->(s a b)->(s c d)

data Fix s a = In (s a (Fix s a))
out :: Fix s a -> s a (Fix s a)
out (In x) = x

map :: Bifunctor s =>
  (a->b) -> Fix s a -> Fix s b
map f = In . bimap f (map f) . out

fold :: Bifunctor s =>
  (s a b -> b) -> Fix s a -> b
fold f = f . bimap id (fold f) . out

unfold :: Bifunctor s =>
  (b -> s a b) -> b -> Fix s a
unfold f = In . bimap id (unfold f) . f

hylo :: Bifunctor s =>
  (b -> s a b) -> (s a c -> c) -> b -> c
hylo f g = g . bimap id (hylo f g) . f

build :: Bifunctor s =>
  (forall b. (s a b -> b) -> b) -> Fix s a
build f = f In

Figure 2: Core of the original Haskell encoding of Lightweight PolyP

First, we define TypeConstructor, a trait that models a type constructor of kind * → * → *. It consists of two abstract type members, which will be given a concrete type by mixing in an anonymous class that defines a and b. Thus, as discussed in 2.2, if s <: TypeConstructor, s can be thought of as having kind * → * → *, so that x :s{type a=theA;type b=theB}⁴ corresponds to x :: s theA theB in Haskell.

Next, we define a trait that represents the type class⁵ of bifunctors. New instances of this type class can be defined by having a class like ListF (Fig. 3) or TreeF (not shown) mix-in (inherit) this trait, with its type parameter s instantiated to the concrete class (here, ListF or TreeF).

By abstracting over s, bimap can be given a more precise type. To make sure the parameter s is not instantiated improperly, it is bounded in two ways: its upper-bound is evident in the definition (it is Bifunctor[s]), but it also has an implicit lower-bound, as it is specified as Bifunctor’s self type (by the “requires s”-clause). This makes it impossible for a subclass (“an instance of the typeclass”) to specify a type for s that is more specific than that subclass, as its self type would not conform to Bifunctor’s self type.

To illustrate how this works, we will try to ‘break’ this part of our encoding in the following three paragraphs. Suppose we are defining the ListF class and that we want to cheat Bifunctor into believing our class conforms to TreeF.

Our first attempt, trait ListF extends Bifunctor[TreeF], is ill-formed, as ListF’s self type, ListF⁶, does not conform to TreeF, the self type of its baseclass, Bifunctor[TreeF]. Thus, it violates rule (1) of section 2.2. We can ‘fix’ this by defining ListF as trait ListF requires TreeF extends Bifunctor[TreeF].

⁴ To make the informal discussion more readable, we sometimes write s[theA,theB] if s is understood to have the right kind. We do not do this in listings, which are all accepted by version 2.1.5 of the Scala compiler, unless noted otherwise.

⁵ Or, more correctly, the constructor class, since s is of a higher kind.

⁶ As no explicit self type is given, the default self type is assumed, being the class itself.

Unfortunately, given this definition, a member of ListF cannot access any member of that class through this, as it has type TreeF. Again, this problem can be patched by reworking the definition to trait ListF requires (TreeF with ListF) extends Bifunctor[TreeF]. This definition respects the first rule for self types and makes ListF’s members accessible through this. Thus, it looks like we were able to circumvent our intention: s has been instantiated to a type that is not implemented by the class.

However, by rule (2) of section 2.2, ListF (or a subclass) cannot be instantiated unless it is first composed with a class that fulfils the TreeF-requirement, i.e., a class that implements that type. Therefore, the semantics of our encoding of a lower-bound on s are upheld for instances of subclasses of Bifunctor. This concludes our example.

Note that, since this refers to an object of type Bifunctor s => s a b (in Haskell notation), the argument of that type (we write it informally as s[a,b]) need not be passed explicitly to bimap.

Now, all the machinery is in place to implement the actual map, fold, unfold, and hylo operations. Since map and fold take some data of type Fix[s,a], we define them as members of Fix[s <: Bifunctor[s], fa], a case class. In Scala, a hierarchy of case classes closely corresponds to (extensible) GADTs. Both provide the same kind of generality: they can be defined (and instantiated) succinctly, and they can be pattern-matched on (including the refined typing of different arms). Here, we only use the syntactic convenience that a case class can be instantiated without specifying the new-keyword.

Before we get to the operations, a word about the definition of Fix: it is polymorphic in the shape and element type, and its primary (and only) constructor takes one argument, out. Additionally, as Fix is a case class, this definition automatically introduces a public, read-only field called ‘out’. This field stores an instance of a once-unrolled version of the recursive type. So, constructing a new instance of Fix (by val x = Fix(a)) corresponds to rolling (called “in” in PolyP), and x.out corresponds to unrolling (called “out” in PolyP).

The simplest operation on Fix, map, is polymorphic in mb (this must be specified explicitly in Scala) and applies a given function f, with type fa=>mb, to every piece of content (with type fa) in the data structure to yield a new structure with the same shape as the original one, but with its elements as transformed by f. Note that .map(f) is syntactic sugar for x => x.map(f), i.e., map’s this-argument is curried.

Another function that deserves some attention is build⁷: essentially, it takes a template of a datastructure with holes for the data constructors, and plugs those holes with the roll-operation, Fix. The template’s holes are represented by an operation that takes an s[a,b] and yields a b, for any b. Therefore, it cannot plug some of its holes itself (with some concrete data constructor), since its type would no longer be polymorphic enough.

In our encoding, the template is represented by Builder’s abstract member bf, which is parametrically polymorphic in _b and takes a function that returns such a _b to plug the holes.

This abstract member can be seen as an argument to the build method, as it has to be given a concrete value before build may be invoked. More concretely, to invoke build, Builder must be instantiated. To do this, it must first be composed with a class that provides a concrete implementation of bf, as classes with abstract members cannot be instantiated. This composition can be seen as a form of argument passing.

⁷ Note that build is an instance of the Template Method design pattern.


trait ListF extends Bifunctor[ListF]

case class NilF[la,lb] () extends ListF {
  type a=la; type b=lb
  def bimap[c,d](f :a=>c, g :b=>d) :NilF[c,d]
    = NilF()
}

case class ConsF[la, lb] (hd :la, tl :lb)
    extends ListF {
  type a=la; type b=lb
  def bimap[c,d](f :a=>c, g :b=>d) :ConsF[c,d]
    = ConsF(f(hd), g(tl));
}

type List[a] = Fix[ListF, a]

Figure 3: Example of a data-type adapted for Lightweight PolyP

data ListF a b = NilF | ConsF a b
instance Bifunctor ListF where
  bimap f g NilF = NilF
  bimap f g (ConsF hd tl) = ConsF (f hd) (g tl)
type List a = Fix ListF a

Figure 4: Example of a data-type adapted for Lightweight PolyP (Haskell)

The bf function cannot simply be encoded as an argument of the build method, as Scala does not support universally quantified type variables with the scope of a single argument. In recent versions of Haskell, this is possible using the forall construct.
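The workaround of turning the polymorphic argument into an abstract polymorphic method can be sketched as follows. This is not the paper's listing: it assumes later Scala syntax with higher-kinded type parameters, drops the Bifunctor bound for brevity, and adds a hypothetical helper toList for inspecting the result.

```scala
object BuildDemo {
  sealed trait ListF[A, B]
  case class NilF[A, B]() extends ListF[A, B]
  case class ConsF[A, B](hd: A, tl: B) extends ListF[A, B]

  final case class Fix[S[_, _], A](out: S[A, Fix[S, A]])

  // The template is the abstract method bf. Because bf itself is
  // polymorphic in B, it plays the role of an argument of type
  // forall b. (s a b -> b) -> b.
  trait Builder[S[_, _], A] {
    def bf[B](plug: S[A, B] => B): B
    // build plugs every hole with the roll operation, Fix.
    final def build(): Fix[S, A] = bf[Fix[S, A]](s => Fix[S, A](s))
  }

  // "Passing the argument" means supplying an implementation of bf.
  val template = new Builder[ListF, Int] {
    def bf[B](x: ListF[Int, B] => B): B =
      x(ConsF(1, x(ConsF(2, x(ConsF(3, x(NilF())))))))
  }
  val list123: Fix[ListF, Int] = template.build()

  // Hypothetical helper: converts to a standard List for inspection.
  def toList[A](xs: Fix[ListF, A]): List[A] = xs.out match {
    case NilF()      => Nil
    case ConsF(h, t) => h :: toList(t)
  }
}
```

Inside bf the only way to obtain a B is to call x, so the template cannot plug a hole with Fix itself.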

The testLists function in Fig. 5 shows an example usage of Builder. Note that Scala’s type checker effectively prohibits the template from plugging a hole itself:

val list123 = new Builder[ListF, Int]{
  def bf[_b](x :ListF{type a=Int;type b=_b}=>_b)
    = x(ConsF(1,x(ConsF(2, x(ConsF(3,
        /*<error>*/Fix[ListF, Int]/*</error>*/(NilF())))))))
}.build

does not type check.

The remaining functions do not present any new difficulties, so we do not discuss them in detail.

Before discussing the strengths and weaknesses of this encoding, we consider the example in Fig. 3 and 5. ListF is declared as an instance of the Bifunctor-typeclass by extending Bifunctor[ListF]. Its two subclasses, ConsF and NilF, implement bimap appropriately. Ideally, ConsF’s definition would look like case class ConsF(x :a , y :b) extends ListF, but since type members are not in scope in the primary constructor, this has to be worked around by introducing type parameters which are equal to the corresponding type members.

Finally, Fig. 5 shows how to use the generic fold to sum all the elements in a list, how to unfold a number to the list of its predecessors, how to build a list from a template with holes, which are called ‘x’, and how to map a function over this list.

def sumall(lst :Fix[ListF, Int]) :Int = {
  def myadd(x :ListF{type a=Int; type b=Int}) :Int =
    x match {
      case NilF() => 0
      case ConsF(hd,sumTail) => hd+sumTail
    }

  lst.fold(myadd)
}

def preds(i :Int) :List[Int] = {
  def mypred(x :Int) :ListF{type a=Int; type b=Int} =
    x match {
      case 0 => NilF()
      case _ => ConsF(x-1, x-1)
    }

  unfold[ListF, Int, Int](mypred)(i)
}

def testLists = {
  val list123 = new Builder[ListF, Int]{
    def bf[_b](x :ListF{type a=Int;type b=_b}=>_b)
      = x(ConsF(1,x(ConsF(2, x(ConsF(3, x(NilF())))))))
  }.build

  print(list123.map( x => x*3) )
  print(sumall(list123))
  print(preds(10))
}

Figure 5: Example generic functions

--- example use:
--- sumall (In (ConsF 1 (In (ConsF 2 (In (ConsF 3 (In NilF)))))))
sumall :: Fix ListF Integer -> Integer
sumall = fold myadd
myadd NilF = 0
myadd (ConsF x y) = x+y

--- example use:
--- preds 10
preds :: Integer -> List Integer
preds = unfold mypred
mypred :: Integer -> ListF Integer Integer
mypred 0 = NilF
mypred x = ConsF (x-1) (x-1)

Figure 6: Example generic functions (Haskell)
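For readers who want to experiment with the generic functions on a later Scala release, the examples can be approximated as below. This sketch departs from the paper's encoding in two labeled ways: it assumes higher-kinded type parameters (S[_, _]) instead of type members with refinement, and it passes the Bifunctor instance as an implicit value instead of using self types. The helpers nil, cons and toList are hypothetical.

```scala
object FixDemo {
  // The constructor class of bifunctors, with s as a type constructor.
  trait Bifunctor[S[_, _]] {
    def bimap[A, B, C, D](s: S[A, B])(f: A => C, g: B => D): S[C, D]
  }

  // Fix(x) rolls the recursive type, x.out unrolls it once.
  final case class Fix[S[_, _], A](out: S[A, Fix[S, A]])

  def map[S[_, _], A, B](x: Fix[S, A])(f: A => B)(implicit bf: Bifunctor[S]): Fix[S, B] =
    Fix[S, B](bf.bimap[A, Fix[S, A], B, Fix[S, B]](x.out)(f, r => map[S, A, B](r)(f)))

  def fold[S[_, _], A, B](x: Fix[S, A])(alg: S[A, B] => B)(implicit bf: Bifunctor[S]): B =
    alg(bf.bimap[A, Fix[S, A], A, B](x.out)(a => a, r => fold[S, A, B](r)(alg)))

  def unfold[S[_, _], A, B](coalg: B => S[A, B])(seed: B)(implicit bf: Bifunctor[S]): Fix[S, A] =
    Fix[S, A](bf.bimap[A, B, A, Fix[S, A]](coalg(seed))(a => a, b => unfold[S, A, B](coalg)(b)))

  // The non-recursive shape of lists, as in Fig. 3 and 4.
  sealed trait ListF[A, B]
  case class NilF[A, B]() extends ListF[A, B]
  case class ConsF[A, B](hd: A, tl: B) extends ListF[A, B]

  implicit val listFBifunctor: Bifunctor[ListF] = new Bifunctor[ListF] {
    def bimap[A, B, C, D](s: ListF[A, B])(f: A => C, g: B => D): ListF[C, D] =
      s match {
        case NilF()        => NilF[C, D]()
        case ConsF(hd, tl) => ConsF[C, D](f(hd), g(tl))
      }
  }

  // Hypothetical smart constructors that hide the rolling.
  def nil[A]: Fix[ListF, A] = Fix[ListF, A](NilF[A, Fix[ListF, A]]())
  def cons[A](h: A, t: Fix[ListF, A]): Fix[ListF, A] =
    Fix[ListF, A](ConsF[A, Fix[ListF, A]](h, t))

  // sumall: fold with an algebra that adds the head to the folded tail.
  def sumall(xs: Fix[ListF, Int]): Int =
    fold[ListF, Int, Int](xs) {
      case NilF()        => 0
      case ConsF(hd, tl) => hd + tl
    }

  // preds: unfold a number into the list of its predecessors.
  def preds(i: Int): Fix[ListF, Int] =
    unfold[ListF, Int, Int](x =>
      if (x == 0) NilF[Int, Int]() else ConsF[Int, Int](x - 1, x - 1))(i)

  // Hypothetical helper: convert to a standard List via fold.
  def toList[A](xs: Fix[ListF, A]): List[A] =
    fold[ListF, A, List[A]](xs) {
      case NilF()        => Nil
      case ConsF(hd, tl) => hd :: tl
    }

  val list123: Fix[ListF, Int] = cons(1, cons(2, cons(3, nil[Int])))
  val tripled: List[Int] = toList(map[ListF, Int, Int](list123)(x => x * 3))
}
```

The explicit type arguments on bimap are not strictly part of the technique; they are spelled out so that each recursive call can be read without relying on type inference.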


3.2 Discussion of the Straightforward Encoding

In this section we briefly discuss the most apparent disadvantages of lightweight PolyP and its encoding in Scala and Haskell.

3.2.1 Lightweight PolyP

As the PolyP approach exposes the points of recursion in a data-type, fold can be implemented generically once and for all. However, this only works for regular data-types of a fixed kind (the approach is not higher-order polymorphic).

Moreover, since the recursion points are marked by interpreting a recursive data-type as the least fixed point of the corresponding non-recursive type, the approach does not operate directly on the original, recursive data-types, but on a rewritten version that takes a type parameter where recursion was used originally.

3.2.2 Scala and Haskell

Due to Scala’s support for (virtual) type members, the Haskell encoding of lightweight PolyP carries over quite straightforwardly. Note that encoding this in Java or C# would require (exponential) global program rewriting [2, 32] as they do not support type members [20] (let alone higher-kinded types). However, Scala’s design decisions regarding type members make some parts of the encoding more verbose than the Haskell version.

For example, we would like to be able to express the lower-bound on Bifunctor’s s-parameter more directly by defining the trait as follows: trait Bifunctor[s >: this.type <: Bifunctor[s]]. Unfortunately, in the scope of the definition of type parameters, this refers to the enclosing instance of the class being defined. Note that we cannot do without s and simply use this.type directly in bimap’s return type, since it is a singleton type: its only inhabitant is this. Of course, these design decisions are all quite sensible, they were simply taken to meet different requirements.

Neither Haskell’s nor Scala’s type system directly supports expressing the isomorphism between the original recursive type and its unrolling. Therefore, this isomorphism has to be witnessed by an embedding-projection pair. This induces run-time overhead which only serves to appease the type checker. In certain cases, this overhead can be avoided in Haskell using the newtype construct, which is not available in Scala.

Besides the run-time overhead due to the allocation of unnecessary objects, the invocations of the functions that witness the isomorphism also dilute the essence of the code. In section 4.1 we show how Scala’s implicit conversions can be used to avoid this syntactic overhead. But first, we tackle another disadvantage of our encoding and try to make it more extensible.

3.3 An Extensible Encoding

In one dimension, our first attempt is already extensible: to be useful at all, it has to allow for new instances of the Bifunctor typeclass (i.e., subclasses of the trait Bifunctor[s]). However, it is not possible to add new operations to a Bifunctor (to support an extension such as generic traversals [15]), without changes to existing code. The same problem exists with Fix[s,a].

In this section we lift this limitation using a virtual class encoding, which is based on Odersky and Zenger’s solution to the well-known expression problem [39]. We first demonstrate its essence using the expression problem, then we apply it to our encoding of the previous section and finally, we discuss the degree of extensibility we have achieved.

Virtual classes are an elegant way of retroactively and modularly extending classes. They were introduced in Beta and gBeta [8] as a result of generalising the idea of late binding of virtual methods to classes. Thus, similarly to a virtual method, a virtual class may be overridden in a subclass of its enclosing class.

trait ECore {
  [open] trait Expr[t]
  case class Num(n :Int) extends Expr[Int]
}

trait EEval requires (ECore with EEval) {
  trait Expr[t] {
    def eval :t
  }

  case class Num(n :Int) extends Expr[Int] {
    def eval = n
  }
}

[close] class Deployment extends ECore
    with EEval

Figure 7: Pseudo-code pretending Scala supports virtual classes

In other words, a virtual class VC is a virtual member of another class M (i.e., VC is nested in M), so that VC may be overridden in a subclass M’ of M. This redefines the meaning of VC in every member of M’ (including the members inherited from M).

Full-blown support for virtual classes has only recently been proven type-safe [10, 5], but many similar extensions have previously been proposed in other languages [33, 28].

3.3.1 Virtual Classes and the Expression Problem

Before applying the virtual class encoding to the code in Fig. 1, 3, and 5, we illustrate it using a simpler example. First, we formulate the example using a hypothetical extension of Scala that supports virtual classes directly by introducing two new annotations, [open] and [close]. Then we show how to compile away these annotations. A nested class may be declared virtual (or “open”) by tagging it with the [open]-annotation. The deployment (discussed below) is annotated with [close].

Fig. 7 shows a solution to a minimalist version of the expression problem using our extension: it defines a virtual trait Expr[t], and a number of classes that are implicitly virtual (since they inherit from a virtual class⁸).

The main point is that Expr and Num in EEval – the “package” that provides the evaluation-functionality – override the synonymous classes in the core package, giving these class names and the corresponding types a new meaning. Thus, without changing existing code, we have retroactively extended the class Num and the corresponding type to provide more functionality. Wherever Num(10) was written to create a new basic number-expression, now a new number-expression is created that supports evaluation. (EEval’s self type simply specifies the requirements of this package, or “component”: it relies on the core package and on itself⁹.)

Finally, the class Deployment puts it all together and fixes the order of the different extensions. Think of it as deploying a (component-based) web-application: it determines the way the different “components” are “wired,” but it does not contain any application logic. In other words, it maps the abstract types to a concrete composition of implementations, thus resolving the components’ requirements. New functionality can be added simply by mixing in additional packages that provide the required operations; existing code now automatically uses these refined classes (of course, if new operations have been added, these are only used by code that requires that this package has been mixed in).

⁸ In our current approach, we make classes that inherit from virtual classes, virtual by default.

⁹ The latter has to be made explicit in Scala.

trait ECore {
  type TExpr <: Expr$ECore
  type TNum <: TExpr with Num$ECore

  def Num(n :Int) :TNum

  trait Expr$ECore requires TExpr {type t}
  trait Num$ECore requires TNum extends Expr$ECore
    {type t=Int; val n :Int}
}

trait EEval requires (ECore with EEval) {
  type TExpr <: Expr$ECore with Expr$EEval
  type TNum <: TExpr with Num$ECore
                     with Num$EEval

  trait Expr$EEval requires TExpr {
    def eval :t
  }

  trait Num$EEval requires TNum
      extends Expr$EEval {
    def eval = n
  }
}

class Deployment extends ECore
    with EEval {
  type TExpr = Expr$ECore with Expr$EEval
  type TNum = Num$ECore with Num$EEval

  def Num(nn :Int) = new Num$ECore with Num$EEval {val n=nn}
}

Figure 8: Encoding of Fig. 7

3.3.2 Encoding a Solution to the Expression Problem

Since Scala does not allow a class (as a member of another class) to be overridden (which would of course obviate the need for an encoding), but does allow overriding of type members, we shall treat a class and its type separately. As shown in Fig. 8, we introduce a type member for each virtual class and name-mangle the actual classes by appending the name of the enclosing class.

The introduced type member is used as the self type of the corresponding class and its upper bound is determined by the inheritance structure of this class: a compound type consisting of the class itself and the self type of every class it inherits from. Furthermore, we introduce an abstract function for every concrete class to represent its constructor, which is not known until deployment-time.
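As a rough illustration in present-day Scala, the three ingredients just described (a type member per virtual class, a name-mangled implementation trait, and an abstract constructor function) can be sketched as follows. This is a simplified sketch under stated assumptions, not the code of Fig. 8: the EEval package extends ECore instead of requiring it through a self type, the type member t is fixed to Int, only the Num class is modelled, and the names VirtualClassSketch, NumCore, and NumEval are invented for this example.

```scala
object VirtualClassSketch {
  // Core "package": one abstract type member per virtual class.
  trait ECore {
    type TNum <: NumCore          // stands in for the virtual class Num
    def Num(n: Int): TNum         // abstract constructor, fixed at deployment
    trait NumCore { val n: Int }  // name-mangled implementation trait
  }

  // Extension "package": refines the bound of TNum and adds eval.
  // (Assumption: plain inheritance replaces the paper's self-type requirement.)
  trait EEval extends ECore {
    type TNum <: NumCore with NumEval
    trait NumEval { this: NumCore =>
      def eval: Int = n
    }
  }

  // Deployment: fixes the type members and supplies the constructor.
  object Deployment extends EEval {
    type TNum = NumCore with NumEval
    def Num(nn: Int): TNum = new NumCore with NumEval { val n = nn }
  }
}
```

Under this deployment, code that was written against ECore and calls Num(10) obtains an object that also supports eval, without the core package having been changed.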

3.3.3 Applying the Virtual Class Encoding to our Encoding of Lightweight PolyP

The virtual class encoding carries over quite straightforwardly to our encoding of lightweight PolyP, as illustrated by the code in Fig. 9, 10, 11, and 12. In section 4, we discuss a version using the pseudo-code [open]/[close] notation.

The DGPBase trait (Fig. 9) contains the essential DGP functionality: it provides an extensible Bifunctor and Fix (in the sense that more operations may be added to them, in addition to the more traditional interpretation of allowing for subclasses). (For brevity, we have in-lined TypeConstructor.)

    trait DGPBase {
      type TBifunctor <: Bifunctor$DGPBase
      type TFix <: Fix$DGPBase
      type TBuilder <: Builder$DGPBase

      trait Bifunctor$DGPBase requires TBifunctor {
        type a; // content (inlined)
        type b; // recursive substructure (inlined)
        type s >: this.type <: TBifunctor /*{type s=Bifunctor$DGPBase.this.s}*/

        def bimap[c, d](f :a=>c, g :b=>d)
          :s{type s=Bifunctor$DGPBase.this.s; type a=c; type b=d}
      }

      def Fix[s_ <:TBifunctor/*{type s=s_}*/, fa_](
          out_ :s_{type s=s_; type a=fa_; type b=TFix{type s=s_; type a=fa_}})
          :TFix{type s=s_; type a=fa_}

      trait Fix$DGPBase requires TFix {
        type s <: TBifunctor/*{type s=this.s}*/
        type a
        val out :s{type s=Fix$DGPBase.this.s;
                   type a=Fix$DGPBase.this.a;
                   type b=TFix{type s=Fix$DGPBase.this.s;
                               type a=Fix$DGPBase.this.a}}

        def map[b](f :a=>b) :TFix{type s=Fix$DGPBase.this.s; type a=b;}
          = Fix(out.bimap(f, .map(f)))

        def fold[fb](f :s{type a=Fix$DGPBase.this.a; type b=fb}=>fb) :fb
          = f(out.bimap(id[Fix$DGPBase.this.a], .fold(f)))
      }

      def unfold[us <: TBifunctor/*{type s=us}*/, ua, ub](
          f :ub=>us{type s=us; type a=ua; type b=ub})(x :ub)
          :TFix{type s=us; type a=ua;}
        = Fix(f(x).bimap(id[ua], unfold(f)))

      def hylo[hs <: TBifunctor/*{type s=hs}*/, ha, hb, hc]
          (f :hb=>hs{type s=hs; type a=ha; type b=hb},
           g :hs{type s=hs; type a=ha; type b=hc}=>hc)
          (x :hb) :hc
        = g(f(x).bimap(id[ha], hylo(f,g)))

      trait Builder$DGPBase requires TBuilder {
        // a_, s_ aliases avoid qualified names
        type s <: TBifunctor ; type s_ = s
        type a ; type a_ = a

        final def build() :TFix{type s=s_; type a=a_} = bf(Fix[s,a])

        def bf[_b](f :s{type s=s_; type a=a_; type b=_b}=>_b) :_b
      }
    }

Figure 9: A more extensible version of the code in Fig. 1

    trait LCore {
      type TListF <: ListF$LCore
      type TNilF <: TListF with NilF$LCore
      type TConsF <: TListF with ConsF$LCore

      trait ListF$LCore requires TListF {
        type a ; type b ; type s = TListF
      }

      def NilF[a_, b_]() :TNilF{type a=a_; type b=b_}

      trait NilF$LCore requires TNilF extends ListF$LCore

      def ConsF[a_, b_](head :a_, tail :b_) :TConsF{type a=a_; type b=b_}

      trait ConsF$LCore requires TConsF extends ListF$LCore {
        val hd :a ; val tl :b
      }
    }

Figure 10: Encoding of the DGP-agnostic operations on lists

    trait ListDGP requires (DGPBase with LCore with ListDGP) {
      type TListF <: TBifunctor with ListF$LCore
      type TNilF <: TListF with NilF$LCore with NilF$ListDGP
      type TConsF <: TListF with ConsF$LCore with ConsF$ListDGP

      trait NilF$ListDGP requires TNilF {
        def bimap[c, d](f :a=>c, g :b=>d) :s{type a=c; type b=d}
          = NilF[c,d]()
      }

      trait ConsF$ListDGP requires TConsF {
        def bimap[c, d](f :a=>c, g :b=>d) :s{type a=c; type b=d}
          = ConsF(f(hd), g(tl));
      }

      type List[a_] = TFix{type s=TListF; type a=a_}
    }

Figure 11: Encoding of the DGP-operations on lists

The LCore package (Fig. 10) defines the list functionality, independently of the other packages (however, foresight is still required in using a type parameter instead of recursion). The ListDGP package (Fig. 11) extends the LCore package with DGP functionality (i.e., a suitable implementation of the bimap function). Finally, Fig. 12 shows the deployment.

    class Deployment extends DGPBase with LCore with ListDGP {
      type TBifunctor = Bifunctor$DGPBase
      type TFix = Fix$DGPBase
      type TListF = TBifunctor with ListF$LCore
      type TNilF = TListF with NilF$LCore with NilF$ListDGP
      type TConsF = TListF with ConsF$LCore with ConsF$ListDGP
      type TBuilder = Builder$DGPBase

      def Fix[s_ <:TBifunctor/*{type s=s_}*/, fa_](
          out_ :s_{type s=s_; type a=fa_; type b=TFix{type s=s_; type a=fa_}})
          :TFix{type s=s_; type a=fa_}
        = new Fix$DGPBase {type s=s_; type a=fa_; val out=out_}

      def NilF[a_, b_]() :TNilF{type a=a_; type b=b_}
        = new Bifunctor$DGPBase with NilF$LCore with NilF$ListDGP {
            type a=a_; type b=b_}

      def ConsF[a_, b_](head :a_, tail :b_) :TConsF{type a=a_; type b=b_}
        = new Bifunctor$DGPBase with ConsF$LCore with ConsF$ListDGP {
            type a=a_; type b=b_; val hd=head; val tl=tail}
    }

Figure 12: The Deployment

3.4 Discussion of the Extensible Encoding

3.4.1 Independent Extensions

The virtual class encoding only supports independent extensions up to a certain point. Packages can only be evolved independently if they really are independent. Suppose that we want to add a new subclass to ListF$LCore in the package LCore: a cons cell that supports constant-time appends, for example. If we do not enhance the ListDGP package with the corresponding subclass to ListF$ListDGP, it is impossible to make a concrete deployment that contains both LCore and ListDGP.

More concretely, if ListDGP is included in the deployment, ListF’s self type conforms to TBifunctor. Therefore, a subclass of ListF can only be instantiated if it implements bimap. Now, to make a concrete deployment, every concrete class it contains must be instantiatable. This is a consequence of the second rule for self types: an instance of a class composition must conform to its self type.

3.4.2 Extent of Extensibility

Although our encoding effectively allows us to retroactively extend classes while respecting modularity, it requires us to make a decision about the exact functionality we need (by making the deployment), before we can actually create any instances of these classes. This does not seem like a serious disadvantage, and it can be solved by using object-based inheritance instead of static mixin composition [33].

Another reason why we do not consider this a major problem is that an application should not depend on the concrete deployment (as long as it meets the requirements, it does not matter how exactly) and the deployment does not contain any application logic.


    implicit def unroll[s <: Bifunctor[s], ua](v :Fix[s,ua])
        :s{type a=ua; type b=Fix[s,ua]}
      = v.out

    implicit def roll[s <: Bifunctor[s], ra](
        v :s{type a=ra; type b=Fix[s,ra]})
        :Fix[s,ra]
      = Fix(v)

Figure 13: Making the witnesses to the isomorphism implicit

Moreover, if a new deployment only supplements the existing one, it can reuse it using inheritance. In case of a completely different deployment, duplication is minimal as deployment only specifies “the wiring of the components” (the mixin-composition of the traits that implement the abstract types) that make up the application. Furthermore, it does not require the language to specify any general conflict resolution rules – which may prevent certain sensible combinations, or may even lead to surprising implicit combinations. Instead, it offers the programmer full control, while still guaranteeing type safety by the usual typing rules.

Unfortunately, even though virtual classes make it possible to add new functionality without changing existing classes, they do not solve the more fundamental problem that classes have to be developed with this particular approach to DGP in mind, since they must make type recursion explicit using a type parameter.

3.4.3 Scala-Specific Limitations

The virtual class encoding uses a type member to abstract over a virtual class’ self type, which must be parametrised if the class was parametrised. However, Scala does not allow type members to be parametrised. Thus, virtual classes cannot be parametrised either. Luckily, type parameters are mostly syntactic sugar for type members. Thus, our encoding desugars them to type members [32].
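The desugaring of a type parameter into a type member can be illustrated with a small, self-contained sketch in present-day Scala. The names TypeMemberSketch, Cell, cell, and get are hypothetical and serve only this illustration; the sketch shows the general idea of replacing Cell[T] by a refinement Cell { type A = T }, not the paper's actual encoding.

```scala
object TypeMemberSketch {
  // Parameterised form would be: class Cell[T](val value: T)
  // Desugared form: the parameter becomes an abstract type member.
  trait Cell {
    type A
    def value: A
  }

  // A refinement such as Cell { type A = T } plays the role of Cell[T].
  def cell[T](v: T): Cell { type A = T } =
    new Cell { type A = T; def value: T = v }

  // Clients state the "type argument" through the same refinement.
  def get[T](c: Cell { type A = T }): T = c.value
}
```

Every use site has to repeat the refinement, which is the source of the verbose types such as TFix{type s=s_; type a=fa_} in Fig. 9.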

Unfortunately, type parameters are only mostly syntactic sugar: some constructs do not carry over from type parameters to type members. For example, the Scala compiler does not allow a type member to be bounded by a type that contains that type member (even though this is legal for type parameters). Therefore, the bound on s is propagated to every use of that type.

More concretely, since we cannot write type s >: this.type <: TBifunctor{type s=Bifunctor$DGPBase.this.s}, we propagate {type s=Bifunctor$DGPBase.this.s} to every occurrence of s in the scope of this definition: s{type s = Bifunctor$DGPBase.this.s; ... }.

Yet, type members allow us to do things not possible with type parameters: we can now use this.type as the lower bound on s: type s >: this.type <: TBifunctor.

4. Further Improvements

4.1 Leveraging Scala-Specific Features

Scala’s implicit conversions allow us to simulate support for equi-recursive types. That is, we can make the witness of the isomorphism between the recursive type and its unrolling implicit using the definitions in Fig. 13. Whenever these functions are in scope (i.e., accessible without a qualified name), the compiler inserts a call to one of them to remedy a discrepancy between an expression’s inferred and its expected type.
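To make the mechanism concrete, the following sketch approximates the two implicit witnesses in present-day Scala. It rests on several assumptions: it uses higher-kinded type parameters (which the Scala version discussed in this paper lacks), it fixes the conversions to a single list-of-Int instance to keep type inference simple, and the names RollUnrollSketch, IntList, empty, single, and headOption are invented for the example.

```scala
import scala.language.implicitConversions

object RollUnrollSketch {
  sealed trait ListF[A, B]
  final case class NilF[A, B]() extends ListF[A, B]
  final case class ConsF[A, B](hd: A, tl: B) extends ListF[A, B]

  // Iso-recursive fixed point: values must be wrapped and unwrapped.
  final case class Fix[F[_, _], A](out: F[A, Fix[F, A]])

  type IntList = Fix[ListF, Int]

  // Witnesses of the isomorphism, made implicit (monomorphic for simplicity).
  implicit def unroll(v: IntList): ListF[Int, IntList] = v.out
  implicit def roll(v: ListF[Int, IntList]): IntList = Fix[ListF, Int](v)

  // roll is inserted here: the right-hand sides are NilF/ConsF values, not Fix.
  val empty: IntList = NilF[Int, IntList]()
  val single: IntList = ConsF[Int, IntList](7, empty)

  // unroll is inserted here, triggered by the type ascription.
  def headOption(xs: IntList): Option[Int] =
    (xs: ListF[Int, IntList]) match {
      case ConsF(h, _) => Some(h)
      case NilF()      => None
    }
}
```

The conversions only remove the explicit calls from the source text; each inserted call still runs at run time.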

So, if we have an e :Fix[s,a_], but really need an s[a_, Fix[s,a_]], the e is rewritten to unroll(e). In principle, these definitions would relieve us from writing “out” and “Fix,” were it not for some limitations in the current definition of the Scala language. (Implicit conversions are only inferred for calls on explicit targets, whereas our encoding also requires them to be inferred when the target is this implicitly.) As already mentioned, these implicit conversions only resolve some syntactic overhead; they are still performed behind the scenes at run time.

    trait DGPBase {
      // (implicit conversions omitted)
      [open] trait Bifunctor[a,b] {
        type self[a,b] >: this.type <: Bifunctor[a,b]
        def bimap[c,d](f :a=>c, g :b=>d) :self[c,d]
      }

      [open] case class Fix[s <: Bifunctor, a](out :s[a, Fix[s,a]]) {
        def map[b](f :a=>b) :Fix[s,b]
          = bimap(f, .map(f))

        def fold[b](f :s[a,b]=>b) :b
          = f(bimap(id, .fold(f)))
      }

      def unfold[s <: Bifunctor, a, b](f :b=>s[a,b])(x :b) :Fix[s, a]
        = f(x).bimap(id, unfold(f))

      def hylo[s <: Bifunctor, a, b, c](f :b=>s[a,b], g :s[a,b]=>c)(x :b) :c
        = g(f(x).bimap(id, hylo(f,g)))

      def build[s <: Bifunctor, a](f : (forall b.(s[a,b] => b) => b) ) :Fix[s,a]
        = f(Fix)
    }

Figure 14: The DGP Core (in Extended Scala)

4.2 Extending the Language

In this section we will briefly discuss extensions to Scala which seem useful and feasible, but which we have not fully formalised or which we cannot readily encode in “classic” Scala.

4.2.1 Consolidating Previous Extensions

We now consider an implementation of our library in an extended version of Scala, which supports the features we have motivated when discussing the limitations of our current implementation. Fig. 14, 15, 16, and 17 show what our code would look like, given support for (1) higher-kinded types and type application, (2) virtual classes, (3) universal quantification for arbitrary type declarations, (4) implicit conversions with fewer restrictions, and (5) using type members in the signature of the primary constructor.

The higher-kinded types manifest themselves as ‘raw’ occurrences of a parametrised type (i.e., type constructors), such as Bifunctor in Fig. 14. By the definition of the Bifunctor class, the compiler knows it takes two type parameters, which may be omitted, as in the bound on s in the definition of Fix. This requires subtyping on higher-order types [34]. Finally, note the universally quantified type variable b in the definition of the build method.
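Abstracting over a type constructor can be sketched in present-day Scala, which supports higher-kinded type parameters. The sketch below is only an approximation of Fig. 14 under stated assumptions: Bifunctor is modelled as a type class passed implicitly rather than as a virtual class with a self type, virtual classes and universal quantification are left out, and the names FixSketch, FixList, countdown, and sum are invented for the example.

```scala
object FixSketch {
  // Type class standing in for the Bifunctor trait (an assumption of this sketch).
  trait Bifunctor[F[_, _]] {
    def bimap[A, B, C, D](fab: F[A, B])(f: A => C, g: B => D): F[C, D]
  }

  // Fix abstracts over the type constructor F, not over a plain type.
  final case class Fix[F[_, _], A](out: F[A, Fix[F, A]])

  def map[F[_, _], A, B](x: Fix[F, A])(f: A => B)(implicit bf: Bifunctor[F]): Fix[F, B] =
    Fix[F, B](bf.bimap[A, Fix[F, A], B, Fix[F, B]](x.out)(
      f, (r: Fix[F, A]) => map[F, A, B](r)(f)))

  def fold[F[_, _], A, B](x: Fix[F, A])(alg: F[A, B] => B)(implicit bf: Bifunctor[F]): B =
    alg(bf.bimap[A, Fix[F, A], A, B](x.out)(
      (a: A) => a, (r: Fix[F, A]) => fold[F, A, B](r)(alg)))

  def unfold[F[_, _], A, B](seed: B)(coalg: B => F[A, B])(implicit bf: Bifunctor[F]): Fix[F, A] =
    Fix[F, A](bf.bimap[A, B, A, Fix[F, A]](coalg(seed))(
      (a: A) => a, (b: B) => unfold[F, A, B](b)(coalg)))

  // The list functor: B marks the position of the recursive substructure.
  sealed trait ListF[A, B]
  final case class NilF[A, B]() extends ListF[A, B]
  final case class ConsF[A, B](hd: A, tl: B) extends ListF[A, B]

  implicit val listBifunctor: Bifunctor[ListF] = new Bifunctor[ListF] {
    def bimap[A, B, C, D](fab: ListF[A, B])(f: A => C, g: B => D): ListF[C, D] =
      fab match {
        case NilF()      => NilF[C, D]()
        case ConsF(h, t) => ConsF[C, D](f(h), g(t))
      }
  }

  type FixList[A] = Fix[ListF, A]

  // Builds the list n, n-1, ..., 1 by unfolding a counter.
  def countdown(n: Int): FixList[Int] =
    unfold[ListF, Int, Int](n)(k =>
      if (k <= 0) NilF[Int, Int]() else ConsF[Int, Int](k, k - 1))

  // Sums a list by folding; t already holds the sum of the tail.
  def sum(xs: FixList[Int]): Int =
    fold[ListF, Int, Int](xs) {
      case NilF()      => 0
      case ConsF(h, t) => h + t
    }
}
```

Here map, fold, and unfold are written once against an arbitrary F and work for any datatype that supplies a Bifunctor instance; ListF is just one such instance.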

4.2.2 Integration with Existing Code

Despite all the extensions we have proposed, one important limitation of our approach remains: user-defined classes must have exactly two type parameters: one for the element type and one for the recursive substructures. While the former kind of type parameter is quite common, abstracting over the type of recursive substructures is not. We now briefly explore a way to overcome this problem. We have not fully formalised these ideas, but we motivate why we think they are feasible.

    trait LCore {
      [open] trait ListF[a,b]

      case class NilF[a,b]() extends ListF[a,b]
      case class ConsF[a,b](hd :a, tl :b) extends ListF[a,b]
    }

Figure 15: The List Core (in Extended Scala)

    trait ListDGP requires (DGPBase with LCore with ListDGP) {
      trait ListF[a,b] extends Bifunctor[a,b]

      case class NilF[a,b]() extends ListF[a,b] {
        def bimap[c,d](f :a=>c, g :b=>d) :NilF[c,d]
          = NilF()
      }

      case class ConsF[a,b](hd :a, tl :b) extends ListF[a,b] {
        def bimap[c,d](f :a=>c, g :b=>d) :ConsF[c,d]
          = ConsF(f(hd), g(tl));
      }

      type List[a] = Fix[ListF, a]
    }

Figure 16: The DGP Extension for Lists (in Extended Scala)

    [close] trait Deployment extends DGPBase with LCore with ListDGP

    object Main extends Deployment {
      // main method
    }

Figure 17: The Deployment (in Extended Scala)

    trait LCore {
      [open] trait List[a]

      case class Nil[a]() extends List[a]
      case class Cons[a](hd :a, tl :List[a]) extends List[a]
    }

    trait LAdaptDGP requires (LCore with DGPBase with LAdaptDGP) {
      type List[a] = mu t. List[a, t]
      trait List[a,b] extends Bifunctor[a,b]

      type Nil[a] = exists t. Nil[a, t]
      case class Nil[a,b]() extends List[a,b]

      type Cons[a] = Cons[a, List[a]]
      case class Cons[a,b](hd :a, tl :b) extends List[a,b]
    }

Figure 18: Adapting user-defined classes to our approach using virtual classes and generalized constraints (pseudo-code)

    trait DGPBase {
      [open] trait Bifunctor[a,b] {
        type self[a,b] >: this.type <: Bifunctor[a,b]
        def bimap[c,d](f :a=>c, g :b=>d) :self[c,d]

        def map[b <: Bifunctor[a, b], a'](f :a=>a') :self[a',b]
          = bimap(f, .map(f))

        def fold[b <: Bifunctor[a, b], b'](f :self[a,b']=>b') :b'
          = f(bimap(id, .fold(f)))
      }
    }

Figure 19: Core of our DGP library using generalized constraints (pseudo-code)

Suppose we have a class that is parametrically polymorphic in the element type and that we want to use it with our DGP library. As before, we use virtual classes for retroactive extension: we override the original class with a new one that has an extra type parameter for the recursive substructures. This way, the fold method can create an instance of the (revised) user-defined class that stores the required type of intermediate result in the fields that would normally point to a recursive substructure.

Of course, the original user-defined methods must not be called on such an object. This can be ensured statically by adding a constraint to those methods so that they may only be invoked when the synthetic type parameter conforms to the type of the recursive substructures. This is illustrated in Fig. 19, where methods map and fold constrain the class-level type parameter b. Only the DGP methods that ‘open up’ the recursion, such as bimap, may be invoked on instances with an arbitrary type for the (once recursive) substructures.

Consider the example in Fig. 18 and the generic methods (Fig. 19). The package LAdaptDGP retrofits the original implementation of lists to our DGP library. In doing so, three type synonyms are introduced: List[a] stands for the recursive mu t. List[a, t], Nil[a] is equal to exists t. Nil[a, t], and Cons[a] is now an alias for the type Cons[a, List[a]]. This assumes a class definition implies a type definition that can be overridden separately: in LCore, the type List[a] is defined by the corresponding class; in LAdaptDGP, it is made explicit and is overridden by a type alias.

Note that this depends on μ-types and existential types. Furthermore, it requires overloaded type members, as List[a] is only distinguished from List[a,b] by the number of its type parameters; additionally, the type member List[a] (in LAdaptDGP) overrides the implicit type member introduced by the definition of the original List class.

These ideas also simplify the definition of the core DGP functionality, as shown in Fig. 19. The Bifunctor and Fix classes have been merged, distributing the constraint on the recursive type parameter (b) to the relevant methods.

Note that constraining a class-level type parameter in the scope of a method has already been proposed and proven type-safe [7].
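Present-day Scala can approximate such a method-level constraint on a class-level type parameter with an implicit evidence parameter of the standard type <:<. The sketch below is an illustration under stated assumptions rather than the mechanism of [7] or of Fig. 19: the recursive knot is tied with an explicit wrapper class Rec instead of a μ-type, and the names ConstraintSketch, Lst, Nl, Cns, Rec, nil, and two are invented for the example.

```scala
object ConstraintSketch {
  // Ties the recursive knot explicitly (the paper assumes a mu-type here).
  final case class Rec[A](un: Lst[A, Rec[A]])

  sealed trait Lst[A, B] {
    // DGP-style method: callable for any B.
    def bimap[C, D](f: A => C, g: B => D): Lst[C, D]

    // "Original" method: only callable when B is the recursive list type.
    def length(implicit ev: B <:< Rec[A]): Int
  }

  final case class Nl[A, B]() extends Lst[A, B] {
    def bimap[C, D](f: A => C, g: B => D): Lst[C, D] = Nl[C, D]()
    def length(implicit ev: B <:< Rec[A]): Int = 0
  }

  final case class Cns[A, B](hd: A, tl: B) extends Lst[A, B] {
    def bimap[C, D](f: A => C, g: B => D): Lst[C, D] = Cns[C, D](f(hd), g(tl))
    // ev(tl) uses the evidence to view tl as a Rec[A].
    def length(implicit ev: B <:< Rec[A]): Int = 1 + ev(tl).un.length
  }

  val nil: Rec[Int] = Rec[Int](Nl[Int, Rec[Int]]())
  val two: Rec[Int] =
    Rec[Int](Cns[Int, Rec[Int]](1, Rec[Int](Cns[Int, Rec[Int]](2, nil))))

  // Cns[Int, String](1, "x").length is rejected at compile time,
  // because no evidence for String <:< Rec[Int] exists.
}
```

An intermediate value such as Cns[Int, String](1, "x") can still be passed to bimap, which mirrors the distinction drawn above between the DGP methods and the original user-defined methods.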


The remaining challenge is to allow a virtual class to be overridden with a class with one more type parameter, and where every occurrence of the recursive type is replaced by that parameter, constrained to the original type where necessary.

Finally, it seems feasible to allow a virtual class to add type parameters to the original class, given the similarity between type parameters and type members and the fact that a subclass may already introduce new type members. Furthermore, overloaded type members and the separation of a class and its type do not seem problematic.

5. Future and Related Work

As part of our ongoing research, we are developing a fully object-oriented approach to DGP. The essential idea of identifying boilerplate code and factoring it out as a reusable library, without sacrificing type safety and performance, applies to OO as much as it does to FP. However, in OO, design patterns such as Iterator and Visitor and other – imperative – idioms are preferred over maps and folds. The experiment that we described in this paper served as a good source of inspiration, but it stayed too close to the FP philosophy for it to be appealing to the average OO programmer.

Thus, the challenge is to distil the essential insights of DGP in FP and implement them in a way that fits in nicely with the OO way of programming. Hinze [17] notes that DGP relies on two essential ingredients: (a) overloading on types (ad-hoc polymorphism), and (b) a generic view on the structure of datatypes (so that a generic catch-all case can be defined easily). We are modeling our approach after this taxonomy.

In OO, multi-methods are used for overloading on types. Multi-methods, and their generalisation to predicate dispatch, have been studied extensively in the OO community [6, 11]. However, a multi-method’s list of arguments cannot vary based on the type it is indexed by [16].

For the second ingredient, we propose typed meta-programming on types, complemented with minimal support for structural types. We apply meta-programming to generate (structural) types in addition to code. Essentially, we use this to support typed reflection. Additionally, we employ this technique to support multi-methods that take arguments whose shape depends on the type of another argument of the method.

To validate our approach, we are currently examining large OO code bases (in particular, the Scala 2 compiler) to identify patterns that can be factored out. Subsequently, we will develop a formalisation of the static and dynamic semantics of an extension to Scala, which we will implement as part of the Scala compiler.

5.1 Related Work

5.1.1 Typed Meta-Programming

The most prominent language to offer typed meta-programming is MetaML [38]. However, it only supports the manipulation of code, and not types. Metaphor [27] is an extension of C#, which supports typed meta-programming on code as well as types. Metaphor’s authors have shown [27] how to implement a generic encode using their language (although they do not explicitly call it datatype-generic programming).

5.1.2 Structural type systems

In languages with structural type systems, the problem of datatype-genericity has gone largely unnoticed [36], as they provide native support for viewing data structurally. Therefore, it seems like a good idea to see how their approach can be integrated in a typically nominal OO type system. Modula-3 already integrated structural types and nominal ones using branding, i.e., by tagging a structural type so that it becomes unique [4].

6. Conclusion

We have developed a small Scala library for datatype-generic programming, based on lightweight PolyP. We presented several variations on the core encoding and discussed their advantages and disadvantages:

• Our first attempt did not allow new operations to be added to the classes that implement the generic operations. Therefore, we turned to an encoding of virtual classes to alleviate this. We demonstrated how to retroactively extend virtual classes, while respecting modularity.

• Another important limitation, which we only addressed informally, is that user-defined classes must have two type parameters: one for the element type, and one to abstract over the type of the recursive substructures. As this is quite unusual, users of our library have to adapt their classes. We indicated how the language can be extended (beyond support for virtual classes) to make it possible to adapt these user-defined classes.

• Neither Scala nor Haskell supports equi-recursive types, but we showed how to hide this complication using Scala’s implicit conversions.

Currently, we are working on a truly object-oriented approach to datatype-genericity. It is based on multi-methods, structural types and, most importantly, typed meta-programming on types. We are studying a large code base in Scala to identify patterns that can be implemented reusably using DGP. At the same time, we are developing a Scala extension to support our ideas. To validate our approach, we will formalise our extension and implement it as part of the Scala compiler.

7. Acknowledgments

This research was supported by a grant to the first author from IWT-Vlaanderen (Institute for the Promotion of Innovation through Science and Technology in Flanders). We are grateful to Tom Schrijvers and Marko van Dooren, whose constructive criticism improved the presentation of this paper substantially. We would also like to thank the anonymous reviewers for their thorough, insightful comments. Finally, we gratefully acknowledge interesting conversations with the members of the Scala mailing list and with the attendees of the spring school on Datatype-Generic Programming 2006 in Nottingham.

References

[1] Eric E. Allen, Jonathan Bannet, and Robert Cartwright. A first-class approach to genericity. In Ron Crocker and Guy L. Steele Jr., editors, Proceedings of the 2003 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications, OOPSLA 2003, October 26-30, 2003, Anaheim, CA, USA, pages 96–114. ACM, 2003.

[2] Kim B. Bruce, Martin Odersky, and Philip Wadler. A statically safe alternative to virtual types. In Jul [22], pages 523–549.

[3] Luca Cardelli, editor. ECOOP 2003 - Object-Oriented Programming, 17th European Conference, Darmstadt, Germany, July 21-25, 2003, Proceedings, volume 2743 of Lecture Notes in Computer Science. Springer, 2003.

[4] Luca Cardelli, James E. Donahue, Lucille Glassman, Mick J. Jordan, Bill Kalsow, and Greg Nelson. Modula-3 language definition. SIGPLAN Notices, 27(8):15–42, 1992.

[5] Dave Clarke, Sophia Drossopoulou, James Noble, and Tobias Wrigstad. Tribe: More types for virtual classes. December 2005.

[6] Curtis Clifton, Todd Millstein, Gary T. Leavens, and Craig Chambers. MultiJava: Design rationale, compiler implementation, and applications. ACM Trans. Program. Lang. Syst., 28(3):517–575, 2006.


[7] Burak Emir, Andrew Kennedy, Claudio Russo, and Dachuan Yu. Variance and generalized constraints for C# generics. In Dave Thomas, editor, ECOOP 2006 - Object-Oriented Programming, 20th European Conference, Nantes, FR, July 3-7, 2006, Proceedings, Lecture Notes in Computer Science. Springer, 2006.

[8] Erik Ernst. gbeta – a Language with Virtual Attributes, Block Structure, and Propagating, Dynamic Inheritance. PhD thesis, Department of Computer Science, University of Aarhus, Århus, Denmark, 1999.

[9] Erik Ernst. Family polymorphism. In Jørgen Lindskov Knudsen, editor, ECOOP 2001 - Object-Oriented Programming, 15th European Conference, Budapest, Hungary, June 18-22, 2001, Proceedings, volume 2072 of Lecture Notes in Computer Science, pages 303–326. Springer, 2001.

[10] Erik Ernst, Klaus Ostermann, and William R. Cook. A virtual class calculus. In J. Gregory Morrisett and Simon L. Peyton Jones, editors, Proceedings of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2006, Charleston, South Carolina, USA, January 11-13, 2006, pages 270–282. ACM, 2006.

[11] Michael Ernst, Craig S. Kaplan, and Craig Chambers. Predicate dispatching: A unified theory of dispatch. In Jul [22], pages 186–211.

[12] Martin Odersky et al. An overview of the Scala programming language. Technical Report LAMP-REPORT-2006-001, EPFL Lausanne, Switzerland, 2006. Second Edition.

[13] Matthew Flatt, Shriram Krishnamurthi, and Matthias Felleisen. Classes and mixins. In MacQueen and Cardelli [25], pages 171–183.

[14] Jeremy Gibbons. Design patterns as higher-order datatype-generic programs. 2006. Submitted.

[15] Jeremy Gibbons and Bruno C. d. S. Oliveira. The essence of the iterator pattern. To appear in Mathematically-Structured Functional Programming, 2006.

[16] Ralf Hinze, Johan Jeuring, and Andres Löh. Type-indexed data types. Sci. Comput. Program., 51(1-2):117–151, 2004.

[17] Ralf Hinze, Andres Löh, and Bruno C. D. S. Oliveira. "Scrap your boilerplate" reloaded. In Masami Hagiya and Philip Wadler, editors, FLOPS, volume 3945 of Lecture Notes in Computer Science, pages 13–29. Springer, 2006.

[18] Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. Featherweight Java: a minimal core calculus for Java and GJ. ACM Trans. Program. Lang. Syst., 23(3):396–450, 2001.

[19] Patrik Jansson and Johan Jeuring. PolyP—A polytypic programming language extension. In Lee et al. [24], pages 470–482.

[20] Jaakko Järvi, Jeremiah Willcock, and Andrew Lumsdaine. Associated types and constraint propagation for mainstream object-oriented generics. In Johnson and Gabriel [21], pages 1–19.

[21] Ralph Johnson and Richard P. Gabriel, editors. Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2005, October 16-20, 2005, San Diego, CA, USA. ACM, 2005.

[22] Eric Jul, editor. ECOOP’98 - Object-Oriented Programming, 12th European Conference, Brussels, Belgium, July 20-24, 1998, Proceedings, volume 1445 of Lecture Notes in Computer Science. Springer, 1998.

[23] Andrew Kennedy and Don Syme. Design and implementation of generics for the .NET common language runtime. In Proceedings of the 2001 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2001, June 20-22, 2001, Snowbird, Utah, USA, pages 1–12. ACM, 2001.

[24] Peter Lee, Fritz Henglein, and Neil D. Jones, editors. Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 1997, Paris, France, January 15-17, 1997. ACM, 1997.

[25] David B. MacQueen and Luca Cardelli, editors. Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 1998, San Diego, CA, USA, January 19-21, 1998. ACM, 1998.

[26] Martin Odersky et al. The Scala homepage, 2006. http://scala.epfl.ch/.

[27] Gregory Neverov and Paul Roe. Towards a fully-reflective meta-programming language. In Proceedings of the Twenty-eighth Australasian conference on Computer Science, CRPIT ’38, pages 151–158, Darlinghurst, Australia, 2005. Australian Computer Society, Inc.

[28] Nathaniel Nystrom, Stephen Chong, and Andrew C. Myers. Scalable extensibility via nested inheritance. In John M. Vlissides and Douglas C. Schmidt, editors, OOPSLA, pages 99–115. ACM, 2004.

[29] Martin Odersky, Vincent Cremet, Christine Röckl, and Matthias Zenger. A nominal theory of objects with dependent types. In Cardelli [3], pages 201–224.

[30] Martin Odersky and Philip Wadler. Pizza into Java: Translating theory into practice. In Lee et al. [24], pages 146–159.

[31] Martin Odersky, Christoph Zenger, and Matthias Zenger. Colored local type inference. In Proceedings of the 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2001, London, UK, January 17-19, 2001, pages 41–53. ACM, 2001.

[32] Martin Odersky and Matthias Zenger. Scalable component abstractions. In Johnson and Gabriel [21], pages 41–57.

[33] Klaus Ostermann. Dynamically composable collaborations with delegation layers. In Boris Magnusson, editor, ECOOP 2002 - Object-Oriented Programming, 16th European Conference, Malaga, Spain, June 10-14, 2002, Proceedings, volume 2374 of Lecture Notes in Computer Science, pages 89–110. Springer, 2002.

[34] Benjamin C. Pierce and Martin Steffen. Higher-order subtyping. In IFIP Working Conference on Programming Concepts, Methods and Calculi (PROCOMET), 1994. Full version in Theoretical Computer Science, vol. 176, no. 1–2, pp. 235–282, 1997 (corrigendum in TCS vol. 184 (1997), p. 247).

[35] Benjamin C. Pierce and David N. Turner. Local type inference. In MacQueen and Cardelli [25], pages 252–265.

[36] Didier Rémy and Jérôme Vouillon. On the (un)reality of virtual types, March 2000. In preparation (http://gallium.inria.fr/~remy/work/virtual/).

[37] Nathanael Schärli, Stéphane Ducasse, Oscar Nierstrasz, and Andrew P. Black. Traits: Composable units of behaviour. In Cardelli [3], pages 248–274.

[38] Tim Sheard. Using MetaML: A staged programming language. In Advanced Functional Programming, pages 207–239, 1998.

[39] Matthias Zenger and Martin Odersky. Independently extensible solutions to the expression problem. Technical Report IC/2004/33, EPFL Lausanne, Switzerland, 2004.
