scala for generic · pdf fileunder consideration for publication in j. functional programming...

48
Under consideration for publication in J. Functional Programming 1 Scala for Generic Programmers Comparing Haskell and Scala Support for Generic Programming BRUNO C. D. S. OLIVEIRA ROSAEC Center, Seoul National University, 599 Gwanak-ro, Gwanak-gu, Seoul 151-744, South Korea (e-mail: http://ropas.snu.ac.kr/ bruno/) JEREMY GIBBONS Oxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, UK (e-mail: http://www.comlab.ox.ac.uk/jeremy.gibbons/) Abstract Datatype-generic programming involves parametrization of programs by the shape of data, in the form of type constructors such as ‘list of’. Most approaches to datatype-generic programming are developed in pure functional programming languages such as Haskell. We argue that the functional object-oriented language Scala is in many ways a better choice. Not only does Scala provide equiv- alents of all the necessary functional programming features (such as parametric polymorphism, higher-order functions, higher-kinded type operations, and type- and constructor-classes), but it also provides the most useful features of object-oriented languages (such as subtyping, overriding, traditional single inheritance, and multiple inheritance in the form of traits). Common Haskell tech- niques for datatype-generic programming can be conveniently replicated in Scala, whereas the extra expressivity provides some important additional benefits in terms of extensibility and reuse. We illustrate this by comparing two simple approaches in Haskell, pointing out their limitations and showing how equivalent approaches in Scala address some of these limitations. Finally, we present three case studies on how to implement in Scala real datatype-generic programming approaches from the literature: Hinze’s ‘Generics for the Masses’, L¨ ammel and Peyton Jones’s ‘Scrap your Boilerplate with Class’, and Gibbons’s ‘Origami Programming’. 1 Introduction Datatype-generic programming (DGP) is about writing programs that are parametrized by a datatype, such as lists or trees. This is different from parametric polymorphism, or ‘gener- ics’ as the term is used by most object-oriented programmers: parametric polymorphism abstracts from the ‘integers’ in ‘lists of integers’, whereas DGP abstracts from the ‘lists of’. There is a large and growing collection of techniques for writing datatype-generic pro- grams. Much of the early research on DGP relied on special-purpose languages or lan- guage extensions such as Charity (Cockett & Fukushima, 1992), PolyP (Jansson, 2000), and Generic Haskell (Hinze & Jeuring, 2002). With time, research has shifted towards more lightweight approaches, based on language extensions such as Scrap your Boiler- plate (L¨ ammel & Peyton Jones, 2003) and Template Haskell (Sheard & Peyton Jones, 2002); more recently, DGP techniques have been encapsulated in libraries for existing general-purpose languages, such as Generics for the Masses (Hinze, 2006) for Haskell and

Upload: doantuong

Post on 09-Mar-2018

241 views

Category:

Documents


1 download

TRANSCRIPT

Under consideration for publication in J. Functional Programming 1

Scala for Generic ProgrammersComparing Haskell and Scala Support for Generic Programming

BRUNO C. D. S. OLIVEIRAROSAEC Center, Seoul National University, 599 Gwanak-ro, Gwanak-gu, Seoul 151-744, South Korea

(e-mail: http://ropas.snu.ac.kr/ bruno/ )

JEREMY GIBBONSOxford University Computing Laboratory, Wolfson Building, Parks Road, Oxford OX1 3QD, UK

(e-mail: http://www.comlab.ox.ac.uk/jeremy.gibbons/ )

Abstract

Datatype-generic programming involves parametrization of programs by the shape of data, in theform of type constructors such as ‘list of’. Most approachesto datatype-generic programming aredeveloped in pure functional programming languages such asHaskell. We argue that the functionalobject-oriented language Scala is in many ways a better choice. Not only does Scala provide equiv-alents of all the necessary functional programming features (such as parametric polymorphism,higher-order functions, higher-kinded type operations, and type- and constructor-classes), but italso provides the most useful features of object-oriented languages (such as subtyping, overriding,traditional single inheritance, and multiple inheritancein the form of traits). Common Haskell tech-niques for datatype-generic programming can be conveniently replicated in Scala, whereas the extraexpressivity provides some important additional benefits in terms of extensibility and reuse. Weillustrate this by comparing two simple approaches in Haskell, pointing out their limitations andshowing how equivalent approaches in Scala address some of these limitations. Finally, we presentthree case studies on how to implement in Scala real datatype-generic programming approaches fromthe literature: Hinze’s ‘Generics for the Masses’, Lammeland Peyton Jones’s ‘Scrap your Boilerplatewith Class’, and Gibbons’s ‘Origami Programming’.

1 Introduction

Datatype-generic programming (DGP) is about writing programs that are parametrized bya datatype, such as lists or trees. This is different from parametric polymorphism, or ‘gener-ics’ as the term is used by most object-oriented programmers: parametric polymorphismabstracts from the ‘integers’ in ‘lists of integers’, whereas DGP abstracts from the ‘lists of’.

There is a large and growing collection of techniques for writing datatype-generic pro-grams. Much of the early research on DGP relied on special-purpose languages or lan-guage extensions such as Charity (Cockett & Fukushima, 1992), PolyP (Jansson, 2000),and Generic Haskell (Hinze & Jeuring, 2002). With time, research has shifted towardsmore lightweight approaches, based on language extensions such as Scrap yourBoiler-plate (Lammel & Peyton Jones, 2003) and Template Haskell (Sheard & Peyton Jones,2002); more recently, DGP techniques have been encapsulated in libraries for existinggeneral-purpose languages, such as Generics for the Masses(Hinze, 2006) for Haskell and

2 Bruno Oliveira and Jeremy Gibbons

Adaptive Object-Oriented Programming (Lieberherr, 1996)for C++. One key advantage ofthe lightweight approaches is that DGP becomes more accessible to potential users, sinceno new tool or compiler is required in order to enjoy its benefits. Indeed, the use of librariesor simple language extensions rather than completely new languages has greatly promotedthe adoption of DGP.

Despite the rather wide variety of host languages involved in the techniques listed above,the casual observer might be forgiven for concluding, from the wealth of proposals forlightweight generic programming in Haskell (Cheney & Hinze, 2002; Hinze, 2006; Lammel& Peyton Jones, 2005; Oliveiraet al., 2006; Hinzeet al., 2006; Weirich, 2006; Hinze &Loh, 2007; Mitchell & Runciman, 2007; Brown & Sampson, 2009), that ‘Haskell is theprogramming language of choice for discriminating datatype-generic programmers’. Ourpurpose in this paper is to argue to the contrary; we believe that although Haskell is ‘a finetool for many datatype-generic applications’, it is not necessarily the best choice.

In particular, we argue that the discriminating datatype-generic programmer ought seri-ously to consider using Scala, a relatively recent languageproviding a smooth integrationof the functional and object-oriented paradigms. Scala offers equivalents for most famil-iar features cherished by datatype-generic Haskell programmers, suchparametric poly-morphism, higher-order functions, higher-kinded types, andtype-andconstructor-classes.(Two significant missing features arelazy evaluationandhigher-ranked types.) In addition,it offers some of the most useful features of object-oriented programming languages, suchassubtyping, overriding, and both single and a form of multipleinheritance(via ‘traits’).We show not only that Haskell techniques for DGP can be conveniently replicated inScala, but also that the extra expressivity provides important additional benefits in termsof extensibility and reuse. Specifically, intricate constructions are often needed to bend theimplicit dictionaries in Haskell’s class system to DGP purposes—these convolutions couldmostly be avoided if one had first-class dictionaries. Scala’s traits mechanism providessuch a facility, including the ability to pass them implicitly where appropriate.

We are not the first to consider DGP in Scala: Moorset al.(2006) presented a translationinto Scala of a Haskell library of ‘origami operators’ (Gibbons, 2006); we discuss thistranslation in depth in Section 9. And of course, it is not really surprising that Scalashould turn out effectively to subsume Haskell, since its design is substantially inspiredby functional programming.

We are interested in finding the limitations of the general-purpose mechanisms, in orderto point out their weaknesses and to promote their improvement. The aim here is not tocompare particular DGP libraries, as was done by Rodriguezet al. (2008), but ratherto consider the basic mechanisms used to implement those libraries. We claim that thelanguage mechanisms traditionally used for implementing DGP libraries, namelytypeclasses(Hall et al., 1996) andgeneralized algebraic datatypes(GADTs) (Peyton Joneset al., 2006), each have their limitations, and that it might be better to exploit a differentmechanism incorporating the advantages of both. This paperexplores Scala’s object systemas such an alternative mechanism.

We feel that our main contribution is as a call to datatype-generic programmers to lookbeyond Haskell, and particularly to look at Scala. Not only can Scala be used to expresscurrent approaches to DGP; in some ways—in particular, withits open datatypes, inher-itance, andimplicits mechanism—it improves upon Haskell. Some of those advantages

Journal of Functional Programming 3

derive from Scala’s mixed-paradigm nature, and so do not translate back into Haskell;but others (such as case classes and anonymous case analyses, as we shall see) would fitperfectly well into Haskell. We emphasize that we are not arguing that Scala is universallysuperior to Haskell. Indeed, as we shall see, Scala too has some limitations. Instead, ourpurpose is to promote the migration of good ideas from Scala to Haskell and vice versa, soas to improve support for DGP in both languages.

As a secondary contribution, we show that Scala is more of a functional programminglanguage than is typically appreciated. Scala tends to be seen primarily as an object-oriented language that happens to have some functional features, and so potential usersfeel that they have to use it in an object-oriented way. For example, Moorset al. (2006)claimed to be ‘staying as close to the original work as possible’ in their translation of theorigami operators, but as we show in Section 9 they still ended up less functional than theymight have done. Scala is also a functional programming language that happens to haveobject-oriented features; indeed, it offers the best of both worlds, and this paper serves alsoas a tutorial in exploiting Scala as a multi-paradigm language.

The rest of this paper is structured as follows. Section 2 sets the scene, by reviewing twostraightforward approaches to DGP in Haskell—using representation datatypes and typeclasses respectively—and pointing out their limitations.Sections 3 and 4 introduce thebasics of Scala, and those more advanced features of its typeand class system on whichwe depend. Our contribution starts in Sections 5 and 6, whichshow how to implementin Scala the two approaches presented in Haskell in Section 2. After that, three casestudies of the translation of existing DGP libraries into Scala are presented: Section 7discusses an implementation of Hinze’sGenerics for the Massesapproach (Hinze, 2006);Section 8 shows a Scala implementation ofScrap your Boilerplate with Class(Lammel &Peyton Jones, 2005); and Section 9 presents a more functional alternative to Moorset al.’sencoding (2006) of theOrigami Programmingoperators (Gibbons, 2003; Gibbons, 2006)in Scala. Finally, Section 10 compares Haskell and Scala support for DGP, and brieflydiscusses some of the key ideas of the paper, and Section 11 concludes. Scala code for theexamples is available online (Oliveira, 2009b).

2 Some Limitations of Haskell for Datatype-Generic Programming

To conduct our experiments, we consider two very simple and straightforward librariesfor generic programming, using representation datatypes and type classes respectively.The purpose here is to discuss the limitations of the two mechanisms. While there arevarious clever tricks and workarounds for dealing with these limitations, better linguisticmechanisms would provide support more naturally, and do away with the need for thesetricks in the first place. We do not intend to have a debate about whether one structuralview of datatypes is better or worse than another; the issueswe identify are pervasive tomost generic programming libraries.

2.1 Generic programming with representation datatypes

A very natural style in which to write a generic programming library is to base it on adatatype of type representations (Cheney & Hinze, 2002; Hinze et al., 2006; Weirich,

4 Bruno Oliveira and Jeremy Gibbons

2006), leading to a structural approach similar to that of Generic Haskell. Cheney andHinze’s lightweight implementation of generics and dynamics(LIGD) (Cheney & Hinze,2002) provides the earliest example of such an approach, showing how to do a kind ofgeneric programming using only the standard Hindley-Milner type system extended withexistential datatypes. The key idea is to use a parametrizeddatatype, with the actual pa-rameter being (a representation of) the type index; constraints on the parameter enforceconsistency between the behaviour and the type index. SinceCheney and Hinze’s proposal,some Haskell implementations have been extended with GADTs, which provide additionalconvenience that existential datatypes alone lack; we use GADTs in this section to illustratethe approach.

Here is a representation of a family of datatypes based on sums of products:

data Unit = Unitdata Sum a b= Inl a | Inr bdata Prod a b= Prod a b

data Rep twhereRUnit ::Rep UnitRInt ::Rep IntRChar::Rep CharRSum::Rep a→ Rep b→ Rep(Sum a b)RProd::Rep a→ Rep b→ Rep(Prod a b)

The typesUnit, Sum, andProd represent, respectively, the unit type and the binary sum andbinary product type constructors. The datatypeRep tprovides the structural representationof a typet as a sum of products built from the primitive typesUnit, Int, andChar. Forsimplicity of presentation, the treatment of isomorphisms, which allows the application ofgeneric functions to values isomorphic to a sum of products,is omitted here; for the fulldetails, see elsewhere (Cheney & Hinze, 2002; Weirich, 2006).

Generic functions are defined by case analysis on the datatype of type representations.For example, here is a definition of generic equality:

equals::∀t. Rep t→ t → t → Boolequals RUnit = Trueequals RInt t1 t2 = (t1 ≡ t2)equals RChar t1 t2 = (t1 ≡ t2)equals(RSum ra rb) t1 t2 = case (t1, t2) of

(Inl x, Inl y) → equals ra x y(Inr x, Inr y) → equals rb x y

→ Falseequals(RProd ra rb) t1 t2 = case (t1, t2) of

(Prod x y,Prod x′ y′) → equals ra x x′ ∧ equals rb y y′

Equality is vacuous at the unit type, represented byRUnit, since there is exactly one valueof that type. When the type representation isRInt, the type constraint ensures that the twovalues of typet being compared are indeed integers, and so the primitive comparison onintegers is used; and similarly forRCharand characters. For binary sums, the constructor

Journal of Functional Programming 5

class Total awheretotal ::a→ Int

instance Total Unit wheretotal = const0

instance Total Int wheretotal = id

instance (Total a,Total b) ⇒ Total (Prod a b) wheretotal (Prod x y) = total x+ total y

instance (Total a,Total b) ⇒ Total (Sum a b) wheretotal (Inl x) = total xtotal (Inr y) = total y

instance Total a⇒ Total [a] wheretotal = total◦ fromList

fromList:: [a] → Sum Unit(Prod a[a])fromList[ ] = Inl UnitfromList(x :xs) = Inr (Prod x xs)

Fig. 1. A generic ‘sum’ function, using type classes.

RSumof the representation is applied to representations of the two summands, and theserepresentations are used in the recursive calls; similarlyfor products andRProd.

2.2 Generic programming with type classes

An alternative to using datatypes for type representationsis to use type classes (Lammel &Peyton Jones, 2003; Hinze, 2006; Mitchell & Runciman, 2007;Brown & Sampson, 2009),since both techniques can be used to definetype-indexed functions, albeit with slightlydifferent properties (Oliveira & Gibbons, 2005). Figure 1 presents a simple definition of ageneric ‘sum’ function using type classes. The classTotal ahas a methodtotal that takes anargument of typea and returns an integer result. There are instances for each of the sum-of-products structural types: for the unit type, zero is returned; for binary products, the resultson the two components are added; for binary sums, the function is applied recursively tothe appropriate case; for integers, the integer itself is returned. Instead of specially craftingan instance specific to lists, the argument is converted intosum-of-products form usingthe functionfromList and then subjected to structural cases oftotal. The same structuralapproach can be taken for other datatypes, avoiding the needfor specific definitions forthose datatypes.

2.3 Limitations

The two approaches presented in Sections 2.1 and 2.2 are relatively simple to understand.Unfortunately, they are also rather simple-minded, and suffer from some limitations interms of both convenience and expressiveness. More realistic generic programming li-braries address some of these limitations, but usually at the cost of comprehensibility. Wediscuss these limitations next, together with some of the attempts to address them.

6 Bruno Oliveira and Jeremy Gibbons

2.3.1 Convenience and readability

Two important considerations of convenience concerning a DGP library are how easy it isto define generic functions, and how easy it is to apply them. Defining generic functionswith a datatype of type representations is usually straightforward, since all that is needed isto use pattern matching on the type representation to introduce a definition by cases. Typeclasses impose some overhead, since each instance declaration requires some additionalcode. Furthermore, the use of datatypes and pattern matching is arguably more naturalthan type classes and dispatching. Nonetheless, the overhead imposed by type classes istolerable.

Using a generic function based on type representations requires that the correspondingvalue of the type representation is constructed. For example, to compare two pairs ofintegers, one needs a third argument representing the type ‘pairs of integers’:

testDT= equals(RProd RInt RInt) (Prod3 4) (Prod 4 4)

In contrast, with the type class approach, the explicit construction of the type representationis not necessary:

testTC= total (Prod3 4)

Not having to explicitly construct type representations isan advantage of the type classapproach, and provides additional convenience over an approach based on representationdatatypes.

The reality. In existing proposals for DGP libraries using datatypes of type representa-tions, it is common to use type classes to automatically generate the values of the typerepresentations. There are even some proposals, such as Hinze’sGenerics for the Masses(GM) approach (Hinze, 2006), which do not directly use a datatype of type representations,but encode one using type classes, and also use the same mechanism to generate the valuesof the encoded type representations. The basic idea is as follows:

class Representable awhererep:: Rep a

instance Representable Intwhererep= RInt

instance (Representable a,Representable b) ⇒ Representable(Prod a b) whererep= RProd rep rep

(Here, an LIGD-like approach is used for illustration.) An equality function that does notneed an explicit value for the type representation is definable as follows:

eq::Representable a⇒ a→ a→ Booleq= equals rep

While this design does bring some of the convenience of type classes into a datatype-basedapproach, the fact is thattwodifferent functions for equality are needed: the functionequalsdefines the structural generic function, and theeqfunction provides a convenient interfaceto this generic function. As we shall see in Section 2.3.2, sometimes it would be handy to

Journal of Functional Programming 7

have functions that could take an optional argument, leaving the job of generating the valueto the compiler when this argument is omitted. Having two differently-named functions forthe explicit and implicit cases is awkward.

2.3.2 Coexistence of implicit and explicit arguments

Generic Haskell provides a simple mechanism for precisely controlling which case getsapplied. For instance, using a Generic Haskell generic function total〈T〉 similar to the func-tion presented in Figure 1, it is possible to employlocal redefinition(Loh, 2004, Chapter 8)to overridea case in a particular use of the generic function. With localredefinitions it iseasy to have one variation that counts the values in a list:

let total〈a〉 = const1 in total〈[a]〉[1. .10]

and another sums the (integer) values in that list:

let total〈Int〉 = id in total〈[ Int]〉[1. .10]

The total function in Figure 1 sums the integers in a structure. In order to provide alocal redefinition to count rather than sum the elements, onemight attempt to provide thefollowing alternative instance for integers:

instance Total Intwheretotal = const1

However, this instance overlaps with the one already given in Figure 1. This leads to anambiguity in instance selection; Haskell provides no mechanism for resolving the am-biguity, and disallows the coexistence of such instances. (What is needed here is someexplicit mechanism for dictionaries, rather than the implicit mechanism provided by theclass system. GHC’s ‘overlapping instances’ flag doesn’t help, because the instances areduplicated; it is useful only when one instance is more specific than the other, when itallows the compiler to select the more specific instance.)

The reality. Very few generic programming libraries in Haskell provide support for localredefinitions. In fact, we believe that currently only GM andits extensible and modulargenerics for the masses(EMGM) extension (Oliveiraet al., 2006) provide some supportfor this feature, and then only partially—although there isan alternative encoding usinga proposed extension to Haskell (Hinze & Loh, 2009). As Hinze (2006) points out, inthe GM approach, a generic counter function can be instantiated to behave like a sum-ming function or a size function. However, this is significantly less convenient to use thanthe Generic Haskell solution, since two different functions are needed: one for when nolocal redefinitions are used, and another to explicitly passthe type representation argu-ment with the local redefinition. Worse, first-class genericfunctions induce an exponentialgrowth in the number of combinations. For example, in the case of a generic function likeeverywhere(Lammel & Peyton Jones, 2003), which in turn takes a genericfunction asan argument, four different variations would be needed—foreach combination of allowingand disallowing local redefinitions foreverywhereitself and for the generic function passedas an argument.

8 Bruno Oliveira and Jeremy Gibbons

In a generic programming approach like the one shown in Figure 1, it is simply not pos-sible to have local redefinitions without some forward planning. There have been some pro-posals to extend Haskell with a mechanism for choosing a particularnamed instance(Kahl& Scheffczyk, 2001; Dijkstra & Swierstra, 2005), but these extensions are not widelyimplemented. It is possible to use a simple trick to emulate named instances, as shown forexample by Loh (2004); however, this still entails significant rewriting, planning ahead,and some advanced Haskell extensions.

2.3.3 Extensibility

Generic functions are useful because they work ‘out of the box’ for a newly introduceddatatype. However, it is sometimes desirable to define a specific (non-generic) behaviourfor the generic function on a particular datatype. For this to happen, the generic functionneeds to beextensible, allowing the definition of new cases for particular datatypes. Froma generic programming point of view, extensible generic functions are essential for thedesign of modular generic programming libraries (Hinze & Peyton Jones, 2000; Lammel& Peyton Jones, 2005; Hinze, 2006; Oliveiraet al., 2006). For example, abstract datatypessuch as sets are often represented using a standard algebraic datatype like lists or trees, buta generic function based solely on the algebraic structure of the representation probablydoes not provide an appropriate implementation on the abstract type. Consider the case ofequality; while structural equality is the right thing to dofor most datatypes, it is wrong foran abstract datatype of sets represented as lists.

In the simple DGP approach presented in Section 2.1, it is notpossible to extend theequality generic function in a modular way. In order to add a new case for sets, one mustadd a new constructor to theRepdatatype, and provide a special case for equality on sets,as follows:

newtype Set a= Set[a]

data Rep twhere. . .

RSet:: Rep a→ Rep(Set a)

equals::∀t. Rep t→ t → t → Bool. . .

equals(RSet a) = . . .

On the other hand, in the type-class based approach, the generic function can be easilyextended with new cases; all that is needed is to create a new instance:

instance Total(Set a) wheretotal (Set xs) = . . .

In essence, approaches based on datatypes of representations usually do not supportmodular extensions, whereas those based on type classes do.

The reality. In recent proposals for DGP libraries, the trend is to use type classes: they canbe extended more easily than datatypes, and also they make the use of generic functions

Journal of Functional Programming 9

quite convenient (see Section 2.1). However, even using type classes,extensibilitycan stillbe problematic—especially in combination withfirst-class generic functions, as we shallsee in Section 2.3.4. This is the case, for example, for the original GM approach, whichdoes not allow extensible generic functions. To address these extensibility problems, anumber of clever approaches have been proposed. RepLib (Weirich, 2006) uses a mix ofdatatypes and type classes to allow extensible generic functions; the key idea is to use astandard generic function defined on a datatype of type representations, and use that genericfunction as the default for another (type-class-overloaded) function that can be extendedwith new ad-hoc cases. While the approach achieves its goal of supporting an extensiblegeneric programming library, it does so at the loss of some usability and understandability:it relies on several non-standard extensions, and it requires the programmer to write genericfunctions in two different styles, namely using datatypes and type classes. Ultimately, aprogrammer needs to understand quite a bit of the mechanics of the generic programminglibrary and some advanced Haskell features to use the approach effectively. The EMGMapproach (Oliveiraet al., 2006) addresses the extensibility limitations of the original GMproposal requiring only a common extension to Haskell 98, namely, multiple-parametertype classes; however, writing generic functions in EMGM (and in the original GM) is notas direct as using a datatype of type representations. The original Scrap your Boilerplate(SyB) approach (Lammel & Peyton Jones, 2003) approach doesnot support extensiblegeneric functions; to address this problem, an alternativeimplementation (Lammel & Pey-ton Jones, 2005) of SyB using type classes has been proposed,but this approach uses manynon-standard extensions and tricks.

A different solution (Loh & Hinze, 2006) to the extensibility problem consists of ex-tending Haskell with open datatypes and open functions. This would have some importantadvantages, especially from a usability point of view, since the natural style of writing ageneric function using pattern matching on the type representation would be preserved.However, to date that extension is not supported by any compiler.

2.3.4 First-class generic functions and generic function abstraction

The SyB approach has shown the utility of first-class genericfunctions for generic traver-sals and queries. With a datatype of type representations, it is straightforward to write suchfunctions (Hinze, 2003). For example, consider the function everywhere:

everywhere:: (∀b. Rep b→ b→ b) → Rep a→ a→ a

It takes as an argument a generic function that transforms a value of typeb into anothervalue of the same type; it also takes a representation of sometype a and a value of thattype, and returns a value of the same type. In this approach, first-class generic functionsare quite simple and natural. However, with the simple type-class approach, it is far fromobvious how to write the type ofeverywhere. The key problem is thateverywhereneedsto be applicable toanygeneric function as its first argument. In pseudo-code, the intendedtype is as follows:

everywhere:: Everywhere a⇒ (∀b. g b⇒ b→ b) → a→ a

10 Bruno Oliveira and Jeremy Gibbons

That is,everywhereshould take a generic function defined in an arbitrary type classg asthe first argument. However, Haskell does not support type class abstraction, and the typeclass constraintg b⇒ . . . is not valid.

In summary, while a datatype-based approach trivially supports first-class generic func-tions, a type-class-based approach stumbles over the fact that type classes cannot be ab-stracted.

The reality. There are some generic programming approaches based on typeclasses thatdo not have a problem with first-class generic functions. Interestingly enough, these showa strong correlation with the approaches that have a problemwith extensibility. In otherwords, there seems to be a conspicuous relationship betweenextensibility and first-classgeneric functions: having one of these features makes the other feature harder to achieve(in Haskell, at least).

Using a technique proposed by Hughes (1999), it is possible to emulate type-classabstraction in Haskell using only existing extensions. This technique has been used inthe ‘SyB with Class’ approach (Lammel & Peyton Jones, 2005)to allow extensible higher-order generic functions (see Section 8.1 for more details);a similar technique is used inRepLib (Weirich, 2006).

2.3.5 Reuse of generic functions

Generic Haskell providesdefault cases, a mechanism that allows the reuse of genericfunctions (Loh, 2004, Chapter 14). The motivation for thisis that often minor variationsof a generic function are written over and over again. For example, consider collectingvariables in some datatype of abstract syntax trees. Instead of defining a function genericin the datatype but specific to the problem of collecting variables, a more general functionfor collecting values (Loh, 2004, Chapter 9) could be reused, overriding the case for thevariable type. In Generic Haskell, this idea can be expressed as follows:

newtype Var = V String

varcollectextends collectvarcollect〈Var〉(V x) = [x]

Neither the datatype nor the type class solutions allow for this kind of reusability; withoutanticipation, it is necessary to duplicate code in creatinga variation of the original function.

The reality. As far as we are aware, this kind of reuse is not addressed by any genericprogramming library in existence, except by an early approach (Lammelet al., 2000) basedon Haskell records that achieves reuse between algebras by exploiting record updates. Un-fortunately, although the current implementation of type classes in most Haskell compilersis based on records, the updating feature is not available for type classes.

Reuse of generic functions is akin to inheritance, and it is known how to encode in-heritance in functional languages (Cook, 1989). So, in theory, it should be possible toadapt existing generic programming libraries to achieve this kind of reuse via inheritance;however, any encoding introduces its own cost in terms of usability. Another alternativeis to create a more general generic function that is parametrized by functions covering

Journal of Functional Programming 11

the different cases; but this requires anticipation, and makes the interface of the genericfunction more complex.

2.3.6 Exotic types

DGP techniques are applicable to some exotic types, such as datatypes with higher-kindedtype arguments and nested datatypes (Hinze, 2000). One example of the former is the typeof generalized rose trees:

data GRose f a= GFork a(f (GRose f a))

The type constructorGRoseis parametrized by a higher-kinded argumentf . Datatype-based approaches to DGP comfortably support type representations for such types, andcorresponding cases for generic functions:

data Rep twhere. . .

RGRose:: (∀a. Rep a→ Rep(f a)) → Rep a→ Rep(GRose f a)

equals::∀t. Rep t→ t → t → Bool. . .

equals(RGRose f a) (GFork x xs) (GFork y ys) =

equals a x y∧ equals(f (RGRose f a)) xs ys

One small inconvenience with the datatype-based approach is that it is not possible to usenested patterns such asRGRose RList a: the first argument of theRGRoseconstructor is nota valid pattern, since it is not fully applied to the right number of arguments. Nevertheless,there is a workaround that allows emulation of such nested patterns (Hinze & Loh, 2009).

With type-class-based approaches, support for datatypes like GRosedoes not work sosmoothly. Recent versions of some Haskell compilers support recursive dictionaries in typeclasses, and accept the following code:

instance (Total a,Total(f (GRose f a))) ⇒ Total (GRose f a) wheretotal (GFork x xs) = total x+ total xs

but this requires allowing undecidable type class instances. (Indeed, in older versions ofsome compilers, it used to be the case that such an instance would lead to non-terminationof the type checker (Hinze & Peyton Jones, 2000).)

A theoretically more appealing solution, suggested by Hinze and Peyton Jones (2000),would be to allowpolymorphic predicatesin the constraints. With such a feature, thefollowing instance would be valid:

instance (∀a. Total a⇒ Total (f a),Total a) ⇒ Total(GRose f a) wheretotal (GFork x xs) = total x+ total xs

The reality. As shown by the recent comparison of generic programming libraries inHaskell (Rodriguezet al., 2008), exotic features such as datatypes with higher-kinded typearguments and nested datatypes are not a problem for most approaches that use a datatype

12 Bruno Oliveira and Jeremy Gibbons

Datatypes Type classes

Convenience:Defining generic functions G#

Using generic functions G#

Implicit explicit parametrization #1

#2

Extensibility #

First-class generic functions #

Reuse of generic functions # #

Exotic types G#3

Fig. 2. Evaluation of the Haskell mechanisms for DGP. Key: =‘good’, G#=‘sufficient’,#=‘poor’support. Notes: 1) datatypes only allow explicit parametrization; 2) type classes only allow implicitparametrization; 3) datatypes with higher-kinded type arguments can be accommodated usingundecidable instances.

of type representations. However, none of the approaches based on type classes is fullycapable of handling such exotic types. The limitations of type classes are to blame.

2.4 Discussion

Type classes and datatypes provide two alternative mechanisms for DGP, but neither mech-anism is clearly superior to the other. Figure 2 shows the trade-offs between the twomechanisms. Datatypes provide a very natural and convenient way to define new genericfunctions, but they also require every value to be explicitly constructed by the programmer;this makes generic functions harder to use. Datatypes make it easy to support first-classgeneric functions, and they can be used to construct type representations for higher-kindedtypes; however, achieving extensibility is difficult. Typeclasses are convenient when itcomes to using generic functions, since dictionaries are automatically inferred by thecompiler, but they provide a somewhat less natural syntax for defining generic functions.It is easy to extend generic functions with new cases, but hard to support first-class genericfunctions. Exotica such as higher-kinded types and nested datatypes pose a challenge to atype-class-based implementation. Values of datatypes canonly be passed explicitly, whiletype class dictionaries can only be passed implicitly. Neither mechanism provides an easyway to reuse generic functions.

The reality is that in Haskell it is usually possible to work around the limitations ofthe two mechanisms in one way or another, but doing so typically requires clever tricksor solutions that hinder usability and comprehensibility.We feel that this a symptom ofinappropriate language features, and we claim that with a different mechanism, genericlibraries could be defined more naturally and used more conveniently.

3 Functional Programming in Scala

Scala is a strongly typed programming language that combines object-oriented and func-tional programming features. Although inspired by recent research, Scala is not just aresearch language; it is also aimed at industrial usage: a key design goal of Scala is that itshould be easy to interoperate with mainstream languages like Java and C#, making theirmany libraries readily available to Scala programmers. Theuser base of Scala is already

Journal of Functional Programming 13

quite significant, with the compiler being actively developed and maintained. For a morecomplete introduction to and description of Scala, see (Odersky et al., 2008; Odersky,2006a; Odersky, 2007a; Odersky, 2007b; Schinz, 2007).

3.1 Definitions and values

Functions are introduced using thedef keyword. For example, the squaring function onDoubles could be written:

def square(x :Double) :Double= x∗xScala distinguishes between definitions and values. In a definition def x= e, the expressione will not be evaluated until the value ofx is needed. Scala also offers a value definitionval x = e, in which the right-hand sidee is evaluated at the point of definition. However,only definitions can take parameters; values must be constants (although these constantscan be functions).

3.2 First-class functions

Functions in Scala are first-class values, sohigher-order functionsare supported. Forexample, to define the functiontwice that applies a given functionf twice to its argumentx, we could write:

def twice(f : Int ⇒ Int,x : Int) : Int = f (f (x))Scala supportsanonymous functions. For instance, to define a function that raises an integerto the fourth power, one could use the functiontwicetogether with an anonymous function:

def power(x : Int) : Int = twice((y : Int) ⇒ y∗y,x)The first argument of the functiontwice is the anonymous function that takes an integeryand returnsy∗y.

Scala also supportscurrying. To declare a curried version oftwice, one can write:def curryTwice(f : Int ⇒ Int) (x : Int) : Int = f (f (x))

3.3 Parametric polymorphism

Like Haskell and ML, and more recently Java and C#, Scala supportsparametric polymor-phism(known asgenericsin the object-oriented world). For example, function compositioncan be defined as follows:

def comp[a,b,c] (f : b⇒ c) (g :a⇒ b) (x : a) :c = f (g (x))The functioncompis parametrically polymorphic in the three typesa,b,c of the initial,intermediate and final values. Note that these type variables have to be explicitly quantified.

3.4 Call-by-name arguments

Function arguments are, by default, passedby value, being evaluated at the point of func-tion application. This gives Scala a strict functional programming flavour. However, onecan also pass argumentsby name, by prefixing the type of the formal parameter with ‘⇒’;the argument is then evaluated at each use within the function definition. This can be usedto emulate lazy functional programming; although multipleuses do not share evaluation,

14 Bruno Oliveira and Jeremy Gibbons

it is still useful, for example, for defining new control structures. Parser combinators are agood example of the use of laziness: the combinatorThentries to apply a parserp, and ifthat parser succeeds, applies another parserq to the remainder of the input:

def Then(p :Parser) (q :⇒Parser) :Parser= . . .

Here, the second parserq is passed by name: only ifq is needed will it be evaluated.

3.5 Type inference

The design goal of interoperability with languages like Java requires compatibility betweentype systems. In particular, this means that Scala needs to support subtyping and (name-)overloaded definitions such as:

def add(x : Int) : Unit = . . .

def add(x :String) : Unit = . . .

This makes type inference more difficult than in languages like Haskell. Nevertheless,Scala does support a form oflocal type inference(Oderskyet al., 2001). Thus, it is possible,most of the time, to infer the return type of a definition and the type of a lambda-boundvariable. For example, one may write:

def power(x : Int) = twice(y⇒ y∗y,x)and both the return type and the type of the lambda variabley will be inferred.

3.6 Sums, products, and lists

The Scala libraries provide implementations of sums, products, and lists. For sum types,the type constructorEither is used. Following Haskell conventions, this type has twoconstructorsLeft andRight, injections into the sum. For example,

val leftVal :Either[Int,String] = Left (1)

val rightVal:Either[Int,String] = Right("c" )

define two values of the typeEither[Int,String]. One can use pattern matching to decon-struct a value of a sum type, as discussed in Section 4.2, but amore compact notation isgiven by thefold of theEither type. For example,

def stringVal(x : Either[Int,String]) = x.fold (y⇒ y.toString(),y⇒ y)defines a function that takes a value of typeEither[Int,String] and returns a string repre-senting the value contained in the sum.

Products can be defined with the usual tuple notation; for example:val prodVal: (Int,Char) = (3, 'c' )

To extract the components of a tuple, Scala provides methodswith names consisting of anunderscore followed by the component number:

val fstVal = prodVal. 1val sndVal= prodVal. 2

Finally, we can use the syntaxList (a1, . . . ,an) to construct a list of sizen with the elementsai for i ∈ [1..n]. For example:

val list = List (1,2,3)

builds the list with 1, 2 and 3 as elements.

Journal of Functional Programming 15

4 Object-Oriented Programming in Scala

Scala has a rich object system, including object-oriented constructs such as concrete andabstract classes, subtyping, and inheritance familiar from mainstream languages like Javaor C#. Scala also incorporates some less commonly known concepts; in particular, thereis a syntactic notion ofobject, and interfaces are replaced by the more general notion oftraits (Scharliet al., 2003), which can be composed using a form of mixin composition.Furthermore, Scala introduces the notion ofcase classes, instances of which can be de-composed using case analysis and pattern matching.

This section introduces a subset of the full Scala object system, sufficient to model allthe programs in this paper.

4.1 Traits and mixin composition

Instead of interfaces, Scala has the more general concept oftraits (Scharliet al., 2003).Like interfaces, traits can be used to defineabstract methods(that is, method signatures).However, unlike interfaces, traits can also define concretemethods. Traits can be combinedusing mixin composition, making a safe form ofmultiple inheritancepossible, as thefollowing example demonstrates:

trait Hello {

val hello = "Hello!"}

trait HowAreU{

val howAreU = "How are you?"}

trait WhatIsUrName{val whatIsUrName= "What is your name?"

}

trait Shout{def shout(str :String) : String

}

This example uses traits in much the same way as one might haveused classes, allowing thedeclaration of both abstract methods likeshoutand concrete methods likehello, howAreUand whatIsUrName. In a single-inheritance language like Java or C#, it would not bepossible to define a subclass that combined the functionality of the four code blocks above.However, mixin composition allows any number of traits to becombined:

trait Basicsextends Hello with HowAreUwith WhatIsUrNamewith Shout{val greet = hello+ " " +howAreUdef shout(str :String) = str.toUpperCase()

}

The traitBasicsinherits methods fromHello, HowAreUandWhatIsUrName, implementsthe methodshoutfrom Shout, and defines a valuegreetusing the inherited methodshelloandhowAreU.

16 Bruno Oliveira and Jeremy Gibbons

trait List[A]case class Nil [A] extends List[A]case class Cons[A] (x :A,xs:List[A]) extends List[A]

def len[a] (l :List[a]) : Int = l match {case Nil () ⇒ 0case Cons(x,xs) ⇒ 1+ len(xs)

}

def ins[a<: Ordered[a] ] (x :a, l : List[a]) :List[a] = l match {case Nil () ⇒ Cons(x,Nil [a])case Cons(y,ys) ⇒ if (x 6 y) Cons(x,Cons(y,ys))

else Cons(y, ins(x,ys))}

Fig. 3. Algebraic datatypes and case analysis in Scala.

4.2 Objects and case classes

New object instances can be created as in most object-oriented languages, by using thenew keyword. For example, we could define a newBasicsobject by:

def basics1 = new Basics() {}

Alternatively, Scala supports a distinct notion ofobject:

object basics2 extends Basics

Scala also supports the notion of acase class, which simplifies the definition of functionsby case analysis. In particular, case classes allow the emulation of algebraic datatypes fromconventional functional languages. Figure 3 gives definitions analogous to the algebraicdatatype of lists and the length and (ordered) insertion functions. The traitList[A] declaresthe type of lists parametrized by some element typeA; the case classesNil andConsactas the two constructors of lists. The functionlen is defined using standard case analysis onthe list value. The definition of the functionins shows another case analysis on lists, andalso demonstrates the use oftype-parameter bounds: the list elements must be drawn froman ordered type.

Case classes do not require the use of thenew keyword for instantiation, as they providea more compact syntax inspired by functional programming languages:

val alist = Cons(3,Cons(2,Cons(1,Nil ())))

4.3 Higher-kinded types

Type-constructor polymorphism and constructor classes have proven to be very useful inHaskell, allowing, among other things, the definition of concepts such as monads (Wadler,1993), applicative functors (McBride & Paterson, 2008), and container-like abstractions.This motivated the recent addition of type-constructor polymorphism to Scala (Moorset al., 2008). For example, a very simple interface for theIterableclass could be defined inScala as:

Journal of Functional Programming 17

trait SetInterface{type Set[ ]type A

def empty: Set[A]def insert(x :A,q:Set[A]) :Set[A]def extract(q: Set[A]) :Option[(A,Set[A])]

}

trait SetOrderedextends SetInterface{type Set[X] = List[X]type A<:Ordered[A]

def empty= Nil ()def insert(x :A,q:Set[A]) = ins(x,q)def extract(q: Set[A]) = q match {

case Nil () ⇒ Nonecase Cons(x,xs) ⇒ Some(x,xs)

}}

Fig. 4. An abstract datatype for sets.

trait Iterable[A,Container[ ] ] {

def map[B] (f :A⇒ B) :Container[B]

def filter (p :A⇒ Boolean) :Container[A]

}

Note thatIterable is parametrized byContainer[ ], a type that is itself parametrized byanother type—in other words,Containeris a type constructor. By parametrizing over thetype constructor rather than a particular typeContainer[A], one can use the parameter inmethod definitions with different types. In particular, in the definition ofmap, the returntype isContainer[B], whereB is a type parameter of the methodmap; with parametrizationby types only,mapwould have to be homogeneous.

4.4 Abstract types

Scala has a notion ofabstract types, which provide a flexible way to abstract over concretetypes used inside a class or trait declaration. Abstract types are used to hide informationabout internals of a component, in a way similar to their use in Standard ML (Harper & Lil-libridge, 1994) and OCaml (Leroy, 1994). Odersky and Zenger(2005) argue that abstracttypes are essential for the construction of reusable components: they allow informationhiding over several objects, a key ingredient of component-oriented programming.

Figure 4 shows a typical example of an ML-style abstract datatype for sets. The abstracttrait SetInterfacedeclares the types and the operations required by sets. The abstract typesA andSet(which is a type constructor) are, respectively, abstractions over the element typeand the shape of the set. The operations supported by the set interface areempty, insertandextract. The traitSetOrderedpresents a concrete refinement ofSetInterface, in which setsare implemented with lists and the elements of the set are ordered.

18 Bruno Oliveira and Jeremy Gibbons

4.5 Implicit parameters and type classes

Scala’simplicit parametersallow some parameters to be inferred implicitly by the com-piler on the basis of type information; as noted by Odersky and others (Odersky, 2006b;Oliveiraet al., 2010), they can be used to emulate Haskell’s type classes (Hall et al., 1996).Consider this approximation to the concept of a monoid (Odersky, 2006a), omitting anyformalization of the monoid laws:

trait Monoid[a] {

def unit :a // unit of opdef op(x : a,y :a) :a // associative

}

This is clearly analogous to a type class. An example object would be a monoid on strings,with the unit being the empty string and the binary operationbeing string concatenation.

implicit object strMonoidextends Monoid[String] {def unit = ""def op(x : String,y :String) = x.concat(y)

}

Again, there is a clear correspondence with an instance declaration in Haskell. Ignoringthe implicit keyword for a moment, one can now define operations that are generic in themonoid:

def reduce[a] (xs:List [a]) (implicit m: Monoid[a]) :a =

if (xs.isEmpty) m.unit else m.op(xs.head, reduce(xs.tail) (m))

Now reducecan be used in the obvious way:

def test1 = reduce(List ("a" , "bc" , "def" )) (strMonoid)

However, one can omit the second argument toreduce, since the compiler has enoughinformation to infer it automatically:

def test2 : String= reduce(List ("a" , "bc" , "def" ))

This works because (a) theimplicit quantifier in the object states thatstrMonoid is thedefault value for the typeMonoid[String], and (b), theimplicit quantifier in the functionstates that the argumentm may be omitted if there exists an implicit object in scope withthe typeMonoid[a]. (If there are multiple such objects, the most specific one ischosen.)The second use ofreduce, with the implicit parameter inferred by the compiler, is similarto Haskell usage; however, it is more flexible, because thereis the option to provide anexplicit value overriding the one implied by the type.

5 Generic Programming with Open Datatypes

In this section, we present a Scala version of the Haskell approach based on datatypesof type representations described in Section 2.1. As shown in Section 4.2, Scala readilysupports a form of algebraic datatypes, via case classes. Itturns out that these algebraic

Journal of Functional Programming 19

trait Rep[A]implicit object RUnit extends Rep[Unit ]implicit object RInt extends Rep[Int ]implicit object RCharextends Rep[Char]case class RProd[A,B] (ra :Rep[A], rb :Rep[B]) extends Rep[(A,B)]case class RPlus[A,B] (ra :Rep[A], rb :Rep[B]) extends Rep[Either[A,B] ]case class RView[A,B] (iso: Iso[B,A], r : () ⇒ Rep[A]) extends Rep[B]

implicit def RepProd[a,b] (implicit ra :Rep[a], rb :Rep[b]) = RProd(ra, rb)implicit def RepPlus[a,b] (implicit ra :Rep[a], rb :Rep[b]) = RPlus(ra, rb)

Fig. 5. Type representations in Scala.

datatypes are quite expressive, being effectively comparable to Haskell’s GADTs. How-ever, unlike the algebraic datatypes found in most functional programming languages,Scala allows an encoding ofopen datatypes(or, from an object-oriented programmingperspective,multi-methods(Agrawalet al., 1991)), enabling the addition of new variantsto a datatype. This section exploits this encoding as a basisfor a generic programminglibrary with open type representations, and hence with support for (modular)ad-hoccases.

5.1 Type representations and generic functions

The traitRep[A] in Figure 5 is a datatype of type representations. The three objectsRUnit,RInt, RCharare used to represent the basic typesUnit, Int andChar; these objects can beimplicitly passed to functions that accept implicit valuesof typeRep[A]. The case classesRPlusand RProd handle sums and products, and theRViewcase class can be used tomap datatypes into sums of products (and vice versa). The first argument ofRViewshouldcorrespond to an isomorphism, which is defined as:

trait Iso[A,B] { // from andto are inversesdef from: A⇒ Bdef to : B⇒ A

}

For example, the isomorphism between lists and their sum-of-products representation isgiven bylistIso:

def fromList[a] = (l :List [a]) ⇒ l match {

case Nil ⇒ Left ({})case (x :: xs) ⇒ Right(x,xs)

}

def toList[a] = (s: Either[Unit,(a,List[a])]) ⇒ smatch {

case Left ( ) ⇒ Nilcase Right(x,xs) ⇒ x ::xs

}

def listIso[a] = Iso[List [a],Either[Unit,(a,List[a])]] (fromList) (toList)

20 Bruno Oliveira and Jeremy Gibbons

Note that the second argument ofRViewshould be lazily constructed. Unfortunately, Scalaforbids the by-name qualification at that argument position, so we have to encode call-by-name manually using the conventional ‘thunk’ technique.

As a simple example of a generic function, we present a serializer. The idea is that, givensomerepresentabletypet, we can define a generic binary serializer by case analysis onthestructure of the representation oft:

def serial[t ] (x : t) (implicit r : Rep[t ]) : String=

r match {

case RUnit ⇒ ""case RInt ⇒ encodeInt(x)case RChar ⇒ encodeChar(x)case RPlus(a,b) ⇒ x.fold ("0" +serial( ) (a), "1" +serial( ) (b))

case RProd(a,b) ⇒ serial(x. 1) (a)+serial(x. 2) (b)

case RView(i,a) ⇒ serial(i.from(x)) (a ())

}

For the purposes of presentation, we encode the binary representation as a string of zeroesand ones rather than a true binary stream. The arguments ofserial are the valuex of typet to encode and a representation oft (which may be passed implicitly). For theUnit case,we return an empty string; forInt andChar, we assume primitive encodersencodeIntandencodeChar. The case for sums applies thefold method (defined in theEither trait) to thevaluex; in casex is an instance ofLeft, we encode the rest of the value and prepend 0;in casex is an instance ofRight, we encode the rest of the value and prepend 1. The casefor products concatenates the results of encoding the two components of the pair. Finally,for the view case, we convert the valuex into its sum-of-products equivalent and apply theserialization function to that.

5.2 Open type representations and ad-hoc cases

In Scala, datatypes may be open to extension—that is, it is possible to introduce newvariants; in the case of type representations, it means thatwe can add new constructorsfor representations of new types. This is useful for ad-hoc cases in generic functions—that is, to provide a behaviour different from the generic one for a particular datatype in aparticular generic function.

For example, suppose that we want to use a different encodingof lists than the onederived generically: it suffices to encode the length of a list, followed by the encodingsof each of its elements. For long lists, this encoding is moreefficient than the genericbehaviour obtained from the sum-of-products view, which essentially encodes the lengthin unary rather than binary format. In order to be able to define an ad-hoc case, we firstneed to extend our type representations with a new case for lists.

case class RList[A] (a :Rep[A]) extendsRView[Either[Unit,(A,List[A])],List [A] ]

(listIso,() ⇒ RPlus(RUnit,RProd(a,RList(a))))

implicit def RepList[a] (implicit a :Rep[a]) = RList(a)

Journal of Functional Programming 21

This is achieved by creating a subtype ofRView, using the isomorphism between listsand their sum-of-products representation. Notice thatRList depends on itself; had thisrepresentation parameter not been made lazy, the representation would unfold indefinitely.The functionRepListyields a default implicit representation for lists, given arepresentationof the elements.

With the extra case for lists, we can have an alternative serialization function with aspecial case for lists:

def serial1 [t ] (x : t) (implicit r : Rep[t ]) : String=

r match {

case RUnit ⇒ ""case RInt ⇒ encodeInt(x)case RChar ⇒ encodeChar(x)case RPlus(a,b) ⇒ x.fold ("0" +serial1 ( ) (a), "1" +serial1 ( ) (b))

case RList(a) ⇒ serial1 (x.length)+

x.map(serial1 ( ) (a)).foldRight("" ) ((x,y) ⇒ x+y)case RView(i,a) ⇒ serial1 (i.from (x)) (a ())

}

The definition ofserial1 is essentially the same asserial, except that there is an extra casefor lists, producing an encoding of the list length followedby the encodings of its elements.

5.3 Inheritance of generic functions

The definition of theserial1 generic function is somewhat unsatisfactory, because it in-volves code duplication. Scala, being an object-oriented language, supports inheritance.However, to make use of inheritance on generic functions, weneed to adapt our programsto use classes instead of function definitions: the serialization functionserial has to berewritten as follows:

trait Producer[a] {

def apply[t ] (x : t) (implicit r : Rep[t ]) : a}

case class Serialextends Producer[String] {def apply[t ] (x : t) (implicit r : Rep[t ]) : String= r match {

case RUnit ⇒ ""case RInt ⇒ encodeInt(x)case RChar ⇒ encodeChar(x)case RPlus(a,b) ⇒ x.fold ("0" +apply( ) (a), "1" +apply( ) (b))

case RProd(a,b) ⇒ apply(x. 1) (a)+apply(x. 2) (b)

case RView(iso,a) ⇒ apply(iso.from(x)) (a ())

}

}

object serialextends Serial

The traitProducerdefines a ‘template’ for generic producer functions such as serialization,and can be reused for other producers. The case classSerial is a subclass ofProducer, im-

22 Bruno Oliveira and Jeremy Gibbons

plementing theapplymethod. The definition ofserial is recovered by an object extendingSerial. Scala treats methods namedapplyspecially, allowing the use of the conventionalfunction application notation for invocation: we can writeo (arg) instead ofo.apply(arg),andserial(List (1)) instead ofserial.apply(List (1)).

The advantage of writing the generic function in this style,rather than more directly us-ing a function definition, is that inheritance allows the definition to be reused. For example,instead of repeating all the cases inserial1, we could write the following:

case class Serial1 extends Serial{override def apply[t ] (x : t) (implicit r : Rep[t ]) : String= r match {

case RList(a) ⇒

apply(x.length)+x.map(apply( ) (a)).foldRight("" ) ((x,y) ⇒ x+y)case ⇒ super.apply(x) (r)

}

}

object serial1 extends Serial1

The case classSerial1 inherits fromSerial and adds a special case for lists, overridingthe generic definition. The default case ‘’ usessuper to invoke the definition ofapplyfrom Serialwhenever the argument is not a list representation. The definition of serial1 isonce again recovered by an object extending the classSerial1 that represents the genericfunction. The following shows how client code can use the newversions ofserial andserial1:

val testSerial = serial(List (1))

val testSerial1 = serial1 (List (1))

5.4 Evaluation of the approach

The Scala approach presented in this section compares favourably with the Haskell ap-proach using GADTs to encode type representations, which was presented in Section 2.1.While it is true that the code to define the representation type is somewhat more verbosethan the Haskell equivalent, we no longer need to create a separate type class to allow im-plicit construction of representations. Implicit representations may not be strictly necessaryfor a generic programming library, but they are very convenient, and nearly all approachesprovide them. The definition of generic functions using typerepresentations is basically asstraightforward in Scala as in Haskell; no significant additional verbosity is involved.

In Haskell, it is difficult to extend a datatype with new variants, which has drawbacksfrom a generic programming point of view, as discussed in Section 2.3.3. In contrast, inScala, adding a new variant is essentially the same as addinga new subclass. While itwould be possible to overcome Haskell’s extensibility limitations with theopen datatypesproposal (Loh & Hinze, 2006), Scala’s case classes have an extra advantage when com-pared to that proposal. In Scala, we can mark a trait assealed, which prohibits directsubclassing of that trait outside the module defining it. Still, we can extendsubclasseseven in a different module. Therefore, we could have marked the traitRep[A] as sealed;but modular extension ofRViewwould still be allowed. The nice thing about this solution is

Journal of Functional Programming 23

trait Total[A] {def total :A⇒ Int

}

def gtotal[A] (x :A) (implicit t :Total[A]) : Int = t.total (x)

implicit object totalUnit extends Total[Unit ] {def total = ⇒ 0

}

implicit object totalInt extends Total[Int ] {def total = x⇒ x

}

implicit def totalPair[a,b] (implicit totalA:Total[a], totalB:Total[b]) =new Total[(a,b)] {

def total = {case (x,y) ⇒ totalA.total (x)+ totalB.total (y)

}}

implicit def totalEither[a,b] (implicit totalA: Total[a], totalB:Total[b]) =new Total[Either[a,b] ] {

def total = {case Left (x) ⇒ totalA.total (x)case Right(y) ⇒ totalB.total (y)

}}

implicit def totalList[a] (implicit totalA:Total[a]) = new Total[List[a] ] {def total = x⇒ gtotal (fromList(x))

}

Fig. 6. A simple generic ‘sum’ function.

that we can be sure that a fixed set of patterns is exhaustive, so it is easier to avoid pattern-matching errors. Scala even has coverage checking of patterns when using case analysis onvalues of sealed types, warning about any missing cases.

Finally, inheritance allows one to reuse code from existinggeneric functions. However,generic functions need to be written using classes instead of function definitions, which isa bit more verbose and less direct. Nonetheless, reuse of generic functions is an importantand useful feature to have, and the native support provided by Scala makes this featurequite usable.

6 Generic Programming with Type Classes

This section shows how to implement a Scala equivalent for the simple generic program-ming approach based on type classes presented in Section 2.2. Following Section 4.5, thekey idea is to use Scala’s implicits mechanism instead of Haskell’s type classes.

24 Bruno Oliveira and Jeremy Gibbons

6.1 Simple extensible generic functions

Figure 6 shows a Scala implementation of the Haskell code presented in Figure 1. The traitTotal[A] plays a role similar to that of the classTotal in the Haskell version: it defines aninterface containing a functiontotal that takes a value of typeA and returns an integer.Thegtotal function provides an interface to the sum function, constructing the dictionaryimplicitly; this is not needed in the Haskell version. However, unlike with the type classversion, such a dictionary can also be passed explicitly. The totalUnit, totalInt, totalPairand totalEither definitions are analogous to the type class instances for units, integers,products and sums. One notable difference is the explicit selection of the dictionary inwhich to find a subsidiary sum function; for example, in the definition for pairs,totalA.totalselects the sum function from the first dictionary argument.For lists, thegtotal functionis used instead, allowing the dictionary to be automatically inferred by the compiler. (ThefunctionfromListwas defined in Section 5.1.)

6.2 Local redefinition

With the Scala approach, local redefinition is possible. Forexample, we can obtain ageneric function that counts the integers in a structure by using a different dictionary forintegers:

object countIntextends Total[Int ] {def total = x⇒ 1

}

val x = gtotal(List (1,2,3))

val y = gtotal(List (1,2,3)) (totalList (countInt))

The valuesx andy are computed by calling the same generic functiongtotal on the samelist; however, in the second case, a local redefinition of thedictionary to use for integersmeans that the result is 3 rather than 6. (Given this facilityfor local redefinition, it wouldbe reasonable to make the basic function a generic ‘counter’, returning zero for each basecase, and to expect some cases to be locally redefined for eachuse.)

6.3 Exotic types

In the Haskell approach using type classes, exotic types such as nested datatypes or datatypeswith higher-kinded type arguments presented a challenge; in Scala, they pose less of aproblem. For example, the type of generalized trees can be defined as follows:

sealed case class GRose[F [ ],A] (x : A,children:F [GRose[F,A] ])

It is possible to define an implicit representation for the generic sum function using anapproach similar to thepolymorphic predicatestechnique, as discussed in Section 2.3.6.

implicit def totalGRose[F [ ],A] (implicit totalA: Total[A],

totalF : {def apply[B] (implicit totalB: Total[B]) :Total[F [B] ]}) =

new Total[GRose[F,A] ] {

def total = r ⇒ totalA.total (r.x)+

Journal of Functional Programming 25

totalF (totalGRose[F,A] (totalA, totalF)).total (r.children)}

However, Scala currently does not support type inference for datatypes with higher-kindedtype arguments, so in the definition above it is necessary to explicitly construct the dictio-nary

totalF (totalGRose[F,A] (totalA, totalF))

so that the dictionary constructor functiontotalGRosecan be passed the type constructor ar-gumentF. To emulate polymorphic predicates, such as thetotalF argument oftotalGRose,we have to encode higher-ranked types; this is not too hard using structural types, asdiscussed in more detail in Section 10.3. It is also necessary to provide a dictionary objectthat fits the interface required fortotalF values:

implicit object totalList2{def apply[a] (implicit totalA: Total[a]) :Total[List[a] ] =

totalList[a] (totalA)

}

Having set up all this machinery, we can now apply generic sumto generalized rosetrees:

val myRose:GRose[List, Int ] = GRose[List, Int ] (3,Nil)

def test(rose:GRose[List, Int ]) (implicit totalR: (Total[GRose[List, Int ] ])) =

totalR.total (rose)

def testCount: Int = test(myRose) (totalGRose[List, Int ] (countInt, totalList2))

The functiontesttakes a valueroseof typeGRose[List, Int] and an implicit dictionary forthe generic sum function, and returns the result of applyingthe appropriate member of thisdictionary to the rose tree. The functiontestCountapplies this function to a particular rosetree and an explicit dictionary for generalized rose trees,returning the result 1. Unfortu-nately, the dictionary cannot be constructed automatically, because the higher-kinded typeList cannot be inferred.

6.4 Evaluation of the approach

Not surprisingly, the Scala approach has some additional verbosity when compared to theHaskell one, in particular with the long-winded syntax for implicits. In the Scala approach,a gtotal function is required in order to be able to pass a dictionary explicitly, while inHaskell no such definition is necessary. However,gtotalcan be used for passing argumentsboth implicitly and explicitly, while in Haskell only implicit dictionaries are possible.Exotic types pose a challenge to Scala, as they do for Haskell, but for different reasons. InScala, there is no native support for higher-ranked types; they have to be encoded, whichadds extra overhead. Furthermore, the current version of Scala does not yet support typeinference for higher-kinded types; in practice, this meansthat it is essentially not possibleto automatically infer dictionaries that involve higher-kinded types. Having to write those

26 Bruno Oliveira and Jeremy Gibbons

dictionaries manually is tedious. However, in the Scala approach, local redefinition ispossible, and can be used in a convenient way.

In summary, the Scala approach is a bit more verbose than the corresponding Haskellversion, but it is also more expressive, since both local redefinitions and polymorphicpredicates are possible. However the latter feature has some significant overhead thatmakes it impractical to use.

7 Generic Programming with Encodings of Datatypes

This section presents the first of three case studies on existing DGP approaches. Buildingon Section 4.5, which showed how implicit parameters can be used for type-class-style pro-gramming, a Scala implementation of theGenerics for the Masses(GM) technique (Hinze,2006) is shown. Furthermore, we discuss two distinct techniques for reusing generic func-tions in Scala:reuse by inheritanceand local redefinition. (Moors (2007) provides analternative Scala tutorial on GM.)

7.1 Generics for the masses, in Haskell

Figure 7 shows the essence of the GM approach in Haskell. The constructor classGenericis used to represent the type of generic functions. The parameterg represents the genericfunction, and each of the member functions of the class encodes the behaviour of thatgeneric function for a specific structural case. Generic functions over user-defined typescan be defined using theview type case: an isomorphism between the datatype and itsstructural representation must be provided. Instances of the type classRepdenote repre-sentable types; each such instance consists of a methodacceptthat selects the appropriatebehaviour from a generic function.

A new generic function is represented as an instance ofGeneric, providing an implemen-tation for each structural case. For instance, consider a generic template for functions thatcompute some integer measure of a data structure. Each case is a record of typeCount aforsome typea, which contains a single functioncountof typea→ Int that can be used fora structure of typea. The functiongCount, which is the actual generic function, simplyextracts the sole fieldcount from a record of the appropriate type, built automaticallyby accept. For sums, products, and user-defined datatypes, it does the‘obvious’ thing:choosing the appropriate branch of a sum, adding the counts of the two components of aproduct, and unpacking a view and recursively counting its contents; it counts zero for eachof the base cases, but these can be overridden to implement more interesting behaviour.

7.2 Generics for the masses, in Scala

Figure 8 presents a translation of the code in Figure 7 into Scala. The traitGeneric isparametrized with a higher-kinded type constructorG. As in Haskell, there are methodsfor sums, products, the unit type, and also a few built-in types such as integers and char-acters; for sums and products, which have type parameters, we need extra arguments thatdefine the generic functions for values of those type parameters. Theview case uses anisomorphism to adapt generic functions to existing datatypes; the ‘⇒’ before the typeG[a]

signals that that parameter is passed by name.

Journal of Functional Programming 27

class Generic gwhereunit ::g Unitplus ::g a→ g b→ g (Sum a b)prod ::g a→ g b→ g (Prod a b)view :: Iso b a→ g a→ g bchar ::g Charint ::g Int

class Rep awhereaccept ::Generic g⇒ g a

instance Rep Unitwhereaccept = unit

instance Rep Charwhereaccept = char

instance Rep Intwhereaccept = int

instance (Rep a,Rep b) ⇒ Rep(Sum a b) whereaccept = plus accept accept

instance (Rep a,Rep b) ⇒ Rep(Prod a b) whereaccept = prod accept accept

newtype Count a= Count{count::a→ Int}

instance Generic Countwhereunit = Count(λ → 0)char = Count(λ → 0)int = Count(λ → 0)plus a b = Count(λx→ case x of {Inl l → count a l; Inr r → count b r})prod a b = Count(λ(Prod x y) → count a x+count b y)view iso a= Count(λx→ count a(from iso x))

gCount::Rep a⇒ a→ IntgCount= count accept

Fig. 7. Generics for the masses in Haskell

The traitRep[T ] has a single methodaccept, which takes an encoded generic functionof type Generic[G]. The Scala implementation of the subclasses ofRep[T ] is almost atransliteration of the Haskell type class version, except that it uses implicit parametersinstead of type classes.

The generic counter function uses a parametrized classCount with a single field: afunction of typeA ⇒ Int. The concrete subtypeCountGof the traitGeneric[Count] pro-vides implementations for the actual generic function: each method yields a value of typeCount[A] for the appropriate typeA.

7.3 Constructing type representations

For each datatypeT we want to represent, we need to create a value of typeRep[T ]. Forexample, for lists we could write:

def listRep[a,g[ ] ] (a :g[a]) (implicit gen:Generic[g]) :g[List[a] ] = {

import gen.view(listIso[a]) (plus(unit) (prod (a) (listRep[a,g] (a) (gen))))

28 Bruno Oliveira and Jeremy Gibbons

trait Generic[G[ ] ] {def unit : G[Unit ]def int : G[Int ]def char : G[Char]def plus[a,b] : G[a] ⇒ G[b] ⇒ G[Either[a,b] ]def prod[a,b] : G[a] ⇒ G[b] ⇒ G[(a,b)]def view[a,b] : Iso[b,a] ⇒ (⇒ G[a]) ⇒ G[b]

}

trait Rep[T ] {def accept[g[ ] ] (implicit gen:Generic[g]) :g[T ]

}

implicit def RUnit= new Rep[Unit ] {def accept[g[ ] ] (implicit gen:Generic[g]) = gen.unit

}

implicit def RInt= new Rep[Int ] {def accept[g[ ] ] (implicit gen:Generic[g]) = gen.int

}

implicit def RChar= new Rep[Char] {def accept[g[ ] ] (implicit gen:Generic[g]) = gen.char

}

implicit def RPlus[a,b] (implicit a: Rep[a],b: Rep[b]) = new Rep[Either[a,b] ] {def accept[g[ ] ] (implicit gen:Generic[g]) =

gen.plus(a.accept[g] (gen)) (b.accept[g] (gen))}

implicit def RProd[a,b] (implicit a: Rep[a],b: Rep[b]) = new Rep[(a,b)] {def accept[g[ ] ] (implicit gen:Generic[g]) =

gen.prod (a.accept[g] (gen)) (b.accept[g] (gen))}

case class Count[A] (count:A⇒ Int)

trait CountGextends Generic[Count] {def unit = Count(x⇒ 0)def int = Count(x⇒ 0)def char = Count(x⇒ 0)def plus[a,b] = a⇒ b⇒ Count( .fold (a.count,b.count))def prod[a,b] = a⇒ b⇒ Count(x⇒ a.count(x. 1)+b.count(x. 2))def view[a,b] = iso⇒ a⇒ Count(x⇒ a.count(iso.from(x)))

}

Fig. 8. Generics for the Masses in Scala.

}

implicit def RList[a] (implicit a :Rep[a]) = new Rep[List[a] ] {

def accept[g[ ] ] (implicit gen: Generic[g]) =

listRep[a,g] (a.accept[g] (gen)) (gen)}

(The import declaration allows unqualified use of the methodsview, plus, and so on ofthe objectgen; listIso is the isomorphism presented in Section 5.2.) Here, the auxiliary

Journal of Functional Programming 29

functionlistRepconstructs the rightGenericvalue following the sum-of-product structure.Using listRep, the representationRList for lists is easily defined.

7.4 Applying generic functions

We can now define a methodgCountthat provides an easy-to-use interface for the genericfunction encoded byCountG: this takes a value of a representable typea and returns thecorresponding count.

def gCount[a] (x :a) (implicit rep: Rep[a]) = rep.accept[Count].count(x)

We definedCountGas a trait instead of an object so that it can be extended, as wediscussin more detail in Sections 7.5 and 7.6. We may, however, be interested in having an objectthat simply inherits the basic functionality defined inCountG. Furthermore, this object canbe made implicit, so that methods likerepcan automatically infer this instance ofGeneric.

implicit object countGextends CountG

Of course, this will still return a count of zero for any data structure; we show next how tooverride it with more interesting behaviour.

7.5 Reuse via inheritance

To recover a generic function that counts the integers in a structure, we can use inheritanceto extendCountGand override the case for integers so that it counts 1 for eachintegervalue.

trait CountIntextends CountG{override def int = Count(x⇒ 1)}

We can then define a methodcountIntto count the integers in any structure of representabletype.

def countInt[a] (x :a) (implicit rep: Rep[a]) =

rep.accept[Count] (new CountInt{}).count(x)

The ability to explicitly pass an alternative ‘dictionary’is essential to the definition ofthe methodcountInt, since we need to parametrize theacceptmethod with an instance ofCountother than the implicitly inferred one.

Using such generic functions is straightforward. The following snippet defines a list ofintegerstestand appliescountIntto this list.

val test= List (3,4,5)

def countTest= countInt(test)

Note that the implicit parameter for the type representations is not needed, because it canbe inferred by the compiler (since we provided animplicit def RList).

7.6 Local redefinition

Suppose that we want to count the instances of the type parameter in an instance of aparametric datatype such as lists. It is not possible to specialize Genericto define such

30 Bruno Oliveira and Jeremy Gibbons

2

1

6 1

5

Fig. 9. Tree with depth information at the nodes.

a function directly, because there is no way to distinguish values of the type parameterfrom other values that happen to be stored in the structure. For example, we could have aparametric binary tree that has an auxiliary integer at eachnode that is used to store thedepth of the tree at that node; this could be useful in keepingthe tree balanced. Figure 9shows such a tree; the squares represent the auxiliary integers, and the circles represent thevalues contained in the tree. If the elements of the tree are themselves integers, we cannotcount them without also counting the balance information.

val testTree= Fork (2,Fork (1,Value(6),Value(1)),Value(5))

val five= countInt(testTree) // returns 5

To solve this problem, we need to account for the representations of the type parametersof a parametric type. The methodlistRep, for example, needs to receive as an argument arepresentation of typeg[a] for its type parameter. A similar thing happens with binary trees;assuming that the equivalent method is calledbtreeRep, we can provide a special-purposecounter for trees that counts only the values of the type parameter.

def countOne[a] = Count((x :a) ⇒ 1)

def countTree[a] (x :Tree[a]) = btreeRep[a,Count] (countOne[a]).count(x)

val three= countTree(testTree) // returns 3

The idea here is to replace the default behaviour that would be used for the type parameter(as inferred from the type) by user-defined behaviour specified bycountOne.

7.7 Evaluation of the approach

Like other generic programming approaches, the GM technique is more verbose in Scalathan in Haskell: in the definitions of instances (such asRUnit, RChar, andRProd) of thetrait Rep, we need to explicitly declare the implicit argument of theacceptmethod and thetype constructor argumentg for each instance; this is not necessary in the Haskell version.

In terms of functionality, the Scala solution provides everything present in the Haskellsolution, including the ability to handle local redefinitions. In addition, we can easily reuseone generic function to define another through inheritance,as demonstrated in Section 7.5;with the Haskell approaches, this kind of reuse is harder to achieve. The only mechanismthat we know of that comes close to this form of reuse in terms of simplicity is GenericHaskell’sdefault caseconstruct (Loh, 2004), as discussed in Section 2.3.5.

Another nice aspect of the Scala approach is the ability to override an implicit parameter.Theacceptmethod ofReptakes an implicit argument of typeGeneric[g]. When we defined

Journal of Functional Programming 31

the genericcountInt function (see Section 7.5), we needed to override that argument.This was easily achieved in Scala simply by explicitly passing an argument; it would benon-trivial to achieve the same effect in Haskell using typeclasses, since dictionaries arealways implicitly passed. Note that we also explicitly override an implicit parameter in thedefinition ofcountTree(since the first argument ofbtreeRepis implicit by default).

Finally, it is interesting to observe that, when interpreted in an object-oriented language,the GM approach essentially corresponds to the VISITOR pattern. While this fact is notentirely surprising—the inspiration for GM comes from encodings of datatypes, and en-codings of datatypes are known to be related to visitors (Buchlovsky & Thielecke, 2006;Oliveira, 2007)—it does not seem to have been observed in theliterature before. As aconsequence, many of the variations observed by Hinze have direct correspondents invariations of visitors, and we may hope that ideas developedin the past in the context ofvisitors may reveal themselves to be useful in the context ofgeneric programming. Oliveira(2007) explored this, and has shown, for example, both how solutions to the expressionproblem (Wadler, 1998) using visitors can be adapted to GM, and how solutions to theproblem of extensible generic functions in the GM approach can be used as solutions tothe expression problem (Oliveira, 2009a).

8 Generic Programming with Extensible Superclasses

This section presents the second case study of a DGP library in Scala. We show how toemulateextensible superclasses(Sulzmann & Wang, 2006), and how this technique can beused to provide an implementation of theScrap your Boilerplate with Class(Lammel &Peyton Jones, 2005) approach to generic programming.

8.1 Scrap your boilerplate with class

After realizing that earlier implementations of the SyB approach (Lammel & Peyton Jones,2003) were limiting because they did not supportextensible generic functions, Lammeland Peyton Jones (2005) proposed a variation using type classes. This solution is shownin Figure 10. TheData class defines the higher-order generic functiongmapQ, which isused to define new generic functions. TheSizeclass declares a new generic functionsize.Overlapping instancesare used to provide a default implementation of the functionin termsof gmapQ; in the case ofSize, the instanceSize tplays this role. Generic functions can bemade extensible by providing additional instances of classSizethat override the defaultcase. The solution is somewhat involved, and it requires a number of non-standard Haskellextensions to get everything to work. In particular,undecidable instancesare needed, andan extension allowingrecursive dictionarieshad to be built into the GHC compiler. Also,proxies for types, which involve passing an extra (bottom) value to functions, are requiredto resolve type ambiguities.

The major difficulty Lammel and Peyton Jones found was that,in order to provide amodular definition of a new generic function, theData class had itself to be parametrizedby the generic function being defined. In essence, what seemsto be needed here areextensible superclasses. Inspired by Hughes’ (1999) work on restricted datatypes, Lammeland Peyton Jones found a solution by emulating type class parametrization: in the class

32 Bruno Oliveira and Jeremy Gibbons

data Proxy(a:: ∗→ ∗)

class Sat awhere dict ::a

class (Typeable a,Sat(ctx a)) ⇒ Data ctx awheregmapQ::Proxy ctx→ (∀b. Data ctx b⇒ b→ r) → a→ [r ]

instance Sat(ctx Char) ⇒ Data ctx CharwheregmapQ f n =[]

instance (Sat(ctx[a]),Data ctx a) ⇒ Data ctx[a] wheregmapQ f [ ] =[]gmapQ f (x :xs) =[f x, f xs]

class Size awhere gsize::a→ Int

data SizeD a= SizeD{gsizeD::a→ Int}

sizeProxy::Proxy SizeDsizeProxy= ⊥

instance Size t⇒ Sat(SizeD t) wheredict = SizeD{gsizeD= gsize}

instance Data SizeD t⇒ Size twheregsize t= 1+sum(gmapQ sizeProxy(gsizeD dict) t)

instance Size a⇒ Size[a] wheregsize[ ] = 0gsize(x :xs) = gsize x+gsize xs

test= (gsize['a' , 'b' ],gsize'x' )

Fig. 10. The original ‘SyB with Class’ implementation in Haskell

Data ctx a, thectx argument is supposed to represent an unknown type class, butbecauseHaskell does not allow abstraction over type classes, this has to be emulated using records.

8.2 Scrap your boilerplate with class, in Scala

Because of the wide range of non-standard features of Haskell used by the SyB with Classapproach, it is interesting to see what is involved in expressing the approach in Scala. LikeHaskell, Scala does not support extensible superclasses directly; that is, it is not possibleto have a trait (or class)

trait T [Super] extends Super

in which the trait is parametrized by its own superclass. However, Scala does provideexplicit self-types(Odersky, 2006a), which can be used to emulate this feature.In Figure 11,a Scala implementation of theDataclass is shown. As with the Haskell solution, theDatatrait is parametrized by a type constructorctx (the generic function) and a typea. Themajor difference from the Haskell solution is the use of aself-typeto ensure that the typeof the self object is a subtype ofctx[a]. This is to make the generic functions defined inctx available to all instances ofData. (The definitionme is just a public reference for theself object, and thegmapQgeneric function uses the technique discussed in Section 10.3to emulate higher-ranked types.) Two base ‘instances’ are provided for characters and lists,as in the Haskell implementation.Abstract case classesare used becauseDataCharand

Journal of Functional Programming 33

trait Data[ctx[ ],a] {self :ctx[a] ⇒def me= selfdef gmapQ[r ] : {def apply[b] (x :b) (implicit dt :Data[ctx,b]) : r }⇒ a⇒ List[r ]

}

abstract case class DataChar[ctx[ ] ] () extends Data[ctx,Char] {self :ctx[Char] ⇒def gmapQ[r ] = f ⇒ n⇒ Nil

}

abstract case class DataList[ctx[ ],a] (implicit d : Data[ctx,a]) extends Data[ctx,List[a] ] {self : ctx[List[a] ] ⇒def gmapQ[r ] = f ⇒ {

case Nil ⇒ Nilcase x ::xs⇒ List (f (x) (d), f (xs) (this))

}}

trait Size[a] extends Data[Size,a] {def gsize:a⇒ Int = t ⇒

1+sum(gmapQ[Int ] (new {def apply[b] (x :b) (implicit dt :Data[Size,b]) =dt.me.gsize(x)}) (t))

}

abstract case class SizeList[a] () (implicit d : Size[a])extends DataList[Size,a] () (d) with Size[List[a] ] {

override def gsize= {case Nil ⇒ 0case x ::xs⇒ d.gsize(x)+gsize(xs)

}}

implicit def sizeChar:Size[Char] =new DataChar[Size] () with Size[Char]

implicit def sizeList2[a] (implicit d : Size[a]) :Size[List[a] ] =new SizeList[a] () (d) with Size[List[a] ]

def test(implicit s1:Size[Char],s2:Size[List[Char] ]) =(s1.gsize('a' ),s2.gsize(List ('a' , 'b' )))

Fig. 11. An implementation of ‘SyB with Class’ in Scala.

DataList are incomplete, that is, they still need to be mixed with implementations of thetypesctx [Char] and ctx [List [a] ]. The trait SizeextendsData and defines the genericfunction gsize in terms ofgmapQ. This trait plays the role of both theSizeclass andthe Size tinstance in the Haskell solution. The abstract case classSizeListprovides theoverriding case for lists. Note thatSizeandSizeListsatisfy, respectively, theData[Size,a]

andDataList[Size,a] requirements for the self-type. The implicit definitionssizeCharandsizeListallow the dictionaries for characters and lists to be built automatically. Finally,testshows how the generic function can be used—here, to compute the size of a character andof a list of characters. Becausetest takes two implicit arguments, it is possible to call itwithout those arguments; alternatively, different dictionaries can be provided, overridingthe ones selected by the compiler.

34 Bruno Oliveira and Jeremy Gibbons

8.3 Local redefinition

In the Scala implementation of SyB with Class, local redefinition is possible. For example,instead of using thesizeListdictionary for lists, it is possible to provide an alternativedictionary that inherits the generic behaviour for lists rather than overriding it:

def alternativeList[a] (implicit d :Size[a]) :Size[List[a] ] =

new DataList[Size,a] () (d) with Size[List[a] ]

Given this definition, bothtest and test(sizeChar,alternativeList) are possible applica-tions, returning(1,2) and(1,5) respectively.

8.4 Evaluation of the approach

Verbosity is once again a problem. The lack of direct supportfor higher-ranked types anda long-winded syntax for implicits adds significant additional code in comparison to theHaskell approach. Another problem is that separate implicit definitions for dictionaries likesizeCharandsizeListare needed.

The Scala approach imposes an additional burden on the programmer due to the absenceof a mechanism similar tooverlapping instances. This requires the programmer to im-plement the definitions for the implicit dictionaries one byone. In the Haskell solution,if there is no overridden case, then no additional effort is needed. On the other hand,the Scala solution does not distinguish between types and type classes, and abstractingover the ‘type class’ is just the same as abstracting over a type: no encoding of thisfeature is required. Furthermore, a solution with explicitself-types does not require otheradvanced features such as recursive dictionaries or undecidable instances; everything isaccomplished naturally, using the standard extension mechanism.

In terms of expressiveness, the Scala solution is better, because it supports local redefi-nitions and allows greater control of dictionaries by providing the possibility to pass themexplicitly. In the original SyB with Class solution, local redefinitions are not possible.

In summary, for the SyB with Class approach the results are mixed: Haskell is moreconvenient to use because it imposes a lighter burden on the programmer, but the Scalasolution is more expressive and flexible because local redefinitions are possible.

9 Generic Programming with Recursion Patterns

Most generic programming libraries involve writing generic functions by case analysison the structure of the shape of the datatype, whether that case analysis is by value-based or type-based dispatch. An alternative is to make the shape parameter an activeparticipant in the computation—a higher-order function that can be applied, rather thanpassive data that must be analyzed. In particular, theOrigami Programming(Gibbons,2003) approach to DGP is based around datatypes representedas fixpoints of type functors,and programs expressed in terms of higher-order recursion patterns shape-parametrized bythose functors (Meijeret al., 1991). A consequence of black-box application rather thanwhite-box inspection of the shape parameter is a kind of higher-order naturality property,guaranteeing coherence between different instances of thegeneric function (Gibbons &Paterson, 2009).

Journal of Functional Programming 35

newtype Fix f a = In{out:: f a (Fix f a)}

class BiFunctor f wherebimap:: (a→ b) → (c→ d) → f a c→ f b dfmap2:: (c→ d) → f a c→ f a dfmap2= bimap id

map::BiFunctor f ⇒ (a→ b) → Fix f a→ Fix f bmap f= In◦bimap f (map f)◦out

cata::BiFunctor f ⇒ (f a r → r) → Fix f a→ rcata f = f ◦ fmap2(cata f)◦out

ana::BiFunctor f ⇒ (r → f a r) → r → Fix f aana f = In◦ fmap2(ana f)◦ f

hylo::BiFunctor f ⇒ (a→ f c a) → (f c b→ b) → a→ bhylo f g= g◦ fmap2(hylo f g)◦ f

build :: (∀b. (f a b→ b) → b) → Fix f abuild f = f In

Fig. 12. Origami in Haskell

One can view the origami recursion patterns as functional programming equivalents to(at least the code aspects of) some of the so-called Gang of Four design patterns (Gammaet al., 1995). Gibbons (2006) argues that recursive datatypes correspond to the COMPOSITE

design pattern, maps to the ITERATORpattern for enumerating the elements of a collection,folds to the VISITOR pattern for traversing a hierarchical structure, and unfolds and buildsto structured and unstructured instances of the BUILDER pattern for generating structureddata.

Moors et al. (2006) were the first to point out that Scala is expressive enough to bea DGP language; they showed how to encode these origami patterns in Scala. However,their encoding was done in an object-oriented style that introduced some limitations thatthe original Haskell version did not have. We feel that this object-oriented style, whileperhaps more familiar to the object-oriented programmer that Moorset al.were targetting,does not show the full potential of Scala from a generic programmer’s perspective. In thissection, we present an alternative encoding of the origami patterns that is essentially adirect translation of the Haskell solution and has the same extensibility properties.

9.1 A little Origami library

Figure 12 shows the Haskell implementation of the origami patterns, and Figure 13 showsa translation of this Haskell code into Scala. The key idea isto encode type classes throughimplicit parameters (see Section 4.5) rather than using theobject-oriented style proposedby Moorset al.. The newtypeFix and its constructorIn are mapped into a case classFix;the type classBiFunctor maps into a trait; and the origami operations map into Scaladefinitions with essentially the same signatures. (In Scala, implicit parameters can onlyoccur in the last parameter position.)

There are two things to note in the Scala version. Firstly, because evaluation in Scala isstrict, we cannot just write the following in the definition of cata:

36 Bruno Oliveira and Jeremy Gibbons

case class Fix [F [ , ],a] (out: F [a,Fix [F,a] ])

trait BiFunctor[F [ , ] ] {def bimap[a,b,c,d] : (a⇒ b) ⇒ (c⇒ d) ⇒ F [a,c] ⇒ F [b,d]def fmap2[a,c,d] : (c⇒ d) ⇒ F [a,c] ⇒ F [a,d] = bimap(id)

}

def map[a,b,F [ , ] ] (f :a⇒ b) (t :Fix [F,a]) (implicit ft :BiFunctor[F ]) :Fix [F,b] =Fix [F,b] (ft.bimap(f ) (map[a,b,F ] (f )) (t.out))

def cata[a, r,F [ , ] ] (f :F [a, r ] ⇒ r) (t :Fix [F,a]) (implicit ft :BiFunctor[F ]) : r =f (ft.fmap2(cata[a, r,F ] (f )) (t.out))

def ana[a, r,F [ , ] ] (f : r ⇒ F [a, r ]) (x : r) (implicit ft :BiFunctor[F ]) :Fix [F,a] =Fix [F,a] (ft.fmap2(ana[a, r,F ] (f )) (f (x)))

def hylo[a,b,c,F [ , ] ](f : a⇒ F [c,a]) (g: F [c,b] ⇒ b) (x :a) (implicit ft :BiFunctor[F ]) :b =g (ft.fmap2(hylo[a,b,c,F ] (f ) (g)) (f (x)))

def build[a,F [ , ] ] (f : {def apply[b] : (F [a,b] ⇒ b) ⇒ b}) = f .apply(Fix [F,a])

Fig. 13. Origami in Scala

trait ListF [a, r ]case class Nil [a, r ] extends ListF[a, r ]case class Cons[a, r ] (x :a,xs: r) extends ListF [a, r ]

implicit object biList extends BiFunctor[ListF] {def bimap[a,b,c,d] = f ⇒ g⇒ {

case Nil () ⇒ Nil ()case Cons(x,xs) ⇒ Cons(f (x),g (xs))

}}

type List[a] = Fix [ListF,a]

def nil [a] :List[a] = In [ListF,a] (Nil ())def cons[a] = (x :a) ⇒ (xs:List[a]) ⇒ In [ListF,a] (Cons(x,xs))

Fig. 14. Lists as a fixpoint

f ◦ ft.fmap2(cata[a, r,F ] (f ))◦ ( .out)

(the syntax( .m) is syntactic sugar for(x⇒ x.m); in other words, ‘ ’ denotes an ‘anony-mous’ lambda variable). Under strict evaluation, the abovedefinition would expand in-definitely; we have to write it less elegantly using application rather than composition.Secondly, higher-ranked types are once again required; we have to encode them in Scala—see Section 10.3 for more details.

9.2 Using the library

Figure 14 captures the shape of lists as a type constructorListF; the two possible shapesfor lists are defined with the case classesNil andCons. TheBiFunctorobjectbiList definesthebimapoperation for the list shape. Lists are obtained simply by applying Fix to ListF.

Journal of Functional Programming 37

trait TC {type A; type B}

trait BiFunctor[S<:BiFunctor[S] ] extends TC {self : S⇒def bimap[c,d] (f :A⇒ c,g: B⇒ d) :S{type A = c; type B = d}

}

trait Fix [S<:TC,a] {def map[b] (f :a⇒ b) : Fix [S,b]def cata[b] (f :S{type A = a; type B = b} ⇒ b) : b

}case class In [S<: BiFunctor[S],a] (out: S{type A = a; type B = Fix [S,a]}) extends Fix [S,a] {

def map[b] (f :a⇒ b) : Fix [S,b] = In (out.bimap(f , .map(f )))def cata[b] (f :S{type A = a; type B = b} ⇒ b) : b = f (out.bimap(id, .cata(f )))

}

def ana[s<:BiFunctor[s],a,b] (f : b⇒ s{type A = a; type B = b}) (x :b) : Fix [s,a] =In (f (x).bimap(id,ana(f )))

def hylo[s<:BiFunctor[s],a,b,c](f :b⇒ s{type A = a; type B = b},g: s{type A = a; type B = c} ⇒ c) (x :b) : c =g (f (x).bimap(id,hylo[s,a,b,c] (f ,g)))

trait Builder[S<:BiFunctor[S],a] {final def build () :Fix [S,a] = bf (In [S,a])def bf [b] (f :S{type A = a; type B = b}⇒ b) : b

}

Fig. 15. Origami in Scala, after Moorset al.

The figure also shows functionsnil andconsthat play the role of the two constructors forlists.

We can now define operations on lists using the origami operators. A simple example isthe function that sums all the elements of a list of integers:

def sumList= cata[Int, Int,ListF] {

case Nil () ⇒ 0case Cons(x,n) ⇒ x+n

}

9.3 Evaluation of the approach

Figure 15 presents Moorset al.’s object-oriented encoding of the origami operators (slightlyadapted due to intervening changes in Scala syntax), and Figure 16 shows the specializationto lists. Compared to this object-oriented (OO) encoding, our more functional (FP) stylehas some advantages. The most significant difference between the two is that the OOencoding favours representing operations as methods attached to objects, and provided witha distinguished ‘self’ parameter, whereas the FP encoding favours representing operationsas global functions, independent of any object. In particular, in the OO encoding of the typeclassBiFunctor, the methodbimaptakes just two functions, whereas in the FP encodingit takes a data structure too; the OO encoding of thecataoperation is as a method of theclassIn, with a recursive data structure as a ‘self’ parameter, whereas the FP encoding is

38 Bruno Oliveira and Jeremy Gibbons

trait ListF extends BiFunctor[ListF]

case class NilF [a,b] extends ListF {type A = a; type B = bdef bimap[c,d] (f :a⇒ c,g: b⇒ d) : NilF [c,d] = NilF ()

}case class ConsF[a,b] (x :a,xs:b) extends ListF {

type A = a; type B = bdef bimap[c,d] (f :a⇒ c,g: b⇒ d) : ConsF[c,d] = ConsF(f (x),g (xs))

}

type List[A] = Fix [ListF,A]

Fig. 16. Lists as a fixpoint, after Moorset al.

as a global function, with the recursive data structure passed explicitly. The OO approachrequires more advanced language features, and leads to problems with extensibility, as weshall discuss.

The dependence on the self parameter in the OO encoding requiresexplicit self types.This is seen in the definition of the traitBiFunctor:

trait BiFunctor[S<:BiFunctor[S] ] . . . {self :S⇒ . . .}

trait ListF extends BiFunctor[ListF]

Note thatListF is given a recursive type bound, and that theS parameter ofBiFunctor isgiven both an upper bound (namelyBiFunctor[S]) and a lower bound (through theselfclause, explicitly specifying the self type: an ‘instance of the type class’ such asListFcannot instantiate theSparameter to anything more specific thanListF itself). Moorset al.(2006) explain the necessity of this elaborate construction for guaranteeing type safety; itis not required at all in the FP encoding.

A second characteristic of the OO encoding is the way operations are attached to ob-jects as methods; for example,cata is a method of the case classIn, rather than a globalfunction. This works smoothly for operations consuming a single distinguished instance ofthe recursive datatype, such ascata. However, it doesn’t work for operations that producerather than consume, and take no instance, such asana; these appear outside the case classinstead. (And of course, it is well-known (Bruceet al., 1995) that it doesn’t work well forbinary methods such as ‘zip’ either.)

In addition to the awkward asymmetry introduced betweencata andana, the associa-tion of consumer methods with a class introduces an extensibility problem: adding newconsumers, such as monadic map (Meijer & Jeuring, 1995), paramorphism (Meertens,1992), or idiomatic traversal (Gibbons & Oliveira, 2009), requires modifications to existingcode. Moorset al. (2006) address this second problem through an ‘extensible encod-ing’, expressed in terms ofvirtual classes—that is, nested classes in a superclass thatare overridable in a subclass. Since Scala does not provide such a construct, this virtualclass encoding has itself to be encoded in terms of type members of the enclosing class,which are overridable. No such sophistication is needed in the FP approach: a new origamioperator is a completely separate function.

Restricting attention now to the FP approach we describe, how does the Scala imple-mentation compare with the Haskell one? Scala is syntactically rather more noisy than

Journal of Functional Programming 39

Haskell, for a variety of reasons: the use of parentheses rather than simple juxtapositionfor function application; explicit binding of type variables, for example in indicating thatbimapis parametrized by the four typesa,b,c,d; the lack of eta reduction because of call-by-value, as discussed above. However, the extra noise is not too distracting—and indeed,the extra explicitness in precedence might make this kind ofhigher-order datatype-genericprogramming more accessible to those not fluent in the language.

On the positive side, the translation is quite direct, and the encoding rather transparent;the code in Figure 13 is not that much more intimidating than that in Figure 12. Scala evenhas some lessons to teach Haskell; for example, the ‘anonymous case analysis’, as used inthe definitions ofbiList andsumList, would be nice syntactic sugar for the Haskell idiom‘λx→ case x of . . .’.

10 Discussion

10.1 Haskell versus Scala

Scala differs significantly from Haskell, and we were curious to know what were its ad-vantages and disadvantages when implementing generic programming libraries. This workwas done using the Glasgow Haskell Compiler version 6.10 andScala version 2.7, whichwere the latest official releases at the time of writing. However, the languages will keepevolving, and in the future it is likely that both languages will provide better support forgeneric programming. Indeed, the next version (2.8) of the Scala compiler will supporta few features that could have been useful for our work:context bounds, which providea compact syntax for implicits;prioratized overlapping implicits, which provide an al-ternative to overlapping instances; andtype-inference for type constructors. However, forconsistency with the results presented in this paper, we shall not consider these features inthe discussion that follows.

Generally speaking, Haskell has a few advantages over Scala:

Laziness: Some approaches to generic programming rely, one way or another, on laziness.While laziness comes without effort in Haskell, it does not in Scala, and we need to paymore attention to evaluation order: we had to adapt the origami definitions in Section 9,and introduce call-by-name arguments in theRViewconstructor in Figure 5.

Type inference: Haskell has good support for type inference, which helps to reduce theeffort and clutter demanded by generic programming libraries. Scala’s support for typeinference is not as good, and this leads to additional verbosity and complexity of use.

Syntactic clarity: While Scala’s syntax is more elegant than that of Java or C#, it is stillmore verbose than Haskell’s. In particular, we have to declare more types in Scala, andneed to write extra type annotations. Also, the syntax for implicits can be a bit unwieldy,and case classes can be slightly more cumbersome than Haskell’s data declarations.

Purity: Some generic programming approaches have strong theoretical foundations thatprovide a good framework for reasoning. However, in a language that does not carefullycontrol side effects, the properties that one would expect may not hold. Haskell is apurely functional programming language, which means that functions will not havesilent side-effects (except for non-termination); Scala provides no such guarantees.

40 Bruno Oliveira and Jeremy Gibbons

Higher-ranked types: Some implementations of Haskell provide support for higher-rankedtypes, while in Scala they need to be encoded. Because higher-ranked types play a rolein some aspects of DGP, the additional overhead required by the encoding can be asignificant drawback.

On the other hand, Scala has its own advantages:

Open datatypes with case classes: As noted in Section 5, case classes support the easyaddition of new variants to a datatype. As a consequence, we can have an extensibledatatype of type representations, which allows the definition of generic functions withad-hoc cases.

Generalized type classes with implicit parameters: In Haskell, type class ‘dictionaries’are always implicitly passed to functions. However, it is sometimes convenient to ex-plicitly construct and pass a dictionary (Kahl & Scheffczyk, 2001; Dijkstra & Swierstra,2005). The ability to override implicit dictionaries is a desirable feature for genericprogramming (Loh, 2004, Chapter 8).

Inheritance: Another advantage of Scala is that we can easily reuse generic functions viainheritance. In Haskell, although we can simulate this formof reuse in several ways,there is no natural way to do so.

Expressive type system: The combination of subtyping, higher kinds, abstract types, im-plicit parameters, traits and mixins (among other features) provides Scala with an im-pressively powerful type system. Although we do not fully exploit the expressivity inthis paper, Oliveira (2007, Chapter 5) shows how Scala’s type system can shine whenimplementing modularly extensible generic functions.

Minor conveniences: We found the support for anonymous case analysis (discussedinSection 9.3) quite neat and useful. Although we seldom need to provide type annotationsin Haskell expressions, they can be quite tricky to get rightwhen they are needed; inScala this is easier. Finally, Scala’s implicits can avoid the need for some of the typeclasses and instances that would be needed in Haskell (see the discussion in Section 7.7).

10.2 Support for DGP in Scala and Haskell

The most noticeable difference between the Haskell and Scala approaches to DGP is thattype classes and datatypes are essentially two separate mechanisms in Haskell; in contrast,in Scala, the same mechanism—Scala’s object system—is used, albeit in different ways,to define standard OO hierarchies and algebraic datatype-like structures.

Figure 17 extends the table presented in Figure 2 to include the approaches presented inSections 5 and 6, which can be considered to be the equivalents of the Haskell approachesin Scala. Specifically, case classes are used to implement the datatype approach in Scala,while standard OO classes (with implicits) are used to implement the type class approach.We discuss and summarize the results in the table for the Scala approaches next.

Convenience. Defining and using generic functions with case classes is quite natural, sothis mechanism scores ‘good’ for both aspects of convenience. Compared to Haskell,in Scala there is an advantage of using datatypes of type representations because thevalue of the type representation can be implicit, whereas inHaskell (without resorting

Journal of Functional Programming 41

. . .Haskell. . . . . .Scala. . .Datatypes Type classes Case classes Standard classes

Convenience:Defining generic functions G# G#

Using generic functions G#

Implicit explicit parametrization # #

Extensibility #

First-class generic functions # G# G#

Reuse of generic functions # # 1

G#2

Exotic types G# G# G#

Fig. 17. Evaluation of the Haskell mechanisms for DGP. Key: =‘good’,G#=‘sufficient’,#=‘poor’support. Notes: 1) generic functions need to be written using classes rather than function definitions;2) reuse can be achieved in Scala via inheritance with virtual types.

to type classes) it has to be explicit. Using standard classes and implicits to implementthe type class based approach confers no advantage over Haskell in terms of convenience.Approaches based on both Haskell’s type classes and Scala’sstandard classes score only‘sufficient’ for defining generic functions, since there is additional overhead compared tousing a datatype of type representations.

Implicit explicit parametrization. The Scala approaches do well in this respect becauseof the implicits mechanism, which allows values to be passedimplicitly or explicitly. InHaskell, the choice of mechanism determines the choice of parametrization: datatypesrequire explicitly passed values, whereas type classes require implicitly passed dictio-naries. In other words, unlike in Haskell, implicit or explicit parametrization in Scala isindependent of the particular mechanism chosen for implementing the DGP library.

Extensibility. This is another area in which Scala does well. As with the Haskell typeclass approach, using Scala classes to define generic functions provides extensibility bydefault. However, unlike Haskell, the datatype of type representations in Scala can also beextensible, since case classes are open. Furthermore, the case class mechanism provides asafer alternative to open datatypes and functions, preserving the advantages of static typingand avoiding pattern match failures by using sealed classes.

First-class generic functions. In this area the results are mixed. On the one hand, Scaladoes support first-class generic functions, and it is possible to abstract over the type of thegeneric function directly in a type-class based approach. On the other hand, Scala does notprovide native support for higher-ranked types, which addscomplexity and verbosity togeneric functions. For this reason Scala only scores ‘sufficient’.

Reuse of generic functions. Scala does well here in comparison to Haskell: inheritancesupports reuse of generic functions quite naturally. In thecase class approach, this supportis quite direct, and can be used effectively to define new generic functions by inheritingfrom existing ones. A small inconvenience, though, is that we need to write functiondefinitions using classes in order to be able to exploit inheritance. It is also possible touse inheritance to achieve reuse using the standard classesapproach, but nested types and

42 Bruno Oliveira and Jeremy Gibbons

virtual types are required. Such a solution is not presentedhere, but is shown by Oliveira(2007, Chapter 5). However, this latter solution is rather involved and heavyweight, whichhinders usability. Ultimately, we think that support for inheritance is helpful for genericprogramming, and that Scala is worthy of full marks for the case class approach.

Exotic types. This is an area in which Haskell is, for the most part, better than Scala. Thetwo main reasons are Haskell’s better support for type inference and for higher-rankedtypes. Because of that support, exotic types can be used cleanly with the datatype oftype representations approach. In contrast, Scala’s solution is hindered by the additionalverbosity required due to the lack of native support for higher-ranked types and the lesscomplete type inference. The standard classes approach in Scala has the merit of directlysupporting the solution proposed by Hinze and Peyton Jones (2000), but like the otherScala solution the cost in terms of usability is quite high. Therefore, in this area, Scala onlyscores ‘sufficient’.

10.3 Idiomatic Scala

Throughout this paper, we have been using a functional programming style heavily in-fluenced by Haskell and somewhat different from conventional Scala. What are the keytechniques in this programming style?

Making the most of type inference. Scala does not support type inference in the sameway that Haskell does. As explained in Section 3.5, in a definition like

def power(x : Int) : Int = twice((y : Int) ⇒ y∗y,x)the return type ofpowerand type of the lambda-boundy can be inferred, but the type ofthe parameterx cannot. Although in this particular case the type annotations are not toodaunting, for some definitions taking several arguments while possibly being implementedor redefined in subclasses, this can become a burden. A simpletrick can help the com-piler (at least to try) to infer argument types: use lambda expressions rather than passingparameters. That is, transform a parametrized method:

def f (x1 : t1, . . . ,xn : tn) : tn+1 = einto a parameterless method with a higher-order value:

def fT : t1 ⇒ . . . ⇒ tn ⇒ tn+1 = x1 ⇒ . . . ⇒ xn ⇒ eThen the typet1 ⇒ . . . ⇒ tn ⇒ tn+1 can possibly be inferred, allowing a definition withouttype annotations:

def fT = x1 ⇒ . . . ⇒ xn ⇒ eThe main difference betweenf andfT is that the former can be (name) overloaded, whilethe latter cannot. As discussed in Section 3.5, name-overloaded definitions pose a chal-lenge to type-inference. This transformation is used a few times to make the most of typeinference, avoiding cluttering definitions with redundanttype annotations; see for examplethe methods ofGenericin Figure 8, andbimapin Figure 13.

Type class programming. As we have seen, type classes can be encoded with implicitparameters. However, object-oriented classes are more general than type classes, because

Journal of Functional Programming 43

they can contain data. It is possible to mix ideas from traditional OO programming withideas inspired by type classes. For example, Moorset al. (2008) define the trait

trait Ord[T ] {

def 6 (other: T) : Boolean}

in order to encode the Haskell type classOrdclass Ord t where

(6) :: t → t → BoolThere is a significant difference between the two approaches: an instance of the traitOrd[T ]

will contain data, since theself variable plays the role of the first argument; whereas aninstance of the type classOrd is essentially a dictionary containing a binary operation,withno value of typet. In this paper, we use the classic Haskell type class approach instead ofthe OO approach. As we saw in Section 9, sometimes merging the‘type class’ with thedata can lead to extensibility problems that can be avoided by keeping the two conceptsseparate.

Encoding higher-ranked types. Some more advanced Haskell libraries exploit higher-ranked types (Odersky & Laufer, 1996). Scala does not support higher-ranked types di-rectly, but these can be easily encoded using a class with a single method that has somelocal type arguments. However, this encoding requires a new(named) class, which cansignificantly obscure the intent of the code. In this paper, we make use of Scala’s structuraltypes to avoid most of the clutter of the encoding. The idea issimple: the Haskell definition

func::∀a.(∀b.b→ b) → a→ ais encoded in Scala as:

def func[a] : {def apply[b] :b⇒ b}⇒ a⇒ aThe type{def apply[b] : b ⇒ b} stands forsomeclass with a methodapply[b] : b ⇒

b. Structural types allow a definition that is nearly as short as the Haskell one. As afinal remark, we note that this encoding makes it very easy to use parameter bounds. Forexample, to enforceb<:a it suffices to write

def func[a] : {def apply[b<:a] :b⇒ b}⇒ a⇒ aIf we had used a separate named class, we would have had to parametrize that class withthe extra type bound arguments (Washburn, 2008).

To our knowledge, this is the first time such an encoding for higher-ranked types hasbeen observed in the literature. We believe that providing primitive support for higher-ranked types in Scala using this encoding as a basis should befairly simple.

Functional inheritance. In Scala, all functions are objects and, as such, are amenableto inheritance when it comes to reuse. Unfortunately, whileScala does support the con-ventional notation of function definition, this notation does not support inheritance. Adefinition like:

def func(x : Int) : Int = e(wheree is an expression that may depend onx) needs to be rewritten as:

case class Funcextends (Int ⇒ Int) {def apply(x : Int) = e

}

44 Bruno Oliveira and Jeremy Gibbons

Provided that functions are written in this style, then inheritance allows the reuse of func-tion definitions, as demonstrated in Section 5.3.

10.4 Porting generic programming libraries to Scala

There has been a flurry of recent proposals for generic programming libraries in Haskell(Cheney & Hinze, 2002; Hinze, 2006; Lammel & Peyton Jones, 2005; Oliveiraet al., 2006;Hinzeet al., 2006; Weirich, 2006; Hinze & Loh, 2007; Mitchell & Runciman, 2007; Brown& Sampson, 2009), all having interesting aspects but none emerging as clearly the bestoption. An international committee has been set up to develop a standard generic program-ming library in Haskell. Their first effort (Rodriguezet al., 2008) is a detailed comparisonof most of the current library proposals, identifying the implementation mechanisms andthe compiler extensions needed.

The majority of the features required by those libraries translate well into Scala; theapproaches investigated in this paper are quite representative of the mechanisms requiredby most generic programming libraries. There are, however,some questions about someof the Haskell features. For example, certain approaches use type class extensions suchasundecidable instances, overlapping instances, andabstraction over type classes, whichrely on sophisticated instance selection algorithms implemented in the latest Haskell com-pilers; one example is the approach discussed in Section 8. As we have seen, it is possibleto implement such an approach in Scala, but the lack of support for type inference forhigher kinds and something likeoverlapping instancesmeans that additional explicitnessand effort is required in Scala. Therefore, approaches thatmake intensive use of advancedtype class features can be ported, but they may lose some usability in Scala.

Something that Scala does not have is a meta-programming facility. Some of the genericprogramming libraries useTemplate Haskell(Sheard & Peyton Jones, 2002) to automati-cally generate the code necessary for type representations. In Scala, those would need to begenerated manually, or a code generation tool would need to be developed. TheScrap yourBoilerplateapproach (Lammel & Peyton Jones, 2003) relies on the ability to automaticallyderive instances ofData andTypeable; in Scala there is noderiving mechanism, so thiswould entail defining instances manually.

11 Conclusions

The goal of this paper was not to promote a particular approach to generic programming.Instead, we were more interested in investigating how the language mechanisms of Haskellused in various generic programming techniques could be adapted to Scala. We hopethat this work can serve as a foundation for future development of generic programminglibraries in Scala: all of the approaches discussed in this paper could serve as good startingpoints for more complete libraries. Moreover, other approaches can still benefit from thediscussions we present.

As we have argued, Scala has some features that are very useful in a datatype-genericprogramming language. We expect that other programming languages (in particular, Haskell)can learn some lessons from Scala by borrowing these features. Conversely, Haskell hassome features useful for DGP that are not available in Scala,but which would be nice to

Journal of Functional Programming 45

have. Ultimately, we believe that we have pinpointed limitations of some general-purposelanguage mechanisms for implementing DGP libraries; hopefully, this will motivate thedevelopment of improved mechanisms or programming languages. Oliveira and Sulzmann(2008) have already done some preliminary work in that direction by proposing a gener-alized class system for a Haskell-like language that is partly inspired by Scala, and whichallows both implicit and explicit passing of dictionaries.

Acknowledgements

Some of the material from this paper is partly based on the first author’s DPhil thesis(Oliveira, 2007, Chapters 2 and 4); particular thanks are due to the DPhil examiners RalfHinze and Martin Odersky for their insightful advice and constructive comments. AdriaanMoors made many useful suggestions; and the members of the Scala mailing list werevery helpful in answering some questions related to this paper—their input, and guidancefrom the anonymous reviewers of theWorkshop on Generic Programmingin 2008 andof this journal, helped to improve the presentation. The work reported in this paper hasbeen supported by the UK Engineering and Physical Sciences Research Council grantGeneric and Indexed Programming(EP/E02128X) and the Engineering Research Centerof Excellence Program of Korea Ministry of Education, Science and Technology (MEST) /Korea Science and Engineering Foundation (KOSEF) grant number R11-2008-007-01002-0. Some of the work was conducted while the first author was at Oxford University Com-puting Laboratory.

References

Agrawal, Rakesh, Demichiel, Linda G., & Lindsay, Bruce G. (1991). Static type checking of multi-methods.Pages 113–128 of: Object oriented programming systems languages and applications.

Brown, Neil C. C., & Sampson, Adam T. (2009). Alloy: Fast generic transformations for Haskell.Pages 105–116 of: Haskell symposium.

Bruce, Kim, Cardelli, Luca, Castagna, Giuseppe, The Hopkins Object Group, Leavens, Gary T., &Pierce, Benjamin. (1995). On binary methods.Theory and practice of object systems, 1(3), 221–242.

Buchlovsky, Peter, & Thielecke, Hayo. (2006). A type-theoretic reconstruction of the Visitor pattern.Electronic notes in theoretical computer science, 155, 309–329. Mathematical Foundations ofProgramming Semantics.

Cheney, James, & Hinze, Ralf. (2002). A lightweight implementation of generics and dynamics.Pages 90–104 of: Haskell workshop.

Cockett, Robin, & Fukushima, Tom. 1992 (May).About Charity. Department of Computer Science,University of Calgary.

Cook, William R. (1989).A denotational semantics of inheritance. Ph.D. thesis, Brown University.

Dijkstra, Atze, & Swierstra, S. Doaitse. (2005).Making implicit parameters explicit. Tech. rept.UU-CS-2005-032. Department of Information and Computing Sciences, Utrecht University.

Gamma, E., Helm, R., Johnson, R., & Vlissides, J. (1995).Design patterns: Elements of reusableobject-oriented software. Addison-Wesley.

Gibbons, Jeremy. (2003). Origami programming.In: (Gibbons & de Moor, 2003).

Gibbons, Jeremy. (2006). Design patterns as higher-order datatype-generic programs.Pages 1–12of: Workshop on generic programming.

46 Bruno Oliveira and Jeremy Gibbons

Gibbons, Jeremy, & de Moor, Oege (eds). (2003).The fun of programming. Cornerstones inComputing. Palgrave Macmillan.

Gibbons, Jeremy, & Oliveira, Bruno C. d. S. (2009). The essence of the Iterator pattern.Journal offunctional programming, 19, 377–402.

Gibbons, Jeremy, & Paterson, Ross. (2009). Parametric datatype-genericity.Pages 85–93 of:Jansson,Patrik, & Schupp, Sibylle (eds),Workshop on generic programming.

Hall, Cordelia V., Hammond, Kevin, Peyton Jones, Simon L., &Wadler, Philip L. (1996). Typeclasses in Haskell.ACM transactions on programming languages and systems, 18(2), 109–138.

Harper, Robert, & Lillibridge, Mark. 1994 (Jan.). A type-theoretic approach to higher-order moduleswith sharing.Pages 123–137 of: Principles of programming languages.

Hinze, R., & Peyton Jones, S. (2000). Derivable type classes. Electronic notes in theoreticalcomputer science, 41(1), 5–35. Haskell Workshop.

Hinze, Ralf. (2000). Polytypic values possess polykinded types. Pages 2–27 of:Backhouse,Roland, & Oliveira, J. N. (eds),LNCS 1837: Proceedings of the fifth international conferenceon mathematics of program construction. Springer-Verlag.

Hinze, Ralf. (2003). Fun with phantom types.In: (Gibbons & de Moor, 2003).

Hinze, Ralf. (2006). Generics for the masses.Journal of functional programming, 16(4-5), 451–483.

Hinze, Ralf, & Jeuring, Johan. (2002). Generic Haskell: Practice and theory.LNCS 2793: Summerschool on generic programming.

Hinze, Ralf, & Loh, Andres. (2007). Generic programming, now! LNCS 4719: Datatype-genericprogramming.

Hinze, Ralf, & Loh, Andres. (2009). Generic programming in3D. Science of computerprogramming, 74(8), 590–628.

Hinze, Ralf, Loh, Andres, & Oliveira, Bruno C. d. S. (2006).‘Scrap your Boilerplate’ reloaded.Pages 13–29 of: LNCS 3945: Functional and logic programming.

Hughes, John. (1999). Restricted data types in Haskell. Meijer, Erik (ed), Haskell workshop.Technical Report UU-CS-1999-28, Utrecht University.

Jansson, Patrik. 2000 (May).Functional polytypic programming. Ph.D. thesis, Computing Science,Chalmers University of Technology and Goteborg University, Sweden.

Kahl, Wolfram, & Scheffczyk, Jan. (2001). Named instances for Haskell type classes.Pages 77–99of: Haskell workshop.

Lammel, Ralf, & Peyton Jones, Simon. (2003). Scrap your boilerplate: A practical design pattern forgeneric programming.Pages 26–37 of: Types in language design and implementation.

Lammel, Ralf, & Peyton Jones, Simon. (2005). Scrap your boilerplate with class: Extensible genericfunctions.Pages 204–215 of: International conference on functional programming.

Lammel, Ralf, Visser, Joost, & Kort, Jan. 2000 (July). Dealing with large bananas.Pages 46–59 of:Jeuring, Johan (ed),Workshop on generic programming.

Leroy, Xavier. (1994). Manifest types, modules, and separate compilation. Pages 109–122 of:Principles of programming languages.

Lieberherr, Karl. (1996).Adaptive object-oriented software: The Demeter Method with propagationpatterns. PWS Publishing.

Loh, Andres. (2004).Exploring Generic Haskell. Ph.D. thesis, Utrecht University.

Loh, Andres, & Hinze, Ralf. (2006). Open data types and openfunctions. Pages 133–144 of:Principles and practice of declarative programming.

McBride, Conor, & Paterson, Ross. (2008). Applicative programming with effects. Journal offunctional programming, 18(1).

Meertens, Lambert. (1992). Paramorphisms.Formal aspects of computing, 4(5), 413–425.

Journal of Functional Programming 47

Meijer, Erik, & Jeuring, Johan. (1995). Merging monads and folds for functional programming.LNCS 925: Advanced functional programming. Springer-Verlag.

Meijer, Erik, Fokkinga, Maarten, & Paterson, Ross. (1991).Functional programming with bananas,lenses, envelopes and barbed wire.Pages 124–144 of:Hughes, John (ed),LNCS 523: Functionalprogramming languages and computer architecture.

Mitchell, Neil, & Runciman, Colin. (2007). Uniform boilerplate and list processing.Pages 49–60of: Haskell workshop.

Moors, Adriaan. 2007 (June).Code-follows-type programming in Scala. http://www.cs.kuleuven.be/ ˜ adriaan/?q=cft_intro .

Moors, Adriaan, Piessens, Frank, & Joosen, Wouter. (2006).An object-oriented approach todatatype-generic programming.Pages 96–106 of: Workshop on generic programming.

Moors, Adriaan, Piessens, Frank, & Odersky, Martin. (2008). Generics of a higher kind.Object-oriented programming, systems, languages, and applications.

Odersky, Martin. (2006a).An Overview of the Scala programming language (second edition). Tech.rept. IC/2006/001. EPFL Lausanne, Switzerland.

Odersky, Martin. 2006b (July).Poor man’s type classes. http://lamp.epfl.ch/ ˜ odersky/talks/wg2.8-boston06.pdf .

Odersky, Martin. 2007a (May).Scala by example. http://scala.epfl.ch/docu/files/ScalaIntro.pdf .

Odersky, Martin. 2007b (May).The Scala language specification, version 2.4. http://scala.epfl.ch/docu/files/ScalaReference.pdf .

Odersky, Martin, & Laufer, Konstantin. (1996). Putting type annotations to work.Pages 54–67 of:Principles of programming languages.

Odersky, Martin, & Zenger, Matthias. (2005). Scalable component abstractions.Pages 41–57 of:Object oriented programming, systems, languages, and applications.

Odersky, Martin, Zenger, Christoph, & Zenger, Matthias. (2001). Colored local type inference.Pages41–53 of: Principles of programming languages.

Odersky, Martin, Spoon, Lex, & Venners, Bill. (2008).Programming in Scala: A comprehensivestep-by-step guide. 1st edn. Artima Inc.

Oliveira, Bruno, & Gibbons, Jeremy. (2005). TypeCase: A design pattern for type-indexed functions.Pages 98–109 of: Haskell workshop. New York, NY, USA: ACM Press.

Oliveira, Bruno C. d. S. (2007).Genericity, extensibility and type-safety in theV ISITOR pattern.D.Phil. thesis, University of Oxford.

Oliveira, Bruno C. d. S. 2009a (July). Modular visitor components: A practical solution to theexpression families problem.Pages 269–293 of:Drossopoulou, Sophia (ed),23rd Europeanconference on object oriented programming (ECOOP).

Oliveira, Bruno C. d. S. 2009b (Sept.).Scala for Generic Programmers: Source code. http://www.comlab.ox.ac.uk/projects/gip/Scala.tgz .

Oliveira, Bruno C. d. S., & Sulzmann, Martin. 2008 (Apr.).Objects to unify type classes and GADTs.http://www.comlab.ox.ac.uk/people/Bruno.Oliveira/ob jects.pdf .

Oliveira, Bruno C. d. S., Hinze, Ralf, & Loh, Andres. 2006 (Apr.). Extensible and modular genericsfor the masses.Pages 109–138 of: Trends in functional programming.

Oliveira, Bruno C. d. S., Moors, Adriaan, & Odersky, Martin.2010 (October). Type classes as objectsand implicits. Rinard, Martin (ed),Systems, programming, languages and applications: Softwarefor humanity (SPLASH). To appear.

Peyton Jones, Simon, Vytiniotis, Dimitrios, Weirich, Stephanie, & Washburn, Geoffrey. (2006).Simple unification-based type inference for GADTs.Pages 50–61 of: International conferenceon functional programming.

48 Bruno Oliveira and Jeremy Gibbons

Rodriguez, Alexey, Jeuring, Johan, Jansson, Patrik, Gerdes, Alex, Kiselyov, Oleg, & Oliveira, BrunoC. d. S. (2008). Comparing libraries for generic programming in Haskell.Haskell symposium.

Scharli, Nathanael, Ducasse, Stephane, Nierstrasz, Oscar, & Black, Andrew. 2003 (July). Traits:Composable units of behavior.Pages 248–274 of: LNCS 2743: European conference on object-oriented programming.

Schinz, Michel. 2007 (May).A Scala tutorial for Java programmers. http://scala.epfl.ch/docu/files/ScalaTutorial.pdf .

Sheard, Tim, & Peyton Jones, Simon. (2002). Template meta-programming for Haskell.Haskellworkshop.

Sulzmann, Martin, & Wang, Meng. (2006). Modular generic programming with extensiblesuperclasses.Pages 55–65 of: Workshop on generic programming. New York, NY, USA: ACM.

Wadler, Philip. (1993). Monads for functional programming. Program design calculi. Springer-Verlag.

Wadler, Philip. 1998 (Nov.).The expression problem. Java Genericity Mailing list.http://www.cse.ohio-state.edu/ ˜ gb/cis888.07g/java-genericity/20 .

Washburn, Geoffrey. 2008 (May). Revisiting higher-rank impredicative polymorphismin Scala. http://existentialtype.net/2008/05/26/revisiting-hi gher-rank-impredicative-polymorphism-in-scala/ .

Weirich, Stephanie. (2006). RepLib: a library for derivable type classes.Pages 1–12 of: Haskellworkshop.