libraries for generic programming in haskell - ?· libraries for generic programming in haskell...
Post on 25-Jul-2018
Embed Size (px)
Libraries for Generic Programmingin Haskell
Jose Pedro Magalhaes
Alexey Rodriguez Yakushev
Technical Report UU-CS-2008-025
Department of Information and Computing Sciences
Utrecht University, Utrecht, The Netherlands
Department of Information and Computing Sciences
P.O. Box 80.089
3508 TB UtrechtThe Netherlands
Libraries for Generic Programming in Haskell
Johan Jeuring, Sean Leather, Jose Pedro Magalhaes, andAlexey Rodriguez Yakushev
Universiteit Utrecht, The Netherlands
Abstract. These lecture notes introduce libraries for datatype-generic pro-gramming in Haskell. We introduce three characteristic generic program-ming libraries: lightweight implementation of generics and dynamics, ex-tensible and modular generics for the masses, and scrap your boilerplate.We show how to use them to use and write generic programs. In the casestudies for the different libraries we introduce generic components of amedium-sized application which assists a student in solving mathemati-cal exercises.
In the development of software, structuring data plays an important role. Manyprogramming methods and software development tools center around creatinga datatype (or XML schema, UML model, class, grammar, etc.). Once the struc-ture of the data has been designed, a software developer adds functionality tothe datatypes. There is always some functionality that is specific for a datatype,and part of the reason why the datatype has been designed in the first place.Other functionality is similar or even the same on many datatypes, followingcommon programming patterns. Examples of such patterns are:
in a possibly large value of a complicated datatype (for example for rep-resenting the structure of a company), applying a given action at all oc-currences of a particular constructor (e.g., adding or updating zip codes atall occurrences of street addresses) while leaving the rest of the value un-changed;
serializing a value of a datatype, or comparing two values of a datatype forequality, functionality that depends only on the structure of the datatype;
adapting data access functions after a datatype has changed, something thatoften involves modifying large amounts of existing code.
Generic programming addresses these high-level programming patterns. We alsouse the term datatype-generic programming [Gibbons, 2007] to distinguish thefield from Java generics, Ada generic packages, generic programming in C++STL, etc. Using generic programming, we can easily implement traversals inwhich a user is only interested in a small part of a possibly large value, func-tions which are naturally defined by induction on the structure of datatypes,
and functions that automatically adapt to a changing datatype. Larger exam-ples of generic programming include XML tools, testing frameworks, debug-gers, and data conversion tools.
Often an instance of a datatype-generic program on a particular datatypeis obtained by implementing the instance by hand, a boring and error-pronetask, which reduces programmers productivity. Some programming languagesprovide standard implementations of basic datatype-generic programs such asequality of two values and printing a value. In this case, the programs are inte-grated into the language, and cannot be extended or adapted. So, how can wedefine datatype-generic programs ourselves?
More than a decade ago the first programming languages appeared thatsupported the definition of datatype-generic programs. Using these program-ming languages it is possible to define a generic program, which can then beused on a particular datatype without further work. Although these languagesallow us to define our own generic programs, they have never grown out of theresearch prototype phase, and most cannot be used anymore.
The rich type system of Haskell allows us to write a number of datatype-generic programs in the language itself. The power of classes, constructor clas-ses, functional dependencies, generalized algebraic datatypes, and other ad-vanced language constructs of Haskell is impressive, and since 2001 we haveseen at least 10 proposals for generic programming libraries in Haskell usingone or more of these advanced constructs. Using a library instead of a separateprogramming language for generic programming has many advantages. Themain advantages are that a user does not need a separate compiler for genericprograms and that generic programs can be used out of the box. Furthermore, alibrary is much easier to ship, support, and maintain than a programming lan-guage, which makes the risk of using generic programs smaller. A library mightbe accompanied by tools that depend on non-standard language extensions, forexample for generating embedding-projection pairs, but the core is Haskell. Theloss of expressiveness compared with a generic programming language such asGeneric Haskell is limited.
These lecture notes introduce generic programming in Haskell using genericprogramming libraries. We introduce several characteristic generic program-ming libraries, and we show how to write generic programs using these li-braries. Furthermore, in the case studies for the different libraries we introducegeneric components of a medium-sized application which assists a student insolving mathematical exercises. We have included several exercises in these lec-ture notes. The answers to these exercises can be found in Appendix A.
These notes are organised as follows. Section 2 puts generic programminginto context. It introduces a number of variations on the theme of generics aswell as demonstrates how each may be used in Haskell. Section 3 starts ourfocus on datatype-generic programming by discussing the world of datatypessupported by Haskell and common extensions. In Section 4, we introduce li-braries for generic programming and briefly discuss the criteria we used toselect the libraries covered in the following three sections. Section 5 starts the
discussion of libraries with Lightweight Implementation of Generics and Dy-namics (LIGD); however, we leave out the dynamics since we focus on gene-rics in these lecture notes. Section 6 continues with a look at Extensible andModular Generics for the Masses (EMGM), a library using the same view asLIGD but implemented with a different mechanism. Section 7 examines ScrapYour Boilerplate (SYB), a library implemented with combinators and quite dif-ferent from LIGD and EMGM. After describing each library individually, weprovide an abridged comparison of them in Section 8. In Section 9, we take adetour from the discussion of established libraries to explore how type-indexeddatatypes can be implemented using type families. In Section 10 we concludewith some suggested reading and some thoughts about the future of genericprogramming libraries.
2 Generic programming in context (and in Haskell)
Generic programming has developed as a technique for increasing the amountand scale of reuse in code while still preserving type safety. The term genericis highly overloaded in computer science; however, broadly speaking, mostuses involve some sort of parametrisation. A generic program abstracts overthe differences in separate but similar programs. In order to arrive at specificprograms, one instantiates the parameter in various ways. It is the type of theparameter that distinguishes each variant of generic programming.
Gibbons  lists seven categories of generic programming. In this sec-tion, we revisit these with an additional twist: we look at how each may beimplemented in Haskell.
Each of the following sections is titled according to the type of the parameterof the generic abstraction. In each, we provide a description of that particularform of generics along with an example of how to apply the technique.
The most basic form of generic programming is to parametrise a computationby values. The idea goes by various names in programming languages (proce-dure, subroutine, function, etc.), and it is a fundamental element in mathema-tics. While a function is not often considered under the definition of generic,it is perfectly reasonable to model other forms of genericity as functions. Thegeneralization is given by the function g(x) in which a generic component g isparametrised by some entity x. Instantiation of the generic component is thenanalogous to application of a function.
Functions come naturally in Haskell. For example, here is function that takestwo boolean values as arguments and determines their basic equality.
eqBool :: Bool Bool BooleqBool x y = (not x not y) (x y)
We take this opportunity to introduce a few themes used in these lecturenotes. We use a stylized form of Haskell that is not necessarily what one wouldtype into a file. Here, we add a subscript, which can simply be typed, and usethe symbols () and (), which translate directly to the standard operators(&&) and (||).
As much as possible, we attempt to use the same functions for our examples,so that the similarities or differences are more evident. Equality is one suchrunning example used throughout the text.
If a function is a first-class citizen in a programming language, parametrisationof one function by another function is exactly the same as parametrisation byvalue. However, we explicitly mention this category because it enables abstrac-tion over control flow. The full power of function parameters can be seen in thehigher-order functions of languages such as Haskell and ML.
Suppose we have functions defining the logical conjunction and disjunctionof boolean values.
and :: List Bool Booland Nil