tools for refactoring functional programs simon thompson with huiqing li claus reinke

Post on 31-Mar-2015

219 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Tools for Refactoring Functional Programs

Simon Thompsonwith

Huiqing LiClaus Reinke

www.cs.kent.ac.uk/projects/refactor-fp

LIL 2006 2

Design

Models

Prototypes

Design documents

Visible artifacts

LIL 2006 3

All in the code …

Functional programs embody their design in their code.

This is enabled by their high-level nature: constructs, types …

data Message = Message Head Body

data Head = Head Metadata Title

data Metadata = Metadata [Tags]

type Title = String …

data Message = Message Head Body

data Head = Head Metadata Title

data Metadata = Metadata [Tags]

type Title = String …

LIL 2006 4

Evolution

Successful systems are long lived …

… and evolve continuously.

Supporting evolution of code and design?

LIL 2006 5

Soft-WareThere’s no single correct design …

… different options for different situations.

Maintain flexibility as the system evolves.

LIL 2006 6

RefactoringRefactoring means changing the design or structure of a program … without changing its behaviour.

RefactorModify

LIL 2006 7

Not just programmingPaper or presentation

moving sections about; amalgamate sections; move inline code to a figure; animation; …

Proof add lemma; remove, amalgamate hypotheses, …

Programthe topic of the lecture

LIL 2006 8

Splitting a function in two

LIL 2006 9

Splitting a function in two

LIL 2006 10

Splitting a function in two

LIL 2006 11

Splitting a functionmodule Split where

f :: [String] -> String

f ys = foldr (++) [] [ y++"\n" | y <- ys ]

LIL 2006 12

Splitting a functionmodule Split where

f :: [String] -> String

f ys = foldr (++) [] [ y++"\n" | y <- ys ]

LIL 2006 13

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join [y ++ "\n" | y <- ys] where join = foldr (++) []

LIL 2006 14

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join [y ++ "\n" | y <- ys] where join = foldr (++) []

LIL 2006 15

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join addNL where join zs = foldr (++) [] zs addNL = [y ++ "\n" | y <- ys]

LIL 2006 16

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join addNL where join zs = foldr (++) [] zs addNL = [y ++ "\n" | y <- ys]

LIL 2006 17

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join (addNL ys) where join zs = foldr (++) [] zs addNL ys = [y ++ "\n" | y <- ys]

LIL 2006 18

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join (addNL ys) where join zs = foldr (++) [] zs addNL ys = [y ++ "\n" | y <- ys]

LIL 2006 19

Splitting a functionmodule Split where

f :: [String] -> String

f ys = join (addNL ys) join zs = foldr (++) [] zs

addNL ys = [y ++ "\n" | y <- ys]

LIL 2006 20

Overview

Example refactorings: what they involve.

Building the HaRe tool.

Design rationale.

Infrastructure.

Haskell and Erlang.

The Wrangler tool.

Conclusions.

LIL 2006 21

Haskell 98

Standard, lazy, strongly typed, functional programming language.

Layout is significant … “offside rule” … and idiosyncratic.doSwap pnt = applyTP (full_buTP (idTP `adhocTP`

inMatch `adhocTP` inExp `adhocTP` inDecl)) where inMatch ((HsMatch loc fun pats rhs ds)::HsMatchP) | fun == pnt = case pats of (p1:p2:ps) -> do pats'<-swap p1 p2 pats return (HsMatch loc fun pats' rhs ds) _ -> error "Insufficient arguments to swap." inMatch m = return m inExp exp@((Exp (HsApp (Exp (HsApp e e1)) e2))::HsExpP) | expToPNT e == pnt = swap e1 e2 exp

inExp e = return e

doSwap pnt = applyTP (full_buTP (idTP `adhocTP` inMatch `adhocTP` inExp `adhocTP` inDecl)) where inMatch ((HsMatch loc fun pats rhs ds)::HsMatchP) | fun == pnt = case pats of (p1:p2:ps) -> do pats'<-swap p1 p2 pats return (HsMatch loc fun pats' rhs ds) _ -> error "Insufficient arguments to swap." inMatch m = return m inExp exp@((Exp (HsApp (Exp (HsApp e e1)) e2))::HsExpP) | expToPNT e == pnt = swap e1 e2 exp

inExp e = return e

LIL 2006 22

Why refactor Haskell?The only design artefact is (in) the code.

Semantics of functional languages support large-scale transformations (?)

Building real tools to support functional programming … heavy lifting.

Platform for research and experimentation.

LIL 2006 23

Lift / demotef x y = … h … where h = …

Hide a function which is clearly subsidiary to f; clear up the namespace.

f x y = … (h y) … h y = …

Makes h accessible to the other functions in the module and beyond.

Free variables: which parameters of f are used in h?Need h not to be defined at the top level, … , Type of h will generally change … .

LIL 2006 24

Algebraic or abstract type?

data Tr a

= Leaf a |

Node a (Tr a) (Tr a) Tr

Leaf

Node

flatten :: Tr a -> [a]

flatten (Leaf x) = [x]

flatten (Node s t)

= flatten s ++

flatten t

LIL 2006 25

Algebraic or abstract type?

data Tr a

= Leaf a |

Node a (Tr a) (Tr a)

isLeaf = …

isNode = …

Tr

isLeaf

isNode

leaf

left

right

mkLeaf

mkNode

flatten :: Tr a -> [a]

flatten t

| isleaf t = [leaf t]

| isNode t

= flatten (left t)

++ flatten (right t)

LIL 2006 26

Information required

Lexical structure of programs,abstract syntax,binding structure,type system andmodule system.

LIL 2006 27

Program transformationsProgram optimisation source-to-source transformations to get more efficient code

Program derivation calculating efficient code from obviously correct specifications

Refactoring transforming code structure usually bidirectional and conditional.

Refactoring = Transformation + Condition

LIL 2006 28

Conditions: renaming f to g

“No change to the binding structure”

1. No two definitions of g at the same level.

2. No capture of g.3. No capture by g.

LIL 2006 29

Capture of renamed identifier

h x = … h … f … g … where g y = …

f x = …

h x = … h … g … g … where g y = …

g x = …

LIL 2006 30

Capture by renamed identifier

h x = … h … f … g … where f y = … f … g …

g x = …

h x = … h … g … g … where g y = … g … g …

g x = …

LIL 2006 31

Refactoring by hand?

By hand = in a text editor

Tedious

Error-prone• Implementing the transformation …• … and the conditions.

Depends on compiler for type checking, …

… plus extensive testing.

LIL 2006 32

Machine support invaluableReliable

Low cost of do / undo, even for large refactorings.

Increased effectiveness and creativity.

LIL 2006 33

Demonstration of HaRe, hosted in vim.

LIL 2006 34

LIL 2006 35

LIL 2006 36

LIL 2006 37

The refactorings in HaRe

RenameDeleteLift / DemoteIntroduce definitionRemove definitionUnfoldGeneraliseAdd/remove parameters

Move def between modules

Delete/add to exports

Clean importsMake imports

explicit

data type to ADTShort-cut, warm

fusionAll module aware

LIL 2006 38

HaRe design rationale Integrate with existing development tools.

Work with the complete language: Haskell 98

Preserve comments and the formatting style.

Reuse existing libraries and systems.

Extensibility and scriptability.

LIL 2006 39

Information required

Lexical structure of programs,abstract syntax,binding structure,type system andmodule system.

LIL 2006 40

The Implementation of HaRe

Informationgathering

Informationgathering

Pre-conditionchecking

Pre-conditionchecking

Programtransformation

Programtransformation

ProgramrenderingProgramrendering

Strafunski

LIL 2006 41

Finding free variables ‘by hand’

instance FreeVbls HsExp where freeVbls (HsVar v) = [v] freeVbls (HsApp f e) = freeVbls f ++ freeVbls e freeVbls (HsLambda ps e) = freeVbls e \\ concatMap paramNames ps freeVbls (HsCase exp cases) = freeVbls exp ++ concatMap freeVbls cases freeVbls (HsTuple _ es) = concatMap freeVbls es …

Boilerplate code: 1000 noise : 100 significant.

LIL 2006 42

StrafunskiStrafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points.

Top-down / bottom up, type preserving / unifying,

full stop one

LIL 2006 43

Strafunski in useTraverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, …

Strafunski allows us to work within Haskell …

Other options? Generic Haskell, Template Haskell, AG, Scrap Your Boilerplate, …

LIL 2006 44

Rename an identifier

rename:: (Term t)=>PName->HsName->t->Maybe t rename oldName newName = applyTP worker where worker = full_tdTP (idTP ‘adhocTP‘ idSite) idSite :: PName -> Maybe PName idSite v@(PN name orig) | v == oldName = return (PN newName orig) idSite pn = return pn

LIL 2006 45

The coding effort

Transformations: straightforward in Strafunski …

… the chore is implementing conditions that the transformation preserves meaning.

This is where much of our code lies.

LIL 2006 46

Program rendering example-- This is an example

module Main where

sumSquares x y = sq x + sq y where sq :: Int->Int sq x = x ^ pow pow = 2 :: Int

main = sumSquares 10 20

module Main where

sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int

sq :: Int->Int->Intsq pow x = x ^ pow

main = sumSquares 10 20

-- This is an example

module Main where

sumSquares x y = sq pow x + sq pow y where pow = 2 :: Int

sq :: Int->Int->Intsq pow x = x ^ pow

main = sumSquares 10 20

LIL 2006 47

Token stream and ASTWhite space + comments only in token stream.

Modification of the AST guides the modification of the token stream.

After a refactoring, the program source is recovered from the token stream not the AST.

Heuristics associate comments with program entities.

LIL 2006 48

Work in progress‘Fold’ against definitions … find duplicate code.

All, some or one? Effect on the interface …

f x = … e … e …

Symbolic evaluation

Data refactorings

Interfaces … ‘bad smell’ detection.

LIL 2006 49

API and DSL

RefactoringsRefactorings

Refactoringutilities

Refactoringutilities

StrafunskiStrafunski

HaskellHaskell

Combining formsCombining forms

Library functions

Grammar as data

Strafunski

???

LIL 2006 50

What have we learned?

Efficiency and robustness of libraries in question.

• type checking large systems, • linking, • editor script languages (vim, emacs).

The cost of infrastructure in building practical tools.

Reflections on Haskell itself.

LIL 2006 51

Reflections on HaskellCannot hide items in an export list (cf import).

Field names for prelude types?

Scoped class instances not supported.

‘Ambiguity’ vs. name clash.

‘Tab’ is a nightmare!

Correspondence principle fails …

LIL 2006 52

Correspondence

Operations on definitions and operations on expressions can be placed in one to one correspondence

(R.D.Tennent, 1980)

LIL 2006 53

Correspondence

Definitions

where

f x y = e

f x | g1 = e1 | g2 = e2

Expressions

let

\x y -> e

f x = if g1 then e1 else if g2 … …

LIL 2006 54

Function clauses f x | g1 = e1

f x | g2 = e2

Can ‘fall through’ a function clause … no direct correspondence in the expression language.

f x = if g1 then e1 else if g2 …

No clauses for anonymous functions … no reason to omit them.

LIL 2006 55

Haskell 98 vs. Erlang: generalities

Haskell 98: a lazy, statically typed, purely functional programming language featuring higher-order functions, polymorphism, type classes and monadic effects.

Erlang: a strict, dynamically typed functional programming language with support for concurrency, communication, distribution and fault-tolerance.

LIL 2006 56

Haskell 98 vs. Erlang: example

%% Factorial In Erlang.

-module (fact).

-export ([fac/1]).

fac(0) -> 1;

fac(N) when N > 0 -> N * fac(N-1).

%% Factorial In Erlang.

-module (fact).

-export ([fac/1]).

fac(0) -> 1;

fac(N) when N > 0 -> N * fac(N-1).

-- Factorial In Haskell.

module Fact(fac) where

fac :: Int -> Int

fac 0 = 1

fac n | n>0 = n * fac(n-1)

-- Factorial In Haskell.

module Fact(fac) where

fac :: Int -> Int

fac 0 = 1

fac n | n>0 = n * fac(n-1)

LIL 2006 57

Haskell 98 vs. Erlang: pragmatics Type system makes implementation complex. Layout and comment preservation. Types also affect the refactorings themselves. Clearer semantics for refactorings, but more complex infrastructure.

Untyped traversals much simpler. Use the layout given by emacs. Use cases which cannot be understood statically.

Dynamic semantics of Erlang makes refactorings harder to pin down.

LIL 2006 58

Challenges of Erlang refactoring Multiple binding occurrences of variables.

Indirect function call or function spawn: apply (lists, rev, [[a,b,c]])

Multiple arities … multiple functions: rev/1

Concurrency Refactoring within a design library: OTP. Side-effects.

LIL 2006 59

Generalisation and side-effects

-module (test).

-export([f/0]).

repeat(0) -> ok;repeat(N) ->

io:format (“hello\n"), repeat(N-1).

f( ) -> repeat(5).

-module (test).

-export([f/0]).

repeat(0) -> ok;repeat(N) ->

io:format (“hello\n"), repeat(N-1).

f( ) -> repeat(5).

-module (test).

-export([f/0]).

repeat(A, 0) -> ok;repeat(A, N) ->

A, repeat(A,N-1).

f( ) -> repeat (io:format (“hello\n”), 5).

-module (test).

-export([f/0]).

repeat(A, 0) -> ok;repeat(A, N) ->

A, repeat(A,N-1).

f( ) -> repeat (io:format (“hello\n”), 5).

LIL 2006 60

Generalisation and side-effects

-module (test).

-export([f/0]).

repeat(0) -> ok;repeat(N) ->

io:format (“hello\n"), repeat(N-1).

f( ) -> repeat(5).

-module (test).

-export([f/0]).

repeat(0) -> ok;repeat(N) ->

io:format (“hello\n"), repeat(N-1).

f( ) -> repeat(5).

-module (test).

-export([f/0]).

repeat(A, 0) -> ok;repeat(A, N) ->

A(), repeat(A,N-1).

f( ) -> repeat (fun( )-> io:format (“hello\

n”), 5).

-module (test).

-export([f/0]).

repeat(A, 0) -> ok;repeat(A, N) ->

A(), repeat(A,N-1).

f( ) -> repeat (fun( )-> io:format (“hello\

n”), 5).

LIL 2006 61

The Wrangler

Scanner/ParserParse Tree

Syntax tools

AST annotated with comments

Program analysis and transformation by the refactorer

Transformed AST

Pretty printer

Program source

Program source

Refactorer

AST + comments + binding structure

LIL 2006 62

Teaching and learning design

Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.

Learning a language: learn how you could modify the programs that you have written …

… appreciate the design space, and

… the features of the language.

LIL 2006 63

ConclusionsRefactoring + functional programming: good fit.

Real win from available libraries … with work.

Substantial effort in infrastructure.

De facto vs de jure: GHC vs Haskell 98.

Correctness and verification …

Language independence …

top related