
Upload: theodore-lee

Post on 04-Jan-2016


Com2010 - Functional Programming

Regular Expressions and Abstract Data Types

Marian Gheorghe

Lecture 14

Module homepage Mole & http://www.dcs.shef.ac.uk/~marian

©University of Sheffield com2010


Functions over the type of REs are defined by recursion over the structure of the expression. Example:

literals :: RegExp -> [Char]

literals Epsilon = []

literals (Literal ch) = [ch]

literals (Or r1 r2) = literals r1 ++ literals r2

literals (Then r1 r2) = literals r1 ++ literals r2

literals (Star r) = literals r

which returns the list of the literals (characters) occurring in an RE.

Take re1 denoting ('a'|('b''c')), i.e. re1 = Or a (Then b c), where a, b and c are the literal REs for 'a', 'b' and 'c'.

This leads to literals re1 ⇒ "abc"

RE - Examples
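The example above can be tried out with the following minimal sketch. The data declaration for RegExp is not shown on this slide, so the one below is an assumption based on the constructors used in literals; the names a, b and c for the three literal REs are likewise assumed.

```haskell
-- Assumed definition of the RE type (the slide only uses its constructors).
data RegExp = Epsilon
            | Literal Char
            | Or RegExp RegExp
            | Then RegExp RegExp
            | Star RegExp

-- Collect every literal character, left to right (duplicates are kept).
literals :: RegExp -> [Char]
literals Epsilon      = []
literals (Literal ch) = [ch]
literals (Or r1 r2)   = literals r1 ++ literals r2
literals (Then r1 r2) = literals r1 ++ literals r2
literals (Star r)     = literals r

-- Assumed helper names for the three literal REs.
a, b, c :: RegExp
a = Literal 'a'
b = Literal 'b'
c = Literal 'c'

-- ('a'|('b''c'))
re1 :: RegExp
re1 = Or a (Then b c)
```

Here literals re1 evaluates to "abc". Since the function simply concatenates, a repeated literal shows up repeatedly: literals (Star (Or a a)) evaluates to "aa".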


REs are patterns and we may match a word w against each RE.

ε: w will match ε if w is the empty word.

x: w will match x if w is the one-character word consisting of x, where x is an arbitrary ASCII character.

(r1|r2): w will match (r1|r2) if w matches either r1 or r2 (or both).

(r1r2): w will match (r1r2) if w can be split into two subwords w1 and w2, w = w1++w2, so that w1 matches r1 and w2 matches r2.

(r)*: w will match (r)* if w can be split into zero or more subwords, w = w1++w2++…++wn, each of which matches r. The zero case implies that the empty string will match (r)* for any RE r.

Matching REs


The first three cases are a simple transliteration of the definitions:

matches :: RegExp -> String -> Bool

matches Epsilon st = (st=="")

matches (Literal ch) st = (st==[ch])

matches (Or x y) st = matches x st || matches y st

Catenation needs an auxiliary function:

splits :: String -> [(String,String)]
splits st = [(take n st, drop n st) | n <- [0..length st]]

splits "123" ⇒ [("","123"),("1","23"),("12","3"),("123","")]

Catenation:

matches (Then x y) st = foldr (||) False
                          [matches x st1 && matches y st2 | (st1,st2) <- splits st]

Matching against REs
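The catenation case boils down to "does any split of the word satisfy both halves?". The sketch below isolates that step so it can be run without the RegExp type; the names thenFold and thenOr and the use of plain predicates in place of matches x and matches y are assumptions made for illustration. It also shows that Prelude's or can stand in for foldr (||) False.

```haskell
-- All ways of cutting a string into a front part and a back part.
splits :: String -> [(String, String)]
splits st = [ (take n st, drop n st) | n <- [0 .. length st] ]

-- The formulation used on the slide, with predicates p and q standing in
-- for `matches x` and `matches y`.
thenFold :: (String -> Bool) -> (String -> Bool) -> String -> Bool
thenFold p q st =
  foldr (||) False [ p st1 && q st2 | (st1, st2) <- splits st ]

-- The same test written with Prelude's `or`.
thenOr :: (String -> Bool) -> (String -> Bool) -> String -> Bool
thenOr p q st = or [ p st1 && q st2 | (st1, st2) <- splits st ]
```

For instance, thenFold (== "ab") (== "c") "abc" is True because the split ("ab","c") satisfies both predicates, and thenOr gives the same answer. Both versions stop at the first successful split thanks to lazy evaluation.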


matches (Star r) st = matches Epsilon st ||
                      foldr (||) False
                        [matches r st1 && matches (Star r) st2 |
                          (st1,st2) <- splits st]

This uses the fact that Star r means Epsilon or (r followed by Star r), i.e. Or Epsilon (Then r (Star r)).

Examples:

matches (Or Epsilon (Then a (Then b c))) "abc" ⇒ True

matches (Star (Or Epsilon b)) "b" ⇒ ERROR - Control stack overflow -- OOPS!!

The problem is that the empty word is already handled by the first disjunct, matches Epsilon st, so it should be excluded when the word is split further, i.e. tuples ([],st) must be avoided. Otherwise, whenever r matches the empty word, matches (Star r) is called again on the whole of st and the recursion never ends.

Matches against Star r


matches (Star r) st = matches Epsilon st ||
                      foldr (||) False
                        [matches r st1 && matches (Star r) st2 |
                          (st1,st2) <- frontSplits st]

where

frontSplits :: String -> [(String,String)]
frontSplits st = [(take n st, drop n st) | n <- [1..length st]]

matches (Star (Or Epsilon b)) "b" ⇒ True

"b" has been successfully matched against the RE (ε|'b')*.

Matches against Star r - again
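Putting the five cases together gives a complete matcher. The following is a sketch under the same assumptions as before: the RegExp declaration is inferred from the constructors used on the slides, a, b and c are assumed names for the literal REs, and Prelude's or is used in place of foldr (||) False.

```haskell
-- Assumed definition of the RE type.
data RegExp = Epsilon
            | Literal Char
            | Or RegExp RegExp
            | Then RegExp RegExp
            | Star RegExp

-- Every split of a word, including those with an empty front or back.
splits :: String -> [(String, String)]
splits st = [ (take n st, drop n st) | n <- [0 .. length st] ]

-- Only splits whose front part is non-empty.
frontSplits :: String -> [(String, String)]
frontSplits st = [ (take n st, drop n st) | n <- [1 .. length st] ]

matches :: RegExp -> String -> Bool
matches Epsilon      st = st == ""
matches (Literal ch) st = st == [ch]
matches (Or x y)     st = matches x st || matches y st
matches (Then x y)   st =
  or [ matches x st1 && matches y st2 | (st1, st2) <- splits st ]
-- The front part is never empty, so st2 is strictly shorter than st
-- and the recursion on (Star r) terminates.
matches (Star r)     st =
  matches Epsilon st
    || or [ matches r st1 && matches (Star r) st2 | (st1, st2) <- frontSplits st ]

-- Assumed helper names for literal REs.
a, b, c :: RegExp
a = Literal 'a'
b = Literal 'b'
c = Literal 'c'
```

With these definitions matches (Star (Or Epsilon b)) "b" evaluates to True instead of overflowing the stack, and matches (Star a) "aab" evaluates to False.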


21.1 Representing Rationals

21.2 Haskell modules

Abstract Data Types


Data are abstract in the sense that the programmer does not need to care about how they are implemented.

Their implementation is abstracted and hidden from the user. All that the programmer needs to know are the generic operations for constructing and manipulating elements of the data type at hand.

Data abstraction is a very important design principle which consists in separating the definition or representation of a data type from its use.

Abstract data types - Introduction


Goal: designing a system to perform simple arithmetic operations with rational numbers.

We’ll start by representing rationals …

Every rational number r is of the form r = n/d, with n the numerator and d the (non-zero) denominator. It will be represented as a pair (n,d).

Thus the type of rational numbers:

type Rat = (Int,Int)

What’s next? Implement operations.

Representing Rationals

In order to implement addition and multiplication of rational numbers with respect to the usual priority rules of these operations we write:

infixl 7 `rmult`

infixl 6 `radd`

rmult :: Rat -> Rat -> Rat

rmult (n_1,d_1) (n_2,d_2) = (n_1*n_2,d_1*d_2)

radd :: Rat -> Rat -> Rat

radd (n_1,d_1) (n_2,d_2) = (n_1*d_2+n_2*d_1,d_1*d_2)

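Now try (2,1) `radd` (2,1) `rmult` (3,2)

Remove the priority rules (the infixl declarations) and try it again. What happens?

With the declarations, (2,1) `radd` (2,1) `rmult` (3,2) means (2,1) `radd` ((2,1) `rmult` (3,2)), because `rmult` binds more tightly than `radd`. Without them both operators get Haskell's default fixity, infixl 9, and the expression is read as ((2,1) `radd` (2,1)) `rmult` (3,2).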

Addition and multiplication
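The effect of the fixity declarations can be checked with a small sketch. The definitions are those of the slide; the names withFixity and leftGrouped are assumptions introduced here, and leftGrouped spells out by hand the grouping that would result if both operators had the same precedence.

```haskell
type Rat = (Int, Int)

infixl 7 `rmult`
infixl 6 `radd`

rmult :: Rat -> Rat -> Rat
rmult (n_1, d_1) (n_2, d_2) = (n_1 * n_2, d_1 * d_2)

radd :: Rat -> Rat -> Rat
radd (n_1, d_1) (n_2, d_2) = (n_1 * d_2 + n_2 * d_1, d_1 * d_2)

-- Parsed as (2,1) `radd` ((2,1) `rmult` (3,2)): 2 + 2*(3/2) = 5
withFixity :: Rat
withFixity = (2,1) `radd` (2,1) `rmult` (3,2)

-- Explicit left-to-right grouping: (2 + 2)*(3/2) = 6
leftGrouped :: Rat
leftGrouped = ((2,1) `radd` (2,1)) `rmult` (3,2)
```

Here withFixity evaluates to (10,2) and leftGrouped to (12,2), so the two readings denote different rationals.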


Converting integers into rationals:

mkrat :: Int -> Int -> Rat

mkrat _ 0 = error "denominator 0"

mkrat n d = (n,d)

Check two rationals are equal:

infix 4 `requ`

requ :: Rat -> Rat -> Bool

requ (n_1,d_1) (n_2,d_2) = (n_1*d_2==n_2*d_1)

Inverse of a rational:

rinv :: Rat -> Rat

rinv (0,_) = error "no inverse"

rinv (n,d) = (d,n)

More functions


A module to define a data type Rat and the operations mkrat, radd, rmult, requ, and rinv will have the following layout:

module Rationals where

type Rat …

mkrat …

This module may also contain functions to subtract and divide rational numbers.

infixl 7 `rdiv`

infixl 6 `rdiff`

rdiv :: Rat -> Rat -> Rat

rdiv x y = x `rmult` (rinv y)

rdiff :: Rat -> Rat -> Rat

rdiff x y = x `radd` (mkrat (-1) 1) `rmult` y

Module Rationals
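For experimentation, the pieces shown so far can be collected into one file. The sketch below leaves out the module header so that it loads on its own; the example names diffExample, divExample and mixedExample are assumptions added for illustration.

```haskell
type Rat = (Int, Int)

infixl 7 `rmult`, `rdiv`
infixl 6 `radd`, `rdiff`
infix  4 `requ`

mkrat :: Int -> Int -> Rat
mkrat _ 0 = error "denominator 0"
mkrat n d = (n, d)

rmult, radd, rdiv, rdiff :: Rat -> Rat -> Rat
rmult (n_1, d_1) (n_2, d_2) = (n_1 * n_2, d_1 * d_2)
radd  (n_1, d_1) (n_2, d_2) = (n_1 * d_2 + n_2 * d_1, d_1 * d_2)
rdiv  x y = x `rmult` rinv y
-- rmult binds tighter than radd, so this is x + ((-1) * y).
rdiff x y = x `radd` mkrat (-1) 1 `rmult` y

requ :: Rat -> Rat -> Bool
requ (n_1, d_1) (n_2, d_2) = n_1 * d_2 == n_2 * d_1

rinv :: Rat -> Rat
rinv (0, _) = error "no inverse"
rinv (n, d) = (d, n)

diffExample, divExample, mixedExample :: Rat
diffExample  = (1,2) `rdiff` (1,3)                -- 1/2 - 1/3
divExample   = (1,2) `rdiv` (1,3)                 -- (1/2) / (1/3)
mixedExample = (1,2) `rdiff` (1,3) `rdiv` (1,2)   -- 1/2 - ((1/3) / (1/2))
```

The three examples evaluate to (1,6), (3,2) and (-1,6) respectively; in mixedExample the division is performed first because rdiv has the higher precedence.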


1. rmult and rdiv on the one hand and radd and rdiff on the other hand have the same priority level

2. all the operations radd, rdiff, rmult, rdiv are left associative

3. rdiv and rdiff are defined without referring to the specific representation of the type Rat

Our encoding of rational numbers is not an exact representation:

1. it contains improper elements: the pairs (n,0) do not correspond to any rational number, and some operations (radd, rmult, requ) do not check for them!

2. the representation is redundant: 1/3 has infinitely many representations, namely all the pairs (n, 3*n) with n ≠ 0!

Observations
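Both weaknesses of the encoding are easy to reproduce. In the sketch below radd and requ are the slide's definitions, while the names improper and samePair are assumptions introduced for the demonstration.

```haskell
type Rat = (Int, Int)

radd :: Rat -> Rat -> Rat
radd (n_1, d_1) (n_2, d_2) = (n_1 * d_2 + n_2 * d_1, d_1 * d_2)

requ :: Rat -> Rat -> Bool
requ (n_1, d_1) (n_2, d_2) = n_1 * d_2 == n_2 * d_1

-- Improper element: radd happily accepts a zero denominator
-- and returns another pair with a zero denominator.
improper :: Rat
improper = radd (1,0) (1,2)

-- Redundancy: structural equality on pairs distinguishes 1/3 from 2/6.
samePair :: Bool
samePair = (1,3) == ((2,6) :: Rat)
```

Here improper evaluates to (2,0), and samePair is False even though requ (1,3) (2,6) is True.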


To remove redundant representatives a function reduce may be used:

reduce :: Rat -> Rat

reduce (_,0) = error "denominator 0"

reduce (x,y) = (x `div` d, y `div` d)
  where d = gcd x y

gcd is a built-in function that computes the greatest common divisor for two integers.

The module Rationals contains

the definition of a data type Rat and

the operations mkrat, radd, rmult, requ, rinv, reduce

Remove redundancy
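One point worth noting is that reduce as given leaves the sign of the denominator alone, so (1,-2) and (-1,2) remain two different representatives of the same rational. The sketch below shows reduce next to a hypothetical variant, reduceSigned, which is not part of the lecture and which additionally forces the denominator to be positive.

```haskell
type Rat = (Int, Int)

-- The slide's definition: divide both components by their gcd.
reduce :: Rat -> Rat
reduce (_, 0) = error "denominator 0"
reduce (x, y) = (x `div` d, y `div` d)
  where d = gcd x y

-- Hypothetical variant: move the sign to the numerator as well,
-- so every rational has exactly one reduced representative.
reduceSigned :: Rat -> Rat
reduceSigned (_, 0) = error "denominator 0"
reduceSigned (x, y) = (signum y * x `div` d, abs y `div` d)
  where d = gcd x y   -- positive, because y is non-zero here
```

For example, reduce (2,6) gives (1,3) and reduce (1,-2) stays (1,-2), whereas reduceSigned (1,-2) gives (-1,2) and reduceSigned (-2,-6) gives (1,3).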


Problem: build an application that uses Rationals to compute linear combinations of rationals

Given k integers n_1, …, n_k and k rational numbers r_1, …, r_k, the sum n_1 * r_1 + … + n_k * r_k is called a linear combination.

Ex: 2*(1/2) + 3*(2/5)

Solution:

module Application where

import Rationals

-- Haskell definition for functions

-- providing linear combinations of rationals

Appl: Linear combination of rationals


linComb :: [(Int,Rat)] -> Rat

linComb = foldl raddIntRat (mkrat 0 1)

where raddIntRat, given below, adds a rational and the product of an integer with a rational number:

raddIntRat :: Rat -> (Int,Rat) -> Rat

raddIntRat x (n,y) = radd x (rmult (mkrat n 1) y)

For example, the following linear combination, 1*1/2 + 1*1/2 = 1, may be computed as

linComb [(1,(1,2)),(1,(1,2))] ⇒ (1,1) ??

Module Application imports Rationals and with it all the definitions made in that module. The representation details of Rat are therefore visible in Application: a pair of type (Int,Int) may be used wherever a Rat is expected. The solution is to treat Rat as an abstract data type.

Linear combination functions
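To see what linComb actually returns for the examples, the definitions can be run in a single file, as in the sketch below. The module headers are left out so that it loads on its own, and the names halves and slideExample are assumptions added here.

```haskell
type Rat = (Int, Int)

mkrat :: Int -> Int -> Rat
mkrat _ 0 = error "denominator 0"
mkrat n d = (n, d)

rmult, radd :: Rat -> Rat -> Rat
rmult (n_1, d_1) (n_2, d_2) = (n_1 * n_2, d_1 * d_2)
radd  (n_1, d_1) (n_2, d_2) = (n_1 * d_2 + n_2 * d_1, d_1 * d_2)

reduce :: Rat -> Rat
reduce (_, 0) = error "denominator 0"
reduce (x, y) = (x `div` d, y `div` d)
  where d = gcd x y

-- Add n*y to the running total x.
raddIntRat :: Rat -> (Int, Rat) -> Rat
raddIntRat x (n, y) = radd x (rmult (mkrat n 1) y)

linComb :: [(Int, Rat)] -> Rat
linComb = foldl raddIntRat (mkrat 0 1)

-- 1*(1/2) + 1*(1/2)
halves :: Rat
halves = linComb [(1,(1,2)), (1,(1,2))]

-- 2*(1/2) + 3*(2/5)
slideExample :: Rat
slideExample = linComb [(2,(1,2)), (3,(2,5))]
```

Without reduction, halves evaluates to (4,4) rather than (1,1); applying reduce gives (1,1). Similarly slideExample evaluates to (22,10), and reduce slideExample gives (11,5). Note also that the arguments are written as bare pairs, which only type-checks because the representation of Rat is visible.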


The Haskell module system allows definitions of data types and functions to be visible or hidden when a module is imported.

A module layout is split into two parts:

• a visible part that is exported and which gives all the definitions that may be used outside of the module

• a hidden part that implements the types and the functions exported plus some other objects which are not visible

For example in the case of Rationals we may decide to export from it the data type Rat and the operations radd, rdiff, rmult, rdiv, requ, and mkrat.

Haskell modules


module Rationals

(Rat, -- data type

radd, -- Rat -> Rat -> Rat

rdiff, -- Rat -> Rat -> Rat

rmult, -- Rat -> Rat -> Rat

rdiv, -- Rat -> Rat -> Rat

requ, -- Rat -> Rat -> Bool

mkrat -- Int -> Int -> Rat

) where

The data type Rat is called an Abstract Data Type. The functions rinv and reduce have not been exported and consequently cannot be used outside of Rationals.

If we now try to use rinv in the module Application, we get the error message

ERROR - Undefined variable "rinv"

Rationals as an ADT


Using abstract data types, any application may be split into a visible part (signature or interface) and a hidden part (implementation).

The implementation can then be changed without affecting the user.

For example, Rat may be represented as an algebraic type:

data Rat = ConR Int Int

or as a real type:

type Rat = Float

If we use the implementation of Rat based on the algebraic data type, and the module Application defines the function linComb as

linComb :: [(Int,Rat)] -> Rat
linComb = foldl raddIntRat (ConR 0 1)

then we get the error: Undefined constructor function "ConR"!! (Look back at the export list of Rationals: ConR is not exported.)

Changing the implementation
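A version of linComb that keeps working when the representation changes only goes through the exported operations. The sketch below illustrates this with the algebraic representation in a single file; the split into two modules is indicated in comments only, and the reduced set of operations shown is an assumption made to keep the example short.

```haskell
-- In a real project the first part would live in its own file, with a header
-- along the lines of
--   module Rationals (Rat, mkrat, radd, rmult, requ) where
-- so that the type Rat is exported but its constructor ConR is not.

data Rat = ConR Int Int

-- The only way for clients to build a Rat.
mkrat :: Int -> Int -> Rat
mkrat _ 0 = error "denominator 0"
mkrat n d = ConR n d

rmult, radd :: Rat -> Rat -> Rat
rmult (ConR n_1 d_1) (ConR n_2 d_2) = ConR (n_1 * n_2) (d_1 * d_2)
radd  (ConR n_1 d_1) (ConR n_2 d_2) = ConR (n_1 * d_2 + n_2 * d_1) (d_1 * d_2)

requ :: Rat -> Rat -> Bool
requ (ConR n_1 d_1) (ConR n_2 d_2) = n_1 * d_2 == n_2 * d_1

-- Client code (module Application): uses mkrat, never ConR or pairs.
raddIntRat :: Rat -> (Int, Rat) -> Rat
raddIntRat x (n, y) = radd x (rmult (mkrat n 1) y)

linComb :: [(Int, Rat)] -> Rat
linComb = foldl raddIntRat (mkrat 0 1)
```

With this client code, requ (linComb [(1, mkrat 1 2), (1, mkrat 1 2)]) (mkrat 1 1) evaluates to True, and nothing in linComb or raddIntRat would need to change if Rat were switched to another representation behind the same interface.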


Lazy evaluation

• Infinite computation

• List comprehension

• Regular expressions

Abstract data types

• Divide the system into visible and hidden parts

• Use Haskell modules

Conclusions


Available on Monday from Mole

Lexical analyser, parser, evaluation functions at

http://www.dcs.shef.ac.uk/~marian/

Project
