An Algebraic Theory of Polymorphic Temporal Media
Paul Hudak
Yale University
Department of Computer Science
PADL Symposium
June 18, 2004
Motivation
• Previous work on:
  – Haskore: a library for computer music composition.
  – Fran: a language for functional reactive animation.
  – Dance: a language for humanoid robots.
  has revealed striking similarities at the highest level of expression.
• In particular, notions of:
  – Sequential composition
  – Parallel composition
  – Absence of value
  – Temporal properties (duration, etc.)
  – Map- and fold-like operations (scaling, transposing, etc.)
• Questions:
  – Can these notions be captured in a single unified framework?
  – How do we give meaning to these structures?
  – How do we manipulate and reason about them?
Outline
• Polymorphic media
• Syntactic (structural) operations and properties (map, fold, etc.)
• Temporal operations and properties (duration, take, drop, etc.)
• Semantic operations and properties (sequential and parallel composition)
• Axiomatic semantics (with soundness and completeness results)
Haskell code is used throughout, with running examples from music and animation.
The Nature of this Talk
• Everything is fairly simple.
• But it uses lots of ideas from PL research.
• Therefore it’s good pedagogy.
• Not entirely sure if it’s practical…but hopefully it’s fun!
Haskell Types and Classes
• Polymorphic data type (new type):
    data List a = Nil | Cons a (List a)
• Type synonym (new name for existing type):
    type IntList = List Int
• Type class:
    class Eq a where
      (==) :: a -> a -> Bool
• Type class instance with “context”:
    instance Eq a => Eq (List a) where
      Nil       == Nil       = True
      Cons x xs == Cons y ys = x==y && xs==ys
      _         == _         = False
• Class “laws”:
    a==a
    a==b = b==a
    a==b && b==c ⇒ a==c
Polymorphic Media
• Define an algebraic data type:
    data Media a = Prim a               -- base media value
                 | Media a :+: Media a  -- sequential composition
                 | Media a :=: Media a  -- parallel composition
  (later we will define a way to express the absence of media)
• We refer to T in Media T as the base media type.
• So:
  – Prim x is a media value from the base media type.
  – m1 :+: m2 is media value m1 followed in time by m2.
  – m1 :=: m2 is media value m1 occurring simultaneously with m2.
Example 1: Music
• For music media, Note is the base media type:
    type Music = Media Note
    data Note = Rest Dur | Note Pitch Dur
    type Dur = Real
    type Pitch = (NoteName, Octave)
    type Octave = Int
    data NoteName = Cf | C | Cs | Df | D | Ds | Ef | E | Es | Ff | F | Fs
                  | Gf | G | Gs | Af | A | As | Bf | B | Bs
• For example:
    let dMinor = Note (D,3) 1 :=: Note (F,3) 1 :=: Note (A,3) 1
        gMajor = Note (G,3) 1 :=: Note (B,3) 1 :=: Note (D,4) 1
        cMajor = Note (C,3) 2 :=: Note (E,3) 2 :=: Note (G,3) 2
    in dMinor :+: gMajor :+: cMajor
  is a ii-V-I progression in C major.
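The ii-V-I example can be tried out with a small self-contained sketch. Some details here are assumptions rather than part of the slides: the fixity declaration, the use of Rational for Dur (the slides write Real), the helper note (which wraps a Note in Prim, a step the slide's example leaves implicit), and the counting function countPrims.

```haskell
module Main where

-- Assumed fixity; the slides do not state one.
infixr 5 :+:, :=:

data Media a
  = Prim a                 -- base media value
  | Media a :+: Media a    -- sequential composition
  | Media a :=: Media a    -- parallel composition
  deriving (Eq, Show)

type Dur    = Rational     -- assumption: stands in for the slides' Real
type Octave = Int
data NoteName = Cf | C | Cs | Df | D | Ds | Ef | E | Es | Ff | F | Fs
              | Gf | G | Gs | Af | A | As | Bf | B | Bs
  deriving (Eq, Show)
type Pitch = (NoteName, Octave)
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)
type Music = Media Note

-- Hypothetical helper: lift a single note into Music.
note :: Pitch -> Dur -> Music
note p d = Prim (Note p d)

-- The ii-V-I progression in C major from the slide.
progression :: Music
progression = dMinor :+: gMajor :+: cMajor
  where
    dMinor = note (D,3) 1 :=: note (F,3) 1 :=: note (A,3) 1
    gMajor = note (G,3) 1 :=: note (B,3) 1 :=: note (D,4) 1
    cMajor = note (C,3) 2 :=: note (E,3) 2 :=: note (G,3) 2

-- Hypothetical helper: count the primitive values in a Media term.
countPrims :: Media a -> Int
countPrims (Prim _)    = 1
countPrims (m1 :+: m2) = countPrims m1 + countPrims m2
countPrims (m1 :=: m2) = countPrims m1 + countPrims m2

main :: IO ()
main = print (countPrims progression)  -- three chords of three notes each
```

Because Music is just Media Note, any function written once over Media a (such as countPrims) applies unchanged to animations or any other base media type.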
In Contrast: Haskore
type Pitch = (PitchClass, Octave)
data PitchClass = Cf | C | Cs | Df | D | Ds | Ef | E | Es | Ff | F | Fs
                | Gf | G | Gs | Af | A | As | Bf | B | Bs
type Octave = Int

data Music = Note Pitch Dur [NoteAttribute]   -- a note \ atomic
           | Rest Dur                         -- a rest / objects
           | Music :+: Music                  -- sequential composition
           | Music :=: Music                  -- parallel composition
           | Tempo (Ratio Int) Music          -- tempo scaling
           | Trans Int Music                  -- transposition
           | Instr IName Music                -- instrument label
           | Player PName Music               -- player label
           | Phrase [PhraseAttribute] Music   -- phrasing attributes

type Dur = Ratio Int   -- in whole notes
type IName = String
type PName = String
data NoteAttribute = Volume Float -- convention: 0=min, 100=max | Fingering Int | Dynamics String | PFields [Float]
data PhraseAttribute = Dyn Dynamic | Art Articulation | Orn Ornament
data Dynamic = Accent Float | Crescendo Float | Diminuendo Float | PPP | PP | P | MP | SF | MF | NF | FF | FFF | Loudness Float | Ritardando Float | Accelerando Float
data Articulation = Staccato Float | Legato Float | Slurred Float | Tenuto | Marcato | Pedal | Fermata | FermataDown | Breath | DownBow | UpBow | Harmonic | Pizzicato | LeftPizz | BartokPizz | Swell | Wedge | Thumb | Stopped
data Ornament = Trill | Mordent | InvMordent | DoubleMordent | Turn | TrilledTurn | ShortTrill | Arpeggio | ArpeggioUp | ArpeggioDown | Instruction String | Head NoteHead
data NoteHead = DiamondHead | SquareHead | XHead | TriangleHead | TremoloHead | SlashHead | ArtHarmonic | NoHead
Example 2: Animation
• For animation media, Anim is the base media type:
    type Animation = Media Anim
    type Anim = (Dur, Time -> Picture)
    type Time = Real
    type Dur = Real
    data Picture = EmptyPic
                 | Circle Radius Point
                 | Square Length Point
                 | Polygon [Point] Point
• For example:
    let ball1 = (10, \t -> Circle t origin)
        ball2 = (10, \t -> Circle (10-t) origin)
        box   = (20, const (Square 1 (1,1)))
    in (ball1 :+: ball2) :=: box
  is a ball that first grows for 10 seconds and then shrinks, next to a stationary box.
Syntactic Operations
• Syntactic operations depend only on the “syntax”, or “structure”, of polymorphic Media values.
• For example:
    instance Functor Media where
      fmap f (Prim n)    = Prim (f n)
      fmap f (m1 :+: m2) = fmap f m1 :+: fmap f m2
      fmap f (m1 :=: m2) = fmap f m1 :=: fmap f m2
• This instance obeys the standard laws of the Functor class; namely:
    fmap (f . g) = fmap f . fmap g
    fmap id      = id
Example
• A function to scale the tempo of a Music value:
    tempo :: Dur -> Music -> Music
    tempo r = fmap temp
      where temp (Rest d)   = Rest (r*d)
            temp (Note p d) = Note p (r*d)
• A function to transpose a Music value by a given interval:
    trans :: Int -> Music -> Music
    trans i = fmap tran
      where tran (Rest d)   = Rest d
            tran (Note p d) = Note (transPitch i p) d
• Using Functor class laws, it is straightforward to show that:
    tempo r1 . tempo r2 = tempo (r1*r2)
    trans i1 . trans i2 = trans (i1+i2)
    tempo r1 . tempo r2 = tempo r2 . tempo r1
    trans i1 . trans i2 = trans i2 . trans i1
    tempo r1 . trans i1 = trans i1 . tempo r1
• Similarly, we can define functions to scale an animation in size, or translate it in 2D space.
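The tempo and trans laws can be spot-checked on a concrete value. The sketch below is a simplification, not the slides' exact setup: Pitch is assumed to be an absolute Int (so that transPitch is plain addition), Dur is assumed to be Rational, and the fixity declaration and the name lawsHold are illustrative choices.

```haskell
module Main where

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

instance Functor Media where
  fmap f (Prim n)    = Prim (f n)
  fmap f (m1 :+: m2) = fmap f m1 :+: fmap f m2
  fmap f (m1 :=: m2) = fmap f m1 :=: fmap f m2

type Dur   = Rational   -- assumption: stands in for the slides' Real
type Pitch = Int        -- assumption: absolute pitch instead of (NoteName, Octave)
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)
type Music = Media Note

-- Hypothetical definition; the slides use transPitch without defining it.
transPitch :: Int -> Pitch -> Pitch
transPitch i p = p + i

tempo :: Dur -> Music -> Music
tempo r = fmap temp
  where temp (Rest d)   = Rest (r * d)
        temp (Note p d) = Note p (r * d)

trans :: Int -> Music -> Music
trans i = fmap tran
  where tran (Rest d)   = Rest d
        tran (Note p d) = Note (transPitch i p) d

example :: Music
example = (Prim (Note 60 1) :+: Prim (Rest 1)) :=: Prim (Note 64 2)

-- Each entry checks one of the laws on the example value only;
-- this is a test, not a proof.
lawsHold :: Bool
lawsHold = and
  [ (tempo 2 . tempo 3) example == tempo 6 example
  , (trans 2 . trans 5) example == trans 7 example
  , (tempo 2 . tempo 3) example == (tempo 3 . tempo 2) example
  , (tempo 2 . trans 5) example == (trans 5 . tempo 2) example
  ]

main :: IO ()
main = print lawsHold
```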
Catamorphism
• We can also define a fold-like function:
    foldM :: (a->b) -> (b->b->b) -> (b->b->b) -> Media a -> b
    foldM f g h (Prim x)    = f x
    foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
    foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2
• For which the following laws hold:
    foldM (Prim . f) (:+:) (:=:) = fmap f
    foldM Prim (:+:) (:=:)       = id
• As well as this fusion law:
    k . foldM f g h = foldM f' g' h'
  if the following equalities hold:
    f' x            = k (f x)
    g' (k x) (k y)  = k (g x y)
    h' (k x) (k y)  = k (h x y)
• Several examples of catamorphisms are forthcoming.
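As a quick illustration, the two fold laws and one instance of the fusion law can be checked on a small value. The function prims, the choice k = length for the fusion example, and the fixity declaration are assumptions made for this sketch.

```haskell
module Main where

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

instance Functor Media where
  fmap f (Prim n)    = Prim (f n)
  fmap f (m1 :+: m2) = fmap f m1 :+: fmap f m2
  fmap f (m1 :=: m2) = fmap f m1 :=: fmap f m2

foldM :: (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Media a -> b
foldM f _ _ (Prim x)    = f x
foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2

-- Hypothetical catamorphism: list the primitives, left to right.
prims :: Media a -> [a]
prims = foldM (\x -> [x]) (++) (++)

example :: Media Int
example = (Prim 1 :+: Prim 2) :=: Prim 3

lawsHold :: Bool
lawsHold = and
  [ foldM Prim (:+:) (:=:) example == example
  , foldM (Prim . (+ 1)) (:+:) (:=:) example == fmap (+ 1) example
    -- Fusion with k = length: f' = const 1, g' = h' = (+),
    -- since length (xs ++ ys) = length xs + length ys.
  , (length . prims) example == foldM (const 1) (+) (+) example
  ]

main :: IO ()
main = print lawsHold
```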
Reversing a Media Value
• We can reverse, in time, a Media value if we can reverse the base media type. We enforce this using type classes:
    class Reverse a where
      reverseM :: a -> a

    instance Reverse a => Reverse (Media a) where
      reverseM (Prim a)    = Prim (reverseM a)
      reverseM (m1 :+: m2) = reverseM m2 :+: reverseM m1
      reverseM (m1 :=: m2) = reverseM m1 :=: reverseM m2
• But note that reverseM can be defined more succinctly as a catamorphism:
    instance Reverse a => Reverse (Media a) where
      reverseM = foldM (Prim . reverseM) (flip (:+:)) (:=:)
Laws Involving reverseM
• Theorem: For finite m, if the following holds for reverseM :: T -> T, then it also holds for reverseM :: Media T -> Media T:
    reverseM (reverseM m) = m
• Theorem: For any f :: T -> T, if f . reverseM = reverseM . f, then:
    fmap f . reverseM = reverseM . fmap f
• Theorem: For all finite m :: Media T, functions g, h :: T -> T -> T, and f, f' :: T -> T such that f = f' . reverseM:
    foldM f g h m = foldM f' (flip g) h (reverseM m)
Inductionless Proof
Prove: reverseM (reverseM m) = m

Inductionless proof, using fusion law:
Let k = reverseM:
      (reverseM . reverseM) m
    = (k . foldM (Prim . k) (flip (:+:)) (:=:)) m
          -- fusion law
    = foldM Prim (:+:) (:=:) m
          -- fold law
    = m

Justification for use of fusion law:
      Prim x
          -- assumption
    = Prim (k (k x))
          -- fold k
    = k (Prim (k x))
          -- fold (.)
    = k ((Prim . k) x)

      (:+:) (k x) (k y)
          -- fold k
    = k (y :+: x)
          -- fold flip
    = k (flip (:+:) x y)

      (:=:) (k x) (k y)
          -- unfold k
    = k (x :=: y)
Example 1: Music
• We declare Note to be an instance of class Reverse:
    instance Reverse Note where
      reverseM = id
  (i.e. a note is the same played backwards or forwards)
• The constraints in the previous laws are thus satisfied.
• Furthermore, we have this corollary to the second law:
    reverseM . tempo r = tempo r . reverseM
    reverseM . trans i = trans i . reverseM
• And this corollary to the third:
    foldM f g h m = foldM f (flip g) h (reverseM m)
• [Note: The reverse of a musical passage is called its retrograde. E.g.: J.S. Bach's “Crab Canons” and Franz Joseph Haydn's Piano Sonata No. 26 in A Major (Menuetto al Rovescio). It is also a standard construction in modern twelve-tone music.]
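A minimal sketch of the retrograde, using the catamorphic definition of reverseM and the Note instance above. The absolute Int pitch, the Rational duration, the fixity declaration, and the names melody and retrograde are assumptions made for illustration.

```haskell
module Main where

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

foldM :: (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Media a -> b
foldM f _ _ (Prim x)    = f x
foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2

class Reverse a where
  reverseM :: a -> a

-- Reversal as a catamorphism: only sequential composition is flipped.
instance Reverse a => Reverse (Media a) where
  reverseM = foldM (Prim . reverseM) (flip (:+:)) (:=:)

type Dur   = Rational   -- assumption: stands in for the slides' Real
type Pitch = Int        -- assumption: absolute pitch
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)

-- A note is the same played backwards or forwards.
instance Reverse Note where
  reverseM = id

melody :: Media Note
melody = Prim (Note 60 1) :+: Prim (Note 62 1) :+: Prim (Rest 1)

retrograde :: Media Note
retrograde = reverseM melody

main :: IO ()
main = do
  print retrograde
  print (reverseM retrograde == melody)  -- involution on this example
```

Note that the retrograde regroups the term: the right-nested melody becomes a left-nested term, which is why the later semantic treatment (associativity of :+:) matters when comparing media values.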
Example 2: Animation
• We declare Anim to be an instance of class Reverse:
    instance Reverse Anim where
      reverseM (d, f) = (d, \t -> f (d-t))
• It is easy to show that:
    reverseM (reverseM (d, f)) = (d, f)
• Therefore the constraints are satisfied, and the laws hold for continuous animations.
• Furthermore, we have this corollary:
    reverseM . scale s d = scale s d . reverseM
Temporal Properties
• So far, all operations have been structural (even reverseM, which purportedly also reverses time).
• Let’s now look at temporal properties that depend directly on time, and in particular on the duration of a media value.
• Define:
    class Temporal a where
      dur  :: a -> Dur
      none :: Dur -> a

    instance Temporal a => Temporal (Media a) where
      dur  = foldM dur (+) max
      none = Prim . none
• Intuitively, dur m is the duration of m :: Media T, and none d :: Media T is an “empty” media value with duration d.
The Intended Semantics of (:=:)
• In Haskore, the arguments to (:=:) are left-aligned.
• In a recent paper, they are centered symmetrically.
• In the current treatment, they must have equal duration.
  (This results in no loss of generality.)
[Figure: three timing diagrams of m1 :=: m2, one for each alignment convention.]
A Definition and an Example
• Definition: A well-formed temporal media value m :: Media T is one for which each parallel composition m1 :=: m2 has the property that dur m1 = dur m2.
• Example 1: We declare Note to be Temporal:
    instance Temporal Note where
      dur (Rest d)   = d
      dur (Note p d) = d
      none d         = Rest d
  Thus dur (Note p1 d1 :+: Note p2 d2), for example, is d1+d2.
• Example 2: We declare Anim to be Temporal:
    instance Temporal Anim where
      dur (d, f) = d
      none d     = (d, const EmptyPic)
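The definition of well-formedness translates directly into a checker. The sketch below is an illustration under assumptions: wellFormed is a hypothetical name, Pitch is an absolute Int, Dur is Rational, and the fixity is assumed. It also shows how padding with none turns an ill-formed parallel composition into a well-formed one.

```haskell
module Main where

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

foldM :: (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Media a -> b
foldM f _ _ (Prim x)    = f x
foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2

type Dur   = Rational   -- assumption: stands in for the slides' Real
type Pitch = Int        -- assumption: absolute pitch
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)

class Temporal a where
  dur  :: a -> Dur
  none :: Dur -> a

instance Temporal a => Temporal (Media a) where
  dur  = foldM dur (+) max
  none = Prim . none

instance Temporal Note where
  dur (Rest d)   = d
  dur (Note _ d) = d
  none d         = Rest d

-- Hypothetical checker: every parallel composition has equal-duration arguments.
wellFormed :: Temporal a => Media a -> Bool
wellFormed (Prim _)    = True
wellFormed (m1 :+: m2) = wellFormed m1 && wellFormed m2
wellFormed (m1 :=: m2) = dur m1 == dur m2 && wellFormed m1 && wellFormed m2

short, long, illFormed, padded :: Media Note
short     = Prim (Note 60 1)
long      = Prim (Note 48 2)
illFormed = short :=: long              -- durations 1 and 2 differ
padded    = (short :+: none 1) :=: long -- pad the shorter side with silence

main :: IO ()
main = do
  print (wellFormed illFormed, wellFormed padded)
  print (dur padded)
```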
Take and Drop for Media
• Analogous to take and drop on lists, except indexed by time:
    class Take a where
      takeM :: Dur -> a -> a
      dropM :: Dur -> a -> a

    instance (Take a, Temporal a) => Take (Media a) where
      takeM d m | d <= 0  = none 0
      takeM d (Prim x)    = Prim (takeM d x)
      takeM d (m1 :+: m2) = let d1 = dur m1
                            in if d <= d1 then takeM d m1
                               else m1 :+: takeM (d-d1) m2
      takeM d (m1 :=: m2) = takeM d m1 :=: takeM d m2   -- note: well-formed media

      dropM d m = . . .
Laws for takeM and dropM
• The following laws are analogous to ones for lists: For all non-negative d1, d2 :: Dur, if the following laws hold for takeM, dropM :: Dur -> T -> T, then they also hold for takeM, dropM :: Dur -> Media T -> Media T:
    takeM d1 . takeM d2 = takeM (min d1 d2)
    dropM d1 . dropM d2 = dropM (d1+d2)
    takeM d1 . dropM d2 = dropM d2 . takeM (d1+d2)
    dropM d1 . takeM d2 = takeM (d2-d1) . dropM d1   -- if d2 >= d1
• But the following law does not hold:
  For all finite well-formed m :: Media a and non-negative d :: Dur <= dur m, if the following law holds for takeM, dropM :: Dur -> T -> T, then it also holds for takeM, dropM :: Dur -> Media T -> Media T:
    takeM d m :+: dropM d m = m
  (the constraint on the base media type cannot be satisfied)
Example
• We declare Note to be an instance of Take:
    instance Take Note where
      takeM d1 (Rest d2)   = Rest (min d1 d2)
      takeM d1 (Note p d2) = Note p (min d1 d2)

      dropM d1 (Rest d2)   = Rest (max 0 (d2-d1))
      dropM d1 (Note p d2) = Note p (max 0 (d2-d1))
• The constraints in the first four previous laws hold for this instance, and thus they hold for Music values.
• But note that Note p 1 :+: Note p 1 /= Note p 2, and thus the last law on the previous slide does not hold.
• An example using Animations can be constructed analogously.
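The failing law can be observed directly. In the sketch below, takeM on Media and the Note instance follow the slides; the dropM clauses for Media are NOT from the slides (which elide that definition) and are only one plausible reconstruction by analogy with takeM. Pitch as Int, Dur as Rational, the fixity, and the names whole, split and twoNotes are likewise assumptions.

```haskell
module Main where

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

foldM :: (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Media a -> b
foldM f _ _ (Prim x)    = f x
foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2

type Dur   = Rational   -- assumption
type Pitch = Int        -- assumption
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)

class Temporal a where
  dur  :: a -> Dur
  none :: Dur -> a

instance Temporal a => Temporal (Media a) where
  dur  = foldM dur (+) max
  none = Prim . none

instance Temporal Note where
  dur (Rest d)   = d
  dur (Note _ d) = d
  none d         = Rest d

class Take a where
  takeM :: Dur -> a -> a
  dropM :: Dur -> a -> a

instance (Take a, Temporal a) => Take (Media a) where
  takeM d _ | d <= 0  = none 0
  takeM d (Prim x)    = Prim (takeM d x)
  takeM d (m1 :+: m2) =
    let d1 = dur m1
    in if d <= d1 then takeM d m1 else m1 :+: takeM (d - d1) m2
  takeM d (m1 :=: m2) = takeM d m1 :=: takeM d m2
  -- Hypothetical dropM (elided in the slides), mirroring takeM.
  dropM d m | d <= 0  = m
  dropM d (Prim x)    = Prim (dropM d x)
  dropM d (m1 :+: m2) =
    let d1 = dur m1
    in if d < d1 then dropM d m1 :+: m2 else dropM (d - d1) m2
  dropM d (m1 :=: m2) = dropM d m1 :=: dropM d m2

instance Take Note where
  takeM d1 (Rest d2)   = Rest (min d1 d2)
  takeM d1 (Note p d2) = Note p (min d1 d2)
  dropM d1 (Rest d2)   = Rest (max 0 (d2 - d1))
  dropM d1 (Note p d2) = Note p (max 0 (d2 - d1))

whole, split, twoNotes :: Media Note
whole    = Prim (Note 60 2)
split    = takeM 1 whole :+: dropM 1 whole   -- two notes of duration 1
twoNotes = Prim (Note 60 2) :+: Prim (Note 62 2)

main :: IO ()
main = do
  print (split == whole)   -- structural equality fails
  print ((takeM 1 . takeM 3) twoNotes == takeM 1 twoNotes)
```

Splitting one note of duration 2 yields two notes of duration 1, which is a different term (and, musically, a re-articulated note), so the law fails at the base type.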
Semantics
• Consider these two expressions:
    m1 :+: (m2 :+: m3)
    (m1 :+: m2) :+: m3
• Intuition tells us that these represent the same media value; i.e. (:+:) should be associative. There are in fact several other examples of this.
• What we need is an interpretation of media values that somehow gives meaning to them.
• And we wish to do this in a polymorphic way.
The Meaning of Media
• We use type classes to structure meanings:
    class Combine b where
      concatM :: b -> b -> b
      merge   :: b -> b -> b
      zero    :: Dur -> b

    class Combine b => Meaning a b where
      meaning :: a -> b

    instance Meaning a b => Meaning (Media a) b where
      meaning = foldM meaning concatM merge
• Intuitively, an instance Meaning T1 T2 means that T1 can be given meaning in terms of T2.
Laws
• We require valid instances of Combine to obey these laws:
    b1 `concatM` (b2 `concatM` b3) = (b1 `concatM` b2) `concatM` b3
    b1 `merge` (b2 `merge` b3)     = (b1 `merge` b2) `merge` b3
    b1 `merge` b2                  = b2 `merge` b1
    zero 0 `concatM` b             = b
    b `concatM` zero 0             = b
    zero d1 `concatM` zero d2      = zero (d1+d2)
    zero d `merge` b               = b, if d = dur b
    (b1 `concatM` b2) `merge` (b3 `concatM` b4)
        = (b1 `merge` b3) `concatM` (b2 `merge` b4),
          if dur b1 = dur b3 and dur b2 = dur b4
• Laws for class Meaning:
    meaning . none = zero
    dur . meaning  = dur
Semantic Equivalence
• Definition: m1, m2 :: Media T are equivalent, written m1 === m2, if and only if meaning m1 = meaning m2.
• Example: We take the meaning of music to be a pair: the duration, and a sequence of events, where each event marks the start-time, pitch, and duration of a single note:
    data Event = Event Time Pitch Dur
    type Time = Ratio Int
    type Performance = (Dur, [Event])
• This corresponds well to low-level representations of music such as MIDI and Csound.
Example, cont’d
• Three instance declarations complete the meaning of music:
    instance Combine Performance where
      concatM (d1, evs1) (d2, evs2) = (d1 + d2, evs1 ++ map shift evs2)
        where shift (Event t p d) = Event (t+d1) p d
      merge (d1, evs1) (d2, evs2)   = (d1 `max` d2, sort (evs1 ++ evs2))
      zero d                        = (d, [ ])

    instance Temporal Performance where
      dur (d, _) = d
      none       = zero

    instance Meaning Note Performance where
      meaning (Rest d)   = (d, [ ])
      meaning (Note p d) = (d, [Event 0 p d])
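To see semantic equivalence at work without multi-parameter type classes, the sketch below specializes the construction to music. This is a deliberate simplification, not the slides' class-based formulation: concatP, mergeP, meaningNote, meaningMusic and equiv are hypothetical monomorphic names, Pitch is assumed to be an absolute Int, and Dur and Time are assumed to be Rational.

```haskell
module Main where

import Data.List (sort)

infixr 5 :+:, :=:   -- assumed fixity

data Media a = Prim a | Media a :+: Media a | Media a :=: Media a
  deriving (Eq, Show)

foldM :: (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Media a -> b
foldM f _ _ (Prim x)    = f x
foldM f g h (m1 :+: m2) = foldM f g h m1 `g` foldM f g h m2
foldM f g h (m1 :=: m2) = foldM f g h m1 `h` foldM f g h m2

type Dur   = Rational   -- assumption
type Time  = Rational   -- assumption
type Pitch = Int        -- assumption
data Note  = Rest Dur | Note Pitch Dur deriving (Eq, Show)
data Event = Event Time Pitch Dur deriving (Eq, Ord, Show)
type Performance = (Dur, [Event])

-- Sequential combination: shift the second performance by the first's duration.
concatP :: Performance -> Performance -> Performance
concatP (d1, evs1) (d2, evs2) = (d1 + d2, evs1 ++ map shift evs2)
  where shift (Event t p d) = Event (t + d1) p d

-- Parallel combination: interleave the events in sorted order.
mergeP :: Performance -> Performance -> Performance
mergeP (d1, evs1) (d2, evs2) = (d1 `max` d2, sort (evs1 ++ evs2))

meaningNote :: Note -> Performance
meaningNote (Rest d)   = (d, [])
meaningNote (Note p d) = (d, [Event 0 p d])

meaningMusic :: Media Note -> Performance
meaningMusic = foldM meaningNote concatP mergeP

-- m1 === m2, specialized to music.
equiv :: Media Note -> Media Note -> Bool
equiv m1 m2 = meaningMusic m1 == meaningMusic m2

a, b, c, d :: Media Note
a = Prim (Note 60 1)
b = Prim (Note 62 1)
c = Prim (Note 64 1)
d = Prim (Note 65 1)

checks :: [Bool]
checks =
  [ (a :+: (b :+: c)) `equiv` ((a :+: b) :+: c)                  -- associativity
  , (a :=: c) `equiv` (c :=: a)                                  -- commutativity
  , ((a :+: b) :=: (c :+: d)) `equiv` ((a :=: c) :+: (b :=: d))  -- serial/parallel
  , (a :+: (b :+: c)) /= ((a :+: b) :+: c)                       -- yet syntactically distinct
  ]

main :: IO ()
main = print checks
```

The last check is the point of the semantic approach: the two groupings are different terms, but they denote the same performance.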
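The Structure of Meaning
• Theorem: The following diagram commutes:
[Diagram: meaning maps <Media T,:+:,:=:> to <I,concatM,merge>; g maps <Media T,:+:,:=:> to <Media T/(===),:+:,:=:>; h and its inverse h-1 map between <Media T/(===),:+:,:=:> and <I,concatM,merge>.]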
An Axiomatic Semantics
• Define A to be the axiomatic semantics given by the following nine axioms:
  (1) associativity of (:+:)      m1 :+: (m2 :+: m3) === (m1 :+: m2) :+: m3
  (2) associativity of (:=:)      m1 :=: (m2 :=: m3) === (m1 :=: m2) :=: m3
  (3) commutativity of (:=:)      m1 :=: m2 === m2 :=: m1
  (4) left (sequential) zero      none 0 :+: m === m
  (5) right (sequential) zero     m :+: none 0 === m
  (6) left (parallel) zero        none d :=: m === m, if d = dur m
  (7) right (parallel) zero       m :=: none d === m, if d = dur m
  (8) additivity of none          none d1 :+: none d2 === none (d1+d2)
  (9) serial/parallel axiom       (m1 :+: m2) :=: (m3 :+: m4) === (m1 :=: m3) :+: (m2 :=: m4),
                                  if dur m1 = dur m3 and dur m2 = dur m4
  plus the reflexive, symmetric, and transitive axioms implied by (===) being an equivalence relation, and the substitution axioms implied by (===) being a congruence relation.
The Serial/Parallel Axiom
• Suppose dur m1 = dur m3 and dur m2 = dur m4.
• Then, intuitively, these two phrases should be equivalent:
    (m1 :+: m2) :=: (m3 :+: m4)
    (m1 :=: m3) :+: (m2 :=: m4)
• Or, graphically:
[Figure: two 2x2 grids with m1, m2 in the top row and m3, m4 in the bottom row, one grouped by rows and one grouped by columns, related by ===.]
• This axiom is critical to many proofs.
Example
• Theorem: For all finite x :: T and non-negative d :: Dur <= dur m, if
    takeM d x :+: dropM d x === x
  then for all finite well-formed m :: Media T,
    takeM d m :+: dropM d m === m
• Proof (partial): By structural induction.
  Base case: Trivially true from the assumption.
  Induction step:
      takeM d (m1 :=: m2) :+: dropM d (m1 :=: m2)
          -- unfold takeM and dropM
    = (takeM d m1 :=: takeM d m2) :+: (dropM d m1 :=: dropM d m2)
          -- serial/parallel axiom
    = (takeM d m1 :+: dropM d m1) :=: (takeM d m2 :+: dropM d m2)
          -- induction hypothesis
    = m1 :=: m2
Soundness
• We write “A |- m1 = m2” iff m1 === m2 is provable from the axioms in A.
• Theorem: The axiomatic semantics A is sound. That is, for all m1, m2 :: Media T:
    A |- m1 = m2  ⇒  m1 === m2
  Proof: By induction on the derivation, and validity of the axioms.
Completeness
• In what sense are the axioms complete? That is, if two media values are equivalent, can we always prove it from the axioms?
• The answer is “yes, if…”
• Definition: A well-formed media term m :: Media T is in normal form iff it is of the form:
    none d, d >= 0
  --- or ---
    (none d11 :+: Prim x1 :+: none d12) :=:
    (none d21 :+: Prim x2 :+: none d22) :=:
    . . .
    (none dn1 :+: Prim xn :+: none dn2), n >= 1,
  where for all (1 <= i <= n), di1 + di2 + dur xi = dur m,
  and for all (1 <= i < n), (di1, xi, di2) <= (d(i+1)1, x(i+1), d(i+1)2)
• We denote the set of media normal forms as MediaNF T.
Normalization
• Theorem: Any m :: Media T can be transformed into a media normal form using only the axioms of A.
• Proof: Define a normalization function:
    normalize :: Media a -> Media a
  and establish its validity using only the axioms of A.
Completeness, cont’d
• Theorem: The axiomatic semantics A is complete, that is, for all m1, m2 :: Media T:
    m1 === m2  ⇒  A |- m1 = m2
  if and only if the normal forms in MediaNF T are unique.
• The “if and only if” means that our design of the normal forms is rather special.
Example
• Elements of MusicNF = MediaNF Note are unique.
• To see why, note that each normal form m:
    (none d11 :+: Prim x1 :+: none d12) :=:
    (none d21 :+: Prim x2 :+: none d22) :=:
    . . .
    (none dn1 :+: Prim xn :+: none dn2)
  corresponds uniquely to an interpretation:
    (dur m, [ Event d11 p1 (dur x1),
              Event d21 p2 (dur x2),
              . . .
              Event dn1 pn (dur xn) ])
• This correspondence is invertible, and therefore a bijection, because each di2 is computable from the other durations; i.e. di2 = dur m - di1 - dur xi.
Example
• Elements of AnimationNF = MediaNF Anim are not unique.
• There are two problems:
1. There are more equivalences:
    ball :=: ball === ball
    takeM d (Prim x) :+: dropM d (Prim x) === Prim x
Both of these imply more equivalences than the axioms alone can establish.
Solution: add “domain-specific” axioms to regain completeness.
2. We have assumed commutativity of (:=:), but this is unlikely to be true for most graphics/animation systems.
Solution: devise a non-commutative semantics.
Final Thoughts
• Details of domain-specific axioms.
• Details of non-commutative semantics.
• What about infinite media values? Which laws still hold?
• Can we make a unified programming language for multimedia?
• Other concrete domains (e.g. robot control language).
• What about reactivity? (see Yampa for a start)
• haskell.org/yale
• haskell.org/yampa
Algebraic Structure
• An algebra <S,op1,op2,...> consists of a non-empty carrier set (or sort) S together with one or more n-ary operations op1, op2, ..., on that set.
• The algebra of well-formed temporal media over type T is denoted by <Media T,:+:,:=:>
• The algebra of interpretations in terms of type I is denoted by <I,concatM,merge>
• Theorem: The semantic function meaning is a homomorphism from <Media T,:+:,:=:> to <I,concatM,merge,zero>.
Algebraic Structure, cont’d
• Theorem: (===) is a congruence relation on <Media,:+:,:=:>.
• Definition: Let [[m]] denote the equivalence class (induced by (===)) that contains m. Let Media/(===) denote the quotient set of such equivalence classes, and let <Media/(===),:+:,:=:> denote the quotient algebra (also called the initial algebra). The function:
    g :: Media -> Media/(===)
    g m = [[m]]
  is called the natural homomorphism from <Media,:+:,:=:> to <Media/(===),:+:,:=:>. Also define:
    h :: Media/(===) -> I
    h [[m]] = meaning m
  which is an isomorphism, whose inverse is:
    h-1 p = [[m]], if p = meaning m
Normalization
• Theorem: Any m :: Media T can be transformed into a media normal form using only the axioms of A.
• Proof: Define this normalization function:
    normalize :: (Ord (Media a), Temporal a) => Media a -> Media a
    normalize m = sortM (norm (dur m) 0 m)

    norm :: (Ord (Media a), Temporal a) => Dur -> Time -> Media a -> Media a
    norm d t m | isNone m = m
    norm d t (Prim x)     = none t :+: Prim x :+: none (d-t-dur x)
    norm d t (m1 :+: m2)  = norm d t m1 :=: norm d (t+dur m1) m2
    norm d t (m1 :=: m2)  = norm d t m1 :=: norm d t m2
  and establish its validity using only the axioms of A. In particular:
  Lemma: norm (dur m) 0 m === m
  or more generally:
    norm d t m === none t :+: m :+: none (d-t-dur m)
Completeness, cont’d
• Theorem: The axiomatic semantics A is complete, that is, for all m1, m2 :: Media T:
    m1 === m2  ⇒  A |- m1 = m2
  if and only if the normal forms in MediaNF T are unique.
• Proof (reverse direction):
  – If m1 === m2, then p = meaning m1 = meaning m2.
  – Let n1 = normalize m1 and n2 = normalize m2. Then: A |- n1 = m1 and A |- n2 = m2.
  – Thus: meaning n1 = meaning m1 = p = meaning m2 = meaning n2.
  – But there is an isomorphism between Media/(===) and I. Therefore p corresponds uniquely to a normal form h-1 p.
  – This implies that n1 = h-1 p = n2, and thus A |- m1 = m2.
• Proof in forward direction is by contradiction.