programming language theory notes

Upload: lee-gao

Post on 14-Apr-2018


  • 7/30/2019 Programming Language Theory Notes

    1/61

    CS 6110: Ocaml Tutorial

    Lee Gao

    January 25, 2013

    1. Binding

        let sixtyOneTen = 6110

    2. Tuples and Datatypes

        let t1 : int * int = (3, 5)
        let t2 : string * bool * char = ("moo", true, 'q')
        let t3 : unit = ()

        type suit = Spades | Diamonds | ...

    3. Pattern Matching

        let t1 : int * int = ...

        let (a, b) = t1

        let (_, b) = t1

        let (a, 5) = t1   (* Can match constants too; incomplete *)

    4. More pattern matching, with custom types

        let suitname : string =
          match c with
            Spades -> "spades"
          | Diamonds -> "diamonds"
          | ...

    If you don't exhaustively specify all possibilities, OCaml will warn you.

        let b : bool = ...
        match b with
          true -> ...
        | false -> ...

    5. Bad pattern, shadowing


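        let pair = (42, 611)
        let x = 611
        match pair with
          (x, 611) -> 0
        | (42, x) -> 1
        | _ -> 2

    Shadows: the x in each pattern is a new binding that shadows the outer x; it is not compared against 611. The first case therefore matches, and the result is 0.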

    6. Functions

        let square : int -> int =
          fun (x : int) -> x * x

    which is equivalent to

        let square (x : int) : int = x * x

    7. Recursive functions must have a rec annotation, which tells the top level to bind the function itself into its own environment

        (* the notes write x!; it is named fact here because OCaml has no postfix ! *)
        let rec fact x =
          if x = 0 then 1
          else x * fact (x - 1)

    8. Tuple argument (using CHI is isomorphic)

        let f (a, b) : int =
          ...

        let f (p : int * int) : int =
          ...

    By CHI, these are equivalent to the preferred style

        let f (a : int) (b : int) : int =
          ...

    We call this currying, and say that this last f is the curried f.

    9. A minor tangent

    We have

    gcd : int * int -> int

    gcd : int -> (int -> int)

    Through currying and uncurrying, these types are equivalent in a precise sense: there exists an isomorphism between them that preserves behavior.
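The isomorphism mentioned above can be sketched directly in OCaml. The names `curry`, `uncurry`, `gcd_pair` and `gcd_curried` are illustrative choices, not part of the notes, and the body of `gcd_pair` (Euclid's algorithm) is an assumption used only to have something concrete to convert.

```ocaml
(* curry and uncurry witness the two directions of the isomorphism *)
let curry (f : 'a * 'b -> 'c) : 'a -> 'b -> 'c = fun a b -> f (a, b)
let uncurry (g : 'a -> 'b -> 'c) : 'a * 'b -> 'c = fun (a, b) -> g a b

(* gcd : int * int -> int, written with Euclid's algorithm (assumed body) *)
let rec gcd_pair ((a, b) : int * int) : int =
  if b = 0 then a else gcd_pair (b, a mod b)

(* gcd : int -> (int -> int), obtained by currying *)
let gcd_curried : int -> int -> int = curry gcd_pair
```

Going back with `uncurry gcd_curried` should give a function that agrees with `gcd_pair` on every pair, which is the sense in which behavior is preserved.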

    10. Polymorphism


        let id x = x

    This will have type $\forall \alpha.\ \alpha \to \alpha$, or in caml parlance

        'a -> 'a

    11. Polymorphic types:

        type 'a lst =
            Empty
          | Cons of ('a * 'a lst)   (* also recursive *)

    12. OCaml lists builtin

    [] is the empty list

    :: is the cons operator

    @ is the append operator

    [1;2;3] is the same as 1 :: 2 :: 3 :: []

    13. @ is not constant time (it is linear in the length of its left argument), so using it in a recursion like this one is a bad idea.

        let rec reverse (l : 'a list) : 'a list =
          match l with
            [] -> []
          | hd :: tl -> (reverse tl) @ [hd]

    14. Instead, we should write something more idiomatic like

        let reverse (l : 'a list) : 'a list =
          let rec reverse' (l : 'a list) (l' : 'a list) : 'a list =
            match l with
              [] -> l'
            | hd :: tl -> reverse' tl (hd :: l') in
          reverse' l []

    15. Load another script

        #use "<file>.ml"


    CS 6110: January 28

    Lee Gao

    January 28, 2013

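    Pure lambda calculus: $e ::= x \mid e_0\ e_1 \mid \lambda x.\, e$

    We also have a $\beta$ reduction such that

    $$(\lambda x.\, e_1)\ e_2 \to_\beta e_1[x \mapsto e_2]$$

    We also have an $\eta$ reduction, as long as $x$ is not free in $e$:

    $$\underbrace{\lambda x.\, e\ x}_{A} \to_\eta e$$

    And we say that $A$ and $e$ are extensionally equivalent. If we apply these equivalences, we have reductions and expansions. Reductions are like the steps of computation, while expansions go in reverse.

    Alpha equivalence:

    $$\lambda x.\, e =_\alpha \lambda x'.\, e[x'/x] \quad \text{if } x' \text{ is not free in } e$$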

    1 Lists

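    We have pair e1 e2, fst p, snd p.

    Specification:

    $$\mathit{first}\ (\mathit{pair}\ e_1\ e_2) \to e_1$$
    $$\mathit{second}\ (\mathit{pair}\ e_1\ e_2) \to e_2$$
    $$\mathit{pair}\ (\mathit{first}\ p)\ (\mathit{second}\ p) \to p$$

    We can have $\mathit{pair}\ a\ b = \lambda f.\, f\ a\ b$, so then

    $$\mathit{pair} \triangleq \lambda a.\, \lambda b.\, \lambda f.\, f\ a\ b$$

    So then first corresponds to true and second to false:

    $$\mathit{first} \triangleq \lambda p.\, p\ (\lambda a b.\, a) = \lambda p.\, p\ \mathit{true}$$
    $$\mathit{snd} \triangleq \lambda p.\, p\ \mathit{false}$$

    We can then build a linked list out of these pair/cons constructs.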
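The pair encoding above can be tried out in OCaml as a small sketch. The names `tru` and `fls` are illustrative stand-ins for the Church booleans true and false (OCaml reserves `true` and `false`).

```ocaml
(* pair a b = fun f -> f a b: a pair is a function awaiting a selector *)
let pair a b = fun f -> f a b

(* Church booleans used as selectors *)
let tru = fun a _ -> a
let fls = fun _ b -> b

(* first applies the pair to true, second applies it to false *)
let first p = p tru
let second p = p fls
```

One caveat under OCaml's typing: binding `let p = pair 1 "one"` and then calling both `first p` and `second p` is rejected, because the result type of `p` is only weakly polymorphic. Applying the projections to the pair expression directly, as in `first (pair 1 "one")`, avoids this.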

    2 Let construct

    In OCaml we have the construct let x = e1 in e2, which can be encoded as

    $$\text{let } x = e_1 \text{ in } e_2 \;\triangleq\; (\lambda x.\, e_2)\ e_1$$

    3 Recursion

    We can harness .In ocaml, we need to do

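        let rec f (n) = if n < 2 then 1
                        else n * f (n - 1)

    Define

    $$\mathit{Fact} \triangleq \lambda n.\ \mathit{If}\ (n < 2)\ 1\ (n * \mathit{Fact}\ (n - 1))$$

    But how do we solve for Fact? Let's define something called $F$:

    $$F = \lambda f.\, \lambda n.\ \mathit{If}\ (n < 2)\ 1\ (n * f\ (n - 1))$$

    Then $F\ \mathit{Fact} = \lambda n. \ldots (n * \mathit{Fact} \ldots)$, which is what we want. So $F\ \mathit{Fact} = \mathit{Fact}$; that is, Fact is a fixed point of $F$. To find a fixed point, iteratively refine your result at each point.

    We can compute the fixed point of any $F$ in $\lambda$. Let $X = (\lambda x.\, F\ (x\ x))\ (\lambda x.\, F\ (x\ x))$. By one $\beta$ reduction, we find that $X = F\ X$, meaning that $X$ is a fixed point of $F$.

    Now, define $Y = \lambda f.\, (\lambda x.\, f\ (x\ x))\ (\lambda x.\, f\ (x\ x))$, so that $Y\ F$ returns the $X$ for $F$, which is a fixed point of $F$. To define the factorial function, take $F$ as above. Then

    $$\mathit{Fact} \triangleq Y\ F$$

    And we can encode the let rec construct as

    $$\text{let rec } f\ n = \ldots \;\triangleq\; \text{let } f = Y\ (\lambda f.\, \ldots) \text{ in } \ldots$$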
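As a rough sketch of the fixed-point construction in OCaml: the self-application `x x` does not type-check in plain OCaml, so the code below assumes a wrapper type `'a self` (a hypothetical helper, not from the notes) to make it typeable, and it eta-expands the inner application so that the combinator terminates under call-by-value. This eta-expanded variant is usually called Z rather than Y.

```ocaml
(* A recursive wrapper type makes the self-application x x typeable. *)
type 'a self = Fold of ('a self -> 'a)
let unfold (Fold f) = f

(* Call-by-value fixed-point combinator:
   fix f = (fun x -> f (fun v -> x x v)) (fun x -> f (fun v -> x x v)) *)
let fix (f : ('a -> 'b) -> 'a -> 'b) : 'a -> 'b =
  let w = fun x -> f (fun v -> unfold x x v) in
  w (Fold w)

(* The functional F from the notes: it takes "the rest of the recursion". *)
let fact_step self n = if n < 2 then 1 else n * self (n - 1)

(* Fact = Y F *)
let fact = fix fact_step
```

Note that `fact_step` itself contains no recursion; all of it comes from `fix`.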


    4 Confluence

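    Theorem 1 (Confluence). Terms have a unique normal form, up to $\alpha$-equivalence, if that normal form exists. Or

    $$e \to^* e_1,\ e \to^* e_2 \implies \exists e'.\ e_1 \to^* e' \wedge e_2 \to^* e'$$

    Normal order reduction: reduces the leftmost, outermost redex at each step. Guaranteed to reach the normal form if one exists. For example, with $\Omega = (\lambda x.\, x\ x)\ (\lambda x.\, x\ x)$,

    $$(\lambda x y.\, y)\ \Omega \to I$$

    supposing that we reduce the outermost/leftmost redex.

    Applicative order: reduces the argument first, before applying the function; this is the natural order in which most programming languages work.

    $$(\lambda x y.\, y)\ \Omega \to (\lambda x y.\, y)\ \Omega \to \cdots$$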

    5 Operational Semantics

    How do we run programs on an abstract Turing machine? One idea is structural operational semantics (SOS, introduced by Plotkin et al.):

    5.1 Small-step Semantics

    We're going to have a binary relation on terms, $\to\ \subseteq \mathrm{expr} \times \mathrm{expr}$. We write $e_1 \to e_2$ to mean that $e_1$ goes to $e_2$ in one computational step. So to run a program $e_0$, we follow the transitive closure of $\to$ from $e_0$ up until a normal form (or nontermination). We can define this reduction inductively using inference rules.

    5.1.1 CBV

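    $$e ::= x \mid e_1\ e_2 \mid \lambda x.\, e \qquad v ::= \lambda x.\, e$$

    We want to have the reduction and congruence rules

    $$\dfrac{}{(\lambda x.\, e)\ v \to e[x \mapsto v]}\ \beta \qquad \dfrac{e_1 \to e_1'}{e_1\ e_2 \to e_1'\ e_2}\ \text{Left} \qquad \dfrac{e_2 \to e_2'}{v\ e_2 \to v\ e_2'}\ \text{Right}$$

    These rules give left-to-right evaluation and are deterministic, but will not find the normal form for expressions that apply something to a term equivalent to $\Omega$.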
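The three CBV rules can be turned into a one-step function, sketched below. The `term` type, `is_value`, `subst`, `step` and `eval` are illustrative names. The substitution here is deliberately naive (no renaming), which is only adequate under the assumption that the substituted value is closed.

```ocaml
type term =
  | Var of string
  | Lam of string * term
  | App of term * term

let is_value = function Lam _ -> true | _ -> false

(* Naive substitution e[x := v]; assumes v is closed, so no capture can occur. *)
let rec subst x v = function
  | Var y -> if y = x then v else Var y
  | Lam (y, body) -> if y = x then Lam (y, body) else Lam (y, subst x v body)
  | App (e1, e2) -> App (subst x v e1, subst x v e2)

(* One CBV step, or None when no rule applies (a value or a stuck term). *)
let rec step = function
  | App (Lam (x, body), v) when is_value v ->            (* beta  *)
      Some (subst x v body)
  | App (e1, e2) when not (is_value e1) ->               (* Left  *)
      Option.map (fun e1' -> App (e1', e2)) (step e1)
  | App (v, e2) ->                                       (* Right *)
      Option.map (fun e2' -> App (v, e2')) (step e2)
  | _ -> None

(* Iterate step until no rule applies; may loop on divergent terms. *)
let rec eval e = match step e with Some e' -> eval e' | None -> e
```

At most one case of `step` fires for any term, which mirrors the determinism claimed for the rules.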


    Alternatively, using the relation-building syntax

    $$\dfrac{(e_1, e_1') \in\ \to}{(v\ e_1,\ v\ e_1') \in\ \to}\ \text{Right-2}$$

    5.1.2 CBN: corresponds to the normal order of reduction

    We're going to have arguments that are evaluated lazily (we avoid evaluating the arguments until necessary). We have the same definition of what a normal term is, but we replace the rules with

    $$\dfrac{}{(\lambda x.\, e_1)\ e_2 \to e_1[x \mapsto e_2]}\ \beta \qquad \dfrac{e_1 \to e_1'}{e_1\ e_2 \to e_1'\ e_2}\ \text{Left}$$

    This strategy never diverges if your term has a normal form. Used in Haskell, Miranda, etc.


    CS 6110: January 30

    Lee Gao

    January 30, 2013

    1 Subtleties of Substitution

    Suppose we have

    $$(y\ (\lambda x.\, x\ y))[y \mapsto x] = (x\ (\lambda x.\, x\ x))$$

    This isn't what we want because the substituted $x$ is captured by the binder. So what we want is a method of capture-avoiding substitution. The same issue shows up in integral calculus:

    $$e\,\Big|_{x=a}^{x=b} = e[x \mapsto b] - e[x \mapsto a]$$

    A first definition of correct substitution came about in 1950. Humans are good at avoiding this problem, but programs aren't (they need an explicit alpha-renaming transformation on bad substitutions).

    1.1 Free Variables

    Let's define a function

    $$\mathit{Fv}(e) : \text{set of free variables of } e$$

    We want to define this inductively, so that for example $\mathit{Fv}(\lambda x.\, x\ y) = \{y\}$:

    $$\mathit{Fv}(x) = \{x\}$$
    $$\mathit{Fv}(e_1\ e_2) = \mathit{Fv}(e_1) \cup \mathit{Fv}(e_2)$$
    $$\mathit{Fv}(\lambda x.\, e) = \mathit{Fv}(e) \setminus \{x\}$$

    Fv doesn't expand out indefinitely because its definition relies only on subterms $e'$ of $e$: under the subterm ordering we have $e' < e$, so Fv recurses on strictly smaller terms, which guarantees termination. In fact, this structure has a name: structural induction. Alternatively, argue by induction on the height of $e$. (Fixed points again!)
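The inductive definition of Fv maps directly onto a structurally recursive function. This is a minimal sketch; the `term` type and the module alias `S` are illustrative names chosen here.

```ocaml
module S = Set.Make (String)

type term =
  | Var of string
  | Lam of string * term
  | App of term * term

(* One clause per syntactic form; every recursive call is on a proper subterm. *)
let rec fv = function
  | Var x -> S.singleton x
  | App (e1, e2) -> S.union (fv e1) (fv e2)
  | Lam (x, e) -> S.remove x (fv e)
```

Because each call recurses only on subterms, OCaml's evaluation of `fv` terminates for the same reason the definition is well founded.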


    1.2 Computing the substitution

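    Define $e\{e'/x\}$ by structural induction on $e$ (so case by case):

    $$x\{e'/x\} = e' \quad (\text{Var})$$
    $$y\{e'/x\} = y \quad (\text{Var-2},\ y \neq x)$$
    $$(e_1\ e_2)\{e'/x\} = e_1\{e'/x\}\ e_2\{e'/x\} \quad (\text{App})$$
    $$(\lambda y.\, e_0)\{e'/x\} \quad (\text{Lambda})$$

    For that last case, we have three subcases:

    1. $x = y$: then $x$ gets shadowed, so the result is $\lambda x.\, e_0$.

    2. $x \neq y \wedge y \notin \mathit{Fv}(e')$: the result is $\lambda y.\, e_0\{e'/x\}$.

    3. $x \neq y \wedge y \in \mathit{Fv}(e')$: here we should do some $\alpha$-renaming (change of variable name):

    $$(\lambda y.\, e_0)\{e'/x\} = (\lambda y'.\, e_0\{y'/y\})\{e'/x\} \quad \text{where } y' \text{ is fresh}$$

    Then, this becomes case 2.

    Here, "$y'$ is fresh" means that

    $$y' \neq x, \quad y' \notin \mathit{Fv}(e_0), \quad y' \notin \mathit{Fv}(e'), \quad \text{i.e. } y' \notin \mathit{Fv}(x\ e_0\ e')$$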
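The three lambda subcases can be sketched as follows. The helper `fresh`, which appends primes until the name is unused, is an assumption made for this example; any scheme producing a name outside $\mathit{Fv}(x\ e_0\ e')$ would do.

```ocaml
module S = Set.Make (String)

type term =
  | Var of string
  | Lam of string * term
  | App of term * term

let rec fv = function
  | Var x -> S.singleton x
  | App (e1, e2) -> S.union (fv e1) (fv e2)
  | Lam (x, e) -> S.remove x (fv e)

(* Append primes to y until the name is not in the avoid set. *)
let rec fresh avoid y = if S.mem y avoid then fresh avoid (y ^ "'") else y

(* subst e' x e computes e{e'/x}, avoiding capture. *)
let rec subst e' x e =
  match e with
  | Var y -> if y = x then e' else e
  | App (e1, e2) -> App (subst e' x e1, subst e' x e2)
  | Lam (y, body) ->
      if y = x then e                                   (* case 1: x is shadowed *)
      else if not (S.mem y (fv e')) then
        Lam (y, subst e' x body)                        (* case 2: no capture possible *)
      else
        (* case 3: rename the binder to a fresh y', then proceed as in case 2 *)
        let avoid = S.add x (S.union (fv e') (fv body)) in
        let y' = fresh avoid y in
        Lam (y', subst e' x (subst (Var y') y body))
```

On the problematic example from the start of this lecture, substituting `x` for `y` in `y (λx. x y)` renames the bound `x` to `x'` first, so the free `x` is no longer captured.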

    In CBV and CBN, the argument is ALWAYS closed (when the program is closed), so we typically don't get to a case where we need to worry about this kind of stuff.

    Why is this a valid induction? We need to prove

    Lemma 1 (Monotonicity). The height of a substitution $e\{e'/x\}$ is less than the height of $e$ plus the height of $e'$.

    2 Big-step Semantics ($\Downarrow$)

    In small-step, it's difficult to prove certain properties (but we need to establish equivalence between $\to$ and $\Downarrow$). Relation: $e \Downarrow v$, where $e$ is a metavariable for terms and $v$ is a metavariable for values, i.e. normal terms.

    $$\Downarrow\ \subseteq \mathrm{Term} \times \mathrm{Value}$$

    where $\Downarrow$ only represents the terminating evaluations.


    2.1 Call By Value

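    $$\dfrac{}{v \Downarrow v}\ \text{Val} \qquad \dfrac{e_1 \Downarrow \lambda x.\, e \quad e_2 \Downarrow v_2 \quad e\{v_2/x\} \Downarrow v}{e_1\ e_2 \Downarrow v}\ \text{App}$$

    It's easy to build an interpreter directly from this definition. In caml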

        type var = string
        type term = Lambda of var * term | App of term * term | Var of var

        (* subst e x v is assumed to compute e{v/x}, as defined in Section 1.2 *)
        let rec bstep (e : term) : term =
          match e with
          | Lambda (v, e) -> Lambda (v, e)              (* Val *)
          | App (e0, e1) ->                             (* App *)
              (match bstep e0 with
               | Lambda (x, e') ->
                   let v2 = bstep e1 in
                   bstep (subst e' x v2)
               | _ -> failwith "Cannot apply a non-function")
          | Var _ -> failwith "Cannot evaluate a variable"

    2.2 Call By Name

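    $$\dfrac{}{v \Downarrow v}\ \text{Val} \qquad \dfrac{e_0 \Downarrow \lambda x.\, e \quad e\{e_1/x\} \Downarrow v}{e_0\ e_1 \Downarrow v}\ \text{App}$$

    The substitution in App in CBN can be costly, since the unevaluated argument is copied to every use, but some languages memoize its value instead of copying this expression over and over again. Even in eager languages, conditionals and guards ensure that not all code is run, whereas in a fully eager language we would expect to evaluate everything.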

    Big vs Small

    Big:

    - more extensional
    - easier to prove properties about, since we have only one proof tree rather than a closure over the small-step semantics
    - easy to turn into an interpreter
    - fewer rules

    Small:

    - more compositional, so certain features such as nondeterminism or threads can easily be added after building our semantics
    - simpler rules


    2.3 Nondeterministic choice operator

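    We add a new term

    $$e_1 \mid e_2$$

    to mean either $e_1$ or $e_2$, chosen nondeterministically:

    $$\dfrac{e_1 \Downarrow v}{e_1 \mid e_2 \Downarrow v}\ \text{Left} \qquad \dfrac{e_2 \Downarrow v}{e_1 \mid e_2 \Downarrow v}\ \text{Right}$$

    Before, our semantics was syntax directed, but we no longer have a deterministic choice of which rule to apply. For example, if $e_1$ doesn't halt but $e_2$ does, our $\Downarrow$ must be able to figure that out. We call this angelic nondeterminism.

    In small step, we can reduce either side.

    $$\dfrac{e_1 \to e_1'}{e_1 \mid e_2 \to e_1' \mid e_2}\ \text{Left} \qquad \dfrac{e_2 \to e_2'}{e_1 \mid e_2 \to e_1 \mid e_2'}\ \text{Right}$$

    $$\dfrac{}{v \mid e \to v}\ \text{Choose-L} \qquad \dfrac{}{e \mid v \to v}\ \text{Choose-R}$$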


    CS 6110: February 1

    Lee Gao

    February 1, 2013

    1 IMP

    The language Imp has syntax

    $a ::= x \mid n \mid a_1 \circ a_2$

    b ::= True | False | b | b1 b2 | b1 b2 | b1 b2

    c ::= skip | x := a | c1; c2 | if b then c1 else c2 | while b do c

    One such algorithm

        n := 999;
        prime := 0;
        while prime=0 do (
          d := z; n := n + 1;
          prime := 1;
          ...
        )

    1.1 Small step semantics

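    We want to extend our evaluation state with a symbolic environment $\sigma$, so our state is the pair (or configuration) $\langle c, \sigma \rangle$, and our small-step relation will look like

    $$\langle c, \sigma \rangle \to \langle c', \sigma' \rangle$$

    That is, $\to\ \subseteq (\mathrm{com} \times \Sigma) \times (\mathrm{com} \times \Sigma)$.

    Wlog, we define the same for arithmetic and boolean expressions:

    $$\langle c, \sigma \rangle \to \langle c', \sigma' \rangle \qquad \langle a, \sigma \rangle \to a' \qquad \langle b, \sigma \rangle \to b'$$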


    We want the normal form to be skip, so

    $$\langle x := 2 + 3, \{x \mapsto 0\} \rangle \to \langle x := 5, \{x \mapsto 0\} \rangle \to \langle \text{skip}, \{x \mapsto 5\} \rangle$$

    The final configurations are those that do not step; the normal forms are those with skip.

    1.1.1 Arithmetic

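    $$\dfrac{n = \sigma(x)}{\langle x, \sigma \rangle \to n}\ \text{Var} \qquad \dfrac{\langle a_1, \sigma \rangle \to a_1'}{\langle a_1 \circ a_2, \sigma \rangle \to a_1' \circ a_2}\ \text{Circ}$$

    $$\dfrac{\langle a_2, \sigma \rangle \to a_2'}{\langle n \circ a_2, \sigma \rangle \to n \circ a_2'}\ \text{Circ-2} \qquad \dfrac{(n = n_1 + n_2)}{\langle n_1 \circ n_2, \sigma \rangle \to n}\ \text{Circ-Normal}$$

    Normal form: $n$. This is why we do not allow the rule

    $$\dfrac{}{\langle n, \sigma \rangle \to n}\ \text{Int}$$

    since then a numeral could always take another step and would never be a normal form.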

    1.1.2 Booleans

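    $$\dfrac{\langle b_1, \sigma \rangle \to b_1'}{\langle b_1 \wedge b_2, \sigma \rangle \to b_1' \wedge b_2}\ \text{Short-}\wedge$$

    $$\dfrac{}{\langle \text{False} \wedge b_2, \sigma \rangle \to \text{False}}\ \text{Short-}\wedge\text{-F} \qquad \dfrac{}{\langle \text{True} \wedge b_2, \sigma \rangle \to b_2}\ \text{Short-}\wedge\text{-T}$$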


    1.1.3 Commands

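    $$\dfrac{\langle a, \sigma \rangle \to a'}{\langle x := a, \sigma \rangle \to \langle x := a', \sigma \rangle}\ \text{Assgn} \qquad \dfrac{}{\langle x := n, \sigma \rangle \to \langle \text{skip}, \sigma[x \mapsto n] \rangle}\ \text{Assgn-Int}$$

    $$\dfrac{\langle c_1, \sigma \rangle \to \langle c_1', \sigma' \rangle}{\langle c_1; c_2, \sigma \rangle \to \langle c_1'; c_2, \sigma' \rangle}\ \text{Seq-Left} \qquad \dfrac{}{\langle \text{skip}; c_2, \sigma \rangle \to \langle c_2, \sigma \rangle}\ \text{Seq-Skip}$$

    $$\dfrac{\langle b, \sigma \rangle \to b'}{\langle \text{if } b \text{ then } c_1 \text{ else } c_2, \sigma \rangle \to \langle \text{if } b' \text{ then } c_1 \text{ else } c_2, \sigma \rangle}\ \text{If}$$

    $$\dfrac{}{\langle \text{while } b \text{ do } c, \sigma \rangle \to \langle \text{if } b \text{ then } (c; \text{while } b \text{ do } c) \text{ else skip}, \sigma \rangle}\ \text{While}$$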

    1.2 Big Step Semantics

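    We're going to have

    $$\langle a, \sigma \rangle \Downarrow n \qquad \langle b, \sigma \rangle \Downarrow t \qquad \langle c, \sigma \rangle \Downarrow \sigma'$$

    $$\dfrac{}{\langle \text{skip}, \sigma \rangle \Downarrow \sigma}\ \text{Skip} \qquad \dfrac{\langle a, \sigma \rangle \Downarrow n}{\langle x := a, \sigma \rangle \Downarrow \sigma[x \mapsto n]}\ \text{Assgn} \qquad \dfrac{\langle c_1, \sigma \rangle \Downarrow \sigma' \quad \langle c_2, \sigma' \rangle \Downarrow \sigma''}{\langle c_1; c_2, \sigma \rangle \Downarrow \sigma''}\ \text{Seq}$$

    $$\dfrac{\langle b, \sigma \rangle \Downarrow \text{True} \quad \langle c_1, \sigma \rangle \Downarrow \sigma'}{\langle \text{if } b \text{ then } c_1 \text{ else } c_2, \sigma \rangle \Downarrow \sigma'}\ \text{If-T} \qquad \dfrac{\langle b, \sigma \rangle \Downarrow \text{False} \quad \langle c_2, \sigma \rangle \Downarrow \sigma'}{\langle \text{if } b \text{ then } c_1 \text{ else } c_2, \sigma \rangle \Downarrow \sigma'}\ \text{If-F}$$

    $$\dfrac{\langle b, \sigma \rangle \Downarrow \text{True} \quad \langle c; \text{while } b \text{ do } c, \sigma \rangle \Downarrow \sigma'}{\langle \text{while } b \text{ do } c, \sigma \rangle \Downarrow \sigma'}\ \text{While-T} \qquad \dfrac{\langle b, \sigma \rangle \Downarrow \text{False}}{\langle \text{while } b \text{ do } c, \sigma \rangle \Downarrow \sigma}\ \text{While-F}$$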
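The big-step rules read almost directly as a recursive interpreter. The sketch below makes several assumptions that are not in the notes: the arithmetic operator is fixed to addition, the booleans include an equality test `Eq`, the state is a string-keyed map, and unbound variables read as 0.

```ocaml
module Env = Map.Make (String)

type aexp = Num of int | Var of string | Add of aexp * aexp
type bexp = True | False | Not of bexp | And of bexp * bexp | Eq of aexp * aexp
type com =
  | Skip
  | Assign of string * aexp
  | Seq of com * com
  | If of bexp * com * com
  | While of bexp * com

(* <a, s> evaluates to n; unbound variables default to 0 (an assumption) *)
let rec aeval s = function
  | Num n -> n
  | Var x -> (match Env.find_opt x s with Some n -> n | None -> 0)
  | Add (a1, a2) -> aeval s a1 + aeval s a2

(* <b, s> evaluates to t *)
let rec beval s = function
  | True -> true
  | False -> false
  | Not b -> not (beval s b)
  | And (b1, b2) -> beval s b1 && beval s b2
  | Eq (a1, a2) -> aeval s a1 = aeval s a2

(* <c, s> evaluates to s'; one match arm per big-step rule *)
let rec exec s = function
  | Skip -> s                                              (* Skip *)
  | Assign (x, a) -> Env.add x (aeval s a) s               (* Assgn *)
  | Seq (c1, c2) -> exec (exec s c1) c2                    (* Seq *)
  | If (b, c1, c2) -> if beval s b then exec s c1 else exec s c2
  | While (b, c) as w ->
      if beval s b then exec (exec s c) w                  (* While-T *)
      else s                                               (* While-F *)
```

A nonterminating command such as `While (True, Skip)` makes `exec` loop forever, matching the fact that it has no finite big-step derivation.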


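    Note that the unrolling rule is equivalent to

    $$\dfrac{\langle b, \sigma \rangle \Downarrow \text{True} \quad \langle c, \sigma \rangle \Downarrow \sigma' \quad \langle \text{while } b \text{ do } c, \sigma' \rangle \Downarrow \sigma''}{\langle \text{while } b \text{ do } c, \sigma \rangle \Downarrow \sigma''}\ \text{While-T}$$

    We can also establish command equivalence to the small step rule.

    Theorem 1 (Command Equivalence). $\langle c, \sigma \rangle \Downarrow \sigma' \iff \langle c, \sigma \rangle \to^* \langle \text{skip}, \sigma' \rangle$

    Suppose that while True do skip terminates, then

    $$\dfrac{\langle \text{True}, \sigma \rangle \Downarrow \text{True} \quad \langle \text{skip}; \text{while True do skip}, \sigma \rangle \Downarrow \sigma'}{\langle \text{while True do skip}, \sigma \rangle \Downarrow \sigma'}\ A$$

    So this essentially becomes

    $$\dfrac{\dfrac{\vdots}{\langle \text{while True do skip}, \sigma \rangle \Downarrow \sigma'}\ \text{While}}{\langle \text{while True do skip}, \sigma \rangle \Downarrow \sigma'}\ \text{While}$$

    This proof tree is not finite, hence this doesn't terminate.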


    CS 6110: Feb 4

    Lee Gao

    February 4, 2013

    1 Inference Rules

    When you define a rule, you're really defining a set of rule instances, obtained by consistent substitution of metavariables, that must also satisfy some side conditions. Now,

    3 2 5

    3 2 + 1 5 + 1

    is a valid (albeit incorrect) instance of the rule. But if we're at the axiom leaves, then the side condition must be obeyed as well. A logical instance is just

    $$\dfrac{x_1 \quad x_2 \quad x_3 \quad \ldots}{x}\ \text{Instance}$$

    so that $x, x_i \in S$. Let the relation or whatever be defined as $A$ (so that $\to\ = A$ in the case of the operational semantics); then $A \subseteq S$.

    Let's have a rule operator $R(B)$ (a notion of entailment): what do the things in $B$ entail? (AKA, what can we conclude if we're allowed to assume $B$?)

    $$R(B) = \left\{ x \;\middle|\; \dfrac{x_1 \ldots x_n}{x},\ \{x_1, \ldots, x_n\} \subseteq B \right\}$$

    So $R(\emptyset)$ is the set of axioms (rules that do not have premises). Note, if an axiom has a side condition but no premise, then it's still an axiom.

    Consistency: $A \subseteq R(A)$

    Closure: $R(A) \subseteq A$

    These two conditions say that $A = R(A)$. So $A$ is a fixed point of $R$. There could potentially be an infinite number of fixed points of $R$, so we need to pick the appropriate one.


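    A set of inference rules is an inductive definition. So $A$ is the set of elements derivable in a finite number of steps.

        steps   set
        0       $\emptyset$
        1       $R(\emptyset)$
        2       $R^2(\emptyset)$
        ...     ...

    So just get the fixed point, aka

    $$A = \bigcup_n R^n(\emptyset)$$

    Prove that this is a fixed point. $R$ is monotonic, that is, if $B \subseteq C$, then $R(B) \subseteq R(C)$. So

    $$\emptyset \subseteq R(\emptyset) \subseteq R^2(\emptyset) \subseteq \ldots$$

    so that $\bigcup_{i \le n} R^i(\emptyset) = R^n(\emptyset)$. So if some element shows up in these sets, then it stays. So

    $$x \in A \iff \exists n.\ x \in R^n(\emptyset)$$

    Theorem 1 (Closure). $R(A) \subseteq A$

    Proof. Let $x \in R(A)$ where

    $$R(A) = R\Big(\bigcup_n R^n(\emptyset)\Big)$$

    then it must be the case that $x$ is the conclusion of some rule

    $$\dfrac{x_1 \ldots x_n}{x}\ \text{Instance}$$

    where $x_1, \ldots, x_n \in A$ (implicit: finite $n$). Therefore, each of $x_1, \ldots, x_n$ occurs in some $R^k(\emptyset)$. Now, pick some finite $m$ such that all of $x_1, \ldots, x_n \in R^m(\emptyset)$. Then $x \in R^{m+1}(\emptyset) \subseteq A$; so $x \in A$, which concludes the proof.

    Theorem 2 (Consistency). $A \subseteq R(A)$

    Proof. Let $x \in A$, then there must exist some $n$ such that $x \in R^n(\emptyset)$. Then $x \in R^n(\emptyset) \implies x \in R(A)$, because by monotonicity

    $$R^{n-1}(\emptyset) \subseteq A \implies R^n(\emptyset) \subseteq R(A)$$

    which concludes the proof.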


    Therefore, A is a fixed point of R.
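The construction $A = \bigcup_n R^n(\emptyset)$ can be run directly when the rule set only generates finitely many elements. The sketch below uses a hypothetical rule operator `r` invented for illustration: one axiom deriving 0, and one rule deriving n + 2 from n as long as n + 2 ≤ 10.

```ocaml
module IS = Set.Make (Int)

(* Hypothetical rule operator R: the axiom "0" plus the rule "n => n + 2"
   with side condition n + 2 <= 10. R(B) is everything derivable in one
   step from premises in B. *)
let r (b : IS.t) : IS.t =
  IS.fold
    (fun n acc -> if n + 2 <= 10 then IS.add (n + 2) acc else acc)
    b
    (IS.singleton 0)

(* Iterate R from a starting set until nothing changes. Starting from the
   empty set, this computes the union of the chain R^n(empty). *)
let rec lfp r b =
  let b' = r b in
  if IS.equal b b' then b else lfp r b'

let a = lfp r IS.empty
```

Since `r` is monotone, the iterates form an increasing chain, and the loop stops exactly when it reaches a set with `r a = a`.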

    Which fixed point is it? It turns out that $A$ is the least possible (smallest) fixed point of $R$.

    Theorem 3 (Least FP). $A$ is the smallest fixed point of $R$

    Proof. Let $B$ be a fixed point of $R$, so $B = R(B)$. Obviously, $\emptyset \subseteq B$, so by monotonicity

    $$R(\emptyset) \subseteq R(B) = B, \quad R^2(\emptyset) \subseteq R(B) = B, \quad \ldots$$

    So therefore

    $$\bigcup_n R^n(\emptyset) \subseteq B$$

    which concludes the proof.

    So it's interesting to think about what the other fixed points do. The least fixed point is the intersection of all closed sets. The greatest fp is the union of all consistent sets.

    The greatest fp (max fp) allows infinite proofs. So for example, we can have the rule

    $$\dfrac{x}{x}\ ?$$

    so we can derive $x$ arbitrarily as

    $$\dfrac{\dfrac{\dfrac{\vdots}{x}\ ?}{x}\ ?}{x}\ ?$$

    so for the $\Downarrow$ relation, the LFP gives the terminating executions and the MFP gives that and also the nonterminating executions. MFP is also called coinduction.

    Do we really need to restrict rules to have a finite number of premises? Let's build rules for a set $I$ of "interesting" numbers:

    $$\dfrac{}{1 \in I} \qquad \dfrac{n \in I \quad n' = n + 1}{n' \in I}\ \text{Next} \qquad \dfrac{\forall m > n.\ m \in I}{n \in I}\ \text{Prev}$$

    So that

    $$R(\emptyset) = \{1\}, \quad R^2(\emptyset) = \{1, 2\}, \quad \ldots, \quad R^n(\emptyset) = \{1, \ldots, n\}$$


    so

    $$\bigcup_n R^n(\emptyset) = \{1, \ldots\} = \mathbb{N}$$

    but $0 \in R(\mathbb{N})$, so $A$ isn't a fixed point of $R$ anymore.

    Recall that we defined the free variables $\mathit{Fvs}(e)$ inductively as

    $$\mathit{Fvs}(x) = \{x\} \qquad \mathit{Fvs}(e_1\ e_2) = \mathit{Fvs}(e_1) \cup \mathit{Fvs}(e_2) \qquad \mathit{Fvs}(\lambda x.\, e) = \mathit{Fvs}(e) \setminus \{x\}$$

    then we can define the relation $e \Downarrow v$, meaning that $e$ has free variables $v = \{\ldots\}$, so that

    $$\dfrac{}{x \Downarrow \{x\}}\ \text{Var} \qquad \dfrac{e_1 \Downarrow v_1 \quad e_2 \Downarrow v_2 \quad v = v_1 \cup v_2}{e_1\ e_2 \Downarrow v}\ \text{App} \qquad \dfrac{e \Downarrow v}{\lambda x.\, e \Downarrow v \setminus \{x\}}\ \text{Lam}$$

    Theorem 4. There is exactly one $v$ for each $e$ such that $e \Downarrow v$.

    Proof. By induction on the height of $e$. In the base case, we have height 0, so $e = x$. Here, there's only one rule that applies, which shows the case. Next, the inductive step:

    - $e = e_0\ e_1$. By induction, there are unique $v_0, v_1$ such that $e_0 \Downarrow v_0$ and $e_1 \Downarrow v_1$. Then there is a unique set $v = v_0 \cup v_1$, and the rule gives that $e \Downarrow v$, which also shows the case.

    - $e = \lambda x.\, e'$. Same exact reasoning, appealing to the uniqueness of $v \setminus \{x\}$.

    We're allowed to do this because there's only one rule that applies to each argument, and the argument on the RHS is smaller.

    It's awkward to prove things based on the height of the proof tree. Let's now move on to well-founded induction, which uses the concept of well-founded relations:


    CS 6110: Feb 6

    Lee Gao

    February 6, 2013

    1 Well Founded Relation

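    A well-founded relation $\prec$ over a set $A$ has no infinite descending chain:

    $$\cdots \prec a_2 \prec a_1 \prec a_0$$

    Example: the predecessor relation on $\mathbb{N}$

    $$0 \prec 1 \prec 2 \prec \ldots \prec n$$

    But the $>$ relation on $\mathbb{N}$ is not. Also, any reflexive relation cannot be well-founded since we just have

    $$\cdots \prec a \prec a \prec a$$

    What about $\in$? Yes. You cannot construct a set of all sets, so you can't have $\cdots \in S_2 \in S_1 \in S_0$.

    Proper subexpression:

    $$e' \text{ subexpression of } e \implies e' \prec e$$

    is well founded.

    To show that $P(x)$ holds for all $x \in A$:

    $$\dfrac{\forall x \in A.\ (\forall y \prec x.\ P(y)) \implies P(x)}{\forall x \in A.\ P(x)}\ \text{WF-Induction}$$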

    Ordinary induction on naturals is WF-Induction where

    A =N

    ,

    N

    Since 0 is of N, then ordinary induction is an instance.Strong induction depends on the


    2 Structural Induction

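    This is what we use when we prove things by induction on the subexpression relation.

    Definition 1. $e' \prec e$ if $e'$ is a proper subexpression of $e$.

    Theorem 1. Under CBV's $\to$, if $e$ is well-formed (no free variables) and $e \to e'$, then $e'$ is also well-formed.

    Proof. Let's have

    $$P(e) \triangleq \forall e'.\ e \to e' \wedge \mathit{Fvs}(e) = \emptyset \implies \mathit{Fvs}(e') = \emptyset$$

    We want to prove that $\forall e \in \mathrm{expr}.\ P(e)$.

    We can prove this by induction on the subexpression relation on expressions, with a case analysis on $e \to e'$. Recall that CBV has operational semantics

    $$\dfrac{}{(\lambda x.\, e)\ v \to e\{v/x\}}\ \beta \qquad \dfrac{e_1 \to e_1'}{e_1\ e_2 \to e_1'\ e_2}\ L \qquad \dfrac{e_2 \to e_2'}{v\ e_2 \to v\ e_2'}\ R$$

    - L rule: $e = e_1\ e_2$ and $e' = e_1'\ e_2$; also $\mathit{Fvs}(e) = \emptyset = \mathit{Fvs}(e_1) \cup \mathit{Fvs}(e_2)$, so both of those are also empty.

      By the IH on the expression $e_1 \prec e_1\ e_2$, we have that $e_1 \to e_1'$ and $\mathit{Fvs}(e_1) = \emptyset$ imply $\mathit{Fvs}(e_1') = \emptyset$, which immediately gives that

      $$\mathit{Fvs}(e') = \mathit{Fvs}(e_1') \cup \mathit{Fvs}(e_2) = \emptyset$$

      which shows the case.

    - $\beta$ rule: $e = (\lambda x.\, e'')\ v$ and $e' = e''\{v/x\}$; then it must mean that $\mathit{Fvs}(\lambda x.\, e'') = \emptyset$ and $\mathit{Fvs}(v) = \emptyset$. So $\mathit{Fvs}(e'') \subseteq \{x\}$.

    We're stuck, so use the following lemma.

    Lemma 1 (Substitution).

    $$\mathit{Fvs}(v) = \emptyset \implies \mathit{Fvs}(e\{v/x\}) = \mathit{Fvs}(e) \setminus \{x\}$$

    Proof. By structural induction on $e$.

    $$e ::= x \mid \lambda x.\, e \mid e_1\ e_2$$

    Let's do a case analysis on the possible expressions.

    - $e = x$: Then $\mathit{Fvs}(e\{v/x\}) = \mathit{Fvs}(v)$, but we've already assumed that $\mathit{Fvs}(v) = \emptyset = \mathit{Fvs}(x) \setminus \{x\}$, which shows the case.

    - $e = y$: Then $\mathit{Fvs}(y\{v/x\}) = \{y\} = \mathit{Fvs}(y) \setminus \{x\}$, which shows the case.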


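    - $e = e_0\ e_1$: Then we have that

      $$\mathit{Fvs}(e\{v/x\}) = \mathit{Fvs}(e_0\{v/x\}) \cup \mathit{Fvs}(e_1\{v/x\})$$

      By the IH on $e_0 \prec e$ and on $e_1 \prec e$, we have that $\mathit{Fvs}(e_0\{v/x\}) = \mathit{Fvs}(e_0) \setminus \{x\}$, and likewise for $e_1$. So their union is just $(\mathit{Fvs}(e_0) \cup \mathit{Fvs}(e_1)) \setminus \{x\} = \mathit{Fvs}(e_0\ e_1) \setminus \{x\}$, which shows the case.

    - $e = \lambda x.\, e'$: $\mathit{Fvs}(e\{v/x\}) = \mathit{Fvs}(e) = \mathit{Fvs}(e) \setminus \{x\}$

    - $e = \lambda y.\, e'$: By well-formedness, we have that $y \notin \mathit{Fvs}(v)$, so

      $$\mathit{Fvs}(e\{v/x\}) = \mathit{Fvs}(\lambda y.\, e'\{v/x\}) = \mathit{Fvs}(e'\{v/x\}) \setminus \{y\}$$

      Now, since $e' \prec e$, we know that $\mathit{Fvs}(e'\{v/x\}) = \mathit{Fvs}(e') \setminus \{x\}$. So the above just becomes

      $$\mathit{Fvs}(e'\{v/x\}) \setminus \{y\} = \mathit{Fvs}(e') \setminus \{x\} \setminus \{y\} = \mathit{Fvs}(\lambda y.\, e') \setminus \{x\}$$

      which shows the case.

    From Lemma 1, we have that $\mathit{Fvs}(e') = \mathit{Fvs}(e'') \setminus \{x\} \subseteq \{x\} \setminus \{x\} = \emptyset$, so $\mathit{Fvs}(e') = \emptyset$.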

    3 Command Equivalence of Imp

    Theorem 2.

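    $$\langle c, \sigma \rangle \Downarrow \sigma' \iff \langle c, \sigma \rangle \to^* \langle \text{skip}, \sigma' \rangle$$

    Proof. The $\Leftarrow$ direction can be proven by structural induction on $c$.

    The $\Rightarrow$ direction can be proven by induction on the derivation of $\langle c, \sigma \rangle \Downarrow \sigma'$.

    Now for an inductively defined set $A$, $x \in A$ has a derivation

    $$D = \dfrac{\dfrac{\text{axiom}}{\vdots\ x_1} \quad \dfrac{\text{axiom}}{\vdots\ x_2} \quad \cdots}{x}$$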


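    So we can form a well-founded relation from the finiteness of $D$: $D' \prec D$ if $D'$ is a subderivation of $D$. So induction on the derivation of a proof means that if we've proven the case for all subderivations $D' \prec D$, then we can show it for $D$.

    By cases, we'll look at the last step in the derivation of $\langle c, \sigma \rangle \Downarrow \sigma'$.

    - Skip: Here, $\langle \text{skip}, \sigma \rangle \Downarrow \sigma$; then it's trivial that $\langle \text{skip}, \sigma \rangle \to^* \langle \text{skip}, \sigma \rangle$.

    - $x := a$, $c_1; c_2$, if: Exercise.

    - while: Suppose that we do the case where $c = \text{while } b \text{ do } c'$ and that $\langle b, \sigma \rangle \Downarrow \text{False}$.

      Lemma 2. $\langle b, \sigma \rangle \Downarrow t \implies \langle b, \sigma \rangle \to^* t$, by structural induction.

      Now

      $$\langle \text{while } b \text{ do } c', \sigma \rangle \to \langle \text{if } b \text{ then } (c'; \text{while } b \text{ do } c') \text{ else skip}, \sigma \rangle$$

      which

      $$\to^* \langle \text{if False then } (c'; \text{while } b \text{ do } c') \text{ else skip}, \sigma \rangle$$

      Lemma 3. If $\langle b, \sigma \rangle \to^* t$, then

      $$\langle \text{if } b \text{ then } (c'; \text{while } b \text{ do } c') \text{ else skip}, \sigma \rangle \to^* \langle \text{if } t \text{ then } (c'; \text{while } b \text{ do } c') \text{ else skip}, \sigma \rangle$$

    - While, True: We have a rule

      $$D = \dfrac{\langle b, \sigma \rangle \Downarrow \text{True} \quad \langle c', \sigma \rangle \Downarrow \sigma'' \quad \langle \text{while } b \text{ do } c', \sigma'' \rangle \Downarrow \sigma'}{\langle \text{while } b \text{ do } c', \sigma \rangle \Downarrow \sigma'}$$

      Now, the derivations of $\langle c', \sigma \rangle \Downarrow \sigma''$ and $\langle \text{while } b \text{ do } c', \sigma'' \rangle \Downarrow \sigma'$ are subderivations of $D$, so applying the induction hypothesis gives that

      $$\langle c', \sigma \rangle \to^* \langle \text{skip}, \sigma'' \rangle, \qquad \langle \text{while } b \text{ do } c', \sigma'' \rangle \to^* \langle \text{skip}, \sigma' \rangle$$

      which can be stitched together with the above two lemmas to conclude the proof.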


    CS 6110: Feb 8

    Lee Gao

    February 8, 2013

    1 Evaluation Contexts

    We have the following types of rules

    - Reduction rules (like $\beta$)

    - Structural congruence rules (like L and R)

    So with larger languages, we might get large sets of congruence rules, which is messy. We can solve this problem by introducing evaluation contexts. An evaluation context is basically a term with a hole $[\cdot]$ in it, where the hole can only be located where a reduction may happen next. So for CBV, the following is a valid context

    $$((\lambda x.\, x)\ [\cdot])\ (\lambda y.\, y\ y)$$

    Unfortunately, in

    $$(\lambda x.\, [\cdot])\ (\lambda y.\, y)$$

    we don't reduce the $[\cdot]$ next, but rather the application first, so this isn't an evaluation context.

    We can write a grammar inductively:

    $$E ::= [\cdot] \mid E\ e \mid v\ E$$

    where $E[e]$ is $E$ with the hole plugged in by $e$. We can introduce a structural congruence rule

    $$\dfrac{e \to e'}{E[e] \to E[e']}\ \text{Context}$$

    1.1 CBN

    Let's define the same beta rule

    $$(\lambda x.\, e)\ e' \to e\{e'/x\}$$

    with the contexts

    $$E ::= [\cdot] \mid E\ e$$

    and the same structural congruence.


    1.2 Observational Equivalence

    We have contexts that define the observations that can be made of programs. When we talk about equivalence, we talk about it with respect to some observations. For lambda calculus

    $$C ::= [\cdot] \mid C\ e \mid e\ C \mid \lambda x.\, C$$

    so we can plug in anywhere, not only where we can perform a reduction. So two expressions $e_1 \cong e_2$ if

    C[e1] C[e2]

    Theorem 1 (Observational Equivalence). C[e1] C[e2] implies thate1 e2

    Proof. Suppose that we have $C[e_1] \Downarrow v_1$ and $C[e_2] \Downarrow v_2$, and $v_1 \neq v_2$. The idea is that we can construct another context $C'$ such that

    $$C'[v_1] \Downarrow \quad \text{but} \quad C'[v_2] \Uparrow$$

    Then there exists a context in which the two do not have the same behavior convergence-wise. So $C'[C[\cdot]]$ is a context witnessing that $e_1 \not\cong e_2$.

    2 Semantics by translation

    So far we've only been talking about operational semantics, which says how to evaluate on some perfect, abstract machine. This basically tells us how to build an interpreter. There are two other styles:

    - Translation: you convert the program to a better-understood representation.
        - Denotational semantics: we translate it into a mathematical expression
        - Definitional translation: we target another language

    - Axiomatic semantics: what can be proven about the execution (pre/post-condition Hoare logic)

    Let's talk about translation. The key idea is a concept of a meaning function, and we use a semantic bracket $[\![e]\!]$ where $e$ is the argument. Occasionally, we'll give them names to make it clear which translation we're talking about. So for example, $\mathcal{C}[\![e]\!]$ or $\mathcal{A}[\![e]\!]$. Sometimes $[\![e]\!]_{\mathrm{CBV}}$.


    Anyway, it takes in a source language expression and produces a target language expression. So

    $$[\![\cdot]\!] : \text{source} \to \text{target}$$

    2.1 Encoding CBN in CBV

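    Let's define a semantic function $[\![\cdot]\!] : \mathrm{Cbn} \to \mathrm{Cbv}$. We want to defer the evaluation of arguments by wrapping them up in a $\lambda$.

    $$[\![e_0\ e_1]\!] = [\![e_0]\!]\ (\lambda z.\, [\![e_1]\!])$$
    $$[\![\lambda x.\, e]\!] = \lambda x.\, [\![e]\!]$$
    $$[\![x]\!] = x\ (\lambda z.\, z)$$

    where $z$ is fresh. Here's an example. Suppose that

    $$\mathit{If} \triangleq \lambda x y z.\, x\ y\ z$$

    so

    $$[\![\mathit{If}]\!] = [\![\lambda x y z.\, x\ y\ z]\!] = \lambda x y z.\, [\![x\ y]\!]\ (\lambda w.\, z\ I)$$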
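The three translation clauses can be written as a function over a small term type. This is a sketch: the `term` type, the dummy name `"_d"`, and the counter-based `fresh` are all assumptions made here, and freshness relies on the further assumption that source variables never start with an underscore.

```ocaml
type term =
  | Var of string
  | Lam of string * term
  | App of term * term

(* Fresh thunk-parameter names; assumes source variables never start with '_'. *)
let counter = ref 0
let fresh () = incr counter; Printf.sprintf "_z%d" !counter

let rec translate = function
  (* [[x]] = x (fun z -> z): variables hold thunks, so force them with a dummy *)
  | Var x -> App (Var x, Lam ("_d", Var "_d"))
  (* [[fun x -> e]] = fun x -> [[e]] *)
  | Lam (x, e) -> Lam (x, translate e)
  (* [[e0 e1]] = [[e0]] (fun z -> [[e1]]): delay the argument in a thunk *)
  | App (e0, e1) -> App (translate e0, Lam (fresh (), translate e1))
```

Because every argument is wrapped in a lambda, a CBV evaluator running the translated term never evaluates an argument until the corresponding variable is used.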

    What does it mean for a translation to be correct? Or, how do we know that we have an adequate translation?

    Adequacy: the source and the target languages agree on evaluation. So if we have a source term $e$ that evaluates to $v$, then

    $$[\![e]\!] \Downarrow_{\text{target}} v'$$

    Therefore, we want, using some observational equivalence relation, that $v' \approx [\![v]\!]$. So we have two properties: soundness

    Theorem 2 (Soundness).

    $$[\![e]\!] \Downarrow v' \implies \exists v.\ e \Downarrow v \wedge v' \approx [\![v]\!]$$

    and completeness

    Theorem 3 (Completeness).

    $$e \Downarrow v \implies \exists v'.\ [\![e]\!] \Downarrow v' \wedge v' \approx [\![v]\!]$$

    Soundness and completeness together give adequacy. There's a stronger notion:

    Theorem 4 (Full Abstraction).

    $$e_1 \cong e_2 \iff [\![e_1]\!] \cong [\![e_2]\!]$$


    CS 6110: Feb 11

    Lee Gao

    February 11, 2013

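    We're going to analyze a Scheme-like language with

    - first-class functions
    - data structures
    - primitives
    - CBV
    - not statically typed

    So the language looks like

    $$e ::= n \mid \text{True} \mid \text{False} \mid \text{null} \mid x \mid \lambda x_1 \ldots x_n.\, e \mid e_0\ e_1 \ldots e_n \mid \text{let } x = e_1 \text{ in } e_2$$
    $$\mid \text{if } e_0 \text{ then } e_1 \text{ else } e_2 \mid e_1 \oplus e_2 \mid (e_1, e_2) \mid \text{let } (x, y) = e_1 \text{ in } e_2$$
    $$\mid \text{letrec } x = \lambda y_1 \ldots y_n.\, e \text{ in } e'$$

    $$v ::= n \mid \text{True} \mid \text{False} \mid \text{null} \mid (v_1, v_2) \mid \lambda x_1 \ldots x_n.\, e$$

    $$E ::= [\cdot] \mid E\ e_1 \ldots e_n \mid v_0\ v_1 \ldots v_k\ E\ e_1 \ldots e_m \mid \text{let } x = E \text{ in } e \mid \text{if } E \text{ then } e_1 \text{ else } e_2$$
    $$\mid E \oplus e \mid v \oplus E \mid (E, e) \mid (v, E) \mid \text{let } (x, y) = E \text{ in } e$$

    Reduction rules: small step

    $$\dfrac{e \to e'}{E[e] \to E[e']}\ \text{Context} \qquad \dfrac{}{(\lambda x_1 \ldots x_n.\, e)\ v_1 \ldots v_n \to e\{v_1/x_1, \ldots, v_n/x_n\}}\ \beta$$

    $$\dfrac{n = n_1 \oplus n_2}{n_1 \oplus n_2 \to n}\ \text{Arithmetic} \qquad \dfrac{}{\text{let } x = v \text{ in } e \to e\{v/x\}}\ \text{Let}$$

    $$\dfrac{x \neq y}{\text{let } (x, y) = (v_1, v_2) \text{ in } e \to e\{v_1/x\}\{v_2/y\}}\ \text{LetPair} \qquad \dfrac{}{\text{if True then } e_1 \text{ else } e_2 \to e_1}\ \text{IfTrue}$$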


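    Can we reduce this to lambda calculus? Let's call our language uML for untyped ML. Now, we can define a translation function $[\![\cdot]\!]$ that translates from uML to $\lambda$. Make sure that the things on the right-hand side are translations of subterms, where "sub" is defined by some well-founded relation $\prec$.

    $$[\![\lambda x_1 \ldots x_n.\, e]\!] = \lambda x_1 \ldots x_n.\, [\![e]\!]$$
    $$[\![e_0\ e_1 \ldots e_n]\!] = [\![e_0]\!]\ [\![e_1]\!] \ldots [\![e_n]\!]$$
    $$[\![x]\!] = x$$
    $$[\![n]\!] = \bar{n}$$
    $$[\![\text{True}]\!] = \mathit{True}\ I$$
    $$[\![\text{False}]\!] = \mathit{False}\ I$$
    $$[\![\text{null}]\!] = \lambda x.\, x$$
    $$[\![\text{let } x = e_1 \text{ in } e_2]\!] = (\lambda x.\, [\![e_2]\!])\ [\![e_1]\!]$$
    $$[\![\text{if } e_0 \text{ then } e_1 \text{ else } e_2]\!] = [\![e_0]\!]\ (\lambda z.\, [\![e_1]\!])\ (\lambda z.\, [\![e_2]\!])$$
    $$[\![(e_1, e_2)]\!] = \mathit{PAIR}\ [\![e_1]\!]\ [\![e_2]\!]$$
    $$[\![\text{let } (x, y) = e_1 \text{ in } e_2]\!] = (\lambda p.\, (\lambda x.\, \lambda y.\, [\![e_2]\!])\ (L\ p)\ (R\ p))\ [\![e_1]\!]$$

    Unfortunately, this is not sound, because we can translate null null into a valid $\lambda$ term with a normal form, even though it has no normal form in uML. Applying null as if it were a function is a runtime type error. The technical term for this is a stuck configuration. This is an unsafe language, i.e. not strongly typed.

    Definition 1 (Strong Typing). A language is strongly typed if there are no stuck configurations. Equivalently, programs do not get stuck if they are well-formed.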

    On the other hand, static typing means that the compiler is going to assigntypes to terms at compile time.Examples of strongly typed languages

    OCaml, SML

    Java

    Scheme

    Javascript

    Examples of statically typed languages

    C

    C++


    Forth

    PS

    1 Fixing uML

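    One idea is to add transitions from all stuck terms to the error term. But we need to define this for all stuck terms, which may be tedious. We define a set of stuck redexes:

    $$s ::= v_0\ v_1 \ldots v_n \quad \text{where } v_0 \neq \lambda x_1 \ldots x_n.\, e$$
    $$\mid \text{let } (x, y) = v \text{ in } e \quad \text{where } v \neq (v_1, v_2)$$

    and the new transitions basically say

    $$\dfrac{}{E[s] \to \text{error}}\ \text{Stuck}$$

    Lemma 1 (Progress). If $e$ is a well-formed program and $e \to^* e'$, then either there exists some $e''$ such that $e' \to e''$, or $e'$ is a value, or $e' = \text{error}$.

    Now, a second way to fix uML is to add a type system. A type system will strengthen the notion of well-formed (if our type system is sound) so that well-formed terms do not get stuck.

    Going back to the first idea, how do we define the previous transition? The idea is to tag values with their runtime types. We can represent null in the source language as a pair (1, null), so we call this runtime type tagged variant uML'.

        uML                            uML'
        null                           (1, null)
        true                           (2, true)
        false                          (2, false)
        n                              (3, n)
        (v1, v2)                       (4, (v1, v2))
        $\lambda x_1 \ldots x_n.\, e$  (5, (n, ...))

    So we're going to do a translation

    $$[\![\cdot]\!] : \text{uML} \to \text{uML}'$$

    So

    $$[\![n]\!] = (3, n) = \mathit{INT}\ n \quad \text{where } \mathit{INT} = \lambda x.\, (3, x)$$
    $$[\![t]\!] = \mathit{BOOL}\ t$$
    $$[\![\text{null}]\!] = \mathit{NULL}\ \text{null}$$
    $$[\![(e_1, e_2)]\!] = \mathit{PAIR}\ ([\![e_1]\!], [\![e_2]\!])$$
    $$[\![\lambda x_1 \ldots x_n.\, e]\!] = \mathit{FUN}\ (n, \lambda x_1 \ldots x_n.\, [\![e]\!])$$
    $$\text{error} = (0, 0)$$
    $$[\![e_0\ e_1 \ldots e_n]\!] = \text{let } (t_0, p_0) = [\![e_0]\!] \text{ in if } t_0 = 5 \wedge (\text{let } (y, f) = p_0 \text{ in } y = n) \text{ then } f\ [\![e_1]\!] \ldots [\![e_n]\!] \text{ else } (0, 0)$$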


    So what we're doing is that we always produce error unless everything checks out. However, this has an overhead in both space (the tags) and time, since we need to check the well-typedness property at run time. Therefore, it's nice to have a type system to make sure that we don't get stuck!


    CS 6110: Feb 13

    Lee Gao

    February 13, 2013

    (Names and Scope) Suppose we have the program

        let rec evil f1 f2 n =
          let f (x) = n + 10 in
          if n = 1 then f (0) + f1 (0) + f2 (0)
          else evil f f1 (n - 1)
        in let dummy = fun x -> 1000 in
        evil dummy dummy 3

    What does this compute? There are two ideas:

    - static/lexical scoping: a variable is bound to the closest enclosing binding in the program text.

    - dynamic scoping: a variable is bound to the most recent active binding (early Lisp, APL, TeX, Perl, early Python, PostScript, ...).

    So under static scoping, the result is 36 (11 + 12 + 13, since each f remembers its own n), but under dynamic scoping it is 33 (every f sees the most recent n = 1).

    In a dynamically scoped language, the free variables are bound by the caller. So for example, if we have a free variable x and we import it into an environment that has a binding for x, that binding may shadow the expected behavior of x. It's unpredictable but cool.

    Dynamic scoping also suffers on the performance side. It's more difficult to figure out how the variables are supposed to be bound, so we need a certain level of analysis to generalize the possible values of x.

    1 Definitional Translation

    We need a naming environment $\rho : \mathrm{Var} \to \mathrm{Value} \cup \{\mathsf{error}\}$. Let's start with an empty environment $\rho_0 = \lambda x.\, \mathsf{error}$. Our terms will have meaning in $\rho$. We'll write this meaning as an argument to the translation

    \[ [\![e]\!]\rho \]

    so that

    \[ [\![\cdot]\!] : \mathrm{Term} \to \mathrm{Env} \to \mathrm{uML} \]

    but with the image restricted to the closed terms.

    1.1 Static Scoping

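    \[ \mathcal{S}[\![n]\!]\rho = n \]
    \[ \mathcal{S}[\![x]\!]\rho = \rho\ \text{"x"} \quad \text{where "x" is some encoding of the variable} \]
    \[ \mathcal{S}[\![e_1 \oplus e_2]\!]\rho = \mathcal{S}[\![e_1]\!]\rho \oplus \mathcal{S}[\![e_2]\!]\rho \]
    \[ \mathcal{S}[\![e_0\ e_1]\!]\rho = \mathcal{S}[\![e_0]\!]\rho\ \ \mathcal{S}[\![e_1]\!]\rho \]
    \[ \mathcal{S}[\![\lambda x.\, e]\!]\rho = \lambda y.\, \mathcal{S}[\![e]\!]\rho[x \mapsto y] \]

    Unfortunately, the result may not be closed, since $\rho$ occurs free in it, so we use closure conversion:

    \[ \mathcal{C}[\![x]\!]\rho = \rho\ \text{"x"} \]
    \[ \mathcal{C}[\![\lambda x.\, e]\!]\rho = (\rho,\ \lambda y.\, \lambda \rho'.\, \mathcal{C}[\![e]\!]\rho'[x \mapsto y]) \]
    \[ \mathcal{C}[\![e_0\ e_1]\!]\rho = \mathsf{let}\ (\rho', f) = \mathcal{C}[\![e_0]\!]\rho\ \mathsf{in}\ f\ (\mathcal{C}[\![e_1]\!]\rho)\ \rho' \]

    In general, closures and the environments they point to cannot be stack allocated, since they must persist after the function call returns. Modern compilers use escape analysis to figure out which variables do not need to be heap allocated.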

    1.2 Dynamic Scoping

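    \[ \mathcal{D}[\![n]\!]\rho = n \]
    \[ \mathcal{D}[\![x]\!]\rho = \rho\ \text{"x"} \quad \text{where "x" is some encoding of the variable} \]
    \[ \mathcal{D}[\![e_1 \oplus e_2]\!]\rho = \mathcal{D}[\![e_1]\!]\rho \oplus \mathcal{D}[\![e_2]\!]\rho \]
    \[ \mathcal{D}[\![e_0\ e_1]\!]\rho = \mathcal{D}[\![e_0]\!]\rho\ \ (\mathcal{D}[\![e_1]\!]\rho)\ \ \rho \]
    \[ \mathcal{D}[\![\lambda x.\, e]\!]\rho = \lambda y.\, \lambda \rho'.\, \mathcal{D}[\![e]\!]\rho'[x \mapsto y] \]

    In dynamic scoping, it's basically the same translation as static, but without closures: the function body runs in the caller's environment $\rho'$. The purpose of a closure is to save the static environment.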

    2 Operational Semantics

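    We'll use the translation as a guide. Let's go with the big-step semantics

    \[ \langle e, \rho \rangle \Downarrow v \]

    with the following rules for static scoping:

    \[ \dfrac{}{\langle x, \rho \rangle \Downarrow \rho(x)}\ \text{(Var)} \]
    \[ \dfrac{}{\langle \lambda x.\, e, \rho \rangle \Downarrow (\rho, \lambda x.\, e)}\ \text{(Lambda)} \]
    \[ \dfrac{\langle e_0, \rho \rangle \Downarrow (\rho', \lambda x.\, e) \qquad \langle e_1, \rho \rangle \Downarrow v_1 \qquad \langle e, \rho'[x \mapsto v_1] \rangle \Downarrow v}{\langle e_0\ e_1, \rho \rangle \Downarrow v}\ \text{(App)} \]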

    2.1 Dynamic Scoping

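    \[ \dfrac{}{\langle x, \rho \rangle \Downarrow \rho(x)}\ \text{(Var)} \]
    \[ \dfrac{}{\langle \lambda x.\, e, \rho \rangle \Downarrow \lambda x.\, e}\ \text{(Lambda)} \]
    \[ \dfrac{\langle e_0, \rho \rangle \Downarrow \lambda x.\, e \qquad \langle e_1, \rho \rangle \Downarrow v_1 \qquad \langle e, \rho[x \mapsto v_1] \rangle \Downarrow v}{\langle e_0\ e_1, \rho \rangle \Downarrow v}\ \text{(App)} \]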
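The two sets of big-step rules differ only in what a lambda evaluates to and in which environment the body runs. The following OCaml sketch implements both in one evaluator. The constructor names (`Closure`, `Code`), the `scoping` flag, and the `Let` and `Add` forms are illustrative assumptions, not part of the notes.

```ocaml
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Lam of string * expr
  | App of expr * expr
  | Let of string * expr * expr   (* assumed sugar for binding a name *)

type value =
  | Int of int
  | Closure of env * string * expr   (* static: lambda plus saved environment *)
  | Code of string * expr            (* dynamic: bare lambda, no environment *)
and env = (string * value) list

type scoping = Static | Dynamic

let rec eval (mode : scoping) (rho : env) (e : expr) : value =
  match e with
  | Num n -> Int n
  | Var x -> List.assoc x rho                      (* rule Var *)
  | Add (a, b) ->
    (match eval mode rho a, eval mode rho b with
     | Int m, Int n -> Int (m + n)
     | _ -> failwith "stuck: adding non-integers")
  | Lam (x, body) ->                               (* rule Lambda *)
    (match mode with
     | Static -> Closure (rho, x, body)
     | Dynamic -> Code (x, body))
  | App (f, arg) ->                                (* rule App *)
    let fv = eval mode rho f in
    let av = eval mode rho arg in
    (match fv with
     | Closure (rho', x, body) -> eval mode ((x, av) :: rho') body
     | Code (x, body) -> eval mode ((x, av) :: rho) body
     | Int _ -> failwith "stuck: applying a non-function")
  | Let (x, e1, e2) -> eval mode ((x, eval mode rho e1) :: rho) e2

(* let x = 1 in let f = fun y -> x + y in let x = 10 in f 0 *)
let prog =
  Let ("x", Num 1,
    Let ("f", Lam ("y", Add (Var "x", Var "y")),
      Let ("x", Num 10,
        App (Var "f", Num 0))))
```

Under `Static` the body of `f` sees the `x` that was in scope where `f` was defined; under `Dynamic` it sees the most recent binding of `x` at the call site.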

    3 Observational Equivalence

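    Theorem 1 (Equivalence).

    \[ \mathcal{S}[\![e]\!]\rho = e\,\{\rho(x)/x \mid x \in \mathrm{Fvs}(e)\} = e\{\rho\} \]

    Proof. By structural induction on $e$. Cases:

    $e = x$. $\mathcal{S}[\![x]\!]\rho = \rho(x) = x\{\rho\}$, which shows the case.

    $e = e_0\ e_1$. $\mathcal{S}[\![e_0\ e_1]\!]\rho = \mathcal{S}[\![e_0]\!]\rho\ \mathcal{S}[\![e_1]\!]\rho$. By IH, we have that $\mathcal{S}[\![e_0]\!]\rho = e_0\{\rho\}$ and also that $\mathcal{S}[\![e_1]\!]\rho = e_1\{\rho\}$. So $e_0\{\rho\}\ e_1\{\rho\} = (e_0\ e_1)\{\rho\}$, which shows the case.

    $e = \lambda x.\, e'$. $\mathcal{S}[\![\lambda x.\, e']\!]\rho = \lambda y.\, \mathcal{S}[\![e']\!]\rho[x \mapsto y]$. By IH, we have that $\mathcal{S}[\![e']\!]\rho[x \mapsto y] = e'\{\rho[x \mapsto y]\} = e'\{\rho \setminus x,\ y/x\}$. By expansion, the whole term is just $\lambda y.\, (\lambda x.\, e'\{\rho \setminus x\})\ y$. Let's use an eta reduction to get $\lambda x.\, e'\{\rho \setminus x\} = (\lambda x.\, e')\{\rho\}$, which shows the case.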


    CS 6110: Feb 15

    Lee Gao

    February 15, 2013

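    What is the let rec construct?

    \[ \mathsf{letrec}\ y_1 = \lambda x_1.\, e_1,\ y_2 = \lambda x_2.\, e_2,\ \ldots\ \mathsf{in}\ e \]

    Operational semantics: we substitute the entire letrec in.

    \[ \mathsf{letrec}\ \bar{y} = \lambda \bar{x}.\, \bar{e}\ \mathsf{in}\ e \;\to\; e\,\{(\mathsf{letrec} \ldots \mathsf{in}\ \lambda x_1.\, e_1)/y_1\} \cdots \{(\mathsf{letrec} \ldots \mathsf{in}\ \lambda x_n.\, e_n)/y_n\} \quad \text{(Letrec)} \]

    As a translation using naming environments, the target language is still closed uML. Let's imagine that we have a Merge function that merges two environments $\rho, \rho'$.

    \[ [\![\mathsf{letrec}\ \bar{y} = \lambda \bar{x}.\, \bar{e}\ \mathsf{in}\ e]\!]\rho = [\![e]\!](\mathrm{Merge}\ \rho\ \rho') \]

    Now, we want

    \[ \rho' = Y\ \lambda \rho''.\, \lambda x.\ \mathsf{if}\ x = y_1\ \mathsf{then}\ [\![\lambda x_1.\, e_1]\!](\mathrm{Merge}\ \rho\ \rho'')\ \mathsf{else\ if}\ x = y_2\ \mathsf{then}\ [\![\lambda x_2.\, e_2]\!](\mathrm{Merge}\ \rho\ \rho'')\ \mathsf{else}\ \ldots \]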

    If you're building a compiler for this language, then instead of using the Y combinator, you use mutation and back patching to resolve the self-reference in the environment table.

    The naming schemes we've used so far are hierarchical (there's a notion of nested scopes). What if we want non-hierarchical names?
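As an illustration of back patching, the sketch below ties the recursive knot for factorial with a mutable cell instead of a fixed-point combinator. The names `fact_cell` and `fact` are hypothetical; a real compiler would patch a slot in its environment table in the same spirit.

```ocaml
(* Step 1: allocate a slot holding a placeholder. *)
let fact_cell : (int -> int) ref = ref (fun _ -> failwith "unpatched")

(* Step 2: define the body; recursive calls go through the slot. *)
let fact (n : int) : int =
  if n = 0 then 1 else n * !fact_cell (n - 1)

(* Step 3: back patch, so the slot now refers to the function itself. *)
let () = fact_cell := fact
```

After the patch, `fact 5` follows the slot on each recursive call, so no `let rec` and no Y combinator is needed.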

    1 Non-hierarchical names

    We want to build separate modules that can exploit public names. You can reason about correctness and other properties of modules in isolation.

    How do we export names from one module to another?

    One idea is to have a huge global namespace (such as in C). This is problematic because there could be collisions in names. It works as long as libraries aren't too big.

    Another idea is to add module mechanisms explicitly, so we can group related functions together and export a set of names that form the interface through which other modules communicate.

    Construct modules:

    module (x1 = e1, ..., xn = en)

    Selector em.x to get the x outside of module m defined by em.

    Import/Open/With to bring all of the identifiers into scope from some module defined as em.

    import em in e

    However, in uML, a module is a term, i.e. a first-class value. A first-class value is just as normal as any other term. What can we do with it?

    bind to variables

    pass as arguments

    return as a result

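    In ML, Java, and most other languages, modules are more second class. This is done to ensure that all module-level computations are done at compile or link time. In ML, you are only allowed to pass modules to functors, which are also second class.

    We're going to translate uML+modules to uML that is closed. Side note: we can interpret the $\rho$ as translating each term into a lambda that first takes in an environment and passes it around to its subterms.

    \[ \mathcal{M}[\![x]\!]\rho = \rho\ \text{"x"} \]

    A module is really similar to a naming environment $\rho$, so let's just represent it as such.

    \[ \mathcal{M}[\![e_m.x]\!]\rho = \mathcal{M}[\![e_m]\!]\rho\ \text{"x"} \]
    \[ \mathcal{M}[\![\mathsf{import}\ e_m\ \mathsf{in}\ e]\!]\rho = \mathcal{M}[\![e]\!](\mathrm{Merge}\ \rho\ (\mathcal{M}[\![e_m]\!]\rho)) \]
    \[ \mathcal{M}[\![\mathsf{module}\ \{\bar{x} = \bar{e}\}]\!]\rho = \lambda x.\ \mathsf{if}\ x = x_1\ \mathsf{then}\ \mathcal{M}[\![e_1]\!]\rho\ \mathsf{else}\ \ldots \]

    Unfortunately, this semantics will not allow any of the $e_i$ to be defined in terms of the other names inside the module, so we need to merge the environment with the module itself.

    \[ \mathcal{M}[\![\mathsf{module}\ \{\bar{x} = \bar{e}\}]\!]\rho = Y\ \lambda \rho'.\, \lambda x.\ \mathsf{if}\ x = x_1\ \mathsf{then}\ \mathcal{M}[\![e_1]\!](\mathrm{Merge}\ \rho\ \rho')\ \mathsf{else}\ \ldots \]

    This allows us to write

    let m = module (x = y + 1,
                    y = x + 1) in
    m.x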


    But here the fixed point of m has no normal form for x (evaluating m.x diverges). How do we avoid this? Solutions:

    only allow bindings to values

    allow some expressions that aren't values, but only allow references to earlier variables. So we build and re-pass in the approximate naming environment as we go.

    break the fixed point (so the solution isn't a fixed point of m)

    allow divergence ($\Omega$).

    OCaml is nice in that all recursive definitions are explicitly marked with let rec, so you know where the fixed point combinator is coming in. Java takes a fixed point over the entire class hierarchy (including the standard library). This is sometimes known as open recursion.


    CS 6110: Feb 18

    Lee Gao

    February 18, 2013

    States: (refs, mutable variables)

    A state is a dynamic binding from identities to values. Moreover, these bindings can change over time.

    It matches our mental model of how the world is. It might be a lie, but it's a convenient lie.

    But it's harder to reason about the program, and state is nonlocal (you don't know how far the effects of your variable will go). Treating all state as equal also gives a misleading performance model.

    1 Stateful uML

    Let uML! be the stateful version of uML. We get

    \[ e ::= \mathrm{uML} \mid \mathsf{ref}\ e \mid\ !e \mid e_1 := e_2 \mid l \]

    where $l$ is a symbolic variable for locations (think of them as memory addresses). We need this within the operational semantics.

    \[ v ::= v(\mathrm{uML}) \mid l \]

    let x = ref 1 in
    let y = x in
    let z = (x := 2) in
    !y

    (Figure: x and y both point to the same location, which now holds 2.)

    1.1 Small Step Operational Semantics

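    Let's augment our configuration to be $\langle e, \sigma \rangle$ where $\sigma : \text{location} \rightharpoonup \text{value}$. We'll lift all uML rules to corresponding rules

    \[ \dfrac{e \to e'}{\langle e, \sigma \rangle \to \langle e', \sigma \rangle} \]

    New reductions

    \[ \dfrac{l \notin \mathrm{dom}(\sigma) \qquad \sigma' = \sigma[l \mapsto v]}{\langle \mathsf{ref}\ v, \sigma \rangle \to \langle l, \sigma' \rangle}\ \text{(Ref)} \]
    \[ \dfrac{v = \sigma(l)}{\langle\, !l, \sigma \rangle \to \langle v, \sigma \rangle}\ \text{(Bang)} \]
    \[ \dfrac{\sigma' = \sigma[l \mapsto v]}{\langle l := v, \sigma \rangle \to \langle \mathsf{null}, \sigma' \rangle}\ \text{(Set)} \]
    \[ E ::= \ldots \mid \mathsf{ref}\ E \mid\ !E \mid E := e \mid l := E \]

    We only allow configurations where all locations $l \in e$ are in $\mathrm{dom}(\sigma)$. This is the preservation of well-formedness of $\sigma$ with respect to $e$. Next, we need to lift the old context rule

    \[ \dfrac{\langle e, \sigma \rangle \to \langle e', \sigma' \rangle}{\langle E[e], \sigma \rangle \to \langle E[e'], \sigma' \rangle}\ \text{(Context)} \]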

    2 State-passing translation

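    For uML! $\to$ uML. Assume that we have an implementation of $\sigma$ as a term in the target language uML such that we have

    \[ \sigma[l \mapsto v] \triangleq \mathrm{Update}\ \sigma\ l\ v \]
    \[ \text{fresh } l \triangleq \mathrm{Malloc}\ \sigma \]
    \[ \sigma(l) \triangleq \mathrm{Lookup}\ \sigma\ l \]
    \[ \emptyset \triangleq \mathrm{Empty} \]

    Translation $\mathcal{S}$ : uML! $\to$ uML

    \[ \mathcal{S}[\![e]\!]\rho\sigma = (v, \sigma') \]

    Let's start

    \[ \mathcal{S}[\![n]\!]\rho\sigma = (n, \sigma) \]
    \[ \mathcal{S}[\![x]\!]\rho\sigma = (\rho(x), \sigma) \]
    \[ \mathcal{S}[\![\mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2]\!]\rho\sigma = \mathsf{let}\ (b, \sigma') = \mathcal{S}[\![e_0]\!]\rho\sigma\ \mathsf{in}\ \mathsf{if}\ b\ \mathsf{then}\ \mathcal{S}[\![e_1]\!]\rho\sigma'\ \mathsf{else}\ \mathcal{S}[\![e_2]\!]\rho\sigma' \]
    \[ \mathcal{S}[\![\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2]\!]\rho\sigma = \mathsf{let}\ (y, \sigma') = \mathcal{S}[\![e_1]\!]\rho\sigma\ \mathsf{in}\ \mathcal{S}[\![e_2]\!]\rho[x \mapsto y]\sigma' \]
    \[ \mathcal{S}[\![e_0\ e_1]\!]\rho\sigma = \mathsf{let}\ (f, \sigma') = \mathcal{S}[\![e_0]\!]\rho\sigma\ \mathsf{in}\ \mathsf{let}\ (v, \sigma'') = \mathcal{S}[\![e_1]\!]\rho\sigma'\ \mathsf{in}\ f\ v\ \sigma'' \]

    So the translation of a function must also take in the $\sigma$ as well.

    \[ \mathcal{S}[\![\lambda x.\, e]\!]\rho = \lambda y.\, \lambda \sigma'.\, \mathcal{S}[\![e]\!]\rho[x \mapsto y]\sigma' \]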
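The state-passing idea can be tried out directly in OCaml with an immutable map as the store. This is only a sketch: `Store`, `malloc`, `ref_`, `deref`, and `assign` are hypothetical stand-ins for Empty, Malloc, Update, and Lookup, and the clauses for ref, ! and := are assumptions that the notes do not spell out.

```ocaml
module Store = Map.Make (struct type t = int let compare = compare end)

type value = Int of int | Loc of int | Null
type store = value Store.t

(* Malloc: locations are never freed here, so the size is always fresh. *)
let malloc (s : store) : int = Store.cardinal s

(* ref v: allocate a location, return it with the extended store. *)
let ref_ (v : value) (s : store) : value * store =
  let l = malloc s in
  (Loc l, Store.add l v s)

(* !l: look the location up; the store is passed through unchanged. *)
let deref (v : value) (s : store) : value * store =
  match v with
  | Loc l -> (Store.find l s, s)
  | _ -> failwith "stuck: dereferencing a non-location"

(* l := v: return null together with the updated store. *)
let assign (target : value) (v : value) (s : store) : value * store =
  match target with
  | Loc l -> (Null, Store.add l v s)
  | _ -> failwith "stuck: assigning to a non-location"

(* let x = ref 1 in let y = x in let z = (x := 2) in !y *)
let example : value =
  let s0 = Store.empty in
  let x, s1 = ref_ (Int 1) s0 in
  let y = x in
  let _z, s2 = assign x (Int 2) s1 in
  let result, _s3 = deref y s2 in
  result
```

Each step returns a pair of a value and a new store, exactly as in the shape $(v, \sigma')$ of the translation, and the aliasing of x and y shows up because both hold the same location.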


    3 Mutable Variables

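    There are no refs, but there will be pointers, and variables are inherently mutable. This gives the distinction between LValues and RValues.

    \[ e ::= \mathrm{uML} \mid e_1 = e_2 \mid {*e} \mid \&e \mid e_1; e_2 \]

    Let's have a translation for LValues

    \[ \mathcal{L}[\![e]\!]\rho\sigma = \text{location of } e \]

    and

    \[ \mathcal{R}[\![e]\!]\rho\sigma = \text{value of } e \]

    so as before, but $\rho : \mathrm{Var} \to \mathrm{Location}$.

    \[ \mathcal{R}[\![n]\!]\rho\sigma = (n, \sigma) \]
    \[ \mathcal{L}[\![x]\!]\rho\sigma = (\rho(x), \sigma) \]
    \[ \mathcal{R}[\![e]\!]\rho\sigma = \mathsf{let}\ (l, \sigma') = \mathcal{L}[\![e]\!]\rho\sigma\ \mathsf{in}\ (\sigma'(l), \sigma') \quad \text{but only when } \mathcal{L}[\![e]\!] \text{ is defined} \]
    \[ \mathcal{R}[\![e_1 = e_2]\!]\rho\sigma = \mathsf{let}\ (l, \sigma') = \mathcal{L}[\![e_1]\!]\rho\sigma\ \mathsf{in}\ \mathsf{let}\ (v, \sigma'') = \mathcal{R}[\![e_2]\!]\rho\sigma'\ \mathsf{in}\ (v, \sigma''[l \mapsto v]) \]
    \[ \mathcal{L}[\![{*e}]\!]\rho\sigma = \mathcal{R}[\![e]\!]\rho\sigma \]
    \[ \mathcal{R}[\![\&e]\!]\rho\sigma = \mathcal{L}[\![e]\!]\rho\sigma \]

    so we should be able to write *&*&x = 2

    \[ \mathcal{R}[\![\lambda x.\, e]\!]\rho = \lambda y.\, \lambda \sigma.\ \mathsf{let}\ l = \mathrm{Malloc}\ \sigma\ \mathsf{in}\ \mathcal{R}[\![e]\!]\rho[x \mapsto l]\ \sigma[l \mapsto y] \]
    \[ \mathcal{R}[\![e_0\ e_1]\!]\rho\sigma = \text{same as before} \]

    If we want to allow pass by reference, which updates a location outside of the scope of the lambda, we can model this using the notion of a pass-by-reference function

    \[ \mathcal{R}[\![\mathsf{R}\lambda x.\, e]\!]\rho = \lambda l.\, \lambda \sigma.\ \mathcal{R}[\![e]\!]\rho[x \mapsto l]\ \sigma \]

    and we need to update the machinery of applications

    \[ \mathcal{R}[\![\mathsf{rcall}\ e_0\ e_1]\!]\rho\sigma = \mathsf{let}\ (f, \sigma') = \mathcal{R}[\![e_0]\!]\rho\sigma\ \mathsf{in}\ \mathsf{let}\ (l, \sigma'') = \mathcal{L}[\![e_1]\!]\rho\sigma'\ \mathsf{in}\ f\ l\ \sigma'' \]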


    CS 6110: Feb 22

    Lee Gao

    February 22, 2013

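    CPS: continuation-passing style.

    We can use this style of writing code to give errors and exceptions. For those we need to be able to transfer control in ways that don't follow local control flow, and continuations let us define control flow explicitly.

    We defined some functions to tag things.

    \[ \mathrm{BOOL} \triangleq \lambda t.\,(2, t) \]

    so that

    \[ [\![\mathsf{not}\ e]\!] = \mathsf{let}\ (t, b) = [\![e]\!]\ \mathsf{in}\ \mathsf{if}\ t = 2\ \mathsf{then}\ \mathrm{BOOL}\ (\mathrm{NOT}\ b)\ \mathsf{else}\ \mathrm{ERROR} \]

    It'd be neat to factor out the boilerplate code that checks whether tags are right. We would want a function that checks the tag and then returns the untagged value if the tag is correct.

    If the tag is wrong, then in CPS we transfer control to the error and just never come back. So in CPS, a function becomes a continuation transformer: it takes in one continuation and turns it into another continuation.

    \[ \mathrm{CHECKBOOL} = \lambda k.\, \lambda v.\ \mathsf{let}\ (t, b) = v\ \mathsf{in}\ \mathsf{if}\ t = 2\ \mathsf{then}\ k\ b\ \mathsf{else}\ \mathsf{halt}\ \mathrm{ERROR} \]

    So in the CPS translation

    \[ [\![\mathsf{not}\ e]\!]k = [\![e]\!](\mathrm{CHECKBOOL}\ (\lambda b.\, k\ (\mathrm{BOOL}\ (\mathrm{NOT}\ b)))) \]
    \[ [\![\mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2]\!]k = [\![e_0]\!](\mathrm{CHECKBOOL}\ (\lambda b.\ \mathsf{if}\ b\ \mathsf{then}\ [\![e_1]\!]k\ \mathsf{else}\ [\![e_2]\!]k)) \]
    \[ [\![e_0\ e_1 \cdots e_n]\!]k = [\![e_0]\!](\mathrm{CHECKFUN}\ n\ (\lambda f.\, [\![e_1]\!](\lambda v_1.\, [\![e_2]\!](\lambda v_2.\, \cdots [\![e_n]\!](\lambda v_n.\, f\ k\ v_1 \cdots v_n))))) \]
    \[ [\![\lambda \bar{x}_n.\, e]\!]k = k\ (\mathrm{FUN}\ n\ (\lambda k'.\, \lambda \bar{x}_n.\, [\![e]\!]k')) \]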

    This might extend to exceptions.
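The continuation-transformer view of CHECKBOOL can be sketched in OCaml as follows. Here the tags are modelled by constructors of a hypothetical `value` type, `halt` is taken to be the identity top-level continuation, and `not_cps` and `const` are illustrative names rather than anything defined in the notes.

```ocaml
type value = Bool of bool | Int of int | Error

(* The top-level continuation: just return the final answer. *)
let halt (v : value) : value = v

(* check_bool k is a continuation expecting a tagged value. It passes the
   untagged boolean to k, or abandons k entirely when the tag is wrong. *)
let check_bool (k : bool -> value) : value -> value = fun v ->
  match v with
  | Bool b -> k b
  | _ -> halt Error

(* CPS translation of `not e`, where e_cps is the translation of e. *)
let not_cps (e_cps : (value -> value) -> value) (k : value -> value) : value =
  e_cps (check_bool (fun b -> k (Bool (not b))))

(* The CPS translation of a constant: hand it to the continuation. *)
let const v = fun k -> k v
```

With these definitions, `not_cps (const (Int 3)) k` yields `Error` for any `k`, because the wrong tag means `k` is simply never called.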

    1 Exceptions

    Augment terms with

    \[ e ::= \mathrm{uML} \mid \mathsf{raise}\ s\ e \mid \mathsf{try}\ e_1\ \mathsf{catch}\ (s\ x)\ e_2 \]

    This is a dynamic binding of try-catch: the handler is the most recently invoked try that hasn't caught yet. You need an environment for exceptions. Define a handler environment that maps from exception names to continuations.

    To do this, we have

    \[ [\![e]\!]\,k\,h \]

    where $k$ is the continuation and $h$ is the handler environment. Functions in the target should look like

    \[ \lambda k\, h\, x.\ \ldots \]

    \[ [\![x]\!]\,k\,h = k\ x \]
    \[ [\![\lambda x.\, e]\!]\,k\,h = k\ (\lambda k'\, h'\, x.\ [\![e]\!]\,k'\,h') \]
    \[ [\![e_0\ e_1]\!]\,k\,h = [\![e_0]\!]\ (\lambda f.\, [\![e_1]\!]\ (\lambda v.\ f\ k\ h\ v)\ h)\ h \]

    You can think of exceptions as dynamically scoped variables bound to continuations.

    Now, let's move on to the translational semantics of raise and catch.

    \[ [\![\mathsf{raise}\ s\ e]\!]\,k\,h = [\![e]\!]\ (h\ s)\ h \]

    so a raise inside a raise means that the first raise didn't happen

    \[ [\![\mathsf{try}\ e_1\ \mathsf{catch}\ (s\ x)\ e_2]\!]\,k\,h = [\![e_1]\!]\ k\ (h[s \mapsto \lambda x.\, [\![e_2]\!]\,k\,h]) \]
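The raise and catch equations can be exercised with a small OCaml sketch in which a computation takes a normal continuation and a handler environment. The types `cont`, `handlers`, `comp` and the helpers `return`, `raise_`, `try_catch`, `run` are assumptions made for this illustration, and answers are restricted to integers to keep it short.

```ocaml
type answer = int
type cont = int -> answer
type handlers = string -> cont     (* exception name -> continuation *)

(* A CPS computation: given k and h, produce the final answer. *)
type comp = cont -> handlers -> answer

let return (n : int) : comp = fun k _h -> k n

(* raise s e: evaluate e, then continue with the handler for s, not k. *)
let raise_ (s : string) (e : comp) : comp = fun _k h -> e (h s) h

(* try e1 catch (s x) e2: run e1 with h extended at s; the handler runs
   e2 with the continuation and handlers of the enclosing try. *)
let try_catch (e1 : comp) (s : string) (e2 : int -> comp) : comp =
  fun k h ->
    let h' name = if name = s then (fun x -> e2 x k h) else h name in
    e1 k h'

let top_handlers : handlers = fun name -> failwith ("uncaught " ^ name)

let run (c : comp) : answer = c (fun v -> v) top_handlers
```

For example, `run (try_catch (raise_ "E" (return 41)) "E" (fun x -> return (x + 1)))` sends 41 to the handler installed for `"E"`, which finishes with the continuation of the whole try.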

    So far, we've looked at termination semantics. There's another idea, resumption semantics: if you have an expression that hits an unusual exception, can we define a handler that fixes it and then resumes?

    2 Resumption Semantics

    We're going to have a new type of exception, the interrupt.

    \[ e ::= \ldots \mid \mathsf{interrupt}\ s\ e \mid \mathsf{try}\ e_1\ \mathsf{handle}\ (s, x)\ e_2 \]

    But our handler environment will take exception names and map them to functions, which then go back to the original continuation.

    \[ [\![\mathsf{interrupt}\ s\ e]\!]\,k\,h = [\![e]\!]\ (\lambda v.\ h\ s\ k\ h\ v)\ h \]
    \[ [\![\mathsf{try}\ e_1\ \mathsf{handle}\ (s, x)\ e_2]\!]\,k\,h = [\![e_1]\!]\ k\ (h[s \mapsto \lambda k'\, h'\, x.\ [\![e_2]\!]\ k'\ h\ (\text{or } h')]) \]


    CS 6110: Feb 22

    Lee Gao

    March 13, 2013

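    Theorem 1.

    \[ \bigsqcup_{m,n} e_{nm} = \bigsqcup_{n} \bigsqcup_{m} e_{nm} = \bigsqcup_{k} e_{kk} \]

    1 What are the operations on these continuous functions

    Compose: it takes in a pair, $[D \to E] \times [E \to F] \to [D \to F]$, and it is defined as

    \[ \lambda p \in [D \to E] \times [E \to F].\ \lambda x \in D.\ (\pi_2\, p)\ ((\pi_1\, p)\ x) \]

    Curry : $[D \times E \to F] \to [D \to [E \to F]]$, and symmetrically uncurrying.

    Apply : $[D \to E] \times D \to E$

    Fix : $[[D \to D] \to D]$

    \[ \mathrm{fix} = \lambda g \in [D \to D].\ \bigsqcup_{n} g^n(\bot) = \bigsqcup_{n} \lambda g.\ g^n(\bot) \]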
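The chain $g^n(\bot)$ whose least upper bound defines the fixed point can be observed concretely. The sketch below models the lifted integers with `option` (with `None` as bottom) and uses factorial as the functional; `bottom`, `g`, and `approx` are hypothetical names chosen for this example.

```ocaml
(* None plays the role of bottom in the lifted domain. *)
type 'a lifted = 'a option

(* The least element of the function space: undefined everywhere. *)
let bottom : int -> int lifted = fun _ -> None

(* The functional g whose least fixed point is factorial. *)
let g (f : int -> int lifted) : int -> int lifted = fun n ->
  if n = 0 then Some 1
  else
    match f (n - 1) with
    | None -> None
    | Some r -> Some (n * r)

(* approx k = g^k(bottom), the k-th element of the chain. *)
let rec approx (k : int) : int -> int lifted =
  if k = 0 then bottom else g (approx (k - 1))
```

Each `approx k` is defined exactly on inputs below `k`, so the approximations form an increasing chain and factorial is their least upper bound.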

    2 Denotational Semantics for a language with rec

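    Let's define a language Rec which is just recursive functions.

    \[ p ::= \mathsf{let}\ d\ \mathsf{in}\ e \]
    \[ d ::= f_1(x_1, \ldots, x_{a_1}) = e_1,\ \ldots,\ f_n(x_1, \ldots, x_{a_n}) = e_n \]
    \[ e ::= n \mid x \mid e_1 \oplus e_2 \mid \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 \mid \mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \mid f_i(e_1, \ldots, e_{a_i}) \]

    Let's come up with a denotational semantics. We need a variable environment $\rho : \mathrm{Var} \to \mathbb{Z}$. We also need a function environment

    \[ \varphi \in (\mathbb{Z}^{a_1} \to \mathrm{Result}) \times \cdots \times (\mathbb{Z}^{a_n} \to \mathrm{Result}) \]

    so that $\pi_i\, \varphi = f_i$ (denotationally equivalent). Now, our denotational translation:

    \[ [\![e]\!]\rho\varphi : \mathrm{Result} \]

    but since we can have general ad-hoc recursion, we need to lift $\mathbb{Z}$ for Result. So

    \[ \mathrm{Result} = \mathbb{Z}_\bot \]

    \[ [\![n]\!]\rho\varphi = \lfloor n \rfloor \]
    \[ [\![x]\!]\rho\varphi = \lfloor \rho\ x \rfloor \]
    \[ [\![e_1 \oplus e_2]\!]\rho\varphi = \mathsf{let}\ v_1 \in \mathbb{Z} = [\![e_1]\!]\rho\varphi\ \mathsf{in}\ \mathsf{let}\ v_2 \in \mathbb{Z} = [\![e_2]\!]\rho\varphi\ \mathsf{in}\ \lfloor v_1 \oplus v_2 \rfloor = [\![e_1]\!]\rho\varphi \oplus_\bot [\![e_2]\!]\rho\varphi \]
    \[ [\![\mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2]\!]\rho\varphi = \mathsf{let}\ b \in \mathbb{Z} = [\![e_0]\!]\rho\varphi\ \mathsf{in}\ \mathsf{if}\ b > 0\ \mathsf{then}\ [\![e_1]\!]\rho\varphi\ \mathsf{else}\ [\![e_2]\!]\rho\varphi \]
    \[ [\![\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2]\!]\rho\varphi = \mathsf{let}\ v \in \mathbb{Z} = [\![e_1]\!]\rho\varphi\ \mathsf{in}\ [\![e_2]\!]\rho[x \mapsto v]\varphi \]
    \[ [\![f_i(e_1, \ldots, e_{a_i})]\!]\rho\varphi = \mathsf{let}\ v_1 \in \mathbb{Z} = [\![e_1]\!]\rho\varphi\ \mathsf{in}\ \cdots\ \mathsf{let}\ v_{a_i} \in \mathbb{Z} = [\![e_{a_i}]\!]\rho\varphi\ \mathsf{in}\ (\pi_i\, \varphi)(v_1, \ldots, v_{a_i}) \]
    \[ [\![\mathsf{let}\ d\ \mathsf{in}\ e]\!]\rho = [\![e]\!]\rho\ (\mathcal{D}[\![d]\!]) \]

    So of course, we now need to define the same thing for $\mathcal{D}$.

    \[ \mathcal{D}[\![f_1(\ldots) = e_1, \ldots, f_n(\ldots) = e_n]\!] = \mathrm{fix}\ \lambda \varphi.\ (\lambda p_1 \in \mathbb{Z}^{a_1}.\ [\![e_1]\!]\{x_1 \mapsto \pi_1\, p_1, \ldots\}\varphi,\ \ \ldots,\ \ \lambda p_n \in \mathbb{Z}^{a_n}.\ [\![e_n]\!]\{x_1 \mapsto \pi_1\, p_n, \ldots\}\varphi) \]

    But how do we get the $\varphi$ in there if it's out there? Take a least upper bound to resolve this. Is this function F continuous? Yes. But is our function environment a pointed CPO?

    Since the function environment is a tuple, it will be pointed if each $\mathbb{Z}^{a_i} \to \mathrm{Result}$ is pointed. Its bottom must produce bottom everywhere, so as long as the codomain Result is pointed, we're okay. Since $\mathrm{Result} = \mathbb{Z}_\bot$, we're good.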

    3 Call By Name Rec

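    Here, we lazily evaluate variables, so we can bind to divergent terms as long as they do not get evaluated. So our environment becomes

    \[ \rho : \mathrm{Var} \to \mathbb{Z}_\bot \]

    and functions can now expect divergent terms, so

    \[ \pi_i\, \varphi : \mathbb{Z}_\bot^{a_i} \to \mathbb{Z}_\bot \]

    so we just need to change

    \[ [\![x]\!]\rho\varphi = \rho(x) \]
    \[ [\![f_i(e_1, \ldots, e_{a_i})]\!]\rho\varphi = (\pi_i\, \varphi)([\![e_1]\!]\rho\varphi, \ldots, [\![e_{a_i}]\!]\rho\varphi) \]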


    CS 6110: April 1

    Lee Gao

    April 1, 2013

    1 Recap: SOS

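    \[ \dfrac{e_0 \to e_0'}{e_0\ e_1 \to e_0'\ e_1}\ \text{(L)} \qquad \dfrac{e_1 \to e_1'}{v_0\ e_1 \to v_0\ e_1'}\ \text{(R)} \qquad \dfrac{}{(\lambda x : \tau.\, e)\ v \to e\{v/x\}}\ (\beta) \]

    with typing judgments

    \[ \dfrac{}{\Gamma, x : \tau \vdash x : \tau} \qquad \dfrac{\Gamma \vdash e_0 : \tau \to \tau' \qquad \Gamma \vdash e_1 : \tau}{\Gamma \vdash e_0\ e_1 : \tau'} \qquad \dfrac{\Gamma, x : \tau \vdash e : \tau'}{\Gamma \vdash \lambda x : \tau.\, e : \tau \to \tau'} \]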
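The three typing rules read directly as a recursive checker. The following OCaml sketch assumes a single base type `TInt` with integer literals (the recap leaves base types implicit), and `type_of` is a hypothetical name.

```ocaml
type ty = TInt | TArrow of ty * ty

type expr =
  | Num of int
  | Var of string
  | Lam of string * ty * expr
  | App of expr * expr

(* type_of gamma e returns Some tau exactly when gamma |- e : tau
   is derivable with the rules above (plus Num n : TInt). *)
let rec type_of (gamma : (string * ty) list) (e : expr) : ty option =
  match e with
  | Num _ -> Some TInt
  | Var x -> List.assoc_opt x gamma
  | Lam (x, t, body) ->
    (match type_of ((x, t) :: gamma) body with
     | Some t' -> Some (TArrow (t, t'))
     | None -> None)
  | App (e0, e1) ->
    (match type_of gamma e0, type_of gamma e1 with
     | Some (TArrow (t, t')), Some t1 when t = t1 -> Some t'
     | _ -> None)
```

A term such as `App (Num 1, Num 2)` is rejected, which is the static counterpart of the stuck application it would otherwise become at run time.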

    2 Soundness

    Soundness: if we can derive a typing judgment for $e$, then $e$ shouldn't get stuck.

    Theorem 1 (Soundness).

    \[ \vdash e : \tau \ \wedge\ e \to^* e' \implies e' \in \mathrm{Val} \ \vee\ \exists e''.\ e' \to e'' \]

    To prove this, we need preservation and progress.

    2.1 Preservation

    Also known as subject reduction.

    Lemma 1 (Preservation).

    \[ \vdash e : \tau \ \wedge\ e \to e' \implies \vdash e' : \tau \]

    2.2 Progress

    Lemma 2 (Progress).

    \[ \vdash e : \tau \implies e \in \mathrm{Val} \ \vee\ \exists e'.\ e \to e' \]

    So intuitively, we're going to have a chain of reductions

    \[ e \to e_1 \to e_2 \to \cdots \to e' \]

    where each of the above has type $\tau$ by preservation, but then $e'$ will either be a value or can step by progress, which shows soundness.


    3 Proofs

    Lemma 3 (Preservation).

    \[ \vdash e : \tau \ \wedge\ e \to e' \implies \vdash e' : \tau \]

    Proof. We want to use induction on the derivation of $\vdash e : \tau$, using the immediate subderivation relation, which is trivially well-founded. We'll do a case analysis on the last rule used in the derivation of $e \to e'$.

    L: Here, $e = e_0\ e_1$, $e' = e_0'\ e_1$, and we get $e_0 \to e_0'$. The only typing rule matching $\vdash e_0\ e_1 : \tau$ is application, so we get the immediate subderivations $\vdash e_0 : \tau' \to \tau$ and $\vdash e_1 : \tau'$. Since these are subderivations of $\vdash e : \tau$, we can apply the IH to get $\vdash e_0' : \tau' \to \tau$. This, together with the second immediate subderivation of $\vdash e : \tau$, namely $\vdash e_1 : \tau'$, lets us construct the derivation that $\vdash e_0'\ e_1 : \tau$, which shows the case.

    R: Without loss of generality, same as the L case.

    $\beta$: Here, we have $e = (\lambda x : \tau'.\, e_0)\ v$ and $e' = e_0\{v/x\}$. We have no premises, and the only derivation that we can apply for $\vdash e : \tau$ is the application rule, so we have $\vdash \lambda x : \tau'.\, e_0 : \tau' \to \tau$ and $\vdash v : \tau'$. Inverting the derivation of the $\lambda$, we also get that $x : \tau' \vdash e_0 : \tau$. Next, we're going to need a substitution lemma (Lemma 4) to push this proof any further. By discharging substitution, we show the case and conclude the proof.

    Lemma 4 (Substitution).

    \[ \Gamma, x : \tau' \vdash e : \tau \ \wedge\ \vdash v : \tau' \implies \Gamma \vdash e\{v/x\} : \tau \]

    Proof. We will show this by induction on the structure of $e$ with a case analysis on $e$.

    n: Here, $e = n$, which trivially holds.

    Var 1: Here, $e = y \neq x$, so $\Gamma(y) = (\Gamma, x : \tau')(y)$, so the typing is invariant under the substitution.

    Var 2: Here, $e = x$, so $e\{v/x\} = v$. We have $\Gamma, x : \tau' \vdash x : \tau$, which gives $\tau = \tau'$, and $\vdash v : \tau'$ gives $\Gamma \vdash v : \tau'$ by weakening (show). Finally, the second premise gives $\Gamma \vdash v : \tau' = \tau$, which shows the case.

    App: $e = e_0\ e_1$, so $e\{v/x\} = e_0\{v/x\}\ e_1\{v/x\}$. We apply the induction hypothesis on the immediate subexpressions with the same $v$-premise to get $\Gamma \vdash e_0\{v/x\} : \tau_0$, and so forth.

    Abs: Here, we need to break this into a few cases.

    1. $e = \lambda x : \tau_1.\, e_2$, so $e\{v/x\} = e$, so we have the case trivially already.

    2. $e = \lambda y : \tau_1.\, e_2$ with $y \neq x$. Since we have a typing derivation of $v$ in the empty context, we also have $y \notin \mathrm{Fvs}(v)$, so $e\{v/x\} = \lambda y : \tau_1.\, e_2\{v/x\}$. Now, we have the derivation

    \[ \dfrac{\Gamma, x : \tau', y : \tau_1 \vdash e_2 : \tau_2}{\Gamma, x : \tau' \vdash e : \tau = \tau_1 \to \tau_2}\ \text{(Abs)} \]

    but by IH, we get $\Gamma, y : \tau_1 \vdash e_2\{v/x\} : \tau_2$. From this, we can construct the derivation for $\Gamma \vdash \lambda y : \tau_1.\, e_2\{v/x\} = e\{v/x\} : \tau_1 \to \tau_2 = \tau$, which shows the case and concludes the proof.

    Lemma 5 (Progress).

    \[ \vdash e : \tau \implies e \in \mathrm{Val} \ \vee\ \exists e'.\ e \to e' \]

    Proof. By structural induction on $e$ with a case analysis.

    n: Here, $e = n$, which is a value, so we're done trivially.

    x: Here, $e = x$, but there's no derivation for $\vdash x : \tau$, so the case is vacuously true.

    Abs: Here, $e = \lambda y : \tau'.\, e'$, but this is a value, so we're done trivially.

    App: Here, $e = e_0\ e_1$. Obviously, $e \notin \mathrm{Val}$, so we need to show that $\exists e'.\ e \to e'$. We have $\vdash e_0\ e_1 : \tau$, so we get $\vdash e_0 : \tau_1 \to \tau$ and $\vdash e_1 : \tau_1$, so by IH, we have that either $e_0$ is a value or $e_0 \to e_0'$, and the same for $e_1$. Let's do a case analysis here.

    If both are values, then we just apply $\beta$. (We need a normal-form lemma here, stating that a value of function type is a lambda, which we can show by inspection of the typing derivation on the structure of $v$.)

    If $e_0$ is a value, but $e_1$ is not, then we just apply the R rule.

    Otherwise, we just apply the L rule.

    This shows the case and concludes the proof.


    CS 6110: April 3

    Lee Gao

    April 3, 2013

    1 Subtyping

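    Products generalize into records and sums generalize into variants.

    Records:

    \[ \tau ::= \ldots \mid \{x_1 : \tau_1, \ldots, x_n : \tau_n\} \quad \approx\ \tau_1 \times \cdots \times \tau_n \]

    and we can construct these as

    \[ e ::= \ldots \mid \{x_1 = e_1, \ldots\} \mid e.x \]

    where the first is analogous to construction and the second to projection. We extend the normal form to contain value records. Our SOS contains

    \[ \{\bar{x} = \bar{v}\}.x_i \to v_i \]

    and we extend the typing judgment to contain the rule

    \[ \dfrac{\Gamma \vdash e_i : \tau_i \ \ (\forall i)}{\Gamma \vdash \{x_i = e_i\} : \{x_i : \tau_i\}} \]

    and for projections

    \[ \dfrac{\Gamma \vdash e : \{\bar{x} : \bar{\tau}\}}{\Gamma \vdash e.x_i : \tau_i} \]

    We can do a similar kind of trick for sums to get variants:

    \[ \tau ::= \ldots \mid [x_i : \tau_i] \]

    and

    \[ e ::= \ldots \mid \mathsf{in}_x(e) \mid \mathsf{case}\ e\ \mathsf{of}\ \mathsf{in}_{x_i}(y_i) \Rightarrow e_i \]

    so

    \[ (\mathsf{case}\ \mathsf{in}_{x_i}(v)\ \mathsf{of}\ \mathsf{in}_{\bar{x}}(\bar{y}) \Rightarrow \bar{e}) \to e_i\{v/y_i\} \]

    and we extend the typing judgment to

    \[ \dfrac{\Gamma \vdash e : \tau_i}{\Gamma \vdash \mathsf{in}_{x_i}(e) : [x_i : \tau_i]} \qquad \dfrac{\Gamma \vdash e : [x_i : \tau_i] \qquad \Gamma, y_i : \tau_i \vdash e_i : \tau \ \ (\forall i)}{\Gamma \vdash \mathsf{case}\ e\ \mathsf{of}\ \mathsf{in}_{\bar{x}}(\bar{y}) \Rightarrow \bar{e} : \tau} \]

    These are very similar to the concept of datatypes in ML.

    Suppose we want to write something like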


    class Point {
        int x, y;
        void draw(Symbol s) { /* ... */ }
    }

    this is really similar to the record type

    \[ \{x : \mathsf{int},\ y : \mathsf{int},\ \mathit{draw} : \mathsf{Symbol} \to 1\} \]

    We can also do subtyping in Java.

    class ColoredPoint extends Point {
        Color c;
    }
    Point p = new ColoredPoint();

    this is really similar to

    \[ \{x : \mathsf{int},\ y : \mathsf{int},\ \mathit{draw} : \mathsf{Symbol} \to 1,\ c : \mathsf{Color}\} \]

    but in our STLC so far, we cannot just stick a ColoredPoint instance into something expecting a Point. We can define a kind of ordering as subtyping. Define an ordering $\le$ so that ColoredPoint $\le$ Point.

    Definition 1.

    \[ \tau_1 \le \tau_2 \]

    to mean that $\tau_1$ can be used where $\tau_2$ is expected.

    1.1 Interpretations

    1. $\tau_1$ "is a" $\tau_2$, so that semantically (or denotationally), $\mathcal{T}[\![\tau_1]\!] \subseteq \mathcal{T}[\![\tau_2]\!]$.

    2. You can convert/coerce $\tau_1$ into a $\tau_2$ (an extension). We can define this coercion function $\Theta : (\tau_1 \le \tau_2) \to \tau_1 \to \tau_2$.

    Interpretation two allows us to translate a subtyping language into a simply typed language, as long as we define $\Theta$.

    1.2 (Sub)Typing

    \[ \dfrac{\Gamma \vdash e : \tau \qquad \tau \le \tau'}{\Gamma \vdash e : \tau'}\ \text{(Subsumption)} \]

    This rule is not syntax directed (like the weakening rule in Hoare logic). Furthermore, the subderivation relation is no longer well-founded.

    Now, we need to have some subtyping rules. We'll define them structurally.

    \[ \dfrac{\tau_1 \le \tau_2 \qquad \tau_2 \le \tau_3}{\tau_1 \le \tau_3}\ \text{(Trans)} \]

    so that, together with reflexivity, $\le$ is already a preorder.

    Now, let's define how these rules affect other types.


    1.3 Products

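    \[ \dfrac{\tau_1 \le \tau_1' \qquad \tau_2 \le \tau_2'}{\tau_1 \times \tau_2 \le \tau_1' \times \tau_2'}\ \text{(Prod)} \]

    Product subtyping is covariant with respect to its components.

    Another way of seeing this is that we can actually define a coercion function so that

    \[ \Theta(\tau_1 \times \tau_2 \le \tau_1' \times \tau_2') = \lambda p : \tau_1 \times \tau_2.\ (\Theta(\tau_1 \le \tau_1')(\pi_1\, p),\ \Theta(\tau_2 \le \tau_2')(\pi_2\, p)) \]

    1.4 Sums

    Sums are also covariant, so the coercion function is still covariant.

    \[ \Theta(\tau_1 + \tau_2 \le \tau_1' + \tau_2') = \lambda s : \tau_1 + \tau_2.\ \mathsf{case}\ s\ \mathsf{of}\ x_1 : \tau_1 \Rightarrow \mathsf{in}_1(\Theta(\tau_1 \le \tau_1')\ x_1) \mid \ldots \]

    1.5 Supertype

    The universal supertype is 1, so we can always say that $\tau \le 1$, with coercion

    \[ \Theta(\tau \le 1) = \lambda x : \tau.\ () \]

    1.6 Subtype

    Define 0 (or void) with no instances and corresponding to $\bot$. It is the bottom of the $\le$ relation. For example, errors or $\Omega$ have type 0.

    1.7 Functions

    Eiffel wanted to do this covariantly, but to use a function at a new argument type $\tau_1'$ we need to be able to coerce $\tau_1'$ into $\tau_1$, so the judgment is contravariant in the argument (and covariant in the result):

    \[ \dfrac{\tau_1' \le \tau_1 \qquad \tau_2 \le \tau_2'}{\tau_1 \to \tau_2 \le \tau_1' \to \tau_2'} \]

    1.8 Records

    Methods in Java are covariant only in return type. (The argument type is on the class name)

    \[ \dfrac{\tau_i \le \tau_i' \ \ (\forall i)}{\{x_1 : \tau_1, \ldots, x_n : \tau_n\} \le \{x_1 : \tau_1', \ldots, x_n : \tau_n'\}}\ \text{(depth, covariance)} \]

    \[ \dfrac{m \le n}{\{x_1 : \tau_1, \ldots, x_n : \tau_n\} \le \{x_1 : \tau_1, \ldots, x_m : \tau_m\}}\ \text{(width)} \]

    What about permutation subtyping? We could sort the field labels first; we simply assume this for our system.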


    1.9 Variants

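    \[ \dfrac{\tau_i \le \tau_i' \ \ (\forall i)}{[\bar{x} : \bar{\tau}] \le [\bar{x} : \bar{\tau}']}\ \text{(depth, covariance)} \]

    \[ \dfrac{n \le m}{[x_1 : \tau_1, \ldots, x_n : \tau_n] \le [x_1 : \tau_1, \ldots, x_m : \tau_m]}\ \text{(width)} \]

    Notice how the width rules flip between variants and records.

    1.10 References

    \[ \tau ::= \ldots \mid \tau\ \mathsf{ref} \]

    and

    \[ e ::= \ldots \mid\ !e \mid e_1 := e_2 \mid \mathsf{ref}\ e \]

    with natural rules.

    Subtyping rules: references are invariant, so for distinct $\tau_1, \tau_2$

    \[ \tau_1\ \mathsf{ref} \not\le \tau_2\ \mathsf{ref} \]

    so there is no subtyping on refs.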
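The structural subtyping rules of this section can be collected into one decision procedure. The OCaml sketch below is an illustration only: the `ty` constructors and the `subtype` function are hypothetical, variants are omitted, and record fields are looked up by label, so permutation of fields is accepted for free.

```ocaml
type ty =
  | Unit                           (* 1, the universal supertype *)
  | Void                           (* 0, the universal subtype *)
  | Int
  | Arrow of ty * ty
  | Prod of ty * ty
  | Record of (string * ty) list
  | Ref of ty

let rec subtype (s : ty) (t : ty) : bool =
  match s, t with
  | _, Unit -> true
  | Void, _ -> true
  | Int, Int -> true
  (* contravariant in the argument, covariant in the result *)
  | Arrow (s1, s2), Arrow (t1, t2) -> subtype t1 s1 && subtype s2 t2
  (* covariant in both components *)
  | Prod (s1, s2), Prod (t1, t2) -> subtype s1 t1 && subtype s2 t2
  (* width and depth: every field of t occurs in s at a subtype *)
  | Record fs, Record ft ->
    List.for_all
      (fun (x, tx) ->
         match List.assoc_opt x fs with
         | Some sx -> subtype sx tx
         | None -> false)
      ft
  (* invariant: only identical contents are related *)
  | Ref s1, Ref t1 -> s1 = t1
  | _ -> false

let point = Record ["x", Int; "y", Int]
let colored_point = Record ["x", Int; "y", Int; "c", Int]
```

With these definitions `subtype colored_point point` holds, a function accepting `point` can be used where one accepting `colored_point` is expected, and `Ref colored_point` is not a subtype of `Ref point`.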


    CS 6110: April 10

    Lee Gao

    April 10, 2013

    1 Type Inference

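    Also known as type reconstruction. The basic notion is to stick in type variables for unknown types, which gives a system of type equations. This lets us write the typing rules in a different way.

    \[ \dfrac{\Gamma, x : T_x \vdash e : \tau}{\Gamma \vdash \lambda x.\, e : T_x \to \tau}\ \text{(Lam)} \]

    and

    \[ \dfrac{\Gamma \vdash e_0 : \tau_0 \qquad \Gamma \vdash e_1 : \tau_1 \qquad \tau_0 = \tau_1 \to T_2}{\Gamma \vdash e_0\ e_1 : T_2}\ \text{(App)} \]

    so type checking gives a system of equations. The main difference here is that there's a constraint as a side-condition of each inference rule.

    Suppose that we have the term $(\lambda f.\, \lambda x.\, f\ 1)(\lambda y.\, y)$:

    \[ \dfrac{f : T_f, x : T_x \vdash f : T_f \qquad \vdash 1 : \mathsf{int} \qquad T_f = \mathsf{int} \to T_2}{f : T_f, x : T_x \vdash f\ 1 : T_2}\ \text{(App)} \]

    \[ \dfrac{\vdash \lambda f.\, \lambda x.\, f\ 1 : \tau_0 \qquad \vdash \lambda y.\, y : T_y \to T_y \qquad \tau_0 = (T_y \to T_y) \to T_1}{\vdash (\lambda f.\, \lambda x.\, f\ 1)(\lambda y.\, y) : T_1}\ \text{(App)} \]

    where $\tau_0 = T_f \to T_x \to T_2$. This generates the following constraints

    \[ T_f = \mathsf{int} \to T_2 \qquad T_f \to T_x \to T_2 = (T_y \to T_y) \to T_1 \]

    which says that by direct substitution

    \[ (\mathsf{int} \to T_2) \to T_x \to T_2 = (T_y \to T_y) \to T_1 \]

    so here, $(T_y \to T_y) = (\mathsf{int} \to T_2) \implies T_y = T_2 = \mathsf{int}$ and $T_1 = T_x \to \mathsf{int}$. What this says is that $T_x$ is unconstrained.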


    1.1 Unification

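    This idea dates back to 1965 (Robinson). In general, we have two type expressions which stand at equivalence: $\tau_1 = \tau_2$. Our goal is to find the weakest substitution $S$ such that $S(\tau_1) = S(\tau_2)$.

    Suppose we have

    (Figure: two type-expression trees to be unified; the recoverable labels are T1, T1, bool on one side and +, T3, int, T2 on the other.)

    So we can say that $S_2 \le S_1$ ($S_2$ is weaker) if there exists some non-trivial $S_3$ such that $S_1 = S_3 \circ S_2$.

    Suppose we have a set of equations

    \[ E = \{\tau_1 = \tau_1',\ \tau_2 = \tau_2',\ \ldots\} \]

    and we're trying to find the weakest/laziest substitution. Define unify(E) as

    \[ \mathrm{unify}(\emptyset) = \emptyset \]
    \[ \mathrm{unify}(B = B, E) = \mathrm{unify}(E) \]
    \[ \mathrm{unify}(T = T, E) = \mathrm{unify}(E) \]
    \[ \mathrm{unify}(\tau_1 \to \tau_2 = \tau_3 \to \tau_4, E) = \mathrm{unify}(\tau_1 = \tau_3, \tau_2 = \tau_4, E) \]
    \[ \mathrm{unify}(T = \tau, E) = \mathrm{unify}(E\{\tau/T\}) \circ (T \mapsto \tau) \]

    In the last case it had better be the case that $T \notin \mathrm{Fvs}(\tau)$.

    What is the ordering relation on $E$ such that $E_1 \prec E_2$ is well-founded? Note that in the substitution case, the number of unsolved variables goes down, whereas in the rest, the size goes down. So we'll say $E_1 \prec E_2$ if $\mathrm{Fvs}(E_1) \subset \mathrm{Fvs}(E_2)$, or if they have the same set of free variables and the size of $E_1$ is strictly less than that of $E_2$.

    So how do we use this algorithm?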
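The unify equations translate almost line by line into OCaml. This is a minimal sketch under a few assumptions: types have only base types, variables, and arrows; a substitution is represented as a hypothetical list of bindings in the order they were solved; and failure (clashing base types or a failed occurs check) is reported as `None`.

```ocaml
type ty =
  | Base of string            (* base types B, such as int or bool *)
  | TVar of string            (* type variables T *)
  | Arrow of ty * ty

(* apply_one x t u computes u{t/x}. *)
let rec apply_one (x : string) (t : ty) (u : ty) : ty =
  match u with
  | Base _ -> u
  | TVar y -> if y = x then t else u
  | Arrow (a, b) -> Arrow (apply_one x t a, apply_one x t b)

(* occurs x t checks whether x is among the free variables of t. *)
let rec occurs (x : string) (t : ty) : bool =
  match t with
  | Base _ -> false
  | TVar y -> x = y
  | Arrow (a, b) -> occurs x a || occurs x b

(* unify returns the solved bindings, earliest first, or None. *)
let rec unify (eqs : (ty * ty) list) : (string * ty) list option =
  match eqs with
  | [] -> Some []
  | (Base a, Base b) :: rest -> if a = b then unify rest else None
  | (TVar x, TVar y) :: rest when x = y -> unify rest
  | (Arrow (a, b), Arrow (c, d)) :: rest -> unify ((a, c) :: (b, d) :: rest)
  | (TVar x, t) :: rest | (t, TVar x) :: rest ->
    if occurs x t then None
    else
      let rest' =
        List.map (fun (l, r) -> (apply_one x t l, apply_one x t r)) rest in
      (match unify rest' with
       | Some s -> Some ((x, t) :: s)
       | None -> None)
  | _ -> None

(* Apply a solved substitution to a type, earliest binding first. *)
let apply (s : (string * ty) list) (t : ty) : ty =
  List.fold_left (fun acc (x, u) -> apply_one x u acc) t s

(* The constraints generated for (fun f -> fun x -> f 1) (fun y -> y). *)
let example_eqs =
  let int = Base "int" in
  [ (TVar "Tf", Arrow (int, TVar "T2"));
    (Arrow (TVar "Tf", Arrow (TVar "Tx", TVar "T2")),
     Arrow (Arrow (TVar "Ty", TVar "Ty"), TVar "T1")) ]
```

Solving `example_eqs` and applying the result to `TVar "T1"` gives `Arrow (TVar "Tx", Base "int")`, matching the hand derivation in which $T_x$ stays unconstrained.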

    1.2 Hindley-Milner Algorithm W

    Here, we just apply constraints to unify as they are encountered.

    So, if we just ask for the type of

    \[ \dfrac{}{\vdash \lambda x.\, x : T_x \to T_x}\ \text{(Lam)} \]

    but now this is polymorphic, so $T_x = \alpha$, or rather there's a type schema such that

    \[ \lambda x.\, x : \forall \alpha.\ \alpha \to \alpha \]

    Furthermore, how quick is this algorithm? It is often assumed to be fast; however, that is wrong: in the worst case it is doubly exponential.

    1.2.1 What every PLT should know

    So let $f_0(x) = x$ and $f_1(x) = \mathsf{if}\ b\ \mathsf{then}\ f_0\ \mathsf{else}\ x\ f_0$, and so on, so

    \[ f_n(x) = \mathsf{if}\ b\ \mathsf{then}\ f_{n-1}\ \mathsf{else}\ x\ f_{n-1} \]

    So here, $f_0 : \alpha \to \alpha$, and $f_1 : T_0 \to T_0$, so the types may be expanding exponentially, and the substitution step will take some time going over these possibly exponential types.

    But if we don't need to output the types, we can do this in polynomial time. Idea: types are directed acyclic graphs.


    CS 6110: April 22 Abstract Interpretation

    Lee Gao

    April 22, 2013

    1 Monads

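    We have

    \[ D \mapsto M(D) \]

    with the operations

    unit ($[\cdot]$, return) : $D \to M(D)$
    bind ($\cdot^*$, >>=) : $(D \to M(E)) \to (M(D) \to M(E))$

    With semantics

    \[ \mathcal{C}[\![\mathsf{skip}]\!] = [\cdot] : \Sigma \to M(\Sigma) = \lambda \sigma.\, [\sigma] \]
    \[ \mathcal{C}[\![x := a]\!] = \lambda \sigma.\, [\sigma[x \mapsto \mathcal{A}[\![a]\!]\sigma]] \]
    \[ \mathcal{C}[\![\mathsf{if}\ b\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2]\!] = \lambda \sigma.\ \mathsf{if}\ \mathcal{B}[\![b]\!]\sigma\ \mathsf{then}\ \mathcal{C}[\![c_1]\!]\sigma\ \mathsf{else}\ \mathcal{C}[\![c_2]\!]\sigma \]
    \[ \mathcal{C}[\![c_1; c_2]\!] = \lambda \sigma.\ \mathcal{C}[\![c_2]\!]^*(\mathcal{C}[\![c_1]\!]\sigma) \]
    \[ \mathcal{C}[\![\mathsf{while}\ b\ \mathsf{do}\ c]\!] = \mathrm{fix}\ \lambda w : \Sigma \to M(\Sigma).\ \lambda \sigma.\ \mathsf{if}\ \mathcal{B}[\![b]\!]\sigma\ \mathsf{then}\ w^*(\mathcal{C}[\![c]\!]\sigma)\ \mathsf{else}\ [\sigma] \]

    Monads capture abstract computational effects. $(\cdot)_\bot$ is the strictness (lifting) monad, which handles divergence. We define

    \[ [\sigma] = \lfloor \sigma \rfloor \]
    \[ f^*(\bot) = \bot \]
    \[ f^*(\lfloor \sigma \rfloor) = f(\sigma) \]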
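The lifting monad has a direct counterpart in OCaml's `option` type, with `None` standing for bottom. The sketch below uses hypothetical names (`unit`, `star`, `seq`) and a toy state consisting of a single integer, so it illustrates only the shape of the unit and bind operations, not the full command semantics.

```ocaml
(* unit, written [.] in the notes: inject a value into the lifted domain. *)
let unit (x : 'a) : 'a option = Some x

(* star f, written f* in the notes: extend f strictly to lifted inputs. *)
let star (f : 'a -> 'b option) : 'a option -> 'b option = function
  | None -> None
  | Some x -> f x

(* A toy state: just the current value of one variable. *)
type state = int

let skip : state -> state option = unit
let incr_x : state -> state option = fun s -> Some (s + 1)
let diverge : state -> state option = fun _ -> None

(* Sequencing c1; c2 runs c1, then feeds its lifted result to c2 via star. *)
let seq (c1 : state -> state option) (c2 : state -> state option)
  : state -> state option =
  fun s -> star c2 (c1 s)
```

Here `seq diverge incr_x` returns `None` on every state: once the first command diverges, strictness propagates bottom through the rest of the program.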

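    The CPS monad:

    \[ M(\Sigma) = (\Sigma \to \mathit{ans}) \to \mathit{ans} \]

    so we will build the monad as

    \[ [\sigma] = \lambda k \in \Sigma \to \mathit{ans}.\ k\ \sigma \]

    and binding

    \[ f^*(m) : M(\Sigma) = (\Sigma \to \mathit{ans}) \to \mathit{ans} \]

    with $m \in (\Sigma \to \mathit{ans}) \to \mathit{ans}$. We want to get out an $M(\Sigma)$, so $f^* : M(\Sigma) \to M(\Sigma)$.

    So $M(\Sigma)$ is the domain of computations that expect a continuation, and we want

    \[ f^*(m) = \lambda k \in \Sigma \to \mathit{ans}.\ m\ (\lambda \sigma.\ f\ \sigma\ k) \]

    We can now apply the lift monad onto this to get the correct semantics.

    \[ \mathcal{C}[\![\mathsf{skip}]\!] = [\cdot] = \lambda \sigma.\, \lambda k.\ k\ \sigma \]
    \[ \mathcal{C}[\![c_1; c_2]\!]\sigma = \mathcal{C}[\![c_2]\!]^*(\mathcal{C}[\![c_1]\!]\sigma) = \lambda k.\ (\mathcal{C}[\![c_1]\!]\sigma)\ (\lambda \sigma'.\ \mathcal{C}[\![c_2]\!]\ \sigma'\ k) \]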

    2 Abstract Interpretation

    We want to lift computation on $\Sigma$ onto a coarser domain $M$ on which we can do analysis. Then, we're going to run the program on $M$ rather than on $\Sigma$, and we will tweak the semantics to only deal with $M$.

    Suppose that $|M| < |\Sigma|$ and distinct $\sigma_1, \sigma_2$ both map to the same $m \in M$; then we might need to join their results.

    Let

    \[ \alpha(\sigma) = m \in M \]

    and since we need a join $\sqcup$, we need $M$ to be an upper semilattice with a join operation $m_1 \sqcup m_2$.

    This turns out to not be enough: we also want our analysis to terminate.

    We will use a widening operator $\nabla$, so the idea is that

    \[ m_1 \sqcup m_2 \sqsubseteq m_1 \nabla m_2 \]

    and furthermore, the chain

    \[ m_1 \sqsubseteq m_1 \nabla m_2 \sqsubseteq m_1 \nabla m_2 \nabla m_3 \sqsubseteq \cdots \]

    stabilizes at some finite n (converges).

    Now, if $M$ has finite height, we can just use the join as our widening operator. The top element $\top$ is the join of everything: the program can be in any state, and we don't know which. Lower is more precise.

    2.1 Modeling this as monad

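    How do we lift from a state to this monadic domain?

    \[ M(\Sigma) = M \]

    and

    \[ [\sigma] = \alpha(\sigma) \]

    and for $f : \Sigma \to M$ we get $f^* : M \to M$ with

    \[ f^*(m) = \bigsqcup \{ f(\sigma) \mid \alpha(\sigma) = m \} \]

    Then

    \[ \mathcal{C}[\![c]\!] : \Sigma \to M \]

    but we don't want to work with states, so we can star it

    \[ \mathcal{C}[\![c]\!]^* : M \to M \]

    This is exactly what we want; it is called an abstract interpretation. But this is going to be very hard to compute. So instead, we're going to define another translation called $\mathcal{C}^\sharp[\![c]\!]$ which approximates $\mathcal{C}[\![c]\!]^*$, in other words

    \[ \mathcal{C}[\![c]\!]^* \sqsubseteq \mathcal{C}^\sharp[\![c]\!] \]

    (recall that higher up = less precise)

    Imagine that we want to run programs, but we only care about the negativeness/nonnegativeness of x.

    (Figure: the lattice of abstract values for the sign of x; the recoverable label is "<".)

    Define the translation

    \[ \mathcal{C}^\sharp[\![\mathsf{skip}]\!]\, m = ([\cdot])^*\, m = \mathrm{id}\ m = m \]

    because $[\cdot]$ already has codomain $M$, and the monad laws

    Theorem 1 (Final Unit). $[\cdot]^* = \mathrm{id}$

    Theorem 2 (Initial Unit). $f(\sigma) = f^*([\sigma])$

    Theorem 3 (Associativity). $(f_1^* \circ f_2)^* = f_1^* \circ f_2^*$

    tell us that $[\cdot]^* = \mathrm{id}$.

    Now, let's look at

    \[ \mathcal{C}[\![x := a]\!] = \lambda \sigma.\, [\sigma[x \mapsto \mathcal{A}[\![a]\!]\sigma]] \]

    then taking $\mathcal{C}^\sharp[\![x := a]\!]$ and applying it to top gives

    \[ \mathcal{C}^\sharp[\![x := a]\!]\,\top = \begin{cases} < & \text{if } a < 0 \\ \ge & \text{if } a \ge 0 \\ \top & \text{otherwise} \end{cases} \]

    and for an assignment to a different variable $y \neq x$

    \[ \mathcal{C}^\sharp[\![y := a]\!]\, m = m \]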

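    and for sequencing

    \[ \mathcal{C}[\![c_1; c_2]\!]^* = \mathcal{C}[\![c_2]\!]^* \circ \mathcal{C}[\![c_1]\!]^* \]

    but we do not want to bake associativity into the overapproximation, so it is just defined as

    \[ \mathcal{C}^\sharp[\![c_1; c_2]\!] = \mathcal{C}^\sharp[\![c_2]\!] \circ \mathcal{C}^\sharp[\![c_1]\!] \]

    and for if

    \[ \mathcal{C}^\sharp[\![\mathsf{if}\ b\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2]\!] = \ldots \]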
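A small OCaml sketch of the sign analysis may help fix ideas. Everything here is an assumption made for illustration: the three-element `sign` type, the abstraction function `alpha`, assignments restricted to integer literals, and in particular the treatment of the conditional as a join of both branches, which the notes do not spell out.

```ocaml
(* Abstract values for the sign of x: Neg is "<", NonNeg is ">=",
   and Top means the sign is unknown. *)
type sign = Neg | NonNeg | Top

(* alpha: abstract a concrete value of x. *)
let alpha (n : int) : sign = if n < 0 then Neg else NonNeg

(* Join: equal elements stay put, different ones go to Top. *)
let join (a : sign) (b : sign) : sign = if a = b then a else Top

(* Abstract x := n for an integer literal n: the old sign is irrelevant. *)
let assign_x_const (n : int) (_m : sign) : sign = alpha n

(* Abstract y := a for some other variable y: the sign of x is unchanged. *)
let assign_other (m : sign) : sign = m

(* Abstract sequencing: run c1, then c2. *)
let seq (c1 : sign -> sign) (c2 : sign -> sign) (m : sign) : sign = c2 (c1 m)

(* Assumed abstract conditional: the test is not tracked, so join both arms. *)
let if_abs (c1 : sign -> sign) (c2 : sign -> sign) (m : sign) : sign =
  join (c1 m) (c2 m)
```

Joining the branches loses precision whenever they disagree, which is the "higher up = less precise" trade-off: `if_abs (assign_x_const 1) (assign_x_const (-1))` can only report `Top`.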


    CS 6110: April 24

    Lee Gao

    April 24, 2013

    1 Object Encodings

    Objects are self-referential and have special features like inheritance. We can translate them into records.

    Consider the Java code

    class intset {
        int value;
        intset left, right;
        intset union(intset s) { ... }
        boolean contains(int n) {
            if (n == this.value) {
                return true;
            } else if (n < this.value) {
                return this.left != null && this.left.contains(n);
            } else {
                return this.right != null && this.right.contains(n);
            }
        }
    }

    This is obviously a recursive type, so we can model objects as records over some $\mu$.

    What about this? It turns out that this could reference a subclass or a superclass. We need to do something special to allow this open recursion.

    \[ \mathit{intset} \triangleq \mu \alpha.\ (\{\mathit{value} : \mathsf{int},\ \mathit{left} : \alpha,\ \mathit{right} : \alpha,\ \mathit{union} : \alpha \to \alpha,\ \mathit{contains} : \mathsf{int} \to \mathsf{bool}\} + 1) \]

    where we define null = inr ().

    For example, if we want to build a non-null record, then

    \[ \mathsf{fold}\ (\mathsf{rec}\ \mathit{this} : \mathit{intset}.\ \mathsf{inl}_{\mathit{intset}}\ \{v = 0,\ \mathit{left} = \mathsf{inr}\ (),\ \mathit{right} = \ldots,\ \ldots\}) \]

    with

    \[ \mathit{union} = \lambda s : \mathit{intset}.\ \ldots \]

    and

    \[ \mathit{contains} = \lambda n : \mathsf{int}.\ \mathsf{if}\ n = \mathit{this}.v\ \mathsf{then}\ \mathsf{True}\ \mathsf{else\ if}\ n < \mathit{this}.v\ \mathsf{then}\ \mathsf{case}\ \mathit{this}.l\ \mathsf{of}\ s : \mathit{intset} \Rightarrow (\ldots \]


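    We have not yet added the concept of encapsulation (protecting fields). We want to extend this encoding to hide the field members.

    \[ \mathit{intset} \triangleq \mu \alpha.\ (\{\mathit{private} : \{\mathit{value} : \mathsf{int},\ \mathit{left} : \alpha\},\ \mathit{right} : \alpha,\ \mathit{union} : \alpha \to \alpha,\ \mathit{contains} : \mathsf{int} \to \mathsf{bool}\} + 1) \]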

    2 Existential Types

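    Let $\beta$ denote the record that holds the private information.

    \[ e ::= \ldots \qquad \tau ::= \ldots \mid \exists \beta.\, \tau \mid \beta \qquad E ::= \ldots \]

    so the intset example now becomes

    \[ \mathit{intset} \triangleq \mu \alpha.\, \exists \beta.\ (\{\mathit{private} : \beta,\ \mathit{right} : \alpha,\ \mathit{union} : \alpha \to \alpha,\ \mathit{contains} : \mathsf{int} \to \mathsf{bool}\} + 1) \]

    Another interpretation of existentials, the operational one:

    2.1 Operational Intuition

    An existential type is just a pair of the type $\tau$ that we're hiding and the rest of the value, whose type is obtained by substituting the hidden type for $\beta$. So think of them as

    \[ [\tau, v] : \exists \beta.\, \tau' \]

    so that

    \[ v : \tau'\{\tau/\beta\} \]

    Note that $v$ is an instance of this existential; the $\beta$ itself isn't filled in.

    New expressions

    \[ e ::= \ldots \mid [\tau, e]_{\exists \beta. \tau'} \mid \mathsf{let}\ [\beta, x] = e\ \mathsf{in}\ e' \]

    where the elimination form basically takes out the type and the value. Other people will just call these pack and unpack.

    Evaluation context

    \[ E ::= \ldots \mid [\tau, E] \mid \mathsf{let}\ [\beta, x] = E\ \mathsf{in}\ e \]

    Typing rules:

    \[ \dfrac{\Delta; \Gamma \vdash e : \tau'\{\tau/\beta\} \qquad \Delta \vdash \exists \beta.\, \tau'}{\Delta; \Gamma \vdash [\tau, e]_{\exists \beta. \tau'} : \exists \beta.\, \tau'}\ \text{(TPack)} \]

    \[ \dfrac{\Delta; \Gamma \vdash e : \exists \beta.\, \tau \qquad \Delta, \beta; \Gamma, x : \tau \vdash e' : \tau' \qquad \Delta \vdash \tau'}{\Delta; \Gamma \vdash \mathsf{let}\ [\beta, x] = e\ \mathsf{in}\ e' : \tau'}\ \text{(TUnpack)} \]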


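    Operational Semantics

    \[ \mathsf{let}\ [\beta, x] = [\tau, v]_{\exists \beta. \tau'}\ \mathsf{in}\ e \;\to\; e\{\tau/\beta, v/x\} \quad \text{(Pack)} \]

    2.2 Logical Correspondence

    In the Curry-Howard correspondence, we get a 1-1 correspondence to the existentials.

    \[ \dfrac{\Gamma \vdash \varphi\{A/x\} \qquad A : \mathrm{Prop}}{\Gamma \vdash \exists x.\, \varphi}\ \text{(TPack)} \]

    \[ \dfrac{\Gamma \vdash \exists x.\, \varphi \qquad \Gamma, x : \mathrm{Prop}, \varphi \vdash \psi}{\Gamma \vdash \psi}\ \text{(TUnpack)} \]

    2.3 Examples

    let p1 = [int, (5, λn : int. n = 1)] : ∃α. α × (α → bool) in
    let [α, x] = p1 in (right x) (left x)

    let p2 = [bool, (true, λb : bool. b)] : ∃α. α × (α → bool) in
    let [α, x] = p2 in (right x) (left x)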
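The two examples can be reproduced in OCaml, where one way to encode an existential is a GADT constructor that hides a type variable. The type name `packed`, the constructor `Pack`, and the function `use` are hypothetical names for this sketch.

```ocaml
(* exists a. a * (a -> bool): the type 'a is hidden by the constructor. *)
type packed = Pack : 'a * ('a -> bool) -> packed

let p1 : packed = Pack (5, fun n -> n = 1)    (* hidden type: int *)
let p2 : packed = Pack (true, fun b -> b)     (* hidden type: bool *)

(* Unpack: a client cannot inspect x, only apply the packaged
   function to the packaged value. *)
let use (p : packed) : bool =
  match p with
  | Pack (x, f) -> f x
```

Both packages have the same type even though they hide different representations, so `use p1` and `use p2` type-check identically while clients never learn whether the hidden type is int or bool.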