
Page 1: Higher-Order Probabilistic Programming

Higher-Order Probabilistic Programming
A Tutorial at POPL 2019

Part I

Ugo Dal Lago
(Based on joint work with Flavien Breuvart, Raphaëlle Crubillé, Charles Grellois, Davide Sangiorgi, ...)

POPL 2019, Lisbon, January 14th

Page 2: Higher-Order Probabilistic Programming

Probabilistic Models

▸ The environment is supposed not to behave deterministically, but probabilistically.
▸ Crucial when modeling uncertainty.
▸ Useful to handle complex domains.
▸ Example:

[Figure: a labelled Markov chain over four states q0, q1, q2, q3, with transition probabilities 1/4, 3/4, 1, 1/2, 1/2, 1/3, 2/3.]

▸ Abstractions:
  ▸ (Labelled) Markov Chains.
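As an illustration of how such a chain can be executed, here is a small Python sketch. The slide's figure fixes the states q0–q3 and the probabilities 1/4, 3/4, 1, 1/2, 1/2, 1/3, 2/3, but the exact wiring below is a hypothetical assignment, not recovered from the transcript.

```python
import random

# Hypothetical wiring of the four-state chain: which edge carries
# which probability is not recoverable from the transcript.
CHAIN = {
    "q0": [("q1", 3 / 4), ("q2", 1 / 4)],
    "q1": [("q3", 1.0)],
    "q2": [("q0", 1 / 2), ("q3", 1 / 2)],
    "q3": [("q1", 1 / 3), ("q2", 2 / 3)],
}

def step(state, rng):
    """Sample the successor of `state` according to its outgoing distribution."""
    r, acc = rng.random(), 0.0
    for nxt, p in CHAIN[state]:
        acc += p
        if r < acc:
            return nxt
    return CHAIN[state][-1][0]  # guard against floating-point rounding

rng = random.Random(1)
path = ["q0"]
for _ in range(5):
    path.append(step(path[-1], rng))
```

Running the chain repeatedly from q0 produces random paths whose statistics are governed by the transition probabilities; this is exactly the abstraction the slide names.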

Page 7: Higher-Order Probabilistic Programming

Probabilistic Models

ROBOTICS

Page 8: Higher-Order Probabilistic Programming

Probabilistic Models

ARTIFICIAL INTELLIGENCE

Page 9: Higher-Order Probabilistic Programming

Probabilistic Models

NATURAL LANGUAGE PROCESSING

Page 10: Higher-Order Probabilistic Programming

Randomized Computation

▸ Algorithms and automata are assumed to have the ability to sample from a distribution [dLMSS1956, R1963].
▸ This is a powerful tool when solving computational problems.
▸ Example:
▸ Abstractions:
  ▸ Randomized algorithms;
  ▸ Probabilistic Turing machines;
  ▸ Labelled Markov chains.

Page 15: Higher-Order Probabilistic Programming

Randomized Computation

[Figure: a probabilistic Turing machine. In state s, reading 0 on the tape ⋯ 0 1 1 1 0 ⋯, the machine samples its next move from

  δ(s, 0) = {(q, 1, ←)^(1/2), (r, 0, →)^(1/2)}

moving with probability 1/2 to state q (after writing 1 and moving left, tape ⋯ 1 1 1 1 0 ⋯) and with probability 1/2 to state r (after writing 0 and moving right, tape ⋯ 0 1 1 1 0 ⋯).]

Page 18: Higher-Order Probabilistic Programming

Randomized Computation

ALGORITHMICS

Page 19: Higher-Order Probabilistic Programming

Randomized Computation

CRYPTOGRAPHY

Page 20: Higher-Order Probabilistic Programming

Randomized Computation

PROGRAM VERIFICATION

Page 21: Higher-Order Probabilistic Programming

What Algorithms Compute

▸ Deterministic Computation
  ▸ For every input x, there is at most one output y any algorithm A produces when fed with x.
  ▸ As a consequence:

      A ⇝ ⟦A⟧ : ℕ ⇀ ℕ.

▸ Randomized Computation
  ▸ For every input x, any algorithm A outputs y with a probability 0 ≤ p ≤ 1.
  ▸ As a consequence:

      A ⇝ ⟦A⟧ : ℕ → D(ℕ).

  ▸ The distribution ⟦A⟧(n) sums to anything between 0 and 1, thus accounting for the probability of divergence.
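To make the subdistribution reading concrete, here is a small Python sketch (not from the slides): the truncated semantics of a coin-flipping program, where the missing probability mass is exactly the mass of runs that have not yet terminated.

```python
from fractions import Fraction

def truncated_semantics(max_flips):
    """Truncated meaning of the program 'flip a fair coin until it
    lands heads; output the number of tails seen'. Exploring only
    runs of at most max_flips flips yields a SUBdistribution over
    outputs: the probability of output k tails is (1/2)^(k+1), and
    the missing mass belongs to runs that are still flipping."""
    return {tails: Fraction(1, 2) ** (tails + 1) for tails in range(max_flips)}

d = truncated_semantics(3)   # {0: 1/2, 1: 1/4, 2: 1/8}
mass = sum(d.values())       # 7/8 — mass 1/8 is unaccounted for
```

As max_flips grows, the mass tends to 1 here; for a genuinely diverging program, some mass would be missing in the limit as well, which is what ⟦A⟧(n) summing to less than 1 captures.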

Page 23: Higher-Order Probabilistic Programming

Higher-Order Computation

▸ Mainly useful in programming.
▸ Functions are first-class citizens:
  ▸ They can be passed as arguments;
  ▸ They can be obtained as results.
▸ Motivations:
  ▸ Modularity;
  ▸ Code reuse;
  ▸ Conciseness.
▸ Example:
▸ Models:
  ▸ λ-calculus.
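The slide's example is not in the transcript; a minimal Python illustration of functions as first-class citizens (both consumed and produced by other functions) might be:

```python
def twice(f):
    """Take a function f and return a NEW function that applies f twice:
    f is passed as an argument, and a function is obtained as a result."""
    return lambda x: f(f(x))

add_two = twice(lambda n: n + 1)  # a function built from a function
add_two(3)  # -> 5
```

This is precisely the λ-calculus term λf. λx. f (f x) in Python clothing.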

Page 29: Higher-Order Probabilistic Programming

Higher-Order Computation

FUNCTIONAL PROGRAMMING

Page 30: Higher-Order Probabilistic Programming

Higher-Order Computation

FUNCTIONAL DATA STRUCTURES

Page 31: Higher-Order Probabilistic Programming

Higher-Order Computation

λ-CALCULUS

Page 32: Higher-Order Probabilistic Programming

Higher-Order Probabilistic Computation?

Does it Make Sense?

What Kind of Metatheory Does it Have?

Interesting Research Problems?

Page 35: Higher-Order Probabilistic Programming

This Tutorial

1. Motivating Examples.
2. A λ-Calculus Foundation.
3. Operational and Denotational Semantics.
4. Termination and Resource Analysis.

A webpage: http://www.cs.unibo.it/~dallago/HOPP/

Page 37: Higher-Order Probabilistic Programming

MergeSort (1)

Page 38: Higher-Order Probabilistic Programming

MergeSort (2)

Page 39: Higher-Order Probabilistic Programming

The Structure of MergeSort

[Figure: the recursion tree of merge sort: each call splits its input and invokes merge sort on the two halves.]

Page 40: Higher-Order Probabilistic Programming

MergeSort, HO
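The code on this slide did not survive extraction. A plausible higher-order merge sort in Python, parameterized by the comparison function, is sketched below; this is an assumption about the slide's content, not a transcription of it.

```python
def merge_sort(leq, xs):
    """Merge sort parameterized by a comparison leq: the sort is a
    higher-order function, reusable with any ordering."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = merge_sort(leq, xs[:mid])
    right = merge_sort(leq, xs[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):  # merge the sorted halves
        if leq(left[i], right[j]):
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

merge_sort(lambda a, b: a <= b, [3, 1, 2])  # -> [1, 2, 3]
```

Swapping in `lambda a, b: a >= b` yields a descending sort with no change to the algorithm, which is the modularity and code-reuse point of the preceding slides.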

Page 41: Higher-Order Probabilistic Programming

QuickSort

Page 42: Higher-Order Probabilistic Programming

The Structure of QuickSort

[Figure: the recursion tree of quick sort: each call partitions its input and invokes quick sort on the two parts.]

Page 43: Higher-Order Probabilistic Programming

QuickSort, HO

Page 44: Higher-Order Probabilistic Programming

Randomized QuickSort (1)

Page 45: Higher-Order Probabilistic Programming

Randomized QuickSort (2)
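The code on these two slides is likewise missing from the transcript. A plausible randomized quicksort, choosing the pivot uniformly at random so that the expected number of comparisons is O(n log n) on every input:

```python
import random

def rquicksort(xs, rng=random):
    """Quicksort with a uniformly random pivot: the run is a
    probabilistic computation, but the OUTPUT is the sorted list
    with probability 1 — only the running time is random."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[rng.randrange(len(xs))]
    return (rquicksort([x for x in xs if x < pivot], rng)
            + [x for x in xs if x == pivot]
            + rquicksort([x for x in xs if x > pivot], rng))

rquicksort([5, 3, 1, 4, 2])  # -> [1, 2, 3, 4, 5]
```

This is a canonical Las Vegas algorithm: randomness affects efficiency, never correctness.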

Page 46: Higher-Order Probabilistic Programming

Random Walk

[Figure: a random walk on the states 0, 1, 2, …, n: at each step, the walk moves in one direction with probability p and in the other with probability 1 − p.]

Page 47: Higher-Order Probabilistic Programming

Two Kinds of Random Walks
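This slide's content did not survive extraction; presumably it contrasts walks that terminate almost surely with walks that need not. As an illustration (an assumption, not from the slides), a biased walk on the natural numbers, absorbed at 0, terminates with probability 1 whenever the bias toward 0 exceeds 1/2:

```python
import random

def walk_steps(start, p, rng):
    """From n > 0, step to n - 1 with probability p and to n + 1
    otherwise; stop on reaching 0. For p > 1/2 the walk terminates
    with probability 1 (expected steps ~ start / (2p - 1));
    for p <= 1/2 termination is no longer almost sure (p < 1/2)
    or has infinite expected time (p = 1/2)."""
    n, steps = start, 0
    while n > 0:
        n += -1 if rng.random() < p else 1
        steps += 1
    return steps

walk_steps(10, 2 / 3, random.Random(42))
```

Since every step changes the position by ±1, the number of steps from 10 down to 0 is always even and at least 10.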

Page 48: Higher-Order Probabilistic Programming

The Coupon Collector

▸ A brand distributes a large quantity of coupons, each labelled with i ∈ {1, …, n}.
▸ Every day, you collect one coupon at a local store. Any label i has probability 1/n to occur.
▸ You win a prize when you collect a set of n coupons, each with a distinct label.
▸ Example: if n = 5, you could get the following coupons:

  3, 1, 5, 2, 3, 1, 5, 2, 3, 1, 5, 2, …

▸ Are you guaranteed to win the prize with probability 1? After how many days, on average?

Page 53: Higher-Order Probabilistic Programming

The Coupon Collector
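The slide's program is not in the transcript; a Monte Carlo sketch in Python (an assumption about what the slide shows) suggests the answers: the collector wins with probability 1, after n·H_n days on average — about 11.42 days for n = 5.

```python
import random

def days_to_collect(n, rng):
    """Sample the number of days until all n coupon labels have
    been seen, drawing one uniformly random label per day."""
    seen, days = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        days += 1
    return days

rng = random.Random(0)
mean = sum(days_to_collect(5, rng) for _ in range(10_000)) / 10_000
# Exact expectation: n * H_n = 5 * (1 + 1/2 + 1/3 + 1/4 + 1/5) ≈ 11.42.
```

Note that although termination happens with probability 1, no finite number of days guarantees the prize — the same "almost sure but not certain" phenomenon as in the random walks.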

Page 54: Higher-Order Probabilistic Programming

Beyond Randomized Algorithms: The Grass Problem

▸ You get up in the morning, and notice that the grass in your garden is wet.
▸ You know that there is a 30% probability that it rained during the night.
▸ The sprinkler is activated once every two days.
▸ You further know that:
  ▸ If it rained, there is a 90% probability that the grass is wet the following morning.
  ▸ If the sprinkler was activated, there is an 80% probability that the grass is wet the following morning.
  ▸ In any other case, there is anyway a 10% probability that the grass is wet.
▸ In other words, you are in a situation like the following:

      rain     sprinkler
          ↘   ↙
        grass is wet

▸ Now: how likely is it that it rained?

Page 59: Higher-Order Probabilistic Programming

The Grass Model
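The model's code did not survive extraction. Below is a sketch of exact inference by enumeration in Python, under a noisy-OR reading of the three "wet" probabilities; that reading is an assumption, since the slide does not say how simultaneous causes combine.

```python
from itertools import product

def p_rain_given_wet():
    """Exact inference by enumeration for the grass model.
    Noisy-OR assumption: each active cause independently fails to
    wet the grass (rain wets with prob 0.9, sprinkler with 0.8),
    and a base 'other cause' wets with prob 0.1 regardless."""
    joint_rain_wet, joint_wet = 0.0, 0.0
    for rain, sprinkler in product([True, False], repeat=2):
        prior = (0.3 if rain else 0.7) * 0.5   # sprinkler: every other day
        miss = ((0.1 if rain else 1.0)          # rain fails to wet
                * (0.2 if sprinkler else 1.0)   # sprinkler fails to wet
                * 0.9)                          # base cause fails to wet
        p_wet = 1.0 - miss
        joint_wet += prior * p_wet
        if rain:
            joint_rain_wet += prior * p_wet
    return joint_rain_wet / joint_wet           # P(rain | wet), Bayes' rule

p_rain_given_wet()  # ≈ 0.47 under these assumptions
```

Observing wet grass raises the probability of rain from the prior 30% to roughly 47%: conditioning on evidence is exactly what distinguishes probabilistic programming from plain randomized computation.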

Page 60: Higher-Order Probabilistic Programming

Thank You!

Questions?