![Page 1: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/1.jpg)
Turning Probabilistic Reasoning into Programming
Avi Pfeffer, Harvard University
![Page 2: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/2.jpg)
Uncertainty
Uncertainty is ubiquitous
  Partial information
  Noisy sensors
  Non-deterministic actions
  Exogenous events
Reasoning under uncertainty is a central challenge for building intelligent systems
![Page 3: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/3.jpg)
Probability
Probability provides a mathematically sound basis for dealing with uncertainty
Combined with utilities, it provides a basis for decision-making under uncertainty
![Page 4: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/4.jpg)
Probabilistic Reasoning
Representation: creating a probabilistic model of the world
Inference: conditioning the model on observations and computing probabilities of interest
Learning: estimating the model from training data
![Page 5: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/5.jpg)
The Challenge
How do we build probabilistic models of large, complex systems that
  are easy to construct and understand
  support efficient inference
  can be learned from data
![Page 6: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/6.jpg)
(The Programming Challenge)
How do we build programs for interesting problems that
  are easy to construct and maintain
  do the right thing
  run efficiently
![Page 7: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/7.jpg)
Lots of Representations
Plethora of existing models
  Bayesian networks, hidden Markov models, stochastic context-free grammars, etc.
Lots of new models
  Object-oriented Bayesian networks, probabilistic relational models, etc.
![Page 8: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/8.jpg)
Goal
A probabilistic representation language that
  captures many existing models
  allows many new models
  provides programming-language-like solutions to building and maintaining models
![Page 9: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/9.jpg)
IBAL
A high-level “probabilistic programming” language for representing
  Probabilistic models
  Decision problems
  Bayesian learning
Implemented and publicly available
![Page 10: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/10.jpg)
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
![Page 11: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/11.jpg)
Stochastic Experiments
A programming language expression describes a process that generates a value
An IBAL expression describes a process that stochastically generates a value
Meaning of expression is probability distribution over generated value
Evaluating an expression = computing the probability distribution
![Page 12: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/12.jpg)
Simple expressions
Constants: x = 'hello
Variables: y = x
Conditionals: z = if x == 'bye then 1 else 2
Stochastic choice: w = dist [ 0.4 : 'hello, 0.6 : 'world ]
![Page 13: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/13.jpg)
Functions
fair() = dist [0.5 : 'heads, 0.5 : 'tails]
x = fair()
y = fair()
x and y are independent tosses of a fair coin
![Page 14: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/14.jpg)
Higher-order Functions
fair() = dist [0.5 : 'heads, 0.5 : 'tails]
biased() = dist [0.9 : 'heads, 0.1 : 'tails]
pick() = dist [0.5 : fair, 0.5 : biased]
coin = pick()
x = coin()
y = coin()
x and y are conditionally independent given coin
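The difference between conditional and marginal independence in the coin example can be checked by exact enumeration. The following is a minimal Python sketch, not IBAL itself; the dictionary encoding of distributions and the helper name `joint` are assumptions made for illustration.

```python
from fractions import Fraction as F

# Distributions from the slide, encoded as value -> probability.
fair = {"heads": F(1, 2), "tails": F(1, 2)}
biased = {"heads": F(9, 10), "tails": F(1, 10)}
pick = [(F(1, 2), fair), (F(1, 2), biased)]  # pick() chooses a coin function


def joint():
    """Enumerate P(x, y) by summing over which coin was picked."""
    table = {}
    for p_coin, coin in pick:
        for x, px in coin.items():
            for y, py in coin.items():
                # Given the coin, the two tosses are independent.
                table[(x, y)] = table.get((x, y), 0) + p_coin * px * py
    return table


j = joint()
p_x_heads = sum(p for (x, _), p in j.items() if x == "heads")
print(j[("heads", "heads")], p_x_heads ** 2)
```

The joint probability of two heads differs from the product of the marginals, so x and y are dependent once the coin is summed out, even though they are independent given coin.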
![Page 15: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/15.jpg)
Data Structures and Types
IBAL provides a rich type system
  tuples and records
  algebraic data types
IBAL is strongly typed
  automatic ML-style type inference
![Page 16: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/16.jpg)
Bayesian Networks
[Figure: Bayesian network with nodes Smart, Diligent, Good Test Taker, Understands, HW Grade and Exam Grade]
Nodes = domain variables
Edges = direct causal influence
Network structure encodes conditional independencies: I(HW-Grade, Smart | Understands)
[Table: P(U | S, D), indexed by the values of S and D; its rows hold the entries (0.9, 0.1), (0.3, 0.7), (0.01, 0.99) and (0.6, 0.4)]
![Page 17: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/17.jpg)
BNs in IBAL
smart = flip 0.8
diligent = flip 0.4
understands = case <smart,diligent> of
  # <true,true> : flip 0.9
  # <true,false> : flip 0.6
  …
[Figure: the same network with nodes abbreviated S, D, G, U, H, E]
![Page 18: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/18.jpg)
First-Order HMMs
[Figure: chain of hidden states H1, H2, ..., Ht-1, Ht, each hidden state Hi with its observation Oi]
Initial distribution P(H1)
Transition model P(Hi | Hi-1)
Observation model P(Oi | Hi)
What if the hidden state is an arbitrary data structure?
![Page 19: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/19.jpg)
HMMs in IBAL
init : () -> state
trans : state -> state
obs : state -> obsrv
sequence(current) = {
  state = current
  observation = obs(state)
  future = sequence(trans(state)) }
hmm() = sequence(init())
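The recursive definition of sequence describes an unbounded chain, so it only makes sense when evaluated on demand. A rough Python analogue uses a generator to unfold the chain lazily. This is a sampling sketch rather than IBAL's exact inference, and the two-state weather model, its probabilities and the `rng` parameter are all hypothetical choices for illustration.

```python
import random
from itertools import islice


def sequence(current, trans, obs, rng):
    """Lazily unfold the chain: yield (state, observation), then move on."""
    while True:
        yield current, obs(current, rng)
        current = trans(current, rng)


# Hypothetical model components (not from the slides).
def init(rng):
    return "rainy" if rng.random() < 0.5 else "sunny"


def trans(state, rng):
    if rng.random() < 0.8:
        return state  # stay in the same state
    return "sunny" if state == "rainy" else "rainy"


def obs(state, rng):
    p_umbrella = 0.9 if state == "rainy" else 0.2
    return "umbrella" if rng.random() < p_umbrella else "no umbrella"


def hmm(rng):
    return sequence(init(rng), trans, obs, rng)


# Only the first five steps of the infinite chain are ever computed.
first_five = list(islice(hmm(random.Random(0)), 5))
print(first_five)
```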
![Page 20: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/20.jpg)
SCFGs
S -> AB (0.6)
S -> BA (0.4)
A -> a (0.7)
A -> AA (0.3)
B -> b (0.8)
B -> BB (0.2)
Non-terminals are data-generating functions
![Page 21: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/21.jpg)
SCFGs in IBAL
append(x,y) = if null(x) then y else cons(first(x), append(rest(x),y))
production(x,y) = append(x(),y())
terminal(x) = cons(x,nil)
s() = dist[0.6:production(a,b), 0.4:production(b,a)]
a() = dist[0.7:terminal('a),…
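The idea that non-terminals are data-generating functions can be mimicked in Python by treating each non-terminal as a function that samples a string. This is a sampling sketch only, not IBAL's exact evaluation. The rule probabilities come from the SCFG slide, while the `rng` argument and the way `dist`, `production` and `terminal` are encoded are assumptions.

```python
import random


def terminal(sym):
    # A generator that always produces the one-symbol string [sym].
    return lambda rng: [sym]


def production(x, y):
    # Concatenate the strings generated by two non-terminals.
    return lambda rng: x(rng) + y(rng)


def dist(choices):
    # choices: list of (probability, generator) pairs summing to 1.
    def gen(rng):
        r, acc = rng.random(), 0.0
        for p, g in choices:
            acc += p
            if r < acc:
                return g(rng)
        return choices[-1][1](rng)  # guard against floating-point round-off
    return gen


def a(rng):
    return dist([(0.7, terminal("a")), (0.3, production(a, a))])(rng)


def b(rng):
    return dist([(0.8, terminal("b")), (0.2, production(b, b))])(rng)


def s(rng):
    return dist([(0.6, production(a, b)), (0.4, production(b, a))])(rng)


sample = "".join(s(random.Random(0)))
print(sample)
```

Every sampled string is a run of one letter followed by a run of the other, as the grammar requires.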
![Page 22: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/22.jpg)
Probabilistic Relational Models
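[Figure: relational schema with classes Actor (attribute Gender), Movie (attribute Genre) and Appearance (attribute Role-Type, referring to an Actor and a Movie), with example objects Chaplin and Mod T.; Role-Type depends on Actor.Gender and Movie.Genre]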
![Page 23: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/23.jpg)
PRMs in IBAL
movie() = { genre = dist ... }
actor() = { gender = dist ... }
appearance(a,m) = {
  role_type = case (a.gender,m.genre) of
    (male,western) : dist ... }
mod_times = movie()
chaplin = actor()
a1 = appearance(chaplin, mod_times)
![Page 24: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/24.jpg)
Other IBAL Features
Observations can be inserted into programs
  condition probability distribution over values
Probabilities in programs can be learnable parameters, with Bayesian priors
Utilities can be associated with different outcomes
Decision variables can be specified
  influence diagrams, MDPs
![Page 25: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/25.jpg)
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
![Page 26: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/26.jpg)
Goals
Generalize many standard frameworks for inference
  e.g. Bayes nets, HMMs, probabilistic CFGs
Support parameter estimation
Support decision making
Take advantage of language structure
Avoid unnecessary computation
![Page 27: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/27.jpg)
Desideratum #1: Exploit Independence
Use Bayes net-like inference algorithm
[Figure: the Bayesian network over Smart, Diligent, Good Test Taker, Understands, HW Grade and Exam Grade]
![Page 28: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/28.jpg)
Desideratum #2: Exploit Low-Level Structure
Causal independence (noisy-or)
x = f()
y = g()
z = x & flip(0.9) | y & flip(0.8)
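The noisy-or expression has a closed form that an inference algorithm can exploit instead of building a full table. The Python sketch below compares brute-force enumeration of the two flips with that closed form. It assumes & binds tighter than |, and the function names are made up for illustration.

```python
from itertools import product


def p_z_by_enumeration(x, y, p1=0.9, p2=0.8):
    """P(z = true | x, y) by enumerating both flips explicitly."""
    total = 0.0
    for n1, n2 in product([True, False], repeat=2):
        weight = (p1 if n1 else 1 - p1) * (p2 if n2 else 1 - p2)
        if (x and n1) or (y and n2):  # z = (x & flip p1) | (y & flip p2)
            total += weight
    return total


def p_z_noisy_or(x, y, p1=0.9, p2=0.8):
    """Closed form: z is false only if every active cause fails."""
    return 1 - (1 - p1 * x) * (1 - p2 * y)


for x, y in product([True, False], repeat=2):
    print(x, y, p_z_by_enumeration(x, y), p_z_noisy_or(x, y))
```

The closed form needs one multiplication per cause, whereas a complete table grows exponentially with the number of causes.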
![Page 29: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/29.jpg)
Desideratum #2: Exploit Low-Level Structure
Context-specific independence
x = f()
y = g()
z = case <x,y> of
  <false,false> : flip 0.4
  <false,true> : flip 0.6
  <true> : flip 0.7
![Page 30: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/30.jpg)
Desideratum #3: Exploit Object Structure
Complex domain often consists of weakly interacting objects
Objects share a small interface
Objects are conditionally independent given interface
[Figure: objects Student 1 and Student 2, and Course Difficulty]
![Page 31: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/31.jpg)
Desideratum #4: Exploit Repetition
Domain often consists of many of the same kinds of objects
Can inference be shared between them?
f() = complex
x1 = f()
x2 = f()
…
x100 = f()
![Page 32: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/32.jpg)
Desideratum #5: Use the Query
Only evaluate required parts of model
Can allow finite computation on infinite model
f() = f()
x = let y = f() in true
A query on x does not require f
Lazy evaluation is required
Particularly important for probabilistic languages, e.g. stochastic grammars
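The let example can be imitated in Python, which is eager by default, by wrapping the call to f in an explicit thunk. This is only a sketch of the idea; the `Thunk` class and its `forced` flag are hypothetical helpers, not part of IBAL.

```python
class Thunk:
    """A delayed computation that records whether it was ever forced."""

    def __init__(self, compute):
        self._compute = compute
        self.forced = False

    def force(self):
        self.forced = True
        return self._compute()


def f():
    # Mirrors f() = f(): forcing this never yields a value
    # (in Python it would end in a RecursionError).
    return f()


def x():
    y = Thunk(f)    # let y = f(): the call is delayed, not run
    return True, y  # "in true"; y is returned only so it can be inspected


value, y = x()
print(value, y.forced)
```

The query on x completes because y is never forced; an eager evaluator would call f first and never return.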
![Page 33: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/33.jpg)
Desideratum #6: Use Support
The support of a variable is the set of values it can take with positive probability
Knowing support of subexpressions can simplify computation
f() = f()
x = false
y = if x then f() else true
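One way to read this example: if the support of the condition is known, branches that can never be taken need not be examined at all. The Python sketch below computes the support of an if-expression this way. The function names and the use of zero-argument callables for the branches are assumptions for illustration.

```python
def support_if(cond_support, then_support, else_support):
    """Support of `if c then a else b`, visiting only reachable branches.

    then_support and else_support are zero-argument callables so that an
    unreachable branch is never evaluated.
    """
    out = set()
    if True in cond_support:
        out |= then_support()
    if False in cond_support:
        out |= else_support()
    return out


def support_f():
    # Stands in for f() = f(), whose support cannot be computed finitely.
    raise RuntimeError("f() = f() has no finite evaluation")


support_x = {False}  # x = false
support_y = support_if(support_x, support_f, lambda: {True})
print(support_y)
```

Because true is not in the support of x, the branch calling f is skipped and the support of y is computed in finite time.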
![Page 34: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/34.jpg)
Desideratum #7: Use Evidence
Evidence can restrict the possible values of a variable
It can be used like support to simplify computation
f() = f()
x = flip 0.6
y = if x then f() else true
observe x = false
![Page 35: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/35.jpg)
Outline
Motivation
The IBAL Language
Inference Goals
Probabilistic Inference Algorithm
Lessons Learned
![Page 36: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/36.jpg)
Two-Phase Inference
Phase 1: decide what computations need to be performed
Phase 2: perform the computations
![Page 37: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/37.jpg)
Natural Division of Labor
Responsibilities of phase 1:
  utilizing query, support and evidence
  taking advantage of repetition
Responsibilities of phase 2:
  exploiting conditional independence, low-level structure and inter-object structure
![Page 38: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/38.jpg)
Phase 1
[Figure: IBAL Program -> Computation graph]
![Page 39: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/39.jpg)
Computation Graph
Nodes are subexpressions
Edge from X to Y means “Y needs to be computed in order to compute X”
Graph, not tree
  different expressions may share subexpressions
  memoization used to make sure each subexpression occurs once in graph
![Page 40: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/40.jpg)
Construction of Computation Graph
1. Propagate evidence throughout program
2. Compute support for each node
![Page 41: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/41.jpg)
Evidence Propagation
Backwards and forwards
let x = <a:flip 0.4, b:1> in
observe x.a = true in
if x.a then 'a else 'b
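The effect of the observation in this program can be checked by enumerating the flip and discarding outcomes that contradict the evidence. The Python sketch below does this by brute-force conditioning. It is not IBAL's propagation algorithm, and `flip` and `program` are illustrative names.

```python
from fractions import Fraction as F


def flip(p):
    """Distribution of a biased coin as value -> probability."""
    return {True: p, False: 1 - p}


def program():
    # let x = <a: flip 0.4, b: 1> in observe x.a = true in if x.a then 'a else 'b
    weights = {}
    for a_val, p in flip(F(2, 5)).items():
        if a_val is not True:
            continue  # the observation x.a = true rules this outcome out
        result = "a" if a_val else "b"
        weights[result] = weights.get(result, 0) + p
    z = sum(weights.values())  # probability of the evidence
    return {k: v / z for k, v in weights.items()}


print(program())
```

After conditioning, the result is 'a with probability 1, so the else branch never needs to be evaluated. Pushing the evidence back to the flip and forward into the conditional gives the same simplification without enumerating.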
![Page 42: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/42.jpg)
Construction of Computation Graph
1. Propagate evidence throughout program
2. Compute support for each node
  this is an evaluator for a non-deterministic programming language
  lazy evaluation
  memoization
![Page 43: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/43.jpg)
Gotcha!
Laziness and memoization don’t go together
Memoization: when a function is called, look up arguments in cache
But with lazy evaluation, arguments are not evaluated before function call!
![Page 44: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/44.jpg)
Lazy Memoization
Speculatively evaluate function without evaluating arguments
When argument is found to be needed
  abort function evaluation
  store in cache that argument is needed
  evaluate the argument
  speculatively evaluate function again
When function evaluates successfully
  cache mapping from evaluated arguments to result
![Page 45: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/45.jpg)
Lazy Memoization
let f(x,y,z) = if x then y else z
in f(true,'a,'b)
f(_,_,_): need x
f(true,_,_): need y
f(true,'a,_): 'a
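The speculative evaluation loop can be sketched in Python with an exception that signals which argument is needed. This is a minimal reconstruction of the steps listed on the previous slide, not IBAL's implementation; `Need`, `Arg`, `lazy_memo_call` and the "_" wildcard key are assumptions.

```python
WILD = "_"  # marks an argument that has not been evaluated


class Need(Exception):
    """Raised to abort evaluation when an unevaluated argument is required."""

    def __init__(self, index):
        super().__init__(index)
        self.index = index


class Arg:
    def __init__(self, index, values):
        self.index, self.values = index, values

    def get(self):
        if self.index not in self.values:
            raise Need(self.index)
        return self.values[self.index]


def lazy_memo_call(fn, thunks, cache):
    values = {}  # argument index -> evaluated value
    n = len(thunks)
    while True:
        key = tuple(values.get(i, WILD) for i in range(n))
        entry = cache.get(key)
        if entry is None:
            try:  # speculatively evaluate with the arguments known so far
                entry = ("value", fn(*[Arg(i, values) for i in range(n)]))
            except Need as need:
                entry = ("need", need.index)  # remember the needed argument
            cache[key] = entry
        if entry[0] == "value":
            return entry[1]
        values[entry[1]] = thunks[entry[1]]()  # evaluate only that argument


def f(x, y, z):
    return y.get() if x.get() else z.get()


cache = {}
result = lazy_memo_call(f, [lambda: True, lambda: "a", lambda: "b"], cache)
print(result, cache)
```

The cache ends up with one entry per step of the trace above, and the third argument is never evaluated. A later call whose first two arguments evaluate to the same values is answered from the cache without running f again.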
![Page 46: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/46.jpg)
Phase 2
[Figure: Computation Graph -> Microfactors -> Solution P(Outcome=true)=0.6]
![Page 47: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/47.jpg)
Microfactors
Representation of function from variables to reals
E.g.
X      Y      Value
False  False  0
False  True   1
True   -      1
is the indicator function of X v Y
More compact than complete tables
Can represent low-level structure
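A microfactor of this kind can be pictured as a list of rows in which some entries are wildcards. The Python sketch below encodes the table above and looks up values by pattern matching. The tuple layout, the `WILD` marker and the `lookup` function are assumptions for illustration.

```python
from itertools import product

WILD = None  # plays the role of "-" in the table

# (variables, rows); each row is (pattern over the variables, value).
OR_FACTOR = (
    ("X", "Y"),
    [
        ((False, False), 0.0),
        ((False, True), 1.0),
        ((True, WILD), 1.0),  # one row covers both values of Y
    ],
)


def lookup(factor, assignment):
    """Return the value of the first row whose pattern matches."""
    variables, rows = factor
    key = tuple(assignment[v] for v in variables)
    for pattern, value in rows:
        if all(p is WILD or p == k for p, k in zip(pattern, key)):
            return value
    return 0.0  # assignments not covered by any row


for x, y in product([False, True], repeat=2):
    print(x, y, lookup(OR_FACTOR, {"X": x, "Y": y}))
```

Three rows suffice where a complete table would need four, and the saving grows with the number of variables that a wildcard can absorb.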
![Page 48: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/48.jpg)
Producing Microfactors
Goal: Translate an IBAL program into a set of microfactors F and a set of variables X such that
$P(\text{Output}) = \sum_{X} \prod_{f \in F} f$
Similar to Bayes net
Can solve by variable elimination
  exploits independence
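The sum-of-products expression can be evaluated directly on a tiny example. The Python sketch below uses brute-force enumeration rather than variable elimination, so it shows what is being computed, not how to compute it efficiently. The example program (out = x | y with x = flip 0.3 and y = flip 0.5) and all names are hypothetical.

```python
from itertools import product


def marginal(factors, hidden, output, domain=(True, False)):
    """P(output) as the sum over hidden variables of the product of factors."""
    result = {}
    for out_val in domain:
        total = 0.0
        for vals in product(domain, repeat=len(hidden)):
            env = dict(zip(hidden, vals))
            env[output] = out_val
            weight = 1.0
            for factor in factors:
                weight *= factor(env)
            total += weight
        result[out_val] = total
    return result


# Hypothetical factors for: x = flip 0.3, y = flip 0.5, out = x | y
factors = [
    lambda e: 0.3 if e["X"] else 0.7,                          # P(X)
    lambda e: 0.5,                                             # P(Y)
    lambda e: 1.0 if e["Out"] == (e["X"] or e["Y"]) else 0.0,  # Out = X v Y
]

p = marginal(factors, ["X", "Y"], "Out")
print(p)
```

Variable elimination computes the same quantity but pushes each sum inside the product as far as it will go, which avoids enumerating every joint assignment.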
![Page 49: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/49.jpg)
Producing Microfactors
Accomplished by recursive descent on computation graph
Use production rules to translate each expression type into microfactors
Introduce temporary variables where necessary
![Page 50: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/50.jpg)
Producing Microfactors
[Figure: production rule for if e1 then e2 else e3, combining the microfactors of e1, e2 and e3 through a temporary variable X, with e2 used when X=True and e3 when X=False]
![Page 51: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/51.jpg)
Phase 2
[Figure: Computation Graph -> Microfactors -> Solution P(Outcome=true)=0.6, via Structured Variable Elimination]
![Page 52: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/52.jpg)
Learning and Decisions
Learning uses EM
  like BNs, HMMs, SCFGs etc.
Decision making uses backward induction
  like influence diagrams
Memoization provides dynamic programming
  simulates value iteration for MDPs
![Page 53: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/53.jpg)
Lessons Learned
Stochastic programming languages are more complex than they appear
Single mechanism is insufficient for inference in a complex language
Different approaches may each contribute ideas to solution
Beware of unexpected interactions
![Page 54: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/54.jpg)
Conclusion
IBAL is a very general language for constructing probabilistic models
  captures many existing frameworks, and allows many new ones
Building an IBAL model = writing a program describing how values are generated
Probabilistic reasoning is like programming
![Page 55: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/55.jpg)
Future Work
Approximate inference
  loopy belief propagation
  likelihood weighting
  Markov chain Monte Carlo
  special methods for IBAL?
Ease of use
  Reading formatted data
  Programming interface
![Page 56: Turning Probabilistic Reasoning into Programming Avi Pfeffer Harvard University](https://reader030.vdocuments.net/reader030/viewer/2022032612/56649ea95503460f94bad970/html5/thumbnails/56.jpg)
Obtaining IBAL
www.eecs.harvard.edu/~avi/ibal