Source: pillowlab.princeton.edu/teaching/mathtools16/slides/lec... (2016-04-21)
Probability Intro Part II: Bayes’ Rule

Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314), Spring 2016
Lecture 13
Quick recap
• Random variable X takes on different values according to a probability distribution
• discrete: probability mass function (pmf)
• continuous: probability density function (pdf)
• marginalization: summing (“splatting”)
• conditionalization: “slicing”
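The recap operations can be sketched on a small discrete joint distribution. This is a minimal illustration with made-up numbers, not an example from the slides:

```python
import numpy as np

# Hypothetical 2D joint pmf over (x, y) on a small grid (made-up numbers).
P_xy = np.array([[0.10, 0.20, 0.10],
                 [0.05, 0.30, 0.25]])   # rows: x values, cols: y values
assert np.isclose(P_xy.sum(), 1.0)

# Marginalization ("splatting"): sum out the other variable.
P_x = P_xy.sum(axis=1)   # P(x) = sum_y P(x, y)
P_y = P_xy.sum(axis=0)   # P(y) = sum_x P(x, y)

# Conditionalization ("slicing"): joint divided by marginal.
P_y_given_x0 = P_xy[0] / P_x[0]   # P(y | x = x0)

print(P_x)            # marginal over x
print(P_y_given_x0)   # a valid pmf: sums to 1
```

Note that the conditional slice is renormalized by the marginal, which is exactly the “joint divided by marginal” rule on the next slides.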
conditionalization (“slicing”)
[figure: joint density P(x, y) on x, y ∈ [−3, 3], with a slice taken at a fixed x]

(“joint divided by marginal”):

P(y | x) = P(x, y) / P(x)
conditionalization (“slicing”)

[figure: joint density P(x, y) with the marginal P(x) shown along the x-axis and a conditional slice P(y | x) highlighted]
conditional densities

[figure: conditional densities P(y | x) for several values of x, plotted over y ∈ [−3, 3]]
Bayes’ Rule

Conditional densities: P(x | y) = P(x, y) / P(y) and P(y | x) = P(x, y) / P(x). Substituting P(x, y) = P(y | x) P(x) into the first gives Bayes’ rule:

P(x | y) = P(y | x) P(x) / P(y)

posterior = likelihood × prior / marginal probability of y (“normalizer”)
A little math: Bayes’ rule

• a very simple formula for manipulating probabilities:

P(B | A) = P(A | B) P(B) / P(A)

where P(B | A) is the conditional probability (“probability of B given that A occurred”), P(A) is the probability of A, and P(B) is the probability of B.

simplified form: P(B | A) ∝ P(A | B) P(B)
Example: 2 coins
(perception - alan stocker © 2009)

• one coin is fake: “heads” on both sides (H / H)
• one coin is standard: (H / T)

You grab one of the coins at random and flip it. It comes up “heads”. What is the probability that you’re holding the fake?

p(Fake | H) ∝ p(H | Fake) p(Fake) = (1)(½) = ½
p(Nrml | H) ∝ p(H | Nrml) p(Nrml) = (½)(½) = ¼

The two posterior probabilities must sum to 1, so normalize:

p(Fake | H) = (½) / (½ + ¼) = 2/3,  p(Nrml | H) = (¼) / (½ + ¼) = 1/3
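The same arithmetic in a few lines of Python, a sketch mirroring the slide’s computation:

```python
# Exact posterior for the two-coin example: which coin, given "heads"?
prior = {"fake": 0.5, "normal": 0.5}
likelihood_heads = {"fake": 1.0, "normal": 0.5}   # P(H | coin)

unnorm = {c: likelihood_heads[c] * prior[c] for c in prior}   # ∝ posterior
Z = sum(unnorm.values())                                      # normalizer
posterior = {c: unnorm[c] / Z for c in unnorm}

print(posterior["fake"])    # 2/3
print(posterior["normal"])  # 1/3
```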
Example: 2 coins, continued

Experiment #2: the coin comes up “tails”. What is the probability that you’re holding the fake?

p(Fake | T) ∝ p(T | Fake) p(Fake) = (0)(½) = 0
p(Nrml | T) ∝ p(T | Nrml) p(Nrml) = (½)(½) = ¼

The posteriors must sum to 1, so p(Fake | T) = 0 and p(Nrml | T) = 1: a tail can only have come from the normal coin.
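As a sanity check, a small Monte Carlo simulation (illustrative code, not from the slides) reproduces both answers:

```python
import random

random.seed(0)
N = 100_000
fake_given_H = [0, 0]   # [count fake, count total] among "heads" outcomes
fake_given_T = [0, 0]   # same, among "tails" outcomes

for _ in range(N):
    coin = random.choice(["fake", "normal"])
    if coin == "fake":
        flip = "H"                        # fake coin: heads on both sides
    else:
        flip = random.choice(["H", "T"])  # normal coin: fair
    if flip == "H":
        fake_given_H[1] += 1
        fake_given_H[0] += coin == "fake"
    else:
        fake_given_T[1] += 1
        fake_given_T[0] += coin == "fake"

print(fake_given_H[0] / fake_given_H[1])  # ≈ 2/3
print(fake_given_T[0] / fake_given_T[1])  # 0 exactly: fake can't show tails
```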
Is the middle circle popping “out” or “in”?
What we want to know: P(OUT | image) vs. P(IN | image). Which of these is greater?

The likelihoods alone can’t decide:

P(image | OUT & light is above) = 1
P(image | IN & light is below) = 1

• the image is equally likely to be OUT or IN given the sensory data alone

Apply Bayes’ rule:

P(OUT | image) ∝ P(image | OUT & light above) × P(OUT) × P(light above)
P(IN | image) ∝ P(image | IN & light below) × P(IN) × P(light below)

The difference comes from the prior: P(light above) vs. P(light below).
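Plugging in numbers makes the role of the prior concrete. The slides give no specific priors, so the 0.9 / 0.1 split for light-above vs. light-below below is purely illustrative:

```python
# Sketch of the shading inference, with made-up numbers for the priors.
# Likelihoods from the slide: the image is perfectly consistent with either
# (OUT, light above) or (IN, light below).
lik = {"OUT": 1.0, "IN": 1.0}           # P(image | shape & matching light)
p_shape = {"OUT": 0.5, "IN": 0.5}       # assume no shape preference
p_light = {"above": 0.9, "below": 0.1}  # assumed "light comes from above" prior

unnorm = {"OUT": lik["OUT"] * p_shape["OUT"] * p_light["above"],
          "IN":  lik["IN"]  * p_shape["IN"]  * p_light["below"]}
Z = sum(unnorm.values())
posterior = {k: v / Z for k, v in unnorm.items()}
print(posterior)   # OUT wins because light-above is far more probable a priori
```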
Bayesian Models for Perception

Bayes’ rule: P(B | A) ∝ P(A | B) P(B), with B = what’s in the world and A = sensory data. This gives a formula for computing P(what’s in the world | sensory data), which is what our brain wants to know!

P(world | sense data) ∝ P(sense data | world) × P(world)

• Posterior: resulting beliefs about the world
• Likelihood: given by laws of physics; ambiguous because many world states could give rise to the same sense data
• Prior: given by past experience
Helmholtz: perception as “optimal inference”

“Perception is our best guess as to what is in the world, given our current sensory evidence and our prior experience.” (Helmholtz, 1821-1894)

P(world | sense data) ∝ P(sense data | world) × P(world)
(posterior ∝ likelihood × prior, as before)
Many different 3D scenes can give rise to the same 2D retinal image
The Ames Room
How does our brain go about deciding which interpretation, A or B?

P(image | A) and P(image | B) are equal: both A and B could have generated this image.

Let’s use Bayes’ rule:

P(A | image) = P(image | A) P(A) / Z
P(B | image) = P(image | B) P(B) / Z

Since the likelihoods are equal, the priors P(A) and P(B) decide.
Hollow Face Illusion
http://www.richardgregory.org/experiments/
Hollow Face Illusion (Richard Gregory)
www.youramazingbrain.org.uk/supersenses/hollow.htm

H1: face is convex; H2: face is concave; D: the video (image data)

• Very sharp prior favours convex faces: P(H1) >> P(H2)
• Nearly flat likelihood function: P(D | H1) ≈ P(D | H2)
• ∴ Posterior favours convex: P(H1 | D) > P(H2 | D)

P(convex | video) ∝ P(video | convex) P(convex)
P(concave | video) ∝ P(video | concave) P(concave)
(posterior ∝ likelihood × prior)

P(convex) > P(concave) ⇒ the posterior probability of convex is higher (which determines our percept)
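The “sharp prior beats flat likelihood” logic can be checked numerically. The numbers below are made up for illustration; the slides give no specific values:

```python
# Sharp prior x nearly flat likelihood, with illustrative made-up numbers.
p_convex, p_concave = 0.999, 0.001   # very sharp prior: P(H1) >> P(H2)
lik_convex, lik_concave = 0.4, 0.6   # nearly flat likelihood: the video only
                                     # slightly favours the true (concave) face

post_convex = lik_convex * p_convex      # ∝ P(convex | video)
post_concave = lik_concave * p_concave   # ∝ P(concave | video)
Z = post_convex + post_concave
print(post_convex / Z)   # ≈ 0.9985: percept is convex despite the evidence
```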
• the prior belief that objects are convex is SO strong that we can’t override it, even when we know it’s wrong!
(So your brain knows Bayes’ rule even if you don’t!)
Terminology question

• When do we call P(y | x) a likelihood?
  A: when considered as a function of x (i.e., with y held fixed)
  • note: as a function of x it doesn’t integrate to 1.
• What is it called as a function of y, for fixed x?
  A: a conditional distribution (or sampling distribution)
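One way to see the “doesn’t integrate to 1” point concretely, using a binomial model (my example, not the slides’):

```python
import numpy as np
from math import comb

# Binomial model: P(k | theta) for k heads in n flips.
n, k = 10, 7

# As a function of k with theta fixed, it is a pmf and sums to 1:
theta0 = 0.3
pmf = [comb(n, j) * theta0**j * (1 - theta0)**(n - j) for j in range(n + 1)]
print(sum(pmf))   # 1.0 (up to float error)

# As a function of theta with k fixed, it is a likelihood; its integral over
# [0, 1] is 1/(n+1), not 1 (the mean over a dense grid approximates it):
theta = np.linspace(0.0, 1.0, 10_001)
lik = comb(n, k) * theta**k * (1 - theta)**(n - k)
print(lik.mean())   # ≈ 1/11 ≈ 0.0909
```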
independence

Definition: x, y are independent iff P(x, y) = P(x) P(y) for all x, y.

In linear algebra terms: the matrix of joint probabilities is the outer product of the marginals,

P = p_x p_yᵀ   (outer product)

[figure: a separable joint density P(x, y) on x, y ∈ [−3, 3]]
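A quick numerical check of the outer-product characterization, with illustrative marginals:

```python
import numpy as np

# Independent joint = outer product of the marginals.
p_x = np.array([0.2, 0.8])
p_y = np.array([0.1, 0.3, 0.6])
P = np.outer(p_x, p_y)          # P(x, y) = P(x) P(y)
assert np.isclose(P.sum(), 1.0)

# The marginals are recovered, and conditioning on x doesn't change P(y):
print(np.allclose(P.sum(axis=1), p_x))        # True
print(np.allclose(P[0] / P[0].sum(), p_y))    # True: every slice is P(y)
```

That last line is the key consequence of independence: every conditional slice equals the marginal.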
Summary

• marginalization (“splatting”)
• conditionalization (“slicing”)
• Bayes’ rule (prior, likelihood, posterior)
• independence