INFERENCE IN BAYESIAN NETWORKS
AGENDA
Reading off independence assumptions
Efficient inference in Bayesian networks
  Top-down inference
  Variable elimination
  Monte-Carlo methods
SOME APPLICATIONS OF BN
Medical diagnosis
Troubleshooting of hardware/software systems
Fraud/uncollectible debt detection
Data mining
Analysis of genetic sequences
Data interpretation, computer vision, image understanding
MORE COMPLICATED SINGLY-CONNECTED BELIEF NET
[Network diagram over the nodes Radio, Battery, SparkPlugs, Gas, Starts, Moves]
Region = {Sky, Tree, Grass, Rock}
[Diagram: regions R1–R4 connected by “Above” relations]
BN to evaluate insurance risks
BN FROM LAST LECTURE
Burglary   Earthquake      (causes)
      Alarm
JohnCalls   MaryCalls      (effects)

Directed acyclic graph. Intuitive meaning of an arc from x to y: “x has direct influence on y.”
ARCS DO NOT NECESSARILY ENCODE CAUSALITY!
Two chains over A, B, C with the arcs pointing in opposite directions (A → B → C and C → B → A): two BNs that can encode the same joint probability distribution.
READING OFF INDEPENDENCE RELATIONSHIPS
Given B, does the value of A affect the probability of C? Is P(C|B,A) = P(C|B)?
No, it does not: C’s parent (B) is given, so C is independent of its non-descendants (A).
Independence is symmetric: C ⊥ A | B  ⇒  A ⊥ C | B

A → B → C
WHAT DOES THE BN ENCODE?
Burglary ⊥ Earthquake
JohnCalls ⊥ MaryCalls | Alarm
JohnCalls ⊥ Burglary | Alarm
JohnCalls ⊥ Earthquake | Alarm
MaryCalls ⊥ Burglary | Alarm
MaryCalls ⊥ Earthquake | Alarm

A node is independent of its non-descendants, given its parents.
READING OFF INDEPENDENCE RELATIONSHIPS
How about Burglary ⊥ Earthquake | Alarm? No! Why not?
READING OFF INDEPENDENCE RELATIONSHIPS
How about Burglary ⊥ Earthquake | Alarm? No! Why not?
P(B,E|A) = P(A|B,E)P(B)P(E)/P(A) ≈ 0.00075, whereas P(B|A)P(E|A) ≈ 0.086, so they are not conditionally independent given Alarm.
READING OFF INDEPENDENCE RELATIONSHIPS
How about Burglary ⊥ Earthquake | JohnCalls? No! Why not? Knowing JohnCalls affects the probability of Alarm, which makes Burglary and Earthquake dependent.
INDEPENDENCE RELATIONSHIPS
Rough intuition (this holds for tree-like graphs, i.e., polytrees):
Evidence on the (directed) road between two variables makes them independent
Evidence on an “Λ” node (a common ancestor) makes its descendants independent
Evidence on a “V” node (a common effect), or below the V, makes the ancestors of the variables dependent (otherwise they are independent)
Formal property in the general case: d-separation ⇒ independence (see R&N)
BENEFITS OF SPARSE MODELS
Modeling: fewer relationships need to be encoded (whether through understanding or statistics), and large networks can be built up from smaller ones
Intuition: dependencies/independencies between variables can be read off the network structure
Tractable inference
TOP-DOWN INFERENCE
Suppose we want to compute P(Alarm).

Burglary   Earthquake
      Alarm
JohnCalls   MaryCalls

P(B) = 0.001      P(E) = 0.002

B E | P(A|B,E)        A | P(J|A)        A | P(M|A)
T T | 0.95            T | 0.90          T | 0.70
T F | 0.94            F | 0.05          F | 0.01
F T | 0.29
F F | 0.001
TOP-DOWN INFERENCE
Suppose we want to compute P(Alarm):
1. P(Alarm) = Σb,e P(A,b,e)
2. P(Alarm) = Σb,e P(A|b,e)P(b)P(e)
TOP-DOWN INFERENCE
Suppose we want to compute P(Alarm):
1. P(Alarm) = Σb,e P(A,b,e)
2. P(Alarm) = Σb,e P(A|b,e)P(b)P(e)
3. P(Alarm) = P(A|B,E)P(B)P(E) + P(A|B,¬E)P(B)P(¬E) + P(A|¬B,E)P(¬B)P(E) + P(A|¬B,¬E)P(¬B)P(¬E)
TOP-DOWN INFERENCE
Suppose we want to compute P(Alarm):
1. P(A) = Σb,e P(A,b,e)
2. P(A) = Σb,e P(A|b,e)P(b)P(e)
3. P(A) = P(A|B,E)P(B)P(E) + P(A|B,¬E)P(B)P(¬E) + P(A|¬B,E)P(¬B)P(E) + P(A|¬B,¬E)P(¬B)P(¬E)
4. P(A) = 0.95·0.001·0.002 + 0.94·0.001·0.998 + 0.29·0.999·0.002 + 0.001·0.999·0.998 = 0.00252
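Steps 1–4 above can be checked directly in a few lines. This is a minimal sketch; the variable names and dictionary layout are my own, while the CPT values are the ones from the slides.

```python
# P(Alarm) by summing over the parents: P(A) = sum_{b,e} P(A|b,e) P(b) P(e)
P_B = 0.001
P_E = 0.002
P_A_given = {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001}

p_alarm = 0.0
for b in (True, False):
    for e in (True, False):
        pb = P_B if b else 1 - P_B   # P(b) for this assignment of B
        pe = P_E if e else 1 - P_E   # P(e) for this assignment of E
        p_alarm += P_A_given[(b, e)] * pb * pe

print(round(p_alarm, 5))  # ≈ 0.00252, matching step 4
```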
TOP-DOWN INFERENCE
Now, suppose we want to compute P(MaryCalls).
TOP-DOWN INFERENCE
Now, suppose we want to compute P(MaryCalls):
1. P(M) = P(M|A)P(A) + P(M|¬A)P(¬A)
TOP-DOWN INFERENCE
Now, suppose we want to compute P(MaryCalls):
1. P(M) = P(M|A)P(A) + P(M|¬A)P(¬A)
2. P(M) = 0.70·0.00252 + 0.01·(1 − 0.00252) = 0.0117
TOP-DOWN INFERENCE WITH EVIDENCE
Suppose we want to compute P(Alarm|Earthquake).
TOP-DOWN INFERENCE WITH EVIDENCE
Suppose we want to compute P(A|e):
1. P(A|e) = Σb P(A,b|e)
2. P(A|e) = Σb P(A|b,e)P(b)
TOP-DOWN INFERENCE WITH EVIDENCE
Suppose we want to compute P(A|e):
1. P(A|e) = Σb P(A,b|e)
2. P(A|e) = Σb P(A|b,e)P(b)
3. P(A|e) = 0.95·0.001 + 0.29·0.999 = 0.29066
TOP-DOWN INFERENCE
Only works if the graph of ancestors of the query variable is a polytree
Evidence must be given on ancestor(s) of the query variable
Efficient: O(d·2^k) time, where d is the number of ancestors of the variable and k is a bound on the number of parents
Evidence on an ancestor cuts off the influence of the portion of the graph above the evidence node
QUERYING THE BN
The BN gives P(T|C). What about P(C|T)?

Cavity → Toothache

P(C) = 0.1

C | P(T|C)
T | 0.4
F | 0.01111
BAYES’ RULE
P(A,B) = P(A|B) P(B) = P(B|A) P(A)
So… P(A|B) = P(B|A) P(A) / P(B)
APPLYING BAYES’ RULE
Let A be a cause and B an effect, and say we know P(B|A) and P(A) (the conditional probability tables).
What’s P(B)?
APPLYING BAYES’ RULE
Let A be a cause and B an effect, and say we know P(B|A) and P(A) (the conditional probability tables).
What’s P(B)?
P(B) = Σa P(B,A=a)   [marginalization]
P(B,A=a) = P(B|A=a)P(A=a)   [conditional probability]
So P(B) = Σa P(B|A=a) P(A=a)
APPLYING BAYES’ RULE
Let A be a cause and B an effect, and say we know P(B|A) and P(A) (the conditional probability tables).
What’s P(A|B)?
APPLYING BAYES’ RULE
Let A be a cause and B an effect, and say we know P(B|A) and P(A) (the conditional probability tables).
What’s P(A|B)?
P(A|B) = P(B|A)P(A)/P(B)   [Bayes’ rule]
P(B) = Σa P(B|A=a) P(A=a)   [last slide]
So P(A|B) = P(B|A)P(A) / [Σa P(B|A=a) P(A=a)]
HOW DO WE READ THIS?
P(A|B) = P(B|A)P(A) / [Σa P(B|A=a) P(A=a)]
[An equation that holds for all values A can take on, and all values B can take on]
P(A=a|B=b) =
HOW DO WE READ THIS?
P(A|B) = P(B|A)P(A) / [Σa P(B|A=a) P(A=a)]
[An equation that holds for all values A can take on, and all values B can take on]
P(A=a|B=b) = P(B=b|A=a)P(A=a) / [Σa P(B=b|A=a) P(A=a)]
Are these the same a?
HOW DO WE READ THIS?
P(A|B) = P(B|A)P(A) / [Σa P(B|A=a) P(A=a)]
[An equation that holds for all values A can take on, and all values B can take on]
P(A=a|B=b) = P(B=b|A=a)P(A=a) / [Σa P(B=b|A=a) P(A=a)]
Are these the same a?
NO!
HOW DO WE READ THIS?
P(A|B) = P(B|A)P(A) / [Σa P(B|A=a) P(A=a)]
[An equation that holds for all values A can take on, and all values B can take on]
P(A=a|B=b) = P(B=b|A=a)P(A=a) / [Σa′ P(B=b|A=a′) P(A=a′)]
Be careful about indices!
QUERYING THE BN
The BN gives P(T|C). What about P(C|T)?
P(Cavity|Toothache) = P(Toothache|Cavity) P(Cavity) / P(Toothache)   [Bayes’ rule]
The denominator is computed by summing the numerator over Cavity and ¬Cavity.
Querying a BN is just applying Bayes’ rule on a larger scale…
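The query above can be carried out numerically in a few lines. This is a sketch; the names are illustrative, while the CPT values (P(C) = 0.1, P(T|C) = 0.4, P(T|¬C) = 0.01111) come from the slide.

```python
# P(Cavity | Toothache) via Bayes' rule; denominator sums over C and not-C
p_cavity = 0.1
p_tooth_given = {True: 0.4, False: 0.01111}   # P(T=1 | C)

p_tooth = (p_tooth_given[True] * p_cavity
           + p_tooth_given[False] * (1 - p_cavity))   # P(T=1)
p_cavity_given_tooth = p_tooth_given[True] * p_cavity / p_tooth

print(round(p_cavity_given_tooth, 3))  # ≈ 0.8
```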
PERFORMING INFERENCE
Variables X; evidence set E = e; query variable Q
We want to compute the posterior probability distribution over Q, given E = e
Let the non-evidence variables be Y (= X \ E)
Straightforward method:
1. Compute the joint P(Y, E=e)
2. Marginalize to get P(Q, E=e)
3. Divide by P(E=e) to get P(Q|E=e)
INFERENCE IN THE ALARM EXAMPLE
P(J|M) = ??   (query Q: JohnCalls; evidence E = e: MaryCalls)
INFERENCE IN THE ALARM EXAMPLE
P(J|MaryCalls) = ??
1. P(J,A,B,E,MaryCalls) = P(J|A)P(MaryCalls|A)P(A|B,E)P(B)P(E)
   [using P(x1,…,xn) = Πi=1,…,n P(xi|parents(Xi)); this is a full joint distribution table with 2⁴ = 16 entries]
INFERENCE IN THE ALARM EXAMPLE
P(J|MaryCalls) = ??
1. P(J,A,B,E,MaryCalls) = P(J|A)P(MaryCalls|A)P(A|B,E)P(B)P(E)
2. P(J,MaryCalls) = Σa,b,e P(J,A=a,B=b,E=e,MaryCalls)
   [2 entries: one for JohnCalls, the other for ¬JohnCalls]
INFERENCE IN THE ALARM EXAMPLE
P(J|MaryCalls) = ??
1. P(J,A,B,E,MaryCalls) = P(J|A)P(MaryCalls|A)P(A|B,E)P(B)P(E)
2. P(J,MaryCalls) = Σa,b,e P(J,A=a,B=b,E=e,MaryCalls)
3. P(J|MaryCalls) = P(J,MaryCalls)/P(MaryCalls) = P(J,MaryCalls)/(Σj P(j,MaryCalls))
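Steps 1–3 can be sketched as inference by enumeration. The function and dictionary names below are my own; the CPT values are the ones from the slides.

```python
from itertools import product

# CPTs from the slides (True/False encoded as 1/0)
P_B, P_E = 0.001, 0.002
P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}
P_J = {1: 0.90, 0: 0.05}   # P(J=1 | A)
P_M = {1: 0.70, 0: 0.01}   # P(M=1 | A)

def joint(j, m, a, b, e):
    """Step 1: P(J=j, M=m, A=a, B=b, E=e) via the chain-rule factorization."""
    def bern(p, v): return p if v else 1 - p
    return (bern(P_J[a], j) * bern(P_M[a], m) * bern(P_A[(b, e)], a)
            * bern(P_B, b) * bern(P_E, e))

# Step 2: marginalize out A, B, E with the evidence MaryCalls = 1
p_jm = {j: sum(joint(j, 1, a, b, e) for a, b, e in product((0, 1), repeat=3))
        for j in (0, 1)}
# Step 3: normalize by P(MaryCalls = 1)
p_j_given_m = p_jm[1] / (p_jm[0] + p_jm[1])
print(round(p_j_given_m, 3))
```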
HOW EXPENSIVE?
P(X) = P(x1,…,xn) = Πi=1,…,n P(xi|parents(Xi))
Straightforward method:
1. Use the above to compute P(Y, E=e)
2. P(Q, E=e) = Σy1 … Σyk P(Y, E=e)
3. P(E=e) = Σq P(Q, E=e)
Step 1 produces O(2^(n−|E|)) entries!
Step 3 is just a normalization factor: no big deal once we have P(Q, E=e).
Can we do better?
VARIABLE ELIMINATION
Consider the linear network X1 → X2 → X3
P(X) = P(X1) P(X2|X1) P(X3|X2)
P(X3) = Σx1 Σx2 P(x1) P(x2|x1) P(X3|x2)
VARIABLE ELIMINATION
Consider the linear network X1 → X2 → X3
P(X) = P(X1) P(X2|X1) P(X3|X2)
P(X3) = Σx1 Σx2 P(x1) P(x2|x1) P(X3|x2)
      = Σx2 P(X3|x2) Σx1 P(x1) P(x2|x1)
Rearrange the equation…
VARIABLE ELIMINATION
Consider the linear network X1 → X2 → X3
P(X) = P(X1) P(X2|X1) P(X3|X2)
P(X3) = Σx1 Σx2 P(x1) P(x2|x1) P(X3|x2)
      = Σx2 P(X3|x2) Σx1 P(x1) P(x2|x1)
      = Σx2 P(X3|x2) P(x2)   [the inner sum is computed once for each value of X2]
Cache P(x2) for both values of X3!
VARIABLE ELIMINATION
Consider the linear network X1 → X2 → X3
P(X) = P(X1) P(X2|X1) P(X3|X2)
P(X3) = Σx1 Σx2 P(x1) P(x2|x1) P(X3|x2)
      = Σx2 P(X3|x2) Σx1 P(x1) P(x2|x1)
      = Σx2 P(X3|x2) P(x2)   [computed once for each value of X2]
How many · and + saved?  ·: 2·4·2 = 16 vs 4+4 = 8;  +: 2·3 = 6 vs 2+2 = 4
Can lead to huge gains in larger networks
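The derivation above can be sketched as code. The structure (eliminate X1 first, cache the intermediate factor P(X2), then eliminate X2) follows the slides; the CPT numbers below are made up purely to exercise the elimination order.

```python
# Variable elimination on the chain X1 -> X2 -> X3.
# Conditional tables are indexed as table[parent_value][child_value].
P_X1 = {0: 0.7, 1: 0.3}
P_X2_given_X1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_X3_given_X2 = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}

# Eliminate X1: cache the intermediate factor P(X2 = x2)
P_X2 = {x2: sum(P_X1[x1] * P_X2_given_X1[x1][x2] for x1 in (0, 1))
        for x2 in (0, 1)}
# Eliminate X2: P(X3 = x3) = sum_x2 P(X3=x3 | x2) P(x2)
P_X3 = {x3: sum(P_X2[x2] * P_X3_given_X2[x2][x3] for x2 in (0, 1))
        for x3 in (0, 1)}

print(P_X3)  # a proper distribution: the two entries sum to 1
```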
VE IN ALARM EXAMPLE
P(E|j,m) = P(E,j,m)/P(j,m)
P(E,j,m) = Σa Σb P(E) P(b) P(a|E,b) P(j|a) P(m|a)
VE IN ALARM EXAMPLE
P(E|j,m) = P(E,j,m)/P(j,m)
P(E,j,m) = Σa Σb P(E) P(b) P(a|E,b) P(j|a) P(m|a)
         = P(E) Σb P(b) Σa P(a|E,b) P(j|a) P(m|a)
VE IN ALARM EXAMPLE
P(E|j,m) = P(E,j,m)/P(j,m)
P(E,j,m) = Σa Σb P(E) P(b) P(a|E,b) P(j|a) P(m|a)
         = P(E) Σb P(b) Σa P(a|E,b) P(j|a) P(m|a)
         = P(E) Σb P(b) P(j,m|E,b)   [compute for all values of E, b]
VE IN ALARM EXAMPLE
P(E|j,m) = P(E,j,m)/P(j,m)
P(E,j,m) = Σa Σb P(E) P(b) P(a|E,b) P(j|a) P(m|a)
         = P(E) Σb P(b) Σa P(a|E,b) P(j|a) P(m|a)
         = P(E) Σb P(b) P(j,m|E,b)
         = P(E) P(j,m|E)   [compute for all values of E]
WHAT ORDER TO PERFORM VE?
For tree-like BNs (polytrees), order the eliminations so that parents come before children
The number of entries in each intermediate probability table is 2^(# of parents of a node)
If the number of parents of each node is bounded, then VE runs in linear time!
For other networks, the intermediate factors may become large
NON-POLYTREE NETWORKS
[Diamond network: A → B, A → C, B → D, C → D]

P(D) = Σa Σb Σc P(a)P(b|a)P(c|a)P(D|b,c)
     = Σb Σc P(D|b,c) Σa P(a)P(b|a)P(c|a)

No more simplifications…
APPROXIMATE INFERENCE TECHNIQUES
Based on the idea of Monte Carlo simulation.
Basic idea: to estimate the probability that a coin flips heads, flip it a huge number of times and count the fraction of heads observed.
Conditional simulation: to estimate the probability P(H) that a coin picked out of bucket B flips heads:
1. Pick a coin C out of B (occurs with probability P(C))
2. Flip C and observe whether it comes up heads (occurs with probability P(H|C))
3. Put C back and repeat from step 1 many times
4. Return the fraction of heads observed (an estimate of P(H))
APPROXIMATE INFERENCE: MONTE-CARLO SIMULATION
Sample from the joint distribution
Sample: B=0, E=0, A=0, J=1, M=0
APPROXIMATE INFERENCE: MONTE-CARLO SIMULATION
As more samples are generated, the distribution of the samples approaches the joint distribution!
B=0, E=0, A=0, J=1, M=0
B=0, E=0, A=0, J=0, M=0
B=0, E=0, A=0, J=0, M=0
B=1, E=0, A=1, J=1, M=0
APPROXIMATE INFERENCE: MONTE-CARLO SIMULATION
Inference: given evidence E=e (e.g., J=1), remove the samples that conflict with it:
B=0, E=0, A=0, J=1, M=0   (kept)
B=0, E=0, A=0, J=0, M=0   (removed)
B=0, E=0, A=0, J=0, M=0   (removed)
B=1, E=0, A=1, J=1, M=0   (kept)
The distribution of the remaining samples approximates the conditional distribution!
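The procedure above (sample the joint top-down, discard conflicting samples) can be sketched for the alarm network. This is my own sketch, not the lecture's code; the CPT values are from the slides, and the evidence J=1 matches the example. The surviving samples are then used to estimate P(B=1 | J=1).

```python
import random

random.seed(0)
P_B, P_E = 0.001, 0.002
P_A = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}
P_J = {1: 0.90, 0: 0.05}   # P(J=1 | A)
P_M = {1: 0.70, 0: 0.01}   # P(M=1 | A)

def sample():
    """One top-down sample from the joint, parents before children."""
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    j = random.random() < P_J[a]
    m = random.random() < P_M[a]
    return b, e, a, j, m

# Keep only samples that agree with the evidence J=1
kept = [s for s in (sample() for _ in range(200_000)) if s[3]]
est = sum(s[0] for s in kept) / len(kept)   # estimate of P(B=1 | J=1)
print(len(kept), round(est, 4))
```

Note how few samples survive: the acceptance rate is P(J=1) ≈ 0.05, which foreshadows the rare-event problem discussed next.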
HOW MANY SAMPLES?
The error of the estimate, for n samples, is on average O(1/√n).
Variance-reduction techniques can improve the constant factor.
RARE-EVENT PROBLEM
What if some events are really rare (e.g., burglary & earthquake)?
The number of samples must be huge to get a reasonable estimate.
Solution: likelihood weighting
Enforce that each sample agrees with the evidence
While generating a sample, keep track of the ratio of
  (how likely the sampled value is to occur in the real world) to
  (how likely you were to generate the sampled value)
LIKELIHOOD WEIGHTING
Suppose the evidence is Alarm & MaryCalls. Sample B and E, each with P = 0.5.
Start: w = 1
LIKELIHOOD WEIGHTING
Suppose evidence Alarm & MaryCalls. Sample B, E with P = 0.5.
Sampled B=0, E=1:  w = P(¬B)P(E)/(0.5·0.5) = (0.999·0.002)/0.25 = 0.008
LIKELIHOOD WEIGHTING
Suppose evidence Alarm & MaryCalls. Sample B, E with P = 0.5.
B=0, E=1, A=1:  w = 0.008 · P(A|¬B,E) = 0.008 · 0.29 ≈ 0.0023
A=1 is enforced, and the weight is updated to reflect the likelihood that this occurs.
LIKELIHOOD WEIGHTING
Sample B = 0, E = 1, A = 1, M = 1, J = 1
w = 0.0016  (= 0.0023 × P(M=1|A=1) = 0.0023 × 0.70; J is not evidence, so J = 1 is sampled from P(J|A=1) and leaves w unchanged)
LIKELIHOOD WEIGHTING
Second sample: B = 0, E = 0
w = 3.988  (= P(B=0)/0.5 × P(E=0)/0.5 = 1.998 × 1.996)
LIKELIHOOD WEIGHTING
Sample B = 0, E = 0, A = 1
w = 0.004  (= 3.988 × P(A=1|B=0,E=0) = 3.988 × 0.001)
LIKELIHOOD WEIGHTING
Sample B = 0, E = 0, A = 1, M = 1, J = 1
w = 0.0028  (= 0.004 × P(M=1|A=1) = 0.004 × 0.70)
LIKELIHOOD WEIGHTING
Third sample: B = 1, E = 0, A = 1
w = 0.00375  (= P(B=1)/0.5 × P(E=0)/0.5 × P(A=1|B=1,E=0) = 0.002 × 1.996 × 0.94)
LIKELIHOOD WEIGHTING
Sample B = 1, E = 0, A = 1, M = 1, J = 1
w = 0.0026  (= 0.00375 × P(M=1|A=1) = 0.00375 × 0.70)
LIKELIHOOD WEIGHTING
Fourth sample: B = 1, E = 1, A = 1, M = 1, J = 1
w ≈ 5e-6  (= P(B=1)/0.5 × P(E=1)/0.5 × P(A=1|B=1,E=1) × P(M=1|A=1) = 0.002 × 0.004 × 0.95 × 0.70; negligible)
LIKELIHOOD WEIGHTING
With N = 4 samples:
B = 0, E = 1, A = 1, M = 1, J = 1: w = 0.0016
B = 0, E = 0, A = 1, M = 1, J = 1: w = 0.0028
B = 1, E = 0, A = 1, M = 1, J = 1: w = 0.0026
B = 1, E = 1, A = 1, M = 1, J = 1: w ≈ 0
Estimate: P(B|A,M) ≈ (0.0026 + 0) / (0.0016 + 0.0028 + 0.0026 + 0) ≈ 0.371
Exact inference gives P(B|A,M) = 0.375
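The worked example can be reproduced numerically. The sketch below hardcodes the slides' CPTs and the 0.5/0.5 proposal for B and E; the function names are illustrative, and the estimate converges to the exact posterior rather than matching the four-sample value above.

```python
import random

# CPTs from the slides: P(variable = True | parents).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def weighted_sample():
    """One likelihood-weighted sample given evidence A = 1, M = 1."""
    w = 1.0
    # B and E are drawn from a 0.5/0.5 proposal, so the weight starts as
    # (prior probability of the drawn value) / 0.5, as on the slides.
    b = random.random() < 0.5
    w *= (P_B if b else 1.0 - P_B) / 0.5
    e = random.random() < 0.5
    w *= (P_E if e else 1.0 - P_E) / 0.5
    w *= P_A[(b, e)]   # A = 1 is enforced: multiply in its likelihood
    w *= P_M[True]     # M = 1 is enforced likewise
    random.random() < P_J[True]  # J is sampled; it does not change w
    return b, w

def estimate_p_burglary(n):
    """Estimate P(B=1 | A=1, M=1) as a weighted average over n samples."""
    num = den = 0.0
    for _ in range(n):
        b, w = weighted_sample()
        num += w if b else 0.0
        den += w
    return num / den

print(estimate_p_burglary(200000))  # should be close to the exact 0.375
```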
RECAP
Efficient inference in BNs
Variable elimination
Approximate methods: Monte-Carlo sampling
NEXT LECTURE
Statistical learning: from data to distributions (R&N 20.1-2)