-
Probabilistic Graphical Models (Cmput 651): Hybrid Networks
Matthew Brown
24/11/2008
Reading: Handout on Hybrid Networks
(Ch. 13 from older version of Koller‐Friedman)
Cmput 651 - Hybrid Networks 24/11/2008
-
Space of topics
(grid of topics: Directed vs. Undirected; Discrete vs. Continuous; Semantics, Learning, Inference)
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
-
Linear Gaussian Bayesian networks (KF Definition 6.2.1)
Definition:
A linear Gaussian Bayesian network satisfies:
• all variables continuous
• all CPDs are linear Gaussians
Example (network: A, B, C are roots; D has parents A, B; E has parents C, D):
P(A) = N(µ_A, σ²_A)
P(B) = N(µ_B, σ²_B)
P(C) = N(µ_C, σ²_C)
P(D|A,B) = N(β_{D,0} + β_{D,1}A + β_{D,2}B, σ²_D)
P(E|C,D) = N(β_{E,0} + β_{E,1}C + β_{E,2}D, σ²_E)
-
Inference in linear Gaussian Bayes nets
Recall: linear Gaussian Bayes nets (LGBN) equivalent to multivariate Gaussian distribution
To marginalize, could convert the LGBN to a Gaussian; marginalization is trivial for a Gaussian
But this ignores structure. Example:
LGBN: 3n−1 parameters; Gaussian: n² + n parameters
bad for large n, eg: n > 1000
X1 ‐> X2 ‐> ... ‐> Xn (chain)
p(Xi | Xi−1) = N(βi + αi Xi−1; σ²_i)
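As an illustration of the equivalence, the chain's parameters can be folded into a single multivariate Gaussian by a forward recursion over the nodes. This is a sketch; the function name and the convention that `alpha[0]` is unused for the root are my own:

```python
import numpy as np

def chain_lgbn_to_gaussian(beta, alpha, sigma2):
    """Convert a chain-structured linear Gaussian Bayes net
    p(X_i | X_{i-1}) = N(beta_i + alpha_i * X_{i-1}; sigma2_i)
    (alpha[0] unused, since X_1 is a root) into the equivalent
    multivariate Gaussian N(mu, Sigma)."""
    n = len(beta)
    mu = np.zeros(n)
    Sigma = np.zeros((n, n))
    mu[0] = beta[0]
    Sigma[0, 0] = sigma2[0]
    for i in range(1, n):
        mu[i] = beta[i] + alpha[i] * mu[i - 1]
        # Cross-covariances: Cov(X_i, X_j) = alpha_i * Cov(X_{i-1}, X_j) for j < i
        Sigma[i, :i] = alpha[i] * Sigma[i - 1, :i]
        Sigma[:i, i] = Sigma[i, :i]
        # Variance: sigma2_i plus the propagated parent variance
        Sigma[i, i] = sigma2[i] + alpha[i] ** 2 * Sigma[i - 1, i - 1]
    return mu, Sigma
```

Note the asymmetry the slide points out: the chain needs 3n−1 numbers, while the dense mean and covariance it produces grow quadratically.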
-
Variable elimination
Marginalize out unwanted X using integration, rather than a sum as in the discrete case
Note:
Variable elimination gives exact answers for continuous nets
(not for hybrid nets)
-
Variable elimination example
(network: X1 and X2 are parents of X3; X3 is the parent of X4)
p(X4) = ∫_{X1,X2,X3} P(X1, X2, X3, X4)
      = ∫_{X1,X2,X3} P(X1) P(X2) P(X3|X1,X2) P(X4|X3)
      = ∫_{X1} P(X1) ∫_{X2} P(X2) ∫_{X3} P(X3|X1,X2) P(X4|X3)
Need a way to represent intermediate factors. Not Gaussian (eg: conditional probabilities are not jointly Gaussian).
Need elimination, product, etc. on this representation
-
Canonical forms (KF Handout Def’n 13.2.1)
Definition: a canonical form over X is
C(X; K, h, g) = exp(−½ XᵀKX + hᵀX + g)
Also written C(K, h, g) when the scope is clear.
-
Canonical forms and Gaussians (KF Handout 13.2.1)
Canonical forms can represent Gaussians: expanding N(µ, Σ) and matching terms gives
K = Σ⁻¹, h = Σ⁻¹µ, g = −½ µᵀΣ⁻¹µ − log((2π)^{n/2} |Σ|^{1/2})
So: N(µ, Σ) = C(Σ⁻¹, Σ⁻¹µ, g)
-
Canonical forms and Gaussians (KF Handout 13.2.1)
Canonical forms can represent:
Gaussians
other things (when K⁻¹ is not defined)
eg: linear Gaussian CPDs
Can also use conditional forms (multivariate linear Gaussian P(X|Y)) to represent linear Gaussian CPDs or Gaussians.
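The Gaussian-to-canonical correspondence can be sketched in code (hypothetical helper name; assumes Σ is invertible, which is exactly what fails for linear Gaussian CPDs):

```python
import numpy as np

def gaussian_to_canonical(mu, Sigma):
    """Express N(mu, Sigma) as a canonical form C(K, h, g), where
    C(x; K, h, g) = exp(-1/2 x^T K x + h^T x + g).
    Requires Sigma invertible; linear Gaussian CPDs correspond to
    canonical forms whose K has no inverse."""
    mu = np.asarray(mu, float)
    Sigma = np.asarray(Sigma, float)
    n = len(mu)
    K = np.linalg.inv(Sigma)          # K = Sigma^-1
    h = K @ mu                        # h = Sigma^-1 mu
    # g absorbs the Gaussian's normalization constant
    g = -0.5 * mu @ h - 0.5 * np.log((2 * np.pi) ** n * np.linalg.det(Sigma))
    return K, h, g
```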
-
Operations on canonical forms (KF Handout 13.2.2)
Factor product (over the same scope):
C(K1, h1, g1) · C(K2, h2, g2) = C(K1 + K2, h1 + h2, g1 + g2)
When scopes don't overlap, must extend them first.
Product of C(X; K1, h1, g1) and C(Y; K2, h2, g2):
1st: extend the first to scope (X, Y) by padding K1 and h1 with zero blocks for Y
similarly for the second (zero blocks for X)
product: add the extended K's and h's, and the g's, as above
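A minimal sketch of the product with scope extension, assuming scopes are lists of variable names (all names here are illustrative):

```python
import numpy as np

def extend(K, h, scope, union):
    """Pad K and h with zero blocks so the canonical form is over `union`."""
    idx = [union.index(v) for v in scope]
    K_big = np.zeros((len(union), len(union)))
    h_big = np.zeros(len(union))
    K_big[np.ix_(idx, idx)] = K
    h_big[idx] = h
    return K_big, h_big

def product(scope1, K1, h1, g1, scope2, K2, h2, g2):
    """Product of canonical forms: extend both to the union scope,
    then add componentwise: C(K1+K2, h1+h2, g1+g2)."""
    union = list(dict.fromkeys(scope1 + scope2))  # ordered union of scopes
    K1e, h1e = extend(np.asarray(K1, float), np.asarray(h1, float), scope1, union)
    K2e, h2e = extend(np.asarray(K2, float), np.asarray(h2, float), scope2, union)
    return union, K1e + K2e, h1e + h2e, g1 + g2
```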
-
Operations on canonical forms (KF Handout 13.2.2)
Factor division (for belief‐update message passing):
C(K1, h1, g1) / C(K2, h2, g2) = C(K1 − K2, h1 − h2, g1 − g2)
Note: multiplying or dividing by the vacuous canonical form C(0, 0, 0) has no effect.
-
Operations on canonical forms (KF Handout 13.2.2)
Marginalization:
given C(K, h, g) over the set of variables {X, Y}
want C(K′, h′, g′) over X with C(X; K′, h′, g′) = ∫_Y C(X, Y; K, h, g) dY
require K_YY positive definite so that the integral is finite
marginal:
K′ = K_XX − K_XY K_YY⁻¹ K_YX
h′ = h_X − K_XY K_YY⁻¹ h_Y
g′ = g + ½ (p log(2π) − log|K_YY| + h_Yᵀ K_YY⁻¹ h_Y), where p = dim(Y)
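The marginalization formulas above can be sketched as follows (the helper name is my own; variables are addressed by index into the canonical form's scope):

```python
import numpy as np

def marginalize(K, h, g, keep, drop):
    """Integrate the canonical form C(K, h, g) over the variables indexed
    by `drop`, keeping those indexed by `keep`. K[drop][:, drop] must be
    positive definite for the integral to be finite."""
    K = np.asarray(K, float)
    h = np.asarray(h, float)
    Kxx = K[np.ix_(keep, keep)]
    Kxy = K[np.ix_(keep, drop)]
    Kyy = K[np.ix_(drop, drop)]
    hy = h[drop]
    Kyy_inv = np.linalg.inv(Kyy)
    # Schur complement gives the marginal's precision block
    K_new = Kxx - Kxy @ Kyy_inv @ Kxy.T
    h_new = h[keep] - Kxy @ Kyy_inv @ hy
    g_new = g + 0.5 * (len(drop) * np.log(2 * np.pi)
                       - np.log(np.linalg.det(Kyy))
                       + hy @ Kyy_inv @ hy)
    return K_new, h_new, g_new
```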
-
Operations on canonical forms (KF Handout 13.2.2)
Conditioning:
given C(K, h, g) over the set of variables {X, Y}
want to condition on Y = y
‐> C(K′, h′, g′) over X with
K′ = K_XX, h′ = h_X − K_XY y, g′ = g + h_Yᵀ y − ½ yᵀ K_YY y
Notice: Y no longer part of canonical form after conditioning (unlike tables).
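Conditioning can be sketched the same way (illustrative helper; indices address the canonical form's variables). As the slide notes, the observed variables drop out of the scope entirely:

```python
import numpy as np

def condition(K, h, g, keep, obs_idx, y):
    """Reduce C(K, h, g) on evidence: the variables at obs_idx are
    observed as y. The result is a canonical form over `keep` only."""
    K = np.asarray(K, float)
    h = np.asarray(h, float)
    y = np.asarray(y, float)
    Kxx = K[np.ix_(keep, keep)]
    Kxy = K[np.ix_(keep, obs_idx)]
    Kyy = K[np.ix_(obs_idx, obs_idx)]
    # Evidence folds into the linear and constant terms
    h_new = h[keep] - Kxy @ y
    g_new = g + h[obs_idx] @ y - 0.5 * y @ Kyy @ y
    return Kxx, h_new, g_new
```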
-
Inference on linear Gaussian Bayesian nets (KF Handout 13.2.3)
Factor operations: simple, closed form
‐> Variable elimination
‐> Sum‐product message passing
‐> Belief‐update message passing
Note on conditioning: conditioned variables disappear from the canonical form,
unlike with factor reduction on table factors
‐> must restrict all factors relevant to inference based on the evidence Y = y before doing inference
-
Inference on linear Gaussian Bayesian nets (KF Handout 13.2.3)
Computational performance: canonical form operations are polynomial in the factor scope size n
product & division O(n2)
marginalization ‐> matrix inversion ≤ O(n3)
‐> inference in LGBNs: linear in # cliques, cubic in max. clique size
for discrete networks
factor operations on table factors exponential in scope size
-
Inference on linear Gaussian Bayesian nets (KF Handout 13.2.3)
Computational performance (cont'd):
‐ for low dimensionality (small # variables), the Gaussian representation can be more efficient
‐ for high dimensionality and low treewidth, message passing on the LGBN is much more efficient
-
Summary
Inference on linear Gaussian Bayesian nets: use canonical forms
variable elimination or clique tree calibration
exact
efficient
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
-
Hybrid networks (KF 5.5.1)
Hybrid networks combine discrete and continuous variables
-
Conditional linear Gaussian (CLG) models (KF 5.1)
Definition:
Given: continuous variable X with
discrete parents D = {D1, …, Dm}
continuous parents Y = {Y1, …, Yk}
X has a conditional linear Gaussian CPD
if for each assignment d to D,
∃ coefficients β_{d,0}, …, β_{d,k} and variance σ²_d
such that
p(X | d, y) = N(β_{d,0} + Σᵢ β_{d,i} yᵢ ; σ²_d)
-
Conditional linear Gaussian (CLG) models (KF 5.1)
Definition:
A Bayesian network is a
conditional linear Gaussian network
if:
• discrete nodes have only discrete parents
• continuous nodes have conditional linear Gaussian CPDs
‐ continuous parents cannot have discrete children.
‐ the joint over the continuous variables is a mixture (weighted average) of Gaussians
weight = probability of the discrete assignment
-
CLG example
(network: Country, Gender, and Height are parents of Weight)
Weight is CLG with continuous parent Height and discrete parents Country and Gender:
p(W | h, c, g) = N(β_{c,g,0} + β_{c,g,1} h ; σ²_{c,g})
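A sketch of sampling from this CLG CPD: the discrete assignment selects the coefficients, which are then applied to the continuous parent. The coefficient values below are made up for illustration, not taken from the slides:

```python
import numpy as np

# Hypothetical parameters: (country, gender) -> (beta0, beta1, sigma2)
params = {("Canada", "F"): (-50.0, 0.65, 20.0),
          ("Canada", "M"): (-60.0, 0.75, 25.0)}

def sample_weight(country, gender, height, params, rng=None):
    """Sample from p(W | h, c, g) = N(beta0 + beta1 * h; sigma2),
    looking up the linear coefficients by the discrete assignment."""
    rng = rng or np.random.default_rng(0)
    beta0, beta1, sigma2 = params[(country, gender)]
    return rng.normal(beta0 + beta1 * height, np.sqrt(sigma2))
```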
-
Discrete nodes with continuous parents
Option 1 ‐ hard threshold: eg: continuous X ‐> discrete Y
Y = 0 if X is below some threshold, Y = 1 otherwise
-
Linear sigmoid (Logistic or soft threshold)
p(Y = 1 | x) = exp(θᵀx) / (1 + exp(θᵀx))
(plot: P(Y = 1 | x) versus x, an S‐shaped curve)
-
Multivariate logit
Eg: stock trading: buy (red), hold (green), sell (blue) as a function of stock price, with
l_buy = −3(price − 18)
l_hold = 1
l_sell = 3(price − 22)
and P(trade = k | price) ∝ exp(l_k(price))
(plot: P(trade | price) versus Price; network: Price ‐> Trade)
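The three linear functions above, pushed through the multivariate logit (a softmax over the l_k), can be sketched as:

```python
import numpy as np

def trade_probs(price):
    """Multivariate logit for the stock-trading example:
    P(trade = k | price) proportional to exp(l_k(price))."""
    l = np.array([-3.0 * (price - 18.0),   # l_buy
                  1.0,                      # l_hold
                  3.0 * (price - 22.0)])    # l_sell
    e = np.exp(l - l.max())                 # subtract max for numerical stability
    return dict(zip(["buy", "hold", "sell"], e / e.sum()))
```

At low prices l_buy dominates, in the middle l_hold wins, and at high prices l_sell takes over, reproducing the three colored curves.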
-
Discrete node with discrete & continuous parents
The continuous parents' input is filtered through a multivariate logit
The assignment to the discrete parents determines the coefficients for the logit
-
Example hybrid net
stock trade (discrete) = {buy, hold, sell}; parents: price (continuous), strategy (discrete) = {1, 2}
strategy 1 (reddish): l_buy = −3(price − 18), l_hold = 1, l_sell = 3(price − 22)
strategy 2 (blue/green): l_buy = −3(price − 16), l_hold = 1, l_sell = 1(price − 26)
(plot: P(trade | price, strategy) versus Price; network: Price, Strategy ‐> Trade)
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
Issues:
Non‐linear dependencies in continuous nets
Discrete & continuous nodes: CLGs
General hybrid networks
-
Variable elimination example (Handout Example 13.1.1)
Discrete D1 … Dn; continuous X1 … Xn
p(D1 … Dn, X1 … Xn) = (∏_{i=1..n} p(Di)) p(X1|D1) ∏_{i=2..n} p(Xi|Di, Xi−1)
p(X2) = Σ_{D1,D2} ∫_{X1} p(D1, D2, X1, X2)
      = Σ_{D1,D2} ∫_{X1} p(D1) p(D2) p(X1|D1) p(X2|D2, X1)
      = Σ_{D2} p(D2) ∫_{X1} p(X2|D2, X1) Σ_{D1} p(X1|D1) p(D1)
‐> simple in principle (but see next slide)
-
Difficulties with inference in hybrid nets
1. must restrict the representation (i.e. the factors); this is implicit in the choice to use CLGs, for example
2. marginalization is difficult with arbitrary hybrid nets, especially with non‐linear dependencies among nodes
continuous parent ‐> discrete node requires non‐linearity!
3. intermediate factors hard to represent / work with
eg: mixture of Gaussians from conditional linear Gaussian (CLG) representation
‐> approximation necessary with hybrid nets
-
Difficult marginalization (KF Handout Example 13.1.3)
(network: Y ‐> X)
P(Y) = N(0; 1)
P(X|Y) = N(Y²; 1)   (X is non‐linear in Y)
p(x, y) = (1/Z) exp(−y² − (x − y²)²)
p(x) = ∫_y (1/Z) exp(−y² − (x − y²)²)
‐> No analytic (closed form) solution!
(plots: the joint and the resulting non‐Gaussian marginal)
-
Variable elimination example (Handout Example 13.1.2)
Discrete binary D1 … Dn; continuous X1 … Xn. Want P(X2).
p(X1|d1) = N(β_{1,d1}; σ²_{1,d1})
p(Xi|di, xi−1) = N(β_{i,di} + α_{i,di} xi−1; σ²_{i,di})
P(X1, X2) is a mixture of four Gaussians, one per assignment to {D1, D2}.
Can show P(X2) is also a mixture of four Gaussians: not trivial to represent and work with.
-
Discretization (KF Handout 13.1.3)
What about discretizing continuous variables?
Usually no: typically need a fine‐grained representation of continuous X,
i.e. a large # of bins,
especially where P(X) is large
but we need inference to find where P(X) is large in order to discretize efficiently,
which defeats the purpose
‐> # bins usually excessively huge, AND table factors suffer from the curse of dimensionality
exponential in |Val(X)|
-
Summary
Inference in hybrid networks
Difficulties with variable elimination:
from non‐linear dependencies
‐> non‐Gaussian intermediate factors
from mixing discrete & continuous variables
‐> mixtures of Gaussians
General approach = approximate difficult intermediate factors with Gaussians
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
Issues:
Non‐linear dependencies in continuous nets
Discrete & continuous nodes: CLGs
General hybrid networks
-
Approximating intermediate factors in VE (KF Handout 13.3.1)
General approach: during variable elimination, when a difficult intermediate factor is encountered, approximate it with a Gaussian
BUT Gaussians cannot represent:
conditional distributions (CPDs)
general (unnormalized) factors
‐> must make sure to approximate only valid distributions with Gaussians
eg: to eliminate X from P(X|Y), must first multiply in a factor P(Y) to give P(X, Y)
‐> CPDs must be multiplied into factors in a topological ordering,
i.e. an ordering with parents always before children
-
Example (KF Handout Example 13.3.2)
Cliques: C1 = {X, Y, Z}, C2 = {Z, W}. Want P(Z | W = w1).
Variable elimination:
Step 0: initialize all cliques to the vacuous canonical form C(0, 0, 0)
i.e. the initial potentials are not the product of the initial factors
‐> C1's initial factors: P(X), P(Y), P(Z|X,Y)
-
Example ‐ cont’d (KF Handout Example 13.3.2)
Cliques: C1 = {X, Y, Z}, C2 = {Z, W}. Want P(Z | W = w1).
Variable elimination:
Step 1: linearize P(X), i.e. approximate it with a Gaussian
represent it as a canonical form
then multiply it into C1's potential (C(0, 0, 0) initially)
Step 2: same for P(Y)
(could equally do P(Y) in step 1, then P(X))
‐> C1's potential = P̂(X, Y)
-
Example ‐ cont’d (KF Handout Example 13.3.2)
Cliques: C1 = {X, Y, Z}, C2 = {Z, W}. Want P(Z | W = w1).
Variable elimination: C1 has P̂(X, Y)
Step 3:
estimate P̂(X, Y, Z) ≈ P(X, Y, Z) = P(X, Y) P(Z|X, Y), with P̂(X, Y, Z) ~ N(·)
(represented as a canonical form)
Note: P̂(X, Y) P(Z|X, Y) is a valid distribution
eliminate X, Y: P̂(Z) = ∫_{X,Y} P̂(X, Y, Z)
pass P̂(Z) as the message to C2
-
Example ‐ cont’d (KF Handout Example 13.3.2)
Cliques: C1 = {X, Y, Z}, C2 = {Z, W}. Want P(Z | W = w1).
Variable elimination: C2 has P̂(Z)
Step 4:
estimate P̂(W, Z) ≈ P(W, Z) = P(Z) P(W|Z), with P̂(W, Z) ~ N(·)
(represented as a canonical form)
Note: P̂(Z) P(W|Z) is a valid distribution
Step 5:
set W = w1, giving P̂(W = w1, Z)
pass the message to C1 (a canonical form)
Step 6:
P̂(Z|W = w1) = P̂(W = w1, Z) / P̂(Z)
(division by the earlier message P̂(Z), as in belief‐update message passing)
-
Definition (KF Handout Def’n 13.3.1)
Definition: A clique tree T with a root clique Cr allows topological incorporation if, for any variable X, the clique to which X's CPD is assigned is upstream of, or equal to, the cliques to which X's parents' CPDs are assigned.
-
Approximating with Gaussians (KF Handout 13.3.2, 13.3.3)
Local approximations:
Taylor series
Numerical integration
Global approximation
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
Issues:
Non‐linear dependencies in continuous nets
Discrete & continuous nodes: CLGs
General hybrid networks
-
Inference in general hybrid nets (KF Handout 13.4.1)
NP‐hard, even for polytrees:
mixture of exponentially many Gaussians
(one per assignment to the discrete variables)
eg: 2ⁿ assignments for n binary variables
even in the easiest case:
continuous nodes with at most one discrete binary parent,
i.e. a mixture of at most two Gaussians
even for the easiest approximate inference,
on discrete binary nodes with relative error
-
Canonical tables (KF Handout Def’n 13.4.3)
Definition:
A canonical table ϕ over discrete D and continuous X has entries ϕ(d):
one per assignment D = d
entry ϕ(d) = canonical form C(X; K_d, h_d, g_d)
Can represent:
table factors
linear Gaussians
CLGs
-
Canonical table example
(network: Country, Gender, and Height are parents of Weight)
discrete: Country, Gender; continuous: Height, Weight
Female Male
Canada C(KCan,F,hCan,F,gCan,F) C(KCan,M,hCan,M,gCan,M)
USA C(KUSA,F,hUSA,F,gUSA,F) C(KUSA,M,hUSA,M,gUSA,M)
China C(KChi,F,hChi,F,gChi,F) C(KChi,M,hChi,M,gChi,M)
India C(KInd,F,hInd,F,gInd,F) C(KInd,M,hInd,M,gInd,M)
Germany C(KGer,F,hGer,F,gGer,F) C(KGer,M,hGer,M,gGer,M)
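A canonical table like the one above can be sketched as a dictionary from discrete assignments to (K, h, g) triples; the parameter values here are placeholders, not fitted numbers:

```python
import numpy as np

# One canonical form C(K, h, g) per assignment of the discrete parents
# (country, gender); the continuous scope is (height, weight).
canonical_table = {
    ("Canada", "F"): (np.eye(2), np.zeros(2), 0.0),
    ("Canada", "M"): (np.eye(2), np.zeros(2), 0.0),
    ("USA", "F"): (np.eye(2), np.zeros(2), 0.0),
    ("USA", "M"): (np.eye(2), np.zeros(2), 0.0),
}

def table_product(t1, t2):
    """Product of canonical tables over the same discrete scope:
    multiply entrywise, i.e. add (K, h, g) componentwise."""
    return {d: (t1[d][0] + t2[d][0],
                t1[d][1] + t2[d][1],
                t1[d][2] + t2[d][2])
            for d in t1}
```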
-
Operations on canonical tables (KF Handout 13.4.2.1)
Extensions of canonical form operations:
Product
Division
Marginalization over continuous variables
Marginalization over discrete variables
‐> factor not necessarily representable with a canonical table
‐> approximate with Gaussians (in the form of a canonical table) whenever marginalizing
(see next slide)
-
Marginalization example (KF Handout 13.4.5)
Binary D, continuous X; canonical table with one entry per value of D
Two Gaussians (blue, green); red: their sum (marginalization over D)
‐> not Gaussian!
cannot be represented by a canonical table (see next slide)
-
Marginalization example ‐ cont’d (KF Handout 13.4.5)
Binary D, continuous X; canonical table with one entry per value of D
Two Gaussians (blue, green); red: Gaussian approximation to the sum of blue and green
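This Gaussian approximation by moment matching can be sketched for a one-dimensional mixture (the helper name is my own; the matched Gaussian keeps the mixture's exact first two moments):

```python
import numpy as np

def weak_marginal(weights, means, variances):
    """Collapse a 1-D mixture of Gaussians to a single Gaussian by
    moment matching, as in the weak marginalization step used when a
    discrete variable is summed out of a canonical table."""
    w = np.asarray(weights, float)
    mu = np.asarray(means, float)
    var = np.asarray(variances, float)
    m = w @ mu                           # overall mean
    v = w @ (var + mu ** 2) - m ** 2     # law of total variance
    return m, v
```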
-
Marginalization on canonical tables (KF Handout 13.4.2.1)
Weak marginalization:
approximate the marginal as a Gaussian
necessary when marginalizing across a mixture of Gaussians
Note: the canonical table MUST represent a valid mixture
Strong marginalization:
exact; applies when:
marginalizing out continuous variables only
the factor is over discrete variables only
the entries are identical canonical forms
-
Inference in hybrid nets (KF Handout 13.4.2.2)
Cannot always marginalize out discrete variables
‐> must restrict the elimination order
KF Handout Example 13.4.10: A, B, C discrete; X, Y, Z continuous
possible clique tree:
neither leaf clique can start message passing
eg: {B, X, Y} has CPDs for P(B), P(Y|B,X) but not P(X)
‐> the canonical forms over {X, Y} are linear Gaussian CPDs, not Gaussians ‐> cannot marginalize out B
-
Strong rooted clique trees
Definition: A clique Cr in a clique tree is a strong root if, for each clique C1 and its upstream neighbour C2,
C1 − C2 ⊆ {continuous variables}, or
C1 ∩ C2 ⊆ {discrete variables}
In a strongly rooted clique tree, the upward pass toward the strong root does not require any weak marginalization;
in the downward pass, all required factors are present for weak marginalization to proceed.
Example ‐ strongly rooted clique tree (from the example on the previous slide):
the middle clique is the strong root
-
Strong root
Sometimes there exist non‐strongly‐rooted clique trees that still allow inference
(example: refer to the example two slides previous)
Also, there is the issue of building strongly rooted trees: see KF Handout 13.4.2.4
-
Outline
Inference in purely continuous nets
Hybrid network semantics
Inference in hybrid networks
Issues:
Non‐linear dependencies in continuous nets
Discrete & continuous nodes: CLGs
General hybrid networks
-
Inference in general hybrid nets (KF Handout 13.4.3)
Two issues:
non‐linear dependencies
intermediate factors
‐> marginalization on canonical tables ‐> a non‐canonical tabular factor
solution: approximate with Gaussians (in the form of canonical tables)
‐> applies to both issues, as discussed above
‐> allows discrete nodes with continuous parents, eg: can model a thermostat
-
Approximate methods
Above, discussed variable‐elimination‐based methods
Also:
particle based (KF Handout 13.5)
global approximate methods