lecture 4: variable elimination - github pages · lecture 4: variable elimination theo rekatsinas...

Post on 30-May-2020


CS839: Probabilistic Graphical Models

Lecture 4: Variable Elimination

Theo Rekatsinas

Recap: P-maps, BNs, MRFs

• A DAG G is a perfect map (P-map) for a distribution P if I(P) = I(G)

• There are independence assumptions that can be captured by a BN and not by an MRF, and vice versa

Partially Directed Acyclic Graphs

• A.k.a. chain graphs
• Nodes can be partitioned into disjoint chain components
• An edge within the same chain component must be undirected
• An edge between two nodes in different chain components must be directed

Probabilistic Inference and Learning

• We discussed compact representations of probability distributions: Graphical Models (GMs)
• A GM (fully parameterized) M describes a unique probability distribution P
• Tasks we can perform with a GM:
  • Task 1 (Inference): What is the probability P_M(X | Y)? Y: evidence
  • Task 2 (Learning): How do we estimate a plausible model M from data D?

• Learning refers to obtaining a point estimate of M
• Bayesians are looking for P(M | D), which is actually an inference problem.
• Missing data: to compute a point estimate of M we need to perform inference to impute the missing data.

Likelihood

• Most queries involve evidence
• Evidence e is an assignment of values to a set E of variables in the domain
• Without loss of generality E = {X_{k+1}, …, X_n}

• Example: compute the probability of evidence e

$$P(e) = \sum_{x_1} \cdots \sum_{x_k} P(x_1, \ldots, x_k, e)$$

• A.k.a. compute the likelihood of e
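The likelihood sum can be made concrete with a small brute-force sketch. The joint table below is hypothetical (three binary variables with made-up probabilities), and the helper name `likelihood` is an assumption for illustration; it just sums the joint over every assignment to the non-evidence variables.

```python
from itertools import product

# Hypothetical joint distribution over three binary variables (X1, X2, X3),
# stored as {(x1, x2, x3): probability}. The numbers are made up.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}

def likelihood(joint, evidence, n_hidden, domain=(0, 1)):
    """P(e): sum the joint over every assignment to the first n_hidden variables."""
    return sum(joint[hidden + evidence]
               for hidden in product(domain, repeat=n_hidden))

# Evidence E = {X3} with e = (1,); X1 and X2 are summed out.
p_e = likelihood(joint, evidence=(1,), n_hidden=2)
print(p_e)
```

The loop visits every assignment to the hidden variables, which is exactly the exponential enumeration that variable elimination tries to avoid.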

Conditional Probability

• Often we are interested in the conditional probability distribution of a variable given the evidence:

$$P(X \mid e) = \frac{P(X, e)}{P(e)} = \frac{P(X, e)}{\sum_x P(X = x, e)}$$

• A.k.a. a posteriori belief in X, given evidence e
• Most of the time we only care about a subset Y of all domain variables X = {Y, Z} and do not care about the remaining Z
• The process of summing out the "do not care" variables is called marginalization, and the resulting probability is called a marginal probability

$$P(Y \mid e) = \sum_z P(Y, Z = z \mid e)$$
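As a rough illustration of the two formulas above, the sketch below computes P(X1 | X3 = 1) from a hypothetical joint table: X2 plays the role of the "do not care" variable and is summed out, and the result is normalized by P(e). The table values and the function name are assumptions made for this example.

```python
# Hypothetical joint over binary (X1, X2, X3); the numbers are made up.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.20, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}

def posterior_x1_given_x3(joint, x3):
    """P(X1 | X3 = x3): marginalize X2, then normalize by P(X3 = x3)."""
    unnormalized = {}
    for (x1, _x2, x3_val), p in joint.items():
        if x3_val == x3:  # keep only entries consistent with the evidence
            unnormalized[x1] = unnormalized.get(x1, 0.0) + p  # sum out X2
    p_e = sum(unnormalized.values())  # P(e) = sum_x P(X = x, e)
    return {x1: p / p_e for x1, p in unnormalized.items()}

post = posterior_x1_given_x3(joint, x3=1)
print(post)
```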

Applications of a posteriori belief

• Prediction: what is the probability of an outcome given the starting condition
  • The query node is a descendant of the evidence

• Diagnosis: what is the probability of disease/fault given symptoms
  • The query node is an ancestor of the evidence
• Learning under partial observations
  • Expectation maximization algorithm (later in class)

[Figures: two diagrams over nodes A, B, C, each with a query marked "?"]

MPA: Most Probable Assignment

• What is the most probable joint assignment (MPA) for some variables of interest?

• We are again given some evidence e and we ignore (the values of) "do not care" variables z

$$\mathrm{MPA}(Y \mid e) = \arg\max_y P(y \mid e) = \arg\max_y \sum_z P(y, z \mid e)$$

• A.k.a. maximum a posteriori assignment of y (MAP inference)

Applications of MPA

• Classification: find the most likely label, given evidence
• Explanation: what is the most likely scenario, given the observed evidence

• The MAP state of a variable depends on the set of variables that are jointly queried

• Example:
  • MAP of Y1
  • MAP of (Y1, Y2)

Y1   Y2   P(Y1, Y2)
0    0    0.05
0    1    0.35
1    0    0.3
1    1    0.3
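The table above can be checked directly. The short sketch below uses the four probabilities from the slide and shows that the MAP value of Y1 on its own differs from the value Y1 takes in the joint MAP of (Y1, Y2); the variable names are chosen for illustration.

```python
# Joint table from the slide: P(Y1, Y2).
joint = {(0, 0): 0.05, (0, 1): 0.35, (1, 0): 0.3, (1, 1): 0.3}

# MAP of Y1 alone: marginalize Y2 first, then maximize.
marginal_y1 = {}
for (y1, _y2), p in joint.items():
    marginal_y1[y1] = marginal_y1.get(y1, 0.0) + p
map_y1 = max(marginal_y1, key=marginal_y1.get)

# MAP of (Y1, Y2) jointly: maximize over the full table.
map_y1_y2 = max(joint, key=joint.get)

print(map_y1)     # 1, because P(Y1=1) = 0.6 exceeds P(Y1=0) = 0.4
print(map_y1_y2)  # (0, 1), because 0.35 is the largest joint entry
```

So Y1 = 1 is the most probable value when Y1 is queried alone, while the most probable joint assignment sets Y1 = 0.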

Complexity of Inference

• Thm: Computing P(X = x | e) in a GM is NP-hard

• However, for many practical instances of GMs we can solve inference efficiently (in polynomial time)
  • NP-hardness implies that (unless P = NP) we cannot find a general procedure that works efficiently for arbitrary GMs
  • For certain GM families we have provably efficient exact inference procedures.

Approaches to Inference

• Exact inference
  • Variable elimination
  • Message-passing algorithm (sum-product, belief propagation)
  • Junction tree algorithms

• Approximate inference techniques
  • Stochastic simulation / sampling methods
  • Markov chain Monte Carlo methods
  • Variational algorithms

Operations: Marginalization and Elimination

• Consider the following GM:

[Figure: chain A → B → C → D → E]

• What is the likelihood that E is true?

• Query: P(e)

$$P(e) = \sum_d \sum_c \sum_b \sum_a P(a, b, c, d, e)$$


A naïve summation needs to enumerate over an exponential number of terms.

• Chain decomposition

$$P(e) = \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d)$$

Elimination on Chains

• Consider the same chain GM and the same query: what is the likelihood that E is true?

• Reorder terms

$$P(e) = \sum_d \sum_c \sum_b \sum_a P(a)\, P(b \mid a)\, P(c \mid b)\, P(d \mid c)\, P(e \mid d)$$

$$= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a)$$

• Perform the innermost summation

• This summation eliminates one variable (here A) from our summation argument at a local cost

$$P(e) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d) \sum_a P(a)\, P(b \mid a)$$

$$= \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b)$$

• Perform the innermost summation again, now over b

$$P(e) = \sum_d \sum_c \sum_b P(c \mid b)\, P(d \mid c)\, P(e \mid d)\, p(b)$$

$$= \sum_d \sum_c P(d \mid c)\, P(e \mid d) \sum_b P(c \mid b)\, p(b)$$

$$= \sum_d \sum_c P(d \mid c)\, P(e \mid d)\, p(c)$$

• Eliminate nodes one-by-one all the way to the end

$$P(e) = \sum_d P(e \mid d)\, p(d)$$

• Complexity:
  • For each step we have O(|Dom(X_i)| · |Dom(X_{i+1})|) operations; for a chain of k variables with n values each this is O(kn^2) in total
  • Compare with the naïve O(n^k)
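The chain elimination can be sketched in a few lines of plain Python. The CPT values below are hypothetical, and the names `eliminate_chain` and `brute_force` are assumptions for this example; the first sums out one node at a time as on the slides, the second enumerates every joint assignment for comparison.

```python
from itertools import product

# Hypothetical CPTs for a binary chain A -> B -> C -> D -> E (numbers made up).
# prior[a] = P(A = a); each cpt[parent][child] = P(child | parent).
prior = [0.6, 0.4]
cpts = [
    [[0.7, 0.3], [0.2, 0.8]],  # P(B | A)
    [[0.9, 0.1], [0.4, 0.6]],  # P(C | B)
    [[0.5, 0.5], [0.1, 0.9]],  # P(D | C)
    [[0.8, 0.2], [0.3, 0.7]],  # P(E | D)
]

def eliminate_chain(prior, cpts):
    """Marginal of the last node, summing out one node at a time."""
    message = prior  # p(a)
    for cpt in cpts:
        n_child = len(cpt[0])
        # p(child) = sum_parent P(child | parent) * p(parent)
        message = [
            sum(cpt[parent][child] * message[parent]
                for parent in range(len(message)))
            for child in range(n_child)
        ]
    return message

def brute_force(prior, cpts):
    """Same marginal by enumerating every joint assignment (exponential)."""
    n_vals = len(prior)
    marginal = [0.0] * n_vals
    for assignment in product(range(n_vals), repeat=len(cpts) + 1):
        p = prior[assignment[0]]
        for i, cpt in enumerate(cpts):
            p *= cpt[assignment[i]][assignment[i + 1]]
        marginal[assignment[-1]] += p
    return marginal

p_e = eliminate_chain(prior, cpts)
print(p_e)
```

Each pass through the loop in `eliminate_chain` touches one small table, while `brute_force` visits every joint assignment; both return the same marginal over E.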

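Undirected Chains

[Figure: undirected chain A - B - C - D - E]

• Rearranging terms…

$$P(e) = \sum_d \sum_c \sum_b \sum_a \frac{1}{Z}\, \phi(b, a)\, \phi(c, b)\, \phi(d, c)\, \phi(e, d)$$

$$= \frac{1}{Z} \sum_d \sum_c \sum_b \phi(c, b)\, \phi(d, c)\, \phi(e, d) \sum_a \phi(b, a)$$

$$= \ldots$$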
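The same rearrangement works with unnormalized potentials. The sketch below assumes hypothetical pairwise potentials on a binary undirected chain, indexed as `phi[u][v]` for the left and right endpoint of each edge (an arbitrary convention chosen here); it eliminates nodes left to right and recovers the partition function Z as the sum of the final message.

```python
from itertools import product

# Hypothetical unnormalized potentials on A - B - C - D - E (numbers made up).
potentials = [
    [[2.0, 1.0], [1.0, 3.0]],  # phi(a, b)
    [[1.0, 4.0], [2.0, 1.0]],  # phi(b, c)
    [[3.0, 1.0], [1.0, 2.0]],  # phi(c, d)
    [[1.0, 2.0], [5.0, 1.0]],  # phi(d, e)
]

def eliminate_undirected_chain(potentials):
    """Return (P(E), Z) by summing out A, B, C, D in turn."""
    n_vals = len(potentials[0])
    message = [1.0] * n_vals  # nothing lies to the left of A
    for phi in potentials:
        message = [
            sum(phi[u][v] * message[u] for u in range(n_vals))
            for v in range(n_vals)
        ]
    z = sum(message)  # summing out the last node too gives Z
    return [m / z for m in message], z

def brute_force_z(potentials):
    """Partition function by full enumeration, for comparison only."""
    n_vals = len(potentials[0])
    total = 0.0
    for asg in product(range(n_vals), repeat=len(potentials) + 1):
        weight = 1.0
        for i, phi in enumerate(potentials):
            weight *= phi[asg[i]][asg[i + 1]]
        total += weight
    return total

p_e, z = eliminate_undirected_chain(potentials)
print(p_e, z)
```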

Conditional Random Fields

The Sum-Product Operation

• In general, we want to compute the value of an expression of the form:

$$\sum_z \prod_{\phi \in F} \phi$$

• where F is a set of factors

• We call this task the sum-product inference task.

Inference via Variable Elimination

• General idea:
  • Write the query in the form

$$P(X_1, e) = \sum_{x_n} \cdots \sum_{x_3} \sum_{x_2} \prod_i P(x_i \mid pa_i)$$

  • Iteratively
    • Move all irrelevant terms outside of the innermost sum
    • Perform the innermost sum, getting a new term
    • Insert the new term into the product

• Back to the original query

$$P(X_1 \mid e) = \frac{\phi(X_1, e)}{\sum_{x_1} \phi(X_1 = x_1, e)}$$

Outcome of elimination

• Let X be some set of variables
• Let F be a set of factors such that for each φ ∈ F, Scope[φ] ⊆ X
• Let Y ⊂ X be a set of query variables and Z = X - Y be the variables to be eliminated

• The result of eliminating Z is a factor

$$\tau(Y) = \sum_z \prod_{\phi \in F} \phi$$

• This does not necessarily correspond to any probability or conditional probability

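Evidence and Sum-Product

• Evidence potential

$$\delta(E_i, e_i) = \begin{cases} 1 & \text{if } E_i = e_i \\ 0 & \text{otherwise} \end{cases}$$

• Total evidence potential

$$\delta(E, e) = \prod_{i \in I_E} \delta(E_i, e_i)$$

• Introducing evidence (restricted factors):

$$\tau(Y, e) = \sum_{z, e} \prod_{\phi \in F} \phi \cdot \delta(E, e)$$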

Variable Elimination Algorithm

Procedure Elimination(
  G, // the GM
  E, // evidence
  Z, // set of variables to be eliminated
  X  // query variable(s)
)

1. Initialize(G)
2. Evidence(E)
3. Sum-Product-Elimination(F, Z, ≺)
4. Normalization(F)

Procedure Initialize(G, Z)
1. Let Z_1, …, Z_k be an ordering of Z such that Z_i ≺ Z_j iff i < j
2. Initialize F with the full set of factors

Procedure Evidence(E)
1. For each i ∈ I_E, F = F ∪ δ(E_i, e_i)

Procedure Sum-Product-Variable-Elimination(F, Z, ≺)

1. For i = 1, …, k
     F ← Sum-Product-Eliminate-Var(F, Z_i)
2. φ* ← ∏_{φ ∈ F} φ
3. return φ*
4. Normalization(φ*)

Procedure Sum-Product-Eliminate-Var(F, Z)

1. F′ ← {φ ∈ F : Z ∈ Scope[φ]}
2. F″ ← F - F′
3. ψ ← ∏_{φ ∈ F′} φ
4. τ ← Σ_Z ψ
5. return F″ ∪ {τ}
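The procedures above can be sketched in plain Python. This is one possible rendering under stated assumptions, not the course's reference code: a factor is represented as a `(scope, table)` pair, all variables are binary, and the helper names (`multiply`, `sum_out`, `eliminate_var`, `variable_elimination`) as well as the three-node chain used to exercise them are hypothetical.

```python
from itertools import product

DOMAIN = (0, 1)  # assumption: every variable is binary

def multiply(f, g):
    """Factor product; the result's scope is the union of both scopes."""
    (fs, ft), (gs, gt) = f, g
    scope = fs + tuple(v for v in gs if v not in fs)
    table = {}
    for values in product(DOMAIN, repeat=len(scope)):
        asg = dict(zip(scope, values))
        table[values] = (ft[tuple(asg[v] for v in fs)]
                         * gt[tuple(asg[v] for v in gs)])
    return scope, table

def sum_out(f, var):
    """Marginalize var out of factor f."""
    scope, table = f
    idx = scope.index(var)
    new_table = {}
    for values, p in table.items():
        key = values[:idx] + values[idx + 1:]
        new_table[key] = new_table.get(key, 0.0) + p
    return scope[:idx] + scope[idx + 1:], new_table

def eliminate_var(factors, z):
    """Sum-Product-Eliminate-Var: F' mentions z, F'' is the rest."""
    f_prime = [f for f in factors if z in f[0]]
    f_rest = [f for f in factors if z not in f[0]]
    if not f_prime:
        return f_rest
    psi = f_prime[0]
    for f in f_prime[1:]:
        psi = multiply(psi, f)
    return f_rest + [sum_out(psi, z)]  # F'' union {tau}

def variable_elimination(factors, order):
    """Eliminate variables in the given order, then multiply what remains."""
    for z in order:
        factors = eliminate_var(factors, z)
    phi_star = factors[0]
    for f in factors[1:]:
        phi_star = multiply(phi_star, f)
    return phi_star

# Hypothetical chain A -> B -> C with made-up CPTs.
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_a = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_c_b = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})

scope, table = variable_elimination([p_a, p_b_a, p_c_b], order=["A", "B"])
print(scope, table)
```

With no evidence the remaining factor is already the marginal P(C); with evidence, the result would still need the normalization step.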

Procedure Normalization(φ*)

1. P(X | E) = φ*(X) / Σ_x φ*(x)

Variable Elimination Example

Query: P(A | h)
• Need to eliminate: B, C, D, E, F, G, H

Initial factors: P(A) P(B) P(C|B) P(D|A) P(E|C,D) P(F|A) P(G|E) P(H|E,F)

Step 1:
• Conditioning on evidence (fix H to h)

$$p_H(E, F) = P(H = h \mid E, F)$$

• Same as a marginalization step:

$$p_H(E, F) = \sum_{h'} P(H = h' \mid E, F)\, \delta(h' = h)$$

=> P(A) P(B) P(C|B) P(D|A) P(E|C,D) P(F|A) P(G|E) p_H(E,F)

Step 2: Eliminate G

$$p_G(E) = \sum_g P(G = g \mid E) = 1$$

=> P(A) P(B) P(C|B) P(D|A) P(E|C,D) P(F|A) p_G(E) p_H(E,F)
=> P(A) P(B) P(C|B) P(D|A) P(E|C,D) P(F|A) p_H(E,F)

Step 3: Eliminate F

$$p_F(A, E) = \sum_f P(F = f \mid A)\, p_H(E, F = f)$$

=> P(A) P(B) P(C|B) P(D|A) P(E|C,D) p_F(A,E)

Step 4: Eliminate E

$$p_E(A, C, D) = \sum_e P(E = e \mid C, D)\, p_F(A, E = e)$$

=> P(A) P(B) P(C|B) P(D|A) p_E(A,C,D)

Step 5: Eliminate D

$$p_D(A, C) = \sum_d P(D = d \mid A)\, p_E(A, C, D = d)$$

=> P(A) P(B) P(C|B) p_D(A,C)

Step 6: Eliminate C

$$p_C(A, B) = \sum_c P(C = c \mid B)\, p_D(A, C = c)$$

=> P(A) P(B) p_C(A,B)

Step 7: Eliminate B

$$p_B(A) = \sum_b P(B = b)\, p_C(A, B = b)$$

=> P(A) p_B(A)

Step 8: Wrap-up

$$P(A, h) = P(A)\, p_B(A), \qquad P(h) = \sum_a P(A = a)\, p_B(A = a)$$

$$P(A \mid h) = \frac{P(A, h)}{P(h)}$$
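The eight steps can be replayed numerically. The sketch below is only an illustration under assumptions: all variables are binary, the CPT entries are random made-up numbers (fixed seed), the evidence is taken to be H = 1, and the helper names are hypothetical. It follows the slides' elimination order G, F, E, D, C, B and compares the result with a brute-force posterior.

```python
import random
from itertools import product

DOMAIN = (0, 1)
PARENTS = {"A": (), "B": (), "C": ("B",), "D": ("A",),
           "E": ("C", "D"), "F": ("A",), "G": ("E",), "H": ("E", "F")}
rng = random.Random(0)  # arbitrary seed; CPT entries are made up

def random_cpt(var):
    """Factor (scope, table) for P(var | parents) with random entries."""
    table = {}
    for pa in product(DOMAIN, repeat=len(PARENTS[var])):
        p1 = rng.uniform(0.1, 0.9)
        table[pa + (0,)], table[pa + (1,)] = 1.0 - p1, p1
    return PARENTS[var] + (var,), table

def multiply(f, g):
    (fs, ft), (gs, gt) = f, g
    scope = fs + tuple(v for v in gs if v not in fs)
    table = {}
    for values in product(DOMAIN, repeat=len(scope)):
        asg = dict(zip(scope, values))
        table[values] = (ft[tuple(asg[v] for v in fs)]
                         * gt[tuple(asg[v] for v in gs)])
    return scope, table

def sum_out(f, var):
    scope, table = f
    idx = scope.index(var)
    out = {}
    for values, p in table.items():
        key = values[:idx] + values[idx + 1:]
        out[key] = out.get(key, 0.0) + p
    return scope[:idx] + scope[idx + 1:], out

def restrict(f, var, value):
    """Condition on var = value: keep matching rows, drop var from scope."""
    scope, table = f
    idx = scope.index(var)
    return (scope[:idx] + scope[idx + 1:],
            {k[:idx] + k[idx + 1:]: p for k, p in table.items()
             if k[idx] == value})

def eliminate(factors, var):
    used = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    psi = used[0]
    for f in used[1:]:
        psi = multiply(psi, f)
    return rest + [sum_out(psi, var)]

cpts = {v: random_cpt(v) for v in PARENTS}
h = 1  # assumed observed value of H

# Step 1: condition on the evidence H = h.
factors = [restrict(f, "H", h) if v == "H" else f for v, f in cpts.items()]
# Steps 2-7: eliminate in the order used on the slides.
for var in ["G", "F", "E", "D", "C", "B"]:
    factors = eliminate(factors, var)
# Step 8: multiply the remaining factors over A and normalize.
phi = factors[0]
for f in factors[1:]:
    phi = multiply(phi, f)
p_h = sum(phi[1].values())
posterior = {a: phi[1][(a,)] / p_h for a in DOMAIN}

def brute_force_posterior(h):
    """P(A | H = h) by enumerating all 2^8 joint assignments."""
    order = list(PARENTS)
    unnorm = {a: 0.0 for a in DOMAIN}
    for values in product(DOMAIN, repeat=len(order)):
        asg = dict(zip(order, values))
        if asg["H"] != h:
            continue
        p = 1.0
        for v in order:
            scope, table = cpts[v]
            p *= table[tuple(asg[u] for u in scope)]
        unnorm[asg["A"]] += p
    total = sum(unnorm.values())
    return {a: p / total for a, p in unnorm.items()}

print(posterior)
```

The sketch keeps the all-ones factor p_G(E) in the factor list rather than dropping it, which leaves the result unchanged.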

Complexity of variable elimination

• Suppose in one elimination step we compute

$$p_x(y_1, \ldots, y_k) = \sum_x p'_x(x, y_1, \ldots, y_k)$$

$$p'_x(x, y_1, \ldots, y_k) = \prod_{i=1}^{k} p_i(x, y_{c_i})$$

• This requires $k \cdot |Val(X)| \cdot \prod_i |Val(Y_{C_i})|$ multiplications
  • For each value of x, y_1, …, y_k we do k multiplications

• $|Val(X)| \cdot \prod_i |Val(Y_{C_i})|$ additions
  • For each value of y_1, …, y_k we do |Val(X)| additions


Complexity is exponential in the number of variables in the intermediate factor.
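The operation counts can be tallied with a small helper. The function below is a hypothetical sketch that follows the slide's counting convention (k multiplications per entry of the intermediate table, |Val(X)| additions per entry of the resulting factor); it counts each distinct neighboring variable once, which matches the slide's product when the factor scopes do not overlap.

```python
from math import prod

def elimination_step_cost(x_card, factor_scopes, card):
    """Multiplications and additions for summing X out of a product of factors.

    x_card: |Val(X)|.
    factor_scopes: for each factor p_i, the names of its variables other than X.
    card: dict mapping each variable name to its number of values.
    """
    k = len(factor_scopes)
    neighbors = set().union(*factor_scopes)        # distinct y variables
    table_size = prod(card[y] for y in neighbors)  # entries in the new factor
    multiplications = k * x_card * table_size
    additions = x_card * table_size
    return multiplications, additions

# Eliminating E in the lecture's example, assuming binary variables:
# the factors mentioning E are P(E | C, D) and p_F(A, E).
card = {"A": 2, "C": 2, "D": 2}
cost = elimination_step_cost(2, [("C", "D"), ("A",)], card)
print(cost)  # (32, 16)
```

Each extra variable in the intermediate factor multiplies both counts by its cardinality, which is the source of the exponential growth.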

Summary

• The simple Eliminate algorithm captures the key algorithmic operation underlying probabilistic inference:
  • We take a sum over a product of potential functions

• The complexity of the algorithm depends on the size of the summands that appear in the sequence of summation operations.

• The computational complexity of the Eliminate algorithm can be reduced to purely graph-theoretic considerations: treewidth (next class).

• By reasoning about the treewidth we can design improved inference algorithms.
