tutorial on probabilistic graphical models: a geometric ...tutorial on probabilistic graphical...

49
Probabilistic Graphical Models Reasoning Bayesian Networks by Hand How to reason by a machine? Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi 7 July 2017 Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and T

Upload: others

Post on 04-Jun-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Tutorial on Probabilistic Graphical Models:A Geometric and Topological View

Qinfeng (Javen) Shi

7 July 2017

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 2: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Schedule

Part 1: Representation and Basic (Javen)

Part 2: Inference (Chao)

Part 3: Advanced Inference (Chao)

Part 4: Learning (both parameters and structures) (Javen).

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 3: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Part 1: Representation and Basics I

1 Probabilistic Graphical ModelsHistory and booksRepresentationsFactorisation and independences

2 Reasoning Bayesian Networks by HandFactorisation and ReasoningA universal way

3 How to reason by a machine?VE for marginal inference

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 4: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Scenario 1

A

C

B

Multiple problems (A,B, ...) affect each other

Joint optimal solution of all 6= the solutions of individuals

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 5: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Scenario 2

Two variables X ,Y each taking 10 possible values.Listing P(X ,Y ) for each possible value of X ,Y requiresspecifying/computing 102 many probabilities.

What if we have 1000 variables each taking 10 possible values?⇒ 101000 many probabilities

⇒ Difficult to store, and query naively.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 6: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Scenario 2

Two variables X ,Y each taking 10 possible values.Listing P(X ,Y ) for each possible value of X ,Y requiresspecifying/computing 102 many probabilities.

What if we have 1000 variables each taking 10 possible values?

⇒ 101000 many probabilities

⇒ Difficult to store, and query naively.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 7: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Scenario 2

Two variables X ,Y each taking 10 possible values.Listing P(X ,Y ) for each possible value of X ,Y requiresspecifying/computing 102 many probabilities.

What if we have 1000 variables each taking 10 possible values?⇒ 101000 many probabilities

⇒ Difficult to store, and query naively.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 8: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Remedy

Structured Learning, specially Probabilistic Graphical Models(PGMs).

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 9: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

PGMs

PGMs use graphs to represent the complex probabilisticrelationships between random variables.

P(A,B,C , ...)

Benefits:

compactly represent distributions of variables.

Relation between variables are intuitive (such as conditionalindependences)

have fast and general algorithms to query withoutenumeration. e.g. ask for P(A|B = b,C = c) or EP [f ]

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 10: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

An Example

G

I

J

S

D

H

L

Difficulty Intelligence

Grade

Happy

Letter

SAT

Job

Intuitive

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 11: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Example

G

I

J

S

D

H

L

Difficulty Intelligence

Grade

Happy

Letter

SAT

Job

P(I)

P(S | I)

P(J | L,S)

P(D)

P(G | D,I)

P(H | G,J)

P(L | G)

Compact

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 12: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

History

Gibbs (1902) used undirected graphs in particles

Wright (1921,1934) used directed graph in genetics

In economists and social sci (Wold 1954, Blalock, Jr. 1971)

In statistics (Bartlett 1935, Vorobev 1962, Goodman 1970,Haberman 1974)

In AI, expert system (Bombal et al. 1972, Gorry and Barnett1968, Warner et al. 1961)

Widely accepted in late 1980s. Prob Reasoning in Intelli Sys(Pearl 1988), Pathfinder expert system (Heckerman et al.1992)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 13: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

History

Hot since 2001. Flexible features and principled ways oflearning.CRFs (Lafferty et al. 2001), SVM struct (Tsochantaridis etal2004), M3Net (Taskar et al. 2004), DeepBeliefNet (Hinton etal. 2006)

Super-hot since 2010. Winners of a large number ofchallenges with big data.Google, Microsoft, Facebook all open new labs for it.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 14: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

History

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 15: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Good books

Chris Bishop’s book “Pattern Recognition and MachineLearning” (Graphical Models are in chapter 8, which isavailable from his webpage) ≈ 60 pages

Koller and Friedman’s “Probabilistic Graphical Models”> 1000 pages

Stephen Lauritzen’s “Graphical Models”

Michael Jordan’s unpublished book “An Introduction toProbabilistic Graphical Models”

· · ·

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 16: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Representations

A

C

B

(a) Directed graph

A

C

B

(b) Undirected graph

A

C

B

f2

f1

f3

(c) Factor graph

Nodes represent random variables

Edges reflect dependencies between variables

Factors explicitly show which variables are used in each factori.e. f1(A,B)f2(A,C )f3(B,C )

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 17: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Example — Image Denoising

Denoising1

Applications in Vision and PR

Image denoising

Original CorrectedNoisy

Denoising

Real Applications

),( ii yxΦ

),( ji xxΨ

X ∗ = argmaxX P(X |Y )

1This example is from Tiberio Caetano’s short course: “Machine Learningusing Graphical Models”

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 18: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Example —Human Interaction Recognition

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 19: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Factorisation for Bayesian networks

Directed Acyclic Graph (DAG).Factorisation rule: P(x1, . . . , xn) =

∏ni=1 P(xi |Pa(xi ))

Pa(xi ) denotes parent of xi . e.g. (A,B) = Pa(C )

A

C

B

⇒ P(A,B,C ) = P(A)P(B|A)P(C |A,B)Acyclic: no cycle allowed. Replacing edge A→ C with C → A willform a cycle (loop i.e. A→ B → C → A), not allowed in DAG.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 20: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Factorisation for Markov Random Fields

Undirected Graph:Factorisation rule:P(x1, . . . , xn) = 1

Z

∏c∈C ψc(Xc),

Z =∑

X

∏c∈C ψc(Xc),

where c is an index set of a clique (fully connected subgraph), Xc

is the set of variables indicated by c.

A

C

B

Consider Xc1 = {A,B},Xc2 = {A,C},Xc3 = {B,C}⇒ P(A,B,C ) = 1

Z ψc1(A,B)ψc2(A,C )ψc3(B,C )

Consider Xc = {A,B,C} ⇒ P(A,B,C ) = 1Z ψc(A,B,C ),

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 21: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Factorisation for Markov Random Fields

Factor Graph:Factorisation rule:P(x1, . . . , xn) = 1

Z

∏i fi (Xi ), Z =

∑X

∏i fi (Xi )

A

C

B

f2

f1

f3

⇒ P(A,B,C ) = 1Z f1(A,B)f2(A,C )f3(B,C )

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 22: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Inference and Learning

Given potentials and the graph, one can ask for:

x∗ = argmaxx P(x) MAP Inference

P(xc) =∑

xV/cP(x) Marginal Inference

How to get potentials and the graph? → Learning.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 23: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

Independences

IndependenceA ⊥⊥ B ⇔ P(A,B) = P(A)P(B)

Conditional IndependenceA ⊥⊥ B|C ⇔ P(A,B|C ) = P(A|C )P(B|C )

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 24: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From Graph to Independences

Case 1:

Head

Tail

A

CB

Question: B ⊥⊥ C?

Answer: No.

P(B,C ) =∑A

P(A,B,C )

=∑A

P(B|A)P(C |A)P(A)

6= P(B)P(C ) in general

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 25: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From Graph to Independences

Case 1:

Head

Tail

A

CB

Question: B ⊥⊥ C?Answer: No.

P(B,C ) =∑A

P(A,B,C )

=∑A

P(B|A)P(C |A)P(A)

6= P(B)P(C ) in general

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 26: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From Graph to Independences

Case 2:

A

CB

Question: B ⊥⊥ C |A?

Answer: Yes.

P(B,C |A) =P(A,B,C )

P(A)

=P(B|A)P(C |A)P(A)

P(A)

= P(B|A)P(C |A)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 27: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From Graph to Independences

Case 2:

A

CB

Question: B ⊥⊥ C |A?Answer: Yes.

P(B,C |A) =P(A,B,C )

P(A)

=P(B|A)P(C |A)P(A)

P(A)

= P(B|A)P(C |A)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 28: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From graphs to independences

Case 3:

A

CB

A

CB

Question: B ⊥⊥ C , B ⊥⊥ C |A?

∵ P(A,B,C ) = P(B)P(C )P(A|B,C ),

∴ P(B,C ) =∑A

P(A,B,C )

=∑A

P(B)P(C )P(A|B,C )

= P(B)P(C )

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 29: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

History and booksRepresentationsFactorisation and independences

From graphs to independences

Case 3:

A

CB

A

CB

Question: B ⊥⊥ C , B ⊥⊥ C |A?

∵ P(A,B,C ) = P(B)P(C )P(A|B,C ),

∴ P(B,C ) =∑A

P(A,B,C )

=∑A

P(B)P(C )P(A|B,C )

= P(B)P(C )

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 30: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Bayesian networks

Directed Acyclic Graph (DAG).Factorisation rule: P(x1, . . . , xn) =

∏ni=1 P(xi |Pa(xi ))

Example:

A

C

B

P(A,B,C ) = P(A)P(B|A)P(C |A,B)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 31: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Bayesian networks

Directed Acyclic Graph (DAG).Factorisation rule: P(x1, . . . , xn) =

∏ni=1 P(xi |Pa(xi ))

Example:

A

C

B

P(A,B,C ) = P(A)P(B|A)P(C |A,B)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 32: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Reasoning with all variables

DAG tells us: P(A,B,C ) = P(A)P(B|A)P(C |A,B)

P(A = a|B = b,C = c) =?

All variables are involved in the query.

=P(A = a, B = b, C = c)

P(B = b, C = c)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)∑

A∈{¬a,a} P(A, B = b, C = c)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)∑

A∈{¬a,a} P(A)P(B = b|A)P(C = c|A, B = b)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)

P(A = ¬a)P(B = b|A = ¬a)P(C = c|A = ¬a, B = b) + P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 33: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Reasoning with all variables

DAG tells us: P(A,B,C ) = P(A)P(B|A)P(C |A,B)

P(A = a|B = b,C = c) =?

All variables are involved in the query.

=P(A = a, B = b, C = c)

P(B = b, C = c)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)∑

A∈{¬a,a} P(A, B = b, C = c)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)∑

A∈{¬a,a} P(A)P(B = b|A)P(C = c|A, B = b)

=P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)

P(A = ¬a)P(B = b|A = ¬a)P(C = c|A = ¬a, B = b) + P(A = a)P(B = b|A = a)P(C = c|A = a, B = b)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 34: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Reasoning with missing variable(s)

DAG tells us: P(A,B,C ) = P(A)P(B|A)P(C |A,B)

P(A = a|B = b) =?

C is missing in the query.

=P(A = a,B = b)

P(B = b)

=

∑C P(A = a,B = b,C )∑A,C P(A,B = b,C )

=

∑C P(A = a)P(B = b|A = a)P(C |A = a,B = b)∑

A,C P(A)P(B = b|A)P(C |A,B = b)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 35: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Reasoning with missing variable(s)

DAG tells us: P(A,B,C ) = P(A)P(B|A)P(C |A,B)

P(A = a|B = b) =?

C is missing in the query.

=P(A = a,B = b)

P(B = b)

=

∑C P(A = a,B = b,C )∑A,C P(A,B = b,C )

=

∑C P(A = a)P(B = b|A = a)P(C |A = a,B = b)∑

A,C P(A)P(B = b|A)P(C |A,B = b)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 36: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Example of 4WD

Someone finds that people who drive 4WDs vehicles (S) consumelarge amounts of gas (G ) and are involved in more accidents thanthe national average (A). They have constructed the Bayesiannetwork below (here t implies “true” and f implies “false”).

Figure: 4WD Bayesian network

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 37: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Example of 4WD

P(¬g , a|s)? (i.e. P(G = ¬g ,A = a|S = s))

P(a|s)?

P(A|s)?

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 38: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Example of 4WD

Someone else finds that there are two types of people that drive4WDs, people from the country (C ) and people with large families(F ). After collecting some statistics, here is the new Bayesiannetwork.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 39: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

Example of 4WD

P(g ,¬a, s, c)?

P(g ,¬a|c , f )?

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 40: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

How to reason by hand?

A universal way: express the query probability in terms of the fulldistribution2, and then factorise it.

Step by step:

1 when you see a conditional distribution, break it into thenominator and the denominator.

2 when you see a distribution (may be from the nominatorand/or the denominator) with missing variable(s), rewrite it asa sum of the full distribution w.r.t. the missing variable(s).

3 After everything is expressed by the full distribution, factorisethe full distribution into local distributions (which are known).

2the joint distribution over all variables of your Bayesian networkQinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 41: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

How to reason by hand?

A universal way: express the query probability in terms of the fulldistribution2, and then factorise it.

Step by step:

1 when you see a conditional distribution, break it into thenominator and the denominator.

2 when you see a distribution (may be from the nominatorand/or the denominator) with missing variable(s), rewrite it asa sum of the full distribution w.r.t. the missing variable(s).

3 After everything is expressed by the full distribution, factorisethe full distribution into local distributions (which are known).

2the joint distribution over all variables of your Bayesian networkQinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 42: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

How to reason by hand?

A universal way: express the query probability in terms of the fulldistribution2, and then factorise it.

Step by step:

1 when you see a conditional distribution, break it into thenominator and the denominator.

2 when you see a distribution (may be from the nominatorand/or the denominator) with missing variable(s), rewrite it asa sum of the full distribution w.r.t. the missing variable(s).

3 After everything is expressed by the full distribution, factorisethe full distribution into local distributions (which are known).

2the joint distribution over all variables of your Bayesian networkQinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 43: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

How to reason by hand?

A universal way: express the query probability in terms of the fulldistribution2, and then factorise it.

Step by step:

1 when you see a conditional distribution, break it into thenominator and the denominator.

2 when you see a distribution (may be from the nominatorand/or the denominator) with missing variable(s), rewrite it asa sum of the full distribution w.r.t. the missing variable(s).

3 After everything is expressed by the full distribution, factorisethe full distribution into local distributions (which are known).

2the joint distribution over all variables of your Bayesian networkQinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 44: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?

Factorisation and ReasoningA universal way

How to reason by hand?

A universal way: express the query probability in terms of the fulldistribution2, and then factorise it.

Step by step:

1 when you see a conditional distribution, break it into thenominator and the denominator.

2 when you see a distribution (may be from the nominatorand/or the denominator) with missing variable(s), rewrite it asa sum of the full distribution w.r.t. the missing variable(s).

3 After everything is expressed by the full distribution, factorisethe full distribution into local distributions (which are known).

2the joint distribution over all variables of your Bayesian networkQinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 45: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?VE for marginal inference

Issues?

Do you want to try this for 100 variables?

Hand-tiring for many variables, and it’s only for Bayesian Networks.

How to infer for other graphical models and how to do it in acomputer program?

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 46: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?VE for marginal inference

Issues?

Do you want to try this for 100 variables?Hand-tiring for many variables, and it’s only for Bayesian Networks.

How to infer for other graphical models and how to do it in acomputer program?

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 47: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?VE for marginal inference

How to reason by a machine?

Variable elimination: infer by eliminating variables (works for bothmarginal and MAP inference)

P(A) =∑S,G

P(A,S ,G )

=∑S,G

P(S)P(A|S)P(G |S)

=∑S

P(S)P(A|S)(∑G

P(G |S)) =∑S

P(S)P(A|S)

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 48: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?VE for marginal inference

VE for marginal inference

Step by step:

1 sum over missing variables (marginalisation) for the fulldistribution.

2 factorise the full distribution.

3 rearrange the sum operator to reduce the computation.

4 eliminate the variables.

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View

Page 49: Tutorial on Probabilistic Graphical Models: A Geometric ...Tutorial on Probabilistic Graphical Models: A Geometric and Topological View Qinfeng (Javen) Shi ... Learning"(Graphical

Probabilistic Graphical ModelsReasoning Bayesian Networks by Hand

How to reason by a machine?VE for marginal inference

That’s all

Thanks!

Qinfeng (Javen) Shi Tutorial on Probabilistic Graphical Models: A Geometric and Topological View