Chapter 10: Introduction to Probabilistic Graphical Models
![Page 1: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/1.jpg)
Chapter 10: Introduction to Probabilistic Graphical Models
Weike Pan and Congfu Xu
{panweike, xucongfu}@zju.edu.cn
Institute of Artificial Intelligence
College of Computer Science, Zhejiang University
October 12, 2006
Courseware for "Introduction to Artificial Intelligence", College of Computer Science, Zhejiang University
![Page 2: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/2.jpg)
References
An Introduction to Probabilistic Graphical Models. Michael I. Jordan.
http://www.cs.berkeley.edu/~jordan/graphical.html
![Page 3: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/3.jpg)
Outline
Preparations
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
![Page 4: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/4.jpg)
Outline
Preparations
  PGM “is” a universal model
    Different thoughts of machine learning
    Different training approaches
    Different data types
  Bayesian Framework
  Chain rules of probability theory
  Conditional Independence
Probabilistic Graphical Models (PGM)
  Directed PGM
  Undirected PGM
Insights of PGM
![Page 5: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/5.jpg)
Different thoughts of machine learning
Statistics (modeling uncertainty, detailed information) vs. logic (modeling complexity, high-level information)
Unifying Logical and Statistical AI. Pedro Domingos, University of Washington. AAAI 2006.
Example (speech): statistical information (acoustic model + language model + affect model…) + high-level information (expert knowledge/logic)
![Page 6: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/6.jpg)
Different training approaches
Maximum Likelihood Training: e.g. MAP (Maximum a Posteriori) estimation, which combines the likelihood with a prior
vs. Discriminative Training: e.g. Maximum Margin (SVM)
Example (speech): the classical combination is Maximum Likelihood + Discriminative Training
![Page 7: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/7.jpg)
Different data types
Directed acyclic graph (Bayesian Networks, BN)
  Modeling asymmetric effects and dependencies: causal/temporal dependence (e.g. speech analysis, DNA sequence analysis…)
Undirected graph (Markov Random Fields, MRF)
  Modeling symmetric effects and dependencies: spatial dependence (e.g. image analysis…)
![Page 8: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/8.jpg)
PGM “is” a universal model
PGMs model both temporal and spatial data by unifying:
  Thoughts: Statistics + Logic
  Approaches: Maximum Likelihood Training + Discriminative Training
Furthermore, the directed and undirected models together provide modeling power beyond that which could be provided by either alone.
![Page 9: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/9.jpg)
Bayesian Framework
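Bayes' rule:
$$P(c_i \mid O) = \frac{P(O \mid c_i)\, P(c_i)}{P(O)}$$
where $P(c_i \mid O)$ is the a posteriori probability of class $i$ given the observation $O$, $P(O \mid c_i)$ is the likelihood, $P(c_i)$ is the a priori probability, and $P(O)$ is the normalization factor.
What we care about is the conditional probability, which is a ratio of two marginal probabilities.
Problem description: Observation → Conclusion (classification or prediction), obtained through Bayes' rule.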
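As a minimal sketch of how the rule turns an observation into a conclusion, the following computes the posterior of each class and picks the most probable one. The function name `posterior`, the class names, and all numbers are hypothetical and chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Return P(c_i | O) for every class i via Bayes' rule.

    priors[c]      = P(c)      (a priori probability)
    likelihoods[c] = P(O | c)  (likelihood of the fixed observation O)
    """
    # Normalization factor P(O) = sum_i P(O | c_i) P(c_i)
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical two-class example (numbers are made up for illustration).
priors = {"speech": 0.3, "noise": 0.7}
likelihoods = {"speech": 0.8, "noise": 0.1}
post = posterior(priors, likelihoods)
predicted = max(post, key=post.get)  # classification = most probable class
```

Here the class with the smaller prior still wins because its likelihood is much higher, which is the point of weighing both terms.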
![Page 10: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/10.jpg)
Chain rules of probability theory
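The chain rule states that any joint distribution factorizes as $p(x_1, \ldots, x_n) = \prod_i p(x_i \mid x_1, \ldots, x_{i-1})$. The sketch below checks this numerically on a small table; the table itself and the helper names `marginal` and `chain_rule` are assumptions made for illustration.

```python
from itertools import product

# Hypothetical joint table p(x1, x2, x3) over binary variables (made-up numbers).
weights = [1, 2, 3, 4, 5, 6, 7, 8]
states = list(product([0, 1], repeat=3))
joint = {s: w / sum(weights) for s, w in zip(states, weights)}


def marginal(prefix):
    """p(x_1..x_k = prefix): sum the joint over the remaining variables."""
    k = len(prefix)
    return sum(p for s, p in joint.items() if s[:k] == prefix)


def chain_rule(state):
    """Rebuild p(x1, x2, x3) as p(x1) * p(x2 | x1) * p(x3 | x1, x2)."""
    result = 1.0
    for k in range(1, len(state) + 1):
        # p(x_k | x_1..x_{k-1}) = p(x_1..x_k) / p(x_1..x_{k-1})
        result *= marginal(state[:k]) / marginal(state[:k - 1])
    return result


# Largest difference between the rebuilt product and the original table.
max_error = max(abs(chain_rule(s) - joint[s]) for s in states)
```

The factorization holds for any ordering of the variables and needs no independence assumption; graphical models save parameters by dropping variables from the conditioning sets.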
![Page 11: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/11.jpg)
Conditional Independence
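X and Z are conditionally independent given Y when $p(x, z \mid y) = p(x \mid y)\, p(z \mid y)$ for every y with $p(y) > 0$. A minimal sketch of a numerical check on a three-variable table follows; the function name and both example tables are hypothetical.

```python
from itertools import product


def cond_independent_given_y(joint, tol=1e-12):
    """Check X independent of Z given Y for a table joint[(x, y, z)] over binary variables.

    Uses the equivalent product form p(x, y, z) * p(y) == p(x, y) * p(y, z),
    which avoids dividing by p(y).
    """
    vals = [0, 1]
    for x, y, z in product(vals, repeat=3):
        p_y = sum(joint[(a, y, c)] for a in vals for c in vals)
        p_xy = sum(joint[(x, y, c)] for c in vals)
        p_yz = sum(joint[(a, y, z)] for a in vals)
        if abs(joint[(x, y, z)] * p_y - p_xy * p_yz) > tol:
            return False
    return True


# Hypothetical joint built so that X and Z depend only on Y (made-up numbers).
p_y = {0: 0.4, 1: 0.6}
p_x_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_y[y][x]
p_z_given_y = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}  # p_z_given_y[y][z]
independent = {
    (x, y, z): p_y[y] * p_x_given_y[y][x] * p_z_given_y[y][z]
    for x, y, z in product([0, 1], repeat=3)
}

# A joint in which X and Z are perfectly coupled whatever Y is.
coupled = {(x, y, z): (0.25 if x == z else 0.0)
           for x, y, z in product([0, 1], repeat=3)}
```

Calling `cond_independent_given_y(independent)` should succeed, while the coupled table fails the check because knowing X fixes Z even after Y is known.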
![Page 13: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/13.jpg)
PGM
Nodes represent random variables/states
The missing arcs represent conditional independence assumptions
The graph structure implies the decomposition of the joint distribution into local factors
![Page 14: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/14.jpg)
Directed PGM (BN)
Representation
Conditional Independence
Probability Distribution Queries
Implementation
Interpretation
![Page 15: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/15.jpg)
Probability Distribution
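Definition of Joint Probability Distribution
$$p(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_i(x_i, x_{\pi_i})$$
where $x_{\pi_i}$ denotes the parents of node $i$, and each local function satisfies
$$\sum_{x_i} f_i(x_i, x_{\pi_i}) = 1, \qquad f_i(x_i, x_{\pi_i}) \ge 0$$
Check: the product is nonnegative, and summing it over all variables (leaf nodes first) gives 1, so it is a valid joint distribution.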
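As a minimal sketch of this definition, the following builds a joint distribution as a product of local functions, one per node given its parents, and checks that it sums to one. The three-node network, the table layout `cpt`, and all numbers are hypothetical.

```python
from itertools import product

# Hypothetical three-node network X1 -> X3 <- X2 (structure and numbers are made up).
parents = {"x1": [], "x2": [], "x3": ["x1", "x2"]}

# cpt[node][parent_values] = [p(node = 0 | parents), p(node = 1 | parents)]
cpt = {
    "x1": {(): [0.6, 0.4]},
    "x2": {(): [0.7, 0.3]},
    "x3": {(0, 0): [0.9, 0.1], (0, 1): [0.5, 0.5],
           (1, 0): [0.4, 0.6], (1, 1): [0.1, 0.9]},
}


def joint(assignment):
    """p(x) = product over nodes i of f_i(x_i, x_parents(i))."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob


nodes = list(parents)
# Sum the product over every full assignment; a valid joint must give 1.
total = sum(joint(dict(zip(nodes, values)))
            for values in product([0, 1], repeat=len(nodes)))
```

Because every row of every local table sums to one, the total is one without any explicit normalization step.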
![Page 16: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/16.jpg)
Representation
Graphical models represent joint probability distributions more economically, using a set of “local” relationships among variables.
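To make "more economically" concrete, the sketch below counts free parameters for a full joint table and for a factored model. The ten-node binary chain is a hypothetical example and the function names are illustrative.

```python
def full_table_size(num_vars, num_states):
    """Free parameters in an unrestricted joint table: r**n - 1."""
    return num_states ** num_vars - 1


def factored_size(parents, num_states):
    """Free parameters when each node has a table p(x_i | parents(i))."""
    # Each node needs (r - 1) free numbers per joint configuration of its parents.
    return sum((num_states - 1) * num_states ** len(pa) for pa in parents.values())


# Hypothetical chain x1 -> x2 -> ... -> x10 of binary variables.
chain = {f"x{i}": ([f"x{i - 1}"] if i > 1 else []) for i in range(1, 11)}
full = full_table_size(len(chain), 2)   # 2**10 - 1
local = factored_size(chain, 2)         # 1 for the root + 2 for each of the other 9 nodes
```

The full table grows exponentially in the number of variables, while the factored form grows exponentially only in the largest parent set.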
![Page 17: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/17.jpg)
Conditional Independence (basic)
Assert the conditional independence of a node from its ancestors, conditional on its parents.
Interpret missing edges in terms of conditional independence
![Page 18: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/18.jpg)
Conditional Independence (3 canonical graphs)
Classical Markov chain (X → Y → Z): “past”, “present”, “future”
  Conditional independence: X ⊥ Z | Y
Common cause (X ← Y → Z): Y “explains” all the dependencies between X and Z
  Conditional independence: X ⊥ Z | Y
Common effect (X → Y ← Z): multiple, competing explanations
  Marginal independence: X ⊥ Z
$$p(x, y, z) = p(x)\, p(z)\, p(y \mid x, z)$$
$$p(x, z) = \sum_y p(x, y, z) = p(x)\, p(z) \sum_y p(y \mid x, z) = p(x)\, p(z)$$
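The common-effect case can be checked numerically. In this sketch, with hypothetical made-up tables, X and Z are independent before Y is observed, but observing Y makes them dependent (the "competing explanations" effect).

```python
from itertools import product

# Hypothetical common-effect network X -> Y <- Z (made-up numbers).
p_x = {0: 0.9, 1: 0.1}
p_z = {0: 0.8, 1: 0.2}
# p_y1[(x, z)] = p(y = 1 | x, z): Y is likely as soon as one cause is present.
p_y1 = {(0, 0): 0.01, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99}


def joint(x, y, z):
    """p(x, y, z) = p(x) p(z) p(y | x, z)."""
    p_y = p_y1[(x, z)] if y == 1 else 1.0 - p_y1[(x, z)]
    return p_x[x] * p_z[z] * p_y


# Marginal independence: summing out y leaves p(x, z) = p(x) p(z).
marginal_gap = max(
    abs(sum(joint(x, y, z) for y in (0, 1)) - p_x[x] * p_z[z])
    for x, z in product((0, 1), repeat=2)
)

# Conditional dependence: once y = 1 is observed, learning z = 1 changes the belief in x = 1.
p_x1_given_y1 = (sum(joint(1, 1, z) for z in (0, 1))
                 / sum(joint(x, 1, z) for x, z in product((0, 1), repeat=2)))
p_x1_given_y1_z1 = joint(1, 1, 1) / sum(joint(x, 1, 1) for x in (0, 1))
```

With these numbers, learning that Z is present lowers the probability of X, since Z already accounts for the observed Y.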
![Page 19: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/19.jpg)
Conditional Independence (check)
One incoming arrow and one outgoing arrow
Two outgoing arrows
Two incoming arrows
Check through reachability
Bayes ball algorithm (rules)
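The Bayes ball rules themselves are not reproduced here. As a sketch of the same reachability check, the code below uses a different but equivalent technique: restrict the graph to the ancestors of the nodes involved, moralize it, and test ordinary graph separation. The function name `d_separated` and the dictionary encoding of the graphs are assumptions.

```python
from collections import deque


def d_separated(parents, xs, ys, zs):
    """Return True if every node in xs is d-separated from every node in ys given zs.

    parents maps each node to the list of its parents in the DAG.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # 1. Keep only the nodes in xs, ys, zs and their ancestors.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])

    # 2. Moralize: link each node to its parents, "marry" co-parents, drop directions.
    neighbors = {node: set() for node in keep}
    for child in keep:
        pa = parents[child]
        for p in pa:
            neighbors[child].add(p)
            neighbors[p].add(child)
        for a in pa:
            for b in pa:
                if a != b:
                    neighbors[a].add(b)

    # 3. Plain reachability from xs that is not allowed to pass through zs.
    queue, seen = deque(xs - zs), set(xs - zs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for nxt in neighbors[node] - zs - seen:
            seen.add(nxt)
            queue.append(nxt)
    return True


# The three canonical graphs, each over nodes x, y, z.
chain = {"x": [], "y": ["x"], "z": ["y"]}            # x -> y -> z
common_cause = {"y": [], "x": ["y"], "z": ["y"]}     # x <- y -> z
common_effect = {"x": [], "z": [], "y": ["x", "z"]}  # x -> y <- z
```

For example, `d_separated(common_effect, ["x"], ["z"], [])` holds, while conditioning on `"y"` breaks the separation; the chain and common-cause graphs behave the other way round.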
![Page 21: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/21.jpg)
Undirected PGM (MRF)
Representation
Conditional Independence
Probability Distribution Queries
Implementation
Interpretation
![Page 22: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/22.jpg)
Probability Distribution(1)
Clique
  A clique of a graph is a fully-connected subset of nodes.
  Local functions should not be defined on domains of nodes that extend beyond the boundaries of cliques.
Maximal cliques
  The maximal cliques of a graph are the cliques that cannot be extended to include additional nodes without losing the property of being fully connected.
  We restrict ourselves to maximal cliques without loss of generality, as they capture all possible dependencies.
Potential function (local parameterization)
  $\psi_{X_C}(x_C)$: potential function on the possible realizations $x_C$ of the maximal clique $X_C$
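As a sketch of how the maximal cliques of a small graph can be listed, the following uses the basic Bron-Kerbosch recursion (a standard algorithm, not something the slides prescribe). The example graph and the adjacency-set encoding are hypothetical.

```python
def maximal_cliques(neighbors):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.

    neighbors maps each node to the set of nodes it shares an edge with.
    """
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(clique)  # nothing can extend it: the clique is maximal
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & neighbors[node],
                   excluded & neighbors[node])
            # Move the processed node from the candidates to the excluded set.
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(frozenset(), set(neighbors), set())
    return found


# Hypothetical undirected graph: a triangle a-b-c plus the edge c-d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = sorted(sorted(c) for c in maximal_cliques(graph))
```

In this graph the pair {a, b} is a clique but not a maximal one, since c can be added; the maximal cliques are {a, b, c} and {c, d}.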
![Page 23: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/23.jpg)
Probability Distribution(2)
Maximal cliques
![Page 24: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/24.jpg)
Probability Distribution(3)
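Joint probability distribution
$$p(x) = \frac{1}{Z} \prod_C \psi_{X_C}(x_C)$$
Normalization factor
$$Z = \sum_x \prod_C \psi_{X_C}(x_C)$$
Boltzmann distribution
$$p(x) = \frac{1}{Z} \prod_C \psi_{X_C}(x_C) = \frac{1}{Z} \prod_C \exp\{-H_C(x_C)\} = \frac{1}{Z} \exp\Big\{-\sum_C H_C(x_C)\Big\} = \frac{1}{Z} \exp\{-H(x)\}$$
$$Z = \sum_x \prod_C \psi_{X_C}(x_C) = \sum_x \exp\{-H(x)\}$$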
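A minimal sketch of these formulas on a tiny model: potentials are written in the Boltzmann form as the exponential of a negative energy, multiplied over the maximal cliques, and normalized by Z. The three-node chain, the energy function, and all names are hypothetical.

```python
import math
from itertools import product

# Hypothetical chain MRF a - b - c over binary variables; maximal cliques {a, b} and {b, c}.
variables = ["a", "b", "c"]
cliques = [("a", "b"), ("b", "c")]


def energy(clique_values):
    """H_C(x_C): made-up energy that is lower when the two neighbors agree."""
    left, right = clique_values
    return 0.0 if left == right else 1.0


def potential(clique_values):
    """psi(x_C) = exp{-H_C(x_C)}, strictly positive by construction."""
    return math.exp(-energy(clique_values))


def unnormalized(assignment):
    """Product of the clique potentials for one full assignment."""
    value = 1.0
    for clique in cliques:
        value *= potential(tuple(assignment[v] for v in clique))
    return value


assignments = [dict(zip(variables, values)) for values in product([0, 1], repeat=3)]
Z = sum(unnormalized(a) for a in assignments)  # normalization factor
p = {tuple(a.values()): unnormalized(a) / Z for a in assignments}
```

Computing Z by brute force requires a sum over every joint state, which is only feasible for very small models like this one.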
![Page 25: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/25.jpg)
Conditional Independence
It’s a “reachability” problem in graph theory.
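As a sketch of that reachability test: two node sets are conditionally independent given a third if removing the third set disconnects them. The function name `separated` and the four-node example graph are hypothetical.

```python
from collections import deque


def separated(neighbors, xs, ys, zs):
    """True if every path from xs to ys in the undirected graph passes through zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    queue, seen = deque(xs - zs), set(xs - zs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False  # reached ys without crossing the conditioning set
        for nxt in neighbors[node] - zs - seen:
            seen.add(nxt)
            queue.append(nxt)
    return True


# Hypothetical square graph a - b - d - c - a (a 4-cycle).
square = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c"},
}
```

In the square, `a` and `d` are separated by `{b, c}` but not by `b` alone, because the path through `c` stays open.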
![Page 26: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/26.jpg)
Representation
![Page 28: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/28.jpg)
Insights of PGM (Michael I. Jordan)
Probabilistic Graphical Models are a marriage between probability theory and graph theory.
A graphical model can be thought of as a probabilistic database, a machine that can answer “queries” regarding the values of sets of random variables.
We build up the database in pieces, using probability theory to ensure that the pieces have a consistent overall interpretation. Probability theory also justifies the inferential machinery that allows the pieces to be put together “on the fly” to answer the queries.
In principle, all “queries” of a probabilistic database can be answered if we have in hand the joint probability distribution.
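As a minimal sketch of the "probabilistic database" view, the function below answers a conditional query directly from a full joint table by summing matching entries. The variable names, the table, and the `query` helper are hypothetical.

```python
def query(joint, names, target, evidence):
    """Answer p(target | evidence) from a full joint table.

    joint maps value tuples (ordered as in names) to probabilities;
    target and evidence are dicts {variable name: value}.
    """
    def matches(values, wanted):
        return all(values[names.index(k)] == v for k, v in wanted.items())

    # Conditioning is a ratio of two marginals: p(target, evidence) / p(evidence).
    p_evidence = sum(p for values, p in joint.items() if matches(values, evidence))
    p_both = sum(p for values, p in joint.items()
                 if matches(values, evidence) and matches(values, target))
    return p_both / p_evidence


# Hypothetical joint over (rain, sprinkler, wet); the numbers are made up.
names = ["rain", "sprinkler", "wet"]
joint = {
    (0, 0, 0): 0.40, (0, 0, 1): 0.02, (0, 1, 0): 0.03, (0, 1, 1): 0.25,
    (1, 0, 0): 0.02, (1, 0, 1): 0.18, (1, 1, 0): 0.00, (1, 1, 1): 0.10,
}
p_rain_given_wet = query(joint, names, {"rain": 1}, {"wet": 1})
```

This brute-force approach touches every entry of the table, which is why the factored representation and the inference algorithms built on it matter for larger models.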
![Page 29: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/29.jpg)
Insights of PGM (data structure & algorithm)
A graphical model is a natural/perfect tool for representation (data structure) and inference (algorithm).
![Page 30: 第十讲 概率图模型导论 Chapter 10 Introduction to Probabilistic Graphical Models](https://reader036.vdocuments.net/reader036/viewer/2022081415/56813c90550346895da63ca6/html5/thumbnails/30.jpg)
Thanks!