Section 1 Introduction Combined
TRANSCRIPT
-
7/28/2019 Section 1 Introduction Combined
1/40
Probabilistic Graphical Models: Introduction
The PGM Course
-
Probabilistic Models
-
Course Structure
10 weeks + final
Videos + quizzes
Problem sets: 25% of score
Multiple submissions
-
Course Structure
9 programming assignments (e.g., genetically inherited diseases, optical character recognition)
9 × 7% = 63% of score
Final exam: 12% of score
-
Background
Required: basic probability theory, some programming
Recommended: machine learning, simple optimization, Matlab or Octave
-
Other Issues
Honor code
Time management (10-15 hrs / week)
Discussion forum, study groups
-
What you'll learn
Fundamental methods
Real-world applications
How to use these methods in your own work
-
Daphne Koller
Motivation and Overview
Probabilistic Graphical Models: Introduction
-
Predisposing factors, symptoms, test results, diseases, treatment outcomes
Millions of pixels or thousands of superpixels; each needs to be labeled {grass, sky, water, cow, horse, ...}
-
Probabilistic Graphical Models
-
Models
[Diagram: data (via learning) and a domain expert (via elicitation) produce a declarative Model representation, which multiple algorithms can then operate on]
-
Uncertainty
Partial knowledge of state of the world
Noisy observations
Phenomena not covered by our model
Inherent stochasticity
-
Probability Theory
Declarative representation with clear semantics
Powerful reasoning patterns
Established learning methods
-
Complex Systems
Predisposing factors, symptoms, test results, diseases, treatment outcomes
Class labels for thousands of superpixels
Random variables X1, ..., Xn
Joint distribution P(X1, ..., Xn)
-
Graphical Models
[Bayesian network example: Difficulty and Intelligence are parents of Grade; Intelligence is a parent of SAT; Grade is a parent of Letter]
[Markov network example: nodes A, B, C, D]
Bayesian networks, Markov networks
-
Graphical Models
M. Pradhan, G. Provan, B. Middleton, M. Henrion, UAI 94
-
Graphical Representation
Intuitive & compact data structure
Efficient reasoning using general-purpose algorithms
Sparse parameterization: feasible elicitation, learning from data
-
Many Applications
Medical diagnosis
Fault diagnosis
Natural language processing
Traffic analysis
Social network models
Message decoding
Computer vision: image segmentation, 3D reconstruction, holistic scene analysis
Speech recognition
Robot localization & mapping
-
Image Segmentation
-
Medical Diagnosis
Thanks to: Eric Horvitz, Microsoft Research
-
Textual Information Extraction
Mrs. Green spoke today in New York. Green chairs the finance committee.
-
Multi-Sensor Integration: Traffic
Learned model, trained on historical data
Inputs: multiple views on traffic, incident reports, weather
Learns to predict current & future road speed, including on unmeasured roads
Dynamic route optimization
I-95 corridor experiment: accurate to 5 MPH in 85% of cases
Fielded in 72 cities
Thanks to: Eric Horvitz, Microsoft Research
-
Biological Network Reconstruction
[Figure: learned causal signaling network over phospho-proteins and phospho-lipids (PKC, PKA, Raf, Mek, Erk, Akt, Plc, Jnk, P38, PIP2, PIP3), with some variables perturbed in the data]
Edges vs. known network: Known 15/17, Supported 2/17, Reversed 1, Missed 3
Subsequently validated in wet lab
Causal protein-signaling networks derived from multiparameter single-cell data, Sachs et al., Science 2005
This figure may be used for non-commercial and classroom purposes only. Any other uses require prior written permission from AAAS.
-
Overview
Representation: directed and undirected; temporal and plate models
Inference: exact and approximate; decision making
Learning: parameters and structure; with and without complete data
-
Preliminaries: Distributions
Probabilistic Graphical Models: Introduction
-
Joint Distribution
Intelligence (I): i0 (low), i1 (high)
Difficulty (D): d0 (easy), d1 (hard)
Grade (G): g1 (A), g2 (B), g3 (C)

I D G Prob.
i0 d0 g1 0.126
i0 d0 g2 0.168
i0 d0 g3 0.126
i0 d1 g1 0.009
i0 d1 g2 0.045
i0 d1 g3 0.126
i1 d0 g1 0.252
i1 d0 g2 0.0224
i1 d0 g3 0.0056
i1 d1 g1 0.06
i1 d1 g2 0.036
i1 d1 g3 0.024
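As an aside, the table above can be checked mechanically. Below is a minimal sketch in Python (the course itself uses Matlab/Octave; this representation is illustrative, not course code) storing the joint distribution P(I, D, G) as a dict and verifying it sums to 1:

```python
# Joint distribution P(I, D, G) from the slide, as a dict from
# assignments (I, D, G) to probabilities. Not course code; a sketch.
P = {
    ("i0", "d0", "g1"): 0.126, ("i0", "d0", "g2"): 0.168, ("i0", "d0", "g3"): 0.126,
    ("i0", "d1", "g1"): 0.009, ("i0", "d1", "g2"): 0.045, ("i0", "d1", "g3"): 0.126,
    ("i1", "d0", "g1"): 0.252, ("i1", "d0", "g2"): 0.0224, ("i1", "d0", "g3"): 0.0056,
    ("i1", "d1", "g1"): 0.06,  ("i1", "d1", "g2"): 0.036,  ("i1", "d1", "g3"): 0.024,
}

# A joint distribution assigns a probability to every one of the
# 2 * 2 * 3 = 12 assignments, and those probabilities sum to 1.
total = sum(P.values())
print(round(total, 10))  # 1.0
```

Note the table grows exponentially in the number of variables, which is exactly the problem graphical models address.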
-
Conditioning
I D G Prob.
i0 d0 g1 0.126
i0 d0 g2 0.168
i0 d0 g3 0.126
i0 d1 g1 0.009
i0 d1 g2 0.045
i0 d1 g3 0.126
i1 d0 g1 0.252
i1 d0 g2 0.0224
i1 d0 g3 0.0056
i1 d1 g1 0.06
i1 d1 g2 0.036
i1 d1 g3 0.024
Condition on g1
-
Conditioning: Reduction
I D G Prob.
i0 d0 g1 0.126
i0 d1 g1 0.009
i1 d0 g1 0.252
i1 d1 g1 0.06
-
Conditioning: Renormalization
P(I, D, g1):
I D G Prob.
i0 d0 g1 0.126
i0 d1 g1 0.009
i1 d0 g1 0.252
i1 d1 g1 0.06

P(I, D | g1):
I D Prob.
i0 d0 0.282
i0 d1 0.02
i1 d0 0.564
i1 d1 0.134
(normalizing constant: 0.126 + 0.009 + 0.252 + 0.06 = 0.447)
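The two steps above, reduction followed by renormalization, can be sketched in Python (illustrative only, not course code):

```python
# Conditioning on G = g1: drop entries inconsistent with g1 (reduction),
# then divide by what remains (renormalization). Sketch, not course code.
P = {
    ("i0", "d0", "g1"): 0.126, ("i0", "d0", "g2"): 0.168, ("i0", "d0", "g3"): 0.126,
    ("i0", "d1", "g1"): 0.009, ("i0", "d1", "g2"): 0.045, ("i0", "d1", "g3"): 0.126,
    ("i1", "d0", "g1"): 0.252, ("i1", "d0", "g2"): 0.0224, ("i1", "d0", "g3"): 0.0056,
    ("i1", "d1", "g1"): 0.06,  ("i1", "d1", "g2"): 0.036,  ("i1", "d1", "g3"): 0.024,
}

# Reduction: the unnormalized measure P(I, D, g1)
reduced = {(i, d): p for (i, d, g), p in P.items() if g == "g1"}

# Renormalization: divide by the sum of the retained entries
Z = sum(reduced.values())
conditional = {assignment: p / Z for assignment, p in reduced.items()}

print(round(Z, 3))                          # 0.447
print(round(conditional[("i1", "d0")], 3))  # 0.564
```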
-
Marginalization
P(I, D | g1):
I D Prob.
i0 d0 0.282
i0 d1 0.02
i1 d0 0.564
i1 d1 0.134

Marginalize out I:
D Prob.
d0 0.846
d1 0.154
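Marginalizing I out of the table means summing, for each value of D, over both values of I. A minimal Python sketch (illustrative, not course code):

```python
# Marginalize I out of P(I, D | g1): for each d, sum over all values of I.
# Sketch, not course code; values taken from the slide's table.
P_ID = {("i0", "d0"): 0.282, ("i0", "d1"): 0.02,
        ("i1", "d0"): 0.564, ("i1", "d1"): 0.134}

P_D = {}
for (i, d), p in P_ID.items():
    P_D[d] = P_D.get(d, 0.0) + p

print(round(P_D["d0"], 3), round(P_D["d1"], 3))  # 0.846 0.154
```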
-
Preliminaries: Factors
Probabilistic Graphical Models: Introduction
-
Factors
A factor φ(X1, ..., Xk) is a function φ: Val(X1, ..., Xk) → R
Scope = {X1, ..., Xk}
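One simple way to represent a factor in code (an assumed representation, not the course's) is a scope plus a table from joint assignments to real values, here using the general-factor values over A and B that appear later in this section:

```python
# A factor as a scope plus a table mapping each assignment in
# Val(X1, ..., Xk) to a real number. Representation is illustrative.
phi = {
    "scope": ("A", "B"),
    "table": {("a0", "b0"): 30.0, ("a0", "b1"): 5.0,
              ("a1", "b0"): 1.0,  ("a1", "b1"): 10.0},
}

# Unlike a probability distribution, a general factor's values need
# not sum to 1 or even lie in [0, 1].
print(sum(phi["table"].values()))  # 46.0
```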
-
Joint Distribution P(I, D, G)
I D G Prob.
i0 d0 g1 0.126
i0 d0 g2 0.168
i0 d0 g3 0.126
i0 d1 g1 0.009
i0 d1 g2 0.045
i0 d1 g3 0.126
i1 d0 g1 0.252
i1 d0 g2 0.0224
i1 d0 g3 0.0056
i1 d1 g1 0.06
i1 d1 g2 0.036
i1 d1 g3 0.024
-
Unnormalized measure P(I, D, g1)
I D G Prob.
i0 d0 g1 0.126
i0 d1 g1 0.009
i1 d0 g1 0.252
i1 d1 g1 0.06
-
Conditional Probability Distribution (CPD)
P(G | I, D):
        g1    g2    g3
i0,d0   0.3   0.4   0.3
i0,d1   0.05  0.25  0.7
i1,d0   0.9   0.08  0.02
i1,d1   0.5   0.3   0.2
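The defining property of a CPD is that each row sums to 1: for every assignment to the parents (I, D), the entries form a distribution over G. A Python sketch checking this for the table above (values as reconstructed from the slide; not course code):

```python
# A CPD P(G | I, D) as a dict from parent assignments to distributions
# over G. Each row must itself sum to 1. Sketch, not course code.
cpd = {
    ("i0", "d0"): {"g1": 0.3,  "g2": 0.4,  "g3": 0.3},
    ("i0", "d1"): {"g1": 0.05, "g2": 0.25, "g3": 0.7},
    ("i1", "d0"): {"g1": 0.9,  "g2": 0.08, "g3": 0.02},
    ("i1", "d1"): {"g1": 0.5,  "g2": 0.3,  "g3": 0.2},
}

# Verify normalization for every parent assignment.
for parents, row in cpd.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9
```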
-
General factors
A B Value
a0 b0 30
a0 b1 5
a1 b0 1
a1 b1 10
-
Factor Product
φ1(A, B):
A B Value
a1 b1 0.5
a1 b2 0.8
a2 b1 0.1
a2 b2 0
a3 b1 0.3
a3 b2 0.9

φ2(B, C):
B C Value
b1 c1 0.5
b1 c2 0.7
b2 c1 0.1
b2 c2 0.2

φ1 · φ2 = φ(A, B, C):
A B C Value
a1 b1 c1 0.5 · 0.5 = 0.25
a1 b1 c2 0.5 · 0.7 = 0.35
a1 b2 c1 0.8 · 0.1 = 0.08
a1 b2 c2 0.8 · 0.2 = 0.16
a2 b1 c1 0.1 · 0.5 = 0.05
a2 b1 c2 0.1 · 0.7 = 0.07
a2 b2 c1 0 · 0.1 = 0
a2 b2 c2 0 · 0.2 = 0
a3 b1 c1 0.3 · 0.5 = 0.15
a3 b1 c2 0.3 · 0.7 = 0.21
a3 b2 c1 0.9 · 0.1 = 0.09
a3 b2 c2 0.9 · 0.2 = 0.18
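The factor product multiplies entries that agree on the shared variable B. A Python sketch of the operation on these tables (illustrative, not course code):

```python
# Factor product phi(A,B,C) = phi1(A,B) * phi2(B,C): multiply every pair
# of rows that agree on the shared variable B. Sketch, not course code.
phi1 = {("a1", "b1"): 0.5, ("a1", "b2"): 0.8, ("a2", "b1"): 0.1,
        ("a2", "b2"): 0.0, ("a3", "b1"): 0.3, ("a3", "b2"): 0.9}
phi2 = {("b1", "c1"): 0.5, ("b1", "c2"): 0.7,
        ("b2", "c1"): 0.1, ("b2", "c2"): 0.2}

product = {}
for (a, b), v1 in phi1.items():
    for (b2, c), v2 in phi2.items():
        if b == b2:  # rows must agree on the shared variable B
            product[(a, b, c)] = v1 * v2

print(round(product[("a1", "b1", "c2")], 2))  # 0.35
print(len(product))                           # 12
```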
-
Factor Marginalization
φ(A, B, C):
A B C Value
a1 b1 c1 0.25
a1 b1 c2 0.35
a1 b2 c1 0.08
a1 b2 c2 0.16
a2 b1 c1 0.05
a2 b1 c2 0.07
a2 b2 c1 0
a2 b2 c2 0
a3 b1 c1 0.15
a3 b1 c2 0.21
a3 b2 c1 0.09
a3 b2 c2 0.18

Marginalize out B → φ(A, C):
A C Value
a1 c1 0.33
a1 c2 0.51
a2 c1 0.05
a2 c2 0.07
a3 c1 0.24
a3 c2 0.39
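Factor marginalization works just like marginalizing a distribution: for each (a, c), sum over both values of B. A Python sketch on the table above (illustrative, not course code):

```python
# Marginalize B out of phi(A, B, C): for each (a, c), sum the entries
# over all values of B. Sketch, not course code; values from the slide.
phi = {
    ("a1", "b1", "c1"): 0.25, ("a1", "b1", "c2"): 0.35,
    ("a1", "b2", "c1"): 0.08, ("a1", "b2", "c2"): 0.16,
    ("a2", "b1", "c1"): 0.05, ("a2", "b1", "c2"): 0.07,
    ("a2", "b2", "c1"): 0.0,  ("a2", "b2", "c2"): 0.0,
    ("a3", "b1", "c1"): 0.15, ("a3", "b1", "c2"): 0.21,
    ("a3", "b2", "c1"): 0.09, ("a3", "b2", "c2"): 0.18,
}

marginal = {}
for (a, b, c), v in phi.items():
    marginal[(a, c)] = marginal.get((a, c), 0.0) + v

print(round(marginal[("a1", "c1")], 2))  # 0.33
print(round(marginal[("a3", "c2")], 2))  # 0.39
```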
-
Factor Reduction
φ(A, B, C):
A B C Value
a1 b1 c1 0.25
a1 b1 c2 0.35
a1 b2 c1 0.08
a1 b2 c2 0.16
a2 b1 c1 0.05
a2 b1 c2 0.07
a2 b2 c1 0
a2 b2 c2 0
a3 b1 c1 0.15
a3 b1 c2 0.21
a3 b2 c1 0.09
a3 b2 c2 0.18

Reduce to C = c1 → φ(A, B, c1):
A B C Value
a1 b1 c1 0.25
a1 b2 c1 0.08
a2 b1 c1 0.05
a2 b2 c1 0
a3 b1 c1 0.15
a3 b2 c1 0.09
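Reduction on a general factor simply drops rows inconsistent with the context; unlike conditioning a distribution, no renormalization step follows. A Python sketch (illustrative, not course code):

```python
# Reduce phi(A, B, C) to the context C = c1: keep only consistent rows.
# No renormalization is performed for a plain factor. Sketch only.
phi = {
    ("a1", "b1", "c1"): 0.25, ("a1", "b1", "c2"): 0.35,
    ("a1", "b2", "c1"): 0.08, ("a1", "b2", "c2"): 0.16,
    ("a2", "b1", "c1"): 0.05, ("a2", "b1", "c2"): 0.07,
    ("a2", "b2", "c1"): 0.0,  ("a2", "b2", "c2"): 0.0,
    ("a3", "b1", "c1"): 0.15, ("a3", "b1", "c2"): 0.21,
    ("a3", "b2", "c1"): 0.09, ("a3", "b2", "c2"): 0.18,
}

reduced = {(a, b, c): v for (a, b, c), v in phi.items() if c == "c1"}

print(len(reduced))                 # 6
print(reduced[("a3", "b1", "c1")])  # 0.15
```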
-
Why factors?
Fundamental building block for defining distributions in high-dimensional spaces
Set of basic operations for manipulating these probability distributions