Post on 14-Dec-2015
S3-SEMINAR ON DATA MINING-BAYESIAN NETWORKS-
B. INFERENCE
Master Universitario en Inteligencia Artificial
Concha Bielza, Pedro Larrañaga
Computational Intelligence Group
Departamento de Inteligencia Artificial
Universidad Politécnica de Madrid
C.Bielza, P.Larrañaga -UPM- 2
Inference in Bayesian networks
Basic concepts
  Types of queries
Exact inference:
  Brute-force computation
  Variable elimination algorithm
  Message passing algorithm
Approximate inference:
  Probabilistic logic sampling
Queries: posterior probabilities
Given some evidence e (observations), answer queries about P
Posterior probability of a target variable(s) X (X may be a vector): $P(X \mid e)$
Other names: probability propagation, belief updating or revision…
[Figure: network with nodes Earth., Burgl., Alarm, WCalls and News; "?" marks the target variable]
Semantically, for any kind of reasoning
Predictive reasoning or deductive (causal inference): predict effects, e.g. P(Symptoms|Disease)
Target variable is usually a descendant of the evidence
Diagnostic reasoning (diagnostic inference): diagnose the causes, e.g. P(Disease|Symptoms)
Target variable is usually an ancestor of the evidence
[Figures: the network with nodes Earth., Burgl., Alarm, WCalls and News, once per kind of reasoning; "?" marks the target variable]
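The two kinds of reasoning can be illustrated on the smallest possible network, Disease → Symptom. The sketch below is only an illustration: the network and all the probabilities are made-up assumptions, not values from the slides.

```python
# Hypothetical two-node network Disease -> Symptom (all numbers are made up).
p_disease = 0.01                                # prior P(Disease = true)
p_symptom_given = {True: 0.9, False: 0.05}      # P(Symptom = true | Disease)

# Predictive query: evidence on the cause, target is the effect.
predictive = p_symptom_given[True]              # P(Symptom = true | Disease = true)

# Diagnostic query: evidence on the effect, target is the cause (Bayes' rule).
p_symptom = (p_symptom_given[True] * p_disease
             + p_symptom_given[False] * (1 - p_disease))
diagnostic = p_symptom_given[True] * p_disease / p_symptom

print(f"P(Symptom | Disease) = {predictive:.3f}")
print(f"P(Disease | Symptom) = {diagnostic:.3f}")
```

The predictive answer is read directly from the conditional table, while the diagnostic answer needs the prior and a normalization.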
More queries: maximum a posteriori (MAP)
Most likely configurations (abductive inference): event that best explains the evidence
Total abduction: search for the most probable configuration of all the unobserved variables, given e
Partial abduction: search for the most probable configuration of a subset of the unobserved variables (explanation set), given e
K most likely explanations
In general, the MAP configuration cannot be computed component-wise, by maximizing each P(xi|e) separately
[Figures: the network with nodes Earth., Burgl., Alarm, WCalls and News; "?" marks the variables searched for in total and in partial abduction]
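A small numerical counterexample shows why the component-wise shortcut fails. The posterior table below is a made-up assumption over two binary variables; it is not taken from the slides.

```python
# Hypothetical posterior P(x1, x2 | e) over two binary variables (made-up numbers).
posterior = {(0, 0): 0.4, (0, 1): 0.0, (1, 0): 0.3, (1, 1): 0.3}

# Total abduction: maximize over joint configurations.
joint_map = max(posterior, key=posterior.get)

def marginal(index, value):
    """P(x_index = value | e), obtained by summing the joint posterior."""
    return sum(p for config, p in posterior.items() if config[index] == value)

# Component-wise: maximize each marginal P(xi | e) separately.
componentwise = tuple(
    max((0, 1), key=lambda v, i=i: marginal(i, v)) for i in range(2)
)

print("joint MAP:     ", joint_map, posterior[joint_map])
print("component-wise:", componentwise, posterior[componentwise])
```

Here the component-wise answer picks a configuration that is less probable than the joint MAP, because the marginals ignore how the variables co-vary.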
More queries: maximum a posteriori (MAP)
Use MAP for:
Classification: find most likely label, given the evidence
Explanation: what is the most likely scenario, given the evidence
More queries: decision-making
Optimal decisions (of maximum expected utility), with influence diagrams
Brute-force computation of P(X|e)
First, consider P(Xi), without observed evidence e. Conceptually simple but computationally complex
For a BN with n variables, each with its P(Xj|Pa(Xj)):
$P(X_i) = \sum_{X_1,\dots,X_{i-1},X_{i+1},\dots,X_n} \prod_{j=1}^{n} P(X_j \mid Pa(X_j))$
Brute-force approach: this amounts to computing the JPD, often very inefficient and even computationally intractable
CHALLENGE: without computing the JPD, exploit the factorization encoded by the BN and the distributive law (local computations)
Exact inference [Pearl’88; Lauritzen & Spiegelhalter’88]
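The brute-force approach can be sketched in a few lines: build every entry of the JPD as a product of local conditional probabilities and add up the entries that agree with the query. The network structure (Burglary → Alarm ← Earthquake, Alarm → WCalls), the variable names and all the numbers below are illustrative assumptions.

```python
from itertools import product

# Hypothetical binary network; structure and numbers are assumptions.
parents = {"B": [], "E": [], "A": ["B", "E"], "W": ["A"]}
# cpt[var][parent_values] = P(var = 1 | parents)
cpt = {
    "B": {(): 0.01},
    "E": {(): 0.02},
    "A": {(0, 0): 0.001, (0, 1): 0.3, (1, 0): 0.9, (1, 1): 0.95},
    "W": {(0,): 0.05, (1,): 0.8},
}
order = list(parents)

def joint(assign):
    """One JPD entry: the product of P(xj | Pa(xj)) over all variables."""
    p = 1.0
    for var in order:
        p1 = cpt[var][tuple(assign[pa] for pa in parents[var])]
        p *= p1 if assign[var] == 1 else 1.0 - p1
    return p

def brute_force(target, evidence=None):
    """P(target | evidence) by enumerating all 2**n configurations."""
    evidence = evidence or {}
    scores = {0: 0.0, 1: 0.0}
    for values in product((0, 1), repeat=len(order)):
        assign = dict(zip(order, values))
        if all(assign[v] == x for v, x in evidence.items()):
            scores[assign[target]] += joint(assign)
    total = scores[0] + scores[1]
    return {x: s / total for x, s in scores.items()}

print(brute_force("A"))             # prior marginal of the alarm
print(brute_force("B", {"W": 1}))   # diagnostic query given a call
```

The loop visits 2**n configurations, which is exactly the exponential cost that the later algorithms try to avoid.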
Improving brute-force
Use the JPD factorization and the distributive law
Table with 32 entries (JPD) (if binary variables)
Improving brute-force
Arrange computations effectively, moving some additions (over X5 and X3, and over X4)
Biggest table with 8 entries (like the BN)
Variable elimination algorithm
Wanted: P(Xi), for ONE variable Xi
Start with a list with all functions of the problem
Select an elimination order $\sigma$ of all variables (except Xi)
For each Xk from $\sigma$, if F is the set of functions that involve Xk:
  Delete F from the list
  Compute $f' = \sum_{x_k} \prod_{f \in F} f$
  Add f’ to the list
Output: combination (multiplication) of all functions in the current list
Eliminate Xk = combine all the functions that contain this variable and marginalize out Xk
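The elimination loop described above can be sketched as follows. This is a minimal illustration for binary variables only: the factor representation (a tuple of variable names plus a table), the helper names and the three-node chain X1 → X2 → X3 with its numbers are all assumptions made for the example.

```python
from itertools import product

# A factor is (variables, table); table maps a tuple of 0/1 values to a number.
def multiply(f, g):
    """Combine two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    scope = tuple(dict.fromkeys(fv + gv))          # union, order preserved
    table = {}
    for values in product((0, 1), repeat=len(scope)):
        a = dict(zip(scope, values))
        table[values] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return scope, table

def sum_out(f, var):
    """Marginalize var out of factor f."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for values, p in ft.items():
        key = tuple(x for v, x in zip(fv, values) if v != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def variable_elimination(factors, order):
    factors = list(factors)
    for xk in order:
        involved = [f for f in factors if xk in f[0]]       # the set F
        if not involved:
            continue
        factors = [f for f in factors if xk not in f[0]]    # delete F
        combined = involved[0]
        for f in involved[1:]:
            combined = multiply(combined, f)
        factors.append(sum_out(combined, xk))               # add f'
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Hypothetical chain X1 -> X2 -> X3 (made-up numbers).
factors = [
    (("X1",), {(0,): 0.6, (1,): 0.4}),
    (("X1", "X2"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}),
    (("X2", "X3"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}),
]
scope, table = variable_elimination(factors, ["X1", "X2"])
print(scope, table)
```

No table larger than two variables is ever built here, whereas the JPD of the chain has three.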
Variable elimination algorithm
Repeat the algorithm for each target variable
Example with Asia network
Visit to Asia (A)
Smoking (S)
Lung Cancer (L)
Tuberculosis (T)
Tub. or Lung Canc. (E)
Bronchitis (B)
X-Ray (X)
Dyspnea (D)
Brute-force approach
Compute P(D) by brute-force:
$P(d) = \sum_{x}\sum_{b}\sum_{e}\sum_{l}\sum_{t}\sum_{s}\sum_{a} P(a,s,t,l,e,b,x,d)$
Complexity is exponential in the number of variables: the JPD has as many entries as the product of the numbers of states of all the variables
not necessarily a probability term
Variable elimination algorithm
Size = 8
Local computations (due to moving the additions)
Importance of the elimination ordering, but finding an optimal (minimum cost) ordering is NP-hard [Arnborg et al.’87] (heuristics for good sequences)
Complexity is exponential in the max. number of variables in the factors of the summation
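The effect of the ordering can be checked without any numbers, by tracking only which variables each factor involves. The sketch below assumes the standard Asia structure (A → T, S → L, S → B, T → E, L → E, E → X, E → D, B → D), which the transcript does not spell out, and compares two orderings chosen for illustration when the target is D.

```python
def max_factor_size(scopes, order):
    """Largest number of variables in any combined factor while eliminating."""
    scopes = [frozenset(s) for s in scopes]
    largest = 0
    for xk in order:
        involved = [s for s in scopes if xk in s]
        if not involved:
            continue
        combined = frozenset().union(*involved)
        largest = max(largest, len(combined))
        # Replace the involved factors by the new factor without xk.
        scopes = [s for s in scopes if xk not in s] + [combined - {xk}]
    return largest

# One scope per conditional table of the (assumed) Asia structure:
# P(A), P(S), P(T|A), P(L|S), P(B|S), P(E|T,L), P(X|E), P(D|E,B)
asia = ["A", "S", "TA", "LS", "BS", "ETL", "XE", "DEB"]

good = max_factor_size(asia, "AXTSLBE")   # E is eliminated last
bad = max_factor_size(asia, "EASTLBX")    # E is eliminated first

print("good ordering:", good, "variables ->", 2 ** good, "entries")
print("bad ordering: ", bad, "variables ->", 2 ** bad, "entries")
```

With binary variables the first ordering never builds a table with more than 8 entries, while eliminating E first creates one with 64.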
Message passing algorithm
Operates by passing messages among the nodes of the network. Nodes act as processors that receive, calculate and send information. Also called propagation algorithms
Clique tree propagation, based on the same principle as VE but with a sophisticated caching strategy that:
Enables computing the posterior prob. distr. of all variables in twice the time it takes to compute that of one single variable
Works in an intuitively appealing fashion, namely message propagation
Basic operations for a node
Ask info(i,j): Target node i asks node j for info. It does so for all its neighbors j. They do the same until there are no nodes left to ask
Send-message(i,j): Each node sends a message to the node that asked it for the info… until reaching the target node
A message is defined over the intersection of the domains of fi and fj. It is computed by combining the sender's function with the messages it has received from its other neighbors and marginalizing out the variables outside that intersection
And finally, we calculate locally at each node i: the target combines all the received info with its own info and marginalizes down to the target variable
Procedure for X2
[Figure: CollectEvidence (Ask) requests issued from the target node X2]
P(X2) as a message passing algorithm
VE as a message passing algorithm
Direct correspondence between VE steps and messages
Computing prob. P(Xi|e) of all (unobserved) variables Xi at a time
We can perform the previous process for each node: but many messages are repeated!
Or, we can use 2 rounds of messages as follows:
  Select a node as a root (or pivot)
  Ask or collect evidence from the leaves toward the root (messages in downward direction). As in VE.
  Distribute evidence from the root toward the leaves (messages in upward direction)
  Calculate marginal distributions at each node by local computation, i.e. using its incoming messages
This algorithm never constructs tables larger than those in the BN
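The two rounds of messages can be sketched on a small tree-shaped network. Everything specific here is an assumption made for illustration: the tree X1 → X2, X1 → X3, X3 → X4, the numbers, and the function names. Evidence is entered by zeroing the local information of states that contradict it.

```python
# Hypothetical tree-shaped BN over binary variables (made-up numbers).
prior = {"X1": [0.6, 0.4]}
cpts = {                       # (parent, child): table[parent_value][child_value]
    ("X1", "X2"): [[0.7, 0.3], [0.2, 0.8]],
    ("X1", "X3"): [[0.9, 0.1], [0.4, 0.6]],
    ("X3", "X4"): [[0.5, 0.5], [0.1, 0.9]],
}
neighbors = {}
for a, b in cpts:
    neighbors.setdefault(a, []).append(b)
    neighbors.setdefault(b, []).append(a)

def all_marginals(evidence=None, root="X1"):
    evidence = evidence or {}
    messages = {}

    def local(i, xi):
        if evidence.get(i, xi) != xi:       # state contradicts the evidence
            return 0.0
        return prior[i][xi] if i in prior else 1.0

    def pairwise(i, xi, j, xj):
        return cpts[(i, j)][xi][xj] if (i, j) in cpts else cpts[(j, i)][xj][xi]

    def incoming(j, xj, exclude=None):
        p = 1.0
        for k in neighbors[j]:
            if k != exclude:
                p *= messages[(k, j)][xj]
        return p

    def send(j, i):
        # Combine j's info with messages from its other neighbors, sum out xj.
        messages[(j, i)] = [
            sum(local(j, xj) * pairwise(i, xi, j, xj) * incoming(j, xj, exclude=i)
                for xj in (0, 1))
            for xi in (0, 1)
        ]

    def collect(i, parent=None):            # first sweep: leaves -> root
        for j in neighbors[i]:
            if j != parent:
                collect(j, i)
                send(j, i)

    def distribute(i, parent=None):         # second sweep: root -> leaves
        for j in neighbors[i]:
            if j != parent:
                send(i, j)
                distribute(j, i)

    collect(root)
    distribute(root)
    result = {}
    for i in neighbors:
        belief = [local(i, xi) * incoming(i, xi) for xi in (0, 1)]
        total = sum(belief)
        result[i] = [b / total for b in belief]
    return result

print(all_marginals())
print(all_marginals({"X4": 1}))
```

Each arc carries exactly one message per direction, so all the marginals cost two sweeps instead of one full run per target variable.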
Message passing algorithm
First sweep: CollectEvidence (toward the root node)
Second sweep: DistributeEvidence (from the root node)
[Figure: tree with a root node and numbered messages]
Networks with loops
If the network is not a polytree, the algorithm does not work
Requests/messages go around a cycle indefinitely (info goes through 2 paths and is counted twice)
Independence assumptions applied in the algorithm cannot be used here (now “any node separates the graph into 2 unconnected parts (polytrees)” does not hold)
Alternatives?
Complexity
Complexity of propagation algorithms in polytrees (i.e., without loops, that is, without cycles in the underlying undirected graph) is linear in the size (nodes+arcs) of the network [brute-force is exponential]
Exact inference in multiply-connected BNs is an NP-hard problem [Cooper 1990]
Alternative: clustering methods [Lauritzen & Spiegelhalter’88]
Method implemented in the main BN software packages
Transform the BN into a probabilistically equivalent polytree by merging nodes, removing the multiple paths between two nodes
[Figure: network with arcs M → S, M → B, S → C, B → C, B → H]
Metastatic cancer (M) is a possible cause of brain tumors (B) and an explanation for increased total serum calcium (S). In turn, either of these could explain a patient falling into a coma (C). Severe headache (H) is also associated with brain tumors.
Create a new node Z, that combines S and B
[Figure: clustered network with arcs M → Z, Z → C, Z → H, where Z = (S,B)]
States of Z: {tt,ft,tf,ff}
P(Z|M)=P(S|M)P(B|M) since S and B are conditionally independent (c.i.) given M
P(H|Z)=P(H|B) since H is c.i. of S given B
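The construction of the cluster node can be checked numerically. In the sketch below all the conditional probabilities are made-up assumptions; only the structure (M → S, M → B, B → H, and Z = (S,B)) follows the example.

```python
from itertools import product

# Hypothetical parameters, each giving P(var = True | parent); numbers are made up.
p_m = 0.2
p_s = {True: 0.8, False: 0.2}          # P(S = t | M)
p_b = {True: 0.2, False: 0.05}         # P(B = t | M)
p_h = {True: 0.8, False: 0.6}          # P(H = t | B)

def bern(p_true, value):
    """Probability of a Boolean value given P(value = True)."""
    return p_true if value else 1.0 - p_true

# Cluster node Z = (S, B) with four states.
z_states = list(product((True, False), repeat=2))
p_z = {m: {(s, b): bern(p_s[m], s) * bern(p_b[m], b) for s, b in z_states}
       for m in (True, False)}                         # P(Z | M) = P(S | M) P(B | M)
p_h_given_z = {(s, b): p_h[b] for s, b in z_states}    # P(H | Z) = P(H | B)

# P(H = t) in the clustered polytree M -> Z -> H ...
h_clustered = sum(bern(p_m, m) * p_z[m][z] * p_h_given_z[z]
                  for m in (True, False) for z in z_states)
# ... and in the original network through M -> B -> H.
h_original = sum(bern(p_m, m) * bern(p_b[m], b) * p_h[b]
                 for m in (True, False) for b in (True, False))

print(h_clustered, h_original)
```

Both routes give the same marginal, which is what "probabilistically equivalent" means for the merged network.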
Alternative: clustering methods
Steps for the JUNCTION TREE CLUSTERING ALGORITHM:
1. Moralize the BN
2. Triangulate the moral graph and obtain the cliques
3. Create the junction tree and its separators
4. Compute new parameters
5. Message passing algorithm
COMPILATION (steps 1 to 4): transform the BN into a polytree (slow, much memory if dense, but only once)
Belief updating (step 5): fast
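The first two steps can be sketched on the metastatic cancer example. The arcs below are read off the textual description of that network (the direction B → H is an assumption consistent with P(H|B)), and the clique search is a brute-force helper that is only reasonable for a handful of nodes. In this small case the moral graph happens to be triangulated already, so no fill-in edges are added.

```python
from itertools import combinations

# Arcs of the metastatic cancer example (B -> H direction assumed).
arcs = [("M", "S"), ("M", "B"), ("S", "C"), ("B", "C"), ("B", "H")]
nodes = sorted({n for arc in arcs for n in arc})

def moralize(arcs):
    """Marry the parents of every node, then drop the arc directions."""
    undirected = {frozenset(arc) for arc in arcs}
    parents = {}
    for parent, child in arcs:
        parents.setdefault(child, []).append(parent)
    for pa in parents.values():
        undirected |= {frozenset(pair) for pair in combinations(pa, 2)}
    return undirected

def maximal_cliques(nodes, edges):
    """Brute-force search over all subsets; for tiny examples only."""
    def is_clique(group):
        return all(frozenset(p) in edges for p in combinations(group, 2))
    cliques = [set(g) for r in range(1, len(nodes) + 1)
               for g in combinations(nodes, r) if is_clique(g)]
    return [c for c in cliques if not any(c < other for other in cliques)]

moral = moralize(arcs)
cliques = maximal_cliques(nodes, moral)

print(sorted(tuple(sorted(e)) for e in moral))
print([sorted(c) for c in cliques])
```

The only edge added by moralization is S-B, joining the two parents of C; the resulting cliques become the nodes of the junction tree.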
Approximate inference
Why?
Because exact inference is intractable (NP-hard) with large (+40) and densely connected BNs: the associated cliques for the junction tree algorithm or the intermediate factors in the VE algorithm will grow in size, generating an exponential blowup in the number of computations performed
Both deterministic and stochastic simulation can be used to find approximate answers
Stochastic simulation
Uses the network to generate a large number of cases (full instantiations) from the network distribution
P(Xi|e) is estimated using these cases by counting observed frequencies in the samples. By the Law of Large Numbers, the estimate converges to the exact probability as more cases are generated
Approximate propagation in BNs within an arbitrary tolerance or accuracy is an NP-hard problem
In practice, if e is not too unlikely, convergence is quick
Probabilistic logic sampling [Henrion’88]
A forward sampling algorithm
Given an ancestral ordering of the nodes (parents before children), generate a value for X once values have been generated for its parents (i.e. from the root nodes down to the leaves)
Use the conditional prob. of X given the known values of its parents
When all the nodes have been visited, we have a case, an instantiation of all the nodes in the BN
Repeat and use the observed frequencies to estimate P(Xi|e)
[Figure: network with nodes numbered 1 to 6]
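A minimal sketch of this sampler follows. The network, its numbers and the function names are assumptions made for illustration. Cases that disagree with the evidence e are discarded before counting, which is how logic sampling handles evidence and why it becomes slow when e is unlikely.

```python
import random

# Hypothetical network listed in ancestral order (parents before children).
# Each entry: (variable, parents, P(variable = 1 | parent values)); numbers are made up.
network = [
    ("B", (), {(): 0.1}),
    ("E", (), {(): 0.2}),
    ("A", ("B", "E"), {(0, 0): 0.01, (0, 1): 0.3, (1, 0): 0.9, (1, 1): 0.95}),
    ("W", ("A",), {(0,): 0.05, (1,): 0.8}),
]

def sample_case(rng):
    """Generate one full instantiation, visiting the nodes in ancestral order."""
    case = {}
    for var, parents, cpt in network:
        p1 = cpt[tuple(case[p] for p in parents)]
        case[var] = 1 if rng.random() < p1 else 0
    return case

def logic_sampling(target, evidence, n_cases, seed=0):
    """Estimate P(target = 1 | evidence) from the cases that match the evidence."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_cases):
        case = sample_case(rng)
        if all(case[v] == x for v, x in evidence.items()):
            kept += 1
            hits += case[target]
    estimate = hits / kept if kept else float("nan")
    return estimate, kept

estimate, kept = logic_sampling("B", {"W": 1}, n_cases=100_000)
print(f"P(B=1 | W=1) is about {estimate:.3f}, using {kept} of 100000 cases")
```

For these made-up numbers the exact answer is 0.07325 / 0.16415, roughly 0.446, and only about one case in six survives the evidence filter.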
Software
genie.sis.pitt.edu
http.cs.berkeley.edu/~murphyk/
leo.ugr.es/elvira