
Graphical models: approximate inference and learning

CA6b, lecture 5

Bayesian Networks

General Factorization

D-separation: Example

Trees

Undirected tree / Directed tree / Polytree

Converting Directed to Undirected Graphs (2)

Additional links

Inference on a Chain

Inference in an HMM

E step: belief propagation

[Figure: chain $s_1, \dots, s_{n-1}, s_n, s_{n+1}, \dots, s_N$]

Belief propagation in an HMM

E step: belief propagation

[Figure: chain $s_1, \dots, s_{n-1}, s_n, s_{n+1}, \dots, s_N$]

Expectation maximization in an HMM

E step: belief propagation

[Figure: chain $s_1, \dots, s_{n-1}, s_n, s_{n+1}, \dots, s_N$]

The Junction Tree Algorithm

• Exact inference on general graphs.
• Works by turning the initial graph into a junction tree and then running a sum-product-like algorithm.

Factor Graphs

Factor Graphs from Undirected Graphs

The Sum-Product Algorithm (6)

The Sum-Product Algorithm (5)

The Sum-Product Algorithm (3)

The Sum-Product Algorithm (7)

Initialization

[Figure: message passing on a tree. Sensory observations enter at the leaves (bottom-up pass); prior expectations enter at the root (top-down pass); forest / tree / stem structure with nodes $x_1, \dots, x_6$.]
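The schedule in this figure (collect messages from the leaves up to the root, then distribute them back down) can be written out directly. Below is a minimal sketch of sum-product message passing on a small binary tree; the tree structure, potentials and node names are toy assumptions of mine, not the lecture's example.

```python
# Minimal sketch: sum-product belief propagation on a small binary tree
# (bottom-up pass from the leaves to the root, then top-down back to the
# leaves). Tree, potentials and node names are toy assumptions.
import numpy as np

children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}   # node 0 is the root
parent = {1: 0, 2: 0, 3: 1, 4: 1}

psi = np.array([[0.8, 0.2],      # pairwise potential psi(x_parent, x_child),
                [0.2, 0.8]])     # favours parent and child agreeing
phi = {n: np.ones(2) for n in children}                   # unary evidence
phi[3] = np.array([0.9, 0.1])    # leaf 3: evidence for state 0
phi[4] = np.array([0.2, 0.8])    # leaf 4: evidence for state 1

msg = {}                         # msg[(i, j)] = message from node i to node j

def upward(n):
    """Bottom-up: send messages from the leaves towards the root."""
    for c in children[n]:
        upward(c)
        incoming = np.prod([msg[(g, c)] for g in children[c]], axis=0) \
                   if children[c] else np.ones(2)
        # m_{c->n}(x_n) = sum_{x_c} psi(x_n, x_c) phi_c(x_c) prod_g m_{g->c}(x_c)
        msg[(c, n)] = psi @ (phi[c] * incoming)

def downward(n):
    """Top-down: send messages from the root back towards the leaves."""
    for c in children[n]:
        others = [msg[(k, n)] for k in children[n] if k != c]
        if n in parent:
            others.append(msg[(parent[n], n)])
        incoming = np.prod(others, axis=0) if others else np.ones(2)
        msg[(n, c)] = psi.T @ (phi[n] * incoming)
        downward(c)

upward(0)
downward(0)

def marginal(n):
    nbrs = children[n] + ([parent[n]] if n in parent else [])
    b = phi[n] * np.prod([msg[(k, n)] for k in nbrs], axis=0)
    return b / b.sum()

print({n: marginal(n).round(3) for n in children})
```

On a tree this two-pass schedule gives exact marginals; the loopy version discussed later in the lecture reuses the same message updates but iterates them.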

Consequence of failing inhibition in hierarchical inference

Causal model / Pairwise factor graph

Bayesian network and factor graph

Pairwise graphs

Log belief ratio

Log message ratio

Belief propagation and inhibitory loops
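To make the link between message passing and inhibition concrete, here is a toy, linearised caricature (my own numbers and coupling, not the lecture's model) of log-domain message exchange between two nodes: each node should subtract the message it just received before replying, and when that inhibitory subtraction fails the same evidence circulates around the loop and the log belief ratio becomes inflated.

```python
# Toy illustration (my own numbers): in log-domain message passing each node
# subtracts the message it just received from a neighbour before sending its
# own message back. If that subtraction ("inhibition") fails, the same
# evidence is counted repeatedly and the log belief ratio is inflated.
import numpy as np

def run(alpha_inhibition, n_iters=20, w=0.5, evidence=(1.0, 0.5)):
    """alpha_inhibition = 1: subtraction intact; 0: subtraction removed."""
    L = np.array(evidence, dtype=float)   # log likelihood ratios at the 2 nodes
    m = np.zeros(2)                       # m[i] = last message received by node i
    for _ in range(n_iters):
        belief = L + m                    # log belief ratio at each node
        # message node i sends back: its belief minus what that neighbour told
        # it (the inhibitory subtraction), attenuated by the coupling w
        sent = w * (belief - alpha_inhibition * m)
        m = sent[::-1]                    # each node receives the other's message
    return L + m                          # final log belief ratios

print("intact inhibition :", run(1.0).round(2))
print("failing inhibition:", run(0.0).round(2))
```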


Tight excitatory/inhibitory balance is both required and sufficient.

Okun and Lampl, Nat Neurosci 2008

Inhibition

Excitation

Lewis et al., Nat Rev Neurosci 2005

Controls vs. schizophrenia

Support for impaired inhibition in schizophrenia

See also: Benes, Neuropsychopharmacology 2010; Uhlhaas and Singer, Nat Rev Neurosci 2010…

GAD67

Circular inference:

Impaired inhibitory loops

Circular inference and overconfidence:

Renaud Jardri, Alexandra Litvinova & Sandrine Duverne

The Fisher Task

Prior

Sensory evidence

Posterior confidence

Mean group responses

[Figure: confidence as a function of the log likelihood ratio and of the log prior ratio, for controls and for patients with schizophrenia (confidence from -8 to 8, ratios from -4 to 4).]

Simple Bayes:

[Figure series: confidence as a function of the log likelihood ratio and of the log prior ratio (ratios from -2 to 2, confidence from -3 to 3), repeated across successive slides.]

Mean parameter values

[Figure: fitted parameter values (mean + sd, scale 0.00 to 0.75) for controls (CTL) vs. patients (SCZ), with significance markers (*, ***); PANSS positive factor.]

Inference loops and psychosis

[Figure: non-clinical beliefs (PDI-21 scores) plotted against the strength of inference loops.]

The Junction Tree Algorithm

• Exact inference on general graphs.
• Works by turning the initial graph into a junction tree and then running a sum-product-like algorithm.
• Intractable on graphs with large cliques.

What if exact inference is intractable?

• Loopy belief propagation works in some scenarios.
• Markov chain Monte Carlo (MCMC) sampling methods (a sketch follows below).
• Variational methods (not covered here).
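A minimal sketch of the sampling option, assuming a toy pairwise binary Markov random field of my own (the graph, coupling $J$ and biases $h$ are not from the lecture): Gibbs sampling resamples one variable at a time from its conditional distribution and estimates marginals from the visited states.

```python
# Minimal sketch: MCMC (Gibbs sampling) for approximate inference in a small
# pairwise binary MRF. Graph, coupling and biases are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)

# 4 binary variables x_i in {-1, +1} on a loopy square graph
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
J = 0.8                                    # pairwise coupling (agreement preferred)
h = np.array([0.5, 0.0, -0.3, 0.0])        # local evidence (biases)

def neighbours(i):
    return [b for a, b in edges if a == i] + [a for a, b in edges if b == i]

x = rng.choice([-1, 1], size=4)
samples = []
for t in range(5000):
    for i in range(4):
        # conditional p(x_i = +1 | rest) is a logistic function of the local field
        field = h[i] + J * sum(x[j] for j in neighbours(i))
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
        x[i] = 1 if rng.random() < p_plus else -1
    if t >= 1000:                          # discard burn-in samples
        samples.append(x.copy())

marginals = (np.array(samples) == 1).mean(axis=0)
print("estimated p(x_i = +1):", marginals.round(3))
```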

Loopy Belief Propagation

• Sum-product on general graphs.
• Initial unit messages are passed across all links, after which messages are passed around until convergence (not guaranteed!); a sketch follows below.
• Approximate but tractable for large graphs.
• Sometimes works well, sometimes not at all.
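And a minimal sketch of loopy belief propagation itself, on a small graph with a single cycle (the graph and potentials are my own toy choices, not the lecture's): messages start at unit values on every directed edge and are updated repeatedly until they stop changing.

```python
# Minimal sketch: loopy belief propagation on a pairwise binary graph with a
# cycle. Messages are initialised to unit (uniform) values on every directed
# edge and iterated until convergence (not guaranteed in general).
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]               # a single loop
psi = np.array([[1.5, 0.5], [0.5, 1.5]])               # pairwise potential
phi = [np.array([0.7, 0.3]), np.ones(2), np.ones(2), np.array([0.4, 0.6])]

directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
msg = {e: np.ones(2) for e in directed}                # unit initialisation

for _ in range(100):
    new = {}
    for (i, j) in directed:
        # product of messages arriving at i from everyone except j
        incoming = np.ones(2)
        for (k, l) in directed:
            if l == i and k != j:
                incoming = incoming * msg[(k, i)]
        m = psi.T @ (phi[i] * incoming)                # sum over x_i
        new[(i, j)] = m / m.sum()                      # normalise for stability
    if max(np.abs(new[e] - msg[e]).max() for e in directed) < 1e-6:
        msg = new
        break
    msg = new

def belief(i):
    b = phi[i].copy()
    for (k, l) in directed:
        if l == i:
            b = b * msg[(k, i)]
    return b / b.sum()

print([belief(i).round(3) for i in range(4)])
```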


Neural code for uncertainty: sampling

Alternative neural code for uncertainty: sampling

Berkes et al., Science 2011

Learning in graphical models

More generally: learning parameters in latent variable models.

Visible variables $x$, hidden variables $h$, model $p(x, h \mid \theta)$.

$\hat{\theta} = \arg\max_{\theta} \; p(x \mid \theta)$

$p(x \mid \theta) = \sum_{h} p(x, h \mid \theta)$

The sum over all hidden configurations is huge!

Mixture of Gaussians (clustering algorithm)

Data (unsupervised).

Generative model: M possible clusters, each described by a Gaussian distribution.

Parameters: the mixing proportions, means and covariances of the M clusters.

Expectation stage:
Given the current parameters and the data, what are the expected hidden states? This yields the "responsibility" of each cluster for each data point.

Maximization stage:
Given the responsibilities of each cluster, update the parameters to maximize the likelihood of the data.
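As a concrete sketch of these two stages (a minimal implementation of my own, with spherical covariances and toy defaults rather than the lecture's code), EM for a mixture of Gaussians alternates computing responsibilities and re-estimating means, variances and mixing proportions.

```python
# Minimal sketch of EM for a mixture of Gaussians. Spherical covariances and
# the variable names are simplifying assumptions, not the lecture's code.
import numpy as np

def em_gmm(x, M, n_iters=50):
    """x: (N, D) data; M: number of clusters."""
    N, D = x.shape
    rng = np.random.default_rng(0)
    mu = x[rng.choice(N, M, replace=False)]        # cluster means
    var = np.full(M, x.var())                      # spherical variances
    pi = np.full(M, 1.0 / M)                       # mixing proportions

    for _ in range(n_iters):
        # E step: responsibilities r[n, k] = p(cluster k | x_n, current params)
        sq_dist = ((x[:, None, :] - mu[None]) ** 2).sum(-1)            # (N, M)
        log_r = np.log(pi) - 0.5 * D * np.log(2 * np.pi * var) - sq_dist / (2 * var)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M step: re-estimate parameters from the responsibilities
        Nk = r.sum(axis=0)                         # effective cluster counts
        mu = (r.T @ x) / Nk[:, None]
        var = np.array([(r[:, k] * ((x - mu[k]) ** 2).sum(-1)).sum() / (D * Nk[k])
                        for k in range(M)])
        pi = Nk / N
    return pi, mu, var
```

With full covariance matrices and a log-likelihood convergence check this becomes the standard EM for Gaussian mixtures; the spherical simplification just keeps the sketch short.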

Learning in hidden Markov models

[Figure: HMM with hidden states $x_{t-1}, x_t, x_{t+1}$ and observations $s_t$; the hidden cause generates the observations through the forward model (sensory likelihood), and inference corresponds to the inverse model.]

[Figure: example over time, with hidden state $x_{t-\delta t}, x_t, x_{t+\delta t}$ = object present or not, and observations $s_{1,t}, s_{2,t}$ = receptor spike or not. The log-odds $L_t$ is driven by a leak term plus the synaptic input $\sum_i w_i s_{i,t}$.]

Bayesian integration corresponds to leaky integration.
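As a toy illustration of that claim (the parameters and code are my own, not the lecture's), the sketch below filters a two-state HMM exactly: the prediction step keeps pulling the posterior back towards the prior (the leak), while each observation adds evidence (the synaptic input), so the log-odds $L_t$ behaves like a leaky integrator of the incoming spikes.

```python
# Toy sketch: exact filtering of a two-state HMM (object present or not, with
# spike-like observations), tracked through the log-odds L_t. The prediction
# step acts as a leak, the observations act as synaptic input.
import numpy as np

rng = np.random.default_rng(1)
dt, T = 0.01, 2000
p_on, p_off = 0.02, 0.02            # transition probabilities per time step
r_on, r_off = 50.0, 10.0            # spike rates (Hz) when present / absent

# simulate the hidden state and the spikes
x = np.zeros(T, dtype=int)
for t in range(1, T):
    flip = rng.random() < (p_on if x[t - 1] == 0 else p_off)
    x[t] = 1 - x[t - 1] if flip else x[t - 1]
spikes = rng.random(T) < np.where(x == 1, r_on, r_off) * dt

# exact recursive filtering, via the probability p_t = p(x_t = 1 | s_1..t)
p, L = 0.5, np.zeros(T)
for t in range(T):
    p = p * (1 - p_off) + (1 - p) * p_on              # prediction: leak towards prior
    lik1 = r_on * dt if spikes[t] else 1 - r_on * dt  # evidence: synaptic input
    lik0 = r_off * dt if spikes[t] else 1 - r_off * dt
    p = p * lik1 / (p * lik1 + (1 - p) * lik0)        # Bayes update
    L[t] = np.log(p / (1 - p))                        # log-odds of "object present"

print("mean log-odds when present:", L[x == 1].mean().round(2),
      " when absent:", L[x == 0].mean().round(2))
```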

Expectation maximization in an HMM

[Figure: chain $s_1, \dots, s_{n-1}, s_n, s_{n+1}, \dots, s_N$]

Multiple training sequences: $s_1^u, s_2^u, \dots, s_N^u$ for $u = 1, 2, \dots$

What are the parameters?
Transition probabilities: $r_{ij} = p(x_{n+1} = i \mid x_n = j)$
Observation probabilities: $q_{jk} = p(s_n = k \mid x_n = j)$

Expectation stage

E step: belief propagation
[Figure: chain $s_1, \dots, s_{n-1}, s_n, s_{n+1}, \dots, s_N$]
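Putting the two stages together gives the standard Baum-Welch procedure. The sketch below is my own minimal version (toy sequences and array names are assumptions, not the lecture's code): the E step runs a forward-backward pass, playing the role of belief propagation over each training sequence, and the M step re-estimates the transition and observation probabilities from the expected counts.

```python
# Minimal sketch of EM (Baum-Welch) for a discrete HMM with hidden states x_n
# and observations s_n. Toy data and array names are assumptions.
import numpy as np

def baum_welch(seqs, n_states, n_obs, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    r = rng.dirichlet(np.ones(n_states), size=n_states)   # r[j, i] = p(x_{n+1}=i | x_n=j)
    q = rng.dirichlet(np.ones(n_obs), size=n_states)      # q[j, k] = p(s_n=k | x_n=j)
    pi = np.full(n_states, 1.0 / n_states)                 # initial state distribution

    for _ in range(n_iters):
        A_counts = np.zeros((n_states, n_states))
        B_counts = np.zeros((n_states, n_obs))
        pi_counts = np.zeros(n_states)
        for s in seqs:
            N = len(s)
            alpha = np.zeros((N, n_states))                # forward messages
            beta = np.zeros((N, n_states))                 # backward messages
            alpha[0] = pi * q[:, s[0]]
            alpha[0] /= alpha[0].sum()
            for n in range(1, N):
                alpha[n] = (alpha[n - 1] @ r) * q[:, s[n]]
                alpha[n] /= alpha[n].sum()
            beta[-1] = 1.0
            for n in range(N - 2, -1, -1):
                beta[n] = r @ (q[:, s[n + 1]] * beta[n + 1])
                beta[n] /= beta[n].sum()
            gamma = alpha * beta                           # posterior over x_n
            gamma /= gamma.sum(axis=1, keepdims=True)
            # expected transition counts xi[j, i] ~ p(x_n=j, x_{n+1}=i | s)
            for n in range(N - 1):
                xi = alpha[n][:, None] * r * (q[:, s[n + 1]] * beta[n + 1])[None, :]
                A_counts += xi / xi.sum()
            for n in range(N):
                B_counts[:, s[n]] += gamma[n]
            pi_counts += gamma[0]
        # M step: normalise the expected counts
        r = A_counts / A_counts.sum(axis=1, keepdims=True)
        q = B_counts / B_counts.sum(axis=1, keepdims=True)
        pi = pi_counts / pi_counts.sum()
    return pi, r, q

# toy usage: three short observation sequences over an alphabet of size 3
seqs = [np.array([0, 0, 1, 2, 2]), np.array([2, 2, 1, 0]), np.array([0, 1, 2])]
pi, r, q = baum_welch(seqs, n_states=2, n_obs=3)
print(np.round(r, 2), np.round(q, 2))
```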

Using “on-line” expectation maximization, a neuron can adapt to the statistics of its input.

$q_{i1}, q_{i0}$

$r_{\mathrm{on}}, r_{\mathrm{off}}$

Fast adaptation in single neurons

Adaptation to temporal statistics? Fairhall et al., 2001
