sampling methods -- iihic/8803-fall-09/slides/8803-09-lec18.pdf · henrik i. christensen (rim@gt)...
Introduction MCMC Gibbs Sampling Slice Sampling Hybrid MC Summary
Sampling Methods – II
Henrik I. Christensen
Robotics & Intelligent Machines @ GT
Georgia Institute of Technology, Atlanta, GA
[email protected]
Henrik I. Christensen (RIM@GT) Sampling Methods – II 1 / 23
Outline
1 Introduction
2 Markov Chain Monte Carlo
3 Gibbs Sampling
4 Slice Sampling
5 Hybrid Monte-Carlo
6 Summary
Introduction
Last time we talked about sampling methods
Generation of distribution estimates based on sampling of the input space
Discussed rejection sampling and importance sampling
A typical problem is high rejection rates and poor generalization to higher-dimensional spaces
Today: discussion of methods that generalize to higher-dimensional spaces.
Markov Chain Monte Carlo
We will sample from a proposal distribution
We maintain a record of the current sample z(τ) and use a proposal distribution q(z|z(τ))
Assume p(z) = p̃(z)/Zp
Assume we can evaluate p̃(z)
Generate a candidate sample z∗ and accept it if a criterion is satisfied.
Metropolis Algorithm
Assume a symmetric proposal, q(zA|zB) = q(zB|zA)
The acceptance criterion is then
A(z∗, z(τ)) = min(1, p̃(z∗) / p̃(z(τ)))
Generate a random number u ∈ (0, 1)
Update
z(τ+1) = z∗ if A(z∗, z(τ)) > u, otherwise z(τ+1) = z(τ)
I.e., a candidate that improves on the current state is always accepted; otherwise it is accepted with probability equal to the density ratio
The basic Metropolis sampler is a restricted random walk and as such not very efficient
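The update rule above can be sketched in a few lines of Python. The target p_tilde here is an illustrative choice (a standard Gaussian with its normalizer Zp dropped), and the proposal is a symmetric Gaussian random walk, so the simple Metropolis criterion applies:

```python
import math
import random

def p_tilde(z):
    # Unnormalized target density: exp(-z^2/2), i.e. a standard
    # Gaussian with the normalizing constant Zp dropped.
    return math.exp(-0.5 * z * z)

def metropolis(p_tilde, z0, n_samples, step=1.0, seed=0):
    """Basic Metropolis sampler with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    z = z0
    samples = []
    for _ in range(n_samples):
        z_star = z + rng.gauss(0.0, step)            # symmetric proposal q(z*|z)
        a = min(1.0, p_tilde(z_star) / p_tilde(z))   # acceptance A(z*, z(tau))
        if a > rng.random():                         # accept if A > u, u ~ U(0,1)
            z = z_star
        samples.append(z)                            # otherwise keep the old state
    return samples

samples = metropolis(p_tilde, z0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With enough iterations the sample mean and variance approach those of the standard Gaussian, illustrating that the random walk, while inefficient, does converge.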
Markov Chains
Assume we have a sequence of random variables z(1), z(2), z(3), ..., z(M)
A first-order Markov chain is defined by the conditional independence
p(z(m+1)|z(1), z(2), ..., z(m)) = p(z(m+1)|z(m))
The marginal probability is then given by the transition probabilities and the initial prior
p(z(m+1)) = Σz(m) p(z(m+1)|z(m)) p(z(m))
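The marginal update is just a sum over the previous state, i.e. a matrix–vector product. A minimal sketch, using an illustrative two-state transition matrix (not from the slides):

```python
def propagate(prior, T, steps):
    """Propagate a marginal through a homogeneous Markov chain:
    p(z^(m+1)) = sum over z^(m) of p(z^(m+1)|z^(m)) p(z^(m)).
    T[i][j] is p(next state = j | current state = i)."""
    p = list(prior)
    n = len(p)
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
    return p

# Illustrative two-state chain; its stationary distribution is (5/6, 1/6).
T = [[0.9, 0.1],
     [0.5, 0.5]]
p10 = propagate([1.0, 0.0], T, 10)
```

After a handful of steps the marginal is already close to the stationary distribution, regardless of the prior.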
Markov Chain Properties
A Markov chain is called homogeneous when the transition probabilities p(·|·) are the same for all steps
A distribution p∗(z) is invariant (stationary) if it is left unchanged by the chain, i.e.
p∗(z) = Σz′ p(z|z′) p∗(z′)
A sufficient condition for invariance is that the transition probabilities satisfy detailed balance:
p∗(z)p(z′|z) = p∗(z′)p(z|z′)
We require that the desired distribution is invariant and that the chain converges to it as m → ∞
This property is called ergodicity, and the final distribution is termed the equilibrium distribution
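Both conditions can be checked numerically for a small discrete chain. The two-state transition matrix below is an illustrative example (not from the slides) whose stationary distribution follows from its transition rates:

```python
# Two-state chain: rows of T are p(z'|z) (illustrative numbers).
T = [[0.8, 0.2],
     [0.4, 0.6]]
# Its stationary distribution, found from pi[0] * 0.2 = pi[1] * 0.4.
pi = [2 / 3, 1 / 3]

# Invariance: p*(z) = sum over z' of p(z|z') p*(z')
invariant = all(
    abs(pi[z] - sum(T[zp][z] * pi[zp] for zp in range(2))) < 1e-12
    for z in range(2)
)

# Detailed balance: p*(z) p(z'|z) = p*(z') p(z|z')
balanced = all(
    abs(pi[z] * T[z][zp] - pi[zp] * T[zp][z]) < 1e-12
    for z in range(2) for zp in range(2)
)
```

Here detailed balance holds, and as the slide states, it is sufficient (though not necessary) for invariance.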
Gibbs Sampling
Gibbs sampling is a widely applicable MCMC algorithm
Consider a distribution p(z) = p(z1, z2, ..., zM)
In each step one of the variables is sampled conditioned on the current values of the other variables
Example: consider p(z1, z2, z3)
Updated by sampling in turn from
p(z1 | z2(τ), z3(τ))   p(z2 | z1(τ), z3(τ))   p(z3 | z1(τ), z2(τ))
Continue until convergence
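As a concrete instance of the sweep above, consider a zero-mean bivariate Gaussian with unit variances and correlation rho (an illustrative target, not one from the slides). Its conditionals are the standard Gaussian conditionals z1 | z2 ~ N(rho·z2, 1 − rho²), and symmetrically, so each Gibbs step is a one-dimensional Gaussian draw:

```python
import math
import random

def gibbs_bivariate_gaussian(rho, n_samples, seed=0):
    """Gibbs sampler for a zero-mean bivariate Gaussian with unit
    variances and correlation rho. Each step samples one coordinate
    from its exact conditional: z1 | z2 ~ N(rho*z2, 1 - rho^2)."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    z1, z2 = 0.0, 0.0
    samples = []
    for _ in range(n_samples):
        z1 = rng.gauss(rho * z2, sd)   # sample p(z1 | z2)
        z2 = rng.gauss(rho * z1, sd)   # sample p(z2 | z1)
        samples.append((z1, z2))
    return samples

samples = gibbs_bivariate_gaussian(rho=0.8, n_samples=20000)
m1 = sum(z1 for z1, _ in samples) / len(samples)
corr = sum(z1 * z2 for z1, z2 in samples) / len(samples)
```

The empirical correlation of the draws recovers rho; with strong correlation the axis-aligned updates mix slowly, which motivates the figure on the next slide.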
Gibbs Example
[Figure: Gibbs sampling in a correlated two-dimensional distribution over (z1, z2); the sampler takes axis-aligned steps of typical size l, the width of the conditional distributions, while the distribution extends over the much larger scale L]
Gibbs Sampling in Graphical Models
Initialize the variables in the parent tree and traverse the tree/graph
Slice Sampling
Metropolis is sensitive to the sampling step size
Slice sampling introduces an auxiliary height variable and adapts the sampling interval, so the step size is explored automatically.
[Figure: (a) given z(τ), a height u is drawn uniformly from (0, p̃(z(τ))), defining a "slice" through p̃(z); (b) the next sample is drawn uniformly from a bracket [zmin, zmax] covering the slice]
Hybrid Monte-Carlo
The Metropolis algorithm has step size issues
Introduction of a method with adaptive step sizes and low rejection rates
Adoption of a dynamical-systems approach to the sampling problem
Dynamical Systems
In physics the Hamiltonian expresses the total energy of a system
If we consider a particle in motion, the momentum is
r = dz/dτ
The joint space of state and momentum (z, r) is referred to as the phase space
We can rewrite the probability as
p(z) = (1/Zp) exp(−E(z))
The acceleration / rate of change of the momentum is
dr/dτ = −∂E(z)/∂z
The kinetic energy is K(r) = (1/2)‖r‖²
Hamiltonian model
The Hamiltonian is then
H(z , r) = E (z) + K (r)
The coupled system is then
dzi/dτ = ∂H/∂ri
dri/dτ = −∂H/∂zi
Hamiltonian model
The Hamiltonian is constant along a trajectory, but energy can be traded between z and r
We can control the motion of the dynamical system; as an example, r can be redrawn as a sample from its own distribution p(r) ∝ exp(−K(r))
This parallels Newton-Raphson optimization, where gradient information is used to control the step size.
Leapfrog Discretization
Discretization with alternating half-step updates of the variables:
ri(τ + ε/2) = ri(τ) − (ε/2) ∂E/∂zi(z(τ))
zi(τ + ε) = zi(τ) + ε ri(τ + ε/2)
ri(τ + ε) = ri(τ + ε/2) − (ε/2) ∂E/∂zi(z(τ + ε))
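The three updates above can be chained so that consecutive half-steps in r merge into full steps. A minimal sketch; the quadratic energy E(z) = z²/2 is an illustrative choice (a harmonic oscillator), for which near-conservation of H is easy to verify:

```python
def leapfrog(z, r, grad_E, eps, n_steps):
    """Leapfrog integration: half-step in r, full steps alternating
    z and r, then a final half-step in r."""
    r = r - 0.5 * eps * grad_E(z)
    for _ in range(n_steps - 1):
        z = z + eps * r
        r = r - eps * grad_E(z)
    z = z + eps * r
    r = r - 0.5 * eps * grad_E(z)
    return z, r

# Illustrative energy: E(z) = z^2/2, so grad E(z) = z and
# H(z, r) = z^2/2 + r^2/2.
H = lambda z, r: 0.5 * z * z + 0.5 * r * r
z0, r0 = 1.0, 0.0
z1, r1 = leapfrog(z0, r0, lambda z: z, eps=0.1, n_steps=100)
drift = abs(H(z1, r1) - H(z0, r0))
```

The energy error of leapfrog oscillates but stays bounded (of order ε²) rather than accumulating, which is why it is the standard integrator here.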
Hybrid Monte-Carlo
Consider a state (z, r) and an updated state (z∗, r∗)
We then accept the candidate with probability
min(1, exp(H(z, r) − H(z∗, r∗)))
Given that the Hamiltonian is in principle conserved by the dynamics, the strategy is to make a 'random' change to the momentum before the leapfrog integration and then consider the update.
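Putting the pieces together, one iteration is: refresh the momentum, run leapfrog, then accept or reject with the probability above. A minimal one-dimensional sketch; the standard-Gaussian target E(z) = z²/2 is an illustrative choice:

```python
import math
import random

def hmc(grad_E, E, z0, n_samples, eps=0.1, n_steps=20, seed=0):
    """Hybrid (Hamiltonian) Monte Carlo for a 1-D target with
    p(z) proportional to exp(-E(z)). Each iteration resamples the
    momentum, integrates the dynamics with leapfrog, and applies a
    Metropolis step with probability min(1, exp(H - H*))."""
    rng = random.Random(seed)
    z = z0
    samples = []
    for _ in range(n_samples):
        r = rng.gauss(0.0, 1.0)              # 'random' momentum refresh
        z_new, r_new = z, r
        # Leapfrog integration of the Hamiltonian dynamics.
        r_new -= 0.5 * eps * grad_E(z_new)
        for _ in range(n_steps - 1):
            z_new += eps * r_new
            r_new -= eps * grad_E(z_new)
        z_new += eps * r_new
        r_new -= 0.5 * eps * grad_E(z_new)
        # Metropolis correction for the discretization error.
        h_old = E(z) + 0.5 * r * r
        h_new = E(z_new) + 0.5 * r_new * r_new
        if rng.random() < min(1.0, math.exp(h_old - h_new)):
            z = z_new
        samples.append(z)
    return samples

# Illustrative standard-Gaussian target: E(z) = z^2/2, grad E(z) = z.
samples = hmc(lambda z: z, lambda z: 0.5 * z * z, 0.0, 5000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because leapfrog nearly conserves H, the acceptance rate stays high even for long trajectories, and the gradient steers the proposals away from pure random-walk behavior.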
Summary
MCMC is about tracking the state of the chain during sampling
How can we use the current estimate to update the variables iteratively?
Strategies for the update:
Metropolis - basic random walk
Slicing - a way to adapt step sizes
Gibbs Sampling - stepwise updating of individual variables
Hybrid MCMC - a way to integrate gradient information