Pliable rejection sampling

Michal Valko (Inria Lille - Nord Europe, France)

with Akram Erraqabi (Montréal Institute of Learning Algorithms, Canada), Alexandra Carpentier (Otto-von-Guericke-Universität Magdeburg, Germany), and Odalric-Ambrym Maillard (Inria Lille - Nord Europe, France)

SequeL – Inria Lille

GdR ISIS, Paris, February 2018

Adapting to unknown smoothness

Learning the envelope for rejection sampling

Smooth functions are easier to learn

How to adapt to the unknown smoothness?

How to trade off between learning and sampling?

Pliable rejection sampling SequeL - 1/25

Vanilla rejection sampling

Goal: sample from a target density $f$ that is not easy to sample from.
Tool: use a proposal density $g$ from which sampling is easy.

[Figure: target $f$, proposal $g$, and envelope $Mg$ over $x$, with accepted area $A$ and rejected area $R$]

The constant $M$ satisfies $f \le Mg$. Rejection sampling:

1. Sample $x$ from $g$.
2. Accept $x$ as a sample from $f$ with probability $\frac{f(x)}{M g(x)}$.

$$\text{acceptance rate} = \frac{A}{A+R} = \frac{1}{M}$$

Question: can we increase the acceptance rate?
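The two steps of vanilla rejection sampling can be sketched in a few lines. This is only an illustration under stated assumptions: the target $f(x) = 6x(1-x)$ on $[0,1]$, the uniform proposal, and the helper name `rejection_sample` are chosen for the example and do not come from the slides. With a normalized target, the empirical acceptance rate should land near $1/M$.

```python
import numpy as np

def rejection_sample(f, sample_g, pdf_g, M, n_proposals, rng):
    """Propose n_proposals points from g and keep each with probability f(x) / (M g(x))."""
    x = sample_g(n_proposals, rng)
    u = rng.uniform(size=n_proposals)
    accept = u * M * pdf_g(x) <= f(x)
    return x[accept]

# Illustrative (assumed) target: the Beta(2, 2) density, whose maximum is 1.5 at x = 0.5.
f = lambda x: 6.0 * x * (1.0 - x)
# Assumed proposal: uniform on [0, 1], so g(x) = 1 and M = 1.5 gives f <= M g.
sample_g = lambda size, rng: rng.uniform(0.0, 1.0, size=size)
pdf_g = lambda x: np.ones_like(x)
M = 1.5

rng = np.random.default_rng(0)
n_proposals = 100_000
samples = rejection_sample(f, sample_g, pdf_g, M, n_proposals, rng)
acceptance_rate = samples.size / n_proposals
print(acceptance_rate, 1.0 / M)  # the two values should be close
```

A looser envelope (larger $M$) keeps the sampler exact but wastes more evaluations of $f$, which is the cost the rest of the talk tries to reduce.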

The setting

Let $d \ge 1$ and let $f$ be a density on $\mathbb{R}^d$.

Goal: given a number $n$ of requests to $f$, what is the number $T$ of samples $Y_1, \dots, Y_T$ that we can generate such that they are i.i.d. and sampled according to $f$?

$$\text{acceptance rate} = \frac{T}{n}$$

Can we increase the acceptance rate? Adaptive Rejection Sampling

Adaptive Rejection Sampling (ARS) [Gilks and Wild 1992]

- The target $f$ is assumed to be log-concave (unimodal).
- The envelope is made of tangents at a set of points $S$.
- At each rejection, the sample is added to $S$.

[Figure: tangent envelope of $\log(f)$]

Very strong assumption!

Can we increase the acceptance rate? Improved ARS versions

Adaptive Rejection Metropolis Sampling (ARMS) [Gilks, Best and Tan 1995]

- Can deal with non-log-concave densities.
- Performs a Metropolis-Hastings control for each accepted sample.
- At each rejection, the sample is added to $S$.

Correlated samples!

Convex-Concave Adaptive Rejection Sampling [Görür and Teh 2011]

- Decomposes the target as convex + concave.
- Builds piecewise linear upper bounds (tangents, secant lines).
- At each rejection, the sample is added to $S$.

Convexity assumption!

Pliable solution: folding the envelope

[Figure: target $f$, its estimate $\hat f$, and the envelope $Mg$, with accepted area $A$ and rejected area $R$]

$$\text{acceptance rate} = \frac{A}{A+R} = \frac{1}{M}$$

A better proposal means a smaller rejection area $R$.

A smaller $R$ means $g$ should have a similar "shape" to $f$.

For this purpose:

- Build an estimate $\hat f$.
- Translate it uniformly.

Warning: it should be easy to sample from $g$ ... and from $\hat f$!

Assumption on the target density $f$

- The positive function $f$, defined on $[0,A]^d$, is bounded, i.e., there exists $c > 0$ such that $f(x) \le c$.
- $f$ can be uniformly expanded by a Taylor expansion at any point up to some degree $0 < s \le 2$:

$$\left| f(x+u) - f(x) - \langle \nabla f(x), u \rangle \mathbf{1}\{s > 1\} \right| \le c'' \|u\|_2^s.$$

Remarks:

- $f$ is in a Hölder ball of smoothness $s$.
- This is not very restrictive for a small $s$.
- $f$ can be an unnormalized density (useful for some Bayesian methods).

Visualizing a 2D example: multimodal case

$$f(x, y) \propto \left(1 + \sin\left(4\pi x - \frac{\pi}{2}\right)\right)\left(1 + \sin\left(4\pi y - \frac{\pi}{2}\right)\right)$$

Figure: 2D target density (orange) and the pliable proposal (blue)
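To get a feel for how hard this multimodal target is for a non-adaptive sampler, the sketch below runs plain rejection sampling with a uniform proposal on it. The domain $[0,1]^2$ is an assumption read off the figure axes, and the flat envelope at the maximum value 4 is the simplest possible choice, not something the slides prescribe. Each factor integrates to 1 over $[0,1]$ and is at most 2, so the acceptance rate should be near $1/4$, which appears consistent with the 25.0% reported for SRS in the 2D experiment table.

```python
import numpy as np

def f(x, y):
    """Unnormalized 2D target from the slide, evaluated on the (assumed) domain [0, 1]^2."""
    return (1 + np.sin(4 * np.pi * x - np.pi / 2)) * (1 + np.sin(4 * np.pi * y - np.pi / 2))

rng = np.random.default_rng(0)
n = 200_000
x, y = rng.uniform(size=n), rng.uniform(size=n)

f_max = 4.0  # each factor is at most 2, so a flat envelope at 4 dominates f
accept = rng.uniform(size=n) * f_max <= f(x, y)
srs_rate = accept.mean()

# Monte Carlo estimate of the integral of f over the unit square (exact value: 1).
integral_estimate = f(x, y).mean()
print(srs_rate, integral_estimate)
```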

Pliable Rejection Sampling. Step 1: Estimating $f$

- $f$ is defined on $[0,A]^d$, bounded and smooth.
- $K$ is a positive kernel on $\mathbb{R}^d$ (product kernel).
- Let $X_1, \dots, X_N \sim \mathcal{U}_{[0,A]^d}$. The (modified) kernel regression estimate is

$$\hat f(x) = \frac{A^d}{N h^d} \sum_{i=1}^{N} f(X_i)\, K\!\left(\frac{X_i - x}{h}\right).$$

For a density with unbounded support, some extra information is needed to construct a kernel-based estimate.

Cost: $N$ requests to $f$ out of $n$.
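The estimate can be written down directly from its formula. The sketch below is a minimal one-dimensional version ($d = 1$) with a Gaussian kernel; the function name `kernel_estimate`, the bandwidth `h = 0.02`, and the test target (a 1D slice of the multimodal example) are illustrative assumptions, not the authors' settings. The error is measured on interior points only, since a kernel estimate on a bounded interval is less accurate near the boundary.

```python
import numpy as np

def kernel_estimate(f, N, A, h, rng):
    """Return f_hat with f_hat(x) = A / (N h) * sum_i f(X_i) K0((X_i - x) / h), for d = 1."""
    X = rng.uniform(0.0, A, size=N)   # X_1, ..., X_N uniform on [0, A]
    fX = f(X)                         # the N requests to f

    def f_hat(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        z = (X[None, :] - x[:, None]) / h
        K = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)   # Gaussian kernel K0
        return (A / (N * h)) * (K * fX[None, :]).sum(axis=1)

    return f_hat, X, fX

# Assumed test target: one factor of the 2D multimodal example.
f = lambda x: 1 + np.sin(4 * np.pi * x - np.pi / 2)

rng = np.random.default_rng(0)
f_hat, X, fX = kernel_estimate(f, N=20_000, A=1.0, h=0.02, rng=rng)

grid = np.linspace(0.2, 0.8, 61)      # interior points, away from boundary effects
max_err = np.max(np.abs(f_hat(grid) - f(grid)))
print(max_err)
```

Because the estimate is a weighted sum of kernels centered at the $X_i$, it is also easy to sample from: pick a center with probability proportional to $f(X_i)$ and add kernel noise scaled by $h$.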

Assumption on the kernel $K$

Let $K_0$ be a positive univariate density kernel defined on $\mathbb{R}$, and let $K$ be the product kernel

$$K(x) = \prod_{i=1}^{d} K_0(x_i).$$

Furthermore, $K_0$ is of degree 2, i.e., it satisfies

$$\int_{\mathbb{R}} x K_0(x)\,dx = 0,$$

and, for some $C' > 0$,

$$\int_{\mathbb{R}} x^2 K_0(x)\,dx \le C'.$$

$K_0$ is $\varepsilon$-Hölder for some $\varepsilon > 0$, i.e., there exists $C'' > 0$ such that, for any $(x, y) \in \mathbb{R}^2$,

$$|K_0(y) - K_0(x)| \le C'' |x - y|^{\varepsilon}.$$

The Gaussian kernel satisfies this with $C = 1$, $C' = 1$, $C'' = 4$, and $\varepsilon = 1$.

Pliable Rejection Sampling: bounding the gap

Theorem 1. The estimate $\hat f$ is such that, with probability larger than $1 - \delta$, for any point $x \in [0,A]^d$,

$$\left| \hat f(x) - f(x) \right| \le H_0 \left( \frac{\log(N A^d / \delta)}{N} \right)^{\frac{s}{2s+d}},$$

where $H_0$ is a constant that depends on the problem parameters.

$s$ is the degree to which $f$ can be expanded as a Taylor expansion.

Remaining budget: $n - N$.

Pliable Rejection Sampling. Step 2: Generating samples

- Remaining requests to $f$: $n - N$.
- Let $r_N = A^d H_C \left( \frac{\log(N A^d / \delta)}{N} \right)^{\frac{s}{2s+d}}$.
- Construct the pliable proposal $g$ out of $\hat f$:

$$g = \frac{\hat f + r_N\, \mathcal{U}_{[0,A]^d}}{\frac{A^d}{N} \sum_{i=1}^{N} f(X_i) + r_N}$$

- Perform rejection sampling using $g$ and the empirical rejection sampling constant

$$M = \frac{\frac{A^d}{N} \sum_{i} f(X_i) + r_N}{\frac{A^d}{N} \sum_{i} f(X_i) - 5 r_N}$$
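A short worked step shows why the empirical mean of the $f(X_i)$, scaled by the volume $A^d$, is the natural normalizer for the proposal. It assumes, as in the kernel assumption, that $K$ is a product of univariate density kernels and therefore integrates to 1; the integral is taken over all of $\mathbb{R}^d$, so kernel mass that falls outside $[0,A]^d$ is counted as part of the proposal.

```latex
\int_{\mathbb{R}^d} \hat f(x)\,dx
  = \frac{A^d}{N h^d} \sum_{i=1}^{N} f(X_i) \int_{\mathbb{R}^d} K\!\left(\frac{X_i - x}{h}\right) dx
  = \frac{A^d}{N h^d} \sum_{i=1}^{N} f(X_i)\, h^d
  = \frac{A^d}{N} \sum_{i=1}^{N} f(X_i),
\qquad
\int_{[0,A]^d} r_N\, \mathcal{U}_{[0,A]^d}(x)\,dx = r_N .
```

Adding the two integrals gives the total mass of $\hat f + r_N\,\mathcal{U}_{[0,A]^d}$, namely $\frac{A^d}{N}\sum_i f(X_i) + r_N$. Dividing by it yields a mixture density: the kernel part with weight proportional to $\frac{A^d}{N}\sum_i f(X_i)$ and the uniform part with weight proportional to $r_N$.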

The algorithm

[Figure: target $f$, estimate $\hat f$, and proposal $Mg$]

Algorithm: Pliable Rejection Sampling (PRS)

Input: $s$, $n$, $\delta$, $H_C$

Initial sampling: draw uniformly at random $N$ samples on $[0,A]^d$.

Estimation of $f$: estimate $f$ using these $N$ samples by kernel regression.

Generating the samples: sample $n - N$ samples from the pliable proposal $g$ and perform rejection sampling using $M$ as the envelope constant.

Output: $\hat n$ accepted samples
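The three phases of the algorithm can be strung together in a short one-dimensional sketch. It is not the authors' implementation and departs from the slides in two labeled ways. First, the margin `r` and bandwidth `h` are hand-picked hypothetical parameters standing in for $r_N$ and the theoretical bandwidth, which depend on the unknown constant $H_C$. Second, instead of the constant $M$, the acceptance test uses the unnormalized envelope $\hat f(x) + r/A$ directly, which is a valid rejection sampler whenever that envelope dominates $f$. The function name `prs_1d` and the test target are also assumptions.

```python
import numpy as np

SQRT_2PI = np.sqrt(2.0 * np.pi)

def prs_1d(f, n, N, A, h, r, rng):
    """Sketch of PRS on [0, A] (d = 1): N requests for estimation, n - N proposals."""
    # Initial sampling: N uniform points and their f-values.
    X = rng.uniform(0.0, A, size=N)
    fX = f(X)
    S = A * fX.mean()                      # Monte Carlo estimate of the integral of f

    def f_hat(x, chunk=1000):              # Gaussian-kernel estimate, evaluated in chunks
        out = np.empty(x.size)
        for start in range(0, x.size, chunk):
            z = (X[None, :] - x[start:start + chunk, None]) / h
            out[start:start + chunk] = (A / (N * h)) * (np.exp(-0.5 * z**2) / SQRT_2PI) @ fX
        return out

    # Proposals from g = (f_hat + r * U[0, A]) / (S + r): with probability r / (S + r)
    # draw uniformly, otherwise pick a center X_i with weight f(X_i) and add kernel noise.
    m = n - N
    from_uniform = rng.uniform(size=m) < r / (S + r)
    centers = rng.choice(X, size=m, p=fX / fX.sum())
    x = np.where(from_uniform, rng.uniform(0.0, A, size=m), centers + h * rng.normal(size=m))

    inside = (x >= 0.0) & (x <= A)
    fx = np.where(inside, f(np.clip(x, 0.0, A)), 0.0)   # f vanishes outside [0, A]
    envelope = f_hat(x) + r * inside / A                # equals (S + r) * g(x)
    violations = int(np.sum(fx > envelope))             # points where the envelope fails
    accepted = x[rng.uniform(size=m) * envelope < fx]
    return accepted, violations

# Assumed target: one factor of the multimodal example, whose integral over [0, 1] is 1.
f = lambda x: 1 + np.sin(4 * np.pi * x - np.pi / 2)

rng = np.random.default_rng(0)
n, N = 20_000, 4_000
accepted, violations = prs_1d(f, n, N, A=1.0, h=0.03, r=0.5, rng=rng)
rate = accepted.size / (n - N)
print(rate, violations)
```

With these settings the acceptance rate on the sampling phase should sit near $1 / (1 + r)$, above the $1/2$ that a flat envelope at the maximum of this target would give. The `violations` count is a diagnostic: a nonzero value means the chosen `r` was too small for the envelope to dominate $f$ everywhere, in which case the output is only approximately distributed according to $f$.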

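Is the sampling correct?

Theorem 1: with probability $1 - \delta$, for any $x \in [0,A]^d$,

$$\xi' \overset{\text{def}}{=} \left\{ \left| \hat f(x) - f(x) \right| \le r_N \frac{1}{A^d} = r_N\, \mathcal{U}_{[0,A]^d} \right\}.$$

Hoeffding's inequality: with probability $1 - \delta$,

$$\xi'' \overset{\text{def}}{=} \left\{ \left| \frac{A^d}{N} \sum_{i=1}^{N} f(X_i) - \int_{[0,A]^d} f(x)\,dx \right| \le 2 A^d c \sqrt{\frac{1}{N} \log(1/\delta)} \overset{\text{def}}{=} c_N \right\}.$$

On $\xi = \xi' \cap \xi''$, and for $8 r_N \le \int_{[0,A]^d} f(x)\,dx \overset{\text{def}}{=} m$, we have for our proposal

$$g^\star = \frac{\hat f + r_N\, \mathcal{U}_{[0,A]^d}}{\frac{A^d}{N} \sum_{i=1}^{N} f(X_i) + r_N} \ge \frac{f}{\int_{[0,A]^d} f(x)\,dx + r_N + c_N} \ge \frac{f}{\int_{[0,A]^d} f(x)\,dx} \left( 1 - \frac{4 r_N}{m} \right).$$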

Choice of empirical multiplication constant $M$

$$\frac{1}{1 - 4 r_N / m} = \frac{m}{m - 4 r_N} \le \frac{\frac{A^d}{N} \sum_i f(X_i) + c_N}{\frac{A^d}{N} \sum_i f(X_i) - c_N - 4 r_N} \le \frac{\frac{A^d}{N} \sum_i f(X_i) + r_N}{\frac{A^d}{N} \sum_i f(X_i) - 5 r_N} = M$$

$M g^\star$ upper-bounds $f$ (under $\xi$).

Sampling is correct with high probability.

How many accepted samples can we guarantee?

$$M = \frac{\frac{A^d}{N} \sum_i f(X_i) + r_N}{\frac{A^d}{N} \sum_i f(X_i) - 5 r_N} \le \frac{m + r_N + c_N}{m - 5 r_N - c_N} \le \frac{m + 2 r_N}{m - 6 r_N}.$$

On $\xi$, we get samples that are i.i.d. according to $f$, and $\hat n$ will be a sum of Bernoulli random variables of parameter larger than

$$\frac{1}{M} \ge \frac{m - 6 r_N}{m + 2 r_N} \ge \left( 1 - \frac{6 r_N}{m} \right) \left( 1 - \frac{4 r_N}{m} \right) \ge 1 - \frac{20 r_N}{m}.$$

With probability larger than $1 - 3\delta$, $\hat n$ is lower bounded as

$$\hat n \ge (n - N) \left( 1 - \frac{20 r_N}{m} - 4 \sqrt{\frac{\log(1/\delta)}{n}} \right).$$

Setting $N = n^{\frac{2s+d}{3s+d}}$,

$$\hat n \ge n \left[ 1 - K \log(n A^d / \delta)^{\frac{s}{3s+d}}\, n^{-\frac{s}{3s+d}} \right]. \quad (1)$$
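The budget split and the shape of bound (1) are easy to tabulate. The sketch below computes $N = n^{(2s+d)/(3s+d)}$, the fraction of the budget spent on learning, and the rate term of (1) with the constant $K$ left out, since its value is not given. The function name `prs_budget` and the defaults `A = 1.0` and `delta = 0.05` are assumptions made for the example.

```python
import math

def prs_budget(n, s, d, A=1.0, delta=0.05):
    """Estimation budget N = n^((2s+d)/(3s+d)) and the rate term of bound (1), without K."""
    N = n ** ((2 * s + d) / (3 * s + d))
    learning_fraction = N / n   # equals n^(-s / (3s + d))
    rate = math.log(n * A**d / delta) ** (s / (3 * s + d)) * n ** (-s / (3 * s + d))
    return N, learning_fraction, rate

# Same budget and smoothness, increasing dimension: the rate term grows with d.
for d in (1, 2, 5):
    N, frac, rate = prs_budget(n=10**6, s=2, d=d)
    print(d, round(N), frac, rate)
```

The learning fraction $N/n$ and the rate term decay at the same polynomial speed $n^{-s/(3s+d)}$, which is the trade-off between learning and sampling that this choice of $N$ balances.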

The main result: a bound on the acceptance rate (asymptotic performance)

Theorem 2. Under Theorem 1's assumptions, and if $H_0 < H_C$ and $8 r_N \le \int_{[0,A]^d} f(x)\,dx$, then, for $n$ large enough, we have with probability larger than $1 - \delta$ that

$$\hat n \ge n \left[ 1 - O\left( \left( \frac{\log(n A^d / \delta)}{n} \right)^{\frac{s}{3s+d}} \right) \right],$$

where $\hat n$ is the number of i.i.d. samples generated by PRS.

Convergence rate ↑ with $s$

Convergence rate ↓ with $d$

Competitor: A* sampling from the Gumbel-Max trick

The Gumbel-Max trick (well known, see Yellott 1977). Images from Chris J. Maddison.

Suppose we want to sample from a finite distribution

$$p(i) \propto \exp(\phi(i)) \quad \text{for } i \in \{1, 2, 3, 4, 5\}.$$

[Figure: bars of height $\phi(i)$ for $i = 1, \dots, 5$, each perturbed by noise $G(i) \sim \text{Gumbel}(0)$ i.i.d.; the index with the largest $\phi(i) + G(i)$ is an exact sample]
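The trick is short enough to check numerically. The sketch below perturbs each $\phi(i)$ with independent standard Gumbel noise, takes the argmax, and compares the empirical frequencies with $\exp(\phi(i)) / \sum_j \exp(\phi(j))$. The values in `phi` and the helper name `gumbel_max_sample` are illustrative assumptions; indices are 0-based in the code, whereas the slide counts from 1.

```python
import numpy as np

def gumbel_max_sample(phi, size, rng):
    """Draw `size` indices i with probability proportional to exp(phi[i])."""
    # One independent Gumbel(0, 1) perturbation per index and per draw.
    G = rng.gumbel(loc=0.0, scale=1.0, size=(size, len(phi)))
    return np.argmax(phi[None, :] + G, axis=1)

phi = np.array([0.5, 1.5, 1.0, -0.5, 0.0])   # illustrative log-weights, not from the slides
rng = np.random.default_rng(0)
draws = gumbel_max_sample(phi, size=200_000, rng=rng)

empirical = np.bincount(draws, minlength=len(phi)) / draws.size
target = np.exp(phi) / np.exp(phi).sum()
print(np.round(empirical, 3))
print(np.round(target, 3))
```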

A* sampling

Continuous Gumbel-Max trick: $f(x) \propto \exp(i(x) + o(x))$

[Figure: A* sampling illustration with $o(X)$, candidate points $(X, G)$ labeled q1 and q2, lower bound LB, upper bound UB, and the resulting exact sample]

A* sampling vs. PRS

- (−) A* needs several calls to $f$ to generate a sample.
- (+) PRS rejects (asymptotically) only a negligible number of samples with respect to $n$.

Asymptotically, the number of i.i.d. samples generated according to $f$ per computation of $f$ is better for PRS than for A* sampling.

- (−) A* needs a decomposition $f(x) \propto \exp(\phi(x))$, where $\phi(x) = i(x) + o(x)$.
- (+) PRS learns it!

Scaling with $d$? Same.

Page 68: Pliable rejection sampling - GitHub Pages

Experiments: Scaling with peakiness

$f(x) \propto \frac{e^{-x}}{(1+x)^{a}}$, where a defines the peakiness level

Figure: Acceptance rate vs. peakiness for PRS, A* sampling, and SRS. Panels: (a) n = 10^4, (b) n = 10^5. Horizontal axis: peakiness (2 to 20); vertical axis: acceptance rate (0 to 1).
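As a rough illustration of why peakiness hurts a fixed envelope, the SRS baseline can be sketched on the target above. This is only a sketch under assumptions that are not in the slides: the target is truncated to [0, upper] with a hypothetical upper = 10, the proposal is uniform on that interval, and the envelope is the constant f(0) = 1, which bounds f because f is decreasing on x ≥ 0.

```python
import math
import random


def target(x, a):
    """Unnormalized target f(x) = exp(-x) / (1 + x)**a for x >= 0."""
    return math.exp(-x) / (1.0 + x) ** a


def srs_acceptance_rate(a, n, upper=10.0, seed=0):
    """Run n proposals of simple rejection sampling with a uniform proposal.

    Assumptions (not from the slides): the target is truncated to
    [0, upper], and the envelope is the constant f(0) = 1, which bounds f
    because f is decreasing on [0, inf).
    Returns the empirical acceptance rate and the accepted samples.
    """
    rng = random.Random(seed)
    envelope = target(0.0, a)  # equals 1 for every a
    samples = []
    for _ in range(n):
        x = rng.uniform(0.0, upper)
        u = rng.uniform(0.0, envelope)
        if u <= target(x, a):  # accept x when the point falls under f
            samples.append(x)
    return len(samples) / n, samples


if __name__ == "__main__":
    for a in (2, 10, 20):
        rate, _ = srs_acceptance_rate(a, n=10_000)
        print(f"a = {a:2d}: acceptance rate = {rate:.3f}")
```

With a constant envelope, the acceptance rate is the area under f divided by the area of the bounding box, so it shrinks as a grows and the mass concentrates near 0. PRS aims to avoid this loss by learning an envelope that follows f.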


Experiments: Two-dimensional example

[Figure: plot for the two-dimensional example]

n = 10^6        acceptance rate   standard deviation
PRS             66.4%             0.45%
A* sampling     76.1%             0.80%
SRS             25.0%             0.01%

Table: 2D example: Acceptance rates averaged over 10 trials


Experiments: The Clutter problem (an inference problem)

n = 10^5, 1D    acceptance rate   standard deviation
PRS             79.5%             0.2%
A* sampling     89.4%             0.8%
SRS             17.6%             0.1%

n = 10^5, 2D    acceptance rate   standard deviation
PRS             51.0%             0.4%
A* sampling     56.1%             0.5%
SRS             2·10^-3 %         10^-5 %

Table: Clutter problem: Acceptance rates averaged over 10 trials


Discussion

Normalized distribution

If $\int_{[0,A]^d} f = 1$, then we can simplify the algorithm:

$$g^\star \stackrel{\text{def}}{=} \frac{1}{1 + r_N}\left(f + r_N\,\mathcal{U}_{[0,A]^d}\right)$$
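The simplified proposal for the normalized case is a two-component mixture, so drawing from it can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes the first component is a normalized density with its own sampler (in PRS this would presumably be the learned estimate of f), and the names `mixture_density`, `sample_mixture`, `triangle_pdf`, and `triangle_sampler` are hypothetical.

```python
import math
import random


def mixture_density(x, component_pdf, r_n, side, dim):
    """Density of g = (component + r_n * U([0, side]^dim)) / (1 + r_n) at x."""
    uniform_pdf = 1.0 / side ** dim  # density of the uniform on [0, side]^dim
    return (component_pdf(x) + r_n * uniform_pdf) / (1.0 + r_n)


def sample_mixture(component_sampler, r_n, side, dim, rng):
    """Draw one point from g.

    With probability 1 / (1 + r_n) the point comes from the normalized
    component, otherwise from the uniform distribution on [0, side]^dim.
    """
    if rng.random() < 1.0 / (1.0 + r_n):
        return component_sampler(rng)
    return [rng.uniform(0.0, side) for _ in range(dim)]


# Hypothetical stand-in component on [0, 1]: the triangular density 2x.
def triangle_pdf(x):
    return 2.0 * x[0] if 0.0 <= x[0] <= 1.0 else 0.0


def triangle_sampler(rng):
    return [math.sqrt(rng.random())]  # inverse-CDF sampling for density 2x


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_mixture(triangle_sampler, 0.25, 1.0, 1, rng)[0]
             for _ in range(10_000)]
    print("empirical mean:", sum(draws) / len(draws))
```

Because the component integrates to 1 and the uniform term contributes r_N, dividing by 1 + r_N is all that is needed to normalize the mixture; no separate estimate of the normalizing constant is required.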

Case of a distribution with unbounded support

Instead of sampling uniformly on $[0,A]^d$, we sample on a hypercube centered at 0 with side length $\sqrt{\log(n)}$.
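The growing sampling region for the unbounded-support case can be sketched as below. The function names are hypothetical, and the sketch assumes that log denotes the natural logarithm and that n > 1.

```python
import math
import random


def hypercube_side(n):
    """Side length sqrt(log(n)) of the sampling hypercube (natural log assumed)."""
    if n <= 1:
        raise ValueError("n must be greater than 1 so that log(n) > 0")
    return math.sqrt(math.log(n))


def sample_centered_hypercube(n, dim, rng):
    """Draw one point uniformly from the hypercube centered at 0."""
    half = hypercube_side(n) / 2.0
    return [rng.uniform(-half, half) for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    print("side for n = 10**5:", hypercube_side(10**5))
    print("one 2D draw:", sample_centered_hypercube(10**5, 2, rng))
```

The side length grows very slowly with n, so the region widens enough to eventually cover the bulk of the mass while staying small enough for uniform sampling to remain practical.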

Extensions for high dimensional cases (large d)

– when the mass of the distribution is localized in a few small subsets


Conclusion

+ PRS deals with a wide class of functions
+ PRS has guarantees: asymptotically we accept everything (whp)
+ PRS is a perfect sampler
  + (whp) the samples are iid (unlike MCMC)
+ PRS’s empirical performance is comparable to state of the art
+ We have an extension to densities with unbounded support
− PRS works only for small and moderate dimensions
  + in favorable cases, it can scale to high dimensions as well
− It does not work well for peaky distributions (posteriors)

Extension 1: Iterative PRS by re-estimating f several times
Extension 2: Using the fact that the evaluations are noiseless
Improved rate and lower bound: Follow-up work by Juliette Achdou and Alexandra Carpentier


Thank you!

SequeL – Inria Lille

GdR ISIS