Hybridization of dynamic optimization methodologies

Jérémie Decock (Advisor: Olivier Teytaud)
I-Lab: LRI - Inria & Artelys
Ph.D. Defense, November 28, 2014


Page 1:

Introduction State of the art Contributions Conclusion References

Hybridization of dynamic optimization methodologies

Jérémie Decock
Advisor: Olivier Teytaud

I-Lab: LRI - Inria & Artelys

Ph.D. Defense

November 28, 2014

Decock I-Lab: LRI - Inria & Artelys

Hybridization of dynamic optimization methodologies

Page 2:

Subject

Comparison and hybridization of Sequential Decision Making (SDM) algorithms:

- Long-term reward
- Interdependence among decisions
- Exponential number of solutions

Application field

Decision aids for power systems management

Page 3:

Outline

Introduction

State of the art
- State of the art in Sequential Decision Making
- State of the art in Power Systems

Contributions
- Direct Value Search
- Noisy Optimization: Lower Bounds on Runtimes
- Noisy Optimization: Variance Reduction

Conclusion
- Conclusions and perspectives

Page 4:

Introduction

Introduction

Page 5:

Introduction

Energy policies

Optimizing energy management: a crucial problem in many respects:

- economic
- political and geopolitical
- ecological and health-related
- societal

Page 6:

Introduction

Energy policies

New technical constraints:

- intermittency: solar power plants and wind turbines
- distribution network security: more and more interconnections increase blackout risks

To sum up

- As there are more and more constraints, decisions are more and more complex;
- The environmental, economic and political consequences of inappropriate choices are more and more severe.

Page 7:

Introduction

Context and Motivations

Need for better decision aid tools

Prospective models are very valuable decision-support tools. They are commonly used to forecast and optimize short- and mid-term electricity production and long-term investment strategies.

Our goals

- create new decision aid tools accounting for the latest political, economic, environmental, sociological and technical constraints
- help electricity managers find better exploitation and investment strategies reflecting realistic large-scale models.

Page 8:

Introduction

Unit Commitment Problems

[Figure: a hydro valley with four water stocks (Stock1-Stock4) linked by water flows, rain inflow, classical thermal plants, and an electricity demand to serve.]

- A multi-stage problem
- Energy demand (forecast)
- Energy production:
  - hydroelectricity (N water stocks)
  - thermal plants
- Water flow through stock links

Goal

Decide, for each power plant, its power output at time t so as to satisfy the demand with the lowest possible cost.

Page 9:

Introduction

Challenges

Issues

- Limited forecasts (e.g. demand and weather)
- Renewable energies increase production variability
- Transportation introduces constraints

Specifics

- Can't assume a Markovian process
  - e.g. weather (influences production and demand)
  - increases the state space dimension
- Avoid simplified models to avoid model errors

Page 10:

Introduction

Optimization and Model Errors

- Optimization errors
- Model errors:
  - simplification errors
  - anticipativity errors
  - statistical errors

Page 11:

Introduction

Optimization and Model Errors

- NEWAVE versus ODIN: comparison of stochastic and deterministic models for the long term hydropower scheduling of the interconnected Brazilian system. M. Zambelli et al., 2011.
- Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. D. Bertsekas, 2005. (MPC = deterministic forecasts)
- Renewable energy forecasts ought to be probabilistic! P. Pinson, 2013 (WIPFOR talk)

Page 12:

Introduction

Example: POST (ADEME)

Large-scale investment studies (e.g. Europe + North Africa)

- Long term (2030-2050)
- Huge (non-stochastic) uncertainties:
  - future technologies
  - future laws
  - future demand
- Investment problem:
  - interconnections
  - storage
  - smart grids
  - power plants

Page 13:

Introduction

Stochastic Control

[Figure: the stochastic control loop — the agent (algorithm) sends actions (or decisions) to the system (or environment); the system, driven by a stochastic process, returns the next state and a reward.]

Page 14:

Introduction

Modelling Sequential Decision Making Problems

Notations

- s_t = state at time t
- a_t = action (decision) at time t
- s' = s_{t+1}
- T(s, a, s') = probability of the transition s → s' when taking action a
- r(s, a) = reward obtained when taking action a in s
- π(s) = action to execute in s (policy)

Page 15:

State of the art in Sequential Decision Making

State of the art in Sequential Decision Making

Page 16:

State of the art in Sequential Decision Making

Bellman Methods [Bellman57]

Bellman Values

The expected value V^π of each state s, when the agent follows a given (stationary) policy π, is

V^π(s) = E[ Σ_{t=0}^{∞} γ^t r(s_t) | π, s_0 = s ]
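As a concrete illustration (a toy tabular MDP of my own, not from the defense), this expectation is the fixed point of V(s) ← r(s) + γ Σ_{s'} T(s, π(s), s') V(s'), so it can be computed by simple iteration:

```python
import numpy as np

# Toy 3-state, 2-action MDP (hypothetical, for illustration only).
T = np.zeros((3, 2, 3))                # T[s, a, s'] = transition probability
T[0, 0] = [0.9, 0.1, 0.0]; T[0, 1] = [0.0, 0.5, 0.5]
T[1, 0] = [0.0, 1.0, 0.0]; T[1, 1] = [0.3, 0.0, 0.7]
T[2, 0] = [0.0, 0.0, 1.0]; T[2, 1] = [1.0, 0.0, 0.0]
r = np.array([0.0, 1.0, 10.0])         # reward r(s_t)
pi = np.array([1, 1, 0])               # fixed stationary policy pi(s)
gamma = 0.9

# Iterate V(s) <- r(s) + gamma * sum_{s'} T(s, pi(s), s') V(s');
# the discounted return E[sum_t gamma^t r(s_t) | pi, s_0 = s] is its fixed point.
T_pi = T[np.arange(3), pi]             # transition matrix under pi
V = np.zeros(3)
for _ in range(1000):
    V = r + gamma * T_pi @ V
```

After convergence, `V[s]` approximates V^π(s) for this toy model.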

Page 17:

State of the art in Sequential Decision Making

Bellman Methods

Bellman Values

The Bellman equation gives the best value we can expect for any given state (assuming the optimal policy π* is followed):

V^{π*}(s) = r(s) + γ max_{a∈A} [ Σ_{s'∈S} T(s, a, s') V^{π*}(s') ]

The optimal (stationary) policy π* is defined from the best expected value V^{π*}, using the principle of Maximum Expected Utility, as follows:

π*(s) = arg max_{a∈A} [ Σ_{s'∈S} T(s, a, s') V^{π*}(s') ]

Page 18:

State of the art in Sequential Decision Making

Bellman Methods

Computing Bellman Values

V^{π*}(s) = r(s) + γ max_{a∈A} [ Σ_{s'∈S} T(s, a, s') V^{π*}(s') ]

There are |S| Bellman equations, one per state. This system of equations cannot be solved analytically because the Bellman equations contain non-linear terms (due to the "max" operator).

Page 19:

State of the art in Sequential Decision Making

Bellman Methods [Bellman57]

Backward Induction: the non-stationary case

The value V* of each state s at the latest time step T is

V*_T(s) = 0 (or some valorization)

The best expected value V* at time t < T is computed backwards:

V*_t(s) = max_{a∈A} [ r(s, a) + Σ_{s'∈S} T(s, a, s') V*_{t+1}(s') ]

Optimal action (or decision) π*_t(s):

π*_t(s) = arg max_{a∈A} [ r(s, a) + Σ_{s'∈S} T(s, a, s') V*_{t+1}(s') ]
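A minimal sketch of backward induction on random tabular data (my own toy instance; undiscounted, matching the formulas above, with a null final valorization):

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA, H = 4, 3, 10                   # hypothetical sizes; horizon T = H
T = rng.random((nS, nA, nS))
T /= T.sum(axis=2, keepdims=True)      # T[s, a, s'] = transition probability
r = rng.random((nS, nA))               # r(s, a), non-negative here

V = np.zeros((H + 1, nS))              # V*_T(s) = 0 (no final valorization)
pi = np.zeros((H, nS), dtype=int)
for t in range(H - 1, -1, -1):         # computed backwards
    Q = r + T @ V[t + 1]               # Q[s, a] = r(s, a) + sum_s' T V*_{t+1}(s')
    V[t] = Q.max(axis=1)               # V*_t(s)
    pi[t] = Q.argmax(axis=1)           # pi*_t(s)
```

`pi[t, s]` is then the optimal decision at stage t in state s for this toy instance.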

Page 20:

State of the art in Sequential Decision Making

Bellman Methods [Bellman57]

Value Iteration: the stationary case

The Bellman update is used in Value Iteration to update V at each iteration:

V_{i+1}(s) ← r(s) + γ max_{a∈A} [ Σ_{s'∈S} T(s, a, s') V_i(s') ]
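The stationary case can be sketched the same way (again a toy random model of mine); the γ-contraction of the Bellman update guarantees convergence to a fixed point:

```python
import numpy as np

rng = np.random.default_rng(1)
nS, nA, gamma = 5, 2, 0.9
T = rng.random((nS, nA, nS))
T /= T.sum(axis=2, keepdims=True)      # T[s, a, s'] = transition probability
r = rng.random(nS)                     # state-only reward r(s), as on the slide

V = np.zeros(nS)
for i in range(100_000):
    V_next = r + gamma * (T @ V).max(axis=1)   # the Bellman update
    if np.abs(V_next - V).max() < 1e-12:       # contraction => convergence
        break
    V = V_next
```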

Page 21:

State of the art in Sequential Decision Making

Direct Policy Search

Goal

Find "good" parameters for a given parametric policy π_θ : states → actions.

Requires a parametric controller (e.g. a neural network).

Principle

Optimize the parameters on simulations (noisy black-box optimization).
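A minimal Direct Policy Search sketch, on an entirely hypothetical toy task (not the one from the thesis): a (1+1) evolution strategy tunes the parameters θ of a small tanh controller directly on noisy rollouts:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(theta, episodes=20):
    """Noisy black-box fitness: mean return of pi_theta on a toy
    1-d stabilization task (reward = -s^2 at each step)."""
    total = 0.0
    for _ in range(episodes):
        s = rng.normal()
        for _ in range(10):
            a = np.tanh(theta[0] * s + theta[1])   # parametric controller
            s = 0.9 * s + a + 0.1 * rng.normal()   # noisy dynamics
            total -= s ** 2
    return total / episodes

# (1+1)-ES on the noisy fitness; step size adapted by a 1/5th-style rule.
theta, sigma = np.zeros(2), 1.0
best = simulate(theta)
for _ in range(200):
    cand = theta + sigma * rng.normal(size=2)
    score = simulate(cand)
    if score > best:
        theta, best, sigma = cand, score, sigma * 1.5
    else:
        sigma *= 0.9
```

The self-adaptive ES actually used in the thesis mutates its own step size; the 1/5th-style rule above is a simpler stand-in.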

Page 22:

State of the art in Sequential Decision Making

Parametric policies π_θ

Neural networks: θ = (w^(1)_{10}, ..., w^(1)_{mn}, w^(2)_{10}, ..., w^(2)_{km})^T

[Figure: a feed-forward neural network with an input layer (x_0 = 1, x_1, ..., x_n), a hidden layer with tanh activations and weights w^(1), and an output layer (z_1, z_2) with weights w^(2).]
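The network in the figure can be sketched as follows (the sizes and the linear output layer are my assumptions from the figure; biases come from the constant input x_0 = 1):

```python
import numpy as np

def pi_theta(x, W1, W2):
    """Two-layer perceptron: tanh hidden layer, linear output;
    the bias weights w_{i0} multiply the constant input x_0 = 1."""
    h = np.tanh(W1 @ np.concatenate(([1.0], x)))   # hidden layer, weights w^(1)
    return W2 @ np.concatenate(([1.0], h))         # outputs z, weights w^(2)

rng = np.random.default_rng(3)
n, m, k = 4, 3, 2                    # input / hidden / output sizes (arbitrary)
W1 = rng.normal(size=(m, n + 1))     # theta = (W1, W2) flattened into one vector
W2 = rng.normal(size=(k, m + 1))
z = pi_theta(rng.normal(size=n), W1, W2)
```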

Page 23:

State of the art in Power Systems

State of the art in Power Systems

Page 24:

State of the art in Power Systems

Stochastic Dual Dynamic Programming (SDDP) [Pereira91]

Decompose the problem into instant reward + future reward:

max_a [ r(s, a) + V(s) ]

where r(s, a) is the instant reward and V(s) the Bellman value.

Approximate V(.) with Benders cuts.

Issues

Requires convexity of V(.), a small state space and a Markovian process.

[Figure: the value function V(s) approximated as the minimum of a set of Benders cuts (V = min of cuts).]

Page 25:

State of the art in Power Systems

Model-Predictive Control (MPC) [Bertsekas05]

[Figure: rolling-horizon planning — the global problem over the strategic horizon is split into overlapping subproblems (problem 1 to problem 4) over a tactical horizon; each subproblem is solved, the first part of each solution (the operational horizon) is kept, and the kept parts are concatenated into the global solution.]

- Rolling planning (closed-loop)
- Replace the stochastic parts by a pessimistic deterministic forecast

⇒ simple, but indeed better than SDDP (Zambelli '11): fewer assumptions

Page 26:

State of the art in Power Systems

Summary

S(D)DP
  Pros: large constrained action space A; polynomial-time decision making; asymptotically finds the optimum
  Cons: not anytime; convex problems only (SDDP); small state space S; Markovian random process

DPS
  Pros: anytime; large state space S; works with non-linear functions; no constraint on the random process
  Cons: slow on large A; hardly handles decision constraints

Contribution

Merge both approaches!

Page 27:

Direct Value Search

Direct Value Search

Page 28:

Direct Value Search

Direct Value Search

Coupling Bellman Methods with Direct Policy Search

- Optimization of energy policies with Direct Value Search (DVS):
  - Linear Programming (polynomial-time decision making)
  - Direct Policy Search (on real simulations)

Page 29:

Direct Value Search

Direct Value Search

Combine instantaneous reward and valorization:

Π(s_t) = arg max_{a_t} [ r(s_t, a_t) + V(s_{t+1}) ]

V(s_{t+1}) = α_t · s_{t+1}   (linear)
α_t = π_θ(s_t)   (non-linear)

- Given θ, decision making is solved as an LP
- A non-linear mapping chooses the parameters of the LP from the current state

Requires the optimization of θ (a noisy black-box optimization problem).
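For one battery, the one-stage decision above is a tiny LP whose linear objective puts the optimum at a bound of the feasible interval; a hedged sketch (the mapping π_θ below is a made-up placeholder, and the capacity/rate numbers are borrowed from the proof-of-concept slide):

```python
import numpy as np

CAP, RATE = 96_000.0, 1_000.0          # battery capacity and max charge/discharge

def decide(s, price, theta):
    """One-stage DVS decision for one battery (illustrative sketch only).
    V(s') = alpha * s' is the linear valorization, alpha = pi_theta(s)."""
    alpha = np.tanh(theta[0] * s + theta[1])   # placeholder non-linear mapping
    # LP in the decision a (energy bought; a < 0 means sold):
    #   maximize  -price * a + alpha * (s + a)
    #   s.t.      -min(RATE, s) <= a <= min(RATE, CAP - s)
    lo, hi = -min(RATE, s), min(RATE, CAP - s)
    return hi if alpha > price else lo         # linear objective: optimum at a bound

a = decide(s=40_000.0, price=0.5, theta=np.array([1e-4, 0.0]))
```

In the actual method the LP couples many stocks and stages and is handed to an LP solver; the closed-form bound above only works because this toy has a single variable.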

Page 30:

Direct Value Search

Direct Value Search

Recourse planning

[Figure: the same rolling-horizon planning scheme as for MPC — subproblems solved over the tactical horizon, the kept parts concatenated into the global solution over the strategic horizon.]

We assume we know ι_{t...k}, the random realizations from the current stage to the tactical horizon.

Page 31:

Direct Value Search

Direct Value Search

Step 1 (offline): compute π_θ(.)

Build the parametric policy π_θ

Require: a parametric policy π_θ(.), where π_θ is a mapping from S to A; a stochastic decision process SDP; an initial state s
Ensure: a parameter θ* leading to a policy π_θ*(.)

Find a parameter θ* maximizing the expectation of the fitness function θ ↦ Simulate(s, SDP, π_θ) with a given non-linear noisy optimization algorithm (e.g. SA-ES, CMA-ES)

return θ*

Page 32:

Direct Value Search

Direct Value Search

Step 1 (offline): compute π_θ(.)

Simulate(s_0, SDP, π_θ):

r ← 0
for t ← t_0, t_0 + h, t_0 + 2h, ..., T do
    ι_{t...k} ← get_forecast(.)
    α ← π_θ(s⁺_t)
    a_{t...k} ← arg max_a r(a_{t...k}, ι_{t...k}, s_t) + α^T · s_{t+k-1}
    r ← r + SDP_reward(s_t, a_{t...h}, ι_{t...h})        ← not necessarily linear
    s_{t+h} ← SDP_transition(s_t, a_{t...h}, ι_{t...h})  ← not necessarily linear
end for
return r

Page 33:

Direct Value Search

Direct Value Search

Step 2 (online): use π_θ(.) to solve the actual problem

The offline optimization of π_θ can be stopped at any time.

Then

- we have π_θ, an approximation of the states' marginal values
- we can use it to solve the actual problem, in the same manner as in Simulate

Page 34:

Direct Value Search

DVS: A proof of concept

[Figure: a battery with a 96,000 kW (96 h) capacity, charging and discharging at most 1,000 kW, buying and selling on the energy market; output efficiency 80% (1 kW in → 0.8 kW out).]

Goal: find a policy which maximizes gains

Buy (and store) when the market price is low, sell when the market price is high.

- The market price is stochastic.
- 10 constrained batteries.

Page 35:

Direct Value Search

DVS: A proof of concept

- State vector s⁺ = stock levels of the 10 batteries + handcrafted inputs
- 10 decision variables at each time step: the quantity of energy to buy/sell for each battery

Page 36:

Direct Value Search

Baselines

Recourse planning without final valorization

Bellman values → null (α = 0)

Π(s, t) = arg max_{a_t} max_{a_{t+1},...,a_{t+k-1}} E[ r_t + ... + r_{t+k-1} ]

Recourse planning with constant marginal valorization

Bellman values → linear function of total stock (α = constant)

Π(s, t) = arg max_{a_t} max_{a_{t+1},...,a_{t+k-1}} E[ r_t + ... + r_{t+k-1} + α · s_{t+k-1} ]

with an optimized constant marginal valorization α.

Page 37:

Direct Value Search

DVS: experimental setting

- Parametric policy π_θ = a neural network with N neurons in a single hidden layer and weight vector θ;
- θ is optimized by maximizing θ ↦ Simulate(θ) with a Self-Adaptive Evolution Strategy (SA-ES).

Page 38:

Direct Value Search

Results

[Figure: "Evolution of the performance during the DPS policy build" — mean reward (M) vs. number of evaluations (0 to 300,000) for four policies: best constant marginal cost, null valorization at tactical horizon, DVS (1 neuron), DVS (2 neurons).]

Page 39:

Noisy Optimization: Lower Bounds on Runtimes

Noisy Optimization: Lower Bounds on Runtimes

Page 40:

Noisy Optimization: Lower Bounds on Runtimes

Introduction

The aim of this 2nd contribution

- Study convergence rates for numerical optimization with a noisy objective function
- Reduce the gap between upper and lower bounds:
  - for a given family of noisy objective functions
  - under some assumptions

Page 41:

Noisy Optimization: Lower Bounds on Runtimes

Considered family F of objective functions

f_{x*,β,γ} : [0,1]^d → {0,1}
x ↦ B(p),   where p = γ (‖x − x*‖ / √d)^β + (1 − γ)

- f_{x*,β,γ} ∈ F: a stochastic objective function (a random variable)
- B(p): the Bernoulli distribution with parameter p
- d ∈ N*: the dimension of the domain of f_{x*,β,γ}
- x* ∈ [0,1]^d: the optimum
- β ∈ R*₊: the "flatness" of the expectation Ef around x*
- γ ∈ [0,1]: a noise parameter (variance at x*)
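Sampling this family is straightforward; a sketch with arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x, x_star, beta, gamma):
    """One Bernoulli sample of f_{x*,beta,gamma}(x), with
    p = gamma * (||x - x*|| / sqrt(d))**beta + (1 - gamma)."""
    d = x.size
    p = gamma * (np.linalg.norm(x - x_star) / np.sqrt(d)) ** beta + (1 - gamma)
    return int(rng.random() < p)

d, beta, gamma = 3, 2.0, 0.5
x_star = rng.random(d)
samples = [f(rng.random(d), x_star, beta, gamma) for _ in range(100)]
```

Note that even at x = x* the success probability is p = 1 − γ, so the observations remain noisy at the optimum.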

Page 42:

Noisy Optimization: Lower Bounds on Runtimes

Considered objective functions

[Figure: the expectation Ef of the considered objective functions for β = 1, 2 and 3.]

Page 43:

Noisy Optimization: Lower Bounds on Runtimes

Considered objective functions

- we assume that the optimum is unique
- we consider noisy optimization in the case of local convergence

Lower bound → concerns all families including F

Page 44:

Noisy Optimization: Lower Bounds on Runtimes

The experimental setting for our noisy optimization framework

Require: ω, ω′, x*, β, γ (unknown).

for n = 1, 2, 3, ... do
    x_{x*,n,ω,ω′} = Optimize(x_{x*,1,ω,ω′}, ..., x_{x*,n−1,ω,ω′}, y_1, ..., y_{n−1}, ω′)
    if ω_n ≤ E f_{x*,β,γ}(x_{x*,n,ω,ω′}) then
        y_n = 1
    else
        y_n = 0
    end if
end for
return x_{x*,n,ω,ω′}

The framework requires:

- ω: a uniform random variable for simulating the Bernoulli in f
- ω′: a random seed for the algorithm (the optimizer's internal randomness)
- x*: the optimum (which minimizes E_ω f_{x*,β,γ}(x, ω))
- β and γ: two fixed parameters of f
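The loop above can be instantiated in Python with a deliberately naive Optimize (pure random search, only to show the interface; any history-based optimizer plugs in the same way):

```python
import numpy as np

def run(optimize, x_star, beta, gamma, n_iters=200, seed=0):
    """The iterative setting above: at step n the optimizer proposes x_n from
    the history (x_1..x_{n-1}, y_1..y_{n-1}); omega_n <= E f(x_n) yields y_n."""
    rng = np.random.default_rng(seed)   # plays the role of both omega and omega'
    d = x_star.size
    xs, ys = [], []
    for _ in range(n_iters):
        x = optimize(xs, ys, rng)
        p = gamma * (np.linalg.norm(x - x_star) / np.sqrt(d)) ** beta + (1 - gamma)
        xs.append(x)
        ys.append(1 if rng.random() <= p else 0)   # binary noisy fitness
    return xs[-1]                                   # the final recommendation

def optimize(xs, ys, rng):
    """Naive placeholder: ignores the history, samples uniformly in [0,1]^2."""
    return rng.random(2)

x_last = run(optimize, x_star=np.array([0.3, 0.7]), beta=2.0, gamma=0.5)
```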

Page 45:

Noisy Optimization: Lower Bounds on Runtimes

The experimental setting for our noisy optimization framework

(Same algorithm as on page 44.)

It is an iterative process. At each iteration:

- Optimize (the optimization algorithm) returns the next point to visit
  - looking for x*
  - according to some inputs
- this point is evaluated by the fitness function (the if-then-else statement)

Page 46:

Noisy Optimization: Lower Bounds on Runtimes

The experimental setting for our noisy optimization framework

(Same algorithm as on page 44.)

Optimize makes its recommendation according to:

- the sequence of previously visited points x_n
- their binary noisy fitness values y_n
- the optimizer's internal randomness ω′

Page 47:

Noisy Optimization: Lower Bounds on Runtimes

The experimental setting for our noisy optimization framework

(Same algorithm as on page 44.)

The fitness function f_{x*,β,γ} outputs:

- 1 if a uniform random draw (= ω_n) is less than E f_{x*,β,γ}(x_{x*,n,ω,ω′})
- 0 otherwise

Page 48:

Noisy Optimization: Lower Bounds on Runtimes

The experimental setting (the same algorithm as above).

Eventually, the estimate of the optimum is returned.
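This evaluation loop can be sketched in Python. It is a minimal sketch, not the experimental code from the thesis: `make_noisy_fitness` implements the Bernoulli objective f_{x∗,β,γ} defined in the appendix, and the trivial `optimize` callback in the usage example is a hypothetical stand-in for a real optimizer.

```python
import math
import random

def make_noisy_fitness(x_star, beta, gamma, d):
    """Bernoulli oracle: outputs 1 with probability
    gamma * (||x - x*|| / sqrt(d))**beta + (1 - gamma)."""
    def fitness(x, omega_n):
        dist = math.sqrt(sum((xi - si) ** 2 for xi, si in zip(x, x_star)))
        p = gamma * (dist / math.sqrt(d)) ** beta + (1 - gamma)
        return 1 if omega_n <= p else 0
    return fitness

def run_setting(optimize, x_star, beta, gamma, d, n_iter, rng=None):
    """The experimental loop: query points, observe binary noisy values,
    return the last visited point as the estimate of the optimum."""
    rng = rng or random.Random(0)
    xs, ys = [], []
    fitness = make_noisy_fitness(x_star, beta, gamma, d)
    for n in range(n_iter):
        x_n = optimize(xs, ys)            # optimizer's next query point
        y_n = fitness(x_n, rng.random())  # omega_n is uniform on [0, 1)
        xs.append(x_n)
        ys.append(y_n)
    return xs[-1]
```

For instance, `run_setting(lambda xs, ys: (0.0,), (0.0,), 2, 1.0, 1, 10)` queries the optimum at every step and always observes y_n = 0, since E f(x∗) = 1 − γ = 0 there.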


Sampling strategies

Sampling close to the current estimate of the optimum:
- most evolution strategies do this

Sampling far from the current estimate of the optimum:
- relevant when f is learnable
- the optimizer can use a model of f to sample far from x∗
- Optimize's outputs can then have different meanings:
  - the most informative points to sample (exploration)
  - an estimate of arg min E f (recommendation)

These strategies lead to different convergence rates.


Our assumptions

We assume that the optimization algorithm
- does not require a model of the objective function
- samples close to the optimum (the first strategy)

The locality assumption

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

- for some 0 < δ < 1/2
- C(d) > 0: a constant depending only on d
- α > 0: the convergence speed (a large α implies fast convergence)


Noisy optimization complexity

What is the best theoretical convergence rate of an optimization algorithm on f ∈ F under this locality assumption?

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

Question: what is the supremum of possible α?


State of the art: upper and lower bounds [Roley2010, Shamir, Chen88]

Consider α such that

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

                            γ = 1 (small noise)    γ < 1 (large noise)
Proved rate for R-EDA       1/β ≤ α                1/(2β) ≤ α
Former lower bounds         α ≤ 1                  α ≤ 1/2
R-EDA experimental rates    α = 1/β                α = 1
Rate by active learning     α = 1/2                α = 1/2


What we prove

Consider α such that

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

                             γ = 1 (small noise)    γ < 1 (large noise)
Proved rate for R-EDA        1/β ≤ α                1/(2β) ≤ α
Former lower bounds          α ≤ 1                  α ≤ 1/2
This proof (lower bounds)    α ≤ 1/β                α ≤ 1/β (recently: 1/4 for β = 2)


Key ideas

Key point

If ω_i < E f(x∗) or ω_i > E f(x_i), the function evaluation is the same for all x∗ ⇒ let us show that this happens for most i.

Definitions
- N = the number of ω_i between E f(x∗) and E f(x_i), for i ≤ n.
- X_{n,Ω} = the set of points which can be chosen as the n-th sampled point if the random sequence is Ω = (ω_1, . . . , ω_n, . . . ).

Card X_{n,Ω} ≤ 2^N. We will show that locality ⇒ N small. (Intuition: N, which depends on n, is the number of indices i which bring information; the locality assumption implies that N is small, i.e. we receive little information.)


Lemma
N is probably upper bounded by O(Σ_{i=1}^n 1/i^{αβ}).
(i.e. for most evaluations, the result does not depend on x∗!)

Proof
The probability that ω_i ∈ (E f(x∗), E f(x_i)) is O(1/i^{αβ}) by the locality assumption;
⇒ hence the result in expectation, by summation;
⇒ hence the result with large probability, by Chebyshev.
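The summation step can be written out explicitly, using the bound on E f(x_i) − E f(x∗) derived from the objective f_{x∗,β,γ} in the appendix (a restatement of the slide's argument, not an additional result):

```latex
\mathbb{E}[N] \;=\; \sum_{i=1}^{n} \Pr\bigl(\omega_i \in (\mathbb{E}f(x^*),\, \mathbb{E}f(x_i))\bigr)
\;\le\; \sum_{i=1}^{n} \frac{\gamma\, C(d)^{\beta}}{d^{\beta/2}\, i^{\alpha\beta}}
\;=\; O\Bigl(\sum_{i=1}^{n} 1/i^{\alpha\beta}\Bigr).
```

If αβ > 1, this sum is bounded independently of n, so Chebyshev's inequality yields N = O(1) with large probability.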


Theorem

Theorem
Assume the objective function f ∈ F and the locality assumption

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

Then α ≤ 1/β.


Proof of the theorem (1)

Let us show that αβ ≤ 1.

To do so, assume, for the sake of contradiction, that αβ > 1.


Proof of the theorem (2)

Definition
Consider R, a set of points with pairwise lower-bounded distances:
- two distinct elements of R are at distance greater than 2ε from each other, with ε = C(d)/n^α


Proof of the theorem (3)

If x∗ is uniformly drawn in R . . .
- Optimize should have the opportunity to choose points at distance ≤ ε of (almost) each possible x∗

That is to say:
- X_{n,Ω} has points at distance ≤ ε of a fraction (1 − δ/2) of the elements of R


Proof of the theorem (4)

But...
- if ε decreases (the x_i approach x∗), then |R| increases
- and X_{n,Ω} is finite (with great probability, by the Lemma)

Beyond a certain timestep, the locality assumption can no longer hold.

This contradicts our assumption: αβ > 1 is wrong. QED


Noisy Optimization: Variance Reduction


Variance Reduction

Aim
Improve convergence rates in noisy optimization problems (e.g. DVS) using traditional variance reduction techniques.

Framework
Grey box: we may manipulate random seeds.


Common Random Numbers (CRN)

Relevant for population-based noisy optimization:
- we want to know E f(x, ω) for several x,
- we approximate E f(x, ω) ≈ (1/n) Σ_{i=1}^n f(x, ω_i)

⇒ CRN = use the same samples ω_1, . . . , ω_n for every x within a same optimization iteration
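A minimal sketch of CRN in Python (function names are illustrative, not taken from the thesis code): with additive noise, reusing the same ω samples makes the comparison between candidates effectively noise-free.

```python
import random

def estimate_mean(f, x, omegas):
    """Monte Carlo estimate of E_omega f(x, omega) on a fixed sample of omegas."""
    return sum(f(x, w) for w in omegas) / len(omegas)

def rank_with_crn(f, candidates, n, rng=None):
    """Common Random Numbers: draw omega_1..omega_n once and reuse the SAME
    samples for every candidate x, so all the estimates share the noise."""
    rng = rng or random.Random(0)
    omegas = [rng.random() for _ in range(n)]
    return sorted(candidates, key=lambda x: estimate_mean(f, x, omegas))
```

With f(x, ω) = (x − 0.3)² + ω, the shared noise term cancels in comparisons, so the CRN ranking matches the noise-free ranking for any seed and any sample size.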


Stratified Sampling

Each ω_i is randomly drawn conditionally to a stratum:
- the domain of ω is partitioned into disjoint strata S_1, . . . , S_N;
- the stratification function i ↦ s(i) is chosen by the noisy optimization algorithm;
- ω_i is randomly drawn conditionally to ω_i ∈ S_{s(i)}.

The estimator is

E f(x, ω) ≈ Σ_{i ∈ {1,...,n}} P(ω ∈ S_{s(i)}) · f(x, ω_i) / Card{ j ∈ {1, . . . , n} ; ω_j ∈ S_{s(i)} }
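For ω uniform on [0, 1) and equal-probability strata with one draw per stratum, each weight P(ω ∈ S_k) is 1/N and each stratum count is 1, so the estimator above reduces to the following sketch (the function name is illustrative):

```python
import random

def stratified_estimate(f, x, n_strata, rng=None):
    """Stratified Monte Carlo for E f(x, omega), omega ~ Uniform[0, 1).
    Strata are the equal sub-intervals S_k = [k/N, (k+1)/N); omega_k is
    drawn conditionally to S_k and weighted by P(omega in S_k) = 1/N."""
    rng = rng or random.Random(0)
    total = 0.0
    for k in range(n_strata):
        omega_k = (k + rng.random()) / n_strata  # uniform draw inside S_k
        total += f(x, omega_k) / n_strata
    return total
```

For f(x, ω) = ω the estimate deviates from the true mean 1/2 by at most 1/(2N) for any seed, whereas plain Monte Carlo only gives an O(1/√N) error.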


Experiments

Known:
- CRN can be detrimental
- stratification is rarely (never?) detrimental

Here: an experimental test.


Experiments

Figure: evolution of the mean reward during the DPS policy build (x-axis: evaluations, from 0 to 80 000; y-axis: mean reward, from 0.0 to 3.0). Legend: pairing and stratification; pairing; reference; stratification (noise factor).


Conclusions and perspectives


Conclusions

- DVS: combining DPS and SDP for
  - fast decision making (polynomial)
  - anytime optimization
  - initialization at MPC (which is great)
  - no need for a tree representation
- CRN:
  - no proof of good properties
  - but empirically really good


Future work

- Mathematical analysis of DVS
- More intensive testing of DVS
- Optimization at the level of investments
- Non-stochastic uncertainties


Bibliography

- An Introduction to Direct Value Search. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), Liège, Belgium, May 2014.
- Direct model predictive control. Jean-Joseph Christophe, Jérémie Decock, and Olivier Teytaud. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, April 2014.
- Linear Convergence of Evolution Strategies with Derandomized Sampling Beyond Quasi-Convex Functions. Jérémie Decock and Olivier Teytaud. In EA 2013, 11th Biennial International Conference on Artificial Evolution, Lecture Notes in Computer Science, Bordeaux, France, August 2013. Springer.
- Noisy Optimization Complexity Under Locality Assumption. Jérémie Decock and Olivier Teytaud. In FOGA 2013, Foundations of Genetic Algorithms XII, Adelaide, Australia, February 2013.


Bibliography

- NEWAVE versus ODIN: comparison of stochastic and deterministic models for the long term hydropower scheduling of the interconnected Brazilian system. M. Zambelli et al., 2011.
- Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. D. Bertsekas, 2005. (MPC = deterministic forecasts)
- Åström, 1965.
- Renewable energy forecasts ought to be probabilistic! P. Pinson, 2013 (WIPFOR talk).
- Training a neural network with a financial criterion rather than a prediction criterion. Y. Bengio, 1997 (a quite practical application of direct policy search, with convincing experiments).


Thank you for your attention

Questions ?


References I

S. Astete-Morales, J. Liu, and O. Teytaud, Log-log convergence for noisy optimization, Proceedings of EA 2013, LNCS, Springer, 2013, accepted.

Alexander Shapiro, Analysis of stochastic dual dynamic programming method, European Journal of Operational Research 209 (2011), no. 1, 63–72.

R. Bellman, Dynamic programming, Princeton Univ. Press, 1957.

Yoshua Bengio, Using a financial training criterion rather than a prediction criterion, CIRANO Working Papers 98s-21, CIRANO, 1998.


References II

Dimitri P. Bertsekas, Dynamic programming and suboptimal control: A survey from ADP to MPC, Eur. J. Control 11 (2005), no. 4-5, 310–334.

H.-G. Beyer, The theory of evolution strategies, Springer, Heidelberg, 2001.

D.P. Bertsekas and J.N. Tsitsiklis, Neuro-dynamic programming, Athena Scientific, 1996.

Zhihao Cen, J. Frédéric Bonnans, and Thibault Christel, Energy contracts management by stochastic programming techniques, Annals of Operations Research 200 (2011), no. 1, 199–222.


References III

Gérard Cornuéjols, Revival of the Gomory cuts in the 1990's, Annals OR 149 (2007), no. 1, 63–66.

Z. L. Chen and W. B. Powell, Convergent cutting-plane and partial-sampling algorithm for multistage stochastic linear programs with recourse, J. Optim. Theory Appl. 102 (1999), no. 3, 497–524.

Christopher J. Donohue and John R. Birge, The abridged nested decomposition method for multistage stochastic linear programs with relatively complete recourse, Algorithmic Operations Research 1 (2006), no. 1.


References IV

Václav Fabian, Stochastic approximation of minima with improved asymptotic speed, Ann. Math. Statist. 38 (1967), no. 1, 191–200.

S. Hong, P.O. Malaterre, G. Belaud, and C. Dejean, Optimization of irrigation scheduling for complex water distribution using mixed integer quadratic programming (MIQP), HIC 2012 – 10th International Conference on Hydroinformatics, Hamburg, Germany, IAHR, 2012.


References V

Verena Heidrich-Meisner and Christian Igel, Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, New York, NY, USA, ACM, 2009, pp. 401–408.

Nikolaus Hansen, Sibylle D. Müller, and Petros Koumoutsakos, Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES), Evolutionary Computation 11 (2003), no. 1, 1–18.

Holger Heitsch and Werner Römisch, Scenario tree reduction for multistage stochastic programs, Computational Management Science 6 (2009), no. 2, 117–133.


References VI

L. G. Khachiyan, A polynomial algorithm in linear programming, Doklady Akademii Nauk SSSR 244 (1979), 1093–1096.

K. Linowsky and A. B. Philpott, On the convergence of sampling-based decomposition algorithms for multistage stochastic programs, J. Optim. Theory Appl. 125 (2005), no. 2, 349–366.

A.B. Philpott and Z. Guan, On the convergence of stochastic dual dynamic programming and related methods, Operations Research Letters 36 (2008), no. 4, 450–455.

P. Pinson, Renewable energy forecasts ought to be probabilistic, 2013, WIPFOR seminar, EDF.


References VII

W.-B. Powell, Approximate dynamic programming, Wiley, 2007.

M. V. F. Pereira and L. M. V. G. Pinto, Multi-stage stochastic optimization applied to energy planning, Math. Program. 52 (1991), no. 2, 359–375.

A. Ruszczyński, Decomposition methods, vol. 10, North-Holland Publishing Company, Amsterdam, 2003.

S. Uryasev and R.T. Rockafellar, Optimization of conditional value-at-risk, Research Report, Department of Industrial & Systems Engineering, University of Florida, 1999.


Appendix

Parametric policies

(Figure: a one-hidden-layer neural network with inputs x_1, . . . , x_n and a bias unit x_0 = 1, a tanh hidden layer, and outputs z_1, . . . , z_k; the weights are W_2, W_3 on the input side and W_1, W_0 on the output side.)

z = W0 + W1 tanh(W2 x + W3)
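The policy z = W0 + W1 tanh(W2 x + W3) is straightforward to write down; a pure-Python sketch with plain lists (no particular library assumed, and the function name is illustrative):

```python
import math

def policy(x, W0, W1, W2, W3):
    """One-hidden-layer parametric policy: z = W0 + W1 tanh(W2 x + W3).
    x: n inputs; W2: m x n matrix; W3: m hidden biases;
    W1: k x m matrix; W0: k output biases."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W2, W3)]
    return [b + sum(w * h for w, h in zip(row, hidden))
            for b, row in zip(W0, W1)]
```

With a single hidden unit and unit weights, `policy([0.0], [0.0], [[1.0]], [[1.0]], [0.0])` returns `[0.0]`, since tanh(0) = 0.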


Parametric policies

x = (x_1, x_2, . . . , x_n)^T,   z = (z_1, z_2, . . . , z_k)^T,

W_3 = (w^{(1)}_{10}, w^{(1)}_{20}, . . . , w^{(1)}_{m0})^T,   W_0 = (w^{(2)}_{10}, w^{(2)}_{20}, . . . , w^{(2)}_{k0})^T,

W_2 = (w^{(1)}_{ij}), an m × n matrix,   W_1 = (w^{(2)}_{ij}), a k × m matrix.

z = W0 + W1 tanh(W2 x + W3)


Self-adaptive Evolution Strategy (SA-ES) with revaluations

Require: K > 0, λ > µ > 0, a dimension d > 0, τ (usually τ = 1/√(2d)).
Initialize the parent population Pµ = {(x_1, σ_1), (x_2, σ_2), . . . , (x_µ, σ_µ)} with ∀i ∈ {1, . . . , µ}, x_i ∈ R^d and σ_i = 1.
while stop condition not met do
    Generate the offspring population Pλ = {(x′_1, σ′_1), (x′_2, σ′_2), . . . , (x′_λ, σ′_λ)}, where each individual is generated by:
    1. Select (randomly) ρ parents from Pµ.
    2. Recombine the ρ selected parents to form a recombinant individual (x′, σ′).
    3. Mutate the strategy parameter: σ′ ← σ′ e^{τ N(0,1)}.
    4. Mutate the objective parameter: x′ ← x′ + σ′ N(0, 1).
    Select the new parent population Pµ by taking the µ best from Pλ ∪ Pµ.
end while
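The loop above can be sketched in Python for a minimal case. This is a simplified illustration, not the thesis implementation: recombination degenerates to picking one parent (ρ = 1) and minimization is assumed.

```python
import math
import random

def sa_es(f, d, mu, lam, iters, rng=None):
    """Self-adaptive ES sketch: each individual carries its own step size
    sigma, mutated log-normally before the objective parameters; the next
    parents are the mu best of parents plus offspring (minimization)."""
    rng = rng or random.Random(0)
    tau = 1.0 / math.sqrt(2 * d)
    pop = [([rng.gauss(0, 1) for _ in range(d)], 1.0) for _ in range(mu)]
    for _ in range(iters):
        offspring = []
        for _ in range(lam):
            x, sigma = rng.choice(pop)                   # degenerate recombination
            sigma = sigma * math.exp(tau * rng.gauss(0, 1))  # mutate sigma first
            x = [xi + sigma * rng.gauss(0, 1) for xi in x]   # then mutate x
            offspring.append((x, sigma))
        pop = sorted(pop + offspring, key=lambda ind: f(ind[0]))[:mu]
    return pop[0][0]
```

On the 2-d sphere function, a few dozen iterations already bring the best point close to the optimum, since selection over Pλ ∪ Pµ makes the best fitness non-increasing.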


1: Input: an initial x_1 = 0 ∈ R^d, 1/2 > γ > 0, a > 0, c > 0, m ∈ N, weights w_1 > · · · > w_m summing to 1, scales 1 ≥ u_1 > · · · > u_m > 0.
2: n ← 1
3: while (true) do
4:     Compute σ_n = c/n^γ.
5:     Evaluate the gradient g at x_n by finite differences, averaging over 2m samples per axis:
           ∀(i, j) ∈ {1, . . . , d} × {1, . . . , m},  x^{(i,j)+}_n = x_n + u_j e_i,
           ∀(i, j) ∈ {1, . . . , d} × {1, . . . , m},  x^{(i,j)−}_n = x_n − u_j e_i,
           ∀i ∈ {1, . . . , d},  g^{(i)} = (1/(2σ_n)) Σ_{j=1}^m w_j ( f(x^{(i,j)+}_n) − f(x^{(i,j)−}_n) ).
6:     Apply x_{n+1} ← x_n − a_n g
7:     n ← n + 1
8: end while

Fabian's stochastic gradient algorithm with finite differences. Several variants have been defined, in particular versions in which only one point (or a constant number of points, independently of the dimension) is evaluated at each iteration.
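A single step of a simplified Fabian-style scheme in Python. Several details are assumptions made for illustration and not taken from the slides: the perturbations are scaled as σ_n u_j, the step size is a_n = a/n, and the default constants and two-scale weights are arbitrary.

```python
def fabian_step(f, x, n, a=0.5, c=0.1, gamma=0.49,
                u=(1.0, 0.5), w=(0.75, 0.25)):
    """One iteration: estimate the gradient of f at x by weighted central
    finite differences at scales u_j * sigma_n, then move by a/n times it."""
    d = len(x)
    sigma = c / n ** gamma            # sigma_n = c / n^gamma
    g = []
    for i in range(d):
        gi = 0.0
        for uj, wj in zip(u, w):
            xp = list(x); xp[i] += uj * sigma   # x^{(i,j)+}
            xm = list(x); xm[i] -= uj * sigma   # x^{(i,j)-}
            gi += wj * (f(xp) - f(xm))
        g.append(gi / (2 * sigma))
    return [xi - (a / n) * gi for xi, gi in zip(x, g)]
```

On a noise-free quadratic such as f(x) = (x_0 − 1)², the finite differences are exact and repeated steps contract the error toward the minimizer.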


Lemmas - Introduction (1)

The objective function

f_{x∗,β,γ}(x) = B( γ (‖x − x∗‖/√d)^β + (1 − γ) )

(B(p): a Bernoulli random variable of parameter p) and the locality assumption

∀f ∈ F, P( ∀i ≤ n, ‖x_i − x∗‖ ≤ C(d)/i^α ) ≥ 1 − δ/2

imply, with probability at least 1 − δ/2,

E f(x∗) ≤ E f(x_n) ≤ E f(x∗) + γ C(d)^β / (d^{β/2} n^{αβ}),   with E f(x∗) = 1 − γ.

(f and E f(x) are short notations for f_{x∗,β,γ} and E_ω f(x, ω).)


Lemmas - Introduction (2)

‖x_i − x∗‖ ≤ C(d)/i^α   (the locality assumption)

⇔ γ (‖x_i − x∗‖/√d)^β + (1 − γ) ≤ (1 − γ) + γ (C(d)/(i^α √d))^β

⇔ E f(x_i) ≤ E f(x∗) + γ (C(d)/(i^α √d))^β

⇔ E f(x_i) ≤ E f(x∗) + γ C(d)^β / (i^{αβ} d^{β/2})


Proof

Consider f_{x∗} = f_{x∗,β,γ} with x∗ uniformly distributed in R. Then:

P(‖x_n − x∗‖ ≤ ε)
  ≤ E_Ω P_{x∗}( x∗ ∈ Enl(X_{n,Ω}, ε) )
  ≤ P(#X_{n,Ω} ≤ C) · P_{x∗}( x∗ ∈ Enl(X_{n,Ω}, ε) | #X_{n,Ω} ≤ C ) + P(#X_{n,Ω} > C)
  ≤ (1 − δ/2) C/C′ + δ/2
  < 1 − δ/2

where Enl(U, ε) is the ε-enlargement of U, defined as:

Enl(U, ε) = { x ; ∃x′ ∈ U, ‖x − x′‖ ≤ ε }.

This contradicts the locality assumption, which concludes the proof of αβ ≤ 1.


R-EDA (1)


R-EDA (2)


R-EDA (3)


Theorem’s proofStep 2

(??) ⇐⇒ ln

(f (x)

f (x′)

)≥ max

(ln

(1

η

), ln

(K ′

K ′′

)+ ln

(||x||||x′||

))

f (x′) ≤ ηf (x) ⇐⇒1

ηf (x′) ≤ f (x)

⇐⇒1

η≤

f (x)

f (x′)

⇐⇒ ln

(f (x)

f (x′)

)≥ ln

(1

η

)

f (x′) ≤K ′′

K ′||x′||||x||

f (x) ⇐⇒ f (x′)K ′

K ′′||x||||x′||

≤ f (x)

⇐⇒K ′

K ′′||x||||x′||

≤f (x)

f (x′)

⇐⇒ ln

(K ′

K ′′||x||||x′||

)≤ ln

(f (x)

f (x′)

)

⇐⇒ ln

(f (x)

f (x′)

)≥ ln

(K ′

K ′′

)+ ln

(||x||||x′||

)


Theorem’s proofStep 2

if l′ ≥ ln(c′) then

I l < ln(c) (otherwise it couldn’t be a “win”)

l′ − ln(c′) ≤ ln(c + maxi||δi ||)− ln(c′)

≤ ln(c) + ln(1 + maxi||δi ||/c)− ln(c′)

≤ ln(c/c′) + maxi||δi ||/c

≤ maxi||δi ||/c

l′ − ln(c′) = ln(||x + σδi ||

σ)− ln(c′) ≤ ln(

||x||σ

+ δi )− ln(c′)

≤ ln(c ∗ (1 +δi

c))− ln(c′)

≤ ln(c) + ln(1 + maxi||δi ||/c)− ln(c′)

≤ ln(c/c′) + maxi||δi ||/c

≤ maxi||δi ||/c


Theorem’s ProofStep 3

Summing iterationsHence if t > n0,

ln(f (Xt )) ≤ ln(f (X1))− (t − n0)×∑i

max(

ln(

), ln

(K′K′′)

+ ∆)

min

(1 + ln( c

b) ∆

ln(2), 1 + ln( c

b)

maxj ln(||δj ||)ln(2)

)⇐⇒ ln(f (Xt ))− ln(f (X1)) ≤ −(t − n0)× C

⇐⇒ln(

f (Xt )f (X1)

)t − n0

≤ −C

⇒ln(||Xt ||)

t≤ K < 0 (theorem)

with C a positive constant.


Corollary’s proofStep 2 (1/4)

Step 2: showing that σ small leads to high acceptance rate and σ highleads to small acceptance rate.Thanks to the bounded conditioning, there exists ε > 0 s.t.

s ′ <1

2s (1)

with s = sup

σ

||x||;σ, x, f s.t. p ≥ ε

2

and s ′ = inf

σ

||x||;σ, x, f s.t. p <

1

2− ε

2

because s ′ → 0 and s →∞ as ε→ 0.


Corollary’s proofStep 2 (2/4)

Notes

s = sup

σ

||x||;σ, x, f s.t. p ≥ ε

s ′ = inf

σ

||x||;σ, x, f s.t. p <

1

2− ε

Thensup

x,f ,σ>0|p − p| ≤ ε/2

implies1

2s ≥ 1

2s and s ′ ≥ s ′

Page 95

Appendix

Corollary’s Proof: Step 2 (3/4)

So

ŝ′ ≤ s′ < (1/2) s ≤ (1/2) ŝ

and hence

ŝ′ ≤ (1/2) ŝ

Page 96

Appendix

Corollary’s Proof: Step 2 (4/4)

This provides k1, k2, c′ and b′ as follows for Eqs. ?? and ??:

1/b′ = ŝ  = sup { σ/||x|| ; σ, x, f s.t. p̂ ≥ ε }
1/c′ = ŝ′ = inf { σ/||x|| ; σ, x, f s.t. p̂ < 1/2 − ε }
k1 = ⌊εk⌋
k2 = ⌈(1/2 − ε)k⌉

The equations above imply c′ ≥ 2b′.

Page 97

Appendix

Corollary’s Proof: Step 3

Taking k large enough yields

b⁻¹ = sup { σ/||x|| ; σ, x, f s.t. p̂ > k1/k }
c⁻¹ = inf { σ/||x|| ; σ, x, f s.t. p̂ < k2/k }

which provide Assumptions 2 and 5 with b < c. Assumptions 2 and 5 then imply b < b′ and c′ < c.
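In practice the acceptance probability p is only observed through an empirical frequency p̂ over k trials, compared against the thresholds k1/k and k2/k. A minimal Monte Carlo sketch of that estimation, assuming a sphere objective; `empirical_acceptance` and all constants below are hypothetical:

```python
import math
import random

def empirical_acceptance(x, sigma, k, seed=0):
    # Estimate p = P( f(x + sigma*delta) <= f(x) ) from k Gaussian trials,
    # with f the sphere function f(v) = ||v||^2.
    rng = random.Random(seed)
    f = lambda v: sum(t * t for t in v)
    wins = 0
    for _ in range(k):
        delta = [rng.gauss(0, 1) for _ in range(len(x))]
        y = [xi + sigma * di for xi, di in zip(x, delta)]
        wins += f(y) <= f(x)
    return wins / k

eps, k = 0.1, 1000
k1 = math.floor(eps * k)          # k1 = floor(eps * k)
k2 = math.ceil((0.5 - eps) * k)   # k2 = ceil((1/2 - eps) * k)
x = [1.0] * 10
p_small = empirical_acceptance(x, sigma=1e-3, k=k)  # tiny step: p_hat near 1/2
p_large = empirical_acceptance(x, sigma=10.0, k=k)  # huge step: p_hat near 0
print(p_small > k2 / k, p_large < k1 / k)
```

A very small σ is accepted about half the time (p̂ above k2/k), while a very large σ is almost never accepted (p̂ below k1/k), matching the two regimes distinguished in Step 2.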

Page 98

Appendix

Non quasi-convex functions

[Figure: six contour plots, panels (a)–(f), of non quasi-convex test functions over [−10, 10] × [−10, 10].]
