Optimization of Power Systems - Old and New Tools
TRANSCRIPT
I do not speak Chinese!!!
And my English is extremely French (when native English speakers listen to my English, they sometimes believe that they suddenly, by miracle, understand French).
For the moment, if I gave a talk in Chinese, it would be boring, with only: hse-hse, nirao, pukachi.
Interrupt me as much as you want to facilitate understanding :-)
High-Scale Power Systems: Simulation & Optimization
Olivier Teytaud + Inria-Tao + Artelys
TAO project-team
INRIA Saclay Île-de-France
O. Teytaud, Research Fellow, [email protected], http://www.lri.fr/~teytaud/
Ilab METIS
www.lri.fr/~teytaud/metis.html
Metis = Tao + Artelys
TAO (tao.lri.fr): Machine Learning & Optimization; joint INRIA / CNRS / Univ. Paris-Sud team
12 researchers, 17 PhDs, 3 post-docs, 3 engineers
Artelys (www.artelys.com): SME - France / US / Canada, about 50 people
==> collaboration through a common platform
Activities: optimization (uncertainties, sequential); application to power systems
Importantly, it is not a lie.
It is a tradition, in research institutes, to claim some links with industry.
I don't claim that having such links is necessary or always a great achievement in itself.
But I do claim that in my case it is true that I have links with industry:
my four students here in Taiwan, and others in France, all have real salaries based on industrial funding.
All in one slide
Consider an electric system.
Decisions = strategic decisions (a few time steps):
- building a nuclear power plant
- building a Spain-Morocco connection
- building a wind farm
tactical decisions (many time steps):
- switching on hydroelectricity plant #7
- switching on thermal plant #4
- ...
Based on simulations of the tactical level, which depends on the strategic level.
A bit more precisely:
the strategic level
Brute force approach for the strategic level:
I simulate each possible strategic decision (e.g. 20,000), 1000 times, each of them with optimal tactical decisions ==> 20,000 optimizations, 1000 simulations each.
I choose the best one.
Better: more simulations on the best strategic decisions. However, this talk will not focus on that part.
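For concreteness, a minimal Python sketch of this simulate-and-pick-best loop. Here `tactical_cost` is a hypothetical stand-in for "optimal tactical decisions on one random scenario" (a real version would run a full tactical optimization), and the scale is shrunk so the toy runs quickly (the talk's scale is 20,000 candidates x 1,000 simulations):

import random

# Hypothetical stand-in for "optimal tactical decisions on one random
# scenario"; a real version would run a full tactical optimization.
def tactical_cost(strategic_decision, rng):
    return (strategic_decision - 3.7) ** 2 + rng.gauss(0.0, 1.0)

def evaluate(strategic_decision, n_simulations=100, seed=0):
    """Average cost of one strategic decision over random scenarios.
    Reusing the same seed for every candidate (common random numbers)
    makes the comparison between candidates less noisy."""
    rng = random.Random(seed)
    total = sum(tactical_cost(strategic_decision, rng) for _ in range(n_simulations))
    return total / n_simulations

# Shrunk scale: 1,000 candidate strategic decisions, 100 simulations each.
candidates = [i * 0.01 for i in range(1000)]
best = min(candidates, key=evaluate)
print("best strategic decision:", best)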
A bit more precisely:
the tactical level
Brute force approach for the tactical level:
Simplify: replace each random process by its expectation; optimize decisions deterministically.
But reality is stochastic: water inflows, wind farms.
Better: optimizing a policy (i.e. reactive, closed-loop).
Planning/control (tactical level): pluriannual planning, evaluating the marginal costs of hydroelectricity, taking into account stochasticity and uncertainties ==> IOMCA (ANR)
High-scale investment studies (e.g. Europe + North Africa): long term (2030-2050), huge (non-stochastic) uncertainties; investments: interconnections, storage, smart grids, power plants... ==> POST (ADEME)
Moderate scale (cities, factories; the tactical level is simpler): master plan optimization, stochastic uncertainties ==> Citines project (FP7)
Specialization in Power Systems
Example: interconnection studies
(demand levelling, stabilized supply)
The POST project: supergrid simulation and optimization
European subregions:
- Case 1: electric corridor France / Spain / Morocco
- Case 2: south-west (France / Spain / Italy / Tunisia / Morocco)
- Case 3: Maghreb + Central West Europe
==> towards a European supergrid
Related ideas in Asia. Mature technology: HVDC links (high-voltage direct current).
Tactical level: unit commitment at the scale of a country: it looks like a game.
Many time steps.
Many power plants.
Some of them have stocks (hydroelectricity).
Many constraints (rules).
Uncertainties (water inflows, temperature, ...).
==> make decisions: when should I switch on? (for each power plant)
At which power?
Investment decisions through simulations
Issues:
- Demand varying in time, limited predictability
- Transportation introduces constraints
- Renewables ==> variability ++
Methods:
- Markovian assumptions ==> wrong
- Simplified models ==> model error >> optimization error
Our approach: Machine Learning on top of Mathematical Programming; hybridization of reinforcement learning and mathematical programming.
Math programming (mathematicians doing discrete-time control): nearly exact solutions for a simplified problem; high-dimensional constrained action space; but small state space & not anytime.
Reinforcement learning (artificially intelligent people doing discrete-time control :-) ): unstable; small model bias; small / simple action space; but high-dimensional state space & anytime.
Now the technical part:
Model Predictive Control, Stochastic Dynamic Programming, Direct Policy Search, and Direct Value Search (new);
Direct Value Search combines Direct Policy Search and Stochastic Dynamic Programming.
(3/4 of this talk is about the state of the art, only 1/4 our work)
[email protected]@[email protected]@inria.fr
Many optimization tools (SDP, MPC): strong constraints on forecasts, strong constraints on the model structure.
Direct Policy Search: arbitrary forecasts, arbitrary structure, but not scalable in the number of decision variables.
Merge: Direct Value Search.
Stochastic Dynamic Optimization
Classical solutions: Bellman (old & new)
- Anticipativity
- Markov chains
- Overfitting
- SDP, SDDP
Alternate solution: Direct Policy Search
- No problem with anticipativity
- Scalability issue
The best of both worlds: Direct Value Search
Stochastic Control
[Diagram: a random process feeds random values into the system; a controller with memory receives observations of the state and sends commands; the system produces a cost.]
For an optimal representation, you need access to the whole archive, or to forecasts (generative model / probabilistic forecasts) (Astrom 1965).
Anticipative solutions: maximum over strategic decisions, of the average over random processes, of optimized decisions given the random processes & strategic decisions.
Pros/cons: much simpler (deterministic optimization), but in real life you cannot guess November rains in January; rather optimistic decisions.
Anticipative solutions (variant): maximum over strategic decisions, of pessimistic forecasts (e.g. a quantile), of optimized decisions given the forecasts & strategic decisions.
Pros/cons: much simpler (deterministic optimization), but in real life you cannot guess November rains in January; not so optimistic, convenient, simple.
MODEL PREDICTIVE CONTROL
Ok, we have done one of the four targets: model predictive control.
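A minimal receding-horizon sketch of this idea, on a toy reservoir I made up (one stock, thermal backup, uniform random inflows): at each step we take a pessimistic (low-quantile) inflow forecast, solve a deterministic LP over the remaining horizon, apply only the first release, and re-plan. All names and numbers are invented for illustration:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
H, S0, RMAX = 10, 5.0, 2.0          # horizon, initial stock, max release
DEMAND = np.full(H, 1.5)            # demand per step
CTH = np.linspace(1.0, 2.0, H)      # thermal price (rises over the horizon)

def pessimistic_forecast(h, quantile=0.2):
    # Stand-in for a real forecasting module: a low quantile of the
    # inflows (true inflows are Uniform(0, 1) below).
    return np.full(h, quantile)

def plan(stock, horizon):
    """Deterministic LP over the horizon: x = [releases, thermal]."""
    inflow = pessimistic_forecast(horizon)
    c = np.concatenate([np.zeros(horizon), CTH[-horizon:]])
    # demand: release_t + thermal_t >= d_t  <=>  -r_t - t_t <= -d_t
    A1 = np.hstack([-np.eye(horizon), -np.eye(horizon)])
    b1 = -DEMAND[-horizon:]
    # water: sum_{u<=t} r_u <= stock + sum_{u<t} inflow_u
    A2 = np.hstack([np.tril(np.ones((horizon, horizon))),
                    np.zeros((horizon, horizon))])
    b2 = stock + np.concatenate([[0.0], np.cumsum(inflow)[:-1]])
    bounds = [(0.0, RMAX)] * horizon + [(0.0, None)] * horizon
    res = linprog(c, A_ub=np.vstack([A1, A2]),
                  b_ub=np.concatenate([b1, b2]), bounds=bounds)
    return res.x[0]                 # always feasible here; apply first release

stock, total_cost = S0, 0.0
for t in range(H):
    r = min(plan(stock, H - t), stock)          # receding horizon
    thermal = max(DEMAND[t] - r, 0.0)
    total_cost += CTH[t] * thermal
    stock = stock - r + rng.uniform(0.0, 1.0)   # real (random) inflow
print("MPC total thermal cost:", round(total_cost, 2))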
Markov solution
Representation as a Markov process (a tree): this is the representation of the random process. Let us see how to represent the rest.
How to solve, simple case, binary stock, one day
[Diagram: it is December 30th and I have water (FutureCost = 0). Decision "I use water" (cost = 0) leads to "no more water, December 31st"; decision "I do not use" leads to "I have water, December 31st".]
How to solve, simple case, binary stock, 3 days, no random process
[Diagram: a lattice of stock states over 3 days, with a cost on each decision-edge; backward induction fills in the cost-to-go at each node, and the cheapest path is then read off.]
This was deterministic. How to add a random process? Just multiply nodes :-)
How to solve, simple case, binary stock, 3 days, random parts
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
Probability 1/3Probability 2/3
Markov solution: ok, you have understood stochastic dynamic programming (Bellman)
Representation as a Markov process (a tree): this is the representation of the random process. In each node, there are the state-nodes with decision-edges.
Ok, we have done the 2nd of the four targets: stochastic dynamic programming.
Markov solution
Representation as a Markov process (a tree): optimize decisions for each state. This means you are not cheating. But difficult to use.
Might be ok for your problem?
The strategy is optimized for very specific forecasting models.
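A tabular sketch of this backward induction, on a toy close to the binary-stock example above (a few discrete stock levels, one unit of demand per day, a random inflow); the numbers are illustrative, not the ones from the diagrams:

# Tabular stochastic dynamic programming (Bellman) on a toy problem:
# discrete stock levels, 3 days, one unit of demand per day, and a
# random inflow (+1 unit of water with probability 1/3).
DAYS, LEVELS = 3, 4                 # stock in {0, 1, 2, 3}
DEMAND, CTH = 1, 2.0                # demand per day, thermal price
P_INFLOW = 1.0 / 3.0                # P(one unit of water arrives)

V = [[0.0] * LEVELS for _ in range(DAYS + 1)]     # V[t][s] = cost-to-go
policy = [[0] * LEVELS for _ in range(DAYS)]

for t in reversed(range(DAYS)):                   # backward in time
    for s in range(LEVELS):                       # for each state
        best = None
        for release in range(min(s, DEMAND) + 1): # feasible decisions
            thermal_cost = CTH * (DEMAND - release)
            # expectation over the random inflow:
            exp_future = sum(
                p * V[t + 1][min(s - release + dw, LEVELS - 1)]
                for dw, p in [(1, P_INFLOW), (0, 1.0 - P_INFLOW)]
            )
            cost = thermal_cost + exp_future
            if best is None or cost < best:
                best, policy[t][s] = cost, release
        V[t][s] = best

print("cost-to-go at t=0:", [round(v, 2) for v in V[0]])
print("policy at t=0 (release per stock level):", policy[0])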
Overfitting
Representation as a Markov process (a tree): how do you actually make decisions when the random values are not exactly those observed? (heuristics...)
Check on random realizations which have not been used for building the tree. Does it work correctly?
Overfitting = when it works only on the scenarios used in the optimization process.
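This check is just a train/test split over scenarios. A minimal sketch, where `make_scenario` and `policy_cost` are hypothetical placeholders for your scenario generator and simulator:

import random

def make_scenario(rng, length=24):
    return [rng.uniform(0.0, 1.0) for _ in range(length)]   # e.g. inflows

def policy_cost(policy, scenario):
    return sum(abs(policy(t) - w) for t, w in enumerate(scenario))

rng = random.Random(42)
train = [make_scenario(rng) for _ in range(50)]   # used for optimization
test = [make_scenario(rng) for _ in range(50)]    # never seen before

policy = lambda t: 0.5      # stand-in for the optimized controller
avg = lambda scenarios: sum(policy_cost(policy, s) for s in scenarios) / len(scenarios)
print("train cost:", round(avg(train), 2), " test cost:", round(avg(test), 2))
# A large gap between the two numbers is the symptom of overfitting.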
SDP / SDDP
Stochastic (Dual) Dynamic Programming
Representation of the controller with Linear Programming (the value function as a piecewise-linear function): ok for 100,000 decision variables per time step (tens of time steps, hundreds of plants, several decisions each),
but solving by expensive SDP/SDDP (curse of dimensionality, exponential in the number of state variables).
Constraints:
- Needs an LP approximation: ok for you?
- SDDP requires convex Bellman values: ok for you?
- Needs Markov random processes: ok for you? (possibly after some random process extension...)
Goal: keep the scalability, but get rid of the SDP/SDDP solving.
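To make "value function as piecewise linear" concrete, here is a sketch of a single stage: the future cost is a maximum of affine cuts, so the one-stage decision is a small LP (solved with scipy below). The cuts and prices are invented for illustration; a real SDDP code generates the cuts in its backward passes:

import numpy as np
from scipy.optimize import linprog

# Future cost V(s') = max_k (a_k * s' + b_k), a convex piecewise-linear
# function given by Benders-like cuts (slopes a_k, intercepts b_k).
CUTS = [(-2.0, 10.0), (-0.5, 4.0), (0.0, 2.0)]
DEMAND, CTH = 1.5, 3.0              # demand, thermal price (invented)

def one_stage_decision(stock, inflow):
    # variables x = [release, thermal, theta]; minimize CTH*thermal + theta
    c = np.array([0.0, CTH, 1.0])
    A_ub, b_ub = [], []
    A_ub.append([-1.0, -1.0, 0.0])                # release + thermal >= demand
    b_ub.append(-DEMAND)
    for a, b in CUTS:       # theta >= a*(stock - release + inflow) + b
        A_ub.append([-a, 0.0, -1.0])
        b_ub.append(-a * (stock + inflow) - b)
    bounds = [(0.0, stock), (0.0, None), (None, None)]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x

release, thermal, theta = one_stage_decision(stock=2.0, inflow=0.3)
print(f"release={release:.2f} thermal={thermal:.2f} future-cost estimate={theta:.2f}")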
Summary
The most classical solution = SDP and variants,
or MPC (model-predictive control), replacing the stochastic parts by deterministic pessimistic forecasts.
The statistical model is cast into a tree model & the (probabilistic) forecasting modules are essentially lost.
Direct Policy Search
Requires a parametric controller.
Principle: optimize the parameters on simulations.
Unusual in large-scale power systems (we will see why).
Usual in other areas (finance, evolutionary robotics).
Stochastic Control
[Diagram: the same control loop as before: the random process feeds the system; the controller with memory maps state and forecasts to commands; the system outputs a cost.]
Optimize the controller thanks to a simulator:
Command = Controller(w, state, forecasts)
Simulate(w) = stochastic loss with parameter w
w* = argmin [Simulate(w)]
Ok, we have done the 3rd of the four targets: direct policy search.
Direct Policy Search (DPS)
Requires a parametric controller, e.g. a neural network:
Controller(w, x) = W3 + W2.tanh(W1.x + W0)
Noisy black-box optimization.
Advantages: non-linear is ok, forecasts are included.
Issue: too slow; hundreds of parameters for even 20 decision variables (depends on the structure).
Idea: a special structure for DPS (inspired from SDP).
The strategy is optimized given the real forecasting module you have (forecasts are inputs).
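A self-contained sketch of plain DPS, on a made-up reservoir simulator: the controller is exactly the small tanh network above, and its weights are optimized by a simple (1+1)-evolution strategy on averaged simulated costs (common random seeds reduce the noise):

import numpy as np

rng = np.random.default_rng(0)
H, HID = 20, 8                      # horizon, hidden units
DIM = HID * 4 + 1                   # W1 (HID x 2), W0, W2, W3

def controller(w, x):
    # Controller(w, x) = W3 + W2.tanh(W1.x + W0), inputs x = [stock, t/H]
    W1 = w[:HID * 2].reshape(HID, 2); W0 = w[HID * 2:HID * 3]
    W2 = w[HID * 3:HID * 4]; W3 = w[HID * 4]
    return W3 + W2 @ np.tanh(W1 @ x + W0)

def simulate(w, seed):
    # Toy reservoir: release covers demand, thermal fills the rest.
    r = np.random.default_rng(seed)
    stock, cost = 2.0, 0.0
    for t in range(H):
        release = np.clip(controller(w, np.array([stock, t / H])), 0.0, stock)
        cost += 3.0 * max(1.0 - release, 0.0)
        stock = min(stock - release + r.uniform(0.0, 1.0), 4.0)
    return cost

# (1+1)-ES with a 1/5th-style success rule on the noisy objective.
w, sigma = np.zeros(DIM), 0.5
f = np.mean([simulate(w, s) for s in range(20)])
for it in range(200):
    cand = w + sigma * rng.standard_normal(DIM)
    fc = np.mean([simulate(cand, s) for s in range(20)])   # same seeds
    if fc < f:
        w, f, sigma = cand, fc, sigma * 1.5
    else:
        sigma *= 0.9
print("best simulated cost:", round(f, 2))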
Direct Value Search
SDP representation in DPS:
Controller(state) = argmin Cost(decision) + V(nextState)
V(nextState) = alpha x nextState
alpha = NeuralNetwork(w, state) (or a more sophisticated LP)
==> given w, decision making is solved as an LP
==> a non-linear mapping (itself not an LP) chooses the parameters of the LP from the current state
Drawback: requires the optimization of w (= a noisy black-box optimization problem).
Summary: the best of both worlds
Controller(w, state): V(w, state, .) is non-linear; optimizing Cost(decision) + V(w, state, nextState) is an LP.
Simul(w): do a simulation with w; return the cost.
DirectValueSearch: optimize w* = argmin Simul(w); return the Controller with w*.
The structure of the controller (fast, scalable by structure); a simulator (you can put anything you want in it, even if it is not linear, nothing Markovian...); the optimization (will do its best, given the simulator and the structure).
3 optimizers:
- SAES
- Fabian: gradient descent with redundant finite differences
- a Newton version
Ok, we have done the 4th of the four targets: direct value search.
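A sketch of the whole Direct Value Search loop, under toy assumptions (one stock, invented prices and inflows): a small network maps the state to alpha, the decision is the LP argmin of Cost(decision) + alpha x nextState, and w is tuned by a (1+1)-ES standing in for SAES/Fabian:

import numpy as np
from scipy.optimize import linprog

H, HID, DEMAND, CTH = 20, 6, 1.0, 3.0
DIM = HID * 4 + 1                   # W1 (HID x 2), W0, W2, W3

def alpha_net(w, state):            # alpha = NeuralNetwork(w, state)
    W1 = w[:HID * 2].reshape(HID, 2); W0 = w[HID * 2:HID * 3]
    W2 = w[HID * 3:HID * 4]; W3 = w[HID * 4]
    return float(W3 + W2 @ np.tanh(W1 @ state + W0))

def decide(alpha, stock, inflow):
    # x = [release, thermal]: min CTH*thermal + alpha*(stock - release + inflow);
    # the constant term alpha*(stock + inflow) is dropped from the objective.
    c = np.array([-alpha, CTH])
    A_ub = np.array([[-1.0, -1.0]]); b_ub = np.array([-DEMAND])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, stock), (0.0, None)])
    return res.x

def simulate(w, seed):
    r, stock, cost = np.random.default_rng(seed), 2.0, 0.0
    for t in range(H):
        inflow = r.uniform(0.0, 0.8)
        a = alpha_net(w, np.array([stock, t / H]))
        release, thermal = decide(a, stock, inflow)
        cost += CTH * thermal
        stock = min(stock - release + inflow, 4.0)
    return cost

rng = np.random.default_rng(1)
w, sigma = np.zeros(DIM), 0.3
best = np.mean([simulate(w, s) for s in range(10)])
for it in range(60):                # (1+1)-ES on the noisy objective
    cand = w + sigma * rng.standard_normal(DIM)
    fc = np.mean([simulate(cand, s) for s in range(10)])
    if fc < best:
        w, best, sigma = cand, fc, sigma * 1.5
    else:
        sigma *= 0.9
print("Direct Value Search, simulated cost:", round(best, 2))

Note how the scalability comes from the structure: the hard, high-dimensional part (the decision) is an LP, while the black-box optimizer only touches the low-dimensional w.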
State of the art in discrete-time control, a few tools:
Model Predictive Control: for making a decision in a given state, (i) do forecasts, (ii) replace the random processes by pessimistic forecasts, (iii) optimize as if the problem were deterministic.
Stochastic Dynamic Programming: Markov model; compute the cost-to-go backwards.
Direct Policy Search: parametric controller, optimized on simulations.
Conclusion
Still rather preliminary (less tested than MPC or SDDP) but promising:
- Forecasts naturally included in the optimization
- Anytime algorithm (the user immediately gets approximate results)
- No convexity constraints
- Room for detailed simulations (e.g. with a very small time scale, for volatility)
- No random process constraints (no Markov assumption)
- Can handle large state spaces (as DPS)
- Can handle large action spaces (as SDP)
==> can work on the real problem, without casting it into a simplified model
Bibliography
- Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. D. Bertsekas, 2005. (MPC = deterministic forecasts)
- Astrom, 1965.
- Renewable energy forecasts ought to be probabilistic! P. Pinson, 2013. (WIPFOR talk)
- Training a neural network with a financial criterion rather than a prediction criterion. Y. Bengio, 1997. (quite a practical application of direct policy search, convincing experiments)
Questions?
Appendix
Representation of the controller:
decision(currentState) = argmin Cost(decision) + Bellman(nextState)
Linear programming (LP) if: for a given current state, nextState = LP(decision), and Cost(decision) = LP(decision)
100,000 decision variables per time step