Optimization of Power Systems - Old and New Tools
TRANSCRIPT
I do not speak Chinese!!!
And my English is extremely French (when native English speakers listen to my English, they sometimes believe that they suddenly, by miracle, understand French).
For the moment, if I gave a talk in Chinese, it would be boring, with only: hse-hse, nirao, pukachi.
Interrupt me as much as you want to facilitate understanding :-)
High-Scale Power Systems: Simulation & Optimization
Olivier Teytaud + Inria-Tao + Artelys
TAO project-team
INRIA Saclay Île-de-France
O. Teytaud, Research Fellow, [email protected], http://www.lri.fr/~teytaud/
Ilab METIS
www.lri.fr/~teytaud/metis.html
Metis = Tao + Artelys
TAO (tao.lri.fr): Machine Learning & Optimization; joint INRIA / CNRS / Univ. Paris-Sud team
12 researchers, 17 PhDs, 3 post-docs, 3 engineers
Artelys (www.artelys.com): SME - France / US / Canada, about 50 people
==> collaboration through a common platform
Activities: optimization (uncertainties, sequential); application to power systems
Importantly, it is not a lie.
It is a tradition, in research institutes, to claim some links with industry.
I don't claim that having such links is necessary or always a great achievement in itself.
But I do claim that in my case it is true that I have links with industry:
my four students here in Taiwan, and others in France, all have real salaries based on industrial funding.
All in one slide
Consider an electric system.
Decisions = strategic decisions (a few time steps):
- building a nuclear power plant
- building a Spain-Morocco connection
- building a wind farm
tactical decisions (many time steps):
- switching on hydroelectricity plant #7
- switching on thermal plant #4
- ...
Based on simulations of the tactical level, which depends on the strategic level.
A bit more precisely:
the strategic level
Brute force approach for the strategic level:
I simulate each possible strategic decision (e.g. 20,000), 1000 times, each of them with optimal tactical decisions ==> 20,000 optimizations, 1000 simulations each.
I choose the best one.
Better: more simulations on the best strategic decisions. However, this talk will not focus on that part.
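For concreteness, a minimal Python sketch of this simulate-and-pick-best loop. Here `tactical_cost` is a hypothetical stand-in for "optimal tactical decisions on one random scenario" (a real version would run a full tactical optimization), and the scale is shrunk so the toy runs quickly (the talk's scale is 20,000 candidates x 1,000 simulations):

import random

# Hypothetical stand-in for "optimal tactical decisions on one random
# scenario"; a real version would run a full tactical optimization.
def tactical_cost(strategic_decision, rng):
    return (strategic_decision - 3.7) ** 2 + rng.gauss(0.0, 1.0)

def evaluate(strategic_decision, n_simulations=100, seed=0):
    """Average cost of one strategic decision over random scenarios.
    Reusing the same seed for every candidate (common random numbers)
    makes the comparison between candidates less noisy."""
    rng = random.Random(seed)
    total = sum(tactical_cost(strategic_decision, rng) for _ in range(n_simulations))
    return total / n_simulations

# Shrunk scale: 1,000 candidate strategic decisions, 100 simulations each.
candidates = [i * 0.01 for i in range(1000)]
best = min(candidates, key=evaluate)
print("best strategic decision:", best)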
A bit more precisely:
the tactical level
Brute force approach for the tactical level:
Simplify: replace each random process by its expectation; optimize decisions deterministically.
But reality is stochastic: water inflows, wind farms.
Better: optimizing a policy (i.e. reactive, closed-loop).
Planning/control (tactical level): pluriannual planning, evaluating the marginal costs of hydroelectricity, taking into account stochasticity and uncertainties ==> IOMCA (ANR)
High-scale investment studies (e.g. Europe + North Africa): long term (2030-2050), huge (non-stochastic) uncertainties; investments: interconnections, storage, smart grids, power plants... ==> POST (ADEME)
Moderate scale (cities, factories; the tactical level is simpler): master plan optimization, stochastic uncertainties ==> Citines project (FP7)
Specialization in Power Systems
Example: interconnection studies
(demand levelling, stabilized supply)
The POST project: supergrid simulation and optimization
European subregions:
- Case 1: electric corridor France / Spain / Morocco
- Case 2: south-west (France / Spain / Italy / Tunisia / Morocco)
- Case 3: Maghreb + Central West Europe
==> towards a European supergrid
Related ideas in Asia. Mature technology: HVDC links (high-voltage direct current).
Tactical level: unit commitment at the scale of a country: it looks like a game.
Many time steps.
Many power plants.
Some of them have stocks (hydroelectricity).
Many constraints (rules).
Uncertainties (water inflows, temperature, ...).
==> make decisions: when should I switch on? (for each power plant)
At which power?
Investment decisions through simulations
Issues:
- Demand varying in time, limited predictability
- Transportation introduces constraints
- Renewables ==> variability ++
Methods:
- Markovian assumptions ==> wrong
- Simplified models ==> model error >> optimization error
Our approach: Machine Learning on top of Mathematical Programming; hybridization of reinforcement learning and mathematical programming.
Math programming (mathematicians doing discrete-time control): nearly exact solutions for a simplified problem; high-dimensional constrained action space; but small state space & not anytime.
Reinforcement learning (artificially intelligent people doing discrete-time control :-) ): unstable; small model bias; small / simple action space; but high-dimensional state space & anytime.
Now the technical part:
Model Predictive Control, Stochastic Dynamic Programming, Direct Policy Search, and Direct Value Search (new);
Direct Value Search combines Direct Policy Search and Stochastic Dynamic Programming.
(3/4 of this talk is about the state of the art, only 1/4 our work)
[email protected]@[email protected]@inria.fr
Many optimization tools (SDP, MPC): strong constraints on forecasts, strong constraints on the model structure.
Direct Policy Search: arbitrary forecasts, arbitrary structure, but not scalable in the number of decision variables.
Merge: Direct Value Search.
Stochastic Dynamic Optimization
Classical solutions: Bellman (old & new)
- Anticipativity
- Markov chains
- Overfitting
- SDP, SDDP
Alternate solution: Direct Policy Search
- No problem with anticipativity
- Scalability issue
The best of both worlds: Direct Value Search
Stochastic Control
[Diagram: a random process feeds random values into the system; a controller with memory receives observations of the state and sends commands; the system produces a cost.]
For an optimal representation, you need access to the whole archive, or to forecasts (generative model / probabilistic forecasts) (Astrom 1965).
Anticipative solutions: maximum over strategic decisions, of the average over random processes, of optimized decisions given the random processes & strategic decisions.
Pros/cons: much simpler (deterministic optimization), but in real life you cannot guess November rains in January; rather optimistic decisions.
Anticipative solutions (variant): maximum over strategic decisions, of pessimistic forecasts (e.g. a quantile), of optimized decisions given the forecasts & strategic decisions.
Pros/cons: much simpler (deterministic optimization), but in real life you cannot guess November rains in January; not so optimistic, convenient, simple.
MODEL PREDICTIVE CONTROL
Ok, we have done one of the four targets: model predictive control.
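A minimal receding-horizon sketch of this idea, on a toy reservoir I made up (one stock, thermal backup, uniform random inflows): at each step we take a pessimistic (low-quantile) inflow forecast, solve a deterministic LP over the remaining horizon, apply only the first release, and re-plan. All names and numbers are invented for illustration:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
H, S0, RMAX = 10, 5.0, 2.0          # horizon, initial stock, max release
DEMAND = np.full(H, 1.5)            # demand per step
CTH = np.linspace(1.0, 2.0, H)      # thermal price (rises over the horizon)

def pessimistic_forecast(h, quantile=0.2):
    # Stand-in for a real forecasting module: a low quantile of the
    # inflows (true inflows are Uniform(0, 1) below).
    return np.full(h, quantile)

def plan(stock, horizon):
    """Deterministic LP over the horizon: x = [releases, thermal]."""
    inflow = pessimistic_forecast(horizon)
    c = np.concatenate([np.zeros(horizon), CTH[-horizon:]])
    # demand: release_t + thermal_t >= d_t  <=>  -r_t - t_t <= -d_t
    A1 = np.hstack([-np.eye(horizon), -np.eye(horizon)])
    b1 = -DEMAND[-horizon:]
    # water: sum_{u<=t} r_u <= stock + sum_{u<t} inflow_u
    A2 = np.hstack([np.tril(np.ones((horizon, horizon))),
                    np.zeros((horizon, horizon))])
    b2 = stock + np.concatenate([[0.0], np.cumsum(inflow)[:-1]])
    bounds = [(0.0, RMAX)] * horizon + [(0.0, None)] * horizon
    res = linprog(c, A_ub=np.vstack([A1, A2]),
                  b_ub=np.concatenate([b1, b2]), bounds=bounds)
    return res.x[0]                 # always feasible here; apply first release

stock, total_cost = S0, 0.0
for t in range(H):
    r = min(plan(stock, H - t), stock)          # receding horizon
    thermal = max(DEMAND[t] - r, 0.0)
    total_cost += CTH[t] * thermal
    stock = stock - r + rng.uniform(0.0, 1.0)   # real (random) inflow
print("MPC total thermal cost:", round(total_cost, 2))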
Markov solution
Representation as a Markov process (a tree): this is the representation of the random process. Let us see how to represent the rest.
How to solve, simple case, binary stock, one day
[Diagram: it is December 30th and I have water (FutureCost = 0). Decision "I use water" (cost = 0) leads to "no more water, December 31st"; decision "I do not use" leads to "I have water, December 31st".]
How to solve, simple case, binary stock, 3 days, no random process
[Diagram: a lattice of stock states over 3 days, with a cost on each decision-edge; backward induction fills in the cost-to-go at each node, and the cheapest path is then read off.]
This was deterministic. How to add a random process? Just multiply nodes :-)
How to solve, simple case, binary stock, 3 days, random parts
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
457346222
1
1
2
3
2
2
2
2
3
2
3
3
3
3
3
4
1
Probability 1/3Probability 2/3
Markov solution: ok, you have understood stochastic dynamic programming (Bellman)
Representation as a Markov process (a tree): this is the representation of the random process. In each node, there are the state-nodes with decision-edges.
Ok, we have done the 2nd of the four targets: stochastic dynamic programming.
Markov solution
Representation as a Markov process (a tree): optimize decisions for each state. This means you are not cheating. But difficult to use.
Might be ok for your problem?
The strategy is optimized for very specific forecasting models.
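A tabular sketch of this backward induction, on a toy close to the binary-stock example above (a few discrete stock levels, one unit of demand per day, a random inflow); the numbers are illustrative, not the ones from the diagrams:

# Tabular stochastic dynamic programming (Bellman) on a toy problem:
# discrete stock levels, 3 days, one unit of demand per day, and a
# random inflow (+1 unit of water with probability 1/3).
DAYS, LEVELS = 3, 4                 # stock in {0, 1, 2, 3}
DEMAND, CTH = 1, 2.0                # demand per day, thermal price
P_INFLOW = 1.0 / 3.0                # P(one unit of water arrives)

V = [[0.0] * LEVELS for _ in range(DAYS + 1)]     # V[t][s] = cost-to-go
policy = [[0] * LEVELS for _ in range(DAYS)]

for t in reversed(range(DAYS)):                   # backward in time
    for s in range(LEVELS):                       # for each state
        best = None
        for release in range(min(s, DEMAND) + 1): # feasible decisions
            thermal_cost = CTH * (DEMAND - release)
            # expectation over the random inflow:
            exp_future = sum(
                p * V[t + 1][min(s - release + dw, LEVELS - 1)]
                for dw, p in [(1, P_INFLOW), (0, 1.0 - P_INFLOW)]
            )
            cost = thermal_cost + exp_future
            if best is None or cost < best:
                best, policy[t][s] = cost, release
        V[t][s] = best

print("cost-to-go at t=0:", [round(v, 2) for v in V[0]])
print("policy at t=0 (release per stock level):", policy[0])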
Overfitting
Representation as a Markov process (a tree): how do you actually make decisions when the random values are not exactly those observed? (heuristics...)
Check on random realizations which have not been used for building the tree. Does it work correctly?
Overfitting = when it works only on the scenarios used in the optimization process.
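This check is just a train/test split over scenarios. A minimal sketch, where `make_scenario` and `policy_cost` are hypothetical placeholders for your scenario generator and simulator:

import random

def make_scenario(rng, length=24):
    return [rng.uniform(0.0, 1.0) for _ in range(length)]   # e.g. inflows

def policy_cost(policy, scenario):
    return sum(abs(policy(t) - w) for t, w in enumerate(scenario))

rng = random.Random(42)
train = [make_scenario(rng) for _ in range(50)]   # used for optimization
test = [make_scenario(rng) for _ in range(50)]    # never seen before

policy = lambda t: 0.5      # stand-in for the optimized controller
avg = lambda scenarios: sum(policy_cost(policy, s) for s in scenarios) / len(scenarios)
print("train cost:", round(avg(train), 2), " test cost:", round(avg(test), 2))
# A large gap between the two numbers is the symptom of overfitting.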
SDP / SDDP
Stochastic (Dual) Dynamic Programming
Representation of the controller with Linear Programming (the value function as a piecewise-linear function): ok for 100,000 decision variables per time step (tens of time steps, hundreds of plants, several decisions each),
but solving by expensive SDP/SDDP (curse of dimensionality, exponential in the number of state variables).
Constraints:
- Needs an LP approximation: ok for you?
- SDDP requires convex Bellman values: ok for you?
- Needs Markov random processes: ok for you? (possibly after some random process extension...)
Goal: keep the scalability, but get rid of the SDP/SDDP solving.
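To make "value function as piecewise linear" concrete, here is a sketch of a single stage: the future cost is a maximum of affine cuts, so the one-stage decision is a small LP (solved with scipy below). The cuts and prices are invented for illustration; a real SDDP code generates the cuts in its backward passes:

import numpy as np
from scipy.optimize import linprog

# Future cost V(s') = max_k (a_k * s' + b_k), a convex piecewise-linear
# function given by Benders-like cuts (slopes a_k, intercepts b_k).
CUTS = [(-2.0, 10.0), (-0.5, 4.0), (0.0, 2.0)]
DEMAND, CTH = 1.5, 3.0              # demand, thermal price (invented)

def one_stage_decision(stock, inflow):
    # variables x = [release, thermal, theta]; minimize CTH*thermal + theta
    c = np.array([0.0, CTH, 1.0])
    A_ub, b_ub = [], []
    A_ub.append([-1.0, -1.0, 0.0])                # release + thermal >= demand
    b_ub.append(-DEMAND)
    for a, b in CUTS:       # theta >= a*(stock - release + inflow) + b
        A_ub.append([-a, 0.0, -1.0])
        b_ub.append(-a * (stock + inflow) - b)
    bounds = [(0.0, stock), (0.0, None), (None, None)]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x

release, thermal, theta = one_stage_decision(stock=2.0, inflow=0.3)
print(f"release={release:.2f} thermal={thermal:.2f} future-cost estimate={theta:.2f}")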
Summary
The most classical solution = SDP and variants,
or MPC (model-predictive control), replacing the stochastic parts by deterministic pessimistic forecasts.
The statistical model is cast into a tree model & the (probabilistic) forecasting modules are essentially lost.
Direct Policy Search
Requires a parametric controller.
Principle: optimize the parameters on simulations.
Unusual in large-scale power systems (we will see why).
Usual in other areas (finance, evolutionary robotics).
Stochastic Control
[Diagram: the same control loop as before: the random process feeds the system; the controller with memory maps state and forecasts to commands; the system outputs a cost.]
Optimize the controller thanks to a simulator:
Command = Controller(w, state, forecasts)
Simulate(w) = stochastic loss with parameter w
w* = argmin [Simulate(w)]
Ok, we have done the 3rd of the four targets: direct policy search.
Direct Policy Search (DPS)
Requires a parametric controller, e.g. a neural network:
Controller(w, x) = W3 + W2.tanh(W1.x + W0)
Noisy black-box optimization.
Advantages: non-linear is ok, forecasts are included.
Issue: too slow; hundreds of parameters for even 20 decision variables (depends on the structure).
Idea: a special structure for DPS (inspired from SDP).
The strategy is optimized given the real forecasting module you have (forecasts are inputs).
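A self-contained sketch of plain DPS, on a made-up reservoir simulator: the controller is exactly the small tanh network above, and its weights are optimized by a simple (1+1)-evolution strategy on averaged simulated costs (common random seeds reduce the noise):

import numpy as np

rng = np.random.default_rng(0)
H, HID = 20, 8                      # horizon, hidden units
DIM = HID * 4 + 1                   # W1 (HID x 2), W0, W2, W3

def controller(w, x):
    # Controller(w, x) = W3 + W2.tanh(W1.x + W0), inputs x = [stock, t/H]
    W1 = w[:HID * 2].reshape(HID, 2); W0 = w[HID * 2:HID * 3]
    W2 = w[HID * 3:HID * 4]; W3 = w[HID * 4]
    return W3 + W2 @ np.tanh(W1 @ x + W0)

def simulate(w, seed):
    # Toy reservoir: release covers demand, thermal fills the rest.
    r = np.random.default_rng(seed)
    stock, cost = 2.0, 0.0
    for t in range(H):
        release = np.clip(controller(w, np.array([stock, t / H])), 0.0, stock)
        cost += 3.0 * max(1.0 - release, 0.0)
        stock = min(stock - release + r.uniform(0.0, 1.0), 4.0)
    return cost

# (1+1)-ES with a 1/5th-style success rule on the noisy objective.
w, sigma = np.zeros(DIM), 0.5
f = np.mean([simulate(w, s) for s in range(20)])
for it in range(200):
    cand = w + sigma * rng.standard_normal(DIM)
    fc = np.mean([simulate(cand, s) for s in range(20)])   # same seeds
    if fc < f:
        w, f, sigma = cand, fc, sigma * 1.5
    else:
        sigma *= 0.9
print("best simulated cost:", round(f, 2))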
Direct Value Search
SDP representation in DPS:
Controller(state) = argmin Cost(decision) + V(nextState)
V(nextState) = alpha x nextState
alpha = NeuralNetwork(w, state) (or a more sophisticated LP)
==> given w, decision making is solved as an LP
==> a non-linear mapping (itself not an LP) chooses the parameters of the LP from the current state
Drawback: requires the optimization of w (= a noisy black-box optimization problem).
Summary: the best of both worlds
Controller(w, state): V(w, state, .) is non-linear; optimizing Cost(decision) + V(w, state, nextState) is an LP.
Simul(w): do a simulation with w; return the cost.
DirectValueSearch: optimize w* = argmin Simul(w); return the Controller with w*.
The structure of the controller (fast, scalable by structure); a simulator (you can put anything you want in it, even if it is not linear, nothing Markovian...); the optimization (will do its best, given the simulator and the structure).
3 optimizers:
- SAES
- Fabian: gradient descent with redundant finite differences
- a Newton version
Ok, we have done the 4th of the four targets: direct value search.
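A sketch of the whole Direct Value Search loop, under toy assumptions (one stock, invented prices and inflows): a small network maps the state to alpha, the decision is the LP argmin of Cost(decision) + alpha x nextState, and w is tuned by a (1+1)-ES standing in for SAES/Fabian:

import numpy as np
from scipy.optimize import linprog

H, HID, DEMAND, CTH = 20, 6, 1.0, 3.0
DIM = HID * 4 + 1                   # W1 (HID x 2), W0, W2, W3

def alpha_net(w, state):            # alpha = NeuralNetwork(w, state)
    W1 = w[:HID * 2].reshape(HID, 2); W0 = w[HID * 2:HID * 3]
    W2 = w[HID * 3:HID * 4]; W3 = w[HID * 4]
    return float(W3 + W2 @ np.tanh(W1 @ state + W0))

def decide(alpha, stock, inflow):
    # x = [release, thermal]: min CTH*thermal + alpha*(stock - release + inflow);
    # the constant term alpha*(stock + inflow) is dropped from the objective.
    c = np.array([-alpha, CTH])
    A_ub = np.array([[-1.0, -1.0]]); b_ub = np.array([-DEMAND])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, stock), (0.0, None)])
    return res.x

def simulate(w, seed):
    r, stock, cost = np.random.default_rng(seed), 2.0, 0.0
    for t in range(H):
        inflow = r.uniform(0.0, 0.8)
        a = alpha_net(w, np.array([stock, t / H]))
        release, thermal = decide(a, stock, inflow)
        cost += CTH * thermal
        stock = min(stock - release + inflow, 4.0)
    return cost

rng = np.random.default_rng(1)
w, sigma = np.zeros(DIM), 0.3
best = np.mean([simulate(w, s) for s in range(10)])
for it in range(60):                # (1+1)-ES on the noisy objective
    cand = w + sigma * rng.standard_normal(DIM)
    fc = np.mean([simulate(cand, s) for s in range(10)])
    if fc < best:
        w, best, sigma = cand, fc, sigma * 1.5
    else:
        sigma *= 0.9
print("Direct Value Search, simulated cost:", round(best, 2))

Note how the scalability comes from the structure: the hard, high-dimensional part (the decision) is an LP, while the black-box optimizer only touches the low-dimensional w.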
State of the art in discrete-time control, a few tools:
Model Predictive Control: for making a decision in a given state, (i) do forecasts, (ii) replace the random processes by pessimistic forecasts, (iii) optimize as if the problem were deterministic.
Stochastic Dynamic Programming: Markov model; compute the cost-to-go backwards.
Direct Policy Search: parametric controller, optimized on simulations.
Conclusion
Still rather preliminary (less tested than MPC or SDDP) but promising:
- Forecasts naturally included in the optimization
- Anytime algorithm (the user immediately gets approximate results)
- No convexity constraints
- Room for detailed simulations (e.g. with a very small time scale, for volatility)
- No random process constraints (no Markov assumption)
- Can handle large state spaces (as DPS)
- Can handle large action spaces (as SDP)
==> can work on the real problem, without casting it into a simplified model
Bibliography
- Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. D. Bertsekas, 2005. (MPC = deterministic forecasts)
- Astrom, 1965.
- Renewable energy forecasts ought to be probabilistic! P. Pinson, 2013. (WIPFOR talk)
- Training a neural network with a financial criterion rather than a prediction criterion. Y. Bengio, 1997. (quite a practical application of direct policy search, convincing experiments)
Questions?
Appendix
Representation of the controller:
decision(currentState) = argmin Cost(decision) + Bellman(nextState)
Linear programming (LP) if: for a given current state, nextState = LP(decision), and Cost(decision) = LP(decision)
100,000 decision variables per time step