Electricity consumption forecasting with adaptive GAM
by Yannig Goude (EDF)
GAM models for Day-Ahead and Intra-Day Electricity Consumption Forecasts
[Figure: Temperature Effect. Surface plot with axes week.temp, week.ind and z.]
Yannig Goude
EDF R&D - Clamart
( EDF R&D - Clamart) March 15, 2012 1 / 24
Motivation of Electricity Load Forecasting
Electricity cannot be stored, so electricity consumption must be forecast:
to avoid blackouts on the grid
to avoid financial penalties
to optimize the management of production units and electricity trading
Managing a wide variety of production units:
nuclear plants
fuel, coal and gas plants
renewable energy: water dams, wind farms, solar panels...
Short-term load forecasting: horizons from a few hours to 1 day
(Generalized) Additive smooth models
Consider a univariate response $y$ and corresponding predictors $x_1, \dots, x_p$.
An additive smooth model has the following structure:
$$y_i = X_i\beta + f_1(x_{1,i}) + f_2(x_{2,i}) + f_3(x_{3,i}, x_{4,i}) + \dots + \varepsilon_i$$
$X_i\beta$ is the linear part of the model
the functions $f_j$ are assumed to be smooth
$\varepsilon_i$: i.i.d., $E(\varepsilon_i) = 0$, $V(\varepsilon_i) = \sigma^2$
normality if needed (tests...)
More precisely, we want to solve the following problem:
$$\min_{\beta, f_j} \|y - X\beta - f_1(x_1) - f_2(x_2) - \dots\|^2 + \lambda_1 \int f_1''(x)^2\,dx + \lambda_2 \int f_2''(x)^2\,dx + \dots$$
Estimation of the $f_j$: basis expansion in a spline basis
$$f_j(x) = \sum_{q=1}^{k_j} a_{j,q}(x)\,\beta_{j,q}$$
Then the additive model becomes
$$y_i = X_i\beta + \sum_{q=1}^{k_1} a_{1,q}(x_{1,i})\,\beta_{1,q} + \sum_{q=1}^{k_2} a_{2,q}(x_{2,i})\,\beta_{2,q} + \dots + \varepsilon_i$$
Unknowns:
choice of the spline basis, number and position of the knots ($k_j$)
$\beta$ and $a_{j,q}$
⇒ take large $k_j$ and proceed to penalized regression (ridge)
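The basis expansion can be illustrated with a small sketch. It assumes a piecewise-linear ("tent", degree-1 B-spline) basis on evenly spaced knots, chosen only because it is short to write; the slides do not say which basis is used, and `tent_basis` is a hypothetical helper, not an mgcv function.

```python
import numpy as np

def tent_basis(x, knots):
    """Return the (n, k) matrix whose column q holds a_q(x) for a tent basis.

    Assumption: knots are evenly spaced; a_q equals 1 on knot q and
    falls linearly to 0 at the two neighbouring knots.
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

x = np.linspace(0.0, 1.0, 200)
knots = np.linspace(0.0, 1.0, 20)       # k_j = 20 basis functions
A = tent_basis(x, knots)

# With this particular basis the coefficients are the values of f at the knots.
beta = np.sin(2.0 * np.pi * knots)
f_hat = A @ beta                         # f(x) = sum_q a_q(x) * beta_q
max_err = float(np.max(np.abs(f_hat - np.sin(2.0 * np.pi * x))))
```

Once the columns of `A` are appended to the design matrix, the smooth term is linear in its coefficients, which is what makes the penalized regression step possible.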
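Then the initial problem
$$\min_{\beta, f_j} \|y - X\beta - f_1(x_1) - f_2(x_2) - \dots\|^2 + \lambda_1 \int f_1''(x)^2\,dx + \lambda_2 \int f_2''(x)^2\,dx + \dots$$
becomes a penalized linear regression problem:
$$\min_{\beta} \|y - X\beta\|^2 + \sum_j \lambda_j \beta^T S_j \beta$$
as $\int f_j''(x)^2\,dx$ can be written as $\beta^T S_j \beta$,
absorbing $a_{j,q}(x_i)$ into $X_i$
Solution:
$$\hat\beta_\lambda = \Big(X^T X + \sum_j \lambda_j S_j\Big)^{-1} X^T y$$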
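A minimal sketch of this closed-form solution for a single smooth term follows. The simulated data, the tent basis and the penalty matrix are all assumptions made for illustration: in particular `S` is built from second differences of the coefficients, a discrete stand-in for the integrated squared second derivative rather than the exact spline penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 300, 25
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, n)   # simulated data

# Design matrix: tent basis on k evenly spaced knots (an assumption).
knots = np.linspace(0.0, 1.0, k)
h = knots[1] - knots[0]
X = np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

# Penalty matrix S = D^T D, with D the (k-2, k) second-difference operator.
D = np.diff(np.eye(k), n=2, axis=0)
S = D.T @ D

def penalized_fit(X, y, S, lam):
    """Solve (X^T X + lam * S) beta = X^T y for one smoothing parameter."""
    return np.linalg.solve(X.T @ X + lam * S, X.T @ y)

def roughness(beta):
    return float(beta @ S @ beta)

def rss(beta):
    return float(np.sum((y - X @ beta) ** 2))

beta_wiggly = penalized_fit(X, y, S, lam=1e-3)
beta_smooth = penalized_fit(X, y, S, lam=1e3)
```

Increasing `lam` trades fit for smoothness: `roughness` goes down while `rss` goes up.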
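How to choose the penalization parameter $\lambda$?
without any penalization: $\hat\beta_0 = (X^T X)^{-1} X^T y$
regularized: $\hat\beta_\lambda = (X^T X + \sum_j \lambda_j S_j)^{-1} X^T y$
$$\hat\beta_\lambda = F_\lambda \hat\beta_0, \quad \text{where } F_\lambda = \Big(X^T X + \sum_j \lambda_j S_j\Big)^{-1} X^T X$$
$\mathrm{tr}(F_\lambda)$: estimated degrees of freedom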
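To see how the trace behaves, the sketch below computes it for a few values of a single smoothing parameter. The tent basis and the second-difference penalty are illustrative assumptions, not taken from the slides. With this penalty the trace runs from the number of coefficients (no penalization) down to 2, the dimension of the unpenalized linear part.

```python
import numpy as np

n, k = 300, 25
x = np.linspace(0.0, 1.0, n)
knots = np.linspace(0.0, 1.0, k)
h = knots[1] - knots[0]
X = np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)
D = np.diff(np.eye(k), n=2, axis=0)      # second-difference operator
S = D.T @ D

def edf(X, S, lam):
    """tr(F_lam) with F_lam = (X^T X + lam * S)^{-1} X^T X."""
    XtX = X.T @ X
    return float(np.trace(np.linalg.solve(XtX + lam * S, XtX)))

lams = [0.0, 1e-2, 1.0, 1e2, 1e8]
edfs = [edf(X, S, lam) for lam in lams]
```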
Ordinary Cross Validation
leave out one observation $y_i$
estimate a model $\hat\mu^{-i}$ on the remaining data
forecast $y_i$ with $\hat\mu_i^{-i}$
do that for all $i$
choose the $\lambda$ that minimizes the OCV score:
$$V_0(\lambda) = \frac{1}{n} \sum_{i=1}^{n} \big(y_i - \hat\mu_i^{-i}\big)^2$$
⇒ Problem: calculation time
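The leave-one-out procedure can be written as a brute-force loop, which makes the cost visible: one refit per observation and per candidate value of the smoothing parameter. This is a sketch on simulated data with an assumed tent basis and second-difference penalty; `ocv_score` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 60, 10
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, n)   # simulated data
knots = np.linspace(0.0, 1.0, k)
h = knots[1] - knots[0]
X = np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)
D = np.diff(np.eye(k), n=2, axis=0)
S = D.T @ D

def ocv_score(X, y, S, lam):
    """Brute-force OCV: refit n times, each time leaving one observation out."""
    n = len(y)
    sq_err = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta = np.linalg.solve(Xi.T @ Xi + lam * S, Xi.T @ yi)
        sq_err[i] = (y[i] - X[i] @ beta) ** 2
    return float(sq_err.mean())

lams = 10.0 ** np.arange(-3, 4)
scores = [ocv_score(X, y, S, lam) for lam in lams]
lam_ocv = float(lams[int(np.argmin(scores))])
```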
Generalized Cross Validation [Craven and Wahba (1979)]
$$V_g(\lambda) = \frac{n\,\|y - X\hat\beta_\lambda\|^2}{(n - \mathrm{tr}(F_\lambda))^2}$$
Advantages of GCV:
$\lambda$ is obtained by numerical minimization of $V_g$ (low computational cost)
$V_g(\lambda)$ is invariant under useful transformations of the data (on-line update, big data)
⇒ Software: R, package mgcv (see [Wood (2001)] and [Wood (2006)])
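As an illustration of the GCV criterion, the sketch below evaluates it on a grid of values for a single smoothing parameter and keeps the minimizer. The grid search is a simplification chosen for brevity; it is not how mgcv performs the minimization. Data, basis and penalty are the same illustrative assumptions as before.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 300, 25
x = np.linspace(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, n)   # simulated data
knots = np.linspace(0.0, 1.0, k)
h = knots[1] - knots[0]
X = np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)
D = np.diff(np.eye(k), n=2, axis=0)
S = D.T @ D

def gcv_score(X, y, S, lam):
    """V_g(lam) = n * ||y - X beta_lam||^2 / (n - tr(F_lam))^2."""
    n = len(y)
    XtX = X.T @ X
    M = XtX + lam * S
    beta = np.linalg.solve(M, X.T @ y)
    edf = np.trace(np.linalg.solve(M, XtX))
    rss = np.sum((y - X @ beta) ** 2)
    return float(n * rss / (n - edf) ** 2)

lams = 10.0 ** np.linspace(-4, 8, 49)
scores = np.array([gcv_score(X, y, S, lam) for lam in lams])
lam_gcv = float(lams[int(np.argmin(scores))])
```

Each evaluation needs only one linear solve, in contrast with the n refits of ordinary cross validation.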
From GAM to BAM
BAM: Big Additive Models
⇒ for huge data sets (more than 10 000 observations) we use the QR decomposition:
$X = QR$, $f = Q^T y$, and denote $\|r\|^2 = \|y\|^2 - \|f\|^2$
$Q$ orthogonal (orthonormal columns), $R$ upper triangular
then we have:
$$V_g(\lambda) = \frac{n\,\big(\|f - R\hat\beta_\lambda\|^2 + \|r\|^2\big)}{(n - \mathrm{tr}(F_\lambda))^2}$$
where $F_\lambda$ is now $(R^T R + \sum_j \lambda_j S_j)^{-1} R^T R$
⇒ Once we have $R$, $f$ and $\|r\|^2$, $X$ plays no further part
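A quick numerical check of this reduction, as a sketch: the GCV score computed from the full design matrix should match the one computed from R, f and the squared norm of r alone. The random design matrix, the penalty and the value of `lam` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 500, 12
X = rng.normal(size=(n, k))                       # hypothetical design matrix
y = X @ rng.normal(size=k) + rng.normal(size=n)
D = np.diff(np.eye(k), n=2, axis=0)
S = D.T @ D
lam = 5.0

# Reduced QR: Q is (n, k) with orthonormal columns, R is (k, k) upper triangular.
Q, R = np.linalg.qr(X)
f = Q.T @ y
r2 = float(y @ y - f @ f)                         # ||r||^2 = ||y||^2 - ||f||^2

def gcv_from_X(X, y, S, lam):
    n = len(y)
    XtX = X.T @ X
    M = XtX + lam * S
    beta = np.linalg.solve(M, X.T @ y)
    edf = np.trace(np.linalg.solve(M, XtX))
    return float(n * np.sum((y - X @ beta) ** 2) / (n - edf) ** 2)

def gcv_from_R(R, f, r2, n, S, lam):
    """Same score using only the k x k factor R, the k-vector f and ||r||^2."""
    RtR = R.T @ R
    M = RtR + lam * S
    beta = np.linalg.solve(M, R.T @ f)
    edf = np.trace(np.linalg.solve(M, RtR))
    rss = np.sum((f - R @ beta) ** 2) + r2
    return float(n * rss / (n - edf) ** 2)

v_full = gcv_from_X(X, y, S, lam)
v_reduced = gcv_from_R(R, f, r2, n, S, lam)
```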
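⇒ Application for large data sets:
$X$ is too big and has to be split: $X = \begin{pmatrix} X_0 \\ X_1 \end{pmatrix}$, similarly $y = \begin{pmatrix} y_0 \\ y_1 \end{pmatrix}$
form the QR decompositions $X_0 = Q_0 R_0$ and $\begin{pmatrix} R_0 \\ X_1 \end{pmatrix} = Q_1 R$; see section 12.5 of [Golub and Van Loan (1996)]
then $X = QR$ with $Q = \begin{pmatrix} Q_0 & 0 \\ 0 & I \end{pmatrix} Q_1$ and $Q^T y = Q_1^T \begin{pmatrix} Q_0^T y_0 \\ y_1 \end{pmatrix}$
⇒ On-line update
$X_0, y_0$: past data; $X_1, y_1$: last observations
Use the new data $X_1, y_1$ to update $R$, $f$ and $\|r\|^2$
Re-estimate $\lambda$ and $\hat\beta_\lambda$ (previous values can be used as starting values for the numerical optimization)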
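The blockwise update can be sketched as follows. The data are random placeholders and `qr_init` / `qr_update` are hypothetical helper names. The point is that the new block is folded into R, f and the squared norm of r without touching the past rows again. The check compares against one QR decomposition of the stacked data through the products R^T R and R^T f, since R itself is only unique up to the signs of its rows.

```python
import numpy as np

rng = np.random.default_rng(4)
k = 8
X0, y0 = rng.normal(size=(400, k)), rng.normal(size=400)   # past data
X1, y1 = rng.normal(size=(48, k)), rng.normal(size=48)     # last observations

def qr_init(X, y):
    """Return (R, f, ||r||^2) from a reduced QR decomposition of X."""
    Q, R = np.linalg.qr(X)
    f = Q.T @ y
    return R, f, float(y @ y - f @ f)

def qr_update(R, f, r2, X_new, y_new):
    """Fold a new block of rows into (R, f, ||r||^2) without the old data."""
    Q1, R_upd = np.linalg.qr(np.vstack([R, X_new]))
    stacked = np.concatenate([f, y_new])
    f_upd = Q1.T @ stacked
    r2_upd = r2 + float(stacked @ stacked - f_upd @ f_upd)
    return R_upd, f_upd, r2_upd

R, f, r2 = qr_update(*qr_init(X0, y0), X1, y1)

# Reference: one QR decomposition of the full data set.
X, y = np.vstack([X0, X1]), np.concatenate([y0, y1])
R_full, f_full, r2_full = qr_init(X, y)
```

In an on-line setting `qr_update` would be called once per new batch (for instance one day of half-hourly observations), followed by re-estimation of the smoothing parameters and coefficients.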
Application to Electricity Load Data: Electricity Data
[Figure: Trend. Load series from September 2002 to August 2009.]
[Figure: Yearly Pattern. Load series from 1 January to 31 December 2006.]
[Figure: Weekly Pattern. Load series from 1 to 30 June 2006.]
[Figure: Daily Pattern. Load against Instant, one curve per day of the week (Mo, Tu, We, Th, Fr, Sa, Su).]
[Figure: Special Days. Load (MW) against Instant for Normal and Special Tariff days; load series from 20 December 2007 to 4 January 2008.]
[Figure: Load-Temperature.]
[Figure: Load-Cloud Cover. Two panels against Instant: Load (MW) and Cloud cover (oktas).]
Application to Electricity Load Data: Model
$$L_t = f_1(L_{t-48}, I_t)\,\mathbb{I}_{HH} + f_2(L_{t-48}, I_t)\,\mathbb{I}_{HW} + f_3(L_{t-48}, I_t)\,\mathbb{I}_{WH} + f_4(L_{t-48}, I_t)\,\mathbb{I}_{WW} + g_1(T_t, I_t) + g_2(T_{t-48}, T_{t-96}) + g_3(\mathrm{Cloud}_t) + h(\mathrm{Toy}_t, I_t) + \sum_{i=1}^{48} \gamma_i\,\mathrm{Spec.Tarif}_t + \sum_{j=1}^{11} \alpha_j + s(t) + \varepsilon_t$$
$f_j$: lagged load effects
$g_j$: meteorological effects
$h$: yearly pattern; Toy is the time of year
$\gamma_i$: special tariff effect by half-hour of the day
$\alpha_j$: mean load for Sunday, Monday, Tuesday, ..., Saturday, and for HH, HW, WH and WW days
$s(t)$ is the trend
Estimation period: September 2002 - August 2008
Forecasting period: September 2008 - August 2009
GAM Model
[Figure, four panels: Temperature Effect (surface of L[t] over I[t] and T[t]); Load (MW) against Hour by day of the week; Yearly Cycle (surface over Posan and Instant); Trend (against t).]
[Figure, four panels: Lagged Load Effect for WW, WH, HW and HH days, each a surface of L[t] over I[t] and L[t − 1].]
Figure: Top: half-hourly RMSE in MW (left) and MAPE in % (right) against Instant, by type of day (Mo, Tu, We, Th, Fr, Sa, Su). Bottom: residuals.
Performances
Model          RMSE (MW)   MAPE (%)   RGCV score
Estimation set
m0                   831       1.17          882
m1                  1024       1.46          806
Forecasting set
m0                  1220       1.87
On-line m0          1048       1.49
m1                  1156       1.62
On-line m1          1109       1.53
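For reference, the two error measures in the table can be computed as below. This sketch assumes the usual definitions (root mean squared error, and mean absolute percentage error relative to the observed load); the slides do not spell them out, and the numbers in the example are made up.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error, in the units of y (MW for load)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent; assumes y has no zeros."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

load = [50000.0, 60000.0]        # hypothetical observed load (MW)
forecast = [51000.0, 58800.0]    # hypothetical forecasts (MW)
err_rmse = rmse(load, forecast)
err_mape = mape(load, forecast)
```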
Residuals
[Plot over the forecasting period, September 2008 to August 2009; legend: m0, m1, on-line update.]
Figure: Cumulative residuals (right) for models m0 (black), m1 (red), and their on-line updated version (dashed lines).
References
Craven and Wahba (1979) "Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation". Numerische Mathematik 31, 377-403.
Golub and Van Loan (1996) "Matrix Computations, 3rd edition". Johns Hopkins Studies in the Mathematical Sciences.
Green and Silverman (1994) "Nonparametric Regression and Generalized Linear Models". Chapman and Hall.
Hastie and Tibshirani (1990) "Generalized Additive Models". Chapman and Hall.
Pierrot and Goude (2011) "Short-Term Electricity Load Forecasting With Generalized Additive Models". Proceedings of ISAP Power 2011.
Wahba (1990) "Spline Models for Observational Data". SIAM.
Wood (2001) "mgcv: GAMs and Generalized Ridge Regression for R". R News 1(2), 20-25.
Wood and Augustin (2002) "GAMs with integrated model selection using penalized regression splines and applications to environmental modelling". Ecological Modelling 157, 157-177.
Wood (2006) "Generalized Additive Models: An Introduction with R". Chapman and Hall.