# Nina Kondrashova, Andriy Pavlov, Yaroslav Pavlov


Sample division with sliding interval as a method of increase of accuracy for time series forecasting. Nina Kondrashova, Andriy Pavlov, Yaroslav Pavlov. International Research and Training Center of Information Technologies and Systems of NASU, 03680, Kiev, av. Glushkova, 40, Ukraine


National technical university of Ukraine «Kiev polytechnic institute», department of informatics and computing technique

1 Introduction

2 Problem statement

3 Analysis of the results

3.1 Comparison of forecasts obtained by different well-known methods

4 Algorithm of modeling for sliding window with optimization of division (AMSWOD)

4.1 Comparison of forecasts obtained by the best well-known method and AMSWOD

Conclusion

The purpose is to build forecasts of petroleum prices 3 months ahead (March, April and May of 2008) using the price data $y_k$, $k = 1, 2, \ldots, N$, for the period 1999-2008. This requires constructing models and obtaining the forecast values $\hat{y}_{k+i}$, $i = 1, \ldots, n_D$, of the time series by one of the most widespread methods or by a new method. We needed:

1) To analyze the forecasts obtained by well-known methods and to choose the best time series forecasting method;

2) To check the criteria for choosing the best models on an examination subsample;

3) To create a new method for increasing the accuracy of forecasting;

4) To obtain new forecasts for the Brent time series by the best methods.

Forecasts of the time series are built with known GMDH algorithms and, for comparison, with AutoRegressive Integrated Moving Average (ARIMA) models of the Box-Jenkins method and with exponential smoothing models.

The Holt-Winters method, Holt's method and smoothing with exponential trend were applied for exponential smoothing. The STATISTICA software package was used to build forecasts by the exponential smoothing methods and by the Box-Jenkins method. For the combinatorial GMDH algorithm, a COMBI program for Windows was written; it offers additional capabilities compared with the existing COMBI program of the ASTRID software package. An improved version of Sheludko's GMDH algorithm was also used: a multistage algorithm with combinatorial selection and orthogonalization of variables (MACSO).

A model is not built on the complete sample W: the sample is first shortened by the $n_D$ examination records.

Fig. 1. Time series of Brent petroleum prices: real data (monthly averages) and the trend obtained by one of the GMDH algorithms.

For the Box-Jenkins and exponential smoothing methods, the initial data are divided into a training subsample U and an examination subsample D. For the GMDH algorithms, the initial sample W is divided into 3 parts: the learning subsample A, the testing subsample B and the examination subsample D.
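The division described above can be sketched in a few lines of Python. This is only an illustration: it assumes that A, B and D are consecutive blocks in time with D at the end of the series, and the function name `split_sample` and its parameters are hypothetical, not taken from the COMBI or MACSO programs.

```python
def split_sample(series, n_b, n_d):
    """Split a time series chronologically into A (learning), B (testing)
    and D (examination) subsamples.

    Assumption: the three parts are consecutive blocks, with D holding the
    last n_d records. The training subsample U used by the Box-Jenkins and
    exponential smoothing methods is then simply A + B.
    """
    if n_b < 0 or n_d < 0 or n_b + n_d >= len(series):
        raise ValueError("subsample lengths do not fit the series")
    n_a = len(series) - n_b - n_d
    a = series[:n_a]
    b = series[n_a:n_a + n_b]
    d = series[n_a + n_b:]
    return a, b, d
```

For example, `split_sample(prices, n_b=12, n_d=3)` would hold back the last 3 records for examination and the 12 records before them for testing.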

Fig. 2. Schemes of sample division: a) and b) one-stage division; c) series-parallel multistage division.


The forecast model is represented as the sum of two models: the trend model (1) and the remainder model (3).

First, the trend $v(k)$ is obtained as a function of $k$. For COMBI the trend $v(k)$ is a power series in the time parameter $k$. For MACSO the trend $v(k)$ has the form

$$v(k) = f_k^{T}\,\theta, \quad (1)$$

where $\theta$ is a vector of model parameters and $f_k$ is a vector of functional transformations whose elements $f_i(k)$ belong to the set

$$f_i(k) \in \left\{1,\ k,\ k^2,\ k^3,\ \frac{1}{k},\ \sqrt[3]{k},\ \ldots\right\}. \quad (2)$$

Second, another part of the model, linear in parameters, is sought for the remainder $\Delta_k = y_k - v(k)$, taking into account the effect of autocorrelation:

$$\hat{\Delta}_{k+i} = \varphi(\Delta_k, \Delta_{k-1}, \Delta_{k-2}, \Delta_{k-3}, \tilde{\theta}), \quad i = 1, \ldots, n_D. \quad (3)$$

The vectors $\theta$ and $\tilde{\theta}$ are determined by the method of least squares (MLS) on the learning subsample A, and the model structure is selected by the regularity criterion on the testing subsample B.
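The two roles of the subsamples (parameters estimated by least squares on A, structure chosen by the error on B) can be illustrated with a deliberately simplified sketch. It is not the COMBI or MACSO algorithm: each candidate trend here has only an intercept and a single transformation of $k$, the list of transformations is an illustrative assumption, and the regularity criterion is taken as the plain sum of squared errors on B. The names `TRANSFORMS`, `fit_line` and `select_trend` are hypothetical.

```python
# Candidate transformations of the time index k (illustrative choice, k >= 1).
TRANSFORMS = {
    "k": lambda k: float(k),
    "k^2": lambda k: float(k) ** 2,
    "k^3": lambda k: float(k) ** 3,
    "1/k": lambda k: 1.0 / k,
    "cbrt(k)": lambda k: float(k) ** (1.0 / 3.0),
}


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def select_trend(ks_a, ys_a, ks_b, ys_b):
    """Fit every candidate on A and keep the one with the smallest error on B.

    Returns (error_on_B, transform_name, a, b) for the selected structure.
    """
    best = None
    for name, g in TRANSFORMS.items():
        a, b = fit_line([g(k) for k in ks_a], ys_a)
        err_b = sum((y - (a + b * g(k))) ** 2 for k, y in zip(ks_b, ys_b))
        if best is None or err_b < best[0]:
            best = (err_b, name, a, b)
    return best
```

Because the parameters never see the B records, a structure that merely overfits A is penalized when it is scored on B, which is the point of selecting by an external criterion.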

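All GMDH algorithms (COMBI, MACSO and AMSWOD) constructed models both with and without a recursive forecast. In a nonrecursive forecast, a separate model is built for every forecast step:

$$\hat{x}_{k+1} = \varphi_1(x_k, x_{k-1}, \ldots, \theta_1),\quad \hat{x}_{k+2} = \varphi_2(x_k, x_{k-1}, x_{k-2}, \ldots, \theta_2),\ \ldots,\ \hat{x}_{k+n_D} = \varphi_{n_D}(x_k, x_{k-1}, \ldots, x_{k-r}, \theta_{n_D}). \quad (4)$$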

In a recursive forecast, every value obtained one step ahead takes part in the further forecast, and the model is recalculated taking these previous forecast values into account:

$$\hat{x}_{k+1} = \varphi_1(x_k, x_{k-1}, \ldots, \theta_1),\quad \hat{x}_{k+2} = \varphi_2(\hat{x}_{k+1}, x_k, x_{k-1}, x_{k-2}, \ldots, \theta_2),\ \ldots,\ \hat{x}_{k+n_D} = \varphi_{n_D}(\hat{x}_{k+n_D-1}, \ldots, \hat{x}_{k+1}, x_k, x_{k-1}, \ldots, x_{k-r}, \theta_{n_D}). \quad (5)$$

Thus, at every step of recursive forecasting the model is actually built for one step ahead, and the multi-step forecast results from chain substitution of the forecast values.
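The difference between the two schemes can be sketched with the simplest possible model, a first-order autoregression without intercept. This is an illustration only: the GMDH algorithms select richer structures, and the reading of "recalculated" as "refitted on the history extended with the forecasts" is an assumption. The function names are hypothetical.

```python
def fit_ar1(series, horizon=1):
    """Least-squares slope theta in x[t + horizon] ~ theta * x[t]."""
    pairs = list(zip(series[:-horizon], series[horizon:]))
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs)
    return num / den


def direct_forecast(series, steps):
    """Nonrecursive scheme: a separate model is fitted for each step ahead."""
    last = series[-1]
    return [fit_ar1(series, h) * last for h in range(1, steps + 1)]


def recursive_forecast(series, steps):
    """Recursive scheme: a one-step model is refitted on the history
    extended with the forecasts obtained so far (chain substitution)."""
    history = list(series)
    forecasts = []
    for _ in range(steps):
        theta = fit_ar1(history, 1)
        nxt = theta * history[-1]
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts
```

In the nonrecursive scheme the errors of earlier steps do not propagate, but each horizon needs its own model; in the recursive scheme only one-step models are needed, at the cost of feeding forecast values back as inputs.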


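To check the accuracy of forecasting, the following criteria were chosen:

FE criterion: the ratio of the mean absolute deviation of the model to the spread of the examination sample values;

NMSE criterion: the normalized mean-square error for the best models on the examination subsample.

$$FE = \sum_{i=1}^{n_D} \frac{\left|y_{k+i} - \hat{y}_{k+i}\right|}{(y_{\max} - y_{\min})\, n_D}, \qquad NMSE = \sqrt{\frac{\sum_{i=1}^{n_D} (y_{k+i} - \hat{y}_{k+i})^2}{\sum_{i=1}^{n_D} (y_{k+i} - \bar{y})^2}}; \quad (6)$$

where $y_{k+i}$ and $\hat{y}_{k+i}$ are the real and forecast values on the examination subsample, $y_{\max}$ and $y_{\min}$ are its largest and smallest real values, and $\bar{y}$ is its mean.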
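A minimal sketch of the two criteria in Python follows. It assumes that the spread and the mean are taken over the real values of the examination subsample and that NMSE is the square root of the ratio of the two sums of squares; the function names `fe` and `nmse` are hypothetical, and the numbers in the usage note are made up for illustration.

```python
import math


def fe(actual, forecast):
    """Mean absolute forecast error divided by the spread of the real values."""
    spread = max(actual) - min(actual)
    total = sum(abs(y - f) for y, f in zip(actual, forecast))
    return total / (len(actual) * spread)


def nmse(actual, forecast):
    """Root of the squared forecast error normalized by the squared
    deviation of the real values from their mean (assumed form)."""
    mean = sum(actual) / len(actual)
    num = sum((y - f) ** 2 for y, f in zip(actual, forecast))
    den = sum((y - mean) ** 2 for y in actual)
    return math.sqrt(num / den)
```

For instance, with the illustrative values `actual = [100, 110, 120]` and a flat forecast `[105, 105, 105]`, `fe` gives 25/60 and `nmse` gives the square root of 275/200. Both criteria need at least two distinct real values, otherwise the denominators vanish.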

| Method | FE Brent | FE Urals | NMSE Brent | NMSE Urals |
|---|---|---|---|---|
| ARIMA | 0.73 | 0.80 | 1.98 | 2.11 |
| Holt's method | 0.67 | 0.66 | 1.80 | 1.76 |
| Holt-Winters method | 0.76 | 0.97 | 2.09 | 2.49 |
| Method of smoothing with exponential trend | 0.58 | 0.55 | 1.58 | 1.47 |
| COMBI of GMDH | 0.83 | 0.74 | 2.16 | 1.92 |
| recursive COMBI of GMDH | 0.78 | 0.74 | 2.04 | 1.94 |
| MACSO of GMDH | 0.60 | 0.53 | 1.64 | 1.40 |
| recursive MACSO of GMDH | 0.60 | 0.56 | 1.65 | 1.53 |

Tab. 1. Values of the criteria for the well-known methods.

[Figure: values of the FE and NMSE criteria for Brent price forecasting. Two bar charts (FE and NMSE) comparing SwET (smoothing with exponential trend), recursive MACSO of GMDH, MACSO of GMDH, Holt's method, ARIMA, Holt-Winters method, recursive COMBI of GMDH and COMBI of GMDH.]

[Figure: Brent forecasts for steps 1 to 3, price axis from 95.00 to 125.00. First panel: real data, Holt's method, SwET, MACSO, recursive MACSO. Second panel: real data, COMBI of GMDH, recursive COMBI of GMDH, ARIMA, Holt-Winters method.]

Fig. 2. Scheme of sample division for the Algorithm of Modeling for Sliding Window with Optimization of Data Division.

Algorithm of modeling for sliding window with optimization of division (AMSWOD)

The algorithm was created for model construction and includes the algorithm of quasi-optimal data division and MACSO.

The data of the time series are placed into a sliding window consecutively. At the first stage of division, the data in the window are divided into 2 subsamples ($U_j$ and $D_j$) consecutively in time. At the second stage, the data of the $U_j$ subsample are divided into the $A_j$ and $B_j$ subsamples quasi-optimally (Fig. 2c). The length of all examination subsamples $D$ and $D_j$ is identical, $n_D = n_{D_j}$ for all $j$, and can be set. The lengths $n_{U_j}$, $n_{A_j}$, $n_{B_j}$ of the other subsamples and the window length $n_w$ vary within their bounds.

Modeling the trend and the remainder $\Delta_k$ did not raise the accuracy of forecasts by this algorithm. Therefore, linear autoregressive models of the monthly first differences $\Delta y_k$ of the initial variable are constructed (data in Fig. 3). The content of the subsamples used to construct models (7) is updated step by step as the window position changes:

$$\Delta\hat{y}_{k+i} = f(\Delta y_k, \Delta y_{k-1}, \Delta y_{k-2}, \Delta y_{k-3}, \theta), \quad i = 1, \ldots, n_D. \quad (7)$$

The process of division and model construction is repeated for each window.
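The overall loop (slide the window, split it, fit, forecast the held-out record) can be sketched as follows. This is not the authors' AMSWOD: the quasi-optimal division is replaced here by an exhaustive search over consecutive split points of $U_j$, the model is a first-order autoregression on first differences instead of a MACSO-selected structure, and $n_D = 1$. All function and parameter names are hypothetical.

```python
def first_differences(series):
    """Monthly first differences of the initial variable."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(xs):
    """Least-squares slope theta in x[t] ~ theta * x[t - 1]."""
    pairs = list(zip(xs, xs[1:]))
    den = sum(x * x for x, _ in pairs)
    return sum(x * y for x, y in pairs) / den if den else 0.0


def one_step_sq_error(theta, xs, start):
    """Sum of squared one-step errors for the targets xs[start:]."""
    return sum((xs[t] - theta * xs[t - 1]) ** 2 for t in range(start, len(xs)))


def forecast_window(window, min_a=3, min_b=2):
    """Forecast the last record of one window (the examination record D_j).

    U_j = window[:-1] is split into A_j = U_j[:n_a] and B_j = U_j[n_a:];
    the split with the smallest error on B_j is kept (a stand-in for the
    quasi-optimal division). Returns (forecast, actual, n_a).
    """
    u = window[:-1]
    if len(u) < min_a + min_b:
        raise ValueError("window too short for the requested A and B sizes")
    best = None
    for n_a in range(min_a, len(u) - min_b + 1):
        theta = fit_ar1(u[:n_a])
        err_b = one_step_sq_error(theta, u, n_a)
        if best is None or err_b < best[0]:
            best = (err_b, n_a, theta)
    _, n_a, theta = best
    return theta * u[-1], window[-1], n_a


def sliding_forecasts(diffs, n_w):
    """Repeat division and model construction for every window position."""
    return [forecast_window(diffs[s:s + n_w])
            for s in range(len(diffs) - n_w + 1)]
```

Each window yields one forecast of a first difference together with the real value it is compared against; adding the forecast difference to the last known price gives the price forecast.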

Fig. 3. Real data (monthly first differences of Brent time series, 61 records)

The accuracy criteria FE and NMSE make sense for comparing forecasts obtained by different methods and algorithms when the set of examination records is identical and contains more than one element. In AMSWOD the set of examination records varies as the window position changes. In addition, since the forecasting model is explored for one step, these criteria cannot be calculated. Therefore the following criteria are examined:

1) δ is the absolute forecast error for the last record of the sliding window;

2) PRT, the criterion of tendency forecasting, estimates the probability of correctly guessing the tendency as

$$PRT = \frac{n}{n_D}, \quad (8)$$

where $n$ is the number of steps on which the direction of change of the forecast value coincides with the direction of change of the real value;

3) ω is the share of models that give an allowable forecast.
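The three criteria can be sketched in Python as below. Two points are assumptions rather than statements of the source: the direction of change is measured relative to the previous real value, and a forecast is treated as allowable when its absolute error does not exceed a threshold (`tolerance`, a hypothetical parameter, since the source does not give the threshold).

```python
def _sign(value):
    return (value > 0) - (value < 0)


def prt(previous, actual, forecast):
    """Share of steps where the forecast moves in the same direction as the
    real value, both measured from the previous real value (assumption)."""
    hits = sum(1 for p, y, f in zip(previous, actual, forecast)
               if _sign(y - p) == _sign(f - p))
    return hits / len(actual)


def deltas(actual, forecast):
    """Absolute forecast error for the last record of each window."""
    return [abs(y - f) for y, f in zip(actual, forecast)]


def omega(actual, forecast, tolerance):
    """Share of models whose absolute error is within the tolerance."""
    errors = deltas(actual, forecast)
    return sum(1 for e in errors if e <= tolerance) / len(errors)
```

Here `previous`, `actual` and `forecast` hold one entry per window position: the last known real value, the real value being forecast and the forecast itself.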

Fig. 4. PRT, the criterion of tendency forecasting; δ, the absolute forecast error for the last record of the sliding window; ω, the share of models that give an allowable forecast.

Fig. 5. Real data and two series of 3-step forecasts of the Brent time series by different algorithms.

As Figure 4 shows, the model guessed the tendency of the variable in 76% of cases and gave an allowable forecast in 60% of cases. This AMSWOD result was obtained with a limited search, on a quite short sample of 55 records, and with linear autoregressive models. It requires further research.

The diagrams in Figure 5 show that recursive AMSWOD gives a forecast more than twice as accurate as the best of the known algorithms for the first series of forecasts (March, April and May). For the second series of three steps (June, July and August), smoothing with exponential trend is more accurate (NMSE = 1.50), although, like the other methods, it does not reflect the tendency of price change in July and August.

Conclusion

The following results were obtained within the scope of the subject:

• Using a sliding window with recursive forecasting and optimization of data division provides more accurate forecasts on intervals where the time series changes monotonically.

• A software implementation of an interactive system of model construction was developed on the basis of the combinatorial GMDH algorithm; it can be used for modeling and forecasting.

• Research was conducted to determine the best forecasting method for the time series of prices of the benchmark grades of petroleum.

Thank you for your attention!