Eurostat
Seasonal Adjustment
Topics
• Motivation and theoretical background (Øyvind Langsrud)
• Seasonal adjustment step-by-step (László Sajtos)
• (A few) issues on seasonal adjustment (László Sajtos)
Presented by
• Øyvind Langsrud
• Statistics Norway
Time series with seasonal and non-seasonal variation
[Figure: Index of production: Durable consumer goods, plotted against time, 2004 to 2012]
Removing the seasonal variation
[Figure: original (black) and seasonally adjusted (blue), plotted against time, 2004 to 2012]
Removing also the non-seasonal variation
[Figure: original (black), seasonally adjusted (blue) and trend (red), plotted against time, 2004 to 2012]
Monthly time series example
• Trend and seasonality can be seen – How to find it by computation?
[Figure: original series, retail sales volume index, 2000 to 2014]
Quick and dirty calculation of trend by ordinary linear regression:
y = a + b*time + e
time = 2000.000, 2000.083, 2000.167, 2000.250, 2000.333, 2000.417, 2000.500, 2000.583, 2000.667, 2000.750, 2000.833, 2000.917, 2001.000, 2001.083, …...
a = -6619.731 b = 3.351223
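The quick trend calculation above can be sketched in a few lines. This is only an illustration under stated assumptions: the series is synthetic (the retail sales index itself is not reproduced here), and the variable names are chosen for this example.

```python
import numpy as np

# Synthetic monthly series (assumption: NOT the retail sales index from the slides).
rng = np.random.default_rng(0)
n_months = 15 * 12
time = 2000 + np.arange(n_months) / 12        # 2000.000, 2000.083, 2000.167, ...
y = 80 + 3.3 * (time - 2000) + rng.normal(0, 2, n_months)

# Design matrix with columns [1, time]; least squares returns a (intercept) and b (slope).
X = np.column_stack([np.ones(n_months), time])
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

trend = a + b * time                          # fitted straight line
print(f"a = {a:.3f}, b = {b:.6f}")
```

Because time is measured in calendar years, the intercept a refers to year 0 and comes out as a large negative number, which is the same pattern as in the estimates above.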
[Figure: original (blue) and model fit (red), 2000 to 2014]
Including seasonality in "the dirty model"
y = a + b*time + c_month + e
[Figure: original (blue) and model fit (red), 2000 to 2014]
a = -6468.505  b = 3.275956
c =
  mnd0         mnd2          mnd3         mnd4         mnd5         mnd6
  -9.19620250  -16.59062737  -6.79790939  -8.51090569  -1.18890200  6.33881598
  mnd7         mnd8          mnd9         mnd10        mnd11        mnd12
  1.84439111   4.62139480    -2.56494236  -0.04409251  1.53598811   30.55299181
• Transforming to seasonal adjustment language:
  a + b*time → T_t
  c_month → S_t
  e → I_t
y_t = T_t + S_t + I_t
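The regression with month effects and its split into trend, seasonal and irregular parts might be sketched as follows. The data are synthetic, and the sum-to-zero coding of the month effects is an assumption made for this sketch (the twelve coefficients reported for the model do sum to approximately zero, which is consistent with such a constraint).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 15 * 12
time = 2000 + np.arange(n) / 12
month = np.arange(n) % 12                     # 0 = January, ..., 11 = December

# Synthetic data (assumption): linear trend + fixed monthly pattern + noise.
true_season = np.array([-9, -17, -7, -9, -1, 6, 2, 5, -3, 0, 2, 31], dtype=float)
true_season -= true_season.mean()
y = 80 + 3.3 * (time - 2000) + true_season[month] + rng.normal(0, 2, n)

# Sum-to-zero coding: 11 columns; December's effect is minus the sum of the others.
D = np.zeros((n, 11))
for j in range(11):
    D[month == j, j] = 1.0
D[month == 11, :] = -1.0

X = np.column_stack([np.ones(n), time, D])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b = coef[0], coef[1]
c = np.append(coef[2:], -coef[2:].sum())      # 12 month effects, summing to zero

T = a + b * time                              # trend
S = c[month]                                  # seasonal component
I = y - T - S                                 # irregular component
seasonally_adjusted = y - S
```

In this sketch the seasonal pattern is identical in every year and the trend is a straight line, which is exactly what the fitted model forces.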
Trend from "the dirty model"
[Figure: original (blue) and trend (red), 2000 to 2014]
Seasonality from "the dirty model": y_t = T_t + S_t + I_t
[Figure: seasonality, 2000 to 2014]
Seasonal adjustment by "the dirty model": y_t = T_t + S_t + I_t
[Figure: original (blue) and seasonally adjusted (red), 2000 to 2014]
Question to the audience: What is wrong with this ordinary regression approach?
Irregular component by "the dirty model": y_t = T_t + S_t + I_t
[Figure: irregular component, 2000 to 2014]
[Figure: original (blue) and trend (red), 2000 to 2014]
In practice a multiplicative model is used: y_t = T_t × S_t × I_t
• y_t is not the original series but a series that has been corrected for holiday and trading day effects (calendar adjusted)
• Note that the seasonal factors vary slightly over time
[Figure: seasonal factors, 2000 to 2015]
[Figure: irregular component, 2000 to 2014]
• This time the irregular component looks more like true noise
• Note that correlation between neighbouring values is allowed (autocorrelation)
[Figure: original (blue) and seasonally adjusted (red), 2000 to 2014]
• This is seasonally adjusted data as published by Statistics Norway
Multiplicative model: y_t = T_t × S_t × I_t
Additive model: y_t = T_t + S_t + I_t
How to calculate T_t, S_t, and I_t from y_t?
• This is done by filtering techniques
  – One element of this methodology is how to calculate the trend from seasonally adjusted data
  – This is a question of smoothing a noisy series
[Figure: seasonally adjusted (blue) and trend (red), 2000 to 2014]
[Figure: seasonally adjusted (blue) and trend (red), 2007 to 2012]
Smoothing by averaging
• P_t = (Y_{t-1} + Y_t + Y_{t+1})/3
[Figure: 3-term simple moving average, [1,1,1]/3, 2007 to 2012]
Also called filtering
• P_t = (Y_{t-2} + Y_{t-1} + Y_t + Y_{t+1} + Y_{t+2})/5
• The filter is [1,1,1,1,1]/5
[Figure: 5-term simple moving average, [1,1,1,1,1]/5, 2007 to 2012]
Here the filter length is 9
[Figure: 9-term simple moving average, [1,1,1,1,1,1,1,1,1]/9, 2007 to 2012]
Filtering can be performed twice
• 3x3 filter
  – 3-term moving average of a 3-term moving average
  – The final filter is [1,2,3,2,1]/9
  – P_t = (Y_{t-2} + 2Y_{t-1} + 3Y_t + 2Y_{t+1} + Y_{t+2})/9
• 2x12 filter
  – [1/2,1,1,1,1,1,1,1,1,1,1,1,1/2]/12
  – Also called a centred 12-term moving average
  – Question to the audience: Why is this filter of special interest?
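The composite filters above can be checked numerically: applying one moving average to the output of another is the same as convolving their weight vectors. The sketch below is illustrative only; `apply_filter` is a helper name invented for this example.

```python
import numpy as np

ma3 = np.ones(3) / 3
ma3x3 = np.convolve(ma3, ma3)                 # 3-term MA of a 3-term MA

ma2 = np.ones(2) / 2
ma12 = np.ones(12) / 12
ma2x12 = np.convolve(ma2, ma12)               # centred 12-term moving average

def apply_filter(y, weights):
    """Apply a symmetric filter; points without a full window are left as NaN."""
    y = np.asarray(y, dtype=float)
    m = len(weights) // 2
    out = np.full(len(y), np.nan)
    out[m:len(y) - m] = np.convolve(y, weights, mode="valid")
    return out

# A pattern that repeats every 12 months (hypothetical values 0..11).
pattern = np.tile(np.arange(12.0), 5)
smoothed = apply_filter(pattern, ma2x12)
```

Here `ma3x3 * 9` reproduces the weights [1,2,3,2,1], and `ma2x12 * 12` reproduces the 13 weights listed for the 2x12 filter. For the repeating pattern, every interior value of `smoothed` equals the pattern's mean, while the first and last six values stay undefined.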
Henderson filters
• Finding filters with good properties is an interesting topic …
• Henderson (1916) introduced the so-called Henderson filters
• X-12-ARIMA uses this type of filter to calculate the trend
• The filter length determines the degree of smoothing
5-term Henderson: [-21,84,160,84,-21]/286
[Figure: 5-term Henderson filter, 2007 to 2012]
7-term Henderson: [-42,42,210,295,210,42,-42]/715
[Figure: 7-term Henderson filter, 2007 to 2012]
13-term Henderson: [-325,-468,0,1100,2475,3600,4032,3600,2475,1100,0,-468,-325]/16796
[Figure: 13-term Henderson filter, 2007 to 2012]
23-term Henderson filter
[Figure: 23-term Henderson filter, 2007 to 2012]
Question to the audience: Why does the filtered series stop in 2009?
99-term Henderson filter
[Figure: 99-term Henderson filter, 2007 to 2012]
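The Henderson weights listed above can be reproduced from the standard closed-form expression for symmetric Henderson filters. The sketch below uses exact fractions so the comparison is not blurred by rounding; the function name is an assumption made for this example and is not taken from X-12-ARIMA.

```python
from fractions import Fraction

def henderson_weights(length):
    """Symmetric Henderson weights for an odd filter length, as exact fractions."""
    if length % 2 == 0 or length < 3:
        raise ValueError("length must be an odd number >= 3")
    m = (length - 1) // 2                     # half-length of the filter
    n = m + 2
    denom = 8 * n * (n**2 - 1) * (4 * n**2 - 1) * (4 * n**2 - 9) * (4 * n**2 - 25)
    weights = []
    for j in range(-m, m + 1):
        num = (315 * ((n - 1)**2 - j**2) * (n**2 - j**2)
               * ((n + 1)**2 - j**2) * (3 * n**2 - 16 - 11 * j**2))
        weights.append(Fraction(num, denom))
    return weights

w5 = henderson_weights(5)
w13 = henderson_weights(13)
```

The weights always sum to one. A filter of length 2m+1 needs m observations on each side of the point being smoothed, so the longer the filter, the more points at both ends of the series are left without a symmetric estimate.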
Non-available observations at the end: Two solutions
• Asymmetric filters
  – Asymmetric variant of Henderson: [-0.034,0.116,0.383,0.534,0,0,0]
  – Can be used at the last observation
• Forecasts in place of the unobserved values
  – The "starting series" for the X12-ARIMA decomposition is a calendar adjusted series which is based on reg-ARIMA modelling
  – The reg-ARIMA modelling can also be used to produce forecasts
  – X12-ARIMA uses these forecasts in trend calculations
Finding the seasonal component by filtering
• From a series with the trend removed we make 12 series
  – January values, February values, …
• Each of these series is smoothed by filtering
• Altogether these smoothed series are the seasonal component
[Figure: series with trend removed, 2000 to 2014]
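The idea of splitting the detrended series into 12 monthly series and smoothing each one can be sketched as below. The 3-term average across years (shortened at the ends) is an assumption chosen for brevity; it is not the filter X12-ARIMA actually uses, and the normalisation of the seasonal factors is left out.

```python
import numpy as np

def seasonal_from_detrended(detrended, period=12):
    """Smooth each calendar month's values across years and put them back in order."""
    detrended = np.asarray(detrended, dtype=float)
    seasonal = np.empty_like(detrended)
    for m in range(period):
        sub = detrended[m::period]            # e.g. all January values
        smoothed = np.empty_like(sub)
        for i in range(len(sub)):
            lo, hi = max(0, i - 1), min(len(sub), i + 2)
            smoothed[i] = sub[lo:hi].mean()   # 3-term average, shorter at the ends
        seasonal[m::period] = smoothed
    return seasonal

# Hypothetical detrended ratios: a fixed monthly pattern repeated for 6 years.
pattern = np.array([0.9, 0.85, 0.95, 0.93, 1.0, 1.05, 1.02, 1.04, 0.98, 1.0, 1.02, 1.3])
detrended = np.tile(pattern, 6)
seasonal = seasonal_from_detrended(detrended)
```

With a perfectly stable pattern the smoothing changes nothing. With real data, the smoothing across years is what allows the seasonal factors to move slowly over time.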
The X12-ARIMA algorithm
• The decomposition is made by several iterative steps
  – Seasonal component from series with trend removed
  – Trend from series with seasonal component removed
• Initial estimate of trend using the 2x12 moving average
• One element is downweighting of observations with an extreme irregular component
X12-ARIMA or SEATS
• Both methods can be viewed as filtering techniques
• X12-ARIMA
  – A non-parametric method
  – No model assumed
• SEATS
  – The components are assumed to follow ARIMA models
  – The filters are derived from modelling
  – Possible to do inference and to make forecasts with confidence intervals
  – So why the name X12-ARIMA when this method is the one that is not based on ARIMA? Answer on the next slide
Calendar adjustment by reg-ARIMA modelling
• Seasonal ARIMA model
  – Correlated errors (autocorrelation)
  – Differencing the series makes the model quite good without explicit parameters for trend and seasonality
  – Need to decide the type of ARIMA model: ARIMA(p,d,q)(P,D,Q)
• Regression parameters in the model
  – Calendar effects: trading day, moving holiday, …
  – Outliers and level shifts
• Here y can be a log-transformed and leap-year adjusted variant of the original data
"The dirty model" mentioned earlier:
This slide is “stolen” from https://www.scss.tcd.ie/Rozenn.Dahyot/ST7005/15SeasonalARIMA.pdf
Here B is the backshift operator: B Y_t = Y_{t-1}
ARIMA(0,1,1)(0,1,1): the most common model, known as the airline model
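For monthly data, the ARIMA(0,1,1)(0,1,1) model can be written out with the backshift operator as follows. The symbols \theta, \Theta and \varepsilon_t (the non-seasonal and seasonal moving average parameters and the white-noise error) are notation assumed here, and sign conventions for the moving average terms differ between software packages.

```latex
(1 - B)(1 - B^{12})\, y_t = (1 - \theta B)(1 - \Theta B^{12})\, \varepsilon_t
```

The left-hand side is one regular difference and one seasonal difference of the series, so the model needs only two parameters besides the error variance.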
Example of regression variables in reg-ARIMA modelling
• Easter
  – 2000 and 2001: Easter in April
  – 2008: Easter in March
  – 2002: 4 of 5 Norwegian Easter days in March
• Trading day
  – Six parameters needed to model seven days
  – Mon: number of Mondays minus number of Sundays
             Easter  Mon  Tue  Wed  Thu  Fri  Sat
Jan 2000  0.0000000    0   -1   -1   -1   -1    0
Feb 2000  0.0000000    0    1    0    0    0    0
Mar 2000 -0.2571429    0    0    1    1    1    0
Apr 2000  0.2571429   -1   -1   -1   -1   -1    0
May 2000  0.0000000    1    1    1    0    0    0
Jun 2000  0.0000000    0    0    0    1    1    0
Jul 2000  0.0000000    0   -1   -1   -1   -1    0
Aug 2000  0.0000000    0    1    1    1    0    0
Sep 2000  0.0000000    0    0    0    0    1    1
Oct 2000  0.0000000    0    0   -1   -1   -1   -1
Nov 2000  0.0000000    0    0    1    1    0    0
Dec 2000  0.0000000   -1   -1   -1   -1    0    0
Jan 2001  0.0000000    1    1    1    0    0    0
Feb 2001  0.0000000    0    0    0    0    0    0
Mar 2001 -0.2571429    0    0    0    1    1    1
Apr 2001  0.2571429    0   -1   -1   -1   -1   -1
May 2001  0.0000000    0    1    1    1    0    0
Jun 2001  0.0000000    0    0    0    0    1    1
Jul 2001  0.0000000    0    0   -1   -1   -1   -1
Aug 2001  0.0000000    0    0    1    1    1    0
Sep 2001  0.0000000   -1   -1   -1   -1   -1    0
Oct 2001  0.0000000    1    1    1    0    0    0
Nov 2001  0.0000000    0    0    0    1    1    0
Dec 2001  0.0000000    0   -1   -1   -1   -1    0
Jan 2002  0.0000000    0    1    1    1    0    0
Feb 2002  0.0000000    0    0    0    0    0    0
Mar 2002  0.5428571   -1   -1   -1   -1    0    0
Apr 2002 -0.5428571    1    1    0    0    0    0
May 2002  0.0000000    0    0    1    1    1    0
   :          :        :
Mar 2008  0.7428571    0   -1   -1   -1   -1    0
Apr 2008 -0.7428571    0    1    1    0    0    0
May 2008  0.0000000    0    0    0    1    1    1
Jun 2008  0.0000000    0   -1   -1   -1   -1   -1
Jul 2008  0.0000000    0    1    1    1    0    0
Aug 2008  0.0000000   -1   -1   -1   -1    0    0
Sep 2008  0.0000000    1    1    0    0    0    0
Oct 2008  0.0000000    0    0    1    1    1    0
Nov 2008  0.0000000   -1   -1   -1   -1   -1    0
Dec 2008  0.0000000    1    1    1    0    0    0
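The six trading day columns of the table follow directly from the calendar and can be reproduced with a short sketch. The function name is an assumption for this example, and the Easter column is not covered because its construction is not spelled out here.

```python
import calendar

def trading_day_regressors(year, month):
    """Return [Mon, Tue, Wed, Thu, Fri, Sat], each as (count of that weekday) - (count of Sundays)."""
    counts = [0] * 7                          # index 0 = Monday, ..., 6 = Sunday
    n_days = calendar.monthrange(year, month)[1]
    for day in range(1, n_days + 1):
        counts[calendar.weekday(year, month, day)] += 1
    return [counts[d] - counts[6] for d in range(6)]

print(trading_day_regressors(2000, 1))        # [0, -1, -1, -1, -1, 0]
```

January 2000 started on a Saturday, so Saturdays, Sundays and Mondays occur five times and the other days four times, which gives the first row of the table.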
Trading day: Separate effect of each day or common effect of all weekdays?
• Question to the audience: Why exactly equal t-values?

Regression Model
--------------------------------------------------------------
                        Parameter   Standard
Variable                Estimate    Error      t-value
--------------------------------------------------------------
Trading Day
  Mon                   -0.0019     0.00193     -1.00
  Tue                    0.0064     0.00194      3.31
  Wed                    0.0018     0.00190      0.94
  Thu                   -0.0016     0.00195     -0.81
  Fri                    0.0138     0.00188      7.37
  Sat                    0.0034     0.00193      1.73
  *Sun (derived)        -0.0219     0.00196    -11.16

Regression Model
--------------------------------------------------------------
                        Parameter   Standard
Variable                Estimate    Error      t-value
--------------------------------------------------------------
Trading Day
  Weekday                0.0036     0.00053      6.87
  **Sat/Sun (derived)   -0.0090     0.00131     -6.87
Outliers
• An extreme observation caused by a special event can be problematic
  – Can influence the modelling in a negative way: parameter estimates, forecasts, decomposition
• Solution
  – Include the outlier as a dummy variable in the reg-ARIMA modelling: …,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,…
  – The outlier is included in the irregular component after modelling
  – The observation is still included in seasonally adjusted data
  – But has no effect on the trend
Question to the audience: Examples of special events?
[Figure: data with outlier, seasonally adjusted (blue) and trend (red), 2000 to 2014]
[Figure: data with level shift, seasonally adjusted (blue) and trend (red), 2000 to 2014]
• Level shift is handled similarly to outliers
  – Regression variable: …,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,…
  – Level shift is included in the trend
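The two regression variables differ only in what happens after the event. A minimal sketch, following the 0/1 patterns shown in the slides (function names are assumptions for this example, and other software may code a level shift with a different but equivalent convention):

```python
import numpy as np

def additive_outlier(n, t0):
    """Dummy variable that is 1 at position t0 only."""
    x = np.zeros(n)
    x[t0] = 1.0
    return x

def level_shift(n, t0):
    """Step variable that is 0 before position t0 and 1 from t0 onwards."""
    x = np.zeros(n)
    x[t0:] = 1.0
    return x

ao = additive_outlier(18, 9)                  # 9 zeros, a one, 8 zeros
ls = level_shift(16, 8)                       # 8 zeros followed by 8 ones
```

Either vector would be added as an extra column next to the calendar regressors in the reg-ARIMA model.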
Presented by
• László Sajtos
• Hungarian Central Statistical Office
Topics
• Seasonal adjustment step-by-step
• (A few) issues on seasonal adjustment
Seasonal adjustment step-by-step
Seasonal adjustment step-by-step: structure
Input data
STEPS with check points
Preliminary results
Output data
If results are acceptable
Not acceptable results
Basic conditions
• Length of time series (long enough to be seasonally adjusted?)
  – Monthly datasets: at least 3 years long
  – Quarterly datasets: at least 4 years long
  – A time series of at least 5 to 7 years is optimal!
Expert information
• Collecting expert information about the datasets from the subject-matter sections (potential outliers, methodological changes, changes in external factors (e.g. law), connections to other time series and sectors)
Time series analysis (STEP 0)
• Graphical analysis via basic and sophisticated graphs
  – Plotted raw dataset
  – Spectral analysis: autocorrelogram and auto-regressive spectrum
• Identifying and explaining missing observations and outliers
• Correction of data faults
• Test for seasonality
Graphical analysis, test for seasonality (STEP 1)
Graphical analysis, an example (2000-2013)
[Figure: Hungarian monthly retail volume index, food, Jan 2000 to Jan 2014; annotations: "Seasonality", "Seems additive", "Probably outliers"]
Automatic test
Graphical analysis
Software tools
Verification
Type of transformation (STEP 2)
Determining factors which may affect (regressors)+national holidays
Non-significance or absence Little significance
Keep
Significance
Elimination
Consideration based on professional reasons
Consideration based on professional reasons
Elimination
Calendar adjustment (STEP 3)
Outlier treatment (Step 4)
Automatic outlier testing
Software tools
Verifying the results
STEP 1
Keep it
Significant
Monitoring Stability
Available expert information
Less significant, but professionally reasonable
Not significant
Eliminate it
Consideration based on professional reasons
Airline model
Software tools
Not satisfying results Good results
Keep model
Manual settings
Automatic choice recommended
Other low ordered models
Reducing the order of the model
ARIMA model (Step 5)
Decomposition (Step 6)
Software tools
Eliminating deterministic effects
Decomposition
Multiplicative, log-additive, additive
Quality diagnostics (Step 7)
1. Model adequacy on residuals:
• Ljung-Box test
• Box-Pierce test
2. Seasonality: based on spectral graphics
3. Stability analysis: sliding spans
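The two residual tests named in item 1 are both built from the sample autocorrelations of the model residuals. The sketch below computes only the test statistics; function names are assumptions for this example, and the p-value (from a chi-square distribution whose degrees of freedom depend on the number of lags and estimated ARMA parameters) is left out.

```python
import numpy as np

def _autocorr(e, k):
    """Sample autocorrelation of a mean-centred series e at lag k."""
    return np.sum(e[k:] * e[:-k]) / np.sum(e**2)

def box_pierce(residuals, max_lag):
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = len(e)
    return n * sum(_autocorr(e, k)**2 for k in range(1, max_lag + 1))

def ljung_box(residuals, max_lag):
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = len(e)
    return n * (n + 2) * sum(_autocorr(e, k)**2 / (n - k)
                             for k in range(1, max_lag + 1))

# Hypothetical residuals with obvious leftover structure: alternating signs.
alternating = np.array([1.0, -1.0] * 50)
q = ljung_box(alternating, 1)
```

Large values of either statistic suggest that the residuals are still autocorrelated, in other words that the ARIMA model has not captured all the structure in the series.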
Documentation required!
Manual settings (Step 8)
In case of:
• Detailed analysis
• Quality diagnostics are not satisfactory
• Further outlier correction
• Other advanced settings (e.g. confidence intervals)
Dissemination (STEP 9)
• Quality diagnostics satisfying: dissemination
• Quality diagnostics not satisfying: manual settings
EXAMPLE (IN DEMETRA 2.04 SOFTWARE)
HUNGARIAN INDUSTRIAL TIME SERIES
Automated module
Open the input database
The list of time series
Selection of time series output
Save of output
Diagnostic, outlier %
Adjustment without fixed models
Setting the method and trading day regressor
Setting the country specific holidays
The results Manual settings required Quality diagnostics
(A few) issues on seasonal adjustment
Issues in Memobust book
• Consistency issues
• Revision
• Treatment of the crisis
• Communication with users
• Data presentation
• Issues on chained indices
• Documentation
Revision
Unadjusted data
Reasons:
• Data arrival after deadline
• Erroneous data etc.
What to do: Data review
SA data
Reasons:
• New information is available
• Better estimation required
What to do: Estimating new model, new seasonal factors
Revision strategies
Goal: preserving accuracy and taking new information into consideration while avoiding large changes (reliability and stability)
Strategies:
• Extreme types: current, concurrent
• Alternative types: partial concurrent, controlled current
Horizon of revision
Practices:
• ESS Guideline: 3-4 years before the beginning of the revision period
• Statistics Denmark: at least 13 months back in time
Question: How many months of data should be revised?
Consistency issues
Issues: linkages in the economy and among time series; expectations of users; errors; etc.
• Temporal constraints, e.g. annual and infra-annual series: time consistency issue
• Cross-sectional constraints, e.g. total industrial and segmental series: aggregation consistency issue
Time consistency issues
Problem: consistency of, for instance, sub-annual and annual series e.g. GDP
Sources of inconsistency:
• Less and more accurate data are compared
• Sampling errors
• Errors in evaluation
Benchmarking
Benchmark: typically annual data
Aim: providing time consistency; the techniques constrain the sum of the modified sub-annual series to match the benchmark
Benchmarking
Pro-rating method
Denton method
Pro-rating method
How it works: multiplies the sub-annual values by the corresponding annual proportional discrepancies
Example: Three observations (), requirement:
Corrected values: ;
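A pro-rating step can be sketched in a few lines. The three values and the annual benchmark below are hypothetical numbers chosen for illustration; they are not the figures from the original slide.

```python
def pro_rate(sub_annual, annual_benchmark):
    """Scale all sub-annual values by the same factor so that they sum to the benchmark."""
    factor = annual_benchmark / sum(sub_annual)
    return [value * factor for value in sub_annual]

# Hypothetical example: three observations that must add up to 110.
observations = [30.0, 50.0, 20.0]
corrected = pro_rate(observations, 110.0)
print(corrected, sum(corrected))
```

Every value within a year is multiplied by the same factor, so the movements inside the year are preserved, but a jump can appear between the last period of one year and the first period of the next when the yearly factors differ.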
Denton method
How it works: Based on quadratic optimisation
Advantages:
• The method can be extended and customised
• More reliable results (smaller discontinuities compared with pro-rating)
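To make the quadratic optimisation concrete, here is a minimal sketch of one common variant: it keeps the period-to-period changes of the adjustment as small as possible (additive, first differences) while forcing each year's sum to match its benchmark. The variant, the function name and the example numbers are assumptions for illustration, not necessarily the specification used in practice.

```python
import numpy as np

def denton_additive(prelim, benchmarks, per_year):
    """Minimise sum of squared first differences of (x - prelim) subject to yearly sums."""
    p = np.asarray(prelim, dtype=float)
    b = np.asarray(benchmarks, dtype=float)
    n, k = len(p), len(b)
    if n != k * per_year:
        raise ValueError("prelim must contain per_year values for every benchmark")
    A = np.kron(np.eye(k), np.ones(per_year))     # row i sums the values of year i
    D = np.diff(np.eye(n), axis=0)                # first-difference matrix, (n-1) x n
    Q = D.T @ D
    # KKT system for: minimise (x - p)' Q (x - p) subject to A x = b
    kkt = np.block([[2 * Q, A.T], [A, np.zeros((k, k))]])
    rhs = np.concatenate([2 * Q @ p, b])
    return np.linalg.solve(kkt, rhs)[:n]

# Hypothetical quarterly values for two years and their annual benchmarks.
quarterly = [20.0, 30.0, 25.0, 25.0, 22.0, 32.0, 27.0, 27.0]
annual = [110.0, 112.0]
adjusted = denton_additive(quarterly, annual, per_year=4)
```

Unlike pro-rating, the correction here changes smoothly from one quarter to the next, so there is no step at the boundary between the two years.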
Aggregation consistency
Aggregate series: a time series that consists of several components (e.g. industrial series)
Goal: The aggregate series should equal the sum of its components
Problem: Non-linear seasonal adjustment process
Consequences: Hard to preserve accounting relationships and to meet users' expectations
Approaches: indirect SA, direct SA
Methods to achieve aggregation consistency
• Only direct or indirect seasonal adjustment
• Pro-rating
• Denton method
• Regression based models
Thank you for your attention!
Questions?