Forecasting using R
3. Autocorrelation and seasonality
Rob J Hyndman
2014-01-21
OTexts.com/fpp/2/
OTexts.com/fpp/6/1
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
Time series graphics
Time plots. R command: plot or plot.ts
Seasonal plots. R command: seasonplot
Seasonal subseries plots. R command: monthplot
Lag plots. R command: lag.plot
ACF plots. R command: Acf
[Figure: time plot, "Economy class passengers: Melbourne−Sydney"; x-axis: Year (1988 to 1993); y-axis: Thousands (0 to 30)]
plot(melsyd[,"Economy.Class"])
[Figure: time plot, "Antidiabetic drug sales"; x-axis: Year; y-axis: $ million]
> plot(a10)
[Figure: "Seasonal plot: antidiabetic drug sales"; x-axis labelled "Year" with ticks Jan to Dec; y-axis: $ million; one line per year, labelled 1991 to 2008]
Seasonal plots
Data plotted against the individual “seasons” in which the data were observed. (In this case a “season” is a month.)
Something like a time plot except that the data from each season are overlapped.
Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.
In R: seasonplot
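A seasonal plot can also be built by hand, which shows what "overlapping the seasons" means. The sketch below is an illustration, not the slides' own code: it uses base R only, and it assumes the built-in AirPassengers series as a stand-in for a10 so that it runs without extra packages. The name by_year is illustrative.

```r
# Stand-in data (assumption): AirPassengers ships with base R, monthly, Jan 1949 to Dec 1960.
y <- AirPassengers

# One row per month, one column per year. This simple reshape assumes the
# series starts in period 1 and covers complete years only.
by_year <- matrix(as.numeric(y), nrow = frequency(y),
                  dimnames = list(month.abb,
                                  as.character(start(y)[1]:end(y)[1])))

# Draw every year against the same Jan..Dec axis, so the years overlap.
matplot(by_year, type = "o", pch = 19, lty = 1, xaxt = "n",
        xlab = "Month", ylab = "Passengers (thousands)",
        main = "Seasonal plot built by hand")
axis(1, at = 1:12, labels = month.abb)
```

Each column of by_year is one year of data, so a year that departs from the usual seasonal shape stands out as a line that crosses the others.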
[Figure: "Seasonal subseries plot: antidiabetic drug sales"; x-axis: Month (Jan to Dec); y-axis: $ million]
> monthplot(a10)
Seasonal subseries plots
Data for each season collected together in time plot as separate time series.
Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.
In R: monthplot
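As a small illustration of what the subseries plot summarises, the sketch below draws the plot and then computes the per-month means, which monthplot draws as horizontal reference lines by default (base = mean). It assumes the built-in AirPassengers series as a stand-in for a10; month_means is an illustrative name.

```r
# Stand-in data (assumption): AirPassengers from base R instead of a10.
y <- AirPassengers

# Seasonal subseries plot: one mini time series per month.
monthplot(y, ylab = "Passengers (thousands)")

# Mean of each month's subseries, i.e. the level of each horizontal line.
month_means <- tapply(as.numeric(y), as.integer(cycle(y)), mean)
names(month_means) <- month.abb
round(month_means, 1)
```

Comparing the means across months shows the seasonal pattern; the slope within each mini series shows how that month has changed over time.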
Quarterly Australian Beer Production
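library(fpp)  # assumed setup: the fpp package supplies ausbeer and loads forecast (seasonplot)
beer <- window(ausbeer, start=1992)  # keep observations from 1992 onwards
plot(beer)  # time plot
seasonplot(beer, year.labels=TRUE)  # seasonal plot, one labelled line per year
monthplot(beer)  # seasonal subseries plot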
[Figure: time plot, "Australian quarterly beer production"; y-axis: megaliters]
[Figure: "Seasonal plot: quarterly beer production"; x-axis: Quarter (Q1 to Q4); y-axis: megalitres; one line per year, labelled 1992 to 2008]
[Figure: "Seasonal subseries plot: quarterly beer production"; x-axis: Quarter; y-axis: Megalitres]
2 Seasonal or cyclic?
Time series patterns
Trend: pattern exists when there is a long-term increase or decrease in the data.
Seasonal: pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).
Cyclic: pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).
[Figure: time plot, "Australian electricity production"; x-axis: Year; y-axis: GWh]
[Figure: time plot, "Australian clay brick production"; x-axis: Year; y-axis: million units]
[Figure: time plot, "Sales of new one−family houses, USA"; y-axis: Total sales]
[Figure: time plot, "US Treasury bill contracts"; x-axis: Day; y-axis: price]
[Figure: time plot, "Annual Canadian Lynx trappings"; x-axis: Time; y-axis: Number trapped]
Seasonal or cyclic?
Differences between seasonal and cyclic patterns:
seasonal pattern constant length; cyclic pattern variable length
average length of cycle longer than length of seasonal pattern
magnitude of cycle more variable than magnitude of seasonal pattern
The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
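The variable length of a cycle can be checked directly. The sketch below is an illustration only: it assumes the built-in lynx series (annual Canadian lynx trappings, 1821 to 1934) and a deliberately simple peak rule, so minor bumps are counted as peaks too. The names is_peak, peak_years and gaps are illustrative.

```r
y   <- as.numeric(lynx)          # annual Canadian lynx trappings (base R dataset)
yrs <- as.numeric(time(lynx))
n   <- length(y)

# A year counts as a peak if it is strictly above both of its neighbours.
is_peak <- c(FALSE,
             y[2:(n - 1)] > y[1:(n - 2)] & y[2:(n - 1)] > y[3:n],
             FALSE)

peak_years <- yrs[is_peak]
gaps <- diff(peak_years)         # years between consecutive peaks

peak_years
table(gaps)                      # several different gap lengths appear
```

For a seasonal series the corresponding gap would be fixed by the calendar (12 months, 4 quarters), which is what makes the timing of its peaks predictable.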
3 Autocorrelation
Autocorrelation
Covariance and correlation: measure extent of linear relationship between two variables ($y$ and $X$).
Autocovariance and autocorrelation: measure linear relationship between lagged values of a time series $y$.
We measure the relationship between:
$y_t$ and $y_{t-1}$
$y_t$ and $y_{t-2}$
$y_t$ and $y_{t-3}$
etc.
Example: Beer production
[Figure: lag plots of beer against lag 1 to lag 9 (nine panels); observations labelled by index and joined by lines; both axes 400 to 500]
> lag.plot(beer,lags=9)
[Figure: lag plots of beer against lag 1 to lag 9 (nine panels), drawn as points only; both axes 400 to 500]
> lag.plot(beer,lags=9,do.lines=FALSE)
Lagged scatterplots
Each graph shows $y_t$ plotted against $y_{t-k}$ for different values of $k$.
The autocorrelations are the correlations associated with these scatterplots.
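To make the link between a lagged scatterplot and a correlation concrete, the sketch below pairs each observation with the one k periods earlier and applies the ordinary correlation. It assumes the built-in monthly USAccDeaths series as a stand-in for the beer data; lag_cor is an illustrative helper, not a function from any package.

```r
y <- as.numeric(USAccDeaths)   # stand-in monthly series from base R (assumption)

# Correlation of the lag-k scatterplot: pair y_t with y_{t-k}, then apply cor().
lag_cor <- function(y, k) {
  pairs <- embed(y, k + 1)     # column 1 is y_t, column k + 1 is y_{t-k}
  cor(pairs[, 1], pairs[, k + 1])
}

# The scatterplots themselves, then one correlation per panel.
lag.plot(USAccDeaths, lags = 12, do.lines = FALSE)
round(sapply(1:12, function(k) lag_cor(y, k)), 3)
```

Panels in which the points line up along the diagonal give correlations near 1; panels with a downward-sloping cloud give negative values.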
Autocorrelation
We denote the sample autocovariance at lag $k$ by $c_k$ and the sample autocorrelation at lag $k$ by $r_k$. Then define
$$c_k = \frac{1}{T}\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}) \quad\text{and}\quad r_k = c_k/c_0$$
$r_1$ indicates how successive values of $y$ relate to each other
$r_2$ indicates how $y$ values two periods apart relate to each other
$r_k$ is almost the same as the sample correlation between $y_t$ and $y_{t-k}$.
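The definition can be coded directly and compared with R's built-in acf(). This is a minimal sketch: sample_acf is an illustrative helper, and the built-in USAccDeaths series is assumed as a stand-in for the beer data.

```r
# r_k = c_k / c_0, with c_k = (1/T) * sum_{t=k+1}^{T} (y_t - ybar)(y_{t-k} - ybar)
sample_acf <- function(y, k) {
  n    <- length(y)            # T in the formula (T is avoided: it aliases TRUE in R)
  ybar <- mean(y)
  c_k  <- sum((y[(k + 1):n] - ybar) * (y[1:(n - k)] - ybar)) / n
  c_0  <- sum((y - ybar)^2) / n
  c_k / c_0
}

y <- as.numeric(USAccDeaths)   # stand-in series from base R (assumption)

r_manual  <- sapply(1:9, function(k) sample_acf(y, k))
r_builtin <- acf(y, lag.max = 9, plot = FALSE)$acf[-1]   # drop lag 0

all.equal(r_manual, r_builtin)
```

The value is only "almost" the ordinary correlation of the lagged pairs because $r_k$ uses the overall mean and the overall sum of squares for both variables, and divides the sum of $T-k$ products by $T$.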
Autocorrelation
Results for first 9 lags for beer data:
$r_1$    $r_2$    $r_3$    $r_4$   $r_5$    $r_6$    $r_7$    $r_8$   $r_9$
−0.126   −0.650   −0.094   0.863   −0.099   −0.642   −0.098   0.834   −0.116
[Figure: correlogram of the beer data; x-axis: Lag (1 to 17); y-axis: ACF]
Autocorrelation
$r_4$ is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 4 quarters apart.
$r_2$ is more negative than for the other lags because troughs tend to be 2 quarters behind peaks.
Together, the autocorrelations at lags 1, 2, ..., make up the autocorrelation function or ACF.
The plot is known as a correlogram.
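The same reading of the lags can be reproduced on a made-up series. The sketch below uses a hypothetical, perfectly repeating quarterly pattern (not the beer data) with its trough in Q2 and its peak in Q4, two quarters apart, and checks which lags give the largest and the most negative autocorrelation.

```r
# Hypothetical quarterly pattern (assumption, not the beer data):
# trough in Q2, peak in Q4, repeated for 15 years.
pattern <- c(440, 400, 420, 500)
y <- ts(rep(pattern, times = 15), frequency = 4)

r <- acf(y, lag.max = 9, plot = FALSE)$acf[-1]   # r_1, ..., r_9
round(r, 3)

which.max(r)   # 4: peaks line up with peaks four quarters later
which.min(r)   # 2: peaks line up with troughs two quarters later
```

Because the pattern repeats exactly, $r_4$ here equals $(T-4)/T = 56/60$; real data such as the beer series give a smaller value because the seasonal pattern is not perfectly regular.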
ACF
Acf(beer)
[Figure: correlogram of the beer data; x-axis: Lag (1 to 17); y-axis: ACF]
Recognizing seasonality in a time series
If there is seasonality, the ACF at the seasonal lag (e.g., 12 for monthly data) will be large and positive.
For seasonal monthly data, a large ACF value will be seen at lag 12 and possibly also at lags 24, 36, ...
For seasonal quarterly data, a large ACF value will be seen at lag 4 and possibly also at lags 8, 12, ...
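This check is easy to automate. The sketch below reads the seasonal period from the series itself and reports the ACF at the seasonal lag and at twice that lag. seasonal_acf is an illustrative helper, and the built-in nottem series (monthly temperatures at Nottingham) is assumed as an example of strongly seasonal monthly data.

```r
# ACF at the seasonal lag m and at 2m, where m = frequency(y)
# (12 for monthly data, 4 for quarterly data).
seasonal_acf <- function(y) {
  m <- frequency(y)
  r <- acf(y, lag.max = 2 * m, plot = FALSE)$acf[-1]   # r_1, ..., r_{2m}
  c(lag_m = r[m], lag_2m = r[2 * m])
}

seasonal_acf(nottem)   # monthly series from base R (assumption): m = 12
```

Large positive values at both lags point to seasonality of length m; values near zero would suggest that the series has no pattern of that period.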
Australian monthly electricity production
[Figure: time plot, "Australian electricity production"; x-axis: Year; y-axis: GWh]
[Figure: correlogram of the electricity series; x-axis: Lag (0 to 40); y-axis: ACF]
Time plot shows clear trend and seasonality. The same features are reflected in the ACF.
The slowly decaying ACF indicates trend.
The ACF peaks at lags 12, 24, 36, ..., indicate seasonality of length 12.
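A similar picture can be reproduced without the electricity data. The sketch below assumes the built-in AirPassengers series, which also has both trend and monthly seasonality, as a stand-in; it draws the correlogram and prints the ACF at a few lags so that the slow decay and the bumps at multiples of 12 can be read off.

```r
# Stand-in data (assumption): AirPassengers has trend and monthly seasonality.
y <- AirPassengers

# Draw the correlogram up to lag 36 and keep the values r_1, ..., r_36.
fit <- acf(y, lag.max = 36)
r <- fit$acf[-1]

# Values between the seasonal lags (6, 18, 30) and at them (12, 24, 36).
round(r[c(1, 6, 12, 18, 24, 30, 36)], 2)
```

The overall downward drift of the values reflects the trend, while the values at lags 12, 24 and 36 sit above their neighbours because of the seasonality.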
Which is which?
Time plots:
1. Daily morning temperature of a cow (x-axis: 0 to 60; y-axis labelled "chirps per minute")
2. Accidental deaths in USA (monthly) (x-axis: 1973 to 1979; y-axis: thousands)
3. International airline passengers (x-axis: 1950 to 1956; y-axis: thousands)
4. Annual mink trappings (Canada) (x-axis: 1850 to 1910; y-axis: thousands)
ACF plots: A, B, C, D (each with x-axis: lag, 5 to 20; y-axis: ACF, −0.4 to 1.0)