Forecasting using R, Rob J Hyndman (PDF file, 2014-01-21)

Forecasting using R

3. Autocorrelation and seasonality

OTexts.com/fpp/2/
OTexts.com/fpp/6/1

Rob J Hyndman

Outline

1 Time series graphics

2 Seasonal or cyclic?

3 Autocorrelation


Time series graphics

Time plots. R command: plot or plot.ts

Seasonal plots. R command: seasonplot

Seasonal subseries plots. R command: monthplot

Lag plots. R command: lag.plot

ACF plots. R command: Acf
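As a rough sketch of how these commands fit together, the following uses the built-in monthly AirPassengers series. That choice is an assumption made so the example runs without extra packages; the slides use data from the fpp package. seasonplot and Acf come from the forecast package, so they are only called if that package is installed, and acf is the base R counterpart of Acf.

```r
# Assumption: AirPassengers (built into R) stands in for the fpp data sets.
y <- AirPassengers

plot(y)                  # time plot
monthplot(y)             # seasonal subseries plot
lag.plot(y, lags = 9)    # lag plots for lags 1 to 9
acf(y)                   # base R ACF plot (its first bar is lag 0)

# seasonplot() and Acf() live in the forecast package; skip them if it is missing.
if (requireNamespace("forecast", quietly = TRUE)) {
  forecast::seasonplot(y, year.labels = TRUE)  # seasonal plot
  forecast::Acf(y)                             # ACF plot
}
```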

[Figure: Economy class passengers: Melbourne-Sydney. Time plot; x-axis: Year (1988 to 1993); y-axis: Thousands.]
plot(melsyd[,"Economy.Class"])

[Figure: Antidiabetic drug sales. Time plot; x-axis: Year; y-axis: $ million.]
> plot(a10)

[Figure: Seasonal plot: antidiabetic drug sales. x-axis: Year (Jan to Dec); y-axis: $ million; one line per year, labelled 1991 to 2008.]

Seasonal plots

Data plotted against the individual “seasons” in which the data were observed. (In this case a “season” is a month.)

Something like a time plot except that the data from each season are overlapped.

Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.

In R: seasonplot
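To make the idea of overlapping seasons concrete, here is a minimal base R sketch that builds a seasonal plot by hand. It assumes the built-in AirPassengers series (not used in the slides), which starts in January and covers complete years, so it can be reshaped into one column per year. It is an illustration of the idea, not how seasonplot is implemented.

```r
# Assumption: a monthly series that starts in January and has complete years.
y <- AirPassengers

# One column per year, one row per month.
by_year <- matrix(as.numeric(y), nrow = 12)
colnames(by_year) <- start(y)[1] + seq_len(ncol(by_year)) - 1

# Draw every year as its own line against the month of the year.
matplot(1:12, by_year, type = "l", lty = 1, col = seq_len(ncol(by_year)),
        xaxt = "n", xlab = "Month", ylab = "Passengers (thousands)",
        main = "Seasonal plot built by hand")
axis(1, at = 1:12, labels = month.abb)
```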

Seasonal subseries plots

[Figure: Seasonal subseries plot: antidiabetic drug sales. x-axis: Month (Jan to Dec); y-axis: $ million.]

> monthplot(a10)

Data for each season collected together in a time plot as separate time series.

Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.

In R: monthplot
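The grouping behind a seasonal subseries plot can be sketched in base R as follows. The data set (built-in AirPassengers) and the variable names are assumptions for illustration. The per-month means computed here correspond to the horizontal reference lines that monthplot draws by default.

```r
# Assumption: AirPassengers stands in for the monthly series in the slides.
y <- AirPassengers

# cycle() gives each observation's position in the year (1 = Jan, ..., 12 = Dec).
by_month <- split(as.numeric(y), as.integer(cycle(y)))

# Mean of each monthly sub-series.
month_means <- sapply(by_month, mean)
names(month_means) <- month.abb
print(round(month_means, 1))

# The plot itself: one small time series per month.
monthplot(y, ylab = "Passengers (thousands)")
```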

Quarterly Australian Beer Production

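library(fpp)  # provides ausbeer and loads forecast (for seasonplot)

beer <- window(ausbeer, start=1992)  # keep observations from 1992 onwards

plot(beer)  # time plot

seasonplot(beer, year.labels=TRUE)  # seasonal plot, one line per year

monthplot(beer)  # seasonal subseries plot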

[Figure: Australian quarterly beer production. Time plot; y-axis: megalitres.]

[Figure: Seasonal plot: quarterly beer production. x-axis: Quarter (Q1 to Q4); y-axis: megalitres; one line per year, labelled 1992 to 2008.]

[Figure: Seasonal subseries plot: quarterly beer production. x-axis: Quarter (Jan, Apr, Jul, Oct); y-axis: megalitres.]


Time series patterns

A trend pattern exists when there is a long-term increase or decrease in the data.

A seasonal pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).

A cyclic pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).

[Figure: Australian electricity production. x-axis: Year; y-axis: GWh.]

[Figure: Australian clay brick production. x-axis: Year; y-axis: million units.]

[Figure: Sales of new one-family houses, USA. y-axis: Total sales.]

[Figure: US Treasury bill contracts. x-axis: Day; y-axis: price.]

[Figure: Annual Canadian Lynx trappings. x-axis: Time; y-axis: Number trapped.]

Seasonal or cyclic?

Differences between seasonal and cyclic patterns:

seasonal pattern constant length; cyclic pattern variable length

average length of cycle longer than length of seasonal pattern

magnitude of cycle more variable than magnitude of seasonal pattern

The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
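One informal way to see the difference is to look at where the peaks fall. The sketch below assumes two built-in R series that are not part of the slides' code: AirPassengers as a seasonal example and lynx (annual Canadian lynx trappings) as a cyclic one. The simple "higher than both neighbours" rule for a peak is likewise an assumption chosen for brevity.

```r
# Seasonal example: in which month does each year's maximum occur?
seasonal <- matrix(as.numeric(AirPassengers), nrow = 12)  # one column per year
peak_month <- apply(seasonal, 2, which.max)               # month index, 1 to 12
print(peak_month)

# Cyclic example: how many years separate successive local maxima?
x <- as.numeric(lynx)
n <- length(x)
inner <- 2:(n - 1)
# A point counts as a peak if it is higher than both of its neighbours.
is_peak <- c(FALSE, x[inner] > x[inner - 1] & x[inner] > x[inner + 1], FALSE)
peak_years <- start(lynx)[1] + which(is_peak) - 1
print(diff(peak_years))  # gaps between successive peaks, in years
```

Comparing the two printed vectors gives a feel for how regular the peak timing is in each series.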


Autocorrelation

Covariance and correlation: measure extent of linear relationship between two variables ($y$ and $X$).

Autocovariance and autocorrelation: measure linear relationship between lagged values of a time series $y$.

We measure the relationship between:

$y_t$ and $y_{t-1}$

$y_t$ and $y_{t-2}$

$y_t$ and $y_{t-3}$

etc.

Example: Beer production

[Figure: Lag plots of the beer data for lags 1 to 9, one panel per lag. Each panel plots beer against its lagged values, with axis ticks at 400, 450 and 500; points are labelled by observation number and joined by lines.]
> lag.plot(beer,lags=9)

[Figure: Lag plots of the beer data for lags 1 to 9, drawn as points only (no connecting lines); axis ticks at 400, 450 and 500.]
> lag.plot(beer,lags=9,do.lines=FALSE)

Lagged scatterplots

Each graph shows $y_t$ plotted against $y_{t-k}$ for different values of $k$.

The autocorrelations are the correlations associated with these scatterplots.
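A single lagged scatterplot can be built by hand to see what lag.plot is doing. This sketch assumes the built-in monthly USAccDeaths series and a lag of 12; both are illustrative choices rather than anything taken from the slides.

```r
# Assumption: USAccDeaths (monthly accidental deaths in the USA) as example data.
y <- as.numeric(USAccDeaths)
n <- length(y)
k <- 12  # lag to inspect

y_now    <- y[(k + 1):n]   # y_t     for t = k+1, ..., n
y_lagged <- y[1:(n - k)]   # y_{t-k} for the same t

# Scatterplot of y_t against y_{t-k}, and the ordinary correlation of the pairs.
plot(y_lagged, y_now, xlab = paste("lag", k), ylab = "y")
print(cor(y_lagged, y_now))

# lag.plot() draws the same kind of panel for every lag from 1 to 12.
lag.plot(USAccDeaths, lags = 12, do.lines = FALSE)
```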

Autocorrelation

We denote the sample autocovariance at lag $k$ by $c_k$ and the sample autocorrelation at lag $k$ by $r_k$. Then define

$$c_k = \frac{1}{T}\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}) \qquad\text{and}\qquad r_k = c_k/c_0$$

$r_1$ indicates how successive values of $y$ relate to each other

$r_2$ indicates how $y$ values two periods apart relate to each other

$r_k$ is almost the same as the sample correlation between $y_t$ and $y_{t-k}$.
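The definition can be checked numerically. The sketch below assumes the built-in AirPassengers series and a hypothetical helper named sample_autocov. It computes $c_k$ and $r_k$ directly from the formula, compares them with base R's acf, and also prints the ordinary correlation of the lag-1 pairs, which should be close to but not identical to $r_1$.

```r
# Assumption: AirPassengers as example data; sample_autocov is a hypothetical helper.
y <- as.numeric(AirPassengers)
n_obs <- length(y)   # T in the formula
ybar <- mean(y)

# c_k = (1/T) * sum over t = k+1..T of (y_t - ybar) * (y_{t-k} - ybar)
sample_autocov <- function(k) {
  sum((y[(k + 1):n_obs] - ybar) * (y[1:(n_obs - k)] - ybar)) / n_obs
}

c0 <- sample_autocov(0)
r_manual <- sapply(1:9, function(k) sample_autocov(k) / c0)

# acf() returns lag 0 first, so drop it before comparing.
r_builtin <- as.numeric(acf(y, lag.max = 9, plot = FALSE)$acf)[-1]
print(all.equal(r_manual, r_builtin))

# Ordinary sample correlation of (y_t, y_{t-1}), for comparison with r_1.
print(c(r1 = r_manual[1], cor_lag1 = cor(y[-1], y[-n_obs])))
```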

Autocorrelation

Results for first 9 lags for beer data:

Lag k:  1       2       3       4      5       6       7       8      9
r_k:    −0.126  −0.650  −0.094  0.863  −0.099  −0.642  −0.098  0.834  −0.116

[Figure: ACF plot of the beer data. x-axis: Lag (1 to 17); y-axis: ACF.]

Autocorrelation

$r_4$ is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 4 quarters apart.

$r_2$ is more negative than for the other lags because troughs tend to be 2 quarters behind peaks.

Together, the autocorrelations at lags 1, 2, …, make up the autocorrelation function or ACF.

The plot is known as a correlogram.

ACF

Acf(beer)

[Figure: ACF plot of the beer data. x-axis: Lag (1 to 17); y-axis: ACF.]

Recognizing seasonality in a time series

If there is seasonality, the ACF at the seasonal lag (e.g., 12 for monthly data) will be large and positive.

For seasonal monthly data, a large ACF value will be seen at lag 12 and possibly also at lags 24, 36, …

For seasonal quarterly data, a large ACF value will be seen at lag 4 and possibly also at lags 8, 12, …
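This check is easy to sketch in code. The example assumes the built-in monthly USAccDeaths series (not the data used in the slides) and reads off the ACF at the seasonal lag and its first multiple. The series is converted to a plain numeric vector so that lags are counted in observations rather than in years.

```r
# Assumption: USAccDeaths as an example of a seasonal monthly series.
y <- USAccDeaths
m <- frequency(y)  # seasonal period: 12 for monthly data

# ACF for lags 1 to 2m; acf() returns lag 0 first, so drop it.
r <- as.numeric(acf(as.numeric(y), lag.max = 2 * m, plot = FALSE)$acf)[-1]

# ACF values at the seasonal lags m and 2m.
print(round(r[c(m, 2 * m)], 3))

# The three lags with the largest autocorrelations, for comparison.
print(order(r, decreasing = TRUE)[1:3])
```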

Australian monthly electricity production

[Figure: Australian electricity production. Time plot; x-axis: Year; y-axis: GWh.]

[Figure: ACF of Australian monthly electricity production. x-axis: Lag (0 to 40); y-axis: ACF.]

Time plot shows clear trend and seasonality. The same features are reflected in the ACF.

The slowly decaying ACF indicates trend.

The ACF peaks at lags 12, 24, 36, …, indicate seasonality of length 12.
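The same reading of an ACF can be tried on another trended, seasonal monthly series. This sketch assumes the built-in AirPassengers data as a stand-in for the electricity series. It prints the first few autocorrelations, where a slow decay would suggest trend, and then lists the lags at which the ACF has a local maximum, which can be compared with multiples of 12.

```r
# Assumption: AirPassengers stands in for the electricity production series.
y <- as.numeric(AirPassengers)

# Autocorrelations for lags 1 to 40 (lag 0 dropped).
r <- as.numeric(acf(y, lag.max = 40, plot = FALSE)$acf)[-1]

# Early lags: values that fall off slowly point towards a trend.
print(round(r[1:6], 3))

# Lags where the ACF is higher than at both neighbouring lags.
local_peaks <- which(diff(sign(diff(r))) == -2) + 1
print(local_peaks)
```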

Which is which?

Time plots:

1. Daily morning temperature of a cow (x-axis: 0 to 60; y-axis: chirps per minute)

2. Accidental deaths in USA (monthly) (x-axis: 1973 to 1979; y-axis: thousands)

3. International airline passengers (x-axis: 1950 to 1956; y-axis: thousands)

4. Annual mink trappings (Canada) (x-axis: 1850 to 1910; y-axis: thousands)

ACF plots: A, B, C, D (x-axis: lags 5 to 20; y-axis: ACF, −0.4 to 1.0)

Time series graphics

Time plots. R command: plot.ts

Seasonal plots. R command: seasonplot

Seasonal subseries plots. R command: monthplot

Lag plots. R command: lag.plot

ACF plots. R command: Acf