Grouped time-series forecasting: Application to regional infant mortality counts


Han Lin Shang and Peter W. F. Smith
University of Southampton

Motivation

1 Multiple time series can be disaggregated by a hierarchical/grouped structure

2 Hyndman, Ahmed, Athanasopoulos and Shang (2010, CSDA) considered four hierarchical methods, but did not consider the construction of prediction intervals for hierarchical/grouped time series

3 Present a parametric bootstrap method to construct prediction intervals

4 Apply to infant mortality forecasting

Motivation

Consider regional infant mortality counts from 1933 to 2003, available in the hts package

Figure: Map of Australia labelling the states and territories (Western Australia, South Australia, Northern Territory, Queensland, New South Wales, Victoria, Tasmania, Capital Territory) and the cities Adelaide, Darwin, Brisbane, Sydney, Melbourne, Hobart and Canberra.

1 Hierarchical structure is expressed below

Level             Number of series
Australia         1
Gender            2
State             8
Gender × State    16
Total             27

2 Since the multiple time series can be disaggregated by state first or by gender first, our data are called grouped time series

3 Forecast regional infant mortality counts from 2004 to 2013

Hierarchical tree

Figure: A two-level hierarchical tree diagram. The total splits into Male and Female, and each gender splits into the states and territories VIC, NSW, QLD, SA, WA, ACT, NT and TAS.

Bottom-up method

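1 Generate base (or independent) forecasts for each series at the bottom level

2 Aggregate these upwards to produce revised forecasts

3 E.g., $\hat{Y}_{\text{Male},h} = \hat{Y}^{\text{VIC}}_{\text{Male},h} + \cdots + \hat{Y}^{\text{NT}}_{\text{Male},h}$ and $\hat{Y}_{\text{Total},h} = \hat{Y}_{\text{Male},h} + \hat{Y}_{\text{Female},h}$, where $h$ represents the forecast horizon

4 At the bottom level, base forecasts = revised forecasts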

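The aggregation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bottom_up` is hypothetical, the state and gender labels follow the hierarchy described above, and the forecast values are invented rather than taken from the data.

```python
# Labels follow the grouped structure in the slides; all numbers are made up.
STATES = ["NSW", "VIC", "QLD", "SA", "WA", "NT", "ACT", "TAS"]
GENDERS = ["Female", "Male"]


def bottom_up(bottom):
    """Aggregate bottom-level base forecasts for a single horizon h.

    `bottom` maps (state, gender) -> base forecast. The result holds the
    revised forecasts at every level of the grouped structure.
    """
    by_gender = {g: sum(bottom[(s, g)] for s in STATES) for g in GENDERS}
    by_state = {s: sum(bottom[(s, g)] for g in GENDERS) for s in STATES}
    total = sum(by_gender.values())
    # At the bottom level the revised forecasts are the base forecasts.
    return {"total": total, "gender": by_gender, "state": by_state,
            "bottom": dict(bottom)}


# Hypothetical bottom-level forecasts: 16 series (8 states x 2 genders).
example = {(s, g): 10 * (i + 1) + j
           for i, s in enumerate(STATES) for j, g in enumerate(GENDERS)}
revised = bottom_up(example)

# 1 total + 2 gender + 8 state + 16 bottom-level series.
n_series = (1 + len(revised["gender"]) + len(revised["state"])
            + len(revised["bottom"]))
```

Because the state totals and the gender totals are both sums of the same 16 bottom-level forecasts, they add up to the same overall total whichever way the data are grouped.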

Bottom-up in action

Figure: Time series plots at each level of the hierarchy (horizontal axis: year, ticks 1940 to 2000). Level 0: total. Level 1: female, male. Level 2: nsw, vic, qld, sa, wa, nt, actot, tas. Level 3: nsw_f, vic_f, qld_f, sa_f, wa_f, nt_f, actot_f, tas_f and nsw_m, vic_m, qld_m, sa_m, wa_m, nt_m, actot_m, tas_m.

Point forecast accuracy: data design

1 For each series at the bottom level, select the optimal exponential smoothing model based on an information criterion, such as AIC (by default) or BIC

2 Re-estimate the parameters of the model using a rolling window approach, with the initial fitting period (1933 to 1993)

3 Forecasts are produced for one- to ten-step-ahead

4 Iterate the process, by increasing the sample size of the training period by one year until 2003

5 This gives us 10 one-step-ahead forecasts, 9 two-step-ahead forecasts, ..., and 1 ten-step-ahead forecast

6 The advantage of the rolling window approach is that forecast accuracy can be assessed for each horizon

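The forecast counts per horizon can be checked with a short enumeration. The sketch below assumes that a forecast is only kept when its target year has holdout data (up to 2003), so the last origin that contributes is 2002; the function name `forecast_plan` and its parameters are hypothetical.

```python
from collections import Counter


def forecast_plan(first_origin=1993, last_year=2003, max_h=10):
    """List (origin, h) pairs whose target year origin + h has holdout data."""
    return [(origin, h)
            for origin in range(first_origin, last_year)
            for h in range(1, max_h + 1)
            if origin + h <= last_year]


plan = forecast_plan()
# Number of forecasts available at each horizon h: 11 - h.
per_horizon = Counter(h for _, h in plan)
```

Under these assumptions `per_horizon[h]` equals `11 - h`, which is the divisor that appears when errors are averaged over forecast origins for each horizon.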

Point forecast accuracy: evaluation

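To compare point forecast accuracy between the base and bottom-up forecasts for all series, calculate the mean absolute percentage error,

\[
\text{MAPE}_h = \frac{1}{(11-h) \times m} \sum_{i=n}^{n+(10-h)} \sum_{j=1}^{m} \left| \frac{Y_{i+h,j} - \hat{Y}_{i+h,j}}{Y_{i+h,j}} \right|,
\]

where $\hat{Y}_{i+h,j}$ denotes the $h$-step-ahead forecast of $Y_{i+h,j}$, $m$ represents the total number of time series in the hierarchy, and $h = 1, 2, \ldots, 10$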
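As a rough illustration of the error measure, the following Python sketch averages absolute percentage errors separately for each horizon, pooling over forecast origins and series. The function name `mape_by_horizon` and the sample numbers are hypothetical; the result is a fraction, which can be multiplied by 100 to report a percentage.

```python
from collections import defaultdict


def mape_by_horizon(records):
    """Average |(actual - forecast) / actual| for each horizon h.

    `records` is an iterable of (h, actual, forecast) triples pooled over
    forecast origins and series.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for h, actual, forecast in records:
        sums[h] += abs((actual - forecast) / actual)
        counts[h] += 1
    return {h: sums[h] / counts[h] for h in sorted(sums)}


# Invented holdout values and forecasts, for illustration only.
records = [(1, 100, 90), (1, 200, 220), (2, 50, 40)]
errors = mape_by_horizon(records)
```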

Point forecast result

        Level 0          Level 1          Level 2          Level 3
h       Base    BU       Base    BU       Base    BU       Base    BU
1       4.26    5.35     5.59    5.72     14.76   14.03    20.98   20.98
2       6.25    5.96     7.38    6.23     16.32   16.20    25.50   25.50
3       8.27    6.51     10.26   6.86     18.95   18.95    30.55   30.55
4       11.94   10.73    14.71   10.34    22.40   22.11    34.55   34.55
5       19.02   9.37     16.48   10.47    24.87   25.96    39.58   39.58
6       16.46   6.16     17.60   6.18     27.75   27.74    41.99   41.99
7       19.59   9.46     19.55   9.58     31.66   34.43    47.57   47.57
8       20.30   9.74     24.50   10.03    34.61   39.32    54.78   54.78
9       28.71   11.62    29.72   12.02    33.41   40.38    52.97   52.97
10      32.40   27.55    32.42   26.15    37.66   45.66    61.32   61.32
Mean    16.72   10.25    17.82   10.36    26.24   28.48    40.98   40.98

On average, the bottom-up method outperforms the independent (base) forecasts (which ignore the group structure) at the top two levels (Levels 0 and 1), but not at the state level (Level 2)

Construction of interval forecasts

1 Provide pointwise interval forecasts for assessing uncertainty

2 The proposed method fits within the framework of parametric bootstrapping

3 Draw bootstrap samples from the fitted exponential smoothing model for each series at the bottom level

4 For each bootstrap sample, construct the group structure and obtain point forecasts

5 Based on the bootstrapped forecasts, assess the variability of the point forecasts by constructing prediction intervals

6 Computationally, the simulate.ets function in the forecast package was used

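The steps above can be sketched in Python. This is not the simulate.ets implementation from the forecast package: it is a hand-written stand-in that assumes the simplest exponential smoothing model (additive errors, no trend or seasonality) with Gaussian errors and a fixed smoothing parameter. All function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_path(level, alpha, sigma, n, rng):
    """Simulate n values from y_t = l_{t-1} + e_t, l_t = l_{t-1} + alpha * e_t."""
    path = []
    for _ in range(n):
        err = rng.gauss(0.0, sigma)
        path.append(level + err)
        level += alpha * err
    return path


def ses_forecast(series, alpha):
    """Simple exponential smoothing point forecast (flat over all horizons)."""
    level = series[0]
    for y in series[1:]:
        level += alpha * (y - level)
    return level


def bootstrap_interval(models, n=30, reps=1000, coverage=0.8, seed=1):
    """Interval for the top level from bootstrapped bottom-up point forecasts."""
    rng = random.Random(seed)
    totals = []
    for _ in range(reps):
        # One bootstrap sample per bottom-level series, forecast it,
        # then sum the forecasts upwards (bottom-up).
        totals.append(sum(
            ses_forecast(
                simulate_path(m["level"], m["alpha"], m["sigma"], n, rng),
                m["alpha"])
            for m in models))
    cuts = statistics.quantiles(totals, n=100)  # cuts[k - 1]: k-th percentile
    tail = round(100 * (1 - coverage) / 2)
    return cuts[tail - 1], cuts[100 - tail - 1]


# Two hypothetical bottom-level models (parameters invented for illustration).
models = [{"level": 100.0, "alpha": 0.3, "sigma": 5.0},
          {"level": 80.0, "alpha": 0.3, "sigma": 5.0}]
lower, upper = bootstrap_interval(models)
```

Summing inside the bootstrap loop keeps every replicate coherent with the group structure, so intervals at aggregated levels come from the same simulated samples as the bottom-level ones.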

Demonstration of interval forecasts

Present 80% pointwise prediction intervals for the regional infant mortality counts from 2004 to 2013 at the top two levels

Figure: (a) Level 0: Total. (b) Level 1: Male, Female. Horizontal axis: year (ticks 1940 to 2000).

Infant mortality counts are forecast to continue decreasing. The variability of the male forecasts is higher than that of the female forecasts

Interval forecast accuracy

1 Given a sample path $[Y_1, \ldots, Y_n]$, where $Y_t$ is a column vector of values across the entire hierarchy, we constructed the $h$-step-ahead interval forecasts

2 Let $L_{n+h|n}(p)$ and $U_{n+h|n}(p)$ be the lower and upper bounds, where $p$ symbolizes the nominal coverage probability

3 Conditioning on holdout data, the indicator variable is

\[
I_{n+h,j} =
\begin{cases}
1 & \text{if } Y_{n+h,j} \in [L_{n+h|n,j}(p),\, U_{n+h|n,j}(p)] \\
0 & \text{if } Y_{n+h,j} \notin [L_{n+h|n,j}(p),\, U_{n+h|n,j}(p)]
\end{cases}
\qquad j = 1, \ldots, m
\]

Empirical coverage probability

Empirical coverage probability (ECP) is defined as

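\[
\text{ECP}_h = \frac{\sum_{l=n}^{n+(10-h)} \sum_{j=1}^{m} I_{l+h,j}}{m \times (11-h)}, \qquad h = 1, \ldots, 10,
\]

so that $\text{ECP}_h$ is the proportion of holdout values that fall inside their prediction intervals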

h      1      2      3      4      5      6      7      8      9      10
ECP    0.71   0.72   0.75   0.69   0.64   0.73   0.72   0.69   0.72   0.74

Table: Empirical coverage probability at a nominal coverage probability of 0.8
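A minimal Python sketch of the coverage calculation, assuming the indicator equals 1 when a holdout value lies inside its interval, so that the empirical coverage is the share of indicators equal to 1. The function names and the numbers are hypothetical.

```python
def coverage_indicator(actual, lower, upper):
    """Return 1 if the holdout value lies inside [lower, upper], else 0."""
    return 1 if lower <= actual <= upper else 0


def empirical_coverage(records):
    """Share of holdout values covered by their intervals.

    `records` holds (actual, lower, upper) triples for one horizon,
    pooled over forecast origins and series.
    """
    hits = [coverage_indicator(a, lo, up) for a, lo, up in records]
    return sum(hits) / len(hits)


# Invented holdout values and interval bounds, for illustration only.
records = [(95, 90, 110), (120, 100, 115), (101, 95, 105),
           (88, 90, 100), (100, 92, 108)]
ecp = empirical_coverage(records)
```

Here three of the five invented values fall inside their intervals, so `ecp` is 0.6; a value below the nominal level indicates intervals that are too narrow.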

Hypothesis testing: interval forecast accuracy

1 To test whether the ECP differs from the nominal coverage probability, we computed log likelihood-ratio test statistics (see Christoffersen 1998 for more details)

2 Christoffersen (1998) proposed a test for unconditional coverage, a test for independence of the indicator sequence, and a joint test of coverage and independence (conditional coverage)

3 At the nominal coverage probability of 0.8, the log likelihood-ratio statistics are

h     1      2      3      4      5      6      7      8      9      10
LR    5.73   4.55   1.87   3.24   9.23   5.28   5.94   4.03   2.55   5.01

Table: Critical value is 5.99 at the 5% level of significance

4 At the 5% level of significance, only 1 of the 10 statistics (9.23, at h = 5) is greater than the critical value

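For readers unfamiliar with these tests, the Python sketch below computes likelihood-ratio statistics for unconditional coverage and for independence from a 0/1 indicator sequence. It follows the standard textbook form of Christoffersen's (1998) tests rather than the authors' code; the function names are hypothetical, and summing the two statistics is only an approximation to the joint statistic.

```python
import math


def _xlogy(count, prob):
    """count * log(prob), using the convention 0 * log(0) = 0."""
    return 0.0 if count == 0 else count * math.log(prob)


def lr_unconditional(hits, p):
    """LR statistic for H0: coverage probability equals the nominal p."""
    n1 = sum(hits)
    n0 = len(hits) - n1
    pi_hat = n1 / len(hits)
    log_null = _xlogy(n0, 1 - p) + _xlogy(n1, p)
    log_alt = _xlogy(n0, 1 - pi_hat) + _xlogy(n1, pi_hat)
    return -2.0 * (log_null - log_alt)


def lr_independence(hits):
    """LR statistic for H0: hits are independent (vs first-order Markov)."""
    n = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for prev, cur in zip(hits, hits[1:]):
        n[(prev, cur)] += 1
    from0 = n[(0, 0)] + n[(0, 1)]
    from1 = n[(1, 0)] + n[(1, 1)]
    pi01 = n[(0, 1)] / from0 if from0 else 0.0
    pi11 = n[(1, 1)] / from1 if from1 else 0.0
    pi2 = (n[(0, 1)] + n[(1, 1)]) / (from0 + from1)
    log_null = (_xlogy(n[(0, 0)] + n[(1, 0)], 1 - pi2)
                + _xlogy(n[(0, 1)] + n[(1, 1)], pi2))
    log_alt = (_xlogy(n[(0, 0)], 1 - pi01) + _xlogy(n[(0, 1)], pi01)
               + _xlogy(n[(1, 0)], 1 - pi11) + _xlogy(n[(1, 1)], pi11))
    return -2.0 * (log_null - log_alt)


# Invented indicator sequence: 8 hits out of 10 at nominal coverage 0.8.
hits = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]
lr_uc = lr_unconditional(hits, 0.8)
lr_ind = lr_independence(hits)
lr_joint = lr_uc + lr_ind  # approximate joint statistic
```

A joint statistic of this kind is compared with a chi-squared critical value with 2 degrees of freedom, which is 5.99 at the 5% level.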

Conclusion

1 Revisited the bottom-up method

2 Applied it to the regional infant mortality counts in Australia

3 Performed evaluation of point forecast accuracy

4 Proposed a parametric bootstrap method to construct prediction intervals

5 Performed evaluation of interval forecast accuracy

6 Carried out hypothesis testing of interval forecast accuracy


Future research

1 Parametric bootstrapping is expected to work for other hierarchical/grouped time series forecasting methods, such as top-down methods

2 Modeling age-specific mortality counts hierarchically and coherently

3 Extension from mortality count to mortality rate


Thank you

A draft is available upon request from H.Shang@soton.ac.uk
