model building steps: forecasting the jobs number

Model Building Steps Forecasting the jobs number John H. Muller October 1, 2012 John H. Muller () Model Building Steps October 1, 2012 1 / 40


Page 1: Model Building Steps: Forecasting the Jobs number

Model Building Steps

Forecasting the jobs number

John H. Muller

October 1, 2012

John H. Muller () Model Building Steps October 1, 2012 1 / 40


Outline

1 Goals

2 The Task
    Forecasting the jobs number
    Predictor Variables

3 Modeling Process
    Preliminaries
    Fitting and Tuning the Model
    Model Selection
    Prediction

4 Resources


Goals for the presentation

Illustrate issues and choices in a typical model building process.

To do that, we take the following as our task: build a model to forecast a macroeconomic time series.

Time is limited, so we don’t have time to discuss:

econometrics or macroeconomics

time series methods

details or merits of particular modeling or model fitting methods

The Task: Forecasting the jobs number

Total nonfarm (FRED symbol = PAYEMS)

Figure: Total nonfarm (PAYEMS) (x-axis: 1960 to 2010; y-axis: 60000 to 140000)

Every month BLS publishes the Employment Situation report.

Two most important numbers: Unemployment rate and Total nonfarm

Total nonfarm: count of jobs from survey of businesses (units: thousands of jobs)


Task: Forecast month-over-month change in Total nonfarm


Total nonfarm (FRED symbol = PAYEMS)

Figure: PAYEMS since 2000 (x-axis: 2000 to 2012; y-axis: 130000 to 138000)


Month over Month Change in PAYEMS

mean = 23, sd = 289

Figure: Month-over-month change in PAYEMS (x-axis: 2000 to 2012; y-axis: -1000 to 1000)
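The target series is the first difference of the monthly level series, and the mean and sd quoted above are summary statistics of those differences. A minimal sketch of that calculation, assuming Python (the slides show no code) and made-up level values rather than real PAYEMS data:

```python
from statistics import mean, stdev

def month_over_month_change(levels):
    """Return the first difference of a monthly level series."""
    return [curr - prev for prev, curr in zip(levels, levels[1:])]

# Hypothetical level values in thousands of jobs (not real PAYEMS data).
levels = [130000, 130250, 130400, 130300, 130650]
changes = month_over_month_change(levels)

# The differenced series has one fewer observation than the level series.
print(changes)
print(mean(changes), stdev(changes))
```

The function name `month_over_month_change` is illustrative only. Applied to the real series, the same two statistics would correspond to the mean and sd reported on the slide.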

Predictor Variables

ID        Description
ALTSALES  Light Weight Vehicle Sales: Autos & Light Trucks
BUSLOANS  Commercial and Industrial Loans at All Commercial Banks
CE16OV    Civilian Employment
CIVPART   Civilian Labor Force Participation Rate
CLF16OV   Civilian Labor Force
CONSUMER  Consumer Loans at All Commercial Banks
CPATAX    Corporate Profits After Tax
DPI       Disposable Personal Income
PAYEMS    All Employees: Total nonfarm
PCE       Personal Consumption Expenditures
PSAVERT   Personal Saving Rate
SRVPRD    All Employees: Service-Providing Industries
TCU       Capacity Utilization: Total Industry
UEMP27OV  Civilians Unemployed for 27 Weeks and Over
UEMPLT5   Civilians Unemployed - Less Than 5 Weeks
UEMPMEAN  Average (Mean) Duration of Unemployment
UEMPMED   Median Duration of Unemployment
UNEMPLOY  Unemployed
UNRATE    Civilian Unemployment Rate
USGOOD    All Employees: Goods-Producing Industries

Table: Variables and descriptions


Jobs

Figure: Original Series (panels: CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD; x-axis: 2000 to 2012)

Consumer

Figure: Original Series (panels: ALTSALES, CONSUMER, DPI, PCE, PSAVERT; x-axis: 2000 to 2012)

Business

Figure: Original Series (panels: BUSLOANS, CPATAX, TCU; x-axis: 2000 to 2012)


Jobs

Figure: Differenced Series (panels: CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD; x-axis: 2000 to 2012)

Consumer

Figure: Differenced Series (panels: ALTSALES, CONSUMER, DPI, PCE, PSAVERT; x-axis: 2000 to 2012)

Business

Figure: Differenced Series (panels: BUSLOANS, CPATAX, TCU; x-axis: 2000 to 2012)


Preliminaries

Choose target and predictor variables.
    Considerations might include: history, cost, frequency, accuracy

Choose model form and method: Lasso and random forest.
    Alternatives: neural networks, OLS, robust regression, ...
    Criteria for choosing:
        prediction accuracy
        interpretability
        suitability to the task and data
        available software, model maintenance, implementation complexity

Derive variables from inputs: smoothed, standardized.
    Alternatives: powers of original variables, cross terms, ratios

Plan for estimating out-of-sample error: cross-validation and test/train split
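The derived-variable and hold-out steps above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the language (Python), the trailing moving average as the smoother, the window length, the chronological split, and all function names are assumptions, and the data are made up.

```python
from statistics import mean, pstdev

def standardize(xs):
    """Rescale a series to zero mean and unit standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

def trailing_mean(xs, window):
    """Smooth with a trailing moving average; early points use what is available."""
    out = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def chronological_split(rows, n_test):
    """Hold out the last n_test rows so the test set lies after the training set."""
    return rows[:-n_test], rows[-n_test:]

series = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]   # hypothetical predictor values
smoothed = trailing_mean(series, window=3)
scaled = standardize(smoothed)
train, test = chronological_split(scaled, n_test=2)
```

In a real pipeline the mean and standard deviation used for scaling would normally be estimated on the training rows only, so that nothing from the test period leaks into the fit.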


Data issues

Missing data: remove.
    Alternatives: impute, ignore (for some model forms)

Outliers: trim to within 3 sd of rolling mean.
    Alternatives: ignore, remove

Correlated predictor variables: ignore.
    Alternatives: cluster variables and choose 1 from each cluster
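One possible reading of "trim to within 3 sd of rolling mean" is sketched below. The slide does not give the window length or say whether the rolling statistics are trailing or centered, so the trailing window, the use of already-trimmed history, and the name `trim_to_rolling_band` are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def trim_to_rolling_band(xs, window=12, n_sd=3.0):
    """Clip each value to within n_sd standard deviations of the mean of the
    preceding `window` (already trimmed) values. The first `window` values
    pass through unchanged because there is no full history for them."""
    out = list(xs[:window])
    for i in range(window, len(xs)):
        past = out[i - window:i]
        mu, sd = mean(past), pstdev(past)
        lo, hi = mu - n_sd * sd, mu + n_sd * sd
        out.append(min(max(xs[i], lo), hi))
    return out

# Hypothetical series with one spike at index 6.
xs = [10, 12, 11, 9, 10, 11, 60, 10, 12]
trimmed = trim_to_rolling_band(xs, window=6, n_sd=3.0)
print(trimmed)
```

Using the trimmed history rather than the raw history keeps a single spike from widening the band for the points that follow it; computing the band from raw values would be an equally defensible choice.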


Jobs

Figure: Trimmed and Smoothed (panels: CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD; x-axis: 2000 to 2012)

Consumer

Figure: Trimmed and Smoothed (panels: ALTSALES, CONSUMER, DPI, PCE, PSAVERT; x-axis: 2000 to 2012)

Business

Figure: Trimmed and Smoothed (panels: BUSLOANS, CPATAX, TCU; x-axis: 2000 to 2012)


Fitting and Tuning the Model

Complexity: how many knobs the model has,
    e.g. degrees of freedom, number of variables, shrinkage factor, tree size, ...

Fitting: estimating parameters for a given complexity.
    Methods: least squares, method of moments, maximum likelihood, optimization

Tuning: adjusting the model's complexity.

Possibly iterative, using diagnostics:

    out-of-sample error

    sensitivity, e.g. $\partial\,\text{error} / \partial\,\text{data}$

    significance of parameters

    error structure, e.g. heteroskedastic

    alignment with prior beliefs
        Which variables are important for the model?
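The distinction between fitting and tuning can be made concrete with a deliberately small example. The sketch below is an assumption-laden illustration, not the method used in the talk: it uses a one-predictor regression with a ridge-style shrinkage penalty (chosen because the fit has a closed form), contiguous cross-validation folds, made-up data, and hypothetical function names.

```python
def fit_ridge_slope(xs, ys, lam):
    """Fitting: closed-form slope of y ~ b*x with shrinkage penalty lam*b^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def mse(xs, ys, b):
    """Mean squared error of the predictions b*x."""
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cv_mse(xs, ys, lam, k=4):
    """Estimate out-of-sample MSE with k contiguous folds."""
    n = len(xs)
    bounds = [round(i * n / k) for i in range(k + 1)]
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        train_x, train_y = xs[:lo] + xs[hi:], ys[:lo] + ys[hi:]
        b = fit_ridge_slope(train_x, train_y, lam)
        total += mse(xs[lo:hi], ys[lo:hi], b) * (hi - lo)
    return total / n

def tune(xs, ys, grid):
    """Tuning: pick the shrinkage value with the lowest cross-validated MSE."""
    return min(grid, key=lambda lam: cv_mse(xs, ys, lam))

xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [-4.1, -2.7, -2.2, -0.8, 1.2, 1.8, 3.1, 3.9]   # hypothetical data
grid = [0.0, 0.1, 1.0, 10.0]
best_lam = tune(xs, ys, grid)
print(best_lam, fit_ridge_slope(xs, ys, best_lam))
```

Here `lam` plays the role of the complexity knob: for a fixed `lam` the slope is fitted, and the outer loop over the grid is the tuning step driven by the out-of-sample error diagnostic.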


Figure: Cross-Validated MSE (y-axis) versus fraction of final L1 norm (x-axis: 0.0 to 1.0)

Figure: LASSO, standardized coefficients (y-axis) versus |beta|/max|beta| (x-axis: 0.0 to 1.0)


variable   estimate
ALTSALES      0.000
BUSLOANS      0.000
CE16OV        0.157
CIVPART       0.000
CLF16OV       0.000
CONSUMER      0.000
CPATAX        0.000
DPI           0.000
PAYEMS        0.028
PCE           2.250
PSAVERT     -15.350
SRVPRD        0.000
TCU           0.000
UEMP27OV     -0.265
UEMPLT5       0.000
UEMPMEAN    -18.559
UEMPMED       0.000
UNEMPLOY      0.000
UNRATE     -305.852
USGOOD        0.292

Table: Coefficient estimates for LASSO
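Many of the estimates in the table are exactly zero, which is the characteristic effect of the L1 penalty. The slides do not say which algorithm produced the fit; the sketch below uses cyclic coordinate descent with soft-thresholding only because it is short, and the data, penalty value, and function names are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly zero when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.
    X is a list of rows; the columns of X and y are assumed to be centered."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)                       # residual y - X b for b = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue                  # constant column carries no signal
            # Covariance of column j with the residual that excludes column j.
            rho = sum(v * (r + v * b[j]) for v, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / z
            delta = new_bj - b[j]
            if delta != 0.0:
                resid = [r - v * delta for v, r in zip(col, resid)]
                b[j] = new_bj
    return b

# Hypothetical centered data: y depends mainly on column 0.
X = [[-1.5, 0.5], [-0.5, -1.0], [0.5, 1.0], [1.5, -0.5]]
y = [-3.0, -1.1, 0.9, 3.2]
coef = lasso_coordinate_descent(X, y, lam=0.5)
print(coef)
```

With this penalty the weak second predictor is dropped entirely while the first is kept but shrunk, which mirrors the pattern of zero and non-zero entries in the table.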


Random Forest: predictor variable importance

Figure: IncNodePurity (x-axis: 0 to 1200000) for each predictor variable. Variables in the order extracted from the plot: ALTSALES, CPATAX, DPI, UEMPLT5, BUSLOANS, CONSUMER, PSAVERT, UEMPMED, UEMPMEAN, SRVPRD, CIVPART, PAYEMS, CLF16OV, PCE, UNRATE, UEMP27OV, CE16OV, UNEMPLOY, TCU, USGOOD


Model Selection

Model selection: choosing the best among different models

Our criterion: prediction accuracy. How will we measure this?

Training set: cross-validation estimates of out-of-sample MSE
    RF: 52,000
    Lasso: 62,000

Separate test data: 25,000, essentially the same for both!
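The MSE figures are easier to read next to the target's standard deviation once they are converted to root mean squared error. The sketch below does only that arithmetic on the numbers quoted in the slides; it assumes the MSE values are in squared thousands of jobs and that the sd of 289 quoted earlier is a fair yardstick for both periods.

```python
from math import sqrt

# MSE figures quoted on the slide (assumed units: squared thousands of jobs).
mse_estimates = {
    "rf_cv": 52_000,
    "lasso_cv": 62_000,
    "test_both": 25_000,
}
target_sd = 289  # sd of the month-over-month change quoted earlier in the slides

rmse = {name: sqrt(value) for name, value in mse_estimates.items()}
for name, value in rmse.items():
    print(f"{name}: RMSE = {value:.0f}, ratio to target sd = {value / target_sd:.2f}")
```

A ratio below 1 means the model's typical error is smaller than the spread of the target itself, which is the baseline a constant forecast at the mean would roughly achieve.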


Figure: target, rf, and lasso series (x-axis: 2000 to 2012; y-axis: -1000 to 1000)

Figure: Training set error (series: rf, lasso; x-axis: 2000 to 2012)

Figure: Training error ACF (panels: Random Forest, Lasso; x-axis: Lag, 0 to 20; y-axis: ACF)
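The ACF plot checks whether the training errors still contain time structure that the model missed. A minimal sketch of the sample autocorrelation, assuming Python and made-up residuals (the name `acf` and the white-noise band are standard textbook choices, not taken from the slides):

```python
def acf(xs, max_lag):
    """Sample autocorrelation of xs at lags 0..max_lag."""
    n = len(xs)
    mu = sum(xs) / n
    dev = [x - mu for x in xs]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

# Hypothetical residuals that alternate in sign, so lag 1 is strongly negative.
errors = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
rho = acf(errors, max_lag=3)
band = 1.96 / len(errors) ** 0.5   # approximate 95% band under white noise
print(rho, band)
```

Autocorrelations that fall well outside the band at low lags suggest the errors are not independent over time, which is one of the diagnostics listed under error structure.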


Figure: target, rf, and lasso series (x-axis: Jan to Sep; y-axis: 0 to 500)

Figure: Test set error (series: rf, lasso; x-axis: Jan to Sep)


Prediction

Random Forest: 120

Lasso: 86

Resources

The Secrets of Economic Indicators, Bernard Baumohl

The Elements of Statistical Learning, Hastie, Tibshirani, Friedman

Macroeconomic Patterns and Stories, Edward E. Leamer

Analysis of Financial Time Series, Ruey S. Tsay

http://api.stlouisfed.org/docs/fred/ (good source for both FRED and ALFRED)

http://cran.r-project.org/


Thank you!

and thank you to John Verostek, Vladimir Valenta and Steve Kusiak
