econ 3790: business and economics statistics

Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal [email protected]

Upload: hasad-zimmerman

Post on 31-Dec-2015



Page 1: Econ 3790: Business and Economics Statistics

Econ 3790: Business and Economics Statistics

Instructor: Yogesh Uppal [email protected]

Page 2: Econ 3790: Business and Economics Statistics

Chapter 15: Multiple Regression Model

The equation that describes how the dependent variable $y$ is related to the independent variables $x_1, x_2, \ldots, x_p$ and an error term is called the multiple regression model.

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$$

where $\beta_0, \beta_1, \beta_2, \ldots, \beta_p$ are the parameters, and $\varepsilon$ is a random variable called the error term.

Page 3: Econ 3790: Business and Economics Statistics

Estimated Multiple Regression Equation

A simple random sample is used to compute sample statistics $b_0, b_1, b_2, \ldots, b_p$ that are used as the point estimators of the parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_p$.

The estimated multiple regression equation is:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p$$

Page 4: Econ 3790: Business and Economics Statistics

Interpreting the Coefficients

In multiple regression analysis, we interpret each regression coefficient as follows: $b_i$ represents an estimate of the change in $y$ corresponding to a 1-unit increase in $x_i$ when all other independent variables are held constant.

Page 5: Econ 3790: Business and Economics Statistics

Example: Car Sales

Multiple Regression Model

Suppose we believe that the number of cars sold (y) is related not only to the number of ads (x1), but also to the minimum down payment required (x2). The regression model can be given by:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where
y = number of cars sold
x1 = number of ads
x2 = minimum down payment required (in thousands)

Page 6: Econ 3790: Business and Economics Statistics

Estimated Regression Equation

$$\hat{y} = 14.4 + 3.7 x_1 - 25 x_2$$

Interpretation?
Estimated values of y?
Error?
Prediction?
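The questions on this slide can be explored with a small sketch. It evaluates the estimated car sales equation and shows that adding one ad, with the down payment held constant, changes the estimate by the coefficient 3.7. The function name and the input values (10 ads, a down payment of 1 thousand) are illustrative assumptions, not figures from the slides.

```python
# Estimated regression equation from the slide: y_hat = 14.4 + 3.7*x1 - 25*x2
B0, B1, B2 = 14.4, 3.7, -25.0


def predict_cars_sold(ads, down_payment_thousands):
    """Return the estimated number of cars sold (hypothetical helper)."""
    return B0 + B1 * ads + B2 * down_payment_thousands


# Hypothetical inputs, chosen only for illustration.
base = predict_cars_sold(10, 1.0)
one_more_ad = predict_cars_sold(11, 1.0)

print(f"estimate at 10 ads, down payment 1: {base:.1f}")
print(f"change from one extra ad, down payment held constant: {one_more_ad - base:.1f}")
```

The difference between the two estimates equals the coefficient on ads, which is the "all other variables held constant" interpretation stated earlier.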

Page 7: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

Page 8: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

$$R^2 = \frac{SSR}{SST}$$

$R^2$ = 84.63/89.2 = .949

Adjusted Multiple Coefficient of Determination

$$R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

Standard Error of Estimate

$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - p - 1}}$$

Page 9: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Hypotheses

$$H_0: \beta_i = 0$$

$$H_a: \beta_i \neq 0$$

Test Statistic

$$t = \frac{b_i}{SE(b_i)}$$

Rejection Rule

Reject $H_0$ if $p$-value $< \alpha$, or if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a $t$ distribution with $n - p - 1$ degrees of freedom.

Page 10: Econ 3790: Business and Economics Statistics

Example: Testing for significance of coefficients

Hypotheses

$$H_0: \beta_i = 0$$

$$H_a: \beta_i \neq 0$$

Rejection Rule

For $\alpha$ = .05 and d.f. = ?, $t_{.025}$ =

Test Statistic

$$t = \frac{b_i}{SE(b_i)}$$

Page 11: Econ 3790: Business and Economics Statistics

Testing for Significance of Regression: F Test

Hypotheses

$H_0$: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$

$H_a$: One or more of the parameters is not equal to zero.

Test Statistic

$$F = \frac{MSR}{MSE}$$

Rejection Rule

Reject $H_0$ if $p$-value $< \alpha$ or if $F > F_\alpha$, where $F_\alpha$ is based on an $F$ distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.

Page 12: Econ 3790: Business and Economics Statistics

Example 2: Programmer Salary Survey

Multiple Regression Model

A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine if salary was related to the years of experience and the score on the firm's programmer aptitude test.

The years of experience, score on the aptitude test, and corresponding annual salary ($1000s) for the sample of 20 programmers are shown on the next slide.

Page 13: Econ 3790: Business and Economics Statistics

Multiple Regression Model

Exper.  Score  Salary ($1000s)      Exper.  Score  Salary ($1000s)
   4      78       24.0                9      88       38.0
   7     100       43.0                2      73       26.6
   1      86       23.7               10      75       36.2
   5      82       34.3                5      81       31.6
   8      86       35.8                6      74       29.0
  10      84       38.0                8      87       34.0
   0      75       22.2                4      79       30.1
   1      80       23.1                6      94       33.9
   6      83       30.0                3      70       28.2
   6      91       33.0                3      89       30.0

Page 14: Econ 3790: Business and Economics Statistics

Multiple Regression Model

Suppose we believe that salary (y) is related to the years of experience (x1) and the score on the programmer aptitude test (x2) by the following regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where
y = annual salary ($1000s)
x1 = years of experience
x2 = score on programmer aptitude test

Page 15: Econ 3790: Business and Economics Statistics

Solving for β0, β1 and β2:

      A             B           C
38
39                  Coeffic.    Std. Err.
40    Intercept     3.17394     6.15607
41    Experience    1.4039      0.19857
42    Test Score    0.25089     0.07735
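The coefficient estimates in the output above come from least squares. As a minimal sketch of how such estimates can be obtained, the block below solves the normal equations for the 20-programmer sample using only the Python standard library. The data lists are transcribed from the data slide; the function names `solve` and `fit_ols` are illustrative assumptions, and this is not necessarily how the spreadsheet output in the deck was produced.

```python
# Programmer salary sample, transcribed from the data slide:
# experience (years), aptitude test score, annual salary ($1000s).
exper = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
         88, 73, 75, 81, 74, 87, 79, 94, 70, 89]
salary = [24, 43, 23.7, 34.3, 35.8, 38, 22.2, 23.1, 30, 33,
          38, 26.6, 36.2, 31.6, 29, 34, 30.1, 33.9, 28.2, 30]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(columns, y):
    """Return least-squares coefficients [b0, b1, ..., bp] via the normal equations."""
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b1, b2 = fit_ols([exper, score], salary)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}, b2 = {b2:.5f}")
```

The printed values can be compared against the Coeffic. column of the output. In practice a statistics package would normally be used; the normal equations are written out here only to keep the sketch self-contained.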

Page 16: Econ 3790: Business and Economics Statistics

Anova Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F-statistic
Regression            500.34           ……                   ……..          ……….
Error                 ……..             …….                  …….
Total                 599.8            ……..

Page 17: Econ 3790: Business and Economics Statistics

Estimated Regression Equation

SALARY = 3.174 + 1.404(EXPER) + 0.251(SCORE)

b1 = 1.404 implies that salary is expected to increase by $1,404 for each additional year of experience (when the variable score on programmer aptitude test is held constant).

b2 = 0.251 implies that salary is expected to increase by $251 for each additional point scored on the programmer aptitude test (when the variable years of experience is held constant).

Page 18: Econ 3790: Business and Economics Statistics

Prediction

Suppose Bob had 4 years of experience and a score of 78 on the aptitude test. What would you estimate (or expect) his salary to be?

$$\hat{y} = 3.174 + 1.404(4) + 0.251(78) \approx 28.358$$

Bob's estimated salary is $28,358. (This figure appears to come from the unrounded coefficients in the regression output; the rounded coefficients shown give 28.368.)

Page 19: Econ 3790: Business and Economics Statistics

Error

Bob's actual salary is $24,000. How much error did we make in estimating his salary based on his experience and score?

$$\text{error} = y - \hat{y} = 24000 - 28358 = -4358$$

So we overestimated Bob's salary by $4,358.
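The prediction and error for Bob can be reproduced in a few lines. This sketch uses the coefficient values as reported in the regression output (3.17394, 1.4039, 0.25089) rather than the three-decimal versions, so the result may differ from the slide's figure in the last digit. The function name is an illustrative assumption.

```python
# Coefficients as reported in the regression output (intercept, experience, test score).
B0, B1, B2 = 3.17394, 1.4039, 0.25089


def predict_salary(exper_years, test_score):
    """Estimated annual salary in $1000s (hypothetical helper)."""
    return B0 + B1 * exper_years + B2 * test_score


y_hat = predict_salary(4, 78)   # Bob: 4 years of experience, score of 78
y_actual = 24.0                 # Bob's actual salary in $1000s
error = y_actual - y_hat        # a negative value means we overestimated

print(f"estimated salary: {y_hat:.3f} (thousands)")
print(f"error (actual - estimated): {error:.3f} (thousands)")
```

A negative error matches the slide's conclusion that the model overestimates Bob's salary.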

Page 20: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

Page 21: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

$$R^2 = \frac{SSR}{SST}$$

$R^2$ = 500.3285/599.7855 = .83418

Adjusted Multiple Coefficient of Determination

$$R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

$$R_a^2 = 1 - (1 - .834179)\,\frac{20 - 1}{20 - 2 - 1} = .814671$$
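The arithmetic on this slide can be checked with a short sketch. It takes SSR and SST from the slide, with n = 20 programmers and p = 2 independent variables, and also computes the standard error of estimate defined earlier in the deck. The function name `fit_summary` is an illustrative assumption.

```python
import math


def fit_summary(ssr, sst, n, p):
    """Return (R^2, adjusted R^2, standard error of estimate)."""
    sse = sst - ssr                                # SST = SSR + SSE
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    s = math.sqrt(sse / (n - p - 1))               # s = sqrt(MSE)
    return r2, adj_r2, s


r2, adj_r2, s = fit_summary(ssr=500.3285, sst=599.7855, n=20, p=2)
print(f"R^2 = {r2:.6f}, adjusted R^2 = {adj_r2:.6f}, s = {s:.4f}")
```

Adjusted R² is smaller than R² because it penalizes the number of independent variables relative to the sample size.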

Page 22: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Hypotheses

$$H_0: \beta_i = 0$$

$$H_a: \beta_i \neq 0$$

Test Statistic

$$t = \frac{b_i}{SE(b_i)}$$

Rejection Rule

Reject $H_0$ if $p$-value $< \alpha$, or if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a $t$ distribution with $n - p - 1$ degrees of freedom.

Page 23: Econ 3790: Business and Economics Statistics

Example

Hypotheses

$$H_0: \beta_1 = 0$$

$$H_a: \beta_1 \neq 0$$

Rejection Rule

For $\alpha$ = .05 and d.f. = 17, $t_{.025}$ = 2.11

Reject $H_0$ if $p$-value $<$ .05 or if $|t| >$ 2.11

Test Statistic

$$t = \frac{b_1}{SE(b_1)} = \frac{1.404}{0.199} = 7.07$$

(The value 7.07 matches the unrounded output values, 1.4039/0.19857; the rounded values shown give about 7.06.)

Since $t$ = 7.07 > $t_{.025}$ = 2.11, we reject $H_0$.
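The same test can be applied to both slope coefficients. The sketch below divides each estimate by its standard error, using the values from the regression output, and compares the result with the critical value 2.11 quoted on the slide. The critical value is taken from the slide rather than computed, and the dictionary layout and function name are illustrative assumptions.

```python
# (estimate, standard error) pairs from the regression output.
coefficients = {
    "Experience": (1.4039, 0.19857),
    "Test Score": (0.25089, 0.07735),
}
T_CRITICAL = 2.11  # t_.025 with 17 d.f., as quoted on the slide


def t_statistic(estimate, std_err):
    """t = b_i / SE(b_i) for testing H0: beta_i = 0."""
    return estimate / std_err


results = {}
for name, (b, se) in coefficients.items():
    t = t_statistic(b, se)
    reject = abs(t) > T_CRITICAL  # two-tailed rejection rule
    results[name] = (t, reject)
    print(f"{name}: t = {t:.2f}, reject H0: {reject}")
```

Under these numbers, both coefficients exceed the critical value in absolute terms, so each null hypothesis is rejected at the .05 level.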

Page 24: Econ 3790: Business and Economics Statistics

Testing for Significance of Regression: F Test

Hypotheses

$H_0$: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$

$H_a$: One or more of the parameters is not equal to zero.

Test Statistic

$$F = \frac{MSR}{MSE}$$

Rejection Rule

Reject $H_0$ if $p$-value $< \alpha$ or if $F > F_\alpha$, where $F_\alpha$ is based on an $F$ distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.

Page 25: Econ 3790: Business and Economics Statistics

Example

Hypotheses

$H_0$: $\beta_1 = \beta_2 = 0$

$H_a$: One or both of the parameters is not equal to zero.

Rejection Rule

For $\alpha$ = .05 and d.f. = 2, 17; $F_{.05}$ = 3.59

Reject $H_0$ if $p$-value $<$ .05 or $F >$ 3.59

Test Statistic

$F$ = MSR/MSE = 250.17/5.86 = 42.8

(The value 42.8 matches the unrounded mean squares; the rounded values shown give about 42.7.)

$F$ = 42.8 > $F_{.05}$ = 3.59, so we can reject $H_0$.
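The F statistic follows directly from the sums of squares. This sketch derives SSE, the mean squares, and F from the SSR and SST values given on the coefficient of determination slide, with n = 20 and p = 2, and applies the rejection rule with the critical value 3.59 quoted above. The function name `f_test` is an illustrative assumption, and the critical value is taken from the slide rather than computed.

```python
def f_test(ssr, sst, n, p):
    """Return (MSR, MSE, F) for the overall significance test."""
    sse = sst - ssr              # SST = SSR + SSE
    msr = ssr / p                # regression d.f. = p
    mse = sse / (n - p - 1)      # error d.f. = n - p - 1
    return msr, mse, msr / mse


msr, mse, f_stat = f_test(ssr=500.3285, sst=599.7855, n=20, p=2)
F_CRITICAL = 3.59  # F_.05 with (2, 17) d.f., as quoted on the slide
reject = f_stat > F_CRITICAL

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}, reject H0: {reject}")
```

The same quantities fill in the regression and error rows of the Anova table shown earlier in the deck.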