lesson04

IBS Statistics Year 1 Dr. Ning DING [email protected] I.007

Upload: ning-ding

Post on 03-Dec-2014


DESCRIPTION

Statistics for International Business School, Hanze University of Applied Science, Groningen, The Netherlands

TRANSCRIPT

Page 1: Lesson04

IBS Statistics Year 1

Dr. Ning DING [email protected]

Page 2: Lesson04

What are we going to learn?

• Review

• Chapter 12: Simple Regression and Correlation
– dependent / independent variables
– scatter diagrams
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 3: Lesson04

Review

Find the interquartile range:
1460 1471 1637 1721 1758 1787 1940 2038 2047 2054 2097 2205 2287 2311 2406

Interquartile Range = Q3 - Q1 = 2205 - 1721 = 484
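The interquartile range above can be checked with a short Python sketch. It uses the location rule L = (n + 1) * p that appears in the Excel review slide; the linear interpolation for fractional locations and the function name `quantile_by_location` are assumptions made for illustration, not something the slides specify.

```python
def quantile_by_location(sorted_values, fraction):
    """Return the value at location L = (n + 1) * fraction in ordered data.

    Fractional locations are resolved by linear interpolation between the
    two neighbouring values (an assumed convention).
    """
    n = len(sorted_values)
    location = (n + 1) * fraction      # 1-based position in the ordered data
    lower = int(location)              # whole part of the position
    weight = location - lower          # fractional part of the position
    if lower < 1:
        return sorted_values[0]
    if lower >= n:
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + weight * (above - below)


data = sorted([1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038,
               2047, 2054, 2097, 2205, 2287, 2311, 2406])

q1 = quantile_by_location(data, 0.25)   # location (15 + 1) * 0.25 = 4
q3 = quantile_by_location(data, 0.75)   # location (15 + 1) * 0.75 = 12
iqr = q3 - q1

print(q1, q3, iqr)
```

With 15 values the two locations are whole numbers (4 and 12), so no interpolation is needed here and the sketch reproduces Q1 = 1721, Q3 = 2205 and an interquartile range of 484.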

Page 4: Lesson04


Review EXCEL Lesson

L=(8+1)*25%=2.25

Q1=133.5

L=(8+1)*75%=6.75

Q3=274.5

Interquartile Range=274.5-133.5=141

Page 5: Lesson04

Review

Boxplot

Data: 1 2 2 4 5 7 8 9 12

Median = 5 (lower half: 1 2 2 4; upper half: 7 8 9 12)

Quartiles: Q1 = 2, Q3 = 8.5

Interquartile Range = Q3 - Q1 = 8.5 - 2 = 6.5

Deciles: 1st D, 9th D

Percentile

http://cnx.org/content/m11192/latest/

How to interpret?

Page 6: Lesson04

The distribution is skewed to __________ because the mean is __________ the median.

Answer: to the right; larger than

Boxplot values: range € 20 to € 2000, Q1 = € 250, Median = € 350, Q3 = € 850, Mean = € 450

http://cnx.org/content/m11192/latest/

Review

Page 7: Lesson04

Data set 1: 0.8 1.0 1.0 1.2 1.2 1.3 1.5 1.7 2.0 2.0 2.1 2.2 4.0

Mean > Median: positively skewed

Data set 2: 2.0 3.2 3.6 3.7 4.0 4.2 4.2 4.5 4.5 4.6 4.8 5.0 5.0

Mean < Median: negatively skewed

http://qudata.com/online/statcalc/
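The mean-versus-median rule can be tried on the two data sets from this slide. The following is a minimal Python sketch; the helper name `skew_direction` is hypothetical, and the rule is only the rough rule of thumb used in the lesson, not a formal skewness statistic.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify skew by comparing the mean with the median."""
    m = mean(values)
    md = median(values)
    if m > md:
        return "positively skewed"
    if m < md:
        return "negatively skewed"
    return "no skew indicated"


set_1 = [0.8, 1.0, 1.0, 1.2, 1.2, 1.3, 1.5, 1.7, 2.0, 2.0, 2.1, 2.2, 4.0]
set_2 = [2.0, 3.2, 3.6, 3.7, 4.0, 4.2, 4.2, 4.5, 4.5, 4.6, 4.8, 5.0, 5.0]

for name, values in (("set 1", set_1), ("set 2", set_2)):
    print(name, round(mean(values), 2), median(values), skew_direction(values))
```

In the first set the single large value 4.0 pulls the mean (about 1.69) above the median (1.5); in the second set the small value 2.0 pulls the mean (4.1) below the median (4.2).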

Review

Page 8: Lesson04

This means that the data is symmetrically distributed.

Zero skewness

mode=median=mean


Review

Page 9: Lesson04

Chapter 12

– scatter diagrams
– dependent / independent variables
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 10: Lesson04

Regression and Correlation Analyses

– How to determine both the nature and the strength of a relationship between variables.


Page 11: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

Describing Relationship between Two Variables – Scatter Diagram Examples

Positive correlation

Page 12: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

Negative correlation

Page 13: Lesson04

Regression and Correlation Analyses

Scatter Diagram:

No correlation

Page 14: Lesson04

Regression and Correlation Analyses

Scatter Diagrams:
• Patterns indicating that the variables are related
• If related, we can describe the relationship

Strong & Positive correlation

Strong & Negative correlation

Weak & Positive correlation

Weak & Negative correlation

No correlation

Page 15: Lesson04

Regression and Correlation Analyses

Variables:
– Independent variables: known
– Dependent variables: to predict

Scatter diagram axis labels: Independent Variable, Dependent Variable

Page 16: Lesson04


Regression and Correlation Analyses

Correlation & Cause and Effect?

• The relationships found by regression are relationships of association

• Not necessarily of cause and effect.

Page 17: Lesson04


Page 18: Lesson04

Least-squares estimating equation:
• The dependent variable Y is determined by the independent variable X

Ŷ = a + bX

Graph axes: X = Independent Variable, Y = Dependent Variable

Page 19: Lesson04

Ŷ = a + bX


Least-squares estimating equation:

Page 20: Lesson04

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$$

Ŷ = a + bX

a = Ȳ - bX̄
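The slope formula b = (ΣXY - n·X̄·Ȳ) / (ΣX² - n·X̄²) and the intercept formula a = Ȳ - b·X̄ can be sketched in Python as follows. The function name `least_squares_line` and the small data set are hypothetical, chosen only so the result is easy to verify by hand.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b*X fitted by least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of X*Y
    sum_x2 = sum(x * x for x in xs)               # sum of X squared
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

a, b = least_squares_line(xs, ys)
print(a, b)
```

Because the illustrative points lie exactly on a line, the fitted intercept and slope recover that line (a = 1, b = 2).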


Least-squares estimating equation:

Page 21: Lesson04

Least-squares estimating equation:

What is the relationship between the age of a truck (X) and the annual repair expense (Y)?

X̄ = 3, Ȳ = 6

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{78 - 4 \cdot 3 \cdot 6}{44 - 4 \cdot 9} = 0.75$$

a = Ȳ - bX̄ = 6 - 0.75*3 = 3.75

Ŷ = 3.75 + 0.75 X

If the city has a truck that is 4 years old, the director could use the equation to predict $675 annually in repairs (reading Ŷ in hundreds of dollars):

Ŷ = 3.75 + 0.75 * 4 = 6.75
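The truck example can be reproduced from the summary values shown on the slide (ΣXY = 78, ΣX² = 44, n = 4, X̄ = 3, Ȳ = 6). The sketch below is in Python; the function name is hypothetical, and reading Ŷ in hundreds of dollars is an assumption inferred from the slide pairing 6.75 with $675.

```python
def slope_from_sums(sum_xy, sum_x2, n, mean_x, mean_y):
    """Slope b = (sum XY - n*mean_x*mean_y) / (sum X^2 - n*mean_x^2)."""
    return (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)


# Summary values taken from the slide.
n, mean_x, mean_y = 4, 3, 6
sum_xy, sum_x2 = 78, 44

b = slope_from_sums(sum_xy, sum_x2, n, mean_x, mean_y)   # (78 - 72) / (44 - 36)
a = mean_y - b * mean_x                                   # 6 - 0.75 * 3

truck_age = 4
predicted = a + b * truck_age          # Y-hat for a 4-year-old truck
predicted_dollars = predicted * 100    # assumes Y is in hundreds of dollars

print(b, a, predicted, predicted_dollars)
```

Working from the sums alone is enough here, because the slope and intercept formulas only need n, the two means, ΣXY and ΣX².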

Page 22: Lesson04

Least-squares estimating equation:

Example:
• To find the simple linear regression of Personal Income (X) and Auto Sales (Y)
• If X = 64, what about Y?

Step 1: Count the number of values.
N = 5

Step 2: Find XY and X² for each observation (see the table below).

Page 23: Lesson04

Least-squares estimating equation:

Step 3: Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311, Mean = 62.2
ΣY = 18.6, Mean = 3.72
ΣXY = 1159.7
ΣX² = 19359

Step 4: Substitute in the slope formula.

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{1159.7 - 5 \cdot 62.2 \cdot 3.72}{19359 - 5 \cdot 62.2 \cdot 62.2} = 0.19$$

Page 24: Lesson04

Least-squares estimating equation:

Slope(b) = 0.19

Step 5: Now substitute in the intercept formula.
Intercept(a) = Ȳ - bX̄ = 3.72 - 0.19 * 62.2 = -8.098

Step 6: Then substitute these values in the regression equation formula.
Regression Equation: Ŷ = a + bX
Ŷ = -8.098 + 0.19X

Suppose we want to know the approximate Y value for X = 64. We substitute the value in the above equation.

Regression Equation: Ŷ = a + bX
= -8.098 + 0.19(64)
= -8.098 + 12.16
= 4.06
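The Personal Income and Auto Sales example can be replayed from the sums on the slides (N = 5, ΣXY = 1159.7, ΣX² = 19359, means 62.2 and 3.72). The Python sketch below also shows, as an illustration not present in the slides, what changes when the slope is kept unrounded instead of being rounded to 0.19 before computing the intercept.

```python
# Summary values taken from the slides.
n = 5
mean_x, mean_y = 62.2, 3.72
sum_xy, sum_x2 = 1159.7, 19359

# Slope from the least-squares formula, before any rounding.
b_exact = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x * mean_x)

# The slides round the slope to two decimals and carry on from there.
b_rounded = round(b_exact, 2)
a_rounded = mean_y - b_rounded * mean_x
y_rounded = a_rounded + b_rounded * 64

# Same steps with the unrounded slope, for comparison.
a_exact = mean_y - b_exact * mean_x
y_exact = a_exact + b_exact * 64

print(round(b_exact, 4), b_rounded)
print(round(a_rounded, 3), round(y_rounded, 2))
print(round(a_exact, 3), round(y_exact, 2))
```

Rounding the slope early shifts the intercept noticeably (about -8.098 versus about -7.964), but the prediction at X = 64 is about 4.06 in both cases, because X = 64 is close to the mean of X, where the fitted line always passes through (X̄, Ȳ).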

Page 25: Lesson04

Least-squares estimating equation:

• to minimize the sum of the squares of the errors
• to measure the goodness of fit of a line

e_i = residual_i = Y_i - Ŷ_i

Figure panels: Strong correlation (SE), Weak correlation (SE)
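To make "minimize the sum of the squares of the errors" concrete, the Python sketch below computes the residuals e_i = Y_i - Ŷ_i for two candidate lines and compares their sums of squared errors. The data set, the candidate line and the function names are hypothetical; the standard error of estimate is computed with the common n - 2 divisor, which is an assumption since the slides do not show the formula.

```python
import math


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared residuals e_i = y_i - (a + b*x_i) for a given line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def standard_error_of_estimate(xs, ys, a, b):
    """sqrt(SSE / (n - 2)); assumes the usual two-parameter divisor."""
    return math.sqrt(sum_squared_errors(xs, ys, a, b) / (len(xs) - 2))


# Hypothetical data; its least-squares line is Y-hat = 2.2 + 0.6X.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sse_best = sum_squared_errors(xs, ys, 2.2, 0.6)    # least-squares line
sse_other = sum_squared_errors(xs, ys, 2.0, 0.7)   # some other plausible line
se = standard_error_of_estimate(xs, ys, 2.2, 0.6)

print(round(sse_best, 2), round(sse_other, 2), round(se, 3))
```

Any line other than the least-squares line gives a larger sum of squared errors on the same data. A smaller standard error means the points sit closer to the line, which is what the strong-correlation panel illustrates.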

Page 26: Lesson04


Page 27: Lesson04

Correlation Analysis:
Describes the degree to which one variable is linearly related to another.

Coefficient of Determination (r²):
Measures the extent, or strength, of the association that exists between two variables.

Coefficient of Correlation (r):
Square root of the coefficient of determination.

Page 28: Lesson04

Coefficient of Determination (r²):
Measures the extent, or strength, of the association that exists between two variables.

• 0 ≤ r² ≤ 1.
• The larger r², the stronger the linear relationship.
• The closer r² is to 1, the more confident we are in our prediction.

$$r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$
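The shortcut formula r² = (aΣY + bΣXY - nȲ²) / (ΣY² - nȲ²) can be checked against the definition of r² as the share of the variation in Y explained by the line. The Python sketch below does both on a small hypothetical data set; the function names are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y-hat = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    return mean_y - b * mean_x, b


def r_squared_shortcut(xs, ys):
    """r^2 = (a*sum Y + b*sum XY - n*mean_y^2) / (sum Y^2 - n*mean_y^2)."""
    a, b = fit_line(xs, ys)
    n = len(ys)
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_y2 = sum(y * y for y in ys)
    return (a * sum(ys) + b * sum_xy - n * mean_y ** 2) / (sum_y2 - n * mean_y ** 2)


def r_squared_from_errors(xs, ys):
    """r^2 = 1 - (unexplained variation) / (total variation)."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total


# Hypothetical data, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2_shortcut = r_squared_shortcut(xs, ys)
r2_errors = r_squared_from_errors(xs, ys)
print(round(r2_shortcut, 4), round(r2_errors, 4))
```

Both routes give the same value on this data (0.6), meaning the fitted line accounts for 60% of the variation in Y. The two only agree when a and b are the least-squares values.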

Page 29: Lesson04

Coefficient of Determination (r²):

$$r^2 = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$

Page 30: Lesson04

Coefficient of Correlation (r):
Square root of the coefficient of determination.
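Taking the square root of r² gives only the size of r. Its sign follows the direction of the relationship, that is, the sign of the slope b. The Python sketch below shows this step; the function name and the input values are hypothetical.

```python
import math


def correlation_from_determination(r_squared, slope):
    """Return r: the square root of r^2 carrying the sign of the slope b."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    return math.copysign(math.sqrt(r_squared), slope)


# Hypothetical values: the same r^2 with an upward and a downward slope.
r_up = correlation_from_determination(0.6, 0.6)
r_down = correlation_from_determination(0.6, -0.6)

print(round(r_up, 4), round(r_down, 4))
```

The two results have the same magnitude and opposite signs. The strength of a correlation is judged by the magnitude of r, not by its sign.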

Page 31: Lesson04

Review

Which value of r indicates a stronger correlation than 0.40?
A. -0.30
B. -0.50
C. +0.38
D. 0

If all the points on a scatter diagram lie on a straight line, what is the standard error of estimate?
A. -1
B. +1
C. 0
D. Infinity

Page 32: Lesson04

Review

In the least squares equation Ŷ = 10 + 20X, the value of 20 indicates
A. the Y intercept.
B. for each unit increase in X, Y increases by 20.
C. for each unit increase in Y, X increases by 20.
D. none of these.

Page 33: Lesson04

Review

A sales manager for an advertising agency believes there is a relationship between the number of contacts and the amount of the sales. To verify this belief, the following data was collected:

What is the Y-intercept of the linear equation?
A. -12.201
B. 2.1946
C. -2.1946
D. 12.201

Page 34: Lesson04

What have we learnt?

– scatter diagrams
– dependent / independent variables
– regression analysis
– Least-squares estimating equation
– the coefficient of determination
– the coefficient of correlation

Page 35: Lesson04