chapter 8 – 1 chapter 8: bivariate regression and correlation overview the scatter diagram two...

45
Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient Bivariate Linear Regression Line SPSS Output Interpretation Covariance

Upload: polly-stafford

Post on 24-Dec-2015

238 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 1

Chapter 8: Bivariate Regression and Correlation

• Overview• The Scatter Diagram• Two Examples: Education & Prestige• Correlation Coefficient• Bivariate Linear Regression Line• SPSS Output• Interpretation• Covariance

Page 2: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 2

Overview

Int

erva

l

N

omin

al

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Considers the distribution of one variable across the categories of another variable

Considers the difference between the mean of one group on a variable with another group

Considers how a change in a variable affects a discrete outcome

Considers the degree to which a change in one variable results in a change in another

Page 3: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 3

Overview

Int

erva

l

N

omin

al

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Considers the difference between the mean of one group on a variable with another group

Considers how a change in a variable affects a discrete outcome

Considers the degree to which a change in one variable results in a change in another

You already know how to deal with two

nominal variables

Lambda

Page 4: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 4

Overview

Int

erva

l

N

omin

al

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Considers how a change in a variable affects a discrete outcome

Considers the degree to which a change in one variable results in a change in another

You already know how to deal with two

nominal variables

Lambda

Confidence IntervalsT-Test

We will deal with this later in the course

TODAY!

Page 5: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 5

Overview

Int

erva

l

N

omin

al

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Considers how a change in a variable affects a discrete outcome

RegressionCorrelation

You already know how to deal with two

nominal variables

Lambda

Confidence IntervalsT-Test

We will deal with this later in the course TODAY!

What about this cell?

Page 6: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 6

Overview

Int

erva

l

N

omin

al

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Logistic Regression

RegressionCorrelation

You already know how to deal with two

nominal variables

Lambda

Confidence IntervalsT-Test

We will deal with this later in the course TODAY!

This cell is not covered in this course

Page 7: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 7

General Examples

Does a change in one variable significantly affect another variable?

Do two scores tend to co-vary positively (high on one score high on the other, low on one, low on the other)?

Do two scores tend to co-vary negatively (high on one score low on the other; low on one, hi on the other)?

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Page 8: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 8

Specific Examples

Does getting older significantly influence a person’s political views?

Does marital satisfaction increase with length of marriage?

How does an additional year of education affect one’s earnings?

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Inte

rval

Nom

inal

Dep

ende

ntV

aria

ble

Independent Variables

Nominal Interval

Page 9: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 9

Scatter Diagrams

• Scatter Diagram (scatterplot)—a visual method used to display a relationship between two interval-ratio variables.

• Typically, the independent variable is placed on the X-axis (horizontal axis), while the dependent variable is placed on the Y-axis (vertical axis.)

Page 10: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 10

Scatter Diagram Example

• The data…

Page 11: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 11

Scatter Diagram Example

Page 12: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 12

A Scatter Diagram Example of a Negative Relationship

Page 13: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 13

Linear Relationships• Linear relationship – A relationship between two

interval-ratio variables in which the observations displayed in a scatter diagram can be approximated with a straight line.

• Deterministic (perfect) linear relationship – A relationship between two interval-ratio variables in which all the observations (the dots) fall along a straight line. The line provides a predicted value of Y (the vertical axis) for any value of X (the horizontal axis.

Page 14: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 14

Graph the data below and examine the relationship:

Page 15: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 15

The Seniority-Salary Relationship

Page 16: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 16

Example: Education & Prestige

Does education predict occupational prestige? If so, then the higher the respondent’s level of education, as measured by number of years of schooling, the greater the prestige of the respondent’s occupation.Take a careful look at the scatter diagram on the next slide and see if you think that there exists a relationship between these two variables…

Page 17: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 17

Scatterplot of Prestige by Education

Page 18: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 18

Example: Education & Prestige

• The scatter diagram data can be represented by a straight line, therefore there does exist a relationship between these two variables.

• In addition, since occupational prestige becomes higher, as years of education increases, we can say also that the relationship is a positive one.

Page 19: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 19

Take your best guess?

The mean age for U.S. residents.

Now if I tell you that this person owns a skateboard, would you change your guess? (Of course!)

With quantitative analyses we are generally trying to predict or take our best guess at value of the dependent variable. One way to assess the relationship between two variables is to consider the degree to which the extra information of the second variable makes your guess better. If someone owns a skateboard, that is likely to indicate to us that s/he is younger and we may be able to guess closer to the actual value.

If you know nothing else about a person, except that he or she lives in United States and I asked you to his or her age, what would you guess?

Page 20: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 20

Take your best guess?

• Similar to the example of age and the skateboard, we can take a much better guess at someone’s occupational prestige, if we have information about her/his years or level of education.

Page 21: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 21

Equation for a Straight LineY= a + bX

where a = interceptb = slopeY = dependent variableX = independent variable

X

Y

a

riserunrise run = b

Page 22: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 22

Bivariate Linear Regression EquationY = a + bX

• Y-intercept (a)—The point where the regression line crosses the Y-axis, or the value of Y when X=0.

• Slope (b)—The change in variable Y (the dependent variable) with a unit change in X (the independent variable.)

The estimates of a and b will have the property that the sum of the squared differences between the observed and predicted (Y-Y)2 is minimized using ordinary least squares (OLS). Thus the regression line represents the Best Linear and Unbiased Estimators (BLUE) of the intercept and slope.

ˆ

^

Page 23: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 23

Model Summary

.559a .313 .312 11.90Model1

R R SquareAdjusted R

Square

Std. Errorof the

Estimate

Predictors: (Constant), HIGHEST YEAR OF SCHOOLCOMPLETED

a.

ANOVAb

87633.680 1 87633.680 619.305 .000a

192727.378 1362 141.503

280361.058 1363

Regression

Residual

Total

Model1

Sum ofSquares df

MeanSquare F Sig.

Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETEDa.

Dependent Variable: RS OCCUPATIONAL PRESTIGE SCORE (1980)b.

SPSS Regression Output (GSS)Education & Prestige

Page 24: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 24

Coefficientsa

6.120 1.531 3.997 .000

2.762 .111 .559 24.886 .000

(Constant)

HIGHEST YEAR OFSCHOOL COMPLETED

Model1

B Std. Error

UnstandardizedCoefficients

Beta

Standardized

Coefficients

t Sig.

Dependent Variable: RS OCCUPATIONAL PRESTIGE SCORE (1980)a.

Now let’s interpret the SPSS output...

SPSS Regression Output (GSS)Education & Prestige

Page 25: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 25

The Regression EquationCoefficientsa

6.120 1.531 3.997 .000

2.762 .111 .559 24.886 .000

(Constant)

HIGHEST YEAR OFSCHOOL COMPLETED

Model1

B Std. Error

UnstandardizedCoefficients

Beta

Standardized

Coefficients

t Sig.

Dependent Variable: RS OCCUPATIONAL PRESTIGE SCORE (1980)a.

Prediction Equation:

Y = 6.120 + 2.762(X)This line represents the predicted values for Y for any and all values of X

ˆ

Page 26: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 26

Prediction Equation:

Y = 6.120 + 2.762(X)This line represents the predicted values for Y for any and all values of X

ˆ

Coefficientsa

6.120 1.531 3.997 .000

2.762 .111 .559 24.886 .000

(Constant)

HIGHEST YEAR OFSCHOOL COMPLETED

Model1

B Std. Error

UnstandardizedCoefficients

Beta

Standardized

Coefficients

t Sig.

Dependent Variable: RS OCCUPATIONAL PRESTIGE SCORE (1980)a.

The Regression Equation

Page 27: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 27

• If a respondent had zero years of schooling, this model predicts that his occupational prestige score would be 6.120 points.

• For each additional year of education, our model predicts a 2.762 point increase in occupational prestige.

Y = 6.120 + 2.762(X)ˆ

Interpreting the regression equation

Page 28: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 28

Ordinary Least Squares

• Least-squares line (best fitting line) – A line where the errors sum of squares, or e2, is at a minimum.

• Least-squares method – The technique that produces the least squares line.

Page 29: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 29

Estimating the slope: b

• The bivariate regression coefficient or the slope of the regression line can be obtained from the observed X and Y scores.

)(

)()(

1

)(1

)()(

222XX

YYXX

N

XXN

YYXX

S

SbX

YX

Page 30: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 30

Covariance =

Variance of X =

Covariance of X and Y—a measure of how X and Y vary together. Covariance will be close to zero when X and Y are unrelated. It will be greater than zero when the relationship is positive and less than zero when the relationship is negative.

Variance of X—we have talked a lot about variance in the dependent variable. This is simply the variance for the independent variable

Covariance and Variance

1

)()(

N

YYXX

1

)(

1

))(( 2

N

XX

N

XXXX

Page 31: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 31

Estimating the Intercept

XbYa

The regression line always goes through the point corresponding to the mean of both X and Y, by definition.

So we utilize this information to solve for a:

Page 32: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 32

Back to the original scatterplot:

Page 33: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 33

A Representative Line

Page 34: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 34

Other Representative Lines

Page 35: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 35

Calculating the

Regression Equation

Page 36: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 36

Calculating the

Regression Equation

Page 37: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 37

The Least Squares Line!

Page 38: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 38

Summary: Properties of the Regression Line

• Represents the predicted values for Y for any and all values of X.

• Always goes through the point corresponding to the mean of both X and Y.

• It is the best fitting line in that it minimizes the sum of the squared deviations.

• Has a slope that can be positive or negative; null hypothesis is that the slope is zero.

Page 39: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 39

Coefficient of Determination

• Coefficient of Determination (r2) – A PRE measure reflecting the proportional reduction of error that results from using the linear regression model. It reflects the proportion of the total variation in the dependent variable, Y, explained by the independent variable, X.

Page 40: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 40

Coefficient of Determination

22

222

)]()][var([var

)],([cov

yx

yx

SS

S

YianceXiance

YXariancer

Page 41: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 41

Coefficient of Determination

Page 42: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 42

• Pearson’s Correlation Coefficient (r) — The square root of r2. It is a measure of association between two interval-ratio variables.

• Symmetrical measure—No specification of independent or dependent variables.

• Ranges from –1.0 to +1.0. The sign () indicates direction. The closer the number is to 1.0 the stronger the association between X and Y.

)Y.(d.s*)X.(d.s

)Y,X(covr

The Correlation Coefficient

Page 43: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 43

r = 0 means that there is no association between the two variables.

The Correlation Coefficient

Y

X

r = 0

Page 44: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 44

The Correlation Coefficient

Y

X

r = +1

r = 0 means that there is no association between the two variables.

r = +1 means a perfect positive correlation.

Page 45: Chapter 8 – 1 Chapter 8: Bivariate Regression and Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient

Chapter 8 – 45

The Correlation Coefficient

Y

X

r = –1

r = 0 means that there is no association between the two variables.

r = +1 means a perfect positive correlation.

r = –1 means a perfect negative correlation.