assignment 9 slides (to post)

Multiple Correlation and Regression Psychology 3800, Lab 002

Upload: lveselka

Post on 27-Jan-2015


TRANSCRIPT

Page 1: Assignment 9  slides (to post)

Multiple Correlation and Regression

Psychology 3800, Lab 002

Page 2: Assignment 9  slides (to post)

•  chi-square test feedback

•  overview of multivariate regression

•  methods for multiple regression

•  example analysis

•  the assignment

!"#$%&#'()*#+,,-.#

Page 3: Assignment 9  slides (to post)

Assignment 7: Feedback

•  list of commonly made errors is up on the lab blog
•  great work with the actual statistics, but definitely check out the pointers on summarizing and describing your findings

http://uwo3800g.tumblr.com/post/80570904491/assignment-7-commonly-made-errors

Page 4: Assignment 9  slides (to post)

Multiple Regression: Overview

Page 5: Assignment 9  slides (to post)

•  extension of bivariate regression from last week:
   → in both cases, only one criterion (y-variable)
   → now, looking at contribution of multiple predictors (x-variables)

Multiple Regression

•  still talking about linear models (capturing relationships between variables using straight lines)

Page 6: Assignment 9  slides (to post)

Recall the prediction equation from last week (bivariate regression):

ŷ = b0 + b1(x)

ŷ = predicted score on the criterion variable
b0 = constant
b1 = slope
x = given value of interest on the predictor variable

(b0 and b1 are the unstandardized coefficients from SPSS)

Multiple Regression: Prediction Equation

Page 7: Assignment 9  slides (to post)

Multiple Regression: Prediction Equation

In a bivariate situation, knowing the slope and intercept of a line allowed us to depict it graphically in order to understand how it could yield expected values on our criterion (y) using information about our predictor (x).

Page 8: Assignment 9  slides (to post)

Multiple Regression: Prediction Equation

•  with multiple predictors, we can no longer draw a single line of best fit
   o  we are now dealing with a regression “plane” (now have multiple axes leading to more complex graphs)
   o  good news: an equation can still be derived that adheres to the same basic principles as in the bivariate case

Page 9: Assignment 9  slides (to post)

In multiple regression, the regression equation (or model) has more predictors:

ŷ = b0 + b1x1 + b2x2 + b3x3 + ... etc.

ŷ = predicted score on the criterion variable

b1, b2, b3, etc. → unstandardized coefficients for each predictor from SPSS
   → gives info about change in criterion per unit change in each predictor, keeping all other variables in equation constant

x1, x2, x3, etc. = values of interest for each predictor (these are the values we plug into the equation)

Multiple Regression: Prediction Equation

Page 10: Assignment 9  slides (to post)

…In multiple regression, the prediction equation (or model) has more predictors:

ŷ = b0 + b1x1 + b2x2 + b3x3 + ... etc.

ŷ = predicted score on the criterion variable

b0 = constant: value of the criterion if all predictors are zero for a given sample (given that it is often not possible for all predictors to be set at zero, this value may take on a number that conceptually does not make sense)

Note: regression equations are sensitive to rounding, so use as many decimal places as possible in your equations

Multiple Regression: Prediction Equation
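In this lab the coefficients come straight from the SPSS output, but as an aside, here is a minimal sketch of how the same kind of model could be fit in Python with statsmodels, just to show where b0 and the b's come from. The data and variable values below are made up for illustration and are not the lab data:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))                                   # three predictors (x1, x2, x3)
y = 2.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=80)  # criterion

model = sm.OLS(y, sm.add_constant(X)).fit()
b0, b1, b2, b3 = model.params                                  # constant + unstandardized coefficients
print(model.params)

new_case = np.array([1.0, 0.2, -0.1, 0.4])                     # [1, x1, x2, x3] for one case of interest
print(new_case @ model.params)                                 # ŷ = b0 + b1x1 + b2x2 + b3x3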

Page 11: Assignment 9  slides (to post)

•  we only want to include variables that add significantly to prediction of the criterion variable in our prediction equation (i.e. those that contribute a significant amount of unique variance)

•  SPSS shows which predictors should be included in the model and which regression coefficients (b’s) are optimal for predicting the criterion

Multiple Regression: Prediction Equation

Page 12: Assignment 9  slides (to post)

In essence, we are interested in two things:

1) amount of unique variance in dependent variable (DV) accounted for by each predictor (IV1, IV2) alone

[Venn diagram: IV1 and IV2 each overlapping DV; the non-overlapping shaded regions represent the unique variance of IV1 on DV and the unique variance of IV2 on DV]

Multiple Regression: Variance

Page 13: Assignment 9  slides (to post)

In essence, we are interested in two things:

1) amount of unique variance in dependent variable (DV) accounted for by each predictor (IV1, IV2) alone

2) amount of total variance in DV accounted for by all predictors together

[Venn diagram: IV1 and IV2 overlapping DV; the total shaded area represents the variance of all IVs together (considered in assessing the model as a whole)]

Multiple Regression: Variance

Page 14: Assignment 9  slides (to post)

Multiple Regression: Methods

Page 15: Assignment 9  slides (to post)

•  default method in SPSS

•  researcher picks which variables get added into the regression equation and at which stage of analysis based on theoretical evidence

•  even non-significant variables get added in

@"2,=#8,2(>6#

Page 16: Assignment 9  slides (to post)

*most commonly used method of finding the best equation

How It Works…

(a) SPSS adds each predictor step-by-step to the equation, starting with the best predictor (highest correlation)

(b) SPSS checks whether newly added values significantly improve the prediction of the criterion

(c) after each step, it also removes any predictors that no longer account for a significant amount of unique variance in the DV

(d) each subsequent step starts over, adding in the variable that contributes the most UNIQUE variance until there are no more significant predictors

Stepwise Method
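For anyone who wants to see the logic of steps (a)–(d) in code form, here is a rough sketch using statsmodels. It is not SPSS's exact algorithm: the entry/removal p-value criteria and the column names ("sunshine", "laughter", "cuddles", "grumpy") are assumptions for illustration only.

import statsmodels.api as sm

ENTRY_P, REMOVAL_P = 0.05, 0.10        # assumed entry/removal criteria, not SPSS defaults

def fit(y, X):
    # ordinary least squares with an intercept
    return sm.OLS(y, sm.add_constant(X)).fit()

def stepwise(data, criterion, predictors):
    # `data` is a pandas DataFrame holding the criterion and predictor columns
    selected = []
    while True:
        remaining = [p for p in predictors if p not in selected]
        # (a)/(b): try each remaining predictor and keep the one with the smallest p-value
        trial_p = {p: fit(data[criterion], data[selected + [p]]).pvalues[p] for p in remaining}
        if not trial_p or min(trial_p.values()) >= ENTRY_P:
            break                      # (d): nothing left adds a significant amount of unique variance
        selected.append(min(trial_p, key=trial_p.get))
        # (c): after each step, drop predictors that no longer contribute significant unique variance
        pvals = fit(data[criterion], data[selected]).pvalues.drop("const")
        selected = [p for p in selected if pvals[p] < REMOVAL_P]
        # note: real implementations also guard against adding/removing the same variable in a cycle
    return fit(data[criterion], data[selected]), selected

# usage (with a DataFrame `cats` holding the hypothetical columns):
# final_model, kept = stepwise(cats, "grumpy", ["sunshine", "laughter", "cuddles"])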

Page 17: Assignment 9  slides (to post)

Stepwise Method: Example

Three predictors of interest:

(1) level of sunshine (1 = not sunny; 10 = very sunny)
(2) level of audible laughter (1 = no laughter; 10 = very audible laughter)
(3) level of cuddles (1 = no cuddles; 10 = many cuddles)

Criterion: grumpiness of Grumpy Cat sample (1 = not at all grumpy; 10 = extremely grumpy)

Page 18: Assignment 9  slides (to post)

[Venn diagram of the criterion Y and predictors X1, X2, X3]

Y = grumpiness (criterion)

X1 = laughter (predictor)

X2 = cuddles (predictor)

X3 = sunshine (predictor)

Stepwise Method: Example

Page 19: Assignment 9  slides (to post)

Model 1


level of laughter (X1) significantly improves prediction → variable is retained → analysis continues

Stepwise Method: Example

Page 20: Assignment 9  slides (to post)


Model 2

level of cuddles (X2) significantly improves prediction → adds unique variance beyond X1 → variable is retained → analysis continues

Stepwise Method: Example

Page 21: Assignment 9  slides (to post)


Model 3

level of sunshine (X3) does not add significantly to prediction → does not add substantial unique variance → variable is removed → no additional variables can be added → analysis stops

Stepwise Method: Example

Page 22: Assignment 9  slides (to post)


Final Model (Model 2)

the final model is composed only of those predictors that contribute significantly to the variance (prediction) of the criterion → SPSS provides results for each step and for the final model

Stepwise Method: Example

Page 23: Assignment 9  slides (to post)

Multiple Regression: Example Analysis

Page 24: Assignment 9  slides (to post)

Example: The Data

So, cat #1 would be pretty grumpy (7/10) given a fair amount of sunshine (6/10), considerable laughter (7/10), and quite a few cuddles (8/10).

3 predictor variables

1 criterion variable

Page 25: Assignment 9  slides (to post)

Example: Assessing Zero-Order Correlations

*regular, ol’ bivariate correlations between our variables (always an important first step)

Analyze → Correlate → Bivariate

Page 26: Assignment 9  slides (to post)

Example: Assessing Zero-Order Correlations

All predictors show significant associations with the criterion of grumpiness:

sunshine: r = .479, p < .001
laughter: r = .591, p < .001
cuddles: r = .511, p < .001
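As a side note (not required for the assignment), zero-order Pearson correlations of this kind can also be checked outside SPSS. The sketch below uses pandas and scipy with tiny made-up scores, not the actual lab data:

import pandas as pd
from scipy import stats

# made-up scores purely so the example runs; the real data live in the SPSS file
data = pd.DataFrame({
    "sunshine": [6, 2, 9, 4, 7, 3],
    "laughter": [7, 3, 8, 5, 6, 2],
    "cuddles":  [8, 4, 9, 5, 7, 3],
    "grumpy":   [7, 3, 9, 4, 6, 2],
})

print(data.corr())                                            # full Pearson r matrix
r, p = stats.pearsonr(data["laughter"], data["grumpy"])       # one zero-order correlation + its p-value
print(f"laughter vs. grumpiness: r = {r:.3f}, p = {p:.3f}")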

Page 27: Assignment 9  slides (to post)

Example: Multiple Regression

Analyze → Regression → Linear

criterion variable is inserted into the “Dependent” box

predictor variables are inserted into the “Independent(s)” box

Page 28: Assignment 9  slides (to post)

Example: Multiple Regression

Statistics Menu

specify that you would like to output “part and partial correlations”

Page 29: Assignment 9  slides (to post)

Example: Multiple Regression

Save Menu

request that predicted values and residual values (both unstandardized) be outputted

Page 30: Assignment 9  slides (to post)

Example: Multiple Regression

two models were assessed by SPSS (final model is always the best-fitting model)

R = .653
→ multiple correlation between the aggregated predictors and the criterion (this is the only reference to multiple correlation in this unit!)
→ in the case of Model 2: association between grumpiness and the two predictors in the final model combined (laughter, cuddles)

Page 31: Assignment 9  slides (to post)

Example: Multiple Regression

two models were assessed by SPSS (final model is always the best-fitting model)

R = .653

R2 = .427 (42.7%)
→ variance in the criterion accounted for by the predictors taken together
→ in the case of Model 2: variance in grumpiness accounted for by the two predictors in the final model combined (laughter, cuddles)

Page 32: Assignment 9  slides (to post)

Example: Multiple Regression

two models were assessed by SPSS (final model is always the best-fitting model)

R = .653

R2 = .427 (42.7%)

sy.x = .966 units
→ error associated with prediction of the criterion given the predictors in the model
→ can be thought of as the extent to which predicted criterion values will differ from actual scores obtained in the population, on average
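(For reference, though it is not shown on the slide: the standard error of estimate is typically computed as sy.x = sqrt(SSresidual / (n − k − 1)), where k is the number of predictors in the model; the n − k − 1 term is the same residual df of 77 that appears in the ANOVA output.)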

Page 33: Assignment 9  slides (to post)

Example: Multiple Regression

F(2, 77) = 28.643, p < .001

→ indication of whether the overall model is significant or not (how well the combination of predictors adds to the prediction of the criterion)

Page 34: Assignment 9  slides (to post)

Example: Multiple Regression

All predictors listed in the final model add significantly to the prediction of grumpiness:

laughter: t(77) = 4.712, p < .001
cuddles: t(77) = 3.212, p < .01

df is the value associated with residual for the corresponding model in the ANOVA output

Page 35: Assignment 9  slides (to post)

Example: Multiple Regression

•  zero-order correlations: same as those seen in the bivariate correlation table
•  standard measure of association (strength and direction) between the criterion and a given predictor variable
•  significance is not shown here as it is shown in the bivariate table

grumpiness vs. laughter: r = .591
grumpiness vs. cuddles: r = .511

Page 36: Assignment 9  slides (to post)

Example: Multiple Regression

•  partial correlations: correlation between two variables, controlling for the effect of the remaining variables (unique correlation)
•  do not have a significance value → their significance is reflected by the corresponding t-value
•  changes across the models give some insight into how predictors may be related

grumpiness vs. laughter: rxy.z = .473, t(77) = 4.712, p < .001
grumpiness vs. cuddles: rxy.z = .344, t(77) = 3.212, p < .01
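To make the idea of "controlling for" concrete, here is a small numpy-only sketch with simulated, hypothetical data: a partial correlation is essentially the correlation between the residuals of the criterion and the predictor of interest after each has been regressed on the variable being controlled for.

import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(size=80)                        # the predictor being controlled for
x = 0.6 * z + rng.normal(size=80)              # predictor of interest
y = 0.5 * x + 0.4 * z + rng.normal(size=80)    # criterion

def residuals(a, b):
    # part of `a` left over after removing what is linearly predictable from `b`
    B = np.column_stack([np.ones_like(b), b])
    coef, *_ = np.linalg.lstsq(B, a, rcond=None)
    return a - B @ coef

r_partial = np.corrcoef(residuals(y, z), residuals(x, z))[0, 1]
print(f"partial correlation rxy.z = {r_partial:.3f}")
print(f"squared partial correlation = {r_partial ** 2:.3f}")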

Page 37: Assignment 9  slides (to post)

Example: Multiple Regression

•  squaring a partial correlation coefficient value will give us the unique proportion of variance in the criterion accounted for by each predictor
•  typically, when reporting these squared values, it is also customary to report the partial correlation (with significance information) along with it

grumpiness vs. laughter: rxy.z = .473 → rxy.z² = .224 (22.4%)
grumpiness vs. cuddles: rxy.z = .344 → rxy.z² = .118 (11.8%)

Page 38: Assignment 9  slides (to post)

Example: Multiple Regression

ŷ = b0 + b1x1 + b2x2

ŷ_grumpiness = 4.146 + 0.333(x_laughter) + 0.135(x_cuddles)

When there is lots of laughter (9) but few cuddles (2), cat grumpiness would be:

ŷ_grumpiness = 4.146 + 0.333(9) + 0.135(2)
ŷ_grumpiness = 4.146 + 2.997 + 0.270
ŷ_grumpiness = 7.413
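A quick way to double-check the plug-in arithmetic above (coefficients taken from the equation on this slide):

# b-values from the regression equation above
b0, b_laughter, b_cuddles = 4.146, 0.333, 0.135
y_hat = b0 + b_laughter * 9 + b_cuddles * 2
print(y_hat)   # 7.413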

Page 39: Assignment 9  slides (to post)

Example: Multiple Regression

•  for a cat scoring 9 on laughter and 2 on cuddles, SPSS predicted a score of 7.321 on grumpiness (close enough to what we found)

•  cat actually scored an 8 on grumpiness

•  SPSS under-predicted by 0.678 using the prediction equation

Page 40: Assignment 9  slides (to post)

Example: Multiple Regression

•  provides information regarding variables that have not been entered into a given model (i.e. excluded at that step)

•  large beta, large and significant t-value, large partial correlation, and large tolerance statistic all indicate that a variable is a good candidate to get added into the model at the next step

•  statistics are interpreted the same way as what you would see in other tables (and you don’t have to worry about reporting “Beta In” or “tolerance”)

Page 41: Assignment 9  slides (to post)

Assignment 9: Overview

Page 42: Assignment 9  slides (to post)

•  not an APA-style results section
   o  reply to questions in numbered responses
   o  format your work using APA style as much as possible

•  submit report (3 pages), hand calculations, and all output

•  I can help you run the data and answer general questions, but I cannot tell you which statistics to report (help info is in the slides for this week and last week)

Assignment: Overview

Page 43: Assignment 9  slides (to post)

Assignment: Overview

Question #1

•  run a bivariate correlation
•  list all relevant variables with supporting statistics
•  no concluding sentences or double-spacing needed

Question #2

•  run a simple (bivariate) regression using the relevant variables
•  report all statistics that tell us how (and how well) the predictor adds to the prediction of the criterion, and relates to the criterion
•  can be in list form, single-spaced, with no concluding statements
•  be as thorough as possible and include units where applicable (i.e., for unstandardized values)

Page 44: Assignment 9  slides (to post)

Assignment: Overview

Question #3

Part A
•  run a Stepwise multiple regression using the relevant variables
•  make sure to request predicted values using the “Save” option (may come in handy in answering later questions)
•  write out the best-fitting model as a regression equation
•  explain each component of the equation in sentence form (double-spaced)

Parts B-D
•  self-explanatory
•  be as thorough as possible
•  can simply list responses, single-spaced, with no concluding sentences

Page 45: Assignment 9  slides (to post)

Assignment: Overview

Questions #4-5 (tough ones)

•  respond to the questions in full sentences, double-spaced
•  refer to specific statistics when making your case
•  restrict your discussion to zero-order and partial correlations (but recall that these statistics show up in a number of different outputs and tables)
•  it may be helpful to see what happens to your variables across the different models that have been outputted

Question #6

•  run a bivariate correlation using the relevant variables
•  answer the question in full sentences, double-spaced
•  don’t forget to include relevant statistics

Page 46: Assignment 9  slides (to post)

See you next week for our last lab!