Multiple Regression
Regression: attempts to predict one criterion variable using one predictor variable
Addresses the question: Does the predictor significantly predict the criterion?
Multiple Regression
Multiple Regression: attempts to predict one criterion variable using 2+ predictor variables
Addresses the questions: Do the predictors significantly predict the criterion? If so, which predictor is best?
Allows variance from one predictor to be removed before the rest are evaluated (like ANCOVA)
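For illustration (not part of the original slides), a minimal multiple-regression fit in Python, assuming statsmodels is available; the variables x1, x2, and y and the data are made up:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data: two predictors and a criterion.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 0.5 * df.x1 + 0.3 * df.x2 + rng.normal(size=n)

# Fit the criterion on both predictors; summary() reports each b with its
# t-test and the overall Model R-squared.
model = smf.ols("y ~ x1 + x2", data=df).fit()
print(model.summary())
```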
Multiple Regression
How to compare the predictive value of 2+ predictors
When comparing multiple predictors within an experiment, use the standardized b (β): β = b(s_x / s_y), the slope rescaled by the SDs of the predictor and the criterion
Like a z-score, β lets you compare performance between 2 variables with different metrics by expressing performance relative to a sample mean & SD
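A quick worked example of the conversion (the numbers are made up; the formula is the standard rescaling β = b · s_x / s_y):

```python
# Hypothetical values: an unstandardized slope and the two sample SDs.
b = 2.5      # unstandardized coefficient of the predictor
s_x = 4.0    # SD of the predictor
s_y = 10.0   # SD of the criterion

beta = b * s_x / s_y   # standardized coefficient
print(beta)            # 1.0 -- comparable across predictors with different metrics
```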
Multiple Regression
How to compare the predictive value of 2+ predictors
When comparing multiple predictors between experiments, use b
The SDs are highly variable between experiments (the SDs from Exp. 1 ≠ the SDs from Exp. 2), so the β's from the two experiments are not comparable
Analogy: you can't compare the z-score of your Stats grade from this semester with the z-score of your Stats grade if you take the class again next semester; if next semester's class is especially dumb, you appear to have gotten much smarter
Multiple Regression
The magnitude of the relationship between one predictor and a criterion (b/β) in a model depends on the other predictors in that model
E.g., the relationship between IQ and SES (with College GPA and Parents' SES in the model) will be different if more, fewer, or different predictors are included in the model
Multiple Regression
When comparing the results of 2 experiments using regression, the coefficients (b/β) will not be the same; they will be similar only to the extent that the regression models are similar
Why not?
Multiple Regression
Coefficients (b/β) represent partial and semipartial (part) correlations, not the traditional Pearson's r
Partial correlation – the correlation between 2 variables with the variance from one or more other variables removed, i.e. the correlation between the residuals of both variables once variance from one or more covariates has been removed
Multiple Regression
Partial correlation = the amount of the variance in the criterion that is associated with a predictor and that could not be explained by the other covariate(s)
Multiple Regression
Semipartial/Part correlation – the correlation between 2 variables with the variance from one or more variables removed from the predictor only (i.e. not from the criterion), i.e. the correlation between the criterion and the residuals of the predictor once variance from one or more covariates has been removed
Multiple Regression
Part correlation = the amount of variance that a predictor explains in a criterion once variance from the covariates has been removed, i.e. the percentage of the total variance left unexplained by the covariate that the predictor accounts for
Since the variance that is removed from the criterion depends on the other predictors in the model, different models yield different regression coefficients
In terms of the overlapping-variance (Venn) diagram on the original slide: Partial Correlation = B; Part Correlation = B / (A + B)
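A sketch (not from the slides) of both correlations computed directly from residuals with numpy; the data and variable names are invented:

```python
import numpy as np

def residualize(v, covariate):
    # Residuals of v after regressing it on the covariate (with intercept).
    X = np.column_stack([np.ones_like(covariate), covariate])
    coef, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ coef

rng = np.random.default_rng(1)
n = 200
cov = rng.normal(size=n)                       # covariate
x = 0.6 * cov + rng.normal(size=n)             # predictor, overlaps with covariate
y = 0.5 * x + 0.4 * cov + rng.normal(size=n)   # criterion

rx = residualize(x, cov)
ry = residualize(y, cov)

partial = np.corrcoef(rx, ry)[0, 1]      # covariate removed from BOTH variables
semipartial = np.corrcoef(rx, y)[0, 1]   # covariate removed from the predictor only
print(partial, semipartial)
```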
Multiple Regression
How to compare the predictive value of 2+ predictors
Remember: regression coefficients are very unstable from sample to sample, so interpret only large differences between coefficients (> ~.2)
Multiple Regression
Like simple regression, multiple regression tests:
The ability of each predictor to predict the criterion variable (tests of the b's/β's)
The overall ability of the model (all predictors combined) to predict the criterion variable (Model R²)
Model R² = the total % of variance in the criterion accounted for by the predictors
Model R = the correlation between the predictors (as a set) and the criterion
It can also test:
Whether one or more predictors can predict the criterion when variance from one or more other predictors is removed
Whether each predictor significantly increases the Model R²
Multiple Regression
Predictors are evaluated with variance from the other predictors removed
There is more than one way to remove this variance:
Examine all predictors en masse, each with variance from all other predictors removed
Or remove variance from one or more predictors first, then look at the second set (like in factorial ANCOVA)
Multiple Regression
This is done by specifying different selection methods
Selection method = the method of entering predictors into a regression equation
There are four most commonly used methods ('commonly used' = the only 4 methods offered by SPSS)
Multiple Regression
Selection Methods
Simultaneous – adds all predictors at once, and is therefore effectively the absence of a selection method
Good if there is no theory to guide which predictors should be entered first (but when does that ever happen?)
Multiple Regression
Selection Methods
All Subsets – the computer finds the combination of predictors that maximizes the overall Model R²
But SPSS doesn't offer it, and it finds the best subset in your particular dataset; since data, not theory, guide the selection, there is no guarantee that the model will generalize to other datasets, particularly in smaller samples (see the sketch below)
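Since SPSS doesn't offer it, here is a brute-force sketch of all-subsets selection (invented data; maximizing raw R², which illustrates the overfitting worry noted above):

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 80
cols = ["x1", "x2", "x3", "x4"]
df = pd.DataFrame(rng.normal(size=(n, 4)), columns=cols)
df["y"] = 0.6 * df.x1 + 0.4 * df.x3 + rng.normal(size=n)

# Fit every non-empty subset of predictors and keep the largest Model R^2.
candidates = []
for k in range(1, len(cols) + 1):
    for subset in itertools.combinations(cols, k):
        r2 = smf.ols("y ~ " + " + ".join(subset), data=df).fit().rsquared
        candidates.append((r2, subset))

best_r2, best_subset = max(candidates)
print(best_subset, round(best_r2, 3))   # best in THIS dataset; may not generalize
```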
Multiple Regression
Selection Methods
Backward Elimination – starts with all predictors in the model and iteratively eliminates the predictor with the least unique variance related to the criterion, until all remaining predictors are significant (iterative = a process involving several steps)
Because it begins with all predictors, the predictors whose variance mostly overlaps with the other predictors (i.e. variance that would be partialled out) are the ones removed
But it is also atheoretical/based on the data only (see the sketch below)
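A sketch of the elimination loop (made-up data; at each step the predictor with the largest p-value is dropped until every remaining b is significant):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 120
df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["x1", "x2", "x3"])
df["y"] = 0.7 * df.x1 + rng.normal(size=n)   # only x1 truly predicts y

predictors = ["x1", "x2", "x3"]
alpha = 0.05
while predictors:
    fit = smf.ols("y ~ " + " + ".join(predictors), data=df).fit()
    pvals = fit.pvalues.drop("Intercept")    # p-value of each predictor's b
    worst = pvals.idxmax()
    if pvals[worst] <= alpha:                # every remaining predictor significant
        break
    predictors.remove(worst)                 # eliminate the least useful, refit

print(predictors)
```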
Multiple Regression
Selection Methods
Forward Selection – the opposite of backward elimination: starts with the predictor most strongly related to the criterion in the model and iteratively adds the predictor next most strongly related to the criterion, until a nonsignificant predictor is found
Step 1: enter the predictor most correlated with the criterion (P1)
Step 2: add the strongest remaining predictor once P1 is partialled out
But it is also atheoretical
Multiple Regression
Selection Methods
Stepwise
Technically, any selection method that proceeds iteratively (in steps) is stepwise (i.e. both backward elimination and forward selection)
However, the term usually refers to the method where the order of the predictors is determined in advance by the researcher based upon theory
Multiple Regression
Selection Method: Stepwise
Why would you use it? For the same reason as covariates in ANCOVA
Want to know if Measure A of treatment adherence is better than Measure B? Run a stepwise regression and enter Measure B first, then Measure A, with treatment outcome as the criterion (see the sketch below)
Addresses the question: Does Measure A predict treatment outcome even when variance from Measure B has already been removed (i.e. above and beyond Measure B)?
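A sketch of that two-step entry, with invented measures a and b and an invented outcome; the F test on the R² increase uses statsmodels' anova_lm:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 150
b_meas = rng.normal(size=n)                          # Measure B
a_meas = 0.5 * b_meas + rng.normal(size=n)           # Measure A (overlaps with B)
outcome = 0.4 * a_meas + 0.2 * b_meas + rng.normal(size=n)
df = pd.DataFrame({"outcome": outcome, "a": a_meas, "b": b_meas})

step1 = smf.ols("outcome ~ b", data=df).fit()        # Measure B entered first
step2 = smf.ols("outcome ~ b + a", data=df).fit()    # then Measure A added

print(step2.rsquared - step1.rsquared)   # Measure A's increase in Model R^2
print(sm.stats.anova_lm(step1, step2))   # F test: is that increase significant?
```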
Multiple Regression
Selection Method: Stepwise
Why would you use it? Running a repeated-measures design and want to make sure your groups are equal on pre-test scores? Enter the pre-test into the first step of your regression
Multiple Regression
Assumptions
Linearity of Regression: the variables are linearly related to one another
Normality in Arrays: the actual values of the DV are normally distributed around the predicted values (i.e. around the regression line), so the regression line is a good approximation of the population parameter
Homogeneity of Variance in Arrays: assumes that the variance of the criterion is equal at all levels of the predictor(s)
Multiple Regression
Issues to be aware of: range restriction, heterogeneous subsamples, and outliers
With multiple predictors, you must be aware of both univariate outliers (unusual values on one variable) and multivariate outliers (unusual combinations of values on two or more variables)
Multiple Regression
Outliers
Univariate outlier – a man weighing 500 lbs.
Multivariate outlier – a man who is 6' tall and weighs 120 lbs. (note that neither value is a univariate outlier, but together they are quite odd)
Three quantities define the presence of an outlier in multiple regression (see the sketch below):
Distance – distance from the regression line
Leverage – distance from the predictor mean
Influence – a combination of distance and leverage
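Not from the slides: statsmodels exposes all three diagnostics for a fitted OLS model (invented data; Cook's distance stands in for "influence"):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 60
df = pd.DataFrame({"x": rng.normal(size=n)})
df["y"] = 0.5 * df.x + rng.normal(size=n)

fit = smf.ols("y ~ x", data=df).fit()
infl = fit.get_influence()

distance = infl.resid_studentized_external   # distance from the regression line
leverage = infl.hat_matrix_diag              # distance from the predictor mean
influence = infl.cooks_distance[0]           # combines distance and leverage

print(int(np.argmax(influence)))   # index of the most influential case
```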
Multiple Regression
Degree of Overlap in Predictors
Adding predictors is like adding covariates in ANCOVA: adding one that correlates too highly with the others leaves the Model R² essentially unchanged while decreasing the df, making the regression less powerful
Tolerance = 1 − the multiple R² from regressing a predictor on all of the other predictors; you want tolerance to be high (low tolerance signals too much overlap); see the sketch below
Also examine the bivariate correlations between predictors; if a correlation exceeds the measures' internal consistency (α), get rid of one of them
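A sketch of checking tolerance via the variance inflation factor (tolerance = 1/VIF); the collinear predictors are made up:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)    # nearly redundant with x1
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))

for i, name in enumerate(X.columns):
    if name == "const":
        continue
    tolerance = 1.0 / variance_inflation_factor(X.values, i)
    print(name, round(tolerance, 3))        # near 0 => problematic overlap
```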
Multiple Regression
Multiple regression can also test for more complex relationships, such as mediation and moderation
Mediation – when one variable (the predictor) operates on another variable (the criterion) via a third variable (the mediator)
E.g., math self-efficacy mediates the relationship between math ability and interest in a math major
Must establish paths A & B, and show that path C is smaller when paths A & B are included in the model (i.e. math self-efficacy accounts for variance in interest in a math major above and beyond math ability)
1. Find significant correlations between the predictor and the mediator (path A) and between the mediator and the criterion (path B)
2. Run a stepwise regression with the predictor entered first, then the predictor and mediator entered together in step 2 (see the sketch after the criteria below)
Multiple Regression
The mediator should be a significant predictor of the criterion in step 2
The predictor-criterion relationship (b/β) should decrease from step 1 to step 2
Full mediation: the relationship is significant in step 1 but nonsignificant in step 2
Partial mediation: the relationship is significant in step 1, and smaller, but still significant, in step 2
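A sketch of steps 1-2 and the full/partial-mediation check, with invented math-domain variables:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 200
ability = rng.normal(size=n)                     # predictor
efficacy = 0.6 * ability + rng.normal(size=n)    # mediator
interest = 0.5 * efficacy + rng.normal(size=n)   # criterion
df = pd.DataFrame({"ability": ability, "efficacy": efficacy,
                   "interest": interest})

step1 = smf.ols("interest ~ ability", data=df).fit()              # path C alone
step2 = smf.ols("interest ~ ability + efficacy", data=df).fit()   # add mediator

print(step1.params["ability"], step1.pvalues["ability"])   # b in step 1
print(step2.params["ability"], step2.pvalues["ability"])   # should shrink in step 2
print(step2.pvalues["efficacy"])                           # mediator significant?
```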
Multiple Regression
Partial mediation
Sobel's test (1982): tests the statistical significance of this mediation relationship
1. Regress the mediator on the predictor (path A) and the criterion on the mediator (path B) in 2 separate regressions
2. Calculate s_β for paths A & B, where s_β = β/t
3. Calculate a t-statistic, where df = n − 3 and
t = (β_A × β_B) / √(β_B² × s_A² + β_A² × s_B² + s_A² × s_B²)
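A direct translation of the three steps into code (hypothetical path coefficients and SEs; scipy supplies the t distribution):

```python
import numpy as np
from scipy import stats

def sobel_test(beta_a, se_a, beta_b, se_b, n):
    # Test of the indirect effect beta_a * beta_b, using the slide's formula
    # (the variant that includes the s_A^2 * s_B^2 term); df = n - 3.
    se_ab = np.sqrt(beta_b**2 * se_a**2 + beta_a**2 * se_b**2
                    + se_a**2 * se_b**2)
    t = (beta_a * beta_b) / se_ab
    p = 2 * stats.t.sf(abs(t), df=n - 3)
    return t, p

# Hypothetical values; recall s_beta = beta / t from step 2.
print(sobel_test(beta_a=0.60, se_a=0.08, beta_b=0.45, se_b=0.07, n=200))
```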
Multiple Regression
Multiple regression can also test for more complex relationships, such as mediation and moderation
Moderation (in regression) – when the strength of a predictor-criterion relationship changes as a result of a third variable (the moderator)
Compare: an interaction (in ANOVA) – when the strength of the relationship between an IV and the DV changes as a function of the levels of another IV
Multiple Regression
Moderation
Unlike in ANOVA, you have to create the interaction (moderation) term yourself by multiplying the predictor and the moderator; in SPSS, go to Transform → Compute
It is typical to enter the predictor and moderator in the first step of a regression and the interaction term in the second step, to determine the contribution of the interaction above and beyond the main-effect terms (see the sketch below)
Just like how variance is partitioned in a factorial ANOVA
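A sketch of the two-step moderation test (invented data; the variables are mean-centered before forming the product, a common practice):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 200
df = pd.DataFrame({"pred": rng.normal(size=n), "moder": rng.normal(size=n)})
df["y"] = (0.4 * df["pred"] + 0.2 * df["moder"]
           + 0.5 * df["pred"] * df["moder"] + rng.normal(size=n))

# Create the interaction term yourself (SPSS: Transform -> Compute).
df["pred_c"] = df["pred"] - df["pred"].mean()
df["moder_c"] = df["moder"] - df["moder"].mean()
df["inter"] = df["pred_c"] * df["moder_c"]

step1 = smf.ols("y ~ pred_c + moder_c", data=df).fit()           # main effects
step2 = smf.ols("y ~ pred_c + moder_c + inter", data=df).fit()   # add interaction
print(step2.rsquared - step1.rsquared, step2.pvalues["inter"])
```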
Logistic Regression
Logistic Regression = used to predict a dichotomous criterion variable (only 2 levels) with 1+ continuous or discrete predictors
Can't use linear regression with a dichotomous criterion because:
1. A dichotomous criterion is not normally distributed around the regression line (i.e. the assumption of normality in arrays is violated)
2. The regression line fits the data more poorly at some values of the predictor than at others (i.e. the assumption of homogeneity of variance in arrays is violated)
Logistic Regression
Interpreting coefficients
In logistic regression, b represents the change in the log odds of the criterion for a one-point increase in the predictor
Raise e to the power b to find the odds ratio: e.g., b = −.0812 → e^(−.0812) = .9220
Logistic Regression
Interpreting coefficients (continued)
Continuous predictor: a one-point increase in the predictor corresponds to the odds of the criterion decreasing (because b is negative) by a factor of .922, i.e. roughly an 8% decrease in the odds
Dichotomous predictor: e^b gives the odds in one group vs. the other group (the sign of b indicates an increase or decrease); see the sketch below
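To close, a minimal logistic fit (invented data) showing the coefficient-to-odds conversion described above:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 300
df = pd.DataFrame({"x": rng.normal(size=n)})
log_odds = -0.1 * df["x"]                 # true slope on the log-odds scale
p = 1.0 / (1.0 + np.exp(-log_odds))
df["y"] = (rng.uniform(size=n) < p).astype(int)   # dichotomous criterion

fit = smf.logit("y ~ x", data=df).fit()
b = fit.params["x"]        # change in log odds per one-point increase in x
print(b, np.exp(b))        # e^b = odds ratio, e.g. b = -.0812 -> .922
```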