Statistical Analysis: Regression & Correlation (Psyc 250, Winter, 2013)

Upload: chava-snyder

Post on 03-Jan-2016

Page 1: Statistical Analysis Regression & Correlation

Statistical Analysis

Regression & Correlation

Psyc 250

Winter, 2013

Page 2: Statistical Analysis Regression & Correlation

Review: Types of Variables & Steps in Analysis

Page 3: Statistical Analysis Regression & Correlation

Variables & Statistical Tests

Variable Type         Example                Common Stat Method
Nominal by nominal    Blood type by gender   Chi-square
Scale by nominal      GPA by gender          T-test
                      GPA by major           Analysis of Variance
Scale by scale        Weight by height       Regression
                      GPA by SAT             Correlation

Page 4: Statistical Analysis Regression & Correlation

Evaluating an hypothesis

• Step 1: What is the relationship in the sample?

• Step 2: How confidently can one generalize from the sample to the universe from which it comes?

p < .05

Page 5: Statistical Analysis Regression & Correlation

Evaluating an hypothesis

Variables                     Relationship in Sample           Statistical Significance
2 nom. vars.                  Cross-tab / contingency table    “p value” from Chi Square
Scale dep. & 2-cat indep.     Means for each category          “p value” from t-test
Scale dep. & 3+ cat indep.    Means for each category          “p value” from ANOVA F ratio
2 scale vars.                 Regression line                  “p value” from reg or correlation
                              Correlation r & r2

Page 7: Statistical Analysis Regression & Correlation

Relationships between Scale Variables

Regression

Correlation

Page 8: Statistical Analysis Regression & Correlation

Regression

• Amount that a dependent variable increases (or decreases) for each unit increase in an independent variable.

• Expressed as the equation for a line, y = m(x) + b: the “regression line”

• Interpret by slope of the line: m

(Or: interpret by “odds ratio” in “logistic regression”)
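
The slope-and-intercept idea can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the five data points are hypothetical, and the slope is computed with the standard least-squares formula rather than taken from the slides.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: cross-products of deviations over squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 68]

m, b = fit_line(x, y)
print(f"y = {m:.1f}(x) + {b:.1f}")
```

Here the slope m is read as "y rises by about m units for each one-unit increase in x".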

Page 9: Statistical Analysis Regression & Correlation

Correlation

• Strength of association of scale measures

• r = -1 to 0 to +1

+1 perfect positive correlation

-1 perfect negative correlation

0 no correlation

• Interpret r in terms of variance
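
To make the three reference points (+1, -1, 0) concrete, here is a small Python sketch of the Pearson correlation coefficient. The helper name `pearson_r` and all three data sets are hypothetical, made up so that each case lands on one of the reference values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in step with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in step with x
r_none = pearson_r(x, [3, 1, 4, 1, 3])   # no linear trend

print(round(r_pos, 2), round(r_neg, 2), round(r_none, 2))
```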

Page 10: Statistical Analysis Regression & Correlation

Mean & Variance

Page 11: Statistical Analysis Regression & Correlation

Example: Weight & Height

Survey of Class, n = 42

• Height
• Mother’s height
• Mother’s education
• SAT
• Estimate IQ
• Well-being (7 pt. Likert)
• Weight
• Father’s education
• Family income
• G.P.A.
• Health (7 pt. Likert)

Page 12: Statistical Analysis Regression & Correlation

Frequency Table for: HEIGHT

Value    Frequency   Percent   Valid Percent   Cum Percent
59.00        1          2.4         2.4             2.4
61.00        2          4.8         4.8             7.1
62.00        3          7.1         7.1            14.3
63.00        3          7.1         7.1            21.4
65.00        5         11.9        11.9            33.3
66.00        3          7.1         7.1            40.5
67.00        4          9.5         9.5            50.0
68.00        5         11.9        11.9            61.9
69.00        1          2.4         2.4            64.3
70.00        6         14.3        14.3            78.6
71.00        1          2.4         2.4            81.0
72.00        4          9.5         9.5            90.5
73.00        3          7.1         7.1            97.6
74.00        1          2.4         2.4           100.0
Total       42        100.0       100.0

Valid cases 42   Missing cases 0

Page 13: Statistical Analysis Regression & Correlation

Descriptive Statistics for: HEIGHT

Variable   Mean    Std Dev   Variance   Range   Minimum   Maximum   Valid N
HEIGHT     67.33   3.87      14.96      15.00   59.00     74.00     42

Page 14: Statistical Analysis Regression & Correlation

Variance

$$\text{Variance} = s^2 = \frac{\sum (x_i - \text{Mean})^2}{N - 1}$$

$$\text{Standard Deviation} = s = \sqrt{\text{variance}}$$
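
As a check on the formula, the following Python sketch recomputes the mean, variance, and standard deviation from the HEIGHT frequency table shown earlier. The variable names are illustrative; the (value, frequency) pairs are copied from that table.

```python
import math

# (value, frequency) pairs copied from the HEIGHT frequency table
height_freq = [
    (59, 1), (61, 2), (62, 3), (63, 3), (65, 5), (66, 3), (67, 4),
    (68, 5), (69, 1), (70, 6), (71, 1), (72, 4), (73, 3), (74, 1),
]

# Expand the table into one value per respondent
heights = [value for value, count in height_freq for _ in range(count)]

n = len(heights)
mean = sum(heights) / n
# Sum of squared deviations from the mean, divided by N - 1
variance = sum((x - mean) ** 2 for x in heights) / (n - 1)
std_dev = math.sqrt(variance)

print(n, round(mean, 2), round(variance, 2), round(std_dev, 2))
# prints: 42 67.33 14.96 3.87
```

These agree with the descriptive statistics reported for HEIGHT (mean 67.33, variance 14.96, standard deviation 3.87).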

Page 15: Statistical Analysis Regression & Correlation

Frequency Table for: WEIGHT

Value     Frequency   Percent   Valid Percent   Cum Percent
115.00        1          2.4         2.4             2.4
120.00        1          2.4         2.4             4.8
124.00        1          2.4         2.4             7.1
125.00        4          9.5         9.5            16.7
128.00        1          2.4         2.4            19.0
130.00        6         14.3        14.3            33.3
135.00        4          9.5         9.5            42.9
136.00        1          2.4         2.4            45.2
140.00        3          7.1         7.1            52.4
145.00        2          4.8         4.8            57.1
150.00        3          7.1         7.1            64.3
155.00        2          4.8         4.8            69.0
160.00        6         14.3        14.3            83.3
165.00        2          4.8         4.8            88.1
170.00        1          2.4         2.4            90.5
185.00        1          2.4         2.4            92.9
190.00        2          4.8         4.8            97.6
210.00        1          2.4         2.4           100.0
Total        42        100.0       100.0

Valid cases 42   Missing cases 0

Descriptive Statistics for: WEIGHT

Variable   Mean     Std Dev   Variance   Range   Minimum   Maximum   Valid N
WEIGHT     146.38   21.30     453.80     95.00   115.00    210.00    42

Page 16: Statistical Analysis Regression & Correlation

Relationship of weight & height:

Regression Analysis

Page 17: Statistical Analysis Regression & Correlation
Page 18: Statistical Analysis Regression & Correlation

“Least Squares” Regression Line

Dependent = ( B ) (Independent) + constant

weight = ( B ) ( height ) + constant

Page 19: Statistical Analysis Regression & Correlation

Regression line

Page 20: Statistical Analysis Regression & Correlation

Regression: WEIGHT on HEIGHT

Multiple R           .59254
R Square             .35110
Adjusted R Square    .33488
Standard Error       17.37332

Analysis of Variance
              DF    Sum of Squares    Mean Square
Regression     1      6532.61322       6532.61322
Residual      40     12073.29154        301.83229

F = 21.64319   Signif F = .0000

Variables in the Equation
Variable      B             SE B         Beta       T        Sig T
HEIGHT        3.263587      .701511      .592541    4.652    .0000
(Constant)    -73.367236    47.311093               -1.551

[ Equation: Weight = 3.3 ( height ) - 73 ]
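
The summary figures in this output are tied together by simple arithmetic. The Python sketch below rederives them from the printed sums of squares and coefficients; the variable names are illustrative, and the formulas are the standard ones for a regression with a single predictor.

```python
import math

# Values copied from the printed regression output
ss_regression = 6532.61322
ss_residual = 12073.29154
df_regression = 1
df_residual = 40
b_height = 3.263587
se_b_height = 0.701511

ss_total = ss_regression + ss_residual

# R Square: share of the total sum of squares captured by the regression
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)

# Adjusted R Square penalizes for the number of predictors
adj_r_square = 1 - (1 - r_square) * (df_regression + df_residual) / df_residual

# Standard Error of the estimate: square root of the residual mean square
ms_residual = ss_residual / df_residual
std_error = math.sqrt(ms_residual)

# F ratio and the t statistic for the HEIGHT coefficient
f_ratio = (ss_regression / df_regression) / ms_residual
t_height = b_height / se_b_height

print(round(multiple_r, 4), round(r_square, 4), round(adj_r_square, 4))
print(round(std_error, 2), round(f_ratio, 2), round(t_height, 2))
```

With one predictor, the F ratio is the square of the t statistic for the slope, which is why both carry the same significance level.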

Page 21: Statistical Analysis Regression & Correlation

Regression line

W = 3.3 H - 73
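
A short Python sketch of how the regression line is used for prediction. The function name `predict_weight` is hypothetical; the coefficients are the unrounded B and constant from the regression output, and the slides do not state the measurement units.

```python
def predict_weight(height, b=3.263587, constant=-73.367236):
    """Predicted weight from the fitted line: weight = B * height + constant."""
    return b * height + constant

# At the sample mean height (67.33) the line gives roughly the mean weight
at_mean = predict_weight(67.33)

# The slope is the predicted change in weight per one-unit change in height
per_unit = predict_weight(71) - predict_weight(70)

print(round(at_mean, 1), round(per_unit, 2))
```

The prediction at the mean height comes out close to the sample mean weight of 146.38, as expected for a least-squares line.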

Page 22: Statistical Analysis Regression & Correlation

Strength of Relationship

“Goodness of Fit”: Correlation

How well does the regression line “fit” the data?

Page 24: Statistical Analysis Regression & Correlation
Page 26: Statistical Analysis Regression & Correlation

mean

Variance = 454

Page 27: Statistical Analysis Regression & Correlation

Regression line

mean

Page 28: Statistical Analysis Regression & Correlation

Correlation: “Goodness of Fit”

• Variance (average squared distance from the mean) = 454

• “Least squares” (average squared distance from the regression line) = 295

Page 29: Statistical Analysis Regression & Correlation

[Figure labels: regression line, l.s. = 295; mean, s2 = 454]

Page 30: Statistical Analysis Regression & Correlation

Correlation: “Goodness of Fit”

• How much is variance reduced by calculating from regression line?

• 454 – 295 = 159; 159 / 454 = .35

• Variance is reduced 35% by calculating “least squares” from regression line

• r2 = .35
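
The same arithmetic as a Python sketch, using the two rounded figures from the slides. The variable names are illustrative, and the positive square root is taken for r because the fitted slope is positive.

```python
import math

variance_about_mean = 454   # squared distances from the mean of WEIGHT
variance_about_line = 295   # squared distances from the regression line

reduction = variance_about_mean - variance_about_line
r_squared = reduction / variance_about_mean
r = math.sqrt(r_squared)  # positive root, since weight rises with height

print(reduction, round(r_squared, 2), round(r, 2))
# prints: 159 0.35 0.59
```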

Page 31: Statistical Analysis Regression & Correlation

r2 = % of variance in WEIGHT “explained” by HEIGHT

Correlation coefficient = r

Page 32: Statistical Analysis Regression & Correlation

Correlation: HEIGHT with WEIGHT

           HEIGHT      WEIGHT
HEIGHT     1.0000      .5925
           (  42)      (  42)
           P= .        P= .000
WEIGHT     .5925       1.0000
           (  42)      (  42)
           P= .000     P= .

Page 33: Statistical Analysis Regression & Correlation

r = .59

r2 = .35

HEIGHT “explains” 35% of variance in WEIGHT

Page 34: Statistical Analysis Regression & Correlation

Sentence & G.P.A.

• Regression: form of relationship

• Correlation: strength of relationship

• p value: statistical significance

Page 35: Statistical Analysis Regression & Correlation

Legal Attitudes Study:

1. Relationship of sentence length to G.P.A.?

2. Relationship of sentence length to Liberal-Conservative views

Page 36: Statistical Analysis Regression & Correlation

G. P. A.

[Histogram: frequency (0 to 7) of grade point average; values on the x-axis: 3.00, 3.20, 3.33, 3.40, 3.70, 3.75, 3.80, 3.90, 4.00]

Statistics: grade point average

N Valid           23
N Missing         1
Mean              3.5752
Std. Deviation    .35057
Variance          .12290

Page 37: Statistical Analysis Regression & Correlation

Length of Sentence (simulated data)

[Histogram: frequency (0 to 7) of jail sentence; values on the x-axis: .00, 2.00, 3.00, 4.00, 6.00, 9.00, 11.00, 12.00, 18.00]

Statistics: jail sentence

N Valid           24
N Missing         0
Mean              5.1250
Std. Deviation    5.44788
Variance          29.67935

Page 38: Statistical Analysis Regression & Correlation

Scatterplot: Sentence on G.P.A.

[Scatterplot: jail sentence (y-axis, -10 to 20) by grade point average (x-axis, 2.8 to 4.2)]

Page 39: Statistical Analysis Regression & Correlation

Regression Coefficients

Coefficients (a)

                            Unstandardized Coefficients    Standardized Coefficients
Model 1                     B          Std. Error          Beta                         t         Sig.
(Constant)                  17.853     12.097                                           1.476     .155
grade point average         -3.534     3.368               -.223                        -1.049    .306

a. Dependent Variable: jail sentence

Sentence = -3.5 G.P.A. + 18

Page 40: Statistical Analysis Regression & Correlation

“Least Squares” Regression Line

[Scatterplot: jail sentence (y-axis, -10 to 20) by grade point average (x-axis, 2.8 to 4.2), with the fitted line Sent = -3.5 GPA + 18]

Page 41: Statistical Analysis Regression & Correlation

Correlation: Sentence & G.P.A.

Correlations

                                             grade point average    jail sentence
grade point average    Pearson Correlation   1                      -.223
                       Sig. (2-tailed)       .                      .306
                       N                     23                     23
jail sentence          Pearson Correlation   -.223                  1
                       Sig. (2-tailed)       .306                   .
                       N                     23                     24

Page 42: Statistical Analysis Regression & Correlation

Statistical Significance

Regression (Coefficients table): grade point average, B = -3.534, t = -1.049, Sig. = .306

Correlation (Correlations table): Pearson Correlation = -.223, Sig. (2-tailed) = .306, N = 23

p = .31

Page 43: Statistical Analysis Regression & Correlation

Interpreting Correlations

• r = -.22

• r2 = .05 p = .31

G.P.A. “explains” 5% of the variance in length of sentence
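
The numbers on this slide can be cross-checked against the earlier coefficient and correlation tables. The Python sketch below does so with illustrative variable names; the formula relating t to r is the standard one for a single predictor and is not taken from the slides.

```python
import math

# Values copied from the Correlations and Coefficients tables
r = -0.223        # Pearson correlation, sentence with G.P.A.
n = 23            # cases with both variables present
b = -3.534        # unstandardized slope for grade point average
se_b = 3.368      # standard error of that slope

r_squared = r ** 2

# t statistic two ways: from the slope, and from the correlation
t_from_b = b / se_b
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(round(r_squared, 2), round(t_from_b, 2), round(t_from_r, 2))
# prints: 0.05 -1.05 -1.05
```

Both routes give the same t (up to rounding of r), which is why the regression and the correlation report the same p value of .31.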

Page 44: Statistical Analysis Regression & Correlation

Write Results

“A regression analysis found that each one-unit increase in GPA was associated with a 3.5-month decrease in sentence length, but this correlation was low (r = -.22) and not statistically significant (p = .31).”

Page 45: Statistical Analysis Regression & Correlation
Page 46: Statistical Analysis Regression & Correlation

Multiple Regression

• Problem: relationship of weight and calorie consumption

• Both weight and calorie consumption related to height

• Need to “control for” height or assess relative effects of height and calorie consumption

Page 47: Statistical Analysis Regression & Correlation

Multiple Regression

[Figure labels: regression line, mean]

Page 48: Statistical Analysis Regression & Correlation

Multiple Regression

[Figure labels: regression line, mean, residuals]

Page 49: Statistical Analysis Regression & Correlation

Multiple Regression

• Regress weight residuals (dependent variable) on caloric intake (independent variable)

• Statistically “controls” for height: removes effect or “confound” of height.

• How much variance in weight does caloric intake account for over and above height?
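
The two-step procedure described above can be sketched in Python as follows. Everything here is an assumption made for illustration: the helper `fit_line`, the six data points, and the variable names are hypothetical and do not come from the class survey.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x

# Hypothetical data for six people (not from the class survey)
height = [62, 64, 66, 68, 70, 72]
calories = [1800, 2400, 2000, 2600, 2200, 2800]
weight = [118, 140, 138, 160, 155, 178]

# Step 1: regress weight on height and keep what height cannot explain
m_h, b_h = fit_line(height, weight)
weight_resid = [w - (m_h * h + b_h) for h, w in zip(height, weight)]

# Step 2: regress the weight residuals on caloric intake
m_c, b_c = fit_line(calories, weight_resid)

print(f"weight change per unit of height: {m_h:.2f}")
print(f"residual weight change per calorie: {m_c:.4f}")
```

Statistical packages normally fit all predictors in a single model instead of in two passes; when the predictors are correlated with each other, the two-step slope can differ somewhat from the coefficient such a model reports.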

Page 50: Statistical Analysis Regression & Correlation

Multiple Regression

• How much variance in dependent measure (weight, length of sentence) do all independent variables combined account for?

multiple R2

• What is the best “model” for predicting the dependent variable?

Page 51: Statistical Analysis Regression & Correlation

Malamuth: Sexual Aggression

• Dependent Var: self-report aggression

• Indep / Predictor Vars:
– Dominance
– Hostility toward women
– Acceptance of violence toward women
– Psychoticism
– Sexual Experience

+ interaction effects

Page 52: Statistical Analysis Regression & Correlation

Malamuth: multiple regressions

• Without “tumescence” index:

multiple R = .55; w/ interactions R = .67

multiple R2 = .30; w/ interactions R2 = .45

• With “tumescence” index:

multiple R = .62; w/ interactions R = .87

multiple R2 = .38; w/ interactions R2 = .75