Statistical Analysis
Regression & Correlation
Psyc 250
Winter, 2013
Review:
Types of Variables & Steps in Analysis
Variables & Statistical Tests
Variable Type         Example                 Common Stat Method
Nominal by nominal    Blood type by gender    Chi-square
Scale by nominal      GPA by gender           T-test
                      GPA by major            Analysis of Variance
Scale by scale        Weight by height        Regression
                      GPA by SAT              Correlation
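The pairing of variable types with tests in the table above can be written as a small lookup. This is only a sketch: the function name `common_stat_method` and its string labels are made up for illustration, and the split between t-test and analysis of variance by number of categories follows the 2-category vs. 3+ category distinction these slides use when evaluating an hypothesis.

```python
def common_stat_method(var_a, var_b, categories=2):
    """Return the common statistical method for a pair of variable types.

    var_a, var_b: "nominal" or "scale" (hypothetical labels for this sketch).
    categories: number of groups in the nominal variable; only used when
    one variable is scale and the other is nominal.
    """
    kinds = sorted([var_a, var_b])
    if kinds == ["nominal", "nominal"]:
        return "Chi-square"
    if kinds == ["nominal", "scale"]:
        # Two groups: compare two means. Three or more: analysis of variance.
        return "T-test" if categories == 2 else "Analysis of Variance"
    if kinds == ["scale", "scale"]:
        return "Regression / Correlation"
    raise ValueError("variable types must be 'nominal' or 'scale'")


print(common_stat_method("nominal", "nominal"))             # blood type by gender
print(common_stat_method("scale", "nominal", categories=2)) # GPA by gender
print(common_stat_method("scale", "nominal", categories=3)) # GPA by major
print(common_stat_method("scale", "scale"))                 # weight by height
```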
Evaluating an hypothesis
• Step 1: What is the relationship in the sample?
• Step 2: How confidently can one generalize from the sample to the universe from which it comes?
p < .05
Evaluating an hypothesis
                              Relationship in Sample           Statistical Significance
2 nom. vars.                  Cross-tab / contingency table    “p value” from Chi Square
Scale dep. & 2-cat indep.     Means for each category          “p value” from t-test
Scale dep. & 3+ cat indep.    Means for each category          “p value” from ANOVA F ratio
2 scale vars.                 Regression line                  “p value” from reg or correlation
                              Correlation r & r²
Relationships between Scale Variables
Regression
Correlation
Regression
• Amount that a dependent variable increases (or decreases) for each unit increase in an independent variable.
• Expressed as equation for a line: y = m(x) + b, the “regression line”
• Interpret by slope of the line: m
(Or: interpret by “odds ratio” in “logistic regression”)
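As a minimal sketch of how the slope m and constant b of a least-squares line can be computed, the example below uses the standard formulas (slope = sum of cross-products of deviations divided by sum of squared deviations in x). The function name `least_squares_line` and the five data points are hypothetical and are not taken from the class survey.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m(x) + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, b


# Made-up data: y increases by 2 for each unit increase in x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b = least_squares_line(xs, ys)
print(f"y = {m:.1f}(x) + {b:.1f}")
```

Here the slope m is read the way the slide describes: the amount y changes for each unit increase in x.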
Correlation
• Strength of association of scale measures
• r = -1 to 0 to +1
+1 perfect positive correlation
-1 perfect negative correlation
0 no correlation
• Interpret r in terms of variance
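The range of r can be illustrated with a short calculation of the Pearson correlation coefficient. This is a sketch with made-up numbers; `pearson_r` is a hypothetical helper name, and the three small data sets are chosen only to land at the endpoints and midpoint of the scale.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: cross-products of deviations over the
    square root of the product of the two sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # points fall on a rising line
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # points fall on a falling line
r_none = pearson_r(xs, [1, 2, 3, 2, 1])       # no linear trend
print(round(r_positive, 2), round(r_negative, 2), round(abs(r_none), 2))
```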
Mean & Variance
Example: Weight & Height
Survey of Class n = 42
• Height
• Mother’s height
• Mother’s education
• SAT
• Estimate IQ
• Well-being (7 pt. Likert)
• Weight
• Father’s education
• Family income
• G.P.A.
• Health (7 pt. Likert)
Frequency Table for: HEIGHT
                                   Valid      Cum
 Value    Frequency    Percent    Percent    Percent
 59.00        1           2.4        2.4        2.4
 61.00        2           4.8        4.8        7.1
 62.00        3           7.1        7.1       14.3
 63.00        3           7.1        7.1       21.4
 65.00        5          11.9       11.9       33.3
 66.00        3           7.1        7.1       40.5
 67.00        4           9.5        9.5       50.0
 68.00        5          11.9       11.9       61.9
 69.00        1           2.4        2.4       64.3
 70.00        6          14.3       14.3       78.6
 71.00        1           2.4        2.4       81.0
 72.00        4           9.5        9.5       90.5
 73.00        3           7.1        7.1       97.6
 74.00        1           2.4        2.4      100.0
           -------     -------    -------
 Total       42         100.0      100.0
Valid cases 42    Missing cases 0

Descriptive Statistics for: HEIGHT
Variable    Mean     Std Dev    Variance    Range    Minimum    Maximum    Valid N
HEIGHT      67.33    3.87       14.96       15.00    59.00      74.00      42
Variance
$$s^2 = \frac{\sum_i (x_i - \text{Mean})^2}{N - 1}$$
Standard Deviation
$$s = \sqrt{\text{variance}} = \sqrt{s^2}$$
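The variance formula can be checked against the HEIGHT frequency table from the class survey. The sketch below re-enters the table's values and frequencies by hand (the dictionary name `height_freq` is just a label for this example) and applies the N - 1 formula directly.

```python
import math

# Value: frequency, copied from the HEIGHT frequency table (n = 42).
height_freq = {59: 1, 61: 2, 62: 3, 63: 3, 65: 5, 66: 3, 67: 4,
               68: 5, 69: 1, 70: 6, 71: 1, 72: 4, 73: 3, 74: 1}

n = sum(height_freq.values())
mean = sum(value * freq for value, freq in height_freq.items()) / n

# Sum of squared distances from the mean, weighted by frequency.
sum_sq = sum(freq * (value - mean) ** 2 for value, freq in height_freq.items())
variance = sum_sq / (n - 1)
std_dev = math.sqrt(variance)

print(n, round(mean, 2), round(variance, 2), round(std_dev, 2))
# prints: 42 67.33 14.96 3.87
```

These match the Mean, Variance, and Std Dev reported in the descriptive statistics for HEIGHT.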
Frequency Table for: WEIGHT
                                   Valid      Cum
 Value    Frequency    Percent    Percent    Percent
115.00        1           2.4        2.4        2.4
120.00        1           2.4        2.4        4.8
124.00        1           2.4        2.4        7.1
125.00        4           9.5        9.5       16.7
128.00        1           2.4        2.4       19.0
130.00        6          14.3       14.3       33.3
135.00        4           9.5        9.5       42.9
136.00        1           2.4        2.4       45.2
140.00        3           7.1        7.1       52.4
145.00        2           4.8        4.8       57.1
150.00        3           7.1        7.1       64.3
155.00        2           4.8        4.8       69.0
160.00        6          14.3       14.3       83.3
165.00        2           4.8        4.8       88.1
170.00        1           2.4        2.4       90.5
185.00        1           2.4        2.4       92.9
190.00        2           4.8        4.8       97.6
210.00        1           2.4        2.4      100.0
           -------     -------    -------
 Total       42         100.0      100.0
Valid cases 42    Missing cases 0

Descriptive Statistics for: WEIGHT
Variable    Mean      Std Dev    Variance    Range    Minimum    Maximum    Valid N
WEIGHT      146.38    21.30      453.80      95.00    115.00     210.00     42
Relationship of weight & height:
Regression Analysis
“Least Squares” Regression Line
Dependent = ( B ) (Independent) + constant
weight = ( B ) ( height ) + constant
Regression line
Regression: WEIGHT on HEIGHT
Multiple R             .59254
R Square               .35110
Adjusted R Square      .33488
Standard Error       17.37332

Analysis of Variance
               DF    Sum of Squares    Mean Square
Regression      1      6532.61322      6532.61322
Residual       40     12073.29154       301.83229
F = 21.64319    Signif F = .0000

------------------ Variables in the Equation ------------------
Variable         B             SE B          Beta       T         Sig T
HEIGHT           3.263587      .701511       .592541    4.652     .0000
(Constant)     -73.367236    47.311093                 -1.551
[ Equation: Weight = 3.3 ( height ) - 73 ]
Regression line
W = 3.3 H - 73
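To see what the equation does, the sketch below plugs heights into the unrounded coefficients from the regression output (B = 3.263587, constant = -73.367236). The function name `predict_weight` and the example height of 70 are chosen for illustration only; the slides do not state the measurement units, so none are assumed here.

```python
# Coefficients copied from "Variables in the Equation" for WEIGHT on HEIGHT.
B_HEIGHT = 3.263587
CONSTANT = -73.367236


def predict_weight(height):
    """Predicted weight from the regression line: weight = B(height) + constant."""
    return B_HEIGHT * height + CONSTANT


# Prediction for a hypothetical height of 70.
print(round(predict_weight(70), 1))

# A least-squares line passes through the two means, so the mean height
# should give back (approximately) the mean weight of 146.38.
mean_height = 2828 / 42  # sum of heights / n, from the HEIGHT frequency table
print(round(predict_weight(mean_height), 2))
```

Using the rounded equation W = 3.3 H - 73 gives a slightly different prediction (158 at a height of 70), which is worth keeping in mind when reading results off the rounded line.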
Strength of Relationship
“Goodness of Fit”: Correlation
How well does the regression line “fit” the data?
[Figure: WEIGHT data with the mean (variance = 454) and the regression line]
Correlation: “Goodness of Fit”
• Variance (average squared distance from the mean) = 454
• “Least squares” (average squared distance from the regression line) = 295
[Figure: regression line (l.s. = 295) and mean (s² = 454)]
Correlation: “Goodness of Fit”
• How much is variance reduced by calculating from regression line?
• 454 - 295 = 159; 159 / 454 = .35
• Variance is reduced 35% by calculating “least squares” from regression line
• r² = .35
r² = % of variance in WEIGHT “explained” by HEIGHT
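The same reduction in variance can be recomputed from the Sum of Squares column of the WEIGHT on HEIGHT regression output. This is a sketch that takes the two sums of squares as given and divides by N - 1 = 41, which is an assumption about how the slide's 454 and 295 were obtained; the variable names are illustrative.

```python
# Sums of squares copied from the Analysis of Variance table.
ss_regression = 6532.61322
ss_residual = 12073.29154
n = 42

ss_total = ss_regression + ss_residual

# Average squared distance from the mean (the variance of WEIGHT).
variance_from_mean = ss_total / (n - 1)
# Average squared distance from the regression line, using the same divisor.
variance_from_line = ss_residual / (n - 1)

# Proportion of variance removed by measuring from the line instead of the mean.
r_squared = (variance_from_mean - variance_from_line) / variance_from_mean

print(round(variance_from_mean, 1))  # about 453.8, shown as 454 on the slide
print(round(variance_from_line, 1))  # about 294.5, close to the slide's 295
print(round(r_squared, 4))           # 0.3511, the R Square in the output
```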
Correlation coefficient = r
Correlation: HEIGHT with WEIGHT
            HEIGHT      WEIGHT
HEIGHT      1.0000       .5925
            (  42)      (  42)
            P= .        P= .000
WEIGHT       .5925      1.0000
            (  42)      (  42)
            P= .000     P= .
r = .59
r² = .35
HEIGHT “explains” 35% of variance in WEIGHT
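With one independent variable, the correlation and the regression output describe the same relationship, which can be checked with the standard formula linking r to the t statistic, t = r√(n − 2) / √(1 − r²). The sketch below uses the Multiple R from the regression output; `t_from_r` is a hypothetical helper name, and no p value is computed because that would need a t distribution table or a statistics library.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson correlation r based on n cases (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.59254  # Multiple R from the WEIGHT on HEIGHT regression output
n = 42

t = t_from_r(r, n)
print(round(t, 2))       # compare with T = 4.652 for HEIGHT
print(round(t ** 2, 2))  # compare with F = 21.64319
```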
Sentence & G.P.A.
• Regression: form of relationship
• Correlation: strength of relationship
• p value: statistical significance
Legal Attitudes Study:
1. Relationship of sentence length to G.P.A.?
2. Relationship of sentence length to Liberal-Conservative views?
G. P. A.
[Figure: frequency chart of grade point average]
Statistics: grade point average
N Valid             23
N Missing            1
Mean                 3.5752
Std. Deviation        .35057
Variance              .12290
Length of Sentence (simulated data)
[Figure: frequency chart of jail sentence]
Statistics: jail sentence
N Valid             24
N Missing            0
Mean                 5.1250
Std. Deviation       5.44788
Variance            29.67935
Scatterplot: Sentence on G.P.A.
[Figure: scatterplot of jail sentence (vertical axis) by grade point average (horizontal axis)]
Regression Coefficients
Coefficients (Dependent Variable: jail sentence)
                         Unstandardized Coefficients    Standardized Coefficients
Model 1                  B          Std. Error          Beta                         t         Sig.
(Constant)               17.853     12.097                                           1.476     .155
grade point average      -3.534      3.368              -.223                       -1.049     .306
Sentence = -3.5 G.P.A. + 18
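The t column in a coefficients table like this one is each coefficient divided by its standard error, which is a standard relationship rather than something stated on the slide. The sketch below recomputes it from the B and Std. Error values for the sentence on G.P.A. regression; the variable names are illustrative only.

```python
# B and Std. Error copied from the Coefficients table
# (dependent variable: jail sentence).
b_constant, se_constant = 17.853, 12.097
b_gpa, se_gpa = -3.534, 3.368

# t = coefficient / standard error
t_constant = b_constant / se_constant
t_gpa = b_gpa / se_gpa

print(round(t_constant, 3))  # 1.476, as in the table
print(round(t_gpa, 3))       # -1.049, as in the table
```

The slope is only about one standard error away from zero, which is consistent with the large Sig. value (.306) reported for grade point average.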
“Least Squares” Regression Line
[Figure: scatterplot of jail sentence by grade point average with the regression line Sent = -3.5 GPA + 18]
Correlation: Sentence & G.P.A.
Correlations
                                              grade point average    jail sentence
grade point average    Pearson Correlation    1                      -.223
                       Sig. (2-tailed)        .                       .306
                       N                      23                      23
jail sentence          Pearson Correlation    -.223                   1
                       Sig. (2-tailed)         .306                   .
                       N                      23                      24
Statistical Significance
• Regression: Sig. = .306 for the grade point average coefficient (t = -1.049)
• Correlation: Sig. (2-tailed) = .306 for Pearson r = -.223
p = .31
Interpreting Correlations
• r = -.22
• r² = .05
• p = .31
G.P.A. “explains” 5% of the variance in length of sentence
Write Results
“A regression analysis found that each one-unit increase in GPA was associated with a 3.5-month decrease in sentence length, but this correlation was low (r = -.22) and not statistically significant (p = .31).”
Multiple Regression
• Problem: relationship of weight and calorie consumption
• Both weight and calorie consumption related to height
• Need to “control for” height or assess relative effects of height and calorie consumption
[Figure: regression line, mean, and residuals]
Multiple Regression
• Regress weight residuals (dependent variable) on caloric intake (independent variable)
• Statistically “controls” for height: removes effect or “confound” of height.
• How much variance in weight does caloric intake account for over and above height?
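The residual procedure in the bullets above can be sketched in two steps: fit weight on height, then fit what is left over (the residuals) on caloric intake. All six data points below are invented for illustration, since the slides give no calorie data, and `fit_line` and `residuals` are hypothetical helper names. A statistics package fitting both predictors at once would handle the overlap between height and calories more completely than this two-step version.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Distance of each y from the regression line of y on x."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Made-up data for six people (not from the class survey).
heights = [62, 64, 66, 68, 70, 72]
calories = [1800, 2400, 2000, 2600, 2200, 2800]
weights = [120, 140, 138, 160, 155, 180]

# Step 1: remove the part of weight that height accounts for.
weight_resid = residuals(heights, weights)

# Step 2: regress the weight residuals on caloric intake.
slope, intercept = fit_line(calories, weight_resid)
print(f"weight residual = {slope:.4f} (calories) + {intercept:.1f}")
```

A positive slope in step 2 means that, among people of similar height in this invented sample, higher caloric intake goes with higher weight.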
Multiple Regression
• How much variance in dependent measure (weight, length of sentence) do all independent variables combined account for?
multiple R²
• What is the best “model” for predicting the dependent variable?
Malamuth: Sexual Aggression
• Dependent Var: self-report aggression
• Indep / Predictor Vars:
  - Dominance
  - Hostility toward women
  - Acceptance of violence toward women
  - Psychoticism
  - Sexual Experience
+ interaction effects
Malamuth: multiple regressions
• Without “tumescence” index:
  multiple R = .55; w/ interactions R = .67
  multiple R² = .30; w/ interactions R² = .45
• With “tumescence” index:
  multiple R = .62; w/ interactions R = .87
  multiple R² = .38; w/ interactions R² = .75