Coefficient of Determination r²
-
8/10/2019 Coefficient of Determination r
1/18
r²
Coefficient of Determination
Unit 3
-
Coefficient of Determination, r²
Once we've decided it's appropriate to use a line, we need to think about assessing the
accuracy of predictions.
-
Coefficient of Determination, r²
Suppose we wish to predict the price of homes in a particular city. We take a random sample of 20 houses to get y = price and x = size (our housing data).
Clearly, we are going to get some variability in the price, since houses differ in price.
How much of this variability in price can be explained by the fact that price is related to size and houses differ in size?
If a lot of the variation in price can be accounted for by house size, a prediction of price based on house size will be a big improvement over a prediction not based on house size.
Without house size, our best guess would be the average price of our sample (y-bar).
-
Coefficient of Determination, r²
The Coefficient of Determination, r², is
the proportion of variation in y that can
be attributed to (or explained by) the
approximate linear relationship between
x and y.
-
Coefficient of Determination, r²
r² is useful because:
It gives the proportion of the variance
(fluctuation) of one variable that is
predictable from the other variable.
It tells us how much of the variability in
the y's can be explained by the fact that
they are related to x.
-
Let's look at a formula
We find the total variation in y (SSTotal):
$\text{SSTotal} = \sum_{i=1}^{n} (y_i - \bar{y})^2$
It is also called SSM (Sum of Squares about the
mean).
It is the variation around the mean, so it looks like variance (before dividing by n - 1).
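To make SSTotal concrete, here is a minimal Python sketch. The five prices are made up for this illustration (they are not the 20-house sample), and the function name `ss_total` is only a label chosen here.

```python
# Hypothetical toy data, invented for illustration (not the class housing data).
prices = [2.0, 4.0, 5.0, 4.0, 5.0]   # y: house price, arbitrary units


def ss_total(y):
    """Sum of squared deviations of y about its mean (SSTotal, also called SSM)."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


sst = ss_total(prices)
# Dividing SSTotal by n - 1 gives the ordinary sample variance of y.
sample_variance = sst / (len(prices) - 1)

print(sst)              # 6.0
print(sample_variance)  # 1.5
```

Here y-bar is 4.0, so the squared deviations are 4, 0, 1, 0, 1, which sum to 6.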
-
Formula continued
We then find the Sum of Squared
Residuals (SSR):
$\text{SSR} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
It is also called SSE, or sum of squares of error.
This is sometimes referred to as a measure of
the unexplained variation, or the amount of variation in y that cannot be attributed to the
linear relationship between x and y.
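A small Python sketch of SSR, assuming the line used for the predictions is the least-squares regression line (LSRL). The data values and the helper names `fit_lsrl` and `ss_residual` are hypothetical, chosen only for this example.

```python
# Hypothetical toy data, invented for illustration (not the class housing data).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]    # x: house size, arbitrary units
prices = [2.0, 4.0, 5.0, 4.0, 5.0]   # y: house price, arbitrary units


def fit_lsrl(x, y):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ss_residual(x, y):
    """Sum of squared residuals (SSR, also called SSE) about the fitted line."""
    intercept, slope = fit_lsrl(x, y)
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


ssr = ss_residual(sizes, prices)
print(round(ssr, 4))  # 2.4
```

For this toy data the fitted line is y-hat = 2.2 + 0.6x, and the residuals are -0.8, 0.6, 1.0, -0.6, -0.2.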
-
Formula continued
This gives us
$r^2 = 1 - \frac{\text{SSR}}{\text{SSTotal}}$
If I multiply by 100, I get the percentage of
y variation attributable to the approximate
linear relationship between x and y.
The book uses the formula:
$r^2 = \frac{\text{SSM} - \text{SSE}}{\text{SSM}}$
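The two formulas are the same quantity written two ways, and for a least-squares line both equal the square of the correlation r. A minimal Python sketch that checks this on made-up data (the values and the function name `regression_summary` are assumptions for illustration only):

```python
import math

# Hypothetical toy data, invented for illustration (not the class housing data).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]    # x
prices = [2.0, 4.0, 5.0, 4.0, 5.0]   # y


def regression_summary(x, y):
    """Return (SSTotal, SSR, r) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)          # SSTotal (SSM)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - (intercept + slope * xi)) ** 2    # SSR (SSE)
              for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)                    # correlation coefficient
    return syy, ssr, r


sst, ssr, r = regression_summary(sizes, prices)

r2_from_slides = 1 - ssr / sst          # r^2 = 1 - SSR / SSTotal
r2_from_book = (sst - ssr) / sst        # r^2 = (SSM - SSE) / SSM
r2_from_corr = r ** 2                   # square of the correlation

print(round(r2_from_slides, 4))  # 0.6
print(round(r2_from_book, 4))    # 0.6
print(round(r2_from_corr, 4))    # 0.6
```

Multiplying by 100, about 60% of the variation in the toy prices is accounted for by the linear relationship with size.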
-
Couple of Examples:
The variation of each observation (y) from ŷ is small. ŷ explains the variation in y very well.
High r, high r²
-
The variation of each observation (y) from ŷ is not really small. ŷ doesn't explain the variation in y as well.
Poor r, poor r²
-
Example
Suppose from our strong example that r = .9; then r² = .81.
This means that 81% of the variation in the y
variable is accounted for by the linear relationship between x and y.
Suppose for the other model that r = -.4; then r² = .16.
This means that only 16% of the variation in the
y variable is accounted for by the linear
relationship.
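The arithmetic in these two examples can be checked with a few lines of Python. The function name `percent_explained` is just a label chosen for this sketch; note that squaring discards the sign of r, so r² alone does not say whether the relationship is positive or negative.

```python
def percent_explained(r):
    """Percentage of the variation in y accounted for by the linear relationship."""
    return round(100 * r ** 2, 1)


for r in (0.9, -0.4):
    print(f"r = {r:+.1f}  ->  r^2 = {r ** 2:.2f}  ({percent_explained(r)}% of the variation in y)")
```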
-
Some points
Always use in context.
Must interpret r² with our sentence ("81% of the variation in y is accounted for by the linear relationship between x and y"). Do
not say:
"The regression equation can predict 81% of
the data points"
"81% of data points lie on the LSRL"
"LSRL accounts for 81% of the data points"
-
Properties of r²
Properties to note:
r² ranges in value from 0 to 1.0.
The magnitude of r² is proportional to the
strength of the linear relation between x and y.
The location of the r² value relative to 0 and 1.0
indicates how close the linear relation is to
no linear relation (0) or to a perfect linear relation (1.0).
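The two endpoints of the range can be seen in a short Python sketch. The data sets are made up for this illustration: one lies exactly on a line, and the other follows a symmetric curve with no linear trend. The function name `r_squared` is an assumption of this example.

```python
def r_squared(x, y):
    """r^2 = 1 - SSR / SSTotal for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst


# Hypothetical data, invented for illustration.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
perfect_line = [3.0 + 2.0 * xi for xi in x]   # every point lies on y = 3 + 2x
no_linear_trend = [xi ** 2 for xi in x]       # symmetric curve, fitted slope is 0

print(r_squared(x, perfect_line))     # 1.0
print(r_squared(x, no_linear_trend))  # 0.0
```

The second data set still has a strong relationship between x and y, just not a linear one, which is why r² should always be described in terms of the linear relation.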
-
Examples
An r² value of 0.75 indicates that the linear relation is 3/4 of the distance between:
No linear relation between x and y
A perfect linear relation between x and y.
If the r² value between motivation to learn and classroom achievement equals 0.16 for females and 0.04 for males, we can conclude that the linear relation between these two variables is 4 times as strong for females as it is for males.
An r² value between systolic blood pressure and age equal to 0.38 implies that 38% of the variability in systolic blood pressure is accounted for by its linear relation with age.
-
Standard Deviation about the LSRL
$s_e = \sqrt{\frac{\text{SSR}}{n - 2}}$
This measures the typical amount by
which an observation deviates from the LSRL (analogous to sample
standard deviation).
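A minimal Python sketch of this calculation, using made-up data (not the class housing sample). The function name `std_about_lsrl` is hypothetical; the divisor n - 2 follows the formula on this slide.

```python
import math

# Hypothetical toy data, invented for illustration.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]    # x
prices = [2.0, 4.0, 5.0, 4.0, 5.0]   # y


def std_about_lsrl(x, y):
    """Standard deviation about the least-squares line: sqrt(SSR / (n - 2))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ssr / (n - 2))


s_e = std_about_lsrl(sizes, prices)
print(round(s_e, 3))  # 0.894
```

For this toy data SSR is about 2.4 and n = 5, so s_e is the square root of 0.8: a typical observation sits roughly 0.9 price units from the line.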
-
Example
-
Homework
Textbook pp. 190–196 # 15, 16, 31, 32, 47
ANOVA Table worksheet
Chapter 9 Project due October 12th
!! Unit 3 Test October 14th !!
-
Example
3.36.pdf
anova tables.pdf
anova answers.pdf