TRANSCRIPT
Slide 1
Warm Up: #23, 24 on page 555
Answer each of the following questions for #23, 24:
a) Is there a linear correlation? Use software, then Table A-6, to prove it. Are your answers the same?
b) Graph the points (don’t forget axis labels). If there is a correlation, graph the LSRL and continue with c-h.
c) Find the vital statistics (r, r-squared, a, b, y-hat; don’t forget to define x and y).
d) Tell me what r and r-squared mean in the context of the problem (r: form, direction, strength) (r-squared: how much of the variation in y can be explained by the linear relationship with x).
e) Find the residuals.
f) Draw the residual plot. Is the regression line a good model for the data? Why?
g) For #23, predict the winning time when the temperature is 73 degrees Fahrenheit.
h) For #24, predict the height of a daughter when her mother is 66 inches tall.
Slide 2
The SAT essay: longer is better?
Words 460 422 402 365 357 278 236 201 168 156 133 114 108 100 403
Score 6 6 5 5 6 5 4 4 4 3 2 2 1 1 5
Words 401 388 320 258 236 189 128 67 697 387 355 337 325 272 150
Score 6 6 5 4 4 3 2 1 6 6 5 5 4 4 2
Words 135 73
Score 3 1
Slide 3
Section 10-4 Variation
Slide 4
Key Concept
In this section we consider a method for constructing a prediction interval, which is an interval estimate of a predicted value of y.
Using paired data (x, y), we describe the variation in y that can be explained by the linear relationship between x and y and the variation that is unexplained.
Slide 5
Figure 10-9
Unexplained, Explained, and Total Deviation
Slide 6
Definitions
Total Deviation: The total deviation of (x, y) is the vertical distance y – y-bar, which is the distance between the point (x, y) and the horizontal line passing through the sample mean y-bar.
Explained Deviation: The explained deviation is the vertical distance y-hat – y-bar, which is the distance between the predicted y-value and the horizontal line passing through the sample mean y-bar.
Unexplained Deviation: The unexplained deviation is the vertical distance y – y-hat, which is the vertical distance between the point (x, y) and the regression line. (The distance y – y-hat is also called a residual, as defined in Section 10-3.)
Slide 7
Particulars
We can explain the discrepancy between y-bar = 9 and y-hat = 13 by noting that there is a linear relationship best described by the LSRL. This is the explained deviation (y-hat – y-bar).
The discrepancy between y-hat = 13 and y = 19 can’t be explained by the LSRL. This is the residual, or unexplained deviation (y – y-hat).
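As a quick arithmetic check using only the values quoted on this slide (y-bar = 9, y-hat = 13, y = 19), the three deviations work out as follows:

```latex
\begin{align*}
\text{total deviation} &= y - \bar{y} = 19 - 9 = 10 \\
\text{explained deviation} &= \hat{y} - \bar{y} = 13 - 9 = 4 \\
\text{unexplained deviation} &= y - \hat{y} = 19 - 13 = 6
\end{align*}
```

The explained and unexplained pieces add back up to the total: 4 + 6 = 10.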
Slide 8
Relationships
(total deviation) = (explained deviation) + (unexplained deviation)
$(y - \bar{y}) = (\hat{y} - \bar{y}) + (y - \hat{y})$
(total variation) = (explained variation) + (unexplained variation)
$\sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2$   (Formula 10-4)
Slide 9
Definition
$r^2 = \frac{\text{explained variation}}{\text{total variation}}$
The value of $r^2$ is the proportion of the variation in y that is explained by the linear relationship between x and y.
Coefficient of determination: the proportion of the variation in y that is explained by the regression line.
Slide 10
Warm Up: Day 2
Consider the following data set:
Find:
a) Total variation
b) Explained variation
c) Unexplained variation
X Y
1 4
2 24
4 8
5 32
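One way to check the warm-up by software is a short Python sketch like the one below. It fits the least-squares line to the four points on this slide and then sums the squared deviations from Formula 10-4. The function name `variations` and its variable names are illustrative choices, not anything from the textbook.

```python
def variations(xs, ys):
    """Return (total, explained, unexplained) variation for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Least-squares slope b1 and intercept b0 for y-hat = b0 + b1 * x.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hats = [b0 + b1 * x for x in xs]

    total = sum((y - y_bar) ** 2 for y in ys)                       # sum of (y - y-bar)^2
    explained = sum((yh - y_bar) ** 2 for yh in y_hats)             # sum of (y-hat - y-bar)^2
    unexplained = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # sum of (y - y-hat)^2
    return total, explained, unexplained


# Warm-up data from this slide.
total, explained, unexplained = variations([1, 2, 4, 5], [4, 24, 8, 32])
print(total, explained, unexplained)
```

Whatever the data, the first number printed should equal the sum of the other two (up to floating-point rounding), which is a handy check on hand calculations.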
Slide 11
Try again!
Consider the following data set:
Find:
a) Total variation
b) Explained variation
c) Unexplained variation
X Y
1 1
2 3
3 5
4 7
Slide 12
Not Old Faithful again!
In Section 10-2 we used the duration/interval after eruption times in Table 10-1 to find that r = 0.926. Find the coefficient of determination. Also, find the percentage of the total variation in y (time interval after eruption) that can be explained by the linear relationship between the duration of time and the time interval after an eruption.
Duration 240 120 178 234 235 269 255 220
Interval After 92 65 72 94 83 94 101 87
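For anyone checking this with software instead of a calculator, the sketch below recomputes r from the eight data pairs above and squares it to get the coefficient of determination. The helper name `correlation` is an illustrative choice, and the code uses only the Python standard library.

```python
import math


def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Old Faithful data from this slide.
duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

r = correlation(duration, interval_after)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

The printed r should agree with the r = 0.926 quoted on the slide, and multiplying r-squared by 100 gives the percentage of variation explained.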
Slide 13
Interpretation/New Def
86% of the total variation in time intervals after eruptions (y) can be explained by the duration times (x).
14% of the total variation in time intervals after eruptions can be explained by factors other than duration times.
Recall: y-hat = 34.8 + 0.234x (x = duration in seconds, y-hat = predicted time interval). When x = 180, we predict a y-hat of ____?
This single value is called a point estimate. It is our best predicted value. How accurate is it?
We use prediction intervals to answer this question.
Slide 14
Definitions
Prediction Interval: an interval estimate of a predicted value of y. The development of a prediction interval requires a measure of the spread of sample points about the regression line.
The standard error of estimate, denoted by $s_e$, is a measure of the differences (or distances) between the observed sample y-values and the predicted values y-hat that are obtained using the regression equation. That is, it is a collective measure of the spread of the sample points about the regression line.
$s_e$ = a measure of how sample points deviate from their regression line.
Slide 15
Standard Error of Estimate
$s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$
or
$s_e = \sqrt{\frac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}}$   (Formula 10-5)
Slide 16
Example: Old Faithful
Given the sample data in Table 10-1, find the standard error of estimate $s_e$ for the duration/interval data.
Duration 240 120 178 234 235 269 255 220
Interval After 92 65 72 94 83 94 101 87
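A minimal Python sketch of this calculation follows. It uses the residual form of the standard error of estimate (the square root of the sum of squared residuals over n – 2), with the regression coefficients computed from the data, not the rounded 34.8 and 0.234. The function name `standard_error_of_estimate` is an illustrative choice.

```python
import math


def standard_error_of_estimate(xs, ys):
    """s_e = sqrt( sum of (y - y-hat)^2 / (n - 2) ) for the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Least-squares slope b1 and intercept b0.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Sum of squared residuals (unexplained variation).
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))


duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

se = standard_error_of_estimate(duration, interval_after)
print(f"s_e = {se:.4f}")
```

Using unrounded coefficients avoids the rounding error that creeps in when b0 and b1 are rounded before the residuals are squared.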
Slide 17
Prediction Interval for an Individual y
$\hat{y} - E < y < \hat{y} + E$
where
$E = t_{\alpha/2} \, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n(\sum x^2) - (\sum x)^2}}$
$x_0$ represents the given value of x
$t_{\alpha/2}$ has n – 2 degrees of freedom
Slide 18
Example: Old Faithful
For the paired duration/interval after eruption times in Table 10-1, we have found that for a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min. Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 180 sec (so that x = 180).
Duration 240 120 178 234 235 269 255 220
Interval After 92 65 72 94 83 94 101 87
$E = t_{\alpha/2} \, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n(\sum x^2) - (\sum x)^2}}$
Slide 19
Example: Old Faithful - cont
For the paired duration/interval after eruption times in Table 10-1, we have found that for a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min. Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 180 sec (so that x = 180).
$\hat{y} - E < y < \hat{y} + E$
76.9 – 13.4 < y < 76.9 + 13.4
63.5 < y < 90.3
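The same interval can be reproduced with a short Python sketch. It assumes a critical value of t = 2.447 (the two-tailed 95% value for n – 2 = 6 degrees of freedom, as read from a t table); the function name `prediction_interval` and its parameters are illustrative, not from the textbook.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Return (y_hat, E) for an individual y at x = x0."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    x_bar = sum_x / n
    y_bar = sum(ys) / n

    # Least-squares slope b1 and intercept b0.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = b0 + b1 * x0

    # Standard error of estimate s_e.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))

    # Margin of error E for an individual y.
    margin = t_crit * se * math.sqrt(
        1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    return y_hat, margin


duration = [240, 120, 178, 234, 235, 269, 255, 220]
interval_after = [92, 65, 72, 94, 83, 94, 101, 87]

T_CRIT = 2.447  # assumed: t table value for 95% confidence, 6 degrees of freedom
y_hat, margin = prediction_interval(duration, interval_after, 180, T_CRIT)
print(f"{y_hat - margin:.1f} < y < {y_hat + margin:.1f}")
```

Because the sketch carries unrounded intermediate values, its endpoints may differ from the slide's in the last decimal place. Passing a different `x0` reuses the same function for other durations.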
Slide 20
Same problem, different x
For the paired duration/interval after eruption times, find:
1) For a duration of 150 sec, the best predicted time interval after the eruption is _____ min.
2) Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 150 sec (so that x = 150).
Duration 240 120 178 234 235 269 255 220
Interval After 92 65 72 94 83 94 101 87
$E = t_{\alpha/2} \, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n(\sum x^2) - (\sum x)^2}}$
$\hat{y} - E < y < \hat{y} + E$