This is a scatterplot of men’s age at first marriage against year at every census from 1890 to 1940.
Comment on this scatterplot.
$\widehat{\text{age}} = 25.7 - 0.04\,\text{year}$
A regression of this data gives:
• R-squared = 92.6%
• s = 0.2417
• Variable     Coefficient
• Intercept    25.7
• Year         -0.04
• What does all of this information tell us?
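One thing the output tells us is the correlation. In a regression with a single predictor, R-squared is the square of r, and r takes the sign of the slope. A minimal sketch of that arithmetic, using only the numbers reported above (the variable names are illustrative):

```python
import math

r_squared = 0.926  # R-squared from the regression output, as a proportion
slope = -0.04      # coefficient on Year from the regression output

# With one predictor, r**2 equals R-squared and r has the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"r is about {r:.3f}")
```

A value of r this close to -1 describes a strong negative linear association over the years covered by the data.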
Use the model to predict the median age at first marriage in the year 2000.
What is the term for using a model to "predict" values that are "far off" from the data used to fit it?
Extrapolation is always dangerous!
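The prediction can be sketched in a few lines. The slides do not say how Year is coded, so this sketch assumes it counts years since 1890 (the first census in the data); `base_year` and the function name are hypothetical, and a different coding would give a different number.

```python
def predicted_age(year, base_year=1890, intercept=25.7, slope=-0.04):
    """Fitted median age at first marriage for a calendar year.

    ASSUMPTION: Year in the model means (calendar year - base_year).
    The slides do not state the coding; 1890 is a guess.
    """
    return intercept + slope * (year - base_year)


inside_data = predicted_age(1940)   # last census used to fit the line
far_outside = predicted_age(2000)   # 60 years beyond the data

print(f"1940: {inside_data:.1f}  2000: {far_outside:.1f}")
```

Under that assumed coding the line gives about 21.3 for 2000. The model has no information about what happened after 1940, which is why a prediction this far out should not be trusted.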
FOXTROT Cartoon
When describing unusual points:
High Leverage Points:
• A data point can be unusual if its x-value is far from the mean of the x-values.
Influential Points:
• Influential points are often, though not always, high leverage points.
• A point is influential if omitting it from the analysis gives a very different model.
• Influence depends on both a point's leverage and its residual.
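Leverage can be put on a numeric scale. The slides do not give a formula; the sketch below uses the standard one for a regression with a single predictor, h = 1/n + (x - x̄)² / Σ(x - x̄)², on made-up x-values.

```python
def leverages(xs):
    """Leverage of each x-value in a one-predictor regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


# Made-up data: five x-values close together and one far from the mean.
xs = [1, 2, 3, 4, 5, 20]
h = leverages(xs)

for x, h_i in zip(xs, h):
    print(f"x = {x:>2}  leverage = {h_i:.3f}")
```

Leverage depends only on the x-values, so it measures a point's potential to change the line. Whether the point actually changes the line also depends on its y-value.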
Example #1
• There is the case of a point with moderate leverage but a very large residual (this can be influential).
Example #2
• There is the case of a point with high leverage whose y-value sits right on the line of fit. This point is not influential: it does not change the slope, but it does change R².
Example #3
• There is also the case of extreme leverage, where a point pulls the line right to it. This point is highly influential, but its residual is small.
Example #4
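The first three cases can be checked by fitting the line with and without the unusual point. This is a minimal sketch on made-up data, not the data behind the slides; `fit_line` and `r_squared` are illustrative helper names.

```python
def fit_line(points):
    """Least-squares (intercept, slope) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(points):
    """Proportion of the variation in y explained by the fitted line."""
    intercept, slope = fit_line(points)
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot


# Made-up base data lying close to a straight line.
base = [(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
b0, b1 = fit_line(base)

# Case 1: moderate leverage, very large residual.
slope_big_residual = fit_line(base + [(6, 15.0)])[1]

# Case 2: high leverage, y-value exactly on the base line.
on_line = base + [(20, b0 + b1 * 20)]
slope_on_line = fit_line(on_line)[1]

# Case 3: extreme leverage, y-value far from the base trend.
pulled = base + [(20, 5.0)]
p0, p1 = fit_line(pulled)
residual_pulled = 5.0 - (p0 + p1 * 20)

print(f"base slope {b1:.2f}, R^2 {r_squared(base):.3f}")
print(f"case 1 slope {slope_big_residual:.2f}")
print(f"case 2 slope {slope_on_line:.2f}, R^2 {r_squared(on_line):.3f}")
print(f"case 3 slope {p1:.2f}, residual at x=20 {residual_pulled:.2f}")
```

In case 3 the residual of the added point is small only because the line has already moved toward it, which is why a residual plot alone can miss a highly influential point.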
Regression Applet
• http://www.stat.sc.edu/~west/javahtml/Regression.html
• Experiment with adding additional points to a scatterplot and seeing how the regression line changes.
COMPUTER LAB EXPLORATION
1) REGRESSION BY EYE
Go to: http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html and read the instructions on that page. Really read them before you begin. Click begin on the left side of the page and start guessing!
2) EXPLORING INFLUENTIAL & HIGH LEVERAGE POINTS
Go to: http://illuminations.nctm.org/LessonDetail.aspx?ID=L456 and read the instructions. Explore the effect of outliers (influential and high leverage points) on the correlation coefficient and the least-squares line.
3) RESIDUAL PLOTS
Go to: http://www.math.csusb.edu/faculty/stanton/m262/regress/regress.html and read the directions. Then start plotting points, and you will see the least-squares line forming as well as the residual plots.
Regression towards the Mean
This is a closed wallet multiple choice test. If you are not sure of an answer, guess. Write your answers somewhere in your notebook or on a piece of scrap paper.
MONEY TEST
I. On the back of a nickel is:
(a) Monticello
(b) The Jefferson Memorial
II. On the back of a $2 bill is:
(c) Signers of the Declaration
(d) Independence Hall
III. On the front of a $500 bill is:
(e) Madison
(f) McKinley
Grade yourself (out of 3)
Let’s record our results
• Give us the number correct (out of 3)
• My score was 1 / 3
MONEY MAKE-UP TEST
• This is a closed wallet, multiple choice test. If you’re not sure of an answer, take your best guess.
I. On the front of a $20 bill is:
(a) Jefferson
(b) Jackson
II. On a dollar bill, Washington is looking to his:
(a) left
(b) right
III. On the front of a $1000 bill is:
(a) Cleveland
(b) Wilson
Give yourself a grade on the make-up test
Let’s record our make-up exam grades next to our old exams.
FIRST EXAM SCORE     SECOND EXAM SCORE
CALCULATE
• What was the average score on the first test for the “remedial” students (those who missed two or three questions)?
• What was the average score on the second test for the remedial students?
• What was the average score on the make-up test for the “star students” (those who got all three right on the first test)?
The Regression Effect / Regression towards the Mean
• Explain why the scores of the “remedial” students tended to go up and the scores of the “star” students tended to go down.
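The effect can be reproduced with a small simulation. This sketch assumes every student guesses on every question, so each two-choice question is right with probability 0.5 and the two tests are unrelated. That assumption, the class size, and the seed are illustrative and are not data from the class.

```python
import random


def simulate(n_students=10000, n_questions=3, seed=0):
    """Return (first_score, second_score) pairs for pure guessers."""
    rng = random.Random(seed)

    def guess_score():
        # Each two-choice question is answered correctly with probability 0.5.
        return sum(rng.random() < 0.5 for _ in range(n_questions))

    return [(guess_score(), guess_score()) for _ in range(n_students)]


def mean(values):
    return sum(values) / len(values)


pairs = simulate()

# "Remedial": missed two or three questions on the first test.
remedial = [(first, second) for first, second in pairs if first <= 1]
# "Star": got all three right on the first test.
stars = [(first, second) for first, second in pairs if first == 3]

remedial_first = mean([first for first, _ in remedial])
remedial_second = mean([second for _, second in remedial])
star_first = mean([first for first, _ in stars])
star_second = mean([second for _, second in stars])

print(f"remedial: {remedial_first:.2f} -> {remedial_second:.2f}")
print(f"stars:    {star_first:.2f} -> {star_second:.2f}")
```

Both groups end up near the overall average of 1.5 on the second test. Nobody learned or forgot anything between the tests: the groups were chosen for extreme first scores, and with luck involved, extreme scores tend to be followed by more ordinary ones.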