![Page 1: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/1.jpg)
Lesson 3 - 1
Scatterplots and Correlation
![Page 2: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/2.jpg)
Knowledge Objectives
• Explain the difference between an explanatory variable and a response variable
• Explain what it means for two variables to be positively or negatively associated
• Define the correlation r and describe what it measures
• List the four basic properties of the correlation r that you need to know in order to interpret any correlation
• List four other facts about correlation that must be kept in mind when using r
![Page 3: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/3.jpg)
Construction Objectives
• Given a set of bivariate data, construct a scatterplot.
• Explain what is meant by the direction, form, and strength of the overall pattern of a scatterplot.
• Explain how to recognize an outlier in a scatterplot.
• Explain how to add categorical variables to a scatterplot.
• Use a TI-83/84/89 to construct a scatterplot.
• Given a set of bivariate data, use technology to compute the correlation r.
![Page 4: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/4.jpg)
Vocabulary
• Bivariate data –
• Categorical Variables –
• Correlation (r) –
• Negatively Associated –
• Outlier –
• Positively Associated –
• Scatterplot –
• Scatterplot Direction –
• Scatterplot Form –
• Scatterplot Strength –
![Page 5: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/5.jpg)
Scatter Plots
• A scatter plot shows the relationship between two quantitative variables measured on the same individuals.
• Each individual in the data set is represented by a point in the scatter diagram.
• The explanatory variable is plotted on the horizontal axis and the response variable on the vertical axis.
• Do not connect the points when drawing a scatter diagram.
![Page 6: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/6.jpg)
Drawing Scatter Plots by Hand
• Plot the explanatory variable on the x-axis. If there is no explanatory-response distinction, either variable can go on the horizontal axis.
• Label both axes
• Scale both axes (but not necessarily the same scale on both axes). Intervals must be uniform.
• Make your plot large enough so that the details can be seen easily.
• If you have a grid, adopt a scale so that your plot uses the entire grid.
![Page 7: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/7.jpg)
TI-83 Instructions for Scatter Plots
• Enter explanatory variable in L1
• Enter response variable in L2
• Press 2nd Y= for StatPlot, select 1: Plot1
• Turn Plot1 on by highlighting ON and pressing ENTER
• Highlight the scatter plot icon and press ENTER
• Press ZOOM and select 9: ZoomStat
![Page 8: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/8.jpg)
Interpreting Scatterplots
• Just as distributions are described by certain important characteristics (Shape, Outliers, Center, Spread), scatter plots should be described by
  – Direction
    positive association (positive slope left to right)
    negative association (negative slope left to right)
  – Form
    linear (straight line)
    curved (quadratic, cubic, exponential, etc.)
  – Strength of the form
    weak
    moderate (either weak or strong)
    strong
  – Outliers (any points not conforming to the form)
  – Clusters (any sub-groups not conforming to the form)
![Page 9: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/9.jpg)
Example 1
[Figure: five scatterplots, each with Explanatory on the horizontal axis and Response on the vertical axis. Panel titles: Strong Negative Quadratic Association; Weak Negative Linear Association; No Relation; Strong Positive Linear Association; Strong Negative Linear Association]
![Page 10: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/10.jpg)
Example 2
Describe the scatterplot below
[Figure: scatterplot with one labeled point, Colorado]
• Mild negative exponential association
• One obvious outlier
• Two clusters: > 50% and < 50%
![Page 11: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/11.jpg)
Example 3
Describe the scatterplot below
[Figure: scatterplot]
• Mild positive linear association
• One mild outlier
![Page 12: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/12.jpg)
Adding Categorical Variables
Use a different plotting color or symbol for each category
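One way to prepare data for this kind of plot is to group the points by category and give each group its own symbol. The sketch below is only illustrative: the data, the category names, and the symbol list are all invented, and the actual drawing is left to whatever plotting tool is at hand.

```python
from collections import defaultdict

# Hypothetical data: (x, y, category) triples, invented for illustration.
points = [
    (1.0, 2.0, "north"),
    (2.0, 2.5, "south"),
    (3.0, 4.0, "north"),
    (4.0, 3.5, "south"),
]

# Arbitrary symbol choices; any set of distinct symbols or colors would do.
symbols = ["o", "x", "+", "*"]

def assign_symbols(points):
    """Group points by category and give each category one plotting symbol."""
    groups = defaultdict(list)
    for x, y, category in points:
        groups[category].append((x, y))
    # Sort category names so the symbol assignment is reproducible.
    # With more categories than symbols, the symbols would repeat.
    return {
        category: (symbols[i % len(symbols)], groups[category])
        for i, category in enumerate(sorted(groups))
    }

plot_groups = assign_symbols(points)
for category, (symbol, pts) in plot_groups.items():
    print(category, symbol, pts)
```

Each entry of `plot_groups` can then be drawn as its own series, so the categories stay visually distinct on the same pair of axes.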
![Page 13: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/13.jpg)
Associations
• When examining the definition of the linear correlation coefficient, r, remember that the definitions of positive and negative association emphasize above-average and below-average values
![Page 14: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/14.jpg)
Linear Correlation Coefficient, r

$$r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$$

Where
• $\bar{x}$ is the sample mean of the explanatory variable
• $s_x$ is the sample standard deviation for x
• $\bar{y}$ is the sample mean of the response variable
• $s_y$ is the sample standard deviation for y
• $n$ is the number of individuals in the sample
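The definition of r above can be sketched in a few lines of Python: standardize each x and y, multiply the standardized pairs, add them up, and divide by n – 1. The function name `correlation_r` and the small data set are assumptions made for illustration; they do not come from the lesson.

```python
from statistics import mean, stdev

def correlation_r(xs, ys):
    """Correlation r as the sum of products of standardized values over n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar, y_bar = mean(xs), mean(ys)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_r(xs, ys)
print(round(r, 4))
```

Points where x and y are both above average, or both below average, contribute positive products, which is why the sum comes out positive for positively associated variables.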
![Page 15: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/15.jpg)
Equivalent Form for r
• Easy for computers (and calculators)
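$$r = \frac{\sum x_i y_i - \dfrac{\sum x_i \sum y_i}{n}}{\sqrt{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}} \; \sqrt{\sum y_i^2 - \dfrac{\left(\sum y_i\right)^2}{n}}} = \frac{s_{xy}}{\sqrt{s_{xx}} \sqrt{s_{yy}}}$$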
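The equivalent form needs only running sums, which is what makes it convenient for machines. A minimal sketch, assuming the same invented five-point data set as a stand-in for real data; the function name `correlation_shortcut` is hypothetical.

```python
from math import sqrt

def correlation_shortcut(xs, ys):
    """Correlation r from sums only: s_xy / (sqrt(s_xx) * sqrt(s_yy))."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_shortcut = correlation_shortcut(xs, ys)
print(round(r_shortcut, 4))
```

For this data the sums give s_xy = 6, s_xx = 10, and s_yy = 6, so r = 6 / √60, the same value the standardized-score definition produces.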
![Page 16: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/16.jpg)
Important Properties of r
• Correlation makes no distinction between explanatory and response variables
• r does not change when we change the units of measurement of x, y or both
• Positive r indicates positive association between the variables and negative r indicates negative association
• The correlation r is always a number between -1 and 1
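These properties can be checked numerically. The sketch below uses invented height and weight values (not data from the lesson) and a helper named `corr`, which is an assumption of this example; it swaps the roles of x and y and converts the units, then compares the resulting r values.

```python
from math import sqrt, isclose

def corr(xs, ys):
    """Correlation r computed from sums (the calculator-friendly form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

# Hypothetical measurements, invented for illustration.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 150, 165, 180]

r_original = corr(heights_in, weights_lb)

# Swapping explanatory and response variables leaves r unchanged.
r_swapped = corr(weights_lb, heights_in)

# Changing units (inches to cm, pounds to kg) leaves r unchanged.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]
r_metric = corr(heights_cm, weights_kg)

print(isclose(r_original, r_swapped), isclose(r_original, r_metric))
```

The unit change only rescales each variable, and standardizing removes that scale, so any difference between the values is floating-point rounding.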
![Page 17: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/17.jpg)
Linear Correlation Coefficient Properties
• The linear correlation coefficient is always between -1 and 1
• If r = 1, then the variables have a perfect positive linear relation
• If r = -1, then the variables have a perfect negative linear relation
• The closer r is to 1, the stronger the evidence for a positive linear relation
• The closer r is to -1, the stronger the evidence for a negative linear relation
• If r is close to zero, there is little evidence of a linear relation between the two variables. An r close to zero does not mean that there is no relation between the two variables
• The linear correlation coefficient is a unitless measure of association
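The point that r near zero does not rule out a relation is easy to see with a small constructed example. In the sketch below, which uses invented data and a hypothetical helper `corr`, y is determined exactly by x through y = x², yet r is zero because the relation is not linear; a perfectly linear data set is included for contrast.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation r computed from sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

# Constructed data: x values symmetric about zero.
xs = [-3, -2, -1, 0, 1, 2, 3]

# A perfect curved (quadratic) relation: y depends only on x.
ys_quadratic = [x ** 2 for x in xs]
r_quadratic = corr(xs, ys_quadratic)

# A perfect positive linear relation for comparison.
ys_linear = [2 * x + 1 for x in xs]
r_linear = corr(xs, ys_linear)

print(r_quadratic, r_linear)
```

The positive and negative products cancel on the two sides of the parabola, which is why a scatterplot should always accompany r.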
![Page 18: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/18.jpg)
TI-83 Instructions for Correlation Coefficient
• With explanatory variable in L1 and response variable in L2
• Turn diagnostics on by
  – Go to catalog (2nd 0)
  – Scroll down and when DiagnosticOn is highlighted, hit enter twice
• Press STAT, highlight CALC and select 4: LinReg(ax+b) and hit enter twice
• Read r value (last line)
![Page 19: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/19.jpg)
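Example 4

| Observation | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| x | 3 | 2 | 2 | 4 | 5 | 15 | 22 | 13 | 6 | 5 | 4 | 1 |
| y | 0 | 1 | 2 | 1 | 2 | 9 | 16 | 5 | 3 | 3 | 1 | 0 |

• Draw a scatter plot of the above data
• Compute the correlation coefficient

r = 0.9613

[Figure: scatter plot of y versus x]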
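As a cross-check on the calculator result, the Example 4 data can be run through the definition of r in Python. This is a sketch rather than part of the lesson; the function name `correlation_r` is an assumption, and only the x and y values come from the example.

```python
from statistics import mean, stdev

def correlation_r(xs, ys):
    """Correlation r as the sum of products of standardized values over n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Data from Example 4.
x = [3, 2, 2, 4, 5, 15, 22, 13, 6, 5, 4, 1]
y = [0, 1, 2, 1, 2, 9, 16, 5, 3, 3, 1, 0]

r = correlation_r(x, y)
print(round(r, 4))  # prints 0.9613
```

The value agrees with the r = 0.9613 reported for this example.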
![Page 20: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/20.jpg)
Example 5
Match the r values to the scatterplots to the left
1) r = -0.99
2) r = -0.7
3) r = -0.3
4) r = 0
5) r = 0.5
6) r = 0.9
[Figure: six scatterplots labeled A through F, with answer key]
![Page 21: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/21.jpg)
Cautions to Heed
• Correlation requires that both variables be quantitative, so that it makes sense to do the arithmetic indicated by the formula for r
• Correlation does not describe curved relationships between variables, no matter how strong they are
• Like the mean and the standard deviation, the correlation is not resistant: r is strongly affected by a few outlying observations
• Correlation is not a complete summary of two-variable data
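The lack of resistance is easy to demonstrate. In the sketch below the data are invented for illustration and the helper `corr` is an assumption of this example: eight points follow a strong positive linear pattern, and a single added outlier is enough to pull r down sharply.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation r computed from sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

# Hypothetical data with a strong positive linear pattern.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 3, 5, 4, 6, 7, 8, 9]
r_without = corr(xs, ys)

# Add one outlying observation far below the pattern.
r_with = corr(xs + [9], ys + [0])

print(round(r_without, 3), round(r_with, 3))
```

Here r falls from about 0.98 to about 0.30 after adding one point, so a scatterplot should be checked for outliers before r is interpreted.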
![Page 22: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/22.jpg)
Observational Data Reminder
• If bivariate (two-variable) data are observational, then we cannot conclude that any relation between the explanatory and response variables is due to cause and effect
• Remember Observational versus Experimental Data
![Page 23: Lesson 3 - 1](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815e06550346895dcc5276/html5/thumbnails/23.jpg)
Summary and Homework
• Summary
  – Scatter plots can show associations between variables and are described using direction, form, strength and outliers
  – Correlation r measures the strength and direction of the linear association between two variables
  – r ranges between -1 and 1 with 0 indicating no linear association
• Homework
  – 3.7, 3.8, 3.13 – 3.16, 3.21