Chapter 7: Scatterplots, Association, and Correlation


Upload: mercy-greer

Post on 21-Dec-2015


Page 1: Chapter 7 Scatterplots, Association, and Correlation

Chapter 7: Scatterplots, Association, and Correlation

Page 2

A scatterplot shows the relationship between two quantitative variables measured on the same cases.

It shows patterns, trends, and relationships.

Scatterplots

Page 3

Direction: A positive direction (positive association) means that as one variable increases, the other tends to increase as well. A negative direction means that as one variable increases, the other tends to decrease.

Form: The overall shape of the pattern. Correlation is appropriate only when the points fall mostly along a straight line; if the pattern curves drastically, correlation is not a useful summary.

Strength: An association is strong if the points show little scatter around the underlying pattern.

Association

Page 4

The response variable (the one being explained or predicted) goes on the y-axis.

The explanatory variable (the one that explains or predicts) goes on the x-axis.

A lurking variable is a variable other than x and y that affects both variables, accounting for the correlation between the two.

Variables can have a strong association but still have a small correlation if the association isn’t linear.

Variables
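The point that a strong association can still have a small correlation can be checked numerically. The sketch below is only an illustration: the data are hypothetical (y is exactly x squared), and the helper name `correlation` is an assumption, not something from the slides. It uses an algebraically equivalent form of the correlation formula.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Equivalent to sum(z_x * z_y) / (n - 1); the (n - 1) factors cancel.
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y is completely determined by x, but the pattern is a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = correlation(xs, ys)
print(f"r for y = x^2 on symmetric x values: {r_curved:.3f}")
```

Here the association is as strong as it can be, yet r comes out essentially zero, because correlation measures only linear association.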

Page 5

Correlation numerically measures the direction and strength of the linear relationship between the explanatory and response variables.

Equation: $r = \frac{\sum z_x z_y}{n - 1}$

Correlation is always between -1 and +1.

A correlation near zero corresponds to a weak linear association.

Correlation
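As a rough sketch of how the correlation formula can be computed, the code below standardizes each variable and then divides the sum of the z-score products by n - 1. The function name `correlation` and the data (`hours`, `scores`) are hypothetical and chosen only for illustration.

```python
import math

def correlation(xs, ys):
    """Return r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1), matching the n - 1 in the formula.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Standardize every value, then sum the products of paired z-scores.
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical data, for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
print(f"r = {r:.3f}")
```

Because both variables are standardized first, r has no units and does not change if either variable is rescaled or shifted.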

Page 6

Correlation

[Figure: three example scatterplots labeled "Strong correlation", "Weak correlation", and "No correlation"]

Page 7

Look for unusual features such as clusters or subgroups and outliers (points that do not fit the overall pattern).

When you see an outlier, report the correlations with and without the point.

Correlation is sensitive to outliers. A single outlying value can make a small correlation large or make a large one small.

Outliers
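The advice to report the correlation with and without an outlier can be sketched as follows. The data are hypothetical: six points with a weak pattern plus one far-away point at (20, 20). The helper name `correlation` is likewise an assumption made for this illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r, computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a weak pattern in the first six points.
xs = [1, 2, 3, 4, 5, 6]
ys = [3, 1, 4, 2, 5, 3]

# One outlying point, far from the rest in both x and y.
outlier_x, outlier_y = 20, 20

r_without = correlation(xs, ys)
r_with = correlation(xs + [outlier_x], ys + [outlier_y])

print(f"r without the outlier: {r_without:.2f}")
print(f"r with the outlier:    {r_with:.2f}")
```

In this made-up example a single point turns a weak correlation into one close to 1, which is why both values should be reported.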

Page 8

A study examined brain size (measured as pixels counted in a digitized resonance image of a cross-section of the brain) and IQ (four Performance scales of the Wechsler IQ test) for college students. The scatterplot shows the performance IQ scores vs. the brain size. Comment on the association between brain size and IQ as seen in this scatterplot.

Problem #7

Page 9

Answer: The data points are widely scattered. The scatterplot shows at most an extremely weak positive association between brain size and performance IQ, and it has no clear form. The points are so scattered that there may be no association at all.

Problem #7

Page 10

A ceramics factory can fire eight large batches of pottery a day. Sometimes in the process a few of the pieces break. In order to understand the problem better, the factory records the number of broken pieces in each batch for 3 days and then creates the scatterplot shown.

Problem #9

Page 11

a. Make a histogram showing the distribution of the number of broken pieces in the 24 batches of pottery examined.

Answer:

Problem #9

Page 12

b. Describe the distribution as shown in the histogram. What feature of the problem is more apparent in the histogram than in the scatterplot?

Answer: The histogram is unimodal and skewed to the right: as the number of broken pieces increases, the number of batches with that many broken pieces decreases. Apart from the first bar, it is roughly uniform.

Problem #9

Page 13

c. What aspect of the company’s problem is more apparent in the scatterplot?

Answer: There is a weak positive association between the batch number and the number of broken pieces. As one variable increases, so does the other, but the data points are widely scattered and do not follow a clear line, so the association is weak.

Problem #9