
Upload: everett-page

Post on 27-Dec-2015

TRANSCRIPT

Slide 1

Baby Leo’s 4-month “Healthy Baby” check-up reported the following:

1) He is in the 90th percentile for weight

2) He is in the 95th percentile for head circumference

3) He is in the 100th percentile for height.

Interpret these scores. What do they mean?

Slide 2

Section 3-5: Exploratory Data Analysis (EDA)

Slide 3

Key Concept

This section discusses outliers, then introduces a new statistical graph called a boxplot, which is helpful for visualizing the distribution of data.

Slide 4

Definition

Exploratory Data Analysis (EDA): the process of using statistical tools (such as graphs, measures of center, and measures of variation) to investigate data sets in order to understand their important characteristics.

Slide 5

Definition

An outlier is a value that is located very far away from almost all of the other values: an extreme value that falls outside the general pattern of the data.

• Not all outliers are errors.

• An outlier can have a dramatic effect on the mean.

• An outlier can have a dramatic effect on the standard deviation.

• An outlier can have a dramatic effect on the scale of the histogram, so that the true nature of the distribution is totally obscured.
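The effect of a single outlier on the mean and standard deviation can be checked numerically. The sketch below uses a small made-up data set (the values are hypothetical, not taken from the slides) and compares the statistics with and without one extreme value.

```python
import statistics

# Hypothetical data chosen for illustration; not taken from the slides.
typical = [2, 3, 3, 4, 5, 5, 6]
with_outlier = typical + [60]  # same data plus one extreme value

for label, data in [("without outlier", typical), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    stdev = statistics.stdev(data)  # sample standard deviation
    print(f"{label}: mean={mean:.2f}, median={median:.2f}, stdev={stdev:.2f}")
```

In this example the mean and standard deviation change a great deal, while the median barely moves, which is why the median is described as resistant to outliers.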

Slide 6

Definitions

For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.

A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.

(Calculator: with/without potential outlier, boxplot/modified boxplot)
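A 5-number summary can be computed in a few lines. This is a minimal sketch that assumes the "median of each half" rule for quartiles (when the count is odd, the overall median is left out of both halves). Other quartile conventions exist and can give slightly different values. The function name `five_number_summary` and the sample data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values below the median position
    upper = data[len(data) - half:]   # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical sample, not from the slides.
sample = [7, 1, 9, 3, 5, 11, 13]
print(five_number_summary(sample))  # (1, 3, 7, 11, 13)
```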

Slide 7

Boxplots (useful for revealing center, spread, distribution, and outliers)

Slide 8

Modified Boxplots

Some statistical packages provide modified boxplots which represent outliers as special points.

A data value is an outlier if it is …

above Q3 by an amount greater than 1.5 × IQR

or

below Q1 by an amount greater than 1.5 × IQR,

where IQR = Q3 − Q1 is the interquartile range.

Use the outlier criterion on our dataset

1 2 3 4 8 165

to identify any outliers.
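The 1.5 × IQR criterion can be sketched as follows. Two assumptions are made here: the slide's data set is read as the six values 1, 2, 3, 4, 8, 165 (the spacing in the transcript is ambiguous), and quartiles are taken as the medians of the lower and upper halves. The helper name `find_outliers` is hypothetical.

```python
from statistics import median

def find_outliers(values):
    """Return (outliers, (low_fence, high_fence)) using the 1.5 x IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])               # median of the lower half
    q3 = median(data[len(data) - half:])   # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return outliers, (low_fence, high_fence)

# Assumed reading of the slide's data set.
dataset = [1, 2, 3, 4, 8, 165]
outliers, fences = find_outliers(dataset)
print("fences:", fences)      # (-7.0, 17.0)
print("outliers:", outliers)  # [165]
```

Under these assumptions Q1 = 2, Q3 = 8, and IQR = 6, so any value below −7 or above 17 is flagged, which singles out 165.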

Slide 9

Modified Boxplot Construction

A modified boxplot is constructed with these specifications:

A special symbol (such as an asterisk) is used to identify outliers.

The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.

Slide 10

Modified Boxplots - Example

Slide 11

Do male doctors perform more C-sections than female doctors? A study in Switzerland examined the number of C-sections performed in one year by a sample of male and female doctors.

Male doctor data:

20 25 25 27 28 31 33 34 36 37 44 50 59 85 86

Min Q1 M Q3 Max

Female doctor data:

5 7 10 14 18 19 25 29 31 33

Min Q1 M = 18.5 Q3 Max

• Are there any outliers?

• What do the modified boxplots tell you?

• Give some overall observations.
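One way to check the answers is to compute the pieces of each modified boxplot directly. The sketch below uses the two data sets from this slide. It assumes the median-of-halves rule for quartiles, and the function name `modified_boxplot_parts` is hypothetical. The whiskers stop at the most extreme values that are not outliers, following the modified boxplot construction described earlier.

```python
from statistics import median

# C-section counts from the slide.
male = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
female = [5, 7, 10, 14, 18, 19, 25, 29, 31, 33]

def modified_boxplot_parts(values):
    """Return the quartiles, whisker ends, and outliers for a modified boxplot."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])               # median of the lower half
    q3 = median(data[len(data) - half:])   # median of the upper half
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "whisker_low": inside[0],     # smallest value that is not an outlier
        "whisker_high": inside[-1],   # largest value that is not an outlier
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

for label, counts in [("male", male), ("female", female)]:
    print(label, modified_boxplot_parts(counts))
```

With this quartile rule, the male data give Q1 = 27, median = 34, Q3 = 50, with 85 and 86 flagged as outliers and the upper whisker ending at 59. The female data give Q1 = 10, median = 18.5, Q3 = 29, with no outliers.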

Slide 12

You do!

Here are measured reaction times (in seconds) in a test of driving skills:

2.4 2.5 2.8 2.0 2.4 2.9 3.2 3.5 2.7 2.7 2.8 2.7

1) Find the five-number summary

2) Draw the modified boxplot. Are there any outliers?

Slide 13

Boxplots (cont.)

Slide 14

Boxplots (cont.)

Slide 15

Statdisk Pulse rates

Compare male and female

1) Center

2) Variation

3) 5-number summary

4) Are there any outliers?

Slide 16

Recap

In this section we have looked at:

Exploratory Data Analysis

Effects of outliers

5-number summary

Boxplots and modified boxplots