Statistics: Dealing With Uncertainty

ACADs (08-006) Covered

Keywords: Sample, normal distribution, central tendency, histogram, probability, population, data sorting, standard normal distribution, Z-tables.

Supporting Material: 1.1.1.2 1.1.1.4 3.2.3.19 3.2.3.20

Upload: anastasia-mcgee

Post on 30-Dec-2015




Statistics

Dealing With Uncertainty


Objectives

• Describe the difference between a sample and a population

• Learn to use descriptive statistics (data sorting, central tendency, etc.)

• Learn how to prepare and interpret histograms

• State what is meant by normal distribution and standard normal distribution.

• Use Z-tables to compute probability.


Statistics

• “There are lies, d#$& lies, and then there’s statistics.”

Mark Twain


Statistics is...

• a standard method for...
  – collecting, organizing, summarizing, presenting, and analyzing data
  – drawing conclusions
  – making decisions based upon the analyses of these data.

• used extensively by engineers (e.g., quality control)


Populations and Samples

• Population - complete set of all of the possible instances of a particular object
  – e.g., the entire class

• Sample - subset of the population
  – e.g., a team

• We use samples to draw conclusions about the parent population.
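As a rough illustration of using a sample to draw conclusions about its parent population, the sketch below builds a made-up population of ages, draws a random team-sized sample, and compares the two means. The class size, age range, and team size are assumptions chosen only for illustration; they are not data from the course.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of a 100-student class (invented values).
population = [random.randint(18, 25) for _ in range(100)]

# Sample: a randomly chosen "team" of 5 students from that class.
sample = random.sample(population, 5)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

With only five students in the sample, the two means will usually differ somewhat; the sample mean is an estimate, not the population value.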


Why use samples?

• The population may be large
  – all people on earth, all stars in the sky.

• The population may be dangerous to observe
  – automobile wrecks, explosions, etc.

• The population may be difficult to measure
  – subatomic particles.

• Measurement may destroy sample
  – bolt strength


Exercise: Sample Bias

• To three significant figures, estimate the average age of the class based upon your team.

• When would a team not be a representative sample of the class?


Measures of Central Tendency

• If you wish to describe a population (or a sample) with a single number, what do you use?

  – Mean - the arithmetic average
  – Mode - most likely (most common) value.
  – Median - “middle” of the data set.


What is the Mean?

• The mean is the sum of all data values divided by the number of values.


Sample Mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Where:
  – $\bar{x}$ is the sample mean
  – $x_i$ are the data points
  – $n$ is the sample size
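The sample mean can be sketched in a few lines of Python. The function name `sample_mean` and the list of ages are hypothetical, introduced only for this illustration.

```python
def sample_mean(data):
    """Sum of all data values divided by the number of values."""
    if not data:
        raise ValueError("mean of an empty data set is undefined")
    return sum(data) / len(data)


ages = [19, 20, 20, 21, 25]  # hypothetical team ages (invented values)
x_bar = sample_mean(ages)    # (19 + 20 + 20 + 21 + 25) / 5 = 21.0
print(x_bar)
```

The same arithmetic gives the population mean when the list holds every member of the population rather than a subset.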


Population Mean

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

Where:
  – $\mu$ is the population mean
  – $x_i$ are the data points
  – $N$ is the total number of observations in the population


What is the Mode?

• mode - the value that occurs the most often in discrete data (or data that have been grouped into discrete intervals)

– Example: students in this class are most likely to get a grade of B.


Mode continued

• Example of a grade distribution with mean C, mode B

[Figure: bar chart with a vertical axis from 0 to 25 and the grades F, D, C, B, A on the horizontal axis]


What is the Median?

• Median - for sorted data, the median is the middle value (for an odd number of points) or the average of the two middle values (for an even number of points).
  – useful to characterize data sets with a few extreme values that would distort the mean (e.g., house prices, family incomes).
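To see how a few extreme values distort the mean but not the median, here is a small sketch using Python's standard `statistics` module. The house prices are invented for illustration.

```python
import statistics

# Hypothetical house prices in thousands of dollars; one extreme value.
prices = [150, 160, 160, 175, 190, 2000]

mean_price = statistics.mean(prices)      # pulled upward by the 2000
median_price = statistics.median(prices)  # average of the two middle values
mode_price = statistics.mode(prices)      # most common value

print(f"mean = {mean_price}, median = {median_price}, mode = {mode_price}")
```

Here the single 2000 entry drags the mean to 472.5, while the median (167.5) and mode (160) stay close to the typical price.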


What Is the Range?

• Range - the difference between the lowest and highest values in the set.
  – Example: driving time to Houston is 2 hours +/- 15 minutes. Therefore...
    • Minimum = 105 minutes
    • Maximum = 135 minutes
    • Range = 30 minutes


Standard Deviation

• Gives a single-number measure of the scatter in the data about the mean.


Standard Deviation

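• Population

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

Variance = $\sigma^2$

• Sample

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Variance = $s^2$

The terms $(x_i - \mu)$ and $(x_i - \bar{x})$ are the deviations of each data point from the mean.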
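The two standard deviation formulas differ only in the divisor, which a short sketch can make concrete. The function names and the data set are hypothetical; the results are cross-checked against `statistics.pstdev` and `statistics.stdev` from the Python standard library.

```python
import math
import statistics


def population_std(data):
    """sigma: divide the summed squared deviations by N."""
    mu = sum(data) / len(data)
    return math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))


def sample_std(data):
    """s: divide the summed squared deviations by n - 1."""
    x_bar = sum(data) / len(data)
    return math.sqrt(sum((x - x_bar) ** 2 for x in data) / (len(data) - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
sigma = population_std(data)     # mean is 5, squared deviations sum to 32
s = sample_std(data)             # slightly larger than sigma

# Cross-check against the standard library implementations.
assert math.isclose(sigma, statistics.pstdev(data))
assert math.isclose(s, statistics.stdev(data))
print(f"sigma = {sigma:.3f}, s = {s:.3f}")
```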


The Subtle Difference Between s and σ

N versus n-1: n-1 is needed to get a better estimate of the population σ from the sample s.

Note: for large n, the difference is trivial.
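One way to see why the difference becomes trivial: if both divisors are applied to the same n data points, the two results differ only by the factor sqrt(n / (n - 1)). The sketch below tabulates that factor for a few sample sizes; the helper name `divisor_ratio` is hypothetical.

```python
import math


def divisor_ratio(n):
    """Ratio of the n - 1 result to the n result for the same n data points."""
    return math.sqrt(n / (n - 1))


for n in (5, 50, 500, 5000):
    print(f"n = {n:5d}  ratio = {divisor_ratio(n):.5f}")
```

The ratio is always greater than 1 and shrinks toward 1 as n grows.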


A Valuable Tool

• Gauss developed the theory behind standard deviation circa 1800 to explain the error observed in measured star positions.

• Today it is used in everything from quality control to measuring financial risk.


Team Exercise

• In your team’s bag of M&M candies, count
  – the number of candies for each color
  – the total number of candies in the bag

• When you are done counting, have a representative from your team enter your data in Excel



Team Exercise (cont’d)

For each color, and the total number of candies, determine the following:

maximum    mode
minimum    median
range      standard deviation
mean       variance
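The exercise is meant to be done in Excel; as an alternative sketch, the same eight quantities can be computed with Python's standard `statistics` module. The candy counts below are invented, and the sample (n - 1) forms of standard deviation and variance are an assumption about which form the exercise intends.

```python
import statistics

# Hypothetical counts of red candies in eight bags (invented values).
red_counts = [9, 12, 11, 12, 14, 10, 12, 8]

summary = {
    "maximum": max(red_counts),
    "minimum": min(red_counts),
    "range": max(red_counts) - min(red_counts),
    "mean": statistics.mean(red_counts),
    "mode": statistics.mode(red_counts),
    "median": statistics.median(red_counts),
    "standard deviation": statistics.stdev(red_counts),  # sample form (n - 1)
    "variance": statistics.variance(red_counts),         # sample form (n - 1)
}

for name, value in summary.items():
    print(f"{name:>18}: {value:.2f}")
```

Repeating the calculation for each color and for the bag totals only requires swapping in a different list.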


Individual Exercise: Histograms

• Flip a coin EXACTLY ten times. Count the number of heads YOU get.

• Report your result to the instructor who will post all the results on the board

• Open Excel

• Using the data from the entire class, create bar graphs showing the number of classmates who get one head, two heads, three heads, etc.
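For readers without a classroom at hand, the coin-flip exercise can be simulated. This is only a sketch: the class size of 30 and the fixed random seed are assumptions, and a text bar chart stands in for the Excel graph.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

NUM_STUDENTS = 30        # hypothetical class size
FLIPS_PER_STUDENT = 10   # each student flips exactly ten times

# Each student flips a fair coin ten times and reports the number of heads.
heads_per_student = [
    sum(random.random() < 0.5 for _ in range(FLIPS_PER_STUDENT))
    for _ in range(NUM_STUDENTS)
]

# Tally how many students got 0, 1, ..., 10 heads.
tally = Counter(heads_per_student)

for heads in range(FLIPS_PER_STUDENT + 1):
    print(f"{heads:2d} heads | {'#' * tally[heads]}")
```

Each row of `#` marks is one bar of the frequency histogram; with more students the bars tend toward a bell shape centered near five heads.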


Data Distributions

• The “shape” of the data is described by its frequency histogram.

• Data that behave “normally” exhibit a “bell-shaped” curve, or the “normal” distribution.

• Gauss found that star position errors tended to follow a “normal” distribution.


The Normal Distribution

• The normal distribution is sometimes called the “Gauss” curve.

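$$\mathrm{RF} = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

[Figure: bell-shaped curve of relative frequency (RF) versus x, centered on the mean μ]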
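The normal curve can be evaluated directly from its formula. In the sketch below, the function name `relative_frequency` and the values of the mean and standard deviation are hypothetical, chosen only to show the symmetry of the curve about the mean.

```python
import math


def relative_frequency(x, mu, sigma):
    """Normal curve: exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2*pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu, sigma = 120.0, 15.0  # hypothetical mean and standard deviation
peak = relative_frequency(mu, mu, sigma)           # highest point, at the mean
left = relative_frequency(mu - sigma, mu, sigma)   # one sigma below the mean
right = relative_frequency(mu + sigma, mu, sigma)  # one sigma above the mean

print(f"peak = {peak:.5f}, left = {left:.5f}, right = {right:.5f}")
```

Points the same distance on either side of the mean give the same relative frequency, and the curve is highest at the mean.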


Standard Normal Distribution

Define: $z = \frac{x - \mu}{\sigma}$

Then:

$$\mathrm{RF} = \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2}$$

[Figure: standard normal curve, RF (0.0 to 0.5) versus z (-4.0 to 4.0); Area = 1.00]
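Converting a measurement to a z value is a single subtraction and division, as this sketch shows. The function names and the drive-time numbers (mean 120 minutes, standard deviation 15 minutes) are assumptions made for illustration.

```python
import math


def z_score(x, mu, sigma):
    """Standardize x: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def standard_rf(z):
    """Standard normal curve: exp(-z**2 / 2) / sqrt(2*pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


mu, sigma = 120.0, 15.0        # hypothetical drive-time statistics, in minutes
z = z_score(135.0, mu, sigma)  # 135 min lies one standard deviation above the mean
peak = standard_rf(0.0)        # top of the standard normal curve, at z = 0

print(f"z = {z:.1f}, peak RF = {peak:.4f}")
```

Because every normal distribution maps onto the same standard curve this way, a single table of areas serves for all of them.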


Some handy things to know.

• 50% of the area lies on each side of the mid-point for any normal curve.

• A standard normal distribution (SND) has a total area of 1.00.

• “z-Tables” show the area under the standard normal distribution, and can be used to find the area between any two points on the z-axis.


Using Z Tables (Appendix C, p. 624)

• Question: Find the area between z = -1.0 and z = 2.0
  – From table, for z = 1.0, area = 0.3413
  – By symmetry, for z = -1.0, area = 0.3413
  – From table, for z = 2.0, area = 0.4772
  – Total area = 0.3413 + 0.4772 = 0.8185
  – “Tails” area = 1.0 - 0.8185 = 0.1815
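The table lookups can be checked numerically. The sketch below is not the textbook's table; it computes the area between 0 and z from the error function `math.erf` in the Python standard library, and the helper name `area_from_zero` is hypothetical.

```python
import math


def area_from_zero(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


left_part = area_from_zero(1.0)   # covers z = -1.0 to 0, by symmetry
right_part = area_from_zero(2.0)  # covers z = 0 to 2.0
total_area = left_part + right_part
tails_area = 1.0 - total_area

print(f"{left_part:.4f} + {right_part:.4f} = {total_area:.4f}")
print(f"tails = {tails_area:.4f}")
```

The computed total can differ from the hand calculation in the fourth decimal place, because the table entries are rounded before they are added.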


“Quick and Dirty” Estimates of μ and σ

• μ ≈ (lowest + 4*mode + highest)/6

• For a standard normal curve, 99.7% of the area is contained within ± 3σ from the mean.

• Define “highest” = μ + 3σ

• Define “lowest” = μ - 3σ

• Therefore, σ ≈ (highest - lowest)/6


Example: Drive time to Houston

• Lowest = 1 h

• Most likely = 2 h

• Highest = 4 h (including a flat tire, etc.)
  – μ = (1+4*2+4)/6 = 2.17 h (2 h 10 min)
  – σ = (4 - 1)/6 = 0.5 h

• This technique (Delphi) was used to plan the moon flights.
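The quick estimates are easy to wrap in a small helper. This is a sketch of the two formulas applied to the drive-time guesses; the function name `quick_estimates` is hypothetical.

```python
def quick_estimates(lowest, most_likely, highest):
    """Return rough (mean, standard deviation) from three guesses."""
    mu = (lowest + 4 * most_likely + highest) / 6
    sigma = (highest - lowest) / 6
    return mu, sigma


mu, sigma = quick_estimates(1, 2, 4)  # drive time to Houston, in hours
minutes = round((mu - int(mu)) * 60)  # fractional hour converted to minutes

print(f"mu = {mu:.2f} h ({int(mu)} h {minutes} min), sigma = {sigma:.1f} h")
```

The most likely value is weighted four times as heavily as either extreme, so one pessimistic guess shifts the estimate only modestly.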


Review

• Central tendency
  – mean
  – mode
  – median

• Scatter
  – range
  – variance
  – standard deviation

• Normal Distribution