Chapter 1 & 3. Statistics - the science of collecting, analyzing, and drawing conclusions from...

Post on 19-Dec-2015


Chapter 1 & 3

Statistics - the science of collecting, analyzing, and drawing conclusions from data

Descriptive statistics - the methods of organizing & summarizing data

Inferential statistics - involves making generalizations from a sample to a population

Population - the entire collection of individuals or objects about which information is desired

Sample - a subset of the population, selected for study in some prescribed manner

Variable - any characteristic whose value may change from one individual to another

Data - observations on a single variable or simultaneously on two or more variables

Types of variables

Categorical variables (or qualitative) - identify basic differentiating characteristics of the population

Numerical variables (or quantitative) - observations or measurements that take on numerical values

– it makes sense to average these values

– two types: discrete & continuous

Discrete (numerical)

listable set of values

usually counts of items

Continuous (numerical)

data can take on any values in the domain of the variable

usually measurements of something

Classification by the number of variables

Univariate - data that describes a single characteristic of the population

Bivariate - data that describes two characteristics of the population

Multivariate - data that describes more than two characteristics (beyond the scope of this course)

Identify the following variables:

1. the income of adults in your city

2. the color of M&M candies selected at random from a bag

3. the number of speeding tickets each student in AP Statistics has received

4. the area code of an individual

5. the birth weights of female babies born at a large hospital over the course of a year

Answers:

1. Numerical

2. Categorical

3. Numerical

4. Categorical (the digits are only a label; it makes no sense to average them)

5. Numerical

Graphs for categorical data

Bar Graph

Used for categorical data

Bars do not touch

Categorical variable is typically on the horizontal axis

To describe – comment on which occurred the most often or least often

May make a double bar graph or segmented bar graph for bivariate categorical data sets

Using class survey data:

graph birth month

graph gender & handedness

Pie (Circle) graph

Used for categorical data

To make:

– Multiply the proportion for each category by 360° to find its angle

– Using a protractor, mark off each part

To describe – comment on which occurred the most often or least often
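The proportion-times-360° step can be sketched in a few lines of Python. The category counts below are hypothetical, invented only to show the arithmetic; `pie_angles` is an illustrative helper name, not something from the course materials.

```python
# Hypothetical category counts, made up only to illustrate the arithmetic.
counts = {"red": 12, "blue": 6, "green": 6}


def pie_angles(counts):
    """Return the central angle (in degrees) for each category."""
    total = sum(counts.values())
    # angle = proportion of the whole x 360 degrees
    return {category: count / total * 360 for category, count in counts.items()}


angles = pie_angles(counts)
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles should always add up to 360°, which is a quick check before picking up the protractor.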

Graphs for numerical data

Dotplot

Used with numerical data (either discrete or continuous)

Made by putting dots (or X’s) on a number line

Can make comparative dotplots by using the same axis for multiple groups

Distribution Activity . . .

Types (shapes) of Distributions

Symmetrical - refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle

– bell-shaped is a special type: it has a center mound with two sloping tails

Uniform - refers to data in which every class has equal or approximately equal frequency

Skewed (left or right) - refers to data in which one side (tail) is longer than the other side

– the direction of skewness is on the side of the longer tail

Bimodal (multi-modal) - refers to data in which two (or more) classes have the largest frequency & are separated by at least one other class

How to describe a numerical, univariate graph

What strikes you as the most distinctive difference among the distributions of exam scores in classes A, B, & C ?

1. Center - discuss where the middle of the data falls

– three types of central tendency: mean, median, & mode

What strikes you as the most distinctive difference among the distributions of scores in classes D, E, & F?

2. Spread - discuss how spread out the data is

– refers to the variability of the data

– range, standard deviation, IQR

What strikes you as the most distinctive difference among the distributions of exam scores in classes G, H, & I ?

3. Shape - refers to the overall shape of the distribution

– symmetrical, uniform, skewed, or bimodal

What strikes you as the most distinctive feature of the distribution of exam scores in class K?

4. Unusual occurrences

– outliers - values that lie away from the rest of the data

– gaps

– clusters

– anything else unusual

5. In context - You must write your answer in reference to the specifics in the problem, using correct statistical vocabulary and using complete sentences!
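The numbers behind "center" and "spread" can be computed with Python's standard `statistics` module. This is only a sketch: the exam scores are hypothetical, made up for illustration, and the standard deviation shown is the sample version (dividing by n - 1), which is an assumption about which formula is wanted.

```python
import statistics

# Hypothetical exam scores, made up purely for illustration.
scores = [70, 75, 80, 80, 85, 90, 95]

# Center: three measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Spread: variability of the data
data_range = max(scores) - min(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"range={data_range}, standard deviation={std_dev:.2f}")
```

The numbers alone are not a description; they still have to be reported in context, alongside shape and any unusual occurrences.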

More graphs for numerical data

Stemplots (stem & leaf plots)

Used with univariate, numerical data

Must have key so that we know how to read numbers

Can split stems when you have long list of leaves

Can have a comparative stemplot with two groups

Would a stemplot be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?

Would a stemplot be a good graph for the number of pairs of shoes owned by AP Stat students? Why or why not?

Example:

The following data are price per ounce for various brands of dandruff shampoo at a local grocery store.

0.32 0.21 0.29 0.54 0.17 0.28 0.36 0.23

Can you make a stemplot with this data?
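One possible way to build this stemplot, sketched in Python. It assumes the tenths digit is used as the stem and the hundredths digit as the leaf; the `stemplot` helper is an illustrative name, not part of any library.

```python
from collections import defaultdict

# Price per ounce for the shampoo brands listed above.
prices = [0.32, 0.21, 0.29, 0.54, 0.17, 0.28, 0.36, 0.23]


def stemplot(values):
    """Map each stem (tenths digit) to its sorted leaves (hundredths digit)."""
    stems = defaultdict(list)
    for value in values:
        hundredths = round(value * 100)  # 0.32 -> 32, sidesteps floating-point noise
        stem, leaf = divmod(hundredths, 10)
        stems[stem].append(leaf)
    # Include empty stems so that gaps in the data stay visible.
    return {s: sorted(stems[s]) for s in range(min(stems), max(stems) + 1)}


plot = stemplot(prices)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 1 | 7 means 0.17 per ounce")
```

Keeping the empty stem 4 in the plot is deliberate: it shows the gap between 0.36 and 0.54.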

Example: Tobacco use in G-rated Movies

Total tobacco exposure time (in seconds) for Disney movies:

223 176 548 37 158 51 299 37 11 165 74 9 2 6 23 206 9

Total tobacco exposure time (in seconds) for other studios’ movies:

205 162 6 1 117 5 91 155 24 55 17

Make a comparative stemplot.
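A back-to-back layout is one way to do this. The sketch below assumes hundreds digits as stems and tens digits as leaves, with the ones digit truncated; other choices of stem are equally defensible. `leaves_by_stem` is an illustrative helper name.

```python
disney = [223, 176, 548, 37, 158, 51, 299, 37, 11, 165, 74, 9, 2, 6, 23, 206, 9]
others = [205, 162, 6, 1, 117, 5, 91, 155, 24, 55, 17]


def leaves_by_stem(values):
    """Group tens digits (leaves) under hundreds digits (stems); ones digits are dropped."""
    groups = {}
    for value in values:
        stem, leaf = divmod(value // 10, 10)
        groups.setdefault(stem, []).append(leaf)
    return {stem: sorted(leaves) for stem, leaves in groups.items()}


left, right = leaves_by_stem(disney), leaves_by_stem(others)
top = max(max(left), max(right))
width = max(len(leaves) for leaves in left.values()) * 2

print(f"{'Disney':>{width}} |   | Other studios")
for stem in range(top + 1):
    # Left-hand leaves read outward from the stem, so print them in descending order.
    left_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
    right_leaves = " ".join(str(d) for d in right.get(stem, []))
    print(f"{left_leaves:>{width}} | {stem} | {right_leaves}")
print("Key: stem 1 with leaf 7 means 170 to 179 seconds")
```

Both groups share the same stems, which is what makes the shapes, centers, and the Disney outlier at stem 5 directly comparable.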

Histograms

Used with numerical data

Bars touch on histograms

Two types

– Discrete: bars are centered over discrete values

– Continuous: bars cover a class (interval) of values

For comparative histograms – use two separate graphs with the same scale on the horizontal axis
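For the continuous case, the bar heights come from counting how many values fall in each class. A minimal sketch using the other studios' tobacco exposure times from the earlier example; the class width of 50 seconds is an assumption chosen for illustration, and `class_counts` is a hypothetical helper name.

```python
from collections import Counter

# Tobacco exposure times (seconds) for other studios' movies, from the earlier example.
others = [205, 162, 6, 1, 117, 5, 91, 155, 24, 55, 17]

CLASS_WIDTH = 50  # assumed class width; not specified in the slides


def class_counts(values, width):
    """Count non-negative values in classes [low, low + width), keeping empty classes."""
    counts = Counter((value // width) * width for value in values)
    top = max(counts)
    return {(low, low + width): counts.get(low, 0)
            for low in range(0, top + width, width)}


table = class_counts(others, CLASS_WIDTH)
for (low, high), count in table.items():
    # Each class includes its lower bound and excludes its upper bound.
    print(f"{low:>3} to <{high:<3} | {'#' * count}")
```

Using the same `CLASS_WIDTH` for a second group keeps the horizontal scale identical, as comparative histograms require.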

Would a histogram be a good graph for the fastest speed driven by AP Stat students? Why or why not?

Would a histogram be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?

Cumulative Relative Frequency Plot (Ogive)

. . . is used to answer questions about percentiles.

A percentile is the percent of individuals that are at or below a certain value.

Quartiles are located at every 25% of the data. The first quartile (Q1) is the 25th percentile, while the third quartile (Q3) is the 75th percentile. What is the special name for Q2?

Interquartile Range (IQR) is the range of the middle half (50%) of the data.

IQR = Q3 – Q1
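A short Python sketch of these ideas, using the other studios' tobacco exposure times from the earlier example. It assumes the "median of each half" convention for quartiles; calculators and software may use slightly different rules and give slightly different values. `quartiles` and `percent_at_or_below` are illustrative helper names.

```python
from statistics import median

# Tobacco exposure times (seconds) for other studios' movies, from the earlier example.
others = [205, 162, 6, 1, 117, 5, 91, 155, 24, 55, 17]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (needs 2+ values)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    # With an odd number of values, the middle value belongs to neither half.
    lower, upper = ordered[:half], ordered[-half:]
    return median(lower), median(ordered), median(upper)


def percent_at_or_below(values, cutoff):
    """Percent of individuals at or below the cutoff (the percentile of that value)."""
    return 100 * sum(v <= cutoff for v in values) / len(values)


q1, q2, q3 = quartiles(others)
iqr = q3 - q1  # IQR = Q3 - Q1
print(f"Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}")
print(f"{percent_at_or_below(others, 55):.1f}% of movies are at or below 55 seconds")
```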
