Statistical Measures
Mrs. Watkins
AP Statistics
Chapters 5, 6
MEASURES OF CENTER
Mean: arithmetic average of all data values
population mean: μ
sample mean: x̄
Formula: x̄ = Σx / n
Mode: the most common value in a data set
Median: the middle value in a data set
Midrange: average of the extremes
Trimmed Mean: the mean of a data set after a certain percentage of data values has been trimmed off the ends of the distribution
Ex:
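The measures of center above can be sketched in Python. This is only an illustration: the data set is hypothetical (it is not from the slides), `midrange` and `trimmed_mean` are helper names made up here, and the trimmed mean assumes the stated percentage is cut from each end.

```python
from statistics import mean, median, multimode


def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2


def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the values from EACH end (assumption)."""
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)  # number of values cut from each end
    return mean(ordered[k:len(ordered) - k])


# Hypothetical data set with one very high value.
data = [2, 3, 3, 4, 5, 6, 7, 8, 9, 43]

print("mean:", mean(data))
print("median:", median(data))
print("mode(s):", multimode(data))
print("midrange:", midrange(data))
print("10% trimmed mean:", trimmed_mean(data, 10))
```

Notice how far the mean and midrange sit from the median once the value 43 is included, while the trimmed mean stays close to it.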
5 number summary
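5 important numbers in data set:
Min: smallest value
Q1: first quartile (25th percentile)
Med: median (50th percentile)
Q3: third quartile (75th percentile)
Max: largest value
Q1, Med, and Q3 may not be actual data values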
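A minimal sketch of a 5 number summary in Python. It assumes the common classroom convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving the middle value out of both halves when the count is odd. Other software may use slightly different quartile rules. The function name and data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (Min, Q1, Med, Q3, Max) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[:n // 2]            # values below the median position
    upper = ordered[n // 2 + n % 2:]    # values above the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


print(five_number_summary([2, 3, 3, 4, 5, 6, 7, 8, 9, 43]))
print(five_number_summary([1, 2, 3, 4, 5]))
```

In the second call Q1 and Q3 fall between data values, which is why they may not be actual values from the data set.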
BOXPLOT
graphical display of data using 5 number summary (if outliers shown, called “modified box plot”)
OUTLIERS
Outliers: data values that lie far from the rest of the data
IQR Test for Outliers:
(IQR)(1.5) = multiplier M
Q1 − M = outlier lower bound
Q3 + M = outlier upper bound
If values fall outside these bounds, they are outliers
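The IQR test can be sketched as follows. The quartiles are passed in directly so the example does not depend on any particular quartile rule. The function names and the data are hypothetical.

```python
def iqr_outlier_bounds(q1, q3):
    """Return (lower bound, upper bound) for the 1.5 * IQR outlier test."""
    multiplier = 1.5 * (q3 - q1)   # M = (IQR)(1.5)
    return q1 - multiplier, q3 + multiplier


def find_outliers(values, q1, q3):
    """Values below the lower bound or above the upper bound."""
    lower, upper = iqr_outlier_bounds(q1, q3)
    return [x for x in values if x < lower or x > upper]


# Hypothetical data with Q1 = 3 and Q3 = 8.
data = [2, 3, 3, 4, 5, 6, 7, 8, 9, 43]
print(iqr_outlier_bounds(3, 8))
print(find_outliers(data, 3, 8))
```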
RESISTANCE
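Resistant Measures: not strongly affected by extreme values
Non-resistant Measures: strongly affected by extreme values
Mean, Midrange: non-resistant
Median, IQR, Trimmed Mean: resistant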
MEASURES OF SPREAD
Range: the difference between the highest and lowest values (Max − Min)
Resistant?
IQR (Interquartile Range): Q3 − Q1
Resistant?
STANDARD DEVIATION
a measure of the average amount of deviation from the mean among the data values
Population St. Deviation: σ = √( Σ(x − μ)² / N )
Sample St. Deviation: s_x = √( Σ(x − x̄)² / (n − 1) )
We generally use s_x because we usually do not have the entire population.
VARIANCE
the square of the standard deviation; what you get before taking the square root
Population Variance: σ² = Σ(x − μ)² / N
Sample Variance: s_x² = Σ(x − x̄)² / (n − 1)
This measure is not used much in elementary statistics, but you need to know what it is.
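As a rough check of the population and sample versions, Python's standard `statistics` module provides both. The data set below is hypothetical.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("population variance:", pvariance(data))  # divides by N
print("population st. dev.:", pstdev(data))
print("sample variance:", variance(data))       # divides by n - 1
print("sample st. dev.:", stdev(data))
```

The sample versions come out slightly larger because they divide by n − 1 instead of N.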
Coefficient of Variation
a measure of how large a st. deviation is relative to the mean (st. deviation ÷ mean)
Ex: St. deviation of IQ = 15, Mean 100
St. deviation of height = 3 in, Mean 69
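Assuming the usual definition (standard deviation divided by the mean), the two examples can be compared like this. The function name is made up for illustration.

```python
def coefficient_of_variation(st_dev, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return st_dev / mean


iq_cv = coefficient_of_variation(15, 100)     # IQ: st. dev. 15, mean 100
height_cv = coefficient_of_variation(3, 69)   # height: st. dev. 3 in, mean 69 in

print(f"IQ: {iq_cv:.3f}")
print(f"Height: {height_cv:.3f}")
```

Relative to its mean, IQ varies more than height, even though the two standard deviations are in different units and cannot be compared directly.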
“Comment on the distribution”
You now have numbers to support your statements, rather than just graphs.
SHAPE:
OUTLIERS:
CENTER:
SPREAD: how widely does the data vary?
Unusual Features: gaps, clusters
SHAPE
If the mean > median, then the data distribution is skewed ________. The mean is in the tail.
If the mean < median, then the data distribution is skewed ________. The mean is in the tail.
If the mean ≈ median, then the data distribution is approximately ____________.
SHAPE
Symmetric if mean = median
SKEWNESS
Skewed left if mean < median
Skewed right if mean > median
[Figure panels: Left, Right]
Mean is in the tail of the data
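The mean versus median rule can be sketched in code. The slides do not say how close counts as "approximately equal", so the `tolerance` below (a fraction of the standard deviation) is purely an assumption for illustration, as are the function name and data sets.

```python
from statistics import mean, median, pstdev


def describe_shape(values, tolerance=0.1):
    """Compare mean and median; `tolerance` is a hypothetical cutoff."""
    m, med = mean(values), median(values)
    spread = pstdev(values)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "approximately symmetric"
    # The mean is pulled toward the tail.
    return "skewed right" if m > med else "skewed left"


print(describe_shape([2, 3, 3, 4, 5, 6, 7, 8, 9, 43]))
print(describe_shape([1, 35, 36, 37, 38, 39, 40]))
print(describe_shape([1, 2, 3, 4, 5]))
```

A graph should still be checked, since comparing two numbers cannot reveal gaps, clusters, or multiple peaks.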
OTHER SHAPES
Uniform distribution: all values relatively evenly distributed across interval
Bimodal distribution: two peaks
TRANSFORMATIONS TO DATA
What would happen to the statistical measures if each data value had a constant added to or subtracted from it?
Mean:
Standard Deviation:
Median:
IQR:
What would happen to the statistical measures if each data value was multiplied or divided by a constant?
Mean:
Standard Deviation:
Median:
IQR:
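One way to explore both questions is to transform a small data set and compare the measures. This is a sketch with hypothetical data. The `iqr` helper assumes the median-of-halves quartile convention.

```python
from statistics import mean, median, stdev


def iqr(values):
    """Q3 - Q1 using the median-of-halves convention (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    return median(ordered[n // 2 + n % 2:]) - median(ordered[:n // 2])


def summary(values):
    return {"mean": mean(values), "st_dev": stdev(values),
            "median": median(values), "iqr": iqr(values)}


data = [2, 4, 6, 8, 10, 12, 14, 16]   # hypothetical values
shifted = [x + 10 for x in data]      # add a constant to every value
scaled = [x * 3 for x in data]        # multiply every value by a constant

for label, values in [("original", data), ("plus 10", shifted), ("times 3", scaled)]:
    print(label, summary(values))
```

In this example, adding 10 moves the mean and median by 10 and leaves the standard deviation and IQR unchanged, while multiplying by 3 multiplies all four measures by 3.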
TRANSFORMATIONS TO DATA SET
What would happen to the statistical measures if one very low or very high data value was added to the set?
Mean:
Standard Deviation:
Median:
IQR:
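The effect of a single extreme value can be checked the same way. This sketch uses hypothetical data and a hypothetical `iqr` helper based on the median-of-halves quartile convention.

```python
from statistics import mean, median, stdev


def iqr(values):
    """Q3 - Q1 using the median-of-halves convention (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    return median(ordered[n // 2 + n % 2:]) - median(ordered[:n // 2])


data = [2, 4, 6, 8, 10, 12, 14, 16]   # hypothetical values
with_extreme = data + [100]           # one very high value added

for name, measure in [("mean", mean), ("st. dev.", stdev),
                      ("median", median), ("IQR", iqr)]:
    print(f"{name}: {measure(data):.2f} -> {measure(with_extreme):.2f}")
```

Here the mean and standard deviation change a great deal, while the median and IQR barely move, which is what resistance means.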
MEASURES OF POSITION
Give a numerical approximation of where a single data value stands compared to the whole distribution
Quartiles:
Percentiles:
Z Scores:
Z SCORES
standardized score: how a single value compares to the entire data set in terms of position in the distribution
z = (data value − mean) / (st. deviation)
How unusual are you?
Compute your z score for height.
Compute your z score for Math SAT.
Compute your z score for IQ.
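A z score calculation might look like the sketch below. The means and standard deviations for IQ (100, 15) and height (69 in, 3 in) come from the earlier coefficient of variation examples. The individual scores are hypothetical.

```python
def z_score(x, mean, st_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / st_dev


print(z_score(130, 100, 15))  # hypothetical IQ of 130
print(z_score(66, 69, 3))     # hypothetical height of 66 in
```

A positive z score means the value is above the mean, and a negative z score means it is below.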
NORMAL MODEL
shows how data is distributed symmetrically along an interval according to empirical rule
Empirical Rule:
about 68% of data within 1 st. deviation of μ
about 95% of data within 2 st. deviations of μ
about 99.7% of data within 3 st. deviations of μ
ANOTHER OUTLIER TEST
Using Empirical Rule:
Data values more than 2 st. deviations away from the mean (z > +2 or z < −2) are mild outliers
Data values more than 3 st. deviations away from the mean (z > +3 or z < −3) are extreme outliers
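This test can be sketched as a small function. It assumes the rule applies in both directions from the mean. The function name is hypothetical, and the IQ figures (mean 100, st. deviation 15) are reused from the earlier example.

```python
def classify_by_z(x, mean, st_dev):
    """Label a value using the 2 and 3 standard deviation cutoffs."""
    distance = abs((x - mean) / st_dev)   # size of the z score, ignoring sign
    if distance > 3:
        return "extreme outlier"
    if distance > 2:
        return "mild outlier"
    return "not an outlier"


for score in (110, 135, 150, 60):
    print(score, classify_by_z(score, 100, 15))
```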
NORMAL CURVE
a theoretical ideal about how traits/characteristics are distributed
Many human traits are approximately normally distributed such as height, body temp, IQ, pulse
Avoid using “normal” when describing data; say “approximately normal” or “symmetric” unless the data are clearly mound-shaped and bell-shaped
NORMAL CURVE
Normal curve: symmetric, mound-shaped
Area under curve = 1
A z score can be used to establish what % of the curve is less or more than the z score, and establish the probability of a data value being in that position.
FINDING PERCENTILE/PROBABILITY USING NORMAL CURVE
1. Calculate z score for data value
2. Use calculator: normalcdf under DISTR key
Looking for area > z score: normalcdf(z, ∞)
Looking for area < z score: normalcdf(−∞, z)
Looking for area between z scores: normalcdf(z1, z2)
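For readers without the calculator, the same three areas can be found with `NormalDist` from Python's standard `statistics` module. This is a sketch, not a replacement for the calculator steps. The helper names are made up here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, st. deviation 1


def area_below(z):
    """Like normalcdf(-infinity, z)."""
    return standard_normal.cdf(z)


def area_above(z):
    """Like normalcdf(z, infinity)."""
    return 1 - standard_normal.cdf(z)


def area_between(z1, z2):
    """Like normalcdf(z1, z2)."""
    return standard_normal.cdf(z2) - standard_normal.cdf(z1)


print(round(area_below(1), 4))
print(round(area_above(1), 4))
print(round(area_between(-1, 1), 4))
```

The area between z = −1 and z = 1 is roughly 68%, in line with the empirical rule.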
FINDING CUT OFF SCORES
Use when you are given a percentile or probability and need to determine the “cut off score”:
1. Sketch curve to determine where z score is located.
2. Determine if you want area above or below this percentile.
3. Use invNorm on calculator: invNorm(area to the left of cut off) = z score
4. Use z score formula to solve for x.
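The steps above can be sketched with `NormalDist.inv_cdf`, which plays the role of invNorm. The example reuses the IQ figures from earlier (mean 100, st. deviation 15). The function name and the 90th percentile target are hypothetical.

```python
from statistics import NormalDist


def cutoff_score(area_to_left, mean, st_dev):
    """Find the data value with the given area (as a decimal) below it."""
    z = NormalDist().inv_cdf(area_to_left)   # step 3: z score for that area
    return mean + z * st_dev                 # step 4: solve z = (x - mean) / st_dev for x


# IQ score at the 90th percentile, assuming IQ is approximately normal.
print(round(cutoff_score(0.90, 100, 15), 2))
```

If the area given is above the cut off, subtract it from 1 first, since `inv_cdf` expects the area to the left.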
Does the data fit a normal model?
1. Check mean and median
2. Make a NORMAL PROBABILITY PLOT: data are approximately normal if the plot is roughly a straight line
3. Make a BOXPLOT on calculator.
AVOID using histograms on calculator to check.