statistics what is statistics? where are statistics used?
Statistics
What is statistics?
Where are statistics used?
What is Statistics?
Mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data
Summarizing information to aid understanding
Drawing conclusions from data
Estimating the present or predicting the future
Random event/process/variable: outcome cannot be predicted
Ex: sum of the numbers on two rolled dice
Ex: stock market
Ex: fluctuations in water pressure in a municipal water supply
http://www.scc.ms.unimelb.edu.au/whatisstatistics/
Uncertainty in Measurements
Where in engineering do we have to worry about uncertainty? Measurements.
Ask 50 people to measure the length of a football field with a meter stick…
Would each person get the same answer? Why or why not?
Field doesn’t change in size; therefore, errors are in the measurements
(misreading tick marks, didn’t measure in a straight line, etc.)
Uncertainty in Measurements
So, how long is the football field?
Estimate the length by taking the average.
How confident are we that the length is the average value?
Calculate the standard deviation about the average.
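As a rough sketch of the two steps above (average, then standard deviation about the average), the readings below are made-up numbers in meters, not data from the slides, and `estimate_length` is a hypothetical helper name. The spread uses the "average squared deviation" form (`statistics.pstdev`); dividing by n - 1 (`statistics.stdev`) is a common alternative for small samples.

```python
import statistics

# Hypothetical meter-stick results in meters (illustrative only, not from the slides).
measurements_m = [109.6, 109.9, 109.7, 110.2, 109.4, 109.8, 110.0, 109.7]

def estimate_length(readings):
    """Return (average, standard deviation about the average) of repeated readings."""
    average = statistics.mean(readings)
    # pstdev: square root of the average squared deviation from the mean.
    spread = statistics.pstdev(readings)
    return average, spread

average_m, spread_m = estimate_length(measurements_m)
print(f"estimated length: {average_m:.2f} m")
print(f"standard deviation: {spread_m:.2f} m")
```

A small standard deviation relative to the average suggests the individual measurements agree closely; a large one suggests the average is less trustworthy.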
Mean, Median, and Mode
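Mean: the sum of all items in a list divided by the number of items in the list
Median: the number separating the higher half of a set of numbers from the lower half
Ex: 1, 2, 4, 5, 5, 8, 9, 12, 15 (odd count: the median is the middle value, 5)
Ex: 1, 2, 4, 5, 5, 7, 8, 9, 12, 15 (even count: the median is the average of the two middle values, (5 + 7)/2 = 6)
Mode: the value that occurs the most frequently in a set of numbers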
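The two example lists on this slide can be checked with Python's standard `statistics` module. This is only a sketch; `summarize` is a hypothetical helper name.

```python
import statistics

# The two lists from the slide: one with an odd count, one with an even count.
odd_list = [1, 2, 4, 5, 5, 8, 9, 12, 15]
even_list = [1, 2, 4, 5, 5, 7, 8, 9, 12, 15]

def summarize(values):
    """Return the mean, median, and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),      # sum of items / number of items
        "median": statistics.median(values),  # middle value (or average of the two middle values)
        "mode": statistics.mode(values),      # most frequent value
    }

for name, values in (("odd_list", odd_list), ("even_list", even_list)):
    print(name, summarize(values))
```

Note that both lists share the same mode (5), while the median moves from 5 to 6 once the extra value 7 is added.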
When do you use…
Median
When you know that a distribution is skewed
When you believe that a distribution might be skewed
When you have a small number of objects
Why? To combat the effect of outliers.
Mode
When there are many numbers and the frequencies of the numbers progress smoothly
When you have non-numerical data (categorical data)
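A small sketch of why the median resists outliers and why the mode suits categorical data. The "recording error" (150 instead of 15) and the color list are invented for illustration.

```python
import statistics

values = [1, 2, 4, 5, 5, 8, 9, 12, 15]
# Hypothetical outlier: the last value is recorded as 150 instead of 15.
with_outlier = values[:-1] + [150]

for label, data in (("original", values), ("with outlier", with_outlier)):
    print(f"{label}: mean = {statistics.mean(data):.2f}, median = {statistics.median(data)}")

# The mode also works on non-numerical (categorical) data, where a mean is undefined.
colors = ["red", "blue", "red", "green"]  # hypothetical categories
print("most common color:", statistics.mode(colors))
```

The single outlier pulls the mean far upward while the median does not move, which is the reason for preferring the median on skewed or small data sets.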
Standard Deviation and Variance
Measures of how spread out a distribution is, i.e., measures of variability
Variance: average squared deviation of each number from its mean
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where $\mu$ is the mean and N is the number of values
Standard Deviation: square root of the variance
Most commonly used measure of spread
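The definitions above translate directly into code. This is a minimal sketch with hypothetical function names and an illustrative data list; it uses the "average squared deviation" (population) form stated on the slide, which matches `statistics.pvariance`.

```python
import math
import statistics

def population_variance(values):
    """Average squared deviation of each number from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def population_std(values):
    """Square root of the variance."""
    return math.sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative numbers, mean is 5
print("variance:", population_variance(data))
print("standard deviation:", population_std(data))
print("library check:", statistics.pvariance(data), statistics.pstdev(data))
```

The standard deviation is in the same units as the data, which is one reason it is used more often than the variance when describing spread.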
Definitions
Event/realization: rolling of a pair of dice, taking of a measurement, performing an experiment
Outcome: result of rolling the dice, taking the measurement, etc.
Deterministic event: event whose outcome can be predicted realization after realization
Ex: measured length of a table to the nearest cm
Probability: estimate of the likelihood that a random event will produce a certain outcome
Probability Distribution Functions
If a random event is repeated many times, it produces a distribution of possible outcomes, f(n)
The probability distribution function represents the distribution as the percentage of occurrences of each outcome
Consider rolling a die. Each side is equally likely to land face up:
(Uniform Distribution: f(1) = f(2) = f(3) = f(4) = f(5) = f(6) = 1/6)
[Figure: distribution function vs. outcome]
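One way to see the uniform distribution emerge is to repeat the random event many times and tabulate the fraction of occurrences of each outcome. The sketch below simulates a fair die; the roll count and seed are arbitrary choices, and `empirical_distribution` is a hypothetical helper name.

```python
import random
from collections import Counter

def empirical_distribution(num_rolls, seed=0):
    """Roll a fair six-sided die num_rolls times; return the fraction of rolls per face."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    return {face: counts[face] / num_rolls for face in range(1, 7)}

f = empirical_distribution(60_000)
for face, fraction in f.items():
    print(f"f({face}) = {fraction:.4f}  (uniform value: {1 / 6:.4f})")
```

With more rolls, each observed fraction should settle closer to 1/6.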
Probability Distribution Functions
Consider rolling a pair of dice: If outcome of this event is the sum of the dice, what does the distribution look like?
6 x 6 = 36 possible ways for the pair of dice to land.
11 possible outcomes (pairs can sum to 2 through 12).
For example,
n = 2: (1,1); therefore f(n) = 1/36.
n = 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1); therefore f(n) = 6/36.
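The full distribution for the sum of two dice can be obtained by enumerating all 36 equally likely pairs. A minimal sketch (the function name is hypothetical):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Exact f(n) for the sum n of two fair dice, by enumerating all 36 pairs."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {n: Fraction(ways, 36) for n, ways in sorted(counts.items())}

f = two_dice_distribution()
for n, p in f.items():
    # Fractions print in lowest terms, so 6/36 is shown as 1/6.
    print(f"f({n}) = {p}")
```

The result rises from f(2) = 1/36 to a peak at f(7) = 6/36 and falls symmetrically back to f(12) = 1/36, so unlike a single die, the sum is not uniformly distributed.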
Probability Distribution Functions
Most well-known: bell-shaped distribution
(More formally, Gaussian or normal distribution).
Normal Distribution
Continuous probability distribution
A family of distributions with the same general shape
Symmetric with scores more concentrated in the middle than in the tails
“Bell shaped”
Height is determined by the mean and the standard deviation
Probability Distribution Functions
[Figure: probability distribution function f(x)]
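The figure for this slide did not survive extraction. For reference, the standard density of a normal distribution with mean μ and standard deviation σ is f(x) = 1/(σ√(2π)) · exp(-(x - μ)² / (2σ²)). The sketch below evaluates it directly; it is the textbook formula, not necessarily the exact form shown on the original slide, and `normal_pdf` is a hypothetical name.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal (Gaussian) curve at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Standard normal curve: symmetric about 0, highest in the middle, low in the tails.
for x in (-2, -1, 0, 1, 2):
    print(f"x = {x:+d}: f(x) = {normal_pdf(x, 0, 1):.4f}")

# Cross-check against the standard library implementation.
print("library value at 0:", NormalDist(0, 1).pdf(0))
```

Changing mu slides the curve left or right; increasing sigma makes it wider and lower.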
Why are Normal Distributions Important?
1) Many psychological and educational variables are distributed approximately normally: reading ability, introversion, job satisfaction, etc.
2) It’s easy for mathematical statisticians to work with
3) If the mean and standard deviation of a normal distribution are known, it is easy to convert back and forth from raw scores to percentiles
Standard Normal Distribution
Normal distribution with a mean of 0 and a standard deviation of 1.
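Normal distributions can be transformed to standard normal distributions by:
$z = \frac{X - \mu}{\sigma}$, where X is the raw score, $\mu$ is the mean, and $\sigma$ is the standard deviation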
This determines how many standard deviations above or below the mean a particular score is.
Ex: Score of 70 on a test with a mean of 50 and standard deviation of 10. Z = 2.
(2 standard deviations above the mean)
http://www.oswego.edu/~srp/stats/z.htm
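The transformation to a standard score is a one-line calculation: subtract the mean and divide by the standard deviation. A minimal sketch using the slide's example (`z_score` is a hypothetical name):

```python
def z_score(score, mean, std_dev):
    """Number of standard deviations that score lies above (+) or below (-) the mean."""
    return (score - mean) / std_dev

# Example from the slide: score of 70, mean of 50, standard deviation of 10.
print(z_score(70, 50, 10))
```

A negative result means the score is below the mean; zero means it equals the mean.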
Percentiles
If the mean and standard deviation of a normal distribution are known, you can find out the percentile rank of a person obtaining a specific score
Ex: Introductory Psychology test normally distributed with a mean of 80 and a standard deviation of 5
What is the percentile rank of a person with a score of 70?
The proportion of the area below 70 is equal to the proportion of the scores below 70
z = (70-80)/5 = -2
Percentiles
What about a person scoring 75 on the test?
The proportion of the area below 75 is the same as the proportion of scores below 75.
z = (75-80)/5 = -1
Percentiles
What about a person scoring 90 on the test?
This graph shows that most people scored below 90.
z = (90 - 80)/5 = 2
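The z tables shown on the original slides were images and are not available here. As a stand-in, the area below a score can be computed with `statistics.NormalDist` from the Python standard library. This sketch assumes the test scores are exactly normal with mean 80 and standard deviation 5, as stated in the example; `percentile_rank` is a hypothetical helper name.

```python
from statistics import NormalDist

# Test scores from the example: normal with mean 80 and standard deviation 5.
test_scores = NormalDist(mu=80, sigma=5)

def percentile_rank(score, dist=test_scores):
    """Percentage of the distribution that lies below the given score."""
    return 100.0 * dist.cdf(score)

for score in (70, 75, 90):
    z = (score - test_scores.mean) / test_scores.stdev
    print(f"score {score}: z = {z:+.0f}, percentile rank = {percentile_rank(score):.1f}")
```

For these three scores the percentile ranks come out to roughly 2.3, 15.9, and 97.7, which is what a z table gives for z = -2, -1, and +2.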
Percentiles
The test has a mean of 80 and a standard deviation of 5. What score do you need to be in the 75th percentile?
Look at the z table to find the z associated with 0.75
The value of z is 0.674
This means that you need to be 0.674 standard deviations above the mean
Standard deviation is 5, so (5)(0.674) = 3.37
Need to be 3.37 points above the mean
80 + 3.37 = 83.37
Round to get a score of 83
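Going from a percentile back to a raw score is the reverse lookup: find the z for the given area, then scale by the standard deviation and add the mean. In this sketch, `NormalDist().inv_cdf` stands in for reading the z table; `score_for_percentile` is a hypothetical name and its defaults are the test's mean and standard deviation.

```python
from statistics import NormalDist

def score_for_percentile(p, mean=80, std_dev=5):
    """Raw score below which a fraction p of a normal distribution falls."""
    z = NormalDist().inv_cdf(p)  # z value whose area to the left is p
    return mean + std_dev * z

z_75 = NormalDist().inv_cdf(0.75)
score_75 = score_for_percentile(0.75)
print(f"z for the 75th percentile: {z_75:.3f}")
print(f"required score: {score_75:.2f}, rounded: {round(score_75)}")
```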
Sampling Distribution
Say you compute the mean of a sample of 10 numbers; the value you obtain will not equal the population (entire set) mean exactly
The sampling distribution of the mean is a theoretical distribution that is approached as the number of samples increases
For a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the mean has a mean of $\mu$ and a standard deviation (called the standard error of the mean) of:
$\sigma_M = \frac{\sigma}{\sqrt{n}}$
where n is the sample size
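The standard error of the mean can be checked by simulation: draw many samples of size n, compute each sample's mean, and look at the spread of those means. The sketch below assumes a normal population with mean 80 and standard deviation 5 (borrowed from the test example purely for illustration); the number of samples and the seed are arbitrary, and `simulate_sample_means` is a hypothetical name.

```python
import math
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population; return each sample's mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

means = simulate_sample_means(mu=80, sigma=5, n=10, num_samples=20_000)
predicted_se = 5 / math.sqrt(10)  # standard deviation / sqrt(sample size)

print(f"mean of the sample means: {statistics.fmean(means):.3f}")
print(f"spread of the sample means: {statistics.pstdev(means):.3f}")
print(f"predicted standard error: {predicted_se:.3f}")
```

The individual sample means differ from 80, but they cluster around it, and their spread should come out close to the predicted standard error. Larger n gives a smaller standard error.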