introduction to statistics 3
DESCRIPTION
Lecture discussing the measures of central tendency
TRANSCRIPT
Introduction to statistics-3
Measures of variability or spread
Summary Last Class
• Measures of central tendency
• Mean
• Median
• Mode
Mode
• The mode can be found with any scale of measurement; it is the only measure of typicality that can be used with a nominal scale.
Median
• The median can be used with ordinal, as well as interval/ratio, scales. It can even be used with scales that have open-ended categories at either end (e.g., 10 or more).
• It is not greatly affected by outliers, and it can be a good descriptive statistic for a strongly skewed distribution.
Mean
• The mean can only be used with interval or ratio scales. It is affected by every score in the distribution, and it can be strongly affected by outliers.
• It may not be a good descriptive statistic for a skewed distribution, but it plays an important role in advanced statistics.
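The contrast between the three measures can be seen on a small data set. The sketch below uses Python's standard `statistics` module; the scores are made up for illustration and are not from the lecture.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]   # the same scores plus one extreme value

mean_before = statistics.mean(scores)            # 3.4
mean_after = statistics.mean(with_outlier)       # 9.5
median_before = statistics.median(scores)        # 3
median_after = statistics.median(with_outlier)   # 3.5
mode_value = statistics.mode(scores)             # 3
```

One extreme score moves the mean from 3.4 to 9.5, while the median only moves from 3 to 3.5.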
Key Terms
• Variability
• Measures of Variability
• Variance
• Standard Deviation
• Normal Distribution
Variability
• The difference in data or in a set of scores
• Populations can be described as homogeneous or heterogeneous based on the level of variability in the data set
Measures of Variability
• Provide an estimate of how much scores in a distribution vary from an average score. The usual average is the mean.
• Range
• Variance
• Standard Deviation
Range
• A single number that represents the spread of the data
• Range = upper real limit of Xmax - lower real limit of Xmin
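A minimal sketch of the range calculation, assuming the scores are whole numbers so that each real limit lies half a unit beyond the observed score. The data and the `unit` variable are hypothetical.

```python
# Hypothetical whole-number scores; the unit of measurement is assumed to be 1.
scores = [3, 7, 8, 5, 12]
unit = 1

upper_real_limit = max(scores) + unit / 2   # 12.5
lower_real_limit = min(scores) - unit / 2   # 2.5
score_range = upper_real_limit - lower_real_limit   # 10.0
```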
Variance
• Variance is a measure of variability
• Standard deviation is the square root of variance
Steps to calculate SD
1. Determine the mean, x̄
2. Determine the deviations (X - x̄)
3. Square these: (X - x̄)²
4. Add the squares: ∑(X - x̄)²
5. Divide by the total number of scores less one: ∑(X - x̄)² / (n - 1)
6. The square root of the result is the standard deviation
Standard Deviation = √( ∑(X - x̄)² / (n - 1) )
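The six steps can be sketched in Python as follows. The function name `sample_sd` and the scores are illustrative assumptions, not part of the lecture.

```python
import math

def sample_sd(scores):
    """Sample standard deviation, following the six steps one at a time."""
    n = len(scores)
    mean = sum(scores) / n                      # step 1: the mean
    deviations = [x - mean for x in scores]     # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]      # step 3: squared deviations
    sum_of_squares = sum(squared)               # step 4: add the squares
    variance = sum_of_squares / (n - 1)         # step 5: divide by n - 1
    return math.sqrt(variance)                  # step 6: square root

# Hypothetical scores: mean is 5 and the sum of squares is 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sd = sample_sd(scores)   # square root of 32 / 7, about 2.14
```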
Variance and Standard Deviation
Population: Mean = µ; Variance σ² = ∑(X - µ)² / N; Standard Deviation σ = √σ²
Sample: Mean = x̄; Variance s² = ∑(X - x̄)² / (n - 1); Standard Deviation s = √s²
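The population and sample versions differ only in the divisor (N versus n - 1). A short sketch using Python's standard `statistics` module, on hypothetical scores:

```python
import statistics

# Hypothetical scores: mean is 5 and the sum of squared deviations is 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

pop_variance = statistics.pvariance(scores)     # divides by N: 32 / 8
pop_sd = statistics.pstdev(scores)              # square root of the above
sample_variance = statistics.variance(scores)   # divides by n - 1: 32 / 7
sample_sd = statistics.stdev(scores)            # square root of the above
```

Because the sample formula divides by a smaller number, the sample values are always somewhat larger than the population values for the same scores.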
Worked Example (appspot.com)
Step 1- Calculate mean
Step 2 Calculate Deviation
Step 3 Sum Mean Square Deviation
Step 4 Calculate standard deviation
Range
• The range tells you the largest difference that you have among your scores. It is strongly affected by outliers, and being based on only two scores, it can be very unreliable.
Mean Deviation
• The mean deviation, and the two measures that follow, can only be used with interval/ratio scales.
• It is a good descriptive measure, which is less affected by outliers than the standard deviation, but it is not used in advanced statistics.
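As an illustration of the claim about outliers, the sketch below compares how much the mean deviation and the standard deviation grow when one extreme score is added. The helper `mean_deviation` and the data are assumptions made for this example.

```python
import statistics

def mean_deviation(scores):
    """Average absolute distance of the scores from their mean."""
    mean = statistics.mean(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)

# Hypothetical scores, then the same scores with one outlier added.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = scores + [50]

# How many times larger each measure becomes once the outlier is included.
md_ratio = mean_deviation(with_outlier) / mean_deviation(scores)
sd_ratio = statistics.stdev(with_outlier) / statistics.stdev(scores)
```

For these scores the standard deviation grows by a larger factor than the mean deviation, because squaring gives extreme deviations extra weight.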
Variance
• The variance is not appropriate for descriptive purposes, but it plays an important role in advanced statistics.
Standard Deviation
• The standard deviation is a good descriptive measure of variability, although it can be affected strongly by outliers. It plays an important role in advanced statistics.
Scaled Scores
• A key element of statistics is making comparisons between variables and populations.
Changing the mean and standard deviation of a distribution
• If a constant is added to each score in a distribution, the mean of the distribution changes but the variance and standard deviation do not.
• Adding a constant to each score changes the sum of all the scores but not the spread or shape of the distribution.
Adding or Subtracting a constant
• When you add or subtract a constant from each score in a distribution, the mean changes by the amount added or subtracted but the standard deviation and variance remain the same.
x̄ new = x̄ original +/- constant
s new = s original
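A quick check of the rule for adding a constant, using hypothetical scores and Python's standard `statistics` module:

```python
import statistics

# Hypothetical scores and an arbitrary constant.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
constant = 10
shifted = [x + constant for x in scores]

# The mean moves by exactly the constant.
mean_change = statistics.mean(shifted) - statistics.mean(scores)

# The standard deviation is unchanged.
sd_original = statistics.stdev(scores)
sd_shifted = statistics.stdev(shifted)
```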
Multiplying or dividing by a constant
x̄ new = x̄ original x or / the constant
s new = s original x or / the constant
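The same kind of check for multiplication, again on hypothetical scores. The sketch assumes a positive constant; with a negative constant the standard deviation is multiplied by its absolute value, since a standard deviation cannot be negative.

```python
import statistics

# Hypothetical scores and an arbitrary positive constant.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
constant = 3
scaled = [x * constant for x in scores]

# Both the mean and the standard deviation are multiplied by the constant.
mean_ratio = statistics.mean(scaled) / statistics.mean(scores)
sd_ratio = statistics.stdev(scaled) / statistics.stdev(scores)
```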
Z scores
• The z score provides the exact position of a score in its distribution.
• This allows us to compare scores from different distributions
Z-score distribution
• The z score distribution has a mean of 0 and a standard deviation of 1
Z-Score formula
Z score for sample
z = (Xi - x̄) / s
Example Z Score
• For scores above the mean, the z score has a positive sign. Example + 1.5z.
• Below the mean, the z score has a minus sign. Example - 0.5z.
• Calculate Z score for blood pressure of 140 if the sample mean is 110 and the standard deviation is 10
• Z = (140 – 110) / 10 = 3
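The blood pressure example can be written as a small function. The name `z_score` is an illustrative choice, not something defined in the lecture.

```python
def z_score(x, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (x - mean) / sd

# Blood pressure of 140, sample mean 110, standard deviation 10.
z = z_score(140, mean=110, sd=10)   # 3.0
```

Subtracting the mean must happen before the division, which is why the numerator is in parentheses.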
Comparing Scores from Different Distributions
• Interpreting a raw score requires additional information about the entire distribution. In most situations, we need some idea about the mean score and an indication of how much the scores vary.
• For example, assume that an individual took two tests in reading and mathematics. The reading score was 32 and mathematics was 48. Is it correct to say that performance in mathematics was better than in reading?
Z Scores Help in Comparisons
• Not without additional information. One method to interpret the raw score is to transform it to a z score.
• The advantage of the z score transformation is that it takes into account both the mean value and the variability in a set of raw scores.
[Figure: distributions of grades in Statistics and Calculus; x-axis: Grade, 0 to 100]
Dave in Statistics:
(50 - 40)/10 = 1
(one SD above the mean)
Dave in Calculus
(50 - 60)/10 = -1
(one SD below the mean)
Statistics Calculus
Mean Statistics = 40
Mean Calculus = 60
Example 1
An example where the means are identical, but the two sets of scores have different spreads
Dave’s Stats Z-score
(50-40)/5 = 2
Dave’s Calc Z-score
(50-40)/20 = 0.5
[Figure: distributions of grades in Calculus and Statistics; x-axis: Grade, 0 to 100]
Example 2
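Both examples use the same raw score of 50, with the means and standard deviations given above. A minimal sketch that reproduces the four z scores (the function name `z_score` is an assumption for illustration):

```python
def z_score(x, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (x - mean) / sd

# Example 1: different means, same standard deviation.
stats_z_1 = z_score(50, mean=40, sd=10)   # 1.0
calc_z_1 = z_score(50, mean=60, sd=10)    # -1.0

# Example 2: same mean, different standard deviations.
stats_z_2 = z_score(50, mean=40, sd=5)    # 2.0
calc_z_2 = z_score(50, mean=40, sd=20)    # 0.5
```

The same raw score of 50 lands in four different relative positions, depending on the mean and spread of the distribution it comes from.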
Three Properties of Standard Scores
• 1. The mean of a set of z-scores is always zero
Properties of Standard Scores
• Why?
• The mean has been subtracted from each score. Therefore, following the definition of the mean as a balancing point, the sum (and, accordingly, the average) of all the deviation scores must be zero.
Three Properties of Standard Scores
• 2. The SD of a set of standardized scores is always 1
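The first two properties can be checked numerically. The sketch below standardizes a hypothetical set of scores with the sample mean and sample standard deviation.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Convert every score to a z score.
z_scores = [(x - mean) / sd for x in scores]

z_mean = statistics.mean(z_scores)   # 0, up to floating-point rounding
z_sd = statistics.stdev(z_scores)    # 1, up to floating-point rounding
```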
Three Properties of Standard Scores
• 3. The distribution of a set of standardized scores has the same shape as the unstandardized scores. Beware of the “normalization” misinterpretation: standardizing does not make a distribution normal.
The shape is the same (but the scaling or metric is different)
[Figure: two panels, UNSTANDARDIZED and STANDARDIZED, showing the same distribution on different scales]
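One way to see that the shape is preserved is to compute a skewness index before and after standardizing. The `skewness` helper below (the mean cubed z score) and the data are assumptions made for this sketch.

```python
import statistics

def skewness(values):
    """Mean cubed z score (population form): a simple index of asymmetry."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical, positively skewed scores.
scores = [1, 1, 2, 2, 3, 4, 10]
m = statistics.mean(scores)
s = statistics.stdev(scores)
z_scores = [(x - m) / s for x in scores]

skew_raw = skewness(scores)
skew_z = skewness(z_scores)   # equal to skew_raw: the z scores are still skewed
```

A skewed set of raw scores stays exactly as skewed after conversion to z scores; only the scale changes.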
Two Advantages of Standard Scores
1. We can use standard scores to find centile scores: the proportion of people with scores less than or equal to a particular score. Centile scores are intuitive ways of summarizing a person’s location in a larger set of scores.
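[Figure: normal curve; x-axis: Score in z units, -4 to 4. Approximate areas: 34% between the mean and 1 SD on each side, 14% between 1 and 2 SD on each side, 2% beyond 2 SD on each side, 50% on each side of the mean]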
The area under a normal curve
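If the scores are assumed to be normally distributed, a z score can be converted to a centile with the cumulative distribution function. The sketch below uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8).

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores at or below each z score.
centile_z0 = standard_normal.cdf(0)   # 0.5
centile_z1 = standard_normal.cdf(1)   # about 0.84
centile_z2 = standard_normal.cdf(2)   # about 0.98

# Area between the mean and one SD above it.
area_0_to_1 = centile_z1 - centile_z0   # about 0.34
```

These values only apply when the normality assumption is reasonable; standardizing does not by itself make the scores normal.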
Two Advantages of Standard Scores
2. Standard scores provide a way to standardize or equate different metrics. We can now interpret Dave’s scores in Statistics and Calculus on the same metric (the z-score metric). (Each score comes from a distribution with the same mean [zero] and the same standard deviation [1].)
Two Disadvantages of Standard Scores
1. Because a person’s score is expressed relative to the group (X - M), the same person can have different z-scores when assessed in different samples
Example: If Dave had taken his Calculus exam in a class in which everyone knew math well, his score would be well below the mean and his z-score would be negative. If the class didn’t know math very well, however, the same score would be above the mean and his z-score would be positive. Dave’s z-score depends on everyone else’s scores.
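To make this concrete, the sketch below places the same raw score in two invented classes. All scores and the helper `z_in_group` are hypothetical.

```python
import statistics

def z_in_group(score, group):
    """z score of `score` relative to the mean and SD of `group`."""
    return (score - statistics.mean(group)) / statistics.stdev(group)

dave = 50
# Hypothetical classes; each list includes Dave's score of 50.
strong_class = [50, 70, 75, 80, 85]
weak_class = [20, 25, 30, 35, 50]

z_strong = z_in_group(dave, strong_class)   # negative: below the class mean
z_weak = z_in_group(dave, weak_class)       # positive: above the class mean
```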
Two Disadvantages of Standard Scores
2. If the absolute score is meaningful or of psychological interest, it will be obscured by transforming it to a relative metric.
Properties of the Mean, Standard Deviation, and Standardized Scores
Mean. Adding or subtracting a constant from the scores changes the mean in the same way. Multiplying or dividing by a constant also changes the mean in the same way. The sum of squared deviations is smaller around the mean than any other point in the distribution.
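The last property of the mean can be checked by computing the sum of squared deviations around several candidate points. The scores and the candidate points are hypothetical.

```python
# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(scores) / len(scores)

def sum_of_squares(center):
    """Sum of squared deviations of the scores around `center`."""
    return sum((x - center) ** 2 for x in scores)

ss_at_mean = sum_of_squares(mean)   # 32.0

# Any other point gives a larger sum of squares.
ss_elsewhere = {c: sum_of_squares(c) for c in (3, 4, 4.5, 5.5, 6, 7)}
```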
Standard Deviation
• Standard deviation. Adding or subtracting a constant from the scores does not change the standard deviation. However, multiplying or dividing by a constant means that the standard deviation will be multiplied or divided by the same constant. The standard deviation is smaller when calculated around the mean than any other point in the distribution.
Standardized Scores
• Standardized scores. Adding or subtracting a constant, or multiplying or dividing the scores by a positive constant, does not change the standardized scores. (Multiplying or dividing by a negative constant reverses their signs.) The mean of a set of z scores is zero, and the standard deviation is 1.0.
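A short check of this invariance, assuming a positive multiplier. The helper `standardize`, the scores, and the transformation are chosen for illustration only.

```python
import statistics

def standardize(values):
    """Convert values to z scores using the sample mean and sample SD."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical scores, then the same scores multiplied by 3 with 10 added.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
transformed = [3 * x + 10 for x in scores]

z_original = standardize(scores)
z_transformed = standardize(transformed)

# Largest gap between matching z scores: zero up to floating-point rounding.
max_difference = max(abs(a - b) for a, b in zip(z_original, z_transformed))
```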