measures of a distribution’s central tendency, spread, and shape chapter 3 sharon lawner weinberg...


Page 1: Measures of a Distribution’s Central Tendency, Spread, and Shape Chapter 3 SHARON LAWNER WEINBERG SARAH KNAPP ABRAMOWITZ StatisticsSPSS An Integrative

Measures of a Distribution’s Central Tendency, Spread, and Shape
Chapter 3

SHARON LAWNER WEINBERG, SARAH KNAPP ABRAMOWITZ

Statistics Using SPSS: An Integrative Approach

SECOND EDITION

Page 2:

Overview

• Measures of Central Tendency (Level)
  • Mode
  • Median
  • Mean

• Measures of Dispersion (Spread)
  • Range
  • Interquartile Range
  • Variance
  • Standard Deviation

• Measure of Shape
  • Skewness and Skewness Ratio

Page 3:

Measures of Central Tendency: Mode

Definition: The mode is the score that occurs most often. Useful when data are nominal or ordinal with only a limited number of categories.

To find the mode, click Analyze on the main menu bar, then Descriptive Statistics, and then Frequencies.

Click on Options, and check the box next to Mode. Click OK.

Page 4:

Measures of Central Tendency: Mode

Example: Home Language Background (HOMELANG)

What is this variable’s mode?

Home Language Background

                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Non-English Only             15       3.0             3.0                  3.0
        Non-English Dominant         35       7.0             7.0                 10.0
        English Dominant             47       9.4             9.4                 19.4
        English Only                403      80.6            80.6                100.0
        Total                       500     100.0           100.0

Mode = English Only
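Outside SPSS, the same answer can be read off the frequency table with a few lines of Python. This is only a sketch: the counts are copied from the table above, while the variable names (`homelang_counts`, `mode_category`) are made up for the illustration.

```python
from collections import Counter

# Frequencies copied from the Home Language Background table.
homelang_counts = Counter({
    "Non-English Only": 15,
    "Non-English Dominant": 35,
    "English Dominant": 47,
    "English Only": 403,
})

# most_common(1) returns a one-item list holding the (category, count)
# pair with the highest count, i.e. the mode and its frequency.
mode_category, mode_count = homelang_counts.most_common(1)[0]
total = sum(homelang_counts.values())

print(mode_category)                       # English Only
print(round(100 * mode_count / total, 1))  # 80.6
```

Because the mode only requires counting, it works for nominal categories such as these, where a mean or median would have no meaning.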

Page 5:

Measures of Central Tendency: Mode

Example: Although the mode is technically the South, the North Central is close enough that the distribution may be considered bimodal.

Page 6:

Measures of Central Tendency: Mode

• Definition: A bimodal distribution is one with two modes, usually at some distance apart from each other.

• Definition: A uniform distribution is one in which all values occur with the same frequency.

Page 7:

Measures of Central Tendency: Median

Definition: The median is the middle point in a distribution. Useful when data are ordinal or scale and severely skewed.

To find the median, click Analyze on the main menu bar, then Descriptive Statistics, and then Explore. Click OK.

Or, to find the median, click Analyze on the main menu bar, then Descriptive Statistics, and then Frequencies.

Click on Options, and check the box next to Median. Click OK.

Page 8:

Measures of Central Tendency: Mean

Definition: The mean is the sum of all of the data points divided by the number of data points. Useful when data are scale and not severely skewed.

To find the mean, click Analyze on the main menu bar, then Descriptive Statistics, and then Explore. Click OK.

OR use Frequencies OR use Descriptives.

Page 9:

Measures of Central Tendency: Mean

In the case where the variable is dichotomous and coded as 0 and 1, the mean is interpreted as the proportion of 1’s in the distribution.

• Example: Gender

Statistics

Gender
N        Valid      500
         Missing      0
Mean                .55
Median             1.00
Mode                  1

Gender

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male            227      45.4            45.4                 45.4
        Female          273      54.6            54.6                100.0
        Total           500     100.0           100.0
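The link between the mean of a 0/1 variable and a proportion can be checked directly from the frequency table. The sketch below assumes Male is coded 0 and Female is coded 1; the slide does not show the coding, but this choice is the one consistent with the reported mean of .55 and mode of 1.

```python
from statistics import mean, median, mode

# Rebuild the 500 Gender scores from the frequency table.
# Assumed coding (not shown on the slide): Male = 0, Female = 1.
gender = [0] * 227 + [1] * 273

proportion_of_ones = 273 / 500

print(round(mean(gender), 3))   # 0.546, which SPSS displays as .55
print(proportion_of_ones)       # 0.546, the proportion of 1's
print(median(gender))           # 1.0
print(mode(gender))             # 1
```

The mean and the proportion of 1’s agree because summing a 0/1 variable simply counts the 1’s.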

Page 10:

Measures of Central Tendency: Comparing the Mean, Median, and Mode

Compare the values of the mode, median, and mean for SES, EXPINC30, and SCHATTRT.

Statistics

                    Socio-Economic   Expected income   School Average Daily
                    Status           at age 30         Attendance Rate
N        Valid         500              459               417
         Missing         0               41                83
Mean                 18.43         51574.73             93.65
Median               19.00         40000.00             95.00
Mode                    19            50000                95

Page 11:

Measures of Dispersion Visually

When traveling to these two cities, would the same clothing be suitable for both cities at any time during the year from the point of view of warmth?

Page 12:

Measures of Dispersion

How can we quantify the obvious difference in temperature variability across the year between these two cities?

• One Answer: By using the range or interquartile range (IQR).
• Another Answer: By using the variance or standard deviation.

Page 13:

The Range and Interquartile Range

Definition: The range is the difference between the highest and lowest values in the distribution. The interquartile range (IQR) is the range of the middle half of the data, or the difference between the 75th and 25th percentiles. Useful when data are ordinal or scale and severely skewed.

To find the IQR and range, click Analyze on the main menu bar, then Descriptive Statistics, and then Explore. Click OK.
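As a rough illustration of the two definitions, the following Python sketch computes a range and an IQR. The seven temperatures are hypothetical, chosen only to keep the arithmetic easy to follow; they are not the data behind the slides. Statistical packages use slightly different conventions for locating percentiles, so an IQR computed this way may differ a little from the value another program reports for the same data.

```python
from statistics import quantiles

# Hypothetical temperatures (illustration only).
temps = [30, 35, 45, 55, 65, 70, 75]

# Range: highest value minus lowest value.
value_range = max(temps) - min(temps)

# quantiles(..., n=4) returns the three quartile cut points [Q1, Q2, Q3].
q1, q2, q3 = quantiles(temps, n=4)

# IQR: 75th percentile minus 25th percentile.
iqr = q3 - q1

print(value_range)   # 45
print(q1, q3, iqr)   # 35.0 70.0 35.0
```

The range depends only on the two most extreme scores, whereas the IQR ignores the top and bottom quarters of the data, which is why it is the less outlier-sensitive of the two.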

Page 14:

The Variance and Standard Deviation

Definition: The variance is the average of the squared deviations from the mean. The standard deviation is the square root of the variance. We may think of the standard deviation as the distance we have to travel in both directions from the mean to capture the majority of values in a distribution. The farther out we need to travel, the more spread out are the values of the distribution from the mean. Useful when data are scale and not severely skewed.

To find the SD and Variance, click Analyze on the main menu bar, then Descriptive Statistics, and then Explore. Click OK.
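The definition can be traced step by step in Python. This sketch uses a small hypothetical set of scores. It computes the variance both ways it is commonly defined: dividing the sum of squared deviations by N, which matches the "average" wording above, and dividing by N - 1, which is the sample estimate that SPSS reports. The two differ noticeably only in small samples.

```python
from math import sqrt

# Hypothetical scores (illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n
squared_deviations = [(x - mean) ** 2 for x in scores]

# Dividing by N gives the literal average of the squared deviations.
variance_n = sum(squared_deviations) / n
sd_n = sqrt(variance_n)

# Dividing by N - 1 gives the sample variance.
variance_n_minus_1 = sum(squared_deviations) / (n - 1)
sd_n_minus_1 = sqrt(variance_n_minus_1)

print(variance_n, sd_n)                                       # 4.0 2.0
print(round(variance_n_minus_1, 3), round(sd_n_minus_1, 3))   # 4.571 2.138
```

Either way, the standard deviation is expressed in the original units of the scores, while the variance is in squared units.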

Page 15:

Measures of Dispersion

We get the following values for the temperature example.

Consistent with the earlier boxplots, for all quantitative measures, Springfield is shown to have a greater temperature spread than San Francisco.

Descriptives

temp

city             Statistic              Value
Springfield      Variance               282.992
                 Std. Deviation          16.822
                 Range                   46
                 Interquartile Range     34
San Francisco    Variance                29.061
                 Std. Deviation           5.391
                 Range                   15
                 Interquartile Range     10

Page 16:

Measures of Dispersion

Key words to indicate that a question relates to dispersion: spread, variability, dispersion, heterogeneity, inconsistency, unpredictability.

Page 17:

Measures of Shape

Definition: The skewness statistic is a measure of the shape of a distribution. It is negative when the distribution is negatively skewed, zero when the distribution is not skewed, and positive when the distribution is positively skewed. Its calculation is based on the cubed deviations from the mean.

Definition: The skewness ratio is the value of the skewness statistic divided by its standard error. This measure is useful for determining the extent of skew. As a rule of thumb, when this ratio exceeds 2 in magnitude for small and moderate sized samples, the distribution is considered to be severely skewed.

Useful when data are scale.

To find the skewness ratio, click Analyze on the main menu bar, then Descriptive Statistics, and then Explore. Click OK. Divide the skewness statistic by the standard error of the skew.
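For readers who want to see the arithmetic behind the skewness ratio, here is a minimal Python sketch. It uses one common sample formula for skewness (the adjusted Fisher-Pearson coefficient, built from cubed standardized deviations) together with a standard large-sample formula for its standard error. That these match SPSS output digit for digit is an assumption, and the function names and example data are invented for the illustration.

```python
from math import sqrt

def skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed formula)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cubed_z = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed_z

def skewness_se(n):
    """Standard error of skewness; depends only on the sample size."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def skewness_ratio(values):
    """Skewness statistic divided by its standard error."""
    return skewness(values) / skewness_se(len(values))

# Hypothetical data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 3, 3, 3, 4, 20]

print(abs(skewness(symmetric)) < 1e-9)     # True: essentially zero skew
print(skewness(right_tail) > 0)            # True: positively skewed
print(abs(skewness_ratio(right_tail)) > 2) # True: severely skewed by the rule of thumb
```

Cubing the deviations preserves their sign, so a long right tail produces a positive statistic and a long left tail a negative one.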

Page 18:

Examples of Distributions of Different Shape

Page 19:

How the Shape of the Distribution Affects the Mean and Median

• For a severely positively skewed distribution, in general, the mean is greater than the median.
• For a severely negatively skewed distribution, in general, the mean is less than the median.
• For a symmetric distribution, the mean equals the median.
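The three statements above can be seen in miniature with three small hypothetical data sets that share most of their values and differ only in one extreme score. This is a sketch for intuition, not a proof.

```python
from statistics import mean, median

# Hypothetical data sets (illustration only).
symmetric = [1, 2, 3, 4, 5]
positively_skewed = [1, 2, 3, 4, 50]    # one extreme high score
negatively_skewed = [1, 47, 48, 49, 50] # one extreme low score

print(mean(symmetric) == median(symmetric))                 # True
print(mean(positively_skewed) > median(positively_skewed))  # True
print(mean(negatively_skewed) < median(negatively_skewed))  # True
```

The extreme score pulls the mean toward the tail, while the median, which depends only on the middle position, barely moves.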

Page 20:

Which Measure of Central Tendency Should One Use?

• An article in the Wall Street Journal online (http://online.wsj.com/article/SB118790518546107112.html) from August 24, 2007 reported the following:
  • The average cost of a wedding is between $27,400 and $28,800.
  • The median is approximately $15,000.

How can we justify this apparent contradiction in the cost of a wedding?

Page 21:

Applying What We Have Learned

• What is the extent to which eighth-grade males expect larger incomes at age 30 than eighth-grade females?

• To what extent is there lack of consensus among males in their income expectations as compared to females?

• How are the answers to these questions influenced by the outliers and general shape of these distributions as shown in the boxplots in the last slide?

Page 22:

Descriptive Statistics for Males and Females

Descriptives

Expected income at age 30

Gender    Statistic              Value         Std. Error
Male      Mean                   60720.93      5405.410
          Median                 45000.00
          Variance               6E+009
          Std. Deviation         79258.866
          Minimum                1
          Maximum                1000000
          Range                  999999
          Interquartile Range    25000
          Skewness               8.863         .166
Female    Mean                   43515.57      1726.263
          Median                 40000.00
          Variance               7E+008
          Std. Deviation         26965.088
          Minimum                0
          Maximum                250000
          Range                  250000
          Interquartile Range    20000
          Skewness               3.816         .156
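The skewness ratios for the two groups can be worked out directly from the Descriptives output above. The sketch below only divides the reported skewness statistics by their reported standard errors and compares each mean with its median. The dictionary layout and variable names are invented for the illustration.

```python
# Values copied from the Descriptives output for expected income at age 30.
reported = {
    "Male":   {"mean": 60720.93, "median": 45000.00,
               "skewness": 8.863, "skewness_se": 0.166},
    "Female": {"mean": 43515.57, "median": 40000.00,
               "skewness": 3.816, "skewness_se": 0.156},
}

# Skewness ratio = skewness statistic / its standard error.
ratios = {g: s["skewness"] / s["skewness_se"] for g, s in reported.items()}

for gender, s in reported.items():
    gap = s["mean"] - s["median"]
    print(f"{gender}: skewness ratio = {ratios[gender]:.1f}, "
          f"mean - median = {gap:.2f}")
# Male: skewness ratio = 53.4, mean - median = 15720.93
# Female: skewness ratio = 24.5, mean - median = 3515.57
```

Both ratios are far above 2, so both distributions count as severely positively skewed under the rule of thumb, and in both the mean exceeds the median. This suggests comparing the groups on the median and IQR rather than on the mean and standard deviation.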

Page 23:

Boxplots for Males and Females