Exam 2: Review G 201 Statistics for Political Science 1

Upload: joseph-allison

Post on 17-Jan-2016


TRANSCRIPT

Page 1: Exam 2: Review G 201 Statistics for Political Science 1

Exam 2: Review

G 201 Statistics for Political Science

Page 2: Exam 2: Review G 201 Statistics for Political Science 1

Exam 2: Review Topics

Chapter 3: Central Tendency
1. Mode, Median, Mean (Definition, Formula for each)
2. Skewed Distribution
3. Symmetrical Distributions

Chapter 4: Variability
1. Range (Definition, Formula)
2. Deviation (Definition, Formula)
3. Variance (Definition, Formula)
4. Standard Deviation (Definition, Formula)

Page 3: Exam 2: Review G 201 Statistics for Political Science 1

Measures of central tendency:

Measures of central tendency are numbers that describe what is average or typical in a distribution.

We will focus on three measures of central tendency:
– The Mode
– The Median
– The Mean (average)

Our choice of an appropriate measure of central tendency depends on three factors: (a) the level of measurement, (b) the shape of the distribution, (c) the purpose of the research.

Page 4: Exam 2: Review G 201 Statistics for Political Science 1

The Mode

The mode is the most frequent, most typical, or most common value or category in a distribution.

Example: There are more Protestants in the US than people of any other religion.

The mode is always a category or score, not a frequency.

The mode is not necessarily the category with the majority (that is, 50% or more) of cases. It is simply the category in which the largest number (or proportion) of cases falls.

Page 5: Exam 2: Review G 201 Statistics for Political Science 1

Ten Most Common Foreign Languages Spoken in the United States, 1990

Language      Number of Speakers
Spanish       17,339,000
French        1,702,000
German        1,547,000
Italian       1,309,000
Chinese       1,249,000
Tagalog       843,000
Polish        723,000
Korean        626,000
Vietnamese    507,000
Portuguese    430,000

Source: U.S. Bureau of the Census, Statistical Abstract of the United States, 2000, Table 51.

Page 6: Exam 2: Review G 201 Statistics for Political Science 1

A Review of Mode

Is the mode 17,339,000?

NO!

Recall: The mode is the category or score, not the frequency!

Thus, the mode is Spanish.
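The distinction between the modal category and its frequency can be checked with a few lines of Python. This is a minimal sketch using the speaker counts from the language table; the variable names are chosen here for illustration only.

```python
# Speaker counts from the 1990 language table (category -> frequency).
speakers = {
    "Spanish": 17_339_000,
    "French": 1_702_000,
    "German": 1_547_000,
    "Italian": 1_309_000,
    "Chinese": 1_249_000,
    "Tagalog": 843_000,
    "Polish": 723_000,
    "Korean": 626_000,
    "Vietnamese": 507_000,
    "Portuguese": 430_000,
}

# The mode is the category with the highest frequency, not the frequency itself.
mode_category = max(speakers, key=speakers.get)
mode_frequency = speakers[mode_category]

print(mode_category)   # Spanish
print(mode_frequency)  # 17339000 (the frequency of the mode, not the mode)
```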

Page 7: Exam 2: Review G 201 Statistics for Political Science 1

The Mode

Some additional points to consider about modes:

Some distributions have two modes, where two response categories have the highest frequencies. Such distributions are said to be bimodal.

NOTE: When two scores or categories have the highest frequencies and those frequencies are quite close, but not identical, the distribution is still “essentially” bimodal. In these instances, report both the “true” mode and the highest-frequency categories.
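A strictly bimodal distribution (two categories tied for the highest frequency) can be detected with `statistics.multimode` from the Python standard library (Python 3.8+). The responses below are hypothetical, invented only to illustrate the idea.

```python
from statistics import multimode

# Hypothetical survey responses (invented for illustration).
responses = ["Agree", "Disagree", "Agree", "Neutral",
             "Disagree", "Agree", "Disagree"]

# multimode returns every value tied for the highest frequency,
# in the order each value first appears in the data.
modes = multimode(responses)

print(modes)            # ['Agree', 'Disagree']
print(len(modes) == 2)  # True: the distribution is bimodal
```

Note that this only catches exact ties. The "essentially bimodal" case described in the note above still calls for judgment about how close the two frequencies are.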

Page 8: Exam 2: Review G 201 Statistics for Political Science 1

[Figure: Example of a Bimodal Frequency Distribution]

Page 9: Exam 2: Review G 201 Statistics for Political Science 1

The Median

The median is the score that divides the distribution into two equal parts, so that half of the cases are above it and half are below it.

The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data.

It must be emphasized that the median is the exact middle of a distribution.

So, now let’s look at ways we can find the median in sorted data:

Page 10: Exam 2: Review G 201 Statistics for Political Science 1

In some cases, we can find the median by simple inspection.

Let’s look at the responses (A) to the question: “Think about the economy, how would you rate economic conditions in the country today?”

First, we sort the responses (B) in order from lowest to highest (or highest to lowest).

Since we have an odd number of cases, let’s find the middle case.

A (unsorted)

Response     Respondent
Poor         Jim
Good         Sue
Only Fair    Bob
Poor         Jorge
Excellent    Karen
Total (N)    5

B (sorted)

Response     Respondent
Poor         Jim
Poor         Jorge
Only Fair    Bob
Good         Sue
Excellent    Karen
Total (N)    5

Page 11: Exam 2: Review G 201 Statistics for Political Science 1

Calculating the median:

Respondent   Response
Jim          Poor
Jorge        Poor
Bob          Only Fair
Sue          Good
Karen        Excellent

We can find the median through visual inspection and through calculation.

We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2.

Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is thus the third case (Bob), and the median response is “Only Fair.”
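The (N + 1) ÷ 2 rule for an odd number of ordinal cases can be sketched in Python as follows. The `SCALE` list is an assumption made here: it encodes the rating order from lowest to highest so that the responses can be sorted.

```python
# Assumed ordinal scale, lowest to highest rating.
SCALE = ["Poor", "Only Fair", "Good", "Excellent"]

# Responses as collected (unsorted).
responses = {
    "Jim": "Poor",
    "Sue": "Good",
    "Bob": "Only Fair",
    "Jorge": "Poor",
    "Karen": "Excellent",
}

# Sort the cases by the rank of each response on the ordinal scale.
ordered = sorted(responses.items(), key=lambda item: SCALE.index(item[1]))

n = len(ordered)
position = (n + 1) // 2  # (5 + 1) / 2 = 3; exact because N is odd

# Positions are 1-based, list indexes are 0-based.
name, median_response = ordered[position - 1]

print(position)         # 3
print(name)             # Bob
print(median_response)  # Only Fair
```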

Page 12: Exam 2: Review G 201 Statistics for Political Science 1

Calculating the median:

Another example: The following is a list of the number of hate crimes reported in the nine largest U.S. states for 1997.

State            Number
California       1831
Florida          93
Virginia         105
New Jersey       694
New York         853
Ohio             265
Pennsylvania     168
Texas            333
North Carolina   42
TOTAL            N = 9

Page 13: Exam 2: Review G 201 Statistics for Political Science 1

Calculating the median:

Finding the Median State for Hate Crimes

1. Order the cases from lowest to highest.

2. In this situation, we need the 5th case: (9 + 1) ÷ 2 = 5, which is Ohio.

Remember: (N + 1) ÷ 2.

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
Ohio             265
Texas            333
New Jersey       694
New York         853
California       1831
                 N = 9

Page 14: Exam 2: Review G 201 Statistics for Political Science 1

Finding the Median Number of Hate Crimes out of Eight States

Order the cases from lowest to highest.

For an even number of cases, there will be two middle cases.

In this instance, the median falls halfway between both cases (216.5).

However, the circumstances being explained should determine if you use the two middle cases or the point halfway between both cases for your explanation.

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
Ohio             265
Texas            333
New Jersey       694
New York         853

Page 15: Exam 2: Review G 201 Statistics for Political Science 1

Finding the Median Number of Hate Crimes out of Eight States

1. In this instance, the median falls halfway between both cases (216.5).

(8 + 1) ÷ 2 = 4.5

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
Ohio             265
Texas            333
New Jersey       694
New York         853

Position 4.5 falls between the 4th case (Pennsylvania, 168) and the 5th case (Ohio, 265): (168 + 265) ÷ 2 = 216.5.
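Both hate-crime examples follow the same positional rule, which can be sketched as one small function. The name `median_by_position` is hypothetical; the function applies (N + 1) ÷ 2 and averages the two middle cases when N is even.

```python
def median_by_position(values):
    """Return the median using the (N + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position of the median

    if n % 2 == 1:
        # Odd N: the position is a whole number, so take that case.
        return ordered[int(position) - 1]

    # Even N: the position ends in .5, so average the two middle cases.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


nine_states = [1831, 93, 105, 694, 853, 265, 168, 333, 42]
eight_states = [42, 93, 105, 168, 265, 333, 694, 853]

print(median_by_position(nine_states))   # 265 (Ohio)
print(median_by_position(eight_states))  # 216.5
```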

Page 16: Exam 2: Review G 201 Statistics for Political Science 1

The Median (Mdn): Examples

Odd Number of Cases: Median exactly in the middle

12, 17, 13, 11, 16, 25, 20 (not ordered)

11, 12, 13, 16, 17, 20, 25 (ordered: lowest to highest)

N = 7
(N + 1) ÷ 2 = (7 + 1) ÷ 2 = 4

The 4th case in the ordered list is 16, so Mdn = 16.

Page 17: Exam 2: Review G 201 Statistics for Political Science 1

The Median (Mdn): Examples

Even Number of Cases: Median is the point above and below which 50% of the cases fall

17, 12, 16, 13, 11, 25, 20, 26 (not ordered)

11, 12, 13, 16, 17, 20, 25, 26 (ordered)

N = 8
(N + 1) ÷ 2 = (8 + 1) ÷ 2 = 4.5

Position 4.5 falls between the 4th case (16) and the 5th case (17), so Mdn = 16.5.
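As a quick check on the two examples, Python's standard-library `statistics.median` sorts the data itself and handles both the odd and the even case. This is offered as a verification sketch, not as the method used in the course.

```python
from statistics import median

odd_scores = [12, 17, 13, 11, 16, 25, 20]       # N = 7
even_scores = [17, 12, 16, 13, 11, 25, 20, 26]  # N = 8

odd_median = median(odd_scores)    # middle (4th) case of the sorted scores
even_median = median(even_scores)  # average of the 4th and 5th cases

print(odd_median)   # 16
print(even_median)  # 16.5
```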

Page 18: Exam 2: Review G 201 Statistics for Political Science 1

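The Mean

The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.

Here is the formula for calculating the mean:

$\bar{X} = \frac{\sum X}{N}$

where $\bar{X}$ = mean, $\sum X$ = sum of all the scores, and N = total number of scores.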

Page 19: Exam 2: Review G 201 Statistics for Political Science 1

Finding the Mean

Communicable Diseases -> Tuberculosis (as of 22 March 2007)

Country                                  2005
Bangladesh                               37
Bhutan                                   44
Democratic People's Republic of Korea    103
India                                    58
Indonesia                                47
Maldives                                 76
Myanmar                                  119
Nepal                                    64
Sri Lanka                                71
Thailand                                 61
Timor-Leste                              71
N (cases) = 11                           ΣX = 751

© World Health Organization, 2008. All rights reserved

Page 20: Exam 2: Review G 201 Statistics for Political Science 1

Finding the Mean

To identify the mean number of new tuberculosis cases found in 2006 by the WHO in this region:

– Add up the cases for all of the countries in the region, and
– Divide the sum by the total number of cases (here, the 11 countries).

Thus, the mean rate is (751 ÷ 11) = 68.273.
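The same arithmetic can be reproduced in Python. This is a minimal sketch using the figures from the tuberculosis table; the dictionary name `tb_rates` is chosen here for illustration.

```python
# Values from the WHO tuberculosis table (one score per country).
tb_rates = {
    "Bangladesh": 37,
    "Bhutan": 44,
    "Democratic People's Republic of Korea": 103,
    "India": 58,
    "Indonesia": 47,
    "Maldives": 76,
    "Myanmar": 119,
    "Nepal": 64,
    "Sri Lanka": 71,
    "Thailand": 61,
    "Timor-Leste": 71,
}

total = sum(tb_rates.values())  # sum of all the scores
n = len(tb_rates)               # number of cases (countries)
mean_rate = total / n           # mean = sum of scores / N

print(total)                # 751
print(n)                    # 11
print(round(mean_rate, 3))  # 68.273
```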

Page 21: Exam 2: Review G 201 Statistics for Political Science 1

Using a formula to calculate the mean:

The Usefulness of Formulas: The mean introduces the usefulness of a formula, which may be defined as a shorthand way to explain what operations we need to follow to obtain a certain result.

Again, the formula that defines the mean is:

$\bar{X} = \frac{\sum X}{N}$

Page 22: Exam 2: Review G 201 Statistics for Political Science 1

Deviation:

The deviation indicates the distance and direction of any raw score from the mean.

To find the deviation of a particular score, we simply subtract the mean from the score:

$\text{Deviation} = X - \bar{X}$

where X = any raw score in the distribution
$\bar{X}$ = mean of the distribution
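A short Python sketch makes the "distance and direction" idea concrete. It uses the weeks-on-unemployment scores (9, 8, 6, 4, 2, 1) that appear in the variance examples later in this review: a positive deviation means the score is above the mean, a negative one means it is below.

```python
# Weeks-on-unemployment scores from the variance example in this review.
weeks = [9, 8, 6, 4, 2, 1]

mean = sum(weeks) / len(weeks)  # 30 / 6 = 5.0

# Deviation = raw score minus the mean; the sign gives the direction.
deviations = [x - mean for x in weeks]

print(mean)        # 5.0
print(deviations)  # [4.0, 3.0, 1.0, -1.0, -3.0, -4.0]
```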

Page 23: Exam 2: Review G 201 Statistics for Political Science 1

So what does this tell us?

The mode is the peak of the curve.

The mean is found closest to the tail, where the relatively few extreme cases will be found.

The median is found between the mode and mean or is aligned with them in a normal distribution.


Page 24: Exam 2: Review G 201 Statistics for Political Science 1

Did you know?

The shape or form of a distribution can influence the researcher’s choice of a measure of central tendency.

Why is that? Well, let’s see…

Page 25: Exam 2: Review G 201 Statistics for Political Science 1

Measures of Variability

Chapter 4: Measures of Variability

Page 26: Exam 2: Review G 201 Statistics for Political Science 1

Measures of Variability

Measures of variability tell us:

• The extent to which the scores differ from each other or how spread out the scores are.

• How accurately the measure of central tendency describes the distribution.

• The shape of the distribution.

Page 27: Exam 2: Review G 201 Statistics for Political Science 1

Measures of Variability

Just what is variability?

Variability is the spread or dispersion of scores.

Measuring Variability

There are a few ways to measure variability, and they include:

1) The Range
2) The Deviation
3) The Standard Deviation
4) The Variance

Page 28: Exam 2: Review G 201 Statistics for Political Science 1

Measures of Variability

Range: The range is a measure of the distance between the highest and lowest scores.

R = H – L

Temperature Example:

City        High – Low     Range
Honolulu    89° – 65°      24°
Phoenix     106° – 41°     65°

Page 29: Exam 2: Review G 201 Statistics for Political Science 1

Okay, so now you tell me the range…

This table indicates the number of metropolitan areas, as defined by the Census Bureau, in seven states.

State         Metropolitan Areas
Delaware      3
Idaho         4
Nebraska      4
Kansas        5
Iowa          4
Montana       3
California    9

What is the range in the number of metropolitan areas in these seven states?

– R = H – L
– R = 9 – 3
– R = 6
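The range calculation R = H – L is a one-liner in Python. The sketch below uses the metropolitan-area counts from the table; the variable names are illustrative.

```python
# Number of metropolitan areas per state, from the table.
metro_areas = {
    "Delaware": 3,
    "Idaho": 4,
    "Nebraska": 4,
    "Kansas": 5,
    "Iowa": 4,
    "Montana": 3,
    "California": 9,
}

highest = max(metro_areas.values())  # H
lowest = min(metro_areas.values())   # L
value_range = highest - lowest       # R = H - L

print(highest, lowest, value_range)  # 9 3 6
```

Because the range depends only on the two extreme scores, dropping California alone would shrink it from 6 to 2, which is one reason the later measures use every score.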

Page 30: Exam 2: Review G 201 Statistics for Political Science 1

The Variance

Remember that the deviation is the distance of any given score from its mean:

$(X - \bar{X})$

The variance takes into account every score.

But if we were to simply add them up, the plus and minus (positive and negative) scores would cancel each other out because the sum of actual deviations is always zero!

$\sum (X - \bar{X}) = 0$
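Why the deviations always cancel can be seen in a short derivation. This is a sketch that relies only on the definition of the mean, $\bar{X} = \sum X / N$, and on the fact that summing the constant $\bar{X}$ over N scores gives $N\bar{X}$:

```latex
\begin{align*}
\sum (X - \bar{X}) &= \sum X - N\bar{X} \\
                   &= \sum X - N \cdot \frac{\sum X}{N} \\
                   &= \sum X - \sum X \\
                   &= 0
\end{align*}
```

For the scores 9, 8, 6, 4, 2, 1 with mean 5, the deviations are 4, 3, 1, -1, -3, -4, which indeed sum to zero.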

Page 31: Exam 2: Review G 201 Statistics for Political Science 1

The Variance

So, what should we do?

We square the actual deviations and then add them together.

– Remember: When you square a negative number it becomes positive!

So,

$s^2$ = sum of squared deviations divided by the number of scores.

The variance provides information about the relative variability.

Page 32: Exam 2: Review G 201 Statistics for Political Science 1

Variance: Weeks on Unemployment:

X (weeks)    Deviation (raw score from the mean)    Squared deviation
9            9 - 5 = 4                              4² = 16
8            8 - 5 = 3                              3² = 9
6            6 - 5 = 1                              1² = 1
4            4 - 5 = -1                             (-1)² = 1
2            2 - 5 = -3                             (-3)² = 9
1            1 - 5 = -4                             (-4)² = 16

ΣX = 30
$\bar{X}$ = 30 ÷ 6 = 5
Sum of squared deviations = 16 + 9 + 1 + 1 + 9 + 16 = 52

Step 1: Calculate the Mean
Step 2: Calculate the Deviation
Step 3: Calculate the Sum of Squared Deviations

Page 33: Exam 2: Review G 201 Statistics for Political Science 1

The Variance

The mean of the squared deviations is the same as the variance, and can be symbolized by $s^2$:

$s^2 = \frac{\sum (X - \bar{X})^2}{N}$

where
$s^2$ = variance
$\sum (X - \bar{X})^2$ = sum of the squared deviations from the mean
N = total number of scores
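The variance formula can be traced step by step in Python. This is a minimal sketch using the weeks-on-unemployment scores from this review; it divides by N, as the formula here does, rather than by N - 1.

```python
# Weeks-on-unemployment scores from the worked example.
weeks = [9, 8, 6, 4, 2, 1]
n = len(weeks)

# Step 1: calculate the mean.
mean = sum(weeks) / n  # 30 / 6 = 5.0

# Step 2: calculate each deviation, then square it.
squared_deviations = [(x - mean) ** 2 for x in weeks]

# Step 3: sum the squared deviations.
sum_of_squares = sum(squared_deviations)

# Step 4: divide by N to get the mean of the squared deviations.
variance = sum_of_squares / n

print(squared_deviations)  # [16.0, 9.0, 1.0, 1.0, 9.0, 16.0]
print(sum_of_squares)      # 52.0
print(round(variance, 2))  # 8.67 (weeks squared)
```

The standard library's `statistics.pvariance` computes the same divide-by-N quantity, while `statistics.variance` divides by N - 1 and would give a different answer.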

Page 34: Exam 2: Review G 201 Statistics for Political Science 1

Variance: Weeks on Unemployment:

X (weeks)    Deviation (raw score from the mean)    Squared deviation
9            9 - 5 = 4                              4² = 16
8            8 - 5 = 3                              3² = 9
6            6 - 5 = 1                              1² = 1
4            4 - 5 = -1                             (-1)² = 1
2            2 - 5 = -3                             (-3)² = 9
1            1 - 5 = -4                             (-4)² = 16

ΣX = 30
$\bar{X}$ = 30 ÷ 6 = 5
Sum of squared deviations = 52
Variance: $s^2$ = 52 ÷ 6 = 8.67 (weeks squared)

Step 1: Calculate the Mean
Step 2: Calculate the Deviation
Step 3: Calculate the Sum of Squared Deviations
Step 4: Calculate the Mean of the Squared Deviations

Page 35: Exam 2: Review G 201 Statistics for Political Science 1

Standard Deviation:

What is a standard deviation?

It is the typical (standard) difference (deviation) of an observation from the mean.

Think of it as the average distance a data point is from the mean, although this is not strictly true.

Page 36: Exam 2: Review G 201 Statistics for Political Science 1

Standard Deviation:

What is a standard deviation?

The standard deviation is calculated by taking the square root of the variance:

$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

Page 37: Exam 2: Review G 201 Statistics for Political Science 1

Variance: Weeks on Unemployment:

X (weeks)    Deviation (raw score from the mean)    Squared deviation
9            9 - 5 = 4                              4² = 16
8            8 - 5 = 3                              3² = 9
6            6 - 5 = 1                              1² = 1
4            4 - 5 = -1                             (-1)² = 1
2            2 - 5 = -3                             (-3)² = 9
1            1 - 5 = -4                             (-4)² = 16

ΣX = 30
$\bar{X}$ = 30 ÷ 6 = 5
Sum of squared deviations = 52
Variance: $s^2$ = 52 ÷ 6 = 8.67 (weeks squared)
Standard Deviation (square root of the variance): s = 2.94

Step 1: Calculate the Mean
Step 2: Calculate the Deviation
Step 3: Calculate the Sum of Squared Deviations
Step 4: Calculate the Mean of the Squared Deviations
Step 5: Calculate the Square Root of the Variance
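The five steps can be collected into one small function. The name `population_sd` is hypothetical; it follows the divide-by-N definition used in this review.

```python
import math


def population_sd(scores):
    """Standard deviation as the square root of the divide-by-N variance."""
    n = len(scores)
    mean = sum(scores) / n                                  # Step 1
    squared_deviations = [(x - mean) ** 2 for x in scores]  # Steps 2 and 3
    variance = sum(squared_deviations) / n                  # Step 4
    return math.sqrt(variance)                              # Step 5


weeks = [9, 8, 6, 4, 2, 1]
s = population_sd(weeks)

print(round(s, 2))  # 2.94 (weeks)
```

Unlike the variance, which is in weeks squared, the standard deviation is back in the original unit (weeks), which is why it is easier to interpret.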

Page 38: Exam 2: Review G 201 Statistics for Political Science 1

Raw Score Calculations

Here is how you calculate variance using raw scores:

$s^2 = \frac{\sum X^2}{N} - \bar{X}^2$

Here is how you calculate standard deviation using raw scores:

$s = \sqrt{\frac{\sum X^2}{N} - \bar{X}^2}$

Page 39: Exam 2: Review G 201 Statistics for Political Science 1

Variance: Weeks on Unemployment:

X (weeks)    X²
9            9² = 81
8            8² = 64
6            6² = 36
4            4² = 16
2            2² = 4
1            1² = 1

ΣX = 30      ΣX² = 202

$\bar{X}$ = 30 ÷ 6 = 5
$\bar{X}^2$ = 25

$s^2$ = (202 ÷ 6) - 25 = 33.67 - 25 = 8.67
s = √8.67 = 2.94

Step 1: Calculate the Mean
Step 2: Square the Raw Scores
Step 3: Calculate the Variance
Step 4: Calculate the Standard Deviation
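The raw-score method can be checked against the deviation method in Python. This is a sketch under the same divide-by-N convention; both routes should agree up to floating-point rounding.

```python
import math

weeks = [9, 8, 6, 4, 2, 1]
n = len(weeks)

# Step 1: calculate the mean.
mean = sum(weeks) / n  # 5.0

# Step 2: square the raw scores and add them up.
sum_of_squared_scores = sum(x * x for x in weeks)  # 202

# Step 3: variance = (sum of X squared / N) - mean squared.
variance_raw = sum_of_squared_scores / n - mean ** 2

# Step 4: standard deviation = square root of the variance.
sd_raw = math.sqrt(variance_raw)

# Cross-check with the deviation (definitional) method.
variance_def = sum((x - mean) ** 2 for x in weeks) / n

print(sum_of_squared_scores)   # 202
print(round(variance_raw, 2))  # 8.67
print(round(sd_raw, 2))        # 2.94
print(math.isclose(variance_raw, variance_def))  # True
```

The raw-score form avoids computing each deviation, which is convenient by hand, although with very large scores the subtraction can lose precision in floating-point arithmetic.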

Page 40: Exam 2: Review G 201 Statistics for Political Science 1

Standard Deviation

Standard Deviation: Applications

Standard deviation also allows us to:

1) Measure the baseline of a frequency polygon.
2) Find the distance between raw scores and the mean: a standardized method that permits comparisons between raw scores in the distribution, as well as between different distributions.

Page 41: Exam 2: Review G 201 Statistics for Political Science 1

Standard Deviation

Standard Deviation: Baseline of a Frequency Polygon

The baseline of a frequency polygon can be measured in units of standard deviation.

Example: $\bar{X}$ = 80, s = 5

Thus, the raw score 85 lies one standard deviation above the mean (+1s).
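Expressing a raw score in standard-deviation units amounts to dividing its deviation by s. The helper below is a hypothetical sketch (the name `sd_units` is not from the course), and the score of 70 is an extra illustrative value.

```python
def sd_units(score, mean, sd):
    """Distance of a raw score from the mean, in standard-deviation units."""
    return (score - mean) / sd


mean, sd = 80, 5

print(sd_units(85, mean, sd))  # 1.0  -> one standard deviation above the mean (+1s)
print(sd_units(70, mean, sd))  # -2.0 -> two standard deviations below the mean (-2s)
```

Because the result is unit-free, the same calculation lets scores from different distributions be compared on a common scale.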

Page 42: Exam 2: Review G 201 Statistics for Political Science 1

Standard Deviation

Standard Deviation: The Normal Range

Unless highly skewed, approximately two-thirds of scores within a distribution will fall within one standard deviation above and below the mean.

Example: Reading Levels (words per minute)

$\bar{X}$ = 120, s = 25

[Figure: Normal Range]
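For the reading-level example, the bounds of the normal range are the mean minus and plus one standard deviation. The function name `normal_range` is chosen here for illustration.

```python
def normal_range(mean, sd):
    """Return (lower, upper) bounds: one standard deviation either side of the mean."""
    return (mean - sd, mean + sd)


low, high = normal_range(120, 25)

print(low, high)  # 95 145
```

So, unless the distribution is highly skewed, roughly two-thirds of readers would be expected to read between 95 and 145 words per minute.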
