review


Upload: grover

Post on 05-Jan-2016



Page 1: Review

Review

• Measures of Central Tendency – Mean, median, mode

• Measures of Variation – Variance, standard deviation

Page 2: Review

Variance is defined as

s² = Σ(X - X̄)² / N
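As a minimal sketch of the measures reviewed so far, the following Python uses the standard library `statistics` module. The scores are made-up illustration data, not course data, and the variance follows the definition on this slide (sum of squared deviations divided by N).

```python
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(scores)      # sum of the scores divided by N
median = statistics.median(scores)  # middle value (mean of the two middle values when N is even)
mode = statistics.mode(scores)      # most frequent value

# Variance as defined on the slide: squared deviations from the mean, summed, divided by N.
n = len(scores)
variance = sum((x - mean) ** 2 for x in scores) / n
std_dev = variance ** 0.5           # standard deviation is the square root of the variance

print(mean, median, mode, variance, std_dev)
```

For these scores the mean is 5, the median 4.5, the mode 4, the variance 4 and the standard deviation 2.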

Page 3: Review

The Normal Curve

• The mean and standard deviation, in conjunction with the normal curve, allow for a more sophisticated description of the data and (as we will see later) statistical analysis

• For example, a school is not that interested in the raw GRE score; it is interested in how you score relative to others.

Page 4: Review

• Even if the school knows the average (mean) GRE score, your raw score still doesn’t tell them much, since in a perfectly normal distribution, 50% of people will score higher than the mean.

• This is where the standard deviation is so helpful. It helps interpret raw scores and understand the likelihood of a score.

• So if I told you I scored 710 on the quantitative section and the mean score is 591, is that good?

Page 5: Review

• It’s above average, but who cares.

• What if I tell you the standard deviation is 148?

• What does that mean?

• What if I said the standard deviation is 5?

• Calculating z-scores

Page 6: Review

Converting raw scores to z scores

What is a z score? What does it represent?

Z = (x-µ) / σ

Converting z scores into raw scores

X = z σ + µ

Example, with µ = 563 and σ = 140: Z = (710-563)/140 = 147/140 = 1.05
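The two conversion formulas can be sketched as a pair of small Python functions. The function names are illustrative only; the numbers are the ones used on these slides.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Invert the z-score formula: X = z * sigma + mu."""
    return z * sigma + mu


# Values from the worked example: a score of 710 with mean 563 and SD 140.
z = z_score(710, 563, 140)
print(round(z, 2))  # 1.05

# Converting back recovers the raw score (up to floating-point rounding).
print(round(raw_score(z, 563, 140)))  # 710

# The same raw score with the mean of 591 quoted earlier, under the two SDs discussed.
print(round(z_score(710, 591, 148), 2))  # 0.8
print(round(z_score(710, 591, 5), 1))    # 23.8
```

With a standard deviation of 148, a 710 is less than one standard deviation above the mean; with a standard deviation of 5, the same score would be almost 24 standard deviations above it, which is why the raw score alone says so little.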

Page 7: Review
Page 8: Review

Finding Probabilities under the Normal Curve

So what % of GRE takers scored above and below 710?

The importance of Table A

Why is this important? Inferential Statistics (to be cont.)
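A table lookup of this kind can be cross-checked in code. This is a sketch that assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; it uses the z score of 1.05 from the earlier example and assumes the scores are normally distributed.

```python
from statistics import NormalDist

z = 1.05  # z score for a raw score of 710 in the earlier example

# Area under the standard normal curve to the left of z.
below = NormalDist().cdf(z)
# The total area is 1, so the remainder lies to the right of z.
above = 1 - below

print(f"{below:.1%} scored below, {above:.1%} scored above")
```

Under that assumption, roughly 85% of test takers fall below a z score of 1.05 and roughly 15% fall above it.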

Page 9: Review

Stuff you don’t need to know:

pi ≈ 3.14159265

e ≈ 2.71828

Page 10: Review

The Normal Curve and Sampling

A. A sample will (almost) always be different from the true population

B. This is called “sampling error”

C. Sampling error is the difference between a sample and the true population; it occurs regardless of how well the survey was designed or implemented

D. It is different from measurement error or sample bias

Page 11: Review

Sampling distribution of Means

• The existence of sampling error means that if you take 1,000 random samples from a population, calculate the 1,000 sample means, and plot the distribution of those means, you will get a consistent distribution that has the following characteristics:

Page 12: Review

Characteristics of a Sampling Distribution

• 1. The distribution approximates a normal curve.

• 2. The mean of a sampling distribution of means is equal to the true population mean.

• 3. The standard deviation of a sampling distribution is smaller than the standard deviation of the population. There is less variation in the distribution because we are dealing not with raw scores but with central tendencies.
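The characteristics listed on this slide can be checked with a small simulation. This is only a sketch: the population is invented (10,000 values drawn from a skewed distribution), the sample size of 100 is an assumption, and the seed is fixed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a deliberately skewed distribution.
population = [random.expovariate(1 / 50) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Draw 1,000 random samples of 100 cases each and record every sample mean.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(1_000)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print("population mean:", round(pop_mean, 2), "mean of sample means:", round(mean_of_means, 2))
print("population SD:", round(pop_sd, 2), "SD of sample means:", round(sd_of_means, 2))
```

The mean of the sample means should land very close to the population mean, while the standard deviation of the sample means should be far smaller than the population standard deviation (about one tenth of it for samples of 100).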

Page 13: Review

Why is the normal curve so important?

• If we define probability in terms of the likelihood of occurrence, then the normal curve can be regarded as a probability distribution (the probability of occurrence decreases as we move away from the center).

• With this notion, we can find the probability of obtaining a raw score in a distribution, given a certain mean and SD.

Page 14: Review

Probability and the Normal Curve

In chapter 6, we are interested not in the distribution of raw scores but in the distribution of sample means, and in making probability statements about those sample means.

Page 15: Review

Probability and the Sampling Distribution

Why is making probabilistic statements about a central tendency important?

• 1. it will allow us to engage in inferential statistics (later in ch. 7)

• 2. it allows us to produce confidence intervals

Page 16: Review

Example of number 1:

• The President of UNLV states that the average salary of a new UNLV graduate is $60,000. We are skeptical and test this by taking a random sample of 100 new UNLV graduates. We find that the average is only $55,000. Do we declare the President a liar?

Page 17: Review

Not Yet!!!!

We need to make a probabilistic statement regarding the likelihood of the President’s statement. How do we do that?

With the aid of the standard error of the mean we can calculate confidence intervals: the range of mean values within which our true population mean is likely to fall.

Page 18: Review

How do we do that?

• First, we need the sample mean

• Second, we need the standard deviation of the sampling distribution of means (what’s another name for this?)

• a.k.a standard error of the mean

Page 19: Review

What’s the Problem?

• The problem is…

• We don’t have the standard deviation of the sampling distribution of means.

• What do we do?

Page 20: Review

First – let’s pretend

• Let’s pretend that I know the standard deviation of the sampling distribution of means (a.k.a. the standard error of the mean). It’s $3,000.

• For a 95% confidence interval, we multiply the standard error of the mean by 1.96, then add that product to and subtract it from our sample mean.

• Why 1.96?

• What’s the range?

Page 21: Review

So is President Ashley Lying?

CI = Mean +/- 1.96 (SE)

= 55,000 +/- 1.96 (3000)

= 55,000 +/- 5880 = $49,120 to $60,880
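The calculation above can be reproduced in a few lines of Python. This is a sketch using the pretend standard error of 3,000 from the slides; the function name is illustrative, and `statistics.NormalDist` (Python 3.8+) is used only to show where 1.96 comes from.

```python
from statistics import NormalDist


def confidence_interval(mean, standard_error, critical_value=1.96):
    """Return (low, high): the mean plus and minus critical_value * standard_error."""
    margin = critical_value * standard_error
    return mean - margin, mean + margin


# 95% of the normal curve lies within about 1.96 SDs of the mean,
# leaving 2.5% in each tail, so the cut-off is the 97.5th percentile.
z_95 = NormalDist().inv_cdf(0.975)
print(round(z_95, 2))  # 1.96

low, high = confidence_interval(55_000, 3_000)
print(round(low), round(high))  # 49120 60880

# Does the claimed population mean fall inside the interval?
claimed = 60_000
print(low <= claimed <= high)  # True
```

Because the claimed $60,000 lies inside the interval, this sample by itself does not contradict the claim at the 95% level.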

Page 22: Review

Let’s stop pretending

• We can estimate the standard error of the mean.

– Divide the standard deviation of the sample by √(N-1)

• Multiply this estimate by t rather than 1.96, then add this product to and subtract it from our sample mean.

• Why t?
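The estimate described on this slide can be sketched as follows. The salary figures are hypothetical, and the sketch assumes the sample standard deviation is computed with N in the denominator (as in the variance definition earlier), which is what `statistics.pstdev` does.

```python
import math
import statistics


def estimated_standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(N - 1),
    where s is the sample SD computed with N in the denominator."""
    n = len(sample)
    s = statistics.pstdev(sample)
    return s / math.sqrt(n - 1)


# Hypothetical starting salaries, in thousands of dollars.
sample = [52, 48, 61, 55, 59, 45, 65, 55]
se = estimated_standard_error(sample)

# Equivalent route: SD computed with N - 1 in the denominator, divided by sqrt(N).
alternative = statistics.stdev(sample) / math.sqrt(len(sample))

print(round(se, 3), round(alternative, 3))
```

Both routes give the same number, so it does not matter whether the N-1 correction is applied when computing the standard deviation or when dividing to get the standard error, as long as it is applied exactly once.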

Page 23: Review

The t Distribution

• Empirical testing and models show that a standard deviation from a sample underestimates the standard deviation of the true population

• This is why we use N-1 not N when calculating the standard deviation and the standard error

• So in reality, we are calculating t-scores, not z-scores since we are not using the true sd.

Page 24: Review

• So when we are using a sample and calculating a 95% confidence interval (CI) we need to multiply the standard error by t, not 1.96

• How do we know what t is?

• Table in back of book

• Df = N - 1
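A sketch of the t-based interval for the salary example follows. Two inputs are assumptions rather than values from the slides: the sample standard deviation of $30,000 is hypothetical, and the t value of 1.984 is the two-tailed .05 table value for df = 99 (a printed table may only list a nearby df).

```python
import math

n = 100                # sample size from the salary example
sample_mean = 55_000   # sample mean from the salary example
sample_sd = 30_000     # hypothetical sample SD (not given in the slides)

df = n - 1             # degrees of freedom
t_critical = 1.984     # assumed table value for df = 99, 95% confidence

# Estimated standard error of the mean: s / sqrt(N - 1).
standard_error = sample_sd / math.sqrt(n - 1)
margin = t_critical * standard_error

low, high = sample_mean - margin, sample_mean + margin
print(df, round(standard_error, 2), round(low), round(high))
```

Because t is slightly larger than 1.96, the interval is a little wider than the z-based interval would be for the same standard error; the gap shrinks as N grows.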

Page 25: Review

Confidence Intervals for Proportions

Calculate the standard error of the proportion:

Sp = √( P(1-P) / N )

95% conf. interval = P +/- (1.96)Sp

Page 26: Review

Example

• National sample of 531 Democrats and Democratic-leaning independents, aged 18 and older, conducted Sept. 14-16, 2007

• Clinton 47%; Obama 25%; Edwards 11%

• P(1-P) = .47(1-.47) = .47(.53) = .2491

• Divide by N = .2491/531 = .000469

• Take square root = .0217

• 95% CI = .47 +/- 1.96 (.0217)

• .47 +/- .0425, or .4275 to .5125
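The steps of this example can be sketched in Python using the poll figures from the slide (P = .47, N = 531). The function name is illustrative.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Return (standard error, low, high) for a confidence interval around a proportion."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    margin = z * se
    return se, p - margin, p + margin


se, low, high = proportion_ci(0.47, 531)
print(round(se, 4))                    # 0.0217
print(round(low, 2), round(high, 2))   # 0.43 0.51
```

The margin works out to a little over four percentage points, so the interval runs from about 43% to about 51%.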

Page 27: Review

Midterm

• Key terms from Schutt chapters 1-5

• Statistical calculations by hand

– Mean, median, mode

– Variance/standard deviation

– Z-scores

– Standard errors and confidence intervals using z or t