review
Review
• Measures of Central Tendency – Mean, median, mode
• Measures of Variation – Variance, standard deviation
Variance is defined as
$$s^2 = \frac{\sum (X - \bar{X})^2}{N}$$
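The definition can be checked by hand or with a short script. The following is a minimal Python sketch, assuming a small made-up list of scores (the data and the function names are illustrative, not from the slides). It divides by N, as in the definition above.

```python
import math


def variance(scores):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def standard_deviation(scores):
    """The standard deviation is the square root of the variance."""
    return math.sqrt(variance(scores))


# Hypothetical raw scores, chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(scores))            # 4.0
print(standard_deviation(scores))  # 2.0
```

Here the mean is 5, the squared deviations sum to 32, and 32 / 8 = 4.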
The Normal Curve
• The mean and standard deviation, in conjunction with the normal curve, allow for a more sophisticated description of the data and (as we will see later) statistical analysis
• For example, a school is not that interested in your raw GRE score; it is interested in how you scored relative to others.
• Even if the school knows the average (mean) GRE score, your raw score still doesn’t tell them much, since in a perfectly normal distribution, 50% of people will score higher than the mean.
• This is where the standard deviation is so helpful. It helps interpret raw scores and understand the likelihood of a score.
• So suppose I told you I scored 710 on the quantitative section and the mean score is 591. Is that good?
• It’s above average, but who cares?
• What if I tell you the standard deviation is 148?
• What does that mean?
• What if I said the standard deviation is 5?
• Calculating z-scores
Converting raw scores to z scores
What is a z score? What does it represent?
z = (X − µ) / σ
Converting z scores into raw scores
X = zσ + µ
z = (710 − 563) / 140 = 147 / 140 = 1.05
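The two conversions can be sketched in Python as follows. The function names are illustrative, and the numbers are the ones used in the worked line above (raw score 710, mean 563, standard deviation 140).

```python
def z_score(x, mu, sigma):
    """Convert a raw score to a z score: how many SDs x lies from the mean."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Convert a z score back to a raw score."""
    return z * sigma + mu


z = z_score(710, 563, 140)
print(round(z, 2))              # 1.05
print(raw_score(z, 563, 140))   # recovers the raw score of 710 (up to float rounding)
```

A z score of 1.05 means the raw score sits a little more than one standard deviation above the mean.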
Finding Probabilities under the Normal Curve
So what % of GRE takers scored above and below 710?
The importance of Table A
Why is this important? Inferential Statistics (to be cont.)
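Table A is the usual tool for this in class. As a cross-check, the area under the standard normal curve can also be computed directly. The sketch below uses Python's standard `math.erf`. The function name `normal_cdf` is illustrative, and z = 1.05 is taken from the worked z score for a raw score of 710.

```python
import math


def normal_cdf(z):
    """Proportion of a standard normal distribution falling below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = 1.05                      # z score for a raw score of 710
below = normal_cdf(z)         # share of test takers scoring below 710
above = 1.0 - below           # share scoring above 710

print(f"below: {below:.3f}, above: {above:.3f}")
```

Under the normal-curve assumption, roughly 85% of scores fall below z = 1.05 and roughly 15% fall above it.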
Stuff you don’t need to know:
pi ≈ 3.14159265
e ≈ 2.71828
The Normal Curve and Sampling
A. A sample will (almost) always be different from the true population
B. This is called “sampling error”
C. It is the difference between a sample and the true population that arises regardless of how well the survey was designed or implemented
D. It is different from measurement error or sample bias
Sampling distribution of Means
• The existence of sampling error means that if you take 1,000 random samples from a population, calculate the 1,000 sample means, and plot the distribution of those means, you will get a consistent distribution that has the following characteristics:
Characteristics of a Sampling Distribution
• 1. The distribution approximates a normal curve
• 2. The mean of a sampling distribution of means is equal to the true population mean
• 3. The standard deviation of a sampling distribution is smaller than the standard deviation of the population. There is less variation in the distribution because we are dealing not with raw scores but with central tendencies.
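These characteristics can be seen in a small simulation. The sketch below is only an illustration under assumed values: the population (10,000 scores with mean near 500 and SD near 100), the sample size of 30, and the fixed random seed are all hypothetical choices, not figures from the slides.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 raw scores.
population = [rng.gauss(500, 100) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# Draw 1,000 random samples and record the mean of each one.
sample_size = 30
sample_means = [
    statistics.fmean(rng.sample(population, sample_size))
    for _ in range(1000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)  # the standard error of the mean

print(f"population mean {pop_mean:.1f}, mean of sample means {mean_of_means:.1f}")
print(f"population SD {pop_sd:.1f}, SD of sample means {sd_of_means:.1f}")
```

The mean of the sample means lands very close to the population mean, while the SD of the sample means is far smaller than the population SD (close to the population SD divided by √30).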
Why is the normal curve so important?
• If we define probability in terms of the likelihood of occurrence, then the normal curve can be regarded as a probability distribution (the probability of occurrence decreases as we move away from the center).
• With this notion, we can find the probability of obtaining a raw score in a distribution, given a certain mean and SD.
Probability and the Normal Curve
In chapter 6, we are interested not in the distribution of raw scores but in the distribution of sample means, and in making probability statements about those sample means.
Probability and the Sampling Distribution
Why is making probabilistic statements about a central tendency important?
• 1. it will allow us to engage in inferential statistics (later in ch. 7)
• 2. it allows us to produce confidence intervals
Example of number 1:
• The President of UNLV states that the average salary of a new UNLV graduate is $60,000. We are skeptical and test this by taking a random sample of 100 UNLV students. We find that the average is only $55,000. Do we declare the President a liar?
Not Yet!!!!
We need to make a probabilistic statement regarding the likelihood of the President’s statement. How do we do that?
With the aid of the standard error of the mean, we can calculate confidence intervals: the range of mean values within which our true population mean is likely to fall.
How do we do that?
• First, we need the sample mean
• Second, we need the standard deviation of the sampling distribution of means (what’s another name for this?)
• a.k.a. the standard error of the mean
What’s the Problem?
• The problem is…
• We don’t have the standard deviation of the sampling distribution of means.
• What do we do?
First – let’s pretend
• Let’s pretend that I know the standard deviation of the sampling distribution of means (a.k.a. the standard error of the mean). It’s 3,000.
• For a 95% confidence interval, we multiply the standard error of the mean by 1.96, then add that product to, and subtract it from, our sample mean
• Why 1.96?
• What’s the range?
So is President Ashley Lying?
CI = Mean +/- 1.96 (SE)
= 55,000 +/- 1.96 (3,000)
= 55,000 +/- 5,880 = $49,120 to $60,880
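The same calculation as a minimal Python sketch. The function name `confidence_interval` is illustrative; the inputs are the slide's sample mean of $55,000, the pretend standard error of 3,000, and the claimed mean of $60,000.

```python
def confidence_interval(sample_mean, standard_error, critical_value=1.96):
    """Return (lower, upper) bounds: mean -/+ critical value times SE."""
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin


low, high = confidence_interval(55_000, 3_000)
claimed = 60_000  # the President's claimed average salary

print(round(low), round(high))   # 49120 60880
print(low <= claimed <= high)    # True
```

Because $60,000 falls inside the 95% interval, this sample alone does not give us grounds to call the claim false.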
Let’s stop pretending
• We can estimate the standard error of the mean.
– Divide the standard deviation of the sample by √(N − 1)
• Multiply this estimate by t rather than 1.96, and then add this product to, and subtract it from, our sample mean.
• Why t?
The t Distribution
• Empirical testing and models show that the standard deviation of a sample underestimates the standard deviation of the true population
• This is why we use N − 1, not N, when calculating the standard deviation and the standard error
• So in reality, we are calculating t-scores, not z-scores, since we are not using the true standard deviation.
• So when we are using a sample and calculating a 95% confidence interval (CI) we need to multiply the standard error by t, not 1.96
• How do we know what t is?
• Table in back of book
• Df = N - 1
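Putting the steps together, here is a hedged Python sketch of a 95% confidence interval using t. The ten salary values are hypothetical (in thousands of dollars) and are not from the slides. The critical value 2.262 is the two-tailed 95% value for df = 9 from a standard t table. The standard error is estimated the way the slides describe: the sample standard deviation (computed with N) divided by √(N − 1).

```python
import math

# Hypothetical sample of 10 starting salaries, in thousands of dollars.
salaries = [48, 52, 55, 57, 58, 60, 61, 63, 66, 70]

n = len(salaries)
mean = sum(salaries) / n

# Sample standard deviation, dividing the sum of squares by N.
sd = math.sqrt(sum((x - mean) ** 2 for x in salaries) / n)

# Estimated standard error of the mean: s / sqrt(N - 1).
se = sd / math.sqrt(n - 1)

df = n - 1        # degrees of freedom
t_crit = 2.262    # from a t table: df = 9, 95% confidence, two-tailed

low = mean - t_crit * se
high = mean + t_crit * se
print(f"mean {mean:.1f}, SE {se:.2f}, 95% CI {low:.1f} to {high:.1f}")
```

With a sample this small, t (2.262) is noticeably larger than 1.96, so the interval is wider than a z-based one. As N grows, t approaches 1.96.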
Confidence Intervals for Proportions
Calculate the standard error of the proportion:
$$S_p = \sqrt{\frac{P(1 - P)}{N}}$$
95% confidence interval:
$$P \pm 1.96\, S_p$$
Example
• National sample of 531 Democrats and Democratic-leaning independents, aged 18 and older, conducted Sept. 14-16, 2007
• Clinton 47%; Obama 25%; Edwards 11%
• P(1-P) = .47(1-.47) = .47(.53) = .2491
• Divide by N = .2491/531 = .000469
• Take square root = .0217
• 95% CI = .47 +/- 1.96 (.0217)
• .47 +/- .0425, or .4275 to .5125
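The steps of this example can be reproduced with a short Python sketch. The function name `proportion_ci` is illustrative; the inputs are the poll's P = .47 and N = 531. The script carries the unrounded standard error through the calculation, so its last digit can differ slightly from a hand calculation that rounds at each step.

```python
import math


def proportion_ci(p, n, critical_value=1.96):
    """Return (standard error, lower bound, upper bound) for a proportion."""
    se = math.sqrt(p * (1 - p) / n)   # sqrt of P(1 - P) / N
    margin = critical_value * se
    return se, p - margin, p + margin


se, low, high = proportion_ci(0.47, 531)

print(f"{se:.4f}")                    # 0.0217
print(f"{low:.3f} to {high:.3f}")
```

The margin of error is about 4 percentage points on either side of 47%.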
Midterm
• Key terms from Schutt chapters 1-5
• Statistical Calculations by hand
– Mean, Median, Mode
– Variance/Standard Deviation
– Z-scores
– Standard errors and confidence intervals using z or t