basic stat review
![Page 1: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/1.jpg)
Level of Measurement: Nominal, Ordinal, Ratio, Interval
Frequency tables to distributions
Central Tendency: Mode, Median, Mean
Dispersion: Variance, Standard Deviation
![Page 2: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/2.jpg)
Nominal – categorical data; numbers represent labels or categories of data but have no quantitative value (e.g., demographic data)
Ordinal – ranked/ordered data; numbers rank or order people or objects based on the attribute being measured
Interval – points on a scale are an equal distance apart; the scale does not contain an absolute zero point
Ratio – points on a scale are an equal distance apart and there is an absolute zero point
![Page 3: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/3.jpg)
![Page 4: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/4.jpg)
Simple depiction of all the data
Graphic: easy to understand
Problems: not always precisely measured; not summarized in one number or datum
![Page 5: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/5.jpg)
Observation   Frequency
65            1
70            2
75            3
80            4
85            3
90            2
95            1
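As a minimal sketch of how a frequency table like the one above can be built, the snippet below counts occurrences with Python's `collections.Counter`. The raw `scores` list is an assumption: it is simply expanded from the tabulated frequencies, since the slides do not show the original observations.

```python
from collections import Counter

# Raw test scores, expanded from the frequency table (assumed; the
# slides only give the tabulated counts, not the raw observations).
scores = [65, 70, 70, 75, 75, 75, 80, 80, 80, 80, 85, 85, 85, 90, 90, 95]

# Count how often each distinct score occurs.
frequency = Counter(scores)

print("Observation  Frequency")
for observation in sorted(frequency):
    print(f"{observation:<12} {frequency[observation]}")
```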
![Page 6: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/6.jpg)
[Figure: frequency distribution plot; x-axis: Test Score (65 to 95), y-axis: Frequency (1 to 4)]
![Page 7: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/7.jpg)
Two key characteristics of a frequency distribution are especially important when summarizing data or when making a prediction from one set of results to another:
Central Tendency: What is in the “middle”? What is most common? What would we use to predict?
Dispersion: How spread out is the distribution? What shape is it?
![Page 8: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/8.jpg)
Three measures of central tendency are commonly used in statistical analysis: the mode, the median, and the mean.
Each measure is designed to represent a typical score.
The choice of which measure to use depends on:
the shape of the distribution (whether normal or skewed), and
the variable’s “level of measurement” (whether the data are nominal, ordinal, or interval/ratio).
![Page 9: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/9.jpg)
Nominal variables: Mode
Ordinal variables: Median
Ratio/Interval level variables: Mean
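The pairing above can be sketched as a small lookup that picks a measure from the level of measurement. This is only an illustration: the `MEASURE_BY_LEVEL` table, the `typical_value` helper, and the sample data are hypothetical names and values, not part of the slides.

```python
import statistics

# Hypothetical lookup: level of measurement -> measure of central tendency.
MEASURE_BY_LEVEL = {
    "nominal": statistics.mode,
    "ordinal": statistics.median,
    "interval": statistics.mean,
    "ratio": statistics.mean,
}

def typical_value(values, level):
    """Return the typical score using the measure suited to the level."""
    return MEASURE_BY_LEVEL[level](values)

# Made-up example data for each level.
gender = ["female", "male", "female"]   # nominal
ranks = [1, 2, 2, 3, 5]                 # ordinal
scores = [3, 5, 10, 4, 3]               # interval/ratio

print(typical_value(gender, "nominal"))
print(typical_value(ranks, "ordinal"))
print(typical_value(scores, "interval"))
```

The shape of the distribution also matters, so a lookup like this covers only the level-of-measurement half of the choice.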
![Page 10: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/10.jpg)
Mode: the most common outcome
[Figure: chart comparing the categories Male and Female]
![Page 11: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/11.jpg)
Middle-most value
50% of observations are above the median, 50% are below it
The difference in magnitude between the observations does not matter
Therefore, it is not sensitive to outliers
![Page 12: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/12.jpg)
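First, rank order the values of X from low to high: 85, 94, 94, 96, 96, 96, 96, 97, 97, 98
Then count the number of observations (N = 10) and divide by 2 to locate the middle of the ordering.
With an even number of observations, the median is the average of the two middle scores, the 5th and the 6th. Here both are 96, so 96 is the median.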
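The ranking procedure can be sketched in a few lines of Python. With an even number of observations, the usual convention is to average the two middle values; in this data set both are 96. The unordered `scores` list is an assumed starting point containing the same ten values.

```python
import statistics

# The same ten values as in the example, in an arbitrary (assumed) order.
scores = [96, 85, 97, 94, 98, 96, 94, 97, 96, 96]

# Step 1: rank order the values from low to high.
ordered = sorted(scores)

# Step 2: locate the middle of the ordering.
n = len(ordered)
if n % 2 == 1:
    # Odd count: a single middle value.
    median = ordered[n // 2]
else:
    # Even count: average the two middle values (the 5th and 6th here).
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(ordered)
print(median)

# Cross-check against the standard library.
assert median == statistics.median(scores)
```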
![Page 13: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/13.jpg)
Most common measure of central tendency
Best for making predictions
Applicable under two conditions:
1. scores are measured at the interval level, and
2. distribution is more or less normal [symmetrical].
Symbolized as:
$\bar{X}$ for the mean of a sample
μ for the mean of a population
![Page 14: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/14.jpg)
$\bar{X} = (\sum X) / N$
If X = {3, 5, 10, 4, 3}:
$\bar{X} = (3 + 5 + 10 + 4 + 3) / 5 = 25 / 5 = 5$
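As a quick check of the arithmetic, the mean can be computed directly as the sum divided by the count. The helper name `mean` is chosen here for illustration only.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)

x = [3, 5, 10, 4, 3]
x_bar = mean(x)
print(x_bar)  # 5.0
```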
![Page 15: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/15.jpg)
IF THE DISTRIBUTION IS NORMAL
The mean is the best measure of central tendency.
Most scores are “bunched up” in the middle.
Extreme scores are less frequent and don’t move the mean around.
But central tendency doesn’t tell us everything!
![Page 16: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/16.jpg)
Dispersion/Deviation/Spread tells us a lot about how a variable is distributed.
We are most interested in the Standard Deviation (σ) and the Variance (σ²)
![Page 17: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/17.jpg)
Consider these means for weekly candy bar consumption.
X = {7, 8, 6, 7, 7, 6, 8, 7}
$\bar{X}$ = (7+8+6+7+7+6+8+7)/8
$\bar{X}$ = 7
X = {12, 2, 0, 14, 10, 9, 5, 4}
$\bar{X}$ = (12+2+0+14+10+9+5+4)/8
$\bar{X}$ = 7
What is the difference?
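One way to see the difference is to compare how far the values spread in each data set. The sketch below uses the range (maximum minus minimum) as a simple first look; the variable names `steady` and `varied` are labels made up for this illustration.

```python
import statistics

# Weekly candy bar consumption for the two groups in the example.
steady = [7, 8, 6, 7, 7, 6, 8, 7]
varied = [12, 2, 0, 14, 10, 9, 5, 4]

for name, data in (("steady", steady), ("varied", varied)):
    mean = statistics.mean(data)
    spread = max(data) - min(data)  # range: a crude measure of dispersion
    print(f"{name}: mean = {mean}, min = {min(data)}, max = {max(data)}, range = {spread}")
```

Both groups have the same mean, but the second set is spread over a much wider range, so the mean describes a typical value far less well for it.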
![Page 18: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/18.jpg)
Once you determine that the variable of interest is normally distributed, ideally by producing a histogram of the scores, the next question to be asked is its dispersion: how spread out are the scores around the mean?
Dispersion is a key concept in statistical thinking.
The basic question being asked is: how much do the scores deviate around the mean?
The more “bunched up” the scores are around the mean, the better the mean represents them.
![Page 19: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/19.jpg)
![Page 20: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/20.jpg)
If every X were very close to the mean, the mean would be a very good predictor.
If the distribution is very sharply peaked, then the mean is a good measure of central tendency, and if you were to use the mean to make predictions, you would be right or close much of the time.
![Page 21: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/21.jpg)
The key concept for describing normal distributions and making predictions from them is called deviation from the mean.
We could just calculate the average distance between each observation and the mean.
We must take the absolute value of each distance; otherwise the positive and negative deviations would just cancel out to zero!
Formula: $\frac{\sum |X_i - \bar{X}|}{n}$
![Page 22: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/22.jpg)
1. Compute $\bar{X}$ (the average).
2. Compute $\bar{X} - X_i$ and take the absolute value to get the absolute deviations.
3. Sum the absolute deviations.
4. Divide the sum of the absolute deviations by N.

Data: X = {6, 10, 5, 4, 9, 8}; $\bar{X}$ = 42 / 6 = 7

$\bar{X} - X_i$   Abs. Dev.
7 – 6             1
7 – 10            3
7 – 5             2
7 – 4             3
7 – 9             2
7 – 8             1
Total:            12

12 / 6 = 2
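The four steps can be sketched as a short Python function. The function name `mean_absolute_deviation` is chosen here for illustration; the data are the six values from the example.

```python
def mean_absolute_deviation(values):
    """Average absolute distance between each observation and the mean."""
    n = len(values)
    # Step 1: compute the mean.
    x_bar = sum(values) / n
    # Step 2: absolute deviation of each observation from the mean.
    abs_devs = [abs(x - x_bar) for x in values]
    # Steps 3 and 4: sum the absolute deviations and divide by N.
    return sum(abs_devs) / n

data = [6, 10, 5, 4, 9, 8]
mad = mean_absolute_deviation(data)
print(mad)  # 2.0
```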
![Page 23: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/23.jpg)
Instead of taking the absolute value, we square the deviations from the mean. This yields a positive value.
This will result in measures we call the Variance and the Standard Deviation
Sample: s = Standard Deviation; s² = Variance
Population: σ = Standard Deviation; σ² = Variance
![Page 24: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/24.jpg)
Formulae:
Variance: $s^2 = \frac{\sum (X_i - \bar{X})^2}{N}$
Standard Deviation: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}$
![Page 25: Basic stat review](https://reader035.vdocuments.net/reader035/viewer/2022062405/55799207d8b42ae72b8b4c6b/html5/thumbnails/25.jpg)
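Data: X = {6, 10, 5, 4, 9, 8}; N = 6

Mean: $\bar{X} = \frac{\sum X}{N} = \frac{42}{6} = 7$

$X$         $X - \bar{X}$   $(X - \bar{X})^2$
6           -1              1
10           3              9
5           -2              4
4           -3              9
9            2              4
8            1              1
Total: 42                   Total: 28

Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{N} = \frac{28}{6} = 4.67$
Standard Deviation: $s = \sqrt{s^2} = \sqrt{4.67} = 2.16$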
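The worked example can be reproduced with a short sketch that follows the same steps: compute the mean, square each deviation, average the squares, then take the square root. It divides by N, as the slides do; many textbooks divide by N - 1 when estimating from a sample, which would give a slightly larger result. The cross-check against `statistics.pvariance` and `statistics.pstdev` is an addition for verification, not part of the slides.

```python
import math
import statistics

data = [6, 10, 5, 4, 9, 8]
n = len(data)

# Mean: 42 / 6 = 7
x_bar = sum(data) / n

# Squared deviations from the mean: 1, 9, 4, 9, 4, 1 (total 28)
squared_devs = [(x - x_bar) ** 2 for x in data]

# Variance: sum of squared deviations divided by N (as in the slides).
variance = sum(squared_devs) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 2), round(std_dev, 2))  # 4.67 2.16

# Cross-check against the standard library's divide-by-N versions.
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))
```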