BASIC NOTATION
Summation (Σ)
$X_i$ = the number of meals I have on day "i"
X = 1, 2, 3, 2, 1
$\sum X_i = 1 + 2 + 3 + 2 + 1 = 9$
$\sum X_i^2 = 1 + 4 + 9 + 4 + 1 = 19$
$(\sum X_i)^2 = 9^2 = 81$
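The three sums can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Meals eaten on each of five days (the X values from the slide).
x = [1, 2, 3, 2, 1]

sum_x = sum(x)                            # add the scores
sum_of_squares = sum(v ** 2 for v in x)   # square each score, then add
square_of_sum = sum(x) ** 2               # add the scores, then square

print(sum_x, sum_of_squares, square_of_sum)  # 9 19 81
```

The order of operations matters: squaring before summing (19) is not the same as summing before squaring (81).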
Qualitative Variables
Nominal
• Political affiliation
  • Republican
  • Democrat
  • Independent
• Gender
  • Female
  • Male
Quantitative Variables
Ordinal
• Categories have relative value/order
• Example
  • Very depressed
  • Depressed
  • Slightly depressed
  • Not depressed
Interval
• Categories have relative value/order
• Difference in measurement = difference in characteristic
• Example
  • Temperature in Fahrenheit: 83, 84, 85, … (difference from 83 to 84 = difference from 84 to 85)
Ratio
• Categories have relative value/order
• Difference in measurement = difference in characteristic
• True zero (0) point exists
• Example
  • Temperature in Kelvin: 0, 1, 2, … 343, 345, 346, …
  • Height: 0 inches, 1 inch, … 86 inches (Shaq)
Statistical Analyses
Design                  | Interval/Ratio          | Nominal
One sample              | One-sample z or t test  | Goodness-of-fit χ²
Two independent samples | Independent t test      | χ² test of independence
Two dependent samples   | Dependent t test        | McNemar test for significance of change
K independent samples   | ANOVA                   | χ² test of independence
Correlation             | Pearson r               |
Prediction              | Regression              |
Frequency Distributions
Tables
Ungrouped (list of scores)
Grouped (grouped by ranges)
Graphs
histograms
frequency polygons
Table Distributions
The variable: Time (in minutes) between getting out of bed this morning and eating your first bite of food.
Time (min), ungrouped: (6, 28, 27, 7, 7, 24, 39, 55, 13, 17, 13, 13, 3, 23, 18, 37, 2, 8, 11, 18, 22, 2, 21, 31, 12)
Bad Grouped Frequency Distribution (the first interval, 0-10, is wider than the others)
X     | f
0-10  | 7
11-20 | 8
21-30 | 6
31-40 | 3
41-50 | 0
51-60 | 1
Total | 25
Good Grouped Frequency Distribution
X     | f
1-10  | 7
11-20 | 8
21-30 | 6
31-40 | 3
41-50 | 0
51-60 | 1
Total | 25
Grouped Frequency Distribution (highest interval first)
X     | f
51-60 | 1
41-50 | 0
31-40 | 3
21-30 | 6
11-20 | 8
1-10  | 7
Total | 25
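The grouped table can be rebuilt from the ungrouped scores. The sketch below assumes equal-width intervals of 10 starting at 1; the function name `grouped_frequencies` and its parameters are hypothetical, chosen only for this illustration.

```python
from collections import Counter

# Ungrouped times (in minutes) from the slide.
times = [6, 28, 27, 7, 7, 24, 39, 55, 13, 17, 13, 13, 3,
         23, 18, 37, 2, 8, 11, 18, 22, 2, 21, 31, 12]

def grouped_frequencies(scores, low=1, width=10, high=60):
    """Count scores in equal-width intervals: low..low+width-1, and so on."""
    # Interval index of each score: 0 for 1-10, 1 for 11-20, ...
    counts = Counter((s - low) // width for s in scores)
    table = []
    for k, start in enumerate(range(low, high + 1, width)):
        table.append(((start, start + width - 1), counts.get(k, 0)))
    return table

table = grouped_frequencies(times)

# Print with the highest interval first, as in the last table on the slide.
for (lower, upper), f in reversed(table):
    print(f"{lower}-{upper}\t{f}")
print("Total\t", sum(f for _, f in table))
```

Because every interval has the same width, the frequencies are directly comparable, which is the property the "bad" table lacks.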
Distribution Characteristics
Modality - Peaks
Symmetry – Mirror Reflection
Asymptoticness – Extreme Values on both Sides
Terms that Describe Distributions
Term                | Features
"Symmetric"         | left side is mirror image of right side
"Positively skewed" | right tail is longer than the left
"Negatively skewed" | left tail is longer than the right
"Unimodal"          | one highest point
"Bimodal"           | two high points
"Normal"            | unimodal, symmetric, asymptotic
Normal Distributions
USA
Unimodal
Symmetric
Asymptotic
Inflection points
Where curve changes from convex to concave or concave to convex
Located 1 standard deviation above and below the mean
CENTRAL TENDENCY: WHAT IS A TYPICAL SCORE LIKE?
Mode: Most common value; number of peaks; always an observed value
Median: Middle of distribution; not affected much by outliers
Mean: Average; greatly affected by outliers
CENTRAL TENDENCY Modes
- Most common score(s)
1,2,2,2,3,4,5,6,7  Unimodal  Mode = 2
1,3,3,4,4,5,6,7,8  Bimodal   Modes = 3, 4
1,3,3,4,4,5,6,6,8  Trimodal  Modes = 3, 4, 6
1,2,3,4,5,6,7,8,9  Amodal
Modes in Populations
- Unimodal
- Bimodal
- Trimodal
- Amodal ?
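Finding the mode(s) of the four example data sets can be sketched as follows. The helper `modes` is hypothetical, and it assumes that a data set in which every distinct score occurs equally often counts as amodal, matching the last example.

```python
from collections import Counter

def modes(scores):
    """Return the most common score(s); an empty list means amodal."""
    counts = Counter(scores)
    highest = max(counts.values())
    # Assumption: if every distinct score occurs equally often (and there is
    # more than one distinct score), the data set is treated as amodal.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 2, 3, 4, 5, 6, 7]))  # [2]
print(modes([1, 3, 3, 4, 4, 5, 6, 7, 8]))  # [3, 4]
print(modes([1, 3, 3, 4, 4, 5, 6, 6, 8]))  # [3, 4, 6]
print(modes([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # []
```

Every value returned is one of the observed scores, which is why the mode is always an observed value.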
CENTRAL TENDENCY Medians
- Middle score in distribution
- Odd number of scores
5-point data set: 2,3,5,9,12  Median = 5
9-point data set: 1,2,5,5,7,9,500,700,999  Median = ?
- Even number of scores
4-point data set: 3,5,8,9  Median = (5+8)/2 = 6.5
10-point data set: 1,2,5,5,7,9,500,700,999,1122  Median = ?
Medians in Populations
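The median examples, including the two left as questions, can be worked through with Python's standard `statistics` module. This is a minimal sketch; the list names are illustrative. The mean of the 9-point data set is included to show how differently the two measures react to outliers.

```python
from statistics import mean, median

odd_small = [2, 3, 5, 9, 12]
odd_outliers = [1, 2, 5, 5, 7, 9, 500, 700, 999]
even_small = [3, 5, 8, 9]
even_outliers = [1, 2, 5, 5, 7, 9, 500, 700, 999, 1122]

# Odd number of scores: the median is the single middle score.
print(median(odd_small))      # 5
print(median(odd_outliers))   # 7

# Even number of scores: the median is the average of the two middle scores.
print(median(even_small))     # 6.5
print(median(even_outliers))  # 8.0

# The mean of the same data is pulled far upward by the three large scores.
print(mean(odd_outliers))     # about 247.6
```

With the 9-point data set the median stays at 7 while the mean is roughly 248, which illustrates why the median is not affected much by outliers and the mean is.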
CENTRAL TENDENCY Means
More modes, medians and means
Measure | Definition          | Level of measurement   | Disadvantage
Mode    | most frequent value | nom., ord., int./rat.  | Crude
Median  | middle value        | ord., int./rat.        | Only two points contribute
Mean    | balancing point     | int./rat.              | Affected by skew
The Spread of Distributions
- How different are scores from central tendency?
- Range
- Standard Deviation
Measure of Spread: RANGE
- Highest value – Lowest value
- Affected only by end points
Why ‘range’ is weak
- Data set 1: 1,1,1,50,99,99,99
- Data set 2: 1,50,50,50,50,50,99
- Both have range 99 – 1 = 98, yet the scores are spread very differently
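The weakness of the range can be seen by computing it for both data sets alongside a standard deviation. This is a minimal sketch; `value_range` is a hypothetical helper, and the population standard deviation (`statistics.pstdev`) is used only as one possible comparison.

```python
from statistics import pstdev

data_set_1 = [1, 1, 1, 50, 99, 99, 99]
data_set_2 = [1, 50, 50, 50, 50, 50, 99]

def value_range(scores):
    """Highest value minus lowest value."""
    return max(scores) - min(scores)

# The range only looks at the two end points, so it is identical.
print(value_range(data_set_1), value_range(data_set_2))  # 98 98

# The standard deviation uses every score, so it tells the sets apart.
print(round(pstdev(data_set_1), 1))  # 45.4
print(round(pstdev(data_set_2), 1))  # 26.2
```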
The Spread of Distributions
- How different are scores from central tendency?
- Deviations from the mean always sum to zero, by definition of the mean:
$\sum (X - \bar{X}) = 0$
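That the deviations from the mean sum to zero can be checked numerically. The sketch below reuses the meals data (1, 2, 3, 2, 1) from the notation slide as an assumed example and uses exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction

x = [1, 2, 3, 2, 1]

mean_x = Fraction(sum(x), len(x))      # exact mean, 9/5
deviations = [v - mean_x for v in x]   # each score minus the mean
total = sum(deviations)

print(total)  # 0
```

This is why the deviations are squared before being summed in the variance and standard deviation: without squaring, the positive and negative deviations cancel.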
Population Standard Deviation
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Sample Variance and Standard Deviation
Also known as "Estimated Population Standard Deviation"
$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
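Sample Variance and Standard Deviation
Why do we use N-1 for a sample? Because sample scores are closer to the sample mean than to the population mean, so dividing by N would underestimate the population variance.
Population 2, 4, 6, and 8: $\mu = (2+4+6+8)/4 = 5$
Scores 2 and 6, squared deviations from the population mean: $\sum (X - \mu)^2 = (2-5)^2 + (6-5)^2 = 9 + 1 = 10$
Scores 2 and 6, $\bar{X} = (2+6)/2 = 4$, squared deviations from the sample mean: $\sum (X - \bar{X})^2 = (2-4)^2 + (6-4)^2 = 4 + 4 = 8$
N-1 adjusts for bias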
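The bias can be demonstrated by brute force on the same population. The sketch below assumes samples of size 2 drawn with replacement (an assumption not stated on the slide) and averages the variance over all 16 possible samples, once dividing by N and once by N-1.

```python
from itertools import product
from statistics import pvariance, variance

population = [2, 4, 6, 8]
sigma_sq = pvariance(population)   # true population variance: 20 / 4 = 5

# Every possible sample of size 2, drawn with replacement (16 samples).
samples = list(product(population, repeat=2))

# pvariance divides by N; variance divides by N-1.
mean_divide_by_n = sum(pvariance(s) for s in samples) / len(samples)
mean_divide_by_n_minus_1 = sum(variance(s) for s in samples) / len(samples)

print(sigma_sq)                   # 5
print(mean_divide_by_n)           # 2.5
print(mean_divide_by_n_minus_1)   # 5.0
```

Under this sampling assumption, dividing by N gives 2.5 on average, half the true variance of 5, while dividing by N-1 averages out to exactly 5.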
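Sample Variance
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
SUM OF SQUARED DEVIATIONS: the numerator, $\sum (X - \bar{X})^2$
DEGREES OF FREEDOM: the denominator, $n - 1$
STANDARD DEVIATION: the square root of the variance, $S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$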
Differences Between Sample and Population Standard Deviation
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$ versus $S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
1) Sigma vs. S
2) Population mean versus sample mean
3) N vs. N-1
Super Important Relationship
Standard deviation is the square root of variance
$S = \sqrt{S^2}$: SAMPLE STANDARD DEVIATION = SQUARE ROOT OF THE SAMPLE VARIANCE
$\sigma = \sqrt{\sigma^2}$: POPULATION STANDARD DEVIATION = SQUARE ROOT OF THE POPULATION VARIANCE
Population Standard Deviation
Sample Standard Deviation
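Both standard deviations can be computed step by step from the sum of squared deviations. The sketch below reuses the meals data (1, 2, 3, 2, 1) as an assumed example; treating the same five scores first as a whole population and then as a sample is purely for illustration. The results are compared with Python's `statistics` module.

```python
import math
import statistics

x = [1, 2, 3, 2, 1]
n = len(x)
mean_x = sum(x) / n

# Sum of squared deviations from the mean (the shared numerator).
ss = sum((v - mean_x) ** 2 for v in x)

# Treating x as a whole population: divide by N.
pop_var = ss / n
pop_sd = math.sqrt(pop_var)

# Treating x as a sample: divide by n - 1 (degrees of freedom).
samp_var = ss / (n - 1)
samp_sd = math.sqrt(samp_var)

print(round(pop_var, 2), round(pop_sd, 3))    # 0.56 0.748
print(round(samp_var, 2), round(samp_sd, 3))  # 0.7 0.837

# The standard library gives the same values.
print(math.isclose(pop_sd, statistics.pstdev(x)))   # True
print(math.isclose(samp_sd, statistics.stdev(x)))   # True
```

The sample value is always a little larger than the population value for the same scores, because the same sum of squares is divided by a smaller number.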