Post on 22-Dec-2015
Data observation and Descriptive Statistics
Organizing Data
Frequency distribution: Table that contains all the scores along with the frequency (or number of times) each score occurs.
Relative frequency: Proportion of the total observations included in each score.
Frequency distribution
Amount f(frequency) rf(relative frequency)
$0.00 2 0.125
$0.13 1 0.0625
$0.93 1 0.0625
$1.00 1 0.0625
$10.00 1 0.0625
$32.00 1 0.0625
$45.53 1 0.0625
$56.00 1 0.0625
$60.00 1 0.0625
$63.25 1 0.0625
$74.93 1 0.0625
$80.00 1 0.0625
$85.28 1 0.0625
$115.35 1 0.0625
$120.00 1 0.0625
n=16 1.00
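The table above can be rebuilt from the raw amounts. The following is a minimal Python sketch, assuming the 16 amounts are exactly those listed in the table; the helper name `frequency_table` is hypothetical and not part of the original slides.

```python
from collections import Counter

# The 16 amounts from the frequency distribution table ($0.00 occurs twice).
amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]


def frequency_table(scores):
    """Return (score, f, rf) rows sorted by score.

    f is the number of times the score occurs; rf is f divided by n.
    """
    counts = Counter(scores)
    n = len(scores)
    return [(score, f, f / n) for score, f in sorted(counts.items())]


for score, f, rf in frequency_table(amounts):
    print(f"${score:.2f} {f} {rf:.4f}")
```

The relative frequencies sum to 1.00 because every observation is counted exactly once.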
Organizing data
Class interval frequency distribution: Scores are grouped into intervals and presented along with the frequency of scores in each interval.
Appears more organized, but does not show the exact scores within each interval.
To calculate the range or width of the interval: (Highest score – lowest score) / # of intervals
Ex: (120 – 0) / 5 = 24
Class interval frequency distribution
Class interval f (frequency) rf ( relative frequency)
$0-$24 6 .375
$25-$48 2 .125
$49-$73 3 .1875
$74-$98 3 .1875
$99-$124 2 .125
n = 16 1.00
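A short Python sketch of how the class interval table could be produced from the raw amounts. The interval bounds are copied from the table; the rule used to place fractional amounts (a score belongs to an interval when low <= score < high + 1) is an assumption, and `interval_table` is a hypothetical helper name.

```python
amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]

# Interval width: (highest score - lowest score) / number of intervals.
width = (max(amounts) - min(amounts)) / 5

# Bounds copied from the class interval table.
intervals = [(0, 24), (25, 48), (49, 73), (74, 98), (99, 124)]


def interval_table(scores, bounds):
    """Return (label, f, rf) rows, one per class interval."""
    n = len(scores)
    rows = []
    for low, high in bounds:
        # Assumed rule: fractional amounts up to the next lower bound
        # are counted in the current interval.
        f = sum(1 for x in scores if low <= x < high + 1)
        rows.append((f"${low}-${high}", f, f / n))
    return rows


print("width =", width)
for label, f, rf in interval_table(amounts, intervals):
    print(label, f, rf)
```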
Graphs
Bar graphs
Data that are collected on a nominal scale.
Qualitative variables or categorical variables.
Each bar represents a separate (discrete) category, and therefore the bars do not touch.
The bars on the x-axis can be placed in any order.
Bar Graph
Graphs
Histograms
Used to illustrate quantitative variables.
Scores represent changes in quantity.
Bars touch each other and represent a variable with increasing values.
The values of the variable being measured have a specific order that cannot be changed.
Histogram
Frequency polygon
Line graph for quantitative variables.
Represents continuous data (time, age, weight).
Frequency Polygon
Age: 22.06, 24.05, 25.04, 25.04, 25.07, 25.07, 26.03, 26.11, 27.03, 27.11, 29.03, 29.05, 29.05, 34, 37.1, 53
Descriptive Statistics
Numerical measures that describe:
Central tendency of distribution
Width of distribution
Shape of distribution
Central tendency
Describe the “middleness” of a data set:
Mean
Median
Mode
Mean
Arithmetic average
Used for interval and ratio data
Formula for population mean (µ, pronounced “mu”):
\[ \mu = \frac{\sum X}{N} \]
Formula for sample mean:
\[ \bar{X} = \frac{\sum X}{n} \]
Mean
Amount f(frequency) rf(relative frequency)
$0.00 2 0.125
$0.13 1 0.0625
$0.93 1 0.0625
$1.00 1 0.0625
$10.00 1 0.0625
$32.00 1 0.0625
$45.53 1 0.0625
$56.00 1 0.0625
$60.00 1 0.0625
$63.25 1 0.0625
$74.93 1 0.0625
$80.00 1 0.0625
$85.28 1 0.0625
$115.35 1 0.0625
$120.00 1 0.0625
Mean = $46.53 n=16 1.00
Mean
Not a good indicator of central tendency if the distribution has extreme scores (high or low).
High scores pull the mean higher.
Low scores pull the mean lower.
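As an illustration, the sample mean of the 16 amounts can be computed in Python, along with the effect of one extreme score. The extra $1,000.00 score is hypothetical and is not part of the original data; it is only there to show how a high score pulls the mean upward.

```python
from statistics import mean

amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]

# Sample mean: sum of X divided by n (about $46.53, as on the slide).
sample_mean = mean(amounts)

# Hypothetical extreme score, added only for illustration.
mean_with_extreme = mean(amounts + [1000.00])

print(f"mean = {sample_mean:.3f}")
print(f"mean with one extreme score = {mean_with_extreme:.3f}")
```

With the hypothetical extreme score included, the mean more than doubles even though 16 of the 17 scores are unchanged.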
Median
Middle score of a distribution once the scores are arranged in increasing or decreasing order.
Used when the mean might not be a good indicator of central tendency.
Used with ratio, interval and ordinal data.
Median
$0.00, $0.00, $0.13, $0.93, $1.00, $10.00, $32.00, $45.53, $56.00, $60.00, $63.25, $74.93, $80.00, $85.28, $115.35, $120.00
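A minimal Python sketch of finding the median of the amounts listed above. With an even number of scores (n = 16) there is no single middle score, so this sketch uses the common convention of averaging the two middle scores; that convention is an assumption, since the slides do not state it.

```python
def median(scores):
    """Middle score of the sorted data; mean of the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]

# The 8th and 9th sorted scores are $45.53 and $56.00.
print(f"median = {median(amounts):.3f}")
```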
Mode
The score that occurs in the distribution with the greatest frequency.
Number of modes = 0: no mode
Number of modes = 1: unimodal distribution
Number of modes = 2: bimodal distribution
Number of modes = 3: trimodal distribution
Mode
Amount f(frequency) rf(relative frequency)
$0.00 2 0.125
$0.13 1 0.0625
$0.93 1 0.0625
$1.00 1 0.0625
$10.00 1 0.0625
$32.00 1 0.0625
$45.53 1 0.0625
$56.00 1 0.0625
$60.00 1 0.0625
$63.25 1 0.0625
$74.93 1 0.0625
$80.00 1 0.0625
$85.28 1 0.0625
$115.35 1 0.0625
$120.00 1 0.0625
Mean = $46.53 n=16 1.00
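The mode can be read straight off the frequency column: it is the score with the largest f. A small Python sketch follows; the helper name `modes` is hypothetical, and treating a data set in which every score occurs once as having no mode is an assumed convention.

```python
from collections import Counter


def modes(scores):
    """Return the list of modes: the scores with the greatest frequency.

    Assumed convention: if every score occurs only once, there is no mode
    and an empty list is returned.
    """
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(score for score, f in counts.items() if f == top)


amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]

print(modes(amounts))          # one mode: unimodal
print(modes([1, 1, 2, 2, 3]))  # two modes: bimodal
```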
Measures of Variability
Range: From the lowest to the highest score
Variance: Average squared deviation from the mean
Standard deviation: Variation from the sample mean; the square root of the variance
Measures of Variability
Indicate the degree to which the scores are clustered or spread out in a distribution.
Ex: Two distributions of teacher to student ratio. Which college has more variation?
College A College B
4 16
12 19
41 22
Sum = 57 Sum = 57
Mean = 19 Mean = 19
Range
The difference between the highest and lowest scores.
Provides limited information about variation.
Influenced by high and low scores.
Does not inform about variations of scores not at the extremes.
Examples: Range = X(highest) – X(lowest)
College A: range = 41 – 4 = 37
College B: range = 22 – 16 = 6
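The two college examples can be checked with a few lines of Python. This is only a sketch; the function name `score_range` is hypothetical.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


college_a = [4, 12, 41]
college_b = [16, 19, 22]

# Both colleges have the same sum (57) and mean (19) but very different ranges.
print("College A range:", score_range(college_a))
print("College B range:", score_range(college_b))
```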
Variance
Limitations of range require a more precise way to measure variability.
Deviation: The degree to which the scores in a distribution vary from the mean.
Typical measure of variability: standard deviation (SD)
Variance: The first step in calculating standard deviation
Variance
X = Number of therapy sessions each student attended.
M = 4.2
“Deviation” = X – M
Sum of deviations = 0
Variance
In order to eliminate negative signs, we square the deviations.
Sum of the squared deviations = sum of squares, or SS
Variance
Take the average of the SS.
Ex: SS = 48.80
\[ SD^2 = \frac{\sum (X - M)^2}{N} \]
That is, the average of the squared deviations from the mean.
\[ SD^2 = 9.76 \]
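The variance steps (deviations, squared deviations, SS, average) can be traced in Python. This sketch assumes the therapy-session scores are X = 1, 2, 4, 4, 10, the raw scores given with M = 4.2 on the standard deviation slide.

```python
scores = [1, 2, 4, 4, 10]  # number of therapy sessions (assumed from the later slide)
n = len(scores)
m = sum(scores) / n  # M = 4.2

# Step 1: deviations from the mean; these always sum to zero.
deviations = [x - m for x in scores]

# Step 2: square the deviations to eliminate negative signs.
squared = [d ** 2 for d in deviations]

# Step 3: sum of squares (SS).
ss = sum(squared)

# Step 4: variance = average of the squared deviations.
variance = ss / n

print(f"M = {m:.1f}, sum of deviations = {sum(deviations):.2f}")
print(f"SS = {ss:.2f}, variance = {variance:.2f}")
```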
Standard Deviation
Standard deviation: Typical amount that the scores vary or deviate from the sample mean.
\[ SD = \sqrt{\frac{\sum (X - M)^2}{N}} \]
That is, the square root of the variance.
Since we take the square root, this value is back in the original units of the scores and is now more representative of the distribution of the scores.
Standard Deviation
X = 1, 2, 4, 4, 10
M = 4.2
SD = 3.12 (standard deviation)
SD² = 9.76 (variance)
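One way to run the sanity check suggested here is with Python's standard `statistics` module, whose `pvariance` and `pstdev` functions divide by N as in the formula above. This is a sketch for checking the slide's numbers, not part of the original material.

```python
from statistics import mean, pstdev, pvariance

scores = [1, 2, 4, 4, 10]

m = mean(scores)             # M
variance = pvariance(scores) # divides the sum of squares by N
sd = pstdev(scores)          # square root of the variance

print(f"M = {m:.1f}, SD^2 = {variance:.2f}, SD = {sd:.2f}")
```

A standard deviation of roughly 3 around a mean of 4.2 is plausible for these raw scores: most fall within a few sessions of the mean, with 10 as the one distant score.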
Always ask yourself: do these data (mean and SD) make sense based on the raw scores?
Population Standard Deviation
The average amount that the scores in a distribution vary from the mean.
Population standard deviation (σ, pronounced “sigma”):
\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]
Sample Standard Deviation
A sample is a subset of the population.
Use the sample SD to estimate the population SD.
Because samples are smaller than populations, there may be less variability in a sample.
To correct for this, we divide the sum of squares by N – 1 instead of N.
This increases the standard deviation of the sample.
It provides a better estimate of the population standard deviation.
Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]
Unbiased sample estimator of the standard deviation:
\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}} \]
Sample Standard Deviation
X (X - mean) (X - mean) squared
$0.00 -$46.53 $2,165.04
$0.00 -$46.53 $2,165.04
$0.13 -$46.40 $2,152.96
$0.93 -$45.60 $2,079.36
$1.00 -$45.53 $2,072.98
$10.00 -$36.53 $1,334.44
$32.00 -$14.53 $211.12
$45.53 -$1.00 $1.00
$56.00 $9.47 $89.68
$60.00 $13.47 $181.44
$63.25 $16.72 $279.56
$74.93 $28.40 $806.56
$80.00 $33.47 $1,120.24
$85.28 $38.75 $1,501.56
$115.35 $68.82 $4,736.19
$120.00 $73.47 $5,397.84
Mean = $46.53 N = 16 SS = $26,295.02
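To finish the calculation from the table, the SS is divided by N – 1 and the square root is taken. The Python sketch below does this from the raw amounts and compares the result with the divide-by-N version; the variable names are illustrative only. Small differences from the table's SS are expected because the table rounds the mean to $46.53 before squaring.

```python
from math import sqrt

amounts = [0.00, 0.00, 0.13, 0.93, 1.00, 10.00, 32.00, 45.53,
           56.00, 60.00, 63.25, 74.93, 80.00, 85.28, 115.35, 120.00]

n = len(amounts)
m = sum(amounts) / n

# Sum of squared deviations from the mean (SS).
ss = sum((x - m) ** 2 for x in amounts)

# Sample standard deviation: divide SS by N - 1.
s = sqrt(ss / (n - 1))

# Population formula for comparison: divide SS by N.
sigma = sqrt(ss / n)

print(f"SS = {ss:.2f}")
print(f"s (N - 1) = {s:.2f}")
print(f"sigma (N) = {sigma:.2f}")
```

Dividing by N – 1 gives the larger of the two values, which is the correction described on the previous slide.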
Types of Distributions
Refers to the shape of the distribution.
3 types:
Normal distribution
Positively skewed distribution
Negatively skewed distribution
Normal Distribution
Normal distributions:
Specific frequency distribution
Bell shaped
Symmetrical
Unimodal
Many variables found in nature (when samples are large) are approximately normally distributed.
Normal Distribution
Mean, median and mode are equal and located in the center.
Normal Distribution
Skewed distributions
When our data are not symmetrical:
Positively skewed distribution
Negatively skewed distribution
Memory hint: skew is where the tail is; also the tail looks like a skewer and it points to the skew (either positive or negative direction)
Skewed Distributions
Kurtosis
Kurtosis: how flat or peaked a distribution is.
Tall and skinny versus short and wide
Mesokurtic: normal
Leptokurtic: tall and thin
Platykurtic: short and fat (squatty like a platypus!)
Kurtosis
leptokurtic
mesokurtic
platykurtic
Skewness, Number of Modes, and Kurtosis in Distribution of Housing Prices
z-Scores
In which country (US vs. England) is Homer Simpson considered overweight?
How can we make this comparison? We need to convert weight in pounds and kilograms to a standardized scale.
z-scores allow scores from different distributions to be compared under standardized conditions.
The need for standardization: putting two different variables on the same scale.
z-score: transforming raw scores into standardized scores
\[ z = \frac{X - \mu}{\sigma} \]
Tells us the number of standard deviations a score is from the mean.
z-Scores
Class 1: M = $46.53, SD = $41.87, X = $54.76
Class 2: M = $53.67, SD = $18.23, X = $89.07
In which class did I have more money in comparison to the distribution of the other students?
Sample z-score:
\[ z = \frac{X - M}{s} \]
When we convert raw scores from different distributions to z-scores, these scores become part of the same z distribution and we can compare scores from different distributions.
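The two-class question can be worked through with the sample z-score formula. A minimal Python sketch, using the M, SD and X values given for each class; `z_score` is a hypothetical helper name.

```python
def z_score(x, m, s):
    """Number of standard deviations the raw score x lies from the mean m."""
    return (x - m) / s


# Values from the two classes on the slide.
z_class_1 = z_score(54.76, 46.53, 41.87)
z_class_2 = z_score(89.07, 53.67, 18.23)

print(f"Class 1: z = {z_class_1:.2f}")
print(f"Class 2: z = {z_class_2:.2f}")
```

The larger z-score identifies the class in which the amount stands further above that class's mean, relative to how spread out the class is.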
z Distribution
Characteristics (regardless of the original distributions):
z-score at the mean equals 0
Standard deviation equals 1
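These two characteristics can be verified numerically. The Python sketch below standardizes the earlier example scores X = 1, 2, 4, 4, 10 (chosen here only for illustration) using the divide-by-N standard deviation, then checks the mean and standard deviation of the resulting z-scores.

```python
from statistics import mean, pstdev

scores = [1, 2, 4, 4, 10]

m = mean(scores)
sd = pstdev(scores)  # divide-by-N standard deviation

# Transform every raw score into a z-score.
z_scores = [(x - m) / sd for x in scores]

print("z-scores:", [round(z, 2) for z in z_scores])
print("mean of z-scores:", round(mean(z_scores), 6))
print("SD of z-scores:", round(pstdev(z_scores), 6))
```

The same result holds for any set of scores that are not all identical, which is why z-scores from different distributions can be compared directly.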
z distribution of exam scores
M = 70
s = 10
Standard normal distribution
If a z-distribution is normal, then we refer to it as a standard normal distribution.
Provides information about the proportion of scores that are higher or lower than any other score in the distribution.
Standard Normal Curve Table
Standard normal curve table (Appendix A)
Statisticians provided the proportion of scores that fall between any two z-scores.
What is the percentile rank of a z score of 1?
Percentile rank = proportion of scores at or below a given raw score.
Ex: SAT score = 1350, M = 1120, s = 340 → 75th percentile
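Instead of looking values up in a printed table, the proportion of scores at or below a z-score can be computed with the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes the scores are normally distributed; `percentile_rank` is a hypothetical helper name.

```python
from statistics import NormalDist


def percentile_rank(z):
    """Percentage of scores at or below z, assuming a standard normal distribution."""
    return 100 * NormalDist().cdf(z)


# Percentile rank of a z-score of 1.
rank_z1 = percentile_rank(1.0)

# SAT example: convert the raw score to a z-score first.
sat_z = (1350 - 1120) / 340
sat_rank = percentile_rank(sat_z)

print(f"z = 1.00 -> percentile rank {rank_z1:.2f}")
print(f"SAT z = {sat_z:.2f} -> percentile rank {sat_rank:.1f}")
```

The SAT result agrees with the 75th percentile stated in the example.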
Percentile Rank
The percentage of scores that your score is higher than.
89th percentile rank for height: You are taller than 89% of the students in the class (you are tall!).
Homer Simpson: 4th percentile rank for intelligence. He is smarter than 4% of the population (or 96% of the population is smarter than Homer).
GRE score: 88th percentile rank
Reading scores of grammar school: 18th percentile rank
Review
Data organization
Frequency distribution, bar graph, histogram and frequency polygon.
Descriptive statistics
Central tendency = middleness of a distribution: mean, median and mode
Measures of variation = the spread of a distribution: range, standard deviation
Distributions can be normal or skewed (positively or negatively).
z-scores
Method of transforming raw scores into standard scores for comparisons.
Standard normal distribution: mean z-score = 0 and standard deviation = 1
Normal curve table: shows the proportions of scores under the curve for a given z-score.