INTRODUCTION TO STATISTICS
Prepared by: Joshua Erdy A. Tan
Professional Teacher
I. Basics of Statistics
II. Statistical Description of Data
III. Measures of Central Tendency
Outline of Discussion
Define the basics of statistics.
Compute statistical measures of data accurately.
Reflect on the use of statistics in everyday life.
Objectives
Basics of Statistics
Science of collection, presentation, analysis, and reasonable interpretation of data.
Presents a rigorous scientific method for gaining insight into data.
Gives an instant overall picture of data through graphical presentation or numerical summarization, irrespective of the number of data points.
Statistics
Methods used to determine the variability and reliability of data.
Statistical Methods
Taxonomy of Statistical Methods
Statistical Description of Data
Statistics describes a numeric set of data by its:
Center, variability, and shape
Statistics describes a categorical set of data by:
Frequency, percentage or proportion of each category
Any characteristic of an individual or entity. It can take different values for different individuals.
Variables
• Nominal - Categorical variables with no inherent order or ranking, such as names or classes (e.g., gender). Values may be written as numerals (e.g., I, II, III) but carry no numerical meaning. The only operation that can be applied to nominal variables is enumeration (counting).
• Ordinal - Variables with an inherent rank or order, e.g. mild, moderate, severe. Values can be compared for equality or for order (greater or less), but not for how much greater or less.
Types of Variables
• Interval - Values are ordered as in ordinal variables and, in addition, differences between values are meaningful. However, the scale has no absolute zero point, e.g. temperature in degrees Celsius.
• Ratio - Variables with all properties of Interval plus an absolute, non-arbitrary zero point, e.g. age, weight, temperature (Kelvin).
Types of Variables
Tells us what values the variable takes and how often it takes these values.
Distribution
Unimodal - having a single peak
Bimodal - having two distinct peaks
Symmetric - left and right halves are mirror images
Types of Distribution
Consider a data set of 26 children aged 1 to 6 years. The frequency distribution of the variable 'age' can be tabulated as follows:
Frequency Distribution
Frequency Distribution of Age:
Age        1  2  3  4  5  6
Frequency  5  3  7  5  4  2

Grouped Frequency Distribution of Age:
Age Group  1-2  3-4  5-6
Frequency  8    12   6
Cumulative Frequency
Age                   1  2  3   4   5   6
Frequency             5  3  7   5   4   2
Cumulative Frequency  5  8  15  20  24  26

Age Group             1-2  3-4  5-6
Frequency             8    12   6
Cumulative Frequency  8    20   26
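The three tables can be reproduced with a short Python sketch. It assumes the raw ages are rebuilt from the frequency table (the original list of 26 ages is not given), and the helper name age_group is hypothetical, chosen only for this illustration.

```python
from collections import Counter
from itertools import accumulate

# Raw ages rebuilt from the frequency table (an assumption: the
# original list of 26 individual ages is not shown on the slide).
ages = [1] * 5 + [2] * 3 + [3] * 7 + [4] * 5 + [5] * 4 + [6] * 2

# Frequency of each age, listed in ascending order of age.
counts = Counter(ages)
frequency = {age: counts[age] for age in sorted(counts)}


def age_group(age):
    """Hypothetical helper: map an age to its two-year group label."""
    low = age if age % 2 == 1 else age - 1
    return f"{low}-{low + 1}"


# Grouped frequency: add up the frequencies that fall in each group.
grouped = {}
for age, freq in frequency.items():
    label = age_group(age)
    grouped[label] = grouped.get(label, 0) + freq

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequency.values()))
grouped_cumulative = list(accumulate(grouped.values()))

print(frequency)           # {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}
print(grouped)             # {'1-2': 8, '3-4': 12, '5-6': 6}
print(cumulative)          # [5, 8, 15, 20, 24, 26]
print(grouped_cumulative)  # [8, 20, 26]
```

The last cumulative frequency always equals the total number of observations, here 26, which is a quick check that no value was missed.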
Measures of Central Tendency
Mean: The most popular and well-known measure of central tendency. It is equal to the sum of all the values in the data set (Σx) divided by the number of values (n) in the data set.
Formula: x̄ = Σx / n
Mean
For example, consider the wages of staff at a factory below:
Staff   1    2    3    4    5    6    7    8    9    10
Salary  15k  18k  16k  14k  15k  15k  12k  17k  90k  95k
To get the mean (represented by x̄), add the salaries of the staff members and divide the sum by the number of staff members.
x̄ = (15,000 + 18,000 + 16,000 + 14,000 + 15,000 + 15,000 + 12,000 + 17,000 + 90,000 + 95,000)/10
x̄ = 307,000/10
x̄ = 30,700
Answer: The mean salary for these ten staff members is $30.7k.
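As a quick check of the arithmetic, the same calculation can be sketched in Python. The variable names are illustrative only; statistics.mean is the standard-library function for the arithmetic mean.

```python
from statistics import mean

# Salaries of the ten staff members, in dollars.
salaries = [15_000, 18_000, 16_000, 14_000, 15_000,
            15_000, 12_000, 17_000, 90_000, 95_000]

# Mean by the definition: sum of the values divided by their count.
manual_mean = sum(salaries) / len(salaries)

# The same result from the standard library.
library_mean = mean(salaries)

print(manual_mean)  # 30700.0
```

Note that eight of the ten salaries lie well below 30,700: the two large salaries (90k and 95k) pull the mean upward.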
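Median: The middle score for a set of data that has been arranged in order of magnitude. When the number of values is odd, the median is the single middle value; when it is even, the median is the average of the two middle values.
Formula: e = (x + y)/2
Where:
e = median
x = smallest middle mark
y = largest middle mark
(When the number of values is odd, x and y are the same value.)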
Median
Suppose we have the data below:
65 55 89 56 35 14 56 55 87 45 92
Arranged in order of magnitude:
14 35 45 55 55 56 56 65 87 89 92
To get the median, find the smallest and largest middle marks. There are 11 values, an odd number, so both middle marks are the 6th value.
(x) Smallest middle mark: 56
(y) Largest middle mark: 56
Then solve using the formula:
e = (x + y)/2
e = (56 + 56)/2
e = 56
Answer: The median is 56.
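The middle-mark procedure can be sketched in Python as follows. The function name median_by_middle_marks is hypothetical; it sorts the data first, then averages the two middle marks, which coincide when the number of values is odd. For the eleven values listed, the middle mark is the 6th sorted value.

```python
from statistics import median

scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]


def median_by_middle_marks(values):
    """Hypothetical helper applying e = (x + y)/2 to sorted data."""
    ordered = sorted(values)     # arrange in order of magnitude
    n = len(ordered)
    x = ordered[(n - 1) // 2]    # smallest middle mark
    y = ordered[n // 2]          # largest middle mark (same as x if n is odd)
    return (x + y) / 2


print(sorted(scores))                  # [14, 35, 45, 55, 55, 56, 56, 65, 87, 89, 92]
print(median_by_middle_marks(scores))  # 56.0
print(median(scores))                  # 56
```

With an even number of values the two middle marks differ, for example median_by_middle_marks([55, 56]) gives 55.5.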
Mode: The most frequent score in the data set.
Suppose we have the data below:
69 55 89 56 35 14 56 55 83 55 91
To get the mode (X), find the most frequently occurring score in the data above.
X = 55
Answer: The mode is 55, since it occurs three times, more often than any other number.
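Counting occurrences by hand is error-prone on longer lists, so a small Python sketch may help. It uses collections.Counter to tally each score and statistics.mode as a cross-check; the variable names are illustrative.

```python
from collections import Counter
from statistics import mode

scores = [69, 55, 89, 56, 35, 14, 56, 55, 83, 55, 91]

# Tally how many times each score occurs.
counts = Counter(scores)

# most_common(1) returns a list holding the (value, count) pair
# with the highest count.
most_common_value, occurrences = counts.most_common(1)[0]

print(most_common_value, occurrences)  # 55 3
print(mode(scores))                    # 55
```

Here 56 occurs twice and every other score once, so 55 is the only mode.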
Range: The difference between the highest and lowest values.
In the set A = {4, 6, 9, 3, 7}, the lowest value is 3 and the highest is 9. To get the range of A:
R = highest value - lowest value
R = 9 - 3
R = 6
Answer: The range of set A is 6.
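The same calculation in Python, as a minimal sketch using the built-in max and min functions (the variable names are illustrative):

```python
# Set A from the example, stored as a list.
A = [4, 6, 9, 3, 7]

# Range = highest value - lowest value.
value_range = max(A) - min(A)

print(max(A), min(A), value_range)  # 9 3 6
```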
END