Introduction to statistics
By Dr. Amira Talic
What is “Statistics”?
• Statistics is the science of data that involves:
• Collecting
• Classifying
• Summarizing
• Organizing and
• Interpreting
numerical information.
• Examples:
• Cricket batting averages
• Stock prices
• Climatology data such as rainfall amounts and average temperatures
• Marketing information
• Gambling?
Key Terms
• What is Data? Facts or information that is relevant or appropriate to a decision maker
• Population? The totality of objects under consideration
• Sample? A portion of the population that is selected for analysis
Key Terms
• Parameter? A summary measure (e.g., mean) that is computed to describe a characteristic of the population
• Statistic? A summary measure (e.g., mean) that is computed to describe a characteristic of the sample
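The difference between a parameter and a statistic can be sketched in a few lines of Python. The population here is simulated (hypothetical heights in cm with an assumed mean of 170 and standard deviation of 10); it is an illustration only, not data from the slides.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated heights in cm (assumed mean 170, SD 10).
random.seed(42)
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: a summary measure computed from the whole population.
population_mean = statistics.fmean(population)

# Statistic: the same summary measure computed from a sample of the population.
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values are usually close but not identical, which is why a statistic is treated as an estimate of the parameter.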
Variables
• Traits or characteristics that can change values from case to case.
• A variable is what is measured or manipulated in an experiment
• Examples:
• Age
• Gender
• Income
• Social class
Types Of Variables
• In causal relationships: CAUSE => EFFECT (independent variable => dependent variable)
• Independent variable: a variable that can be controlled or manipulated. It is the variable you have control over (e.g. the dose of a drug).
• Dependent variable: a variable that cannot be controlled or manipulated directly. Its values are predicted from the independent variable (e.g. the effect on the condition).
Types Of Variables
•Discrete variables are measured in units that cannot be subdivided. Example: Number of children
•Continuous variables are measured in a unit that can be subdivided infinitely. Example: Height
Statistical analysis
• Descriptive Statistics
• Inferential statistics
• Predictive modeling
Descriptive Statistics
• Gives us the overall picture of the data
• Presents data in the form of tables, charts and graphs
• Includes summary data
• Avoids inferences
Examples:
• Measures of central location: mean, median, mode and midrange
• Measures of variation: variance, standard deviation, z-scores
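Two of the measures listed above, the midrange and z-scores, are sketched here with Python's standard `statistics` module. The data set is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 6, 9]

mean = statistics.mean(data)            # (2 + 4 + 4 + 6 + 9) / 5 = 5
midrange = (min(data) + max(data)) / 2  # (2 + 9) / 2 = 5.5
sd = statistics.pstdev(data)            # population standard deviation

# z-score: how many standard deviations a value lies above or below the mean.
z_scores = [(x - mean) / sd for x in data]

for x, z in zip(data, z_scores):
    print(f"value {x}: z = {z:+.2f}")
```

By construction, z-scores have a mean of 0 and a standard deviation of 1, which makes values from different scales comparable.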
Inferential Statistics
• Draws conclusions about the overall population using a sample
• “Sampled” data are incomplete but can still be representative of the population
• Permits the making of generalizations (inferences) about the population from the sampled data
• Probability theory is a major tool used to analyze sampled data
Predictive Modeling
• The science of predicting future outcomes based on historical events.
• Model building: “Developing a set of equations or a mathematical formulation to forecast future behaviors based on current or historical data.”
• Examples: regression, logistic regression, time series analysis, etc.
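As a minimal sketch of model building, the simplest regression model (a straight line fitted by ordinary least squares) can be written without any libraries. The data points and the `predict` helper are hypothetical and exist only to show the idea of forecasting from historical values.

```python
# Hypothetical historical data: x = time period, y = observed value.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for the line y = intercept + slope * x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(x):
    """Forecast y for a new x using the fitted line (hypothetical helper)."""
    return intercept + slope * x


print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"forecast for period 6: {predict(6):.2f}")
```

Logistic regression and time series models follow the same pattern of fitting parameters to past data and then predicting, but with different equations.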
Calculation of the probability
• Based on the characteristics of the population for the observed parameter
• (e.g. duration of the pregnancy, duration of the first stage of labor, height, etc.)
• To describe the population, a “distribution” will be used
Distribution
• A statistical distribution describes the number of times each possible outcome occurs in a sample
• Distributions for continuous variables are called continuous distributions (e.g. height)
• They also carry the fancier name probability densities
Distribution
• Some probability densities have particular importance in statistics. A very important one is shaped like a bell and is called the normal (Gaussian) distribution.
• Many naturally-occurring phenomena can be approximated surprisingly well by this distribution. It will serve to illustrate some features of all continuous distributions.
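The normal distribution can be explored with `statistics.NormalDist` from the Python standard library. The parameters below (mean 170 cm, standard deviation 10 cm for height) are assumed for illustration and are not taken from the slides.

```python
from statistics import NormalDist

# Assumed parameters for illustration: height with mean 170 cm and SD 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Probability density at the mean, i.e. the top of the bell curve.
peak = heights.pdf(170)

# Probability of falling within 1, 2 and 3 standard deviations of the mean.
within = {
    k: heights.cdf(170 + k * 10) - heights.cdf(170 - k * 10)
    for k in (1, 2, 3)
}

for k, p in within.items():
    print(f"within {k} SD of the mean: {p:.3f}")
```

The three probabilities come out at roughly 68%, 95% and 99.7%, a property shared by every normal distribution regardless of its mean and standard deviation.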
Gaussian distribution
What are the Components of A Distribution?
• Measures of central tendency
• Suppose we have a sample with 4 observations: 4, 1, 4, 3
• Mean = the sum of a set of numbers divided by the number of observations: (4 + 1 + 4 + 3) / 4 = 12 / 4 = 3
• Median - the middle point of a set of numbers. Sorted, the observations are 1, 3, 4, 4, so the median is (3 + 4) / 2 = 3.5
Components of distribution
• Mode - the most frequently occurring number. Mode = 4
• Median - the middle point of a set of numbers. Median = 3.5
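The three measures of central tendency for the sample 4, 1, 4, 3 can be checked with Python's standard `statistics` module; the variable names are only illustrative.

```python
import statistics

sample = [4, 1, 4, 3]

mean = statistics.mean(sample)      # (4 + 1 + 4 + 3) / 4 = 3
median = statistics.median(sample)  # sorted: 1, 3, 4, 4 -> (3 + 4) / 2 = 3.5
mode = statistics.mode(sample)      # 4 occurs twice, more than any other value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With an even number of observations there is no single middle value, so the median is the average of the two middle values.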
Components of distribution
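Measures of variation
• Range - the maximum value minus the minimum value in a set of numbers. Range = 4 - 1 = 3
• Standard deviation - a measure of how far a typical data point lies from the mean, computed as the square root of the average squared distance from the mean.
• Squared distances from the mean (3): (4 - 3)² + (1 - 3)² + (4 - 3)² + (3 - 3)² = 1 + 4 + 1 + 0 = 6
• Standard deviation = √(6 / 4) ≈ 1.22 (dividing by n - 1 = 3 instead gives the sample standard deviation, √2 ≈ 1.41)
• The plain average of the absolute distances, (1 + 2 + 1 + 0) / 4 = 1, is the mean absolute deviation, not the standard deviation.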
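A short Python sketch makes the measures of variation for the sample 4, 1, 4, 3 concrete. It also computes the mean absolute deviation, because averaging the plain distances from the mean gives that quantity rather than the standard deviation. Variable names are illustrative.

```python
import math
import statistics

sample = [4, 1, 4, 3]
mean = statistics.mean(sample)  # 3

# Range: maximum minus minimum.
value_range = max(sample) - min(sample)  # 4 - 1 = 3

# Mean absolute deviation: average absolute distance from the mean.
mean_abs_dev = sum(abs(x - mean) for x in sample) / len(sample)  # (1 + 2 + 1 + 0) / 4 = 1

# Standard deviation: square root of the average squared distance from the mean.
population_sd = statistics.pstdev(sample)  # sqrt(6 / 4), divides by n
sample_sd = statistics.stdev(sample)       # sqrt(6 / 3), divides by n - 1

print(f"range = {value_range}")
print(f"mean absolute deviation = {mean_abs_dev}")
print(f"population SD = {population_sd:.2f}, sample SD = {sample_sd:.2f}")
```

`pstdev` treats the four numbers as the whole population, while `stdev` treats them as a sample and divides by n - 1, which is the usual choice when making inferences about a larger population.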
Standard deviation
Why do we need to know about it?
• Mean, median, mode, range and standard deviation are measurements in a sample (statistics) and can also be used to make inferences about a population.
What do we expect from the statistical analysis?
• To find out whether there is a statistically significant difference between our sample (e.g. pregnancy loss among patients at Al Ain Hospital) and the general population
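One common way to check for such a difference is a one-proportion z-test, sketched below. Every number here is made up for illustration (an assumed population rate of 15% and 42 events in a hypothetical sample of 200); none of it is data from Al Ain Hospital, and the slides do not say which test the author has in mind.

```python
import math
from statistics import NormalDist

# All values are hypothetical and for illustration only.
population_rate = 0.15  # assumed rate in the general population
sample_size = 200       # assumed number of cases in the sample
sample_events = 42      # assumed number of events observed in the sample

sample_rate = sample_events / sample_size

# Standard error of the sample rate if the population rate were true.
standard_error = math.sqrt(population_rate * (1 - population_rate) / sample_size)

# z statistic: distance between the rates, measured in standard errors.
z = (sample_rate - population_rate) / standard_error

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # conventional significance level
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("statistically significant" if p_value < alpha else "not statistically significant")
```

A p-value below the chosen significance level suggests the difference is unlikely to be due to sampling chance alone; it does not by itself say how large or how important the difference is.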
How to perform the statistical analysis?
• Statistics can take us on a beautiful journey of understanding, but
Festina lente! Make haste slowly.
Let us take it easy!
With love,
Dr. Amira