topic 1 statistical analysis. making a scientific investigation step 1: have a research question...

TOPIC 1 STATISTICAL ANALYSIS

Upload: nathaniel-floyd

Post on 26-Dec-2015


TRANSCRIPT

Page 1: TOPIC 1 STATISTICAL ANALYSIS. MAKING A SCIENTIFIC INVESTIGATION STEP 1: HAVE A RESEARCH QUESTION STEP 2: HAVE A HYPOTHESIS STEP 3: WRITE A METHOD TO TEST

TOPIC 1 STATISTICAL ANALYSIS

MAKING A SCIENTIFIC INVESTIGATION

STEP 1: HAVE A RESEARCH QUESTION

STEP 2: HAVE A HYPOTHESIS

STEP 3: WRITE A METHOD TO TEST YOUR HYPOTHESIS (design a controlled experiment)

STEP 4: COLLECT DATA

STEP 5: ORGANIZE THE DATA

STEP 6: ILLUSTRATE THE DATA USING AN APPROPRIATE DIAGRAM

STEP 7: ANALYZE THE DATA USING THE CORRECT STATISTICAL METHODS, ENABLING A CONCLUSION TO BE DRAWN

STEP 4: DATA COLLECTION

The collection of all things being investigated is called the population.

It is usually impossible for us to collect data from every member of the population.

We must therefore choose a sample from the population.

We must try to make sure that the sample is representative of the population from which it is drawn, so that we can generalize any findings about the sample to the population.

Random sampling ensures that every member of the population has an equal chance of being included in the sample.
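
As a rough illustration of random sampling, the sketch below draws a sample from a numbered population using Python's standard `random` module. The population of 500 numbered limpets, the sample size of 30 and the seed are assumptions made up for this example; they do not come from the slides.

```python
import random

# Hypothetical population: 500 individually numbered limpets on a shore.
population = list(range(1, 501))

# A fixed seed is used only so the example is repeatable;
# leave it out when sampling for a real investigation.
rng = random.Random(42)

# random.sample draws without replacement: every member has the same
# chance of being chosen and no member can be chosen twice.
sample = rng.sample(population, k=30)

print(sorted(sample))
```

In the field the same idea is often applied by generating random coordinates for quadrat positions rather than by numbering individuals.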

I. QUALITATIVE DATA (descriptive)

II. QUANTITATIVE DATA (numerical)
• CONTINUOUS ex. length
• DISCRETE ex. number of eggs

STEP 5: ORGANIZING DATA

Ways to Organize Raw Data: Constructing tables
- Ranking
- Tally chart
- Frequency distribution
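
The three ways of organizing raw data listed above can be sketched in a few lines of Python. The raw shell lengths here are invented for illustration only, and the helper name `class_label` is hypothetical; the class width of 4 mm starting at 8 mm is chosen as an assumption.

```python
from collections import Counter

# Hypothetical raw shell lengths in mm (invented for illustration).
raw_lengths = [9, 14, 17, 21, 22, 25, 18, 30, 12, 20, 26, 23, 16, 33, 27]

# Ranking: place the raw values in order.
ranked = sorted(raw_lengths)

def class_label(length, start=8, width=4):
    """Return the class interval (e.g. '20-23') that contains `length`."""
    lower = start + ((length - start) // width) * width
    return f"{lower}-{lower + width - 1}"

# Tally each value into its class to build the frequency distribution.
frequency = Counter(class_label(x) for x in ranked)

# Print a simple tally chart, one stroke per observation.
for label in sorted(frequency, key=lambda s: int(s.split("-")[0])):
    print(f"{label:>6}  {'|' * frequency[label]}  {frequency[label]}")
```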

Shell length / mm    Number of limpets
8-11                 2
12-15                5
16-19                8
20-23                10
24-27                9
28-31                5
32-35                1

Use the table above to answer the following questions:
- Is discrete or continuous data represented?
- What type of data organization is shown?
- Is the data table complete?
- How will you process this data? (What does this data ‘say’ to you?)

QUADRAT SAMPLING

Marine Intertidal Zone

SPREADSHEET ACTIVITY 1: NORMAL DISTRIBUTION
1) Input the data from Limpet Shell Lengths in your spreadsheet
2) GRAPH: frequency distribution (normal distribution)
3) What does this graph tell you?

Shell length / mm    Number of limpets
8-11                 2
12-15                5
16-19                8
20-23                10
24-27                9
28-31                5
32-35                1
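
If a spreadsheet is not to hand, the shape of this frequency distribution can be previewed with a small text histogram. This is only a sketch using the limpet table from the slide; the function name `text_histogram` is made up for the example.

```python
# Frequency table from the slide: (class interval in mm, number of limpets).
limpets = [
    ("8-11", 2), ("12-15", 5), ("16-19", 8), ("20-23", 10),
    ("24-27", 9), ("28-31", 5), ("32-35", 1),
]

def text_histogram(table):
    """Return one line per class, with one '#' per limpet."""
    return [f"{label:>5} | {'#' * count} ({count})" for label, count in table]

for line in text_histogram(limpets):
    print(line)

# The class with the highest frequency is the modal class.
modal_class = max(limpets, key=lambda row: row[1])[0]
total = sum(count for _, count in limpets)
print("modal class:", modal_class, "| total limpets:", total)
```

The bars rise to a single peak in the middle classes and fall away on both sides, which is the roughly bell-shaped pattern the activity asks you to look for.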

Normal Distribution

Skewed Distribution

Descriptive Statistics Includes:

• Calculating the:
– Mean
– Median
– Mode
– Range
– Standard deviation (variability)
– P value (level of confidence from a T-Test)
– PEARSON correlation coefficient (correlation/cause)

Mean (average): the sum of all data entries divided by the number of entries; a good measure of central tendency for normally distributed data.

Median: the middle value when data entries are placed in rank order; a good measure of central tendency for skewed distributions.

Mode: the most frequently occurring value (the most common data value).

Range: the difference between the smallest and largest data values. This gives a simple measure of the spread of the data. (Note: the range is strongly affected by outliers, extreme values which are very different from all other values.)

SPREADSHEET ACTIVITY 2
1) Input the following data in your spreadsheet
Sample 1: 30 45 45 60 75 75 75 80 90 90 100
Sample 2: 60 60 70 70 80 80 90 90 100 100 120 120
2) Calculate the mean, median, mode & range
a) manually (using scientific calculator)
b) using your spreadsheet

Note: you need to know how to complete all stats. calculations using: 1) formula 2) spreadsheet 3) calculator.
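
As a fourth way of checking your answers, the same four values can be computed with Python's standard `statistics` module. This is a sketch, not part of the original activity; the helper name `describe` is made up for the example.

```python
import statistics

sample_1 = [30, 45, 45, 60, 75, 75, 75, 80, 90, 90, 100]
sample_2 = [60, 60, 70, 70, 80, 80, 90, 90, 100, 100, 120, 120]

def describe(values):
    """Return mean, median, mode(s) and range for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # multimode returns every value tied for most frequent.
        "modes": statistics.multimode(values),
        "range": max(values) - min(values),
    }

for name, values in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    print(name, describe(values))
```

Note that in Sample 2 every value occurs exactly twice, so there is no single mode; `multimode` reports all of the tied values.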

Do we stop data analysis at calculating the Mean, Median & Mode?
• No!
• The mean does not give us a complete picture of variation in our data.
• We need to calculate standard deviation
– The STDEV is a more complete measure of variation. It considers every value in the set.
– It is a measure of the spread of data around the mean

SPREADSHEET ACTIVITY 3: Standard Deviation
1) Input the following data in your spreadsheet.

Mass (g) of mice bred in different environments
Sample A (isolated mice): 22, 22, 23, 24, 24, 24, 24, 25, 26, 26
Sample B (bred together): 16, 17, 20, 23, 24, 25, 27, 28, 29, 31

2) Calculate the means for samples A & B
3) Calculate standard deviation (STDEVP) for A & B
a) with formula b) with spreadsheet c) with calculator

4) Is variation high or low in Sample A? Sample B?
5) What does this variation tell us?
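
The "with formula" and "with spreadsheet" routes can be compared side by side in Python. The sketch below assumes the population form of the standard deviation (dividing by n), which is what the spreadsheet function STDEVP calculates; the helper name `population_sd` is made up for the example.

```python
import statistics

sample_a = [22, 22, 23, 24, 24, 24, 24, 25, 26, 26]  # isolated mice, mass in g
sample_b = [16, 17, 20, 23, 24, 25, 27, 28, 29, 31]  # mice bred together

def population_sd(values):
    """Population SD from the formula sqrt(sum((x - mean)^2) / n)."""
    mean = sum(values) / len(values)
    return (sum((x - mean) ** 2 for x in values) / len(values)) ** 0.5

for name, values in (("A", sample_a), ("B", sample_b)):
    print(name,
          "mean =", statistics.mean(values),
          "SD (formula) =", round(population_sd(values), 2),
          "SD (library) =", round(statistics.pstdev(values), 2))
```

Both samples have a mean of 24 g, but the standard deviation is about 1.3 g for Sample A and about 4.8 g for Sample B.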

Analyzing Values from Mice Samples
• Looking at the calculated values for the mean alone for samples A and B, it appears that there is no difference between the two populations of mice. (The mean alone does not show the variability of the data.)
• However, when looking at STDEV, we can see:
• For sample A – STDEV is low
• For sample B – STDEV is high
– Wide variation in this data set makes us question the experimental design. Is it possible that mice bred in environment ‘B’ were subject to other environmental factors? What is causing the wide variation in the data?

[Figure: dot plots of the two mouse samples. Axis ticks: 22, 24, 26 (Sample A) and 16, 24, 31 (Sample B).]

• Standard Deviation: A measure of how the individual observations of a data set are dispersed or spread out around the mean (average).

• For normally distributed data:
– about 68% of all values lie within ±1 standard deviation of the mean
– about 95% of all values lie within ±2 standard deviations of the mean
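
The 68% and 95% figures can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module with a standard normal curve (mean 0, standard deviation 1); the helper name `fraction_within` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(f"within 1 SD: {fraction_within(1):.1%}")
print(f"within 2 SD: {fraction_within(2):.1%}")
```

The exact fractions are close to 68.3% and 95.4%, which is why the rounded values 68% and 95% are quoted.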

Reasons for Using Statistics

• In a population, we usually find that not all the values are identical. Instead, there are differences between the values even inside a population.

• We call this VARIATION.
• The data we obtain from a study has variability.
• We often need to describe the variation within a population to help us decide whether a difference between sample means truly represents a difference between population means.

• How can we describe this variation? (via statistics)

Why Use Standard Deviation?

• The value provides a description of the variation which considers every data item.

• Large differences in the sizes of the standard deviation between samples being compared can indicate:
– 1) that control variables are not constant
– 2) that there is a problem with validity of the investigation.

• The standard deviation can be used as a support in hypothesis testing.

We can graphically represent STDEV as ERROR BARS

Error Bars

• In many charts and graphs, we show the mean values of our samples.

• It is useful to show a measure of the variation inside each of these samples. We do this by adding error bars to the chart or graph.

Error Bars
• An error bar is a line that extends above and below a bar in a chart or a data point in a graph. It could represent the range for that sample, or the standard deviation.

• When it shows the standard deviation, the line extends an equal distance (one standard deviation) above and below the value of the mean. When it shows the range, it runs from the smallest to the largest value.

• Error bars are graphical representations of the variability of data.

Significance

Significance: a real (true) difference between two or more samples in the phenomenon that we are examining; we test whether the findings are unlikely to have arisen just by chance.

Note: statistical significance is our main tool in deciding whether the data supports the hypothesis.

What information do the means of data give?
What additional information do error bars give?
How does this affect interpretation of the figures?
- Error bars help us determine whether or not the difference between two sets of data is significant (real).
- A large difference between the means of samples, and small standard deviations for these samples, indicates that it is likely that the difference between the means is statistically significant.
- A small difference between these means, and large standard deviations for these samples, indicates that it is likely that the difference between these means is not statistically significant.

Confidence Levels
• It is seldom possible to say with complete certainty (100% confidence) that the difference between sample means is significant.

• Instead, we determine if the difference between the sample means is probably significant.

• Most often, scientists/biologists want to be 95% confident that the difference between the samples is significant.

• This means that there is only a 5% chance that the samples differ purely due to chance and not because of a real difference between the populations.

• We could say: p = 0.05 (the probability (p) that chance alone produced the difference between our sample means is 5%).

Determining Confidence of Significance with T-Test

How do we determine if our findings are significant? We need to calculate our t value and find the p value.

Apply the t-test to calculate the t-value, which helps determine the p-value (significance at a certain level of confidence). To use the t-test:

• Data should be normally distributed
• Sample size should be at least 10

T-Test
• Need to include the following information for T-Test calculation:
• 1) size of the difference between means of the samples
• 2) number of items in each sample
• 3) the amount of variation about the mean of each sample (standard deviation)
• Value for t from data can be calculated using:
– Formula
– Scientific calculator
– Spreadsheet (Microsoft Excel)
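
To show how those three pieces of information combine, here is a sketch of the "formula" route in Python. It assumes the equal-variance (pooled) form of Student's t-test for two independent samples, which is the form that goes with degrees of freedom of (n1 + n2) - 2; the slides do not state which form is intended. The data are the two samples from Spreadsheet Activity 2, reused only for illustration, and the function name `pooled_t` is made up.

```python
import math
import statistics

sample_1 = [30, 45, 45, 60, 75, 75, 75, 80, 90, 90, 100]
sample_2 = [60, 60, 70, 70, 80, 80, 90, 90, 100, 100, 120, 120]

def pooled_t(a, b):
    """Student's t for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    # 1) size of the difference between the sample means
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # 3) variation about each mean, combined into one pooled variance
    #    (statistics.variance is the sample variance, dividing by n - 1)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    # 2) number of items in each sample enters through the standard error
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / standard_error

t_value = pooled_t(sample_1, sample_2)
print(f"t = {t_value:.2f}")
```

For these two samples the sketch gives t of about -1.93; the sign only shows which sample has the larger mean, so it is the size of t that is compared with a table of t-values.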

SPREADSHEET ACTIVITY 4: T-TEST (P-value)

1) Input data from Clegg Text Chapter 21 Page 681
2) Calculate: mean and standard deviation
3) Calculate: P-value (from T-Test)
a) spreadsheet
b) calculator

4) What does this P-value tell you?

T-Test & P-Value using a Calculator
• Need to use table of t-values!
• Calculate T-Test Value (t-value)
• Identify Degrees of Freedom (DF) for your experiment: DF = (size of sample 1 + size of sample 2) - 2
  Example: (10 + 10) - 2 = 18
• Find row 18 in DF column
• Find t value in row 18 under “t values” column
• Once you have found your t value, look to the bottom row in that column for p value.
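
The table lookup described above can be sketched as follows. Only a handful of rows from a standard table of critical t values at p = 0.05 (two-tailed) are included, and the t values passed in at the end are hypothetical; the function names are made up for the example.

```python
# Critical t values at p = 0.05 (two-tailed) for a few degrees of freedom,
# taken from a standard t table. A full table has many more rows.
CRITICAL_T_05 = {10: 2.228, 18: 2.101, 20: 2.086, 21: 2.080}

def degrees_of_freedom(n1, n2):
    """DF for a two-sample t-test: (n1 + n2) - 2."""
    return (n1 + n2) - 2

def is_significant(t_value, n1, n2):
    """True if |t| exceeds the tabulated critical value at p = 0.05."""
    df = degrees_of_freedom(n1, n2)
    return abs(t_value) > CRITICAL_T_05[df]

print(degrees_of_freedom(10, 10))    # the slide's example: two samples of 10
print(is_significant(2.5, 10, 10))   # hypothetical t value of 2.5
print(is_significant(1.5, 10, 10))   # hypothetical t value of 1.5
```

A calculated t larger in size than the critical value means p is below 0.05, so the difference between the means is treated as significant at the 95% confidence level.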

Two-tailed test

• A two-tailed test will test both if the mean is significantly greater than x and if the mean is significantly less than x.

• The mean is considered significantly different from x if the test statistic is in the top 2.5% or bottom 2.5% of its probability distribution, resulting in a p-value less than 0.05.

• We would use a two-tailed test to see if two means are different from each other (i.e. from different populations) or not (i.e. from the same population).
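
The "top 2.5% or bottom 2.5%" idea can be illustrated numerically. As a simplifying assumption, the sketch below uses a standard normal distribution for the test statistic rather than the t-distribution (the two are close for large samples); the function name `two_tailed_p` and the example statistic of 2.5 are made up.

```python
from statistics import NormalDist

standard_normal = NormalDist()

def two_tailed_p(z):
    """Probability of a result at least as extreme as z in either tail."""
    return 2 * (1 - standard_normal.cdf(abs(z)))

# Cut-off that leaves 2.5% in the upper tail and, by symmetry,
# 2.5% in the lower tail.
cutoff = standard_normal.inv_cdf(0.975)
print(f"cut-off = {cutoff:.2f}")
print(f"p for a statistic of 2.5: {two_tailed_p(2.5):.3f}")
```

Any statistic beyond the cut-off in either direction falls in one of the two 2.5% tails and therefore has a two-tailed p-value below 0.05.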

Most likely observation

observed or more extreme result arising by chance

Cause & Correlation

• Correlation: a relationship or connection between two or more things. (observations without an experiment can only show a correlation)

• Cause: a phenomenon that gives rise to a result. (experimentation gives evidence for cause of result)

• Example: we might do an experiment to see if watering bean plants prevents wilting. Observing that wilting occurs when the soil is dry is a simple correlation, but the experiment gives us evidence that the lack of water is the cause of the wilting. Experiments provide a test which shows cause.

SPREADSHEET ACTIVITY 5:
1) Input the following data
2) Calculate the PEARSON Correlation Coefficient (r value)

LIGHT INTENSITY (X UNITS)    PLANT HEIGHT (CM)
0                            6
5                            7
10                           9
15                           10
20                           11
25                           12
30                           15

3) Explain what this r-value tells you.
4) Explain that existence of a correlation does not establish that there is a causal relationship between two variables.
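
The r value for this table can also be worked out directly from the definition of the Pearson correlation coefficient. The sketch below is one way to do it in plain Python; the function name `pearson_r` is made up for the example.

```python
import math

light_intensity = [0, 5, 10, 15, 20, 25, 30]   # x units
plant_height = [6, 7, 9, 10, 11, 12, 15]       # cm

def pearson_r(xs, ys):
    """Pearson correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(light_intensity, plant_height)
print(f"r = {r:.3f}")
```

The result is about 0.985, a strong positive correlation. On its own this shows only that the two variables rise together, not that light intensity causes the extra height.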

Positive Correlation: A correlation in the same direction is called a positive correlation. If one variable increases, the other also increases; if one decreases, the other also decreases. For example, the length of an iron bar increases as the temperature increases.

Negative Correlation: A correlation in opposite directions is called a negative correlation. If one variable increases, the other decreases, and vice versa. For example, the volume of a gas decreases as the pressure increases, and the demand for a commodity increases as its price decreases.

No Correlation or Zero Correlation: If there is no relationship between the two variables, so that a change in the value of one variable is not matched by any change in the other, this is called no correlation or zero correlation.