pcb 3043l - general ecology data analysis organizing an ecological study what is the aim of the...


Upload: basil-barrett

Post on 17-Jan-2016


TRANSCRIPT

Page 1: PCB 3043L - General Ecology Data Analysis Organizing an ecological study What is the aim of the study? What is the main question being asked? What are

PCB 3043L - General Ecology

Data Analysis


Organizing an ecological study

• What is the aim of the study?

• What is the main question being asked?

• What are your hypotheses?

• Collect data

• Summarize data in tables

• Present data graphically

• Statistically test your hypotheses

• Analyze the statistical results

• Present a conclusion to the proposed question


What is a variable?

• Variable: any defined characteristic that varies from one biological entity to another.

• Examples: plant height, bird weight, human eye color, number of tree species

• If an individual is selected randomly from a population, it may display a particular height, weight, etc.

• If several individuals are selected, their characteristics may be very similar or very different.


What is a population?

• Population: the entire collection of measurements of a variable of interest.

• Example: if we are interested in the heights of pine trees in Everglades National Park (plant height is our variable), then our population would consist of the heights of all the pine trees in Everglades National Park.


What is a sample?

• Sample: a smaller group or subset of the population that is measured and used to estimate the distribution of the variable within the true population

• Example: the heights of 100 pine trees in Everglades National Park may be used to estimate the heights of trees within the entire population (which actually consists of thousands of trees)


What is a parameter?

• Parameter: any calculated measure used to describe or characterize a population

• Example: the average height of pine trees in Everglades National Park


What is a statistic?

• Statistic: an estimate of any population parameter

• Example: the average height of a sample of 100 pine trees in Everglades National Park


Why use statistics?

• It is not always possible to obtain measures and calculate parameters of variables for the entire population of interest

• Statistics allow us to estimate these values for the entire population based on multiple, random samples of the variable of interest

• The larger the sample, the closer the estimated measure tends to be to the true population measure

• Statistics also allow us to efficiently compare populations to determine differences among them

• Statistics allow us to determine relationships between variables
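The point about larger samples can be illustrated with a short simulation. This is only a sketch: the population below is made up (5,000 hypothetical tree heights drawn from a normal distribution with an assumed mean of 12 m and standard deviation of 3 m), and the name `sample_mean` is introduced here purely for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (m) of 5,000 pine trees (made-up numbers).
population = [random.gauss(12.0, 3.0) for _ in range(5000)]
parameter = statistics.mean(population)  # the population mean (a parameter)


def sample_mean(n):
    """Mean height of n trees drawn at random without replacement (a statistic)."""
    return statistics.mean(random.sample(population, n))


# Compare estimates from increasingly large samples with the true parameter.
for n in (10, 100, 1000, len(population)):
    estimate = sample_mean(n)
    error = abs(estimate - parameter)
    print(f"n = {n:>4}: estimate = {estimate:.2f}, error = {error:.2f}")
```

Individual small samples can still land close to the parameter by chance, so the error does not have to shrink at every step; the tendency shows up on average. When the sample includes the whole population, the statistic equals the parameter.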


Statistical analysis of data

• Measures of central tendency

• Measures of dispersion and variability

Site 1    Site 2
------    ------
5         4
7         2
3         8
8         3
6         7

Heights of pine trees at 2 sites in Everglades National Park
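As a rough illustration of the "summarize data in tables" step, the two sites can be summarized with Python's standard `statistics` module. This is a sketch, not part of the original slides: the dictionary layout and the name `summary` are assumptions, and the slide does not state the units of the heights.

```python
import statistics

# Heights of pine trees at the two sites, copied from the table above
# (the slide does not state the units).
heights = {
    "Site 1": [5, 7, 3, 8, 6],
    "Site 2": [4, 2, 8, 3, 7],
}

# One summary row per site: sample size, mean, and sample standard deviation.
summary = {
    site: {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (divides by n - 1)
    }
    for site, values in heights.items()
}

for site, row in summary.items():
    print(f"{site}: n = {row['n']}, mean = {row['mean']:.2f}, sd = {row['sd']:.2f}")
```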


Measures of central tendency

• Where is the center of the distribution?

mean (x̄ or μ): arithmetic mean, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

median: the value in the middle of the ordered data set

mode: the most commonly occurring value

Example data set: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10

Mean = (1 + 2 + 2 + 2 + 3 + 5 + 6 + 7 + 8 + 9 + 10)/11 = 55/11 = 5

Median of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 5 (odd number of values: the middle value)

Median of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10, 11 = (5 + 6)/2 = 5.5 (even number of values: the average of the two middle values)

Mode of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 2
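The worked example can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen here for illustration only.

```python
import statistics

# Example data set from the slide.
data = [1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)                  # sum of the values divided by n
median_odd = statistics.median(data)          # 11 values: the middle value
median_even = statistics.median(data + [11])  # 12 values: average of the two middle values
mode = statistics.mode(data)                  # most commonly occurring value

print(mean, median_odd, median_even, mode)
```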


Measures of dispersion and variability

• How widely are the data distributed?

range: largest value minus smallest value

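variance (s² or σ²): $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

standard deviation (s or σ): $s = \sqrt{s^2}$

[Figure: two distributions compared, one with a large spread and one with a small spread]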


Measures of dispersion and variability

Example data set: 0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10

Variance = 9.8

Standard Deviation = 3.13

Range = 10

[Figure: bar chart of the data set above; x-axis: Value (0 to 10), y-axis: Number of Occurrences]

Example data set: 0, 10, 30, 30, 50, 50, 50, 70, 70, 90, 100

Variance = 980

Standard Deviation = 31.30

Range = 100

[Figure: bar chart of the data set above; x-axis: Value (0 to 100), y-axis: Number of Occurrences]
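The two example data sets can be reproduced with the standard `statistics` module, whose `variance` and `stdev` functions use the sample formulas (dividing by n − 1). That divisor matches the slide's value of 9.8. The helper name `describe` is an assumption made for this sketch.

```python
import statistics

small = [0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10]
large = [10 * x for x in small]  # same shape, every value ten times larger


def describe(values):
    """Range, sample variance, and sample standard deviation of a data set."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # divides by n - 1
        "sd": statistics.stdev(values),           # square root of the variance
    }


for name, values in (("small", small), ("large", large)):
    d = describe(values)
    print(f"{name}: variance = {d['variance']:.1f}, sd = {d['sd']:.2f}, range = {d['range']}")
```

Multiplying every value by 10 multiplies the range and standard deviation by 10 and the variance by 100, which is why the second data set shows 980, 31.30 and 100.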


Normal distribution of data

• A data set in which most values are around the mean, with fewer observations towards the extremes of the range of values

• The distribution is symmetrical about the mean


Proportions of a Normal Distribution

• A normal population of 1000 body weights

• μ = 70 kg, σ = 10 kg

• 500 weights are > 70 kg

• 500 weights are < 70 kg

[Figure: Weights of Black Bears in Bunting Park; x-axis: Weights (kg), y-axis: No. of bears]


Proportions of a Normal Distribution

• How many bears have a weight > 80 kg?

• μ = 70 kg, σ = 10 kg, X = 80 kg

• We use an equation to tell us how many standard deviations from the mean the X value is located:

$Z = \frac{X - \mu}{\sigma} = \frac{80 - 70}{10} = 1$

• We then use a special table to tell us what proportion of a normal distribution lies beyond this Z value

• This proportion is equal to the probability of drawing at random a measurement (X) greater than 80 kg

[Figure: Weights of Black Bears in Bunting Park; x-axis: Weights (kg), y-axis: No. of bears]


Z table

• Look for Z value on table (1.0)

• Find associated P value (0.1587)

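• The P value states that there is a 15.87% (0.1587 × 100) chance that a bear selected at random from the population of 1000 bears measured will have a weight greater than 80 kg; in other words, about 159 of the 1000 bears are expected to weigh more than 80 kg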
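The table lookup can be cross-checked in code. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed Z table; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 70, 10  # population mean and standard deviation (kg)
x = 80              # weight of interest (kg)

# Number of standard deviations between X and the mean.
z = (x - mu) / sigma

# Proportion of a standard normal distribution lying beyond z (upper tail).
p_upper = 1 - NormalDist().cdf(z)

# Expected number of bears heavier than 80 kg in a population of 1000.
expected_bears = 1000 * p_upper

print(f"Z = {z:.1f}, P = {p_upper:.4f}, expected bears = {expected_bears:.0f}")
```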


Probability distribution tables

• There are multiple probability tables for different types of statistical tests, e.g. Z-table, t-table, χ²-table

• Each allows you to associate a “critical value” with a “P value”

• This P value is used to determine the significance of statistical results
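For the Z table, the link between a critical value and a P value can be sketched with `statistics.NormalDist`. This covers the standard normal distribution only; the t and χ² tables are not in the Python standard library and would need another tool. The function name `z_critical` and the list of P values are assumptions made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_critical(p_upper):
    """Z value that has a proportion p_upper of the distribution beyond it."""
    return std_normal.inv_cdf(1 - p_upper)


# A few upper-tail P values and the critical Z values associated with them.
for p in (0.10, 0.05, 0.025, 0.01):
    print(f"P = {p:<5}  critical Z = {z_critical(p):.3f}")
```

Reading the relationship in the other direction, `1 - std_normal.cdf(z)` returns the P value for a given Z, which is what a printed Z table tabulates.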