Experimental design and analysis Graphical Exploration of Data Gerry Quinn & Mick Keough, 1998 Do not copy or distribute without permission of authors.

Page 1

Experimental design and analysis

Graphical Exploration of Data

Gerry Quinn & Mick Keough, 1998

Do not copy or distribute without permission of authors.

Page 2

Graphical displays

• Exploration
  – assumptions (normality, equal variances)
  – unusual values
  – which analysis?

• Analysis
  – model fitting

• Presentation/communication of results

Page 3

Space shuttle data

• NASA meeting Jan 27th 1986
  – day before launch of shuttle Challenger

• Concern about low air temperatures at launch

• Low temperatures affect O-rings that seal joints of rocket motors

• Previous data studied

Page 4

[Figure: O-ring failure vs temperature, pre 1986. x-axis: joint temp. (°F), 50 to 85; y-axis: number of incidents, 0 to 3.]

Page 5

[Figure: O-ring failure vs temperature. x-axis: joint temp. (°F), 50 to 85; y-axis: number of incidents, 0 to 3.]

Page 6

Checking assumptions - exploratory data analysis (EDA)

• Shape of sample (and therefore population)
  – is distribution normal (symmetrical) or skewed?

• Spread of sample
  – are variances similar in different groups?

• Are outliers present
  – observations very different from the rest of the sample?

Page 7

Distributions of biological data

Bell-shaped symmetrical distribution:

• normal

Skewed asymmetrical distribution:

• log-normal

• Poisson

[Figures: one probability curve for each shape, Pr(y) plotted against y.]

Page 8

Common skewed distributions

Log-normal distribution:

• μ proportional to σ

• measurement data, e.g. length, weight etc.

Poisson distribution:

• μ = σ²

• count data, e.g. numbers of individuals
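For a Poisson variable the mean equals the variance, and for a log-normal variable the standard deviation is proportional to the mean. A minimal sketch that checks both numerically, using only the Python standard library. The function names are hypothetical, and the log-normal formulas are the standard closed forms for a variable whose logarithm is normal with mean `m` and standard deviation `s`.

```python
import math


def poisson_mean_var(lam, k_max=400):
    """Mean and variance of a Poisson(lam) variable, summed from its pmf."""
    p = math.exp(-lam)  # Pr(Y = 0)
    mean = 0.0
    second_moment = 0.0
    for k in range(1, k_max + 1):
        p *= lam / k  # Pr(Y = k) obtained from Pr(Y = k - 1)
        mean += k * p
        second_moment += k * k * p
    return mean, second_moment - mean ** 2


def lognormal_mean_sd(m, s):
    """Mean and SD of Y when log(Y) is normal with mean m and SD s."""
    mean = math.exp(m + s ** 2 / 2)
    sd = mean * math.sqrt(math.exp(s ** 2) - 1)
    return mean, sd


# Poisson: the variance tracks the mean.
for lam in (2, 10, 40):
    mean, var = poisson_mean_var(lam)
    print(f"Poisson lam={lam}: mean={mean:.3f}, variance={var:.3f}")

# Log-normal: the ratio sd/mean stays constant as the mean changes.
for m in (0.0, 1.0, 2.0):
    mean, sd = lognormal_mean_sd(m, s=0.5)
    print(f"log-normal m={m}: mean={mean:.3f}, sd/mean={sd / mean:.3f}")
```

The truncation at `k_max=400` is an assumption that is safe for the small means used here; much larger means would need a larger limit.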

Page 9

Exploring sample data

Page 10

Example data set

• Quinn & Keough (in press)

• Surveys of 8 rocky shores along Point Nepean coast

• 10 sampling times (1988 - 1993)

• 15 quadrats (0.25 m²) at each site

• Numbers of all gastropod species and % cover of macroalgae recorded from each quadrat

Page 11

Frequency distributions

[Figure: two frequency histograms, panels NORMAL and LOG-NORMAL. x-axis: value of variable (class); y-axis: number of observations.]

Observations grouped into classes
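A frequency distribution is built by grouping observations into classes and counting how many fall in each. A minimal sketch, assuming non-negative values and equal-width classes starting at zero; the function name and the data are hypothetical.

```python
from collections import Counter


def frequency_table(values, class_width):
    """Count observations in classes [0, w), [w, 2w), ... for non-negative values."""
    counts = Counter(int(v // class_width) for v in values)
    n_classes = max(counts) + 1  # highest occupied class index, plus one
    return [
        (i * class_width, (i + 1) * class_width, counts.get(i, 0))
        for i in range(n_classes)
    ]


# Hypothetical counts per quadrat, invented for illustration only.
counts_per_quadrat = [0, 3, 4, 7, 8, 8, 12, 15, 21, 22, 38]

# Text histogram: one '#' per observation in each class.
for lower, upper, freq in frequency_table(counts_per_quadrat, class_width=10):
    print(f"{lower:>3} to <{upper:<3} {'#' * freq}")
```

The choice of class width changes the apparent shape of the distribution, so it is worth trying more than one.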

Page 12

Number of Cellana per quadrat

[Figure: frequency histogram. x-axis: number of Cellana per quadrat, 0 to 100; y-axis: frequency, 0 to 30.]

Survey 5, all shores combined

Total no. quadrats = 120

Page 13

Dotplots

[Figure: dotplot. x-axis: number of Cellana per quadrat, 0 to 40.]

• Each observation represented by a dot

• Number of Cellana per quadrat, Cheviot Beach survey 5

• No. quadrats = 15

Page 14

Boxplot

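[Figure: annotated boxplot, VARIABLE (y-axis) against GROUP (x-axis). Labelled parts: largest value, hinge, median, hinge, smallest value, spread, and outlier (*). Braces mark four sections, each holding 25% of values.]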
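The quantities drawn on a boxplot can be computed directly from a sample. A minimal sketch under two assumptions that the slide does not state: each hinge is the median of one half of the ordered data (the overall median is included in both halves when n is odd), and a point is flagged as an outlier when it lies more than 1.5 × spread beyond a hinge, which is a common convention. The function name and data are hypothetical.

```python
import statistics


def boxplot_summary(values, fence=1.5):
    """Median, hinges, spread and flagged outliers for one sample."""
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2  # each half includes the median when n is odd
    lower_hinge = statistics.median(ordered[:half])
    upper_hinge = statistics.median(ordered[n - half:])
    spread = upper_hinge - lower_hinge
    # Assumed rule: flag points beyond fence x spread from the nearer hinge.
    low_fence = lower_hinge - fence * spread
    high_fence = upper_hinge + fence * spread
    return {
        "median": statistics.median(ordered),
        "lower_hinge": lower_hinge,
        "upper_hinge": upper_hinge,
        "spread": spread,
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


# Hypothetical counts for nine quadrats, invented for illustration only.
print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 40]))
```

Statistical packages differ slightly in how they define hinges and fences, so a plotted box may not match these numbers exactly.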

Page 15

[Figure: four boxplot panels: 1. IDEAL, 2. SKEWED, 3. OUTLIERS (marked *), 4. UNEQUAL VARIANCES.]

Page 16

[Figure: boxplots of Cellana numbers in survey 5. x-axis: site (S, FPE, RR, SP, CPE, CB, LB, CPW); y-axis: number of Cellana per quadrat, 0 to 100.]

Page 17

Scatterplots

• Plotting bivariate data

• Value of two variables recorded for each observation

• Each variable plotted on one axis (x or y)

• Symbols represent each observation

• Assess relationship between two variables
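A correlation coefficient can serve as a numerical companion to a scatterplot, although it only measures linear association and does not replace looking at the plot. A minimal sketch of Pearson's r using the standard library; the function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five quadrats, invented for illustration only.
percent_cover = [5, 20, 35, 50, 65]
limpet_count = [30, 22, 18, 9, 4]
print(round(pearson_r(percent_cover, limpet_count), 3))
```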

Page 18

[Figure: scatterplot, Cheviot Beach survey 5, n = 15. x-axis: % cover of Hormosira per quadrat, 0 to 70; y-axis: number of Cellana per quadrat, 0 to 40.]

Page 19

Scatterplot matrix

• Abbreviated to SPLOM

• Extension of scatterplot

• For plotting relationships between 3 or more variables on one plot

• Bivariate plots in multiple panels on SPLOM
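A SPLOM holds one bivariate panel for each pair of variables, so k variables give k(k − 1)/2 distinct pairs. A minimal sketch that lists those pairs; the function name is hypothetical, and the variable names are taken from the Cheviot Beach example.

```python
from itertools import combinations


def splom_pairs(names):
    """Each distinct pair of variables that gets a bivariate panel."""
    return list(combinations(names, 2))


variables = ["CELLANA", "SIPHALL", "HORMOS"]
for first, second in splom_pairs(variables):
    print(f"panel: {first} vs {second}")
```

In a full square SPLOM each pair appears twice, once above and once below the diagonal, with the axes swapped.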

Page 20

SPLOM for Cheviot Beach survey 5

CELLANA - numbers of Cellana

SIPHALL - numbers of Siphonaria

HORMOS - % cover of Hormosira

n = 15 quadrats

Page 21

Transformations

• Improve normality.

• Remove relationship between mean and variance.

• Make variances more similar in different populations.

• Reduce influence of outliers.

• Make relationships between variables more linear (regression analysis).

Page 22

Log transformation

Lognormal → Normal

y′ = log(y)

Measurement data
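A log transformation pulls in a long right tail. A minimal sketch that compares skewness before and after the transformation. The data are hypothetical, chosen so that their logs are evenly spaced, and base 10 is an assumption (the base changes only the scale of the result). Values must be greater than zero.

```python
import math


def skewness(values):
    """Moment-based skewness: near 0 if symmetrical, > 0 for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical measurements, invented for illustration only.
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log10(y) for y in raw]  # requires y > 0

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {abs(skewness(logged)):.2f}")
```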

Page 23

Power transformation

Poisson → Normal

y′ = √(y), i.e. y′ = y^0.5; also y′ = y^0.25 (fourth root)

Count data
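For Poisson counts the variance rises with the mean, and a square root transformation largely removes that dependence. A minimal sketch that computes the variance of a transformed Poisson variable directly from its probabilities; the function name and the truncation limit `k_max` are assumptions.

```python
import math


def poisson_variance_of(transform, lam, k_max=400):
    """Variance of transform(Y) for Y ~ Poisson(lam), summed over k = 0..k_max."""
    p = math.exp(-lam)  # Pr(Y = 0)
    first = transform(0) * p
    second = transform(0) ** 2 * p
    for k in range(1, k_max + 1):
        p *= lam / k  # Pr(Y = k) obtained from Pr(Y = k - 1)
        t = transform(k)
        first += t * p
        second += t * t * p
    return second - first ** 2


for lam in (10, 20, 40):
    raw = poisson_variance_of(lambda k: k, lam)
    root = poisson_variance_of(math.sqrt, lam)
    fourth = poisson_variance_of(lambda k: k ** 0.25, lam)
    print(f"mean={lam}: var(y)={raw:.2f}, var(sqrt y)={root:.3f}, var(y^0.25)={fourth:.4f}")
```

The untransformed variance grows with the mean, while the variance of √y stays close to 0.25 for these means.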

Page 24

Arcsin transformation

Square → Normal

y′ = sin⁻¹(√(y))

Proportions and percentages
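The arcsin transformation takes the inverse sine of the square root of a proportion, so percentages must first be divided by 100. A minimal sketch; the function name and data are hypothetical, and the result is in radians.

```python
import math


def arcsine_sqrt(p):
    """Arcsin square-root transform of a proportion p in [0, 1], in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


# Hypothetical % cover values, invented for illustration only.
percent_cover = [0, 5, 25, 50, 75, 95, 100]
transformed = [arcsine_sqrt(pc / 100) for pc in percent_cover]
print([round(t, 3) for t in transformed])
```

The transformation stretches values near 0 and 1 relative to those in the middle of the range.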

Page 25

Outliers

• Observations very different from rest of sample - identified in boxplots.

• Check if mistakes (e.g. typos, broken measuring device) - if so, omit.

• Extreme values in skewed distribution - transform.

• Alternatively, do analysis twice - outliers in and outliers excluded. Worry if influential.
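Running an analysis with and without suspected outliers shows how influential they are. A minimal sketch that compares the mean and standard deviation both ways; the function name and data are hypothetical, and a real analysis would compare whatever test or model is being used.

```python
import statistics


def compare_with_and_without(values, suspects):
    """Mean and SD of the sample with all values, and with suspects excluded."""
    kept = [v for v in values if v not in suspects]
    return {
        "all": (statistics.mean(values), statistics.stdev(values)),
        "excluded": (statistics.mean(kept), statistics.stdev(kept)),
    }


# Hypothetical counts, invented for illustration only; 40 is the suspect value.
result = compare_with_and_without([4, 5, 5, 6, 7, 8, 40], suspects={40})
for label, (mean, sd) in result.items():
    print(f"{label:>8}: mean={mean:.2f}, sd={sd:.2f}")
```

If the two sets of results lead to different conclusions, the outlier is influential and needs further attention.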

Page 26

Assumptions not met?

• Check and deal with outliers

• Transformation
  – might fix non-normality and unequal variances

• Nonparametric rank test
  – does not assume normality
  – does assume similar variances
  – Mann-Whitney-Wilcoxon
  – only suitable for simple analyses
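The Mann-Whitney-Wilcoxon test is based on how often a value in one sample exceeds a value in the other. A minimal sketch of the U statistic only, with ties counted as one half; the function name is hypothetical. It does not compute a P value, for which a statistics package (for example `scipy.stats.mannwhitneyu`) would normally be used.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistics (U_a, U_b) from all pairwise comparisons of two samples."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0  # a ranks above b
            elif a == b:
                u_a += 0.5  # a tie counts as one half
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical samples, invented for illustration only.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
print(mann_whitney_u([1, 4], [2, 3]))
```

A U value near zero for one sample means its values are almost all smaller than those of the other sample.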

Page 27

Category or line plot

[Figure: two line plots of mean number of Cellana per quadrat (y-axis, 0 to 30) against survey (x-axis, 1 to 10). The second plot carries the legend Cheviot Beach, Sorrento.]
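Each point on a category or line plot of this kind is a group mean, here the mean count per quadrat for one site in one survey. A minimal sketch of that grouping step; the function name, site labels, and counts are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_by_site_and_survey(records):
    """Mean count for each (site, survey) from (site, survey, count) records."""
    grouped = defaultdict(list)
    for site, survey, count in records:
        grouped[(site, survey)].append(count)
    return {key: mean(counts) for key, counts in grouped.items()}


# Hypothetical quadrat records, invented for illustration only.
records = [
    ("Site A", 1, 12), ("Site A", 1, 18),
    ("Site A", 2, 9), ("Site A", 2, 11),
    ("Site B", 1, 3), ("Site B", 1, 5),
]
for (site, survey), value in sorted(mean_by_site_and_survey(records).items()):
    print(f"{site}, survey {survey}: mean = {value}")
```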