Multidimensional scaling

MDS

And other permutation based analyses

MDS Aim

• Graphical representation of

dissimilarities between objects in as few

dimensions (axes) as possible

• Graphical representation is termed an

“ordination” in ecology

• Axes of graph represent new variables

which are summaries of original

variables

Haynes & Quinn (unpublished)

• Four sites along Morwell River

– site 1 upstream from planned sewage outfall

– sites 2, 3 and 4 downstream

– site 3 below fish farm

• Abundance of all species of

invertebrates recorded from 3 stations

at each site

• 12 objects (sampling units):

– 4 sites by 3 stations at each site

• 94 variables (species)

Do invertebrate communities (or assemblages) differ between stations and sites?

– Is Site 1 different from rest?

Multidimensional scaling

1. Set up a raw data matrix

Site/sample   Sp 1   Sp 2   Sp 3   Sp 4   Sp 5   etc.
S11             54      0      0      5      0
S12             37      1      0      4      0
S13             68      2      0      2      0
S21             60      0      0      0      1
S22             47      0      0      2      0
S23             60      0      0      0      0
etc.

2. Calculate a dissimilarity (Bray-Curtis) matrix

       S11    S12    S13    S21    S22    S23    etc.
S11   .000
S12   .203   .000
S13   .666   .652   .000
S21   .216   .331   .759   .000
S22   .328   .410   .796   .191   .000
S23   .336   .432   .796   .183   .054   .000
etc.
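Step 2 can be sketched in a few lines of Python. This is a minimal illustration that uses only the five species columns shown in step 1, so its values will not match the matrix above, which was computed from all 94 species. The names `bray_curtis`, `samples` and `matrix` are illustrative and do not come from any package.

```python
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two empty samples share no information; treat them as identical here.
    return numerator / denominator if denominator else 0.0

# First five species only, copied from the raw data matrix in step 1.
samples = {
    "S11": [54, 0, 0, 5, 0],
    "S12": [37, 1, 0, 4, 0],
    "S13": [68, 2, 0, 2, 0],
    "S21": [60, 0, 0, 0, 1],
    "S22": [47, 0, 0, 2, 0],
    "S23": [60, 0, 0, 0, 0],
}

# Lower triangle of the dissimilarity matrix, keyed by sample pair.
matrix = {
    (a, b): bray_curtis(samples[a], samples[b])
    for a, b in combinations(samples, 2)
}

for (a, b), d in matrix.items():
    print(f"{a} vs {b}: {d:.3f}")
```

Each value lies between 0 (identical composition) and 1 (no species in common).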

3. Decide on number of dimensions

(axes) for the ordination:

– suspected number of underlying

ecological gradients

– match distances between objects on plot

and dissimilarities between objects as

closely as possible

– more dimensions means better match

– usually between 2 and 4 dimensions

4. Arrange objects (eg. sampling units)

initially on ordination plot in chosen

number of dimensions

– starting configuration

– usually generated randomly

[Figure: Starting configuration. Ordination plot of sampling units on Axis I and Axis II, with symbols for Sites 1 to 4.]

5. Compare distances between objects on

ordination plot and Bray-Curtis

dissimilarities between objects

– strength of relationship measured by

Kruskal’s stress value

– measures “badness of fit” so lower values

indicate better match

– plot of distance against dissimilarity is called a Shepard plot
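The stress calculation in step 5 can be sketched as follows. The slides do not say which stress formula was used, so this assumes Kruskal's stress formula 1, with the fitted values (disparities) obtained by monotone regression of distance on dissimilarity using the pool-adjacent-violators method. The function names and the three-object example are hypothetical.

```python
import math
from itertools import combinations

def monotone_fit(values):
    """Least-squares non-decreasing fit (pool adjacent violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge blocks while their means are out of order.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def kruskal_stress(dissim, coords):
    """Stress formula 1 for a configuration, given a dissimilarity matrix."""
    pairs = list(combinations(range(len(coords)), 2))
    # Order the pairs by dissimilarity (the x-axis of the Shepard plot).
    pairs.sort(key=lambda p: dissim[p[0]][p[1]])
    dist = [math.dist(coords[i], coords[j]) for i, j in pairs]
    fitted = monotone_fit(dist)
    numerator = sum((d - f) ** 2 for d, f in zip(dist, fitted))
    denominator = sum(d * d for d in dist)
    return math.sqrt(numerator / denominator)

# Hypothetical three-object example.
dissim = [[0.0, 0.2, 0.7],
          [0.2, 0.0, 0.5],
          [0.7, 0.5, 0.0]]
good = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]  # distances rise with dissimilarity
bad = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0)]   # rank order of distances reversed

print(kruskal_stress(dissim, good))
print(kruskal_stress(dissim, bad))
```

Stress is zero whenever the distances are in the same rank order as the dissimilarities, which is why non-metric MDS depends only on ranks. Tied dissimilarities are left in input order here, which is a simplification.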

[Figure: Starting configuration (Axis I vs Axis II; Sites 1 to 4) beside its Shepard plot (Distance vs Dissimilarity). Stress = 0.394]

6. Move objects on ordination plot

iteratively by method of steepest

descent

– each step improves match between

dissimilarities and distances between

objects on ordination plot

– lowers stress value
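The iterative movement in step 6 can be illustrated with a simplified sketch. It does not reproduce the non-metric algorithm behind the slides: it minimises raw metric stress (squared difference between distance and dissimilarity) by plain gradient descent with a fixed step size, which is enough to show stress falling as objects move. The target matrix is the six-sample Bray-Curtis matrix from step 2; `descend`, `rate` and `steps` are illustrative names and values.

```python
import math
import random

def raw_stress(coords, target):
    """Sum of squared differences between distances and dissimilarities."""
    n = len(coords)
    return sum((math.dist(coords[i], coords[j]) - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def descend(target, dims=2, steps=300, rate=0.02, seed=1):
    """Move objects downhill on raw stress from a random start."""
    rng = random.Random(seed)
    n = len(target)
    coords = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    history = [raw_stress(coords, target)]
    for _ in range(steps):
        grad = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(coords[i], coords[j])
                if d == 0:
                    continue  # coincident points: no defined direction
                coef = 2 * (d - target[i][j]) / d
                for k in range(dims):
                    grad[i][k] += coef * (coords[i][k] - coords[j][k])
        coords = [[c - rate * g for c, g in zip(ci, gi)]
                  for ci, gi in zip(coords, grad)]
        history.append(raw_stress(coords, target))
    return coords, history

# Lower triangle of the Bray-Curtis matrix from step 2 (S11 ... S23).
lower = [[],
         [.203],
         [.666, .652],
         [.216, .331, .759],
         [.328, .410, .796, .191],
         [.336, .432, .796, .183, .054]]
target = [[0.0 if i == j else lower[max(i, j)][min(i, j)]
           for j in range(6)] for i in range(6)]

coords, history = descend(target)
print(f"start {history[0]:.3f}, end {history[-1]:.3f}")
```

Because the start is random, different seeds can end in different local minima, which is one reason MDS software is usually run from several starting configurations.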

[Figure: Ordination (Axis I vs Axis II) and Shepard plot (Distance vs Dissimilarity) after 20 iterations. Stress = 0.119]

7. Final configuration

• further moving of objects on ordination

plot cannot improve match between

dissimilarities and distances

• stress as low as possible

[Figure: Final configuration after 50 iterations. Ordination (Axis I vs Axis II) and Shepard plot (Distance vs Dissimilarity). Stress = 0.069]

Iteration history

Iteration   Stress
1           0.394
2           0.368
3           0.357
4           0.351
...         ...
20          0.119
...         ...
49          0.069
50          0.069

Stress of final configuration is 0.069

How low should stress be?

Clarke (1993) suggests:

• > 0.20 is basically random

• < 0.15 is good

• < 0.10 is ideal

– configuration is close to actual

dissimilarities

How many dimensions?

• Increasing no. of dimensions above 4

usually offers little reduction in stress

• 2 or 3 dimensions usually adequate to

get good fit (ie. low stress)

• 2 dimensions straightforward to plot

Lonhart (unpublished data)

• Effects of depth and piling location on

marine fouling assemblage

• Two pilings, four sides of each panel,

two depths, sampled 4 times

• 40 species in total recorded

• MDS to examine the effect of piling location and depth on the invertebrate community

– Does the community vary as a function of depth?

– Does the community vary as a function of piling location?

– Does the effect of depth on the community vary as a function of piling location?

• Bray-Curtis dissimilarity

• Non-metric MDS

• ANOSIM / PERMANOVA

• SIMPER

[Figure: MDS plot (Transform: Square root; Resemblance: S17 Bray-Curtis similarity; 2D Stress: 0.17), shown four times with samples coded by Date (2_22_2010, 3_05_2010, 3_18_2010, 4_02_2010), by Piling (8381, 8179), by Depth (Shallow, Deep), and by Piling and Depth combined (8381Shallow, 8381Deep, 8179Shallow, 8179Deep).]

Comparing groups in MDS

• 2 Piling locations

• 2 Depths

• 8 replicates per treatment combination (4

sides x 2 samples)

• Are sites significantly different in species

composition?

• Is there an ANOVA-like equivalent for

MDS?

Procedure 1: Analysis of similarities (ANOSIM)

• Uses (dis)similarity matrix

• Because dissimilarities are not normally distributed, uses ranks of pairwise dissimilarities

• Because dissimilarities are not independent of each other, uses randomization test rather than usual significance testing procedure

• Generates own test statistic (called R) by randomization of rank dissimilarities

• Available through PRIMER package

Lonhart ANOSIM

• Depth effect: R = 0.305, P = 0.001, so reject H0

– Significant difference between depths

• Piling location: R = 0.761, P = 0.001, so reject H0

– Significant difference between pilings

Permanova (permutation

ANOVA)

• Run just like an ANOVA

• Sums of Squares can be partitioned in

multivariate space (based on distances to

multidimensional centroids)

• P – values based on permutations of the

analysis
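The partitioning described above can be sketched for the simplest case. This is a one-factor illustration only (the Lonhart analysis is two-factor with an interaction). It assumes the standard identity that sums of squared distances to centroids can be computed directly from pairwise dissimilarities, so no centroid is ever calculated. The function names are hypothetical, and the example uses the six-sample Bray-Curtis matrix from the Morwell River slides with Site 1 and Site 2 as groups.

```python
import random
from itertools import combinations

def pseudo_f(dissim, groups):
    """One-way pseudo-F from a dissimilarity matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total SS: squared dissimilarities over all pairs, divided by N.
    ss_total = sum(dissim[i][j] ** 2
                   for i, j in combinations(range(n), 2)) / n
    # Within SS: the same quantity computed inside each group.
    ss_within = 0.0
    for g in labels:
        members = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dissim[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    df_among, df_within = len(labels) - 1, n - len(labels)
    return (ss_among / df_among) / (ss_within / df_within)

def permanova(dissim, groups, permutations=999, seed=1):
    """Permutation P-value: shuffle group labels, recompute pseudo-F."""
    rng = random.Random(seed)
    observed = pseudo_f(dissim, groups)
    labels = list(groups)
    count = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if pseudo_f(dissim, labels) >= observed:
            count += 1
    return observed, (count + 1) / (permutations + 1)

# Lower triangle of the Bray-Curtis matrix for S11 ... S23.
lower = [[],
         [.203],
         [.666, .652],
         [.216, .331, .759],
         [.328, .410, .796, .191],
         [.336, .432, .796, .183, .054]]
dissim = [[0.0 if i == j else lower[max(i, j)][min(i, j)]
           for j in range(6)] for i in range(6)]
groups = ["Site1"] * 3 + ["Site2"] * 3

f_value, p_value = permanova(dissim, groups)
print(f"pseudo-F = {f_value:.3f}, P(perm) = {p_value:.3f}")
```

With only six objects in two groups there are very few distinct relabellings, so the smallest attainable P-value is large; real designs need more replication.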

PERMANOVA table of results

Source           df    SS         MS       Pseudo-F   P(perm)   Unique perms
Depth             1    14884      14884    15.67      0.001     999
Piling            1    70878      70878    74.623     0.001     999
Depth x Piling    1    10558      10558    11.116     0.001     999
Res             124    1.1778E5   949.82
Total           127    2.141E5



[Figure: Interaction effect. The MDS plot (2D Stress: 0.17) shown three times, with samples coded by Piling (8381, 8179), by Depth (Shallow, Deep), and by Piling and Depth combined.]

Which variables (species) most

important?

• For MDS-type analyses, three methods:

– correlate individual variables (species abundances) with axis scores – like PCA loadings

– SIMPER (similarity percentages) to determine which species contribute most to Bray-Curtis dissimilarity

– CA (Correspondence Analysis) to simultaneously ordinate objects and species - biplots

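SIMPER (similarity percentages)

Bray-Curtis dissimilarity between objects j and k:

$$\delta_{jk} = \frac{\sum_{i=1}^{p} |y_{ij} - y_{ik}|}{\sum_{i=1}^{p} (y_{ij} + y_{ik})}$$

Note: $\sum$ is summing over each species, 1 to p.

The contribution of species i is:

$$\delta_i = \frac{|y_{ij} - y_{ik}|}{\sum_{h=1}^{p} (y_{hj} + y_{hk})}$$

The denominator is the same sum over all p species as in the Bray-Curtis dissimilarity, so the $\delta_i$ add up to $\delta_{jk}$.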

Simper results – comparing deep depths between Pilings

Groups 8381Deep & 8179Deep
Average dissimilarity = 77.47

                           8381Deep   8179Deep
Species                    Av.Abund   Av.Abund   Av.Diss   Diss/SD   Contrib%   Cum.%
Watersipora, live          11.34      0          11.58     1.63      14.94      14.94
Detritus                   3.28       13.34      10.7      1.7       13.81      28.75
Corynactis californica     0          7.53       7.68      1.15      9.92       38.67
Burgundy crust             0          6.66       6.79      1.04      8.77       47.44
Diplosoma listerianum      6.97       2.41       6.4       0.8       8.26       55.7
CaCO3                      9.13       8.16       6.06      1.43      7.82       63.52
Dead bryozoan              5.41       0.19       5.35      1.16      6.91       70.42
Orange bryozoan            5          0          5.1       0.83      6.59       77.01
Dead Watersipora           4.88       0          4.97      0.9       6.42       83.43
Ascidia ceratodes          0.09       4.91       4.95      0.83      6.39       89.82
Rhynchozoon (brwn bryo)    1          1.44       2.04      0.67      2.64       92.45

Are these results interpretable

graphically?



[Figure: MDS plot (2D Stress: 0.17) with bubbles scaled by abundance of Watersipora, live (scale 0 to 30), beside the same plot coded by Piling and Depth.]

Linking biota MDS to

environmental variables

• Are differences in species composition

related to differences in environmental

variables?

• Correlate MDS axis scores with

environmental variables

• BIO-ENV procedure - correlates

dissimilarities from biota with

dissimilarities from environmental variables

BIO-ENV procedure

• Samples × species abundances → Bray-Curtis dissimilarity matrix

• Samples × environmental variables → subsets of variables → Euclidean dissimilarity matrix

• Rank correlation (Spearman or weighted Spearman) between the two dissimilarity matrices

BIO-ENV correlations

• Exploratory rather than hypothesis testing

procedure.

• Tries to find best combination of

environmental variables, ie. combination

most correlated with biotic dissimilarities.

• A priori chosen correlations can be tested

with RELATE procedure - randomization

test of correlation.

Example

• Bristol Channel zooplankton

• 57 stations

• 25 species sampled

• Salinity measures taken at the same time

• Question: is the zooplankton community related to salinity?

[Table: Zooplankton community data (community matrix).]

[Figure: NMDS plot, Bristol Channel zooplankton. Non-metric MDS; Transform: Square root; Resemblance: S17 Bray-Curtis similarity; samples labelled by station number; 2D Stress: 0.1]

[Figure: NMDS plot with salinity bubbles, Bristol Channel zooplankton (bubble scale 1.8 to 9); 2D Stress: 0.1]

[Table: Salinity data (salinity matrix).]

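RELATE procedure

• Samples × species abundances → Bray-Curtis dissimilarity matrix

• Samples × environmental variables → all variables → Euclidean dissimilarity matrix

• RELATE the matrices: rank correlation (Spearman or weighted Spearman) between them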

[Figure: RELATE, Bristol Channel salinity group (1-9 in increasing salinity). Histogram of Rho from the permutations (x-axis: Rho, -0.2 to 0.8; y-axis: Frequency), with the observed Rho = 0.741 marked.]

Parameters

Correlation method: Spearman rank

Sample statistic (Rho): 0.741

Significance level of sample statistic: 0.1 % (=<0.001)

Number of permutations: 999

Number of permuted statistics greater than or equal to Rho: 0
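The logic behind these numbers can be sketched as below: Spearman rank correlation between the matching elements of two dissimilarity matrices, with significance from permuting sample labels in one matrix. This is a minimal sketch of the idea and not PRIMER's implementation. The biotic matrix is the six-sample Bray-Curtis matrix from the Morwell River slides, and the environmental values are invented purely for illustration. All function names are hypothetical.

```python
import math
import random
from itertools import combinations

def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def relate(bio, env, permutations=999, seed=1):
    """Rho between two matrices, with a permutation P-value."""
    n = len(bio)
    pairs = list(combinations(range(n), 2))
    bio_vec = [bio[i][j] for i, j in pairs]
    observed = spearman(bio_vec, [env[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    count = 0
    for _ in range(permutations):
        rng.shuffle(idx)  # relabel the samples of one matrix
        permuted = [env[idx[i]][idx[j]] for i, j in pairs]
        if spearman(bio_vec, permuted) >= observed:
            count += 1
    return observed, (count + 1) / (permutations + 1)

lower = [[], [.203], [.666, .652], [.216, .331, .759],
         [.328, .410, .796, .191], [.336, .432, .796, .183, .054]]
bio = [[0.0 if i == j else lower[max(i, j)][min(i, j)]
        for j in range(6)] for i in range(6)]

# Invented environmental values for the six samples (illustration only).
env_values = [2.0, 2.4, 5.1, 1.8, 1.1, 0.9]
env = [[abs(a - b) for b in env_values] for a in env_values]

rho, p = relate(bio, env)
print(f"Rho = {rho:.3f}, P = {p:.3f}")
```

With 999 permutations and none reaching the observed Rho, the P-value from this formula is 1/1000 = 0.001, which matches the 0.1% reported above.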

A more complicated example – linking multivariate

biological data to multivariate environmental data

• Biological data: Nematode species (>100)

abundance at 19 sites in Exe estuary

• Environmental:

– MPD: median particle diameter ("Med Part Diam" in the plots)

– % Org: percent organic matter

– WT: water table depth

– H2S: depth of hydrogen sulfide layer

– Sal: interstitial salinity

– Ht: shore height (intertidal range)

[Figure: Environmental NMDS, Exe estuary. Non-metric MDS; Normalise; Resemblance: D1 Euclidean distance; sites labelled 1 to 19; 2D Stress: 0.06]

[Figure: Biological NMDS, Exe nematodes (19 sites averaged over season). Non-metric MDS; sites labelled 1 to 19; 2D Stress: 0.05]

Linking Environment to Community

[Figure: Exe nematodes (19 sites averaged over season). The biological NMDS plot (2D Stress: 0.05) shown four times with bubbles scaled by Med Part Diam (0.2 to 2), Interstit Salinity (19 to 100), Dep Water Tab (2 to 20) and %Organics (0.8 to 8).]

Formally

• First: use RELATE to determine the relationship between the biological community and the environmental variables

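RELATE procedure

• Samples × species abundances → Bray-Curtis dissimilarity matrix

• Samples × environmental variables → all variables → Euclidean dissimilarity matrix

• Rank correlation (Spearman or weighted Spearman) between the two matrices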

[Figure: RELATE, Exe estuary. Histogram of Rho from the permutations (x-axis: Rho, -0.2 to 0.8; y-axis: Frequency), with the observed Rho = 0.791 marked.]

Parameters

Correlation method: Spearman rank

Sample statistic (Rho): 0.791

Significance level of sample statistic: 0.1 % (<0.001)

Number of permutations: 999

Number of permuted statistics greater than or equal to Rho: 0

Formally

• First: use RELATE to determine the relationship between the biological community and the environmental variables

• Second: use BIO-ENV to determine which environmental variables best fit the biological community

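BIO-ENV procedure

• Samples × species abundances → Bray-Curtis dissimilarity matrix

• Samples × environmental variables → subsets of variables → Euclidean dissimilarity matrix

• Rank correlation (Spearman or weighted Spearman) between the two matrices

• Select best model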

Best result for each number of variables

No.Vars Corr. Selections

1 0.676 Dep H2S layer

2 0.777 Dep H2S layer,Interstit Salinity

3 0.816 Med Part Diam,Dep H2S layer,Interstit Salinity

4 0.811 Med Part Diam,Dep H2S layer,%Organics,Interstit Salinity

5 0.804 Med Part Diam,Dep H2S layer,Shore height,%Organics,Interstit Salinity

6 0.791 Med Part Diam,Dep Water Tab,Dep H2S layer,Shore height,%Organics,Interstit Salinity
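A table like this one can be produced by a search of the following kind. This is a minimal sketch under stated assumptions: each environmental variable is normalised to zero mean and unit standard deviation, every subset is tried, and subsets are scored by the Spearman correlation between the biotic dissimilarities and the Euclidean distances. The data are synthetic, built so that variable `a` alone explains the biotic pattern, and every name here is hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def bioenv(bio_dissim, env):
    """Best (rho, subset) for each number of environmental variables."""
    n = len(bio_dissim)
    pairs = list(combinations(range(n), 2))
    bio_vec = [bio_dissim[i][j] for i, j in pairs]
    # Normalise each variable so units do not dominate the distance.
    z = {}
    for name, values in env.items():
        m, s = mean(values), pstdev(values)
        z[name] = [(v - m) / s for v in values]
    best = {}
    for size in range(1, len(env) + 1):
        for subset in combinations(env, size):
            env_vec = [math.sqrt(sum((z[v][i] - z[v][j]) ** 2
                                     for v in subset))
                       for i, j in pairs]
            rho = spearman(bio_vec, env_vec)
            if size not in best or rho > best[size][0]:
                best[size] = (rho, subset)
    return best

# Synthetic data: biotic dissimilarity depends only on variable "a".
env = {"a": [0, 1, 2, 3, 4], "b": [3, 0, 4, 1, 2]}
bio = [[abs(x - y) for y in env["a"]] for x in env["a"]]

for size, (rho, subset) in bioenv(bio, env).items():
    print(size, round(rho, 3), ", ".join(subset))
```

As in the table above, adding variables that carry no extra signal lowers the correlation, which is how the best subset can be smaller than the full set. The exhaustive search grows as 2^p, so it is only practical for a modest number of variables.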




Linking Environment to Community – model results

[Figure: The biological NMDS plot for the Exe nematodes (2D Stress: 0.05) shown three times with bubbles scaled by Med Part Diam (0.2 to 2), Interstit Salinity (19 to 100) and Dep H2S layer (2 to 20).]


Procedure 1: Analysis of similarities (ANOSIM)

Null hypothesis

Average of rank dissimilarities between objects within groups = average of rank dissimilarities between objects in different groups

$$\bar{r}_B = \bar{r}_W$$

No difference in species composition between groups

[Figure: within-group and between-group dissimilarities.]

Test statistic

R = (average of rank dissimilarities between objects in different groups) minus (average of rank dissimilarities between objects within groups), scaled:

$$R = \frac{\bar{r}_B - \bar{r}_W}{M/2}, \qquad M = \frac{n(n-1)}{2}$$

where n is the total number of objects.

• R between -1 and +1.

• Use randomization test to generate probability distribution of R when H0 is true.
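The statistic and its randomization test can be sketched directly from the formula. The example uses the six-sample Bray-Curtis matrix from the Morwell River slides with Site 1 and Site 2 as the two groups. The function names are hypothetical, and tied dissimilarities are given average ranks, which is an assumption about how ties are handled.

```python
import random
from itertools import combinations

def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def anosim_r(dissim, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dissim[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)  # M = n(n-1)/2
    r_b = sum(between) / len(between)
    r_w = sum(within) / len(within)
    return (r_b - r_w) / (m / 2)

def anosim(dissim, groups, permutations=999, seed=1):
    """Observed R and its permutation P-value under H0."""
    rng = random.Random(seed)
    observed = anosim_r(dissim, groups)
    labels = list(groups)
    count = 0
    for _ in range(permutations):
        rng.shuffle(labels)  # reallocate objects to groups at random
        if anosim_r(dissim, labels) >= observed:
            count += 1
    return observed, (count + 1) / (permutations + 1)

lower = [[], [.203], [.666, .652], [.216, .331, .759],
         [.328, .410, .796, .191], [.336, .432, .796, .183, .054]]
dissim = [[0.0 if i == j else lower[max(i, j)][min(i, j)]
           for j in range(6)] for i in range(6)]
groups = ["Site1"] * 3 + ["Site2"] * 3

r_value, p_value = anosim(dissim, groups)
print(f"R = {r_value:.3f}, P = {p_value:.3f}")
```

For this six-sample subset R works out to 5/9, about 0.556. R near 0 means between-group and within-group dissimilarities are similar, and R near 1 means all within-group pairs are more alike than any between-group pair.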



Which species discriminate groups of objects?

• Calculate the average of $\delta_i$ (the contribution of species i to the Bray-Curtis dissimilarity between two objects) over all pairs of objects between groups

– larger values indicate species contribute more to group differences

• Calculate standard deviation of $\delta_i$

– smaller values indicate species contribution is consistent across all pairs of objects

• Calculate ratio of average $\delta_i$ / SD($\delta_i$)

– larger values indicate good discriminating species between 2 groups
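The three quantities above can be sketched as follows, using the five species columns shown for Site 1 and Site 2 in the Morwell River raw data matrix. This is an illustration of the calculation only: the sample standard deviation is an assumption about how SD is defined, no transformation is applied, and the names `simper`, `site1` and `site2` are hypothetical.

```python
from statistics import mean, stdev

def simper(group_a, group_b):
    """Average contribution, SD and ratio per species, over all
    pairs made of one object from each group."""
    n_species = len(group_a[0])
    per_pair = [[] for _ in range(n_species)]
    for a in group_a:
        for b in group_b:
            total = sum(x + y for x, y in zip(a, b))  # Bray-Curtis denominator
            for i, (x, y) in enumerate(zip(a, b)):
                per_pair[i].append(abs(x - y) / total)
    results = []
    for i, contribs in enumerate(per_pair):
        avg = mean(contribs)
        sd = stdev(contribs)
        ratio = avg / sd if sd > 0 else float("nan")
        results.append((i, avg, sd, ratio))
    # Largest average contribution first.
    return sorted(results, key=lambda r: r[1], reverse=True)

# First five species only, from the raw data matrix (S11-S13, S21-S23).
site1 = [[54, 0, 0, 5, 0], [37, 1, 0, 4, 0], [68, 2, 0, 2, 0]]
site2 = [[60, 0, 0, 0, 1], [47, 0, 0, 2, 0], [60, 0, 0, 0, 0]]

results = simper(site1, site2)
overall = sum(r[1] for r in results)
for species, avg, sd, ratio in results:
    print(f"species {species + 1}: Av.Diss={avg:.4f} "
          f"Diss/SD={ratio:.2f} Contrib%={100 * avg / overall:.1f}")
```

The average contributions sum to the average Bray-Curtis dissimilarity between the two groups, which is why they can be reported as percentages in a SIMPER table.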
