
Costa et al. Graph theory for species persistence

A new interpretation of graph theory measures in evaluating marine metapopulations persistence: The study case of soft-bottom polychaetes in the Gulf of Lion.

Andrea Costa 1,2,*, Andrea M. Doglioli 1,2, Katell Guizien 3, Anne A. Petrenko 1,2

1 - Aix Marseille Université, CNRS/INSU, IRD, Mediterranean Institute of Oceanography (MIO), UM 110, 13288 Marseille

2 - Université de Toulon, CNRS/INSU, IRD, Mediterranean Institute of Oceanography (MIO), UM 110, 83957 La Garde

3 - Laboratoire d’Ecogeochimie des Environnements Benthique, CNRS, Université Paris VI, UMR8222, Av. du Fontaule - F-66651 Banyuls-sur-Mer (France)

* [email protected]

keywords

Connectivity, Persistence, Graph Theory, Shortest Cycles, Betweenness, Metapopulation Model, Directed Weighted Bridging Centrality, Modularity, Clustering

December 22, 2015


Abstract

Herein we test a graph theory analysis of hydrological connectivity as a tool for assessing the demographic persistence of a soft-bottom polychaete metapopulation in the Gulf of Lion (NW Mediterranean Sea). As a preliminary step of the graph theory analysis, we introduce a metric for the node-to-node distance in graphs based on connectivity matrices containing larval transfer probabilities. This metric ensures a physically meaningful interpretation of shortest paths and, consequently, of betweenness. Then, we assess and, where necessary, re-evaluate the interpretation of two classical graph theory measures (betweenness and modularity) in the context of species persistence. New measures (directed weighted bridging centrality, minimum cycles identification) are also derived and evaluated. In particular, modularity and bridging centrality are shown to characterize clusters of interconnected nodes (i.e., subpopulations) and to highlight rescue mechanisms and their source sites. Further, we show that shortest cycles indicate the sites ensuring the species’ regional persistence, whereas betweenness appears less relevant for persistence than suggested in previous literature. The new interpretation of graph theory measures proposed here is supported by a detailed comparison with a metapopulation model.

Introduction

Losses of biodiversity at sea due to deleterious effects of natural phenomena and human activities (e.g., global warming, habitat destruction, overfishing) are currently expected to be mitigated by the implementation of Marine Protected Areas (MPAs); see LeCorre et al. (2012) and Duraiappah and Shahid (2005) for comprehensive discussions of this topic. The basic assumption of this approach is that, if a carefully chosen portion of the whole marine biological network is protected, the network will not be subject to breakdowns generating critical losses of both biodiversity and individual abundances in the marine ecosystem. Indeed, if a subset of populations is sufficiently interconnected, forming a persistent sub-network that supplies other populations, the regional maintenance of the species is ensured. Thus, the challenge is to identify the sub-networks that could permit species persistence in a whole habitat by minimizing mortality in only some areas within it. Equivalently, the problem is to identify the minimal sub-network that maximises the connectivity of the whole network. In this way it will be possible to protect the network while minimizing the costs of MPA implementation (see Andrello et al., 2014, for example).

However, there are two major problems in identifying key sites for the conservation of a set of species distributed in disjunct sites between which they may disperse. First, great dissimilarities in dispersal ability among species translate into very different connection patterns (see Siegel et al., 2003, for example). Second, species interactions with the environment or among themselves in the various sites also affect the persistence of species in the network differently (MacArthur and Levins, 1967). These difficulties have led to different methodologies for tackling the problem, each with its own advantages and shortcomings (see Kool et al., 2013 and Lagabrielle et al., 2014). Among these techniques, graph theory has been increasingly employed in conservation studies (Moilanen, 2011) because of its ability to capture the essential features of a network. For the same reason, it has been adapted to a vast range of contexts. It was first introduced in ecology by Urban and Keitt (2001) in a study of landscape connectivity; Schick and Lindley (2007) then extended its use to riverine networks. Treml et al. (2008) applied it to the study of the connections between marine reefs. Rozenfeld et al. (2008) used it to infer gene flux in networks of marine populations. Jacobi et al. (2012) exploited graph theory for the identification of marine sub-populations. Andrello et al. (2013) applied it to the estimation of connectivity among MPAs.

In particular, many different graph theory measures have been proposed to highlight nodes of interest in different networks and contexts (see Rayfield et al., 2010, and Galpern et al., 2011, for exhaustive reviews). Because graph theory identifies well-connected networks with networks that allow an efficient transfer within them, graph theory measures highlighting nodes important for efficient transfer (e.g., betweenness) have been proposed in the literature as relevant for conservation (e.g., Treml et al., 2008). This point is not free of controversy. For example, the equivalence between efficient transfer and conservation relevance remains unproven (sensu Moilanen, 2011; and Lagabrielle et al., 2014).

Herein we revise the interpretation of some graph theory measures from a conservation point of view in the framework of marine connectivity. To do so, we compare graph theory measures with the outputs of a metapopulation model, which links local demography and regional dispersal. Metapopulation models have been extensively used to investigate the conditions of species persistence (Caswell, 2001; Hastings and Botsford, 2005). In particular, in metapopulation models, well-connected networks are networks ensuring the regional persistence of the species. It follows that, if we can find which graph theory measure best reproduces the information provided by metapopulation models, we can clarify the relevance of the different graph theory measures for conservation issues.

Our study case consists in identifying the sites important for the persistence of a soft-bottom polychaete metapopulation in the Gulf of Lion (GoL); see Figure 1. The GoL was selected as the study case because of the numerous physical and biological studies already performed in this area, which can be used to interpret and validate our results. Moreover, recent studies (Rossi et al., 2014) support the choice of a spatial scale of the size of the GoL as appropriate for studying the hydrodynamic connectivity properties of month-long larval dispersal. The GoL is located in the north-western Mediterranean Sea and is characterized by a large continental margin dominated by a soft bottom, which forms the habitat of uniform polychaete assemblages in the 10 to 30 m depth range. Its hydrodynamics is complex and highly variable (Millot, 1990). Depending on wind forcing, currents in the study zone can be either eastward or south-westward (Estournel et al., 2003; Petrenko et al., 2008). The circulation is strongly influenced by the Northern Current, which constitutes an effective dynamical barrier blocking coastal waters on the continental shelf (Petrenko, 2003) and delimits the regional scale of hydrodynamic connectivity. Exchanges between the GoL and offshore waters are mainly induced by processes associated with the Northern Current (Petrenko et al., 2005).

As reference for the metapopulation model analysis we use the study by Guizien et al. (2014). Some essential details of the model can be found in Supplementary Materials A. Hydrodynamic connectivity was quantified by the larval transfer probability between 32 reproductive sites along the shore of the GoL (Figure 1). These sites cover a substantial part of the habitat available in the GoL for soft-bottom polychaetes. The metapopulation model parameters were tuned for polychaetes on the basis of the review by McHugh and Fong (2002). Ad hoc metapopulation simulations based on a threat scenario were used to rank four targeted sites in the GoL metapopulation model. The scenario aimed at quantifying the resilience of the metapopulation to habitat losses due to anthropogenic pressure around each of the four principal commercial ports in the GoL. Vulnerability was thus ranked by determining the number of sites that must become unsuitable, starting from each port and proceeding symmetrically around it, to cause the metapopulation to crash. Evidence of a rescue of the sites located in the western part of the study area by the sites in the eastern part was also provided by studying the spatial distribution of polychaete population density.

In the present study we focus on how graph theory is applied. Our principal concern is the validity of some choices of node-to-node metric made in previous literature. We propose a metric that provides physically meaningful results from graph theory analysis when analysing connectivity matrices based on larval transfer probabilities.

The paper is organized as follows. In the Materials section we discuss the main characteristics of the common input for the graph theory analysis and the metapopulation model: 20 connectivity matrices obtained from Lagrangian dispersal simulations. In the Procedures section we summarize the graph theory measures we tested and introduce a new metric for the node-to-node distance in graphs built on current-based connectivity matrices. In the Assessment section we present the systematic analysis of the hydrological connectivity matrices with graph theory measures, together with its interpretation and explanation in the light of oceanographic structures.

Materials

To allow comparison with the results of the metapopulation modelling study of Guizien et al. (2014), the same input connectivity matrices obtained from Lagrangian dispersal simulations were used in the present study.

The Lagrangian simulations used a three-dimensional circulation model (see Marsaleix et al., 2006) with a horizontal resolution of 750 m. Spawning was simulated by releasing 30 particles in the center of each of 32 sites along the shore of the GoL, on the 30 m isobath, every hour from January 5 until May 16 in 2004 and 2006. The final positions of larvae after three, four and five weeks were processed to compute the proportion of larvae coming from an origin site and arriving at a settlement site. Connectivity matrices were then built for ten consecutive 10-day spawning periods in each year and for each of three different pelagic larval durations (3, 4 and 5 weeks), for a total of 20 matrices (numbered from #1 to #20).

It is important to note that the values of the connectivity matrices depend strongly on the circulation present in the Gulf during the period of the dispersal simulation. The typical circulation of the Gulf of Lion is a westward current regime (Figure 1). This was the case for matrices #7, #11, #12, #15 and #17. Other types of circulation were also present in this study. In particular, matrix #1 was obtained after a period of reversed (eastward) circulation, which is less frequent than the westward circulation (Petrenko et al., 2008). Matrices #14, #10 and #13 correspond to a circulation pattern with an enhanced recirculation in the centre of the gulf. Finally, matrices #2, #3, #5, #6, #8, #9, #14, #16, #18, #19 and #20 correspond to a rather mixed circulation with no clear pattern. A geographic representation of some connectivity matrices is given in Figure 2. It shows simultaneously: (i) the geographic distribution of the sites; (ii) the geographic direction of the connectivity by advection $a_{ij}$ (with the arrows pointing in the $i \to j$ direction); and (iii) for each pair of sites, the difference between the probabilities of going from one site to the other and vice versa, indicated by the different colors of the arrows. For clarity, connectivity values lower than 2/3 of the maximum are not represented. When the probabilities in both the $i \to j$ and $j \to i$ directions are high, and hence plotted, each arrow reaches only the midpoint between the nodes. This representation captures the circulation patterns: in Figure 2a (matrix #7), there are more arrows in the east-to-west direction, and these are almost always stronger than the corresponding west-to-east ones, as expected in a case of westward circulation. The opposite case is represented in Figure 2b (matrix #1), dominated by an eastward circulation pattern. As we can infer from the high number of arrows without a predominant direction in Figure 2c, matrix #10 is characterized by a recirculation pattern in the center of the GoL.

Procedures

Mathematically speaking, a graph $G$ is a pair of sets $(V, E)$, where $V$ is the set of nodes and $E$ is the set of edges. The set $V$ represents the collection of objects under study; two objects are linked by an edge when a relation of interest holds between them. When the relation is symmetric, the graph is said to be ‘undirected’; otherwise it is ‘directed’. An example of an undirected graph in the context of biological networks is the genetic distance among populations used in Rozenfeld et al. (2008), while an example of a directed graph is the probability of connection due to the current field between two zones of the sea, as in Rossi et al. (2014). If every existing edge has the same importance as the others, the graph is said to be ‘binary’: an edge either exists or does not. If each edge has a specific relative importance, a weight can be associated with it and the graph is then called ‘weighted’. The total weight of the connections of a node $i \in V$ is called its strength $k(i)$. In an undirected binary graph, this is equal to the number of edges incident on the node. In a directed graph, it is possible to distinguish between in-strength and out-strength. The first is the sum of the values of the edges terminating in the node, $k_{in}(i) = \sum_j a_{ji}$, while the second is the sum of the values of the edges leaving it, $k_{out}(i) = \sum_j a_{ij}$, with $j \in V$ and $i \neq j$. Here the values $a_{ij}$ are the entries of the connectivity matrix, which stores the values of the edges from node $i$ to node $j$. The density or connectance $\rho$ of a graph is defined as the ratio between the number of existing edges $E$ and its maximum possible value $N(N-1)$, where $N$ is the number of nodes. For a directed graph we have $\rho = \frac{E}{N (N-1)}$. The network strength is defined as the sum of all the elements of a connectivity matrix. The number of non-null elements of a connectivity matrix is the number of matrix entries with a value greater than zero.
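The definitions above can be illustrated with a minimal sketch in plain Python. The function name `strengths_and_density` and the 3-site matrix `A` are hypothetical and chosen only for illustration; the sketch assumes that the diagonal is excluded from the strengths and from the edge count (as implied by $i \neq j$), while the network strength sums every matrix element.

```python
def strengths_and_density(a):
    """Return in-strength, out-strength, density and network strength of a
    directed weighted graph stored as a square matrix a[i][j] (edge i -> j)."""
    n = len(a)
    # Out-strength: sum of the edges leaving node i (diagonal excluded).
    k_out = [sum(a[i][j] for j in range(n) if j != i) for i in range(n)]
    # In-strength: sum of the edges terminating in node i (diagonal excluded).
    k_in = [sum(a[j][i] for j in range(n) if j != i) for i in range(n)]
    # Existing edges: off-diagonal entries greater than zero.
    n_edges = sum(1 for i in range(n) for j in range(n)
                  if i != j and a[i][j] > 0)
    density = n_edges / (n * (n - 1))
    # Network strength: sum of all the matrix elements.
    network_strength = sum(sum(row) for row in a)
    return k_in, k_out, density, network_strength


# Hypothetical 3-site connectivity matrix (illustrative values only).
A = [
    [0.5, 0.2, 0.0],
    [0.1, 0.3, 0.4],
    [0.0, 0.0, 0.6],
]
k_in, k_out, rho, total = strengths_and_density(A)
print(k_in, k_out, rho, total)
```

In this toy matrix three of the six possible directed edges exist, so the density is one half.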

In the present study, we deal with directed weighted graphs. The nodes of our graphs represent the 32 sites used in the metapopulation model, while the edges represent the non-null probability that a Lagrangian particle released in one site is transported to another site after a time corresponding to the larval duration.

In a connected directed unweighted graph (i.e., a directed unweighted graph with no disconnected parts), it is possible to define the shortest path $\sigma_{l,j}$ connecting two nodes $l \in V$ and $j \in V$ as the shortest possible alternating sequence of nodes and edges, beginning with node $l$ and ending with node $j$, such that each edge connects the preceding node with the succeeding one. The definition can be extended to directed weighted graphs: the shortest path is the one with the lowest cost between two nodes. The most frequent choice for the cost of a path is the sum of its edges’ weights. Nonetheless, other alternatives are possible and are discussed in more detail at the end of this section (see Subsection ‘A new metric for node-to-node distance’). The definition of a widely used centrality measure called betweenness, $BC(i)$ with $i \in V$, is based on the concept of shortest path. The betweenness estimates the relative importance of a node $i$ within a graph as the fraction of existing shortest paths $\sigma_{l,j}$ that pass through this node, $\sigma_{l,j}(i)$:

$$BC(i) = \frac{\sum_{l,j} \sigma_{l,j}(i)}{\sum_{l,j} \sigma_{l,j}} \tag{1}$$
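As an illustration of Equation (1), the following is a minimal sketch for the simplest case of a directed unweighted graph, where path length is the number of edges. It is not the implementation used in the study: the function names `bfs_counts` and `betweenness` are hypothetical, and the shortest paths are counted with a breadth-first search rather than with a weighted algorithm.

```python
from collections import deque


def bfs_counts(adj, source):
    """Hop distance and number of shortest paths from `source` to every node."""
    n = len(adj)
    dist = [None] * n
    sigma = [0] * n
    dist[source], sigma[source] = 0, 1
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if v == u or adj[u][v] <= 0:
                continue  # no edge u -> v (self-loops ignored)
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness(adj):
    """Fraction of all shortest paths passing through each node, as in Eq. (1)."""
    n = len(adj)
    res = [bfs_counts(adj, s) for s in range(n)]
    total = 0
    through = [0] * n
    for l in range(n):
        dist_l, sigma_l = res[l]
        for j in range(n):
            if j == l or dist_l[j] is None:
                continue
            total += sigma_l[j]
            for i in range(n):
                if i in (l, j) or dist_l[i] is None:
                    continue
                dist_i, sigma_i = res[i]
                # i lies on a shortest l -> j path when the distances add up.
                if dist_i[j] is not None and dist_l[i] + dist_i[j] == dist_l[j]:
                    through[i] += sigma_l[i] * sigma_i[j]
    return [t / total if total else 0.0 for t in through]


# Hypothetical chain of four sites: 0 -> 1 -> 2 -> 3.
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
bc = betweenness(chain)
print(bc)
```

In the chain, six shortest paths exist in total and two of them cross each of the inner nodes, so the two inner nodes share the same non-zero betweenness while the end nodes score zero.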

The concept of cycle (see, for example, Barrat et al., 2008, for an introduction), despite its simplicity, turns out to be useful in the study of multi-generational species persistence. Cycles are defined as those paths that, starting from a node $i \in V$, return to node $i$ itself after a certain number $L$ of steps. In order to separate the effect of particles remaining at the same site from that of particles leaving the site and coming back, we only consider cycles with $L \geq 2$. One of the essential requisites for ensuring the persistence of a species in a given zone is a high probability that larvae return home after a certain number of generations (see Hastings and Botsford, 2005, for details). This means that the shorter the cycle starting from a given node, the more likely the site is to be important for persistence. In fact, in this case, the survival of the site would be largely independent of the import of larvae from other sites. Thus it can act as a source in our network (Hastings and Botsford, 2005).

The main practical problem of this kind of analysis is the generally overwhelming computational power required. We used an algorithm that recursively finds all the possible cycles for every node of the network, thus involving a complexity of order $(N-1)^{L-1}$. Indeed, our analysis was feasible only because the number of nodes in our network ($N = 32$) is small enough to make the problem tractable with easily accessible computational facilities. Nonetheless, we were constrained to limit $L$ to 5; hence $L$ ranges between 2 and 5.
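A recursive search of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm used in the study: the function name `shortest_cycle_lengths` and the 5-site matrix are hypothetical, only simple cycles (no node visited twice) are explored, and the sketch returns the length of the shortest cycle per node instead of listing every cycle.

```python
def shortest_cycle_lengths(a, max_len=5):
    """For each node, length of the shortest cycle (2 <= L <= max_len) that
    leaves the node and returns to it, or None if no such cycle exists."""
    n = len(a)

    def search(start, current, visited, length):
        best = None
        for nxt in range(n):
            if nxt == current or a[current][nxt] <= 0:
                continue  # no edge, or self-loop (larvae staying on site)
            if nxt == start:
                found = length + 1  # the closing edge completes a cycle
            elif nxt not in visited and length + 1 < max_len:
                found = search(start, nxt, visited | {nxt}, length + 1)
            else:
                found = None
            if found is not None and (best is None or found < best):
                best = found
        return best

    return [search(i, i, {i}, 0) for i in range(n)]


# Hypothetical 5-site connectivity matrix (illustrative values only).
A = [
    [0.0, 0.3, 0.0, 0.0, 0.0],
    [0.2, 0.0, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.1],
    [0.0, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9],
]
print(shortest_cycle_lengths(A))
```

Sites 0 and 1 exchange larvae directly, sites 2 and 3 are reached again only through a three-step loop, and site 4 retains larvae locally but never receives them back from elsewhere, so it has no cycle in this sense.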

A widely used method for identifying clusters in physical networks is the maximum modularity criterion first introduced by Newman and Girvan (2004). Modularity $Q$ is defined, up to a multiplicative constant, as the difference between the number of edges falling within given groups of nodes and its expected value in a network that conserves the degree values but has randomly placed edges (further details can be found in Newman, 2006). The values of modularity can be either positive or negative, with positive values indicating the possible presence of community structure. We can therefore investigate the community structure of a network by looking for the divisions of the network associated with a maximum value of modularity. Given a network, let $c_i$ be the community to which node $i$ is assigned. For a directed weighted graph the modularity assumes the form (see Nicosia et al., 2009, for details):

$$Q = \frac{1}{m} \sum_{i,j \in V} \left[ a_{ij} - \frac{k^{out}_i k^{in}_j}{m} \right] \delta(c_i, c_j) \tag{2}$$

where $k_i$ and $k_j$ are the degrees of the nodes $i$ and $j$ (taken as the out-strength of $i$ and the in-strength of $j$ in Equation 2), $m = \sum_i k_i$, and $\delta(c_i, c_j)$ is the Kronecker $\delta$-function.
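Equation (2) can be evaluated for a given division of the network with a few lines of Python. The sketch below is illustrative only: the function name `directed_modularity` and the toy matrix are hypothetical, and it assumes that $m$ is the total weight of the matrix (the sum of all out-strengths) and that diagonal entries are included in the strengths.

```python
def directed_modularity(a, communities):
    """Modularity Q of Eq. (2) for a directed weighted graph a[i][j] and a
    list `communities` giving the community label of each node."""
    n = len(a)
    # Assumption: m is the total weight, i.e. the sum of all out-strengths.
    m = sum(sum(row) for row in a)
    k_out = [sum(a[i]) for i in range(n)]
    k_in = [sum(a[j][i] for j in range(n)) for i in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:  # Kronecker delta
                q += a[i][j] - k_out[i] * k_in[j] / m
    return q / m


# Hypothetical network made of two separate pairs of mutually connected sites.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
q_two_clusters = directed_modularity(A, [0, 0, 1, 1])
q_one_cluster = directed_modularity(A, [0, 0, 0, 0])
print(q_two_clusters, q_one_cluster)
```

The division into the two obvious pairs gives a positive modularity, whereas putting every node in a single community gives zero, which is why a maximum of $Q$ is sought when looking for clusters.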

Exploiting a reformulation of modularity in matrix formalism, it is possible to recursively explore the possible divisions of a network in order to identify the one that maximizes the modularity value without requiring excessive computational power (see Newman, 2006, for details). One drawback of the algorithm is an intrinsic variability, so that the results are not always identical between different runs of the analysis. For example, certain nodes can be assigned to different clusters without changing the maximum value of $Q$. This inconvenience can be bypassed by running the analysis multiple times and taking as the best division the one that is found most frequently. In the present work we ran the analysis $1 \cdot 10^4$ times on each of the 20 connectivity matrices, for a total of $2 \cdot 10^5$ runs.
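The selection of the most frequently found division can be sketched as below. This is a hypothetical helper rather than the procedure used in the study: the names `canonical` and `most_frequent_partition` are assumptions, and the relabelling step is added because two runs may give the same division with different community labels.

```python
from collections import Counter


def canonical(partition):
    """Relabel communities in order of first appearance, so that two runs
    giving the same division with different labels compare equal."""
    relabel = {}
    return tuple(relabel.setdefault(c, len(relabel)) for c in partition)


def most_frequent_partition(partitions):
    """Return the most frequently found division and its number of occurrences."""
    counts = Counter(canonical(p) for p in partitions)
    return counts.most_common(1)[0]


# Hypothetical outcome of three runs on a 4-node network.
runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 1]]
best, occurrences = most_frequent_partition(runs)
print(best, occurrences)
```

Here the first two runs describe the same division under swapped labels, so that division is counted twice and retained.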

In order to extract all possible information from the connectivity matrices about the role played by the different sites, we also used the bridging centrality measure $C_{BR}$. This measure was first proposed by Hwang et al. (2008) for undirected unweighted graphs. For our analysis we reformulated it in order to extend its use to directed weighted graphs.

Bridging centrality highlights those nodes that connect different clusters of a network (see Hwang et al., 2008). It is derived both from the betweenness value of a node and from the bridging coefficient, which accounts for the probability of leaving the direct neighbourhood of the node when starting from one of the nodes composing it. Intuitively, nodes with a high number of such neighbourhood-leaving edges fall on the boundary of clusters. In Hwang et al. (2008), for a node $i \in V$, the bridging coefficient is defined as:

$$\Psi_{uu}(i) = \frac{1}{k(i)} \sum_{v \in N(i)} \frac{\Delta(v)}{k(v) - 1} \tag{3}$$

where $k(i)$ is the strength of the node $i \in V$ and $N(i)$ is the direct neighbourhood of $i$, that is, the set of nodes reachable from $i$ in one step. $\Delta(v)$ is the out-strength of a node $v \in N(i)$ once the edges going from $v$ to other nodes in $N(i)$ have been deleted.

We propose a generalization of the bridging coefficient to directed weighted graphs that accounts for the weight of the edges and checks which edges actually leave the neighbourhood of the node. To this end, we correct the out-strength $\Delta(v)$ via the term $-a_{vi}$ and the strength of $v$ via the term $-(a_{iv} + a_{vi})$. Note that, for this calculation, all the terms $a_{vv}$ on the diagonal of the connectivity matrix are irrelevant. Hence, in the directed weighted case, we redefine the bridging coefficient as:

$$\Psi_{dw}(i) = \frac{1}{k_{tot}(i)} \sum_{v \in N(i)} \frac{\Delta(v) - a_{vi}}{k_{tot}(v) - (a_{iv} + a_{vi})} \tag{4}$$

where $k_{tot}(i) = k_{in}(i) + k_{out}(i)$ is the strength of the node $i \in V$. In this way, we retain both the information on the flux of information through a node (given by the betweenness) and the topological information on the position of this node relative to clusters (given by the bridging coefficient). In fact, a node falling on the border of a cluster and channelling a high flux of information will have both a high bridging coefficient and a high betweenness. As a result, the removal of such a high bridging centrality node would have a much more disruptive effect than the removal of a node having only a high betweenness or only a high bridging coefficient (see Hwang et al., 2008, for an analysis and discussion of this phenomenon in the undirected case). An important aspect to consider when calculating the betweenness centrality and the bridging coefficient of a node is their different orders of magnitude. While the first is normalized to one, the second is not: its value depends upon the particular metric used to define the distance between the nodes. In order to give the two parameters equal importance in characterizing the centrality of a node, we follow the suggestions of Hwang et al. (2008) and (i) calculate the betweenness centrality and the bridging coefficient for each node, (ii) rank the nodes on the basis of their betweenness and of their bridging coefficient values, and (iii) calculate the bridging centrality as:

$$C_{BR}(i) = \Gamma_{BC}(i) \cdot \Gamma_{\Psi}(i) \tag{5}$$

where $\Gamma_{BC}(i)$ is the rank of node $i$ in the betweenness vector and $\Gamma_{\Psi}(i)$ is the rank of node $i$ in the bridging coefficient vector. To summarize, bridging centrality allows us to identify the nodes that are likely to be on the boundaries of the clusters and hence able to prevent the fragmentation of the network into sub-networks.
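The directed weighted bridging coefficient and the rank product can be sketched as follows. Several choices here are assumptions made only for illustration and are not stated in the text: strengths exclude the diagonal, neighbours whose corrected strength is zero are skipped to avoid a division by zero, rank 1 is given to the smallest value (so that a larger product means a more central node), and ties are broken by node order. The function names and the 4-site matrix are hypothetical, and the betweenness values are supplied by the caller.

```python
def bridging_coefficient(a, i):
    """Directed weighted bridging coefficient of node i, in the spirit of Eq. (4)."""
    n = len(a)

    def k_tot(v):
        # Total strength (in + out) of v, diagonal excluded.
        return sum(a[v][w] + a[w][v] for w in range(n) if w != v)

    # Direct neighbourhood N(i): nodes reachable from i in one step.
    neigh = [v for v in range(n) if v != i and a[i][v] > 0]
    total = 0.0
    for v in neigh:
        # Delta(v): out-strength of v without its edges to other nodes of N(i).
        delta = sum(a[v][w] for w in range(n) if w != v and w not in neigh)
        denom = k_tot(v) - (a[i][v] + a[v][i])
        if denom > 1e-12:  # assumption: skip neighbours linked only to i
            total += (delta - a[v][i]) / denom
    k_i = k_tot(i)
    return total / k_i if k_i > 0 else 0.0


def ranks(values):
    """Rank of each entry, 1 for the smallest value (assumed convention)."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def bridging_centrality(betweenness, psi):
    """Product of the betweenness rank and the bridging coefficient rank, Eq. (5)."""
    return [rb * rp for rb, rp in zip(ranks(betweenness), ranks(psi))]


# Hypothetical 4-site connectivity matrix (illustrative values only).
A = [
    [0.0, 0.5, 0.0, 0.0],
    [0.1, 0.0, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.3],
    [0.0, 0.0, 0.2, 0.0],
]
psi = [bridging_coefficient(A, i) for i in range(4)]
print(psi)
print(bridging_centrality([0.0, 0.5, 0.25], [1.0, 0.6, 0.8]))
```

Working with ranks instead of raw values is what removes the difference in order of magnitude between the two quantities.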

A new metric for node-to-node distance

An essential aspect of analysing the stability and structure of biological networks with graph theory is the choice of the metric used to define the distance between the nodes of the corresponding graph. Above all, this choice has important consequences for the physical interpretation of the results. In principle, many choices are possible: the genetic distance was used by Rozenfeld et al. (2008); the connection time between sites by Treml et al. (2008); the larval transfer probability by, for example, Andrello et al. (2013). One can refer to Rayfield et al. (2010) and Galpern et al. (2011) for reviews of the different metrics.

Here we propose a new metric to define the distance between nodes when dealing with larval transfer probabilities, in order to ensure that a larger larval transfer probability between two nodes corresponds to a smaller node-to-node distance. Consider that: (i) larval transfer probabilities are calculated by considering the position of the Lagrangian particles only at the beginning and at the end of the advection period; (ii) we are discarding the information on the effective path taken by a particle (i.e., the probability to go from i to j does not depend on the zone the particle is coming from before arriving in i); and (iii) the calculation of the shortest paths implies the summation of a variable number of these connectivity values (that is, in the calculation of betweenness, we are considering paths whose values are calculated from different numbers of generations). Thus the probabilities we calculated by Lagrangian simulations are intrinsically independent from each other. Nevertheless, the classical algorithms calculate the length of a path as the sum of the edges composing it (e.g., the Dijkstra algorithm, Dijkstra, 1959). These algorithms, if directly applied to the probabilities at play here, are incompatible with their independence, because the probability of a chain of independent transfers is the product, not the sum, of the individual probabilities. So we define the distance between two nodes i and j as:

$$d_{ij} = \ln\left(\frac{1}{a_{ij}}\right) \qquad (6)$$

where a_ij is the connectivity probability given by the connectivity matrices used in the metapopulation model. This definition is the composition of two functions: h(x) = 1/x and f(x) = ln(x). The use of h(x) = 1/x reverses the ordering of the metric so that the most probable path becomes the shortest. The use of f(x) = ln(x), thanks to the property ln(xy) = ln(x) + ln(y), allows the use of classical shortest-path algorithms while dealing correctly with the independence of the connectivity values: summing distances along a path is equivalent to multiplying the probabilities of its edges. It is worth mentioning that the values d_ij = ∞, resulting from the values a_ij = 0, do not influence the calculation of betweenness values via the Dijkstra algorithm.
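
As an illustration of Equation (6), the following minimal Python sketch runs a textbook Dijkstra search on d_ij = ln(1/a_ij) and recovers the most probable path together with its probability. It is not the toolbox used in this study: the function names and the 3-site matrix are hypothetical and chosen only to make the example self-contained.

```python
import heapq
import math


def to_distance(a):
    """Map a transfer probability a to d = ln(1/a); a = 0 gives infinity."""
    return math.inf if a == 0 else -math.log(a)


def most_probable_path(prob, source, target):
    """Dijkstra on d_ij = ln(1/a_ij). Returns (path, probability of that path)."""
    n = len(prob)
    dist = [math.inf] * n
    prev = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == target:
            break
        for v in range(n):
            if v == u or prob[u][v] == 0:
                continue  # a_uv = 0 means d_uv = infinity: the edge is never used
            nd = d + to_distance(prob[u][v])
            if nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if math.isinf(dist[target]):
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # Summed log-distances convert back to a product of probabilities.
    return path, math.exp(-dist[target])


# Hypothetical 3-site matrix: prob[i][j] is the transfer probability from i to j.
prob = [
    [0.0, 0.5, 0.1],
    [0.0, 0.0, 0.4],
    [0.0, 0.0, 0.0],
]
path, p = most_probable_path(prob, 0, 2)
```

In this toy case the two-step route through site 1 (0.5 × 0.4 = 0.2) is preferred over the direct link (0.1), whereas a search run directly on the probabilities would have selected the direct, less probable link.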

Note that Equation (6) is additive and homogeneous (see Supplementary Materials B for a detailed demonstration).

This new metric combines two earlier attempts at defining a distance (d_ij = 1/x, Gao et al., 2010; and d_ij = ln(x), Brockmann and Helbing, 2013). Above all, it is consistent with the summation of probabilities and it respects the three mathematical properties of a distance (see Supplementary Materials B).

Note also that this node-to-node metric does not apply to the calculation of all the measures presented in this section. It is of interest only where a probability reversal is needed: betweenness, minimum cycles and network strength calculations. Modularity, bridging centrality and the number of non-null elements are still calculated using the original larval transfer probability a_ij.

For the interested reader, the graph theory toolbox we developed for the present study can be freely downloaded from the web page http://www.mio.univ-amu.fr/~costa.a.

Assessment

To the best of our knowledge, this is the first time that graph theory is applied to connectivity matrices obtained from Lagrangian trajectories based on a fully 3-D circulation model. Moreover, the dispersal time of the numerical particles is set in order to mimic the principal biological characteristics of polychaetes. Both these aspects significantly complicate the dispersion dynamics (Siegel et al., 2003). As far as we can tell, it is also the first time that graph analysis is used on such a restricted coastal spatial domain for conservation aims. These facts add difficulty to the analysis since they result in dense connectivity matrices. Nonetheless, the analysis provided meaningful results, supporting the validity of the application of graph theory in connectivity estimation problems at different spatial scales.

Betweenness values and their variability

Figure 2, showing a geographical representation of the connectivity matrices together with betweenness values, highlights the strong dependency of betweenness on the circulation pattern present in the gulf. Figure 2d displays the connectivity values of the mean of the 20 variant matrices. The betweenness values here are the mean of the betweenness values obtained for each site with the 20 connectivity matrices. Note that this calculation is in principle different from calculating betweenness from the mean matrix. In our case, betweenness values differed by one order of magnitude between the two calculations (data not shown). In order to show the influence of different circulation patterns on betweenness, one has to analyse the betweenness values obtained from each single matrix. In our study, central retention, mixed and westward circulation patterns appear to induce a high betweenness value for site 21. One case of reversed circulation (matrix #1) was associated with a low betweenness value for all the sites (Figure 3).

The number of non-null elements of the connectivity matrices (Figure 4a) can be used to clarify the influence of the circulation on the connectivity matrices and, consequently, on betweenness values, especially if we cross this information with the network strength (given in Figure 4b). Note that, in order to avoid infinite network strength values, we substituted the infinities coming from the null a_ij values in Equation 6 with a constant that was set, after a sensitivity analysis (not shown here), to 1000 times the maximum value of d_ij in the different matrices. We can see that matrix #1 has many more connections between the 32 sites than almost all the other connectivity matrices (892 non-null elements out of 1024). This implies that this kind of circulation retains many particles alongshore (at least during the final parts of their larval period). Moreover, the fact that the network strength value (1.67 × 10^6) is much lower than the mean one (4.90 × 10^6) tells us that these numerous connections generally have small values, so that there are many paths sharing a limited amount of network strength. This agrees with the low betweenness values for matrix #1 at all the nodes.
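
The two diagnostics discussed above can be sketched as follows. This is only an illustration under stated assumptions: the definition of network strength belongs to the Procedures section, which is not reproduced here, so the sketch assumes it is the sum of all d_ij of a matrix. Whether the maximum d_ij is taken per matrix or across all matrices is also not specified here, and the sketch takes it across all matrices passed in. Function names and the 2 × 2 matrices are hypothetical.

```python
import math


def count_non_null(prob):
    """Number of non-null transfer probabilities in one connectivity matrix."""
    return sum(1 for row in prob for a in row if a > 0)


def max_finite_distance(matrices):
    """Largest finite d_ij = ln(1/a_ij) over all the matrices given."""
    return max(-math.log(a) for prob in matrices for row in prob for a in row if a > 0)


def network_strength(prob, cap):
    """Assumed definition: sum of d_ij, with infinities (a_ij = 0) replaced by cap."""
    return sum(-math.log(a) if a > 0 else cap for row in prob for a in row)


# Hypothetical 2-site matrices standing in for two circulation patterns.
matrices = [
    [[0.5, 0.25], [0.0, 0.5]],   # more connections
    [[0.5, 0.0], [0.0, 0.125]],  # fewer connections
]
cap = 1000 * max_finite_distance(matrices)  # 1000 times the maximum finite d_ij
counts = [count_non_null(m) for m in matrices]
strengths = [network_strength(m, cap) for m in matrices]
```

Under this assumed definition, the matrix with fewer non-null elements ends up with the larger strength value, because each null element contributes the large substitution constant.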

Case #11 (westward circulation) is also peculiar. It has the lowest number of existing connections (376, Figure 4a) but the highest network strength (7.5 × 10^6, Figure 4b). Thus this circulation pattern disperses many particles offshore and only very few paths remain. Considering the network strength value, we know that these paths have a high probability. Hence, in this situation, only the predominant paths are left. Therefore, the fact that node 21 still has a significantly high betweenness value in matrix #11 (see Figure 3) is a strong indication of its importance in the dynamics of the network.

Moreover, the number of non-null elements provides information on the effect of a specific circulation pattern on the spatial distribution of the species. In fact, a circulation pattern characterized by a connectivity matrix with many non-null elements favours an exchange between all the sites, even if the connections are weak (matrix #1). Therefore, the species will tend to be more homogenised by the action of the circulation. Conversely, a circulation pattern creating few but intense connections (matrix #11) will tend to form predominant migration fluxes and thus spatially structure the species distribution.

Note that all the other matrices combine an intermediate number of non-null elements with an intermediate value of network strength; thus they cannot be interpreted as easily as cases 1 and 11 presented above.

In general, from the results in Figure 3, we see that in all these cases only node 21 has a much greater betweenness value (roughly one order of magnitude) than the other nodes. It corresponds to the site in front of Port Camargue, approximately in the centre of the Gulf of Lion. From an oceanographic point of view, the westward alongshore circulation that is predominantly present in the GoL (Millot, 1990) helps to explain the high betweenness of site 21. This kind of circulation induces an alongshore transport of particles that determines the high betweenness of the sites in the centre of the gulf. The occasional recirculation that forms offshore of site 21 (Petrenko et al., 2008) can enhance the importance of this particular site. Note that if some offshore sites had been used, the betweenness values might have had their maximum elsewhere. Nonetheless, here, site 21 is the most important for polychaetes because the 32 sites considered cover all of their habitat, which is coastal.

Our results highlight the sensitivity of graph metrics to flow variability: different circulation patterns correspond to networks with different connection patterns (as already noted by Mitarai et al., 2009) and different centrality measure values. These results suggest that, besides adapting the size of the MPAs on the basis of the seasonal movements of a species (Meyer et al., 2010), it is possible to envision the implementation of adaptive MPA management, possibly on a meteorological time scale.

One aspect of the analysis needs to be clarified. The values of betweenness were calculated for each variant matrix. This means that a specific value of betweenness relies on the hypothesis that a particular circulation pattern is maintained constantly throughout a multi-generational migration. This is an unrealistic assumption. Nonetheless, the consistency with which we find a high betweenness at node 21 means that this node is likely to have a high betweenness even with more realistic circulation patterns. This may not hold in other case studies. In that case, a solution could come from the implementation of time-dependent networks (see for example Ser-Giacomi et al., 2015, and references therein).

Modularity and identification of sub-populations and rescue mechanisms in the Gulf of Lion

Cluster analysis shows a fairly simple division of the network into two clusters (see Figure 5): a western one (sites 1 to 18) and a central-eastern one (sites 19 to 32). This division of the network is the one found most frequently: 40% of the time, against 10% for other possible divisions. In the majority of the other cases only sites 18, 19 and, in some cases, 20 are assigned differently. The modularity value (Q = 0.16) gives us information about how important the exchange of larvae between the two clusters is. In general, there is no absolute threshold to discriminate between low and high values of modularity. Considering that, by definition, −1 < Q < 1, we are confident in stating two things: (i) as our value of Q is positive, there is a cluster structure and (ii) as Q is less than a fifth of the maximum possible value (that is, 0.2), we can define it as low. This means that the clusters exist but are not sharply separated. Our leading hypothesis for the oceanographic mechanism at the base of this separation is the presence of recirculations in the central part of the GoL (Estournel et al., 2003) that can dynamically separate the two parts of the gulf while still permitting considerable communication between them. Thus within the gulf there is a considerable migration flow of polychaetes that is not, at least spatially, highly organized. This is very likely a characteristic of a coastal environment where all the sites are alongshore with no considerable physical and/or dynamical barrier between them.
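
To make the meaning of a low positive Q concrete, the sketch below evaluates modularity for a given partition. The modularity definition used in this study is given in the Procedures section, which is not reproduced here; the sketch therefore assumes a common directed, weighted form, Q = (1/w) Σ_ij [a_ij − s_i^out s_j^in / w] δ(c_i, c_j), with w the total weight. The function name, the 4-site matrix and the labels are hypothetical.

```python
def modularity(weights, labels):
    """Directed, weighted modularity of a partition (assumed standard form)."""
    n = len(weights)
    total = sum(sum(row) for row in weights)
    out_strength = [sum(weights[i]) for i in range(n)]
    in_strength = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected under a random null model.
                q += weights[i][j] - out_strength[i] * in_strength[j] / total
    return q / total


# Hypothetical 4-site matrix: strong links inside {0, 1} and {2, 3}, weak links between.
weights = [
    [0.0, 0.4, 0.1, 0.0],
    [0.4, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.4],
    [0.0, 0.1, 0.4, 0.0],
]
q_two_clusters = modularity(weights, [0, 0, 1, 1])
q_single_cluster = modularity(weights, [0, 0, 0, 0])
```

With this form, placing all sites in a single cluster gives Q = 0, and strengthening the links between the two groups lowers Q towards zero, which is the sense in which a small positive Q indicates clusters that still exchange many larvae.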

Notably, the modularity analysis result is in agreement with the results of the metapopulation model. Exactly as in Guizien et al. (2014), modularity analysis showed the presence of a rescue mechanism of the sites in the western part of the gulf by the eastern sites. This result is consistent with the organization of the metapopulation in two large inter-communicating sub-populations as evidenced by graph theory modularity clustering. The presence of a rescue mechanism is mirrored by the low positive value of modularity. This division into two clusters of polychaete assemblages is also in line with the division into an eastern and a western cluster evidenced by Labrune et al. (2007) when studying sedimentary differentiation of the GoL seabed. Given these facts, we are confident that modularity results are a reliable tool to identify subpopulations and rescue mechanisms.

One possible objection to the modularity method is the validity (see Procedures) of a random model as null hypothesis (as pointed out, for example, by Thomas et al., 2014). Regarding this aspect, we are convinced that a random null model is the best choice to model the effects of the current field, the main forcing of the biological system’s dynamics, which, although deterministic, is chaotic due to its turbulent behaviour. Indeed, subpopulation structures in marine biological networks are likely to arise due to the effects of currents. This hypothesis is also backed by the mean assortativity value (a measure quantifying the ‘preference’ of a node to establish links with nodes of similar strength). For random networks we expect this value to be zero (see Barrat et al., 2008, for example). In our case, it is indeed very small (−0.04).

The reader must also be aware that research on modularity has developed considerably since its introduction by Newman and Girvan (2004), and various shortcomings of this quantity are nowadays well known (see for example Fortunato and Barthelemy, 2006; and Kehagias and Pitsoulis, 2013). For example, resolution limit problems (i.e., difficulty in identifying clusters under a certain size) appear in the presence of quite peculiar network structures (see Kehagias and Pitsoulis, 2013, for example) but are not present in our case, mostly due to the randomness of our network.

Bridging centrality and the preservation of network integrity

The three nodes characterized by the highest values of bridging centrality are nodes 11, 12 and 16 (see Figure 6). These nodes have bridging centrality values of 600, 572 and 513 respectively. Following Hwang et al. (2008), these three nodes, representing the top ten percent of the 32 nodes, are the nodes that are expected to prevent the network from easily breaking into separated sub-networks. Nevertheless, here, the removal of only one of these nodes does not create sub-networks. The removal of all three high-bridging centrality nodes does not break the network either. Indeed, isolated sub-networks appear only after removing at least eight nodes. This result is due to the high average edge density of the 20 variant matrices (ρ = 0.604), whereas the networks used by Hwang et al. (2008) or in other social sciences applications of graph theory have lower ρ.

However, many weak connections between the 32 sites are present (data not shown). The solidity of the biological network should be assessed independently of these many weak connections. In order to highlight the predominant connections, a threshold was set on the basis of the following argument. Given a probability-based connectivity matrix, we can expect that a transfer rate T between the minimum non-null value C_m of the matrix and 1 is necessary for the maintenance of an overall good connectivity. We can estimate T as the geometric mean of C_m and 1: $T = \sqrt{C_m \cdot 1}$. The geometric mean has the advantage of considering a vast range of values for the variables at play in determining an unknown quantity, while not being biased by the choice of too large or too small extreme values. Dealing with living organisms, one also has to account for the survivorship of the propagules: in an efficient network, we can expect that a percentage S of propagules between T and at least T/e is likely needed for the maintenance of the persistence of the species, where e is Euler's number (Napier's constant). Thus: $S = \sqrt{(T/e) \cdot T}$. As a last step, we also account for the percentage of the surviving particles that successfully reproduce. A percentage R between S and S/e is likely needed for a good persistence of the species in the habitat. Thus: $R = \sqrt{(S/e) \cdot S}$. In our case R = 0.0041, which we round to the value 0.001 used in our analysis. A qualitative test of this estimation showed that, for a threshold equal to 0.01, we obtained an almost completely disconnected network, while, when retaining all values above 0.0001, hardly anything changed. Thus a threshold equal to 0.001 seems to be the threshold we were looking for in order to have a minimally-connected network. In particular, this threshold eliminated, on average, 36% of the connections from the 20 connectivity matrices.
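
The chain of geometric means above can be followed step by step with the short sketch below. Since each step divides the previous quantity by √e, the chain simplifies to R = √C_m / e. The value of C_m is not reported in this section: the one used here is hypothetical, back-computed from that simplification so that R comes out close to 0.0041. The function name is also hypothetical.

```python
import math


def reproduction_threshold(c_min):
    """Return (T, S, R) from the minimum non-null probability c_min."""
    t = math.sqrt(c_min * 1.0)       # transfer: geometric mean of C_m and 1
    s = math.sqrt((t / math.e) * t)  # survival: geometric mean of T/e and T
    r = math.sqrt((s / math.e) * s)  # reproduction: geometric mean of S/e and S
    return t, s, r


c_min = 1.24e-4  # hypothetical value, not taken from the connectivity matrices
t, s, r = reproduction_threshold(c_min)
```

The simplified form makes explicit that the threshold depends only on the smallest non-null entry of the connectivity matrix.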

After applying the threshold, the deletion of both sites 11 and 12 led to nine sub-networks (sets of nodes disjoint from other portions of the graph), mainly consisting of isolated single nodes or pairs of nodes. In contrast, the removal of other triplets of nodes created, on average, only seven separated sub-networks (data not shown). Notably, the additional deletion of node 16 did not enhance the effect of the deletion of nodes 11 and 12 alone. This fact is not surprising because node 16 is the node, among high-bridging centrality nodes, with the lowest value of bridging centrality. In fact, the indication by Hwang et al. (2008) to consider as crucial the top 10% of high-bridging centrality nodes is just a guideline and should be taken as such.
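
A node-removal test of this kind can be sketched as follows. The sketch assumes that two sites belong to the same sub-network when a connection at or above the threshold exists in at least one direction (weak connectivity); the exact criterion used in the study is not stated in this section. The function name and the 4-site matrix are hypothetical.

```python
def count_subnetworks(prob, threshold, removed=()):
    """Count disjoint sub-networks after thresholding and removing some nodes."""
    removed = set(removed)
    nodes = [i for i in range(len(prob)) if i not in removed]
    # Undirected adjacency: i and j are linked if either direction passes the threshold.
    neighbours = {
        i: [j for j in nodes
            if j != i and (prob[i][j] >= threshold or prob[j][i] >= threshold)]
        for i in nodes
    }
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1  # a new, not yet visited sub-network
        stack = [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in neighbours[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return components


# Hypothetical chain 0 -> 1 -> 2 -> 3 with one weak shortcut 0 -> 3.
prob = [
    [0.0, 0.01, 0.0, 0.0005],
    [0.0, 0.0, 0.01, 0.0],
    [0.0, 0.0, 0.0, 0.01],
    [0.0, 0.0, 0.0, 0.0],
]
intact = count_subnetworks(prob, 0.001)
without_node_1 = count_subnetworks(prob, 0.001, removed=[1])
without_node_1_low_threshold = count_subnetworks(prob, 0.0001, removed=[1])
```

In this toy case, removing node 1 splits the thresholded network in two, while with a lower threshold the weak shortcut keeps it in one piece, which mirrors the role of the many weak connections described above.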


The oceanographic explanation for a relevant value of bridging centrality for sites 11 and 12 is the separation into two clusters (east and west, as said above) and the presence of eastward and westward currents in the middle of the GoL (Petrenko, 2003). We think that these currents can explain the important communication between the clusters, as indicated by a low value of modularity and a high value of bridging centrality of two sites at the gulf’s mid height, such as sites 11 and 12.

Remarkably, the removal of node 21 had no particularly important effect on the fragmentation of the network. It is thus clear that bridging centrality adds more information on the structure of the network than betweenness alone can provide.

Minimum cycles identify retention loops induced by current recirculation

Here we direct the analysis to the inspection of the spatial scales at which multi-generational flows form retention loops.

Minimum cycles provided evidence that the nodes with the greatest probability of seeing their particles return home, after a period ranging from 2 to 5 generations, are nodes 13 to 16 (Figure 7). According to Hastings and Botsford (2005), these nodes are likely crucial for the persistence of polychaetes. Furthermore, the nodes that are travelled through most frequently during the minimum cycles are also nodes 13 to 16 (data not shown).

Indeed, the zone of the gulf spanning nodes 13 to 16, where the cycles are shortest, corresponds to an area where the currents often recirculate due to the presence of eddies (Hu et al., 2011a; Hu et al., 2011b).
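
One way to search for such minimum cycles is sketched below, assuming the distance d_ij = ln(1/a_ij) of Equation (6) and cycles of at least two generations (local retention a_ii is ignored). This is an illustrative reconstruction, not the procedure of the toolbox used in this study; the function name and the 3-site matrix are hypothetical. The shortest cycle through a home site is obtained as the shortest path from home to some site j plus the closing edge from j back home.

```python
import heapq
import math


def minimum_cycle(prob, home):
    """Return (probability, generations) of the most probable cycle through home."""
    n = len(prob)
    dist = [math.inf] * n
    hops = [0] * n
    dist[home] = 0.0
    heap = [(0.0, home)]
    while heap:  # Dijkstra from home on d_ij = ln(1/a_ij)
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v in range(n):
            if v == u or prob[u][v] <= 0:
                continue
            nd = d - math.log(prob[u][v])
            if nd < dist[v]:
                dist[v] = nd
                hops[v] = hops[u] + 1
                heapq.heappush(heap, (nd, v))
    best, best_hops = math.inf, 0
    for j in range(n):  # close the loop with the edge j -> home
        if j != home and prob[j][home] > 0 and dist[j] < math.inf:
            total = dist[j] - math.log(prob[j][home])
            if total < best:
                best, best_hops = total, hops[j] + 1
    if math.isinf(best):
        return 0.0, 0  # no cycle passes through home
    return math.exp(-best), best_hops


# Hypothetical 3-site matrix with a single loop 0 -> 1 -> 2 -> 0.
prob = [
    [0.1, 0.6, 0.0],
    [0.0, 0.2, 0.5],
    [0.7, 0.0, 0.1],
]
p_return, generations = minimum_cycle(prob, 0)
```

Here the only loop through site 0 takes three generations and has probability 0.6 × 0.5 × 0.7 = 0.21.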

Relevance for conservation

With the results so far, we can compare the hierarchy of the four sites (1, 10, 18 and 32) established with respect to their importance for species persistence (as determined by metapopulation analysis) with their hierarchy in terms of betweenness, shortest cycles and bridging centrality values (Table 1). Shortest cycles analysis correctly points to site 18 as the most important among these four sites. Thus, shortest cycles are able to identify the nodes sustaining persistence, as Hastings and Botsford (2005) established on a theoretical basis. However, the shortest cycles do not agree with the hierarchy of the metapopulation model for the other three nodes. This pinpoints a methodological discrepancy between the site removal procedure in the metapopulation model and the shortest cycles identification procedure when cycles become long (as clarified in the Comments and Recommendations Section).

Betweenness fails to reproduce the hierarchy from the metapopulation model concerning the sites’ importance for demographic persistence. Here, in almost all cases of circulation patterns in spring 2004 and 2006 in the Gulf of Lion, site 21 offshore of Port Camargue has the highest value of betweenness (see Figure 3). This means that site 21 is the site through which the majority of the larvae pass, suggesting the importance of site 21 as a “gateway” for multi-generational migration. This role has been demonstrated to be crucial for preserving gene flow continuity across the GoL (Padron and Guizien, 2015) and confirms the utility of the betweenness measure for identifying relays of species spreading. But “gateways” through which the majority of the larvae pass during multi-generational migrations may not be sufficient for persistence, contrary to what was previously put forward (Treml et al., 2008; Andrello et al., 2013). Persistence implies continuity in population cycles. Thus, an efficient transfer, although the most intuitive requirement for connectivity, is probably not the most important. This fact highlights a risk of failure of conservation policies when only high-betweenness sites are preserved while persistence cycles are not maintained.

We introduced a new formulation for bridging centrality to adapt it to directed-weighted graphs. It fails to reproduce the hierarchy resulting from the metapopulation model concerning demographic persistence. However, bridging centrality is focused on network integrity. Thus it indicates sites important for species spreading at the regional scale, which are secondary sources for species expansion.


Comments and Recommendations

Firstly, we want to further highlight the difficulty that arises when using matrices containing larval transfer probabilities in graph theory analysis. In this case, the most immediate choice for edge definition would be the probabilities themselves (e.g., Rossi et al., 2014). But, with this choice, one obtains conceptually wrong results when dealing with concepts relying on the calculation of the shortest paths as, for instance, when calculating betweenness (as in Andrello et al., 2013, for example). In fact, with this metric, the shortest path is the most improbable one and the high-betweenness sites are the least frequented ones. The node-to-node metric derived from larval transfer probability that we propose solves this inconsistency, making the use of shortest paths, betweenness and shortest cycles meaningful for the analysis of networks based on transfer probabilities.

The discrepancy between shortest paths analysis and metapopulation modelling could be due to the particular site-removal procedure used in the habitat loss scenario of the metapopulation model analysis. Firstly, consistent with the fact that anthropic pressure decreases as distance from the harbours increases, scenarios included only the four harbour sites. Secondly, under the hypothesis that the effect of anthropic pressure acts predominantly on neighbouring sites, the habitat removal procedure was done by progressively eliminating neighbouring sites. Shortest path analysis supports this point of view. In fact, it points out that the nodes composing the shortest cycles are all close to each other. That is, the hypothesis that the survivorship of a site depends on the survivorship of the neighbouring ones is reasonable. Nevertheless, shortest path analysis does not include any assumption on geographical proximity within the path. Consequently, the results of shortest path analysis would be fully comparable only with a more general removal procedure than the one used in Guizien et al. (2014).

The general scope of the paper was to verify whether the expected conservation interpretation of some graph theory measures was backed by an analysis with a more solid conservation interpretation, such as a metapopulation model. We could not present a metapopulation equivalent of bridging centrality because the available metapopulation simulations did not test the role of isolated sites in rescuing the metapopulation. Nonetheless, we think that the importance of high-bridging centrality nodes for the integrity of the network is a powerful indication of their importance for the regional spreading of a species.

Overall, we showed how some graph theory measures can be employed to obtain a considerable part of the information provided by metapopulation modelling, exploiting only connectivity matrices and avoiding the difficult estimation of demographic parameters. A summary of the conservation interpretation of the graph theory measures analysed in this study can be found in Table 2.


Acknowledgements

This study was partially funded by the European CoCoNet Project and by the Ministere de l’Education Nationale, de l’Enseignement Superieur et de la Recherche.

The authors want to thank Dr. Rubao Ji, Dr. Lucio Bellomo, Dr. Leo Berline and Dr. Jean-Christophe Poggiale for helpful discussions. The first author also thanks Dr. Alain de Verneil for helping to correct the manuscript.


Supplementary Materials

A Metapopulation model

Population density at a given time at a given site results from spatially structured local survivorship and reproductive success inputs potentially depending on all the other sites in the system. The model used by Guizien et al. (2014) accounts for both (i) recruitment limitation due to space availability at the destination site (computed as the proportion of free space based on the saturating density of adults), and (ii) the variability in propagule transfer rate.

In particular, the model can be written in matrix form as follows:

$$P(t+\Delta t) = \min\left(G(t)\,P(t),\ P_{max}\right)$$

with a time step of ∆t and the growth transfer matrix G defined as

$$G_{ij} = l_i\,a_{ij}(t)\,b_j + s_{jj}\,\delta(i,j)$$

where P_max = 1/α_A is the site carrying capacity, with α_A the mean cross-sectional area of one adult; P(t) ∈ R^N contains the spatial density of adults in each site i ∈ [1, 32] at time t; l_i [number of larvae per adult] is the propagule production rate at site i; b_j [number of adults per larva] is the recruitment success at site j; s_jj [no units] is the adult survivorship rate at site j; δ(i, j) is the Kronecker δ-function; and a_ij is the propagule transfer rate from site i to site j. The larvae production rate l_i is equal to the number of larvae produced by each adult female, F × SR × f, where F is the fecundity rate, SR is the sex ratio in the adult population, and f is the probability of an egg being fertilized. The recruitment success b_j accounts for all mortality losses from egg release until the first reproduction of new recruits, and includes mortality during larval dispersal, settlement and juvenile stages. Notice that the adult survivorship rate can be related to the species life expectancy LE as $s_{jj} = e^{\ln(0.01)\,\Delta t / LE}$, where the life expectancy LE is the age at which 99% of individuals of the same generation have died.
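
A single time step of this model can be sketched as below. This is a hedged illustration, not the code of Guizien et al. (2014): since a_ij is the transfer rate from site i to site j, the sketch assumes that the product G(t)P(t) is meant as a sum over source sites i for each destination j. All function names and parameter values are hypothetical.

```python
import math


def survivorship(dt, life_expectancy):
    """Adult survivorship over one time step: s = exp(ln(0.01) * dt / LE)."""
    return math.exp(math.log(0.01) * dt / life_expectancy)


def step(density, a, l, b, s, p_max):
    """One update P(t + dt) = min(G P, P_max), with G_ij = l_i a_ij b_j + s_jj delta_ij."""
    n = len(density)
    new_density = []
    for j in range(n):  # destination site
        # Recruits arriving at j from every source site i (assumed index convention).
        recruits = sum(density[i] * l[i] * a[i][j] * b[j] for i in range(n))
        survivors = s[j] * density[j]
        new_density.append(min(recruits + survivors, p_max))  # space limitation
    return new_density


# Hypothetical two-site example: only site 0 is populated at the start.
density = [10.0, 0.0]
a = [[0.5, 0.5], [0.0, 0.0]]  # transfer rates from row site to column site
l = [100.0, 100.0]            # larvae per adult
b = [0.01, 0.01]              # adults per larva
s = [0.5, 0.5]                # adult survivorship per time step
new_density = step(density, a, l, b, s, p_max=8.0)
```

In this toy case site 0 would reach a density of 10 and is capped at the carrying capacity of 8, while the empty site 1 is colonized by recruits from site 0.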

B Node-to-node metric properties

We verify that the new metric we propose for measuring the distance between733

nodes dij = ln( 1aij

), aij being the probability of advection from site i to site734

j in a given time, satisfies both homogeneity and additivity:735

1. Homogeneity:

$$\alpha\, d_{ij} = \alpha \ln\left(\frac{1}{a_{ij}}\right) = \ln\left(\frac{1}{a_{ij}^{\alpha}}\right)$$

that is, a distance multiplied by a scalar is still a distance.

2. Additivity:

$$d_{il} + d_{lj} = \ln\left(\frac{1}{a_{il}}\right) + \ln\left(\frac{1}{a_{lj}}\right) = \ln\left(\frac{1}{a_{il} \cdot a_{lj}}\right) = \ln\left(\frac{1}{a_{ij}}\right) = d_{ij}$$

where $a_{ij} = a_{il} \cdot a_{lj}$ is understood as the probability of reaching site $j$ from site $i$ through site $l$. That is, the sum of two distances is still a distance and, in particular, the final distance is obtained from the product of the original connectivity values. This aspect adapts particularly well to the problem addressed in the paper: the probability of a path is the product of the probabilities associated with its components, so the length of the path is the sum of the corresponding distances.

Note that, as the elements $a_{ij}$ have no units, the elements $d_{ij}$ have no units either.
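The additivity property means that a shortest path under $d_{ij} = \ln(1/a_{ij})$ is also a most probable path. A minimal sketch of this idea follows, using a textbook Dijkstra search (Dijkstra, 1959) written with the Python standard library. The function name `most_probable_path`, the dictionary representation of the connectivity matrix and the three-site toy network are hypothetical and serve only as an illustration.

```python
import heapq
import math


def most_probable_path(a, source, target):
    """Return (path, probability) of the most probable path from source to target.

    a : dict of dicts, a[i][j] is the transfer probability from i to j, in (0, 1].
    Edge lengths are d_ij = ln(1 / a_ij) >= 0, so Dijkstra's algorithm applies.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in a.get(u, {}).items():
            if p <= 0.0:
                continue  # no connection: infinite distance
            nd = d + math.log(1.0 / p)  # distances add, probabilities multiply
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # exp(-sum of d) recovers the product of the probabilities along the path
    return path, math.exp(-dist[target])


if __name__ == "__main__":
    # Hypothetical network: the direct link 0 -> 2 (0.1) is less probable
    # than the two-step route 0 -> 1 -> 2 (0.5 * 0.4)
    a = {0: {1: 0.5, 2: 0.1}, 1: {2: 0.4}, 2: {}}
    print(most_probable_path(a, 0, 2))
```

In the toy network the search prefers the route through site 1, because its summed distance $\ln(1/0.5) + \ln(1/0.4)$ is smaller than the direct distance $\ln(1/0.1)$.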

List of captions

FIGURE 1. Schematic representation of the typical circulation in the Gulf of Lion. The thick arrow represents the dominant alongshore Northern Current. The thinner arrow represents the eastward current that can be detected in stratified conditions or under particular wind field conditions. The positions of the 32 studied sites are plotted. The sites 3, 10, 18 and 32, used for the habitat loss scenario in the metapopulation model, are highlighted by bigger grey dots. Node 21 is the smallest of the grey dots. The grey lines correspond to the 100 m, 200 m, 1000 m and 2000 m isobaths.

FIGURE 2. Spatial representation of the connectivity matrices and betweenness values in different circulation situations. (a) Westward drift (matrix #7), (b) eastward drift (matrix #1), (c) central retention (matrix #10). Panel (d) shows the mean connectivity matrix values and the mean of the betweenness values. In all the panels, a threshold on the value of connectivity was imposed for clarity: connectivity values lower than 2/3 of the maximum are not plotted.

FIGURE 3. Values of betweenness for the 32 sites using the 20 variant connectivity matrices. The normalization is done on the maximum value of betweenness obtained using the different variant matrices.

FIGURE 4. (a) Number of non-zero elements in the 20 variant matrices. Different circulation patterns have different effects on the connectivity inside the Gulf of Lion. (b) Network strength of the 20 variant matrices.

FIGURE 5. Clusters identified with a criterion of modularity maximization. The result is the average assignment of a node to one of the two clusters after $2 \cdot 10^5$ code runs with the 20 variant connectivity matrices. Two colors separate the two clusters.

FIGURE 6. Bridging centrality values for the 32 sites. Geographical representation. Nodes 11 and 12 have the highest values: 600 and 572 respectively.

FIGURE 7. Sum of the products of the weights of all the cycles (length from 2 to 5 steps) that start from each of the 32 sites. The minima correspond to the nodes for which the probability of particles returning home (in a time span of 2 to 5 generations) is higher.


References

Andrello, M., Jacobi, M., Manel, S., Thuiller, W., and Mouillot, D. (2014). Extending networks of protected areas to optimize connectivity and population growth rate. Ecography, 38:273–282.

Andrello, M., Mouillot, D., Beuvier, J., Albouy, C., Thuiller, W., and Manel, S. (2013). Low connectivity between Mediterranean marine protected areas: a biophysical modeling approach for the dusky grouper Epinephelus marginatus. PLoS ONE, 8(7).

Barrat, A., Barthelemy, M., and Vespignani, A. (2008). Dynamical processes on complex networks. Cambridge University Press, 1st edition.

Brockmann, D. and Helbing, D. (2013). The hidden geometry of complex, network-driven contagion phenomena. Science, 342:1337–1342.

Caswell, H. (2001). Matrix population models, 2nd edition. Sinauer Associates, Sunderland, MA, USA.

Dijkstra, E. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1:269–271.

Duraiappah, A. and Shahid, N. (2005). Millennium Ecosystem Assessment. Ecosystems and human well-being: Biodiversity synthesis. World Resources Institute, Washington, DC, USA.


Estournel, C., de Madron, X. D., Marsaleix, P., Auclair, F., Julliand, C., and Vehil, R. (2003). Observation and modeling of the winter coastal oceanic circulation in the Gulf of Lion under wind conditions influenced by the continental orography (FETCH experiment). J. Geophys. Res., 108(C3):1–18.

Fortunato, S. and Barthelemy, M. (2006). Resolution limit in community detection. Proc. Natl. Acad. Sci. USA, 104:36–41.

Galpern, P., Manseau, M., and Fall, A. (2011). Patch-based graphs of landscape connectivity: A guide to construction, analysis and application for conservation. Biol. Conserv., 144:44–55.

Gao, X., Xiao, B., Tao, D., and Li, X. (2010). A survey of graph edit distance. Pattern Anal. Appl., 13:113–119.

Guizien, K., Belharet, M., Moritz, C., and Guarini, J. (2014). Vulnerability of marine benthic metapopulations: implications of spatially structured connectivity for conservation practice. Divers. Distrib., 20(12):1392–1402.

Hastings, A. and Botsford, L. (2005). Persistence of spatial populations depends on returning home. Proc. Natl. Acad. Sci. USA, 103:6067–6072.

Hu, Z., Petrenko, A., and Doglioli, A. (2011a). Numerical study of eddy generation in the western part of the Gulf of Lion. J. Geophys. Res., 116:3–11.


Hu, Z., Petrenko, A., and Doglioli, A. (2011b). Study of a mesoscale anticyclonic eddy in the western part of the Gulf of Lion. J. Mar. Syst., 7:3–11.

Hwang, W., Taehyong, K., Murali, R., and Aidong, Z. (2008). Bridging centrality: Graph mining from element level to group level. Proceedings KDD 2008, :366–344.

Jacobi, M., André, C., Döös, K., and Jonsson, P. (2012). Identification of subpopulations from connectivity matrices. Ecography, 35:31–44.

Kehagias, A. and Pitsoulis, L. (2013). Bad communities with high modularity. Eur. Phys. J. B, 86:1434–6028.

Kool, J., Moilanen, A., and Treml, E. (2013). Population connectivity: recent advances and new perspectives. Landscape Ecol., 28:165–185.

Labrune, C., Gremare, A., Amoroux, J.-M., Sarda, R., Gil, J., and Taboada, S. (2007). Assessment of soft-bottom polychaete assemblages in the Gulf of Lion (NW Mediterranean) based on a mesoscale survey. Estuar. Coast. Shelf S., 71:133–147.

Lagabrielle, E., Crochelet, E., Andrello, M., Schill, S., Arnaud-Haond, S., Alloncle, N., and Ponge, B. (2014). Connecting MPAs - eight challenges for science and management. Aquatic Conservation: Marine and Freshwater Ecosystems, 24:94–110.


LeCorre, N., Guichard, F., and Johnson, L. (2012). Connectivity as a management tool for coastal ecosystems in changing oceans. Oceanography, Prof. M. Marcelli (Ed.), pages 235–258.

Marsaleix, P., Auclair, P., and Estournel, C. (2006). Considerations on open boundary conditions for regional and coastal ocean models. J. Atmos. Ocean Tech., 23:1604–1613.

McArthur, R. and Levins, R. (1967). The limiting similarity, convergence and divergence of coexisting species. Am. Nat., 101:377–385.

McHugh, D. and Fong, P. (2002). Do life history traits account for diversity of polychaete annelids? Inverteb. Biol., 121:325–338.

Meyer, C. G., Papastamatiou, Y., and Holland, K. (2010). Seasonal, diel, and tidal movements of green jobfish (Aprion virescens, Lutjanidae) at remote Hawaiian atolls: implications for marine protected area design. Mar. Biol., 151:2133–21434.

Millot, C. (1990). The Gulf of Lions hydrodynamics. Cont. Shelf Res., 10:885–894.

Mitarai, S., Siegel, D., Watson, J. R., Dong, C., and McWilliams, J. (2009). Quantifying connectivity in the coastal ocean with application to the Southern California Bight. Journal of Geophysical Research - Oceans, 114:C10026.


Moilanen, A. (2011). On the limitations of graph-theoretic connectivity in spatial ecology and conservation. Journal of Applied Ecology, 48:1543–1547.

Newman, M. (2006). Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA, 103:8577–8582.

Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E, 69(2):026113.

Nicosia, V., Mangioni, G., Carchiolo, V., and Malgeri, M. (2009). Extending the definition of modularity to directed graphs with overlapping communities. J. Stat. Mech-Theory E, 3:1742–5468.

Padron, M. and Guizien, K. (2015). Modelling the effect of demographic traits and connectivity on the genetic structuration of marine metapopulations of sedentary benthic invertebrates. ICES J. Mar. Sci., doi:10.1093/icesjms/fsv158.

Petrenko, A. (2003). Variability of circulation features in the Gulf of Lion NW Mediterranean Sea. Importance of inertial currents. Oceanol. Acta, 26:323–338.

Petrenko, A., Dufau, C., and Estournel, C. (2008). Barotropic eastward currents in the western Gulf of Lion, north-western Mediterranean Sea, during stratified conditions. Journal of Marine Systems, 74:406–428.


Petrenko, A., Leredde, Y., and Marsaleix, P. (2005). Circulation in a stratified and wind-forced Gulf of Lion, NW Mediterranean Sea: in situ and modeling data. Cont. Shelf Res., 25:7–27.

Rayfield, B., Fortin, M.-J., and Fall, A. (2010). Connectivity for conservation: a framework to classify network measures. Ecology, 92:847–858.

Rossi, V., Giacomi, E. S., Cristobal, A. L., and Hernandez-Garcia, E. (2014). Hydrodynamic provinces and oceanic connectivity from a transport network help designing marine reserves. Geoph. Res. Lett., 41:2883–2891.

Rozenfeld, A., Arnaud-Haond, S., Hernandez-Garcia, E., Eguiluz, V., Serrao, E., and Duarte, C. (2008). Network analysis identifies weak and strong links in a metapopulation system. Proc. Natl. Acad. Sci. USA, 105:18824–18829.

Schick, R. and Lindley, S. (2007). Directed connectivity among fish populations in a riverine network. J. Appl. Ecol., 44:1116–1126.

Ser-Giacomi, E., Vasile, R., Hernandez-Garcia, E., and Lopez, C. (2015). Most probable paths in temporal weighted networks: An application to ocean transport. Phys. Rev. E, 92(1):012818.

Siegel, D., Kinlan, B., Gaylord, B., and Gaines, S. (2003). Lagrangian descriptions of marine larval dispersion. Mar. Ecol-Progr. Ser., 260:83–96.

Thomas, C., Lambrechts, J., Wolansky, E., Traag, V., Blondel, V., Deleersnijder, E., and Hanert, E. (2014). Numerical modelling and graph theory tools to study ecological connectivity in the Great Barrier Reef. Ecol. Model., 272:160–174.

Treml, E., Halpin, P., Urban, D., and Pratson, L. (2008). Modeling population connectivity by ocean currents, a graph theoretic approach for marine conservation. Landsc. Ecol., 23:19–36.

Urban, D. and Keitt, T. (2001). Landscape connectivity: a graph theoretic perspective. Ecology, 82:1205–1218.


Figure 1: Schematic representation of the typical circulation in the Gulf of Lion. The thick arrow represents the dominant alongshore Northern Current. The thinner arrow represents the eastward current that can be detected in stratified conditions or under particular wind field conditions. The positions of the 32 studied sites are plotted. The sites 3, 10, 18 and 32, used for the habitat loss scenario in the metapopulation model, are highlighted by bigger grey dots. Node 21 is the smallest of the grey dots. The grey lines correspond to the 100 m, 200 m, 1000 m and 2000 m isobaths.


[Figure 2 image: four map panels (a) to (d) of the Gulf of Lion showing the 32 numbered sites; axes are longitude and latitude; color scales are labelled Connectivity and Betweenness.]

Figure 2: Spatial representation of the connectivity matrices and betweenness values in different circulation situations. (a) Westward drift (matrix #7), (b) eastward drift (matrix #1), (c) central retention (matrix #10). Panel (d) shows the mean connectivity matrix values and the mean of the betweenness values. In all the panels, a threshold on the value of connectivity was imposed for clarity: connectivity values lower than 2/3 of the maximum are not plotted.


[Figure 3 image: color map with Sites on the horizontal axis and Variant Matrices on the vertical axis; the color scale gives the normalized betweenness.]

Figure 3: Values of betweenness for the 32 sites using the 20 variant connectivity matrices. The normalization is done on the maximum value of betweenness obtained using the different variant matrices.


[Figure 4 image: two panels with Matrices on the horizontal axis; vertical axes are (a) Number of Not Null Elements and (b) Network Strength.]

Figure 4: (a) Number of non-zero elements in the 20 variant matrices. Different circulation patterns have different effects on the connectivity inside the Gulf of Lion. (b) Network strength of the 20 variant matrices.


[Figure 5 image: map of the Gulf of Lion showing the 32 numbered sites; axes are Longitude and Latitude.]

Figure 5: Clusters identified with a criterion of modularity maximization. The result is the average assignment of a node to one of the two clusters after $2 \cdot 10^5$ code runs with the 20 variant connectivity matrices. Two colors separate the two clusters.


[Figure 6 image: map of the Gulf of Lion showing the 32 numbered sites; the color scale is labelled Bridging Centrality.]

Figure 6: Bridging centrality values for the 32 sites. Geographical representation. Nodes 11 and 12 have the highest values: 600 and 572 respectively.


[Figure 7 image: plot with Sites on the horizontal axis and Cycles Length on the vertical axis.]

Figure 7: Sum of the products of the weights of all the cycles (length from 2 to 5 steps) that start from each of the 32 sites. The minima correspond to the nodes for which the probability of particles returning home (in a time span of 2 to 5 generations) is higher.


| Sites tested in metapopulation model analysis | Hierarchy based on metapopulation model | Hierarchy based on Betweenness | Hierarchy based on Shortest Cycles | Hierarchy based on Bridging Centrality |
|---|---|---|---|---|
| 1 | 4th | 1st | 3rd | 4th |
| 10 | 3rd | 2nd | 2nd | 1st |
| 18 | 1st | 3rd | 1st | 2nd |
| 32 | 2nd | 4th | 4th | 2nd |

Table 1: Comparison of the hierarchy based on metapopulation model analysis of the four nodes tested in Guizien et al. (2014) with the hierarchies derived from different graph theory measures.


| Measure | Scope | Interpretation |
|---|---|---|
| Minimum Cycles | Identify sites with a high probability that the larvae spawned from them return home. | Important for persistence. |
| Betweenness | Identify the nodes through which the highest percentage of most probable paths pass. | Nodes maintaining the gene flux at the whole network scale. |
| Modularity | Find sets of nodes that are more than randomly connected. | Sub-populations. Indicates the presence of rescue mechanisms. |
| Bridging Centrality | Find nodes leading the communication between clusters. | Nodes preventing the fragmentation of the network. |

Table 2: Summary of the four main measures used in this study. For each one we indicate its scope and its physical-biological interpretation.
