Predictive models for freshwater fish community composition

Julian David Olden

A thesis submitted in conformity with the requirements

for the degree of Master's of Science

Graduate Department of Zoology

University of Toronto

© Copyright by Julian David Olden 2000

National Library of Canada / Bibliothèque nationale du Canada

Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques

395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Predictive models for freshwater fish

community composition

Master's of Science 2000

Julian David Olden

Graduate Department of Zoology

University of Toronto

Empirical models were developed to predict fish species occurrence, richness and

community composition for 286 temperate lakes located in south-central Ontario, Canada

based on whole-lake measures of habitat. Detailed analysis and comparison of traditional

(i.e., logistic regression, discriminant analysis) and alternative (i.e., classification trees, and

artificial neural networks) modeling statistical techniques show that predictive success differs

among species and approaches. Neural networks (NNs) are focused upon in subsequent

chapters given their potential utility in solving non-linear problems related to pattern

recognition and prediction. Details of model construction, optimization and validation are

illustrated, and new methods for quantifying the explanatory value of NNs are developed.

Next, the utility of NNs to aid in understanding and predicting species abundance using near-

shore measures of lake habitat is illustrated. Together, my thesis provides ecological and

methodological insights into modeling of fish communities, both of which have considerable

value for the study, management and conservation of aquatic ecosystems.

It's currently 1:34 in the morning and rather than going home I am considering

spending the night on the lab bench. But no . . . I must keep working as there is much to be

done . . . okay, just one more section then it's time to get some shut-eye with the cockroaches

of the Ramsay Wright building. Oh man . . . the acknowledgements! Not a good thing to

write when the caffeine is wearing off and my motor skills are deteriorating quickly.

Ironically, this is perhaps the most difficult task of the whole thesis . . . I'm not kidding.

There have been so many experiences that have guided me during my life. My father's

endless stories of sailing the seas, my mother's hand on my shoulder as I hurry to complete

my school work before running to the cockpit of the boat, my high school biology teacher

Mr. Johnson who first exposed me to the many wonders of science, the student last week

who asked me to clarify what exactly I was talking about. Hang on a minute! Let's do this

right, you know, in some organized manner (boy, my mum would be proud).

First, I can honestly say that I would not be in the position that I am today without my

supervisor, Don Jackson. He gave me my first chance, hiring me as a summer field assistant,

having the amazing insight to look beyond my early sub-par grades in university and

acknowledging my potential as a scientist. For although many believe Don's expertise is

mainly numerical, just spend one morning on a lake with him and you will soon realize his

overwhelming knowledge of natural history. It is this knowledge that Don has blessed me

with, starting in the field in Dorset, continuing during morning coffee and evening beers, and

hopefully never ending. Thank you Don . . . well . . . for everything. I look forward to the

years ahead of working with you, and most importantly, our continued friendship.

Pedro Peres-Neto . . . for people who know Pedro need I say more. My second

supervisor, an endless source of statistical information, wealth of ecological knowledge, love

for Rickards Red (even if it's flat), obsession with Saigon's Palace #34, and countless stories

of Brazil. I have never met such a dedicated individual to his studies, family and friends. I

have immensely enjoyed the hours of talking of science with you. Thanks for the pep talks,

constant reassurance, and especially for being such a great and loyal friend. Hey Pedro . . .

"Is Russia in Europe or what?!?". Thanks to Bryan "The Notorious" Neff, Peng "Sperm

Boy" Fu and Trevor "Gunter" Pitcher for countless hours of laughter, gossip, Unreal

Tournament, Risk (long live the Black Plague), lifting weights, fishing (Bryan's black toe),

burning sessions, drinking beer and scotch, Kumtoya, Hung Fa, Kuïm Jug Yoen, Swiss

Chalet, NOT TIPPING $$, and just generally goofing-off! I have enjoyed many intellectual

discussions with Bryan, Monte Carlo this, bootstrap that, and most importantly . . . "What the

heck is a time lag anyway?". With the introduction of Peng conversation began to slowly

deteriorate in content, but there were still numerous instances I remember of talking

experimentation prior to the start of the movie, life histories during coffee shop club (sorry

Bryan you couldn't join!), or just science in general while walking to the pub. Thanks Peng,

but . . . "What's that metal thing in the bottom of the kettle?" Finally, with the entrance of

Trevor conversation completely depreciated, out with talking about science, in with endless

jokes and the amazing realization that he would truly give the shirt off his own back for you.

We all enjoyed living vicariously through Trevor! Oh ya Gunt . . . "How's that rock bass

taste?". Thanks to all the boys for the nights of drinking at Pedro's place, Duke of York,

Bedford Academy, My Apartment, and we couldn't forget B.R., smoking cigars, and live

jazz.

Much love to Ladan Mehranvar for endless hours of talking, watching movies, eating

exotic foods, and the fantastic mornings at Bake Works. Your company is already dearly

missed. Many thanks to Pamela MacRae for being a great friend and supporter, and Jeanette

Davis for adding a unique twist to the lab in later years. Thanks to the basketball crew

(Michelle Tseng, Lock Rogers, Paul Williams and Dave Punzalan) for allowing me to TAKE

IT TO THE HOOP . . . boy those GSU rims are forgiving!

Thanks to my committee, Brian Shuter, Nick Collins and Keith Somers for their

critical comments on my thesis and for stimulating me intellectually. Nick Mandrak for

providing the Algonquin Park dataset used in the thesis, and for numerous conversations

regarding the fish communities of the park and surrounding areas. Sovan Lek for his

insightful comments about the finer details of neural networks. Finally, Locke Rowe for

showing me his undergrad transcript years ago, reassuring me that GPA is not representative

of your ability.

Thanks to Papa Ceo and Cora for providing the materials to prove that it is possible to

survive on pizza. A special thanks to the staff of the many coffee shops I have inhabited over

the last two years. For their friendly smiles, unlimited patience for accepting my hours of

loitering and most importantly for providing the essential fuel needed to ensure the

completion of this thesis. Cheers, this coffee is for you!

Saving the best for last . . . my family . . . who has provided the much needed love and

support during my years at U of T. Thanks Mum and Dad for making the effort to travel to

the evil dwellings of Toronto, dragging my b ~ a out of my office, and force-feeding me.

Mum, your spoken and unspoken (biting your tongue I'm sure) concern of whether I would

finish the thesis was surprisingly reassuring at times, and Dad's laid back, "I know my son

can do it" attitude provided the perfect contrast. My brother Morgan, who continually asked

whether I had time to spare to just hang out. Although my answer was too frequently "no",

he emphasized the importance of family and took it upon himself to maintain strong ties

among us all. Much love to Tiff, Dilys, Morgan, Samantha, Betty and George . . . I'll phone

from Colorado . . . I promise!

Funding for this thesis was provided by a number of sources, including a Natural

Sciences and Engineering Research Council of Canada Graduate Scholarship, Edna Margaret

Robertson Scholarship, Frederick P. Ide Graduate Award, University of Toronto Open

Scholarships, Department of Zoology Teaching Assistantships and Travel Awards, and a

Natural Sciences and Engineering Research Council of Canada Research Grant to Don

Jackson.

Table of Contents

Abstract

Acknowledgements

Table of Contents

List of Tables

List of Figures

List of Boxes

List of Appendices

Thesis Introduction

CHAPTER 1 Predictive models for community assembly: Fish species occurrence in lakes

  Abstract

  Introduction

  Methods

  Results

  Discussion

  Conclusion

  Acknowledgments

  References

CHAPTER 2 Illuminating the "black box": A randomization approach for understanding variable contributions in artificial neural networks

  Abstract

  Introduction

  Case Study

  Interpreting neural network connection weights

  Illuminating the "black box"

  Conclusion

  Acknowledgments

  References

CHAPTER 3 Artificial Neural Networks: A predictive tool for fisheries science

  Abstract

  Introduction

  Artificial Neural Networks

  Methods

  Results

  Discussion

  Conclusion

  Acknowledgments

  References

Thesis Conclusion

Appendix A

Appendix B

Appendix C

Appendix D

Appendix E

List of Tables

Table 1.1. List of fish species, including species abbreviation (Code) and frequency of occurrence (%) in the 286 study lakes. Only species occurring in greater than 5% of the lakes are included.

Table 1.2. Summary statistics for the whole-lake habitat variables used in model development (see Appendix B for Pearson product-moment correlation coefficients among variables).

Table 1.3. Summary of predictive performance of species-habitat models. Reported values are percentage correctly classified (CC), specificity (SP: ability to accurately predict species absence) and sensitivity (SN: ability to accurately predict species presence). Predictions significantly different from random (based on Kappa statistic) are indicated in italics (α = 0.05). Species codes are defined in Table 1.1.

Table 1.4. Species exhibiting high (significantly different from random) and low (not significantly different from random) predictability for logistic regression analysis, linear discriminant analysis, classification trees and artificial neural networks.

Table 1.5. Comparison of observed and predicted fish community composition for lakes based on first six components from a principal coordinate analysis using Jaccard's similarity coefficient. The range of total amounts of variation explained by the first six components of the between lakes matrices is summarized whereas all components were retained from the species ordinations. Reported values are Gower's m², ranging from 0 to 1 where 0 indicates perfect agreement, and associated significance levels based on 9,999 randomizations from PROTEST in parentheses.

Table 2.1. Connection weight structure for the neural network modelling fish species richness as a function of 8 habitat variables. $W_{ij}$ represents the input-hidden-output connection weight for input variable $i$ (where $i$ = 1 to 8) and hidden neuron $j$ (where $j$ = A to D). P values for input-hidden-output connection weights ($W_{Ai}$, $W_{Bi}$, $W_{Ci}$ and $W_{Di}$), overall connection weights ($\sum W_{Ai-Di}$), and Garson's relative importance (%) are based on 9,999 randomizations.

Table 3.1. Summary statistics of macro-scale habitat variables used in the neural networks to predict species presence or absence.

Table 3.2. Performance of neural networks for predicting species presence or absence in 128 lakes in the Madawaska River drainage (Training Data) based on leave-one-out cross validation, and applying the Madawaska networks to predicting occurrence in 32 lakes from the Oxtongue River drainage (Test Data). The reported values are per cent species occurrence (SO), # of hidden neurons in network (HN), optimal decision threshold based on ROC analysis (ODT), per cent correct classification (CC), sensitivity (SN), specificity (SP), Kappa statistic and associated P-value.

Table 3.3. Comparison of model predictions between full and pruned networks with input variables and hidden neurons removed that were not statistically significant (based on randomization test results). Pruned network design is reported after species names, where the three values represent the number of input, hidden and output neurons, respectively. The reported values are per cent correct classification (CC), sensitivity (SN), specificity (SP) for predicting species presence-absence (based on the optimal decision threshold from ROC analysis), and correlation coefficient (r) between predicted and actual abundances and root-mean-square-of-error prediction (RMSE).

List of Figures

Figure 1.1. Study lakes located in Amable du Fond River (A), Bonnechere River (B), Madawaska River (C), Oxtongue River (D) and Petawawa River (E) drainage basins of Algonquin Provincial Park, Ontario, Canada (45°50' N, 78°20' W).

Figure 1.2. Results from McNemar's test assessing differences among patterns of lake misclassifications using logistic regression, discriminant analysis, classification trees and artificial neural networks. Shared shading for a species represents pair-wise differences for lakes in which species occurrence was incorrectly predicted (based on α < 0.05). For example, blacknose shiner was misclassified in different sets of lakes using logistic regression and classification tree.

Figure 1.3. The effect of frequency of species occurrence on overall correct classification rate (top panel), specificity (middle panel) and sensitivity (lower panel) for 27 fish species using logistic regression (LRA), discriminant analysis (LDA), classification trees (CART) and artificial neural networks (ANN), expressed as percentage. Dotted lines represent expectations based on chance predictions.

Figure 1.4. Regression analysis between actual and predicted species richness using logistic regression (LRA: y=3.24+0.73x), discriminant analysis (LDA: y=2.96+0.75x), classification trees (CART: y=2.85+0.78x) and artificial neural networks (ANN: y=2.64+0.86x). Solid line represents regression line and the dashed line represents a line of perfect match (1:1). Predicted richness was tabulated by summing predicted species presences using models for each of the 27 species.

Figure 1.5. First two axes from principal coordinate analysis based on Jaccard's similarity coefficient of data sets (A) known or observed species composition, (B) composition predicted by logistic regression model, (C) predicted by discriminant analysis, (D) predicted by classification trees and (E) predicted by artificial neural networks. Letters refer to species codes listed in Table 1.1.

Figure 2.1. Neural interpretation diagram (NID) for neural network modelling fish species richness as a function of 8 habitat variables. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons: black connections are positive (excitator) and gray connections are negative (inhibitor).

Figure 2.2. Bar plots showing the percentage relative importance of each habitat variable in the neural network predicting fish species richness based on Garson's algorithm. See Box 2.1 for calculations involved in Garson's algorithm.

Figure 2.3. Contribution plots from the sensitivity analysis illustrating the output of the network to changes in each habitat variable with all other variables held at their 20th, 40th, 60th and 80th percentile (each percentile drawn with its own line style).

Figure 2.4. Distributions of (A) input-hidden-output connection weights for hidden neuron B, (B) overall connection weight and (C) input relative importance (%) for the influence of surface area on lake species richness. Arrows represent observed input-hidden-output connection weight for hidden neuron B (7.81), overall connection weight (7.43) and relative importance (18.27%).

Figure 2.5. Neural Interpretation Diagram after non-significant input-hidden-output connection weights are eliminated using the randomization test. Only connection weights statistically different from zero (α = 0.05 and α = 0.10) are shown. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons: black connections are positive (excitator) and gray connections are negative (inhibitor). Black input neurons indicate habitat variables that have an overall positive influence on species richness, and gray input neurons indicate an overall negative influence on species richness.

Figure 3.1. One-hidden layer, feedforward neural network design.

Figure 3.2. First panel shows the location of study lakes from the Madawaska River drainage (128 lakes depicted by circles) and Oxtongue River drainage (32 lakes depicted by triangles) in Algonquin Provincial Park, Ontario, Canada (45°50' N, 78°20' W). Second panel shows Crosson Lake (45°05' N, 79°02' W) with 20 sampling stations depicted by circles.

Figure 3.3. Neural interpretation diagram (NID) for predicting fish species presence or absence as a function of macro-habitat variables. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons; black connections are positive (excitator) and gray connections are negative (inhibitor). Solid lines represent connection weights statistically different from zero (α = 0.05), whereas dashed lines represent non-significant connection weights. Black input neurons indicate habitat variables that have an overall positive influence on species occurrence, and gray input neurons indicate an overall negative influence on species occurrence.

Figure 3.4. Relative importance (% of total contribution) of whole-lake habitat variables in predicting species presence or absence based on the sum of connection weights joining an input neuron and the output neuron.

Figure 3.5. Neural interpretation diagram (NID) for predicting fish species abundance as a function of micro-habitat variables. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons; black connections are positive (excitator) and gray connections are negative (inhibitor). Solid lines represent connection weights statistically different from zero (α = 0.05), whereas dashed lines represent non-significant connection weights. Black input neurons indicate habitat variables that have an overall positive influence on species abundance, and gray input neurons indicate an overall negative influence on species abundance.

Figure 3.6. Relative importance (% of total contribution) of micro-habitat variables in predicting species abundance based on the sum of connection weights joining an input neuron and the output neuron.

List of Boxes

Box 2.1. Garson's algorithm for partitioning and interpreting neural network connection weights. Sample calculations shown for 3 input neurons (1, 2 and 3), 2 hidden neurons (A and B) and 1 output neuron (O).

List of Appendices

Appendix A List of 286 study lakes of Algonquin Provincial Park used in Chapter 1 (included is latitude and longitude co-ordinates).

Appendix B Matrix of Pearson product-moment correlation coefficients for raw (i.e., untransformed) whole-lake variables located in the lower triangle, and for ln(x) transformed variables located in the upper triangle (except for pH). Note that only continuous variables are shown. Variables include surface area (SA), volume (V), total shoreline perimeter (SP), maximum depth (MD), total dissolved solids (DS), pH, lake elevation (LE), and growing degree-days (GD).

Appendix C Data type (i.e., raw or transformed) for which each species model exhibited the greatest correct classification rate using logistic regression (LRA), discriminant analysis (LDA), classification tree (CART) and artificial neural network (ANN). Optimal (i.e., highest correct classification rate) classification tree size (i.e., number of terminal leaves) and number of hidden neurons in the neural network are reported based on n-fold cross validation. See Table 1.1 for definitions of species codes.

Appendix D Logistic regression coefficients for 27 fish species models. Reported values are the y-intercept (Int.), surface area (SA), volume (V), total shoreline perimeter (SP), maximum depth (MD), total dissolved solids (DS), pH, lake elevation (LE), growing degree-days (GD), occurrence of summer stratification (SS), watershed dummy variable (W1-W3) and occurrence of a littoral-zone predator (P). See Table 1.1 for definitions of species codes.

Raw canonical coefficients and centroid means from discriminant function analysis for 27 fish species. Reported values are the constant, surface area (SA), volume (V), total shoreline perimeter (SP), maximum depth (MD), total dissolved solids (DS), pH, lake elevation (LE), growing degree-days (GD), occurrence of summer stratification (SS), watershed dummy variable (W1-W3), occurrence of a littoral-zone predator (P), and centroid means for absence (0) and presence (1) of species. See Table 1.1 for definitions of species codes.

Appendix E MatLab (version 5.3 release 11) programming code for artificial neural network training using the least-sum-of-squares error function (i.e., continuous response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.

MatLab (version 5.3 release 11) programming code for artificial neural network training using the cross entropy error function (i.e., binary response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.

Thesis Introduction

"The history of life on earth has been a history of interaction between living

things and their surroundings. To a large extent, the physical form and the

habits of the earth's vegetation and its animal life have been molded by the

environment. Considering the whole span of earthly time, the opposite effect,

in which life actually modifies its surroundings, has been relatively slight.

Only within the moment of time represented by the present century has one

species - man - acquired significant power to alter the nature of his world."

- Rachel Carson

Global biodiversity is changing at an unprecedented rate (Pimm et al. 1995) as a

result of human-induced changes in the global environment (Vitousek 1994), with greatest

impacts related to habitat loss (Sih et al. 2000). Habitat loss and modification have resulted

in the loss of biodiversity in many aquatic and terrestrial ecosystems (Wilson 1994, Sinclair

et al. 1995); for example, native fishes in Lake Victoria (Kaufman 1992), birds in fragmented

Brazilian forests (Willis 1979), and small mammals in arid Australia (Woinarski and

Braithwaite 1990). During the last decade, understanding and predicting regional (e.g.,

Prendergast et al. 1993), continental (e.g., Wahlberg et al. 1996, Channell and Lomolino

2000) and global changes (e.g., Rex et al. 1993, Guégan et al. 1998, Redford and Richter

1999, Sala et al. 2000) of biodiversity in response to changes in environment and land-use

have been of major importance.

A central problem in conservation ecology is how to use limited resources of time,

money, and energy most effectively to minimize the loss of the Earth's biological diversity. For

example, while species and habitats are disappearing at an alarming rate, we have often been

unable to evaluate the extent of the biodiversity loss, let alone predict it. Empirical models

could play an important role in conservation ecology by providing a quantitative framework

from which species distributions, richness and community structure can be predicted from

patterns in habitat heterogeneity, biotic interactions and anthropogenic conditions. For

instance, predictive models of fish species occurrence at broad spatial scales could be helpful

in assessing the major factors shaping current and predicting future distributions. Such

models also serve an important role in forecasting the effects of changing land-use practices

(e.g., Guégan et al. 1998), altered climate regimes (e.g., Tonn 1990) and biotic invasions

(e.g., Hrabik and Magnuson 1999) on biota. Similarly, empirical models for finer scale

measures, such as species abundance, are critical for understanding relationships between the

environment, its use by an organism and subsequent productivity at smaller, local scales,

which can then be used to predict the influence of habitat alteration, species introductions,

and other artificial and natural perturbations on population and community health.

Although the public and scientists have traditionally viewed loss of biodiversity as an

issue of tropical ecosystems, there has been recent emphasis on the potential loss of

biodiversity in aquatic ecosystems of temperate regions (Moyle and Williams 1990, Hughes

and Noss 1992). Surprisingly, fisheries scientists have remained focused on understanding

species-environment relationships rather than developing predictive models using this

knowledge. The primary objective of this thesis is to advance the predictive realm of fish

ecology by focusing efforts on developing and comparing predictive models of species

occurrence, abundance, richness and community composition for freshwater fish

communities. Such models represent a major advance in studies of lake ecology and have

obvious research, conservation and management applications. For instance, predictive

models could serve as a template for forecasting species occurrence in lakes whose fauna has

not or cannot be adequately sampled, as well as predicting the effects of habitat modification

and changing land-use patterns on fish populations and communities. In addition, fish-

habitat models would provide managers with tools to predict biodiversity, direct searches for

unknown fish populations, predict the presence of indicator species, indicate habitat

suitability for restoration or reintroduction of species, and predict the spread of exotic

species.

The aim of the first chapter is to use conventional (i.e., logistic regression and

discriminant analysis) and alternative (i.e., classification trees and artificial neural networks)

statistical approaches to develop predictive models of species presence or absence based on

lake-wide measures of habitat for 286 lakes of south-central Ontario, Canada. Fish-habitat

models are developed for 27 species, and the predictive performance of the models is

examined, interpreted and compared in detail using a number of performance measures.

Many of these analyses are an advancement over conventional model evaluations, and

provide insight into determining the predictability of species occurrence, richness and

community composition as a function of whole-lake measures of habitat. Individual species-

habitat models are then used to predict richness and community composition of the study

lakes. If fish communities are predictable then a community-level approach to the

conservation of aquatic biodiversity may be feasible, rather than focusing efforts on

particular species or populations (Angermeier and Schlosser 1995). Such an approach would

be a major advance relative to current management programs (Evans et al. 1987, Franklin

1993, Angermeier and Winston 1999), and consequently predictive models for fish

communities could play a critical role.
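The performance measures named for Chapter 1 (per cent correct classification, sensitivity, specificity and the Kappa statistic) can be sketched from paired vectors of observed and predicted presence-absence. The following is a minimal Python illustration under stated assumptions, not the code used in the thesis: the function name `classification_summary` and the eight-lake example data are hypothetical.

```python
def classification_summary(observed, predicted):
    """Return CC, SN, SP (percentages) and Cohen's kappa for 0/1 data.

    observed, predicted: equal-length sequences of 0 (absent) or 1 (present).
    """
    pairs = list(zip(observed, predicted))
    n = len(pairs)
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presence predicted correctly
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absence predicted correctly
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # absence predicted as presence
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presence predicted as absence

    cc = 100.0 * (tp + tn) / n                                   # correct classification
    sn = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")   # sensitivity
    sp = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")   # specificity

    # Kappa compares observed agreement with the agreement expected by chance.
    p_obs = (tp + tn) / n
    p_exp = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else float("nan")
    return {"CC": cc, "SN": sn, "SP": sp, "kappa": kappa}


# Hypothetical example: one species across eight lakes.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_summary(observed, predicted))
```

In this made-up example three of the four presences and three of the four absences are predicted correctly, so CC, SN and SP are each 75% and Kappa is 0.5. Testing whether Kappa differs from random, as the thesis does, is a separate step not shown here.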

A secondary objective of Chapter 1 is to compare the performance of conventional

and alternative approaches for predicting species occurrence, richness and community

composition. Such comparisons are increasingly important given the large and evolving

range of approaches available for modeling, and the potential difficulty this poses for

ecologists or conservation biologists in choosing appropriate methods. For instance,

conventional techniques, based notably on multiple regression, are capable of solving many

problems, but a major drawback is that species-environment relationships are often non-

linear (linearity being a critical assumption of these methods). In recent years, recursive-

partitioning and machine-learning techniques have received greater attention in the biological

sciences. Examples include classification and regression trees and artificial neural networks,

which show promise in ecological studies given their ability to model complex, non-linear

relationships among variables.

Although alternative techniques can be beneficial in that they can readily model

complex, non-linear associations between a species and its environment (Lek et al. 1996),

methods for quantifying and interpreting these associations are generally lacking. For

instance, artificial neural networks have been called a "black box" approach to modeling

ecological phenomena since they have been viewed as providing no information regarding

the importance of the variables in the model (e.g., Paruelo and Tomasel, 1997; Lek and

Guégan, 1999; Özesmi and Özesmi, 1999). In chapter 2, my objective is to illuminate the

"black box" by providing a synthesis of the methods available to better understand variable

contribution in networks. In addition, a randomization test for artificial neural networks is

developed which enables testing of statistical significance of independent variables in

network predictions and facilitates the interpretation of variable interactions.

Using the approaches developed in chapter 2, chapter 3 expands the utility of neural

networks for predicting and understanding species-habitat relationships in terms of

occurrence at a regional scale (i.e., within drainage) as a function of whole-lake measures of

habitat, and abundance at a local scale (i.e., within lake) as a function of near-shore habitat

variables. Given that species abundance may be a more sensitive response variable for

studying fish-habitat relationships, a combined analysis of both occurrence and abundance is

beneficial. Furthermore, a more detailed evaluation of the species occurrence models is

conducted, including the construction of Receiver-Operating Characteristic Plots (Metz

1978) to estimate optimal decision thresholds for prediction to maximize classification

success and independent assessment of geographic transferability of the models.

References

Angermeier, P. L., and I. J. Schlosser. 1995. Conserving aquatic biodiversity: beyond species and populations. American Fisheries Society Symposium 17:402-414.

Angermeier, P. L., and M. R. Winston. 1999. Characterizing fish community diversity across Virginia landscapes: Prerequisite for conservation. Ecological Applications 9:335-349.

Channell, R., and M. V. Lomolino. 2000. Dynamic biogeography and conservation of endangered species. Nature 403:84-86.

Evans, D. O., B. A. Henderson, N. J. Bax, T. R. Marshall, R. T. Oglesby, and W. S. Christie. 1987. Concepts and methods of community ecology applied to freshwater fisheries management. Canadian Journal of Fisheries and Aquatic Sciences 44 (Suppl. 2):448-470.

Franklin, J. F. 1993. Preserving biodiversity: species, ecosystems, or landscapes? Ecological Applications 3:202-205.

Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity predict global riverine fish diversity. Nature 391:382-384.

Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt (Osmerus mordax) in a northern Wisconsin lake district and implications for management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.

Hughes, R. M., and R. F. Noss. 1992. Biological diversity and biological integrity: Current concerns for lakes and streams. Fisheries (Bethesda) 17:11-19.

Kaufman, L. 1992. Catastrophic changes in species-rich freshwater ecosystems. BioScience 42:846-858.

Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996. Application of neural networks to modelling nonlinear relationships in ecology. Ecological Modelling 90:39-52.

Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling, an introduction. Ecological Modelling 120:65-73.

Metz, C. E. 1978. Basic principles of ROC analysis. Seminars in Nuclear Medicine 8:283-298.

Moyle, P. B., and J. E. Williams. 1990. Biodiversity loss in the temperate zone: decline of the native fish fauna of California. Conservation Biology 4:275-284.

Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat modelling with interspecific interaction. Ecological Modelling 116:15-31.

Paruelo, J. M., and F. Tomasel. 1997. Prediction of functional characteristics of ecosystems: a comparison of artificial neural networks and regression models. Ecological Modelling 98:173-186.

Prendergast, J. R., R. M. Quinn, J. H. Lawton, B. C. Eversham, and D. W. Gibbons. 1993. Rare species, the coincidence of diversity hotspots and conservation strategies. Nature 365:335-337.

Pimm, S. L., G. J. Russell, J. L. Gittleman, and T. M. Brooks. 1995. The future of biodiversity. Science 269:347-351.

Redford, K. H., and B. D. Richter. 1999. Conservation of biodiversity in a world of use. Conservation Biology 13:1246-1256.

Rex, M. A., C. T. Stuart, R. R. Hessler, J. A. Allen, H. L. Sanders, and G. D. F. Wilson. 1993. Global-scale latitudinal patterns of species diversity in the deep-sea benthos. Nature 365:636-639.

Sala, O. E., and 18 other authors. 2000. Global biodiversity scenarios for the year 2100. Science 287:1770-1774.

Sih, A., B. G. Jonsson, and G. Luikart. 2000. Habitat loss: ecological, evolutionary and genetic consequences. Trends in Ecology and Evolution 15:132-134.

Sinclair, A. R. E., D. S. Hik, O. J. Schmitz, G. G. E. Scudder, D. H. Turpin, and N. C. Larter. 1995. Biodiversity and the need for habitat renewal. Ecological Applications 5:579-587.

Tonn, W. M. 1990. Climate change and fish communities: A conceptual framework. Transactions of the American Fisheries Society 119:337-352.

Vitousek, P. M. 1994. Beyond global warming: Ecology and global change. Ecology 75:1861-1877.

Wahlberg, N., A. Moilanen, and I. Hanski. 1996. Predicting the occurrence of endangered species in fragmented landscapes. Science 273:1536-1538.

Willis, E. O. 1979. Species reductions in remanescent woodlots in southern Brazil. Proceedings International Ornithological Congress XVII:783-786.

Wilson, E. O. 1994. Biodiversity: challenge, science and opportunity. American Zoologist 34:5-11.

Woinarski, J. C. Z., and R. W. Braithwaite. 1990. Conservation foci for Australian birds and mammals. Search 21:65-68.

CHAPTER 1

Predictive models for community assembly: Fish species occurrence in lakes

ABSTRACT

The prediction of species occurrence and community composition is of primary

importance in ecology and conservation biology. Given the large and evolving range of

approaches available for developing predictive models of species presence or absence, it is

potentially difficult for ecologists or conservation biologists to choose appropriate methods.

In this study, I used logistic regression analysis, linear discriminant analysis, classification

trees and artificial neural networks to develop predictive models of presence or absence for

27 fish species based on habitat variables of 286 temperate lakes located in south-central

Ontario, Canada. Detailed evaluation of these models based on overall correct classification,

specificity (i.e., ability to accurately predict species absence) and sensitivity (i.e., ability to

accurately predict species presence), showed that the approaches differed marginally in mean

predictive performance across all species. All four methods exhibited higher levels of correct

classification (76.6-78.9% mean success) and specificity (76.8-77.3% mean success)

compared to sensitivity (35.1-46.2% mean success), and levels of these three measures were

found to depend on the species frequency of occurrence in the study lakes. On the other

hand, individual species predictability varied greatly among approaches and species.

Furthermore, even when correct classification rates were similar for some species, I found

that linear approaches (i.e., logistic regression and discriminant analysis) and non-linear

approaches (classification tree and neural networks) differed in which of the study lakes they

correctly classified. The 27 species-habitat models derived using each method were used to

predict species richness and composition of each study lake, and results showed close

agreement with observed richness and composition using all approaches. Predictability of

community composition varied across the 5 drainages depending on the modeling approach,

with species showing both individual and shared (i.e., with other species) predicted responses

to habitat conditions. I showed that easily obtained lake attributes can be used to predict fish

species occurrence, richness and community composition with high success.

INTRODUCTION

Historically ecologists have been interested in understanding and predicting the

distribution of species and the composition of communities across landscapes (Orians 1980,

Wiens 1992, Pickett et al. 1994). However the relative emphasis that has been placed on the

explanatory and predictive components of ecological research varies substantially across

disciplines and taxa (Keddy 1992). Plant ecologists have placed more emphasis on

developing models to predict species distributions (e.g., Hill and Keddy 1992, Toner and

Keddy 1997, Wiser et al. 1998) as have stream ecologists in predicting the occurrence of

invertebrates (e.g., Bailey et al. 1998, Chessman 1999, Moss et al. 1999) and fish (e.g., Kruse

et al. 1997, Dunham and Rieman 1999, Rahel and Nibbelink 1999, Scheller et al. 1999). In

contrast, lake ecologists have generally focused on understanding species-environment

processes rather than developing predictive models.

Our understanding about fish-environment associations in lakes has emerged

primarily from comparative studies that describe statistical relationships between sets of

environmental variables and species occurrence or abundance. These studies identified the

influence of abiotic conditions (e.g., lake morphology, water chemistry), biotic interactions

(e.g., predation, competition), habitat isolation and human-related factors in structuring fish

populations and communities at local, landscape and regional spatial scales (e.g., Jackson and

Harvey 1989, Tonn et al. 1990, Hinch et al. 1994, Rodriguez and Lewis 1997, Magnuson et

al. 1998). Examining variation in fish-habitat relationships is useful to better understand

patterns of species distribution, and for providing insight into the mechanisms shaping and

regulating assemblage structure. However, fish ecologists focusing on lakes have generally

not used this understanding to develop predictive models for species occurrence and

community composition. Such models would represent a major advance in lake ecology and

would have obvious applications in research, conservation and management. For example,

predictive models could be used to forecast the effects of habitat modification and changing

land-use patterns on fish populations, to estimate habitat suitability for restoration or

reintroduction of species, and to predict the potential spread of exotic species. Moreover,

predictive models for fish communities could aid in the effective conservation of aquatic

biodiversity by focusing management efforts on whole communities (Evans et al. 1987,

Franklin 1993, Angermeier and Winston 1999) rather than individual populations

(Angermeier and Schlosser 1995).

Developing predictive models for species occurrence is often difficult since fish

commonly exhibit complex, non-linear responses to habitat heterogeneity and biotic

interactions. Logistic regression and linear discriminant analysis remain the most frequently

used techniques, although our confidence in the results is often limited by the inability to

meet a number of assumptions, such as statistical distributions of variables, independence of

variables, and model linearity (James and McCulloch, 1990). Consequently, researchers

have begun to employ non-linear statistical approaches such as classification and regression

trees (e.g., Magnuson et al. 1998, Emmons et al. 1999, Rathert et al. 1999, Rejwan et al.

1999) and artificial neural networks (e.g., Lek et al. 1996, Mastrorillo et al. 1997, Guégan et

al. 1998, Manel et al. 1999a;b, Özesmi and Özesmi 1999) for modeling ecological data. It is

believed that these alternative approaches can provide researchers with more flexible tools

for modeling the complex relationships between species and their surrounding environment.

The primary objective of my study is to determine whether lake habitat conditions

relate to species occurrence and ultimately the composition of fish communities, in some

predictable manner. I address this objective using logistic regression analysis, discriminant

analysis, classification trees and artificial neural networks to develop fish-habitat models for

species occurring in temperate lakes of Ontario, Canada. I determine the predictability of

species presence or absence based on readily available, whole-lake habitat features and

provide a detailed evaluation and comparison among species and among modeling

approaches. Second, each fish-habitat model is applied to assess the predictability of species

richness and community composition. Model performance in predicting the composition of

communities of different drainage basins is assessed and patterns in predicted and observed

species membership in the communities are compared. Many of these analyses are

advancements over more conventional model evaluations, and provide important insight into

the predictability of species occurrence, richness and community composition as a function

of whole-lake measures of lake habitat.

METHODS

Ecological data

The study system consisted of 286 freshwater lakes from five drainage basins located

in south-central Ontario, Canada (Fig. 1.1, Appendix A). Aquatic communities in this region

are representative of relatively natural ecosystems because these lakes are located in

Algonquin Provincial Park and are subject to minimal perturbations from development and

species introductions. I developed fish-habitat models for 27 fish species (Table 1.1) by

modeling species presence or absence as a function of 13 whole-lake or watershed-level

habitat characteristics. These predictor variables were chosen to include factors that are

related to known habitat requirements of fish in this region (e.g., Matuszek and Beggs 1988,

Minns 1989; Table 1.2), and included: (1) surface area; (2) volume; (3) total shoreline

perimeter (sum of lake and island perimeters); (4) maximum depth; (5) surface measurements

(taken at depths 2.0 m) of total dissolved solids; (6) pH; (7) lake elevation; (8) growing

degree-days (obtained by subtracting the value five from the average daily temperature and

summing across all days that the average daily temperature is above 5 °C); (9) occurrence of

summer stratification; (10) occurrence of a large littoral-zone piscivore (i.e., northern pike,

smallmouth bass or largemouth bass) when modeling small-bodied fish; and (11-13) three

binary variables delineating the five drainage basins, i.e., Amable du Fond, Bonnechere,

Madawaska, Oxtongue and Petawawa Rivers, to account for the potential influence of

historical biogeography on fish community composition. All data were obtained from the

Algonquin Park Fish Inventory Data Base (Crossman and Mandrak 1991), and information

regarding the standardized methodology for this inventory can be obtained from Dodge et al.

(1985).
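The growing degree-day predictor defined above can be illustrated with a minimal Python sketch. The function name, the 5 °C default argument and the example temperatures are illustrative assumptions; they are not taken from the Algonquin Park inventory or from the software used in this thesis.

```python
def growing_degree_days(daily_mean_temps_c, base_c=5.0):
    """Sum (T - base) over all days whose mean temperature exceeds the base."""
    return sum(t - base_c for t in daily_mean_temps_c if t > base_c)


# Hypothetical daily mean temperatures (degrees C) for five days.
example_temps = [2.0, 5.0, 7.5, 12.0, 4.0]
gdd = growing_degree_days(example_temps)  # only the 7.5 and 12.0 days contribute
```

Days at or below the base temperature add nothing to the total, which matches the wording "above 5 °C" in the definition.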

Modeling species occurrence

I applied logistic regression analysis (LRA), linear discriminant analysis (LDA),

classification and regression trees (CART) and artificial neural networks (ANN) to model

fish presence or absence as a function of the habitat variables described above. Due to their

well-documented use by ecologists, I refrain from detailing LRA and LDA, but discuss

CART and ANNs since these approaches are less familiar to many ecologists.

Figure 1.1. Study lakes located in Amable du Fond River (A), Bonnechere River (B),

Madawaska River (C), Oxtongue River (D) and Petawawa River (E) drainage basins of

Algonquin Provincial Park, Ontario, Canada (45°50' N, 78°20' W).

Table 1.1. List of fish species, including species abbreviation (Code) and frequency of occurrence (%) in the 286 study lakes. Only species occurring in greater than 5% of the lakes are included.

Code   Common Name              Scientific Name                  %
BCS    Blackchin shiner         Notropis heterodon
BNS    Blacknose shiner         Notropis heterolepis
BSB    Brook stickleback        Culaea inconstans
BT     Brook trout              Salvelinus fontinalis
BB     Brown bullhead           Ameiurus nebulosus
B      Burbot                   Lota lota
C      Cisco                    Coregonus artedi
CC     Creek chub               Semotilus atromaculatus
CS     Common shiner            Luxilus cornutus
F      Fallfish                 Semotilus corporalis
FSD    Finescale dace           Phoxinus neogaeus
GS     Golden shiner            Notemigonus crysoleucas
ID     Iowa darter              Etheostoma exile
LC     Lake chub                Couesius plumbeus
LT     Lake trout               Salvelinus namaycush
LW     Lake whitefish           Coregonus clupeaformis
LS     Longnose sucker          Catostomus catostomus
NRD    Northern redbelly dace   Phoxinus eos
PD     Pearl dace               Margariscus margarita
PKS    Pumpkinseed              Lepomis gibbosus
RB     Rock bass                Ambloplites rupestris
RW     Round whitefish          Prosopium cylindraceum
SMB    Smallmouth bass          Micropterus dolomieu
SL     Splake                   S. fontinalis x S. namaycush
T-P    Trout-perch              Percopsis omiscomaycus
WS     White sucker             Catostomus commersoni
YP     Yellow perch             Perca flavescens

Table 1.2. Summary statistics for the whole-lake habitat variables used in model development (see Appendix B for Pearson product-moment correlation coefficients among variables).

Predictor variable                              Minimum   25% quartile   Median   75% quartile   Maximum
Surface Area (ha)
Volume (x 10^4 m^3)
Total Shoreline Perimeter (km)
Maximum Depth (m)
Elevation (m)
pH
Total Dissolved Solids (mg L^-1)
Growing Degree Days
Occurrence of summer stratification (0,1)
Presence of littoral-zone piscivores (0,1)
3 Watershed dummy variables (0,1)

Classification and regression trees. - The use of automatic construction of decision trees

dates from work in the social sciences by Morgan and Sonquist (1963), but Breiman et al.

(1984) had a major influence in bringing CART methods to the attention of statisticians.

CART is a nonparametric multivariate classification technique which is most commonly

implemented using a recursive partitioning algorithm (Ciampi 1990, Hand 1997). This

algorithm partitions the data set into a nested series of paired groupings, each of them as

homogeneous as possible with respect to either the presence or absence of the species. The

procedure begins with the entire data set, also called the root node, and formulates split-

defining conditions for each possible value of the predictor variables to create candidate

splits. Next, the algorithm selects the candidate split that minimizes the misclassification

rate, and uses it to partition the data set into 2 subgroups. The algorithm continues

recursively with each of the new subgroups until no split yields a significant decrease in the

misclassification rate, or until the subgroup contains a small number of observations (i.e.,

usually set to 5 or 10). A terminal node or "leaf" is a node that the algorithm cannot partition

any further, and represents the most homogeneous group (Breiman et al. 1984). The

response class (in this case the presence or absence of a species) for each terminal node is

assigned by minimizing the resubstitution estimate of the probability of misclassification for

the observations of that node. The number of terminal nodes defines the size of the tree.
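The recursive partitioning procedure described above can be sketched in a few lines of Python. This is an illustrative simplification and not the S-Plus routine used for the analyses: candidate thresholds are taken as midpoints between observed values, a split is accepted whenever it lowers the number of misclassified lakes at all (rather than by a significance criterion), and the names `best_split`, `grow` and `predict` are hypothetical.

```python
from collections import Counter


def misclassified(labels):
    """Number of observations outside the majority class of a node."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_split(X, y):
    """Return (errors, predictor index, threshold) for the best binary split."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            err = misclassified(left) + misclassified(right)
            if best is None or err < best[0]:
                best = (err, j, thr)
    return best


def grow(X, y, min_size=5):
    """Partition recursively until no split helps or the node is small."""
    split = best_split(X, y) if len(y) >= min_size else None
    if split is None or split[0] >= misclassified(y):
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    _, j, thr = split
    left = [(r, yi) for r, yi in zip(X, y) if r[j] <= thr]
    right = [(r, yi) for r, yi in zip(X, y) if r[j] > thr]
    return {
        "predictor": j,
        "threshold": thr,
        "left": grow([r for r, _ in left], [yi for _, yi in left], min_size),
        "right": grow([r for r, _ in right], [yi for _, yi in right], min_size),
    }


def predict(tree, row):
    """Follow the splits from the root node down to a leaf."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["predictor"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical data: one predictor (e.g., maximum depth) and presence/absence.
X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y = [0, 0, 0, 1, 1, 1]
tree = grow(X, y, min_size=2)
```

Assigning each leaf its majority class is what minimizes the resubstitution misclassification estimate for the observations in that node.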

Next, optimal tree size is assessed to simplify its structure without sacrificing

goodness-of-fit. This can be a particularly important consideration since more splits in the

tree will result in lower misclassification rates at the cost of poorer predictive power when

applied to data not used in constructing the tree. Alternatively, if the tree is too small then it

will not use all the classification information that is available from the data. I used jackknife

validation to estimate misclassification rates for a series of candidate trees each of different

size, and defined the optimal tree size as the subtree exhibiting the lowest misclassification

rate (Breiman et al. 1984 and Appendix C).

Artificial neural networks - Although ANNs were originally developed to better understand

how the mammalian brain functions, researchers have become more interested in the

potential statistical utility of neural network algorithms (Cheng and Titterington 1994,

Bishop 1995). In this study I use one hidden-layer feedforward neural networks trained by

the backpropagation algorithm (Rumelhart et al. 1986). This type of network is commonly

used because it is considered to be a universal approximator of any continuous function

(Hornik et al. 1989). Furthermore, I use a single hidden layer because this is generally

satisfactory for statistical applications (Bishop 1995), it greatly reduces computational time,

and it often produces similar results compared to multiple hidden layers (Kůrková 1992).

The one hidden-layer feedforward network consists of a single input, hidden and

output layer, with each layer containing one or more neurons. The input layer contains p

neurons, each of which represents one of the p predictor variables, i.e., in my case 12 input

neurons for each species, except for small-bodied species where the input layer contains 13

neurons (addition of littoral-zone predator variable). The number of hidden neurons in the

neural network is determined empirically by calculating the misclassification rates for

networks containing 1 to 20 hidden neurons using n-fold cross validation, and choosing the

number of hidden neurons which produces the lowest misclassification rate (Appendix C;

Bishop 1995). The output layer contains one neuron representing the probability of species

occurrence. Additional bias neurons with a constant output (equal to 1) are added to the

hidden and output layers. These neurons play a similar role to that of the constant term in

multiple regression. Each neuron (excluding the bias neurons) is connected to all neurons

from adjacent layers with an axon. The axon connection between neurons is assigned a

weight which dictates the intensity of the signal transmitted by the axon. In feedforward

networks, axon signals are transmitted in a unidirectional path, from input layer to output

layer through the hidden layer. The "state" or "activity level" of each neuron is determined

by the input received from the other neurons connected to it. The state of each input neuron

is defined by the incoming signal (i.e., values) of the predictor variables. The state of the

other neurons is evaluated locally by calculating the weighted sum of the incoming signals

from the neurons of the previous layer. The entire process can be written mathematically as:

\[ y_k = \phi_o\left( \beta_k + \sum_j w_{jk} \, \phi_h\left( \beta_j + \sum_i w_{ij} x_i \right) \right) \]

where $x_i$ are the input signals, $y_k$ are the output signals, $w_{ij}$ are the weights between input

neuron i to hidden neuron j, $w_{jk}$ are the weights between hidden neuron j and output neuron k,

$\beta_j$ and $\beta_k$ are the bias associated with the hidden and output layers, and $\phi_h$ and $\phi_o$ are

activation functions for the hidden and output layers. There are several activation functions

(see Bishop 1995) and I use the logistic (or sigmoid) function. The outgoing signal to the

output neuron represents the probability of species occurrence.
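As an illustration of how a signal passes through such a network, the following Python sketch computes the output of a one hidden-layer feedforward network with logistic activation functions in both layers. The function and argument names are hypothetical, and the weights in the example are arbitrary values rather than fitted parameters from this study.

```python
import math


def logistic(z):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Predicted probability of occurrence for one lake.

    x      -- list of predictor values (input signals)
    w_ih   -- w_ih[j][i] is the weight from input i to hidden neuron j
    b_h    -- b_h[j] is the bias of hidden neuron j
    w_ho   -- w_ho[j] is the weight from hidden neuron j to the output neuron
    b_o    -- bias of the output neuron
    """
    # State of each hidden neuron: activation of the weighted sum of inputs.
    hidden = [
        logistic(b + sum(w * xi for w, xi in zip(w_row, x)))
        for w_row, b in zip(w_ih, b_h)
    ]
    # Output neuron: activation of the weighted sum of hidden states.
    return logistic(b_o + sum(w * h for w, h in zip(w_ho, hidden)))


# Arbitrary example: two predictors, two hidden neurons.
p_occurrence = forward(
    x=[0.3, -1.2],
    w_ih=[[0.5, -0.4], [1.1, 0.2]],
    b_h=[0.1, -0.3],
    w_ho=[0.8, -0.6],
    b_o=0.05,
)
```

Because the output activation is logistic, the returned value always lies strictly between 0 and 1 and can be read as a probability.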

Training the neural network involves the back-propagation algorithm where the goal

is to find a set of connection weights that minimizes an error function. The cross-entropy

criterion (i.e., similar to log-likelihood) is minimized during network training:

\[ E = -\sum_n \left[ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \right] \]

where $t_n$ is the observed output value and $y_n$ is the predicted output value for observation n.

Observations are sequentially presented to the network, and weights are adjusted after each

output is calculated depending on the magnitude and direction of the error. This iterative

technique of minimizing the error is known as gradient descent, where weights are modified

in the direction of greatest descent, traveling "downhill" in the direction of the minimum.

Network training is stopped after 1000 iterations of the error back-propagation algorithm. To

minimize the potential for network overfitting I use the simplest network architecture (i.e.,

smallest number of hidden neurons) where equivalent network configurations exhibit

identical predictive performance. Furthermore, I use Ripley's (1994) regularization, where a

weight-decay parameter λ (set equal to 0.01) is used to modify the error function of the

network by penalizing large connection weights. The weight-decay technique improves the

optimization process and reduces the chances of developing a saturated network (i.e., all

outputs approaching zero or one).
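The training scheme described above can be sketched as follows. This Python example is an assumption-laden illustration, not the S-Plus implementation used here: the penalty is taken to be the decay parameter times the sum of squared weights, the learning rate and the initial weight range are arbitrary choices, and the names `penalized_error` and `train` are hypothetical.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def penalized_error(targets, outputs, weights, decay=0.01):
    """Cross-entropy error plus an assumed penalty of decay * sum(w^2)."""
    cross_entropy = -sum(
        t * math.log(y) + (1 - t) * math.log(1 - y)
        for t, y in zip(targets, outputs)
    )
    return cross_entropy + decay * sum(w * w for w in weights)


def train(X, T, n_hidden=2, rate=0.5, decay=0.01, iterations=1000, seed=0):
    """Sequential gradient descent for a one hidden-layer logistic network."""
    rng = random.Random(seed)
    p = len(X[0])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    for _ in range(iterations):
        for x, t in zip(X, T):  # observations are presented one at a time
            h = [
                logistic(b + sum(w * xi for w, xi in zip(row, x)))
                for row, b in zip(w_ih, b_h)
            ]
            y = logistic(b_o + sum(w * hj for w, hj in zip(w_ho, h)))
            d_o = y - t  # output error signal for cross-entropy with logistic output
            for j in range(n_hidden):
                # Back-propagate the error to hidden neuron j before updating w_ho[j].
                d_h = d_o * w_ho[j] * h[j] * (1.0 - h[j])
                w_ho[j] -= rate * (d_o * h[j] + 2.0 * decay * w_ho[j])
                for i in range(p):
                    w_ih[j][i] -= rate * (d_h * x[i] + 2.0 * decay * w_ih[j][i])
                b_h[j] -= rate * d_h
            b_o -= rate * d_o
    return w_ih, b_h, w_ho, b_o
```

Each weight moves against its error gradient, and the decay term pulls every connection weight toward zero, which is what discourages the saturated outputs mentioned above.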

Model construction and validation

Jackknife or "leave-one-out" cross validation is employed to estimate the predictive

performance of each species model. This method excludes one observation, constructs the

model with the remaining observations, and then predicts the response of the excluded

observation using this model. This procedure is repeated N times so that each observation, in

turn, is excluded in model calibration and its response is predicted. N-fold cross validation is

used since it has been shown to produce nearly unbiased estimates of prediction error (Olden

and Jackson 2000). For all models, a decision threshold of 0.5 (predicted probability of

occurrence) is used to classify the species as present or absent in a lake. Both raw and

transformed data (all continuous predictor variables, except pH, were ln(x) transformed to

approximate normal distributions) are analyzed, and the data type exhibiting the greatest

predictive performance for any given method and species is retained for all subsequent

analyses (Appendix C). A total of 54 model construction and validation processes (including

the selection of optimal classification tree size and optimal number of hidden neurons in the

network) are calculated for each method, i.e. 27 species for each of the 2 data types (raw and

transformed).
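The jackknife procedure can be sketched in Python with any model that exposes a fitting step and a probability prediction step. The function names are hypothetical, the stand-in "prevalence" model exists only to make the example runnable, and treating a probability exactly equal to 0.5 as a predicted presence is an assumption.

```python
def leave_one_out(X, y, fit, predict_prob, threshold=0.5):
    """Jackknife prediction (1 = present, 0 = absent) for every lake."""
    predictions = []
    for i in range(len(X)):
        # Calibrate the model without observation i ...
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit(train_X, train_y)
        # ... then predict the response of the excluded observation.
        probability = predict_prob(model, X[i])
        predictions.append(1 if probability >= threshold else 0)
    return predictions


# Hypothetical stand-in model: predicts the training prevalence for any lake.
def fit_prevalence(X, y):
    return sum(y) / len(y)


def prob_prevalence(model, x):
    return model


observed = [1, 1, 1, 0]
lakes = [[0.2], [0.4], [0.6], [0.8]]
predicted = leave_one_out(lakes, observed, fit_prevalence, prob_prevalence)
```

The same loop can wrap a logistic regression, discriminant function, classification tree or neural network, so that all four methods are validated on identical held-out lakes.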

Model predictive performance

Predicting species occurrence - I partition the overall classification success of each species

model by deriving "confusion matrices" following Fielding and Bell (1997). Using these

matrices I examine three metrics of prediction success. First, I quantify the overall

classification performance of the model (CC) as the percentage of lakes where the model

correctly predicted the presence or absence of the species. Second, I examine the ability of

the model to predict species absence, termed model specificity (SP). Third, I examine the

ability of the model to predict species presence, termed model sensitivity (SE). Cohen's

kappa statistic was used to assess whether the performance of the model differs from

expectations based on chance alone (Titus et al., 1984). McNemar's test (with Yates

correction for continuity; Zar 1999) is used to compare differences in patterns of lake

misclassifications among LRA, LDA, CART and ANN for each species.
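The performance measures described above can be computed from the four cells of a confusion matrix, as in the following Python sketch. The function names and example vectors are hypothetical; the sketch assumes every species has at least one observed presence and one observed absence, and it returns only the kappa value and the McNemar chi-square statistic, not their significance tests.

```python
def confusion_metrics(observed, predicted):
    """Correct classification, specificity, sensitivity (%) and Cohen's kappa."""
    pairs = list(zip(observed, predicted))
    a = sum(1 for o, p in pairs if o == 1 and p == 1)  # presence predicted correctly
    b = sum(1 for o, p in pairs if o == 0 and p == 1)  # false presence
    c = sum(1 for o, p in pairs if o == 1 and p == 0)  # false absence
    d = sum(1 for o, p in pairs if o == 0 and p == 0)  # absence predicted correctly
    n = a + b + c + d
    agreement = (a + d) / n
    # Agreement expected by chance from the row and column totals.
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return {
        "CC": 100.0 * agreement,
        "SP": 100.0 * d / (b + d),
        "SE": 100.0 * a / (a + c),
        "kappa": (agreement - chance) / (1.0 - chance),
    }


def mcnemar_yates(observed, pred_a, pred_b):
    """McNemar chi-square (Yates corrected) comparing two methods' errors."""
    only_a = sum(1 for o, pa, pb in zip(observed, pred_a, pred_b)
                 if pa == o and pb != o)  # lakes only method A classified correctly
    only_b = sum(1 for o, pa, pb in zip(observed, pred_a, pred_b)
                 if pa != o and pb == o)  # lakes only method B classified correctly
    if only_a + only_b == 0:
        return 0.0
    return (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)


# Hypothetical observed and predicted occurrences for eight lakes.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = confusion_metrics(observed, predicted)
```

McNemar's statistic depends only on the lakes where the two methods disagree, which is why two methods with identical correct classification rates can still differ significantly in which lakes they misclassify.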

Predicting richness and community composition - Species richness is tabulated as the

number of species predicted to be present for each lake using the 27 species-habitat models,

and community composition was estimated by recording predicted species presence in each

lake. Predicted and actual community composition was compared using Procrustes analysis

(Jackson 1995) on lake scores from the axes from a principal coordinate analysis (PCoA)

based on Jaccard distance. Jaccard's similarity coefficient was transformed to a distance

measure by taking the square root of the complement (i.e., $(1 - s)^{1/2}$; Jackson et al. 1989).

PROTEST is a randomization test of matrix concordance incorporating Procrustean matrix

rotation (Gower 1975, Rohlf and Slice 1990, Jackson and Harvey 1993), which matches the

position of each lake in one multivariate space (i.e., predicted community composition) to the

position of the same lake in a second multivariate space (i.e., observed community

composition). The method minimizes the sum-of-the-squared deviations (i.e., $m^2$; Gower

1975) between the pair of points representing each lake such that the greater the similarity of

the multivariate configurations from the data sets, the lower the $m^2$ value. This measure is

compared with that derived from repeatedly randomizing the configuration from one matrix

and recalculating the $m^2$. The percentage of $m^2$ values equal to or less than the observed $m^2$

provides the significance level of the test (Jackson 1995). Predicted and actual community

compositions are compared for ordinations of all lakes, for lakes in each drainage basin, and

for species (i.e., patterns in lake classifications). Comparisons among species used all

dimensions from the PCoA whereas lake comparisons are limited to 6 dimensions due to the

matrix size. All statistical analyses were performed using S-Plus software (Mathsoft 1998,

version 4.5) and the PROTEST software available from the authors.
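Three of the quantities described above are simple enough to illustrate directly: predicted richness, the Jaccard-based distance between two lakes, and the significance level of a randomization test. The Python sketch below uses hypothetical function names and example values; it does not implement the Procrustean rotation itself, and it treats two lakes with no species at all as identical, which is an assumption.

```python
import math


def richness(presence_row):
    """Number of species recorded (or predicted) as present in one lake."""
    return sum(presence_row)


def jaccard_distance(u, v):
    """Square root of the complement of Jaccard's similarity coefficient."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    similarity = shared / either if either else 1.0
    return math.sqrt(1.0 - similarity)


def randomization_p_value(observed_m2, randomized_m2):
    """Proportion of randomized m2 values equal to or less than the observed m2."""
    as_small = sum(1 for m2 in randomized_m2 if m2 <= observed_m2)
    return as_small / len(randomized_m2)


# Hypothetical presence/absence vectors for two lakes and four species.
lake_a = [1, 1, 0, 0]
lake_b = [1, 0, 1, 0]
distance = jaccard_distance(lake_a, lake_b)
```

A pairwise matrix of these distances among lakes is the input to the principal coordinate analysis, and the lake scores on the resulting axes are what the Procrustean rotation compares.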

RESULTS

Predicting species occurrence

On average, LRA, LDA, CART and ANN correctly predicted species occurrence (i.e.,

correct classification) and absence (i.e., specificity) in approximately 75-80% of the lakes,

but correctly predicted species presence (i.e., sensitivity) in only 40% of the lakes (Table

1.3). Appendix D contains the logistic regression and discriminant models for all species.

Rates of correct classification were substantially less variable compared to those for

specificity and sensitivity. There were no significant differences in overall correct

classification (Kruskal-Wallis, H=0.608, P=0.90), specificity (H=0.625, P=0.89) and

Table 1.3. Summary of predictive performance of species-habitat models. Reported values are percentage correctly classified (CC), specificity (SP: ability to accurately predict species absence) and sensitivity (SN: ability to accurately predict species presence). Predictions significantly different from random (based on Kappa statistic) are indicated in italics (α=0.05). Species codes are defined in Table 1.1.

Species   Logistic regression   Discriminant analysis   Classification tree   Neural network
Code      CC    SP    SN        CC    SP    SN          CC    SP    SN        CC    SP    SN

BCS BNS BSB BT BB B C CC CS F FSD GS ID LC LT LW LS NRD PD PKS RB RW SMB SL T-P

Mean      78.5  77.3  43.1      78.9  76.9  46.2        77.6  76.8  41.0      76.6  77.0  35.1
S.D.       9.9  27.5  32.8       9.0  26.6  31.7        10.2  26.3  35.0      10.9  30.0  36.2

sensitivity (H=2.745, P=0.43) among LRA, LDA, CART and ANN for all species but

differences among methods exist for individual species.

Overall correct classification, specificity and sensitivity rates varied between species

and between methods (Table 1.3). Although many species were correctly classified with

equal success, directional strengths in their predictions (i.e., specificity, sensitivity) often

varied. For some species absence was better predicted (e.g., cisco, longnose sucker,

smallmouth bass), for some species presence was better predicted (e.g., brook trout, white

sucker, yellow perch) whereas others exhibited similar levels of specificity and sensitivity

(e.g., brown bullhead, lake trout). Cohen's kappa test showed that LRA, LDA, CART and ANN

produced 19, 21, 17 and 11 species-habitat models, respectively, whose predictions were

greater than expectations based on chance (Table 1.3). In addition to comparing the

performance of these methods, examining cases where a consensus across methods was

achieved provided an opportunity to better assess the power to predict species occurrence.

The list of species where all methods generated significant predictions of species occurrence,

and where all methods generated non-significant predictions is shown in Table 1.4.
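
As a rough illustration of the chance-corrected measure referred to above, the following sketch computes Cohen's kappa for one species from observed and predicted presence-absence. It computes only the statistic, not the significance test applied in the study; the function name and example data are hypothetical.

```python
def cohens_kappa(observed, predicted):
    """Cohen's kappa for paired 0/1 presence-absence lists."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    # Proportion of lakes where prediction and observation agree.
    p_observed = sum(1 for o, p in zip(observed, predicted) if o == p) / n
    # Agreement expected by chance from the marginal frequencies.
    obs_present = sum(observed) / n
    pred_present = sum(predicted) / n
    p_chance = obs_present * pred_present + (1 - obs_present) * (1 - pred_present)
    if p_chance == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_observed - p_chance) / (1.0 - p_chance)


observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
always_absent = [0] * 10

print(round(cohens_kappa(observed, model), 3))
print(round(cohens_kappa(observed, always_absent), 3))
```

A model that always predicts absence agrees with 70% of these hypothetical lakes yet has a kappa of zero, which is why the correct classification rate alone can mislead.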

McNemar's test showed that methods differed in patterns of misclassification for 10

out of the 27 species (Fig. 1.2). Of the 20 pair-wise differences between methods, 18 were

between linear and non-linear techniques, with the greatest number of discrepancies observed

for centrarchid spp., brook stickleback, pearl dace and yellow perch. Although overall

predictability of particular species (e.g., brook trout, burbot, blacknose shiner; Table 1.4) was

similar among the statistical methods, the lakes in which the methods misclassified these

species differed (Fig. 1.2).
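
The comparison of misclassification patterns between two methods can be sketched as follows. This is a minimal version of McNemar's test using the chi-square approximation with a continuity correction, which is an assumption: the text does not state which variant was used. The function name and the 20-lake example are hypothetical.

```python
import math


def mcnemar_test(observed, pred_a, pred_b):
    """Return (chi-square, P) comparing the lakes misclassified by methods A and B."""
    only_a_correct = 0
    only_b_correct = 0
    for o, a, b in zip(observed, pred_a, pred_b):
        a_ok, b_ok = (a == o), (b == o)
        if a_ok and not b_ok:
            only_a_correct += 1
        elif b_ok and not a_ok:
            only_b_correct += 1
    discordant = only_a_correct + only_b_correct
    if discordant == 0:
        return 0.0, 1.0  # both methods err in exactly the same lakes
    # Continuity-corrected statistic, 1 degree of freedom.
    chi2 = max(abs(only_a_correct - only_b_correct) - 1, 0) ** 2 / discordant
    # Upper-tail probability of a chi-square variate with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical species present in all 20 lakes: A is wrong in 1 lake, B in 8 others.
observed = [1] * 20
pred_a = [1] * 8 + [0] + [1] * 11
pred_b = [0] * 8 + [1] + [1] * 11
chi2, p_value = mcnemar_test(observed, pred_a, pred_b)
print(chi2, round(p_value, 4))
```

Only lakes where exactly one method is correct contribute to the statistic, so two methods with equal overall accuracy can still differ significantly in which lakes they misclassify.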

Frequency of species occurrence in the study lakes appeared to influence predictive

performance of the models, regardless of statistical method used. Rates of correct

classification showed a non-linear, U-shaped relationship with frequency of occurrence, with

greatest predictability for rare (< 20%) and wide-spread (> 80%) species, whereas specificity

showed a negative relationship and sensitivity showed a positive relationship with species occurrence

(Fig. 1.3).

Table 1.4. Species exhibiting high (significantly different from random) and low (not significantly different from random) predictability for logistic regression analysis, linear discriminant analysis, classification trees and artificial neural networks.

Species predictability

High                      Low

Blacknose shiner          Blackchin shiner
Brook trout               Finescale dace
Brown bullhead            Lake whitefish
Burbot                    Round whitefish
Cisco                     Splake
Common shiner
Creek chub
Lake trout
Northern redbelly dace
White sucker

[Figure 1.2 rows: blacknose shiner, brook trout, burbot, longnose sucker, brook stickleback, smallmouth bass, rock bass, pumpkinseed, yellow perch, pearl dace.]

Figure 1.2. Results from McNemar's test assessing differences among patterns of lake misclassifications using logistic regression, discriminant analysis, classification trees and artificial neural networks. Shared shading for a species represents pair-wise differences for lakes in which species occurrence was incorrectly predicted (based on α=0.05). For example, blacknose shiner was misclassified in different sets of lakes using logistic regression and classification tree.

[Figure 1.3 panels: LRA, LDA, CART and ANN; x-axis: species occurrence (%), 0 to 100.]

Figure 1.3. The effect of frequency of species occurrence on overall correct classification rate (top panel), specificity (middle panel) and sensitivity (lower panel) for 27 fish species using logistic regression (LRA), discriminant analysis (LDA), classification trees (CART) and artificial neural networks (ANN), expressed as percentage. Dotted lines represent expectations based on chance predictions.

Predicting species richness and community composition

Regressions between actual and predicted species richness showed that all methods

had significant relationships (Fig. 1.4). Comparing the regression lines to the 1:1 line of

perfect predictions showed that all methods tended to over-estimate species richness for

depauperate communities and under-estimate richness for speciose communities.
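
The direction of this bias can be read from the regression equations given in the caption of Figure 1.4. The sketch below sums predicted presences into a richness value and finds where each fitted line crosses the 1:1 line; below that richness a model over-estimates, above it the model under-estimates. The function names and the small presence matrix are hypothetical; the intercepts and slopes are those reported in the caption.

```python
def predicted_richness(presence_rows):
    """Richness per lake: number of species predicted present (rows are lakes, 0/1 per species)."""
    return [sum(row) for row in presence_rows]


def crossover_with_one_to_one(intercept, slope):
    """Actual richness at which predicted = intercept + slope * actual equals actual."""
    if slope == 1.0:
        raise ValueError("a line with slope 1 is parallel to the 1:1 line")
    return intercept / (1.0 - slope)


# Hypothetical predictions for two lakes and five species.
print(predicted_richness([[1, 0, 1, 1, 0], [0, 0, 0, 1, 0]]))

# Intercept and slope of each regression line as given in the Figure 1.4 caption.
fitted_lines = {
    "LRA": (3.24, 0.73),
    "LDA": (2.96, 0.75),
    "CART": (2.85, 0.78),
    "ANN": (2.64, 0.86),
}
for method, (intercept, slope) in fitted_lines.items():
    crossing = crossover_with_one_to_one(intercept, slope)
    print(f"{method}: over-estimates below about {crossing:.1f} species")
```

For LRA, for example, the fitted line meets the 1:1 line at 3.24 / (1 - 0.73) = 12 species.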

Fish community composition was predicted with varying success depending on

geographic location (i.e., drainage) and statistical method used (Table 1.5). PROTEST

showed that for all the study lakes predictions of community composition matched closely

with actual community composition. More detailed examination showed that ability to

predict species membership in the lakes differed between the drainages. For example, the

communities of Bonnechere and Oxtongue drainages were better predicted compared to lakes

of Madawaska and Petawawa drainages for all approaches; however, notable exceptions

included the Madawaska drainage where LDA outperformed the other methods and the

Petawawa drainage where CART was more successful (Table 1.5).

Figure 1.5 shows the species plots from the PCoA of observed and predicted

community composition of the study lakes. Species occurring in the same lakes are

positioned close together on a plot, and species from different sets of lakes are positioned at

opposite ends of the plot. Observed and predicted community composition ordination

patterns show strong agreement and are significantly concordant for all methods (Table 1.5).

Figure 1.5 shows that brook trout, cisco, round whitefish and lake whitefish, all members of

the Salmonidae family, are clustered close together in all plots (although showing some

scatter for ANN) indicating that they are observed and predicted to exist in a similar set of

lakes. Similarly, rock bass, pumpkinseed and smallmouth bass, all members of the

Centrarchidae family, are observed and predicted to occur in the same lakes. In contrast,

white sucker and longnose sucker are known to be in the same lakes, yet are separated by a

greater distance in the predicted ordination space for all methods, indicating that they are

predicted to be members of different assemblages.
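
The ordinations described above are based on Jaccard's similarity coefficient, which for two species is the number of lakes holding both divided by the number of lakes holding at least one of them. A minimal sketch follows; the function name and the five-lake example are hypothetical, and returning NaN when neither species occurs anywhere is a choice made here, not something specified in the text.

```python
def jaccard_similarity(occurrence_a, occurrence_b):
    """Jaccard similarity between two 0/1 occurrence vectors over the same lakes."""
    if len(occurrence_a) != len(occurrence_b):
        raise ValueError("occurrence vectors must cover the same lakes")
    shared = sum(1 for a, b in zip(occurrence_a, occurrence_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(occurrence_a, occurrence_b) if a == 1 or b == 1)
    if either == 0:
        return float("nan")  # undefined when neither species occurs in any lake
    return shared / either  # lakes where both are absent are ignored


# Hypothetical occurrences of two species across five lakes.
species_a = [1, 1, 0, 0, 1]
species_b = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(species_a, species_b)
print(similarity, 1.0 - similarity)  # similarity and the corresponding distance
```

Species that share most of their lakes have a similarity near 1 and plot close together; species that never co-occur have a similarity of 0.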

Table 1.5. Comparison of observed and predicted fish community composition for lakes based on first six components from a principal co-ordinate analysis based on Jaccard's similarity coefficient. Comparable analyses are based on all lakes and on lakes from each drainage basin. The range of total amounts of variation explained by the first six components of the between lakes matrices is summarized whereas all components were retained from the species ordinations. Reported values are Gower's m², ranging from 0 to 1 where 0 indicates perfect agreement, and associated significance levels based on 9,999 randomizations from PROTEST in parentheses.

Lakes            N     % variance explained   Logistic regression   Discriminant analysis   Classification tree   Neural network

ALL              286   34 - 54
Amable du Fond   27    59 - 77
Bonnechere       10    83 - 94
Madawaska        128   54 - 67
Oxtongue         32    54 - 75
Petawawa         89    41 - 63
Species          27    100                    0.050 (0.0001)        0.046 (0.0001)          0.051 (0.0001)        0.052 (0.0001)

0.891 (0.0001)

0.755 (0.0121)

0.404 (0.1593) 0.859 (0.0001)

0.720 (0.0002)

0.870 (0.0001)

[Figure 1.4 panels include LRA; x-axis: actual species richness; surviving panel annotation: r = 0.647, P < 0.001.]

Figure 1.4. Regression analysis between actual and predicted species richness using logistic regression (LRA: y=3.24+0.73x), discriminant analysis (LDA: y=2.96+0.75x), classification trees (CART: y=2.85+0.78x) and artificial neural networks (ANN: y=2.64+0.86x). Solid line represents the regression line and the dashed line represents a line of perfect match (1:1). Predicted richness was tabulated by adding predicted species presences using models for each of the 27 species.


[Figure 1.5: ordination plots of species codes; surviving panel titles: OBSERVED, LDA, CART.]

Figure 1.5. First two axes from principal coordinate analysis based on Jaccard's distance of known or observed species composition (A), composition predicted by logistic regression model, predicted by discriminant analysis (B), predicted by classification trees (C) and predicted by artificial neural networks (D). Letters refer to species codes listed in Table 1.1.

DISCUSSION

Comparison of modeling approaches

Traditional (i.e., linear) and alternative (i.e., non-linear) approaches to developing

predictive models should be viewed as both competitive and complementary methodologies

for establishing quantitative linkages between fish and their environment. I found that

average predictive performances of LRA, LDA, CART and ANN were similar across all

species, although the predictability for individual species varied greatly. Therefore, neither

linear nor non-linear approaches were optimal for all species. Indeed, recent studies

modeling species presence or absence have illustrated the predictive advantages of LRA

(e.g., Manel et al. 1999a, Özesmi and Özesmi 1999), LDA (e.g., Reichard and Hamilton

1997, Scheller et al. 1999), CART (e.g., Rejwan et al. 1999) and ANN (e.g., Mastrorillo et al.

1997) relative to their linear or non-linear counterparts. Where the underlying data structure

and assumptions are met for a particular traditional statistical method, there is no reason to

believe that major differences between traditional and alternative techniques should exist

(e.g., Smith et al. 1997, Manel et al. 1999b, Rathert et al. 1999). For example, one might

expect LRA and LDA to perform well where linear relationships exist, whereas CART and

ANN should prove better in non-linear situations. Interestingly, the results showed 10

species, including brook trout, smallmouth bass and yellow perch, whose patterns of lake

misclassification differed among the statistical methods. The majority of these differences

occur in contrasts of linear and non-linear approaches, indicating that although average rates

of correct classification were similar, some methods may be superior depending on the shape

of the species response curve to habitat conditions. Ultimately, more direct comparisons

based on simulation studies and using a wider array of field data sets are required to

accurately address differences among traditional and alternative approaches.

I reiterate Ripley (1994) in saying that most data sets are expensive to collect both in

terms of time and money, and that more effort should be spent in choosing and comparing

different statistical methods which best suit the particular question of interest and

characteristics of the data at hand. In the discussion that follows I focus on examining results

common to all four modeling approaches, motivated by the belief that a consensus of

methods can provide us with a greater degree of confidence that patterns in species and

community predictability are ecologically meaningful, and not statistical artifacts of the

methods employed.

Predicting species occurrence

My study shows that whole-lake attributes can be used successfully to predict species

presence or absence. For many species the occurrence in any particular lake was predicted

with high success, whereas for other species there remained a large degree of uncertainty in

the prediction. Species such as smallmouth bass, burbot, brook trout and lake trout were

correctly classified in approximately 7540% of the lakes, which is an especially promising

result given the economic and societal importance of such species. Although these models

are correlative, and thus I cannot infer causation and make interpretations of the underlying

mechanisms (Cale et al. 1989), the results are consistent with findings from many studies of

temperate fish populations (Jackson and Harvey 1989, Tonn et al. 1990, Magnuson et al.

1998). For instance, smallmouth bass and lake trout are known to be influenced by overall

lake size (i.e., area, volume, maximum depth and shoreline perimeter) since these

morphological features alter the mixing characteristics and hence the thermal regime of lakes

(e.g., Eadie and Keast 1984, Jackson and Harvey 1989). In addition, lake area and depth

serve as an indirect measure of the diversity of habitats available in lakes, which may be

important to support the small-bodied, forage fish upon which smallmouth bass and lake

trout feed.

Although the predictive abilities of conventional models for species presence or

absence are commonly assessed from overall classification rates alone, I show that by

partitioning the predictive performance of the models into measures such as sensitivity and

specificity, I can assess more readily the strengths and weaknesses of the models and better

evaluate their applicability. For example, the presence of brook trout and yellow perch was

predicted with a high degree of certainty (93-94% of the lakes), yet predicting the absence of

these species was more difficult. Conversely, smallmouth bass and rock bass exhibited high

rates of correct classification when absent (95-98%), but were poorly predicted when present

in a lake. Lake trout and brown bullhead had similar levels of correct classification,

sensitivity and specificity indicating good model stability for predicting both species

presence and absence.

I found that the clearest determinant of prediction success was the frequency of

species occurrence in the study lakes. Model sensitivity increases and specificity decreases

with increasing species frequency of occurrence, whereas overall rates of correct

classification were highest for rare and common species and decreased as species occurrence

approached 50%. This relationship is expected, yet is seldom considered in distributional

modeling (but see Fielding and Bell 1997, Manel et al. 1999b). For example, for rare species

(e.g., 5% occurrence), a naively constructed simple model might predict the species being

absent in all lakes resulting in a 95% correct classification rate. Similarly, a common species

found in 95% of the lakes might be predicted to occur in all lakes, and again, the model will

correctly classify 95% of the lakes. Finally, using such a simple model for a species

occurring in 50% of the lakes would result in the maximum error rate. This type of model

would provide a U-shaped response between the overall correct classification rate and the

frequency of species occurrence. Although the dependency of model-prediction success on

species frequency of occurrence is unavoidable, it is commonly overlooked. Thus, it is

imperative that model performance is tested against expectations based on chance. In this

study I used Cohen's kappa statistic to assess the significance of model predictive

performance, although a number of other approaches may be employed.
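
The U-shaped response described in this paragraph follows directly from the arithmetic of a constant prediction. The sketch below tabulates the correct classification rate of a naive model that always predicts the more common state; the function name and the chosen occurrence values are illustrative only.

```python
def naive_correct_classification(prevalence):
    """Percent of lakes classified correctly by always predicting the more common state."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    # Predict 'absent' everywhere when the species is rare, 'present' everywhere when common.
    return 100.0 * max(prevalence, 1.0 - prevalence)


for occurrence_pct in (5, 20, 50, 80, 95):
    rate = naive_correct_classification(occurrence_pct / 100.0)
    print(f"occurrence {occurrence_pct:3d}%  ->  correct classification {rate:.0f}%")
```

The rate is 95% at both 5% and 95% occurrence and falls to 50% at 50% occurrence, even though such a model uses no habitat information at all.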

There are a number of practical implications for the relationship between prediction

success and species frequency of occurrence. First, a decrease in model sensitivity for rare

species implies that it will be more difficult to predict the occurrence of organisms whose

conservation and management is perhaps the most critical. This finding has great importance

in developing models for guiding searches for populations in previously unsampled areas and

for indicating site suitability for the reintroduction of rare species (e.g., Hill and Keddy 1992,

Wiser et al. 1998) since the predictive ability of the models will be limited (Scott et al. 1987).

Second, drawing inferences about observed absences of species from sites containing suitable

habitat conditions (e.g., indirect evidence for dispersal, predation, competition) could be

limited if the models exhibit poor specificity. Finally, examining alternative measures of

prediction success can provide more accurate comparisons of different modeling approaches

(e.g., Manel et al. 1999b) and among species. For instance, I found that although the overall

correct classification rates for some species were similar, levels of specificity and sensitivity

were often quite different. By quantifying and examining alternative measures of model

prediction success, we can gain additional insight into the determinants shaping species

occurrence (e.g., Manel et al. 1999a, b, Scheller et al. 1999), which ultimately can lead to the

development of more robust predictive models.

The development of powerful, predictive models for species occurrence will add to

our knowledge of the distribution and habitat requirements of species, as well as serve to

focus research both in terms of observational and experimental studies by identifying gaps in

our knowledge and help to narrow our examination of causal mechanisms shaping fish

community structure in freshwater ecosystems. Predictive models also play an increasing

role in the conservation and management of fish populations by providing first-order

estimates of habitat suitability, which could then be followed by ground truthing and field

validation, in order to predict sites with available spawning habitat (e.g., Knapp and Preisler

1999) or to establish potential areas for species restoration or reintroductions. Similarly,

model predictions can be used to estimate the likelihood of local establishment and spread of

exotic species, which may help set conservation priorities for preserving vulnerable species

and populations that might be lost locally (e.g., Hrabik and Magnuson 1999).

Although I have not included all factors that may influence the occurrence of a fish

species at a given location, the variables used in my study do successfully classify most

occurrences or, alternatively, are closely correlated with other factors that do discriminate

between presence and absence. The inclusion of other environmental variables (e.g.,

temperature, dissolved oxygen concentrations) and isolation-related factors (e.g., distance to

nearest lake of equal or greater size, number of inlets/outlets) may improve the predictive

performance of the models in terms of overall correct classification, specificity and

sensitivity. Examining the cases where the predictions of the models did not fit expectations

may provide insight into the importance and causes of this unaccounted variation (e.g.,

Boone and Krohn 1999, Rahel and Nibbelink 1999).

Predicting species richness

Predictions of species richness in my study were generated by tabulating predicted

species presences using the individual models for the 27 species. I found good correlations

between predicted and actual richness values in the study lakes; however, the correlations

were not as strong as those in other fish studies conducted in this region (e.g., Eadie and

Keast 1984, Matusek and Beggs 1988, Minns 1989). This difference can be attributed to the

way species richness was predicted. My study took an individualistic approach by modeling

species-habitat relationships separately, rather than employing the conventional approach

where species richness is predicted as a continuous variable with all species considered

collectively in a generic fashion. As such, my models provide a more detailed examination

of the direct role of habitat constraints on determining the presence or absence of individual

fish species, as compared to treating richness collectively across all species which accounts

only for overall limitations on species richness. This distinction can be particularly important

since the composition of ecological communities can differ and change in time and space

without corresponding effects on species richness (e.g., Kadmon and Pulliam 1993, York

2000). Furthermore, predicting individual species membership in ecological communities

can be important since species-rich assemblages often do not include those species in most

need of conservation (Prendergrast et al. 1993), and perhaps species should not be assigned

equal conservation value in predictive models.

Predicting community composition

A number of multivariate approaches, such as canonical correspondence analysis and

the Mantel test, have been used by ecologists to relate the structure of fish communities to

the environment (e.g., Jackson and Harvey 1993, Hinch et al. 1994, Matthews and Robison

1998, Pusey et al. 2000). These studies help us understand the major factors correlated with and

possibly structuring fish communities, but do not produce a framework from which

individual species membership of such communities can be predicted. I see two general

approaches for developing predictive models for species composition of communities.

First, assemblages can be classified based on groups of species showing shared

patterns of co-occurrence prior to relating these assemblage types to environmental

conditions. Commonly called community classification, such studies have used a number of

statistical techniques (e.g., clustering, ordination) to determine assemblages based on patterns

of species composition. Examples of community classification in fish studies include Tonn

and Magnuson (1982) who grouped northern Wisconsin assemblages into three types, which

they referred to as "mudminnow", "bass" and "pike", and Capone and Kushlan (1991) who

classified three communities: "mosquitofish", "black bullhead" and "sunfish-shiner-

mosquitofish" in dry-season stream pools of northeast Texas. In both studies assemblages

were defined by their characteristic or dominant species. Magnuson et al. (1998) constructed

8 community types for small forest lakes of Finland and Wisconsin, and developed

classification models for predicting fish assemblage type based on environmental variables.

The RIVPAC program (Wright et al. 1984) relies on the identification of classification

groups of macro-invertebrate communities. Reducing species composition into a small set of

assemblage types can be a powerful approach for making the study of complex multi-species

assemblages more tractable (Poff 1997). Models are believed to exhibit greater predictive

power for classifying lakes into assemblage types since they only must discriminate among a

smaller number of groups. Similar arguments have been made for the advantages of

developing predictive models for guilds (Austen et al. 1994).

Probably the greatest disadvantage of aggregating species into assemblage types is

the loss of information about individual species (Hay 1994). Moreover, if such species

associations are not a product of natural processes and are not temporally and spatially stable,

any prediction of community state or change based on that assemblage type might potentially

have little relevance to community dynamics (Austen et al. 1994). Often, a subjective

decision is required to determine the degree of similarity necessary to group species together.

For instance, Angermeier and Winston (1999) found over 90 statistically distinct fish

assemblage types in Virginia and emphasized that the division of communities into discrete

assemblages is arbitrary in that different sets of environmental factors at different spatial

scales would yield greater or fewer types of communities. Thus, the selection of the criteria

used to define assemblage types is a particularly important consideration when developing

predictive models. For example, after the models have been developed they may not be

applicable to systems outside the sites used to classify the communities due to different

species pools and thus different assemblage types. I found that the prediction success of

community composition varied across drainages, reinforcing the fact that the identification of

community types is perhaps a spatially dependent process (Angermeier and Winston 1999).

Tonn et al. (1990) were unable to identify discrete fish assemblage types in small Finnish

lakes using species presence-absence data. Similarly, Pusey et al. (2000) found that stream

fish assemblage structure in northeastern Australia did not represent discrete assemblages but

was composed of species varying along individual environmental gradients. In such cases,

it is possible that better predictive models for community composition may be achieved by

modelling the occurrence of individual species rather than whole assemblages.

My study took this second approach by modeling species occurrence individually,

and applying each species-habitat model separately to predict community composition. I

found that groups of species exhibited similar patterns of observed and predicted occurrence,

suggesting common responses to habitat features across lakes. In this case, patterns of

species co-occurrence could facilitate the grouping of organisms for the purpose of

community prediction. However, some approaches predicted patterns in species co-

occurrence that were not observed (e.g., longnose sucker, white sucker). In these cases the

species show more unique responses to habitat features (e.g., Pusey et al. 2000).

Constructing communities based on predictions from individual species-habitat

models aids in identifying patterns of shared habitat among the species; however, these

approaches generally ignore interactions (e.g., competition, predation) among species.

Consequently, identifying sites where there is likely to be suitable habitat that is unoccupied

suggests a non-habitat-related mechanism for their absence, such as dispersal limitation, past

extinction events, predation or competition (Wiser et al. 1998). Absences of species in sites

containing members of the same family or guild provide even stronger evidence for the

importance of alternative mechanisms shaping patterns in species occurrence. Therefore,

comparing observed and predicted community composition based on environmental factors

could help tease apart the relative importance of habitat heterogeneity and other interactions

shaping the structure of fish assemblages, and can lead to a greater understanding of the role of

the environment in mediating patterns in species distributions and community composition.

Although the majority of conservation efforts tend to focus on particular species or

populations (Angermeier and Schlosser 1995), a community-level approach to the

conservation of aquatic biodiversity would be a major advance relative to current

management programs (Franklin 1993, Angermeier and Winston 1999), and thus predictive

models for fish communities could play a critical role. Though discussion of the potential

applications of community models is beyond the scope of this paper, I reiterate Tonn

(1990) in saying that purely reductionist attempts to understand and predict community

organization in terms of individuals and single-species populations may reduce our abilities

to identify larger-scale patterns and thus to fully understand the structure and function of

communities. Indeed, Evans et al. (1987) provided a convincing argument for using fish

communities as ecologically derived units for freshwater fisheries management. However,

the major prerequisite for such an approach is that discrete communities can actually be

recognized and defined (Jackson et al. 1992); a prerequisite which is not always met.

Therefore, a combined approach of using individual species-habitat models and community

classifications for predicting species assemblages might be warranted.

CONCLUSION

I have shown that statistical modeling approaches have promise in providing testable,

predictive models for fish population and community ecology. Although overall (i.e., for all

species) differences in predictive performance among the approaches are minimal,

differences did exist for individual species and in patterns of lake misclassifications. Given

that these comparisons are based solely on empirical data, advocating the use of one

approach is inappropriate. However, I found that linear and non-linear approaches provide

complementary tools for modeling fish occurrence based on whole-lake measures of habitat.

Detailed evaluation of species-habitat models shows that by partitioning the predictive

performance of the models into measures such as specificity and sensitivity, the strengths and

weaknesses of the models can be assessed more readily within and across species. Predictive

models can play an important role in guiding the direction of future research and aiding in the

management of fishery resources. More effective conservation of aquatic biodiversity will

require new approaches that recognize the value of both species and assemblages, and that

emphasize the protection of key regional-scale processes (Angermeier and Winston 1999).

Developments in these areas require an increased reliance on probabilistic models and will

represent an important advance in both population and community ecology. Ecologists must

try to reduce the uncertainty in their predictive models by collecting and including

appropriate variables in predictive models, and teasing apart the interactions between

temporal and spatial processes occurring on different scales (Tonn 1990, Poff 1997). This

will increase our understanding of the applicability of such models for predicting species

occurrence and community composition, as well as provide greater insight into the nature of

species-environment interactions.

ACKNOWLEDGEMENTS

I would like to thank Nick Mandrak for providing the Algonquin Park Fish Inventory Data

Base, and Laura Hatt and Bob Bailey for their comments on an earlier draft. Special thanks

to Pedro Peres-Neto for writing and providing the PROTEST software. Funding for this

research was provided by an NSERC Graduate Scholarship to J.D. Olden, and an NSERC

Research Grant to D.A. Jackson.

REFERENCES

Angermeier, P. L., and I. J. Schlosser. 1995. Conserving aquatic biodiversity: beyond species

and populations. American Fisheries Society Symposium 17:402-414.

Angermeier, P. L., and M. R. Winston. 1999. Characterizing fish community diversity across

Virginia landscapes: Prerequisite for conservation. Ecological Applications 9:335-349.

Austen, D. J., P. B. Bayley, and B. W. Menzel. 1994. Importance of the guild concept to fisheries

research and management. Fisheries (Bethesda) 19:12-20.

Bailey, R. C., M. G. Kennedy, M. Z. Dervish, and R. M. Taylor. 1998. Biological assessment

of freshwater ecosystems using a reference condition approach: comparing predicted and

actual benthic invertebrate communities in Yukon streams. Freshwater Biology 39:765-774.

Bishop, C. M. 1995. Neural networks for pattern recognition. Oxford Clarendon Press.

Boone, R. B., and W. B. Krohn. 1999. Modeling the occurrence of bird species: Are the

errors predictable? Ecological Applications 9:835-848.

Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and

regression trees. Wadsworth, Belmont, California, USA.

Cale, W. G., G. M. Henebry, and J. A. Yeakley. 1989. Inferring process and pattern in

natural communities. BioScience 39:600-605.

Capone, T. A., and J. A. Kushlan. 1991. Fish community structure in dry-season stream

pools. Ecology 72:983-992.

Cheng, B., and D. M. Titterington. 1994. Neural networks: A review from a statistical

perspective (with discussion). Statistical Science 9:2-54.

Chessman, B. C. 1999. Predicting the macroinvertebrate faunas of rivers by multiple

regression of biological and environmental differences. Freshwater Biology 41:747-757.

Ciampi, A. 1991. Generalized regression trees. Computational Statistics and Data Analysis

12:57-78.

Crossman, E. J., and N. E. Mandrak. 1991. An analysis of fish distribution and community

structure in Algonquin Park: annual report for 1991 and completion report, 1989-1991.

Ontario Ministry of Natural Resources, Toronto, Ontario, Canada.

Dodge, D. P., G. A. Goodchild, I. MacRitchie, J. C. Tilt, and D. G. Waldriff. 1985. Manual

of instructions: aquatic habitat inventory surveys. Ontario Ministry of Natural Resources,

Fisheries Branch, Toronto, Ontario, Canada.

Dunham, J. B., and B. E. Rieman. 1999. Metapopulation structure of bull trout: Influences of

physical, biotic, and geometrical landscape characteristics. Ecological Applications

9:642-655.

Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.

Canadian Journal of Zoology 62:1689-1695.

Emmons, E. E., M. J. Jennings, and C. Edwards. 1999. An alternative classification method

for northern Wisconsin lakes. Canadian Journal of Fisheries and Aquatic Sciences

56:661-669.

Evans, D. O., B. A. Henderson, N. J. Bax, T. R. Marshall, R. T. Oglesby, and W. J. Christie.

1987. Concepts and methods of community ecology applied to freshwater fisheries

management. Canadian Journal of Fisheries and Aquatic Sciences 44 (Suppl. 2):448-470.

Fielding, A. H., and J. F. Bell. 1997. A review of methods for the assessment of prediction

errors in conservation presence/absence models. Environmental Conservation 24:38-49.

Franklin, J. F. 1993. Preserving biodiversity: species, ecosystems, or landscapes? Ecological

Applications 3:202-205.

Gower, J. C. 1975. Generalized procrustes analysis. Psychometrika 40:33-51.

Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity

predict global fish diversity. Nature 391:382-384.

Hand, D. J. 1997. Construction and assessment of classification rules. John Wiley and Sons,

Chichester.

Hay, M. E. 1994. Species as 'noise' in community ecology: do seaweeds block our view of

the kelp forest? Trends in Ecology and Evolution 9:414-416.

Hinch, S. G., K. M. Somers, and N. C. Collins. 1994. Spatial autocorrelation and assessment

of habitat-abundance relationships in littoral zone fish. Canadian Journal of Fisheries and

Aquatic Sciences 51:701-712.

Hill, N. M., and P. A. Keddy. 1992. Prediction of rarities from habitat variables: Coastal

plain plants on Nova Scotian lakeshores. Ecology 73:1852-1857.

Hornick, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are

universal approximators. Neural Networks 2:359-366.

Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt

(Osmerus mordax) in a northern Wisconsin lake district and implications for

management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.

Jackson, D. A. 1995. PROTEST: A procrustean randomization test of community

environment concordance. Écoscience 2:297-303.

Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:

local versus regional processes. Ecology 70:1472-1484.

Jackson, D. A., and H. H. Harvey. 1993. Fish and benthic invertebrates: community

concordance and community-environment relationships. Canadian Journal of Fisheries

and Aquatic Sciences 50:2641-2651.

Jackson, D. A., K. M. Somers, and H. H. Harvey. 1989. Similarity coefficients: Measures of

co-occurrence and association or simply measures of occurrence. American Naturalist

133:436-453.

Jackson, D. A., K. M. Somers, and H. H. Harvey. 1992. Null models and fish communities:

Evidence of nonrandom patterns. American Naturalist 139:930-951.

James, F. C., and C. E. McCulloch. 1990. Multivariate analysis in ecology and systematics:

panacea or Pandora's box? Annual Reviews in Ecology and Systematics 21:129-166.

Kadmon, R., and H. R. Pulliam. 1993. Island biogeography: Effect of geographical isolation

on species composition. Ecology 74:977-981.

Keddy, P. A. 1992. Assembly and response rules: two goals for predictive community

ecology. Journal of Vegetation Science 3:157-164.

Knapp, R. A., and H. K. Preisler. 1999. Is it possible to predict habitat use by spawning

salmonids? A test using California golden trout (Oncorhynchus mykiss aguabonita).

Canadian Journal of Fisheries and Aquatic Sciences 56:1576-1581.

Kruse, C. G., W. A. Hubert, and F. J. Rahel. 1997. Geomorphic influences on the distribution

of Yellowstone cutthroat trout in the Absaroka Mountains, Wyoming. Transactions of the

American Fisheries Society 126:418-427.

Kůrková, V. 1992. Kolmogorov's theorem and multilayer neural networks. Neural Networks

5:501-506.

Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996.

Application of neural networks to modelling nonlinear relationships in ecology.

Ecological Modelling 90:39-52.

Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.

Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology

79:2941-2956.

Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999a. Comparing discriminant

analysis, neural networks and logistic regression for predicting species distribution: a

case study with a Himalayan river bird. Ecological Modelling 120:337-347.

Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999b. Alternative methods for

predicting species distribution: an illustration with Himalayan river birds. Journal of

Applied Ecology 36:734-747.

Mastrorillo, S., S. Lek, F. Dauba, and A. Beland. 1997. The use of artificial neural networks

to predict the presence of small-bodied fish in a river. Freshwater Biology 38:237-246.

Matthews, W. J., and H. W. Robison. 1998. Influence of drainage connectivity, drainage area

and regional species richness on fishes of the interior highlands in Arkansas. The

American Midland Naturalist 139:1-19.

Matuszek, J. E., and G. L. Beggs. 1988. Fish species richness in relation to lake area, pH, and

other abiotic factors in Ontario lakes. Canadian Journal of Fisheries and Aquatic Sciences

45:1931-1941.

Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of

the American Fisheries Society 118:533-545.

Morgan, J. N., and J. A. Sonquist. 1963. Problems in the analysis of survey data, and a

proposal. Journal of the American Statistical Association 58:415-434.

Moss, D. M., J. F. Wright, M. T. Furse, and R. T. Clarke. 1999. A comparison of alternative

techniques for prediction of the fauna of running-water sites in Great Britain. Freshwater

Biology 41:167-181.

Olden, J. D., and D. A. Jackson. 2000. Torturing data for the sake of generality: How valid

are our regression models? Écoscience 7 (in press).

Orians, G. H. 1980. Micro and macro in ecological theory. BioScience 30:79.

Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat

modelling with interspecific interaction. Ecological Modelling 116:15-31.

Poff, N. L. 1997. Landscape filters and species traits: towards mechanistic understanding

and prediction in stream ecology. Journal of the North American Benthological Society

16:391-409.

Pickett, S. T. A., J. Kolasa, and C. G. Jones. 1994. Ecological understanding: the nature of

theory and the theory of nature. Academic Press, New York.

Prendergast, J. R., R. M. Quinn, J. H. Lawton, B. C. Eversham, and D. W. Gibbons. 1993.

Rare species, the coincidence of diversity hotspots and conservation strategies. Nature

365:335-337.

Pusey, B. J., M. J. Kennard, and A. H. Arthington. 2000. Discharge variability and the

development of predictive models relating stream fish assemblage structure to habitat in

northeastern Australia. Ecology of Freshwater Fish 9:30-50.

Rahel, F. J., and N. P. Nibbelink. 1999. Spatial patterns in relations among brown trout

(Salmo trutta) distribution, summer air temperature, and stream size in Rocky Mountain

streams. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.

Rathert, D., D. White, J. C. Sifneos, and R. M. Hughes. 1999. Environmental correlates of

species richness for native freshwater fish in Oregon, U.S.A. Journal of Biogeography

26:257-273.

Reichard, S. H., and C. W. Hamilton. 1997. Predicting invasions of woody plants introduced

into North America. Conservation Biology 11:193-203.

Rejwan, C., N. C. Collins, L. Brunner, B. J. Shuter, and M. S. Ridgway. 1999. Tree

regression analysis on the nesting habitat of smallmouth bass. Ecology 80:341-348.

Ripley, B. D. 1994. Neural networks and related methods for classification (with discussion).

Journal of the Royal Statistical Society, Series B 56:409-456.

Rodriguez, M. A., and W. M. Lewis. 1997. Structure of fish assemblages along

environmental gradients in floodplain lakes of the Orinoco River. Ecological

Monographs 67:109-128.

Rohlf, F. J., and D. Slice. 1990. Extensions of the procrustes method for the optimal

superimposition of landmarks. Systematic Zoology 39:40-59.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by back-

propagating errors. Nature 323:533-536.

Scheller, R. M., V. M. Snarski, J. G. Eaton, and G. W. Oehlert. 1999. An analysis of the

influence of annual thermal variables on the occurrence of fifteen warmwater fishes.

Transactions of the American Fisheries Society 128:257-264.

Scott, J. M., B. Csuti, J. D. Jacobi, and J. E. Estes. 1987. Species richness: a geographic

approach to protecting future biological diversity. BioScience 37:782-788.

Smith, S. J., S. J. Iverson, and W. D. Bowen. 1997. Fatty acid signatures and classification

trees: new tools for investigating the foraging ecology of seals. Canadian Journal of

Fisheries and Aquatic Sciences 54:1377-1386.

Titus, K., J. A. Mosher, and B. K. Williams. 1984. Chance-corrected classification for use in

discriminant analysis: Ecological applications. American Midland Naturalist 111:1-7.

Toner, M., and P. A. Keddy. 1997. River hydrology and riparian wetlands: A predictive

model for ecological assembly. Ecological Applications 7:236-246.

Tonn, W. M. 1990. Climate change and fish communities: A conceptual framework.

Transactions of the American Fisheries Society 119:337-352.

Tonn, W. M., and J. J. Magnuson. 1982. Patterns in the species composition and richness of

fish assemblages in northern Wisconsin lakes. Ecology 63:1149-1166.

Tonn, W. M., J. J. Magnuson, M. Rask, and J. Toivonen. 1990. Intercontinental comparison

of small-lake fish assemblages: The balance between local and regional processes.

American Naturalist 136:345-375.

Wiens, J. A. 1992. Ecology 2000: an essay on future directions in ecology. Bulletin of the

Ecological Society of America 73:165-170.

Wiser, S., R. K. Peet, and P. S. White. 1998. Prediction of rare-plant occurrence: A southern

Appalachian example. Ecological Applications 8:909-920.

Wright, J. F., D. Moss, P. D. Armitage, and M. T. Furse. 1984. A preliminary classification

of running-water sites in Great Britain based on macro-invertebrate species and the

prediction of community type using environmental data. Freshwater Biology 14:221-256.

York, A. 2000. Long-term effects of frequent low-intensity burning on ant communities in

coastal blackbutt forests of southeastern Australia. Austral Ecology 25:83-98.

Zar, J. H. 1999. Biostatistical Analysis, 4th edition. Prentice Hall, New Jersey.

CHAPTER 2

Illuminating the "black box": A randomization approach for

understanding variable contributions in artificial neural networks

ABSTRACT

With the growth of statistical modelling in the ecological sciences, researchers have begun to use more complex methods, such as artificial neural networks (ANNs), to address problems associated with pattern recognition and prediction. Although in many studies ANNs have been shown to exhibit superior predictive power compared to traditional approaches, they have also been labeled a "black box" because they provide little explanatory insight into the relative influence of the independent variables in the prediction process. This lack of explanatory power is a major concern to ecologists since the interpretation of statistical models is desirable for gaining knowledge of the causal relationships driving ecological phenomena. In this study, I describe a number of methods for understanding the mechanics of ANNs (e.g., Neural Interpretation Diagram, Garson's algorithm, sensitivity analysis). Next, I propose and demonstrate a randomization approach for statistically assessing the importance of axon connection weights and the contribution of input variables in the neural network. This approach provides researchers with the ability to eliminate null-connections between neurons whose weights do not significantly influence the network output (i.e., predicted response variable), thus facilitating the interpretation of individual and interacting contributions of the input variables in the network. Furthermore, the randomization approach can identify variables that significantly contribute to network predictions, thereby providing an input-variable selection method for ANNs. I show that by extending randomization approaches to artificial neural networks, the "black box" mechanics of ANNs can be greatly illuminated. Thus, by coupling this new explanatory power of neural networks with their strong predictive abilities, ANNs promise to be a valuable quantitative tool to evaluate, understand, and predict ecological phenomena.

INTRODUCTION

Artificial neural networks (ANNs) have begun to receive a great deal of attention in the ecological sciences as a powerful, flexible, statistical modelling technique for uncovering patterns in data (Colassanti, 1991; Edwards and Morse, 1995; Lek et al., 1996a; Lek and Guégan, 1999). This increased interest in ANNs was demonstrated recently during the first international workshop on the applications of neural networks to ecological modelling (conference papers are found in Ecological Modelling: Volume 120, Issues 2-3). The utility of ANNs for solving complex pattern recognition problems has been demonstrated in many terrestrial (e.g., Paruelo and Tomasel, 1997; Özesmi and Özesmi, 1999; Manel et al., 1999; Spitz and Lek, 1999) and aquatic studies (e.g., Lek et al., 1996a;b; Scardi, 1996; Bastarache et al., 1997; Mastrorillo et al., 1998; Chen and Ware, 1999; Gozlan et al., 1999), and has led many researchers to advocate ANNs as an attractive, non-linear alternative to traditional statistical methods.

The primary application of ANNs has been for developing predictive models to forecast future values of a particular response variable for a given set of independent variables. Although the predictive value of ANNs has great appeal to many ecologists, especially in applied areas, researchers have often criticized the explanatory value of ANNs, calling it a 'black box' approach to modelling ecological phenomena (e.g., Paruelo and Tomasel, 1997; Lek and Guégan, 1999; Özesmi and Özesmi, 1999). This view stems from the fact that the contribution of the input variables in predicting the value of the output (i.e., response variable) is difficult to disentangle within the network. Consequently, input variables are often entered into the network and a response value is generated without gaining any understanding of the relationships between the independent and response variables, and therefore, providing no explanatory insight into the underlying mechanisms being modelled by the network (Anderson, 1995; Bishop, 1995; Ripley, 1996). This is a major pitfall of ANNs since traditional statistical approaches can readily identify the influence of the independent variables in the modelling process, as well as provide a measure of the degree of confidence regarding their contribution. Currently, there is a lack of theoretical or practical ways to partition the contributions of the independent variables in ANNs (Smith, 1994). This is a substantial drawback in the ecological sciences where the interpretation of statistical models is desirable for gaining insight into causal relationships driving ecological phenomena.

Recently, a number of authors have proposed methods for selecting the best network architecture (i.e., number of neurons and topology of connections) among a set of candidate networks. Examples include a number of statistical methods such as asymptotic comparison techniques, approximate Bayesian analysis and cross validation (e.g., Dimopoulos et al., 1995; see Bishop, 1995 for review). However, methods for interpreting the relative contribution of predictor variables in the network are more complicated, and as a result are rarely used in ecological studies. Computationally intensive approaches such as growing and pruning algorithms (Bishop, 1995), partial derivatives (e.g., Dimopoulos et al., 1995; 1999) and asymptotic t-tests are not often used, in favour of simpler algorithms that use network connection weights (e.g., Garson's algorithm: Garson, 1991) or sensitivity analysis to determine the entire spectrum in which each variable influences the network response (e.g., Lek's algorithm; Lek et al., 1996a).

Although these approaches provide a means of determining the overall influence of each predictor variable, interpretation of interactions among the variables is more difficult to assess since the strength and direction of individual axon connection weights within a network must be examined directly. Bishop (1995) discusses the use of pruning algorithms to remove connection weights that do not contribute to the predictive performance of the neural network. In brief, a pruning approach begins with a highly connected network (i.e., large number of connections among neurons), and then successively removes weak connections (i.e., small absolute weights) or connections that cause a minimal change in the network error function when removed. However, an important question is how to decide at what threshold value (i.e., absolute connection weight or change in network error) weights should be removed or retained in the network. In the present study, I propose a randomization test for artificial neural networks to address this question. This randomization approach provides a statistical pruning technique for eliminating null connection weights that have minimal influence on the response variable, as well as a method for identifying independent variables that significantly contribute to predictions in the network. By using randomization protocols for partitioning the importance of connection weights (in terms of their magnitude and direction), researchers will be able to quantitatively assess both the individual and interactive effects of the input variables in the network prediction process, as well as evaluate the overall contributions of the variables. Finally, I illustrate the utility of this ANN randomization test using an empirical example describing the relationship between fish species richness and habitat characteristics of freshwater lakes.

Case study:

Habitat factors influencing fish species richness in freshwater lakes

Throughout this paper I use an empirical example relating fish species richness to habitat conditions of 286 freshwater lakes located in Algonquin Provincial Park, south-central Ontario, Canada (45°50', 78°20'). I tabulated species presence for each lake to examine relationships between fish species richness (ranging from 1 to 23) and a suite of habitat-related variables (8 in total). Predictor variables were chosen to include factors that have been shown to be related to critical habitat requirements of fish in this geographic region (Minns, 1989). These variables included: surface area, lake volume, and total shoreline perimeter, which are correlated with habitat diversity (Eadie and Keast, 1984); maximum depth, which is negatively correlated with winter dissolved-oxygen concentrations and related to thermal stratification (Jackson and Harvey, 1989); surface measurements (taken at depths ≤ 2.0 m) of total dissolved solids, to provide an estimate of nutrient status and lake productivity (Ryder, 1982), and pH; lake elevation, which is related to both habitat heterogeneity (Currie, 1991) and colonization/extinction features of the lake (Magnuson et al., 1998); and growing degree-days, which is a surrogate for productivity (Richenon and Lum, 1980).

Interpreting neural-network connection weights: An important

consideration

I refrain from detailing the specifics of neural network optimization and design (i.e., number of hidden neurons and layers) and instead refer the reader to Chapter 1 and the extensive coverage provided in the texts by Smith (1994), Bishop (1995), and Ripley (1996), as well as articles by Ripley (1994) and Cheng and Titterington (1994). It is sufficient to say that the methods described in this paper refer to the classic family of one hidden-layer, feed-forward neural networks trained by the backpropagation algorithm (Rumelhart et al., 1986). These neural networks are commonly used in ecological studies since they are suggested to be universal approximators of any continuous function (Cybenko, 1989; Funahashi, 1989; Hornick et al., 1989). Based on n-fold cross validation, I determined that a neural network with four hidden neurons exhibited good predictive power (r = 0.72 between observed and predicted species richness).

In the neural network, the connection weights between neurons are the links between the inputs and the outputs, and therefore are the link between the problem and the solution. The weights contain all the information about the network. The relative contribution of the independent variables to the predictive output of the neural network depends primarily on the magnitude and direction of the connection weights. Input variables with larger connection weights represent greater intensities of signal transfer, and therefore are more important in the prediction process compared to variables with smaller weights. Negative connection weights represent inhibitory effects on neurons (reducing the intensity of the incoming signal), whereas positive connection weights represent excitatory effects on neurons (increasing the intensity of the incoming signal). Therefore, negative connection weights negatively affect the response variables, whereas the opposite is true for positive connection weights.
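To make the role of the connection weights concrete, the following minimal Python sketch computes the output of a one hidden-layer, feed-forward network with sigmoid transfer functions. It is an illustration only: the weights are small assumed values (3 inputs, 2 hidden neurons), bias terms are omitted, and the names `sigmoid` and `forward` are hypothetical rather than taken from the software used in this study.

```python
import math


def sigmoid(a):
    """Logistic transfer function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(inputs, w_input_hidden, w_hidden_output):
    """Output of a one hidden-layer, feed-forward network (biases omitted).

    w_input_hidden[h][i] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in w_input_hidden]
    return sigmoid(sum(v * h for v, h in zip(w_hidden_output, hidden)))


# Assumed, illustrative weights (not the fitted richness-habitat network).
W_IH = [[-2.61, 0.13, -0.69],    # inputs 1-3 to hidden neuron A
        [-1.23, -0.91, -2.09]]   # inputs 1-3 to hidden neuron B
W_HO = [1.11, 0.39]              # hidden neurons A and B to the output

# Input 1 has inhibitory input-hidden weights and excitatory hidden-output
# weights, so raising it (other inputs held at 0) lowers the output.
low = forward([-1.0, 0.0, 0.0], W_IH, W_HO)
high = forward([1.0, 0.0, 0.0], W_IH, W_HO)
```

Comparing `low` and `high` shows how the sign and size of the weights along each input-hidden-output path determine the direction and strength of an input's influence.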

Given the obvious importance of connection weights in assessing the relative contributions of the independent variables, there is one topic that I believe warrants additional detail. During the optimization process, it is necessary that the network converges to the global minimum of the fitting criterion (e.g., prediction error) rather than one of the many local minima. Local minima refer to different alternative sets of network parameter values due to symmetric interchanges of the connection weights between the neurons of the network (Ripley, 1994). Consequently, running the optimizer several times (i.e., constructing a number of networks with the same data but different initial random weights) can result in neural networks with identical predictive performance but quite different connection weights. This can complicate the interpretation of neural networks. Two approaches can be employed to ensure the greatest probability of network convergence to the global minimum. The first approach involves combining different local minima rather than choosing between them. Some researchers have suggested that the optimal approach is to average the outputs of networks using the connection weights corresponding to different local minima (e.g., Wolpert, 1992; Xu et al., 1992; Perrone and Cooper, 1993; Ripley, 1995). Ecologists do not readily use this approach, as it requires greater computational effort to identify multiple local minima. The second approach involves global optimization procedures where parameters such as learning rate, momentum or regularization are included in network optimization (e.g., White, 1989; Styblinski and Tang, 1990; Gelfand and Mitter, 1991; Ripley, 1994). The addition of a learning rate parameter (η) and momentum (α) during optimization is used often in the ecological literature (e.g., Lek et al., 1996a; Mastrorillo et al., 1997a; Gozlan et al., 1999; Spitz and Lek, 1999) because in addition to reducing the problem of local minima, it also accelerates the optimization process. The parameter η regulates the magnitude of changes in the weights and biases during optimization, and α mediates the contribution of the last weight change in the previous iteration to the weight change in the current iteration. The values of η and α can be set constant or can vary during network optimization, although there are a number of disadvantages to holding η and α constant (Bishop, 1995). Consequently, values of both η and α are commonly modified by either increasing or decreasing their value according to whether the error decreased or increased, respectively, during an iteration of network optimization (e.g., Hagan et al., 1996; Mastrorillo et al., 1998; Özesmi and Özesmi, 1999). In my study I included learning rate and momentum parameters in the optimization process and defined them as a function of the error (although the first approach discussed above is an equally valid method). For all analyses I started the network optimization with random connection weights between -0.3 and 0.3. The variable learning rate parameter, momentum parameter and the small interval of initial random weights ensured a high probability of global network convergence and thus provided confidence regarding the validity of the connection weights and their interpretation.
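The roles of the learning rate and momentum can be illustrated with a short Python sketch of a single weight update. It assumes the standard backpropagation form in which the new weight change equals the negative gradient scaled by η plus the previous change scaled by α. The function names and the adjustment factors `up` and `down` are hypothetical choices for illustration, not the settings used in this study.

```python
def momentum_step(weight, gradient, prev_change, eta, alpha):
    """One weight update: change = -eta * gradient + alpha * previous change."""
    change = -eta * gradient + alpha * prev_change
    return weight + change, change


def adapt_rate(rate, error, prev_error, up=1.05, down=0.7):
    """Raise the rate after the error falls and lower it after the error rises.

    The factors `up` and `down` are assumed values for this sketch.
    """
    return rate * up if error < prev_error else rate * down


# One illustrative update with eta = 0.1 and alpha = 0.5.
new_weight, change = momentum_step(weight=1.0, gradient=0.5,
                                   prev_change=0.2, eta=0.1, alpha=0.5)

# Error fell from 0.5 to 0.4, so the learning rate is increased.
eta_after_improvement = adapt_rate(0.1, error=0.4, prev_error=0.5)
# Error rose from 0.5 to 0.6, so the learning rate is decreased.
eta_after_setback = adapt_rate(0.1, error=0.6, prev_error=0.5)
```

The same `adapt_rate` rule could be applied to α; many implementations also discard a step that increases the error, which this sketch does not attempt.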

Illuminating the "black box"

Preparation of the data

Prior to building the neural network, the data set must be modified so that the dependent and independent variables exhibit particular distributional characteristics. The dependent variable must be converted to the range [0..1] so that it conforms to the demands of the transfer function used (sigmoid function) in the building of the neural network. This is accomplished by using the formula:

$$r_n = \frac{y_n - \min(Y)}{\max(Y) - \min(Y)}$$

where $r_n$ is the converted response value for observation n, $y_n$ is the original response value for observation n, and min(Y) and max(Y) represent the minimum and maximum values, respectively, of the response variable Y. Note that the dependent variable does not have to be converted when modelling a binary response variable (e.g., species presence/absence) since its values already fall within this range.

To standardize the measurement scales of the inputs into the network, the independent variables are converted to z-scores (i.e., mean = 0, standard deviation = 1) using the formula:

$$z_n = \frac{x_n - \bar{X}}{s_X}$$

where $z_n$ is the standardized value of observation n, $x_n$ is the original value of observation n, and $\bar{X}$ and $s_X$ are the mean and standard deviation of the variable X. It is essential to standardize the input variables so that the same percentage change in the weighted sum of the inputs causes a similar percentage change in the unit output. Both the dependent and independent variables of the richness-habitat data set were modified using the above formulas.
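The two conversions described above can be sketched in a few lines of Python. The data values are hypothetical, the function names are illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics


def rescale_response(y):
    """Convert a response variable to the range [0, 1] by min-max scaling."""
    lo, hi = min(y), max(y)
    return [(value - lo) / (hi - lo) for value in y]


def standardize(x):
    """Convert a predictor to z-scores (mean 0, standard deviation 1)."""
    mean = statistics.mean(x)
    sd = statistics.stdev(x)  # sample standard deviation (assumed)
    return [(value - mean) / sd for value in x]


# Hypothetical values for four lakes.
richness = [1, 5, 12, 23]          # species richness (response)
area = [10.0, 20.0, 30.0, 40.0]    # surface area (predictor)

r = rescale_response(richness)     # smallest value maps to 0, largest to 1
z = standardize(area)              # centred on 0 with unit spread
```

Predictions made on the [0, 1] scale can be returned to the original units by inverting the first conversion, i.e., multiplying by max(Y) - min(Y) and adding min(Y).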

Methods for quantifying input variable contributions in ANNs

In the following section, I detail a series of methods that are available to aid in the interpretation of connection weights and variable contributions in neural networks. These approaches have been used by ecologists and represent a set of appropriate techniques for understanding neuron connections in networks. Next, I extend a randomization approach to these methods, illustrating how connection weights and the overall influence of the input variables in the network can be assessed statistically.

Neural Interpretation Diagram (NID)

Recently, a number of investigators have advocated using axon connection weights to interpret predictor variable contributions in neural networks (e.g., Aoki and Komatsu, 1997; Chen and Ware, 1999). Özesmi and Özesmi (1999) proposed the Neural Interpretation Diagram (NID) for providing a visual interpretation of the connection weights among neurons. In NIDs, the relative magnitude of the connection weights is represented by line thickness (i.e., thicker lines representing greater weights) and line shading represents the direction of the weights (i.e., black lines representing positive, excitator signals and gray lines representing negative, inhibitor signals). Tracking the magnitude and direction of weights between neurons enables researchers to identify individual and interacting effects of the input variables on the output. Figure 2.1 illustrates the NID for the empirical example and shows the relative influence of each habitat factor in predicting fish species richness. The relationship between the inputs and outputs can be determined in two steps since there are first input-hidden layer connections and second hidden-output layer connections. Positive effects of input variables are depicted by positive input-hidden and positive hidden-output connection weights, or negative input-hidden and negative hidden-output connection weights. Negative effects of input variables are depicted by positive input-hidden and negative hidden-output connection weights, or negative input-hidden and positive hidden-output connection weights. Therefore, the multiplication of connection weight direction (i.e., positive or negative) delineates the effect each input variable has on the response variable.

Figure 2.1. Neural interpretation diagram (NID) for neural network modelling fish species richness as a function of 8 habitat variables. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons: black connections are positive (excitator) and gray connections are negative (inhibitor). [Diagram not reproduced. Input neurons: Area, Max. Depth, Volume, Sh. Per., Elevation, TDS, pH, GDD; output neuron: Species Richness.]
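The sign rule for reading an input's effect through a single hidden neuron can be written as a short Python sketch. The function name and the example weights are hypothetical; the rule itself is simply the sign of the product of the input-hidden and hidden-output weights.

```python
def effect_sign(w_input_hidden, w_hidden_output):
    """Direction of an input's effect through one hidden neuron.

    Returns +1 for a positive effect, -1 for a negative effect and 0 if
    either weight is zero.
    """
    product = w_input_hidden * w_hidden_output
    return (product > 0) - (product < 0)


# Hypothetical weights covering the four sign combinations.
both_positive = effect_sign(0.8, 1.2)      # positive x positive
both_negative = effect_sign(-0.8, -1.2)    # negative x negative
mixed_first = effect_sign(0.8, -1.2)       # positive x negative
mixed_second = effect_sign(-0.8, 1.2)      # negative x positive
```

Because an input usually connects to several hidden neurons whose paths may differ in sign, the rule is applied path by path; the net effect also depends on the weight magnitudes.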

The interpretation of connection weights, and more specifically NIDs, is not an easy task because of the complexity of connections among the neurons (Fig. 2.1). Additional hidden neurons would only make this interpretation more difficult. Furthermore, a subjective choice must be made regarding the magnitude at which connection weights should be interpreted. These considerations make the direct examination of connection weights challenging at best and virtually impossible in data sets with large numbers of variables. I show later that a randomization approach can aid in the interpretation of NIDs by identifying non-significant connection weights that can be removed.

Garson's algorithm

Garson (1991) proposed a method, later modified by Goh (1995), for interpreting neural network connection weights to determine the relative importance of independent variables within the ANN. This approach has been used in a number of ecological studies (e.g., Mastrorillo et al., 1997b; 1998; Gozlan et al., 1999; Aurelle et al., 1999; Brosse et al., 1999). Garson's algorithm partitions the neural network connection weights in order to determine the individual importance of each input variable considered separately in the network. Box 2.1 contains a summary of the protocol presented by Garson (1991) that is used to calculate input variable contributions. Figure 2.2 illustrates the overall contribution of each habitat variable in predicting lake species richness. The results show that the relative importance of the predictor variables ranged from 6% to 18%, with lake area and elevation contributing the most to predicting species richness and lake volume and pH contributing the least.
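The four steps of Garson's algorithm, as summarized in Box 2.1, can be sketched in Python for a network with a single output neuron. The function name `garson` and the nested-list layout of the weights are assumptions made for this illustration; the weights are the sample values of Box 2.1.

```python
def garson(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input variable by Garson's algorithm.

    w_input_hidden[h][i] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    n_inputs = len(w_input_hidden[0])
    sums = [0.0] * n_inputs
    for row, v in zip(w_input_hidden, w_hidden_output):
        # Step 2: absolute contribution of each input via this hidden neuron.
        contributions = [abs(w * v) for w in row]
        total = sum(contributions)
        # Step 3: relative contribution, summed over hidden neurons.
        for i, c in enumerate(contributions):
            sums[i] += c / total
    # Step 4: express each summed contribution as a percentage.
    grand_total = sum(sums)
    return [100.0 * s / grand_total for s in sums]


# Step 1: sample connection weights from Box 2.1.
W_IH = [[-2.61, 0.13, -0.69],    # inputs 1-3 to hidden neuron A
        [-1.23, -0.91, -2.09]]   # inputs 1-3 to hidden neuron B
W_HO = [1.11, 0.39]              # hidden neurons A and B to the output

importance = garson(W_IH, W_HO)
```

With unrounded intermediate values the three importances come to approximately 52.6%, 12.7% and 34.8%, close to the 52.5%, 12.5% and 35.0% in Box 2.1, where intermediate values are rounded to two decimals. Because the algorithm works with absolute values, it reports the magnitude of each input's contribution but not its direction.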

Box 2.1. Garson's algorithm for partitioning and interpreting neural network connection weights. Sample calculations shown for 3 input neurons (1, 2 and 3), 2 hidden neurons (A and B) and 1 output neuron (O).

1. Matrix containing input-hidden-output neuron connection weights:

               Input 1             Input 2             Input 3             Output
   Hidden A    $w_{A1} = -2.61$    $w_{A2} = 0.13$     $w_{A3} = -0.69$    $w_{OA} = 1.11$
   Hidden B    $w_{B1} = -1.23$    $w_{B2} = -0.91$    $w_{B3} = -2.09$    $w_{OB} = 0.39$

2. Contribution of each input neuron to the output via each hidden neuron, calculated as the product of the input-hidden connection and the hidden-output connection: e.g., $c_{A1} = w_{A1} \times w_{OA} = -2.61 \times 1.11 = -2.90$

               Input 1             Input 2             Input 3
   Hidden A    $c_{A1} = -2.90$    $c_{A2} = 0.14$     $c_{A3} = -0.77$
   Hidden B    $c_{B1} = -0.48$    $c_{B2} = -0.35$    $c_{B3} = -0.82$

3. Relative contribution of each input neuron to the outgoing signal of each hidden neuron: e.g., $r_{A1} = |c_{A1}| / (|c_{A1}| + |c_{A2}| + |c_{A3}|) = 2.90 / (2.90 + 0.14 + 0.77) = 0.76$; and sum of input neuron contributions: e.g., $S_1 = r_{A1} + r_{B1} = 0.76 + 0.29 = 1.05$

               Input 1             Input 2             Input 3
   Hidden A    $r_{A1} = 0.76$     $r_{A2} = 0.04$     $r_{A3} = 0.20$
   Hidden B    $r_{B1} = 0.29$     $r_{B2} = 0.21$     $r_{B3} = 0.50$
   Sum         $S_1 = 1.05$        $S_2 = 0.25$        $S_3 = 0.70$

4. Relative importance of each input variable: e.g., $RI_1 = S_1 / (S_1 + S_2 + S_3) \times 100 = 1.05 / (1.05 + 0.25 + 0.70) \times 100 = 52.5\%$

               Relative importance
   Input 1     52.5 %
   Input 2     12.5 %
   Input 3     35.0 %

Figure 2.2. Bar plots showing the percentage relative importance of each habitat variable in
the neural network predicting fish species richness based on Garson's algorithm. See Box
2.1 for calculations involved in Garson's algorithm. [Bar plot; x-axis: predictor variable
(Area, Max. Depth, Volume, Sh. Per., Elev, TDS, pH, GDD).]

Sensitivity analysis

A number of investigators have applied sensitivity analysis to neural networks to
determine the spectrum of input variable contributions in the neural network. Recently, a
number of alternative types of sensitivity analysis have been proposed in the ecological
literature. For example, the Senso-nets approach includes an additional weight in the
network for each input variable representing the variable's sensitivity (Schleiter et al., 1999).
Scardi and Harding (1999) added white noise to each input variable and examined the
resulting changes in the mean square error of the output. Although such approaches are
available, traditional sensitivity analysis involves varying each input variable across its range
while holding all other input variables constant; each variable is examined in turn to
determine how it individually contributes to patterns in the output. Such analyses are
somewhat cumbersome since there may be an overwhelming number of variable
combinations to examine. As a result, it is common to first calculate a series of summary
measures for each of the input variables (e.g., minimum, maximum, quartiles, percentiles).
Next, the independent variable under investigation is varied from its minimum to maximum
value while all other variables are held constant at each of these summary measures
sequentially (e.g., Özesmi and Özesmi, 1999). Relationships between each input variable
and the response can be examined for each summary measure, or the calculated response can
be averaged across the summary measures. Holding the input variables constant at a small
number of values provides a more manageable sensitivity analysis, yet still requires a great
deal of time since each value of the input variable must be examined. Consequently, Lek
et al. (1995; 1996a,b) suggested examining only 12 data values delimiting 11 equal intervals
over the variable range rather than varying it across its entire range (this has been termed
Lek's algorithm). Contribution plots can be constructed by averaging the response value
across all summary statistics for each of the 12 values of the input variable of interest. Many
studies have employed Lek's algorithm (e.g., Lek et al., 1995, 1996a; Mastrorillo et al.,
1997a, 1998; Guégan et al., 1998; Laë et al., 1999; Lek-Ang et al., 1999; Spitz and Lek,
1999). In this study I constructed contribution plots for each of the 8 predictor variables in
the neural network by varying each input variable across its entire range and holding all other
variables constant at their 20th, 40th, 60th and 80th percentiles (Fig. 2.3). It is evident from the
contribution plots that the influence of the input variables (i.e., lake habitat factors) on the
network output (i.e., species richness) varies greatly depending on the values (i.e.,
percentiles) of the other input variables. The following is a summary of the different
response curves:

Gaussian response curve - input variable has the greatest influence on output at
intermediate values, and exhibits decreasing influence at low and high values: e.g.,
influence of pH and growing-degree days on richness.

Bimodal response curve - input variable has the greatest influence on output at
low and high values, and exhibits minimal influence at intermediate values: e.g.,
influence of lake area, maximum depth, shoreline perimeter and total dissolved solids
on richness when all other variables are low in value.

Left-skewed response curve - input variable has the greatest influence on output
at high values, and exhibits minimal influence at low and intermediate values: e.g.,
influence of lake elevation on richness.

Right-skewed response curve - input variable has the greatest influence on
output at low values, and exhibits minimal influence at intermediate and high values:
e.g., influence of total dissolved solids on richness, influence of overall lake size (i.e.,
area, maximum depth, volume and shoreline perimeter) on richness when all other
variables are intermediate in value.

Decreasing response curve - input variable has decreasing influence on
output at increasing values: e.g., influence of surface area on richness when all other
variables are high in value.

Flat response curve - input variable has minimal influence on output across
its entire range: e.g., influence of growing-degree days on richness when all other
variables are high in value.
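The sensitivity analysis described above (one input varied across its range while the others are held at a set of percentiles) can be sketched as follows. This is only an illustrative outline: `contribution_profile` and `toy_predict` are hypothetical names, `toy_predict` is a stand-in for a trained network's prediction function, and the default of 12 grid points simply echoes Lek's algorithm.

```python
import numpy as np

def contribution_profile(predict, X, var, percentiles=(20, 40, 60, 80), n_points=12):
    """Vary input `var` from its minimum to its maximum while all other
    inputs are held at each percentile in turn.

    Returns the grid of values and an array of responses with one row per
    percentile and one column per grid value.
    """
    X = np.asarray(X, dtype=float)
    grid = np.linspace(X[:, var].min(), X[:, var].max(), n_points)
    responses = np.empty((len(percentiles), n_points))
    for k, p in enumerate(percentiles):
        # Every row starts as the p-th percentile of each input variable ...
        profile = np.tile(np.percentile(X, p, axis=0), (n_points, 1))
        # ... and only the variable of interest is swept across its range.
        profile[:, var] = grid
        responses[k] = predict(profile)
    return grid, responses

def toy_predict(X):
    # Hypothetical stand-in for a trained network with two inputs.
    return np.tanh(X[:, 0] - 0.5 * X[:, 1])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
grid, responses = contribution_profile(toy_predict, X, var=0)

# Averaging over percentiles gives a single contribution curve.
mean_curve = responses.mean(axis=0)
```

Plotting each row of `responses` against `grid` gives one curve per percentile, which is the kind of display needed to see how the shape of the response changes with the values of the other inputs.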

Randomization Test for Artificial Neural Networks

I propose a randomization test for input-hidden-output connection weight selection in
neural networks. By eliminating null connection weights that do not differ significantly from
random, I can simplify the interpretation of neural networks by reducing the number of axon
pathways that have to be examined for direct and indirect (i.e., interaction) effects on the
response variable, for instance when using NIDs. This objective is similar to statistical
pruning techniques (e.g., asymptotic t-tests), yet does not require the assumptions of
parametric and non-parametric methods since the randomization approach empirically
constructs the distribution of expected values under the null hypothesis for the test statistic
(i.e., connection weight) from the data at hand. Similarly, a randomization approach can be
used as an input variable selection method for ANNs by summing across input-hidden-output
connection weights or calculating the relative importance (i.e., Garson's algorithm) for each
input variable. This approach provides a quantitative tool for selecting statistically
significant input variables for inclusion into the network, again reducing network complexity
and assisting in the network interpretation. The following is the randomization protocol for
testing the statistical significance of connection weights and input variables:

(1) construct a number of neural networks using the original data with different
    initial random weights;

(2) select the neural network with the best predictive performance, record the initial
    random connection weights used in constructing this network, and calculate
    and record:

    (a) input-hidden-output connection weights: the product of input-hidden
        and hidden-output connection weights for each input and hidden
        neuron (e.g., observed $c_{A1}$: step 2, Box 2.1);

    (b) overall connection weight: the sum of the input-hidden-output
        connection weights for each input variable (e.g., observed
        $c_1 = c_{A1} + c_{B1}$);

    (c) relative importance (%) for each input variable based on Garson's
        algorithm (e.g., observed $RI_1$: step 4, Box 2.1);

(3) randomly permute the original response variable ($y_{random}$);

(4) construct a neural network using $y_{random}$ and the initial random connection
    weights; and

(5) repeat steps (3) and (4) a large number of times (i.e., 9999 times in this study;
    see Jackson and Somers, 1989), each time recording 2(a), (b) and (c); e.g.,
    randomized $c_{A1}$, randomized $c_1$ and randomized $RI_1$.

The statistical significance of each input-hidden-output connection weight, overall
connection weight and relative importance of each input variable (e.g., observed $c_{A1}$,
observed $c_1$ and observed $RI_1$) can be calculated as the proportion of randomized values (e.g.,
randomized $c_{A1}$, randomized $c_1$ and randomized $RI_1$), including the observed, whose value is
equal to or more extreme than the observed values. Figure 2.4 illustrates the distribution of
randomized input-hidden-output connection weights (for hidden neuron B), overall
connection weight and relative importance of surface area for predicting lake species
richness.
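The permutation loop and the P-value calculation can be outlined in Python as below. This is a sketch under stated assumptions rather than the routine used in this study: `randomization_p_values` is a hypothetical name; `statistic` stands for any function that refits the network from the recorded initial weights and returns the quantities of step 2; "more extreme" is taken here to mean larger in absolute value; and, to keep the example short and runnable, least-squares coefficients are used as a stand-in statistic in place of a trained network, with 199 permutations instead of 9999.

```python
import numpy as np

def randomization_p_values(statistic, X, y, n_perm=9999, seed=0):
    """P-values for each element of statistic(X, y) from permuting y."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(statistic(X, y), dtype=float)
    # Start the count at 1 so that the observed value is included.
    extreme = np.ones_like(observed)
    for _ in range(n_perm):
        y_random = rng.permutation(y)                  # step (3)
        randomized = np.asarray(statistic(X, y_random), dtype=float)  # step (4)
        # Assumption: "more extreme" means larger in absolute value.
        extreme += np.abs(randomized) >= np.abs(observed)
    return observed, extreme / (n_perm + 1)

def slope_statistic(X, y):
    # Hypothetical stand-in for "fit the network and return its
    # connection-weight products": ordinary least-squares coefficients.
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=50)   # only input 1 matters

observed, p = randomization_p_values(slope_statistic, X, y, n_perm=199)
print(observed, p)
```

With a real network, `statistic` would need to reuse the same initial connection weights on every call, for the reasons discussed later in this chapter.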

Figure 2.4. Distributions of (A) input-hidden-output connection weights for hidden neuron
B, (B) overall connection weight and (C) input relative importance (%) for the influence of
surface area on lake species richness. Arrows represent the observed input-hidden-output
connection weight for hidden neuron B (7.81), overall connection weight (7.43) and relative
importance (18.27%). [Histograms; x-axes: (A) input-hidden-output connection weight,
(B) total connection weight, (C) relative importance (%).]

Table 2.1. Connection weight structure for the neural network modelling fish species richness as a function of 8 habitat variables. $W_{ji}$
represents the input-hidden-output connection weight for input variable i (where i = 1 to 8) and hidden neuron j (where j = A to D). P
values for input-hidden-output connection weights ($W_{Ai}$, $W_{Bi}$, $W_{Ci}$ and $W_{Di}$), overall connection weights ($\sum_{j=A}^{D} W_{ji}$) and Garson's relative
importance (%) are based on 9,999 randomizations.

   Columns: i; Predictor variable; Hidden neuron A ($W_{Ai}$, P); Hidden neuron B ($W_{Bi}$, P); Hidden neuron C ($W_{Ci}$, P);
   Hidden neuron D ($W_{Di}$, P); Overall connection weight ($\sum_{j=A}^{D} W_{ji}$, P); Relative importance (%, P)

   Rows: 1 Area (ha); 2 Max. Depth (m); 3 Volume (m3); 4 Sh. Per. (km); 5 Elevation (m); 6 TDS (µg/L); 7 pH; 8 GDD

   [Numerical entries of the table are not recoverable from the source.]

Figure 2.5. Neural Interpretation Diagram after non-significant input-hidden-output connection weights are eliminated using the randomization test. Only connection weights statistically different from zero (α = 0.05 and α = 0.10) are shown. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons: black connections are positive (excitatory) and gray connections are negative (inhibitory). Black input neurons indicate habitat variables that have an overall positive influence on species richness, and gray input neurons indicate an overall negative influence on species richness.

Table 2.1 contains the connection weight structure for the neural network and the
associated P values from the randomization tests. The results show that only a fraction of the
total 32 input-hidden-output connections (i.e., 8 predictors × 4 hidden neurons) are
statistically different from what would be expected based on chance alone. For instance,
only 6 input-hidden-output connections are significant at α = 0.05. The results also show
that when all connection weights are accounted for (i.e., overall connection weight), lake size
(i.e., lake area, maximum depth, volume and shoreline perimeter) and pH are positively
associated with species richness, while elevation, total dissolved solids and growing-degree
days are negatively associated with species richness. However, only the influences of
maximum depth (P = 0.002) and shoreline perimeter (P = 0.021) are statistically significant,
with these variables having a relative importance of 12.5% and 11%, respectively (Table
2.1). Interestingly, the results from the randomization test using relative importance differ
from the results using overall connection weights. Using Garson's algorithm, surface area
was the only significant factor correlated with species richness (P = 0.031), and elevation was
marginally nonsignificant (P = 0.084). The discrepancy between the two approaches results
from the different ways that the methods use the network connection weights. Garson's
algorithm uses absolute connection weights to calculate the influence of each input variable
on the response (see Box 2.1), whereas overall connection weight is calculated using the raw
values. Examining Figure 2.5 (α = 0.05), I can show that Garson's algorithm can be
potentially misleading for the interpretation of input variable contributions. It is evident that
lake elevation shows a strong, positive association with species richness through hidden
neuron A, but a strong, negative relationship with richness through hidden neuron D. Based
on absolute weights, Garson's algorithm indicates a large relative importance of that variable
since both connection weights have large magnitudes (e.g., for all hidden neurons RI =
17.94%: Table 2.1). However, in such a case the influence of the input variable on the
response is actually negligible since the positive influence through hidden neuron A is
counteracted by the negative influence through hidden neuron D (e.g., for all hidden neurons
$\sum_{j=A}^{D} W_{ji} = -1.17$, P = 0.217: Table 2.1). For this reason, I believe caution should be employed
when making inferences from the results generated by Garson's algorithm since the direction
of the input-output interaction is not taken into account.
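The cancellation described above can be illustrated numerically. The connection-weight products in this sketch are hypothetical values chosen only to show the effect (they are not taken from Table 2.1): input 1 has large positive and negative products that cancel, whereas input 2 has small products that all share the same sign.

```python
import numpy as np

# Hypothetical input-hidden-output products; rows are hidden neurons A to D,
# columns are two input variables.
c = np.array([[ 6.0, 1.0],
              [ 0.5, 1.0],
              [-0.5, 1.0],
              [-6.0, 1.0]])

# Overall connection weight: signed sum over hidden neurons.
overall = c.sum(axis=0)

# Garson's relative importance: based on absolute values (Box 2.1, steps 3-4).
r = np.abs(c) / np.abs(c).sum(axis=1, keepdims=True)
garson = 100.0 * r.sum(axis=0) / r.sum()

print("overall connection weight:", overall)
print("Garson relative importance (%):", np.round(garson, 1))
```

In this example input 1 receives the larger relative importance even though its overall connection weight is zero, which is the situation in which an interpretation based on relative importance alone would be misleading.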

Using the results of the randomization test I removed non-significant connection weights
from the Neural Interpretation Diagram (originally shown in Figure 2.1), resulting in Figure
2.5, which illustrates only connection weights that were statistically significantly different
from random at α = 0.05 and α = 0.10. Focusing on hidden neuron C in Figure 2.5 (α =
0.10), it is apparent that as maximum depth and shoreline perimeter increase, and
growing-degree days decrease, species richness increases in the study lakes. Furthermore,
interactions among habitat factors can be identified as input variables with contrasting
connection weights (i.e., opposite directions) entering the same hidden neuron. For example,
in examining hidden neuron D it is evident that lake shoreline perimeter interacts with lake
elevation. An increase in lake elevation decreases predicted species richness; however, this
negative effect weakens as shoreline perimeter increases. Therefore, there is an interaction
between lake elevation and shoreline perimeter in that high elevation lakes with convoluted
shorelines have greater species richness compared to high elevation lakes with simple
shorelines. The NID also identifies input variables that do not interact, for example lake
volume, since this variable does not exhibit significant weights with contrasting effects at any
single hidden neuron with any of the other variables. Such information obtained from Figure
2.5 shows that a randomization approach can greatly aid in drawing conclusions from NIDs,
and more generally help researchers identify and interpret direct and indirect (i.e., interaction
between input variables) contributions of input variables in ANNs by using axon connection
weights.

Two important components of the randomization test involved the optimization of the
neural network. First, I conducted the randomization test for the product of input-hidden and
hidden-output weights rather than each input-hidden and hidden-output connection weight
separately, since the direction of the connection weights (i.e., positive or negative) can switch
between different networks optimized with the same data (i.e., symmetric interchanges of
weights: Ripley, 1994). For instance, the input-hidden and hidden-output weights might both
be positive in one network and both negative in another, but in both cases the input variable
exerts a positive influence on the response variable. To remove this problem I examined the
product of the input-hidden-output weights because the sign of this value will remain
constant and therefore will be representative of the true influence of the independent
variables. Second, it is important to use the initial random connection weights when
constructing the neural networks with the randomized response data. The reason for this is
that different initial random weights can result in different final connection weights with the
same overall predictive performance. Therefore, if different initial random weights were
used for each randomization, dissimilarities between the observed and random connection
weights might be an artifact of different initial connection weights and not the randomization
of the response variable. Using the same initial connection weights for each randomization
accounts for this problem. Furthermore, it is beneficial to check the distribution of
input-hidden-output connection weights for different random initial weights to ensure that
different converged networks exhibit, on average, similar connection weights. For my
example I found that in almost all cases the input-hidden-output connection weights
exhibited unimodal distributions.
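The sign symmetry that motivates testing the product of weights can be checked with a small numerical sketch. The network below is an assumption made for illustration only: one hidden layer of tanh units without bias terms and a linear output, which is not necessarily the architecture used in this study; `predict` and the random weights are hypothetical.

```python
import numpy as np

def predict(X, w_ih, w_ho):
    # One hidden layer of tanh units (no bias terms), linear output.
    return np.tanh(X @ w_ih.T) @ w_ho

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))        # 5 observations, 3 inputs
w_ih = rng.normal(size=(2, 3))     # rows: hidden neurons A and B
w_ho = rng.normal(size=2)          # hidden-output weights

# Flip the sign of every weight entering and leaving hidden neuron A.
w_ih_flipped = w_ih.copy()
w_ih_flipped[0] *= -1
w_ho_flipped = w_ho.copy()
w_ho_flipped[0] *= -1

# Because tanh is an odd function, both networks give the same predictions ...
same_output = np.allclose(predict(X, w_ih, w_ho),
                          predict(X, w_ih_flipped, w_ho_flipped))
# ... and the input-hidden-output products are unchanged.
same_products = np.allclose(w_ih * w_ho[:, None],
                            w_ih_flipped * w_ho_flipped[:, None])
print(same_output, same_products)
```

The individual weights differ in sign between the two networks, so a test applied to them separately could not distinguish a real change from this symmetry, whereas the products are identical.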

CONCLUSION

I reiterate the concern raised by a number of ecologists and explicit to my paper: Are
artificial neural networks a black box approach for modelling ecological phenomena? In
light of the synthesis provided here, I argue the answer is unequivocally no. I have reviewed
a series of methods, ranging from qualitative (i.e., NIDs) to quantitative (i.e., Garson's
algorithm and sensitivity analysis), for interpreting neural-network connection weights, and
have demonstrated the utility of these methods for shedding light on the inner workings of
neural networks. These methods provide a means for partitioning and interpreting the
contribution of input variables in the neural network modelling process. In addition, I
described a randomization procedure for testing the statistical significance of these
contributions in terms of individual connection weights and overall influence of each input
variable. The former case facilitates the interpretation of direct and interacting effects of
input variables on the response by removing connection weights that do not contribute
significantly to the performance of the neural network. In the latter case, the randomization
test assesses whether the contribution of a particular input variable to the response differs
from what would be expected by chance. The randomization procedure enables the removal
of null neural pathways and insignificant input variables, thereby aiding in the interpretation
of the neural network by reducing its complexity. In conclusion, by coupling the explanatory
insight of neural networks with their powerful predictive abilities, artificial neural networks
have great promise in ecology, as a tool to evaluate, understand, and predict ecological
phenomena.

ACKNOWLEDGEMENTS

I would like to thank Sovan Lek for his insightful comments regarding the finer
points of artificial neural networks, and for providing some of the original MatLab code for
this study. This manuscript was greatly improved by the comments of Brian Shuter.
Funding for this research was provided by a Graduate Scholarship from the Natural Sciences
and Engineering Research Council of Canada (NSERC) to J.D. Olden, and an NSERC
Research Grant to D.A. Jackson. Computer routines for all randomization tests are available
in the MatLab programming language from the authors upon request.

REFERENCES

Anderson, J. A. 1995. An Introduction to Neural Networks. MIT, Cambridge, Massachusetts,
650 pp.

Aoki, I., and T. Komatsu. 1999. Analysis and prediction of the fluctuation of sardine
abundance using a neural network. Oceanologica Acta 20: 81-88.

Aurelle, D., S. Lek, J. L. Giraudel, and P. Berrebi. 1999. Microsatellites and artificial neural
networks: tools for the discrimination between natural and hatchery brown trout (Salmo
trutta, L.) in Atlantic populations. Ecological Modelling 120: 313-324.

Bastarache, D., N. El-Jabi, N. Turkkan, and T. A. Clair. 1997. Predicting conductivity and
acidity for small streams using neural networks. Canadian Journal of Civil Engineering
24: 1030-1039.

Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 482
pp.

Brosse, S., J. F. Guégan, J. N. Tourenq, and S. Lek. 1999. The use of neural networks to
assess fish abundance and spatial occupancy in the littoral zone of a mesotrophic lake.
Ecological Modelling 120: 299-311.

Chen, D. G., and D. M. Ware. 1999. A neural network model for forecasting fish stock
recruitment. Canadian Journal of Fisheries and Aquatic Sciences 56: 2385-2396.

Cheng, B., and D. M. Titterington. 1994. Neural networks: a review from a statistical
perspective (with discussion). Statistical Science 9: 2-54.

Colasanti, R. L. 1991. Discussions of the possible use of neural network algorithms in
ecological modelling. Binary 3: 13-15.

Currie, D. J. 1991. Energy and broad-scale patterns of animal- and plant-species richness.
American Naturalist 137: 27-49.

Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Math.
Control Signals Systems 2: 303-314.

Dimopoulos, I., J. Chronopoulos, A. Chronopoulos-Sereli, and S. Lek. 1999. Neural network
models to study relationships between lead concentration in grasses and permanent urban
descriptors in Athens city (Greece). Ecological Modelling 120: 157-165.

Dimopoulos, Y., P. Bourret, and S. Lek. 1995. Use of some sensitivity criteria for choosing
networks with good generalization. Neural Processing Letters 2: 1-4.

Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.
Canadian Journal of Zoology 62: 1689-1695.

Edwards, M., and D. R. Morse. 1995. The potential for computer-aided identification in
biodiversity research. Trends in Ecology and Evolution 10: 153-158.

Funahashi, K. 1989. On the approximate realization of continuous mapping by neural
networks. Neural Networks 2: 183-192.

Gallant, S. I. 1993. Neural network learning and expert systems. MIT, Massachusetts, USA,
365 pp.

Garson, G. D. 1991. Interpreting neural-network connection weights. Artificial Intelligence
Expert 6: 47-51.

Gelfand, S. B., and S. K. Mitter. 1991. Recursive stochastic algorithms for global
optimization in R^d. SIAM Journal of Control Optimization 29: 999-1018.

Goh, A. T. C. 1995. Back-propagation neural networks for modelling complex systems.
Artificial Intelligence Engineering 9: 143-151.

Goodman, P. H. 1996. NevProp software, version 3, University of Nevada, Reno, NV.

Gozlan, R. E., S. Mastrorillo, G. H. Copp, and S. Lek. 1999. Predicting the structure and
diversity of young-of-the-year fish assemblages in large rivers. Freshwater Biology 41:
809-820.

Hagan, M. T., H. B. Demuth, and M. H. Beale. 1996. Neural Network Design. PWS
Publishing, Boston, MA.

Hornik, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are
universal approximators. Neural Networks 2: 359-366.


Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:
local versus regional processes. Ecology 70: 1472-1484.

Jackson, D. A., and K. M. Somers. 1989. Are probability estimates from the permutation
model of Mantel's test stable? Canadian Journal of Zoology 67: 766-769.

Laë, R., S. Lek, and J. Moreau. 1999. Predicting fish yield of African lakes using neural
networks. Ecological Modelling 120: 325-335.

Lek, S., A. Belaud, I. Dimopoulos, J. Lauga, and J. Moreau. 1995. Improved estimation,
using neural networks, of the food consumption of fish populations. Marine and
Freshwater Research 46: 1229-1236.

Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996a.
Application of neural networks to modelling nonlinear relationships in ecology.
Ecological Modelling 90: 39-52.

Lek, S., A. Belaud, P. Baran, I. Dimopoulos, and M. Delacoste. 1996b. Role of some
environmental variables in trout abundance models using neural networks. Aquatic
Living Resources 9: 23-29.

Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling,
an introduction. Ecological Modelling 120: 65-73.

Lek-Ang, S., L. Deharveng, and S. Lek. 1999. Predictive models of collembolan diversity
and abundance in a riparian habitat. Ecological Modelling 120: 247-260.

Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.
Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology 79:
2941-2956.

Manel, S., J.-M. Dias, and S. J. Ormerod. 1999. Comparing discriminant analysis, neural
networks and logistic regression for predicting species distributions: a case study with a
Himalayan river bird. Ecological Modelling 120: 337-347.

Mastrorillo, S., S. Lek, and F. Dauba. 1997a. Predicting the abundance of minnow Phoxinus
phoxinus (Cyprinidae) in the River Ariege (France) using artificial neural networks.
Aquatic Living Resources 10: 169-176.

Mastrorillo, S., S. Lek, F. Dauba, and A. Belaud. 1997b. The use of artificial neural networks
to predict the presence of small-bodied fish in a river. Freshwater Biology 38: 237-246.

Mastrorillo, S., F. Dauba, T. Oberdorff, J. F. Guégan, and S. Lek. 1998. Predicting local fish
species richness in the Garonne River basin. C.R. Academy of Sciences, Paris, Sciences
de la vie 321: 423-428.

Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of
the American Fisheries Society 118: 533-545.

Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat
modelling with interspecific interaction. Ecological Modelling 116: 15-31.

Paruelo, J. M., and F. Tomasel. 1997. Prediction of functional characteristics of ecosystems:
a comparison of artificial neural networks and regression models. Ecological Modelling
98: 173-186.

Perrone, M. P., and L. N. Cooper. 1993. When networks disagree: Ensemble methods for
hybrid neural networks. In: R. J. Mammone (Editor), Artificial Neural Networks for
Speech and Vision, Chapman and Hall, London, pp. 126-147.

Richerson, P. J., and K. L. Lum. 1980. Patterns of plant species diversity in California:
relation to weather and topography. American Naturalist 116: 504-536.

Ripley, B. D. 1994. Neural networks and related methods for classification. Journal of the
Royal Statistical Society, Series B 56: 409-456.

Ripley, B. D. 1995. Statistical ideas for selecting network architectures. In: B. Kappen and S.
Gielen (Editors), Neural networks: Artificial Intelligence and Industrial Applications.
Springer, London, pp. 183-190.

Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press,
403 pp.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by
back-propagating errors. Nature 323: 533-536.

Ryder, R. A. 1982. The morphoedaphic index-use, abuse, and fundamental concepts.
Transactions of the American Fisheries Society 111: 154-164.

Scardi, M. 1996. Artificial neural networks as empirical models for estimating phytoplankton
production. Marine Ecology Progress Series 139: 289-299.

Scardi, M., and L. W. Harding. 1999. Developing an empirical model of phytoplankton
primary production: a neural network case study. Ecological Modelling 120: 213-223.

Schleiter, I. M., D. Borchardt, R. Wagner, T. Dapper, K.-D. Schmidt, H. H. Schmidt, and H.
Werner. 1999. Modelling water quality, bioindication and population dynamics in lotic
ecosystems using neural networks. Ecological Modelling 120: 271-286.

Smith, M. 1994. Neural networks for statistical modelling. Van Nostrand Reinhold, NY,
USA, 235 pp.

Spitz, F., and S. Lek. 1999. Environmental impact prediction using neural network
modelling. An example in wildlife damage. Journal of Applied Ecology 36: 317-326.

Styblinski, M. A., and T. S. Tang. 1990. Experiments in non-convex optimization: stochastic
approximation and simulated annealing. Neural Networks 3: 467-484.

White, H. 1989. Learning in artificial neural networks: a statistical perspective. Neural
Computation 1: 425-464.

Wolpert, D. H. 1992. Stacked generalization. Neural Networks 5: 241-259.

Xu, L., A. Krzyzak, and C. Y. Suen. 1992. Methods for combining multiple classifiers and
their applications to handwriting recognition. Trans. IEEE on Systems, Man and
Cybernetics 22: 418-435.

CHAPTER 3

Artificial neural networks: A predictive tool for fisheries science

ABSTRACT

Understanding and predicting the effects of land-use practices and alterations to
nearshore habitat on fish populations is one of the main challenges confronting fisheries
biologists. Fish-habitat models play an important role in this regard as they provide a means
to predict changes in fish populations across different spatial and temporal scales.
Developing predictive models using traditional statistical approaches is problematic since
species often exhibit complex, nonlinear responses to habitat heterogeneity and biotic
interactions. In this study I demonstrate the ability of a robust statistical technique, artificial
neural networks (ANNs), to model such complexities in fish-habitat relationships. Using
ANNs I provide both explanatory and predictive insight into the within-lake and whole-lake
habitat factors shaping species abundance and occurrence in temperate lakes of south-central
Ontario, Canada. The results show that species presence or absence is highly predictable
based on whole-lake measures of habitat, and that these fish-habitat models exhibit good
generality in predicting occurrence in other lakes from an adjacent drainage. Detailed
evaluation of these models shows that by partitioning the predictive performance of the
models into measures such as sensitivity (ability to predict species presence) and specificity
(ability to predict species absence), the strengths and weaknesses of the models are assessed
more readily. Furthermore, by varying the decision threshold probability at which the
model predicts a species as being present or absent, rather than following the conventional
arbitrary threshold of 0.5, more powerful predictions were achieved. Finally, ANNs provide
a useful approach to examine the interaction effects of nearshore habitat conditions on
species abundance and spatial occupancy. I show that ANNs have considerable promise for
understanding habitat-related controls on fish populations and for predicting future states of
these populations, and can provide a valuable tool for fisheries management.

INTRODUCTION

Changing land-use patterns and alterations to nearshore habitat of lakes have caused
major changes in fish populations throughout the world. Efforts to understand the linkage
between habitat, its use by fish and subsequent productivity have become increasingly
important, and currently are a central issue in the aquatic sciences (Hughes and Noss 1992;
Harig and Bain 1998). In northern temperate lakes, anthropogenic activity has altered many
components of riparian areas and nearshore habitats (Jennings et al. 1999). Modifications
include changes in the composition and density of macrophytes (Bryan and Scarnecchia
1992), quantity and diversity of shoreline habitat such as woody material (Christensen et al.
1996), and size and uniformity of substrate particles (Beauchamp et al. 1994; Jennings et al.
1996). Littoral-zone alterations to habitat can have dramatic and persistent impacts on fish
assemblages since lake habitat provides the template upon which the organization and
dynamics of lentic ecosystems occur (Jackson and Harvey 1989; Tonn et al. 1990; Hinch et
al. 1991; Magnuson et al. 1998).

An obvious first step in any efficient, effective conservation, management or
restoration strategy is to obtain a good knowledge of the relationships between habitat
elements and fish populations. The ability to evaluate the effects of habitat change and other
human impacts on fish populations requires extensive surveying of the fish populations
before and after the change occurs (Lester et al. 1996). However, pollution, shoreline
development, and other forms of habitat degradation are often not single events whose timing
and magnitude are controllable; as well, they are cumulative in their impact. Individual
effects on populations may be so small relative to natural population variability that
statistically significant effects might be detectable only after many years of study. Predictive
fish-habitat models may play a useful role in this regard by providing the ability to forecast
the effects of habitat modification and changing land-use patterns on fish populations and
communities. For instance, fish-habitat models could provide lake managers with the ability
to predict species occurrence and abundance at different spatial scales using whole-lake and
within-lake measures of habitat. Ultimately, predictive models would enhance managers'
ability to predict temporal and spatial scales at which habitat can be changed while
minimizing the impact to lake fish populations.

Although fish-habitat models play important roles in fisheries ecology and

management, developing such models is often difficult because species exhibit complex,

non-linear responses to habitat heterogeneity and biotic interactions. Multiple linear

regression and linear discriminant analysis remain the most frequently used techniques for

modeling fish-habitat relationships, although our confidence in the results is often limited by

the inability to meet a number of assumptions, such as error structure of the variables,

independence of variables, and model linearity (James and McCulloch 1990). The last

assumption is particularly susceptible to violation with ecological data since species

generally exhibit nonlinear or non-monotonic associations with environmental conditions.

Data transformations of variables can improve the results of traditional approaches, but they

are often only partially successful (e.g., Lek et al. 1996; Guégan et al. 1998; Wally and

Fontama 1998). Furthermore, the choice of transformation can influence the results, and thus

potentially bias our interpretation of ecological relationships.

Artificial neural networks (ANNs) are a promising alternative to traditional statistical

approaches as they provide a powerful, flexible learning technique for uncovering non-linear

patterns in data. Applications of ANNs are diverse within the scientific literature, ranging

from social sciences to chemistry, and recently are beginning to receive more attention in the

ecological sciences for solving complex pattern-recognition problems (Colasanti 1991;

Edwards and Morse 1995; Lek et al. 1996). ANNs are believed to provide better solutions

than traditional methods when applied to complex systems that may be poorly defined and

understood, and situations where input data are incomplete or ambiguous by nature. Unlike

the more commonly used methods, neural networks do not require particular functional

relationships, make no assumptions regarding the distributional properties of the data, and

require no a priori understanding of system relationships. This makes artificial neural

networks a potentially powerful modeling tool for exploring complex, non-linear biological

problems, such as fish-habitat relationships.

The primary objectives of my study are to demonstrate the use of artificial neural

networks for modeling ecological relationships and illustrate their ability to provide insight

into understanding and predicting relationships between fish populations and the

environment. These objectives are addressed by first modeling relationships between lake-

wide habitat attributes and species occurrence. More specifically, I determine the

predictability of species presence-absence based on readily available, whole-lake habitat

factors (e.g., surface area, maximum depth, elevation) and provide a detailed evaluation of

these models by estimating optimal decision thresholds for prediction to maximize

classification success, sensitivity and specificity of the models. Next, I test the performance

of these models for predicting species occurrences for a second set of lakes, providing an

assessment of the generality of the models. Given that species abundance perhaps is a more

sensitive response variable for studying fish-habitat relationships, I model associations

between within-lake species abundances and near-shore habitat features (e.g., macrophyte

cover, substrate types, site exposure) of littoral zone fishes. Many of these analyses represent

an advance over more conventional model evaluations, and provide important insight into the

predictability of species occurrence and abundance as a function of the macro- and micro-

habitat of lakes. Finally, I present a randomization approach for ANNs that enables

researchers to readily identify the variables that contribute significantly to predicting species

occurrence and abundance. I demonstrate that ANNs can provide powerful predictions and

important explanatory insight into fish-habitat relationships at both regional and local scales.

ARTIFICIAL NEURAL NETWORKS

The ability of the human brain to perform complex tasks, such as pattern recognition,

has motivated a large body of research exploring the computational capabilities of highly

connected networks of relatively simple elements called artificial neural networks (ANNs).

Although ANNs were initially developed to better understand how the mammalian brain

functions, researchers in a variety of scientific disciplines have become more interested in the

potential mathematical utility of neural network algorithms for addressing an array of

problems. For example, ANNs have shown great promise for solving complex pattern-

recognition problems and for developing prediction or classification rules in the biological

sciences (Colasanti 1991; Edwards and Morse 1995; Lek et al. 1996; Lek and Guégan 1999).

Previous studies using ANNs are too numerous to list here; however, their use in fisheries

applications has been limited and includes modeling fish species richness (Guégan et al.

1998), presence-absence (Mastrorillo et al. 1997), abundance (Lek et al. 1996; Brosse et al.

1999), and production (Chen and Ware 1999).

Although there are many types of ANNs (Bishop 1995; Ripley 1996), here I describe

the type used most frequently: the one hidden-layer, feedforward neural network trained by

the back-propagation algorithm (Rumelhart et al. 1986). These neural networks are

extremely popular and have been used extensively in the biological literature since they are

considered to be universal approximators of any continuous function (Cybenko 1989;

Funahashi 1989; Hornik et al. 1989). Furthermore, single hidden-layer networks greatly

reduce computational time and often produce similar results compared to multiple hidden-

layers (Kůrková 1992). Below, I discuss the two primary features of ANNs: network

architecture and the back-propagation algorithm used to parameterize the network.

Network architecture and the back-propagation algorithm

Network architecture refers to the number and organization of the computing units

(called neurons) in the network. In the one hidden-layer feedforward network, neurons are

organized in an input layer, a hidden layer and an output layer, with each layer containing

one or more neurons (Fig. 3.1). Each neuron is connected to all neurons in adjacent layers

with an axon; however, neurons within each layer and in non-adjacent layers are not

connected. The input layer typically contains p neurons, representing predictor variables x_1

... x_p, i.e., one neuron for each of the predictor variables. The number of neurons in the

hidden layer is determined empirically by the investigator to balance the trade-off between

bias and variance (Geman et al. 1992). Additional hidden neurons increase the ability of a

network to approximate any underlying relationship, i.e., reduce bias, but will result in a

network having an enormous number of free parameters, i.e., increasing variance in

predictions due to overfitting the data. Although mathematical derivations exist for selecting

an optimal design, in practice it is common to train networks with different numbers of

hidden neurons and use the performance on a test set to choose the network that performs the

best. For continuous and binary response variables the output layer commonly contains one

neuron. However, the number of output neurons can be greater than one if there is more than

one response variable or if the response variable is categorical (i.e., a separate neuron for

classifying observations into each category). Additional neurons with a constant output (commonly set to 1) are also added to

the hidden and output layers (Fig. 3.1), although this is not mandatory. These are called bias

neurons, and play a similar role to that of the constant term in multiple regression analysis.

Figure 3.1. One-hidden layer, feedforward neural network design.

The connection between any two neurons is assigned a weight that dictates the

intensity of the signal they transmit through the axon. Consequently, the "state" or "activity

level" of each neuron is determined by the input received from the other neurons connected

to it. In feed-forward networks, axon signals are transmitted in a unidirectional path from

input layer to output layer through hidden layers. The states of the input neurons are defined

by the incoming signal (i.e., values) of the predictor variables. The state of each hidden

neuron is evaluated locally by calculating the weighted sum of the incoming signals from the

neurons of the input layer (Fig. 3.1 inset) and then a bias input is added. The weighted sum

is then subjected to an activation function, i.e., a differentiable function of the neuron's total

incoming signal from the input neurons, in order to produce the state of the hidden neuron

(Fig. 3.1 inset). The same procedure described above is repeated for the axon signals from

the hidden layer to the output layer. The entire process can be written mathematically as:

$$y_k = \phi_o\left(\beta_k + \sum_j w_{jk}\,\phi_h\left(\beta_j + \sum_i w_{ij}\,x_i\right)\right) \tag{3.1}$$

where $x_i$ are the input signals, $y_k$ are the output signals, $w_{ij}$ are the weights between input

neuron i and hidden neuron j, $w_{jk}$ are the weights between hidden neuron j and output neuron k,

$\beta_j$ and $\beta_k$ are the biases associated with the hidden and output layers, and $\phi_h$ and $\phi_o$ are

activation functions for the hidden and output layers. There are several activation functions,

but the logistic function defined as:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{3.2}$$

is the most commonly used.
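The forward pass described by eqs. (3.1) and (3.2) can be sketched in a few lines of Python. This is a minimal illustration for a single output neuron with logistic activations in both layers; the function names and the example weights are hypothetical and are not taken from the Matlab code in the appendix.

```python
import math


def logistic(z):
    """Logistic activation, f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a one-hidden-layer network with one output neuron.

    x    : list of p input values
    w_ih : w_ih[i][j] is the weight from input i to hidden neuron j
    b_h  : hidden-layer biases, one per hidden neuron
    w_ho : w_ho[j] is the weight from hidden neuron j to the output neuron
    b_o  : output-layer bias
    """
    hidden = [
        logistic(b_h[j] + sum(x[i] * w_ih[i][j] for i in range(len(x))))
        for j in range(len(b_h))
    ]
    y = logistic(b_o + sum(h * w for h, w in zip(hidden, w_ho)))
    return hidden, y


# Hypothetical weights for 2 inputs and 2 hidden neurons (illustration only).
hidden, y = forward([0.5, -1.0], [[1.0, -1.0], [0.5, 2.0]], [0.0, 0.0], [1.0, -1.0], 0.0)
```

Because the output activation is logistic, `y` always lies between 0 and 1, which is what allows it to be read as a probability of occurrence.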

Training the neural network involves an error back-propagation algorithm which

finds a set of connection weights that produce an output signal that has a small error relative

to the observed output. During training the weights are adapted to minimize some fitting

criterion. For continuous output variables, the most commonly used criterion is the least-

squares error function:

$$E = \frac{1}{2}\sum_{n}\left(t_n - y_n\right)^2 \tag{3.3}$$

For dichotomous output variables, the most commonly used criterion is the cross entropy

(i.e., similar to log-likelihood) error function (Bishop 1995):

$$E = -\sum_{n}\left[t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\right] \tag{3.4}$$

where $t_n$ is the observed output value and $y_n$ is the predicted output value for observation n.

The algorithm adjusts connection weights in a backwards fashion, layer by layer, in the

direction of steepest descent in minimizing the error function (also called gradient descent).

One iteration of the gradient descent algorithm can be summarized as:

$$\Delta w_{st} = -\eta\,\frac{\partial E}{\partial w_{st}} \tag{3.5}$$

where $\Delta w_{st}$ is the weight change between neuron s and neuron t in the next layer, and $\eta$ is the learning rate. The

training of the network is a recursive process where observations from the training data are

entered into the network in turn, each time modifying the input-hidden and hidden-output

connection weights (using eq. (3.5)). This procedure is repeated with the entire training

dataset (i.e., each of the n observations) for a number of iterations until a stopping rule is

satisfied. This type of training is a sequential approach to network optimization, and

contrasts with the batch approach where the entire data set is used to adjust the weights

during each iteration (Bishop 1995). Commonly, network training is stopped when the

difference between predicted outputs from the network and the observed output (i.e., the

error function) is small, or it is stopped to minimize the possibility of overfitting the data.
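The sequential training procedure can be illustrated with a small Python sketch. It assumes a single logistic output neuron and the cross-entropy error, for which the error signal at the output reduces to the difference between predicted and observed values. The momentum term, the stopping rule and all names here (`train`, `predict`, `eta`, `n_iter`) are simplifications or assumptions made for illustration, not the implementation used in this study.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w_ih, b_h, w_ho, b_o):
    """Network output for one observation x."""
    h = [logistic(b_h[j] + sum(x[i] * w_ih[i][j] for i in range(len(x))))
         for j in range(len(b_h))]
    return logistic(b_o + sum(h[j] * w_ho[j] for j in range(len(b_h))))


def train(X, t, n_hidden=2, eta=0.5, n_iter=1000, seed=1):
    """Sequential gradient descent: weights change after every observation."""
    rng = random.Random(seed)
    p = len(X[0])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(p)]
    b_h = [0.0] * n_hidden
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    for _ in range(n_iter):
        for x, target in zip(X, t):
            h = [logistic(b_h[j] + sum(x[i] * w_ih[i][j] for i in range(p)))
                 for j in range(n_hidden)]
            y = logistic(b_o + sum(h[j] * w_ho[j] for j in range(n_hidden)))
            d_o = y - target  # output error signal for cross entropy
            for j in range(n_hidden):
                # Error propagated back to hidden neuron j (uses the old w_ho[j]).
                d_h = d_o * w_ho[j] * h[j] * (1.0 - h[j])
                w_ho[j] -= eta * d_o * h[j]
                b_h[j] -= eta * d_h
                for i in range(p):
                    w_ih[i][j] -= eta * d_h * x[i]
            b_o -= eta * d_o
    return w_ih, b_h, w_ho, b_o


# Toy data: the response is 1 when the single predictor is positive.
X = [[-2.0], [-1.0], [1.0], [2.0]]
t = [0, 0, 1, 1]
params = train(X, t)
```

A batch version would instead accumulate the weight changes over all observations and apply them once per iteration.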

Interpreting neural network connection weights

Although many studies have shown ANNs to exhibit superior predictive power

compared to traditional approaches (e.g., Lek et al. 1996), researchers often call it a 'black

box' approach to statistical modeling since the networks are believed to provide little

explanatory insight into the relative influence of the independent variables in the prediction

process (e.g., Lek and Guégan 1999; Özesmi and Özesmi 1999). The lack of explanatory

power is a major concern since the interpretation of statistical models is desirable for gaining

knowledge of the causal relationships driving ecological phenomena. This was a major

pitfall of ANNs since traditional statistical approaches can readily identify the influence of

the independent variables in the modeling process, as well as provide a degree of confidence

regarding their contribution. Recent studies provide greater insight about the inner workings

of ANNs, thus providing a variety of methods for quantifying and interpreting the

contributions of the independent variables in the network. For example, a number of

intensive computational approaches have been developed such as growing and pruning

algorithms (Bishop 1995), partial derivatives (e.g., Dimopoulos et al. 1995) and asymptotic t-

tests. However, these approaches are often not used by biologists, who prefer simpler

algorithms that directly use the network connection weights.

In the neural network, the connection weights between neurons are the linkages

between the inputs and the output of the network, and therefore are the linkage between the

problem and the solution. The relative contribution of the independent variables to the

predictive output of the neural network depends primarily on the magnitude and direction of

the connection weights. Input variables with larger connection weights represent greater

intensities of signal transfer, and therefore are more important in predicting the output

compared to variables with smaller weights. Negative connection weights represent

inhibitory effects on neurons (reducing the intensity or contribution of the incoming signal

and negatively affecting the output), whereas positive connection weights represent

excitatory effects on neurons (increasing the intensity of the incoming signal and positively

affecting the output). Recently, a number of studies have used connection weights to

interpret the participation of input variables in predicting the output of the network (e.g.,

Aoki and Komatsu 1997; Chen and Ware 1999; Özesmi and Özesmi 1999). Other

approaches involve using all the weights of the network to quantify overall variable

importance (e.g., Garson 1991) and sensitivity analysis to determine the spectrum of input

variable contributions in the neural network (e.g., Lek et al. 1996; Mastrorillo et al. 1997;

Guégan et al. 1998). Although these approaches can determine the overall influence of each

predictor variable, interpretation of interactions among the variables is more difficult to

assess since the strength and direction of individual axon connection weights within a

network must be examined directly. With even small networks, the number of connections is

large, and thus the interpretation of the network is difficult. For example, a network

containing 10 input neurons and 7 hidden neurons would have 70 input-hidden connection weights to

examine. Bishop (1995) suggested removing small weights from the network to ease

interpretation; however, how does one decide which weights should be retained or eliminated

from the network? I developed a randomization test for artificial neural networks to address

this question in Chapter 2. This approach randomizes the response variable, then constructs a

neural network using the randomized data and records all input-hidden-output connection

weights (product of the input-hidden and hidden-output weights). This process is repeated a

large number of times to generate a null distribution for each input-hidden-output connection

weight, which is then compared to the observed values to calculate the significance level.

The randomization test provides an objective pruning technique for eliminating connection

weights that have minimal influence on the network output and identifies independent

variables that significantly contribute to the prediction process.
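The logic of the randomization test can be sketched as follows. To keep the example short and self-contained, the network-fitting step is passed in as a function (`fit_statistics`, a hypothetical name) that returns one statistic per predictor; in the approach described above this would be a trained network's input-hidden-output weight products, whereas the demonstration below substitutes a simple covariance as a stand-in. The two-tailed comparison on absolute values and the inclusion of the observed value in the null distribution are assumptions of this sketch.

```python
import random


def randomization_p_values(X, y, fit_statistics, n_perm=999, seed=1):
    """Permutation p-value for each statistic returned by fit_statistics(X, y)."""
    rng = random.Random(seed)
    observed = fit_statistics(X, y)
    # Start each count at 1 so the observed value is part of the null distribution.
    exceed = [1] * len(observed)
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)  # randomize the response variable
        null = fit_statistics(X, y_perm)
        for k, (obs, rnd) in enumerate(zip(observed, null)):
            if abs(rnd) >= abs(obs):
                exceed[k] += 1
    return [count / (n_perm + 1) for count in exceed]


def covariances(X, y):
    """Stand-in statistic (NOT a neural network): covariance of each predictor with y."""
    n = len(y)
    y_bar = sum(y) / n
    stats = []
    for i in range(len(X[0])):
        x_bar = sum(row[i] for row in X) / n
        stats.append(sum((row[i] - x_bar) * (v - y_bar) for row, v in zip(X, y)) / n)
    return stats


# Toy data: predictor 0 tracks the response, predictor 1 is unrelated to it.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [0] * 10 + [1] * 10
p_values = randomization_p_values(X, y, covariances, n_perm=199)
```

Statistics whose p-value exceeds the chosen significance level would be candidates for pruning under this scheme.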

METHODS

Fish-habitat models for predicting fish presence/absence

The study sites consisted of 128 lakes from the Madawaska River basin and 32 lakes

from the Oxtongue River basin, located in Algonquin Provincial Park, south-central Ontario,

Canada (45°50'N, 78°20'W; Fig. 3.2). Aquatic communities in this region are representative

of relatively natural ecosystems because these lakes are located in a provincial park and are

currently subject to minimal perturbations from development and species introductions. I

developed fish-habitat models for 9 fish species: brown bullhead, common shiner, creek

chub, golden shiner, lake trout, northern redbelly dace, pumpkinseed, smallmouth bass and

yellow perch by modeling species presence-absence as a function of 7 whole-lake variables

(Table 3.1). Predictor variables were chosen to include factors that are related to known

habitat requirements of fish in this geographic region (Matuszek and Beggs 1988; Minns

1989) and included surface area, total shoreline perimeter, maximum depth, total dissolved

solids, pH, lake elevation, and occurrence of summer stratification. For small-bodied fish

(i.e., common shiner, creek chub, golden shiner, and northern redbelly dace) I included the

presence or absence of smallmouth bass, largemouth bass or northern pike as an extra

predictor variable since littoral-zone predation could be an important force. Data were

obtained from the Algonquin Park Fish Inventory Data Base (Crossman and Mandrak 1991),

and details of sampling methodologies are described in Dodge et al. (1985).

The optimal number of neurons in the hidden layer was determined empirically by

comparing the performance of different networks, with 1 to 20 hidden neurons, and choosing

the network with the best predictive performance. I included learning rate (η) and

momentum (α) parameters (varying as a function of the network model) during network

training to ensure a high probability of global network convergence (Bishop 1995), and

considered a maximum of 1000 iterations for the back-propagation algorithm to determine

the optimal axon weights. Prior to training the neural network, the independent variables

were converted to z-scores to standardize the measurement scales of the inputs into the

network, and thus to ensure that the same percentage change in the weighted sum of the inputs

caused a similar percentage change in the unit output.
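As a brief illustration of the standardization step, the following Python sketch converts one predictor to z-scores. The use of the sample standard deviation (n - 1 denominator) is an assumption; the text does not state which form was used.

```python
import math


def z_scores(values):
    """Standardize a list of values to mean 0 and (sample) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


# Hypothetical lake surface areas (ha), for illustration only.
z = z_scores([10.0, 20.0, 30.0])
```

Each predictor would be standardized separately so that variables measured in hectares, metres or pH units enter the network on a common scale.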

To evaluate predictive performance, fish-habitat models were validated using two

approaches. First, n-fold or "leave-one-out" cross validation (also referred to as jackknife

validation) was used to assess model performance using 128 lakes

from the Madawaska River drainage, as this provides a nearly unbiased estimate of model

performance (Olden and Jackson 2000). Second, I tested the ability of the Madawaska-

drainage models to predict species presence-absence in 32 lakes from Oxtongue River

drainage. This analysis provided an opportunity to assess the generalization of the models to

other drainages in the same geographic region.

Figure 3.2. First panel shows the location of study lakes from the Madawaska River

drainage (128 lakes depicted by circles) and Oxtongue River drainage (32 lakes depicted by

triangles) in Algonquin Provincial Park, Ontario, Canada (45°50'N, 78°20'W). Second

panel shows Crosson Lake (45°05'N, 77°20'W) with 20 sampling stations depicted by

circles.

Table 3.1. Summary statistics of whole-lake habitat variables used in the neural networks to

predict species presence or absence.

Columns: Macro-scale variables; Madawaska River Drainage (Training Data); Oxtongue River Drainage (Test Data).

Rows: Area (ha); Maximum Depth (m); Shoreline Perimeter (km); Elevation (m); Total Dissolved Solids (mg/L); pH; Summer Stratification (0, 1); Littoral-zone predator (0, 1).
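The leave-one-out validation described above can be sketched generically in Python. The `fit` and `predict` arguments are placeholders for whatever model is being validated; the stand-in model used in the demonstration simply predicts the frequency of presence in the training lakes and is not a neural network.

```python
def leave_one_out(X, y, fit, predict):
    """Return one out-of-sample prediction per observation."""
    predictions = []
    for i in range(len(X)):
        # Hold out observation i and fit the model to the remaining n - 1.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return predictions


def fit_frequency(X_train, y_train):
    """Stand-in model: the proportion of training lakes where the species is present."""
    return sum(y_train) / len(y_train)


def predict_frequency(model, x):
    return model


# Hypothetical data for four lakes (illustration only).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1, 0, 1, 1]
loo_predictions = leave_one_out(X, y, fit_frequency, predict_frequency)
```

With 128 lakes this requires fitting 128 models, each trained on 127 lakes and tested on the one lake held out.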

The output value from the ANN ranges from 0 to 1, and represents the probability of

species occurrence in a particular lake. I partitioned the overall classification success of each

species model by deriving "confusion matrices" following Fielding and Bell (1997). Using

these matrices I examined three metrics of prediction success. First, I quantified the overall

classification performance of the model as the percentage of lakes where the model correctly

predicted the presence or absence of the species (CC). Second, I examined the ability of the

model to accurately predict species presence, termed model sensitivity (SE). Third, I

examined the ability of the model to accurately predict species absence, termed model

specificity (SP). Rather than simply following the conventional decision threshold of 0.5 to

classify a species as present or absent, I constructed Receiver-Operating Characteristic

(ROC) plots for each species to estimate the predictive ability of the models over all decision

thresholds (Metz 1978). A ROC graph is a plot of the sensitivity/specificity pairs resulting

from continuously varying the decision threshold over the entire range of results observed.

The optimal decision threshold was chosen to maximize overall classification performance of

the model, given equal costs of misclassifying the species as present or absent. The optimal

decision threshold was then used to calculate CC, SE and SP, and Cohen's kappa statistic

was used to assess whether the performance of the model differed from expectations based

on chance alone (Titus et al. 1984).
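The confusion-matrix metrics and the threshold search can be illustrated with a short Python sketch. It assumes that a lake is classified as "present" when the predicted probability is at or above the threshold, and that candidate thresholds are the observed predicted values; both are assumptions of this example, as are the function names.

```python
def confusion(observed, probs, threshold):
    """Counts of true/false positives and negatives at a given decision threshold."""
    tp = fp = tn = fn = 0
    for obs, p in zip(observed, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and obs == 1:
            tp += 1
        elif pred == 1 and obs == 0:
            fp += 1
        elif pred == 0 and obs == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Correct classification (CC), sensitivity (SE), specificity (SP), Cohen's kappa."""
    n = tp + fp + tn + fn
    cc = (tp + tn) / n
    se = tp / (tp + fn) if tp + fn else float("nan")
    sp = tn / (tn + fp) if tn + fp else float("nan")
    # Agreement expected by chance, from the row and column totals.
    p_exp = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (cc - p_exp) / (1.0 - p_exp) if p_exp != 1.0 else float("nan")
    return cc, se, sp, kappa


def optimal_threshold(observed, probs):
    """Threshold maximizing CC (equal misclassification costs); lowest one on ties."""
    candidates = sorted(set(probs))
    return max(candidates, key=lambda th: metrics(*confusion(observed, probs, th))[0])


# Hypothetical observations and predicted probabilities for five lakes.
observed = [0, 0, 1, 1, 1]
probs = [0.1, 0.4, 0.35, 0.6, 0.8]
best = optimal_threshold(observed, probs)
cc, se, sp, kappa = metrics(*confusion(observed, probs, best))
```

Plotting SE against 1 - SP for every candidate threshold would give the ROC curve itself.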

Fish-habitat models for predicting fish abundance

The within-lake analysis examined fish-habitat associations for 4 of the most

abundant species (golden shiner, creek chub, pumpkinseed and yellow perch) in Crosson

Lake, south-central Ontario (45°05'N, 79°02'W). Sampling was done during two time

periods in July and August of the same summer and the sampling period was coded and

included as a predictor variable to determine whether a temporal component was important in

predicting relative abundance. Sampling consisted of approximately 24-hour sets of baited

minnow traps at depths of either 0.5 m or 1.5 m around the perimeter of the lake (Fig. 3.2).

Species relative abundances were calculated by standardizing the catch to a 24-hour sampling

period. Habitat was assessed visually from within a boat at each sampling location. Sites

were categorized on the basis of relative cover of vegetation (none, sparse, moderate, or

dense), relative cover of woody materials (none, sparse, moderate, or dense), bottom type

(categorized into 8 ordered categories based on particle size: muck, clay, silt,

sand, gravel, rubble, boulder, and bedrock), presence of terrestrial leaf litter, and degree of

exposure (none, limited, moderate, extreme). The degree of vegetation cover and woody

material cover was coded as 0, 1, 2, or 3 depending on whether the site was classified as

having none, sparse, moderate or dense cover. Exposure and bottom type were coded in a

similar manner. Some sites contained multiple bottom types and these were averaged to give

a single value per site. The number of bottom types present was calculated to provide a

measure of the diversity of bottom types present.
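The habitat coding can be made concrete with a small Python sketch. The cover and exposure codes (0 to 3) follow the text; numbering the eight ordered bottom types from 1 to 8 is an assumption, since the text says only that they were coded "in a similar manner". The function and dictionary names are hypothetical.

```python
COVER = {"none": 0, "sparse": 1, "moderate": 2, "dense": 3}
EXPOSURE = {"none": 0, "limited": 1, "moderate": 2, "extreme": 3}
# Assumed numbering: finest particles = 1, coarsest = 8.
SUBSTRATE = {
    name: rank
    for rank, name in enumerate(
        ["muck", "clay", "silt", "sand", "gravel", "rubble", "boulder", "bedrock"],
        start=1,
    )
}


def code_site(vegetation, wood, exposure, bottom_types):
    """Convert the field categories for one site into numeric predictors."""
    codes = [SUBSTRATE[b] for b in bottom_types]
    return {
        "vegetation": COVER[vegetation],
        "wood": COVER[wood],
        "exposure": EXPOSURE[exposure],
        # Sites with several bottom types receive the average code.
        "substrate": sum(codes) / len(codes),
        "substrate_diversity": len(set(bottom_types)),
    }


# Hypothetical site with two bottom types.
site = code_site("sparse", "dense", "limited", ["sand", "gravel"])
```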

Associations between species abundance and fine-scale habitat variables were

modeled using ANNs, and the optimal number of hidden neurons was determined. The

dependent variable was standardized to the range from 0 to 1 so that it conformed to the

requirements of the sigmoid transfer function used in the building of the neural network, and

independent variables were z-scored (see above section for details). Predictive performance

of the models was evaluated using n-fold cross validation as was done with the species-

occurrence models. Performance of the models was assessed using the Pearson product-

moment correlation between predicted and actual species abundance, and the root-mean-

square-of-error (RMSE) of the predicted values. The Pearson correlation provides a measure

of model accuracy with better models represented by correlation coefficients approaching 1.

RMSE measures model precision with small values representing high precision and large

values indicating low precision.
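The rescaling of the dependent variable and the two performance measures can be written out as a short Python sketch. Min-max scaling is assumed here as the way of standardizing abundance to the range 0 to 1; the function names are illustrative only.

```python
import math


def rescale_01(values):
    """Min-max scaling of the response to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def rmse(predicted, observed):
    """Root-mean-square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical values (illustration only).
scaled = rescale_01([2.0, 4.0, 6.0])
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
error = rmse([1.0, 2.0], [1.0, 4.0])
```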

Matlab programming code for training neural networks for species presence-absence

and abundance is presented in Appendix E.

RESULTS

Fish-habitat models for predicting fish occurrence

Whole-lake attributes were successful in predicting species presence or absence

(Table 3.2). Species were classified correctly in 60.9 to 80.5% of the lakes whereas levels of

model sensitivity and specificity varied widely among species and between drainages. In the

Madawaska drainage the predictive performance for 7 out of the 9 species-habitat models

differed significantly from random. Smallmouth bass and lake trout exhibited the highest

classification rates, creek chub and pumpkinseed showed the greatest sensitivity and brown

bullhead and golden shiner had the greatest specificity. The neural interpretation diagrams (NIDs)

for smallmouth bass, lake trout, common shiner and northern redbelly dace are shown in

Figure 3.3. In these diagrams, the relative magnitude of the connection weights is

represented by line thickness (i.e., thicker lines representing greater weights) and line

shading represents the direction of the weights (i.e., black lines represent positive signals and

gray lines represent negative signals). The relationship between the inputs and outputs is

determined in two steps since there are input-hidden layer connections and hidden-output

layer connections. Positive effects of input variables are depicted by positive input-hidden

and positive hidden-output connection weights, or negative input-hidden and negative

hidden-output connection weights. Negative effects of input variables are depicted by

positive input-hidden and negative hidden-output connection weights, or by negative input-

hidden and positive hidden-output connection weights. The multiplication of connection

weight directions (positive or negative) indicates the effect each input variable has on the

response variable. Interactions among predictor variables can be identified as input variables

with opposing connection weights entering the same hidden neuron. The total contribution

of an input variable is calculated as the sum of the products of the input-hidden and hidden-

output connection weights.
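The calculation of total contributions from connection weights is simple enough to show directly. In this Python sketch the weights are hypothetical, and expressing importance as a percentage of the summed absolute contributions is an assumption about how the relative values were scaled.

```python
def input_contributions(w_ih, w_ho):
    """Sum, over hidden neurons, of (input-hidden weight) x (hidden-output weight).

    w_ih[i][j] : weight from input i to hidden neuron j
    w_ho[j]    : weight from hidden neuron j to the output neuron
    """
    return [
        sum(w_ih[i][j] * w_ho[j] for j in range(len(w_ho)))
        for i in range(len(w_ih))
    ]


def relative_importance(contributions):
    """Absolute contributions expressed as a percentage of their total."""
    total = sum(abs(c) for c in contributions)
    return [100.0 * abs(c) / total for c in contributions]


# Hypothetical weights: 2 inputs, 2 hidden neurons.
w_ih = [[2.0, -1.0], [-0.5, 1.0]]
w_ho = [1.0, -3.0]
contributions = input_contributions(w_ih, w_ho)
importance = relative_importance(contributions)
```

Here the first input has a positive overall effect, because a negative input-hidden weight paired with a negative hidden-output weight adds to the positive product through the other hidden neuron, while the second input has a negative overall effect.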

Table 3.2. Performance of neural networks for predicting species presence or absence in 128

lakes in the Madawaska River drainage (Training Data) based on leave-one-out cross validation,

and applying the Madawaska networks to predicting occurrence in 32 lakes from the Oxtongue

River drainage (Test Data). The reported values are per cent species occurrence (SO), # of

hidden neurons in network (HN), optimal decision threshold based on ROC analysis (ODT), per

cent correct classification (CC), sensitivity (SN), specificity (SP), Kappa statistic and associated

P-value. Note that ODT values of the Madawaska drainage models were used for predicting

species occurrence in the Oxtongue drainage.

Panels: Madawaska River Drainage (Training Data); Oxtongue River Drainage (Test Data).

Columns: Species; HN; SO; ODT; CC; SN; SP; Kappa; P.

Rows: Brown Bullhead; Common Shiner; Creek Chub; Golden Shiner; Lake Trout; N. Redbelly Dace; Pumpkinseed; Smallmouth Bass; Yellow Perch.

Paired values listed under the Madawaska River Drainage panel: (0.41, 0.59); (0.41, 0.59); (0.10, 0.90); (0.46, 0.54); (0.46, 0.54); (0.47, 0.53); (0.31, 0.79); (0.61, 0.39); (0.48, 0.52).

[Figure 3.3. Neural interpretation diagrams for predicting species presence or absence; panels include Lake Trout and Northern Redbelly Dace; input variables shown include Area, Maximum Depth, Shoreline, TDS, pH, Summer Stratification and Presence of predator.]

Individual and interacting influences of the habitat variables on the predicted

probability of species occurrence were interpreted when connection weights differed

significantly from random (based on α = 0.05). The probability of smallmouth bass

occurrence is positively correlated with lake area, shoreline perimeter and TDS through

hidden neuron A, as well as by lake elevation through hidden neuron B (Figure 3.3). In

contrast, pH (hidden neuron A) and elevation (hidden neurons C and D) negatively influence

the probability of occurrence. Focusing on hidden neuron C, it is evident that the effects of

lake elevation and TDS interact such that the negative influence of elevation on the

probability of smallmouth bass occurrence weakens as TDS increases. Summing weights

across all hidden neurons shows that shoreline perimeter and TDS have a significant positive

effect on the predicted probability of smallmouth bass occurrence (Fig. 3.4). The lake trout

NID shows that lake area and shoreline perimeter interact through hidden neuron A, resulting

in the negative influence of surface area weakening as shoreline perimeter increases (Fig.

3.3). Increasing maximum depth, shoreline perimeter and elevation result in an increased

probability of the occurrence of lake trout (Fig. 3.4). Similar to lake trout, the probability of

common shiner occurrence is affected by the same interaction between area and shoreline

perimeter (hidden neuron C; Fig. 3.3). No habitat variables significantly contribute to

predicted probabilities of common shiner occurrence, although lake area shows the strongest

influence (Fig. 3.4). Probability of northern redbelly dace occurrence decreases with the

presence of a littoral-zone predator. However, this negative influence weakens with

increasing shoreline perimeter and elevation (hidden neuron D; Fig. 3.3). Maximum depth

and elevation positively influence the probability of northern redbelly dace occurrence,

whereas the presence of a littoral-zone predator has a strong negative influence (Fig. 3.4).

Fish-habitat models for predicting fish abundance

Within-lake variables predict species abundance with good accuracy and precision for

creek chub (r=0.833, RMSE=0.194); golden shiner (~0 .783 , RMSE=0.260); pumpkinseed

(~0.734, RMSE=0.209); and yellow perch (~0.784, RMSE=0.204). The MDs highlight

relationships between predicted abundances and habitat for each species (Fig. 3.5). For

yellow perch the positive influence of wood cover on predicted abundance weakens with

increasing density of vegetation (hidden neuron E), and the positive relationship between

predicted abundance and depth diminishes with increasing site exposure çnidden neuron A;

Smallmouth Bass Lake Trout 'O 1

Northern Wner Redbelly Dace

Habitat variables

Figure 3.4. Relative importance (% of total contribution) of whole-lake habitat variables in

predicting species presence or absence based on the sum of c o ~ e c t i o n weighrs joining an

input neuron and the output neuron.

Substrate tup. - Yeiiow Perch

of litter

Vegetation

Cover

Presence

Exposure

Sampling rnonth

Substrate rype

Substrate diversity

Presence of litter

Vegetation - - - -

Cover

Exposure

Depth

Sampling rnonth

Golden S hiner

Figure 35. Neural interpretation diagram (NID) for predichiig fish species abundance as a huiction of withia- lake variables. The thickness of the lines joining neurons is proportionai to the magnitude of the connection weight, and the shade of the h e indicates the direction of the interaction between neurons; bIack connections are positive (excitator) and gray connections are negative (inhi'bitor). Solid fines represent connection weights statisticaliy different fkom zero (a = 0.05 ), whereas dashed lines repment non-signincant connection weights. BIack input neurons indicate habitat variables that have an overall positive influence on species abundance, and gray input neurons indicate an overall negative Muence on species abundance.

Fig. 3.5). The amount of wood cover and depth contribute positively to the predicted yellow

perch abundance, whereas vegetation density contributes negatively (Fig. 3.6). Similarly,

interactions among habitat variables for pumpkinseed abundance were common. The

positive influence of wood cover and litter on predicted abundance weakens with increasing

site exposure and depth (hidden neuron A; Fig. 3.5). Accounting for all connection weights,

increasing amounts of cover and litter and decreasing depth predict greater abundance (Fig.

3.6). Predicted golden shiner abundance is negatively correlated with amount of wood cover,

but this relationship weakens with increasing depth (hidden neuron C; Fig. 3.5). Overall,

golden shiner abundance exhibits a positive association with vegetation density, and a

negative association with wood cover and sampling month (Fig. 3.6). Predicted creek chub

abundance is negatively associated with the presence of leaf litter and substrate type;

however, this association diminishes with increasing depth (hidden neuron C; Fig. 3.5). Vegetation density has a positive influence, whereas leaf litter and depth negatively influence

predicted abundance (Fig. 3.6).
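The two accuracy measures reported for the abundance models above (the correlation coefficient r and RMSE) can be computed as in the following minimal sketch. The function names and the small example vectors are hypothetical and are not taken from the study data; they only illustrate how the two statistics are defined.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted abundances."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def rmse(observed, predicted):
    """Root-mean-square error of prediction."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Hypothetical abundances for five sites (illustration only).
observed = [0.0, 0.3, 0.6, 0.9, 1.2]
predicted = [0.1, 0.2, 0.7, 0.8, 1.3]
print(round(pearson_r(observed, predicted), 3), round(rmse(observed, predicted), 3))
```

A high r with a low RMSE indicates that predictions are both well ordered and close in magnitude to the observed values; either statistic alone can be misleading.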

The Madawaska lake models can be transferred readily to the Oxtongue drainage

lakes, with rates of correct classification, sensitivity and specificity being very similar for

both drainages (Table 3.2). Due to differences in the frequency of occurrence of the species

between the two drainages, only 4 out of the 9 species-habitat models differ significantly

from random at the 5% level, although 6 out of 9 are significant at P<0.063. Most notably,

common shiner, lake trout and smallmouth bass are highly predictable in both the

Madawaska and Oxtongue drainages. It is also apparent that for many species the optimal

decision threshold for classifying a species as present or absent deviates substantially from

0.5, but typically falls in the 0.4-0.6 range.

In addition to using the results from the randomization test to interpret variable

contributions, I used the results as a variable selection method for removing input and hidden

neurons whose incoming or outgoing connection weights were not significantly different

from random. The predictive performance of these "pruned" networks was then re-tested

and I found that the predictability of both species occurrence and abundance was generally

unaffected by the removal of non-significant neurons in the network (Table 3.3). For

[Figure 3.6 panels: Yellow Perch, Pumpkinseed, Golden Shiner, Creek Chub; x-axis: habitat variables.]

Figure 3.6. Relative importance (% of total contribution) of within-lake variables in

predicting species abundance based on the sum of connection weights joining an input

neuron and the output neuron.

Table 3.3. Comparison of model predictions between full and pruned networks with

input variables and hidden neurons removed that were not statistically significant (based on

randomization test results). Pruned network design is reported after species names,

where the three values represent the number of input, hidden and output neurons,

respectively. The reported values are per cent correct classification (CC), sensitivity

(SN), specificity (SP) for predicting species presence-absence (based on the optimal

decision threshold from ROC analysis), and correlation coefficient (r) between predicted

and actual abundances and root-mean-square-of-error of prediction (RMSE).

Species                       Full network        Pruned network

Presence-absence              CC   SN   SP        CC   SN   SP

Common Shiner (5-3-1)

Lake Trout (4-2-1)

N. Redbelly Dace (7-4-1)

Smallmouth Bass (5-4-1)

Abundance                     r    RMSE           r    RMSE

Creek Chub (6-3-1)

Golden Shiner (5-3-1)

Pumpkinseed (6-2-1)

Yellow Perch (6-4-1)

example, after pruning the network, the occurrence of lake trout was highly predictable based

on the reduced set of variables.
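The pruning step described above can be illustrated with a minimal sketch. It assumes that, for each input variable, an observed connection-weight statistic is compared against values obtained from networks trained on randomized responses; the two-tailed p-value formula, the function names and the example numbers are all assumptions made for illustration and are not the exact procedure or data of this study.

```python
def randomization_p_value(observed, null_values):
    """Two-tailed p-value: share of randomized values at least as extreme
    as the observed one (the observed value is counted once)."""
    extreme = sum(1 for v in null_values if abs(v) >= abs(observed))
    return (extreme + 1) / (len(null_values) + 1)

def prune_inputs(observed_by_input, null_by_input, alpha=0.05):
    """Keep only inputs whose statistic differs significantly from random."""
    return [name for name, observed in observed_by_input.items()
            if randomization_p_value(observed, null_by_input[name]) < alpha]

# Hypothetical null distribution: 99 randomized values between -0.9 and 0.9.
null_values = [((-1) ** i) * (i % 10) / 10 for i in range(99)]
observed_by_input = {"depth": 2.5, "litter": 0.35}
null_by_input = {"depth": null_values, "litter": null_values}

print(prune_inputs(observed_by_input, null_by_input))  # → ['depth']
```

In this toy case the hypothetical "depth" statistic lies outside every randomized value (p = 0.01) and is retained, whereas "litter" (p = 0.60) would be removed before the network is re-tested.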

DISCUSSION

Modeling fish-habitat associations using ANNs

Artificial neural networks have a number of advantages over traditional modeling

approaches that make them potentially beneficial for modeling fisheries data. ANNs are

capable of modeling non-linear associations with a variety of data types (e.g., continuous,

discrete), require no specific assumptions concerning the distributional characteristics of the

independent variables, and can accommodate interactions among independent variables

without any a priori specification (Ripley 1996). ANNs can approximate any continuous

function (Cybenko 1989; Funahashi 1989; Hornik et al. 1989), and thus exhibit flexibility

for modeling non-linear relationships between variables. For these reasons, the application

of ANNs for pattern recognition and prediction has been advocated by researchers in a large

number of disciplines, and has been shown in many ecological studies to exhibit superior

predictive capabilities compared to traditional approaches (Lek and Guégan 1999). It is

important to stress that where the underlying data structure and assumptions are met for a

particular traditional statistical technique, there is no reason to believe that major differences

between traditional approaches and ANNs should exist. However, given that ecological data

are commonly non-linear in nature, that differences may arise due to choice of transformations,

and that achieving linearity is often not possible (e.g., Lek et al. 1996; Guégan et al. 1998; Walley

and Fontama 1998), ANNs are an attractive alternative. Indeed, the results from my study

show that ANNs can provide a powerful quantitative approach for modeling fish-habitat

relationships. I showed that species presence or absence was predictable from whole-lake

measures of habitat, which is consistent with many studies of temperate fish populations

(Jackson and Harvey 1989; Tonn et al. 1990; Magnuson et al. 1998). Species such as

smallmouth bass and lake trout were predicted with high accuracy, which is an especially

attractive result given the economic and societal importance of these sport fishes. Similarly,

ANNs provided accurate predictions of species abundance based on micro-habitat

characteristics within a lake.

One practical disadvantage of ANNs stressed by many researchers has been the lack

of explanatory insight provided by ANNs compared to traditional approaches. However, my

study shows that the contribution of the independent variables in the neural network can be

quantified by direct evaluation of the connection weights. This examination is further aided

by using a randomization approach to remove non-significant weights that do not contribute

to the network prediction, thus assisting in the interpretation of direct and interacting effects

of the variables in the network, and simplifying the network structure. For instance, overall

lake size (i.e., area, maximum depth and shoreline perimeter) and TDS (a surrogate for

productivity) were identified as positively influencing the probability of smallmouth bass and

lake trout occurrence. Lake area and maximum depth are known to influence the occurrence

of these species (e.g., Eadie and Keast 1984; Jackson and Harvey 1989) since they alter the

mixing characteristics and hence the thermal regime of lakes. Furthermore, lake area and

depth serve as an indirect measure of the diversity of habitats available in lakes, which may

be important to support the small-bodied forage fish upon which smallmouth bass and lake

trout feed. Presence of a littoral-zone predator had a strong negative effect on the probability

of northern redbelly dace occurrence, but had minimal effect on common shiner occurrence.

This result is consistent with studies that suggest that the abundance and distributions of

northern redbelly dace are greatly affected by the presence of a littoral predator (Findlay et

al. 2000), whereas common shiner appears to be more resistant to predation (Chapleau et al.

1997; Whittier et al. 1997). Interestingly, the negative relationship between northern

redbelly dace and presence of a predator weakens substantially with increasing shoreline

perimeter. As shoreline perimeter increases for a given lake area, the shoreline becomes

more convoluted, thus creating a greater potential for the existence of protected embayments and

patchy nearshore habitats providing increased habitat heterogeneity and potential refuge from

predation. In addition, the negative influence of the presence of a predator weakens with

increasing lake elevation. Since my results show that smallmouth bass (the primary predator

in the study lakes) are more likely to be found in lower elevation lakes, the expected negative

relationship between elevation and probability of northern redbelly dace occurrence might

have been disrupted. This supports the finding of Chapleau et al. (1997) who found that

lakes without piscivores exhibited a negative relationship between lake elevation and small-

bodied species richness, but lakes with piscivores exhibited no such relationship.
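The direct evaluation of connection weights discussed at the start of this paragraph can be sketched as follows. The sketch assumes a single-hidden-layer network and takes the contribution of an input to be the sum, over hidden neurons, of the product of the input-hidden and hidden-output weights; relative importance is then each contribution's share of the total absolute contribution. The function names and weight values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def input_contributions(w_input_hidden, w_hidden_output):
    """Signed contribution of each input to the output.

    w_input_hidden[i][j] is the weight from input i to hidden neuron j;
    w_hidden_output[j] is the weight from hidden neuron j to the output.
    """
    return [sum(w_ij * w_jo for w_ij, w_jo in zip(row, w_hidden_output))
            for row in w_input_hidden]

def relative_importance(contributions):
    """Per cent of total absolute contribution attributed to each input."""
    total = sum(abs(c) for c in contributions)
    return [100.0 * abs(c) / total for c in contributions]

# Hypothetical weights: three inputs, two hidden neurons, one output.
w_input_hidden = [[1.0, -0.5],
                  [0.5, 0.5],
                  [-2.0, 1.0]]
w_hidden_output = [2.0, 1.0]

contributions = input_contributions(w_input_hidden, w_hidden_output)
print(contributions)                       # → [1.5, 1.5, -3.0]
print(relative_importance(contributions))  # → [25.0, 25.0, 50.0]
```

The sign of a contribution gives the overall direction of an input's influence, while the individual products show how that influence is routed through each hidden neuron, which is how interactions such as a predator effect weakening with shoreline perimeter can be read from the network.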

A number of micro-habitat factors were related to increased species abundance.

Greater abundances of yellow perch and pumpkinseed were predicted for sites with large

amounts of cover (in terms of coarse woody material) and low densities of vegetation. The

opposite was true for golden shiner and creek chub, which were found in greater abundance

in more vegetated sites. Habitat cover was generally more important in the models for creek

chub than those for golden shiner, supporting the view that creek chub may be less tolerant of

habitat modifications (Whittier and Hughes 1998). Although the form of preferred cover

differs among species, these results strengthen the notion that predicted abundance is greater

in areas with greater habitat cover (Bryan and Scarnecchia 1992; Moring and Nicholson

1994; Christensen et al. 1996). Occupancy of complex habitats by golden shiner and creek

chub supports the idea that these habitats provide profitable foraging areas (e.g., Werner et al.

1983; Diehl and Eklov 1995), rather than simply providing shelter from predation, since

Crosson Lake lacks large piscivorous fish. Depth also played an important role in predicted

species abundance. Yellow perch and golden shiner are predicted to occur in greater

abundances at 1.5 m than at 0.5 m depths, whereas pumpkinseed and creek chub were more

numerous in shallower habitats closer to shore. Therefore, spatial occupancy of these species

appears to be divided into 4 components depending on the type of cover (i.e., vegetation or

coarse woody material) and depth. These species-habitat associations were often influenced

by the degree of site exposure. For example, the importance of depth and cover for

predictions of yellow perch and pumpkinseed weakens with increasing site exposure.

Finally, sampling month appeared to be important only for golden shiner abundance, which

decreased from the July to the August sampling period. In summary, the ANNs provided a

powerful technique for uncovering interactions among habitat characteristics of lakes, and

examining their influence on species occurrence and abundance.

Fish-habitat models as a management tool

The development of models for predicting the distribution and abundance of fish

populations is of great importance given that the demand for development of lake shorelines

continues to increase, along with the associated impact on fish populations. My study shows that

ANNs can provide accurate predictions regarding the abundance and occurrence of fish

species based on within- and whole-lake habitat characteristics. Predictions about the effects

of littoral-zone alteration on fish abundance could be a valuable tool for lake managers

deciding whether proposed shoreline modifications should be allowed in a system, or

alternatively, where in a lake modifications should occur in order to minimize their impact on

the fish community. Cottage owners often remove both macrophytes and woody material

from their shorelines to enhance the "cosmetic" appearance of their property and minimize

boating problems. Developed lakes with shoreline residences have substantially lower

densities of coarse-woody material than less developed lakes (Christensen et al. 1996), which

can result in negative impacts on species composition and abundance of fish (e.g., Poe et al.

1986; Everett and Ruiz 1993). Fish-habitat models may be particularly useful for predicting

the cumulative effects of small-scale habitat modifications on fish abundance and spatial

occupancy. Some researchers have argued that modeling the effects of small incremental

habitat change may be impractical due to difficulties in identifying and interpreting the

effects of multiple modifications on fish populations (e.g., Jennings et al. 1999). Others have

argued that modeling such relationships is not possible due to the lack of detailed data (Panek

1979) and powerful quantitative techniques (Burns 1991). I believe that developing

statistical methodologies helps to offset these difficulties. There currently exist detailed data

describing within-lake habitat characteristics and fish use for many systems. However, the

problem is that the most common approaches for analyzing and summarizing such data

involve simple, descriptive statistics (Bain et al. 1999). Therefore, better use of available

data and more flexible, powerful statistical methods, such as artificial neural networks, may

enable managers to predict the effects of small-scale habitat modifications on fish

populations.

I have shown that whole-lake habitat attributes, many of which are mappable, can

successfully predict fish occurrence (e.g., lake trout). The development of such models has

important implications for prioritizing surveys and monitoring programs of fish populations

since limits to resources preclude extensive sampling of aquatic habitats. Model predictions

can also be used as first-order estimates of habitat suitability, which could then be followed

by ground truthing and field validation, in order to predict sites with available spawning

habitat (e.g., Knapp and Preisler 1999) or to establish potential locations for species

reintroduction. Similarly, models can be used to predict the likelihood of local establishment

and spread of exotic species, which may help set conservation priorities for preserving

vulnerable species and populations that might be lost locally (e.g., Hrabik and Magnuson

1999).

Developing more powerful fish-habitat models

The predictive abilities of conventional models for species presence-absence are

commonly assessed from overall classification rates alone. In my study I show that by

partitioning the predictive performance of the models into measures such as sensitivity and

specificity, I can assess more readily the strengths and weaknesses of the models. For

example, the presence of creek chub, pumpkinseed, and yellow perch could be predicted with

a high degree of certainty (greater than 90% of the lakes), yet predicting the absence of these

species was more difficult. It is also evident that model sensitivity increases and specificity

decreases with increasing frequency of species occurrence in the lakes. This relationship is

expected, yet is seldom considered in distribution modeling (but see Fielding and Bell 1997;

Manel et al. 1999). There are a number of practical implications for the relationship between

prediction success and species frequency of occurrence. First, a decrease in model sensitivity

for rare species implies that it will be more difficult to predict the occurrence of organisms

whose conservation and management is the most critical. Consequently, our ability to

identify suitable locations for species reintroductions could be limited. Second, drawing

inferences from observed absences of species from sites containing suitable habitat

conditions (e.g., indirect evidence for dispersal, predation, competition) could be limited if the

models exhibit poor specificity. Examining alternative measures of prediction success can

provide more accurate comparisons of different modeling approaches (e.g., Manel et al.

1999) and different models (i.e., different subsets of variables). For instance, I found that

although the overall correct classification rates for some species were similar, levels of

specificity and sensitivity were often quite different. Also, correct classification rates did not

change between the full and pruned neural networks, but both sensitivity and specificity did.
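The partitioning of prediction success described in this paragraph can be illustrated with a minimal sketch. The function name and the two sets of predictions are hypothetical; they are constructed so that both models have the same overall correct classification rate but differ in sensitivity and specificity, which is the situation described above.

```python
def classification_metrics(observed, predicted):
    """Per cent correct classification (CC), sensitivity (SN) and
    specificity (SP) for presence (1) / absence (0) predictions."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presences predicted present
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presences predicted absent
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absences predicted absent
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # absences predicted present
    return {
        "CC": 100.0 * (tp + tn) / len(pairs),
        "SN": 100.0 * tp / (tp + fn),
        "SP": 100.0 * tn / (tn + fp),
    }

# Hypothetical species present in 8 of 10 lakes.
observed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
model_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # always predicts presence
model_b = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1]

print(classification_metrics(observed, model_a))  # → {'CC': 80.0, 'SN': 100.0, 'SP': 0.0}
print(classification_metrics(observed, model_b))  # → {'CC': 80.0, 'SN': 87.5, 'SP': 50.0}
```

Judged on CC alone the two hypothetical models are indistinguishable, yet the first never predicts an absence correctly.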

The effect of species prevalence in model development is unavoidable as it is

expected that given increased frequency of occurrence, there is a greater probability of

predicting the species to be present. However, varying the decision threshold probability for

which the model predicts presence or absence, rather than following the conventional

arbitrary threshold of 0.5, can compensate for this bias (Fielding 1999) and result in more

powerful models (e.g., Carroll et al. 1999; Manel et al. 1999). Determining the optimal

decision threshold involves constructing Receiver Operating Characteristic (ROC) plots and

then choosing the threshold that maximizes sensitivity and specificity given particular

misclassification costs. This technique has been applied widely to clinical problems in

medicine; however, few ecological studies have employed ROC analysis. In this study I

defined equal costs of false presence (misclassifying a species as present) and false absence

(misclassifying a species as absent); however, in practice, it may be advantageous to assign

more appropriate costs to misclassifications if such information is available. Although

assigning costs is a complex and potentially subjective process, much can be gained. For

example, I might tolerate more false presences for endangered species, and thus adjust the

decision threshold accordingly to develop a more powerful predictive model.
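The threshold selection described above can be sketched as a simple search over candidate thresholds. The sketch assumes that the optimal threshold is the one minimizing the cost-weighted sum of the false-absence rate (1 - sensitivity) and the false-presence rate (1 - specificity); with equal costs this is equivalent to maximizing sensitivity plus specificity. The function name, the grid of candidate thresholds and the example probabilities are hypothetical and do not reproduce the ROC software or data used in this study.

```python
def optimal_threshold(observed, probabilities,
                      cost_false_presence=1.0, cost_false_absence=1.0):
    """Return the candidate threshold with the lowest cost-weighted error.

    Assumes `observed` holds at least one presence (1) and one absence (0).
    """
    n_present = sum(observed)
    n_absent = len(observed) - n_present
    best_threshold, best_cost = None, float("inf")
    for step in range(1, 100):                  # candidate thresholds 0.01 .. 0.99
        threshold = step / 100
        false_absences = sum(1 for o, p in zip(observed, probabilities)
                             if o == 1 and p < threshold)
        false_presences = sum(1 for o, p in zip(observed, probabilities)
                              if o == 0 and p >= threshold)
        cost = (cost_false_absence * false_absences / n_present
                + cost_false_presence * false_presences / n_absent)
        if cost < best_cost:                    # ties keep the lowest threshold
            best_threshold, best_cost = threshold, cost
    return best_threshold

# Hypothetical rare species: present in 3 of 10 lakes.
observed = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1]
probabilities = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60]

print(optimal_threshold(observed, probabilities))                            # → 0.41
print(optimal_threshold(observed, probabilities, cost_false_presence=0.25))  # → 0.16
```

In this toy example, making false presences four times cheaper than false absences lowers the selected threshold from 0.41 to 0.16, so that every lake where the hypothetical species occurs is predicted as present at the price of more false presences.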

Finally, one important concern is that many models lack geographical transferability

(i.e., poor model performance outside the original data used to develop the model) since

species-environment associations can differ substantially in different systems (Shirvell 1989).

Nevertheless, models may be useful when applied at the scale at which they were developed

and in systems where similar species-environment associations exist. I show that testing

models in adjacent drainages demonstrates the generality of the fish-habitat models. Models

built using lakes in the Madawaska River drainage not only performed well for the same set

of lakes, but actually performed slightly better, on average, for predicting species occurrence

in the Oxtongue River drainage. Although it might be surprising that correct classification

rates were slightly higher in the Oxtongue lakes (i.e., test data) compared to the Madawaska

lakes (i.e., training data), the fact that model sensitivity was on average high and that species

modeled were more prevalent in the Oxtongue lakes than in the Madawaska drainage can

account for this result. Consequently, the effect of species prevalence on geographic

transferability of fish-habitat models needs to be considered.

CONCLUSION

ANNs have wide applicability to the study of ecological relationships, both as

exploratory and predictive tools. ANNs provide a flexible approach that can accommodate a

wide variety of study designs without the statistical constraints of independence and linearity,

and they require no a priori understanding of system relationships. Consequently, they are

useful techniques for relating the distributions and abundances of fish populations to their

physical environment. Given the obvious importance of establishing linkages between

habitat features, fish distributions, and their utilization of nearshore habitats, the development

and testing of fish-habitat models are important steps in the conservation and management of

lake fish populations. Such predictive models can advance management efforts to

understand fish-habitat associations and predict the effects of natural and anthropogenic-

related habitat modification on freshwater fish populations.

ACKNOWLEDGMENTS

I thank Dr. Sovan Lek for conversations regarding the finer details of ANNs. Funding for this

research was provided by a Graduate Scholarship from the Natural Sciences and Engineering

Research Council of Canada (NSERC) and University of Toronto scholarships to J.D. Olden,

and an NSERC Research Grant to D.A. Jackson.

REFERENCES

Aoki, I., and T. Komatsu. 1997. Analysis and prediction of the fluctuations of sardine

abundance using a neural network. Oceanologica Acta 20:81-88.

Bain, M. B., T. C. Hughes, and K. K. Arend. 1999. Trends in methods for assessing

freshwater habitats. Fisheries (Bethesda) 24:16-21.

Beauchamp, D. A., E. R. Byron, and W. A. Wurtsbaugh. 1994. Summer habitat use by

littoral-zone fishes in Lake Tahoe and the effects of shoreline structures. North American

Journal of Fisheries Management 14:385-394.

Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.

Brosse, S., J. F. Guégan, J. N. Tourenq, and S. Lek. 1999. The use of neural networks to

assess fish abundance and spatial occupancy in the littoral zone of a mesotrophic lake.

Ecological Modelling 120:299-311.

Bryan, M. D., and D. L. Scarnecchia. 1992. Species richness, composition, and abundance of

fish larvae and juveniles inhabiting natural and developed shorelines of a glacial Iowa

lake. Environmental Biology of Fishes 35:329-341.

Burns, D. C. 1991. Cumulative effects of small modifications to habitat. Fisheries (Bethesda)

16:12-17.

Carroll, C., W. J. Zielinski, and R. F. Noss. 1999. Using presence-absence data to build and

test spatial habitat models for the Fisher in the Klamath region, U.S.A. Conservation

Biology 13:1344-1359.

Chapleau, F., C. S. Findlay, and E. Szenasy. 1997. Impact of piscivorous fish introductions

on fish species richness of small lakes in Gatineau Park, Quebec. Écoscience 4:259-268.

Chen, D. G., and D. M. Ware. 1999. A neural network model for forecasting fish stock

recruitment. Canadian Journal of Fisheries and Aquatic Sciences 56:2385-2396.

Christensen, D. L., B. R. Herwig, D. E. Schindler, and S. R. Carpenter. 1996. Impacts of

lakeshore residential development on coarse woody debris in north temperate lakes.

Ecological Applications 6:1143-1149.

Colasanti, R. L. 1991. Discussions of the possible use of neural network algorithm in

ecological modelling. Binary 3:13-15.

Crossman, E. J., and N. E. Mandrak. 1991. An analysis of fish distribution and community

structure in Algonquin Park: annual report for 1991 and completion report, 1989-1991.

Ontario Ministry of Natural Resources, Toronto, Ontario, Canada.

Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Mathematics

of Control, Signals, and Systems 2:303-314.

Dodge, D. P., G. A. Goodchild, I. MacRitchie, J. C. Tilt, and D. G. Waldriff. 1985. Manual

of instructions: aquatic habitat inventory surveys. Ontario Ministry of Natural Resources,

Fisheries Branch, Toronto, Ontario, Canada.

Diehl, S., and P. Eklov. 1995. Piscivore-mediated habitat use in fish: effects on invertebrate

resources, diet, and growth of perch, Perca fluviatilis. Ecology 76:1712-1726.

Dimopoulos, Y., P. Bourret, and S. Lek. 1995. Use of some sensitivity criteria for choosing

networks with good generalization. Neural Processing Letters 2:1-4.

Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.

Canadian Journal of Zoology 62:1689-1695.

Edwards, M., and D. R. Morse. 1995. The potential for computer-aided identification in

biodiversity research. Trends in Ecology and Evolution 10:153-158.

Everett, R. A., and G. M. Ruiz. 1993. Coarse woody debris as a refuge from predation in

aquatic communities: An experimental test. Oecologia 93:475-486.

Fielding, A. H., and J. F. Bell. 1997. A review of methods for the assessment of prediction

errors in conservation presence/absence models. Environmental Conservation 24:38-49.

Fielding, A. H. 1999. Application of machine learning techniques to ecological problems.

Kluwer Academic Publishers, Norwell, MA.

Findlay, C. S., D. G. Bert and L. Zheng. 2000. Effect of introduced piscivores on native

minnow communities in Adirondack lakes. Canadian Journal of Fisheries and Aquatic

Sciences 57:570-580.

Funahashi, K. 1989. On the approximate realization of continuous mapping by neural

networks. Neural Networks 2:183-192.

Garson, G. D. 1991. Interpreting neural-network connection weights. Artificial Intelligence

Expert 6:47-51.

Geman, S., E. Bienenstock, and R. Doursat. 1992. Neural networks and the bias/variance

dilemma. Neural Computation 4:1-58.

Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity

predict global riverine fish diversity. Nature 391:382-384.

Harig, A. L., and M. B. Bain. 1998. Defining and restoring biological integrity in wilderness

lakes. Ecological Applications 8:71-87.

Hinch, S. G., N. C. Collins, and H. H. Harvey. 1991. Relative abundance of littoral zone

fishes: Biotic interactions, abiotic factors, and postglacial colonization. Ecology

72:1314-1324.

Hornik, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are

universal approximators. Neural Networks 2:359-366.

Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt

(Osmerus mordax) in a northern Wisconsin lake district and implications for

management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.

Hughes, R. M., and R. F. Noss. 1992. Biological diversity and biological integrity: current

concerns for lakes and streams. Fisheries (Bethesda) 17:11-19.


Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:

local versus regional processes. Ecology 70:1472-1484.

James, F. C., and C. E. McCulloch. 1990. Multivariate analysis in ecology and systematics:

panacea or Pandora's box? Annual Reviews in Ecology and Systematics 21:129-166.

Jennings, M. J., K. Johnson, and M. Staggs. 1996. Shoreline protection study: a report to the

Wisconsin state legislature. Wisconsin Department of Natural Resources, Publication

PUBL-RS-921-96, Madison.

Jennings, M. J., M. A. Bozek, G. R. Hatzenbeler, E. E. Emmons, and M. D. Staggs. 1999.

Cumulative effects of incremental shoreline habitat modification on fish assemblages in

north temperate lakes. North American Journal of Fisheries Management 19:18-27.

Knapp, R. A., and H. K. Preisler. 1999. Is it possible to predict habitat use by spawning

salmonids? A test using California golden trout (Oncorhynchus mykiss aguabonita).

Canadian Journal of Fisheries and Aquatic Sciences 56:1576-1584.

Kůrková, V. 1992. Kolmogorov's theorem and multilayer neural networks. Neural Networks

5:501-506.

Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996.

Application of neural networks to modelling nonlinear relationships in ecology.

Ecological Modelling 90:39-52.

Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling, an

introduction. Ecological Modelling 120:65-73.

Lester, N. P., W. I. Dunlop, and C. C. Willox. 1996. Detecting changes in the nearshore fish

community. Canadian Journal of Fisheries and Aquatic Sciences 53 (Suppl. 1):391-402.

Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.

Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology

79:2941-2956.

Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999. Alternative methods for

predicting species distribution: an illustration with Himalayan river birds. Journal of

Applied Ecology 36:734-747.

Mastrorillo, S., S. Lek, F. Dauba, and A. Belaud. 1997. The use of artificial neural networks

to predict the presence of small-bodied fish in a river. Freshwater Biology 38:237-246.

Matuszek, J. E., and G. L. Beggs. 1988. Fish species richness in relation to lake area, pH, and

other abiotic factors in Ontario lakes. Canadian Journal of Fisheries and Aquatic Sciences

45:1931-1941.

Metz, C. E. 1978. Basic principles of ROC analysis. Seminars in Nuclear Medicine 8:283-298.

Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of

the American Fisheries Society 118:533-545.

Moring, J. R., and P. H. Nicholson. 1994. Evaluation of three types of artificial habitats for

fishes in a freshwater pond in Maine, USA. Bulletin of Marine Science 55:1149-1159.

Olden, J. D., and D. A. Jackson. 2000. Torturing data for the sake of generality: How valid

are our regression models? Écoscience 7 (in press).

Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat

modelling with interspecific interaction. Ecological Modelling 116:15-31.

Panek, F. M. 1979. Cumulative effects of small modifications to habitat. Fisheries (Bethesda)

4:54-57.

Poe, T. P., C. O. Hatcher, C. L. Brown, and S. W. Schloesser. 1986. Comparison of species

composition and richness of fish assemblages in altered and unaltered littoral habitats.

Journal of Freshwater Biology 3:525-536.

Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by

back-propagating errors. Nature 323:533-536.

Shirvell, C. S. 1989. Habitat models and their predictive capability to infer habitat effects on

stock size. Pages 173-179 in C. D. Levings, L. B. Holtby, and M. A. Henderson,

editors. Proceedings of the National Workshop on the Effects of Habitat Alteration on

Salmonid Stocks. Vol. 105. Special Publication in Canadian Journal of Fisheries and

Aquatic Sciences, May 6-8, Nanaimo, B.C.

Titus, K., J. A. Mosher, and B. K. Williams. 1984. Chance-corrected classification for use in

discriminant analysis: Ecological applications. American Midland Naturalist 111:1-7.

Tonn, W. M., J. J. Magnuson, M. Rask and J. Toivonen. 1990. Intercontinental comparison

of small-lake fish assemblages: The balance between local and regional processes.

American Naturalist 136:345-375.

Walley, W. J., and V. N. Fontama. 1998. Neural network predictors of average score per

taxon and number of families at unpolluted sites in Great Britain. Water Research

32:613-622.

Werner, E. E., G. G. Mittelbach, D. J. Hall, and J. F. Gilliam. 1983. Experimental tests of

optimal habitat use in fish: the role of relative habitat profitability. Ecology 64:1525-1539.

Whittier, T. R., D. B. Halliwell, and S. G. Paulsen. 1997. Cyprinid distributions in Northeast

U.S.A. lakes: evidence of regional-scale minnow biodiversity losses. Canadian Journal of

Fisheries and Aquatic Sciences 54:1593-1607.

Whittier, T. R., and R. M. Hughes. 1998. Evaluation of fish species tolerances to

environmental stressors in lakes in the northeastern United States. North American

Journal of Fisheries Management 18:236-252.

My thesis contributes significantly to both the biological and methodological realms

of population and community ecology.

By developing ernpirical models for predicting fish species occurrence, abundance,

richness and community composition using both fine- and coarse-scale measures of lake

habitat, I have illustrated that predictive models can provide guidance for the direction of

future research and aid in the conservation and management of fishery resources.

Developing models that recognize the value of both species and communities may result in

more effective conservation of aquatic biodiversity and emphasizes the protection of key

local- and regional-scale processes.

By describing alternative approaches, i.e., tree-based and neural network, for

modeling ecological data and providing a detailed comparison of these approaches to

conventional methods, i.e., logistic regression and discriminant analysis, I have provided an

important comparison between linear and nonlinear techniques for modeling species

occurrence data. Such comparisons will become increasingly important as statistical

techniques grow in number and evolve in future years. In addition, by developing a

complementary tool for quantifying variable contributions in neural networks, I have

eliminated the primary shortcoming of neural networks, thus making this approach both a

powerful explanatory as well as predictive tool for modeling ecological data.

"Any science may be likened to a river. It has its obscure and unpretentious beginning; its quiet stretches as well as its rapids; its periods of drought as well as fullness. It gathers momentum with the work of many investigators and as it is fed by other streams of thought; it is deepened and broadened by the concepts and generalizations that are gradually evolved."

- Carl P. Swanson

Appendix A

List of 286 study lakes of Algonquin Provincial Park used in Chapter 1 (included are latitude and longitude coordinates).

Lake Name                         Latitude   Longitude
AIRY LAKE
ALLAN LAKE
ALLURING LAKE
ALSEVER LAKE
AMIKEUS LAKE
ANIMOOSH LAKE
AUBREY LAKE
BAB LAKE
BAILEY LAKE
BAND LAKE
BARRON LAKE
BASIN LAKE
BEAVERLY LAKE
BERM LAKE
BIG PORCUPINE LAKE
BIG RED LAKE (RED PINE)
BIG ROCK LAKE
BIG TROUT LAKE
BIGGAR LAKE
BILLINGS LAKE
BILLS LAKE (NL)
BILLY LAKE
BIRCHCLIFFE LAKE
BLUE LAKE
BLUFF LAKE (NL)
BO6 LAKE (NL)
BONFIELD LAKE
BONNECHERE LAKE
BOOT LAKE
BOOTH LAKE
BORDER LAKE (NL)
BRANCH LAKE
BREWER LAKE
BRIDLE LAKE
BRUCE LAKE
BRULE LAKE                        45°38'

BUD LAKE
BURNT ISLAND LAKE
BURNTROOT LAKE
BUTT LAKE
BYERS LAKE
CACHE LAKE
CALUMET LAKE
CANISBAY LAKE
CANOE LAKE
CARCAJOU LAKE
CARL WILSON LAKE
CASTALIA LAKE (NL)
CAT LAKE
CATFISH LAKE
CAUCHON LAKE
CAULIFLOWER LAKE (CLYDAWADKA)
CEDAR LAKE
CHARLES LAKE
CHEWINK LAKE
CHICKAREE LAKE
CLARA LAKE
CLARKE LAKE
CLEMOW LAKE
CLOUD LAKE
CLOVER LAKE
CLUB LAKE
CLYDEGALE LAKE
COLDSPRING LAKE
COON LAKE
COOT LAKE
COSTELLO LAKE
CRADLE LAKE
CRAIG LAKE
CRANEBILL LAKE
CROTCH LAKE
CUCKOO LAKE
DAISY LAKE
DAVID LAKE
DELANO LAKE
DICKSON LAKE
DOVE LAKE
DUCKPOND LAKE
ERABLES LAKE
FARM BAY LAKE
FARM LAKE
FARNCOMB LAKE
FASSETT LAKE                      46°01'

FAVOVIER LAKE
FLORENCE LAKE
FOOLS LAKE
FORK LAKE
FOUND LAKE
FOYS LAKE
FRANCIS LAKE
FRANK LAKE
FRASER LAKE
GALEAIRY LAKE
GEM LAKE
GIBSON LAKE
GILMOUR LAKE
GOUINLOCK LAKE
GRAND LAKE
GRAPE LAKE
GREENLEAF LAKE
HAILSTORM LAKE
HAMBONE LAKE
HAPPY ISLE LAKE
HARRY LAKE
HIGHFALLS LAKE
HILLIARD LAKE
HIRAM LAKE
HOGAN LAKE
IGNACE LAKE
IRIS LAKE
ISLET LAKE
JAKE LAKE (MARGARET)
JOE LAKE
JOHNSTON LAKE
KAKASAMIC LAKE
KAWA LAKE
KEARNEY LAKE
KENNEDY LAKE
KINGSCOTE LAKE
KIOSHKOKWI LAKE
KIRKWOOD LAKE
KITTY LAKE
LAKE LA MUIR
LAKE LAVIEILLE
LAKE LOUISA
LAKE OF TWO RIVERS
LAKE TRAVERSE
LAUREL LAKE (LAURIE)
LAWRENCE LAKE
LENGTH LAKE                       45°52'     77°40'

LILYPOND LAKE
LINDA LAKE
LITTLE BILLINGS LAKE
LITTLE CAUCHON LAKE
LITTLE CAULIFLOWER LAKE
LITTLE COON LAKE
LITTLE CROOKED LAKE
LITTLE CROW LAKE
LITTLE DICKSON LAKE
LITTLE HAY LAKE
LITTLE ISLAND LAKE
LITTLE JOE LAKE
LITTLE MCCAULAY LAKE
LITTLE MINK LAKE
LITTLE MINNOW LAKE
LITTLE OTTERSLIDE LAKE
LITTLE ROCK LAKE
LITTLE TROUT LAKE
LITTLEDOE LAKE
LONGBOW LAKE
LONGER LAKE
LOONTAIL LAKE
LORNE LAKE
LOST DOG LAKE (NAMEGOS)
LOWER MINNOW LAKE
LOXLEY LAKE
LUCKLESS LAKE
LUPUS LAKE
LYNX LAKE
MAPLE LAKE
MARCH HARE LAKE
MARGARET LAKE (NL)
MARIE LAKE
MATHEWS LAKE
MATTOWACKA LAKE
MCCRANEY LAKE
MCGARVEY LAKE
MCINTOSH LAKE
MCKASKILL LAKE
MERCHANT LAKE
MERGANSER LAKE
MEW LAKE
MILDRED LAKE
MINK LAKE
MlSTl LAKE
MOCCASIN LAKE
MOLE LAKE                         45°37'

MOUSE LAKE
MUBWAYAKA LAKE
MUDVILLE LAKE
MYRA LAKE (NL)
NAHMA LAKE
NEPAWIN LAKE
NORTH BRANCH LAKE
NORTH DEPOT LAKE
NORTH GRACE LAKE
NORTH RIVER LAKE
NORTH ROUGE LAKE
NORTH SYLVIA LAKE
NORTH TEA LAKE
NORWAY LAKE
O'NEILL LAKE
OPEONGO LAKE
ORAM LAKE
OTTERSLIDE LAKE
OUSE LAKE
OWAISSA LAKE
OWL LAKE
PARK LAKE (LONG)
PECK LAKE
PERLEY LAKE
PHILIP LAKE
PHIPPS LAKE
PINETREE LAKE
PISHNECKA LAKE
POG LAKE
POND LAKE
POTTER LAKE
PRETTY LAKE
PROTTLER LAKE
PROULX LAKE
PROVOKING LAKE
QUEER LAKE
RADIANT LAKE
RAGGED LAKE
RAIN LAKE
RAJA LAKE
RAVEN LAKE
RED FOX LAKE
REDROCK LAKE
RENCE LAKE
ROBIN LAKE
ROBINSON LAKE
ROBITAILLE LAKE                   45°41'     n052'

ROCK LAKE                         45°30'
ROD AND GUN LAKE
ROSEBARY LAKE
ROSEPOND LAKE
ROUND ISLAND LAKE
ROUNDBUSH LAKE
RYAN LAKE
RYEGRASS LAKE
SAM LAKE
SANDY LAKE
SAWYER LAKE
SCORCH LAKE
SCOTT LAKE
SEC LAKE
SHAû LAKE
SHALL LAKE
SHALLNOT LAKE
SHIPPAGEW LAKE
SHIRLEY LAKE
SHREW LAKE
SEC0 LAKE
SMITH LAKE
SMOKE LAKE
SOURCE LAKE
SPECKLEDTROUT LAKE
SPOOR LAKE
SPROULE LAKE
ST. ANDREWS LAKE
STRATTON LAKE
STRINGER LAKE
SUNDASSA LAKE
SUNDAY LAKE
SWAN LAKE
SYLVIA LAKE
TANAMAKOON LAKE
TATTLER LAKE
TEA LAKE
TECUMSEH LAKE
TEPEE LAKE
THOMAS LAKE
THREE MILE LAKE
TIM LAKE
TIMBERWOLF LAKE
TIP UP LAKE
TOM THOMPSON LAKE
TROUT LAKE
TUB LAKE (NL)                     45°31'     77°58'

TURQUOISE LAKE                    45°49'     77°35'
UPPER KAWA LAKE                   45°58'     78°53'
UPPER MINNOW LAKE                 45°14'     78°14'
VIREO LAKE                        45°44'     77°59'
WATERCLEAR LAKE                   46°03'     78°47'
WEED LAKE                         45°36'     78°53'
WELCOME LAKE                      45%'       78°25'
WEST HARRY LAKE                   45°32'     78°49'
WESTWARD LAKE                     45°29'     78°47'
WHITE PARTRIDGE LAKE              45°50'     78°06'
WHITEBIRCH LAKE                   46°04'     78°49'
WHITEFISH LAKE                    45°33'     78°25'
WHITEGULL LAKE                    45°40'     78°27'
WHITNEY LAKE                      45°34'     78°17'
WILKINS LAKE                      45°41'     77°55'

Data type (i.e., raw or transformed) for which each species model exhibited the greatest correct classification rate using logistic regression (LRA), discriminant analysis (LDA), classification tree (CART) and artificial neural network (ANN). Optimal (i.e., highest correct classification rate) classification tree size (i.e., number of terminal leaves) and number of hidden neurons in the neural network are reported based on n-fold cross validation. See Table 1.1 for definitions of species codes.

Columns: Species; data type for LRA, LDA, CART and ANN; classification tree size (Raw, Tran); number of hidden neurons in ANN (Raw, Tran).

Species: B, BB, BCS, BNS, BSB, BT, C, CC, CS, F, FSD, GS, ID, LC, LS, LT, LW, NRD, PD, PKS, RB, RW, SL, SMB, T-P, WS, YP

[Table body: the cell-to-species alignment did not survive extraction. Data-type entries are listed in their extracted order; the tree-size and hidden-neuron columns are illegible.]

trans, trans, raw, raw, raw, raw, trans, trans, trans, trans, raw, raw, trans, trans, raw, trans, trans, trans, raw, raw, raw, raw, trans, raw, trans, trans, trans,
trans, raw, trans, raw, raw, trans, [illegible], [illegible], trans, raw, raw, trans, trans, trans, trans, raw, trans, trans, raw, raw, raw, raw, raw, trans, trans, raw, trans,
raw, trans, raw, trans, raw, raw, raw, raw, raw, raw, raw, raw, trans, raw, raw, raw, raw, raw, raw, raw, raw, trans, raw, raw, trans, raw, raw,
trans, raw, trans, trans, raw, trans, trans, raw, raw, trans, trans, raw, trans, raw, trans, trans, trans, raw, raw, raw, raw, raw, trans, raw

Raw canonical coefficients and centroid means from discriminant function analysis for 27 fish species. Reported values are the constant, surface area (SA), volume (V), total shoreline perimeter (SP), maximum depth (MD), total dissolved solids (DS), pH, lake elevation (LE), growing degree-days (GD), occurrence of summer stratification (SS), watershed dummy variables (W1-W3), occurrence of a littoral-zone predator (P), and centroid means for absence (0) and presence (1) of species. See Table 1.1 for definitions of species codes.

       Constant        SA         V        SP        MD        DS        pH        LE        GD        SS
B       -72.553   0.13414  -0.15849  -0.65002  -0.42942  -0.82191  -0.02415  -0.27955  10.74684   0.20158
BB       -9.661   0.55001   0.01245   0.38280  -0.93483   0.12606   0.14850   0.47477   0.68385   0.86380
BCS      -3.438  -0.00350   0.00000   0.09829  -0.01545   0.00542  -0.57827  -0.00456   0.00521  -0.31488
BNS      38.631  -0.32029   0.27242  -0.55108  -0.85828  -0.18558   0.03492   0.20469  -4.97087   1.54191
BSB      29.314  -0.00352   0.00012   0.06160   0.03908  -0.00220   0.34226  -0.01045  -0.01753  -0.45971
BT       13.980   0.00177  -0.00012   0.00801   0.02761   0.00341   0.40149  -0.00230  -0.01142   2.18487
C        85.399   1.00683  -0.57670  -0.05037   0.78281  -0.22687   0.26515  -2.08847 -10.35438  -0.08330
CC       30.848   0.38836  -0.66936  -0.25970  -0.14147  -0.15285   0.20245  -2.14284  -2.10539   0.64166
CS       22.220  -0.39253   0.29348  -0.67447  -0.35818  -0.54881  -0.33857  -0.36896  -1.82130   0.38996
F       -92.107   0.07210  -0.27828   0.91765   0.02957   0.08386   0.08226   1.45016  11.08536  -0.13684
FSD     -12.949  -0.00205   0.00016   0.04258  -0.00103  -0.00943  -0.82074  -0.00250   0.01159   0.57881
GS       -9.879  -0.00194   0.00000   0.09463   0.00748   0.00914  -0.45303  -0.00144   0.00796  -1.04219
ID     -102.448   1.18460  -0.43924  -0.83440   0.64926  -0.66239   0.17375   0.53127  13.24498  -1.41916
LC       48.866  -0.24350  -0.49507   0.59907  -0.34571   0.12622   0.01510  -2.18245  -4.50200   0.56130
LS      -16.310  -1.18315   0.24102   0.30699  -0.23066  -0.55936   0.24760  -0.24246   3.04681  -0.29099
LT      -63.956   0.55426  -0.62373  -0.55221  -0.49991  -0.40720  -0.21740  -0.22786   9.63304   0.44437
LW        3.500  -0.00412   0.00020  -0.00792   0.00390  -0.00793  -0.58487  -0.00511   0.00171   0.47545
NRD     -62.389   1.71519  -0.62643  -1.50013  -0.50544   0.33827  -0.59822   1.73061   7.59965   0.09378
PD       83.762  -1.35937   0.41020   1.71584   0.16305   0.18013   0.13350  -1.26743 -10.46119  -0.09905
PKS     -16.957  -0.00038   0.00012  -0.06867   0.00712  -0.01236  -0.59472   0.01224   0.00986   1.04475
RB       -0.370  -0.00029   0.00008  -0.02721  -0.00745   0.01591   0.72206   0.00608  -0.00379  -0.38779

W3   P   Centroid Mean 0   Centroid Mean 1

- 0.5780 1 - 0.42 179 -0.9490 1 - 1.13532 - -0.41430 0.46341 0.22357 1.69792 -0.08709 1,46958 -0.99337 -1,08837 0.47043 -0.68942

0.47362 0.2 1405 -0.23806 0.63485 0.41821 - -0,82410 0.24723 -0.02311 - -0.31612 1.16604 -0.8056 1 -0.57566 0.70760 -0.3302 1 - 0.49945 -0.97332 0,59668 -0.50429 1.2484 1 0.19547 -0.15208 1.20724

0.00083 - 1 .O 1898 -0.15277 0.86332 0.2 1 O63 0.92270 -0.26 1 89 0.395 13 1.17296 0.42090 -0.28496 1.12022

-0.52 142 0.8 1 143 O. 1 7490 -0.88940 -0,5561 1 - 0.20958 -0.92139

-0.0033 1 - 0.8214 1 -0.73437 -0.69644 - O. 13472 -0,85325 -1.203 15 1,42492 0.38927 -0.32439 0.33936 - 1.10672 -0.35944 OS 1 17 1 -0.1 1883 - 0.46736 -0.25 127 -0.16 135 - 0.26885 - 1.92805
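
The way such a table is typically used can be sketched as follows. This is a minimal illustration in Python, not code from the thesis: a lake's canonical score is the constant plus the sum of each raw coefficient times the corresponding habitat value, and the lake is assigned to whichever centroid (absence or presence) lies closer. The function names, the three-variable coefficient set and all numeric values are hypothetical, and the nearest-centroid rule assumes equal prior probabilities for the two groups, which may differ from the cutoff actually applied in Chapter 1.

```python
def discriminant_score(constant, coefficients, predictors):
    """Canonical score = constant + sum(raw coefficient * predictor value)."""
    return constant + sum(coefficients[name] * predictors[name] for name in coefficients)


def classify(score, centroid_absent, centroid_present):
    """Return 1 (species present) if the score is nearer the presence centroid, else 0."""
    return 1 if abs(score - centroid_present) < abs(score - centroid_absent) else 0


# Hypothetical coefficients and lake measurements (not taken from the table above).
coefficients = {"SA": 0.5, "MD": -0.25, "pH": 1.0}
lake = {"SA": 2.0, "MD": 4.0, "pH": 6.5}

score = discriminant_score(-6.0, coefficients, lake)
prediction = classify(score, centroid_absent=-0.8, centroid_present=0.6)
print(score, prediction)
```

With unequal group sizes, the midpoint between the two centroids would usually be shifted by a prior-probability term rather than used directly.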

Appendix E

MatLab (version 5.3 release 11) programming code for artificial neural network training using the least-sum-of-squares error function (i.e., continuous response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.

% note: the datafile contains the response variable (in column 1) and the
% data for the predictor variables (in columns 2 to p+1)
load c:\temp\datafile.txt;
data=datafile;   % LOAD stores an ASCII file in a variable named after the file

% a priori commands
pausetime=1;
% turn off the warnings associated with learnbpm, logsig and deltalog
nntwarn off

% GLOBAL VARIABLES (must be specified by user)
IN=8;            % number of input neurons (i.e., number of predictor variables)
HN=4;            % number of hidden neurons (i.e., determined empirically using n-fold CV)
ON=1;            % number of output neurons (i.e., commonly equal to 1)
N=286;           % number of observations
weightrange=0.3; % initial range of connection weights (i.e., -0.3 ... 0.3)
noepoch=1001;    % number of iterations to consider (including the first random)

% creating input matrix
P=data(:,1+ON:IN+ON); P=P';
% creating output matrix
T=data(:,ON); T=T';

% N-fold cross validation
for k=1:N

  % local model parameters for network optimization
  lr=0.0001;      % initial learning rate parameter
  lr_inc=1.05;    % variable learning rate increase
  lr_dec=0.75;    % variable learning rate decrease
  err_ratio=1.06; % threshold: if the new error exceeds the old error by this ratio then modify learning rate
  momentum=0.95;  % initial momentum parameter

  % setting initial iteration number
  epoch=1;
  % selecting n-1 observations for training and standardizing the predictors
  cvP=P; cvP(:,k)=[];
  [cvP,meancvP,stdcvP]=prestd(cvP);
  % selecting n-1 observations for training and rescaling the response to 0-1
  cvT=T; cvT(:,k)=[];
  maxcvT=max(cvT); mincvT=min(cvT);
  cvT=(cvT-mincvT)/(maxcvT-mincvT);

  % initial connection weights set equal to -0.3 to 0.3 in order to increase prob(convergence)
  % calculated as a random uniform number between -1 and 1 multiplied by 0.3
  rand('seed',sum(100*clock))
  W1=rands(HN,IN)*weightrange; % input-hidden weights
  B1=rands(HN,1)*weightrange;  % bias neuron into hidden
  W2=rands(ON,HN)*weightrange; % hidden-output weights
  B2=rands(ON,1)*weightrange;  % bias neuron into output

  % calculating theoretical neuron output from input into hidden (A1) and
  % from hidden to output (A2) using a log sigmoid transfer function
  A1=logsig(W1*cvP,B1);
  A2=logsig(W2*A1,B2);

  % calculating the error of the network (difference between theoretical and actual output)
  E=cvT-A2;
  SSE=sumsqr(E);

  % setting weight and bias changes back to zero
  mc=0; dW1=(W1*0); dW2=(W2*0); dB1=(B1*0); dB2=(B2*0);

  % conduct the following commands until epoch (i.e., # of iterations) reaches noepoch
  while epoch<noepoch

    D2=deltalog(A2,E);
    D1=deltalog(A1,D2,W2);
    % neural network learning process based on MATLAB 4.0
    [dW1,dB1]=learnbpm(cvP,D1,lr,mc,dW1,dB1);
    [dW2,dB2]=learnbpm(A1,D2,lr,mc,dW2,dB2);
    TW1=W1+dW1; TB1=B1+dB1; TW2=W2+dW2; TB2=B2+dB2;
    % evaluate the network with the new values of the connection weights (TW and TB)
    TA1=logsig(TW1*cvP,TB1);
    TA2=logsig(TW2*TA1,TB2);
    % calculating the error of the network (difference between theoretical and actual output)
    TE=cvT-TA2;
    TSSE=sumsqr(TE);
    % altering the learning and momentum rate according to the error rate of the network
    if TSSE>SSE*err_ratio
      lr=lr*lr_dec; mc=0;
    else
      if TSSE<SSE
        lr=lr*lr_inc; mc=momentum;
      end
      W1=TW1; B1=TB1; W2=TW2; B2=TB2; A1=TA1; A2=TA2; E=TE; SSE=TSSE;
    end
    epoch=epoch+1;
  end

  % standardizing the 1 omitted observation using the mean and std. of the n-1 training observations
  cvdata=trastd(P(:,k),meancvP,stdcvP);
  % predicting the 1 omitted observation and back-transforming to the original scale
  cvTA1=logsig(TW1*cvdata,TB1);
  predictT(:,k)=logsig(TW2*cvTA1,TB2);
  predictT(:,k)=predictT(:,k)*(maxcvT-mincvT)+mincvT;
  fprintf('omitted observation: %7.4f\n',k);

end

c=corrcoef(T,predictT); c=c(1,2);
fprintf('Best correlation coefficient: %7.4f\n',c);
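
Two steps of the listing above are easy to misread, so they are restated here as a short Python sketch (Python is used only for illustration; the thesis code is MATLAB). The first function mimics the role of prestd and trastd under the assumption that both use the sample standard deviation: the held-out observation is standardized with the mean and standard deviation of the n-1 training observations only. The second function restates the adaptive learning-rate and momentum rule applied inside the while loop. Function names and example values are hypothetical.

```python
from statistics import mean, stdev


def leave_one_out_standardized(values):
    """Yield (standardized training values, standardized held-out value) for each fold."""
    for k in range(len(values)):
        train = values[:k] + values[k + 1:]
        m, s = mean(train), stdev(train)  # statistics from the n-1 training values only
        yield [(v - m) / s for v in train], (values[k] - m) / s


def adapt_step(old_error, trial_error, lr, mc,
               momentum=0.95, lr_inc=1.05, lr_dec=0.75, err_ratio=1.06):
    """Return (accept_step, new_lr, new_mc) for one trial weight update."""
    if trial_error > old_error * err_ratio:
        return False, lr * lr_dec, 0.0      # reject step, shrink rate, switch momentum off
    if trial_error < old_error:
        return True, lr * lr_inc, momentum  # accept step, grow rate, restore momentum
    return True, lr, mc                     # accept step, leave rate and momentum unchanged


folds = list(leave_one_out_standardized([1.0, 2.0, 3.0, 10.0]))
decision = adapt_step(old_error=10.0, trial_error=11.0, lr=0.01, mc=0.95)
print(folds[3], decision)
```

Keeping the held-out observation out of the standardization statistics is what makes the cross-validated prediction independent of the observation being predicted.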

MatLab (version 5.3 release 11) programming code for artificial neural network training using the cross entropy error function (i.e., binary response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.

% note: the datafile contains the response variable (in column 1) and the
% data for the predictor variables (in columns 2 to p+1)
load c:\temp\datafile.txt;
data=datafile;   % LOAD stores an ASCII file in a variable named after the file

% a priori commands
pausetime=1;
% turn off the warnings associated with learnbpm, logsig and deltalog
nntwarn off

% GLOBAL VARIABLES (must be specified by user)
IN=8;            % number of input neurons (i.e., number of predictor variables)
HN=4;            % number of hidden neurons (i.e., determined empirically using n-fold CV)
ON=1;            % number of output neurons (i.e., commonly equal to 1)
N=286;           % number of observations
weightrange=0.3; % initial range of connection weights (i.e., -0.3 ... 0.3)
noepoch=1001;    % number of iterations to consider (including the first random)

% creating input matrix
P=data(:,1+ON:IN+ON); P=P';
% creating output matrix
T=data(:,ON); T=T';

% N-fold cross validation
for k=1:N

  % local model parameters for network optimization
  lr=0.0001;      % initial learning rate parameter
  lr_inc=1.05;    % variable learning rate increase
  lr_dec=0.75;    % variable learning rate decrease
  err_ratio=1.06; % threshold: if the new error exceeds the old error by this ratio then modify learning rate
  momentum=0.95;  % initial momentum parameter

  % setting initial iteration number
  epoch=1;

  j=k;            % index of the omitted observation
  % selecting n-1 observations for training and standardizing the predictors
  cvP=P; cvP(:,j)=[];
  [cvP,meancvP,stdcvP]=prestd(cvP);
  % selecting n-1 observations for training and rescaling the response to 0-1
  cvT=T; cvT(:,j)=[];
  maxcvT=max(cvT); mincvT=min(cvT);
  cvT=(cvT-mincvT)/(maxcvT-mincvT);

  % initial connection weights set equal to -0.3 to 0.3 in order to increase prob(convergence)
  % calculated as a random uniform number between -1 and 1 multiplied by 0.3
  rand('seed',sum(100*clock))
  W1=rands(HN,IN)*weightrange; % input-hidden weights
  B1=rands(HN,1)*weightrange;  % bias neuron into hidden
  W2=rands(ON,HN)*weightrange; % hidden-output weights
  B2=rands(ON,1)*weightrange;  % bias neuron into output

  % calculating theoretical neuron output from input into hidden (A1) and
  % from hidden to output (A2) using a log sigmoid transfer function
  A1=logsig(W1*cvP,B1);
  A2=logsig(W2*A1,B2);
  % calculating the error of the network (difference between theoretical and actual output)
  E=cvT-A2;

  % CROSS ENTROPY (negative log-likelihood, so that smaller values are better)
  ce=[];
  for a=1:N-1
    ce(a)=-log(A2(a)^cvT(a)*(1-A2(a))^(1-cvT(a)));
  end
  SSE=sum(ce);
  % setting weight and bias changes back to zero
  mc=0; dW1=(W1*0); dW2=(W2*0); dB1=(B1*0); dB2=(B2*0);

  % conduct the following commands until epoch (i.e., # of iterations) reaches noepoch
  while epoch<noepoch

    D2=E;   % output delta for the cross entropy error with a log sigmoid output
    D1=deltalog(A1,D2,W2);
    % neural network learning process based on MATLAB 4.0
    [dW1,dB1]=learnbpm(cvP,D1,lr,mc,dW1,dB1);
    [dW2,dB2]=learnbpm(A1,D2,lr,mc,dW2,dB2);
    TW1=W1+dW1; TB1=B1+dB1; TW2=W2+dW2; TB2=B2+dB2;
    % evaluate the network with the new values of the connection weights (TW and TB)
    TA1=logsig(TW1*cvP,TB1);
    TA2=logsig(TW2*TA1,TB2);
    % calculating the error of the network (difference between theoretical and actual output)
    TE=cvT-TA2;
    % CROSS ENTROPY of the trial network
    ce=[];
    for a=1:N-1
      ce(a)=-log(TA2(a)^cvT(a)*(1-TA2(a))^(1-cvT(a)));
    end
    TSSE=sum(ce);
    % altering the learning and momentum rate according to the error rate of the network
    if TSSE>SSE*err_ratio
      lr=lr*lr_dec; mc=0;
    else
      if TSSE<SSE
        lr=lr*lr_inc; mc=momentum;
      end
      W1=TW1; B1=TB1; W2=TW2; B2=TB2; A1=TA1; A2=TA2; E=TE; SSE=TSSE;
    end
    epoch=epoch+1;
  end

  % standardizing the 1 omitted observation using the mean and std. of the n-1 training observations
  cvdata=trastd(P(:,j),meancvP,stdcvP);
  % predicting the 1 omitted observation
  cvTA1=logsig(TW1*cvdata,TB1);
  predictT(:,j)=logsig(TW2*cvTA1,TB2);
  predictT(:,j)=predictT(:,j)*(maxcvT-mincvT)+mincvT;
  fprintf('omitted observation: %7.4f\n',j);

end

m=sum(abs(T-round(predictT)))/N;
fprintf('Misclassification rate: %7.4f\n',m);
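
The quantities that distinguish this listing from the least-squares version can be sketched in a few lines of Python (an illustration only, with hypothetical function names and values). It assumes a binary target coded 0/1 and a logistic output neuron, in which case the cross entropy is the negative log-likelihood and the output-layer delta reduces to the plain difference between target and output. The small eps guard against log(0) is an addition that the MATLAB code does not have.

```python
import math


def cross_entropy(targets, outputs, eps=1e-12):
    """Negative log-likelihood of binary targets given predicted probabilities."""
    total = 0.0
    for t, a in zip(targets, outputs):
        a = min(max(a, eps), 1.0 - eps)  # keep log() finite (not in the original code)
        total -= t * math.log(a) + (1 - t) * math.log(1 - a)
    return total


def output_delta(targets, outputs):
    """Output-layer delta for cross entropy with a logistic output: target minus output."""
    return [t - a for t, a in zip(targets, outputs)]


def misclassification_rate(targets, outputs):
    """Share of observations whose prediction, thresholded at 0.5, differs from the target."""
    wrong = sum(1 for t, a in zip(targets, outputs) if t != (1 if a >= 0.5 else 0))
    return wrong / len(targets)


loss = cross_entropy([1, 0], [0.5, 0.5])
delta = output_delta([1, 0], [0.75, 0.25])
rate = misclassification_rate([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
print(loss, delta, rate)
```

The explicit 0.5 threshold is used instead of Python's round(), which rounds halves to the nearest even integer and so would not match MATLAB's round() at exactly 0.5.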
