Predictive models for freshwater fish community composition
Julian David Olden
A thesis submitted in conformity with the requirements
for the degree of Master's of Science
Graduate Department of Zoology
University of Toronto
© Copyright by Julian David Olden 2000
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Predictive models for freshwater fish community composition
Master's of Science 2000
Julian David Olden
Graduate Department of Zoology
University of Toronto
Empirical models were developed to predict fish species occurrence, richness and
community composition for 286 temperate lakes located in south-central Ontario, Canada
based on whole-lake measures of habitat. Detailed analysis and comparison of traditional
(i.e., logistic regression, discriminant analysis) and alternative (i.e., classification trees and
artificial neural networks) statistical modeling techniques show that predictive success differs
among species and approaches. Neural networks (NNs) are focused upon in subsequent
chapters given their potential utility in solving non-linear problems related to pattern
recognition and prediction. Details of model construction, optimization and validation are
illustrated, and new methods for quantifying the explanatory value of NNs are developed.
Next, the utility of NNs to aid in understanding and predicting species abundance using
near-shore measures of lake habitat is illustrated. Together, my thesis provides ecological and
methodological insights into modeling of fish communities, both of which have considerable
value for the study, management and conservation of aquatic ecosystems.
It's currently 1:34 in the morning and rather than going home I am considering
spending the night on the lab bench. But no . . . I must keep working as there is much to be
done . . . okay, just one more section then it's time to get some shut-eye with the cockroaches
of the Ramsay Wright building. Oh man . . . the acknowledgements! Not a good thing to
write when the caffeine is wearing off and my motor skills are deteriorating quickly.
Ironically, this is perhaps the most difficult task of the whole thesis . . . I'm not kidding.
There have been so many experiences that have guided me during my life. My father's
endless stories of sailing the seas, my mother's hand on my shoulder as I hurry to complete
my school work before running to the cockpit of the boat, my high school biology teacher
Mr. Johnson who first exposed me to the many wonders of science, the student last week
who asked me to clarify what exactly I was talking about. Hang on a minute! Let's do this
right, you know, in some organized manner (boy, my mum would be proud).
First, I can honestly say that I would not be in the position that I am today without my
supervisor, Don Jackson. He gave me my first chance, hiring me as a summer field assistant,
having the amazing insight to look beyond my early sub-par grades in university and
acknowledging my potential as a scientist. For although many believe Don's expertise is
mainly numerical, just spend one morning on a lake with him and you will soon realize his
overwhelming knowledge of natural history. It is this knowledge that Don has blessed me
with, starting in the field in Dorset, continuing during morning coffee and evening beers, and
hopefully never ending. Thank you Don . . . well . . . for everything. I look forward to the
years ahead of working with you, and most importantly, our continued friendship.
Pedro Peres-Neto . . . for people who know Pedro need I say more. My second
supervisor, an endless source of statistical information, wealth of ecological knowledge, love
for Rickards Red (even if it's flat), obsession with Saigon's Palace #34, and countless stories
of Brazil. I have never met an individual so dedicated to his studies, family and friends. I
have immensely enjoyed the hours of talking science with you. Thanks for the pep talks,
constant reassurance, and especially for being such a great and loyal friend. Hey Pedro . . .
"Is Russia in Europe or what?!?". Thanks to Bryan "The Notorious" Neff, Peng "Sperm
Boy" Fu and Trevor "Gunter" Pitcher for countless hours of laughter, gossip, Unreal
Tournament, Risk (long live the Black Plague), lifting weights, fishing (Bryan's black toe),
burning sessions, drinking beer and scotch, Kumtoya, Hung Fa, Kuïm Jug Yoen, Swiss
Chalet, NOT TIPPING $$, and just generally goofing off. I have enjoyed many intellectual
discussions with Bryan, Monte Carlo this, bootstrap that, and most importantly . . . "What the
heck is a time lag anyway?". With the introduction of Peng, conversation began to slowly
deteriorate in content, but there were still numerous instances I remember of talking
experimentation prior to the start of the movie, life histories during coffee shop club (sorry
Bryan you couldn't join!), or just science in general while walking to the pub. Thanks Peng,
but . . . "What's that metal thing in the bottom of the kettle?" Finally, with the entrance of
Trevor, conversation completely depreciated, out with talking about science, in with endless
jokes and the amazing realization that he would truly give the shirt off his own back for you.
We all enjoyed living vicariously through Trevor! Oh ya Gunt . . . "How's that rock bass
taste?". Thanks to all the boys for the nights of drinking at Pedro's place, Duke of York,
Bedford Academy, My Apartment, and we couldn't forget B.R., smoking cigars, and live
jazz.
Much love to Ladan Mehranvar for endless hours of talking, watching movies, eating
exotic foods, and the fantastic mornings at Bake Works. Your company is already dearly
missed. Many thanks to Pamela MacRae for being a great friend and supporter, and Jeanette
Davis for adding a unique twist to the lab in later years. Thanks to the basketball crew
(Michelle Tseng, Lock Rogers, Paul Williams and Dave Punzalan) for allowing me to TAKE
IT TO THE HOOP . . . boy those GSU rims are forgiving!
Thanks to my committee, Brian Shuter, Nick Collins and Keith Somers for their
critical comments on my thesis and for stimulating me intellectually. Nick Mandrak for
providing the Algonquin Park dataset used in the thesis, and for numerous conversations
regarding the fish communities of the park and surrounding areas. Sovan Lek for his
insightful comments about the finer details of neural networks. Finally, Locke Rowe for
showing me his undergrad transcript years ago, reassuring me that GPA is not representative
of your ability.
Thanks to Papa Ceo and Cora for providing the materials to prove that it is possible to
survive on pizza. A special thanks to the staff of the many coffee shops I have inhabited over
the last two years. For their friendly smiles, unlimited patience for accepting my hours of
loitering and most importantly for providing the essential fuel needed to ensure the
completion of this thesis. Cheers, this coffee is for you!
Saving the best for last . . . my family . . . who has provided the much needed love and
support during my years at U of T. Thanks Mum and Dad for making the effort to travel to
the evil dwellings of Toronto, dragging my butt out of my office, and force-feeding me.
Mum, your spoken and unspoken (biting your tongue I'm sure) concern of whether I would
finish the thesis was surprisingly reassuring at times, and Dad's laid back, "I know my son
can do it" attitude provided the perfect contrast. My brother Morgan, who continually asked
whether I had time to spare to just hang out. Although my answer was too frequently "no",
he emphasized the importance of family and took it upon himself to maintain strong ties
among us all. Much love to Tiff, Dilys, Morgan, Samantha, Betty and George . . . I'll phone
from Colorado . . . I promise!
Funding for this thesis was provided by a number of sources, including a Natural
Sciences and Engineering Research Council of Canada Graduate Scholarship, Edna Margaret
Robertson Scholarship, Frederick P. Ide Graduate Award, University of Toronto Open
Scholarships, Department of Zoology Teaching Assistantships and Travel Awards, and a
Natural Sciences and Engineering Research Council of Canada Research Grant to Don
Jackson.
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Boxes
List of Appendices
Thesis Introduction
CHAPTER 1 Predictive models for community assembly: Fish species occurrence in lakes
  Abstract
  Introduction
  Methods
  Results
  Discussion
  Conclusion
  Acknowledgments
  References
CHAPTER 2 Illuminating the "black box": A randomization approach for understanding variable contributions in artificial neural networks
  Abstract
  Introduction
  Case Study
  Interpreting neural network connection weights
  Illuminating the "black box"
  Conclusion
  Acknowledgments
  References
CHAPTER 3 Artificial Neural Networks: A predictive tool for fisheries science
  Abstract
  Introduction
  Artificial Neural Networks
  Methods
  Results
  Discussion
  Conclusion
  Acknowledgments
  References
Thesis Conclusion
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
List of Tables
Table 1.1. List of fish species, including species abbreviation (Code) and
frequency of occurrence (%) in the 286 study lakes. Only species
occurring in greater than 5% of the lakes are included.
Table 1.2. Summary statistics for the whole-lake habitat variables used in model
development (see Appendix B for Pearson product-moment correlation
coefficients among variables).
Table 1.3. Summary of predictive performance of species-habitat models. Reported
values are percentage correctly classified (CC), specificity (SP: ability to
accurately predict species absence) and sensitivity (SN: ability to
accurately predict species presence). Predictions significantly different
from random (based on Kappa statistic) are indicated in italics
(α = 0.05). Species codes are defined in Table 1.1.
Table 1.4. Species exhibiting high (significantly different from random) and low
(not significantly different from random) predictability for logistic
regression analysis, linear discriminant analysis, classification trees
and artificial neural networks.
Table 1.5. Comparison of observed and predicted fish community composition
for lakes based on first six components from a principal coordinate
analysis using Jaccard's similarity coefficient. The range of total amounts
of variation explained by the first six components of the between lakes
matrices is summarized whereas all components were retained from the
species ordinations. Reported values are Gower's m², ranging from 0 to 1
where 0 indicates perfect agreement, and associated significance levels
based on 9,999 randomizations from PROTEST in parentheses.
Table 2.1. Connection weight structure for the neural network modelling fish
species richness as a function of 8 habitat variables. W_ij represents
the input-hidden-output connection weight for input variable i
(where i = 1 to 8) and hidden neuron j (where j = A to D). P values
for input-hidden-output connection weights (W_iA, W_iB, W_iC and W_iD),
overall connection weights (Σ W_iA to W_iD), and Garson's relative
importance (%) are based on 9,999 randomizations.
Table 3.1. Summary statistics of macro-scale habitat variables used in the neural
networks to predict species presence or absence.
Table 3.2. Performance of neural networks for predicting species presence or
absence in 128 lakes in the Madawaska River drainage (Training
Data) based on leave-one-out cross validation, and applying the
Madawaska networks to predicting occurrence in 32 lakes from
the Oxtongue River drainage (Test Data). The reported values are
per cent species occurrence (SO), # of hidden neurons in network (HN),
optimal decision threshold based on ROC analysis (ODT), per cent
correct classification (CC), sensitivity (SN), specificity (SP), Kappa
statistic and associated P-value.
Table 3.3. Comparison of model predictions between full and pruned networks
with input variables and hidden neurons removed that were not
statistically significant (based on randomization test results). Pruned
network design is reported after species names, where the three
values represent the number of input, hidden and output neurons,
respectively. The reported values are per cent correct classification
(CC), sensitivity (SN), specificity (SP) for predicting species
presence-absence (based on the optimal decision threshold from
ROC analysis), and correlation coefficient (r) between predicted
and actual abundances and root-mean-square-of-error prediction
(RMSE).
List of Figures
Figure 1.1. Study lakes located in Amable du Fond River (A), Bonnechere
River (B), Madawaska River (C), Oxtongue River (D) and Petawawa
River (E) drainage basins of Algonquin Provincial Park, Ontario,
Canada (45°50' N, 78°20' W).
Figure 1.2. Results from McNemar's test assessing differences among patterns of lake
misclassifications using logistic regression, discriminant analysis,
classification trees and artificial neural networks. Shared shading for a
species represents pair-wise differences for lakes in which species occurrence
was incorrectly predicted (based on α < 0.05). For example, blacknose shiner
was misclassified in different sets of lakes using logistic regression and
classification tree.
Figure 1.3. The effect of frequency of species occurrence on overall correct
classification rate (top panel), specificity (middle panel) and sensitivity
(lower panel) for 27 fish species using logistic regression (LRA),
discriminant analysis (LDA), classification trees (CART) and artificial
neural networks (ANN), expressed as percentage. Dotted lines represent
expectations based on chance predictions.
Figure 1.4. Regression analysis between actual and predicted species richness
using logistic regression (LRA: y=3.24+0.73x), discriminant analysis
(LDA: y=2.96+0.75x), classification trees (CART: y=2.85+0.78x) and
artificial neural networks (ANN: y=2.64+0.86x). Solid line represents
regression line and the dashed line represents a line of perfect match
(1:1). Predicted richness was tabulated by summing predicted species
presences using models for each of the 27 species.
Figure 1.5. First two axes from principal coordinate analysis based on Jaccard's
similarity coefficient of data sets (A) known or observed species
composition, (B) composition predicted by logistic regression model, (C)
predicted by discriminant analysis, (D) predicted by classification trees
and (E) predicted by artificial neural networks. Letters refer to
species codes listed in Table 1.1.
Figure 2.1. Neural interpretation diagram (NID) for neural network modelling
fish species richness as a function of 8 habitat variables. The
thickness of the lines joining neurons is proportional to the magnitude
of the connection weight, and the shade of the line indicates the
direction of the interaction between neurons: black connections are
positive (excitator) and gray connections are negative (inhibitor).
Figure 2.2. Bar plots showing the percentage relative importance of each
habitat variable in the neural network predicting fish species richness
based on Garson's algorithm. See Box 2.1 for calculations involved
in Garson's algorithm.
Figure 2.3. Contribution plots from the sensitivity analysis illustrating the output
of the network to changes in each habitat variable with all other
variables held at their 20th, 40th, 60th and 80th percentile (each
percentile drawn with a different line style).
Figure 2.4. Distributions of (A) input-hidden-output connection weights for
hidden neuron B, (B) overall connection weight and (C) input relative
importance (%) for the influence of surface area on lake species richness.
Arrows represent observed input-hidden-output connection weight for
hidden neuron B (7.81), overall connection weight (7.43) and relative
importance (18.27%).
Figure 2.5. Neural Interpretation Diagram after non-significant input-hidden-output
connection weights are eliminated using the randomization test.
Only connection weights statistically different from zero (α = 0.05
and α = 0.10) are shown. The thickness of the lines joining neurons
is proportional to the magnitude of the connection weight, and the
shade of the line indicates the direction of the interaction between
neurons: black connections are positive (excitator) and gray connections
are negative (inhibitor). Black input neurons indicate habitat variables
that have an overall positive influence on species richness, and gray
input neurons indicate an overall negative influence on species richness.
Figure 3.1. One-hidden layer, feedforward neural network design.
Figure 3.2. First panel shows the location of study lakes from the Madawaska
River drainage (128 lakes depicted by circles) and Oxtongue River
drainage (32 lakes depicted by triangles) in Algonquin Provincial Park,
Ontario, Canada (45°50' N, 78°20' W). Second panel shows Crosson
Lake (45°05' N, 79°02' W) with 20 sampling stations depicted by
circles.
Figure 3.3. Neural interpretation diagram (NID) for predicting fish species presence
or absence as a function of macro-habitat variables. The thickness of the
lines joining neurons is proportional to the magnitude of the connection
weight, and the shade of the line indicates the direction of the interaction
between neurons; black connections are positive (excitator) and gray
connections are negative (inhibitor). Solid lines represent connection
weights statistically different from zero (α = 0.05), whereas dashed lines
represent non-significant connection weights. Black input neurons
indicate habitat variables that have an overall positive influence on
species occurrence, and gray input neurons indicate an overall negative
influence on species occurrence.
Figure 3.4. Relative importance (% of total contribution) of whole-lake habitat
variables in predicting species presence or absence based on the sum
of connection weights joining an input neuron and the output neuron.
Figure 3.5. Neural interpretation diagram (NID) for predicting fish species
abundance as a function of micro-habitat variables. The thickness
of the lines joining neurons is proportional to the magnitude of the
connection weight, and the shade of the line indicates the direction
of the interaction between neurons; black connections are positive
(excitator) and gray connections are negative (inhibitor). Solid lines
represent connection weights statistically different from zero (α = 0.05),
whereas dashed lines represent non-significant connection weights.
Black input neurons indicate habitat variables that have an overall
positive influence on species abundance, and gray input neurons
indicate an overall negative influence on species abundance.
Figure 3.6. Relative importance (% of total contribution) of micro-habitat
variables in predicting species abundance based on the sum of
connection weights joining an input neuron and the output neuron.
List of Boxes
Box 2.1. Garson's algorithm for partitioning and interpreting neural network
connection weights. Sample calculations shown for 3 input neurons
(1, 2 and 3), 2 hidden neurons (A and B) and 1 output neuron (O).
List of Appendices
Appendix A List of 286 study lakes of Algonquin Provincial Park used in
Chapter 1 (included are latitude and longitude co-ordinates).
Appendix B Matrix of Pearson product-moment correlation coefficients for raw (i.e.,
untransformed) whole-lake variables located in the lower triangle, and for
ln(x) transformed variables located in the upper triangle (except for pH).
Note that only continuous variables are shown. Variables include surface
area (SA), volume (V), total shoreline perimeter (SP), maximum depth
(MD), total dissolved solids (DS), pH, lake elevation (LE), and growing
degree-days (GD).
Appendix C Data type (i.e., raw or transformed) for which each species model exhibited
the greatest correct classification rate using logistic regression (LRA),
discriminant analysis (LDA), classification tree (CART) and artificial
neural network (ANN). Optimal (i.e., highest correct classification rate)
classification tree size (i.e., number of terminal leaves) and number of
hidden neurons in the neural network are reported based on n-fold cross
validation. See Table 1.1 for definitions of species codes.
Appendix D Logistic regression coefficients for 27 fish species models. Reported
values are the y-intercept (Int.), surface area (SA), volume (V), total
shoreline perimeter (SP), maximum depth (MD), total dissolved solids
(DS), pH, lake elevation (LE), growing degree-days (GD), occurrence of
summer stratification (SS), watershed dummy variable (W1-W3) and
occurrence of a littoral-zone predator (P). See Table 1.1 for definitions
of species codes.
Raw canonical coefficients and centroid means from discriminant
function analysis for 27 fish species. Reported values are the constant,
surface area (SA), volume (V), total shoreline perimeter (SP), maximum
depth (MD), total dissolved solids (DS), pH, lake elevation (LE), growing
degree-days (GD), occurrence of summer stratification (SS), watershed
dummy variable (W1-W3), occurrence of a littoral-zone predator (P),
and centroid means for absence (0) and presence (1) of species. See
Table 1.1 for definitions of species codes.
Appendix E MatLab (version 5.3 release 11) programming code for artificial neural
network training using the least-sum-of-squares error function (i.e.,
continuous response variable) in the backpropagation algorithm, where
predictions are based on n-fold cross validation.
MatLab (version 5.3 release 11) programming code for artificial neural
network training using the cross entropy error function (i.e., binary
response variable) in the backpropagation algorithm, where predictions
are based on n-fold cross validation.
Thesis Introduction
"The history of life on earth has been a history of interaction between living
things and their surroundings. To a large extent, the physical form and the
habits of the earth's vegetation and its animal life have been molded by the
environment. Considering the whole span of earthly time, the opposite effect,
in which life actually modifies its surroundings, has been relatively slight.
Only within the moment of time represented by the present century has one
species - man - acquired significant power to alter the nature of his world."
- Rachel Carson
Global biodiversity is changing at an unprecedented rate (Pimm et al. 1995) as a
result of human-induced changes in the global environment (Vitousek 1994), with greatest
impacts related to habitat loss (Sih et al. 2000). Habitat loss and modification have resulted
in the loss of biodiversity in many aquatic and terrestrial ecosystems (Wilson 1994, Sinclair
et al. 1995); for example, native fishes in Lake Victoria (Kaufman 1992), birds in fragmented
Brazilian forests (Willis 1979), and small mammals in arid Australia (Woinarski and
Braithwaite 1990). During the last decade, understanding and predicting regional (e.g.,
Prendergast et al. 1993), continental (e.g., Wahlberg et al. 1996, Channell and Lomolino
2000) and global changes (e.g., Rex et al. 1993, Guégan et al. 1998, Redford and Richter
1999, Sala et al. 2000) of biodiversity in response to changes in environment and land-use
have been of major importance.
A central problem in conservation ecology is how to use limited resources of time,
money, and energy most effectively to minimize the loss of the Earth's biological diversity. For
example, while species and habitats are disappearing at an alarming rate, we have often been
unable to evaluate the extent of the biodiversity loss, let alone predict it. Empirical models
could play an important role in conservation ecology by providing a quantitative framework
from which species distributions, richness and community structure can be predicted from
patterns in habitat heterogeneity, biotic interactions and anthropogenic conditions. For
instance, predictive models of fish species occurrence at broad spatial scales could be helpful
in assessing the major factors shaping current distributions and in predicting future ones. Such
models also serve an important role in forecasting the effects of changing land-use practices
(e.g., Guégan et al. 1998), altered climate regimes (e.g., Tonn 1990) and biotic invasions
(e.g., Hrabik and Magnuson 1999) on biota. Similarly, empirical models for finer scale
measures, such as species abundance, are critical for understanding relationships between the
environment, its use by an organism and subsequent productivity at smaller, local scales,
which can then be used to predict the influence of habitat alteration, species introductions,
and other artificial and natural perturbations on population and community health.
Although the public and scientists have traditionally viewed loss of biodiversity as an
issue of tropical ecosystems, there has been recent emphasis on the potential loss of
biodiversity in aquatic ecosystems of temperate regions (Moyle and Williams 1990, Hughes
and Noss 1992). Surprisingly, fisheries scientists have remained focused on understanding
species-environment relationships rather than developing predictive models using this
knowledge. The primary objective of this thesis is to advance the predictive realm of fish
ecology by focusing efforts on developing and comparing predictive models of species
occurrence, abundance, richness and community composition for freshwater fish
communities. Such models represent a major advance in studies of lake ecology and have
obvious research, conservation and management applications. For instance, predictive
models could serve as a template for forecasting species occurrence in lakes whose fauna has
not or cannot be adequately sampled, as well as predicting the effects of habitat modification
and changing land-use patterns on fish populations and communities. In addition,
fish-habitat models would provide managers with tools to predict biodiversity, direct searches for
unknown fish populations, predict the presence of indicator species, indicate habitat
suitability for restoration or reintroduction of species, and predict the spread of exotic
species.
The aim of the first chapter is to use conventional (i.e., logistic regression and
discriminant analysis) and alternative (i.e., classification trees and artificial neural networks)
statistical approaches to develop predictive models of species presence or absence based on
lake-wide measures of habitat for 286 lakes of south-central Ontario, Canada. Fish-habitat
models are developed for 27 species, and the predictive performance of the models is
examined, interpreted and compared in detail using a number of performance measures.
Many of these analyses are an advancement over conventional model evaluations, and
provide insight into determining the predictability of species occurrence, richness and
community composition as a function of whole-lake measures of habitat. Individual
species-habitat models are then used to predict richness and community composition of the study
lakes. If fish communities are predictable then a community-level approach to the
conservation of aquatic biodiversity may be feasible, rather than focusing efforts on
particular species or populations (Angermeier and Schlosser 1995). Such an approach would
be a major advance relative to current management programs (Evans et al. 1987, Franklin
1993, Angermeier and Winston 1999), and consequently predictive models for fish
communities could play a critical role.
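The performance measures named in the List of Tables (per cent correct classification, sensitivity, specificity and the Kappa statistic) can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: the function name `classification_summary` and the toy presence-absence vectors are hypothetical, Kappa is computed with the standard Cohen's formula, and none of this reproduces the thesis's own code or data.

```python
def classification_summary(observed, predicted):
    """Summarise presence (1) / absence (0) predictions for one species.

    Hypothetical helper for illustration; inputs are equal-length
    sequences of 0/1 values, one entry per lake.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presence predicted correctly
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absence predicted correctly
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # absence predicted as presence
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presence predicted as absence
    n = tp + tn + fp + fn

    cc = (tp + tn) / n                                   # correct classification rate
    sn = tp / (tp + fn) if (tp + fn) else float("nan")   # sensitivity: presences recovered
    sp = tn / (tn + fp) if (tn + fp) else float("nan")   # specificity: absences recovered

    # Cohen's kappa: agreement beyond what the marginal totals give by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (cc - p_chance) / (1 - p_chance) if p_chance != 1 else float("nan")

    return {"CC": cc, "SN": sn, "SP": sp, "kappa": kappa}


# Toy example: 8 lakes, 4 with the species present (made-up values).
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 0, 1, 1]
summary = classification_summary(observed, predicted)
```

Reporting sensitivity and specificity alongside the overall rate matters because a model for a rare species can score a high correct classification rate simply by predicting absence everywhere; Kappa corrects for that chance agreement.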
A secondary objective of Chapter 1 is to compare the performance of conventional
and alternative approaches for predicting species occurrence, richness and community
composition. Such comparisons are increasingly important given the large and evolving
range of approaches available for modeling, and the potential difficulty this poses for
ecologists or conservation biologists in choosing appropriate methods. For instance,
conventional techniques, based notably on multiple regression, are capable of solving many
problems, but a major drawback is that species-environment relationships are often
non-linear (linearity being a critical assumption of these methods). In recent years,
recursive-partitioning and machine-learning techniques have received greater attention in the biological
sciences. Examples include classification and regression trees and artificial neural networks,
which show promise in ecological studies given their ability to model complex, non-linear
relationships among variables.
Although alternative techniques can be beneficial in that they can readily model
complex, non-linear associations between a species and its environment (Lek et al. 1996),
methods for quantifying and interpreting these associations are generally lacking. For
instance, artificial neural networks have been called a "black box" approach to modeling
ecological phenomena since they have been viewed as providing no information regarding
the importance of the variables in the model (e.g., Paruelo and Tomasel 1997, Lek and
Guégan 1999, Özesmi and Özesmi 1999). In Chapter 2, my objective is to illuminate the
"black box" by providing a synthesis of the methods available to better understand variable
contribution in networks. In addition, a randomization test for artificial neural networks is
developed which enables testing of statistical significance of independent variables in
network predictions and facilitates the interpretation of variable interactions.
Using the approaches developed in Chapter 2, Chapter 3 expands the utility of neural
networks for predicting and understanding species-habitat relationships in terms of
occurrence at a regional scale (i.e., within drainage) as a function of whole-lake measures of
habitat, and abundance at a local scale (i.e., within lake) as a function of near-shore habitat
variables. Given that species abundance may be a more sensitive response variable for
studying fish-habitat relationships, a combined analysis of both occurrence and abundance is
beneficial. Furthermore, a more detailed evaluation of the species occurrence models is
conducted, including the construction of Receiver-Operating Characteristic plots (Metz
1978) to estimate optimal decision thresholds for prediction to maximize classification
success, and an independent assessment of the geographic transferability of the models.
References
Angermeier, P. L., and I. J. Schlosser. 1995. Conserving aquatic biodiversity: beyond species
and populations. American Fisheries Society Symposium 17:402-414.
Angermeier, P. L., and M. R. Winston. 1999. Characterizing fish community diversity across
Virginia landscapes: Prerequisite for conservation. Ecological Applications 9:335-349.
Channell, R., and M. V. Lomolino. 2000. Dynamic biogeography and conservation of
endangered species. Nature 403:84-86.
Evans, D. O., B. A. Henderson, N. J. Bax, T. R. Marshall, R. T. Oglesby, and W. S. Christie.
1987. Concepts and methods of community ecology applied to freshwater fisheries
management. Canadian Journal of Fisheries and Aquatic Sciences 44 (Suppl. 2):448-470.
Franklin, J. F. 1993. Preserving biodiversity: species, ecosystems, or landscapes? Ecological
Applications 3:202-205.
Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity predict
global riverine fish diversity. Nature 391:382-384.
Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt
(Osmerus mordax) in a northern Wisconsin lake district and implications for
management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.
Hughes, R. M., and R. F. Noss. 1992. Biological diversity and biological integrity: Current
concerns for lakes and streams. Fisheries (Bethesda) 17:11-19.
Kaufman, L. 1992. Catastrophic changes in species-rich freshwater ecosystems. BioScience
42:846-858.
Lek, S., Delacoste, M., Baran, P., Dimopoulos, I., Lauga, J., Aulagnier, S., 1996. Application
of neural networks to modelling nonlinear relationships in ecology. Ecological Modelling
90:39-52.
Lek, S. and Guégan, J.F., 1999. Artificial neural networks as a tool in ecological modelling,
an introduction. Ecological Modelling 120:65-73.
Metz, C. E. 1978. Basic principles of ROC analysis. Seminars in Nuclear Medicine 8:283-
298.
Moyle, P. B., and J. E. Williams. 1990. Biodiversity loss in the temperate zone: decline of the
native fish fauna of California. Conservation Biology 4:275-284.
Özesmi, S.L. and Özesmi, U., 1999. An artificial neural network approach to spatial habitat
modelling with interspecific interaction. Ecological Modelling 116:15-31.
Paruelo, J. M. and Tomasel, F., 1997. Prediction of functional characteristics of ecosystems:
a comparison of artificial neural networks and regression models. Ecological Modelling
98:173-186.
Prendergast, J. R., R. M. Quinn, J. H. Lawton, B. C. Eversham, and D. W. Gibbons. 1993.
Rare species, the coincidence of diversity hotspots and conservation strategies. Nature
365:335-337.
Pimm, S. L., G. J. Russell, J. L. Gittleman, and T. M. Brooks. 1995. The future of biodiversity.
Science 269:347-351.
Redford, K. H., and B. D. Richter. 1999. Conservation of biodiversity in a world of use.
Conservation Biology 13:1246-1256.
Rex, M. A., C. T. Stuart, R. R. Hessler, J. A. Allen, H. L. Sanders, and G. D. F. Wilson. 1993.
Global-scale latitudinal patterns of species diversity in the deep-sea benthos. Nature
365:636-639.
Sala, O. E., and 18 other authors. 2000. Global biodiversity scenarios for the year 2100.
Science 287:1770-1774.
Sih, A., B. G. Jonsson, and G. Luikart. 2000. Habitat loss: ecological, evolutionary and
genetic consequences. Trends in Ecology and Evolution 15:132-134.
Sinclair, A. R. E., D. S. Hik, O. J. Schmitz, G. G. E. Scudder, D. H. Turpin, and N. C. Larter.
1995. Biodiversity and the need for habitat renewal. Ecological Applications 5:579-587.
Tonn, W. M. 1990. Climate change and fish communities: A conceptual framework.
Transactions of the American Fisheries Society 119:337-352.
Vitousek, P. M. 1994. Beyond global warming: Ecology and global change. Ecology
75:1861-1877.
Wahlberg, N., A. Moilanen, and I. Hanski. 1996. Predicting the occurrence of endangered
species in fragmented landscapes. Science 273:1536-1538.
Willis, E. O. 1979. Species reductions in reminiscent woodlots in southern Brazil.
Proceedings International Ornithological Congress XVII:783-786.
Wilson, E. O. 1994. Biodiversity: challenge, science and opportunity. American Zoologist
34:5-11.
Woinarski, J. C. Z., and R. W. Braithwaite. 1990. Conservation foci for Australian birds and
mammals. Search 21:65-68.
CHAPTER 1
Predictive models for community assembly: Fish species occurrence in lakes
ABSTRACT
The prediction of species occurrence and community composition is of primary
importance in ecology and conservation biology. Given the large and evolving range of
approaches available for developing predictive models of species presence or absence, it is
potentially difficult for ecologists or conservation biologists to choose appropriate methods.
In this study, I used logistic regression analysis, linear discriminant analysis, classification
trees and artificial neural networks to develop predictive models of presence or absence for
27 fish species based on habitat variables of 286 temperate lakes located in south-central
Ontario, Canada. Detailed evaluation of these models based on overall correct classification,
specificity (i.e., ability to accurately predict species absence) and sensitivity (i.e., ability to
accurately predict species presence), showed that the approaches differed marginally in mean
predictive performance across all species. All four methods exhibited higher levels of correct
classification (76.6-78.9% mean success) and specificity (76.8-77.3% mean success)
compared to sensitivity (35.1-46.2% mean success), and levels of these three measures were
found to depend on the species frequency of occurrence in the study lakes. On the other
hand, individual species predictability varied greatly among approaches and species.
Furthermore, even when correct classification rates were similar for some species, I found
that linear approaches (i.e., logistic regression and discriminant analysis) and non-linear
approaches (classification tree and neural networks) differed in which of the study lakes they
correctly classified. The 27 species-habitat models derived using each method were used to
predict species richness and composition of each study lake, and results showed close
agreement with observed richness and composition using all approaches. Predictability of
community composition varied across the 5 drainages depending on the modeling approach,
with species showing both individual and shared (i.e., with other species) predicted responses
to habitat conditions. I showed that easily obtained lake attributes can be used to predict fish
species occurrence, richness and community composition with high success.
INTRODUCTION
Historically, ecologists have been interested in understanding and predicting the
distribution of species and the composition of communities across landscapes (Orians 1980,
Wiens 1992, Pickett et al. 1994). However, the relative emphasis that has been placed on the
explanatory and predictive components of ecological research varies substantially across
disciplines and taxa (Keddy 1992). Plant ecologists have placed more emphasis on
developing models to predict species distributions (e.g., Hill and Keddy 1992, Toner and
Keddy 1997, Wiser et al. 1998) as have stream ecologists in predicting the occurrence of
invertebrates (e.g., Bailey et al. 1998, Chessman 1999, Moss et al. 1999) and fish (e.g., h s e
et al. 1997, Dunham and Rieman 1999, Rahel and Nibbelink 1999, Scheller et al. 1999). In
contrast, lake ecologists have generally focused on understanding species-environment
processes rather than developing predictive models.
Our understanding about fish-environment associations in lakes has emerged
primarily from comparative studies that describe statistical relationships between sets of
environmental variables and species occurrence or abundance. These studies identified the
influence of abiotic conditions (e.g., lake morphology, water chemistry), biotic interactions
(e.g., predation, competition), habitat isolation and human-related factors in structuring fish
populations and communities at local, landscape and regional spatial scales (e.g., Jackson and
Harvey 1989, Tonn et al. 1990, Hinch et al. 1994, Rodriguez and Lewis 1997, Magnuson et
al. 1998). Examining variation in fish-habitat relationships is useful to better understand
patterns of species distribution, and for providing insight into the mechanisms shaping and
regulating assemblage structure. However, fish ecologists focusing on lakes have generally
not used this understanding to develop predictive models for species occurrence and
community composition. Such models would represent a major advance in lake ecology and
would have obvious applications in research, conservation and management. For example,
predictive models could be used to forecast the effects of habitat modification and changing
land-use patterns on fish populations, to estimate habitat suitability for restoration or
reintroduction of species, and to predict the potential spread of exotic species. Moreover,
predictive models for fish communities could aid in the effective conservation of aquatic
biodiversity by focusing management efforts on whole communities (Evans et al. 1987,
Franklin 1993, Angermeier and Winston 1999) rather than individual populations
(Angermeier and Schlosser 1995).
Developing predictive models for species occurrence is often difficult since fish
commonly exhibit complex, non-linear responses to habitat heterogeneity and biotic
interactions. Logistic regression and linear discriminant analysis remain the most frequently
used techniques, although our confidence in the results is often limited by the inability to
meet a number of assumptions, such as statistical distributions of variables, independence of
variables, and model linearity (James and McCulloch 1990). Consequently, researchers
have begun to employ non-linear statistical approaches such as classification and regression
trees (e.g., Magnuson et al. 1998, Emmons et al. 1999, Rathert et al. 1999, Rejwan et al.
1999) and artificial neural networks (e.g., Lek et al. 1996, Mastrorillo et al. 1997, Guégan et
al. 1998, Manel et al. 1999a,b, Özesmi and Özesmi 1999) for modeling ecological data. It is
believed that these alternative approaches can provide researchers with more flexible tools
for modeling the complex relationships between species and their surrounding environment.
The primary objective of my study is to determine whether lake habitat conditions
relate to species occurrence, and ultimately the composition of fish communities, in some
predictable manner. I address this objective using logistic regression analysis, discriminant
analysis, classification trees and artificial neural networks to develop fish-habitat models for
species occurring in temperate lakes of Ontario, Canada. I determine the predictability of
species presence or absence based on readily available, whole-lake habitat features and
provide a detailed evaluation and comparison among species and among modeling
approaches. Second, each fish-habitat model is applied to assess the predictability of species
richness and community composition. Model performance in predicting the composition of
communities of different drainage basins is assessed and patterns in predicted and observed
species membership in the communities are compared. Many of these analyses are
advancements over more conventional model evaluations, and provide important insight into
the predictability of species occurrence, richness and community composition as a function
of whole-lake measures of lake habitat.
METHODS
Ecological data
The study system consisted of 286 freshwater lakes from five drainage basins located
in south-central Ontario, Canada (Fig. 1.1, Appendix A). Aquatic communities in this region
are representative of relatively natural ecosystems because these lakes are located in
Algonquin Provincial Park and are subject to minimal perturbations from development and
species introductions. I developed fish-habitat models for 27 fish species (Table 1.1) by
modeling species presence or absence as a function of 13 whole-lake or watershed-level
habitat characteristics. These predictor variables were chosen to include factors that are
related to known habitat requirements of fish in this region (e.g., Matuszek and Beggs 1988,
Minns 1989; Table 1.2), and included: (1) surface area; (2) volume; (3) total shoreline
perimeter (sum of lake and island perimeters); (4) maximum depth; (5) surface measurements
(taken at depths 2.0 m) of total dissolved solids; (6) pH; (7) lake elevation; (8) growing
degree-days (obtained by subtracting the value five from the average daily temperature and
summing across all days that the average daily temperature is above 5 °C); (9) occurrence of
summer stratification; (10) occurrence of a large littoral-zone piscivore (i.e., northern pike,
smallmouth bass or largemouth bass) when modeling small-bodied fish; and (11-13) three
binary variables delineating the five drainage basins, i.e., Amable du Fond, Bonnechere,
Madawaska, Oxtongue and Petawawa Rivers, to account for the potential influence of
historical biogeography on fish community composition. All data were obtained from the
Algonquin Park Fish Inventory Data Base (Crossman and Mandrak 1991), and information
regarding the standardized methodology for this inventory can be obtained from Dodge et al.
(1985).
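The growing degree-day definition given above can be illustrated with a short sketch. The function name, the default base temperature argument and the example temperatures are hypothetical and are not taken from the inventory data.

```python
def growing_degree_days(daily_mean_temps_c, base_c=5.0):
    """Sum (T - base) over the days whose mean temperature exceeds the base."""
    return sum(t - base_c for t in daily_mean_temps_c if t > base_c)


# Hypothetical daily mean temperatures (degrees C), for illustration only
example_temps = [3.0, 5.0, 7.5, 12.0, 4.9]
gdd = growing_degree_days(example_temps)  # only 7.5 and 12.0 contribute
```

Days at or below the 5 °C base contribute nothing, which matches the wording "above 5 °C" in the variable definition.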
Modeling species occurrence
I applied logistic regression analysis (LRA), linear discriminant analysis (LDA),
classification and regression trees (CART) and artificial neural networks (ANN) to model
fish presence or absence as a function of the habitat variables described above. Due to their
well-documented use by ecologists, I refrain from detailing LRA and LDA, but discuss
CART and ANNs since these approaches are less familiar to many ecologists.
Figure 1.1. Study lakes located in Amable du Fond River (A), Bonnechere River (B),
Madawaska River (C), Oxtongue River (D) and Petawawa River (E) drainage basins of
Algonquin Provincial Park, Ontario, Canada (45°50' N, 78°20' W).
Table 1.1. List of fish species, including species abbreviation (Code) and frequency of occurrence (%) in the 286 study lakes. Only species occurring in greater than 5% of the lakes are included.
Code   Common Name              Scientific Name                  %
BCS    Blackchin shiner         Notropis heterodon
BNS    Blacknose shiner         Notropis heterolepis
BSB    Brook stickleback        Culaea inconstans
BT     Brook trout              Salvelinus fontinalis
BB     Brown bullhead           Ameiurus nebulosus
B      Burbot                   Lota lota
C      Cisco                    Coregonus artedi
CC     Creek chub               Semotilus atromaculatus
CS     Common shiner            Luxilus cornutus
F      Fallfish                 Semotilus corporalis
FSD    Finescale dace           Phoxinus neogaeus
GS     Golden shiner            Notemigonus crysoleucas
ID     Iowa darter              Etheostoma exile
LC     Lake chub                Couesius plumbeus
LT     Lake trout               Salvelinus namaycush
LW     Lake whitefish           Coregonus clupeaformis
LS     Longnose sucker          Catostomus catostomus
NRD    Northern redbelly dace   Phoxinus eos
PD     Pearl dace               Margariscus margarita
PKS    Pumpkinseed              Lepomis gibbosus
RB     Rock bass                Ambloplites rupestris
RW     Round whitefish          Prosopium cylindraceum
SMB    Smallmouth bass          Micropterus dolomieu
SL     Splake                   S. fontinalis x S. namaycush
T-P    Trout-perch              Percopsis omiscomaycus
WS     White sucker             Catostomus commersoni
YP     Yellow perch             Perca flavescens
Table 1.2. Summary statistics for the whole-lake habitat variables used in model development (see Appendix B for Pearson product-moment correlation coefficients among variables).
Predictor variable                              Minimum   25% quartile   Median   75% quartile   Maximum
Surface Area (ha)
Volume (x 10^4 m^3)
Total Shoreline Perimeter (km)
Maximum Depth (m)
Elevation (m)
pH
Total Dissolved Solids (mg L^-1)
Growing Degree Days
Occurrence of summer stratification (0,1)
Presence of littoral-zone piscivores (0,1)
3 Watershed dummy variables (0,1)
Classification and regression trees. - The use of automatic construction of decision trees
dates from work in the social sciences by Morgan and Sonquist (1963), but Breiman et al.
(1984) had a major influence in bringing CART methods to the attention of statisticians.
CART is a nonparametric multivariate classification technique which is most commonly
implemented using a recursive partitioning algorithm (Ciampi 1990, Hand 1997). This
algorithm partitions the data set into a nested series of paired groupings, each of them as
homogeneous as possible with respect to either the presence or absence of the species. The
procedure begins with the entire data set, also called the root node, and formulates split-
defining conditions for each possible value of the predictor variables to create candidate
splits. Next, the algorithm selects the candidate split that minimizes the misclassification
rate, and uses it to partition the data set into 2 subgroups. The algorithm continues
recursively with each of the new subgroups until no split yields a significant decrease in the
misclassification rate, or until the subgroup contains a small number of observations (i.e.,
usually set to 5 or 10). A terminal node or "leaf" is a node that the algorithm cannot partition
any further, and represents the most homogeneous group (Breiman et al. 1984). The
response class (in this case the presence or absence of a species) for each terminal node is
assigned by minimizing the resubstitution estimate of the probability of misclassification for
the observations of that node. The number of terminal nodes defines the size of the tree.
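The recursive partitioning procedure described above can be sketched in a few lines. This is a minimal illustration rather than the S-Plus implementation used in the study: the function names are hypothetical, candidate thresholds are assumed to lie midway between consecutive observed values, and splitting stops when no split gives any decrease in misclassification (a simplification of the "significant decrease" rule).

```python
def misclassified(labels):
    """Observations outside the majority class of a node (0 = absent, 1 = present)."""
    presences = sum(labels)
    return min(presences, len(labels) - presences)


def best_split(rows, labels):
    """Return (errors, variable index, threshold) for the candidate split with
    the fewest misclassified observations, or None if no split is possible."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Assumption: thresholds sit midway between consecutive observed values
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= thr]
            right = [y for r, y in zip(rows, labels) if r[j] > thr]
            errors = misclassified(left) + misclassified(right)
            if best is None or errors < best[0]:
                best = (errors, j, thr)
    return best


def grow_tree(rows, labels, min_size=5):
    """Partition recursively until a node is pure, too small, or no split helps."""
    node_errors = misclassified(labels)
    majority = 1 if sum(labels) * 2 >= len(labels) else 0
    if node_errors == 0 or len(labels) < min_size:
        return {"leaf": True, "predict": majority}
    split = best_split(rows, labels)
    if split is None or split[0] >= node_errors:
        return {"leaf": True, "predict": majority}
    _, j, thr = split
    left = [i for i, r in enumerate(rows) if r[j] <= thr]
    right = [i for i, r in enumerate(rows) if r[j] > thr]
    return {
        "leaf": False, "var": j, "threshold": thr,
        "left": grow_tree([rows[i] for i in left], [labels[i] for i in left], min_size),
        "right": grow_tree([rows[i] for i in right], [labels[i] for i in right], min_size),
    }


def predict(tree, row):
    """Follow the splits from the root node down to a terminal node."""
    while not tree["leaf"]:
        tree = tree["left"] if row[tree["var"]] <= tree["threshold"] else tree["right"]
    return tree["predict"]


# Hypothetical lakes described by two habitat variables; not data from the study
rows = [[1.0, 10.0], [2.0, 9.0], [3.0, 8.0], [4.0, 7.0],
        [10.0, 1.0], [11.0, 2.0], [12.0, 3.0], [13.0, 4.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
tree = grow_tree(rows, labels)
```

Each terminal node predicts its majority class, which is the class that minimizes the resubstitution misclassification estimate for the observations in that node.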
Next, optimal tree size is assessed to simplify its structure without sacrificing
goodness-of-fit. This can be a particularly important consideration since more splits in the
tree will result in lower misclassification rates at the cost of poorer predictive power when
applied to data not used in constructing the tree. Alternatively, if the tree is too small then it
will not use all the classification information that is available from the data. I used jackknife
validation to estimate misclassification rates for a series of candidate trees each of different
size, and defined the optimal tree size as the subtree exhibiting the lowest misclassification
rate (Breiman et al. 1984 and Appendix C).
Artificial neural networks - Although ANNs were originally developed to better understand
how the mammalian brain functions, researchers have become more interested in the
potential statistical utility of neural network algorithms (Cheng and Titterington 1994,
Bishop 1995). In this study I use one hidden-layer feedforward neural networks trained by
the backpropagation algorithm (Rumelhart et al. 1986). This type of network is commonly
used because it is considered to be a universal approximator of any continuous function
(Hornik et al. 1989). Furthermore, I use a single hidden layer because this is generally
satisfactory for statistical applications (Bishop 1995), it greatly reduces computational time,
and it often produces similar results compared to multiple hidden layers (Kůrková 1992).
The one hidden-layer feedforward network consists of a single input, hidden and
output layer, with each layer containing one or more neurons. The input layer contains p
neurons, each of which represents one of the p predictor variables, i.e., in my case 12 input
neurons for each species, except for small-bodied species where the input layer contains 13
neurons (addition of littoral-zone predator variable). The number of hidden neurons in the
neural network is determined empirically by calculating the misclassification rates for
networks containing 1 to 20 hidden neurons using n-fold cross validation, and choosing the
number of hidden neurons which produces the lowest misclassification rate (Appendix C;
Bishop 1995). The output layer contains one neuron representing the probability of species
occurrence. Additional bias neurons with a constant output (equal to 1) are added to the
hidden and output layers. These neurons play a similar role to that of the constant term in
multiple regression. Each neuron (excluding the bias neurons) is connected to all neurons
from adjacent layers with an axon. The axon connection between neurons is assigned a
weight which dictates the intensity of the signal transmitted by the axon. In feedforward
networks, axon signals are transmitted in a unidirectional path, from input layer to output
layer through the hidden layer. The "state" or "activity level" of each neuron is determined
by the input received from the other neurons connected to it. The state of each input neuron
is defined by the incoming signal (i.e., values) of the predictor variables. The state of the
other neurons is evaluated locally by calculating the weighted sum of the incoming signals
from the neurons of the previous layer. The entire process can be written mathematically as:
$$y_k = \phi_o\left(\beta_k + \sum_j w_{jk}\,\phi_h\left(\beta_j + \sum_i w_{ij}\,x_i\right)\right)$$
where $x_i$ are the input signals, $y_k$ are the output signals, $w_{ij}$ are the weights between input
neuron i and hidden neuron j, $w_{jk}$ are the weights between hidden neuron j and output neuron k,
$\beta_j$ and $\beta_k$ are the biases associated with the hidden and output layers, and $\phi_h$ and $\phi_o$ are
activation functions for the hidden and output layers. There are several activation functions
(see Bishop 1995) and I use the logistic (or sigmoid) function. The outgoing signal to the
output neuron represents the probability of species occurrence.
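The forward pass through a one hidden-layer network with logistic activations can be sketched as follows. The function and argument names are hypothetical, and the weights in the example are arbitrary values chosen for illustration, not fitted values from any species model.

```python
import math


def logistic(z):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Predicted probability of occurrence for one lake.

    x       : input signals (predictor values)
    w_ih    : w_ih[i][j] is the weight from input neuron i to hidden neuron j
    b_h     : bias of each hidden neuron
    w_ho    : w_ho[j] is the weight from hidden neuron j to the output neuron
    b_o     : bias of the output neuron
    """
    hidden = [
        logistic(b_h[j] + sum(w_ih[i][j] * x[i] for i in range(len(x))))
        for j in range(len(b_h))
    ]
    return logistic(b_o + sum(w_ho[j] * hidden[j] for j in range(len(hidden))))


# Arbitrary illustrative weights for 2 inputs and 2 hidden neurons
p = forward([0.2, -1.0], [[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1], [1.2, -0.7], 0.05)
```

Because the output neuron also uses the logistic function, the returned value always lies strictly between 0 and 1 and can be read as a probability.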
Training the neural network involves the back-propagation algorithm where the goal
is to find a set of connection weights that minimizes an error function. The cross-entropy
criterion (i.e., similar to log-likelihood) is minimized during network training:
$$E = -\sum_n \left[\, t_n \ln y_n + (1 - t_n) \ln (1 - y_n) \,\right]$$
where $t_n$ is the observed output value and $y_n$ is the predicted output value for observation n.
Observations are sequentially presented to the network, and weights are adjusted after each
output is calculated depending on the magnitude and direction of the error. This iterative
technique of minimizing the error is known as gradient descent, where weights are modified
in the direction of greatest descent, traveling "downhill" in the direction of the minimum.
Network training is stopped after 1000 iterations of the error back-propagation algorithm. To
minimize the potential for network overfitting I use the simplest network architecture (i.e.,
smallest number of hidden neurons) where equivalent network configurations exhibit
identical predictive performance. Furthermore, I use Ripley's (1994) regularization, where a
weight-decay parameter λ (set equal to 0.01) is used to modify the error function of the
network by penalizing large connection weights. The weight-decay technique improves the
optimization process and reduces the chances of developing a saturated network (i.e., all
outputs approaching zero or one).
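The error function minimized during training can be sketched as a cross-entropy term plus a weight-decay penalty. The penalty form used here (the decay parameter times the sum of squared weights) is an assumption about how the regularization enters the error function, and the function names are hypothetical.

```python
import math


def cross_entropy(targets, outputs, eps=1e-12):
    """Cross-entropy between observed 0/1 values and predicted probabilities."""
    total = 0.0
    for t, y in zip(targets, outputs):
        y = min(max(y, eps), 1.0 - eps)  # guard against log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total


def penalized_error(targets, outputs, weights, decay=0.01):
    """Cross-entropy plus an assumed sum-of-squares weight-decay penalty."""
    return cross_entropy(targets, outputs) + decay * sum(w * w for w in weights)


# Two hypothetical lakes (one presence, one absence) predicted at 0.5 each
base = cross_entropy([1, 0], [0.5, 0.5])
penalized = penalized_error([1, 0], [0.5, 0.5], [1.0, -2.0])
```

Large weights raise the penalized error even when the fit is unchanged, which is how weight decay discourages saturated outputs near zero or one.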
Model construction and validation
Jackknife or "leave-one-out" cross validation is employed to estimate the predictive
performance of each species model. This method excludes one observation, constructs the
model with the remaining observations, and then predicts the response of the excluded
observation using this model. This procedure is repeated N times so that each observation, in
turn, is excluded in model calibration and its response is predicted. N-fold cross validation is
used since it has been shown to produce nearly unbiased estimates of prediction error (Olden
and Jackson 2000). For all models, a decision threshold of 0.5 (predicted probability of
occurrence) is used to classify the species as present or absent in a lake. Both raw and
transformed data (all continuous predictor variables, except pH, were ln(x) transformed to
approximate normal distributions) are analyzed, and the data type exhibiting the greatest
predictive performance for any given method and species is retained for all subsequent
analyses (Appendix C). A total of 54 model construction and validation processes (including
the selection of optimal classification tree size and optimal number of hidden neurons in the
network) are calculated for each method, i.e., 27 species for each of the 2 data types (raw and
transformed).
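Leave-one-out validation with a 0.5 decision threshold can be sketched generically. The `fit` and `predict_prob` callables are placeholders for any of the four methods; the nearest-neighbour stand-in below is hypothetical and was not used in the study, and treating a probability of exactly 0.5 as a presence is an assumption.

```python
def leave_one_out(rows, labels, fit, predict_prob, threshold=0.5):
    """Predict each lake from a model fitted to all of the other lakes."""
    predictions = []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        model = fit(train_rows, train_labels)
        # Assumption: a probability exactly at the threshold counts as present
        predictions.append(1 if predict_prob(model, rows[i]) >= threshold else 0)
    return predictions


# Hypothetical stand-in model: probability is the label of the nearest lake
def fit_nearest(train_rows, train_labels):
    return list(zip(train_rows, train_labels))


def prob_nearest(model, row):
    nearest = min(model, key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], row)))
    return float(nearest[1])


# Hypothetical single-variable data set, for illustration only
lake_rows = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
lake_labels = [0, 0, 0, 1, 1, 1]
loo_predictions = leave_one_out(lake_rows, lake_labels, fit_nearest, prob_nearest)
```

The returned predictions, one per lake, are the inputs to the performance measures described in the next section.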
Model predictive performance
Predicting species occurrence - I partition the overall classification success of each species
model by deriving "confusion matrices" following Fielding and Bell (1997). Using these
matrices I examine three metrics of prediction success. First, I quantify the overall
classification performance of the model (CC) as the percentage of lakes where the model
correctly predicted the presence or absence of the species. Second, I examine the ability of
the model to predict species absence, termed model specificity (SP). Third, I examine the
ability of the model to predict species presence, termed model sensitivity (SE). Cohen's
kappa statistic was used to assess whether the performance of the model differs from
expectations based on chance alone (Titus et al. 1984). McNemar's test (with Yates
correction for continuity; Zar 1999) is used to compare differences in patterns of lake
misclassifications among LRA, LDA, CART and ANN for each species.
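The three classification metrics, Cohen's kappa and the Yates-corrected McNemar statistic can all be computed from observed and predicted presence/absence vectors. The sketch below uses hypothetical function names and made-up data; it returns the kappa value and the McNemar chi-square statistic only, without the associated significance tests.

```python
def confusion(observed, predicted):
    """Counts of true presences, false presences, false absences, true absences."""
    pairs = list(zip(observed, predicted))
    a = sum(1 for o, p in pairs if o == 1 and p == 1)
    b = sum(1 for o, p in pairs if o == 0 and p == 1)
    c = sum(1 for o, p in pairs if o == 1 and p == 0)
    d = sum(1 for o, p in pairs if o == 0 and p == 0)
    return a, b, c, d


def performance(observed, predicted):
    """Percent correct classification, specificity, sensitivity, and kappa."""
    a, b, c, d = confusion(observed, predicted)
    n = a + b + c + d
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return {
        "CC": 100.0 * (a + d) / n,
        "SP": 100.0 * d / (b + d) if (b + d) else float("nan"),
        "SE": 100.0 * a / (a + c) if (a + c) else float("nan"),
        "kappa": (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else float("nan"),
    }


def mcnemar_yates(observed, pred_1, pred_2):
    """Chi-square statistic (1 df) comparing which lakes two methods classify correctly."""
    triples = list(zip(observed, pred_1, pred_2))
    only_1 = sum(1 for o, p, q in triples if p == o and q != o)
    only_2 = sum(1 for o, p, q in triples if p != o and q == o)
    if only_1 + only_2 == 0:
        return 0.0
    return (abs(only_1 - only_2) - 1) ** 2 / (only_1 + only_2)


# Hypothetical observations and predictions for 10 lakes
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
method_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
method_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
metrics_a = performance(observed, method_a)
chi_square = mcnemar_yates(observed, method_a, method_b)
```

McNemar's test uses only the lakes on which the two methods disagree in correctness, which is why two methods with similar overall classification rates can still differ significantly in which lakes they misclassify.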
Predicting richness and community composition - Species richness is tabulated as the
number of species predicted to be present for each lake using the 27 species-habitat models,
and community composition was estimated by recording predicted species presence in each
lake. Predicted and actual community composition was compared using Procrustes analysis
(Jackson 1995) on lake scores from the axes from a principal coordinate analysis (PCoA)
based on Jaccard distance. Jaccard's similarity coefficient was transformed to a distance
measure by taking the square root of the complement (i.e., $\sqrt{1-s}$; Jackson et al. 1989).
PROTEST is a randomization test of matrix concordance incorporating Procrustean matrix
rotation (Gower 1975, Rohlf and Slice 1990, Jackson and Harvey 1993), which matches the
position of each lake in one multivariate space (i.e., predicted community composition) to the
position of the same lake in a second multivariate space (i.e., observed community
composition). The method minimizes the sum-of-the-squared deviations (i.e., $m^2$; Gower
1975) between the pair of points representing each lake such that the greater the similarity of
the multivariate configurations from the data sets, the lower the $m^2$ value. This measure is
compared with that derived from repeatedly randomizing the configuration from one matrix
and recalculating the $m^2$. The percentage of $m^2$ values equal to or less than the observed $m^2$
provides the significance level of the test (Jackson 1995). Predicted and actual community
compositions are compared for ordinations of all lakes, for lakes in each drainage basin, and
for species (i.e., patterns in lake classifications). Comparisons among species used all
dimensions from the PCoA whereas lake comparisons are limited to 6 dimensions due to the
matrix size. All statistical analyses were performed using S-Plus software (Mathsoft 1998,
version 4.5) and the PROTEST software available from the authors.
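Predicted richness and the Jaccard-based distance between two lakes can be sketched directly from presence/absence vectors. The function names and example vectors are hypothetical, and treating two lakes with no species as identical (distance 0) is an assumption, since the similarity coefficient is undefined in that case.

```python
import math


def richness(presence_row):
    """Number of species recorded (or predicted) as present in a lake."""
    return sum(presence_row)


def jaccard_distance(lake_a, lake_b):
    """Square root of the complement of Jaccard's similarity coefficient."""
    shared = sum(1 for x, y in zip(lake_a, lake_b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(lake_a, lake_b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # assumption: two lakes with no species are treated as identical
    return math.sqrt(1.0 - shared / either)


# Hypothetical presence/absence of five species in two lakes
lake_1 = [1, 1, 0, 1, 0]
lake_2 = [1, 0, 0, 1, 1]
distance = jaccard_distance(lake_1, lake_2)  # 2 shared of 4 species in either lake
```

A matrix of these pairwise distances among lakes is the input that a principal coordinate analysis would ordinate before the Procrustes comparison.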
RESULTS
Predicting species occurrence
On average, LRA, LDA, CART and ANN correctly predicted species occurrence (i.e.,
correct classification) and absence (i.e., specificity) in approximately 75-80% of the lakes,
but correctly predicted species presence (i.e., sensitivity) in only 40% of the lakes (Table
1.3). Appendix D contains the logistic regression and discriminant models for all species.
Rates of correct classification were substantially less variable compared to those for
specificity and sensitivity. There were no significant differences in overall correct
classification (Kruskal-Wallis, H=0.608, P=0.90), specificity (H=0.625, P=0.89) and
Table 1.3. Summary of predictive performance of species-habitat models. Reported values are percentage correctly classified (CC), specificity (SP: ability to accurately predict species absence) and sensitivity (SN: ability to accurately predict species presence). Predictions significantly different from random (based on Kappa statistic) are indicated in italics (α=0.05). Species codes are defined in Table 1.1.
Species   Logistic regression    Discriminant analysis   Classification tree    Neural network
Code      CC     SP     SN       CC     SP     SN        CC     SP     SN       CC     SP     SN
BCS BNS BSB BT BB B C CC CS F FSD GS ID LC LT LW LS NRD PD PKS RB RW SMB SL T-P
Mean      78.5   77.3   43.1     78.9   76.9   46.2      77.6   76.8   41.0     76.6   77.0   35.1
S.D.      9.9    27.5   32.8     9.0    26.6   31.7      10.2   26.3   35.0     10.9   30.0   36.2
sensitivity (H=2.745, P=0.43) among LRA, LDA, CART and ANN for all species, but
differences among methods exist for individual species.
Overall correct classification, specificity and sensitivity rates varied between species
and between methods (Table 1.3). Although many species were correctly classified with
equal success, directional strengths in their predictions (i.e., specificity, sensitivity) often
varied. For some species absence was better predicted (e.g., cisco, longnose sucker,
smallmouth bass), for some species presence was better predicted (e.g., brook trout, white
sucker, yellow perch) whereas others exhibited similar levels of specificity and sensitivity
(e.g., brown bullhead, lake trout). Cohen's kappa test showed that LRA, LDA, CART and ANN
produced 19, 21, 17 and 11 species-habitat models, respectively, whose predictions were
greater than expectations based on chance (Table 1.3). In addition to comparing the
performance of these methods, examining cases where a consensus across methods was
achieved provided an opportunity to better assess the power to predict species occurrence.
The list of species where all methods generated significant predictions of species occurrence,
and where all methods generated non-significant predictions is shown in Table 1.4.
McNemar's test showed that methods differed in patterns of misclassification for 10
out of the 27 species (Fig. 1.2). Of the 20 pair-wise differences between methods, 18 were
between linear and non-linear techniques, with the greatest number of discrepancies observed
for centrarchid spp., brook stickleback, pearl dace and yellow perch. Although overall
predictability of particular species (e.g., brook trout, burbot, blacknose shiner; Table 1.4) was
similar among the statistical methods, the lakes in which the methods misclassified these
species differed (Fig. 1.2).
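The comparison of misclassification patterns can be sketched as follows. McNemar's test uses only the lakes on which two methods disagree in correctness: lakes where method A is right and B is wrong, and the reverse. The function name, the use of a continuity correction, and the toy data are assumptions made for illustration; the source does not state which variant of the test was applied.

```python
import math

def mcnemar_test(observed, pred_a, pred_b):
    """McNemar's chi-square (1 df) for two methods' predictions on the same lakes.

    Hypothetical helper; the continuity correction is an assumed choice.
    """
    a_only = 0  # lakes classified correctly by A but not by B
    b_only = 0  # lakes classified correctly by B but not by A
    for o, pa, pb in zip(observed, pred_a, pred_b):
        a_ok, b_ok = (pa == o), (pb == o)
        if a_ok and not b_ok:
            a_only += 1
        elif b_ok and not a_ok:
            b_only += 1
    discordant = a_only + b_only
    if discordant == 0:
        return 0.0, 1.0  # the methods never disagree in correctness
    chi2 = max(abs(a_only - b_only) - 1, 0) ** 2 / discordant
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy data (assumed): 12 lakes where the species is present.
observed = [1] * 12
pred_a = [1] * 8 + [0] + [1] * 3   # A wrong in one lake
pred_b = [0] * 8 + [1] + [1] * 3   # B wrong in eight different lakes
chi2, p = mcnemar_test(observed, pred_a, pred_b)
print(round(chi2, 3), round(p, 4))
```

Two methods can have the same overall error rate and still differ under this test, because it compares which lakes are misclassified rather than how many.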
Frequency of species occurrence in the study lakes appeared to influence predictive
performance of the models, regardless of statistical method used. Rates of correct
classification showed a non-linear, U-shaped relationship with frequency of occurrence, with
greatest predictability for rare (< 20%) and widespread (> 80%) species, whereas specificity
showed a negative relationship and sensitivity a positive relationship with species occurrence
(Fig. 1.3).
Table 1.4. Species exhibiting high (significantly different from random) and low (not significantly different from random) predictability for logistic regression analysis, linear discriminant analysis, classification trees and artificial neural networks.
Species predictability
High                      Low
Blacknose shiner          Blackchin shiner
Brook trout               Finescale dace
Brown bullhead            Lake whitefish
Burbot                    Round whitefish
Cisco                     Splake
Common shiner
Creek chub
Lake trout
Northern redbelly dace
White sucker
[Figure 1.2. Species shown: blacknose shiner, brook trout, burbot, longnose sucker, brook stickleback, smallmouth bass, rock bass, pumpkinseed, yellow perch, pearl dace.]
Figure 1.2. Results from McNemar's test assessing differences among patterns of lake
misclassifications using logistic regression, discriminant analysis, classification trees and
artificial neural networks. Shared shading for a species represents pair-wise differences for
lakes in which species occurrence was incorrectly predicted (based on α = 0.05). For example,
blacknose shiner was misclassified in different sets of lakes using logistic regression and
classification tree.
[Figure 1.3. Panels: LRA, LDA, CART, ANN. Horizontal axis: species occurrence (%), 0 to 100.]
Figure 1.3. The effect of frequency of species occurrence on overall correct classification rate (top panel), specificity (middle panel) and sensitivity (lower panel) for 27 fish species using logistic regression (LRA), discriminant analysis (LDA), classification trees (CART) and artificial neural networks (ANN), expressed as percentage. Dotted lines represent expectations based on chance predictions.
Predicting species richness and community composition
Regressions between actual and predicted species richness showed that all methods
had significant relationships (Fig. 1.4). Comparing the regression lines to the 1:1 line of
perfect predictions showed that all methods tended to over-estimate species richness for
depauperate communities and under-estimate richness for speciose communities.
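The over- and under-estimation pattern follows directly from a fitted line with a positive intercept and a slope below one. The sketch below, with hypothetical function names and toy inputs, tabulates predicted richness by summing predicted presences per lake, fits an ordinary least-squares line, and finds the richness at which the fitted line crosses the 1:1 line. The LRA coefficients used at the end are those reported in the caption of Figure 1.4.

```python
def predicted_richness(presence_matrix):
    """Sum predicted presences (1/0) across species for each lake (one row per lake)."""
    return [sum(row) for row in presence_matrix]

def ols_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def crossover_richness(intercept, slope):
    """Actual richness at which the fitted line meets the 1:1 line (requires slope != 1)."""
    return intercept / (1.0 - slope)

# Toy example (assumed data): three lakes, four species models.
print(predicted_richness([[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]))

# Using the LRA line reported in Figure 1.4 (y = 3.24 + 0.73x):
x_star = crossover_richness(3.24, 0.73)
print(round(x_star, 1))
```

Under this reading, lakes with fewer species than the crossover value are predicted to hold more species than observed, and richer lakes fewer.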
Fish community composition was predicted with varying success depending on
geographic location (i.e., drainage) and statistical method used (Table 1.5). PROTEST
showed that for all the study lakes predictions of community composition matched closely
with actual community composition. More detailed examination showed that ability to
predict species membership in the lakes differed between the drainages. For example, the
communities of Bonnechere and Oxtongue drainages were better predicted compared to lakes
of Madawaska and Petawawa drainages for all approaches; however, notable exceptions
included the Madawaska drainage where LDA outperformed the other methods and the
Petawawa drainage where CART was more successful (Table 1.5).
Figure 1.5 shows the species plots from the PCoA of observed and predicted
community composition of the study lakes. Species occurring in the same lakes are
positioned close together on a plot, and species from different sets of lakes are positioned at
opposite ends of the plot. Observed and predicted community composition ordination
patterns show strong agreement and are significantly concordant for all methods (Table 1.5).
Figure 1.5 shows that brook trout, cisco, round whitefish and lake whitefish, all members of
the Salmonidae family, are clustered close together in all plots (although showing some
scatter for ANN), indicating that they are observed and predicted to exist in a similar set of
lakes. Similarly, rock bass, pumpkinseed and smallmouth bass, all members of the
Centrarchidae family, are observed and predicted to occur in the same lakes. In contrast,
white sucker and longnose sucker are known to be in the same lakes, yet are separated by a
greater distance in the predicted ordination space for all methods, indicating that they are
predicted to be members of different assemblages.
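The notion that species sharing lakes sit close together in the ordination rests on the Jaccard coefficient computed between species' occurrence vectors. A minimal sketch is given below; the function names and the occurrence data are hypothetical, the species codes are borrowed only as labels, and treating a pair with no presences at all as having zero similarity is an assumed convention.

```python
def jaccard_similarity(a, b):
    """Shared presences divided by lakes where either species is present.

    Joint absences are ignored. Returns 0.0 when neither species occurs anywhere
    (an assumed convention for this sketch).
    """
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0

def jaccard_distance_matrix(occurrence):
    """Pair-wise Jaccard distances (1 - similarity) between species.

    `occurrence` maps a species code to its 1/0 vector across lakes.
    """
    codes = sorted(occurrence)
    return {
        (s, t): 1.0 - jaccard_similarity(occurrence[s], occurrence[t])
        for s in codes
        for t in codes
    }

# Toy occurrences across four lakes (assumed, not the study data).
occurrence = {
    "RB": [1, 1, 0, 0],
    "PKS": [1, 1, 0, 0],
    "BT": [0, 0, 1, 1],
}
distances = jaccard_distance_matrix(occurrence)
print(distances[("RB", "PKS")], distances[("RB", "BT")])
```

A distance matrix of this kind, computed once from observed and once from predicted occurrences, is the input to the principal coordinate analyses being compared.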
Table 1.5. Comparison of observed and predicted fish community composition for lakes based on first six components from a principal co-ordinate analysis based on Jaccard's similarity coefficient. Comparable analyses based on lakes from all lakes and from each drainage basin. The range of total amounts of variation explained by the first six components of the between lakes matrices is summarized whereas all components were retained from the species ordinations. Reported values are Gower's m², ranging from 0 to 1 where 0 indicates perfect agreement, and associated significance levels based on 9,999 randomizations from PROTEST in parentheses.
Lakes            N     % variance   Logistic         Discriminant     Classification   Neural
                       explained    regression       analysis         tree             network
ALL              286   34 - 54
Amable du Fond   27    59 - 77
Bonnechere       10    83 - 94
Madawaska        128   54 - 67
Oxtongue         32    54 - 75
Petawawa         89    41 - 63
Species          27    100          0.050 (0.0001)   0.046 (0.0001)   0.051 (0.0001)   0.052 (0.0001)
Other legible cell values (unplaced): 0.891 (0.0001), 0.755 (0.0121), 0.404 (0.1593), 0.859 (0.0001), 0.720 (0.0002), 0.870 (0.0001)
[Figure 1.4. One panel per method. Horizontal axis: actual species richness. Legible panel annotation: r = 0.647, P < 0.001.]
Figure 1.4. Regression analysis between actual and predicted species richness using logistic regression (LRA: y=3.24+0.73x), discriminant analysis (LDA: y=2.96+0.75x), classification trees (CART: y=2.85+0.78x) and artificial neural networks (ANN: y=2.64+0.86x). Solid line represents regression line and the dashed line represents a line of perfect match (1:1). Predicted richness was tabulated by summing predicted species presences using models for each of the 27 species.
[Figure 1.5. Ordination plots of species codes for observed and predicted community composition.]
Figure 1.5. First two axes from principal coordinate analysis based on Jaccard's distance of known or observed species composition (A), composition predicted by logistic regression model, predicted by discriminant analysis (B), predicted by classification trees (C) and predicted by artificial neural networks (D). Letters refer to species codes listed in Table 1.1.
DISCUSSION
Comparison of modeling approaches
Traditional (i.e., linear) and alternative (i.e., non-linear) approaches to developing
predictive models should be viewed as both competitive and complementary methodologies
for establishing quantitative linkages between fish and their environment. I found that
average predictive performances of LRA, LDA, CART and ANN were similar across all
species, although the predictability for individual species varied greatly. Therefore, neither
linear nor non-linear approaches were optimal for all species. Indeed, recent studies
modeling species presence or absence have illustrated the predictive advantages of LRA
(e.g., Manel et al. 1999a, Özesmi and Özesmi 1999), LDA (e.g., Reichard and Hamilton
1997, Scheller et al. 1999), CART (e.g., Rejwan et al. 1999) and ANN (e.g., Mastrorillo et al.
1997) relative to their linear or non-linear counterparts. Where the underlying data structure
and assumptions are met for a particular traditional statistical method, there is no reason to
believe that major differences between traditional and alternative techniques should exist
(e.g., Smith et al. 1997, Manel et al. 1999b, Rathert et al. 1999). For example, one might
expect LRA and LDA to perform well where linear relationships exist, whereas CART and
ANN should prove better in non-linear situations. Interestingly, the results showed 10
species, including brook trout, smallmouth bass and yellow perch, whose patterns of lake
misclassification differed among the statistical methods. The majority of these differences
occur in contrasts of linear and non-linear approaches, indicating that although average rates
of correct classification were similar, some methods may be superior depending on the shape
of the species response curve to habitat conditions. Ultimately, more direct comparisons
based on simulation studies and using a wider array of field data sets are required to
accurately address differences among traditional and alternative approaches.
I reiterate Ripley (1994) in saying that most data sets are expensive to collect both in
terms of time and money, and that more effort should be spent in choosing and comparing
different statistical methods which best suit the particular question of interest and
characteristics of the data at hand. In the discussion that follows I focus on examining results
common to all four modeling approaches, motivated by the belief that a consensus of
methods can provide us with a greater degree of confidence that patterns in species and
community predictability are ecologically meaningful, and not statistical artifacts of the
methods employed.
Predicting species occurrence
My study shows that whole-lake attributes can be used successfully to predict species
presence or absence. For many species the occurrence in any particular lake was predicted
with high success, whereas for other species there remained a large degree of uncertainty in
the prediction. Species such as smallmouth bass, burbot, brook trout and lake trout were
correctly classified in approximately 7540% of the lakes, which is an especially promising
result given the economic and societal importance of such species. Although these models
are correlative, and thus I cannot infer causation or interpret the underlying
mechanisms (Cale et al. 1989), the results are consistent with findings from many studies of
temperate fish populations (Jackson and Harvey 1989, Tonn et al. 1990, Magnuson et al.
1998). For instance, smallmouth bass and lake trout are known to be influenced by overall
lake size (i.e., area, volume, maximum depth and shoreline perimeter) since these
morphological features alter the mixing characteristics and hence the thermal regime of lakes
(e.g., Eadie and Keast 1984, Jackson and Harvey 1989). In addition, lake area and depth
serve as an indirect measure of the diversity of habitats available in lakes, which may be
important to support the small-bodied forage fish upon which smallmouth bass and lake
trout feed.
Although the predictive abilities of conventional models for species presence or
absence are commonly assessed from overall classification rates alone, I show that by
partitioning the predictive performance of the models into measures such as sensitivity and
specificity, I can assess more readily the strengths and weaknesses of the models and better
evaluate their applicability. For example, the presence of brook trout and yellow perch was
predicted with a high degree of certainty (93-94% of the lakes), yet predicting the absence of
these species was more difficult. Conversely, smallmouth bass and rock bass exhibited high
rates of correct classification when absent (95-98%), but were poorly predicted when present
in a lake. Lake trout and brown bullhead had similar levels of correct classification,
sensitivity and specificity, indicating good model stability for predicting both species
presence and absence.
I found that the clearest determinant of prediction success was the frequency of
species occurrence in the study lakes. Model sensitivity increases and specificity decreases
with increasing species frequency of occurrence, whereas overall rates of correct
classification were highest for rare and common species and decreased as species occurrence
approached 50%. This relationship is expected, yet is seldom considered in distributional
modeling (but see Fielding and Bell 1997, Manel et al. 1999b). For example, for rare species
(e.g., 5% occurrence), a naively constructed simple model might predict the species being
absent in all lakes, resulting in a 95% correct classification rate. Similarly, a common species
found in 95% of the lakes might be predicted to occur in all lakes, and again, the model will
correctly classify 95% of the lakes. Finally, using such a simple model for a species
occurring in 50% of the lakes would result in the maximum error rate. This type of model
would provide a U-shaped response between the overall correct classification rate and the
frequency of species occurrence. Although the dependency of model-prediction success on
species frequency of occurrence is unavoidable, it is commonly overlooked. Thus, it is
imperative that model performance is tested against expectations based on chance. In this
study I used Cohen's kappa statistic to assess the significance of model predictive
performance, although a number of other approaches may be employed.
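The U-shaped expectation described above can be reproduced with a few lines of Python. The sketch assumes the simplest possible naive model, one that predicts the more common state (presence or absence) in every lake; the function name is hypothetical.

```python
def naive_model_rates(prevalence):
    """CC, SN and SP (%) for a model that always predicts the more common state.

    `prevalence` is the proportion of lakes in which the species occurs (0 to 1).
    """
    if prevalence >= 0.5:
        # Always predict presence: every presence is right, every absence is wrong.
        return 100.0 * prevalence, 100.0, 0.0
    # Always predict absence: every absence is right, every presence is wrong.
    return 100.0 * (1.0 - prevalence), 0.0, 100.0

for percent in (5, 20, 50, 80, 95):
    cc, sn, sp = naive_model_rates(percent / 100.0)
    print(f"occurrence {percent:3d}%: CC={cc:5.1f}  SN={sn:5.1f}  SP={sp:5.1f}")
```

Correct classification is highest at the extremes and lowest at 50% occurrence even though this model uses no habitat information at all, which is why a chance-corrected measure such as kappa is needed to judge real models.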
There are a number of practical implications for the relationship between prediction
success and species frequency of occurrence. First, a decrease in model sensitivity for rare
species implies that it will be more difficult to predict the occurrence of organisms whose
conservation and management is perhaps the most critical. This finding has great importance
in developing models for guiding searches for populations in previously unsampled areas and
for indicating site suitability for the reintroduction of rare species (e.g., Hill and Keddy 1992,
Wiser et al. 1998) since the predictive ability of the models will be limited (Scott et al. 1987).
Second, drawing inferences about observed absences of species from sites containing suitable
habitat conditions (e.g., indirect evidence for dispersal, predation, competition) could be
limited if the models exhibit poor specificity. Finally, examining alternative measures of
prediction success can provide more accurate comparisons of different modeling approaches
(e.g., Manel et al. 1999b) and among species. For instance, I found that although the overall
correct classification rates for some species were similar, levels of specificity and sensitivity
were often quite different. By quantifying and examining alternative measures of model
prediction success, we can gain additional insight into the determinants shaping species
occurrence (e.g., Manel et al. 1999a, b, Scheller et al. 1999), which ultimately can lead to the
development of more robust predictive models.
The development of powerful, predictive models for species occurrence will add to
our knowledge of the distribution and habitat requirements of species, as well as serve to
focus research both in terms of observational and experimental studies by identifying gaps in
our knowledge and helping to narrow our examination of causal mechanisms shaping fish
community structure in freshwater ecosystems. Predictive models also play an increasing
role in the conservation and management of fish populations by providing first-order
estimates of habitat suitability, which could then be followed by ground truthing and field
validation, in order to predict sites with available spawning habitat (e.g., Knapp and Preisler
1999) or to establish potential areas for species restoration or reintroductions. Similarly,
model predictions can be used to estimate the likelihood of local establishment and spread of
exotic species, which may help set conservation priorities for preserving vulnerable species
and populations that might be lost locally (e.g., Hrabik and Magnuson 1999).
Although I have not included all factors that may influence the occurrence of a fish
species at a given location, the variables used in my study do successfully classify most
occurrences or, alternatively, are closely correlated with other factors that do discriminate
between presence and absence. The inclusion of other environmental variables (e.g.,
temperature, dissolved oxygen concentrations) and isolation-related factors (e.g., distance to
nearest lake of equal or greater size, number of inlets/outlets) may improve the predictive
performance of the models in terms of overall correct classification, specificity and
sensitivity. Examining the cases where the predictions of the models did not fit expectations
may provide important insight into the importance and causes of this unaccounted variation (e.g.,
Boone and Krohn 1999, Rahel and Nibbelink 1999).
Predicting species richness
Predictions of species richness in my study were generated by tabulating predicted
species presences using the individual models for the 27 species. I found good correlations
between predicted and actual richness values in the study lakes; however, the strength of the
correlations was not as high as in other fish studies conducted in this region (e.g., Eadie and
Keast 1984, Matusek and Beggs 1988, Minns 1989). This difference can be attributed to the
way species richness was predicted. My study took an individualistic approach by modeling
species-habitat relationships separately, rather than employing the conventional approach
where species richness is predicted as a continuous variable with all species considered
collectively in a generic fashion. As such, my models provide a more detailed examination
of the direct role of habitat constraints on determining the presence or absence of individual
fish species, as compared to treating richness collectively across all species, which accounts
only for overall limitations on species richness. This distinction can be particularly important
since the composition of ecological communities can differ and change in time and space
without corresponding effects on species richness (e.g., Kadmon and Pulliam 1993, York
2000). Furthermore, predicting individual species membership in ecological communities
can be important since species-rich assemblages often do not include those species in most
need of conservation (Prendergrast et al. 1993), and perhaps species should not be assigned
equal conservation value in predictive models.
Predicting community composition
A number of multivariate approaches, such as canonical correspondence analysis and
the Mantel test, have been used by ecologists to relate the structure of fish communities to
the environment (e.g., Jackson and Harvey 1993, Hinch et al. 1994, Matthews and Robison
1998, Pusey et al. 2000). These studies help us understand the major factors correlating with and
possibly structuring fish communities, but do not produce a framework from which
individual species membership of such communities can be predicted. I see two general
approaches for developing predictive models for species composition of communities.
First, assemblages can be classified based on groups of species showing shared
patterns of co-occurrence prior to relating these assemblage types to environmental
conditions. Commonly called community classification, such studies have used a number of
statistical techniques (e.g., clustering, ordination) to determine assemblages based on patterns
of species composition. Examples of community classification in fish studies include Tonn
and Magnuson (1982), who grouped northern Wisconsin assemblages into three types, which
they referred to as "mudminnow", "bass" and "pike", and Capone and Kushlan (1991), who
classified three communities: "mosquitofish", "black bullhead" and
"sunfish-shiner-mosquitofish" in dry-season stream pools of northeast Texas. In both studies assemblages
were defined by their characteristic or dominant species. Magnuson et al. (1998) constructed
8 community types for small forest lakes of Finland and Wisconsin, and developed
classification models for predicting fish assemblage type based on environmental variables.
The RIVPAC program (Wright et al. 1984) relies on the identification of classification
groups of macro-invertebrate communities. Reducing species composition into a small set of
assemblage types can be a powerful approach for making the study of complex multi-species
assemblages more tractable (Poff 1997). Models are believed to exhibit greater predictive
power for classifying lakes into assembly types since they only must discriminate among a
smaller number of groups. Similar arguments have been made for the advantages of
developing predictive models for guilds (Austen et al. 1994).
Probably the greatest disadvantage of aggregating species into assemblage types is
the loss of information about individual species (Hay 1994). Moreover, if such species
associations are not a product of natural processes and are not temporally and spatially stable,
any prediction of community state or change based on that assemblage type might potentially
have little relevance to community dynamics (Austen et al. 1994). Often, a subjective
decision is required to determine the degree of similarity necessary to group species together.
For instance, Angermeier and Winston (1999) found over 90 statistically distinct fish
assemblage types in Virginia and emphasized that the division of communities into discrete
assemblages is arbitrary in that different sets of environmental factors at different spatial
scales would yield greater or fewer types of communities. Thus, the selection of the criteria
used to define assemblage types is a particularly important consideration when developing
predictive models. For example, after the models have been developed they may not be
applicable to systems outside the sites used to classify the communities due to different
species pools and thus different assemblage types. I found that the prediction success of
community composition varied across drainages, reinforcing the fact that the identification of
community types is perhaps a spatially dependent process (Angermeier and Winston 1999).
Tonn et al. (1990) were unable to identify discrete fish assemblage types in small Finnish
lakes using species presence-absence data. Similarly, Pusey et al. (2000) found that stream
fish assemblage structure in northeastern Australia did not represent discrete assemblages but
was composed of species varying along individual environmental gradients. In such cases,
it is possible that better predictive models for community composition may be achieved by
modelling the occurrence of individual species rather than whole assemblages.
My study took this second approach by modeling species occurrence individually,
and applying each species-habitat model separately to predict community composition. I
found that groups of species exhibited similar patterns of observed and predicted occurrence,
suggesting common responses to habitat features across lakes. In this case, patterns of
species co-occurrence could facilitate the grouping of organisms for the purpose of
community prediction. However, some approaches predicted patterns in species
co-occurrence that were not observed (e.g., longnose sucker, white sucker). In these cases the
species show more unique responses to habitat features (e.g., Pusey et al. 2000).
Constructing communities based on predictions from individual species-habitat
models aids in identifying patterns of shared habitat among the species; however, these
approaches generally ignore interactions (e.g., competition, predation) among species.
Consequently, identifying sites where there is likely to be suitable habitat that is unoccupied
suggests a non-habitat-related mechanism for their absence, such as dispersal limitation, past
extinction events, predation or competition (Wiser et al. 1998). Absences of species in sites
containing members of the same family or guild provide even stronger evidence for the
importance of alternative mechanisms shaping patterns in species occurrence. Therefore,
comparing observed and predicted community composition based on environmental factors
could help tease apart the relative importance of habitat heterogeneity and other interactions
shaping the structure of fish assemblages, and can lead to a greater understanding of the role of
the environment in mediating patterns in species distributions and community composition.
Although the majority of conservation efforts tend to focus on particular species or
populations (Angermeier and Schlosser 1995), a community-level approach to the
conservation of aquatic biodiversity would be a major advance relative to current
management programs (Franklin 1993, Angermeier and Winston 1999), and thus predictive
models for fish communities could play a critical role. Though discussion of the potential
applications of community models is beyond the scope of this paper, I reiterate Tonn
(1990) in saying that purely reductionist attempts to understand and predict community
organization in terms of individuals and single-species populations may reduce our abilities
to identify larger-scale patterns and thus to fully understand the structure and function of
communities. Indeed, Evans et al. (1987) provided a convincing argument for using fish
communities as ecologically derived units for freshwater fisheries management. However,
the major prerequisite for such an approach is that discrete communities can actually be
recognized and defined (Jackson et al. 1992), a prerequisite which is not always met.
Therefore, a combined approach of using individual species-habitat models and community
classifications for predicting species assemblages might be warranted.
CONCLUSION
I have shown that statistical modeling approaches have promise in providing testable,
predictive models for fish population and community ecology. Although overall (i.e., for all
species) differences in predictive performance among the approaches are minimal,
differences did exist for individual species and in patterns of lake misclassifications. Given
that these comparisons are based solely on empirical data, advocating the use of one
approach is inappropriate. However, I found that linear and non-linear approaches provide
complementary tools for modeling fish occurrence based on whole-lake measures of habitat.
Detailed evaluation of species-habitat models shows that by partitioning the predictive
performance of the models into measures such as specificity and sensitivity, the strengths and
weaknesses of the models can be assessed more readily within and across species. Predictive
models can play an important role in guiding the direction of future research and aiding in the
management of fishery resources. More effective conservation of aquatic biodiversity will
require new approaches that recognize the value of both species and assemblages, and that
emphasize the protection of key regional-scale processes (Angermeier and Winston 1999).
Developments in these areas require an increased reliance on probabilistic models and will
represent an important advance in both population and community ecology. Ecologists must
try to reduce the uncertainty in their predictive models by collecting and including
appropriate variables in predictive models, and teasing apart the interactions between
temporal and spatial processes occurring on different scales (Tonn 1990, Poff 1997). This
will increase our understanding of the applicability of such models for predicting species
occurrence and community composition, as well as provide greater insight into the nature of
species-environment interactions.
ACKNOWLEDGEMENTS
I would like to thank Nick Mandrak for providing the Algonquin Park Fish Inventory Data
Base, and Laura Hatt and Bob Bailey for their comments on an earlier draft. Special thanks
to Pedro Peres-Neto for writing and providing the PROTEST software. Funding for this
research was provided by an NSERC Graduate Scholarship to J.D. Olden, and an NSERC
Research Grant to D.A. Jackson.
REFERENCES
Angermeier, P. L., and I. J. Schlosser. 1995. Conserving aquatic biodiversity: beyond species
and populations. American Fisheries Society Symposium 17:402-414.
Angermeier, P. L., and M. R. Winston. 1999. Characterizing fish community diversity across
Virginia landscapes: Prerequisite for conservation. Ecological Applications 9:335-349.
Austen, D. J., P. B. Bayley, and B. W. Menzel. 1994. Importance of the guild concept to fisheries
research and management. Fisheries (Bethesda) 19:12-20.
Bailey, R. C., M. G. Kennedy, M. Z. Dervish, and R. M. Taylor. 1998. Biological assessment
of freshwater ecosystems using a reference condition approach: comparing predicted and
actual benthic invertebrate communities in Yukon streams. Freshwater Biology 39:765-774.
Bishop, C. M. 1995. Neural networks for pattern recognition. Clarendon Press, Oxford.
Boone, R. B., and W. B. Krohn. 1999. Modeling the occurrence of bird species: Are the
errors predictable? Ecological Applications 9:835-848.
Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and
regression trees. Wadsworth, Belmont, California, USA.
Cale, W. G., G. M. Henebry, and J. A. Yeakley. 1989. Inferring process and pattern in
natural communities. BioScience 39:600-605.
Capone, T. A., and J. A. Kushlan. 1991. Fish community structure in dry-season stream
pools. Ecology 72:983-992.
Cheng, B., and D. M. Titterington. 1994. Neural networks: A review from a statistical
perspective (with discussion). Statistical Science 9:2-54.
Chessman, B. C. 1999. Predicting the macroinvertebrate faunas of rivers by multiple
regression of biological and environmental differences. Freshwater Biology 41:747-757.
Ciampi, A. 1991. Generalized regression trees. Computational Statistics and Data Analysis
12:57-78.
Crossman, E. J., and N. E. Mandrak. 1991. An analysis of fish distribution and community
structure in Algonquin Park: annual report for 1991 and completion report, 1989-1991.
Ontario Ministry of Natural Resources, Toronto, Ontario, Canada.
Dodge, D. P., G. A. Goodchild, I. MacRitchie, J. C. Tilt, and D. G. Waldriff. 1985. Manual
of instructions: aquatic habitat inventory surveys. Ontario Ministry of Natural Resources,
Fisheries Branch, Toronto, Ontario, Canada.
Dunham, J. B., and B. E. Rieman. 1999. Metapopulation structure of bull trout: Influences of
physical, biotic, and geometrical landscape characteristics. Ecological Applications
9:642-655.
Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.
Canadian Journal of Zoology 62:1689-1695.
Emmons, E. E., M. J. Jennings, and C. Edwards. 1999. An alternative classification method
for northern Wisconsin lakes. Canadian Journal of Fisheries and Aquatic Sciences
56:661-669.
Evans, D. O., B. A. Henderson, N. J. Bax, T. R. Marshall, R. T. Oglesby, and W. J. Christie.
1987. Concepts and methods of community ecology applied to freshwater fisheries
management. Canadian Journal of Fisheries and Aquatic Sciences 44 (Suppl. 2):448-470.
Fielding, A. H., and J. F. Bell. 1997. A review of methods for the assessment of prediction
errors in conservation presence/absence models. Environmental Conservation 24:38-49.
Franklin, J. F. 1993. Preserving biodiversity: species, ecosystems, or landscapes? Ecological
Applications 3:202-205.
Gower, J. C. 1975. Generalized procrustes analysis. Psychometrika 40:33-51.
Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity
predict global fish diversity. Nature 391:382-384.
Hand, D. J. 1997. Construction and assessment of classification rules. John Wiley and Sons,
Chichester.
Hay, M. E. 1994. Species as 'noise' in community ecology: do seaweeds block our view of
the kelp forest? Trends in Ecology and Evolution 9:414-416.
Hinch, S. G., K. M. Somers, and N. C. Collins. 1994. Spatial autocorrelation and assessment
of habitat-abundance relationships in littoral zone fish. Canadian Journal of Fisheries and
Aquatic Sciences 51:701-712.
Hill, N. M., and P. A. Keddy. 1992. Prediction of rarities from habitat variables: Coastal
plain plants on Nova Scotian lakeshores. Ecology 73:1852-1857.
Hornick, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are
universal approximators. Neural Networks 2:359-366.
Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt
(Osmerus mordax) in a northern Wisconsin lake district and implications for
management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.
Jackson, D. A. 1995. PROTEST: A procrustean randomization test of community
environment concordance. Écoscience 2:297-303.
Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:
local versus regional processes. Ecology 70:1472-1484.
Jackson, D. A., and H. H. Harvey. 1993. Fish and benthic invertebrates: community
concordance and community-environment relationships. Canadian Journal of Fisheries
and Aquatic Sciences 50:2641-2651.
Jackson, D. A., K. M. Somers, and H. H. Harvey. 1989. Similarity coefficients: Measures of
co-occurrence and association or simply measures of occurrence. American Naturalist
133:436-453.
Jackson, D. A., K. M. Somers, and H. H. Harvey. 1992. Null models and fish communities:
Evidence of nonrandom patterns. American Naturalist 139:930-951.
James, F. C., and C. E. McCulloch. 1990. Multivariate analysis in ecology and systematics:
panacea or Pandora's box? Annual Reviews in Ecology and Systematics 21:129-166.
Kadmon, R., and H. R. Pulliam. 1993. Island biogeography: Effect of geographical isolation
on species composition. Ecology 74:977-981.
Keddy, P. A. 1992. Assembly and response rules: two goals for predictive community
ecology. Journal of Vegetation Science 3:157-164.
Knapp, R. A., and H. K. Preisler. 1999. Is it possible to predict habitat use by spawning
salmonids? A test using California golden trout (Oncorhynchus mykiss aguabonita).
Canadian Journal of Fisheries and Aquatic Sciences 56:1576-1581.
Kruse, C. G., W. A. Hubert, and F. J. Rahel. 1997. Geomorphic influences on the distribution
of Yellowstone cutthroat trout in the Absaroka Mountains, Wyoming. Transactions of the
American Fisheries Society 126:418-427.
Kurková, V. 1992. Kolmogorov's theorem and multilayer neural networks. Neural Networks
5:501-506.
Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996.
Application of neural networks to modelling nonlinear relationships in ecology.
Ecological Modelling 90:39-52.
Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.
Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology
79:2941-2956.
Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999a. Comparing discriminant
analysis, neural networks and logistic regression for predicting species distribution: a
case study with a Himalayan river bird. Ecological Modelling 120:337-347.
Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999b. Alternative methods for
predicting species distribution: an illustration with Himalayan river birds. Journal of
Applied Ecology 36:734-747.
Mastrorillo, S., S. Lek, F. Dauba, and A. Beland. 1997. The use of artificial neural networks
to predict the presence of small-bodied fish in a river. Freshwater Biology 38:237-246.
Matthews, W. J., and H. W. Robison. 1998. Influence of drainage connectivity, drainage area
and regional species richness on fishes of the interior highlands in Arkansas. The
American Midland Naturalist 139:1-19.
Matuszek, J. E., and G. L. Beggs. 1988. Fish species richness in relation to lake area, pH, and
other abiotic factors in Ontario lakes. Canadian Journal of Fisheries and Aquatic Sciences
45:1931-1941.
Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of
the American Fisheries Society 118:533-545.
Morgan, J. N., and J. A. Sonquist. 1963. Problems in the analysis of survey data, and a
proposal. Journal of the American Statistical Association 58:415-434.
Moss, D. M., J. F. Wright, M. T. Furse, and R. T. Clarke. 1999. A comparison of alternative
techniques for prediction of the fauna of running-water sites in Great Britain. Freshwater
Biology 41:167-181.
Olden, J. D., and D. A. Jackson. 2000. Torturing data for the sake of generality: How valid
are our regression models? Écoscience 7 (in press).
Orians, G. H. 1980. Micro and macro in ecological theory. Bioscience 30:79.
Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat
modelling with interspecific interaction. Ecological Modelling 116:15-31.
Poff, N. L. 1997. Landscape filters and species traits: towards mechanistic understanding
and prediction in stream ecology. Journal of the North American Benthological Society
16:391-409.
Pickett, S. T. A., J. Kolasa, and C. G. Jones. 1994. Ecological understanding: the nature of
theory and the theory of nature. Academic Press, New York.
Prendergrast, J. R., R. M. Quinn, J. K. Lawton, B. C. Eversham, and D. W. Gibbons. 1993.
Rare species, the coincidence of diversity hotspots and conservation strategies. Nature
365:335-337.
Pusey, B. J., M. J. Kennard, and A. H. Arthington. 2000. Discharge variability and the
development of predictive models relating stream fish assemblage structure to habitat in
northeastern Australia. Ecology of Freshwater Fish 9:30-50.
Rahel, F. J., and N. P. Nibbelink. 1999. Spatial patterns in relations among brown trout
(Salmo trutta) distribution, summer air temperature, and stream size in Rocky Mountain
streams. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.
Rathert, D., D. White, J. C. Sifneos, and R. M. Hughes. 1999. Environmental correlates of
species richness for native freshwater fish in Oregon, U.S.A. Journal of Biogeography
26:257-273.
Reichard, S. H., and C. W. Hamilton. 1997. Predicting invasions of woody plants introduced
into North America. Conservation Biology 11:193-203.
Rejwan, C., N. C. Collins, L. Brunner, B. J. Shuter, and M. S. Ridgway. 1999. Tree
regression analysis on the nesting habitat of smallmouth bass. Ecology 80:341-348.
Ripley, B. D. 1994. Neural networks and related methods for classification (with discussion).
Journal of the Royal Statistical Society, Series B 56:409-456.
Rodriguez, M. A., and W. M. Lewis. 1997. Structure of fish assemblages along
environmental gradients in floodplain lakes of the Orinoco River. Ecological
Monographs 67:109-128.
Rohlf, F. J., and D. Slice. 1990. Extensions of the procrustes method for the optimal
superimposition of landmarks. Systematic Zoology 39:40-59.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by
back-propagating errors. Nature 323:533-536.
Scheller, R. M., V. M. Snarski, J. G. Eaton, and G. W. Oehlert. 1999. An analysis of the
influence of annual thermal variables on the occurrence of fifteen warmwater fishes.
Transactions of the American Fisheries Society 128:257-264.
Scott, J. M., B. Csuti, J. D. Jacobi, and J. E. Estes. 1987. Species richness: a geographic
approach to protecting future biological diversity. BioScience 37:782-788.
Smith, S. J., S. J. Iverson, and W. D. Bowen. 1997. Fatty acid signatures and classification
trees: new tools for investigating the foraging ecology of seals. Canadian Journal of
Fisheries and Aquatic Sciences 54:1377-1386.
Titus, K., J. A. Mosher, and B. K. Williams. 1984. Chance-corrected classification for use in
discriminant analysis: Ecological applications. American Midland Naturalist 111:1-7.
Toner, M., and P. A. Keddy. 1997. River hydrology and riparian wetlands: A predictive
model for ecological assembly. Ecological Applications 7:236-246.
Tonn, W. M. 1990. Climate change and fish communities: A conceptual framework.
Transactions of the American Fisheries Society 119:337-352.
Tonn, W. M., and J. J. Magnuson. 1982. Patterns in the species composition and richness of
fish assemblages in northern Wisconsin lakes. Ecology 63:1149-1166.
Tonn, W. M., J. J. Magnuson, M. Rask, and J. Toivonen. 1990. Intercontinental comparison
of small-lake fish assemblages: The balance between local and regional processes.
American Naturalist 136:345-375.
Wiens, J. A. 1992. Ecology 2000: an essay on future directions in ecology. Bulletin of the
Ecological Society of America 73:165-170.
Wiser, S., R. K. Peet, and P. S. White. 1998. Prediction of rare-plant occurrence: A southern
Appalachian example. Ecological Applications 8:909-920.
Wright, J. F., D. Moss, P. D. Armitage, and M. T. Furse. 1984. A preliminary classification
of running-water sites in Great Britain based on macro-invertebrate species and the
prediction of community type using environmental data. Freshwater Biology 14:221-256.
York, A. 2000. Long-term effects of frequent low-intensity burning on ant communities in
coastal blackbutt forests of southeastern Australia. Austral Ecology 25:93-98.
Zar, J. H. 1999. Biostatistical Analysis, 4th edition. Prentice Hall, New Jersey.
CHAPTER 2
Illuminating the "black box": A randomization approach for
understanding variable contributions in artificial neural networks
ABSTRACT
With the growth of statistical modelling in the ecological sciences, researchers have
begun to use more complex methods, such as artificial neural networks (ANNs), to address
problems associated with pattern recognition and prediction. Although in many studies
ANNs have been shown to exhibit superior predictive power compared to traditional
approaches, they have also been labeled a "black box" because they provide little explanatory
insight into the relative influence of the independent variables in the prediction process. This
lack of explanatory power is a major concern to ecologists since the interpretation of
statistical models is desirable for gaining knowledge of the causal relationships driving
ecological phenomena. In this study, I describe a number of methods for understanding the
mechanics of ANNs (e.g., Neural Interpretation Diagram, Garson's algorithm, sensitivity
analysis). Next, I propose and demonstrate a randomization approach for statistically
assessing the importance of axon connection weights and the contribution of input variables
in the neural network. This approach provides researchers with the ability to eliminate null
connections between neurons whose weights do not significantly influence the network
output (i.e., predicted response variable), thus facilitating the interpretation of individual and
interacting contributions of the input variables in the network. Furthermore, the
randomization approach can identify variables that significantly contribute to network
predictions, thereby providing an input-variable selection method for ANNs. I show that by
extending randomization approaches to artificial neural networks, the "black box" mechanics
of ANNs can be greatly illuminated. Thus, by coupling this new explanatory power of neural
networks with its strong predictive abilities, ANNs promise to be a valuable quantitative tool
to evaluate, understand, and predict ecological phenomena.
INTRODUCTION
Artificial neural networks (ANNs) have begun to receive a great deal of attention in
the ecological sciences as a powerful, flexible, statistical modelling technique for uncovering
patterns in data (Colassanti, 1991; Edwards and Morse, 1995; Lek et al., 1996a; Lek and
Guégan, 1999). This increased interest in ANNs was demonstrated recently during the first
international workshop on the applications of neural networks to ecological modelling
(conference papers are found in Ecological Modelling: Volume 120, Issue 2-3). The utility
of ANNs for solving complex pattern recognition problems has been demonstrated in many
terrestrial (e.g., Paruelo and Tomasel, 1997; Özesmi and Özesmi, 1999; Manel et al., 1999;
Spitz and Lek, 1999) and aquatic studies (e.g., Lek et al., 1996a;b; Scardi, 1996; Bastarache
et al., 1997; Mastrorillo et al., 1998; Chen and Ware, 1999; Gozlan et al., 1999), and has led
many researchers to advocate ANNs as an attractive, non-linear alternative to traditional
statistical methods.
The primary application of ANNs has been for developing predictive models to
forecast future values of a particular response variable for a given set of independent
variables. Although the predictive value of ANNs has great appeal to many ecologists,
especially in applied areas, researchers have often criticized the explanatory value of ANNs,
calling it a 'black box' approach to modelling ecological phenomena (e.g., Paruelo and
Tomasel, 1997; Lek and Guégan, 1999; Özesmi and Özesmi, 1999). This view stems from
the fact that the contribution of the input variables in predicting the value of the output (i.e.,
response variable) is difficult to disentangle within the network. Consequently, input
variables are often entered into the network and a response value is generated without
gaining any understanding of the relationships between the independent and response
variables, and therefore, providing no explanatory insight into the underlying mechanisms
being modelled by the network (Anderson, 1995; Bishop, 1995; Ripley, 1996). This is a
major pitfall of ANNs since traditional statistical approaches can readily identify the
influence of the independent variables in the modelling process, as well as provide a measure
of the degree of confidence regarding their contribution. Currently, there is a lack of
theoretical or practical ways to partition the contributions of the independent variables in
ANNs (Smith, 1994). This is a substantial drawback in the ecological sciences where the
interpretation of statistical models is desirable for gaining insight into causal relationships
driving ecological phenomena.
Recently, a number of authors have proposed methods for selecting the best network
architecture (i.e., number of neurons and topology of connections) among a set of candidate
networks. Examples include a number of statistical methods such as asymptotic comparison
techniques, approximate Bayesian analysis and cross validation (e.g., Dimopoulos et al.,
1995; see Bishop, 1995 for review). However, methods for interpreting the relative
contribution of predictor variables in the network are more complicated, and as a result are
rarely used in ecological studies. Computationally intensive approaches such as growing and
pruning algorithms (Bishop, 1995), partial derivatives (e.g., Dimopoulos et al., 1995; 1999)
and asymptotic t-tests are used less often than simpler algorithms that rely on network
connection weights (e.g., Garson's algorithm: Garson, 1991) or on sensitivity analysis to
determine the entire spectrum over which each variable influences the network response (e.g.,
Lek's algorithm; Lek et al., 1996a).
Although these approaches provide a means of determining the overall influence of
each predictor variable, interpretation of interactions among the variables is more difficult to
assess since the strength and direction of individual axon connection weights within a
network must be examined directly. Bishop (1995) discusses the use of pruning algorithms
to remove connection weights that do not contribute to the predictive performance of the
neural network. In brief, a pruning approach begins with a highly connected network (i.e.,
large number of connections among neurons), and then successively removes weak
connections (i.e., small absolute weights) or connections that cause a minimal change in the
network error function when removed. However, an important question is how to decide at
what threshold value (i.e., absolute connection weight or change in network error) weights
should be removed or retained in the network. In the present study, I propose a
randomization test for artificial neural networks to address this question. This randomization
approach provides a statistical pruning technique for eliminating null connection weights that
have minimal influence on the response variable, as well as providing a method for
identifying independent variables that significantly contribute to predictions in the network.
By using randomization protocols for partitioning the importance of connection weights (in
terms of their magnitude and direction), researchers will be able to quantitatively assess both
the individual and interactive effects of the input variables in the network prediction process,
as well as evaluate the overall contributions of the variables. Finally, I illustrate the utility of
this ANN randomization test using an empirical example describing the relationship between
fish species richness and habitat characteristics of freshwater lakes.
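To make the threshold problem concrete, the following minimal sketch (not part of the original analysis) applies magnitude-based pruning to a short list of connection weights. The function name `prune_by_magnitude`, the weights and the two cut-off values are hypothetical; the only point is that the set of retained connections depends entirely on an arbitrary threshold.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out every connection weight whose absolute value is below threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Illustrative weights only; they are not taken from the fitted network.
weights = [-2.61, 0.13, -0.69, 1.11]

# Two equally defensible cut-offs retain different sets of connections.
print(prune_by_magnitude(weights, 0.5))
print(prune_by_magnitude(weights, 1.0))
```

A randomization test replaces the arbitrary cut-off with a significance criterion for each weight.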
Case study:
Habitat factors influencing fish species richness in freshwater lakes
Throughout this paper I use an empirical example relating fish species richness to
habitat conditions of 286 freshwater lakes located in Algonquin Provincial Park, south-
central Ontario, Canada (45°50', 78°20'). I tabulated species presence for each lake to
examine relationships between fish species richness (ranging from 1 to 23) and a suite of
habitat-related variables (8 in total). Predictor variables were chosen to include factors that
have been shown to be related to critical habitat requirements of fish in this geographic
region (Minns, 1989). These variables included: surface area, lake volume, and total
shoreline perimeter, which are correlated with habitat diversity (Eadie and Keast, 1984);
maximum depth, which is negatively correlated with winter dissolved-oxygen concentrations
and related to thermal stratification (Jackson and Harvey, 1989); surface measurements
(taken at depths ≤ 2.0 m) of pH and of total dissolved solids, the latter providing an estimate
of nutrient status and lake productivity (Ryder, 1982); lake elevation, which is related to both
habitat heterogeneity (Currie, 1991) and colonization/extinction features of the lake
(Magnuson et al., 1998); and growing degree-days, which is a surrogate for productivity
(Richenon and Lum, 1980).
Interpreting neural-network connection weights: An important
consideration
I refrain from detailing the specifics of neural network optimization and design (i.e.,
number of hidden neurons and layers) and instead refer the reader to Chapter 1 and the
extensive coverage provided in the texts by Smith (1994), Bishop (1995), and Ripley (1996),
as well as articles by Ripley (1994) and Cheng and Titterington (1994). It is sufficient to say
that the methods described in this paper refer to the classic family of one hidden-layer, feed-
forward neural network trained by the backpropagation algorithm (Rumelhart et al., 1986).
These neural networks are commonly used in ecological studies since they are suggested to
be universal approximators of any continuous function (Cybenko, 1989; Funahashi, 1989;
Hornick et al., 1989). Based on n-fold cross validation, I determined that a neural network
with four hidden neurons exhibited good predictive power (r = 0.72 between observed and
predicted species richness).
In the neural network, the connection weights between neurons are the links between
the inputs and the outputs, and therefore are the link between the problem and the solution.
The weights contain all the information about the network. The relative contribution of the
independent variables to the predictive output of the neural network depends primarily on the
magnitude and direction of the connection weights. Input variables with larger connection
weights represent greater intensities of signal transfer, and therefore are more important in
the prediction process compared to variables with smaller weights. Negative connection
weights represent inhibitory effects on neurons (reducing the intensity of the incoming
signal), whereas positive connection weights represent excitatory effects on neurons
(increasing the intensity of the incoming signal). Therefore, negative connection weights
negatively affect the response variables, whereas the opposite is true for positive connection
weights.
Given the obvious importance of connection weights in assessing the relative
contributions of the independent variables, there is one topic that I believe warrants
additional detail. During the optimization process, it is necessary that the network converges
to the global minimum of the fitting criterion (e.g., prediction error) rather than one of the
many local minima. Local minima refer to different alternative sets of network parameter
values due to symmetric interchanges of the connection weights between the neurons of the
network (Ripley, 1994). Consequently, running the optimizer several times (i.e.,
constructing a number of networks with the same data but different initial random weights)
can result in neural networks with identical predictive performance but quite different
connection weights. This can complicate the interpretation of neural networks. Two
approaches can be employed to ensure the greatest probability of network convergence to the
global minimum. The first approach involves combining different local minima rather than
choosing between them. Some researchers have suggested that the optimal approach is to
average the outputs of networks using the connection weights corresponding to different
local minima (e.g., Wolpert, 1992; Xu et al., 1992; Perrone and Cooper, 1993; Ripley, 1995).
Ecologists do not readily use this approach, as it requires greater computational effort to
identify multiple local minima. The second approach involves global optimization
procedures where parameters such as learning rate, momentum or regularization are included
in network optimization (e.g., White, 1989; Styblinski and Tang, 1990; Gelfand and Mitter,
1991; Ripley, 1994). The addition of a learning rate parameter (η) and momentum (α)
during optimization is used often in the ecological literature (e.g., Lek et al., 1996a;
Mastrorillo et al., 1997a; Gozlan et al., 1999; Spitz and Lek, 1999) because in addition to
reducing the problem of local minima, it also accelerates the optimization process. The
learning rate η regulates the magnitude of changes in the weights and biases during
optimization, and α mediates the contribution of the last weight change in the previous
iteration to the weight change in the current iteration. The values of η and α can be set
constant or can vary during network optimization, although there are a number of
disadvantages to holding η and α constant (Bishop, 1995). Consequently, values of both η
and α are commonly modified by either increasing or decreasing their value according to
whether the error decreased or increased, respectively, during an iteration of network
optimization (e.g., Hagan et al., 1996; Mastrorillo et al., 1998; Özesmi and Özesmi, 1999).
In my study I included learning rate and momentum parameters in the optimization process
and defined them as a function of the error (although the first approach discussed above is an
equally valid method). For all analyses I started the network optimization with random
connection weights between -0.3 and 0.3. The variable learning rate parameter, momentum
parameter and the small interval of initial random weights ensured a high probability of
global network convergence and thus provided confidence regarding the validity of the
connection weights and their interpretation.
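The roles of η and α described above can be illustrated with a minimal sketch of gradient descent with momentum and an error-driven learning rate. This is not the optimizer used in the study: the function `train_adaptive`, the growth and shrink factors (1.05 and 0.7), the decision to discard a step that increases the error, and the toy quadratic error surface are all assumptions made for illustration. For simplicity only η is adapted here, and the momentum term is reset when the error increases.

```python
import random


def train_adaptive(error, gradient, n_weights, eta=0.1, alpha=0.5,
                   grow=1.05, shrink=0.7, n_iter=200, seed=1):
    """Gradient descent with momentum (alpha) and an error-driven learning rate (eta)."""
    rng = random.Random(seed)
    # Initial random weights in [-0.3, 0.3], the interval used in the text.
    w = [rng.uniform(-0.3, 0.3) for _ in range(n_weights)]
    prev_step = [0.0] * n_weights
    prev_error = error(w)
    for _ in range(n_iter):
        g = gradient(w)
        # Weight change = learning-rate term + alpha * previous weight change.
        step = [-eta * gi + alpha * si for gi, si in zip(g, prev_step)]
        candidate = [wi + si for wi, si in zip(w, step)]
        cand_error = error(candidate)
        if cand_error < prev_error:
            # Error decreased: keep the step and increase the learning rate.
            w, prev_step, prev_error = candidate, step, cand_error
            eta *= grow
        else:
            # Error did not decrease: discard the step (an assumption of this
            # sketch), reduce the learning rate and drop the momentum term.
            eta *= shrink
            prev_step = [0.0] * n_weights
    return w, prev_error


# Hypothetical toy error surface: squared distance to a known optimum.
target = [1.0, -2.0]


def toy_error(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def toy_gradient(w):
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


w_fit, e_fit = train_adaptive(toy_error, toy_gradient, n_weights=2)
print(w_fit, e_fit)
```

In a real network `error` and `gradient` would be the network error function and its backpropagated gradient; the toy surface has a single minimum, so it shows the update rule but not the local-minima problem itself.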
Illuminating the "black box"
Preparation of the data
Prior to building the neural network, the data set must be modified so that the
dependent and independent variables exhibit particular distributional characteristics. The
dependent variable must be converted to the range [0, 1] so that it conforms to the demands
of the transfer function used (sigmoid function) in the building of the neural network. This is
accomplished by using the formula:
$$r_n = \frac{y_n - \min(Y)}{\max(Y) - \min(Y)}$$
where $r_n$ is the converted response value for observation n, $y_n$ is the original response
value for observation n, and $\min(Y)$ and $\max(Y)$ represent the minimum and maximum values,
respectively, of the response variable Y. Note that the dependent variable does not have to be
converted when modelling a binary response variable (e.g., species presence/absence) since
its values already fall within this range.
To standardize the measurement scales of the inputs into the network, the independent
variables are converted to z-scores (i.e., mean = 0, standard deviation = 1) using the formula:
$$z_n = \frac{x_n - \bar{X}}{s_X}$$
where $z_n$ is the standardized value of observation n, $x_n$ is the original value of
observation n, and $\bar{X}$ and $s_X$ are the mean and standard deviation of the variable X.
It is essential to standardize the input variables so that the same percentage change in the
weighted sum of the inputs causes a similar percentage change in the unit output. Both the
dependent and independent variables of the richness-habitat data set were modified using the
above formulas.
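The two transformations described above can be written as a short sketch. The function names `rescale_response` and `standardize` and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def rescale_response(y):
    """Convert the response variable to the range [0, 1] (min-max scaling)."""
    lo, hi = min(y), max(y)
    return [(v - lo) / (hi - lo) for v in y]


def standardize(x):
    """Convert a predictor to z-scores (mean 0, standard deviation 1)."""
    m = mean(x)
    s = stdev(x)  # sample standard deviation; an assumption of this sketch
    return [(v - m) / s for v in x]


# Illustrative values only: richness spans the observed range 1 to 23.
richness = [1, 12, 23]
predictor = [2, 4, 6]

print(rescale_response(richness))
print(standardize(predictor))
```

Predictions made on the [0, 1] scale can be returned to the original units by inverting the first formula, i.e. y = r × (max(Y) − min(Y)) + min(Y).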
Methods for quantifying input variable contributions in ANNs
In the following section, I detail a series of methods that are available to aid in the
interpretation of connection weights and variable contributions in neural networks. These
approaches have been used by ecologists and represent a set of appropriate techniques for
understanding neuron connections in networks. Next, I extend a randomization approach to
these methods, illustrating how connection weights and the overall influence of the input
variables in the network can be assessed statistically.
Neural Interpretation Diagram (NID)
Recently, a number of investigators have advocated using axon connection weights to
interpret predictor variable contributions in neural networks (e.g., Aoki and Komatsu, 1997;
Chen and Ware, 1999). Özesmi and Özesmi (1999) proposed the Neural Interpretation
Diagram (NID) for providing a visual interpretation of the connection weights among
neurons. In NIDs, the relative magnitude of the connection weights is represented by line
thickness (i.e., thicker lines representing greater weights) and line shading represents the
direction of the weights (i.e., black lines representing positive, excitator signals and gray
lines representing negative, inhibitor signals). Tracking the magnitude and direction of
weights between neurons enables researchers to identify individual and interacting effects of
the input variables on the output. Figure 2.1 illustrates the NID for the empirical example
and shows the relative influence of each habitat factor in predicting fish species richness.
The relationship between the inputs and outputs can be determined in two steps since there
are first input-hidden layer connections and second hidden-output layer connections.
Positive effects of input variables are depicted by positive input-hidden and positive hidden-
output connection weights, or negative input-hidden and negative hidden-output connection
weights. Negative effects of input variables are depicted by positive input-hidden and
negative hidden-output connection weights, or negative input-hidden and positive
hidden-output connection weights. Therefore, the multiplication of connection weight
direction (i.e., positive or negative) delineates the effect each input variable has on the
response variable.
[Figure 2.1: network diagram with input neurons Area, Max. Depth, Volume, Sh. Per.,
Elevation, TDS, pH and GDD, and output neuron Species Richness]
Figure 2.1. Neural interpretation diagram (NID) for neural network modelling fish species
richness as a function of 8 habitat variables. The thickness of the lines joining neurons is
proportional to the magnitude of the connection weight, and the shade of the line indicates
the direction of the interaction between neurons: black connections are positive (excitator)
and gray connections are negative (inhibitor).
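The sign rule described above can be sketched in a few lines. The function name `effect_directions` is hypothetical, and the example weights are those of the small worked network in Box 2.1 (3 inputs, hidden neurons A and B), not the weights of the fitted richness network.

```python
def effect_directions(w_input_hidden, w_hidden_output):
    """Direction of each input's effect on the output through each hidden neuron.

    w_input_hidden[h][i] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    directions = []
    for w_row, w_out in zip(w_input_hidden, w_hidden_output):
        row = []
        for w_in in w_row:
            product = w_in * w_out  # like signs give +, unlike signs give -
            row.append("+" if product > 0 else "-" if product < 0 else "0")
        directions.append(row)
    return directions


# Weights from the worked example in Box 2.1 (rows: hidden neurons A and B).
w_ih = [[-2.61, 0.13, -0.69],
        [-1.23, -0.91, -2.09]]
w_ho = [1.11, 0.39]

print(effect_directions(w_ih, w_ho))
```

Because an input can act positively through one hidden neuron and negatively through another, the per-path signs need to be read together with the weight magnitudes.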
The interpretation of connection weights, and more specifically NIDs, is not an easy
task because of the complexity of connections among the neurons (Fig. 2.1). Additional
hidden neurons would only make this interpretation more difficult. Furthermore, a subjective
choice must be made regarding the magnitude at which connection weights should be
interpreted. These considerations make the direct examination of connection weights
challenging at best and virtually impossible in data sets with large numbers of variables. I
show later that a randomization approach can aid in the interpretation of NIDs by identifying
non-significant connection weights that can be removed.
Garson's algorithm
Garson (1991) proposed a method, later modified by Goh (1995), for interpreting
neural network connection weights to determine the relative importance of independent
variables within the ANN. This approach has been used in a number of ecological studies
(e.g., Mastrorillo et al., 1997b; 1998; Gozlan et al., 1999; Aurelle et al., 1999; Brosse et al.,
1999). Garson's algorithm partitions the neural network connection weights in order to
determine the individual importance of each input variable considered separately in the
network. Box 2.1 contains a summary of the protocol presented by Garson (1991) that is
used to calculate input variable contributions. Figure 2.2 illustrates the overall contribution
of each habitat variable in predicting lake species richness. The results show that the relative
importance of the predictor variables ranged from 6% to 18%, with lake area and elevation
contributing the most to predicting species richness and lake volume and pH contributing the
least.
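The four steps summarized in Box 2.1 can be sketched as follows for a network with one hidden layer and one output neuron. The function name `garson` and the argument layout are hypothetical; the weights are those of the worked example in Box 2.1. Because the box rounds intermediate values to two decimals, unrounded arithmetic gives slightly different percentages.

```python
def garson(w_input_hidden, w_hidden_output):
    """Relative importance (%) of each input variable, following Box 2.1.

    w_input_hidden[h][i] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    n_inputs = len(w_input_hidden[0])
    sums = [0.0] * n_inputs
    for w_row, w_out in zip(w_input_hidden, w_hidden_output):
        # Step 2: absolute contribution of each input through this hidden neuron.
        c = [abs(w_in * w_out) for w_in in w_row]
        total = sum(c)
        # Step 3: relative contribution within the hidden neuron, summed per input.
        for i in range(n_inputs):
            sums[i] += c[i] / total
    # Step 4: relative importance as a percentage of the grand total.
    grand_total = sum(sums)
    return [100.0 * s / grand_total for s in sums]


# Weights from the worked example in Box 2.1 (rows: hidden neurons A and B).
w_ih = [[-2.61, 0.13, -0.69],
        [-1.23, -0.91, -2.09]]
w_ho = [1.11, 0.39]

print([round(v, 1) for v in garson(w_ih, w_ho)])
```

Note that the absolute values in step 2 discard the direction of each connection, so the algorithm reports the magnitude of a variable's contribution but not its sign.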
Box 2.1. Garson's algorithm for partitioning and interpreting neural network connection
weights. Sample calculations shown for 3 input neurons (1, 2 and 3), 2 hidden neurons (A
and B) and 1 output neuron (O).

1. Matrix containing input-hidden-output neuron connection weights:

            Input 1         Input 2         Input 3         Output
Hidden A    w_A1 = -2.61    w_A2 = 0.13     w_A3 = -0.69    w_OA = 1.11
Hidden B    w_B1 = -1.23    w_B2 = -0.91    w_B3 = -2.09    w_OB = 0.39

2. Contribution of each input neuron to the output via each hidden neuron, calculated as the
product of the input-hidden connection and the hidden-output connection:
e.g., c_A1 = w_A1 x w_OA = -2.61 x 1.11 = -2.90

            Hidden A        Hidden B
Input 1     c_A1 = -2.90    c_B1 = -0.48
Input 2     c_A2 = 0.14     c_B2 = -0.35
Input 3     c_A3 = -0.77    c_B3 = -0.82

3. Relative contribution of each input neuron to the outgoing signal of each hidden neuron:
e.g., r_A1 = |c_A1| / (|c_A1| + |c_A2| + |c_A3|) = 2.90 / (2.90 + 0.14 + 0.77) = 0.76;
and sum of input neuron contributions: e.g., S_1 = r_A1 + r_B1 = 0.76 + 0.29 = 1.05

            Hidden A        Hidden B        Sum
Input 1     r_A1 = 0.76     r_B1 = 0.29     S_1 = 1.05
Input 2     r_A2 = 0.04     r_B2 = 0.21     S_2 = 0.25
Input 3     r_A3 = 0.20     r_B3 = 0.50     S_3 = 0.70

4. Relative importance of each input variable:
e.g., RI_1 = S_1 / (S_1 + S_2 + S_3) x 100 = 1.05 / (1.05 + 0.25 + 0.70) x 100 = 52.5 %

            Relative importance
Input 1     52.5 %
Input 2     12.5 %
Input 3     35.0 %

[Figure 2.2: bar plot; x-axis: predictor variable (Area, Max. Depth, Volume, Sh. Per., Elev,
TDS, pH, GDD)]
Figure 2.2. Bar plots showing the percentage relative importance of each habitat variable in
the neural network predicting fish species richness based on Garson's algorithm. See Box
2.1 for calculations involved in Garson's algorithm.
Sensitivity analysis
A number of investigators have applied sensitivity analysis to neural networks to
determine the spectrum of input variable contributions in the neural network. Recently, a
number of alternative types of sensitivity analysis have been proposed in the ecological
literature. For example, the Senso-nets approach includes an additional weight in the
network for each input variable representing the variable's sensitivity (Schleiter et al., 1999).
Scardi and Harding (1999) added white noise to each input variable and examined the
resulting changes in the mean square error of the output. Although such approaches are
available, traditional sensitivity analysis involves varying each input variable across its range
while holding all other input variables constant; each variable is examined in turn, to
determine how it individually contributes to patterns in the output. Such analyses are
somewhat cumbersome since there may be an overwhelming number of variable
combinations to examine. As a result, it is common to first calculate a series of summary
measures for each of the input variables (e.g., minimum, maximum, quartiles, percentiles).
Next, the independent variable under investigation is varied from its minimum to maximum
value while all other variables are held constant at each of these summary measures
sequentially (e.g., Özesmi and Özesmi, 1999). Relationships between each input variable
and the response can be examined for each summary measure, or the calculated response can
be averaged across the summary measures. Holding the input variables constant at a small
number of values provides a more manageable sensitivity analysis, yet still requires a great
deal of time since each value of the input variable must be examined. Consequently, Lek
et al. (1995; 1996a,b) suggested examining only 12 data values delimiting 11 equal intervals
over the variable range rather than varying it across its entire range (this has been termed
Lek's algorithm). Contribution plots can be constructed by averaging the response value
across all summary statistics for each of the 12 values of the input variable of interest. Many
studies have employed Lek's algorithm (e.g., Lek et al., 1995, 1996a; Mastrorillo et al.,
1997a, 1998; Guégan et al., 1998; Laë et al., 1999; Lek-Ang et al., 1999; Spitz and Lek,
1999). In this study I constructed contribution plots for each of the 8 predictor variables in
the neural network by varying each input variable across its entire range and holding all other
variables constant at their 20th, 40th, 60th and 80th percentiles (Fig. 2.3). It is evident from the
contribution plots that the influence of the input variables (i.e., lake habitat factors) on the
network output (i.e., species richness) varies greatly depending on the values (i.e.,
percentiles) of the other input variables. The following is a summary of the different
response cuves:
Gaussian response curve - input variable contributes greatest influence on output at
intermediate values, and exhibits decreasing influence at low and high values: e.g.,
influence of pH and growing-degree days on richness.
Bimodal response curve - input variable contributes greatest influence on output at
low and high values, and exhibits minimal influence at intermediate values: e.g.,
influence of lake area, maximum depth, shoreline perimeter and total dissolved solids
on richness when all other variables are low in value.
Left-skewed response curve - input variable contributes greatest influence on output
at high values, and exhibits minimal influence at low and intermediate values: e.g.,
influence of lake elevation on richness.
Right-skewed response curve - input variable contributes greatest influence on
output at low values, and exhibits minimal influence at intermediate and high values:
e.g., influence of total dissolved solids on richness, influence of overall lake size (i.e.,
area, maximum depth, volume and shoreline perimeter) on richness when all other
variables are intermediate in value.
Decreasing response curve - input variable contributes decreasing influence on
output at increasing values: e.g., influence of surface area on richness when all other
variables are high in value.
Flat response curve - input variable contributes minimal influence on output across
its entire range: e.g., influence of growing-degree days on richness when all other
variables are high in value.
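The sensitivity analysis described above (vary one input across its range while holding the others at fixed percentiles) can be sketched as follows. This is a minimal Python illustration under stated assumptions: toy_model is a hypothetical stand-in for a trained network, toy_data is invented, and the function names are not from the original study. The defaults follow the text: 12 values delimiting 11 equal intervals, with the other inputs held at their 20th, 40th, 60th and 80th percentiles.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a list of numbers."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def contribution_profile(predict, data, var, n_points=12, quantiles=(20, 40, 60, 80)):
    """Vary input `var` over its range; hold the other inputs at each percentile in turn."""
    columns = list(zip(*data))
    lo, hi = min(columns[var]), max(columns[var])
    # n_points values delimiting (n_points - 1) equal intervals over the range.
    grid = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    profiles = {}
    for q in quantiles:
        baseline = [percentile(col, q) for col in columns]
        curve = []
        for x in grid:
            row = list(baseline)
            row[var] = x          # only the variable of interest changes
            curve.append(predict(row))
        profiles[q] = curve
    return grid, profiles


# Hypothetical stand-in for a trained network: any function from an input row to a prediction.
def toy_model(row):
    return math.tanh(0.8 * row[0] - 0.5 * row[1]) + 0.3 * row[0] * row[1]


# Invented data: 5 observations of 2 input variables.
toy_data = [[0.0, 1.0], [0.5, 0.2], [1.0, 0.6], [0.3, 0.9], [0.8, 0.0]]
grid, profiles = contribution_profile(toy_model, toy_data, var=0)
```

Plotting each curve in profiles against grid gives one contribution plot per percentile; averaging the curves point by point gives the single averaged profile described in the text.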
Randomization Test for Artificial Neural Networks
I propose a randomization test for input-hidden-output connection weight selection in
neural networks. By eliminating null connection weights that do not differ significantly from
random, I can simplify the interpretation of neural networks by reducing the number of axon
pathways that have to be examined for direct and indirect (i.e., interaction) effects on the
response variable, for instance when using NIDs. This objective is similar to statistical
pruning techniques (e.g., asymptotic t-tests), yet does not require the assumptions of
parametric and non-parametric methods since the randomization approach empirically
constructs the distribution of expected values under the null hypothesis for the test statistic
(i.e., connection weight) from the data at hand. Similarly, a randomization approach can be
used as an input variable selection method for ANNs by summing across input-hidden-output
connection weights or calculating the relative importance (i.e., Garson's algorithm) for each
input variable. This approach provides a quantitative tool for selecting statistically
significant input variables for inclusion into the network, again reducing network complexity
and assisting in the network interpretation. The following is the randomization protocol for
testing the statistical significance of connection weights and input variables:
(1) construct a number of neural networks using the original data with different
initial random weights;
(2) select the neural network with the best predictive performance, record the initial
random connection weights used in constructing this network, and calculate
and record:
(a) input-hidden-output connection weights: the product of input-hidden
and hidden-output connection weights for each input and hidden
neuron (e.g., observed c_A1: step 2, Box 2.1);
(b) overall connection weight: the sum of the input-hidden-output
connection weights for each input variable (e.g., observed
c_1 = c_A1 + c_B1);
(c) relative importance (%) for each input variable based on Garson's
algorithm (e.g., observed RI_1: step 4, Box 2.1);
(3) randomly permute the original response variable (y_random);
(4) construct a neural network using y_random and the initial random connection
weights; and
(5) repeat steps (3) and (4) a large number of times (i.e., 9999 times in this study;
see Jackson and Somers, 1989), each time recording 2(a), (b) and (c); e.g.,
randomized c_A1, randomized c_1, and randomized RI_1.
The statistical significance of each input-hidden-output connection weight, overall
connection weight and relative importance of each input variable (e.g., observed c_A1,
observed c_1 and observed RI_1) can be calculated as the proportion of randomized values (e.g.,
randomized c_A1, randomized c_1 and randomized RI_1), including the observed, whose value is
equal to or more extreme than the observed values. Figure 2.4 illustrates the distribution of
randomized input-hidden-output connection weights (for hidden neuron B), overall
connection weight and relative importance of surface area for predicting lake species
richness.
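The permutation steps and the p-value calculation can be sketched as below. This is a minimal Python illustration, not the MatLab code used in this study. Two assumptions are made for the sake of a short, runnable example: the statistic is a hypothetical stand-in (a least-squares slope) for "train the network from the same initial weights and return a connection weight", and "more extreme" is read as two-tailed, i.e., compared on absolute value. The example uses 999 permutations rather than 9999 to keep it quick.

```python
import random


def randomization_p_value(observed, randomized):
    """Proportion of values (observed included) at least as extreme as the observed."""
    pool = list(randomized) + [observed]
    extreme = sum(1 for v in pool if abs(v) >= abs(observed))
    return extreme / len(pool)


def randomization_test(statistic, x, y, n_perm=999, seed=1):
    """Permute the response, recompute the statistic each time, return (observed, p)."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    randomized = []
    for _ in range(n_perm):
        y_random = list(y)
        rng.shuffle(y_random)                      # step 3: permute the response
        randomized.append(statistic(x, y_random))  # step 4: refit and record
    return observed, randomization_p_value(observed, randomized)


# Hypothetical stand-in for a connection weight: the least-squares slope of y on x.
def slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


x = list(range(20))
y = [2.0 * v + ((-1) ** v) for v in x]   # invented data: strong signal, small alternating noise
observed, p = randomization_test(slope, x, y)
```

Including the observed value in the pool means the smallest attainable p-value is 1 / (n_perm + 1), which is why the number of randomizations is usually chosen as 999 or 9999.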
Table 2.1 contains the connection weight structure for the neural network and the
associated p-values from the randomization tests. The results show that only a fraction of the
total 32 input-hidden-output connections (i.e., 8 predictors × 4 hidden neurons) are
statistically different from what would be expected based on chance alone. For instance,
only 6 input-hidden-output connections are significant at α = 0.05. The results also show
that when all connection weights are accounted for (i.e., overall connection weight), lake size
(i.e., lake area, maximum depth, volume and shoreline perimeter) and pH are positively
associated with species richness, while elevation, total dissolved solids and growing-degree
days are negatively associated with species richness. However, only the influences of
maximum depth (P = 0.002) and shoreline perimeter (P = 0.021) are statistically significant,
with these variables having a relative importance of 12.5% and 11%, respectively (Table
2.1). Interestingly, the results from the randomization test using relative importance differ
from the results using overall connection weights. Using Garson's algorithm, surface area
was the only significant factor correlated with species richness (P = 0.031), and elevation was
marginally nonsignificant (P = 0.084). The discrepancy between the two approaches results
from the different ways that the methods use the network connection weights. Garson's
algorithm uses absolute connection weights to calculate the influence of each input variable
on the response (see Box 2.1), whereas overall connection weight is calculated using the raw
values. Examining Figure 2.5 (α = 0.05), I can show that Garson's algorithm can be
potentially misleading for the interpretation of input variable contributions. It is evident that
lake elevation shows a strong, positive association with species richness through hidden
neuron A, but a strong, negative relationship with richness through hidden neuron D. Based
on absolute weights, Garson's algorithm indicates a large relative importance of that variable
since both connection weights have large magnitudes (e.g., for all hidden neurons RI =
17.94%: Table 2.1). However, in such a case the influence of the input variable on the
Figure 2.4. Distributions of (A) input-hidden-output connection weights for hidden neuron
B, (B) overall connection weight and (C) input relative importance (%) for the influence of
surface area on lake species richness. Arrows represent the observed input-hidden-output
connection weight for hidden neuron B (7.81), overall connection weight (7.43) and relative
importance (18.27%). Panel x-axes: (A) input-hidden-output connection weight; (B) total
connection weight; (C) relative importance (%).

Table 2.1. Connection weight structure for the neural network modelling fish species richness as a function of 8 habitat variables. W_ij
represents the input-hidden-output connection weight for input variable i (where i = 1 to 8) and hidden neuron j (where j = A to D). P
values for input-hidden-output connection weights (W_Ai, W_Bi, W_Ci and W_Di), overall connection weights (ΣW_Ai–Di) and Garson's relative
importance (%) are based on 9,999 randomizations.
Columns: i; predictor variable; hidden neuron A (W_Ai, P); hidden neuron B (W_Bi, P); hidden neuron C (W_Ci, P); hidden neuron D (W_Di, P); overall connection weight (ΣW_Ai–Di, P); relative importance (%, P).
Rows: 1 Area (ha); 2 Max. Depth (m); 3 Volume (m3); 4 Sh. Per. (km); 5 Elevation (m); 6 TDS (pg/L); 7 pH; 8 GDD.

Figure 2.5. Neural Interpretation Diagram after non-significant input-hidden-output connection weights are eliminated using the randomization test (input neurons: Area, Max. Depth, Volume, Sh. Per., Elevation, TDS, pH, GDD; output neuron: Species Richness). Only connection weights statistically different from zero (α = 0.05 and α = 0.10) are shown. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons: black connections are positive (excitator) and gray connections are negative (inhibitor). Black input neurons indicate habitat variables that have an overall positive influence on species richness, and gray input neurons indicate an overall negative influence on species richness.
response is actually negligible since the positive influence through hidden neuron A is
counteracted by the negative influence through hidden neuron D (e.g., for all hidden neurons
ΣW_Ai–Di = -1.17, P = 0.217: Table 2.1). For this reason, I believe caution should be employed
when making inferences from the results generated by Garson's algorithm since the direction
of the input-output interaction is not taken into account.
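The cancellation described above can be reproduced with a small numerical example. The following Python sketch uses invented input-hidden-output products (they are not values from Table 2.1): input 1 has large products of opposite sign through two hidden neurons, whereas input 2 has smaller products of the same sign. The function names are illustrative assumptions.

```python
# Hypothetical input-hidden-output products c[j][i]: rows = hidden neurons, columns = inputs.
C = [[3.0, 1.0],     # via the first hidden neuron
     [-2.9, 0.8]]    # via the second hidden neuron


def overall_connection_weight(c):
    """Sum the raw (signed) products over hidden neurons for each input."""
    return [sum(row[i] for row in c) for i in range(len(c[0]))]


def garson_from_products(c):
    """Garson's relative importance (%) computed from absolute products."""
    s = [0.0] * len(c[0])
    for row in c:
        total = sum(abs(x) for x in row)
        for i, x in enumerate(row):
            s[i] += abs(x) / total
    return [100.0 * v / sum(s) for v in s]


overall = overall_connection_weight(C)
importance = garson_from_products(C)
print("overall connection weights:", [round(v, 2) for v in overall])
print("Garson relative importance (%):", [round(v, 1) for v in importance])
```

In this example Garson's algorithm ranks input 1 as the more important variable, whereas its signed products nearly cancel and its overall connection weight is close to zero.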
Using results of the randomization test I removed non-significant connection weights
from the Neural Interpretation Diagram (originally shown in Figure 2.1), resulting in Figure
2.5, which illustrates only connection weights that were statistically significantly different
from random at α = 0.05 and α = 0.10. Focusing on hidden neuron C in Figure 2.5 (α =
0.10), it is apparent that as maximum depth and shoreline perimeter increase, and growing-
degree days decreases, species richness increases in the study lakes. Furthermore,
interactions among habitat factors can be identified as input variables with contrasting
connection weights (i.e., opposite directions) entering the same hidden neuron. For example,
in examining hidden neuron D it is evident that lake shoreline perimeter interacts with lake
elevation. An increase in lake elevation decreases predicted species richness; however, this
negative effect weakens as shoreline perimeter increases. Therefore, there is an interaction
between lake elevation and shoreline perimeter in that high elevation lakes with convoluted
shorelines have greater species richness compared to high elevation lakes with simple
shorelines. The NID also identifies input variables that do not interact, for example lake
volume, since this variable does not exhibit significant weights with contrasting effects at any
single hidden neuron with any of the other variables. Such information obtained from Figure
2.5 shows that a randomization approach can greatly aid in drawing conclusions from NIDs,
and more generally help researchers identify and interpret direct and indirect (i.e., interaction
between input variables) contributions of input variables in ANNs by using axon connection
weights.
Two important components of the randomization test involved the optimization of the
neural network. First, I conducted the randomization test for the product of input-hidden and
hidden-output weights rather than each input-hidden and hidden-output connection weight
separately, since the direction of the connection weights (i.e., positive or negative) can switch
between different networks optimized with the same data (i.e., symmetric interchanges of
weights: Ripley, 1994). For instance, the input-hidden and hidden-output weights might both
be positive in one network and both negative in another, but in both cases the input variable
exerts a positive influence on the response variable. To remove this problem I examined the
product of the input-hidden-output weights because the sign of this value will remain
constant and therefore will be representative of the true influence of the independent
variables. Second, it is important to use the initial random connection weights when
constructing the neural networks with the randomized response data. The reason for this is
that different initial random weights can result in different final connection weights with the
same overall predictive performance. Therefore, if different initial random weights were
used for each randomization, dissimilarities between the observed and random connection
weights might be an artifact of different initial connection weights and not the randomization
of the response variable. Using the same initial connection weights for each randomization
accounts for this problem. Furthermore, it is beneficial to check the distribution of input-
hidden-output connection weights for different random initial weights to ensure that
different networks converge, on average, to similar connection weights. For my
example I found that in almost all cases the input-hidden-output connection weights
exhibited unimodal distributions.
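The sign symmetry behind the first point can be checked numerically. The Python sketch below uses invented weights and makes two simplifying assumptions that are not taken from this study: hidden neurons use a tanh activation (an odd function, so the symmetry is exact) and the network has no bias terms. Flipping the sign of every weight entering and leaving one hidden neuron leaves both the prediction and the input-hidden-output products unchanged, although each individual weight changes sign.

```python
import math


def predict(x, w_ih, w_ho):
    """Single-output network with tanh hidden units and a linear output (no biases)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_ih]
    return sum(v * h for v, h in zip(w_ho, hidden))


def products(w_ih, w_ho):
    """Input-hidden-output products, one row per hidden neuron."""
    return [[w * w_ho[j] for w in row] for j, row in enumerate(w_ih)]


# Hypothetical weights: 2 hidden neurons x 3 inputs, and 2 hidden-output weights.
w_ih = [[0.5, -1.2, 0.3], [-0.7, 0.4, 0.9]]
w_ho = [1.5, -0.8]

# Flip the sign of every weight entering and leaving the first hidden neuron.
w_ih_flipped = [[-w for w in w_ih[0]], list(w_ih[1])]
w_ho_flipped = [-w_ho[0], w_ho[1]]

x = [0.2, -0.4, 1.0]
same_output = math.isclose(predict(x, w_ih, w_ho), predict(x, w_ih_flipped, w_ho_flipped))
same_products = products(w_ih, w_ho) == products(w_ih_flipped, w_ho_flipped)
print(same_output, same_products)
```

With a logistic activation the same symmetry holds only after a compensating adjustment to the output bias, which this sketch omits.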
CONCLUSION
I reiterate the concern raised by a number of ecologists and explicit to my paper: Are
artificial neural networks a black box approach for modelling ecological phenomena? In
light of the synthesis provided here, I argue the answer is unequivocally no. I have reviewed
a series of methods, ranging from qualitative (i.e., NIDs) to quantitative (i.e., Garson's
algorithm and sensitivity analysis), for interpreting neural-network connection weights, and
have demonstrated the utility of these methods for shedding light on the inner workings of
neural networks. These methods provide a means for partitioning and interpreting the
contribution of input variables in the neural network modelling process. In addition, I
described a randomization procedure for testing the statistical significance of these
contributions in terms of individual connection weights and overall influence of each input
variable. The former case facilitates the interpretation of direct and interacting effects of
input variables on the response by removing connection weights that do not contribute
significantly to the performance of the neural network. In the latter case, the randomization
test assesses whether the contribution of a particular input variable to the response differs
from what would be expected by chance. The randomization procedure enables the removal
of null neural pathways and insignificant input variables, thereby aiding in the interpretation
of the neural network by reducing its complexity. In conclusion, by coupling the explanatory
insight of neural networks with their powerful predictive abilities, artificial neural networks
have great promise in ecology, as a tool to evaluate, understand, and predict ecological
phenomena.
ACKNOWLEDGEMENTS
I would like to thank Sovan Lek for his insightful comments regarding the finer
points of artificial neural networks, and for providing some of the original MatLab code for
this study. This manuscript was greatly improved by the comments of Brian Shuter.
Funding for this research was provided by a Graduate Scholarship from the Natural Sciences
and Engineering Research Council of Canada (NSERC) to J.D. Olden, and an NSERC
Research Grant to D.A. Jackson. Computer routines for all randomization tests are available
in MatLab programming language from the authors upon request.
REFERENCES
Anderson, J. A. 1995. An Introduction to Neural Networks. MIT, Cambridge, Massachusetts,
650 pp.
Aoki, I., and T. Komatsu. 1999. Analysis and prediction of the fluctuation of sardine
abundance using a neural network. Oceanologica Acta 20: 81-88.
Aurelle, D., S. Lek, J. L. Giraudel, and P. Berrebi. 1999. Microsatellites and artificial neural
networks: tools for the discrimination between natural and hatchery brown trout (Salmo
trutta, L.) in Atlantic populations. Ecological Modelling 120: 313-324.
Bastarache, D., N. El-Jabi, N. Turkkan, and T. A. Clair. 1997. Predicting conductivity and
acidity for small streams using neural networks. Canadian Journal of Civil Engineering
24: 1030-1039.
Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 482
pp.
Brosse, S., J. F. Guégan, J. N. Tourenq, and S. Lek. 1999. The use of neural networks to
assess fish abundance and spatial occupancy in the littoral zone of a mesotrophic lake.
Ecological Modelling 120: 299-311.
Chen, D. G., and D. M. Ware. 1999. A neural network model for forecasting fish stock
recruitment. Canadian Journal of Fisheries and Aquatic Sciences 56: 2385-2396.
Cheng, B., and D. M. Titterington. 1994. Neural networks: a review from a statistical
perspective (with discussion). Statistical Science 9: 2-54.
Colasanti, R. L. 1991. Discussions of the possible use of neural network algorithms in
ecological modelling. Binary 3: 13-15.
Currie, D. J. 1991. Energy and broad-scale patterns of animal- and plant-species richness.
American Naturalist 137: 27-49.
Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Math.
Control Signals Systems 2: 303-314.
Dimopoulos, I., J. Chronopoulos, A. Chronopoulos-Sereli, and S. Lek. 1999. Neural network
models to study relationships between lead concentration in grasses and permanent urban
descriptors in Athens city (Greece). Ecological Modelling 120: 157-165.
Dimopoulos, Y., P. Bourret, and S. Lek. 1995. Use of some sensitivity criteria for choosing
networks with good generalization. Neural Processing Letters 2: 1-4.
Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.
Canadian Journal of Zoology 62: 1689-1695.
Edwards, M., and D. R. Morse. 1995. The potential for computer-aided identification in
biodiversity research. Trends in Ecology and Evolution 10: 153-158.
Funahashi, K. 1989. On the approximate realization of continuous mapping by neural
networks. Neural Networks 2: 183-192.
Gallant, S. I. 1993. Neural network learning and expert systems. MIT, Massachusetts, USA,
365 pp.
Garson, G. D. 1991. Interpreting neural-network connection weights. Artificial Intelligence
Expert 6: 47-51.
Gelfand, S. B., and S. K. Mitter. 1991. Recursive stochastic algorithm for global
optimization in R^d. SIAM Journal on Control and Optimization 29: 999-1018.
Goh, A. T. C. 1995. Back-propagation neural networks for modelling complex systems.
Artificial Intelligence in Engineering 9: 143-151.
Goodman, P. H. 1996. NevProp software, version 3, University of Nevada, Reno, NV.
Gozlan, R. E., S. Mastrorillo, G. H. Copp, and S. Lek. 1999. Predicting the structure and
diversity of young-of-the-year fish assemblages in large rivers. Freshwater Biology 41:
809-820.
Hagan, M. T., H. B. Demuth, and M. H. Beale. 1996. Neural Network Design. PWS
Publishing, Boston, MA.
Hornik, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are
universal approximators. Neural Networks 2: 359-366.
Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:
local versus regional processes. Ecology 70: 1472-1484.
Jackson, D. A., and K. M. Somers. 1989. Are probability estimates from the permutation
model of Mantel's test stable? Canadian Journal of Zoology 67: 766-769.
Laë, R., S. Lek, and J. Moreau. 1999. Predicting fish yield of African lakes using neural
networks. Ecological Modelling 120: 325-335.
Lek, S., A. Belaud, I. Dimopoulos, J. Lauga, and J. Moreau. 1995. Improved estimation,
using neural networks, of the food consumption of fish populations. Marine and
Freshwater Research 46: 1229-1236.
Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996a.
Application of neural networks to modelling nonlinear relationships in ecology.
Ecological Modelling 90: 39-52.
Lek, S., A. Belaud, P. Baran, I. Dimopoulos, and M. Delacoste. 1996b. Role of some
environmental variables in trout abundance models using neural networks. Aquatic
Living Resources 9: 23-29.
Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling,
an introduction. Ecological Modelling 120: 65-73.
Lek-Ang, S., L. Deharveng, and S. Lek. 1999. Predictive models of collembolan diversity
and abundance in a riparian habitat. Ecological Modelling 120: 247-260.
Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.
Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology 79:
2941-2956.
Manel, S., J-M. Dias, and S. J. Ormerod. 1999. Comparing discriminant analysis, neural
networks and logistic regression for predicting species distributions: a case study with a
Himalayan river bird. Ecological Modelling 120: 337-347.
Mastrorillo, S., S. Lek, and F. Dauba. 1997a. Predicting the abundance of minnow Phoxinus
phoxinus (Cyprinidae) in the River Ariège (France) using artificial neural networks.
Aquatic Living Resources 10: 169-176.
Mastrorillo, S., S. Lek, F. Dauba, and A. Belaud. 1997b. The use of artificial neural networks
to predict the presence of small-bodied fish in a river. Freshwater Biology 38: 237-246.
Mastrorillo, S., F. Dauba, T. Oberdorff, J. F. Guégan, and S. Lek. 1998. Predicting local fish
species richness in the Garonne River basin. C.R. Academy of Sciences, Paris, Sciences
de la vie 321: 423-428.
Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of
the American Fisheries Society 118: 533-545.
Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat
modeling with interspecific interaction. Ecological Modelling 116: 15-31.
Paruelo, J. M., and F. Tomasel. 1997. Prediction of functional characteristics of ecosystems:
a comparison of artificial neural networks and regression models. Ecological Modelling
98: 173-186.
Perrone, M. P., and L. N. Cooper. 1993. When networks disagree: Ensemble methods for
hybrid neural networks. In: R. J. Mammone (Editor), Artificial Neural Networks for
Speech and Vision, Chapman and Hall, London, pp. 126-147.
Richerson, P. J., and K. L. Lum. 1980. Patterns of plant species diversity in California:
relation to weather and topography. American Naturalist 116: 504-536.
Ripley, B. D. 1994. Neural networks and related methods for classification. Journal of the
Royal Statistical Society, Series B 56: 409-456.
Ripley, B. D. 1995. Statistical ideas for selecting network architectures. In: B. Kappen and S.
Gielen (Editors), Neural networks: Artificial Intelligence and Industrial Applications.
Springer, London, pp. 183-190.
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press,
403 pp.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by
back-propagating errors. Nature 323: 533-536.
Ryder, R. A. 1982. The morphoedaphic index - use, abuse, and fundamental concepts.
Transactions of the American Fisheries Society 111: 154-164.
Scardi, M. 1996. Artificial neural networks as empirical models for estimating phytoplankton
production. Marine Ecology Progress Series 139: 289-299.
Scardi, M., and L. W. Harding. 1999. Developing an empirical model of phytoplankton
primary production: a neural network case study. Ecological Modelling 120: 213-223.
Schleiter, I. M., D. Borchardt, R. Wagner, T. Dapper, K.-D. Schmidt, H. H. Schmidt, and H.
Werner. 1999. Modelling water quality, bioindication and population dynamics in lotic
ecosystems using neural networks. Ecological Modelling 120: 271-286.
Smith, M. 1994. Neural networks for statistical modelling. Van Nostrand Reinhold, NY,
USA, 235 pp.
Spitz, F., and S. Lek. 1999. Environmental impact prediction using neural network
modelling. An example in wildlife damage. Journal of Applied Ecology 36: 317-326.
Styblinski, M. A., and T. S. Tang. 1990. Experiments in non-convex optimization: stochastic
approximation and simulated annealing. Neural Networks 3: 467-484.
White, H. 1989. Learning in artificial neural networks: a statistical perspective. Neural
Computation 1: 425-464.
Wolpert, D. H. 1992. Stacked generalization. Neural Networks 5: 241-259.
Xu, L., A. Krzyzak, and C. Y. Suen. 1992. Methods for combining multiple classifiers and
their applications to handwriting recognition. IEEE Transactions on Systems, Man and
Cybernetics 22: 418-435.
CHAPTER 3
Artificial neural networks: A predictive tool for fisheries science
ABSTRACT
Understanding and predicting the effects of land-use practices and alterations to
nearshore habitat on fish populations is one of the main challenges confronting fisheries
biologists. Fish-habitat models play an important role in this regard as they provide a means
to predict changes in fish populations across different spatial and temporal scales.
Developing predictive models using traditional statistical approaches is problematic since
species often exhibit complex, nonlinear responses to habitat heterogeneity and biotic
interactions. In this study I demonstrate the ability of a robust statistical technique, artificial
neural networks (ANNs), to model such complexities in fish-habitat relationships. Using
ANNs I provide both explanatory and predictive insight into the within-lake and whole-lake
habitat factors shaping species abundance and occurrence in temperate lakes of south-central
Ontario, Canada. The results show that species presence or absence is highly predictable
based on whole-lake measures of habitat, and that these fish-habitat models exhibit good
generality in predicting occurrence in other lakes from an adjacent drainage. Detailed
evaluation of these models shows that by partitioning the predictive performance of the
models into measures such as sensitivity (ability to predict species presence) and specificity
(ability to predict species absence), the strengths and weaknesses of the models are assessed
more readily. Furthermore, by varying the decision threshold probability at which the
model predicts a species as being present or absent, rather than following the conventional
arbitrary threshold of 0.5, more powerful predictions were achieved. Finally, ANNs provide
a useful approach to examine the interaction effects of nearshore habitat conditions on
species abundance and spatial occupancy. I show that ANNs have considerable promise for
understanding habitat-related controls on fish populations and for predicting future states of
these populations, and can provide a valuable tool for fisheries management.
INTRODUCTION
Changing land-use patterns and alterations to nearshore habitat of lakes have caused
major changes in fish populations throughout the world. Efforts to understand the linkage
between habitat, its use by fish and subsequent productivity have become increasingly
important, and currently are a central issue in the aquatic sciences (Hughes and Noss 1992;
Harig and Bain 1998). In northern temperate lakes, anthropogenic activity has altered many
components of riparian areas and nearshore habitats (Jennings et al. 1999). Modifications
include changes in the composition and density of macrophytes (Bryan and Scarnecchia
1992), quantity and diversity of shoreline habitat such as woody material (Christensen et al.
1996), and size and uniformity of substrate particles (Beauchamp et al. 1994; Jennings et al.
1996). Littoral-zone alterations to habitat can have dramatic and persistent impacts on fish
assemblages since lake habitat provides the template upon which the organization and
dynamics of lentic ecosystems occur (Jackson and Harvey 1989; Tonn et al. 1990; Hinch et
al. 1991; Magnuson et al. 1998).
An obvious first step in any efficient, effective conservation, management or
restoration strategy is to obtain a good knowledge of the relationships between habitat
elements and fish populations. The ability to evaluate the effects of habitat change and other
human impacts on fish populations requires extensive surveying of the fish populations
before and after the change occurs (Lester et al. 1996). However, pollution, shoreline
development, and other forms of habitat degradation are often not single events whose timing
and magnitude are controllable; in addition, their impacts are cumulative. Individual
effects on populations may be so small relative to natural population variability that
statistically significant effects might be detectable only after many years of study. Predictive
fish-habitat models may play a useful role in this regard by providing the ability to forecast
the effects of habitat modification and changing land-use patterns on fish populations and
communities. For instance, fish-habitat models could provide lake managers with the ability
to predict species occurrence and abundance at different spatial scales using whole-lake and
within-lake measures of habitat. Ultimately, predictive models would enhance managers'
ability to predict temporal and spatial scales at which habitat can be changed while
minimizing the impact to lake fish populations.
Although fish-habitat models play important roles in fisheries ecology and
management, developing such models is often difficult because species exhibit complex,
non-linear responses to habitat heterogeneity and biotic interactions. Multiple linear
regression and linear discriminant analysis remain the most frequently used techniques for
modeling fish-habitat relationships, although our confidence in the results is often limited by
the inability to meet a number of assumptions, such as error structure of the variables,
independence of variables, and model linearity (James and McCulloch 1990). The last
assumption is particularly susceptible to violation with ecological data since species
generally exhibit nonlinear or non-monotonic associations with environmental conditions.
Data transformations of variables can improve the results of traditional approaches, but they
are often only partially successful (e.g., Lek et al. 1996; Guégan et al. 1998; Wally and
Fontama 1998). Furthermore, the choice of transformation can influence the results, and thus
potentially bias our interpretation of ecological relationships.
Artificial neural networks (ANNs) are a promising alternative to traditional statistical
approaches as they provide a powerful, flexible learning technique for uncovering non-linear
patterns in data. Applications of ANNs are diverse within the scientific literature, ranging
from social sciences to chemistry, and have recently begun to receive more attention in the
ecological sciences for solving complex pattern-recognition problems (Colasanti 1991;
Edwards and Morse 1995; Lek et al. 1996). ANNs are believed to provide better solutions
than traditional methods when applied to complex systems that may be poorly defined and
understood, and situations where input data are incomplete or ambiguous by nature. Unlike
the more commonly used methods, neural networks do not require particular functional
relationships, make no assumptions regarding the distributional properties of the data, and
require no a priori understanding of system relationships. This makes artificial neural
networks a potentially powerful modeling tool for exploring complex, non-linear biological
problems, such as fish-habitat relationships.
The primary objectives of my study are to demonstrate the use of artificial neural
networks for modeling ecological relationships and illustrate their ability to provide insight
into understanding and predicting relationships between fish populations and the
environment. These objectives are addressed by first modeling relationships between lake-wide
habitat attributes and species occurrence. More specifically, I determine the
predictability of species presence-absence based on readily available, whole-lake habitat
factors (e.g., surface area, maximum depth, elevation) and provide a detailed evaluation of
these models by estimating optimal decision thresholds for prediction to maximize
classification success, sensitivity and specificity of the models. Next, I test the performance
of these models for predicting species occurrences for a second set of lakes, providing an
assessment of the generality of the models. Given that species abundance is perhaps a more
sensitive response variable for studying fish-habitat relationships, I model associations
between within-lake species abundances and near-shore habitat features (e.g., macrophyte
cover, substrate types, site exposure) of littoral zone fishes. Many of these analyses represent
an advance over more conventional model evaluations, and provide important insight into the
predictability of species occurrence and abundance as a function of the macro- and micro-habitat
of lakes. Finally, I present a randomization approach for ANNs that enables
researchers to readily identify the variables that contribute significantly to predicting species
occurrence and abundance. I demonstrate that ANNs can provide powerful predictions and
important explanatory insight into fish-habitat relationships at both regional and local scales.
ARTIFICIAL NEURAL NETWORKS
The ability of the human brain to perform complex tasks, such as pattern recognition,
has motivated a large body of research exploring the computational capabilities of highly
connected networks of relatively simple elements called artificial neural networks (ANNs).
Although ANNs were initially developed to better understand how the mammalian brain
functions, researchers in a variety of scientific disciplines have become more interested in the
potential mathematical utility of neural network algorithms for addressing an array of
problems. For example, ANNs have shown great promise for solving complex pattern-recognition
problems and for developing prediction or classification rules in the biological
sciences (Colasanti 1991; Edwards and Morse 1995; Lek et al. 1996; Lek and Guégan 1999).
Previous studies using ANNs are too numerous to list here; however, their use in fisheries
applications has been limited and includes modeling fish species richness (Guégan et al.
1998), presence-absence (Mastrorillo et al. 1997), abundance (Lek et al. 1996; Brosse et al.
1999), and production (Chen and Ware 1999).
Although there are many types of ANNs (Bishop 1995; Ripley 1996), here I describe
the type used most frequently: the one hidden-layer, feedforward neural network trained by
the back-propagation algorithm (Rumelhart et al. 1986). These neural networks are
extremely popular and have been used extensively in the biological literature since they are
considered to be universal approximators of any continuous function (Cybenko 1989;
Funahashi 1989; Hornik et al. 1989). Furthermore, single hidden-layer networks greatly
reduce computational time and often produce similar results compared to networks with
multiple hidden layers (Kůrková 1992). Below, I discuss the two primary features of ANNs: network
architecture and the back-propagation algorithm used to parameterize the network.
Network architecture and the back-propagation algorithm
Network architecture refers to the number and organization of the computing units
(called neurons) in the network. In the one hidden-layer feedforward network, neurons are
organized in an input layer, a hidden layer and an output layer, with each layer containing
one or more neurons (Fig. 3.1). Each neuron is connected to all neurons in adjacent layers
with an axon; however, neurons within each layer and in non-adjacent layers are not
connected. The input layer typically contains p neurons, representing predictor variables
$x_1, \ldots, x_p$, i.e., one neuron for each of the predictor variables. The number of neurons in the
hidden layer is determined empirically by the investigator to balance the trade-off between
bias and variance (Geman et al. 1992). Additional hidden neurons increase the ability of a
network to approximate any underlying relationship, i.e., reduce bias, but result in a
network with a very large number of free parameters, i.e., increased variance in
predictions due to overfitting the data. Although mathematical derivations exist for selecting
an optimal design, in practice it is common to train networks with different numbers of
hidden neurons and use the performance on a test set to choose the network that performs
best. For continuous and binary response variables the output layer commonly contains one
neuron. However, the number of output neurons can be greater than one if there is more than
one response variable or if the response variable is categorical (i.e., a separate neuron for
classifying observations into each category). Additional neurons with a constant output
(commonly set to 1) are also added to the hidden and output layers (Fig. 3.1), although this
is not mandatory. These are called bias neurons, and play a similar role to that of the
constant term in multiple regression analysis.

Figure 3.1. One-hidden layer, feedforward neural network design.
The connection between any two neurons is assigned a weight that dictates the
intensity of the signal they transmit through the axon. Consequently, the "state" or "activity
level" of each neuron is determined by the input received from the other neurons connected
to it. In feed-forward networks, axon signals are transmitted in a unidirectional path from
input layer to output layer through hidden layers. The states of the input neurons are defined
by the incoming signal (i.e., values) of the predictor variables. The state of each hidden
neuron is evaluated locally by calculating the weighted sum of the incoming signals from the
neurons of the input layer (Fig. 3.1 inset) and then adding a bias input. The weighted sum
is then subjected to an activation function, i.e., a differentiable function of the neuron's total
incoming signal from the input neurons, in order to produce the state of the hidden neuron
(Fig. 3.1 inset). The same procedure is repeated for the axon signals from
the hidden layer to the output layer. The entire process can be written mathematically as:

$$y_k = \phi_o\left(\beta_k + \sum_j w_{jk}\,\phi_h\left(\beta_j + \sum_i w_{ij}\,x_i\right)\right) \qquad (3.1)$$

where $x_i$ are the input signals, $y_k$ are the output signals, $w_{ij}$ are the weights between input
neuron i and hidden neuron j, $w_{jk}$ are the weights between hidden neuron j and output neuron k,
$\beta_j$ and $\beta_k$ are the biases associated with the hidden and output layers, and $\phi_h$ and $\phi_o$ are
activation functions for the hidden and output layers. There are several activation functions,
but the logistic function, defined as:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (3.2)$$

is the most commonly used.
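
As an illustration of eqs. (3.1) and (3.2), the forward pass of a one hidden-layer network can be sketched in a few lines of Python. This is a minimal sketch and not the Matlab code of the thesis: the function names, the list-of-lists weight layout and the numerical weights are all assumptions chosen for the example.

```python
import math

def logistic(x):
    """Logistic activation, f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a one-hidden-layer feedforward network.

    x    : list of p input signals
    w_ih : w_ih[i][j] is the weight from input neuron i to hidden neuron j
    b_h  : bias for each hidden neuron
    w_ho : w_ho[j][k] is the weight from hidden neuron j to output neuron k
    b_o  : bias for each output neuron
    """
    # State of each hidden neuron: activation of (bias + weighted sum of inputs).
    hidden = [
        logistic(b_h[j] + sum(x[i] * w_ih[i][j] for i in range(len(x))))
        for j in range(len(b_h))
    ]
    # State of each output neuron: same rule applied to the hidden states.
    output = [
        logistic(b_o[k] + sum(hidden[j] * w_ho[j][k] for j in range(len(hidden))))
        for k in range(len(b_o))
    ]
    return hidden, output

# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output neuron.
w_ih = [[0.5, -1.0], [1.5, 0.25]]
b_h = [0.0, 0.1]
w_ho = [[2.0], [-1.0]]
b_o = [-0.5]
hidden, output = forward([1.0, -1.0], w_ih, b_h, w_ho, b_o)
```

Because the logistic function is used in both layers here, every output lies strictly between 0 and 1, which is what allows the output to be read as a probability of occurrence later in the chapter.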
Training the neural network involves an error back-propagation algorithm which
finds a set of connection weights that produce an output signal that has a small error relative
to the observed output. During training the weights are adapted to minimize some fitting
criterion. For continuous output variables, the most commonly used criterion is the
least-squares error function:

$$E = \frac{1}{2}\sum_{n=1}^{N}\left(t_n - y_n\right)^2 \qquad (3.3)$$

For dichotomous output variables, the most commonly used criterion is the cross entropy
(i.e., similar to log-likelihood) error function (Bishop 1995):

$$E = -\sum_{n=1}^{N}\left[t_n \ln y_n + (1 - t_n)\ln(1 - y_n)\right] \qquad (3.4)$$

where $t_n$ is the observed output value and $y_n$ is the predicted output value for observation n.
The algorithm adjusts connection weights in a backwards fashion, layer by layer, in the
direction of steepest descent in minimizing the error function (also called gradient descent).
One iteration of the gradient descent algorithm can be summarized as:

$$\Delta w_{st} = -\eta\,\frac{\partial E}{\partial w_{st}} \qquad (3.5)$$

where $\Delta w_{st}$ is the weight change between neuron s and neuron t in the next layer and $\eta$ is
the learning rate. The
training of the network is a recursive process where observations from the training data are
entered into the network in turn, each time modifying the input-hidden and hidden-output
connection weights (using eq. (3.5)). This procedure is repeated with the entire training
dataset (i.e., each of the N observations) for a number of iterations until a stopping rule is
achieved. This type of training is a sequential approach to network optimization, and
contrasts with the batch approach where the entire data set is used to adjust the weights
during each iteration (Bishop 1995). Commonly, network training is stopped when the
difference between predicted outputs from the network and the observed output (i.e., the
error function) is small, or it is stopped to minimize the possibility of overfitting the data.
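
The sequential training procedure described above can be sketched for a network with a single logistic output and the cross-entropy criterion. This is only an illustrative sketch under several assumptions that are not taken from the thesis: the toy data set, the learning rate of 0.5, the initial weight range, the absence of a momentum term and the fixed number of passes used as the stopping rule are all hypothetical.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(net, x):
    """Return the hidden states and the single output for input vector x."""
    hidden = [
        logistic(net["b_h"][j] + sum(x[i] * net["w_ih"][i][j] for i in range(len(x))))
        for j in range(len(net["b_h"]))
    ]
    y = logistic(net["b_o"] + sum(h * w for h, w in zip(hidden, net["w_ho"])))
    return hidden, y

def cross_entropy(net, data):
    """Cross-entropy error summed over all observations."""
    total = 0.0
    for x, t in data:
        y = min(max(predict(net, x)[1], 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total

def train_sequential(data, n_hidden=2, eta=0.5, n_iter=1000, seed=0):
    """Sequential (per-observation) gradient descent on the cross-entropy error."""
    rng = random.Random(seed)
    p = len(data[0][0])
    net = {
        "w_ih": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(p)],
        "b_h": [0.0] * n_hidden,
        "w_ho": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_o": 0.0,
    }
    for _ in range(n_iter):
        for x, t in data:  # weights change after every observation
            hidden, y = predict(net, x)
            # Error signal at the output (logistic output with cross entropy).
            delta_o = y - t
            # Error signals propagated back to the hidden layer,
            # computed before the hidden-output weights are changed.
            delta_h = [
                delta_o * net["w_ho"][j] * hidden[j] * (1.0 - hidden[j])
                for j in range(n_hidden)
            ]
            for j in range(n_hidden):
                net["w_ho"][j] -= eta * delta_o * hidden[j]
                net["b_h"][j] -= eta * delta_h[j]
                for i in range(p):
                    net["w_ih"][i][j] -= eta * delta_h[j] * x[i]
            net["b_o"] -= eta * delta_o
    return net

# Hypothetical presence-absence data: two binary predictors, one binary response.
toy = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
untrained = train_sequential(toy, n_iter=0)
trained = train_sequential(toy, n_iter=1000)
```

Comparing `cross_entropy(untrained, toy)` with `cross_entropy(trained, toy)` shows how far the error has been reduced; a batch version would instead accumulate the gradients over all observations before changing any weight.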
Interpreting neural network connection weights
Although many studies have shown ANNs to exhibit superior predictive power
compared to traditional approaches (e.g., Lek et al. 1996), researchers often call it a 'black
box' approach to statistical modeling since the networks are believed to provide little
explanatory insight into the relative influence of the independent variables in the prediction
process (e.g., Lek and Guégan 1999; Özesmi and Özesmi 1999). The lack of explanatory
power is a major concern since the interpretation of statistical models is desirable for gaining
knowledge of the causal relationships driving ecological phenomena. This has been a major
pitfall of ANNs since traditional statistical approaches can readily identify the influence of
the independent variables in the modeling process, as well as provide a degree of confidence
regarding their contribution. Recent studies provide greater insight into the inner workings
of ANNs, and offer a variety of methods for quantifying and interpreting the
contributions of the independent variables in the network. For example, a number of
computationally intensive approaches have been developed, such as growing and pruning
algorithms (Bishop 1995), partial derivatives (e.g., Dimopoulos et al. 1995) and asymptotic
t-tests. However, these approaches are often not used by biologists, who prefer simpler
algorithms that directly use the network connection weights.
In the neural network, the connection weights between neurons are the linkages
between the inputs and the output of the network, and therefore are the linkage between the
problem and the solution. The relative contribution of the independent variables to the
predictive output of the neural network depends primarily on the magnitude and direction of
the connection weights. Input variables with larger connection weights represent greater
intensities of signal transfer, and therefore are more important in predicting the output
compared to variables with smaller weights. Negative connection weights represent
inhibitory effects on neurons (reducing the intensity or contribution of the incoming signal
and negatively affecting the output), whereas positive connection weights represent
excitatory effects on neurons (increasing the intensity of the incoming signal and positively
affecting the output). Recently, a number of studies have used connection weights to
interpret the participation of input variables in predicting the output of the network (e.g.,
Aoki and Komatsu 1997; Chen and Ware 1999; Özesmi and Özesmi 1999). Other
approaches involve using all the weights of the network to quantify overall variable
importance (e.g., Garson 1991) and sensitivity analysis to determine the spectrum of input
variable contributions in the neural network (e.g., Lek et al. 1996; Mastrorillo et al. 1997;
Guégan et al. 1998). Although these approaches can determine the overall influence of each
predictor variable, interpretation of interactions among the variables is more difficult to
assess since the strength and direction of individual axon connection weights within a
network must be examined directly. With even small networks, the number of connections is
large, and thus the interpretation of the network is difficult. For example, a network
containing 10 input neurons and 7 hidden neurons would have 70 input-hidden connection
weights to examine. Bishop (1995) suggested removing small weights from the network to ease
interpretation; however, how does one decide which weights should be retained or eliminated
from the network? I developed a randomization test for artificial neural networks to address
this question in Chapter 2. This approach randomizes the response variable, then constructs a
neural network using the randomized data and records all input-hidden-output connection
weights (product of the input-hidden and hidden-output weights). This process is repeated a
large number of times to generate a null distribution for each input-hidden-output connection
weight, which is then compared to the observed values to calculate the significance level.
The randomization test provides an objective pruning technique for eliminating connection
weights that have minimal influence on the network output and identifies independent
variables that significantly contribute to the prediction process.
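
The logic of such a randomization test can be sketched as follows. This is a hedged illustration only, not the procedure of Chapter 2: the function `randomization_pvalues`, the two-sided comparison of absolute values and the choice of 499 permutations are assumptions, and `covariance_statistic` is a deliberately simple stand-in for "train a network and return its input-hidden-output weight products", included only so that the sketch runs without a full training routine.

```python
import random

def randomization_pvalues(x, y, fit, n_perm=999, seed=0):
    """Randomization test for a matrix of statistics returned by fit(x, y).

    fit is a hypothetical callable; in the setting of the text it would train a
    network and return the input-hidden-output weight products.
    """
    rng = random.Random(seed)
    observed = fit(x, y)
    # The observed value counts as one member of the null distribution.
    exceed = [[1] * len(row) for row in observed]
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)          # randomize the response variable
        null = fit(x, y_perm)        # refit on the randomized data
        for i, row in enumerate(null):
            for j, value in enumerate(row):
                if abs(value) >= abs(observed[i][j]):
                    exceed[i][j] += 1
    return [[count / (n_perm + 1) for count in row] for row in exceed]

def covariance_statistic(x, y):
    """Stand-in for network training: covariance of each predictor with y."""
    n = len(y)
    y_mean = sum(y) / n
    stats = []
    for i in range(len(x[0])):
        column = [row[i] for row in x]
        c_mean = sum(column) / n
        stats.append([sum((c - c_mean) * (t - y_mean) for c, t in zip(column, y)) / n])
    return stats

# Hypothetical data: the first predictor tracks the response, the second does not.
x = [[float(k), float((k * 7) % 5)] for k in range(20)]
y = [1 if k >= 10 else 0 for k in range(20)]
p_values = randomization_pvalues(x, y, covariance_statistic, n_perm=499)
```

With a real network in place of the stand-in, each entry of `p_values` would correspond to one input-hidden-output connection, and entries above the chosen significance level would be candidates for pruning.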
METHODS
Fish-habitat models for predicting fish presence/absence
The study sites consisted of 128 lakes from the Madawaska River basin and 32 lakes
from the Oxtongue River basin, located in Algonquin Provincial Park, south-central Ontario,
Canada (45°50' N, 78°20' W; Fig. 3.2). Aquatic communities in this region are representative
of relatively natural ecosystems because these lakes are located in a provincial park and are
currently subject to minimal perturbations from development and species introductions. I
developed fish-habitat models for 9 fish species: brown bullhead, common shiner, creek
chub, golden shiner, lake trout, northern redbelly dace, pumpkinseed, smallmouth bass and
yellow perch by modeling species presence-absence as a function of 7 whole-lake variables
(Table 3.1). Predictor variables were chosen to include factors that are related to known
habitat requirements of fish in this geographic region (Matuszek and Beggs 1988; Minns
1989) and included surface area, total shoreline perimeter, maximum depth, total dissolved
solids, pH, lake elevation, and occurrence of summer stratification. For small-bodied fish
(i.e., common shiner, creek chub, golden shiner, and northern redbelly dace) I included the
presence or absence of smallmouth bass, largemouth bass or northern pike as an extra
predictor variable since littoral-zone predation could be an important force. Data were
obtained from the Algonquin Park Fish Inventory Data Base (Crossman and Mandrak 1991),
and details of sampling methodologies are described in Dodge et al. (1985).
The optimal number of neurons in the hidden layer was determined empirically by
comparing the performance of different networks, with 1 to 20 hidden neurons, and choosing
the network with the best predictive performance. I included learning rate (η) and
momentum (α) parameters (varying as a function of the network model) during network
training to ensure a high probability of global network convergence (Bishop 1995), and
considered a maximum of 1000 iterations for the back-propagation algorithm to determine
the optimal axon weights. Prior to training the neural network, the independent variables
were converted to z-scores to standardize the measurement scales of the inputs into the
network, and thus to ensure that the same percentage change in the weighted sum of the inputs
caused a similar percentage change in the unit output.
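
The z-score conversion mentioned above is simple to state in code. The sketch below is illustrative only; it assumes the sample standard deviation (divisor n - 1), which the text does not specify, and the lake areas are made-up values.

```python
import math

def z_scores(values):
    """Standardize a variable to mean 0 and (sample) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical lake surface areas (ha), not values from Table 3.1.
areas_ha = [12.0, 45.5, 230.0, 8.2, 61.3]
standardized = z_scores(areas_ha)
```

After this step a predictor measured in hectares and one measured in pH units enter the network on the same scale, so differences in connection weights are not an artifact of measurement units.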
Figure 3.2. First panel shows the location of study lakes from the Madawaska River
drainage (128 lakes depicted by circles) and Oxtongue River drainage (32 lakes depicted by
triangles) in Algonquin Provincial Park, Ontario, Canada (45°50' N, 78°20' W). Second
panel shows Crosson Lake (45°05' N, 77°20' W) with 20 sampling stations depicted by
circles.

Table 3.1. Summary statistics of whole-lake habitat variables used in the neural networks to
predict species presence or absence.
Macro-scale variables | Madawaska River Drainage (Training Data) | Oxtongue River Drainage (Test Data)
Area (ha)
Maximum Depth (m)
Shoreline Perimeter (km)
Elevation (m)
Total Dissolved Solids (mg/L)
pH
Summer Stratification (0, 1)
Littoral-zone predator (0, 1)

To evaluate predictive performance, fish-habitat models were validated using two
approaches. First, n-fold or "leave-one-out" cross validation (also referred to as jackknife
validation) was used to assess model performance using 128 lakes
from the Madawaska River drainage, as this provides a nearly unbiased estimate of model
performance (Olden and Jackson 2000). Second, I tested the ability of the Madawaska-drainage
models to predict species presence-absence in 32 lakes from the Oxtongue River
drainage. This analysis provided an opportunity to assess the generalization of the models to
other drainages in the same geographic region.
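
Leave-one-out cross validation, as used for the Madawaska lakes, can be sketched generically. In this illustration the `fit` and `predict` arguments are hypothetical callables; a nearest-class-mean rule stands in for the neural network purely to keep the example short, and the depths and presences are invented.

```python
def leave_one_out(x, y, fit, predict):
    """Predict each observation from a model fitted to all the others."""
    predictions = []
    for i in range(len(y)):
        x_train = x[:i] + x[i + 1:]   # drop observation i
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        predictions.append(predict(model, x[i]))
    return predictions

def fit_class_means(x, y):
    """Stand-in model: the mean predictor vector of each class."""
    means = {}
    for label in (0, 1):
        rows = [xi for xi, yi in zip(x, y) if yi == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def predict_nearest_mean(means, xi):
    """Assign the class whose mean is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(xi, means[label]))
    return 1 if dist(1) < dist(0) else 0

# Hypothetical maximum depths (m) and presence (1) or absence (0) of a species.
x = [[5.0], [8.0], [6.5], [30.0], [42.0], [35.5]]
y = [0, 0, 0, 1, 1, 1]
held_out = leave_one_out(x, y, fit_class_means, predict_nearest_mean)
```

Each lake is predicted by a model that never saw it, so the agreement between `held_out` and `y` estimates performance on new lakes rather than fit to the training data.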
The output value from the ANN ranges from 0 to 1, and represents the probability of
species occurrence in a particular lake. I partitioned the overall classification success of each
species model by deriving "confusion matrices" following Fielding and Bell (1997). Using
these matrices I examined three metrics of prediction success. First, I quantified the overall
classification performance of the model as the percentage of lakes where the model correctly
predicted the presence or absence of the species (CC). Second, I examined the ability of the
model to accurately predict species presence, termed model sensitivity (SN). Third, I
examined the ability of the model to accurately predict species absence, termed model
specificity (SP). Rather than simply following the conventional decision threshold of 0.5 to
classify a species as present or absent, I constructed Receiver-Operating Characteristic
(ROC) plots for each species to estimate the predictive ability of the models over all decision
thresholds (Metz 1978). A ROC graph is a plot of the sensitivity/specificity pairs resulting
from continuously varying the decision threshold over the entire range of results observed.
The optimal decision threshold was chosen to maximize overall classification performance of
the model, given equal costs of misclassifying the species as present or absent. The optimal
decision threshold was then used to calculate CC, SN and SP, and Cohen's kappa statistic
was used to assess whether the performance of the model differed from expectations based
on chance alone (Titus et al. 1984).
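
The metrics described above follow directly from the confusion matrix, as the following sketch shows. It is an illustration under stated assumptions: a lake is classified as "present" when the predicted probability is at or above the threshold, candidate thresholds are the observed probabilities themselves, ties are resolved in favour of the lowest threshold, and the data are invented. The significance test for kappa is not included.

```python
def confusion(observed, probs, threshold):
    """Counts of true/false positives and negatives at a decision threshold."""
    tp = fp = tn = fn = 0
    for obs, p in zip(observed, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and obs == 1:
            tp += 1
        elif pred == 1 and obs == 0:
            fp += 1
        elif pred == 0 and obs == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Correct classification (CC), sensitivity (SN), specificity (SP), kappa."""
    n = tp + fp + tn + fn
    cc = (tp + tn) / n
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    # Agreement expected by chance from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (cc - expected) / (1.0 - expected)
    return cc, sn, sp, kappa

def optimal_threshold(observed, probs):
    """Threshold maximizing CC, i.e. equal costs for both kinds of error."""
    def cc_at(threshold):
        tp, fp, tn, fn = confusion(observed, probs, threshold)
        return (tp + tn) / len(observed)
    return max(sorted(set(probs)), key=cc_at)

# Hypothetical observed occurrences and network outputs for eight lakes.
observed = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
best = optimal_threshold(observed, probs)
cc, sn, sp, kappa = metrics(*confusion(observed, probs, best))
```

Scanning every candidate threshold is equivalent to walking along the ROC curve and picking the point with the highest overall classification rate.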
Fish-habitat models for predicting fish abundance
The within-lake analysis examined fish-habitat associations for 4 of the most
abundant species (golden shiner, creek chub, pumpkinseed and yellow perch) in Crosson
Lake, south-central Ontario (45°05' N, 79°02' W). Sampling was done during two time
periods in July and August of the same summer, and the sampling period was coded and
included as a predictor variable to determine whether a temporal component was important in
predicting relative abundance. Sampling consisted of approximately 24-hour sets of baited
minnow traps at depths of either 0.5 m or 1.5 m around the perimeter of the lake (Fig. 3.2).
Species relative abundances were calculated by standardizing the catch to a 24-hour sampling
period. Habitat was assessed visually from within a boat at each sampling location. Sites
were categorized on the basis of relative cover of vegetation (none, sparse, moderate, or
dense), relative cover of woody materials (none, sparse, moderate, or dense), bottom type
(8 ordered categories based on particle size: muck, clay, silt,
sand, gravel, rubble, boulder, and bedrock), presence of terrestrial leaf litter, and degree of
exposure (none, limited, moderate, extreme). The degree of vegetation cover and woody
material cover was coded as 0, 1, 2, or 3 depending on whether the site was classified as
having none, sparse, moderate or dense cover. Exposure and bottom type were coded in a
similar manner. Some sites contained multiple bottom types and these were averaged to give
a single value per site. The number of bottom types present was calculated to provide a
measure of the diversity of bottom types present.
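
The coding of the habitat categories can be sketched as below. The cover codes 0 to 3 follow the text; starting the bottom-type codes at 0 is an assumption (the text states only that bottom type was coded "in a similar manner"), and the dictionary and function names are hypothetical.

```python
COVER_CODES = {"none": 0, "sparse": 1, "moderate": 2, "dense": 3}

# Bottom types ordered by increasing particle size; codes 0-7 are an assumption.
SUBSTRATE_ORDER = ["muck", "clay", "silt", "sand", "gravel", "rubble", "boulder", "bedrock"]
SUBSTRATE_CODES = {name: rank for rank, name in enumerate(SUBSTRATE_ORDER)}

def code_site(vegetation, wood, bottom_types):
    """Convert field categories for one site into numeric predictor values."""
    codes = [SUBSTRATE_CODES[b] for b in bottom_types]
    return {
        "vegetation": COVER_CODES[vegetation],
        "wood": COVER_CODES[wood],
        # Sites with several bottom types receive the average code.
        "substrate_type": sum(codes) / len(codes),
        # Number of distinct bottom types as a measure of substrate diversity.
        "substrate_diversity": len(set(bottom_types)),
    }

# Hypothetical site with sparse vegetation, dense wood, and sand plus gravel.
site = code_site("sparse", "dense", ["sand", "gravel"])
```

Averaging the ordinal codes treats the categories as evenly spaced along the particle-size gradient, which is a simplification worth keeping in mind when interpreting the substrate variable.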
Associations between species abundance and fine-scale habitat variables were
modeled using ANNs, and the optimal number of hidden neurons was determined. The
dependent variable was standardized to the range from 0 to 1 so that it conformed to the
requirements of the sigmoid transfer function used in the building of the neural network, and
independent variables were z-scored (see above section for details). Predictive performance
of the models was evaluated using n-fold cross validation as was done with the
species-occurrence models. Performance of the models was assessed using the Pearson
product-moment correlation between predicted and actual species abundance, and the
root-mean-square-of-error (RMSE) of the predicted values. The Pearson correlation provides a measure
of model accuracy with better models represented by correlation coefficients approaching 1.
RMSE measures model precision with small values representing high precision and large
values indicating low precision.
Matlab programming code for training neural networks for species presence-absence
and abundance is presented in Appendix E.
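
The rescaling of the response and the two performance measures can be written compactly. The sketch below is illustrative: it assumes min-max rescaling to the 0 to 1 range and computes RMSE on that rescaled scale, neither of which is stated explicitly in the text, and the catches and predictions are invented.

```python
import math

def rescale_01(values):
    """Min-max rescaling of a variable to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def rmse(predicted, observed):
    """Root-mean-square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Hypothetical standardized catches and cross-validated predictions.
observed = rescale_01([0.0, 2.0, 5.0, 10.0])
predicted = [0.1, 0.2, 0.6, 0.9]
r = pearson_r(predicted, observed)
error = rmse(predicted, observed)
```

The two measures are complementary: a model can rank sites correctly (high r) while still being systematically off in magnitude (high RMSE).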
RESULTS
Fish-habitat models for predicting fish occurrence
Whole-lake attributes were successful in predicting species presence or absence
(Table 3.2). Species were classified correctly in 60.9 to 80.5% of the lakes, whereas levels of
model sensitivity and specificity varied widely among species and between drainages. In the
Madawaska drainage the predictive performance for 7 out of the 9 species-habitat models
differed significantly from random. Smallmouth bass and lake trout exhibited the highest
classification rates, creek chub and pumpkinseed showed the greatest sensitivity, and brown
bullhead and golden shiner had the greatest specificity. The neural interpretation diagrams
(NIDs) for smallmouth bass, lake trout, common shiner and northern redbelly dace are shown in
Figure 3.3. In these diagrams, the relative magnitude of the connection weights is
represented by line thickness (i.e., thicker lines representing greater weights) and line
shading represents the direction of the weights (i.e., black lines represent positive signals and
gray lines represent negative signals). The relationship between the inputs and outputs is
determined in two steps since there are input-hidden layer connections and hidden-output
layer connections. Positive effects of input variables are depicted by positive input-hidden
and positive hidden-output connection weights, or negative input-hidden and negative
hidden-output connection weights. Negative effects of input variables are depicted by
positive input-hidden and negative hidden-output connection weights, or by negative
input-hidden and positive hidden-output connection weights. The multiplication of connection
weight directions (positive or negative) indicates the effect each input variable has on the
response variable. Interactions among predictor variables can be identified as input variables
with opposing connection weights entering the same hidden neuron. The total contribution
of an input variable is calculated as the sum of the products of the input-hidden and
hidden-output connection weights.
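
The calculation of total contributions can be sketched for a network with one output neuron. The weights below are hypothetical, and expressing each contribution as a percentage of the summed absolute contributions is an assumption about how relative importance might be derived, not a statement of the method behind Figure 3.4.

```python
def input_contributions(w_ih, w_ho):
    """Sum, over hidden neurons, of input-hidden times hidden-output weights.

    w_ih[i][j] : weight from input neuron i to hidden neuron j
    w_ho[j]    : weight from hidden neuron j to the single output neuron
    """
    return [
        sum(w_ih[i][j] * w_ho[j] for j in range(len(w_ho)))
        for i in range(len(w_ih))
    ]

def relative_importance(contributions):
    """Each contribution as a percentage of the summed absolute contributions."""
    total = sum(abs(c) for c in contributions)
    return [100.0 * abs(c) / total for c in contributions]

# Hypothetical network: 3 input variables, 2 hidden neurons, 1 output.
w_ih = [[2.0, -1.0], [-0.5, 1.5], [1.0, 1.0]]
w_ho = [1.0, -2.0]
contributions = input_contributions(w_ih, w_ho)
importance = relative_importance(contributions)
```

In this example the first input has a positive overall effect because a positive-positive path and a negative-negative path both contribute positively, which mirrors the sign rules given in the text.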
Individual and interacting influences of the habitat variables on the predicted
probability of species occurrence were interpreted when connection weights differed
significantly from random (based on α = 0.05). The probability of smallmouth bass
occurrence is positively correlated with lake area, shoreline perimeter and TDS through
hidden neuron A, as well as by lake elevation through hidden neuron B (Figure 3.3). In
contrast, pH (hidden neuron A) and elevation (hidden neurons C and D) negatively influence
the probability of occurrence. Focusing on hidden neuron C, it is evident that the effects of
lake elevation and TDS interact such that the negative influence of elevation on the
probability of smallmouth bass occurrence weakens as TDS increases. Summing weights
across all hidden neurons shows that shoreline perimeter and TDS have a significant positive
effect on the predicted probability of smallmouth bass occurrence (Fig. 3.4). The lake trout
NID shows that lake area and shoreline perimeter interact through hidden neuron A, resulting
in the negative influence of surface area weakening as shoreline perimeter increases (Fig.
3.3). Increasing maximum depth, shoreline perimeter and elevation result in an increased
probability of the occurrence of lake trout (Fig. 3.4). Similar to lake trout, the probability of
common shiner occurrence is affected by the same interaction between area and shoreline
perimeter (hidden neuron C; Fig. 3.3). No habitat variables significantly contribute to
predicted probabilities of common shiner occurrence, although lake area shows the strongest
influence (Fig. 3.4). Probability of northern redbelly dace occurrence decreases with the
presence of a littoral-zone predator. However, this negative influence weakens with
increasing shoreline perimeter and elevation (hidden neuron D; Fig. 3.3). Maximum depth
and elevation positively influence the probability of northern redbelly dace occurrence,
whereas the presence of a littoral-zone predator has a strong negative influence (Fig. 3.4).

Table 3.2. Performance of neural networks for predicting species presence or absence in 128
lakes in the Madawaska River drainage (Training Data) based on leave-one-out cross validation,
and applying the Madawaska networks to predicting occurrence in 32 lakes from the Oxtongue
River drainage (Test Data). The reported values are per cent species occurrence (SO), # of
hidden neurons in network (HN), optimal decision threshold based on ROC analysis (ODT), per
cent correct classification (CC), sensitivity (SN), specificity (SP), Kappa statistic and associated
P-value. Note that ODT values of the Madawaska drainage models were used for predicting
species occurrence in the Oxtongue drainage.
Madawaska River Drainage (Training Data)
Species | HN | SO | ODT | CC | SN | SP | Kappa | P
Brown Bullhead, Common Shiner, Creek Chub, Golden Shiner, Lake Trout, N. Redbelly Dace, Pumpkinseed, Smallmouth Bass, Yellow Perch
(0.41, 0.59) (0.41, 0.59) (0.10, 0.90) (0.46, 0.54) (0.46, 0.54) (0.47, 0.53) (0.31, 0.79) (0.61, 0.39) (0.48, 0.52)
Oxtongue River Drainage (Test Data)
Brown Bullhead, Common Shiner, Creek Chub, Golden Shiner, Lake Trout, N. Redbelly Dace, Pumpkinseed, Smallmouth Bass, Yellow Perch

Figure 3.3. Neural interpretation diagrams (panels include Lake Trout and Northern Redbelly Dace; input neurons: Area, Maximum Depth, Shoreline, TDS, pH, Summer Stratification, Presence of predator).
Fish-habitat models for predicting fish abundance
Within-lake variables predict species abundance with good accuracy and precision for
creek chub (r=0.833, RMSE=0.194); golden shiner (r=0.783, RMSE=0.260); pumpkinseed
(r=0.734, RMSE=0.209); and yellow perch (r=0.784, RMSE=0.204). The NIDs highlight
relationships between predicted abundances and habitat for each species (Fig. 3.5). For
yellow perch the positive influence of wood cover on predicted abundance weakens with
increasing density of vegetation (hidden neuron E), and the positive relationship between
predicted abundance and depth diminishes with increasing site exposure (hidden neuron A;
Fig. 3.5). The amount of wood cover and depth contributes positively to the predicted yellow
perch abundance, whereas vegetation density contributes negatively (Fig. 3.6). Similarly,
interactions among habitat variables for pumpkinseed abundance were common. The
positive influence of wood cover and litter on predicted abundance weakens with increasing
site exposure and depth (hidden neuron A; Fig. 3.5). Accounting for all connection weights,
increasing amounts of cover and litter and decreasing depth predict greater abundance (Fig.
3.6). Predicted golden shiner abundance is negatively correlated with amount of wood cover,
but this relationship weakens with increasing depth (hidden neuron C; Fig. 3.5). Overall,
golden shiner abundance exhibits a positive association with vegetation density, and a
negative association with wood cover and sampling month (Fig. 3.6). Predicted creek chub
abundance is negatively associated with the presence of leaf litter and substrate type;
however, this association diminishes with increasing depth (hidden neuron C; Fig. 3.5).
Vegetation density has a positive influence, whereas leaf litter and depth negatively influence
predicted abundance (Fig. 3.6).

Figure 3.4. Relative importance (% of total contribution) of whole-lake habitat variables in
predicting species presence or absence based on the sum of connection weights joining an
input neuron and the output neuron. Panels: Smallmouth Bass, Lake Trout, Common Shiner, Northern Redbelly Dace; x-axis: Habitat variables.

Figure 3.5. Neural interpretation diagram (NID) for predicting fish species abundance as a function of within-lake variables. The thickness of the lines joining neurons is proportional to the magnitude of the connection weight, and the shade of the line indicates the direction of the interaction between neurons; black connections are positive (excitatory) and gray connections are negative (inhibitory). Solid lines represent connection weights statistically different from zero (α = 0.05), whereas dashed lines represent non-significant connection weights. Black input neurons indicate habitat variables that have an overall positive influence on species abundance, and gray input neurons indicate an overall negative influence on species abundance. Input neurons: Substrate type, Substrate diversity, Presence of litter, Vegetation, Cover, Exposure, Depth, Sampling month.
The Madawaska lake models can be transferred readily to the Oxtongue drainage
lakes, with rates of correct classification, sensitivity and specificity being very similar for
both drainages (Table 3.2). Due to differences in the frequency of occurrence of the species
between the two drainages, only 4 out of the 9 species-habitat models differ significantly
from random at the 5% level, although 6 out of 9 are significant at P<0.063. Most notably,
common shiner, lake trout and smallmouth bass are highly predictable in both the
Madawaska and Oxtongue drainages. It is also apparent that for many species the optimal
decision threshold for classifying a species as present or absent deviates substantially from
0.5, but typically falls in the 0.4-0.6 range.
In addition to using the results from the randomization test to interpret variable
contributions, I used the results as a variable selection method for removing input and hidden
neurons whose incoming or outgoing connection weights were not significantly different
from random. The predictive performance of these "pruned" networks was then re-tested
and I found that the predictability of both species occurrence and abundance was generally
unaffected by the removal of non-significant neurons in the network (Table 3.3). For
[Figure 3.6 panels: Yellow Perch, Pumpkinseed, Golden Shiner, Creek Chub; x-axis: Habitat variables]
Figure 3.6. Relative importance (% of total contribution) of within-lake variables in
predicting species abundance based on the sum of connection weights joining an input
neuron and the output neuron.
Table 3.3. Comparison of model predictions between full and pruned networks with
input variables and hidden neurons removed that were not statistically significant (based on
randomization test results). Pruned network design is reported after species names,
where the three values represent the number of input, hidden and output neurons,
respectively. The reported values are per cent correct classification (CC), sensitivity
(SN), specificity (SP) for predicting species presence-absence (based on the optimal
decision threshold from ROC analysis), and correlation coefficient (r) between predicted
and actual abundances and root-mean-square-of-error of prediction (RMSE).
Species                        Full network      Pruned network
Presence-Absence               CC  SN  SP        CC  SN  SP
  Common Shiner (5-3-1)
  Lake Trout (4-2-1)
  N. Redbelly Dace (7-4-1)
  Smallmouth Bass (5-4-1)
Abundance                      r   RMSE          r   RMSE
  Creek Chub (6-3-1)
  Golden Shiner (5-3-1)
  Pumpkinseed (6-2-1)
  Yellow Perch (6-4-1)
example, after pruning the network, the occurrence of lake trout was highly predictable based
on the reduced set of variables.
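The weight-based reading of variable contributions described above can be illustrated with a minimal sketch. It assumes a network with a single hidden layer, and it assumes that an input's contribution is the sum, over hidden neurons, of the product of its input-hidden weight and the matching hidden-output weight; the exact formula used for the fitted models is not restated in this section. The function name `input_contributions`, the variable labels and all weight values are hypothetical and chosen only for illustration.

```python
def input_contributions(w_input_hidden, w_hidden_output):
    """Return (signed contributions, relative importance in %) per input.

    w_input_hidden  : list of rows, one per input neuron, each holding the
                      weights from that input to every hidden neuron
    w_hidden_output : list of weights from each hidden neuron to the output
    """
    # Assumed definition: multiply the two weights along every
    # input -> hidden -> output path and sum over hidden neurons.
    signed = [
        sum(w_ih * w_ho for w_ih, w_ho in zip(row, w_hidden_output))
        for row in w_input_hidden
    ]
    # Express each input's share as a percentage of the total absolute contribution.
    total = sum(abs(s) for s in signed)
    relative = [100.0 * abs(s) / total for s in signed]
    return signed, relative


# Hypothetical 3-2-1 network; labels and weights are made up for illustration.
labels = ["wood cover", "depth", "vegetation density"]
w_ih = [[1.0, -0.5],
        [0.2, 0.4],
        [-0.8, 0.1]]
w_ho = [2.0, 1.0]

signed, relative = input_contributions(w_ih, w_ho)
for name, s, r in zip(labels, signed, relative):
    direction = "positive" if s > 0 else "negative"
    print(f"{name}: {direction} influence, {r:.1f}% of total contribution")
```

In this sketch the sign of the summed products gives the overall direction of an input's influence, and the percentage gives its relative importance. A pruning step would drop the paths whose weights a randomization test flags as non-significant before summing.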
DISCUSSION
Modeling fish-habitat associations using ANNs
Artificial neural networks have a number of advantages over traditional modeling
approaches that make them potentially beneficial for modeling fisheries data. ANNs are
capable of modeling non-linear associations with a variety of data types (e.g., continuous,
discrete), require no specific assumptions concerning the distributional characteristics of the
independent variables, and can accommodate interactions among independent variables
without any a priori specification (Ripley 1996). ANNs can approximate any continuous
function (Cybenko 1989; Funahashi 1989; Hornik et al. 1989), and thus exhibit flexibility
for modeling non-linear relationships between variables. For these reasons, the application
of ANNs for pattern recognition and prediction has been advocated by researchers in a large
number of disciplines, and has been shown in many ecological studies to exhibit superior
predictive capabilities compared to traditional approaches (Lek and Guégan 1999). It is
important to stress that where the underlying data structure and assumptions are met for a
particular traditional statistical technique, there is no reason to believe that major differences
between traditional approaches and ANNs should exist. However, given that ecological data
are commonly non-linear in nature, that differences may arise due to choice of transformations,
and that achieving linearity is often not possible (e.g., Lek et al. 1996; Guégan et al. 1998; Walley
and Fontama 1998), ANNs are an attractive alternative. Indeed, the results from my study
show that ANNs can provide a powerful quantitative approach for modeling fish-habitat
relationships. I showed that species presence or absence was predictable from whole-lake
measures of habitat, which is consistent with many studies of temperate fish populations
(Jackson and Harvey 1989; Tonn et al. 1990; Magnuson et al. 1998). Species such as
smallmouth bass and lake trout were predicted with high accuracy, which is an especially
attractive result given the economic and societal importance of these sport fishes. Similarly,
ANNs provided accurate predictions of species abundance based on micro-habitat
characteristics within a lake.
One practical disadvantage of ANNs stressed by many researchers has been the lack
of explanatory insight provided by ANNs compared to traditional approaches. However, my
study shows that the contribution of the independent variables in the neural network can be
quantified by direct evaluation of the connection weights. This examination is further aided
by using a randomization approach to remove non-significant weights that do not contribute
to the network prediction, thus assisting in the interpretation of direct and interacting effects
of the variables in the network, and simplifying the network structure. For instance, overall
lake size (i.e., area, maximum depth and shoreline perimeter) and TDS (a surrogate for
productivity) were identified as positively influencing the probability of smallmouth bass and
lake trout occurrence. Lake area and maximum depth are known to influence the occurrence
of these species (e.g., Eadie and Keast 1984; Jackson and Harvey 1989) since they alter the
mixing characteristics and hence the thermal regime of lakes. Furthermore, lake area and
depth serve as an indirect measure of the diversity of habitats available in lakes, which may
be important to support the small-bodied, forage fish upon which smallmouth bass and lake
trout feed. Presence of a littoral-zone predator had a strong negative effect on the probability
of northern redbelly dace occurrence, but had minimal effect on common shiner occurrence.
This result is consistent with studies that suggest that the abundance and distributions of
northern redbelly dace are greatly affected by the presence of a littoral predator (Findlay et
al. 2000), whereas common shiner appears to be more resistant to predation (Chapleau et al.
1997; Whittier et al. 1997). Interestingly, the negative relationship between northern
redbelly dace and presence of a predator weakens substantially with increasing shoreline
perimeter. As shoreline perimeter increases for a given lake area, the shoreline becomes
more convoluted, and thus there is greater potential for the existence of protected embayments and
patchy nearshore habitats providing increased habitat heterogeneity and potential refuge from
predation. In addition, the negative influence of the presence of a predator weakens with
increasing lake elevation. Since my results show that smallmouth bass (the primary predator
in the study lakes) are more likely to be found in lower elevation lakes, the expected negative
relationship between elevation and probability of northern redbelly dace occurrence might
have been disrupted. This supports the finding of Chapleau et al. (1997), who found that
lakes without piscivores exhibited a negative relationship between lake elevation and small-
bodied species richness, but lakes with piscivores exhibited no such relationship.
A number of micro-habitat factors were related to increased species abundance.
Greater abundances of yellow perch and pumpkinseed were predicted for sites with large
amounts of cover (in terms of coarse woody material) and low densities of vegetation. The
opposite was true for golden shiner and creek chub, which were found in greater abundance
in more vegetated sites. Habitat cover was generally more important in the models for creek
chub than those for golden shiner, supporting the view that creek chub may be less tolerant of
habitat modifications (Whittier and Hughes 1998). Although the form of preferred cover
differs among species, these results strengthen the notion that predicted abundance is greater
in areas with greater habitat cover (Bryan and Scarnecchia 1992; Moring and Nicholson
1994; Christensen et al. 1996). Occupancy of complex habitats by golden shiner and creek
chub supports the idea that these habitats provide profitable foraging areas (e.g., Werner et al.
1983; Diehl and Eklov 1995), rather than simply providing shelter from predation, since
Crosson Lake lacks large piscivorous fish. Depth also played an important role in predicted
species abundance. Yellow perch and golden shiner are predicted to occur in greater
abundances at 1.5 m than at 0.5 m depths, whereas pumpkinseed and creek chub were more
numerous in shallower habitats closer to shore. Therefore, spatial occupancy of these species
appears to be divided into 4 components depending on the type of cover (i.e., vegetation or
coarse woody material) and depth. These species-habitat associations were often influenced
by the degree of site exposure. For example, the importance of depth and cover for
predictions of yellow perch and pumpkinseed weakens with increasing site exposure.
Finally, sampling month appeared to be important only for golden shiner abundance, which
decreased from the July to the August sampling period. In summary, the ANNs provided a
powerful technique for uncovering interactions among habitat characteristics of lakes, and
examining their influence on species occurrence and abundance.
Fish-habitat models as a management tool
The development of models for predicting the distribution and abundance of fish
populations is of great importance given that the demand for development of lake shorelines,
and the associated impact on fish populations, continues to increase. My study shows that
ANNs can provide accurate predictions regarding the abundance and occurrence of fish
species based on within- and whole-lake habitat characteristics. Predictions about the effects
of littoral-zone alteration on fish abundance could be a valuable tool for lake managers
deciding whether proposed shoreline modifications should be allowed in a system, or
alternatively, where in a lake modifications should occur in order to minimize their impact on
the fish community. Cottage owners often remove both macrophytes and woody material
from their shorelines to enhance the "cosmetic" appearance of their property and minimize
boating problems. Developed lakes with shoreline residences have substantially lower
densities of coarse-woody material than less developed lakes (Christensen et al. 1996), which
can result in negative impacts on species composition and abundance of fish (e.g., Poe et al.
1986; Everett and Ruiz 1993). Fish-habitat models may be particularly useful for predicting
the cumulative effects of small-scale habitat modifications on fish abundance and spatial
occupancy. Some researchers have argued that modeling the effects of small incremental
habitat change may be impractical due to difficulties in identifying and interpreting the
effects of multiple modifications on fish populations (e.g., Jennings et al. 1999). Others have
argued that modeling such relationships is not possible due to the lack of detailed data (Panek
1979) and powerful quantitative techniques (Burns 1991). I believe that developing
statistical methodologies help to offset these difficulties. There currently exist detailed data
describing within-lake habitat characteristics and fish use for many systems. However, the
problem is that the most common approaches for analyzing and summarizing such data
involve simple, descriptive statistics (Bain et al. 1999). Therefore, better use of available
data and more flexible, powerful statistical methods, such as artificial neural networks, may
enable managers to predict the effects of small-scale habitat modifications on fish
populations.
I have shown that whole-lake habitat attributes, many of which are mappable, can
successfully predict fish occurrence (e.g., lake trout). The development of such models has
important implications for prioritizing surveys and monitoring programs of fish populations
since limits to resources preclude extensive sampling of aquatic habitats. Model predictions
can also be used as first-order estimates of habitat suitability, which could then be followed
by ground truthing and field validation, in order to predict sites with available spawning
habitat (e.g., Knapp and Preisler 1999) or to establish potential locations for species
reintroduction. Similarly, models can be used to predict the likelihood of local establishment
and spread of exotic species, which may help set conservation priorities for preserving
vulnerable species and populations that might be lost locally (e.g., Hrabik and Magnuson
1999).
Developing more powerful fish-habitat models
The predictive abilities of conventional models for species presence-absence are
commonly assessed from overall classification rates alone. In my study I show that by
partitioning the predictive performance of the models into measures such as sensitivity and
specificity, I can assess more readily the strengths and weaknesses of the models. For
example, the presence of creek chub, pumpkinseed, and yellow perch could be predicted with
a high degree of certainty (greater than 90% of the lakes), yet predicting the absence of these
species was more difficult. It is also evident that model sensitivity increases and specificity
decreases with increasing frequency of species occurrence in the lakes. This relationship is
expected, yet is seldom considered in distribution modeling (but see Fielding and Bell 1997;
Manel et al. 1999). There are a number of practical implications for the relationship between
prediction success and species frequency of occurrence. First, a decrease in model sensitivity
for rare species implies that it will be more difficult to predict the occurrence of organisms
whose conservation and management is the most critical. Consequently, our ability to
identify suitable locations for species reintroductions could be limited. Second, drawing
inferences from observed absences of species from sites containing suitable habitat
conditions (e.g., indirect evidence for dispersal, predation, competition) could be limited if the
models exhibit poor specificity. Examining alternative measures of prediction success can
provide more accurate comparisons of different modeling approaches (e.g., Manel et al.
1999) and different models (i.e., different subsets of variables). For instance, I found that
although the overall correct classification rates for some species were similar, levels of
specificity and sensitivity were often quite different. Also, correct classification rates did not
change between the full and pruned neural networks, but both sensitivity and specificity did.
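The point that similar correct classification rates can hide different sensitivity and specificity can be shown with a small worked example. The sketch below uses the standard confusion-matrix definitions of correct classification (CC), sensitivity (SN) and specificity (SP); the function name `classification_rates` and the ten-lake data set are hypothetical and are not results from this study.

```python
def classification_rates(observed, predicted):
    """Return (CC, SN, SP) in per cent for 0/1 presence-absence data.

    Assumes at least one observed presence and one observed absence,
    otherwise SN or SP is undefined (division by zero).
    """
    pairs = list(zip(observed, predicted))
    true_pres = sum(1 for o, p in pairs if o == 1 and p == 1)
    true_abs = sum(1 for o, p in pairs if o == 0 and p == 0)
    false_pres = sum(1 for o, p in pairs if o == 0 and p == 1)
    false_abs = sum(1 for o, p in pairs if o == 1 and p == 0)
    cc = 100.0 * (true_pres + true_abs) / len(pairs)
    sn = 100.0 * true_pres / (true_pres + false_abs)   # presences predicted correctly
    sp = 100.0 * true_abs / (true_abs + false_pres)    # absences predicted correctly
    return cc, sn, sp


# Hypothetical data: 10 lakes, species present in 8 of them.
observed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
model_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]   # one false presence
model_b = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]   # one false absence

rates_a = classification_rates(observed, model_a)   # (90.0, 100.0, 50.0)
rates_b = classification_rates(observed, model_b)   # (90.0, 87.5, 100.0)
```

Both hypothetical models classify 90% of lakes correctly, yet one never misses a presence while the other never misses an absence, which is why CC alone is a weak basis for comparing models.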
The effect of species prevalence in model development is unavoidable as it is
expected that given increased frequency of occurrence, there is a greater probability of
predicting the species to be present. However, varying the decision threshold probability for
which the model predicts presence or absence, rather than following the conventional
arbitrary threshold of 0.5, can compensate for this bias (Fielding 1999) and result in more
powerful models (e.g., Carroll et al. 1999; Manel et al. 1999). Determining the optimal
decision threshold involves constructing Receiver Operating Characteristic (ROC) plots and
then choosing the threshold that maximizes sensitivity and specificity given particular
misclassification costs. This technique has been applied widely to clinical problems in
medicine; however, few ecological studies have employed ROC analysis. In this study I
defined equal costs of false presence (misclassifying a species as present) and false absence
(misclassifying a species as absent); however, in practice, it may be advantageous to assign
more appropriate costs to misclassifications if such information is available. Although
assigning costs is a complex and potentially subjective process, much can be gained. For
example, I might tolerate more false presences for endangered species, and thus adjust the
decision threshold accordingly to develop a more powerful predictive model.
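The threshold search described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in this study: candidate thresholds are scanned in steps of 0.01, and the chosen threshold minimizes a weighted sum of the false-absence rate (1 - sensitivity) and the false-presence rate (1 - specificity), which under equal costs is the same as maximizing sensitivity plus specificity. The function name `optimal_threshold`, the cost parameters and the data are hypothetical.

```python
def optimal_threshold(observed, probabilities,
                      cost_false_presence=1.0, cost_false_absence=1.0):
    """Return (threshold, sensitivity, specificity) with the lowest weighted cost.

    observed      : 0/1 presence-absence values (both classes must occur)
    probabilities : model-predicted probabilities of presence
    """
    n_present = sum(observed)
    n_absent = len(observed) - n_present
    best = None
    for step in range(1, 100):                      # thresholds 0.01 ... 0.99
        threshold = step / 100
        predicted = [1 if p >= threshold else 0 for p in probabilities]
        true_pres = sum(1 for o, q in zip(observed, predicted) if o == 1 and q == 1)
        true_abs = sum(1 for o, q in zip(observed, predicted) if o == 0 and q == 0)
        sensitivity = true_pres / n_present
        specificity = true_abs / n_absent
        # Assumed cost function: weighted false-absence and false-presence rates.
        cost = (cost_false_absence * (1 - sensitivity)
                + cost_false_presence * (1 - specificity))
        if best is None or cost < best[0]:          # keep the first (lowest) best threshold
            best = (cost, threshold, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical rare species: present in 3 of 10 lakes, low predicted probabilities.
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probabilities = [0.45, 0.40, 0.35, 0.38, 0.25, 0.20, 0.20, 0.15, 0.10, 0.05]

equal_costs = optimal_threshold(observed, probabilities)
costly_false_presence = optimal_threshold(observed, probabilities,
                                          cost_false_presence=3.0)
```

With these made-up values a fixed threshold of 0.5 would predict the species absent everywhere, whereas the equal-cost search settles well below 0.5 and recovers every presence. Raising the cost of false presences moves the selected threshold upward at the expense of sensitivity.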
Finally, one important concern is that many models lack geographical transferability
(i.e., poor model performance outside the original data used to develop the model) since
species-environment associations can differ substantially in different systems (Shirvell 1989).
Nevertheless, models may be useful when applied at the scale at which they were developed
and in systems where similar species-environment associations exist. I show that testing
models in adjacent drainages demonstrates the generality of the fish-habitat models. Models
built using lakes in the Madawaska River drainage not only performed well for the same set
of lakes, but actually performed slightly better, on average, for predicting species occurrence
in the Oxtongue River drainage. Although it might be surprising that correct classification
rates were slightly higher in the Oxtongue lakes (i.e., test data) compared to the Madawaska
lakes (i.e., training data), the fact that model sensitivity was on average high and that species
modeled were more prevalent in the Oxtongue lakes than in the Madawaska drainage can
account for this result. Consequently the effect of species prevalence on geographic
transferability of fish-habitat models needs to be considered.
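The arithmetic behind this explanation can be made explicit. The overall correct classification rate is a prevalence-weighted average of sensitivity and specificity, so a model with high sensitivity gains apparent accuracy simply by being applied where the species is more common. The sketch below uses hypothetical values for prevalence, sensitivity and specificity; the function name `expected_correct_classification` is likewise illustrative.

```python
def expected_correct_classification(prevalence, sensitivity, specificity):
    """Correct classification rate implied by prevalence, SN and SP (all as proportions).

    Presences (a fraction `prevalence` of lakes) are classified correctly at rate
    `sensitivity`; absences (the remainder) at rate `specificity`.
    """
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Hypothetical model with high sensitivity and moderate specificity,
# held constant across two drainages that differ only in species prevalence.
sensitivity, specificity = 0.90, 0.60
cc_training = expected_correct_classification(0.50, sensitivity, specificity)  # about 0.75
cc_test = expected_correct_classification(0.80, sensitivity, specificity)      # about 0.84
```

Under these assumed values the same model appears more accurate in the drainage with higher prevalence even though neither sensitivity nor specificity has changed.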
CONCLUSION
ANNs have wide applicability to the study of ecological relationships, both as
exploratory and predictive tools. ANNs provide a flexible approach that can accommodate a
wide variety of study designs without the statistical constraints of independence and linearity,
and they require no a priori understanding of system relationships. Consequently they are
useful techniques for relating the distributions and abundances of fish populations to their
physical environment. Given the obvious importance of establishing linkages between
habitat features, fish distributions, and their utilization of nearshore habitats, the development
and testing of fish-habitat models are important steps in the conservation and management of
lake fish populations. Such predictive models can advance management efforts to
understand fish-habitat associations and predict the effects of natural and anthropogenic-
related habitat modification on freshwater fish populations.
ACKNOWLEDGMENTS
I thank Dr. Sovan Lek for conversations regarding the finer details of ANNs. Funding for this
research was provided by a Graduate Scholarship from the Natural Sciences and Engineering
Research Council of Canada (NSERC) and University of Toronto scholarships to J.D. Olden,
and an NSERC Research Grant to D.A. Jackson.
Aoki, I., and T. Komatsu. 1997. Analysis and prediction of the fluctuations of sardine
abundance using a neural network. Oceanologica Acta 20:81-88.
Bain, M. B., T. C. Hughes, and K. K. Arend. 1999. Trends in methods for assessing
freshwater habitats. Fisheries (Bethesda) 24:16-21.
Beauchamp, D. A., E. R. Byron, and W. A. Wurtsbaugh. 1994. Summer habitat use by
littoral-zone fishes in Lake Tahoe and the effects of shoreline structures. North American
Journal of Fisheries Management 14:385-394.
Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
Brosse, S., J. F. Guégan, J. N. Tourenq, and S. Lek. 1999. The use of neural networks to
assess fish abundance and spatial occupancy in the littoral zone of a mesotrophic lake.
Ecological Modelling 120:299-311.
Bryan, M. D., and D. L. Scarnecchia. 1992. Species richness, composition, and abundance of
fish larvae and juveniles inhabiting natural and developed shorelines of a glacial Iowa
lake. Environmental Biology of Fishes 35:329-341.
Burns, D. C. 1991. Cumulative effects of small modifications to habitat. Fisheries (Bethesda)
16:12-17.
Carroll, C., W. J. Zielinski, and R. F. Noss. 1999. Using presence-absence data to build and
test spatial habitat models for the Fisher in the Klamath region, U.S.A. Conservation
Biology 13:1344-1359.
Chapleau, F., C. S. Findlay, and E. Szenasy. 1997. Impact of piscivorous fish introductions
on fish species richness of small lakes in Gatineau Park, Quebec. Écoscience 4:259-268.
Chen, D. G., and D. M. Ware. 1999. A neural network model for forecasting fish stock
recruitment. Canadian Journal of Fisheries and Aquatic Sciences 56:2385-2396.
Christensen, D. L., B. J. Herwig, D. E. Schindler, and S. R. Carpenter. 1996. Impacts of
lakeshore residential development on coarse woody debris in north temperate lakes.
Ecological Applications 6:1143-1149.
Colasanti, R. L. 1991. Discussions of the possible use of neural network algorithm in
ecological modelling. Binary 3:13-15.
Crossman, E. J., and N. E. Mandrak. 1991. An analysis of fish distribution and community
structure in Algonquin Park: annual report for 1991 and completion report, 1989-1991.
Ontario Ministry of Natural Resources, Toronto, Ontario, Canada.
Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Mathematics
of Control, Signals, and Systems 2:303-314.
Dodge, D. P., G. A. Goodchild, I. MacRitchie, J. C. Tilt, and D. G. Waldriff. 1985. Manual
of instructions: aquatic habitat inventory surveys. Ontario Ministry of Natural Resources,
Fisheries Branch, Toronto, Ontario, Canada.
Diehl, S., and P. Eklov. 1995. Piscivore-mediated habitat use in fish: effects on invertebrate
resources, diet, and growth of perch, Perca fluviatilis. Ecology 76:1712-1726.
Dimopoulos, Y., P. Bourret, and S. Lek. 1995. Use of some sensitivity criteria for choosing
networks with good generalization. Neural Processing Letters 2:1-4.
Eadie, J. M., and A. Keast. 1984. Resource heterogeneity and fish species diversity in lakes.
Canadian Journal of Zoology 62:1689-1695.
Edwards, M., and D. R. Morse. 1995. The potential for computer-aided identification in
biodiversity research. Trends in Ecology and Evolution 10:153-158.
Everett, R. A., and G. M. Ruiz. 1993. Coarse woody debris as a refuge from predation in
aquatic communities: An experimental test. Oecologia 93:475-486.
Fielding, A. H., and J. F. Bell. 1997. A review of methods for the assessment of prediction
errors in conservation presence/absence models. Environmental Conservation 24:38-49.
Fielding, A. H. 1999. Application of machine learning techniques to ecological problems.
Kluver Associates, Norwell, MA.
Findlay, C. S., D. G. Bert, and L. Zheng. 2000. Effect of introduced piscivores on native
minnow communities in Adirondack lakes. Canadian Journal of Fisheries and Aquatic
Sciences 57:570-580.
Funahashi, K. 1989. On the approximate realization of continuous mapping by neural
networks. Neural Networks 2:183-192.
Garson, G. D. 1991. Interpreting neural-network connection weights. Artificial Intelligence
Expert 6:47-51.
Geman, S., E. Bienenstock, and R. Doursat. 1992. Neural networks and the bias/variance
dilemma. Neural Computation 4:1-58.
Guégan, J. F., S. Lek, and T. Oberdorff. 1998. Energy availability and habitat heterogeneity
predict global riverine fish diversity. Nature 391:382-384.
Harig, A. L., and M. B. Bain. 1998. Defining and restoring biological integrity in wilderness
lakes. Ecological Applications 8:71-87.
Hinch, S. G., N. C. Collins, and H. H. Harvey. 1991. Relative abundance of littoral zone
fishes: Biotic interactions, abiotic factors, and postglacial colonization. Ecology
72:1314-1324.
Hornik, K., M. Stinchcombe, and H. White. 1989. Multilayer feedforward networks are
universal approximators. Neural Networks 2:359-366.
Hrabik, T. R., and J. J. Magnuson. 1999. Simulated dispersal of exotic rainbow smelt
(Osmerus mordax) in a northern Wisconsin lake district and implications for
management. Canadian Journal of Fisheries and Aquatic Sciences 56 (Suppl. 1):35-42.
Hughes, R. M., and R. F. Noss. 1992. Biological diversity and biological integrity: current
concerns for lakes and streams. Fisheries (Bethesda) 17:11-19.
Jackson, D. A., and H. H. Harvey. 1989. Biogeographic associations in fish assemblages:
local versus regional processes. Ecology 70:1472-1484.
James, F. C., and C. E. McCulloch. 1990. Multivariate analysis in ecology and systematics:
panacea or Pandora's box? Annual Reviews in Ecology and Systematics 21:129-166.
Jennings, M. J., K. Johnson, and M. Staggs. 1996. Shoreline protection study: a report to the
Wisconsin state legislature. Wisconsin Department of Natural Resources, Publication
PUBL-RS-921-96, Madison.
Jennings, M. J., M. A. Bozek, G. R. Hatzenbeler, E. E. Emmons, and M. D. Staggs. 1999.
Cumulative effects of incremental shoreline habitat modification on fish assemblages in
north temperate lakes. North American Journal of Fisheries Management 19:18-27.
Knapp, R. A., and H. K. Preisler. 1999. Is it possible to predict habitat use by spawning
salmonids? A test using California golden trout (Oncorhynchus mykiss aguabonita).
Canadian Journal of Fisheries and Aquatic Sciences 56:1576-1584.
Kurková, V. 1992. Kolmogorov's theorem and multilayer neural networks. Neural Networks
5:501-506.
Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996.
Application of neural networks to modelling nonlinear relationships in ecology.
Ecological Modelling 90:39-52.
Lek, S., and J. F. Guégan. 1999. Artificial neural networks as a tool in ecological modelling, an
introduction. Ecological Modelling 120:65-73.
Lester, N. P., W. I. Dunlop, and C. C. Willox. 1996. Detecting changes in the nearshore fish
community. Canadian Journal of Fisheries and Aquatic Sciences 53 (Suppl. 1):391-402.
Magnuson, J. J., W. M. Tonn, A. Banerjee, J. Toivonen, O. Sanchez, and M. Rask. 1998.
Isolation vs. extinction in the assembly of fishes in small northern lakes. Ecology
79:2941-2956.
Manel, S., J. M. Dias, S. T. Buckton, and S. J. Ormerod. 1999. Alternative methods for
predicting species distribution: an illustration with Himalayan river birds. Journal of
Applied Ecology 36:734-747.
Mastrorillo, S., S. Lek, F. Dauba, and A. Belaud. 1997. The use of artificial neural networks
to predict the presence of small-bodied fish in a river. Freshwater Biology 38:237-246.
Matuszek, J. E., and G. L. Beggs. 1988. Fish species richness in relation to lake area, pH, and
other abiotic factors in Ontario lakes. Canadian Journal of Fisheries and Aquatic Sciences
45:1931-1941.
Metz, C. E. 1978. Basic principles of ROC analysis. Seminars in Nuclear Medicine
8:283-298.
Minns, C. K. 1989. Factors affecting fish species richness in Ontario lakes. Transactions of
the American Fisheries Society 118:533-545.
Moring, J. R., and P. H. Nicholson. 1994. Evaluation of three types of artificial habitats for
fishes in a freshwater pond in Maine, USA. Bulletin of Marine Science 55:1149-1159.
Olden, J. D., and D. A. Jackson. 2000. Torturing data for the sake of generality: How valid
are our regression models? Écoscience 7 (in press).
Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat
modelling with interspecific interaction. Ecological Modelling 116:15-31.
Panek, F. M. 1979. Cumulative effects of small modifications to habitat. Fisheries (Bethesda)
4:54-57.
Poe, T. P., C. O. Hatcher, C. L. Brown, and S. W. Schloesser. 1986. Comparison of species
composition and richness of fish assemblages in altered and unaltered littoral habitats.
Journal of Freshwater Biology 3:525-536.
Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1986. Learning representations by
back-propagation errors. Nature 323:533-536.
Shirvell, C. S. 1989. Habitat models and their predictive capability to infer habitat effects on
stock size. Pages 173-179 in C. D. Levings, L. B. Holtby, and M. A. Henderson,
editors. Proceedings of the National Workshop on the Effects of Habitat Alteration on
Salmonid Stocks. Vol. 105. Special Publication in Canadian Journal of Fisheries and
Aquatic Sciences, May 6-8, Nanaimo, B.C.
Titus, K., J. A. Mosher, and B. K. Williams. 1984. Chance-corrected classification for use in
discriminant analysis: Ecological applications. American Midland Naturalist 111:1-7.
Tonn, W. M., J. J. Magnuson, M. Rask, and J. Toivonen. 1990. Intercontinental comparison
of small-lake fish assemblages: The balance between local and regional processes.
American Naturalist 136:345-375.
Walley, W. J., and V. N. Fontama. 1998. Neural network predictors of average score per
taxon and number of families at unpolluted sites in Great Britain. Water Resources
32:613-622.
Werner, E. E., G. G. Mittelbach, D. J. Hall, and J. F. Gilliam. 1983. Experimental tests of
optimal habitat use in fish: the role of relative habitat profitability. Ecology
64:1525-1539.
Whittier, T. R., D. B. Halliwell, and S. G. Paulsen. 1997. Cyprinid distributions in Northeast
U.S.A. lakes: evidence of regional-scale minnow biodiversity losses. Canadian Journal of
Fisheries and Aquatic Sciences 54:1593-1607.
Whittier, T. R., and R. M. Hughes. 1998. Evaluation of fish species tolerances to
environmental stressors in lakes in the northeastern United States. North American
Journal of Fisheries Management 18:236-252.
My thesis contributes significantly to both the biological and methodological realms
of population and community ecology.
By developing empirical models for predicting fish species occurrence, abundance,
richness and community composition using both fine- and coarse-scale measures of lake
habitat, I have illustrated that predictive models can provide guidance for the direction of
future research and aid in the conservation and management of fishery resources.
Developing models that recognize the value of both species and communities may result in
more effective conservation of aquatic biodiversity and emphasizes the protection of key
local- and regional-scale processes.
By describing alternative approaches, i.e., tree-based and neural network, for
modeling ecological data and providing a detailed comparison of these approaches to
conventional methods, i.e., logistic regression and discriminant analysis, I have provided an
important comparison between linear and nonlinear techniques for modeling species
occurrence data. Such comparisons will become increasingly important as statistical
techniques grow in number and evolve in future years. In addition, by developing a
complementary tool for quantifying variable contributions in neural networks, I have
eliminated the primary shortcoming of neural networks, thus making this approach both a
powerful explanatory as well as predictive tool for modeling ecological data.
"Any science may be likened to a river. It has its obscure and unpretentious
beginning; its quiet stretches as well as its rapids; its periods of drought as
well as fullness. It gathers momentum with the work of many investigators and
as it is fed by other streams of thought; it is deepened and broadened by the
concepts and generalizations that are gradually evolved."
- Carl P. Swanson
Appendix A
List of 286 study lakes of Algonquin Provincial Park used in Chapter 1
(included is latitude and longitude co-ordinates).
Lake Name Latitude Longitude
AIRY LAKE
ALLAN LAKE
ALLURING LAKE
ALSEVER LAKE
AMIKEUS LAKE
ANIMOOSH LAKE
AUBREY LAKE
BAB LAKE
BAILEY LAKE
BAND LAKE
BARRON LAKE
BASIN LAKE
BEAVERLY LAKE
BERM LAKE
BIG PORCUPINE LAKE
BIG RED LAKE (RED PINE)
BIG ROCK LAKE
BIG TROUT LAKE
BIGGAR LAKE
BILLINGS LAKE
BILLS LAKE (NL)
BILLY LAKE
BIRCHCLIFFE LAKE
BLUE LAKE
BLUFF LAKE (NL)
BOB LAKE (NL)
BONFIELD LAKE
BONNECHERE LAKE
BOOT LAKE
BOOTH LAKE
BORDER LAKE (NL)
BRANCH LAKE
BREWER LAKE
BRIDLE LAKE
BRUCE LAKE
BRULE LAKE 45°38'
BUD LAKE
BURNT ISLAND LAKE
BURNTROOT LAKE
BUTT LAKE
BYERS LAKE
CACHE LAKE
CALUMET LAKE
CANISBAY LAKE
CANOE LAKE
CARCAJOU LAKE
CARL WILSON LAKE
CASTALIA LAKE (NL)
CAT LAKE
CATFISH LAKE
CAUCHON LAKE
CAULIFLOWER LAKE (CLYDAWADKA)
CEDAR LAKE
CHARLES LAKE
CHEWINK LAKE
CHICKAREE LAKE
CLARA LAKE
CLARKE LAKE
CLEMOW LAKE
CLOUD LAKE
CLOVER LAKE
CLUB LAKE
CLYDEGALE LAKE
COLDSPRING LAKE
COON LAKE
COOT LAKE
COSTELLO LAKE
CRADLE LAKE
CRAIG LAKE
CRANEBILL LAKE
CROTCH LAKE
CUCKOO LAKE
DAISY LAKE
DAVID LAKE
DELANO LAKE
DICKSON LAKE
DOVE LAKE
DUCKPOND LAKE
ERABLES LAKE
FARM BAY LAKE
FARM LAKE
FARNCOMB LAKE
FASSETT LAKE 46°01'
FAVOVIER LAKE
FLORENCE LAKE
FOOLS LAKE
FORK LAKE
FOUND LAKE
FOYS LAKE
FRANCIS LAKE
FRANK LAKE
FRASER LAKE
GALEAIRY LAKE
GEM LAKE
GIBSON LAKE
GILMOUR LAKE
GOUINLOCK LAKE
GRAND LAKE
GRAPE LAKE
GREENLEAF LAKE
HAILSTORM LAKE
HAMBONE LAKE
HAPPY ISLE LAKE
HARRY LAKE
HIGHFALLS LAKE
HILLIARD LAKE
HIRAM LAKE
HOGAN LAKE
IGNACE LAKE
IRIS LAKE
ISLET LAKE
JAKE LAKE (MARGARET)
JOE LAKE
JOHNSTON LAKE
KAKASAMIC LAKE
KAWA LAKE
KEARNEY LAKE
KENNEDY LAKE
KINGSCOTE LAKE
KIOSHKOKWI LAKE
KIRKWOOD LAKE
KITTY LAKE
LAKE LA MUIR
LAKE LAVIEILLE
LAKE LOUISA
LAKE OF TWO RIVERS
LAKE TRAVERSE
LAUREL LAKE (LAURIE)
LAWRENCE LAKE
LENGTH LAKE  45°52'  77°40'
LILYPOND LAKE
LINDA LAKE
LITTLE BILLINGS LAKE
LITTLE CAUCHON LAKE
LITTLE CAULIFLOWER LAKE
LITTLE COON LAKE
LITTLE CROOKED LAKE
LITTLE CROW LAKE
LITTLE DICKSON LAKE
LITTLE HAY LAKE
LITTLE ISLAND LAKE
LITTLE JOE LAKE
LITTLE MCCAULAY LAKE
LITTLE MINK LAKE
LITTLE MINNOW LAKE
LITTLE OTTERSLIDE LAKE
LITTLE ROCK LAKE
LITTLE TROUT LAKE
LITTLEDOE LAKE
LONGBOW LAKE
LONGER LAKE
LOONTAIL LAKE
LORNE LAKE
LOST DOG LAKE (NAMEGOS)
LOWER MINNOW LAKE
LOXLEY LAKE
LUCKLESS LAKE
LUPUS LAKE
LYNX LAKE
MAPLE LAKE
MARCH HARE LAKE
MARGARET LAKE (NL)
MARIE LAKE
MATHEWS LAKE
MATTOWACKA LAKE
MCCRANEY LAKE
MCGARVEY LAKE
MCINTOSH LAKE
MCKASKILL LAKE
MERCHANT LAKE
MERGANSER LAKE
MEW LAKE
MILDRED LAKE
MINK LAKE
MISTI LAKE
MOCCASIN LAKE
MOLE LAKE  45°37'
MOUSE LAKE
MUBWAYAKA LAKE
MUDVILLE LAKE
MYRA LAKE (NL)
NAHMA LAKE
NEPAWIN LAKE
NORTH BRANCH LAKE
NORTH DEPOT LAKE
NORTH GRACE LAKE
NORTH RIVER LAKE
NORTH ROUGE LAKE
NORTH SYLVIA LAKE
NORTH TEA LAKE
NORWAY LAKE
O'NEILL LAKE
OPEONGO LAKE
ORAM LAKE
OTTERSLIDE LAKE
OUSE LAKE
OWAISSA LAKE
OWL LAKE
PARK LAKE (LONG)
PECK LAKE
PERLEY LAKE
PHILIP LAKE
PHIPPS LAKE
PINETREE LAKE
PISHNECKA LAKE
POG LAKE
POND LAKE
POTTER LAKE
PRETTY LAKE
PROTTLER LAKE
PROULX LAKE
PROVOKING LAKE
QUEER LAKE
RADIANT LAKE
RAGGED LAKE
RAIN LAKE
RAJA LAKE
RAVEN LAKE
RED FOX LAKE
REDROCK LAKE
RENCE LAKE
ROBIN LAKE
ROBINSON LAKE
ROBITAILLE LAKE  45°41'  77°52'
ROCK LAKE  45°30'
ROD AND GUN LAKE
ROSEBARY LAKE
ROSEPOND LAKE
ROUND ISLAND LAKE
ROUNDBUSH LAKE
RYAN LAKE
RYEGRASS LAKE
SAM LAKE
SANDY LAKE
SAWYER LAKE
SCORCH LAKE
SCOTT LAKE
SEC LAKE
SHAû LAKE
SHALL LAKE
SHALLNOT LAKE
SHIPPAGEW LAKE
SHIRLEY LAKE
SHREW LAKE
SEC0 LAKE
SMITH LAKE
SMOKE LAKE
SOURCE LAKE
SPECKLEDTROUT LAKE
SPOOR LAKE
SPROULE LAKE
ST. ANDREWS LAKE
STRATTON LAKE
STRINGER LAKE
SUNDASSA LAKE
SUNDAY LAKE
SWAN LAKE
SYLVIA LAKE
TANAMAKOON LAKE
TATTLER LAKE
TEA LAKE
TECUMSEH LAKE
TEPEE LAKE
THOMAS LAKE
THREE MILE LAKE
TIM LAKE
TIMBERWOLF LAKE
TIP UP LAKE
TOM THOMPSON LAKE
TROUT LAKE
TUB LAKE (NL)  45°31'  77°58'
TURQUOISE LAKE  45°49'  77°35'
UPPER KAWA LAKE  45°58'  78°53'
UPPER MINNOW LAKE  45°14'  78°14'
VIREO LAKE  45°44'  77°59'
WATERCLEAR LAKE  46°03'  78°47'
WEED LAKE  45°36'  78°53'
WELCOME LAKE  45%'  78°25'
WEST HARRY LAKE  45°32'  78°49'
WESTWARD LAKE  45°29'  78°47'
WHITE PARTRIDGE LAKE  45°50'  78°06'
WHITEBIRCH LAKE  46°04'  78°49'
WHITEFISH LAKE  45°33'  78°25'
WHITEGULL LAKE  45°40'  78°27'
WHITNEY LAKE  45°34'  78°17'
WILKINS LAKE  45°41'  77°55'
Data type (i.e., raw or transformed) for which each species model exhibited the greatest correct classification rate using logistic regression (LRA), discriminant analysis (LDA), classification tree (CART) and artificial neural network (ANN). Optimal (i.e., highest correct classification rate) classification tree size (i.e., number of terminal leaves) and number of hidden neurons in the neural network are reported based on n-fold cross validation. See Table 1.1 for definitions of species codes.
Columns: species code; data type giving the greatest correct classification rate for LRA, LDA, CART and ANN; classification tree size (Raw, Tran); number of hidden neurons in ANN (Raw, Tran).

Species codes: B, BB, BCS, BNS, BSB, BT, C, CC, CS, F, FSD, GS, ID, LC, LS, LT, LW, NRD, PD, PKS, RB, RW, SL, SMB, T-P, WS, YP

Data type by method (entries in the order of the species codes above):
LRA: trans, trans, raw, raw, raw, raw, trans, trans, trans, trans, raw, raw, trans, trans, raw, trans, trans, trans, raw, raw, raw, raw, trans, raw, trans, trans, trans
LDA: trans, raw, trans, raw, raw, trans, mm, trans, trans, raw, raw, trans, trans, trans, trans, raw, trans, trans, raw, raw, raw, raw, raw, trans, trans, raw, trans
CART: raw, trans, raw, trans, raw, raw, raw, raw, raw, raw, raw, raw, trans, raw, raw, raw, raw, raw, raw, raw, raw, trans, raw, raw, trans, raw, raw
ANN (24 of 27 entries legible; species alignment not recoverable): trans, raw, trans, trans, raw, trans, trans, raw, raw, trans, trans, raw, trans, raw, trans, trans, trans, raw, raw, raw, raw, raw, trans, raw

[Classification tree size and number of hidden neurons: values illegible.]
Raw canonical coefficients and centroid means from discriminant function analysis for 27 fish species. Reported values are the
constant, surface area (SA), volume (V), total shoreline perimeter (SP), maximum depth (MD), total dissolved solids (DS), pH, lake
elevation (LE), growing degree-days (GD), occurrence of summer stratification (SS), watershed dummy variable (W1-W3),
occurrence of a littoral-zone predator (P), and centroid means for absence (0) and presence (1) of species. See Table 1.1 for
definitions of species codes.
Species  Constant  SA  V  SP  MD  DS  pH  LE  GD  SS
B    -72.553   0.13414  -0.15849  -0.65002  -0.42942  -0.82191  -0.02415  -0.27955   10.74684   0.20158
BB    -9.661   0.55001   0.01245   0.38280  -0.93483   0.12606   0.14850   0.47477    0.68385   0.86380
BCS   -3.438  -0.00350   0.00000   0.09829  -0.01545   0.00542  -0.57827  -0.00456    0.00521  -0.31488
BNS   38.631  -0.32029   0.27242  -0.55108  -0.85828  -0.18558   0.03492   0.20469   -4.97087   1.54191
BSB   29.314  -0.00352   0.00012   0.06160   0.03908  -0.00220   0.34226  -0.01045   -0.01753  -0.45971
BT    13.980   0.00177  -0.00012   0.00801   0.02761   0.00341   0.40149  -0.00230   -0.01142   2.18487
C     85.399   1.00683  -0.57670  -0.05037   0.78281  -0.22687   0.26515  -2.08847  -10.35438  -0.08330
CC    30.848   0.38836  -0.66936  -0.25970  -0.14147  -0.15285   0.20245  -2.14284   -2.10539   0.64166
CS    22.220  -0.39253   0.29348  -0.67447  -0.35818  -0.54881  -0.33857  -0.36896   -1.82130   0.38996
F    -92.107   0.07210  -0.27828   0.91765   0.02957   0.08386   0.08226   1.45016   11.08536  -0.13684
FSD  -12.949  -0.00205   0.00016   0.04258  -0.00103  -0.00943  -0.82074  -0.00250    0.01159   0.57881
GS    -9.879  -0.00194   0.00000   0.09463   0.00748   0.00914  -0.45303  -0.00144    0.00796  -1.04219
ID  -102.448   1.18460  -0.43924  -0.83440   0.64926  -0.66239   0.17375   0.53127   13.24498  -1.41916
LC    48.866  -0.24350  -0.49507   0.59907  -0.34571   0.12622   0.01510  -2.18245   -4.50200   0.56130
LS   -16.310  -1.18315   0.24102   0.30699  -0.23066  -0.55936   0.24760  -0.24246    3.04681  -0.29099
LT   -63.956   0.55426  -0.62373  -0.55221  -0.49991  -0.40720  -0.21740  -0.22786    9.63304   0.44437
LW     3.500  -0.00412   0.00020  -0.00792   0.00390  -0.00793  -0.58487  -0.00511    0.00171   0.47545
NRD  -62.389   1.71519  -0.62643  -1.50013  -0.50544   0.33827  -0.59822   1.73061    7.59965   0.09378
PD    83.762  -1.35937   0.41020   1.71584   0.16305   0.18013   0.13350  -1.26743  -10.46119  -0.09905
PKS  -16.957  -0.00038   0.00012  -0.06867   0.00712  -0.01236  -0.59472   0.01224    0.00986   1.04475
RB    -0.370  -0.00029   0.00008  -0.02721  -0.00745   0.01591   0.72206   0.00608   -0.00379  -0.38779
Centroid Centroid
W3 P Mean 0 Mean 1
- 0.57801 - 0.42179 -0.94901 - 1.13532 - -0.41430 0.46341 0.22357 1.69792 -0.08709 1.46958 -0.99337 -1.08837 0.47043 -0.68942
0.47362 0.21405 -0.23806 0.63485 0.41821 - -0.82410 0.24723 -0.02311 - -0.31612 1.16604 -0.80561 -0.57566 0.70760 -0.33021 - 0.49945 -0.97332 0.59668 -0.50429 1.24841 0.19547 -0.15208 1.20724
0.00083 - 1.01898 -0.15277 0.86332 0.21063 0.92270 -0.26189 0.39513 1.17296 0.42090 -0.28496 1.12022
-0.52142 0.81143 0.17490 -0.88940 -0.55611 - 0.20958 -0.92139
-0.00331 - 0.82141 -0.73437 -0.69644 - 0.13472 -0.85325 -1.20315 1.42492 0.38927 -0.32439 0.33936 - 1.10672 -0.35944 0.51171 -0.11883 - 0.46736 -0.25127 -0.16135 - 0.26885 - 1.92805
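The way such coefficients are typically applied can be sketched in a few lines of Python (the thesis itself reports only the fitted values). The sketch assumes the usual two-group reading of a discriminant function: a lake's score is the constant plus the sum of each raw coefficient times the corresponding predictor, and the lake is assigned to whichever centroid mean (absence or presence) lies closer to that score. The function names and all numbers in the usage note are hypothetical and are not taken from the table.

```python
def discriminant_score(constant, coefficients, predictors):
    """Constant plus the weighted sum of the predictor values."""
    return constant + sum(c * x for c, x in zip(coefficients, predictors))


def classify(score, centroid_absent, centroid_present):
    """Return 1 (presence) if the score is nearer the presence centroid, else 0."""
    if abs(score - centroid_present) < abs(score - centroid_absent):
        return 1
    return 0
```

For example, with an invented constant of -1.0, coefficients [0.5, 2.0] and predictor values [2.0, 0.25], the score is 0.5; with invented centroids of -0.8 (absence) and 1.2 (presence) the lake would be classed as a presence. Nearest-centroid assignment presumes equal prior probabilities for the two groups, which may not match the analysis reported in Chapter 1.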
Appendix E
MatLab (version 5.3 release 11) programming code for artificial neural network training using the least-sum-of-squares error function (i.e., continuous response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.
% note: the datafile contains the data for predictor variables (in columns 2 to p+1)
% and the response variable (in column 1)
load c:\temp\datafile.txt;
data=datafile;    % load stores the file contents in a variable named after the file

% a priori commands
pausetime=1;
% turn off the warnings associated with learnbpm, logsig and deltalog
nntwarn off

% GLOBAL VARIABLES (must be specified by user)
IN=8;             % number of input neurons (i.e., number of predictor variables)
HN=4;             % number of hidden neurons (i.e., determined empirically using n-fold CV)
ON=1;             % number of output neurons (i.e., commonly equal to 1)
N=286;            % number of observations
weightrange=0.3;  % initial range of connection weights (i.e., -0.3 ... 0.3)
noepoch=1001;     % number of iterations to consider (including the first random)

% creating input matrix
P=data(:,1+ON:IN+ON); P=P';
% creating output matrix
T=data(:,ON); T=T';

% N-fold cross validation
for k=1:N

  % local model parameters for network optimization
  lr=0.0001;       % initial learning rate parameter
  lr_inc=1.05;     % variable learning rate increase
  lr_dec=0.75;     % variable learning rate decrease
  err_ratio=1.06;  % threshold if new error rate exceeds then modify learning rate
  momentum=0.95;   % initial momentum parameter

  % setting initial iteration number
  epoch=1;
  % selecting n-1 observations for training
  cvP=P; cvP(:,k)=[];
  [cvP,meancvP,stdcvP]=prestd(cvP);
  % selecting n-1 observations for training and standardizing data
  cvT=T; cvT(:,k)=[];
  maxcvT=max(cvT); mincvT=min(cvT);
  cvT=(cvT-mincvT)/(maxcvT-mincvT);

  % initial connection weights set equal to -0.3 to 0.3 in order to increase prob(convergence)
  % calculated as a random uniform number between -1 and 1 multiplied by 0.3
  rand('seed',sum(100*clock))
  W1=rands(HN,IN)*weightrange;  % input-hidden weights
  B1=rands(HN,1)*weightrange;   % bias neuron into hidden
  W2=rands(ON,HN)*weightrange;  % hidden-output weights
  B2=rands(ON,1)*weightrange;   % bias neuron into output

  % calculating theoretical neuron output from input into hidden (A1) and from hidden
  % to output (A2) using a log sigmoid transfer function
  A1=logsig(W1*cvP,B1);
  A2=logsig(W2*A1,B2);

  % calculating the error of the network (difference between theoretical and actual output)
  E=cvT-A2;
  SSE=sumsqr(E);

  % setting vectors of weights and biases back to zero
  mc=0; dW1=(W1*0); dW2=(W2*0); dB1=(B1*0); dB2=(B2*0);

  % conduct the following commands until epoch (i.e., # of iterations) is greater than noepoch
  while epoch<noepoch
    D2=deltalog(A2,E);
    D1=deltalog(A1,D2,W2);
    % neural network learning process based on MATLAB 4.0
    [dW1,dB1]=learnbpm(cvP,D1,lr,mc,dW1,dB1);
    [dW2,dB2]=learnbpm(A1,D2,lr,mc,dW2,dB2);
    TW1=W1+dW1; TB1=B1+dB1; TW2=W2+dW2; TB2=B2+dB2;
    % optimize network with the new values of the connection weights (TW and TB)
    TA1=logsig(TW1*cvP,TB1);
    TA2=logsig(TW2*TA1,TB2);
    % calculating the error of the network (difference between theoretical and actual output)
    TE=cvT-TA2;
    TSSE=sumsqr(TE);
    % altering the learning and momentum rate according to the error rate of the network
    if TSSE>SSE*err_ratio
      lr=lr*lr_dec; mc=0;
    else
      if TSSE<SSE
        lr=lr*lr_inc; mc=momentum;
      end
      W1=TW1; B1=TB1; W2=TW2; B2=TB2; A1=TA1; A2=TA2; E=TE; SSE=TSSE;
    end
    epoch=epoch+1;
  end

  % standardizing the 1 omitted observation using the mean and std. of the n-1 training observations
  cvdata=trastd(P(:,k),meancvP,stdcvP);
  % predicting the 1 omitted observation
  cvTA1=logsig(TW1*cvdata,TB1);
  predictT(:,k)=logsig(TW2*cvTA1,TB2);
  predictT(:,k)=predictT(:,k)*(maxcvT-mincvT)+mincvT;
  fprintf('omitted observation: %7.4f\n',k);

end

c=corrcoef(T,predictT); c=c(1,2);
fprintf('Best correlation coefficient: %7.4f\n',c);
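Two steps of the listing above are easy to lose in the toolbox calls, so they are restated here as a minimal Python sketch (standard library only; this is not the thesis's MATLAB code). The first part mirrors the leave-one-out logic: the means and sample standard deviations are computed from the n-1 training rows only and then applied to the omitted row. The second part mirrors the adaptive learning-rate rule: a trial update is rejected when its error exceeds the current error by more than the error ratio, and the learning rate grows only when the error falls. The function names are hypothetical; the default constants are those set in the listing.

```python
import statistics


def fit_standardizer(rows):
    """Column means and sample standard deviations of the training rows."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return means, stds


def standardize(row, means, stds):
    """Centre and scale one row with previously fitted means and stds."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def leave_one_out(rows):
    """Yield (k, standardized training rows, standardized omitted row)."""
    for k in range(len(rows)):
        train = rows[:k] + rows[k + 1:]
        means, stds = fit_standardizer(train)
        scaled_train = [standardize(r, means, stds) for r in train]
        yield k, scaled_train, standardize(rows[k], means, stds)


def adapt_step(sse, trial_sse, lr, mc, momentum=0.95,
               lr_inc=1.05, lr_dec=0.75, err_ratio=1.06):
    """Return (accept, new learning rate, new momentum) for a trial update."""
    if trial_sse > sse * err_ratio:
        # Error grew too much: reject the trial weights and slow down.
        return False, lr * lr_dec, 0.0
    if trial_sse < sse:
        # Error fell: accept, speed up and switch momentum on.
        return True, lr * lr_inc, momentum
    # Error rose only slightly: accept with unchanged settings.
    return True, lr, mc
```

Fitting the standardizer on the training rows alone keeps the omitted observation from influencing its own prediction, which is the point of the cross validation.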
MatLab (version 5.3 release 11) programming code for artificial neural network training using the cross entropy error function (i.e., binary response variable) in the backpropagation algorithm, where predictions are based on n-fold cross validation.
% note: the datafile contains the data for predictor variables (in columns 2 to p+1)
% and the response variable (in column 1)
load c:\temp\datafile.txt;
data=datafile;    % load stores the file contents in a variable named after the file

% a priori commands
pausetime=1;
% turn off the warnings associated with learnbpm, logsig and deltalog
nntwarn off

% GLOBAL VARIABLES (must be specified by user)
IN=8;             % number of input neurons (i.e., number of predictor variables)
HN=4;             % number of hidden neurons (i.e., determined empirically using n-fold CV)
ON=1;             % number of output neurons (i.e., commonly equal to 1)
N=286;            % number of observations
weightrange=0.3;  % initial range of connection weights (i.e., -0.3 ... 0.3)
noepoch=1001;     % number of iterations to consider (including the first random)

% creating input matrix
P=data(:,1+ON:IN+ON); P=P';
% creating output matrix
T=data(:,ON); T=T';

% N-fold cross validation
for k=1:N

  % local model parameters for network optimization
  lr=0.0001;       % initial learning rate parameter
  lr_inc=1.05;     % variable learning rate increase
  lr_dec=0.75;     % variable learning rate decrease
  err_ratio=1.06;  % threshold if new error rate exceeds then modify learning rate
  momentum=0.95;   % initial momentum parameter

  % setting initial iteration number
  epoch=1;
  j=k;             % index of the omitted observation (must not exceed N)
  % selecting n-1 observations for training
  cvP=P; cvP(:,j)=[];
  [cvP,meancvP,stdcvP]=prestd(cvP);
  % selecting n-1 observations for training and standardizing data
  cvT=T; cvT(:,j)=[];
  maxcvT=max(cvT); mincvT=min(cvT);
  cvT=(cvT-mincvT)/(maxcvT-mincvT);

  % initial connection weights set equal to -0.3 to 0.3 in order to increase prob(convergence)
  % calculated as a random uniform number between -1 and 1 multiplied by 0.3
  rand('seed',sum(100*clock))
  W1=rands(HN,IN)*weightrange;  % input-hidden weights
  B1=rands(HN,1)*weightrange;   % bias neuron into hidden
  W2=rands(ON,HN)*weightrange;  % hidden-output weights
  B2=rands(ON,1)*weightrange;   % bias neuron into output

  % calculating theoretical neuron output from input into hidden (A1) and from hidden
  % to output (A2) using a log sigmoid transfer function
  A1=logsig(W1*cvP,B1);
  A2=logsig(W2*A1,B2);
  % calculating the error of the network (difference between theoretical and actual output)
  E=cvT-A2;
  % CROSS ENTROPY (negative log-likelihood, so that smaller values indicate a better fit)
  ce=[];
  for a=1:N-1
    ce(a)=-log(A2(a)^cvT(a)*(1-A2(a))^(1-cvT(a)));
  end
  SSE=sum(ce);
  % setting vectors of weights and biases back to zero
  mc=0; dW1=(W1*0); dW2=(W2*0); dB1=(B1*0); dB2=(B2*0);

  % conduct the following commands until epoch (i.e., # of iterations) is greater than noepoch
  while epoch<noepoch
    D2=deltalog(A2,E);  % output-layer delta (assumed identical to the least-squares program)
    D1=deltalog(A1,D2,W2);
    % neural network learning process based on MATLAB 4.0
    [dW1,dB1]=learnbpm(cvP,D1,lr,mc,dW1,dB1);
    [dW2,dB2]=learnbpm(A1,D2,lr,mc,dW2,dB2);
    TW1=W1+dW1; TB1=B1+dB1; TW2=W2+dW2; TB2=B2+dB2;
    % optimize network with the new values of the connection weights (TW and TB)
    TA1=logsig(TW1*cvP,TB1);
    TA2=logsig(TW2*TA1,TB2);
    % calculating the error of the network (difference between theoretical and actual output)
    TE=cvT-TA2;
    % CROSS ENTROPY of the trial network
    ce=[];
    for a=1:N-1
      ce(a)=-log(TA2(a)^cvT(a)*(1-TA2(a))^(1-cvT(a)));
    end
    TSSE=sum(ce);
    % altering the learning and momentum rate according to the error rate of the network
    if TSSE>SSE*err_ratio
      lr=lr*lr_dec; mc=0;
    else
      if TSSE<SSE
        lr=lr*lr_inc; mc=momentum;
      end
      W1=TW1; B1=TB1; W2=TW2; B2=TB2; A1=TA1; A2=TA2; E=TE; SSE=TSSE;
    end
    epoch=epoch+1;
  end

  % standardizing the 1 omitted observation using the mean and std. of the n-1 training observations
  cvdata=trastd(P(:,j),meancvP,stdcvP);
  % predicting the 1 omitted observation
  cvTA1=logsig(TW1*cvdata,TB1);
  predictT(:,j)=logsig(TW2*cvTA1,TB2);
  predictT(:,j)=predictT(:,j)*(maxcvT-mincvT)+mincvT;
  fprintf('omitted observations: %7.4f\n',j);

end

m=sum(abs(T-round(predictT)))/N;
fprintf('Misclassification rate: %7.4f\n',m);
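The two quantities this program revolves around, the cross entropy error and the misclassification rate, can be written compactly as a Python sketch (standard library only; this is not the thesis's MATLAB code). It assumes binary targets coded 0/1 and predicted probabilities strictly inside (0, 1); the clipping constant `eps` and the function names are additions for illustration. The cross entropy is taken as the negative log-likelihood, so smaller values indicate a better fit, and a prediction counts as a presence when its probability is at least 0.5, which corresponds to rounding the network output.

```python
import math


def cross_entropy(targets, probs, eps=1e-12):
    """Negative log-likelihood of binary targets given predicted probabilities."""
    total = 0.0
    for t, p in zip(targets, probs):
        # Clip to avoid log(0) when a prediction saturates at 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total


def misclassification_rate(targets, probs):
    """Share of observations whose rounded prediction differs from the target."""
    wrong = 0
    for t, p in zip(targets, probs):
        predicted = 1 if p >= 0.5 else 0
        if predicted != t:
            wrong += 1
    return wrong / len(targets)
```

Summing the per-observation log terms, rather than taking the log of a product of powers as in the listing, gives the same value and is less prone to underflow when there are many observations.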