Ecological Modelling 192 (2006) 385–409

Exploring ecological patterns with structural equation modeling and Bayesian analysis

G.B. Arhonditsis a,∗, C.A. Stow b, L.J. Steinberg c, M.A. Kenney a, R.C. Lathrop d, S.J. McBride a, K.H. Reckhow a

a Nicholas School of the Environment and Earth Sciences, Duke University, Durham, NC 27708, USA
b Department of Environmental Health Sciences, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
c Department of Civil and Environmental Engineering, Tulane University, New Orleans, LA 70118, USA
d Wisconsin Department of Natural Resources and Center for Limnology, University of Wisconsin, Madison, WI 53706, USA

Received 7 December 2004; received in revised form 26 May 2005; accepted 18 July 2005
Available online 3 October 2005

Abstract

Structural equation modeling is a multivariate statistical method that allows evaluation of a network of relationships between manifest and latent variables. In this statistical technique, preconceptualizations that reflect research questions or existing knowledge of system structure create the initial framework for model development, while both direct and indirect effects and measurement errors are considered. Given the interesting features of this method, it is quite surprising that the number of applications in ecology is limited, and even less common in aquatic ecosystems. This study presents two examples where structural equation modeling is used for exploring ecological structures; i.e., summer epilimnetic phytoplankton dynamics. Both eutrophic (Lake Mendota) and mesotrophic (Lake Washington) conditions were used to test an initial hypothesized model that considered the regulatory role of abiotic factors and biological interactions on lake phytoplankton dynamics and water clarity during the summer stratification period. Generally, the model gave plausible results, while a higher proportion of the observed variability was accounted for in the eutrophic environment. Most importantly, we show that structural equation modeling provided a convenient means for assessing the relative role of several ecological processes (e.g., vertical mixing, intrusions of the hypolimnetic nutrient stock, herbivory) known to determine the levels of water quality variables of management interest (e.g., water clarity, cyanobacteria). A Bayesian hierarchical methodology is also introduced to relax the classical identifiability restrictions and treat them as stochastic. Additional advantages of the Bayesian approach are the flexible incorporation of prior knowledge on parameters, the ability to get information on multimodality in marginal densities (undetectable by standard procedures), and the fact that the structural equation modeling process does not rely on asymptotic theory which is particularly important when the sample size is small (commonly experienced in environmental studies). Special emphasis is given on how this Bayesian methodological framework can be used for assessing eutrophic conditions and assisting water quality management. Structural equation modeling has several attractive features that can be particularly useful to researchers when exploring ecological patterns or disentangling complex environmental management issues.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Structural equation modeling; Ecological patterns; Bayesian analysis; Phytoplankton dynamics; Water quality management

∗ Corresponding author at: Department of Physical and Environmental Sciences, University of Toronto, Toronto, Ont., Canada M1C 1A4. Tel.: +1 416 208 4858; fax: +1 416 287 7279.
E-mail addresses: [email protected], [email protected] (G.B. Arhonditsis).

0304-3800/$ – see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.ecolmodel.2005.07.028

1. Introduction

“...Thinking only in terms of directly observable variables confines our horizons and limits our assessment of complex systems...” (Malaeb et al., 2000, Environmental and Ecological Statistics, pg 95)

Mechanistic understanding and prediction of patterns is a key feature in ecological research (Peters, 1991; Jorgensen, 1997; Pace, 2001; Carpenter, 2002; Arhonditsis and Brett, 2005). Given the hierarchical structure of biological information, observations in a particular study scale are usually associated with upper-level joint behaviors and lower-level processes, thus it is essential for ecologists, when exploring patterns, to be able to shift between different scales of description in space, time, and organizational complexity (Sugihara and May, 1990; Levin, 1992). For example, Vepsalainen and Spence (2000) advocated the development of “general explanatory frameworks” that comprise (i) the focal level, defined by the pattern/process of interest, and (ii) the contiguous lower and upper levels, associated with the initiating conditions and boundary constraints, respectively. Recognizing the intertwined nature of ecological hierarchies, this framework was proposed as a convenient means to select the appropriate level of information for understanding specific observations/events. Similarly, earlier studies by Salthe (1985) and O’Neill et al. (1986) acknowledged the importance of a basic triadic approach, which explicitly considers the effects of slow, larger-scale and fast, smaller-scale processes on the focal level observed ecological patterns. The problem of scale can also be raised from our inability to measure with absolute accuracy characteristics and properties of conceptual interest (McCune and Grace, 2002). Ecologists are frequently interested in processes that cannot be measured effectively by one single variable, and a common way to address this problem is to perceive the ecological concept as a nested hierarchy that can be decomposed into an infinite number of sub-processes and causal interactions. The premise behind this partitioning is that the selective measurement of some of these elements can improve our understanding about the collective behavior/mechanism. The attempt to abstract essential features and reduce the complexity of the real world is ubiquitous in ecological practice. Levin (1992) characterized the study of the transferability of ecological phenomena across scales and the development of laws of simplification and aggregation as a central problem in ecology and evolutionary biology.

Modeling as a tool for elucidating ecological patterns is subject to the same problem of complexity, and the optimal model dimension has been extensively debated in the ecological literature (Levins, 1966; Costanza and Sklar, 1985; Rastetter et al., 1992; Jorgensen, 1999; Reckhow, 1999; Arhonditsis and Brett, 2004). Applied ecologists are inclined to select realism and precision in favor of generality; driven by technical or conceptual limitations, they adopt “intuitively manageable scales” and develop models that aim to provide “faithful descriptions” of the data (Vepsalainen and Spence, 2000). A characteristic example is the application of regression analysis for analyzing data from experimental/observational studies and the use of the best-fit model for inference and hypothesis testing. While useful for investigating causality in nature, regression models have several limitations: (i) the predictor variables are assumed to be free of measurement error or uncontrolled variation, (ii) the assumption of normality is frequently violated by the errors in the resultant models, and (iii) hypotheses are formulated in a way that solely allow for the inclusion of directly observed variables (Malaeb et al., 2000). Therefore, it is increasingly recognized that what is missing from the common ecological practice is a statistical technique with the ability to unravel complex interrelationships and aid generalization and theory testing by relaxing some of these restrictions.


Structural equation modeling (SEM) is a method that can address several of the above restrictions, providing a robust technique for studying interdependencies among a set of correlated variables. We believe that it is well suited to provide insight into the relationships of the often correlated and error-contaminated physical, chemical, and biological variables in ecological research. To that end, we developed and tested a conceptual model concerning the regulatory role of abiotic conditions and biological interactions on lake phytoplankton dynamics and water clarity during the summer-stratified period in Lake Washington and in Lake Mendota. Our objectives are: (1) to assess the adequacy of this conceptual model, (2) to examine the value of structural equation modeling in empirical confirmation of scientific hypotheses, and (3) to discuss how this multivariate statistical method can be combined with Bayesian analysis and assist natural resource management.

2. Methods

2.1. Structural equation modeling

SEM is a multivariate statistical methodology that encompasses factor and path analysis (McCune and Grace, 2002; Pugesek et al., 2003). Even though the major advancements towards generalization of the method occurred after the 1970s (Keesling, 1972; Joreskog, 1973), attempts to analyze multiple causal pathways and partition direct and indirect relationships among variables date back approximately 80 years (Wright, 1918; Wright, 1921). In contrast with multivariate regression, SEM allows the user to explicitly test indirect effects between two explanatory variables, where effects between two variables can be mediated by another intermediary variable (e.g., Bollen, 1989a; Kline, 1998). Additionally, SEM can explicitly incorporate uncertainty due to measurement error or lack of validity of the observed variables. The latter aspect refers to the essential feature of allowing theoretical uni- or multidimensional concepts to be amalgamated into single entities of variant “degrees of abstraction” (e.g., phytoplankton or zooplankton community versus environmental degradation or ecosystem health). More specifically, SEM can represent variables of conceptual interest that are not directly measurable, by using multiple indicator (observed) variables. An aquatic ecosystem example is the combination of several indicators such as photosynthetic pigments (chlorophylls, carotenoids), primary productivity, algal biovolume or carbon biomass, to model the latent (unobservable) variable “phytoplankton”. However, Bollen (1989a) emphasized the need for caution when developing latent variable models and discussed several validity tests for examining the correspondence between concepts and observed variables. It should be noted that principal component analysis has also the ability to reduce a set of correlated variables to higher-order components but has a limited flexibility to specify the model structure prior to the analysis and does not account for measurement error (McCune and Grace, 2002).

SEM is an “a priori” statistical technique, where the modeler proposes and tests a hypothesized structure/mechanism that usually reflects existing knowledge. The formulation of this initial model creates an expected covariance structure, which is tested against the covariance matrix from observed data. The null hypothesis $H_0$ that formalizes the idea of structural equation modeling is:

$H_0 : \Sigma = \Sigma(\theta)$ (1)

where $\Sigma$ is the population (or sample) covariance matrix of observed variables, $\theta$ is a vector that contains the model parameters, and $\Sigma(\theta)$ is the model-implied covariance matrix (Bollen, 1989a). In contrast with conventional statistical models where rejection of the null hypothesis is sought, the objective of structural equation modeling is acceptance of the null hypothesis. Not rejecting $H_0$, means that existing data support the proposed model (hypothesized covariance structure). The model is fit by minimizing the differences between observed and model-predicted covariances. Commonly used fitting functions include maximum likelihood (ML), unweighted least squares (ULS) and generalized least squares (GLS). Finally, Joreskog and Sorbom (1993) articulated the important issue of extracting the appropriate inferences from model results, by distinguishing among three situations: (i) strictly confirmatory: a single model is formulated and tested against datasets, ideally after model specification. In this case, the model can be accepted or rejected, (ii) alternative models: several prespecified models are tested against a single set of data. In this case, one of the models should be selected, and (iii) model generating: the analysis starts with a tentative model, which is subject to evaluation and modification. These respecifications should provide meaningful interpretations and the final model needs further confirmation (Raykov, 1992; McCune and Grace, 2002).
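As a minimal numerical sketch of Eq. (1), and not a reproduction of the models fitted in this study, consider a hypothetical recursive path model with three observed variables (x, m, y) in which x affects y both directly and indirectly through m. The coefficient values, the variable names, and the helper functions `implied_covariance`, `ml_discrepancy`, and `effects` are illustrative assumptions. The ML fitting function is written in its standard form, F = ln|Σ(θ)| + tr(SΣ(θ)⁻¹) − ln|S| − p, which is zero when the observed covariance matrix S coincides with the model-implied one.

```python
import numpy as np

def implied_covariance(B, Psi):
    """Model-implied covariance Sigma(theta) of a recursive path model v = B v + e."""
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    return inv @ Psi @ inv.T

def ml_discrepancy(S, Sigma):
    """ML fitting function; equals zero when S matches Sigma(theta)."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def effects(B):
    """Return (direct, indirect, total) effect matrices of a recursive path model."""
    identity = np.eye(B.shape[0])
    total = np.linalg.inv(identity - B) - identity
    return B, total - B, total

# Hypothetical model, variable order (x, m, y).
# B[i, j] is the direct effect of variable j on variable i (assumed values).
B = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],     # x -> m
              [-0.2, 0.5, 0.0]])   # x -> y (direct) and m -> y
Psi = np.diag([1.0, 0.5, 0.4])     # variance of x and of the two error terms

Sigma = implied_covariance(B, Psi)
direct, indirect, total = effects(B)

print("F_ML when S equals Sigma(theta):", ml_discrepancy(Sigma, Sigma))
print("F_ML for a perturbed S:", ml_discrepancy(Sigma + 0.1 * np.eye(3), Sigma))
print("x -> y direct, indirect, total:", direct[2, 0], indirect[2, 0], total[2, 0])
```

In this assumed configuration the negative direct path from x to y is outweighed by the positive path mediated through m, so the total effect is positive. An estimation routine would search over the free elements of B and Psi for the values that minimize the discrepancy against the sample covariance matrix.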

2.1.1. SEM applications in ecology

SEM has been extensively applied in research areas including social science, psychology, chemistry, and biology (e.g., Bollen, 1989a; Hair et al., 1995; Hayduk, 1996; Kline, 1998). Applications in ecology and environmental sciences, however, are still limited (e.g., Mitchell, 1992; Smith, 1995; Shipley, 1997; Grace and Pugesek, 1998; Shipley, 2000). For example, Grace and Pugesek (1997) provided an example of exploratory analysis in ecology by developing a general SEM that examined the relative effects of abiotic conditions (e.g., soil salinity, elevation, nutrient content), disturbances (e.g., herbivory) and biomass density on plant species richness. La Peyre et al. (2001) used SEM for evaluating a hypothesized model for national wetland management effort, which explained 60% of the observed variability in 90 nations and highlighted the role of social development for more effective wetland protection.

Even less common is SEM application in aquatic ecosystems, and only a limited number of relevant studies can be found in the literature. In an illustrative application, Malaeb et al. (2000) tested a conceptual model to identify the relationships between four latent (unobserved) variables: sediment contamination (e.g., total PCBs, total pesticides, silt/clay percent), natural variability (e.g., salinity, water clarity), biodiversity, and growth potential (benthic abundance). Besides the good fit of the model, another interesting finding of this study was the positive total effect of natural variability on growth potential, as a result of a negative direct effect and a higher positive indirect effect mediated through biodiversity. The importance of path analysis when trying to elucidate patterns of causal association was also indicated by Stow and Borsuk (2003); the use of graphical models provided evidence that toxicity of Pfiesteria-like organisms was the effect rather than the cause of fishkills. Structural equation modeling was also the major component of a recently introduced methodological framework by Reckhow et al. (2005), which aimed to provide a predictive approach to water quality criteria.

2.1.2. Benefits of Bayesian approach

The importance of considering auxiliary prior information on individual parameters was recognized early in the SEM literature (Martin and McDonald, 1975; Bartholomew, 1981; Lee, 1981). Lee (1981) used a hierarchical Bayesian approach for analyzing the confirmatory factor analytic model, which led to an improvement of the factor loading, factor covariance, and unique variance estimates in comparison with the maximum likelihood method. Subsequent studies considered prior information in the form of stochastic functional relationships among the parameters, i.e., the classical identifiability constraints were restated as stochastic (Lee, 1992; Lee and Ho, 1993). They found that the Bayesian approach with stochastic constraints performed equally well with the classical approach when prior information was available, and provided more robust results when population parameters were misspecified. Scheines et al. (1999) demonstrated how the Bayesian estimation along with informative priors can assist in obtaining posterior distributions for parameters of underidentified models; they replaced a regression model with a SEM where the predictors were measured with substantial error and the available prior information was not sufficient to associate the respective parameters with unique values. An interesting finding of the same study was the ability of the Bayesian approach to identify the existence of more than one local maximum value (multimodality) in the likelihood surface, which were not detectable by the standard procedures (see their example with Wheaton’s model). Recent developments in Markov Chain Monte Carlo (MCMC) methods (e.g., see Paap (2002) for an extensive discussion about the computational advantages of MCMC application on latent variable models) have increased the application of Bayesian inference in non-linear factor analysis models (Arminger and Muthen, 1998; Zhu and Lee, 1999) that also account for polytomous (Lee and Zhu, 2000) and dichotomous data (Lee and Song, 2003), or assess the contribution of incomplete datasets to model selection (Lee and Song, 2004). Generally, the advantages of the Bayesian approach over the classical methods are the ability to incorporate prior knowledge about the parameters and the fact that the modeling process does not rely on asymptotic theory (Congdon, 2003). The latter issue is particularly important when the sample size is small (commonly experienced in environmental studies), and thus the classical estimation methods (maximum likelihood, generalized and weighted least squares) are not robust. MCMC samples are taken from the posterior distribution, and as a result the procedure works for all sample sizes and various sources of non-normality.
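The role of prior knowledge and of posterior sampling with a small sample can be illustrated with a deliberately simplified sketch. It is not the hierarchical SEM configuration of this study: it assumes a single path coefficient `beta` in y = beta·x + error, a known error standard deviation, a normal prior, a hypothetical dataset of 12 simulated observations, and a plain random-walk Metropolis sampler (the function names and all numerical settings are illustrative choices).

```python
import numpy as np

def log_posterior(beta, x, y, sigma, prior_mean, prior_sd):
    """Log posterior (up to a constant) of beta in y = beta * x + error."""
    log_lik = -0.5 * np.sum((y - beta * x) ** 2) / sigma ** 2
    log_prior = -0.5 * ((beta - prior_mean) / prior_sd) ** 2
    return log_lik + log_prior

def metropolis(x, y, sigma, prior_mean, prior_sd, n_iter=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the single coefficient beta."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_iter)
    beta = prior_mean
    current = log_posterior(beta, x, y, sigma, prior_mean, prior_sd)
    for i in range(n_iter):
        proposal = beta + step * rng.normal()
        candidate = log_posterior(proposal, x, y, sigma, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
        draws[i] = beta
    return draws

# Hypothetical small dataset (n = 12), simulated for illustration only.
data_rng = np.random.default_rng(1)
x = data_rng.normal(size=12)
y = 0.6 * x + data_rng.normal(scale=1.0, size=12)

# Informative prior beta ~ N(0.5, 0.5^2); the first 2000 draws are discarded as burn-in.
draws = metropolis(x, y, sigma=1.0, prior_mean=0.5, prior_sd=0.5)[2000:]
print("posterior mean:", draws.mean())
print("95% credible interval:", np.percentile(draws, [2.5, 97.5]))
```

The credible interval is read directly from the retained draws, so no large-sample approximation is invoked. In this conjugate special case the posterior is also available in closed form, which offers a convenient check on the sampler.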

We formulated a structural model that describes epilimnetic phytoplankton dynamics as the interplay between physical, chemical, and biological factors. The latent and observed variable selection and the model temporal resolution were based on data available from routine monitoring programs (bimonthly samplings, standard limnological variables). This model specification allowed us to assess the adequacy of the underlying information to give plausible results with SEM and identify the relative role of several ecological processes known to regulate water quality variables of management interest. The model is tested in two lakes with different trophic status (i.e., the mesotrophic Lake Washington and the eutrophic Lake Mendota). We also introduced a Bayesian hierarchical framework and tested its results against the classical likelihood approach. Using the same identification conditions and uninformative priors, we firstly compared the consistency between Bayesian, maximum likelihood, and bootstrap estimates. The sensitivity of the structural model results was then evaluated by treating these identification restrictions as stochastic. A brief description of the SEM terminology and typical notation along with the matrix representation of the Lake Washington model is provided in Appendix 1, while the Bayesian SEM configuration is presented in Appendix 2.

We recognize that a single example can only partially cover the general principles of this statistical technique. Thus, our intention is at least to show that SEM’s ability to estimate both direct and indirect effects between variables, to account for measurement error, and to simultaneously evaluate several cause-effect relationships warrants its consideration for ecological pattern description. The addition of latent variables enables the linkage between theoretical concepts and observed variables, which provides a defensible method to quantify natural properties of interest measured with uncertainty (Bollen, 1989a). Most importantly, SEM can serve as a useful tool for delineating general abstract theories/hypotheses and transforming them into controllable-scale entities that can be incorporated in experimental/observational frameworks. This is particularly important in the field of ecology where the directions of the dependence relationships are largely known and prior knowledge can be easily translated into models testable against present and future datasets (McCune and Grace, 2002).

2.2. Case study sites

2.2.1. Lake Washington

Lake Washington is the second largest natural lake in the State of Washington, with a surface area of 87.6 km² and a total volume of 2.9 km³. The mean depth of the lake is 32.9 m (maximum depth 65.2 m), the summer epilimnion depth is typically 10 m with an epilimnion:hypolimnion volume ratio of 0.39. The retention time of the lake is 2.4 years on average. Lake Washington is a mesotrophic ecosystem after a successful lake restoration by sewage diversion (Edmondson, 1994). The limnological processes are strongly dominated by a recurrent spring diatom bloom with epilimnetic chlorophyll concentration peaks of 10 µg/L on average, which is 3.2 times higher than the summer-stratified period concentrations (Arhonditsis et al., 2003). Generally, the strongly phosphorus limiting conditions along with the zooplankton grazing pressure sustain summer phytoplankton at an approximate level of 3 µg chl a/L. In its current restored state, Lake Washington has not experienced major cyanobacteria blooms and the summer phytoplankton assemblage on average comprises 26% diatoms (Aulacoseira, Stephanodiscus, Asterionella, Fragilaria), 37% chlorophytes (Oocystis, Sphaerocystis), and 25% cyanobacteria (Anabaena, Anacystis, Microcystis) (Arhonditsis et al., 2003). Thus, Lake Washington provided an environmental setup where we tested the ability of our conceptual model to describe phytoplankton patterns under mesotrophic conditions.

The dataset for SEM development was based on a recent (1994–2001), spatially intensive (12 stations) limnological sampling program carried out by King County/Metro (http://dnr.metrokc.gov/wlr/waterres/lakes/Wash.HTM). Detailed description of this sampling network along with the analytical methods used is provided elsewhere (Arhonditsis et al., 2003, 2004a). For this project, we selected data from the deeper parts of the lake and sampling dates that spanned from the onset of thermal stratification until the fall overturn (n = 57).

2.2.2. Lake Mendota

Lake Mendota (12.7 m mean depth, surface area of 38.95 km² and flushing rate of 0.15 per year) is a culturally eutrophic lake located in south-central Wisconsin, USA. The lake is relatively deep (maximum depth of 25.3 m) and is characterized by a dimictic circulation pattern. Agricultural areas cover a large portion (≈85%) of the lake’s watershed (604 km² total surface area), while urban and forested areas and wetlands account for the remaining 15%. Agricultural and urban non-point loadings are the dominant external nutrient (mostly P) inputs and have maintained the lake’s eutrophic character, despite the diversion of sewage effluents in 1971 (Lathrop et al., 1998). In addition, nutrient intrusions from the hypolimnion (internal loading) play a significant role in the epilimnetic P budget, albeit with high interannual variability in the relative contribution of the various (exogenous versus endogenous) sources (Soranno et al., 1997). The lake has been characterized by the occurrence of cyanobacteria blooms during the summer, while the unsuccessful implementation of several management programs necessitated the adoption of more aggressive non-point pollution reduction strategies (Betz et al., 1997). The summer cyanobacteria assemblage is dominated by colonial and filamentous species, Aphanizomenon flos-aquae, Oscillatoria agardhii, and Microcystis aeruginosa, capable of surface scum formation under appropriate weather and water quality conditions (Lathrop and Carpenter, 1992a; Soranno, 1997). Hence, in contrast with Lake Washington, the eutrophic Lake Mendota provides an alternative environment to detect differences in the relative importance of the various pathways of the hypothesized model of Fig. 1 and its ability to illuminate epilimnetic phytoplankton dynamics.

The dataset used for this analysis was assembled from the Northern Temperate Lakes Long Term Ecological Research (LTER) program (Center for Limnology, University of Wisconsin, Madison). The data were collected approximately twice a month from one station at the deepest part of the lake (for further sampling and methodological details, see http://lterquery.limnology.wisc.edu/index.jsp?projectid=LTER1). SEM development was based on a 5-year period (1997–2001), when consistent measurements for all the observed variables of the conceptual model existed in the dataset.

2.2.3. Statistical assumptions and data transformations

The datasets used for SEM development were collected from conventional monitoring programs with moderate sampling intensity during the summer period (i.e., once or twice a month), which was also the temporal resolution of our study (no time-averaging was considered). For maximum likelihood estimation, the $\chi^2$-test is the commonly used goodness-of-fit measure where the observed variables are assumed to follow a multivariate normal distribution. We tested for both univariate and multivariate normality (skewness and kurtosis) and applied transformations when necessary. Multivariate kurtosis was examined with Mardia’s coefficient (Mardia, 1974). We also examined for influential observations and outliers before and after transformations were applied. The squared Mahalanobis distance was used as a screening test for detecting multidimensional outliers (Legendre and Legendre, 1983). According to this distance measure, the deviation $d_i^2$ of the i-th observation from the centroid of all observations is given by the formulation:

$d_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x})$ (2)

where $x_i$ is the i-th observation on the p observed variables, $\bar{x}$ is the vector of their means, and $S$ is the unbiased estimate of their population covariance matrix ($S^{-1}$ being its inverse). The overall mean of each observed variable was also used to fill gaps in the dataset due to missing values. The number of missing values was less than 2% of the existing data, and thus the use of the respective means did not cause distortions (shrinking) of the variances (Malaeb et al., 2000). Finally, no significant problems of temporal autocorrelation were found in the datasets, and thus the independent observations assumption was not violated.
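The screening steps described above (mean substitution for missing values followed by Eq. (2)) can be sketched as follows. The data matrix is simulated and purely hypothetical, the function names are illustrative, and no cutoff value is applied because none is stated here; the sketch only ranks observations by their squared distance from the centroid.

```python
import numpy as np

def fill_missing_with_mean(X):
    """Replace NaN entries by the mean of the observed values of each variable."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def squared_mahalanobis(X):
    """Squared Mahalanobis distance of each row from the centroid, as in Eq. (2)."""
    centered = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)       # unbiased (n - 1) covariance estimate
    S_inv = np.linalg.inv(S)
    return np.einsum("ij,jk,ik->i", centered, S_inv, centered)

# Hypothetical dataset: 30 observations on p = 3 variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(30, 3))
X[0] = [10.0, -10.0, 10.0]            # an artificial multidimensional outlier
X[5, 1] = np.nan                      # an artificial missing value

filled = fill_missing_with_mean(X)
d2 = squared_mahalanobis(filled)
print("most distant observation:", int(np.argmax(d2)))
print("sum of squared distances:", d2.sum())
```

A useful property for checking such an implementation is that, with the unbiased covariance estimate, the squared distances of all n observations sum to (n − 1)p.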

Fig. 1. (a) The hypothesized conceptualization and (b) the actual structural equation model used for predicting epilimnetic phytoplankton dynamics. The use of a rectangular box for the epilimnion depth and the water clarity implies that the variable was considered as directly observable with no measurement error (λ1 = λ6 = 1.0, and δ1 = ε3 = 0). The metrics of the latent variables were set by fixing λ2 = λ5 = λ8 = 1.0. The notation is similar to that in Appendix 1.

2.2.4. The conceptual model

Our conceptual model considers the regulatory role of abiotic conditions and biological interactions on lake phytoplankton dynamics and water clarity during the summer-stratified period (Fig. 1). Abiotic conditions refer to the physical and chemical properties of the epilimnetic environment, and our intention is to examine the relative importance of their effects on the phytoplankton community. We hypothesized that the physical environment will be represented by the changes of the epilimnion depth (defined as the depth where the temperature change was ≥1 °C m−1). For simplicity, we did not include other surrogate variables of the physical environment (e.g., the Schmidt stability index), although we recognize that the present

indicator does not reflect the entire range of macro- and microscale advection and diffusion processes that can occur in a lake’s epilimnion. The latent variable nutrients, along with the two indicator variables soluble reactive phosphorus (SRP) and dissolved inorganic nitrogen (DIN) concentrations, comprised the measurement model for the chemical environment. Zooplankton grazing pressure on phytoplankton represented the food–web interactions. We used the latent variable herbivory and two surrogate variables: the first, referred to as total zooplankton, included all the herbivorous zooplankton species, while the second included only Daphnia species. Both herbivory indicator variables were formed as the sum of the species abundances (expressed as organisms per litre) weighted by the respective mean lengths (Carpenter et al., 1996). Secchi transparency was considered as a perfect measure (no measurement error) of water clarity, and thus we only considered the latent (structural) error in the equation that relates phytoplankton to water clarity. Due to differences in data availability, the physical interpretation of the latent variable phytoplankton was different between the two case studies (Lake Washington and Lake Mendota) because different indicators were used. For the Lake Mendota SEM, we included two measures of phytoplankton (i.e., chlorophyll a and total algal biovolume), which resulted in a quantitative configuration of the phytoplankton latent variable. In contrast, the Lake Washington SEM development was based on the combination of two variables, chlorophyll a and a polytomous characterization of cyanobacteria abundance. Therefore, the respective latent variable was labeled as phytoplankton community to stress the dual nature (phytoplankton abundance and composition) of this measurement model. Aside from this difference, the application and evaluation of the model was strictly confirmatory; the model was constructed in advance to accommodate current knowledge of the epilimnetic phytoplankton dynamics, no changes (i.e., number of latent variables, direct or indirect effects between latent and observed variables) were made during the application, and the prespecified model’s fit was assessed against the two datasets (Bollen, 1989a). Thus, our strategy differs from the usual two-step approach, where SEM development starts with testing the fit of the measurement models to the data and then proceeds to the structural equation model (Anderson and Gerbing, 1992; Malaeb et al., 2000). Finally, for the sake of simplicity, we adopted a recursive approach (i.e., all causal effects were unidirectional and the disturbances were uncorrelated), but we recognize that some of the considered ecological paths would be more realistically represented by a non-recursive model.

3. Results and discussion

3.1. Lake Washington

Variance resolution for phytoplankton (24%), herbivory (31%) and water clarity (32%) (Fig. 2) was relatively low. The χ2-test statistic value was 27.795 with 17 d.f. and a p-value of 0.047. [Note that a significant χ2 value calls for rejection of the proposed model.] The root mean square error of approximation (RMSEA) was 0.106, which under the null hypothesis of “close fit” (i.e., RMSEA is no greater than 0.05 in the population) corresponds to a p-value of 0.108. Based on their experience, Browne and Cudeck (1993) argued that a value of about 0.08 or less for the RMSEA would indicate a close fit, while a model with a RMSEA greater than 0.1 would not be satisfactory. The values of the incremental fit index (IFI; Bollen, 1989b) and comparative fit index (CFI; Bentler, 1990) were 0.858 and 0.834, respectively. These indices provide information for the comparison between the hypothesized and the baseline model, and a value close to 1 indicates a good fit. The baseline model is defined as the simplest model that is a reasonable standard for comparison with the tested one. For example, a baseline model can suggest that no common factors underlie the indicators and that the correlations between the observed variables are zero. Generally, the baseline model has a very constrained structure (i.e., many restrictions on the population moments), and thus it is expected to provide a poor fit to the dataset (Bollen, 1989a; Arbuckle, 1995). Finally, Hoelter’s critical N test (i.e., the largest sample size for accepting the model) indicated a marginal model rejection (CN = 56) and model acceptance (CN = 68) at the 0.05 and 0.01 significance levels (Hoelter, 1983). The observed variables dissolved inorganic nitrogen and soluble reactive phosphorus were moderately correlated with the latent variable nutrients (r ≈ 0.46). Chlorophyll a was strongly correlated with phytoplankton (r ≈ 0.75), while cyanobacteria abundance showed a relatively weak negative correlation (r ≈ −0.27). Both Daphnia (r ≈ 0.72) and total zooplankton (r ≈ 0.89) length weighted biomass were strongly correlated with the latent variable herbivory.

Fig. 2. Structural equation model for Lake Washington (N = 57). The numbers correspond to the standardized path coefficients and the R-squared values (numbers in rectangles); χ2, d.f. and p correspond to the chi-square test values, the degrees of freedom and the probability level for rejecting the null hypothesis, respectively.
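The reported RMSEA can be checked against the χ2 statistic, the degrees of freedom and the sample size given for the Lake Washington model. The sketch below assumes the common point-estimate formula, RMSEA = sqrt(max(χ2 − d.f., 0) / (d.f. × (N − 1))); the text does not state which variant the authors' software used (some programs divide by N instead of N − 1), and the function name is illustrative.

```python
import math

def rmsea(chi_square, df, n):
    """Point estimate of the RMSEA from the chi-square statistic, its d.f. and the sample size."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Lake Washington: chi-square = 27.795, 17 d.f., N = 57 (values quoted in the text and Fig. 2).
lake_washington = rmsea(27.795, 17, 57)
print(round(lake_washington, 3))  # 0.106
```

The result falls just above the 0.1 threshold that Browne and Cudeck (1993) regard as unsatisfactory, which is why the fit is described as marginal.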

Generally, there was a good agreement between observed and model-implied values in 30 out of 36 moments of the covariance matrix (Table 1). Six residual covariances were large and the respective standardized estimates (i.e., the residual covariances divided by their standard error) were above one. All the paths between the latent variables of the initial hypothesized structural model were significant, with the only exception being the direct path from nutrients to phytoplankton (p = 0.427; see Table 2). In addition, the covariance between epilimnion depth and nutrients was positive but non-significant (p = 0.251). The standardized direct effect (i.e., the unstandardized partial regression coefficient multiplied by the ratio of the standard deviation of the explanatory variable to the standard deviation of the variable it affects) of the epilimnion depth on phytoplankton was −0.498. The standardized direct paths from phytoplankton to herbivory (0.554) and water clarity (−0.567) were nearly equal in magnitude, but opposite in sign.
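The definition of a standardized direct effect given above can be written as a one-line function. The sketch below applies it to the epilimnion depth → phytoplankton path using the rounded maximum likelihood values of Table 2. One step is an assumption rather than something the text states: the total variance of the latent variable phytoplankton is taken as its disturbance variance divided by (1 − R²), with R² = 0.24 read from the variance resolution reported for Lake Washington.

```python
import math

def standardized_effect(b, var_x, var_y):
    """Unstandardized coefficient b rescaled by sd(x) / sd(y)."""
    return b * math.sqrt(var_x) / math.sqrt(var_y)

# Rounded inputs for Lake Washington (Table 2 and the reported variance resolution).
gamma_1 = -0.024          # Phytoplankton <- Epilimnion, maximum likelihood estimate
var_epilimnion = 12.646   # Var(Epilimnion)
psi_11 = 0.022            # disturbance variance of phytoplankton
r_squared = 0.24          # share of phytoplankton variance explained by the model

# Assumption: total variance of the latent variable = disturbance variance / (1 - R^2).
var_phytoplankton = psi_11 / (1.0 - r_squared)

effect = standardized_effect(gamma_1, var_epilimnion, var_phytoplankton)
print(round(effect, 2))  # -0.5
```

The small gap between this value and the reported −0.498 is consistent with the inputs having been rounded to three decimals.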

Interpreting these results, we highlight the absence of significant linkage between the inorganic nutrient stock and phytoplankton variability in the Lake Washington summer epilimnion. The epilimnion also lacks significant replenishments from the hypolimnion (non-significant covariance between nutrients and epilimnion depth), probably due to the low nutrient levels below the thermocline (DIN ≤ 300 µg/L and SRP ≤ 15 µg/L; see Lehman, 1988). The two regulatory factors for the phytoplankton community structure are the mixing processes and grazing pressure imposed by the zooplankton community. While the negative path from epilimnion depth to phytoplankton is plausible (i.e., dilution effects of epilimnetic erosion/deepening), the positive relationship with herbivory invites further explanation. During the summer-stratified period, a co-dependence between the phytoplankton and zooplankton community exists in the Lake Washington epilimnion, when a significant portion of the phosphorus supply (60–90%) in the mixed layer is provided by zooplankton excretion (Richey, 1979; Arhonditsis et al., 2004b). Thus, zooplankton nutrient recycling fuels phytoplankton growth, which in turn has a positive feedback and sustains herbivore biomass. Interestingly, the standardized total effects of nutrients differed in sign between cyanobacteria (−0.079 = −0.272 × 0.292) and chlorophyll a (0.219 = 0.751 × 0.292). A possible explanation for this difference is a pattern where small-pulsed nutrient inputs from thermocline migrations/external sources selectively subsidize the taxonomic groups of the Lake Washington phytoplankton community with affinity and velocity competitive advantages (e.g., diatoms, chlorophytes) over cyanobacteria (Sommer, 1989). The use of fully quantitative information will further assist the elucidation of the interspecific phytoplankton response to discrete nutrient fluxes in mesotrophic environments during the stratified period.

Table 1
The analyzed covariance matrix for the Lake Washington model with predicted values and residuals (study periods 1994–2001)

            Epilimnion   Secchi       log(SRP)     √DIN         Total        Daphnia      log(chloro-  Cyanobacteria
            depth        depth                                  zooplankton               phyll a)
Epilimnion depth
  Observed  12.646
  Predicted 12.646
  Residual  0
Secchi depth
  Observed  1.029        1.380
  Predicted 0.951        1.380
  Residual  0.078        0
log(SRP)
  Observed  0.262        −0.061       0.243
  Predicted 0.257        −0.018       0.243
  Residual  0.005        −0.043       0
√DIN
  Observed  2.593        0.569        0.489        23.900
  Predicted 2.643        −0.189       0.489        23.900
  Residual  −0.049       0.758        0            0
Total zooplankton
  Observed  −2.976       −1.767       0.430        5.309        23.878
  Predicted −3.438       −1.603       0.067        0.685        23.878
  Residual  0.462        −0.164       0.363        4.624        0
Daphnia
  Observed  −0.585       −0.244       0.035        0.106        3.636        1.362
  Predicted −0.663       −0.309       0.013        0.132        3.636        1.362
  Residual  0.078        0.065        0.022        −0.026       0            0
log(chlorophyll a)
  Observed  −0.206       −0.123       −0.002       0.088        0.375        0.086        0.052
  Predicted −0.244       −0.114       0.005        0.049        0.412        0.079        0.052
  Residual  0.038        −0.009       −0.007       0.040        −0.037       0.006        0
Cyanobacteria
  Observed  13.498       −0.876       0.949        5.936        −7.055       −1.425       −0.539       92.039
  Predicted 3.722        1.735        −0.072       −0.742       −6.272       −1.209       −0.446       92.039
  Residual  9.775        −2.612       1.021        6.677        −0.783       −0.215       −0.093       0
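The standardized total effects of nutrients on the two phytoplankton indicators quoted in the text are products of the coefficients along the path nutrients → phytoplankton → indicator. A minimal sketch of that arithmetic, with an illustrative helper name, is:

```python
import math

def effect_along_path(coefficients):
    """Standardized effect transmitted along a chain of paths: the product of its coefficients."""
    return math.prod(coefficients)

nutrients_to_phytoplankton = 0.292   # standardized path, as quoted in the text
loading_cyanobacteria = -0.272       # phytoplankton -> cyanobacteria
loading_chlorophyll = 0.751          # phytoplankton -> chlorophyll a

print(round(effect_along_path([nutrients_to_phytoplankton, loading_cyanobacteria]), 3))  # -0.079
print(round(effect_along_path([nutrients_to_phytoplankton, loading_chlorophyll]), 3))    # 0.219
```

Because both effects share the same first link, their opposite signs come entirely from the opposite signs of the two loadings on the latent variable.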

We used the modified bootstrap procedure proposed by Bollen and Stine (1992) for testing the Lake Washington SEM. The Bollen and Stine method introduces a transformation of the data matrix to ensure that the bootstrap samples will not be drawn from a set of observations for which the null hypothesis does not hold (Bollen and Stine, 1992). As a result, this scheme is more objective than the naive bootstrapping, and thus H0 will not be rejected regardless of whether it holds or not for the entire population. We formed 2000 bootstrap samples by taking independent draws with replacement from the transformed dataset. Testing the null hypothesis that the model is correct, we found p = 0.048; compared with the model performance when using the full dataset, the fit was worse in 98 out of 2000 bootstrap samples. Interestingly, during the bootstrap procedure several samples resulted in singular covariance matrices, while the standard error for at least three bootstrap estimates (var(ε4), λ3, λ4) was notably higher than those from the classical method (Table 2). However, these discrepancies did not alter the inference regarding their significance. The first pairs of Bayesian estimates (medians and respective standard errors) show that the Bayesian SEM provided consistent results with the maximum likelihood method. Fig. 3 presents the comparison between the observed chlorophyll a and total zooplankton abundance and the posterior predictive median, quartiles, and 95% credible sets. In addition, when the assumptions regarding the metrics of the latent variables (λ2 = λ5 = λ8 = 1) were relaxed by sampling from the normal distribution (1, 1), the interpretation of the structural model remained unaltered. Using the posterior medians, the predicted structural equation models were: η1 = −0.035ξ1 + 0.122ξ2, η2 = −2.251η1, and η3 = 1.710η1. By comparing the standard error relative to the medians, we infer that the weak relationship between nutrients and phytoplankton was also evident with the Bayesian approach, while the path between phytoplankton and herbivory was still positive but weaker (Table 2). Interestingly, the DIN loading (λ3) over the latent variable nutrients was stronger after the stochastic treatment of the assumption, whereas the SRP loading (λ2) was not significant. This finding probably reflects the phosphorus-limiting conditions in Lake Washington, where SRP is usually below the detection limit and the largest fraction of the phosphorus stock of the system is sequestered in the phytoplankton cells. In contrast, the non-limiting DIN (mostly NO3) ranges in detectable levels (DIN ≥ 80 µg/L), and thus seems to more closely portray the phytoplankton fluctuations. Moreover, both herbivory and phytoplankton were strongly associated with their pair of indicator variables.

Table 2
Comparison of the maximum likelihood, bootstrap (N = 2000) and Markov Chain Monte Carlo sampling estimates and standard errors of the Lake Washington model path coefficients and error variances. The second pairs of the Bayesian estimates (medians and respective standard errors) correspond to a limited movement around the three fixed loading coefficients by sampling the normal distribution (1, 1). The notation is similar to that in Appendix 1

Parameters                     Symbol    Maximum likelihood  Bootstrap           Markov Chain Monte Carlo sampling
                                         Estimate  S.E.      Estimate  S.E.      Median    S.E.      Median    S.E.
Phytoplankton←Nutrients        γ2        0.229     0.289     0.345     0.570     0.155     0.299     0.122     0.306
Phytoplankton←Epilimnion       γ1        −0.024    0.009     −0.025    0.016     −0.020    0.010     −0.035    0.020
Herbivory←Phytoplankton        β2        2.711     1.217     3.190     1.931     2.337     0.995     1.710     1.282
Cyanobacteria←Phytoplankton    λ4        −15.225   9.510     −17.776   24.838    −13.761   9.511     −13.710   3.046
Chlorophyll a←Phytoplankton    λ5        1.000               1.000               1.000               0.527     0.215
Daphnia←Herbivory              λ8        1.000               1.000               1.000               0.929     0.441
Zooplankton←Herbivory          λ7        5.186     1.747     6.211     3.730     5.781     1.511     5.146     2.197
DIN←Nutrients                  λ3        10.294    9.739     14.586    17.329    15.361    12.631    10.640    3.072
SRP←Nutrients                  λ2        1.000               1.000               1.000               0.476     0.499
Water Clarity←Phytoplankton    β1        −3.890    1.371     −4.179    2.228     −3.841    1.281     −2.251    1.095
Var(Nutrients)                 ϕ22       0.047     0.055     0.088     0.172     0.029     0.026     0.037     0.071
Var(Epilimnion)                ϕ11       12.646    2.390     12.476    2.345     12.572    2.485     12.445    2.474
Covar(Nutrients, Epilimnion)   ϕ12       0.257     0.224     0.283     0.214     0.154     0.186     0.213     0.233
Var(Phytoplankton)             ψ11       0.022     0.013     0.019     0.024     0.029     0.010     0.060     0.051
Var(Cyanobacteria)             var(ε1)   85.248    16.854    79.194    18.640    87.260    18.642    83.403    19.026
Var(Chlorophyll a)             var(ε2)   0.023     0.011     0.021     0.022     0.031     0.009     0.035     0.010
Var(Herbivory)                 ψ33       0.486     0.225     0.487     0.438     0.466     0.233     0.523     6.978
Var(Daphnia)                   var(ε5)   0.661     0.251     0.601     0.432     0.736     0.226     0.727     0.228
Var(Total Zooplankton)         var(ε4)   5.018     5.930     1.990     11.256    2.238     4.087     2.453     4.235
Var(DIN)                       var(δ3)   18.869    6.230     16.008    9.798     16.624    9.206     12.986    8.886
Var(SRP)                       var(δ2)   0.196     0.061     0.152     0.078     0.222     0.052     0.221     0.059
Var(Water Clarity)             ψ22       0.937     0.230     0.901     0.278     0.936     0.280     1.016     0.275

Fig. 3. Comparison between the observed data and the posterior predictive distributions for (a) chlorophyll a and (b) total zooplankton abundance of the Lake Washington SEM.
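The structural equations quoted above for Lake Washington form a recursive chain, so the endogenous latent variables can be evaluated one after the other. The sketch below uses the posterior medians from the text; treating the inputs as deviations from their means and dropping the disturbance terms are simplifying assumptions made here, and the function name and scenario are hypothetical.

```python
def lake_washington_structural_model(epilimnion_depth, nutrients):
    """Recursive structural equations with the posterior medians quoted in the text.

    Inputs are the exogenous latent variables (xi_1, xi_2), assumed here to be
    expressed as deviations from their means; disturbance terms are omitted.
    """
    phytoplankton = -0.035 * epilimnion_depth + 0.122 * nutrients   # eta_1
    water_clarity = -2.251 * phytoplankton                          # eta_2
    herbivory = 1.710 * phytoplankton                               # eta_3
    return phytoplankton, water_clarity, herbivory

# Hypothetical scenario: epilimnion one unit deeper than average, nutrients at their mean.
eta_1, eta_2, eta_3 = lake_washington_structural_model(1.0, 0.0)
```

In this scenario the deeper epilimnion lowers the phytoplankton latent variable, which in turn raises water clarity and lowers herbivory, matching the signs of the paths discussed in the text.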

Generally, even though there were some discrepancies, both the bootstrap testing and the Bayesian approach provided similar results to the maximum likelihood method. The hypothesized conceptualization of the epilimnetic phytoplankton dynamics provided satisfactory results, which were also consistent with the existing Lake Washington literature. However, it should not be neglected that the model explained a relatively low proportion of the observed variability, while several tests of fit were in the marginal area between model acceptance/rejection. We highlight two possible aspects of the model that warrant reconsideration/enhancement:

(i) The inclusion of the soluble form of phosphorus as a sole limiting nutrient indicator in the nutrients measurement model is not sufficient to completely capture the phytoplankton dynamics. In a strongly phosphorus limiting epilimnetic environment, the usually small-sized subsidies from the hypolimnion and/or the zooplankton recycling are rapidly taken up by the primary producers. As a result, the use of SRP data from a moderate sampling intensity monitoring program provides limited sensitivity in describing the phosphorus-phytoplankton relationship. The incorporation of an additional indicator variable that also accounts for the particulate or organic phosphorus fraction (e.g., TP) is likely to improve the model.

(ii) The dual nature of the latent variable phytoplankton community might be another cause of the moderate model performance. It is possible that the combination of chlorophyll a and cyanobacteria into one common factor is not the most effective way to model phytoplankton response because the relationship of these two variables is not always clear in a mesotrophic environment. Even though it is reasonable to expect an improvement after the inclusion of a fully quantitative cyanobacteria characterization, the description of a stochastic and non-linear phenomenon (i.e., species competition) with a linear and fairly simple model is probably overoptimistic. There are several physical, chemical, biological factors that can affect the phytoplankton composition in the summer epilimnion, which dynamically interact and regulate the growth-minus-loss balance for each phytoplankton group and determine the “superior” competitor under any specific set of conditions (Dokulil and Teubner, 2000; Downing et al., 2001). The recognition of this model limitation raises the classical “simplicity versus complexity” dilemma; the realistic selection of a model that simply focuses on a quantitative description of the phytoplankton community (e.g., the Lake Mendota SEM) or a more complicated non-linear SEM approach that requires several additional latent and observed variables (e.g., turbulence, pH, TN/TP, CO2, trace elements, toxins) that can potentially cause cyanobacteria dominance.

Fig. 4. Structural equation model for Lake Mendota (N = 48). The numbers correspond to the standardized path coefficients and the R-squared values (numbers in rectangles); χ2, d.f. and p correspond to the chi-square test values, the degrees of freedom and the probability level for rejecting the null hypothesis, respectively.

3.2. Lake Mendota

In contrast to Lake Washington, the Lake Mendota model accounts for a large amount of the observed variability in phytoplankton (76%), herbivory (43%), and water clarity (84%). The χ2-test statistic value was 22.473 with 19 d.f. and a non-significant p-value (= 0.261) (Fig. 4). In initial maximum likelihood calculation runs, we found that the model yielded several negative error variances. These improper solutions (also referred to as “Heywood cases”) can be caused by several factors, e.g., not “typical” samples, outliers or influential observations, and fundamental faults in model specification (Bollen, 1989a). For example, Boomsma (1982) found that implausible values were possible when small sample sizes and two indicators per factor were used, and Anderson and Gerbing (1984) suggested the use of large sample sizes (≥150) and more than three indicators per factor to avoid negative error variances.

Table 3
The analyzed covariance matrix for the Lake Mendota model with predicted values and residuals (study periods 1997–2001)

            Epilimnion   Secchi       SRP          √DIN         √Total       √Daphnia     log(chloro-  log(total
            depth        depth                                  zooplankton               phyll a)     biovolume)
Epilimnion depth
  Observed  16.496
  Predicted 16.496
  Residual  0
Secchi depth
  Observed  2.666        2.923
  Predicted 2.744        2.923
  Residual  −0.077       0
SRP
  Observed  49.018       36.558       1097.123
  Predicted 48.168       38.192       1097.124
  Residual  0.850        −1.634       −0.001
√DIN
  Observed  16.821       13.437       272.742      96.427
  Predicted 16.832       13.346       272.146      96.442
  Residual  −0.010       0.092        0.596        −0.014
√Total zooplankton
  Observed  0.673        0.921        17.824       5.810        1.571
  Predicted 1.313        1.179        18.275       6.386        1.571
  Residual  −0.639       −0.259       −0.450       −0.576       0
√Daphnia
  Observed  0.691        1.143        22.965       7.368        1.303        1.381
  Predicted 1.307        1.175        18.200       6.360        1.293        1.388
  Residual  −0.617       −0.032       4.765        1.008        0.010        −0.007
log(chlorophyll a)
  Observed  −0.730       −0.566       −7.655       −3.088       −0.234       −0.283       0.165
  Predicted −0.630       −0.566       −8.777       −3.067       −0.271       −0.270       0.165
  Residual  −0.099       0            1.122        −0.021       0.037        −0.013       0
log(total biovolume)
  Observed  −0.821       −0.796       −10.639      −4.005       −0.366       −0.431       0.177        0.369
  Predicted −0.866       −0.778       −12.057      −4.213       −0.372       −0.371       0.179        0.369
  Residual  0.045        −0.018       1.418        0.208        0.006        −0.060       −0.002       0


To prevent the Lake Mendota model from yielding negative values, we added specification of two error variances (DIN, Daphnia), thus increasing the d.f. from 17 to 19. To compute the prespecified error variances, we ran the Bayesian SEM configuration and used the median error variances, which resulted from this model, as the prescribed values for the maximum likelihood model. We ran the Bayesian SEM model keeping the three loading coefficients fixed (equal to one; see below and Appendices 1 and 2) and using “flat” (uninformative) priors for the rest of the model parameters. Note that in a Bayesian framework, the use of a conjugate prior distribution specifies the variances in a region of positive values, and thus the inadmissible values are avoided (Lee and Shi, 2000; see also Martin and McDonald, 1975 for another Bayesian approach to avoid inadmissible estimates in unrestricted factor analysis). Researchers using SEM for environmental data sets are likely to find that such methods may be needed in their analyses, since datasets of similar size are frequent in environmental science.
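The remark that a conjugate prior confines the variances to positive values can be illustrated with an inverse-gamma distribution, the usual conjugate choice for the variance of a normal model. The text does not state which prior or hyperparameters were actually used, so the distribution, the shape and scale values, and the function name below are all assumptions made for illustration only.

```python
import random

def draw_inverse_gamma(shape, scale, rng):
    """One draw from an inverse-gamma(shape, scale) distribution.

    If G ~ Gamma(shape, rate=scale) then 1/G ~ inverse-gamma(shape, scale);
    random.gammavariate expects a *scale* argument, hence 1.0 / scale below.
    """
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

rng = random.Random(2006)  # fixed seed so the sketch is reproducible
# Hypothetical hyperparameters (shape = 2, scale = 1), not taken from the paper.
prior_draws = [draw_inverse_gamma(shape=2.0, scale=1.0, rng=rng) for _ in range(1000)]

# Every candidate error variance is strictly positive by construction,
# so negative (Heywood-type) values are excluded from the outset.
all_positive = all(value > 0.0 for value in prior_draws)
```

Because the prior has no mass at or below zero, the posterior cannot place mass there either, which is the mechanism the paragraph above relies on.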

The RMSEA was 0.062, which under the null hypothesis of “close fit” corresponds to a p-value of 0.388. The IFI and CFI values were 0.989 and 0.987, respectively. Hoelter’s critical N test indicated a model acceptance (CN = 64) at the 0.05 significance level. The latent variable nutrients was well correlated with dissolved inorganic nitrogen (r ≈ 0.99) and soluble reactive phosphorus (r ≈ 0.84). Chlorophyll a and total algal biovolume were also highly correlated with phytoplankton and the respective coefficients were 0.89 and 0.81. Finally, both Daphnia (r ≈ 0.96) and total zooplankton (r ≈ 0.91) length weighted biomass were strongly correlated with the latent variable herbivory.
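Hoelter's critical N can be recomputed from the χ2 statistic and the sample size. The sketch below assumes the usual definition, CN = χ2crit / Fmin + 1 with Fmin = χ2 / (N − 1), truncated to an integer; the text does not spell out the formula, so this is an assumption. The critical value is entered by hand from standard chi-square tables, and the function and constant names are illustrative.

```python
import math

def hoelter_critical_n(chi_square, n, chi_square_critical):
    """Largest sample size at which the model would still be accepted (Hoelter's CN)."""
    f_min = chi_square / (n - 1)          # minimum of the fit function
    return math.floor(chi_square_critical / f_min + 1)

# Upper-tail chi-square critical value from standard tables.
CRITICAL_19_DF_05 = 30.144   # 19 d.f., alpha = 0.05

# Lake Mendota: chi-square = 22.473, N = 48.
print(hoelter_critical_n(22.473, 48, CRITICAL_19_DF_05))  # 64
```

The same function with the Lake Washington values (χ2 = 27.795, N = 57, and tabulated critical values 27.587 and 33.409 for 17 d.f.) returns 56 and 68, the two CN values reported for that lake.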

Generally, there was a good agreement between the observed and model-implied moments and all the standardized estimates of the residuals were below one (Table 3). The largest residual covariances were found between the epilimnion depth and the two herbivory indicators and between SRP and Daphnia (standardized estimates of the residuals ≥0.7). The initial hypothesized structural model was confirmed and all the paths between the latent variables were significant. The only exception was the direct path from epilimnion depth to phytoplankton (p = 0.436; see Table 4). The standardized direct effect of nutrients on phytoplankton was −0.841. The standardized direct paths from phytoplankton to herbivory (−0.659) and water clarity (−0.918) were negative and highly significant. Finally, the correlation (r = 0.425) between epilimnion depth and nutrients (p = 0.01) was also significant. The latter result is consistent with previous studies that emphasized the internal loading contribution to the epilimnetic nutrient stock in Lake Mendota. Soranno et al. (1997) found that major entrainment events can occur and result in nutrient fluxes that are significantly higher than those from external nutrient sources. These nutrient pulses stimulate phytoplankton (cyanobacteria) blooms (≥30 µg chl a/L) and surface scum formation, with adverse effects on the water clarity and the aesthetic value of the lake (Lathrop et al., 1996; Soranno, 1997). Our SEM approach also highlighted the control that herbivory exerts on summer epilimnetic phytoplankton dynamics. Several past Lake Mendota studies based on both short and long-term datasets, variant temporal resolution (from days to seasonal averages), and contemporaneous or lagged measurements provided similar evidence and underscored the role of zooplankton grazing pressure (especially from the large bodied Daphnia spp.) on phytoplankton dynamics (e.g., composition, abundance) and the water clarity (Lathrop and Carpenter, 1992b; Lathrop et al., 1996; Soranno, 1997; Lathrop et al., 1999). Overall, the conceptual model resulted in a good fit with the data and it provided a plausible interpretation of the ecological processes in a eutrophic environment that is consistent with the existing literature.

Table 4
Comparison of the maximum likelihood, bootstrap (N = 2000) and Markov Chain Monte Carlo sampling estimates and standard errors of the Lake Mendota model path coefficients and error variances. The second pairs of the Bayesian estimates (medians and respective standard errors) correspond to a limited movement around the three fixed loading coefficients by sampling the normal distribution (1, 1). The notation is similar to that in Appendix 1

Parameters                     Symbol    Maximum likelihood  Bootstrap           Markov Chain Monte Carlo sampling
                                         Estimate  S.E.      Estimate  S.E.      Median    S.E.      Median    S.E.
Phytoplankton←Nutrients        γ2        −0.011    0.002     −0.011    0.002     −0.011    0.002     −0.037    0.024
Phytoplankton←Epilimnion       γ1        −0.006    0.008     −0.007    0.009     −0.007    0.007     −0.008    0.014
Herbivory←Phytoplankton        β2        −2.074    0.416     −2.078    0.444     −2.035    0.436     −1.452    0.854
Total biovolume←Phytoplankton  λ4        1.374     0.186     1.380     0.196     1.361     0.193     1.053     0.336
Chlorophyll a←Phytoplankton    λ5        1.000               1.000               1.000               0.756     0.237
Daphnia←Herbivory              λ8        1.000               1.000               1.000               1.102     0.530
Zooplankton←Herbivory          λ7        1.004     0.082     1.009     0.081     1.006     0.091     1.095     0.536
DIN←Nutrients                  λ3        0.349     0.033     0.353     0.036     0.367     0.042     0.935     0.295
SRP←Nutrients                  λ2        1.000               1.000               1.000               2.553     0.748
Water Clarity←Cyanobacteria    β1        −4.352    0.464     −4.391    0.510     −4.335    0.486     −3.381    1.013
Var(Nutrients)                 ϕ22       778.817   217.263   759.919   209.328   699.311   217.721   109.322   165.721
Var(Epilimnion)                ϕ11       16.496    3.403     16.121    3.240     16.312    3.533     16.280    3.513
Covar(Nutrients, Epilimnion)   ϕ12       48.168    18.608    46.929    17.905    44.215    18.481    17.432    11.081
Var(Phytoplankton)             ψ11       0.035     0.010     0.029     0.010     0.040     0.012     0.062     0.146
Var(Total Biovolume)           var(ε1)   0.123     0.030     0.118     0.029     0.131     0.034     0.131     0.034
Var(Chlorophyll a)             var(ε2)   0.031     0.010     0.033     0.010     0.043     0.012     0.044     0.012
Var(Herbivory)                 ψ33       0.728     0.177     0.694     0.166     0.759     0.200     0.651     1.809
Var(Daphnia)                   var(ε5)   0.100               0.100               0.100     0.059     0.083     0.059
Var(Total Zooplankton)         var(ε4)   0.273     0.076     0.259     0.075     0.291     0.092     0.292     0.094
Var(DIN)                       var(δ3)   1.344               1.344               1.344     1.920     0.627     3.563
Var(SRP)                       var(δ2)   318.306   67.864    304.697   63.891    332.557   76.999    330.907   77.673
Var(Water Clarity)             ψ22       0.458     0.153     0.435     0.151     0.449     0.174     0.448     0.172

Fig. 5. Comparison between the observed data and the posterior predictive distributions for (a) chlorophyll a and (b) Daphnia abundance of the Lake Mendota SEM.


Both bootstrap estimates (Table 4) and model fit (p = 0.367) were consistent with the maximum likelihood results. Fig. 5 presents the comparison between the observed chlorophyll a and Daphnia abundance and the posterior predictive median, quartiles, and 95% credible sets of the Bayesian model with the three identification restrictions (λ2 = λ5 = λ8 = 1). Moreover, interesting findings were raised when these three constraints were relaxed. Using the posterior medians, we have the structural equation model η1 = −0.008ξ1 − 0.037ξ2, η2 = −3.381η1, and η3 = −1.452η1. Comparing these estimates with the respective standard errors, we infer that the Bayesian approach also underlined the importance of the paths from nutrients to phytoplankton, and from phytoplankton to herbivory and water transparency. In contrast, the variance of the nutrients (ϕ22) and the covariance between nutrients and epilimnion depth (ϕ12) were notably lower. The lower ϕ12 and ϕ22 estimates were related to the assumption that determines the metric of the latent variable nutrients (λ2 = 1) in combination with the high variance of the used SRP values (Table 3), which in turn is indicative of the role of the hypolimnetic fluxes. [Also note that the best fit to the normal distribution was provided by the raw (untransformed) SRP data for the Lake Mendota SEM.] The discrepancies were minimized when we sampled values for the λ2 loading from a normal distribution with mean 1 and precision 5 or when keeping this loading factor fixed. Finally, significant loadings were found between the three latent variables and the respective pairs of indicators.

Fig. 6. Posterior distributions for the two parameters associated with (a) chlorophyll a, (b) water transparency, and (c) water quality predictions of the Lake Mendota structural equation model: each point corresponds to the mean value of the posterior predictive distribution of the water quality index for each sampling date (study period 1997–2001, N = 48). The water quality index is based on a binary characterization (0, 1) of satisfactory/non-satisfactory water quality conditions.
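The relaxed identification constraints are described through normal distributions given by a mean and a precision, for example mean 1 and precision 5 for the λ2 loading. Taking precision to mean the reciprocal of the variance (the standard definition), the short sketch below converts it to a standard deviation and draws a few candidate loadings. The draws illustrate the prior only, not the posterior reported in Table 4, and the names and seed are hypothetical.

```python
import math
import random

def sd_from_precision(precision):
    """Standard deviation of a normal distribution given its precision (1 / variance)."""
    return 1.0 / math.sqrt(precision)

rng = random.Random(1)                        # fixed seed for reproducibility
sd = sd_from_precision(5.0)                   # about 0.447
lambda_2_draws = [rng.gauss(1.0, sd) for _ in range(5)]   # candidate loadings around 1
```

A precision of 5 therefore allows much less movement around the fixed value of 1 than the precision of 1 used for the (1, 1) specification, whose standard deviation is 1.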

3.3. Application in natural resource management

To illustrate one way of how this methodological framework can be used for assessing trophic conditions and assist water quality management, we linked the Bayesian configuration of the Lake Mendota SEM with the following model:

logit(p[i]) = α0+ α1 chlorophyll a[i] + α2 water transparency[i],

whereY = 1 if chlorophyll a[i] ≥ 10�g/L or Secchi depth[i] ≤ 2 m elseY = 0,

Y [i]|p[i]∼Bernoulli (p[i]), logit p[i] = loge(p[i]/(1− p[i])) (3)
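Model (3) can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the coefficient values in `ALPHA` are made up for the example and are not estimates from the study. Only the two cutoffs (10 μg/L and 2 m) come from the text.

```python
import math

CHL_CUTOFF_UG_L = 10.0   # chlorophyll a cutoff from the text (ug/L)
SECCHI_CUTOFF_M = 2.0    # Secchi depth cutoff from the text (m)


def water_quality_index(chl_a, secchi):
    """Observed Y: 1 = non-satisfactory conditions, 0 = satisfactory."""
    return 1 if (chl_a >= CHL_CUTOFF_UG_L or secchi <= SECCHI_CUTOFF_M) else 0


def prob_non_satisfactory(chl_a, transparency, alpha0, alpha1, alpha2):
    """Inverse logit of the linear predictor in model (3)."""
    linear_predictor = alpha0 + alpha1 * chl_a + alpha2 * transparency
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, chosen only to make the example run.
ALPHA = (0.0, 0.5, -2.0)
p_example = prob_non_satisfactory(12.0, 1.5, *ALPHA)
print(water_quality_index(12.0, 1.5), round(p_example, 3))
```

In the Bayesian setting described here the coefficients would be sampled jointly with the SEM parameters, so each posterior draw yields its own probability rather than the single value computed above.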

The indicator variable for identifying trophic state (referred to as “Water Quality Index” in Fig. 6) was based on a binary characterization (0, 1) of satisfactory/non-satisfactory water quality conditions. More specifically, the observed Y[i] was modeled as a realization from a random Bernoulli process where p[i], the probability of non-satisfactory conditions, is a function of chlorophyll a and water transparency (coincides with Secchi depth in this model) for this categorization. The respective cutoff points were 10 μg/L and 2 m. Fig. 6(a) and (b) show the posterior distribution for the parameters associated with the two water quality variables. Fig. 6(c) illustrates how the proposed conceptualization of the Lake Mendota epilimnetic phytoplankton dynamics can be used for water quality predictions. The mean values of the posterior predictive distribution of the water quality index (study period 1997–2001, N = 48) were used for the delineation of this surface, where the joint chlorophyll a and water transparency SEM estimations provide predictions, through the model (3), on the water quality of the lake. In addition, the inclusion of surrogate variables for watershed dynamics (e.g., nutrient loadings) will further elaborate the ability of the present modeling framework to be used as a management tool.

It should be noted that the combination of SEM with Bayesian analysis offers two additional strategies for the development of a framework that will integrate into the model information over time and space: (i) the sequential updating, which in this particular case is relatively easy since the basis for the present modeling construction was data from conventional sampling programs (moderate sampling intensity, standard limnological variables used), and (ii) the ability to test a single model simultaneously for multiple waterbodies (multi-group analyses or stacked models), where the use of constraints across groups enables the identification of the longitudinal differences (Pugesek and Tomer, 1996). Bayesian analysis can also be advantageous along this line, since the stochastic treatment of the cross-sectional constraints improves the accuracy of the estimation and results in a more meaningful interpretation of the model (Lee, 1992; Lee and Ho, 1993).

4. Conclusions

We presented an illustrative example that examined the efficiency of a multivariate statistical method to explore ecological patterns. Structural equation modeling was used to formulate a simple conceptual model regarding epilimnetic phytoplankton dynamics, which was then tested in two lakes with different trophic status (i.e., eutrophic Lake Mendota and mesotrophic Lake Washington). The basic feature of the confirmation theory is the recognition that science is a hypothetico-deductive process and observations/experimental data should be considered a consequence of a theory or a general law (Oreskes et al., 1994). While the formulation of our hypothetical model was based on existing knowledge from the limnological literature, the use of existing data undermines somewhat the “confirmatory” character of our study. Ideally, ecologists when conducting confirmatory analysis must design their mental models in advance and then proceed to data collection (McCune and Grace, 2002). Nevertheless, our intention with this simple (and very familiar to aquatic ecologists) example was to show the flexibility of the method to: (i) translate a fairly complicated ecological phenomenon and express it as a function of several conceptual environmental factors, (ii) link the conceptual factors of interest with routinely measured variables by explicitly acknowledging that none of those reflects perfectly the underlying property, and (iii) test both direct and indirect paths of this ecological structure and identify the importance of their role.


Model acceptance in two or more case studies is not evidence for a general statement, but merely the start of a “perpetual race” for confirmation (Oreskes et al., 1994). This realization might be misinterpreted as a necessity for good starting models, where the ecologist has to embody conceptualizations with a high likelihood of confirmation in a variety of conditions. In typical SEM practice, what is needed is a tentative initial model and probably prior knowledge of the variety of observed variables that can reflect the studied ecological conditions. The initial model can be respecified and effectively optimized, as long as the modifications are done through a combination of data and theory-driven exploratory analysis (McCune and Grace, 2002). By integrating intuition, theory, and evidence from the data, we ensure that the final model has not only the best fit but also meaningful paths. Thus, the resulting modeling development provides a plausible framework that seeks further confirmation. For several reasons, SEM has received criticism with regard to its ability to serve as a methodological tool in ecology and evolutionary biology (e.g., Petraitis et al., 1996); these criticisms, however, have been addressed on a vis-a-vis basis in the SEM literature (e.g., Pugesek and Tomer, 1995; Grace and Pugesek, 1998). For example, SEM does not require larger sample sizes than other multivariate methods (e.g., multiple regression, MANOVA); can account for non-linear relationships; can include categorical data; and can overcome deviations from multinormality.

Several benefits can also be gained by a Bayesian approach to structural equation modeling. We argue that a Bayesian SEM that subjects a realistic ecological structure to sequential updating with routinely monitored environmental variables is likely to lead to an effective, easily implementable framework and assist natural resource management (Dorazio and Johnson, 2003). In the present study, we used uninformative prior parameter distributions where the posterior is proportional to the likelihood, and thus the Bayesian and maximum likelihood estimates did not differ. However, even though it was not encountered here, local maxima or non-Gaussian likelihood surfaces from small sample sizes can cause major discrepancies (Scheines et al., 1999). The adoption of a Bayesian approach, where MCMC samples are taken from the true posterior distribution over parameters, can help to tackle such problems (Congdon, 2003). Moreover, as our study showed, additional insight can be gained by the stochastic treatment of the assumptions used to determine the latent variable metrics. Another aspect where the Bayesian strategy can be particularly useful, though it was not explored in this study, is the treatment of unidentified initial models; the problem of exact identification restrictions can be overcome by exploiting existing information and defining plausible informative prior distributions (Scheines et al., 1999).

It should also be stressed that SEM can be more complicated than the presented model and more information can be easily included. McCune and Grace (2002) discussed that SEMs can become problematic after the inclusion of more than 10 latent variables. For example, the ecological structure of our model can be augmented by the inclusion of more indicators for each of the already existing latent variables (e.g., total nutrient forms, silica, trace elements, primary productivity, Schmidt stability index), higher predators of the food web, external nutrient loadings or concepts that reflect recent advancements in aquatic ecology such as algal food quality for zooplankton (e.g., highly unsaturated fatty acid, amino acid, protein content, and/or digestibility, see Sterner and Hessen, 1994; Kilham et al., 1997; Brett and Muller-Navarra, 1997; Kleppel et al., 1998). Finally, another methodological advancement for the analysis of complex ecological systems is likely to result from the delineation of the correspondence between SEM and system dynamic modeling and the integration of these two techniques into one comprehensive tool (Grace, 2001; Hovmand, 2003).

Holling (1978) emphasized that the popular notion that everything in ecological systems is tightly connected to everything else is, in fact, false. Rather, there are key linkages that dominate ecosystem dynamics. SEM models are useful to help identify these important connections. Striving to elucidate ecological patterns, researchers can use SEM as a tool that lends flexibility to compromise between generality, realism and precision and obtain the optimal scale of description; the integration of this technique in the ecological practice is warranted.

Acknowledgments

This study was funded by a grant from the Water Environment Research Foundation through the University of North Carolina Water Resources Research Institute. We thank Curtis L. DeGasperi (King County, Washington State, Department of Natural Resources and Parks), Michael T. Brett (Department of Civil and Environmental Engineering, University of Washington), and Olli Malve (Finnish Environment Institute) for helpful comments on an earlier draft of the manuscript. We also wish to thank Daniel E. Schindler (School of Aquatic and Fisheries Sciences, University of Washington) for supplying zooplankton data for Lake Washington.

Appendix 1

A.1. General description

By expressing the observations as deviations from their means, the following set of three equations provides the general matrix representation of a structural equation model:

$$X = \Lambda_x \xi + \delta, \qquad Y = \Lambda_y \eta + \varepsilon \quad \text{(Measurement model)},$$

$$\eta = B\eta + \Gamma\xi + \zeta \quad \text{(Latent variable model)} \qquad \text{(A.1)}$$

where X is a q × 1 vector of observable indicators of the independent latent variables ξ; Y is a p × 1 vector of observable indicators of the dependent latent variables η; η is a m × 1 vector of dependent (endogenous) latent variables; ξ is a n × 1 vector of independent (exogenous) latent variables; ζ is a m × 1 vector of latent (structural) errors; ε is a p × 1 vector of measurement errors for Y; δ is a q × 1 vector of measurement errors for X; Λy is a p × m matrix of coefficients relating Y to η; Λx is a q × n matrix of coefficients relating X to ξ; Γ is a m × n matrix of coefficients for the latent exogenous variables; B is a m × m matrix of coefficients for the latent endogenous variables.

The statistical assumptions are: (1) E(η) = E(ξ) = E(ε) = E(δ) = E(ζ) = 0, (2) ε is uncorrelated with η, (3) δ is uncorrelated with ξ, (4) ζ is uncorrelated with ξ, (5) ζ, ε and δ are mutually uncorrelated, and (6) (I − B) is non-singular, so that (I − B)−1 exists. In addition, the associated covariance matrices are: Cov(ξ) = Φ (n × n): covariances between the independent variables ξ; Cov(ε) = Θε (p × p): covariances between the measurement errors in Y; Cov(δ) = Θδ (q × q): covariances between the measurement errors in X; Cov(ζ) = Ψ (m × m): covariances between the structural errors ζ.

The (p + q) × (p + q) model-implied covariance matrix Σ(θ) of the observed variables can be partitioned into four matrices Σyy(θ), Σyx(θ), Σxy(θ) and Σxx(θ) that denote the model-implied covariances between the Y observed variables, the X observed variables, and the X and Y observed variables, respectively.

$$\Sigma(\theta) = \begin{bmatrix} \Sigma_{yy}(\theta) & \Sigma_{yx}(\theta) \\ \Sigma_{xy}(\theta) & \Sigma_{xx}(\theta) \end{bmatrix} \qquad \text{(A.2)}$$

Based on the previous assumptions, the matrix (A.2) takes the following form (Bollen, 1989a):

$$\Sigma(\theta) = \begin{bmatrix} \Lambda_y (I - B)^{-1} (\Gamma\Phi\Gamma' + \Psi) [(I - B)^{-1}]' \Lambda_y' + \Theta_\varepsilon & \Lambda_y (I - B)^{-1} \Gamma\Phi\Lambda_x' \\ \Lambda_x \Phi\Gamma' [(I - B)^{-1}]' \Lambda_y' & \Lambda_x \Phi \Lambda_x' + \Theta_\delta \end{bmatrix} \qquad \text{(A.3)}$$

A.2. Lake Washington structural equation model

As an illustrative example, we present the matrices’ forms and the specific assumptions made for the Lake Washington structural equation model. The extraction of the Lake Mendota SEM can be obtained in a similar way. The Lake Washington SEM included two (n = 2) exogenous latent variables ξ, which were described from three (q = 3) indicators; i.e., SRP and DIN were used for the latent variable “Nutrients” and the epilimnion depth for the respective latent variable. Thus, the exogenous latent variable measurement model consists of the following four matrices:

$$X = \begin{bmatrix} X_1 = \text{Epilimnion depth} \\ X_2 = \text{SRP} \\ X_3 = \text{DIN} \end{bmatrix}, \quad \Lambda_X = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \\ 0 & \lambda_3 \end{bmatrix}, \quad \xi = \begin{bmatrix} \xi_1 = \text{Epilimnion depth} \\ \xi_2 = \text{Nutrients} \end{bmatrix}, \quad \delta = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{bmatrix} \qquad \text{(A.4)}$$


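Five indicators (p = 5) were used for the representation of the three (m = 3) endogenous latent variables; i.e., cyanobacteria counts and chlorophyll a were used as indicators for the latent variable Phytoplankton community, the Secchi depth for the Water clarity, and Daphnia and total zooplankton were used to characterize the latent variable Herbivory. Thus, the endogenous latent variable measurement model can be described from the four matrices:

$$Y = \begin{bmatrix} Y_1 = \text{Cyanobacteria} \\ Y_2 = \text{Chlorophyll } a \\ Y_3 = \text{Secchi depth} \\ Y_4 = \text{Total Zooplankton} \\ Y_5 = \text{Daphnia} \end{bmatrix}, \quad \Lambda_Y = \begin{bmatrix} \lambda_4 & 0 & 0 \\ \lambda_5 & 0 & 0 \\ 0 & \lambda_6 & 0 \\ 0 & 0 & \lambda_7 \\ 0 & 0 & \lambda_8 \end{bmatrix}, \quad \eta = \begin{bmatrix} \eta_1 = \text{Phytoplankton community} \\ \eta_2 = \text{Water clarity} \\ \eta_3 = \text{Herbivory} \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{bmatrix} \qquad \text{(A.5)}$$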

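The additional three matrices of the structural equation for the latent variable model are:

$$\Gamma = \begin{bmatrix} \gamma_1 & \gamma_2 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 & 0 \\ \beta_1 & 0 & 0 \\ \beta_2 & 0 & 0 \end{bmatrix}, \quad \zeta = \begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \end{bmatrix} \qquad \text{(A.6)}$$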

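As can be inferred from the path diagram (Fig. 1) and the form of the matrix B, the Lake Washington structural equation model is recursive (no feedback causal relations and uncorrelated measurement or structural errors). Thus, the associated covariance matrices are:

$$\Theta_\varepsilon = \begin{bmatrix} \mathrm{var}(\varepsilon_1) & & & & \\ 0 & \mathrm{var}(\varepsilon_2) & & & \\ 0 & 0 & \mathrm{var}(\varepsilon_3) & & \\ 0 & 0 & 0 & \mathrm{var}(\varepsilon_4) & \\ 0 & 0 & 0 & 0 & \mathrm{var}(\varepsilon_5) \end{bmatrix}, \quad \Theta_\delta = \begin{bmatrix} \mathrm{var}(\delta_1) & & \\ 0 & \mathrm{var}(\delta_2) & \\ 0 & 0 & \mathrm{var}(\delta_3) \end{bmatrix},$$

$$\Psi = \begin{bmatrix} \psi_{11} & & \\ 0 & \psi_{22} & \\ 0 & 0 & \psi_{33} \end{bmatrix}, \quad \Phi = \begin{bmatrix} \varphi_{11} & \\ \varphi_{12} & \varphi_{22} \end{bmatrix} \qquad \text{(A.7)}$$

By substituting (A.4)–(A.7) into (A.3), we determine the four sub-matrices of (A.2):

$$\Sigma_{xx}(\theta) = \begin{bmatrix} \lambda_1^2\varphi_{11} + \mathrm{var}(\delta_1) & & \\ \lambda_1\lambda_2\varphi_{12} & \lambda_2^2\varphi_{22} + \mathrm{var}(\delta_2) & \\ \lambda_1\lambda_3\varphi_{12} & \lambda_2\lambda_3\varphi_{22} & \lambda_3^2\varphi_{22} + \mathrm{var}(\delta_3) \end{bmatrix},$$

$$\Sigma_{yy}(\theta) = \begin{bmatrix}
\lambda_4^2 A_{aux} + \mathrm{var}(\varepsilon_1) & & & & \\
\lambda_4\lambda_5 A_{aux} & \lambda_5^2 A_{aux} + \mathrm{var}(\varepsilon_2) & & & \\
\lambda_4\lambda_6\beta_1 A_{aux} & \lambda_5\lambda_6\beta_1 A_{aux} & \lambda_6^2\beta_1^2 A_{aux} + \lambda_6^2\psi_{22} + \mathrm{var}(\varepsilon_3) & & \\
\lambda_4\lambda_7\beta_2 A_{aux} & \lambda_5\lambda_7\beta_2 A_{aux} & \lambda_6\lambda_7\beta_1\beta_2 A_{aux} & \lambda_7^2\beta_2^2 A_{aux} + \lambda_7^2\psi_{33} + \mathrm{var}(\varepsilon_4) & \\
\lambda_4\lambda_8\beta_2 A_{aux} & \lambda_5\lambda_8\beta_2 A_{aux} & \lambda_6\lambda_8\beta_1\beta_2 A_{aux} & \lambda_7\lambda_8\beta_2^2 A_{aux} + \lambda_7\lambda_8\psi_{33} & \lambda_8^2\beta_2^2 A_{aux} + \lambda_8^2\psi_{33} + \mathrm{var}(\varepsilon_5)
\end{bmatrix}$$

where $A_{aux} = \gamma_1(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) + \gamma_2(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) + \psi_{11}$, and

$$\Sigma_{yx}(\theta) = [\Sigma_{xy}(\theta)]' = \begin{bmatrix}
\lambda_1\lambda_4(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) & \lambda_2\lambda_4(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) & \lambda_3\lambda_4(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) \\
\lambda_1\lambda_5(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) & \lambda_2\lambda_5(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) & \lambda_3\lambda_5(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) \\
\lambda_1\lambda_6\beta_1(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) & \lambda_2\lambda_6\beta_1(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) & \lambda_3\lambda_6\beta_1(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) \\
\lambda_1\lambda_7\beta_2(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) & \lambda_2\lambda_7\beta_2(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) & \lambda_3\lambda_7\beta_2(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) \\
\lambda_1\lambda_8\beta_2(\gamma_1\varphi_{11} + \gamma_2\varphi_{12}) & \lambda_2\lambda_8\beta_2(\gamma_1\varphi_{12} + \gamma_2\varphi_{22}) & \lambda_3\lambda_8\beta_2(\gamma_1\varphi_{12} + \gamma_2\varphi_{22})
\end{bmatrix}$$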
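The closed-form entries of Σyy(θ) can be checked numerically against the general matrix expression in (A.3). The sketch below is plain Python with small helper functions; all parameter values are hypothetical and chosen only for illustration, and the helper names are not taken from the paper. It relies on the fact that B here is strictly lower triangular with B·B = 0, so (I − B)−1 = I + B.

```python
def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical parameter values (not estimates from the study).
lam = {4: 0.8, 5: 1.0, 6: 1.0, 7: 0.6, 8: 1.0}
g1, g2, b1, b2 = 0.3, 0.7, -0.9, 0.5
phi = [[1.2, 0.4], [0.4, 2.0]]
psi = [0.5, 0.3, 0.2]
var_eps = [0.1, 0.2, 0.15, 0.25, 0.3]

lambda_y = [[lam[4], 0.0, 0.0], [lam[5], 0.0, 0.0], [0.0, lam[6], 0.0],
            [0.0, 0.0, lam[7]], [0.0, 0.0, lam[8]]]
gamma = [[g1, g2], [0.0, 0.0], [0.0, 0.0]]
inv_i_minus_b = [[1.0, 0.0, 0.0], [b1, 1.0, 0.0], [b2, 0.0, 1.0]]  # I + B

# Upper-left block of (A.3), built from the matrices in (A.5)-(A.7).
inner = add(matmul(matmul(gamma, phi), transpose(gamma)), diag(psi))
cov_eta = matmul(matmul(inv_i_minus_b, inner), transpose(inv_i_minus_b))
sigma_yy = add(matmul(matmul(lambda_y, cov_eta), transpose(lambda_y)), diag(var_eps))

a_aux = (g1 * (g1 * phi[0][0] + g2 * phi[0][1])
         + g2 * (g1 * phi[0][1] + g2 * phi[1][1]) + psi[0])

# A few closed-form entries, indexed from zero.
checks = {
    (0, 0): lam[4] ** 2 * a_aux + var_eps[0],
    (2, 2): lam[6] ** 2 * b1 ** 2 * a_aux + lam[6] ** 2 * psi[1] + var_eps[2],
    (3, 2): lam[6] * lam[7] * b1 * b2 * a_aux,
    (4, 3): lam[7] * lam[8] * b2 ** 2 * a_aux + lam[7] * lam[8] * psi[2],
}
max_abs_diff = max(abs(sigma_yy[i][j] - value) for (i, j), value in checks.items())
print(f"largest discrepancy: {max_abs_diff:.2e}")
```

The same pattern extends to Σxx(θ) and Σyx(θ) by forming the remaining blocks of (A.3).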

According to the null hypothesis, this model-implied 8 × 8 matrix Σ(θ) is equal to the known sample covariance matrix S. The 36 [= (1/2)(8)(9)] non-redundant elements of the two matrices provide 36 equations that can be solved with one of the common fitting methods (e.g., maximum likelihood, unweighted or generalized least squares). In this study, we used the maximum likelihood (ML) method, where the fitting function that is minimized is (Bollen, 1989a):

$$F_{ML} = \log|\Sigma(\theta)| + \mathrm{tr}\left(S\Sigma^{-1}(\theta)\right) - \log|S| - (p + q) \qquad \text{(A.8)}$$
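To make (A.8) concrete, the following is a minimal pure-Python sketch of the ML fitting function, together with the count of non-redundant covariance elements. The helper names are hypothetical, the 2 × 2 matrices are toy values rather than data from the study, and a dedicated linear algebra library would normally replace the hand-written elimination.

```python
import math


def inverse_and_logdet(a):
    """Gauss-Jordan elimination with partial pivoting.

    Returns (inverse, log-determinant) of a square matrix given as a list
    of rows; assumes a positive determinant, as for a covariance matrix.
    """
    n = len(a)
    # Augment a copy of the matrix with the identity.
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        piv = m[col][col]
        det *= piv
        m[col] = [v / piv for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m], math.log(det)


def f_ml(s, sigma):
    """ML discrepancy (A.8) between sample S and model-implied Sigma."""
    k = len(s)  # k = p + q observed variables
    sigma_inv, logdet_sigma = inverse_and_logdet(sigma)
    _, logdet_s = inverse_and_logdet(s)
    trace = sum(s[i][j] * sigma_inv[j][i] for i in range(k) for j in range(k))
    return logdet_sigma + trace - logdet_s - k


def n_nonredundant(k):
    """Number of distinct elements in a symmetric k x k covariance matrix."""
    return k * (k + 1) // 2


S = [[2.0, 0.5], [0.5, 1.0]]          # toy sample covariance
SIGMA_DIAG = [[2.0, 0.0], [0.0, 1.0]]  # toy model that ignores the covariance
print(f_ml(S, S), f_ml(S, SIGMA_DIAG), n_nonredundant(8))
```

The function is zero when the model reproduces S exactly and positive otherwise, which is why it serves as the quantity to minimize.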

Before evaluating the identification status of the model, it is essential to set the metric of the latent variables. One way that this can be accomplished is by fixing one loading in each column of ΛX and ΛY to 1.0. In this particular case, we assumed that λ2 = λ5 = λ8 = 1.0. Moreover, implicit in the assumption that the latent variables epilimnion depth and water clarity coincide with the observed variables epilimnion depth and Secchi depth is: λ1 = λ6 = 1.0, and δ1 = ε3 = 0. Having determined the metric of the latent variables, there are several rules that can be used to check SEM identification. The easiest to apply is the so-called t-rule. If t is the total number of model parameters that are to be estimated, then this number t must be less than or equal to the number of non-redundant elements in the covariance matrix of the observed (endogenous and exogenous) variables:

$$t \le \tfrac{1}{2}(p + q)(p + q + 1) \qquad \text{(A.9)}$$

In the Lake Washington SEM, the unknown model parameters were 19, i.e., θ = (λ3, λ4, λ7, γ1, γ2, β1, β2, var(ε1), var(ε2), var(ε4), var(ε5), var(δ2), var(δ3), ψ11, ψ22, ψ33, ϕ11, ϕ12, ϕ22), which left 17 (36 − 19) d.f. in the model. Even though in practice the t-rule works for the majority of models (except for very complex ones), it should be noted that it is a necessary but not sufficient condition and does not guarantee model identification (Kaplan, 2000). Additional rules exist and can be used to establish model identification (Bollen, 1989a). We used the software Amos 5 for SEM development (Arbuckle, 1995).

Appendix 2

Using the previous notation, the hierarchical Bayesian configuration of the Lake Washington SEM can be specified as

$$X_{1i} = \lambda_1\xi_{1i} + \delta_1, \quad X_{2i} = \lambda_2\xi_{2i} + \delta_2, \quad X_{3i} = \lambda_3\xi_{2i} + \delta_3, \quad \delta \sim N(0, \Theta_\delta), \quad \xi \sim N(0, \Phi);$$

$$Y_{1i} = \lambda_4\eta_{1i} + \varepsilon_1, \quad Y_{2i} = \lambda_5\eta_{1i} + \varepsilon_2, \quad Y_{3i} = \lambda_6\eta_{2i} + \varepsilon_3, \quad Y_{4i} = \lambda_7\eta_{3i} + \varepsilon_4, \quad Y_{5i} = \lambda_8\eta_{3i} + \varepsilon_5, \quad \varepsilon \sim N(0, \Theta_\varepsilon),$$

$$\eta_{1i} = \gamma_1\xi_{1i} + \gamma_2\xi_{2i} + \zeta_1, \quad \eta_{2i} = \beta_1\eta_{1i} + \zeta_2, \quad \eta_{3i} = \beta_2\eta_{1i} + \zeta_3, \quad \zeta \sim N(0, \Psi) \qquad \text{(B.1)}$$

Let wi = {yi, xi, i = 1, . . ., n} be the joint vector of the observed variables for an arbitrary observation i. According to the model (B.1), each observation i comes from a multivariate normal distribution f(µ(θ)i, Σ(θ)) where µ(θ)i is the conditional mean (expected) vector, Σ(θ) is the conditional covariance matrix and θ is the vector of the unknown model parameters, both given in Appendix 1. The likelihood of w = (w1, . . ., wn) is:

$$p(w \mid \theta) = \prod_{i=1}^{n} (2\pi)^{-(p+q)/2}\, |\Sigma(\theta)|^{-1/2} \exp\left[-\frac{1}{2}\,[w_i - \mu(\theta)_i]'\, \Sigma(\theta)^{-1}\, [w_i - \mu(\theta)_i]\right] \qquad \text{(B.2)}$$
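In practice it is usually the logarithm of (B.2) that is evaluated. Taking logs term by term gives the first line below. The second line is a sketch under two added assumptions that are not stated at this point in the text: that the mean vector is zero (observations expressed as deviations from their means, as in Appendix 1) and that S denotes the sample covariance matrix computed with divisor n.

```latex
\log p(w \mid \theta)
  = -\frac{n(p+q)}{2}\log(2\pi)
    - \frac{n}{2}\log|\Sigma(\theta)|
    - \frac{1}{2}\sum_{i=1}^{n} [w_i - \mu(\theta)_i]'\,\Sigma(\theta)^{-1}\,[w_i - \mu(\theta)_i]

% With \mu(\theta)_i = 0 and S = \frac{1}{n}\sum_{i=1}^{n} w_i w_i' (assumptions):
-\frac{2}{n}\log p(w \mid \theta)
  = \log|\Sigma(\theta)| + \mathrm{tr}\!\left(S\,\Sigma^{-1}(\theta)\right) + (p+q)\log(2\pi)
```

Under these assumptions the right-hand side differs from the fitting function (A.8) only by terms that do not involve θ, so minimizing (A.8) and maximizing this likelihood point to the same parameter values.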


Here q = 3 and p = 5 are the numbers of exogenous and endogenous manifest variables. In the context of Bayesian statistical inference, the focus is on the posterior density of θ given the observed data w, which is defined as

$$p(\theta \mid w) = \frac{p(w \mid \theta)\,p(\theta)}{\int p(w \mid \theta)\,p(\theta)\,d\theta} \propto p(w \mid \theta)\,p(\theta) \qquad \text{(B.3)}$$
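As a small illustration of (B.3), consider a case where the posterior is available in closed form: a normal mean with known data precision and a normal prior. This is a hypothetical one-parameter analogue, not the SEM itself; the data values, the function name, and the two prior precisions are assumptions made for the example. It shows how a nearly flat prior leaves the posterior mean close to the maximum likelihood estimate, while a tighter prior pulls it toward the prior mean.

```python
def normal_posterior(data, data_precision, prior_mean, prior_precision):
    """Posterior mean and precision of a normal mean with known data precision."""
    n = len(data)
    ybar = sum(data) / n
    post_precision = prior_precision + n * data_precision
    post_mean = (prior_precision * prior_mean + n * data_precision * ybar) / post_precision
    return post_mean, post_precision


data = [1.8, 2.4, 2.1, 1.9, 2.3]   # hypothetical observations
mle = sum(data) / len(data)        # ML estimate of the mean

# Nearly flat prior: the likelihood dominates the posterior.
vague_mean, _ = normal_posterior(data, 1.0, prior_mean=0.0, prior_precision=0.0001)
# Tighter prior centred at zero: the posterior mean is shrunk toward it.
tight_mean, _ = normal_posterior(data, 1.0, prior_mean=0.0, prior_precision=5.0)

print(mle, vague_mean, tight_mean)
```

For the SEM no such closed form exists, which is why the posterior is explored by simulation instead.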

Here p(θ) is the prior density of θ, which must be specified for each of the unknown model parameters. We used a Wishart distribution (ρ, R) over the precision matrix Φ−1 that corresponds to vague prior knowledge; i.e., we chose the degrees of freedom ρ to be as small as possible (2, the rank of Φ), and a prior guess at the order of magnitude of the covariance matrix Φ to be R = 0.1I (Ansari et al., 2000). Aside from the cases where no measurement error was assumed between the latent and indicator variables (i.e., epilimnion depth, water clarity), we used independent non-informative conjugate gamma priors (0.01, 0.01) for the elements of the matrices Θδ−1, Θε−1, and Ψ−1 (Spiegelhalter et al., 1996; pg 39). Effectively “flat” normal prior distributions with means equal to the maximum likelihood estimates and precisions equal to 0.0001 were used for the structural parameters and the factor loadings when we compared the Bayesian model configuration with the classical approach (λ2, λ5 and λ8 were kept fixed and equal to 1). The sensitivity of the model results was then tested by sampling the loadings λ2, λ5 and λ8 from the normal distribution (1, 1), which allowed a limited movement (Congdon, 2003). In this case, the remaining parameters were sampled from a normal prior distribution with mean equal to the maximum likelihood estimates and precisions equal to 0.1. MCMC simulation was used as the computation tool implemented in the WinBUGS software (Spiegelhalter et al., 2003). We used three chain runs of 50000 iterations and samples were taken every 50th iteration to avoid serial correlation, and convergence was assessed using the modified Gelman–Rubin convergence statistic (Brooks and Gelman, 1998). Generally, we noticed that the sequences converged very rapidly (≈1000 iterations), while the summary statistics reported in this study were based on the last 20,000 draws by keeping every 5th iteration (thin = 5). In addition, the accuracy of the posterior estimates was inspected by assuring that the Monte Carlo error (an estimate of the difference between the mean of the sampled values and the true posterior mean; see Spiegelhalter et al., 2003) for all the parameters was less than 5% of the sample standard deviation.

References

Anderson, J.C., Gerbing, D.W., 1984. The effect of sampling error on convergence, improper solutions, and goodness-of-fit indexes for maximum-likelihood confirmatory factor-analysis. Psychometrika 49, 155–173.

Anderson, J.C., Gerbing, D.W., 1992. Assumptions and comparative strengths of the two-step approach. Sociol. Meth. Res. 20, 321–333.

Ansari, A., Jedidi, K., Jagpal, S., 2000. A hierarchical Bayesian methodology for treating heterogeneity in structural equation models. Market. Sci. 19, 328–347.

Arbuckle, J.L., 1995. Amos Users’ Guide. Small Waters Corporation, Chicago, IL.

Arhonditsis, G.B., Brett, M.T., 2005. Eutrophication model for Lake Washington (USA): Part II. Model calibration and system dynamics analysis. Ecol. Model. 187, 179–200.

Arhonditsis, G.B., Brett, M.T., DeGasperi, C.L., Schindler, D.E., 2004a. Effects of climatic variability on the thermal properties of Lake Washington. Limnol. Oceanogr. 49, 256–270.

Arhonditsis, G.B., Winder, M., Brett, M.T., Schindler, D.E., 2004b. Patterns and mechanisms of phytoplankton variability in Lake Washington (USA). Water Res. 38, 4013–4027.

Arhonditsis, G.B., Brett, M.T., 2004. Evaluation of the current state of mechanistic aquatic biogeochemical modeling. Mar. Ecol. Prog. Ser. 271, 13–26.

Arhonditsis, G., Brett, M.T., Frodge, J., 2003. Environmental control and limnological impacts of a large recurrent spring bloom in Lake Washington, USA. Environ. Manage. 31, 603–618.

Arminger, G., Muthen, B.O., 1998. A Bayesian approach to non-linear latent variable models using the Gibbs sampler and the Metropolis-Hastings algorithm. Psychometrika 63, 271–300.

Bartholomew, D.J., 1981. Posterior analysis of the factor model. Brit. J. Math. Stat. Psychol. 34, 93–99.

Bentler, P.M., 1990. Comparative fit indexes in structural models. Psychol. Bull. 107, 238–246.

Betz, C.R., Lowndes, M.L., Porter, S., 1997. Non-point Source Control Plan for the Lake Mendota Priority Watershed Project. Wisconsin Department of Natural Resources Publication No. WT-481-97.

Bollen, K.A., 1989a. Structural Equations with Latent Variables. Wiley and Sons, New York.

Bollen, K.A., 1989b. A new incremental fit index for general structural equation models. Sociol. Meth. Res. 17, 303–316.

Bollen, K.A., Stine, R.A., 1992. Bootstrapping goodness-of-fit measures in structural equation models. Sociol. Meth. Res. 21, 205–229.

Boomsma, A., 1982. The robustness of LISREL against small sample sizes in factor analysis models. In: Joreskog, K.G., Wold, H. (Eds.), System under Indirect Observation, Part I. North-Holland, Amsterdam, pp. 149–173.


Brett, M.T., Muller-Navarra, D.C., 1997. The role of highly unsaturated fatty acids in aquatic food–web processes. Freshwater Biol. 38, 483–499.

Brooks, S.P., Gelman, A., 1998. Alternative methods for monitoring convergence of iterative simulations. J. Comput. Graph Stat. 7, 434–455.

Browne, M.W., Cudeck, R., 1993. Alternative ways of assessing model fit. In: Bollen, K.A., Long, J.S. (Eds.), Testing Structural Equation Models. Sage, Newbury Park, CA, pp. 136–162.

Carpenter, S.R., 2002. Ecological futures: building an ecology of the long now. Ecology 83, 2069–2083.

Carpenter, S.R., Kitchell, J.F., Cottingham, K.L., Schindler, D.E., Christensen, D.L., Post, D.M., Voichick, N., 1996. Chlorophyll variability, nutrient input, and grazing: evidence from whole-lake experiments. Ecology 77, 725–735.

Congdon, P., 2003. Applied Bayesian Modeling. Wiley Series in Probability and Statistics, West Sussex, England.

Costanza, R., Sklar, F.H., 1985. Articulation, accuracy and effectiveness of mathematical models - A review of fresh-water wetland applications. Ecol. Model. 27, 45–68.

Dorazio, R.M., Johnson, F.A., 2003. Bayesian inference and decision theory - A framework for decision making in natural resource management. Ecol. Appl. 13, 556–563.

Dokulil, M.T., Teubner, K., 2000. Cyanobacterial dominance in lakes. Hydrobiologia 438, 1–12.

Downing, J.A., Watson, S.B., McCauley, E., 2001. Predicting cyanobacteria dominance in lakes. Can. J. Fish Aquat. Sci. 58, 1905–1908.

Edmondson, W.T., 1994. Sixty years of Lake Washington: a curriculum vitae. Lake Reserv. Manage. 10, 75–84.

Grace, J.B., 2001. The roles of community biomass and species pools in the regulation of plant diversity. Oikos 92, 193–207.

Grace, J.B., Pugesek, B.H., 1998. On the use of path analysis and related procedures for the investigation of ecological problems. Am. Nat. 152, 151–159.

Grace, J.B., Pugesek, B.H., 1997. A structural equation model of plant species richness and its application to a coastal wetland. Am. Nat. 149, 436–460.

Hair Jr., J.F., Anderson, R.E., Tatham, R.L., Black, W.C., 1995. Multivariate Data Analysis, 4th ed. Prentice-Hall, New York.

Hayduk, L.A., 1996. LISREL: Issues, Debates and Strategies. Johns Hopkins University Press, Baltimore.

Hoelter, J.W., 1983. The analysis of covariance structures: goodness-of-fit indices. Sociol. Meth. Res. 11, 325–344.

Holling, C.S., 1978. Adaptive Environmental Assessment and Management. John Wiley and Sons, NY.

Hovmand, P.S., 2003. Analyzing dynamic systems: a comparison of structural equation modeling and systems dynamic modeling. In: Pugesek, B.H., Tomer, A., Von Eye, A. (Eds.), Structural Equation Modeling: Applications in Ecological and Evolutionary Biology. Cambridge University Press, Cambridge, UK, pp. 85–112.

Joreskog, K.G., 1973. A general method for estimating a linear structural equation system. In: Goldberger, A.S., Duncan, O.D. (Eds.), Structural Equation Model in the Social Sciences. Academic Press, New York, pp. 85–112.

Joreskog, K.G., Sorbom, D., 1993. LISREL 8: Structural Equation Modeling with the SIMPLIS Command Language. Erlbaum Associates, Hillsdale, NJ.

Jorgensen, S.E., 1999. State-of-the-art of ecological modeling with emphasis on development of structural dynamic models. Ecol. Model. 120, 75–96.

Jorgensen, S.E., 1997. Integration of Ecosystem Theories: A Pattern. Kluwer Academic Publishers, Dordrecht.

Kaplan, D., 2000. Structural Equation Modeling: Foundations and Extensions. Sage Publications, Newbury Park, Calif.

Keesling, J.W., 1972. Maximum likelihood approaches to causal analysis. PhD Dissertation. University of Chicago, Chicago.

Kilham, S.S., Kreeger, D.A., Goulden, C.E., Lynn, S.G., 1997. Effects of algal food quality on fecundity and population growth rates of Daphnia. Freshwater Biol. 38, 639–647.

Kleppel, G.S., Burkart, C.A., Houchin, L., 1998. Nutrition and the regulation of egg production in the calanoid copepod Acartia tonsa. Limnol. Oceanogr. 43, 1000–1007.

Kline, R.B., 1998. Principles and Practice of Structural Equation Modeling. The Guildford Press, New York.

La Peyre, M.K., Mendelssohn, I.A., Reams, M.A., Templet, P.H., Grace, J.B., 2001. Identifying determinants of nations’ wetland management programs using structural equation modeling: an exploratory analysis. Environ. Manage. 27, 859–868.

Lathrop, R.C., Carpenter, S.R., Robertson, D.M., 1999. Summer water clarity responses to phosphorus, Daphnia grazing, and internal mixing in Lake Mendota. Limnol. Oceanogr. 44, 137–146.

Lathrop, R.C., Carpenter, S.R., Stow, C.A., Soranno, P.A., Panuska, J.C., 1998. Phosphorus loading reductions needed to control blue-green algal blooms in Lake Mendota. Can. J. Fish Aquat. Sci. 55, 1169–1178.

Lathrop, R.C., Carpenter, S.R., Rudstam, L.G., 1996. Water clarity in Lake Mendota since 1900: responses to differing levels of nutrients and herbivory. Can. J. Fish Aquat. Sci. 53, 2250–2261.

Lathrop, R.C., Carpenter, S.R., 1992a. Phytoplankton and their relationship to nutrients. In: Kitchell, J.F. (Ed.), Food Web Management: A Case Study of Lake Mendota, Wisconsin. Springer-Verlag, New York, pp. 99–128.

Lathrop, R.C., Carpenter, S.R., 1992b. Zooplankton and their relationship to phytoplankton. In: Kitchell, J.F. (Ed.), Food Web Management: A Case Study of Lake Mendota, Wisconsin. Springer-Verlag, New York, pp. 129–152.

Lee, S.Y., 1981. A Bayesian-approach to confirmatory factor analysis. Psychometrika 46, 153–160.

Lee, S.Y., 1992. Bayesian-analysis of stochastic constraints in structural equation models. Brit. J. Math. Stat. Psychol. 45, 93–107.

Lee, S.Y., Ho, W.T., 1993. Analysis of multisample identified and non-identified structural equation models with stochastic constraints. Comput. Stat. Data Anal. 16, 441–453.

Lee, S.Y., Shi, J.Q., 2000. Joint Bayesian analysis of factor scores and structural parameters in the factor analysis model. Ann. I. Stat. Math. 52, 722–736.

Lee, S.Y., Song, X.Y., 2003. Bayesian analysis of structural equation models with dichotomous variables. Stat. Med. 22, 3073–3088.


Lee, S.Y., Song, X.Y., 2004. Bayesian model comparison of non-linear structural equation models with missing continuous and ordinal categorical data. Brit. J. Math. Stat. Psychol. 57, 131–150.

Lee, S.Y., Zhu, H.T., 2000. Statistical analysis of non-linear structural equation models with continuous and polytomous data. Brit. J. Math. Stat. Psychol. 53, 209–232.

Legendre, L., Legendre, P., 1983. Numerical Ecology. Elsevier Science Publications, Amsterdam.

Lehman, J.T., 1988. Hypolimnetic metabolism in Lake Washington - Relative effects of nutrient load and food web structure on lake productivity. Limnol. Oceanogr. 33, 1334–1347.

Levin, S.A., 1992. The problem of pattern and scale in ecology. Ecology 73, 1943–1967.

Levins, R., 1966. The strategy of model building in population biology. Am. Sci. 54, 421–431.

Malaeb, Z.A., Summers, J.K., Pugesek, B.H., 2000. Using structural equation modeling to investigate relationships among ecological variables. Environ. Ecol. Stat. 7, 93–111.

Mardia, K.V., 1974. Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies. Sankhya, Ser B 36, 115–128.

Martin, J.K., McDonald, R.P., 1975. Bayesian estimation in unrestricted factor analysis: a treatment for Heywood cases. Psychometrika 40, 505–517.

McCune, B., Grace, J.B., 2002. Analysis of Ecological Communities. MjM Software Design, Oregon, USA.

Mitchell, R.J., 1992. Testing evolutionary and ecological hypotheses using path-analysis and structural equation modeling. Funct. Ecol. 6, 123–129.

O’Neill, R.V., DeAngelis, D.L., Waide, J.B., Allen, T.F.H., 1986. A Hierarchical Concept of Ecosystems. Princeton University Press, Princeton, New Jersey, USA.

Oreskes, N., Shrader-Frechette, K., Belitz, K., 1994. Verification, validation, and confirmation of numerical models in the earth sciences. Science 263, 641–646.

Paap, R., 2002. What are the advantages of MCMC based inference in latent variable models? Stat. Neerl. 56, 2–22.

Pace, M.L., 2001. Prediction and the aquatic sciences. Can. J. Fish Aquat. Sci. 58, 63–72.

Peters, R.H., 1991. A Critique for Ecology. Cambridge University Press, Cambridge, UK.

Petraitis, P.S., Dunham, A.E., Niewiarowski, P.H., 1996. Inferring multiple causality: the limitations of path analysis. Funct. Ecol. 10, 421–431.

Pugesek, B.H., Tomer, A., Von Eye, A., 2003. Structural Equation Modeling: Applications in Ecological and Evolutionary Biology. Cambridge University Press, Cambridge, UK.

Pugesek, B.H., Tomer, A., 1996. The Bumpus house sparrow data: a reanalysis using structural equation models. Evol. Ecol. 10, 387–404.

Pugesek, B.H., Tomer, A., 1995. Determination of selection gradients using multiple-regression versus structural equation models (SEM). Biomet. J. 37, 449–462.

Rastetter, E.B., King, A.W., Cosby, B.J., Hornberger, G.M., O’Neill, R.V., Hobbie, J.E., 1992. Aggregating fine-scale ecological knowledge to model coarser-scale attributes of ecosystems. Ecol. Appl. 2, 55–70.

Raykov, T., 1992. On structural models for analyzing change. Scand. J. Psychol. 33, 247–265.

Reckhow, K.H., 1999. Water quality prediction and probability network models. Can. J. Fish Aquat. Sci. 56, 1150–1158.

Reckhow, K.H., Arhonditsis, G.B., Kenney, M.A., Hauser, L., Tribo, J., Wu, C., Elcock, K.J., Steinberg, L.J., Stow, C.A., McBride, S.J., 2005. A predictive approach to nutrient criteria. Environ. Sci. Technol. 39, 2913–2919.

Richey, J.E., 1979. Patterns of phosphorus supply and utilization in Lake Washington and Findley Lake. Limnol. Oceanogr. 24, 906–916.

Salthe, S.N., 1985. Evolving Hierarchical Systems. Columbia University Press, New York, USA.

Scheines, R., Hoijtink, H., Boomsma, A., 1999. Bayesian estimation and testing of structural equation models. Psychometrika 64, 37–52.

Shipley, B., 2000. Cause and Correlation in Biology: A User’s Guide to Path Analysis, Structural Equations and Causal Inference. Cambridge University Press, Cambridge, UK.

Shipley, B., 1997. Exploratory path analysis with applications in ecology and evolution. Am. Nat. 149, 1113–1138.

Smith, F.A., 1995. Scaling of digestive efficiency with body-mass in Neotoma. Funct. Ecol. 9, 299–305.

Sommer, U., 1989. The role of competition for resources in phytoplankton succession. In: Sommer, U. (Ed.), Phytoplankton ecology. Succession in plankton communities. Springer-Verlag, New York, pp. 57–106.

Soranno, P.A., 1997. Factors affecting the timing of surface scums and epilimnetic blooms of blue-green algae in a eutrophic lake. Can. J. Fish Aquat. Sci. 54, 1965–1975.

Soranno, P.A., Carpenter, S.R., Lathrop, R.C., 1997. Internal phosphorus loading in Lake Mendota: response to external loads and weather. Can. J. Fish Aquat. Sci. 54, 1883–1893.

Spiegelhalter, D., Thomas, A., Best, N., Gilks, W., 1996. Bayesian Inference Using Gibbs Sampling Manual, Version ii. Available at http://www.mrc-bsu.cam.ac.uk/bugs.

Spiegelhalter, D., Thomas, A., Best, N., Lunn, D., 2003. WinBUGS User Manual, Version 1.4. Available at http://www.mrc-bsu.cam.ac.uk/bugs.

Sterner, R.W., Hessen, D.O., 1994. Algal nutrient limitation and the nutrition of aquatic herbivores. Annu. Rev. Ecol. Syst. 25, 1–29.

Stow, C.A., Borsuk, M.E., 2003. Enhancing causal assessment of estuarine fishkills using graphical models. Ecosystems 6, 11–19.

Sugihara, G., May, R.M., 1990. Applications of fractals in ecology. Trends Ecol. Evol. 5, 79–86.

Vepsalainen, K., Spence, J.R., 2000. Generalization in ecology and evolutionary biology: from hypothesis to paradigm. Biol. Philos. 15, 211–238.

Wright, S., 1918. On the nature of size factors. Genetics 3, 367–374.

Wright, S., 1921. Correlation and causation. J. Agr. Res. 20, 557–585.

Zhu, H.T., Lee, S.Y., 1999. Statistical analysis of non-linear factor analysis models. Brit. J. Math. Stat. Psychol. 52, 225–242.