macroecological bhpmf – a hierarchical bayesian approach

MACROECOLOGICALMETHODS

BHPMF – a hierarchical Bayesianapproach to gap-filling and traitprediction for macroecology andfunctional biogeographyFranziska Schrodt1,2,3,*, Jens Kattge1,2, Hanhuai Shan4,5, Farideh Fazayeli4,

Julia Joswig1, Arindam Banerjee4, Markus Reichstein1, Gerhard Bönisch1,

Sandra Díaz6, John Dickie7, Andy Gillison8, Anuj Karpatne4, Sandra Lavorel9,

Paul Leadley10, Christian B. Wirth2,11, Ian J. Wright12, S. Joseph Wright13 and

Peter B. Reich3,14

1Max Planck Institute for Biogeochemistry,

Hans-Knöll-Strasse 10, 07745 Jena, Germany,2German Centre for Integrative Biodiversity

Research (iDiv) Halle-Jena-Leipzig, Deutscher

Platz 5, 04103 Leipzig, Germany, 3Department

of Forest Resources, University of Minnesota,

St Paul, MN 55108, USA, 4Department of

Computer Science and Engineering, University

of Minnesota, Twin Cities, USA, 5Microsoft

Corporation, One Microsoft Way, Redmond,

WA 98052, USA, 6Instituto Multidisciplinario

de Biología Vegetal (IMBIV – CONICET) and

Departamento de Diversidad Biológica y

Ecología, FCEFyN, Universidad Nacional de

Córdoba, CC 495, 5000, Córdoba, Argentina,7Royal Botanic Gardens Kew, Wakehurst Place,

RH17 6TN, UK, 8Center for Biodiversity

Management, Yungaburra 4884, Queensland,

Australia, 9Centre National de la Recherche

Scientifique, Grenoble, France, 10Laboratoire

ESE, Université Paris-Sud, UMR 8079 CNRS,

UOS, AgroParisTech, 91405 Orsay, France,11University of Leipzig, Leipzig, Germany,12Department of Biological Sciences,

Macquarie University, NSW 2109, Australia,13Smithsonian Tropical Research Institute,

Apartado 0843-03092, Balboa, Republic of

Panama, 14Hawkesbury Institute for the

Environment, University of Western Sydney,

Locked Bag 1797, Penrith, NSW 2751

Australia

ABSTRACT

Aim Functional traits of organisms are key to understanding and predicting bio-diversity and ecological change, which motivates continuous collection of traits andtheir integration into global databases. Such trait matrices are inherently sparse,severely limiting their usefulness for further analyses. On the other hand, traits arecharacterized by the phylogenetic trait signal, trait–trait correlations and environ-mental constraints, all of which provide information that could be used to statis-tically fill gaps. We propose the application of probabilistic models which, for thefirst time, utilize all three characteristics to fill gaps in trait databases and predicttrait values at larger spatial scales.

Innovation For this purpose we introduce BHPMF, a hierarchical Bayesianextension of probabilistic matrix factorization (PMF). PMF is a machine learningtechnique which exploits the correlation structure of sparse matrices to imputemissing entries. BHPMF additionally utilizes the taxonomic hierarchy for traitprediction and provides uncertainty estimates for each imputation. In combinationwith multiple regression against environmental information, BHPMF allows forextrapolation from point measurements to larger spatial scales. We demonstrate theapplicability of BHPMF in ecological contexts, using different plant functional traitdatasets, also comparing results to taking the species mean and PMF.

Main conclusions Sensitivity analyses validate the robustness and accuracy ofBHPMF: our method captures the correlation structure of the trait matrix as wellas the phylogenetic trait signal – also for extremely sparse trait matrices – andprovides a robust measure of confidence in prediction accuracy for each missingentry. The combination of BHPMF with environmental constraints provides apromising concept to extrapolate traits beyond sampled regions, accounting forintraspecific trait variability. We conclude that BHPMF and its derivatives have ahigh potential to support future trait-based research in macroecology and func-tional biogeography.

KeywordsBayesian hierarchical model, gap-filling, imputation, machine learning, matrixfactorization, PFT, plant functional trait, sparse matrix, spatial extrapolation,TRY.

*Correspondence: Franziska Schrodt, MaxPlanck Institute for Biogeochemistry,Hans-Knöll-Strasse 10, 07745 Jena, Germany.E-mail: [email protected]

bs_bs_banner

Global Ecology and Biogeography, (Global Ecol. Biogeogr.) (2015)

© 2015 John Wiley & Sons Ltd DOI: 10.1111/geb.12335http://wileyonlinelibrary.com/journal/geb 1

INTRODUCTION

Functional trait measurements and analyses have been the focus

of numerous studies in recent decades (e.g. Reich et al., 1997;

Wright et al., 2004; Chave et al., 2009; Schrodt et al., 2015).

However, due to the time and resources required and the sheer

number of species on earth, only a small number of species and

their traits could be captured to date, especially in tropical and

remote ecosystems. In addition, trait data are highly dispersed

among numerous datasets and are often not accessible to the

wider scientific community. The integration of databases is thus

becoming increasingly important for the consolidation of glob-

ally dispersed data, as a source of standardized data for further

applications, such as model building and validation, and to

coordinate future measurement efforts.

Combining trait observations from studies with different

research foci produces matrices with substantial gaps. For

example, the largest database for plant traits to date, TRY

(Kattge et al., 2011), currently contains 215 datasets with 5.6

million trait entries for 1100 traits of 2 million individuals,

representing 100,000 plant species. On average only 2 of the

1100 traits represented in TRY are measured for any individual,

restricting the usefulness of combined datasets especially for

multivariate analyses.

General characteristics of traits

Some characteristics inherent to functional traits may support

statistical gap-filling of sparse trait matrices: a strong

phylogenetic trait signal, functional and structural trade-offs

between traits and trait–environment relationships.

The phylogenetic trait signal is an effective aid in predicting

trait values (e.g. Lovette & Hochachka, 2006; Swenson, 2014):

the closer two individuals are related, the more similar their

traits will be – with exceptions due to convergent evolution and

environmental diversification. This phylogenetic trait signal is

reflected in the taxonomic hierarchy: on average individuals

within a species are more similar than individuals within genera,

families or phylogenetic groups (Kerkhoff et al., 2006; Swenson

& Enquist, 2007). In general, most of the trait variance is

observed between species (Kattge et al., 2011), differences

between species are consistent across spatial scales (Kazakou

et al., 2014) and mean trait values at native ranges are appropri-

ate estimates at invaded areas (McMahon, 2002; Ordonez,

2014). Due to these frequently observed patterns, mean trait

values of species and even genera are often used to fill gaps in

trait matrices (Fried et al., 2012; Cordlandwehr et al., 2013).

Traits possessed by any individual are not independent –

functional and structural trade-offs cause correlations between

traits (Reich et al., 1997). This has been characterized at the

global scale, for example for the leaf and wood economic

spectra, respectively (Wright et al., 2004; Chave et al., 2009).

Trait variability is also influenced by the environment which, on

the one hand, causes large-scale patterns, for example latitudinal

gradients of leaf N/P stoichiometry (Reich & Oleksyn, 2004) or

changes in the competitive success of Drosophila species along a

temperature gradient (Davis et al., 1998), and on the other hand

small-scale intraspecific variation, for example in plant traits

(Albert et al., 2010; Clark, 2010) and mammalian body size

(Diniz-Filho et al., 2007).

Gap-filling in trait ecology

Many disciplines, for example psychology (e.g. Schafer &

Graham, 2002), sociology (e.g. Johnson & Young, 2011) or

biogeochemistry (e.g. Moffat et al., 2007; Lasslop et al., 2010)

have developed their own set of accepted techniques for predict-

ing missing values tailored to their data structure. In trait

ecology, due to the relatively recent advent of large databases,

gap-filling methods have been adopted from other disciplines,

often without adjusting them to differences in data structure. In

general three approaches are used within the community of trait

ecologists: (1) deleting rows with missing cases or pair-wise

analysis; (2) predicting trait values based on the taxonomic trait

signal at species or genus level (mean traits); and (3) predicting

values for individual missing entries based on structure, e.g.

multiple imputation (MI) (Rubin, 1987; Su et al., 2011) and

multivariate imputation using chained equations (MICE)

(Rubin, 1987; van Buuren & Groothuis-Oudshoorn, 2011),

or/and spatial correlations within the trait matrix (e.g. kriging;

Lamsal et al., 2012; Räty & Kangas, 2012). These approaches

provide useful imputations in some cases, but whilst deleting

missing cases can introduce bias in model parameters

(Nakagawa & Freckleton, 2008), taking the ‘mean’ adds new data

points without adding new information, which results in incor-

rect confidence limits. The techniques mentioned in (3) on the

other hand were developed to fill gaps in databases with 10–30%

missing entries, but not for a sparsity as high as that observed in

combined trait databases such as TRY.

A recent exercise comparing different approaches (amongst

them ‘mean’ and MICE) to filling gaps in a plant trait matrix –

although at the species rather than the individual level –showed

that all methods were indeed only effective up to 30% gaps, that

genus mean produced the least reliable results and that includ-

ing ecological theory, for example by taking into account trait–

trait correlations, will substantially improve the accuracy of gap-

filling approaches (Taugourdeau et al., 2014). Ogle (2013)

developed a hierarchical Bayesian model, which could be used

for gap-filling of individual trait values based on taxonomic

hierarchy and covariates. However, this method only allows for

one trait to be predicted at a time, ignoring the correlation

structure within a multi-trait matrix. In contrast, some

phylogenetic imputation methods, as reviewed recently by

Swenson (2014), do allow for multiple trait predictions but only

give species or higher-order means and lack any accounting for

intraspecific variability. While this may be resolved, for example

by adding ‘twigs’ on the ends of phylogenies with terminal nodes

as species (Swenson, pers. comm.) to the authors’ knowledge,

this has not been implemented yet.

Here we present a new approach – Bayesian hierarchical

probabilistic matrix factorization (BHPMF) – which imputes

trait values based on the taxonomic hierarchy, structure within

F. Schrodt et al.

Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd2

the trait matrix and trait–environment relationships at the same

time as providing uncertainty estimates for each single trait

prediction. An extension of the method provides a concept for

out-of-sample predictions – the extrapolation of point measure-

ments to spatial scales beyond measured areas. We evaluate trait

predictions by BHPMF under various aspects and provide per-

spectives for future developments.

METHODS

Probabilistic matrix factorization

The method we present is a development of probabilistic matrix

factorization (PMF) (Salakhutdinov & Mnih, 2008). PMF is a

recommendation system developed on the example of predict-

ing users’ preferences in movies from other users’ movie ratings

(Netflix, 2009). Due to its good scalability and predictive accu-

racy, even for highly sparse datasets, PMF has become a standard

technique for imputing missing data (Koren et al., 2009).

PMF models a sparse matrix, such as the TRY database, as the

scalar (or inner, dot) product of two latent matrices with the aim

of finding a factorization that minimizes the error between pre-

dicted and observed data. This technique is closely related to

principal components analysis (PCA), which converts a set of

observations of correlated variables into a set of values of lin-

early uncorrelated variables called principal components. Like

PCA, PMF is efficient if the original matrix is of low rank, i.e. if

the axes of the original matrix provide strong correlations.

In the original case, each column represents a video and each

row a user, providing a video ranking (Salakhutdinov & Mnih,

2008). In the case of a trait matrix, videos are replaced by traits

and users by entities. In our case, entities are defined as individ-

ual plants. They could equally represent the level of an organ or

the average over different organisms, for example nitrogen

content of a single leaf, motility of a phytoplankton species or

the average beak length of several individuals of an avian species.

For simplicity, we refer to the individual plant as plant from now

on.

In a first step, independent latent vectors are generated over

each row (individual, u) and column (trait, v) of the plant × trait

matrix X N M∈ ×R with N rows and M columns (Fig. 1). Any

missing entry (n, m) in the original matrix X can be predicted as

the inner product of these latent vectors xnm = ⟨un, vm⟩ (Fig. 1).

PMF has been shown to be applicable to biological data, for

example in population genetics (Duforet-Frebourg and Blum,

2014). However, our first experiments indicated that for plant

traits the prediction accuracy of PMF was insufficient: the accu-

racy was worse than using species mean trait values to fill the

gaps (see Results). We therefore developed an extension of PMF,

which accounts for a plants’ taxonomic hierarchy to improve

prediction accuracy (Bayesian hierarchical PMF; BHPMF). The

concept was developed as HPMF in the context of machine

learning (Shan et al., 2012). We here introduce the additional

application of a Gibbs sampler in order to provide a measure of

uncertainty for each imputed trait value (BHPMF) (see ‘Gibbs

sampler – uncertainty quantified trait prediction’), as well as an

extension to facilitate out-of-sample predictions (aHPMF).

Bayesian hierarchical probabilisticmatrix factorization

BHPMF exploits the taxonomic hierarchy of the plant kingdom

as a proxy for the phylogenetic trait signal, with the individual

plant being nested in species, species in genus, genus in family

and family in phylogenetic group (Fig. 2).

BHPMF sequentially performs PMF at the different hierar-

chical levels, using latent vectors of the neighbouring level (ℓ) as

prior information at the current level. For example, trait data

averaged at species level are used to optimize latent vectors at

species level, which in turn act as priors for latent vectors at

the individual level, which finally are optimized against the

observed trait entries in the trait matrix (Fig. 2, equation 1).The

sequential approach across the taxonomic hierarchy turned out

to be most effective if applied iteratively top down and bottom

up.

After transformation of traits to approximate normal distri-

butions and z-score transformation, the cost function is devel-

oped as the sum of absolute deviations of predictions versus

observations for traits (m) of entities (n) (first summand in

equation 1) and the sum of absolute deviations of posterior and

prior of the latent factors u and v (second and third summand of

equation 1) across all hierarchical levels (L):

E xnm nm n m

nm

L

u n p n

= − ⟨ ⟩( )⎧⎨⎩

+ −

( ) ( ) ( ) ( )

=

( )( )−

∑∑ δ

λ

� � � �

�

� �

u v

u u

,2

1

11

2

2 12

2( ) ( ) −( )∑ ∑+ − ⎫⎬⎭n

v m m

m

λ v v� � ,

(1)

Figure 1 Schematic of the probabilistic matrix factorization(PMF) model. u denotes the latent vector on the individual plantside, v the latent vector on the functional trait side, both of whichhave a Gaussian normal distribution with a mean of 0 and avariance of σ2. Each missing entry Xij can be approximated by theproduct of the transposed latent vector U and the latent vector V.

Gap-filling in trait databases

Global Ecology and Biogeography, © 2015 John Wiley & Sons Ltd 3

where λ σ σu u= 2 2 , λ σ σv v= 2 2 with σ being the standard devia-

tion of the imputations. {·} denotes the set of data at all L levels,

and δnm�( ) = 1 when the entry (n, m) of X �( ) is non-missing and 0

otherwise. Replace ℓ − 1 with ℓ + 1 and the parent node (p(n))

with the child node (c(n)) for the bottom-up approach. For

details see BHPMF in Appendix S1 in Supporting Information.

Gibbs sampler – uncertainty quantifiedtrait prediction

The parameters of our BHPMF model are optimized against the

observations in the matrix using a Gibbs sampler (Fazayeli et al.,

2014). The Gibbs sampler is a Markov chain Monte Carlo

(MCMC) method, which samples the probability density distri-

butions of model parameters (here the latent vectors) and

model predictions (here entries in the plant × trait matrix) (see

the grey inset in Fig. 2 and ‘Model evaluation’, as well as Gibbs

sampler results in Appendix S10). The Gibbs sampler-inferred

density distributions of trait values are then used to infer the

most likely imputation value, as well as the associated uncer-

tainty for each prediction.

aHPMF – extrapolation from point measurements toregional scales

If BHPMF is stopped at the species level, i.e. without accounting

for trait variability specific to individual plants, the residual

error represents the intraspecific variability and modelling/

measurement errors. aHPMF focuses on explaining this residual

trait variability based on environmental variables, such as soil

and climate characteristics of their growth environment in order

to enable out-of-sample prediction, i.e. trait predictions for

individual plants where the only known factors are species iden-

tity and location but no traits have actually been measured on

the given individual (Fig. 3).

To capture trait variability that can be attributed to environ-

mental factors, we utilize a hierarchical regression framework,

taking into account the taxonomic structure of plants to regu-

larize the regression model. The regression framework takes as

independent predictor variables the climatic and soil variables

mentioned below at locations with georeferenced trait measure-

ments. The residuals of BHPMF for the 13 plant traits of each

observation are considered as the target dependent variables to

be predicted. We treat each plant trait independently of every

other while regressing them using climate and soil features.

In essence, combining BHPMF with least squares regression

over the residuals against environmental factors, we can model

the unknown value for species n and trait m in a probabilistic

model as

k u v w x enm n m nm= + +α βT T (2)

where (un, vm) are the latent factors, with un having a hierarchical

prior from the taxonomy and x being the environmental condi-

tion with w as the regression coefficient. enm is the zero mean

Gaussian noise. Note that α, β are scalar parameters: for BHPMF

set (α = 1, β = 0), for aHPMF set (α = 1, β = 1). For details see

‘aHPMF’ in Appendix S1.

Data: traits, climate and soil

We demonstrate the applicability of the methods introduced

above on the example of a trait matrix derived from TRY. For

details on data standardization see Kattge et al. (2011). The

spatial distribution of measurement sites and detailed informa-

tion on the original datasets are shown in Fig. S4.1 (Appendix

S4) and Tables S3.1 & S3.2 in Appendix S3.

We extracted a matrix of 13 georeferenced traits consisting of

204,404 trait measurements on 78,300 individuals, spanning

14,320 species, 3793 genera, 358 families and 6 phylogenetic

Figure 2 Schematic of the Bayesian hierarchical probabilisticmatrix factorization (BHPMF) model. N denotes the entity(individual plant) side and U the corresponding matrix of latentvectors on the row side, M the trait side and V the correspondingmatrix of latent vectors on the column side. x denotes an entry inthe original plant × trait matrix S. The numbers in parenthesesshow the taxonomic level L. For example (4) is the species levelwhereas (2) is the family level. The grey inset provides a schemafor the Gibbs sampler where p(n) is the parent node of n in theupper level and c(n) is the set of child nodes n in the lower level.

Figure 3 Schematic of the advanced hierarchical probabilisticmatrix factorization (aHPMF) model. w denotes the regressioncoefficient at different levels of the hierarchy and Q thecorresponding matrix of latent vectors. The numbers inparentheses shows the taxonomic level L.

F. Schrodt et al.


groups. The sparsity ranged from 49.63% for leaf area to 92.33%

for the leaf N to P ratio, with an average sparsity of 79.9% across

the trait matrix (Table 1). All traits were log- and z-transformed

to improve normality and equalize traits in the cost function

during optimization.

For out-of-sample predictions by aHPMF, climate data for

mean annual precipitation, mean annual temperature, isother-

mally and precipitation seasonality were extracted from the

WorldClim dataset (Hijmans et al., 2005) and soil texture (sand,

silt, clay) and soil organic carbon content in the top soil from the

Harmonized World Soil Database v1.2 (FAO et al., 2012).

Model evaluation

We ran PMF, BHPMF and aHPMF on the test dataset extracted

from TRY. Given the plant × trait matrix, we randomly selected

80% of entries for training (parameter setting), 10% for valida-

tion (parameter adjustment by optimizing performance) and

10% for test (independent performance testing after parameter

adjustment and learning). This cross-validation improves model

fidelity by ensuring that none of the observations are known by

the model when performing new predictions. Test entries

without training data in the same row would have highly

inflated variance. Such cases were prevented by adjusting the

splitting accordingly (see ‘BHPMF’ in Appendix S1).

We evaluated the predicted trait values, using the root mean

squared error (RMSE; see equation S13 in Appendix S1) and the

correlation coefficient (R2) of z-transformed predicted versus

observed traits as indicators of overall prediction accuracy. We

compared the performance of PMF, BHPMF and aHPMF with a

baseline of species mean trait values (MEAN), which uses the

overall trait mean of all individual plants within a species for

prediction. The effectiveness of capturing the phylogenetic trait

signal was explored by performing BHPMF including increas-

ingly detailed taxonomic information (Fig. 2).

In order to evaluate how well not only predicted versus meas-

ured but also trait-trait correlations are preserved in BHPMF, we

performed standardized major axis (SMA) regression, the first

principal component vector of a correlation matrix fitted

through the data centroid (Taskinen & Warton, 2011), on the

measured and imputed trait values for some key trait correla-

tions. We also performed a Procrustes analysis with PROTEST

(using the R package ‘vegan’) on a PCA of a subset of the

original data versus a PCA based on the estimated values for

artificially introduced gaps. Due to its good data cover, we per-

formed this test on the RAINFOR extract from the TRY database

(see below). Procrustes is a statistical shape analysis tool (least-

squares orthogonal mapping) which compares two ‘superim-

posed’ matrices for overlap, with placement in space and object

size being adjustable. We show how uncertainty in trait predic-

tions is accounted for using the Gibbs sampler, comparing pre-

diction confidence (SD) with prediction accuracy (RMSE).

The sensitivity of BHPMF to the fraction of gaps and the

effect of using a global database to fill gaps in local or regional

datasets were explored using two approaches. First by ‘cutting

out’ a local dataset with high coverage, adding additional gaps

(0, 10, 30, 60 and 80%; see Table S8.1 in Appendix S8) and

second by using a regional gappy dataset, filling gaps in each of

these ‘cut-outs’ using (1) the global data with information from

the local/regional data and (2) just the local/regional data. For

our local example, we extracted TRY trait data contributed by

the RAINFOR group (Fyllas et al., 2009), which shows a good

coverage (sparsity 11%) and covers most of the Amazon

(Fig. S8.1 in Appendix S8). For our regional example, we

extracted all of the European data (sparsity 72%) (Fig. S8.2 in

Appendix S8). For details on methodology please refer to

Table 1 Number of entries, sparsity androot mean square error (RMSE) ofspecies mean (MEAN), probabilisticmatrix factorization (PMF), Bayesianhierarchical PMF (BHPMF) andadvanced hierarchical PMF (aHPMF) bytrait, as well as R2 values of theregression of imputed versus measuredtraits. The lowest RMSE and highest R2

are shown in bold.

Trait Entries Sparsity

MEAN PMF BHPMF aHPMF

RMSE R2 RMSE R2 RMSE R2 RMSE R2

SLA 33001 57.9 0.53 0.85 0.88 0.49 0.46 0.88 0.53 0.89Plant height 16465 79.0 0.47 0.90 0.91 0.40 0.40 0.92 0.44 0.93Seed mass 7311 90.7 0.37 0.91 0.77 0.40 0.36 0.92 0.36 0.92LDMC 17331 77.9 0.53 0.83 0.87 0.41 0.43 0.88 0.49 0.89SSD 9191 88.3 0.51 0.86 1.01 0.19 0.44 0.87 0.51 0.87Leaf area 39438 49.6 0.50 0.87 0.91 0.41 0.37 0.93 0.41 0.93Leaf N 26882 65.7 0.67 0.77 0.99 0.28 0.53 0.86 0.59 0.86Leaf P 11975 84.7 0.72 0.69 0.78 0.62 0.52 0.83 0.62 0.83Leaf N/area 8180 89.6 0.79 0.65 0.80 0.64 0.51 0.82 0.72 0.82Leaf fresh mass 11484 85.3 0.47 0.89 0.71 0.71 0.27 0.96 0.39 0.96Leaf N/P ratio 5999 92.3 0.76 0.69 0.90 0.44 0.49 0.85 0.67 0.84Leaf C/dry mass 8123 89.6 0.70 0.74 0.884 0.35 0.61 0.61 0.62 0.78Leaf δ 15N 9022 88.5 0.63 0.79 1.02 0.02 0.50 0.87 0.53 0.88Average 15723 79.9 0.58 0.80 0.88 0.41 0.45 0.88 0.53 0.87

SLA, specific leaf area; LDMC, leaf dry matter content; SSD, stem-specific density; Leaf N and Leaf P,leaf nitrogen and phosphorus concentrations per dry mass, respectively; Leaf N/area, leaf nitrogenconcentration per leaf area; Leaf C/dry mass, leaf carbon concentration per dry mass. For definitionsof all traits and data sources as well as corresponding references see the Supporting Information(Appendices S2, S3 and S11 respectively).



Appendices S8 & S1. Finally, we provide an example for out-of-

sample prediction, extrapolating leaf nitrogen concentration

(leaf N) from point measurements to the whole species range of

Acer saccharum using aHPMF.

Probabilistic matrix factorization and subsequent regression

were developed and applied in MATLAB version 2012a

(MATLAB, 2012). All other analyses were performed using the

statistical platform R version 2.15 (R Core Team, 2014). The

maps reported here were produced in ArcMap 10.1 (ArcGIS

Desktop, 2011) and R, using the tree species distribution map of

A. saccharum from the US Geological Survey (Little, 1971). R

scripts to implement BHPMF are available from the authors by

request.

RESULTS

Predicted versus observed trait values

To analyse prediction accuracy we compare RMSE and the coef-

ficient of determination (R2) for MEAN, PMF, BHPMF and

aHPMF averaged across traits and for each trait separately

(Table 1; for scatterplots of observed versus predicted for all

traits see Fig. S9.1 in Appendix S9). On average, across all traits,

BHPMF outperforms PMF, MEAN and aHPMF, with MEAN

being significantly more accurate than PMF. This holds after

statistical evaluation using a paired t-test with P-values smaller

than 10−5 at all levels, and is supported by the evaluation of the

correlation coefficient R2 (Table 1). As the RMSE is calculated

from z-transformed approximate normal distributions of traits,

a RMSE of 0.45 for BHPMF indicates that the average error of

predictions is about half a standard deviation, or about 10% of

the 95% CI. BHPMF outperforms MEAN and PMF in all traits,

while aHPMF shows the same or higher RMSE and higher R2

than BHPMF for SLA, plant height, leaf dry matter content

(LDMC), leaf carbon (C) per dry mass and leaf δ 15N (D15N)

(Table 1). The advantage of BHPMF over MEAN is largest for

‘physiological traits’, such as leaf N and leaf phosphorus concen-

tration (leaf P), and smaller for more ‘structural traits’ such as

seed mass or plant height. The prediction accuracy of BHPMF

varies across traits: from RMSE = 0.36 (R2 = 0.92) for seed mass

to RMSE = 0.61 (R2 = 0.61) for leaf C content per dry mass.

Interestingly, prediction accuracy is not related to the number of

entries per trait (Table 1).

Accounting for taxonomic hierarchy

The RMSE of MEAN and BHPMF decreases with increasing

taxonomic information, indicating that both methods can

utilize the hierarchical structure to their advantage (Table S7.1

in Appendix S7). This is also supported by the scatter plot of

measured versus predicted specific leaf area (SLA) and leaf N

shown in Fig. 4. With increasing taxonomic information, the

scatter plot approaches the 1:1 line, i.e. prediction accuracy

improves.

Trait–trait correlations

Although the presence of strong trait–trait correlations is a pre-

requisite for the accuracy of BHPMF, such correlations are not

provided a priori and are thus not part of the objective function

used (equation 1). This turns them into a suitable evaluation

measure. An important quality criterion is to what extent the

imputed values reflect the observed bivariate correlations, as this

is a first indication of the extent to which the overall correlation

structure of the n-dimensional trait matrix is maintained by

imputation. Our dataset shows on average strong trait–trait cor-

relations, with some exceptions (Fig. S9.5 in Appendix S9).

BHPMF and MEAN capture these general trait–trait correla-

tions, but BHPMF reproduces extreme values more accurately

than MEAN and is therefore generally better at capturing the

shape of the scatter of observed trait data, which is confirmed by

more similar SMA R2 values (Fig. S9.2 in Appendix S9). Looking

at the multivariate preservation of trait–trait correlations using

Procrustes analysis, our results indicate again that BHPMF does

Figure 4 Scatter plots of predicted versus true values for twotraits with increasing taxonomic information. Left column, leafnitrogen concentration per dry weight; right column, specific leafarea. Row 1, no phylogenetic information is used; row 2, only thephylogenetic group is used; row 3, phylogenetic group and familyare used; row 4, phylogenetic group, family and genus are used;row 5, phylogenetic group, family, genus and species are used.Predictions are based on Bayesian hierarchical probabilistic matrixfactorization. The data are presented in z-transformed space.Dotted lines indicate the 1:1 correlation.

F. Schrodt et al.


not significantly alter the correlation structure of the gap-filled

matrix (Fig. 5). The first four principal component axes explain

83.4% and 83.4% of the variability in the dataset for the original

and gap-filled data, respectively. None of the principal compo-

nent axes are significantly different between the gappy and gap-

filled data for any of the traits. The traits stem specific density

and leaf carbon differ – but not significantly – along the third

and fourth axes (see Fig. S9.3 in Appendix S9).

Uncertainty quantified predictions

The Gibbs sampler provides a probability distribution for every

single prediction, as shown in the example of Gibbs sampler-

generated density plots of BHPMF-estimated LDMC, leaf N and

SLA for A. saccharum and Pinus sylvestris trees (Fig. S10.1 in

Appendix S10). This distribution can be exploited to calculate

indices for the best estimate (e.g. mean) and variability (e.g. SD).

This provides an additional means to evaluate our imputation

model by comparing prediction confidence (SD) with predic-

tion accuracy (RMSE): when we are confident about our pre-

dictions (small SD), these predictions should also be accurate

(small RMSE) and vice versa. Figure S10.2 in Appendix S10

shows that this is indeed the case for the whole 13-trait dataset,

implying that our model is appropriate. This remains true when

we evaluate the Gibbs sampler on each trait separately

(Fig. S10.3 in Appendix S10).

Gap-filling of regional/local data using BHPMF

As expected, increasing the number of gaps in the RAINFOR

dataset generally resulted in a decrease of prediction accuracy

(Fig. 6), although less so for structural traits, such as stem-

specific density (SSD) and plant height. Reproducibility was

high in all cases (Fig. 6). Prediction accuracy of BHPMF was

generally approximately equal, no matter whether the regional

(RforR) or global (WforR) datasets were used to fill the gaps

(Figs 6 & S8.3 in Appendix S8). This was particularly the case if

gap sizes were large (above 10%), whilst RforR outperformed

WforR for the imputations of plant height, leaf N, SLA and leaf

carbon only where additional gap sizes were small (0 and

10%).

Out-of-sample prediction (aHPMF)

We illustrate the extension of BHPMF towards out-of-sample

prediction with the example of leaf N across the species range of

A. saccharum (Fig. 7).

Figure 5 Procrustes analysis errors for the first and secondprincipal component axes comparing a princpal componentsanalysis (PCA) performed on the original, gappy RAINFOR datawith a PCA performed on the RAINFOR data with artificiallyintroduced gaps being filled using Bayesian hierarchicalprobabilistic matrix factorization.

Figure 6 Root mean square error (RMSE) of performingBayesian hierarchical probabilistic matrix factorization (BHPMF)on the RAINFOR cutout (red points in Fig. S8.1 in Appendix S8)for the whole dataset (Total), specific leaf area (SLA), plant height(PlantHt), stem-specific density (SSD), leaf nitrogen (LeafN), leafphosphorus (LeafP), leaf nitrogen per area (LeafNArea), leafcarbon (LeafC) with increasing number of gaps added to theoriginal RAINFOR data (inherent gappiness of 11%). For the totalnumber of gaps for each trait and added gaps per dataset, seeTable S8.1 in Appendix S8. Left- and right-hand sections for eachtrait (separated by a dotted line) show results when using only theRAINFOR data (RforR) or using all available data (WforR),respectively, to fill the gaps.



BHPMF was stopped at species level, followed by a multivari-

ate regression of residuals of predicted versus observed traits at

the level of the individual plants against environmental condi-

tions. Trait predictions follow the trends in both measured leaf

N and environmental conditions. However, the variability

within aHPMF-predicted leaf N is low (18.69–18.80 mg g−1)

compared with the range of actually measured values (8.65–

28.23 mg g−1). While the amplitude of predicted traits for the

grid elements is much smaller than the observed ranges, it is

notable that A. saccharum occurs over a wide range of environ-

mental conditions and grid averages may not reflect variation at

local scales. The flat response surface Reich & Oleksyn (2004)

found for the correlation between leaf N in the genus Acer and

mean annual temperature may further support the validity of

our results.

DISCUSSION

We demonstrate that BHPMF provides accurate and robust

uncertainty-quantified trait predictions, even for sparse matri-

ces with up to 80% missing entries. BHPMF outperforms the

species MEAN baseline in all aspects: RMSE and R2 of predicted

versus observed entries are smaller and larger, respectively, for

each individual trait (Table 1) and trait–trait correlations are

better retrieved (Fig. S9.2 in Appendix S9). Prediction accuracy

is high (small RMSE) when prediction uncertainty is small

(small SD; Fig. S10.2 in Appendix S10), providing a measure of

confidence for prediction accuracy. In addition, aHPMF pro-

vides a concept to extrapolate from point measurements to

species ranges accounting for intraspecific variability (Fig. 7).

These results give rise to three major questions: (1) Why does

BHPMF provide accurate and robust trait predictions even for

sparse trait matrices? (2) Is the prediction accuracy of BHPMF

sufficient for applications in ecological contexts? (3) What are

the prospects for BHPMF?

Why are BHPMF predictions robust and accurate?

BHPMF is a Bayesian hierarchical approach that simultaneously

takes the taxonomic trait signal, the correlation structure within

the trait matrix and environmental constraints into account to

fill gaps in trait matrices. A comparison of BHPMF-predicted

trait values with results based on PMF, MEAN and aHPMF

indicates the relevance of all three aspects.

PMF has been shown to be accurate even for the imputation

of sparse datasets (Koren et al., 2009). However, in the case of

our test dataset, PMF performs worse than the baseline MEAN

approach. This indicates that our test dataset is not of suffi-

ciently low rank to allow for robust trait predictions based on

the correlation structure in the matrix alone.

BHPMF converts PMF into a hierarchical Bayesian model.

This has two effects: (1) the higher levels of taxonomy (e.g.

phylogenetic group, family) provide almost complete informa-

tion (very low sparsity), which enables efficient PMF; (2)

approximations of the matrices at higher taxonomic level

provide excellent prior information for the approximations at

lower levels (e.g. genus, species), thus constraining imputations

due to the taxonomic signal in trait variation at all levels.

This is achieved without a priori assuming a phylogenetic

signal in the trait variability, but rather by opening the door for

our model to extract a signal, if it should be there. Thus, in some

cases, BHPMF might not put any constraint on the imputed

Figure 7 Advanced hierarchical probabilistic matrix factorization (aHPMF)-predicted leaf nitrogen concentration (mg g−1) of Acersaccharum (a), measured values for leaf nitrogen (mg g−1) (b), MAT (mean annual temperature) (c), and MAP (mean annual precipitation)(d) across the species range of A. saccharum. For a map of the geographic location see Figure S5.1 in Appendix S5.

F. Schrodt et al.


trait values from the taxonomy side, whereas in others, this

signal might be stronger, hence the constraint from the tax-

onomy. The hierarchical taxonomic structure in combination

with a consistent phylogenetic trait signal is therefore the key to

facilitate robust gap-filling by BHPMF – despite high sparsity –

at the level of the individual observations. On the other hand

BHPMF outperforms the MEAN approach in all aspects. This

indicates that the capture by PMF of the correlation structure of

the matrix on top of the phylogenetic trait signal is the key to

providing accurate predictions.

A comparison of BHPMF with aHPMF indicates that explic-

itly taking environmental constraints into account surprisingly

adds little or no improvement to BHPMF: the average R2 was

only higher in 5 out of 13 traits with the RMSE being consist-

ently smaller for BHPMF compared with aHPMF (Table 1). At

the individual level, PMF was replaced by multiple regressions

against environmental constraints, based on the assumption

that the phylogenetic trait signal is mainly observed at the

species level whilst environmental constraints add to the trait

variability at the individual level.

The fact that BHPMF largely outperforms aHPMF is an indi-

cation that, by taking into account trait–trait correlations, PMF

seems to be more appropriate than the multiple regression at the

individual level. Also, BHPMF implicitly takes environmental

information into account via the correlation of measured to

predicted traits. This environmental constraint is related to the

immediate environment experienced by the observed plant and

propagated via trait–trait correlations to the predicted traits,

while environmental information incorporated in the multiple

regression model by necessity represents only the average over

larger scales. Taking environmental information explicitly into

account may therefore not substantially improve BHPMF-based

gap-filling. Rather, it may be a tool to extrapolate traits from

point measurements to the regional scale, i.e. perform out-of-

sample predictions.

Is the prediction accuracy sufficient for applicationsin ecological contexts?

Our results indicate that BHPMF outperforms the species

MEAN baseline in all aspects, but is the prediction accuracy

indeed appropriate for use in ecological contexts, and to what

extent are the results a special case of our test dataset?

The test dataset is an extract of 78,300 individual observations

and the 13 best covered traits from the TRY database, still with

80% of trait entries missing. Our dataset is not typical for

common datasets of traits, which generally originate from spe-

cific measurement campaigns with only 5–30% missing entries.

In these cases, gap-filling may not be such a challenge and

common approaches may be sufficient. However, given the

experience that BHPMF substantially outperforms both PMF

and MEAN, we also expected improved trait predictions in such

cases. Indeed, our sensitivity test using ‘cut outs’ from Europe

and data across the Amazon showed that BHPMF was able to

impute gaps even in smaller and extremely sparse databases.

When taking advantage of using the global database to fill

gaps in a smaller ‘cutout’, possible confounding factors intro-

duced by global trait variability influencing the sometimes more

constrained trait space of a local dataset should be considered.

For example in the case of the RAINFOR cut-out, plant height

was better predicted using just the local dataset. One likely

explanation is that a large amount of new information was

included from the global database, amounting to more than

60% more data with a high standard deviation within species.

For SSD, on the other hand, only 30% more data were contrib-

uted by the global dataset, which were also less plastic within

species compared with plant height. Thus, a global dataset may

easily introduce ‘false’ information in the case of plastic traits

that are highly influenced by local environmental conditions but

provide a lot of valuable additional information in the case of

traits that are mainly determined by phylogeny. We recommend,

depending on the number and sparsity of the data, using both

approaches wherever possible – local and global gap-filling –

and select the best-fitting approach depending on the target trait

and aim.

BHPMF trait prediction uncertainties are well correlated with

their error, turning uncertainty into a good surrogate for pre-

diction error (Fig. S10.2 in Appendix S10). This offers the

opportunity to select trait predictions with high probability of

low errors or weight results in further analyses according to their

uncertainty.

Perspectives

Statistical approaches to gap-filling and improvement of eco-

logical understanding of species occurrence and dynamics have

been criticized for being too complex and including parameters

and assumptions without appropriate prior validation (Lavine,

2010). In contrast, BHPMF is a simple and generic model, which

has the advantage of incorporating factors known to influence

trait expression, such as phylogenetic trait signals, trait–trait

correlations and trade-offs without implicitly requiring prior

assumptions. This simplicity opens the opportunity to add com-

plexity to improve trait predictions, for example by replacing the

taxonomic hierarchy by a hierarchy of phylogenetic distances.

BHPMF has been explicitly developed for gap-filling (matrix

completion), not for the prediction of trait values in individuals

where no trait has ever been measured before, i.e. out-of-sample

predictions. Advancing BHPMF to account for phylogenetic dis-

tances instead of taxonomy will improve opportunities for effi-

cient out-of-sample predictions, at least in well-resolved clades.

Uncertainty quantified predictions support the validity of the

BHPMF approach, where predictions with low uncertainty

(small SD) also show high accuracy (small RMSE), and vice

versa. Using the Gibbs sampler to get an indication as to where

uncertainties are largest merits special attention. Data coverage

for traits such as SLA, which are relatively fast and easy to collect,

is much better than for other traits. In addition, some traits are

highly variable across ecosystems and plant populations. Using

the Gibbs sampler, one can statistically define where more sam-

pling effort is probably going to significantly improve our



understanding of variation in any specific trait and where the

currently available data are sufficient.

The combination of BHPMF with environmental covariates

may not necessarily improve gap-filling of trait matrices, but

seems to be a promising concept for the extrapolation of traits

from point measurements to regional scales. The concept of

out-of-sample prediction presented here illustrates a fundamen-

tal problem in trait ecology, separating the phylogenetic signal

and environmental impact on intraspecific trait variation. The

regression of trait values against the environment after gap-

filling by BHPMF provides an advantage over using non-gap-

filled data of an improved number of data points, which might

be essential as trait predictions are often limited by data avail-

ability (Verheijen et al., 2013). BHPMF-filled trait matrices

could thus become an invaluable tool for the parameterization

and validation of global vegetation models.

The TRY database is a typical example of a combination of

datasets which have been collected for different purposes. Other

disciplines besides plant ecology which have also started to

combine their (trait) data are faced with the same problem of

sparsity and restrictions in the context of multivariate analyses.

BHPMF provides an opportunity to extract the entire informa-

tion content from such databases combining data from various

aspects measured at locations all over the world.

Conclusion

BHPMF is a hierarchical Bayesian implementation of probabil-

istic matrix factorization, which for the first time simultaneously

utilizes the taxonomic trait signal, the correlation structure

within trait matrices and – implicitly through trait–

environment relationships – environmental constraints for gap-

filling of trait matrices. We demonstrate using the example of

different plant trait datasets that BHPMF provides robust and

accurate predictions even for sparse matrices. In addition, Gibbs

sampler-calculated uncertainties indicate how accurate each

imputed trait value is, and thus how to treat it in further analy-

ses. The combination of BHPMF with environmental informa-

tion provides the opportunity to extrapolate from point

measurements to continuous trait surfaces across large spatial

scales whilst accounting for intraspecific trait variability. We

therefore conclude that BHPMF-based gap-filling and trait pre-

diction has a high potential to support future trait-based

research in macroecology and functional biogeography.

ACKNOWLEDGEMENTS

F.S. was supported by the University of Minnesota, Institute on

the Environment via the grant to P.R.: ‘Transformational steps in

synthesis science’, the Max Planck Institute for Biogeochemistry

and iDiv, the German Centre for Integrative Biodiversity

Research. H.S. was supported by NSF grants IIS-0812183, IIS-

0916750, IIS-1029711, IIS-1017647, and NSF CAREER award

IIS-0953274. This study has been performed with support by the

TRY initiative on plant traits (https://www.try-db.org). TRY is

hosted and developed at the Max Planck Institute for

Biogeochemistry, with support from DIVERSITAS and iDiv. We

thank Ulrich Weber for help with the preparation of soil data

and Maarten Braakhekke, Nate Swenson and two anonymous

referees for helpful comments and suggestions.

REFERENCES

Albert, C.H., Thuiller, W., Yoccoz, N.G., Soudant, A., Boucher,

F., Saccone, P. & Lavorel, S. (2010) Intraspecific functional

variability: extent, structure and sources of variation. Journal

of Ecology, 98, 604–613.

ArcGIS Desktop, E. (2011) Release 10.1. Environmental Systems

Research Institute, Redlands, CA.

van Buuren, S. & Groothuis-Oudshoorn, K. (2011) mice: multi-

ple imputation by chained equations in r. Journal of Statistical

Software, 45, 1–67.

Chave, J., Coomes, D., Jansen, S., Lewis, S.L., Swenson, N.G. &

Zanne, A.E. (2009) Towards a worldwide wood economics

spectrum. Ecology Letters, 12, 351–366.

Clark, J.S. (2010) Individuals and the variation needed for

high species diversity in forest trees. Science, 327, 1129–

1132.

Cordlandwehr, V., Meredith, R., Ozinga, W., Bekker, R., van

Groenendael, J. & Bakker, J. (2013) Do plant traits retrieved

from a database accurately predict on-site measurements?

Journal of Ecology, 101, 662–670.

Davis, A., Lawton, J., Shorrocks, B. & Jenkinson, L. (1998) Indi-

vidualistic species responses invalidate simple physiological

models of community dynamics under global environmental

change. Journal of Animal Ecology, 67, 600–612.

Diniz-Filho, J.A.F., Bini, L.M., Rodríguez, M.A., Rangel,

T.F.L.V.B. & Hawkins, B.A. (2007) Seeing the forest for the

trees: partitioning ecological and phylogenetic components of

Bergmann’s rule in European Carnivora. Ecography, 30, 598–

608.

Duforet-Frebourg, N. & Blum, M.G.B. (2014) Bayesian matrix

factorization for outlier detection: an application in popula-

tion genetics. The Contribution of Young Researchers to

Bayesian Statistics. Springer Proceedings in Mathematics &

Statistics, 63, 143–147.

FAO, IIASA, ISRIC, ISSCAS, JRC (2012) Harmonized World Soil

Database (version 1.2). FAO, Rome, Italy and IIASA,

Laxenburg, Austria.

Fazayeli, F., Banerjee, A., Kattge, J., Schrodt, F. & Reich, P. (2014)

Uncertainty quantified matrix completion using Bayesian

hierarchical matrix factorization. Proceedings of the 13th

International Conference on Machine Learning and

Applications.

Fried, G., Kazakou, E. & Gaba, S. (2012) Trajectories of weed

communities explained by traits associated with species

response to management practices. Agriculture, Ecosystems

and Environment, 158, 147–155.

Fyllas, N.M., Patiño, S., Baker, T.R. et al. (2009) Basin-wide vari-

ations in foliar properties of Amazonian forest: phylogeny,

soils and climate. Biogeosciences, 6, 2677–2708.

F. Schrodt et al.


Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A.

(2005) Very high resolution interpolated climate surfaces for

global land areas. International Journal of Climatology, 25,

1965–1978.

Johnson, D.R. & Young, R. (2011) Toward best practices in

analyzing datasets with missing data: comparisons and rec-

ommendations. Journal of Marriage and Family, 73, 926–

945.

Kattge, J., Díaz, S., Lavorel, S., et al. (2011) TRY – a global

database of plant traits. Global Change Biology, 17, 2905–

2935.

Kazakou, E., Violle, C., Roumet, C., Navas, M.L., Vile, D., Kattge,

J. & Garnier, E. (2014) Are trait-based species rankings con-

sistent across datasets and spatial scales? Journal of Vegetation

Science, 25, 235–237.

Kerkhoff, A., Fagan, W., Elser, J. & Enquist, B. (2006)

Phylogenetic and growth form variation in the scaling of

nitrogen and phosphorus in the seed plants. The American

Naturalist, 168, E103–E122.

Koren, Y., Bell, R. & Volinsky, C. (2009) Matrix factorization

techniques for recommender systems. IEEE Computer, 42(8),

30–37.

Lamsal, S., Rizzo, D.M. & Meentemeyer, R.K. (2012) Spatial

variation and prediction of forest biomass in a heterogeneous

landscape. Journal of Forestry Research, 23, 13–22.

Lasslop, G., Reichstein, M., Papale, D., Richardson, A., Arneth,

A., Barr, A., Stoy, P. & Wohlfahrt, G. (2010) Separation of net

ecosystem exchange into assimilation and respiration using a

light response curve approach: critical issues and global evalu-

ation. Global Change Biology, 16, 187–208.

Lavine, M. (2010) Living dangerously with big fancy models.

Ecology, 91, 3487.

Little, E.L.J. (1971) Atlas of United States trees, volume 1, conifers

and important hardwoods. Miscellaneous Publication 1146.

US Department of Agriculture, Forest Service, Washington,

DC.

Lovette, I.J. & Hochachka, W.M. (2006) Simultaneous effects of

phylogenetic niche conservatism and competition on avian

community structure. Ecology, 87, 14–28.

McMahon, R.F. (2002) Evolutionary and physiological adapta-

tions of aquatic invasive animals: r selection versus resistance.

Canadian Journal of Fisheries and Aquatic Sciences, 59, 1235–

1244.

MATLAB (2012) Computer software. The MathWorks Inc.,

Natick, MA.

Moffat, A., Papale, D., Reichstein, M., Hollinger, D., Richardson,

A., Barr, A., Beckstein, C., Braswell, B., Churkina, G., Desai, A.,

Falge, E., Gove, J., Heimann, M., Hui, D., Jarvis, A., Kattge, J.,

Noormets, A. & Stauch, V. (2007) Comprehensive comparison

of gap-filling techniques for eddy covariance net carbon

fluxes. Agricultural and Forest Meteorology, 147, 209–232.

Nakagawa, S. & Freckleton, R.P. (2008) Missing inaction: the

dangers of ignoring missing data. Trends in Ecology and Evo-

lution, 23, 592–596.

Netflix (2009) Netflix prize. Available at: http://www

.netflixprize.com.

Ogle, K. (2013) Feedback and modularization in a Bayesian

metaanalysis of tree traits affecting forest dynamics. Bayesian

Analysis, 8, 133–168.

Ordonez, A. (2014) Functional and phylogenetic similarity

of alien plants to co-occurring natives. Ecology, 95, 1191–

1202.

R Core Team (2014) R: a language and environment for statis-

tical computing. R Foundation for Statistical Computing.

Vienna, Austria. Available at: http://www.R-project.org/.

Räty, M. & Kangas, A. (2012) Comparison of k-msn and kriging

in local prediction. Forest Ecology and Management, 263,

47–56.

Reich, P.B. & Oleksyn, J. (2004) Global patterns of plant leaf N

and P in relation to temperature and latitude. Proceedings of

the National Academy of Sciences USA, 101, 11001–11006.

Reich, P.B., Walters, M.B. & Ellsworth, D.S. (1997) From tropics

to tundra: global convergence in plant functioning. Proceed-

ings of the National Academy of Sciences USA, 94, 13730–

13734.

Rubin, D.B. (1987) Multiple imputation for nonresponse in

surveys. Wiley and Sons, New York.

Salakhutdinov, S. & Mnih, A. (2008) Probabilistic matrix fac-

torization. Advances in Neural Information Processing Systems

20 (NIPS 07). Available at: http://www.cs.toronto.edu/

∼rsalakhu/papers/nips07_pmf.pdf

Schafer, J.L. & Graham, J.W. (2002) Missing data: our view of

state of the art. Psychological Methods, 7, 147–177.

Schrodt, F., Domingues, T.F., Feldpausch, T.R. et al. (2015) Foliar

trait contrasts between African forest and savanna trees:

genetic versus environmental effects. Functional Plant Biology,

42, 63–83.

Shan, H., Kattge, J., Reich, P.B., Banerjee, A., Schrodt, F. &

Reichstein, M. (2012) Gap filling in the plant kingdom – trait

prediction using hierarchical probabilistic matrix factoriza-

tion. Proceedings of the 29th International Conference on

Machine Learning (ed. by J. Langford and J. Pineau), pp.

1303–1310. Omnipress, Madison, WI.

Su, Y.S., Gelman, A., Hill, J. & Yajima, M. (2011) Multiple impu-

tation with diagnostics (mi) in r: opening windows into the

black box. Journal of Statistical Software, 45, 55–64.

Swenson, N. & Enquist, B. (2007) Ecological and evolutionary

determinants of a key plant functional trait: wood density and

its community-wide variation across latitude and elevation.

American Journal of Botany, 94, 451–459.

Swenson, N.G. (2014) Phylogenetic imputation of plant func-

tional trait databases. Ecography, 37, 105–110.

Taskinen, S. & Warton, D.I. (2011) Robust estimation and infer-

ence for bivariate line-fitting in allometry. Biometrical Journal,

53, 652–672.

Taugourdeau, S., Villerd, J., Plantureux, S., Huguenin-Elie, O. &

Amiaud, B. (2014) Filling the gap in functional trait databases:

use of ecological hypotheses to replace missing data. Ecology

and Evolution, 4, 944–958.

Verheijen, L.M., Brovkin, V., Aerts, R., Bönisch, G., Cornelissen,

J.H.C., Kattge, J., Reich, P.B., Wright, I.J. & van Bodegom, P.M.

(2013) Impacts of trait variation through observed



trait–climate relationships on performance of an earth system

model: a conceptual analysis. Biogeosciences, 10, 5497–5515.

Wright, I.J., Reich, P.B., Westoby, M. et al. (2004) The worldwide

leaf economics spectrum. Nature, 428, 821–827.

Additional references are in the supplementary file at: (weblink)

SUPPORTING INFORMATION

Additional supporting information may be found in the online

version of this article at the publisher’s web-site.

Appendix S1 Supplementary methods.

Appendix S2 Definition of traits used in this study.

Appendix S3 References for contributing databases and number

of traits contributed.

Appendix S4 Map of TRY measurement sites.

Appendix S5 Location of Acer saccharum range map and soil

and climate across the range of Acer saccharum

Appendix S6 Correlation between traits and environmental

variables used in aHPMF.

Appendix S7 Root mean squared error comparison between

MEAN, BHPMF and aHPMF across the taxonomic hierarchy.

Appendix S8 Sensitivity analysis.

Appendix S9 Bi- and multivariate relationships between traits,

measured and imputed trait values.

Appendix S10 Gibbs sampler results.

Appendix S11 Additional references of data contributors.

Appendix S12 Author contributions.

BIOSKETCH

Franziska Schrodt is a post-doctoral researcher at the

Max Planck Institute for Biogeochemistry in Jena and

the German Centre for Integrative Biodiversity Research

iDIV Leipzig, Jena, Halle. Her work focuses on the

application of machine learning and nonlinear

statistical tools to the study of biogeochemical patterns.

She is especially interested in plant functional

trait/biodiversity–environment correlations and the

associated implications for ecosystem structure and

functioning.

Editor: José Alexandre Diniz-Filho

F. Schrodt et al.


S1. Methods

S1.1. MEAN

A “hierarchical mean” strategy is used. For example, to predict trait m of plant n, if there are plants in the samespecies as plant n, we use species mean for prediction; otherwise, if there are plants in the same genus as plant n, weuse the genus mean, and so on. In general, among species mean, genus mean, family mean, and phylogenetic groupmean, we use the first available one at the lowest level.

S1.2. BHPMF

BHPMF is a hierarchical Bayesian implementation of PMF. The latent vectors are implemented as Gaussian normaldistributions with prior mean of 0 and a variance of σ 2, which results in two adjustable parameters per latent vector(mean and variance). The length of the latent vectors can be defined, but needs to be constrained to avoid over-fitting(Salakhutdinov and Mnih, 2008). We run a Gibbs sampler for optimization (see S1.3). In principle, this allows us toupdate {U(`)} and {V (`)} - which are the matrices formed by stacking the latent vectors u and v at the taxonomic level of` - in an arbitrary order. Empirically, we do it level by level iteratively following a top-down and bottom-up order. Ineach iteration, we first do a top-down pass to update

({U(1)}, {V (1)}

)to

({U(L)}, {V (L)}

), followed by a bottom-up pass to

update({U(L)}, {V (L)}

)to

({U(1)}, {V (1)}

), and repeat the process for several iterations. The intuition is that after updating(

{U(`)}, {V (`)}), we want to immediately use it for regularization (to prevent unnecessary complexity and over-fitting) in

the next level update. Empirically, we observed that such a strategy converges faster than only doing top-down updatesrepeatedly. The posterior over {U(`)} and {V (`)} is

p({U(`)}, {V (`)}|{X(`)}, σ2,U(0),V (0)

)∝

L∏`=1

{∏n

N(u(`)n |u

(`−1)p(n) , σ

2uI)

∏m

N(v(`)m |v

(`−1)m , σ2

v I)

∏n,m

δ(`)nmN

(x(`)

nm|〈u(`)n , v(`)

m 〉, σ2) }

,

(S1)

where {·} denotes the set of data at all L levels (L = 5 for the TRY data), and δ(`)nm = 1 when the entry (n,m) of X(`)

is non-missing and 0 otherwise. MAP (Maximum a posteriori) inference which is similar to Monte Carlo inference on{U(`)} and {V (`)} can be done by maximizing the logarithm of the posterior in eqn S1, which boils down to minimizingthe regularized squared loss as

E =

L∑`=1

{∑nm

δ(`)nm ‖ x(`)

nm − 〈u(`)n , v(`)

m 〉 ‖22

+ λu

∑n

‖u(`)n − u(`−1)

p(n) ‖22 +λv

∑m

‖v(`)m − v(`−1)

m ‖22

},

(S2)

where λu = σ2/σ2u and λv = σ2/σ2

v and ` − 1 is replaced by ` + 1 for the bottom-up iteration. The objective functioncontaining U(`) and V (`) is given by

E(`) =∑n,m

δ(`)nm ‖ x(`)

nm − 〈u(`)n , v(`)

m 〉 ‖22

+ λu

∑n

‖ u(`)n −u(`−1)

p(n) ‖22 +1(`<L)

∑n′∈c(n)

‖u(`)n −u(`+1)

n′ ‖22

+ λv

∑m

(‖ v(`)

m − v(`−1)m ‖22 +1(`<L) ‖ v(`)

m − v(`+1)m ‖22

),

(S3)

1

where c(n) is the set of child nodes of n, e.g, if n is a species, c(n) denotes plants of that species, and 1(`<L) is anindicator function taking value 1 when ` < L and 0 otherwise. The regularization terms ‖u(`)

n −u(`−1)p(n) ‖

22 and ‖v(`)

m −v(`−1)m ‖22

keep u(`)n and v(`)

m close to the corresponding latent factor at level ` − 1, and the regularization terms∑

c(n)‖u(`)n − u(`+1)

c(n) ‖22

and ‖v(`)m − v(`+1)

m ‖22 keep u(`)n and v(`)

m close to the corresponding latent factor at level ` + 1 (if applicable).At least one trait is needed for each plant in order to run matrix factorization methods. Therefore, we split the

training, test and validation sets as follows: For each plant, if it has at least three traits available, we randomly holdout one trait for test, one trait for validation, and use the rest for training; if it has two traits available, we randomlyhold out one trait for training and one for test; if it only has one trait available, we use it for training. Following such astrategy, each plant has at least one trait in the training set. The test set is used for test and the validation set is usedduring the training process for early stopping, i.e., when there have been more than 5 iterations and the performance onthe validation set decreases, we stop training. We repeat the holding-out process 5 times to get five randomly splitdatasets, then constructing the upper-level matrices for training and validation, but the test set only operates at theplant×trait level.

S1.3. Gibbs samplerMany imputation methods commonly used in ecology do not provide means to assess the uncertainty of every

single predicted value. Ideally, one would like to quantify both, the expected value and the range of variation (in caseof normal distribution mean and standard deviation (SD)) of the imputations. Using Gibbs sampling we infer theprobability distribution for each prediction (Casella and George, 1992). The Gibbs sampler is a Markov chain MonteCarlo algorithm which, based on the Metropolis algorithm (Metropolis et al., 1953), samples from the conditionaldistribution of one variable given all the others. In BHPMF, each variable (element in each latent factor) is conditionallyindependent of most other variables, thereby leading to an efficient sampler. In essence, the procedure is as follows: fora given matrix X, the sampler updates the latent factor matrices (U(`),V (`)) at each level `, keeping the factors at allother levels fixed. Each sample at the lowest level is obtained by sampling the upper level matrices iteratively followinga top-down and bottom-up order (Algorithm S1).

Each row of U(`) (u(`)n ) is independent of U(`)

−n, U(−`)(−p(n),−c(n)), V (−`), X(−`), and X(`)

−n given its Markov blanket (x(`)n , V (`),

u(`−1)p(n) , u(`+1)

c(n) ), where p(n) is the parent node of n in the upper level and c(n) is the set of child nodes n in the lower level.Therefore, the conditional probability of U(`) can be factorized into the product of conditional probability of its rows

p(U(`)|X(`),V (`),U(`−1)

p ,U(`+1)c

)(S4)

=∏

n

p(u(`)

n |x(`)n ,V (`),u(`−1)

p(n) ,u(`+1)c(n)

).

By applying Bayes rule and given that the product of multiple Gaussian distributions is another Gaussian distribution,it can be shown that the conditional probability of un is a Gaussian distribution

p(u(`)

n |x(`)n ,V (`),u(`−1)

p(n) ,u(`+1)c(n)

)= N(u(`)

n |µ∗(`)n ,Σ∗(`)n )

∼∏

m

[δ(`)

nmN(x(`)nm|〈u

(`)n , v(`)

m 〉, σ2)]N(u(`)

n |u(`−1)p(n) , σ

2uI)∏

n′∈c(n)

[N(u(`+1)

n′ |u(`)n , σ2

uI)] (S5)

Σ∗(`)n =

|c(n)` + 1|σ2

uI +

1σ2

∑m

δ(`)nmv(`)

m v(`)Tm

−1

µ∗(`)n = Σ∗(`)n

1σ2

uI

u(`−1)p(n) +

∑n′∈c(n)

u(`+1)n′

+

1σ2

∑m

δ(`)nmx(`)

nmv(`)m

.(S6)

2

where |.| denotes set cardinality.With the similar argument, the conditional probability of V (`) can be factorized into the product of conditional

probability of its rows, where x:m is column m of x.

p(V (`)|X(`),U(`),V (`−1),V (`+1)

)(S7)

=∏

m

p(v(`)

m |x(`):m ,U

(`), v(`−1)m , v(`+1)

m

).

p(v(`)

m |x(`):m ,U

(`), v(`−1)m , v(`+1)

m

)= N(v(`)

m |µ∗(`)m ,Σ∗(`)m )

∼∏

n

[δ(`)

nmN(x(`)nm|〈u

(`)n , v(`)

m 〉, σ2)]

(S8)

N(v(`)m |v

(`−1)m , σ2

v I) N(v(`+1)m |v(`)

m , σ2v I)

where

Σ∗(`)m =

2σ2

vI +

1σ2

∑n

δ(`)nmu(`)

n u(`)Tn

−1

(S9)

and

µ∗(`)m = Σ∗(`)m

[1σ2

vI(v(`−1)

m + v(`+1)m

)+

1σ2

∑n

δ(`)nmx(`)

nmu(`)n

. (S10)

For a given matrix X, the sampler updates the latent factor matrices (U(`),V (`)) at every level `. Each sample at thelowest level is obtained by sampling the upper level matrices iteratively following a top-down and bottom-up order. Ateach iteration, we first do a bottom-up pass to sample (U(L),V (L)) to (U(1),V (1)), followed by a top-down pass to sample(U(1),V (1)) to (U(L),V (L)), and repeat the procedure to generate enough samples (Algorithm S1).

Algorithm S1 Gibbs Sampling for BHPMF1: for ` = 1, · · · , L do2: Initialize model parameters {U1(`),V1(`)}

3: for t = 1, · · · ,T do4: for ` = L, · · · , 1 do . bottom-up5: for each n = 1 · · ·N sample un in parallel (eqn S5):6: ut+1(`)

n ∼ p(ut(`)n |x

(`)n ,V t(`),ut(`−1)

p(n) ,ut(`+1)c(n) )

7: for each m = 1 · · ·M sample vm in parallel (eqn S8):8: vt+1(`)

m ∼ p(vt(`)m |x

(`)m ,U t+1(`), vt(`−1)

m , vt(`+1)m )

9: for ` = 1, · · · , L do . top-down10: for each n = 1 · · ·N sample un in parallel (eqn S5):11: ut+2(`)

n ∼ p(ut+1(`)n |x(`)

n ,V t+1(`),ut+1(`−1)p(n) ,ut+1(`+1)

c(n) )12: for each m = 1 · · ·M sample vm in parallel (eqn S8):13: vt+2(`)

m ∼ p(vt+1(`)m |x(`)

m ,U t+2(`), vt+1(`−1)m , vt+1(`+1)

m )

Where t denotes the trait, c(n) the child node, p(n) the parent node, l the hierarchical level, u and n the row (entityor, in our case, plant) side and v and m the column (trait) side. The used burn-in period was 100 samples with a lag of 2and a final number of 450 samples.

3

S1.4. aHPMF

HPMF was developed to fill gaps in plant trait matrices. To facilitate trait predictions at regional scale, we stopHPMF at species level, followed by a least squares regression of the residuals against environmental features. StoppingHPMF at species level separates phylogenetic conservatism from environmental drivers. Explicitly taking into accountenvironmental conditions as co-determinants of trait variation enables out-of-sample prediction.

Let Q` denote the number of distinct categories available at level ` of the taxonomy, e.g., Q1 is the number ofspecies available at ` = 1, and so on. We learn distinct regression model parameters w`

q for each category q at everylevel ` of the taxonomy by partitioning the observations into their respective categories, and also accounting for ahierarchical regularization among the parameters based on the taxonomic hierarchy.

Let X ∈ RQ×7 be the design matrix of climate and soil features (7 covariates, including a column of ones to handlea constant intercept), where the total number of observations, including all levels of the taxonomic hierarchy, is Q. LetY ∈ RQ×1 be the target residuals of BHPMF to be predicted for a given trait k. Let (X(`)

q ,Y (`)q ) be the subset of data that

belong to the qth category at level `. With w(`)q denoting the regression vector for category q at level `, we consider the

following objective function

E(w) =

L∑`=1

Q∑q=1

γ`∥∥∥Y (`)

q − X(`)q w(`)

q

∥∥∥2

+ λ

L∑`=1

Q∑q=1

‖w`q − w(`−1)

p(q) ‖2

(S11)

where w is the weight vectors over all categories, γ` is a weight term for minimizing the squared errors at level `, λis the trade-off parameter of regularization, and p(q) is the parent of category q. Using vector notations, the objectivefunction can be written as:

E(w) = (Y − Xw)Tγ(Y − Xw) + λwTLw , (S12)

where γ is a vector of γ` for all `, Y is a concatenated vector of Y (`)q , X is a block diagonal concatenation of X(`)

q ,and L is the graph Laplacian of the taxonomic hierarchy. We utilize gradient descent based methods for solving theoptimization problem (eqn S12), in a spirit similar to BHPMF. Finally, we use the learned parameters at the specieslevel for predicting the plant trait residuals of BHPMF over unobserved data instances during the testing phase.

S1.5. RMSE

RMSE is used for evaluation. Assuming there are in total T entries for test, at is the true value and at is the predictedvalue, RMSE is defined as:

RMS E =

√∑t

(at − at)2/T . (S13)

S1.6. Sensitivity analysis

RMSEs of the sensitivity analysis testing the effect of using a global versus local (RAINFOR)/regional (Europe)dataset to fill gaps, as well as the effect of different degrees of sparsity were calculated based on the results of threeBHPMF runs obtained for each subset (global data to fill local gaps, local data to fill local gaps (10%, 30%, 60% and80% extra gaps), as well as global data to fill regional gaps and regional data to fill regional gaps). In order to enablecomparison of the results, all data were z-transformed on the basis of the whole dataset. To ensure consistency in matrixlength, we constrained the random introduction of gaps by retaining one data point in every row (plant), also usingthe same gaps for the respective a) or b) runs. Gaps were added as percentage of the available trait data for each traitwhich resulted in different absolute degrees of missingness. It should also be noted that the total RMSE was calculatedindependently for the whole dataset and is not the average across all RMSEs per trait. Differences in RMSE withineach treatment are due to pre-processing (i.e. differences in the random setting of gaps) in the three different runs ratherthan BHPMF per se.

4

References

Casella, G., George, E.I., 1992. Explaining the gibbs sampler. The American Statistician 46, 167–174.Metropolis, A.W., Rosenbluth, M.N., Rosenbluth, A.H., Teller, H., Teller, E., 1953. Equation of state calculations by fast computing machines.

Journal of Chemical Physics 21, 10871092.Salakhutdinov, S., Mnih, A., 2008. Probabilistic matrix factorization. IEEE CS Press: Advances in Neural Information Processing Systems 20 (NIPS

07) .

5

S2. Definition of traits used in this study

Table S2.1: numbered code, units of measurement, plant functional traits, number of non-missing geo-referenced entries (Nr. geo-ref.) and definitionof the respective trait.

Code Unit Trait Nr. geo-ref. Definition1 mm2 mg−1 Specific leaf area 33001 One sided area of a fresh leaf divided by its oven-dry mass2 m Plant height 16465 Shortest distance between the upper boundary of main photo-

synthetic tissue or reproduction unit on a plant and the groundlevel

3 mg Seed dry mass 7311 Dry mass of a whole single seed4 g g−1 Leaf dry matter content 17331 Leaf dry mass per unit of leaf fresh mass (hydrated)5 mg mm−3 Stem specific density 9191 Oven-dry mass of a section of a plant’s main stem divided by its

fresh volume6 mm2 Leaf area 39438 One-sided projected surface area of a single leaf or leaf lamina7 mg g−1 Leaf nitrogen per weight 26882 Total amount of nitrogen per unit of leaf dry mass8 mg g−1 Leaf phosphorus per weight 11975 Total amount of phosphorus per unit of leaf dry mass9 g m−2 Leaf nitrogen per area 8180 Total amount of nitrogen per unit of leaf area (one-sided)10 mg Leaf fresh mass 11484 Fresh mass of a whole leaf11 g g−1 Leaf nitrogen/phosphorus ratio 5999 Ratio of leaf total nitrogen versus total phosphorus12 mg g−1 Leaf carbon per dry mass 8125 Total carbon per unit of leaf dry mass13 ‰ Leaf δ 15N 9022 foliar 15N:14N ratios relative to 15N:14N ratios in atmospheric N2

S3. References of contributing databases and number of traits contributed

Table S3.1: References of contributing databases in order of decreasing contribution as measured in the number of traits, where Ref.1 contributed23364 traits and Ref.56 37 traits. Where the database has not been published, yet, only the name of the database is given. For a list of contributedtraits see table S3.2, for the list of references, see S11

Ref. Database Reference1 The LEDA Tb (Kleyer et al., 2008)2 Global 15N Db (Craine et al., 2009, 2005)3 Panama Plant Traits Db (Wright et al., 2010)4 Global Leaf N, P Db (Reich et al., 2009)5 Catalonian Mediterranean Forest Trait Db (Ogaya and Penuelas, 2003)6 The VISTA Plant Trait Db (Garnier et al., 2007)7 Herbaceous Leaf Traits Db Old Field New York8 Panama Leaf Traits Db (Messier et al., 2010)9 Midwestern and Southern US Herbaceous Db10 The RAINFOR Plant Trait Db (Fyllas et al., 2009)11 Global Seed Mass, Plant Height Db (Moles et al., 2004, 2005)12 GLOPNET - Global Plant Trait Network Db (Wright et al., 2004, 2006)13 Sheffield Db (Cornelissen et al., 2001, 2003)14 VegClass CBM Global Db (Gillison and Carpenter, 1997)15 Floridian Leaf Traits Db (Cavender-Bares et al., 2006)16 Neotropic Plant Traits Db (Wright et al., 2007)17 Leaf and Whole Plant Traits Db (Shipley, 1989, 1995, 2002)18 Leaf Physiology Db (Kattge et al., 2009)19 Overton/Wright New Zealand Db20 New South Wales Db (Fonseca et al., 2000)21 Traits of Bornean Trees Db (Kurokawa and Nakashizuka, 2008)22 Chinese Leaf Traits Db (Han et al., 2005)23 The Netherlands Plant Traits Db (Ordonez et al., 2010)24 Leaf Biomechanics Db (Onoda et al., 2011)25 Sheffield-Iran-Spain Db (Dıaz et al., 2004)26 Quercus Leaf C and N Db27 Global Leaf Robustness and Physiology Db (Niinemets, 2001)28 Ponderosa Pine Forest Db (Laughlin et al., 2010)29 Ukraine Wetlands Plant Traits Db30 ECOCRAFT (Medlyn and Jarvis, 1999)31 Abisko & Sheffield Db (Cornelissen et al., 1996, 2004)32 ArtDeco Db (Cornwell et al., 2008)33 Photosynthesis and Leaf Characteristics Db34 Causasus Plant Traits Db35 New South Wales Plant Traits Db36 European Mountain Meadows Plant Traits Db (Bahn et al., 1999)37 CORDOBASE (Dıaz et al., 2004)38 Global Respiration Db (Reich et al., 2008)39 Tropical Rainforest Traits Db (Poorter and Bongers, 2006)40 Global A, N, P, SLA Db (Reich et al., 2009)41 ECOQUA South American Plant Traits Db (Muller et al., 2007)42 FAPESP Brazil Rainforest Db43 South African Woody Plants Db (ZLTP)44 French Massif Central Grassland Trait Db (Louault et al., 2005)45 Tropical Traits from West Java Db (Poorter, 2009)46 Jasper Ridge Californian Woody Plants Db (Preston et al., 2006)47 Hawaiian Leaf Traits Db (Penuelas et al., 2010)48 Wetland Dunes Db (Bodegom et al., 2005, 2008)49 Tropical Traits from West Java Db (Shioder et al., 2008)50 Tropical Plant Traits From Borneo Db (Swaine, 2007)51 Herbaceous Traits from the Oland Island Db (Hickler, 1999)52 Traits from Subarctic Plant Species Db (Freschet et al., 2010)53 Leaf and Whole-Plant Traits Db (Sack et al., 2003, 2005, 2006)54 Plant Traits in Pollution Gradients Db55 Cedar Creek Plant Physiology Db56 VirtualForests Trait Db (Gutierrez and Huth, 2012)

Tabl

eS3

.2:D

atab

ases

and

num

bero

ftra

itsco

ntri

bute

d.Fo

ralis

tofr

efer

ence

sse

eta

ble

S3.1

.SL

Ais

the

spec

ific

leaf

area

,SD

Mse

eddr

ym

ass,

LDM

Cle

afdr

ym

atte

rcon

tent

,SSD

stem

spec

ific

area

,LA

leaf

area

,Nle

afni

troge

n,P

leaf

phos

phor

us,N

area

leaf

nitro

gen

peru

nitl

eafa

rea,

LFM

leaf

fres

hm

ass,

Nto

Pth

era

tioof

leaf

nitr

ogen

tole

afph

osph

orus

,Cle

afca

rbon

and

15N

the

leaf

cont

ento

fdel

ta15

nitr

ogen

Ref

.SL

AH

eigh

tSD

ML

DM

CSS

DL

AN

PN

area

LFM

Nto

PC

15N

167

3613

0650

652

2310

9583

210

531

1030

1030

1054

53

5256

293

331

5000

1003

5355

214

5098

455

0255

0255

025

819

660

5483

2219

0213

0972

018

876

1560

3401

1573

1564

989

707

969

991

723

2123

2123

2123

2123

218

1866

1872

1880

1836

1886

1836

919

9716

4126

022

8689

2098

2254

1015

5588

315

0615

6315

5415

2515

6311

1700

8416

1223

7056

015

8520

6178

719

7574

513

2953

245

2852

2594

477

474

286

1447

8147

8115

2270

2255

3514

1656

714

3010

0727

6917

750

840

375

726

720

312

1230

040

031

218

927

312

942

1464

1910

9911

1810

9920

1043

1028

1043

2146

746

647

146

747

122

813

1179

2328

928

726

928

828

228

128

224

1735

2543

042

642

943

026

767

767

2775

511

222

612

722

625

2813

913

913

913

913

713

913

813

913

913

929

214

214

209

214

214

107

214

3033

821

639

694

196

112

3124

924

860

254

624

517

632

576

603

3357

857

834

144

102

144

144

156

129

144

155

3526

129

224

828

736

210

202

210

210

201

3717

713

714

467

2114

676

5570

5050

3826

730

267

7826

739

135

8216

413

582

135

8240

124

4912

712

712

411

941

656

4267

7714

467

7667

7676

Con

tinue

don

Nex

tPag

e...

Tabl

eS3

.2–

Con

tinue

d

Ref

.SL

AH

eigh

tSD

ML

DM

CSS

DL

AN

PN

area

LFM

Nto

PC

15N

4325

960

6060

6044

4312

943

4343

4343

4345

101

9610

110

146

5451

5454

5453

5347

8888

8886

4815

316

149

101

9610

150

3636

3536

3636

3651

8080

8052

4040

4040

4040

5317

430

1122

5440

8040

2640

5545

4545

5637

sum

4204

627

953

1262

022

571

9905

4246

932

246

1415

210

158

1210

375

2892

0410

820

S4. Map of TRY measurement sites

Figure S4.1: Measurement sites of trait data contributed to the TRY project (acquisition date 2012-08-25) on a map of mean annual temperature(extracted from the WorldClim dataset). Note the sparse spatial distribution of measurement sites, especially in extremely cold or hot areas.

S5. Location of Acer saccharum range map and soil and climate across the range of Acer saccharum

Figure S5.1: Geographic location of the species range map of Acer saccharum

Figure S5.2: Soil %clay (a), soil organic carbon (b), %silt (c), %clay (d), isothermality (e) and precipitation seasonality (f) across the species rangeof Acer saccharum. For a map of the geographic location of these maps, see Figure S5.1.

S6. Correlation between traits and environmental variables used in aHPMF

Table S6.1: Correlations between the raw trait values and the environmental variables used within aHPMF (MAT is the mean annual temperature,MAP is the mean annual precipitation, SOC soil organic carbon, AWC the annual water capacity and ISO isothermality

Trait % silt % sand/% silt % SOC MAT MAP AWC ISOSpecific leaf area 0.0654 0.0131 0.0248 -0.1905 -0.0707 0.0059 -0.1701Plant height -0.1725 -0.2779 -0.207 0.6358 0.5786 0.312 0.6951Seed mass -0.2304 -0.1278 -0.1291 0.5304 0.6136 0.3343 0.574LDMC -0.1949 -0.1329 -0.2054 0.3535 0.361 0.1684 0.3579Stem specific density 0.0587 -0.0784 -0.0133 0.2686 0.0491 -0.1442 0.1392Leaf area -0.0401 -0.3173 -0.0765 0.4836 0.4577 0.2924 0.4442Leaf nitrogen concentration -0.0237 -0.0341 -0.0592 0.0269 0.0861 0.0762 0.0888Leaf phosphorus concentration 0.2681 0.0057 0.0883 -0.3262 -0.1687 -0.1491 -0.2661Leaf nitrogen per area 0.0444 -0.0659 -0.0785 0.1289 0.0673 0.0349 0.0894Leaf fresh mass -0.6522 -0.2866 0.6947 0.705 0.6988 0.3934 0.7172Leaf nitrogen/phosphorus ratio -0.4176 -0.0927 -0.2129 0.4456 0.3186 0.1976 0.5253Leaf carbon per dry mass 0.0915 -0.1133 0.0045 0.0814 0.1301 -0.0303 0.0497Leaf δ 15N -0.4532 -0.1218 -0.3584 0.5313 0.2171 0.3115 0.5598

S7. Root mean squared error comparison between MEAN, BHPMF and aHPMF

Table S7.1: RMSE ± range of Species Mean (MEAN), HPMF and aHPMF with increasing taxonomic information. Latent dimension k=15 formatrix factorization methods. Lowest RMSE is shown in bold. Phylo is the phylogenetic group.

Taxonomic info MEAN HPMF aHPMFPhylo 0.9301 ± 0.0026 0.8362 ± 0.0103 ×

Phylo+family 0.7415 ± 0.0019 0.6956 ± 0.0113 ×

Phylo+family+genus 0.6009 ± 0.0021 0.5694 ± 0.0032 ×

Phylo+family+genus+species 0.5853 ± 0.0024 0.5009 ± 0.0034 0.5272 ± 0.0025

S8. Sensitivity analysis

Figure S8.1: Location of the extracted RAINFOR data (red) and the whole TRY data in this region (black).

Table S8.1: Number of observations corresponding to the percentage of added gaps of the RAINFOR database by trait, using a) only the RAINFORdata itself (Rainfor for Rainfor) or b) the whole TRY database (World for Rainfor) to fill these gaps. SLA is the specific leaf area, PlantHt is the Plantheight, SSD is the stem specific density, P is phosphorus, N is nitrogen and C is carbon.

Rainfor for Rainfor World for Rainfor0 10 30 60 80 0 10 30 60 80

Total 8.967 8.070 6.277 3.587 1.798 113.819 112.922 111.129 108.439 106.645SLA 1.360 1.209 954 538 273 33.001 32.850 32.595 32.179 31.914PlantHt 864 771 602 342 174 16.465 16.372 16.203 15.943 15.775SSD 1.327 1.186 936 543 260 9.191 9.050 8.800 8.407 8.124Leaf P 1.353 1.234 963 546 273 11.975 11.856 11.585 11.168 10.895Leaf N 1.364 1.237 967 542 290 26.882 26.755 26.485 26.060 25.808Leaf N/area 1.335 1.211 923 531 262 8.180 8.056 7.768 7.376 7.107Leaf C/dry mass 1.364 1.222 932 545 261 8.125 7.983 7.693 7.306 7.022

Figure S8.2: Location of the extracted European data (red) and the global TRY data (black).

Table S8.2: Number of observations in the gap-filling the European database exercise, using a) only the European data itself (Europe for Europe) orb) the whole TRY database (World for Europe) to fill these gaps. SLA is the specific leaf area, PlantHt is the Plant height, LDMC is the leaf drymatter content, SSD is the stem specific density, N is nitrogen, P is phosphorus and C is carbon

Europe for Europe World for EuropeTotal 155.147 204.389SLA 24.749 33.001PlantHt 13.959 16.465Seed mass 5.930 7.311LDMC 12.560 17.331SSD 3.675 9.191Leaf area 34.813 39.438Leaf N 18.857 26.882Leaf P 7.280 11.975Leaf N/area 5.014 8.180Leaf fresh mass 11.484 11.484Leaf N/P ratio 4.123 5.999Leaf C/dry mass 4.470 8.125Leaf δ 15N 8.233 9.007

Figure S8.3: RMSE of performing HPMF on the European cutout (red points in Figure S8.2) for the whole dataset (Total), specific leaf area (SLA),Plant height (PlantHt), seed mass, leaf dry matter content (LDMC), stem specific density (SSD), leaf area, leaf nitrogen, leaf phosphorus, leafnitrogen per area (LeafNArea), leaf fresh mass (LeafFmass), leaf nitrogen to phosphorus ratio (LeafNP), leaf carbon and leaf delta 15 nitrogen(LeafD15N). Grey and black boxplots (on the left and right for each trait respectively) are RMSE for imputations using just the European dataset(grey) or the whole global dataset (black) to fill the gaps

S9. Bi- and multi-variate relationships between traits, measured and imputed trait values

Figure S9.1: Figure is continued on next page.

observed plant height

Mea

n −

pre

dict

ed p

lant

hei

ght

−4

−2

0

2

4

−4 −2 0 2 4

heightPlant

observed SSD

Mea

n −

pre

dict

ed S

SD

−8

−6

−4

−2

0

−8 −6 −4 −2 0

densityStem specific

observed Leaf P

Mea

n −

pre

dict

ed L

eaf P

−3

−2

−1

0

1

2

−3 −2 −1 0 1 2

P

observed LDMC

Mea

n −

pre

dict

ed L

DM

C

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

−3.0 −2.0 −1.0

LDMC

observed foliar N/P ratio

Mea

n −

pre

dict

ed fo

liar

N/P

rat

io

1

2

3

4

1 2 3 4

N to P ratio


HP

MF

− p

redi

cted

pla

nt h

eigh

t

−4

−2

0

2

4

−4 −2 0 2 4

heightPlant

observed SSD

HP

MF

− p

redi

cted

SS

D

−8

−6

−4

−2

0

−8 −6 −4 −2 0


observed Leaf P

HP

MF

− p

redi

cted

Lea

f P

−3

−2

−1

0

1

2

−3 −2 −1 0 1 2

P

observed LDMC

HP

MF

− p

redi

cted

LD

MC

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

−3.0 −2.0 −1.0

LDMC


HP

MF

− p

redi

cted

folia

r N

/P r

atio

1

2

3

4

1 2 3 4

N to P ratio


aHP

MF

− p

redi

cted

pla

nt h

eigh

t

−4

−2

0

2

4

−4 −2 0 2 4

heightPlant

observed SSD

aHP

MF

− p

redi

cted

SS

D

−8

−6

−4

−2

0

−8 −6 −4 −2 0


observed Leaf P

aHP

MF

− p

redi

cted

Lea

f P

−3

−2

−1

0

1

2

−3 −2 −1 0 1 2

P

observed LDMC

aHP

MF

− p

redi

cted

LD

MC

−3.0

−2.5

−2.0

−1.5

−1.0

−0.5

−3.0 −2.0 −1.0

LDMC


aHP

MF

− p

redi

cted

folia

r N

/P r

atio

1

2

3

4

1 2 3 4

N to P ratio

observed Leaf N per area

Mea

n −

pre

dict

ed L

eaf N

per

are

a

−2

−1

0

1

2

−2 −1 0 1 2

N per area

observed Leaf C per dry massMea

n −

pre

dict

ed L

eaf C

per

dry

mas

s

5.8

6.0

6.2

6.4

5.8 6.0 6.2 6.4

dry massC per

observed Seed mass

Mea

n −

pre

dict

ed S

eed

mas

s

−5

0

5

10

−5 0 5 10

Seed mass

observed foliar N

Mea

n −

pre

dict

ed fo

liar

N

1

2

3

4

1 2 3 4

N

observed leaf area

Mea

n −

pre

dict

ed le

af a

rea

0

5

10

0 5 10

Leaf area


HP

MF

− p

redi

cted

Lea

f N p

er a

rea

−2

−1

0

1

2

−2 −1 0 1 2

N per area

observed Leaf C per dry massHP

MF

− p

redi

cted

Lea

f C p

er d

ry m

ass

5.8

6.0

6.2

6.4

5.8 6.0 6.2 6.4

dry massC per

observed Seed mass

HP

MF

− p

redi

cted

See

d m

ass

−5

0

5

10

−5 0 5 10

Seed mass

observed foliar N

HP

MF

− p

redi

cted

folia

r N

1

2

3

4

1 2 3 4

N

observed leaf area

HP

MF

− p

redi

cted

leaf

are

a

0

5

10

0 5 10

Leaf area


aHP

MF

− p

redi

cted

Lea

f N p

er a

rea

−2

−1

0

1

2

−2 −1 0 1 2

N per area

observed Leaf C per dry massaHP

MF

− p

redi

cted

Lea

f C p

er d

ry m

ass

5.8

6.0

6.2

6.4

5.8 6.0 6.2 6.4

dry massC per

observed Seed mass

aHP

MF

− p

redi

cted

See

d m

ass

−5

0

5

10

−5 0 5 10

Seed mass

observed foliar N

aHP

MF

− p

redi

cted

folia

r N

1

2

3

4

1 2 3 4

N

observed leaf area

aHP

MF

− p

redi

cted

leaf

are

a

0

5

10

0 5 10

Leaf area

Figure S9.1: Scatter plot for pairs of traits on true test data (x) versus predicted test data (y) using the MEAN (blue triangles), HPMF (red circles)and aHPMF (green crosses) approach. All values are given in natural log-normal (base e) transformed space. Note how HPMF and aHPMF imputedvalues are generally less scattered and closer to the 1:1 line than MEAN imputed values. This holds true for structural traits (such as plant height) aswell as physiological traits (like the foliar N to P ratio). All correlations were highly significant at the p <0.001 level.

Figure S9.2: Scatter plots of trait-trait correlations for observed data (left column), Mean predictions (second column) and BHPMF predictions(right column) for specific leaf area (SLA) versus foliar nitrogen (N) concentration (top row) and foliar phosphorus (P) concentration versus foliar Nconcentration (bottom row). Values are given in natural log-normal (base e) transformed space). Note the correlation coefficients (R2) given.

−0.8 −0.4 0.0 0.4

−0.

40.

00.

4

Procrustes errors

PCA 3

PC

A 4

Figure S9.3: Procrustes analysis errors for the 3rd and 4th Principal Component axes comparing a PCA performed on the original, gappy RAINFORdata with a PCA performed on the RAINFOR data with artificially introduced gaps being filled using BHPMF

−1.0 −0.5 0.0 0.5 1.0

−0.

40.

00.

40.

8

Procrustes errors

PCA 5

PC

A 6

Figure S9.4: Procrustes analysis errors for the 5th and 6th Principal Component axes comparing a PCA performed on the original, gappy RAINFORdata with a PCA performed on the RAINFOR data with artificially introduced gaps being filled using BHPMF

Figure S9.5: Bivariate Pearson correlations between traits. Colors indicate strength of relationship with deep red being positive and deep bluenegative correlations. The plots in the diagonal show the distribution of each trait with the number indicating the trait according to Table S2.1

S10. Gibbs sampler results

Figure S10.1: Gibbs sampler generated density plots of BHPMF-estimated natural log-normal transformed leaf dry matter content, foliar nitrogenconcentration and specific leaf area (SLA) of three Acer saccharum (top row) and Pinus sylvestris (bottom row) individuals respectively.

10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1All traits

Ave

rag

e o

f R

MS

E

Percentage of Data with Ascending StdFigure S10.2: Gibbs sampler for the 13-trait data set with the inverse of prediction confidence (SD) on the x-axis and the prediction error (RMSE)on the y-axis. The x-axis is produced by sorting the data points in ascending order of their SD, dividing the test sets evenly into 10 classes withascending SD. I.e., the first class contains the 10% of predictions with the lowest SD, the second class the 10% of predictions with the second lowestSD, and so on. We regress the average RMSE per class against the ordered SD.

10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0

0.2

0.4

0.6

0.8

1

1.2

1.41− SLA

Ave

rage

of R

MS

E

Percentage of Data with Ascending Std10% 20% 30% 40% 50% 60% 70% 80% 90% 100%

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.22− PlantHeight

Ave

rage

of R

MS

E


0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.23− SeedMass

Ave

rage

of R

MS

E

Percentage of Data with Ascending Std

10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.24− LDMC

Ave

rage

of R

MS

E


−0.2

0

0.2

0.4

0.6

0.8

1

1.25− StemSpecificDensity

Ave

rage

of R

MS

E


0

0.2

0.4

0.6

0.8

1

1.2

1.46− LeafArea

Ave

rage

of R

MS

E


10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.27− LeafN

Ave

rage

of R

MS

E


0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.28− LeafP

Ave

rage

of R

MS

E


0

0.2

0.4

0.6

0.8

1

1.2

1.49− Leaf nitrogen (N) content per area

Ave

rage

of R

MS

E


10% 20% 30% 40% 50% 60% 70% 80% 90% 100%0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.810− Leaf fresh mass

Ave

rage

of R

MS

E


0.2

0.4

0.6

0.8

1

1.2

1.4

1.611− Leaf nitrogen/phosphorus (N/P) ratio

Ave

rage

of R

MS

E


0.2

0.4

0.6

0.8

1

1.2

1.4

1.612− Leaf carbon (C) content per dry mass

Ave

rage

of R

MS

E


0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.213− Leaf delta 15N

Ave

rage

of R

MS

E


Figure S10.3: Gibbs sampler for each of the 13 traits with the inverse of prediction confidence (SD) on the x-axis and the prediction error (RMSE)on the y-axis. The x-axis is produced as follows. If we have 100 total entries, we rank them by SD in ascending order, so 10% represent the first 10entries with the lowest SD, 20% represent the second 10% entries with the lowest SD etc. From top to bottom and left to right: specific leaf area(SLA), Plant height (PlantHt), seed mass, leaf dry matter content (LDMC), stem specific density (SSD), leaf area, leaf nitrogen, leaf phosphorus, leafnitrogen per area (LeafNArea), leaf fresh mass (LeafFmass), leaf nitrogen to phosphorus ratio (LeafNP), leaf carbon and leaf delta 15 nitrogen(LeafD15N)

S11. Additional references of data contributors

References

Bahn, M., Wohlfahrt, G., Haubner, E., Horak, I., Michaeler, W., Rottmar, K., Tappeiner, U., Cernusca, A., 1999. Leaf photosynthesis, nitrogencontents and specific leaf area of 30 grassland species in differently managed mountain ecosystems in the eastern alps, in: Land-Use Changes inEuropean Mountain Ecosystems. ECOMONT- Concept and Results. in: Land-Use Changes in European Mountain Ecosystems. ECOMONT-Concept and Results, ed.: Cernusca, A. and Tappeiner, U. and Bayfield, N., Blackwell Wissenschaft, Berlin.

Bodegom, P.M.v., Kanter, M., Aerts, C.B.R., 2005. Radial oxygen loss, a plastic property of dune slack plant species. Plant and Soil 271, 351–364.Bodegom, P.M.v., Sorrell, B.K., Oosthoek, A., Bakker, C., Aerts, R., 2008. Separating the effects of partial submergence and soil oxygen demand on

plant physiology. Ecology 89, 193–204.Cavender-Bares, J., Keen, A., Miles, B., 2006. Phylogenetic structure of floridian plant communities depends on taxonomic and spatial scale.

Ecology 87, 109–122.Cornelissen, J., Aerts, R., Cerabolini, B., Werger, M., van der Heijden, M., 2001. Carbon cycling traits of plant species are linked with mycorrhizal

strategy. Oecologia 129, 611–619.Cornelissen, J., Cerabolini, B., Castro-Dez, P., Villar-Salvador, P., Montserrat-Mart, G., Puyravaud, J., Maestro, M., Werger, M., Aerts, R., 2003.

Functional traits of woody plants: correspondence of species rankings between field adults and laboratory-grown seedlings? Journal of VegetationScience 14, 311–22.

Cornelissen, J.H.C., Diez, P.C., Hunt, R., 1996. Seedling growth, allocation and leaf attributes in a wide range of woody plant species and types.Journal of Ecology 84, 755–765.

Cornelissen, J.H.C., Quested, H.M., Gwynn-Jones, D., Van Logtestijn, R.S.P., De Beus, M.A.H., Kondratchuk, A., Callaghan, T.V., Aerts, R., 2004.Leaf digestibility and litter decomposability are related in a wide range of subarctic plant species and types. Functional Ecology 18, 779–786.

Cornwell, W.K., Cornelissen, J.H.C., Amatangelo, K., Dorrepaal, E., Eviner, V.T., Godoy, O., Hobbie, S.E., Hoorens, B., Kurokawa, H., Prez-Harguindeguy, N., Quested, H.M., Santiago, L.S., Wardle, D.A., Wright, I.J., Aerts, R., Allison, S.D., Van Bodegom, P., Brovkin, V., Chatain,A., Callaghan, T.V., Daz, S., Garnier, E., Gurvich, D.E., Kazakou, E., Klein, J.A., Read, J., Reich, P.B., Soudzilovskaia, N.A., Vaieretti, M.V.,Westoby, M., 2008. Plant species traits are the predominant control on litter decomposition rates within biomes worldwide. Ecology Letters 11,1065–1071.

Craine, J.M., Elmore, A.J., Aidar, M.P.M., Bustamante, M., Dawson, T.E., Hobbie, E.A., Kahmen, A., Mack, M.C., McLauchlan, K.K., Michelsen,A., Nardoto, G.B., Pardo, L.H., Peuelas, J., Reich, P.B., Schuur, E.A.G., Stock, W.D., Templer, P.H., Virginia, R.A., Welker, J.M., Wright, I.J.,2009. Global patterns of foliar nitrogen isotopes and their relationships with climate, mycorrhizal fungi, foliar nutrient concentrations, andnitrogen availability. New Phytologist 183, 980–992.

Craine, J.M., Lee, W.G., Bond, W.J., Williams, R.J., Johnson, L.C., 2005. Environmental constraints on a global relationship among leaf and roottraits of grasses. Australian Journal of Botany 86, 12–19.

Dıaz, S., Hodgson, J., Thompson, K., Cabido, M., Cornelissen, J., Jalili, A., Montserrat-Martı, G., Grime, J., Zarrinkamar, F., Asri, Y., Band, S.,Basconcelo, S., Castro-Dıez, P., Funes, G., Hamzehee, B., Khoshnevi, M., Perez-Harguindeguy, N., Perez-Rontome, M., Shirvany, F., Vendramini,F., Yazdani, S., Abbas-Azimi, R., Bogaard, A., Boustani, S., Charles, M., Dehghan, M., de Torres-Espuny, L., Falczuk, V., Guerrero-Campo, J.,Hynd, A., Jones, G., Kowsary, E., Kazemi-Saeed, F., Maestro-Martınez, M., Romo-Dıez, A., Shaw, S., Siavash, B., Villar-Salvador, P., Zak, M.,2004. The plant traits that drive ecosystems: Evidence from three continents. Journal of Vegetation Science 15, 295–304.

Fonseca, C.R., Overton, J.M., Collins, B., Westoby, M., 2000. Shifts in trait-combinations along rainfall and phosphorus gradients. Journal ofEcology 88, 964–977.

Freschet, G.T., Cornelissen, J.H.C., Van Logtestijn, R.S.P., Aerts, R., 2010. Evidence of the plant economics spectrum in a subarctic flora. Journal ofEcology 98, 362–373.

Fyllas, N.M., Patino, S., Baker, T.R., Bielefeld Nardoto, G., Martinelli, L.A., Quesada, C.A., Paiva, R., Schwarz, M., Horna, V., Mercado, L.M.,Santos, A., Arroyo, L., Jimenez, E.M., Luizao, F.J., Neill, D.A., Silva, N., Prieto, A., Rudas, A., Silviera, M., Vieira, I.C.G., Lopez-Gonzalez,G., Malhi, Y., Phillips, O.L., Lloyd, J., 2009. Basin-wide variations in foliar properties of amazonian forest: phylogeny, soils and climate.Biogeosciences 6, 2677–2708.

Garnier, E., Lavorel, S., Ansquer, P., Castro, H., Cruz, P., Dolezal, J., Eriksson, O., Fortunel, C., Freitas, H., Golodets, C., Grigulis, K., Jouany, C.,Kazakou, E., Kigel, J., Kleyer, M., Lehsten, V., Lep, J., Meier, T., Pakeman, R., Papadimitriou, M., Papanastasis, V.P., Quested, H., Qutier, F.,Robson, M., Roumet, C., Rusch, G., Skarpe, C., Sternberg, M., Theau, J.P., Thbault, A., Vile, D., Zarovali, M.P., 2007. Assessing the effectsof land-use change on plant traits, communities and ecosystem functioning in grasslands: A standardized methodology and lessons from anapplication to 11 european sites. Annals of Botany 99, 967–985.

Gillison, A.N., Carpenter, G., 1997. A generic plant functional attribute set and grammar for dynamic vegetation description and analysis. FunctionalEcology 11, 775–783.

Gutierrez, A.G., Huth, A., 2012. Successional stages of primary temperate rainforests of chiloe island, chile. Perspectives in Plant Ecology, Evolutionand Systematics 14, 243 – 256.

Han, W.X., Fang, J.Y., Guo, D.L., Zhang, Y., 2005. Leaf nitrogen and phosphorus stoichiometry across 753 terrestrial plant species in china. NewPhytologist 168, 377–385.

Hickler, T., 1999. Plant functional types and community characteristics along environmental gradients on Oland’s Great Alvar (Sweden). Master’sthesis. University of Lund, Sweden.

Kattge, J., Knorr, W., Raddatz, T., Wirth, C., 2009. Quantifying photosynthetic capacity and its relationship to leaf nitrogen content for global-scaleterrestrial biosphere models. Global Change Biology 15, 976 – 991.

Kleyer, M., Bekker, R., Knevel, I., Bakker, J., Thompson, K., Sonnenschein, M., Poschlod, P., Van Groenendael, J., Klimes, L., Klimesova, J., Klotz,S., Rusch, G., Hermy, M., Adriaens, D., Boedeltje, G., Bossuyt, B., Dannemann, A., Endels, P., Gotzenberger, L., Hodgson, J., Jackel, A.K.,Kuhn, I., Kunzmann, D., Ozinga, W., Romermann, C., Stadler, M., Schlegelmilch, J., Steendam, H., Tackenberg, O., Wilmann, B., Cornelissen, J.,

Eriksson, O., Garnier, E., Peco, B., 2008. The leda traitbase: a database of life-history traits of the northwest european flora. Journal of Ecology96, 1266–1274.

Kurokawa, H., Nakashizuka, T., 2008. Leaf herbivory and decomposability in a malaysian tropical rain forest. Ecology 89, 2645–2656.Laughlin, D.C., Leppert, J.J., Moore, M.M., Sieg, C.H., 2010. A multi-trait test of the leaf-height-seed plant strategy scheme with 133 species from a

pine forest flora. Functional Ecology 24, 493–501.Louault, F., Pillar, V., Aufrre, J., Garnier, E., Soussana, J.F., 2005. Plant traits and functional types in response to reduced disturbance in a

semi-natural grassland. Journal of Vegetation Science 16, 151–160.Medlyn, B.E., Jarvis, P.G., 1999. Design and use of a database of model parameters from elevated co2 experiments. Ecological Modelling 124,

69–83.Messier, J., McGill, B.J., Lechowicz, M.J., 2010. How do traits vary across ecological scales? a case for trait-based ecology. Ecology Letters 13,

838–848.Moles, A.T., Ackerly, D.D., Webb, C.O., Tweddle, J.C., Dickie, J.B., Westoby, M., 2005. A brief history of seed size. Science 307, 576–580.Moles, A.T., Falster, D.S., Leishman, M.R., Westoby, M., 2004. Small-seeded species produce more seeds per square metre of canopy per year, but

not per individual per lifetime. Journal of Ecology 92, 384–396.Muller, S.C., Overbeck, G.E., Pfadenhauer, J., Pillar, V.D., 2007. Plant functional types of woody species related to fire disturbance in forestgrassland

ecotones. Plant Ecology 189, 1–14.Niinemets, U., 2001. Global-scale climatic controls of leaf dry mass per area, density, and thickness in trees and shrubs. Ecology 82, 453–469.Ogaya, R., Penuelas, J., 2003. Comparative field study of quercus ilex and phillyrea latifolia: photosynthetic response to experimental drought

conditions. Environmental and Experimental Botany 50, 137 – 148.Onoda, Y., Westoby, M., Adler, P.B., Choong, A.M.F., Clissold, F.J., Cornelissen, J.H.C., Dıaz, S., Dominy, N.J., Elgart, A., Enrico, L., Fine, P.V.A.,

Howard, J.J., Jalili, A., Kitajima, K., Kurokawa, H., McArthur, C., Lucas, P.W., Markesteijn, L., Perez-Harguindeguy, N., Poorter, L., Richards,L., Santiago, L.S., Sosinski, E.E., Van Bael, S.A., Warton, D.I., Wright, I.J., Joseph Wright, S., Yamashita, N., 2011. Global patterns of leafmechanical properties. Ecology Letters 14, 301–312.

Ordonez, J.C., Van Bodegom, P.M., Witte, J.P.M., Bartholomeus, R.P., Hal, J.R., Aerts, R., 2010. Plant strategies in relation to resource supply inmesic to wet environments: Does theory mirror nature? The American Naturalist 175, 225–239.

Penuelas, J., Sardans, J., Llusia, J., Owen, S.M., Carnicer, J., Giambelluca, T.W., Rezende, E.L., Waite, M., Niinemets, U., 2010. Faster returns onleaf economics and different biogeochemical niche in invasive compared with native plant species. Global Change Biology 16, 2171–2185.

Poorter, L., 2009. Leaf traits show different relationships with shade tolerance in moist versus dry tropical forests. New Phytologist 181, 890–900.Poorter, L., Bongers, F., 2006. Leaf traits are good predictors of plant performance across 53 rain forest species. Ecology 87, 1733–1743.Preston, K.A., Cornwell, W.K., DeNoyer, J.L., 2006. Wood density and vessel traits as distinct correlates of ecological strategy in 51 california coast

range angiosperms. New Phytologist 170, 807–818.Reich, P.B., Oleksyn, J., Wright, I.J., 2009. Leaf phosphorus influences the photosynthesis-nitrogen relation: a cross-biome analysis of 314 species.

Oecologia 160, 207–212.Reich, P.B., Tjoelker, M.G., Pregitzer, K.S., Wright, I.J., Oleksyn, J., Machado, J.L., 2008. Scaling of respiration to nitrogen in leaves, stems and

roots of higher land plants. Ecology Letters 11, 793–801.Sack, L., Cowan, P.D., Jaikumar, N., Holbrook, N.M., 2003. The hydrology of leaves: co-ordination of structure and function in temperate woody

species. Plant, Cell and Environment 26, 1343–1356.Sack, L., Frole, K., 2006. Leaf structural diversity is related to hydraulic capacity in tropical rain forest trees. Ecology 87, 483491.Sack, L., Tyree, M.T., 2005. Leaf Hydraulics and Its Implications in Plant Structure and Function. Oxford, UK.Shioder, S., Rahajoe, J.S., Kohyama, T., 2008. Variation in longevity and traits of leaves among co-occurring understory plants in a tropical montane

forest. Journal of Tropical Ecology 24, 121–133.Shipley, B., 1989. The use of above-ground maximum relative growth rate as an accurate predictor of whole-plant maximum relative growth rate.

Functional Ecology 3, 771–775.Shipley, B., 1995. Structured interspecific determinants of specific leaf area in 34 species of herbaceous angiosperms. Functional Ecology 9,

312–319.Shipley, B., 2002. Trade-offs between net assimilation rate and specific leaf area in determining relative growth rate: relationship with daily

irradiance. Functional Ecology 16, 682–689.Swaine, E.K., 2007. Ecological and evolutionary drivers of plant community assembly in a Bornean rain forest. Ph.D. thesis. University of Aberdeen,

UK.Wright, I.J., Ackerly, D.D., Bongers, F., Harms, K.E., Ibarra-Manriquez, G., Martinez-Ramos, M., Mazer, S.J., Muller-Landau, H.C., Paz, H., Pitman,

N.C.A., Poorter, L., Silman, M.R., Vriesendorp, C.F., Webb, C.O., Westoby, M., Wright, S.J., 2007. Relationships among ecologically importantdimensions of plant trait variation in seven neotropical forests. Annals of Botany 99, 1003–1015.

Wright, I.J., Reich, P.B., Atkin, O.K., Lusk, C.H., Tjoelker, M.G., Westoby, M., 2006. Irradiance, temperature and rainfall influence leaf darkrespiration in woody plants: evidence from comparisons across 20 sites. New Phytologist 169, 309–319.

Wright, I.J., Reich, P.B., Westoby, M., Ackerly, D.D., Baruch, Z., Bongers, F., Cavender-Bares, J., Chapin, T., Cornelissen, J.H.C., Diemer, M.,Flexas, J., Garnier, E., Groom, P.K., Gulias, J., Hikosaka, K., Lamont, B.B., Lee, T., Lee, W., Lusk, C., Midgley, J.J., Navas, M.L., Niinemets, U.,Oleksyn, J., Osada, N., Poorter, H., Poot, P., Prior, L., Pyankov, V.I., Roumet, C., Thomas, S.C., Tjoelker, M.G., Veneklaas, E.J., Villar, R., 2004.The worldwide leaf economics spectrum. Nature 428, 821 – 827.

Wright, S.J., Kitajima, K., Kraft, N.J., Reich, P.B., Wright, I.J., Bunker, D.E., Condit, R., Dalling, J.W., Davies, S.J., Dıaz, S., Engelbrecht, B.M.,Harms, K.E., Hubbell, S.P., Marks, C.O., Ruiz-Jaen, M.C., Salvador, C.M., Zanne, A.E., 2010. Functional traits and the growth-mortality tradeoff

in tropical trees. Ecology 9, 3664–3674.

Individual contributions of the authors: The concept of the paper was developed by Franziska Schrodt, Jens Kattge, Arindam Banerjee and Peter B. Reich. Development of methods by Hanhuai Shan, Farideh Fazayeli, Anuj Karpatne, Franziska Schrodt, Arindam Banerjee and Jens Kattge Data analyses and sensitivity analyses were performed by Franziska Schrodt, Julia Joswig and Farideh Fazayeli. Trait data were provided by the TRY initiative, including major contributions by Jens Kattge, Gerhard Boenisch, Sandra Diaz, John Dickie, Andy Gillison, Sandra Lavorel, Paul Leadley, Christian Wirth, Ian J. Wright, S. Joseph Wright and Peter B. Reich. Franziska Schrodt, Hanhuai Shan, Farideh Fazayeli and Jens Kattge wrote the paper with contributions from all co-authors.

macroecological bhpmf – a hierarchical bayesian approach

Documents