Analysis of multivariate genotype - environment data
Archived at http://orgprints.org/8021.
-
Analysis of multivariate genotype - environment data using Non-linear Canonical Correlation Analysis
Hans Pinnschmidt
Danish Institute for Agricultural Sciences
Division of Crop Protection
Cereal Plant Pathology Group
Denmark
-
Background
BAROF WP1 data: multivariate measurements on 86 spring barley genotypes in 10 environments (2 years: 2002 & 2003; 3 sites: Flakkebjerg, Foulum, Jyndevad; 2 production systems: ecological & conventional).
Objectives
Multivariate characterisation of genotypes with emphasis on yield-related properties.
-
variables:
yield, 1000 grain weight, grain protein content, culm length, date of emergence, growth duration, mildew severity, rust severity, scald severity, net blotch severity, disease diversity, weed cover, broken panicles & culms, lodging
parameters:
raw data, mean/median/max./min., rank/relative values, main effects, interaction slopes, raw data adjusted for E/G main effects/slopes (residuals), IPCA scores, SD/variance
These parameters are used to derive information on general properties, specificity, stability/variability.
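Several of the parameters listed above can be derived from a single genotype x environment table. The following is a minimal sketch, not the presentation's actual procedure: the yield values are invented, the table is assumed balanced and complete, the slope is assumed to be a joint-regression slope of each genotype on the environmental index, and the IPCA scores are assumed to come from a singular value decomposition of the interaction residuals with symmetric scaling, as is common in AMMI analysis.

```python
import numpy as np

# Hypothetical yields (t/ha): rows = genotypes, columns = environments.
Y = np.array([
    [5.1, 6.0, 4.2, 5.5],
    [4.8, 6.4, 3.9, 5.9],
    [5.5, 5.8, 4.9, 5.4],
])

grand = Y.mean()
g_main = Y.mean(axis=1) - grand   # genotype (G) main effects
e_main = Y.mean(axis=0) - grand   # environment (E) main effects

# Slope of each genotype on the environmental index (E main effects).
# Assumption: joint-regression slopes; a slope of 1 means average response.
slopes = (Y - Y.mean(axis=1, keepdims=True)) @ e_main / (e_main @ e_main)

# Residuals after removing both main effects (G x E interaction).
resid = Y - grand - g_main[:, None] - e_main[None, :]

# First interaction principal component (IPCA1) scores via SVD.
U, s, Vt = np.linalg.svd(resid, full_matrices=False)
ipca1_g = U[:, 0] * np.sqrt(s[0])   # genotype scores
ipca1_e = Vt[0] * np.sqrt(s[0])     # environment scores

# Variability of each genotype across environments.
sd_g = Y.std(axis=1, ddof=1)

print("G main effects:", g_main.round(3))
print("slopes:", slopes.round(3))
print("IPCA1 (genotypes):", ipca1_g.round(3))
print("SD across environments:", sd_g.round(3))
```

Main effects describe general properties, slopes and IPCA scores describe specificity, and the SD describes stability/variability.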
-
Non-linear Canonical Correlation Analysis (NCCA): an optimal scaling procedure suited for handling multivariate data of any kind of scaling (numerical/quantitative, ordinal, nominal).
-
Non-linear Canonical Correlation Analysis (NCCA)
Data treatment: each quantitative variable (v1 ... vm) was converted into an ordinal variable with n categories (v11 ... v1n, ..., vm1 ... vmn).
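The conversion of a quantitative variable into n ordered categories can be sketched as follows. The slides do not say how the category limits were chosen; equal-frequency (quantile) limits are an assumption here, and the function name `to_ordinal` and the yield values are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles

def to_ordinal(values, n_categories=3):
    """Map quantitative values to ordinal categories 1..n_categories.

    Assumption: category limits are the quantiles of the data, so each
    category holds roughly the same number of observations.
    """
    cuts = quantiles(values, n=n_categories)  # n_categories - 1 cut points
    return [bisect_right(cuts, v) + 1 for v in values]

# Hypothetical yields (t/ha) of eight genotypes in one environment.
yields = [4.2, 5.1, 6.3, 3.8, 5.7, 4.9, 6.0, 5.3]
yield_cat = to_ordinal(yields, n_categories=3)
print(yield_cat)
```

The resulting codes keep the rank order of the original values but discard the distances between them, which is what allows quantitative, ordinal and nominal variables to be analysed together.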
-
Non-linear Canonical Correlation Analysis (NCCA)
is based on multivariate contingency tables containing frequency counts.
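As a small illustration of such a frequency table, the sketch below cross-tabulates two categorised variables. The category codes are invented, and the variable names `yield_cat` and `mildew_cat` are hypothetical; a real analysis would cross-tabulate all categorised variables.

```python
from collections import Counter

# Hypothetical ordinal codes (1 = low, 3 = high) for eight observations.
yield_cat = [1, 2, 3, 1, 3, 2, 3, 2]
mildew_cat = [3, 2, 1, 3, 1, 2, 2, 1]

# Count how often each combination of categories occurs.
counts = Counter(zip(yield_cat, mildew_cat))

yield_levels = sorted(set(yield_cat))
mildew_levels = sorted(set(mildew_cat))

# Rows = yield categories, columns = mildew categories.
table = [[counts[(y, m)] for m in mildew_levels] for y in yield_levels]

for y, row in zip(yield_levels, table):
    print(f"yield category {y}: {row}")
```

Adding further variables to the `zip` call gives counts for combinations of three or more variables, that is, a multivariate contingency table.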
-
Non-linear Canonical Correlation Analysis (NCCA):
main dimensions (principal components) are determined
loadings of variables (overall correlation) are computed
category centroids are quantified
object scores (principal component scores) are computed
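To make these steps concrete, the sketch below runs a homogeneity analysis (multiple correspondence analysis) on integer-coded categorical data. This is a simpler relative of NCCA in which every variable forms its own set; it is swapped in for illustration and is not the procedure or software used in the presentation. It covers the main dimensions, the object scores and the category centroids, but not the variable loadings. The function names and the random test data are hypothetical.

```python
import numpy as np

def indicator(codes):
    """Objects x categories 0/1 matrix for one categorical variable."""
    categories = np.unique(codes)
    return (codes[:, None] == categories[None, :]).astype(float)

def homogeneity_analysis(data, n_dims=2):
    """Return eigenvalues, object scores and category centroids."""
    n_objects, n_vars = data.shape
    indicators = [indicator(data[:, j]) for j in range(n_vars)]

    # Average, over variables, of the projectors onto each variable's
    # category space (G D^-1 G', with D the category frequencies).
    avg_proj = np.zeros((n_objects, n_objects))
    for G in indicators:
        avg_proj += (G / G.sum(axis=0)) @ G.T
    avg_proj /= n_vars

    # Centring removes the trivial constant solution.
    centre = np.eye(n_objects) - 1.0 / n_objects
    eigvals, eigvecs = np.linalg.eigh(centre @ avg_proj @ centre)
    top = np.argsort(eigvals)[::-1][:n_dims]

    # Object scores: centred, with unit variance on every dimension.
    scores = eigvecs[:, top] * np.sqrt(n_objects)

    # Category centroids: mean object score of the objects in a category.
    centroids = [(G.T @ scores) / G.sum(axis=0)[:, None] for G in indicators]
    return eigvals[top], scores, centroids

# Hypothetical data: 30 objects, 4 variables with categories 1..3.
rng = np.random.default_rng(0)
data = rng.integers(1, 4, size=(30, 4))
eigvals, scores, centroids = homogeneity_analysis(data)
print("eigenvalues:", eigvals.round(3))
print("centroids of variable 1:\n", centroids[0].round(3))
```

Plotting the object scores and the category centroids in the same two-dimensional space gives the kind of joint display of objects and variable categories that such analyses are used for.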
-
Characterisation of environments
based on data adjusted for G main effects (= residuals)
-
Characterisation of genotypes
based on data adjusted for E main effects (= residuals)
-
Characterisation of genotypes & environments
based on:
raw data
data adjusted for E main effects
data adjusted for G & E main effects (G x E interaction)
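The three versions of the data listed above differ only in which main effects have been subtracted. A minimal sketch, assuming a balanced and complete genotype x environment table and simple mean-based main effects; the function name `adjust` and the values are hypothetical.

```python
def adjust(table):
    """Return (data adjusted for E, data adjusted for G & E).

    table: list of rows, one row per genotype, one column per environment.
    """
    n_g, n_e = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_g * n_e)
    g_eff = [sum(row) / n_e - grand for row in table]
    e_eff = [sum(table[i][j] for i in range(n_g)) / n_g - grand
             for j in range(n_e)]

    # Remove E main effects: genotype differences and interaction remain.
    adj_e = [[table[i][j] - e_eff[j] for j in range(n_e)]
             for i in range(n_g)]

    # Remove grand mean, G and E main effects: only interaction remains.
    adj_ge = [[table[i][j] - grand - g_eff[i] - e_eff[j] for j in range(n_e)]
              for i in range(n_g)]
    return adj_e, adj_ge

# Hypothetical raw data: 3 genotypes x 3 environments.
raw = [
    [5.0, 6.5, 4.0],
    [4.5, 7.0, 3.5],
    [5.5, 6.0, 5.0],
]
adj_e, adj_ge = adjust(raw)
for name, data in (("raw", raw), ("E-adjusted", adj_e), ("G x E", adj_ge)):
    print(name, [[round(v, 2) for v in row] for row in data])
```

After adjustment for E main effects every environment has the same mean, so remaining differences reflect genotypes; after adjustment for both main effects every row and column sums to zero.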
-
Conclusions & outlook
NCCA is an intuitive method, well suited to visualising the main features of multivariate data measured on various scales.
NCCA is useful for obtaining an overall orientation of G properties and E characteristics.
Future work: Refinements to obtain a better synopsis of the E-specific performance of Gs in relation to their property profiles. Include AMMI and clustering (biclassification) results in NCCA; organise data as environment-specific sets of variables.
-
Characterisation of genotype performance in individual environments based on:
raw yield and disease data
disease main effects of Gs
environmental disease variability of Gs (= standard deviation of E adjusted data)
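The two derived disease parameters above can be computed per genotype as sketched below. The severity values and genotype labels are invented, and a balanced table with mean-based environment effects is assumed.

```python
from statistics import mean, stdev

# Hypothetical mildew severity (%): genotype -> one value per environment.
severity = {
    "G1": [10.0, 20.0, 5.0],
    "G2": [12.0, 35.0, 4.0],
    "G3": [2.0, 5.0, 0.0],
}
n_env = 3

env_means = [mean(vals[j] for vals in severity.values()) for j in range(n_env)]
grand = mean(env_means)

profile = {}
for genotype, vals in severity.items():
    # Remove the E main effect from each observation.
    adjusted = [y - (m - grand) for y, m in zip(vals, env_means)]
    profile[genotype] = {
        "main_effect": mean(vals) - grand,  # disease main effect of the G
        "env_sd": stdev(adjusted),          # environmental variability
    }

for genotype, p in profile.items():
    print(genotype, round(p["main_effect"], 2), round(p["env_sd"], 2))
```

A genotype with a strongly negative main effect and a small standard deviation is consistently healthy across environments, whereas a large standard deviation points to environment-specific disease behaviour.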