Analysis of multivariate genotype - environment data using Non-linear Canonical Correlation Analysis

Hans Pinnschmidt
Danish Institute for Agricultural Sciences
Division of Crop Protection, Cereal Plant Pathology Group
Denmark

Archived at http://orgprints.org/8021

Background

BAROF WP1 data: multivariate measurements on 86 spring barley genotypes in 10 environments (2 years: 2002 & 2003, 3 sites: Flakkebjerg, Foulum, Jyndevad, 2 production systems: ecological & conventional).

Objectives

Multivariate characterisation of genotypes with emphasis on yield-related properties.
Variables: yield, 1000-grain weight, grain protein content, culm length, date of emergence, growth duration, mildew severity, rust severity, scald severity, net blotch severity, disease diversity, weed cover, broken panicles & culms, lodging.

Parameters: raw data; mean/median/max./min.; rank/relative values; main effects; interaction slopes; raw data adjusted for E/G main effects/slopes (residuals); IPCA scores; SD/variance.

These parameters are used to derive information on general properties, specificity and stability/variability.
Non-linear Canonical Correlation Analysis (NCCA): an optimal scaling procedure suited to multivariate data on any measurement scale (numerical/quantitative, ordinal, nominal).
Non-linear Canonical Correlation Analysis (NCCA)

Data treatment: each quantitative variable (v1 ... vm) was converted into an ordinal variable with n categories (v11 ... v1n, ..., vm1 ... vmn).
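The conversion step can be illustrated with a minimal sketch. The slides do not say how category boundaries were chosen, so equal-frequency (quantile) cut points are assumed here; the function name `to_ordinal` and the yield values are hypothetical.

```python
import numpy as np

def to_ordinal(values, n_categories):
    """Convert a quantitative variable into ordinal codes 1..n_categories.

    Assumption: equal-frequency (quantile) cut points. The slides do not
    state how the category boundaries were actually defined.
    """
    values = np.asarray(values, dtype=float)
    # Interior quantile levels, e.g. 1/3 and 2/3 for three categories.
    probs = np.linspace(0.0, 1.0, n_categories + 1)[1:-1]
    cuts = np.quantile(values, probs)
    # searchsorted yields 0..n_categories-1; shift to 1-based category codes.
    return np.searchsorted(cuts, values, side="right") + 1

# Hypothetical yields (t/ha) of six genotypes in one environment.
yield_t_ha = [3.1, 4.8, 4.0, 5.2, 3.6, 4.4]
yield_cat = to_ordinal(yield_t_ha, 3)
print(yield_cat)
```

Quantile cut points keep the categories roughly equally filled, which avoids sparse cells in the frequency tables built afterwards; fixed agronomic thresholds would be an equally plausible choice.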
Non-linear Canonical Correlation Analysis (NCCA)

NCCA is based on multivariate contingency tables containing frequency counts.
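As a small illustration of such a frequency table, the sketch below cross-tabulates two categorised variables. The category codes and the helper `contingency_table` are hypothetical; a full multivariate table would cross all variables in the same way.

```python
import numpy as np

def contingency_table(a, b, n_a, n_b):
    """Cross-tabulate two variables coded 1..n_a and 1..n_b."""
    table = np.zeros((n_a, n_b), dtype=int)
    # np.add.at accumulates correctly when an index pair occurs repeatedly.
    np.add.at(table, (np.asarray(a) - 1, np.asarray(b) - 1), 1)
    return table

# Hypothetical ordinal codes (1 = low, 3 = high) for eight observations.
yield_cat = [1, 3, 2, 3, 1, 2, 3, 1]
mildew_cat = [3, 1, 2, 1, 3, 3, 2, 2]

# Rows: yield category; columns: mildew severity category.
counts = contingency_table(yield_cat, mildew_cat, 3, 3)
print(counts)
```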
Non-linear Canonical Correlation Analysis (NCCA):

main dimensions (principal components) are determined

loadings of variables (overall correlation) are computed

category centroids are quantified

object scores (principal component scores) are computed
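The four outputs listed above can be sketched with a simpler stand-in technique: homogeneity analysis (multiple correspondence analysis) computed from an SVD of the indicator matrix. This is not the NCCA procedure used for the slides, which additionally handles sets of variables and restricted category quantifications. The data, the function names and the use of code-to-score correlations as "loadings" are all assumptions made for illustration.

```python
import numpy as np

def indicator_matrix(codes, n_categories):
    """One-hot code each variable (categories 1..k) and stack the blocks."""
    blocks = [np.eye(k)[codes[:, j] - 1] for j, k in enumerate(n_categories)]
    return np.hstack(blocks)

def homogeneity_sketch(codes, n_categories, n_dims=2):
    """Stand-in for NCCA: correspondence analysis of the indicator matrix.

    Every category must be observed at least once, otherwise a column
    mass is zero and the standardisation below divides by zero.
    """
    codes = np.asarray(codes)
    Z = indicator_matrix(codes, n_categories)
    P = Z / Z.sum()
    r = P.sum(axis=1)  # row masses (objects)
    c = P.sum(axis=0)  # column masses (categories)
    # Standardised deviations from independence.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    # Main dimensions: singular values, largest first (inertia = sv ** 2).
    inertia = sv[:n_dims] ** 2
    # Object scores: principal coordinates of the rows.
    scores = U[:, :n_dims] * sv[:n_dims] / np.sqrt(r)[:, None]
    # Category centroids: mean score of the objects in each category.
    centroids = (Z.T @ scores) / Z.sum(axis=0)[:, None]
    # Crude "loadings" (assumption): correlation of the raw ordinal codes
    # with the object scores on each dimension.
    loadings = np.array([[np.corrcoef(codes[:, j], scores[:, d])[0, 1]
                          for d in range(n_dims)]
                         for j in range(codes.shape[1])])
    return inertia, scores, centroids, loadings

# Hypothetical codes: 6 objects x 3 variables (e.g. yield, mildew, lodging).
codes = np.array([[1, 3, 1], [3, 1, 2], [2, 2, 2],
                  [3, 1, 1], [1, 3, 2], [2, 2, 1]])
inertia, scores, centroids, loadings = homogeneity_sketch(codes, [3, 3, 2])
```

Plotting `scores` and `centroids` on the first two dimensions would give the kind of joint display of objects and categories that the method is used for here.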
Characterisation of environments

based on data adjusted for G main effects (= residuals)
Characterisation of genotypes

based on data adjusted for E main effects (= residuals)
Characterisation of genotypes & environments based on:

raw data

data adjusted for E main effects

data adjusted for G & E main effects (G x E interaction)
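The three data forms can be sketched for a single trait as follows. The example assumes a complete, balanced genotype-by-environment table and treats main effects as deviations of row and column means from the grand mean; the numbers and variable names are hypothetical, and the slides do not state exactly how the adjustment was computed.

```python
import numpy as np

# Hypothetical trait matrix: rows = genotypes (G), columns = environments (E).
y = np.array([[4.0, 5.0, 3.0],
              [5.0, 7.0, 3.0],
              [3.0, 6.0, 3.0]])

grand_mean = y.mean()
g_effect = y.mean(axis=1, keepdims=True) - grand_mean  # G main effects
e_effect = y.mean(axis=0, keepdims=True) - grand_mean  # E main effects

# Data adjusted for E main effects: what remains characterises genotypes.
adj_e = y - e_effect
# Data adjusted for G main effects: what remains characterises environments.
adj_g = y - g_effect
# Data adjusted for G and E main effects: the G x E interaction residuals.
adj_ge = y - grand_mean - g_effect - e_effect
```

After the double adjustment every row and every column of `adj_ge` sums to zero, so only genotype-specific responses to particular environments remain.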
Conclusions & outlook

NCCA is an intuitive method well suited to visualising the main features of multivariate data measured on various scales.

NCCA is useful for obtaining an overall orientation of G properties and E characteristics.

Future work: refinements to obtain a better synopsis of the E-specific performance of Gs in relation to their property profiles. Include AMMI and clustering (biclassification) results in NCCA; organise data as environment-specific sets of variables.
Characterisation of genotype performance in individual environments based on:

raw yield- and disease data

disease main effects of Gs

environmental disease variability of Gs (= standard deviation of E adjusted data)
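The two derived disease parameters can be sketched as follows, assuming a complete genotype-by-environment table of severity scores. The values are hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np

# Hypothetical disease severity (%): rows = genotypes, columns = environments.
severity = np.array([[10.0, 20.0, 30.0],
                     [12.0, 22.0, 32.0],
                     [ 5.0, 30.0, 25.0]])

# Remove E main effects by subtracting each environment's mean severity.
e_adjusted = severity - severity.mean(axis=0, keepdims=True)

# Disease main effect of each genotype: its mean E-adjusted severity.
g_main = e_adjusted.mean(axis=1)

# Environmental disease variability of each genotype: standard deviation of
# its E-adjusted severities across environments (sample SD assumed).
g_sd = e_adjusted.std(axis=1, ddof=1)
```

In this made-up table the first two genotypes follow the environmental trend closely and get the same, smaller SD, while the third responds unevenly and gets a larger one.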