how to solve biological problems with math 2012 23 mars 2012


Upload: faustine-dufour

Post on 04-Apr-2015


How to solve biological problems with math 2012

23 March 2012


Phenotypic variation:

What is association?

Genetic variation yields phenotypic variation

[Figure: a chromosome carrying SNPs and a trait variant, with the distributions of the “trait” in the population with one allele and in the population with the other allele.]


Quantifying Significance


T-test

t-value (significance) can be translated into p-value (probability)
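As a rough illustration of how a t-value is turned into a p-value, here is a minimal Python sketch of a two-sample t-test. It assumes equal variances in the two groups (the pooled-variance form of the test), and the trait values are invented for illustration. The p-value is obtained by numerically integrating the density of Student's t distribution, so only the standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the probability mass between -|t| and |t|."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    # Simpson's rule on [0, |t|]; steps must be even.
    area = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)


def student_t_test(a, b):
    """Two-sample t-test with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical trait values for carriers of two different alleles.
group_a = [4.1, 5.0, 5.6, 4.8, 5.3, 4.9]
group_b = [5.9, 6.4, 5.7, 6.8, 6.1, 6.3]
t, df, p = student_t_test(group_a, group_b)
print(f"t = {t:.2f}, df = {df}, p = {p:.4g}")
```

A larger |t| leaves less probability mass in the tails, hence a smaller p-value.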


Association using regression

[Figure: phenotype (y-axis) plotted against genotype and against coded genotype (x-axes).]


Regression analysis

[Figure: regression of Y (“response”) on X (“feature(s)”), with the “intercept”, “coefficients” and “residuals” labelled.]
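The labelled quantities on this slide fit the usual linear regression model. As a sketch (the symbols and index notation below are assumed, not taken from the slide):

```latex
y_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i
```

Here $y_i$ is the response of observation $i$, $x_{ij}$ are its features, $\beta_0$ is the intercept, $\beta_j$ are the coefficients, and $\varepsilon_i$ are the residuals.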


Regression formalism

Terms of the regression equation:

• (monotonic) transformation
• phenotype (response variable) of individual i
• effect size (regression coefficient)
• coded genotype (feature) of individual i
• error (residual)
• p(β=0)

Goal: find the effect size that best explains all (potentially transformed) phenotypes as a linear function of the genotypes, and estimate the p-value, i.e. the probability of observing data at least this extreme under the null hypothesis (no effect).
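The goal stated here can be sketched in Python for the simplest case: one coded genotype, no covariates, and no transformation of the phenotype. The data and function names are hypothetical. The effect size is the least-squares slope, and its p-value comes from a t statistic with n − 2 degrees of freedom, integrated numerically so that only the standard library is required.

```python
import math
from statistics import mean


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of a t statistic (Simpson integration of the density)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)


def simple_regression(x, y):
    """Fit y = a + beta * x; return (a, beta, se_beta, t, p) for beta = 0."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx                 # effect size
    a = my - beta * mx               # intercept
    rss = sum((yi - a - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    t = beta / se_beta
    return a, beta, se_beta, t, t_two_sided_p(t, n - 2)


# Hypothetical data: genotype coded as the number of minor alleles (0, 1, 2).
genotype = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
phenotype = [1.0, 1.3, 1.4, 1.9, 1.6, 2.3, 2.0, 0.9, 1.7, 2.4]
a, beta, se, t, p = simple_regression(genotype, phenotype)
print(f"effect size = {beta:.3f} +/- {se:.3f}, t = {t:.2f}, p = {p:.3g}")
```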


Matlab function for Linear regression

• [x p tmp se] = regress_p(pheno,[ones(length(pheno),1) COV1 COV2 Genotype ])
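`regress_p` appears to be a helper supplied with the course rather than a built-in Matlab function, and its internals are not shown in these slides. The Python sketch below only illustrates the idea behind the call: build a design matrix with a column of ones (the intercept), the covariates, and the coded genotype, then solve the least-squares problem. That the outputs `x` and `se` correspond to coefficients and standard errors is an assumption based on their names; all data and function names in the sketch are hypothetical.

```python
import math


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X, y):
    """Ordinary least squares: return (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(r * r for r in resid) / (n - k)   # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


# Hypothetical data: two covariates (age, sex) and a coded genotype.
cov1 = [30, 45, 52, 38, 61, 47, 55, 33]
cov2 = [0, 1, 0, 1, 1, 0, 0, 1]
geno = [0, 1, 2, 1, 0, 2, 1, 0]
pheno = [5.1, 6.2, 8.9, 5.9, 7.4, 8.1, 8.0, 4.6]

# Same column order as in the call above: ones, COV1, COV2, Genotype.
X = [[1.0, c1, c2, g] for c1, c2, g in zip(cov1, cov2, geno)]
beta, se = ols(X, pheno)
print("genotype effect:", round(beta[3], 3), "+/-", round(se[3], 3))
```

Including the covariates in the same matrix means the genotype coefficient is estimated after adjusting for them.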


Logistic regression

• Widely used in epidemiology
• Outcome variable: dichotomous
• The disease is characterized by a risk
• Expresses, as a risk (or probability), the relationship between a dichotomous variable Y and several variables X (risk factors), which may be qualitative or quantitative

Logistic regression

• The association between the risk factors and the disease (the betas) is estimated by the maximum likelihood method
• Odds ratio: strength of the association between one factor and the disease (it approximates the relative risk when the disease is rare)

The logistic model

[Figure: probability of heart disease as a function of age. x-axis: AGE (10 to 70); y-axis: Prob(Y=1 | X) (0.0 to 1.0).]

Probability of the outcome, as a function of z

z is a measure of the total contribution of all the independent variables used in the model and is known as the logit.
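These two labels annotate the logistic function, whose formula did not survive in the transcript. Written out in standard form (the symbols $\beta_0, \dots, \beta_k$ and $x_1, \dots, x_k$ are assumed notation, not taken from the slide):

```latex
P(Y = 1 \mid X) = f(z) = \frac{1}{1 + e^{-z}},
\qquad
z = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```

Inverting the first expression gives $z = \ln\!\left(\frac{P}{1 - P}\right)$, the log of the odds, which is why z is called the logit.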


The application of a logistic regression may be illustrated using a fictitious example of death from heart disease. This simplified model uses only three risk factors (age, sex, and blood cholesterol level) to predict the 10-year risk of death from heart disease. These are the parameters that the data fit:

The model can hence be expressed as

In this model, increasing age is associated with an increasing risk of death from heart disease (z goes up by 2.0 for every year over the age of 50), female sex is associated with a decreased risk of death from heart disease (z goes down by 1.0 if the patient is female), and increasing cholesterol is associated with an increasing risk of death (z goes up by 1.2 for each 1 mmol/L increase in cholesterol above 5 mmol/L).

We wish to use this model to predict a particular subject's risk of death from heart disease: he is 50 years old and his cholesterol level is 7.0 mmol/L. The subject's risk of death is therefore

This means that by this model, the subject's risk of dying from heart disease in the next 10 years is 0.07 (or 7%).
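The equations for this example did not survive in the transcript, so the calculation can only be sketched. In the Python snippet below, the three coefficients (2.0, −1.0, 1.2) and the reference values (age 50, cholesterol 5 mmol/L) come from the text. The intercept of −5.0 is an assumption: it is not stated here, and is chosen because it reproduces the quoted 7% risk. The function name is hypothetical.

```python
import math

# Coefficients stated in the text.
BETA_AGE = 2.0     # per year over the age of 50
BETA_SEX = -1.0    # applied if the patient is female
BETA_CHOL = 1.2    # per mmol/L of cholesterol above 5 mmol/L

# ASSUMPTION: the intercept is not given in the transcript; -5.0 is chosen
# because it reproduces the 7% risk quoted for the example subject.
INTERCEPT = -5.0


def risk_of_death(age, female, cholesterol):
    """10-year risk of death from heart disease under the example model."""
    z = (INTERCEPT
         + BETA_AGE * (age - 50)
         + BETA_SEX * (1 if female else 0)
         + BETA_CHOL * (cholesterol - 5.0))
    return 1.0 / (1.0 + math.exp(-z))


# The subject from the text: male, 50 years old, cholesterol 7.0 mmol/L.
risk = risk_of_death(age=50, female=False, cholesterol=7.0)
print(f"{risk:.2f}")  # prints 0.07
```

For this subject only the cholesterol term contributes (2 mmol/L above the reference), giving z = −5.0 + 1.2 × 2 = −2.6.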


Odds ratio

• The odds ratio (also called the approximate relative risk) is a statistical measure of the degree of dependence between qualitative random variables.
• It measures the effect of a factor.
• It is the ratio of the odds of an event (for example, a disease) occurring in a group of people A to the odds of the same event occurring in another group B.
• If the probability of the event is p in group A and q in group B, the odds ratio is:

Odds ratio (OR) = [p / (1 − p)] / [q / (1 − q)] = p(1 − q) / (q(1 − p))
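A small Python sketch of this definition, using the standard formula OR = [p / (1 − p)] / [q / (1 − q)], where p and q are the probabilities of the event in groups A and B. The function names and the example numbers are hypothetical.

```python
def odds(prob):
    """Convert a probability (strictly between 0 and 1) into odds."""
    if not 0.0 < prob < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return prob / (1.0 - prob)


def odds_ratio(p, q):
    """Odds ratio of group A (probability p) versus group B (probability q)."""
    return odds(p) / odds(q)


def odds_ratio_from_counts(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: cases in group A,  b: non-cases in group A,
    c: cases in group B,  d: non-cases in group B.
    """
    return (a * d) / (b * c)


# Hypothetical example: the disease affects 20% of group A and 10% of group B.
print(odds_ratio(0.2, 0.1))
print(odds_ratio_from_counts(20, 80, 10, 90))
```

Both calls describe the same situation, once through probabilities and once through counts, so they give the same value. In this example the relative risk is 0.2 / 0.1 = 2, slightly below the odds ratio; the two move closer together as the event becomes rarer.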


Matlab function for logistic regression

• [p0 x0 se0] = log_reg(Pheno,[COV1 COV2 ],Geno)
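`log_reg` looks like a course helper whose implementation is not shown in these slides. Reading its outputs as a p-value, a coefficient, and a standard error is an assumption based on their names. The Python sketch below illustrates maximum-likelihood fitting of a logistic model with an intercept and a single genotype term, using Newton-Raphson steps. Covariates are left out for brevity, the p-value uses a Wald test with a normal approximation, and the data are invented.

```python
import math
from statistics import NormalDist


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by maximum likelihood.

    Returns (b0, b1, se_b1, p_value) where p_value tests b1 = 0.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        # Gradient (g0, g1) of the log-likelihood and information matrix
        # [[h00, h01], [h01, h11]] at the current estimate.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            prob = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = prob * (1.0 - prob)
            g0 += yi - prob
            g1 += (yi - prob) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: information^-1 * gradient (2x2 inverse written out).
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    # Standard error of b1 from the inverse information matrix, evaluated
    # just before the final (negligibly small) step.
    se_b1 = math.sqrt(h00 / det)
    wald_z = b1 / se_b1
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(wald_z)))
    return b0, b1, se_b1, p_value


# Hypothetical data: 100 carriers (x = 1) with 20 cases,
# 100 non-carriers (x = 0) with 10 cases.
geno = [1] * 100 + [0] * 100
pheno = [1] * 20 + [0] * 80 + [1] * 10 + [0] * 90
b0, b1, se_b1, p_value = fit_logistic(geno, pheno)
print(f"beta = {b1:.3f}, se = {se_b1:.3f}, "
      f"OR = {math.exp(b1):.2f}, p = {p_value:.3f}")
```

With a single binary predictor, the exponential of the fitted coefficient equals the odds ratio computed directly from the 2x2 table of counts, which gives a convenient check on the fit.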