Genetic variation in the genome
![Page 1: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/1.jpg)
Genetic variation in the genome
A G A G T T C T G C T C GA G G G T T A T G C G C G
A G A G T T C T G C T C GA G G G T T A T G C G C G
A G A G T T C T G C T C GA G G G T T A T G C G C G
A G A G T T C T G C T C GA G G G T T A T G C G C G
![Page 3: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/3.jpg)
International HapMap Project (http://www.hapmap.org)
![Page 4: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/4.jpg)
International HapMap Project (http://www.hapmap.org)
Biomedical applications
1. Making genotype data available for different ethnic groups
2. Selection of tagSNPs for association studies -> potential for whole-genome association studies
3. Assessment of statistical significance and interpretation of results
4. Study of the less common alleles
5. Study of structural variation
6. Pharmacogenomics
![Page 5: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/5.jpg)
Genetic variation databases
Online Mendelian Inheritance in Man
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
Catalog of human genetic and genomic disorders
International HapMap Project
http://www.hapmap.org
Personalized Genomes: J. Watson’s genome
http://jimwatsonsequence.cshl.edu/cgi-perl/gbrowse/jwsequence/
Entrez dbSNP http://www.ncbi.nlm.nih.gov/projects/SNP/
Database of Genotype and Phenotype
http://view.ncbi.nlm.nih.gov/dbgap
MamPol
http://mampol.uab.cat
DPDB
http://dpdb.uab.cat
Human Genome Variation Database
http://hgvbase.cgb.ki.se/
![Page 6: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/6.jpg)
Human genetic & phenotypic diversity database
Association studies: phenotypic effect of SNPs
[Figure: a genotype table (SNP1, SNP2, SNP3, ... for the sequences of individual 1, individual 2, ...; genotypes shown: A/A, A/C, G/C, C/C, G/T, T/T) is linked to a phenotype table (Disease 1, Healthy, ..., Trait i) to estimate the phenotypic effect of each SNP. Plot axes: x1, x2. Example phenotype: cervical cancer.]
![Page 7: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/7.jpg)
BioBanks: Studies of cohorts at a great scale
• deCODE (Iceland)
• Estonia
• Germany
• Canada
• Japan
• China
• USA
![Page 8: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/8.jpg)
Association Studies
![Page 9: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/9.jpg)
Association Studies
•Study design
•Statistical analyses
![Page 10: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/10.jpg)
1st phase: Design
Study designs
![Page 11: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/11.jpg)
1st phase: Design
Study designs
![Page 12: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/12.jpg)
Statistical analysis methods
2nd phase: Statistical analysis
![Page 13: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/13.jpg)
Statistical analyses in Association Studies
2nd phase: Steps
1. Data validation
2. Genetic description
   1. Unidimensional (SNP by SNP)
   2. Multidimensional
3. Test for genotype-phenotype association
   1. SNP by SNP
   2. Multi-SNP / haplotype / tagSNP
   3. Power assessment
4. Predictive model
![Page 14: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/14.jpg)
Statistical analyses in Association Studies
Step 1. Data validation (error sources: sampling, genotyping)
• Checking with SNPref
• Hardy-Weinberg proportions (separately for controls and cases)
• Consistency among samples
• Stratification (genetic markers)
![Page 15: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/15.jpg)
Hardy-Weinberg Test
SNP rs1137933
Diallelic SNP: alleles A and a, with relative frequencies p and q.
Genotypic HW proportions for AA, Aa and aa: $p^2$, $2pq$ and $q^2$.
Three statistics:
(i) the Pearson ($\chi^2$) test statistic
(ii) the likelihood ratio test statistic (G test)
(iii) an exact test

Genotype frequencies, observed (expected under HW):

| | CT | CC | TT |
|---|---|---|---|
| Control | 38 (50.1) | 76 (70.0) | 15 (8.9) |
| Case | 105 (95.2) | 122 (126.9) | 13 (17.9) |
![Page 16: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/16.jpg)
Example of Hardy-Weinberg Test
SNP rs1137933, control genotypes

$p = f(C) = f(CC) + f(CT)/2$, $q = f(T) = 1 - p$

| Genotype | CT | CC | TT | Total |
|---|---|---|---|---|
| Number, obs | 38 | 76 | 15 | 129 = N |
| Frequency, exp | $2pq$ | $p^2$ | $q^2$ | 1.00 |
| Number, exp | $2pqN$ | $p^2N$ | $q^2N$ | N |
| Number, exp | 50.1 | 70.0 | 8.9 | 129 |

Pearson ($\chi^2$) test statistic: $X^2 = \sum_i (O_i - E_i)^2 / E_i$
Likelihood ratio (G) test statistic: $G = 2 \sum_i O_i \ln(O_i / E_i)$
![Page 17: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/17.jpg)
Example of Hardy-Weinberg Test
SNP rs1137933

| | CT | CC | TT |
|---|---|---|---|
| Control | 38 (50.1) | 76 (70.0) | 15 (8.9) |
| Case | 105 (95.2) | 122 (126.9) | 13 (17.9) |

Control (129 genotypes)
$p_1 = f(C) = 0.736$, $p_2 = f(T) = 0.264$
Chi-square (1 df) = 7.5**, p = 0.00617
G (likelihood ratio) (1 df) = 7.06**, p = 0.00788

Case (240 genotypes)
$p_1 = f(C) = 0.727$, $p_2 = f(T) = 0.273$
Chi-square (1 df) = 2.52 ns, p = 0.11241
G (likelihood ratio) (1 df) = 2.63 ns, p = 0.10486
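The Hardy-Weinberg statistics for SNP rs1137933 can be recomputed from the observed genotype counts. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the slides, and the p-value helper is valid only for 1 degree of freedom.

```python
import math


def chi2_tail_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


def hardy_weinberg_test(n_cc, n_ct, n_tt):
    """Return (p, chi2, g, p_value_chi2, p_value_g) for a diallelic SNP."""
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of allele C
    q = 1.0 - p                      # frequency of allele T
    observed = [n_cc, n_ct, n_tt]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    # Pearson statistic: sum of (O - E)^2 / E
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Likelihood ratio statistic: 2 * sum of O * ln(O / E)
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    return p, chi2, g, chi2_tail_1df(chi2), chi2_tail_1df(g)


# Genotype counts taken from the slide.
control = hardy_weinberg_test(n_cc=76, n_ct=38, n_tt=15)
case = hardy_weinberg_test(n_cc=122, n_ct=105, n_tt=13)

for label, result in (("Control", control), ("Case", case)):
    p, chi2, g, p_chi2, p_g = result
    print(f"{label}: f(C)={p:.3f} chi2={chi2:.2f} (p={p_chi2:.5f}) G={g:.2f} (p={p_g:.5f})")
```

Controls depart from Hardy-Weinberg proportions at the 1% level while cases do not, which is why the slides test the two groups separately.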
![Page 18: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/18.jpg)
Genetic description: SNP by SNP
SNP rs1137933

Genotype frequencies

| | CT | CC | TT |
|---|---|---|---|
| Control | 38 (29.5%) | 76 (58.9%) | 15 (11.6%) |
| Case | 105 (43.8%) | 122 (50.8%) | 13 (5.4%) |

Allele frequencies

| | C | T |
|---|---|---|
| Control | 190 (73.6%) | 68 (26.4%) |
| Case | 349 (72.7%) | 131 (27.3%) |
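Allele counts follow directly from genotype counts: each homozygote contributes two copies of its allele and each heterozygote one copy of each. A short sketch, with a hypothetical helper name:

```python
def allele_counts(n_cc, n_ct, n_tt):
    """Return (count of C, count of T) from genotype counts."""
    return 2 * n_cc + n_ct, 2 * n_tt + n_ct


control_c, control_t = allele_counts(n_cc=76, n_ct=38, n_tt=15)
case_c, case_t = allele_counts(n_cc=122, n_ct=105, n_tt=13)

print(control_c, control_t)  # 190 68
print(case_c, case_t)        # 349 131
print(f"{control_c / (control_c + control_t):.1%}")  # 73.6%
```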
![Page 19: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/19.jpg)
Genetic description: MultiSNP
Haplotype inference
Haplotype 1 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 2 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 3 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 4 acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
Haplotype 5 acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
Haplotype 6 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 7 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 8 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 9 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag

Genotypes -> possible haplotypes
a/t g/c ->
a) a g and t c
b) a c and t g
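The phase ambiguity shown above can be enumerated mechanically: a genotype that is heterozygous at k SNPs is compatible with 2^(k-1) distinct haplotype pairs. The sketch below only lists the candidate pairs; it does not estimate which one is most likely. The function name and input encoding are assumptions for illustration.

```python
from itertools import product


def possible_haplotype_pairs(genotype):
    """List haplotype pairs compatible with an unphased genotype.

    genotype: list of (allele, allele) tuples, one per SNP.
    """
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    pairs = set()
    # Keep the first heterozygous site fixed so mirror-image pairs are not repeated.
    for flips in product([False, True], repeat=max(len(het_sites) - 1, 0)):
        flip_at = dict(zip(het_sites[1:], flips))
        hap1, hap2 = [], []
        for i, (a, b) in enumerate(genotype):
            if flip_at.get(i, False):
                a, b = b, a
            hap1.append(a)
            hap2.append(b)
        pairs.add(tuple(sorted(("".join(hap1), "".join(hap2)))))
    return sorted(pairs)


# The two-SNP genotype from the slide: a/t at the first SNP, g/c at the second.
pairs = possible_haplotype_pairs([("a", "t"), ("g", "c")])
print(pairs)  # [('ac', 'tg'), ('ag', 'tc')]
```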
![Page 20: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/20.jpg)
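Haplotype frequency estimates

| Haplotype | rs1042522 | rs12951053 | rs8064946 | rs6541003 | rs4846049 | rs4646421 | rs4986885 | rs915907 | rs4147567 | rs2266633 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | G | A | G | G | T | C | G | C | G | G | 0.1056 |
| 2 | G | A | G | A | G | C | G | C | G | G | 0.0767 |
| 3 | G | A | G | A | G | C | G | C | A | G | 0.0485 |
| 4 | G | A | G | A | G | C | G | A | G | G | 0.0423 |
| 5 | G | A | C | G | G | T | G | A | A | A | 0.0378 |
| 6 | G | A | C | A | G | T | A | A | A | A | 0.0282 |
| 7 | G | A | G | G | G | C | G | C | A | G | 0.0276 |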
![Page 21: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/21.jpg)
Genetic description: MultiSNP
Linkage disequilibrium measure (Lewontin's D')

| | B1 | B2 | Total |
|---|---|---|---|
| A1 | $p_{11} = p_1 q_1 + D$ | $p_{12} = p_1 q_2 - D$ | $p_1$ |
| A2 | $p_{21} = p_2 q_1 - D$ | $p_{22} = p_2 q_2 + D$ | $p_2$ |
| Total | $q_1$ | $q_2$ | 1 |

$D' = D / D_{max}$
$r = D / \sqrt{p_1 p_2 q_1 q_2}$
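As a sketch of how D, D' and r follow from the four haplotype frequencies in the table, the function below uses the usual convention for D_max (min(p1 q2, p2 q1) when D is positive, min(p1 q1, p2 q2) when D is negative) and returns a signed D'. That convention and the example frequencies are assumptions; the slides give no numerical example.

```python
import math


def linkage_disequilibrium(p11, p12, p21, p22):
    """Return (D, D', r) from haplotype frequencies A1B1, A1B2, A2B1, A2B2.

    Both loci must be polymorphic, otherwise r is undefined.
    """
    p1 = p11 + p12          # frequency of allele A1
    q1 = p11 + p21          # frequency of allele B1
    p2, q2 = 1.0 - p1, 1.0 - q1
    d = p11 - p1 * q1
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    d_prime = d / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p1 * p2 * q1 * q2)
    return d, d_prime, r


# Hypothetical haplotype frequencies, for illustration only.
d, d_prime, r = linkage_disequilibrium(0.5, 0.1, 0.1, 0.3)
print(f"D={d:.3f} D'={d_prime:.3f} r={r:.3f}")
```

With equal allele frequencies at both loci, as in this example, D' and r coincide; in general D' reaches 1 whenever one of the four haplotypes is absent, while r reaches 1 only when two are absent.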
![Page 22: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/22.jpg)
Linkage Disequilibrium representation
[Figure labels: linkage blocks, recombination hotspot, associated sites, tagSNPs]
![Page 23: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/23.jpg)
Statistical analyses in Association Studies
Steps
1. Data validation
2. Genetic description
   1. Unidimensional (SNP by SNP)
   2. Multidimensional
3. Test for genotype-phenotype association
   1. SNP by SNP
   2. Multi-SNP / haplotype / tagSNP
   3. Power assessment
4. Predictive model
![Page 24: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/24.jpg)
Genetic-phenotype association -> Guilty by association
Case-control study

| SNP | Case | Control | Type |
|---|---|---|---|
| SNP1 (G/C) | 40% G, 60% C | 40% G, 60% C | Neutral SNP |
| SNP2 (A/T) | 100% A, 0% T | 0% A, 100% T | Mendelian SNP |
| SNP3 (T/G) | 80% T, 20% G | 60% T, 40% G | QTL SNP |
| SNPn | | | |
![Page 25: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/25.jpg)
Test for association (SNP by SNP)
SNP rs1137933
Chi-square independence test

Genotypic

| | CT | CC | TT |
|---|---|---|---|
| Control | 38 (29.5%) | 76 (58.9%) | 15 (11.6%) |
| Case | 105 (43.8%) | 122 (50.8%) | 13 (5.4%) |

Chi-square (2 df) = 9.71**, p = 0.00779
G (likelihood ratio) (2 df) = 9.67**, p = 0.00795

Allele

| | C | T |
|---|---|---|
| Control | 190 (73.6%) | 68 (26.4%) |
| Case | 349 (72.7%) | 131 (27.3%) |

Chi-square (1 df) = 0.07, p = 0.79134
G (likelihood ratio) (1 df) = 0.07, p = 0.79134
Odds ratio (OR) = 1.05
Risk ratio (RR) = 1.02
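The chi-square independence statistics for SNP rs1137933 can be recomputed from the two contingency tables. This is a minimal standard-library sketch; the function names are illustrative, and the p-value helper covers only the 1 and 2 degrees of freedom needed here.

```python
import math


def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def chi2_p_value(chi2, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


# Rows: control, case. Columns: CT, CC, TT (genotypic) and C, T (allelic).
genotypic = [[38, 76, 15], [105, 122, 13]]
allelic = [[190, 68], [349, 131]]

geno_chi2, geno_df = chi_square_independence(genotypic)
alle_chi2, alle_df = chi_square_independence(allelic)
odds_ratio = (allelic[0][0] * allelic[1][1]) / (allelic[0][1] * allelic[1][0])

print(f"genotypic: chi2={geno_chi2:.2f} df={geno_df} p={chi2_p_value(geno_chi2, geno_df):.5f}")
print(f"allelic:   chi2={alle_chi2:.2f} df={alle_df} OR={odds_ratio:.2f}")
```

The genotypic test is significant while the allelic test is not: cases and controls carry nearly the same allele frequencies but distribute them differently across genotypes, with an excess of heterozygotes among cases.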
![Page 26: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/26.jpg)
Odds ratio
The odds of an event are the ratio of probabilities $p / (1 - p)$, where p is the probability of the event.
For a disease with a 1 in 5 probability of occurring for a given genotype (i.e. 0.2 or 20%), the odds are 0.2 / (1 − 0.2) = 0.2 / 0.8 = 0.25.
The odds ratio is defined as the ratio of the odds of an event occurring in one group to the odds of it occurring in another group. These groups might be case and control groups, or any other dichotomous classification. So if the probabilities of the event in each of the groups are p (first group) and q (second group), then the odds ratio is:
$OR = \dfrac{p / (1 - p)}{q / (1 - q)}$
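The two definitions translate directly into code. In this sketch the first call reproduces the 1-in-5 example from the text; the second probability (0.1) is a made-up value used only to show an odds ratio.

```python
def odds(p):
    """Odds of an event with probability p (0 <= p < 1)."""
    return p / (1 - p)


def odds_ratio(p, q):
    """Ratio of the odds in a first group (probability p) to a second group (probability q)."""
    return odds(p) / odds(q)


example_odds = odds(0.2)           # the 1-in-5 example
example_or = odds_ratio(0.2, 0.1)  # 0.1 is a hypothetical second-group probability

print(f"odds = {example_odds:.2f}")
print(f"odds ratio = {example_or:.2f}")
```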
![Page 27: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/27.jpg)
Odds ratio

| | Cases | Controls | Total |
|---|---|---|---|
| Allele 1 SNP1 | a | b | a+b |
| Allele 2 SNP1 | c | d | c+d |
| Total | a+c | b+d | N |

The ratio a/c is the odds of exposure observed in the case group. The ratio b/d is the odds of exposure in the control group. The odds ratio is their quotient, $OR = (a/c) / (b/d)$.
OR = 2.2 -> 2.2:1
The odds of an effect (disease) are 2.2 times higher when another variable (the SNP allele) is present than when it is absent.
![Page 28: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/28.jpg)
Relative risk (RR, risk ratio)
RR = incidence rate in the exposed / incidence rate in the unexposed

| | Cases | Controls | Total |
|---|---|---|---|
| Allele 1 SNP1 | a | b | a+b |
| Allele 2 SNP1 | c | d | c+d |
| Total | a+c | b+d | N |

Relative risk: $RR = \dfrac{a / (a+b)}{c / (c+d)}$
![Page 29: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/29.jpg)
| | Cases | Controls | Total |
|---|---|---|---|
| Allele 1 SNP1 | 210 | 250 | 460 |
| Allele 2 SNP1 | 100 | 300 | 400 |
| Total | 310 | 550 | 860 |

Relative risk = (210/460) / (100/400) = 1.83
Odds ratio = (210/100) / (250/300) = 2.52
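The two measures can be checked against the 2x2 table with a few lines of Python. The function names are illustrative; a, b, c and d follow the cell layout used in the preceding tables.

```python
def relative_risk(a, b, c, d):
    """Risk among carriers of allele 1 divided by risk among carriers of allele 2."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    return (a / c) / (b / d)


rr = relative_risk(210, 250, 100, 300)
or_ = odds_ratio(210, 250, 100, 300)

print(round(rr, 2))   # 1.83
print(round(or_, 2))  # 2.52
```

The odds ratio is further from 1 than the relative risk here because the outcome is common in this table; the two measures only approximate each other when the outcome is rare.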
![Page 30: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/30.jpg)
Controlling for other independent variables
SNP rs1137933, genotypic test stratified by sex (♀, ♂)

Stratum with 55 genotypes

| | CC | CT | TT |
|---|---|---|---|
| Control | 15 (62.5%) | 6 (25%) | 3 (12.5%) |
| Case | 16 (51.6%) | 13 (41.9%) | 2 (6.5%) |

Chi-square (2 df) = 1.95, p = 0.37719
G (likelihood ratio) (2 df) = 1.98, p = 0.37158

Stratum with 314 genotypes

| | CC | CT | TT |
|---|---|---|---|
| Control | 61 (58.1%) | 32 (30.5%) | 12 (11.4%) |
| Case | 106 (50.7%) | 92 (44%) | 11 (5.3%) |

Chi-square (2 df) = 7.59*, p = 0.02248
G (likelihood ratio) (2 df) = 7.5*, p = 0.02352
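A stratified analysis simply repeats the independence test within each level of the covariate. The sketch below recomputes the Pearson statistic for the two strata on this slide; the strata are identified only by their sample sizes, and the function name is an assumption.

```python
import math


def chi_square_2df(table):
    """Pearson chi-square and p-value for a 2 x 3 table (2 degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(col_totals))
    )
    # For 2 degrees of freedom the upper tail is exp(-chi2 / 2).
    return chi2, math.exp(-chi2 / 2)


# Rows: control, case. Columns: CC, CT, TT.
strata = {
    "n = 55": [[15, 6, 3], [16, 13, 2]],
    "n = 314": [[61, 32, 12], [106, 92, 11]],
}

results = {name: chi_square_2df(table) for name, table in strata.items()}
for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2={chi2:.2f} p={p_value:.5f}")
```

Summing the two strata cell by cell recovers the pooled genotype table (for example 15 + 61 = 76 CC controls), so the strata partition the full sample.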
![Page 31: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/31.jpg)
Test for association (multi-SNP)
Test for association between haplotype and response (disease), or between tagSNP and response

| Haplotype | rs1042522 | rs12951053 | rs8064946 | rs6541003 | rs4846049 | rs4646421 | rs4986885 | rs915907 | rs4147567 | rs2266633 | Case | Control |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | G | A | G | G | T | C | G | C | G | G | 0.1056 | 0.1168 |
| 2 | G | A | G | A | G | C | G | C | G | G | 0.0767 | 0.0657 |
| 3 | G | A | G | A | G | C | G | C | A | G | 0.0485 | 0.0405 |
| 4 | G | A | G | A | G | C | G | A | G | G | 0.0423 | 0.0345 |
| 5 | G | A | C | G | G | T | G | A | A | A | 0.0378 | 0.0275 |
| 6 | G | A | C | A | G | T | A | A | A | A | 0.0282 | 0.0185 |
| 7 | G | A | G | G | G | C | G | C | A | G | 0.0276 | 0.0134 |
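One simple way to compare the case and control haplotype frequencies is an odds ratio for each haplotype against all other haplotypes pooled. This is only a descriptive sketch: the slides do not say which test was used, and a significance test would additionally need the number of chromosomes sampled, which is not given. The helper name is hypothetical.

```python
# Estimated haplotype frequencies from the slide: (case, control).
haplotype_freqs = {
    1: (0.1056, 0.1168),
    2: (0.0767, 0.0657),
    3: (0.0485, 0.0405),
    4: (0.0423, 0.0345),
    5: (0.0378, 0.0275),
    6: (0.0282, 0.0185),
    7: (0.0276, 0.0134),
}


def haplotype_odds_ratio(freq_case, freq_control):
    """Odds of carrying the haplotype in cases relative to controls."""
    odds_case = freq_case / (1 - freq_case)
    odds_control = freq_control / (1 - freq_control)
    return odds_case / odds_control


odds_ratios = {
    hap: haplotype_odds_ratio(case, control)
    for hap, (case, control) in haplotype_freqs.items()
}
for hap, value in odds_ratios.items():
    print(f"haplotype {hap}: OR = {value:.2f}")
```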
![Page 32: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/32.jpg)
Logistic regression
Logistic regression is a statistical regression model for binary dependent variables. It can be regarded as a generalized linear model that uses the logit function as its link function and whose errors follow a binomial distribution.
The log odds of the outcome (the logarithm of the probability divided by one minus the probability) are modeled as a linear function of the explanatory variables $X_1$ to $X_k$. For observations $i = 1, \dots, n$ the model can be written as
$\operatorname{logit}(p_i) = \ln \dfrac{p_i}{1 - p_i} = \alpha + \beta_1 x_{1,i} + \dots + \beta_k x_{k,i}$
Each parameter estimate $\beta$ is interpreted through $e^{\beta}$, its multiplicative effect on the odds. For a dichotomous explanatory variable such as sex, $e^{\beta}$ (the antilog of $\beta$) is the estimated odds ratio of having the outcome when males and females are compared. The parameters $\alpha, \beta_1, \dots, \beta_k$ are usually estimated by maximum likelihood.
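To make the link between $\beta$ and the odds ratio concrete, the sketch below fits a logistic model with a single binary predictor by Newton-Raphson, using grouped counts from the relative-risk example (allele 1: 210 cases out of 460; allele 2: 100 cases out of 400). Reusing those counts, the variable coding and the function name are all assumptions made for illustration; in practice a statistics package would be used.

```python
import math


def fit_logistic(groups, iterations=25):
    """Maximum-likelihood fit of logit(p) = alpha + beta * x.

    groups: list of (x, number_of_cases, group_size).
    """
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, cases, size in groups:
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
            residual = cases - size * p      # score contribution
            weight = size * p * (1.0 - p)    # information contribution
            g0 += residual
            g1 += x * residual
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        det = i00 * i11 - i01 * i01
        # Newton step: parameters += inverse(information) * score
        alpha += (i11 * g0 - i01 * g1) / det
        beta += (i00 * g1 - i01 * g0) / det
    return alpha, beta


# x = 1 for allele 1, x = 0 for allele 2.
alpha, beta = fit_logistic([(1, 210, 460), (0, 100, 400)])
print(f"alpha={alpha:.4f} beta={beta:.4f} exp(beta)={math.exp(beta):.2f}")
```

With one binary predictor, $e^{\beta}$ equals the odds ratio computed directly from the 2x2 table (2.52), which is the interpretation stated above.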
![Page 33: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/33.jpg)
Logistic regression is a predictive tool
If the logit coefficient $\beta_1 = 2.303$, the corresponding odds ratio (the exponential function, $e^{\beta_1}$) is 10. We may then say that when the independent variable increases by one unit, the odds that the dependent variable equals 1 increase by a factor of 10, with the other variables held constant.
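A tenfold change in odds is not a tenfold change in probability. The sketch below applies the coefficient from the text to a baseline probability; the baseline of 0.2 is a hypothetical value and the function name is illustrative.

```python
import math


def probability_after_increase(baseline_p, beta, delta_x=1.0):
    """Probability after the predictor rises by delta_x, other variables fixed."""
    baseline_odds = baseline_p / (1 - baseline_p)
    new_odds = baseline_odds * math.exp(beta * delta_x)
    return new_odds / (1 + new_odds)


beta_1 = 2.303
odds_factor = math.exp(beta_1)
new_p = probability_after_increase(0.2, beta_1)

print(f"odds multiply by {odds_factor:.2f}")
print(f"probability goes from 0.20 to {new_p:.2f}")
```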
![Page 34: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/34.jpg)
Links
• http://bioinfo.iconcologia.net/SNPstats (Web tool for association studies)
• http://www.mep.ki.se/genestat/tl/genass_ldmap (Tutorial for association studies)
• http://linkage.rockefeller.edu/soft (Software for genetic analysis)
• http://www.broad.mit.edu/personal/jcbarret/haploview (Haploview)
• http://www.genome.gov/26525384 (Catalog of published GWA studies)
• http://geneticassociationdb.nih.gov (Database of association studies of human diseases)
![Page 35: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/35.jpg)
Association studies: Web resource
http://bioinfo.iconcologia.net/index.php?module=Snpstats
![Page 36: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/36.jpg)
Genetic association -> Guilty by association

| SNP | Patients | Control |
|---|---|---|
| SNP1 (G/C) | 40% G, 60% C | 40% G, 60% C |
| SNP2 (A/T) | 100% A, 0% T | 0% A, 100% T |
| SNP3 (T/G) | 80% T, 20% G | 60% T, 40% G |
| SNPn | | |
![Page 37: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/37.jpg)
Today we can carry out association analyses of thousands of SNPs, which can reveal the genetic basis of diseases.
![Page 38: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/38.jpg)
Translation of genetic-phenotypic information into clinical practice
D.R. Bentley. 2004 Nature 429: 440-445
![Page 39: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/39.jpg)
![Page 40: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/40.jpg)
![Page 41: Variación genética en el genoma](https://reader035.vdocuments.net/reader035/viewer/2022062222/56814e80550346895dbc1ce8/html5/thumbnails/41.jpg)