Lab 13: Association Genetics
December 5, 2011
Goals
• Use Mixed Models and General Linear Models to determine genetic associations.
• Understand the effect of population structure and kinship on associations.
• Use Trait Analysis by aSSociation, Evolution and Linkage (TASSEL) to calculate phenotype-genotype associations.
Mixed Model
Terms in the model:
• Phenotype (response variable) of individual i
• Effect of the target SNP
• Effects of background SNPs
• Population effect (e.g., admixture coefficient from Structure or values of principal components)
• Family effect (kinship coefficient)
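The terms listed on this slide can be assembled into a single model equation. The form below is a sketch only: every symbol is an assumption introduced here for illustration, not notation taken from the slide.

```latex
% Assumed notation (not from the slide):
%   y_i     phenotype of individual i
%   s_i     genotype of individual i at the target SNP, with effect \alpha
%   m_{ij}  genotype at background SNP j, with effect \beta_j
%   q_{ik}  population coefficient k (admixture proportion or PC value), with effect v_k
%   u_i     family (polygenic) random effect, with covariance proportional to the kinship matrix K
%   e_i     residual error
\[
  y_i = \mu + \alpha\, s_i + \sum_{j} \beta_j\, m_{ij} + \sum_{k} v_k\, q_{ik} + u_i + e_i ,
  \qquad \operatorname{Var}(\mathbf{u}) \propto K
\]
```

Dropping the random term u_i leaves a General Linear Model with population structure as a fixed covariate; keeping it gives the Mixed Model.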
Principal Component Analysis (PCA)
• PCA is computationally much more efficient than maximum likelihood methods.
• PCA reduces the dimensionality of the data by transforming correlated variables into uncorrelated variables called principal components.
• PC1 captures as much of the variation as possible; PC2, PC3, and so on each capture as much of the remaining variation as possible.
• Requires elimination of monomorphic markers and imputation of missing values.
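The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the procedure used by TASSEL or any other package: the function name `genotype_pca`, the 0/1/2 allele-count coding, the marker-mean imputation, and the toy data are all assumptions made for the example.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """PCA on an (individuals x markers) matrix of 0/1/2 allele counts.

    Missing genotypes are coded as np.nan (an assumption of this sketch).
    """
    G = np.asarray(genotypes, dtype=float)

    # 1. Eliminate monomorphic markers (no variation among observed genotypes).
    polymorphic = np.nanvar(G, axis=0) > 0
    G = G[:, polymorphic]  # boolean indexing returns a copy

    # 2. Impute missing values with the marker mean (a crude stand-in for
    #    dedicated imputation software).
    marker_means = np.nanmean(G, axis=0)
    missing = np.isnan(G)
    G[missing] = np.take(marker_means, np.where(missing)[1])

    # 3. Center each marker, then take the singular value decomposition.
    G -= G.mean(axis=0)
    U, s, _ = np.linalg.svd(G, full_matrices=False)

    # Scores are the coordinates of each individual on the leading PCs.
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Toy data: 6 individuals, 4 markers; the third marker is monomorphic.
example = np.array([
    [0, 0,      2, 1],
    [0, 1,      2, 1],
    [0, np.nan, 2, 0],
    [2, 2,      2, 1],
    [2, 2,      2, 2],
    [2, 1,      2, np.nan],
])
scores, explained = genotype_pca(example)
print(scores)
print(explained)
```

The PC scores returned here are the kind of values that could enter the association model as the population effect.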
Imputing Missing Genotypes
Typically accomplished with software such as IMPUTE, PLINK, MACH, BEAGLE, and fastPHASE
From Isik and Wetten 2011 Workshop on Genomic Selection
PCA and Population Structure
[Figure: PCA scatter plots in four panels (A to D) with axes PC1 and PC2. Populations shown: Tahoe, Willamette, Columbia, Puyallup, Skykomish, Skagit, Lilloet, Homathko, Klinaklini, and Dean. Two panels show subsets: Puyallup, Skykomish, Skagit, Lilloet, Homathko, Klinaklini, and Dean; and Puyallup, Skykomish, and Skagit.]
Population Structure
• Unequal distribution of alleles unrelated to disease between cases and controls.
• Any allele that is more common in the diseased population may spuriously appear to be associated with the disease.
[Figure: genotypes (TT, AT, AA) of cases and controls drawn from two populations, Pop 1 and Pop 2.]
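A small worked example may make the spurious association concrete. The sketch below uses expected counts, so there is no sampling noise. All numbers are invented for illustration, and the three genotypes are collapsed into carrier versus non-carrier status to keep the tables 2 x 2. Within each population the allele is independent of disease, yet pooling the two populations produces an apparent association.

```python
import numpy as np

def expected_table(n, case_fraction, carrier_freq):
    """Expected 2x2 counts when carrier status is independent of disease.

    Rows: cases, controls. Columns: carriers, non-carriers.
    """
    cases = n * case_fraction
    controls = n * (1 - case_fraction)
    return np.array([
        [cases * carrier_freq, cases * (1 - carrier_freq)],
        [controls * carrier_freq, controls * (1 - carrier_freq)],
    ])

def odds_ratio(table):
    """Odds ratio of a 2x2 table laid out as in expected_table."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical populations: Pop 1 has both a high allele frequency and many
# cases; Pop 2 has a low allele frequency and few cases.
pop1 = expected_table(1000, case_fraction=0.5, carrier_freq=0.8)
pop2 = expected_table(1000, case_fraction=0.1, carrier_freq=0.2)
pooled = pop1 + pop2  # analysis that ignores population structure

or_pop1 = odds_ratio(pop1)
or_pop2 = odds_ratio(pop2)
or_pooled = odds_ratio(pooled)

print("Odds ratio within Pop 1:", round(or_pop1, 3))
print("Odds ratio within Pop 2:", round(or_pop2, 3))
print("Odds ratio when pooled: ", round(or_pooled, 3))
```

An odds ratio of 1 within each population means no association, while the pooled odds ratio is well above 1 purely because allele frequency and disease frequency both differ between the populations.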
Problem 1
Use the TASSEL Tutorial Data to explore how to perform association genetic analyses for some commercially important maize phenotypes: flowering time, ear height, and ear width.
a) Which traits are significantly associated with polymorphisms in the Dwarf8 gene? Propose a reasonable biological hypothesis for these associations. See Thornsberry et al. (2001) and information from public genome databases for the necessary background.
b) Are there any patterns in the locations of the significant SNPs within the gene (e.g., are the significant SNPs clustered or dispersed, and where in the gene do they occur)? What are some possible reasons for these patterns?
c) How do the corrections for population structure and kinship change the associations? Why?
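As a warm-up for part (c), the effect of a structure covariate can be explored on simulated data before running TASSEL. The sketch below is not TASSEL's implementation: it fits an ordinary least-squares General Linear Model with NumPy, and the function name `snp_t_statistic`, the two-population simulation, and all parameter values are assumptions chosen for illustration. The simulated SNP has no true effect on the phenotype; only population membership does.

```python
import numpy as np

def snp_t_statistic(y, snp, covariates=None):
    """t-statistic for the SNP coefficient in y ~ intercept + snp (+ covariates)."""
    n = len(y)
    columns = [np.ones(n), snp]
    if covariates is not None:
        columns.append(covariates)
    X = np.column_stack(columns)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = residuals @ residuals / dof
    beta_cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(beta_cov[1, 1])

rng = np.random.default_rng(13)
n_per_pop = 200

# Population membership (0 or 1), playing the role of a Q covariate.
q = np.repeat([0.0, 1.0], n_per_pop)

# SNP allele counts: allele frequency differs between the populations.
allele_freq = np.where(q == 0, 0.2, 0.8)
snp = rng.binomial(2, allele_freq).astype(float)

# Phenotype depends on population only; the SNP has no causal effect.
y = 2.0 * q + rng.normal(0.0, 1.0, size=q.size)

t_naive = snp_t_statistic(y, snp)                    # no structure correction
t_adjusted = snp_t_statistic(y, snp, covariates=q)   # population as covariate

print("t without structure covariate:", round(float(t_naive), 2))
print("t with structure covariate:   ", round(float(t_adjusted), 2))
```

In this setup the uncorrected test statistic is inflated because the SNP acts as a proxy for population membership, and including the covariate removes most of that signal. A mixed model would additionally account for kinship through a random effect, which this fixed-effects sketch does not attempt.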