Understanding GWAS Chip Design – Linkage Disequilibrium and HapMap


Post on 24-Feb-2016







Peter Castaldi
January 29, 2013

Objectives

Introduce the concept of linkage disequilibrium (LD)

Describe how the HapMap project provides publicly available information on genetic variation and LD structure

Review how LD enables genome-wide screens with only a subset of genome-wide SNP markers

Describe the design of chip-based genotype assays

Human Genome

3 billion base pairs, 23 paired chromosomes

99.9% sequence similarity between individuals

~12 million variant sites

What are the Different Types of Genetic Variation?

Single base pair changes (ACGT → ATGT), also known as single nucleotide polymorphisms (SNPs): ~12 million across the genome

Insertions/deletions (TGGTTTCTA → TGGT---TA): can be of variable size

Trinucleotide repeats (microsatellites): highly polymorphic, less common than SNPs, and responsible for certain clinical disorders (Huntington's disease, fragile X syndrome, myotonic dystrophy)

SNPs in Detail

SNPs can have up to four possible alleles (A, C, G, T), but most have only two alleles present in human populations.

Each person has two alleles at each SNP (one for each copy of the chromosome). When both copies are the same, you're homozygous (i.e. AA, CC, GG, TT); when they're different (e.g. AT), you're heterozygous.

Each allele has a frequency with which it appears in a given population. The major allele is the more common one and the minor allele the less common one; for a bi-allelic SNP the two frequencies sum to 1 (or 100%).

SNPs are Used as Genetic Markers for GWAS Chips

Properties of SNPs that make them good markers for GWAS: they are densely spaced across the genome, and they are usually bi-allelic (only 2 alleles in the population, which simplifies statistical tests).

GWAS chips can effectively represent most common variation with just a subset of SNPs: with ~500,000 SNPs, most common variation can be captured. This is because there is significant correlation between neighboring SNPs.

Linkage Disequilibrium Causes Correlation Between Neighboring SNPs

Mendel's laws state that genes (alleles) are transmitted independently across generations (random assortment, which gives linkage equilibrium).

This is not the case when two genetic loci are physically close to each other.

When two physically close genetic loci are not randomly assorted, this is called linkage disequilibrium.

Linkage Equilibrium Arises Because of Meiotic Recombination

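Image source: http://kenpitts.net/hbio/8cell_repro/meiosis_pics.htm

Linkage and Recombination

[Figure: gametogenesis diagram showing chromosomes carrying alleles X, Y, Z and x, y, z and the recombinant gametes they produce. Labels: Paternal DNA, Maternal DNA, From paternal grandfather, From paternal grandmother.]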

Recombination Breaks Up Chromosomal Segments Over Generations

Recombination is not uniform across the genome; it is concentrated in recombination hotspots.
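One way to make the effect of recombination over generations concrete is the standard population-genetics result that the LD coefficient D between two loci shrinks by a factor of (1 - c) each generation, where c is the recombination fraction between them. The sketch below is not from the slides: the function name `ld_after_generations` and the example values are illustrative assumptions, and the formula assumes random mating in a large population.

```python
def ld_after_generations(d0: float, c: float, generations: int) -> float:
    """Expected LD coefficient D after a number of generations.

    Assumes random mating in a large population, where D decays as
    D_t = D_0 * (1 - c) ** t, with c the recombination fraction per generation.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction c must be between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - c) ** generations


if __name__ == "__main__":
    # Hypothetical values: tightly linked loci (c = 0.001) versus
    # loosely linked loci (c = 0.1), both starting from D = 0.25.
    for c in (0.001, 0.1):
        remaining = ld_after_generations(0.25, c, 100)
        print(f"c = {c}: D after 100 generations = {remaining:.6f}")
```

Under these assumptions, tightly linked loci keep most of their LD after many generations while loosely linked loci lose almost all of it, which is why correlated SNPs cluster in regions with little recombination.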

SNPs within the yellow region are correlated with each other and form haplotypes.

Because of this correlation, one can often use a single SNP from a haplotype to represent all the SNP variation within a haplotype.

Haplotype Structure Reflects Evolutionary History

The structure of haplotype blocks varies across racial groups

African populations have shorter LD blocks, reflecting the longer evolutionary history of those populations (more generations in which recombination could break up haplotypes)

~500,000 SNP Markers Can Reasonably Represent Most of the Common Genetic Variation in European Genomes

GWAS relies upon linkage disequilibrium and the ubiquitous nature of SNP markers to enable genome-wide surveys of the impact of common variation on disease susceptibility.

Pe'er et al. Nat Genet. 2006

The HapMap Project is a Catalog of Human Variation Across Populations

The Human Genome Project provided the complete human sequence for a small number of individuals.

To get an accurate sense of variable sites, data from many individuals are needed.

HapMap has three iterations (http://hapmap.ncbi.nlm.nih.gov/).

It provides dense genotype data from multiple population groups:

CEU: individuals of Northern and Western European ancestry from Utah

YRI: Yoruba from Ibadan, Nigeria

JPT: Japanese from Tokyo

CHB: Han Chinese from Beijing

Data from the HapMap Project Enabled GWAS Chip Design

Information from HapMap used in chip design: a panel of potential SNPs to use in a genotype chip, and population-specific LD structure to allow the identification of tag SNPs that effectively tag haplotypes.
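The idea of picking tag SNPs from LD structure can be sketched as a simple greedy procedure: given pairwise r-squared values (squared correlation) between SNPs in a region, repeatedly pick the SNP that covers the most not-yet-covered SNPs. This is a simplified illustration, not the procedure used for any particular chip; the function name `greedy_tag_snps`, the 0.8 threshold, and the r-squared matrix are all assumptions made up for the example.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Pick tag SNPs greedily from a symmetric pairwise r^2 matrix.

    r2[i][j] is the r^2 between SNP i and SNP j. A SNP counts as tagged
    when it is itself a tag or has r^2 >= threshold with a chosen tag.
    Returns the indices of the chosen tag SNPs in the order picked.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        def covered_by(i):
            # Untagged SNPs that SNP i would tag, including itself.
            return {j for j in untagged if j == i or r2[i][j] >= threshold}

        # Candidate covering the most untagged SNPs; ties go to the lowest index.
        best = max(sorted(untagged), key=lambda i: len(covered_by(i)))
        untagged -= covered_by(best)
        tags.append(best)
    return tags


if __name__ == "__main__":
    # Hypothetical r^2 values: SNPs 0-2 form one block, SNPs 3-4 another.
    example_r2 = [
        [1.00, 0.90, 0.85, 0.10, 0.05],
        [0.90, 1.00, 0.95, 0.10, 0.00],
        [0.85, 0.95, 1.00, 0.20, 0.10],
        [0.10, 0.10, 0.20, 1.00, 0.90],
        [0.05, 0.00, 0.10, 0.90, 1.00],
    ]
    print(greedy_tag_snps(example_r2))
```

In this made-up example two tag SNPs, one per block, are enough to cover all five SNPs at the assumed threshold. Raising the threshold forces more tags, which is the trade-off between chip size and coverage.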

Using Linkage Disequilibrium to Find Genes

Linkage disequilibrium (LD) means that sites of genetic variation can serve as markers for larger chromosomal segments.

Correlation between markers is quantified with r-squared (r²) and D′.
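For a pair of bi-allelic SNPs, r-squared and the normalized coefficient D′ (D-prime) can both be computed from haplotype counts using their standard definitions. The sketch below is illustrative only: the function name `ld_measures`, the allele labels A/a and B/b, and the counts are assumptions, and it presumes phased haplotypes are already available.

```python
def ld_measures(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> dict:
    """Compute D, D' and r^2 for two bi-allelic SNPs from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first SNP
    and allele B at the second, and so on for the other three counts.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")
    p_A = (n_AB + n_Ab) / total   # frequency of allele A
    p_B = (n_AB + n_aB) / total   # frequency of allele B
    p_AB = n_AB / total           # frequency of the AB haplotype
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")

    # D: observed AB haplotype frequency minus its expectation under independence.
    d = p_AB - p_A * p_B
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    # r^2: squared correlation between the alleles at the two SNPs.
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return {"D": d, "D_prime": d_prime, "r2": r2}


if __name__ == "__main__":
    # Hypothetical counts: the Ab haplotype is never observed.
    print(ld_measures(n_AB=40, n_Ab=0, n_aB=20, n_ab=40))
```

With these made-up counts D′ is 1 because one of the four haplotypes is absent, while r-squared is only about 0.44 because the allele frequencies at the two SNPs differ. This is why r-squared is usually the more useful measure when asking whether one SNP can stand in for another.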

GWAS identify novel disease loci, but additional localization is often necessary

Genotype Chip Technology


Kang et al. The American Journal of Human Genetics, Volume 74, Issue 3, 2004, 495-510

Summary

Genetic material is transmitted across generations in blocks called haplotypes.

Linkage disequilibrium and haplotype blocks allow for SNP tagging approaches that enable GWAS chips to capture common genetic variation with a subset of genetic markers.

Haplotype structure varies across ancestral groups.

The HapMap project catalogs human genetic variation and LD structure across populations.