genotyping, linkage mapping and binary data

Genotyping, Linkage Mapping and Binary Data Mohamed Atia Omar Ph.D Genome Mapping Research Dept. AGERI ARC FAO Training , 2014 , Egypt

Upload: plant-genetic

Post on 20-Jun-2015


TRANSCRIPT

Page 1: Genotyping, linkage mapping and binary data

Genotyping, Linkage Mapping and

Binary Data

Mohamed Atia Omar, Ph.D.

Genome Mapping Research Dept. – AGERI – ARC

FAO Training , 2014 , Egypt

Page 2: Genotyping, linkage mapping and binary data

Genotyping

Overview

Page 3: Genotyping, linkage mapping and binary data

What is genotyping?

The analysis of DNA-sequence variation

Genotype = the genetic constitution of an individual

Page 4: Genotyping, linkage mapping and binary data

How much biodiversity?

1.7 to 2.0 million species

Estimates up to 10 million

Page 5: Genotyping, linkage mapping and binary data
Page 6: Genotyping, linkage mapping and binary data

Important Terms

Variation: any nucleotide change in the genome

Rare polymorphism: variation found in < 1% of the population

Polymorphism: variation found in ≥ 1% of the population

Locus: chromosomal location of a gene or DNA sequence

Allele: alternative form of a gene or DNA sequence at a specific chromosomal location (locus)

Heterozygous: the two alleles at a locus differ in the feature of interest

Homozygous: the two alleles at a locus are identical in the feature of interest

Hemizygous: only one allele is present (e.g. X-linked loci in males)

Page 7: Genotyping, linkage mapping and binary data

What are the Types of Mutations / Polymorphisms to be Genotyped?

There are six major classes of genetic variation:

1. Single base changes

2. Simple di-, tri-, tetranucleotide repeats

3. Small insertions or deletions

4. Larger, tandem repeats

5. Multi-gene (Megabase) duplication (CNV)

6. Complex rearrangements

Page 8: Genotyping, linkage mapping and binary data

Classes of Mutation

Page 9: Genotyping, linkage mapping and binary data

An example of one simple question:

How much variation is there?

Page 10: Genotyping, linkage mapping and binary data
Page 11: Genotyping, linkage mapping and binary data
Page 12: Genotyping, linkage mapping and binary data

What are the most Informative Classes for Genotyping Studies?

Polymorphism type                             Nickname                                  Heterozygosity
1. Single base changes                        SNP                                       1-50%
2. Simple di-, tri-, tetranucleotide repeats  STR (short tandem repeats)                50-90%
3. Small insertions or deletions              INDEL (insertion or deletion)             1-50%
4. Larger tandem repeats                      VNTR (variable number of tandem repeats)  50-90%
5. Multi-gene (megabase) duplication          CNV (copy number variation)               1-50%
6. Complex rearrangements                     -                                         1-50%
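The heterozygosity ranges listed for each polymorphism class follow from the number of alleles a marker can carry. A minimal sketch, assuming random mating so that expected heterozygosity is H = 1 - sum(p_i^2); the function name and the allele frequencies are hypothetical and chosen only for illustration:

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2), assuming random mating."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Biallelic SNP with a rare minor allele (hypothetical frequencies).
snp_rare = expected_heterozygosity([0.99, 0.01])
# Biallelic SNP with equally frequent alleles: the maximum for two alleles.
snp_even = expected_heterozygosity([0.5, 0.5])
# Hypothetical STR with ten equally frequent alleles.
str_ten = expected_heterozygosity([0.1] * 10)

print(round(snp_rare, 4), round(snp_even, 4), round(str_ten, 4))
```

With only two alleles H cannot exceed 0.5, which is consistent with the 1-50% range given for SNPs, while multi-allelic repeats can reach the 50-90% range.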

Page 13: Genotyping, linkage mapping and binary data

How many loci should be assayed?

Two strategies for selecting markers are possible:

• Select a few highly informative markers

• Select numerous, poorly informative markers randomly distributed within the genome

Page 14: Genotyping, linkage mapping and binary data

To scan whole genomes…

[Figure: "not like this, but like this". Labels: microcentrifuge tube, 96-well plates, 384-well plates, Affymetrix GeneChip]

Page 15: Genotyping, linkage mapping and binary data

Setting up the reactions

[Figure: "not like this, but like this"]

Page 16: Genotyping, linkage mapping and binary data

[Figure: "not like this, but like this"]

Page 17: Genotyping, linkage mapping and binary data

Applications enabled by HTP genotyping

[Figure: applications positioned by number of genotyped loci and number of genotyped individuals (axis ticks from 10 to 100,000, with one axis extending to 1,000,000). Labeled regions: GWAS validation and candidate gene association; genome-wide association studies; plant and animal breeding for selected traits; candidate region fine mapping; fingerprinting and whole genome scans; diagnostics.]

Applications: diagnostics, MAS, disease-related genes, domestication traits, bar coding, industrial protection of genotypes

Page 18: Genotyping, linkage mapping and binary data

High Throughput genotyping techniques

[Figure: genotyping platforms positioned by number of genotyped loci and number of genotyped individuals (axis ticks from 10 to 100,000, with one axis extending to 1,000,000). Platforms shown: GoldenGate assay, Infinium BeadChips, iSelect, VeraCode GoldenGate, SNPlex, GenPlex, TaqMan, TaqMan OpenArrays, iPLEX Gold, Pyroseq, SNaPshot, Invader, BeadChips, Targeted GeneChips. Vendors named: Illumina, AB, Sequenom, Affymetrix. For genome-wide association studies: Illumina High-Density 1M-Duo chip and Affymetrix Genome-Wide Human SNP Array 6.0.]

Two main suppliers for GWA: ILLUMINA and AFFYMETRIX

Page 19: Genotyping, linkage mapping and binary data
Page 20: Genotyping, linkage mapping and binary data
Page 21: Genotyping, linkage mapping and binary data

5 Basic Methodologies

1) Hybridization

– Microarrays

– TaqMan, Molecular Beacons

2) Allele-specific PCR

– FRET

– Intercalating Dyes

3) Primer Extension

– MALDI-TOF (Matrix Assisted Laser Desorption/Ionization Time-of-flight mass spectrometry)

– SNaPshot (Single nucleotide primer extension)

4) Ligation

– Padlock Probes

– Rolling Circle Amplification

5) Endonuclease Cleavage

– RFLP

– PIRA/RFL

Page 22: Genotyping, linkage mapping and binary data
Page 23: Genotyping, linkage mapping and binary data

RFLPs (Based on Endonuclease Cleavage)

Differences in DNA sequence create or abolish recognition sequences, and therefore cleavage sites, for specific restriction enzymes

Two different alleles will produce different fragment patterns when cut with the same restriction enzyme due to differences in DNA sequence
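How a single base change alters a restriction fragment pattern can be sketched in a few lines. This is an illustrative simplification, not a laboratory protocol: it scans one strand of a linear sequence, and the two allele sequences and the function name `digest` are made up for the example. EcoRI recognizes GAATTC and cuts after the first G.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


ECORI_SITE = "GAATTC"  # EcoRI cuts G^AATTC, i.e. one base into the site
ECORI_OFFSET = 1

# Hypothetical alleles: allele_2 differs by one base (A -> G) in the second site.
allele_1 = "ATGCGAATTCGGTACCGAATTCTTAG"
allele_2 = "ATGCGAATTCGGTACCGAGTTCTTAG"

fragments_1 = digest(allele_1, ECORI_SITE, ECORI_OFFSET)
fragments_2 = digest(allele_2, ECORI_SITE, ECORI_OFFSET)
print(fragments_1, fragments_2)
```

The first allele is cut twice and the second only once, so the two alleles would show different band patterns on a gel.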

Page 24: Genotyping, linkage mapping and binary data

Microarray (Based on Hybridization)

Purpose: multiple simultaneous measurements by hybridization of a labeled probe

DNA elements may be:

– Oligonucleotides

– cDNAs

– Large-insert genomic clones

Page 25: Genotyping, linkage mapping and binary data

Microarray technologies

DNA microarrays

Ordered arrangement of multiple sets of DNA on solid support

Page 26: Genotyping, linkage mapping and binary data

Microarray chip

Affymetrix 100k chip set

Entire genome with 100 000 SNPs (low density).

Affymetrix 500k chip (SNP array 5.0)

Entire genome with 500 000 SNPs (high density)

Affymetrix 1M chip (SNP array 6.0)

Entire genome with 1 000 000 SNPs (very high density)

Page 27: Genotyping, linkage mapping and binary data

Organization of a DNA microarray

1.28 cm

Page 28: Genotyping, linkage mapping and binary data

Hybridization of a labeled probe to the microarray

Page 29: Genotyping, linkage mapping and binary data

Detection of hybridization on microarray

Light from laser

Page 30: Genotyping, linkage mapping and binary data

Hybridization intensities on DNA microarray following laser scanning

Page 31: Genotyping, linkage mapping and binary data

[Figure: genotype clusters from the A and B allele signals: BB (0), AB (0.5), AA (1)]
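The three clusters on this slide (BB at 0, AB at 0.5, AA at 1) suggest calling genotypes from the fraction of signal that comes from allele A. A minimal sketch under that assumption; the thresholds of 0.25 and 0.75, the function name, and the intensity values are hypothetical, and real platforms use more elaborate clustering:

```python
def call_genotype(signal_a, signal_b, low=0.25, high=0.75):
    """Call AA, AB or BB from the fraction of total signal due to allele A."""
    total = signal_a + signal_b
    if total <= 0:
        return None  # no usable signal, so no call
    fraction_a = signal_a / total
    if fraction_a < low:
        return "BB"
    if fraction_a > high:
        return "AA"
    return "AB"


# Hypothetical probe intensities for three samples.
samples = {"s1": (950, 50), "s2": (500, 480), "s3": (30, 900)}
calls = {name: call_genotype(a, b) for name, (a, b) in samples.items()}
print(calls)
```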

Page 32: Genotyping, linkage mapping and binary data

SNPs

Single Nucleotide Polymorphisms

Change one nucleotide:

– Insert

– Delete

– Replace it with a different nucleotide

Many have no phenotypic effect

Some can disrupt or affect gene function

Page 33: Genotyping, linkage mapping and binary data

SNP genotyping methods

over 100 different approaches

Ideal SNP genotyping platform:

high-throughput capacity

simple assay design

robust

affordable price

automated genotype calling

accurate and reliable results

Page 34: Genotyping, linkage mapping and binary data

Overview of SNP array technology

Page 35: Genotyping, linkage mapping and binary data

A little more on SNPs

Most SNPs have only two alleles, which makes their scoring easy to automate

Becoming extremely popular

Typing methods:

– Sequencing

– Restriction site

– Hybridization

Page 36: Genotyping, linkage mapping and binary data

Linkage Mapping

Overview

Page 37: Genotyping, linkage mapping and binary data

Types of Maps

Physical Maps

Complete or partially sequenced organisms

Cytogenetic Maps

Breakpoints in disease

Direct binding of probes to chromosome

Genetic Linkage Maps

Markers

Page 38: Genotyping, linkage mapping and binary data

What happens in meiosis…

Leads to formation of haploid gametes from diploid cells

Assortment of genetic loci

Recombination or crossover

Page 39: Genotyping, linkage mapping and binary data

What is Linkage?

Linkage is defined genetically: the failure of two genes to assort independently.

Linkage occurs when two genes are close to each other on the same chromosome.

Two genes on the same chromosome are called syntenic, whether or not they are linked.

Linked genes are syntenic, but syntenic genes are not always linked. Genes far apart on the same chromosome assort independently: they are not linked.

Linkage is measured by the frequency of crossing over between the two genes.

Crossing over occurs in prophase I of meiosis, when homologous chromosomes break at identical locations and rejoin with each other.

Page 40: Genotyping, linkage mapping and binary data

Applications/Uses of Linkage Maps

Studying genome structure, organization and evolution.

Estimation of gene effects of important agronomic traits.

Tagging genes of interest to facilitate marker-assisted selection (MAS) programs.

Map-based cloning

Identify genes responsible for traits:

– Plants or animals

– Disease resistance

– Meat or milk production, etc.

Page 41: Genotyping, linkage mapping and binary data

Genetic Linkage Mapping Steps

Development of the mapping population

Genotyping of the mapping population (selection of suitable molecular markers)

Linkage analysis

Map construction

QTL identification (in the case of QTL mapping)

Marker-assisted selection

Page 42: Genotyping, linkage mapping and binary data

Development of The Mapping Population

Page 43: Genotyping, linkage mapping and binary data
Page 44: Genotyping, linkage mapping and binary data
Page 45: Genotyping, linkage mapping and binary data

Linkage analysis

Linkage: alleles from two loci segregate together in a family.

Recombination fraction (θ): the probability that a recombination occurs between a marker and a susceptibility locus in a given meiosis.

θ = 0.5: no linkage; θ < 0.5: the loci are linked

Page 46: Genotyping, linkage mapping and binary data

Reasons why alleles at different loci may not assort independently:

1. Chance

2. Preferential segregation (nonrandom segregation of non-homologous chromosomes) - hinted at but not shown in humans

3. Linkage - the presence of loci measurably close together on the same chromosome.

Page 47: Genotyping, linkage mapping and binary data

Types of Linkage Analysis

• Parametric Lod-Score

• Haseman-Elston Sib-Pair

• Affected Sib-Pair and Affected Relative Pair

• Affected Pedigree Member Method

• Variance Components Method

Page 48: Genotyping, linkage mapping and binary data

Recombination frequency

θ = (total number of recombinants) / (total number of recombinants + total number of non-recombinants)

Parent: A B / a b

Gametes                    Theta
100% non-rec               0
99% non-rec and 1% rec     0.01
90% non-rec and 10% rec    0.1
50% non-rec and 50% rec    0.5
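The recombination frequency on this slide is the share of recombinant gametes among all gametes scored. A minimal sketch of that calculation; the function name is hypothetical, and the counts are gametes per 100 chosen to reproduce the percentages shown on the slide:

```python
def recombination_fraction(recombinants, non_recombinants):
    """theta = recombinants / (recombinants + non-recombinants)."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("no gametes scored")
    return recombinants / total


# (recombinant, non-recombinant) gametes per 100, matching the slide's cases.
cases = [(0, 100), (1, 99), (10, 90), (50, 50)]
thetas = [recombination_fraction(rec, non_rec) for rec, non_rec in cases]
print(thetas)
```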

Page 49: Genotyping, linkage mapping and binary data
Page 50: Genotyping, linkage mapping and binary data

In a double heterozygote:

Cis configuration = mutant alleles of both genes are on the same chromosome = ab/AB

Trans configuration = mutant alleles are on different homologues of the same chromosome = Ab/aB

Page 51: Genotyping, linkage mapping and binary data

Genes with recombination frequencies of less than 50 percent are on the same chromosome (= linked)

Linkage group = all known genes on a chromosome

Two genes that undergo independent assortment have a recombination frequency of 50 percent and are located on nonhomologous chromosomes or far apart on the same chromosome (= unlinked)

Page 52: Genotyping, linkage mapping and binary data

Recombination

Recombination between linked genes occurs at the same frequency whether alleles are in cis or trans configuration

Recombination frequency is specific for a particular pair of genes

Recombination frequency increases with increasing distances between genes

No matter how far apart two genes may be, the maximum frequency of recombination between any two genes is 50 percent.

Page 53: Genotyping, linkage mapping and binary data

• Cross-over frequencies can be converted into map units.

• Ex: A 5% cross-over frequency equals 5 map units.

– gene A and gene B cross over 6.0 percent of the time

– gene B and gene C cross over 12.5 percent of the time

– gene A and gene C cross over 18.5 percent of the time
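The three cross-over frequencies above are enough to order the genes: the pair with the largest distance lies at the two ends, and the remaining gene sits between them. A minimal sketch using the slide's numbers; the function name is hypothetical:

```python
def order_three_genes(distances):
    """Order three genes from pairwise map distances (dict of pair -> map units)."""
    # The two genes that are farthest apart are the outer genes.
    outer_pair = max(distances, key=distances.get)
    genes = {gene for pair in distances for gene in pair}
    (middle,) = genes - set(outer_pair)
    first, last = outer_pair
    return [first, middle, last]


# Map units taken from the slide: 1% cross-over frequency = 1 map unit.
distances = {("A", "B"): 6.0, ("B", "C"): 12.5, ("A", "C"): 18.5}
gene_order = order_three_genes(distances)

# The two short intervals add up to the long one, as expected for A - B - C.
is_additive = abs(distances[("A", "B")] + distances[("B", "C")] - distances[("A", "C")]) < 1e-9
print(gene_order, is_additive)
```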

Page 54: Genotyping, linkage mapping and binary data

Lod scores

1 cM ≈ 1 Mb (on average)

1 Mb = 1,000 kb

1 kb = 1,000 bp

1 cM ≈ 1,000,000 bp

Page 55: Genotyping, linkage mapping and binary data
Page 56: Genotyping, linkage mapping and binary data


Genetic Mapping

The map distance (cM) between two genes equals one half the average number of crossovers in that region per meiotic cell

The recombination frequency between two genes indicates how much recombination is actually observed in a particular experiment; it is a measure of recombination

Over an interval so short that multiple crossovers are precluded (~ 10 percent recombination or less), the map distance equals the recombination frequency because all crossovers result in recombinant gametes.

Genetic map = linkage map = chromosome map

Page 57: Genotyping, linkage mapping and binary data


Gene Mapping: Crossing Over

Crossovers which occur outside the region between two genes will not alter their arrangement

A double crossover between two genes restores the parental combination of alleles, so it goes undetected; when genes are far enough apart for multiple crossovers to be common, the result is indistinguishable from independent assortment of the genes

Crossovers involving three pairs of alleles specify gene order = linear sequence of genes

Page 58: Genotyping, linkage mapping and binary data


Genetic vs. Physical Distance

Map distances based on recombination frequencies are not a direct measurement of physical distance along a chromosome

Recombination “hot spots” overestimate physical length

Low rates in heterochromatin and centromeres underestimate actual physical length

Page 59: Genotyping, linkage mapping and binary data

Gene Mapping

Mapping function: the relation between genetic map distance and the frequency of recombination

Chromosome interference: crossovers in one region decrease the probability of a second crossover close by

Coefficient of coincidence = observed number of double recombinants divided by the expected number

Interference = 1-Coefficient of coincidence
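The coefficient of coincidence and interference can be worked through numerically. A minimal sketch, assuming the expected number of double recombinants is the product of the two single-interval recombination frequencies times the number of progeny; the function name and all counts are hypothetical:

```python
def coincidence_and_interference(total_progeny, rf_interval_1, rf_interval_2, observed_doubles):
    """Return (coefficient of coincidence, interference) for two adjacent intervals."""
    # Expected doubles if crossovers in the two intervals were independent.
    expected_doubles = rf_interval_1 * rf_interval_2 * total_progeny
    coincidence = observed_doubles / expected_doubles
    return coincidence, 1.0 - coincidence


# Hypothetical three-point cross: 1000 progeny, intervals of 10% and 20%
# recombination, and 12 observed double recombinants (20 expected).
coincidence, interference = coincidence_and_interference(1000, 0.10, 0.20, 12)
print(round(coincidence, 3), round(interference, 3))
```

An interference above zero, as in this made-up case, means fewer double crossovers were seen than independence would predict.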

Page 60: Genotyping, linkage mapping and binary data

Genetic distance

Genetic distance = the genetic length over which one crossover occurs in 1% of meioses. This distance is expressed in centiMorgans (cM).

1 cM = 0.01 recombination frequency ≈ 1 Mb of physical distance on average (assuming that the recombination frequency is uniform along the chromosomes)

Because double recombinants become more frequent the further apart two loci are, the frequency of recombination does not increase proportionately with distance.
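Mapping functions correct for the fact that recombination frequency does not grow proportionately with distance. The slides do not name a specific function; the sketch below uses two commonly cited ones, Haldane (no interference) and Kosambi (some interference), purely as an illustration. The function names are hypothetical.

```python
import math


def haldane_cm(r):
    """Haldane map distance in cM for recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in cM for recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# For small r both functions stay close to 100 * r; for large r they diverge.
for r in (0.01, 0.10, 0.30):
    print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

At 1% recombination both functions give about 1 cM, while at 30% recombination the estimated map distance is clearly larger than 30 cM under either function.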

Page 61: Genotyping, linkage mapping and binary data

Linkage related Concepts

Interference - a crossover in one region usually decreases the probability of a crossover in an adjacent region.

CentiMorgan (cM) - 1 cM is the distance between genes for which the recombination frequency is 1%.

Lod score - the base-10 logarithm of the odds that two loci are linked at a given recombination fraction rather than unlinked (θ = 0.5); it is used to test for linkage and to estimate the distance between genes.
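A LOD score is conventionally computed as the base-10 logarithm of the likelihood of the data at a candidate recombination fraction θ divided by the likelihood under no linkage (θ = 0.5). A minimal sketch, assuming phase-known meioses that can each be scored as recombinant or non-recombinant; the function name and counts are hypothetical:

```python
import math


def lod_score(recombinants, non_recombinants, theta):
    """LOD = log10[ L(theta) / L(0.5) ] for phase-known, fully scored meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n = recombinants + non_recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical family data: 2 recombinants among 20 informative meioses.
lod_at_10_percent = lod_score(2, 18, 0.1)
lod_at_null = lod_score(2, 18, 0.5)
print(round(lod_at_10_percent, 2), round(lod_at_null, 2))
```

A LOD of 3 or more is the usual threshold for declaring linkage, so this made-up data set would just pass it at θ = 0.1.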

Page 62: Genotyping, linkage mapping and binary data

Linkage vs. Association

Linkage analyses look for a relationship between a marker and disease within a family (could be a different marker in each family)

Association analyses look for a relationship between a marker and disease between families (must be the same marker in all families)

Page 63: Genotyping, linkage mapping and binary data

Binary Data

Overview

Page 64: Genotyping, linkage mapping and binary data

Binary Data definition

Binary data is data whose unit can take on only two possible states, traditionally termed 0 and 1 in accordance with the binary numeral system and Boolean algebra.
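In a genotyping context, a two-state unit is a natural fit for scoring a marker band as present or absent. A minimal sketch of that idea; the band sizes, sample data, and function name are hypothetical and not taken from the slides:

```python
def score_bands(observed_sizes, reference_sizes):
    """Score each reference band as 1 (present) or 0 (absent) in one sample."""
    present = set(observed_sizes)
    return [1 if size in present else 0 for size in reference_sizes]


# Hypothetical band sizes (bp) scored across all samples.
reference_sizes = [200, 350, 500, 800]
# Hypothetical bands observed in one sample.
sample_bands = [200, 500]

binary_row = score_bands(sample_bands, reference_sizes)
# The same row packed into a single integer, reading the scores as binary digits.
packed = int("".join(str(bit) for bit in binary_row), 2)
print(binary_row, packed)
```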

Page 65: Genotyping, linkage mapping and binary data

Levels of Binary Data Storage

Page 66: Genotyping, linkage mapping and binary data
Page 67: Genotyping, linkage mapping and binary data
Page 68: Genotyping, linkage mapping and binary data
Page 69: Genotyping, linkage mapping and binary data
Page 70: Genotyping, linkage mapping and binary data
Page 71: Genotyping, linkage mapping and binary data
Page 72: Genotyping, linkage mapping and binary data
Page 73: Genotyping, linkage mapping and binary data
Page 74: Genotyping, linkage mapping and binary data
Page 75: Genotyping, linkage mapping and binary data
Page 76: Genotyping, linkage mapping and binary data

Thank You

Any Questions ??