I: Human genome maps and localization of disease genes
Lectures and seminar material: www.tymri.ut.ee -> õppetöö (teaching)
User: ML2004 Pw: 2004
Literature:
1. T. Strachan and A. P. Read, “Human Molecular Genetics”
2. A. J. F. Griffiths et al., “An Introduction to Genetic Analysis”
3. B. Alberts et al., “Molecular Biology of the Cell”
4. http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Books
[Figure: flow chart of human genome mapping.
Resources: CLONES (cDNA: ESTs, full-length; genomic: polymorphic, large insert, STSs) and DISEASES (families, chromosomal abnormalities, heterogeneity, irregular inheritance).
Methods: FISH, somatic cell hybrids, radiation hybrids; multipoint linkage (CEPH, etc.); two-point linkage (lods, sib pairs, homozygosity); contig assembly.
Products: marker-marker framework map, GENETIC MAP, initial disease gene localization, PHYSICAL MAP and MOUSE map, combined into the HUMAN GENOME MAP, followed by sequencing and gene identification.]
Genetic mapping - the aim is to discover how often 2 loci are separated by meiotic recombination
[Figure: three-generation pedigree (generations I, II, III) with the A and B genotypes of each individual. Genes A and B with alleles A1, A2 and B1, B2 are segregating in the family. The offspring are scored as non-recombinant (NR) or recombinant (R): NR, NR, NR, NR, NR, R, R.]
Recombination fraction between loci A and B: the proportion of children who are recombinant (R); equivalently, the probability that an odd number of crossover events takes place between the two loci
*Loci on different chromosomes: r (or θ) = 0.5
*Loci on the same chromosome (syntenic): r (or θ) < 0.5
*The closer the loci are, the smaller the value of r (or θ)
Genetic map unit is 1 cM (centimorgan) = 1% of recombination between two loci
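As a small worked example of the definition above, the recombination fraction can be estimated by direct counting. The counts used here (two recombinant and five non-recombinant offspring) follow the NR/R scoring shown for the pedigree; the function name is only illustrative.

```python
def recombination_fraction(recombinants: int, non_recombinants: int) -> float:
    """Estimate theta as the proportion of scored offspring that are recombinant."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("at least one scored meiosis is required")
    return recombinants / total


# Counts taken from the pedigree scoring: 5 x NR, 2 x R.
theta = recombination_fraction(recombinants=2, non_recombinants=5)
print(f"theta = {theta:.3f}")
```

With so few meioses the estimate is very imprecise, which is one reason lod score analysis over many families is used in practice.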
*The mathematical relationship between θ and genetic map distance is described by a mapping function (Haldane function, Kosambi function)
*On average 49-55 crossovers per cell (differs between individuals)
*Chiasmata are more frequent in female meioses (fits the Haldane rule that the heterogametic sex has the lower chiasma count)
1 female cM = 1.68±1.07 Mb
1 male cM = 0.92±0.96 Mb
Sex-average cM = 1.30±0.80 Mb
Puurand, 2004
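The two mapping functions named above can be sketched as follows. This is a minimal illustration using the standard textbook forms (Haldane assumes no crossover interference, Kosambi allows for some interference); the function names are chosen here for illustration only.

```python
import math


def haldane_cm(theta: float) -> float:
    """Haldane map distance in cM for recombination fraction theta (no interference)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * theta)


def kosambi_cm(theta: float) -> float:
    """Kosambi map distance in cM for recombination fraction theta (some interference)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


for theta in (0.01, 0.10, 0.30):
    print(f"theta={theta:.2f}  Haldane={haldane_cm(theta):6.2f} cM  "
          f"Kosambi={kosambi_cm(theta):6.2f} cM")
```

For small θ both functions give roughly 100 × θ cM, which is why 1 cM can be read as 1% recombination; for larger θ the map distance grows faster than the observed recombination fraction.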
Mapping of the genome requires genetic markers: any Mendelian character can be used as a marker
*The precision and quality of a genetic map are increased by:
1) dense coverage of markers across the genome
2) high PIC (polymorphism information content)
3) extensive family material: a high number of informative meioses
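To make the PIC criterion concrete, here is a minimal sketch that computes PIC from allele frequencies using the commonly cited formula PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The allele frequencies are hypothetical and the function name is illustrative.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content of a marker from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings in which the transmitted allele cannot be traced.
    correction = sum(2.0 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


# Hypothetical markers: a two-allele marker and a four-allele marker.
print(f"2 alleles (0.5/0.5):   PIC = {pic([0.5, 0.5]):.3f}")
print(f"4 alleles (0.25 each): PIC = {pic([0.25] * 4):.3f}")
```

A marker with many alleles of similar frequency has a higher PIC, so more of its meioses are informative.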
[Figure: two small pedigrees with marker genotypes (alleles A1-A4), labelled "Uninformative meiosis" and "Informative meiosis".]
The father has a dominant condition that he inherited with the marker allele A1.
An informative meiosis allows us to determine that the child inherited A1 from the father.
Marker                   When used     No. of loci
Blood groups             1910-1960     ~20
Protein electromorphs    1960-1975     ~30
HLA tissue types         1970 –        1 (haplotype)
DNA RFLPs                1975 –        <10^5
DNA minisatellites       1985 –        <10^4
DNA microsatellites      1989 –        <10^5
DNA SNPs                 1998 –        <10^6
The development of human genetic markers
Mendelian characters are determined by a SINGLE locus genotype
*In humans, fewer than 10 000 Mendelian characters are known (OMIM database)
*A character can be either dominant or recessive and is discrete (= Y or N)
*A genotype can be hetero-, homo- or hemizygous (male X and Y loci)
*There are 5 [6] basic Mendelian pedigree patterns:
- autosomal dominant inheritance
- autosomal recessive inheritance
- X-linked dominant inheritance
- X-linked recessive inheritance
- Y-linked inheritance
[- mitochondrial or matrilineal inheritance]
*The mode of inheritance is determined by using several pedigrees
*Knowledge of the mode of inheritance is the prerequisite for linkage analysis
NON-Mendelian characters are multifactorial: (1) continuous or quantitative; (2) oligo- or polygenic; (3) environment-dependent expression
Linkage mapping of human disease genes: analysis of the segregation of marker alleles together with a Mendelian disease in human pedigrees. It is based on counting recombinants and non-recombinants.
*Usually it is not possible to score the recombinants by hand
*Computerized lod score analysis is used
Odds of linkage = (likelihood that the loci are linked, with recombination fraction θ) / (likelihood that the loci are unlinked, θ = 0.5)
Lod score Z = log10(Odds of linkage)
*Lod scores are calculated over a range of θ values
*The most likely θ is the one with the highest lod score
*Lod scores can be added up across families
Z = 3 is the threshold for accepting linkage (= 1000:1 odds)
Z < -2: linkage can be rejected (= 1:100 odds)
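The lod score procedure described above can be sketched for the simplest case. This is a minimal illustration that assumes phase-known, fully informative meioses, where the likelihood is θ^R (1-θ)^NR; real pedigrees need dedicated linkage software. The family counts are hypothetical.

```python
import math


def lod(theta: float, recombinants: int, non_recombinants: int) -> float:
    """Two-point lod score for phase-known meioses at recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must satisfy 0 <= theta <= 0.5")
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # a single recombinant excludes theta = 0
    log_linked = non_recombinants * math.log10(1.0 - theta)
    if recombinants > 0:
        log_linked += recombinants * math.log10(theta)
    log_unlinked = (recombinants + non_recombinants) * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical families: (recombinants, non-recombinants).
families = [(1, 9), (0, 6), (2, 10)]
thetas = [i / 100 for i in range(1, 51)]

# Lod scores are added up across families at each value of theta.
totals = {t: sum(lod(t, r, nr) for r, nr in families) for t in thetas}
best_theta = max(totals, key=totals.get)
best_z = totals[best_theta]
print(f"most likely theta = {best_theta:.2f}, Z = {best_z:.2f}")
print("linkage accepted" if best_z >= 3 else "threshold Z = 3 not reached")
```

Note that none of the three hypothetical families reaches Z = 3 on its own; only the summed score does, which illustrates why adding lod scores across families matters.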
Problems of lod score analysis in humans:
*long generation time
*inability to control matings
*inability to control environmental exposure
*errors in genotyping and misdiagnosis
*computational difficulties
*locus heterogeneity
*limited resolution of the map
*lod score mapping is limited to Mendelian characters
Type of map                    Features                                              Resolution
1. Cytogenetic                 Chromosome banding maps                               Several Mb
2. Chromosome breakpoint maps  a) Somatic cell hybrid panels                         Several Mb
                               b) Radiation hybrid maps                              >0.5 Mb
3. Restriction map             Rare-cutter (e.g. NotI) maps                          <0.5 Mb
4. Clone contig map            a) Overlapping YAC clones                             0.1-1 Mb
                               b) Overlapping cosmid clones                          ~40 kb
5. STS maps                    Typed by PCR; requires prior sequence information     ~100 kb
                               for PCR primers
6. EST maps                    Sequencing 200-300 bp from a cDNA clone,              ~40 kb
                               mapping back to other maps
7. DNA sequence map                                                                  1 kb
Types of physical maps available for the human genome
Principle of radiation hybrid panel mapping
[Figure: panel of different radiation hybrid cell lines; localization of the mapped gene]
*Mapping is by PCR typing or Southern blot hybridization of the studied gene
*The higher the initial radiation dose, the higher the mapping resolution
The mapping function D is measured in centiRays (cR)
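As an illustration of the centiRay scale, the sketch below converts a breakage frequency between two markers into a map distance. It assumes the commonly used relation D = -ln(1 - θ), where θ is the frequency of radiation-induced breakage between the markers; estimating θ from real panel data also requires marker retention frequencies, which are not modelled here.

```python
import math


def centirays(breakage_frequency: float) -> float:
    """Radiation hybrid map distance in cR from the breakage frequency (assumed relation)."""
    if not 0.0 <= breakage_frequency < 1.0:
        raise ValueError("breakage frequency must satisfy 0 <= theta < 1")
    return -100.0 * math.log(1.0 - breakage_frequency)


for theta in (0.01, 0.10, 0.50):
    print(f"breakage frequency {theta:.2f} -> {centirays(theta):6.1f} cR")
```

Because breakage depends on the radiation dose, a distance in cR is only comparable between panels made with the same dose.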
Clone contig maps: contig = contiguous DNA without any gaps across the whole chromosome or a selected genomic region
Overlap between the clones can be detected using STS content mapping and repetitive DNA fingerprinting (long-insert clones such as YACs), or RFLP and microsatellite typing and FISH analysis (shorter-insert clones such as cosmids or PACs)
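The idea of STS content mapping can be sketched in a few lines: two clones are taken to overlap if they share at least one STS, and clones connected by overlaps form a contig. The clone names and STS contents below are hypothetical.

```python
from itertools import combinations

# Hypothetical clones and the STSs amplified from each of them.
clone_sts = {
    "YAC_A": {"STS1", "STS2"},
    "YAC_B": {"STS2", "STS3", "STS4"},
    "YAC_C": {"STS4", "STS5"},
    "YAC_D": {"STS7"},
}


def overlapping_pairs(clones):
    """Return clone pairs that share at least one STS, with the shared STSs."""
    return {
        (a, b): clones[a] & clones[b]
        for a, b in combinations(sorted(clones), 2)
        if clones[a] & clones[b]
    }


def contigs(clones):
    """Group clones into contigs: connected components of the overlap graph."""
    neighbours = {name: set() for name in clones}
    for a, b in overlapping_pairs(clones):
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(clones):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            clone = stack.pop()
            if clone in group:
                continue
            group.add(clone)
            stack.extend(neighbours[clone] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


print(overlapping_pairs(clone_sts))
print(contigs(clone_sts))
```

In this toy data set YAC_D shares no STS with the other clones, so it stays outside the contig and marks a gap that more clones or markers would have to close.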
Clone (cosmid, BAC, PAC, YAC)
STS and EST
STS - sequence-tagged site
A "footprint" of a genomic region: short DNA stretches, amplifiable by a defined unique pair of primers
Applied for: characterization and mapping of genomic clones into the context of a particular genomic region or contig
EST - expressed sequence tag
Source: various cDNA libraries
Method:
1) cDNAs from a library are cloned into vectors
2) 200-300 bp of each cDNA are sequenced at random
3) a public EST database is formed, where scientists can identify and obtain the clones containing the cDNAs of interest
4) the EST initiative usually also tries to map the ESTs to the genome map
Publicly available genome databases
NCBI: http://www.ncbi.nlm.nih.gov/
ENSEMBL: http://www.ensembl.org/
Organisms: human, mouse, rat, fruit fly, zebrafish, C. elegans, etc.
Information:
1) genome maps: genetic, physical
2) coding sequence (transcript maps, ESTs, etc.)
3) marker databases and maps (SNPs, microsatellites, RFLPs, etc.)
4) gene information (genomic structure, mRNA, peptide, gene family, polymorphisms, function, diseases, etc.)
5) polymorphism information
6) homology maps (e.g. mouse and human)
7) links to other databases (PubMed, OMIM, SNP databases, clone availability, etc.)
Identification of disease genes: position-independent strategies
Functional cloning
*Knowledge of the defective protein product:
- gene-specific oligonucleotides (aspartylglucosaminuria, AGU, in Finns and the AGA locus)
- use of specific antibodies (phenylketonuria and phenylalanine hydroxylase, PAH)
*Identification of a gene through its normal function:
- functional "rescue" in cell lines or transgenic mice (Fanconi's anemia group C)
*Subtraction cloning (dystrophin and the DMD gene)
Identification of disease genes: position-dependent strategies
Step 1. Positional cloning
*Define the candidate region: linkage mapping; chromosomal aberrations in patients
*High-resolution map of the candidate region: polymorphism screening; genetic and physical mapping; linkage disequilibrium mapping
*Candidate gene search and analysis: search for transcripts; search the databases
Application of chromosomal aberrations for mapping the disease locus:
Three individuals among the Finnish AGU (aspartylglucosaminuria) patients had an aberrant karyotype together with either under-expression (patients a, b) or over-expression (patient c) of the AGA protein. Patients a and b lacked one telomeric segment of chromosome 4q; patient c had an extra copy of this region translocated to chromosome 21p. Thus, the AGA gene could be mapped to 4q33->tel.
Step 2. Positional candidate cloning
*Define the candidate gene(s): expression pattern and function; homology to a relevant human gene or EST; homology to a relevant gene in a model organism
*Confirming a candidate gene: mutation screening; restoration of normal phenotype; mouse model of the disease
Difficulties of mutation screening:
*locus heterogeneity
*mutational homogeneity
*neutral versus pathogenic mutations
*other types of mutation than SNPs
*Understand the function of the gene
Mapping human traits using isolates
Given the current size of world’s population, the human genome is LESS diverse than might be expected:
1. Recent divergence from other primates
2. Relatively small size of human population over most of its history
Major waves of human migration:
1. 100 000 years ago: out of Africa
2. 50 000-30 000 years ago: to new regions, such as the Americas and Australia
3. 10 000 years ago: with the spread of agriculture after the last glacial period
Genetic consequences of a bottleneck accompanied by isolation:
*Less diversity
*Inbreeding
*Genetic drift: random enrichment of recessive and neutral alleles
[Figure: population genetic diversity over time, before and after a bottleneck]
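The loss of diversity through a bottleneck can be illustrated with the standard drift expectation H_t = H_0 (1 - 1/(2N))^t for a randomly mating population of constant effective size N. This is a simplified sketch; the population sizes and the starting heterozygosity are hypothetical.

```python
def expected_heterozygosity(h0: float, effective_size: int, generations: int) -> float:
    """Expected heterozygosity after drift in a population of constant effective size."""
    if effective_size <= 0 or generations < 0:
        raise ValueError("effective size must be positive and generations non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


h0 = 0.5  # hypothetical starting heterozygosity
for n in (10_000, 500, 50):
    h = expected_heterozygosity(h0, effective_size=n, generations=20)
    print(f"N = {n:6d}: heterozygosity after 20 generations = {h:.3f}")
```

Only the small founder population loses a noticeable share of its heterozygosity within 20 generations, which matches the reduced diversity expected in young isolates.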
Use of population isolates for mapping human traits
Peltonen et al., 2000
I. Examples of isolated populations exploited for their high frequency of certain Mendelian disorders:
*Finns, Amish, Sardinians, Bedouins
II. For mapping complex diseases, it might be useful to study very young isolates (10-20 generations):
*eastern Finland (Kuusamo), Costa Rica, Quebec, Newfoundland
These population isolates have:
(a) Reduced genetic complexity
(b) Uniform environmental and cultural features within the isolate
[Figure (Peltonen et al., 2000): two waves of settlement of Finland, 2000-4000 years ago and in the 16th century. Founder effect: >30 Finnish Mendelian diseases.]
Population isolates have been used with great success for identifying single gene defects.
Linkage analysis needs: (a) reliable diagnosis; (b) pattern of inheritance
Due to founder effect and genetic drift, the "disease allele chromosomes" possess strong haplotype signatures: the younger the mutation and the lower the recombination rate, the longer the "disease" haplotype around the mutation
A novel mutation (M) is in absolute allelic association with a certain haplotypic pattern of all polymorphic markers (P) on the same chromosome
[Figure: the original "disease" chromosome carrying M and the flanking markers P; recombinations during population history shorten the shared segment on present-day "disease" chromosomes, while other chromosomes in the population show a random distribution of alleles at the loci P.]
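The erosion of the "disease" haplotype can be made quantitative with a simple sketch: a marker at recombination fraction θ from the mutation stays on the ancestral haplotype for g generations with probability (1 - θ)^g. This ignores drift, selection and new mutations, and the distances and ages used are illustrative.

```python
def fraction_ancestral(theta: float, generations: int) -> float:
    """Probability that a marker is still on the ancestral haplotype after g generations."""
    if not 0.0 <= theta <= 0.5 or generations < 0:
        raise ValueError("need 0 <= theta <= 0.5 and generations >= 0")
    return (1.0 - theta) ** generations


# Markers at roughly 0.1 cM, 1 cM and 5 cM from the mutation (theta ~ cM / 100).
for theta in (0.001, 0.01, 0.05):
    young = fraction_ancestral(theta, 25)   # young mutation, about 25 generations
    old = fraction_ancestral(theta, 500)    # old mutation, about 500 generations
    print(f"theta={theta:.3f}: young isolate {young:.2f}, old mutation {old:.2f}")
```

For a young mutation, markers up to about 1 cM away are still mostly in association, whereas for an old mutation only very close markers are, which is the basis for fine mapping by linkage disequilibrium in isolates.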
Simple versus Complex disease mapping: are the isolates also here useful?
In practice, most successes in mapping complex disease loci in population isolates have depended on large pedigrees with proven or predicted genealogical ties between affected individuals.
Other strategies, such as genome scans to monitor intrafamilial association and linkage disequilibrium in population isolates, have been less successful.
Important! Subdivision of patient populations by qualitative clinical criteria minimizes genetic heterogeneity
Peltonen et al., 2000
Icelandic experiment:
*Iceland was settled in the 9th-10th centuries by a limited number of founders from Scandinavia
*Minimal immigration during 1100 years
*Most of the 275 000 Icelanders are descendants of the original settlers
*A tradition of recording family trees: the genealogy of Icelanders traces back >1000 years
*Reduced genetic heterogeneity due to founder effect and inbreeding
*deCODE project: cross-population databases of linked genealogical, patient and genotyping records
Example of the mapping strategy of a Mendelian disease gene in a population isolate: vLINCL
NCL (neuronal ceroid lipofuscinosis): a group of neurodegenerative disorders of childhood with an incidence of 1:12 500 births
vLINCL (Finnish variant of late infantile NCL) affects children at the age of 4-7 years; the first symptom is clumsiness, followed by progressive visual failure and mental and motor deterioration
Enriched in the Southern Ostrobothnia region of Finland. Most Finnish patients probably share a mutation that was introduced 20-30 generations ago (500-750 years ago), i.e. during the period of internal migration in Finland from the coast to the inland. The mutation spread and clustered in the area, probably due to the low number of founders followed by demographic expansion and relatively strong isolation
vLINCL 1
Linkage with polymorphic markers D13S160 and D13S162 on chromosome 13q: critical region 4 cM
Resources:
I. Clones: previously available and isolated during the project
II. Physical map across the region using FISH methods
III. New polymorphic markers
IV. Identification of novel candidate genes by searches in the EST database and cDNA library screening
1. Genetic mapping:
- refined chromosomal region 13q32
- exclusion of candidate genes by position
- contig across the critical region
2. LD mapping: haplotype analysis narrowed the critical region down to 200 kb
cDNA clone assembly for the putative CLN5 gene: ESTs, RACE, library cDNAs
Disease mutation identification in patients versus controls
RNA expression pattern by Northern and RT-PCR
Tissue expression analysis: mutation verification and disease pathology study
vLINCL 2
Physical mapping of the vLINCL candidate region by FISH on metaphase chromosomes, mechanically stretched chromosomes and DNA fibers
vLINCL 3
vLINCL 4
CLN5 gene: putative transmembrane protein with no homology to previously reported genes (Savukoski et al., 1998)
Three different haplotypic backgrounds with three different mutations:
1) a 2-bp deletion in exon 4 (FinMajor): carrier frequency 1:24 in the highest-risk area and 1:100 in an extended high-risk area; not present elsewhere in Finland
2) a nonsense mutation (FinMinor): a transversion in exon 1, present in only one family; not present elsewhere in Finland or Europe
3) a missense mutation (Dutch mutation): a transversion in exon 4
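To put the carrier frequencies into perspective, the sketch below converts a carrier frequency into an expected frequency of affected homozygotes. It assumes Hardy-Weinberg proportions (random mating, a single recessive disease allele), which is only an approximation in a small isolate; the function name is illustrative.

```python
import math


def disease_allele_frequency(carrier_frequency: float) -> float:
    """Solve 2q(1 - q) = carrier frequency for the (smaller) allele frequency q."""
    if not 0.0 <= carrier_frequency <= 0.5:
        raise ValueError("carrier frequency must satisfy 0 <= c <= 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_frequency)) / 2.0


for label, carriers in (("highest-risk area", 1 / 24), ("extended high-risk area", 1 / 100)):
    q = disease_allele_frequency(carriers)
    incidence = q * q  # expected frequency of homozygotes under Hardy-Weinberg
    print(f"{label}: q = {q:.4f}, expected incidence about 1 in {1 / incidence:,.0f}")
```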
CLN5 gene: lysosomally targeted glycoprotein (Isosomppi et al., 2002)
1) expressed in embryonic human brain at the beginning of cortical neurogenesis
2) transfection experiments: the wild-type (WT) protein is lysosomally targeted and partially secreted into the culture medium
3) transient localization in the ER and Golgi reflects intracellular trafficking of CLN5 to lysosomes
4) CLN5 is N-glycosylated -> soluble protein
5) FinMajor mutation -> protein is expressed but not targeted to lysosomes