1
Genes and MS in Tasmania, cont.
Lecture 5, Statistics 246, February 3, 2004
Mapping genes contributing to complex diseases
3
MS susceptibility genes are difficult to map
MS is a complex disease. Analyses with traditional methods such as single marker association studies and standard linkage approaches (affected sib-pairs, pedigrees, etc.) have failed to agree on genomic regions other than the HLA region.
There are a variety of possible reasons for this:
• Allelic and locus heterogeneity (no single gene model fits all)
• Significant environmental influences
• Imprecise phenotyping
4
Linkage vs Association studies
• Linkage mapping: tests for cosegregation of a marker allele with the disease within families
• Association mapping: seeks a marker allele that is present more frequently in cases than in controls; all affected individuals are treated as distant relatives
– Case/control studies
– Transmission disequilibrium test (needs triads)
We will do a quick review of association mapping before turning to our MS study.
5
Linkage disequilibrium
Suppose that we have a marker with just two alleles, M and m, say, having frequencies p and 1-p, and a (not necessarily linked) disease locus with alleles D and d, having frequencies q and 1-q. A (haploid) gamete must have one of the four combinations (haplotypes) DM, Dm, dM or dm. Let the frequencies in a population of these four haplotypes be x1, x2, x3 and x4.
Under independence, we would have x1 = pq, etc. Deviation of the observed haplotype frequencies from these products is termed linkage disequilibrium (LD), or, better, gametic association.
If inheriting the allele D at the disease locus increases the chance of getting the disease, and the disease and marker loci are in LD, then the frequencies of the marker alleles M and m will differ between diseased and non-diseased individuals. This observation is the basis of association studies.
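As a small numerical illustration of the definitions above, the sketch below computes the allele frequencies p and q from the four haplotype frequencies, the deviation x1 - pq, and the frequency of M among D-carrying and d-carrying gametes. The function names and the example frequencies are hypothetical, chosen only for illustration.

```python
def ld_summary(x1, x2, x3, x4):
    """Return (p, q, d) from haplotype frequencies of DM, Dm, dM, dm."""
    if abs(x1 + x2 + x3 + x4 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p = x1 + x3          # frequency of marker allele M
    q = x1 + x2          # frequency of disease allele D
    d = x1 - p * q       # deviation from independence (zero under no LD)
    return p, q, d


def marker_freq_by_disease_allele(x1, x2, x3, x4):
    """Return (P(M | D), P(M | d)), the frequency of M on D and on d gametes."""
    return x1 / (x1 + x2), x3 / (x3 + x4)


if __name__ == "__main__":
    # Hypothetical haplotype frequencies, for illustration only.
    print(ld_summary(0.3, 0.1, 0.2, 0.4))
    print(marker_freq_by_disease_allele(0.3, 0.1, 0.2, 0.4))
```

When the deviation is zero, the two conditional frequencies coincide, so the marker carries no information about the disease allele; when it is non-zero they differ, which is what an association study tries to detect.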
6
Case-control studies in genetic epidemiology
Case-control studies compare case and control allele frequencies at markers or candidate genes (the “exposure” variables). All the standard potential drawbacks of such studies apply, with the similarity of the two base populations being the most critical here. It is thought to be relatively easy for samples from racially mixed populations to differ in allele frequencies, and hard to deal with this in the genetic context. Key term: population structure.
If our cases are MS patients, who are our controls? It would be rare for a study to be able to afford or get ethics approval to carry out random sampling of the relevant background population. More commonly, controls are people such as blood donors, whose blood (DNA) has been collected for other purposes. How close will they be to a random sample from the case population?
In an effort to deal with this, the TDT, described next, in effect uses untransmitted genotypes as controls, bypassing any population structure.
7
The transmission-disequilibrium test
The TDT, as it is called, in its simplest form starts with parents and an affected child. It considers a biallelic marker locus at which all three are typed, so that we can determine which maternal and paternal alleles were transmitted, and which were not.
For example, if the parents were a1/a2 and a1/a1 , and the affected offspring was a1/a2 , then a2 was transmitted and a1 was not transmitted by the first parent.
From a random sample of such trios (called triads), a 2×2 table can be built up giving the number of times a1 and a2 were transmitted and were not transmitted, respectively, and a simple test can be derived. Many generalizations of this procedure now exist; see notes for Stat 260, 1998, Week 5.
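The slide does not spell out the test. A common choice, assumed here, is the McNemar-type statistic (b - c)²/(b + c), where b and c count heterozygous parents who transmitted a1 but not a2, and a2 but not a1, respectively. The sketch below tallies transmissions and computes that statistic; the function name and the example counts are hypothetical.

```python
from collections import Counter


def tdt_statistic(transmissions):
    """McNemar-type TDT statistic from (transmitted, untransmitted) allele pairs.

    Each element describes one parent of an affected child. Homozygous
    parents, e.g. ("a1", "a1"), are uninformative and do not contribute.
    """
    counts = Counter(transmissions)
    b = counts[("a1", "a2")]  # a1 transmitted, a2 not transmitted
    c = counts[("a2", "a1")]  # a2 transmitted, a1 not transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (b - c) ** 2 / (b + c)


if __name__ == "__main__":
    # Hypothetical data: 30 parents transmit a2, 10 transmit a1, 20 are homozygous.
    data = [("a2", "a1")] * 30 + [("a1", "a2")] * 10 + [("a1", "a1")] * 20
    print(tdt_statistic(data))
```

Under the null hypothesis of no linkage or association, this statistic is usually referred to a chi-square distribution with one degree of freedom.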
8
What is a haplotype?
• A collection of alleles derived from the same chromosome
[Figure: haplotype reconstruction, from genotypes (chromosome phase is unknown) to haplotypes (chromosome phase is known)]
9
Haplotype mapping
If alleles at a disease locus are associated with alleles at one nearby marker locus on gametes, they are likely to be associated with alleles at other nearby marker loci, and hence with marker haplotypes.
A potentially more powerful way to locate disease genes is to search for associations between marker haplotypes and disease.
There are two possible problems here stemming from the fact that there can be a very large number of marker haplotypes: we may have to deal with very small frequencies, and we have a multiple testing problem.
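To give a feel for the scale of both problems, the sketch below counts the possible haplotypes for a set of markers and applies a Bonferroni correction. Bonferroni is a standard multiple-testing adjustment brought in here for illustration; the slides do not name a specific correction, and the function names and marker counts are hypothetical.

```python
def haplotype_count(alleles_per_marker):
    """Number of distinct haplotypes possible across the given markers."""
    count = 1
    for k in alleles_per_marker:
        count *= k
    return count


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n = haplotype_count([8, 8, 8])  # three markers with 8 alleles each (hypothetical)
    print(n, bonferroni_threshold(0.05, n))
```

With many possible haplotypes, most will be rare in any sample of realistic size, and the per-test threshold becomes very stringent if each haplotype is tested separately.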
10
LD, haplotype mapping and background LD
Searching for common or rare haplotypes in cases alone is one form of association mapping. It has been successful, as very substantial LD can arise around disease loci. In general, controls are necessary, as the background LD can be large.
That is, there can be substantial LD between putative disease gene alleles and alleles of nearby markers, without there being any causal link between the gene and the disease. We call this background LD.
Background LD can be large
– when the population is young
– when the # of founders is small (bottlenecks)
– through admixture of populations
11
Exercises on LD
1. Show that, under a random mating assumption, the long term values of the frequencies x1, x2, x3 and x4 on page 5 above are pq, (1-p)q, p(1-q) and (1-p)(1-q). (Week 5, Stat 260, 1998).
2. Demonstrate that a mixture (e.g. 50:50) of two populations initially in linkage equilibrium at two loci will typically not be in LE.
3. Explain why a single mutant arising by chance will initially be in strong LD with alleles at loci near the locus on which it arises.
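The first two exercises can be checked numerically before attempting a proof. The sketch below assumes the standard result that, under random mating, the deviation D = x1 - pq shrinks by a factor (1 - r) each generation, where r is the recombination fraction between the two loci, and it mixes two populations that are each in linkage equilibrium. All function names and the example frequencies are hypothetical.

```python
def le_haplotypes(p, q):
    """Haplotype frequencies (DM, Dm, dM, dm) under linkage equilibrium."""
    return (p * q, (1 - p) * q, p * (1 - q), (1 - p) * (1 - q))


def ld(x1, x2, x3, x4):
    """Deviation from independence; equals x1 - pq."""
    return x1 * x4 - x2 * x3


def mix(pop_a, pop_b, weight_a=0.5):
    """Haplotype frequencies in a mixture of two populations."""
    return tuple(weight_a * a + (1 - weight_a) * b for a, b in zip(pop_a, pop_b))


def ld_after(d0, r, generations):
    """LD after a number of generations of random mating, starting from d0."""
    return d0 * (1 - r) ** generations


if __name__ == "__main__":
    pop_a = le_haplotypes(0.1, 0.2)  # hypothetical allele frequencies
    pop_b = le_haplotypes(0.9, 0.8)
    mixed = mix(pop_a, pop_b)
    print(ld(*pop_a), ld(*pop_b), ld(*mixed))
    print(ld_after(ld(*mixed), r=0.5, generations=10))
```

Each source population has zero LD, yet the 50:50 mixture does not, because the allele frequencies differ between the populations at both loci. The decay function shows why the long term frequencies approach the products in exercise 1.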
Mapping MS genes in Tasmania
13
Tasmania
Area: 67,800 km²
Population: 470,000
Capital city: Hobart (~200,000)
Tasmanian Population Growth
[Figure: population growth over time, with events 1 to 5 marked]
1 : First settled by Europeans (1803)
2 : 24,000 free settlers, 19,000 convicts (1836)
3 : Civil registration of births and marriages (1838)
4 : End of convict transportation (1853)
5 : “The Gold Rush” (1860’s)
15
Mapping with haplotype sharing
[Figure: timeline from the 1800-1850’s to 2000, spanning 6-8 generations]
Premise: Tasmanians share large(ish) segments of haplotypes because they are distantly related.
Similarly, our MS patients should share these large(ish) segments, but even more so (in size and in number) in regions around MS susceptibility genes.
16
Haplotypes are “eroded” by recombination
[Figure: an ancestral chromosome followed over time/generations/meioses down to present-day MS cases; labelled 25 cM (SD=18)]
Recombination events can help to map genes with precision, but erode haplotypes, making them more difficult to detect.
17
What might have happened in the population?
• A mutation arises in, or is introduced to, a population leading to disease (say MS) in those individuals
• The mutation arises on the background of a unique haplotype
• As this mutation spreads through the population (by chance, or inbreeding) so do remnants of this original haplotype by hitchhiking (linkage disequilibrium)
[Figure: an ancestral susceptibility haplotype passed down over time to present-day MS cases]
Design of the Tasmanian MS study
19
What strategy could be used to map MS susceptibility genes in Tasmania?
• Too few affected sib pairs/multiplex families for a conventional linkage approach
• Prefer a model-free (non-parametric) approach
A haplotype-based case-control study design seemed appropriate.
20
MS study in Tasmania: design
• Collect as many MS cases with ancestral links to Tasmania as possible, and a suitable (not necessarily equal) number of similar, socioeconomically and geographically matched unrelated controls
• Around each case and each control, collect a constellation of ~ 4 close relatives for (probabilistic) haplotype reconstruction
• Infer genome-wide haplotypes for all cases and controls
• Carry out a case/control study with the haplotypes, seeking regions of the genome shared more by the cases, in comparison with the controls
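The last step of the design can be pictured with a toy sharing statistic. The sketch below is not the study's actual method. It assumes haplotypes are given as equal-length lists of alleles at ordered markers, measures for each pair the run of consecutive identical alleles around a marker, and compares the average over case pairs with the average over control pairs. All names and data are hypothetical.

```python
from itertools import combinations


def shared_run(h1, h2, pos):
    """Number of consecutive markers around index pos where h1 and h2 agree."""
    if h1[pos] != h2[pos]:
        return 0
    left = pos
    while left > 0 and h1[left - 1] == h2[left - 1]:
        left -= 1
    right = pos
    while right < len(h1) - 1 and h1[right + 1] == h2[right + 1]:
        right += 1
    return right - left + 1


def mean_sharing(haplotypes, pos):
    """Average shared run length at pos over all pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    return sum(shared_run(a, b, pos) for a, b in pairs) / len(pairs)


def sharing_excess(cases, controls):
    """Case minus control mean sharing at every marker position."""
    n_markers = len(cases[0])
    return [mean_sharing(cases, i) - mean_sharing(controls, i)
            for i in range(n_markers)]


if __name__ == "__main__":
    cases = [[1, 2, 3, 4], [1, 2, 3, 9], [7, 2, 3, 4]]      # toy data
    controls = [[1, 5, 3, 4], [2, 6, 8, 4], [3, 7, 9, 5]]   # toy data
    print(sharing_excess(cases, controls))
```

Positions where the excess is large are candidates for regions shared more by the cases. A real analysis would also need to handle identity by state versus identity by descent, genotyping error, and an assessment of significance.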
21
Analysis options
[Figure: comparisons among four groups of haplotypes. Green: hope to find signal. Red: hope to find nothing.]
• Transmitted Case Haplotypes
• Untransmitted Case Haplotypes
• Transmitted Control Haplotypes
• Untransmitted Control Haplotypes
First mathematical questions
• Resolution of genome-wide scan (length of likely shared chromosomal segments)
• Nature and number of relatives needed to permit the reconstruction of accurate haplotypes with high probability
Average length of shared chromosomal segments
Exercise. Assume the Poisson model for crossovers along a chromosome. What are the mean and variance of the length in cM of the chromosomal segments shared by individuals with a common ancestor 7 generations back?
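One way to approach the exercise is sketched below, under stated assumptions: crossovers form a Poisson process with one expected crossover per 100 cM per meiosis, the shared segment is the one containing a fixed locus, and chromosome ends are ignored. With m meioses separating the chromosomes being compared, the distance to the nearest crossover on each side of the locus is exponential with mean 100/m cM, so the total length is the sum of two such distances. The function name is hypothetical, and how many meioses correspond to "7 generations back" is left as part of the exercise.

```python
import math


def shared_segment_moments(meioses):
    """Mean and variance (in cM, cM^2) of the shared segment around a locus.

    Assumes Poisson crossovers with no interference and no chromosome ends,
    so the length is a sum of two exponentials with rate meioses/100 per cM.
    """
    rate_per_cm = meioses / 100.0
    mean = 2.0 / rate_per_cm
    variance = 2.0 / rate_per_cm ** 2
    return mean, variance


if __name__ == "__main__":
    for m in (8, 14):
        mean, variance = shared_segment_moments(m)
        print(m, mean, math.sqrt(variance))
```

With m = 8 the mean is 25 cM and the standard deviation is about 18 cM, which matches the 25 cM (SD=18) quoted on the haplotype erosion slide.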
Nature and number of relatives needed to give accurate haplotypes
Exercise. Explain why it is that when we have both sets of parental genotypes, and the markers are reasonably polymorphic, we can reconstruct an individual’s haplotypes with high probability. What are the difficult cases?
If we have no parents, or just one parent, and grandparents’, siblings’ or offspring’s genotypes are available, which are most informative for an individual’s haplotype reconstruction?
25
Reconstructing haplotypes from genotypes
• Observe genotyping data for an individual
At marker 1 : (1,3)
At marker 2 : (b,d)
• Reconstruct the haplotype by inferring recombination events from genotypes of relatives
At marker 1 : Mum (1,2) Dad (3,4)
At marker 2 : Mum (a,b) Dad (c,d)
Reconstructed haplotypes:
Marker 1 : 1 | 3
Marker 2 : b | d
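The reasoning in this example can be sketched in code for the simplest setting, where both parents are genotyped. The sketch below is a minimal illustration, not the probabilistic reconstruction used in the study: at each marker it checks which assignment of the child's two alleles to mother and father is consistent with the parental genotypes, and reports the marker as unresolved when both assignments are possible. Function names are hypothetical.

```python
def phase_marker(child, mum, dad):
    """Return (maternal allele, paternal allele), or None if not resolvable."""
    x, y = child
    options = set()
    if x in mum and y in dad:
        options.add((x, y))
    if y in mum and x in dad:
        options.add((y, x))
    if len(options) == 1:
        return options.pop()
    return None  # ambiguous, or inconsistent with the parental genotypes


def phase_trio(child, mum, dad):
    """Phase a child's genotypes marker by marker using both parents."""
    phased = [phase_marker(c, m, d) for c, m, d in zip(child, mum, dad)]
    maternal = [p[0] if p else None for p in phased]
    paternal = [p[1] if p else None for p in phased]
    return maternal, paternal


if __name__ == "__main__":
    child = [("1", "3"), ("b", "d")]
    mum = [("1", "2"), ("a", "b")]
    dad = [("3", "4"), ("c", "d")]
    print(phase_trio(child, mum, dad))
```

For the genotypes on this slide the maternal haplotype comes out as 1-b and the paternal haplotype as 3-d. A marker at which the child and both parents carry the same heterozygous genotype stays unresolved, which is one of the difficult cases.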
26
Genotyping
Use STR (short tandem repeat) markers, also known as microsatellite markers
…AGCTAGCGCGC….GCGCGGCATTA…
…AGCTAGCGCGC….GCGCGGCGCATTA…
Eventual plan: 5 cM genome-wide scan (~ 800 markers) with dinucleotide STRs