

Julia N. Chapman, Alia Kamal, Archith Ramkumar, Owen L. Astrachan
Duke University, Genome Revolution Focus, Department of Computer Science

Sources

http://www.ornl.gov/sci/techresources/Human_Genome/faq/snps.html

http://www.affymetrix.com/corporate/media/genechip_essentials/snps/Tracking_DNA_With_SNPs.affx

Brookes, A. J. (26 February 1999) “The Essence of SNPs.” Department of Genetics and Pathology. (177-186) www.elsevier.com/locate/gene

Reduced Representation Shotgun Sequencing (RRS)

RRS re-samples specific subsets of the genome from several individuals and compares the resulting sequences using a highly accurate SNP detection algorithm.
• Rapid genotyping
• Inexpensive SNP map construction
• 47,172 SNPs have been discovered through RRS

APT

For a link to our APT, go to: www.duke.edu/~ak98

Researchers and Applications

Variations in the DNA sequences of humans can affect how humans develop disease and respond to pathogens, chemicals, drugs, etc.

SNPs are evolutionarily stable: they change little among generations, which makes them easier to track in population studies.

Scientists believe SNP maps will help identify multiple genes associated with cancer, diabetes, vascular disease, and some forms of mental illness. Associations are difficult to establish with conventional gene-hunting methods because a single altered gene may make only a small contribution to the disease.

Several groups worked to find SNPs and ultimately create SNP maps of the human genome:
• U.S. Human Genome Project (HGP)
• SNP Consortium, or TSC project

The likelihood of duplication among the groups was small because of the estimated 3 million SNPs, and the potential payoff was high.

Introduction

Look at the two sequences below. What stands out? The sequences match up on every nucleotide except for the A and the T at position 18 (set in black on the poster).

AAATTTTGGGGGCCCCAAAA

AAATTTTGGGGGCCCCATAA

This nucleotide change in a sequence is known as a Single Nucleotide Polymorphism or a SNP (pronounced "snip").
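The comparison of the two sequences above can be sketched in a few lines of Python. This is only an illustration and not the APT mentioned on the poster; the function name `find_mismatches` is hypothetical, and the sketch assumes the two sequences are already aligned and of equal length.

```python
def find_mismatches(seq_a, seq_b):
    """Return (position, base_a, base_b) for every site where two aligned sequences differ.

    Positions are 1-based. Assumes the sequences are pre-aligned and equal in length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


# The two example sequences from the Introduction.
seq1 = "AAATTTTGGGGGCCCCAAAA"
seq2 = "AAATTTTGGGGGCCCCATAA"
print(find_mismatches(seq1, seq2))  # [(18, 'A', 'T')]
```

A single-site difference like this only counts as a SNP if it is also common enough in the population, which a pairwise comparison alone cannot show.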

SNPs are DNA sequence variations that occur when a single nucleotide (A, T, C, or G) in the genome sequence is altered.

For a variation to be considered a SNP, it must occur in at least 1% of the population.
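The 1% rule can be expressed as a simple frequency check. The sketch below is a simplification under stated assumptions: it takes one observed base per sampled individual at a single site, and the function name `is_snp` and the sample data are hypothetical.

```python
from collections import Counter


def is_snp(alleles, threshold=0.01):
    """Return True if the second most common base at a site reaches the threshold frequency.

    `alleles` holds one observed base per sampled individual (a simplifying assumption).
    """
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return False  # every individual carries the same base: no variation at all
    minor_count = counts[1][1]
    return minor_count / len(alleles) >= threshold


# Hypothetical samples of 1,000 individuals each.
common_variant = ["A"] * 985 + ["T"] * 15  # T seen in 1.5% of the sample
rare_variant = ["A"] * 995 + ["T"] * 5     # T seen in 0.5% of the sample
print(is_snp(common_variant))  # True
print(is_snp(rare_variant))    # False
```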

SNPs, which make up about 90% of all human genetic variation, occur every 100 to 300 bases along the 3-billion-base human genome.
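As a rough back-of-the-envelope check, the spacing quoted above implies a total SNP count in the tens of millions. The calculation below simply divides the genome size by the two spacings; the variable names are illustrative, and the result is an order-of-magnitude estimate rather than a measured count.

```python
GENOME_SIZE = 3_000_000_000  # bases in the human genome, as quoted above

# One SNP every 300 bases gives the lower estimate; one every 100 bases the upper.
low_estimate = GENOME_SIZE // 300
high_estimate = GENOME_SIZE // 100

print(f"{low_estimate:,} to {high_estimate:,} SNPs")  # 10,000,000 to 30,000,000 SNPs
```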

Two of every three SNPs involve the replacement of cytosine (C) with thymine (T). SNPs can occur in both coding (gene) and noncoding regions of the genome.

Summary: “The Essence of SNPs”

The recent surge of interest in Single Nucleotide Polymorphisms is in large part due to potential uses in disease detection and prevention and population genetics. Currently projects such as the HapMap are searching for the best methods of locating and identifying SNPs in the human genome.

As SNPs are discovered they are stored in databases which are growing rapidly and are becoming more and more accessible. (See databases at http://www.genome.wi.mit.edu/SNP/human/index.html and http://www.ibc.wstl.edu/SNP).

By definition, a SNP must occur with a frequency of at least 1% in a population. Because SNPs are this common and change little from generation to generation, they can be used as markers to link populations and study evolutionary patterns.

In conclusion, SNPs have the potential to solve some of genetics’ most vexing problems in the near future.

Single Nucleotide Polymorphisms: The Essence of SNPs

The fact that we inherit our DNA in consistent, predictable blocks is key to understanding how SNPs are used to track down a disease gene. Once a disease-causing mutation occurs in a block of DNA, either by chance or through environmental factors, that mutation is passed on to descendants who inherit that block of DNA generations later. The various SNPs that occur within the block of DNA will also be passed on. So when researchers see a SNP shared by many people who have a disease like autism, but not shared by a group of people who do not have it, they think, "These people share a similar block of inherited DNA, and there may be a disease-causing mutation in that block." In this way, SNPs from an ancestor who might have lived 5,000 years ago can serve as a marker for a disease gene you could have inherited today.
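The reasoning above, a SNP that is common among people with a disease and uncommon among people without it, can be sketched as a comparison of carrier frequencies. The genotype data and the function name `carrier_frequency` are hypothetical, and a real study would use far larger samples and a proper statistical test rather than a raw difference.

```python
def carrier_frequency(genotypes, allele):
    """Fraction of individuals whose genotype at the site includes `allele`."""
    return sum(allele in genotype for genotype in genotypes) / len(genotypes)


# Hypothetical two-letter genotypes at one SNP site (one letter per chromosome copy).
cases = ["AT", "TT", "AT", "AT", "AA", "TT"]     # people with the disease
controls = ["AA", "AA", "AT", "AA", "AA", "AA"]  # people without it

case_freq = carrier_frequency(cases, "T")
control_freq = carrier_frequency(controls, "T")
print(f"cases: {case_freq:.2f}, controls: {control_freq:.2f}")  # cases: 0.83, controls: 0.17
```

A large gap between the two frequencies is what would prompt a closer look at the surrounding block of DNA; on its own it does not show that the SNP causes the disease.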

Regions from ancestral chromosomes that remain intact and are separated by regions of recombinant DNA

Enable geneticists to establish correlations between specific genes and disease

Shared by multiple persons in a given population

Help to determine evolutionary patterns: younger populations tend to have longer haplotypes (less time for recombination)

Fig. 3. The red and blue ancestral chromosomes recombine over many generations. The region A labeled in the red parent chromosome corresponds to the region A in two of the resultant chromosomes. Therefore, if all three share a common disease, a correlation may be established between gene A and the particular disease.

Cost effective: cheaper to locate linked regions of SNPs than to locate each of 10 million+ SNPs

Approx. 300,000 to 600,000 “tag SNPs” that uniquely identify haplotypes
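The idea of tag SNPs, a small set of sites that is enough to tell haplotypes apart, can be illustrated with a toy exhaustive search. This is a sketch under stated assumptions, not the method used to choose the tag SNPs counted above: the haplotypes are made up, every haplotype is assumed to have the same length, and the function name `find_tag_snps` is hypothetical. Searching all subsets is only practical for a handful of sites.

```python
from itertools import combinations


def find_tag_snps(haplotypes):
    """Return the smallest tuple of 0-based site indices that distinguishes every haplotype.

    Tries subsets in increasing size, so the first match is a minimum-size set.
    Returns None if two haplotypes are identical and cannot be told apart.
    """
    n_sites = len(haplotypes[0])
    for size in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), size):
            signatures = {tuple(h[i] for i in sites) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return sites
    return None


# Four hypothetical haplotypes, each listing the base found at five SNP sites.
haplotypes = ["ACGTA", "ACGCA", "TCGTG", "TCACG"]
print(find_tag_snps(haplotypes))  # (0, 3)
```

Here two of the five sites are enough to identify all four haplotypes, which is the saving the cost argument above relies on.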