Human Genome Resources, Chiki Gupta, November 21st, 2005, Biophysics 101




Page 1: Human Genome Resources

Chiki Gupta
November 21st, 2005
Biophysics 101

Page 2: Background on Two Resources

• ENCODE: Encyclopedia of DNA Elements
• OMIM: Online Mendelian Inheritance in Man

Page 3: What is ENCODE?

• Purpose: To identify all functional elements in the human genome sequence
• What are functional elements?
  – Protein-coding genes, regulators, enhancers, DNA sequences that regulate chromosome folding, etc.
• Started in 2003 by the National Human Genome Research Institute (NHGRI); data hosted at the University of California Santa Cruz
• Open Consortium
  – Academic, government and private sector scientists are encouraged to contribute and use the online info

Page 4: The 3 Phases of ENCODE

• Phases 1 & 2: Identify a suite of approaches for a comprehensive identification of functional elements
  – Scope: Only 30 Mb (1%) of a target genome
  – Determination of Target:
    • 50% of the 30 Mb selected manually based on presence of well-studied genes and the existence of a substantial amount of comparative sequence data
    • Remaining 50% selected randomly according to a stratified random-sampling strategy based on gene density and level of non-exonic conservation
• Phase 3: Expand methodology to identify all functional elements
  – Scope: Entire genome
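The stratified random-sampling idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate windows, their `gene_density` and `conservation` scores, and the choice of three bins (terciles) per variable are all hypothetical and are not taken from the ENCODE selection procedure itself.

```python
import random
from collections import defaultdict


def tercile_labels(values):
    """Label each value 0, 1 or 2 (low, medium, high) by its rank tercile."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = min(2, rank * 3 // len(values))
    return labels


def stratified_sample(regions, per_stratum, seed=0):
    """Pick up to `per_stratum` regions at random from each
    (gene density, conservation) stratum."""
    rng = random.Random(seed)
    density = tercile_labels([r["gene_density"] for r in regions])
    conserv = tercile_labels([r["conservation"] for r in regions])

    # Group regions by their pair of bin labels, e.g. (0, 2) = low density,
    # high conservation.
    strata = defaultdict(list)
    for region, d, c in zip(regions, density, conserv):
        strata[(d, c)].append(region)

    chosen = []
    for key in sorted(strata):
        members = strata[key]
        chosen.extend(rng.sample(members, min(per_stratum, len(members))))
    return chosen


# Hypothetical candidate windows with made-up scores (assumption, not real data).
_rng = random.Random(42)
candidates = [
    {"name": f"window_{i}",
     "gene_density": _rng.random(),
     "conservation": _rng.random()}
    for i in range(90)
]

picked = stratified_sample(candidates, per_stratum=1)
print(len(picked), "windows chosen across the strata")
```

Sampling within each stratum, rather than from the pool as a whole, keeps gene-poor or weakly conserved regions from being crowded out of the random half of the target.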

Page 5: The 3 Phases of ENCODE

1. Pilot Project Phase: Analysis of a set of representative regions
2. Technology Development Phase: Develop new high-throughput methods to identify functional elements for target region
3. Planned Production Phase: Scale up to analyze the entire human genome and to find gaps in our ability to identify functional elements in genomic sequence.

Page 6: ENCODE-HapMap Coordination

• International HapMap Project: Focuses on 10 ENCODE random regions for an in-depth study of human genetic variation.
  – Goal: This data will serve as the “gold standard” data set because of the high density of SNP coverage.
• Methodology of ENCODE data production:
  – Generate sequencing information from a number of different genomes
  – Perform comparative analysis to extract maximum amount of information about the human genome
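As a rough illustration of the comparative-analysis step, the sketch below scores each column of a small multiple alignment by how many other species match the human base, then reports runs of fully conserved columns. The alignment, the species set, the function names, and the thresholds are all invented for this example; real ENCODE analyses use far more elaborate conservation models.

```python
def column_identity(alignment, reference="human"):
    """For each aligned column, return the fraction of non-reference species
    whose base matches the reference base (a reference gap never matches)."""
    ref = alignment[reference]
    others = [seq for name, seq in alignment.items() if name != reference]
    scores = []
    for i, base in enumerate(ref):
        matches = sum(1 for seq in others if base != "-" and seq[i] == base)
        scores.append(matches / len(others))
    return scores


def conserved_runs(scores, threshold=1.0, min_length=3):
    """Return half-open (start, end) intervals of at least `min_length`
    consecutive columns whose score is at least `threshold`."""
    runs, start = [], None
    for i, s in enumerate(scores + [-1.0]):  # sentinel closes a final run
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment (made-up sequences, all the same length; "-" marks a gap).
toy_alignment = {
    "human":   "ACGTTGCA-ACGT",
    "chimp":   "ACGTTGCA-ACGT",
    "dog":     "ACGTAGCATACGA",
    "chicken": "ACGTCGGA-ACTT",
}

scores = column_identity(toy_alignment)
print(conserved_runs(scores))  # [(0, 4)]
```

Only the first four columns are identical in all four species for three or more columns in a row, so they form the single conserved run reported.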

Page 7: Example with ENCODE

• http://genome.ucsc.edu/ENCODE/encode.hg17.html
  – Lots of linked/annotated information!
  – Of particular interest to us: SNPs, Recombination Hotspots, Repeats, Introns, Splicing Locations, Conserved sequences from chimps, dogs, chicken, etc.

Page 8: What is OMIM?

• What is it?
  – Catalog of human genes and genetic disorders
  – Developed at Johns Hopkins University
• Strength:
  – Information regarding diseases and affected proteins is linked and readily accessible.
• Weakness: Less useful for our project.
  – We’re concerned more with comparing genetic sequences, not necessarily with the details of various human diseases (especially if we model using bacterial genome)

Page 9: Example with OMIM

• Alzheimer’s Disease
  – Gene for Microtubule affinity-regulating kinase
  – http://www.ncbi.nlm.nih.gov/Omim/getmap.cgi?chromosome=alzheimer&first=+Find+&start=0
  – Clicking on OMIM under “Summary of Maps” provides detailed information regarding the function of the specified gene region.
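To see what the OMIM link above actually asks for, its query string can be decoded with Python's standard library. This sketch only parses the URL text from the slide and makes no network request. The slides do not document what each parameter means, and an address from 2005 may no longer resolve.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the slide; it is parsed as text only, never fetched.
url = ("http://www.ncbi.nlm.nih.gov/Omim/getmap.cgi"
       "?chromosome=alzheimer&first=+Find+&start=0")

parts = urlsplit(url)
params = parse_qs(parts.query)  # "+" in a query string decodes to a space

print(parts.netloc)  # www.ncbi.nlm.nih.gov
print(parts.path)    # /Omim/getmap.cgi
for name, values in params.items():
    print(name, "=", repr(values[0]))
```

The search term `alzheimer` travels in the `chromosome` field, and `first` decodes to the text `' Find '`.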