human encodeproject
TRANSCRIPT
![Page 1: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/1.jpg)
Encyclopedia Of DNA Elements
A consortium of 440 scientists, 32 laboratories
Sucheta Tripathy, IICB, 17th Sept. 2012
![Page 2: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/2.jpg)
http://www.nature.com/encode/
http://www.encodeproject.org/ENCODE/
http://www.factorbook.org/
http://encodeproject.org/ENCODE/dataStandards.html
http://1000genomes.org
http://genome.ucsc.edu/ENCODE/
Some of the useful links:
![Page 3: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/3.jpg)
http://www.gencodegenes.org/data.html
![Page 4: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/4.jpg)
http://homes.gersteinlab.org/people/rar62/subwaymap/SubwayMap8_16_12.pdf
Characterization of intergenic region and gene definition
![Page 5: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/5.jpg)
http://homes.gersteinlab.org/people/rar62/subwaymap/SubwayMap8_16_12.pdf
![Page 6: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/6.jpg)
A Road map
October 1990: Human Genome Project started
2000: First publication
2003: Finished paper
NHGRI solicited pilot proposals for ENCODE
2005: GWAS, 90% lies outside coding regions
2007: First report on ENCODE published
RFAs were sought for full ENCODE
2012: ENCODE published
![Page 7: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/7.jpg)
http://www.nature.com/nature/journal/v489/n7414/full/489049a.html
![Page 8: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/8.jpg)
“It is like Google Maps,” says Eric Lander: a map of the Earth from outer space.
Treasure Hunt?
![Page 9: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/9.jpg)
95% of the genome is “junk”.
◦ 2.94% of the genome is coding
cis-regulatory elements occur within a limited genomic distance.
Most of the genome consists of transposable elements of obscure origin that are decaying.
Transcribed elements are more often translated than not.
What we knew
![Page 10: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/10.jpg)
80% of the human genome is active!!
◦ 70,000 promoters and 400,000 enhancers
75% of the genome is transcribed in some tissue or other during a lifetime.
The environment plays a great role in switching many genes on or off. [Epigenetics]
Most diseases lie not with the genes but with the switches!!
The “dark matter” elements controlling the genes are physically close to the genes they control.
Key Findings:
![Page 11: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/11.jpg)
Genes and switches don’t have a one-to-one relationship!
4 million switches controlling 21,000 genes!!
Identical twins are NOT identical: they are greatly influenced by their environments.
Astronomy and genetics look similar (95% of the Universe is called dark matter, which we don’t understand).
Key Findings:
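As a rough back-of-the-envelope check on the "no one-to-one relationship" point, the two figures quoted on this slide can be divided directly. This is only an illustrative average; the variable names are made up for the sketch, and it says nothing about how switches are actually distributed across genes.

```python
# Figures quoted on the slide; the variable names are illustrative only.
switches = 4_000_000
genes = 21_000

# Simple average, ignoring that real switches are unevenly distributed
# and that one switch may act on several genes.
switches_per_gene = switches / genes
print(f"about {switches_per_gene:.0f} switches per gene on average")
```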
![Page 12: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/12.jpg)
“This explains why 6.5 billion people on earth don’t look alike.”
Intelligent Design (Creationism) believers are excited that it is the handiwork of God.
Natural selectionists (Darwinists) are excited that it is natural selection at its best.
◦ This has sparked a war between Democrats and Republicans as usual.
Junk DNA is an “oxymoron”.
Some are still wondering about the remaining 20%.
Who said What (common people)
![Page 13: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/13.jpg)
‘I hope this information stirs the mind of those researchers that have ignored "trace minerals" in food as part of the nutritional package’.
The more we think we are close to finding an answer, the farther from it we find ourselves. Reminds me of Aristotle, who once said “The more you know, the more you know you don't know”.
Who said What Contd…
![Page 14: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/14.jpg)
Most of the DNA was considered “garbage” but later upgraded to “junk”.
Most people are actually happy because it is happening during their lifetime.
Switches are software and genes are hardware.
Ancient Egyptians considered the torso to have a divine role and discarded the grey matter in the head as “junk”.
Historically “Junk” Vs “Garbage”
![Page 15: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/15.jpg)
Sean Eddy “At least 40% of the human genome is composed of the decaying DNA remains of transposable elements (TEs), different species of which have replicated in great waves during the evolution of our genome.”
“I sure wish I’d gotten the memo, because this week a collaboration of labs led by myself, Arian Smit, and Jerzy Jurka just released a new data resource that annotates nearly 50% of the human genome as transposable element-derived, and transposon-derived repetitive sequence is the poster child for what we colloquially call “junk DNA”.”
http://cryptogenomicon.org/
Some people are upset
![Page 16: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/16.jpg)
PLoS Biol. 2011 April; 9(4): e1001046.
![Page 17: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/17.jpg)
![Page 18: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/18.jpg)
![Page 19: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/19.jpg)
| Cell Type | Tier | Description | Source |
|---|---|---|---|
| GM12878 | 1 | B-Lymphoblastoid cell line | Coriell GM12878 |
| K562 | 1 | Chronic Myelogenous/Erythroleukemia cell line | ATCC CCL-243 |
| H1-hESC | 1 | Human Embryonic Stem Cells, line H1 | Cellular Dynamics International |
| HepG2 | 2 | Hepatoblastoma cell line | ATCC HB-8065 |
| HeLa-S3 | 2 | Cervical carcinoma cell line | ATCC CCL-2.2 |
| HUVEC | 2 | Human Umbilical Vein Endothelial Cells | Lonza CC-2517 |
| Various (Tier 3) | 3 | Various cell lines, cultured primary cells, and primary tissues | Various |
PLoS Biol. 2011 April; 9(4): e1001046.
The Cell Types
![Page 20: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/20.jpg)
DNase I -> Transcription factor binding sites (2.9 million sites, one third in a single cell type and the rest in others)
ChIP-seq -> Sequencing of transcription factor and histone binding sites (HeLa and GM12878 differ enough to qualify as new species)
5C technology -> Finding proximity between regulatory and regulated regions
High-density 5 bp tiling DNA microarrays
The Experiments
![Page 21: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/21.jpg)
Cap Analysis of Gene Expression (CAGE)
Paired-End diTag (PET)
Reduced Representation Bisulphite Sequencing (RRBS)
Contd.
![Page 22: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/22.jpg)
33.45% exon and 66.55% intron.
62% of the genome is transcribed reproducibly.
231 Mb of the genome has protein binding sites.
◦ 80% of which are low affinity sites (http://www.factorbook.org/)
◦ Many are highly conserved and cell-type selective
96% of the CpGs exhibited differential methylation patterns.
GWAS SNPs overlapped with ENCODE elements.
The Main Nature paper
![Page 23: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/23.jpg)
Chromosome conformation capture carbon copy (5C)
◦ 1% of the genome is distally regulated (>1000 bp)
◦ On average, 3.9 distal elements interacted with a TSS.
◦ Distances could range from several kb to Mb
Chromosome Interacting regions
Sanyal et al., Nature 489, 109–113 (06 September 2012)
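To make the "distal elements per TSS" figure concrete, here is a minimal sketch of how such an average could be computed from a list of contacts. It is not the method of Sanyal et al.; the function name, the input format (pairs of positions on one chromosome), and the toy data are all assumptions for illustration. Only the 1000 bp cutoff comes from the slide.

```python
from collections import defaultdict

def mean_distal_partners(interactions, min_distance=1000):
    """Average number of distinct distal elements contacting each TSS.

    interactions: iterable of (tss_position, element_position) pairs,
    assumed to lie on the same chromosome (hypothetical input format).
    Only TSSs with at least one distal contact are counted.
    """
    partners = defaultdict(set)
    for tss, element in interactions:
        # Keep only contacts farther apart than the slide's 1000 bp cutoff.
        if abs(tss - element) > min_distance:
            partners[tss].add(element)
    if not partners:
        return 0.0
    return sum(len(p) for p in partners.values()) / len(partners)

# Toy contacts, invented for the example.
toy = [(5_000, 5_400), (5_000, 90_000), (5_000, 250_000), (70_000, 10_000)]
mean_partners = mean_distal_partners(toy)
print(mean_partners)
```

In the toy data the first contact is only 400 bp away and is dropped as non-distal, so one TSS keeps two partners and the other keeps one.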
![Page 24: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/24.jpg)
cis-regulatory elements - Enhancers, promoters, insulators, silencers.
2.9 million DHSs across 125 diverse cell and tissue types.
DHSs of 20-50 bp length mapped uniquely to 86.9% of the genome
◦ 580,000 distal DHSs with target promoters
◦ 3% lie in TSS
◦ 5% lie within 2.5 kb of TSS
◦ 95% lie distally (introns and intergenic regions)
◦ Strongly enriched in LTRs
DNase Hypersensitive Site studies
Thurman et al., Nature 489, 75–82
http://www.nature.com/nature/journal/v489/n7414/full/nature11232.html
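The TSS / within 2.5 kb / distal split above can be sketched as a distance-to-nearest-TSS lookup. This is a simplified illustration, not the pipeline used by Thurman et al.: the function name, the use of a single midpoint per DHS, treating "in TSS" as an exact position match, and the toy coordinates are all assumptions. Only the 2.5 kb window comes from the slide.

```python
from bisect import bisect_left

def classify_dhs(dhs_midpoint, tss_positions, proximal_window=2500):
    """Label one DHS by its distance to the nearest TSS on the same chromosome.

    tss_positions must be a non-empty, sorted list of integer coordinates.
    """
    i = bisect_left(tss_positions, dhs_midpoint)
    # The nearest TSS is either just before or just after the insertion point.
    neighbours = tss_positions[max(i - 1, 0):i + 1]
    distance = min(abs(dhs_midpoint - t) for t in neighbours)
    if distance == 0:
        return "TSS"
    if distance <= proximal_window:
        return "proximal"
    return "distal"

# Toy TSS coordinates and DHS midpoints, invented for the example.
tss = [10_000, 50_000, 120_000]
labels = [classify_dhs(p, tss) for p in (10_000, 11_500, 80_000)]
print(labels)
```

A real analysis would work with DHS intervals and strand-aware TSS annotations; the sorted-list lookup is used here only because it keeps the sketch short.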
![Page 25: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/25.jpg)
Three quarters of the genome is capable of transcription: redefine the concept of a gene?
◦ 62.1% and 74.7% of the genome is covered by processed and primary transcripts, respectively.
◦ 10-12 expressed isoforms per gene per cell.
◦ Coding and non-coding transcripts are localized in the cytoplasm and nucleus respectively.
◦ 6% of the coding and non-coding transcripts overlap with small RNAs: precursors?
◦ Most of the novel transcripts lacked protein coding ability.
Landscape of Transcription
Djebali et al., Nature 489, 101–108
http://www.nature.com/nature/journal/v489/n7414/full/nature11233.html
![Page 26: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/26.jpg)
The mapping job is only half done.
Characterizing everything a genome does is 10% done.
Finding the network of switches for genes.
A number of correlations...
What is yet to be done
![Page 27: Human encodeproject](https://reader035.vdocuments.net/reader035/viewer/2022070318/557c16d8d8b42a22218b45f0/html5/thumbnails/27.jpg)
Where does gene therapy go from here?
Is our fundamental understanding of genes as the functional units flawed?
Epigenetics becomes the key player...
Gives impetus to a holistic approach in treating a disease.
Do we still believe that the human genome is the most efficient?
Future Implications: