![Page 1: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/1.jpg)
Introduction to Sequencing Technologies
August 22, 2014 Manpreet Katari
![Page 2: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/2.jpg)
Why do we sequence?
• Genome Annotation: A complete genome sequence provides us with the raw data to
construct a "parts list".
• Comparative Genomics: Conserved regions in the genome are more likely to play an
important role in biology of the species.
• Functional Genomics: Sequencing the RNA provides us with an insight into the
transcriptionally active regions of the genome.
• Population Genetics and Genomics: Genetic structure and diversity reveal the history and distribution of
phenotypic traits (e.g. disease susceptibility alleles).
• Genetic Analysis:
Map and characterize the molecular basis of allelic variants.
![Page 3: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/3.jpg)
Examples of Large Genome Projects
• 1000 Genomes Project (www.1000genomes.org): An effort to sequence the genomes of 1000 people to identify genetic variants that occur in at least 1% of the human population.
• 1001 Arabidopsis thaliana Genomes Project (www.1001genomes.org): Studies the genomes and phenotypes of 1001 accessions to explain differences in phenotype caused by adaptation to different conditions.
• Metagenomics – Human Microbiome Project (http://commonfund.nih.gov/hmp/): Sequencing of DNA samples from environments such as the mouth, skin, and digestive system to identify the different bacterial species present.
![Page 4: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/4.jpg)
Evolution of Sequencing Technology
• 1977: Sanger sequencing (manual)
– Throughput: tens of thousands of bp per person-year; ~400-500 bp/run
• 1986: Automated Sanger sequencing; Lee Hood (first semi-automated machine)
– 99.999% accuracy; $0.50/kilobase
– Throughput: thousands of bp per run (multiplexed), ~millions of bp per year; ~700-800 bp/run
• 2000's: Cyclic array sequencing (sequencing by synthesis, sequencing by hybridization)
– Throughput: millions of bp per run
![Page 5: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/5.jpg)
Sanger DNA Sequencing
CHAIN TERMINATOR
⇒ A 3’ hydroxyl group is essential for chain elongation; a dideoxynucleotide (ddNTP) lacks it, so its incorporation terminates the chain
![Page 6: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/6.jpg)
Sanger DNA Sequencing
• Primer + denatured template
• Add dNTPs, polymerase, ddNTP
• Template + product (labeled strands)
• Run samples on denaturing polyacrylamide gel
• 32P-labeled autoradiogram
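The chain-termination idea above can be sketched in a few lines of Python. This is an idealized illustration rather than a model of the real chemistry: the template string and the function names (`synthesized_strand`, `terminated_fragments`, `read_gel`) are hypothetical, and every possible termination product is assumed to be present exactly once.

```python
# Idealized sketch of Sanger chain termination (illustrative names and data).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3to5):
    """Strand the polymerase would build (5'->3') on a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminated_fragments(template_3to5):
    """Group fragment lengths by the ddNTP that ended them.

    Returns one 'lane' per ddNTP, holding the lengths of the fragments
    that terminated with that base.
    """
    product = synthesized_strand(template_3to5)
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(product, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering fragments from shortest to longest."""
    by_length = {length: base for base, lengths in lanes.items() for length in lengths}
    return "".join(by_length[length] for length in sorted(by_length))


template = "GACTGAAGC"  # hypothetical template, written 3'->5'
lanes = terminated_fragments(template)
print(lanes)
print(read_gel(lanes))  # the newly synthesized strand, 5'->3'
```

Reading the lanes from the shortest fragment upward recovers the synthesized strand, which is the complement of the template.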
![Page 7: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/7.jpg)
Capillary Gel Electrophoresis
⇒ Fluorescent dye vs. radioactive label on dNTPs
⇒ Sequencing reaction is performed in a single tube with a mixture of fluorescently labeled ddNTPs
⇒ Reaction is electrophoresed in a single denaturing capillary gel (96 samples run at once using robotics)
⇒ Different wavelengths emitted by fluorescent dyes are automatically detected upon laser excitation
⇒ Computer software automatically reads sequence and assesses quality
[Figure labels: ssDNA to be sequenced 5'- C G A A G T C A G -3'; 3'- G A C T G A A G C T G T T -5'; Laser; Detector; Trace file]
![Page 8: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/8.jpg)
Radioactive vs. Fluorescent Sequencing
Radioactive: • 32P-labeled dNTPs • One lane per base • Autoradiogram
Fluorescent: • 4 different fluorophores • Single capillary gel • Laser detector
![Page 9: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/9.jpg)
Automated Sequencing
• Perhaps the most important contribution to large-scale sequencing was the development of automated sequencers.
• The industry is currently transitioning from Sanger sequencers to second-generation cycle sequencers for large-scale applications.
• Automated Sanger sequencers still fill an important niche,
• but mostly for small-scale applications, such as checking that clones contain the expected sequence
• Next-gen platforms are currently very expensive,
• but table-top models will probably be standard laboratory equipment in 5-10 years.
![Page 10: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/10.jpg)
Automated Sequencers
MegaBACE • Made by Amersham • 96 capillaries • Robotic loading from 384-well plate • Two to four hours per run • Up to 800 bases per read
ABI 3700 • Made by Applied Biosystems • Most widely used • 96 capillaries • Robotic loading from 384-well plates • Two to three hours per run • 600–700 bases per read
![Page 11: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/11.jpg)
ABI traces
![Page 12: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/12.jpg)
Video Demos
Sanger Sequencing - http://youtu.be/nudG0r9zL2M
![Page 13: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/13.jpg)
Workflow of conventional vs. second-generation sequencing
High-throughput shotgun Sanger sequencing
Cyclic array shotgun sequencing
96 or 384 long reads per run
Millions of short reads per run
Template immobilization Sanger cycle seq
(Template amplification)
Template amplification
Capillary electrophoresis
Seq by synthesis or hybridization
![Page 15: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/15.jpg)
Cost of Sequence per megabase
![Page 16: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/16.jpg)
Template Immobilization Strategies
![Page 17: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/17.jpg)
Illumina
Figure from M. Metzker, Nat Rev Genet, Jan. 2010
![Page 18: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/18.jpg)
Video Demos
Illumina - http://youtu.be/womKfikWlxM
![Page 19: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/19.jpg)
PacBio
Figure from M. Metzker, Nat Rev Genet, Jan. 2010
![Page 20: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/20.jpg)
Video Demos
PacBio - http://youtu.be/NHCJ8PtYCFc
![Page 21: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/21.jpg)
Road to Human Genome
J. Craig Venter, Celera Genomics
Francis Collins, Human Genome Project
![Page 22: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/22.jpg)
Map-based sequencing I
• Human Genome Project adopted a map-based strategy
– Start with well-defined physical map
– Produce shortest tiling path for large-insert clones
– Assemble the sequence for each clone
– Then assemble the entire sequence, based on the physical map
![Page 23: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/23.jpg)
Map-based sequencing II
Construct clone map and select mapped clones
Generate several thousand sequence reads per clone
Assemble
![Page 24: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/24.jpg)
Physical mapping
• Determination of physical distance between two points on chromosome
– Distance in base pairs
• Example: between physical marker and a gene
• Need overlapping fragments of DNA
– Requires vectors that accommodate large inserts
• Examples: cosmids, YACs, and BACs
![Page 25: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/25.jpg)
BACs and PACs
• BACs and PACs
– Most commonly used vectors for large-scale sequencing
– Good compromise between insert size and ease of use
– Growth and isolation similar to that for plasmids
![Page 26: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/26.jpg)
Contigs
• Contigs are groups of overlapping pieces of chromosomal DNA
– Make contiguous clones
• For sequencing one wants to create a “minimum tiling path”
– Contig of the smallest number of inserts that covers a region of the chromosome
[Figure labels: genomic DNA; contig; minimum tiling path]
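Choosing a minimum tiling path can be viewed as an interval-cover problem. The sketch below uses a standard greedy strategy and assumes each clone's start and end position on the physical map is already known; the clone names and coordinates are hypothetical, and `minimum_tiling_path` is an illustrative name, not a function from any mapping tool.

```python
def minimum_tiling_path(clones, region_start, region_end):
    """Greedy interval cover: fewest clones spanning [region_start, region_end).

    `clones` is a list of (name, start, end) tuples in map coordinates.
    """
    path = []
    covered_to = region_start
    remaining = sorted(clones, key=lambda clone: clone[1])
    i = 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that reaches furthest to the right.
        while i < len(remaining) and remaining[i][1] <= covered_to:
            if best is None or remaining[i][2] > best[2]:
                best = remaining[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best)
        covered_to = best[2]
    return path


# Hypothetical clones with map coordinates in kb.
clones = [("A", 0, 120), ("B", 50, 200), ("C", 100, 260),
          ("D", 180, 300), ("E", 250, 400)]
tiling = minimum_tiling_path(clones, 0, 400)
print([name for name, _, _ in tiling])
```

With these made-up coordinates, three of the five clones are enough to span the region; the other two are redundant and need not be sequenced.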
![Page 27: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/27.jpg)
Contigs from overlapping restriction fragments
• Cut inserts with restriction enzyme
• Look for similar pattern of restriction fragments
– Known as “fingerprinting”
• Line up overlapping fragments
• Continue until a contig is built
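As a rough sketch of the fingerprinting idea, two clones can be compared by counting restriction fragments of matching size. Everything here is a simplifying assumption: the fragment sizes are invented, the 5 bp tolerance and the 50% threshold are arbitrary, and real tools generally use a statistical score rather than a fixed fraction.

```python
def shared_bands(fp_a, fp_b, tolerance=5):
    """Count bands in fp_a that have an unused match in fp_b within `tolerance` bp."""
    unused = sorted(fp_b)
    shared = 0
    for size in sorted(fp_a):
        for candidate in unused:
            if abs(candidate - size) <= tolerance:
                unused.remove(candidate)  # each band may be matched only once
                shared += 1
                break
    return shared


def likely_overlap(fp_a, fp_b, min_fraction=0.5, tolerance=5):
    """Call an overlap if enough bands of the smaller fingerprint are shared."""
    smaller = min(len(fp_a), len(fp_b))
    if smaller == 0:
        return False
    return shared_bands(fp_a, fp_b, tolerance) / smaller >= min_fraction


# Hypothetical fingerprints: restriction fragment sizes in bp.
clone1 = [500, 1200, 2300, 4100, 5200]
clone2 = [1198, 2303, 4100, 6100, 7500]
clone3 = [300, 900, 3500, 8000]

print(shared_bands(clone1, clone2), likely_overlap(clone1, clone2))
print(shared_bands(clone1, clone3), likely_overlap(clone1, clone3))
```

Clones that pass the test would be lined up next to each other, and repeating the comparison across the library extends the contig.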
![Page 28: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/28.jpg)
Gel image processing
![Page 29: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/29.jpg)
FPC: fingerprint analysis window
![Page 30: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/30.jpg)
The C. elegans genome project
The first multicellular organism to have its genome fully sequenced (97 million bases). The sequence was completed in 1998.
⇒ The minimum tiling path, or “The Golden Path”
![Page 31: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/31.jpg)
Mapless sequencing
• Alternative solution: fragment entire genome
– Sequence each fragment
– Assemble overlapping sequences to form contiguous sequence
• Focus on principles and techniques of mapping and sequencing
![Page 32: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/32.jpg)
Whole-genome shotgun sequencing I
• Developed by Celera
– Subsidiary of Applied Biosystems, maker of automated sequencers
• No mapping
• Instead, the whole genome is sheared
• Randomly sequenced
![Page 33: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/33.jpg)
Whole-genome shotgun sequencing II
Generate tens of millions of sequence reads
Assemble
![Page 34: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/34.jpg)
Whole-genome shotgun sequencing III
• Major challenge: assembly
– Repetitive elements are the biggest problem
• Performed on very high-speed computers, using novel software
• Key to assembly is paired reads
– Sequence both ends of each clone
![Page 35: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/35.jpg)
Milestone: 26 June 2000 - White House press conference with Bill Clinton
HGP:
• Started 1990
• ~22.1 billion nucleotides of sequence data
• 7-fold coverage
• Unfinished (24% completely finished, 50% near-finished)
Celera:
• Started 1998
• ~14.5 billion nucleotides of sequence data
• 4.6-fold coverage
• Complete assembled genome with >99% coverage
First assembled draft of human genome was simultaneously published in Nature & Science, 15 & 16 February 2001 (Nature published 1 day earlier).
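The coverage figures on this slide can be sanity-checked with simple arithmetic: fold coverage is total sequenced bases divided by genome size, so dividing the bases by the coverage gives the genome size each project's numbers imply. The function name below is illustrative; the inputs are the figures quoted on the slide.

```python
def implied_genome_size(total_bases, fold_coverage):
    """Genome size implied by: fold coverage = total bases / genome size."""
    return total_bases / fold_coverage


hgp_size = implied_genome_size(22.1e9, 7.0)      # HGP figures from the slide
celera_size = implied_genome_size(14.5e9, 4.6)   # Celera figures from the slide

print(f"HGP:    {hgp_size / 1e9:.2f} billion bases")
print(f"Celera: {celera_size / 1e9:.2f} billion bases")
```

Both sets of figures point to a genome of a little over 3 billion bases, so the two coverage values are consistent with each other.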
![Page 36: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/36.jpg)
© 2005 Prentice Hall Inc. / A Pearson Education Company / Upper Saddle River, New Jersey 07458
Hierarchical vs shotgun
![Page 37: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/37.jpg)
Hybrid approach
• Combines aspects of both map-based and whole-genome shotgun approaches
– Map clones
– Sequence some of the mapped clones
– Do whole-genome sequencing
– Combine information from both methods
• Use sequence from mapped clones as scaffold to assemble whole-genome shotgun reads
• Used for sequencing the mouse genome
![Page 40: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/40.jpg)
Genome Assembly
![Page 41: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/41.jpg)
Whole Genome Shotgun Sequencing
![Page 42: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/42.jpg)
Comparison of overlap graph and de Bruijn graph for assembly
Schatz et al., Genome Research 2010, Assembly of large genomes using second-generation sequencing
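To make the contrast concrete, the sketch below builds both graph types from the same toy reads. It is a minimal illustration under strong assumptions (error-free reads, a single strand, exact overlaps); the reads, `k`, and the function names are hypothetical and not taken from any assembler.

```python
from collections import defaultdict


def overlap_graph(reads, min_overlap):
    """Nodes are reads; edge (a, b) if a suffix of a equals a prefix of b.

    Stores the longest exact overlap of at least `min_overlap` bases.
    """
    edges = {}
    for a in reads:
        for b in reads:
            if a == b:
                continue
            for length in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
                if a[-length:] == b[:length]:
                    edges[(a, b)] = length
                    break
    return edges


def de_bruijn_graph(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in graph.items()}


reads = ["ATGGC", "GGCGT", "CGTGC"]  # toy reads
print(overlap_graph(reads, min_overlap=3))
print(de_bruijn_graph(reads, k=3))
```

The overlap graph needs an all-against-all read comparison, whereas the de Bruijn graph is built in one pass over the k-mers, which is one reason it is attractive for millions of short reads.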
![Page 43: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/43.jpg)
Example of Tour Bus error correction.
Zerbino DR and Birney E. Genome Res. 2008;18:821-829
Copyright © 2008, Cold Spring Harbor Laboratory Press
![Page 44: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/44.jpg)
End Reads (Mates)
Primer
Central steps of the assembly
![Page 45: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/45.jpg)
Finishing I
• Process of assembling raw sequence reads into accurate contiguous sequence
– Required to achieve an error rate of no more than 1 in 10,000 bases
• Manual process
– Look at sequence reads at positions where programs can’t tell which base is the correct one
– Fill gaps
– Ensure adequate coverage
[Figure labels: Gap; Single stranded]
![Page 46: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/46.jpg)
Finishing II
• To fill gaps in sequence, design primers and sequence from primer
• To ensure adequate coverage, find regions where there is not sufficient coverage and use specific primers for those areas
[Figure labels: Gap; Primer]
![Page 47: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/47.jpg)
Assembly Progression (Macro View)
Each nucleotide sequenced many times
![Page 48: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/48.jpg)
Lander-Waterman Model
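The slide gives only the model's name, so the following is a hedged sketch of its best-known results rather than the lecture's own derivation. It assumes reads of equal length placed uniformly at random and no minimum overlap requirement; under those assumptions coverage is c = N·L/G, the expected fraction of the genome covered is about 1 − e^(−c), and the expected number of contigs is about N·e^(−c). The function names and example numbers are illustrative.

```python
import math


def coverage(num_reads, read_length, genome_size):
    """Fold coverage c = N * L / G."""
    return num_reads * read_length / genome_size


def fraction_covered(c):
    """Expected fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-c)


def expected_contigs(num_reads, c):
    """Expected number of contigs, ignoring any minimum overlap."""
    return num_reads * math.exp(-c)


genome_size = 1_000_000   # hypothetical genome
read_length = 500         # hypothetical read length

for num_reads in (2_000, 6_000, 10_000, 16_000):
    c = coverage(num_reads, read_length, genome_size)
    print(f"c = {c:.0f}x  covered = {fraction_covered(c):.3f}  "
          f"contigs = {expected_contigs(num_reads, c):.1f}")
```

The gain from each extra fold of coverage shrinks quickly, which is why low-coverage sequencing already recovers most of a genome while closing the last gaps takes targeted finishing.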
![Page 49: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/49.jpg)
Rough-draft and skimming sequence
• Rough-draft sequence refers to an average of 5x coverage
• Skimming is 1–3x coverage
• Obtains 67%–97% of the sequence
• On average, 99% accurate
• Of greatest use when one can compare the sequence to a reference sequence
• For example, chimpanzee genome compared with human genome
![Page 50: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/50.jpg)
[Figure: DNA → RNA → protein → phenotype; cDNA is made from RNA]
[1] Transcription
[2] RNA processing (splicing)
[3] RNA export
[4] RNA surveillance
![Page 51: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/51.jpg)
The Expressed Genome
• Sequencing mRNA:
– gene discovery
– define gene structures
– define differences in mRNA processing
   • alternative transcription start sites
   • alternative exons and splice junction usage
   • alternative polyA site usage
• mRNA profiling
– characterize functional differences between developmental stages and tissues
• Small RNA profiling:
– discover small RNAs, e.g. miRNA, siRNA, piRNA, etc.
– these often have regulatory functions
![Page 52: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/52.jpg)
EST sequencing
• Extract RNA from different developmental stages and tissues
• Make cDNA library
• Select clones at random
• Sequence in from one or both ends
• One-pass sequencing
• The resulting sequence = expressed sequence tag (EST)
[Figure labels: Muscle mRNA; cDNA libraries; Robotic stations; DNA sequencers; LIMS; cDNA 5’ to 3’; Partial sequence = EST]
![Page 53: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/53.jpg)
ESTs Mapped to Genome
![Page 54: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/54.jpg)
EST sequencing: pros and cons
• Advantages
• Relatively inexpensive
• Certainty that sequence comes from transcribed gene
• Information about tissue and developmental stage
• Long contiguous sequence often spanning introns
• Can provide clear boundary for ends of transcripts
• Disadvantages
• No regulatory information
• Usually <60% of genes found in EST collections (random sampling of transcripts based on abundance)
• Most ESTs are not full-length (higher representation of 3' ends)
![Page 55: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/55.jpg)
Questions that can be addressed with genome-wide expression analysis:
• What genes have similar function?
• What regulatory pathways exist?
• Can we subdivide experiments or genes into meaningful classes?
• Can we correctly classify an unknown experiment or gene into a known class?
• Can we make better treatment decisions for a cancer patient based on his or her gene expression profile?
![Page 56: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/56.jpg)
Microarrays vs Northern blots: from Gene to Genome Science
• Northern blot: limited by number of lanes in gel
• Microarray: a large number of DNA fragments are attached in a systematic way to a solid substrate; can measure mRNA levels for thousands of genes (~ every gene in a genome) in parallel
![Page 57: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/57.jpg)
Three types of array manufacture
• On-chip oligonucleotide synthesis
– Photolithography: Affymetrix (~25-mers)
– Ink-jet printing: Agilent (~60-mers)
• Spotted microarrays
– Long dsDNA (typically genomic PCR products)
![Page 58: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/58.jpg)
Affymetrix gene chip
![Page 59: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/59.jpg)
Probe pairs
• Oligos are selected from a region of the gene that has low similarity to other genes.
Perfect match: ATGTTTGACGCTGCGTAGATCCGAG
Mismatch:      ATGTTTGACGCTACGTAGATCCGAG
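A short check makes the relationship between the two probes explicit. The sequences are copied from the slide; `differing_positions` is an illustrative helper name.

```python
perfect_match = "ATGTTTGACGCTGCGTAGATCCGAG"
mismatch = "ATGTTTGACGCTACGTAGATCCGAG"


def differing_positions(a, b):
    """Return 0-based positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("probes must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


positions = differing_positions(perfect_match, mismatch)
middle = len(perfect_match) // 2

print(len(perfect_match), positions, middle)
for i in positions:
    print(f"position {i + 1}: {perfect_match[i]} -> {mismatch[i]}")
```

The two 25-mers differ at a single base in the centre of the probe, so the mismatch probe serves as a control for non-specific hybridization to the perfect-match probe.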
![Page 60: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/60.jpg)
MicroArray
![Page 61: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/61.jpg)
Microarray hybridization
Spotted microarrays
• Competitive hybridization: two labeled cDNA samples (experimental and control) hybridized to same slide
• Cy3 and Cy5 dye labeling, fluoresce at different wavelengths
Affymetrix GeneChips
• One labeled RNA population per chip
• Biotin labeling, binds to fluorescently labeled avidin (comparison made between hybridization intensities of same oligonucleotides on different chips)
[Figure labels: samples; mRNA; cDNA; DNA microarray]
Microarray Animation
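For the two-sample (spotted array) design, the usual summary of each spot is a log ratio of the two channel intensities. The sketch below is illustrative only: the intensities are invented, the pseudocount of 1 is an arbitrary choice to avoid dividing by zero, and which dye labels which sample is left unspecified because the slide does not say.

```python
import math


def log2_ratio(experimental, control, pseudocount=1.0):
    """log2 of experimental over control intensity for one spot."""
    return math.log2((experimental + pseudocount) / (control + pseudocount))


# Hypothetical background-corrected intensities: (experimental, control).
spots = {
    "geneA": (1599.0, 399.0),
    "geneB": (499.0, 499.0),
    "geneC": (99.0, 799.0),
}

for gene, (experimental, control) in spots.items():
    print(gene, round(log2_ratio(experimental, control), 2))
```

A positive value means the gene is more abundant in the experimental sample, a negative value means it is more abundant in the control, and zero means no difference.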
![Page 62: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/62.jpg)
Spotted glass microarray
![Page 63: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/63.jpg)
Comparisons of Gene Expression across samples
[Figure: bar charts of Gene A and Gene B expression (axis 0 to 45) across samples: whole body, liver; brain, kidney, liver, lung]
![Page 64: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/64.jpg)
Transcriptomics using RNA-seq
![Page 65: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/65.jpg)
RNA-seq provides even more
![Page 66: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/66.jpg)
Candidate new and revised exons
![Page 67: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/67.jpg)
Reproducibility, linearity and sensitivity.
![Page 71: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/71.jpg)
N-regulation of mRNA: Illumina vs ATH1
[Scatter plot: x-axis Log (KNO3/KCl), Affymetrix ATH1 chips; y-axis Log (KNO3/KCl), Illumina mRNA sequencing]
R² = 0.85
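The slide does not say how R² was computed. A common choice for comparing two platforms is the squared Pearson correlation of the per-gene log ratios, which is what this sketch assumes; the data points are invented for illustration and do not reproduce the reported value.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-gene log ratios (treatment vs. control) from each platform.
array_log_ratio = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
rnaseq_log_ratio = [-2.6, -0.8, -0.9, 0.3, 0.2, 1.5, 2.4]

print(round(r_squared(array_log_ratio, rnaseq_log_ratio), 2))
```

A value near 1 means the two platforms rank and scale the expression changes in nearly the same way.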
![Page 72: Introduction to Sequencing Technologies - CGIARhpc.ilri.cgiar.org/.../AdvancedBFX2014_2/course/IntroNextGen.pdf · Sequencing the RNA provides us with an insight into the transcriptionally](https://reader031.vdocuments.net/reader031/viewer/2022022503/5aaf84617f8b9a5d0a8d8936/html5/thumbnails/72.jpg)
Comparison of platforms for detecting gene expression
AFFY Gene Chip Illumina
All protein coding genes are represented X
Can detect all the different types of RNA X
Cost (including analyzing data) X
Can determine gene regulation X X
Requires pre-existing knowledge of gene sequence X