Post on 23-Dec-2015

Page 1: 1 DNA Sequencing Achim Tresch UoC / MPIPZ Cologne treschgroup.de/OmicsModule1415.html tresch@mpipz.mpg.de

DNA Sequencing

Achim Tresch

UoC / MPIPZ Cologne

treschgroup.de/OmicsModule1415.html

tresch@mpipz.mpg.de

Page 2:

Topics

- DNA sequencing in the last century

- Current technologies (Illumina, Ion Torrent)

- New developments (PacBio, Nanopore)

Page 3:

Sanger sequencing

- Developed in the 1970s by Fred Sanger (1918-2013), two-time Nobel laureate (1958: protein structure of insulin; 1980: sequencing of nucleic acids)

- Sequencing by synthesis: DNA polymerase synthesizes a complementary strand by adding single nucleotides

Figure: growing strand TTGCACTGAGTCG on template AACGTGACTCAGCATAGGCTCAGATAGAT

- Random incorporation of blocked nucleotides at any position, so the reaction stops at each position in a small fraction of the strands

- A-reaction: add dATP (elongation) and ddATP (block). Analogous: C-, G-, T-reaction

Figure: growing strand TTGCACTTGAGTCGT on template AACGTGAACTCAGCATAGGCTCAGATAGAT, with ddATP

Page 4:

Sanger sequencing

Synthesized strand: TTGCACTTGAGTCG

Template: AACGTGAACTCAGCATAGGCTCAGATAGAT

A-reaction: TTGCA, TTGCACTTGA

C-reaction: TTGC, TTGCAC, TTGCACTTGAGTC

G-reaction: TTG, TTGCACTTG, TTGCACTTGAG, TTGCACTTGAGTCG

T-reaction: T, TT, TTGCACT, TTGCACTT, TTGCACTTGAGT

Figure: fragment TTGCACTTGAGT terminated by a ddNTP

ladder of DNA fragments -> electrophoresis -> sequence

Gel lanes: T, G, C, A
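The four reactions above can be reproduced with a short simulation. This is only a sketch under simplifying assumptions: every possible termination product is present, and fragment length alone orders the ladder. The function names `sanger_fragments` and `read_ladder` are made up for this illustration.

```python
def sanger_fragments(strand):
    """Group all prefixes of the synthesized strand by their terminal base.

    Each prefix stands for a chain that was terminated by the matching ddNTP.
    """
    reactions = {base: [] for base in "ACGT"}
    for end in range(1, len(strand) + 1):
        reactions[strand[end - 1]].append(strand[:end])
    return reactions


def read_ladder(reactions):
    """Read the sequence back by sorting all fragments by length (electrophoresis)."""
    ladder = sorted(
        (len(fragment), base)
        for base, fragments in reactions.items()
        for fragment in fragments
    )
    return "".join(base for _, base in ladder)


strand = "TTGCACTTGAGTCG"  # synthesized strand from the slide
reactions = sanger_fragments(strand)
for base in "ACGT":
    print(f"{base}-reaction:", ", ".join(reactions[base]))
print("sequence read from ladder:", read_ladder(reactions))
```

Sorting by length plays the role of electrophoresis: the shortest fragment runs furthest, and the lane (or dye) in which each length appears gives the base at that position.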

Page 5:

Sanger sequencing

- labeled ddNTPs, capillary sequencing

Synthesized strand: GATTGATAGTTGC

Template: CTAACTATCAACGTATAGGCTCAGATAGAT

Ladder of labeled fragments: G, GA, GAT, GATT, GATTG, GATTGA, GATTGAT, GATTGATA, GATTGATAG, GATTGATAGT, GATTGATAGTT, GATTGATAGTTG, GATTGATAGTTGC

Page 6:

454 technology

Pyrosequencing

- immobilize DNA on beads, pyrosequencing in microreactors

Figure: growing strand TTGCACTGAGTCGT on template AACGTGACTCAGCATAGGCTCAGATAGAT

Reaction: incorporation of dTTP -> PPi -> ATP -> oxyluciferin + light

Page 7:

454 technology

DNA-loaded beads + primer + polymerase + sulfurylase + luciferase

flowgram

TTGCACTGAGTCGTAACGTGACTCAGCAAGTCTATTCACCCAC...

Problem: homopolymers difficult to detect
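A flowgram can be sketched as follows. This idealized model assumes that nucleotides are flowed one at a time in a fixed cycle (the order "TACG" is an assumption here) and that the light signal is exactly proportional to the number of bases incorporated. The names `flowgram` and `decode` are hypothetical.

```python
def flowgram(read, flow_order="TACG", max_flows=400):
    """Return (base, signal) per flow; signal = length of the homopolymer run incorporated."""
    signals = []
    position = 0
    flow = 0
    while position < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # All identical bases in a row are incorporated within a single flow.
        while position < len(read) and read[position] == base:
            run += 1
            position += 1
        signals.append((base, run))
        flow += 1
    return signals


def decode(signals):
    """Reconstruct the read from an ideal, noise-free flowgram."""
    return "".join(base * run for base, run in signals)


signals = flowgram("TTGCA")
print(signals)
print(decode(signals))
```

Under this linear model a run of 7 and a run of 8 differ in signal by only about 14 %, whereas a run of 1 and a run of 2 differ by 100 %. That is one way to see why long homopolymers are hard to call once noise is added.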

Page 8:

Parallelisation & Miniaturisation

increase throughput:

- DNA gel electrophoresis, single genes in a few days

- capillary electrophoresis, 96 capillaries per machine, human genome in a few years

- sequencing on microbeads: 454 technology

Page 9:

Illumina technology

Illumina sequencing:

- sequencing by synthesis

- massive parallelisation and miniaturisation by self-organising DNA microarrays on a glass surface

- several hundred Gb, >10^9 reads per run

Page 10:

Illumina technology

- generate libraries

- grow clusters on a flowcell

- sequence by addition and imaging of blocked & fluorescence-labeled nucleotides

Page 11:

Illumina technology

library preparation:

- DNA fragments

- Blunting by fill-in and exonuclease

- Phosphorylation

- Addition of A-overhang

- Ligation to adapters

Page 12:

Illumina technology

cluster generation: 1. flowcell

Figure labels: P5, P7, 5', 5', S.P. #1, Insert, P5', P7', S.P. #2, TAG

Page 13:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template

Page 14:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template

Page 15:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification

Page 20:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation

Page 21:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand

Page 22:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends

Page 23:

Illumina technology

cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends 8. hybridize primer

Page 24:

Illumina technology

Imaging & Sequencing:

Nucleotide + fluorescent dye + terminator

Page 25:

Illumina technology

reversible terminators:

Page 26:

Illumina technology

fluorescently labelled clusters:

Page 27:

Illumina technology

data output:

HiSeq:

- ca. 250 million reads * 8 lanes

- 2*100 bp paired end -> 400 Gb / 8 days

HiSeq rapid run:

- ca. 200 million reads * 2 lanes

- 2*150 bp paired end -> 120 Gb / 40 hours

- (2*250 bp paired end -> 200 Gb / 60 hours)

MiSeq:

- ca. 25 million reads * 1 lane

- 2*300 bp paired end -> 15 Gb / 65 hours
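The yields quoted above follow from simple arithmetic. The sketch below assumes that "reads" on the slide means sequenced fragments (clusters) per lane, each read once from both ends in a paired-end run. The helper `yield_gb` is a name invented for this example.

```python
def yield_gb(reads_per_lane, lanes, read_length, paired=True):
    """Total output in gigabases: fragments * lanes * read length * number of ends."""
    ends = 2 if paired else 1
    return reads_per_lane * lanes * read_length * ends / 1e9


runs = {
    "HiSeq, 2*100 bp": yield_gb(250e6, 8, 100),
    "HiSeq rapid run, 2*150 bp": yield_gb(200e6, 2, 150),
    "HiSeq rapid run, 2*250 bp": yield_gb(200e6, 2, 250),
    "MiSeq, 2*300 bp": yield_gb(25e6, 1, 300),
}
for name, gigabases in runs.items():
    print(f"{name}: {gigabases:.0f} Gb")
```

Under that assumption the results match the figures on the slide (400, 120, 200 and 15 Gb).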

Page 28:

Data quality of short reads

Fastq quality scores

Figure panels: good quality; quality drops towards the end

Reference lines: 0.1 % error, 1 % error
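The two error levels marked on the quality plots correspond to Phred scores of 30 and 20. A minimal sketch of the conversion follows. It assumes the Phred+33 ASCII encoding (the slide does not say which encoding is used), and the function names are illustrative.

```python
import math


def error_to_phred(error_probability):
    """Phred score Q = -10 * log10(p)."""
    return -10 * math.log10(error_probability)


def phred_to_error(quality):
    """Inverse conversion: p = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def decode_quality(quality_string, offset=33):
    """Turn a FASTQ quality string into a list of Phred scores (offset is an assumption)."""
    return [ord(char) - offset for char in quality_string]


print(error_to_phred(0.001))   # 0.1 % error
print(error_to_phred(0.01))    # 1 % error
print(decode_quality("I?5"))
```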

Page 29:

Amplification Artifacts

Duplicate reads
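As a rough illustration of how duplicate reads can be quantified, the sketch below counts reads with identical sequence. This is a simplification: in practice duplicates are usually identified from mapping coordinates after alignment, and identical sequences can also arise without amplification. The function name `duplicate_fraction` is hypothetical.

```python
from collections import Counter


def duplicate_fraction(reads):
    """Fraction of reads that repeat a sequence already seen in the data set."""
    if not reads:
        return 0.0
    counts = Counter(reads)
    duplicates = sum(count - 1 for count in counts.values())
    return duplicates / len(reads)


reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]
print(duplicate_fraction(reads))
```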

Page 30:

Ion Torrent

Ion Torrent: semiconductor sequencing

- detect H+ release upon nucleotide incorporation by DNA polymerase

Page 31:

Ion Torrent

workflow:

Page 32:

Ion Torrent

data output:

Ion Proton:

- up to 80 million reads

- up to 10 Gb (200 base read length)

- 4 hours runtime

Ion Torrent PGM:

- up to 5 million reads

- up to 2 Gb (400 base read length)

- 8 hours runtime

Page 33:

Ion Torrent

homopolymer problem?

- nonlinear increase of signal

Page 34:

Sequencing Applications

what can we do with short reads?

RNA-seq: identify transcripts, count reads per transcript, assess differential expression

problem: reads are too short to establish the connectivity of all exons, so it is difficult or impossible to quantify multiple isoforms of a gene
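Counting reads per transcript and comparing two samples can be sketched as below. The input is assumed to be a list with one transcript name per mapped read. The transcript names, the counts-per-million normalisation and the function names are choices made for this illustration. A real differential expression analysis needs replicates and a statistical model, which this sketch leaves out.

```python
import math
from collections import Counter


def counts_per_million(assignments):
    """Count reads per transcript and scale by library size."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {transcript: 1e6 * n / total for transcript, n in counts.items()}


def log2_fold_change(cpm_a, cpm_b, transcript):
    """log2 ratio of normalised expression between sample A and sample B."""
    return math.log2(cpm_a[transcript] / cpm_b[transcript])


# Hypothetical read-to-transcript assignments for two samples
sample_a = ["tx1", "tx1", "tx2", "tx1"]
sample_b = ["tx1", "tx2", "tx2", "tx2"]

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)
print(log2_fold_change(cpm_a, cpm_b, "tx1"))
```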

Page 35:

Improvements: Paired end Reads

Single end: ambiguous mapping

Paired end sequencing: read fragment from both ends -> resolve ambiguities

Stefan Krebs, 30.09.2013
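How the second read resolves an ambiguous placement can be shown with a toy example. The sketch assumes exact matching, a mate that comes from the opposite strand, and a known maximum fragment length. The reference sequence and all function names are invented for the illustration.

```python
def find_all(reference, query):
    """Return all start positions of exact matches of query in reference."""
    positions = []
    start = reference.find(query)
    while start != -1:
        positions.append(start)
        start = reference.find(query, start + 1)
    return positions


def revcomp(sequence):
    """Reverse complement of a DNA sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def place_pair(reference, read1, read2, max_fragment):
    """Keep only placements where the mate lies downstream within max_fragment bases."""
    mate = revcomp(read2)
    mate_hits = find_all(reference, mate)
    placements = []
    for p1 in find_all(reference, read1):
        for p2 in mate_hits:
            if p2 >= p1 and p2 + len(mate) - p1 <= max_fragment:
                placements.append((p1, p2))
    return placements


repeat = "ACGTTGCA"
reference = "CCCC" + repeat + "GATTACAGG" + "TTTTTTTTTT" + repeat + "CTCTCTAGG"
read1 = repeat                # maps to two positions on its own
read2 = revcomp("GATTACA")    # mate from the other end of the fragment

print(find_all(reference, read1))
print(place_pair(reference, read1, read2, max_fragment=30))
```

On its own, `read1` matches both copies of the repeat. Requiring the mate to lie nearby leaves a single consistent placement.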

Page 36:

Improvements: Circularization

further improvements

long jumping mate-pair libraries: circularize large fragments and read the junctions (2-10 kb)

resolve large repeats in genome assembly

Page 37:

Third generation Sequencing

Page 38:

Pacific Biosciences

- single molecule detection

- several kilobases read length

- moderate output (150,000 wells)

- expensive instrument and high cost per base

Page 39:

Pacific Biosciences

Page 40:

Pacific Biosciences

Page 41:

Pacific Biosciences

Read length distribution

Page 42:

Pacific Biosciences

Read quality

Page 43:

Pacific Biosciences

Page 44:

Oxford Nanopore

- DNA polymerase coupled to the pore releases tags when incorporating labeled nucleotides

- tags passing through the nanopore change the ion current

- read length = length of DNA fragment

Page 45:

Applications

everything that can be converted to a DNA strand can be sequenced

- even long-term data storage by encoding in synthetic DNA is possible

BIOLOGICAL APPLICATIONS: sequencing of genomes, transcriptomes, population diversity, composition of microbial communities, ChIP-seq, methyl-seq, translating RNA from ribosomes, ...

MEDICAL APPLICATIONS: whole genome sequencing, exome sequencing, tumor diagnostics, sequencing of T-cell receptor diversity, identification of pathogens, ...

FORENSICS, FOOD SAFETY, ARCHEOLOGY, ...
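The remark on data storage can be made concrete with a toy encoding that maps two bits to one base. This is only an illustration, not any published scheme: practical encodings avoid long homopolymers and add error correction. The mapping and the function names are assumptions.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data):
    """Encode bytes as DNA, four bases per byte, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 3] for byte in data for shift in (6, 4, 2, 0)
    )


def decode(dna):
    """Decode a DNA string produced by encode() back into bytes."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"DNA"
strand = encode(message)
print(strand)
print(decode(strand))
```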

Page 46:

Other Approaches

Page 47:

Summary third generation Sequencing

Page 48:

Acknowledgements

Stefan Krebs

Gene Center

LMU Munich