Workshop on Whole Genome Sequencing and Analysis, 2-4 Oct. 2017
Introduction to single-isolates, single CGE services
Learning objective:
After this lecture and exercise, you should be able to…
…describe how the methods from the Center for Genomic Epidemiology (CGE) for identifying species, Multilocus Sequence Type, plasmids, and antimicrobial resistance genes work
…use the above-mentioned methods as stand-alone services and interpret the results
Tools for species identification

| Name of Service | Description | Status | Publication |
|---|---|---|---|
| SpeciesFinder | Species identification using 16S rRNA | Online | Published Feb 2014, PMID: 24574292 |
| KmerFinder | Species identification using overlapping 16mers | Online | Published Jan 2014, PMID: 24172157 |
| TaxonomyFinder | Taxonomy identification using functional protein domains | Under development | Published in PMID: 24574292 + Oksana Lukjancenko's PhD thesis |
| Reads2Type | Species identification on client computer | Online | Published Feb 2014, PMID: 24574292 |
PMID: 24574292
Training data
◇ 1,647 completed / almost completed genomes downloaded from NCBI in 2011 (1,009 different species)
Evaluation data
◇ NCBI draft genomes
• 695 isolates from species that overlap with training set (151 species)
◇ SRA draft genomes
• 10,407 sets of short reads from Illumina (168 species)
• 10,407 draft genomes from Illumina data (168 species)
16S rRNA
• 16S rRNA sequencing has dominated molecular taxonomy of prokaryotes for 40 years (Fox et al, Int. J. Syst. Bacteriol., 1977)
• Tremendous amounts of 16S rRNA sequence data are available in public databases
Concerns:
• Low resolution
• Some genomes contain several copies of the 16S rRNA gene with inter-gene variation
• The 16S rRNA gene represents only about 0.1% of the coding part of a microbial genome
SpeciesFinder: CGE implementation of 16S species identification
Reference database
• 16S rRNA genes are isolated from genomes in training data using RNAmmer (Lagesen, NAR, 2007).
Method
• Input genomes are BLASTed against 16S rRNA genes in reference database.
• Best hit is selected based on a combination of coverage, % identity, bitscore, number of mismatches and number of gaps in the alignments.
KmerFinder: Using (almost) all information in the WGS data
• Genomes in training data are chopped into 16mers:
A T G A C G T A T G A C T G A T G G C G T A G T A G T C C
• Downsampling
• Only 16mers with specific prefix (ATGAC) are kept
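The chopping and prefix-based downsampling can be sketched in a few lines of Python. This is a minimal illustration, not the KmerFinder implementation: the function names are hypothetical, and the input is the 29-base example sequence from the slide.

```python
def chop_into_kmers(sequence, k=16):
    """Return every overlapping k-mer of the sequence, in order."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def downsample_by_prefix(kmers, prefix="ATGAC"):
    """Keep only the unique k-mers that start with the given prefix."""
    return {kmer for kmer in kmers if kmer.startswith(prefix)}


example = "ATGACGTATGACTGATGGCGTAGTAGTCC"  # sequence shown on the slide
all_kmers = chop_into_kmers(example)
kept = downsample_by_prefix(all_kmers)
print(len(all_kmers), sorted(kept))
```

For this sequence, only the 16mers starting at the two ATGAC occurrences survive the downsampling step.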
Reference db bacteria of known species (template)
Bact1 -> E. coli
Bact2 -> S. enterica
Bact3 -> K. pneumoniae
Bact4 -> S. aureus
Query bacteria of unknown species
?????
Prediction: The query bacterium is S. aureus
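The lookup against a reference database can be illustrated as picking the template that shares the most unique kmers with the query. The sketch below is a toy under stated assumptions: the sequences are made up, k is set to 4 so the example stays readable (the service uses 16mers), and the function names are hypothetical.

```python
def unique_kmers(sequence, k):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def predict_species(query, templates, k=16):
    """Score each template by shared unique k-mers and return the best one."""
    query_kmers = unique_kmers(query, k)
    scores = {
        species: len(query_kmers & unique_kmers(sequence, k))
        for species, sequence in templates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up toy sequences, not real genomes.
toy_templates = {
    "E. coli": "AAAACCCCGGGG",
    "S. aureus": "TTTTGGGGAAAATT",
}
best, scores = predict_species("GGGGAAAAT", toy_templates, k=4)
print(best, scores)
```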
Three other methods were evaluated
TaxonomyFinder: Performs its predictions based on the presence of protein profiles that are specific to particular taxonomic groups.
Reads2Type: Performs its predictions based on species-specific 50mers in the 16S rRNA or gyrB gene (for Enterobacteriaceae).
rMLST: Performs its predictions based on up to 53 ribosomal genes. Implemented in collaboration with Keith Jolley from Oxford (MLST).
Results
(16S rRNA)
Summary of taxonomy benchmark study
• KmerFinder had the highest accuracy and was the fastest method.
• SpeciesFinder (16S rRNA-based) had the lowest accuracy.
• Methods that only sample genomic loci (16S, Reads2Type, rMLST) had difficulties distinguishing species that only recently diverged, especially when the main difference is a plasmid.
“Standard” when aiming at determining the species of one isolate
“Winner takes it all” if you have a mixed sample or suspect you have a mixed sample
KmerFinder statistics
Query coverage: $\frac{S}{q_u}$
Template coverage: $\frac{S}{l_u}$
$S$: Score (total number of unique kmers in query sequence that match kmers in template sequence)
$q_u$: Total number of unique kmers in query sequence
$l_u$: Total number of unique kmers in template sequence (reference sequence)
[Diagram: kmers in query ($q_u$) overlapping with kmers in template (reference) genome ($l_u$); the overlap is $S$]
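As a worked illustration of the two coverage values, the sketch below computes them from two sets of unique kmers. The function name and the toy kmers (4mers rather than 16mers) are assumptions made for readability.

```python
def coverage_statistics(query_kmers, template_kmers):
    """Compute score, query coverage and template coverage from k-mer sets."""
    score = len(query_kmers & template_kmers)  # S: shared unique k-mers
    q_u = len(query_kmers)                     # unique k-mers in the query
    l_u = len(template_kmers)                  # unique k-mers in the template
    return {
        "score": score,
        "query_coverage": score / q_u,
        "template_coverage": score / l_u,
    }


# Toy sets of unique k-mers (made up for illustration).
query = {"AAAA", "AAAC", "AACC", "ACCG"}
template = {"AAAA", "AAAC", "AACC", "ACCC", "CCCC"}
stats = coverage_statistics(query, template)
print(stats)
```

Here three of the four query kmers are found in the template (query coverage 0.75), and three of the five template kmers are found in the query (template coverage 0.6).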
More KmerFinder statistics
Depth (Depth of Coverage). Only relevant when uploading raw reads.
Average number of times each position is covered by a kmer.
$\text{Depth} = \frac{N \cdot L}{G}$
$N$ = total no. of kmers that match the template (not the same as score)
$L$ = 16 (length of kmer)
$G$ = Total no. of unique kmers in template
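The depth formula can be checked with a small example. This is a sketch only: the helper names are hypothetical, the reads and template are made up, and k is set to 4 instead of 16 to keep the data short. Note that N counts every matching kmer occurrence in the reads, including repeats, which is why it differs from the score.

```python
def count_matching_kmers(reads, template_kmers, k):
    """Count all k-mer occurrences in the reads that exist in the template (N)."""
    matches = 0
    for read in reads:
        for i in range(len(read) - k + 1):
            if read[i:i + k] in template_kmers:
                matches += 1
    return matches


def depth_of_coverage(n_matching, kmer_length, n_unique_template):
    """Depth = N * L / G, as defined on the slide."""
    return n_matching * kmer_length / n_unique_template


k = 4  # toy value; the slide uses L = 16
template = "AAACCCGGGTTT"
template_kmers = {template[i:i + k] for i in range(len(template) - k + 1)}
reads = ["AAACCC", "AAACCC", "GGGTTTA"]

n = count_matching_kmers(reads, template_kmers, k)
depth = depth_of_coverage(n, k, len(template_kmers))
print(n, len(template_kmers), depth)
```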
KmerFinder output: standard scoring method
“Winner takes it all”
Query (input): Raw reads from urine sample are split into 16mers
Only unique 16mers are kept
Template/reference database: E. coli, P. mirabilis, S. aureus
In the “total” values the kmers are allowed to match more than one template
[Output screenshot; legible values: 4493, 3320; column: Depth]
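One way to picture the difference between “total” values and “winner takes it all” is the sketch below. It assumes that the method repeatedly picks the best-scoring template and then removes that template's kmers from the query before scoring again; the slide does not spell out this procedure, so treat it as an assumption. The kmer labels and template contents are made up.

```python
def total_scores(query_kmers, templates):
    """'Total' values: a k-mer may count towards more than one template."""
    return {name: len(query_kmers & kmers) for name, kmers in templates.items()}


def winner_takes_it_all(query_kmers, templates):
    """Assumed procedure: each k-mer is credited to one winning template only."""
    remaining = set(query_kmers)
    hits = []
    while remaining:
        scores = {name: len(remaining & kmers) for name, kmers in templates.items()}
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            break  # nothing left that matches any template
        hits.append((best, scores[best]))
        remaining -= templates[best]  # the winner takes its k-mers
    return hits


# Placeholder labels stand in for real 16mers.
toy_templates = {
    "E. coli": {"k1", "k2", "k3", "k4"},
    "P. mirabilis": {"k3", "k4", "k5"},
    "S. aureus": {"k9"},
}
toy_query = {"k1", "k2", "k3", "k4", "k5"}

totals = total_scores(toy_query, toy_templates)
hits = winner_takes_it_all(toy_query, toy_templates)
print(totals)
print(hits)
```

In the toy data, P. mirabilis has a total score of 3 but only 1 kmer left once E. coli has taken the kmers the two templates share.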
Tools for further typing

| Name of Service | Description | Publication |
|---|---|---|
| MLST | Multilocus sequence typing | Published Apr 2012, PMID: 22238442 |
| PlasmidFinder | Identification of plasmids in Enterobacteriaceae (and Gram-positives) | Published Apr 2014, PMID: 24777092 |
| pMLST | pMLST of plasmids in Enterobacteriaceae | Published Apr 2014, PMID: 24777092 |
Multilocus Sequence Typing (MLST)
• First developed in 1998 for Neisseria meningitidis (Maiden et al. PNAS 1998. 95:3140-3145)
• The nucleotide sequences of internal regions of approximately 7 housekeeping genes are determined by PCR followed by Sanger sequencing
• Each distinct allele is assigned an arbitrary number
• The unique combination of alleles is the sequence type (ST)
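The step from allele numbers to a sequence type is a table lookup, sketched below. Everything in the example is hypothetical: the locus names, allele numbers and ST numbers are invented for illustration and do not come from any real MLST scheme.

```python
# Hypothetical 7-locus scheme; names and numbers are invented for illustration.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

# Each known combination of allele numbers maps to one sequence type (ST).
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 2, 1, 1, 1, 1, 1): 2,
    (3, 2, 5, 1, 4, 1, 2): 3,
}


def sequence_type(alleles):
    """Return the ST for a dict of locus -> allele number, or None if unknown."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)


isolate = dict(zip(LOCI, (3, 2, 5, 1, 4, 1, 2)))
novel = dict(zip(LOCI, (3, 2, 5, 1, 4, 1, 9)))
print(sequence_type(isolate), sequence_type(novel))
```

A single differing allele, as in the second isolate, gives a different profile and therefore no match to the known ST.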
Using WGS data for MLST
Download of the MLST data from pubmlst.org
Assembled genome, 454 – single end reads, 454 – paired end reads, Illumina – single end reads, Illumina – paired end reads, Ion Torrent, SOLiD – single end reads, SOLiD – mate pair reads
Acinetobacter baumannii #1, Acinetobacter baumannii #2, Arcobacter, Borrelia burgdorferi, Bacillus cereus, Brachyspira hyodysenteriae, Bifidobacterium, Brachyspira intermedia, Bordetella, Burkholderia pseudomallei, Brachyspira, Burkholderia cepacia complex, Campylobacter jejuni, Clostridium botulinum, Clostridium difficile #1, Clostridium difficile #2, Campylobacter helveticus, Campylobacter insulaenigrae, Clostridium septicum, C. diphtheriae, Campylobacter fetus, Chlamydiales
Campylobacter lari, Cronobacter, C. upsaliensis, Escherichia coli #1, Escherichia coli #2, Enterococcus faecalis, Enterococcus faecium, F. psychrophilum, Haemophilus influenzae, Haemophilus parasuis, Helicobacter pylori, Klebsiella pneumoniae, Lactobacillus casei, Lactococcus lactis, Leptospira, Listeria, Listeria monocytogenes, Moraxella catarrhalis, Mannheimia haemolytica, Neisseria, P. gingivalis, P. acnes
Pseudomonas aeruginosa, Pasteurella multocida, Pasteurella multocida, Staphylococcus aureus, Streptococcus agalactiae, Salmonella enterica, Staphylococcus epidermidis, S. maltophilia, Streptococcus pneumoniae, Streptococcus oralis, S. zooepidemicus, Streptococcus pyogenes, Streptococcus suis, Streptococcus thermophilus, Streptomyces, Streptococcus uberis, Vibrio parahaemolyticus, Vibrio vulnificus, Wolbachia, Xylella fastidiosa, Y. pseudotuberculosis
Mismatches
Extended Output
Truncated gene
Extended Output
Tools for phenotyping - ResFinder
NGS (Illumina, Ion Torrent, 454, ...) -> Assembly pipeline -> ResFinder (BLAST) -> Resistance gene profile
Resistance gene profile: List of genes, Accession numbers, Theoretical resistance phenotype
ResFinder output
◇ 200 isolates from 4 different species (Salmonella Typhimurium, Escherichia coli, Enterococcus faecalis and Enterococcus faecium)
◇ ResFinder settings: 98% identity, 60% length coverage
◇ Phenotypic tests, 3,051 in total
• 482 Resistant
• 2,569 Susceptible
=> 99.74% of the results were in agreement between ResFinder and the phenotypic tests
23 discrepancies -> 16, typically in relation to spectinomycin in E. coli
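The two thresholds used in this evaluation (98% identity, 60% length coverage) act as a filter on the hits. The sketch below shows that filter on hypothetical hit records; the `Hit` class, its field names, the gene names and the numbers are all made up for illustration and do not reflect the ResFinder output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one alignment against a resistance gene."""
    gene: str
    identity: float        # percent identity of the alignment
    aligned_length: int    # aligned bases
    gene_length: int       # full length of the reference gene


def passes_thresholds(hit, min_identity=98.0, min_coverage=60.0):
    """Apply the identity and length-coverage cut-offs from the slide."""
    coverage = 100.0 * hit.aligned_length / hit.gene_length
    return hit.identity >= min_identity and coverage >= min_coverage


hits = [
    Hit("geneA", 99.5, 861, 861),    # full length, high identity
    Hit("geneB", 97.0, 800, 800),    # identity too low
    Hit("geneC", 100.0, 300, 1000),  # only 30% of the gene covered
    Hit("geneD", 98.0, 600, 1000),   # exactly at both thresholds
]
reported = [hit.gene for hit in hits if passes_thresholds(hit)]
print(reported)
```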
Example of use
• Following the detection of mcr-1 in China, the mcr-1 gene was added to the ResFinder database
• In a few days approximately 1,000 previously sequenced Danish isolates were re-analysed
• mcr-1 was detected in an E. coli isolated from a patient and in 5 E. coli isolates from imported chicken meat
PlasmidFinder and pMLST
The PlasmidFinder database contains replicons, not entire plasmids.
pMLST: plasmid MLST for IncF, IncN, IncHI1, IncHI2, and IncI1 plasmids
Handling sequence data? Watch out!
Same FASTA file in Word
This should be fine…
Handling sequence data? Watch out!
Oh no! This won't work…
Use “pure” text editors
Example:
• Sublime Text
Save files in “txt” format.
What your data actually looks like!
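A quick sanity check before uploading can catch a file that was saved by a word processor instead of a plain text editor. The sketch below is one possible check, not part of any CGE service: the function name and the set of allowed sequence characters are assumptions. It relies on the fact that a .docx file is a zip archive, so it starts with the bytes `PK` rather than a `>` header line.

```python
def looks_like_plain_fasta(data: bytes) -> bool:
    """Rough check that raw file content is plain-text nucleotide FASTA."""
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError:
        return False  # binary content, e.g. an old-style .doc file
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        return False  # FASTA must begin with a ">" header line
    # Assumed alphabet: nucleotide codes plus gap and stop characters.
    allowed = set("ACGTURYKMSWBDHVNacgturykmswbdhvn-*")
    return all(line.startswith(">") or set(line) <= allowed for line in lines)


plain = b">seq1\nATGACGTATG\nACTGATGG\n"
docx_like = b"PK\x03\x04 rest of a zip archive"
doc_like = b"\xd0\xcf\x11\xe0 binary data"
print(looks_like_plain_fasta(plain),
      looks_like_plain_fasta(docx_like),
      looks_like_plain_fasta(doc_like))
```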
A word on browsers
Browsers likely to work with no problems: Chrome, Firefox, (Safari)
Browsers we don’t like: Explorer, Edge
And now… www.goseqit.com/exercise1
Exercise data
Also available via DropBox: https://www.dropbox.com/sh/09r0kab7hzeb9mv/AAAHWHvUuad3pG2gPq9llc7Za?dl=0
www.goseqit.com/results