Update on NCBI Pathogen Detection and Antimicrobial Resistance Activities
William Klimke - GenomeTrakr, Sept 26-28, Crystal City, VA
Shared Network pathways and data streams for outbreak detection and investigations
Sampling of clinical/human, food, animal, and environmental bacterial isolates
Minimal metadata, everything publicly accessible
More extensive metadata, analyzed seq data (tools to translate)
data shared among PulseNet labs only
PulseNet National Database
CDC analysis (all raw sequence goes into NCBI)
USDA-FSIS labs
State health labs (clinical)
FSIS & FDA labs: food, environmental
PulseNet
GenomeTrakr
State agriculture & food regulatory labs
NCBI Submission Portal
Raw Genomic Sequence data
Analysis goals
1. Are these isolates clonally related?
2. What is the antimicrobial resistance gene repertoire of this isolate?
NCBI Submission Portal
BioSamples
SRA
GenBank
BioProject
NCBI Pathogen Pipeline
Genome Assembly
Genome Annotation
Clustering
SNP analysis
Tree Construction
Reports
QC
USA
UK
Aus
Clinical
Pathogen Detection Pipeline
t_published
t_submitted
G1
Neighbors
Goal: Improve Turnaround Time from Submission to Report
Pipeline Changes
• SKESA assembler released: https://github.com/ncbi/SKESA
• SKESA publication accepted (Genome Biology)
• SKESA running on all new submissions
• SKESA reassembly for all four foodborne pathogens
Pipeline Changes
• wgMLST schemes developed for all foodborne pathogens
• wgMLST now running for all four foodborne pathogens
  • used for rapid reports (within one hour of submission, report a table of nearest neighbors)
  • used for clustering to generate clusters for input to SNP pipeline
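The nearest-neighbor table mentioned above can be illustrated with a toy allele-distance calculation. This is a minimal sketch, not the NCBI pipeline's implementation: the isolate IDs, locus names, and the rule of ignoring loci that are uncalled in either isolate are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A wgMLST profile maps locus name -> allele number (None = locus not called).
# Hypothetical representation, chosen only for this sketch.
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared = [locus for locus in a
              if a[locus] is not None and b.get(locus) is not None]
    return sum(1 for locus in shared if a[locus] != b[locus])


def nearest_neighbors(query_id: str,
                      profiles: Dict[str, Profile],
                      k: int = 3) -> List[Tuple[str, int]]:
    """Return the k isolates closest to the query, smallest distance first."""
    query = profiles[query_id]
    distances = [(other_id, allele_distance(query, profile))
                 for other_id, profile in profiles.items()
                 if other_id != query_id]
    # Sort by distance, then by ID so ties are reported in a stable order.
    return sorted(distances, key=lambda pair: (pair[1], pair[0]))[:k]


# Made-up profiles over four loci.
profiles = {
    "iso_A": {"loc1": 1, "loc2": 4, "loc3": 2, "loc4": 7},
    "iso_B": {"loc1": 1, "loc2": 4, "loc3": 2, "loc4": 8},
    "iso_C": {"loc1": 3, "loc2": 5, "loc3": 2, "loc4": None},
    "iso_D": {"loc1": 1, "loc2": 4, "loc3": 2, "loc4": 7},
}

neighbors = nearest_neighbors("iso_A", profiles, k=2)
for isolate, distance in neighbors:
    print(f"{isolate}\t{distance}")
```

Comparing allele numbers is much cheaper than a full SNP analysis, which is consistent with using wgMLST for the rapid report and reserving the SNP pipeline for the resulting clusters.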
[Figure: Growth: Total Isolates in Pathogen Detection. Bar chart of total isolates (y-axis 0 to 160,000) for Salmonella, Listeria, Campylobacter, and E.coli/Shigella in 2016, 2017, and 2018.]
Avg. Time to Publish (hours) From Submission

Organism          No. of Isolates   Old SNP Pipeline   New wgMLST + SNP Pipeline
Campylobacter     21,214            13                 5
E.coli/Shigella   52,801            23                 9
Listeria          19,146            14                 5
Salmonella        147,150           92                 52
wgMLST/SNP Pipeline Comparison (hours)
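The size of the improvement is easier to compare across organisms as a percentage. The sketch below works only from the hours and isolate counts in the comparison above; the `percent_reduction` helper is a name introduced here for illustration.

```python
# (number of isolates, old SNP pipeline hours, new wgMLST + SNP pipeline hours)
# Values copied from the comparison table on the slide.
timings = {
    "Campylobacter": (21214, 13, 5),
    "E.coli/Shigella": (52801, 23, 9),
    "Listeria": (19146, 14, 5),
    "Salmonella": (147150, 92, 52),
}


def percent_reduction(old_hours: float, new_hours: float) -> float:
    """Relative drop in average time to publish, as a percentage."""
    return 100.0 * (old_hours - new_hours) / old_hours


reductions = {
    organism: round(percent_reduction(old, new), 1)
    for organism, (_, old, new) in timings.items()
}

for organism, pct in reductions.items():
    print(f"{organism}: {pct}% shorter time to publish")
```

On these numbers the three smaller data sets drop by roughly 60 to 64 percent, while Salmonella, by far the largest set, drops by about 43.5 percent.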
• Will be covered by Arjun Prasad - Friday
  • navigation panel
  • subtree creation
  • sharing of highlighted isolates
  • tree labeling
• Automated alerts
  • first phase - stored searches send automatic notifications
  • second phase - 'watched' isolate mode
These are the last two major features needed to aid outbreak detection
Pathogen Browser
Requires MyNCBI login
Automated Alerts: First phase (Stored Searches)
Automated Alerts: Email Notifications
• enumerates the reason an isolate has failed for those organisms undergoing wgMLST clustering
• text file on FTP
• ftp://ftp.ncbi.nlm.nih.gov/pathogen/Results/Listeria/latest_snps/Exceptions/
• will have strain and SRA_center added as columns
exception type                exception                   consequence     lower limit   upper limit   actual value   biosample_acc
Assembly validation failure   Low contig N50              Not published   10000         NULL          2736           SAMN05173249
Assembly validation failure   High contig L50             Not published   NULL          200           330            SAMN05173249
Readset validation failure    Insufficient coverage       Not published   20            NULL          18             SAMN10079593
wgMLST validation failure     Too few wgMLST loci found   Not clustered   2000          NULL          1723           SAMN03761820
Exceptions Report
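The limit columns in the exceptions report can be read as a simple range check: an isolate fails when its actual value falls below the lower limit or above the upper limit, with NULL meaning no limit on that side. The sketch below applies that reading to the four rows shown on the slide and also shows how contig N50 and L50 are conventionally computed. The tab-delimited layout and the helper names are assumptions; the actual file format on the FTP site is not described in the slides.

```python
import csv
import io
from typing import Iterable, Optional, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for an assembly.

    Contigs are sorted longest first; N50 is the length of the contig at
    which the running total reaches half the assembly size, and L50 is the
    number of contigs needed to get there.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contigs given")


def parse_limit(text: str) -> Optional[int]:
    """Treat the literal string NULL as 'no limit'."""
    return None if text == "NULL" else int(text)


def violates(actual: int, lower: Optional[int], upper: Optional[int]) -> bool:
    """True when the value falls outside the allowed range."""
    if lower is not None and actual < lower:
        return True
    if upper is not None and actual > upper:
        return True
    return False


# Rows copied from the slide; tab-delimited layout is an assumption.
HEADER = ["exception type", "exception", "consequence",
          "lower limit", "upper limit", "actual value", "biosample_acc"]
ROWS = [
    ["Assembly validation failure", "Low contig N50", "Not published",
     "10000", "NULL", "2736", "SAMN05173249"],
    ["Assembly validation failure", "High contig L50", "Not published",
     "NULL", "200", "330", "SAMN05173249"],
    ["Readset validation failure", "Insufficient coverage", "Not published",
     "20", "NULL", "18", "SAMN10079593"],
    ["wgMLST validation failure", "Too few wgMLST loci found", "Not clustered",
     "2000", "NULL", "1723", "SAMN03761820"],
]
report_text = "\n".join("\t".join(row) for row in [HEADER] + ROWS)

checks = []
for row in csv.DictReader(io.StringIO(report_text), delimiter="\t"):
    failed = violates(int(row["actual value"]),
                      parse_limit(row["lower limit"]),
                      parse_limit(row["upper limit"]))
    checks.append((row["biosample_acc"], row["exception"], failed))
    print(row["biosample_acc"], row["exception"], "out of range:", failed)

# Toy assembly: total length 300, so half is reached at the second contig.
toy_n50, toy_l50 = n50_l50([100, 80, 60, 40, 20])
```

Every row on the slide is out of range under this reading, which matches its presence in the report. Note that one BioSample (SAMN05173249) appears twice because it failed both the N50 and the L50 checks.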
NCBI’s Role in Combatting Antibiotic Resistance
1. Build AMR reference database (reference proteins, hidden Markov models (HMMs), and protein family hierarchy)
2. Build AMRFinder tool to identify AMR proteins using reference database
3. Use AMRFinder to identify AMR proteins in all pathogen isolates integrated into NCBI Pathogen Browser
4. Capture antibiotic susceptibility test data (AST)
5. Integrate AST into NCBI Pathogen Browser
https://ftp.ncbi.nlm.nih.gov/pub/factsheets/Factsheet_AMR_Project.pdf
From Annotation to Resistance Genes
AMR database
HMMs and BLAST
Report on resistance genes in isolate
Proteins
4,579 resistance proteins
34 drug classes resisted
~50% beta-lactamases
565 HMMs
New beta-lactamase, qnr, and mcr submissions
https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder/
AMRFinder Alpha Release
• scientific publication describing AMRFinder and genotype/phenotype correlations on ~6K NARMS isolates
• organism-specific point mutations (29 genes) covering resistance to quinolones, macrolides, etc.
• AMR resource landing pages
• AMR Gene Browser - for every AMR gene in every isolate
  • starting with some high priority organisms (Klebsiella)
  • requires annotated genome to be available
• stress response and virulence genes
AMRFinder and Antimicrobial Resistance Resources
Pathogen Browser
Pathogen Detection Pipeline
AMRFinder and Resources
automated alerts
Sept. - Oct/2018 Nov - Dec/2018 2019
publication on AMRFinder
stress response and virulence genes
point mutations
combined protein/translated searches
AMR Gene Browser
'watched isolates'
improve time to report
begin GenBank submission
improve time to report
wgMLST for other pathogens
wgMLST publication
pipeline publication
browser publication
NCBI Pathogen Detection Roadmap
Thank you. This research was supported by the Intramural Research Program of the NIH, National Library of Medicine.
Richa Agarwala, Azat Badretdin, Slava Brover, Joshua Cherry, Jinna Choi, Vyacheslav Chetvernin, Robert Cohen, Michael DiCuccio, Boris Fedorov, Mike Feldgarden, Lewis Geer, Dan Haft, Lianyi Han, Avi Kimchi, Michel Kimelman, William Klimke, Alex Kotliarov, Valerii Lashmanov, Aleksandr Morgulis, Eyal Moses, Chris O'Sullivan, Arjun Prasad
Edward Rice, Kirill Rotmistrovskyy, Alejandro A. Schaffer, Nadya Serova, Stephen Sherry, Sergey Shiryev, Martin Shumway, Oleg Shutov, Alexandre Souvorov, Tatiana Tatusova, Igor Tolstoy, Chunlin Xiao, Leonid Zaslavsky, Alexander Zasypkin, Lukas Wagner, Hlavina Wratko, Eugene Yaschenko
David Lipman, James Ostell, Kim Pruitt
CDC, FDA/CFSAN, GenFS, USDA-FSIS, PHE/FERA, NIHGRI, NIAID, WRAIR, Broad, Wadsworth/MDH
Vendors: PacBio, Illumina, Roche