SNP Calling & Outbreak Reconstruction in a Monomorphic Pathogen

Upload: jennifer-gardy

Post on 28-Jul-2015

TRANSCRIPT

1. SNP CALLING & OUTBREAK RECONSTRUCTION IN A MONOMORPHIC PATHOGEN: WITHIN-HOST DIVERSITY AND OTHER CONSIDERATIONS

2. YOUR INSTRUCTOR
   DR. JENNIFER GARDY
   SENIOR SCIENTIST, BRITISH COLUMBIA CENTRE FOR DISEASE CONTROL
   ASSISTANT PROFESSOR, SCHOOL OF POPULATION & PUBLIC HEALTH, UNIVERSITY OF BRITISH COLUMBIA
   [email protected] @jennifergardy

3. AGENDA
   - Introduction I: a bit about my research
   - Introduction II: WGS for outbreak investigation: hooray for clonal pathogens
   - Lesson 1: getting a good dataset
   - Lesson 2: linking variation to transmission
   - Lesson 3: within-host diversity
   - Lesson 4: putting it together

4. MY RESEARCH INTERESTS (INTRODUCTION)

5. TB IS CAUSED BY Mycobacterium tuberculosis
   - Infects alveolar macrophages
   - Doubling time of 15-24h
   - Can exist in latent phase
   - ~90% of infections never progress to active disease
   - Highly clonal population
   - 7 major lineages recognized worldwide
   - ~4.4 Mbp genome
   - No ECEs
   - ~10% repetitive regions
   - 37 complete MTB reference genomes, 1000s of draft assemblies

6. One key to stopping TB is UNDERSTANDING TRANSMISSION

7. BCCDC is responsible for communicable disease diagnosis, surveillance, epidemiology, and prevention in British Columbia, Canada.

8. SURVEILLANCE IDENTIFIES TB CASES

9. molecular epidemiology identifies clustered isolates

10. MOLECULAR TYPING OF M. TUBERCULOSIS
   SPOLIGOTYPING
   - 43 oligonucleotide spacers between conserved direct repeats
   - Hybridisation assay: is spacer present or not?
   - Binary 0 or 1
   - 43-digit binary string converted to 15-digit string using octal transformation
   IS6110-RFLP
   - Restriction enzyme digest followed by electrophoresis
   - Probe these ladders for IS6110 insertion element
   - Final pattern is just the bands with IS6110
   MIRU-VNTR
   - PCR amplification of 12-24 MIRU (Mycobacterial Interspersed Repetitive Unit) VNTR regions
   - Size of amplified product indicates number of repeats
   - Final fingerprint is a 12 or 24-digit number

11. contact tracing identifies transmissions

12. LIMITATIONS OF CURRENT METHODS
   - Genotyping methods only tell you a cluster of cases exists, not the order/direction of transmission
   - Size/membership of the cluster varies with the molecular typing method(s) used
   - Epidemiological investigation is required to derive the links between cases, and may not be available or of sufficient quality

13. genomic epidemiology, n. reading whole genome sequences from outbreak isolates to track person-to-person spread of an infectious disease.

14. AAAAAA

15. AAAAAA AAAAAA AACAAA

16. AAAAAA AAAAAA AACAAA AACAAA GACAAA AAAATA AAAAAA

17. AAAAAA AACAAA AACAAA AACTAA AACTAA AACAAG

18. TELEPHONE (ART BY DEVIANTART USER SCUMMY)

19. Quick et al, BMJ Open (2014) PMID: 25371418

20. Halachev et al, Genome Med (2014) PMID: 25414729

21. EPIDEMICS

22. Eppinger et al, MBio (2014) PMID: 25370488
   understanding epidemic origins

23. Deng et al, Emerg Infect Dis (2014) PMID: 25147968
   WHAT DRIVES PATHOGEN EMERGENCE?

24. Grad et al, Lancet Infect Dis (2014) PMID: 24462211
   EPIDEMIOLOGICAL & CLINICAL TRENDS

25. WGS FOR OUTBREAK INVESTIGATION: HOORAY FOR CLONAL PATHOGENS (INTRODUCTION II)

26. outbreak reconstruction, typing, diagnosis, speciation, genome annotation, comparative genomics, identify virulence factors, predict drug resistance

27.
   outbreak reconstruction, typing, diagnosis, speciation, genome annotation, comparative genomics, identify virulence factors, predict drug resistance
   plasmids, SNPs, indels, recombination, inversions, translocations, repetitive elements

28. outbreak reconstruction, typing, diagnosis, speciation, genome annotation, comparative genomics, identify virulence factors, predict drug resistance
   plasmids, SNPs, indels, recombination, inversions, translocations, repetitive elements
   reference mapping, de novo assembly

29. (repeats slide 28)

30. (repeats slide 28)

31. ADVANTAGES OF WORKING WITH CLONAL PATHOGENS IN WGS
   - Genetically monomorphic: limited/no recombination/HGT, low diversity compared to other organisms
   - Easy to find a reference genome to align reads against
   - De novo assembly also easier
   - Diversity largely arises through insertions, deletions, and point mutations
   - Identification of these elements is a single-step process
   - Can use most of the genome for comparing multiple isolates, instead of a small subset of core genes
   - More data, more accurate phylogenies, prediction of function and resistance

32. Stucki & Gagneux, Tuberculosis

33. Comas et al, PLoS One
   1. Align your genome against a standard reference genome, find variation
   2. Assign it to a lineage with the lineage-defining variations
   3. Within a lineage, place your isolate into the phylogeny of previously sequenced genomes
   4. Look for SNPs indicating drug resistance or epidemiological clustering

34.
   NATION-WIDE WGS OF TB
   SPECIATE, RESISTANCE TYPE, EPI LINKS
   SINGLE DATABASE

35. RECAP SO FAR
   - WGS CAN BE USED TO TRACK PERSON-TO-PERSON TRANSMISSION AND EPIDEMIC DYNAMICS (GENOMIC EPIDEMIOLOGY)
   - CLONAL PATHOGENS (E.G. TB, MRSA, Y. PESTIS, B. ANTHRACIS, ETC) ARE AN ESPECIALLY GOOD USE CASE FOR WGS
   - GENOMIC EPI REQUIRES MAPPING TO A REFERENCE AND CALLING SNPS

36. GETTING A GOOD DATASET (LESSON 1)
   #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample
   gi|50953765|ref|NC_002755.2| 235 . CG C 328.97 . AC=1;AF=1.00;AN=1;BaseQRankSum=1.570;DP=52;FS=1.221;MLEAC=1;MLEAF=1.00;MQ=59.61;MQ0=0;MQRankSum=-0.984;QD=6.33;RPA=3,2;RU=G;ReadPosRankSum=0.797;STR GT:AD:DP:GQ:MLPSAC:MLPSAF:PL 1:15,31:52:99:1:1.00:368,0
   gi|50953765|ref|NC_002755.2| 238 . GC G 403.97 . AC=1;AF=1.00;AN=1;BaseQRankSum=2.733;DP=53;FS=0.000;MLEAC=1;MLEAF=1.00;MQ=59.61;MQ0=0;MQRankSum=-2.059;QD=7.62;RPA=4,3;RU=C;ReadPosRankSum=0.349;STR GT:AD:DP:GQ:MLPSAC:MLPSAF:PL 1:15,30:52:99:1:1.00:443,0
   gi|50953765|ref|NC_002755.2| 3631 . GC G 215.97 . AC=1;AF=1.00;AN=1;BaseQRankSum=-2.204;DP=54;FS=69.670;MLEAC=1;MLEAF=1.00;MQ=58.43;MQ0=0;MQRankSum=-1.742;QD=4.00;RPA=4,3;RU=C;ReadPosRankSum=0.384;STR GT:AD:DP:GQ:MLPSAC:MLPSAF:PL 1:17,25:52:99:1:1.00:255,0
   gi|50953765|ref|NC_002755.2| 4123 . C T 1459 . AC=1;AF=1.00;AN=1;DP=55;Dels=0.00;FS=0.000;HaplotypeScore=13.3300;MLEAC=1;MLEAF=1.00;MQ=59.28;MQ0=0;QD=26.53 GT:AD:DP:GQ:MLPSAC:MLPSAF:PL 1:0,55:55:99:1:1.00:1489,0
   [The slide continues with further VCF records of the same form for NC_002755.2, through position 40956, where the transcript cuts off mid-record.]

37. SEQUENCING CONSIDERATIONS
   - What depth of coverage do I need? 50x-100x to facilitate SNP calling. Don't multiplex too much!
   - Should I sequence multiple isolates from a patient? Useful for chronic/latent infections
   - Can I send multiple outbreaks for sequencing? LIMS check
   - Should I generate one long-read scaffold? Can finish genomes this way

38. BIOINFORMATICS GUIDANCE

39. MY USUAL PIPELINE
   - Read QC with FastQC
   - Map against reference with BWA-MEM
   - Call SNVs with samtools pileup
   - Output a VCF file with SNVs only (no indels)
   - Custom Python script to filter out SNVs common to all sequenced isolates and format remainder as a table
   - Remove all SNVs within 50bp of another
   - High coverage dataset makes SNV calling based on qual score thresholds easy
   - Manually inspect each SNV using a BAM viewer tool

40. LOOK AT YOUR DATA

41. MY USUAL PIPELINE
   - Read QC with FastQC
   - Map against reference with BWA-MEM
   - Call SNVs with samtools pileup
   - Output a VCF file with SNVs only (no indels)
   - Custom Python script to filter out SNVs common to all sequenced isolates and format remainder as a table
   - Remove all SNVs within 50bp of another
   - High coverage dataset makes SNV calling based on qual score or other thresholds easy
   - Manually inspect each SNV using a BAM viewer tool

42.
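
The two filtering steps named in the pipeline slides (drop SNVs common to all sequenced isolates, then remove SNVs within 50bp of another) could look roughly like the Python sketch below. This is not the presenter's script. The function names, the dict-of-dicts input layout, and the toy alleles are all hypothetical. Two assumptions are made: "within 50bp" is read as a distance of 50 or less, and the proximity filter is applied after the shared-SNV filter.

```python
from typing import Dict, List

# Hypothetical input layout: isolate name -> {reference position: alternate base}.
# A position missing from an isolate's dict means "matches the reference".
Calls = Dict[str, Dict[int, str]]


def drop_shared_snvs(calls: Calls) -> List[int]:
    """Return sorted positions at which the isolates do NOT all carry the same allele."""
    positions = sorted({pos for snvs in calls.values() for pos in snvs})
    informative = []
    for pos in positions:
        # None stands for the reference allele in isolates without a call here.
        alleles = {snvs.get(pos) for snvs in calls.values()}
        if len(alleles) > 1:
            informative.append(pos)
    return informative


def drop_clustered_snvs(positions: List[int], window: int = 50) -> List[int]:
    """Remove every position that lies within `window` bp of a neighbouring position."""
    positions = sorted(positions)
    kept = []
    for i, pos in enumerate(positions):
        near_prev = i > 0 and pos - positions[i - 1] <= window
        near_next = i + 1 < len(positions) and positions[i + 1] - pos <= window
        if not (near_prev or near_next):
            kept.append(pos)
    return kept


def format_table(calls: Calls, positions: List[int]) -> str:
    """Tab-separated table: one row per position, one column per isolate ('.' = reference)."""
    isolates = sorted(calls)
    rows = ["\t".join(["pos"] + isolates)]
    for pos in positions:
        rows.append("\t".join([str(pos)] + [calls[iso].get(pos, ".") for iso in isolates]))
    return "\n".join(rows)


# Toy data, invented for illustration only (not taken from any real outbreak).
toy_calls: Calls = {
    "A": {4123: "T", 17609: "G", 22351: "A"},
    "B": {4123: "T", 17609: "G", 22380: "C", 30500: "C"},
    "C": {4123: "T", 30500: "C"},
}

variable = drop_shared_snvs(toy_calls)    # 4123 is shared by all three isolates
filtered = drop_clustered_snvs(variable)  # 22351 and 22380 are 29 bp apart
table = format_table(toy_calls, filtered)
print(table)
```

A table like this is a convenient starting point for the final step on the slide, manually inspecting each remaining SNV in a BAM viewer.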