![Page 1: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/1.jpg)
Scott Emrich
Assistant Professor, Computer Science and Engineering
Scientific Manager, VectorBase
University of Notre Dame
A flexible, scalable genomics framework for integrating
heterogeneous vector sequence data
![Page 2: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/2.jpg)
Assembly required…
![Page 3: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/3.jpg)
VectorBase is here to help (esp. with -omics data)
Please see me and/or Dan Lawson (EBI) anytime during this meeting
![Page 4: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/4.jpg)
Anopheles gambiae M & S
Lawniczak, Emrich et al. 2010 Science
![Page 5: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/5.jpg)
![Page 6: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/6.jpg)
Some genomic regions display a footprint of strong, recent selection
Lawniczak, Emrich et al. 2010 Science
![Page 7: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/7.jpg)
Reference: A C G T C G T T A C T G C
Sample_1: A C G T C G A T A C T G C
Sample_2: A C G T C G T T A T T G C
A C G T C G A T A T T G C
A C G T C G A T A T T G C
A C G T C G A T A C T G C
A C G T C G A T A C T G C
A C G T C G T T A T T G C
A C G T C G T T A T T G C
A C G T C G T T A T T G C
A C G T C G T T A T T G C
FlexReseq tool for integrating diverse sequence data
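The comparison pictured on this slide can be sketched in a few lines of Python. This is only an illustration of comparing each sample against the reference, using the Reference, Sample_1 and Sample_2 sequences from the slide; the names `differences` and `compare_all` are hypothetical, and this is not FlexReseq code.

```python
# Sequences taken from the slide (spaces removed).
REFERENCE = "ACGTCGTTACTGC"
SAMPLES = {
    "Sample_1": "ACGTCGATACTGC",
    "Sample_2": "ACGTCGTTATTGC",
}


def differences(reference, sample):
    """Return (1-based position, reference base, sample base) per mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def compare_all(reference, samples):
    """Apply differences() to every named sample (hypothetical helper)."""
    return {name: differences(reference, seq) for name, seq in samples.items()}


if __name__ == "__main__":
    for name, diffs in compare_all(REFERENCE, SAMPLES).items():
        print(name, diffs)
```

With the slide's sequences, Sample_1 differs from the reference at position 7 (T to A) and Sample_2 at position 10 (C to T).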
![Page 8: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/8.jpg)
FlexReseq implementation
Genome Analysis Toolkit (GATK):
• Map-Reduce framework that allows efficient access to large resequencing data sets
FlexReseq: A module for GATK:
• Configurable interface allows easy data exploration
• Modular implementation of rules allows for easy extension of software
Saves you from lots of scripting (Perl) code!
McKenna et al., Genome Research, 2010
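As a rough illustration of what "modular rules" inside a map-reduce traversal might look like, here is a minimal Python sketch. It does not use the GATK API (GATK is a Java toolkit) and is not FlexReseq's implementation; `Pileup`, `min_depth`, `min_alt_fraction`, `map_site`, `reduce_calls` and `run` are all hypothetical names chosen for this example, and the thresholds are arbitrary.

```python
from collections import Counter
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Pileup:
    """Bases observed at one reference position (hypothetical structure)."""
    position: int
    ref_base: str
    bases: str  # one character per read covering the position


# A rule is any function that says whether a site passes.
Rule = Callable[[Pileup], bool]
Call = Tuple[int, str, str]


def min_depth(threshold: int) -> Rule:
    """Rule: require at least `threshold` reads at the site."""
    return lambda p: len(p.bases) >= threshold


def min_alt_fraction(threshold: float) -> Rule:
    """Rule: require a minimum fraction of non-reference bases."""
    def rule(p: Pileup) -> bool:
        if not p.bases:
            return False
        alt = sum(1 for b in p.bases if b != p.ref_base)
        return alt / len(p.bases) >= threshold
    return rule


def map_site(p: Pileup, rules: List[Rule]) -> Optional[Call]:
    """Map step: emit (position, ref, most common alt) if all rules pass."""
    if not all(rule(p) for rule in rules):
        return None
    alts = Counter(b for b in p.bases if b != p.ref_base)
    if not alts:
        return None
    return (p.position, p.ref_base, alts.most_common(1)[0][0])


def reduce_calls(acc: List[Call], call: Optional[Call]) -> List[Call]:
    """Reduce step: keep only the sites that produced a call."""
    return acc + [call] if call is not None else acc


def run(pileups: Iterable[Pileup], rules: List[Rule]) -> List[Call]:
    """Traverse sites, applying the map step then folding with the reduce step."""
    return reduce(reduce_calls, (map_site(p, rules) for p in pileups), [])


if __name__ == "__main__":
    toy = [Pileup(7, "T", "AAAA"), Pileup(10, "C", "TTCC"), Pileup(3, "G", "GG")]
    print(run(toy, [min_depth(4), min_alt_fraction(0.75)]))
```

In this sketch, adding a new rule means writing one more small function and appending it to the list, which is the kind of extension the slide describes; how FlexReseq actually configures its rules is not shown in the slides.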
![Page 9: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/9.jpg)
![Page 10: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/10.jpg)
![Page 11: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/11.jpg)
A malaria use-case for FlexReseq
Samarakoon, Regier, et al., BMC Genomics, 2011
Why are some parasites drug-resistant?
Goal: we want to connect genotype (genome)
to phenotype (drug response)
How did drug-resistance evolve?
![Page 12: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/12.jpg)
[Figure: sequencing and mapping workflow]
1. Whole genome shotgun sequencing
• Genetic cross: Wellems et al. 1990 [24]
• Parents: HB3, Dd2
• Progeny (recombinants): SC05, 7C126
• Shotgun libraries: GS-FLX technology (454/Roche)
• Parental genomes and progeny genomes [shotgun libraries]: NCBI Trace Archive [28]
2. Reference genome mapping
• Reference genome (3D7): PlasmoDB (v5.4) [27]
• Mapped: SSAHA2 (http://www.sanger.ac.uk)
![Page 13: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/13.jpg)
![Page 14: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/14.jpg)
A more detailed map of P. falciparum
[Figure: panels (A) 7C126 and (B) SC05; x-axis: chromosome position; y-axis: chromosome (1 to 14); legend: Dd2, HB3]
![Page 15: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/15.jpg)
Association of 2La with clines of aridity in Nigeria…
Modified from Coluzzi et al. (1979)
24,000 mosquitoes
194 sampling localities
![Page 16: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/16.jpg)
High-throughput sequencing
• Data from Besansky lab
• Illumina Genome Analyzer
• 4 population pools (S-form)
• SHRiMP alignment
• BWA works also
C. Cheng et al., unpublished
![Page 17: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/17.jpg)
![Page 18: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/18.jpg)
Differential mapping biases do exist
![Page 19: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/19.jpg)
![Page 20: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/20.jpg)
Population haplotyping
![Page 21: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/21.jpg)
In situ error isolation
Has been shown to be important in ancient DNA-based ecology
![Page 22: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/22.jpg)
![Page 23: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/23.jpg)
![Page 24: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/24.jpg)
![Page 25: Scott Emrich Assistant Professor, Computer Science and Engineering Scientific Manager, VectorBase University of Notre Dame A flexible, scalable genomics](https://reader030.vdocuments.net/reader030/viewer/2022032707/56649e4e5503460f94b450bb/html5/thumbnails/25.jpg)
Thanks to…
VectorBase (NIH/NIAID)
• Dr. Nora Besansky (ND)
• Dr. Frank Collins (ND)
• Rory Carmichael, Andrew Shehan, Nate Konopinski, Dave Campbell (ND), others…
Notre Dame Bioinformatics Lab, Summer 2010
Anopheles genome cluster group
i5K
Arthropod Genomics Consortium steering committee