#ievobio Keynote - June 26, 2013
![Page 1: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/1.jpg)
Visualizing biodiversity in the era of high-throughput
sequencing
Holly Bik, UC Davis @Dr_Bik
![Page 2: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/2.jpg)
Our ability to visualize high-throughput sequencing data is as
bad as my title slide
![Page 3: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/3.jpg)
$250k, 1 year
“A Research-Driven Data Visualization Framework for High-Throughput Environmental Sequence Data”
![Page 4: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/4.jpg)
http://pitchinteractive.com @pitchinc
![Page 5: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/5.jpg)
“Pitch Interactive dissects large data sets in search of meaningful and often hidden patterns that
serve to determine the shape and form that best tells a story.”
![Page 6: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/6.jpg)
Diverse marine community
EASY! EASY! EASY!
VERY Difficult!
![Page 7: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/7.jpg)
![Page 8: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/8.jpg)
Mark Rothko, No. 14, 1960
“rectangles of orange and purple with soft edges”
![Page 9: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/9.jpg)
http://pippascabinet.blogspot.com/2012/11/on-true-love.html
![Page 10: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/10.jpg)
Challenge 1: Environmental data is terrible at revealing fine-scale
taxonomic patterns
![Page 11: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/11.jpg)
[Figure: ordination plot of samples on PC1 (13.03%), PC2 (12.21%), and PC3 (10.54%). Sample labels: ShallowGulf, ShallowCalif, Atlantic22#1, Atlantic25#2, Atlantic29, Atlantic43, Atlantic45, Pacific128, Pacific528, Pacific422, Pacific321, Pacific237]
Overarching Community Patterns
Bik et al. 2012, Molecular Ecology, 21(5):1048-59
![Page 12: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/12.jpg)
[Figure: community composition plot, axis scaled 0 to 1]
Pre-spill: Nematode Dominance
Post-spill: Fungal Dominance
Bik et al. 2012, PLoS ONE, 7(6):e38550
![Page 13: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/13.jpg)
[Figure: taxonomic composition, Grand Isle, Louisiana; callout label: Fungi]
Legend: Algae; Environmental; Fungi; Metazoa (Annelida, Arthropoda, Gastrotricha, Nematoda, Platyhelminthes, Acanthocephala, Brachiopoda, Bryozoa, Chordata, Cnidaria, Echiura, Entoprocta, Mollusca); No Match; Stramenopiles; Unicellular Eukaryotes
Bik et al. 2012, PLoS ONE, 7(6):e38550
![Page 14: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/14.jpg)
Exploring Trees
Ecologically, what are these reference taxa doing?
![Page 15: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/15.jpg)
Pertinent info for biological interpretations of DNA data
![Page 16: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/16.jpg)
Challenge 2: Taxonomic, phylogenetic, and ecological knowledge is imperative for
making meaningful interpretations of high-throughput sequence datasets
![Page 17: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/17.jpg)
Enoplus spp.
Daptonema spp.
Robbea spp.
Caenorhabditis elegans
Actinomyces spp.
Clostridium spp.
Listeria spp.
Synechococcus spp.
![Page 18: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/18.jpg)
Challenge 3: Extreme bioinformatics bottleneck for
microbial eukaryote data
![Page 19: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/19.jpg)
rDNA copy number & genome size in eukaryotes
Prokopowich CD, Gregory TR, Crease TJ. (2003) Genome, 46(1):48–50.
![Page 20: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/20.jpg)
Bik et al., in revision
…and in ONE genus of nematodes
Caenorhabditis brenneri ~323 rRNA gene copies
Caenorhabditis briggsae ~56 rRNA gene copies
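The copy numbers above imply that rRNA read counts cannot be read directly as organism counts. As a rough illustration only, the sketch below assumes a hypothetical sample with one individual of each species and assumes that reads are proportional to rRNA gene copies (no PCR bias, same number of cells per animal). The function name and the sample composition are invented for this example.

```python
# Approximate rRNA gene copy numbers quoted on the slide.
copies = {
    "Caenorhabditis brenneri": 323,
    "Caenorhabditis briggsae": 56,
}


def expected_read_share(copies_per_genome, individuals):
    """Expected fraction of rRNA reads per species.

    Assumption: reads scale with (individuals x gene copies). Real data
    also reflect PCR bias and differences in cell number per animal.
    """
    weighted = {sp: individuals[sp] * n for sp, n in copies_per_genome.items()}
    total = sum(weighted.values())
    return {sp: w / total for sp, w in weighted.items()}


# Hypothetical sample: exactly one individual of each species.
share = expected_read_share(copies, {sp: 1 for sp in copies})
for species, fraction in share.items():
    print(f"{species}: {fraction:.1%} of reads")
```

Under these assumptions the two species are equally abundant as animals, yet one would contribute several times more reads than the other.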
![Page 21: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/21.jpg)
| OCTU | Reads | OCTU Length | Bit Score | E-Value | Match bp | Total bp | % Similarity | Chimera | DB match |
|---|---|---|---|---|---|---|---|---|---|
| 27 | 63 | 266 | 525 | e-146 | 265 | 265 | 100 | -1 | B. seani 175 |
| 12 | 9 | 265 | 500 | e-138 | 261 | 264 | 98.86 | -1 | B. seani 175 |
| 170 | 8 | 264 | 496 | e-137 | 261 | 264 | 98.86 | 0 | B. seani 175 |
| 513 | 1 | 264 | 494 | e-136 | 259 | 262 | 98.85 | -2 | B. seani 175 |
| 579 | 2 | 263 | 492 | e-136 | 258 | 261 | 98.85 | -2 | B. seani 175 |
| 570 | 1 | 262 | 492 | e-136 | 258 | 261 | 98.85 | -1 | B. seani 175 |
| 394 | 1 | 263 | 490 | e-135 | 260 | 264 | 98.48 | 1 | B. seani 175 |
| 19 | 2 | 269 | 488 | e-135 | 264 | 269 | 98.14 | 0 | B. seani 175 |
| 658 | 1 | 266 | 486 | e-134 | 260 | 265 | 98.11 | -1 | B. seani 175 |
| 412 | 2 | 264 | 480 | e-132 | 260 | 265 | 98.11 | 1 | B. seani 175 |
| 465 | 9 | 254 | 478 | e-132 | 251 | 254 | 98.82 | 0 | B. seani 175 |
| 1164 | 1 | 268 | 478 | e-132 | 261 | 267 | 97.75 | -1 | B. seani 175 |
| 304 | 1 | 261 | 474 | e-130 | 255 | 260 | 98.08 | -1 | B. seani 175 |
| 868 | 1 | 244 | 460 | e-126 | 242 | 245 | 98.78 | 1 | B. seani 175 |
| 514 | 2 | 274 | 458 | e-126 | 263 | 272 | 96.69 | -2 | B. seani 175 |
| 683 | 1 | 250 | 426 | e-116 | 241 | 249 | 96.79 | -1 | B. seani 175 |
| 627 | 1 | 230 | 422 | e-115 | 223 | 226 | 98.67 | -4 | B. seani 175 |
| 171 | 3 | 212 | 400 | e-108 | 209 | 211 | 99.05 | -1 | B. seani 175 |
| 1223 | 1 | 202 | 355 | 5.00E-95 | 198 | 204 | 97.06 | 2 | B. seani 175 |

Porazinska et al. 2010 Zootaxa
Intragenomic variation in Eukaryotic rRNA
Tail
Head
Artificial control community containing known nematode species, all with corresponding full-length reference 18S sequences
Head-Tail Pattern in Nematode OTUs
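One way to make the head-tail pattern concrete is to group the OCTUs on this slide by their best database match and compare the most abundant OCTU (the "head") with everything else (the "tail"). The sketch below uses the OCTU IDs and read counts listed on the slide; the function name and the grouping rule are assumptions made for illustration, not the method of the cited study.

```python
from collections import defaultdict

# (OCTU id, reads, best database match) as listed on the slide.
octus = [
    (27, 63, "B. seani 175"), (12, 9, "B. seani 175"),
    (170, 8, "B. seani 175"), (513, 1, "B. seani 175"),
    (579, 2, "B. seani 175"), (570, 1, "B. seani 175"),
    (394, 1, "B. seani 175"), (19, 2, "B. seani 175"),
    (658, 1, "B. seani 175"), (412, 2, "B. seani 175"),
    (465, 9, "B. seani 175"), (1164, 1, "B. seani 175"),
    (304, 1, "B. seani 175"), (868, 1, "B. seani 175"),
    (514, 2, "B. seani 175"), (683, 1, "B. seani 175"),
    (627, 1, "B. seani 175"), (171, 3, "B. seani 175"),
    (1223, 1, "B. seani 175"),
]


def head_and_tail(rows):
    """Group OCTUs by best match; head = most abundant OCTU, tail = the rest."""
    by_match = defaultdict(list)
    for octu_id, reads, match in rows:
        by_match[match].append((reads, octu_id))
    summary = {}
    for match, members in by_match.items():
        members.sort(reverse=True)  # highest read count first
        head_reads, head_id = members[0]
        tail = members[1:]
        summary[match] = {
            "head_octu": head_id,
            "head_reads": head_reads,
            "tail_octus": len(tail),
            "tail_reads": sum(reads for reads, _ in tail),
        }
    return summary


summary = head_and_tail(octus)
print(summary["B. seani 175"])
```

Counting each OCTU as a separate taxon would report 19 units here, although every one of them matches the same reference sequence.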
![Page 22: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/22.jpg)
99% cutoff
OTUs as ‘Clouds’
97% cutoff
How to correlate OTUs with biological species?
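The effect of the cutoff can be sketched with a toy greedy clustering. This is a deliberately simplified stand-in for real OTU pickers: it assumes pre-aligned, equal-length sequences, uses plain positional identity, and runs on made-up sequences. All names (`identity`, `greedy_otus`, `mutate`) are hypothetical.

```python
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Return a copy of seq with a substitution at each given position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity needs pre-aligned, equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, cutoff):
    """Greedy centroid clustering: each sequence joins the first centroid
    it matches at >= cutoff identity, otherwise it founds a new OTU."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= cutoff:
                clusters[i].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters


# Made-up 100 bp "reads": one reference plus variants with 1, 2, and 3 mismatches.
base = "ACGT" * 25
reads = [base, mutate(base, [10]), mutate(base, [20, 30]), mutate(base, [40, 50, 60])]

otus_97 = greedy_otus(reads, 0.97)
otus_99 = greedy_otus(reads, 0.99)
print(len(otus_97), "OTU(s) at 97%;", len(otus_99), "OTU(s) at 99%")
```

On this toy input the 97% cutoff collapses all four reads into one OTU, while the 99% cutoff splits them into three, which is the "cloud" problem in miniature: the number of OTUs depends on the threshold, not only on the number of species.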
![Page 23: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/23.jpg)
Sparse Databases for Eukaryotes
SILVA 108 Ref rRNA Database (16S/18S)
Bacteria: 530,197
Archaea: 25,658
Eukaryotes: 62,587
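To put these counts in proportion, a few lines of arithmetic give each domain's share of the reference sequences. The numbers are the ones on the slide; the variable names are just for this example.

```python
# Sequence counts per domain as quoted on the slide (SILVA 108 Ref).
silva_108_ref = {"Bacteria": 530_197, "Archaea": 25_658, "Eukaryotes": 62_587}

total = sum(silva_108_ref.values())
shares = {domain: n / total for domain, n in silva_108_ref.items()}

for domain, share in shares.items():
    print(f"{domain}: {share:.1%} of {total:,} sequences")
```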
![Page 24: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/24.jpg)
Ambiguous Taxonomy

| Taxa | Region 1, 95% | Region 2, 95% | Region 1, 99% | Region 2, 99% |
|---|---|---|---|---|
| Metazoa (20 Phyla) | 1360 | 1461 | 43255 | 25668 |
| Nematoda | 765 | 879 | 27020 | 15518 |
| Annelida | 217 | 197 | 7073 | 3869 |
| Arthropoda | 128 | 178 | 2280 | 2323 |
| Unicellular eukaryotes | 738 | 1257 | 15198 | 22020 |
| Environmental isolates | 774 | 686 | 12687 | 9775 |
| No match | 480 | 354 | 11345 | 1868 |
| Fungi | 225 | 163 | 9984 | 2445 |
| Stramenopiles | 137 | 146 | 1771 | 1583 |
| Algae | 111 | 96 | 975 | 861 |
| Total (all taxa) | 3825 | 4163 | 95215 | 64220 |

Deep sea and shallow water marine sediment; 1.2 million reads, 454 GS FLX Titanium
Bik et al. 2012, Molecular Ecology, 21(5):1048-59
![Page 25: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/25.jpg)
Goal 1: A web-based, scalable visualization framework for
standard data formats
![Page 26: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/26.jpg)
Tier One
Standard outputs from bioinformatic pipelines
![Page 27: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/27.jpg)
• BIOM (JSON) files – OTU tables, metagenome datasets
• Tab-delimited metadata files
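As a minimal sketch of what reading these two inputs could look like, the example below parses a tiny hand-made document laid out like JSON-based BIOM 1.0 (sparse matrix) and a tab-delimited metadata table, using only the Python standard library. The OTU and sample IDs, the counts, the metadata columns, and the `#SampleID` header are all made up for illustration; this is not the framework's actual loader.

```python
import csv
import io
import json

# Tiny hand-made document in the BIOM 1.0 JSON layout; values are invented.
BIOM_TEXT = """
{
  "id": null,
  "format": "Biological Observation Matrix 1.0.0",
  "type": "OTU table",
  "matrix_type": "sparse",
  "matrix_element_type": "int",
  "shape": [3, 2],
  "rows": [{"id": "OTU_1", "metadata": null},
           {"id": "OTU_2", "metadata": null},
           {"id": "OTU_3", "metadata": null}],
  "columns": [{"id": "SampleA", "metadata": null},
              {"id": "SampleB", "metadata": null}],
  "data": [[0, 0, 12], [0, 1, 3], [1, 1, 7], [2, 0, 1]]
}
"""

# Hypothetical metadata table; the "#SampleID" header is an assumption.
METADATA_TSV = "#SampleID\tDepth_m\tSite\nSampleA\t5\tShallow\nSampleB\t2100\tDeep\n"


def load_biom_counts(text):
    """Return {sample_id: {otu_id: count}} from a BIOM 1.0 JSON string."""
    doc = json.loads(text)
    n_rows, n_cols = doc["shape"]
    if doc["matrix_type"] == "sparse":
        # Sparse data is a list of [row_index, column_index, value] triples.
        dense = [[0] * n_cols for _ in range(n_rows)]
        for r, c, v in doc["data"]:
            dense[r][c] = v
    else:
        dense = doc["data"]
    otus = [row["id"] for row in doc["rows"]]
    samples = [col["id"] for col in doc["columns"]]
    return {s: {o: dense[i][j] for i, o in enumerate(otus)}
            for j, s in enumerate(samples)}


def load_metadata(text):
    """Return {sample_id: row_dict} from a tab-delimited metadata string."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["#SampleID"]: row for row in reader}


counts = load_biom_counts(BIOM_TEXT)
metadata = load_metadata(METADATA_TSV)
for sample, otu_counts in counts.items():
    print(sample, metadata[sample]["Site"], otu_counts)
```

Keying both structures by sample ID is one simple way to let a visualization join abundance with metadata; a real tool would also need to handle missing samples and observation metadata such as taxonomy.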
![Page 28: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/28.jpg)
![Page 29: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/29.jpg)
http://explore.climbsf.com
![Page 30: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/30.jpg)
Goal 2: Destroy biologists’ addiction to pie charts
![Page 31: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/31.jpg)
A pie chart is not the most informative way to interpret
biodiversity data!
![Page 32: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/32.jpg)
Tier Two
![Page 33: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/33.jpg)
Bacteria, Archaea, Nematodes, Ciliates, Crustaceans
Circle size = species abundance
Circle color = metadata (sample, temperature, pH, etc.)
Mockup example taken from http://www.wefeelfine.org/
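The encoding in this mockup (size for abundance, color for metadata) can be sketched as two small mapping functions. This is an illustration under stated assumptions rather than the project's design: the radius is scaled by the square root of abundance so that circle area tracks abundance, temperature is the example metadata field, and the taxa, counts, and temperatures are invented.

```python
import math


def circle_radius(abundance, max_abundance, max_radius=40.0):
    """Radius chosen so that circle AREA (not radius) scales with abundance."""
    if max_abundance <= 0:
        raise ValueError("max_abundance must be positive")
    return max_radius * math.sqrt(abundance / max_abundance)


def temperature_to_rgb(value, low, high):
    """Linear blue-to-red ramp for one numeric metadata field."""
    t = min(1.0, max(0.0, (value - low) / (high - low)))
    return (round(255 * t), 0, round(255 * (1 - t)))


# Hypothetical taxa with read counts and sample temperatures (degrees C).
taxa = [("Nematodes", 400, 4.0), ("Ciliates", 100, 12.0), ("Bacteria", 25, 20.0)]
biggest = max(count for _, count, _ in taxa)

circles = [
    {
        "taxon": name,
        "radius": circle_radius(count, biggest),
        "rgb": temperature_to_rgb(temp, 4.0, 20.0),
    }
    for name, count, temp in taxa
]
for circle in circles:
    print(circle)
```

Scaling area rather than radius avoids visually exaggerating abundant taxa, which is a common pitfall when circles encode quantities.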
![Page 34: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/34.jpg)
Goal 4: Find intuitive ways to visualize new data outputs
![Page 35: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/35.jpg)
Explicitly Phylogenetic Approaches
Aligned environmental sequences
Guide Tree
Evolutionary Placement of short reads
![Page 36: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/36.jpg)
http://phylosift.wordpress.com
![Page 37: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/37.jpg)
[Figure: PhyloSift workflow diagram. Recoverable labels:]
• Input Sequences; each input sequence scanned against both workflows (rRNA workflow, protein workflow)
• Search input against references: LAST fast candidate search
• Profile HMMs used to align candidates to reference alignment: hmmalign multiple alignment; Infernal multiple alignment; reads split at <600 bp and >600 bp
• pplacer phylogenetic placement (parallel option)
• Taxonomic Summaries: Krona plots, number of reads placed for each marker gene
• Sample Analysis & Comparison: Edge PCA, tree visualization, Bayes factor tests
![Page 38: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/38.jpg)
Probability Distributions: when a pie chart is not a pie chart
![Page 39: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/39.jpg)
![Page 40: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/40.jpg)
Great!
Not Bad
Getting Tricky…
![Page 41: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/41.jpg)
Marine Metagenome
Tree Placement: Sing Tree - Guppy
![Page 42: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/42.jpg)
Goal 5: ~~Pester other people~~ Solicit case study participants
![Page 43: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/43.jpg)
![Page 44: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/44.jpg)
Goal 6: (Phase 2) Build a user and developer community
![Page 45: #ievobio Keynote - June 26, 2013](https://reader033.vdocuments.net/reader033/viewer/2022051817/54813404b37959582b8b5d4b/html5/thumbnails/45.jpg)
Acknowledgements
Jonathan Eisen, Aaron Darling, Guillaume Jospin, Dongying Wu, David Coil
Further Information
• @Dr_Bik – updates posted to Twitter
• Grant proposal now posted on Figshare