What should Bioinformatics do for EvoDevo?
DESCRIPTION
Presented at Euro Evo Devo 2014 in Vienna
TRANSCRIPT
![Page 1: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/1.jpg)
Insights into the evolution and development of planarian regeneration from the genome of the flatworm Girardia tigrina
SUJAI KUMAR
2014-07-24 VIENNA EURO EVODEVO
WHAT SHOULD BIOINFORMATICS DO FOR EVODEVO?
![Page 2: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/2.jpg)
EVODEVO
SUJAI KUMAR
![Page 3: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/3.jpg)
SUJAI KUMAR
"Winkel triple projection SW" by Strebe - Own workLicensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons http://commons.wikimedia.org/wiki/File:Winkel_triple_projection_SW.jpg
Cartoonist and mathematics teacher in New Delhi
![Page 4: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/4.jpg)
SUJAI KUMAR
Finding patterns in sequences: TIMSS 1999 video study
MS in Educational Psychology at the University of Illinois
![Page 5: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/5.jpg)
SUJAI KUMAR
Self-organising systems research in New Delhi
![Page 6: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/6.jpg)
SUJAI KUMAR
Sequenced four nematode genomes for PhD in Blaxter Lab, Edinburgh
![Page 7: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/7.jpg)
SUJAI KUMAR
Planarian regeneration genomics in Aboobaker Lab, Oxford
![Page 8: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/8.jpg)
Outline of this talk
1. Regeneration, planarian flatworms, and Girardia tigrina
2. Creating G tigrina genomic resources
3. Using these resources to understand regeneration
4. What should bioinformatics do for EvoDevo
![Page 9: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/9.jpg)
1. Regeneration, planarian flatworms, and Girardia tigrina
Bely and Nyberg, 2010 DOI:10.1016/j.tree.2009.08.005
![Page 10: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/10.jpg)
1. Regeneration, planarian flatworms, and Girardia tigrina
Kao, 2014. PhD Thesis “Transcriptome assembly and analysis of the freshwater planarian Schmidtea mediterranea”
[Figure: phylogeny of Platyhelminthes, with tips marked T or G]
Taxa shown: Cestoda, Monogenea, Trematoda, Rhabditophora, Turbellaria, Tricladida, Macrostomorpha, Lecithoepitheliata, Rhabdocoela, Polycladida
Girardia tigrina (G): aboobakerlab.com/genomes
Schmidtea mediterranea (G): smedgd.neuro.utah.edu
![Page 11: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/11.jpg)
1. Regeneration, planarian flatworms, and Girardia tigrina
• What we know already
• Some genes and pathways that are essential for WBR
• Some transcription expression profiles
• No transgenics in any planarian
![Page 12: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/12.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
![Page 13: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/13.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
Illumina HiSeq: Workhorse
• Short paired reads
• ~$£€ 1,000 / 100 MegaBase
• Mate pairs essential
PacBio: expensive
• High quality fly genome
• ~$£€ 10,000 / 100 MegaBase
Nanopore – not a game changer just yet
![Page 14: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/14.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
• Quality Control
• Raw data QC: FastQC
• Preliminary assembly: Blobology
• Separate components: contaminants / endosymbionts / mitochondrial
• Assess insert sizes: bad mate pair libraries confound scaffolding
![Page 15: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/15.jpg)
Taxon-annotated GC-Coverage (TAGC) plots, a.k.a. “Blobology”
Each point is a contig from a preliminary assembly (Caenorhabditis sp. 5)
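The two axes of such a plot can be sketched in a few lines. This is a minimal illustration, not the Blobology code itself: the function names, the toy contigs, the mapped-base counts, and the taxon labels are all invented for the example, and it assumes that per-contig mapped-base totals and taxon assignments have already been produced by a read mapper and a similarity search.

```python
from collections import namedtuple

# One point on the plot: GC on one axis, coverage on the other, taxon as colour.
ContigPoint = namedtuple("ContigPoint", ["name", "gc", "coverage", "taxon"])


def gc_fraction(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T); Ns are ignored."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def mean_coverage(mapped_bases, contig_length):
    """Mean read depth: total read bases mapped to the contig divided by its length."""
    return mapped_bases / contig_length if contig_length else 0.0


def tagc_points(contigs, mapped_bases, taxa):
    """Build one plot point per contig.

    contigs:      dict of contig name -> sequence
    mapped_bases: dict of contig name -> total read bases mapped (hypothetical input)
    taxa:         dict of contig name -> taxon label (hypothetical input)
    """
    points = []
    for name, seq in contigs.items():
        points.append(ContigPoint(
            name=name,
            gc=gc_fraction(seq),
            coverage=mean_coverage(mapped_bases.get(name, 0), len(seq)),
            taxon=taxa.get(name, "no-hit"),
        ))
    return points


# Toy data, made up for illustration only.
contigs = {"c1": "ATGCGC", "c2": "ATATATAT", "c3": "GGCCNNAT"}
mapped = {"c1": 60, "c2": 800}
taxa = {"c1": "Platyhelminthes", "c2": "Proteobacteria"}
points = tagc_points(contigs, mapped, taxa)
```

Contigs from a contaminant or endosymbiont tend to form their own cluster because they differ from the target genome in GC content, coverage, or both, which is what makes the separation step possible.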
![Page 16: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/16.jpg)
[Figure: TAGC plot for Girardia tigrina. x-axis: GC content; y-axis: read coverage]
![Page 17: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/17.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
• Quality Control
• Raw data QC: FastQC
• Preliminary assembly: Blobology
• Separate components: contaminants / endosymbionts / mitochondrial
• Assess insert sizes: bad mate pair libraries confound scaffolding
• Generate many assemblies
• ABySS, CLC, MaSuRCA, SGA, SPAdes, ALLPATHS-LG
• Evaluate assemblies
• FRCbam, REAPR, CGAL
• CEGMA, alignments to known sequences
• Freeze and release
![Page 18: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/18.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
• NOT a great assembly
• But it was GoodEnough™
• Next version with long-insert mate pairs
• Diploid, but high heterozygosity
Raw read data (both versions): ~500M short read pairs, 160 GBases
nGt.0.3 to nGt.0.5: consolidating near identical contigs

| Assembly version | nGt.0.3 | nGt.0.5 |
| --- | --- | --- |
| Total span (Gbases) | 1.898 | 1.500 |
| Num contigs | 581,558 | 422,617 |
| Span of contigs >10kb (bases) | 541,653,308 | 536,575,093 |
| Num contigs >10kb | 29,050 | 27,495 |
| N50 (bases) | 5,751 | 6,827 |
| CEGMA | 45% | 56% |
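For readers who have not computed these contiguity metrics before, the N50 and ">10kb" rows can be reproduced from a list of contig lengths. The sketch below is illustrative only: the function names are invented here, the example lengths are toy values, and it uses the usual definition of N50 (the contig length at which the running total of lengths, sorted longest first, reaches half the assembly span).

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the total span."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def contigs_longer_than(lengths, min_length):
    """Return (number of contigs, summed span) for contigs strictly longer than min_length."""
    selected = [length for length in lengths if length > min_length]
    return len(selected), sum(selected)


# Toy contig lengths, not taken from the G tigrina assembly.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
toy_count, toy_span = contigs_longer_than(toy_lengths, 7)
```

N50 rewards a few long contigs and says nothing about correctness, which is why it sits alongside a gene-completeness measure such as CEGMA in the table.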
![Page 19: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/19.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
• Gene prediction
• RNA-seq
• Predictors: Augustus, SNAP, GeneMark
• Consolidators: MAKER, EVM, ENSEMBL genebuild
• Evaluate: use Annotation Edit Distance (AED) as a metric
• Functional annotation
• InterProScan, Trinotate, Blast2GO
• Community annotation
• WebApollo, Community Annotation Portal
| Annotation version | Num of genes | Num of genes with AED>0.5 | Mean aa length | Num of genes with InterPro annotations |
| --- | --- | --- | --- | --- |
| nGt.0.5.1 | 39,119 | 35,061 | 268 | 22,747 |
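As a reminder of what the AED column measures, here is a minimal base-level sketch. It assumes the commonly cited definition AED = 1 - (SN + SP) / 2, where SN is the fraction of evidence bases covered by the gene model and SP is the fraction of gene-model bases covered by evidence. MAKER's own implementation may differ in detail, and the function names and coordinates below are hypothetical.

```python
def covered_bases(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def annotation_edit_distance(predicted_exons, evidence_exons):
    """Base-level AED: 0.0 means full agreement with evidence, 1.0 means none."""
    predicted = covered_bases(predicted_exons)
    evidence = covered_bases(evidence_exons)
    if not predicted or not evidence:
        return 1.0  # nothing to compare, so treat as unsupported
    shared = len(predicted & evidence)
    sensitivity = shared / len(evidence)   # evidence bases recovered by the model
    specificity = shared / len(predicted)  # model bases backed by evidence
    return 1.0 - (sensitivity + specificity) / 2.0


# Hypothetical exon coordinates for illustration.
aed_identical = annotation_edit_distance([(1, 100)], [(1, 100)])
aed_half_overlap = annotation_edit_distance([(1, 100)], [(51, 150)])
aed_unsupported = annotation_edit_distance([(1, 100)], [(201, 300)])
```

Lower values mean better agreement between a gene model and its supporting evidence, so the score gives one number per gene that can be thresholded and tracked across annotation versions.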
![Page 20: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/20.jpg)
2. Creating G tigrina genomic resources
Sequencing > Assembly > Annotation > Delivery
• Genome Browser
• Blast server
• Bulk data downloads
• Interface
• Badger, Tripal, InterMine, Ensembl
![Page 21: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/21.jpg)
3. Using these resources to understand regeneration
• Individual genes and pathways
• Transgenics
• Protein ortholog analysis
• 4 triclads, 1 other platyhelminth, 2 ecdysozoa, 4 deuterostomes
• 14k out of 40k G tigrina proteins in strict ortholog clusters
• ~8000 triclad-specific clusters
• ~800 triclad-specific clusters with all 4 species represented
• Cis-regulatory analysis
• Neoblast specific regulatory regions
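The cluster counts in the ortholog analysis above come down to asking, for each cluster, which species contribute members. A minimal sketch of that tally follows; it assumes clustering has already been done by an external tool, and the species codes, cluster IDs, and protein IDs are placeholders rather than the identifiers used in the actual analysis.

```python
# Placeholder species codes for the four triclads (hypothetical labels).
TRICLADS = frozenset({"Gtig", "Smed", "TriC", "TriD"})


def classify_clusters(clusters, focal_clade):
    """Count clusters restricted to a clade, and those containing every clade member.

    clusters: dict of cluster ID -> list of (species code, protein ID) pairs
    """
    summary = {"total": 0, "clade_specific": 0, "clade_specific_all_members": 0}
    for members in clusters.values():
        species = {species_code for species_code, _protein in members}
        summary["total"] += 1
        if species and species <= focal_clade:
            summary["clade_specific"] += 1
            if species == focal_clade:
                summary["clade_specific_all_members"] += 1
    return summary


# Toy clusters, invented for illustration.
toy_clusters = {
    "OG1": [("Gtig", "g1"), ("Smed", "s1"), ("TriC", "c1"), ("TriD", "d1")],
    "OG2": [("Gtig", "g2"), ("Smed", "s2")],
    "OG3": [("Gtig", "g3"), ("Hsap", "h1")],
}
toy_summary = classify_clusters(toy_clusters, TRICLADS)
```

Clusters found only in triclads and present in all four species are the natural shortlist for lineage-specific candidates, since a cluster missing from one species may simply reflect an incomplete assembly.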
![Page 22: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/22.jpg)
4. What should bioinformatics do for EvoDevo
• What should I do for an experimental EvoDevo lab
• Visual > Text
• View additional information in place
• Plot everything vs everything
• Create gene models visually
• Routine analyses should not require bioinformatician
• Clear explanations of how a resource was created
• Not too many versions
• Minimum standards
![Page 23: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/23.jpg)
4. What should bioinformatics do for EvoDevo
• What should the bioinformatics community do for me as an EvoDevo bioinformatician
• Best practice documentation for analyses
• Easy to install tools
• Minimum standards for assembly, metadata, annotation, and delivery
• Grants for coordination, tools, resources
![Page 24: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/24.jpg)
Summary
• Please use the resources at aboobakerlab.com/genomes
• Tell us what other resources you’d like to see as standard
• Fund technology development and training
![Page 25: What should Bioinformatics do for EvoDevo?](https://reader033.vdocuments.net/reader033/viewer/2022052822/554ea658b4c905977e8b4936/html5/thumbnails/25.jpg)
Acknowledgements
• AboobakerLab.com
• Aziz Aboobaker
• Natalia Pouchkina-Stantcheva
• Damian Kao
• Yuliana Mihaylova
• Aphrodite Zhao
• Blaxter Lab (nematodes.org)
• Ben Elsworth (Badger)
• Sequencing
• Edinburgh Genomics
• Funding
• BBSRC
• BSDB / Company of Biologists travel grant