Practical course in genome bioinformatics

Day 6 – lecture
Manual annotation (Web Apollo & automatic tools)

24 Feb 2017
http://ekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017

26.2.2016 juhana.kammonen@helsinki.fi http://padlet.com/juhana_kammonen/bioinfo

Genome project “roadmap”

• After experimental design and preparations a genome project can be roughly split into the following steps:

1. Sequencing
2. (De novo) assembly, scaffolding
3. RNA-sequencing and mapping
4. Gene prediction
5. Manual & functional annotation
6. Submission and publication of the genome in a biodatabase
7. Further downstream analysis

You are here!

Manual annotation - outline

• Repetitive elements prediction (ab initio) finds and masks repeats in the genome

• Automated gene prediction reveals potential gene content from the genome
  • A challenging task, especially in eukaryotic genomes

• After gene prediction the genome must be manually curated
  • Apply a set of trained human eyes to evaluate different tracks of evidence and find potential errors in gene predictions

The “rocky path” from de novo assembly to manual annotation

Evidence tracks not covered well by gene prediction

• RNA-sequencing coverage
  • Correlates only weakly with the actual gene location, but does indicate the expression level
  • Should be included as an evidence track in Web Apollo

• Splicing in eukaryotes
  • Exons may be incorrectly linked or separated in the gene models
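
One simple sanity check for exon linkage is to look at the intron boundaries of a gene model: most eukaryotic introns start with GT and end with AG. The sketch below is only an illustration under stated assumptions: the function names, the toy sequence and the exon coordinates are hypothetical, coordinates are 0-based and end-exclusive, and only the forward strand is handled. A flagged intron is a candidate for a closer look in the evidence tracks, not proof of an error.

```python
def intron_intervals(exons):
    """Return the (start, end) intervals between consecutive exons.

    Exons are (start, end) tuples, 0-based and end-exclusive.
    """
    ordered = sorted(exons)
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def noncanonical_introns(sequence, exons):
    """List introns that do not start with GT and end with AG (forward strand only)."""
    seq = sequence.upper()
    flagged = []
    for start, end in intron_intervals(exons):
        intron = seq[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            flagged.append((start, end))
    return flagged


# Hypothetical toy locus: three exons separated by two introns.
# The second intron starts with GC instead of GT, so it is flagged.
toy_sequence = "ATGGCC" + "GTAAGTTTTCAG" + "GAACTG" + "GCAAGTTTTCAG" + "TTTTAA"
toy_exons = [(0, 6), (18, 24), (36, 42)]

print(noncanonical_introns(toy_sequence, toy_exons))
```

In practice such a check would be combined with RNA-sequencing read alignments spanning the junction before a gene model is edited.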

Gene prediction and splicing

Cantarel BL, Korf I, Robb SMC, Parra G, Ross E, Moore B, Holt C, Sanchez Alvarado A, Yandell M (2008). MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research, 18(1), 188–196.

Gene prediction accuracy revisited

Web Apollo manual annotation tool

Lee E, Helt GA, Reese JT, Munoz-Torres MC, Childers CP, Buels RM, Stein L, Holmes IH, Elsik CG, Lewis SE (2013). Web Apollo: a web-based genomic annotation editing platform. Genome Biology, 14(8), R93.

Web Apollo annotation view

Basic recipe for manually annotating a single gene

• Use as many relevant tracks of evidence as possible to verify a predicted gene

• Align the predicted sequence against various databases
  • BLAST
  • Possible databases of related species

• Add comments on your findings to the annotation
  • If the predicted annotation was modified, specify the reason
  • Web Apollo allows annotations to be marked, e.g. as “needs revision”
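
The database alignment step in the recipe usually ends with a table of hits that has to be summarised per gene model. The following sketch assumes BLAST was run with tabular output (`-outfmt 6`, default 12 columns) and keeps the best-scoring hit per query. The function name, the E-value cutoff of 1e-5 and all identifiers in the example rows are hypothetical choices made for illustration, not values from the course.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return the highest-bitscore hit per query, ignoring hits above max_evalue."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical example rows; real input would come from a BLAST run.
example_rows = [
    ["gene_model_1", "ref_protein_A", "92.5", "200", "15", "0",
     "1", "200", "1", "200", "1e-120", "380"],
    ["gene_model_1", "ref_protein_B", "45.0", "180", "99", "3",
     "5", "184", "10", "189", "2e-30", "110"],
    ["gene_model_2", "ref_protein_C", "30.0", "50", "35", "1",
     "1", "50", "3", "52", "0.5", "25.0"],
]
example_text = "\n".join("\t".join(row) for row in example_rows)

hits = best_hits(example_text)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["pident"], hit["evalue"])
```

A gene model with no hit below the cutoff (here `gene_model_2`) is not necessarily wrong, but it is a natural candidate for a comment or a “needs revision” mark.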

Today’s features

• Web Apollo – collaborative online manual annotation tool
  • Lee E, Helt GA, Reese JT, Munoz-Torres MC, Childers CP, Buels RM, Stein L, Holmes IH, Elsik CG, Lewis SE (2013). Web Apollo: a web-based genomic annotation editing platform. Genome Biology, 14(8), R93.

• BLAST – Basic Local Alignment Search Tool
  • “Swiss army knife” of a bioinformatician
  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403–410.

• MAFFT – Multiple Alignment Using Fast Fourier Transform
  • Katoh K, Misawa K, Kuma K, Miyata T (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059–3066.

Manual annotation efforts at Viikki campus

• Betula pendula (silver birch) genome annotation in spring of 2014
  • 1000 genes curated and annotated during 3 weeks

• Taphrina betulina genome annotation in 2015
  • 800 genes curated and annotated during 2 weeks

• P. hispida saimensis (Saimaa ringed seal) genome annotation coming up later this year

Next: Computer exercises

• Getting familiar with Web Apollo http://apollo.berkeleybop.org

• Example annotation of a gene in Apis mellifera (honeybee) genome with Web Apollo

• Confirmation of proper annotation using BLAST and MAFFT

• Download exercise sheet from: http://ekhidna.biocenter.helsinki.fi/downloads/teaching/spring2017/Exercises_day6.pdf
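
For the confirmation step, a quick numeric summary of an alignment can complement visual inspection. The sketch below assumes the alignment is available as aligned FASTA text (the default MAFFT output format) and computes percent identity over columns where neither sequence has a gap. The function names, sequence names and the two short example sequences are hypothetical, and other identity definitions (for example counting gap columns) would give different numbers.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the first word of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {rec_name: "".join(parts) for rec_name, parts in records.items()}


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical two-sequence alignment in aligned FASTA format.
example_alignment = """>curated_model
MKT-LLVA
>reference
MKTALLIA
"""

aligned = read_fasta(example_alignment)
identity = percent_identity(aligned["curated_model"], aligned["reference"])
print(f"{identity:.1f}% identity")
```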
