NGS Bioinformatics Workshop 1.5 Tutorial – Genome Annotation April 5th, 2012 IRMACS 10900 Facilitator: Richard Bruskiewich Adjunct Professor, MBB

NGS Bioinformatics Workshop 1.5 Tutorial – Genome Annotation

April 5th, 2012
IRMACS 10900

Facilitator: Richard Bruskiewich
Adjunct Professor, MBB

Workflow for Today

Prepare to visualize annotation
Get a genomic sequence from GenBank
Repeat mask it

Retrieve a genomic sequence…

Retrieve a relatively small (<100 kb) eukaryotic genomic sequence clone from GenBank
Query the Nucleotide division
  e.g. Arabidopsis BAC clone (HE601748.1)
Select FASTA
Save.. To File.. As "FASTA" (rename?)
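
The same retrieval can also be scripted rather than done through the web page. The following is a minimal sketch, assuming the NCBI E-utilities `efetch` service and the `nuccore` database. The function names and the output filename are hypothetical and are not part of the workshop material.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession):
    """Build an efetch URL that returns one nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # e.g. the BAC clone accession from the slide
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession, out_path):
    """Download the record and save it to out_path (needs network access)."""
    with urlopen(build_efetch_url(accession)) as response:
        text = response.read().decode("utf-8")
    if not text.startswith(">"):
        raise ValueError("response does not look like a FASTA record")
    with open(out_path, "w") as handle:
        handle.write(text)
    return out_path

# Example call (hypothetical output filename):
# fetch_fasta("HE601748.1", "HE601748.fasta")
```

Saving under a short name based on the accession covers the "rename?" step, since the later tools all take this file as input.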

BLAST is low-hanging fruit…

Use BLAST to quickly survey for similar sequences
Megablast against the nucleotide database
  e.g. HE601748 is closest to A. thaliana chr. 5?
Megablast against the reference RNA sequence database

Repeat Masking

Upload the clone file to RepeatMasker on the web and run with appropriate parameters:
  http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker
Save the results (including the masked sequence) to your computer
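
Once the masked sequence is saved, a quick sanity check is to see how much of the clone was masked. The sketch below assumes the masked sequence is a FASTA file in which repeats appear either as N/X (hard masking) or as lowercase letters (soft masking). The function names and the filename in the usage comment are hypothetical. Any N already present in the original sequence is also counted as masked.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def masked_fraction(sequence):
    """Fraction of positions that are N/X or lowercase (treated as masked)."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base in "NXnx" or base.islower())
    return masked / len(sequence)

# Example use with a hypothetical filename:
# with open("HE601748.fasta.masked") as handle:
#     for header, seq in read_fasta(handle):
#         print(header, f"{100 * masked_fraction(seq):.1f}% masked")
```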

ab initio Gene Predictions

Genscan:
  http://genes.mit.edu/GENSCAN.html
  Cut and paste results as text to a file
Fgenesh:
  www.softberry.com

Blast2GO
http://www.blast2go.com

Annotation workbench, via Gene Ontology (GO) terms.
First, save the predicted peptides (e.g. from Fgenesh)
  Need to fix the FASTA headers to assign proper identifiers (could write a script?)
Start the Blast2GO workbench (Java Web Start)
Load in the peptides
Do the analysis… e.g. run blastp, GO, annotation, InterPro, etc.
See www.geneontology.org for details on GO
See http://www.ebi.ac.uk/interpro/ for InterPro info
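
The slide above suggests a script for fixing the FASTA headers. One possible sketch follows. It replaces each header with a sequential identifier and keeps the original header text as the description. The identifier scheme (`<prefix>_NNN`), the function name, and the filenames in the usage comment are assumptions chosen for illustration, not a format required by Blast2GO or produced by Fgenesh.

```python
def rename_fasta_headers(lines, prefix):
    """Return FASTA lines with each header rewritten as '>prefix_NNN old header'."""
    renamed = []
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            original = line[1:].strip()
            # Keep the old header as a description after the new identifier.
            renamed.append(f">{prefix}_{count:03d} {original}".rstrip())
        elif line.strip():
            renamed.append(line)  # sequence line, kept unchanged
    return renamed

# Example use with hypothetical filenames:
# with open("peptides.fasta") as src:
#     fixed = rename_fasta_headers(src, "HE601748_pep")
# with open("peptides.renamed.fasta", "w") as dst:
#     dst.write("\n".join(fixed) + "\n")
```

Short identifiers with no spaces make it easier to match the annotation results back to the predicted genes later.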

EMBOSS

European Molecular Biology Open Software Suite (EMBOSS):
  http://emboss.sourceforge.net
Download and install the version of interest (e.g. Linux, Mac OSX, Windows…)
Decide what to do:
  http://emboss.sourceforge.net/apps/groups.html
Let's try a CpG island plot (cpgplot)
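
To see what a CpG island plot measures, the sketch below computes two standard quantities for a sliding window: the GC percentage and the observed/expected CpG ratio, (number of CpG × window length) / (number of C × number of G). This is a simplified illustration and not cpgplot's own implementation. The window size and thresholds are assumed values for the example, and the function names are hypothetical.

```python
def cpg_stats(window):
    """Return (GC percent, observed/expected CpG ratio) for one window."""
    seq = window.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides
    gc_percent = 100.0 * (c + g) / n
    # Expected CpG count under independence is (c * g) / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_percent, obs_exp


def candidate_windows(sequence, window=100, min_oe=0.6, min_gc=50.0):
    """Yield (start, GC percent, obs/exp) for windows above both thresholds."""
    for start in range(0, len(sequence) - window + 1):
        gc, oe = cpg_stats(sequence[start:start + window])
        if gc > min_gc and oe > min_oe:
            yield start, gc, oe
```

Runs of consecutive passing windows mark candidate CpG islands, which can then be compared with the regions reported by cpgplot.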

Study Genes by Comparative Genomics

JGI VISTA toolkit:
  http://genome.lbl.gov/vista
  GenomeVISTA
  rVISTA