Spring 2017 Phage Lab - VCU Wiki

Post on 25-Apr-2022


Page 1: Spring 2017 Phage Lab - VCU Wiki

Spring 2017 Phage Lab

Computer lab availability is here.

Your work

Update your resumes!

Here is a link to a folder with presentations: our three posters, the three abstracts used to submit the posters for the undergraduate research festival, and the oral presentation from William & Mary. Please include these presentations in your resumes as appropriate to the author list.

I think it is a good idea for students to list GenBank submissions in your resume. You can always list it under your presentations/publications, with a note: "*this is a submission of an annotated bacteriophage genome to a public database, GenBank. My role was to annotate a section (or X genes) in this genome." I will update the table below with the phages' GenBank accession numbers once they are received later this week.

This week we will write Genome Announcements to describe our phages. These are real publications with each of you as a coauthor! You should put a note on the item in your resume with "I am included as a co-author as a member of the 2016-2017 VCU Phage Hunters. My role was to annotate a section (or X genes) of the bacteriophage Y". You can also specify what role you played in writing the GA. Students who discover and characterize the bacteriophage will be listed individually.

Example Genome Announcements for Bacillus phages DirtyBetty and Kida from VCU, Bacillus Phage Belinda from JMU, Bacillus Phage SalinJah from UMBC, and Bacillus Phage vB_BceS-MY192 from a non SEA PHAGES group.

Each journal has its own rules for formatting and content for article submissions. Instructions to authors for Genome Announcements are linked here.

Featuring.... 

| | Section 1 | Section 1 | Section 2 | Section 2 |
| --- | --- | --- | --- | --- |
| Fasta file | Zainny | Janet | OTooleKemple52 | AaronPhadgers |
| GenBank accession number | MF288920.1 | MF288922.1 | MF288921.1 | MF288919.1 |
| Discovered by | Zainab Gbadamosi | Brenna Kent | Thomas Raymond | Rahul Warrier |
| Phage morphology | myoviridae | myoviridae | myoviridae | myoviridae |
| Genome length (bp) | 162692 | 160705 | 161807 | 161772 |
| # ORFs | 303 | 285 | 291 | 301 |
| # tRNAs | 0 | 7 | 7 | 3 |
| % GC | 38.7 | 38 | 37.9 | 38.7 |
| DNAMaster file | Zainny_aunoannotated.dnam5 | Janet_autoannotated.dnam5 | OTooleKemple52_autoannotated.dnam5 | AaronPhadgers_autoannotated.dnam5 |
| Merged DNAMaster file | Zainny_final.dnam5 | Janet_merged.dnam5, Janet_final.dnam5 | OTK52_merged, OTK52_final.dnam5 | AaronPhadgers_final.dnam5 |
| Genbank file for journal club | | Janet Genbank file | OTooleKemple52 Genbank file | |
| Genemark coding potential | Zainny coding potential | Janet coding potential | OTooleKemple52 coding potential | AaronPhadgers coding potential |
| Six frame translation | Zainny 6 frame | Janet 6 frame | OTooleKemple52 6 frame | AaronPhadgers six frame |
| GenBank submission file | Genbank file, Submission file | Genbank file, Submission file | Genbank file, Submission file | Genbank file, Submission file |
| Genbank author list | Zainny annotators | Janet annotators | OKT annotators | AaronPhadgers annotators |
| GA draft | Zainny GA | Janet GA | | |
| Plate pic | | | | |
| TEM image | | | | |

VCU phages are archived here.

Co-authors for Genbank submissions are here.

Resetting Phamerator

If ever you need to reset settings in Phamerator, go to Edit > Preferences and type the following:

Server: http://phamerator.csm.jmu.edu/sea

Database: Bacillus_Draft

After forcing a database update, you will be able to access the phage database (you may need to restart Phamerator).

CACAO GO Annotation

Phage Hunters GO annotation guide

GO annotation training slides from TAMU

Allison's slides for March 22nd introduction to GO annotation, through slide 17

Form your teams by 3/17 using this link. For now, everyone should sign up for a team.

Week of March 20th- this week we will prepare for you to submit your first annotations starting on April 3rd, but by preparing, you'll also be ready to submit challenges next week.

- choose a favorite endolysin (section 1)/capsid (section 3)/holin (section 2) pham from the Bacillus phage collection in Phamerator.

- identify a published homolog to your favorite endolysin/capsid/holin pham using Blastp and HHPred. Record on your wiki a screenshot(s) of the match as well as the probability and e-value of the match. For this work, your statistics must be a maximum e-value of 10^-7 for Blastp and a minimum probability of 0.9 for HHPred. Manual inspection of the alignments is required to ensure you have at least 75% coverage and 30% identity between the two sequences. Note for endolysin: you probably need to do this after splitting your sequence into the catalytic and cell-wall binding domains.

- From HHPred, save the paper describing the published homolog. Scour that paper to identify the experiment in the paper that supports the functional annotation of that protein. From HHPred, your papers are likely to be crystal structures of proteins, so support for functional annotation may come from a prior study showing a wet lab characterization of that protein's activity. You'll have to learn to trace through a paper's references to find an appropriate source. On your wiki, document the experiment you wish to use. Grab a screenshot of the figure or table, and the methods used to produce that data. Write a description in your own words of how the figure/table "proves" the activity of the protein.
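If you want to double-check your recorded statistics against the cutoffs listed above, a small script can help. This is only a sketch: the `Hit` class, its field names, and `passes_cutoffs` are made up for illustration, and the numbers have to be typed in by hand from your Blastp or HHPred results. It does not replace manual inspection of the alignment.

```python
from dataclasses import dataclass

# Cutoffs quoted in the instructions above.
MAX_BLASTP_EVALUE = 1e-7
MIN_HHPRED_PROBABILITY = 0.9
MIN_COVERAGE_PCT = 75.0
MIN_IDENTITY_PCT = 30.0


@dataclass
class Hit:
    """One homolog match, entered by hand (hypothetical structure)."""
    tool: str            # "blastp" or "hhpred"
    evalue: float        # e-value reported by the tool
    probability: float   # HHPred probability as a fraction (0-1); unused for Blastp
    coverage_pct: float  # percent of your sequence covered by the alignment
    identity_pct: float  # percent identity in the alignment


def passes_cutoffs(hit: Hit) -> bool:
    """Return True if the hit meets the statistical cutoffs for this exercise."""
    # Coverage and identity are checked for both tools, as the text asks.
    if hit.coverage_pct < MIN_COVERAGE_PCT or hit.identity_pct < MIN_IDENTITY_PCT:
        return False
    if hit.tool == "blastp":
        return hit.evalue <= MAX_BLASTP_EVALUE
    if hit.tool == "hhpred":
        return hit.probability >= MIN_HHPRED_PROBABILITY
    raise ValueError(f"unknown tool: {hit.tool!r}")
```

For example, `passes_cutoffs(Hit("blastp", 1e-20, 0.0, 90.0, 45.0))` is accepted, while the same hit with an e-value of 1e-5 is not.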

Week of March 27th- this week you will be able to submit your first challenges to GO annotations

Week of April 3rd- this week you will submit your first GO annotations!

Week of April 17th- annotation week. You can do transfer annotations to connect your protein of interest to an experimentally verified protein!

SEA PHAGES GO annotation term for transfer annotation

You can use this code for transfer annotations where you choose a homology-based evidence code. An extensive list of all the reference GO codes is here.

go_ref_id: GO_REF:0000100
title: Gene Ontology annotation by SEA-PHAGE biocurators
authors: Ivan Erill, SEA-PHAGE biocurators
year: 2014
abstract: This GO reference describes the criteria used by biocurators of the SEA-PHAGE consortium for the annotation of predicted gene products from newly sequenced bacteriophage genomes in the SEA-PHAGE phagesdb.org and other databases and in the GenBank records periodically released to NCBI for these genomes. In particular, this GO reference describes the criteria used to assign evidence codes ISS, ISA, ISO, ISM, IGC and ND. To assign ISS, ISA, ISO and ISM evidence codes, SEA-PHAGE biocurators use a varied array of bioinformatics tools to establish homology and conservation of sequence and structure functional determinants with proteins from multiple organisms with published association to experimental GO terms and lacking NOT qualifiers. These proteins are referenced in the WITH field of the annotation using their xref database accession. The primary tools for homology search in ISS, ISA, ISO and ISM assignments are BLASTP and HHpred, using a maximum e-value of 10^-7 for BLASTP and a minimum probability of 0.9 for HHpred, and manual inspection of alignments in both cases. For ISS and ISA assignments, BLASTP alignments are required to have at least 75% coverage and 30% identity. For ISO assignments, orthology is further validated using reciprocal BLASTP with the identified hit. For HHpred results, ISS or ISM annotations are made only if the source for the original GO annotation explicitly defines a matched domain function, or if more than half of the domains of the query protein are identified in the matching protein. All ISS, ISA, ISO and ISM assignments entail the manual verification of the source for the GO term in the matching protein sequence and critical curator assessment of the likelihood of preservation of function, process or component in the context of bacteriophage biology. IGC codes are assigned on the basis of suggestive evidence for function based on synteny, as inferred from whole-genome comparative analyses of multiple bacteriophage genomes using primarily the Phamerator software platform, and with special emphasis on the bacteriophage virion structure and assembly genes. When extensive review of published literature on putative homologs reveals no experimental evidence of function, component or process for a particular gene product, it is assigned an ND evidence code and annotated to the root term for Cellular Component, Molecular Function and Biological Process. As part of the review process for assignment of ISS, ISA, ISO, IGC and ISM evidence codes, SEA-PHAGE curators are required to analyze the reference literature for identified matches and shall perform GO annotations with appropriate evidence codes if these were not available.
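The reciprocal BLASTP check mentioned for ISO assignments can be pictured with a short sketch. It assumes you have already run both searches and copied the hits into simple (identifier, e-value) pairs; the function names and the identifiers in the usage example are hypothetical, and this is one common way to read "reciprocal BLASTP" rather than the exact procedure the biocurators follow.

```python
from typing import Mapping, Optional, Sequence, Tuple

HitRow = Tuple[str, float]  # (subject identifier, e-value)


def best_hit(hits: Sequence[HitRow]) -> Optional[str]:
    """Return the identifier with the lowest e-value, or None if there are no hits."""
    if not hits:
        return None
    return min(hits, key=lambda row: row[1])[0]


def is_reciprocal_best_hit(
    query_id: str,
    forward_hits: Sequence[HitRow],
    reverse_hits_by_query: Mapping[str, Sequence[HitRow]],
) -> bool:
    """Check that the top forward hit, searched back, returns the original query.

    forward_hits: hits from searching with your phage protein.
    reverse_hits_by_query: hits from searching with each candidate homolog
    against the database that contains your phage protein.
    """
    top = best_hit(forward_hits)
    if top is None:
        return False
    return best_hit(reverse_hits_by_query.get(top, ())) == query_id
```

Usage, with made-up identifiers: if searching with `phage_gp40` returns `homolog_A` as the best hit, and searching with `homolog_A` returns `phage_gp40` as its best hit, the function returns `True`.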

Some Guidelines and Links for GO Annotation:

1. Identify a protein of interest: For a standard annotation, the protein must have experimental evidence for function in a primary literature article. For a transfer annotation of a protein from one of our phage proteins to a close homolog, the homolog protein must have an existing GO annotation with experimental evidence for function in a primary literature citation.

2. Select an evidence code: You will have to satisfy one of the approved evidence codes based on the type of data documenting the protein's function.

Here's a list of the evidence codes that CACAO students may use:

- IDA: Inferred from Direct Assay
- IMP: Inferred from Mutant Phenotype
- IGI: Inferred from Genetic Interaction - requires with/from field to be filled in
- ISS: Inferred from Sequence or Structural Similarity - almost always requires with/from field to be filled in
- ISO: Inferred from Sequence Orthology - requires with/from field to be filled in
- ISA: Inferred from Sequence Alignment - requires with/from field to be filled in
- ISM: Inferred from Sequence Model - requires with/from field to be filled in
- IGC: Inferred from Genomic Context

ALL other codes, even if used correctly, will cause the annotation to be rejected by the judges.

***Here is a full description of what all those codes mean.
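The evidence code rules above can be written down as a quick self-check before you submit. This is an illustrative sketch only: `check_annotation` is a made-up helper, not part of CACAO or GONUTS, and it treats ISS as always needing the with/from field even though the list says "almost always".

```python
# Evidence codes CACAO students may use, mapped to
# (meaning, whether the with/from field must be filled in).
ALLOWED_EVIDENCE_CODES = {
    "IDA": ("Inferred from Direct Assay", False),
    "IMP": ("Inferred from Mutant Phenotype", False),
    "IGI": ("Inferred from Genetic Interaction", True),
    "ISS": ("Inferred from Sequence or Structural Similarity", True),  # simplification
    "ISO": ("Inferred from Sequence Orthology", True),
    "ISA": ("Inferred from Sequence Alignment", True),
    "ISM": ("Inferred from Sequence Model", True),
    "IGC": ("Inferred from Genomic Context", False),
}


def check_annotation(code: str, with_from: str = "") -> list:
    """Return a list of problems with an evidence code choice (empty if none)."""
    problems = []
    if code not in ALLOWED_EVIDENCE_CODES:
        problems.append(f"{code} is not an approved code and will be rejected")
        return problems
    _meaning, needs_with_from = ALLOWED_EVIDENCE_CODES[code]
    if needs_with_from and not with_from.strip():
        problems.append(f"{code} requires the with/from field to be filled in")
    return problems
```

For example, `check_annotation("ISO")` reports the missing with/from field, while `check_annotation("IDA")` returns an empty list.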

3. Select a GO term: You will have to pick the appropriate GO term, which can only be as specific as the data shows.

First, think about whether your protein's evidence points to Cellular Component, Biological Process or Molecular Function.

***Here is a full description of CC/BP/MF.

Second, check out existing GO term annotations in Uniprot, Amigo and QuickGO (make sure to record your Uniprot identifier and confirm you have the right protein!).

Third, check out existing GO term annotations on GONuts wiki. You can re-use a GO term to offer additional support. You can't submit a second annotation using the same evidence.

Fourth, re-examine your evidence and select the appropriate GO term. 

4. Write your notes: Clearly and carefully document the evidence that is displayed in a figure or table for your protein. Write this evidence in your own words, without copy/pasting from the article.

Here's a paper describing GONUTS wiki.

 

 

Section 1 Endolysin functional annotation

TsarBomba gp40 pham 168

Anthos gp54 pham 182

Nigalana gp74 pham 3097

Vinny gp63 pham 4563

Claudi gp39 pham 4564

Taylor gp31 pham 137

Pegasus gp108 pham 1683

Harambe (I'm guessing on this one) pham 1759

Update: AJ posted the combined files here for Endolysin Catalytic domain and Endolysin cell wall binding domain. Use these for your journal club phylogeny!

This file contains the full length endolysin sequences plus three structural homologs (to Nigalana, Anthos and TsarBomba groups).

March 27th- challenge

1. Can you figure out what is wrong with this endolysin(?) annotation from last year? Evaluate the suitability of the GO ID, the Reference, the Evidence Code, and the evidence described by the student in the Notes. Post this evaluation to your wiki.

2. Can you figure out what is wrong with this endolysin annotation from last year? Note that the annotation for "GO:0009253" was rejected, and the annotation for "GO:0051672" was accepted by the TAMU folks making a minor change. Evaluate the suitability of the GO ID, the Reference, the Evidence Code, and the evidence described by the student in the Notes. Post this evaluation to your wiki.

 

Section 2 Holin

Sequence file for holin pham: holin_pham634

March 29th- challenge

Section 3 Capsid

Sequence file for capsid phams:

CapsidPham58 (myoviruses)

CapsidPham65 (podoviruses)

March 29th- challenge

1. Can you figure out what is wrong with this capsid annotation from last year? Evaluate the suitability of the GO ID, the Reference, the Evidence Code, and the evidence described by the student in the Notes. Post this evaluation to your wiki.

2. Can you figure out what is wrong with this capsid annotation from last year? Look for the row with the white background in the table, under the automatically generated annotations with a green background. Evaluate the suitability of the GO ID, the Reference, the Evidence Code, and the evidence described by the student in the Notes. Post this evaluation to your wiki.

Comparative Genomics tools

You can use these tools to create data, but bioinformatics isn't just button clicking. Make sure you know what you are comparing, why you are doing that comparison, and work hard to understand what the data means.

Phamerator: Comparative genomics tool in the SEA Virtual Machine in the computer lab. You can use this tool to generate map images, explore protein phamilies, and protein domains (this function seems to not be working at the moment).

Gepard dot plot tool: This tool takes a fasta file as input for both the x and y axis. You can upload the same or different files depending on the comparison you want to make. Requires your Java to be up-to-date. Default word size is 10bp, so the tool places a dot for each 10bp that is identical between two sequences in a pairwise alignment. Note that for the journal club you can use blastn, but you are also welcome to try this tool to produce a prettier dotplot.

Clustal omega: Quick multiple sequence alignment and phylogenetic tree tool. This tool takes a multi-fasta file as input. Requires your Java to be up-to-date.

Splitstree: This tool takes a modified phages vs. phams table (which is then converted to a nexus file) as input and generates an unrooted phylogenetic tree based on shared protein content. The splits in the tree indicate the extent of shared content. A nexus file you can use as input based on our current Phamerator database is on Blackboard in Course Documents because the wiki wouldn't let me upload.
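To see what a dot plot is actually comparing, here is a minimal sketch of the word-matching idea described for the Gepard tool. It is not Gepard's own algorithm (the function name `dot_positions` is made up, and only the forward strand is compared); it simply records every position where a word of the chosen size is identical in the two sequences.

```python
def dot_positions(seq_x: str, seq_y: str, word_size: int = 10) -> list:
    """Return (x, y) start positions where a word of word_size is identical.

    Each pair would be drawn as one dot on the plot. Forward strand only.
    """
    seq_x = seq_x.upper()
    seq_y = seq_y.upper()

    # Index every word in seq_y by its start position.
    index = {}
    for j in range(len(seq_y) - word_size + 1):
        index.setdefault(seq_y[j:j + word_size], []).append(j)

    # Look up every word of seq_x in that index.
    dots = []
    for i in range(len(seq_x) - word_size + 1):
        for j in index.get(seq_x[i:i + word_size], ()):
            dots.append((i, j))
    return dots
```

Comparing a sequence to itself gives an unbroken diagonal; breaks or shifted diagonals between two different genomes point to regions that differ or have moved, which is what you should be interpreting in the figure.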

Posters for VCU Undergraduate Research Festival

Organized by the Undergraduate Research Opportunities Program (UROP) and part of VCU Student Research Weeks, the annual VCU Poster Symposium for Undergraduate Research and Creativity is a wonderful opportunity for students to present their research endeavors and creative scholarship to their academic peers, members of the VCU faculty, community members, and friends and family. All undergrads from every discipline are encouraged to present and attend. Presentations may be for completed research projects, completed papers, or research in progress.

Projects involving creative work such as prose or poetry, performances, and artwork will be considered for acceptance if they are part of a scholarly project undertaken by the student. We are currently accepting poster abstracts up until the deadline of March 22nd, 2017. All abstracts should be submitted to http://go.vcu.edu/uroppostersubmit

After students are notified of their acceptance, we will accept electronic file submission of their posters. Note: We hold poster workshops Jan. – Mar. and we are now able to print research posters free of cost to our students! A schedule for upcoming poster workshops will be posted on the UROP Blog next week.

Abstracts should include: Name/Major of student, Name/Dept. of Faculty Mentor, Title of research Project, Brief description of research project. All inquiries to [email protected]

Research Weeks will take place throughout the month of April, and the symposium will take place on April 19th. We would also be happy to add your event to the week(s) and assist in publicizing it.

Abstracts are due March 22nd!

1. Yeast 2 Hybrid: Rashmi can submit abstract. The draft poster is here. We need one or two people to finalize the poster for printing.

2. Host range: Emaan can submit abstract. The draft poster is here. We need two people to finalize the poster for printing.

3. Spring bioinformatics poster!! We will talk about what we want to do and who can contribute pieces.

For February 15th week:

For annotating functional information, let's look at OTooleKemple52 terminase protein.

Blastp conserved domains- A conserved domain is a distinct functional (and often structural) unit of a protein. A sub-tool in Blastp queries a conserved domains database [consisting of pre-calculated position specific score matrices made from multiple sequence alignments] to identify conserved domains in protein sequences. Please report the protein function name and the E value. The source is blastp conserved domains. Hit must have E-value of 10^-5 or lower. If you want to know more, this page describes the tool and how to read the information.

HHPred- This tool compares your amino acid sequence to a structural database to predict function. The algorithm (hidden Markov model) uses a profile-sequence comparison to find the best match to your query. The profile of each amino acid position includes information about residue conservation and preferred amino acids ("how important each position is for defining other members of the protein family"). The database behind HHPred is a protein structural database. Structure leads to function and structure is more conserved than sequence. Please report the protein function name, the probability and the E value. The source is HHPred. Hit must have probability >80% and an E-value of 10^-5 or lower. If you want to know more, this page describes HHPred.

Phamerator- This is a mapping tool that visually shows each protein in a phage genome as well as blastp conserved domains. It is only available through a Linux virtual machine installed on the computers in Harris 3112. Please report the protein function name. The source is Phamerator.
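The reporting rules for the three sources above can be summarized in a few lines of code. This is a sketch for checking your own notes, with a made-up function name (`is_reportable`); note that these functional annotation cutoffs (E-value of 10^-5, HHPred probability above 80%) are looser than the ones used for GO annotation elsewhere on this page.

```python
MAX_EVALUE = 1e-5                  # cutoff for conserved domains and HHPred
MIN_HHPRED_PROBABILITY_PCT = 80.0  # HHPred probability must be above this


def is_reportable(source: str, evalue: float = None, probability_pct: float = None) -> bool:
    """Return True if a hit from the given source meets the cutoff for reporting."""
    if source == "blastp conserved domains":
        return evalue is not None and evalue <= MAX_EVALUE
    if source == "HHPred":
        return (
            evalue is not None
            and probability_pct is not None
            and evalue <= MAX_EVALUE
            and probability_pct > MIN_HHPRED_PROBABILITY_PCT
        )
    if source == "Phamerator":
        # No statistic is requested for Phamerator; only the function name is reported.
        return True
    raise ValueError(f"unknown source: {source!r}")
```

For example, an HHPred hit with probability 99.1% and E-value 1e-6 is reportable, but one at exactly 80% is not.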

For February 8 week labs:

This excel file has your annotation ranges. You have a month to complete the annotation of up to 80 proteins, as I'd like these submitted by spring break. This is a lot! Note I used four people from section 1 to help annotate section 3 phages because of the difference in the number of students. Please use your lab time efficiently and eliminate distraction so you can get as much work done in lab as possible. You're welcome to bring headphones and tune out your classmates.

For February 1st week labs: 

Slides about blast and coding potential data, and how to get started with annotation.

Sequence file for OTooleKemple52 protein 1.

Sequence file for PPIsBest protein 1.

Sequence file for Janet protein 1.

DNAMaster Annotation Guide is on blackboard! See page 64 for guiding principles of annotation.

Here are three slides on how to make a wiki page! I inserted these slides into my wiki page by clicking on the link button above.