CACAO - Penn State Gene Function and Gene Ontology January 2011 http://gowiki.tamu.edu/wiki/index.php/Category:Penn_State_CACAO

Post on 15-Jan-2016

Page 1: CACAO - Penn State Gene Function and Gene Ontology January 2011 Penn_State_CACAO

CACAO - Penn State

Gene Function and Gene Ontology

January 2011

http://gowiki.tamu.edu/wiki/index.php/Category:Penn_State_CACAO

What is an annotation?

• Dictionary.com: a critical or explanatory note or body of notes added to a text

• For us, it is adding biologically relevant information to a protein record

Function annotation

• Allows us to

– Infer the functions of genes

• Related by common descent

• Related by similar expression patterns

• Related by phylogenetic profiles

• ...

Function annotation

• Allows us to

– Understand the capabilities of organisms’ genomes

– Understand patterns of gene expression

• In different environments

• In different tissues

• In disease states

– ...

Classic MODel

Literature

Datasets

Curators (rate limiting)

Database

Requirements

• Accurate functional annotation for as many genes as possible

• A system of assigning function that allows both humans and computers to compare, contrast, analyze, and predict gene function

• Curators to make and/or check these assignments

– For CACAO, we will teach you what biocurators do.

What’s in it for you (besides credit)?

– We hope you will

• learn how we think about gene function

• gain skills that will help your future career

• enjoy contributing to a resource used by people all over the world

• have fun!

CACAO

• Community

• Assessment

• Community

• Annotation with

• Ontologies

– How well can you (with our coaching) assign gene functions using GO?

GO = Gene Ontology

• Controlled vocabulary

– Everyone uses the same terms

– Terms have IDs that computers can understand

• Relationships between functions

GO

• 3 aspects (ontologies) for gene products

1. Biological Process

2. Molecular Function

3. Cellular Component

• Used to make annotations

– aka Gene associations

– Term + qualifiers + evidence code + reference etc.

Molecular Function

• activities or “jobs” of a gene product

glucose-6-phosphate isomerase activity

Figure from GO Consortium presentations

Biological Process

• a commonly recognized series of events

cell division

Figure from Nature Reviews Microbiology 6, 28-40 (January 2008)

Cellular Component

• where a gene product acts

Key elements of a GO annotation

Submitted to GO consortium

Viewable on GONUTS

**Don’t worry - I will cover this again (several times)!

GO Annotation

• To make an annotation, you need to

– Assign GO terms to genes (gene products)

• At appropriate level of specificity

• Sometimes with Qualifiers

– NOT

– Contributes_to

– Colocalizes_with

– Record the evidence

Record the evidence

• Where it came from:

– Reference (database accession)

• PMID:6987663

• Kind of evidence:

– Evidence codes

• IMP: Inferred from Mutant Phenotype

• IDA: Inferred from Direct Assay

• …
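The parts of an annotation listed so far (GO term, optional qualifier, reference, evidence code) can be pictured as one record. The sketch below is only an illustration: the class name, field names, and the example accession are assumptions, not the format GONUTS or the GO consortium uses, and only the two evidence codes spelled out on the slide are included.

```python
from dataclasses import dataclass
from typing import Optional

# Only the evidence codes named on the slide; the full list is longer.
EVIDENCE_CODES = {
    "IMP": "Inferred from Mutant Phenotype",
    "IDA": "Inferred from Direct Assay",
}

# Qualifiers as written on the "GO Annotation" slide.
QUALIFIERS = {"NOT", "Contributes_to", "Colocalizes_with"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one GO annotation (names are illustrative)."""

    gene_product: str                # e.g. a UniProt accession
    go_id: str                       # the GO term ID
    reference: str                   # e.g. "PMID:6987663"
    evidence_code: str               # a key of EVIDENCE_CODES
    qualifier: Optional[str] = None  # most annotations have none

    def __post_init__(self):
        # Reject codes and qualifiers that are not in the lists above.
        if self.evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence_code}")
        if self.qualifier is not None and self.qualifier not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier}")


# Placeholder values, chosen only to show the shape of a record.
example = Annotation(
    gene_product="P012A9",
    go_id="GO:0004713",
    reference="PMID:6987663",
    evidence_code="IDA",
)
print(example)
```

Keeping the evidence code and qualifier as checked values, rather than free text, mirrors the controlled-vocabulary idea behind GO itself.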

CACAO - the “Community Annotation” part

What I am going to tell you about next is:

1. How to choose proteins to annotate

2. Finding GO terms & navigating a GO term page

3. Finding UniProt accessions

4. Making gene pages on GONUTS & the anatomy of a gene page

5. How and where to add an annotation

6. Where to look for your annotations & other teams’ annotations … (& the challenges!)

http://gowiki.tamu.edu/wiki/index.php/Category:Penn_State_CACAO

GONUTS

• Community-editable database

• GO terms

• Place to annotate

• “GoPageMaker”

– Makes gene pages with minimal info required

– “Annotation” table

• Editable (by YOU!!!)

• Pulls information/annotations for proteins from a non-editable database called UniProt

http://gowiki.tamu.edu

Deciding what to annotate

1. randomly

2. topics of interest (e.g. efflux pump proteins, biofilms)

3. papers you have come across while doing other stuff

4. methods you know or want to learn

5. phenotypes and mutants you are interested in

6. by author

7. by pathway or regulon

8. suggested by another (e.g. high IEA:manual annotation ratio)

9. current paper mentions another gene product

10. review papers (e.g. Annual Reviews are excellent sources)

EXAMPLE #1: let’s say you have a great paper (PMID:1111) that characterizes the tyrosine kinase activity of your favorite protein (human p53)…

Finding genes/proteins to annotate

• UniProt - http://www.uniprot.org

• Textbook or class notes

• Wikipedia or Google

• Paper you are reading that mentions another gene

– Review articles

• WikiPathways - http://www.wikipathways.org

• PubMed - http://pubmed.org

• Ask a coach (usually me)

• GONUTS

– what proteins have other teams been annotating?!

Finding papers

• PubMed - http://pubmed.org

• Google Scholar

• References in your textbook

• Wikipedia & Google

• Papers given out in another class

• Papers from a lab you are interested in

– Undergrad research work?

** only original research papers ** - no review articles, no textbooks, no books, no class notes

Key elements of a GO annotation

Submitted to GO consortium

Viewable on GONUTS

Part I: Where do you search for GO terms? GONUTS

http://gowiki.tamu.edu

• CHICK - AgBase (Gallus gallus)

• dictyBase - dictyBase (Dictyostelium discoideum - slime mold)

• FB - FlyBase (Drosophila melanogaster)

• HUMAN - Reactome, BHF-UCL

• MGI - Mouse Genome Informatics (Mus musculus - house mouse)

• SGD - Saccharomyces Genome Database (Saccharomyces cerevisiae - yeast)

• TAIR - The Arabidopsis Information Resource (Arabidopsis thaliana)

• WB - WormBase (Caenorhabditis elegans)

• ZFIN - Zebrafish model organism database (Danio rerio)

What do you actually need once you have found the correct term?

GO:0004713
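A GO ID is written as "GO:" followed by seven digits, so a quick shape check can catch copy-and-paste slips before you enter one. This is a minimal sketch: the function name is made up for illustration, and it only checks the format, not whether the term actually exists in GO.

```python
import re

# "GO:" followed by exactly seven digits, e.g. GO:0004713.
GO_ID_PATTERN = re.compile(r"GO:[0-9]{7}")


def is_go_id(text: str) -> bool:
    """Return True if text has the shape of a GO identifier."""
    return GO_ID_PATTERN.fullmatch(text) is not None


print(is_go_id("GO:0004713"))  # True: the ID shown on the slide
print(is_go_id("GO:4713"))     # False: leading zeros were dropped
```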

Part II: You now have a paper, a protein & you found a suitable GO term… what next?

• UniProt accession - http://www.uniprot.org

- Search (“Query”) & find the correct UniProt accession for your protein

- Looks something like: P012A9
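To check that something you copied "looks like" an accession, a pattern match is enough. The sketch below follows the accession format described in UniProt's documentation (6 or 10 characters, uppercase letters and digits in fixed positions); the function name is an assumption for illustration, and a match does not prove the accession exists or belongs to your protein.

```python
import re

# Accession shape as described in UniProt's documentation:
# 6 characters starting with O, P or Q, or 6 or 10 characters starting
# with any other letter.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(text: str) -> bool:
    """Return True if text has the shape of a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text) is not None


print(looks_like_uniprot_accession("P012A9"))  # True: the example from the slide
print(looks_like_uniprot_accession("p012a9"))  # False: accessions are uppercase
```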

Part III: Where are you going to add your annotations? GONUTS

http://gowiki.tamu.edu

How do you make a new gene page in GONUTS?

• Use the UniProt accession to make a page that you will be able to add your own annotation to.

• GoPageMaker will:

1. Check if the page exists in GONUTS & take you there if it does.

2. Make a page & pull all of the annotations from UniProt into a table that you can edit.
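The two steps above amount to a "get or create" pattern. The sketch below illustrates that pattern only; it is not how GoPageMaker is actually implemented. The function name, the in-memory `pages` dictionary, and the `fetch_annotations` callback standing in for the UniProt lookup are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def get_or_make_gene_page(
    accession: str,
    pages: Dict[str, List[Row]],
    fetch_annotations: Callable[[str], List[Row]],
) -> List[Row]:
    """Return the annotation table for accession, creating it if needed."""
    # Step 1: if the page already exists, go straight to it.
    if accession in pages:
        return pages[accession]
    # Step 2: otherwise make the page and copy in the existing annotations,
    # so that later edits change the page and not the source data.
    pages[accession] = [dict(row) for row in fetch_annotations(accession)]
    return pages[accession]


# Stand-in for the UniProt lookup, returning one made-up row.
def stub_fetch(accession: str) -> List[Row]:
    return [{"go_id": "GO:0004713", "evidence_code": "IEA"}]


site: Dict[str, List[Row]] = {}
table = get_or_make_gene_page("P012A9", site, stub_fetch)
table.append({"go_id": "GO:0004713", "evidence_code": "IDA"})  # your new row
print(len(get_or_make_gene_page("P012A9", site, stub_fetch)))
```

The second call finds the existing page, so the row you added is still there and nothing is fetched again.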

Part IV: Where do you add an annotation? Add a row in the table.

Part V: What you must fill in (for every annotation)

GO:0004713

PMID:1111

IDA: Inferred from direct assay

Figure 2a
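Before saving a row, it helps to confirm that all four required pieces are filled in. A minimal sketch of such a check follows; the field names are made up for illustration and are not the column names used in the GONUTS table.

```python
from typing import Dict, List

# The four things every annotation must have, under hypothetical names.
REQUIRED_FIELDS = ("go_id", "reference", "evidence_code", "evidence_location")


def missing_fields(row: Dict[str, str]) -> List[str]:
    """List the required fields that are absent or blank in row."""
    return [name for name in REQUIRED_FIELDS if not (row.get(name) or "").strip()]


complete = {
    "go_id": "GO:0004713",
    "reference": "PMID:1111",
    "evidence_code": "IDA",
    "evidence_location": "Figure 2a",
}
print(missing_fields(complete))                   # []
print(missing_fields({"go_id": "GO:0004713"}))    # the other three names
```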

What you might also have to fill in

Not sure? Check the competition guidelines. Ask a coach (Jim, Debby, Adrienne or usually me)!

Where will your annotation now show up?

1. In the “Annotation” table on the gene page you just edited

2. In the table on your user page: http://gowiki.tamu.edu/wiki/index.php/User:Siebenmc

3. In the table on your team page: http://gowiki.tamu.edu/wiki/index.php/Category:Team_Mu_subunit_1

4. As points on the scoreboard: http://gowiki.tamu.edu/wiki/index.php/Category:CACAO_Spring_2011

5. If challenged, it will show up in the “Submitted Challenges” table (below the scoreboard)

CACAO is competitive

• Teams get points for complete annotations

– GO term (right level of specificity)

– reference

– evidence code

– identify where in the paper the evidence comes from

• Teams can take away points from competitors by challenging annotations

– finding a problem

– suggesting a better alternative

CACAO - the “Community Assessment” part

Scoreboard

Submitted Challenges

Closed Challenges

Moving through challenges

Category:Team UCL1
