Using the Gene Ontology: Gene Product Annotation
GO Project Goals
• Compile structured vocabularies describing aspects of molecular biology
• Describe gene products using vocabulary terms (annotation)
• Develop tools:
  • to query and modify the vocabularies and annotations
  • annotation tools for curators
GO Data
GO provides two bodies of data:
• Terms with definitions and cross-references
• Gene product annotations with supporting data
The Three Ontologies
• Molecular Function: elemental activity or task
  nuclease, DNA binding, transcription factor
• Biological Process: broad objective or goal
  mitosis, signal transduction, metabolism
• Cellular Component: location or complex
  nucleus, ribosome, origin recognition complex
GO Annotation
• Association between gene product and applicable GO terms
• Provided by member databases
• Made by manual or automated methods
DAG Structure
Annotate to any level within DAG
mitosis
S.c. NNF1
mitotic chromosome condensation
S.c. BRN1, D.m. barren
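The DAG slide can be illustrated with a small sketch. It assumes, as the slide layout suggests, that "mitotic chromosome condensation" sits below "mitosis", and that a query on a general term should also return gene products annotated to its more specific descendants. The names PARENTS, DIRECT, ancestors and genes_for are hypothetical, introduced only for this illustration.

```python
# Toy DAG fragment from the slide: child term -> list of parent terms.
# Assumption: the only edge is condensation -> mitosis; real GO terms
# may have several parents.
PARENTS = {
    "mitotic chromosome condensation": ["mitosis"],
    "mitosis": [],
}

# Direct annotations as shown on the slide.
DIRECT = {
    "S.c. NNF1": {"mitosis"},
    "S.c. BRN1": {"mitotic chromosome condensation"},
    "D.m. barren": {"mitotic chromosome condensation"},
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def genes_for(term, direct=DIRECT):
    """Gene products annotated to `term` directly or to a more specific term."""
    return sorted(
        gene
        for gene, terms in direct.items()
        if any(term == t or term in ancestors(t) for t in terms)
    )


print(genes_for("mitosis"))
print(genes_for("mitotic chromosome condensation"))
```

Querying the general term returns all three gene products, while the specific term returns only the two annotated at that level.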
GO Annotation: Data
• Database object: gene or gene product
• GO term ID
• Reference
  • publication or computational method
• Evidence supporting annotation
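The four data items above map naturally onto a small record type. This is a minimal sketch, not the format used by any member database: the class name Annotation, its field names, and the ID check are assumptions for illustration, and the GO ID in the example is a placeholder rather than a real term.

```python
import re
from dataclasses import dataclass

# GO term IDs have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")


@dataclass(frozen=True)
class Annotation:
    """One gene product annotation (hypothetical field names)."""
    db_object: str   # gene or gene product
    go_id: str       # GO term ID
    reference: str   # publication or computational method
    evidence: str    # evidence code supporting the annotation

    def __post_init__(self):
        # Reject IDs that do not look like GO term IDs.
        if not GO_ID_PATTERN.fullmatch(self.go_id):
            raise ValueError(f"not a GO term ID: {self.go_id!r}")


# Placeholder GO ID; a curator would use the real ID of the chosen term.
example = Annotation(
    db_object="YLR209C",
    go_id="GO:0000000",
    reference="Lecoq et al. (2001) J. Bacteriol. 183(16): 4910-4913",
    evidence="IDA",
)
print(example)
```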
GO Evidence Codes
From primary literature:
IDA - Inferred from Direct Assay
IMP - Inferred from Mutant Phenotype
IGI - Inferred from Genetic Interaction
IPI - Inferred from Physical Interaction
IEP - Inferred from Expression Pattern
From reviews or introductions:
TAS - Traceable Author Statement
NAS - Non-traceable Author Statement
Automated:
IEA - Inferred from Electronic Annotation
Other codes:
IC - Inferred by Curator
ISS - Inferred from Sequence or structural Similarity
ND - Not Determined
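A lookup table is one simple way to work with the evidence codes in software. The sketch below lists the eleven codes from the slide. The grouping in SOURCE_GROUPS is an assumed reading of the slide's three labels (primary literature, reviews or introductions, automated), and the function name source_of is hypothetical.

```python
# Evidence codes and their expansions, as listed on the slide.
EVIDENCE_CODES = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ISS": "Inferred from Sequence or structural Similarity",
    "IEA": "Inferred from Electronic Annotation",
    "ND": "Not Determined",
}

# Assumption: which codes each slide label applies to.
SOURCE_GROUPS = {
    "primary literature": {"IDA", "IMP", "IGI", "IPI", "IEP"},
    "reviews or introductions": {"TAS", "NAS"},
    "automated": {"IEA"},
}


def source_of(code):
    """Return the slide label for a code, or None if the slide gives none."""
    if code not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code}")
    for label, codes in SOURCE_GROUPS.items():
        if code in codes:
            return label
    return None


for code in ("IDA", "TAS", "IEA", "ISS"):
    print(code, "-", EVIDENCE_CODES[code], "-", source_of(code))
```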
GO Annotation: Methods
• Manual
• Automated
  • sequence similarity
  • transitive annotation
  • nomenclature, other text matching
Literature-Based Manual Annotation: Experimental Evidence Codes
Lecoq, K., et al. (2001) YLR209C encodes Saccharomyces cerevisiae purine nucleoside phosphorylase. J. Bacteriol. 183(16): 4910–4913.
Experiment 1 - Purification and enzyme assay
Purified His-tagged Ylr209cp; can convert various nucleoside substrates to bases + Pi; inosine and guanosine are substrates
FUNCTION: purine nucleoside phosphorylase (IDA)
Experiment 2 - Knockout of YLR209C
Null mutant excretes inosine and guanosine into medium (compounds in medium separated by chromatography and identified by HPLC separation profiles)
PROCESS: purine nucleoside catabolism (IMP)
This paper has no data for cellular component.
Automated Annotation: InterPro Example
Diagram nodes: YFP, InterPro entry, GO entry
InterPro2go links InterPro entries and GO terms
Run InterProScan to link YFP and InterPro entry
Infer GO term for YFP from the other two links
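The inference on this slide is a join of two mappings: protein to InterPro entry, and InterPro entry to GO term. The sketch below assumes both mappings are already available as plain dictionaries; it does not call InterProScan or read a real InterPro2go file. All entry IDs, GO IDs, and the function name infer_go_terms are made-up placeholders.

```python
# Hypothetical InterPro-entry -> GO-term mapping (stand-in for InterPro2go).
INTERPRO2GO = {
    "IPR_EXAMPLE_A": {"GO:9999998"},
    "IPR_EXAMPLE_B": {"GO:9999998", "GO:9999999"},
}

# Hypothetical scan results: protein -> InterPro entries it matched.
SCAN_HITS = {
    "YFP": {"IPR_EXAMPLE_A", "IPR_EXAMPLE_B", "IPR_EXAMPLE_UNMAPPED"},
}


def infer_go_terms(protein, hits=SCAN_HITS, mapping=INTERPRO2GO):
    """Combine the two links to infer GO terms for `protein`.

    Entries with no GO mapping contribute nothing. Each inferred term is
    paired with the IEA evidence code, since no curator reviewed it.
    """
    inferred = set()
    for entry in hits.get(protein, set()):
        inferred |= mapping.get(entry, set())
    return [(go_id, "IEA") for go_id in sorted(inferred)]


print(infer_go_terms("YFP"))
```

Terms reached through more than one InterPro entry are reported once, and a protein with no hits yields an empty list.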
GO Annotation: Contributors
• FlyBase
• WormBase
• Saccharomyces Genome Database
• DictyBase
• Mouse Genome Informatics
• Gramene
• The Arabidopsis Information Resource
• Compugen, Inc.
• Swiss-Prot/TrEMBL/InterPro
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• The Institute for Genomic Research
GO Annotation: Organisms
• Fruit fly (Drosophila melanogaster)
• Budding yeast (Saccharomyces cerevisiae)
• Fission yeast (Schizosaccharomyces pombe)
• Human (Homo sapiens)
• Mouse (Mus musculus)
• Rice (Oryza sativa)
• Rat (Rattus norvegicus)
• Tsetse fly (G. morsitans)
• Caenorhabditis elegans
• Arabidopsis thaliana
• Vibrio cholerae
• Dictyostelium discoideum
• FlyBase & Berkeley Drosophila Genome Project
• WormBase
• Saccharomyces Genome Database
• DictyBase
• Mouse Genome Informatics
• Gramene
• The Arabidopsis Information Resource
• Compugen, Inc.
• Swiss-Prot/TrEMBL/InterPro
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• Genome Knowledge Base (CSHL)
• The Institute for Genomic Research
www.geneontology.org
The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01). The Gene Ontology project thanks AstraZeneca for financial support. The Stanford group acknowledges a gift from Incyte Genomics.