Post on 30-Jan-2016

Page 1: CalbiCyc, Metabolic Pathways at the  Candida  Genome Database

CalbiCyc, Metabolic Pathways at the Candida Genome Database

Martha Arnaud

[email protected]

Outline

• Accessing data in the Candida Genome Database (CGD)

• Gene information in CGD: the Locus Summary page

• Biochemical pathways at CGD

• Pathway Prediction and Curation

• Our favorite PTools customizations / configuration options

Introduction to CGD

A resource for Candida albicans modeled on the Saccharomyces Genome Database (SGD)

CGD started in 2004

All CGD data are freely available to the public

Shares codebase, tools, and website organization with SGD

Manual curation of scientific literature

Pathways are just one of the types of data we provide, and they account for a modest fraction of our site usage

Accessing data in CGD

Quick Search:

The main entry point

Search CGD by gene name or keyword

Fields searched:

- Gene Names
- Gene Descriptions
- Gene Ontology terms, synonyms, IDs
- People (colleagues, authors)
- PubMed ID
- S. cerevisiae Ortholog or Best Hit
- Biochemical pathways

Additional tools for accessing data in CGD

Advanced Search:

Search for genes by properties

Full-text literature search

Uses Textpresso, developed by WormBase, Caltech

Sequence-related searches and tools

BLAST

Pattern Match

Restriction Map

Primers

Genome Browser (GMOD’s GBrowse)

Community Resources

Search for colleagues

Browse Candida Labs

Bulk Data Downloads

Browse list of downloadable files

Downloads directory on our web site

Gene information in CGD

CGD focuses on gene-based information

Basic gene information is found on the “Locus Summary Page” (LSP)

Quick Search is the easiest way to find the LSP for a gene

Locus Summary Page

LSP summarizes gene information

A “hub” that links out to more details

Gene names, aliases

Gene description

Mutant phenotypes

Gene Ontology

Chromosomal location

Sequence retrieval

Sequence analysis

Genome browser

Orthologs

Pathways

Biochemical Pathways at CGD http://pathway.candidagenome.org/

Search

Browse…List…

PTools in “web mode”

• Zoom

• Link out to SGD

• Curated pathway summary comments

• References

PTools Prediction of pathways for CGD

• PathoLogic pathway database construction: January - April 2007

• Pathways released on our public web site: March 2008

Curation of pathways at CGD

Two-part curation approach:

Step 1. Triage
– Literature searches, assemble citation list
– Decide to keep or delete pathway
– Kept 181, deleted 227, added 15

Step 2. More intensive curation
– Pathway modifications
– Pathway comments

Current statistics: 156 pathways
– 107 with second-stage curation complete
– 14 with triage and S. cerevisiae comments from SGD
– 35 with triage only (references, but no free-text description)
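
As a quick arithmetic check on the current statistics, the three curation categories should add up to the 156 pathways reported. The sketch below uses only the counts given on this slide; the dictionary labels and variable names are illustrative and not part of any CGD tool.

```python
# Counts taken from the "Current statistics" line of the slide.
# The labels are paraphrased; the variable names are hypothetical.
curation_status = {
    "second-stage curation complete": 107,
    "triage plus S. cerevisiae comments from SGD": 14,
    "triage only (references, no free-text description)": 35,
}

total = sum(curation_status.values())

# Report each category with its share of all curated pathways.
for label, count in curation_status.items():
    print(f"{label}: {count} of {total} ({count / total:.0%})")

print(f"Total pathways: {total}")
```

The total agrees with the 156 pathways stated on the slide.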

Curation of pathways, continued

Curation notes

Curation challenge: Steep learning curve for the curation tools.

The tools and the process are quite different from the usual gene-centric curation we do, so curators need to “switch gears” for pathway curation.

We found that it was easier to make progress by making a focused “project” out of pathway curation.

Our favorite configurable PTools options

• Multiple routes to customize the function and appearance of the tools:

– Menu options
– Parameters passed upon PTools web server startup
– PTools “init” file
– Style sheet
– Custom scripts

Some options that we find to be useful:

PathoLogic: Specify Reference PGDB

SGD had some recent curation that was not yet included in MetaCyc

PathoLogic allowed us to include the new information in the prediction set!

Useful customization options, continued:

Pathway Hole Filler can be run without using the operon-related data types

Issue this command at the Lisp prompt before you start the hole filler:

(update-nodes '(SSCORE-NODE EVALS-NODE ALN-NODE RANK-NODE))

Useful customization options, continued:

Gene links within pathway displays link to CGD Locus Summary pages

Use the -gene-link-db CGD argument when starting the web server. The link template is defined in the CGD database frame.
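
To illustrate the idea of a link template, here is a minimal Python sketch of how a gene identifier might be substituted into a URL pattern to produce a link to a gene page. This is not PTools code: the template string, the example.org host, the `{gene_id}` placeholder syntax, and the function name are all assumptions made for illustration. The real template lives in the CGD database frame and its syntax may differ.

```python
from urllib.parse import quote

# Hypothetical template; the actual one is defined in the CGD database frame.
LINK_TEMPLATE = "http://www.example.org/cgi-bin/locus.pl?locus={gene_id}"


def gene_link(gene_id, template=LINK_TEMPLATE):
    """Return a URL for gene_id by filling the placeholder in the template."""
    # Percent-encode the identifier so it is safe inside a query string.
    return template.format(gene_id=quote(gene_id, safe=""))


# "orf19.1234" is a made-up identifier used only to show the substitution.
print(gene_link("orf19.1234"))
```

Keeping the URL pattern in one place means the link target can be changed without touching the pathway displays themselves.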

Useful customization options, continued:

Add custom header and footer

Integrated appearance, navigation with the rest of our site

Define header and footer in: /ptools-local/html

Useful customization options, continued:

Customized color of buttons and boxes on the interface

Use “CGD Gold,” not “EcoCyc Orange”

Set in: aic-export/htdocs/style.css

Relabel “Quick Search” on the interface (because we already have a Quick Search, with different functionality)

Set in: /ptools-local/ptools-init.dat

One more useful customization:

Custom-format pathway files, updated weekly

• Suzanne Paley sent us a Lisp script that we run as a cron job to regenerate the flat files weekly

• We then process the flat files to generate an always-current custom-format file, requested by a CGD user
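
The post-processing step can be sketched roughly as follows. This is a minimal Python illustration, not the script CGD actually runs. It assumes the flat files use the attribute-value layout (one "ATTRIBUTE - value" pair per line, records terminated by "//", comment lines starting with "#"), and it assumes a tab-delimited output, since the slide does not describe the custom format. The sample records and identifiers are invented.

```python
from collections import defaultdict

# Invented sample in attribute-value style; not real CGD data.
SAMPLE = """\
# Hypothetical excerpt of a pathways flat file
UNIQUE-ID - PWY-0001
COMMON-NAME - example pathway A
REACTION-LIST - RXN-1
REACTION-LIST - RXN-2
//
UNIQUE-ID - PWY-0002
COMMON-NAME - example pathway B
REACTION-LIST - RXN-3
//
"""


def parse_records(text):
    """Yield one dict per record, mapping each attribute to a list of values."""
    record = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.strip() == "//":  # end-of-record marker
            if record:
                yield dict(record)
            record = defaultdict(list)
            continue
        if " - " in line:
            key, value = line.split(" - ", 1)
            record[key].append(value)
    if record:  # a final record without a trailing "//"
        yield dict(record)


def to_tab_delimited(records):
    """Return one tab-delimited row per pathway: ID, name, reactions."""
    rows = []
    for rec in records:
        pathway_id = rec.get("UNIQUE-ID", [""])[0]
        name = rec.get("COMMON-NAME", [""])[0]
        reactions = "|".join(rec.get("REACTION-LIST", []))
        rows.append("\t".join([pathway_id, name, reactions]))
    return rows


rows = to_tab_delimited(parse_records(SAMPLE))
print("\n".join(rows))
```

Run after the weekly flat-file regeneration, a script along these lines would keep the derived file in step with the curated pathways.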

Some advice and encouragement on the pathway generation process

• The process can be very “fiddly.” Hang in there!

• Helpful user support: [email protected]

• Do not be afraid to ask for help! The process can be complicated, but an active dialogue with the helpful PTools support team makes it all possible

The Team

Gavin Sherlock, PI

Martha Arnaud, Curator

Maria Costanzo, Curator

Diane Inglis, Curator

Marek Skrzypek, Curator

Gail Binkley, Database Administrator

Stuart Miyasato, Systems Administrator

Prachi Shah, Scientific Programmer

THANKS

For help in getting CGD Pathways up and running:

• Peter Karp
• Suzanne Paley
• Michelle Green
• Joe Dale
• Ron Caspi
• SGD

Contact us: [email protected]
