wFleaBase: Daphnia Genome Database from Common Components
Daphnia Genomic Consortium
Meeting, Sept. 2003
Don Gilbert
A Replicable Genome infOrmation System (Argos)
http://eugenes.org/argos | flybase.net/flybase-ng
common/java/ ; perl/ -- program libraries and packages
servers/ -- major programs (BLAST, MySql/PostgreSQL, others)
systems/ -- OS executables of programs
daphnia/ -- implemented organism genome systems
eugenes/
flybase/
docs/ & install/ -- Argos instructions and usage
template/ -- structure for new projects
ROOT/ -- common directory of installed projects
Argos features
• Common genome tool set
  • Share benefits of “best of breed” genome tools
  • Common parts are tested & maintained by others
  • Minimal IT expertise (no compiles or system management)
  • Choice of tools (existing or new genome DB use parts desired)
• Flexible project packages
  • Project needs specify tool set (compare EnsEMBL where all use one set)
  • Own look’n’feel web pages, contents, functions
  • Security for protected and public sections
• Easy replication to any Unix computer
  • ‘Live’ database system replication using rsync
  • Keep remote servers up-to-date every day
  • Local cluster/grid for high-volume traffic
  • Works on common workstations, laptops
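The rsync-based replication mentioned above could look roughly like the following. This is a minimal sketch, not the Argos replication script: the host name `mirror.example.org`, the path `/bio/argos/daphnia`, and the function name `rsync_mirror_command` are all hypothetical. It only builds the command line and does not run it.

```python
import shlex

def rsync_mirror_command(source_root, remote_host, remote_root, dry_run=True):
    """Build an rsync command that mirrors a project tree to a remote server.

    Host and paths are illustrative assumptions, not values from Argos.
    """
    # -a preserves permissions, times and symlinks; -z compresses in transit;
    # --delete removes remote files that no longer exist in the source tree.
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        # Report what would change without copying anything.
        cmd.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd.append(source_root.rstrip("/") + "/")
    cmd.append(f"{remote_host}:{remote_root.rstrip('/')}/")
    return cmd

cmd = rsync_mirror_command("/bio/argos/daphnia", "mirror.example.org",
                           "/bio/argos/daphnia")
print(shlex.join(cmd))
```

Running a command like this once a day from a scheduler such as cron is one way to keep remote servers up to date, since rsync transfers only the files that changed.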
Argos common parts
Java common library, Ant builds, XML Tools,
Web Services (Axis), Lucene for “Google”-like searches
Perl common library of BioPerl, GBrowse, others
Servers include
Apache, Tomcat web servers
MySQL, PostgreSQL databases
BLAST (NCBI)
Systems compiled for
apple-powerpc-darwin, intel-linux, sun-sparc-solaris
wFleaBase structure
Cgi-bin -- Web programs (Perl)
Common -- Link to common, shared tools
Conf -- Site configurations for web, data
Data -- Bulk data & FTP site folder
Dbs -- Project databases: blast, lucene, mysql
Indices -- Database indices
Lib -- Program libraries
Web -- Web structure and documents
  Genomics, Sequences, Maps, Literature, Stocks, Docs, other
  includes Public and Protected (project member only) parts
Webapps -- Web programs (Java)
  includes Search system, Secure web and editing
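As an illustration of the project structure listed above, the sketch below creates an empty skeleton with the same folders. It is an assumption-laden example, not the Argos install procedure: the lowercase folder names, the function `create_project`, and the use of a symbolic link for the common tools are guesses based on the slide.

```python
from pathlib import Path
import tempfile

# Folder names follow the slide; lowercasing them is an assumption.
PROJECT_DIRS = ["cgi-bin", "conf", "data", "dbs", "indices", "lib", "web", "webapps"]

def create_project(root, name, common_dir):
    """Create an empty project skeleton and link it to the shared tools."""
    project = Path(root) / name
    for sub in PROJECT_DIRS:
        (project / sub).mkdir(parents=True, exist_ok=True)
    # "Common" is described as a link to shared tools; a symlink is assumed here.
    link = project / "common"
    if not link.is_symlink() and not link.exists():
        link.symlink_to(common_dir, target_is_directory=True)
    return project

with tempfile.TemporaryDirectory() as tmp:
    common = Path(tmp) / "common"
    common.mkdir()
    project = create_project(tmp, "daphnia", common)
    created = sorted(p.name for p in project.iterdir())
    print(created)
```

Keeping the shared tools behind a single link means each organism project carries only its own data, configuration, and web pages.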
Where to put Daphnia Genome?
• Database needs
  • Automated annotation and curated updates
  • Search and retrieve data subsets
• Choices
  • EnsEMBL - working now; Gramene & others use
  • GMOD:Chado - in development (FlyBase, WormBase, ChlamyGenome, TIGR, others will use)
  • Other choices?
Generic Model Organism Database Construction Set
• Genome+ Database (more than annotations)
• Genome visualization tools
• Genome annotation pipeline planned
• Literature curation and Gene Ontology tools
• Component system (pick and choose)
• Developing - more complete in 2004
www.gmod.org
EnsEMBL Genome Database
• Genome annotation database
• Genome visualization tools
• Genome annotation pipeline
• Comprehensive system (all or none)
• Production - useable now
www.ensembl.org
wFleaBase issues
• Basic web system ready for genome data?
• Start with EnsEMBL for management; move to GMOD:Chado if better choice?
• Add GMOD GBrowse; Apollo Editor with genome
• Add “Self-service” database features for?
  • Easy management by scientists
  • Genome data; stocks; research literature
• Add evolutionary, ecological, environmental data
Prototype at http://iubio.bio.indiana.edu/daphnia/