1
Agenda
VAPrototype
Genome Services
iPlant API
GUI
• Design info/file
• GEO#
• GO terms of interest
• GenExpr2Ddata
• GenExprSum
• ContrastFiles
• Enriched GO graphs
• GeneMANIA output
AffyGenAnalyser
BiNGO
GOAnalyser
Enriched GOterms
GeneMANIA
GeneLists
GenAV Analysis & Visualization of Affymetrix Gene Expression Data
User
Output (text files, graphs)
3
LIVE DEMO
4
Working group-led iteration and discussion (Jan-June)
• Componentization
• Reusability
• Identification of potential GUI representations for work products
Summer Supercomputing Institute Meeting (July)
• Refinement of workflow
• Identification of entry and exit points
• Iteration on GUI representations
• Cyberinfrastructure-oriented design
• Implementation decisions
  • Technology/language
  • Work allocation
How did we get here?
5
Expression Analysis
Bioconductor limma
• Retrieve data
• Specify experiment design
• Normalize (gcRMA)
• Linear model fit
• Bayesian correction
• Hypothesis testing
• Emit results
NCBI GEO
iPlant Data Storage API
•Limma is a standard module for expression analysis
•Limma incorporates translation and integration code to handle most common array platforms
•Limma writes verbose but consistent delimited results
•People know how to use BioC/Limma and can do so on their desktop systems
The entry point is the user uploading an expression file via the iPlant Data API
6
VAPrototype
• Retrieve data via /data API
• Iterate over experiments
• Perform category enrichment
• Consolidate results
• Return as JSON data structure
http://medea/iplant/js/application.js
1. Invoke VAPrototype via iPlant /jobs API
2. Poll for service to complete
3. Fetch results as JSON
4. Render to dynamic table
5. Interpret user interactions
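The invoke, poll, and fetch steps above can be sketched as follows. This is a minimal illustration in Python rather than the JavaScript of application.js, and it makes several assumptions the slides do not state: the three callables (submit, get_status, fetch_result) stand in for HTTP calls to the /jobs API, and the status strings "FINISHED" and "FAILED" are hypothetical.

```python
import json
import time
from typing import Callable


def run_job_and_fetch(
    submit: Callable[[dict], str],
    get_status: Callable[[str], str],
    fetch_result: Callable[[str], str],
    params: dict,
    poll_interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Submit a job, poll until it completes, and return the parsed JSON result.

    The callables are placeholders for real HTTP requests; the status
    values checked below are assumptions, not documented API values.
    """
    job_id = submit(params)                      # step 1: invoke the service
    for _ in range(max_polls):                   # step 2: poll for completion
        status = get_status(job_id)
        if status == "FINISHED":
            return json.loads(fetch_result(job_id))  # step 3: fetch JSON
        if status == "FAILED":
            raise RuntimeError(f"job {job_id} failed")
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the transport functions in as arguments keeps the polling logic testable without a live service; rendering and user interaction (steps 4 and 5) would consume the returned dictionary.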
Lecong.cgi
• Accept gene list
• Accept control list
• Accept parameters
• Run analysis using call to R
• Return JSON data structure
iPlant Jobs API
iPlant Data API
R/Bioconductor/HyperGO
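Category enrichment of a gene list against a control list is commonly tested with a one-sided hypergeometric test. The sketch below shows that calculation in Python for illustration only; it is not the R code that Lecong.cgi calls, and it assumes the control list serves as the background universe, which the slides do not confirm. The function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(gene_list, control_list, category_genes) -> float:
    """One-sided hypergeometric p-value for over-representation.

    Assumption: the background universe is the union of the control
    list and the gene list. Returns P(X >= k), where k is the number
    of selected genes annotated to the category.
    """
    selected = set(gene_list)
    universe = set(control_list) | selected
    category = set(category_genes) & universe

    N = len(universe)             # genes in the background
    K = len(category)             # background genes in the category
    n = len(selected)             # genes in the user's list
    k = len(selected & category)  # overlap between list and category

    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

In a full analysis this would be repeated for each category and followed by a multiple-testing correction, which is omitted here.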
7
http://medea/iplant/js/application.js
1. Interpret user interactions
1. Sorting
2. Downloading
3. Invoke Network Analysis service via iPlant /jobs API
4. Poll /jobs for completion
5. Fetch results (GraphML)
6. Render in Cytoscape Web
BuildNetwork
• Accept gene list
• Accept parameters (species, etc.)
• Accept algorithm name (GeneMANIA)
• Invoke GeneMANIA plugin (Java) to predict network
• Convert all gene names to AGI codes
• Convert domain-specific report to GraphML
iPlant Jobs API
iPlant Genome Service API
GeneMANIA
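The final BuildNetwork step, converting a network report to GraphML for Cytoscape Web, might look roughly like the sketch below. It assumes the report has already been reduced to (source, target, weight) tuples keyed by AGI code; that intermediate form and the function name are illustrative assumptions, since the slides do not describe the report format.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def edges_to_graphml(edges) -> str:
    """Serialize (source, target, weight) tuples as a GraphML string."""
    edges = list(edges)  # allow generators; the edges are traversed twice
    ET.register_namespace("", GRAPHML_NS)

    def tag(name: str) -> str:
        return f"{{{GRAPHML_NS}}}{name}"

    root = ET.Element(tag("graphml"))
    # Declare the edge attribute that carries the interaction weight.
    ET.SubElement(root, tag("key"), {
        "id": "weight", "for": "edge",
        "attr.name": "weight", "attr.type": "double",
    })
    graph = ET.SubElement(root, tag("graph"), {"id": "G", "edgedefault": "undirected"})

    # One <node> per distinct gene identifier.
    for node_id in sorted({n for s, t, _ in edges for n in (s, t)}):
        ET.SubElement(graph, tag("node"), {"id": node_id})

    # One <edge> per predicted interaction, with its weight as <data>.
    for source, target, weight in edges:
        edge = ET.SubElement(graph, tag("edge"), {"source": source, "target": target})
        data = ET.SubElement(edge, tag("data"), {"key": "weight"})
        data.text = str(weight)

    return ET.tostring(root, encoding="unicode")
```

For example, `edges_to_graphml([("AT1G01010", "AT1G01020", 0.5)])` yields a document with two nodes and one weighted edge; the identifiers here only illustrate the AGI code format.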
8
9
10
What’s next
• VAPrototype won’t see any explicit additional development since it is a proof of principle
• We need to focus on delivering robust versions of the functions that are mocked up
• It serves as a reference implementation for a 3rd party DE
• It also illuminates specific data integration needs
• We may use it as a testing ground for new ideas in GUI, service coordination, and API design
• It will be ported to use the full implementation of the iPlant API and used as an example for potential developers
  • Web application portion: 1 day
  • Web services: 1 week
11
Genome Services
Why is this needed? This is G2P not genomics!
•Support multiple genomes in UHTS services
•Support germplasms and natural accessions
•Pave the way to supporting user genomes
•Make best use of existing resources
•Sane, authority-led approach to data integration
Current Ideas
• Return a structured list of taxonomic identifiers (genus, species, version, germplasm/accession) supported by iPlant
• Given a genus, species, version, and germplasm/accession identifier:
  • Return a URI pointing to a multiple-FASTA containing the genome sequence
  • Return a URI pointing to a GFF3 version of the genome annotation
  • Return a URI pointing to a GTF version of the genome annotation
  • Return a URI pointing to the dummy expr files needed by Cufflinks for RNAseq
  • Be able to actually return the files referenced by these URIs for download
• Given the taxonomic identifier plus a name or synonym of a gene:
  • Return an authoritative name for said gene
• Given the taxonomic identifier plus a microarray platform name plus a probe identifier:
  • Return the canonical gene name mapped to that microarray probe
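The lookups listed above can be modelled as a small in-memory registry. This is only a sketch of the data shapes involved, not a design for the service: every class, method, and field name is hypothetical, and the Cufflinks expr files and file download are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonId:
    """Taxonomic identifier: genus, species, version, germplasm/accession."""
    genus: str
    species: str
    version: str
    accession: str


@dataclass(frozen=True)
class GenomeResources:
    """URIs for the sequence and annotation files of one genome."""
    fasta_uri: str
    gff3_uri: str
    gtf_uri: str


class GenomeRegistry:
    def __init__(self) -> None:
        self._resources: dict = {}
        self._synonyms: dict = {}
        self._probes: dict = {}

    def register(self, taxon: TaxonId, resources: GenomeResources) -> None:
        self._resources[taxon] = resources

    def add_synonym(self, taxon: TaxonId, synonym: str, canonical: str) -> None:
        # Names are matched case-insensitively; the canonical name maps to itself.
        self._synonyms[(taxon, synonym.lower())] = canonical
        self._synonyms[(taxon, canonical.lower())] = canonical

    def add_probe(self, taxon: TaxonId, platform: str, probe_id: str, canonical: str) -> None:
        self._probes[(taxon, platform, probe_id)] = canonical

    def list_taxa(self) -> list:
        """Structured list of supported taxonomic identifiers."""
        return list(self._resources)

    def resources(self, taxon: TaxonId) -> Optional[GenomeResources]:
        return self._resources.get(taxon)

    def canonical_gene(self, taxon: TaxonId, name: str) -> Optional[str]:
        return self._synonyms.get((taxon, name.lower()))

    def gene_for_probe(self, taxon: TaxonId, platform: str, probe_id: str) -> Optional[str]:
        return self._probes.get((taxon, platform, probe_id))
```

In the architecture described on these slides, the entries would be populated from clade-specific data authorities rather than registered by hand.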
iPlant Genome Services API
Clade-specific data authorities
NCBI and EBI
Local Knowledge
Mirroring relationships
Genome Services
iPlant Genome Services API
Clade-specific data authorities
NCBI and EBI
Local Knowledge
Mirroring relationships
Direct relationships
Indirect relationships
(CoGE)
Taxonomic Name Resolution Service (TNRS)
Discovery Environment
TAIR
Gramene
Phytozome
Etc.
14
The iPlant API
The iPlant API will support the following use cases:
1. I have a command-line tool that performs a specific type of bioinformatics analysis and I want to make it available to others.
2. I have a web service that performs a specific type of bioinformatics analysis and I want to make it available to others.
3. I have a web site that people can use to perform analyses and I want to make it available to others.
4. I want to write a web application that chains multiple types of tools together.
5. I want to use a workflow manager like Taverna or Kepler to orchestrate a set of analytical steps.
Architecture
Core Services
• Eventing
• I/O
• Data Transforms
• App Discovery
• Job Mgmt.
• User Profile Mgmt.
• Authentication
• User/Project Auditing
• Mashups (Orchestration)
I/O Services
Getting raw data into and out of the iPlant CI and moving data around internally
• /io: upload files and stage URIs (http, https, ftp, sftp, gsiftp, jdbc, amazon s3, irods)
• /io/list: list iPlant files
• /io/<file_id>: download, delete file
Job Management Services
Submitting and managing jobs to run supported applications as well as querying for historical information about jobs
• /job: submit a job
• /job/history: job history
• /job/<job_id>: kill an active job or get information about a job
• /job/<job_id>/input/list: get a listing of the input files associated with a specific job
• /job/<job_id>/input/<file_id>: retrieve a specific input file in the format it was in when the job ran
• /job/<job_id>/output/list: get a listing of the output files associated with a specific job
• /job/<job_id>/output/<file_name>: retrieve a specific output file associated with the job
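A client would need to assemble these paths from a base URL, a job identifier, and file names. The helper below is a hypothetical sketch of that step; the base URL is a placeholder, and the slides do not say which HTTP methods each endpoint expects, so none are assumed here.

```python
from urllib.parse import quote


def job_url(base_url: str, *segments) -> str:
    """Build a /job endpoint URL, percent-encoding each path segment.

    Illustrative helper only; not part of any published iPlant client.
    """
    path = "/".join(quote(str(s), safe="") for s in ("job", *segments))
    return f"{base_url.rstrip('/')}/{path}"


# Placeholder base URL used for illustration.
BASE = "https://example.org/api"

history_url = job_url(BASE, "history")
outputs_url = job_url(BASE, 42, "output", "list")
one_output_url = job_url(BASE, 42, "output", "results table.txt")
```

Encoding each segment separately means a file name containing spaces or slashes cannot change the structure of the path.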
Application Discovery Services
Application discovery and management (different from semantic web service discovery)
• /apps: add a new application to the iPlant CI
• /apps/list: list all supported applications
• /apps/search: search for a specific application
• /apps/type/list: list all supported application types
• /apps/type/<type_name>: list all supported applications of a specific type
• /apps/name/<app_name>: list all supported applications matching a given name
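To make the behaviour of these endpoints concrete, here is a minimal in-memory catalog with one method per endpoint. The class names, fields, and matching rules (case-insensitive substring search, exact name match) are assumptions made for illustration; the slides do not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    type_name: str
    description: str = ""


class AppCatalog:
    def __init__(self) -> None:
        self._apps = []

    def add(self, app: App) -> None:
        """/apps: add a new application."""
        self._apps.append(app)

    def list_apps(self) -> list:
        """/apps/list: all supported applications."""
        return list(self._apps)

    def search(self, term: str) -> list:
        """/apps/search: substring match on name or description (assumed rule)."""
        needle = term.lower()
        return [a for a in self._apps
                if needle in a.name.lower() or needle in a.description.lower()]

    def list_types(self) -> list:
        """/apps/type/list: all supported application types."""
        return sorted({a.type_name for a in self._apps})

    def by_type(self, type_name: str) -> list:
        """/apps/type/<type_name>: applications of one type."""
        return [a for a in self._apps if a.type_name == type_name]

    def by_name(self, app_name: str) -> list:
        """/apps/name/<app_name>: applications matching a given name."""
        return [a for a in self._apps if a.name.lower() == app_name.lower()]
```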