myGrid/Taverna Provenance


  • myGrid/Taverna Provenance
    Daniele Turi, University of Manchester

    OMII f2f Meeting, London, 19-20/4/06

  • Components
    - Identifiers: LSIDs
    - Data: JDBC data store
    - Metadata: RDF Provenance Plugin
    - Browsing: Provenance Browser Plugin
    - Security: under development

  • LSID

  • LSID: Life Science Identifier
    - URN specification in progress
    - 5-part identifier (with optional version id)
        urn:lsid:www.mygrid.org.uk:lsdocument:X1234
        urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:7717376
    - protocol for retrieving data and metadata about an object
    - commitment by the provider to always return the same data for an ID
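
The identifier structure on this slide can be illustrated with a small parser. This is a minimal sketch, assuming the five parts are `urn`, `lsid`, authority, namespace and object id, followed by an optional sixth part for the version; the names `LSID` and `parse_lsid` are introduced here for illustration and are not part of myGrid or Taverna.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parsed form of an LSID (hypothetical helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID into its parts; raise ValueError if it is malformed."""
    parts = text.split(":")
    # Assumed layout: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:5]):
        raise ValueError(f"authority, namespace and object must be non-empty: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


print(parse_lsid("urn:lsid:www.mygrid.org.uk:lsdocument:X1234"))
```

Parsing only checks the shape of the identifier; retrieving data and metadata still goes through the resolution protocol described on the slide.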

  • LSID (ctd)
    - Issue: LSID Authorities
    - Resolution: LSID Resolvers
    - Examples: myGrid, Long Term Ecological Research Network, BioPathways Consortium

  • LSID (ctd 2)
    - abstraction
    - lightweight
    - independent from actual storage implementation: database, file system, application
    - both for private and public data sources

  • Data

  • Data Storage (current)
    - Taverna can persist inputs, outputs and intermediate results in an SQL database via JDBC
    - Optional; enabled by configuring a Baclava Data Store
    - Allows the LSIDs of data items to be resolved against the actual data

  • Data Storage (future)
    - Domain-specific databases
    - use outside myGrid
    - Develop:
        - Taverna processor for JDBC/OGSA-DAI
        - associated interface (cf BioMart)
    - Users will be able to study the contents of an existing database and:
        - write queries that extract data from the database, where the query may be parameterised with values passed in from the workflow;
        - write requests that insert data from the workflow into a named table in the database.
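
The two user operations named on this slide, a parameterised extraction query and an insert into a named table, might look roughly as follows. This is only a sketch: it uses Python's built-in `sqlite3` as a stand-in for a JDBC/OGSA-DAI source, and the `sequences` table, its columns and both function names are hypothetical.

```python
import sqlite3

# In-memory database standing in for an existing domain-specific database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [("A1", "human", 120), ("A2", "mouse", 95), ("A3", "human", 300)],
)


def query_by_organism(conn, organism):
    """Extraction query parameterised with a value passed in from the workflow."""
    cursor = conn.execute(
        "SELECT accession, length FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return cursor.fetchall()


def insert_result(conn, accession, organism, length):
    """Insert a value produced by the workflow into the named table."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sequences (accession, organism, length) VALUES (?, ?, ?)",
            (accession, organism, length),
        )


insert_result(conn, "A4", "mouse", 210)
print(query_by_organism(conn, "mouse"))
```

Passing workflow values as bound parameters rather than splicing them into the SQL string keeps the query text fixed, which is the usual way to avoid injection problems.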

  • Metadata

  • Metadata Generation
    - Taverna Provenance Plugin
    - Listen to Taverna events (WorkflowEventListener)
    - Faithfully record them as ontological instance data
    - RDF graphs (one for each Taverna run)
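
The listener idea can be sketched as a recorder that turns each event into triples and keeps one graph per run. This is an illustrative Python sketch, not the plugin's Java `WorkflowEventListener` implementation: the class name, the method names, the example identifiers and the direction of each property are all assumptions.

```python
from collections import defaultdict


class ProvenanceRecorder:
    """Hypothetical listener: records workflow events as (subject, predicate, object) triples."""

    def __init__(self):
        # One set of triples (a graph) per workflow run.
        self.graphs = defaultdict(set)

    def workflow_started(self, run, workflow, user):
        graph = self.graphs[run]
        graph.add((run, "runs", workflow))       # assumed direction: run -> workflow
        graph.add((run, "launchedBy", user))     # assumed direction: run -> person

    def process_completed(self, run, process_run):
        self.graphs[run].add((run, "executed", process_run))


# Hypothetical identifiers, for illustration only.
RUN = "urn:lsid:example.org:wfInstance:8"
recorder = ProvenanceRecorder()
recorder.workflow_started(RUN, "urn:lsid:example.org:workflow:6", "urn:lsid:example.org:person:4")
recorder.process_completed(RUN, "urn:lsid:example.org:processRun:84")
print(len(recorder.graphs[RUN]))
```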

  • Metadata
    - Representation
    - Ontology (Schema)
    - Storage
    - Query
    - Browsing

  • Representation
    - RDF
    - triples: subject, predicate, object
    - URIs (hence easy data integration)
    - semantic web language
    - XML serialization
    - flexible, powerful
    - sets of triples give rise to graphs
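
The point about URIs and data integration can be shown with plain tuples: two independently produced sets of triples merge by set union, and a shared URI joins them into one graph. A minimal sketch, with hypothetical identifiers and helper names:

```python
# Hypothetical identifiers, for illustration only.
RUN = "urn:lsid:example.org:wfInstance:8"
PERSON = "urn:lsid:example.org:person:4"
ORG = "urn:lsid:example.org:org:HY7"

# Two sources describing different things, sharing only the PERSON URI.
run_graph = {(RUN, "launchedBy", PERSON)}
directory_graph = {(PERSON, "belongsTo", ORG)}

# Integration is just the union of the two triple sets.
merged = run_graph | directory_graph


def objects(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def organisations_for_run(graph, run):
    """Follow launchedBy then belongsTo: a two-step path through the graph."""
    return sorted(
        org
        for person in objects(graph, run, "launchedBy")
        for org in objects(graph, person, "belongsTo")
    )


print(organisations_for_run(merged, RUN))
```

The path query returns nothing on either source alone and succeeds only on the merged graph.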

  • Workflow Run
    [Diagram: RDF graph of a workflow run.
     Nodes: urn:lsid:..:wfInstance:8, urn:lsid::org:HY7, urn:lsid::person:4, urn:lsid::workflow:6, urn:lsid::processRun:84, urn:lsid::processRun:51.
     Edge labels: runs, launchedBy, belongsTo, executed (twice).]

  • Schema
    - Ontology
    - RDF schema: taxonomic inferences
    - also available as OWL: opens it up to complex reasoning
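
A taxonomic inference in the RDF Schema sense means that an instance of a class is also an instance of every superclass. The sketch below computes that closure in plain Python. The class hierarchy shown (`WorkflowRun` and `ProcessRun` under `Run`, `Experimenter` under `Person`, both under `Thing`) is invented for illustration and is not taken from the myGrid provenance ontology.

```python
def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(instance_types, subclass_of):
    """Map each instance to its asserted type plus all inferred supertypes."""
    return {
        instance: {asserted} | superclasses(asserted, subclass_of)
        for instance, asserted in instance_types.items()
    }


# Hypothetical hierarchy: class -> list of direct superclasses.
SUBCLASS_OF = {
    "WorkflowRun": ["Run"],
    "ProcessRun": ["Run"],
    "Experimenter": ["Person"],
    "Run": ["Thing"],
    "Person": ["Thing"],
}

types = inferred_types({"run8": "WorkflowRun", "person4": "Experimenter"}, SUBCLASS_OF)
print(sorted(types["run8"]))
```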

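  • Typed Workflow Run
    [Diagram: the workflow run graph with its nodes typed against the Provenance Ontology.
     Ontology classes: Experimenter, Organization, ProcessRun, WorkflowRun, Workflow.
     Ontology properties: runs, launchedBy, belongsTo, executed.
     Instance nodes: urn:lsid:..:wfInstance:8, urn:lsid::org:HY7, urn:lsid::person:4, urn:lsid::workflow:6, urn:lsid::processRun:84, urn:lsid::processRun:51.
     Instance edge labels: runs, launchedBy, belongsTo, executed (twice).]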

  • Storage
    - Named RDF graphs: retrieve whole graphs (eg workflows)
    - implementation in NG4J (Jena + MySQL): scalability issues
    - Sesame2 native store: scalable, Java 5

  • Query
    - RDF query languages: TriQL, SeRQL, SPARQL
    - query languages for named RDF graphs
    - Ontology inspection/reasoning
    - Canned queries:
        - workflows with failed processes
        - input/output of past process runs
        - workflows with data changed by user
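
To make the first canned query concrete, here is a sketch of "workflows with failed processes" evaluated over named graphs held as Python sets of triples. It does not use TriQL, SeRQL or SPARQL. The short identifiers and the `status` predicate are hypothetical, and the direction of `runs` and `executed` is an assumption.

```python
# One named graph per workflow run (graph name -> set of triples).
NAMED_GRAPHS = {
    "run:8": {
        ("run:8", "runs", "workflow:6"),
        ("run:8", "executed", "processRun:84"),
        ("run:8", "executed", "processRun:51"),
        ("processRun:84", "status", "failed"),      # hypothetical predicate
        ("processRun:51", "status", "completed"),
    },
    "run:9": {
        ("run:9", "runs", "workflow:7"),
        ("run:9", "executed", "processRun:90"),
        ("processRun:90", "status", "completed"),
    },
}


def workflows_with_failed_processes(named_graphs):
    """Return the workflows whose run executed at least one failed process."""
    result = set()
    for graph in named_graphs.values():
        failed = {s for s, p, o in graph if p == "status" and o == "failed"}
        failing_runs = {s for s, p, o in graph if p == "executed" and o in failed}
        result |= {o for s, p, o in graph if p == "runs" and s in failing_runs}
    return sorted(result)


print(workflows_with_failed_processes(NAMED_GRAPHS))
```

Evaluating the pattern graph by graph mirrors the named-graph model, where each run's provenance is retrieved and queried as a unit.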

  • Browsing

  • Provenance Browsing
    - Provenance Browser Plugin
    - reusing Taverna GUI components
    - Matthew Gamble

  • Analysis

  • Provenance Analysis
    - Comparison
    - Aggregation
    - etc
    - see work by Jun Zhao

  • Security

  • User sends LSID ref and credentials to the Access Point.
    Access Point returns data and metadata or denies access as follows:
    - credentials are passed to a User Directory
    - User Directory passes the corresponding user to the Authorization Authority
    - Authorization Authority returns the user attributes in the form of a (possibly signed) SAML assertion
    - this assertion, together with the LSID and its corresponding metadata, is passed to the Policy Enforcement Point (PEP)
    - PEP uses these three inputs to form an XACML request that is passed to a Policy Decision Point (PDP) that is preloaded with an XACML Policy Set
    - PDP evaluates the request against its policy set and returns an XACML response to PEP
    - PEP decodes the response and either allows data/metadata to be returned to the user or denies access
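
The sequence of hand-offs above can be traced in a few functions. This is a sketch of the control flow only: plain dictionaries stand in for the SAML assertion and the XACML request and response, the PDP applies a placeholder owner-only rule rather than a real policy set, and every name and value is hypothetical.

```python
LSID = "urn:lsid:example.org:workflow:6"  # hypothetical identifier

USER_DIRECTORY = {"secret-alice": "alice", "secret-bob": "bob"}  # credentials -> user
USER_ATTRIBUTES = {"alice": {"role": "student"}, "bob": {"role": "student"}}
METADATA = {LSID: {"owner": "alice"}}
DATA = {LSID: "<workflow definition>"}


def authorization_authority(user):
    """Return the user's attributes; stands in for a (possibly signed) SAML assertion."""
    return {"subject": user, "attributes": USER_ATTRIBUTES.get(user, {})}


def policy_decision_point(request):
    """Placeholder policy: permit only the owner recorded in the metadata."""
    owner = request["metadata"].get("owner")
    if owner is not None and owner == request["assertion"]["subject"]:
        return "Permit"
    return "Deny"


def policy_enforcement_point(assertion, lsid, metadata):
    """Build the request from the three inputs and decode the PDP's response."""
    request = {"assertion": assertion, "resource": lsid, "metadata": metadata}
    return policy_decision_point(request) == "Permit"


def access_point(lsid, credentials):
    """Return (data, metadata) if access is permitted, otherwise None."""
    user = USER_DIRECTORY.get(credentials)
    if user is None:
        return None  # unknown credentials
    assertion = authorization_authority(user)
    metadata = METADATA.get(lsid, {})
    if policy_enforcement_point(assertion, lsid, metadata):
        return DATA.get(lsid), metadata
    return None


print(access_point(LSID, "secret-alice"))
```

Keeping enforcement (PEP) separate from the decision (PDP) means the policy can change without touching the Access Point.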

  • myGrid XACML Policy
    - Scenario:
        - supervisors can access all workflows in the organization
        - students can access only their own workflows
        - blacklisted users cannot access anything
    - See policySet.xml on myGrid wiki
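
The three rules of the scenario can be written as a decision function. This sketch does not reproduce policySet.xml, which is not shown here. The attribute names are hypothetical, and letting the blacklist override the other rules is an assumption made for illustration.

```python
def decide(user, workflow):
    """Return "Permit" or "Deny" for a user requesting access to a workflow."""
    # Blacklisted users cannot access anything (assumed to override other rules).
    if user.get("blacklisted", False):
        return "Deny"
    # Supervisors can access all workflows in their organization.
    if user.get("role") == "supervisor" and user.get("organization") == workflow.get("organization"):
        return "Permit"
    # Students can access only their own workflows.
    if user.get("role") == "student" and user.get("id") == workflow.get("owner"):
        return "Permit"
    # Anything not explicitly permitted is denied.
    return "Deny"


# Hypothetical users and workflow.
workflow = {"owner": "sam", "organization": "HY7"}
supervisor = {"id": "pat", "role": "supervisor", "organization": "HY7"}
owner = {"id": "sam", "role": "student", "organization": "HY7"}
other_student = {"id": "kim", "role": "student", "organization": "HY7"}
blacklisted = {"id": "lee", "role": "supervisor", "organization": "HY7", "blacklisted": True}

for person in (supervisor, owner, other_student, blacklisted):
    print(person["id"], decide(person, workflow))
```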