Kepler+PF+RWS, Podhorszki, Altintas et al. Provenance Challenge @ GGF18


RWS Provenance Experiments in Kepler (Kepler + PR + RWS)

Norbert Podhorszki

Ilkay Altintas

Bertram Ludaescher

in collaboration with

Shawn Bowers

Timothy McPhillips


Initial Provenance Framework (IPAW’06, Altintas et al.)

• Vision:
  – Modeled as a separate concern in the system
    • Optional drag and drop feature
  – Listen to execution and save information (customizable):
    • Context: who, what, where, when, and why that is associated with the run
    • Input data and its associated metadata
    • Workflow outputs and intermediate data products
    • Workflow definition (entities, parameters, connections): a specification of what exists in the workflow and can have a context of its own
    • Information about the workflow evolution (workflow trail)


Kepler System Architecture

[Figure: Kepler system architecture. Components shown: Authentication, GUI (Vergil), SMS, Kepler Core Extensions, Ptolemy, Kepler GUI Extensions, Actor & Data Search, Type System Ext, Provenance Recorder, Kepler Object Manager, Documentation, Smart Re-run / Failure Recovery. Source: IPAW’06, Altintas et al.]


Kepler Provenance Recorder (IPAW’06, Altintas et al)

• Parametric and customizable
  – Different report formats
  – Variable levels of verbosity
    • all, some, medium, on error
  – Multiple cache destinations
• Saves information on
  – User name, Date, Run, etc.


Read-Write-ReSet Model (IPAW’06, McPhillips et al)

• r, r, …, r, w, w, …, w, r, …, r, w, …, w, … (each run of reads followed by writes is one firing)
• What about actor state? What about “real” dependencies?
• Reset event s defines when the actor “cuts off” dependencies
  – a semantic notion, known to the actor [developer] (or part of a higher-order scheme)
• r, r, …, r, w, w, …, w, [s!] r, …, r, w, …, w, …

[Figure: actor A3 with its event stream r … r w … w and a reset event [s!]; annotation “PS ???”]
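The effect of the reset event can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `dependencies`, the tuple encoding of events, and the token names are hypothetical, and the rule assumed here is that each written token depends on every token the actor has read since its most recent reset.

```python
from typing import Dict, List, Optional, Tuple

# An event is (kind, token): kind is "r" (read), "w" (write) or "s" (state-reset).
# Reset events carry no token, so None is used as a placeholder.
Event = Tuple[str, Optional[str]]

def dependencies(events: List[Event]) -> Dict[str, List[str]]:
    """Map each written token to the tokens read since the last reset (assumed rule)."""
    reads_since_reset: List[str] = []
    deps: Dict[str, List[str]] = {}
    for kind, token in events:
        if kind == "r":
            reads_since_reset.append(token)
        elif kind == "w":
            deps[token] = list(reads_since_reset)
        elif kind == "s":
            reads_since_reset.clear()  # the actor "cuts off" earlier dependencies
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return deps

# Hypothetical trace of one actor: two firings separated by a reset.
trace = [("r", "t1"), ("r", "t2"), ("w", "t3"),
         ("s", None),
         ("r", "t4"), ("w", "t5")]
print(dependencies(trace))

# The same trace without the reset: the second write now depends on all reads.
no_reset = [e for e in trace if e[0] != "s"]
print(dependencies(no_reset))
```

With the reset, `t5` depends only on `t4`; without it, `t5` would also inherit `t1` and `t2`, which is the over-approximation the reset event is meant to avoid.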


Goals of the PR+RWS Experiments

• Use the RWS model for Kepler workflows
  – both single-level and nested workflows (fun starts here :-)
• Extend the Kepler Provenance Recorder
  – Modify the methods of the provenance listener class
  – Classes to store execution data about the workflow
    • To generate the send-receive relations of the tokens correctly
    • To count actor firings correctly
• Disclaimer: Initially only one workflow run is targeted
  – (but the approach can handle multiple actor firings due to pipeline parallelism)
  – future: queries over several runs and workflow provenance
  – (others in Kepler are already doing this; merge efforts in the future)


Implementation: Data Model

• Port-actor relationship
  – portTable(Port, Actor, type)
    • type is r for real and v for virtual (transparent)
• Token-object relationship
  – tokenTable(Token, Object)
• Object-value relationship
  – objectTable(Object, Value, Type)
    • type is currently not recorded
• RWS trace
  – traceTable(Port, Event, Token, FiringCounter)
    • event: r for read, w for write, or s for state-reset
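To make the four relations concrete, here is a minimal Python sketch of their row types. The table and column names follow the slide; the Python class names, field names, and the sample identifiers (`t30`, `o30`, the port name) are assumptions made for illustration and do not come from the Kepler code.

```python
from typing import NamedTuple, Optional

class PortRow(NamedTuple):       # portTable(Port, Actor, type)
    port: str
    actor: str
    port_type: str               # "r" = real, "v" = virtual (transparent)

class TokenRow(NamedTuple):      # tokenTable(Token, Object)
    token: str
    obj: str

class ObjectRow(NamedTuple):     # objectTable(Object, Value, Type)
    obj: str
    value: str
    value_type: Optional[str] = None   # not recorded, so left empty here

class TraceRow(NamedTuple):      # traceTable(Port, Event, Token, FiringCounter)
    port: str
    event: str                   # "r" = read, "w" = write, "s" = state-reset
    token: Optional[str]
    firing_counter: int

# Hypothetical rows describing one write of the final output by .pc.Convert_x.
port_row = PortRow(".pc.Convert_x.output", ".pc.Convert_x", "r")
token_row = TokenRow("t30", "o30")
object_row = ObjectRow("o30", '"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"')
trace_row = TraceRow(".pc.Convert_x.output", "w", "t30", 1)

for row in (port_row, token_row, object_row, trace_row):
    print(row)
```

Keeping tokens, objects, and values in separate relations means the trace itself only has to store small identifiers, while the (possibly large) values are stored once.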


Implementation: Class Hierarchy

• Extends the existing provenance execution listener with
  – Methods
  – More event listeners
  – Supporting classes
• RWSPortInfo, RWSActorInfo
  – Data structures for building and containing info about the workflow (and counters for event recording)
• RWSEvent
  – Handles RWS events


Execution: Initialization

Initialization phase: initialize()

• Generate RWS portMap
  – for each port: RWSPortInfo (info locally known at a port)
  – for each port: RWSPortInfo (build connection info)
• Generate RWS actorMap
  – for each actor: RWSActorInfo
• Record static wf info (written to portTable)
• Create new RWS event list
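The initialization steps can be sketched as a two-pass build in Python. The class names `RWSPortInfo` and `RWSActorInfo` come from the slides, but their fields, the `initialize` signature, and the sample workflow are assumptions for illustration; the real classes live in Kepler's Java code and may look quite different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class RWSPortInfo:
    name: str
    actor: str
    transparent: bool                          # assumed flag for virtual ports
    connected: List[str] = field(default_factory=list)

@dataclass
class RWSActorInfo:
    name: str
    ports: List[str] = field(default_factory=list)
    firing_counter: int = 0

def initialize(ports: List[Tuple[str, str, bool]], links: List[Tuple[str, str]]):
    # Pass 1: information that is locally known at each port.
    port_map: Dict[str, RWSPortInfo] = {
        name: RWSPortInfo(name, actor, transparent)
        for name, actor, transparent in ports
    }
    # Pass 2: connection information, which needs every port to exist already.
    for src, dst in links:
        port_map[src].connected.append(dst)
    # One RWSActorInfo per actor, collecting that actor's ports.
    actor_map: Dict[str, RWSActorInfo] = {}
    for info in port_map.values():
        actor_map.setdefault(info.actor, RWSActorInfo(info.actor)).ports.append(info.name)
    # Static workflow information: portTable(Port, Actor, type).
    port_table = [(p.name, p.actor, "v" if p.transparent else "r")
                  for p in port_map.values()]
    events: list = []                          # fresh RWS event list for this run
    return port_map, actor_map, port_table, events

# Hypothetical workflow: A feeds composite S, whose input port is transparent.
ports = [(".wf.A.out", ".wf.A", False),
         (".wf.S.in", ".wf.S", True),
         (".wf.S.C.in", ".wf.S.C", False)]
links = [(".wf.A.out", ".wf.S.in"), (".wf.S.in", ".wf.S.C.in")]
port_map, actor_map, port_table, events = initialize(ports, links)
for row in port_table:
    print(row)
```

The two passes mirror the two "for each port" loops on the slide: connection info can only be built once every port has an entry in the map.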


Execution: Event Handling and Modifications

• validate(): just before run
  – Called before the model is executed
  – Subscribe to token listeners: TokenSend, TokenGet
• changeExecuted(): when the workflow is modified
  – Something has changed in the workflow
  – Re-generate RWS portMap
• Event handling methods are extended here


Execution: During the workflow run

When a token event occurs:

• TokenSendEvent()
  – New RWS event w
  – Print sent token’s info (token id, object id, value)
  – For each connected transparent port: generate virtual TokenGet event
• TokenGetEvent()
  – New RWS event r
  – If it is a transparent port: generate virtual TokenSend event
• Tables written: tokenTable, traceTable, objectTable
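The interplay of real and virtual events can be sketched as below. This is a simplified Python model, not the Kepler listener code: the class `RWSRecorder`, its method names, and the port names are hypothetical, firing counters and the token and object tables are left out, and the connection graph is assumed to be acyclic.

```python
from typing import Dict, List, Set, Tuple

class RWSRecorder:
    """Toy recorder that forwards tokens through transparent ports."""

    def __init__(self, connections: Dict[str, List[str]], transparent: Set[str]):
        self.connections = connections      # port -> ports it feeds
        self.transparent = transparent      # ports of type "v"
        self.trace: List[Tuple[str, str, str]] = []   # (port, event, token)

    def token_send(self, port: str, token: str) -> None:
        self.trace.append((port, "w", token))          # new RWS event w
        for dst in self.connections.get(port, []):
            if dst in self.transparent:
                self.token_get(dst, token)             # virtual TokenGet

    def token_get(self, port: str, token: str) -> None:
        self.trace.append((port, "r", token))          # new RWS event r
        if port in self.transparent:
            self.token_send(port, token)               # virtual TokenSend

# Hypothetical wiring: A.out -> S.in (transparent) -> S.C.in (real).
rec = RWSRecorder({"A.out": ["S.in"], "S.in": ["S.C.in"]}, {"S.in"})
rec.token_send("A.out", "t1")    # real send reported by the workflow engine
rec.token_get("S.C.in", "t1")    # real get reported by the workflow engine
for row in rec.trace:
    print(row)
```

The virtual read and write at `S.in` make the composite boundary visible in the trace, so a later lineage query can either follow the token through `S` or stop at it.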


A Kepler Workflow Implementation

RWS TRACE

Table         # of elements   size in KB
portTable     81              4
tokenTable    30              2
objectTable   30              3
traceTable    86              6


Query 1.a

Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.

Answer a. List of actors that contributed to the result (21 actors). They appear in the reverse of the order in which they were executed.

?- q1b_actors('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ActorList), print(ActorList).

[ .pc.Convert_x, .pc.Slicer_x, .pc.SoftMean, .pc.Reslice3, .pc.Reslice2, .pc.Reslice4, .pc.Reslice1, .pc.AlignWarp3, .pc.RefImg, .pc.RefHdr, .pc.InputHdr3, .pc.InputImg3, .pc.AlignWarp2, .pc.InputHdr2, .pc.InputImg2, .pc.AlignWarp4, .pc.InputHdr4, .pc.InputImg4, .pc.AlignWarp1, .pc.InputImg1, .pc.InputHdr1]


Query 1.b

Answer b. List of intermediate values created by the workflow (26 values).

?- q1b_values('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ValueList), print(ValueList).

["/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage4/atlas-x.pgm",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp3.warp",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy3.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy3.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp2.warp",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy2.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy2.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp4.warp",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy4.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy4.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp1.warp",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.hdr"]
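A lineage query of this kind amounts to a backward traversal of the trace. The Python sketch below shows one way such a traversal could work; it is not a translation of the authors' Prolog predicates, whose definitions are not shown. The function name `lineage`, the port names, the token ids, the shortened file names, and the rule that a written token depends on everything its actor read in the same firing are all assumptions.

```python
from typing import Dict, List, Tuple

TraceRow = Tuple[str, str, str, int]   # (port, event, token, firing counter)

def lineage(value: str,
            port_actor: Dict[str, str],
            token_value: Dict[str, str],
            trace: List[TraceRow]) -> Tuple[List[str], List[str]]:
    """Return (actors, values) that contributed to `value`, most recent first."""
    actors: List[str] = []
    values: List[str] = []
    seen = set()
    pending = [t for t, v in token_value.items() if v == value]
    while pending:
        token = pending.pop(0)
        if token in seen:
            continue
        seen.add(token)
        if token_value[token] not in values:
            values.append(token_value[token])
        for port, event, tok, firing in trace:
            if event == "w" and tok == token:
                actor = port_actor[port]
                if actor not in actors:
                    actors.append(actor)
                # Follow the tokens the same actor read during the same firing.
                for p, e, t, f in trace:
                    if e == "r" and f == firing and port_actor[p] == actor:
                        pending.append(t)
    return actors, values

# Hypothetical three-actor chain using actor names from the workflow.
port_actor = {"in1.out": ".pc.InputImg1",
              "aw1.in": ".pc.AlignWarp1", "aw1.out": ".pc.AlignWarp1",
              "rs1.in": ".pc.Reslice1", "rs1.out": ".pc.Reslice1"}
token_value = {"t1": "anatomy1.img", "t2": "warp1.warp", "t3": "resliced1.img"}
trace = [("in1.out", "w", "t1", 1),
         ("aw1.in", "r", "t1", 1), ("aw1.out", "w", "t2", 1),
         ("rs1.in", "r", "t2", 1), ("rs1.out", "w", "t3", 1)]

actor_list, value_list = lineage("resliced1.img", port_actor, token_value, trace)
print(actor_list)
print(value_list)
```

Because the traversal starts at the result and walks upstream, actors come out in the reverse of their execution order, matching the ordering noted for the answers above.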


Improved PC workflow (cf. COMAD wf)

RWS TRACE

Table         # of elements   size in KB
portTable     42              2
tokenTable    51              3
objectTable   39              4
traceTable    150             9

• A more generic workflow that accepts any number of images
• Smaller number of actors
• This affects the number of values, as it requires additional array operations
• cf. also the COMAD approach and the Taverna approach (but we fire AlignWarp individually here)


Improved PC workflow


Query 1

Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.

Answer a. List of actors that contributed to the result (15 actors). They appear in the reverse of the order in which they were executed.

?- q1b_actors('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ActorList), print(ActorList).

[.pca.Convert, .pca.Slicer, .pca.hdrrepeat, .pca.seqXYZ, .pca.imgrepeat, .pca.SoftMeanArray, .pca.imgarray, .pca.hdrarray, .pca.Reslice, .pca.AlignWarp, .pca.RefHdr, .pca.InputHdr, .pca.InputImg, .pca.RefImg, .pca.Ramp]


Query 1

Answer b. List of intermediate values created by the workflow (33 values). It includes internal data values (arrays) in addition to the original file names.

?- q1b_values('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ValueList), print(ValueList).

["/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage4/atlas-x.pgm",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.hdr",
 "x",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.img",
 { "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img" },
 { "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.hdr",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.hdr",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.hdr",
   "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.hdr" },
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp1.warp",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.hdr",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.img",
 "/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.img",
 1, etc...


Nested workflow: a tricky example

[Figure: nested workflow containing the composite actor S]


The trick

• Multi-port of Ptolemy
  – two distinct channels going into S and out from S
  – A’s output is delivered to S.C
  – B’s output is delivered to S.D
  – S.C’s output is delivered to E
  – S.D’s output is delivered to F


Lineage of actors and values

Who contributed to the value D.2 that arrived at F?

?- q1('"D.2"', ActorList, ValueList).

ActorList = ['.WF15.S.D', '.WF15.S', '.WF15.B']
ValueList = ['"D.2"', '2', '2']

Who contributed to the value C.1 that arrived at E?

?- q1('"C.1"', ActorList, ValueList).

ActorList = ['.WF15.S.C', '.WF15.S', '.WF15.A']
ValueList = ['"C.1"', '1', '1']


Single-level lineage of actors and values

Who contributed to the value D.2 that arrived at F?

?- q1b('"D.2"', ActorList, ValueList).

ActorList = ['.WF15.S', '.WF15.B']
ValueList = ['"D.2"', '2']

Who contributed to the value C.1 that arrived at E?

?- q1b('"C.1"', ActorList, ValueList).

ActorList = ['.WF15.S', '.WF15.A']
ValueList = ['"C.1"', '1']
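The single-level actor lists are consistent with collapsing every nested actor onto its top-level ancestor. The Python sketch below illustrates that idea only; whether q1b is actually computed this way is not stated in the slides, the helper name `collapse_to_level` is hypothetical, and the value lists are not covered.

```python
from typing import List

def collapse_to_level(actors: List[str], depth: int = 2) -> List[str]:
    """Truncate dotted actor names to `depth` components and drop repeats."""
    result: List[str] = []
    for name in actors:
        parts = name.lstrip(".").split(".")
        top = "." + ".".join(parts[:depth])
        if top not in result:
            result.append(top)
    return result

# Actor lists taken from the nested-lineage answers.
print(collapse_to_level(['.WF15.S.D', '.WF15.S', '.WF15.B']))
print(collapse_to_level(['.WF15.S.C', '.WF15.S', '.WF15.A']))
```

With `depth=2` the names keep the workflow and the top-level actor, so `.WF15.S.D` folds into `.WF15.S` and the composite appears only once.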


Conclusions

• First attempt at combining the Kepler PR and Kepler RWS provenance models
  – Both published at IPAW 2006
• Query 1 was successfully answered.
• Queries 2 and 3 are answerable, but have not been implemented yet.
• Queries on multiple runs and workflow design provenance are out of the scope of this initial prototype.
  – Other groups in Kepler are focusing on this.


Some related references

• Provenance Framework/Recorder:
  – Provenance Collection Support in the Kepler Scientific Workflow System, I. Altintas, O. Barney, E. Jaeger-Frank. IPAW 2006, Chicago, Illinois, May 2006.
• RWS Model:
  – A Model for User-Oriented Data Provenance in Pipelined Scientific Workflows, Shawn Bowers, Timothy McPhillips, Bertram Ludaescher, Shirley Cohen, Susan B. Davidson. International Provenance and Annotation Workshop (IPAW'06), Chicago, Illinois, USA, May 3-5, 2006.