![Page 1: on Interactive Notebooks: when IPython meets noWorkflowworkshops.inf.ed.ac.uk/tapp2015/TAPP15_III_2_slides.pdf · on Interactive Notebooks: when IPython meets noWorkflow João Felipe](https://reader033.vdocuments.net/reader033/viewer/2022052500/605aa3c64a89db1acd2e4f88/html5/thumbnails/1.jpg)
Collecting and Analyzing Provenance on Interactive Notebooks: when
IPython meets noWorkflow
João Felipe Nicolaci Pimentel (UFF), Juliana Freire (NYU), Leonardo Murta (UFF),
Vanessa Braganholo (UFF)
Outline
• Motivation
– Exploratory research
– Example
– Interactive Notebooks
– Example
– Provenance Limitations
• Approach
– Provenance Collection
– Provenance Analysis
• Conclusion
Motivation
Exploratory research
[Figure: the exploratory research cycle: Hypothesis → Implement or Change → Execution → Analysis → Hypothesis]
Example of exploratory research
• Analyze precipitation data from Rio de Janeiro
• Hypothesis: “The precipitation for each month remains constant across years”
• Data: 2013, 2014 [BDMEP]

    import numpy as np
    import matplotlib.pyplot as plt
    from precipitation import read, prepare

    def bar_graph(years):
        global PREC, MONTHS
        prepare(PREC, MONTHS, years, plt)
        plt.savefig("out.png")

    MONTHS = np.arange(12) + 1
    d13, d14 = read('p13.dat'), read('p14.dat')
    PREC = prec13, prec14 = [], []

    for i in MONTHS:
        prec13.append(sum(d13[i]))
        prec14.append(sum(d14[i]))

    bar_graph(['2013', '2014'])

1st Iteration
• $ python experiment.py
• $ display out.png
• Analysis: “Drought in 2014”
• New Hypothesis: “The precipitation for each month remains constant across years if there is no drought”
• Data: 2012, 2013, 2014 [BDMEP]
2nd Iteration

    MONTHS = np.arange(12) + 1
    d12 = read('p12.dat')
    d13, d14 = read('p13.dat'), read('p14.dat')
    PREC = prec12, prec13, prec14 = [], [], []

    for i in MONTHS:
        prec12.append(sum(d12[i]))
        prec13.append(sum(d13[i]))
        prec14.append(sum(d14[i]))

    bar_graph(['2012', '2013', '2014'])

• $ python experiment.py
• $ display out.png
• Analysis: “2012 was similar to 2013”
• Cycle continues
Interactive Notebooks
• Documents
– Text, code, plots, rich media
– Share the documents with results
• Most famous:
– IPython Notebook, knitr
• IPython Notebook has more than 500,000 active users
• Good for exploratory research
Example - 1st Iteration
Exploratory research
Interactive Notebooks support each step of the cycle:
• Change: change existing cells; add new ones
• Execution: execute cells
• Analysis: observe the results; use code for analysis
• Hypothesis
Provenance Limitations
1. Which versions of matplotlib, numpy, and the precipitation module is the notebook using? Will it work in another environment?
2. Cell [3] was replaced by [5]. Where did “prec13” and “prec14” come from? What changed from [3] to [5]?
3. What happens inside the cell? What does read('p12.dat') return? How long did each function take to execute?
4. What is the content of 'p12.dat'?
Provenance of Python Scripts
• API (Bochner; Gude; Schreiber, 2008)
– Requires API calls
• StarFlow (Angelino; Yamins; Seltzer, 2010)
– Requires annotations
• Sumatra (Davison, 2012)
– Requires a version control system
• noWorkflow (Murta et al., 2014)
– Transparent collection
• None of them supports notebooks
noWorkflow
• Transparently captures provenance of Python scripts – no changes required!
• Allows users to analyze provenance data
Environment provenance captured for experiment.py:
• Module versions, in import order (numpy, matplotlib, precipitation): 1.9.2, 1.4.3, 1.0.1
• PATH = /home/joao/...
• PYTHON_VERSION = 2.7.6
Execution provenance captured for experiment.py:
• Global values: PREC = [[…], […]]; MONTHS = array(1, …, 12)
• File content before/after (b/a): p13.dat, p14.dat
• Return values: read('p14.dat') -> {1: [7.1, 0.8, 0.0, …], 2: …}
Objective
[Figure: Interactive + Provenance, combining interactive notebooks with noWorkflow]
Approach
Provenance Collection – 1 / 2
Provenance Collection – 2 / 2
Provenance Analysis
• Call trial methods and properties
• Load Trial
• Trial visualization
• Perform SQL queries
• Perform Prolog queries
• Read file content before and after writing
• Advanced Analysis
Call Trial Method
1. Which version of the precipitation module is it using?
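To make the question concrete, here is a minimal sketch of the kind of environment information involved, gathered by hand with the standard library. It is not noWorkflow's API: the function name `environment_snapshot` is hypothetical, and the sketch assumes Python 3.8 or later (for `importlib.metadata`), whereas the slides use Python 2.7.6.

```python
import platform
import sys
from importlib import metadata


def environment_snapshot(distributions):
    """Return interpreter details and installed versions for the given distributions."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python_version": platform.python_version(),
        "executable": sys.executable,
        "modules": versions,
    }


# Hypothetical usage: ask about the two third-party libraries the script imports.
snapshot = environment_snapshot(["numpy", "matplotlib"])
print(snapshot["python_version"], snapshot["modules"])
```

A manual snapshot like this has to be repeated in every notebook and only covers installed distributions, which is the gap transparent collection is meant to close.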
Load Trial
2. What changed from [3] to [5]?
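As a rough illustration of what answering this question involves, the sketch below diffs two versions of a cell with Python's standard `difflib`. It is not how noWorkflow compares trials. It assumes, for illustration only, that cells [3] and [5] hold the main section of the first and second iterations of experiment.py; the helper name `cell_diff` is hypothetical.

```python
import difflib

# Assumed cell contents: first and second iteration of the script's main section.
cell_3 = """\
d13, d14 = read('p13.dat'), read('p14.dat')
PREC = prec13, prec14 = [], []
for i in MONTHS:
    prec13.append(sum(d13[i]))
    prec14.append(sum(d14[i]))
bar_graph(['2013', '2014'])
"""

cell_5 = """\
d12 = read('p12.dat')
d13, d14 = read('p13.dat'), read('p14.dat')
PREC = prec12, prec13, prec14 = [], [], []
for i in MONTHS:
    prec12.append(sum(d12[i]))
    prec13.append(sum(d13[i]))
    prec14.append(sum(d14[i]))
bar_graph(['2012', '2013', '2014'])
"""


def cell_diff(old, new, old_label="In [3]", new_label="In [5]"):
    """Return a unified diff between two versions of a cell as a list of lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


for line in cell_diff(cell_3, cell_5):
    print(line)
```

A textual diff only shows how the code changed. It says nothing about the values, files, or environment behind each execution, which is why the trial objects are loaded and compared instead.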
Provenance Visualization
3. What happened inside the cell?
SQL Queries
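The sketch below shows the general shape of such a query, answering "how long did each function take?" over an in-memory SQLite database. The `activation` table, its columns, and the rows are all hypothetical placeholders invented for this example; they are not noWorkflow's actual schema or real measurements.

```python
import sqlite3

# Hypothetical table of function activations; times are in milliseconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE activation (
        trial_id  INTEGER,
        name      TEXT,
        start_ms  INTEGER,
        finish_ms INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO activation VALUES (?, ?, ?, ?)",
    [
        (1, "read", 0, 400),          # placeholder timings, not measured data
        (1, "read", 400, 750),
        (1, "bar_graph", 800, 1300),
    ],
)


def time_per_function(connection, trial_id):
    """Return (name, number of calls, total duration in ms) for one trial."""
    query = """
        SELECT name, COUNT(*), SUM(finish_ms - start_ms)
        FROM activation
        WHERE trial_id = ?
        GROUP BY name
        ORDER BY name
    """
    return connection.execute(query, (trial_id,)).fetchall()


for name, calls, total_ms in time_per_function(conn, 1):
    print(name, calls, total_ms)
```

Keeping the trial identifier as a query parameter makes it straightforward to run the same aggregation over several executions and compare them.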
Prolog Queries
Read file content
4. What is the content of 'p12.dat'?
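One way to make a file's content recoverable later is to store a copy keyed by its hash before and after the code that touches it. The sketch below illustrates that idea only; it is not noWorkflow's implementation. The names `snapshot` and `tracked` are hypothetical, and the demo uses a throwaway file with placeholder bytes rather than real precipitation data.

```python
import hashlib
import os
import tempfile
from contextlib import contextmanager


def snapshot(path, store):
    """Save the file's bytes under their SHA-1 digest and return the digest.

    Returns None when the file does not exist yet.
    """
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        data = handle.read()
    digest = hashlib.sha1(data).hexdigest()
    store[digest] = data
    return digest


@contextmanager
def tracked(path, store):
    """Record the content of `path` before and after the enclosed block runs."""
    access = {"path": path, "before": snapshot(path, store), "after": None}
    try:
        yield access
    finally:
        access["after"] = snapshot(path, store)


# Demo on a temporary file with placeholder content.
store = {}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.dat")
    with open(path, "wb") as handle:
        handle.write(b"old content")
    with tracked(path, store) as access:
        with open(path, "wb") as handle:
            handle.write(b"new content")

print(store[access["before"]], store[access["after"]])
```

Keying the store by digest means identical content is saved once, however many executions read or write it.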
Advanced Analysis
• Combine:
– Python code
– SQL queries
– Prolog queries
– File content
– External tools
Limitations
• Capture one cell at a time.
– It is necessary to repeat “%%now_run” for every cell
• No Out [x]
– The Out [x] is replaced by the trial object
• No IPython superset
– It is not possible to invoke other special commands
Related Work
• Ducktape (Wibisono et al., 2014)
– Uses the notebook only for interactive provenance visualization
• Lancet (Stevens, Elver, Bednar, 2013)
– Requires the definition of special launchers to capture provenance
– Steep learning curve
Conclusion
• Mechanism to collect and analyze provenance from IPython Notebooks
– Invoke noWorkflow through special functions
– Analytic tools: SQL queries, Prolog queries, object properties, graphs
• noWorkflow tracks history, environment, intermediate results and files
• Reproducible notebook!
Future Work
• New visualization methods for Provenance
– Dependency graph
– Diff visualization
• Integration with Pandas
– Improve data analysis
• Collect provenance from other languages supported by Jupyter
https://github.com/gems-uff/noworkflow
SQL Schema
Prolog Schema
Complex Analysis