dev8d 2011-pipe2 py

Introducing Pipe2Py: Converting Yahoo Pipes to Python Code Original code: Greg Gaughan Additional development: Tuukka Hastrup Based on an original idea by: Tony Hirst, Dept of Communication and Systems, The Open University

Upload: tony-hirst

Post on 06-May-2015


DESCRIPTION

An introduction to pipe2py, a Yahoo Pipes to Python compiler.

TRANSCRIPT

Page 1: Dev8d 2011-pipe2 py

Introducing Pipe2Py: Converting Yahoo Pipes to Python Code

Original code: Greg Gaughan

Additional development: Tuukka Hastrup

Based on an original idea by: Tony Hirst, Dept of Communication and Systems, The Open University

Page 2: Dev8d 2011-pipe2 py

pipes.yahoo.com

Page 3: Dev8d 2011-pipe2 py

But what happens if Yahoo Pipes dies?

Page 4: Dev8d 2011-pipe2 py

Pipe2Py

github.com/ggaughan/pipe2py

Page 5: Dev8d 2011-pipe2 py

Yahoo pipelines are translated into pipelines of Python generators* to give a close match to the original data flow.

* based on ideas by David Beazley http://www.dabeaz.com/generators-uk
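To make the generator idea concrete, here is a minimal sketch of a pipeline built from chained generators. The stage names (`fetch`, `keep_if`, `rename`) and the sample feed are invented for illustration and are not pipe2py modules. The sketch uses Python 3 syntax so it runs on a current interpreter, whereas the code in these slides is Python 2.

```python
# Illustrative only: these stage names are made up and are not pipe2py modules.

def fetch(items):
    """Source stage: yield items one at a time."""
    for item in items:
        yield item

def keep_if(source, predicate):
    """Filter stage: pass along only items that satisfy the predicate."""
    for item in source:
        if predicate(item):
            yield item

def rename(source, old_key, new_key):
    """Transform stage: copy each item with one key renamed."""
    for item in source:
        out = dict(item)
        out[new_key] = out.pop(old_key)
        yield out

# Hypothetical sample data standing in for a fetched feed.
feed = [
    {"title": "Pipe2Py released", "tag": "python"},
    {"title": "Lunch menu", "tag": "food"},
    {"title": "Generators in practice", "tag": "python"},
]

# Wire the stages together in the same order the data flows.
pipeline = rename(
    keep_if(fetch(feed), lambda item: item["tag"] == "python"),
    "title",
    "headline",
)
results = list(pipeline)
```

Each stage takes the previous stage as its input, which is roughly how boxes are wired together in the Yahoo Pipes editor.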

Page 6: Dev8d 2011-pipe2 py

Each Yahoo module is coded as a separate Python module.
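As a rough picture of what one such module might look like, the sketch below follows the `(context, _INPUT, conf=None, **kwargs)` signature that appears in the compiled code later in this deck. The function name `pipe_truncate_sketch` and the flat `conf` layout are assumptions made for illustration; this is not pipe2py's own implementation of any module.

```python
# Hypothetical module (imagine it saved as pipetruncate_sketch.py).
# Not pipe2py's own code; the flat conf layout is a simplification.

def pipe_truncate_sketch(context, _INPUT, conf=None, **kwargs):
    """Yield at most conf['count'] items from _INPUT, then stop."""
    if conf is None:
        conf = {}
    limit = int(conf.get("count", 0))
    for index, item in enumerate(_INPUT):
        if index >= limit:
            break
        yield item

# The context argument is unused in this sketch, so None is passed.
first_two = list(pipe_truncate_sketch(None, iter(["a", "b", "c"]), conf={"count": 2}))
```

Keeping one function per file means a compiled pipe only needs to import the modules it actually uses.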

Page 7: Dev8d 2011-pipe2 py

So what?

So you can use Yahoo Pipes as a graphical rapid prototyping application, and then generate a Python code equivalent you can host yourself.

Page 8: Dev8d 2011-pipe2 py

installation

download code: http://github.com/ggaughan/pipe2py to dev8d/pipes/pipe2py

set path: export PYTHONPATH=dev8d/pipes

Page 9: Dev8d 2011-pipe2 py

dependencies

simplejson*

sudo easy_install simplejson

* only needed for Python pre 2.6

Page 10: Dev8d 2011-pipe2 py

unit tests

in the test directory:

python testbasics.py

Page 11: Dev8d 2011-pipe2 py

compilation - direct from Yahoo Pipes

python compile.py -p pipelineid

generates pipe_pipelineid.py

Page 12: Dev8d 2011-pipe2 py

compilation - from a file

python compile.py pipelinefile.json

generates pipelinefile.py

Page 13: Dev8d 2011-pipe2 py

command line execution

python pipe_pipelineid.py

runs pipe_pipelineid.py

Page 14: Dev8d 2011-pipe2 py

compiled code of the form...

from pipe2py import Context
from pipe2py.modules import *

def pipe_404411a8d22104920f3fc1f428f33642(context, _INPUT, conf=None, **kwargs):
    "Pipeline"
    if conf is None:
        conf = {}

    forever = pipeforever.pipe_forever(context, None, conf=None)

    sw_502 = pipefetch.pipe_fetch(context, forever, conf={u'URL': {u'type': u'url', u'value': u'http://blog.ouseful.info/feed'}})
    _OUTPUT = pipeoutput.pipe_output(context, sw_502, conf={})
    return _OUTPUT

Page 15: Dev8d 2011-pipe2 py

Each call to the final generator ripples back through the pipeline, issuing .next() calls on the previous generator, until the source is exhausted.
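The ripple effect can be observed with a small trace. This is a sketch with invented stage names (`source`, `double`), not pipe2py code. It is written for Python 3, where the Python 2 call `gen.next()` is spelled `next(gen)`.

```python
log = []  # records the order in which the stages do their work

def source():
    """Upstream stage: yields three numbers, logging each one."""
    for n in (1, 2, 3):
        log.append("source yields %d" % n)
        yield n

def double(upstream):
    """Downstream stage: pulls from upstream only when it is itself pulled."""
    for n in upstream:
        log.append("double yields %d" % (n * 2))
        yield n * 2

final = double(source())
log_before_pull = list(log)   # nothing has run yet: generators are lazy

first = next(final)           # one pull on the final generator...
log_after_first = list(log)   # ...causes exactly one pull on the source

rest = list(final)            # keep pulling until the source is exhausted
```

Building the pipeline does no work at all. Work happens only when the consumer asks the last generator for an item, and each request travels back one stage at a time.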

Page 16: Dev8d 2011-pipe2 py

Each item is typically passed through the whole pipeline one at a time, so:

- memory usage is kept to a minimum

- no module is waiting on an earlier module to finish processing the whole data set

- by adding queues between the modules they could easily be made to run in parallel, each on a different CPU, to give great scalability
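The queue idea in the last point could be sketched as follows. This is not something pipe2py provides; `run_stage`, `run_pipeline` and the sentinel are hypothetical helpers written for illustration in Python 3. The sketch uses threads for brevity. Because CPython threads share the global interpreter lock, putting CPU-bound stages on different CPUs would need processes (for example the `multiprocessing` module) with the same queue layout.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def run_stage(func, inbox, outbox):
    """Apply func to each item from inbox and put the result on outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the end marker downstream
            return
        outbox.put(func(item))

def run_pipeline(items, funcs):
    """Run each function in its own thread, linked by bounded queues."""
    queues = [queue.Queue(maxsize=4) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]

    def feed():
        # Feed from a separate thread so the bounded queues cannot
        # block the caller while it is also collecting results.
        for item in items:
            queues[0].put(item)
        queues[0].put(_DONE)

    workers.append(threading.Thread(target=feed))
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results

squares_plus_one = run_pipeline(range(20), [lambda n: n * n, lambda n: n + 1])
```

The bounded queues keep memory use low, in the same spirit as the one-item-at-a-time generators, and item order is preserved because each stage is a single worker reading from a FIFO queue.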

Page 17: Dev8d 2011-pipe2 py

usage - compiled pipe

from pipe2py import Context
import pipe_9dc8014dcfd34c834a960321afde68d9 as p

C = Context()

r = p.pipe_9dc8014dcfd34c834a960321afde68d9(C, None)

for i in r:
    print i
    print i['title']

Page 18: Dev8d 2011-pipe2 py

usage - interpreted pipe

from pipe2py.compile import parse_and_build_pipe
from pipe2py import Context

pipe_def = """json representation of the pipe"""

p = parse_and_build_pipe(Context(), pipe_def)

for i in p:
    print i

Page 19: Dev8d 2011-pipe2 py

usage - user inputs #1: identifying console prompts

context = Context(describe_input=True)

p = pipe_ac45e9eb9b0174a4e53f23c4c9903c3f(context, None)

need_inputs = p
print need_inputs

>>> [(u'0', u'username', u'Twitter username', u'text', u''),
...  (u'1', u'statustitle', u'Status title [string] or [logo] means twitter icon', u'text', u'logo')]

''' That is, tuples of the form (position, name, prompt, type, default) '''
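Once the tuples are available, they can be turned into a dictionary of input values. The helper names below (`defaults_from`, `missing_values`) are hypothetical and not part of pipe2py. The sample list is copied from the output shown on this slide, so the sketch runs without pipe2py installed (Python 3.3 or later still accepts the `u''` prefix).

```python
# Sample data copied from the slide's output; each tuple has the form
# (position, name, prompt, type, default).
need_inputs = [
    (u'0', u'username', u'Twitter username', u'text', u''),
    (u'1', u'statustitle',
     u'Status title [string] or [logo] means twitter icon', u'text', u'logo'),
]

def defaults_from(described_inputs):
    """Map each input name to its default value."""
    return {name: default
            for position, name, prompt, type_, default in described_inputs}

def missing_values(described_inputs):
    """List the names of inputs with an empty default, which must be supplied."""
    return [name for _, name, _, _, default in described_inputs if default == u'']

inputs = defaults_from(need_inputs)
still_needed = missing_values(need_inputs)

# Fill in the one value that has no default.
inputs['username'] = 'greg'
```

A dictionary built this way has the same shape as the `inputs` argument passed to `Context` on the next slide.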

Page 20: Dev8d 2011-pipe2 py

usage - user inputs #2: avoiding console prompts

C = Context(inputs={'username':'greg', 'statustitle':'logo'},
            console=False)
p = pipe_ac45e9eb9b0174a4e53f23c4c9903c3f(C, None)

for i in p:
    print i

Page 21: Dev8d 2011-pipe2 py

Yahoo Pipes modules: Pipe2Py implementation progress

Page 22: Dev8d 2011-pipe2 py

Yahoo Pipes modules: Pipe2Py implementation progress

Page 23: Dev8d 2011-pipe2 py

Yahoo Pipes modules: Pipe2Py implementation progress

Page 24: Dev8d 2011-pipe2 py

;-)

One more thing...

Page 25: Dev8d 2011-pipe2 py

pipes-engine.appspot.com

pipe2py hosting on Google App Engine

Page 26: Dev8d 2011-pipe2 py

How can you help?

- generate working test pipes of increasing complexity

- generate test pipes that don't work

- commit pipe2py patches for test pipes that don't work

Page 27: Dev8d 2011-pipe2 py

How else can you help?

- simplify installation (easy_install?)

- identify a good convention for integrating pipe2py compiled pipes in arbitrary code

- identify a good convention for inserting arbitrary python functions into, or in-between, compiled pipe2py pipelines
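One possible convention for the last point, offered only as a starting point for discussion: because a compiled pipe returns an iterable of items, an ordinary function can be applied between stages by wrapping it in a generator. Everything below is hypothetical, including `apply_between` and the stand-in `fake_compiled_pipe`, and it is written in Python 3.

```python
def apply_between(upstream, func):
    """Yield func(item) for each upstream item; drop items where func returns None."""
    for item in upstream:
        result = func(item)
        if result is not None:
            yield result

def fake_compiled_pipe(context, _INPUT, conf=None, **kwargs):
    """Stand-in for a compiled pipe: yields feed-like dictionaries."""
    for title in ("first post", "second post"):
        yield {"title": title}

def shout(item):
    """An arbitrary user function: upper-case the title of a copied item."""
    out = dict(item)
    out["title"] = out["title"].upper()
    return out

wrapped = apply_between(fake_compiled_pipe(None, None), shout)
titles = [item["title"] for item in wrapped]
```

Since the wrapper is itself a generator, its output could in principle be handed on to a further stage, so the lazy, one-item-at-a-time behaviour is kept.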

Page 28: Dev8d 2011-pipe2 py

Anything else?

the next step: produce an open source front end visual editor?

wireit? pypes?

Page 29: Dev8d 2011-pipe2 py

Anything more else?

generate a ready-to-run instance of a Google App Engine configuration based around a compiled pipe?

Page 30: Dev8d 2011-pipe2 py

Pipe2Py

github.com/ggaughan/pipe2py