streamy, pipy, analyticy
DESCRIPTION
Node.js Streams & Pipes revised for analytics
TRANSCRIPT
![Page 1: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/1.jpg)
Copyright Push Technology 2012
LNUG London
January 2013
![Page 2: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/2.jpg)
About me?
• Distributed Systems / HPC guy.
• Chief Scientist at Push Technology
• Responds to: Guinness, Whisky
• Twitter: @darachennis
![Page 3: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/3.jpg)
Streamy, Pipy, Analyticy
![Page 4: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/4.jpg)
EEP + ‘Streams & Pipes’ = CEP
• An experiment in Embedded Event Processing
• Sliding, Tumbling, Monotonic and Periodic windows
• Separate ‘window’ definition from operation
• Aggregate functions. Window of data produces scalar result
• But? No filtering, branching or combinators, no flows …
• That’s a job for Streams & Pipes. Let’s add that.
eep.js: Functional Operations on Streaming Data Windows
[Diagram: S, C and Q, with windows (w)]
![Page 5: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/5.jpg)
Windows
![Page 6: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/6.jpg)
Windows + Aggregate Functions
• A window of data is a slice of data over time, number of events or some other dimension
• An aggregate function is something you do in the context of a window.
Example: What is this?
• Average – Aggregate Function
• CPU – Data (events)
• On a second by second basis - Periodic time window
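The idea of an aggregate function can be sketched in plain JavaScript. This is a minimal illustration and not the eep.js API: the factory name `makeAverage` and the method name `accumulate` are assumptions, while `init` and `emit` simply echo the labels used on the window diagrams in this deck.

```javascript
// Hypothetical aggregate function: an average over whatever a window feeds it.
// Illustrative only; these names are not taken from eep.js.
function makeAverage() {
  let sum = 0;
  let count = 0;
  return {
    // Reset state when a window opens.
    init() { sum = 0; count = 0; },
    // Fold one event into the aggregate.
    accumulate(value) { sum += value; count += 1; },
    // Produce the scalar result for the window.
    emit() { return count === 0 ? NaN : sum / count; }
  };
}

// Usage: average four CPU samples (made-up numbers).
const cpuAverage = makeAverage();
cpuAverage.init();
[10, 20, 30, 40].forEach((sample) => cpuAverage.accumulate(sample));
console.log(cpuAverage.emit()); // 25
```

The window decides when `init` and `emit` are called; the aggregate only knows how to fold events into a scalar.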
![Page 7: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/7.jpg)
Tumbling Windows
What is a tumbling window?
• Every N events, give me an average of the last N events
• Does not overlap windows
• ‘Closing’ a window ‘emits’ a result (the average)
• Closing a window opens a new window
[Diagram: timeline t0 … t11 with three consecutive, non-overlapping windows of four events each. Each window begins with init(), calls x() once per event, and ends with emit().]
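A tumbling window as described on this slide might look like the sketch below. It is written from the slide's description only; `makeTumblingWindow`, `enqueue` and `accumulate` are hypothetical names and not the eep.js API.

```javascript
// Hypothetical tumbling window over `size` events.
// `aggregate` is assumed to provide init(), accumulate(value) and emit().
function makeTumblingWindow(aggregate, size, onEmit) {
  let seen = 0;
  aggregate.init();
  return {
    enqueue(value) {
      aggregate.accumulate(value);
      seen += 1;
      if (seen === size) {
        onEmit(aggregate.emit()); // closing the window emits a result
        aggregate.init();         // and opens a new, non-overlapping window
        seen = 0;
      }
    }
  };
}

// Illustrative average aggregate (assumed shape, not a library type).
function makeAverage() {
  let sum = 0;
  let count = 0;
  return {
    init() { sum = 0; count = 0; },
    accumulate(value) { sum += value; count += 1; },
    emit() { return sum / count; }
  };
}

const tumblingResults = [];
const tumbling = makeTumblingWindow(makeAverage(), 4, (r) => tumblingResults.push(r));
for (let i = 1; i <= 12; i += 1) tumbling.enqueue(i);
console.log(tumblingResults); // [ 2.5, 6.5, 10.5 ]
```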
![Page 8: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/8.jpg)
Sliding Windows
What is a sliding window?
• Like tumbling, except can overlap.
• But typically O(N²). Keep N small. Except EEP.js: O(N) perf.
• Every event opens a new window.
• After N events, every subsequent event emits a result.
• Like all windows, cost of calculation amortized over events
[Diagram: timeline t0 … t11 with overlapping windows. Each new event opens a window with init(), and x() is applied as further events arrive.]
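For comparison, here is a naive sliding window that recomputes its result from scratch on every event, which is the costly approach the slide warns about. The names `makeNaiveSlidingWindow` and `enqueue` are assumptions for illustration; in this sketch the first result is emitted once the window holds N events.

```javascript
// Naive sliding window: keep the last `size` events and recompute
// the aggregate over the whole buffer each time an event arrives.
function makeNaiveSlidingWindow(size, aggregateFn, onEmit) {
  const buffer = [];
  return {
    enqueue(value) {
      buffer.push(value);
      if (buffer.length > size) buffer.shift(); // drop the oldest event
      if (buffer.length === size) onEmit(aggregateFn(buffer)); // work grows with `size`
    }
  };
}

const sumOf = (values) => values.reduce((acc, v) => acc + v, 0);

const naiveSums = [];
const naive = makeNaiveSlidingWindow(4, sumOf, (r) => naiveSums.push(r));
[1, 2, 3, 4, 5, 6].forEach((v) => naive.enqueue(v));
console.log(naiveSums); // [ 10, 14, 18 ]
```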
![Page 9: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/9.jpg)
Periodic Windows
What is a periodic window?
• Driven by ‘wall clock time’ in milliseconds
• Not monotonic, natch. Beware of NTP
[Diagram: timeline t0, t1, t2, t3 … with three consecutive windows. Each window begins with init(), calls x() once per event, and ends with emit().]
![Page 10: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/10.jpg)
Monotonic Windows
What is a monotonic window?
• Driven mad by ‘wall clock time’? Need a logical clock?
• No worries. Provide your own clock! E.g. a vector clock
[Diagram: timeline t0, t1, t2, t3 … with three consecutive windows, with ‘my’ labels on the timeline. Each window begins with init(), calls x() once per event, and ends with emit().]
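The difference between periodic and monotonic windows comes down to where the time comes from. The sketch below injects the clock as a function, so passing `Date.now` gives wall-clock behaviour and passing your own function gives a logical clock. All names here (`makeClockWindow`, `tick`, `makeCount`) are hypothetical, and the hand-advanced counter stands in for a real logical clock such as a vector clock.

```javascript
// Hypothetical clock-driven window. `clock` is any function returning a number:
// Date.now for wall-clock time, or a caller-supplied logical clock.
function makeClockWindow(aggregate, interval, clock, onEmit) {
  let openedAt = clock();
  aggregate.init();
  return {
    enqueue(value) { aggregate.accumulate(value); },
    // Call tick() regularly; it closes the window once `interval` has elapsed.
    tick() {
      const now = clock();
      if (now - openedAt >= interval) {
        onEmit(aggregate.emit());
        aggregate.init();
        openedAt = now;
      }
    }
  };
}

// Illustrative aggregate that counts events per window.
function makeCount() {
  let n = 0;
  return {
    init() { n = 0; },
    accumulate() { n += 1; },
    emit() { return n; }
  };
}

// A logical clock advanced by hand, so the run is deterministic.
let logicalTime = 0;
const clockCounts = [];
const clocked = makeClockWindow(makeCount(), 10, () => logicalTime, (c) => clockCounts.push(c));
clocked.enqueue('a');
clocked.enqueue('b');
logicalTime = 10; clocked.tick(); // window closes with 2 events
clocked.enqueue('c');
logicalTime = 15; clocked.tick(); // interval not yet elapsed, nothing emitted
logicalTime = 20; clocked.tick(); // window closes with 1 event
console.log(clockCounts); // [ 2, 1 ]
```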
![Page 11: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/11.jpg)
Slide better with Compensating Aggregates
[Diagram: the sliding-window timeline t0 … t11 with overlapping windows opened by init() and updated by x(), annotated with compensate() and do { … } while (…).]
![Page 12: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/12.jpg)
Bad Sliding - O(N²)
![Page 13: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/13.jpg)
Good Sliding
• Takes us from O(N²) to O(N) for Sliding windows
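One way to read "compensating aggregates" is that the aggregate can undo the contribution of the event leaving the window, so each new event costs a constant amount of work instead of a full recomputation. The sketch below shows that idea for a sum. It is an interpretation of the slides, not the eep.js source; `makeCompensatingSum`, `makeSlidingWindow` and `accumulate` are hypothetical names, and only `compensate` appears in the deck.

```javascript
// Hypothetical compensating aggregate: a sum that can subtract an old event.
function makeCompensatingSum() {
  let total = 0;
  return {
    init() { total = 0; },
    accumulate(value) { total += value; },
    compensate(value) { total -= value; }, // undo the event leaving the window
    emit() { return total; }
  };
}

// Sliding window backed by a ring buffer of the last `size` events.
function makeSlidingWindow(aggregate, size, onEmit) {
  const ring = new Array(size);
  let count = 0;
  aggregate.init();
  return {
    enqueue(value) {
      const slot = count % size;
      if (count >= size) aggregate.compensate(ring[slot]); // evict the oldest event
      ring[slot] = value;
      aggregate.accumulate(value);
      count += 1;
      if (count >= size) onEmit(aggregate.emit());
    }
  };
}

const compensatedSums = [];
const slidingSum = makeSlidingWindow(makeCompensatingSum(), 4, (r) => compensatedSums.push(r));
[1, 2, 3, 4, 5, 6].forEach((v) => slidingSum.enqueue(v));
console.log(compensatedSums); // [ 10, 14, 18 ]
```

Not every aggregate can be compensated this cheaply; a maximum, for example, cannot be undone by subtraction.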
![Page 14: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/14.jpg)
EEP.js is fast
![Page 15: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/15.jpg)
Using Sliding, Tumbling Windows
![Page 16: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/16.jpg)
Using Periodic, Monotonic Windows
![Page 17: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/17.jpg)
Custom clocks (notion of time)
![Page 18: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/18.jpg)
EEP.js v0.1, v0.2 were ugly babies.
Sorry! Swear, the next version will be just as functional but pretty…
![Page 19: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/19.jpg)
Streams & Pipes
![Page 20: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/20.jpg)
What about Streams & Pipes?
[Diagram: eep (S, C and Q, with windows w) + ????]
![Page 21: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/21.jpg)
Streams & Pipes: Origins
• Do one thing. Do it well
• Compose sophisticated behaviors from simple parts
• Maximize reuse
• Unix, ‘Chain of Responsibility’ (GoF), Interceptor (POSA2), XPipe, Builder, …
• The ‘Assembly Line Principle’ is nothing new
![Page 22: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/22.jpg)
Streams & Pipes: Node.js
• var events = require('events')
  • Publish/Subscribe to event (streams)
• var stream = require('stream')
  • Readable – Consume a (finite) set of events
  • Writable – Produce a (finite) set of events
  • readable.pipe(writable)
  • writable.pipe(readable) (works only when the piping stream is also readable, e.g. a duplex)
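As a small illustration of `readable.pipe(writable)`, the sketch below pipes a handful of values into a collecting sink using Node's built-in `stream` module. The helper name `collect` is made up for this example, and it assumes a Node.js version recent enough to provide `Readable.from`.

```javascript
const { Readable, Writable } = require('stream');

// Pipe every value from an iterable into a Writable that records what it sees.
function collect(values) {
  return new Promise((resolve, reject) => {
    const seen = [];
    const source = Readable.from(values); // object mode by default
    const sink = new Writable({
      objectMode: true,
      write(chunk, _encoding, callback) {
        seen.push(chunk);
        callback();
      }
    });
    sink.on('finish', () => resolve(seen));
    sink.on('error', reject);
    source.on('error', reject);
    source.pipe(sink);
  });
}

collect(['cpu', 'mem', 'disk']).then((seen) => console.log(seen));
```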
![Page 23: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/23.jpg)
Streams & Pipes: streams2
• Transform – Compress, Encrypt, Encode, …
• Duplex – Readable and Writable
• PassThrough – The canonical ‘noop’ transform
• Node.js Streams history (so far) http://bit.ly/XupqkO - by @izs
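A Transform sits in the middle of a pipe and rewrites what flows through it. The sketch below squares numbers in object mode using Node's built-in `Transform` and `pipeline`; the helper name `squareAll` is hypothetical, and the example assumes a Node.js version that provides `Readable.from` and `stream.pipeline`.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Run values through a Transform that squares each one and collect the output.
function squareAll(values) {
  return new Promise((resolve, reject) => {
    const out = [];
    const square = new Transform({
      objectMode: true,
      transform(chunk, _encoding, callback) {
        callback(null, chunk * chunk); // one value in, one value out
      }
    });
    const sink = new Writable({
      objectMode: true,
      write(chunk, _encoding, callback) {
        out.push(chunk);
        callback();
      }
    });
    // pipeline wires the stages together and reports the first error, if any.
    pipeline(Readable.from(values), square, sink, (err) => {
      if (err) reject(err);
      else resolve(out);
    });
  });
}

squareAll([1, 2, 3]).then((out) => console.log(out));
```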
![Page 24: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/24.jpg)
Streams & Pipes: but …
• Oriented for IO, not compute/analytics
• Array-like buffers not individual datums
• @dominictarr event-stream? Array based
• ASCII, UTF-8, Binary - not JS types
• Often require copying, parsing, … (slow)
• So, streams & pipes for JS types? Yes!
  • Do one thing. Do it well
  • Compose sophisticated behaviors from simple parts
  • Maximize reuse
![Page 25: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/25.jpg)
Introducing Beam.js
![Page 26: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/26.jpg)
Beams, Pipes
• Streams & Pipes for analytics
• Not designed for IO. Use Streams for that
• Not concerned with CEP.
  • … Use EEP for that? :)
• Not concerned with arrays of things
  • … Use Dominic Tarr’s event-stream for that
• Beam
  • Crunch events
  • Pipeline, Branch & Combine
![Page 27: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/27.jpg)
Beams & Pipes.
• Streams & Pipes, reconsidered for JS types
• var Beam = require('beam');
• Beam.Source -- Push data in
• Beam.Sink -- Suck analysis out
• Beam.Operator -- OODA / PDCA
• Really Simple: ~150 LOC
![Page 28: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/28.jpg)
Beams & Pipes: Operators
• Three types of operator
  • Transform
    • 1 in, 1 out. Output data/type may differ
  • Filter
    • 1 in, 1 or none out. Output data/type same as input
  • Custom
    • May transform, filter
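The three operator kinds can be illustrated without any library at all. The sketch below is plain JavaScript and deliberately does not use the beam module, whose actual API is not shown in this transcript; `transform`, `filter`, `pipe` and `halveEvens` are hypothetical helpers.

```javascript
// An operator here is a function (value, push) that may call push zero or more times.
// Hypothetical helpers for illustration; this is not the beam API.
const transform = (fn) => (value, push) => push(fn(value)); // 1 in, 1 out
const filter = (predicate) => (value, push) => {            // 1 in, 1 or none out
  if (predicate(value)) push(value);
};

// Chain operators left to right into a single function that feeds `sink`.
function pipe(operators, sink) {
  return operators.reduceRight(
    (next, operator) => (value) => operator(value, next),
    sink
  );
}

// A custom operator that both filters and transforms: pass even numbers, halved.
const halveEvens = (value, push) => {
  if (value % 2 === 0) push(value / 2);
};

const operatorOutput = [];
const feed = pipe(
  [filter((n) => n > 0), transform((n) => n * 10), halveEvens],
  (value) => operatorOutput.push(value)
);
[-1, 1, 2, 3].forEach(feed);
console.log(operatorOutput); // [ 5, 10, 15 ]
```

Working on plain JavaScript values means no buffers, encoding or parsing between stages, which is the point the earlier slides make about analytics workloads.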
![Page 29: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/29.jpg)
Example: Definitions
![Page 30: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/30.jpg)
Example: Usage
![Page 31: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/31.jpg)
Example: Easy to debug …
![Page 32: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/32.jpg)
Example: Streams & Beams
![Page 33: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/33.jpg)
Branch
• You can define 1 or many
• They can overlap or not as you see fit
• It’s just an application of predicate (boolean) filters
• Simple
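Branching as an application of predicate filters can be sketched as follows. The `branch` helper and its `when`/`to` fields are assumptions made for this example and are not taken from the beam module.

```javascript
// Hypothetical branch helper: each branch pairs a predicate with a destination.
function branch(branches) {
  return (value) => {
    for (const { when, to } of branches) {
      if (when(value)) to(value); // branches may overlap, so no early exit
    }
  };
}

const smallNumbers = [];
const evenNumbers = [];
const route = branch([
  { when: (n) => n < 3, to: (n) => smallNumbers.push(n) },
  { when: (n) => n % 2 === 0, to: (n) => evenNumbers.push(n) }
]);

[1, 2, 3, 4].forEach(route);
console.log(smallNumbers); // [ 1, 2 ]
console.log(evenNumbers);  // [ 2, 4 ]
```

The value 2 lands in both branches because the predicates overlap; 3 matches neither and is dropped.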
![Page 34: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/34.jpg)
Combine?
• You can combine many sources or branches into one
• Works like a union. First in, first out.
• You can write your own. It’s just an Operator
• You can branch from, combine to … any beam
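A union-style combine can be sketched by subscribing one downstream handler to several sources, so events arrive in the order they were pushed. `makeSource` and `combine` are hypothetical names for this illustration and are not the beam API.

```javascript
// Hypothetical push-based source: listeners are called for every pushed value.
function makeSource() {
  const listeners = [];
  return {
    pipe(listener) { listeners.push(listener); return listener; },
    push(value) { listeners.forEach((listener) => listener(value)); }
  };
}

// Combine: subscribe one sink to many sources (a union, first in, first out).
function combine(sources, sink) {
  sources.forEach((source) => source.pipe(sink));
}

const sourceA = makeSource();
const sourceB = makeSource();
const merged = [];
combine([sourceA, sourceB], (value) => merged.push(value));

sourceA.push('a1');
sourceB.push('b1');
sourceA.push('a2');
console.log(merged); // [ 'a1', 'b1', 'a2' ]
```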
![Page 35: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/35.jpg)
Streams & Pipes, ++
• In Node.js the definition and usage of streams in a pipe are entangled.
• Typically, with Streams & Pipes for IO, you only ever want one.
• In algorithms you may want to reuse.
• Think about it …
• Event Emitter. 1 square … 2 branches?
![Page 36: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/36.jpg)
Pipes ++
• Beam Pipes are different (& really really really simple)
• You can define a filter once
• You can store it in a module
• Store like operations together
• Make libraries
• Use ‘em. Share ‘em.
![Page 37: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/37.jpg)
EEP based on Beam soon!
![Page 38: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/38.jpg)
Until then?
• npm install beam
• Filter data events
• Transform data events
• Analyze, crunch all the things
• Branch all the things
• Combine all the things
![Page 39: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/39.jpg)
Beam futures?
• Taps – Convert events into beams
• Drain – Convert beams into events
• Beams
  • Write Beam operators in ‘beam’
  • Beams ‘inside’ beams
  • Source.pipe(op).compile(); // Maybe?
![Page 40: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/40.jpg)
Questions
![Page 41: Streamy, Pipy, Analyticy](https://reader034.vdocuments.net/reader034/viewer/2022051818/549714e4ac7959482e8b51cd/html5/thumbnails/41.jpg)
Questions?
• Thank you for listening to, having me
• Le twitter: @darachennis
• https://github.com/darach/beam-js
• https://github.com/darach/eep-js
• npm install eep
• npm install beam
• EEP built on beam? EEP in other langs? Soon
• Fork it, Port it, Enjoy it!