asynchronous i/o in nodejs - new standard or challenges?
Asynchronous I/O in NodeJS
- Standard and challenge of web programming?
Pham Cong Dinh @ SkunkWorks
@pcdinh
BarCamp Saigon 2011
Follow us on Twitter: @teamskunkworks
Notice
It is not a comprehensive study
Things may change
No standard at the moment
Unix only
Proven control flow models
Single-threaded process model: Prefork MPM (Apache HTTPd)
Multi-threaded process model: Worker MPM (Apache HTTPd), JVM
Emerging control flow models
Coroutines
Coroutines are computer program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations.
Fiber
A lightweight thread of execution. Like threads, fibers share address space. However, fibers use co-operative multitasking while threads use pre-emptive multitasking.
Events: non-blocking I/O
Events / Non-blocking I/O
From http://bethesignal.org/
Request handling:
Process / Threads / Events
Process: A process is forked per request. That process blocks until the response can be produced.
Threads: A process contains multiple threads, and a single thread is assigned to each incoming request (thread-per-request model).
Events: A process handles a series of events. At any instant, a single event/request is being processed, and the process does not block on I/O requests.
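The events model can be illustrated with a toy, in-process event queue. This is only a minimal sketch and not how NodeJS or libev implement their loop; createLoop, emit and run are hypothetical names chosen for illustration. Each "request" handler returns immediately and queues a follow-up event instead of waiting for its "I/O" to finish.

```javascript
// Toy event loop: a queue of events, processed strictly one at a time.
function createLoop() {
  const queue = [];
  const log = [];
  return {
    log: log,
    // Queue an event together with the handler that will process it.
    emit: function (name, handler) {
      queue.push({ name: name, handler: handler });
    },
    // Process events until the queue is empty; handlers may queue more events.
    run: function () {
      while (queue.length > 0) {
        const event = queue.shift();
        log.push(event.name);
        event.handler(this);
      }
    }
  };
}

const loop = createLoop();
// Each request "starts I/O" by queueing a reply event instead of blocking.
loop.emit('request A', function (l) { l.emit('A: db reply', function () {}); });
loop.emit('request B', function (l) { l.emit('B: db reply', function () {}); });
loop.run();
console.log(loop.log.join(' -> '));
// prints: request A -> request B -> A: db reply -> B: db reply
```

Request B is picked up before the reply for request A is handled, which is the interleaving the events model relies on.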
Request handling:
Process / Threads / Events
Threads share
Memory (by default)
File descriptors
Filesystem context
Signals and signal handling
Request handling:
Process / Threads / Events
Thread creation is expensive - From iobound.com
Request handling:
Process / Threads / Events
Context switching is expensive
Request handling:
Process / Threads / Events
Blocking model: Process and Threads
Request handling:
Process / Threads / Events
Non-blocking model: Events
Request handling:
Event-driven IO loop issue
poll vs. epoll
epoll: O(1), or O(n) where n is the number of active descriptors
Request handling:
Event-driven callback issues
Non-blocking IO Loop vs. Blocking callback
Event dispatching is not blocked
Callback can be blocked
Mixing asynchronous code and synchronous code can be bad
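A small experiment can make the blocking-callback issue visible. This is a sketch under stated assumptions: measureTimerDelay is a hypothetical helper, and the busy loop stands in for any synchronous, CPU-bound work done on the event-loop thread.

```javascript
// Schedules a timer due in timerMs, then blocks the thread for blockMs.
// The timer callback reports how long it actually had to wait.
function measureTimerDelay(timerMs, blockMs, report) {
  const start = Date.now();
  setTimeout(function () {
    report(Date.now() - start); // delay actually observed by the callback
  }, timerMs);
  while (Date.now() - start < blockMs) {
    // busy wait: nothing else can be dispatched while this loop runs
  }
}

measureTimerDelay(10, 50, function (elapsed) {
  console.log('timer requested after 10 ms, fired after ' + elapsed + ' ms');
});
```

The timer cannot fire before the blocking code returns, so the observed delay is at least the blocking time rather than the requested 10 ms.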
Events in NodeJS
libev for event loops
libeio for asynchronous file I/O
c-ares for asynchronous DNS requests and name resolution.
evcom (by Ryan Dahl) is a stream socket library on top of libev.
Asynchronous programming model in NodeJS
First-class citizens: higher-order functions / callbacks
Most objects in NodeJS are Event Emitters (http server/client, etc.)
Most low-level functions take callbacks (POSIX API, DNS lookups, etc.)
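The Event Emitter pattern mentioned above can be tried with the EventEmitter class from NodeJS's events module. In this minimal sketch the connection object and its 'data' and 'end' event names are assumptions made for illustration, not a real socket.

```javascript
const EventEmitter = require('events').EventEmitter;

// A hypothetical "connection" object that reports incoming data as events.
const connection = new EventEmitter();
const received = [];

connection.on('data', function (chunk) {
  received.push(chunk);
});
connection.on('end', function () {
  console.log('received: ' + received.join(''));
});

// emit() calls the registered listeners synchronously, in registration order.
connection.emit('data', 'hello ');
connection.emit('data', 'world');
connection.emit('end');
```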
Blocking code
var a = db.query('SELECT A');
console.log('result a:', a);
Non-blocking code using callback
db.query('SELECT A', function(result) {
  console.log('result a:', result);
});
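To see the difference in execution order, the non-blocking form can be run against a stand-in database object. This is a sketch only: db here is a hypothetical fake built on setImmediate, not a real driver, and demo is an illustrative helper.

```javascript
// Hypothetical stand-in for a database client; not a real driver API.
const db = {
  query: function (sql, callback) {
    // Deliver the result on a later turn of the event loop.
    setImmediate(function () {
      callback('rows for ' + sql);
    });
  }
};

function demo(done) {
  const order = [];
  order.push('before query');
  db.query('SELECT A', function (result) {
    order.push('callback: ' + result);
    done(order);
  });
  // This line runs before the callback, because query() returns immediately.
  order.push('after query');
}

demo(function (order) {
  console.log(order.join(' -> '));
});
```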
Asynchronous programming model in NodeJS
Callbacks are hard
Divide things into stages and handle each stage in a callback
Do things in a specific order.
You must keep track of what is done at a point of time
Hard to handle failures
Nested callbacks can be hard to read
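The points about ordering and failure handling can be sketched with two staged operations that follow the Node-style callback(err, value) convention. loadUser, loadEmail and getEmail are hypothetical functions invented for this example.

```javascript
// Stage 1 (hypothetical): look up a user, failing for negative ids.
function loadUser(id, callback) {
  setImmediate(function () {
    if (id < 0) return callback(new Error('invalid id'));
    callback(null, { id: id, name: 'user' + id });
  });
}

// Stage 2 (hypothetical): derive an e-mail address for a user.
function loadEmail(user, callback) {
  setImmediate(function () {
    callback(null, user.name + '@example.com');
  });
}

// The stages must run in order, and every stage has to forward failures by hand.
function getEmail(id, callback) {
  loadUser(id, function (err, user) {
    if (err) return callback(err);
    loadEmail(user, function (err, email) {
      if (err) return callback(err);
      callback(null, email);
    });
  });
}

getEmail(7, function (err, email) {
  console.log(err ? 'failed: ' + err.message : email);
});
```

Forgetting a single `if (err) return callback(err);` line silently drops the failure, which is one reason this style is hard to get right.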
Asynchronous programming model in NodeJS
Nested callbacks can be hard to read
var fs = require('fs'), path = require('path'), url = require('url');

var transferFile = function (request, response) {
  var uri = url.parse(request.url).pathname;
  var filepath = path.join(process.cwd(), uri);
  // check whether the file exists and get the result from the callback
  // (path.exists is the API of the Node versions current at the time of the talk)
  path.exists(filepath, function (exists) {
    if (!exists) {
      response.writeHead(404, {"Content-Type": "text/plain"});
      response.write("404 Not Found\n");
      response.end();
    } else {
      // read the file content and get the result from the callback
      fs.readFile(filepath, "binary", function (error, data) {
        if (error) {
          response.writeHead(500, {"Content-Type": "text/plain"});
          response.write(error + "\n");
        } else {
          response.writeHead(200);
          response.write(data, "binary");
        }
        response.end();
      });
    }
  });
};
Asynchronous programming model in NodeJS
Callback is hard to debug
function f () {
  throw new Error('foo');
}
setTimeout(f, 10000*Math.random());
setTimeout(f, 10000*Math.random());
From which line does the error arise?
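One possible way to answer that question is to record a stack trace at the moment the callback is scheduled. The traced wrapper below is a hypothetical helper written for this sketch, not a NodeJS API, and scheduledAt is an ad hoc property name.

```javascript
// Wraps a callback and remembers where the wrapper was created.
function traced(fn) {
  const origin = new Error('scheduled here'); // stack captured at scheduling time
  return function () {
    try {
      return fn.apply(this, arguments);
    } catch (err) {
      err.scheduledAt = origin.stack; // attach the scheduling site to the failure
      throw err;
    }
  };
}

function f() {
  throw new Error('foo');
}

// Each call site gets its own wrapper, so the two timers become distinguishable:
//   setTimeout(traced(f), 10000 * Math.random());
//   setTimeout(traced(f), 10000 * Math.random());
// Invoked directly here so the failure can be inspected without crashing the process.
try {
  traced(f)();
} catch (err) {
  console.log(err.message);
  console.log(err.scheduledAt);
}
```

The second stack shows the line where traced(f) was called, which the stack of the thrown error alone does not reveal.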
Asynchronous programming model in NodeJS
Flow Control Libraries
Step https://github.com/creationix/step
Flow-JS https://github.com/willconant/flow-js
Node-Promise https://github.com/kriszyp/node-promise
Asynchronous programming model in NodeJS
Flow Control Libraries: Step
Step's goal is to both remove boilerplate code and to improve readability of asynchronous code. The features are easy chaining of serial actions with optional parallel groups within each step.
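var Step = require('step'), fs = require('fs');

Step(
  function readSelf() {
    // 'utf8' makes readFile pass a string rather than a Buffer
    fs.readFile(__filename, 'utf8', this);
  },
  function capitalize(err, text) {
    if (err) throw err;
    return text.toUpperCase();
  },
  function showIt(err, newText) {
    if (err) throw err;
    console.log(newText);
  }
);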
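How serial chaining of this kind can work is sketched below. This is not Step's actual implementation: runSteps and fakeRead are hypothetical names, and parallel groups are left out. Each step receives the callback for the next step as `this`, and a synchronous return value is forwarded as if that callback had been called.

```javascript
// Runs the given functions one after another (illustrative sketch only).
function runSteps() {
  const steps = Array.prototype.slice.call(arguments);
  let index = 0;

  function next(err, value) {
    if (index >= steps.length) {
      if (err) throw err; // no step left to handle the error
      return;
    }
    const step = steps[index++];
    try {
      const result = step.call(next, err, value);
      // A returned value is passed on to the following step.
      if (result !== undefined) next(null, result);
    } catch (e) {
      next(e); // a thrown error becomes the err argument of the next step
    }
  }

  next(null);
}

// Hypothetical asynchronous source standing in for fs.readFile.
function fakeRead(callback) {
  setImmediate(callback, null, 'hello');
}

runSteps(
  function read() { fakeRead(this); },
  function capitalize(err, text) { if (err) throw err; return text.toUpperCase(); },
  function show(err, newText) { if (err) throw err; console.log(newText); }
);
// prints: HELLO
```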
Asynchronous programming model in NodeJS
Flow Control Libraries: Flow-JS
Flow-JS provides a Javascript construct that is something like a continuation or a fiber found in other languages. Practically speaking, it can be used to eliminate so-called "pyramids" from your multi-step asynchronous logic.
dbGet('userIdOf:bobvance', function(userId) {
  dbSet('user:' + userId + ':email', '[email protected]', function() {
    dbSet('user:' + userId + ':firstName', 'Bob', function() {
      dbSet('user:' + userId + ':lastName', 'Vance', function() {
        okWeAreDone();
      });
    });
  });
});
Asynchronous programming model in NodeJS
Flow Control Libraries: Flow-JS
flow.exec(
  function() {
    dbGet('userIdOf:bobvance', this);
  },
  function(userId) {
    dbSet('user:' + userId + ':email', '[email protected]', this.MULTI());
    dbSet('user:' + userId + ':firstName', 'Bob', this.MULTI());
    dbSet('user:' + userId + ':lastName', 'Vance', this.MULTI());
  },
  function() {
    okWeAreDone();
  }
);
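The idea behind waiting for several callbacks before moving on, as this.MULTI() does above, can be sketched with a simple counter. This is not Flow-JS's implementation; parallel and the task functions are hypothetical, and the Node-style callback(err, value) convention is an assumption of this example.

```javascript
// Starts all tasks at once and calls done when every task has called back.
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;

  if (pending === 0) return done(null, results);

  tasks.forEach(function (task, i) {
    task(function (err, value) {
      if (failed) return;               // ignore callbacks after a failure
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = value;               // keep results in task order
      if (--pending === 0) done(null, results);
    });
  });
}

parallel([
  function (cb) { setImmediate(cb, null, 'email saved'); },
  function (cb) { setImmediate(cb, null, 'firstName saved'); },
  function (cb) { setImmediate(cb, null, 'lastName saved'); }
], function (err, results) {
  console.log(err ? 'failed: ' + err.message : results.join(', '));
});
```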
Asynchronous programming model in NodeJS
JavaScript extension: TameJS
Tame (or "TameJs") is an extension to JavaScript, written in JavaScript, that makes event programming easier to write, read, and edit when control-flow libraries are not good enough.
http://tamejs.org/
Asynchronous programming model in NodeJS
JavaScript extension: TameJS
Synchronous code
handleVisit : function(angel, buffy) {
  var match_score = getScore(angel, buffy);
  var next_match = getNextMatch(angel);
  var visit_info = recordVisitAndGetInfo(angel, buffy);
  if (match_score > 0.9 && ! visit_info.last_visit) {
    sendVisitorEmail(angel, buffy);
  }
  doSomeFinalThings(match_score, next_match, visit_info);
}
Asynchronous code
handleVisit : function(angel, buffy) {
  getScore(angel, buffy, function(match_score) {
    getNextMatch(angel, function(next_match) {
      recordVisitAndGetInfo(angel, buffy, function(visit_info) {
        if (match_score > 0.9 && ! visit_info.last_visit) {
          sendVisitorEmail(angel, buffy);
        }
        doSomeFinalThings(match_score, next_match, visit_info);
      });
    });
  });
}
Asynchronous programming model in NodeJS
JavaScript extension: TameJS
TameJS style
handleVisit : function(angel, buffy) {
  //
  // let's fire all 3 at once
  //
  await {
    getScore (angel, buffy, defer(var score));
    getNextMatch (angel, defer(var next));
    recordVisitAndGetInfo (angel, buffy, defer(var vinfo));
  }
  //
  // they've called back, and now we have our data
  //
  if (score > 0.9 && ! vinfo.last_visit) {
    sendVisitorEmail(angel, buffy);
  }
  doSomeFinalThings(score, next, vinfo);
}
Asynchronous programming model in NodeJS
JavaScript's yield
V8/NodeJS does not support yield, and therefore generators, yet
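The slide notes that yield was not available in V8/NodeJS at the time of the talk. Assuming a runtime that does provide generators (later V8 releases do), the sketch below shows how yield could flatten callback code. run is a hypothetical driver written for this example, and it assumes that every yielded value is a function taking a Node-style callback(err, value).

```javascript
// Drives a generator: each yielded function is called with a callback,
// and the generator is resumed with that callback's result.
function run(generatorFunction, done) {
  const gen = generatorFunction();

  function step(err, value) {
    let result;
    try {
      // Resume the generator, either with a value or by throwing into it.
      result = err ? gen.throw(err) : gen.next(value);
    } catch (e) {
      return done(e);
    }
    if (result.done) return done(null, result.value);
    result.value(step); // the yielded function calls step when it finishes
  }

  step();
}

run(function* () {
  // Reads top to bottom like synchronous code, but never blocks.
  const a = yield function (cb) { setImmediate(cb, null, 1); };
  const b = yield function (cb) { setImmediate(cb, null, 2); };
  return a + b;
}, function (err, sum) {
  console.log(err ? 'failed: ' + err.message : 'sum = ' + sum);
});
// prints: sum = 3
```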
The end
Q & A