asynchronous programming done right - node.js
DESCRIPTION
Asynchronous programming done right. Without race conditions. Good practices in Node.js.
TRANSCRIPT
Asynchronous programming done right.
Without race conditions ..ions ..io on..ns ditions.
by Piotr Pelczar (Athlan)
About me
Piotr Pelczar
Freelancer for 8 years
PHP, Node.js, Java/Groovy
Zend Certified Engineer
IPIJ, Startups
Stay in touch
athlan.pl
facebook.com/piotr.pelczar
github.com/athlan
slideshare.net/piotrpelczar
linkedin.com/in/ppelczar
Asynchronous programming
Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing.
How does software live in hardware?
Operating systems are process based
Each process is assigned a processor, registers and memory
How does software live in hardware?
Process parallelism using threads (thread pools)
Switching the processor between processes/threads causes context switching
1. Context switching = wasted time
Sync programming
In the trivial, sequential approach, each operation is executed in order:
O(t) > O(t+1)
If O(t) gets stuck, O(t+1) waits...
Sync programming
This is cool: software flow is predictable.
But not in high throughput I/O.
I/O is costly because of waiting time...
High throughput I/O
High throughput I/O doesn't mean:
Memory operations
Fast single-thread computing
High throughput I/O
High throughput I/O means:
HTTP requests
Database connections
Queue system dispatching
HDD operations
2. Avoid I/O blocking
Single-threaded, event loop model
Imagine a man who has a task:
Walk around
When a bucket is full of water, just pour another bucket
Go to the next bucket
There is no sequence
In async programming, results do not appear in any particular sequence.
    operation1(); // will output "operation1 finished."
    operation2(); // will output "operation2 finished."
    operation3(); // will output "operation3 finished."
There is no sequence
operation1() would be:
    var amqp = require("amqp");
    var eventbus = amqp.createConnection();
    console.log("AMQP connecting...");

    eventbus.on("ready", function() {
      console.log("AMQP connected...");
      callback();
      return;
    });
There is no sequence
operation2() would be:
    var redis = require("redis");
    var conn = redis.createClient(port, host, options);
    console.log("Redis connecting...");

    conn.auth(pass, function(err) {
      if (err)
        console.log("Redis failed...");
      else
        console.log("Redis connected...");
      callback();
      return;
    });
There is no sequence
operation3() would be:
    var mongojs = require("mongojs");

    console.log("Mongo connecting...");
    var conn = mongojs.connect(connectionString); // blocking operation
    console.log("Mongo connected...");

    callback();
    return;
There is no sequence
Expectations?
    AMQP connecting... // operation1()
    AMQP connected... // operation1()
    Redis connecting... // operation2()
    Redis failed... // operation2()
    Mongo connecting... // operation3(), blocking
    Mongo connected... // operation3()
There is no sequence
The result:
    AMQP connecting... // operation1()
    Redis connecting... // operation2()
    Mongo connecting... // operation3(), blocking
    Mongo connected... // operation3()
    Redis failed... // operation2()
    AMQP connected... // operation1()
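The ordering can be reproduced without any real AMQP, Redis or MongoDB server. The sketch below is only a stand-in: the three operations are simulated with setTimeout, the delays (30 ms and 10 ms) are made up, and the say, log and runAll names are hypothetical helpers used to capture the output.

```javascript
// Simulated versions of operation1..3; timers stand in for network I/O.
var log = [];
function say(msg) { log.push(msg); console.log(msg); }

function operation1(callback) {
  say("AMQP connecting...");
  setTimeout(function () {       // non-blocking: finishes after 30 ms
    say("AMQP connected...");
    callback();
  }, 30);
}

function operation2(callback) {
  say("Redis connecting...");
  setTimeout(function () {       // non-blocking: finishes after 10 ms
    say("Redis failed...");
    callback();
  }, 10);
}

function operation3(callback) {
  say("Mongo connecting...");    // blocking: both lines run in the same tick
  say("Mongo connected...");
  callback();
}

// Starts all three operations and calls done(log) once every callback has fired.
function runAll(done) {
  var pending = 3;
  function finished() { if (--pending === 0) done(log); }
  operation1(finished);
  operation2(finished);
  operation3(finished);
}
```

Calling runAll(function () {}) prints the six lines in the order of the result slide: the three "connecting" lines and the blocking "Mongo connected..." come first, and the two timer-based results arrive later, shortest delay first.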
So... what do functions return?
You can start future tasks inside a function, so what will be returned?
"value123" will be returned right after the blocking code, without waiting for the non-blocking operations.
    function my_function() {
      operation1();
      operation2();
      operation3();

      return "value123";
    }
Assume: functions do NOT return values
The function body is executed immediately from top to bottom. You cannot rely on the return value, because it cannot carry a result that is not ready yet.
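A minimal sketch of that rule, assuming a hypothetical fetchValue function in which setTimeout stands in for real I/O: the return value is available at once, while the actual result only arrives through the callback.

```javascript
// fetchValue is a hypothetical function; setTimeout stands in for real I/O.
function fetchValue(done) {
  setTimeout(function () {
    done(null, "async result");   // delivered later, through the callback
  }, 10);
  return "value123";              // returned immediately, before the timer fires
}

var events = [];
var returned = fetchValue(function (err, result) {
  events.push("callback: " + result);
  console.log(events);            // [ 'returned: value123', 'callback: async result' ]
});
events.push("returned: " + returned);
```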
Callbacks
A callback is a reference to a function.
    var callbackFunction = function(result) {
      console.log("Result: %s", result)
    }
When the operation is done, the callback function is executed.
    callbackFunction("test1") // "Result: test1" will be printed out
Callbacks
Since callbackFunction is a variable (value = reference), it can be passed as a function argument.
    var callbackFunction = function() { ... }
    someOtherFunction(callbackFunction);

    function someOtherFunction(callback) {
      callback(); // execute function from argument
    }
Callbacks
Functions can be defined as anonymous (closures)
    function someOtherFunction(callback) {
      var arg1 = "test";
      callback(arg1); // execute function from argument
    }

    someOtherFunction(function(arg1) {
      console.log('done... %s', arg1);
    })
Callbacks can be nested
Nesting callbacks makes code unreadable:
    var amqp = require('amqp');
    var connection = amqp.createConnection();

    connection.on('ready', function() {
      connection.exchange("ex1", function(exchange) {
        connection.queue('queue1', function(q) {
          q.bind(exchange, 'r1');

          q.subscribe(function(json, headers, info, m) {
            console.log("msg: " + JSON.stringify(json));
          });
        });
      });
    });
Callbacks can be nested
Nesting callbacks makes code unreadable:
    var amqp = require('amqp');
    var connection = amqp.createConnection();

    connection.on('ready', function() {
      connection.exchange("ex1", function(exchange) {
        connection.queue('queue1', function(q) {
          q.bind(exchange, 'r1');

          q.subscribe(function(json, headers, info, m) {
            console.log("msg: " + JSON.stringify(json));
            table.update(select, data, function() {
              table.find(select, function(err, rows) {
                // inserted rows...
              });
            });
          });
        });
      });
    });
Asynchronous control flows
Promise design pattern
Libraries that manage callback references
Promise design pattern
1. The client calls a function that will deliver its result in the future, so what it gets back is a promise
2. The function returns the promise object immediately, before the non-blocking operations finish
3. The client registers callbacks
4. The callbacks will be fired in the future, when the task is done
    var resultPromise = loader.loadData(sourceFile)

    resultPromise(function success(data) {
      // this function will be called when the operation succeeds
    }, function error(err) {
      // on fail
    })
Promise design pattern
1. Create deferred object
2. Return def.promise
3. Call resolve() or reject()
    // Assumes the "deferred" npm package, as the deferred() call suggests
    var deferred = require('deferred');
    var spawn = require('child_process').spawn;

    var loadData = function(sourceFile) {
      var def = deferred()
        , proc = spawn('java', ['-jar', 'loadData.jar', sourceFile]);
      // start from empty strings: null += data would yield "null..."
      var commandProcessBuff = ''
        , commandProcessBuffError = '';

      proc.stdout.on('data', function (data) { commandProcessBuff += data });
      proc.stderr.on('data', function (data) { commandProcessBuffError += data });

      proc.on('close', function (code) {
        if ('' !== commandProcessBuffError)
          def.reject(commandProcessBuffError)
        else
          def.resolve(commandProcessBuff)
      });
      return def.promise;
    }
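The same three steps can also be sketched with the Promise built into current Node.js versions, which is a swap for the deferred object used on the slide, not the author's code. Here readConfig and readConfigPromise are hypothetical names, and a timer stands in for the real I/O.

```javascript
// readConfig is a hypothetical callback-style function; the timer stands in for I/O.
function readConfig(name, done) {
  setTimeout(function () {
    if (!name) return done(new Error("name is required"));
    return done(null, { name: name });
  }, 10);
}

function readConfigPromise(name) {
  // 1. create the promise, 2. return it immediately
  return new Promise(function (resolve, reject) {
    readConfig(name, function (err, config) {
      // 3. settle it later, when the operation is done
      if (err) return reject(err);
      return resolve(config);
    });
  });
}

// The client registers callbacks that fire in the future.
readConfigPromise("app").then(
  function success(config) { console.log("loaded", config.name); },
  function error(err) { console.log("failed", err.message); }
);
```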
Async Node.js library
Provides control flows like:
Sequences (series)
Waterfalls (sequences with parameters passing)
Parallel (with limit)
Some/every conditions
While/until
Queue
Async Node.js library
Series
    async.series([
      function(callback) {
        // operation1
      },
      function(callback) {
        // operation2
      },
      function(callback) {
        // operation3
      }
    ], function() {
      console.log('all operations done')
    })
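To make clear what a series flow does, here is a minimal hand-rolled sketch. It is not the async library's implementation; the series helper below is a hypothetical function written only for illustration, and it assumes every task calls its callback exactly once with (err, result).

```javascript
// A minimal hand-rolled "series" control flow (not the async library itself).
// Tasks run one after another; the next one starts only when the previous
// one has called its callback.
function series(tasks, done) {
  var results = [];
  function next(index) {
    if (index === tasks.length) return done(null, results);
    tasks[index](function (err, result) {
      if (err) return done(err, results);   // stop at the first error
      results.push(result);
      return next(index + 1);
    });
  }
  next(0);
}

series([
  function (callback) { setTimeout(function () { callback(null, "one"); }, 20); },
  function (callback) { setTimeout(function () { callback(null, "two"); }, 5); }
], function (err, results) {
  console.log(err, results);                // null [ 'one', 'two' ]
});
```

Even though the second task is faster, it starts only after the first has finished, so the results keep the order of the task list.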
Async Node.js library
Parallel
    async.parallel([
      function(callback) {
        // operation1
      },
      function(callback) {
        // operation2
      },
      function(callback) {
        // operation3
      }
    ], function() {
      console.log('all operations done')
    })
Async Node.js library
Parallel limit
    var tasks = [
      function(callback) {
        // operation1
      },
      function(callback) {
        // operation2
      },
      // ...
    ]

    async.parallelLimit(tasks, 2, function() {
      console.log('all operations done')
    })
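The idea behind a parallel limit can be sketched by hand as well. The parallelLimit function below is a hypothetical helper with the same name as the library call, not the library's code; it assumes limit is at least 1 and that each task calls its callback exactly once.

```javascript
// A hand-rolled "parallel with a limit" control flow (not the async library itself).
// At most `limit` tasks are in flight; results keep the order of the tasks array.
function parallelLimit(tasks, limit, done) {
  var results = new Array(tasks.length);
  var nextIndex = 0, running = 0, finished = 0, failed = false;

  if (tasks.length === 0) return done(null, results);

  function launch() {
    // fill every free slot with the next waiting task
    while (running < limit && nextIndex < tasks.length) {
      start(nextIndex++);
    }
  }

  function start(index) {
    running++;
    tasks[index](function (err, result) {
      if (failed) return;                   // an earlier task already reported an error
      if (err) { failed = true; return done(err); }
      running--;
      finished++;
      results[index] = result;
      if (finished === tasks.length) return done(null, results);
      return launch();
    });
  }

  launch();
}

parallelLimit([
  function (callback) { setTimeout(function () { callback(null, "a"); }, 30); },
  function (callback) { setTimeout(function () { callback(null, "b"); }, 10); },
  function (callback) { setTimeout(function () { callback(null, "c"); }, 10); }
], 2, function (err, results) {
  console.log(err, results);                // null [ 'a', 'b', 'c' ]
});
```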
Async Node.js library
Waterfall
    async.waterfall([
      function(callback) {
        // operation1
        callback(null, arg1, arg2)
      },
      function(arg1, arg2, callback) {
        // operation2
        callback(null, foo, bar)
      },
      function(foo, bar, callback) {
        // operation3
      }
    ], function() {
      console.log('all operations done')
    })
Async Node.js library
Whilst
    async.doWhilst(
      function(done) {
        // operation1
        done(null, arg1, arg2)
      },
      function() {
        return pages < limit
      },
      function() {
        console.log('done')
      })
Asynchronous programming traps
Dealing with callbacks may be tricky. Keep your code clean.
Unnamed callbacks
Keep your code clean: don't name every callback function "callback".
    function doSomething(callback) {
      return callback;
    }
Unnamed callbacks
    function doSomething(callback) {
      doAnotherThing(function(callback2) {
        doYetAnotherThing(function(callback3) {
          return callback();
        })
      })
    }
Unnamed callbacks
Instead of this, name your callbacks:
    function doSomething(done) {
      doAnotherThing(function(doneFetchingFromApi) {
        doYetAnotherThing(function(doneWritingToDatabase) {
          return done();
        })
      })
    }
Double callbacks
    function doSomething(done) {
      doAnotherThing(function (err) {
        if (err)
          done(err);
        done(null, result);
      });
    }
When err is set, the callback is fired twice!
Double callbacks
Fix: always prefix the callback call with a return statement.
    function doSomething(done) {
      doAnotherThing(function (err) {
        if (err)
          return done(err);
        return done(null, result);
      });
    }
Normally, return ends function execution, so why not keep this rule in async code?
Double callbacks
Double callbacks are very hard to debug.
A callback wrapper can be written that executes the callback only once.
    setTimeout(function() {
      done('a')
    }, 200)
    setTimeout(function() {
      done('b')
    }, 500)
Double callbacks
    var CallbackOnce = function(callback) {
      this.isFired = false
      this.callback = callback
    }

    CallbackOnce.prototype.create = function() {
      var delegate = this
      return function() {
        if (delegate.isFired)
          return

        delegate.isFired = true
        delegate.callback.apply(null, arguments)
      }
    }
Double callbacks
    var obj1 = new CallbackOnce(done)

    // decorate callback
    var safeDone = obj1.create() // safeDone() is a proxy function that passes arguments

    setTimeout(function() {
      safeDone('a') // safe now...
    }, 200)
    setTimeout(function() {
      safeDone('b') // safe now...
    }, 500)
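The same guard can be sketched as a plain closure instead of a constructor with a prototype method. The once helper name is an assumption made for this example; it is not taken from the slides.

```javascript
// A closure-based variant of the "fire only once" wrapper.
function once(callback) {
  var fired = false;
  return function () {
    if (fired) return;            // ignore every call after the first
    fired = true;
    return callback.apply(null, arguments);
  };
}

var calls = [];
var safeDone = once(function (value) { calls.push(value); });

safeDone("a");
safeDone("b");                    // ignored
console.log(calls);               // [ 'a' ]
```

A wrapper like this hides the second call rather than fixing its cause, so it is better suited to debugging and defensive boundaries than to replacing the return rule.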
Unexpected callbacks
Never fire the callback until the task is done.
    function doSomething(done) {
      doAnotherThing(function () {
        if (condition) {
          var result = null
          // prepare result...
          return done(result);
        }
        return done(null);
      });
    }
Without the return inside the if block, the final done(null) would be fired even when the condition passes.
Unexpected callbacks
Never fire the callback until the task is done.
    function doSomething(done) {
      doAnotherThing(function () {
        if (condition) {
          var result = null
          // prepare result...
          return done(result);
        }
        else {
          return done(null);
        }
      });
    }
Unexpected callbacks
Never call the callback inside a try clause!
    function (callback) {
      another_function(function (err, some_data) {
        if (err)
          return callback(err);
        try {
          callback(null, JSON.parse(some_data)); // error here
        } catch(err) {
          callback(new Error(some_data + ' is not a valid JSON'));
        }
      });
    }
If the callback itself throws an exception, the catch block calls it a second time!
Unexpected callbacks
Never call the callback inside a try clause!
    function (callback) {
      another_function(function (err, some_data) {
        if (err)
          return callback(err);
        try {
          var parsed = JSON.parse(some_data)
        } catch(err) {
          return callback(new Error(some_data + ' is not a valid JSON'));
        }
        callback(null, parsed);
      });
    }
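The difference between the two versions can be checked with a callback that throws on purpose. This is a small synchronous sketch: parseUnsafe, parseSafe and makeThrowingCallback are hypothetical names, and the asynchronous another_function from the slide is left out to keep the example short.

```javascript
// parseUnsafe calls the callback inside try; parseSafe calls it outside.
function parseUnsafe(text, callback) {
  try {
    callback(null, JSON.parse(text));
  } catch (err) {
    callback(new Error(text + " is not a valid JSON"));
  }
}

function parseSafe(text, callback) {
  var parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return callback(new Error(text + " is not a valid JSON"));
  }
  return callback(null, parsed);
}

// A callback that counts every call and throws when it receives a result.
function makeThrowingCallback(counter) {
  return function (err, data) {
    counter.calls++;
    if (!err) throw new Error("bug inside the callback");
  };
}

var unsafeCounter = { calls: 0 };
parseUnsafe('{"a":1}', makeThrowingCallback(unsafeCounter));
console.log(unsafeCounter.calls); // 2: the catch block called it again

var safeCounter = { calls: 0 };
try {
  parseSafe('{"a":1}', makeThrowingCallback(safeCounter));
} catch (e) {
  // the callback's own exception surfaces here instead of being misreported
}
console.log(safeCounter.calls);   // 1
```

In the unsafe version the valid JSON is also reported as invalid, because the catch block cannot tell a parse error from a bug in the callback.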
Take care of events
Read docs carefully. Really.
    var spawn = require('child_process').spawn;

    function doSomething(done) {
      var proc = spawn('java', ['-jar', 'loadData.jar', sourceFile])
      var procBuff = '';

      proc.stdout.on('data', function (data) {
        procBuff += data;
      });

      // WAT?! Every chunk on stderr fires done(), and 'close' fires it again
      proc.stderr.on('data', function (data) {
        done(new Error("An error occurred: " + data))
      });

      proc.on('close', function (code) {
        done(null, procBuff);
      });
    }
Take care of events
Read docs carefully. Really.
    var spawn = require('child_process').spawn;

    function doSomething(done) {
      var proc = spawn('java', ['-jar', 'loadData.jar', sourceFile])
      var procBuff = '';
      var procBuffError = '';

      proc.stdout.on('data', function (data) {
        procBuff += data;
      });

      // only collect stderr here; the outcome is decided once, on 'close'
      proc.stderr.on('data', function (data) {
        procBuffError += data;
      });

      proc.on('close', function (code) {
        if (code !== 0) {
          return done(new Error("An error occurred: " + procBuffError));
        }
        else {
          return done(null, procBuff)
        }
      });
    }
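A runnable variant of the corrected version, as a sketch under a few assumptions: it spawns the Node.js binary itself (process.execPath) instead of java, so no loadData.jar is needed, and runScript and finish are hypothetical helper names. The extra "error" listener and the finished flag are additions of this sketch, guarding the case where the process cannot be started at all.

```javascript
var spawn = require("child_process").spawn;

// Runs a JavaScript snippet in a child Node process and reports the
// outcome exactly once.
function runScript(source, done) {
  var proc = spawn(process.execPath, ["-e", source]);
  var out = "";
  var errOut = "";
  var finished = false;

  function finish(err, result) {
    if (finished) return;                   // never call done() twice
    finished = true;
    return done(err, result);
  }

  // "data" may fire many times, so these handlers only collect output
  proc.stdout.on("data", function (data) { out += data; });
  proc.stderr.on("data", function (data) { errOut += data; });

  // emitted when the process could not be spawned
  proc.on("error", function (err) { return finish(err); });

  // emitted once, after the process has ended and its streams are closed
  proc.on("close", function (code) {
    if (code !== 0) {
      return finish(new Error("Exited with code " + code + ": " + errOut));
    }
    return finish(null, out);
  });
}

runScript("console.log('hello from child')", function (err, out) {
  console.log(err, out);
});
```

Output on stderr alone is not treated as a failure here; only a non-zero exit code is.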
Unreadable logs
Keep in mind that asynchronous logs will interleave
They are not sequenced
Or there will be identical log strings
Unreadable logs
Logs without context are useless...
    function getResults(keyword, done) {
      http.request(url, function(response) {
        console.log('Fetching from API')
        response.on('error', function(err) {
          console.log('API error')
        })
      });
    }
Unreadable logs
    function getResults(keyword, done) {
      var logContext = { keyword: keyword }
      http.request(url, function(response) {
        console.log(logContext, 'Fetching from API')
        response.on('error', function(err) {
          console.log(logContext, 'API error')
        })
      });
    }
Unreadable logs
Centralize your logs - use Logstash
And make them searchable - Elasticsearch + Kibana
Too many open background tasks
When tasks run in parallel to satisfy a first-better algorithm, the remaining ones should be aborted once a result is accepted.
Too many open background tasks
Provide cancellation API:
    var events = require('events')
    var http = require('http')
    var deferred = require('deferred') // assumed: the "deferred" npm package

    function getResults(keyword) {
      var def = deferred()
      var eventbus = new events.EventEmitter()

      var req = http.request(url, function(response) {
        var err = null
          , content = ''

        response.on('data', function(chunk) {
          content += chunk;
        });
        response.on('close', function() {
          if (err)
            return def.reject(err)
          else
            return def.resolve(content)
        })
        response.on('error', function(e) {
          err = e
        })
      });
      // ... the slide is cut off here; the abort handling and the return value are not shown
    }
Too many open background tasks
Provide cancellation API:
    var response = getResults('test')

    response.result(function success() {
      // ...
    }, function error() {
      // ...
    })

    // if we need to abort
    response.events.emit('abort')
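A self-contained sketch of such a cancellation API, under stated assumptions: slowLookup is a hypothetical task, a timer stands in for the HTTP request, and the built-in Promise replaces the deferred object. Only the shape of the returned object (result plus events) follows the usage shown on the slide.

```javascript
var EventEmitter = require("events").EventEmitter;

// Returns a promise for the result plus an emitter that accepts "abort".
function slowLookup(keyword, delayMs) {
  var events = new EventEmitter();
  var result = new Promise(function (resolve, reject) {
    var timer = setTimeout(function () {
      resolve("result for " + keyword);
    }, delayMs);

    events.once("abort", function () {
      clearTimeout(timer);                  // stop the background work
      reject(new Error("aborted: " + keyword));
    });
  });
  return { result: result, events: events };
}

// First-better: take the first answer, abort the other task.
var fast = slowLookup("fast", 10);
var slow = slowLookup("slow", 1000);
slow.result.catch(function () {});          // the abort rejection is expected

fast.result.then(function (value) {
  console.log(value);                       // result for fast
  slow.events.emit("abort");
});
```

Because the abort handler clears the timer, the process does not keep waiting a full second for a result nobody needs.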
Everything runs in parallel except your code.
While your code is running (not waiting on I/O descriptors), the whole event loop is blocked.
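This can be observed with a timer and a busy loop. In the sketch below, busyWait is a hypothetical helper and the 10 ms and 100 ms figures are arbitrary: the timer is due after 10 ms, but it cannot fire until the synchronous loop has released the single thread.

```javascript
var start = Date.now();
var timerDelay = null;

// Due after 10 ms, but it can only run once the event loop is free again.
setTimeout(function () {
  timerDelay = Date.now() - start;
  console.log("timer fired after " + timerDelay + " ms");
}, 10);

function busyWait(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) {
    // burning CPU: no timers, no I/O callbacks can run meanwhile
  }
}

busyWait(100);
console.log("synchronous code finished after " + (Date.now() - start) + " ms");
```

The timer reports a delay of roughly 100 ms or more instead of 10 ms, which is why long computations should be split up or moved out of the event loop.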
THE END
by Piotr Pelczar
Q&A