Node.js and Cassandra
For highly concurrent systems
Software as a service
Most common scenario
• I/O bound
  o db
  o other services
  o file system
• Low CPU usage
• Peaks and valleys
Why Node.js
Event-based
• Single threaded
• Minimum overhead per connection
• Predictable amount of memory under load
• Apache / IIS vs Nginx: Process-based vs event-loop
Everything runs in parallel except your code
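A minimal sketch of "everything runs in parallel except your code". Timers stand in for db, file and service calls; `fakeIo` and `runAll` are hypothetical names used only for this illustration. The three waits overlap, while the callbacks run one at a time on the single JavaScript thread.

```javascript
// Simulated I/O: a timer stands in for a db, file or network call.
function fakeIo(name, ms, callback) {
  setTimeout(function () { callback(null, name); }, ms);
}

// Starts every task at once and reports the order in which they finished.
function runAll(tasks, done) {
  const started = Date.now();
  const finished = [];
  tasks.forEach(function (task) {
    fakeIo(task.name, task.ms, function (err, name) {
      // Callbacks never interleave: each one runs to completion.
      finished.push(name);
      if (finished.length === tasks.length) {
        done(null, { order: finished, elapsedMs: Date.now() - started });
      }
    });
  });
}

runAll(
  [{ name: 'db', ms: 60 }, { name: 'file', ms: 20 }, { name: 'service', ms: 40 }],
  function (err, result) {
    // Finishes in order of delay: file, service, db.
    console.log(result.order.join(', '), 'after', result.elapsedMs, 'ms');
  }
);
```

The total time should be close to the longest wait rather than the sum of the three, since the waits overlap.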
Event-based: Apache vs Nginx
[Chart comparing Apache and Nginx. Source: webfaction.com]
Async I/O
• Uses OS network interfaces + fixed thread pool
• Most request time is spent waiting on db connections / files
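A small sketch of async I/O using Node's built-in `fs` module, whose file operations run on the fixed thread pool. The `readConfig` helper and the temporary file name are assumptions made for this example. The function returns before the read completes, so the thread is free for other work while the file is read.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: reads a file and records the order of events.
function readConfig(file, callback) {
  const events = ['read requested'];
  fs.readFile(file, 'utf8', function (err, text) {
    if (err) return callback(err);
    events.push('read completed');
    callback(null, { text: text, events: events });
  });
  // Runs before the read completes: the call above does not block.
  events.push('function returned');
}

// Write a small temporary file so the sketch is self-contained.
const file = path.join(os.tmpdir(), 'async-io-sketch-' + process.pid + '.txt');
fs.writeFileSync(file, 'hello');

readConfig(file, function (err, result) {
  if (err) throw err;
  // prints "read requested -> function returned -> read completed"
  console.log(result.events.join(' -> '));
});
```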
JavaScript: Closures
• CPS: Continuation-passing style
• The inner function keeps access to the scope of the outer function
• JavaScript: Functional / Dynamic / Object oriented
• ... package manager / open source community / V8 / ubiquitous...
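A short sketch of closures combined with continuation-passing style. `makeCounter` is a hypothetical example function: the inner function keeps access to `count` and `label` from the outer scope, and hands its result to a callback instead of returning it.

```javascript
function makeCounter(label) {
  let count = 0; // lives in the outer function's scope
  return function increment(callback) {
    count += 1; // the inner function still sees `count` and `label`
    const value = label + ':' + count;
    // CPS: the result goes to the continuation, asynchronously.
    process.nextTick(function () { callback(null, value); });
  };
}

const hits = makeCounter('hits');
hits(function (err, first) {
  hits(function (err, second) {
    console.log(first, second); // prints "hits:1 hits:2"
  });
});
```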
Stream everything
• Avoid buffering
• HTTP: Chunked requests and responses
• TCP: Chunks readable in a stream
• Stream Piping (UNIX like)
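A minimal sketch of UNIX-like piping with Node's built-in `stream` module, assuming a Node version that provides `Readable.from`. The `upperCase` and `collect` helpers are hypothetical. Each chunk flows through the pipeline as it arrives; only `collect` buffers, and only so the sketch can print a single line.

```javascript
const { Readable, Transform, Writable } = require('stream');

// Transform stream: upper-cases each chunk as it passes through.
function upperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    }
  });
}

// Writable stream: gathers the chunks and reports once the input ends.
function collect(done) {
  const parts = [];
  return new Writable({
    write(chunk, encoding, callback) {
      parts.push(chunk.toString());
      callback();
    },
    final(callback) {
      done(parts.join(''));
      callback();
    }
  });
}

Readable.from(['stream ', 'every', 'thing'])
  .pipe(upperCase())
  .pipe(collect(function (text) {
    console.log(text); // prints "STREAM EVERYTHING"
  }));
```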
The Driver for Cassandra
Features
• Connection pooling to multiple hosts
• Load balancing
• Automatic failover / retry
• Row and field streaming
• Queuing: Concurrent connecting / preparing
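To illustrate load balancing and failover, here is a simplified sketch and not the driver's actual implementation. `createPool`, `fakeSend` and the host names are hypothetical. Hosts are picked round-robin, and a failed request is retried on the next host until every host has been tried once.

```javascript
// Hypothetical pool: `send(host, query, callback)` performs one request.
function createPool(hosts, send) {
  let next = 0;
  return function execute(query, callback) {
    let attempts = 0;
    (function tryNext(lastError) {
      // Give up once every host has been tried for this query.
      if (attempts === hosts.length) return callback(lastError);
      const host = hosts[next];
      next = (next + 1) % hosts.length; // round-robin load balancing
      attempts += 1;
      send(host, query, function (err, result) {
        if (err) return tryNext(err); // fail over to the next host
        callback(null, result);
      });
    })(null);
  };
}

// Fake transport for the sketch: host "b" is always down.
function fakeSend(host, query, callback) {
  process.nextTick(function () {
    if (host === 'b') return callback(new Error(host + ' is down'));
    callback(null, host + ' ran ' + query);
  });
}

const execute = createPool(['a', 'b', 'c'], fakeSend);
execute('q1', function (err, r1) {
  execute('q2', function (err, r2) {
    console.log(r1, '|', r2); // prints "a ran q1 | c ran q2"
  });
});
```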
[Diagram: App connected to Cassandra nodes A to H]
Sample: JSON Web API
app.get('/user/:id', function (req, res, next) {
  var query = 'SELECT * FROM users WHERE id = ?';
  cassandra.executeAsPrepared(query, [req.params.id], function (err, result) {
    if (err) return next(err);
    var row = result.rows[0];
    // No matching user: respond with 404 instead of failing on an undefined row
    if (!row) return res.status(404).json({error: 'User not found'});
    // Response: expose some properties of the user
    res.json({id: req.params.id, name: row.get('name')});
  });
});
How row streaming works
[Diagram: Socket (Readable stream) -> chunks -> Protocol (Transform stream) -> header and body chunks -> Parser (Transform stream) -> rows -> Client]
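The idea of turning raw socket chunks into rows can be sketched with a Transform stream. This is not the driver's parser. The real protocol is binary, so newline-delimited JSON is used here as a stand-in, and `rowParser` is a hypothetical name. A chunk may end in the middle of a row, so the incomplete tail is kept until the next chunk arrives.

```javascript
const { Transform } = require('stream');

// Hypothetical parser: byte chunks in, row objects out.
function rowParser() {
  let pending = ''; // text received so far that does not yet form a full row
  return new Transform({
    readableObjectMode: true,
    transform(chunk, encoding, callback) {
      pending += chunk.toString();
      const lines = pending.split('\n');
      pending = lines.pop(); // keep the incomplete tail for the next chunk
      for (const line of lines) {
        if (line) this.push(JSON.parse(line)); // emit each complete row
      }
      callback();
    },
    flush(callback) {
      // Emit a final row that arrived without a trailing newline.
      if (pending) this.push(JSON.parse(pending));
      callback();
    }
  });
}

const parser = rowParser();
parser.on('data', function (row) {
  console.log('row', row.id, row.name);
});
// Chunk boundaries deliberately fall in the middle of a row.
parser.write('{"id":1,"name":"a');
parser.write('da"}\n{"id":2,');
parser.end('"name":"bob"}\n');
```

Each row reaches the client as soon as its last byte arrives, without waiting for the whole result to be buffered.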
Sample: Field streaming
app.get('/user/:id/image', function (req, res, next) {
  var query = 'SELECT id, profile_image FROM users WHERE id = ?';
  cassandra.streamField(query, [req.params.id], function (err, row, image) {
    if (err) return next(err);
    // Pipe the image stream to the response stream
    image.pipe(res);
  });
});
Sample: Field streaming + image resizing
app.get('/user/:id/image', function (req, res, next) {
  var query = 'SELECT id, profile_image FROM users WHERE id = ?';
  cassandra.streamField(query, [req.params.id], function (err, row, image) {
    if (err) return next(err);
    // Pipe the image stream through a resizer stream and on to the response.
    // `resizer` is a transform stream assumed to be created elsewhere.
    image.pipe(resizer).pipe(res);
  });
});
Moving forward
Next features
• Multiple data center support
• Cassandra query tracing
Contribute! :)
Jorge Bay Gondra
@jorgebg
github.com/jorgebay/node-cassandra-cql
npm install node-cassandra-cql
Thanks!
References and further reading
• A Design Framework for Highly Concurrent Systems, by Matt Welsh, Steven D. Gribble, Eric A. Brewer, and David Culler (UC Berkeley)
• Concurrency is not Parallelism (it's better), by Rob Pike (Google, Go)
• How the single threaded non blocking IO model works in Node.js