nosql options compared
NoSQL Options Compared: Different Horses for Different Courses
The current world of NoSQL
The NoSQL model
NoSQL classes
Which model should you use
Best use (& use cases comparison)
Choosing the Right Horse
Lessons learned from actual use
The current world of NoSQL
RDBMS: 105+ databases
NoSQL: 122+ databases
Forecast: NoSQL market expected to reach $3.4 Billion by 2018
NoSQL market revenue $14 Billion over 2013 – 2018
RDBMS are great and ... will be fine
Recap: RDBMS are great
SQL
ACID
Well understood by developers
Well supported by frameworks and tools
Backups, tuning, recovery support
RDBMS are great, but...
Difficult to handle relational schema
Schema changes
Difficult to scale writes
Vertical scaling is limited
Horizontal scaling is limited
Bridging the NoSQL/SQL divide
Interest in using NoSQL technology has reemerged:
Oracle released MySQL + NoSQL memcached plugin
More:
Neo4j announced JDBC (SQL) driver
Cassandra + CQL
CouchBase + UnQL
NoSQL model promotes
Schema-free approach
Flexible data models
No unneeded complexity
Strict data consistency might be unnecessary
Large data volumes
High throughput
Over RDBMS expensive in performance
BASE (not ACID)
Eventually consistent
Simple API
NoSQL models
Available Databases
Key Value Stores: 31%
Document Databases: 10%
Graph Databases: 13%
Column-Oriented Databases: 9%
XML Databases: 6%
Object Databases: 10%
Other (NoSQL related, un-resolved, uncategorized): 21%
NoSQL models
Column Oriented Store - each storage block contains data from only one column
Document Store - stores documents made up of tagged elements
Key Value Store - hash table mapping keys to values
Graph Database - stores data in the nodes and relationships of a graph
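The four models above can be sketched with plain Python structures. This is only an illustration under stated assumptions: the "user follows user" data set, the field names, and the helper functions `followed_by` and `cities` are hypothetical, and no real database API is used.

```python
# One hypothetical data set expressed in the four models.
# Plain Python structures stand in for the stores.

# Key Value Store: opaque values looked up by key.
key_value = {
    "user:1": '{"name": "Ann", "city": "Kyiv"}',
    "user:2": '{"name": "Bob", "city": "Lviv"}',
}

# Document Store: each document is made up of tagged (named) elements.
documents = [
    {"_id": 1, "name": "Ann", "city": "Kyiv", "follows": [2]},
    {"_id": 2, "name": "Bob", "city": "Lviv", "follows": []},
]

# Column Oriented Store: each block holds the values of a single column.
columns = {
    "name": ["Ann", "Bob"],
    "city": ["Kyiv", "Lviv"],
}

# Graph Database: data lives in nodes and in the relationships between them.
nodes = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
relationships = [(1, "FOLLOWS", 2)]


def followed_by(user_id):
    """Names of the users that `user_id` follows, read from the graph."""
    return [
        nodes[dst]["name"]
        for src, rel, dst in relationships
        if src == user_id and rel == "FOLLOWS"
    ]


def cities():
    """Scan one column without touching the others."""
    return sorted(set(columns["city"]))
```

The same facts are reachable in every model; what changes is which access path is cheap (lookup by key, whole-document fetch, single-column scan, or relationship traversal).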
NoSQL candidates to compare
Key Value / Tuple Store
Document Store
Graph Database
Wide Column Store / Column Families Database
Redis
open source, networked, in-memory, key-value, optional durability, written in ANSI C
CouchBase
open source, high-performance, map/reduce, schema-free, document-oriented, written in C, C++ and Erlang
Neo4j
open source, embedded or server, disk-based, transactional, data stored in graphs, written in Java
Cassandra
open source, no single point of failure, column family store, tunable consistency, written in Java
Comparison
General Information | Redis | CouchBase | Neo4j | Cassandra
Language | C | C, C++, Erlang | Java | Java
Commercial Support | Third party companies | Consulting & Support with Enterprise | Neo4j Advanced, Neo4j Enterprise | DataStax, Impetus, Acunu, Riptano, Cubet Technologies
Customers | GitHub, Guardian Media Group | Zynga, AOL, BBC | Adobe, Cisco, StudiVZ, Deutsche Telekom, Fanbox | Twitter, Digg, Reddit, Rackspace, Facebook
Licenses | New BSD | Community & Enterprise Licenses | GPL or AGPLv3 | Apache License 2
Comparison
Queries & Operations | Redis | CouchBase | Neo4j | Cassandra
Client Libraries | Bindings + REST/HTTP | non-vBucket or v-Bucket | Bindings + REST/HTTP | Thrift based
C | + | + | n/a | +
C++ | n/a | + | n/a | +
C# | + | + | + | +
Java | + | + | + | +
Perl | + | + | + | +
PHP | + | + | n/a | +
Python | + | + | + | +
Ruby | + | + | + | +
Querying | - | - | - | -
Secondary Indexes | n/a | + | - | +
Map/Reduce | n/a | n/a | n/a | Hadoop supported
ACID transactions | n/a | + | + | +/- (tunable consistency)
Best Use
Redis CouchBase Neo4j Cassandra
Real-time systems where low latency is critical (games)
Syncing online and offline data (allows synchronization and sharing of data and applications across multiple platforms and mobile devices)
Cloud/network management
Managing large streams of non-transactional data: Apache logs, application logs, etc.
High performance caching tier for web sites and other applications
Social and online gaming
Social, geospatial data
Consistent, fast response times under writes (high volume writes)
Server for backed sessions or transient data
Data management layer for recommendation engine
Bioinformatics
Real-time analytics & statistics
Service offering some real-time statistics
Highly available solution
Which model should you use?
Column Oriented Store
Document Store
Key Value Store
Graph Database
More specific: which NoSQL database?
Answer
This depends on your case!
Compare your problems to others
Evaluate characteristics of NoSQL storage:
Maturity
Connectivity/Querying/Operations
More aspects to consider
How big is your data
Massive read/write throughput
Fast key-value access
No single point of failure
Tunable Brewer's CAP trade-offs: Consistency, Availability, Partition Tolerance
Maintainability, Administration
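The tunable CAP trade-off mentioned above can be made concrete with a little arithmetic. The sketch below assumes a Dynamo-style replicated store (Cassandra is one example) with N replicas, where a write waits for W acknowledgements and a read waits for R; a read is guaranteed to overlap the latest acknowledged write when R + W > N. The function names are hypothetical and are not part of any database API.

```python
def quorum(replicas):
    """Smallest majority of `replicas` nodes."""
    return replicas // 2 + 1


def reads_see_latest_write(replicas, write_acks, read_acks):
    """True when every read set overlaps every write set (R + W > N)."""
    return read_acks + write_acks > replicas


def tolerated_failures(replicas, acks):
    """How many replicas may be down while an operation needing `acks` still succeeds."""
    return replicas - acks
```

With three replicas, quorum reads and writes (2 + 2 > 3) give consistent reads while tolerating one failed node per operation; dropping both to a single acknowledgement (1 + 1 > 3 is false) favours availability and latency at the cost of possibly stale reads.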
Lessons learned from actual use
Start small, but significant
It is not “one size fits all”, but “horses for courses”
Consider a Hybrid Approach (NoSQL + RDBMS)
Lessons learned from actual use: Hybrid Approach
Diagram: a Business Facade in front of a NoSQL database and an RDBMS
Two Databases: NoSQL + RDBMS
Key Value Storage for Session Data + RDBMS for User Data
Column Storage for Reporting Data + RDBMS for User Data
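The first pairing (key-value storage for session data, RDBMS for user data) could be hidden behind a business facade roughly as follows. This is a minimal sketch, not the presenters' implementation: the class `BusinessFacade` and its methods are hypothetical, a Python dict stands in for the key-value store, and an in-memory SQLite database stands in for the RDBMS.

```python
import sqlite3


class BusinessFacade:
    """Hypothetical facade: callers never see which store holds the data."""

    def __init__(self):
        # A dict stands in for the key-value store holding session data.
        self._sessions = {}
        # An in-memory SQLite database stands in for the RDBMS holding user data.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def register_user(self, name):
        """Durable user data goes to the relational store."""
        with self._db:  # commits on success, rolls back on error
            cur = self._db.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return cur.lastrowid

    def login(self, user_id, token):
        """Transient session data goes to the key-value store."""
        self._sessions[token] = user_id

    def current_user(self, token):
        """Resolve a session token to a user name, touching both stores."""
        user_id = self._sessions.get(token)
        if user_id is None:
            return None
        row = self._db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None
```

Because callers only use the facade, either stand-in could later be swapped for a real store without changing business code, which is the point of the hybrid layout.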