(No)SQL

Radu [email protected]

http://vunvulearadu.blogspot.com

{
  "name" : "Radu Vunvulea",
  "company" : "iQuest",
  "userType" : "enthusiastic",
  "technologies" : [ ".NET", "JS", "Azure", "Web", "Mobile", "SL" ],
  "blog" : "vunvulearadu.blogspot.com",
  "email" : "[email protected]",
  "socialMedia" : {
    "twitter" : "@RaduVunvulea",
    "fb" : "radu.vunvulea"
  }
}

Who am I?

In the early 1980s, relational databases began to be defined. One of the proponents of relational database theory was Edgar F. Codd, who published 13 rules that set out to define a relational database. This was the beginning of the formalized scientific groundwork done to lay down specific rules for the existence of the relational aspects of a database.

Source: http://www.ehow.com

http://www.ehow.com/about_5437121_properties-relational-database.html#ixzz2RIvQmJAs



Relevant rules

• Relational facilities
• Information is represented in only one way
• All data must be accessible
• All views that are theoretically updatable must be updatable by the system
• Insert, Update, Delete for any retrieval sets

• Where does this name come from?
• Non-relational
• Web-scale database

NoSQL

Any database that is not a Relational Database!

It is as simple as that

What is NoSQL?

• Non-Relational Database
• But it is too long
• It is not so cool
• This name would not have caught on

…so we are back to

NoSQL

A better name would be

• More and more connections between data
• Everything is linked to something more… and more… and so on
• Hyperlinks
• Tags
• RSS
• RDF
• Attributes
• User content

Database trends – 1 Connections

• From a flat architecture (one App on one DB)
• To a coupled one (several Apps sharing one DB)
• And now to a decoupled one based on services (each App with its own DB)

Database trends – 2 Architecture

• Since Web 2.0, data no longer has a fixed structure (it is more flexible)

• How many phone numbers could a person have in 1970? And now…

Database trends – 3 No fixed structure
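A minimal sketch of what "no fixed structure" can look like in practice, assuming records are kept as schema-less documents (plain Python dicts here). The record contents and the helper `phone_numbers` are hypothetical and only illustrate the phone-number question above.

```python
# Hypothetical records: field names and values are illustrative only.
person_1970 = {"name": "Ana", "phone": "555-0100"}

person_now = {
    "name": "Ana",
    "phones": {"home": "555-0100", "mobile": "555-0101", "work": "555-0102"},
    "skype": "ana.example",
}


def phone_numbers(person):
    """Return every phone number in a record, whatever shape the record has."""
    numbers = []
    if "phone" in person:  # old shape: exactly one number
        numbers.append(person["phone"])
    # new shape: any number of labelled numbers, possibly none
    numbers.extend(person.get("phones", {}).values())
    return numbers


print(phone_numbers(person_1970))
print(phone_numbers(person_now))
```

Both records can live side by side without a schema migration; a relational table would need either a new column per kind of number or a separate phone table.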

• 2006 – 160
• 2008 – 390
• 2010 – 998
• 2012 – 2000+

• The first column is the year
• The second column is in exabytes (EB); 1 EB = 1,000,000 terabytes (TB)

Database trends – 4 Data Size

What we need

Relational Database Performance

RDBMS performance

So, we end up with

Non-Relational Database

1. Key-Value
2. Document
3. Big Table
4. Graph DB

Categories of NoSQL Databases

• Designed to handle massive load
• Can scale to massive amounts of data
• Based on Key-Value collections
• Dynamic ring partitioning
• Dynamic replication

• Ex.: Dynomite

Key-Value
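The ring partitioning and replication bullets can be sketched as follows. This is a simplified illustration of consistent hashing, not the implementation of any particular product; the class `HashRing`, the node names, and the in-memory `store` are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a stable position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy ring: each node owns the key range that ends at its position."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.ring = sorted((ring_position(node), node) for node in nodes)

    def nodes_for(self, key):
        """Walk clockwise from the key's position and pick `replicas` nodes."""
        positions = [position for position, _ in self.ring]
        start = bisect.bisect_right(positions, ring_position(key))
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


node_names = ["node-a", "node-b", "node-c"]  # hypothetical machines
ring = HashRing(node_names, replicas=2)
store = {name: {} for name in node_names}    # one dict per machine


def put(key, value):
    for node in ring.nodes_for(key):         # write to every replica
        store[node][key] = value


def get(key):
    for node in ring.nodes_for(key):         # first replica holding the key wins
        if key in store[node]:
            return store[node][key]
    return None


put("user:42", {"name": "Radu"})
print(ring.nodes_for("user:42"), get("user:42"))
```

Because a key's position depends only on its hash, adding a machine to such a ring takes over part of one neighbour's range instead of reshuffling every key.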

• Like a column-oriented relational database, but with a twist

• Tables similar to RDBMS, but handle semi-structured data

• Based on Google’s BigTable paper
• Data model:
  • Columns – column family -> ACL
  • Datums keyed by row, column, time, index
  • Row-range – table -> distribution

• Ex.: Cassandra

Big Table
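The data model above (values keyed by row, column, and time, with rows kept in key order so ranges can be distributed) can be sketched as a nested map. `TinyBigTable` and the sample rows are hypothetical; this is a toy model of the idea, not the API of Cassandra or of Google's BigTable.

```python
class TinyBigTable:
    """Toy model: row key -> "family:qualifier" -> list of (timestamp, value)."""

    def __init__(self):
        self.rows = {}

    def put(self, row, column, value, timestamp):
        cells = self.rows.setdefault(row, {}).setdefault(column, [])
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)  # newest first

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        for cell_time, value in self.rows.get(row, {}).get(column, []):
            if timestamp is None or cell_time <= timestamp:
                return value
        return None

    def scan(self, start_row, end_row):
        """Return row keys in sorted order with start_row <= key < end_row."""
        return [key for key in sorted(self.rows) if start_row <= key < end_row]


table = TinyBigTable()
table.put("com.example/index", "content:html", "<html>v1</html>", timestamp=1)
table.put("com.example/index", "content:html", "<html>v2</html>", timestamp=2)
table.put("com.example/about", "anchor:home", "About us", timestamp=1)

print(table.get("com.example/index", "content:html"))
print(table.get("com.example/index", "content:html", timestamp=1))
print(table.scan("com.example/", "com.example/b"))
```

Rows need not share the same columns, which is how this model handles semi-structured data; a contiguous row range is the unit that a real system would hand to one server.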

• Similar to a Key-Value pair, but the DB knows what the Value is

• Inspired by Lotus Notes
• Data model: collections of Key-Value collections

• Documents are often versioned

• Ex.: MongoDB

Document Database
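A small sketch of the two ideas above: the database can look inside the value, and documents are often versioned. `TinyDocumentStore` and its methods are hypothetical names for this illustration and do not mirror the MongoDB API.

```python
import copy


class TinyDocumentStore:
    """Toy store: key -> list of document versions (oldest first)."""

    def __init__(self):
        self.docs = {}

    def save(self, key, document):
        """Append a new version and return its 1-based version number."""
        self.docs.setdefault(key, []).append(copy.deepcopy(document))
        return len(self.docs[key])

    def load(self, key, version=None):
        versions = self.docs[key]
        return versions[-1] if version is None else versions[version - 1]

    @staticmethod
    def _matches(actual, wanted):
        # A list field matches when it contains the wanted value.
        if isinstance(actual, list):
            return wanted in actual
        return actual == wanted

    def find(self, **criteria):
        """Return keys whose latest version matches every field=value pair."""
        return [
            key
            for key, versions in self.docs.items()
            if all(
                self._matches(versions[-1].get(field), wanted)
                for field, wanted in criteria.items()
            )
        ]


db = TinyDocumentStore()
db.save("radu", {"name": "Radu", "technologies": [".NET", "Azure"]})
db.save("radu", {"name": "Radu", "technologies": [".NET", "Azure"], "company": "iQuest"})
db.save("ana", {"name": "Ana", "technologies": ["JS"]})

print(db.find(technologies="Azure"))
print(db.load("radu", version=1))
```

A plain Key-Value store could only return the whole value for a key; here the query runs against fields inside the value.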

• Focuses on modeling the structure of data: the interconnectivity

• Scales with the complexity of data
• Inspired by mathematical graph theory
• Data model:
  • Property graph -> nodes
  • Relationships/edges between nodes
  • Key-Value pairs on both
  • Possible edge labels and/or node/edge types

• Ex.: Neo4j

Graph Database
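The property-graph model above can be sketched in a few lines: nodes and edges both carry Key-Value properties, and edges carry a label. `PropertyGraph`, the `FRIEND` label, and the sample people are assumptions for the example; this is not the Neo4j API.

```python
class PropertyGraph:
    """Toy property graph with labelled, directed edges."""

    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = []  # (source, label, target, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        self.edges.append((source, label, target, properties))

    def neighbors(self, node_id, label):
        """Return targets of edges that leave `node_id` with the given label."""
        return [
            target
            for source, edge_label, target, _ in self.edges
            if source == node_id and edge_label == label
        ]


def friends_of_friends(graph, start):
    """Two-hop traversal: friends of friends who are not already friends."""
    direct = set(graph.neighbors(start, "FRIEND"))
    found = set()
    for friend in direct:
        for candidate in graph.neighbors(friend, "FRIEND"):
            if candidate != start and candidate not in direct:
                found.add(candidate)
    return sorted(found)


graph = PropertyGraph()
graph.add_node("alice", age=30)
graph.add_node("bob", age=31)
graph.add_node("carol", age=29)
graph.add_edge("alice", "FRIEND", "bob", since=2010)
graph.add_edge("bob", "FRIEND", "carol", since=2012)

print(friends_of_friends(graph, "alice"))
```

The same question in a relational schema needs a self-join per hop, which is why graph stores are said to scale with the complexity of the data rather than its size.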

• Not part of the NoSQL community
• Still a good solution for a lot of problems
• Focuses on matching the OOP paradigm
• Easy to use
• Simple to integrate

• Neither gaining nor losing traction

Object Database

• Easy to deploy
• No OS management
• Scaling
• Monitoring
• Publish from different source controls
• Support for different technologies (PHP, node.js, .NET)
• Low-cost support – shared mode
• Reserved mode – dedicated instance
• Each site runs in an isolated environment

Web Sites

Scaling

[Figure: Key-Value, BigTable, Document, and Graph databases positioned by data size against data complexity]

• REST
• GQL (SQL-like)
• SPARQL
• Gremlin
• APIs

How to query it?

• Replication
  • Write to many
  • Master/Slave replication

• Master reelection
• Failover
  • Either another machine takes over
  • Or the client knows to switch machines

Availability
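A minimal sketch of master/slave replication with failover, under simplifying assumptions: writes are copied synchronously to every live node, and "reelection" just promotes the first live node. Real systems use a proper election or consensus protocol; `Node` and `Cluster` are hypothetical names for this illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class Cluster:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.master = self.nodes[0]

    def elect_master(self):
        """Promote the first live node (a stand-in for a real election)."""
        live = [node for node in self.nodes if node.alive]
        if not live:
            raise RuntimeError("no live nodes left")
        self.master = live[0]

    def write(self, key, value):
        if not self.master.alive:
            self.elect_master()  # failover before accepting the write
        self.master.data[key] = value
        for node in self.nodes:  # replicate to every live slave
            if node.alive and node is not self.master:
                node.data[key] = value

    def read(self, key):
        if not self.master.alive:
            self.elect_master()
        return self.master.data.get(key)


cluster = Cluster(["a", "b", "c"])
cluster.write("price", 10)
cluster.master.alive = False  # simulate the master crashing
print(cluster.read("price"), cluster.master.name)
```

Here the cluster object plays the part of the client that knows which machine to use; in the other failover style, a replacement machine takes over the master's address so clients notice nothing.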

• Most NoSQL databases sacrifice consistency
• Some NoSQL databases don’t have transactions
• Only single operations are atomic
• Because of this, some operations are impossible to implement

Correctness
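To illustrate why atomic single operations are not enough for some tasks, here is a hedged sketch of a money transfer built from two separate atomic increments. The `accounts` dict, `increment`, and `SimulatedCrash` are hypothetical stand-ins; the dict plays the role of a store that offers no multi-operation transactions.

```python
class SimulatedCrash(Exception):
    """Stands in for the client or server dying between two operations."""


accounts = {"alice": 100, "bob": 0}


def increment(key, amount):
    """Stands in for one atomic operation offered by the store."""
    accounts[key] += amount


def transfer(source, target, amount, crash_midway=False):
    increment(source, -amount)  # atomic on its own
    if crash_midway:
        raise SimulatedCrash("failed between the two operations")
    increment(target, amount)   # atomic on its own


try:
    transfer("alice", "bob", 30, crash_midway=True)
except SimulatedCrash:
    pass

# The debit happened, the credit did not, and nothing rolls it back.
print(accounts, sum(accounts.values()))
```

With a transaction, both increments would commit or neither would; without one, the application has to detect and repair the half-finished transfer itself.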

• NoSQL is the Batman
• Durability is sacrificed
  • On-disk durability
  • Multiple-replicas durability

Performance

One solution for all our problems?

• Why
  • Dynamic queries
  • Content is stored as documents
  • Big databases that need to be very fast

• Where
  • Properties can be queried and indexed
  • Can be used for voting systems, CMS, or comment storage

MongoDB

• Why
  • When you make a lot of updates and inserts
  • Reading data is not the main purpose of the database (writes are faster than reads)
  • Content is stored as columns
  • High availability

• Where
  • Can be used successfully for logging
  • The financial industry, or any place where a lot of data needs to be written
  • Basket of an e-commerce application

Cassandra

• Why
  • For data that doesn’t change very often (insert and read, NOT update)
  • We have a lot of predefined queries and we need versioning support
• Where
  • It is a great database for CMS and CRM

CouchDB

• Why
  • When you do data analysis

• Where
  • Works great in combination with Hadoop

HBase

• Why
  • When we need high concurrency
  • When latency must be kept to a minimum
• Where
  • Backend of a game or a system that offers data in real time

Membase

• Why
  • When we need to make a lot of updates
  • When the database is not too big and can be kept in memory
• Where
  • Can be used for real-time communication, for example a stock market with prices

Redis

• Facebook
  • HBase – Facebook messages
  • Scribe – Real-time click logs
  • Hive – SQL queries -> MapReduce jobs
  • Hadoop
    • Web analytics warehouse
    • Distributed datastore
    • MySQL backup

Examples

• Twitter
  • Hadoop – Analytics
  • HBase – People search
  • Scribe – Log collection framework
  • FlockDB – Social graph analysis

Examples

Question

Answers

THE END
