Try NoSQL it doesn't hurts and is fun

Pere Urbón-Bayes, Moviepilot GmbH. @purbon [email protected]. NoSQL. Thursday, 30 June 2011

Upload: pere-urbon-bayes

Post on 15-May-2015

DESCRIPTION

This presentation is a review of the NoSQL space that I gave at the X Jornades de Programari Lliure in Barcelona. It covers the NoSQL movement, its use cases, a technology review, a special look at graph databases, and more. Special thanks to @Hagenburger, @sbitxu, and @jannis, and to the inspiration of the great @jimwebber and the amazing community.

TRANSCRIPT

Page 1: Try NoSQL it doesn't hurts and is fun

Pere Urbón-Bayes

Moviepilot GmbH

@purbon [email protected]

NoSQL

Thursday, 30 June 2011

Page 2: Try NoSQL it doesn't hurts and is fun

What are we going to talk about?

Where are we, and where do we come from?

NoSQL. { “motivation” : “use cases” }

Graph databases.

....

Page 3: Try NoSQL it doesn't hurts and is fun

{

"if_you":{

"are_the_master_of": [ "movies", "data analytics", "ruby", "git", "nosql" ],

"love":"recommendation systems",

"would_love_to_know_about":"graph_databases",

"believe_in":"open source"

},

"join_us":"true",

"contact_with":"jobs at moviepilot.com"

}

Moviepilot is a leading provider and discovery service for movies and series based in Berlin!

Page 4: Try NoSQL it doesn't hurts and is fun

History: coming and going

1960s: navigational databases.

1970: relational databases.

Edgar Codd's relational algebra.

Late 1970s: SQL DBMSs.

SQL, DB2, Ingres, PostgreSQL, Sybase, ...

Page 5: Try NoSQL it doesn't hurts and is fun

Page 6: Try NoSQL it doesn't hurts and is fun

Page 7: Try NoSQL it doesn't hurts and is fun

Where are we now?

[Figure: timeline with axis marks at 1990, 2000, 2010, and 2030. Labels: RDBMS, text files, blogs, tagging, folksonomies, social networks, RDF, Semantic Web, Linked Data, business intelligence.]

Is everything related?

Page 8: Try NoSQL it doesn't hurts and is fun

Where are we now?

[Figure: timeline with axis marks at 1980, 1990, 2000, 2010, and 2020.]

Page 9: Try NoSQL it doesn't hurts and is fun

What are our apps like?

Data warehousing and business intelligence.

Stream processing.

Text search.

Scientific processing.

Semi-structured and unstructured data.

Page 10: Try NoSQL it doesn't hurts and is fun

What are our apps like?

Need to scale horizontally.

Partitioning and replication.

OLTP and OLAP.

Web 2.0.

Performance, performance, performance.

Flexibility.

Big, even huge, datasets.

.....?

Page 11: Try NoSQL it doesn't hurts and is fun

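NoSQL

select fun, profit from real_world where relational=false and barcelona=true;

Carlo Strozzi, 1998: first use of the name “NoSQL”.

Eric Evans (Rackspace) and Johan Oskarsson (Last.fm), early 2009: the name is revived for non-relational, distributed data stores.

Conferences: no:sql(east) 2009, no:sql(eu) 2010.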

Page 12: Try NoSQL it doesn't hurts and is fun

NoSQL

Ability to scale horizontally.

Replication and distribution.

Weaker concurrency model.

Smart use of resources.

Access through different endpoints.

Dynamic schema environment.

Leave more business logic to the application side.
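
Horizontal scaling with replication and distribution can be sketched roughly as follows. This is only an illustration, assuming simple hash-based placement; the node names, the `nodes_for` helper, and the replica count are hypothetical and not taken from any particular NoSQL product.

```python
import hashlib


def nodes_for(key, nodes, replicas=2):
    """Pick the nodes that hold `key`: a primary chosen by hashing the key,
    plus the next `replicas - 1` nodes in list order (wrapping around)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members
owners = nodes_for("user:42", nodes)    # primary first, then one replica
```

Every client that hashes the same key finds the same nodes, so no central lookup table is needed. A real system would also have to handle nodes joining and leaving, which this sketch ignores.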

Page 13: Try NoSQL it doesn't hurts and is fun

Page 14: Try NoSQL it doesn't hurts and is fun

Unstructured?

[Figure: building-block analogy. Step labels: Store, Dismantle, Rebuild, Enjoy. Part labels: Brick, Window, Roof. Axis: Unstructured to Structured.]

Page 15: Try NoSQL it doesn't hurts and is fun

ACID

Atomicity: All operations are executed or none is.

Consistency: Data is consistent after the transaction.

Isolation: Transactions are independent.

Durability: Changes persist, even after failures.

Helps: Understanding the data. Guaranteed persistence.

Hurts: Horizontal scaling. High availability.
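
Atomicity is the easiest of the four properties to see in code. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `accounts` table, the names, and the `transfer` helper are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst: both updates are applied or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # first update is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer the debit from the first `UPDATE` is undone, so the balances are those left by the successful transfer only.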

Page 16: Try NoSQL it doesn't hurts and is fun

“There is a magic bullet! It's called relaxing the requirements.”

- Evan Weaver, @evan

Page 17: Try NoSQL it doesn't hurts and is fun

CAP

Consistency: Each client has the same view.

Availability: All clients can read and write.

Partition tolerance: Works well across different network partitions.

[Figure: triangle with corners C, A, and P, with mysql, redis, and riak placed on it.]

Only two!

Page 18: Try NoSQL it doesn't hurts and is fun

“You have database problem. You research blog and HN. You start use NoSQL product. Now you not know anymore if you have problem.”

- Devops BORAT, @devops_borat

Page 19: Try NoSQL it doesn't hurts and is fun

NoSQL systems

Most common

Column DBs.

Document DBs.

Key-Value DBs.

Graph DBs.

Object DBs.

Other systems

XML Databases

Grid Databases.

RDF.

....

Page 20: Try NoSQL it doesn't hurts and is fun

Column Databases

A DBMS that stores its content by column rather than by row. This has advantages for data warehouses.

More efficient for aggregates and when the data is column oriented.

Suited for OLAP, and less so for OLTP.

First implementations: early 1970s.
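
The difference between row and column storage can be sketched with plain Python structures. The records below are invented sample data, and the layout is only an illustration of the idea, not how any specific column store lays out its files.

```python
# Row-oriented layout: one record per entry, all fields together.
rows = [
    {"title": "movie-1", "year": 2009, "rating": 8.5},
    {"title": "movie-2", "year": 2010, "rating": 7.0},
    {"title": "movie-3", "year": 2011, "rating": 8.0},
]

# Column-oriented layout: one list per field, values stored side by side.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row layout: the aggregate has to walk every record and pick out one field.
avg_from_rows = sum(row["rating"] for row in rows) / len(rows)

# Column layout: the aggregate reads a single contiguous list.
avg_from_columns = sum(columns["rating"]) / len(columns["rating"])
```

Both averages are the same; the column layout only changes how much unrelated data has to be read to compute them, which is why it favours OLAP-style aggregates over record-at-a-time OLTP access.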

Page 21: Try NoSQL it doesn't hurts and is fun

Apache Cassandra

Designed to handle very large amounts of data spread across many commodity servers.

High availability with no single point of failure (SPOF).

Born at Facebook, to power Inbox Search.

Hybrid system, between column-oriented and row-oriented.

Initial release: 2008. Version 0.8.1: 28/06/11.

Page 22: Try NoSQL it doesn't hurts and is fun

Key-Value Databases

Allow the user to store key-value pairs, where the key usually consists of a string and the value is a simple primitive.

Suited for use cases where properties and values are enough, e.g. profiles, logs, etc.

Variants: eventually consistent, hierarchical, multivalued, etc.

First implementations: around 1980.
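
The key-value model is small enough to sketch in a few lines. The `KeyValueStore` class and the keys below are hypothetical, written only for this illustration; they do not mirror the API of any real product.

```python
class KeyValueStore:
    """Toy in-memory key-value store: string keys, simple values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys are strings in this sketch")
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the key and report whether it was present."""
        existed = key in self._data
        self._data.pop(key, None)
        return existed


store = KeyValueStore()
store.put("profile:1:name", "Ada")   # hypothetical keys and values
store.put("profile:1:logins", 3)
name = store.get("profile:1:name")
missing = store.get("profile:2:name", "unknown")
```

There is no query language here: everything is a lookup by key, which is what makes this model easy to partition across machines.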

Page 23: Try NoSQL it doesn't hurts and is fun

Redis.io

Open-source, networked, in-memory, persistent, journaled, key-value datastore.

Bindings for the major languages.

The data structure storage system.

Master-slave replication. High performance.

Initial release: 2009. Version 2.2.7: 11/05/11.

Page 24: Try NoSQL it doesn't hurts and is fun

Document Databases

A DBMS where the default unit of storage is a document: XML, JSON, YAML, ...

More complex than a key-value store.

Suited for multi-document apps: news, CVs, ...

Eventual consistency, limited atomicity and isolation.

One of the first: Lotus Notes, 1989.
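
A rough sketch of the document model, assuming JSON documents kept in a plain Python list. The sample documents and the `find` helper are invented for the illustration and are not the query API of any real document database.

```python
import json

# Each document is self-describing; documents need not share the same fields.
raw_documents = [
    '{"type": "news", "title": "NoSQL meetup", "tags": ["nosql", "barcelona"]}',
    '{"type": "cv", "name": "Jane Doe", "skills": ["ruby", "git"]}',
    '{"type": "news", "title": "Graph databases", "author": "Jane Doe"}',
]
collection = [json.loads(text) for text in raw_documents]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


news = find(collection, type="news")
by_author = find(collection, author="Jane Doe")
```

Unlike a key-value store, the database can look inside the value, so queries on fields such as `type` or `author` are possible even though no schema was declared.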

Page 25: Try NoSQL it doesn't hurts and is fun

OrientDB

Open source database written in Java.

Schema-[full,less,mix] modes.

Supports SQL, ACID compliance, HTTP, REST, and JSON. Distributed and scalable.

Lightweight and embeddable. Bindings for most languages.

Initial release: 2010. Version 1.0rc2: 17/06/11.

Page 26: Try NoSQL it doesn't hurts and is fun

Graph Databases

A database that uses graph structures with nodes, edges, and properties.

Suited for associative datasets. Maps well to the structure of object-oriented apps. Avoids expensive joins.

Powerful for graph operations such as shortest paths, community detection, etc.

First implementations: around 2007.
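
As an illustration of the shortest-path operation mentioned above, here is a minimal breadth-first search over an unweighted graph held as a dictionary of neighbour lists. The `friends` data and the `shortest_path` helper are hypothetical; graph databases ship their own traversal APIs.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search: returns a path with the fewest edges, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # visited nodes and how each was reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical "follows" graph; edges are directed.
friends = {
    "ana": ["ben", "cal"],
    "ben": ["dan"],
    "cal": ["dan"],
    "dan": ["eve"],
    "eve": [],
}
path = shortest_path(friends, "ana", "eve")
```

Each hop follows stored neighbour references directly, which is the sense in which a graph store avoids the joins a relational schema would need for the same question.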

Page 27: Try NoSQL it doesn't hurts and is fun

Graph Databases

Page 28: Try NoSQL it doesn't hurts and is fun

What is a graph?

Graph G(V, E), where V = {v1, v2, ..., vN} is the set of vertices and E = {e1, e2, ..., eM} is the set of edges.

Directed / Undirected

Mixed

Multigraph

Weighted
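
One common way to hold G(V, E) in memory is an adjacency structure. The sketch below builds it for the directed and the undirected reading of the same edge list; the vertex names and the `adjacency` helper are assumptions made for the example.

```python
V = {"v1", "v2", "v3", "v4"}
E = [("v1", "v2"), ("v2", "v3"), ("v3", "v1"), ("v3", "v4")]


def adjacency(vertices, edges, directed=True):
    """Map each vertex to the set of vertices reachable over one edge."""
    neighbours = {v: set() for v in vertices}
    for source, target in edges:
        neighbours[source].add(target)
        if not directed:
            # An undirected edge can be walked in both directions.
            neighbours[target].add(source)
    return neighbours


directed_graph = adjacency(V, E, directed=True)
undirected_graph = adjacency(V, E, directed=False)
```

A multigraph would need a list or counter instead of a set, and a weighted graph would store a weight alongside each neighbour; both are left out to keep the sketch short.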

Page 29: Try NoSQL it doesn't hurts and is fun

Page 30: Try NoSQL it doesn't hurts and is fun

Graph Databases: The Property Graph

Abstractions

Nodes and relationships.

Properties on both.

Example: John Smith liked http://www.example.com on 01/10/11
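
The example above can be written down as a tiny property graph. The `Node` and `Relationship` classes are hypothetical helpers for this sketch, not the data model of any particular graph database, and the date is kept as the literal string from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


# Nodes carry properties...
john = Node(1, {"name": "John Smith"})
page = Node(2, {"url": "http://www.example.com"})

# ...and so does the relationship that connects them.
liked = Relationship(john, page, "LIKED", {"at": "01/10/11"})
relationships = [liked]

# "What did John like?" becomes a filter over his outgoing relationships.
liked_urls = [
    r.end.properties["url"]
    for r in relationships
    if r.start is john and r.type == "LIKED"
]
```

The timestamp lives on the relationship itself, which is the part that a plain G(V, E) graph cannot express.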

Page 31: Try NoSQL it doesn't hurts and is fun

Graph Databases: Applications

Task planning

Scheduling

Process assignment

Routing

Logistics

League planning

Pattern Recognition

Dependency analysis

Impact analysis

Network flow

Traffic analysis and optimization

Delivery optimization

Optimization of tasks

Page 32: Try NoSQL it doesn't hurts and is fun

Graph Databases: Applications

Recommendations

Heuristics (PageRank)

Local

Shortest Paths

Hammock Functions

Walks

Search algorithms

Shooting stars

K-nearest neighbors
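
Of the items above, PageRank is short enough to sketch. This is a plain power-iteration version under the usual textbook assumptions (damping factor 0.85, dangling nodes spread their rank evenly); the `links` graph and the `pagerank` function are made up for the example.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power iteration: each node repeatedly shares its rank with its targets."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in adjacency.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by every other node.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
```

The ranks always sum to 1, and the node with the most incoming links, `c`, ends up with the highest score.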

Page 33: Try NoSQL it doesn't hurts and is fun

Graph Databases: Applications

Semantic web

RDF (OWL) Store

RDF-Sail

SPARQL

Linked data (Open Data)

Link analysis

Structure mining

Page 34: Try NoSQL it doesn't hurts and is fun

Graph Databases: Vendors

Neo4j: Open source NoSQL graph database.

HyperGraphDB: An AI and semantic web graph database.

InfoGrid: The Internet graph database.

Sones: SaaS .NET graph database.

OrientDB: The document-graph DB.

FlockDB: The Twitter graph DB.

Pregel: Graph processing at Google.

Page 35: Try NoSQL it doesn't hurts and is fun

Page 36: Try NoSQL it doesn't hurts and is fun

Page 37: Try NoSQL it doesn't hurts and is fun

Demo time

Page 38: Try NoSQL it doesn't hurts and is fun

Thanks! Questions?

Pere Urbón-Bayes

Moviepilot GmbH

@purbon [email protected]