orientdb - the 2nd generation of (multi-model) nosql

Luigi Dell’Aquila Director of Consulting Orient Technologies LTD Twitter: @ldellaquila http://www.orientdb.com OrientDB - the 2nd generation of (Multi-Model) NoSQL And why GraphDB are the starting point of this revolution

Upload: luigi-dellaquila

Post on 16-Jul-2015


TRANSCRIPT

Page 1: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Luigi Dell’Aquila

Director of Consulting

Orient Technologies LTD

Twitter: @ldellaquila

http://www.orientdb.com

OrientDB - the 2nd generation of

(Multi-Model) NoSQL

And why GraphDBs are the starting point of this revolution

Page 2: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

“90% of the data

in the world today

has been created

in the last two years alone.”

- IBM

Welcome to Big Data

Page 3: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Just Data

Order #134 (Order)

Frank (Customer)

John (Provider)

Bruno (Provider)

Commodore Amiga 1200 (Product)

Monitor 40” (Product)

Mouse (Product)

Page 4: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Just Data

Data by itself has little value; it’s the relationships between data that give it incredible value

Page 5: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Relationships give data “meaning”

Vertices: Order #134 (Order), Frank (Customer), John (Provider), Bruno (Provider), Commodore Amiga 1200 (Product), Monitor 40” (Product), Mouse (Product)

Edges: (Makes) ×1, (Has) ×3, (Sells) ×3

Page 6: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Top NoSQL categories

Key/Value Databases

Document Databases

Graph Databases

Column Databases

Page 7: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Top NoSQL categories

Key/Value Databases

Document Databases Graph Databases

Column Databases

Page 8: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Why do most NoSQL products

avoid

managing relationships?

Page 9: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow

Is this familiar?

Page 10: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

What’s wrong with JOIN?

Page 11: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Index Lookup: how does it work?

Imagine an Address Book where we want to find Luigi’s phone number

A-Z → A-L | M-Z

Page 12: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Index Lookup: how does it work?

Most index algorithms are similar and based on balanced trees

A-Z → A-L | M-Z

A-L → A-D | E-L

M-Z → M-R | S-Z

Page 13: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Index Lookup: how does it work?

A-Z → A-L | M-Z

A-L → A-D | E-L

M-Z → M-R | S-Z

A-D → A-B | C-D

E-L → E-G | H-L

Page 14: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Index Lookup: how does it work?

A-Z → A-L | M-Z

A-L → A-D | E-L

M-Z → M-R | S-Z

A-D → A-B | C-D

E-L → E-G | H-L

E-G → E-F | G

H-L → H-J | K-L

Page 15: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

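Index Lookup: how does it work?

A-Z → A-L | M-Z

A-L → A-D | E-L

M-Z → M-R | S-Z

A-D → A-B | C-D

E-L → E-G | H-L

E-G → E-F | G

H-L → H-J | K-L

Luigi

Found! This lookup took 5 steps. Tree depth grows with the logarithm of the number of indexed records: a balanced binary tree over a million records is about 20 levels deep, and that cost is paid again on every lookup.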
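To put a rough number on the cost of one index lookup, here is a minimal Python sketch. It uses a binary search over a sorted range as a stand-in for a balanced tree index; the function name `lookup_steps` is hypothetical and this is not how OrientDB or any specific RDBMS implements its indexes.

```python
import math

def lookup_steps(sorted_keys, target):
    """Binary search; returns (found, number_of_probes)."""
    lo, hi = 0, len(sorted_keys)
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1                      # one probe = one tree level visited
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1                # continue in the right half
        else:
            hi = mid                    # continue in the left half
    return False, steps

for n in (1_000, 1_000_000):
    keys = range(n)                     # already sorted, supports indexing
    found, steps = lookup_steps(keys, n - 1)
    # Upper bound on probes for a binary search over n keys
    bound = math.floor(math.log2(n)) + 1
    print(f"n={n}: found={found}, steps={steps}, bound={bound}")
```

The number of probes stays within `floor(log2(n)) + 1`, so it grows slowly but never stops growing as the index gets larger.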

Page 16: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Joins Kill Performance

Customer
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomerAddress
ID  Address
10  24
10  33
32  44

Address
ID  Location
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow

Joins are executed every time you cross relationships

Querying millions of records and joining 3-4 tables could generate billions of combinations
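As an illustration of what crossing these relationships looks like in practice, the sketch below loads the three tables from the slide into an in-memory SQLite database and resolves a customer's locations with two JOINs. The column meanings are an assumption (that `CustomerAddress.ID` refers to the customer and `CustomerAddress.Address` to the address row), and the helper name `locations_of` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (id INTEGER, name TEXT);
CREATE TABLE CustomerAddress (id INTEGER, address INTEGER);
CREATE TABLE Address (id INTEGER, location TEXT);
INSERT INTO Customer VALUES (10,'John'),(11,'John'),(24,'Mike'),(28,'Mike');
INSERT INTO CustomerAddress VALUES (10,24),(10,33),(32,44);
INSERT INTO Address VALUES
  (24,'Milan'),(33,'London'),(18,'Paris'),(18,'Madrid'),(44,'Moscow');
""")

def locations_of(customer_id):
    """Resolve Customer -> CustomerAddress -> Address with two JOINs."""
    rows = conn.execute(
        """
        SELECT a.location
        FROM Customer c
        JOIN CustomerAddress ca ON ca.id = c.id      -- first hop
        JOIN Address a ON a.id = ca.address          -- second hop
        WHERE c.id = ?
        ORDER BY a.location
        """,
        (customer_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(locations_of(10))  # ['London', 'Milan']
```

Both hops are recomputed by matching key values each time the query runs; nothing about the relationship itself is stored.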

Page 17: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

This is why database query performance suffers as the database increases in size: every index lookup costs O(log N)

Page 18: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

RDBMS performance on traversal

Page 19: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

In a world that’s becoming

more connected, we need a

better way to store data and

manage relationships

Read: Data is important, but relationships are even more fundamental today

Page 20: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

“A graph database is any

storage system

that provides

index-free adjacency”

- Marko Rodriguez(author of TinkerPop Blueprints)

Page 21: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Every developer knows

the Relational Model,

but who knows the

Graph one?

Page 22: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Back to school:

Graph Theory crash course

Page 23: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Basic Graph

Luigi --(Visited)--> Lyon

Page 24: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Property Graph Model*

Vertices and Edges can have properties

Edges are directed

Luigi (Vertex), company: OrientTechnologies

Visited (Edge), on: 2015

Lyon (Vertex), people: 500,000

* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
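The property graph on this slide can be written down in a few lines of Python. This is only a sketch of the model, with hypothetical class and field names (`Vertex`, `Edge`, `out_vertex`, `in_vertex`); it is not the Blueprints API.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    label: str
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    label: str
    out_vertex: Vertex          # where the edge starts
    in_vertex: Vertex           # where the edge points to (edges are directed)
    properties: dict = field(default_factory=dict)

# Data taken from the slide
luigi = Vertex("Luigi", {"company": "OrientTechnologies"})
lyon = Vertex("Lyon", {"people": 500_000})
visited = Edge("Visited", luigi, lyon, {"on": 2015})

print(f"{visited.out_vertex.label} -[{visited.label} "
      f"on {visited.properties['on']}]-> {visited.in_vertex.label}")
```

Both vertices and the edge carry their own property map, and the edge records a direction by distinguishing its start and end vertex.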

Page 25: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Luigi Lyon

An Edge connects only 2 vertices

Use multiple edges to represent 1-N and N-M relationships

1-N and N-M Relationships

Page 26: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Congrats! This is your diploma in

«Graph Theory»

Page 27: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

The Graph theory

is so simple,

yet so

powerful

Page 28: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

How does a true*

Graph Database

manage relationships?

*a “Graph” layer on top of a DBMS doesn’t qualify as a true GraphDB

Page 29: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Luigi (Vertex), Lyon (Vertex)

Record IDs shown in the diagram: #13:55, #15:99, #22:11 (Edge)

Each element in the Graph has its own immutable Record ID

Page 30: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Connections use persistent pointers

Luigi (Vertex), Lyon (Vertex)

Record IDs shown in the diagram: #13:55, #15:99, #22:11 (Edge)

Page 33: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

A Graph Database creates the relationship just once (when the edge is created)

VS

An RDBMS computes the relationship every time you query the database

Page 34: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

When you move from an RDBMS to a Graph Database, the cost of crossing a relationship drops from O(log N) to near O(1)

With a Graph Database, the time to traverse a relationship is not affected by database size!

This is huge in the Big Data age
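The idea of index-free adjacency can be sketched in Python with plain object references standing in for persistent pointers. This is an assumption-laden illustration (the names `Node`, `build_ring` and `traverse` are hypothetical), not OrientDB's storage engine: the point is only that following a stored link does the same amount of work in a small graph and in a large one.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []           # direct references to neighbours

def build_ring(n):
    """Create n nodes where node i points to node (i + 1) % n."""
    nodes = [Node(i) for i in range(n)]
    for i, node in enumerate(nodes):
        node.out.append(nodes[(i + 1) % n])
    return nodes

def traverse(start, hops):
    """Follow the first outgoing link `hops` times, with no index lookup."""
    current = start
    for _ in range(hops):
        current = current.out[0]    # one pointer dereference per hop
    return current

small = build_ring(10)
large = build_ring(100_000)

# Three hops cost three dereferences in both graphs
print(traverse(small[0], 3).name, traverse(large[0], 3).name)
```

The relationship is stored once when the link is created, so a traversal never has to search for the neighbour.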

Page 35: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Graph Databases Easily Manage Complex Relationships

Near-zero cost to traverse relationships, which suits:

• Recommendation engines

• Social Applications

• Spatial Apps

• Master Data Management

• Information Clustering

[Diagram: graph with vertices John, Thriller, Comedy, Pulp Fiction, Mr Bean, Theater A, Theater B, Theater C, NYC, San José; edge label shown: Lives in]

Page 36: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

GraphDB Database Quadrant

Vertical axis: Relationships Complexity >

Horizontal axis: Data Complexity >

Plotted: Relational, Key Value, Column, Graph, Document

Page 37: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

GraphDB Database Quadrant

Vertical axis: Relationships Complexity >

Horizontal axis: Data Complexity >

Plotted: Relational, Key Value, Column, Graph, Document

These were 1st generation NoSQL products, where each tool was only good at a few use cases

Page 39: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

1st Generation NoSQL: Scenario

[Diagram: an Application backed by Oracle (RDBMS) as the Primary DB, plus Redis or Memcache (Key/Value), MongoDB (DocDB) and Neo4j (GraphDB), connected through ETL]

Page 40: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

1st Generation NoSQL: Fact

In > 90% of use cases, NoSQL products are used as a second DBMS

Page 41: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

1st Generation NoSQL: Problems

- No standard between NoSQL products

- Multiple vendors = multiple skills

- ETL + synchronization code is costly to write and maintain

- Performance and Reliability are hard to predict

Page 42: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

2nd Generation NoSQL

is

Multi-Model

Page 43: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

What’s Multi-Model DBMS?

Graph, Document, Object, Key/Value

Multi-Model represents the intersection of multiple models in just one product

Page 44: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

What’s Multi-Model DBMS?

Graph, Document, Object, Key/Value

Multi-Model represents the intersection of multiple models in just one product

- Just one product to learn and maintain

- Just one vendor relationship to manage

- No ETL, no synchronization required

- Performance and Reliability are easy to test from the beginning

Page 45: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Relationships give data “meaning”

Vertices: Order #134 (Order), Frank (Customer), John (Provider), Bruno (Provider), Commodore Amiga 1200 (Product), Monitor 40” (Product), Mouse (Product)

Edges: (Makes) ×1, (Has) ×3, (Sells) ×3

Page 46: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Multi-Model domain schema

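Vertex classes:

Actor, name: string, surname: string

Customer (inherits from Actor)

Provider (inherits from Actor)

Product, name: string, qty: int

Order, number: int, date: datetime

Edge classes:

Makes

Sells, price: decimal

Has, price: decimal

Legend: V = Vertex, E = Edge, arrow = Inherits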

Page 47: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Vertices and Edges are Documents

{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}

[Diagram: the Frank vertex and an Order vertex]

General purpose solution:

• JSON

• Schema-less

• Schema-full

• Schema-hybrid

• Nested documents

• Rich indexing and querying

• Developer friendly

Page 48: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Polymorphic queries

SELECT * FROM Customer

returns: Frank (Customer)

SELECT * FROM Provider

returns: John (Provider), Bruno (Provider)

SELECT * FROM Actor

returns: Bruno (Provider), Frank (Customer), John (Provider)
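The behaviour of these polymorphic queries can be imitated with ordinary class inheritance. The sketch below is an analogy in Python, assuming Customer and Provider both extend Actor as in the schema slide; the helper `select_from` is hypothetical and does not reflect how OrientDB executes SQL.

```python
class Actor:
    def __init__(self, name):
        self.name = name

class Customer(Actor):
    pass

class Provider(Actor):
    pass

# The three records from the slide
records = [Provider("John"), Customer("Frank"), Provider("Bruno")]

def select_from(cls):
    """Names of all records whose class is `cls` or one of its subclasses."""
    return sorted(r.name for r in records if isinstance(r, cls))

print(select_from(Customer))  # ['Frank']
print(select_from(Provider))  # ['Bruno', 'John']
print(select_from(Actor))     # ['Bruno', 'Frank', 'John']
```

Querying the superclass returns records of every subclass, which is the point of the slide: one query on Actor covers both Customers and Providers.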

Page 49: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Multi-Model complex domains schema

Band Genre

Account

MusicTaste

Location

Likes

Performs

Inherits

Edge

Legend:

V Vertex

Plays

Page 50: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Multi-Model complex domains

Snow Patrol (Band)

John (Account)

Frank (Account)

Indie (Genre)

Rock (Genre)

123, 1st Street, Austin, TX (Location)

(Performs) April 7, 2015, 9pm-11.30pm

(Likes) ×4

(Plays)

Page 51: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Multi-Model Database Quadrant

Vertical axis: Relationships Complexity >

Horizontal axis: Data Complexity >

Plotted: Relational, Key Value, Column, Document, Graph, Multi-Model

Page 52: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Multi-Model Solutions

Page 53: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

There are a few DBMSs that claim to be Multi-Model, but they do not have a true Graph Engine.

The “Graph” is only a layer on top of the engine.

Under the hood they do JOINs, which means traversal time is affected by database size.

Page 54: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Meet OrientDB

The First Ever Multi-Model

Database Combining Flexibility

of Documents with

Connectedness of Graphs

Page 55: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

With a true Graph, Document,

Key/Value and Object Oriented engine

Page 56: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

OrientDB features

Page 57: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

DEMO

Page 58: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

API & Standards

• Support for the TinkerPop standard for Graph DBs: Gremlin language and Blueprints API

• SQL + extensions for graphs

• JDBC driver to connect any BI tool

• HTTP/JSON support

• Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++ and more

Page 59: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Availability and Integrity

• Atomic, Consistent, Isolated and Durable (ACID) multi-statement transactions

• Multi-master Replication

[Diagram: two Master Nodes, each surrounded by nodes labelled C]

Page 60: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Scalability and Performance

• Multi-Master Replication, Sharding and Auto-Discovery to Simplify Ops

• 200k+ TPS on Commodity Hardware

[Diagram: two Master Nodes and an Auto-Discovered Node, with nodes labelled C]

Page 61: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Some numbers

Page 62: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

A Bright Future

Graph DBMS increased their popularity by 500% within the last 2 years

Document DBMS are the 3rd fastest growing category

Page 63: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Some of Our Customers

Page 64: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Get Started for Free

OrientDB Community Edition is FREE for any purpose (Apache 2 license)

Udemy Getting Started Training is ★★★★★ and Free

http://www.orientechnologies.com/getting-started

OrientDB Enterprise is Free for Development

Page 65: OrientDB - the 2nd generation  of  (Multi-Model) NoSQL

Thank you!

Luigi Dell’Aquila

@ldellaquila

http://www.orientdb.com