OrientDB - the 2nd generation of (Multi-Model) NoSQL - Luigi Dell'Aquila - Codemotion Amsterdam 2016
TRANSCRIPT
OrientDB - the 2nd generation of (Multi-Model) NoSQL
And why Graph Databases are the starting point of this revolution
#OrientDB
Before We Start…
How many of you have already used NoSQL technology?
How many of you are familiar with Graph Databases?
How many of you are already familiar with OrientDB?
#OrientDB
Order #134 (Order)
John (Provider)
Bruno (Provider)
Frank (Customer)
CBM Amiga 500 (Product)
Monitor 40” (Product)
Mouse (Product)
#OrientDB
Data by itself has little value; it’s the relationship between data that gives it incredible value
#OrientDB
[Diagram: the same records, now connected by edges]
Vertices: Order #134 (Order), John (Provider), Bruno (Provider), Frank (Customer), CBM Amiga 500 (Product), Monitor 40” (Product), Mouse (Product)
Edges: Sells, Has, Makes
#OrientDB
Customers
ID  Name
10  John
11  John
24  Mike
28  Mike

CustomersCities
CustomerID  CityID
10          24
10          33
32          44

Cities
ID  City
24  Milan
33  London
18  Paris
18  Madrid
44  Moscow
#OrientDB
Joins are executed every time you cross relationships
Querying millions of records while joining 3-4 tables could
generate billions of combinations
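A minimal sketch of what "executed every time" means, using the rows of the Customers, CustomersCities and Cities tables shown on the slide. The function name `cities_of` is hypothetical, and the nested loops only stand in for whatever join strategy a real RDBMS would choose; the point is that the matching work is redone on each call.

```python
# Rows copied from the slide's three tables.
customers = [(10, "John"), (11, "John"), (24, "Mike"), (28, "Mike")]
customers_cities = [(10, 24), (10, 33), (32, 44)]
cities = [(24, "Milan"), (33, "London"), (18, "Paris"), (18, "Madrid"), (44, "Moscow")]


def cities_of(customer_id):
    """Inner join Customers x CustomersCities x Cities, recomputed on every call."""
    result = []
    for cid, name in customers:
        if cid != customer_id:
            continue
        for link_customer, link_city in customers_cities:
            if link_customer != cid:
                continue
            for city_id, city in cities:
                if city_id == link_city:
                    result.append((name, city))
    return result


print(cities_of(10))  # [('John', 'Milan'), ('John', 'London')]
```

Each additional table in the chain adds another nested level, which is where the combinations multiply.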
#OrientDB
This is why database query performance
suffers as the database increases in size
O(log N)
#OrientDB
Property Graph Model*
Edges are directed
Vertices and Edges can have properties
Andrea (company: OrientDB)
Visited (year: 2016)
Rome (country: Italy)
* https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
#OrientDB
1-N and N-M Relationships
An Edge connects only 2 vertices
Use multiple edges to represent 1-N and N-M relationships
Andrea, Rome
Visited (year: 2012)
Worked (year: 2016)
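A minimal sketch of the property graph model described above, under stated assumptions: the `Vertex` and `Edge` classes and the `years` helper are hypothetical names for illustration, not an OrientDB or Blueprints API, and the edge direction (Andrea to Rome) is assumed from the diagram labels.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_vertex: Vertex  # where the edge starts
    in_vertex: Vertex   # where the edge ends
    properties: dict = field(default_factory=dict)


andrea = Vertex("Andrea", {"company": "OrientDB"})
rome = Vertex("Rome", {"country": "Italy"})

# Each edge connects exactly two vertices; several edges between the same
# pair represent the multiple relationships shown on the slides.
edges = [
    Edge("Visited", andrea, rome, {"year": 2012}),
    Edge("Visited", andrea, rome, {"year": 2016}),
    Edge("Worked", andrea, rome, {"year": 2016}),
]


def years(label):
    """Collect the 'year' property of every edge with the given label."""
    return [e.properties["year"] for e in edges if e.label == label]


print(years("Visited"))  # [2012, 2016]
```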
#OrientDB
How does a true* Graph Database
manage relationships?
*a “Graph” layer on top of a DBMS doesn’t qualify as a true GraphDB
#OrientDB
Andrea (Vertex) #13:55
Rome (Vertex) #15:99
Visited, year: 2012 (Edge) #22:11
Each element in the Graph has its own immutable Record ID
#OrientDB
Connections use persistent pointers
Andrea (Vertex) #13:55: out = #22:11
Visited, on: 2012 (Edge) #22:11: src = #13:55, dst = #15:99
Rome (Vertex) #15:99: in = #22:11
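The pointer layout on this slide can be sketched as follows. This is an illustration only: a Python dict keyed by Record ID stands in for the database's physical addressing, the field names mirror the slide (`out`, `in`, `src`, `dst`), and the `neighbors` helper is hypothetical. The assignment of #13:55 to Andrea and #15:99 to Rome is read off the diagram.

```python
# Record store keyed by Record ID (RID); values taken from the slide.
records = {
    "#13:55": {"type": "Vertex", "name": "Andrea", "out": ["#22:11"], "in": []},
    "#15:99": {"type": "Vertex", "name": "Rome", "out": [], "in": ["#22:11"]},
    "#22:11": {"type": "Edge", "label": "Visited", "on": 2012,
               "src": "#13:55", "dst": "#15:99"},
}


def neighbors(rid, label):
    """Follow stored pointers from a vertex through its outgoing edges."""
    result = []
    for edge_rid in records[rid]["out"]:
        edge = records[edge_rid]
        if edge["label"] == label:
            result.append(records[edge["dst"]]["name"])
    return result


print(neighbors("#13:55", "Visited"))  # ['Rome']
```

No search over other records is involved: the traversal only touches the vertex, its edge, and the target vertex.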
#OrientDB
A Graph Database creates the relationship just once
(when the edge is created)
VS
An RDBMS computes the relationship every time you query the database
#OrientDB
When you move from an RDBMS to a Graph Database you jump
from O(log N) speed to near O(1)
With a Graph Database, the traversal time is
not affected by database size!
This is huge in the Big Data age
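A back-of-the-envelope illustration of the O(log N) side of this comparison, assuming the relational lookup goes through a sorted index searched by binary search (a simplification of a real B-tree index). The function name `index_lookup_steps` is hypothetical.

```python
def index_lookup_steps(n, target):
    """Count the comparisons a binary search needs over n sorted keys (0..n-1)."""
    keys = range(n)  # stands in for a sorted index without allocating n items
    lo, hi, steps = 0, n - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if keys[mid] == target:
            break
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1_000, 1_000_000):
    # A pointer hop, by contrast, is one dereference whatever the value of n.
    print(f"{n:>9} keys: {index_lookup_steps(n, n - 1)} comparisons per index lookup")
```

The comparison count grows with the number of keys, while following a stored pointer stays at a single step per hop.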
#OrientDB
No cost to traverse relationships:
• Recommendation engines
• Social Applications
• Spatial Apps
• Master Data Management
• Information Clustering
[Diagram: recommendation graph]
Vertices: John, Thriller, Comedy, Pulp Fiction, Mr Bean, Theater A, Theater B, Theater C, NYC, San Jose
Edges: Lives in, Likes, Has, Is, Plays
#OrientDB
So the Graph Model is the only solution to efficiently
manage relationships
But what about data complexity?
And data consistency?
And scaling?
#OrientDB
First Generation NoSQL
[Chart: Relationships Complexity (vertical axis) vs Data Complexity (horizontal axis), positioning Relational, Key Value, Column, Graph, Document]
#OrientDB
First Generation NoSQL: Polyglot Persistence
[Diagram: Application, RDBMS, Key/Value Store, Document Database, Graph Database, ETL]
#OrientDB
First Generation NoSQL: Polyglot Persistence
- No standard between NoSQL products
- Multiple vendors = multiple skills
- ETL + synchronization code is expensive to write and maintain
- Performance and reliability are hard to predict
#OrientDB
What’s a Multi-Model DBMS?
Multi-Model represents the intersection
of multiple models in just one product:
Graph, Document, Object, Key/Value, Full-Text, Spatial
#OrientDB
What’s a Multi-Model DBMS?
- Just one product to learn and maintain
- Just one vendor relationship to manage
- No ETL, no synchronization required
- Performance and reliability are easy to test from the beginning
Polyglot vs Multi-Model
Polyglot (NoSQL 1.0): Polyglot Persistence is a fancy term to mean that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by individual applications or components.
Multi-Model (NoSQL 2.0): Multi-model databases are intended to offer the data modeling advantages of polyglot persistence without its disadvantages. Complexity, in particular, is reduced. The first multi-model database was OrientDB.
https://en.wikipedia.org/wiki/Multimodel_database
http://www.jamesserra.com/archive/2015/07/whatispolyglotpersistence/
ECOMMERCE
PRODUCT CATALOG
SHOPPING CART
RECOMMENDATION
TRANSACTIONAL
SEARCH
SPATIAL
#OrientDB
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}
[Diagram] Vertices: Frank, Order. Edge: Makes
General purpose solution:
• JSON
• Schema-less
• Schema-full
• Schema-hybrid
• Nested documents
• Rich indexing and querying
• Developer friendly
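A minimal sketch of reading the nested document above and of what a schema-hybrid rule could look like. The `MANDATORY` mapping and the `validate` function are hypothetical illustrations of the idea (some fields constrained, all others free-form); they are not OrientDB's schema API.

```python
import json

# The customer document from the slide, as a JSON string.
raw = """
{ "@rid": "12:382", "@class": "Customer", "name": "Frank", "surname": "Raggio",
  "phone": "+39 33123212", "details": { "city": "London", "tags": "millennial" } }
"""
customer = json.loads(raw)

# Hypothetical schema-hybrid rule: these fields are mandatory and typed,
# any other field (phone, details, ...) is accepted as-is.
MANDATORY = {"name": str, "surname": str}


def validate(doc):
    """Return the mandatory fields that are missing or have the wrong type."""
    return [f for f, expected in MANDATORY.items()
            if not isinstance(doc.get(f), expected)]


print(customer["details"]["city"])  # London
print(validate(customer))           # []
```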
#OrientDB - @ldellaquila
Second Generation NoSQL
[Chart: Relationship Complexity (vertical axis) vs Data Complexity (horizontal axis), positioning Relational, Key Value, Column, Graph, Document, Multi-Model]
#OrientDB
API & Standards
• Support for TinkerPop standard for Graph DB: Gremlin language and Blueprints API
• SQL + extensions for graphs
• JDBC driver to connect any BI tool
• HTTP/JSON support
• Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++ and more
#OrientDB - @ldellaquila
Requirements and Dependencies
• OrientDB's footprint is minimal and the embedded version can run with a few MB of RAM
• OrientDB needs a Java Runtime
• When run distributed, OrientDB uses the embedded Hazelcast library (Apache 2 licensed)
#OrientDB - @ldellaquila
Security
• Basic HTTP authentication (+HTTPS/SSL)
• User/Role authentication system. One User can have multiple Roles
• Privileges are managed in Roles
• Roles can inherit from other Roles
• Record-level security: every record can list the users/roles that can create/read/update/delete it
• Auditing available in Enterprise Edition
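The user/role rules on this slide can be sketched as follows. All role names, user names and helper functions here are hypothetical; this models only the three stated rules (privileges live in roles, roles inherit from other roles, a user can hold several roles) and is not OrientDB's security API.

```python
# Hypothetical roles: each may inherit from one parent role.
ROLES = {
    "reader": {"inherits": None, "privileges": {"read"}},
    "writer": {"inherits": "reader", "privileges": {"create", "update"}},
    "admin": {"inherits": "writer", "privileges": {"delete"}},
}

# Hypothetical users: one user can have multiple roles.
USERS = {"alice": ["reader"], "bob": ["writer", "admin"]}


def role_privileges(role_name):
    """Collect a role's own privileges plus everything it inherits."""
    privileges = set()
    while role_name is not None:
        role = ROLES[role_name]
        privileges |= role["privileges"]
        role_name = role["inherits"]
    return privileges


def can(user, action):
    """A user may perform an action if any of their roles grants it."""
    return any(action in role_privileges(r) for r in USERS[user])


print(can("alice", "update"))  # False
print(can("bob", "read"))      # True
```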
#OrientDB - @ldellaquila
Encryption
• HTTPS/SSL
• Starting from OrientDB v2.2:
  - Support for Kerberos
  - Encryption at rest, using AES or DES, of the entire database or portions of it
  - PBKDF2 hash algorithm with a 24-bit length salt per user for a configurable number of iterations
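A minimal sketch of salted PBKDF2 password hashing with Python's standard `hashlib.pbkdf2_hmac`. The salt length follows the slide's 24-bit figure; the SHA-256 digest, the default iteration count and the function names are assumptions for illustration and do not describe OrientDB's internal implementation.

```python
import hashlib
import hmac
import os

SALT_BYTES = 3       # 24 bits, the figure given on the slide
ITERATIONS = 65_536  # assumed default; the slide only says "configurable"


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a PBKDF2 digest with a per-user random salt."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, iterations, stored = hash_password("s3cret")
print(verify_password("s3cret", salt, iterations, stored))  # True
print(verify_password("wrong", salt, iterations, stored))   # False
```

Storing the salt and iteration count next to the digest is what allows the iteration count to be raised later without invalidating existing hashes.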
#OrientDB - @ldellaquila
Administration
• Full Backup and Restore
• Delta Backup and Restore (v2.2, Enterprise Edition)
• Studio web tool
• Command line Console
#OrientDB - @ldellaquila
Data Extraction and Loading
• Import/Export in JSON
• Import from SQL script
• OrientDB ETL tool (http://orientdb.com/docs/last/ETL-Introduction.html)
• Teleporter (v2.2)
#OrientDB - @ldellaquila
Scale out and HA
• Multi-Master architecture
• Tunable consistency through the use of a quorum, per database or per single class (table)
• Synchronous and Asynchronous replication
• Zero config: if multicast is enabled, the server attaches to the cluster automatically
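A minimal sketch of how a write quorum tunes consistency in a multi-master cluster. The function names and the `"majority"` setting are hypothetical and only illustrate the general idea; they are not OrientDB's distributed configuration format.

```python
def majority(nodes):
    """Smallest number of nodes that is more than half of the cluster."""
    return nodes // 2 + 1


def write_accepted(acks, nodes, quorum="majority"):
    """A write succeeds once enough nodes have acknowledged it.

    quorum is either the string "majority" or an explicit node count.
    """
    required = majority(nodes) if quorum == "majority" else int(quorum)
    if not 1 <= required <= nodes:
        raise ValueError("quorum must be between 1 and the number of nodes")
    return acks >= required


print(write_accepted(acks=2, nodes=3))            # True: 2 of 3 is a majority
print(write_accepted(acks=2, nodes=4))            # False: a majority of 4 is 3
print(write_accepted(acks=1, nodes=4, quorum=1))  # True: weakest setting
```

A lower quorum favors write availability, while a higher one favors consistency across masters.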
#OrientDB
Multi-master Replication
[Diagram: Master Nodes and nodes labeled C]
Atomic, Consistent, Isolated and Durable (ACID) multi-statement transactions
#OrientDB
[Diagram: Master Nodes, nodes labeled C, Auto-Discovered Node]
#OrientDB
Udemy Getting Started Training is ★★★★★ and Free
http://www.orientechnologies.com/gettingstarted
OrientDB Enterprise is Free for Development
OrientDB Community is FREE for any purpose (APACHE 2 license)