graph databases - where do we do the modeling part?

Karen Lopez @datachick #HeartData. Heart of Data Modeling. Graph Databases: Where does the modeling go?

Upload: dataversity

Post on 23-Jan-2018


Category:

Technology



TRANSCRIPT

Page 1: Graph Databases - Where Do We Do the Modeling Part?

Karen Lopez @datachick #HeartData

Heart of Data Modeling

Graph Databases: Where does the modeling go?

Page 2: Graph Databases - Where Do We Do the Modeling Part?

Yes, please do tweet/share today’s event

@datachick #heartdata

Page 3: Graph Databases - Where Do We Do the Modeling Part?

Karen López

Karen has 20+ years of data and information architecture experience on large, multi-project programs.

She is a frequent speaker on data modeling, data-driven methodologies and pattern data models.

She wants you to love your data…

She loves new tech and gadgets

Page 4: Graph Databases - Where Do We Do the Modeling Part?

How new tech are you?...so let’s get to know you….

Page 5: Graph Databases - Where Do We Do the Modeling Part?

Use Q&A for formal questions

Get them in now!

Use chat to discuss with each other

We have a great community

Yes!

Slides

Recording

…next week…

Page 6: Graph Databases - Where Do We Do the Modeling Part?

Plan for Today

Why topic?

Graphy Stuff in Relational World

Another way?

Graph Resources

Page 7: Graph Databases - Where Do We Do the Modeling Part?

Modern Data Architectures will have hybrid Technologies

WHERE ‘HYBRID’ = ‘SQL and NoSQL’

Page 8: Graph Databases - Where Do We Do the Modeling Part?

Clarifying terminology

Graphs

Graph Databases

Graph Processing

Page 9: Graph Databases - Where Do We Do the Modeling Part?

NoSQL, Not Only SQL

Relational

Graph

Columnar

Key Value

Document Databases

Column Family

Page 10: Graph Databases - Where Do We Do the Modeling Part?

Graph & Hierarchy Concepts

Overview

Page 11: Graph Databases - Where Do We Do the Modeling Part?

Graph Nodes

[Diagram: a collection of nodes, each labeled "Node"]

Page 12: Graph Databases - Where Do We Do the Modeling Part?

Directed / Undirected

[Diagram: graph of nodes, each labeled "Node", illustrating directed and undirected edges]

Page 13: Graph Databases - Where Do We Do the Modeling Part?

What is this structure?

Page 14: Graph Databases - Where Do We Do the Modeling Part?

Ragged Hierarchies

A hierarchy where there is variability in the number of levels across branches.

[Diagram: tree of nodes, each labeled "Node", whose branches have different numbers of levels]

Page 15: Graph Databases - Where Do We Do the Modeling Part?

[Diagram, first parts hierarchy: Automobile; Engine; Fuel Line, Valve, Injector, Fan; Fan Blade; Bearing; Bolt; Fanbelt; Entertainment System; Bolt; Radio; Satellite Radio; Media Player; Backup camera]

[Diagram, second parts hierarchy: Automobile; Engine; Injection System; Fuel Line; Valve; Injector; Fan; Fan Blade; Fan Bearing; Fanbelt; Entertainment System; Bolt; Radio; Satellite Radio; Media Player; Backup camera]

Energy Graph

What Happens When…?

Sometimes we take a group of “sibling widgets” and make them a widget just for them. Think “subassembly”. Then we have to think of this new group as a widget.


Page 16: Graph Databases - Where Do We Do the Modeling Part?

How we Model Graph in Relational

Lots of tricks and tips happening here.

Page 17: Graph Databases - Where Do We Do the Modeling Part?

Recursive Relationship

Self Join

Recursive Association

Dog Ear / Mouse Ear

Bill of Materials

???

Page 18: Graph Databases - Where Do We Do the Modeling Part?

Data Model – Hierarchy Recursive

Page 19: Graph Databases - Where Do We Do the Modeling Part?

Data Model – Hierarchy Recursive

What happens when:

We add a new level?

Take one away?

Promote someone?
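
To make those questions concrete, here is a minimal sketch of a recursive relationship held in memory. It assumes a hypothetical employee table where each row has a `managerId` pointing back into the same table; the names `employees`, `levelOf` and `promote` are illustrative and not from the slides. The point it tries to show is that levels are derived, so one promotion quietly changes the level of everyone below.

```javascript
// Hypothetical rows of a self-referencing Employee table (adjacency list).
// managerId points at another row in the same table; null marks the top.
const employees = new Map([
  [1, { name: 'Ada', managerId: null }],
  [2, { name: 'Ben', managerId: 1 }],
  [3, { name: 'Cy', managerId: 2 }],
  [4, { name: 'Dee', managerId: 3 }],
]);

// A node's level is not stored; it is derived by walking up the chain.
// Assumes the data has no cycles.
function levelOf(id) {
  let level = 0;
  let current = employees.get(id);
  while (current.managerId !== null) {
    current = employees.get(current.managerId);
    level += 1;
  }
  return level;
}

// "Promote someone": re-point one row at a manager higher up.
function promote(id, newManagerId) {
  employees.get(id).managerId = newManagerId;
}

const before = levelOf(4); // Dee -> Cy -> Ben -> Ada
promote(3, 1);             // Cy now reports directly to Ada
const after = levelOf(4);  // Dee's level changed although Dee's row did not
console.log(before, after);
```

Any report or rule that depends on "level 3" has to cope with that shift, which is one reason the exceptions matter more than the happy path.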

Page 20: Graph Databases - Where Do We Do the Modeling Part?

Relational Performance Tricks

Special data types

Adjacency Lists

Path Enumerations

Closure Tables

Nested Sets
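
As a rough illustration of one of these tricks, the sketch below derives a closure table from an adjacency list. The part names echo the automobile example, but the `parentOf` data and the `buildClosure` function are assumptions made for this sketch, not something shown in the deck, and it assumes the data contains no cycles.

```javascript
// Hypothetical parent pointers (adjacency list): child -> parent.
const parentOf = {
  engine: 'automobile',
  fan: 'engine',
  fanBlade: 'fan',
  radio: 'automobile',
};

// Build closure-table rows: one row per (ancestor, descendant) pair,
// plus the number of levels between them. Assumes no cycles.
function buildClosure(parents) {
  const rows = [];
  const nodes = new Set([...Object.keys(parents), ...Object.values(parents)]);
  for (const node of nodes) {
    rows.push({ ancestor: node, descendant: node, depth: 0 }); // self row
    let depth = 0;
    let current = node;
    while (parents[current] !== undefined) {
      current = parents[current];
      depth += 1;
      rows.push({ ancestor: current, descendant: node, depth });
    }
  }
  return rows;
}

const closure = buildClosure(parentOf);

// "Everything under engine" becomes a plain filter, with no recursion at query time.
const underEngine = closure
  .filter((r) => r.ancestor === 'engine' && r.depth > 0)
  .map((r) => r.descendant)
  .sort();
console.log(underEngine);
```

The trade-off is the usual one: reads get simpler, while every insert, move or delete has to keep the extra rows in step.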


Page 21: Graph Databases - Where Do We Do the Modeling Part?

But remember this?


Page 22: Graph Databases - Where Do We Do the Modeling Part?

Now we have M:N


Employee

Reporting Structure
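
A small sketch of what the M:N version implies, assuming a hypothetical Reporting Structure table with one row per "employee reports to manager" pair. The names and data are made up for illustration. Once a person can have more than one manager, the structure is a graph rather than a tree, so the walk has to remember where it has been.

```javascript
// Hypothetical rows of a Reporting Structure table resolving the M:N.
const reportsTo = [
  { employee: 'Dee', manager: 'Ben' },
  { employee: 'Dee', manager: 'Cy' }, // a second manager for the same person
  { employee: 'Ben', manager: 'Ada' },
  { employee: 'Cy', manager: 'Ada' },
];

// Collect every manager above a person. The seen set stops Ada being
// visited twice and guards against accidental cycles in the data.
function allManagers(person) {
  const seen = new Set();
  const queue = [person];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const row of reportsTo) {
      if (row.employee === current && !seen.has(row.manager)) {
        seen.add(row.manager);
        queue.push(row.manager);
      }
    }
  }
  return [...seen].sort();
}

const managersOfDee = allManagers('Dee');
console.log(managersOfDee);
```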

Page 23: Graph Databases - Where Do We Do the Modeling Part?

Hierarchy vs. Real World


Page 24: Graph Databases - Where Do We Do the Modeling Part?

Recursive Relationships


Page 25: Graph Databases - Where Do We Do the Modeling Part?

Recursives IRL

Reporting structures

Components

Facilities

Documents

Networks


Page 26: Graph Databases - Where Do We Do the Modeling Part?

SQL Server HierarchyID
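
SQL Server's hierarchyid type stores a node's position in the tree, and its string form reads like a path such as `/1/2/`. The sketch below only imitates that idea with plain strings; it is not the SQL Server API, and the `nodes` data and the `isUnder` and `levelOf` helpers are assumptions for illustration.

```javascript
// Hypothetical rows storing each node's position as a path string.
const nodes = [
  { path: '/', name: 'Automobile' },
  { path: '/1/', name: 'Engine' },
  { path: '/1/1/', name: 'Fan' },
  { path: '/1/1/1/', name: 'Fan Blade' },
  { path: '/2/', name: 'Entertainment System' },
];

// With paths, "is X under Y?" is a prefix test rather than a recursive join.
// The trailing slash keeps '/1/' from matching '/10/'.
function isUnder(nodePath, ancestorPath) {
  return nodePath !== ancestorPath && nodePath.startsWith(ancestorPath);
}

// The level is simply the number of path segments.
function levelOf(path) {
  return path.split('/').filter((segment) => segment !== '').length;
}

const underEngine = nodes
  .filter((n) => isUnder(n.path, '/1/'))
  .map((n) => n.name);
console.log(underEngine, levelOf('/1/1/1/'));
```

Note what moving a subtree costs in this style: every descendant's path has to be rewritten.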


Page 27: Graph Databases - Where Do We Do the Modeling Part?

But aren’t relational databases about relationships?

Page 28: Graph Databases - Where Do We Do the Modeling Part?

Labeled Property Graph

Nodes have properties (think key-value pairs)

Nodes have labels (think meta-data and categories)

Relationships are directed

Relationships have names

Relationships have a start and end node

Relationships have properties

[Diagram: nodes carrying properties and labels]
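
The labeled property graph rules above can be sketched as a small in-memory structure. This is only an illustration of the concepts, not how any graph database stores its data; `addNode`, `addRelationship` and `outgoing` are hypothetical helper names, and the sample values are borrowed from the movie example later in the deck.

```javascript
// A minimal in-memory labeled property graph. All names are illustrative.
const nodes = new Map();
const relationships = [];

// Nodes have labels (categories) and properties (key-value pairs).
function addNode(id, labels, properties) {
  nodes.set(id, { id, labels, properties });
}

// A relationship is directed (start -> end), has a name, and carries properties.
function addRelationship(type, startId, endId, properties = {}) {
  if (!nodes.has(startId) || !nodes.has(endId)) {
    throw new Error('a relationship needs a start node and an end node');
  }
  relationships.push({ type, start: startId, end: endId, properties });
}

addNode('keanu', ['Actor'], { name: 'Keanu Reeves' });
addNode('matrix1', ['Movie'], { title: 'The Matrix' });
addRelationship('ACTS_IN', 'keanu', 'matrix1', { role: 'Neo' });

// Follow outgoing relationships of a given name from one node.
function outgoing(startId, type) {
  return relationships
    .filter((r) => r.start === startId && r.type === type)
    .map((r) => ({ node: nodes.get(r.end), properties: r.properties }));
}

const roles = outgoing('keanu', 'ACTS_IN');
console.log(roles[0].node.properties.title, roles[0].properties.role);
```

Nothing here forces every Movie node to carry the same properties, which is exactly where the modeling questions come back in.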

Page 29: Graph Databases - Where Do We Do the Modeling Part?
Page 30: Graph Databases - Where Do We Do the Modeling Part?

http://neo4j.com/graphgist/8139605

Page 31: Graph Databases - Where Do We Do the Modeling Part?

Graph Databases Physical Architecture


Page 32: Graph Databases - Where Do We Do the Modeling Part?

TripleStores

Come from semantic technologies movement

A triple is a subject:predicate:object data structure

Individually triples are semantically poor

En masse they provide rich dataset to harvest knowledge and infer connections

Use RDF and XML; SPARQL for queries

Ginger dances with Fred

Fred likes ice cream

Karen loves data
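
The three statements above can be written as subject / predicate / object triples. The sketch below is plain JavaScript, not RDF or SPARQL; the `match` helper is a hypothetical stand-in that treats missing fields as wildcards, loosely like variables in a query pattern.

```javascript
// The three example statements as subject / predicate / object triples.
const triples = [
  { subject: 'Ginger', predicate: 'dances with', object: 'Fred' },
  { subject: 'Fred', predicate: 'likes', object: 'ice cream' },
  { subject: 'Karen', predicate: 'loves', object: 'data' },
];

// Match triples against a pattern; undefined fields act as wildcards.
function match(pattern) {
  return triples.filter((t) =>
    ['subject', 'predicate', 'object'].every(
      (key) => pattern[key] === undefined || pattern[key] === t[key]
    )
  );
}

// Everything said about Fred, whether as subject or as object.
const aboutFred = [...match({ subject: 'Fred' }), ...match({ object: 'Fred' })];
console.log(aboutFred.map((t) => `${t.subject} ${t.predicate} ${t.object}`));
```

One triple says little on its own; connecting the two that mention Fred is the small-scale version of harvesting knowledge from many of them.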


Page 33: Graph Databases - Where Do We Do the Modeling Part?

Graph Databases – Neo4j

CREATE (matrix1:Movie { title : 'The Matrix', year : '1999-03-31' })
CREATE (matrix2:Movie { title : 'The Matrix Reloaded', year : '2003-05-07' })
CREATE (matrix3:Movie { title : 'The Matrix Revolutions', year : '2003-10-27' })
CREATE (keanu:Actor { name:'Keanu Reeves' })
CREATE (laurence:Actor { name:'Laurence Fishburne' })
CREATE (carrieanne:Actor { name:'Carrie-Anne Moss' })
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix1)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix2)
CREATE (keanu)-[:ACTS_IN { role : 'Neo' }]->(matrix3)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix1)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix2)
CREATE (laurence)-[:ACTS_IN { role : 'Morpheus' }]->(matrix3)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix1)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix2)
CREATE (carrieanne)-[:ACTS_IN { role : 'Trinity' }]->(matrix3)

http://neo4j.com/docs/stable/cypherdoc-movie-database.html

Page 34: Graph Databases - Where Do We Do the Modeling Part?

Notice anything?

Page 35: Graph Databases - Where Do We Do the Modeling Part?

Querying Neo4j with Cypher

MATCH (you {name:"You"})
MATCH (expert)-[:WORKED_WITH]->(db:Database {name:"Neo4j"})
MATCH path = shortestPath( (you)-[:FRIEND*..5]-(expert) )
RETURN db,expert,path
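
For readers new to the `shortestPath( ... [:FRIEND*..5] ... )` part of the query, the sketch below shows the general idea with a breadth-first search capped at a number of hops. It is a teaching sketch under made-up data and makes no claim about how Neo4j implements the function; `friends`, `neighbours` and this `shortestPath` are hypothetical names.

```javascript
// Hypothetical FRIEND relationships, walked in either direction.
const friends = [
  ['You', 'Ann'],
  ['Ann', 'Bob'],
  ['Bob', 'Expert'],
  ['You', 'Cal'],
];

// Build an adjacency map that can be followed both ways.
const neighbours = new Map();
for (const [a, b] of friends) {
  if (!neighbours.has(a)) neighbours.set(a, []);
  if (!neighbours.has(b)) neighbours.set(b, []);
  neighbours.get(a).push(b);
  neighbours.get(b).push(a);
}

// Breadth-first search returns a path with the fewest hops,
// or null if the target is more than maxHops away.
function shortestPath(start, target, maxHops) {
  const queue = [[start]];
  const visited = new Set([start]);
  while (queue.length > 0) {
    const path = queue.shift();
    const last = path[path.length - 1];
    if (last === target) return path;
    if (path.length - 1 >= maxHops) continue; // hop limit reached
    for (const next of neighbours.get(last) || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null;
}

const path = shortestPath('You', 'Expert', 5);
console.log(path);
```

Writing the same question against a recursive relational model is possible, but it tends to be where the tricks and tips start piling up.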

Page 36: Graph Databases - Where Do We Do the Modeling Part?

IBM Graph

Page 37: Graph Databases - Where Do We Do the Modeling Part?

Querying IBM Graph

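var request = require('request'); // HTTP client that provides request.post

// req is the incoming HTTP request of the enclosing route handler (not shown)
var url = process.env.graphDBURL + '/gremlin';
// Gremlin traversal: start at the vertex with code orig, follow 'route' edges out, keep vertices with code dest
var query = "def g = graph.traversal(); g.V().has('code','" + req.body.orig + "').out('route').has('code', '" + req.body.dest + "')";
var opts = {
  auth: { user: process.env.username, pass: process.env.password },
  json: { gremlin: query }
};

request.post(url, opts, function(error, resp, obj) {
  // guard against a failed request before reading the response body
  var result = (!error && obj && obj.result && obj.result.data && obj.result.data.length > 0) ? obj.result.data[0] : null;
  if (result) {
    // found a route from orig to dest
    console.log('route exists from ' + req.body.orig + ' to ' + req.body.dest);
  }
});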
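
The Gremlin string in that snippet reads as a chain of steps. The sketch below walks the same shape over a tiny in-memory graph so the steps are easier to see; it is not the IBM Graph or Gremlin API, and the vertex codes, the `vertices` and `edges` arrays and the `directRoutes` function are all made up for illustration.

```javascript
// Hypothetical vertices and 'route' edges; the codes are invented examples.
const vertices = [
  { id: 1, code: 'AAA' },
  { id: 2, code: 'BBB' },
  { id: 3, code: 'CCC' },
];
const edges = [
  { label: 'route', from: 1, to: 2 },
  { label: 'route', from: 2, to: 3 },
];

// Mirrors the traversal step by step:
// V().has('code', orig) -> out('route') -> has('code', dest)
function directRoutes(orig, dest) {
  const starts = vertices.filter((v) => v.code === orig);
  const reached = starts.flatMap((s) =>
    edges
      .filter((e) => e.label === 'route' && e.from === s.id)
      .map((e) => vertices.find((v) => v.id === e.to))
  );
  return reached.filter((v) => v.code === dest);
}

console.log(directRoutes('AAA', 'BBB').length > 0);
console.log(directRoutes('AAA', 'CCC').length > 0);
```

A single `out('route')` step is one hop, so in this sketch a journey that needs a connection does not count as a route.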

Page 38: Graph Databases - Where Do We Do the Modeling Part?

Another IBM Graph model

Page 39: Graph Databases - Where Do We Do the Modeling Part?

Titan

Page 40: Graph Databases - Where Do We Do the Modeling Part?

Tools and Graph Databases

• ERwin: No native support

• ER/Studio: No native support

• PowerDesigner: No native support

“the data model is the database”

“the database is the data model”

ODBC / JDBC connectivity for querying.

Page 41: Graph Databases - Where Do We Do the Modeling Part?

So what about modeling for graph?

Marketing vs. Real Project

There is a model

The model isn’t the structure

The model would be used to design the graph(s)

Same modeling issues: Naming, Properties, Rules, Consistency, Governance

Page 42: Graph Databases - Where Do We Do the Modeling Part?

Data Modeling & Graph

No* Logical + Physical Data model

The graph is the data model...and the database

Whiteboard data modeling

Traditional data models still have a role

Page 43: Graph Databases - Where Do We Do the Modeling Part?

Data Model Driven

[Diagram labels: Requirements; Data Model; Database*; + Non Model Stuff; More requirements / changes / tuning / whims]

Page 44: Graph Databases - Where Do We Do the Modeling Part?
Page 45: Graph Databases - Where Do We Do the Modeling Part?

10+ Tips for Architects

1. Understand the use cases for graph technologies

2. Evaluate/profile your data requirements for suitability for graph databases and/or graph processing

3. ACID support varies across products. You’ll want to test your use cases.

4. Your query data stories will guide your decisions

5. Test your current development tools for support

Page 46: Graph Databases - Where Do We Do the Modeling Part?

10+ Tips for Architects

6. Test your database design/data modeling tools

7. Leverage your existing metadata/models

8. True hierarchies are VERY RARE in the real world.

9. Know the questions you have to ask about all the exceptions

10. Keep asking where the data integrity happens/is relevant


Page 47: Graph Databases - Where Do We Do the Modeling Part?

Resources


Page 48: Graph Databases - Where Do We Do the Modeling Part?

http://www.neo4j.org/learn/try

Page 49: Graph Databases - Where Do We Do the Modeling Part?

Fun with Graphs

Scotch Whiskeys

Belgian Beer

Bank Fraud Detection

Access Control Management

…and 100 more….

GraphGist Project - http://gist.neo4j.org/

Page 50: Graph Databases - Where Do We Do the Modeling Part?

Bluemix

http://www.ibm.com/analytics/us/en/technology/cloud-data-services/products/graph.html

Page 51: Graph Databases - Where Do We Do the Modeling Part?

Trees and Hierarchies in SQL for Smarties


Page 52: Graph Databases - Where Do We Do the Modeling Part?

Whitepaper

http://whitepapers.dataversity.net/content50141/

Page 53: Graph Databases - Where Do We Do the Modeling Part?

And it’s FREE! GraphDatabases.com

Page 54: Graph Databases - Where Do We Do the Modeling Part?

PostgreSQL, Riak, HBase, MongoDB, Neo4j, CouchDB, Redis

Page 55: Graph Databases - Where Do We Do the Modeling Part?

“Every design decision should include cost, benefit and risk”

- Karen Lopez

Page 56: Graph Databases - Where Do We Do the Modeling Part?

Thank you, you were great. Really, really great.

Karen Lopez @datachick

www.datamodel.com