graph databases: connecting the dots in big data
Graph Databases: Connecting the Dots in Big Data
Darren Wood Chief Architect, InfiniteGraph
Relationships are everywhere
CRM, Sales & Marketing
Network Mgmt, Telecom
Intelligence (Government & Business)
Finance
Healthcare Research: Genomics
Social Networks
PLM (Product Lifecycle Mgmt)
Graph Databases
• Not Really Graph Problems
– Average age of my customers that purchased X
– Which zip code buys the most of Y
• Graph Problems
– How is person A connected to person B
– Can suspect Y be associated with location Z
– Who are influencers within a social network?
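The question "How is person A connected to person B" can be illustrated with a plain breadth-first search over an in-memory adjacency list. This is a minimal sketch only: the class `ConnectionFinder` and its methods are hypothetical names made up for the example and are not part of any graph database API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration class; not part of any graph database API.
class ConnectionFinder {
    // Undirected adjacency list: person -> people directly connected to them.
    private final Map<String, List<String>> adjacency = new HashMap<>();

    void connect(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    // Breadth-first search: returns the shortest chain of people from start
    // to target, or an empty list if the two are not connected.
    List<String> shortestPath(String start, String target) {
        Map<String, String> parent = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                // Walk the parent links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String p = current; p != null; p = parent.get(p)) {
                    path.add(p);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : adjacency.getOrDefault(current, Collections.emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        ConnectionFinder graph = new ConnectionFinder();
        graph.connect("A", "C");
        graph.connect("C", "B");
        System.out.println(graph.shortestPath("A", "B"));
    }
}
```

The aggregate questions in the first bullet need no traversal at all, which is why they fit a row/column store well.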
Copyright © InfiniteGraph
Graph Databases
• Optimized around data relationships
– Relationships as first class citizens
– Super fast traversal between entities
– Rich/flexible annotation of connections
• Small focused API (typically not SQL)
– Natively work with concepts of Vertex/Edge
– SQL has no concept of “navigation”
– Most attempts based in SQL are convoluted
Physical Storage Comparison
Rows/Columns/Tables

Meetings
P1     Place   Time     P2
Alice  Denver  5-27-10  Bob

Calls
From  Time   Duration  To
Bob   13:20  25        Carlos
Bob   17:10  15        Charlie

Payments
From    Date     Amount  To
Carlos  5-12-10  100000  Charlie

Relationship/Graph Optimized

Alice --[Met 5-27-10]--> Bob
Bob --[Called 13:20]--> Carlos
Bob --[Called 17:10]--> Charlie
Carlos --[Paid 100000]--> Charlie
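The difference between the two layouts can be sketched in a few lines of Java. This is an illustration under simplifying assumptions, not a model of any real storage engine: the table lookup is a naive scan (a real relational engine would normally use an index), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; neither method reflects a real storage engine.
class StorageComparison {
    // One record of the Calls table (From, Time, Duration, To).
    static final class Call {
        final String from;
        final String time;
        final int duration;
        final String to;

        Call(String from, String time, int duration, String to) {
            this.from = from;
            this.time = time;
            this.duration = duration;
            this.to = to;
        }
    }

    // The two rows of the Calls table from the slide.
    static List<Call> callsTable() {
        return Arrays.asList(
            new Call("Bob", "13:20", 25, "Carlos"),
            new Call("Bob", "17:10", 15, "Charlie"));
    }

    // Table style: inspect every row to find the people a caller reached.
    static List<String> calleesByScan(List<Call> table, String caller) {
        List<String> result = new ArrayList<>();
        for (Call row : table) {
            if (row.from.equals(caller)) {
                result.add(row.to);
            }
        }
        return result;
    }

    // Graph style: store each call with its source vertex once...
    static Map<String, List<Call>> toAdjacency(List<Call> table) {
        Map<String, List<Call>> outgoing = new HashMap<>();
        for (Call c : table) {
            outgoing.computeIfAbsent(c.from, k -> new ArrayList<>()).add(c);
        }
        return outgoing;
    }

    // ...so a lookup touches only the edges held by that vertex.
    static List<String> calleesByAdjacency(Map<String, List<Call>> outgoing, String caller) {
        List<String> result = new ArrayList<>();
        for (Call edge : outgoing.getOrDefault(caller, new ArrayList<>())) {
            result.add(edge.to);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Call> table = callsTable();
        System.out.println(calleesByScan(table, "Bob"));
        System.out.println(calleesByAdjacency(toAdjacency(table), "Bob"));
    }
}
```

Both lookups return the same answer; what changes is how much data has to be touched to produce it.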
Simple API
Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);
[Diagram: Alice meets Bob; Bob calls Carlos and Charlie; Carlos pays Charlie]
Query and Navigation
• Queries – but not as you know them
• More like a rules-based search and discovery
• Asynchronous Results
[Diagram: Alice meets Bob; Bob calls Carlos and Charlie; Carlos pays Charlie]
“Find all paths between Alice and Charlie”
“Find all paths between Alice and Charlie – within 2 degrees”
“Find all paths between Alice and Charlie – events in May 2010”
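The three queries above differ only in their constraints: none, a hop limit, and an edge filter. A minimal in-memory sketch of that idea follows. It is not InfiniteGraph code: `PathFinder`, `Edge`, and `findPaths` are hypothetical names, edges are treated as directed, and the dates attached to the two calls are made up for the example, since the slides give only times for them.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration class; not part of the InfiniteGraph API.
class PathFinder {
    static final class Edge {
        final String from;
        final String label;
        final LocalDate date;
        final String to;

        Edge(String from, String label, LocalDate date, String to) {
            this.from = from;
            this.label = label;
            this.date = date;
            this.to = to;
        }
    }

    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    void addEdge(String from, String label, LocalDate date, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(from, label, date, to));
    }

    // Returns every simple path (as vertex names) from start to target that
    // uses at most maxHops edges, all of which pass edgeFilter.
    List<List<String>> findPaths(String start, String target, int maxHops, Predicate<Edge> edgeFilter) {
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        search(start, target, maxHops, edgeFilter, path, results);
        return results;
    }

    private void search(String current, String target, int hopsLeft, Predicate<Edge> edgeFilter,
                        List<String> path, List<List<String>> results) {
        if (current.equals(target)) {
            results.add(new ArrayList<>(path));
            return;
        }
        if (hopsLeft == 0) {
            return;
        }
        for (Edge e : outgoing.getOrDefault(current, new ArrayList<>())) {
            if (!edgeFilter.test(e) || path.contains(e.to)) {
                continue; // rejected by the filter, or would revisit a vertex
            }
            path.add(e.to);
            search(e.to, target, hopsLeft - 1, edgeFilter, path, results);
            path.remove(path.size() - 1);
        }
    }

    // The four-person example graph. Call dates are ASSUMED for illustration.
    static PathFinder sample() {
        PathFinder g = new PathFinder();
        g.addEdge("Alice", "Met", LocalDate.of(2010, 5, 27), "Bob");
        g.addEdge("Bob", "Called", LocalDate.of(2010, 5, 28), "Carlos");   // assumed date
        g.addEdge("Bob", "Called", LocalDate.of(2010, 6, 2), "Charlie");   // assumed date
        g.addEdge("Carlos", "Paid", LocalDate.of(2010, 5, 12), "Charlie");
        return g;
    }

    public static void main(String[] args) {
        PathFinder g = sample();
        Predicate<Edge> any = e -> true;
        Predicate<Edge> may2010 = e -> e.date.getYear() == 2010 && e.date.getMonthValue() == 5;
        System.out.println(g.findPaths("Alice", "Charlie", Integer.MAX_VALUE, any));
        System.out.println(g.findPaths("Alice", "Charlie", 2, any));
        System.out.println(g.findPaths("Alice", "Charlie", Integer.MAX_VALUE, may2010));
    }
}
```

With the assumed dates, the unconstrained query yields two paths, the two-hop limit keeps only Alice, Bob, Charlie, and the May 2010 filter keeps only the path through Carlos.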
Navigation Example
// Create a qualifier that describes the target vertex
Qualifier findCharliePredicate =
    new VertexPredicate(personType, "name == 'Charlie'");

// Construct a navigator which starts with Alice and uses a result qualifier
// to find all paths in the graph to Charlie
Navigator charlieFinder = alice.navigate(
    Guide.SIMPLE_BREADTH_FIRST,  // default guide
    Qualifier.ANY,               // no path constraints
    findCharliePredicate,        // find paths ending with Charlie
    myResultHandler);            // fire results to supplied handler

// Start the navigator
charlieFinder.start();
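To make the roles of the four arguments concrete, here is a self-contained sketch of the same shape: a breadth-first walk, a path qualifier that prunes, a result qualifier that selects, and a handler that receives results while the walk runs on another thread. It is a guess at the general pattern, not the InfiniteGraph implementation; `NavigatorSketch` and everything inside it are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration class; not the InfiniteGraph Navigator.
class NavigatorSketch {
    private final Map<String, List<String>> outgoing;
    private final String start;
    private final Predicate<List<String>> pathQualifier;  // which partial paths to keep following
    private final Predicate<String> resultQualifier;      // which end vertex counts as a result
    private final Consumer<List<String>> resultHandler;   // called once per qualifying path

    NavigatorSketch(Map<String, List<String>> outgoing, String start,
                    Predicate<List<String>> pathQualifier,
                    Predicate<String> resultQualifier,
                    Consumer<List<String>> resultHandler) {
        this.outgoing = outgoing;
        this.start = start;
        this.pathQualifier = pathQualifier;
        this.resultQualifier = resultQualifier;
        this.resultHandler = resultHandler;
    }

    // Runs the walk on a background thread; the future completes when it ends.
    CompletableFuture<Void> start() {
        return CompletableFuture.runAsync(this::walk);
    }

    // Breadth-first expansion of paths, so shorter results are reported first.
    private void walk() {
        Queue<List<String>> frontier = new ArrayDeque<>();
        frontier.add(Collections.singletonList(start));
        while (!frontier.isEmpty()) {
            List<String> path = frontier.remove();
            String last = path.get(path.size() - 1);
            if (resultQualifier.test(last)) {
                resultHandler.accept(path);
                continue; // do not extend a path past the target
            }
            for (String next : outgoing.getOrDefault(last, Collections.emptyList())) {
                if (path.contains(next)) {
                    continue; // keep paths simple
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                if (pathQualifier.test(extended)) {
                    frontier.add(extended);
                }
            }
        }
    }

    // Directed version of the four-person example graph.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("Alice", Arrays.asList("Bob"));
        g.put("Bob", Arrays.asList("Carlos", "Charlie"));
        g.put("Carlos", Arrays.asList("Charlie"));
        return g;
    }

    public static void main(String[] args) {
        List<List<String>> found = new CopyOnWriteArrayList<>();
        NavigatorSketch nav = new NavigatorSketch(
            sampleGraph(), "Alice",
            path -> true,                   // no path constraints
            name -> name.equals("Charlie"), // find paths ending with Charlie
            found::add);                    // collect results as they arrive
        nav.start().join();                 // wait here only for the demo
        System.out.println(found);
    }
}
```

A real application would typically not block on the future; the handler lets it act on each path as soon as it is discovered.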
Navigational Query Performance
Scaling Graphs – Getting Data In
[Diagram: ingest architecture. Stack labels: IG Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement; Standard Blocking Ingest/Placement (MDP Plugin); Objectivity/DB.
App-1 (Ingest V1), App-2 (Ingest V2) and App-3 (Ingest V3) write vertices V1, V2, V3.
App-1 (E12 {V1, V2}), App-2 (E23 {V2, V3}) and App-3 write edges E12, E23.]
Accelerated Ingest
[Diagram: accelerated ingest architecture. Stack labels: IG Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement (Standard); Placement (Accelerated).
Vertices V1, V2, V3 and edges E12, E23.
Distributed Pipelines with Staging Containers and Pipeline Containers, shown holding two groups of edges:
E(1->2), E(3->1), E(2->3), E(2->1), E(2->3), E(3->1), E(1->2), E(3->2)
E(1->2), E(2->3), E(3->1), E(2->1), E(2->3), E(3->1), E(3->2), E(1->2)]
Choose Your Own Consistency…
// Describe your requested model using policies
PolicyChain myPolicies = new PolicyChain(new EdgePipeliningPolicy(true));

// Start a transaction with the policies you want
Transaction tx = myGraph.beginTransaction(AccessMode.READ_WRITE, myPolicies);

// This code doesn't change, can be used with any policies
alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);

tx.commit();
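One plausible reading of the trade-off, sketched in plain Java: with pipelining off, an edge updates both endpoints before the call returns; with pipelining on, the edge is queued and applied later, so readers see it only after the pipeline drains. The sketch is an assumption about the general idea and says nothing about how `EdgePipeliningPolicy` is actually implemented; `PipelinedGraphSketch`, `flush`, and `neighbours` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not InfiniteGraph code.
class PipelinedGraphSketch {
    // Vertex -> neighbours that are already visible to readers.
    private final Map<String, List<String>> visible = new HashMap<>();
    // Edges that were accepted but not yet applied to their endpoints.
    private final List<String[]> pipeline = new ArrayList<>();
    private final boolean pipelining;

    PipelinedGraphSketch(boolean pipelining) {
        this.pipelining = pipelining;
    }

    // The calling code is identical under either policy.
    void addEdge(String from, String to) {
        if (pipelining) {
            pipeline.add(new String[] {from, to}); // deferred: no vertex touched yet
        } else {
            apply(from, to);                       // blocking: both endpoints updated now
        }
    }

    // Drains the deferred edges; stands in for background pipeline processing.
    void flush() {
        for (String[] edge : pipeline) {
            apply(edge[0], edge[1]);
        }
        pipeline.clear();
    }

    private void apply(String from, String to) {
        visible.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        visible.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    List<String> neighbours(String vertex) {
        return visible.getOrDefault(vertex, new ArrayList<>());
    }

    public static void main(String[] args) {
        PipelinedGraphSketch graph = new PipelinedGraphSketch(true);
        graph.addEdge("Alice", "Bob");
        System.out.println(graph.neighbours("Alice")); // nothing visible yet
        graph.flush();
        System.out.println(graph.neighbours("Alice")); // edge visible after the drain
    }
}
```

The appeal of deferring is that writers never wait on the endpoint vertices; the cost is that reads may briefly lag behind writes.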
Indexing Framework
• Focused on providing choice!
• Manual Indexes for grouping data
• Automatic Indexes for cross population
• Query interface with qualification language
• Pluggable query operators
• External index support (Lucene)
InfiniteGraph Visualizer
Scaling Graphs – Distributed Navigation
• Graph algorithms naturally branch
• Requires orchestration of threads/agents
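The branching can be seen in a level-by-level traversal where every vertex on the current frontier is expanded as its own task, and a coordinator merges the results before the next level starts. The sketch below uses a local thread pool as a stand-in for distributed agents; that substitution, and the names `ParallelTraversal` and `reachable`, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; a thread pool stands in for remote agents.
class ParallelTraversal {
    // Returns every vertex reachable from start, expanding each frontier
    // vertex in a separate task and merging results on the calling thread.
    static Set<String> reachable(Map<String, List<String>> outgoing, String start, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Set<String> visited = new HashSet<>();
            visited.add(start);
            List<String> frontier = new ArrayList<>();
            frontier.add(start);
            while (!frontier.isEmpty()) {
                // Branch: one task per frontier vertex.
                List<Callable<List<String>>> tasks = new ArrayList<>();
                for (String vertex : frontier) {
                    tasks.add(() -> outgoing.getOrDefault(vertex, Collections.emptyList()));
                }
                // Orchestrate: wait for the whole level, then merge and deduplicate.
                List<String> next = new ArrayList<>();
                for (Future<List<String>> result : pool.invokeAll(tasks)) {
                    for (String neighbour : result.get()) {
                        if (visited.add(neighbour)) {
                            next.add(neighbour);
                        }
                    }
                }
                frontier = next;
            }
            return visited;
        } catch (InterruptedException | ExecutionException e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            throw new IllegalStateException("traversal failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("Alice", Arrays.asList("Bob"));
        graph.put("Bob", Arrays.asList("Carlos", "Charlie"));
        graph.put("Carlos", Arrays.asList("Charlie"));
        System.out.println(reachable(graph, "Alice", 4));
    }
}
```

The merge step is the orchestration cost: the wider the graph branches, the more partial results have to be collected and deduplicated at each level.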
[Diagram: the example graph of Alice, Bob, Carlos and Charlie (Meets, Calls, Pays) extended with Dave, Eve and Chuck and further Calls, Lives With and Meets edges]
Distributed API
Application(s)
Partition 1 Partition 3 Partition 2 Partition ...n
Processor Processor Processor Processor
Big Distributed Data (Traditional - Huge Generalization)
Distributed API
Application(s)
Partition 1 Partition 3 Partition 2 Partition ...n
Processor Processor Processor Processor
Big Distributed Data (Graph)
Some customers and partners