NoSQL Now! presentation, August 24, 2011: Graph Databases: Connecting the Dots in Big Data
DESCRIPTION
Darren Wood is the Architect and Lead Developer of InfiniteGraph, the distributed graph database produced by Objectivity, Inc. Darren has spent the majority of his career architecting and building distributed systems with an emphasis on elastic scalability and data management. Prior to joining Objectivity, Inc. in 2007, Darren held positions as a Senior Consultant with IONA Technologies and a Development Team Lead for Citect Australia. Darren holds a First Class Honors Degree in Computer Systems Engineering from the University of Technology in Sydney, Australia.
TRANSCRIPT
Graph Databases: Connecting the Dots in Big Data
Darren Wood, Chief Architect, InfiniteGraph
Relationships are everywhere
Graph Databases
• Not Really Graph Problems
– Average age of my customers that purchased X
– Which zip code buys the most of Y
• Graph Problems
– How is person A connected to person B
– Can suspect Y be associated with location Z
– Who are influencers within a social network?
Copyright © InfiniteGraph
Graph Databases
• Optimized around data relationships
– Relationships as first class citizens
– Super fast traversal between entities
– Rich/flexible annotation of connections
• Small focused API (typically not SQL)
– Natively work with concepts of Vertex/Edge
– SQL has no concept of “navigation”
– Most attempts based in SQL are convoluted
Physical Storage Comparison
Rows/Columns/Tables

Meetings
P1      Place    Time      P2
Alice   Denver   5-27-10   Bob

Calls
From    Time     Duration  To
Bob     13:20    25        Carlos
Bob     17:10    15        Charlie

Payments
From    Date     Amount    To
Carlos  5-12-10  100000    Charlie

Relationship/Graph Optimized

Alice --Met 5-27-10--> Bob
Bob --Called 13:20--> Carlos
Bob --Called 17:10--> Charlie
Carlos --Paid 100000--> Charlie
Simple API
Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);
Alice --Meets--> Bob --Calls--> Carlos --Pays--> Charlie
Bob --Calls--> Charlie
Query and Navigation
• Queries – but not as you know them
• More like a rules based search and discovery
• Asynchronous Results
Alice --Meets--> Bob --Calls--> Carlos --Pays--> Charlie
Bob --Calls--> Charlie
“Find all paths between Alice and Charlie”
“Find all paths between Alice and Charlie – within 2 degrees”
“Find all paths between Alice and Charlie – events in May 2010”
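The first two queries above can be illustrated with a small plain-Java sketch. It is not the InfiniteGraph API: `PathSketch`, `Edge`, `connect`, `findPaths` and `slideGraph` are hypothetical names, and treating the four relationships as directed edges is an assumption made only to keep the example short. The edge filter stands in for constraints such as the "events in May 2010" query, which would test an attribute stored on each edge.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative only: these types are hypothetical and not part of the InfiniteGraph API.
final class PathSketch {

    // A directed, labelled connection between two named vertices.
    static final class Edge {
        final String from;
        final String to;
        final String label;

        Edge(String from, String to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    private final Map<String, List<Edge>> outgoing = new LinkedHashMap<>();

    void connect(String from, String to, String label) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(from, to, label));
    }

    // Returns every simple path from start to target that uses at most maxHops
    // edges, following only edges accepted by edgeFilter.
    List<List<String>> findPaths(String start, String target, int maxHops,
                                 Predicate<Edge> edgeFilter) {
        List<List<String>> results = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(start, target, maxHops, edgeFilter, path, results);
        return results;
    }

    private void walk(String current, String target, int hopsLeft,
                      Predicate<Edge> edgeFilter, List<String> path,
                      List<List<String>> results) {
        if (current.equals(target)) {
            results.add(new ArrayList<>(path)); // copy: path is mutated while backtracking
            return;
        }
        if (hopsLeft == 0) {
            return; // degree limit reached on this branch
        }
        for (Edge e : outgoing.getOrDefault(current, Collections.<Edge>emptyList())) {
            if (!edgeFilter.test(e) || path.contains(e.to)) {
                continue; // edge rejected, or it would revisit a vertex
            }
            path.add(e.to);
            walk(e.to, target, hopsLeft - 1, edgeFilter, path, results);
            path.remove(path.size() - 1);
        }
    }

    // The four relationships from the slides.
    static PathSketch slideGraph() {
        PathSketch g = new PathSketch();
        g.connect("Alice", "Bob", "Meets");
        g.connect("Bob", "Carlos", "Calls");
        g.connect("Carlos", "Charlie", "Pays");
        g.connect("Bob", "Charlie", "Calls");
        return g;
    }
}
```

With this graph, an unbounded search from Alice to Charlie finds two paths (via Bob and Carlos, and via Bob directly), while a limit of 2 hops leaves only Alice, Bob, Charlie.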
Navigation Example
// Create a qualifier that describes the target vertex
Qualifier findCharliePredicate =
    new VertexPredicate(personType, "name == 'Charlie'");

// Construct a navigator which starts with Alice and uses a result qualifier
// to find all paths in the graph to Charlie
Navigator charlieFinder = alice.navigate(
    Guide.SIMPLE_BREADTH_FIRST, // default guide
    Qualifier.ANY,              // no path constraints
    findCharliePredicate,       // find paths ending with Charlie
    myResultHandler);           // fire results to supplied handler

// Start the navigator
charlieFinder.start();
Navigational Query Performance
Scaling Graphs – Getting Data In
[Diagram: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management, Placement) over Standard Blocking Ingest/Placement (MDP Plugin) on Objectivity/DB. App-1, App-2 and App-3 ingest vertices V1, V2 and V3; App-1 then writes edge E12 between V1 and V2, and App-2 writes edge E23 between V2 and V3.]
Accelerated Ingest
[Diagram: IG Core/API (Configuration, Navigation Execution, Management Extensions, Session / TX Management) with two placement options, Placement (Standard) and Placement (Accelerated). Vertices V1, V2, V3 and edges E12, E23 are written through Distributed Pipelines: edges such as E(1->2), E(2->1), E(2->3), E(3->1) and E(3->2) pass from Staging Containers into Pipeline Containers.]
Choose Your Own Consistency…
// Describe your requested model using policies
PolicyChain myPolicies =
    new PolicyChain(new EdgePipeliningPolicy(true));

// Start a transaction with the policies you want
Transaction tx = myGraph.beginTransaction(
    AccessMode.READ_WRITE, myPolicies);

// This code doesn't change, can be used with any policies
alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);

tx.commit();
Indexing Framework
• Focused on providing choice!
• Manual Indexes for grouping data
• Automatic Indexes for cross population
• Query interface with qualification language
• Pluggable query operators
• External index support (Lucene)
InfiniteGraph Visualizer
Scaling Graphs – Distributed Navigation
• Graph algorithms naturally branch
• Requires orchestration of threads/agents
[Diagram: the Alice, Bob, Carlos, Charlie graph (Meets, Calls, Pays, Calls) extended with Dave, Eve and Chuck through additional Calls, Lives With and Meets edges.]
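The branching behaviour described on this slide can be sketched in plain Java. This is only an in-process analogy, not InfiniteGraph's distributed navigation: `BranchingSketch`, `hopDistances` and `slideGraph` are hypothetical names, a local thread pool stands in for remote agents, and the graph uses only the four edges stated earlier in the deck (the connections to Dave, Eve and Chuck are not recoverable from the transcript). Each vertex on the current frontier is expanded as an independent task, and a merge step coordinates the results before the next hop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a local thread pool mimics the agents a distributed engine would use.
final class BranchingSketch {

    // Outgoing neighbours for the four edges shown earlier in the deck.
    static Map<String, List<String>> slideGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("Alice", Arrays.asList("Bob"));
        g.put("Bob", Arrays.asList("Carlos", "Charlie"));
        g.put("Carlos", Arrays.asList("Charlie"));
        return g;
    }

    // Level-by-level traversal: returns the hop count from start to each reachable vertex.
    static Map<String, Integer> hopDistances(Map<String, List<String>> adjacency,
                                             String start, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            Map<String, Integer> distance = new LinkedHashMap<>();
            distance.put(start, 0);
            List<String> frontier = Collections.singletonList(start);
            for (int hop = 1; !frontier.isEmpty(); hop++) {
                // Branch: one task per frontier vertex, each expanded independently.
                List<Future<List<String>>> branches = new ArrayList<>();
                for (String vertex : frontier) {
                    Callable<List<String>> expand =
                            () -> adjacency.getOrDefault(vertex, Collections.<String>emptyList());
                    branches.add(pool.submit(expand));
                }
                // Orchestrate: merge branch results and drop vertices already reached.
                List<String> next = new ArrayList<>();
                for (Future<List<String>> branch : branches) {
                    for (String neighbour : branch.get()) {
                        if (distance.putIfAbsent(neighbour, hop) == null) {
                            next.add(neighbour);
                        }
                    }
                }
                frontier = next;
            }
            return distance;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a branch", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a branch failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `BranchingSketch.hopDistances(BranchingSketch.slideGraph(), "Alice", 4)` expands Bob's two outgoing edges as separate branches in the second hop. The merge loop is the single-threaded coordination point, which is the part that becomes harder once branches run on different machines.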
Big Distributed Data (Traditional - Huge Generalization)
[Diagram: Application(s) call a Distributed API over Partition 1, Partition 2, Partition 3, Partition ...n, each served by its own Processor.]
Big Distributed Data (Graph)
[Diagram: Application(s) call a Distributed API over Partition 1, Partition 2, Partition 3, Partition ...n, each served by its own Processor.]
Some customers and partners