getting started with graph databases

www.Objectivity.com: Getting Started with Graph Databases, Nick Quinn, Principal Engineer, InfiniteGraph

Upload: infinitegraph

Post on 27-Jan-2015

DESCRIPTION

Exploit graph databases to discover value in complex Big Data. Lunch will be provided while you discover the power of graph database technology for your Big Data needs. Bring your charged laptop to this meetup to walk through how to get started with InfiniteGraph. Nick Quinn, Senior Software Developer for InfiniteGraph, will walk you through the initial installation of InfiniteGraph and the HelloGraph sample to get you started with your graph database. Download InfiniteGraph for free here: http://www.objectivity.com/downloads

Once we get through the tutorial, there will be time for Q&A and more hands-on support from additional members of the InfiniteGraph technical team. If you have a complex Big Data problem and want to discover deeper connections and relationships within your data to create next-generation applications for social networks, healthcare, finance, telecom and security, this is a must-attend event! Get started quickly with our enterprise-proven, massively scalable, distributed graph database!

TRANSCRIPT

Page 1: Getting Started with Graph Databases

www.Objectivity.com

Getting Started with Graph Databases

Nick Quinn

Principal Engineer, InfiniteGraph

04/10/2023 1

Page 2: Getting Started with Graph Databases

What are we talking about today?

• Big Data and Databases
• What is a Graph Database?
• What is InfiniteGraph?
• Demo and Q&A – Hands On
  – Installing InfiniteGraph
    • https://download.infinitegraph.com
  – FlightPlan Sample
    • http://wiki.infinitegraph.com “Download Examples” FlightPlanSample.zip

Images Courtesy of IMDB (www.imdb.com)

Page 3: Getting Started with Graph Databases


NoSQL 2013

• Developers are embracing choice
• More than Dynamo and BigTable clones
• Incorporates specialized data models like Document, Object and Graph
• 100+ projects and products (Wikipedia)
• ~250 Meetup.com Groups (5 meetups this week!)
• NoSQL fans consume 12% of the world’s Beer & Pizza

Page 4: Getting Started with Graph Databases


NoSQL and BigData – What’s the Connection ?

• Making big data “appear” smaller
• Partitioning, replication & distributed query
• Storage model optimizations
• Consistency trade-offs
• Simplified query models
• Dynamic views

“Big data is a loosely-defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools” (Wikipedia)

Page 5: Getting Started with Graph Databases


The Specialist !

• Everyone specializes
  – Doctors, Lawyers, Bankers, Developers
• Why was data so normalized for so long?
• NoSQL is all about the data specialist
• Specializing in…
  – Distribution / deployment
  – Physical data storage
  – Logical data model
  – Query mechanism

Page 6: Getting Started with Graph Databases


Polyglot NoSQL Architectures

[Diagram labels: Users; Applications; Business; External / Legacy Data; Transformation \ MDM; Distributed Data Processing Platform; Document; Graph Database; RDBMS; Partitioned Distributed DB (often Document / KV)]

Page 7: Getting Started with Graph Databases


NoSQL Landscape - How it all stacks up!

Data Model           Performance  Scalability  Flexibility  Complexity  Functionality
Key–value Stores     high         high         high         none        variable (none)
Column Store         high         high         moderate     low         minimal
Document Store       high         variable     high         low         variable (low)
Graph Database       variable     variable     high         high        graph theory
Relational Database  variable     variable     low          moderate    relational algebra

From: http://wikipedia.org/wiki/NoSQL

Page 8: Getting Started with Graph Databases


Navigational Query Performance

Page 9: Getting Started with Graph Databases


The Physical Data Model

• Becoming a relationship specialist…

Rows/Columns/Tables

Meetings
P1     P2   Place   Time
Alice  Bob  Denver  5-27-10

Calls
From  To       Time   Duration
Bob   Carlos   13:20  25
Bob   Charlie  17:10  15

Payments
From    To       Date     Amount
Carlos  Charlie  5-12-10  100000

Relationship/Graph Optimized

Alice --[Met 5-27-10]-- Bob
Bob --[Called 13:20]--> Carlos
Bob --[Called 17:10]--> Charlie
Carlos --[Paid 100000]--> Charlie
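
The contrast on this slide between rows in tables and relationship-optimized storage can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names (RelationshipSketch, CallRow, calledByScan, buildAdjacency) are hypothetical and are not part of InfiniteGraph. It shows the same "Calls" data answered once by scanning rows and once by following a per-person adjacency list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RelationshipSketch {
    // One row of the "Calls" table from the slide (hypothetical class name).
    static final class CallRow {
        final String from;
        final String to;
        final String time;

        CallRow(String from, String to, String time) {
            this.from = from;
            this.to = to;
            this.time = time;
        }
    }

    // Row-oriented lookup: every query scans the whole table.
    static List<String> calledByScan(List<CallRow> calls, String person) {
        List<String> result = new ArrayList<>();
        for (CallRow row : calls) {
            if (row.from.equals(person)) {
                result.add(row.to);
            }
        }
        return result;
    }

    // Relationship-oriented storage: each caller keeps a list of the people called,
    // so a later lookup touches only that caller's own edges.
    static Map<String, List<String>> buildAdjacency(List<CallRow> calls) {
        Map<String, List<String>> adjacency = new HashMap<>();
        for (CallRow row : calls) {
            adjacency.computeIfAbsent(row.from, k -> new ArrayList<>()).add(row.to);
        }
        return adjacency;
    }

    public static void main(String[] args) {
        List<CallRow> calls = List.of(
            new CallRow("Bob", "Carlos", "13:20"),
            new CallRow("Bob", "Charlie", "17:10"));

        System.out.println(calledByScan(calls, "Bob"));        // [Carlos, Charlie]
        System.out.println(buildAdjacency(calls).get("Bob"));  // [Carlos, Charlie]
    }
}
```

Both calls return the same answer; the difference is that the scan grows with the size of the table, while the adjacency lookup grows only with the number of relationships on the starting vertex.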

Page 10: Getting Started with Graph Databases


Sometimes Big Data is just Fast Data !

• Some data is only actionable momentarily
  – Intelligence
  – IT Security
  – Site/page visit
  – Financial / trading behavior
• Presents a different type of challenge
• Latency of batch data processing becomes problematic

Page 11: Getting Started with Graph Databases


Scaling Writes

• Big/Fast data demands write performance
• Most NoSQL solutions allow you to scale writes by…
  – Partitioning the data
  – Understanding your consistency requirements
  – Allowing you to defer conflicts
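
As a rough illustration of the first point, partitioning the data, here is a minimal Java sketch of hash partitioning. It is an assumption-laden example, not how any particular NoSQL product (or InfiniteGraph) places data: the names PartitionSketch, partitionFor and partition are hypothetical. The idea is only that each key maps deterministically to one partition, so independent writers can work on different partitions without contending.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSketch {
    // Map a key to one of `partitions` buckets.
    // Math.floorMod keeps the index non-negative even when hashCode() is negative.
    static int partitionFor(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Distribute keys into buckets; each bucket could be written by a separate process.
    static List<List<String>> partition(List<String> keys, int partitions) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(partitionFor(key, partitions)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> people = List.of("Alice", "Bob", "Carlos", "Charlie");
        List<List<String>> buckets = partition(people, 3);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("partition " + i + ": " + buckets.get(i));
        }
    }
}
```

Hash partitioning spreads unrelated keys evenly, but it ignores relationships between keys, which is the reason graph placement needs more care than key-value placement.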

Page 12: Getting Started with Graph Databases

Why a Graph Database ?


Page 13: Getting Started with Graph Databases


Relationships are everywhere

• CRM, Sales & Marketing
• Network Mgmt, Telecom
• Intelligence (Government & Business)
• Finance
• Healthcare
• Research: Genomics
• Social Networks
• PLM (Product Lifecycle Mgmt)

Page 14: Getting Started with Graph Databases


Exploding Connections

• More often than not… graphs are big !

Page 15: Getting Started with Graph Databases

The Graph Database Landscape

Copyright © InfiniteGraph

• Neo4J
• Titan (Aurelius)
• AllegroGraph (RDF)
• FlockDB (Twitter)
• DEX (Sparsity)
• OrientDB (Document)
• + 24 others (from wikipedia.org)

Page 16: Getting Started with Graph Databases

The Graph Database Landscape Cont’d


• Graph Analytics: High latency, Batch Processing, offline
  – Apache Giraph
  – GraphLab
  – Intel’s Graph Builder
• Visual Analytics: In Memory, High Performance, Poor Scalability
  – Tom Sawyer
  – D3JS
  – KeyLines
  – InfoVis
• TinkerPop stack (Blueprints/Gremlin)
  – 16 implementations and counting…

Page 17: Getting Started with Graph Databases


Why InfiniteGraph™?

• Objectivity/DB is a proven foundation
  – Building highly connected databases since 1993
  – A complete database management system
    • Concurrency, transactions, cache, schema, query, indexing
• It’s a Graph Specialist!
  – Simple but powerful API tailored for navigation through data
  – Easy to configure distribution model

Page 18: Getting Started with Graph Databases


InfiniteGraph™ Basic Architecture

[Diagram labels: User Apps; Blueprints; InfiniteGraph - Core/API; Configuration; Navigation Execution; Management Extensions; Session / TX Management; Placement; Distributed Object and Relationship Persistence Layer]

Page 19: Getting Started with Graph Databases


Fully Distributed Data Model

[Diagram labels: AddVertex(); IG Core/API; ADP Placement; Distributed Object and Relationship Persistence Layer; HostA, HostB, HostC, HostX; Zone 1, Zone 2]

Page 20: Getting Started with Graph Databases


InfiniteGraph is a Complete Database

• InfiniteGraph helps manage the things you don’t want to do, but want to have done:
  – Concurrency
    • Transactions (commit/rollback)
    • Controlled multi-user reading during updates
  – Schema Control
    • Build complex data structures, make changes easily and migrate existing data
  – Distribution
    • Sharing large amounts of distributed data between distributed processes
  – Indexes
    • Choose built-in key-value, b-tree or other indexes
  – Cache
    • Keep large sections of the graphs in configurable memory caches

Page 21: Getting Started with Graph Databases


Scaling Graph Writes

[Diagram labels: InfiniteGraph; Objectivity/DB Persistence Layer; App-1 (Ingest V1), App-2 (Ingest V2), App-3 (Ingest V3); vertices V1, V2, V3; App-1 (E12 {V1 V2}), App-2 (E23 {V2 V3}), App-3; edges E12, E23]

Page 22: Getting Started with Graph Databases


High Performance Edge Ingest

[Diagram labels: IG Core/API; Pipeline Containers holding queued edges E(1->2), E(2->3), E(3->1), E(2->1), E(3->2); Pipeline Agent; Target Containers C1, C2, C3; edges E12, E23]

Page 23: Getting Started with Graph Databases


Result…

[Chart: “Nodes and Edges per second” (y-axis, 0 to 500000) against “# of Processes” (x-axis); labels include 1, 2, 4 and 8 clients and Single Host, 2 Hosts, 4 Hosts, 8 Hosts]

Page 24: Getting Started with Graph Databases

Scaling Reads and Query

Partitioning and Read Replicas… easy, right?

[Diagram labels: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; Processor (×4)]

Page 25: Getting Started with Graph Databases


Why are Graphs Different?

[Diagram labels: Application(s); Distributed API; Partition 1, Partition 2, Partition 3, Partition ...n; Processor (×4)]

Page 26: Getting Started with Graph Databases


Optimizing Distributed Navigation

• Detect local hops and perform in-memory traversal
  – Intelligently cache frequently accessed remote data
• Route tasks to other hosts when it is optimal

[Diagram labels: Application; Distributed API; Processor with Partition 1; Processor with Partition 2; vertices A, B, C, D, E, F, G, X, Y; path P(A,B,C,D)]

Page 27: Getting Started with Graph Databases


Super Simple API

// Person is an application-defined vertex class; helloGraphDB is an open graph database handle.
Person alice = new Person("Alice");
helloGraphDB.addVertex(alice);

Person bob = new Person("Bob");
helloGraphDB.addVertex(bob);

Person carlos = new Person("Carlos");
helloGraphDB.addVertex(carlos);

Person charlie = new Person("Charlie");
helloGraphDB.addVertex(charlie);

Page 28: Getting Started with Graph Databases


Adding Edges

// General form: connect vertexA to vertexB with an edge of a given kind and weight.
MyEdgeType edge = new MyEdgeType();
vertexA.addEdge(edge, vertexB, EdgeKind.???, weight);

// A meeting between Alice and Bob.
Meeting denverMeeting = new Meeting("Denver", "5-27-10");
alice.addEdge(denverMeeting, bob, EdgeKind.BIDIRECTIONAL, (short)1);

// A call between Bob and Carlos.
Call bobToCarlos = new Call(getRandomJulyTime());
bob.addEdge(bobToCarlos, carlos, EdgeKind.OUTGOING, (short)0);

// A payment between Carlos and Charlie.
Payment payment = new Payment(10000.00);
carlos.addEdge(payment, charlie, EdgeKind.OUTGOING, (short)2);

// A call between Bob and Charlie.
Call bobToCharlie = new Call(getRandomJulyTime());
bob.addEdge(bobToCharlie, charlie, EdgeKind.INCOMING, (short)0);

Page 29: Getting Started with Graph Databases


The Result…

Page 30: Getting Started with Graph Databases


Graph Traversal (Navigation) Queries

• Use an instance of the Navigator class to perform a navigation query.
• A navigation instance is highly customizable, but is composed of the following basic parts:
  – The vertex from which to start the navigation query.
  – A guide strategy, which is a high-level navigational aid. You can create a custom guide or use one of several built-in guide strategies:
    • Guide.Strategy.NONE
    • Guide.Strategy.SIMPLE_BREADTH_FIRST
    • Guide.Strategy.SIMPLE_DEPTH_FIRST
  – Qualifiers
    • A path qualifier
    • A result qualifier
  – Handlers
    • A result handler
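
The roles of these parts can be illustrated with a small stand-alone Java sketch. It does not use the InfiniteGraph Navigator class; NavigationSketch and navigate are hypothetical names, the graph is a plain adjacency map, edges are treated as directed for simplicity, and the guide is fixed to breadth-first. The path qualifier prunes paths, the result qualifier selects which paths are reported, and the result handler receives each reported path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class NavigationSketch {
    // Breadth-first navigation from `start` (a stand-in for a breadth-first guide strategy).
    static void navigate(Map<String, List<String>> graph,
                         String start,
                         Predicate<List<String>> pathQualifier,
                         Predicate<List<String>> resultQualifier,
                         Consumer<List<String>> resultHandler) {
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(List.of(start));
        while (!queue.isEmpty()) {
            List<String> path = queue.remove();
            if (!pathQualifier.test(path)) {
                continue; // prune: do not report or extend this path
            }
            if (resultQualifier.test(path)) {
                resultHandler.accept(path);
            }
            String last = path.get(path.size() - 1);
            for (String next : graph.getOrDefault(last, List.of())) {
                if (path.contains(next)) {
                    continue; // never revisit a vertex already on this path
                }
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                queue.add(extended);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "Alice", List.of("Bob"),
            "Bob", List.of("Carlos", "Charlie"),
            "Carlos", List.of("Charlie"));

        List<List<String>> found = new ArrayList<>();
        navigate(graph, "Alice",
                 path -> path.size() <= 4,                            // path qualifier
                 path -> path.get(path.size() - 1).equals("Charlie"), // result qualifier
                 found::add);                                         // result handler

        // prints [[Alice, Bob, Charlie], [Alice, Bob, Carlos, Charlie]]
        System.out.println(found);
    }
}
```

Keeping the qualifiers and the handler as separate arguments means the traversal loop stays the same while the question being asked changes.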

Page 31: Getting Started with Graph Databases


Schema – It’s not your enemy ! (well not all the time...)

• Schema vs Schema-less
  – Database religion
  – No time for a full debate here
  – InfiniteGraph supports schema
  – Planning to also support optional properties on schema types
• Graph Views: A Great Use Case for Schema!
  – Filter by type and predicate during navigation
  – Connection Inference!

Page 32: Getting Started with Graph Databases

Graph Views and Bacon!

• Filter out uninteresting projects connected to Kevin Bacon

GraphView view = new GraphView();
//Excludes all instances of TvShow from navigation
view.excludeClass(myDb.getTypeId(TvShow.class.getName()));
//Excludes all movies made for TV/Video
view.excludeClass(myDb.getTypeId(Movie.class.getName()),
    "details.madeForTv || details.madeForVideo");
//Excludes WorkedOn edges, then includes ActedIn w/ characterName not containing "Himself"
view.excludeClass(myDb.getTypeId(WorkedOn.class.getName()));
view.includeClass(myDb.getTypeId(ActedIn.class.getName()),
    "!CONTAINS(characterName, \"Himself\")");

[Diagram labels: Kevin Bacon (Actor); The Following (TV Show); Behind the Scenes (Movie); Apollo 13 (Movie); character names on the edges: Himself, Ryan Hardy, Jack Swigert]
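
The idea of filtering by type and predicate during navigation can be sketched without the InfiniteGraph API. The following is an assumption-based illustration only: ViewSketch, Element, View and their methods are hypothetical, types are plain strings with no class hierarchy, and predicates are Java lambdas rather than predicate strings.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class ViewSketch {
    // A vertex or edge reduced to a type name plus properties (hypothetical model).
    static final class Element {
        final String type;
        final Map<String, ?> properties;

        Element(String type, Map<String, ?> properties) {
            this.type = type;
            this.properties = properties;
        }
    }

    static final class View {
        private final Set<String> excludedTypes = new HashSet<>();
        private final Map<String, Predicate<Element>> excludedWhen = new HashMap<>();
        private final Map<String, Predicate<Element>> includedWhen = new HashMap<>();

        // Exclude every element of this type.
        View excludeType(String type) { excludedTypes.add(type); return this; }

        // Exclude elements of this type only when the predicate matches.
        View excludeType(String type, Predicate<Element> when) { excludedWhen.put(type, when); return this; }

        // Include elements of this type only when the predicate matches.
        View includeType(String type, Predicate<Element> when) { includedWhen.put(type, when); return this; }

        // A traversal would call this before following or reporting an element.
        boolean accepts(Element e) {
            if (excludedTypes.contains(e.type)) return false;
            Predicate<Element> exclude = excludedWhen.get(e.type);
            if (exclude != null && exclude.test(e)) return false;
            Predicate<Element> include = includedWhen.get(e.type);
            return include == null || include.test(e);
        }
    }

    public static void main(String[] args) {
        View view = new View()
            .excludeType("TvShow")
            .excludeType("Movie", e -> Boolean.TRUE.equals(e.properties.get("madeForTv")))
            .includeType("ActedIn",
                e -> !String.valueOf(e.properties.get("characterName")).contains("Himself"));

        System.out.println(view.accepts(new Element("TvShow", Map.of())));
        System.out.println(view.accepts(new Element("Movie", Map.of("madeForTv", true))));
        System.out.println(view.accepts(new Element("ActedIn", Map.of("characterName", "Jack Swigert"))));
    }
}
```

Because the view is consulted during navigation, filtered elements are never followed, which is different from filtering a finished result set afterwards.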

Page 33: Getting Started with Graph Databases


Tools To Suit the Solution

Page 34: Getting Started with Graph Databases

Demo

• Installing InfiniteGraph
• FlightPlan Sample