What's New in Apache TinkerPop - the Graph Computing Framework

What’s New in Apache TinkerPop? Open Source Graph Computing Framework http://tinkerpop.incubator.apache.org/ Stephen Mallette - @spmallette © 2015. All Rights Reserved.

Upload: datastax-academy

Post on 09-Jan-2017

TRANSCRIPT

Page 1: What's New in Apache TinkerPop - the Graph Computing Framework

What’s New in Apache TinkerPop? Open Source Graph Computing Framework

http://tinkerpop.incubator.apache.org/

Stephen Mallette - @spmallette

© 2015. All Rights Reserved.

Page 2: What's New in Apache TinkerPop - the Graph Computing Framework

Page 3: What's New in Apache TinkerPop - the Graph Computing Framework

By Andrea Mann from London, United Kingdom (Flickr Uploaded by Hohum) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

Page 4: What's New in Apache TinkerPop - the Graph Computing Framework

Page 5: What's New in Apache TinkerPop - the Graph Computing Framework

Georgius Agricola, De re metallica 1556

Page 6: What's New in Apache TinkerPop - the Graph Computing Framework

“Woman at spinning wheel with man carding” Smithfield Decretals (British Library, Royal 10 E. IV, fol. 147v), c. 1340

“Carding, Spinning and Weaving” by Giovanni Boccaccio from De claris mulieribus, 15th century

Page 7: What's New in Apache TinkerPop - the Graph Computing Framework

London, British Library, Royal 18 E.iii (15th century) [Public domain], via Wikimedia Commons

Page 8: What's New in Apache TinkerPop - the Graph Computing Framework

[Public domain], via Wikimedia Commons

Page 9: What's New in Apache TinkerPop - the Graph Computing Framework

By Unknown. Photo credit: Yale University Art Gallery. In the Public Domain. [Public domain], via Wikimedia Commons

[Public domain], via Wikimedia Commons

Page 10: What's New in Apache TinkerPop - the Graph Computing Framework

By Dogcow (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

Page 11: What's New in Apache TinkerPop - the Graph Computing Framework

By Adam Schuster (Flickr: Proto IBM) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

By Arnold Reinhold [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons

Page 12: What's New in Apache TinkerPop - the Graph Computing Framework

Page 13: What's New in Apache TinkerPop - the Graph Computing Framework

Graph Data Structure

Vertices:
label: person, name: Stephen
label: book, title: Connections
label: person, name: James

Edges:
label: bought
label: wrote
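
The property graph on this slide can be sketched outside TinkerPop as plain data. The following is a minimal Python model, not the TinkerPop API. The vertex ids and the direction of each edge (James wrote the book, Stephen bought it) are assumptions, since the transcript preserves only the labels and property values.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: Vertex  # tail: the vertex the edge leaves
    in_v: Vertex   # head: the vertex the edge enters


# Vertices from the slide; the ids are made up for illustration.
stephen = Vertex(1, "person", {"name": "Stephen"})
james = Vertex(2, "person", {"name": "James"})
book = Vertex(3, "book", {"title": "Connections"})

# Edge directions are an assumption; the slide text only lists the labels.
edges = [
    Edge("wrote", james, book),
    Edge("bought", stephen, book),
]


def out(vertex, label):
    """Vertices reachable from `vertex` over outgoing edges with `label`."""
    return [e.in_v for e in edges if e.out_v is vertex and e.label == label]


bought_titles = [v.properties["title"] for v in out(stephen, "bought")]
```

Vertices and edges both carry a label, and vertices carry key/value properties. That is all the structure the later traversals rely on.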

Page 14: What's New in Apache TinkerPop - the Graph Computing Framework

TinkerPop 2.0

TinkerPop 3.0

The TinkerPop Stack

Page 15: What's New in Apache TinkerPop - the Graph Computing Framework

The TinkerPop Stack

Page 16: What's New in Apache TinkerPop - the Graph Computing Framework

Gremlin in TinkerPop3

is NOT “just ”

It is advised that you not use ƛ (lambda) expressions

supports BOTH imperative and declarative querying

Page 17: What's New in Apache TinkerPop - the Graph Computing Framework

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin>

Page 18: What's New in Apache TinkerPop - the Graph Computing Framework

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin>

Page 19: What's New in Apache TinkerPop - the Graph Computing Framework

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(gryo()).readGraph('data.kryo')
==>null
gremlin> graph
==>tinkergraph[vertices:1933 edges:4125]
gremlin>

[Schema diagram: vertex types person, discussion, response; edge labels wrote, participatesIn, hasRoot, hasResponse]

Page 20: What's New in Apache TinkerPop - the Graph Computing Framework

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(gryo()).readGraph('data.kryo')
==>null
gremlin> graph
==>tinkergraph[vertices:1933 edges:4125]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:1933 edges:4125], standard]
gremlin>

Page 21: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608)
==>v[4608]

[Diagram: g.V(4608) selects the person vertex with id 4608]

“Find the vertex with id 4608”

Page 22: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).values('userName')
==>Renlit

[Diagram: g.V(4608) selects person vertex 4608; .values('userName') reads its userName property, Renlit]

“Get the value of the ‘userName’ property on vertex 4608”

Page 23: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).out('wrote')
==>v[354560]
==>v[640768]
...
==>v[466432]

[Diagram: g.V(4608) selects person vertex 4608; .out('wrote') follows its wrote edges to response vertices]

“Find the responses posted by ‘Renlit’”

Page 24: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).out('wrote').count()
==>67

[Diagram: g.V(4608) selects person vertex 4608; .out('wrote') follows its wrote edges to response vertices; .count() yields 67]

“Find the number of responses posted by ‘Renlit’”

Page 25: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> t = g.V(4608).out('wrote').count();null
==>null
gremlin> t.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>DedupBijectionStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>ProfileStrategy
==>EngineDependentStrategy
==>ComputerVerificationStrategy
==>StandardVerificationStrategy

Page 26: What's New in Apache TinkerPop - the Graph Computing Framework

Strategy Application

Original Query: g.V(4608).out('wrote').count()

t.strategies.toList():
ConjunctionStrategy
IncidentToAdjacentStrategy
AdjacentToIncidentStrategy
IdentityRemovalStrategy
DedupBijectionStrategy
MatchPredicateStrategy
RangeByIsCountStrategy
TinkerGraphStepStrategy
ProfileStrategy
EngineDependentStrategy
ComputerVerificationStrategy
StandardVerificationStrategy

Applied: AdjacentToIncidentStrategy

Post-Strategies: g.V(4608).outE('wrote').count()
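
The rewrite on this slide replaces out('wrote') with outE('wrote') before count(). The number of adjacent vertices reached equals the number of incident edges crossed, so the vertices never need to be loaded. The Python sketch below illustrates that idea on a traversal represented as a list of (step, arguments) pairs. The representation and the function names are hypothetical, it handles only the out(...).count() case, and it is not how TinkerPop implements the strategy.

```python
def adjacent_to_incident(steps):
    """Rewrite out(label) to outE(label) when the next step is count().

    Counting adjacent vertices and counting incident edges give the same
    number, so the vertex lookup can be skipped.
    """
    rewritten = list(steps)
    for i in range(len(rewritten) - 1):
        name, args = rewritten[i]
        if name == "out" and rewritten[i + 1][0] == "count":
            rewritten[i] = ("outE", args)
    return rewritten


def render(steps):
    """Format a list of (step, args) pairs as a Gremlin-like string."""
    return "g." + ".".join(
        f"{name}({', '.join(repr(a) for a in args)})" for name, args in steps
    )


# The query from the slide, as (step, args) pairs.
original = [("V", (4608,)), ("out", ("wrote",)), ("count", ())]
optimized = adjacent_to_incident(original)

print(render(original))
print(render(optimized))
```

A traversal that does something with the vertices after out(), such as reading a property, is left unchanged, because the rewrite is only safe when nothing but the count is needed.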

Page 27: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).as('a').out('wrote').out('hasResponse').in('wrote').where(neq('a')).groupCount().next()
==>v[5376]=4
==>v[2304]=2
==>v[5888]=7
...
==>v[10496]=1

[Diagram: person 4608 (labelled 'a') -wrote-> response -hasResponse-> response <-wrote- person; where(neq('a')) excludes person 4608 and groupCount() tallies the remaining persons]

“Get a distribution over the authors who replied to ‘Renlit’”
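
The chain of steps in this traversal can be followed by hand on a small example. The sketch below is a plain-Python model on a toy edge list, not TinkerPop code. All vertex names are hypothetical, and the helper names `out` and `in_` merely mirror the Gremlin steps.

```python
from collections import Counter

# Toy edge list as (out_vertex, label, in_vertex) triples; all ids are hypothetical.
edges = [
    ("a", "wrote", "r1"), ("a", "wrote", "r2"), ("a", "wrote", "r6"),
    ("b", "wrote", "r3"), ("b", "wrote", "r5"),
    ("c", "wrote", "r4"),
    ("r1", "hasResponse", "r3"), ("r1", "hasResponse", "r4"),
    ("r2", "hasResponse", "r5"), ("r2", "hasResponse", "r6"),
]


def out(vertices, label):
    """Follow outgoing edges with `label`, keeping duplicates like Gremlin does."""
    return [t for v in vertices for (s, l, t) in edges if s == v and l == label]


def in_(vertices, label):
    """Follow incoming edges with `label` back to their source vertices."""
    return [s for v in vertices for (s, l, t) in edges if t == v and l == label]


def reply_distribution(author):
    posts = out([author], "wrote")         # .out('wrote')
    replies = out(posts, "hasResponse")    # .out('hasResponse')
    repliers = in_(replies, "wrote")       # .in('wrote')
    # .where(neq('a')) drops the original author; .groupCount() tallies the rest.
    return Counter(p for p in repliers if p != author)


distribution = reply_distribution("a")
```

Here "a" replies to one of their own posts (r6 answers r2), which is exactly the case where(neq('a')) filters out.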

Page 28: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin>

[Diagram: g.V(4608) selects person vertex 4608; .out('wrote') follows its wrote edges to response vertices; .values('responseLevel') reads each response's responseLevel property; .groupCount() tallies the values]

“Get a distribution over the ‘responseLevel’ value for posts by ‘Renlit’”

Page 29: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
gremlin>

[Diagram: g.V().has('type','response') selects every vertex whose type property is response; .values('responseLevel') reads each responseLevel property; .groupCount() tallies the values]

“Get a distribution over the ‘responseLevel’ for all posts in the graph”

Page 30: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
gremlin>
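
The two distributions are easier to compare once each count is expressed as a share of its own total. The following Python sketch does that arithmetic on the numbers printed in the console output above. Normalising the counts is a suggestion for reading the slide, not something the presentation itself does.

```python
# groupCount() results copied from the console output on this slide.
author = {1: 11, 2: 19, 3: 22, 4: 9, 5: 3, 6: 3}
overall = {1: 358, 2: 796, 3: 445, 4: 150, 5: 57, 6: 13, 7: 4, 8: 1}


def shares(counts):
    """Convert raw counts per responseLevel into fractions of the total."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


author_shares = shares(author)
overall_shares = shares(overall)

# Print both shares side by side for every level seen in the whole graph.
for level in sorted(overall):
    print(f"{level}: {author_shares.get(level, 0.0):.1%} vs {overall_shares[level]:.1%}")
```

The author's counts sum to 67, which matches the result of count() on the same traversal earlier in the deck.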

Page 31: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :install org.apache.tinkerpop hadoop-gremlin 3.0.0-incubating
==>Loaded: [org.apache.tinkerpop, hadoop-gremlin, 3.0.0-incubating] - restart the console to use [tinkerpop.hadoop]
gremlin> :exit

...

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :plugin use tinkerpop.hadoop
==>tinkerpop.hadoop activated
gremlin> hdfs.copyFromLocal('data.kryo', 'data.kryo')
==>null
gremlin> hdfs.ls()
==>rw-r--r-- smallette supergroup 5782840 data.kryo
gremlin>

Page 32: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> graph = GraphFactory.open('conf/hadoop/data-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat],sparkgraphcomputer]

Page 33: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> graph = GraphFactory.open('conf/hadoop/data-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat],sparkgraphcomputer]
gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]

Any Graph System

[Diagram: the same Gremlin steps (g.V(4608), g.V(), out(), in(), groupCount()) run against any graph system: Neo4j, Titan, Sqlg, BlueMix, Hadoop, Giraph, Spark, OrientDB, ...]

Page 34: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :plugin use tinkerpop.gephi
==>tinkerpop.gephi activated
gremlin> :remote connect tinkerpop.gephi
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33

Page 35: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :plugin use tinkerpop.gephi
==>tinkerpop.gephi activated
gremlin> :remote connect tinkerpop.gephi
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> :> graph
==>tinkergraph[vertices:1933 edges:4125]

Page 36: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :> graph
==>tinkergraph[vertices:1933 edges:4125]

Page 37: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> g.V(10240).values('userName')
==>Naya
gremlin> g.V(5888).values('userName')
==>Loret

Page 38: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> subGraph = g.V(10240,5888).repeat(__.outE().subgraph('subGraph').inV()).times(10).cap('subGraph').next()
==>tinkergraph[vertices:1152 edges:1343]
gremlin> :> subGraph

Naya

Loret
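
The subgraph traversal on this slide walks outgoing edges from the two start vertices for ten rounds and keeps every edge it crosses. A rough plain-Python equivalent of that idea is sketched below. The function name, the toy edge list and the edge directions in it are hypothetical, and the code does not use the TinkerPop API.

```python
def extract_subgraph(edges, start_vertices, hops):
    """Collect every edge crossed within `hops` outgoing steps of the start vertices.

    `edges` is an iterable of (out_vertex, label, in_vertex) triples.
    Returns the vertices touched by the kept edges and the kept edges.
    """
    frontier = set(start_vertices)
    kept_edges = set()
    for _ in range(hops):                      # .repeat(...).times(hops)
        next_frontier = set()
        for edge in edges:
            source, _label, target = edge
            if source in frontier:             # .outE()
                kept_edges.add(edge)           # .subgraph('subGraph')
                next_frontier.add(target)      # .inV()
        if not next_frontier:
            break
        frontier = next_frontier
    vertices = {v for (s, _l, t) in kept_edges for v in (s, t)}
    return vertices, kept_edges


# Toy data; names and edge directions are made up for illustration.
toy_edges = [
    ("naya", "wrote", "p1"),
    ("p1", "hasResponse", "p2"),
    ("p2", "hasRoot", "d1"),
    ("loret", "wrote", "p2"),
    ("other", "wrote", "p9"),
]

vertices, kept = extract_subgraph(toy_edges, ["naya", "loret"], hops=2)
```

Only edges reachable from the start vertices are kept, so the unrelated ("other", "wrote", "p9") edge stays out of the result. This is why the extracted graph on the slide is smaller than the full one.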

Page 39: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :remote config visualTraversal subGraph svg
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> svg
==>graphtraversalsource[tinkergraph[vertices:1152 edges:1343], standard]
gremlin> svg.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>FilterRankingStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>EngineDependentStrategy
==>GephiTraversalVisualizationStrategy
==>ProfileStrategy
==>ComputerVerificationStrategy

Page 40: What's New in Apache TinkerPop - the Graph Computing Framework

gremlin> :> svg.V(10240).as('x').out('wrote').out('hasResponse').in('wrote').where(neq('x')).groupCount()
==>[v[5888]:4]

Page 41: What's New in Apache TinkerPop - the Graph Computing Framework

Page 42: What's New in Apache TinkerPop - the Graph Computing Framework

Page 43: What's New in Apache TinkerPop - the Graph Computing Framework

Page 44: What's New in Apache TinkerPop - the Graph Computing Framework

Page 45: What's New in Apache TinkerPop - the Graph Computing Framework

Page 46: What's New in Apache TinkerPop - the Graph Computing Framework

Takeaways

If you have connected data, use a Graph DB

If you use a Graph DB, consider TinkerPop

If you use TinkerPop, get started with Gremlin Console

Page 47: What's New in Apache TinkerPop - the Graph Computing Framework

Acknowledgements

Ketrina Yim - @KetrinaYim

Artist behind Gremlin and his friends

Joe Lee - http://jml3designz.com/

Graphic designer providing support on this presentation

Apache TinkerPop - http://tinkerpop.incubator.apache.org/

The TinkerPop Community
