Cassandra Summit - What's New In Apache TinkerPop?

What’s New in Apache TinkerPop? Open Source Graph Computing Framework http://tinkerpop.incubator.apache.org/ Stephen Mallette - @spmallette © 2015. All Rights Reserved.

Upload: stephen-mallette

Post on 07-Jan-2017

TRANSCRIPT

Page 1: Cassandra Summit - What's New In Apache TinkerPop?

What’s New in Apache TinkerPop?

Open Source Graph Computing Framework

http://tinkerpop.incubator.apache.org/

Stephen Mallette - @spmallette

© 2015. All Rights Reserved.

Page 2: Cassandra Summit - What's New In Apache TinkerPop?

Page 3: Cassandra Summit - What's New In Apache TinkerPop?

By Andrea Mann from London, United Kingdom (Flickr Uploaded by Hohum) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

Page 4: Cassandra Summit - What's New In Apache TinkerPop?

Page 5: Cassandra Summit - What's New In Apache TinkerPop?

Georgius Agricola, De re metallica 1556

Page 6: Cassandra Summit - What's New In Apache TinkerPop?

“Woman at spinning wheel with man carding” Smithfield Decretals (British Library, Royal 10 E. IV, fol. 147v), c. 1340

“Carding, Spinning and Weaving” by Giovanni Boccaccio from De claris mulieribus 15th Century

Page 7: Cassandra Summit - What's New In Apache TinkerPop?

London, British Library, Royal 18 E.iii (15th century) [Public domain], via Wikimedia Commons

Page 8: Cassandra Summit - What's New In Apache TinkerPop?

[Public domain], via Wikimedia Commons

Page 9: Cassandra Summit - What's New In Apache TinkerPop?

By Unknown. Photo credit: Yale University Art Gallery. In the Public Domain. [Public domain], via Wikimedia Commons

[Public domain], via Wikimedia Commons

Page 10: Cassandra Summit - What's New In Apache TinkerPop?

By Dogcow (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

Page 11: Cassandra Summit - What's New In Apache TinkerPop?

By Adam Schuster (Flickr: Proto IBM) [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

By Arnold Reinhold [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons

Page 12: Cassandra Summit - What's New In Apache TinkerPop?

Page 13: Cassandra Summit - What's New In Apache TinkerPop?

Graph Data Structure

Vertices:

label: person, name: Stephen

label: book, title: Connections

label: person, name: James

Edges:

label: bought

label: wrote
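
The graph data structure on this slide can be sketched in a few lines of plain Python. This is only an illustrative model, not the TinkerPop API: the `Vertex` and `Edge` classes and the numeric ids are hypothetical, and the edge directions (Stephen bought the book, James wrote it) are an assumption, since the transcript does not record which edge connects which vertices.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int                      # hypothetical id, not from the slide
    label: str                   # e.g. "person" or "book"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str                   # e.g. "bought" or "wrote"
    out_v: Vertex                # vertex the edge leaves
    in_v: Vertex                 # vertex the edge points to


stephen = Vertex(1, "person", {"name": "Stephen"})
james = Vertex(2, "person", {"name": "James"})
book = Vertex(3, "book", {"title": "Connections"})

# Assumed directions: person -bought-> book, person -wrote-> book.
edges = [Edge("bought", stephen, book), Edge("wrote", james, book)]


def describe(edge):
    """Render one edge as 'name -label-> title'."""
    return f"{edge.out_v.properties['name']} -{edge.label}-> {edge.in_v.properties['title']}"


lines = [describe(e) for e in edges]
print("\n".join(lines))
```

Labels group elements by kind, while properties carry the key/value data attached to each element.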

Page 14: Cassandra Summit - What's New In Apache TinkerPop?

TinkerPop 2.0

TinkerPop 3.0

The TinkerPop Stack

Page 15: Cassandra Summit - What's New In Apache TinkerPop?

The TinkerPop Stack

Page 16: Cassandra Summit - What's New In Apache TinkerPop?

Gremlin in TinkerPop3 is NOT “just [image]”

It is advised not to use lambda (ƛ) expressions

Supports BOTH imperative and declarative querying

Page 17: Cassandra Summit - What's New In Apache TinkerPop?

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin>

Page 18: Cassandra Summit - What's New In Apache TinkerPop?

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin>

Page 19: Cassandra Summit - What's New In Apache TinkerPop?

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(gryo()).readGraph('data.kryo')
==>null
gremlin> graph
==>tinkergraph[vertices:1933 edges:4125]
gremlin>

[Schema diagram: vertex labels person, response, discussion; edge labels wrote, hasResponse, participatesIn, hasRoot]
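
The schema in the diagram can be written down as a small lookup table. The sketch below is plain Python, not TinkerPop code. The directions for `wrote` and `hasResponse` follow from the traversals used later in the deck; the directions for `participatesIn` and `hasRoot` are assumptions, because the transcript only preserves the labels.

```python
# edge label -> (label of the out-vertex, label of the in-vertex)
SCHEMA = {
    "wrote": ("person", "response"),
    "hasResponse": ("response", "response"),
    "participatesIn": ("person", "discussion"),   # assumed direction
    "hasRoot": ("discussion", "response"),        # assumed direction
}


def allowed(edge_label, out_label, in_label):
    """Return True if the schema permits this edge between these vertex labels."""
    return SCHEMA.get(edge_label) == (out_label, in_label)


print(allowed("wrote", "person", "response"))
print(allowed("wrote", "response", "person"))
```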

Page 20: Cassandra Summit - What's New In Apache TinkerPop?

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open("graph.properties")
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.io(gryo()).readGraph('data.kryo')
==>null
gremlin> graph
==>tinkergraph[vertices:1933 edges:4125]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:1933 edges:4125], standard]
gremlin>

Page 21: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608)
==>v[4608]

[Diagram: person vertex 4608, selected by g.V(4608)]

“Find the vertex with id 4608”

Page 22: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).values('userName')
==>Renlit

[Diagram: person vertex 4608 selected by g.V(4608); its userName property, Renlit, selected by .values('userName')]

“Get the value of the ‘userName’ property on vertex 4608”
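
As a rough mental model of these two steps, `V(id)` selects vertices by id and `values(key)` maps each vertex to one property value. The plain-Python sketch below imitates that pipeline with generators. The function names mirror the Gremlin steps but are not TinkerPop's implementation, and the dictionary holds only the three ids and user names that appear in this deck.

```python
# Vertex id -> properties, using ids and names shown on the slides.
VERTICES = {
    4608: {"userName": "Renlit"},
    10240: {"userName": "Naya"},
    5888: {"userName": "Loret"},
}


def V(*ids):
    """Yield (id, properties) for each requested id that exists."""
    for vertex_id in ids:
        if vertex_id in VERTICES:
            yield vertex_id, VERTICES[vertex_id]


def values(traversers, key):
    """Map each incoming vertex to the value of one property, skipping vertices without it."""
    for _, properties in traversers:
        if key in properties:
            yield properties[key]


result = list(values(V(4608), "userName"))
print(result)
```

Each step consumes the stream produced by the previous one, which is why steps can be chained freely.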

Page 23: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).out('wrote')
==>v[354560]
==>v[640768]
...
==>v[466432]

[Diagram: person vertex 4608 -wrote-> response; steps g.V(4608), .out('wrote')]

“Find the responses posted by ‘Renlit’”

Page 24: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).out('wrote').count()
==>67

[Diagram: person vertex 4608 -wrote-> response; steps g.V(4608), .out('wrote'), .count() giving 67]

“Find the number of responses posted by ‘Renlit’”
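
The `out('wrote')` and `count()` steps can be imitated over a plain edge list. This is a toy model under stated assumptions: the three response ids are the ones visible on the earlier slide, the `participatesIn` target id is made up, and the real graph holds 67 such edges rather than three.

```python
# (out-vertex id, edge label, in-vertex id)
EDGES = [
    (4608, "wrote", 354560),
    (4608, "wrote", 640768),
    (4608, "wrote", 466432),
    (4608, "participatesIn", 9001),   # hypothetical vertex id
]


def out(vertex_ids, label):
    """Follow outgoing edges with the given label and yield the adjacent vertex ids."""
    for vertex_id in vertex_ids:
        for source, edge_label, target in EDGES:
            if source == vertex_id and edge_label == label:
                yield target


def count(traversers):
    """Reduce a stream of traversers to its size."""
    return sum(1 for _ in traversers)


responses = list(out([4608], "wrote"))
total = count(out([4608], "wrote"))
print(responses, total)
```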

Page 25: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> t = g.V(4608).out('wrote').count();null
==>null
gremlin> t.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>DedupBijectionStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>ProfileStrategy
==>EngineDependentStrategy
==>ComputerVerificationStrategy
==>StandardVerificationStrategy

Page 26: Cassandra Summit - What's New In Apache TinkerPop?

t.strategies.toList()

Strategy Application

Original Query: g.V(4608).out('wrote').count()

ConjunctionStrategy
IncidentToAdjacentStrategy
AdjacentToIncidentStrategy (highlighted)
IdentityRemovalStrategy
DedupBijectionStrategy
MatchPredicateStrategy
RangeByIsCountStrategy
TinkerGraphStepStrategy
ProfileStrategy
EngineDependentStrategy
ComputerVerificationStrategy
StandardVerificationStrategy

Post-Strategies: g.V(4608).outE('wrote').count()
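
The rewrite on this slide works because every outgoing edge leads to exactly one adjacent vertex, so counting edges gives the same number as counting the vertices reached through them, without touching the vertices. A minimal sketch of such a rewrite rule in plain Python follows. It models a traversal as a list of `(step, args)` tuples, which is a hypothetical representation chosen for illustration and not how TinkerPop stores traversals.

```python
def adjacent_to_incident(steps):
    """Rewrite out(label) directly followed by count() into outE(label)."""
    rewritten = list(steps)
    for i in range(len(rewritten) - 1):
        name, args = rewritten[i]
        next_name, _ = rewritten[i + 1]
        if name == "out" and next_name == "count":
            rewritten[i] = ("outE", args)
    return rewritten


def render(steps):
    """Print a step list in Gremlin-like notation."""
    calls = (f"{name}({', '.join(repr(a) for a in args)})" for name, args in steps)
    return "g." + ".".join(calls)


original = [("V", (4608,)), ("out", ("wrote",)), ("count", ())]
optimized = adjacent_to_incident(original)

print(render(original))
print(render(optimized))
```

A traversal that needs the vertices themselves, such as `out('wrote').values('responseLevel')`, is left untouched by this rule.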

Page 27: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).as('a').out('wrote').out('hasResponse').in('wrote').where(neq('a')).groupCount().next()
==>v[5376]=4
==>v[2304]=2
==>v[5888]=7
...
==>v[10496]=1

[Diagram: person vertex 4608 -wrote-> response -hasResponse-> response <-wrote- person, annotated step by step: g.V(4608).as('a'), .out('wrote'), .out('hasResponse'), .in('wrote').where(neq('a')), .groupCount()]

“Get a distribution over the authors who replied to ‘Renlit’”
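
The traversal above walks from an author to their posts, to the replies to those posts, and back to the reply authors, then drops the starting author and tallies the rest. The plain-Python sketch below follows the same path over a tiny invented graph. All vertex names and edges are hypothetical and the helper functions only imitate the Gremlin steps.

```python
from collections import Counter

# (out-vertex, edge label, in-vertex); all data here is invented for illustration.
EDGES = [
    ("renlit", "wrote", "r1"),
    ("renlit", "wrote", "r2"),
    ("r1", "hasResponse", "r3"),
    ("r1", "hasResponse", "r4"),
    ("r2", "hasResponse", "r5"),
    ("r2", "hasResponse", "r6"),
    ("alice", "wrote", "r3"),
    ("alice", "wrote", "r5"),
    ("bob", "wrote", "r4"),
    ("renlit", "wrote", "r6"),   # a self-reply, removed by the neq filter below
]


def out(vertices, label):
    """Follow outgoing edges with this label."""
    return [t for v in vertices for s, l, t in EDGES if s == v and l == label]


def in_(vertices, label):
    """Follow incoming edges with this label back to their source."""
    return [s for v in vertices for s, l, t in EDGES if t == v and l == label]


start = "renlit"                                   # plays the role of as('a')
posts = out([start], "wrote")
replies = out(posts, "hasResponse")
authors = in_(replies, "wrote")
others = [a for a in authors if a != start]        # where(neq('a'))
distribution = dict(Counter(others))               # groupCount()
print(distribution)
```

Without the `neq` filter the starting author would appear in the result whenever they replied to their own post.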

Page 28: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin>

[Diagram: person vertex 4608 -wrote-> response, each with a responseLevel property; steps g.V(4608), .out('wrote'), .values('responseLevel'), .groupCount()]

“Get a distribution over the ‘responseLevel’ value for posts by ‘Renlit’”

Page 29: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
gremlin>

[Diagram: response vertices with property type = response, each with a responseLevel property; steps g.V(), .has('type','response'), .values('responseLevel'), .groupCount()]

“Get a distribution over the ‘responseLevel’ for all posts in the graph”

Page 30: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]
gremlin>

g.V(4608).out('wrote').values('responseLevel').groupCount()

g.V().has('type','response').values('responseLevel').groupCount()
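
Because the two results are raw counts over very different totals, they are easier to compare as proportions. The short Python sketch below normalizes both distributions. The numbers are copied from the console output on this slide, and the helper names are made up for illustration.

```python
# Counts copied from the two groupCount() results on the slide.
renlit = {1: 11, 2: 19, 3: 22, 4: 9, 5: 3, 6: 3}
everyone = {1: 358, 2: 796, 3: 445, 4: 150, 5: 57, 6: 13, 7: 4, 8: 1}


def shares(counts):
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


def most_common_level(counts):
    """Return the responseLevel with the highest count."""
    return max(counts, key=counts.get)


renlit_shares = shares(renlit)
everyone_shares = shares(everyone)

for level in sorted(everyone):
    mine = renlit_shares.get(level, 0.0)
    print(f"level {level}: Renlit {mine:.1%}, all posts {everyone_shares[level]:.1%}")
```

Renlit's counts sum to 67, the same number the earlier `count()` returned. The most common level is 3 for Renlit and 2 for the graph as a whole.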

Page 31: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :install org.apache.tinkerpop hadoop-gremlin 3.0.0-incubating
==>Loaded: [org.apache.tinkerpop, hadoop-gremlin, 3.0.0-incubating] - restart the console to use [tinkerpop.hadoop]
gremlin> :exit

...

$ bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :plugin use tinkerpop.hadoop
==>tinkerpop.hadoop activated
gremlin> hdfs.copyFromLocal('data.kryo', 'data.kryo')
==>null
gremlin> hdfs.ls()
==>rw-r--r-- smallette supergroup 5782840 data.kryo
gremlin>

Page 32: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> graph = GraphFactory.open('conf/hadoop/data-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat],sparkgraphcomputer]

Page 33: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> graph = GraphFactory.open('conf/hadoop/data-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat],sparkgraphcomputer]
gremlin> g.V(4608).out('wrote').values('responseLevel').groupCount()
==>[1:11, 2:19, 3:22, 4:9, 5:3, 6:3]
gremlin> g.V().has('type','response').values('responseLevel').groupCount()
==>[1:358, 2:796, 3:445, 4:150, 5:57, 6:13, 7:4, 8:1]

[Diagram: the same Gremlin steps (g.V(4608), out().in(), groupCount(), g.V()) running against Any Graph System: Neo4j, Titan, Sqlg, BlueMix, Hadoop, Giraph, Spark, OrientDB, ...]
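
The point of this slide is that the traversal text stays the same while the engine underneath changes. The plain-Python sketch below illustrates that idea with two stand-in engines: one scans a single list, the other counts per partition and merges the partial results. Both class names are hypothetical, the data is invented, and neither reflects how TinkerGraph or SparkGraphComputer actually execute a traversal.

```python
from collections import Counter

# Invented sample data for illustration.
VERTICES = [
    {"type": "response", "responseLevel": 1},
    {"type": "response", "responseLevel": 2},
    {"type": "response", "responseLevel": 2},
    {"type": "person", "userName": "Renlit"},
    {"type": "response", "responseLevel": 3},
]


class InMemoryEngine:
    """Stand-in for a single-machine engine: one pass over one list."""

    def __init__(self, vertices):
        self.vertices = vertices

    def group_count(self, vertex_type, key):
        return Counter(v[key] for v in self.vertices if v.get("type") == vertex_type)


class PartitionedEngine:
    """Stand-in for a distributed engine: count per partition, then merge."""

    def __init__(self, vertices, partitions=2):
        self.parts = [vertices[i::partitions] for i in range(partitions)]

    def group_count(self, vertex_type, key):
        total = Counter()
        for part in self.parts:
            total.update(Counter(v[key] for v in part if v.get("type") == vertex_type))
        return total


def response_levels(engine):
    """The 'query': identical no matter which engine runs it."""
    return engine.group_count("response", "responseLevel")


local = response_levels(InMemoryEngine(VERTICES))
distributed = response_levels(PartitionedEngine(VERTICES))
print(local == distributed)
```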

Page 34: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :plugin use tinkerpop.gephi
==>tinkerpop.gephi activated
gremlin> :remote connect tinkerpop.gephi
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33

Page 35: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :plugin use tinkerpop.gephi
==>tinkerpop.gephi activated
gremlin> :remote connect tinkerpop.gephi
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> :> graph
==>tinkergraph[vertices:1933 edges:4125]

Page 36: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :> graph
==>tinkergraph[vertices:1933 edges:4125]

Page 37: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> g.V(10240).values('userName')
==>Naya
gremlin> g.V(5888).values('userName')
==>Loret

Page 38: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> subGraph = g.V(10240,5888).repeat(__.outE().subgraph('subGraph').inV()).times(10).cap('subGraph').next()
==>tinkergraph[vertices:1152 edges:1343]
gremlin> :> subGraph

[Gephi rendering of the subgraph, with the vertices Naya and Loret labeled]
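
The subgraph traversal starts at two vertices, repeatedly follows outgoing edges for a fixed number of rounds, and keeps every edge it crosses. A breadth-first loop in plain Python captures the same idea. This is a sketch over an invented edge list, with hypothetical names, and it is not TinkerPop's `subgraph()` implementation.

```python
# (out-vertex, edge label, in-vertex); invented data for illustration.
EDGES = [
    ("naya", "wrote", "p1"),
    ("p1", "hasResponse", "p2"),
    ("p2", "hasResponse", "p3"),
    ("loret", "wrote", "p2"),
    ("zed", "wrote", "p9"),      # unreachable from the start vertices
]


def subgraph(start_ids, times):
    """Collect the edges crossed within `times` outgoing hops of the start vertices."""
    frontier = list(start_ids)
    kept_edges = set()
    for _ in range(times):
        next_frontier = []
        for vertex_id in frontier:
            for edge in EDGES:
                if edge[0] == vertex_id:
                    kept_edges.add(edge)            # subgraph('subGraph') side effect
                    next_frontier.append(edge[2])   # inV()
        frontier = next_frontier
    # The subgraph's vertices are the endpoints of the edges that were kept.
    vertices = {v for source, _, target in kept_edges for v in (source, target)}
    return vertices, kept_edges


vertices, edges = subgraph(["naya", "loret"], times=10)
print(len(vertices), len(edges))
```

The round limit bounds how far the walk can spread, which keeps the extracted subgraph small enough to visualize.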

Page 39: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :remote config visualTraversal subGraph svg
==>Connection to Gephi - http://localhost:8080/workspace0 with stepDelay:1000, startRGBColor:[0.0, 1.0, 0.5], colorToFade:g, colorFadeRate:0.7, startSize:20.0,sizeDecrementRate:0.33
gremlin> svg
==>graphtraversalsource[tinkergraph[vertices:1152 edges:1343], standard]
gremlin> svg.strategies.toList()
==>ConjunctionStrategy
==>IncidentToAdjacentStrategy
==>AdjacentToIncidentStrategy
==>IdentityRemovalStrategy
==>FilterRankingStrategy
==>MatchPredicateStrategy
==>RangeByIsCountStrategy
==>TinkerGraphStepStrategy
==>EngineDependentStrategy
==>GephiTraversalVisualizationStrategy
==>ProfileStrategy
==>ComputerVerificationStrategy

Page 40: Cassandra Summit - What's New In Apache TinkerPop?

gremlin> :> svg.V(10240).as('x').out('wrote').out('hasResponse').in('wrote').where(neq('x')).groupCount()
==>[v[5888]:4]

Page 46: Cassandra Summit - What's New In Apache TinkerPop?

Takeaways

If you have connected data, use a Graph DB

If you use a Graph DB, consider Apache TinkerPop

If you use Apache TinkerPop, get started with Gremlin Console

Page 47: Cassandra Summit - What's New In Apache TinkerPop?

Acknowledgements

Ketrina Yim - @KetrinaYim

Artist behind Gremlin and his friends

Joe Lee - http://jml3designz.com/

Graphic designer providing support on this presentation

Apache TinkerPop - http://tinkerpop.incubator.apache.org/

The TinkerPop Community