Cypher Query Language Chicago Graph Database Meet-Up Max De Marzi

Upload: max-de-marzi

Post on 10-May-2015

Page 1: Cypher

Cypher Query Language

Chicago Graph Database Meet-Up

Max De Marzi

Page 2: Cypher

What is Cypher?

•Graph Query Language for Neo4j

•Aims to make querying simple

Page 3: Cypher

Why Cypher?

•Existing Neo4j query mechanisms were not simple enough

•Too verbose (Java API)

•Too prescriptive (Gremlin)

Page 4: Cypher

SQL?

•Unable to express paths

•these are crucial for graph-based reasoning

•Neo4j is schema/table free

Page 5: Cypher

SPARQL?

•SPARQL designed for a different data model

•namespaces

•properties as nodes

•high learning curve

Page 6: Cypher

Design

Page 7: Cypher

Design Decisions

Declarative: most of the time, Neo4j knows better than you

Imperative:
•follow relationship
•breadth-first vs depth-first
•explicit algorithm

Declarative:
•specify starting point
•specify desired outcome
•algorithm adaptable based on query

Page 8: Cypher

Design Decisions

Pattern matching

Page 9: Cypher

Design Decisions

Pattern matching

[Diagram: nodes A, B, and C]

Page 10: Cypher

Design Decisions

Pattern matching

Page 11: Cypher

Design Decisions

Pattern matching

Page 12: Cypher

Design Decisions

Pattern matching

Page 13: Cypher

Design Decisions

Pattern matching

Page 14: Cypher

Design Decisions

ASCII-art patterns

() --> ()

Page 15: Cypher

Design Decisions

Directed relationship

(A) --> (B)

[Diagram: node A with a directed relationship to node B]

Page 16: Cypher

Design Decisions

Undirected relationship

(A) -- (B)

[Diagram: nodes A and B joined by an undirected relationship]

Page 17: Cypher

Design Decisions

Specific relationships

A -[:LOVES]-> B

[Diagram: node A with a LOVES relationship to node B]

Page 18: Cypher

Design Decisions

Joined paths

A --> B --> C

[Diagram: node A pointing to B, and B pointing to C]

Page 19: Cypher

Design Decisions

Multiple paths

A --> B --> C, A --> C

[Diagram: node A pointing to B and C, and B pointing to C]

A --> B --> C <-- A

Page 20: Cypher

Design Decisions

Optional relationships

A -[?]-> B

[Diagram: node A with an optional relationship to node B]

Page 21: Cypher

Design Decisions

Familiar for SQL users

SQL: select, from, where, group by, order by

Cypher: start, match, where, return

Page 22: Cypher

START

SELECT * FROM Person WHERE firstName = "Max"

START max=node:persons(firstName = "Max") RETURN max

Page 23: Cypher

MATCH

SELECT skills.* FROM users JOIN skills ON users.id = skills.user_id WHERE users.id = 101

START user = node(101) MATCH user --> skills RETURN skills

Page 24: Cypher

Optional MATCH

SELECT skills.* FROM users LEFT JOIN skills ON users.id = skills.user_id WHERE users.id = 101

START user = node(101) MATCH user -[?]-> skills RETURN skills

Page 25: Cypher

SELECT skills.*, user_skill.* FROM users JOIN user_skill ON users.id = user_skill.user_id JOIN skills ON user_skill.skill_id = skills.id WHERE users.id = 1

Page 26: Cypher

START user = node(1) MATCH user -[user_skill]-> skill RETURN skill, user_skill

Page 27: Cypher

Indexes

Used as multiple starting points, not to speed up traversals

START a = node:nodes_index(type='User')
MATCH a-[r:knows]-b
RETURN ID(a), ID(b), r.weight

Page 28: Cypher

http://maxdemarzi.com/2012/03/16/jung-in-neo4j-part-2/

Page 29: Cypher

Complicated Match

Some UGLY recursive self join on the groups table

START max=node:person(name="Max") MATCH group <-[:BELONGS_TO*]- max RETURN group
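To make the variable-length pattern concrete, here is a minimal Python sketch of what following BELONGS_TO any number of hops amounts to. The `belongs_to` dictionary and the function name are hypothetical illustration data, not Neo4j structures, and the sketch returns each group once, whereas the Cypher query returns one row per matching path.

```python
def groups_of(member, belongs_to):
    """Return every group reachable from member over one or more BELONGS_TO hops."""
    seen = set()
    stack = [member]
    while stack:
        current = stack.pop()
        for group in belongs_to.get(current, ()):
            if group not in seen:  # visit each group once, which also guards against cycles
                seen.add(group)
                stack.append(group)
    return seen


# Hypothetical data: Max belongs to two groups, which roll up into larger ones.
belongs_to = {
    "Max": ["Chicago Meet-Up", "Neo4j Users"],
    "Chicago Meet-Up": ["Illinois Groups"],
    "Illinois Groups": ["US Groups"],
}
max_groups = groups_of("Max", belongs_to)
```

The point of the slide is that this whole traversal is a single `*` in the pattern, instead of a recursive self join.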

Page 30: Cypher

Where

SELECT person.* FROM person WHERE person.age >32 OR person.hair = "bald"

START person = node:persons("name:*") WHERE person.age >32 OR person.hair = "bald" RETURN person

Page 31: Cypher

Return

SELECT person.name, count(*) FROM Person GROUP BY person.name ORDER BY person.name

START person=node:persons("name:*") RETURN person.name, count(*) ORDER BY person.name

Page 32: Cypher

Order By, Parameters

Same as SQL

{node_id} expected as part of request

START me = node({node_id})
MATCH (me)-[?:follows]->(friends)-[?:follows]->(fof)-[?:follows]->(fofof)-[?:follows]->others
RETURN me.name, friends.name, fof.name, fofof.name, count(others)
ORDER BY friends.name, fof.name, fofof.name, count(others) DESC

Page 33: Cypher

http://maxdemarzi.com/2012/02/13/visualizing-a-network-with-cypher/

Page 34: Cypher

Graph Functions

Some UGLY multiple recursive self and inner joins on the user and all related tables

START lucy=node(1000), kevin=node(759) MATCH p = shortestPath( lucy-[*]-kevin ) RETURN p
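As a rough illustration of what a shortest-path search over an undirected, unbounded pattern like lucy-[*]-kevin has to do, the Python sketch below runs a breadth-first search. The edge list and function names are hypothetical; this is not how Neo4j implements shortestPath, only the textbook technique.

```python
from collections import deque


def build_undirected(edges):
    """Build an adjacency map that ignores relationship direction."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the list of nodes on one shortest path, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = current
            if neighbour == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical friendship edges.
edges = [("lucy", "ann"), ("ann", "kevin"), ("lucy", "bob"), ("bob", "carl"), ("carl", "kevin")]
p = shortest_path(build_undirected(edges), "lucy", "kevin")
```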

Page 35: Cypher

Aggregate Functions

ID: get the Neo4j-assigned identifier
Count: add up the number of occurrences
Min: get the lowest value
Max: get the highest value
Avg: get the average of a numeric value
Distinct: remove duplicates

START me = node:nodes_index(type = 'user')
MATCH (me)-[r?:wrote]-()
RETURN ID(me), me.name, count(r), min(r.date), max(r.date)
ORDER BY ID(me)

Page 36: Cypher

Functions

Collect: put all values in a list

START a = node:nodes_index(type='User')
MATCH a-[:follows]->b
RETURN a.name, collect(b.name)
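For readers more used to application code, collect behaves like grouping rows by the non-aggregated columns and gathering the remaining values into a list. A minimal Python sketch, with a hypothetical list of (follower, followed) name pairs standing in for the matched rows:

```python
from collections import defaultdict


def collect_followed(rows):
    """Group (a_name, b_name) rows by a_name and gather the b_name values in a list."""
    grouped = defaultdict(list)
    for a_name, b_name in rows:
        grouped[a_name].append(b_name)
    return dict(grouped)


# Hypothetical rows, as if produced by MATCH a-[:follows]->b.
rows = [("Max", "Peter"), ("Max", "Andreas"), ("Peter", "Max")]
followed_by_user = collect_followed(rows)
```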

Page 37: Cypher

http://maxdemarzi.com/2012/02/02/graph-visualization-and-neo4j-part-three/

Page 38: Cypher

Combine Functions

Collect the ID of friends

START me = node:nodes_index(type = 'user')
MATCH (me)<-[r?:wrote]-(friends)
RETURN ID(me), me.name, collect(ID(friends)), collect(r.date)
ORDER BY ID(me)

Page 39: Cypher

http://maxdemarzi.com/2012/03/08/connections-in-time/

Page 40: Cypher

Uses

Recommend Friends

START me = node({node_id}) MATCH (me)-[:friends]->(friend)-[:friends]->(foaf) RETURN foaf.name

Page 41: Cypher

Uses

Six Degrees of Kevin Bacon

START me=node({start_node_id}), them=node({destination_node_id}) MATCH path = allShortestPaths( me-[?*]->them ) RETURN length(path), extract(person in nodes(path) : person.name)

Length: counts the number of relationships along a path
Extract: applies an expression to each element of a collection (here, each node in nodes(path))

Page 42: Cypher

Uses

Similar Users

START me = node(user1)
MATCH (me)-[myRating:RATED]->(i)<-[otherRating:RATED]-(u)
WHERE abs(myRating.rating-otherRating.rating)<=2
RETURN u

Users who rated same items within 2 points.

Abs: gets absolute numeric value
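The same "rated the same item within 2 points" rule can be sketched outside the database. The Python below is an illustration only: the `ratings` dictionary and function name are hypothetical, and unlike the Cypher query it returns each similar user once rather than once per shared item.

```python
def similar_users(ratings, me, max_diff=2):
    """Return users who rated at least one of my items within max_diff points of my rating."""
    mine = ratings[me]
    similar = set()
    for user, theirs in ratings.items():
        if user == me:
            continue
        for item, their_rating in theirs.items():
            if item in mine and abs(mine[item] - their_rating) <= max_diff:
                similar.add(user)
                break  # one qualifying item is enough
    return similar


# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "me": {"Alien": 9, "Heat": 6},
    "ann": {"Alien": 8},          # within 2 points on Alien
    "bob": {"Alien": 3},          # too far apart
    "carl": {"Casablanca": 9},    # no shared item
}
users_like_me = similar_users(ratings, "me")
```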

Page 43: Cypher

Boolean Operations

http://thought-bytes.blogspot.com/2012/02/similarity-based-recommendations-with.html

START me=node(user1), similarUsers=node(3)
MATCH (similarUsers)-[r:RATED]->(item)
WHERE r.rating > 7 AND NOT((me)-[:RATED]->(item))
RETURN item

(node(3) is the result received in the first query)

Items with a rating > 7 that similar users rated, but I have not

And: this and that are true
Or: this or that is true
Not: this is false

Page 44: Cypher

Predicates

START london = node(1), moscow = node(2)
MATCH path = london -[*]-> moscow
WHERE all(city in nodes(path) where city.capital = true)
RETURN path

ALL: closure is true for all items
ANY: closure is true for any item
NONE: closure is true for no items
SINGLE: closure is true for exactly 1 item
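The four predicates map onto familiar collection checks. A small Python sketch, using a hypothetical list of city records in place of nodes(path):

```python
def all_items(items, predicate):
    return all(predicate(item) for item in items)


def any_item(items, predicate):
    return any(predicate(item) for item in items)


def none_items(items, predicate):
    return not any(predicate(item) for item in items)


def single_item(items, predicate):
    # True only when exactly one item satisfies the predicate.
    return sum(1 for item in items if predicate(item)) == 1


# Hypothetical path: two capitals and one non-capital.
path_nodes = [
    {"name": "London", "capital": True},
    {"name": "Lyon", "capital": False},
    {"name": "Moscow", "capital": True},
]


def is_capital(city):
    return city["capital"]


def is_not_capital(city):
    return not city["capital"]
```

With this data the path on the slide would be rejected, because not every city on it is a capital.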

Page 45: Cypher

Implementation

•Recursive matching with backtracking

START x=... MATCH x-->y, x-->z, y-->z, z-->a-->b, z-->b
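Recursive matching with backtracking can be sketched in a few lines of Python. This is a simplified illustration, not Neo4j's matcher: the graph is a hypothetical adjacency map, the pattern is a list of directed (source, target) variable pairs, and details such as relationship uniqueness and relationship types are ignored.

```python
def match_pattern(graph, pattern, bound=None):
    """Yield every variable binding that satisfies all directed edges in pattern."""
    bound = dict(bound or {})
    if not pattern:
        yield bound  # every pattern edge is satisfied
        return
    (src, dst), rest = pattern[0], pattern[1:]
    # A bound variable has one candidate; an unbound one tries every node.
    src_candidates = [bound[src]] if src in bound else sorted(graph)
    for s in src_candidates:
        neighbours = graph.get(s, ())
        dst_candidates = [bound[dst]] if dst in bound else sorted(neighbours)
        for d in dst_candidates:
            if d not in neighbours:
                continue  # dead end: backtrack and try the next candidate
            extended = dict(bound)
            extended[src] = s
            extended[dst] = d
            yield from match_pattern(graph, rest, extended)


# Hypothetical graph: node -> set of nodes it points to.
graph = {
    "n1": {"n2", "n3"},
    "n2": {"n3"},
    "n3": {"n4", "n5"},
    "n4": {"n5"},
    "n5": set(),
}
# x-->y, x-->z, y-->z, z-->a-->b, z-->b written as edge pairs.
pattern = [("x", "y"), ("x", "z"), ("y", "z"), ("z", "a"), ("a", "b"), ("z", "b")]
matches = list(match_pattern(graph, pattern))
```

Each recursive call binds one more pattern edge; when a candidate fails, control returns to the enclosing loop, which is the backtracking step.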

Page 46: Cypher

Implementation

Execution Plan

start n=node(0) return n

Parameters()
Nodes(n)
Extract([n])
ColumnFilter([n])

Cypher is Pipes: lazily evaluated, pulling from pipes underneath
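The pipe model resembles chained Python generators, where each stage pulls rows from the stage beneath it only when asked. The sketch below borrows the pipe names from the plan above, but the functions themselves are hypothetical stand-ins, not Neo4j code.

```python
from itertools import islice

pulled = []  # records which source rows were actually requested


def nodes_pipe(node_ids):
    """Source pipe: emits one row per node id, on demand."""
    for node_id in node_ids:
        pulled.append(node_id)
        yield {"n": node_id}


def extract_pipe(rows, expressions):
    """Adds computed columns to each row as it passes through."""
    for row in rows:
        out = dict(row)
        for name, fn in expressions.items():
            out[name] = fn(row)
        yield out


def column_filter_pipe(rows, columns):
    """Keeps only the columns named in the RETURN clause."""
    for row in rows:
        yield {column: row[column] for column in columns}


pipeline = column_filter_pipe(
    extract_pipe(nodes_pipe(range(1000)), {"n.id": lambda row: row["n"]}),
    ["n.id"],
)
first_two = list(islice(pipeline, 2))  # only two rows are ever pulled from the source
```

A stage such as sorting or eager aggregation cannot stream this way, since it has to drain the pipe beneath it before emitting its first row.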

Page 47: Cypher

Implementation

Execution Plan

start n=node(0) match n-[*]-> b return n.name, n, count(*) order by n.age

Parameters()
Nodes(n)
PatternMatch(n-[*]->b)
Extract([n.name, n])
EagerAggregation( keys: [n.name, n], aggregates: [count(*)])
Extract([n.age])
Sort(n.age ASC)
ColumnFilter([n.name,n,count(*)])

Page 48: Cypher

Implementation

Execution Plan

start n=node(0) match n-[*]-> b return n.name, n, count(*) order by n.name

Parameters()
Nodes(n)
PatternMatch(n-[*]->b)
Extract([n.name, n])
Sort(n.name ASC,n ASC)
EagerAggregation( keys: [n.name, n], aggregates: [count(*)])
ColumnFilter([n.name,n,count(*)])

Page 49: Cypher

Thanks for Listening!

Questions?

maxdemarzi.com