2016-02 graphs - pg+rdf

Two graph data models RDF and Property Graphs Andy Seaborne Paolo Castagna [email protected], [email protected]

Upload: andyseaborne

Post on 21-Jan-2018



Outline
➢ Graphs
➢ Data Model: RDF
➢ Data Model: Property Graphs
➢ Best of both?

Andy
➢ Involved in Linked Data standards (SPARQL, RDF)
➢ Open source: contributor to Apache Jena
➢ Work for TopQuadrant, an RDF tools company

Graphs

Org charts are not trees

Graphs
➢ Reference data
● Life Sciences Ontologies
● Vocabularies
➢ Sharable data
● Wikipedia Info boxes (DBpedia)
➢ Analytics and Unstructured data
● Fraud analysis
● Social Graphs

Use Case for Graphs
Looking for patterns
➢ Analytics
● Social networks and recommendation engines
● Data center infrastructure management
➢ Knowledge Graphs
● Happenings: people, places, events
● Customer databases / product catalogues

Graph Data Models
➢ RDF
● W3C Standard
➢ Property Graphs
● Industry standard

RDF
➢ A graph is a set of links
● Link: a triple: subject - predicate - object
● The predicate (or property) is the link name: an IRI
➢ Nodes are IRIs (=URIs), literals (strings, numbers, …) or blank nodes

prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

# foaf:name is a short form of <http://xmlns.com/foaf/0.1/name>
:alice rdf:type foaf:Person ;
       foaf:name "Alice Smith" ;
       foaf:knows :bob .

[Figure: the graph for :alice, with three links: rdf:type to foaf:Person, foaf:name to "Alice Smith", foaf:knows to :bob]

prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

:bob rdf:type foaf:Person ;
     foaf:name "Bob Brown" .

[Figure: the graph for :bob, with two links: rdf:type to foaf:Person, foaf:name to "Bob Brown"]

prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

:alice rdf:type foaf:Person ;
       foaf:name "Alice Smith" ;
       foaf:knows :bob .

:bob rdf:type foaf:Person ;
     foaf:name "Bob Brown" .

[Figure: the combined graph. :alice links by rdf:type to foaf:Person, by foaf:name to "Alice Smith" and by foaf:knows to :bob; :bob links by rdf:type to foaf:Person and by foaf:name to "Bob Brown"]
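The step from the two separate graphs to the combined one can be sketched in Python. This is only an illustration: a graph is modelled as a set of (subject, predicate, object) tuples, IRIs are written as prefixed-name strings, and the helper `links_touching` is a hypothetical name, not part of any RDF library. Because these graphs contain no blank nodes, combining them is plain set union.

```python
# Each graph is a set of (subject, predicate, object) triples.
alice_graph = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
}
bob_graph = {
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "foaf:name", "Bob Brown"),
}

# Combining the data is set union; :bob is the same node in both graphs
# because it is the same IRI.
merged = alice_graph | bob_graph


def links_touching(graph, node):
    """Return every triple that has `node` as subject or object."""
    return sorted(t for t in graph if t[0] == node or t[2] == node)


print(len(merged))
for triple in links_touching(merged, ":bob"):
    print(triple)
```

The link from :alice to :bob and the two links describing :bob end up attached to one node without any explicit join step, which is the point of using IRIs as global names.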

RDFS

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

foaf:Person rdfs:subClassOf foaf:Agent .
foaf:Person rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> .

foaf:skypeID rdfs:domain foaf:Agent ;
             rdfs:label "Skype ID" ;
             rdfs:range rdfs:Literal ;
             rdfs:subPropertyOf foaf:nick .
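What such schema statements buy you can be sketched as a small forward-chaining loop in Python. The sketch assumes only three of the RDFS entailment rules (subclass, subproperty and domain), writes IRIs as prefixed-name strings (`geo:SpatialThing` stands for the full IRI above), and uses an invented Skype ID value; `rdfs_closure` is a hypothetical helper, not a library function.

```python
def rdfs_closure(triples):
    """Apply three RDFS rules repeatedly until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))
                # x p y, p rdfs:subPropertyOf q  =>  x q y
                if p2 == "rdfs:subPropertyOf" and s2 == p:
                    new.add((s, o2, o))
                # x p y, p rdfs:domain C  =>  x rdf:type C
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))
        if new <= inferred:
            return inferred
        inferred |= new


schema = {
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("foaf:Person", "rdfs:subClassOf", "geo:SpatialThing"),
    ("foaf:skypeID", "rdfs:domain", "foaf:Agent"),
    ("foaf:skypeID", "rdfs:subPropertyOf", "foaf:nick"),
}
data = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:skypeID", "alice.smith"),  # invented value
}

closure = rdfs_closure(schema | data)
for triple in sorted(closure - schema - data):
    print(triple)
```

The printed triples are the ones that were never stated but follow from the schema: :alice is also a foaf:Agent and a spatial thing, and her Skype ID is also a foaf:nick.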

RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP

## Names of people Alice knows.

PREFIX : <http://example/myData/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * {
  :alice foaf:knows ?X .
  ?X foaf:name ?name .
}
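How a query like this one is answered can be sketched as pattern matching over a set of triples. The Python below is a naive illustration, not how a real SPARQL engine is implemented: terms starting with `?` are variables, and `match` and `solve` are hypothetical helper names.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def solve(patterns, graph):
    """Match the patterns one after another, carrying bindings along."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


graph = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "foaf:name", "Bob Brown"),
}

query = [
    (":alice", "foaf:knows", "?X"),
    ("?X", "foaf:name", "?name"),
]
print(solve(query, graph))
```

The shared variable ?X is what joins the two triple patterns: the second pattern is only tried with the values the first one found.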

RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?numFriends
{
  { SELECT ?person (count(*) AS ?numFriends)
    { ?person foaf:knows ?X . }
    GROUP BY ?person
  }
  ?person foaf:name ?name .
}
ORDER BY ?numFriends
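The grouping query above reads like SQL, and its effect can be mimicked in a few lines of Python. This is a sketch over invented sample data (:carol and the extra foaf:knows links are not in the slides): the inner SELECT becomes a `Counter`, the outer pattern becomes a join on the person, and ORDER BY becomes a sort.

```python
from collections import Counter

graph = [
    (":alice", "foaf:name", "Alice Smith"),
    (":bob", "foaf:name", "Bob Brown"),
    (":carol", "foaf:name", "Carol Jones"),  # invented sample data
    (":alice", "foaf:knows", ":bob"),
    (":alice", "foaf:knows", ":carol"),
    (":bob", "foaf:knows", ":alice"),
]

# Inner SELECT: one row per ?person with the number of foaf:knows links.
num_friends = Counter(s for s, p, o in graph if p == "foaf:knows")

# Outer pattern: join with foaf:name, then ORDER BY ?numFriends (ascending).
rows = sorted(
    ((o, num_friends[s]) for s, p, o in graph
     if p == "foaf:name" and s in num_friends),
    key=lambda row: row[1],
)

for name, count in rows:
    print(name, count)
```

As in the SPARQL version, someone with no foaf:knows links produces no row at all, because the inner SELECT never yields a group for them.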

RDF : Access
➢ SPARQL : Update language
➢ Protocol : over HTTP

PREFIX : <http://example/myData/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

INSERT DATA { :bob foaf:name "Bob Brown" ; foaf:knows :alice } ;

INSERT { :alice foaf:knows ?B }
WHERE  { :bob foaf:knows ?B }
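The two update operations differ in kind: the first adds fixed triples, the second computes what to add by matching a pattern (copying :bob's foaf:knows links to :alice). A minimal Python sketch of that difference, assuming a graph held as a set of tuples and an invented starting triple about :carol:

```python
# Invented starting data so that the pattern has more than one match.
graph = {(":bob", "foaf:knows", ":carol")}

# INSERT DATA: add ground triples as given.
graph |= {
    (":bob", "foaf:name", "Bob Brown"),
    (":bob", "foaf:knows", ":alice"),
}

# INSERT ... WHERE: first evaluate the pattern against the current graph ...
bindings = [o for s, p, o in graph if s == ":bob" and p == "foaf:knows"]

# ... then instantiate the template once per match and add the results.
graph |= {(":alice", "foaf:knows", b) for b in bindings}

for triple in sorted(graph):
    print(triple)
```

The pattern is evaluated completely before anything is inserted, so triples added by the template do not feed back into the same operation. Because the pattern matches :alice herself, the result includes :alice foaf:knows :alice; a real update would probably filter that case out.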

Apache Jena
TLP: April 2012
➢ Involvement in standards
➢ RDF 1.1, SPARQL 1.1
➢ RDF database
➢ SPARQL server

Other RDF@ASF:
➢ Any23, Marmotta, Clerezza, Stanbol, Rya

Property Graph Data Model
A property graph is a set of vertices and edges with respective properties (i.e. key / value pairs):
➢ each vertex or edge has a unique identifier
➢ each vertex has a set of outgoing edges and a set of incoming edges
➢ edges are directed: each edge has a start vertex and an end vertex
➢ each edge has a label which denotes the type of relationship
➢ vertices and edges can have properties (i.e. key / value pairs)

Directed multigraph with properties attached to vertices and edges
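The definition above maps fairly directly onto a small data structure. The Python below is a minimal sketch, not the API of any property graph product: the class and method names are hypothetical, identifiers are assumed to be unique across vertices and edges together, and the edge direction (Alice to Bob) and the second parallel edge are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: int
    label: str   # type of relationship
    start: int   # id of the start vertex
    end: int     # id of the end vertex
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.vertices = {}
        self.edges = {}

    def _check_new_id(self, id):
        # Assumption: one identifier space shared by vertices and edges.
        if id in self.vertices or id in self.edges:
            raise ValueError(f"identifier {id} already in use")

    def add_vertex(self, id, **properties):
        self._check_new_id(id)
        self.vertices[id] = Vertex(id, properties)

    def add_edge(self, id, label, start, end, **properties):
        self._check_new_id(id)
        if start not in self.vertices or end not in self.vertices:
            raise KeyError("both end points must be existing vertices")
        self.edges[id] = Edge(id, label, start, end, properties)

    def out_edges(self, vertex_id):
        return [e for e in self.edges.values() if e.start == vertex_id]

    def in_edges(self, vertex_id):
        return [e for e in self.edges.values() if e.end == vertex_id]


g = PropertyGraph()
g.add_vertex(1, name="Alice", surname="Smith", age=32)
g.add_vertex(2, name="Bob", surname="Brown", age=45)
g.add_edge(3, "knows", 1, 2, since="01/01/1970")
g.add_edge(4, "knows", 1, 2)  # multigraph: a second, parallel edge is allowed

print([e.id for e in g.out_edges(1)])
print(g.edges[3].properties)
```

Edges are first-class objects with their own identity here, which is why two "knows" edges between the same pair of vertices can coexist and carry different properties.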

Property Graph: Example
[Figure: two vertices connected by one edge]
Vertex id = 1: name = “Alice”, surname = “Smith”, age = 32, email = [email protected], ...
Vertex id = 2: name = “Bob”, surname = “Brown”, age = 45, email = [email protected], ...
Edge id = 3: label = knows, since = 01/01/1970, ...

Property Graphs : Access
➢ TinkerPop Gremlin
● DSL for various languages

g.V().as('person').out('knows').as('friend')
     .select().by{it.value('name').length()}

➢ Cypher

MATCH (you:Person {name:"You"})
FOREACH (name in ["Johan","Rajesh","Anna","Julia","Andrew"] |
  CREATE (you)-[:FRIEND]->(:Person {name:name}))

➢ Connect : API

Property Graphs @ASF
➢ Apache TinkerPop
➢ Apache Spark > GraphX
➢ Apache Giraph
➢ Apache Flink > Gelly

RDF
➢ Standards
➢ Information modeling
➢ Data publishing

Property Graphs
➢ Code
➢ Analytics
➢ Data capture

Layering
➢ Using Property Graphs tech for RDF
➢ Using RDF tech for Property Graphs

Doable but why?

Can’t use the tools of one without understanding the other.
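To make the layering problem concrete, here is one naive way to store a property graph as RDF-style triples, sketched in Python. Everything about the encoding is an assumption made for illustration: the `:v1` / `:e3` naming scheme, the `:start`, `:end` and `:label` predicates, and the function name are all hypothetical.

```python
def property_graph_to_triples(vertices, edges):
    """Naive encoding: edge properties force a separate node per edge."""
    triples = set()
    for vid, props in vertices.items():
        for key, value in props.items():
            triples.add((f":v{vid}", f":{key}", value))
    for eid, (label, start, end, props) in edges.items():
        # The direct link that a plain RDF user would expect to find.
        triples.add((f":v{start}", f":{label}", f":v{end}"))
        # A triple cannot carry key / value pairs, so the edge also becomes
        # a node of its own and the properties hang off that node.
        triples.add((f":e{eid}", ":start", f":v{start}"))
        triples.add((f":e{eid}", ":label", f":{label}"))
        triples.add((f":e{eid}", ":end", f":v{end}"))
        for key, value in props.items():
            triples.add((f":e{eid}", f":{key}", value))
    return triples


vertices = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = {3: ("knows", 1, 2, {"since": "01/01/1970"})}

triples = property_graph_to_triples(vertices, edges)
for triple in sorted(triples):
    print(triple)
```

One edge with one property has turned into five triples, and a query that wants the `since` value has to know that `:e3` and the `:start` / `:end` convention exist. That is the sense in which the tools of one model cannot be used without understanding how the other was layered underneath.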

What to take from RDF
➢ URIs as data types
➢ Data Exchange
➢ Data modelling
➢ Emphasis on data formats for exchange
➢ Relational Algebra engines

URIs matter
https://twitter.com/canberratimes/status/700198365393321984

What to take from PG
➢ Separate links and values
➢ Short names for attributes
➢ Engines for Graph Algorithms

Some Conclusions
➢ Data Graphs are (still) new to many people
➢ RDF emphasizes information modelling
→ Knowledge graphs, e.g. SNOMED
→ SQL-like query
➢ Property Graph emphasizes data syntax
→ Data capture
→ Graph analytic algorithms
➢ Naive layering of data models leads to dissatisfaction
→ Can only mix toolsets by knowing it’s layered
➢ Could share technology
→ Storage, data access, query algebra

Thanks and Q&A

?

The Answer
Building one on top of the other is possible … but why do it?

Really hard to use! Worst of both worlds.

Semantic Web has some useful features
→ Apply to property graphs