2016-02 graphs - pg+rdf
Two graph data models: RDF and Property Graphs
Andy Seaborne, Paolo Castagna
Andy
➢ Involved in Linked Data standards (SPARQL, RDF)
➢ Open source: contributor to Apache Jena
➢ Works for TopQuadrant, an RDF tools company
Graphs
➢ Reference data
● Life Sciences Ontologies
● Vocabularies
➢ Sharable data
● Wikipedia Info boxes (DBpedia)
➢ Analytics and Unstructured data
● Fraud analysis
● Social Graphs
Use Case for Graphs
Looking for patterns
➢ Analytics
● Social networks and recommendation engines
● Data center infrastructure management
➢ Knowledge Graphs
● Happenings: people, places, events
● Customer databases / product catalogues
RDF
➢ A graph is a set of links
● Link: a triple: subject - predicate - object
● The predicate (or property) is the link name: an IRI
➢ Nodes are
● IRIs (=URIs)
● literals (strings, numbers, …)
● blank nodes
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

# foaf:name is a short form of <http://xmlns.com/foaf/0.1/name>
:alice rdf:type foaf:Person ;
       foaf:name "Alice Smith" ;
       foaf:knows :bob .
[Diagram: node :alice with arcs rdf:type to foaf:Person, foaf:name to "Alice Smith", and foaf:knows to :bob]
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

:bob rdf:type foaf:Person ;
     foaf:name "Bob Brown" .

[Diagram: node :bob with arcs rdf:type to foaf:Person and foaf:name to "Bob Brown"]
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

:alice rdf:type foaf:Person ;
       foaf:name "Alice Smith" ;
       foaf:knows :bob .

:bob rdf:type foaf:Person ;
     foaf:name "Bob Brown" .
[Diagram: the two graphs merged; :alice and :bob each have rdf:type and foaf:name arcs, and :alice has a foaf:knows arc to :bob]
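The "set of links" view above can be sketched in a few lines of Python. This is an illustration only, not how Jena or any RDF store represents data: the `Literal` class and the variable names are assumptions made for the example. With no blank nodes involved, merging the two graphs is plain set union.

```python
from dataclasses import dataclass

# Namespace IRIs taken from the prefix declarations on the slides.
EX = "http://example/myData/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper that keeps string values apart from IRIs."""
    value: str


# A graph is a set of (subject, predicate, object) triples.
alice_graph = {
    (EX + "alice", RDF + "type", FOAF + "Person"),
    (EX + "alice", FOAF + "name", Literal("Alice Smith")),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}
bob_graph = {
    (EX + "bob", RDF + "type", FOAF + "Person"),
    (EX + "bob", FOAF + "name", Literal("Bob Brown")),
}

# Merging graphs without blank nodes is set union; shared IRIs line up.
merged = alice_graph | bob_graph
```

Because :bob is the same IRI in both graphs, the foaf:knows link from :alice lands on the node that carries Bob's name without any extra join step.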
RDFS
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

foaf:Person rdfs:subClassOf foaf:Agent .
foaf:Person rdfs:subClassOf <http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> .

foaf:skypeID rdfs:domain foaf:Agent ;
             rdfs:label "Skype ID" ;
             rdfs:range rdfs:Literal ;
             rdfs:subPropertyOf foaf:nick .
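What such a schema buys can be sketched with two of the standard RDFS entailment rules: class membership follows rdfs:subClassOf, and using a property implies the rdfs:domain class. The Python below is a naive fixed-point loop, assuming prefixed names kept as plain strings; `:carol` and her Skype ID are made up for the example.

```python
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
TYPE = "rdf:type"

schema = {
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
    ("foaf:skypeID", DOMAIN, "foaf:Agent"),
}
data = {
    (":alice", TYPE, "foaf:Person"),
    (":carol", "foaf:skypeID", "carol.example"),  # hypothetical data
}


def rdfs_infer(triples):
    """Apply the subclass and domain rules until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
                # x P y, P rdfs:domain C  =>  x rdf:type C
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


result = rdfs_infer(schema | data)
```

Both `:alice` and `:carol` end up typed as foaf:Agent, although neither triple was stated. The quadratic loop is only for readability; real engines use rule indexes or query rewriting.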
RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP

## Names of people Alice knows.
PREFIX : <http://example/myData/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * {
  :alice foaf:knows ?X .
  ?X foaf:name ?name .
}
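How a query like this is answered can be sketched as pattern matching plus a join: each triple pattern is matched against every triple, and bindings for shared variables such as ?X must agree. The Python below is a toy evaluator, not the algorithm of any particular SPARQL engine; `match`, `query`, and the string encoding of terms are assumptions for the example.

```python
def match(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`; None if impossible."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Evaluate triple patterns left to right as a nested-loop join."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


data = {
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
    (":bob", "foaf:name", "Bob Brown"),
}

rows = query(
    [(":alice", "foaf:knows", "?X"), ("?X", "foaf:name", "?name")],
    data,
)
```

Here `rows` holds a single solution binding ?X to :bob and ?name to "Bob Brown".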
RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?numFriends {
  { SELECT ?person (count(*) AS ?numFriends) {
      ?person foaf:knows ?X .
    } GROUP BY ?person
  }
  ?person foaf:name ?name .
} ORDER BY ?numFriends
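The shape of this query (group and count in the subquery, then join with the name, then sort) can be mirrored step by step in Python. This is a sketch that assumes each person has exactly one foaf:name; the `:carol` triples are invented so that the counts differ.

```python
from collections import Counter

data = [
    (":alice", "foaf:name", "Alice Smith"),
    (":bob", "foaf:name", "Bob Brown"),
    (":carol", "foaf:name", "Carol Jones"),  # hypothetical person
    (":alice", "foaf:knows", ":bob"),
    (":alice", "foaf:knows", ":carol"),
    (":bob", "foaf:knows", ":alice"),
]

# Inner SELECT: count foaf:knows triples, GROUP BY ?person.
num_friends = Counter(s for s, p, o in data if p == "foaf:knows")

# Outer pattern: ?person foaf:name ?name (one name per person assumed).
names = {s: o for s, p, o in data if p == "foaf:name"}

# Join on ?person, then ORDER BY ?numFriends (ascending).
rows = sorted(
    ((names[person], count)
     for person, count in num_friends.items() if person in names),
    key=lambda row: row[1],
)
```

As in the SPARQL version, a person with no outgoing foaf:knows link (here `:carol`) does not appear in the result at all, rather than appearing with a count of zero.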
RDF : Access
➢ SPARQL : Update language
➢ Protocol : over HTTP

PREFIX : <http://example/myData/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

INSERT DATA { :bob foaf:name "Bob Brown" ; foaf:knows :alice } ;
INSERT { :alice foaf:knows ?B } WHERE { :bob foaf:knows ?B }
Apache Jena
TLP: April 2012
➢ Involvement in standards
➢ RDF 1.1, SPARQL 1.1
➢ RDF database
➢ SPARQL server

Other RDF@ASF:
➢ Any23, Marmotta, Clerezza, Stanbol, Rya
Property Graph Data Model
A property graph is a set of vertices and edges, each with its own properties (i.e. key / value pairs):
➢ each vertex or edge has a unique identifier
➢ each vertex has a set of outgoing edges and a set of incoming edges
➢ edges are directed: each edge has a start vertex and an end vertex
➢ each edge has a label which denotes the type of relationship
➢ vertices and edges can have properties (i.e. key / value pairs)

Directed multigraph with properties attached to vertices and edges
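The definition above maps almost directly onto a pair of record types. The Python below is a minimal in-memory sketch, not the API of TinkerPop, Neo4j, or any other product; the class and method names are assumptions, as are the direction of edge 3 and the second `knows` edge (id 4), which is there only to show that parallel edges are allowed.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: int
    label: str   # type of relationship
    start: int   # id of the start vertex
    end: int     # id of the end vertex
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Directed multigraph with key/value properties on vertices and edges."""

    def __init__(self):
        self.vertices = {}
        self.edges = {}

    def add_vertex(self, vid, **properties):
        self.vertices[vid] = Vertex(vid, properties)

    def add_edge(self, eid, label, start, end, **properties):
        if start not in self.vertices or end not in self.vertices:
            raise KeyError("both end points must already exist")
        self.edges[eid] = Edge(eid, label, start, end, properties)

    def out_edges(self, vid):
        return [e for e in self.edges.values() if e.start == vid]

    def in_edges(self, vid):
        return [e for e in self.edges.values() if e.end == vid]


g = PropertyGraph()
g.add_vertex(1, name="Alice", surname="Smith", age=32)
g.add_vertex(2, name="Bob", surname="Brown", age=45)
g.add_edge(3, "knows", 1, 2, since="01/01/1970")
g.add_edge(4, "knows", 1, 2)  # hypothetical parallel edge: a multigraph
```

Note that the edge is a first-class object with its own identifier and properties, which is the main structural difference from an RDF triple.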
Property Graph: Example
[Diagram: two vertices joined by one edge]
Vertex id = 1: name = “Alice”, surname = “Smith”, age = 32, email = [email protected], ...
Vertex id = 2: name = “Bob”, surname = “Brown”, age = 45, email = [email protected], ...
Edge id = 3: label knows, since = 01/01/1970, ...
Property Graphs : Access
➢ Tinkerpop Gremlin
DSL for various languages

g.V().as('person').out('knows').as('friend')
     .select().by{it.value('name').length()}

➢ Cypher

MATCH (you:Person {name:"You"})
FOREACH (name in ["Johan","Rajesh","Anna","Julia","Andrew"] |
  CREATE (you)-[:FRIEND]->(:Person {name:name}))

➢ Connect : API
Layering
Using Property Graphs tech for RDF
Using RDF tech for Property Graphs
Doable but why?
Can’t use the tools of one without understanding the other.
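To make the point concrete, here is one possible naive encoding of a property graph as triples, sketched in Python. It is an assumption for illustration, not a mapping proposed in the talk: the `pg:label`, `pg:start`, and `pg:end` names and the `v:` / `e:` identifiers are invented. Because an edge carries properties, it has to become a node of its own.

```python
def pg_to_triples(vertices, edges):
    """Encode vertices and edges as (subject, predicate, object) triples."""
    triples = set()
    for vid, props in vertices.items():
        for key, value in props.items():
            triples.add((f"v:{vid}", key, value))
    for eid, (label, start, end, props) in edges.items():
        # The edge becomes a node so that its properties have a subject.
        triples.add((f"e:{eid}", "pg:label", label))
        triples.add((f"e:{eid}", "pg:start", f"v:{start}"))
        triples.add((f"e:{eid}", "pg:end", f"v:{end}"))
        for key, value in props.items():
            triples.add((f"e:{eid}", key, value))
    return triples


def known_by(triples, person):
    """Who does `person` know? Needs three patterns instead of one."""
    edges_from = {s for s, p, o in triples if p == "pg:start" and o == person}
    knows_edges = {s for s, p, o in triples if p == "pg:label" and o == "knows"}
    wanted = edges_from & knows_edges
    return {o for s, p, o in triples if p == "pg:end" and s in wanted}


vertices = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = {3: ("knows", 1, 2, {"since": "01/01/1970"})}
triples = pg_to_triples(vertices, edges)
friends = known_by(triples, "v:1")
```

A user writing the RDF-side query has to know about the `pg:start` / `pg:label` / `pg:end` convention, so the layering is visible in every query rather than hidden behind the tools.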
What to take from RDF
URIs as data types
Data Exchange
Data modelling
Emphasis on data formats for exchange
Relational Algebra engines
URIs matter
https://twitter.com/canberratimes/status/700198365393321984
What to take from PG
Separate links and values
Short names for attributes
Engines for Graph Algorithms
Some Conclusions
➢ Data Graphs are (still) new to many people
➢ RDF emphasizes information modelling
→ Knowledge graphs e.g. SNOMED
→ SQL-like query
➢ Property Graph emphasizes data syntax
→ Data capture
→ Graph analytic algorithms
➢ Naive layering of data models leads to dissatisfaction
→ Can only mix toolsets by knowing it’s layered
➢ Could share technology
→ Storage, data access, query algebra