![Page 1: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/1.jpg)
Thursday, 21 June 12
![Page 2: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/2.jpg)
Using MongoDB as a high performance graph database
MongoDB UK, 20th June 2012
Chris Clarke CTO, Talis Education Limited
Who is Talis?
Using Mongo for about 8 months (since 2.0)
5 months in production
![Page 3: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/3.jpg)
What this talk is not about
A blueprint for what you should do
A pitch to encourage you to take our approach
Providing or proving performance benchmarks
Evangelism for the semantic web or linked data
Encouraging you to contribute/download/use an open source project
Optimised for your use case
Although we can talk to you about any of the above (see me after)
![Page 4: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/4.jpg)
So, what is this talk about?
Our journey of using MongoDB as a high performance graph database
Specifically the software wrapper we implemented on top of Mongo to give us a leg up in terms of scalability and performance
To give you some ideas for how to work with graph data models if you’d like to use document databases
![Page 5: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/5.jpg)
GRAPHS 101
Apologies
Nodes and edges, or resources and properties
Really easy to represent facts
![Page 6: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/6.jpg)
John knows Jane
[Diagram: John -- knows -- Jane]
Ball and stick diagrams
This is an undirected graph. It implies that John knows Jane and Jane knows John. The property has no directional significance.
![Page 7: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/7.jpg)
[Diagram: John -- knows -- Jane]
John knows Jane
Jane knows John
This is an undirected graph. It implies that John knows Jane and Jane knows John. The property has no directional significance.
![Page 8: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/8.jpg)
[Diagram: John -- knows --> Jane]
John knows Jane
Jane ? John
This is a directed graph. The relationship is one way. To add Jane knows John we need a second property.
We will only use directed graphs from here on, as they are more specific.
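A minimal Python sketch of the difference, assuming we store each edge as an ordered tuple. The `knows` helper and the plain-string node names are ours, for illustration only:

```python
# A directed edge is an ordered (subject, property, object) tuple.
graph = {("John", "knows", "Jane")}


def knows(graph, a, b):
    """True only if there is an edge from a to b, in that direction."""
    return (a, "knows", b) in graph


john_knows_jane = knows(graph, "John", "Jane")
jane_knows_john = knows(graph, "Jane", "John")

# The reverse relationship needs a second, explicit edge.
graph.add(("Jane", "knows", "John"))
jane_knows_john_now = knows(graph, "Jane", "John")
```

With a single edge only the John-to-Jane lookup succeeds; the Jane-to-John lookup succeeds once the second edge is added.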
![Page 9: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/9.jpg)
[Diagram: John -- knows --> Jane, Jane -- knows --> John]
John knows Jane
Jane knows John
![Page 10: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/10.jpg)
Triples + RDF 101
![Page 11: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/11.jpg)
Subject Property Object
John knows Jane
This is a triple
Property = predicate
![Page 12: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/12.jpg)
Subject Property Object
John knows Jane
Jane knows John
This is a second triple
The same resource can be a subject or an object
![Page 13: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/13.jpg)
Subject Property Object
http://example.com/John http://xmlns.com/foaf/0.1/knows http://example.com/Jane
RDF
Resources and properties as URIs
URIs can be dereferenced
Can share common property descriptions (RDF Schemas)
Here using FOAF - billions if not trillions of triples defined using FOAF
![Page 14: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/14.jpg)
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
Subject Property Object
http://example.com/John foaf:knows http://example.com/Jane
http://example.com/John foaf:name "John"
Namespaces for readability
In RDF subjects are always URIs (blank nodes aside)
But objects can be literals, i.e. plain text
Many RDF/graph databases allow you to further type literals as dates, numbers, etc.
![Page 15: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/15.jpg)
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
Subject Property Object
http://example.com/John rdf:type foaf:Person
http://example.com/John foaf:name "John"
http://example.com/John foaf:knows http://example.com/Jane
http://example.com/Jane rdf:type foaf:Person
http://example.com/Jane foaf:name "Jane"
http://example.com/Jane foaf:knows http://example.com/John
Here we type John and Jane as foaf:Person using rdf:type
Note both John and Jane appear as subjects and objects
This RDF graph represents six facts
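The six facts can be written down as plain (subject, property, object) tuples. A minimal Python sketch; the constant names, the `lit` helper and the tuple representation are our own illustration, not part of any RDF library:

```python
# Namespaces from the slide; EX is the example.com base used for John and Jane.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.com/"


def lit(text):
    """Tag a plain-text literal so it cannot be mistaken for a URI."""
    return ("literal", text)


# Each entry is one fact: (subject, property, object).
graph = {
    (EX + "John", RDF + "type", FOAF + "Person"),
    (EX + "John", FOAF + "name", lit("John")),
    (EX + "John", FOAF + "knows", EX + "Jane"),
    (EX + "Jane", RDF + "type", FOAF + "Person"),
    (EX + "Jane", FOAF + "name", lit("Jane")),
    (EX + "Jane", FOAF + "knows", EX + "John"),
}

# Resources that appear on both sides of a triple.
subjects = {s for s, _, _ in graph}
objects = {o for _, _, o in graph}
both = subjects & objects
```

Here `both` holds the URIs for John and Jane, since each is the subject of three triples and the object of one.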
![Page 16: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/16.jpg)
[Diagram: example:John and example:Jane linked by foaf:knows in both directions; each has rdf:type foaf:Person and a name literal, “John” and “Jane”]
Here it is in ball and stick
![Page 17: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/17.jpg)
FFS! I can do that in two minutes in BSON
![Page 18: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/18.jpg)
> db.people.find()
{
    _id: ObjectId('123'),
    name: 'John',
    knows: [ObjectId('456')]
},
{
    _id: ObjectId('456'),
    name: 'Jane',
    knows: [ObjectId('123')]
}
Yes, you can!
Data only makes sense inside your db though
![Page 19: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/19.jpg)
http://sheikspear.blogspot.co.uk/2011/07/simples.html
Talk over, right?
We can all go home
![Page 20: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/20.jpg)
Some useful stuff, using RDF
Let’s look at some reasons why we think RDF is good
![Page 21: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/21.jpg)
This is the linked open data cloud
Linked data is a way of publishing RDF on the open web
Search for “linked data TED” to hear why Tim Berners-Lee cares about this
Each blob on this diagram represents an open, interlinked dataset. The lines between them represent the interlinking between data sets
Billions of public “facts” and growing exponentially from sites such as BBC, governments, Last.fm, Wikipedia
![Page 22: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/22.jpg)
Merging data from different sources is really easy
Because the format is subject, predicate, object, the shape of RDF is always the same. Because schemas are public and widely shared, the same properties are used all over the place.
Really easy to use this data in your own app and remix
![Page 23: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/23.jpg)
[Diagram: two datasets, A and B, each describing example:John. One holds example:John rdf:type foaf:Person, the other holds example:John foaf:name “John”]
![Page 24: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/24.jpg)
[Diagram: Dataset A+B, a single example:John node with rdf:type foaf:Person and foaf:name “John”]
Really easy to merge graphs
“Designed in” to the data format
Lots of existing tooling to do this
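If a graph is held as a set of triples, merging is a set union. A minimal Python sketch of that idea; which dataset holds which triple is our assumption, and we repeat one triple in both to show that duplicates collapse:

```python
EX_JOHN = "http://example.com/John"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Dataset A only knows John's type.
dataset_a = {
    (EX_JOHN, RDF_TYPE, FOAF + "Person"),
}

# Dataset B knows his name, and happens to repeat the type triple.
dataset_b = {
    (EX_JOHN, FOAF + "name", "John"),
    (EX_JOHN, RDF_TYPE, FOAF + "Person"),
}

# Same subject URI, same triple shape: the merge is a plain union.
merged = dataset_a | dataset_b
```

The merged graph holds two triples about the same subject, because both datasets use the same URI for John and the repeated triple is stored once.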
![Page 25: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/25.jpg)
RDF query language: SPARQL
![Page 26: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/26.jpg)
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
    ?person a foaf:Person.
    ?person foaf:name ?name.
    ?person foaf:mbox ?email.
}
ORDER BY ?name
LIMIT 50
SPARQL is mega flexible. Lots of functions for grouping, walking graphs, pattern matching, inference, UNIONS, Geo extensions etc. etc. - all that shit. Most if not all of those datasets will have a SPARQL endpoint you can query
![Page 27: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/27.jpg)
SELECT Tabular
DESCRIBE Graph
ASK Boolean
CONSTRUCT Graph
4 main query types
![Page 28: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/28.jpg)
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?email
WHERE {
    ?person a foaf:Person.
    ?person foaf:name ?name.
    ?person foaf:mbox ?email.
}
ORDER BY ?name
LIMIT 50
FFS! That looks like SQL!
Yes it does. The WHERE clause is basically doing a shit load of joins. I’ll come back to that.
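To see why the WHERE clause amounts to joins, here is a naive Python sketch that evaluates the three patterns one at a time, joining each on the variables already bound. It is an illustration only, not how a real SPARQL engine is built; the `match` and `where` functions, the prefixed names as plain strings and the sample data (including the mailbox) are all made up:

```python
def match(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def where(graph, patterns):
    """Evaluate patterns left to right; each pattern is one more join."""
    bindings = [{}]
    for pattern in patterns:
        joined = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    joined.append(extended)
        bindings = joined
    return bindings


graph = [
    ("ex:John", "rdf:type", "foaf:Person"),
    ("ex:John", "foaf:name", "John"),
    ("ex:John", "foaf:mbox", "mailto:john@example.com"),
    ("ex:Jane", "rdf:type", "foaf:Person"),
    ("ex:Jane", "foaf:name", "Jane"),
]

patterns = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "foaf:name", "?name"),
    ("?person", "foaf:mbox", "?email"),
]

# ORDER BY ?name LIMIT 50
rows = sorted((b["?name"], b["?email"]) for b in where(graph, patterns))[:50]
```

Jane drops out because she has no foaf:mbox triple to join against. Every extra pattern is another pass over the data, which is the cost the speaker is alluding to.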
![Page 29: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/29.jpg)
Offline conversion process
Application DB (SQL or other)
Triple store + SPARQL
Most datasets on the LOD diagram don’t exist natively as linked data and RDF. They are post-produced.
Data is not held natively, so a conversion script is needed, and it has to be maintained and updated every time the app schema changes
Data is not up to date (1 hour, 1 day, 1 month behind?)
![Page 30: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/30.jpg)
Our innovation: Native Linked Data Applications
We started working on these applications back in 2008
They are natively linked data so solve the conversion+currency issue
There is no other “format” or schema the data is stored in, it’s native RDF
When you have no schema, and you can integrate data from elsewhere on the web, it’s addictive
![Page 31: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/31.jpg)
Our problem: FFS! For applications, we need humongous scale and performance
Those applications are becoming rather popular with our users...
sub 50ms query time
Modern web apps need speed and data scale
Outgrew the triple store and SPARQL
SPARQL is very flexible and expressive. It’s also expensive. SPARQL is great for data sets where the questions you can ask are limitless, but our applications need a data layer where speed is measured in single digit ms.
Complex caching (w/Memcache) to achieve performance and scalability
90:10 read:write
![Page 32: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/32.jpg)
Tripod
It’s a pod for our triples
A triple store designed for applications and scalability
Based on Mongo
![Page 33: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/33.jpg)
Functional requirements:
• Order of magnitude increase in perf/scale
• Graph-orientated interface
Non-functional requirements:
• Strong community
Existing code is very graph orientated
![Page 34: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/34.jpg)
Core data format
Tripod API
Dealing with complex queries
Tripod Tables
Free text search
Walk through Tripod looking at 5 areas
![Page 35: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/35.jpg)
{
    'http://example.com/John' : {
        'http://xmlns.com/foaf/0.1/name' : [
            { value: 'John', type: 'literal' }
        ],
        'http://xmlns.com/foaf/0.1/knows' : [
            { value: 'http://example.com/Jane', type: 'uri' }
        ]
    },
    'http://example.com/Jane' : {
        'http://xmlns.com/foaf/0.1/name' : [
            { value: 'Jane', type: 'literal' }
        ],
        'http://xmlns.com/foaf/0.1/knows' : [
            { value: 'http://example.com/John', type: 'uri' },
            { value: 'http://example.com/James', type: 'uri' }
        ]
    }
}
RDF/JSON - a serialisation of RDF in JSON
Neither disk space efficient nor readable
Fully-formed property URIs are not compatible with Mongo field names (they contain dots, which clash with dot notation)
Even single values sit inside an array (a problem for compound indexing)
![Page 36: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/36.jpg)
> db.CBD_people.find()
{
    _id: 'http://example.com/John',
    'foaf:name': {l: 'John'},
    'foaf:knows': {u: 'http://example.com/Jane'}
},
{
    _id: 'http://example.com/Jane',
    'foaf:name': {l: 'Jane'},
    'foaf:knows': [
        {u: 'http://example.com/John'},
        {u: 'http://example.com/James'}
    ]
}
Same semantics
2 documents here
Concise bound descriptions - all data known about a subject, one relationship deep
One document per subject per collection, keyed (and thus enforced) by Subject URI
Property names are namespaced
CBD collections are treated as read/write in Tripod
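A minimal Python sketch of how one RDF/JSON entry could be turned into a CBD document of this shape. This is our reading of the two formats, not Tripod’s actual code: the `PREFIXES` table, `namespaced` and `to_cbd` are hypothetical names, and we assume the FOAF property URIs:

```python
# Assumed namespace table; a real one would list every vocabulary in use.
PREFIXES = {"http://xmlns.com/foaf/0.1/": "foaf"}


def namespaced(uri):
    """Shorten a property URI to prefix:local so the key contains no dots."""
    for base, prefix in PREFIXES.items():
        if uri.startswith(base):
            return prefix + ":" + uri[len(base):]
    raise ValueError("no prefix registered for " + uri)


def to_cbd(subject, properties):
    """Build one document per subject, keyed by the subject URI."""
    doc = {"_id": subject}
    for prop, values in properties.items():
        short = [
            {"u" if v["type"] == "uri" else "l": v["value"]}
            for v in values
        ]
        # A single value is stored bare; only true multi-values stay arrays.
        doc[namespaced(prop)] = short[0] if len(short) == 1 else short
    return doc


jane = to_cbd(
    "http://example.com/Jane",
    {
        "http://xmlns.com/foaf/0.1/name": [
            {"value": "Jane", "type": "literal"},
        ],
        "http://xmlns.com/foaf/0.1/knows": [
            {"value": "http://example.com/John", "type": "uri"},
            {"value": "http://example.com/James", "type": "uri"},
        ],
    },
)
```

The field names come out as `foaf:name` and `foaf:knows`, so nothing in a key can be mistaken for dot notation, and the single-valued name is no longer wrapped in an array.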
![Page 37: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/37.jpg)
class MongoGraph extends SimpleGraph {
    function add_tripod_array($tarray) { /* ... */ }
    function to_tripod_array($docId) { /* ... */ }
}
All of our app code already uses SimpleGraph from a library called Moriarty (Google Code)
Simple extension which can ingest/output the data format on prev slide
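A rough Python analogue of what ingesting and outputting that format could involve. The real MongoGraph is PHP and its internals are not shown in the talk, so `cbd_to_triples` and `triples_to_cbd` are hypothetical names and the logic is only a sketch:

```python
def cbd_to_triples(doc):
    """Ingest: flatten one CBD document into (subject, property, object) triples."""
    triples = set()
    for prop, values in doc.items():
        if prop == "_id":
            continue
        if isinstance(values, dict):  # a single value is stored bare
            values = [values]
        for value in values:
            if "u" in value:
                triples.add((doc["_id"], prop, ("uri", value["u"])))
            else:
                triples.add((doc["_id"], prop, ("literal", value["l"])))
    return triples


def triples_to_cbd(subject, triples):
    """Output: rebuild the CBD document for one subject from a set of triples."""
    doc = {"_id": subject}
    for s, prop, (kind, value) in sorted(triples):
        if s != subject:
            continue
        entry = {"u": value} if kind == "uri" else {"l": value}
        if prop not in doc:
            doc[prop] = entry
        elif isinstance(doc[prop], list):
            doc[prop].append(entry)
        else:
            doc[prop] = [doc[prop], entry]
    return doc


jane_doc = {
    "_id": "http://example.com/Jane",
    "foaf:name": {"l": "Jane"},
    "foaf:knows": [
        {"u": "http://example.com/John"},
        {"u": "http://example.com/James"},
    ],
}

triples = cbd_to_triples(jane_doc)
rebuilt = triples_to_cbd("http://example.com/Jane", triples)
```

Going from document to triples and back keeps the same set of facts, although the order of values inside an array is not preserved in this sketch.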
![Page 38: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/38.jpg)
Core data format
Tripod API
Dealing with complex queries
Tripod Tables
Free text search
Walk through Tripod looking at 5 areas
![Page 39: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/39.jpg)
interface ITripod
{
    public function select($query, $fields, $sortBy=null, $limit=null);
    public function describeResource($resource);
    public function describeResources(Array $resources);
    public function saveChanges($oldGraph, $newGraph);
    public function search($query);
}
Almost the same as our existing data access API onto generic triple store
All of these methods return graphs, all are mega-simple queries on the CBD collections
None of these methods support joins (WHERE clause in SPARQL)
![Page 40: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/40.jpg)
public function describeResource($resource)
{
    $query = array("_id" => $resource);
    $bson = $this->getCollection()->findOne($query);
    $graph = new MongoGraph();
    $graph->add_tripod_array($bson);
    return $graph;
}
These methods are mega simple to implement as they translate to really simple Mongo queries on the CBD collections returning single objects
![Page 41: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/41.jpg)
interface ITripod
{
    public function select($query, $fields, $sortBy=null, $limit=null);
    public function describeResource($resource);
    public function describeResources(Array $resources);
    public function saveChanges($oldGraph, $newGraph);
    public function search($query);

    public function getViewForResource($resource, $viewType);
    public function getViewForResources(Array $resources, $viewType);
    public function getViews(Array $filter, $viewType);
}
Some extra methods to deal with complex queries involving joins
![Page 42: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/42.jpg)
Core data format
Tripod API
Dealing with complex queries
Tripod Tables
Free text search
2 things we realised when looking at our applications
![Page 43: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/43.jpg)
DESCRIBE <http://example.com/foo> ?sectionOrItem ?resource ?document ?authorList ?author ?usedBy ?creator ?libraryNote ?publisher
WHERE
{
    OPTIONAL {
        <http://example.com/foo> resource:contains ?sectionOrItem .
        OPTIONAL {
            ?sectionOrItem resource:resource ?resource .
            OPTIONAL { ?resource dcterms:isPartOf ?document . }
            OPTIONAL {
                ?resource bibo:authorList ?authorList .
                OPTIONAL { ?authorList ?p ?author . }
            }
            OPTIONAL { ?resource dcterms:publisher ?publisher . }
        }
        OPTIONAL { ?libraryNote bibo:annotates ?sectionOrItem }
    } .
    OPTIONAL { <http://example.com/foo> resource:usedBy ?usedBy } .
    OPTIONAL { <http://example.com/foo> sioc:has_creator ?creator }
}
Typical SPARQL query in our app
9 “joins” in this query
![Page 44: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/44.jpg)
The only thing that changes at run time in this query is the URI <http://example.com/foo>
Flexibility of SPARQL is great for the developer but terrible here for system performance
Query engine needs to join 9 times! Flexibility costs us every time we run this query!
This is why we hid it behind a cache
![Page 45: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/45.jpg)
join
count
follow sequences (n times)
join across databases
All the above with a condition
include certain properties
include all properties
2nd thing
We only make use of a minimal subset of SPARQL
And some of these aren’t even well supported in SPARQL (sequences + join across databases)
![Page 46: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/46.jpg)
Materialised views, generated infrequently, read often
Remember 90:10 read:update
View specifications based on a subset of SPARQL
Views are for DESCRIBE-like queries where all the data is brought back in one hit (not tabular data)
![Page 47: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/47.jpg)
{
    _id: "v_resource_brief",
    from: "CBD_harvest",
    type: "http:\/\/talisaspire.com\/schema#Resource",
    include: ["rdf:type", "dct:subject", "dct:isVersionOf", "searchterms:usedAt", "dc:identifier"],
    joins: {
        "acorn:preferredMetadata": [],
        "acorn:listReferences": {
            include: ["acorn:list"]
        },
        "acorn:bookmarkReferences": {
            include: ["acorn:bookmark"]
        },
        "dcterms:isPartOf": [],
        "acorn:partReferences": {
            include: ["dct:hasPart"],
            joins: {
                "dct:hasPart": {
                    joins: {
                        "acorn:preferredMetadata": []
                    }
                }
            }
        }
    }
}
A view specification - itself a document that can be stored in Mongo
8 keywords:
type, from, include, joins, ttl, followSequence, maxJoins, counts
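One way to read such a specification is as a recipe for walking the CBD documents: start at a root resource, follow each property named under `joins`, and keep either the listed `include` properties or everything. The Python sketch below implements that reading over an in-memory dict standing in for a CBD collection. The semantics are our assumption (the talk does not spell them out), `build_view` and `values_of` are hypothetical names, the sample data is made up, and `from`, `type`, `ttl`, `followSequence`, `maxJoins` and `counts` are ignored:

```python
def values_of(doc, prop):
    """Return a property's values as a list (CBD docs hold one value or a list)."""
    value = doc.get(prop, [])
    return value if isinstance(value, list) else [value]


def build_view(collection, root_id, spec):
    graphs, seen = [], []

    def visit(doc_id, rule):
        doc = collection.get(doc_id)
        if doc is None or doc_id in seen:
            return
        seen.append(doc_id)
        # A join target in the spec is either [] or a nested rule.
        rule = rule if isinstance(rule, dict) else {}
        joins = rule.get("joins", {})
        include = rule.get("include", [])
        if include:
            # Assumption: joined properties are kept so the graph stays connected.
            keep = {"_id", *include, *joins}
            graphs.append({k: v for k, v in doc.items() if k in keep})
        else:
            graphs.append(dict(doc))
        for prop, sub_rule in joins.items():
            for value in values_of(doc, prop):
                if "u" in value:
                    visit(value["u"], sub_rule)

    visit(root_id, spec)
    return {
        "_id": {"rdf:resource": root_id, "type": spec["_id"]},
        "value": {"graphs": graphs, "impactIndex": seen},
    }


LIST = "http://example.com/lists/1"
ITEM = "http://example.com/items/1"
BOOK = "http://example.com/books/1"

collection = {
    LIST: {"_id": LIST, "sioc:name": {"l": "Reading list"},
           "dct:created": {"l": "2012-06-20"},
           "resource:contains": {"u": ITEM}},
    ITEM: {"_id": ITEM, "resource:resource": {"u": BOOK}},
    BOOK: {"_id": BOOK, "dct:title": {"l": "Graphs"}},
}

spec = {
    "_id": "v_list",
    "include": ["sioc:name"],
    "joins": {"resource:contains": {"joins": {"resource:resource": []}}},
}

view = build_view(collection, LIST, spec)
```

The result is one document holding three small graphs, which is why a read needs a single lookup instead of a chain of joins.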
![Page 48: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/48.jpg)
Generated by incremental MapReduce when:
1) Data is changed
2) TTL expires
Tripod can take these specifications and manage views in a special collection within the DB.
They expire and are regenerated automatically (and incrementally)
Incremental map reduce inside the DB
Fast, interleaves with reads
![Page 49: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/49.jpg)
> db.views.findOne()
{
    "_id" : {
        "rdf:resource" : "http://talisaspire.com/examples/1",
        "type" : "v_resource_full"
    },
    "value" : {
        "graphs" : [
            {
                "_id" : "http://talisaspire.com/examples/1",
                "rdf:type" : {
                    "type" : "uri",
                    "value" : "http://talisaspire.com/schema#Resource"
                }
            }
        ],
        "impactIndex" : [
            { "rdf:resource" : "http://talisaspire.com/examples/1" }
        ]
    }
}
This is what a view looks like
ID is a composite key of the view type and root resource
Graphs is a collection of CBDs
MongoGraph we displayed earlier can take this and represent it as a unified graph to the application
Impact index - A watch list of resources. When resources are saved the impact index is queried to find views that need invalidating
TTL is an alternative. If a ttl is set in the view spec, a timestamp is stored in the view to determine when it can be invalidated
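A minimal Python sketch of both invalidation routes, over a plain list standing in for the views collection. The function names, the `expires_at` field and the sample views are hypothetical, and the impact index is simplified to a list of URIs; in Mongo the first check would presumably be a query against the impact index field rather than a loop:

```python
def views_to_invalidate(views, saved_resources):
    """Return the ids of views whose impact index mentions a saved resource."""
    saved = set(saved_resources)
    return [
        view["_id"]
        for view in views
        if saved & set(view["value"]["impactIndex"])
    ]


def ttl_expired(view, now):
    """TTL route: a view with a stored expiry time is stale once it has passed."""
    expires_at = view["value"].get("expires_at")  # hypothetical field name
    return expires_at is not None and now >= expires_at


views = [
    {
        "_id": {"rdf:resource": "http://example.com/lists/1", "type": "v_list"},
        "value": {
            "impactIndex": [
                "http://example.com/lists/1",
                "http://example.com/books/1",
            ],
        },
    },
    {
        "_id": {"rdf:resource": "http://example.com/lists/2", "type": "v_list"},
        "value": {
            "impactIndex": ["http://example.com/lists/2"],
            "expires_at": 1000,
        },
    },
]

# Saving the book touches only the first view.
stale = views_to_invalidate(views, ["http://example.com/books/1"])
```

The watch-list route gives fresh views at the cost of a lookup on every save; the TTL route skips that lookup and accepts that a view may be out of date until it expires.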
![Page 50: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/50.jpg)
Match views to data update rate
![Page 51: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/51.jpg)
Core data format
Tripod API
Dealing with complex queries
Tripod Tables
Free text search
Tripod Tables are for larger datasets which cannot be brought back in one hit
They can be paged or have individual columns indexed for fast sort capability
![Page 52: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/52.jpg)
SELECT ?listName ?listUri
WHERE
{
    ?resource bibo:isbn10 "$isbn"
    UNION
    {
        ?resource bibo:isbn10 "$isbnLowerCase" .
    }
    ?item resource:resource ?resource .
    UNION
    {
        ?resourcePartOf bibo:isbn10 "$isbn" .
        UNION
        {
            ?resourcePartOf bibo:isbn10 "$isbnLowerCase" .
        }
        ?resourcePartOf dct:hasPart ?resource .
        ?item resource:resource ?resource .
    }
    ?listUri resource:contains ?item .
    ?listUri sioc:name ?listName .
    ?listUri rdf:type resource:List
}
LIMIT 10
OFFSET 40
This is a select query that brings back a two-column result, paged with LIMIT and OFFSET
![Page 53: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/53.jpg)
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
    <head>
        <variable name="label"/>
        <variable name="type"/>
    </head>
    <results>
        <result>
            <binding name="label">
                <literal>Tropical grassland</literal>
            </binding>
            <binding name="type">
                <uri>http://purl.org/ontology/wo/TerrestrialHabitat</uri>
            </binding>
        </result>
        <result>
            <binding name="label">
                <literal>Grassy field</literal>
            </binding>
            <binding name="type">
                <uri>http://purl.org/ontology/wo/TerrestrialHabitat</uri>
            </binding>
        </result>
    </results>
</sparql>
SPARQL SELECT results - tabular format - here in XML
![Page 54: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/54.jpg)
> db.t_resource.findOne()
{
    "_id" : "http://talisaspire.com/resources/3SplCtWGPqEyXcDiyhHQpA-2",
    "value" : {
        "type" : [
            "http://purl.org/ontology/bibo/Book",
            "http://talisaspire.com/schema#Resource"
        ],
        "isbn" : "9780393929690",
        "isbn13" : [
            "9780393929691",
            "9780393929691-2",
            "9780393929691-3"
        ],
        "impactIndex" : [
            "http://talisaspire.com/works/4d101f63c10a6"
        ]
    }
}
This time our map reduce doesn’t create one doc as with materialised views
We get one doc per row
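Because each row is its own document, paging and sorting reduce to ordering and slicing. A minimal Python sketch over an in-memory list of row documents; the `page` helper, the `listName` column and the 45 generated rows are made up for illustration:

```python
def page(rows, sort_column, limit, offset):
    """Sort row documents on one column, then return a single page of them."""
    ordered = sorted(rows, key=lambda row: row["value"][sort_column])
    return ordered[offset:offset + limit]


# 45 generated rows, one document per row, deliberately out of order.
rows = [
    {
        "_id": "http://example.com/lists/%02d" % n,
        "value": {"listName": "List %02d" % n},
    }
    for n in reversed(range(45))
]

# The equivalent of LIMIT 10 OFFSET 40: the fifth page of ten.
fifth_page = page(rows, "listName", limit=10, offset=40)
names = [row["value"]["listName"] for row in fifth_page]
```

With 45 rows the fifth page holds only the last five. Against a real collection the same thing would be a sorted cursor with a skip and a limit, and an index on the sort column keeps the sort cheap.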
![Page 55: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/55.jpg)
Core data format
Tripod API
Dealing with complex queries
Tripod Tables
Free text search
Our triple store included free text search
We wanted to stream updates into ElasticSearch or another search solution
When documents are saved, the same specification language is used to build Search Document Format docs and submit them to an endpoint
We like ElasticSearch but you could use Amazon CloudSearch
![Page 56: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/56.jpg)
Limitations
We use Map Reduce as a non-blocking db.eval(), and also to work around PHP’s synchronous programming model
PHP only for now - our web apps were PHP
To get a SPARQL endpoint we are exporting data out to Fuseki - this solves the mapping problem but not the currency problem (for SPARQL)
![Page 57: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/57.jpg)
Future
Node JS port
Use as a server not a library
Eliminate dependency on map reduce
Specification version control
Tap into op log for stream approach into Fuseki and other locations
Named graph support
Further optimisation of data model
Maybe open source
![Page 58: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/58.jpg)
That’s it
![Page 59: Using MongoDB as a high performance graph database](https://reader033.vdocuments.net/reader033/viewer/2022051411/545af6aeaf79592b448b6045/html5/thumbnails/59.jpg)
Questions?
Find us on:
Web: talisaspire.com
Twitter: @talisaspire
YouTube: youtube.com/user/TalisAspire
Facebook: facebook.com/talisaspire
Support: support.talisaspire.com