GraphDB Fundamentals Ontotext Webinar April 7, 2016

Upload: ontotext

Post on 07-Apr-2017


Page 1: GraphDB Fundamentals

GraphDB Fundamentals

Ontotext Webinar, April 7, 2016

Page 2: GraphDB Fundamentals

Presentation Outline

Ontotext, AD and Keen Analytics, LLC. All Rights Reserved 2

• Welcome

• RDF and RDFS Overviews

• SPARQL Overview

• Ontology Overview

• Ontology Modeling

• GraphDB™ Installation

• Performance Tuning and Scalability

• GraphDB™ Workbench and Sesame

• Loading Data

• Rule Sets and Reasoning Strategies

• Extensions


Page 4: GraphDB Fundamentals

What is RDF?

Resource Description Framework (RDF) is a graph data model that

• Formally describes the semantics, or meaning, of information

• Represents metadata, i.e., data about data

The RDF data model consists of triples

• That represent links (or edges) in an RDF graph

• Where the structure of each triple is Subject, Predicate, Object

Example triples:

Subject Predicate Object
br:Fred br:hasSpouse br:Wilma .
br:Fred br:hasAge 25 .

‘br:’ refers to the namespace ‘http://bedrock/’, so that ‘br:Fred’ expands to <http://bedrock/Fred>, a Uniform Resource Identifier (URI).
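The triple structure and the prefix expansion described on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `expand` and the `PREFIXES` table are assumptions made for the example, not part of RDF or GraphDB; the prefix and the two triples are the ones from the slide.

```python
# Minimal sketch: RDF triples as Python tuples, plus prefix expansion.
# `PREFIXES` and `expand` are hypothetical names used only for illustration.
PREFIXES = {"br": "http://bedrock/"}


def expand(term, prefixes=PREFIXES):
    """Expand a prefixed name such as 'br:Fred' into a full URI.

    Values that are not strings (e.g. the literal 25) or that use an
    unknown prefix are returned unchanged.
    """
    if isinstance(term, str) and ":" in term:
        prefix, local = term.split(":", 1)
        if prefix in prefixes:
            return "<" + prefixes[prefix] + local + ">"
    return term


# Each triple is (subject, predicate, object).
triples = [
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
    ("br:Fred", "br:hasAge", 25),
]

expanded = [tuple(expand(term) for term in triple) for triple in triples]
for s, p, o in expanded:
    print(s, p, o, ".")
```

The object of the second triple stays a plain literal, which mirrors the slide: only subjects, predicates, and resource objects are URIs.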

Page 5: GraphDB Fundamentals

An Example of an RDF Model

[Figure: RDF graph of the Flintstone and Rubble families. Nodes: Fred Flintstone, Wilma Flintstone, Pebbles Flintstone, Pearl Slaghoople, Barney Rubble, Betty Rubble, Bamm-Bamm Rubble, Roxy Rubble, Chip, Rock Quarry, Bedrock, Cobblestone County, and Prehistoric America. Edge labels: hasSpouse, hasChild, worksFor, livesIn, locatedIn, and partOf.]

Page 6: GraphDB Fundamentals

What is RDFS?

RDF Schema (RDFS)

• Adds
  − Concepts such as Resource, Literal, Class, and Datatype
  − Relationships such as subClassOf, subPropertyOf, domain, and range

• Provides the means to define
  − Classes and properties
  − Hierarchies of classes and properties

• Includes “entailment rules”, i.e., axioms to infer new triples from existing ones

Page 7: GraphDB Fundamentals

Applying RDFS To Infer New Triples

Asserted triples:

:hasSpouse rdfs:domain :Human ; rdfs:range :Human .
:Fred :hasSpouse :Wilma .
:Human rdfs:subClassOf :Mammal .

Inferred triples:

:Fred a :Human .
:Wilma a :Human .
:Fred a :Mammal .
:Wilma a :Mammal .
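The inference on this slide can be reproduced with a small fixpoint loop. The sketch below is an assumption-laden illustration, not GraphDB's reasoner: it implements only three RDFS entailment rules (domain, range, and subclass membership), and the function name `rdfs_closure` is hypothetical.

```python
# Minimal sketch of RDFS-style inference over a set of triples.
# Only three rules are implemented; real RDFS entailment has more.
RDF_TYPE = "rdf:type"  # written as 'a' in Turtle


def rdfs_closure(triples):
    """Apply the domain, range, and subClassOf rules until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, RDF_TYPE, o2))
                # (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, RDF_TYPE, o2))
                # (C rdfs:subClassOf D) and (x rdf:type C)  =>  (x rdf:type D)
                if p == RDF_TYPE and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


asserted = {
    (":hasSpouse", "rdfs:domain", ":Human"),
    (":hasSpouse", "rdfs:range", ":Human"),
    (":Fred", ":hasSpouse", ":Wilma"),
    (":Human", "rdfs:subClassOf", ":Mammal"),
}

newly_inferred = rdfs_closure(asserted) - asserted
for triple in sorted(newly_inferred):
    print(triple)
```

The loop needs two passes here: the first derives the :Human memberships from the domain and range, and the second derives the :Mammal memberships from the subclass link.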

Page 8: GraphDB Fundamentals

Questions?

RDF and RDFS Overviews


Page 10: GraphDB Fundamentals

What is SPARQL?

SPARQL is a SQL-like query language for RDF graph data with the following query types:

• SELECT, which returns tabular results

• CONSTRUCT, which creates a new RDF graph based on query results

• ASK, which returns ‘yes’ (true) if the query has a solution, otherwise ‘no’ (false)

• DESCRIBE, which returns RDF graph data about a resource; useful when the query client does not know the structure of the RDF data in the data source

SPARQL 1.1 Update adds operations that modify data:

• INSERT, which inserts triples into a graph

• DELETE, which deletes triples from a graph

Page 11: GraphDB Fundamentals

Using SPARQL to Insert Triples

To create an RDF graph, perform these steps:

• Define prefixes to URIs with the PREFIX keyword.

• Use INSERT DATA to signify you want to insert statements. Write the subject-predicate-object statements (triples).

• Execute this query.

PREFIX : <http://bedrock/>
INSERT DATA {
  :fred :hasSpouse :wilma .
  :fred :hasChild :pebbles .
  :wilma :hasChild :pebbles .
  :pebbles :hasSpouse :bamm-bamm ;
           :hasChild :roxy, :chip .
}

[Figure: the resulting RDF graph, with nodes :fred, :wilma, :pebbles, :bamm-bamm, :roxy, and :chip connected by :hasSpouse and :hasChild edges.]
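Besides pasting the update into the Workbench, an update like the one above can be sent over HTTP. The sketch below only builds the request and does not send it. Several details are assumptions rather than facts from this deck: the base URL, the repository id `bedrock`, and the Sesame-style `/repositories/<id>/statements` update endpoint, which may differ between GraphDB editions and versions.

```python
# Minimal sketch: build (but do not send) an HTTP request for a SPARQL update.
# The endpoint layout and the repository id "bedrock" are assumptions.
from urllib.parse import urlencode
from urllib.request import Request

UPDATE = """PREFIX : <http://bedrock/>
INSERT DATA {
  :fred :hasSpouse :wilma .
  :fred :hasChild :pebbles .
}"""


def build_update_request(base_url, repository_id, update):
    """Return a POST request that carries the update as a form parameter."""
    url = f"{base_url.rstrip('/')}/repositories/{repository_id}/statements"
    body = urlencode({"update": update}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


request = build_update_request("http://localhost:8080", "bedrock", UPDATE)
print(request.get_method(), request.full_url)
# To actually execute the update against a running server:
#   from urllib.request import urlopen
#   urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before touching a live repository.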

Page 12: GraphDB Fundamentals

Using SPARQL to Select Triples

To access the RDF graph you just created, perform these steps:

• Define prefixes to URIs with the PREFIX keyword.

• Use SELECT to signify you want to select certain information, and WHERE to signify your conditions, restrictions and filters.

• Execute this query.

PREFIX : <http://bedrock/>
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }

Results (excerpt showing the :hasChild rows):

Subject Predicate Object
:fred :hasChild :pebbles
:pebbles :hasChild :roxy
:pebbles :hasChild :chip
:wilma :hasChild :pebbles

Page 13: GraphDB Fundamentals

Using SPARQL to Find Fred’s Grandchildren

To find Fred’s grandchildren, first find out if Fred has any grandchildren:

• Define prefixes to URIs with the PREFIX keyword.

• Use ASK to discover whether Fred has a grandchild, and WHERE to signify your conditions.

PREFIX : <http://bedrock/>
ASK
WHERE {
  :fred :hasChild ?child .
  ?child :hasChild ?grandChild .
}

Result: YES

Page 14: GraphDB Fundamentals

Using SPARQL to Find Fred’s Grandchildren

Now that we know he has at least one grandchild, perform these steps to find the grandchild(ren):

• Define prefixes to URIs with the PREFIX keyword.

• Use SELECT to signify you want to select a grandchild, and WHERE to signify your conditions.

PREFIX : <http://bedrock/>
SELECT ?grandChild
WHERE {
  :fred :hasChild ?child .
  ?child :hasChild ?grandChild .
}

Results:

grandChild
1. :roxy
2. :chip
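How the two triple patterns in the WHERE clause are joined through the shared ?child variable can be sketched with a tiny pattern matcher. This is an illustration only and not how a SPARQL engine is implemented; the function names `match` and `solve` are hypothetical, and the data is the graph inserted earlier in this section.

```python
# Minimal sketch of basic graph pattern matching over an in-memory triple list.
# Terms that start with '?' are variables; everything else must match exactly.
TRIPLES = [
    (":fred", ":hasSpouse", ":wilma"),
    (":fred", ":hasChild", ":pebbles"),
    (":wilma", ":hasChild", ":pebbles"),
    (":pebbles", ":hasSpouse", ":bamm-bamm"),
    (":pebbles", ":hasChild", ":roxy"),
    (":pebbles", ":hasChild", ":chip"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


WHERE = [
    (":fred", ":hasChild", "?child"),
    ("?child", ":hasChild", "?grandChild"),
]

solutions = solve(WHERE, TRIPLES)
ask = bool(solutions)                                      # ASK
grandchildren = sorted(b["?grandChild"] for b in solutions)  # SELECT ?grandChild
print(ask, grandchildren)
```

ASK and SELECT share the same matching step: ASK only checks whether any solution exists, while SELECT projects the requested variable out of each solution.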

Page 15: GraphDB Fundamentals

SPARQL Overview

Questions?


Page 17: GraphDB Fundamentals

What is an Ontology?

An ontology is a formal specification that provides sharable and reusable knowledge representation.

Examples of ontologies include:

• Taxonomies

• Vocabularies

• Thesauri

• Topic Maps

• Logical Models


Page 18: GraphDB Fundamentals

What is in an Ontology?

An ontology specification includes descriptions of

• Concepts and properties in a domain

• Relationships between concepts

• Constraints on how the relationships can be used

• Individuals as members of concepts

Page 19: GraphDB Fundamentals

The Benefits of an Ontology

Ontologies provide:

• A common understanding of information

• Explicit domain assumptions

These provisions are valuable because ontologies:

• Support data integration for analytics

• Apply domain knowledge to data

• Support interoperation of applications

• Enable model-driven applications

• Reduce the time and cost of application development

• Improve data quality, i.e., metadata and provenance

Page 20: GraphDB Fundamentals

OWL Overview

The Web Ontology Language (OWL) adds more powerful ontology modelling means to RDF/RDFS

• Providing
  − Consistency checks: Are there logical inconsistencies?
  − Satisfiability checks: Are there classes that cannot have instances?
  − Classification: What is the type of an instance?

• Adding identity equivalence and identity difference, such as
  − sameAs, differentFrom, equivalentClass, equivalentProperty

• Offering more expressive class definitions, such as
  − Class intersection, union, complement, disjointness
  − Cardinality restrictions

• Offering more expressive property definitions, such as
  − Object and datatype properties
  − Transitive, functional, symmetric, inverse properties
  − Value restrictions
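Two of the property characteristics listed above, symmetric and transitive, can be illustrated with a short closure computation. This is a sketch under assumptions: declaring :hasSpouse symmetric and :partOf transitive is a modelling choice made for the example (the deck does not state it), the place names are borrowed from the earlier example graph, and `owl_closure` is a hypothetical name.

```python
# Minimal sketch: what symmetric and transitive properties let a reasoner infer.
def owl_closure(triples, symmetric=frozenset(), transitive=frozenset()):
    """Add the triples implied by symmetric and transitive properties."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            if p in symmetric:
                # (s p o)  =>  (o p s)
                new.add((o, p, s))
            if p in transitive:
                # (s p o) and (o p x)  =>  (s p x)
                for s2, p2, o2 in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:
            return closure
        closure |= new


asserted = {
    (":fred", ":hasSpouse", ":wilma"),
    (":bedrock", ":partOf", ":cobblestoneCounty"),
    (":cobblestoneCounty", ":partOf", ":prehistoricAmerica"),
}

closure = owl_closure(
    asserted,
    symmetric={":hasSpouse"},   # assumption: modelled as owl:SymmetricProperty
    transitive={":partOf"},     # assumption: modelled as owl:TransitiveProperty
)
for triple in sorted(closure - asserted):
    print(triple)
```

With these declarations, two triples are inferred: Wilma's spouse is Fred, and Bedrock is part of Prehistoric America.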

Page 21: GraphDB Fundamentals

Ontology Overview

Questions?


Page 23: GraphDB Fundamentals

"Ontology Development 101" by Noy & McGuinness (2001) is a popular, practical seven-step methodology for developing an ontology.

• Step 1: Identify the domain and scope

• Step 2: Consider re-using existing ontologies

• Step 3: Enumerate important terms

• Step 4: Define the classes and class hierarchy

• Step 5: Define the properties of classes

• Step 6: Define property facets

• Step 7: Create instances

A Methodology for Ontologies


Page 24: GraphDB Fundamentals

To help identify the domain and scope of the ontology, answer these questions:

• What is the domain of the ontology?

• What is the purpose of the ontology?

• Who are the users and maintainers?

• What questions will the ontology answer?

Step 1: Identify the Domain and Scope


Page 25: GraphDB Fundamentals

Ontologies are re-usable and extensible and there are a number of existing ontologies that you might consider:

• Your existing ontology

• Widely used ontologies, such as Dublin Core, FOAF, SKOS

• Upper-level ontologies, such as Cyc, UMBEL, DOLCE, SUMO

• Linked Open Data

• Specialized domain ontologies

Step 2: Consider Re-using Existing Ontologies

Page 26: GraphDB Fundamentals

Terminology is useful for domain modeling. Start collecting terminology based on interviews and domain documentation.

Step 3: Enumerate Important Terms


Page 27: GraphDB Fundamentals

To help define the class and class hierarchy, determine which type of modeling to use.

Three types of modeling are:

• Top-down modeling
  − Use it when the general domain concepts are known

• Bottom-up modeling
  − Use it when there is a great variety of concepts and no clear overarching general concepts at the outset

• Hybrid modeling
  − Use it when you need both top-down and bottom-up modeling, which is often the case

Step 4: Define Class and Class Hierarchy


Page 28: GraphDB Fundamentals

Define the properties of classes, such as:

• Intrinsic properties
  − For example, color, mass, density

• Extrinsic properties
  − For example, name, location

• Parts

• Relationships to other individuals

Step 5: Define Properties of Classes


Page 29: GraphDB Fundamentals

Define property facets, such as:

• Property Type
  − Is it symmetric? Is it transitive? Is it a datatype or an object property?

• Cardinality
  − Is the property optional or essential? Is the property a one-to-many relationship?

• Domain
  − From which classes does this property point?

• Range
  − To which classes does this property point?

Step 6: Define Property Facets


Page 30: GraphDB Fundamentals

Create instances of classes

• For example, :Fred a :Human

Creating instances

• Tests the domain ontology

• May expose modeling issues− which can be addressed by iterative refinement

Step 7: Create Instances


Page 31: GraphDB Fundamentals

Ontology Modeling

Questions?


Page 33: GraphDB Fundamentals

GraphDB™ Editions

• GraphDB™ Free

• GraphDB™ Standard

• GraphDB™ Cloud

• GraphDB™ as-a-Service (S4)

• GraphDB™ Enterprise


Page 34: GraphDB Fundamentals

GraphDB™ Free Installation

http://info.ontotext.com/graphdb-free-graphdb

Page 35: GraphDB Fundamentals

To install GraphDB™ Free Edition, perform these steps:

• Download GraphDB™ Free Edition and unzip

• Start the GraphDB and Workbench interfaces in the embedded Tomcat server by executing the startup script located in the root directory:

startup.bat (Windows)

./startup.sh (Linux/Unix/Mac OS)

The message below appears in your Terminal and the GraphDB Workbench opens up at http://localhost:8080/.

INFO: Starting ProtocolHandler ["http-bio-8080"]

Opening web app in default browser

GraphDB™ Free Edition Installation Overview


Page 36: GraphDB Fundamentals

GraphDB™ Free Edition Workbench New Repository

http://localhost:8080

Create a new repository by:

• Launching the GraphDB™ Workbench

• Selecting “Admin”

• Selecting “Locations and Repositories”

• Configuring the new repository

Page 37: GraphDB Fundamentals

Test the repository by

• Selecting “SPARQL”

• Submitting queries

GraphDB™ Workbench Execute Queries

http://localhost:8080

[Figure: Workbench SPARQL editor with two callouts: 1 Insert Data, 2 Query.]

Page 38: GraphDB Fundamentals

GraphDB™ Installation

Questions?


Page 40: GraphDB Fundamentals

Performance Tuning: Memory

With regard to performance tuning

• Memory is the most important factor
  − More memory results in better performance

• Configure the heap space as follows:
  − Set Max Heap Space to ~90% of Free Memory (-Xmx JVM parameter)
  − Use entity-index-size to set the entity index size
  − Cache memory indices (statements, predicates, and FTS)

Page 41: GraphDB Fundamentals

Performance Tuning: Memory

[Figure: breakdown of the total available Java heap, set with the JVM option -Xmx<size>. Regions shown: Java runtime overhead; Entities; Cache memory (cache-memory), covering the POS/PSO, PCSOT, and PTSOC statement indices, the predicate lists (SP/OP), and full-text search; and the GraphDB application heap, covering RDF Rank, geo-spatial, and Lucene. Parameter labels: tuple-index-memory, predicate-memory, fts-memory. Annotations: “Depends on entity-index-size”; “Typically 12-15% of total heap”; “Remaining memory used by GraphDB and the application’s heap”; “Some of the space will be used for caching the RDF Rank, geo-spatial, and Lucene indices (if enabled)”.]

Page 42: GraphDB Fundamentals

Performance Tuning: Load

Each dataset has its own “geometry.” Technicians must gain experience with each dataset in order to refine the loading process. Here are some tips:

• Load Performance
  − Set ‘cache-memory’ to 50% of max heap
  − Disable optional indices
  − Load data in chunks of 1 million statements
  − Use fast transaction mode

• Normal Operations After Load
  − Set ‘cache-memory’ to 38% of max heap
  − Re-enable optional indices
  − Enable safe transaction mode
  − Experiment with
    ▪ cache-memory
    ▪ tuple-index-memory
    ▪ predicate-memory
    ▪ fts-memory
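The sizing tips from the memory and load slides reduce to simple arithmetic, sketched below. The percentages (about 90% of free memory for the heap, 50% of the heap for cache-memory while loading, 38% afterwards) come from the slides; the function name, the megabyte units, and the 16,000 MB example machine are assumptions made for illustration.

```python
# Minimal sketch of the heap and cache sizing rules of thumb from the slides.
# Integer arithmetic keeps the results exact; all sizes are in megabytes.
def heap_settings(free_memory_mb, loading):
    """Return suggested heap and cache sizes for a load phase or normal operation."""
    max_heap_mb = free_memory_mb * 90 // 100        # ~90% of free memory
    cache_percent = 50 if loading else 38            # 50% during load, 38% afterwards
    cache_memory_mb = max_heap_mb * cache_percent // 100
    return {
        "jvm_flag": f"-Xmx{max_heap_mb}m",
        "max_heap_mb": max_heap_mb,
        "cache_memory_mb": cache_memory_mb,
    }


# Hypothetical machine with 16,000 MB of free memory.
during_load = heap_settings(16000, loading=True)
after_load = heap_settings(16000, loading=False)
print(during_load)
print(after_load)
```

These numbers are starting points only; as the slide notes, each dataset needs its own experimentation, and the configurator spreadsheet described next provides a fuller estimate.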

Page 43: GraphDB Fundamentals

To help achieve the optimal configuration, GraphDB™ has a spreadsheet that estimates memory and index configuration values.

The spreadsheet

• Generates command line parameters and ttl configuration based on your input

• Is located in your distribution at ./doc called graphdb-se-configurator.xls

Performance Tuning: Spreadsheet


Page 44: GraphDB Fundamentals

GraphDB™ Enterprise edition provides scalability

• Replication / High Availability cluster

• Improved concurrent querying and scalability

• Resilience for failover

Scalability: GraphDB™ Enterprise


Page 45: GraphDB Fundamentals

Performance Tuning and Scalability

Questions?


Page 47: GraphDB Fundamentals

GraphDB™ Workbench is a web-based administration tool. It is similar to Sesame Workbench, but

• Has more features

• Is intuitive and easier to use

GraphDB™ Workbench functions include

• Managing GraphDB™ repositories

• Loading and exporting data

• Monitoring query execution

• Developing and executing queries

• Managing connectors

• Managing users

GraphDB™ Workbench and Sesame


Page 48: GraphDB Fundamentals

On the following slide is an example of the GraphDB™ Workbench screen.

• Access the GraphDB™ Workbench from a browser.

• The splash page provides a summary of the installed GraphDB™ Workbench.

• The Workbench has a menu bar and a number of convenient pull down menus organized under “Data”, “SPARQL”, “Admin”, and the currently selected repository.

GraphDB™ Workbench


Page 49: GraphDB Fundamentals

Access GraphDB™ Workbench

http://localhost:8080/graphdb-workbench-se/

Page 50: GraphDB Fundamentals

Create a new repository by selecting

• The Admin menu

• Locations and Repositories

• Create Repository

Create New Repository


Page 51: GraphDB Fundamentals

Execute Queries With GraphDB™ Workbench

Selecting the SPARQL menu displays the SPARQL query editor, which

• Allows you to render your query results as Table, Pivot Table, or Google Analytic Charts

Page 52: GraphDB Fundamentals

GraphDB™ Workbench Query Editor


Page 53: GraphDB Fundamentals

Query Monitoring: Abort Query


Page 54: GraphDB Fundamentals

GraphDB™ Workbench and Sesame

Questions?


Page 56: GraphDB Fundamentals

Loading data may be accomplished by using

• GraphDB™ Workbench
  − To upload individual files
  − To upload bulk data from a directory

• Parallel Loader

Loading Data


Page 57: GraphDB Fundamentals

Loading Data

Supported File Formats


Page 58: GraphDB Fundamentals

Loading data through the GraphDB Workbench

To load a local file:

• Select Data -> Import.

• Open the Local files tab and click the Select files icon to choose the file you want to upload.

• Click the Import button.

• Enter the import settings in the pop-up window.

Page 59: GraphDB Fundamentals

Loading Local Files


Page 60: GraphDB Fundamentals

Loading a database server file

• Create a folder named graphdb-import in your user home directory.

• Copy all data files you want to load into the GraphDB database to this folder.

• Go to the GraphDB Workbench.

• Select Data -> Import.

• Open the Server files tab.

• Select the files you want to import.

• Click the Import button.

Page 61: GraphDB Fundamentals

The LoadRDF Parallel Bulk Loader

• Features fast loading of large datasets into new repositories

• Is not intended for updating existing repositories

• Is easy to use:
  − Enter loadrdf <config.ttl> <serial|parallel> <files...>
    ▪ For example: ./loadrdf.sh config.ttl parallel example.ttl

− The “Serial Load” option pipelines the parse, entity resolution, and load tasks.

− The “Parallel Load” batch processes the parse, entity resolution, and load tasks.

Load RDF Parallel Bulk Loader


Page 62: GraphDB Fundamentals

Other ways to load data

• By pasting data in the Text area tab of the Import page.

• By pasting a data URL in the Remote content tab of the Import page.

• By executing an INSERT query in the SPARQL -> SPARQL Query page.

Page 63: GraphDB Fundamentals

Loading Data

Questions?


Page 65: GraphDB Fundamentals

Reasoning Strategies

• Forward Chaining
  − Inferences pre-computed
  − Faster query performance
  − Slower load times
  − More memory/disk space required
  − Updates are expensive (truth maintenance is non-trivial)

• Backward Chaining
  − Inferences performed as needed at query time
  − Slower query performance
  − Faster load times

• Hybrid Chaining
  − Partial forward chaining at data loading time + partial backward chaining at query time
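The trade-off between the two main strategies can be made concrete with a toy class hierarchy. The sketch below is illustrative only and does not reflect GraphDB internals: the :Animal class and all function names are assumptions added for the example, and only class membership is considered.

```python
# Minimal sketch contrasting forward and backward chaining for class membership.
# Hypothetical hierarchy: :Human -> :Mammal -> :Animal (:Animal is not in the deck).
SUBCLASS_OF = {":Human": ":Mammal", ":Mammal": ":Animal"}
ASSERTED_TYPES = {(":Fred", ":Human"), (":Wilma", ":Human")}


def superclasses(cls):
    """Yield cls and every ancestor reachable through the subclass links."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def materialize(asserted):
    """Forward chaining: compute every inferred type once, at load time."""
    return {(x, c) for x, cls in asserted for c in superclasses(cls)}


def has_type_backward(x, cls, asserted):
    """Backward chaining: store nothing extra; walk the hierarchy at query time."""
    return any(x == y and cls in superclasses(c) for y, c in asserted)


store = materialize(ASSERTED_TYPES)

# Forward chaining answers with a set lookup, at the cost of a larger store.
print((":Fred", ":Animal") in store, len(store), "stored statements")

# Backward chaining keeps only the asserted statements but does work per query.
print(has_type_backward(":Fred", ":Animal", ASSERTED_TYPES), len(ASSERTED_TYPES), "stored statements")
```

The materialized store is three times the size of the asserted data here, which is the "more memory/disk space" cost from the slide. It also has to be revised whenever an asserted statement is deleted, which is why updates are expensive under forward chaining.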

Page 66: GraphDB Fundamentals

GraphDB™ Reasoning Optimizations

• GraphDB™ forward chaining/delete optimization
  − Fast deletes
  − Most triplestores perform an expensive full re-compute on updates
  − Truth maintenance minimizes the re-compute, but the required dependency tracking is expensive
  − GraphDB optimizes the update by using backward chaining to derive update dependencies dynamically

• owl:sameAs forward chaining optimization
  − Forward chaining owl:sameAs generates a large number of triples
  − This is caused by statement duplication on equivalent resources
  − The equivalent resource optimization minimizes the triples generated
  − Backward chaining can expand results at query time
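The idea behind the equivalent-resource optimization can be sketched with a union-find structure: statements are stored once against a single representative of each owl:sameAs group, and the group is expanded only when a query asks for it. This is a conceptual illustration under assumptions, not GraphDB's actual implementation; the resource names and function names are hypothetical.

```python
# Minimal sketch: store statements once per owl:sameAs equivalence class.
def find(parent, x):
    """Return the representative of x's equivalence class (with path compression)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def compact(triples, same_as_pairs):
    """Rewrite every term to its representative; return the store and the classes."""
    parent = {}
    for a, b in same_as_pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            # Pick the lexicographically smaller name as a deterministic representative.
            parent[max(ra, rb)] = min(ra, rb)
    stored = {tuple(find(parent, term) for term in triple) for triple in triples}
    return stored, parent


def equivalents(node, parent):
    """Query-time expansion: every name that is owl:sameAs the given node."""
    representative = find(parent, node)
    return sorted(n for n in list(parent) if find(parent, n) == representative)


# Hypothetical data: three names for the same person.
triples = {
    (":fred", ":livesIn", ":bedrock"),
    (":fredFlintstone", ":worksFor", ":quarry"),
    (":frederick", ":hasSpouse", ":wilma"),
}
same_as = [(":fred", ":fredFlintstone"), (":fredFlintstone", ":frederick")]

stored, parent = compact(triples, same_as)
print(len(stored), "statements stored")
print(equivalents(":frederick", parent))
```

Naive forward chaining would repeat each of the three statements for all three names, giving nine statements; the compact store keeps three and recovers the other names at query time.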

Page 67: GraphDB Fundamentals

A Rule Set Consists of

• Prefixes (namespace prefixes)

• Axiomatic triples

• Custom rules

Pre-Defined Rule Sets are

• empty: no reasoning, GraphDB™ operates as a plain RDF store

• rdfs: standard RDFS semantics;

• owl-horst: RDFS + D-Entailment + Some OWL – Tractable

• owl-max: RDFS with most of OWL Lite

• owl2-rl: Conformant OWL2 RL profile except for D-Entailment (types)

• owl2-ql: Reasoning over large volumes of data

Rule Sets


Page 68: GraphDB Fundamentals

Rule Sets and Reasoning Strategies

Questions?


Page 70: GraphDB Fundamentals

Ontotext GraphDB Connectors

• Provide extremely fast full-text search, range search, faceted search, and aggregations

• Utilize an external engine like Lucene, Solr or Elasticsearch

• Flexible schema mapping: index only what you need

• Real-time synchronization of data in GraphDB and the external engine

• Connector management via SPARQL

• Data querying & update via SPARQL

• Based on the GraphDB plug-in architecture

Page 71: GraphDB Fundamentals

Workflow

[Figure: connector workflow. Labels: SPARQL INSERT/DELETE; SPARQL SELECT with or without an embedded Lucene/Solr/Elasticsearch query; GraphDB engine; Query Processor; Internal indexes; Graph indexes; Selective Replication; Lucene/Solr/Elasticsearch; Solr/Elasticsearch direct queries.]

Page 72: GraphDB Fundamentals

Interface

• All interaction via SPARQL queries
  − INSERT for creating connectors
  − SELECT for getting connector configuration parameters
  − INSERT/SELECT/DELETE for managing & querying RDF data

Page 73: GraphDB Fundamentals

Connectors – Primary Features

• Maintaining an index that is always in sync with the data stored in GraphDB

• Multiple independent instances per repository

• The entities for synchronization are defined by:
  − a list of fields (on the Lucene side) and property chains (on the GraphDB side) whose values will be synchronized
  − a list of rdf:types of the entities for synchronization
  − a list of languages for synchronization (the default is all languages)
  − additional filtering by property and value

• Full-text search using native Lucene queries

Page 74: GraphDB Fundamentals

Connectors – Primary Features

• Snippet extraction: highlighting of search terms in the search result

• Faceted search

• Sorting by any preconfigured field

• Paging of results using offset and limit

• Custom mapping of RDF types to Lucene types

• Specifying which Lucene analyzer to use (the default is Lucene's StandardAnalyzer)

• Boosting an entity by the [numeric] value of one or more predicates

• Custom scoring expressions at query time to evaluate score based on Lucene

Page 75: GraphDB Fundamentals

RDF Rank

RDF Rank is a GraphDB™ extension that

• Is similar to PageRank and identifies “important” nodes in an RDF graph based on their interconnectedness

• Is accessed using the rank:hasRDFRank system predicate

• Offers an incremental mode (Incremental RDF Rank) that is useful for frequently changing data

For example, to select the top 100 important nodes in the RDF graph:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT ?n
WHERE { ?n rank:hasRDFRank ?r }
ORDER BY DESC(?r)
LIMIT 100
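To give a feel for ranking nodes by interconnectedness, the sketch below runs a plain PageRank iteration over the small family graph used earlier in this deck. It is not the RDF Rank algorithm itself, whose details are not given here; the damping factor, the iteration count, and the function name `rank_nodes` are assumptions chosen for illustration.

```python
# Minimal sketch: plain PageRank over the subject -> object links of an RDF graph.
# This illustrates the general idea only; it is not GraphDB's RDF Rank.
def rank_nodes(triples, damping=0.85, iterations=50):
    """Return a dict mapping each node to a rank; the ranks sum to 1."""
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    out_links = {n: [] for n in nodes}
    for s, _, o in triples:
        out_links[s].append(o)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # Nodes without outgoing links spread their rank over all nodes.
            targets = out_links[n] or nodes
            share = damping * rank[n] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


TRIPLES = [
    (":fred", ":hasSpouse", ":wilma"),
    (":fred", ":hasChild", ":pebbles"),
    (":wilma", ":hasChild", ":pebbles"),
    (":pebbles", ":hasSpouse", ":bamm-bamm"),
    (":pebbles", ":hasChild", ":roxy"),
    (":pebbles", ":hasChild", ":chip"),
]

rank = rank_nodes(TRIPLES)
for node in sorted(rank, key=rank.get, reverse=True):
    print(node, round(rank[node], 3))
```

The node with the most incoming links from well-connected nodes, :pebbles, ends up on top, which is the same kind of ordering the ORDER BY DESC(?r) query above asks GraphDB to return.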

Page 76: GraphDB Fundamentals

GeoSPARQL Support


GeoSPARQL is a standard for representing and querying geospatial linked data from the Open Geospatial Consortium, using the Geography Markup Language

• A small topological ontology in RDFS/OWL for representation

• Simple Features, RCC8, and DE-9IM (a.k.a. Egenhofer) topological relationship vocabularies and ontologies for qualitative reasoning

• A SPARQL query interface using a set of Topological SPARQL extension functions for quantitative reasoning

Page 77: GraphDB Fundamentals

Extensions

Questions?


Page 78: GraphDB Fundamentals

Support and FAQs

[email protected]

Additional resources:

Ontotext:

• Community Forum and Evaluation Support: http://stackoverflow.com

• GraphDB Website: http://graphdb.ontotext.com

• GraphDB Website: http://ontotext.com/knowledge-hub/documentation/

SPARQL, OWL, and RDF:

• RDF: http://www.w3.org/TR/rdf11-concepts/

• RDFS: http://www.w3.org/TR/rdf-schema/

• SPARQL Overview: http://www.w3.org/TR/sparql11-overview/

• SPARQL Query: http://www.w3.org/TR/sparql11-query/

• SPARQL Update: http://www.w3.org/TR/sparql11-update

Page 79: GraphDB Fundamentals

For Further Information

• Peio Popov, VP of USA Business Operations
  − [email protected]
  − 929.239.0659

• Marin Dimitrov, CTO
  − [email protected]
  − 718.473.0870

Page 80: GraphDB Fundamentals

The End

GraphDB™ Fundamentals