ncbo sparql endpoint

24
THE NATIONAL CENTER FOR BIOMEDICAL ONTOLOGY NCBO SPARQL Endpoint Trish Whetzel Outreach Coordinator

Upload: trish-whetzel

Post on 23-Jun-2015

88 views

Category:

Software


0 download

DESCRIPTION

Tutorial from ICBO 2013.

TRANSCRIPT

Page 1: NCBO SPARQL Endpoint

THE NATIONAL CENTER FORBIOMEDICAL ONTOLOGY

NCBO SPARQL Endpoint

Trish WhetzelOutreach Coordinator

Page 2: NCBO SPARQL Endpoint

http://sparql.bioontology.org/http://sparql.bioontology.org/

Page 3: NCBO SPARQL Endpoint

BioPortal SPARQL Content

• All ontologies from BioPortal– Original ontology format (OBO, OWL,

UMLS/RRF) transformed into RDF– Updated daily– Latest version only

• Statistics– 393 ontologies– 4.2M terms– 2419 different predicates– 80M triples

Page 4: NCBO SPARQL Endpoint

Resource Description Framework

– A collection of RDF statements represents a labeled, directed multi-graph

– RDF is modeled as a set of triples– Triples are in the form (subject, predicate, object)

where:• Subjects and predicates are URIs• Objects are either URIs or Literals• Literals can be typed and have a language tag

– Blank nodes (bnodes) or anonymous resources are nodes that do not have a URI or literal

– RDF is not XML, there are different serializations, e.g. N3, turtle, rdf/xml

Page 5: NCBO SPARQL Endpoint

RDF – Turtle format

– Prefixes are allowed and statements can be grouped by subject

– Specific constructions to represent lists and blank nodes

– Least verbose and best human readable format, easy to serialize and good performance for parsing

Page 6: NCBO SPARQL Endpoint

RDF – N3 format

– One line per triple, no prefixes, not grouped by subject

– More verbose, easy to serialize, best performance for parsing

Page 7: NCBO SPARQL Endpoint

RDF – RDF/XML format

– Extremely verbose, worst human readable format

– Worst performance for parsing

Page 8: NCBO SPARQL Endpoint

SPARQL

– W3C standard query language for RDF– Structure

Page 9: NCBO SPARQL Endpoint

BioPortal Metadata

• Virtual ontology identifier– Stable identifier across all versions of the ontology– All versions of an ontology are linked via this ID

• Ontology version identifier– Unique for each ontology version– Most metadata linked directly to the ontology

version

Page 10: NCBO SPARQL Endpoint
Page 11: NCBO SPARQL Endpoint

Select ontology abbreviations

Page 12: NCBO SPARQL Endpoint
Page 13: NCBO SPARQL Endpoint

All ontologies updated since DATE

Page 14: NCBO SPARQL Endpoint

Find term in all ontologies

Page 15: NCBO SPARQL Endpoint

Select all terms from theABA Adult Mouse Anatomy

Page 16: NCBO SPARQL Endpoint

Select URI and preferred label from all terms

Page 17: NCBO SPARQL Endpoint

Get parent of given term

Page 18: NCBO SPARQL Endpoint

Select all terms and their parent

Page 19: NCBO SPARQL Endpoint

Select distinct properties

Page 20: NCBO SPARQL Endpoint

Select properties for term

Page 21: NCBO SPARQL Endpoint

Count terms in SNOMED

Page 22: NCBO SPARQL Endpoint

Performance Tips and Tricks

– Completely unbound patterns (?g ?s ?p ?o) are not allowed

– To optimize queries, use UNIONS instead of FILTERS

– If using FILTER on literals it is better if the filter is not applied to millions of rows

– To prevent combinatorial explosions of results, consider use CONSTRUCT or DESCRIBE (any M-N relationship can provoke this)

Page 23: NCBO SPARQL Endpoint

SPARQL Code Repository

• https://github.com/ncbo/sparql-code-examples

Page 24: NCBO SPARQL Endpoint

BioPortal SPARQL Endpoint

• Documentation: http://www.bioontology.org/wiki/index.php/SPARQL_BioPortal

• Query interface: http://sparql.bioontology.org/• Example queries:

http://sparql.bioontology.org/examples • Sample code: https://github.com/ncbo/sparql-

code-examples