Bio2RDF - Make the most of Virtuoso Open Source
DESCRIPTION
Learn how to deploy Bio2RDF data in a triple store and SPARQL endpoint using Virtuoso Open Source
TRANSCRIPT
Triplestores are database management systems (DBMS) for data modeled in RDF
● Optimized for triples, queryable via the SPARQL query language
● There are three types of triplestores:
○ Native: persistent storage systems with their own database implementation. E.g. Virtuoso
○ In-memory: the RDF graph is stored as triples in RAM. E.g. Jena
○ Non-native, non-memory: persistent storage systems set up to run on a third-party DBMS. E.g. Jena SDB
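To make "optimized for triples" concrete, here is a minimal sketch of what a single triple pattern does, using plain shell and awk rather than a real triplestore. The example.org IRIs and the `match_predicate` helper are made up for illustration; a real store would answer the same question through SPARQL and its indexes.

```shell
#!/bin/sh
# Three triples in N-Triples style: subject, predicate, object, then a dot.
# All IRIs here are hypothetical examples.
triples='<http://example.org/drug/1> <http://example.org/vocab/name> "Aspirin" .
<http://example.org/drug/1> <http://example.org/vocab/target> <http://example.org/protein/9> .
<http://example.org/drug/2> <http://example.org/vocab/name> "Ibuprofen" .'

# Rough equivalent of the SPARQL pattern:
#   SELECT ?s ?o WHERE { ?s <predicate> ?o }
# Only works for objects without spaces; it is a teaching aid, not a parser.
match_predicate() {
    printf '%s\n' "$triples" | awk -v p="<$1>" '$2 == p { print $1, $3 }'
}

match_predicate http://example.org/vocab/name
```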
Managing a VOS installation for Bio2RDF
● Each Bio2RDF dataset is loaded into a Virtuoso Triplestore and available at its own SPARQL endpoint
● We have developed a PHP manager script for creating and configuring Virtuoso installations that is publicly available at: https://github.com/micheldumontier/php-lib/blob/master/apps/manager.php
● It lets you create, start, stop, and change the memory settings of multiple VOS instances
○ manager.php can be downloaded from the URL above
● Requires: a tab-delimited file listing the desired instance names and the HTTP/ISQL ports they use
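A sketch of what such an instances file could look like. The column order (name, HTTP port, ISQL port), the instance names, and the `isql_port_for` helper are assumptions for illustration; check manager.php for the layout it actually expects. The example writes to a temporary path so it does not touch a real setup.

```shell
#!/bin/sh
# Assumed layout: name<TAB>http_port<TAB>isql_port (not confirmed by the slides).
instances_file="${TMPDIR:-/tmp}/instances.tab.example"

# Two example rows; 8890 and 1111 are Virtuoso's default HTTP and ISQL ports,
# the second row just uses the next free numbers.
printf 'drugbank\t8890\t1111\n' >  "$instances_file"
printf 'omim\t8891\t1112\n'     >> "$instances_file"

# Look up the ISQL port for a given instance name.
isql_port_for() {
    awk -F '\t' -v name="$1" '$1 == name { print $3 }' "$instances_file"
}

isql_port_for drugbank
```

Each instance needs its own pair of ports, so a quick lookup like this is handy for spotting clashes before creating instances.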
Use manager.php to configure multiple VOS instances
● Script options
○ create: creates a Virtuoso instance binary for the specified instance name and starts it
○ start: stops, then starts the corresponding VOS instance
○ stop: stops the specified instance
○ refresh: creates a fresh copy of the instance's virtuoso.ini file with default values
○ apacheconfig: creates an Apache VirtualHost file
○ GB of memory to use: specifies the amount of RAM, in GB, that an instance can use
manager.php cli options
Set up manager.php
● Set the location of the local Virtuoso installation
○ $virtuoso_dir = '/usr/local/virtuoso-opensource';
● Set the location of your base directory
○ $base_dir = '/media/320/bio2rdf/manager';
● Set the sub-directory where the Virtuoso instances will live
○ $instance_dir = $base_dir.'/virtuoso';
● Set the location of the instances.tab file
○ $instance_file = $base_dir.'/instances.tab';
Make sure you have appropriate permissions to create the target directories.
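The permission check can be done before running the script. A minimal sketch: `BASE_DIR` and the `check_writable_dir` helper are hypothetical names, and the default path is a temporary directory so the example is safe to run; set `BASE_DIR` to the same value as `$base_dir` in manager.php.

```shell
#!/bin/sh
# Mirror of the manager.php settings; defaults to a throwaway directory.
base_dir="${BASE_DIR:-${TMPDIR:-/tmp}/bio2rdf-manager-example}"
instance_dir="$base_dir/virtuoso"

# Succeeds only if the directory exists (or can be created) and is writable.
check_writable_dir() {
    mkdir -p "$1" 2>/dev/null && [ -w "$1" ]
}

if check_writable_dir "$instance_dir"; then
    echo "ok: $instance_dir is writable"
else
    echo "error: cannot create or write to $instance_dir" >&2
fi
```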
Live demo -> Using manager.php
Load Bio2RDF data into your VOS endpoint
● Loading data into a VOS endpoint can be done using the web interface (Conductor) or ISQL:
○ http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtBulkRDFLoader
● You can also use the Bio2RDF loader script to load RDF into an endpoint
○ uses a tab-delimited instances file to specify the VOS instance to load into
○ can load single files or recursively load files inside a given directory
○ records loading errors and incorrectly formatted RDF
○ retries the load a certain number of times on error
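For the ISQL route, a minimal sketch of a bulk-load script built around the Virtuoso bulk loader functions `ld_dir` and `rdf_loader_run`. The data directory, file mask, graph IRI, and the `make_bulk_load_sql` helper are placeholders, not Bio2RDF conventions. The block only prints the SQL; nothing is sent to a server.

```shell
#!/bin/sh
# Emit the ISQL statements for one bulk load:
#   ld_dir          registers matching files for loading into a graph
#   rdf_loader_run  loads everything that was registered
#   checkpoint      makes the loaded data durable
make_bulk_load_sql() {
    data_dir=$1
    mask=$2
    graph=$3
    printf "ld_dir('%s', '%s', '%s');\n" "$data_dir" "$mask" "$graph"
    printf 'rdf_loader_run();\n'
    printf 'checkpoint;\n'
}

# Hypothetical directory and graph name.
make_bulk_load_sql /data/drugbank '*.nt' http://example.org/graph/drugbank
```

Saved to a file, the output could then be run with something like `isql 1111 dba dba bulk_load.sql`, assuming the default ISQL port and credentials; on some distributions the client binary is named `isql-v`. The data directory must be covered by DirsAllowed in virtuoso.ini.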
loader.php cli options
● Script options
○ file: file to load
○ dir: directory of files to load
○ graph: name of graph to load into
○ instance: instance name from the tab-delimited instances file
○ port: ISQL port of VOS instance
○ user: username for VOS instance (default: dba)
○ pass: password for VOS instance (default: dba)
Using loader.php
● Script options cont'd:
○ threads: number of threads to use when loading
○ updatefacet: update VOS faceted browser (true|false)
○ deletegraph: delete graph before loading (true|false)
○ deleteonly: delete graph without loading (true|false)
○ setns: set namespaces for faceted browser
○ setpassword: set password for VOS conductor
○ format: format of RDF being loaded (default: N3)
○ ignoreerror: ignore RDF format errors (true|false)
Using loader.php with Bio2RDF DrugBank data
● Live demo -> load DrugBank into your newly installed VOS instance
How to build OpenLink Virtuoso OS: Ubuntu
1. Download the VOS 6.1.6 source from GitHub: https://github.com/openlink/virtuoso-opensource
2. Verify that you have installed the required packages (listed under Requirements in the reference below)
3. Uncompress the VOS archive, cd into that directory and run:
○ ./autogen.sh (to prepare for building)
○ Set the appropriate compiler flags
○ ./configure ( > 20 minutes)
4. Install VOS by running:
○ sudo make install
■ install location: /usr/local/virtuoso-opensource
Reference: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSMake#Requirements
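The build steps above can be sketched as one shell function. Assumptions: the `-O2` flag is only a placeholder for "appropriate compiler flags", `--prefix` is passed explicitly to match the install location on the slide, and a separate `make` step is included as in the usual autotools sequence. `build_vos` is a hypothetical name; the function is only defined here, not run.

```shell
#!/bin/sh
# Run from inside the unpacked virtuoso-opensource source directory.
build_vos() {
    ./autogen.sh &&                      # generate the configure script
    CFLAGS="-O2" ./configure \
        --prefix=/usr/local/virtuoso-opensource &&
    make &&                              # compile; this is the slow part
    sudo make install                    # copy files into the prefix
}

# To start the build, call:
#   build_vos
```

Chaining with `&&` stops at the first failing step, so a missing package shows up at configure time instead of after a long compile.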
VOS Post-installation setup
1. Go to your installation directory and copy
○ var/lib/virtuoso/db/virtuoso.ini to bin/virtuoso.ini
2. Allow Virtuoso to read other (or all) directories in your filesystem
○ Edit virtuoso.ini and add '/' at the end of DirsAllowed
3. Start your Virtuoso instance by running:
○ sudo ./virtuoso-t -f & (run from /usr/local/virtuoso-opensource/bin)
○ visit http://localhost:8890/conductor and log in with the default credentials: user: dba, pass: dba
4. Install the Faceted Browser
○ Go to System Admin -> Packages, select the fct package and click on Install/Upgrade -> proceed
○ Verify that http://localhost:8890/fct responds
Reference: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main#How Do I...
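The DirsAllowed edit in step 2 can be scripted. A minimal sketch that works on a throwaway sample file: the sample contents and the `allow_root_dir` helper are illustrative, so point the function at your real bin/virtuoso.ini only after taking a backup.

```shell
#!/bin/sh
# Sample ini with an illustrative DirsAllowed line.
ini="${TMPDIR:-/tmp}/virtuoso.ini.example"
printf '[Parameters]\nDirsAllowed = ., /usr/local/virtuoso-opensource/share/virtuoso/vad\n' > "$ini"

# Append ", /" to the DirsAllowed line unless "/" is already the last entry.
allow_root_dir() {
    if grep -Eq '^DirsAllowed[[:space:]]*=.*,[[:space:]]*/[[:space:]]*$' "$1"; then
        return 0
    fi
    sed 's|^\(DirsAllowed[[:space:]]*=.*\)$|\1, /|' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

allow_root_dir "$ini"
grep '^DirsAllowed' "$ini"
```

Virtuoso reads virtuoso.ini at startup, so restart the instance after changing it. Allowing '/' lets the server read any file its user can access, which is convenient for loading but worth narrowing on shared machines.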