souri oracle semantic technologies utaustin

Upload: baskarbaju1

Post on 14-Apr-2018


TRANSCRIPT

  • 7/28/2019 Souri Oracle Semantic Technologies UTAustin

    1/112

    Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality

    Feb 2012

    Souri Das, Ph.D.
    Architect, Oracle

    Outline

    • Fundamentals
        • Semantic Technologies in a nutshell
        • Source for RDF data
        • OWL Inferencing Primer
    • Overview
        • Architecture
        • Core Functionality: Load, Infer, Query
        • Enterprise Functionality
        • Tools: OBIEE, Visualization
    • Detailed look
        • Storage
        • Installation and configuration
        • Loading
        • Querying in SQL & SPARQL
        • Inferencing
        • Semantic Indexing of Unstructured Content
        • Security

    Semantic Technologies in a nutshell

    • Data (expressed in RDF)
        • Triples (similar to unpivoted tables)
        • An entity is typically associated with a type (rdf:type)
        • May be extended to quads: a grouping of triples
        • BENEFIT: uniform structure => allows syntactic integration
    • Schema / Ontology (expressed in OWL)
        • Extended with class and property hierarchies
        • Many other such rules
        • BENEFIT => allows discovery of implicit knowledge
        • BENEFIT => allows semantic integration
    • Query (expressed in SPARQL)
        • BENEFIT => suits the triple- or quad-based structure of RDF

    Source for RDF data

    • Native RDF
        • Example: Social network
    • Converted to RDF
        • Example: Tables (via unpivoting)
        • Example: XML
        • Example: Text (via NLP extraction)
        • Example: Spatial (ogc:WKTLiteral)
        • Example: Multimedia (via extraction)
    • Viewed as RDF
        • Example: Tables
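    The "Tables (via unpivoting)" conversion can be pictured with a minimal Python sketch. The subject naming scheme, the `unpivot` helper and the sample rows are assumptions made for illustration only; they are not Oracle's conversion procedure.

```python
def unpivot(table_name, key_column, rows):
    """Turn each relational row into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        # Hypothetical naming scheme: one subject per row, built from the key.
        subject = f":{table_name}_{row[key_column]}"
        # Every entity gets a type triple, mirroring rdf:type.
        triples.append((subject, "rdf:type", f":{table_name}"))
        for column, value in row.items():
            if column == key_column or value is None:
                continue  # the key is already in the subject; NULLs yield no triple
            triples.append((subject, f":{column}", value))
    return triples


# Hypothetical sample rows.
patients = [
    {"id": 1, "diagnosis": "Hand_Fracture"},
    {"id": 2, "diagnosis": "Rheumatoid_Arthritis"},
]
triples = unpivot("Patient", "id", patients)
for t in triples:
    print(t)
```

    Each non-key column becomes a predicate, which is why every table ends up with the same uniform three-column shape.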


    OWL Inferencing: A short Primer

    rdfs:subClassOf

    rdfs:subPropertyOf

    rdfs:domain

    rdfs:range

    owl:FunctionalProperty

    owl:InverseFunctionalProperty

    owl:SymmetricProperty

    owl:TransitiveProperty

    owl:inverseOf

    owl:someValuesFrom

    owl:allValuesFrom

    owl:hasValue

    owl:sameAs

    owl:differentFrom

    owl:equivalentClass

    owl:equivalentProperty

    owl:disjointWith

    owl:complementOf

    Inference: Examples (owl:FunctionalProperty)

    :hasMother rdf:type owl:FunctionalProperty

    :John :hasMother :Mary
    :John :hasMother :Maria
    =>
    :Mary owl:sameAs :Maria
    :Maria owl:sameAs :Mary

    Inference: Examples (owl:InverseFunctionalProperty)

    :hasSSN rdf:type owl:InverseFunctionalProperty

    :John :hasSSN "123-45-6789"
    :Johny :hasSSN "123-45-6789"
    =>
    :John owl:sameAs :Johny
    :Johny owl:sameAs :John

    Inference: Examples (owl:SymmetricProperty)

    :hasSibling rdf:type owl:SymmetricProperty

    :John :hasSibling :Mary
    =>
    :Mary :hasSibling :John

    Inference: Examples (owl:TransitiveProperty)

    :hasAncestor rdf:type owl:TransitiveProperty

    :John :hasAncestor :Mary
    :Mary :hasAncestor :Tom
    =>
    :John :hasAncestor :Tom

    Inference: Examples (owl:inverseOf)

    :hasParent owl:inverseOf :hasChild

    :John :hasParent :Mary
    =>
    :Mary :hasChild :John
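    The symmetric, transitive and inverse-property examples can be reproduced with a small Python sketch that keeps applying the three rules until no new triple appears. This is only an illustration of the rule semantics on an in-memory set; the `materialize` function is a hypothetical helper, not the database's inference engine.

```python
def materialize(triples):
    """Apply symmetric, transitive and inverseOf rules until nothing new is inferred."""
    facts = set(triples)
    while True:
        symmetric = {s for s, p, o in facts
                     if p == "rdf:type" and o == "owl:SymmetricProperty"}
        transitive = {s for s, p, o in facts
                      if p == "rdf:type" and o == "owl:TransitiveProperty"}
        inverses = {(s, o) for s, p, o in facts if p == "owl:inverseOf"}
        inverses |= {(b, a) for a, b in inverses}  # owl:inverseOf works both ways

        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                new |= {(s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o}
            new |= {(o, q, s) for a, q in inverses if a == p}
        if new <= facts:
            return facts  # fixpoint reached
        facts |= new


facts = materialize([
    (":hasSibling", "rdf:type", "owl:SymmetricProperty"),
    (":hasAncestor", "rdf:type", "owl:TransitiveProperty"),
    (":hasParent", "owl:inverseOf", ":hasChild"),
    (":John", ":hasSibling", ":Mary"),
    (":John", ":hasAncestor", ":Mary"),
    (":Mary", ":hasAncestor", ":Tom"),
    (":John", ":hasParent", ":Mary"),
])
```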

    Inference: Examples (owl:disjointWith)

    :Male owl:disjointWith :Female

    :John rdf:type :Male
    :Mary rdf:type :Female
    =>
    :John owl:differentFrom :Mary
    :Mary owl:differentFrom :John

    Inference: Examples (owl:complementOf)

    :NonHuman owl:complementOf :Human

    :Fish rdfs:subClassOf :NonHuman
    =>
    :Fish owl:disjointWith :Human
    :Human owl:disjointWith :Fish

    Inference: Examples (owl:someValuesFrom)

    :Player owl:equivalentClass _:c1
    _:c1 owl:onProperty :participateIn
    _:c1 owl:someValuesFrom :Sports
    _:c1 rdf:type owl:Restriction

    :Soccer rdf:type :Sports
    :John :participateIn :Soccer
    =>
    :John rdf:type :Player

    Inference: Examples (owl:hasValue)

    :EligibleToBePresident owl:equivalentClass _:c1
    _:c1 owl:onProperty :countryOfBirth
    _:c1 owl:hasValue :USA
    _:c1 rdf:type owl:Restriction

    :John :countryOfBirth :USA
    =>
    :John rdf:type :EligibleToBePresident

    :Tom rdf:type :EligibleToBePresident
    =>
    :Tom :countryOfBirth :USA

    THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE.

    Importance of W3C & OGC Semantic Standards

    • Key W3C Web Semantic Activities:
        • W3C RDF Working Group
        • W3C SPARQL Working Group
        • W3C RDB2RDF Working Group (Editors of R2RML)
        • W3C OWL Working Group
        • W3C Semantic Web Education & Outreach (SWEO)
        • W3C Health Care & Life Sciences Interest Group (HCLS)
        • W3C Multimedia Semantics Incubator Group
        • W3C Semantic Web Rules Language (SWRL)
    • OGC GeoSPARQL Standard Working Group (Tech. Editor)

    Architectural Overview

    [Figure: architecture diagram. Recoverable labels:]
    • Stored in Oracle DB: enterprise (relational) data; RDF/OWL data and ontologies; rulebases (OWL, RDF/S, user-defined); inferred RDF/OWL data; semantic indexes
    • Core functionality: LOAD (bulk load, incremental DML); INFER (RDF/S, OWL subsets, user-defined); QUERY (SPARQL in SQL) of RDF/OWL data and ontologies; ontology-assisted query of enterprise data
    • Security: Oracle Label Security
    • Programming interfaces: SQL interface (SQL*Plus, PL/SQL, SQL Developer); Java API support (JDBC, Java programs; SPARQL via Jena / Sesame); SPARQL endpoints (Joseki / Sesame)
    • 3rd-party callouts: reasoners (Pellet), NLP extractors
    • Tools: OBIEE via SPARQL Gateway; Cytoscape-based visualizer

    Mapping Core Entities to DB objects

    Each core entity is mapped to a view object in the database:

    Sem. Store entity type        Database view name
    Model m                       mdsys.RDFM_m
    Rulebase rb                   mdsys.RDFR_rb
    Rules Index (entailment) x    mdsys.RDFI_x
    Virtual Model vm              mdsys.SEMV_vm (allows duplicates)
                                  mdsys.SEMU_vm (unique)

    SELECT privilege for a core entity is directly related to SELECT privilege for the corresponding view object.

    Core Functionality: Load / Query / Inference

    • Load
        • Bulk load
        • Incremental load
    • Query and DML
        • SPARQL (from Java/endpoint)
    • Inference
        • Native support for OWL 2 RL, SNOMED (OWL 2 EL subset), OWLprime, SKOSCORE, etc.
        • Named Graph (Local/Global) Inference
        • User-defined rules

    [Figure: Oracle Database holding application tables (A1, A2, ..., An) in user schemas such as scott and herman, and the Semantic Network (MDSYS) with Models (M1, M2, ..., Mn), Values, Triples, Rulebases & Vocabularies (OWL subset, RDF/RDFS, Rulebase m) and Entailments (X1, X2, ..., Xp).]

    Enterprise Functionality: SQL / Sem. Indexing / Security

    • SPARQL query (embedding) in SQL
        • Allows joining SPARQL results with relational data
        • Allows use of rich SQL operators (such as aggregates)
    • Semantic indexing
        • Index consists of RDF triples extracted from documents stored (directly or indirectly) in a table column
        • Extraction done by one or more 3rd-party information extractors
    • Security: Fine-Grained Access Control (for each triple)
        • Uses Oracle Label Security (OLS)
        • Each RDF triple has an associated sensitivity label
    • Querying Text and Spatial data using SPARQL

    Tools: OBIEE for RDF data (using SPARQL Gateway)

    Easy integration of RDF data with Business Intelligence (OBIEE) through SPARQL Gateway

    Tools: Visualization using Cytoscape and SEM_ANALYSIS

    • Detail Graph
        • Whole graph or a subgraph
    • Summary Graph
        • Static Summaries
            • Representative Instance
            • Particular Instance
        • Dynamic Summaries
            • Summary for SPARQL-pattern based dynamic subgraph
    • (Summary-Detail) Hybrid Graph

    Example 1: Detail Graph

    Example 2: Detail SubGraphs

    Example 3: Representative & Particular Instance Summary Graphs

    Note: The right graph was manually re-arranged to simplify comparison.

    Example 4: Dynamic Subset-based Summary

    Example 5: Particular Instance; also an example of a hybrid (detail-summary) graph

    Legend: Detail Edge; Aggregate Edge (can be expanded using the expand property); Absent Edge

    Installation and Configuration of Oracle Database Semantic Technologies

    Installation and Configuration (1)

    • Load the PL/SQL packages and jar file
        cd $ORACLE_HOME/md/admin
        Login as sysdba
        SQL> @catsem

    • Create a tablespace for the semantic network (customize as needed)
        create bigfile tablespace semts
          datafile '?/dbs/semts01.dat' size 512M reuse
          autoextend on next 512M maxsize unlimited
          extent management local
          segment space management auto;

    Installation and Configuration (2)

    • Create a temporary tablespace
        create bigfile temporary tablespace semtmpts
          tempfile '?/dbs/semtmpts.dat' size 512M reuse
          autoextend on next 512M maxsize unlimited
          EXTENT MANAGEMENT LOCAL;
        ALTER DATABASE DEFAULT TEMPORARY TABLESPACE semtmpts;

    • Create an undo tablespace
        CREATE bigfile UNDO TABLESPACE semundots
          DATAFILE '?/dbs/semundots.dat' SIZE 512M REUSE
          AUTOEXTEND ON next 512M maxsize unlimited
          EXTENT MANAGEMENT LOCAL;
        ALTER SYSTEM SET UNDO_TABLESPACE=semundots;

    Installation and Configuration (3)

    • Create a semantic network
        As sysdba
        SQL> exec sem_apis.create_sem_network('semts');

    • Create a semantic model
        As scott (or other)
        SQL> create table test_tpl (triple sdo_rdf_triple_s) compress;
        SQL> exec sem_apis.create_sem_model('test', 'test_tpl', 'triple');

    Loading RDF triples

    Loading Semantic Data

    • Incremental DMLs (small number of changes); recommended loading method for a very small number of triples
        • SQL: Insert
        • SQL: Delete
        • Java API (Jena): GraphOracleSem.add, delete
        • Java API (Sesame): OracleSailConnection.addStatement, removeStatements
    • Bulk load (adding many triples); recommended loading method for a very large number of triples
        • PL/SQL: sem_apis.bulk_load_from_staging_table()
            • Staging table may be populated using SQL*Loader or External Table
        • Java API (Jena): OracleBulkUpdateHandler.addInBulk, prepareBulk, completeBulk
        • Java API (Sesame): OracleBulkUpdateHandler.addInBulk, prepareBulk, completeBulk

    Bulk load: Load Data into Staging Table

    • Create a staging table
        CREATE TABLE STAGE_TABLE (
          RDF$STC_sub   varchar2(4000) not null,
          RDF$STC_pred  varchar2(4000) not null,
          RDF$STC_obj   varchar2(4000) not null,
          RDF$STC_graph varchar2(4000)
        ) compress;
        -- RDF$STC_graph column is required if loading N-Quads

    • Grant appropriate privileges to MDSYS
        GRANT SELECT, INSERT on STAGE_TABLE to MDSYS;
        -- INSERT privilege is required if using External Table (see below)

    • Two ways for loading from file(s)
        • Using External Table (for N-Triple or N-Quad format)
        • Using SQL*Loader (for N-Triple format only)

    Load Data into Staging Table: using External Table

    • Create an External Table and associate it with the data files
        BEGIN
          sem_apis.create_source_external_table(
            source_table  => 'stage_table_source',
            def_directory => 'DATA_DIR',
            bad_file      => 'CLOBrows.bad');
        END;
        /
        grant SELECT on "stage_table_source" to MDSYS;
        alter table "stage_table_source" location ('demo_datafile.nt');

    • Load content of External Table into the Staging Table
        BEGIN
          sem_apis.load_into_staging_table(
            staging_table => 'STAGE_TABLE',
            source_table  => 'stage_table_source',
            input_format  => 'N-QUAD');
        END;
        /

    • For large loads, consider parallel loading
        • Distribute the input data into multiple files (associate them with a single External Table)
        • Use the parallel=> parameter when invoking sem_apis.load_into_staging_table

    Load Data into Staging Table: using SQL*Loader

    • Create a control file for SQL*Loader
        UNRECOVERABLE
        LOAD DATA
        APPEND
        into table stage_table
        when (1) <> '#'
        (
          RDF$STC_sub  CHAR(4000) terminated by whitespace,
          RDF$STC_pred CHAR(4000) terminated by whitespace,
          RDF$STC_obj  CHAR(5000)
            "rtrim(:RDF$STC_obj,'. '||CHR(9)||CHR(10)||CHR(13))"
        )

    • Invoke SQL*Loader
        sqlldr userid=<user>/<password> control=<control file> data=<data file> direct=true

    • For large loads, consider parallel loading
        • Distribute the input data into multiple files and invoke sqlldr from multiple sessions
          sqlldr userid=<user>/<password> control=<control file> data=<data file> direct=true parallel=true &
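    What the control file does to each input line can be sketched in Python: skip lines whose first character is '#', cut subject and predicate at whitespace, and right-trim the dot, spaces, tabs and line breaks from the object. The function name `parse_ntriple_line` is hypothetical, and the sketch assumes one well-formed N-Triple per line.

```python
def parse_ntriple_line(line):
    """Split one N-Triples line into (subject, predicate, object); None for comments/blanks."""
    if not line.strip() or line.startswith("#"):
        return None
    # Subject and predicate each end at the first run of whitespace.
    subject, predicate, rest = line.split(None, 2)
    # Mirror rtrim(obj, '. ' || TAB || LF || CR): drop the trailing dot and whitespace.
    obj = rest.rstrip(". \t\n\r")
    return subject, predicate, obj


sample = '<http://ex/a> <http://ex/p> "hello world" .\n'
print(parse_ntriple_line(sample))
```

    A line with fewer than three fields raises ValueError in this sketch; a real loader would write such rows to a bad file instead.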

    Loading RDF model from Staging Table

    • Once the Staging Table has been loaded, issue the following call
        BEGIN
          sem_apis.bulk_load_from_staging_table(
            model_name  => 'my_rdf_model',
            table_owner => 'SCOTT',
            table_name  => 'STAGE_TABLE',
            flags       => 'PARSE');
        END;
        /

    • For parallel loading of large data, consider additional attributes in the flags parameter
        • PARALLEL=<n>
        • MBV_METHOD=SHADOW

    Load Data into Staging Table using prepareBulk

    • When you have many RDF/XML, N3, TriX or TriG files (can start multiple threads and load files in parallel)
        OracleSailConnection osc = oracleSailStore.getConnection();
        oracleSailStore.disableAllAppTabIndexes();
        for (int idx = 0; idx < szAllFiles.length; idx++) {
          // open the next input file (the slide uses fis without declaring it)
          FileInputStream fis = new FileInputStream(szAllFiles[idx]);
          osc.getBulkUpdateHandler().prepareBulk(
            fis,
            "http://abc",         // baseURI
            RDFFormat.NTRIPLES,   // dataFormat
            "SEMTS",              // tablespaceName
            null,                 // flags
            null,                 // register a StatusListener
            "STAGE_TABLE",        // table name
            (Resource[]) null     // Resource... for contexts
          );
          osc.commit();
          fis.close();
        }

    • The latest Jena Adapter has prepareBulk and completeBulk APIs

    More Data Loading Choices (1)

    • Use External Table to load data into Staging Table (using multiple files is critical to performance)
        CREATE TABLE stable_ext(
          RDF$STC_sub  varchar2(4000),
          RDF$STC_pred varchar2(4000),
          RDF$STC_obj  varchar2(4000))
        ORGANIZATION EXTERNAL (
          TYPE ORACLE_LOADER
          DEFAULT DIRECTORY tmp_dir
          ACCESS PARAMETERS(
            RECORDS DELIMITED by NEWLINE
            PREPROCESSOR bin_dir:'uncompress.sh'
            FIELDS TERMINATED BY ' ' )
          LOCATION ('data1.nt.gz', 'data2.nt.gz', ..., 'data_4.nt.gz'))
        REJECT LIMIT UNLIMITED;

    More Data Loading Choices (2)

    • Load directly using Jena Adapter
        Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);
        Model model = ModelOracleSem.createOracleSemModel(oracle, szModelName);
        InputStream in = FileManager.get().open("./univ.owl");
        model.read(in, null);

    • More loading examples using Jena Adapter
        • Examples 7-2, 7-3, and 7-12 (SPARUL) [1]

    • Loading RDFa
        graphOracleSem.getBulkUpdateHandler().prepareBulk(rdfaUrl, ...)

    [1]: Oracle Database Semantic Technologies Developer's Guide
         http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11828/toc.htm

    More Data Loading Choices (3)

    • Load directly using Sesame Adapter
        OraclePool op = new OraclePool(
            OraclePool.getOracleDataSource(jdbcUrl, user, password));
        OracleSailStore store = new OracleSailStore(op, model);
        SailRepository sr = new SailRepository(store);
        RepositoryConnection repConn = sr.getConnection();
        repConn.setAutoCommit(false);
        repConn.add(new File(trigFile), "http://my.com/", RDFFormat.TRIG);
        repConn.commit();

    • More loading examples using Sesame Adapter
        • Examples 8-5, 8-7, 8-8, 8-9, and 8-10 [1]

    [1]: Oracle Database Semantic Technologies Developer's Guide
         http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11828/toc.htm

    Utility APIs

    • SEM_APIS.remove_duplicates
        e.g. exec sem_apis.remove_duplicates('graph_model');
    • SEM_APIS.merge_models
        Can be used to clone a model as well.
        e.g. exec sem_apis.merge_models('model1', 'model2');
    • SEM_APIS.swap_names
        e.g. exec sem_apis.swap_names('production_model', 'prototype_model');
    • SEM_APIS.alter_model (entailment)
        e.g. sem_apis.alter_model('m1', 'MOVE', 'TBS_SLOWER');
    • SEM_APIS.rename_model / rename_entailment

    Inference

    Core Inference Features

    • Inference done using forward chaining
        • Triples inferred and stored ahead of query time
        • Removes on-the-fly reasoning and results in fast query times
    • Various native rulebases provided
        • E.g., RDFS, OWL 2 RL, SNOMED (EL+), SKOS
    • Validation of inferred data
    • User-defined rules
    • Proof generation
        • Shows one deduction path
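    Forward chaining with a recorded deduction path can be sketched in a few lines of Python. The sketch applies a single RDFS-style rule (type propagation along rdfs:subClassOf), returns only the newly inferred triples, and remembers the premises that first produced each one. The `infer_types` helper and the sample classes are hypothetical; the database's rulebases cover far more rules.

```python
def infer_types(asserted):
    """Forward-chain: (?x rdf:type ?c), (?c rdfs:subClassOf ?d) => (?x rdf:type ?d)."""
    facts = set(asserted)
    proofs = {}  # inferred triple -> the two premises that first produced it
    changed = True
    while changed:
        changed = False
        for x, p, c in list(facts):
            if p != "rdf:type":
                continue
            for c2, q, d in list(facts):
                if q == "rdfs:subClassOf" and c2 == c:
                    conclusion = (x, "rdf:type", d)
                    if conclusion not in facts:
                        facts.add(conclusion)
                        proofs[conclusion] = ((x, p, c), (c2, q, d))
                        changed = True
    # Only the new triples are returned; the asserted data is left untouched.
    return facts - set(asserted), proofs


asserted = [
    (":rex", "rdf:type", ":Dog"),
    (":Dog", "rdfs:subClassOf", ":Mammal"),
    (":Mammal", "rdfs:subClassOf", ":Animal"),
]
inferred, proofs = infer_types(asserted)
```

    Because the inferred triples are computed once and stored, a later query only has to read them, which is the trade-off the slide describes.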

    OWL Subsets Supported

    • OWL subsets for different applications
        • RDFS++
            • RDFS plus owl:sameAs and owl:InverseFunctionalProperty
        • OWLSIF (OWL with IF semantics)
            • Based on Dr. Horst's pD* vocabulary [1]
        • OWLPrime
            • Includes RDFS++, OWLSIF with additional rules
            • Jointly determined with domain experts, customers and partners
        • OWL 2 RL
            • W3C Standard
            • Adds rules about keys, property chains, unions and intersections to OWLPrime
        • SNOMED
    • Choice of rulebases
        • If ontology is in EL, choose SNOMED component
        • If OWL 2 features (chains, keys) are not used, choose OWLPrime
        • Choose OWL2RL otherwise.

    [1]: Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary

    11g Release 2 Inference Features

    • Richer semantics support
        • OWL 2 RL, SKOS, SNOMED (subset of OWL 2 EL)
    • Performance enhancements
        • Large scale owl:sameAs handling
            • Compact materialization of owl:sameAs closure
        • Parallel inference
            • Leverage native Oracle parallel query and parallel DML
        • Incremental inference
            • Efficient updates of inferred graph through additions
        • Compact Data Structures
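    The slides do not say how the owl:sameAs closure is kept compact. One common technique, shown here purely as an assumption, is to pick a single representative per group of equivalent resources with a union-find structure instead of storing every pairwise owl:sameAs triple. The `canonical_map` helper is hypothetical.

```python
def canonical_map(same_as_pairs):
    """Map every resource to one representative of its owl:sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Arbitrary but deterministic choice: the smaller name represents the group.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra
    return {x: find(x) for x in parent}


mapping = canonical_map([(":John", ":Johny"), (":Johny", ":J_Smith")])
```

    For a group of n equivalent resources, the full closure holds n*(n-1) owl:sameAs triples between distinct resources, while the representative map holds n entries.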

    Inference APIs

    • SEM_APIS.CREATE_ENTAILMENT (recommended API for inference)
        SEM_APIS.CREATE_ENTAILMENT(
          index_name,
          sem_models('GraphTBox', 'GraphABox', ...),
          sem_rulebases('OWL2RL'),
          passes,
          inf_components,
          options)
        • Use "PROOF=T" to generate inference proof
        • Typical usage:
            • First load RDF/OWL data
            • Call create_entailment to generate inferred graph
            • Query both original graph and inferred data
        • Inferred graph contains only new triples! Saves time & resources

    • SEM_APIS.VALIDATE_ENTAILMENT
        SEM_APIS.VALIDATE_ENTAILMENT(
          sem_models('GraphTBox', 'GraphABox', ...),
          sem_rulebases('OWLPrime'),
          criteria,
          max_conflicts,
          options)
        • Typical usage:
            • First load RDF/OWL data
            • Call create_entailment to generate inferred graph
            • Call validate_entailment to find inconsistencies

    • Jena Adapter API: GraphOracleSem.performInference()

    Extending Semantics Supported by 11.2 OWL Inference

    • Option 1: add user-defined rules
        • Both 10g and 11g RDF/OWL support user-defined rules in this form:
            Antecedents:  ?x :parentOf ?y . ?z :brotherOf ?x .
            Consequents:  ?z :uncleOf ?y
        • Filter expressions are allowed:
            Antecedents:  ?x :hasAge ?age .
            Filter:       ?age > 18
            Consequents:  ?x :type :Adult .
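    How an antecedent/consequent rule with an optional filter fires can be sketched in Python: each antecedent pattern is matched against the facts while variable bindings accumulate, and the consequent is instantiated for every surviving binding. The `match` and `apply_rule` helpers are hypothetical and say nothing about how the database stores or evaluates its rules.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None on a conflict."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None      # constant does not match
    return result


def apply_rule(antecedents, consequent, facts, condition=lambda b: True):
    """Return the consequent triples for every binding satisfying all antecedents."""
    solutions = [{}]
    for pattern in antecedents:
        solutions = [b2 for b in solutions for t in facts
                     if (b2 := match(pattern, t, b)) is not None]
    return {tuple(b.get(term, term) for term in consequent)
            for b in solutions if condition(b)}


facts = {
    (":Mary", ":parentOf", ":John"),
    (":Tom", ":brotherOf", ":Mary"),
    (":John", ":hasAge", 21),
    (":Ann", ":hasAge", 12),
}
uncles = apply_rule([("?x", ":parentOf", "?y"), ("?z", ":brotherOf", "?x")],
                    ("?z", ":uncleOf", "?y"), facts)
adults = apply_rule([("?x", ":hasAge", "?age")], ("?x", ":type", ":Adult"),
                    facts, condition=lambda b: b["?age"] > 18)
```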

    Extending Semantics Supported by 11.2 OWL Inference

    • Option 2: Separation in TBox and ABox reasoning through PelletDb (using Oracle Jena Adapter)
        • TBox (schema related) tends to be small in size
            • Generate a class subsumption tree using a complete DL reasoner like Pellet
        • ABox (instance related) can be arbitrarily large
            • Use the native inference engine in Oracle to infer new knowledge based on the class subsumption tree from the TBox

    [Figure: TBox -> DL reasoner -> TBox & complete class tree; TBox & complete class tree plus ABox -> Inference Engine in Oracle]

    Enabling Advanced Inference Capabilities

    • Parallel inference option
        EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases('OWLPRIME'), null, null, 'DOP=x');
        Where x is the degree of parallelism (DOP)

    • Incremental inference option
        EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases('OWLPRIME'), null, null, 'INC=T');

    • Enabling owl:sameAs option to limit duplicates
        EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases('OWLPRIME'), null, null, 'OPT_SAMEAS=T');

    • Compact data structures
        EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases('OWLPRIME'), null, null, 'RAW8=T');

    • OWL2RL/SKOS inference
        EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases(x), null, null);
        x in ('OWL2RL', 'SKOSCORE')

    Named Graph Based Global and Local Inference

    • Named Graph Based Global Inference (NGGI)
        • Perform inference on just a subset of the triples
        • Some usage examples
            • Run NGGI on just the TBox
            • Run NGGI on just a single named graph
            • Run NGGI on just a single named graph and a TBox
    • Named Graph Based Local Inference (NGLI)
        • Perform local inference for each named graph (optionally with a common TBox)
        • Triples from different named graphs will not be mixed together.
    • NGGI and NGLI together can achieve efficient named graph based inference maintenance

    Querying Semantic Data

    Semantic Operators Expand Terms for SQL SELECT

    • Scalable, efficient SQL operators to perform ontology-assisted query against enterprise relational data

    [Figure: ontology fragment in which Finger_Fracture, Hand_Fracture, Elbow_Fracture, Forearm_Fracture, Arm_Fracture and Upper_Extremity_Fracture are linked by rdfs:subClassOf edges]

    Patients diagnosis table:

    ID    DIAGNOSIS
    1     Hand_Fracture
    2     Rheumatoid_Arthritis

    Query: Find all entries in the diagnosis column that are related to Upper_Extremity_Fracture

    Traditional syntactic query against relational data (zero matches: a syntactic query against the relational table will not work!)

        SELECT p_id, diagnosis
        FROM Patients
        WHERE diagnosis = 'Upper_Extremity_Fracture';

    New semantic query against relational data (while consulting the ontology)

        SELECT p_id, diagnosis
        FROM Patients
        WHERE SEM_RELATED(diagnosis, 'rdfs:subClassOf', 'Upper_Extremity_Fracture', 'Medical_ontology') = 1;

        SELECT p_id, diagnosis
        FROM Patients
        WHERE SEM_RELATED(diagnosis, 'rdfs:subClassOf', 'Upper_Extremity_Fracture', 'Medical_ontology') = 1
        AND SEM_DISTANCE()
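    The idea behind the semantic query can be sketched in Python: expand the search term to everything below it in the subclass hierarchy, then filter the relational rows against that expanded set. The edge list is one assumed reading of the slide's hierarchy figure, `subclasses_of` is a hypothetical helper, and including the term itself in the result is a choice of this sketch rather than a statement about SEM_RELATED.

```python
def subclasses_of(target, subclass_edges):
    """Return target plus every class reachable downward via rdfs:subClassOf."""
    children = {}
    for child, parent in subclass_edges:
        children.setdefault(parent, []).append(child)
    seen, stack = {target}, [target]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Assumed (child, parent) edges; the slide's figure is only partly recoverable.
edges = [
    ("Arm_Fracture", "Upper_Extremity_Fracture"),
    ("Hand_Fracture", "Arm_Fracture"),
    ("Elbow_Fracture", "Arm_Fracture"),
    ("Forearm_Fracture", "Arm_Fracture"),
    ("Finger_Fracture", "Hand_Fracture"),
]
patients = [(1, "Hand_Fracture"), (2, "Rheumatoid_Arthritis")]

syntactic = [row for row in patients if row[1] == "Upper_Extremity_Fracture"]
related = subclasses_of("Upper_Extremity_Fracture", edges)
semantic = [row for row in patients if row[1] in related]
```

    The syntactic comparison finds nothing, while the expanded set matches the Hand_Fracture row.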

    SPARQL Query Architecture

    [Figure: three access paths feed a common SPARQL-to-SQL core logic]
    • SQL: SEM_MATCH
    • Java: Jena API via Jena Adapter; Sesame API via Sesame Adapter
    • HTTP: Standard SPARQL Endpoint, enhanced with query management control

    SEM_MATCH: Adding SPARQL to SQL

    • Extends SQL with SPARQL constructs
        • Graph Patterns, OPTIONAL, UNION
        • Dataset Constructs
        • FILTER including SPARQL built-ins
        • Prologue
        • Solution Modifiers
    • Benefits:
        • Allows SQL constructs/functions
        • JOINs with other object-relational data
        • DDL Statements: create tables/views

    SEM_MATCH: Adding SPARQL to SQL

        SELECT n1, n2
        FROM TABLE(SEM_MATCH(
          'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
           SELECT ?n1 ?n2
           FROM <...>
           WHERE { ?p foaf:name ?n1
                   OPTIONAL { ?p foaf:knows ?f .
                              ?f foaf:name ?n2 }
                   FILTER (REGEX(?n1, "^A")) }
           ORDER BY ?n1 ?n2',
          SEM_MODELS('M1'), ...));

    • SEM_MATCH is a SQL table function; the query returns rows such as:

        n1       n2
        Alex     Jerry
        Alex     Tom
        Alice    Bill
        Alice    Jill
        Alice    John

    • The table function is rewritable: the SEM_MATCH call is replaced by a SQL subquery such as

        ( SELECT v1.value AS n1, v2.value AS n2
          FROM VALUES v1, VALUES v2, TRIPLES t1, TRIPLES t2, ...
          WHERE t1.obj_id = v1.value_id
            AND t1.pred_id = 1234
            AND ... )

    • Get 1 declarative SQL query
        - Query optimizer sees 1 query
        - Get all the performance of Oracle SQL Engine
        - compression, indexes, parallelism, etc.
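    The rewrite of a graph pattern into one SQL join over an ID-based triples table and a values table can be tried with Python's built-in sqlite3 module. The table layout, column names and sample data are assumptions made for this sketch (the values table is called `vals` because VALUES is a SQL keyword), and the OPTIONAL and FILTER parts of the slide's query are left out to keep the join readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (value_id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE triples (subj_id INTEGER, pred_id INTEGER, obj_id INTEGER);
""")

# Assign an integer ID to every term, then store triples as ID tuples.
terms = [":alice", ":bob", "foaf:name", "foaf:knows", "Alice", "Bob"]
ids = {term: i for i, term in enumerate(terms, start=1)}
conn.executemany("INSERT INTO vals VALUES (?, ?)", [(i, t) for t, i in ids.items()])
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [(ids[s], ids[p], ids[o]) for s, p, o in [
        (":alice", "foaf:name", "Alice"),
        (":bob", "foaf:name", "Bob"),
        (":alice", "foaf:knows", ":bob"),
    ]],
)

# SPARQL: SELECT ?n1 ?n2 WHERE { ?p foaf:name ?n1 . ?p foaf:knows ?f . ?f foaf:name ?n2 }
rows = conn.execute("""
    SELECT v1.value AS n1, v2.value AS n2
    FROM triples t1, triples t2, triples t3, vals v1, vals v2
    WHERE t1.pred_id = :name AND t2.pred_id = :knows AND t3.pred_id = :name
      AND t2.subj_id = t1.subj_id   -- shared variable ?p
      AND t3.subj_id = t2.obj_id    -- shared variable ?f
      AND v1.value_id = t1.obj_id   -- ?n1 back to its lexical value
      AND v2.value_id = t3.obj_id   -- ?n2 back to its lexical value
""", {"name": ids["foaf:name"], "knows": ids["foaf:knows"]}).fetchall()
print(rows)
```

    Each triple pattern becomes one alias of the triples table and each shared variable becomes a join condition, so the optimizer sees a single ordinary query.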

    SEM_MATCH Table Function Arguments

        SEM_MATCH(query, models, rulebases, options);

    • query: e.g. 'SELECT ?a WHERE { ?a foaf:name ?b }'
    • models: container(s) for asserted quads; basic unit of access control
    • rulebases: built-in (e.g. OWL2RL) and user-defined rulebases; models + rulebases select the entailed triples
    • options: e.g. 'ALLOW_DUP=T STRICT_TERM_COMP=F'

    GovTrack RDF Data

    • RDF/OWL data about activities of US Congress
        • Political Party Membership
        • Voting Records
        • Bill Sponsorship
        • Committee Membership
        • Offices and Terms
    • Source: http://www.govtrack.us/developers/rdf.xpd

    GovTrack in Oracle

    • Semantic Models: GOV_TBOX, GOV_PEOPLE, GOV_BILLS_110, GOV_BILLS_111, GOV_VOTES_07, GOV_VOTES_08, GOV_VOTES_09, GOV_DISTRICTS (US Census)
    • Rulebases: OWL2RL
    • Entailments: GOV_TRACK_OWL (inference)
    • Virtual Models:
        • GOV_ASSERT_VM: asserted data only (2.8M triples)
        • GOV_ALL_VM: asserted + inferred (3.1M triples)

    Virtual Models

    • A virtual model is a logical RDF graph that can be used in a SEM_MATCH query.
        • Result of UNION or UNION ALL of one or more models and optionally the corresponding entailment
    • create_virtual_model (vm_name, models, rulebases)
    • drop_virtual_model (vm_name)
    • SEM_MATCH query accepts a single virtual model
        • No other models or rulebases need to be specified
    • DMLs on virtual models are not supported
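    The difference between the UNION ALL and UNION forms of a virtual model can be shown with a short Python sketch on in-memory triple lists. The `virtual_model` helper and the sample models are hypothetical; in the database the two forms are exposed as views rather than computed by application code.

```python
def virtual_model(models, unique=False):
    """Combine several models into one logical graph."""
    rows = [t for triples in models for t in triples]  # UNION ALL keeps duplicates
    return sorted(set(rows)) if unique else rows       # UNION removes them


# Hypothetical models that share one triple.
model_a = [(":a", ":p", ":b"), (":a", ":q", ":c")]
model_b = [(":a", ":p", ":b")]

with_duplicates = virtual_model([model_a, model_b])
without_duplicates = virtual_model([model_a, model_b], unique=True)
```

    Skipping duplicate elimination is cheaper, which is a plausible reason to offer both forms.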

    Virtual Model Example

    • Creation
        begin
          sem_apis.create_virtual_model('gov_assert_vm',
            sem_models('gov_tbox', 'gov_people', 'gov_votes_07', 'gov_votes_08',
                       'gov_votes_09', 'gov_bills_110', 'gov_bills_111', 'gov_districts'));

          sem_apis.create_virtual_model('gov_all_vm',
            sem_models('gov_tbox', 'gov_people', 'gov_votes_07', 'gov_votes_08',
                       'gov_votes_09', 'gov_bills_110', 'gov_bills_111', 'gov_districts'),
            sem_rulebases('OWL2RL'));
        end;
        /

    • Access Control
        grant select on mdsys.semv_gov_assert_vm to scott;
        grant select on mdsys.semv_gov_all_vm to scott;

    Query Example 1: Basic Query

    Find information about all Kennedys

        select fn, bday, g, t, hp, r
        from table(sem_match(
          'SELECT ?fn ?bday ?g ?t ?hp ?r
           WHERE { ?s vcard:N ?n .
                   ?n vcard:Family "Kennedy" .
                   ?s foaf:name ?fn .
                   ?s vcard:BDAY ?bday .
                   ?s foaf:gender ?g .
                   ?s foaf:title ?t .
                   ?s foaf:homepage ?hp .
                   ?s foaf:religion ?r }',
          sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

  • 7/28/2019 Souri Oracle Semantic Technologies UTAustin

    68/112

    Query Example 2: OPTIONAL Query

    select fn, bday, g, t, hp, r
    from table(sem_match(
      'SELECT ?fn ?bday ?g ?t ?hp ?r WHERE
       { ?s vcard:N ?n .
         ?n vcard:Family "Kennedy" .
         ?s foaf:name ?fn .
         ?s vcard:BDAY ?bday .
         ?s foaf:gender ?g .
         OPTIONAL { ?s foaf:title ?t .
                    ?s foaf:homepage ?hp .
                    ?s foaf:religion ?r }
       }',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Find information about all Kennedys, with title, homepage and religion optional


    Query Example 3: Simple FILTER

    select fname, lname
    from table(sem_match(
      'SELECT ?fname ?lname WHERE
       { ?s rdf:type foaf:Person .
         ?s vcard:N ?vcard .
         ?vcard vcard:Given ?fname .
         ?vcard vcard:Family ?lname
         FILTER (STR(?lname) < "B") }',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Find all people with a last name that starts with A
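The filter above relies on string ordering rather than a prefix test. A small Python sketch can show what a plain `< "B"` comparison admits, assuming code point ordering (the collation Oracle actually applies is not stated in the slide, so treat that as an assumption). The sample names are hypothetical.

```python
# Hypothetical last names, including edge cases that sort before "A".
last_names = ["Adams", "Allen", "Baker", "Kennedy", "abbott", "", "3M"]

# Python compares strings by Unicode code point, which stands in here
# for the ordering used by FILTER (STR(?lname) < "B").
less_than_b = [name for name in last_names if name < "B"]

# A stricter variant that also excludes values sorting before "A".
starts_with_a = [name for name in last_names if "A" <= name < "B"]

print(less_than_b)
print(starts_with_a)
```

Under that assumption, the single-sided comparison also matches an empty string or a name starting with a digit, so it equals "starts with A" only when every last name begins with an uppercase letter.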


    Query Example 4: Negation as Failure

    select fn, bday, hp
    from table(sem_match(
      'SELECT ?fn ?bday ?hp WHERE
       { ?s vcard:N ?n .
         ?n vcard:Family "Lincoln" .
         ?s vcard:BDAY ?bday .
         ?s foaf:name ?fn .
         FILTER (!BOUND(?hp))
         OPTIONAL { ?s foaf:homepage ?hp }
       }',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Find all Lincolns without a homepage
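The OPTIONAL plus !BOUND idiom can be read as a left outer join followed by a filter that keeps only the rows where the optional variable stayed unbound. The Python sketch below illustrates that reading with hypothetical data and helper names (`left_join_homepage`, `is_bound`); it is not how the database evaluates the query.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical solutions of the required part of the pattern.
people: List[Row] = [
    {"s": ":p1", "fn": "A. Lincoln"},
    {"s": ":p2", "fn": "B. Lincoln"},
]
# Hypothetical foaf:homepage triples, keyed by subject.
homepages: Dict[str, str] = {":p1": "http://example.org/p1"}


def left_join_homepage(rows: List[Row], pages: Dict[str, str]) -> List[Row]:
    """OPTIONAL: bind ?hp when a homepage exists, otherwise keep the row as is."""
    joined: List[Row] = []
    for row in rows:
        extended = dict(row)
        if row["s"] in pages:
            extended["hp"] = pages[row["s"]]
        joined.append(extended)
    return joined


def is_bound(row: Row, var: str) -> bool:
    """BOUND(?var): true when the variable has a value in this solution."""
    return var in row


# FILTER (!BOUND(?hp)) keeps only solutions where the OPTIONAL part did not match.
without_homepage = [r for r in left_join_homepage(people, homepages)
                    if not is_bound(r, "hp")]
print([r["fn"] for r in without_homepage])
```

Because a SPARQL FILTER applies to the whole group it appears in, writing it before or after the OPTIONAL block does not change the result.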


    Query Example 6: Inference

    GovTrack Bill Types


    Query Example 6: Inference

    select title, dt, btype
    from table(sem_match(
      'SELECT ?title ?dt ?btype WHERE
       { ?s foaf:name "Barack Obama" .
         ?b bill:sponsor ?s .
         ?b dc:title ?title .
         ?b rdf:type ?btype .
         ?b bill:introduced ?dt
         FILTER("2007-03-28"^^xsd:date


    Oracle Extensions for Text and Spatial


    Full Text Indexing with Oracle Text

    Filters graph patterns based on text search string

    Indexes all RDF Terms
        URIs, Literals, Language Tags, etc.

    Provide SPARQL extension function
        orardf:textContains(?var, "Oracle text search string")

    Search String
        Group Operators: AND, OR, NOT, NEAR, ...
        Term Operators: stem($), soundex(!), wildcard(%)

    SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/text');


    Text Query Example

    select s, title, dt
    from table(sem_match(
      'SELECT ?s ?title ?dt WHERE
       { ?b bill:sponsor ?s .
         ?s foaf:name ?n .
         ?b dc:title ?title .
         ?b bill:introduced ?dt
         FILTER ( orardf:textContains(?title, "$children AND $taxes") )
       }',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Find all bills about Children and Taxes


    Spatial Support with Oracle Spatial

    Support geometries encoded as orageo:WKTLiterals

    :semTech2011 orageo:hasPointGeometry
        "POINT(-122.4192 37.7793)"^^orageo:WKTLiteral .

    Provide library of spatial query functions

    SELECT ?s WHERE {
      ?s orageo:hasPointGeometry ?geom
      FILTER( orageo:withinDistance(?geom,
                "POINT(-122.4192 37.7793)"^^orageo:WKTLiteral,
                "distance=10 unit=KM") )
    }
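To make the intent of a within-distance filter concrete, here is a rough Python sketch for WGS 84 longitude/latitude points. It uses the haversine great-circle formula on a spherical Earth, which is an assumption chosen for brevity and not the geodetic computation Oracle Spatial performs; the function names and the mean Earth radius constant are illustrative only.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def parse_wkt_point(wkt: str) -> Tuple[float, float]:
    """Parse 'POINT(lon lat)' into (lon, lat); only simple points are handled."""
    text = wkt.strip()
    if not (text.upper().startswith("POINT(") and text.endswith(")")):
        raise ValueError("expected a WKT POINT, got: " + wkt)
    lon, lat = (float(value) for value in text[len("POINT("):-1].split())
    return lon, lat


def haversine_km(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Great-circle distance in kilometres between two (lon, lat) points."""
    lon1, lat1 = (math.radians(v) for v in p)
    lon2, lat2 = (math.radians(v) for v in q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(geom_wkt: str, center_wkt: str, distance_km: float) -> bool:
    """Rough analogue of a 'distance=<d> unit=KM' test for point geometries."""
    return haversine_km(parse_wkt_point(geom_wkt),
                        parse_wkt_point(center_wkt)) <= distance_km


center = "POINT(-122.4192 37.7793)"
print(within_distance("POINT(-122.4192 37.8293)", center, 10))  # about 5.6 km north
print(within_distance("POINT(-122.4192 38.7793)", center, 10))  # about 111 km north
```

In the database the same test is answered through the spatial index on the orageo:WKTLiteral datatype instead of a per-row distance computation.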


    orageo:WKTLiteral Datatype

    SRS: WGS84 Longitude, Latitude
        "POINT(-122.4192 37.7793)"^^orageo:WKTLiteral

    SRS: NAD27 Longitude, Latitude
        "POINT(-122.4181 37.7793)"^^orageo:WKTLiteral

    Optional leading Spatial Reference System URI followed by OGC WKT geometry string.

    WGS 84 Longitude, Latitude is the default SRS (assumed if SRS URI is absent)

    SQL> exec sem_apis.add_datatype_index(
           'http://xmlns.oracle.com/rdf/geo/WKTLiteral',
           options=>'TOLERANCE=1.0 SRID=8307 DIMENSIONS=((LONGITUDE,-180,180)(LATITUDE,-90,90))');

    Prepare for spatial querying by creating a spatial index for the orageo:WKTLiteral datatype


    What Types of Spatial Data are Supported?

    Spatial Reference Systems
        Built-in support for 1000s of SRS
        Plus you can define your own
        Coordinate system transformations applied transparently during indexing and query

    Geometry Types
        Support OGC Simple Features geometry types
            Point, Line, Polygon
            Multi-Point, Multi-Line, Multi-Polygon
            Geometry Collection
        Up to 500,000 vertices per Geometry


    GovTrack Spatial Demo

    Congressional District Polygons (435)
        Complex Geometries
        Average over 1000 vertices per geometry

    Load .shp file from US Census into Oracle Spatial
    Generate triples using sdo_util.to_wktgeometry()
    Load into Oracle semantic model


    Spatial Query 1

    select name, cdist
    from table(sem_match(
      'SELECT ?name ?cdist WHERE
       { ?person usgovt:name ?name .
         ?person pol:hasRole ?role .
         ?role pol:forOffice ?office .
         ?office pol:represents ?cdist .
         ?cdist orageo:hasWKTGeometry ?cgeom
         FILTER ( orageo:relate(?cgeom,
                    "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral,
                    "mask=contains") ) }',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Which congressional district contains Nashua, NH?


    Spatial Query 2

    select name, cdist
    from table(sem_match(
      'SELECT ?name ?cdist WHERE
       { ?person usgovt:name ?name .
         ?person pol:hasRole ?role .
         ?role pol:forOffice ?office .
         ?office pol:represents ?cdist .
         ?cdist orageo:hasWKTGeometry ?cgeom
         FILTER ( orageo:nearestNeighbor(?cgeom,
                    "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral,
                    "sdo_num_res=10") ) }
       ORDER BY ASC( orageo:distance( orageo:centroid(?cgeom),
                    "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral,
                    "unit=KM") )',
      sem_models('gov_all_vm'), null, null, null,
      null, ' ALLOW_DUP=T '));

    Who are my nearest 10 representatives, ordered by center point?


    SPARQL Querying

    Jena Adapter for Oracle Database 11g Release 2

    Jena Adapter for Oracle Database 11g Release 2

    Implements Jena Semantic Web Framework APIs
        Popular Java APIs for semantic web based applications
        Adapter adds Oracle-specific extensions

    Jena Adapter provides three core features:
        Java API for Oracle RDF Store
        SPARQL Endpoint for Oracle with SPARQL 1.1 support
        Oracle-specific extensions for query execution control and management

    Jena Adapter as a Java API for Oracle RDF

    "Proxy"-like design
        Data not cached in memory for scalability
        SPARQL query converted into SQL and executed inside DB
        Various optimizations to minimize the number of Oracle queries generated given a SPARQL 1.1 query

    Various data loading methods
        Bulk/Batch/Incremental load of RDF or OWL (in N3, RDF/XML, N-TRIPLE etc.) with strict syntax verification and long literal support

    Allows integration of Oracle Database 11g RDF/OWL with various tools
        TopBraid Composer
        External OWL DL reasoners (e.g., Pellet)

    http://www.oracle.com/technology/tech/semantic_technologies/documentation/jenaadapter2_readme.pdf

    Jena Adapter Feature: SPARQL Endpoint

    SPARQL service endpoint supporting full SPARQL Protocol
        Integrated with Jena/Joseki 3.4.0 (deployed in WLS 10.3 or Tomcat 6)
        Uses J2EE data source for DB connection specification
        SPARQL 1.1 and Update (SPARUL) supported

    Oracle-specific declarative configuration options in Joseki
        Each URI endpoint is mapped to a Joseki service:

        rdf:type joseki:Service ;
        rdfs:label "SPARQL with Oracle Semantic Data Management" ;
        joseki:serviceRef "GOV_ALL_VM" ;    # web.xml must route this name to Joseki
        joseki:dataset ;                    # dataset part
        joseki:processor joseki:ProcessorSPARQL_FixedDS ;

    SPARQL Endpoint: Example

    Example Joseki Dataset configuration:

        rdf:type oracle:Dataset ;
        joseki:poolSize 4 ;    # Number of concurrent connections allowed to this dataset.
        oracle:connection [ a oracle:OracleConnection ; ] ;
        oracle:defaultModel [
            oracle:firstModel "GOV_PEOPLE" ;
            oracle:modelName "GOV_TBOX" ;
            oracle:modelName "GOV_VOTES_07" ;
            oracle:rulebaseName "OWLPRIME" ;
            oracle:useVM "TRUE" ] ;
        oracle:namedModel [
            oracle:namedModelURI ;
            oracle:firstModel "GOV_VOTES_07" ] .

    Query Extensions in Jena Adapter

    Query management and execution control
        Timeout
        Query abort framework
            Including monitoring threads and a management servlet
            Designed for a J2EE cluster environment
        Hints allowed in SPARQL query syntax
        Parallel execution

    Support ARQ functions for projected variables
        fn:lower-case, upper-case, substring, ...

    Native, system provided functions can be used in SPARQL
        oext:lower-literal, oext:upper-literal, oext:build-uri-for-id, ...

    Query Extensions in Jena Adapter

    Extensible user-defined functions in SPARQL

    Example
        PREFIX ouext:
        SELECT ?subject ?object (ouext:my_strlen(?object) as ?obj1)
        WHERE { ?subject dc:title ?object }

    User can implement the my_strlen function in Oracle Database

    Connection Pooling through OraclePool

        java.util.Properties prop = new java.util.Properties();
        prop.setProperty("InitialLimit", "2");          // create 2 connections
        prop.setProperty("InactivityTimeout", "1800");  // seconds
        ...
        OraclePool op = new OraclePool(szJdbcURL, szUser, szPasswd, prop,
                                       "OracleSemConnPool");
        Oracle oracle = op.getOracle();


    Semantic Indexing

    for Unstructured Content

    Overview: Creating and Using a Semantic Index

    Newsfeed table

    Rowid  docId  Article                                           p_date
    r1     1      Indiana authorities filed felony charges and a    02/01/11
                  court issued an arrest warrant for a financial
                  manager who apparently tried to fake his death
                  by crashing his airplane in a Florida swamp.
                  Marcus Schrenker, 38 ...
    r2     2      Major dealers and investors ...                   11/30/10
    ..     ..

    SemContext index on Article column (LOCAL [1], PARALLEL 4), populated by the extractor and auto maintained like a B-tree index

    CREATE INDEX ArticleIndex
    ON Newsfeed (Article)
    INDEXTYPE IS SemContext PARAMETERS ('my_policy')

    Triples table with rowid references

    Subject   Property  Object         Graph
    p:Marcus  rdf:type  rc:Person
    p:Marcus  :hasName  "Marcus"^^..
    p:Marcus  :hasAge   "38"^^xsd:..

    Analytical Queries On Graph Data

    SELECT Sem_Contains_Select(1) FROM Newsfeed
    WHERE Sem_Contains (Article,
        '{ ?x rdf:type rc:Person .
           ?x :hasAge ?age .
           FILTER(?age >= 35) }', 1) = 1
      AND p_date > to_date('01-Jan-11')

    [1] LOCAL index support for semantic indexing is restricted to range-partitioned base tables only.

    Combining Ontologies with extracted triples


    The triples extracted from each document can be augmented with
        Entailment created by combining the triples with schema ontologies and rulebase(s)
        Knowledge bases (stored as RDF models) and their entailments

    Augmentation achieved using dependent policies

    begin
      sem_rdfctx.create_policy (
          policy_name      => 'my_policy_plus_geo',
          base_policy      => 'my_policy',
          user_models      => SEM_MODELS('USGeography'),
          user_entailments => SEM_MODELS('Doc_inf', 'USGeography_inf'));
    end;

    SELECT docId FROM Newsfeed
    WHERE SEM_CONTAINS (Articles,
        '{ ?comp rdf:type c:Company .
           ?comp p:categoryName c:BusinessFinance .
           ?comp p:location ?city .
           ?city geo:state "NY"^^xsd:string }',
        'my_policy_plus_geo') = 1

    Will result in a multi-model query involving: the RDF model for my_policy index, the RDF model USGeography, and the entailments.

    Sources combined by the query: extracted triples, (local) entailments from extracted triples, knowledge bases (KBs), and entailments from KBs

    Inference: document-centric


    Base table

    id  document
    1   John is a parent. He grew up in NYC.
    2   John is a man.

    Semantic Index: extracted triples (columns: Subject, Property, Object, Graph; rdf:type triples)

    Ontology: schema triples for extracted data (rdfs:subClassOf, rdfs:subPropertyOf)

    Entailment: set of inferred triples (columns: Subject, Property, Object, Graph; rdf:type triples)

    Abstract Extractor Type

    An abstract extractor type, rdfctx_extractor (in PL/SQL), defines the common interfaces for all extractor implementations.

    Specific implementations for the abstract type interact with individual third-party extractors and produce RDF/XML documents for the input document.

    create or replace type rdfctx_extractor authid current_user as object (
      member function extractRdf (document CLOB,
                                  docId VARCHAR2,
                                  params VARCHAR2,
                                  options VARCHAR2 default NULL) return CLOB,

      member function batchExtractRdf (docCursor SYS_REFCURSOR,
                                       extracted_info_table VARCHAR2,
                                       params VARCHAR2,
                                       partition_name VARCHAR2 default NULL,
                                       docId VARCHAR2 default NULL,
                                       preferences SYS.XMLType default NULL,
                                       options VARCHAR2 default NULL) return CLOB
    ) not instantiable not final
    /

    A sample extractor type -- interface


    create or replace type rdfctxu.info_extractor under rdfctx_extractor (
      overriding member function getDescription return VARCHAR2,

      overriding member function rdfReturnType return VARCHAR2,

      overriding member function getContext(attribute VARCHAR2) return VARCHAR2,

      overriding member function extractRDF(document CLOB,
                                            docId VARCHAR2,
                                            params VARCHAR2) return CLOB,

      overriding member function batchExtractRdf(docCursor SYS_REFCURSOR,
                                                 extracted_info_table VARCHAR2,
                                                 params VARCHAR2,
                                                 partition_name VARCHAR2 default NULL) return CLOB
    )


    Enterprise Security for Semantic Data

    Enterprise Security for Semantic Data

    Model-level access control
        Each semantic model accessible through a view (RDFM_modelName)
        Grant/revoke privileges on the view
        Discretionary access control on application table for model

    Finer granularity possible through Oracle Label Security
        Triple level security
        Mandatory Access Control

    Oracle Label Security

    Oracle Label Security: Mandatory Access Control
        Data records and users tagged with security labels
        Labels determine the sensitivity of the data or the rights a person must possess in order to read or write the data.
        User labels indicate their access rights to the data records.
        For reads/deletes/updates: user's label must dominate row's label
        For inserts: user's label applied to inserted row
        A Security Administrator assigns labels to users

    ContractID  Organization  ContractValue  Label
    ProjectHLS  N. America    1000000        SE:HLS:US

    OLS Data Classification


    Label Components:
        Levels: Determine the vertical sensitivity of data and the highest classification level a user can access.
        Compartments: Facilitate compartmentalization of data. Users need exclusive membership to a compartment to access its data.
        Groups: Allow hierarchical categorization of data. A user with authorization to a parent group can access data in any of its child groups.

    CONF : NAVY, MILITARY : NY, DC
    HIGHCONF : MILITARY, NAVY, SPCLOPS : US, UK

    Row Label matches User Access Label

    RDF Triple-level Security with OLS


    Sensitivity labels associated with individual triples control read access to the triples.

    Triples describing a single resource may employ different sensitivity labels for greater control.

    Triples table

    Subject     Predicate      Object     RowLabel
    projectHLS  Organization   N.America  SE:HLS:US
    projectHLS  ContractValue  1000000    SE:HLS:FIN,US

    Securing RDF Data using OLS: Example (1)


    Create an OLS policy
        Policy is the container for all the labels and user authorizations
        Can create multiple policies containing different labels

    Create label components
        Levels: UN (unclassified) < SE (secret) < TS (top secret)
        Compartments: HLS (Homeland Security), CIA, FBI
        Groups: US is the parent of EASTUS (NY, DC) and WESTUS (SD, SF)

    Create labels
        EASTSE = SE:CIA,HLS:EASTUS
        USUN = UN:FBI,HLS:US

    Securing RDF Data using OLS: Example (2)

    Assign labels to users

    John: EASTSE (SE:CIA,HLS:EASTUS)
        John can read SE and UN triples
        John can read triples for CIA and HLS
        John can read triples for NY, DC, and EASTUS
        When inserting a row, the default write label is EASTSE

    Mary: USUN (UN:FBI,HLS:US)
        Mary can only read UN triples
        Mary can read triples for FBI and HLS
        Mary can read all group triples (e.g. SF, NY, WESTUS, etc.)
        When inserting a row, the default write label is USUN

    Securing RDF Data using OLS: Example (3)

    Apply the OLS policy to RDF store
        Triple inserts, deletes, updates, and reads will use the policy

    John inserts triple:

    Mary inserts triple:

    Both these triples inserted in model but tagged with different label values (EASTSE, USUN)

    Users can have multiple labels
        Only one label active at any time (user can switch labels)
        Only active label applied to operations (e.g. queries, deletes, inferred triples)

    Securing RDF Data using OLS: Example (4)


    Example labels and read access

    John: EASTSE (SE:CIA,HLS:EASTUS)
    Mary: USUN (UN:FBI,HLS:US)

    John Read  Triple Label    Mary Read
    No         TS:HLS:DC       No
    No         SE:HLS,FBI:DC   No
    Yes        UN:HLS:DC       Yes
    Yes        UN:HLS,CIA:NY   No
    No         SE:CIA:SF       No
    No         UN:HLS,FBI:NY   Yes
    No         UN:HLS:SF       Yes
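The read-access outcomes listed for John and Mary follow from three checks: level, compartments, and groups. The Python sketch below encodes those checks as described in these slides and replays the example labels. It is a simplified model for illustration only: the helper names are hypothetical, the group hierarchy is the one assumed from Example (1), and real Oracle Label Security has further rules (write labels, privileges) that are not modelled here.

```python
from typing import Optional, Set, Tuple

LEVELS = {"UN": 0, "SE": 1, "TS": 2}          # UN < SE < TS
PARENT = {"NY": "EASTUS", "DC": "EASTUS",     # child group -> parent group
          "SD": "WESTUS", "SF": "WESTUS",
          "EASTUS": "US", "WESTUS": "US"}


def split_items(text: str) -> Set[str]:
    """Split a comma-separated label component into a set of names."""
    return {item.strip() for item in text.split(",") if item.strip()}


def parse_label(label: str) -> Tuple[str, Set[str], Set[str]]:
    """Parse 'LEVEL:COMP1,COMP2:GROUP1,GROUP2' into its three components."""
    level, compartments, groups = label.split(":")
    return level.strip(), split_items(compartments), split_items(groups)


def covers(user_group: str, row_group: str) -> bool:
    """True if user_group equals row_group or is one of its ancestors."""
    current: Optional[str] = row_group
    while current is not None:
        if current == user_group:
            return True
        current = PARENT.get(current)
    return False


def can_read(user_label: str, row_label: str) -> bool:
    """Simplified dominance test for read access."""
    u_level, u_comps, u_groups = parse_label(user_label)
    r_level, r_comps, r_groups = parse_label(row_label)
    if LEVELS[u_level] < LEVELS[r_level]:
        return False                      # user's level is too low
    if not r_comps <= u_comps:
        return False                      # user lacks a required compartment
    if r_groups and not any(covers(u, r) for u in u_groups for r in r_groups):
        return False                      # no user group covers a row group
    return True


JOHN = "SE:CIA,HLS:EASTUS"   # EASTSE
MARY = "UN:FBI,HLS:US"       # USUN

for row in ["TS:HLS:DC", "SE:HLS,FBI:DC", "UN:HLS:DC", "UN:HLS,CIA:NY",
            "SE:CIA:SF", "UN:HLS,FBI:NY", "UN:HLS:SF"]:
    print(row, can_read(JOHN, row), can_read(MARY, row))
```

Each printed row can be compared against the Yes/No entries for John and Mary in the example.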

    Securing RDF Data using OLS: Example (5)


    Same triple may exist with different labels:
        UN:HLS:DC
        SE:HLS:DC

    When Mary queries, only 1 triple returned (UN triple)
    When John queries, both UN and SE triples are returned
        No way to distinguish since we don't return label information!

    Solution: use FILTER_LABEL option in SEM_MATCH
        This query will filter out triples that are dominated by SE:

    SELECT s, p, y
    FROM table(sem_match('{?s ?p ?y}', sem_models('TEST'),
        null, null, null, null, 'FILTER_LABEL=SE POLICY_NAME=DEFENSE'));

    MIN_LABEL can be used to filter out untrustworthy data

    For More Information

    See documentation at: http://docs.oracle.com/cd/E11882_01/appdev.112/e11828/toc.htm

    or search the web for: Oracle RDF

    or oracle.com