invited iceis tanca orsi

Ontology-driven, context-aware query distribution for on-the-fly data integration. Letizia Tanca and Giorgio Orsi. The Context-ADDICT project

Upload: giorgio-orsi

Post on 22-Nov-2014


TRANSCRIPT

  • 1. The Context-ADDICT project: ontology-driven, context-aware query distribution for on-the-fly data integration. Letizia Tanca and Giorgio Orsi.
  • 2. Data Integration: State of the art. The Context-ADDICT project, Dipartimento di Elettronica e Informazione
  • 3. The future
  • 4. Overview. An ontology-driven solution for dynamic data integration, within a scenario where: data sources are not known a priori; user queries are dealt with in a context-aware fashion; information fruition is fostered by handing information to the user in a semantics-aware, integrated fashion, by eliminating non-interesting information (thus reducing the information noise), and by controlling the problem's dimension via context-based reduction of the current information space. We propose a description logic (DL) language, CA-DL, which can uniformly represent the application domain and the context. Queries are issued to the system in SPARQL and translated into CA-DL for internal processing.
  • 5. Context-ADDICT (joint work with C. Bolchini, E. Quintarelli and F. A. Schreiber). Features: context-aware data/ontology tailoring [5]; ontology-driven, on-the-fly data integration of heterogeneous and dynamic data sources; multimodal access to resources; focus on small and mobile devices (sensors, mobile phones, custom embedded systems). Applications: urban mobility; automotive, e-Health; logistics; energy production automation; automated and personalized advertisement; personal information systems.
  • 6. Context-ADDICT: context-aware integration of the overall information collected from the data sources [MDM06]. On-the-fly data integration + data reduction via tailoring.
  • 7. Modeling context: the CDT. An orthogonal context model, which can be adopted for any application (data tailoring, application and service adaptivity and fine-tuning, sensor queries). Single contexts are defined as subtrees of a Context Tree, representing the contexts currently envisaged for that particular application. Fine-grained and semantics-based.
  • 8. Domain Ontology. The domain ontology: makes up for the absence of a DB global schema; is shared and commonly agreed upon; must be decidable and efficiently computable (hence CA-DL).
  • 9. Data Sources: Semantic Extraction. Data Source Ontology. Semantic Extraction: semantic ontology + structural ontology. Models structural/semantic independence (the different models can be used separately).
  • 10. [Figure labels: CDT, domain ontology, source ontologies]
  • 11. Relevant areas, or projections. A projection: is the set of relevant data for a given user in a given context; is projected from the application domain ontology (ADO) to the data sources; is context-aware; is possibly materialized on the user device.
  • 12. Our problem
  • 13. A closer look
  • 14. CA-DL. CA-DL is used to create mappings between data sources and application domain ontologies and to represent the application context. CA-DL corresponds to a strict subset of OWL 2, tailored to be rewritable from/to SPARQL syntax and to express both global-as-view (GAV) and local-as-view (LAV) mappings. A SPARQL query is issued to the system, and is then: translated into CA-DL; transformed by adapting it to the current user context; handed over to the query-rewriting algorithm(s), which distribute it to the suitable data sources (i.e. when alternative data sources are available); translated into the data-source language(s) by means of automatically generated wrappers.
  • 15. In CA-DL. No unions: this keeps the complexity of the rewriting process within PTIME and only allows LAV mappings which involve intersections of concepts; in a CA-DIS the queries are highly heterogeneous and the mappings are often computed on the fly. No universal quantification: GAV mappings rewrite the complex mapping into SPARQL syntax, where it is currently not possible to express general universal restrictions. Only a special form of universal restriction is allowed: property range definitions of the form ⊤ ⊑ ∀R.N, where the concept N is the range of the property R.
  • 16. The CDT for the insurance company application
  • 17. The CDT ontology
  • 18. The application domain ontology
  • 19. A context and its relevant area
  • 20. The application domain ontology. [Figure: ontology graph. Node labels: manufacturer, policy, vehicle, customer, receipt, man, woman, motorcycle, car, driver, risk, payment, claim, high, mid, low. Edge labels: haspolicy, expectsreceipt, hasBrand, Mname, hasName, hasclaim, envisages, hasriskclass, Haspayment, drives. Caption: Relevant area for context c1]
  • 21. The data sources and their semantic ontologies. DS1: Customer(id, name, ownesMotorbikePlateNumber); Motorbike(motorbikePlateNumber, manufacturer, model).
  • 22. The data sources and their semantic ontologies. DS2: Client(id, fullName, riskClass, gender); RiskClass(id, description).
  • 23. The mapping ontology
  • 24. Context-aware queries for context c1. Query q(x,w) ← Customer(x), drives(x, y), hasBrand(y, z), hasMname(z, w): this query correctly retrieves all the customers who drive a car, together with the manufacturers' names, since the requested concepts and roles are included in the relevant area for context c1. Query q(x,y) ← Customer(x), hasName(x, y): this query correctly retrieves all the customers with their names, since the requested concept and property are included in the relevant area for context c1. Query q(x,z) ← Customer(x), hasPolicy(x, y), envisages(y, z): the answer to this query is empty in context c1, since its relevant area does not include the roles hasPolicy and envisages.
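The behaviour of the three queries on slide 24 can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the relevant area for c1 is reduced to the concept and roles that the slide explicitly names as included, and the `Atom` class, the dictionary layout and the `in_relevant_area` function are hypothetical names, not part of Context-ADDICT.

```python
from dataclasses import dataclass

# Relevant area for context c1, restricted to the terms slide 24 names as
# included. The dictionary layout is an assumption made for illustration.
RELEVANT_AREA_C1 = {
    "concepts": {"Customer"},
    "roles": {"drives", "hasBrand", "hasMname", "hasName"},
}


@dataclass(frozen=True)
class Atom:
    """One atom of a conjunctive query: a predicate name and its variables."""
    predicate: str
    variables: tuple


def in_relevant_area(query, relevant_area):
    """Return True if every concept (unary atom) and every role or property
    (binary atom) of the query belongs to the relevant area."""
    for atom in query:
        kind = "concepts" if len(atom.variables) == 1 else "roles"
        if atom.predicate not in relevant_area[kind]:
            return False
    return True


# The three queries of slide 24, written as lists of atoms.
q1 = [Atom("Customer", ("x",)), Atom("drives", ("x", "y")),
      Atom("hasBrand", ("y", "z")), Atom("hasMname", ("z", "w"))]
q2 = [Atom("Customer", ("x",)), Atom("hasName", ("x", "y"))]
q3 = [Atom("Customer", ("x",)), Atom("hasPolicy", ("x", "y")),
      Atom("envisages", ("y", "z"))]

for name, query in (("q1", q1), ("q2", q2), ("q3", q3)):
    status = "answerable" if in_relevant_area(query, RELEVANT_AREA_C1) else "empty answer"
    print(f"{name}: {status} in context c1")
```

In this sketch a query outside the relevant area is simply reported as having an empty answer, which mirrors the third case on the slide; how the actual system prunes such queries is not described in the transcript.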
  • 25. Context-aware queries: context c1. Query q(x,y) ← Customer(x), hasName(x,y). The query is distributed to the data sources DS1 and DS2, after a reasoning step, through the mapping ontology. The concept DS1:Customer is mapped (via LAV mappings) to an anonymous concept of the domain ontology containing women who drive motorbikes. The data property ado:hasName is mapped to the data property DS1:name. The concept ado:Customer is mapped (via GAV mapping) to [...] and to an anonymous concept containing the DS2:Client individuals who have male gender and high risk class. The data property ado:hasName is mapped to the data property DS2:fullName.
  • 26. The data sources and their semantic ontologies. DS1: Customer(id, name, ownesMotorbikePlateNumber); Motorbike(motorbikePlateNumber, manufacturer, model). Query sent to DS1: SELECT id, name FROM Customer. Note: the customers here are only women. DS2: Client(id, fullName, riskClass, gender); RiskClass(id, description). Query sent to DS2: SELECT Client.id, Client.fullName FROM Client, RiskClass WHERE Client.riskClass = RiskClass.id AND RiskClass.description = 'high' AND Client.gender = 'male'.
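A rough way to see the distribution step of slides 25 and 26 at work is to run the two source queries and merge their answers. The Python sketch below does this with the standard `sqlite3` module. Several things are assumptions for illustration only: both sources live in one in-memory database, the sample rows are invented, the DS2 query uses qualified column names and assumes that `RiskClass.description` holds the label 'high', and the function names `build_sources` and `answer_has_name` are hypothetical.

```python
import sqlite3

# Source queries for q(x,y) <- Customer(x), hasName(x,y), following slide 26.
DS1_QUERY = "SELECT id, name FROM Customer"
DS2_QUERY = (
    "SELECT Client.id, Client.fullName "
    "FROM Client, RiskClass "
    "WHERE Client.riskClass = RiskClass.id "
    "AND RiskClass.description = 'high' "
    "AND Client.gender = 'male'"
)


def build_sources():
    """Create the DS1 and DS2 tables in one in-memory database.
    All rows are invented sample data, not taken from the slides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customer (id INTEGER, name TEXT, ownesMotorbikePlateNumber TEXT);
        CREATE TABLE Motorbike (motorbikePlateNumber TEXT, manufacturer TEXT, model TEXT);
        CREATE TABLE Client (id INTEGER, fullName TEXT, riskClass INTEGER, gender TEXT);
        CREATE TABLE RiskClass (id INTEGER, description TEXT);
        INSERT INTO Customer VALUES (1, 'Anna', 'MB-001');
        INSERT INTO Motorbike VALUES ('MB-001', 'ExampleMaker', 'ExampleModel');
        INSERT INTO RiskClass VALUES (1, 'high');
        INSERT INTO RiskClass VALUES (2, 'low');
        INSERT INTO Client VALUES (10, 'Marco', 1, 'male');
        INSERT INTO Client VALUES (11, 'Luca', 2, 'male');
        INSERT INTO Client VALUES (12, 'Sara', 1, 'female');
        """
    )
    return conn


def answer_has_name(conn):
    """Send the rewritten query to both sources and merge the answers."""
    rows = conn.execute(DS1_QUERY).fetchall()
    rows += conn.execute(DS2_QUERY).fetchall()
    return sorted(rows)


if __name__ == "__main__":
    connection = build_sources()
    for customer_id, customer_name in answer_has_name(connection):
        print(customer_id, customer_name)
```

With the invented rows, only the DS1 customer and the one male, high-risk DS2 client are returned; the other two DS2 clients are filtered out by the conditions that encode the GAV mapping.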
  • 27. Conclusions and future work. An ontology-driven solution for dynamic data integration, where: data sources are not known a priori; user queries are dealt with in a context-aware fashion. The future: performance evaluation, in terms of recall/precision and efficiency; usage of the same framework in an Internet of Things scenario.
  • 28. Some references
  • 29. CA-DL axioms