recap @ science on the semantic web, rutgers, october 2002

29
Invitational Workshop on Database and Information Systems Research For Semantic Web and Enterprises Amit Sheth & Robert Meersman NSF Information & Data Management PI’s Workshop Amit Sheth & Isabel Cruz cap @ Science on the Semantic Web, Rutgers, October 2002

Upload: lolita

Post on 17-Jan-2016

29 views

Category:

Documents


0 download

DESCRIPTION

Recap @ Science on the Semantic Web, Rutgers, October 2002. Invitational Workshop on Database and Information Systems Research For Semantic Web and Enterprises Amit Sheth & Robert Meersman NSF Information & Data Management PI’s Workshop Amit Sheth & Isabel Cruz. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Invitational Workshop on Database and Information

Systems Research For Semantic Web and Enterprises

Amit Sheth & Robert Meersman

NSF Information & Data Management PI’s Workshop

Amit Sheth & Isabel Cruz

Recap @ Science on the Semantic Web, Rutgers, October 2002

Page 2: Recap @ Science on the Semantic Web,  Rutgers, October 2002

“Ask not what the Semantic Web Can do for you, ask what you can do

for the Semantic Web”

Hans-Georg Stork, European Union

http://lsdis.cs.uga.edu/SemNSF

Page 3: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Context for Amicalola workshop

• Series of Workshops and upcoming conferences: Lisbon (9/00), Hong Kong (5/01), Palo Alto (7/01), Amsterdam (12/01); since then WWW2002/ISWC – Observation: visible lack of DB/IS

involvement• “Semantic Web – The Road Ahead,”

[Decker, Hans-Georg Stork, Sheth, … SemWeb’2001 at

WWW10, Hongkong, May 1, 2001. ]• Semantic Web: Rehash or Research Goldmine

[Fensel, Mylopoulous, Meersman, Sheth (Chair), CooPIS’01]

• At Castel Pergine, Italy

Page 4: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Semantics & IDM – Brief History (partial)

• Semantic Data ModelingM. Hammer and D. McLeod: "The Semantic Data Model: A Modelling Machanism for Data Base Applications"; Proc.. ACM SIGMOD, 1978.

• Conceptual ModelingMichael Brodie, John Mylopoulos, and Joachim W. Schmidt. On Conceptual Modeling. Springer Verlag, New York, NY, 1984. A series of preceding workshops.

• Data Semantic: What, Where and How?- "Database Semantics", R.A. Meersman and T.B. Steel (eds), Proceedings of the IFIP DS-1 Conference, North-Holland (1985).- So Far (Schematically) yet So Near (Semantically) –Sheth, Keynote at DS-5- Meersman, Navathe, Rosenthal, Sheth (Chair); IFIP DS-6 Panel

• Semantic Interoperability on Web many projects in 90s – 1994 CIKM paper on Semantic Information Brokering talked about

query processing in a multi-ontology environment• Domain Modeling, Metadata, Context, Ontologies,

Semantic Interoperability, Semantics in Schema Integration, Semantic Information Brokering, Spatio-temporal-geographic- image-video-multimodal semantics

• All these involving Semantics, Databases, IS and even Web – before “Semantic Web” term is coined

Page 5: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Challenges – unique role of IDM

SCALE and PERFORMANCEAcceptable computation (query/analysis) time

when you have millions and billions of instances (documents, digital content) and metadata (annotation)

• locking for sharing/storage management• Semantic similarity, mappings,

interoperability (schema transformation/integration aka ontology mismatch)

• indexing for expediting computations• workflow for Web Services-based processes

Page 6: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Organization/Output

• 20+ senior researchers/practitioners• 2.5 days in Georgia Mountains• Proceedings of position papers (also talks)• Three workgroups: Application Pull (Brodie/Dayal),

Ontology (Decker/Kashyap) and Web Services (Fensel/Singh)

• <SWIS WG at IDM PI’s meeting>• Review at OntoWeb3 Panel• Final Report • SIGMOD Record special issue December 2002Every thing is at lsdis.cs.uga.edu/SemNSF/

Page 7: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Participants

Karl Aberer, LSIR, EPFL, SwitzerlandMike Brodie, VerizonIsabel Cruz, The University of Illinois at ChicagoUmeshwar Dayal, Hewlett-Packard LabsStefan Decker, Stanford UniversityMax Egenhofer, University of MaineDieter Fensel,  Vrije Universiteit AmsterdamWilliam Grosky,University of Michigan-DearbornMichael Huhns,  University of South CarolinaRamesh Jain, UC-San Diego, and PrajaYahiko Kambayashi, Kyoto UniversityVipul Kashyap, National Library of MedicineLing Liu, Georgia Institute of TechnologyFrank Manola, The MITRE CorporationRobert Meersman, Vrije Universiteit Brussel (VUB)Amit Sheth, University of Georgia and VoquetteMunindar Singh, North Carolina State UniversityGeorge Stork, EURudi Studer, AIFB Universität KarlsruheBhavani Thuraisingham, NSF-CISE-IISMichael Uschold, The Boeing Company

Page 8: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Medical metaphor

• Ontologies: anatomy

• Processes: physiology

• Applications: pathology

Page 9: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Application Pull …Agenda

• Premises– Every resource meaningfully available– Current & Planned Web Services– Beneficiaries and Requirements

• Potential Semantic Services– B2B, C2C, Intra-Enterprise– Example Semantic Web Services

• Challenges / Questions / Concepts• What the Semantic Web Will Look Like

Page 10: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Application Pull …Scenarios

• Scenarios– Tax preparation (Individual)– Supply Chain (B2B)– Scientific Research

• Semantics will be added at three different levels in successive phases– Information– Transactions– Collaborations

Page 11: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Application Pull …Benefits / Requirements

• Lowering barriers to entry– Costs– Entrants

• Consumers• Service providers

• Dynamic– Ability to adjust to rapidly

changing circumstances• Continuous

– Continuous activity (i.e., taxes, financial activity) monitoring

– Event Detection– Do taxes anytime,

anywhere• X-Internet

– Executable– Extended

• Improved– Transparency– Timeliness– Accuracy– Optimization– Eliminate mundane tasks

• Additional services• Reliability and trust• Archiving

– Data– Meta-data– Transaction histories

Page 12: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Application Pull …Challenges

• Upper ontologies– Entities

• Personal• Organizations

– Activities / Events– Processes

• Ontologies– Products– Services– Financial contracts– Business objects– Tax laws (all agencies)– Financial activities – Service providers– Financial planning– Supply chain processes

– Activities (to be monitored)

• Ontology activities– Search– Select– Create, refine– Maintain, version

• Local• Shared• Global

– Mapping• Ontology-based activities

– Accountability• Arbitration• Trust• Tracing

• Engineering– Managing ontologies and

mappings– Scalability, robustness,

Page 13: Recap @ Science on the Semantic Web,  Rutgers, October 2002

OntologyLearning

Ontology Search

MaintenanceVersioning

Compare/Similarity

Deployment(e.g., Hypothesis Generation, Query)

Merge/Refine/Assemble

Requirements/Analysis

Creation/Change

Evaluation

ConsistencyChecking

Page 14: Recap @ Science on the Semantic Web,  Rutgers, October 2002

DB Research in the Ontology LifeCycle

• Operations to compare Models/Ontologies

• Scalability/Storage Indexing of Ontologies– DB approaches data model specific– Need to support graph based data

models• Temporal Query Languages

Lots of work in Schema Integration/translation

Page 15: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Ontology WG: DB Research in the Ontology LifeCycle II

• Schema Mapping– Meta Model specific– Representation of exceptions, e.g.,

tweety– Specification of Inexact Schema

Correspondences• E.g., 40% of animals are 30% of humans

• Meta Model Transformations/Mappings (e.g., UML to RDF Schema)

Page 16: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Ontology WG: DB Research in the Ontology LifeCycle III

• Ontology Versioning– Collaborative editing– Meta Model specific versioning– Version of Schema/Meta Model

Transformations

Page 17: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Ontology WG: DB Research & Semantic Interoperation

• Inference v/s Query Rewriting/Processing for Semantic Integration:

• E.g., RichPerson = (AND Person (> Salary 100))

• Can Query Processing/Concept Rewriting provide the same functionality as inferences ? More efficiently ?

• Distributed Inferences and Loss of Information

• Query Languages for combining metadata and data queries

• Graph-based data models and query languages

• Schema Correspondences/Mappings

•Intensional Answers (Answers are descriptions, e.g. (AND Person (> Salary 100)) instead of a list of all rich people)

• Semantic Associations (identification of meaningful relationships between different documents and entities)

Semantic Index

Page 18: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Semantic WS Scope

All html People

Program Amazon Hard code

Std currency.com Self-described

Worth pursuingFormally self-described

Page 19: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Mike’s Humor

• Services vs. Ontologies“Well done is better than well said.”Ben Franklin

Page 20: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Research Issues

• Environment

• Representation

• Programming• Interaction (system)• Architecture• Utilities

• Scalable, openness, autonomy, heterogeneity, evolving

• Self-description, conversation, contracts, commitments, QoS

• Compose & customize, workflow, negotiation

• Trust, security, compliance

• P2P, privacy,• Discovery, binding, trust-

service

Page 21: Recap @ Science on the Semantic Web,  Rutgers, October 2002

SWS – Fitting in and expanding IS/DB/DM: Or why Bhavani & George should care?

Data => services, similar yet more challenging:– Modeling <functional and operational>– Organizing collections– Discovery and comparison (reputation)– Distribution and replication– Access and fuse (composition)– Fulfillment

• Contracts, coordination versus transactions• Quality: more general than correctness or precision• Compliance

– Dynamic, flexible information security and trust.

Page 22: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Research Issues

• Conversational (state-based, event-based, history-based)• Interoperability of conversational services – compose,

translate, • Representations for services: programmatic self-

description• Commitments, contracts, negotiation, compliance,

cooperation• Discovery, location, binding• Transactional workflow: rollback, roll-forward, semantic

exception handling, recovery• Trustworthy service (discovery, provisioning,

composition, description)• Security; privacy vs. personalization• Quality-of-Service, w.r.t. various aspects, negotiable

Page 23: Recap @ Science on the Semantic Web,  Rutgers, October 2002

DB / IS subcommunity

How is it relevant to research on the SW

How may the SW stimulate research in this community

DB theory Type theory, Complexity, theory of concurrency

Ontology axiomatics and theory; formal semantics; semantics for incomplete, inconsistent and evolving representations

Data(base) semantics

Everything; in particular ontology language development; constraints; data structures

Ontology modeling; formal semantics of web services

Normalization/design

Not specifically as such; some work on Non-First Normal Form

Requirement for formal properties for ontology organization; perhaps ontology design guidelines or “semantic normal forms”; conflict resolution; redundancy checks in general

Data modeling reuse/extend/map DM formalisms, techniques and methods e.g. EER, ORM, UML for ontology (content) specification and design

semantic data modeling; ontology content creation techniques and methods; complex ontological relationships; domain models

View integration

Ontology alignment, translation, object identities, updateable views…; model mappings

see Federated DBs; ontology support for view and application integration; ontology composition and update

Page 24: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Schema integration

apply to autonomously designed schemas; global schemas as pre-ontologies? conflict detection

Ontology alignment; new kinds of models will pose new kinds of problems

Deductive DB/Datalog

Learn from its failure, query processing and F-logic

how to handle different complexity levels efficiently

Multimedia DB Image ontologies; semantic indexing; similarity-based search

Image-based ontologies?

Temporal/Spatial DB

GIS semantics and archiving; histories data management;

requirement to model temporal knowledge as first class citizen in ontologies; spatial, temporal modeling in upper ontologies; versioning of GIS becomes critical issue

Document DB Digital libraries, unstructured data; standards for digital library resource descriptions to beused on the SW

Lack of a priori global model presents a research challenge

OO DB Object-oriented and object-based models for ontologies, extensible databases; modeling of object behavior; build OODB into Java

management of large collections of object-, behavior- and resource identifiers

Visual DB Visualization for the SW, visual queries; ontology visualization

semantic upgrades of image databases to be used as visual ontologies

Page 25: Recap @ Science on the Semantic Web,  Rutgers, October 2002

XML/Web DB

Most relevant, caching Size and semantics; XML shortcomings for semantics definition

Distributed DB

everything trust/privacy/compliance issues in distributed DBMS; design/dynamic tailoring of DDBMS underlying web services

Constraint DB

Constraint enforcement as semantics mechanism; semantics-based query processing

Non-closed world assumption issues

Transaction modeling

loosening of ACID properties Web services, Extended distributed transaction models; non-CWA issues; smart user profiling

Transaction processing

limits of what can/must be transactional

ACID properties of Web services; semantic support for very long transactions

Mobile DB not directly; “mobile” is a platform issue

context-aware computing; device location-independent semantics; mobility issues raised/enabled by the (Semantic) Web

Main memory DB

Semantic caching possibly semantic caching i.e. using application semantics or context

Page 26: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Parallel DB unclear at present; straightforward reuse/apply (e.g. parallel queries, transactions, …) in certain niches

Not clear at present Web SoA; parallel architectures for ontology servers?

DB machines   Not clear at present Web SoA

DB security A lot, e.g., access control trust and privacy, QoS; dynamically changing and conflicting security requirements

Federated DB Autonomy; approaches for integrating heterogeneous data sources, in particular web information sources; mediator/ wrapper-based architectures

www = huge federated DB; develop more powerful (scalable) approaches for ontology alignment and integration; heterogeneous sources may have different credibility; service composition

Query processing

high applicability; e.g. “smart” query enhancement

Query optimization

high applicability; e.g. use domain-knowledge to optimize query execution and rewriting

 

Information retrieval

broad applicability of techniques and theory;

 

Page 27: Recap @ Science on the Semantic Web,  Rutgers, October 2002

DB interoperability

Everything; esp. see federated DBs; see schema integration

Semantic aspects of interoperability; see federated DBs; quality of interoperation

DB versioning Link maintenance; ontology versioning

Annotations, ontology modeling, versioning of instance data

Metadata Annotations, ontology modeling, versioning

Mediation/Middleware

Web services will benefit P2P, collaboration, new market for mediating components

DB warehousing

DW architectures for decision support; improve e.g. web service efficiency; see the (S)Web as a giant DW

Smart data warehousing; share/compose application semantics; ontology behind “real” data

Data(base) mining

web mining; clustering; learning; information extraction profiles

mining from text; exploit semantics in mining; derive semantics inductively from query results on “real” data including exceptions; machine learning

Database architectures and DBMS

DBMS (components) as web service(s); add semantics to every function/module in a DBMS’s architectures

Ontology support in data dictionaries; new, more flexible DB architectures for better SW support and processing on the web

Page 28: Recap @ Science on the Semantic Web,  Rutgers, October 2002

Web-IS architectures

fitting enterprise IS (components) into the SW; Web IS; also see DBMS architectures

New architectures and design principles for Web IS

Functional modeling

design of web services; functional modeling that deals explicitly with a domain’s semantics

Decomposition and composition of web services; event modeling

IS in organizations

looser coupling required, provide potential for organizations to morph into the SW; see also workflow modeling

serving new organizations of business, community and government with emergent SW-based IS technology

Web-IS applications

  smart (ontology-driven) SW portals and search engines (“Google++”-type); SW-based “direct marketing”-style systems; smart user profiling

IS workflow modeling

exception handling in long (business) transactions; workflows as “the” paradigm for “programming” the SW

unreliability of components; unavailability of services

IS methodologies

ontology lifecycle issues; as IS components become more intelligent, work shifts to self-organization

New thinking required! E.g. Web IS in enterprises; how must business processes change to deal with existence of the SW; develop/maintain SW-based systems for user community unknown a priori

CASE tools ontology management systems

Page 29: Recap @ Science on the Semantic Web,  Rutgers, October 2002

User interfaces

new applications of design principles for GUIs

New and complex requirements and methods, immersive environments

DB application architectures

  Web application service

AI-and-DB knowledge representation, inference

 

Uncharted territory 1

  Sensor input and stream data management

Uncharted territory 2

In general, most algorithms in DM are poor when they are applied to access, report etc data on the web. Domain semantics in such requests need to be exploited; however “centralized” solutions (where resources need to notify potential requestors) will not be scalable.