Research Issues & Challenges in Semantic Web


Page 1: Research Issues & Challenges in Semantic Web

Research Issues & Challenges in Semantic Web

Jinsoo Park, Ph.D.

Assistant Professor

College of Business Administration

Korea University

[email protected]

http://ids.korea.ac.kr

Page 2: Research Issues & Challenges in Semantic Web

Self-Introduction

Short Bio

1999: Ph.D. in Information Systems, The University of Arizona

1999.9 – 2002.8: Assistant Professor, Carlson School of Management, University of Minnesota

2002.9 – Present: Assistant Professor, Business School, Korea University

Research Areas

Content and Metadata Management in Intra- and Inter-organizational Information Systems

Semantic Interoperability and Integration

Knowledge Sharing and Coordination

Ontology

Teaching Areas

PhD – Research Methods

MBA/MS – IS Development Methodologies, AI, Databases, Data Structures & Algorithms, Java

Undergraduate – Systems Analysis and Design, MIS, IT Infrastructure

Page 3: Research Issues & Challenges in Semantic Web

e-Business – Technical Challenges

Communication Security & Reliability

System Heterogeneity

Data/Information Heterogeneity

Business Process Heterogeneity

Dynamic Business and Technology Heterogeneity

Page 4: Research Issues & Challenges in Semantic Web

Inter-organizational Interoperability

Interaction with diverse, complex enterprises

[Figure: interoperability among the enterprise applications of a buyer (ERP, SCM), a supplier (CRM, SFA), and other buyers/suppliers (ERP, SCM, CRM, SFA)]

Shim et al. (2000)

Page 5: Research Issues & Challenges in Semantic Web

Motivations

Vast and escalating amounts of data

Highly heterogeneous – a plethora of semantic conflicts

Data types, data formats, structures, community, …

Considerable amount of legacy data with no associated metadata

The growth in existing data far exceeds our abilities to locate and analyze the relevant data

“Enterprise data integration is the top item on every CIO’s wish list. So what are we doing about it?”

“Are We Working On The Right Problems?,” Plenary Panel led by Michael Stonebraker, 1998 ACM SIGMOD Conf.

“The diversity of information content and formats is a salient factor in nearly all distributed systems, and the major challenge is to make diverse information systems interoperate at the semantic level while retaining their difference.”

March, Hevner & Ram, Information Systems Research, 2000

98% of companies recently interviewed say that integration is either “extremely important” or “very important” to their firm’s IT strategy

Forrester Research, 2001

Sponsors: NSF, NASA, and NIH

Page 6: Research Issues & Challenges in Semantic Web

Interoperability

Syntactic Level (Application Level)

Interface, Message, Transport, Protocol

Technology Solution

Semantic Level (Knowledge Level)

Metadata, Ontology, Context, Agent

Linguistic, Social, and Philosophical Solution

Ram, Park and Lee (1999)

Page 7: Research Issues & Challenges in Semantic Web

What is Semantics?

The meaning and the use of data (Woods, 1975)

“meaning or relationship of meanings, or relating to meaning” (Webster)

Weak vs. Deep Semantics (Sheth, 1995)

Weak semantics - semantics that can be identified based on structural, syntactic, and value/extensional information in databases

Deep semantics - semantics that involve the issues of human cognition, perception, or interpretation

Example

47

Apples are expensive

A 100 ~ 91?

Semantics bring information closer to human thinking and decision-making

Page 8: Research Issues & Challenges in Semantic Web

Semantics-based Communication

Theory of communication that links results from semiotics, linguistics and philosophy into actual information technology

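Meaning Triangle (Ogden and Richards, 1923)

[Figure: meaning triangle with corners Symbol, Concept, and Thing. A Symbol evokes a Concept; the Concept refers to a Thing; the Symbol stands for the Thing.]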

Page 9: Research Issues & Challenges in Semantic Web

Semantic Interoperability Problems

Contextual differences between source and target information systems

Different vocabularies, taxonomies, schemas

Implicit semantics – tacit knowledge

Lack of separation between content, intent and process

Embedded rules

Consistency between different versions of the same schema

Page 10: Research Issues & Challenges in Semantic Web

Research on Semantic Interoperability

Data-Level Analysis

Analysis of the differences in data domains caused by the multiple representations and interpretations of similar data

DeMichiel (1989), Yu et al. (1991), Ventrone & Heiler (1991), Sciore et al. (1994), Kahng & McLeod (1998), Goh et al. (1999)

Schema-Level Analysis

Analysis of differences in logical structures and/or inconsistencies in metadata (i.e., schemas) of the same application domain

Batini & Lenzerini (1984), Navathe et al. (1986), Geller et al. (1992), Garcia-Solaco et al. (1995), Lakshmanan et al. (1997)

Little research has been done on both levels at the same time

Page 11: Research Issues & Challenges in Semantic Web

An Example – Data-Level Conflicts

DISCLOSURE (COMPNO: 3842)

Attribute                      Value
CF                             19,860,228
NI                             146,502
NS                             2,909,574
NRCEX(ROE)                     0.11

DATALINE (CODE: HOND)

Attribute                      Value
PERIOD END                     28-02-86
EARNED FOR ORDINARY            146,502
TOTAL SALES                    2,909,574
RETURN ON SHAREHOLDER EQUITY   19.57
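
The DISCLOSURE and DATALINE records on this slide can be compared mechanically once attribute correspondences are known. The following is a minimal Python sketch, not the method of any cited paper. The `CORRESPONDENCES` table is an assumption inferred from the matching values on the slide, and the sketch assumes both records describe the same company.

```python
from decimal import Decimal

# Values transcribed from the slide, with thousands separators removed.
disclosure = {
    "CF": Decimal("19860228"),
    "NI": Decimal("146502"),
    "NS": Decimal("2909574"),
    "NRCEX(ROE)": Decimal("0.11"),
}
dataline = {
    "PERIOD END": "28-02-86",
    "EARNED FOR ORDINARY": Decimal("146502"),
    "TOTAL SALES": Decimal("2909574"),
    "RETURN ON SHAREHOLDER EQUITY": Decimal("19.57"),
}

# Hypothetical attribute correspondences (an assumption, not stated on the slide).
CORRESPONDENCES = {
    "NI": "EARNED FOR ORDINARY",
    "NS": "TOTAL SALES",
    "NRCEX(ROE)": "RETURN ON SHAREHOLDER EQUITY",
}


def compare(source, target, correspondences):
    """Label each mapped attribute pair as 'match' or 'value conflict'."""
    report = {}
    for source_name, target_name in correspondences.items():
        same = source[source_name] == target[target_name]
        report[source_name] = "match" if same else "value conflict"
    return report


report = compare(disclosure, dataline, CORRESPONDENCES)

# Rescaling the ratio to a percentage does not remove the ROE conflict.
rescaled_roe = disclosure["NRCEX(ROE)"] * 100
```

Under these assumptions the differently named NI and NS attributes carry identical values (a naming conflict only), while the two return-on-equity figures still disagree after rescaling, which points to a difference in definition rather than in representation.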

Page 12: Research Issues & Challenges in Semantic Web

An Example – Schema-Level Conflicts

DB 1

TAX

YEAR   TAX-TYPE   AMOUNT
1999   Property   250.34
1999   Water      38.99
2000   Property   234.98
2000   Water      59.05

DB 2

TAX-AMOUNT

YEAR   PROPERTY   WATER
1999   250.34     38.99
2000   234.98     59.05

DB 3

PROPERTY

YEAR   AMOUNT
1999   250.34
2000   234.98

WATER

YEAR   AMOUNT
1999   38.99
2000   59.05
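
The three databases on this slide hold the same facts under different schemas: the tax type is a data value in DB 1, a column name in DB 2, and a table name in DB 3. A minimal Python sketch of reconciling them into the DB 1 shape follows. The function names and the in-memory representation are illustrative assumptions.

```python
# DB 1: tax type stored as a data value.
db1 = [
    (1999, "Property", 250.34), (1999, "Water", 38.99),
    (2000, "Property", 234.98), (2000, "Water", 59.05),
]

# DB 2: tax type stored as a column name.
db2 = [
    {"YEAR": 1999, "PROPERTY": 250.34, "WATER": 38.99},
    {"YEAR": 2000, "PROPERTY": 234.98, "WATER": 59.05},
]

# DB 3: tax type stored as a table name.
db3 = {
    "PROPERTY": [(1999, 250.34), (2000, 234.98)],
    "WATER": [(1999, 38.99), (2000, 59.05)],
}


def from_db2(rows):
    """Turn each non-YEAR column into a (year, tax type, amount) row."""
    result = []
    for row in rows:
        for column, amount in row.items():
            if column != "YEAR":
                result.append((row["YEAR"], column.capitalize(), amount))
    return sorted(result)


def from_db3(tables):
    """Turn each table name into the tax type of its rows."""
    return sorted(
        (year, name.capitalize(), amount)
        for name, rows in tables.items()
        for year, amount in rows
    )


all_equivalent = sorted(db1) == from_db2(db2) == from_db3(db3)
```

Here `all_equivalent` evaluates to `True`: once metadata (column and table names) is moved back into data, the three sources agree row for row.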

Page 13: Research Issues & Challenges in Semantic Web

The Revolution of the Web

[Figure: layers of Web technology along a time axis marked 1990, 2000, and 2010, from earliest to latest]

HyperText Transfer Protocol (HTTP), HyperText Markup Language (HTML): formatted documents; foundation of the current Web

eXtensible Markup Language (XML), Resource Description Framework (RDF): self-describing documents

Proof, logic and ontology languages (e.g., DAML+OIL): shared terms/terminology; machine-machine communication

Trusted Web resources

Berners-Lee and Hendler (2001), Nature

Page 14: Research Issues & Challenges in Semantic Web

The Current Web

Global information space for human consumption.

Information and its presentations are mixed up.

Accessible merely by keywords: high recall, low precision

No distinction in the keyword search “Rose” among these concepts: Rational Rose, Guns N’ Roses, Rose (flower), Rose (Titanic), England’s Rose.

Difficult for machines to automatically comprehend, process, communicate and interoperate.

Problems in information: finding, extracting, representing, interpreting, maintaining.

Page 15: Research Issues & Challenges in Semantic Web

The Semantic Web

“The Semantic Web is the representation of data on the World Wide Web (based on the RDF standards and other standards to be defined).” (http://www.w3.org/2001/sw/)

Envisioned by Tim Berners-Lee and researched by a DARPA team and others

“A web of data that can be processed directly or indirectly by machines”

Tim Berners-Lee, Weaving the Web, HarperBusiness, 2000.

The “Next Generation Web” with well-established infrastructure for expressing information in a

precise,

human-readable, and

machine-interpretable form.

Page 16: Research Issues & Challenges in Semantic Web

The Vision

Agents

Web Services

Grid Computing

e-Business

e-Science

[Source: C. Goble, “Information Grids, the Semantic Web & Why Ontologies Matter”]

Page 17: Research Issues & Challenges in Semantic Web

Current Research and Technologies

Semantic Web technologies are still very much in their infancy

Little consensus about the likely direction of the Semantic Web

No widespread agreement on exactly what the Semantic Web is

Infrastructure

XML(S), RDF(S)

Ontology language

DAML+OIL, OWL, …

Two paradigms in semantic interoperability

Data warehousing (eager) approach

On-demand driven (lazy) approach
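
The difference between the two paradigms can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: a "source" is any zero-argument function returning rows, and the class names `EagerIntegrator` and `LazyIntegrator` are hypothetical.

```python
class EagerIntegrator:
    """Data warehousing: copy all source data before any query arrives."""

    def __init__(self, sources):
        # Sources are read once, at load time.
        self.warehouse = [row for fetch in sources for row in fetch()]

    def query(self, predicate):
        return [row for row in self.warehouse if predicate(row)]


class LazyIntegrator:
    """On-demand: contact the sources only when a query arrives."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate):
        return [row for fetch in self.sources for row in fetch() if predicate(row)]


# A toy source whose contents change after the integrators are built.
source_rows = [{"year": 1999}]


def source():
    return list(source_rows)


eager = EagerIntegrator([source])
lazy = LazyIntegrator([source])
source_rows.append({"year": 2000})

eager_count = len(eager.query(lambda row: True))  # snapshot taken at load time
lazy_count = len(lazy.query(lambda row: True))    # reads the source at query time
```

The eager integrator answers from its snapshot without touching the source again, while the lazy integrator sees the newer row at the cost of contacting the source on every query.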

Page 18: Research Issues & Challenges in Semantic Web

Benefits of XML over HTML

(a) HTML

    <html>
      <body topmargin=20 leftmargin=10>
        <font size=3>
          <table width="389" border="1">
            <tr>
              <td height="82" valign="middle">
                <pre>
                  Regular  Our
                  Price    Price
    LaserJet1150  380,000  357,000
    In stock
                </pre>
              </td>
            </tr>
          </table>
          ...
        </font>
      </body>
    </html>

(b) XML

    <?xml version="1.0"?>
    <document>
      <productInfo>
        <product>LaserJet1150</product>
        <regularPrice>380,000</regularPrice>
        <ourPrice>357,000 </ourPrice>
        <inStock>yes</inStock>
      </productInfo>
    </document>
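
With the XML version, a program can address each field by tag name instead of guessing positions inside preformatted text. A minimal sketch using Python's standard `xml.etree.ElementTree` module follows. The function name `read_product` and the output keys are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

XML_DOC = """<?xml version="1.0"?>
<document>
  <productInfo>
    <product>LaserJet1150</product>
    <regularPrice>380,000</regularPrice>
    <ourPrice>357,000 </ourPrice>
    <inStock>yes</inStock>
  </productInfo>
</document>"""


def read_product(xml_text):
    """Extract the product fields by tag name and convert them to Python types."""
    root = ET.fromstring(xml_text)
    info = root.find("productInfo")
    return {
        "product": info.findtext("product").strip(),
        # Remove the thousands separator; int() tolerates surrounding whitespace.
        "regular_price": int(info.findtext("regularPrice").replace(",", "")),
        "our_price": int(info.findtext("ourPrice").replace(",", "")),
        "in_stock": info.findtext("inStock").strip() == "yes",
    }


product = read_product(XML_DOC)
```

The tags make the structure explicit, but nothing in the markup says which currency the prices are in or whether tax is included.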

Page 19: Research Issues & Challenges in Semantic Web

But XML faces the following problems …

Multiple Standards

Need for consistent and standardized tags

There are so many XML standards: “there are more than a dozen XML protocols - for Financial Trading applications alone”

(Chairman of a Financial Services XML Working Group)

e.g., (price, cost), (subject, theme, title), (car, automobile) ...

Implicit Semantics

Agreement upon the precise meaning of each tag

e.g., How precisely defined is the notion of “price”?

Is it in dollars ($) or won (₩)?

Even if it is “dollars”, is it US dollars, Canadian dollars, or Hong Kong dollars?

Does the “price” include sales tax? Does it include the value added tax (VAT)?

About the notion of “title”

Is it a movie title or a drama title?

About the notion of “bank”

Is it a financial institution or a river embankment?

Modeling Conflicts
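
One way to make the implicit semantics of a “price” tag explicit is to carry the context (currency, tax treatment) with the value and normalize before comparing. The Python sketch below illustrates the idea only. The `Price` class, the exchange rate, and the VAT rate are illustrative assumptions, not real market figures.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Price:
    amount: Decimal
    currency: str       # currency code such as "USD" or "KRW"
    tax_included: bool  # whether VAT is already part of the amount


# Illustrative figures only (assumptions made for this sketch).
RATES_TO_USD = {"USD": Decimal("1"), "KRW": Decimal("0.001")}
VAT_RATE = Decimal("0.10")


def normalize(price):
    """Express a price in USD with VAT included, rounded to cents."""
    amount = price.amount * RATES_TO_USD[price.currency]
    if not price.tax_included:
        amount *= 1 + VAT_RATE
    return Price(amount.quantize(Decimal("0.01")), "USD", True)


korean_offer = Price(Decimal("357000"), "KRW", tax_included=False)
us_offer = Price(Decimal("392.70"), "USD", tax_included=True)

same_price = normalize(korean_offer) == normalize(us_offer)
```

The raw numbers 357,000 and 392.70 look unrelated, yet under the stated assumptions they denote the same price once the context is applied.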

Page 20: Research Issues & Challenges in Semantic Web

But XML faces the following problems

Evolution of Semantics

Problem of evolution

e.g., Conversion from using local currency to using Euros in Europe

e.g., GM Daewoo, Renault Samsung

Multiple Purposes

Different purposes necessitate different interpretations of the information

e.g., Student

Professor – Taking courses

Staff – Registration

e.g., Corporate household/family structure

Financial – Risk (credit - bankruptcy)

Accounting – Account consolidation

Legal – Liability (insurance)

and these are dynamic, changing over time …

Page 21: Research Issues & Challenges in Semantic Web

RDF & RDF Schema

RDF (Resource Description Framework)

Represents metadata about Web resources

e.g., title, author, and modification date of a Web page …

Data model → resource, property, property value

rdf:Description, rdf:ID, rdf:type

Purports to provide interoperability between applications that exchange machine-understandable information on the Web

RDF Schema

Provides semantics about RDF

a.k.a. RDF Vocabulary Description Language

XML Schema: about syntax

Defines an appropriate RDF vocabulary (classes, properties and constraints) for each specific domain

Extension of data model → class and property hierarchy

rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain and rdfs:range

Logical connectives such as conjunction, disjunction, and negation are not provided

Not full-fledged ontological modeling and reasoning

Page 22: Research Issues & Challenges in Semantic Web

RDF & RDF Schema

RDF Schema

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
             xml:base="http://ids.korea.ac.kr/student.rdfs">
      <rdfs:Class rdf:ID="universityStudent"/>
      <rdfs:Class rdf:ID="undergraduateStudent">
        <rdfs:subClassOf rdf:resource="#universityStudent"/>
      </rdfs:Class>
      <rdfs:Class rdf:ID="graduateStudent">
        <rdfs:subClassOf rdf:resource="#universityStudent"/>
      </rdfs:Class>
      <rdf:Property rdf:ID="degree">
        <rdfs:domain rdf:resource="#graduateStudent"/>
        <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
      </rdf:Property>
    </rdf:RDF>

[Figure: class hierarchy. universityStudent has the subclasses undergraduateStudent and graduateStudent. Properties: degree]

RDF

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://ids.korea.ac.kr/student.rdfs#">
      <rdf:Description rdf:about="http://ids.korea.ac.kr/graduateStudent.rdf#Honggildong">
        <rdf:type rdf:resource="http://ids.korea.ac.kr/student.rdfs#graduateStudent"/>
        <degree>MIS</degree>
      </rdf:Description>
    </rdf:RDF>
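
The schema and instance on this slide reduce to a handful of (resource, property, value) triples, from which RDF Schema lets a program infer additional class memberships. The following Python sketch is hand-rolled for illustration and does not use an RDF library. URIs are abbreviated to local names, and the function names are assumptions.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"

# Triples from the RDF Schema document.
schema = {
    ("undergraduateStudent", SUBCLASS_OF, "universityStudent"),
    ("graduateStudent", SUBCLASS_OF, "universityStudent"),
    ("degree", DOMAIN, "graduateStudent"),
}

# Triples from the RDF instance document.
data = {
    ("Honggildong", RDF_TYPE, "graduateStudent"),
    ("Honggildong", "degree", "MIS"),
}

graph = schema | data


def superclasses(cls, triples):
    """All classes reachable from cls through rdfs:subClassOf (transitively)."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for subject, prop, obj in triples:
            if subject == current and prop == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def types_of(resource, triples):
    """Asserted types plus types implied by rdfs:domain and rdfs:subClassOf."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    used_properties = {p for s, p, o in triples if s == resource}
    # rdfs:domain: using a property implies membership in its domain class.
    direct |= {o for s, p, o in triples if p == DOMAIN and s in used_properties}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


honggildong_types = types_of("Honggildong", graph)
```

Although the instance document only states that Honggildong is a graduateStudent, the subclass hierarchy lets a program conclude that Honggildong is also a universityStudent.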

Page 23: Research Issues & Challenges in Semantic Web

OWL

Web Ontology Language

RDF Schema lacks some desirable expressiveness

People use different words to represent the same thing

cardinality constraints, conjunction, disjunction …

OWL extends RDF Schema

Uses all RDF Schema’s basic notions of Class, Property, domain, and range

Adds more vocabulary for describing properties and classes

relations between classes (e.g., disjointness)

cardinality (e.g., “exactly one”)

richer typing of properties

characteristics of properties (e.g., symmetry)

enumerated classes

OWL can be used to explicitly represent the meaning of terms in vocabularies and the relationships between those terms
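
Two of the additions listed on this slide, symmetric properties and disjoint classes, can be illustrated without an OWL reasoner. The Python sketch below shows only what such statements let a program conclude. The property `classmateOf` and the individuals `Kim` and `Lee` are hypothetical, and real OWL tooling does considerably more than this.

```python
def symmetric_closure(triples, symmetric_properties):
    """Add (o, p, s) for every (s, p, o) whose property is declared symmetric."""
    closed = set(triples)
    closed |= {(o, p, s) for s, p, o in triples if p in symmetric_properties}
    return closed


def disjointness_clashes(types, disjoint_pairs):
    """Individuals asserted to belong to two classes declared disjoint."""
    return sorted(
        individual
        for individual, classes in types.items()
        if any(a in classes and b in classes for a, b in disjoint_pairs)
    )


# Hypothetical facts and declarations for illustration.
facts = {("Kim", "classmateOf", "Lee")}
closed_facts = symmetric_closure(facts, {"classmateOf"})

types = {
    "Kim": {"graduateStudent"},
    "Lee": {"graduateStudent", "undergraduateStudent"},
}
clashes = disjointness_clashes(types, [("undergraduateStudent", "graduateStudent")])
```

Declaring `classmateOf` symmetric yields the reverse statement about Lee and Kim, and declaring the two student classes disjoint exposes the description of Lee as inconsistent. Neither conclusion can be expressed with RDF Schema alone.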

Page 24: Research Issues & Challenges in Semantic Web

CREAM

Conflict Resolution Environment for Autonomous Mediation

An integrated and collaborative facility for achieving semantic interoperability among the participating heterogeneous information sources.

Agent-based Mediation Architecture

SCROL – Semantic Conflict Resolution OntoLogy

Domain independent

Semantic Query Transformation

Park & Ram (2004), ACM Transactions on Information Systems

Page 25: Research Issues & Challenges in Semantic Web

Research Questions

What kinds of semantic conflicts are typically found in a heterogeneous environment?

How do we recognize & resolve such conflicts?

To what extent can we automate the process of conflict identification & resolution using mediators?

Page 26: Research Issues & Challenges in Semantic Web

[Figure: CREAM architecture]

Layers: Semantic Mediation Layer, Metadata Layer, Data Exchange Layer, Data Access Layer

Components: QBS Interface (Users), Semantic Mediators, Information Integrator, Semantic Filter, Common Repository, SCROL, Schema Designer, Schema Mapper, Ontology Mapper, Wrapper Generator, Output Generators (XML Generator, XSL Generator, DTD Generator), RMI Wrapper, Content Wrapper

Sources: Database sources, Web sources

Page 27: Research Issues & Challenges in Semantic Web

Semantic Integration

SCROL

Federated Schema

XML Schema

XML DTD

Business Doc/Schema

Users

Semantic Mediation Service Layer

DB Schema

Page 28: Research Issues & Challenges in Semantic Web

Metadata – Semantic Model

Ram, Park and Ball (1999), IEEE Computer.

Page 29: Research Issues & Challenges in Semantic Web

SCROL – Semantic Conflict Resolution OntoLogy

SCROL = (OC, OI, RP, RS, RM, u)

OC - concepts

OI - instances

RP - parenthood relationship (subconcept-of/superconcept-of, instance-of)

RS - sibling relationship (disjoint, peer, part-of, is-a)

RM - (domain instance value) mapping relationship (one-one, one-many, many-many, total, partial, none)

u - root

Ram & Park (2004), IEEE Transactions on Knowledge and Data Engineering
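
The six components of the definition map naturally onto a small data structure. The Python sketch below is one possible encoding, not the representation used in the cited paper. The class and method names are hypothetical, and treating Square Meter and Acre as instances of an Area concept is an assumption based on the mapping example later in the deck.

```python
from dataclasses import dataclass, field

SQUARE_METERS_PER_ACRE = 4046.8564224  # international acre


@dataclass
class Scrol:
    root: str                                      # u
    concepts: set = field(default_factory=set)     # OC
    instances: set = field(default_factory=set)    # OI
    parenthood: set = field(default_factory=set)   # RP: (child, parent, kind)
    sibling: set = field(default_factory=set)      # RS: (a, b, kind), unused here
    mapping: dict = field(default_factory=dict)    # RM: (source, target) -> function

    def __post_init__(self):
        self.concepts.add(self.root)

    def add_concept(self, name, parent):
        self.concepts.add(name)
        self.parenthood.add((name, parent, "subconcept-of"))

    def add_instance(self, name, concept):
        self.instances.add(name)
        self.parenthood.add((name, concept, "instance-of"))

    def convert(self, value, source, target):
        """Translate a value between two instances using the RM mapping."""
        if source == target:
            return value
        return self.mapping[(source, target)](value)


scrol = Scrol(root="Root")
scrol.add_concept("Area", parent="Root")
scrol.add_instance("Square Meter", "Area")
scrol.add_instance("Acre", "Area")
scrol.mapping[("Acre", "Square Meter")] = lambda v: v * SQUARE_METERS_PER_ACRE
scrol.mapping[("Square Meter", "Acre")] = lambda v: v / SQUARE_METERS_PER_ACRE

two_acres_in_m2 = scrol.convert(2.0, "Acre", "Square Meter")
```

A mediator that knows one source reports area in acres and another in square meters can look up the mapping in the ontology instead of hard-coding the conversion per source pair.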

Page 30: Research Issues & Challenges in Semantic Web

SCROL – Graphical Illustration

[Figure: SCROL drawn as a tree. The Root Concept branches into several levels of Concept nodes, with Instance nodes below them. Relationships between nodes are labeled disjoint, peer, is-a, part-of, and mapping.]

Page 31: Research Issues & Challenges in Semantic Web

SCROL Interface

Page 32: Research Issues & Challenges in Semantic Web

Ontology-Schema Mapping Example

[Figure: attributes of two schemas, County-Population and City-Population, linked to SCROL concepts]

SCROL concepts shown in the figure:

Image: Picture, Map, Drawing; Vector, Raster (peer); BMP, TIFF, JPEG, GIF, VSD, DWG

Scale: 10^-3, 10^-2, 10^-1, 10^0, 10^1, 10^2, 10^3

Temporal_Format: Date, Day; Julian Date Type, String Type; mm/dd/yy, yyyy/mm/dd, Month Day, Year; Date, Month Day, Year

String, Cardinal_Number (peer)

Area: Square Meter, Acre

Graphical Location (Coordinate, UTM), Descriptive Location (Code, String) (peer)

Duration: Day, Week, Month

Region: City, Country, State/Province, Town (part-of)

Schema attributes shown in the figure: name, size, census-starting-date, census-ending-date, location, area, image, duration, area-size, map
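
Once two date attributes are mapped to different String Type instances of Temporal_Format, values can be translated between the formats automatically. A minimal Python sketch using the standard `datetime` module follows. The `PATTERNS` table and the function name `convert_date` are illustrative assumptions.

```python
from datetime import datetime

# strptime/strftime patterns for three of the string formats named on the slide.
PATTERNS = {
    "mm/dd/yy": "%m/%d/%y",
    "yyyy/mm/dd": "%Y/%m/%d",
    "Month Day, Year": "%B %d, %Y",
}


def convert_date(text, source_format, target_format):
    """Parse text in the source format and re-render it in the target format."""
    parsed = datetime.strptime(text, PATTERNS[source_format])
    return parsed.strftime(PATTERNS[target_format])


converted = convert_date("02/28/86", "mm/dd/yy", "yyyy/mm/dd")  # "1986/02/28"
```

Two-digit years are resolved by Python's own convention (69 to 99 map to 1969 to 1999), which is itself a piece of implicit semantics that a mediator would need to record.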

Page 33: Research Issues & Challenges in Semantic Web

Ontology-Schema Mapper

Page 34: Research Issues & Challenges in Semantic Web

Semantic Mediators

[Figure: message flow among the semantic mediators]

Components: User, Coordinator, Message Generator, Conflict Detector, Conflict Resolver, Query Generator, Selector, Data Collector, SCROL, Metadata Directory, RMI Servers, Remote Systems

(1) Ask Query

(2) Generate Message

(3) Ask Semantic Conflicts in the Requested Query

(4) Traverse SCROL

(5) Identify Conflicts

(6) Reply Searching Results

(6) Resolvable?

NO: (7a) Report to Domain Experts

YES: (7b) Ask Directory Service and Local Query Statements

(8a) Retrieve Directory Information

(8b) Ask Local Query Statements

(9a) Reconcile Condition Statements

(9b) Reply Generated Query Sets

(10) Ask RMI to get Query Result Sets

(11) Retrieve Query Result

(12) Tell Query Results

(13) Ask Semantic Reconciliation

(14) Reply Resolved Results

(15) Display Query Results

Page 35: Research Issues & Challenges in Semantic Web

Semantic Mediator Communication Protocol

Theory of Speech Acts (Austin 1962, Searle 1969)

Performatives

ASK-ALL(QID, Query) - asking for the collection of local queries.

ASK-IF(Query) - asking if Query holds.

DELIVER(QueryResults) - reporting the query results.

DETECT(Query) - traversing the SCROL to check semantic conflicts.

GENERATE(Query) - requesting local query generation.

LOCATE(Query) - requesting directory service to retrieve directory information.

RECONCILE(QueryResults) - requesting semantic reconciliation for the query results.

REPLY-ALL(QID, QueryResults) - replying with all the query results being asked for.

REPLY-IF(Query, Answer) - replying with the Answer upon being asked if Query holds.

REPORT(QueryResults) - reporting the query results.

RESOLVE-IF(Query, Answer) - reporting the Answer upon being asked if Query is resolvable.

TELL(Query) - notifying and updating the query request.
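
The performatives can be modeled as a closed vocabulary of message types exchanged between mediators. The Python sketch below is one possible encoding. The `Message` fields `sender` and `receiver`, the `render` format, and the pairing of ASK performatives with REPLY performatives are assumptions made for illustration and are not specified on the slide.

```python
from dataclasses import dataclass
from enum import Enum


class Performative(Enum):
    ASK_ALL = "ASK-ALL"
    ASK_IF = "ASK-IF"
    DELIVER = "DELIVER"
    DETECT = "DETECT"
    GENERATE = "GENERATE"
    LOCATE = "LOCATE"
    RECONCILE = "RECONCILE"
    REPLY_ALL = "REPLY-ALL"
    REPLY_IF = "REPLY-IF"
    REPORT = "REPORT"
    RESOLVE_IF = "RESOLVE-IF"
    TELL = "TELL"


@dataclass(frozen=True)
class Message:
    performative: Performative
    sender: str
    receiver: str
    content: tuple

    def render(self):
        """Textual form such as ASK-IF(Q1)."""
        arguments = ", ".join(str(part) for part in self.content)
        return f"{self.performative.value}({arguments})"


# Assumed pairing of questions with their answers.
REPLIES = {
    Performative.ASK_ALL: Performative.REPLY_ALL,
    Performative.ASK_IF: Performative.REPLY_IF,
}


def reply_to(message, *content):
    """Build the matching reply, swapping sender and receiver."""
    return Message(
        REPLIES[message.performative], message.receiver, message.sender, content
    )


ask = Message(Performative.ASK_IF, "Coordinator", "ConflictDetector", ("Q1",))
answer = reply_to(ask, "Q1", "yes")
```

Restricting messages to an enumerated set of performatives lets each mediator dispatch on the message type without parsing free-form text.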

Page 36: Research Issues & Challenges in Semantic Web

Key Issues and Potential Research Directions …

Integration vs. Interoperability

Integration-based approach

attempts to build a monolithic view of the enterprise

integrates processes and applications at the event and message levels so multiple systems become one logical unit

Interoperability-based approach

focuses on the exchange of meaningful, context-driven information between autonomous systems

Page 37: Research Issues & Challenges in Semantic Web

Key Issues and Potential Research Directions …

Machine Understandable Semantics

How can a software agent learn something about the meaning of a term that it has never encountered before?

Semantic Mediation and Semantic Query Processing

Conflict Detection and Resolution

Semantic Normalization

C(e1) = C(e2)

Semantic Mapping and Translation

Semantic Association

Dynamic Evolution

Page 38: Research Issues & Challenges in Semantic Web

Key Issues and Potential Research Directions

Ontology Heterogeneity

Different knowledge representation formalisms

Language heterogeneity – when ontologies are expressed using different ontology languages

Naming conflicts

e.g., synonyms, homonyms, etc.

Modeling conflicts

e.g., differences in Total Number of Employees could be attributed to the inclusion or exclusion of Temporary Employees

Temporal conflicts

Arise when entity values or definitions belong to different times or time intervals

Conceptualization conflicts

e.g., time intervals vs. time points

Ontology Learning

Page 39: Research Issues & Challenges in Semantic Web

Q & A