The Semantic Web
Deborah McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory
Stanford University
Stanford, CA USA
http://www.ksl.stanford.edu/people/dlm
Today: Rich Information Source for Human Manipulation/Interpretation
Human
Human
“I know what was input”
• Global documents and terms indexed and available for search
• Search engine interfaces
• Entire documents retrieved according to relevance (instead of answers)
• Human input, review, assimilation, integration, action, etc.
• Special purpose interfaces required for user friendly applications
The web knows what was input but does little interpretation, manipulation, integration, and action
Information Discovery… but not much more
• Human intensive (requiring input reformulation and interpretation)
• Display intensive (requiring filtering)
• Not interoperable
• Not agent-operational
• Not adaptive
• Limited context
• Limited service
Analogous to a new assistant who is thorough yet lacks common sense, context, and adaptability
Future: Rich Information Source for Agent Manipulation/Interpretation
Human
Agent
Agent
“I know what was meant”
• Understand term meaning and user background
• Interoperable (can translate between applications)
• Programmable (thus agent operational)
• Explainable (thus maintains context and can adapt)
• Capable of filtering (thus limiting display and human intervention requirements)
• Capable of executing services
Semantic Markup
Languages such as DAML+OIL (www.daml.org)
• Encoding background info
• User modeling info
• Annotating web pages
• Annotating services
(thereby limiting the need for human disambiguation input, human interpretation, multiple answer display, translation assistance, agent assistance, adaptivity support, etc.)
Ontologies
DAML-enabled web pages
The Semantic Web enables…
• New models of intelligent services
• E-commerce solutions
• M-commerce
• Web assistants
• …
The Semantic Web Enables…
New forms of web assistants/agents that act on a human’s behalf, requiring less from humans and their communication devices…
Under the covers
Meaning needs to be encoded, understood, and reasoned with.
-- Ontologies capture meanings of terms and their interrelationships
What is an Ontology?
(figure: ontology spectrum, from least to most expressive)
• Catalog/ID
• Terms/glossary
• Thesauri, “narrower term” relation
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value restrictions
• Disjointness, inverse, part-of…
• General logical constraints
Ontologies and importance to E-Commerce
Simple ontologies (taxonomies) provide:
• Controlled shared vocabulary (search engines, authors, users, databases, programs/agents all speak same language)
• Site Organization and Navigation Support
• Expectation setting (left side of many web pages)
• “Umbrella” Upper Level Structures (for extension)
• Browsing support (tagged structures such as Yahoo!)
• Search support (query expansion approaches such as FindUR, e-Cyc)
• Sense disambiguation
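The "search support" bullet above can be illustrated with a small sketch of taxonomy-based query expansion. This is not how FindUR or e-Cyc are implemented; the taxonomy, the term names, and the function names are all hypothetical and chosen only to show the idea that a query for a general term can also retrieve documents indexed under its narrower terms.

```python
# Hypothetical toy taxonomy for illustration: child term -> parent term (is-a links).
IS_A = {
    "laptop": "computer",
    "desktop": "computer",
    "computer": "electronics",
    "camera": "electronics",
}

def narrower_terms(term, is_a=IS_A):
    """Return every term that sits transitively below `term` in the taxonomy."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parent in is_a.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found

def expand_query(term, is_a=IS_A):
    """Expand a one-word query with all of its narrower terms (sorted)."""
    return [term] + sorted(narrower_terms(term, is_a))

expanded = expand_query("computer")
```

A search front end could then OR together the expanded terms, so a query for "computer" also matches pages that only mention "laptop" or "desktop".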
Ontologies and importance to E-Commerce II
• Consistency Checking
• Completion
• Interoperability Support
• Support for validation and verification testing (e.g. http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinality-test1.daml )
• Configuration support
• Structured, “surgical” comparative customized search
• Generalization/Specialization
• … Foundation for expansion and leverage
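As a rough illustration of the consistency checking mentioned above, the sketch below tests instances against two kinds of ontology constraints: disjoint classes and a maximum cardinality on a property. The class names, the property name, and the data are invented for this example; it does not reproduce the cardinality test file linked above or any particular reasoner.

```python
# Hypothetical ontology fragments, invented for illustration only.
DISJOINT = [("Person", "Organization")]   # class pairs that may share no members
MAX_CARDINALITY = {"hasSSN": 1}           # property -> maximum number of values

def check_instance(name, classes, properties):
    """Return a list of human-readable problems found for one instance."""
    problems = []
    # Disjointness: an instance may not belong to both classes of a pair.
    for a, b in DISJOINT:
        if a in classes and b in classes:
            problems.append(f"{name}: member of disjoint classes {a} and {b}")
    # Cardinality: a property may not have more values than its limit.
    for prop, limit in MAX_CARDINALITY.items():
        values = properties.get(prop, [])
        if len(values) > limit:
            problems.append(
                f"{name}: {prop} has {len(values)} values, at most {limit} allowed"
            )
    return problems

ok = check_instance("fred", {"Person"}, {"hasSSN": ["123-45-6789"]})
bad = check_instance("acme", {"Person", "Organization"}, {"hasSSN": ["1", "2"]})
```

Here `ok` is empty, while `bad` reports both a disjointness violation and a cardinality violation.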
A Few Observations about Ontologies
– Simple ontologies can be built by non-experts
  • Verity’s Topic Editor, Collaborative Topic Builder, GFP, Chimaera, Protégé, OIL-ED, etc.
– Ontologies can be semi-automatically generated
  • from crawls of sites such as Yahoo!, Amazon, Excite, etc.
  • Semi-structured sites can provide starting points
– Ontologies are exploding (business pull instead of technology push)
  • Most e-commerce sites are using them: MySimon, Amazon, Yahoo! Shopping, VerticalNet, etc.
  • Controlled vocabularies (for the web) abound: SIC codes, UMLS, UN/SPSC, Open Directory (DMOZ), Rosetta Net, SUO
  • Business interest expanding: ontology directors, business ontologies becoming more complicated (roles, value restrictions, …), VC firms interested
  • DTDs are making more ontology information available
  • Markup languages growing: XML, RDF, DAML, RuleML, xxML
  • “Real” ontologies are becoming more central to applications
Implications and Needs
• Ontology Language Syntax and Semantics (DAML+OIL)
• Environments for Creation and Maintenance of Ontologies
• Training (Conceptual Modeling, reasoning implications, …)
Issues
– Collaboration among distributed teams
– Interconnectivity with many systems/standards
– Analysis and diagnosis
– Scale
– Versioning
– Security
– Ease of use
– Diverse training levels/user support
– Presentation style
– Lifecycle
– Extensibility
Chimaera – An Ontology Environment Tool
An interactive web-based tool aimed at supporting:
• Ontology analysis (correctness, completeness, style, …)
• Merging of ontological terms from varied sources
• Maintaining ontologies over time
• Validation of input
• Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language
• Used in commercial and academic environments
• Available as a hosted service from www-ksl-svc.stanford.edu
• Information: www.ksl.stanford.edu/software/chimaera
XML
• World Wide Web Consortium (W3C) standard
• Provides important solution to syntax problem and simple semantics and schemas:
<SSN>444-23-2656</SSN>
• Now we can describe the meaning of words
• Many applications of XML appearing:
  – Geographic Markup Language (GML)
  – Extensible rights Markup Language (XrML)
  – Chemical Markup Language (CML)
Problem: Limited semantics and ontology
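The "limited semantics" problem can be seen in a few lines of Python. The sketch below parses the `<SSN>` element from the slide alongside a second record that uses a different tag for the same concept. The `<SocialSecurityNumber>` tag, the `EQUIVALENT_TAGS` table, and the `lookup` helper are assumptions made up for this illustration; the table stands in for the kind of equivalence statement an ontology language would let authors publish.

```python
import xml.etree.ElementTree as ET

# Two records that mean the same thing but use different tag names.
# <SocialSecurityNumber> is a hypothetical tag invented for this sketch.
doc_a = ET.fromstring("<person><SSN>444-23-2656</SSN></person>")
doc_b = ET.fromstring(
    "<person><SocialSecurityNumber>444-23-2656</SocialSecurityNumber></person>"
)

# Plain XML lookup is purely syntactic: it only matches the literal tag name.
ssn_a = doc_a.findtext("SSN")
ssn_b = doc_b.findtext("SSN")  # doc_b has no <SSN> child, so this is None

# A hand-written equivalence table stands in for what an ontology would state.
EQUIVALENT_TAGS = {"SSN": ["SSN", "SocialSecurityNumber"]}

def lookup(doc, term):
    """Return the text of the first child whose tag is equivalent to `term`."""
    for tag in EQUIVALENT_TAGS.get(term, [term]):
        value = doc.findtext(tag)
        if value is not None:
            return value
    return None
```

With the table in place, `lookup(doc_b, "SSN")` finds the value that the purely syntactic `findtext("SSN")` misses.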
DARPA Agent Markup Language
• Builds on top of XML and RDF
• Provides rich ontology representation
• Key starting point for W3C Semantic Web activity
• Future releases will provide logic and rules capabilities
Problem: Tools to help create DAML ontologies, markup, and to facilitate access are still emerging
EXAMPLES
HTML
<html>
  <head>
    <TITLE>Fred Jones</TITLE>
  </head>
  <body>
    <H1>Information About Fred Jones</H1>
    <P>Fred Jones is in the U.S. Air Force. He is a Captain stationed at AFRL.</P>
  </body>
</html>

XML
<person>
  <name>Fred Jones</name>
  <employer>U.S. Air Force</employer>
  <station>AFRL</station>
  <rank>Captain</rank>
</person>

DAML
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
    xmlns:dod="http://www.dod.mil/personnel#"
    xmlns:af="http://www.af.mil/personnel#"
    xmlns:afrl="http://www.rl.af.mil/personnel#">
  <dod:Officer rdf:ID="fsmith">
    <dod:givenName>Fred</dod:givenName>
    <dod:surname>Smith</dod:surname>
    <dod:service rdf:resource="http://www.dod.mil/services#AirForce"/>
    <af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
    <af:station rdf:resource="http://www.af.mil/stations#AFRL_Rome"/>
    <daml:equivalentTo rdf:resource="ssn:123-45-6789"/>
  </dod:Officer>
</rdf:RDF>
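One way to see what the DAML markup adds is to read it as a set of (subject, property, value) statements that a program can process. The sketch below does this with Python's standard XML parser. It is a simplification that only handles the flat shape of the example above, not a general RDF/XML parser; the embedded document is the slide's example with the opening `rdf:RDF` tag closed so that it parses, and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's DAML example, with the opening rdf:RDF tag closed so it parses.
DAML_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
    xmlns:dod="http://www.dod.mil/personnel#"
    xmlns:af="http://www.af.mil/personnel#"
    xmlns:afrl="http://www.rl.af.mil/personnel#">
  <dod:Officer rdf:ID="fsmith">
    <dod:givenName>Fred</dod:givenName>
    <dod:surname>Smith</dod:surname>
    <dod:service rdf:resource="http://www.dod.mil/services#AirForce"/>
    <af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
    <af:station rdf:resource="http://www.af.mil/stations#AFRL_Rome"/>
    <daml:equivalentTo rdf:resource="ssn:123-45-6789"/>
  </dod:Officer>
</rdf:RDF>"""

def tag_to_uri(tag):
    """Turn ElementTree's '{namespace}local' tag form into a plain URI string."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag

def triples(xml_text):
    """List (subject, property, value) statements for each top-level resource."""
    root = ET.fromstring(xml_text)
    out = []
    for node in root:
        subject = "#" + node.get(f"{{{RDF}}}ID", "")
        # The element name of the resource gives its class.
        out.append((subject, RDF + "type", tag_to_uri(node.tag)))
        for prop in node:
            # A value is either a referenced resource or literal text.
            value = prop.get(f"{{{RDF}}}resource")
            if value is None:
                value = (prop.text or "").strip()
            out.append((subject, tag_to_uri(prop.tag), value))
    return out

statements = triples(DAML_DOC)
```

Because every property and value is a namespaced URI rather than a bare tag name, two agents that both know the `af` vocabulary can agree on what `rank` means without a human in the loop.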
DAML Status
• DAML ontology language specification released and in use
• DAML services language specification draft released
• http://www.daml.org provides public Web site with DAML information
• Research and corporate teams are developing DAML tools
• Supported by W3C in the Semantic Web Activity
• Endorsed by companies and interest growing
Trustworthy Web Resources
(figure: timeline, from Berners-Lee, Hendler; Nature, 4/01)
• 1990: HyperText Markup Language, HyperText Transfer Protocol (foundation of the current Web)
• 2000: Resource Description Framework, eXtensible Markup Language (self-describing documents)
• 2010?: Proof, Logic and Ontology Languages (shared terms/terminology, machine-machine communication)
Discussion/Conclusion
• Ontologies are exploding; core of many applications
• Business “pull” is driving ontology language tools and languages
• New generation applications need more expressive ontologies and more back end reasoning
• New generation users (the general public) need more support than previous users of KR&R systems
• Scale and distribution of the web force mind shift
• Markup languages will revolutionize web applications
• Agents can be human proxies enabling new applications and modes of interaction
Some Pointers
• Ontologies Come of Age Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
• Ontologies and Online Commerce Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-and-online-commerce-abstract.html
• DAML+OIL: http://www.daml.org/
Extras
What Is An Agent?
• Software module
• Intended to act as a proxy for you in some way
• May be:
  – Tightly controlled
  – Autonomous
  – Mobile
Why Is This Important?
• Humans work sequentially
• Agents work in parallel and 24x7
• Therefore, agents can be a major productivity multiplier
Web Trends
• Web is evolving from a provider of documents and images (information retrieval)
• To a provider of services
• Web service discovery: Find me an airline service that offers flights to Singapore
• Web service execution: Buy me “Harry Potter and the Sorcerer’s Stone” at www.amazon.com
• Web service selection, composition and interoperation: Make my travel arrangements for my Internet World conference trip
• Both retrieval and services lend themselves to agent technologies
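The service discovery case ("find me an airline service that offers flights to Singapore") can be sketched as a match between a request and machine-readable service descriptions. Everything below is hypothetical: the registry, the service names, and the `category` and `destinations` fields are stand-ins for what a services markup language would describe, not an actual DAML services API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Service:
    name: str
    category: str            # term from a shared vocabulary, e.g. "airline"
    destinations: frozenset  # places the service can take you

# Hypothetical registry of annotated services, invented for illustration.
REGISTRY = [
    Service("ExampleAir", "airline", frozenset({"Singapore", "Tokyo"})),
    Service("SampleRail", "railway", frozenset({"Paris"})),
    Service("DemoJet", "airline", frozenset({"London"})),
]

def discover(category, destination, registry=REGISTRY):
    """Return names of services in `category` that serve `destination`."""
    return [
        service.name
        for service in registry
        if service.category == category and destination in service.destinations
    ]

matches = discover("airline", "Singapore")
```

An agent acting as a human proxy could run such a match over many registries in parallel and pass only the surviving candidates on to the execution and composition steps.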
Problems
• Average Web searches examine only 25% of available information
• Web searches return a lot of unwanted information
• Information content of the Web doubles approximately every six months
• Problem continues to worsen as Web grows