An Introduction to RDF
O’Reilly Enterprise Java Conference, 2001

Facilities to put machine-understandable data on the Web are becoming a high priority for many communities. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.

History

What is the Web, Really?

• Millions upon millions of computers all using the same communications protocol

Layers: TCP/IP, HTTP, HTML

HTML

<B><I><FONT FACE="Tahoma" SIZE=2>
<P ALIGN="CENTER">DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE</P>
<P ALIGN="CENTER">SAN JOSE STATE UNIVERSITY</P>
</I>
<P ALIGN="CENTER">SPRING 2000 COLLOQUIUM SERIES, PART II</P>
</FONT><I><FONT SIZE=2>
<P ALIGN="CENTER">Each talk with be on a Thursday at 3:00 p.m. in MacQuarrie Hall 523 </P>
<P ALIGN="CENTER">Please join us for refreshments beforehand, at 2:30 p.m., in MacQuarrie Hall 210</P>
</I>
<P ALIGN="CENTER">Parking available in the Seventh Street Garage at South Seventh and San Salvador Streets, San Jose, CA</P>
</FONT><I><FONT FACE="Tahoma" SIZE=2></I></FONT><FONT SIZE=2>
<P>April 6&#9;&#9;Zvezdelina Stankova-Frenkel, Mathematics, Mills College</P>
<I><P>From Desargues to Modern Algebraic Geometry</P></B></I></FONT><FONT FACE="Arial" SIZE=2>
<P>We will look at some classical plane geometry . . . mathematics. </P>

The Evolution of Web Technology

• HTML 1.0 became 2.0 became ... 4.0

• Cascading style sheets and other formatting and layout standards defined by W3C

• Proprietary technologies such as Shockwave and PDF invented

The Implicit Assumptions

• Point to point (direct) communication

• The primary task of a web server is to deliver information to a human who is asking for that information
  – Key points: to a human, already asking for information

The First Business Opportunity

• “The Web is like mail-order”

• Put catalogs on the web
  – Easy to update
  – Easy to link in auxiliary information
    • “People who bought that also bought …”
    • Availability information

• In many cases, simply putting an “HTML front end” on existing systems

Leads to Another Opportunity

• Catalogs prime the pump
  – Easy-to-understand application that is compelling
  – Side effect: lots of information is now available on the internet

• How do we take advantage of it?
  – Automate existing processes
  – Enable new applications

HTML is a Problem

• It’s a markup language based on document structure
  – Most tags are visual, about presentation
  – HTML solves document-level navigation problems, for humans
  – Lots of information encoded in images

• Fundamentally, the wrong idea.

eXtensible Markup Language (XML)

• Basically, a language for defining markup languages

• Key idea: separate data from presentation information

• Replace HTML with two things
  – A domain-specific markup language (defined in XML)
  – A map from that markup language to HTML (defined using XSL)

Split Data

<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
    <DIVISION>
      <DIVISION_NAME>East</DIVISION_NAME>
      <TEAM>
        <TEAM_CITY>Atlanta</TEAM_CITY>
        <TEAM_NAME>Braves</TEAM_NAME>
        <PLAYER>
          <SURNAME>Malloy</SURNAME>
          <GIVEN_NAME>Marty</GIVEN_NAME>
          <POSITION>Second Base</POSITION>
          <GAMES>11</GAMES>
          <GAMES_STARTED>8</GAMES_STARTED>
          <AT_BATS>28</AT_BATS>
          <RUNS>3</RUNS>
          <HITS>5</HITS>
          <DOUBLES>1</DOUBLES>
          .....

Meaning!

From: The XML Bible by Harold

From Presentation

<HTML xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <HEAD>
    <TITLE>
      <xsl:for-each select="SEASON">
        <xsl:value-of select="YEAR"/>
      </xsl:for-each>
      Major League Baseball Statistics
    </TITLE>
  </HEAD>
  <BODY>
    <xsl:for-each select="SEASON">
      <H1 ALIGN="CENTER">
        <xsl:value-of select="YEAR"/> Major League Baseball Statistics
      </H1>

Formatting!
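
The split between a data document and a presentation map can also be sketched outside XSL. The following is a minimal illustration in Python, assuming the standard `xml.etree.ElementTree` module; the trimmed `SEASON_XML` document and the `render_season` function are hypothetical stand-ins, not part of the original example. The data carries only meaning, and every formatting decision lives in the rendering function.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, trimmed data document in the spirit of the baseball example.
SEASON_XML = """
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
    <DIVISION>
      <DIVISION_NAME>East</DIVISION_NAME>
      <TEAM>
        <TEAM_CITY>Atlanta</TEAM_CITY>
        <TEAM_NAME>Braves</TEAM_NAME>
      </TEAM>
    </DIVISION>
  </LEAGUE>
</SEASON>
"""


def render_season(xml_text):
    """Map the domain markup to HTML; presentation choices live only here."""
    season = ET.fromstring(xml_text)
    year = season.findtext("YEAR", default="")
    title = escape(year) + " Major League Baseball Statistics"
    items = ""
    for team in season.iter("TEAM"):
        city = team.findtext("TEAM_CITY", default="")
        name = team.findtext("TEAM_NAME", default="")
        items += "<li>" + escape(city) + " " + escape(name) + "</li>"
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><ul>{items}</ul></body></html>"
    )


print(render_season(SEASON_XML))
```

Swapping in a different rendering function would change the page without touching the data, which is the point of the split.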

What is the Web, XML Version

• HTML is a tag language, defined using XML
  – One of many tag languages (and the likely target for XSL transformations)

Layers: TCP/IP, HTTP, XML, then XHTML and Special Purpose Tag Languages

XML Has Lots of Problems

• Everything bottoms out in strings

• DTDs provide simple structure at the level of “documents”
  – Very simple inter-document structure
  – No provisions for intra-document structure

• No support for versioning
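
The "everything bottoms out in strings" point can be seen in a few lines. This is a small sketch assuming Python's standard `xml.etree.ElementTree`; the `PLAYER` fragment reuses element names from the baseball example, and the batting-average calculation is only an illustration.

```python
import xml.etree.ElementTree as ET

player = ET.fromstring(
    "<PLAYER><GAMES>11</GAMES><AT_BATS>28</AT_BATS><HITS>5</HITS></PLAYER>"
)

hits = player.findtext("HITS")
at_bats = player.findtext("AT_BATS")

# The parser hands back text; nothing in the document says these are numbers.
print(type(hits).__name__)

# "+" on the raw values concatenates the strings instead of adding.
concatenated = hits + at_bats

# The application must know, out of band, that int() is the right conversion.
batting_average = int(hits) / int(at_bats)
print(concatenated, round(batting_average, 3))
```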

The VISA DTD

<!ELEMENT Invoice (InvoiceHeader, InvoiceDetails+, InvoiceSummary)>
<!ATTLIST Invoice sectorUsageVersion CDATA #IMPLIED >

<!ELEMENT InvoiceHeader (InvoiceType, InvoiceStatus, TaxTreatment, DiscountTreatment?, InvoiceTreatment, InvoiceNumber, InvoiceDate, TaxPointDate?, Currency, Party, Party, Party*, Payment?, PONum?, DeliveryNoteNum?, Ref*, Date*, GenText*)>

<!ELEMENT InvoiceType EMPTY>
<!ATTLIST InvoiceType
    stdValue (380|381) "380"
    stdName (UNTDID:1001) "UNTDID:1001">
<!-- 380 = Invoice
     381 = Credit Note -->

<!ELEMENT InvoiceStatus EMPTY>
<!ATTLIST InvoiceStatus
    stdValue (9|10|53) "9"
    stdName (UNTDID:1225) "UNTDID:1225">
<!-- 9 = Original, 10 = Copy, 53 = Test -->

<!ELEMENT TaxTreatment EMPTY>
<!ATTLIST TaxTreatment
    stdValue (NIL|GIL|NLL|GLL|NON) "NLL"
    stdName (VISA:TAXT) "VISA:TAXT">
<!-- NIL = Line item net amounts, invoice level tax
     GIL = Line item gross amounts, invoice level tax
     NLL = Line item net amounts, line level tax
     GLL = Line item gross amounts, line level tax
     NON = Tax does not apply to this invoice -->

<!ELEMENT DiscountTreatment EMPTY>
<!ATTLIST DiscountTreatment
    stdValue (UN|UG|TN) "UG"
    stdName (VISA:DSCT) "VISA:DSCT">
<!-- UN = Line item unit price, net of discount
     UG = Line item unit price, gross of discount
     TN = Line item sub-total, net of discount
     TG = Line item sub-total, gross of discount. -->

<!ELEMENT InvoiceTreatment EMPTY>
<!ATTLIST InvoiceTreatment
    stdValue (P|EP|E) "P"
    stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim
     S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim
     E = Printed invoice suppressed since electronic master version used for tax reclaim -->

It Gets Worse

<!ATTLIST InvoiceTreatment
    stdValue (P|EP|E) "P"
    stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim
     S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim
     E = Printed invoice suppressed since electronic master version used for tax reclaim -->

<!ELEMENT InvoiceNumber (#PCDATA)>
<!-- String, 1..35 characters -->

<!ELEMENT InvoiceDate (#PCDATA)>
<!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->

<!ELEMENT TaxPointDate (#PCDATA)>
<!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->
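
Because the DTD can only say `#PCDATA`, the constraints written in those comments have to be re-implemented by hand in every consuming program. A rough sketch of what that looks like in Python follows; the function names are hypothetical and the rules are taken only from the comments above.

```python
from datetime import datetime


def check_invoice_number(text):
    """Enforce the comment 'String, 1..35 characters' by hand."""
    return 1 <= len(text) <= 35


def parse_invoice_date(text):
    """Enforce the comment 'CCYY-MM-DDTHH:MM:SS'; raises ValueError on mismatch."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


print(check_invoice_number("INV-0001"))
print(parse_invoice_date("2001-03-27T10:30:00"))
```

None of this logic is visible to an XML parser validating against the DTD, which is why the accompanying manual has to carry it in prose.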

The Accompanying Prose

• The DTD is 4 pages

• The manual is 182 pages

The aim of this Guide is to provide sufficient information about the XML Invoice Document to enable its implementation. It documents the file structure, the business usage of the elements, and all the elements and attributes in detail.

RDF

Goal: The Semantic Web

• Different sites each maintain small amounts of information

• Sites need to refer to each other’s information with full semantic integrity
  – Information is maintained by owners and referred to by other sites
  – Information should be accessible, and coherent, in very small chunks

Needed: Precision

• Need the ability to specify things like dates, times, and monetary amounts

• Compile in those VISA comments
  – The more of this we can do, the fewer programmer-hours are needed

• Ultimately, most web-based computation will not involve a browser
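
One way to picture "precision" is a value that travels with a machine-readable type instead of a prose comment. The sketch below is an assumption-laden illustration in Python: pairing each value with an XML Schema datatype URI and the `typed_value` helper are choices made here for illustration, not something the slides prescribe.

```python
from datetime import date
from decimal import Decimal

# XML Schema datatype namespace; using it as the type vocabulary is an assumption.
XSD = "http://www.w3.org/2001/XMLSchema#"

# One converter per datatype URI, so the type is data rather than a comment.
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "date": date.fromisoformat,
}


def typed_value(text, datatype_uri):
    """Convert a string using the converter registered for its datatype URI."""
    return CONVERTERS[datatype_uri](text.strip())


print(typed_value("2001-03-27", XSD + "date"))
print(typed_value("19.99", XSD + "decimal"))
```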

Needed: Granularity

• Saying things at the “page” level is too coarse-grained

• Small chunks of data necessary
  – And ability to aggregate into larger chunks important

Use Classes and Instances

• Objects are a natural way to represent information

• A web page can contain hundreds of instances, each with its own URI

• Hard part is figuring out how to do this in a way that works on the web

Start with Resources

• A resource is a thing you talk about (can reference)

• Everything is a resource

• Resources have URIs

How to say things in RDF

• Small set of canonical tags

• Use XML syntax to define vocabularies

• Information asserted via triples
  – Assertions require three things:
    • Subject: What the assertion is about (always a resource)
    • Property: A property whose value is being asserted (always a resource)
    • Object: The value of the property (either a resource or a primitive value)
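
The triple model above can be sketched as a tiny data structure. This is a minimal illustration in Python, not an RDF library: the `Resource` and `Triple` types and the `values_of` helper are hypothetical names, and the URIs reuse the `bill:` namespace from the examples later in the deck.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """Anything that can be referenced; identified by its URI."""
    uri: str


class Triple(NamedTuple):
    subject: Resource             # always a resource
    prop: Resource                # always a resource
    value: Union[Resource, str]   # a resource or a primitive value


BILL = "http://www.grosso.org/rdfexample#"
RDF_TYPE = Resource("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

my_chevy = Resource(BILL + "MyChevy")
leg_room = Resource(BILL + "rearSeatLegRoom")

assertions = {
    Triple(my_chevy, RDF_TYPE, Resource(BILL + "MotorVehicle")),
    Triple(my_chevy, leg_room, "47"),
}


def values_of(triples, subject, prop):
    """Return every value asserted for the given subject and property."""
    return {t.value for t in triples if t.subject == subject and t.prop == prop}


print(values_of(assertions, my_chevy, leg_room))
```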

Important Tags

• rdf:Description

• rdfs:Class

• rdf:Property

• rdf:type

• rdfs:subClassOf

• rdfs:domain

• rdfs:range

Defining a Class

<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">

  <rdf:Description rdf:ID="MotorVehicle">
    <rdf:type rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
  </rdf:Description>

</rdf:RDF>

An Instance of Motor Vehicle

<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">

  <rdf:Description rdf:ID="MyChevy">
    <rdf:type rdf:resource="http://www.grosso.org/rdfexample#MotorVehicle"/>
  </rdf:Description>

</rdf:RDF>

Resources Define Tags

<rdfs:Class rdf:ID="MotorVehicle">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
</rdfs:Class>

<bill:MotorVehicle rdf:ID="MyChevy"/>

Classes

• Object-oriented notion

• There are classes, arranged in a taxonomy (with subclass relationships)

• Instances can be instances of more than one class
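
A taxonomy with multiple class membership can be sketched with two small tables. The following Python illustration is hypothetical throughout: only `MotorVehicle` and `MyChevy` come from the deck, while `Van`, `PassengerVehicle`, `MiniVan`, and `CollectorsItem` are invented here to show a subclass chain and an instance with more than one class.

```python
# Direct subclass links: class name -> set of direct superclasses.
SUBCLASS_OF = {
    "Van": {"MotorVehicle"},
    "PassengerVehicle": {"MotorVehicle"},
    "MiniVan": {"Van", "PassengerVehicle"},
}

# An instance may be typed with more than one class.
TYPES = {"MyChevy": {"MiniVan", "CollectorsItem"}}


def superclasses(cls, subclass_of):
    """Return every class reachable by following subclass links upward."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance(instance, cls, types, subclass_of):
    """True if the instance is typed as cls directly or through a subclass."""
    return any(
        cls == declared or cls in superclasses(declared, subclass_of)
        for declared in types.get(instance, ())
    )


print(is_instance("MyChevy", "MotorVehicle", TYPES, SUBCLASS_OF))
```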

Adding a Property

<rdf:Description rdf:ID="rearSeatLegRoom">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdf:Description>

<rdf:Property rdf:ID="rearSeatLegRoom">
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdf:Property>

Setting Property Values

<rdf:Description rdf:ID="MyChevy">
  <bill:rearSeatLegRoom>47</bill:rearSeatLegRoom>
</rdf:Description>
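
To see how a property value like this becomes a triple, here is a rough sketch using Python's standard `xml.etree.ElementTree`. It is not a full RDF/XML parser: it handles only `rdf:Description` elements with an `rdf:ID` and direct property children, and the `BASE` document URI and the `triples` function are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BASE = "http://www.grosso.org/rdfexample"  # assumed URI of the document

DOC = """<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bill="http://www.grosso.org/rdfexample#">
  <rdf:Description rdf:ID="MyChevy">
    <bill:rearSeatLegRoom>47</bill:rearSeatLegRoom>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text, base):
    """Yield (subject, property, value) for each property element found."""
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # rdf:ID names a fragment of the containing document.
        subject = base + "#" + desc.get(f"{{{RDF_NS}}}ID")
        for prop in desc:
            # ElementTree spells a qualified tag as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            found.append((subject, namespace + local, value))
    return found


for triple in triples(DOC, BASE):
    print(triple)
```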

Properties

• Similar to fields (data members, attributes...)

• Big difference: they’re first-class objects
  – Defined independently of classes
  – Asserted independently of classes
    • Classes don’t come with a set of data members
    • Other people (other pages) can assert properties about your classes and instances without your knowledge or permission
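
Because properties are asserted independently of classes, statements from unrelated sites can simply be pooled. The sketch below illustrates that in Python with plain tuples; the `example.org` URIs, the two source sets, and the `describe` helper are all hypothetical.

```python
# Hypothetical subject published by one site.
BLENDER = "http://example.org/catalog#Blender42"

# Assertions made by the owner of the catalog.
catalog = {
    (BLENDER, "http://example.org/catalog#price", "39.95"),
}

# Assertions made by a different site, without the owner's involvement.
reviews = {
    (BLENDER, "http://example.org/reviews#rating", "4"),
}

# Merging is set union: no schema change and no coordination required.
merged = catalog | reviews


def describe(graph, subject):
    """Return the sorted (property, value) pairs asserted about a subject."""
    return sorted((p, v) for s, p, v in graph if s == subject)


for prop, value in describe(merged, BLENDER):
    print(prop, value)
```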

The Web of Knowledge

• Corning Fiberglass has a product catalog

• Home Appliances: defines things like “Blender”

• Sears has an on-line store that uses (and extends) both of these as standard vocabularies

The Web of Knowledge

Diagram regions: Terminology; Public Opinion and Ratings

• Corning Fiberglass has a product catalog

• Home Appliances: defines things like “Blender”

• Sears has an on-line store that uses (and extends) both of these as standard vocabularies

• Consumer Reports uses the product catalogs and attaches more information to them

What is the Web, RDF Version

• Usually called “The Semantic Web”

Layers: TCP/IP, HTTP, XML, RDF and RDF-Schema, Schemas, Instances

Further Information

• http://www.w3.org/RDF/

• http://www.w3.org/2001/sw/

• http://www.semanticweb.org/

• http://www.mozilla.org/rdf/doc/

• http://www.xml.com/pub/a/2001/01/24/rdf.html

• http://xml.coverpages.org/rdf.html

Programmatic Resources

• Protege (http://www-smi.stanford.edu/projects/protege)

• RDF DB (http://web1.guha.com/rdfdb/)

• Redland (http://www.redland.opensource.ac.uk/)

• Java API (http://www-db.stanford.edu/~melnik/rdf/api.html)

• Squish (http://swordfish.rdfweb.org/rdfquery/)

High Profile Uses

• Electric Power Industry (http://www.langdale.com.au/XMLCIM.html)

• DMOZ (http://www.dmoz.org/)

• Epinions (http://www.epinions.com)

• DAML (www.daml.org)