vocabulary markup language (voc-ml) project

Vocabulary Markup Language (Voc-ML) Project Joseph A. Busch Content Intelligence Evangelist Interwoven



Page 1: Vocabulary Markup Language  (Voc-ML) Project

Vocabulary Markup Language (Voc-ML) Project

Joseph A. Busch

Content Intelligence Evangelist

Interwoven

Page 2: Vocabulary Markup Language  (Voc-ML) Project

Agenda

Background

Revision summary

Issues and Next steps

Page 3: Vocabulary Markup Language  (Voc-ML) Project

Soergel’s SemWeb Proposal

System of integrated access to data on concepts and terminology.

Bring together variety of sources that exist largely in separate worlds, including dictionaries, thesauri, classification schemes, etc.

Federated system with multiple collaborators.

Common interface to all concept & terminology knowledge bases on the Internet.

Page 4: Vocabulary Markup Language  (Voc-ML) Project

The Real Semantic Web

Namespace for uniquely identifying a semantic scheme & each concept within each scheme.

Broad template or conceptual schema for holding all types of semantic information & specifying relationships among them.

Definitions of services for interacting with the System.

Page 5: Vocabulary Markup Language  (Voc-ML) Project

Vocabulary Markup Language (Voc-ML)

XML schema for the Semantic Web.

Broad template for structured representation of semantic schemes.

Dublin Core metadata.

Tags and syntax for uniquely identifying each concept.

Typed relationships (hierarchical, associative, etc.)

Typed notes.

Host agency: Networked Knowledge Organization Systems (nkos.slis.kent.edu)
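The idea of uniquely identifying each concept within a scheme can be sketched in a few lines of Python. This is only an illustration: the `scheme::local` shape and the `::` delimiter are inferred from the sample record on Page 6 (for example `DFSIC-1998::0139`), the UID syntax is listed later as an open issue, and `ConceptUID`, `parse_uid`, and `format_uid` are hypothetical names, not part of Voc-ML.

```python
from typing import NamedTuple


class ConceptUID(NamedTuple):
    """A scheme-qualified concept identifier (hypothetical helper type)."""
    scheme: str    # identifies the semantic scheme, e.g. "DFSIC-1998"
    local_id: str  # identifies the concept inside that scheme, e.g. "0139"


def parse_uid(uid: str, delimiter: str = "::") -> ConceptUID:
    """Split 'scheme::local' into its two parts.

    The '::' delimiter is an assumption taken from the sample record,
    not from a published Voc-ML specification.
    """
    scheme, sep, local_id = uid.partition(delimiter)
    if not sep or not scheme or not local_id:
        raise ValueError(f"not a scheme-qualified UID: {uid!r}")
    return ConceptUID(scheme, local_id)


def format_uid(uid: ConceptUID, delimiter: str = "::") -> str:
    """Inverse of parse_uid: join scheme and local ID back into one string."""
    return f"{uid.scheme}{delimiter}{uid.local_id}"


if __name__ == "__main__":
    print(parse_uid("DFSIC-1998::0139"))
```

Keeping the scheme and the local ID separate means two schemes can reuse the same local code without colliding.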

Page 6: Vocabulary Markup Language  (Voc-ML) Project

<?xml version="1.0"?>
<!DOCTYPE VocML SYSTEM "VocML.dtd">
<Voc-ML version="1.99">
  <SrcVocab>
    <SVHeader>
      <dc:Title>DFSIC-1998</dc:Title>
      <dc:Source>Standard Industrial Classification (1987)</dc:Source>
      <dc:Creator>Interwoven</dc:Creator>
      <dc:Contributor>U.S. Department of Commerce</dc:Contributor>
      <workNum UIDprefix="DFSIC-1998" DisplayTitle="Standard Industrial Classification" BriefDisplay="SIC"/>
    </SVHeader>
    <SVTerm UID="DFSIC-1998::0139" CCID="104:43">
      <label>Field Crops, except Cash Grains, not elsewhere classified</label>
      <definition>Establishments primarily engaged in the production of field crops, except cash grains, not elsewhere classified. This industry also includes establishments deriving 50 percent or more of their total value of sales of agricultural products from field crops, except cash grains (Industry Group 013), but less than 50 percent from products of any single industry.</definition>
      <cla>0139</cla>
      <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT"/>
      <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT"/>
    </SVTerm>
  </SrcVocab>
</Voc-ML>

Slide callouts: Dublin Core (the dc: elements), Unique ID (the UID attribute), Typed Relationships (the typedRelation elements).
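As a rough illustration of how a consumer might read such a record, the following Python sketch pulls out the Dublin Core header fields, the unique ID, and the typed relationships. It uses a trimmed copy of the sample so that it is self-contained. The explicit `xmlns:dc` declaration and the closing tags are assumptions added so that a namespace-aware parser accepts the fragment, and `read_terms` is a hypothetical helper, not part of any Voc-ML tooling.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element-set namespace URI.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the slide's record. The xmlns:dc declaration and the
# closing tags are assumptions made so the fragment parses on its own.
SAMPLE = """\
<Voc-ML version="1.99">
  <SrcVocab xmlns:dc="http://purl.org/dc/elements/1.1/">
    <SVHeader>
      <dc:Title>DFSIC-1998</dc:Title>
      <dc:Creator>Interwoven</dc:Creator>
    </SVHeader>
    <SVTerm UID="DFSIC-1998::0139">
      <label>Field Crops, except Cash Grains, not elsewhere classified</label>
      <typedRelation UREF="DFSIC-1998::013" UTYPE="Z39.19-1980::2" Name="BT"/>
      <typedRelation UREF="DFSIC-1998::013900" UTYPE="Z39.19-1980::3" Name="NT"/>
    </SVTerm>
  </SrcVocab>
</Voc-ML>
"""


def read_terms(xml_text):
    """Return (header, terms) extracted from a Voc-ML-style document."""
    root = ET.fromstring(xml_text)

    # Collect Dublin Core elements from the header, keyed by local name.
    header = {}
    prefix = "{" + DC_NS + "}"
    header_el = root.find("SrcVocab/SVHeader")
    if header_el is not None:
        for el in header_el:
            if el.tag.startswith(prefix):
                header[el.tag[len(prefix):]] = el.text

    # Collect each term's UID, label, and (relation name, target UID) pairs.
    terms = []
    for term in root.iter("SVTerm"):
        relations = [(rel.get("Name"), rel.get("UREF"))
                     for rel in term.findall("typedRelation")]
        terms.append({
            "uid": term.get("UID"),
            "label": term.findtext("label"),
            "relations": relations,
        })
    return header, terms


if __name__ == "__main__":
    header, terms = read_terms(SAMPLE)
    print(header)
    for term in terms:
        print(term["uid"], term["relations"])
```

Because every relationship is a typedRelation element carrying a Name and a target UREF, one loop handles broader and narrower terms alike.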

Page 7: Vocabulary Markup Language  (Voc-ML) Project

Agenda

Background

Revision summary

Issues and Next steps

Page 8: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Version 1.7 Revisions

Added editDate and source attributes to all “customer” updatable elements.

Removed most remnants of Datafusion Concept Catalog, including CCID, XWalkHeader, etc.

Page 9: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Version 1.8 Revisions

Defined and commented Path ID syntax and usage. PID allows encoding of complex polyhierarchies with context-dependent children, e.g., MeSH. *

* This is still in flux. Currently all path information can be encoded in parent/child tags, but current Interwoven software will not read path tags.
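To make "context-dependent children" concrete, here is a small Python sketch of a polyhierarchy in which the children of a concept depend on the path used to reach it. The slides do not show the actual PID syntax, so everything here is an assumption: a path is modelled as a tuple of UIDs from the root down, and `PathHierarchy` and the `X::` identifiers are hypothetical.

```python
from collections import defaultdict


class PathHierarchy:
    """Children are stored per path, not per concept (hypothetical model)."""

    def __init__(self):
        # Maps a path (tuple of UIDs, root first) to its list of child UIDs.
        self._children = defaultdict(list)

    def add_path(self, path):
        """Register every parent-to-child step along one root-to-leaf path."""
        for depth in range(1, len(path)):
            parent_path = tuple(path[:depth])
            child = path[depth]
            if child not in self._children[parent_path]:
                self._children[parent_path].append(child)

    def children(self, path):
        """Children of the last concept on `path`, in that context only."""
        return list(self._children.get(tuple(path), []))

    def parents(self, uid):
        """All distinct direct parents of a concept, across every path."""
        found = []
        for parent_path, kids in self._children.items():
            if uid in kids and parent_path[-1] not in found:
                found.append(parent_path[-1])
        return found


if __name__ == "__main__":
    h = PathHierarchy()
    # The same concept, X::shared, sits under two different parents.
    h.add_path(("X::A", "X::shared", "X::c1"))
    h.add_path(("X::B", "X::shared", "X::c2"))
    print(h.children(("X::A", "X::shared")))
    print(h.children(("X::B", "X::shared")))
    print(h.parents("X::shared"))
```

Plain parent/child links cannot express this case, because they attach children to the concept itself rather than to the route taken to reach it.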

Page 10: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Version 1.9 Revisions

Removed more excess elements from Datafusion Concept Catalog, including CCLoadFile, ForbiddenIDs.

Removed all instances of CTYPE attribute. (New ID reference semantics).

Changed optional Note & Misc elements to be allowed only once, not many times. *

* This is still in flux. Currently reflects what Interwoven software does, not what the standard should do.

Page 11: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Version 1.95 Revisions

Added editHistory and Edit elements in SVHeader.

Added editDate and Source attributes to Path element.

Edited and updated comments on Path, Parent, & Child to reflect new semantics of the ID reference attributes in these elements.

Added comments on commonHeaderBlock parameter entity to explain Dublin Core and cite DC reference.

Cleaned up many other comments for readability.

Page 12: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Version 1.99 Revisions

Added xmlns:dc attribute to root SrcVocab element.

Changed PID attribute in path to CDATA.

Changed UREF attribute in child to CDATA.

Made editHistory in the SVHeader optional.

Page 13: Vocabulary Markup Language  (Voc-ML) Project

Agenda

Background

Revision summary

Issues and Next steps

Page 14: Vocabulary Markup Language  (Voc-ML) Project

Voc-ML Issues

UID syntax (Light, 5/25/01)

Topic maps vs Voc-ML

Functions to be served by standards for machine-readable thesauri (Soergel, 11/21/01)

    Data input, transfer

    Query and view

URIs

Normalization vs specialization (ASIS meeting, 11/13/01)

    Type all relationships (remove Parent, Child, RelatedTerm elements)

    Type all notes (remove Definition element, add type for Note)
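The normalization option could look roughly like the following Python sketch, which rewrites specialized elements into typed ones. It is an illustration under assumptions, not the proposed design: BT and NT are taken from the sample record on Page 6, while the RT name, the `type` attribute on Note, the exact element spellings, and the `normalize_term` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Specialized element names from the slide mapped to relation type names.
# BT and NT appear in the sample record; RT is an assumption.
RELATION_NAMES = {"Parent": "BT", "Child": "NT", "RelatedTerm": "RT"}


def normalize_term(term):
    """Rewrite the specialized children of one term element in place."""
    for index, child in enumerate(list(term)):
        if child.tag in RELATION_NAMES:
            # Keep the original attributes (e.g. UREF) and add the type name.
            typed = ET.Element("typedRelation", dict(child.attrib))
            typed.set("Name", RELATION_NAMES[child.tag])
            typed.tail = child.tail
            term[index] = typed
        elif child.tag == "Definition":
            # A definition becomes a note with a (hypothetical) type attribute.
            note = ET.Element("Note", dict(child.attrib))
            note.set("type", "definition")
            note.text = child.text
            note.tail = child.tail
            term[index] = note
    return term


if __name__ == "__main__":
    term = ET.fromstring(
        '<SVTerm UID="X::1">'
        '<Parent UREF="X::0"/>'
        '<Child UREF="X::11"/>'
        '<Definition>An example concept.</Definition>'
        '</SVTerm>'
    )
    normalize_term(term)
    print(ET.tostring(term, encoding="unicode"))
```

The trade-off is visible in the code: the normalized form needs only two element names, but a reader must inspect an attribute to learn what kind of relationship or note it is.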

Page 15: Vocabulary Markup Language  (Voc-ML) Project

Proposed Next Steps

Post Voc-ML 2.0

Provide examples of marked-up resources

Prepare W3C RFC

Page 16: Vocabulary Markup Language  (Voc-ML) Project

Joseph A. Busch Director, Solutions Architecture

Interwoven

415-778-3129

fax 415-778-3131

[email protected]