Strategies for subject navigation of linked Web sites using RDF topic maps
Carol Jean Godby
Devon Smith
OCLC Online Computer Library Center
Knowledge Technologies 2002 – Seattle, WA
Complex Web sites
Many institutions are struggling to solve problems with their official Web sites.
But:
- The contents constantly change.
- The editors can’t exercise sufficient control.
One result: an institution’s major presence on the Web is difficult to navigate.
The Semantic Web
Tim Berners-Lee’s vision:
“The current Web has documents for people, not computers. By augmenting Web pages with data designed for automated processing, users will transform the Web into the Semantic Web.”
“Computers will find the meaning of semantic data by following hyperlinks to definitions of key terms and rules for reasoning about them logically.”
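The Semantic Web: An Architecture
[Figure: layered architecture diagram. Layers, from bottom to top:]
- Unicode, URI
- XML + XML namespaces + XML schema
- RDF + RDF schema
- Ontology vocabulary
- Logic
- Proof
- Trust
[Also shown: Digital signature. Side labels: Self-describing documents, Data, Rules.]
Source: Tim Berners-Lee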
The promise of the Semantic Web
- A common data model
- Conceptual links
- Limited inferences
Our demo: goals
- Represent subject/topic information obtained from different sources.
- Demonstrate the value of hypothetical metadata-based navigation for a collection of related Web sites:
  - oclc.org
  - Portions of w3c.org
  - dublincore.org
- Develop and evaluate the utility of Open Source prototyping tools based on RDF.
Some common topics
[Figure: topics found on the three sites oclc.org, w3c.org, and dublincore.org. Terms shown:]
digital library, xml, dublin core, xml namespace, xml schema, metadata, xml fragment, xml stylesheet, element node, dc element syntax, library automation, classification, traditional library, library users, library network, xml profile, schema processor, uri syntax
Sources of subject/topic metadata
- HTML keywords
- Subject lines in email messages
- An index of library/information science terms
- Terms extracted automatically from text using natural-language-processing algorithms
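The first source above, HTML keywords, can be illustrated with a minimal Python sketch. The slides say only that the project's metadata scraper was written in Perl. The class name `KeywordScraper`, the helper `html_keywords`, the normalization steps, and the sample page are all assumptions made for illustration, not the project's code.

```python
from html.parser import HTMLParser


class KeywordScraper(HTMLParser):
    """Collect terms from <meta name="keywords" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "keywords":
            return
        for term in attr.get("content", "").split(","):
            # Collapse runs of whitespace and lowercase (an assumed normalization).
            term = " ".join(term.split()).lower()
            if term and term not in self.keywords:
                self.keywords.append(term)


def html_keywords(page):
    """Return the de-duplicated keyword terms found in an HTML page."""
    scraper = KeywordScraper()
    scraper.feed(page)
    scraper.close()
    return scraper.keywords


# Hypothetical sample page, not taken from any of the three sites.
sample = """<html><head>
<meta name="Keywords" content="Dublin Core, metadata,  XML  namespace, metadata">
</head><body>...</body></html>"""

print(html_keywords(sample))
```

Terms from the other sources (email subject lines, an index of terms, automatically extracted phrases) would need their own extractors before being merged into one topic list.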
Some term relationships
- Singular/Plural: library, libraries
- Acronyms:
  - Standard Generalized Markup Language -- SGML
  - Library of Congress Subject Headings -- LCSH
- Coordination:
  - library and information science -- library science, information science
  - information storage and retrieval -- information storage, information retrieval
- Broad/Narrow:
  - computational linguistics -- linguistics
  - classification scheme -- classification
- Type-of: library -- digital library, traditional library
- Related: library -- library classification scheme, library automation
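Two of the relationships above, Singular/Plural and Acronym, are regular enough to detect with simple string rules. The Python sketch below is an illustration only: the stopword list, the suffix rules, and the function names are assumptions, and the slides do not say how the project detected these relationships. Coordination, Broad/Narrow, Type-of, and Related need more linguistic knowledge and are not attempted here.

```python
# Assumed list of short words skipped when forming an acronym.
STOPWORDS = {"of", "and", "the", "for", "in"}


def acronym_of(phrase):
    """Build an acronym from the initials of the non-stopword words."""
    words = [w for w in phrase.split() if w.lower() not in STOPWORDS]
    return "".join(w[0].upper() for w in words)


def singular(term):
    """Very naive English singularization (assumed rules, not a stemmer)."""
    t = term.lower()
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"
    if t.endswith("s") and not t.endswith("ss"):
        return t[:-1]
    return t


def relationship(a, b):
    """Return 'Singular/Plural', 'Acronym', or None for a pair of terms."""
    if a.lower() != b.lower() and singular(a) == singular(b):
        return "Singular/Plural"
    if acronym_of(a) == b or acronym_of(b) == a:
        return "Acronym"
    return None


print(relationship("Library", "libraries"))
print(relationship("Standard Generalized Markup Language", "SGML"))
print(relationship("Library of Congress Subject Headings", "LCSH"))
```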
An RDF encoding
<Topic rdf:about="http://purl.org/rdf/topics/classification">
  <name>classification</name>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_codes"/>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_number"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/automatic_classification"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/library_classification"/>
  <coordinate rdf:resource="http://purl.org/rdf/topics/resource_discovery_and_classification"/>
  <coordinate rdf:resource="http://purl.org/rdf/topics/classification_and_knowledge"/>
</Topic>
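To show how an encoding like this can be read by a program, here is a minimal Python sketch that turns a Topic description into (subject, property, value) triples with the standard library's `xml.etree.ElementTree`. The slide fragment does not declare its namespaces, so the `rdf:RDF` wrapper and the vocabulary namespace `http://example.org/topicmap#` are hypothetical additions needed to make the sample well-formed. The function name `topic_triples` is also an assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TM_NS = "http://example.org/topicmap#"  # hypothetical vocabulary namespace

# Shortened sample; the wrapper element and xmlns declarations are assumed.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{TM_NS}">
  <Topic rdf:about="http://purl.org/rdf/topics/classification">
    <name>classification</name>
    <related_concepts rdf:resource="http://purl.org/rdf/topics/classification_codes"/>
    <types_of rdf:resource="http://purl.org/rdf/topics/library_classification"/>
  </Topic>
</rdf:RDF>"""


def topic_triples(xml_text):
    """Return (subject, property, value) triples for every Topic element."""
    root = ET.fromstring(xml_text)
    triples = []
    for topic in root.findall(f"{{{TM_NS}}}Topic"):
        subject = topic.get(f"{{{RDF_NS}}}about")
        for prop in topic:
            # Strip the "{namespace}" prefix ElementTree puts on tag names.
            predicate = prop.tag.split("}", 1)[1]
            target = prop.get(f"{{{RDF_NS}}}resource")
            value = target if target is not None else (prop.text or "").strip()
            triples.append((subject, predicate, value))
    return triples


for triple in topic_triples(DOC):
    print(triple)
```

A full RDF parser would also handle other serializations of the same graph. This sketch reads only the flat Topic pattern shown on the slide.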
Connected RDF encodings
<Topic rdf:about="http://purl.org/rdf/topics/resource_discovery">
  <name>resource discovery</name>
  <broad_concepts rdf:resource="http://purl.org/rdf/topics/resource"/>
</Topic>

<Topic rdf:about="http://purl.org/rdf/topics/resource">
  <name>resource</name>
  <related_concepts rdf:resource="http://purl.org/rdf/topics/resource_discovery"/>
  <types_of rdf:resource="http://purl.org/rdf/topics/resource_description_framework"/>
  <related rdf:resource="http://purl.org/rdf/topics/web_resource"/>
</Topic>
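Because each `rdf:resource` value names another topic, the encodings form a graph that a navigation interface can walk. The Python sketch below follows the links in the two encodings above, starting from one topic and collecting everything within a given number of hops. Topic identifiers are shortened to the last URI segment with spaces written as underscores. That shortening, the `EDGES` list layout, and the function names are assumptions for illustration.

```python
from collections import deque

# (subject, property, object) links taken from the two Topic encodings above.
EDGES = [
    ("resource_discovery", "broad_concepts", "resource"),
    ("resource", "related_concepts", "resource_discovery"),
    ("resource", "types_of", "resource_description_framework"),
    ("resource", "related", "web_resource"),
]


def neighbors(topic):
    """Return (property, target) pairs for links leaving a topic."""
    return [(p, o) for s, p, o in EDGES if s == topic]


def reachable(start, max_hops):
    """Breadth-first search: map each reachable topic to its hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        topic = queue.popleft()
        if seen[topic] == max_hops:
            continue  # do not expand beyond the hop limit
        for _, target in neighbors(topic):
            if target not in seen:
                seen[target] = seen[topic] + 1
                queue.append(target)
    del seen[start]
    return seen


print(reachable("resource_discovery", 2))
```

A user interface could use the hop distance to decide how prominently to display each linked topic, and the property name to label the link.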
A graphical representation of relationships
[Figure: graph of topic nodes connected by labeled edges.]
Nodes: classification, classification codes, automatic classification, resource discovery and classification, resource discovery, resource, resource description framework, rdf
Edge labels: Coordination, Broad/Narrow, Type_of, Related, Acronym
The philosophy of our system
- Modular
- Open Source
- Project Web site accessible at: topicmap.oclc.org:5000
Term filters: using knowledge encoded in the text
Positive contexts for terms: study of, information about, professor of, department of
information science, metadata applications, data processing, automatic classification, computational linguistics, internet resources
Negative contexts for terms: very different things, few messages, good point, interesting example, appealing idea, small extension, terse document, simple kind
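One way to read the two lists above is as a pair of filters on candidate phrases: a phrase that follows a cue such as "professor of" is likely a real topic, while a phrase that opens with an evaluative or quantifying word is likely not. The Python sketch below implements that reading. The modifier list is inferred from the first words of the negative examples, and the three-way result (`accept`, `reject`, `undecided`) and the function names are assumptions. The slides do not describe how the actual filters were combined.

```python
import re

# Cue phrases listed on the slide as positive contexts.
POSITIVE_CONTEXTS = ["study of", "information about", "professor of", "department of"]

# First words of the slide's negative examples (an inferred, assumed list).
NEGATIVE_MODIFIERS = {
    "very", "few", "good", "interesting", "appealing", "small", "terse", "simple",
}


def in_positive_context(term, text):
    """True if the term appears directly after one of the positive cue phrases."""
    cues = "|".join(re.escape(c) for c in POSITIVE_CONTEXTS)
    pattern = r"\b(?:" + cues + r")\s+" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def classify(term, text):
    """Label a candidate term as 'accept', 'reject', or 'undecided'."""
    words = term.lower().split()
    if not words or words[0] in NEGATIVE_MODIFIERS:
        return "reject"
    if in_positive_context(term, text):
        return "accept"
    return "undecided"


# Hypothetical sample text, written for this example.
text = (
    "She is a professor of computational linguistics. "
    "That was a good point about internet resources."
)
for candidate in ["computational linguistics", "good point", "internet resources"]:
    print(candidate, "->", classify(candidate, text))
```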
System architecture: 2
Harvester (Perl)
File System (HTML)
Metadata Scraper (Perl)
File System (Normalized HTML)
Term manipulator (Java)
File System (XML/RDF)
XML/RDF Loader
Database
Open issues
- RDF knowledge in the user interface.
- Encoding in RDF or XML?
- The construction of knowledge ontologies.
Conclusions
- The enterprise succeeds or fails on the strength of the knowledge ontology.
- RDF and the XTM standard are descriptively equivalent for our work.
- Sophisticated user interface design is required to exploit all of the encoded information.
For more information
- Sharon Caraballo. Automatic Construction of a Hypernym-Labeled Noun Hierarchy. PhD dissertation. Brown University, 2001.
- Carol Jean Godby. A Computational Study of Lexicalized Noun Phrases in English. PhD dissertation. The Ohio State University, 2002.