Overview of technological solutions to terminology services Doug Tudhope Hypermedia Research Unit University of Glamorgan JISC Terminology Workshop, London, February 2004

Overview of technological solutions to terminology services

Doug Tudhope

Hypermedia Research Unit

University of Glamorgan

JISC Terminology Workshop, London, February 2004

Presentation

Networked Knowledge Organisation Systems/Services

Broad review of technological approaches

NKOS Lifecycle

Introduce Workshop Demonstrations

Critical Issues and possible gaps

References

Taxonomy of Knowledge Organisation Systems

Term Lists

Authority Files, Glossaries, Gazetteers, Dictionaries

Classification and Categorization

Subject Headings

Classification Schemes and Taxonomies, eg DDC, scientific taxonomies

Relationship Schemes

Thesauri

Semantic Networks (eg WordNet)

(Ontologies)

Hodg00, http://www.clir.org/pubs/abstract/pub91abst.html

KOS ctd.

Thesauri

3 Standard Relationships between concepts (Aitc00)

Equivalence, Hierarchical, Associative

Inherent domain lexicon (lead-in vocabulary)

Concept definitions and warrant (Scope Notes)

Ontologies

Higher level conceptualisation (McGu02, Noy)

formal definition of relationships

inference rules and definition of roles (sometimes)

KOS an element of ontologies and schemas

Jaco03, Ontologies and the Semantic Web,

ASIST Bulletin, April/May 2003, Special Issue on Semantic Web
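The thesaurus features named on this slide (equivalence, hierarchical and associative relationships, lead-in vocabulary, scope notes) can be pictured as a small data structure. The following is a minimal sketch only: the class names, method names and sample terms are invented for illustration and are not taken from Aitc00 or from any published thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept with the three standard relationship types."""
    preferred: str                              # preferred term (descriptor)
    use_for: set = field(default_factory=set)   # equivalence: lead-in terms (UF)
    broader: set = field(default_factory=set)   # hierarchical: BT
    narrower: set = field(default_factory=set)  # hierarchical: NT
    related: set = field(default_factory=set)   # associative: RT
    scope_note: str = ""                        # definition / warrant


class Thesaurus:
    def __init__(self):
        self.concepts = {}  # preferred term -> Concept
        self.lead_in = {}   # any known term (lower-cased) -> preferred term

    def add(self, preferred, use_for=(), scope_note=""):
        concept = Concept(preferred, set(use_for), scope_note=scope_note)
        self.concepts[preferred] = concept
        for term in (preferred, *use_for):
            self.lead_in[term.lower()] = preferred
        return concept

    def link_hierarchy(self, broader, narrower):
        """Record BT/NT as a reciprocal pair."""
        self.concepts[broader].narrower.add(narrower)
        self.concepts[narrower].broader.add(broader)

    def link_related(self, a, b):
        """Record RT symmetrically."""
        self.concepts[a].related.add(b)
        self.concepts[b].related.add(a)

    def resolve(self, term):
        """Map a searcher's term to the preferred term, or None if unknown."""
        return self.lead_in.get(term.lower())


# Illustrative, made-up entries.
t = Thesaurus()
t.add("vehicles")
t.add("cars", use_for=["automobiles", "motor cars"],
      scope_note="Passenger road vehicles.")
t.add("roads")
t.link_hierarchy("vehicles", "cars")
t.link_related("cars", "roads")
```

In this sketch the lead-in vocabulary is just a lookup table from entry terms to preferred terms, which is what lets a searcher's own wording reach the controlled term.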

Recent Sources

NKOS: Networked Knowledge Organization Systems/Services

http://jodi.ecs.soton.ac.uk/?vol=4&iss=4 NKOS JoDI Special Issue

http://www.multites.com/conference03.htm MultiTes Conference

http://nkos.slis.kent.edu/ JCDL and ECDL Workshops 2003

http://www.lub.lu.se/SEMKOS/ SEMKOS IP Proposal Resources

Semantic Web - RDF/XML, RDF Schema, Metalog, OWL

http://www.w3.org/2001/sw/ W3C Semantic Web Activity

http://www.semanticweb.org/

http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb

http://www.w3c.rl.ac.uk/SWAD/thesaurus.html SWAD-Europe Thesaurus index

Semantic Grid - Semantic Web, Web service, eScience, GRID links

http://www.semanticgrid.org/

http://www.w3.org/2002/ws/ W3C Web Services Activity

http://www.ariadne.ac.uk/issue29/gardner/intro.html Gardner’s Intro to Web Services

JISC Application Area

Search/retrieval for educational purposes(?)

students, teachers, researchers

possibly

Generalised search

possibly integrated into applications

triggered to take account of context (eg Brow02)

link eScience applications?

Current operational systems (eg RDN)

lack terminology services

some browsing categories

but not integrated into search

Technologies

Information Science

Controlled Terminology

Information Retrieval (probabilistic, full text)

Intellectual/Automatic Indexing

search/browse, user interfaces

Facet Analysis

Ontology Engineering (AI Knowledge Representation)

formal (finer grained) representation, description logics

automated reasoning, Semantic Web

Distributed Systems

Z39.50, Web Services, Semantic Grid

Language Engineering

Social Engineering

Enriching / Formalising KOS

KOS Legacy - large (multilingual) vocabularies, indexed multimedia (and print) collections

Products of peer review that follow standards

However

Not utilised to full potential in some applications

Designed for human inspection, semantic structure not explicitly represented

May be inconsistently evolved from various sources

Opportunity to formalise / enrich

Partly a matter of representation in RDF/XML

but may be inconsistencies in logical structure

--> deconstruction and ontological formalisation

--> mutually exclusive concept structures

Facet Analysis (a link between technologies)

Fundamental categories / foundational concepts

eg CRG: Entity, Part, Property, Material, Process, Operation, Product, Agent, Space, Time, ...

Mapped to facets for particular KOS

Basis of several scientific and industrial KOS

Synthesis rules for principled combination of concepts

rules for combining base concepts when indexing/querying

Browsing and Searching applications
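Synthesis, as described on this slide, combines base concepts from different facets according to a rule. A minimal sketch follows, assuming that the facet sequence listed on the slide is used directly as a citation order. That is an assumption made for illustration, as are the function names and sample concepts.

```python
# Facet names taken from the slide; using this sequence as the citation
# order is an assumption of this sketch.
CITATION_ORDER = ["Entity", "Part", "Property", "Material", "Process",
                  "Operation", "Product", "Agent", "Space", "Time"]


def synthesise(assignments):
    """Combine base concepts into one compound, ordered by facet.

    assignments: dict mapping facet name -> concept label.
    Returns a list of (facet, concept) pairs in citation order.
    Raises ValueError for a facet that is not in the scheme.
    """
    unknown = set(assignments) - set(CITATION_ORDER)
    if unknown:
        raise ValueError(f"unknown facet(s): {sorted(unknown)}")
    return [(f, assignments[f]) for f in CITATION_ORDER if f in assignments]


def as_string(compound, separator=" : "):
    """Render a synthesised compound as a single index string."""
    return separator.join(concept for _, concept in compound)


# Made-up example: the input order does not matter, the rule decides.
compound = synthesise({"Material": "oak", "Entity": "chairs",
                       "Time": "18th century"})
print(as_string(compound))  # chairs : oak : 18th century
```

Because indexer and searcher apply the same rule, the same combination of concepts yields the same compound whichever order the concepts were chosen in.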

KOS integration into DL services (from Hill02, Research Agenda KOS/DL)

Taxonomy of KOS - KOS types linked to DL service protocols

Registries of KOS and KOS-level metadata to represent them

XML/RDF KOS representations - customisable

Core set of relationship types across all KOS

General KOS service protocol

from which protocols for specific types of KOS can be derived

Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships)

Visualization tools that fully use and display the rich semantics embedded in KOS

=> move towards a model of search service ‘flow’?

- how semantic search services combine
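The idea of a general KOS service protocol, from which protocols for specific KOS types are derived, can be sketched as a small class hierarchy. Everything here is hypothetical: the class names, method names and relationship labels are invented for illustration and do not reproduce the protocol proposed in Hill02 or any implemented service.

```python
from abc import ABC, abstractmethod


class KOSService(ABC):
    """Hypothetical general protocol shared by all KOS types."""

    @abstractmethod
    def lookup(self, label):
        """Return identifiers of concepts whose label matches `label`."""

    @abstractmethod
    def related(self, concept_id, relationship):
        """Return concept identifiers linked by a core relationship type."""


class TermListService(KOSService):
    """Simplest derived protocol: labels only, no relationships."""

    def __init__(self, labels):
        self._labels = dict(labels)  # concept_id -> label

    def lookup(self, label):
        wanted = label.lower()
        return [cid for cid, text in self._labels.items()
                if text.lower() == wanted]

    def related(self, concept_id, relationship):
        return []  # a plain term list carries no relationships


class ThesaurusService(TermListService):
    """Derived protocol that adds typed links between concepts."""

    def __init__(self, labels, links):
        super().__init__(labels)
        self._links = list(links)  # (source_id, relationship, target_id)

    def related(self, concept_id, relationship):
        return [t for s, r, t in self._links
                if s == concept_id and r == relationship]


# Made-up data: a client written against KOSService works with either type.
service = ThesaurusService({"c1": "Vehicles", "c2": "Cars"},
                           [("c2", "broader", "c1")])
```

A client that only knows the general protocol can still call `related` on a term list and simply receives an empty result, which is one way a core set of relationship types could be shared across KOS types.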

Terminology Services (from Koch04, Structured Overview - Activities to advance the powerful use of vocabularies)

Searching for concepts

schemes in registries

concepts/terms in taxonomy servers

Search support for queries

collection finding

cross-searching, cross-browsing, mapping services

KOS browsing and user interface/visualisation

query expansion, disambiguation

automatic indexing and classification

extraction/mining of terms

translation support using vocabularies

Workshop Demonstrations

… in context of NKOS Information Lifecycle

NKOS Information Lifecycle

KOS creation and maintenance

Mapping, merging vocabularies

Document creation and maintenance

Indexing, classification, annotation

intellectual, automatic

Discovery of services and databases/collections

Searching for concepts --> controlled terminology, auto-disambiguation

Querying and result display

Cross-searching, cross-browsing, mapping services

KOS browsing and user interface/visualisation

Query expansion

Extraction/mining of terms

Translation support using vocabularies

Content integration and mediation

High Level Thesaurus (HILT) - Information Science

Pilot Terminology Service

HILT team, Wordmap s/w, OCLC

discovery of collections

cross-searching JISC collections

mapping from Terminologies to DDC spine

DDC, LCSH, UNESCO, MeSH, AAT
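Mapping each terminology to a common spine means that a term in one scheme can reach terms in another by passing through the shared spine class. The sketch below illustrates that idea only: the scheme names, terms and spine notations are invented placeholders, not real DDC numbers or HILT mappings.

```python
from collections import defaultdict

# (scheme, term) -> spine notation. All entries are invented placeholders.
TO_SPINE = {
    ("SchemeA", "Heart diseases"): "S100",
    ("SchemeB", "Cardiac disorders"): "S100",
    ("SchemeB", "Bridges"): "S200",
}


def invert(to_spine):
    """Spine notation -> list of (scheme, term) pairs mapped to it."""
    from_spine = defaultdict(list)
    for key, notation in to_spine.items():
        from_spine[notation].append(key)
    return from_spine


FROM_SPINE = invert(TO_SPINE)


def translate(scheme, term, target_scheme):
    """Terms in target_scheme that share a spine class with (scheme, term)."""
    notation = TO_SPINE.get((scheme, term))
    if notation is None:
        return []  # term has no mapping to the spine
    return [t for s, t in FROM_SPINE[notation] if s == target_scheme]


print(translate("SchemeA", "Heart diseases", "SchemeB"))
```

The design attraction of a spine is that each vocabulary needs one mapping to the spine, rather than a separate mapping to every other vocabulary.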

geoXwalk - Geographic Information Science

Geo-spatial Gazetteer Service

Edina, Data Archive, CIE

feature (concept) searching

geographic searching, spatial operators

spatial result visualisation, flexible footprint

geoparser - automated geographic indexing
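Geographic searching with spatial operators can be illustrated with the simplest possible footprint, a bounding box. This is a sketch under stated assumptions: it does not describe how geoXwalk is implemented, the gazetteer entries and coordinates are made up, and boxes that cross the 180 degree meridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Axis-aligned bounding box in decimal degrees (a simplification)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        """Spatial operator: the two boxes share at least one point."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other):
        """Spatial operator: `other` lies entirely inside this box."""
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Made-up gazetteer entries: (name, feature type, footprint).
GAZETTEER = [
    ("Place A", "town", Footprint(-3.4, 51.5, -3.2, 51.7)),
    ("Place B", "river", Footprint(-1.0, 52.0, -0.5, 52.5)),
    ("Place C", "town", Footprint(0.0, 50.0, 0.2, 50.2)),
]


def search(area, feature_type=None):
    """Names of entries overlapping `area`, optionally filtered by type."""
    return [name for name, ftype, box in GAZETTEER
            if box.overlaps(area)
            and (feature_type is None or ftype == feature_type)]


print(search(Footprint(-2.0, 49.0, 1.0, 53.0), feature_type="town"))
```

Combining the feature type filter with the spatial test corresponds to mixing feature (concept) searching with geographic searching in one query.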

Renardus - Information Science

cross-browsing service

NetLab, UKOLN, ILRT, SUB, …

classification mapping via DDC

cross-searching EU subject gateways

(multilingual) user interface for browsing in large classifications

Learning and Teaching Portal & SSL - Information Science

Systems Simulations Ltd, Index+

Learning and Teaching Support Network

Web-based thesaurus service

vocabulary management - ‘Suggest a Term’

data entry

browse and search

CIE Health Demonstrator - Information Science, facet analysis

Adiuri Systems Ltd (from IDEA Project)

Waypoint Health Info search demonstrator

faceted, multi-concept ‘query via browsing’

‘non-zero match’, postings displayed

faceted browsing user interface
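'Query via browsing' with a 'non-zero match' guarantee can be sketched as follows: after each selection, the interface offers only those further concepts that still lead to at least one item, and shows the postings count for each. This is an illustrative sketch with made-up items and facet names, not a description of the Waypoint software.

```python
# Made-up indexed items: item id -> {facet: set of concepts}.
ITEMS = {
    "doc1": {"Condition": {"asthma"}, "Audience": {"children"}},
    "doc2": {"Condition": {"asthma"}, "Audience": {"adults"}},
    "doc3": {"Condition": {"diabetes"}, "Audience": {"adults"}},
}


def matches(selection):
    """Ids of items that carry every selected (facet, concept) pair."""
    return sorted(
        item for item, facets in ITEMS.items()
        if all(concept in facets.get(facet, set())
               for facet, concept in selection)
    )


def next_choices(selection):
    """Concepts that can still be added, with postings counts.

    Counts are taken only over items matching the current selection,
    so a choice with zero hits can never be offered.
    """
    chosen = set(selection)
    counts = {}
    for item in matches(selection):
        for facet, concepts in ITEMS[item].items():
            for concept in concepts:
                key = (facet, concept)
                if key not in chosen:
                    counts[key] = counts.get(key, 0) + 1
    return counts


print(next_choices([("Condition", "asthma")]))
```

Each further selection narrows the result set, so the user builds a multi-concept query by browsing and never reaches an empty result.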

COHSE Conceptual Open Hypermedia - Ontology, description logic, hypertext navigation

Link Navigation Using Ontologies

Manchester, Southampton University

Open Hypermedia System (Soton DLS)

open-source downloadable tools for Ontology and Annotation Services:

eg OilEd lightweight ontology editor for DAML+OIL

OpenGALEN - Ontology, GRAIL logic, facet analysis

Open GALEN Common Reference Model - Medical coding and classification systems

Manchester University

faceted; compositional rather than traditional enumerative medical codes

multilingual GALEN-in-use Project

OpenKnoME, GALEN Case Env toolsets

Co-ODE: Collaborative Open Ontology Development Env - Ontology management

Manchester University new project

develop Ontology management tools

as plugins for Protégé (Stanford)

building on earlier experience with OilEd

concern with usability

FACET: faceted knowledge organisation for semantic retrieval - Information Science, facet analysis

University of Glamorgan, Science Museum

faceted, multi-concept best-match search

semantic expansion as browsing service

faceted thesaurus search interface

standalone and Web demonstrators
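Semantic expansion can be sketched as a bounded traversal of thesaurus relationships, where each relationship type has a cost and expansion stops once a budget is spent. This is one possible formulation, chosen for illustration: the cost values, the budget and the thesaurus fragment are assumptions, and the slide does not specify the method the FACET demonstrators use.

```python
import heapq

# Traversal cost per relationship type; values are illustrative assumptions.
COST = {"NT": 1.0, "BT": 1.5, "RT": 2.0}

# concept -> list of (relationship, neighbour). Made-up fragment.
LINKS = {
    "furniture": [("NT", "chairs"), ("NT", "tables")],
    "chairs": [("BT", "furniture"), ("NT", "armchairs"), ("RT", "upholstery")],
    "tables": [("BT", "furniture")],
    "armchairs": [("BT", "chairs")],
    "upholstery": [("RT", "chairs")],
}


def expand(start, budget):
    """Return {concept: cheapest traversal cost} for concepts within budget."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, concept = heapq.heappop(queue)
        if cost > best.get(concept, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for relationship, neighbour in LINKS.get(concept, []):
            new_cost = cost + COST[relationship]
            if new_cost <= budget and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return best


print(expand("chairs", 2.0))
```

The accumulated cost can serve as a rough measure of semantic distance, so expanded concepts that are closer to the query term can be ranked above more distant ones in a best-match search.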

E-Biosci: EC platform for e-publishing and info integration in Life Sciences - Information Science

European Molecular Biology Organisation

Collexis B. V. technology:

semantic matching conceptual fingerprints

link genomic data + life sciences research lit

multilingual

integrated search: full text/data/researchers

peer-reviewed, different publishing models

SKOS: Simple knowledge organisation for the semantic web - Information Science, Ontology

CCLRC, SWAD-EUROPE project

Migrate existing KOS to SemWeb via common RDF schema for thesauri and for inter-thesaurus mapping (formal OWL spec planned)

use cases for thesaurus services

lightweight RDF service demonstrators using Jena RDF API toolkit
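Migrating a thesaurus entry to RDF amounts to emitting triples that use a shared schema for labels and relationships. The sketch below writes N-Triples by hand in plain Python rather than with the Jena toolkit named on the slide. It assumes the property names of the later W3C SKOS vocabulary (`prefLabel`, `altLabel`, `broader`, `related`, `scopeNote`), which may differ from the 2004 SWAD-Europe draft; the `example.org` namespace and the sample concept are placeholders.

```python
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/thesaurus/"  # placeholder namespace, an assumption


def uri(local_id):
    """Wrap a local concept identifier as an absolute URI reference."""
    return f"<{BASE}{local_id}>"


def literal(text, lang="en"):
    """Quote a label as a language-tagged literal with basic escaping."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"@{lang}'


def concept_triples(local_id, pref, alts=(), broader=(), related=(),
                    scope_note=None):
    """Return N-Triples lines describing one thesaurus concept."""
    s = uri(local_id)
    triples = [(s, RDF_TYPE, f"<{SKOS}Concept>"),
               (s, f"<{SKOS}prefLabel>", literal(pref))]
    triples += [(s, f"<{SKOS}altLabel>", literal(a)) for a in alts]      # UF
    triples += [(s, f"<{SKOS}broader>", uri(b)) for b in broader]        # BT
    triples += [(s, f"<{SKOS}related>", uri(r)) for r in related]        # RT
    if scope_note:
        triples.append((s, f"<{SKOS}scopeNote>", literal(scope_note)))   # SN
    return [f"{subj} {pred} {obj} ." for subj, pred, obj in triples]


# Made-up concept: preferred term, one lead-in term, one broader concept.
lines = concept_triples("c2", "cars", alts=["automobiles"], broader=["c1"])
print("\n".join(lines))
```

Once concepts are identified by URIs in this way, a mapping between two thesauri is just a further triple linking a concept in one scheme to a concept in the other.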

Some critical issues

Standards

User Interface

Gaps?

Critical issues (1) Standards

Ongoing initiatives to revise thesaurus standards

ANSI/NISO Z39.19

BS 5723 and BS 6723 - Dext03

BSI public draft soon, extended scope, interoperability

Thesaurus Representations

RDF - SWAD03; Topic Map - Ligh03; various XML

Possibilities to extend current relationships by specialisation, enriching standards but maintaining compatibility

KOS Service Protocols - Bind04

service oriented approach with composite service provision

not based on atomic elements of data structures and relationships

expansion service provision

NKOS Registry - Vizi01; MEG Registry Project

Cost/benefit issues

Thesaurus long-lived, pragmatic and useful tool

cost-effective granularity of relationships for some search apps

Domain lexicon (UF/ALTs, Scope Notes)

Cost/benefit issues in KOS formalisation

Application dependent level of precision in concept use

Some apps very precise use of concepts (medical?)

Other apps may vary in concept application (humanities?)

Indexer - Searcher variation

Results based on probable relevance judgements

Critical issues (2) User interface

User interface critical given controlled terminology demands

Offer different options

Move beyond minimal assumptions of current web search engines on users, query structure, collections

Link with service protocol issues

kind of interfaces easily afforded

Accessibility issues

Critical issues (3) Gaps?

Language Engineering

Related standards - Shre03

POS tagging tools

large statistical corpora --> source of context data

for disambiguation, annotation, proactive search

JISC-specific corpora?

Collect portal use data --> taxonomies, synonyms

Time-varying synonyms - BBCi04

Probabilistic IR

term frequency information, automatic weighting
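Automatic weighting from term frequency information can be illustrated with tf-idf, one common weighting scheme. The slide does not name a formula, so the choice of tf-idf here is an assumption, and the two sample documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Weight each term in each document by tf * log(N / df).

    documents: dict mapping doc id -> list of tokens.
    Returns dict mapping doc id -> {term: weight}.
    """
    n = len(documents)
    df = Counter()  # number of documents containing each term
    for tokens in documents.values():
        df.update(set(tokens))
    weights = {}
    for doc_id, tokens in documents.items():
        tf = Counter(tokens)  # occurrences of each term in this document
        weights[doc_id] = {term: count * math.log(n / df[term])
                           for term, count in tf.items()}
    return weights


docs = {
    "d1": ["thesaurus", "search", "search"],
    "d2": ["thesaurus", "ontology"],
}
weights = tf_idf(docs)
print(weights["d1"])
```

A term that occurs in every document receives weight zero, while a term that is frequent in one document and rare elsewhere is weighted up, which is the kind of information a controlled vocabulary alone does not supply.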

Social Engineering?

What do users really want?

Problems of introducing new technologies

Sometimes a matter of both reflecting and shaping user needs

Done implicitly by successful projects

but also extant literature on sociology/philosophy of innovation

Lessons from:

Participatory Design, Rapid Application Development - Tudh00

evolving network: prototypes, user expectations, requirements and working practices

Lead / Ambassador Users

training, tailoring and advocacy / motivation.

Contact Information

Doug Tudhope

School of Computing

University of Glamorgan

Pontypridd CF37 1DL

Wales, UK

[email protected]

http://www.comp.glam.ac.uk/pages/staff/dstudhope

References

Aitchison J., Gilchrist A., Bawden D. 2000. Thesaurus construction and use: a practical manual (4th edition). London: ASLIB.

BBCi, A day in the life of BBCi search. http://www.currybet.net/articles/day_in_the_life/index.shtml

Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/

Brown P. 2002. From information retrieval to hypertext linking. New Review of Hypermedia and Multimedia, 8, 231-255.

Dextre Clarke S. 2003. BS 8723 : a new British Standard for structured vocabularies. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/dextre_clarke.ppt

Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR - http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc

Hodge Gail, 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. CLIR Pub91. April 2000. http://www.clir.org/pubs/abstract/pub91abst.html

Jacob Elin. 2003. Ontologies and the Semantic Web. ASIST Bulletin, April/May 2003, Special Issue on Semantic Web. http://www.asis.org/Bulletin/Apr-03/BulletinAprMay03.pdf

Koch T. Activities to advance the powerful use of vocabularies in the digital environment - Structured overview. http://www.lub.lu.se/~traugott/drafts/seattlespec-vocab.html

Light R. 2003. XML (and Topic Maps). http://www.richardlight.org.uk/thesauri/thesauri.htm

McGuinness D. 2002. Ontologies Come of Age. In: (Fensel et al eds.) Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press.

MultiTes 2003. Conference on Thesauri and Taxonomies http://www.multites.com/conference03.htm

References ctd.

NKOS: Networked Knowledge Organization Systems/Services, http://nkos.slis.kent.edu/

NKOS 2003. Workshop ECDL. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-Workshop.php

NKOS 2004. New Applications of Knowledge Organization Systems. NKOS Special Issue, JoDI. http://jodi.ecs.soton.ac.uk/?vol=4&iss=4

Noy N., McGuinness D. Ontology Development 101: A Guide to Creating Your First Ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html

Shreve G. 2003. Terminology Standards. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/Shreve.ppt

Soergel D. The representation of Knowledge Organization Structure (KOS) data: a multiplicity of standards. http://www.glam.ac.uk/soc/research/hypermedia/publications/SoergelNKOS2001KOSStandards.PDF

SWAD-Europe Thesaurus Activity. http://www.w3c.rl.ac.uk/SWAD/thesaurus.html

Tudhope D, Beynon-Davies P, Mackay H. 2000. Prototyping praxis: Constructing computer systems and building belief. Human Computer Interaction, 15(4), 353-383. http://www.glam.ac.uk/soc/research/hypermedia/publications/tudhope-2000.pdf

Vizine-Goetz D. 2001. NKOS Registry - draft proposal for KOS-level metadata. http://staff.oclc.org/~vizine/NKOS/Thesaurus_Registry_version3_rev.htm