Overview of technological solutions to terminology services
Doug Tudhope
Hypermedia Research Unit
University of Glamorgan
JISC Terminology Workshop, London, February 2004
Presentation
Networked Knowledge Organisation Systems/Services
Broad review of technological approaches
NKOS Lifecycle
Introduce Workshop Demonstrations
Critical Issues and possible gaps
References
Taxonomy of Knowledge Organisation Systems
Term Lists
Authority Files, Glossaries, Gazetteers, Dictionaries
Classification and Categorization
Subject Headings
Classification Schemes and Taxonomies, eg DDC, scientific taxonomies
Relationship Schemes
Thesauri
Semantic Networks (eg WordNet)
(Ontologies)
Hodg00, http://www.clir.org/pubs/abstract/pub91abst.html
KOS ctd.
Thesauri
3 Standard Relationships between concepts (Aitc00)
Equivalence, Hierarchical, Associative
Inherent domain lexicon (lead-in vocabulary)
Concept definitions and warrant (Scope Notes)
Ontologies
Higher level conceptualisation (McGu02, Noy)
formal definition of relationships
inference rules and definition of roles (sometimes)
KOS an element of ontologies and schemas
Jaco03, Ontologies and the Semantic Web, ASIST Bulletin, April/May 2003, Special Issue on Semantic Web
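The three standard thesaurus relationships listed above (equivalence, hierarchical, associative) can be sketched as a small data structure. This is a minimal illustration only, not taken from the presentation: the class names `Concept` and `Thesaurus`, the method names, and the sample terms are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept with the three standard relationship types."""
    preferred: str
    use_for: set = field(default_factory=set)   # equivalence (UF, lead-in terms)
    broader: set = field(default_factory=set)   # hierarchical (BT)
    narrower: set = field(default_factory=set)  # hierarchical (NT)
    related: set = field(default_factory=set)   # associative (RT)
    scope_note: str = ""


class Thesaurus:
    def __init__(self):
        self.concepts = {}  # preferred term -> Concept
        self.lead_in = {}   # lower-cased non-preferred term -> preferred term

    def add(self, preferred, use_for=(), scope_note=""):
        concept = Concept(preferred, set(use_for), scope_note=scope_note)
        self.concepts[preferred] = concept
        for term in use_for:
            self.lead_in[term.lower()] = preferred
        return concept

    def link_hierarchy(self, broader, narrower):
        # BT and NT are reciprocal, so both directions are recorded.
        self.concepts[broader].narrower.add(narrower)
        self.concepts[narrower].broader.add(broader)

    def link_related(self, a, b):
        # RT is symmetric.
        self.concepts[a].related.add(b)
        self.concepts[b].related.add(a)

    def resolve(self, term):
        """Map an entry term to its preferred term, or None if unknown."""
        if term in self.concepts:
            return term
        return self.lead_in.get(term.lower())


# Hypothetical sample vocabulary.
thesaurus = Thesaurus()
thesaurus.add("clocks", use_for=["timepieces"],
              scope_note="Instruments for measuring time.")
thesaurus.add("pendulum clocks")
thesaurus.add("horology")
thesaurus.link_hierarchy("clocks", "pendulum clocks")
thesaurus.link_related("clocks", "horology")
```

Here `thesaurus.resolve("Timepieces")` follows the lead-in vocabulary to the preferred term `"clocks"`, while an unknown term resolves to `None`.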
Recent Sources
NKOS: Networked Knowledge Organization Systems/Services
http://jodi.ecs.soton.ac.uk/?vol=4&iss=4 NKOS JoDI Special Issue
http://www.multites.com/conference03.htm MultiTes Conference
http://nkos.slis.kent.edu/ JCDL and ECDL Workshops 2003
http://www.lub.lu.se/SEMKOS/ SEMKOS IP Proposal Resources
Semantic Web - RDF/XML, RDF Schema, Metalog, OWL
http://www.w3.org/2001/sw/ W3C Semantic Web Activity
http://www.semanticweb.org/
http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb
http://www.w3c.rl.ac.uk/SWAD/thesaurus.html SWAD-Europe Thesaurus index
Semantic Grid - Semantic Web, Web service, eScience, GRID links
http://www.semanticgrid.org/
http://www.w3.org/2002/ws/ W3C Web Services Activity
http://www.ariadne.ac.uk/issue29/gardner/intro.html Gardner’s Intro to Web Services
JISC Application Area
Search/retrieval for educational purposes(?)
students, teachers, researchers
possibly
Generalised search
possibly integrated into applications
triggered to take account of context (eg Brow02)
link eScience applications?
Current operational systems (eg RDN)
lack terminology services
some browsing categories
but not integrated into search
Technologies
Information Science
Controlled Terminology
Information Retrieval (probabilistic, full text)
Intellectual/Automatic Indexing
search/browse, user interfaces
Facet Analysis
Ontology Engineering (AI Knowledge Representation)
formal (finer grained) representation, description logics
automated reasoning, Semantic Web
Distributed Systems
Z39.50, Web Services, Semantic Grid
Language Engineering
Social Engineering
Enriching / Formalising KOS
KOS Legacy - large (multilingual) vocabularies, indexed multimedia (and print) collections
Product of peer review and follow standards
However
Not utilised to full potential in some applications
Designed for human inspection, semantic structure not explicitly represented
May be inconsistently evolved from various sources
Opportunity to formalise / enrich
Partly a matter of representation in RDF/XML
but may be inconsistencies in logical structure
--> deconstruction and ontological formalisation
--> mutually exclusive concept structures
Facet Analysis (a link between technologies)
Fundamental categories / foundational concepts
eg CRG: Entity, Part, Property, Material, Process, Operation, Product, Agent, Space, Time, ...
Mapped to facets for particular KOS
Basis of several scientific and industrial KOS
Synthesis rules for principled combination of concepts
rules for combining base concepts when indexing/querying
Browsing and Searching applications
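A synthesis rule of the kind mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not a rule set from any particular KOS: the function name `synthesise` is hypothetical, the citation order simply reuses the order in which the CRG categories are listed on this slide, and the "one term per facet" restriction is a simplification chosen for the example.

```python
# Assumed citation order: the CRG categories in the order listed on the slide.
CITATION_ORDER = ["Entity", "Part", "Property", "Material", "Process",
                  "Operation", "Product", "Agent", "Space", "Time"]


def synthesise(concepts):
    """Combine (facet, term) pairs into one compound subject string.

    Illustrative rules:
      1. every facet must be a known fundamental category;
      2. at most one term per facet (a simplification);
      3. terms are cited in the fixed facet order.
    """
    chosen = {}
    for facet, term in concepts:
        if facet not in CITATION_ORDER:
            raise ValueError(f"unknown facet: {facet}")
        if facet in chosen:
            raise ValueError(f"facet used twice: {facet}")
        chosen[facet] = term
    ordered = sorted(chosen.items(),
                     key=lambda item: CITATION_ORDER.index(item[0]))
    return " : ".join(term for _, term in ordered)


# The same compound results whatever order the indexer or searcher supplies.
compound = synthesise([("Time", "18th century"),
                       ("Entity", "clocks"),
                       ("Material", "brass")])
```

Because the order is fixed by the rules rather than by the user, an indexer and a searcher who pick the same base concepts arrive at the same compound.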
KOS integration into DL services
from Hill02 Research Agenda KOS/DL
Taxonomy of KOS - KOS types linked to DL service protocols
Registries of KOS and KOS-level metadata to represent them
XML/RDF KOS representations - customisable
Core set of relationship types across all KOS
General KOS service protocol
from which protocols for specific types of KOS can be derived
Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships)
Visualization tools that fully use and display the rich semantics embedded in KOS
=> move towards a model of search service ‘flow’?
how semantic search services combine
Terminology Services
from Koch04 Structured Overview - Activities to advance the powerful use of vocabularies
Searching for concepts
schemes in registries
concepts/terms in taxonomy servers
Search support for queries
collection finding
cross-searching, cross-browsing, mapping services
KOS browsing and user interface/visualisation
query expansion, disambiguation
automatic indexing and classification
extraction/mining of terms
translation support using vocabularies
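Query expansion, one of the search-support services listed above, can be illustrated as a bounded walk over narrower-term links. The sketch below is an assumption-laden example: the `NARROWER` table, its terms, and the function name `expand` are hypothetical, and real services may also follow related or equivalent terms.

```python
from collections import deque

# Hypothetical narrower-term (NT) links.
NARROWER = {
    "vehicles": ["road vehicles", "rail vehicles"],
    "road vehicles": ["cars", "bicycles"],
    "rail vehicles": ["locomotives"],
}


def expand(term, max_depth=1):
    """Return the term plus its narrower terms down to max_depth levels.

    Terms are returned breadth-first, so closer terms come first.
    """
    expanded = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend below the requested depth
        for child in NARROWER.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append((child, depth + 1))
    return expanded


one_level = expand("vehicles", max_depth=1)
two_levels = expand("vehicles", max_depth=2)
```

The `max_depth` parameter stands in for the kind of control a user interface might offer over how far expansion should go.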
Workshop Demonstrations
… in context of NKOS Information Lifecycle
NKOS Information Lifecycle
KOS creation and maintenance
Mapping, merging vocabularies
Document creation and maintenance
Indexing, classification, annotation
intellectual, automatic
Discovery of services and databases/collections
Searching for concepts --> controlled terminology, auto-disambiguation
Querying and result display
Cross-searching, cross-browsing, mapping services
KOS browsing and user interface/visualisation
Query expansion
Extraction/mining of terms
Translation support using vocabularies
Content integration and mediation
High Level Thesaurus (HILT) - Information Science
Pilot Terminology Service
HILT team, Wordmap s/w, OCLC
discovery of collections
cross-searching JISC collections
mapping from Terminologies to DDC spine
DDC, LCSH, UNESCO, MeSH, AAT
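Mapping terminologies to a common DDC spine, as listed above, means each scheme only needs a mapping to DDC rather than to every other scheme. The following sketch shows the idea; the mapping table is illustrative and not taken from HILT data, and the name `via_spine` is hypothetical.

```python
# Illustrative (scheme, term) -> DDC class mappings; not HILT data.
TO_DDC = {
    ("LCSH", "Medicine"): "610",
    ("MeSH", "Medicine"): "610",
    ("UNESCO", "Medical sciences"): "610",
    ("LCSH", "Architecture"): "720",
    ("AAT", "architecture"): "720",
}


def via_spine(scheme, term, target_scheme):
    """Translate a term into target_scheme terms that share its DDC class."""
    ddc = TO_DDC.get((scheme, term))
    if ddc is None:
        return []  # the source term has no mapping to the spine
    return sorted(t for (s, t), c in TO_DDC.items()
                  if s == target_scheme and c == ddc)


mesh_to_unesco = via_spine("MeSH", "Medicine", "UNESCO")
aat_to_lcsh = via_spine("AAT", "architecture", "LCSH")
```

With n schemes, a spine needs n mapping tables instead of one per pair of schemes, which is the usual argument for this design.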
geoXwalk - Geographic Information Science
Geo-spatial Gazetteer Service
Edina, Data Archive, CIE
feature (concept) searching
geographic searching, spatial operators
spatial result visualisation, flexible footprint
geoparser - automated geographic indexing
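Geographic searching with spatial operators over footprints can be sketched with bounding boxes. This is a simplified illustration and not the geoXwalk implementation: the `Footprint` class, the `places_within` helper, and the rough coordinates are assumptions, and a real gazetteer service would typically support more detailed footprints than boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Axis-aligned bounding box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True if the two boxes share any area or edge."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other):
        """True if the other box lies entirely inside this one."""
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Rough, illustrative boxes only.
GAZETTEER = {
    "Wales": Footprint(-5.4, 51.3, -2.6, 53.5),
    "Pontypridd": Footprint(-3.40, 51.58, -3.30, 51.63),
    "London": Footprint(-0.51, 51.28, 0.33, 51.69),
}


def places_within(region):
    """Names of gazetteer entries whose footprint lies inside the region."""
    box = GAZETTEER[region]
    return sorted(name for name, fp in GAZETTEER.items()
                  if name != region and box.contains(fp))


inside_wales = places_within("Wales")
```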
Renardus - Information Science
cross-browsing service
NetLab, UKOLN, ILRT, SUB, …
classification mapping via DDC
cross-searching EU subject gateways
(multilingual) user interface for browsing
in large classifications
Learning and Teaching Portal & SSL - Information Science
Systems Simulations Ltd, Index+
Learning and Teaching Support Network
Web-based thesaurus service
vocabulary management - ‘Suggest a Term’
data entry
browse and search
CIE Health Demonstrator - Information Science, facet analysis
Adiuri Systems Ltd (from IDEA Project)
Waypoint Health Info search demonstrator
faceted, multi-concept ‘query via browsing’
‘non-zero match’, postings displayed
faceted browsing user interface
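The "query via browsing" idea with "non-zero match" and displayed postings can be illustrated as follows: after each facet selection, only values that still lead to at least one record are offered, each with its count. This is a sketch under assumptions, not the Waypoint implementation; the records, facet names, and function names are hypothetical.

```python
# Hypothetical indexed records: one value per facet for simplicity.
RECORDS = {
    "doc1": {"Condition": "asthma", "Audience": "children", "Type": "leaflet"},
    "doc2": {"Condition": "asthma", "Audience": "adults", "Type": "leaflet"},
    "doc3": {"Condition": "diabetes", "Audience": "adults", "Type": "guideline"},
}


def matches(selection):
    """Records that agree with every facet value selected so far."""
    return sorted(doc for doc, facets in RECORDS.items()
                  if all(facets.get(f) == v for f, v in selection.items()))


def next_choices(selection):
    """Values of unselected facets that still yield results, with postings.

    Only values occurring in the current result set are listed, so no
    offered choice can lead to an empty result.
    """
    choices = {}
    for doc in matches(selection):
        for facet, value in RECORDS[doc].items():
            if facet in selection:
                continue
            counts = choices.setdefault(facet, {})
            counts[value] = counts.get(value, 0) + 1
    return choices


after_asthma = next_choices({"Condition": "asthma"})
```

After selecting asthma, the guideline value is no longer offered under Type, because choosing it would return nothing.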
COHSE Conceptual Open Hypermedia - Ontology, description logic, hypertext navigation
Link Navigation Using Ontologies
Manchester, Southampton University
Open Hypermedia System (Soton DLS)
open-source downloadable tools for
Ontology and Annotation Services:
eg OilEd lightweight ontology editor for DAML+OIL
OpenGALEN - Ontology, GRAIL logic, facet analysis
Open GALEN Common Reference Model - Medical coding and classification systems
Manchester University
faceted; compositional rather than traditional enumerative medical codes
multilingual GALEN-in-use Project
OpenKnoME, GALEN Case Env toolsets
Co-ODE: Collaborative Open Ontology Development Environment - Ontology management
Manchester University new project
develop Ontology management tools
as plugins for Protégé (Stanford)
building on earlier experience with OilEd
concern with usability
FACET: faceted knowledge organisation for semantic retrieval - Information Science, facet analysis
University of Glamorgan, Science Museum
faceted, multi-concept bestmatch search
semantic expansion as browsing service
faceted thesaurus search interface
standalone and Web demonstrators
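A multi-concept best-match search can be sketched as ranking items by how close their index terms are to the query concepts, so that near misses still appear lower in the results. The scoring below is a deliberately simple stand-in and should not be read as the FACET project's matching function: the 0.5 credit for a direct broader/narrower neighbour, the `BROADER` table, and the sample collection are all assumptions.

```python
# Hypothetical one-step broader-term links.
BROADER = {"armchairs": "chairs", "chairs": "furniture",
           "mahogany": "wood", "oak": "wood"}

# Hypothetical indexed collection.
COLLECTION = {
    "item A": ["armchairs", "mahogany"],
    "item B": ["chairs", "oak"],
    "item C": ["clocks"],
}


def closeness(a, b):
    """1.0 if identical, 0.5 if one is the direct broader term of the other."""
    if a == b:
        return 1.0
    if BROADER.get(a) == b or BROADER.get(b) == a:
        return 0.5
    return 0.0


def score(query, index_terms):
    """Average, over query concepts, of the best closeness to any index term."""
    if not query or not index_terms:
        return 0.0
    best = [max(closeness(q, t) for t in index_terms) for q in query]
    return sum(best) / len(query)


def rank(query, collection):
    """Items with a non-zero score, best first (ties broken by name)."""
    scored = [(score(query, terms), name) for name, terms in collection.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, s) for s, name in scored if s > 0]


results = rank(["armchairs", "mahogany"], COLLECTION)
```

Item B matches neither query concept exactly but still ranks, because "chairs" is the direct broader term of "armchairs".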
E-Biosci: EC platform e-publishing and info integration in Life Sciences - Information Science
European Molecular Biology Organisation
Collexis B. V. technology:
semantic matching conceptual fingerprints
link genomic data + life sciences research lit
multilingual
integrated search: full text/data/researchers
peer-reviewed, different publishing models
SKOS: Simple knowledge organisation for the semantic web - Information Science, Ontology
CCLRC, SWAD-EUROPE project
Migrate existing KOS to SemWeb via common RDF schema for thesauri and for inter-thesaurus mapping (formal OWL spec planned)
use cases for thesaurus services
lightweight RDF service demonstrators using Jena RDF API toolkit
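Migrating a thesaurus entry to RDF can be illustrated by emitting one statement per label or relationship. The sketch below uses plain Python string formatting rather than the Jena toolkit mentioned above, and it assumes the property names (`prefLabel`, `altLabel`, `broader`, `related`) and namespace of the SKOS vocabulary as published by W3C; the `BASE` namespace, the function name `concept_triples`, and the sample concept are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical concept namespace


def concept_triples(concept_id, pref_label, alt_labels=(), broader=(), related=()):
    """Return N-Triples style lines describing one thesaurus concept.

    Labels are assumed to be English and to contain no characters that
    need escaping; a real exporter would have to handle both.
    """
    s = f"<{BASE}{concept_id}>"
    lines = [
        f"{s} <{RDF_TYPE}> <{SKOS}Concept> .",
        f'{s} <{SKOS}prefLabel> "{pref_label}"@en .',
    ]
    lines += [f'{s} <{SKOS}altLabel> "{label}"@en .' for label in alt_labels]
    lines += [f"{s} <{SKOS}broader> <{BASE}{b}> ." for b in broader]
    lines += [f"{s} <{SKOS}related> <{BASE}{r}> ." for r in related]
    return lines


triples = concept_triples("clocks", "clocks",
                          alt_labels=["timepieces"],
                          broader=["instruments"])
```

The equivalence, hierarchical, and associative relationships of a thesaurus map onto the alternative-label, broader, and related properties respectively.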
Some critical issues
Standards
User Interface
Gaps?
Critical issues (1) Standards
Ongoing initiatives to revise thesaurus standards
ANSI/NISO Z39.19
BS 5723 and BS 6723 - Dext03
BSI public draft soon, extended scope, interoperability
Thesaurus Representations
RDF - SWAD03; Topic Map - Ligh03; various XML
Possibilities to extend current relationships by specialisation, enriching standards but maintaining compatibility
KOS Service Protocols - Bind04
service oriented approach with composite service provision
not based on atomic elements of data structures and relationships
expansion service provision
NKOS Registry - Vizi01; MEG Registry Project
Cost/benefit issues
Thesaurus long-lived, pragmatic and useful tool
cost-effective granularity of relationships for some search apps
Domain lexicon (UF/ALTs, Scope Notes)
Cost/benefit issues in KOS formalisation
Application dependent level of precision in concept use
Some apps very precise use of concepts (medical?)
Other apps may vary in concept application (humanities?)
Indexer - Searcher variation
Results based on probable relevance judgements
Critical issues (2) User interface
User interface critical given controlled terminology demands
Offer different options
Move beyond minimal assumptions of current web search engines on users, query structure, collections
Link with service protocol issues
kind of interfaces easily afforded
Accessibility issues
Critical issues (3) Gaps?
Language Engineering
Related standards - Shre03
POS tagging tools
large statistical corpora --> source of context data for disambiguation, annotation, proactive search
JISC-specific corpora?
Collect portal use data --> taxonomies, synonyms
Time-varying synonyms - BBCi04
Probabilistic IR
term frequency information, automatic weighting
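Automatic weighting from term frequency information can be illustrated with tf-idf, a common weighting scheme used here as a simple stand-in; it is not a full probabilistic retrieval model and is not a method named in the presentation. The function name `tf_idf` and the sample documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Weight each term in each document by tf-idf.

    documents: dict mapping a document name to its list of tokens.
    Returns a dict mapping each name to {term: weight}, where
    weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))  # count each term once per document
    weights = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        weights[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical two-document collection.
DOCS = {
    "a": ["thesaurus", "search"],
    "b": ["thesaurus", "ontology"],
}
WEIGHTS = tf_idf(DOCS)
```

A term that occurs in every document, such as "thesaurus" here, gets weight zero, while terms that distinguish a document get positive weight.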
Social Engineering?
What do users really want?
Problems of introducing new technologies
Sometimes a matter of both reflecting and shaping user needs
Done implicitly by successful projects
but also extant literature on sociology/philosophy of innovation
Lessons from:
Participatory Design, Rapid Application Development - Tudh00
evolving network: prototypes, user expectations, requirements and working practices
Lead / Ambassador Users
training, tailoring and advocacy / motivation.
Contact Information
Doug Tudhope
School of Computing
University of Glamorgan
Pontypridd CF37 1DL
Wales, UK
http://www.comp.glam.ac.uk/pages/staff/dstudhope
References
Aitchison J., Gilchrist A., Bawden D. 2000. Thesaurus construction and use: a practical manual (4th edition). London: ASLIB.
BBCi, A day in the life of BBCi search. http://www.currybet.net/articles/day_in_the_life/index.shtml
Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/
Brown P. 2002. From information retrieval to hypertext linking. New Review of Hypermedia and Multimedia, 8, 231-255.
Dextre Clarke S. 2003. BS 8723: a new British Standard for structured vocabularies. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/dextre_clarke.ppt
Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR - http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc
Hodge Gail, 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. CLIR Pub91. April 2000. http://www.clir.org/pubs/abstract/pub91abst.html
Jacob Elin. 2003. Ontologies and the Semantic Web. ASIST Bulletin, April/May 2003, Special Issue on Semantic Web. http://www.asis.org/Bulletin/Apr-03/BulletinAprMay03.pdf
Koch T. Activities to advance the powerful use of vocabularies in the digital environment - Structured overview. http://www.lub.lu.se/~traugott/drafts/seattlespec-vocab.html
Light R. 2003. XML (and Topic Maps). http://www.richardlight.org.uk/thesauri/thesauri.htm
McGuinness D. 2002. Ontologies Come of Age. In: (Fensel et al eds.) Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press.
MultiTes 2003. Conference on Thesauri and Taxonomies http://www.multites.com/conference03.htm
References ctd.
NKOS: Networked Knowledge Organization Systems/Services, http://nkos.slis.kent.edu/
NKOS 2003. Workshop ECDL. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-Workshop.php
NKOS 2004. New Applications of Knowledge Organization Systems. NKOS Special Issue, JoDI. http://jodi.ecs.soton.ac.uk/?vol=4&iss=4
Noy N., McGuinness D. Ontology Development 101: A Guide to Creating Your First Ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
Shreve G. 2003. Terminology Standards. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/Shreve.ppt
Soergel D. The representation of Knowledge Organization Structure (KOS) data: a multiplicity of standards. http://www.glam.ac.uk/soc/research/hypermedia/publications/SoergelNKOS2001KOSStandards.PDF
SWAD-Europe Thesaurus Activity. http://www.w3c.rl.ac.uk/SWAD/thesaurus.html
Tudhope D, Beynon-Davies P, Mackay H. 2000. Prototyping praxis: Constructing computer systems and building belief. Human Computer Interaction, 15(4), 353-383. http://www.glam.ac.uk/soc/research/hypermedia/publications/tudhope-2000.pdf
Vizine-Goetz D. 2001. NKOS Registry - draft proposal for KOS-level metadata. http://staff.oclc.org/~vizine/NKOS/Thesaurus_Registry_version3_rev.htm