LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch and Gerda Koch, AIT
Thesaurus Management Quickstart
Introduction
What are controlled vocabularies?
• organized arrangement of words and phrases
• used to index content and/or to retrieve content through browsing or searching
• include preferred and variant terms
• have defined scope or describe a specific domain
http://vocabularyserver.com/glossaries/getty/index.php
Thesaurus = a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators.
Important:
• ISO 25964 is a standard for building thesauri
• SKOS is a W3C recommendation designed for representation of controlled vocabularies and is built upon RDF and RDFS. It allows publication of such vocabularies as linked data.
1 http://www.niso.org/schemas/iso25964/ (September 19th, 2014)
ISO 25964
• Part 1: Thesauri for information retrieval
- published in 2011
- developing a thesaurus (mono- and multilingual)
- replaced previous standards ISO 2788/5964
- includes data model and XML schema
• Part 2: Interoperability with other vocabularies
- published in 2013
- recommendations for the establishment and maintenance of mappings between multiple thesauri, or between thesauri and other types of vocabularies
Data Model
http://www.niso.org/schemas/iso25964/
SKOS: Simple Knowledge Organization System
http://www.w3.org/2004/02/skos/intro
SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). Encoding this information in RDF allows it to be passed between computer applications in an interoperable way. Using RDF also allows knowledge organization systems to be used in distributed, decentralised metadata applications. Decentralised metadata is becoming a typical scenario, where service providers want to add value to metadata harvested from multiple sources.
Multilingual vocabulary issues (examples)
• structural problems: conceptual systems differ in the various languages
• equivalence problems: lexicalisation of concepts differs in different languages
• e.g. bone – fish bone (en); Knochen – Gräten (de) [1]
• intra- and inter-language problems: terms differ in meaning (homographs); a given term can have more than one meaning in a language
• e.g. Turkey (country) and turkey (animal)
[1] http://www.dsoergel.com/cv/B67.pdf (20th August, 2014)
Federated Model
• LoCloud vocabulary based on federated model
• having independent vocabularies for various languages in the same domain (no one language is dominant)
• alignment of vocabularies via concept identifiers; end-user can search in all linked indexing vocabularies
• AIT experimental application based on TemaTres Vocabulary Tool
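The alignment via concept identifiers can be sketched as follows. This is a minimal illustration, not the LoCloud or TemaTres implementation: the identifiers `c1` and `c2`, the sample terms and the function names are all assumptions. Each language keeps its own vocabulary, and a search in any one language returns the equivalent terms from every vocabulary linked to the same identifier.

```python
from collections import defaultdict

# Hypothetical data: one independent vocabulary per language.
# Each term points to a shared concept identifier (invented here).
VOCABULARIES = {
    "en": {"bone": "c1", "fish bone": "c2"},
    "de": {"Knochen": "c1", "Gräten": "c2"},
}


def build_alignment(vocabularies):
    """Group the terms of all languages under their concept identifier."""
    aligned = defaultdict(dict)
    for lang, terms in vocabularies.items():
        for term, concept_id in terms.items():
            aligned[concept_id][lang] = term
    return dict(aligned)


def search(term, vocabularies):
    """Find a term in any language and return all linked terms."""
    alignment = build_alignment(vocabularies)
    wanted = term.casefold()
    hits = {}
    for terms in vocabularies.values():
        for label, concept_id in terms.items():
            if label.casefold() == wanted:
                hits[concept_id] = alignment[concept_id]
    return hits


if __name__ == "__main__":
    print(search("Knochen", VOCABULARIES))
```

No language acts as a pivot in this layout: the identifier is the only shared element, so a vocabulary in a further language can be linked without touching the existing ones.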
TemaTres ...
• supports distributed management models
• ensures consistency and integrity of data and relationships between terms
• has features specially designed to provide data traceability and quality control in the context of a controlled vocabulary
• supports the analysis and categorisation of terms for search
• enables vocabularies to be represented in a wide range of metadata standards relevant to knowledge management
http://www.vocabularyserver.com/
TemaTres functionalities
• No limits to number of terms, alternative labels, levels of hierarchy, etc.
• allows import/export of data in text or SKOS format
• multilingualism
• SPARQL endpoint
• relationships between terms
• notes
• user management
• Reports
• Additionally: meta-terms (define facets, collections or arrays of terms), expose vocabularies with powerful web services, search terms suggestion (did you mean...?), display terms in multiple deep levels in the same screen, user management, duplicate and free terms control, multilingual terminology mapping etc.
Why TemaTres?
• Fast to use
• Making vocabularies available as a web service in the enrichment process
• Many vocabularies (like UNESCO, GEMET, PICO) have already been established with this tool and are usable in the LoCloud infrastructure (http://www.vocabularyserver.com/vocabularies.php, 175 vocabularies available)
• Additionally, own vocabularies can be created
• Best starting point: SKOS file for import
Import in TemaTres
• Tabulated text
• Tagged text
• SKOS Core
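Before importing a SKOS file it can help to check which concepts and preferred labels it actually contains. The sketch below does this with the Python standard library. It is an assumption-laden illustration rather than part of TemaTres: the function name, the sample data and the `example.org` URI are invented, and only concepts written as `skos:Concept` elements in RDF/XML are recognised (not `rdf:Description` nodes with an `rdf:type`).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative sample; a real export would be read from a file instead.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/concept/1">
    <skos:prefLabel xml:lang="en">bone</skos:prefLabel>
    <skos:prefLabel xml:lang="de">Knochen</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def list_concepts(rdf_xml: str) -> dict[str, dict[str, str]]:
    """Map each concept URI to its preferred labels, keyed by language."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for node in root.iter(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about", "")
        labels = {}
        for label in node.findall(f"{{{SKOS}}}prefLabel"):
            labels[label.get(XML_LANG, "")] = (label.text or "").strip()
        concepts[uri] = labels
    return concepts


if __name__ == "__main__":
    for uri, labels in list_concepts(SAMPLE).items():
        print(uri, labels)
```

A concept that comes back with an empty label dictionary is a candidate for correction before the import.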
Vocabularies that can at present be used during LoCloud aggregation:
Author | Name of vocabulary
University of California, Santa Barbara | Alexandria Digital Library Feature Type Thesaurus
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Archeological Objects Thesaurus Scotland
English Heritage | Archeological Sciences Thesaurus
English Heritage | Building Materials Thesaurus
English Heritage | Components Thesaurus
American Folklore Society | Ethnographic Thesaurus
English Heritage | Event Type Thesaurus
English Heritage | Evidence Thesaurus
English Heritage | FISH Archeological Objects Thesaurus
Eionet European Environment Information and Observation Network | General Multilingual Environmental Thesaurus GEMET
Federation Internationale des Archives du Film (FIAF) | General Subject headings for Film Archives
The Discovery Programme | Irish Monuments
The Discovery Programme | Irish Periods
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Maritime Craft Thesaurus Scotland
English Heritage | Maritime Craft Type Thesaurus
English Heritage and Royal Commission on the Historical Monuments of England | MDA Archaeological Objects Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Monument Thesaurus Wales
Royal Commission on the Ancient and Historical Monuments of Scotland (RCAHMS) | Monument Type Thesaurus
English Heritage | Period Thesaurus
Royal Commission on the Ancient and Historical Monuments of Wales (RCAHMW) | Period Thesaurus Wales
Bibliographic Standards Committee of the Rare Books and Manuscripts Section (ACRL/ALA) | Relator Terms for Use in Rare Book and Special Collections Cataloguing
Universidad de León | Tesauro de Ciencias de la Documentación
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 1: Subject Terms
Library of Congress. Prints and Photographs Division | Thesaurus for Graphic Materials 2: Genre and Physical Characteristic Terms
Ministero per i Beni e le Attività Culturali | Thesaurus PICO 4.1
UKAT | UK Archival Thesaurus (UKAT)
UNESCO | UNESCO thesaurus
Tool for vocabulary training
• Mediathread is CCNMTL's [1] open-source platform for exploration, analysis, and organization of web-based multimedia content
• Launched at Columbia in 2010, Mediathread has now been used in over 300 courses across a wide range of subject domains, including Social Work, Journalism, East Asian Studies, Art History, Film Studies, History, Public Health, Education, and English. [2]
• Mediathread is in use today at over 25 colleges and universities, including MIT, Dartmouth College, Princeton University, Wellesley College etc. [3]
• Mediathread is under constant development
[1] Columbia Center for New Media Teaching and Learning, http://ccnmtl.columbia.edu/portfolio/custom_software_applications_and_tools/mediathread.html (2014-10-09)
[2] http://mediathread.info/content/cases-columbia (2014-10-09)
[3] http://getmediathread.com/index.html#who (2014-10-09)
Accessing Mediathread
• http://mediathread.ait.co.at
• user/password:
Next Parts: Thesaurus Management:
• Part 1: Basics
• Part 2: Import/Export
• Part 3: Multilingual Vocabularies
Option:
• Mediathread in a Nutshell
Mediathread in a Nutshell
After logging into Mediathread
Mediathread sections (I)
• From Your Instructor (left side)
✓ Contains the compositions with the instructions
✓ Start with “How to use the Mediathread tool?”
✓ Followed by Chapter 0 to Chapter 6
• Compositions give instructions
✓ After reading each composition, complete the associated Assignment (same chapter number and name)
Mediathread sections (II)
• Assignments contain exercises (middle)
✓ Accomplish them by clicking on “Respond to Assignment”
✓ If necessary, check the instructions in the compositions again
Reading Compositions (I)
• After clicking on a Composition
✓ Read the text on the left side
✓ Click on the symbol or text to see PowerPoint slides on the right side
Reading Compositions (II)
• Change size and position of the slides by
✓ Using the arrow and plus/minus signs on the left
✓ Using the scroll function of your mouse (to change size)
✓ Dragging the slide by holding the left mouse button (to change position)
Reading Compositions (III)
• When finished, click on “LoCloud Vocabulary Training” to return to the course overview
• Here click on the next Composition or on “Respond to Assignment”
• Or use the links at the bottom of each Composition or Assignment
OR
The LoCloud Vocabulary Training ...
• is an English online tool workshop
• includes all features of the vocabulary tool TemaTres
• is too comprehensive to complete in this section
• can be started in class and finished any time online
Please use the time left to start with the Vocabulary Training ...
Starting Vocabulary Training ...
• Open Mediathread under http://mtp.ait.co.at
• Logging in