RDA vocabularies and concepts
Gordon Dunsire
Depute Director, Centre for Digital Library Research
University of Strathclyde, Glasgow, Scotland
Presented to staff of the National Library of Scotland, Edinburgh
20 Jul 2009
Overview
Part 1: Introduction to RDA
Benefits to users and cataloguers
Collaboration with other communities/standards
Q/A and break
Part 2: Introduction to the Semantic Web
Concepts and methods
Role of the library community
Q/A and break
Part 3: Putting it all together
A short history of the evolution of the catalogue record
Part 1: Introduction to RDA
RDA
Resource Description and Access
A new standard for creating bibliographic metadata
Based on the Anglo-American Cataloguing Rules
In development since 1841 (Panizzi’s rules for the British Museum)
And FRBR and other more modern stuff
Functional Requirements for Bibliographic Records
Developed by the International Federation of Library Associations and Institutions (IFLA)
Published 1998
User-centred features of RDA (1)
Improves the FRBRizability of catalogues
Covers all types of user
Those who need to find, identify, select, obtain and use information, and manage and organize information bibliographically
Covers all media
Print-based, digital; textual, visual, etc.
Equal, even treatment gives more control to the user in finding and choosing the most appropriate resources
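FRBRisation
[Diagram: FRBR entity hierarchy. A Work is realised through Expressions (Expression 1, Expression 2). An Expression is embodied in Manifestations (Manifestation 1.1; Manifestations 2.1 and 2.2). A Manifestation is exemplified by Items (Item 1.1.1; Items 2.1.1, 2.2.1 and 2.2.2). Example: Work “Symphony no.1”; Expression “LSO performance”; Manifestation “DVD-A”; Item “Copy on shelf”.]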
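The Work / Expression / Manifestation / Item hierarchy can be sketched as nested data structures. The following is a minimal illustration only: the class and field names are assumptions made for this example, not part of FRBR or RDA, and the sample values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    label: str


@dataclass
class Manifestation:
    label: str
    # A Manifestation "is exemplified by" one or more Items.
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    label: str
    # An Expression "is embodied in" one or more Manifestations.
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    label: str
    # A Work "is realised through" one or more Expressions.
    expressions: List[Expression] = field(default_factory=list)


def items_of(work: Work) -> List[str]:
    """Walk Work -> Expression -> Manifestation -> Item and collect item labels."""
    return [
        item.label
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Sample values taken from the slide.
symphony = Work(
    "Symphony no.1",
    [Expression("LSO performance", [Manifestation("DVD-A", [Item("Copy on shelf")])])],
)
print(items_of(symphony))
```

A catalogue built this way can answer "which copies exist of any version of this work?" by walking down from the Work, which is the kind of collocation FRBRisation is meant to support.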
User-centred features of RDA (2)
Clearly distinguishes content from carrier
E.g. Moving pictures on DVD; text on CD-ROM
Helpful for users with special needs
E.g. restrict search to non-visual resources
Multinational
Anglo-centricity (and cataloguer-eccentricity) removed
Abbreviations and acronyms avoided
Latinisms removed
Farewell s.n., s.l., et al.
[Still arguing about square brackets!]
User-centred features of RDA (3)
Independent of technical metadata formats
Can be used with MARC, DC (Dublin Core)
And a whole bunch of other acronyms
Gives user familiar metadata regardless of what system is used
Designed for the digital environment
RDA will be published as an online product
So could be incorporated in user help facilities
E.g. How a “preferred title for the work” (uniform title) is derived
Cataloguer-centred features of RDA (1)
Online product designed to interface and integrate with cataloguing modules
Work-flow integration will give step-by-step and contextual access to content rules
Possibility of adding local examples
Possibility of “myRDA”, removing unwanted rules and unused options
LMS vendors being kept informed
Avoidance of repetitive strain injury
Looking for that rule on corporate body main entry in AACR2
Cataloguer-centred features of RDA (2)
More emphasis on cataloguer’s judgment
Guidelines rather than “rules”
Rules grouped by bibliographic element rather than format
Bibliographic elements related to FRBR entities (related to user tasks)
Why am I recording this information?
Authority control included
Generally compatible with AACR
RDA and ONIX
ONIX (Online Information Exchange)
Publishing industry metadata standard
2-day workshop, March 2006, British Library, London
RDA Editor, ONIX reps, facilitator
Followed up via email and tele-con
RDA/ONIX framework for resource categorization, August 2006
Distinguishes content from carrier (at last!)
Intention to extend framework
Status: Resources permitting – now permitted!
RDA and DCMI
DCMI (Dublin Core Metadata Initiative)
2-day meeting, April/May 2007, British Library, London
RDA Editor, reps for RDA, DC and related Semantic Web communities
Established the DCMI RDA Task Group
Operates via wiki, email, tele-con, meetings at DC annual conferences
Charter: To define components of the draft standard "RDA - Resource Description and Access" as an RDF vocabulary for use in developing a Dublin Core application profile.
Status: Ongoing, but nearly complete
RDA and FRBR
FRBR Review Group, August 2007, WLIC (IFLA), Durban, South Africa
New project: To define appropriate namespaces for FRBR (entity-relationship) in RDF and other appropriate syntaxes
Status: Report and recommendations discussed at WLIC, Québec City, Canada
Delayed by IFLA website re-organisation
FRBR recently extended to Object-oriented FRBR (FRBRoo)
Based on CIDOC Conceptual Reference Model (CRM)
RDA and FRAD
Functional Requirements for Authority Data
Published in May 2009
Likely to be included in the FRBR namespace project
RDA designed to be FRAD-ready
Generalities already incorporated, with place-holders, etc.
FRAD “Family” entity used in RDA
FRBR only defines person and corporate body entities
RDA/ONIX framework
An ontology developed by RDA and the publishing community to improve metadata interoperability
Set of low-level attributes for describing the content and carrier of a bibliographic resource
Controlled vocabularies for some attributes
Attributes combined to form high-level content and carrier types for RDA
RDA/ONIX framework example
RDA content type “spoken word”
High-level label for a framework base content category
Category attributes
Character: Language
SensoryMode: Hearing
ImageDimensionality: not applicable
ImageMovement: not applicable
User: what resources have content I can listen to?
= OPAC: what content types have SensoryMode: Hearing? (“Spoken word”; “Performed music”; etc.)
then OPAC: list bib records with these content types!
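The two-step OPAC lookup described above (first find the content types with a given attribute value, then list the records carrying those types) might look like the sketch below. Only the attribute values for “spoken word” come from the slide; the attribute values given for the other content types, the record list and all function names are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# "spoken word" attributes are from the slide (None stands for "not applicable").
# The entries for "performed music" and "text" are assumptions for illustration.
CONTENT_TYPES: Dict[str, Dict[str, Optional[str]]] = {
    "spoken word": {
        "Character": "Language",
        "SensoryMode": "Hearing",
        "ImageDimensionality": None,
        "ImageMovement": None,
    },
    "performed music": {"SensoryMode": "Hearing"},
    "text": {"SensoryMode": "Sight"},
}

# Hypothetical bibliographic records as (title, content type) pairs.
RECORDS: List[Tuple[str, str]] = [
    ("Cataloguing has a future", "spoken word"),
    ("Hypothetical concert recording", "performed music"),
    ("Hypothetical printed report", "text"),
]


def content_types_with(attribute: str, value: str) -> List[str]:
    """Step 1: which content types have this low-level attribute value?"""
    return sorted(
        name for name, attrs in CONTENT_TYPES.items() if attrs.get(attribute) == value
    )


def records_with(attribute: str, value: str) -> List[str]:
    """Step 2: list the records whose content type matched in step 1."""
    wanted = set(content_types_with(attribute, value))
    return [title for title, content_type in RECORDS if content_type in wanted]


print(content_types_with("SensoryMode", "Hearing"))
print(records_with("SensoryMode", "Hearing"))
```

The user never needs to know the content type labels; the question "what can I listen to?" is answered from the low-level attribute.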
Vocabulary Mapping Framework (1)
JISC-funded project to extend the RDA/ONIX framework
Due for completion early November 2009
Led by publishing community
GD is consultant
Will develop an ontology/categorisation of relationships between/among bibliographic entities and agent entities (parties)
E.g. Manifestation is-published-by Publisher; Work is-created-by Author; Work is-derived-from Work
E.g. “Creator” > “Author”, “Collector”, “Illustrator”; “Author” = “Writer”; etc.
Vocabulary Mapping Framework (2)
Relationship terms from several standards will be mapped to the ontology
CIDOC-CRM, RDA, FRBR, FRAD, MARC21, etc.
Mappings then provide a hub-and-spoke mapping between any pair of standards
Efficient, as direct pair mappings not required
Will improve metadata interoperability in large-scale, heterogeneous resource discovery services
Ontology, terms, mappings compatible with Semantic Web (namespaces, etc.)
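The hub-and-spoke idea can be sketched in a few lines: each standard maps its own terms to a shared hub concept once, and any pair of standards is then translated through the hub. The scheme names, local terms and hub identifiers below are hypothetical and are not taken from the Vocabulary Mapping Framework itself.

```python
from typing import Dict, Optional, Tuple

# Spoke mappings: (scheme, local term) -> hub concept. All entries are hypothetical.
HUB: Dict[Tuple[str, str], str] = {
    ("SchemeA", "Author"): "hub:created-text",
    ("SchemeB", "Writer"): "hub:created-text",
    ("SchemeC", "aut"): "hub:created-text",
    ("SchemeA", "Illustrator"): "hub:created-image",
    ("SchemeC", "ill"): "hub:created-image",
}


def translate(term: str, source: str, target: str) -> Optional[str]:
    """Translate a term between two schemes by going through the hub concept."""
    concept = HUB.get((source, term))
    if concept is None:
        return None
    for (scheme, local_term), hub_concept in HUB.items():
        if scheme == target and hub_concept == concept:
            return local_term
    return None  # the target scheme has no term for this concept


def mappings_needed(standards: int) -> Dict[str, int]:
    """Compare mapping sets needed: one per standard vs. one per pair of standards."""
    return {
        "hub_and_spoke": standards,
        "pairwise": standards * (standards - 1) // 2,
    }


print(translate("Author", "SchemeA", "SchemeB"))
print(translate("Illustrator", "SchemeA", "SchemeB"))
print(mappings_needed(5))
```

For the five standards named on the slide, the hub needs five mapping sets where direct pairwise mapping would need ten, and the gap widens as more standards join.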
Part 2: Introduction to the Semantic Web
A problem
Humans are very good at processing information
Creation, analysis, synthesis, communication
Some say this is what defines us
We have invented machines to process data
Faster, globally, non-stop
The result is the information eruption
The Web: a continual explosion
Information professionals cannot keep up
We need our machines to process metadata
Semantic Web
“… an evolving extension of the [WWW] in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content.”
Wikipedia, English, 10.08 15 Jul 2009
The basic building block is Resource Description Framework (RDF)
Resource Description Framework (RDF)
Simple metadata statements in the form of subject-predicate-object expressions, called triples
E.g. “This presentation” – “has creator” – “Gordon Dunsire”
“presentation” and “creator” are metadata structure terms
Classes and properties
“this ...” and “Gordon Dunsire” are metadata content terms
Instances or values
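A triple is simply a three-part statement, so it can be modelled with a plain tuple. The sketch below uses the slide's own example; the `Triple` type and the `objects_of` helper are names invented for this illustration and do not belong to any RDF library.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # the thing being described (an instance)
    predicate: str  # the property (a metadata structure term)
    object: str     # the value (an instance or literal)


statements: List[Triple] = [
    Triple("This presentation", "has creator", "Gordon Dunsire"),
]


def objects_of(subject: str, predicate: str, triples: List[Triple]) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.object for t in triples if t.subject == subject and t.predicate == predicate]


print(objects_of("This presentation", "has creator", statements))
```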
Semantic Web applications
RDF Schema (RDFS)
Expresses the structure of metadata classes and properties
Simple Knowledge Organization System (SKOS)
Expresses the basic structure and content of concept schemes such as thesauri and other types of controlled vocabularies
Web Ontology Language (OWL)
Explicitly represents the meaning of terms in vocabularies and the relationships between them (scope, etc.)
Machine-processing
RDF is about making machine-processable statements, requiring:
A machine-processable language for representing RDF statements
Extensible Markup Language (XML)
A system of machine-processable identifiers for resources (subjects, predicates, objects)
Uniform Resource Identifier (URI)
For full machine-processing, an RDF statement is a set of three URIs
Identifiers
Things requiring identification (a URI):
Subject “This presentation”
e.g. its electronic location (URL): http://cdlr.strath.ac.uk/pubs/dunsireg/NLSRDA.pps
Predicate “has creator”
e.g. http://purl.org/dc/terms/creator
Object “Gordon Dunsire”
e.g. URI of entry in Library of Congress Name Authority File: http://errol.oclc.org/laf/nb2001-72552.html
Declaring vocabularies/values as “namespaces” in Semantic Web applications provides URIs
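Once all three parts have URIs, the statement can be written in a form a machine can process directly. As one possible illustration, the sketch below writes the slide's three example URIs as a single line in N-Triples, a simple line-based RDF syntax (the slides themselves use RDF/XML); the function name is an assumption of this example.

```python
def to_ntriples(subject_uri: str, predicate_uri: str, object_uri: str) -> str:
    """Serialise a statement of three URIs as one N-Triples line."""
    return f"<{subject_uri}> <{predicate_uri}> <{object_uri}> ."


# The three example URIs given on the slide.
line = to_ntriples(
    "http://cdlr.strath.ac.uk/pubs/dunsireg/NLSRDA.pps",
    "http://purl.org/dc/terms/creator",
    "http://errol.oclc.org/laf/nb2001-72552.html",
)
print(line)
```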
RDA RDF vocabularies
Being added to the National Science Digital Library metadata registry
Stored in a database
Output as RDF(S)/SKOS
Automatic creation of a URI for each entry
Base domain: http://RDVocab.info
First part of every RDA vocabulary URI
Identifies the “namespace” or collection/set of terms
RDA value in SKOS (part 1)
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns="http://www.w3.org/2004/02/skos/core#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:reg="http://metadataregistry.org/uri/schema/registry/">
  <!-- NOTICE: This is a single-concept fragment -->
  <!-- Scheme: RDA Content Type -->
  <skos:ConceptScheme rdf:about="http://RDVocab.info/termList/RDAContentType">
    <dc:title>RDA Content Type</dc:title>
  </skos:ConceptScheme>
[Slide callouts: XML namespaces; SKOS; NSDL Registry; Vocabulary URI]
RDA value in SKOS (part 2)
  <!-- Concept: spoken word -->
  <skos:Concept rdf:about="http://RDVocab.info/termList/RDAContentType/1013" xml:lang="en">
    <skos:inScheme rdf:resource="http://RDVocab.info/termList/RDAContentType"/>
    <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002"/>
    <skos:prefLabel xml:lang="en">spoken word</skos:prefLabel>
    <skos:definition xml:lang="en">Content expressed through language in an audible form.</skos:definition>
    <skos:scopeNote xml:lang="en">Includes recorded readings, recitations, speeches, interviews, oral histories, etc., computer-generated speech, etc.</skos:scopeNote>
    <skos:prefLabel xml:lang="de">gesprochene Worte</skos:prefLabel>
    <skos:scopeNote xml:lang="de">Umfasst aufgezeichnete Lesungen, Rezitationen, Reden, Interviews, mündliche Überlieferungen usw. und maschinell erzeugte Sprache.</skos:scopeNote>
    <skos:definition xml:lang="de">Inhalt, der durch Sprache in einer hörbaren Form ausgedrückt wird.</skos:definition>
  </skos:Concept>
[Slide callouts: Term URI; Term; Definition; Term (German); Registry status term URI]
RDA value in SKOS (part 3)
  <!-- Status properties used in this document -->
  <skos:Concept rdf:about="http://metadataregistry.org/uri/RegStatus/1002">
    <skos:prefLabel xml:lang="en">New-Proposed</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
[Slide callouts: Registry status term URI; Registry status term]
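To show that the SKOS output is machine-processable, the sketch below reads a trimmed copy of the fragment with Python's standard XML parser and pulls out the preferred label in each language. The trimmed XML keeps only elements shown on the slides; the function name and the shape of the returned dictionary are assumptions of this example, and a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Trimmed copy of the fragment on the slides (scheme, status and notes omitted).
SKOS_FRAGMENT = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://RDVocab.info/termList/RDAContentType/1013">
    <skos:inScheme rdf:resource="http://RDVocab.info/termList/RDAContentType"/>
    <skos:prefLabel xml:lang="en">spoken word</skos:prefLabel>
    <skos:prefLabel xml:lang="de">gesprochene Worte</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pref_labels(xml_text: str) -> Dict[str, Dict[str, str]]:
    """Map each concept URI to its preferred labels, keyed by language code."""
    root = ET.fromstring(xml_text)
    labels: Dict[str, Dict[str, str]] = {}
    for concept in root.findall("skos:Concept", NS):
        uri = concept.get(RDF_ABOUT)
        labels[uri] = {
            label.get(XML_LANG): label.text
            for label in concept.findall("skos:prefLabel", NS)
        }
    return labels


print(pref_labels(SKOS_FRAGMENT))
```

Because the term is identified by its URI rather than by its label, a display can pick whichever language the user prefers.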
RDA content type “spoken word”
The term “spoken word” can be referenced as the value of the field “content type” in any metadata record using RDF/XML (Semantic Web):
…
xmlns:rdvct="http://RDVocab.info/termList/RDAContentType#"
…
<… rdvct:1013 …>
…
The field/attribute/element “content type” can be referenced in a similar way, through the RDF Schema for RDA elements being developed by DCMI/RDA
More library namespaces
IFLA bibliographic control standards
Discussions during WLIC 2008, Québec City
RDF Schema for entities and relationships from Functional Requirements for Bibliographic Records (FRBR)
E.g. “Work”, “has Expression” / “is Expression of”
Others are likely to follow:
Functional Requirements for Authority Data (FRAD)
International Standard Bibliographic Description (ISBD)
ISBD/XML Task Group
Functional Requirements for Subject Authority Data (FRSAD)
UNIMARC
Library of Congress taking a similar approach with MARC21
Part 3: Putting it all together
A short history of the evolution of the library catalogue record
Lee, T. B.
Cataloguing has a future. - Audio disc
(Spoken word). - Donated by the author.
1. Metadata
In the beginning ...
... the catalogue card
Author: Lee, T. B.
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Subject: Metadata
Provenance: Donated by the author
From flat-file record ...
... to relational record
Name authority
Name:
Biography:
...
Subject authority
Term:
Definition:
...
Bibliographic description
Author: Lee, T. B.
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Subject: Metadata
Provenance: Donated by the author
From flat-file description ...
... to FRBR record
Name authority
Name:
Biography:
...
Subject authority
Term:
Definition:
...
Bibliographic description
[Diagram: the description is split across the FRBR entities Item, Manifestation, Expression and Work. Labelled values: Author: Lee, T. B.; Content type: Spoken word; Subject: Metadata.]
From FRBR record ...
... to extinction!
[Diagram: the record dissolves into linked fragments held in different places.]
Name authority: Name:
Subject authority: Term:
RDA content type: Term:
RDA carrier type: Term:
FRBR entities: Item, Manifestation, Expression, Work
Provenance: Donated by the author
Subject:
Author:
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Donor:
Title:
Amazon/Publisher
Where is the record?
Implicit, not explicit
Everywhere and nowhere
A Semantic Web will allow machines to create the record just-in-time
We will not have to maintain records just-in-case
The user will have control over the presentation
I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
And by avoiding duplication, we can all get on with describing new stuff ...
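The "just-in-time" record can be illustrated with a toy example: statements about the same resource are held by different sources, and a display record is assembled only when requested. The values are those on the example card; the `ex:` identifiers, the source names and the function names are hypothetical, and a real system would dereference URIs on the Web rather than read from an in-memory dictionary.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str, str]  # (subject, predicate, object)

# Each source holds only its own statements; nobody stores the whole record.
SOURCES: Dict[str, List[Statement]] = {
    "name authority": [("ex:person/1", "label", "Lee, T. B.")],
    "subject authority": [("ex:subject/1", "label", "Metadata")],
    "RDA content type": [("ex:content/1", "label", "Spoken word")],
    "RDA carrier type": [("ex:carrier/1", "label", "Audio disc")],
    "library": [
        ("ex:resource/1", "Title", "Cataloguing has a future"),
        ("ex:resource/1", "Author", "ex:person/1"),
        ("ex:resource/1", "Subject", "ex:subject/1"),
        ("ex:resource/1", "Content type", "ex:content/1"),
        ("ex:resource/1", "Carrier type", "ex:carrier/1"),
        ("ex:resource/1", "Provenance", "Donated by the author"),
    ],
}


def label_for(value: str, statements: List[Statement]) -> str:
    """Replace an identifier with its label if some source provides one."""
    for subject, predicate, obj in statements:
        if subject == value and predicate == "label":
            return obj
    return value  # already a literal, or no label available


def build_record(resource: str, sources: Dict[str, List[Statement]]) -> Dict[str, str]:
    """Assemble a display record for one resource from all sources, on demand."""
    statements = [s for source in sources.values() for s in source]
    return {
        predicate: label_for(obj, statements)
        for subject, predicate, obj in statements
        if subject == resource
    }


print(build_record("ex:resource/1", SOURCES))
```

If the name authority corrects its label, every record built afterwards picks up the change without anyone editing a stored copy.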
The hyperdimensional (Tardis) card
Lee, T. B.
Cataloguing has a future. - Audio disc
(Spoken word). - Donated by the author.
1. Metadata
Audio shop
Lee Museum
Spoken word archive
W3C Library
“TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk” – Pocket-lint
Linking communities
[Diagram: linked pairs of standards: FRBRoo and CRM; ISBD and FRBR; RDA and MARC; RDA and DC; RDA and FRBR; RDA and ONIX; FRBRoo and FRBR]
Everything is connected
[Diagram: a single network connecting FRBRoo, CRM, FRBR, RDA, ONIX, DC, MARC and ISBD]
… at the community (human) and technical (Semantic Web) levels