WORKING WITH DATA ABOUT DATA: Introduction to Metadata | 10.03.2009 | Ms Dot Porter | slide 1
A project of the
Introduction to Metadata
Working with Data about Data
Dot Porter, DHO Metadata Manager
10 March 2009
What is metadata?
Metadata is… data about data!
(from the Greek preposition μετά, meaning “after” or “with”)
Basically, metadata is any kind of information that describes something else.
In the context of today’s workshop, metadata is descriptive information about digital resources:
• individual files
• collections of files (or: relationships among files)
• complete projects (or: relationships among files and collections)
Metadata may describe, for example:
• the content of a photograph
• the photograph itself
• the digital version of that same photograph
• the relationship between that photograph and other photographs or texts, etc.
We’ll come back to this!
Why metadata?
“Hardware and software come and go—sometimes becoming obsolete with alarming rapidity—but high-quality, standards-based, system-independent metadata can be used, reused, migrated, and disseminated in any number of ways, even in ways that we cannot anticipate at this moment.
Digitization does not equal access. The mere act of creating digital copies of collection materials does not make those materials findable, understandable, or utilizable to our ever-expanding audience of online users. But digitization combined with the creation of carefully crafted metadata can significantly enhance end-user access; and our users are the primary reason that we create digital resources.”
From the “Introduction” to Introduction to Metadata, Online Edition, Version 3.0 <http://www.getty.edu/research/conducting_research/standards/intrometadata>
Different ways of thinking about metadata
• Authoritative vs. user-created
• Different types of metadata to describe various aspects of the same thing
• Ontologies, taxonomies, vocabularies
• Metadata standards and formats
More about DHO recommendations this afternoon!
Authoritative metadata
• AKA ‘top-down’
• Created by project team
• Formalized; focus on control
• Specialists in (at least one aspect of) the field
• Focus and coverage will depend on the requirements of the project and repository
User-created metadata
• AKA ‘bottom-up’
• Social tagging
• May be open or within a community
• Less focused; what the “tagging public” sees
• Generally less structured, not prescriptive
7 April 2009: Introduction to Semantic Web
Types of metadata
• Descriptive: Facilitates discovery and describes intellectual content
• Administrative: Facilitates management of digital and analog resources
• Technical: Describes the technical aspects of the digital object
• Structural: Describes the relationships within a digital object
• Preservation: Supports long-term retention of the digital object and may overlap with technical, administrative, and structural metadata
From Best Practice Guidelines for Digital Collections at University of Maryland Libraries, edited by Susan Schreibman <http://www.lib.umd.edu/dcr/publications/best_practice.pdf>
Descriptive metadata
It is always necessary to differentiate between the description of…
• Content
• Source (if there is one!)
• Digital file/object
For born-digital objects, the digital object is the source.
Content:
• a painting
• a sculpture
• a text
• a building
Source:
• paper photograph of a painting or a building
• sketch of a sculpture
• a manuscript, containing a text
Digital file/object:
• scan or digital photo of a paper photograph
• scan or digital photo of a sketch of a sculpture
• scan or digital photo of a manuscript
• born-digital photo of a building
Note: when a microfilm of a manuscript is itself scanned, both manuscript and microfilm are “source”.
Note: a born-digital photo has no “source”; the digital image is taken directly.
Administrative metadata
• Facilitates management of files
• Describes the creation/derivation of files
  – Responsible individuals and institutions
  – Dates
  – Locations
• Technical specifications (e.g., file size, file format)
Structural metadata
• Describes/defines relationships between and among files
• AKA describing collections
• AKA describing projects
Identifying what collection or project a file belongs to
Identifying what files belong to which collection or project
Identifying what project a collection belongs to
Relationships are usually, but need not be, 1:1
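The three lookups listed above can be sketched in a few lines of Python. This is only an illustration: the file, collection, and project names are hypothetical, and the sketch assumes the simple case where each file belongs to exactly one collection and each collection to exactly one project.

```python
# Hypothetical identifiers; the slides do not prescribe any naming scheme.
# Assumption: each file has exactly one collection, each collection one project.
file_to_collection = {
    "photo_001.tif": "photographs",
    "photo_002.tif": "photographs",
    "letter_001.xml": "letters",
}
collection_to_project = {
    "photographs": "demo_project",
    "letters": "demo_project",
}


def collection_of(filename):
    """Identify what collection a file belongs to."""
    return file_to_collection[filename]


def files_in(collection):
    """Identify what files belong to a collection (sorted for stable output)."""
    return sorted(f for f, c in file_to_collection.items() if c == collection)


def project_of(collection):
    """Identify what project a collection belongs to."""
    return collection_to_project[collection]
```

If a file may sit in several collections, the dictionary values would need to become sets or lists rather than single names.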
Ontologies, taxonomies, controlled vocabularies
• Controlled vocabulary: a list of terms
• Taxonomy: a collection of controlled vocabulary terms organized into a hierarchical structure
• Ontology: a formal representation of a set of concepts within a domain, and the relationships between these concepts
Ontologies, taxonomies, controlled vocabularies
• Concepts separate from format
• Created by scholarly communities, learned bodies, projects, etc.
• Whenever possible, use what is available
Controlled vocabulary
• Controlled list of explicitly enumerated terms
• Unambiguous definitions for each term
1. If the same term is commonly used to mean different concepts in different contexts, then its name is explicitly qualified to resolve this ambiguity.
2. If multiple terms are used to mean the same thing, one of the terms is identified as the preferred term in the controlled vocabulary and the other terms are listed as synonyms or aliases.
Internet Assigned Numbers Authority Language Subtag Registry (http://www.iana.org/assignments/language-subtag-registry)
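As a rough sketch of the two rules above, the following Python keeps an explicit list of preferred terms, qualifies an ambiguous name ("plate"), and maps synonyms to the preferred term. The terms themselves are invented for illustration and are not taken from any published vocabulary.

```python
# Hypothetical vocabulary, for illustration only.
# Rule 1: the ambiguous name "plate" appears only in qualified forms.
PREFERRED_TERMS = {
    "photograph",
    "manuscript",
    "plate (printing)",
    "plate (tableware)",
}

# Rule 2: synonyms/aliases point to the single preferred term.
SYNONYMS = {
    "photo": "photograph",
    "ms": "manuscript",
}


def normalize(term):
    """Return the preferred term, or raise if the term is not in the vocabulary."""
    key = term.strip().lower()
    if key in PREFERRED_TERMS:
        return key
    if key in SYNONYMS:
        return SYNONYMS[key]
    raise ValueError(f"{term!r} is not in the controlled vocabulary")
```

Rejecting an unqualified "plate" instead of guessing is what keeps the list controlled: the cataloguer has to pick one of the qualified forms.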
Taxonomy
• Controlled vocabulary, hierarchical structure
• Terms in parent-child relationships with one another
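One minimal way to picture parent-child relationships is a table that records each term's parent. The sketch below uses invented terms and assumes a strict hierarchy in which every term has at most one parent.

```python
# Hypothetical taxonomy: each term maps to its parent (None marks the root).
PARENT = {
    "visual works": None,
    "photographs": "visual works",
    "paintings": "visual works",
    "portraits": "paintings",
}


def ancestors(term):
    """Walk upward from a term and list its parents, nearest first."""
    path = []
    parent = PARENT[term]
    while parent is not None:
        path.append(parent)
        parent = PARENT[parent]
    return path


def children(term):
    """List the direct children of a term in alphabetical order."""
    return sorted(t for t, p in PARENT.items() if p == term)
```

Here `ancestors("portraits")` gives the broader terms for a record tagged "portraits", which is the kind of lookup a hierarchy makes possible and a flat list does not.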
Ontology
Similar to a taxonomy (the two terms are sometimes used interchangeably); the difference is philosophical.
An ontology is developed to reason about a domain, and may be used to define a domain.
CIDOC Conceptual Reference Model (http://cidoc.ics.forth.gr/)
• Represents the concepts that make up a domain
• Controlled vocabulary
• Hierarchical
Metadata standards and formats
The first questions of metadata:
• What do we want to describe?
• How do we want to describe it?
Using accepted standards, expressed in widely-used or easily mapped formats, will ensure that our metadata is accessible.
Metadata standards
• Standards are widely-used (hence standard) prescriptive recommendations guiding
  – Defining fields: “name” “title” “identifier” “subject” “physicalDescription” “location”
  – Structure and hierarchy within the metadata itself
  – Controlled vocabularies
Metadata formats
• Extensible Markup Language (XML)
  – Allows for combining and interoperability
  – XML flexibility
• Any other conceivable format
  – MS Word? PDF? Post-it notes?
  – Excel, FileMaker Pro, Access DB, CSV
In some but not all cases, the semantics of metadata is separate from the format of metadata
Identifier: 0-89236-361-4
Creator: Howard Besser
Creator: Jennifer Trant
Title: Introduction to Imaging: Issues in Constructing an Image Database
Publisher: The Getty Art History Information Program
Date: 1995
Subject: Image processing--Digital techniques
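To illustrate semantics being separate from format, the sketch below writes the record above in two formats, XML and JSON, from one list of field/value pairs. The slide does not name a standard; treating the field names as Dublin Core elements (namespace `http://purl.org/dc/elements/1.1/`) is an assumption made here, as are the wrapper element name `metadata` and the JSON layout.

```python
import json
import xml.etree.ElementTree as ET

# Assumption: the field names are read as Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The semantics: field/value pairs, independent of any serialization.
RECORD = [
    ("identifier", "0-89236-361-4"),
    ("creator", "Howard Besser"),
    ("creator", "Jennifer Trant"),
    ("title", "Introduction to Imaging: Issues in Constructing an Image Database"),
    ("publisher", "The Getty Art History Information Program"),
    ("date", "1995"),
    ("subject", "Image processing--Digital techniques"),
]


def to_xml(record):
    """Serialize the pairs as namespaced XML elements under a wrapper element."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper name
    for field, value in record:
        ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")


def to_json(record):
    """Serialize the same pairs as JSON, grouping repeated fields into lists."""
    grouped = {}
    for field, value in record:
        grouped.setdefault(field, []).append(value)
    return json.dumps(grouped, indent=2)
```

Both outputs carry the same statements (two creators, one title, and so on); only the container differs.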
Metadata mapping
• Moving metadata from one standard/format to another standard/format
• Not always pretty…
• Important: base the design of your project metadata on an existing standard, and plan it out ahead of time!
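A mapping between standards is often written down as a crosswalk table. The sketch below is a minimal illustration: the local field names are hypothetical, and the target names are assumed to be Dublin Core-style elements. Fields with no counterpart are returned separately, which is one place where mapping is "not always pretty".

```python
# Hypothetical crosswalk from a project's local fields to target field names.
CROSSWALK = {
    "photographer": "creator",
    "caption": "title",
    "taken_on": "date",
    "keywords": "subject",
}


def map_record(local_record, crosswalk=CROSSWALK):
    """Return (mapped, unmapped): mapped uses target names, unmapped keeps leftovers."""
    mapped, unmapped = {}, {}
    for field, value in local_record.items():
        target = crosswalk.get(field)
        if target is None:
            # No equivalent in the target standard: keep it visible, do not drop it.
            unmapped[field] = value
        else:
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped
```

A project whose fields were designed around an existing standard from the start ends up with an empty `unmapped` dictionary, which is the practical payoff of planning ahead.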
How much is enough?
Just because it’s there doesn’t mean you have to use it!
Standards are great!
Open Access (not magic, a bit scary, but very useful)
ROBUST
YOUR NEEDS ARE NOT NECESSARILY EVERYONE ELSE’S – AND THAT IS OKAY!