how to do things with metadata: from rights statements to speech acts
HOW TO DO THINGS WITH METADATA
From Rights Statements to Speech Acts.
R. J. URBAN
Metadata Semantics
Knowledge Organization
• Colloquial
• Informal, document-like representation structures.
• Metadata elements conform to a complex set of rules.
• Supports information retrieval.
• Creates descriptions that identify resources.
Knowledge Representation
• Formal
• Grounded in formal theories of First Order Logic
• Metadata assertions are true/false within an interpretation.
• Available to support formal reasoning.
• Relies on names to identify resources.
Metadata Pragmatics?
• Semantics: what do words (signifiers, etc.) mean?
– Colloquial MARC/XML semantics
– Formal semantics for knowledge representation.
• Resource Description Framework (RDF)
• Pragmatics: how does context contribute to meaning?
– As humans we regularly interpret the meaning of metadata successfully, even though it may not be formally represented or machine understandable.
So what?
Dublin Core Rights
Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.
So what?
DC Terms class: rightsStatement
A statement about the intellectual property rights (IPR) held in or over a Resource, a legal document giving official permission to do something with a resource, or a statement about access rights.
DPLA Rights Statements
87,000 unique values
Research Questions
• What are organizations trying to do with rights statements?
• How are rights statements a kind of speech act?
Europeana Context
“The [Task Force on Metadata Quality] noted that many data providers approach rights statements as an afterthought and lack sufficient know-how to apply the appropriate statement. They therefore choose a restrictive rights statement as a default.”
http://goo.gl/lHaeMX
Europeana Rights Statements
– Public Domain
– In Copyright
• Various Creative Commons Licenses
• Rights Reserved – Free Access
• Rights Reserved – Paid Access
– Orphan Works
– Unknown
– http://pro.europeana.eu/page/available-rights-statements
International Standardized Rights Statements
• Europeana + Digital Public Library of America (DPLA)
• http://rightsstatements.org/
• SKOS vocabulary representing 10 rights statement classifications.
Problem
Metadata Quality Frameworks
Schlosser’s Memes
• Specific Ownership statements: “copyright [organization]”
• Vague Ownership statements: “copyright retained by the original owner”
• What you can/can’t do: “we encourage fair use of copyrighted material”
• Protecting Ourselves and You: “no information on the rights in the collection, researchers are responsible for determining copyright.”
Schlosser, M. (2009). Unless otherwise indicated: A survey of copyright statements on digital library collections. College & Research Libraries, 70(4), 371–385. http://doi.org/10.5860/crl.70.4.371
Speech Acts
J.L. Austin
• Maybe not all “statements” are true/false
• Performatives
• Questions
• Commands
• Promises
• Oaths
• Declarations
• locutionary vs. illocutionary meaning
Searle’s Speech Act Theory
• Illocutionary force =
– Illocutionary point (purpose of statement) +
– Direction of fit (relation of utterance to the world) +
– Speaker intention (psychological state)
Searle (1979) Speech Act Taxonomy
Category | Description | Direction of fit
Assertive | Utterances that commit the speaker to the truth of the expressed proposition. “The cat is on the mat.” | Words-to-world
Commissive | Utterances that commit the speaker to some future action. “I shall faithfully uphold the office of the president….” | World-to-words
Declarations | Utterances that bring about some change in the world. “I now pronounce you man and wife.” | World-to-words and words-to-world
Directives | Utterances that consist of an attempt by the speaker to get the hearer to do something. “Please pass the salt.” | World-to-words
Expressives | Express the speaker’s emotional or psychological attitude towards a statement. “I believe that…” | No direction of fit.
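For anyone coding statements against this taxonomy, the table can be kept as a small lookup structure. The following is only a sketch in Python; the names `SpeechAct`, `DIRECTION_OF_FIT`, and `direction_of_fit` are illustrative assumptions and were not part of the study.

```python
from enum import Enum


class SpeechAct(Enum):
    """Searle's (1979) five categories, as listed in the taxonomy table."""
    ASSERTIVE = "assertive"
    COMMISSIVE = "commissive"
    DECLARATION = "declaration"
    DIRECTIVE = "directive"
    EXPRESSIVE = "expressive"


# Direction of fit for each category, copied from the table.
DIRECTION_OF_FIT = {
    SpeechAct.ASSERTIVE: "words-to-world",
    SpeechAct.COMMISSIVE: "world-to-words",
    SpeechAct.DECLARATION: "world-to-words and words-to-world",
    SpeechAct.DIRECTIVE: "world-to-words",
    SpeechAct.EXPRESSIVE: None,  # expressives have no direction of fit
}


def direction_of_fit(act: SpeechAct) -> str:
    """Return the direction of fit as text, spelling out the empty case."""
    fit = DIRECTION_OF_FIT[act]
    return fit if fit is not None else "no direction of fit"
```

Using an enumeration rather than free-text labels keeps coded excerpts from drifting into spelling variants of the same category.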
Method: Sample
• 87,610 unique values found in aggregated DPLA metadata as dc:rights.
– Frequency counts of associated records.
• Drop statements associated with fewer than 100 records. (n=86,482)
– Of these, 78,191 only associated with one record.
• Result: 1295 statements
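The sampling step (count records per distinct dc:rights value, then drop the rare values) can be sketched in a few lines of Python. This is an illustration under assumptions, not the tooling used in the study: the function name `frequent_statements` and the toy records are made up, and only the threshold of 100 comes from the slide.

```python
from collections import Counter


def frequent_statements(rights_values, min_records=100):
    """Count records per distinct dc:rights value and keep the frequent ones.

    rights_values: iterable holding one dc:rights string per record.
    Returns a dict of statement -> record count for statements attached
    to at least `min_records` records.
    """
    counts = Counter(rights_values)
    return {stmt: n for stmt, n in counts.items() if n >= min_records}


# Toy data, invented for illustration (not DPLA data).
records = ["all rights reserved"] * 3 + ["copyright 1999"] + ["public domain"] * 2
kept = frequent_statements(records, min_records=2)
print(kept)
```

With the default threshold, `frequent_statements(records)` returns an empty dict for this toy data, since no value reaches 100 records.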
Method: Cleanup
• http://openrefine.org
• Make all statements lowercase.
• Remove extra whitespace.
• Remove uniquing features:
– DOIs
– Copyright [date]
– Gift of/donated by [name]
– Cite as [citation string]
• Result: 488 statements
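The cleanup was done in OpenRefine; the same idea can be sketched in Python for readers who want to reproduce it in code. The regular expressions below are assumptions (the deck does not give the actual expressions), and only two of the four uniquing features, DOIs and copyright dates, are handled. Names and citation strings would need more than a regular expression.

```python
import re

# Assumed patterns; the original OpenRefine transformations are not given.
DOI_RE = re.compile(r"(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?10\.\d{4,9}/\S+")
YEAR_RE = re.compile(r"\bcopyright \d{4}\b")


def normalize(statement):
    """Lowercase, collapse whitespace, and mask DOIs and copyright years."""
    text = statement.lower()
    text = " ".join(text.split())  # strips ends and collapses inner runs
    text = DOI_RE.sub("[doi]", text)
    text = YEAR_RE.sub("copyright [date]", text)
    return text


# Two raw values that differ only in case, spacing, and year (invented).
raw = ["Copyright 1998   Georgia Archives", "copyright 2004 georgia archives"]
unique = {normalize(s) for s in raw}
print(unique)
```

Masking the variable part with a bracketed placeholder, rather than deleting it, mirrors the “copyright [date]” notation used on the slides and keeps the statement readable after deduplication.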
Method: statement analysis
• 488 statements
– Qualitative coding according to Searle’s Taxonomy of Speech Acts.
• What is the proper unit of analysis?
– 603 coded excerpts. (These are not necessarily sentences, especially for long complex directives.)
Assertives (n=199)
• Schlosser’s ownership statements.
• “all rights reserved”
• “copyright [copyright holder]”
• “copyright [date]”
• “This work is in the public domain”
• “No known copyright restrictions”
• “Purchased with Smithsonian Trust funds”
Directives (n=272)
• Schlosser’s “What you can/can’t do” and “Protecting Ourselves and You.”
• “contact the host institution for more information”
• “users may download the images for personal or educational use - students may include images in reports, for instance, and teachers may use the images in the classroom - if the following credit line is included with the image: courtesy of the georgia archives.”
• “to purchase copies of images and/or for copyright information, contact university of [x]”
• “this item may be subject to copyright”
• “this image available for use only with the expressed, written consent of the [x] historical society”
Commissive (n=1)
• i hereby certify that, if appropriate, i have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, and specifically allowing distribution as specified below. i certify that the version i submitted is the same as that approved by my advisory committee. i hereby grant to brigham young university and its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. i retain all other ownership rights to the copyright of the thesis, dissertation, or project report. i also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.;
Expressives (n=1)
• the new york public library is interested in learning more about items you've seen on our websites or elsewhere online. if you have any more information about an item or its copyright status, we want to hear from you.
Rights statement patterns
Assertive + directive.
Copyright [date], [organization]. For more information contact [email protected]
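To make the assertive + directive pattern concrete, here is a naive rule-based sketch in Python that splits a statement into excerpts and tags each one. It is not the method used in the study (coding was qualitative and done by hand) and it is not Cohen's Ciranda. The cue lists, the function name `tag_excerpts`, and the example organization and email address are all hypothetical, chosen only from the kinds of phrases quoted in this deck.

```python
import re

# Hypothetical cue phrases, loosely based on examples quoted in the deck.
DIRECTIVE_CUES = ("contact", "may ", "must", "please", "only with", "responsible for")
ASSERTIVE_CUES = ("copyright", "all rights reserved", "public domain", "no known")


def tag_excerpts(statement):
    """Split a rights statement at sentence ends and tag each excerpt."""
    parts = re.split(r"(?<=[.!?])\s+", statement)
    tagged = []
    for excerpt in (p.strip() for p in parts):
        if not excerpt:
            continue
        lowered = excerpt.lower()
        # Directive cues are checked first, so "may be subject to copyright"
        # is tagged as a directive, as it is in the deck's coding.
        if any(cue in lowered for cue in DIRECTIVE_CUES):
            tagged.append(("directive", excerpt))
        elif any(cue in lowered for cue in ASSERTIVE_CUES):
            tagged.append(("assertive", excerpt))
        else:
            tagged.append(("uncoded", excerpt))
    return tagged


# Invented example following the pattern on the slide.
example = ("Copyright 2010, Example Historical Society. "
           "For more information contact rights@example.org")
for label, excerpt in tag_excerpts(example):
    print(label, "|", excerpt)
```

Sentence splitting is a weak unit of analysis here: long directives such as the Georgia Archives example contain several permissions inside one sentence, and a cue-word approach would not separate them.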
Non-Speech Acts (n=130)
• Unexpected discovery!
• “university of utah”
• [email protected]
• “watt hall 4d, usc, los angeles, ca 90089-0294”
• Are URLs speech acts?
– http://ex.org/rights.html
Non-speech acts in context
<http://ex.org/resource001> <dc:rights> “University of Utah”
• Still not really a “statement”
• Not a legal document.
• Not a permission.
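Some of these non-statement values can be flagged by surface form alone. The sketch below, in Python, is an assumption-laden illustration rather than part of the study: the function `value_kind`, the patterns, and the second and third sample values are made up. It separates URLs and email addresses from free text, but a bare name such as “University of Utah” still looks like text and needs human judgment or a name lookup.

```python
import re

# Deliberately loose patterns; they check shape, not validity.
URL_RE = re.compile(r"^https?://\S+$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def value_kind(rights_value):
    """Roughly classify a dc:rights literal by its surface form."""
    value = rights_value.strip()
    if URL_RE.match(value):
        return "url"
    if EMAIL_RE.match(value):
        return "email"
    return "text"


# Triples as plain (subject, predicate, object) tuples.
triples = [
    ("http://ex.org/resource001", "dc:rights", "University of Utah"),
    ("http://ex.org/resource002", "dc:rights", "http://ex.org/rights.html"),
    ("http://ex.org/resource003", "dc:rights", "someone@example.org"),
]
for subject, _, obj in triples:
    print(subject, value_kind(obj))
```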
So what?
• Europeana Rights statements really about Assertives.
– Creative Commons Licenses (maybe more directive, but still represented as an assertion that a license is available)
• Open Digital Rights Language (ODRL) https://www.w3.org/community/odrl/
– Policies
• Permissions• Constraints
– Intended to be actionable by a system, so very tightly defined.
– Mapping statements to ODRL may be difficult. Permissions/constraints often mixed in rights statement sentences.
– Not well supported in cultural heritage digital library software.
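The mapping difficulty is easier to see with one sentence taken apart by hand. The Python sketch below borrows the terms permission, constraint, and duty loosely from ODRL, but the `Permission` class is a hypothetical illustration and is not ODRL's data model or serialization. The sentence is the Georgia Archives directive quoted earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Permission:
    """Hypothetical container; not an ODRL class."""
    action: str                                       # what the user may do
    constraints: list = field(default_factory=list)   # conditions on the action
    duties: list = field(default_factory=list)        # obligations attached to it


# One free-text sentence yields an action, a purpose constraint, and a duty,
# all of which a system would need as separate, tightly defined values.
download = Permission(
    action="download",
    constraints=["purpose: personal or educational use"],
    duties=["include credit line: courtesy of the georgia archives"],
)
print(download)
```

The split was made by a human reader. Recovering the same three parts automatically from the original sentence is the hard part.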
Next steps
• Can text analysis help automatically assign International Standard Rights Metadata?
– Automatically recognize and separate different kinds of speech acts.
– Determine relationship to rights statements.
• Fork Cohen’s Ciranda (detects speech acts in emails)?
Parallel Research
• What do rights statements refer to?
• Same data set.
• Tagged according to indexical/referential statements.
– This collection
– This (digital image)
– The work
– Etc. etc.