Final paper for INFO653: Digital Libraries
Metadata in Practice:
Issues of Interoperability, Quality, and Standardization
Melissa Ormond
Drexel University
2010‐2011 Summer
Dr. Xia Lin
ABSTRACT
This paper examines the importance of metadata both in a digital library and as part of the World
Wide Web. Special focus is placed on the goals of metadata, detailed versus simple digital identification,
and embedded metadata and information retrieval. I will examine both metadata scheme
standardization such as Dublin Core, as well as the importance of interoperability and application
profiles. I will evaluate multiple metadata schemes such as Dublin Core in the World Wide Web,
automatic metadata generation such as that of the NSDL and OAI‐PMH, and author generated metadata
such as that found in DSpace. I will look at the issue of digital archiving and preservation and the
importance of metadata in ensuring both. Finally, I will examine the issue of metadata evaluation and
metrics for measurement. As technology continues to evolve, how these
resources are described and managed will take on a new importance. “The rapid changes in the means
of information access occasioned by the emergence of the World Wide Web have spawned an upheaval
in the means of describing and managing information resources. Metadata is a primary tool in this work,
and an important link in the value chain of knowledge economies” (Duval et al., 2002). The future
mission of LIS professionals will be to determine the best means by which this metadata can be
integrated.
1. INTRODUCTION
In order to fully understand both the nature of metadata and its vital importance, one must first
understand the environment in which metadata plays its most important role – the digital library. The
Dictionary for Library and Information Sciences (2004) states that a digital library is, “a library in which a
significant proportion of the resources are available in machine‐readable format (as opposed to print or
microform), accessible by means of computers…the digital content may be locally held or accessed
remotely via computer networks” (Reitz, 2004). While this definition may be suitable to the
generalization of a digital library, it gives no direction as to the function or mission of a digital library.
Candela et al., (2007) argue that a digital library can “represent the meeting point of many disciplines
and fields, including data management, information retrieval, library sciences, document management,
information systems, the web, image processing, artificial intelligence, human‐computer interaction,
and digital curation” (Candela et al., 2007). In their article, “What Is a Digital Library Anymore, Anyway?”
Lagoze et al. (2005) argue that a digital library is more than just accessibility of many disciplines. “There
seems to be a belief that a digital library is just about search and access. These functions are indeed
essential (and remain challenging), but they are just part of an information environment” (Lagoze et al., 2005).
They went on to provide an expanded view of a digital library, one that not only provides a
collection of materials relevant to the library’s mission, but furthermore provides the medium to be
both collaborative and contextual. The digital library must move past the “search and access” function
and instead work to “create a rich, asynchronous workplace in which information is shared, aggregated,
manipulated, and refined” (Lagoze et al., 2005).
2. DIGITAL V. TRADITIONAL LIBRARIES
Helping users make a conceptual association is one of the most important, yet difficult, services
of a digital library. In a physical library, the spatial arrangement of items, such as books, conveys
meaning, e.g., items associated by subject. It is commonplace to visit a traditional library and browse a
particular subject area to locate similar materials. Physical library metadata, in the form of indexes such
as the Dewey decimal classification and classification schemes such as that of the Library of Congress,
ensure spatial arrangement of these physical materials. In fact, the spatial arrangement can itself be
viewed as a form of metadata (Nurnberg et al., 1995). In digital libraries, the issues surrounding
spatial arrangement must be resolved for a fully digital environment. How can the
organization afforded by physical library metadata be translated into a digital environment where there
are no shelves to arrange related materials? This is where the use of digital metadata plays a vital role.
“While spatial arrangement of library materials is a physical library metadata element with physical
presence, other metadata with no direct physical reality must also be translated, or adapted in its
application, if it is to be used in a digital library” (Nurnberg et al., 1995).
However, this issue of spatial arrangement is not the only issue central to a digital library, and
not the only issue that is influenced by metadata. Digital libraries must also address how to digitize
items and store them online, how to include new forms of media such as images, audio, and video,
how to locate information, when to use existing technologies, and how to deal with information
overload (Nurnberg et al., 1995). While digital libraries share both some of the same issues and goals of
a physical library, they have the extra burden of determining how to “adapt the tradition of the physical
library into the digital realm” (Nurnberg et al., 1995). Besides those listed above, a central problem with
a digital library translation is the ideal of archiving and preservation. “If physical libraries primarily
contain physical data and digital libraries primarily contain digital data, then how can digital libraries
preserve and disseminate the vast amounts of existing physical data?” (Nurnberg et al., 1995). According
to Nurnberg et al., the answer is fairly straightforward. Since digital libraries cannot contain the physical
content they can instead “contain digital translation of this data” (Nurnberg et al., 1995). These succinct
surrogate records of digital data can be found in the form of digital metadata.
3. WHAT IS METADATA
Before we begin discussing the implications of metadata in the digital library environment, we
must first answer the fundamental question: what is metadata? Metadata is data that is both
machine-readable and descriptive. Digital metadata serves many purposes, including resource
discovery, management, delivery, and preservation, to name just a few. Basically, metadata is data about
data. It is information that describes content. Metadata is a traditional way, along with subject indexing
and classification, to connect subjects, topics, and documents. The purpose of metadata is to allow users
to easily find the information that they need. Metadata is needed to improve retrievability of electronic
resources and to provide sufficient and appropriate description of their content so that users can choose
among different resources that might appear similar.
In order to improve retrievability of resources, metadata must first efficiently describe the
digital resource in question. This is done in three distinct ways: first, the metadata must describe the
content of the resource, primarily what the resource is about. Secondly, the metadata must describe the
context of the resource. This includes essential questions such as who, what, where, why, and how; any
aspect associated with the creation of the resource. And finally, the metadata must describe the
structure of the resource, including form (Gill et al., 2008). The issue of retrievability comes to light once
an item has been properly described using metadata. An item with effective metadata can be properly
organized for later retrieval. However, the question must be asked: what makes effective metadata? Is
metadata only effective when all of the possible metadata element fields are completed?
3.1. DETAILED V. SIMPLE
To begin to answer the above question, let’s start with an analysis of detailed metadata
descriptions v. simple metadata descriptions. Detailed metadata descriptions, while they may improve
information retrieval, require not only trained staff to assign the metadata, but increased cost
associated with the metadata development. Detailed metadata can also lead to issues with consistent
and standard metadata; as a rule, it is typically easier to standardize simple metadata. Besides ease in
consistency, simple metadata is both less costly and provides a greater probability of interoperability.
However, simple metadata is not without its flaws, the greatest being an increased chance of false
results during information retrieval, due to less detailed and specific search parameters. Whether
simple or detailed, in the end, “the richness of metadata descriptions will be determined by policies and
best practices designated by the agency creating the metadata” (Duval et al., 2002). Designing some
form of ideal metadata, that which is standardized and allows for greater interoperability, requires a
high degree of flexibility on the part of the agency creating the metadata. However, if the history of
metadata has taught us anything, it is that there is no easy answer, or solution, to the issue of digital
content and information retrieval.
3.2 ISSUES WITH METADATA
Early forms of metadata such as MARC (Machine-Readable Cataloging) and AACR2 still
predominate in the physical library world; in the digital library world, however,
embedded metadata has become the new standard. The Dublin Core Element Set (DC) was developed in
1995 as a means to improve search engine indexing by embedding metadata elements into
web pages or encoding through the use of XML (Huddleston, 2008). The DC consists of 15 core
elements, including contributor, creator, date, and subject. Although the Dublin Core elements seem
simple enough, they are in reality very complex in nature. According to Arms (2000), “simplicity is both
the strength and the weakness of the Dublin Core. Whereas traditional cataloging rules are long and
complicated, requiring professional training to apply effectively, the Dublin Core can be described
simply. However, simplicity conflicts with precision” (Arms, 2000, Chapter 10). The perceived simplicity
of the core elements leads many to believe that DC metadata can simply be created by untrained staff,
and in fact this minimalist view was initially supported by the scheme’s designers. “The initial aim [of the Dublin
Core Element Set] was to create a single set of metadata elements for untrained people who publish
electronic materials to use in describing their work” (Arms, 2000, Chapter 10).
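As a rough illustration of this embedding (the element values here are invented for this example), Dublin Core elements can be placed in an HTML page head using the DCMI meta-tag convention:

```html
<head>
  <!-- Link the DC prefix to the Dublin Core element set -->
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
  <!-- Each DC element becomes one meta tag; values below are hypothetical -->
  <meta name="DC.title" content="A Sample Web Document">
  <meta name="DC.creator" content="Doe, Jane">
  <meta name="DC.date" content="2011-06-01">
  <meta name="DC.subject" content="metadata; digital libraries">
</head>
```

Even an untrained author can fill in a handful of such tags, which is precisely the simplicity (and, as discussed below, the imprecision) that the Dublin Core trades on.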
While this minimalist view is still held by many, some professionals feel that a more structured
metadata, one that is managed through controlled cataloging rules, is the better option. What a more
structured metadata can provide is increased standardization, precision, and interoperability, but at
what cost? Not only would the structured form be more complex in nature, therefore requiring greater
staff training, but the standardization process would be both time consuming and costly. As an example,
the IEEE‐LOM standard metadata took five years between the initial specification and the finished
product; five years of meetings, testing, and evaluations (Duval et al., 2002). Arms (2000) argues for a
proposed strategy that does not favor one option over another, but rather utilizes both. “The minimalist
option will meet the original criterion of being usable by people who have no formal training…the
structured option will be more complex, requiring fuller guidelines and a trained staff” (Arms, 2000,
Chapter 10). While this proposed strategy does seem optimal, there are still many issues to consider,
including standardized adoption of policies and cataloging rules, as well as how to manage the addition
of new elements.
One way to manage this addition is through the use of application profiles. “An
application profile is an assemblage of metadata elements selected from one or more metadata
schemas and combined in a compound schema. Application profiles provide the means to express
principles of modularity and extensibility” (Duval et al., 2002). To put this in layman’s terms, the
argument claims that it is impossible to have one single metadata format (with elements) that could
accommodate all digital library applications. What an application profile does is allow “designers to ‘mix
and match’ schemas as appropriate…while retaining interoperability with the original base schemas”
(Duval et al., 2002). In order to achieve this flexibility, application profiles must enforce constraints
on which elements may appear while defining the interrelationships between the constrained elements (Duval
et al., 2002). Ultimately, the main goal of application profiles is to increase
interoperability between metadata standards.
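A minimal sketch of this “mix and match” principle (the local namespace and its gradeLevel element are hypothetical, invented for illustration) shows how a record governed by an application profile might combine Dublin Core elements with an element drawn from a second, domain-specific schema:

```xml
<!-- Hypothetical record combining two schemas under one application profile -->
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:local="http://example.org/terms/">
  <dc:title>Introductory Chemistry Lab Manual</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <!-- Element borrowed from a hypothetical local schema -->
  <local:gradeLevel>undergraduate</local:gradeLevel>
</record>
```

Because each element retains its source namespace, the record remains interpretable by systems that understand only the base Dublin Core schema, which is the interoperability the profile is meant to preserve.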
4. METADATA INTEROPERABILITY
Before continuing, let us first define what is meant by metadata interoperability. According to
Duval (2001), interoperability is “enabling information that originates in one context to be used in
another in ways that are as highly automated as possible” (Duval, 2001). In regards to metadata records,
this refers to mechanisms that enable two or more metadata schemes to “cross boundaries of context”
(Duval, 2001). To end users, metadata interoperability allows the searcher to cross search among
multiple repositories whose metadata can ‘speak to each other,’ thereby “preventing end users
from being locked into proprietary systems” (Duval, 2001). While the benefit of metadata
interoperability seems straightforward, achieving this interoperability is not so simple. As long as digital
libraries continue to act as standalone repositories, with their own mission, policies, and metadata
standards, how could LIS professionals ever hope to achieve metadata interoperability? While the goal
of application profiles is to alleviate some of the issues surrounding interoperability, this is not always
the case. How can we move beyond one single metadata standard without compromising
interoperability?
According to Manduca et al. (2006), whose article focused on educational digital libraries, achieving
an integrated network “requires knowledge of the breadth of resources and the breadth of user
communities” (Manduca et al., 2006). A good example of this focus on user community can be seen
through the IEEE LOM data model (Learning Object Metadata). While some professionals were focused
on the Dublin Core and W3C’s development of the Resource Description Framework (1999), a standard
for the semantic web that allows software to navigate Web content to link data, the learning community
was busy developing its own standards (Duval et al., 2002). The IEEE LOM, while similar to the Dublin
Core, contains its own elements and syntax. The basic IEEE LOM Metadata Scheme contains a hierarchy
of elements. The first level of the hierarchy consists of nine categories, including lifecycle, technical, and
classification. Each of the nine categories contains sub‐elements, a hierarchy of attributes, or data
elements, that further describe a learning object. Because of the complexity of the IEEE LOM
metadata scheme, it is necessary to utilize the services of a metadata expert, although the standard
itself lacks any definition as to whom or what is responsible for creating the metadata attributes.
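An abbreviated sketch of such a record, loosely following the shape of the IEEE LOM XML binding, might look like the following (the element values are invented, only three of the nine categories are shown, and the full standard defines many more sub-elements than appear here):

```xml
<!-- Abbreviated, hypothetical IEEE LOM record -->
<lom xmlns="http://ltsc.ieee.org/xsd/LOM">
  <general>
    <title><string language="en">Introduction to Photosynthesis</string></title>
  </general>
  <technical>
    <!-- MIME type of the learning object -->
    <format>text/html</format>
  </technical>
  <classification>
    <purpose><value>discipline</value></purpose>
  </classification>
</lom>
```

Even this thin slice hints at why a metadata expert is needed: each category nests further attributes, many with their own controlled vocabularies.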
Returning to the issue of interoperability: with the development of multiple different metadata
standards and repositories, such as the IEEE LOM, a need arose in the LIS community to find
ways in which to aggregate metadata from different repositories into one place that could be easily
searched (Ward, 2003). Out of that complex need came the development of the OAI‐PMH (Open
Archives Initiative Protocol for Metadata Harvesting). The function of the OAI-PMH is twofold: first,
administrators of digital libraries participating in the protocol (data providers) must expose Dublin Core
metadata via OAI‐PMH. Secondly, service providers take the exposed metadata and harvest the data
(Ward, 2003). This twofold action is meant to facilitate “cross‐domain resource discovery and digital
library interoperability” (Ward, 2003). What service providers discovered during the harvesting process
was that most data providers used only a select few of the Dublin Core elements, such as ‘creator’ and
‘identifiers,’ and that this lack of standardization on the part of the data provider resulted in greater
time and costs on the part of the service provider to subsequently fill in the blanks (Ward, 2003).
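To make the protocol concrete (the repository address, identifier, and record content below are hypothetical, and the response envelope is abbreviated), a service provider issues a simple HTTP request such as `http://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc` and receives Dublin Core records wrapped in an OAI-PMH XML envelope:

```xml
<!-- Abbreviated, hypothetical OAI-PMH ListRecords response -->
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-123</identifier>
        <datestamp>2011-06-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Resource</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
```

A record like this one, carrying only a title and creator, illustrates exactly the sparseness Ward describes: the envelope is valid, but most of the fifteen Dublin Core elements are simply absent.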
One such digital repository that has benefited from the use of OAI‐PMH is the National Science
Digital Library (NSDL), a national network of digital libraries focused on science, technology, engineering,
and mathematics. The NSDL is funded by the National Science Foundation (NSF) and its metadata
repository is based on the Dublin Core. “Dublin Core metadata records, which include URLs to the
corresponding digital resources, are ingested into the MR via OAI‐PMH. As part of the ingest process,
these records are processed to normalize dates and various controlled vocabulary elements” (Lagoze et
al., 2005). Based on the complexity of the NSDL, it became obvious that utilizing a single standardized metadata
scheme would be virtually impossible. Because of this, the NSDL accommodates a wide variety of
metadata standards, including “a broad spectrum of metadata quality, anticipating a wide variety of
errors or inconsistencies” (Hillmann et al., 2004). However, the acceptance of inconsistencies is not the
only issue plaguing the NSDL. Further issues that became apparent from the metadata harvest included
a wide variety of data failures including missing data (blank elements), incorrect or confusing data, and a
lack of controlled vocabulary (Hillmann et al., 2004). While the use of Dublin Core and OAI‐PMH has
allowed the NSDL to provide “basic digital library services, [it] has also revealed a number of
implementation problems. The most outstanding of these relate to metadata quality and OAI‐PMH
validity, especially XML‐schema compliance” (Lagoze et al., 2005).
5. HOW TO DETERMINE QUALITY
In order for metadata to be truly effective, it must be of the highest quality; however, how does
one evaluate what constitutes high quality metadata? In their 2004 article, “The Continuum of Metadata
Quality: Defining, Expressing, Exploiting,” Bruce and Hillmann argue the difficulty, if not impossibility, of
defining standard metrics for determining metadata quality. What they do suggest is that an
examination “of the most commonly recognized characteristics of quality metadata: completeness,
accuracy, provenance, conformance to expectations, logical consistency and coherence, timeliness, and
accessibility,” can begin to help determine the necessary metrics for metadata quality (Bruce & Hillmann,
2004). While I agree with their argument, I would argue that metadata quality evaluation must start at
the beginning, with the creation of metadata. It has often been argued that most metadata today is
created by those with little to no metadata training or experience. As previously argued, the superficial
simplicity of the Dublin Core, while serving as the motivation for its development, can also serve as its
downfall. The minimalist view has led many to believe that Dublin Core metadata can be assigned by
novices, with no formalized training. Rob Huddleston’s 2008 HTML, XHTML, and CSS, a training guide for
those taking an introductory web design class, devotes only four pages to web site meta elements, such
as those of the Dublin Core. The instructions include adding two main meta elements, keywords and
description, but give little, if any, explanation of the function of metadata or of the restrictions on the
format of the two elements. In fact, the section clearly points out that since sites like Google ignore
metadata the addition of this information to the header of your website may not be worth the effort
(Huddleston, 2008). This lack of metadata importance is all the more obvious when one searches
Google.
5.1 METADATA QUALITY IN THE WORLD WIDE WEB
As an example, I performed a simple Google search for ‘Tudor History,’ which resulted in
24,000,000 results. The first result was TudorHistory.org, a comprehensive site on the Tudor monarchs,
including photos, original resources, and timelines. A quick look at the page’s source code shows
complete lack of standard metadata in the form of Dublin Core:
<HEAD>
<meta http-equiv="content-type" content="text/html;charset=ISO-8859-1">
<META NAME="GENERATOR" CONTENT="Adobe PageMill 3.0 Mac">
<TITLE>TudorHistory.org</TITLE>
<meta name="verify-v1" content="2wVKaBktTtl0TaXXJCLtl5vYgVnF6pQKA5FqOOgvXlQ=">
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">_uacct = "UA-75588-1";urchinTracker();</script>
</HEAD>
Out of the first five results provided by Google, none of the sites contained any form of Dublin Core
metadata. Results from this search can be found in Appendix A.
If search sites such as Google ignore metadata elements and web designers for the most part
chose not to include these elements in the headings of their web pages, then how do we evaluate
metadata that seems to be ineffective? Or better yet, how do we convince the community, including
web designers and information professionals, that quality metadata is necessary? I would argue that the
first step would be showcasing the effectiveness of metadata through its ability to benefit information
retrieval, through digital identification and resource discovery, and its ability to provide the means for
digital archiving and preservation. Once a strong argument has been made as to the effectiveness of
metadata you can begin discussing metadata quality. According to Lagoze et al., in their 2006 article on
metadata aggregation, three skill sets are required in order to provide quality
metadata: domain experience, metadata experience, and technical experience (Lagoze et al., 2006).
While specialized training is both costly and time consuming, the added benefit seems worth the cost. If
there is a widespread acceptance of a standard for high quality metadata in the form of Dublin Core (or
another standard in the future) and this standard is adhered to, then some of the issues surrounding
metadata interoperability can also be resolved.
5.2 METADATA QUALITY IN THE NSDL
While the Dublin Core and OAI‐PMH address the important goal of interoperability, this goal has
often been threatened due to the lack of high quality metadata (Lagoze et al., 2005). An example of this
can be found in the NSDL. Although the NSDL has decided not to adhere to one metadata standard, such
as the Dublin Core, and instead accommodates a wide variety of metadata schemes, this does not protect
the NSDL from the same issues of metadata quality faced by those digital libraries that do adhere to the
Dublin Core. Hillmann et al. (2004) argue that issues such as missing or incomplete data and lack of
controlled vocabulary require NSDL staff to perform ‘safe transformations’ to a large percentage of their
data in order to allow for interoperability among the systems (Hillmann et al., 2004). These ‘safe
transformations’ included removing elements with no data, identifying possible controlled vocabulary,
and normalizing the metadata presentation (Hillmann et al., 2004). These implementation fixes required
both time and money and worked against the goal of the NSDL, which was to automate metadata
harvest and flow with minimal human intervention (Hillmann et al., 2004).
Besides missing data and a lack of controlled vocabulary, NSDL came across a different issue
during metadata harvest, namely that when not adhering to one standard metadata scheme it is very
possible that “each metadata provider may be using different criteria to assign levels of interactivity,”
meaning the subjectivity used to value metadata, based on the elements with assigned data, may be
different for each metadata provider (Hillmann et al., 2004). In this case, how can there ever be an
element of consistency during the harvest without human intervention in order to improve the quality
of the metadata? Even more so, this issue of subjectivity can be used when arguing how to evaluate the
quality of metadata. If the purpose of metadata is to aid end users in their search and retrieval, then the
context of the metadata elements should be measured to determine whether they meet this goal.
However, in the case of the NSDL, which harvests metadata adhering to a number of different schemes,
“different services require different kinds of metadata, perhaps tailored for different purposes, or with
different confidence ratings” (Hillmann et al., 2004). If each metadata provider bases their metadata
schemes around their particular end users, then how can one adequately determine metadata quality
based off of one quality standard?
6. METADATA GENERATION OPTIONS

If the NSDL serves as an example of issues arising from auto-generated metadata, metadata
assigned with little to no human interaction, what sort of issues can be found when dealing with human
generated metadata? We have already touched on the issue of cost associated with training staff.
According to Arms (2000), “cataloging and indexing are expensive when carried out by skilled
professionals. A rule of thumb is that each record costs about $50 to create and distribute” (Arms, 2000,
Chapter 10). If high quality human generated metadata schemes are both costly and time consuming to
both create and maintain, what are the alternatives, besides automatic indexing, which most argue
results in poor quality metadata? An answer to this question could be found through the concept of
author generated metadata. Granted, while the author of a work will not be trained in the usage of
metadata, they will be the most familiar with the work being indexed, as well as the user community in
which the work belongs. Could author generated metadata be the solution to both costly metadata by
professionals and low quality metadata by machines? While the “authoring of data and metadata is
hard and time consuming…and automatic generation of obvious metadata is useful and
possible…semantic metadata will in most cases need to be provided through human intervention” (Duval
et al., 2002). In order to aid in this human intervention, especially in the case of author generated
metadata, a metadata template could be provided for a collection of documents that are similar, or
aimed at the same user community.
6.1 AUTHOR GENERATED METADATA – DSPACE
Taking into account all of the evolving issues surrounding both the creation and use of
metadata, let’s take a look at an example of an institutional digital repository that stores, manages,
and provides access to institutional assets such as research papers, learning objects, and research
data, while providing author generated metadata: MIT’s DSpace. DSpace was MIT’s attempt to resolve
the issue of an overabundance of self-published materials. “As faculty and other researchers develop
research materials and scholarly publications in increasingly complex digital formats, there is a need to
collect, preserve, index and distribute them” (Smith et al., 2003). Using a qualified Dublin Core metadata
standard, DSpace provides the means “to manage these research materials and publications in a
professionally maintained repository to give them greater visibility and accessibility over time” (Smith et
al., 2003).
Using the Libraries Working Group Application Profile, there are three required Dublin Core
elements per submission ‐ title, language, and submission date; all other fields such as abstract,
keywords, and rights are optional (Smith et al., 2003). Once the metadata elements have been assigned
by the submitter they are displayed in the item record and indexed for greater searching and browsing
capabilities (Smith et al., 2003). DSpace collections have their own form of interoperability, as metadata
assigned to a record in a particular collection is indexed to allow searching in the initial collection, across
multiple collections, or across Communities (partner institutions participating in DSpace). Also, “to
further its goal of supporting interoperability with other DSpace adopters, and with other digital
repositories, preprint, and e‐print servers, the system has implemented the OAI‐PMH” (Smith et al.,
2003). In order to meet the growing needs for digital preservation, DSpace attempts to capture minimal
technical metadata, such as file format and creation date, in order to support bit preservation (Smith et
al., 2003). Along with quality procedures, servers, and backup plans, DSpace can work to ensure that
“material deposited can be delivered to future users exactly as it was originally received” (Smith et al.,
2003).
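As a sketch of what a submitter’s record might look like (the values below are invented, and the layout follows DSpace’s batch import convention of expressing an item’s qualified Dublin Core as a simple dublin_core.xml file), note how few fields a submission actually requires:

```xml
<!-- Hypothetical DSpace item metadata; three required fields plus one optional -->
<dublin_core>
  <dcvalue element="title" qualifier="none">A Sample Working Paper</dcvalue>
  <dcvalue element="language" qualifier="iso">en_US</dcvalue>
  <dcvalue element="date" qualifier="issued">2011-06-01</dcvalue>
  <!-- Optional descriptive field, frequently left blank by submitters -->
  <dcvalue element="description" qualifier="abstract">A short abstract.</dcvalue>
</dublin_core>
```

A record reduced to the first three dcvalue lines would still be accepted, which foreshadows the problem discussed next.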
If the goal of DSpace is to collect, preserve, index, and distribute materials and publications,
then the requirement for only three Dublin Core metadata elements is problematic. The fields of title,
language, and submission date provide no conceptual associations that would help a user to easily find
needed information, which is the goal of metadata. The fact that DSpace metadata indexing is done by
end‐users, and not trained indexers, could explain why only three fields are required, however, as
argued earlier, metadata records assigned by untrained individuals tend to be plagued with low quality
or incomplete data. (An example of metadata records from DSpace can be located in Appendix B.) Also,
DSpace’s preservation mission relies heavily on metadata submissions; however most of the descriptive
fields for a record that would aid in this goal of preservation are not required, and often not captured.
Though DSpace utilizes the OAI-PMH by exposing the system’s assigned Dublin Core metadata in
order to ensure that deposited items can be found in the future, the lack of descriptive metadata for
items requires additional support (Smith, et al., 2003). Community representatives would need to work
directly with DSpace user support staff to ensure sufficient metadata creation; however, without a policy
standard on which a model workflow could be based, allowing each collection to enforce the metadata
standard, this will be an uphill battle.
7. CONCLUSION
The importance of metadata in digital libraries, both as a means for information retrieval and
storage, and digital preservation and archiving, is not to be disputed. Metadata serves as part of the
foundation of librarianship. Every item in a traditional library is assigned some form of metadata,
whether through a card catalog, a location on a shelf, or classification schemes such as the Dewey
Decimal Classification or that of the Library of Congress. While metadata has a long-standing
tradition in the library environment, this tradition, or acceptance of importance, has not always
transferred over to the digital library realm. Digital metadata is plagued with a number of issues: from
poor quality to lack of interoperability, from cost concerns to the need for a metadata standard, and from
auto-generated to human-generated metadata. While there appears to be no easy solution to
the metadata problem, what is clear is that without a solution we jeopardize the overall goal and mission of
a digital library, which is both to provide resources to users and to preserve these resources for future
generations. As Arms (2000) argues, “the underlying question is not whether automated digital libraries
can rival conventional digital libraries today. They clearly cannot. The question is whether we can
conceive of a time (perhaps twenty years from now) when they will provide an acceptable substitute.”
As metadata forms the foundation of a digital library, it is time to start conceiving of when digital
metadata can fulfill this substitution.
8. REFERENCES
Arms, W. (2000). Digital Libraries, Chapter 10: Information Retrieval and Descriptive Metadata.
Arms, W. (2000). Automated Digital Libraries: How Effectively Can Computers Be Used for Skilled Professional Librarianship? D‐Lib Magazine, 6(7/8).
Bruce, T., & Hillmann, D. (2004). "The Continuum of Metadata Quality: Defining, Expressing, Exploiting." In D. Hillmann & E. Westbrooks (Eds.), Metadata in Practice. ALA Editions.
Candela, L., Castelli, D., Pagano, P., & Thanos, C. (2007). Setting the Foundation of Digital Libraries: The DELOS Manifesto. D-Lib Magazine, 13(3/4).
Duval, E. (2001). Metadata Standards: What, Who & Why. Journal of Universal Computer Science, 7(7).
Duval, E., Hodgins, W., Sutton, S., & Weibel, S. (2002). Metadata Principles and Practicalities. D‐Lib Magazine, 8(4).
Gill, T., Gilliland, A., Whalen, M., & Woodley, M. (2008). "Metadata and the Web." In Introduction to Metadata, Online Edition, Version 3.0. http://getty.edu/research/publications/electronic_publications/intrometadata/metadata.html
Hillmann, D., Dushay, N., & Phipps, J. (2004). "Improving Metadata Quality: Augmentation and Recombination." In Proceedings of the Dublin Core Metadata Conference, Shanghai, China.
Huddleston, R. (2008). HTML, XHTML, and CSS. Indianapolis, IN: Wiley Publishing, Inc.
Lagoze, C., Krafft, D., Payette, S., & Jesuroga, S. (2005) What Is a Digital Library Anymore, Anyway? D‐Lib Magazine, 11(11).
Lagoze, C., Krafft, D., Cornwell, T., Dushay, N., Eckstrom, D., & Saylor, J. (2006). "Metadata aggregation and "automated digital libraries": A retrospective on the NSDL experience." In Proceedings of the 6th ACS/IEEE‐CS Joint Conference on Digital Libraries, ACM Press.
Manduca, C., Fox, S., & Iverson, E. (2006) Digital Library as Network and Community Center. D‐Lib Magazine, 12(12).
Nürnberg, P., Furuta, R., Leggett, J., Marshall, C., & Shipman III, F. (1995). "Digital Libraries: Issues and Architecture." In Proceedings of Digital Libraries.
Reitz, J. (2004). Dictionary for Library and Information Science. Westport, Connecticut: Libraries Unlimited.
Smith, M., Barton, M., Bass, M., Branschofsky, M., McClellan, G., Stuve, D., Tansley, R., & Walker, J. (2003). DSpace: An Open Source Dynamic Digital Repository. D‐Lib Magazine, 9(1).
Ward, J. (2003). "Quantitative Analysis of Unqualified Dublin Core Metadata Element Set Usage within Data Providers Registered with the Open Archives Initiative." In Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries.
9. APPENDIX A
First five results from a Google search for 'Tudor History'
Examples of embedded metadata, all lacking use of Dublin Core
1. http://tudorhistory.org/ <HEAD> <meta http‐equiv="content‐type" content="text/html;charset=ISO‐8859‐1"> <META NAME="GENERATOR" CONTENT="Adobe PageMill 3.0 Mac"> <TITLE>TudorHistory.org</TITLE> <meta name="verify‐v1" content="2wVKaBktTtl0TaXXJCLtl5vYgVnF6pQKA5FqOOgvXlQ=" > <script src="http://www.google‐analytics.com/urchin.js" type="text/javascript"> </script> <script type="text/javascript">_uacct = "UA‐75588‐1";urchinTracker(); </script> </HEAD>
2. http://www.englishhistory.net/tudor.html
<head> <meta name="content" content="Tudor England 1485 to 1603 images biographies primary sources"> <meta name="author" content="Marilee Mongello"> <meta name="page_topic" content="Tudor dynasty Henry VII, Henry VIII, The Six Wives of Henry VIII, Elizabeth I, Mary I, Edward VI, Lady Jane Grey, 16th century England"> <meta name="GENERATOR" content="Microsoft FrontPage 5.0"> <meta name="ProgId" content="FrontPage.Editor.Document"> <meta http‐equiv="Content‐Type" content="text/html; charset=windows‐1252"> <meta http‐equiv="Content‐Language" content="en‐us"> <title>Tudor England 1485 to 1603: Table of Contents</title> <style fprolloverstyle="">A:hover {color: #0000FF; font‐weight: bold} </style> </head>
3. http://www.tudorplace.com.ar/TUDOR.htm
<head> <meta http‐equiv="Content‐Type" content="text/html; charset=iso‐8859‐1"> <meta name="GENERATOR" content="Microsoft FrontPage 4.0"> <title>TUDOR</title> <style> <!‐‐a.new { color: #CC2200; }
‐‐> </style> </head>
4. http://womenshistory.about.com/library/quiz/bltqueenquiz.htm
<HTML> <head> <title>Which Tudor Queen Are You? A Women's History Quiz</title> <meta name="keywords" content="women's history, women's studies, quiz, question of the week"> <meta name="description" content="Which queen in Tudor history are you most like?: quotations by notable women biographies of notable women tudor queen tudor period personality quiz"> <script src="pquizheadtqueen.js" type="text/javascript"> </script><!‐‐GIHEDSTRT‐‐> <meta charset="ISO‐8859‐1"> <meta http‐equiv="X‐UA‐Compatible" content="IE=edge,chrome=1"> <meta name="ROBOTS" content="NOODP"> <meta name="pd" content="Saturday, 19‐Jun‐2010 14:40:14 GMT"> <link rel="icon" href="http://0.tqn.com/f/a08.ico"> <link rel="search" type="application/opensearchdescription+xml" href="http://0.tqn.com/4g/o/os.xml" title="About.com"> <script>var ziRfw=0;zobt=" Women's History Ads";zOBT=" Ads";function zIpSS(u){zpu(0,u,280,375,"ssWin")}function zIlb(l,t,f){zT(l,'18/1Pp/wX')}</script> <link rel="stylesheet" href="http://0.tqn.com/0g/dc/s63.css" media="all"><!‐‐[if lt IE 9]><link rel="stylesheet" href="http://0.tqn.com/8g/dc/ie.css" type="text/css" media="all"><![endif]‐‐><!‐‐[if lt IE 8]><link rel="stylesheet" href="http://0.tqn.com/8g/dc/rdie.css" type="text/css" media="all"><![endif]‐‐> <meta http‐equiv="pics‐Label" content='(pics‐1.1 "http://www.icra.org/pics/vocabularyv03/" l gen true for "http://womenshistory.about.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0) gen true for "http://womenshistory.about.com" r (n 0 s 0 v 0 l 0 oa 0 ob 0 oc 0 od 0 oe 0 of 0 og 0 oh 0 c 0))'> </HEAD
5. http://www.elizabethi.org/links/
(Although 'dc' appears in the header below, it is used as a JavaScript variable name; these are not standard Dublin Core elements.)
<html> <head> <meta http‐equiv="Content‐Type" content="text/html; charset=iso‐8859‐1"> <meta name="Author" content="."> <meta name="Description" content="Links to sites of interest on Tudor History"> <meta name="KeyWords" content="elizabeth, queen, reign,life,tudor,history,overview,biography,"> <title>LINKS: TUDOR HISTORY (Elizabethi.org)</title> <!‐‐ ValueClick Media POP‐UNDER CODE v1.8 for elizabethi.org (4 hour) ‐‐> <script language="javascript"><!‐‐ var dc=document; var date_ob=new Date();
dc.cookie='h2=o; path=/;';var bust=date_ob.getSeconds(); if(dc.cookie.indexOf('e=llo') <= 0 && dc.cookie.indexOf('2=o') > 0){ dc.write('<scr'+'ipt language="javascript" src="http://media.fastclick.net'); dc.write('/w/pop.cgi?sid=7948&m=2&tp=2&v=1.8&c='+bust+'"></scr'+'ipt>'); date_ob.setTime(date_ob.getTime()+14400000); dc.cookie='he=llo; path=/; expires='+ date_ob.toGMTString();} // ‐‐> </script> <!‐‐ ValueClick Media POP‐UNDER CODE v1.8 for elizabethi.org ‐‐> <style TYPE="text/css"> </style> </head>
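The assessment performed manually in the examples above can be automated. The sketch below (using Python's standard html.parser; the head fragment is illustrative, in the style of the sampled pages) collects meta element names and flags any that use Dublin Core's conventional "DC."-prefixed naming:

```python
from html.parser import HTMLParser

class DCMetaScanner(HTMLParser):
    """Collect <meta name="..."> values, noting any Dublin Core names.

    Embedded Dublin Core conventionally uses names such as "DC.title"
    or "DC.creator"; none of the pages sampled above do so.
    """
    def __init__(self):
        super().__init__()
        self.meta_names = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            name = dict(attrs).get("name")
            if name:
                self.meta_names.append(name)

    def dublin_core_names(self):
        return [n for n in self.meta_names
                if n.lower().startswith("dc.")]

# Head fragment in the style of the sampled pages (no Dublin Core).
head = '''<head>
<meta name="keywords" content="tudor, history">
<meta name="description" content="Links on Tudor history">
</head>'''

scanner = DCMetaScanner()
scanner.feed(head)
print(scanner.dublin_core_names())  # [] - no Dublin Core elements found
```

Run across the five sampled pages, such a scanner would report an empty list for every one of them.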
10. APPENDIX B
*Examples of metadata records from DSpace:
Required elements completed; additional elements such as abstract, description, and keywords are also present.
Title: LOM – IEEE Learning Objective Metadata
Authors: (removed)
Keywords:
LOM ‐ IEEE
Learning Objective Metadata
LOM
Issue Date: 13‐Jul‐2011
Abstract:
A brief review of the Learning Objective Metadata Standard (LOM – IEEE). The paper
describes the purpose, function, and basic structure of the LOM‐IEEE standard, as well as
an assessment of the metadata standard as found on the Learning Resource Exchange
Portal.
Description: Prepared as Assignment #2 for Dr. Xia Lin's INFO 653 (Digital Libraries) class at Drexel
University during the Summer 2011 quarter.
URI: http://hdl.handle.net/2114/686
Appears in
Collections: Metadata Project Review (2010‐2011 Summer)
Keywords element contains only one term, 'metadata', which is very general.
Title: Review of Gateway to Educational Materials
Authors: (removed)
Keywords: Metadata
Issue Date: 18‐Apr‐2011
Abstract: A review of the Gateway to Educational Materials metadata project.
URI: http://hdl.handle.net/2114/660
Appears in Collections: Metadata Project Review (2010‐2011 Spring)
Contains most additional elements
Title: The Knife Case: Design and Construction
Authors: (removed)
Keywords: Kaufman Collection
knife cases
Issue Date: Sep‐2004
Publisher: Society of American Period Furniture Makers
Citation: x. (2004). The knife case: Design and construction. American Period Furniture, 4, 42‐57.
Abstract:
The origin, design and use of knife cses in early America are briefly explored. The design
and construction of a reproduction case, inspired by an original in the collection of
George M. and Linda H. Kaufman, are described in detail.
URI: http://hdl.handle.net/2114/129
ISSN: 1542‐0299
Appears in
Collections: Articles
No keywords element, allowing search only by title, author, or date.
Title: Museum and Cultural Collections (MOAC) with Certification Statement
Authors: (removed)
Issue Date: 20‐Apr‐2011
URI: http://hdl.handle.net/2114/667
Appears in Collections: Metadata Project Review (2010‐2011 Spring)
*As evident from the example metadata records, there is no consistency in metadata creation: each
community requires that different metadata elements be completed for submissions.
I certify that: This paper/project/exam is entirely my own work. I have not quoted the words of any other person from a printed source or a website without indicating what has been quoted and providing an appropriate citation. I have not submitted this paper / project to satisfy the requirements of any other course. Signature: Melissa Ormond Date: 8/28/2011