Exposing Hidden Relationships: Practical Work in Linked Data using Digital Collections

Draft Presentation

Cory Lampert and Silvia Southwick

UNLV University Libraries Digital Collections

April 23, 2015

Linked Data & RDF: New Frontiers in Metadata and Access Conference

Overview

Video Demo

UNLV Linked Data Project

Digital Collections Metadata: Source of Rich (But Hidden) Relationships

Video Demo

Next Steps, Future

Questions

Video demo

This short video (no sound, just image) will give a preview of what linked data may look like to users.

It shows the Virtuoso Pivot Viewer software acting upon UNLV's Linked Open Data triplestore.

Think about how this is similar/different to how users currently view data in library systems.

[PLAY PIVOTVIEWER.mp4]

Exploring LOD: Taking theory to practice

How we started

Goals set

What we accomplished

How we began

Conferences and buzz

Curiosity and professional development

Exploration and pilot project

Compelling results; sharing impact of what we've learned

Assessment

Much more to do... A sense of humor is helpful!

Photo: Five men with burros, circa 1900, Tonopah/Goldfield Collection

Motivation

Information encapsulated in records

Records contained in collections

Very few links are created within and/or across collections

Links have to be manually created

Existing links do not specify the nature of the relationships among records

This structure hides potential context (links) within and across collections

Free metadata from silos

Expose rich relationships

Leverage powerful, seamless, interlinking of data from multiple sources

Discover and query data in new ways

More precise searching

More opportunities to repurpose data

Current Practice

LOD Potential

Poll

Please use the agree/disagree button, available from the pull-down menu at the top of the screen, to respond to the statement below:

Statement: There is interest in doing practical work with linked open data at my institution.

Foundation of pilot

Our digital collections consist of unique materials documenting the history of Southern Nevada, stored in CONTENTdm; the project focused on LOD for visual material collections

Definition of LOD we are using: Linked Data refers to a set of best practices for publishing and interlinking data on the Web.

A good way to better understand this is the 5-Star Data diagram: http://5stardata.info/

Preparing for departure

Before we launch into a discussion of how we created our linked data, let's take a short trip.

We will start in our current data: digital collections metadata records, and end in the new world of linked open data.

Photo: Photograph of Howard Hughes in cockpit of the second XF-11, April 4, 1947, Howard Hughes Collection

Graphical Representation: Part of a Record

Examples of records

Showgirls

Menus

Dreaming the Skyline

December 12, 1915

Exposing Hidden Links

Poll

Please use the agree/disagree button, available from the pull-down menu at the top of the screen, to respond to the statement below:

Statement: The diagrams helped me to see how linked data helps to reveal hidden relationships in existing metadata.

UNLV Linked Open Data Project Goals

Study the feasibility of developing a common process for converting our collection records into linked data while preserving their original expressivity and richness

Publish data from our collections in the Linked Open Data Cloud to improve discoverability and to create connections across our collections and with related data sets on the Web

Actions and Technologies

Phase 1 (CONTENTdm): Clean data; Export data

Phase 2 (OpenRefine): Import data; Prepare data; Reconcile; Generate triples; Export RDF

Phase 3 (Mulgara / Virtuoso): Import data; Publish
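
The "Generate triples" and "Export RDF" actions listed above can be illustrated with a minimal sketch. It is not the project's actual tooling (the slides name OpenRefine for that step); it only shows, in plain Python, what turning one cleaned record into N-Triples might look like. The base namespace, the record fields, and the choice of Dublin Core terms as predicates are all assumptions made for illustration.

```python
# Hypothetical namespace and record; none of these values come from UNLV's data.
BASE = "http://example.org/unlv/"
DC_TITLE = "http://purl.org/dc/terms/title"
DC_CREATOR = "http://purl.org/dc/terms/creator"

record = {
    "id": "item/123",
    "title": "Five men with burros",
    "creator_uri": "http://example.org/unlv/agent/unknown-photographer",
}


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(rec: dict) -> list:
    """Return one N-Triples line per statement about the record."""
    subject = "<" + BASE + rec["id"] + ">"
    title = escape_literal(rec["title"])
    return [
        # A literal value: the title is plain text.
        subject + " <" + DC_TITLE + '> "' + title + '" .',
        # A link: the creator is a URI, so other records can point to it too.
        subject + " <" + DC_CREATOR + "> <" + rec["creator_uri"] + "> .",
    ]


if __name__ == "__main__":
    for line in record_to_ntriples(record):
        print(line)
```

The second triple is the important one for this project: because the creator is expressed as a URI rather than a text string, any other record that re-uses that URI becomes linked to this one.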

WHAT WE LEARNED

With interest and motivation, Linked Open Data is a feasible goal

Visualization tools help convey the benefits of LOD work

A pilot quickly turned into a project and then into production

Moving into the next phase required careful examination of current practice focusing on expressing links (relationships)

Photo: Film transparency of a chimpanzee with slot machines at the Sands Hotel, Las Vegas, circa late 1950s, Sands Collection

LOD Approach after the pilot

After learning the concepts, applying a model, and testing technologies, the LOD transformation process becomes repeatable

Sustainability of process depends upon data quality

Data begins with existing metadata in current collections; there are many lessons from the pilot that should inform revisions to current practice (even if LOD is more in the future than the present)

Mining the metadata

Application profile

Shared Vocabularies

Managing Controlled Vocabularies

Managing Linked Data

When should we start preparing metadata for Linked Data?

Evolution of metadata

Our focus is on metadata

Why?

Metadata is essential for establishing relationships

Any metadata?

The ability to discover relationships is directly affected by metadata quality

It is critical to:

Use well-established Controlled Vocabularies (particularly if they are linked data ready)

Rigorously control local terms

Re-use URIs

Assign URIs for local terms
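
As a rough illustration of the last two points (re-using URIs and assigning URIs for local terms), the sketch below mints one stable URI per local term and returns the same URI whenever the term is seen again. The namespace `http://example.org/vocab/` and the slug rule are hypothetical; a real project would choose its own URI pattern.

```python
import re
import unicodedata

LOCAL_BASE = "http://example.org/vocab/"  # hypothetical namespace for local terms


def slugify(term: str) -> str:
    """Reduce a term to lowercase ASCII words joined by hyphens."""
    normalized = unicodedata.normalize("NFKD", term)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def mint_uri(term: str, registry: dict) -> str:
    """Return the existing URI for a term, or assign one on first use."""
    key = slugify(term)
    return registry.setdefault(key, LOCAL_BASE + key)


if __name__ == "__main__":
    registry = {}
    print(mint_uri("Sands Hotel", registry))
    print(mint_uri("  sands hotel ", registry))  # same URI is re-used
```

Keeping the registry (here a plain dictionary) is what makes re-use possible: variant spellings that normalize to the same key resolve to a single URI instead of producing a new one each time.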

Metadata Creation: Common Approaches

Focus is on the collection being created

Usually metadata consistency is managed within collections

Not much rigor is used to enter controlled vocabulary terms

Examples: misspellings, use of terms that do not match the preferred terms, etc.

Limited control of local terms

Implications:

Ability to identify relationships within and across collections is decreased
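
One way to catch the entry problems described above (misspellings and non-preferred terms) is to check each entered term against the preferred list before it is saved. The sketch below uses Python's standard `difflib` module for approximate matching; the term list and the 0.8 similarity cutoff are illustrative assumptions, not values from the project.

```python
import difflib

# Illustrative preferred terms; a real list would come from the controlled vocabulary.
PREFERRED_TERMS = ["Showgirls", "Menus", "Hotels", "Casinos"]


def check_term(term, preferred=PREFERRED_TERMS, cutoff=0.8):
    """Classify a term as 'ok', 'suggest' (near match found), or 'unknown'."""
    cleaned = term.strip()
    if cleaned in preferred:
        return ("ok", cleaned)
    # Compare case-insensitively, but report the preferred spelling.
    lookup = {p.lower(): p for p in preferred}
    matches = difflib.get_close_matches(
        cleaned.lower(), list(lookup), n=1, cutoff=cutoff
    )
    if matches:
        return ("suggest", lookup[matches[0]])
    return ("unknown", None)


if __name__ == "__main__":
    for entered in ["Showgirls", "Showgrils", "Airplanes"]:
        print(entered, check_term(entered))
```

A term flagged as "suggest" still needs a person to confirm the correction; the check only narrows down where a reviewer has to look.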

When should we start preparing metadata for Linked Data?

What can we do to create sapient metadata?

Application Profile

Re-design strategies to manage and use CVs

What do I do with my legacy metadata?

Adjust metadata according to the Application Profile

Apply strategies to manage and use CVs effectively

Metadata Milestones at UNLV Libraries

Adopted an approach that considers each individual digital collection as part of an integrated digital library.

The UNLV Application Profile

Specifies:

which metadata terms UNLV Libraries uses for its digital collections

the source of metadata terms

how metadata should be expressed

labels to be used for each metadata field

Benefits:

Increases consistency of content across digital collections

Improves user interactions with digital collections

Indexing guidelines are easy to generate

Facilitates transformation to Linked Data

Increases compliance with regional and national aggregators
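
To make the idea of an application profile concrete, here is a minimal sketch of one expressed as data, with a check that a record follows it. The three fields, their labels, and the `dcterms:` sources are hypothetical examples and are not taken from the UNLV Application Profile itself.

```python
# Hypothetical profile: for each field, its display label, the source of the
# metadata term, and whether the field is required.
APPLICATION_PROFILE = {
    "title": {"label": "Title", "source": "dcterms:title", "required": True},
    "subject": {"label": "Subject", "source": "dcterms:subject", "required": False},
    "date": {"label": "Date", "source": "dcterms:date", "required": False},
}


def check_record(record, profile=APPLICATION_PROFILE):
    """List the ways a record departs from the profile (empty list means none)."""
    problems = []
    for field, rule in profile.items():
        if rule["required"] and not record.get(field):
            problems.append("missing required field: " + field)
    for field in record:
        if field not in profile:
            problems.append("field not in profile: " + field)
    return problems


def display_labels(record, profile=APPLICATION_PROFILE):
    """Map profile fields in a record to the labels the profile prescribes."""
    return {profile[f]["label"]: v for f, v in record.items() if f in profile}


if __name__ == "__main__":
    print(check_record({"title": "Dinner menu", "subject": "Menus"}))
    print(check_record({"subject": "Menus", "colour": "red"}))
```

Because every collection is checked against the same profile, the later transformation to linked data can rely on each field meaning the same thing everywhere.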

Outcomes

Well-established CVs allow re-use of URIs

Rigorous rules of data entry facilitate reconciliation

Local Controlled vocabularies allow interlinking among local terms / names within collections

Shared vocabularies allow interlinking among local terms / names across collections

All these actions:

allow creation of a single process to transform digital collections into linked data
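
The interlinking outcome described above can be sketched in a few lines: once records in different collections re-use the same URI for a name or term, the relationship between them can be found mechanically. The record identifiers and URIs below are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records from two collections; each lists the URIs it refers to.
records = [
    {"id": "showgirls/1", "uris": {"http://example.org/vocab/sands-hotel"}},
    {"id": "menus/7", "uris": {"http://example.org/vocab/sands-hotel",
                               "http://example.org/vocab/menus"}},
    {"id": "menus/9", "uris": {"http://example.org/vocab/menus"}},
]


def shared_links(recs):
    """Return (record_a, record_b, uri) for every pair of records sharing a URI."""
    by_uri = defaultdict(list)
    for rec in recs:
        for uri in rec["uris"]:
            by_uri[uri].append(rec["id"])
    links = []
    for uri in sorted(by_uri):
        for a, b in combinations(sorted(by_uri[uri]), 2):
            links.append((a, b, uri))
    return links


if __name__ == "__main__":
    for a, b, uri in shared_links(records):
        print(a, "<->", b, "via", uri)
```

With text strings in place of URIs, the same comparison would miss any pair whose spellings differ, which is why the earlier points about controlled vocabularies and data entry matter.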

Video: [PLAY SUPER-SKELETON-WHH.mp4]

Moving from Experimentation to Implementation

Cleaning and sharing controlled vocabularies from legacy collections (time consuming)

Re-training metadata creators

Re-designing workflows

Delegating additional data management responsibilities

Data Management

Maintenance of local URIs

Terms

Authoritative Names

Design and implementation of new processes to maintain synchronization between digital library and linked data set

Design processes to enrich relationships with external data sets

Next Steps

Future Activities

Resources

Video Demo

Future Activities

Publish data

Interlinking with other data sets

Documentation

Collaborative activities (regional controlled vocabularies)

Training and staff skill development

Interface design and development

Work with hierarchical data

Video demo

This short video (no sound, just image) will give a preview of what linked data may look like to users.

It shows the RelFinder software acting upon UNLV's Linked Open Data triplestore.

Think about how this is similar/different to how users currently view data in library systems.

[PLAY SHOWING RELATIONSHIPS.mp4]

The Linked Data Cloud

Resources

Leading to Linking: Introducing Linked Data to Academic Library Digital Collections: http://www.tandfonline.com/doi/pdf/10.1080/19386389.2013.826095

A Guide for Transforming Digital Collections Metadata into Linked Data Using Open Source Technologies: http://www.tandfonline.com/doi/pdf/10.1080/19386389.2015.1007009

UNLV Linked Data Blog (videos posted here): https://www.library.unlv.edu/linked-data

Contact us!

Thank you!

Contact Us:

Cory Lampert

cory.lampert@unlv.edu

Silvia Southwick

silvia.southwick@unlv.edu

UNLV Digital Collections

www.d.library.unlv.edu

Questions?

Photo: Photograph of Bluebells posing outside of Pan Am jet, 1958, Donn Arden Collection
