Academic Libraries and Big Data: Trends in Collection, Publication, Preservation, and Access
Post on 15-Jan-2017
#SIBF15 #SIBFALA15 @MCDONALD @ALALIBRARY @SHJINTLBOOKFAIR
TOPICS
• Big Data in Libraries
• Why Libraries?
• Libraries Supporting Data
• Analysis
• Publication
• Workflow (Re-use)
cc: Ray Schamp - http://www.flickr.com/photos/19009479@N00
Big Data
cc: Mark McLaughlin - https://www.flickr.com/photos/51035737977@N01
BIG DATA COMES FROM MANY SMALLER PACKAGES
cc: FullPixel Photography - http://www.flickr.com/photos/98543207@N02
LIBRARIES
PROVIDING OPEN DATA REPOSITORIES
cc: Paul Stainthorp - https://www.flickr.com/photos/30409117@N07
Research is now about workflow.
Workflow is often about data.
cc: Vinovin - https://www.flickr.com/photos/10212590@N00
http://innoscholcomm.silk.co/
KEY POINT 1 WHAT IS YOUR RESEARCHER WORKFLOW?
cc: yaph - https://www.flickr.com/photos/8471827@N06
Libraries
cc: FullPixel Photography - http://www.flickr.com/photos/98543207@N02
LIBRARIES ARE: COLLABORATION SPACES
cc: TechSoup for Libraries - https://www.flickr.com/photos/9279573@N02
LIBRARIES ARE: PUBLISHERS
cc: Thomas Hawk - https://www.flickr.com/photos/51035555243@N01
The Once and Future Publishing Library – Okerson/Holzman/CLIR
LIBRARIES ARE: REPOSITORIES
cc: Halans - https://www.flickr.com/photos/48889073931@N01
"Nobel prizes have been given for inventing
instruments. I'm eagerly waiting for one for
inventing software."
Daniel S. Katz - U Chicago
cc: Yddlywinker - http://www.flickr.com/photos/41687592@N00
REPOSITORIES Institutional
Publish VMs (Virtual Machines) with DOIs (Digital Object Identifiers)
A national science & engineering cloud http://jetstream-cloud.org/
Data Publishing
IU ScholarWorks http://scholarworks.iu.edu/
“Libraries serve the research and learning
needs of their universities.”
cc: Andreas-photography - http://www.flickr.com/photos/19367634@N05
From a talk by Lorcan Dempsey – The Library in the Life of a User http://www.slideshare.net/lisld/the-library-in-the-life-of-the-user
USE CASE HATHITRUST RESEARCH CENTER
Non-Consumptive Research Paradigm
• No action or set of actions on the part of users, either acting alone or in cooperation with other users over the duration of one or multiple sessions, can result in sufficient information being gathered from the collection of copyrighted works to reassemble pages from the collection.
• The definition disallows collusion between users and accumulation of material over time. It differentiates a human researcher from a proxy, which is not a user. Users are human beings.
• Repository
– 13+ million volumes | 3+ billion pages
– 50% of volumes are in English
– Material from the 15th C. on | 20th C. concentration
– 70% in copyright or undetermined | 30% open
• Interface
– Search and read books in the public domain
About the HathiTrust Digital Library
HathiTrust Ecosystem
HathiTrust Research Center Ecosystem
1. Secure Portal Access
2. Data Capsule Access
3. Feature Extraction Services
HTRC Approaches
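The feature extraction approach fits the non-consumptive paradigm because derived features, unlike page text, cannot be read back as prose. A minimal sketch of the idea follows, assuming a simple bag-of-words feature; the function name `extract_page_features` and the sample sentence are hypothetical and do not reflect the HTRC's actual service or data format.

```python
import re
from collections import Counter


def extract_page_features(page_text: str) -> dict[str, int]:
    """Return per-page token counts with word order discarded.

    Hypothetical helper for illustration only; not an HTRC API.
    """
    tokens = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(tokens)
    # Sort alphabetically so the output carries no trace of the
    # original word order on the page.
    return dict(sorted(counts.items()))


# Invented sample page text.
page = "The whale, the white whale, swam on."
features = extract_page_features(page)
print(features)
```

A researcher can still count, compare, and model vocabulary from such features, but the running text of the page is not part of the output.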
HTRC Data Capsule Workflow
HTRC Data Capsule
Maintenance Mode Secure Mode
Running other workflow
• The ability to slice through a massive corpus constructed from many different library collections, and out of that to construct the precise workset required for a particular scholarly investigation, is an example of the “game changing” potential of the HathiTrust...
Grand Motivation
Scope
Basic Portal Workflow
DISTANT READING
MORETTI-STANFORD
cc: chrismar - https://www.flickr.com/photos/14334258@N00
Understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data.
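As a small illustration of aggregating rather than close reading, the sketch below computes how often a term appears per decade across a corpus. The three-document corpus, its years, and the function name `relative_frequency_by_decade` are invented for the example; a real study would run the same kind of aggregation over thousands or millions of volumes.

```python
import re
from collections import defaultdict

# Toy corpus: years and texts are invented for illustration.
corpus = [
    {"year": 1851, "text": "the sea and the ship"},
    {"year": 1855, "text": "the ship sailed"},
    {"year": 1902, "text": "the city and the factory"},
]


def relative_frequency_by_decade(docs, term):
    """Return {decade: share of all tokens in that decade equal to term}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for doc in docs:
        decade = doc["year"] // 10 * 10
        tokens = re.findall(r"[a-z]+", doc["text"].lower())
        hits[decade] += tokens.count(term)
        totals[decade] += len(tokens)
    return {decade: hits[decade] / totals[decade] for decade in sorted(totals)}


trend = relative_frequency_by_decade(corpus, "ship")
print(trend)
```

No single text is interpreted here; the result is a trend line across the whole collection, which is the kind of question distant reading asks.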
KEY POINT 2 WHAT TYPES OF DATA INTERFACES?
cc: Eric Fischer - https://www.flickr.com/photos/24431382@N03
NEW INTERFACES
Allen and Murdock – Indiana University
cc: caseorganic - https://www.flickr.com/photos/28980639@N02
NEW INTERFACES
Chris Forster – Syracuse University
cc: caseorganic - https://www.flickr.com/photos/28980639@N02
NEW INTERFACES
Jonathan Goodwin – Univ of Louisiana
cc: caseorganic - https://www.flickr.com/photos/28980639@N02
KEY POINT 3 WHAT TYPES OF DATA ARE NEEDED?
cc: Hans-Werner Guth - http://www.flickr.com/photos/42448330@N00
REPOSITORIES Software
REPOSITORIES Data
Libraries and the Researcher
• What are the workflows needed by your researchers?
• What are the interfaces that support those workflows?
• What is the data that supports those workflows, interfaces, and researchers?
• Local
• Regional
• International
cc: Marie in NC - http://www.flickr.com/photos/24732687@N00
Libraries are repositories of data for the creation of new knowledge.
cc: young_einstein - http://www.flickr.com/photos/25047883@N00
New Library Services provide support for the workflows of new knowledge.
cc: tjmwatson - https://www.flickr.com/photos/63603238@N00
Photo by Marcus Ramberg - Creative Commons Attribution-NonCommercial License https://www.flickr.com/photos/40021607@N00 Created with Haiku Deck
THANKS
#SIBF15 | #SIBFALA15
cc: nateOne - https://www.flickr.com/photos/49998984@N00