best practices for managing born digital content
DESCRIPTION
Webinar presented for WiLS by Emily Pfotenhauer, Recollection Wisconsin Program Manager, June 24, 2014. Based on information from the Demystifying Born Digital reports from OCLC Research and the Digital Preservation Outreach and Education (DPOE) curriculum developed by the Library of Congress.
TRANSCRIPT
Emily Pfotenhauer
June 24, 2014
BEST PRACTICES FOR MANAGING
BORN DIGITAL CONTENT
Emily Pfotenhauer
Recollection Wisconsin Program Manager
Slides and links: http://recollectionwisconsin.org/borndigital
BEST PRACTICES FOR MANAGING BORN DIGITAL CONTENT
http://oclc.org/research/activities/borndigital.html
The mission of the DPOE program of the Library of Congress is to encourage individuals and organizations to actively preserve their digital content, building on a collaborative network of instructors, contributors, and institutional partners.
http://www.digitalpreservation.gov/education/
DIGITAL PRESERVATION OUTREACH AND EDUCATION
identify
select
store
protect
manage
provide
DPOE Modules for Managing Digital Content Over Time
WHAT IS DIGITAL CONTENT?
Digital content is any material that is published or distributed in a digital form, including text, data, sound recordings, photographs and images, motion pictures, and software.
Digital materials created from analog sources
Born-digital materials
Digital materials you currently have or create – or expect to have – that you want to preserve.
Born-digital resources are items created and managed in digital form.
Digital photographs
Digital documents
Digital manuscripts
Harvested web content
Electronic records
Data sets
Digital art
Digital media publications
Defining “Born Digital,” Ricky Erway, OCLC Research
http://oclc.org/content/dam/research/activities/hiddencollections/borndigital.pdf
DEFINING “BORN DIGITAL”
Everyone is
creating digital content
distributing digital content
using digital content
And we are responsible for managing digital content
DIGITAL REALITY IN 2014
http://digitalbevaring.dk
WHAT’S THE PROBLEM?
Increasing amounts of digital assets are arriving on our doorstep
The digital assets arrive in all formats and on all kinds of media
Time sensitive: the longer we wait, or the longer our donors wait, the greater the chance that something will be unreadable
Who takes the lead?
What can I do?
Where do I start?
Too technical (I don’t understand...)
Too daunting (I don’t have time...)
WHAT ARE THE CHALLENGES?
http://digitalbevaring.dk
Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
Working Group on Defining Digital Preservation, ALA Annual Conference, 6/24/2007
http://www.ala.org/alcts/resources/preserv/defdigpres0408
DIGITAL PRESERVATION
Digital materials on physical media (CDs, flash drives, floppy disks, etc.) have been stored along with other collection materials without having been copied, preserved, or made accessible.
A TYPICAL SCENARIO
WHAT COULD POSSIBLY GO WRONG?
Do no harm
Don’t do anything that prevents future action and use
Take action
Document what you do
FIRST STEPS: FOUR ESSENTIAL PRINCIPLES
Identifying content is a first step to planning for current and future preservation needs
Ask: what content do I have, will I have, might I have, must I have?
An inventory is the best way to identify what content you have now and to raise awareness in your institution.
DPOE MODULE 1: IDENTIFY
http://digitalbevaring.dk
Good preservation decisions are based on an understanding of the possible content to be preserved
Not all digital content can or should be preserved
Preservation requires an explicit commitment of resources
WHY DO WE IDENTIFY CONTENT?
1. Identify and locate existing holdings.
2. Count and describe digital media within each collection.
3. Remove media from collection (retain order with photographs or separator sheets).
4. Assign inventory number to each physical piece.
5. Record anything that is known about the hardware, operating systems, and software used to create the files.
6. Calculate total amount of data (estimate).
7. Re-house physical media in suitable storage.
FIRST STEPS: CREATE AN INVENTORY
Medium (6 CDs, 1 hard drive)
Format (pdfs, docs)
File Size (be consistent - MB, GB or TB)
Identifying information found on labels such as creator, title, description of contents and dates
Expected future growth, if any
COUNT AND DESCRIBE
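The inventory fields listed above can be captured in a simple spreadsheet-style record. The following is a minimal sketch, not a prescribed format: the field names, the `InventoryItem` class, and the choice of CSV output are all assumptions made for illustration, and sizes are kept in a single unit (MB) to follow the "be consistent" advice.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class InventoryItem:
    """One physical piece of media. All field names are hypothetical."""
    inventory_number: str   # assigned to each physical piece
    collection: str         # collection the media was found in
    medium: str             # e.g. "CD", "hard drive"
    file_formats: str       # e.g. "pdf; doc"
    size_mb: float          # estimated size, always in MB
    label_info: str         # creator, title, description, dates from labels
    expected_growth: str    # expected future growth, if any


def write_inventory(items, out):
    """Write inventory items as CSV rows to an open text stream."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(InventoryItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))


def total_size_mb(items):
    """Estimate the total amount of data across all inventoried media."""
    return sum(item.size_mb for item in items)


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = [
        InventoryItem("2014-001", "Smith Papers", "CD", "pdf; doc", 650.0,
                      "Label: 'Minutes 1999-2003'", "none"),
        InventoryItem("2014-002", "Smith Papers", "hard drive", "jpg", 120000.0,
                      "Label: 'Photos'", "none"),
    ]
    buffer = io.StringIO()
    write_inventory(sample, buffer)
    print(buffer.getvalue())
    print("Estimated total (MB):", total_size_mb(sample))
```

Keeping one row per physical piece makes it straightforward to add up the estimated total and to sort by risk or significance later.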
Prioritize for further processing based on:
Significance and use of overall collection
Danger of loss of content (degradation) due to age or type of media
Uniqueness – not replicated elsewhere
Quantity of digital content
DPOE MODULE 2: SELECT
Cost: storage may be cheap, management is not…especially over time
Not all digital content may be appropriate for your organization to preserve: match content to mission
Keeping delivery and access manageable and sustainable
WHY SELECT CONTENT TO PRESERVE?
Log jam on the St. Croix River, 1886
Wisconsin Historical Society WHi-2364
Ask yourself which digital content is:
most significant to your organization?
most extensive?
most requested/used?
easiest?
oldest?
newest?
mandated?
at risk?
SETTING PRIORITIES
Postal workers sorting mail, 1955
Wisconsin Historical Society WHi-36392
Communication is key, particularly when content comes from external creators
Keep content creators in the conversation
Arrange a convenient time for them to talk about your preservation plans
Identify list of materials to review with them
Document the results and send them a copy
Sample policy: Minds@UW
http://uwdcc.library.wisc.edu/minds/faq.shtml
INCLUDE CONTENT CREATORS
THEN WHAT?
Steps for transferring born-digital content from media you can read in-house:
1. Use a “clean” computer.
2. Use a write blocker.
3. Insert source media.
4. Create a disk directory.
5. Copy files from media to the directory.
6. Generate a copy of the directory.
7. Generate and record a checksum.
8. Create a readme file.
9. Copy the directory to trustworthy archival storage.
10. Return the original physical media to storage.
11. Create or update any associated descriptive tool(s).
Dedicated computer
Regularly scanned with up-to-date antivirus software
Non-networked
STEP 1: CLEAN WORKSTATION
UW-Madison Archives
Prevents the computer from altering file content and metadata (e.g., date, creator)
Do not open files until after transfer
STEP 2: WRITE BLOCKER
https://www.flickr.com/photos/joncrel/6285946610/
Do not attempt to open any files.
Examine media for cracks, breaks, etc.
Remove any sticky notes or anything else that could become loose.
STEP 3: INSERT SOURCE MEDIA
bitcurator.net
Create a directory on the clean machine for the current project.
Within the directory, create sub-directories:
Master Folder (to hold the master copy of the file)
Working Folder (to hold working copies of the master copy)
Documentation Folder (to hold metadata and other information associated with the project)
STEP 4: CREATE A DISK DIRECTORY
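The folder layout described in Step 4 can be created by hand or with a short script. This is a minimal sketch under stated assumptions: the lowercase folder names and the `create_project_directory` helper are hypothetical, chosen only to mirror the Master, Working, and Documentation folders named on the slide.

```python
from pathlib import Path

# Hypothetical folder names mirroring the Master, Working, and
# Documentation folders described in Step 4.
SUBFOLDERS = ("master", "working", "documentation")


def create_project_directory(root, project_name):
    """Create a project directory with the three sub-directories.

    Existing folders are left untouched, so the function is safe to re-run.
    """
    project = Path(root) / project_name
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

For example, `create_project_directory("D:/transfers", "smith_papers_2014")` would create `master`, `working`, and `documentation` folders under that project folder; both the path and the project name here are made up.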
Copy files from the source media to the master folder
Copy files individually or in groups
-OR-
Create a disk image
Disk image = single file containing an authentic copy of a disk’s contents, retaining original metadata and file system structure
After transfer from source media, make a second working copy – ok to open these files
STEP 5: COPY FILES
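For the file-by-file option in Step 5, a short script can copy the media contents into the master folder and then make the working copy. This sketch assumes the lowercase `master` and `working` folder names and a hypothetical `transfer_files` helper. It uses Python's `shutil.copytree`, whose default copy function (`shutil.copy2`) attempts to preserve timestamps. It does not create a disk image and does not replace a write blocker.

```python
import shutil
from pathlib import Path


def transfer_files(source_media, project_dir):
    """Copy files from source media to master, then master to working.

    shutil.copytree uses shutil.copy2 by default, which attempts to
    preserve file metadata such as modification times.
    """
    project = Path(project_dir)
    master = project / "master"
    working = project / "working"

    # First copy: source media -> master folder (leave these files unopened).
    shutil.copytree(source_media, master, dirs_exist_ok=True)

    # Second copy: master -> working folder (these are the files to open).
    shutil.copytree(master, working, dirs_exist_ok=True)
    return master, working
```

The `dirs_exist_ok` argument requires Python 3.8 or later; it lets the copy proceed when the master and working folders have already been created.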
Generate a copy of the disk directory information
File names
File sizes
File extensions
Dates
Store a digital copy in the project documentation folder
Print a copy to keep with the physical collection
STEP 6: COPY THE DISK DIRECTORY INFO
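The directory information in Step 6 (file names, sizes, extensions, and dates) can be gathered with a script and saved to the documentation folder. This is a minimal sketch: the column names, the CSV format, and the `list_directory` and `write_listing` helpers are assumptions for illustration. The date recorded is the last-modified time as reported by the file system.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

COLUMNS = ["name", "size_bytes", "extension", "modified_utc"]


def list_directory(root):
    """Return one row per file under root: name, size, extension, date."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            rows.append({
                "name": path.relative_to(root).as_posix(),
                "size_bytes": stat.st_size,
                "extension": path.suffix.lower(),
                "modified_utc": modified.isoformat(),
            })
    return rows


def write_listing(rows, csv_path):
    """Save the listing as a CSV file that can also be printed."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Running `list_directory` on the master folder rather than the original media avoids touching the source a second time.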
Checksums (aka “hash sums”) are created by programs running an algorithm against the contents of a file. (There are many free utilities that will perform this function for you.)
The resulting checksum is a short sequence of letters and/or numbers that, for practical purposes, uniquely identifies that file. (think “electronic fingerprint”)
STEP 7: RUN CHECKSUMS
Unix cksum utility
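As an alternative to a command-line utility, a checksum can be generated with a few lines of Python. This sketch uses SHA-256 from the standard `hashlib` module, which is a different algorithm from the one used by the Unix `cksum` utility shown on the slide; the `file_checksum` helper name is hypothetical.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex checksum of a file's contents.

    The file is read in chunks so that large files do not need to fit
    in memory. Only the contents are hashed, not the name or location.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Whichever algorithm is chosen, it should be recorded alongside the checksum so that the same one is used for later verification.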
Checksums help maintain the INTEGRITY of your collections because they will tell you if things change over time.
If two files are exactly the same, the checksums of those files will also be exactly the same (generally speaking).
If a file becomes corrupted, degraded or is changed in some way, the next time you run the utility on it, the checksum will change.
WHY IS THIS A GOOD THING?
Things that will NOT affect checksums
Moving items from one place to another
Changing the file name
Run on the master files when a collection is completed
Set up a schedule to run “verify checks” periodically
CHECKSUMS: THINGS TO REMEMBER
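A periodic "verify check" amounts to recording checksums once and comparing against them later. The sketch below is one possible approach, not a prescribed tool: the `build_manifest` and `verify_manifest` helpers and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Record a checksum for every file under root, keyed by relative path."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest):
    """Compare current files against a manifest; return any problems found."""
    root = Path(root)
    problems = {}
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file():
            problems[relative_name] = "missing"
        elif sha256_of(target) != expected:
            problems[relative_name] = "changed"
    return problems
```

Because this manifest is keyed by file path, a renamed or moved file is reported as "missing" even though its checksum is unchanged; running `build_manifest` again and comparing checksum values would confirm that the contents are intact.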
Leave yourself (and others) some breadcrumbs
Brief description of contents, any retention schedule, naming conventions, steps taken in transfer
Store the file in the project documentation folder and store a printout of the readme file with the physical collection materials
STEP 8: CREATE A README FILE
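A readme file can be typed by hand in any text editor. For repeated transfers, a small script helps keep the same headings each time. This is a minimal sketch: the layout, the `readme.txt` file name, and the `write_readme` helper are assumptions, with headings taken from the items listed on the slide.

```python
from datetime import date
from pathlib import Path


def write_readme(documentation_dir, description, retention_schedule,
                 naming_conventions, transfer_steps, transfer_date=None):
    """Write a plain-text readme into the documentation folder."""
    transfer_date = transfer_date or date.today()
    lines = [
        "README",
        f"Date of transfer: {transfer_date.isoformat()}",
        "",
        "Description of contents:",
        description,
        "",
        "Retention schedule:",
        retention_schedule,
        "",
        "Naming conventions:",
        naming_conventions,
        "",
        "Steps taken in transfer:",
    ]
    # Number the transfer steps in the order they were carried out.
    lines += [f"{n}. {step}" for n, step in enumerate(transfer_steps, start=1)]

    readme_path = Path(documentation_dir) / "readme.txt"
    readme_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme_path
```

Plain text is used here because it can be opened without special software and prints cleanly for filing with the physical collection.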
Copy the directories containing the master files and project documentation to trustworthy archival storage
Store a second copy of the files in a different physical location
May delete working files at this time
STEP 9: TRANSFER TO SECURE LOCATION
STEP 10: RETURN ORIGINAL TO STORAGE
Return original source media to appropriate storage
-OR-
Destroy the originals using a secure method
Inventory as well as any finding aid, collection-level record and/or accession record
Include steps taken during transfer and the current location(s) of the files
STEP 11: CREATE OR UPDATE ANY ASSOCIATED DESCRIPTIVE TOOL(S)
http://digitalbevaring.dk
Do no harm
Don’t do anything that prevents future action and use
Take action
Document what you do
REVIEW: FOUR ESSENTIAL PRINCIPLES
The Signal: Library of Congress digital preservation blog http://blogs.loc.gov/digitalpreservation/
Minnesota State Archives – Electronic Records Management Resources http://www.mnhs.org/preserve/records/electronicrecords.php
Practical E-Records blog http://e-records.chrisprom.com
Digital Curation Exchange http://digitalcurationexchange.org
Digital Curation Bibliography http://digital-scholarship.org/dcbw/dcb.htm
FURTHER RESOURCES
Emily Pfotenhauer
Recollection Wisconsin Program Manager
Slides and links: http://recollectionwisconsin.org/borndigital
THANK YOU!