Challenges, Workflows, and Insights in the Collaboration to Preserve America's Public Media

Post on 14-Jun-2015


DESCRIPTION

WGBH Media Library and Archives Director Karen Cariani and American Archive of Public Broadcasting Project Manager Casey Davis gave this presentation at the New England Archivists 2014 Fall Symposium. Karen and Casey discussed managing and preserving digital video; Project Hydra; metadata for audiovisual materials; and collaboration with other institutions, seen through the lens of WGBH Media Library and Archives projects including the American Archive of Public Broadcasting and the NEH-funded HydraDAM project.

TRANSCRIPT

Karen Cariani, Director, WGBH Media Library and Archives

Casey E. Davis, AAPB Project Manager

CHALLENGES, WORKFLOWS AND INSIGHTS

IN THE COLLABORATION TO PRESERVE AMERICA'S PUBLIC MEDIA

WHO WE ARE: WGBH MLA

WHO WE ARE: AAPB

...and more than 120 public radio and television stations and archives nationwide

Social media allows anyone to become a video publisher and broadcaster

100 hours of video uploaded to YouTube every minute
60:1 – 80:1 shooting ratio on documentary films
How often do you create videos?

“We’re all digital archivists now.” -Sibyl Schaefer

I would add to that, more specifically.... In a few years, we will also all be audiovisual archivists.

WHY ARE WE HERE TODAY?

• Manage and preserve born-digital AV materials

• Explore digital media repository solutions
• Generate metadata for digital AV materials
• Evaluate multi-institutional collaborations

GOALS AND OBJECTIVES

How many of you have A/V materials in your collection?
How many of you are collecting born digital media?
How are you storing the files?
Can you easily access them?
What are your biggest concerns?
Who is collaborating with other institutions?

A FEW QUESTIONS

MANAGING DIGITAL AV MATERIALS

• Fragility, vulnerability of digital media
• No universally accepted standards or proof of concept
• Digital obsolescence
• Complexity of digital video and audio
• Complex intellectual property issues
• Huge file sizes make storage more expensive
• Storage limitations lead to decisions to compress
• Lack of training among archivists

CHALLENGES OF MANAGING DIGITAL VIDEO

wrapper

Synchronization information

subtitles

Chapter information

Multiple video streams

Multiple audio streams

One or more

codecs

AAPB DIGITIZATION OF 40K HOURS

WGBH’s 7,010 tapes that were sent to Crawford Media Services

RETURNED ON 17 LTO-6 TAPES

Addition of 5,000 hours of digitized and born digital media
Up to 59,000 files
Not to exceed 5.24 terabytes after transcoding occurred

THE AAPB BORN DIGITAL DELIVERABLE

Lack of staff resources at stations
Often no metadata for digital files
File names not consistent w/ metadata
System limitations
Bicycling hard drives
Access quality vs preservation quality
5.24 terabytes became 250+ terabytes

WE HAD SOME CHALLENGES

Create procedures for donors to submit their digital files
Provide donors with resources to inventory their collection
Get as much metadata as you can from the donor
Provide donors with instructions on file naming, drive naming, and organization

ACQUIRING DIGITAL MATERIALS
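One way to give donors a resource for inventorying a drive before they ship it is a small script that lists every file with its path and size. The sketch below is illustrative only and is not a tool described in the presentation; the function name `inventory_drive` and the CSV column names are assumptions.

```python
import csv
import os
from pathlib import Path

# Hypothetical column names; adapt them to whatever inventory form you use.
FIELDS = ["relative_path", "filename", "extension", "size_bytes"]


def inventory_drive(root, csv_path):
    """Walk `root` and write one CSV row per file found under it."""
    root = Path(root)
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            rows.append({
                "relative_path": full.relative_to(root).as_posix(),
                "filename": name,
                "extension": full.suffix.lower(),
                "size_bytes": full.stat().st_size,
            })
    # Sort so that two inventories of the same drive are directly comparable.
    rows.sort(key=lambda row: row["relative_path"])
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A donor could run `inventory_drive("/path/to/drive", "inventory.csv")` and send the CSV along with the drive, which gives the receiving archive file names and sizes to check against whatever metadata arrives.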

Media currently stored on LTO-4 in an HSM system
The goal: send all video files to AAPB
10,648 files X approx. 100+ GB each = 201.6 TB
Copied files over network onto 70 3TB hard drives

Success!

WGBH CONTRIBUTED FILES, TOO

...we initially had a 57% failure rate.

We learned the hard way that everyday IT operations are not good enough.

In the end:

26.4% failure rate

THINK AGAIN
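A common way to detect failed copies of this kind is to record a checksum for every file before transfer and compare it after the files land on the destination drive. The slides do not say how the failures were detected at WGBH, so the following is only a minimal sketch of a fixity check; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_copy(manifest, copy_root):
    """Return (relative_path, problem) pairs for files that fail the check."""
    copy_root = Path(copy_root)
    failures = []
    for relative_path, expected in manifest.items():
        target = copy_root / relative_path
        if not target.is_file():
            failures.append((relative_path, "missing"))
        elif sha256_of(target) != expected:
            failures.append((relative_path, "checksum mismatch"))
    return failures
```

Building the manifest at the source and running `verify_copy` on each destination drive turns a silent copy failure into a list of files to re-send.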

Consider the NDSA levels of preservation
  1. Protect your data
  2. Know your data
  3. Monitor your data
  4. Repair your data
Consider your resources
Do what you can

NO, WE DON’T HAVE 7 COPIES

NDSA Levels of Preservation (matrix)
Levels: 1. Protect your data; 2. Know your data; 3. Monitor your data; 4. Repair your data
Functional areas: Storage & geographic location; File fixity & integrity; Information security; Metadata; File formats

Library of Congress. NDSA Levels of Preservation. http://www.digitalpreservation.gov/ndsa/activities/levels.html

UK Data Service. Prepare and Manage Data. http://ukdataservice.ac.uk/manage-data/

Digital Curation Centre. Checklist for a Data Management Plan. http://www.dcc.ac.uk/resources/data-management-plans/checklist

Library of Congress. DPOE Training Modules. http://www.digitalpreservation.gov/education/

WITNESS. Activists Guide to Archiving Video. http://archiveguide.witness.org/

AMIA Education Committee Blog & forthcoming webinar series. https://amiaeducomm.wordpress.com/

RESOURCES

EXPLORE DIGITAL MEDIA REPOSITORY SOLUTIONS

Preservation files are large
  Uncompressed
  Slow to move around
Complicated formats
  Not just one file type
  Codecs, wrappers, frame speed, etc.
Need proxy files for viewing
  Smaller size for quick transport over network
  Need transcoding


WHAT MAKES VIDEO DIFFERENT?

Vendor options
  License fees - expensive
  Migration to new versions on their timetable
  Professional services to access proprietary code
  Still need tech support
Open Source
  Need developers / tech support
  We all need the same basic functions
  Can add features and functionality

© 2010 WGBH 22

DAM SOFTWARE SOLUTIONS

To build a system using open source solutions and components (Hydra tech stack) for digital media preservation

How hard is it to do?

Is it implementable elsewhere?

Is it feasible for broad use?


NEH PROJECT GOALS

A system to help us manage digital files of all formats
  Born digital
  Many, many file formats and sizes
  Analog to digital files
A system potentially for preservation and access
  Internal and external access
A system that could evolve with our needs as tech changes
  Tech changes every 3-5 years
  Adapt to changing workflows
Affordable


WHAT DID WE NEED?

Open source
  We direct how it evolves
  We make sure it serves our needs
Perhaps cheaper in the long run
  Not free as in free puppy (or kitten) that needs lots of support
But part of an enduring, sustainable community


WHY DID WGBH CHOOSE HYDRA?

• A robust repository fronted by feature-rich, tailored applications and workflows (“heads”)

• One body, many heads

• Collaboratively built gems and “solution bundles” that can be leveraged or adapted and modified to suit local needs.

• A community of developers and adopters extending and enhancing the core framework

• Technical Training & Support

• Open source software


WHAT IS HYDRA?

Aim to work towards a sustainable, open source reusable framework for multipurpose, multifunction, multi-institutional repository-enabled solutions

Challenges
  Do more with less
  Do it fast enough
  Do it well
  Get back on your feet quick

The Hydra Way - Working in Community
  Shared Purpose
  Continual Engagement & Assessment
  Tangible Results


WHY HYDRA?

“If you want to go fast, go alone, if you want to go far, go together” --African Proverb

Hydra Partners and Known Users

OR = Open Repositories Conference

Repository-Powered Approach

[Diagram: a digital repository offering scalable, robust, shared management and preservation services for ETDs (theses); books and articles; images; audio-visual material; research data; maps & GIS; and documents]

Interface can be what you want it to be, simple
Manage digital objects – core functionality
  Search
  Retrieve
  Describe
  Connect
  Store
  Preserve
Build functions and features on top of basic functionality
Started with Sufia from Penn State


FUNCTIONALITY


Moving away from complicated systems

Turning to what we do best

Acknowledging that we can’t and shouldn’t try to be the end-all system

Focus on preservation efforts and hook into the workflow systems

Dealing with LOTS of files, big files, many formats, lots of stuff

Focus on: how do we best handle this given our resources?

WHAT WE’RE DOING


NEW WORKFLOW

Not easy or cheap

Definitely a free puppy
  Not house broken
  Needs care and attention
But great ‘walking’ community
  Offer advice, share solutions
  Identify commonalities and work together

OPEN SOURCE TEST CASE

One Body, Many Heads…

[Diagram: Hydra at the center, providing scalable, robust, shared management and preservation services for ETDs (theses); books and articles; images; audio-visual material; research data; maps & GIS; and documents]

• Time consuming to give same level of detail that happens with other types of content

• Need rational balance

METADATA FOR AV MATERIALS

How many of you have an inventory of your AV assets?
For analog and digital?
Do you have full catalog records?
What metadata schema are you using to capture
  Descriptive
  Intellectual property
  Technical & Preservation metadata?

QUESTIONS

A standard way for anyone managing video or audio to speak the same language

Best practices for capturing critical descriptive, intellectual property, and technical metadata about video and audio

Under further development by the AAPB and PBCore Advisory Group

PBCORE | PBCORE.ORG

Northeast Historic Film
Pop Up Archive
University of Illinois Center for Innovation in Teaching and Learning
Smithsonian Channel
International Criminal Tribunals, The Hague
Alliance for Community Media
University of South Carolina, Moving Image Research Collections
Bay Area Video Coalition
Columbia University Libraries
California Audiovisual Preservation Project
Rock and Roll Hall of Fame
Community Media Distribution Network
MyMassTV Network
Documentary Educational Resources
Washington University Film and Television Archive
American Archive of Public Broadcasting
Dance Heritage Coalition
University of Notre Dame
Greene County Public Library
WITNESS
Glenstone Art Museum

WHO USES IT?

WGBH
Illinois Public Media
Wisconsin Public Television
Wisconsin Public Radio
WYSO
WNYC-FM
WNET
Louisiana Public Broadcasting
Pacifica Radio Archives
KQED
SCETV
CUNY-TV
KUHF
Howard University Television

Database companies/orgs have PBCore profiles including:
  Drupal
  CollectiveAccess
  Omeka
  Islandora

And many video and audio digitization vendors...

WHO USES IT?

Local databases (FileMaker, Access, etc.)
DAM systems
Ready-made solutions:
  Drupal
  CONTENTdm
  CollectiveAccess
  Omeka

Spreadsheets

FIRST THINGS FIRST: HOW TO STORE DATA

BEFORE WE GO ANY FURTHER

Asset / Intellectual Work

Instantiations / Instances

4 content classes
  Intellectual Content
  Intellectual Property
  Technical
  Extensions
82 total elements
30 attributes
Suggested controlled vocabularies

STRUCTURE OF PBCORE

Minimal fields you need to capture
Identifier
  asset level & instantiation level
  Source of the identifier
Title
  Formal or devised
  Type of title
Description
Location
  Room, shelf, box, file path, hard drive ID, etc.
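The minimal fields above can be sketched as a small PBCore XML record. This is an illustrative example under assumptions, not an official template: it assumes PBCore 2.x element and attribute names (`pbcoreIdentifier` with `source`, `pbcoreTitle` with `titleType`, `pbcoreDescription`, and a `pbcoreInstantiation` holding `instantiationIdentifier` and `instantiationLocation`) and the PBCore namespace URI shown in the code. The function name and all sample values in the usage note are hypothetical. Validate real records against the schema on pbcore.org.

```python
import xml.etree.ElementTree as ET

# Assumed PBCore 2.x namespace; confirm against the schema you validate with.
PBCORE_NS = "http://www.pbcore.org/PBCore/PBCoreNamespace.html"


def minimal_pbcore(asset_id, id_source, title, title_type, description,
                   instantiation_id, location):
    """Build a record holding only identifier, title, description, location."""
    ET.register_namespace("", PBCORE_NS)

    def tag(name):
        return "{%s}%s" % (PBCORE_NS, name)

    doc = ET.Element(tag("pbcoreDescriptionDocument"))
    # Asset-level identifier, with the source of the identifier as an attribute.
    ET.SubElement(doc, tag("pbcoreIdentifier"), source=id_source).text = asset_id
    ET.SubElement(doc, tag("pbcoreTitle"), titleType=title_type).text = title
    ET.SubElement(doc, tag("pbcoreDescription")).text = description
    # Instantiation-level identifier and location (shelf, file path, drive ID).
    inst = ET.SubElement(doc, tag("pbcoreInstantiation"))
    ET.SubElement(inst, tag("instantiationIdentifier"),
                  source=id_source).text = instantiation_id
    ET.SubElement(inst, tag("instantiationLocation")).text = location
    return ET.tostring(doc, encoding="unicode")
```

For example, `minimal_pbcore("asset-0001", "Local archive", "Evening News", "Program", "Local news broadcast.", "asset-0001.mov", "Drive 12, /video/asset-0001.mov")` returns an XML string that can be saved alongside the file or loaded into a database.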

FINDING & CREATING THE METADATA

SO YOU’VE GOT THIS TAPE. NOW WHAT?

WHAT ABOUT THIS FILE?

DIGITAL MEDIA IDENTIFIER & FORMAT

Filename = Instantiation ID

From extension you can get Digital Format

http://en.wikipedia.org/wiki/Internet_media_type
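As a rough illustration of the two points above, Python's standard `mimetypes` module can guess an Internet media type from a file extension, and the filename itself can serve as the instantiation identifier. The function name and dictionary keys below are assumptions made for this sketch. An extension is only a hint, so the guess should be confirmed with a tool that reads the file itself.

```python
import mimetypes
from pathlib import PurePath


def identifier_and_format(file_path):
    """Derive an instantiation ID and a guessed media type from a file path."""
    name = PurePath(file_path).name
    # guess_type looks only at the extension; it never opens the file.
    media_type, _encoding = mimetypes.guess_type(name)
    return {"instantiation_id": name, "digital_format": media_type}
```

Calling `identifier_and_format("/Volumes/drive01/show_001.mp4")` yields the filename `show_001.mp4` as the identifier and `video/mp4` as the guessed format; an unrecognized extension gives `None` for the format.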

DIGITAL MEDIA: ADDITIONAL METADATA

and more technical metadata...

AUTOMATE THE PROCESS

Automation
● removes human error
● less staff time
● consistency

Tools:
ffprobe
mediainfo
ExifTool
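A minimal sketch of automating technical metadata capture with one of the tools named above: it calls `ffprobe` with JSON output and reduces the result to the wrapper plus one entry per stream. It assumes `ffprobe` is installed and on the PATH; the function names and the shape of the summary dictionary are choices made for this example, not part of any standard.

```python
import json
import subprocess


def run_ffprobe(path):
    """Run ffprobe on one file and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", str(path)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(probe):
    """Reduce an ffprobe report to wrapper-level and per-stream basics."""
    fmt = probe.get("format", {})
    summary = {
        "wrapper": fmt.get("format_name"),
        "duration_seconds": fmt.get("duration"),
        "size_bytes": fmt.get("size"),
        "streams": [],
    }
    for stream in probe.get("streams", []):
        summary["streams"].append({
            "type": stream.get("codec_type"),   # e.g. video, audio, subtitle
            "codec": stream.get("codec_name"),
            "width": stream.get("width"),       # None for non-video streams
            "height": stream.get("height"),
        })
    return summary
```

Running `summarize(run_ffprobe(path))` over every file in a delivery gives consistent wrapper and codec information without anyone retyping it.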

BUT I USE OTHER STANDARDS...

PBCORE IS FLEXIBLE & EXTENSIBLE

As an XML schema, PBCore can be implemented along with other standards
  Within a METS wrapper
  With PREMIS as a sidecar file or as a <pbcoreExtension>
  To provide more granular item-level description along with collection-level description in EAD

<PBCORETITLE>EXAMPLES</PBCORETITLE>

http://www.pbcore.org/documentation/

• Simple instantiation record
• Simple description document
• PBCore Collection
• PBCore in a METS record
• PBCore in a digital preservation setting
• Using PBCore for asset management
• Using PBCore for archival description

http://www.pbcore.org
PBCore Webinar recording: http://www.vimeo.com/aapb/pbcore
PBCore Validator: http://infinite-spire-2035.herokuapp.com/
Forthcoming PBForm & updated FileMaker template
Twitter: @therealpbcore

RESOURCES

MULTI-INSTITUTIONAL COLLABORATIONS

Content projects: Vietnam, Boston Local News, China?

Content inventory project
Hydra community – open source project
AAPB – participating organizations
Digital Commonwealth

COLLABORATIONS

Planning time
Creating policy – but be flexible
Deliverables for collaboration
Org chart for decision making – who has the final word
Who is deeply involved, who is peripheral
Example: Inventory project:
  Data gathering
  Tools – PBCore validator
  Forms – minimum fields
  Hand holding – call us

HOW TO GET WHAT YOU NEED

Important to build rapport with partners

Relationships: I love you, I need you, but I want you to change

Learn about hierarchy at partner institution so you can understand challenges and potential obstacles.

Manage expectations

FACE TO FACE

In workflow
In budgeting
In needs
In timeframes

ACCEPT DIFFERENCES

Don’t be afraid to make sure your needs or your institution’s needs are being met

In a large collaboration, you are most likely not the only one to have those thoughts

SPEAK UP

Karen Cariani
karen_cariani [at] wgbh [dot] org
@kcariani

Casey E. Davis
casey_davis [at] wgbh [dot] org
@caseyedavis1

www.americanarchive.org
www.pbcore.org
Openvault.wgbh.org

THANK YOU!
