Upload: derek-keats

Post on 16-Apr-2017

A strategic view of document and digital object management

for the University of the Witwatersrand, Johannesburg

Prof Derek W. Keats
Deputy Vice Chancellor
(Knowledge & Information Management)
The University of the Witwatersrand, Johannesburg
http://[email protected]

What are documents?

How does the computer 'see' them?

The storage view

The manipulation view

The structural view

The operational view

The storage view
The operational view
The manipulation view
The structural view

Require software that understands the 'document' and knows how to present it.

[Diagram: the storage, operational, manipulation and structural views over time]

The future

Today

Physical deterioration

Digital obsolescence

Accidental damage

Loss of metadata

Survival

Devices

File formats

A major threat to proprietary file formats common in proprietary systems

Device obsolescence

File format obsolescence

Software supporting the format fails in the marketplace or is bought by a competitor and withdrawn.

Software upgrades fail to support legacy files.

The format itself is superseded by another or evolves in complexity.

The format "take up" is low or industry fails to create compatible software.

The format fails, stagnates, or is no longer compatible with the current environment.

A small subset of commonly used media formats!

Media

If you don't have the software, even a perfectly preserved document is of no use.
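
A modest first step toward gauging this risk might be to find out which formats a collection actually contains. The sketch below is an illustration only, not part of the original slides: the function name inventory_formats is hypothetical, and it assumes that the file extension is a fair proxy for the format, which is not always true.

```bash
#!/bin/bash
# Count the files under a directory by extension, most common first.
# Extensions are lower-cased; files without one are grouped as "(none)".
# inventory_formats is a hypothetical helper name used for illustration.
inventory_formats() {
    local dir="${1:-.}"
    local path name
    find "$dir" -type f | while IFS= read -r path; do
        name="${path##*/}"
        case "$name" in
            # At least one character before a dot: treat the tail as the extension.
            ?*.*) printf '%s\n' "${name##*.}" | tr '[:upper:]' '[:lower:]' ;;
            *)    printf '%s\n' "(none)" ;;
        esac
    done | LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2,2 |
        awk '{print $2 "\t" $1}'
}
```

Calling inventory_formats /path/to/collection prints one line per extension with its file count, separated by a tab. Formats that appear in the list but no longer have supported software are the ones to look at first.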

Digitization

Document management

Born digital

Digital recovery

Digital archiving

Digital preservation

Analogue

Digital

Time

Digital assets

Risk without long-term planning

As a component of how we manage our digital assets

Why digital asset management?

We are a knowledge organization

Knowledge workers spend 30-40% of their time on document-related tasks

This increases significantly when other digital assets are taken into consideration

Digital assets are increasing and increasingly easy to lose

Digital assets form the basis of much of our research

And much more is possible

Digital archiving and preservation

Institutional papers and documents

Other digital assets

Historical papers

Library collections

Various history projects

Rock art collections

Video and audio collections, e.g. Wits TV

Donations of significant collections from industry

History of human evolution research

Research output and theses

Research data

The curse of the born-analogue

Social and semantic elements

Capture, Create, Classify, Share, Archive, Destroy, Protect, Retain, Find & use, Preserve, Route

Creating semantic and socially connected

document stores
archives
repositories
museums
herbaria

21st Century

Chisimba
Semantic and social 'X'

Fedora Commons
SWORD API
Chisimba API
XMPP

eLearning
'Portals'

Workflow

WEWE

Workflow

WeWe Basics

Rules-driven workflow engine

Rules represented in XML

Sequential event support

Conditional Return support

Written in Perl

Uses PostgreSQL Database

Open Source

Originally developed for The University of the Witwatersrand, Johannesburg

Multiple Management interfaces

WeWe Designer

Web-based design tool for designing workflows

Supports multiple events with multiple return types/states

Drag and drop interface

Written in jQuery

Open Source Interface

Adapt from Design Template support

WeWe Developer

Developers create Rules Modules

Modules can be written in Perl or any other language that can be executed from the Linux command line

API

Commandline Interface
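
Because a rules module only has to be something the Linux command line can execute, even a short shell function could serve as one. The sketch below is hypothetical: the slides do not describe how WeWe actually calls a module, so the convention used here (file path as the first argument, a state name printed on standard output, exit status 0 or 1) and the name check_pdf_rule are assumptions made purely for illustration.

```bash
#!/bin/bash
# Hypothetical rule: decide whether an incoming file looks like a PDF.
# Prints a state name ("accepted" or "rejected") and returns 0 or 1.
# This calling convention is an assumption, not WeWe's documented API.
check_pdf_rule() {
    local file="$1"
    local magic

    # A missing or empty file cannot be routed onward.
    if [ ! -s "$file" ]; then
        echo "rejected"
        return 1
    fi

    # PDF files begin with the four bytes "%PDF".
    magic=$(head -c 4 "$file")
    if [ "$magic" = "%PDF" ]; then
        echo "accepted"
        return 0
    fi

    echo "rejected"
    return 1
}
```

A workflow with conditional return support could then branch on the printed state, for example sending "accepted" files on to ingest and "rejected" files back for rescanning.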

Workflow Process

Enterprise document management

An approach using private cloud

[Diagram: private cloud infrastructure. Labels: folder server, WEWE, Chisimba; sites, shared folders, born digital, ingest, network, WWW; workflow managed by WEWE layer.]

[Diagram: private cloud infrastructure. Labels: hosted services, digital archive, virtualization, Chisimba, Fedora, other; Wits portals, eLearning, email, Zimbra; SOA layer; OS: Open Solaris; iRODS, remote sites, WEWE; compute cloud; hierarchical storage (robotic tape library, spinning disks, flash memory).]

Use in establishing digital archive

[Diagram: private cloud infrastructure. Labels: source artifacts, digital conversion, remote sites, ingest, WEWE rules; born digital docs, audio, video, etc.; SOA layer; OS: Open Solaris; first tier storage; compute cloud, storage cloud, robotic tape library; digital archive (Fedora, WEWE, Chisimba, Archon).]

Scanning & assembly

#!/bin/bash
# Scan in the pages
# Assumes scanadf's default output names (image-0001, image-0002, ...)
scanadf --mode "Black & White" --resolution 200

# Convert each page to a pdf file
for file in image-*; do
    convert "$file" "$file.pdf"
    rm "$file"
done

# Concatenate all the individual pdf files into one, named after the first argument
pdftk image-*.pdf cat output "$1.pdf"
rm image-*.pdf

# Hand the result over to the monitored outgoing folder
mv *.pdf "/home/$USER/monitored/outgoing/"

exit 0

The real challenge is getting the document scanned and into a PDF and sent off to somewhere meaningful.

That's why we need expensive document imaging software.

Right?

Let's have one digital asset management project for Wits and let us create the synergy that leads to innovation.

Attribution file: http://www.dkeats.com/usrfiles/users/1563080430/attribution/attrib.txt