digitisation overview

Post on 16-Apr-2017


Ria Groenewald

Department of Library Services

University of Pretoria

You cannot teach a man anything; you can only help him find it within himself

Galileo Galilei

Simplified definition of digitisation

Digitisation is the managed conversion of analogue material to a digital format for ongoing access by electronic devices during the intended life cycle of the digital object

• Kodak / Minolta Microfiche scanner

• i2S DigiBook book scanner

• Nikon 9000 Coolscan

• USB Turntable

• Tapedeck - ripper

• Epson 1640X

1. Kodak / Minolta Microfiche scanner

2. i2S DigiBook bookscanner

3. Nikon 9000 Coolscan

4. iTTUSB Turntable

5. PlusDeck 2c

6. Epson A3 flatbed


The library needs to use technology effectively in reaching out to users. In the academy, this means bringing innovation to our thinking

Stuart Basefsky, 16 June 2009

http://www.llrx.com/node/2177/print

“Following benchmarks and best practices that are not a good fit for your [university] or its culture can be counterproductive. The most effective way of using benchmarks and best practices is as a creative mechanism for raising questions about your own [situation]. Following what others do is rarely a form of good leadership.”

Leadership & The Role of Information: Making The Creatively Informed Questioner

By Stuart Basefsky, published on October 29, 2008

http://www.llrx.com/features/leadershipandroleofinformation.htm

Identify a project

• Know your collections
– what is valuable
– what others need to “see”
– core business of institution
– what is used often
– benefit of such a project (collection as well as stakeholders)

As part of planning a digitisation project, you’ll have to decide on the scanning and format specifications, such as:

• bit depth (bitonal, greyscale or 24-bit colour)
• scanning resolution (400 dpi, etc.)
• image manipulation options (deskewing, etc.)
• file format (TIFF, etc.)

Project planning
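The choice of resolution and bit depth drives storage needs, and the arithmetic can be sketched in a few lines. The function name and the 8 x 10 inch page below are illustrative assumptions, not figures from the presentation; the result is the raw pixel data only and ignores file headers and compression.

```python
import math

# Bits per pixel for the three bit depths named in the list above.
BITS_PER_PIXEL = {"bitonal": 1, "greyscale": 8, "colour": 24}


def uncompressed_size_bytes(width_in, height_in, dpi, bit_depth):
    """Estimate the raw (uncompressed) size of one scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * BITS_PER_PIXEL[bit_depth]
    return math.ceil(total_bits / 8)


# Hypothetical example: an 8 x 10 inch page scanned at 400 dpi.
for depth in BITS_PER_PIXEL:
    size = uncompressed_size_bytes(8, 10, 400, depth)
    print(f"{depth:>9}: {size / 1_000_000:.1f} MB")
```

Multiplying the per-page figure by the number of pages in a collection gives a first rough estimate of the storage that master images would need.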

• It is hard to provide a general price range because collections and digitisation requirements vary
• Digitisation projects, services and costs can be as unique as the collections selected for digitisation
• Projects have fundamental similarities (dpi selection, derivative file creation, source material format, etc.), but other characteristics can make apparently similar projects completely different

Cost

Institutions should be able to define and defend their choices related to digitisation in terms of their institutional mission of teaching and research, and to avoid the distraction of commercialising their products

Policy making

Think – don’t tumble

• Will digital assets increase access to information that is hard to obtain otherwise?

• Will digital assets increase the information value of the physical material?

• Does digitisation fit the organisation’s mission?

• Is there a known potential audience for the materials that are planned to be digitised?

• Will digitisation increase access, functionality or intellectual control?

Questions

• Will digitising these materials fill a need that is currently unmet?

• Are the materials in the public domain or can proper rights be secured?

• Is funding in place for the digitisation program?

Questions

• Identify a project
• Selection criteria
• Copyright
• Basic preservation on physical material
• Scanning
• Manipulation
• Web ready
• Submit or hand over

Workflow

• know the history and rationale behind selection of sources
• start with collection items that are often used
• embrittled material
• published within a certain time period
• materials have to be Africana
• language limitations
• forming part of a certain collection
• make sure no duplicates are included

Selection criteria

Copyright

• steer clear of copyright problems
• try to avoid material still in copyright
• where necessary start with copyright clearance first – may take long to sort out
• note every step along the way – keep the evidence

• Basic cleaning of material
– dust
– tears / broken corners
– mould
– remove sellotape / glue / Pritt
– remove staples, gem clips, anything that can cause rust marks
– store in acid free containers if possible

Physical preservation

[Workflow diagram. Labels: scan directly to archival server; copy from archival server (AS); quality control; deskew / cleaning / derivation / filter; save web ready; final QC + storage on archival server; send to submitters via email, external hard drive, or DVD/CD/flash drive; baseline submission; reviewer; metadata editor; unique URI created for object; UPSpace IR. QA is marked at several points.]

[Workflow diagram, dated 13 Apr 2005: steps and the people responsible]

Preparation of material: Lecturer / Vet library personnel
Copyright clearance: Jacob
Selection criteria of material: Lecturer / Vet library
Baseline metadata: Service Unit Staff
Scan material: Digitization office / EI
Conversion of image + OCR*: Digitization office
Store master image: Digitization office + Vet library
Baseline metadata: Service Unit Staff
Cataloguing on UPSpace: Amelia / Cataloguer
Add LCSH subjects
Webready process
Access rights: Lecturer
UPSpace Administrator: Amelia Breytenbach (Vet)
Link images: Digitization office / Amelia

*OCR of books – only Preface/Contents/Index

• Start with the easy part
– photo collection
– black and white documents
• Phase it
• Reward yourself when finished

Scanning

Guidelines to digital imaging

Imaging requirements

• Printed text

Resolution: 400-600 dpi
Bit depth: Bitonal
Enhancements allowed: Sharpening, descreening, cropping, deskewing, and despeckling

Imaging requirements

• Rare/damaged printed text

Resolution: 400-600 dpi
Bit depth: 8-bit greyscale or 24-bit colour
Enhancements allowed: Contrast stretching; minimal adjustments for tone and colour

Imaging requirements

• Book illustrations

Resolution: 400-600 dpi with enhancement
Bit depth: 8-bit greyscale or 24-bit colour
Enhancements allowed: Contrast stretching; minimal adjustments for tone and colour
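One way to put the imaging requirements to work is to hold them in a small lookup table and check scanner settings against it before a batch starts. The sketch below is only an illustration: the table values are copied from the requirements above, while the function name and the bit-depth labels are assumptions made for the example.

```python
# Guideline values copied from the imaging requirements above.
GUIDELINES = {
    "printed text": {"dpi": (400, 600), "bit_depths": {"bitonal"}},
    "rare/damaged printed text": {"dpi": (400, 600), "bit_depths": {"greyscale", "colour"}},
    "book illustrations": {"dpi": (400, 600), "bit_depths": {"greyscale", "colour"}},
}


def check_scan_settings(material, dpi, bit_depth):
    """Return a list of problems; an empty list means the settings fit the guideline."""
    rule = GUIDELINES[material]
    low, high = rule["dpi"]
    problems = []
    if not low <= dpi <= high:
        problems.append(f"resolution {dpi} dpi is outside {low}-{high} dpi")
    if bit_depth not in rule["bit_depths"]:
        allowed = ", ".join(sorted(rule["bit_depths"]))
        problems.append(f"bit depth '{bit_depth}' is not one of: {allowed}")
    return problems


# Hypothetical settings that break both rules for printed text.
print(check_scan_settings("printed text", 300, "colour"))
```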

• Less is more
– don’t fiddle; make only the necessary amendments
– get it ready for web display
– remember the technical metadata
– note everything

Image manipulation

Redaction

• Identify material for redaction
– Once redactions have been identified and agreed upon, decisions need to be recorded
– Do not remove a whole sentence or paragraph if only one or two words are non-disclosable
– Be consistent throughout the collection

• Archival image
– each image needs its own unique identifier
– keep apart – do not work on the archival image, make a COPY
– save the copy apart from the archival image
– note every step in the database

Storage
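The rule of never working on the archival image can be sketched as a small routine that copies the master to a separate working folder and records the step. This is a minimal illustration under stated assumptions: a CSV file stands in for the database, the file stem stands in for the unique identifier, and the folder and file names are hypothetical.

```python
import csv
import hashlib
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_working_copy(master: Path, work_dir: Path, log_file: Path) -> Path:
    """Copy an archival master into a separate working folder and log the step."""
    work_dir.mkdir(parents=True, exist_ok=True)
    copy_path = work_dir / master.name
    shutil.copy2(master, copy_path)  # the master itself is only read, never changed
    checksum = hashlib.sha256(master.read_bytes()).hexdigest()
    is_new_log = not log_file.exists()
    with log_file.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new_log:
            writer.writerow(["identifier", "action", "sha256", "timestamp_utc"])
        writer.writerow([master.stem, "working copy created", checksum,
                         datetime.now(timezone.utc).isoformat()])
    return copy_path


# Demonstration in a throwaway folder with a stand-in file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "archive").mkdir()
    master = root / "archive" / "coll001_0001.tif"  # hypothetical identifier scheme
    master.write_bytes(b"stand-in for TIFF data")
    working = make_working_copy(master, root / "working", root / "log.csv")
    print(working.name, working.parent.name)
```

All deskewing, cleaning and derivation would then happen on the file in the working folder, with further rows added to the log for each step.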

• More is better
– archival image
– at least one TIFF original on DVD / hard disk / external hard disk
– at least one derivative copy on DVD / hard disk / external hard disk
– store apart; if possible, keep a copy in another building

Storage
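Keeping several copies only helps if they stay identical, so a periodic comparison of checksums is a common companion to this advice. The presentation does not prescribe a method; the sketch below assumes SHA-256 checksums, and the file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in chunks so large TIFFs do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(reference: Path, copies: list[Path]) -> list[Path]:
    """Return the copies that are missing or differ from the reference file."""
    expected = sha256_of(reference)
    return [c for c in copies if not c.exists() or sha256_of(c) != expected]


# Demonstration with stand-in files; one copy is deliberately damaged.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "server_master.tif"
    good = root / "external_disk_copy.tif"
    bad = root / "dvd_copy.tif"
    original.write_bytes(b"image data")
    good.write_bytes(b"image data")
    bad.write_bytes(b"image dat")  # simulated damage
    print([p.name for p in mismatched_copies(original, [good, bad])])
```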

Codex Sinaiticus is one of the world's outstanding manuscripts. Together with Codex Vaticanus, it is one of the earliest extant Bibles, containing the oldest complete New Testament. This treasured codex is indispensable for understanding the earliest text of the Greek Bible, the transmission of its text, the establishment of the Christian canon, and the history of the book. Over 400 leaves survive and are held across four institutions.

http://www.codexsinaiticus.org/en/project/digitisation.aspx

Through testing, the decision was made to opt for a compromise colour. A light brown background was chosen that was close enough to the colour of the parchment to give a sense of its warmth, while reducing the show-through to a point where it rarely makes reading the page difficult.

Test image of a Codex Sinaiticus page on a black background

Test image of a Codex Sinaiticus page on a white background

http://www.codexsinaiticus.org/en/project/digitisation.aspx

Measuring for scanner set-up

Quality Control on scanned images

Make a copy of the original scanned image to work with

File Renaming

BookRestorer - derivation process

Black and white compressed image

MR. GLADSTONE ON FAIR T: AD'. AND RUNT JUCPuctios-jTHE nkxt I.IIiKt.AI. LRADKk?LORD?AKIINOTON's NEW ATTITUDE AND WHATMR. CHAMBERLAIN THINKS OF IT?MR.RI.AINK AND LOUIS KOSSUTH?AX ANARCHIST CARDINALBISMARCK AND BROWNING??ART AND LITERA?RY NOT I 8.fBT CABLR TO THIS TRIBUNE.|

Optical Character Recognition

PDF

Newspaper digitisation

Microfiche

Risk analysis for digital objects

• Hard drive failure
• URL error – link broken
• Storage medium failure
• Loss of information/data
• Human error and memory
• Hackers

www.fotosearch.com

Preservation

• Preservation strategies should enable subsequent users to work with digital resources in the same way that they would be able to continue to work with older, analogue materials.

• Can we afford to scan at a low resolution, or make other compromises in the digitisation life-cycle?

• budget for a possible migration strategy

• consider digital formats carefully

• metadata standards (technical and preservation)

• the organisation must be committed to the program

• follow best practices and international standards

• IT must adapt to long-term needs of digital preservation

• develop a technology infrastructure plan

Digital preservation

[Diagram labels: intellectual entity (photo); converted to digital object; TIFF image file; transform to JPEG for web display; preserve for interoperability, access and readability; Rights = Object - instructs user what it represents]

Object:

• File size
• Date created
• File format
• Creating application

Rights:

• License agreement
• Exact permissions granted over preservation of the object

Agent:

• The role of the person undertaking the event (name/organization)
• Software name and version no.
• OS type

PREMIS MODEL
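The Object, Rights and Agent fields listed above can be captured as a simple structured record kept alongside each image. The sketch below only mirrors the fields named on the slide; it is not the PREMIS schema itself, and the class names, field names and sample values are all hypothetical.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class ObjectInfo:
    file_size_bytes: int
    date_created: date
    file_format: str
    creating_application: str


@dataclass
class RightsInfo:
    license_agreement: str
    permissions_granted: str


@dataclass
class AgentInfo:
    role: str
    name: str
    software_name_version: str
    os_type: str


@dataclass
class PreservationRecord:
    identifier: str
    digital_object: ObjectInfo
    rights: RightsInfo
    agent: AgentInfo


# Every value below is a made-up placeholder for illustration.
record = PreservationRecord(
    identifier="coll001_0001",
    digital_object=ObjectInfo(38_400_000, date(2009, 1, 1), "TIFF", "ExampleScan 1.0"),
    rights=RightsInfo("example licence", "copy and migrate for preservation"),
    agent=AgentInfo("scanning operator", "Example Library", "ExampleScan 1.0", "Windows"),
)
print(asdict(record)["digital_object"]["file_format"])
```

Converting the record with `asdict` gives a plain dictionary that can be written to whatever metadata store the repository uses.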

Ria Groenewald
Digitization Coordinator
Department of Library Services
University of Pretoria
Email: ria.groenewald@up.ac.za
Tel: 012 x 420-3792
