digitisation overview
Post on 16-Apr-2017
Ria Groenewald
Department of Library Services
University of Pretoria
You cannot teach a man anything; you can only help him find it within himself
Galileo Galilei
Simplified definition of digitisation
Digitisation is the managed conversion of analogue material to a digital format for ongoing access by electronic devices during the intended life cycle of the digital object
• Kodak / Minolta Microfiche scanner
• i2S DigiBook book scanner
• Nikon 9000 Coolscan
• USB Turntable
• Tapedeck - ripper
• Epson 1640X
1. Kodak / Minolta Microfiche scanner
2. i2S DigiBook book scanner
3. Nikon 9000 Coolscan
4. iTTUSB Turntable
5. PlusDeck 2c
6. Epson A3 flatbed
The library needs to use technology effectively in reaching out to users. In the academy, this means bringing innovation to our thinking
Stuart Basefsky, 16 June 2009
http://www.llrx.com/node/2177/print
“Following benchmarks and best practices that are not a good fit for your [university] or its culture can be counterproductive. The most effective way of using benchmarks and best practices is as a creative mechanism for raising questions about your own [situation]. Following what others do is rarely a form of good leadership.”
Leadership & The Role of Information: Making The Creatively Informed Questioner
By Stuart Basefsky, published on October 29, 2008
http://www.llrx.com/features/leadershipandroleofinformation.htm
Identify a project
• Know your collections
– what is valuable
– what others need to “see”
– core business of institution
– what is used often
– benefit of such a project (collection as well as stakeholders)
As part of digitisation project planning, you’ll have to decide on the scanning and format specifications, such as the:
• bit depth (bitonal, greyscale or 24-bit colour)
• scanning resolution (400 dpi, etc.)
• image manipulation options (deskewing, etc.)
• file format (TIFF, etc.)
Project planning
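Bit depth and resolution together drive file size, and so storage planning. The sketch below is only an illustration: the function name, the A4 page size and the sample settings are assumptions, not figures from the slides, and real TIFF files add headers and may be compressed.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of one scanned page, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Bits per pixel for the three bit depths named on the slide.
BIT_DEPTHS = {"bitonal": 1, "greyscale": 8, "colour": 24}

# Assumed page size: A4 is 210 mm x 297 mm, about 8.27 x 11.69 inches.
A4_IN = (8.27, 11.69)

for name, bits in BIT_DEPTHS.items():
    size = uncompressed_size_bytes(*A4_IN, dpi=400, bits_per_pixel=bits)
    print(f"{name:10s} {size / 1_000_000:8.1f} MB per page")
```

At the same resolution a 24-bit colour scan needs 24 times the raw storage of a bitonal scan, which is one reason the bit depth decision belongs in the planning stage.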
• It is hard to provide a general price range because collections and digitisation requirements vary
• Digitisation projects, services and costs can be as unique as the collections selected for digitisation
• Projects have fundamental similarities (dpi selection, derivative file creation, source material format, etc.), but other characteristics can make apparently similar projects completely different
Cost
Institutions should be able to define and defend their choices related to digitisation in terms of their institutional mission of teaching and research, and to avoid the distraction of commercialising their products
Policy making
Think – don’t tumble
• Will digital assets increase access to information that is hard to obtain otherwise?
• Will digital assets increase the information value of the physical material?
• Does digitisation fit the organisation’s mission?
• Is there a known potential audience for the materials that are planned to be digitised?
• Will digitisation increase access, functionality or intellectual control?
Questions
• Will digitising these materials fill a need that is currently unmet?
• Are the materials in the public domain or can proper rights be secured?
• Is funding in place for the digitisation program?
Questions
• Identify a project
• Selection criteria
• Copyright
• Basic preservation on physical material
• Scanning
• Manipulation
• Web ready
• Submit or hand over
Workflow
• know the history and rationale behind selection of sources
• start with collection items that are often used
• embrittled material
• published within a certain period
• materials have to be Africana
• language limitations
• forming part of a certain collection
• make sure no duplicates are included
Selection criteria
Copyright
• stay clear of copyright
• try to avoid material still in copyright
• where necessary start with copyright clearance first – may take long to sort out
• note every step along the way – keep the evidence
• Basic cleaning of material
– dust
– tears / broken corners
– mould
– remove sellotape / glue / Pritt
– remove staples, gem clips, anything that can cause rust marks
– store in acid free containers if possible
Physical preservation
[Workflow diagram, UPSpace IR, 13 Apr 2005: scan directly to archival server; copy from AS; quality control; deskew / cleaning / derivation / filter; save web ready; final QC + storage on archival server; send to submitters via external hard drive or DVD/CD/flash drive; baseline submission; reviewer; metadata editor (add LCSH subjects); unique URI created for object. QA checkpoints sit between the stages.]
Preparation of material: Lecturer / Vet library personnel
Copyright clearance: Jacob
Selection criteria of material: Lecturer / Vet library
Baseline metadata: Service Unit Staff
Scan material: Digitization office / EI
Conversion of image + OCR*: Digitization office
Store master image: Digitization office + Vet library
Baseline metadata: Service Unit Staff
Cataloguing on UPSpace: Amelia / Cataloguer
Webready process
Access rights: Lecturer
UPSpace Administrator: Amelia Breytenbach (Vet)
Link images: Digitization office / Amelia
*OCR of books – only Preface/Contents/Index
• Start with the easy part
– photo collection
– black and white documents
• Phase it
• Reward yourself when finished
Scanning
Guidelines to digital imaging
Imaging requirements
• Printed text
Resolution: 400-600 dpi
Bit depth: Bitonal
Enhancements allowed: Sharpening, descreening, cropping, deskewing, and despeckling
Imaging requirements
• Rare/damaged printed text
Resolution: 400-600 dpi
Bit depth: 8-bit greyscale or 24-bit colour
Enhancements allowed: Contrast stretching; minimal adjustments for tone and colour
Imaging requirements
• Book illustrations
Resolution: 400 dpi - 600 dpi with enhancement
Bit depth: 8-bit greyscale or 24-bit colour
Enhancements allowed: Contrast stretching; minimal adjustments for tone and colour
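The three imaging requirement slides can be kept as a small lookup that a scanning operator checks settings against. This is a sketch only: the class, the dictionary keys and `check_scan` are hypothetical names, and reading "8-gray or 24 colour" as 8-bit greyscale or 24-bit colour is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImagingSpec:
    min_dpi: int
    max_dpi: int
    bit_depths: tuple
    enhancements: tuple


# Values taken from the imaging requirement slides; key names are illustrative.
SPECS = {
    "printed_text": ImagingSpec(
        400, 600, ("bitonal",),
        ("sharpening", "descreening", "cropping", "deskewing", "despeckling"),
    ),
    "rare_or_damaged_text": ImagingSpec(
        400, 600, ("8-bit greyscale", "24-bit colour"),
        ("contrast stretching", "minimal tone and colour adjustment"),
    ),
    "book_illustrations": ImagingSpec(
        400, 600, ("8-bit greyscale", "24-bit colour"),
        ("contrast stretching", "minimal tone and colour adjustment"),
    ),
}


def check_scan(material, dpi, bit_depth):
    """Return a list of problems; an empty list means the settings fit."""
    spec = SPECS[material]
    problems = []
    if not spec.min_dpi <= dpi <= spec.max_dpi:
        problems.append(
            f"resolution {dpi} dpi is outside {spec.min_dpi}-{spec.max_dpi} dpi"
        )
    if bit_depth not in spec.bit_depths:
        problems.append(f"bit depth '{bit_depth}' is not one of {spec.bit_depths}")
    return problems


print(check_scan("printed_text", 300, "24-bit colour"))
```

Keeping the requirements in one table like this makes it easier to note the settings in the technical metadata later.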
• Less is more
– don’t fiddle, just do the necessary amendments
– get it ready for web display
– remember the technical metadata
– note everything
Image manipulation
Redaction
• Identify material for redaction
– Once redactions have been identified and agreed upon, decisions need to be recorded
– Do not remove a whole sentence or paragraph if only one or two words are non-disclosable
– be consistent throughout the collection
• Archival image
– each image needs its own unique identifier
– keep apart – do not work on the archival image, make a COPY
– save the copy apart from the archival image
– note every step in database
Storage
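The archival image rules above can be sketched as a small helper that copies the master into a separate working folder, confirms the copy matches, and notes the step. The function name, the `_work` suffix and the list standing in for the database are all assumptions for illustration.

```python
import filecmp
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_working_copy(archival_file, work_dir, log):
    """Copy an archival image to work_dir; the archival file is never modified."""
    archival_file = Path(archival_file)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # The unique identifier is assumed to be the file stem, e.g. "UP_0001".
    copy_path = work_dir / f"{archival_file.stem}_work{archival_file.suffix}"
    if copy_path.exists():
        raise FileExistsError(f"working copy already exists: {copy_path}")

    shutil.copy2(archival_file, copy_path)
    # Byte-for-byte comparison, not just size and timestamp.
    if not filecmp.cmp(archival_file, copy_path, shallow=False):
        raise OSError(f"copy of {archival_file} does not match the original")

    # Stand-in for "note every step in database".
    log.append({
        "identifier": archival_file.stem,
        "step": "working copy created",
        "copy": str(copy_path),
        "when": datetime.now(timezone.utc).isoformat(),
    })
    return copy_path


# Small demonstration with a placeholder file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "archive" / "UP_0001.tif"
    master.parent.mkdir()
    master.write_bytes(b"placeholder image data")
    steps = []
    print(make_working_copy(master, Path(tmp) / "work", steps).name)
```

All manipulation then happens on the copy in the working folder, so the archival image stays exactly as scanned.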
• More is better
– archival image
– at least one TIFF original on DVD / hard disk / external hard disk
– at least one derivative copy on DVD / hard disk / external hard disk
– store apart, if possible keep a copy in another building
Storage
Codex Sinaiticus is one of the world's outstanding manuscripts. Together with Codex Vaticanus, it is one of the earliest extant Bibles, containing the oldest complete New Testament. This treasured codex is indispensable for understanding the earliest text of the Greek Bible, the transmission of its text, the establishment of the Christian canon, and the history of the book. Over 400 leaves survive and are held across four institutions.
http://www.codexsinaiticus.org/en/project/digitisation.aspx
Through testing, the decision was made to opt for a compromise colour. A light brown background was chosen that was close enough to the colour of the parchment to give a sense of its warmth, while reducing the show-through to a point where it rarely makes reading the page difficult.
Test image of a Codex Sinaiticus page on a black background
Test image of a Codex Sinaiticus page on a white background
http://www.codexsinaiticus.org/en/project/digitisation.aspx
Measuring for scanner set-up
Quality Control on scanned images
Make a copy of the original scanned image to work with
File Renaming
BookRestorer - derivation process
Black and white compressed image
MR. GLADSTONE ON FAIR T: AD'. AND RUNT JUCPuctios-jTHE nkxt I.IIiKt.AI. LRADKk?LORD?AKIINOTON's NEW ATTITUDE AND WHATMR. CHAMBERLAIN THINKS OF IT?MR.RI.AINK AND LOUIS KOSSUTH?AX ANARCHIST CARDINALBISMARCK AND BROWNING??ART AND LITERA?RY NOT I 8.fBT CABLR TO THIS TRIBUNE.|
Optical Character Recognition
http://chroniclingamerica.loc.gov/lccn/sn83030214/1888-01-01/ed-1/seq-1/%3Bwords%3D/
Newspaper digitisation
Microfiche
Risk analysis for digital objects
• Hard drive failure
• URL error – broken links
• Storage medium failure
• Loss of information/data
• Human error and memory
• Hackers
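Several of these risks (storage medium failure, data loss, human error) can be detected early with a routine fixity check: record a checksum for every file, then compare later. The sketch below assumes SHA-256 and an in-memory manifest; the function names are illustrative and not part of any tool mentioned in the slides.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's relative path to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def audit(folder, manifest):
    """Compare the folder with an earlier manifest."""
    current = build_manifest(folder)
    missing = sorted(set(manifest) - set(current))
    changed = sorted(
        name for name in manifest
        if name in current and current[name] != manifest[name]
    )
    return {"missing": missing, "changed": changed}


# Demonstration: one file is altered and one is lost after the manifest is made.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "page_001.tif").write_bytes(b"page one")
    (store / "page_002.tif").write_bytes(b"page two")
    manifest = build_manifest(store)
    (store / "page_001.tif").write_bytes(b"page one, damaged")
    (store / "page_002.tif").unlink()
    print(audit(store, manifest))
```

Running such an audit on every stored copy, including the one kept in another building, shows which copy is still good when one of them fails.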
www.fotosearch.com
Preservation
• Preservation strategies should enable subsequent users to work with digital resources in the same way that they would be able to continue to work with older, analogue materials.
• Can we afford to scan at a low resolution, or make other compromises in the digitisation life-cycle?
• budget for a possible migration strategy
• consider digital formats carefully
• metadata standards (technical and preservation)
• the organisation must be committed to the program
• follow best practices and international standards
• IT must adapt to long-term needs of digital preservation
• develop a technology infrastructure plan
Digital preservation
Rights = Object - instructs the user what it represents
Transform to JPEG for web display
TIFF image file
Intellectual entity (photo)
Preserve for interoperability, access and readability
Converted to digital object
Object:
•File size
•Date created
•File format
•Creating application
Rights:
•License agreement
•Exact permissions granted over preservation of the object
Agent:
•The role of the person undertaking the event (name/organization)
•Software name and version no.
•OS type
PREMIS MODEL
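The Object, Rights and Agent items listed above can be captured as a simple record per digital object. This is only a sketch: the class and field names mirror the slide's wording and are not the official PREMIS semantic unit names, and every value in the example is hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class ObjectInfo:
    identifier: str
    file_size: int            # bytes
    date_created: str         # ISO 8601 date
    file_format: str
    creating_application: str


@dataclass
class RightsInfo:
    license_agreement: str
    permissions: list         # exact permissions granted over preservation


@dataclass
class AgentInfo:
    role: str                 # role of the person undertaking the event
    name: str                 # person or organisation
    software: str             # software name and version no.
    os_type: str


@dataclass
class PreservationRecord:
    obj: ObjectInfo
    rights: RightsInfo
    agent: AgentInfo


def example_record():
    """Build a record with made-up values, for illustration only."""
    return PreservationRecord(
        obj=ObjectInfo("photo_0001", 46_404_624, "2009-06-16", "TIFF",
                       "ExampleScan 1.0"),
        rights=RightsInfo("example licence", ["migrate", "replicate"]),
        agent=AgentInfo("scanner operator", "Example Library",
                        "ExampleScan 1.0", "example OS"),
    )


# asdict() flattens the nested dataclasses into plain dictionaries,
# which can then be written out as JSON or XML.
print(asdict(example_record())["obj"]["file_format"])
```

Capturing these fields when the TIFF is created is far easier than reconstructing them after the JPEG derivative is already on the web.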
Ria Groenewald
Digitization Coordinator
Department of Library Services
University of Pretoria
Email: ria.groenewald@up.ac.za
Tel: 012 x 420-3792