TRANSCRIPT
Dr. Dinesh Katre
Chief Investigator, Centre of Excellence for Digital Preservation
Associate Director & HOD, Human-Centred Design & Computing Group
C-DAC, Pune, INDIA.
Digital Preservation Requirements of Electronic Educational Content (A Perspective of National Digital Preservation Programme)
Evolution and obsolescence of storage media
Punch cards
Punch tapes
Selectron tubes
Magnetic tapes
Audio cassette
Magnetic drum
8-inch floppy disk
5 1/4 inch floppy disk
3 1/2 inch floppy disk
Compact disk
Hard disk
DVD
Discontinued Tools, Closed Formats and Outdated Storage Devices
Computer hardware
• Continued changes in CPU speed, memory, processing, etc.
• New hardware introducing new peripheral connections
Operating System
• Upgraded versions or a new OS do not run the old software
• Continued transition from 8-bit to 64-bit operating systems and so on
Software
• Software upgrades do not support the former file formats
• Proprietary and closed source software
• Discontinuation of software, lack of support
File formats
• Proprietary and closed format specification
• Change in the format specification
• Discontinuation of the required software
• Data corruption
Storage devices and media
• Continued reduction in size and cost of storage devices
• Continued increase in storage capacity and performance
• Polycarbonate media like CD and DVD have uncertain lifetimes (Cerf, 2010)
• Obsolete storage media and unavailable reading devices, e.g. 5 1/4” or 3 1/2” floppies
• New approaches like storage virtualization
Physical threats
• Improper storage environment (temperature, humidity, dust, light)
• Overuse and handling of media
• Natural disaster
• Infrastructure failure
• Human error
• Sabotage
Tangible versus non-tangible
Electronic Theses & Dissertations
Definition
Long Term Digital Preservation (LTDP) is a secure and trustworthy mechanism to ingest, process, store, manage, protect, find, access, and interpret digital information such that the same information can be used at some arbitrary point in the future in spite of obsolescence of everything: hardware, software, processes, format, people, etc.
What does “Long Term” mean?
How can we increase the likelihood that data generated in 2010 or earlier will still be accessible in useful form in 2020 and later? (Cerf, 2010)
“Data should normally be preserved and accessible for not less than 10 years for any projects, and for projects of clinical or major social, environmental or heritage importance, the data should be retained for up to 20 years” (Research Councils UK 2008:6)
An archive is expected to provide permanent, or indefinite Long Term, preservation of digital information. (OAIS, 2009)
Benefits
Digital preservation provides benefits such as legal protection, knowledge heritage for future work and future generations, trend analysis, reuse, etc.
What is Digital Preservation?
1 Terabyte = 1,000,000,000,000 bytes = 10^12 B
1 Petabyte = 1,000,000,000,000,000 bytes = 10^15 B
1 Exabyte = 1,000,000,000,000,000,000 bytes = 10^18 B
1 Zettabyte = 1,000,000,000,000,000,000,000 bytes = 10^21 B
Digital Universe / Digital Dark Age
As per the “Digital Universe Study Report” by International Data Corporation (IDC), 2010:
• Estimated Size of Digital Universe in 2010: 1.2 million Petabytes or 1.2 Zettabytes
• Estimated Size of Digital Universe in 2020: 35 Zettabytes (as all major forms of media, including voice, TV, radio and print, will have completed the journey from analog to digital)
• Estimated Size of Unprotected Data needing protection in 2020: 18,000 Exabytes
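The unit definitions and the IDC figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming decimal (SI) units as listed on the slide; the `UNITS` table and the `convert` helper are illustrative names, not part of the IDC report.

```python
from fractions import Fraction

# Decimal (SI) storage units, in bytes, as listed on the slide.
UNITS = {
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}

def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units using exact arithmetic."""
    return Fraction(value) * UNITS[from_unit] / UNITS[to_unit]

# Figures quoted from the IDC Digital Universe study (2010).
universe_2010_zb = convert(1_200_000, "PB", "ZB")   # 1.2 million PB
unprotected_2020_zb = convert(18_000, "EB", "ZB")   # 18,000 EB
growth_factor = 35 / universe_2010_zb               # 35 ZB projected for 2020

print(float(universe_2010_zb))     # 2010 size in ZB
print(float(unprotected_2020_zb))  # unprotected data in ZB
print(float(growth_factor))        # growth between 2010 and 2020
```

Under these figures, 1.2 million Petabytes is indeed 1.2 Zettabytes, the unprotected data in 2020 amounts to 18 Zettabytes (about half of the projected 35), and the projected growth over the decade is roughly 29-fold.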
Sustainable Economics for a Digital Planet, Ensuring Long Term Access to Digital Information, February 2010 – A report prepared by Blue Ribbon Task Force
• What digital information should we preserve?
• Who will preserve it?
• Who will pay for it?
National Digital Information Infrastructure and Preservation Program (NDIIPP), USA – started in 2000
Network of Expertise in Long-term STOrage of Digital Resources (NESTOR), Germany – started in 2003
CASPAR - Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval, UK – started in 2006
International Trends of Digital Preservation
Planets: Preservation and Long-term Access through Networked Services
DigitalPreservationEurope (DPE)
Digital Curation Centre (DCC)
APARSEN: Network of Excellence
Alliance for Permanent Access (APA), EU project
Proceedings of Indo US Workshop
Panel recommendations for India’s National Digital Preservation Programme
Indo-US Workshop on International Trends in Digital Preservation March 23-24, 2009
Dr. A. K. Chakravarti, Chairman of Expert Group
Dr. Dinesh Katre, Principal Investigator (C-DAC)
Dr. Gautam Bose, National Informatics Centre
Ms. Renu Budhiraja, e-Governance Division, DIT
Mr. Sumnesh Joshi, Unique Identification Authority of India
Dr. Meena Gautam, National Archives of India
Ms. Debjani Nag, Controller of Certifying Authorities
Mr. Vakul Sharma, Supreme Court
Dr. Mukul Sinha, Expert Software Consultants
Dr. Kamalini Dutt, Doordarshan
Dr. Ramesh C. Gaur, Indira Gandhi National Centre for the Arts
Ms. Manju Mathur, All India Radio
Dr. S. B. Bhattacharyya, e-Health Consultant
Dr. Usha Munshi, Indian Institute of Public Administration
Dr. Vandana Sinha, American Institute of Indian Studies
Dr. A. Moorthy, Defence Scientific Information & Documentation Centre, DRDO
Mr. Ramchandra Budihal, WIPRO
Mr. Sanjeev Kumar Gupta, IBM
Mr. V.V.S. Nageswara Rao, National Remote Sensing Centre, Dept of Space
Mr. Zia Saquib, Centre for Development of Advanced Computing
Mr. Ashok Kapoor, Reserve Bank of India
Mr. Patrick Kishor, State Bank of India
Mr. Sukhdev Singh, National Informatics Centre
Mr. Vivek K. Srivastava, e-Governance Division, DIT
Ms. Seema Sridhar, Life Insurance Corporation of India
Mr. N. S. Mani, National Archives of India
Mr. V. H. Jadhav, National Film Archives of India
Dr. V. C. V. Rao, Centre for Development of Advanced Computing
Dr. Y.K. Somayajulu, National Institute of Oceanography
May 20-21, 2010
National Meet of Expert Group Members from 30 different stakeholder organizations held at C-DAC, Pune
National Study Report on Digital Preservation Requirements of India
Volume – I: Recommendations for National Digital Preservation Programme
Chapter 1. Need of Digital Preservation
Chapter 2. Scope of Digital Preservation in India
Chapter 3. Recommendations for NDPP
Volume – II: Position Papers by National Expert Group Members
23 Status Reports
Stakeholder Recommendations
Short Term (3 Yrs) and Long Term Actions (10 Yrs)
A) Conduct research and development in digital preservation to produce the required tools, technologies, guidelines and best practices.
B) Develop the pilot digital preservation repositories and provide help in nurturing the network of Trustworthy Digital Repositories (National Digital Preservation Infrastructure) as a long-term goal.
C) Define the digital preservation standards by involving the experts from stakeholder organizations, consolidate and disseminate the digital preservation best practices generated through various projects under National Digital Preservation Programme, being the nodal point for pan-India digital preservation initiatives.
D) Provide inputs to Department of Information Technology in the formation of national digital preservation policy and strategy by identifying and selecting the activities for the National Digital Preservation Programme.
E) Spread awareness about the potential threats and risks due to digital obsolescence and the digital preservation best practices.
Objectives of Centre of Excellence (C-DAC Pune)
Project Duration: April 2011 to March 2014
Offering appropriate Delivery Information Package to Designated Users
Creating a complete Archival Information Package
Preparing for Valid Submission Information Package
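The three package stages named above (submission, archival, delivery) can be illustrated with a small sketch. This is not the CoE-DP implementation: the class names, the `REQUIRED_FIELDS` list and the use of a SHA-256 checksum as fixity information are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical minimum metadata for a submission to count as valid.
REQUIRED_FIELDS = ("title", "creator", "format")

@dataclass
class SubmissionPackage:
    content: bytes
    metadata: dict

@dataclass
class ArchivalPackage:
    content: bytes
    metadata: dict
    sha256: str  # fixity information recorded at ingest

@dataclass
class DeliveryPackage:
    content: bytes
    metadata: dict

def validate_submission(sip):
    """Return the list of missing items; an empty list means the package is valid."""
    missing = [name for name in REQUIRED_FIELDS if not sip.metadata.get(name)]
    if not sip.content:
        missing.append("content")
    return missing

def ingest(sip):
    """Turn a valid submission package into an archival package with a checksum."""
    missing = validate_submission(sip)
    if missing:
        raise ValueError("invalid submission, missing: " + ", ".join(missing))
    digest = hashlib.sha256(sip.content).hexdigest()
    return ArchivalPackage(sip.content, dict(sip.metadata), digest)

def deliver(aip):
    """Verify fixity, then hand the content and metadata to a designated user."""
    if hashlib.sha256(aip.content).hexdigest() != aip.sha256:
        raise ValueError("fixity check failed: stored content has changed")
    return DeliveryPackage(aip.content, dict(aip.metadata))

sip = SubmissionPackage(
    b"thesis text",
    {"title": "Sample thesis", "creator": "A. Student", "format": "PDF/A-1a"},
)
package = deliver(ingest(sip))
print(package.metadata["title"])
```

The design point is that validation happens before ingest and the checksum is verified again before delivery, so a designated user only receives content that still matches what was archived.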
Digital Preservation of Government Archives
Digital Preservation of Born Digital Records
Preservation of Cultural Digital Content
Digital Repository Portals for Access to Designated Users
Authenticity Management and Digital Preservation Audit
Portal for National Digital Preservation Program (NDPP)
Domain Specific Digital Preservation and Archival Systems
Digital Preservation Research & Development
Digital Preservation Best Practices and Standards for e-Governance
Archival Science
Library Science
Digital Repository 01 Digital Repository 02 Digital Repository 03
Scope of CoE-DP Project
Motivations for Management of Electronic Theses and Dissertations
• Archive and preserve valuable scholarly / academic resources
• Make it accessible to designated users
• Enable further research advancements
• Knowledge enhancement, problem solving
• Collective growth
• Comparative quality
[Diagram: Reformatted Digital Information (InformationObject, PhysicalObject, DigitalObject, BitSequence) and Born Digital Information (InformationObject, DigitalObject, BitSequence)]
Deteriorating due to time, changing weather, handling
Less accessible
Digital Surrogate
Best capture of current condition
More accessible
Interlinking between both is needed for continuity
Analysis
Experimentation
Data Collection
Final Manuscript
Software
Artefacts
Data Formats
Raw data
Dissemination
Final Manuscript Dissemination
ISO 19005-1 PDF/A-1a
Key characteristics
1. Preserves the visual appearance of document
2. Includes visible contents like text, raster images, vector graphics, fonts, color information
3. Documents logical structure
4. 100% self-contained
5. Long term reproducibility - internationally accepted as a Standard for long-term electronic archiving
Prohibitions
6. Information from direct or indirect external sources
7. Transparency, sound and movie actions
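A repository can at least check which PDF/A level a submitted manuscript claims to follow. PDF/A files declare this in their XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below is a heuristic only: it assumes the XMP packet is stored uncompressed so that a byte scan can find it, the function names are illustrative, and it reads the declared claim without verifying any of the characteristics or prohibitions listed above. Actual conformance needs a dedicated PDF/A validator.

```python
import re

def _declared(data, name):
    """Return a pdfaid property value, written as an XML element or attribute."""
    patterns = (
        rb"<pdfaid:" + name + rb">\s*([^<\s]+)\s*</pdfaid:" + name + rb">",
        rb"pdfaid:" + name + rb"\s*=\s*[\"']([^\"']+)[\"']",
    )
    for pattern in patterns:
        match = re.search(pattern, data)
        if match:
            return match.group(1).decode("ascii", errors="replace")
    return None

def declared_pdfa_level(data):
    """Return the declared level such as 'PDF/A-1a', or None if nothing is declared."""
    part = _declared(data, b"part")
    if part is None:
        return None
    conformance = _declared(data, b"conformance") or ""
    return f"PDF/A-{part}{conformance.lower()}"

# Hand-written metadata fragment used as sample input (not a complete PDF file).
sample = (
    b'<rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">'
    b"<pdfaid:part>1</pdfaid:part>"
    b"<pdfaid:conformance>A</pdfaid:conformance>"
    b"</rdf:Description>"
)
print(declared_pdfa_level(sample))
```

For a real file, the same function could be applied to the bytes read from disk; a result of `None` means only that no declaration was found by this scan.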
Databases
Statistics
Graphs
Images
Video
Audio
Hyperlinks to URLs
3D Models
Documents
Algorithms
Base Programs
2. DEFINITION AND CONTENT OF THESES
2.1 Definition
2.1.1 The Research Awards Rules (sr. 1.4(1)) define a thesis as 'original written work', which, for the PhD may have been published (thesis by publication), or may be comprised of video recordings, film or other works of visual or sonic arts, computer software, digital material or other non-written material.
2.1.2 'Written work' for a thesis includes video recordings, film or other works of visual or sonic arts submitted by a student for examination.
2.3.2 A thesis by publication may also include video recordings, film or other works of visual or sonic arts, computer software, digital material or other non-written material for which approval has been given for submission in alternative format.
University Repository of Electronic Educational Content
Research Resources: E-theses and dissertations, research data, paper publications
Learning Resources: Lecture notes, PowerPoint slides, audio / video / animations, e-learning content
State Level Repository of Electronic Educational Content
Univ 1
Univ 2
Univ 3
Univ 4
Univ 5
Engineering Degrees Awarded (2008) (R Banerjee, Engineering Education in India, 2008)
Ph.D.s: 1,000
Masters: 20,000
Bachelor: 230,000
Annual projection of Science Ph.D.s in India in 2020: 20,000 (Education: The PhD factory, Nature, 20 April 2011)
Physical setup
• Location, building architecture
• Server hardware / racks / KVM
• Storage hardware (SAN, NAS)
• Backup device
• Network Infrastructure (Switches, routers, UTM)
• Raised flooring
• Connectivity and bandwidth
• Disaster recovery site
• Physical Security
• Biometric Security
• Network Security (Firewall)
• Motion detection for lighting
• CCTV digital recorders
• Humidity control
• Cooling
• Fire suppression
• Smoke detection
• Redundant power supply
Trusted Digital Repository of Electronic Educational Content
Electronic Educational Content
Key concerns in ETD and e-content preservation
Raw research data
Use of Indian languages for thesis writing
Specifications for learning objects
Version control
De-duplication
Copyright protection
Authentication
Type of data and file formats
Define the standard practices
Digital preservation strategy
Data and value sharing rules
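Of the concerns listed above, de-duplication is one that lends itself to a short illustration. A common approach, shown here as a minimal sketch and not as the method used by the programme, is to group deposited files by a cryptographic checksum of their content, so that byte-identical copies are detected regardless of file name. The function names and sample file names are hypothetical.

```python
import hashlib
from collections import defaultdict

def fingerprint(content):
    """Return the SHA-256 checksum of a file's content as a hex string."""
    return hashlib.sha256(content).hexdigest()

def find_duplicates(items):
    """Group item names by content checksum and keep groups with more than one name."""
    groups = defaultdict(list)
    for name, content in items.items():
        groups[fingerprint(content)].append(name)
    return {digest: sorted(names) for digest, names in groups.items() if len(names) > 1}

# Hypothetical deposits: two names carry identical bytes.
deposits = {
    "thesis_final.pdf": b"chapter one ...",
    "thesis_copy.pdf": b"chapter one ...",
    "lecture_notes.pdf": b"week one ...",
}
for digest, names in find_duplicates(deposits).items():
    print(digest[:12], names)
```

This only catches exact copies. Two versions of a thesis that differ by a single byte produce different checksums, which is why version control appears as a separate concern in the list.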
India’s National Digital Preservation Programme
www.ndpp.in
Thank You