METS at UC Berkeley: Generating METS Objects
Background
• Kinds of materials:
– Primarily imaged content & TEI-encoded content
  • Archival materials: manuscripts and pictorial collections
  • Oral histories
• Kinds of metadata:
– Structural metadata: physical structure
– Descriptive metadata
– Basic technical metadata about digital files and how they were produced
Tools For Producing METS Objects
• GenDB
– Gathers structural, descriptive and technical metadata
• GenX
– Generates METS objects from GenDB
GenDB
• Consists of:
– Relational database (currently SQL Server)
– Locally developed software for gathering metadata and facilitating digital processing
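GenDB Database Structure: Structural Metadata
[Diagram: each object is a hierarchy of divs; every div is a row in the Structural Md Table and records its parent div.]
Object 1
– Div 1 (root)
– Div 2 (parent = div 1)
– Div 3 (parent = div 1)
Object 2
– Div 1 (root)
– Div 2 (parent = div 1)
– Div 3 (parent = div 2)
– Div 4 (parent = div 2)
Structural Md Table
– Object 1: Div 1, Div 2, Div 3
– Object 2: Div 1, Div 2, Div 3, Div 4
– …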
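A minimal sketch of the parent-pointer layout shown in the diagram, assuming one row per div. The table and column names (structural_md, object_id, div_id, parent_id) are hypothetical, SQLite stands in for SQL Server, and the parent links for Object 2 follow one reading of the diagram.

```python
import sqlite3

# Hypothetical schema: GenDB's real table and column names are not given in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE structural_md (
           object_id INTEGER NOT NULL,
           div_id    INTEGER NOT NULL,
           parent_id INTEGER,            -- NULL marks the root div
           PRIMARY KEY (object_id, div_id)
       )"""
)
rows = [
    (1, 1, None), (1, 2, 1), (1, 3, 1),             # Object 1
    (2, 1, None), (2, 2, 1), (2, 3, 2), (2, 4, 2),  # Object 2
]
conn.executemany("INSERT INTO structural_md VALUES (?, ?, ?)", rows)


def build_tree(conn, object_id):
    """Return (root_div_id, {parent_div_id: [child_div_id, ...]}) for one object."""
    children = {}
    root = None
    query = (
        "SELECT div_id, parent_id FROM structural_md "
        "WHERE object_id = ? ORDER BY div_id"
    )
    for div_id, parent_id in conn.execute(query, (object_id,)):
        if parent_id is None:
            root = div_id
        else:
            children.setdefault(parent_id, []).append(div_id)
    return root, children


def outline(root, children, depth=0):
    """Render the hierarchy as indented lines, one per div."""
    lines = ["  " * depth + f"Div {root}"]
    for child in children.get(root, []):
        lines.extend(outline(child, children, depth + 1))
    return lines


root, children = build_tree(conn, 2)
print("\n".join(outline(root, children)))
```

Storing only the parent reference keeps each div a single row, so objects of any depth fit in one table.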
GenDB Database Structure: Descriptive Metadata
[Diagram: each div in the Structural Md Table (Object 1: Div 1–3; Object 2: Div 1–4) has its own Core Desc Md record. Names (Name 1–3) are held in a Name Table and notes (Note 1–3) in Note Tables.]
GenDB Database Structure: Content File/Technical Md
[Diagram: the divs of Object 1 (Div 1–3) in the Structural Md Table, together with content files in a Master Image Table (Mstr 1–2) and a Derivative Image Table (Drv 1–4). Each master and derivative image has its own Technical Md.]
Populating the Database Tables
• Web interface: manual input of structural and descriptive metadata
• Digitization Management modules
– Generate work orders to guide digitization process
– Import content file information and technical metadata coming out of digitization process
• Batch loader: batch input based on TEI encodings, legacy metadata
Web Interface: WebGenDB
[Diagram: Web Interface, Java Servlet, Java Server, SQL Server Database and XML Config Files; connections are labeled rmi and jdbc.]
Digitization Management Modules
[Diagram: Web Interface, Java Servlet, Java Server and SQL Server Database, with Imaging/Transcription Work Orders, the Vendor, and Technical MD Spreadsheets.]
Batch Loader
[Diagram: Web Interface, Java Servlet, Java Server and SQL Server Database, with the Java Batch Loader, XML Batch Load File, TEI Docs and XSLT.]
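A rough sketch of the kind of flattening a TEI-based batch load implies: nested TEI divisions become rows with parent references. The slides show XSLT and a Java batch loader doing this work; Python stands in here. The sample document, the function name tei_to_rows and the row layout are all assumptions, and the actual XML batch load file format is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Tiny TEI-like sample for illustration; real TEI documents are far richer.
TEI_SAMPLE = """
<TEI.2>
  <text>
    <body>
      <div1 type="volume"><head>Oral History Transcript</head>
        <div2 type="chapter"><head>Early Years</head></div2>
        <div2 type="chapter"><head>Career</head></div2>
      </div1>
    </body>
  </text>
</TEI.2>
"""


def tei_to_rows(tei_xml):
    """Flatten nested TEI divisions into (div_id, parent_id, label) rows."""
    body = ET.fromstring(tei_xml).find("./text/body")
    rows = []

    def walk(element, parent_id):
        for child in element:
            # Only div1, div2, ... elements become structural rows.
            if not child.tag.startswith("div"):
                continue
            div_id = len(rows) + 1
            label = child.findtext("head", default="").strip()
            rows.append((div_id, parent_id, label))
            walk(child, div_id)

    walk(body, None)
    return rows


for row in tei_to_rows(TEI_SAMPLE):
    print(row)
```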
WebGenDB
The concepts that drove the design
• Shielding user from METS complexity
• Highly configurable
• Unicode support
• Access driven by login privileges
• Use of Open Source software and components
• Distributed approach
XML Configuration Files
• Three levels
– Elements common to all projects
– Elements common to all screens in a project
– Elements specific to a screen in a project
• Define fields common to all projects
• Define fields used in a specific project
• Define screens by project & object type
Relation among XML files
[Diagram: AlProjects.xml, the project files Proj1.xml and Proj2.xml, and an ObjectType1.xml and ObjectType2.xml for each project.]
Project XML file example
<ObjectType>
  <name>workorder</name>
  <fileLocation> /data/_w/GenDB/WEB-INF/classes/edu/berkeley/library/propertyFiles/CalCultureWorkOrderScreensFile.xml</fileLocation>
</ObjectType>
<Field> <name>Image</name><type>checkbox</type><label>Image </label><size>1</size> </Field>
<Field> <name>Text</name><type>checkbox</type><label>Text </label><size>1</size> </Field>
<Field> <name>Title</name><type>text</type><label>Title </label><size>60</size> </Field>
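A minimal sketch of how the <Field> definitions in the project file example could drive a data-entry screen. WebGenDB itself is built on Java servlets; Python is used here only for illustration. The <Fields> wrapper element and the functions load_fields and render_input are assumptions, not part of WebGenDB.

```python
import xml.etree.ElementTree as ET

# The three <Field> definitions from the example, wrapped in a hypothetical
# <Fields> root so that the fragment parses as one document.
FIELDS_XML = """
<Fields>
  <Field> <name>Image</name><type>checkbox</type><label>Image </label><size>1</size> </Field>
  <Field> <name>Text</name><type>checkbox</type><label>Text </label><size>1</size> </Field>
  <Field> <name>Title</name><type>text</type><label>Title </label><size>60</size> </Field>
</Fields>
"""


def load_fields(xml_text):
    """Parse <Field> elements into dictionaries keyed by child tag name."""
    fields = []
    for field in ET.fromstring(xml_text).iter("Field"):
        fields.append({
            "name": field.findtext("name").strip(),
            "type": field.findtext("type").strip(),
            "label": field.findtext("label").strip(),
            "size": int(field.findtext("size")),
        })
    return fields


def render_input(field):
    """Render one field as an HTML form control (illustrative only)."""
    if field["type"] == "checkbox":
        control = f'<input type="checkbox" name="{field["name"]}">'
    else:
        control = f'<input type="text" name="{field["name"]}" size="{field["size"]}">'
    return f'<label>{field["label"]} {control}</label>'


for field in load_fields(FIELDS_XML):
    print(render_input(field))
```

Keeping field definitions in XML means a new project or screen can be set up by editing configuration rather than code, which matches the "highly configurable" design goal.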
Software used
• MSSQL running on NT
• Tomcat 4.1.2 implementing servlets 2.3
• Jsdk 1.4
• Xalan 2.4
• Xerces 1.0.3
• FOP 0.12.1
• JDOM beta 8
• Opta 2000
Relationship of GenDB to METS
• Metadata not directly stored in METS, MODS or MIX schema formats.
– Much of the database structure was developed before these standards emerged
– Database structure and content adjusted to be compatible with all these formats
GenX: From GenDB to METS
• Allows Digital Publishing Group staff to select the objects in the GenDB database that are ready for export and to export them as METS objects.
GenX Output
• METS output corresponding to version 1.3
• Descriptive metadata exported to METS descMD in MODS 2.0 format
• Technical Metadata exported to METS techMD in MIX format
• Planned:
– Text technical md to METS descMD in NYU TextMD
– Rights to METS rightsMD in ODRL subset
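A hedged sketch of the kind of export described above: div rows with parent references become a nested METS structMap, with one descriptive metadata section per div. This is not GenX's code. The function build_mets, the row layout and the ID scheme are assumptions, and the MODS and MIX payloads are left out.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(rows):
    """Build a METS skeleton from (div_id, parent_id, label) rows.

    parent_id is None for the root div; parents must precede their children.
    """
    mets = ET.Element(q("mets"))

    # One dmdSec per div; the MODS record itself is omitted in this sketch.
    for div_id, _parent_id, _label in rows:
        dmd = ET.SubElement(mets, q("dmdSec"), ID=f"DMD{div_id}")
        wrap = ET.SubElement(dmd, q("mdWrap"), MDTYPE="MODS")
        ET.SubElement(wrap, q("xmlData"))

    # Nest the divs under the structMap by following the parent references.
    struct_map = ET.SubElement(mets, q("structMap"), TYPE="physical")
    elements = {}
    for div_id, parent_id, label in rows:
        parent = struct_map if parent_id is None else elements[parent_id]
        elements[div_id] = ET.SubElement(
            parent, q("div"), LABEL=label, DMDID=f"DMD{div_id}"
        )
    return mets


sample_rows = [(1, None, "Album"), (2, 1, "Page 1"), (3, 1, "Page 2")]
print(ET.tostring(build_mets(sample_rows), encoding="unicode"))
```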
Links
• GenDB Web Interface Demo
– http://sunsite2.berkeley.edu/GenD
– login: demo
– password: demo
• Developers:
– [email protected]
– [email protected]
– [email protected]