Nikola Tesla Museum Clipping Library

Saša Malkov

Nenad Mitić

Žarko Mijajlović

3rd SEEDI Int.Conf.

Cetinje, Montenegro

14. September 2007.

Clipping Library

The Nikola Tesla Museum possesses a rich collection of newspaper clippings on the life and work of Nikola Tesla

The clippings were collected by Nikola Tesla himself, with the support of his personal secretary

Part of the library is organized into books, while many clippings remain unorganized

Digital Library Prototype

The Digitization Group at the Faculty of Mathematics undertook the development of a digital clipping library prototype

Primary goals:
– Analysis of the problem
– Identification of appropriate solutions

Problems

Significant variations in the sources and quality of the materials

Organization and modeling of the data and metadata

Data access

Differences in sources and preservation level

Different digitization techniques give different results, depending on the paper type, the print type, and the preservation level

Different target formats are considered:
– Digital image formats
– PDF
– DjVu format

Data organization

File systems are not appropriate:
– Complex data and metadata access
– Limited search capabilities

Databases allow:
– Simpler access
– Advanced searching
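To make the contrast with a file system concrete, the following is a minimal sketch of how clippings, their metadata, and the scanned content might sit together in a relational database. It is an illustration only: the table and column names (`clipping`, `keyword`, `scan`, and so on) are assumptions, not the prototype's actual model, and Python's built-in `sqlite3` module stands in for the DBMS.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
# SQLite is used here only as a stand-in for a full DBMS.
SCHEMA = """
CREATE TABLE IF NOT EXISTS clipping (
    id          INTEGER PRIMARY KEY,
    caption     TEXT NOT NULL,
    publication TEXT,
    language    TEXT,
    published   TEXT,   -- ISO 8601 date string
    scan        BLOB,   -- bytes of the digitized image or document
    content     TEXT    -- extracted and manually corrected text
);
CREATE TABLE IF NOT EXISTS keyword (
    clipping_id INTEGER NOT NULL REFERENCES clipping(id),
    word        TEXT NOT NULL
);
"""


def open_library(path=":memory:"):
    """Open a connection and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_clipping(conn, caption, publication, language, published,
                 scan, content, keywords=()):
    """Insert one clipping with its metadata, scan bytes, and keywords."""
    cur = conn.execute(
        "INSERT INTO clipping "
        "(caption, publication, language, published, scan, content) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (caption, publication, language, published, scan, content),
    )
    clipping_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO keyword (clipping_id, word) VALUES (?, ?)",
        [(clipping_id, word) for word in keywords],
    )
    conn.commit()
    return clipping_id
```

Keeping the scan and its metadata in one row means a single query can return both, which is the "simpler access" the slide refers to; keywords live in their own table so that one clipping can carry any number of them.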

Automatic text extraction

Primary problems are:
– Different languages
– Wide variety and heavy stylization of the fonts used in the corresponding time period
– Very low material quality, because of aging

Different OCR systems were evaluated:
– No OCR software was satisfactory, primarily because of the low readability of the material
– A significant amount of manual correction is necessary
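One common way to compare OCR systems is to measure how far their output is from a manually corrected reference text. The slides do not say which metric was used in the evaluation; the sketch below assumes character error rate (edit distance divided by reference length) purely as an illustration, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text, reference_text):
    """Edit distance between OCR output and reference, per reference character."""
    if not reference_text:
        raise ValueError("reference text must not be empty")
    return edit_distance(ocr_text, reference_text) / len(reference_text)
```

A rate computed this way also gives a rough estimate of the manual correction effort: it approximates the share of characters an editor would have to touch.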

Searching

Multiple-criteria searching is essential, including searching by:
– Metadata
  – Caption
  – Keywords
  – Publication
  – Language
  – Period
– The clipping content
  – Manual corrections of the text are essential
  – Efficiency requires the application of indexing methods
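Multiple-criteria searching of this kind can be sketched as a query builder that adds one condition per supplied criterion. This is an illustration under stated assumptions: it presumes a hypothetical `clipping` table (columns `id`, `caption`, `publication`, `language`, `published`, `content`) and a `keyword` table (`clipping_id`, `word`), none of which come from the prototype, and it uses Python's `sqlite3` in place of the actual DBMS.

```python
import sqlite3


def build_search(caption=None, keyword=None, publication=None, language=None,
                 date_from=None, date_to=None, text=None):
    """Return (sql, params) combining only the criteria that were supplied."""
    where, params = [], []
    if caption:
        where.append("c.caption LIKE ?")
        params.append(f"%{caption}%")
    if keyword:
        where.append("EXISTS (SELECT 1 FROM keyword k "
                     "WHERE k.clipping_id = c.id AND k.word = ?)")
        params.append(keyword)
    if publication:
        where.append("c.publication = ?")
        params.append(publication)
    if language:
        where.append("c.language = ?")
        params.append(language)
    if date_from:
        where.append("c.published >= ?")  # ISO dates compare correctly as text
        params.append(date_from)
    if date_to:
        where.append("c.published <= ?")
        params.append(date_to)
    if text:
        # Plain substring match: a linear scan, with no index behind it.
        where.append("c.content LIKE ?")
        params.append(f"%{text}%")

    sql = "SELECT c.id, c.caption FROM clipping c"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY c.published, c.id"
    return sql, params


def search(conn, **criteria):
    """Run the combined query; all values are passed as bound parameters."""
    sql, params = build_search(**criteria)
    return conn.execute(sql, params).fetchall()
```

A call such as `search(conn, language="en", date_from="1890-01-01", text="wireless")` combines three criteria with `AND`. The `LIKE` match on `content` scans every row, which is why the slide calls for indexing methods: a real deployment would replace that condition with the full-text index facility of the chosen DBMS.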

The solution – DBMS

The prototype is based on the IBM DB2 DBMS:
– Advanced SQL implementation
– Efficient handling of binary content
– High concurrency level
– High reliability
– Good previous experience
– Free licensing terms

The solution – User interface

The Web application concept is:
– Rich in content and visual presentation
– Customizable
– Portable
– Relatively simple to implement

The solution – Application

The library prototype is implemented in the functional programming language Wafl:
– Wafl is designed for automatic document generation and particularly customized for Web development
– Features very simple and efficient database access
