term paper - devansh mathur

7/31/2019 Term Paper - Devansh Mathur


CERTIFICATE

This is to certify that Devansh Mathur, a student of B. Tech. in Computer Science
& Technology, has carried out the work presented in the term paper entitled
"Semantic Digital Library" as a part of the Second Year programme of Bachelor of
Technology in Computer Science & Technology at Amity School of Engineering and
Technology, Amity University Rajasthan, under my supervision.

DATE: Mr Sanjay Jain

Faculty Guide

ASET, AUR



Content

Abstract
List of Figures
1. Introduction
2. The Use of Semantic Web in Digital Libraries
   2.1. Resource Description Framework
   2.2. Web Ontology Language
3. Digital Libraries: Earlier Times
4. Digital Libraries Evolution: Content Sharing
5. The Digital Library Universe
   5.1. A Three-tier Framework
   5.2. Main Concept
   5.3. The Main Roles of Actors
6. Advantages and Disadvantages of Libraries: From Past Till Now
   6.1. Of Past Digital Libraries
   6.2. Of DL to SDL
   6.3. Of SDL to SSDL
   6.4. SSDL and the Future
7. Future Prospects
8. Existing Semantic Digital Library Systems
   8.1. SMILE
   8.2. JeromeDL
   8.3. BRICKS
9. Conclusion
10. References



Introduction

Typical digital libraries focus on categorizing and cataloguing resources. Information
retrieval in such libraries relies primarily on text search engines and free browsing.
This approach has proved useful; however, it suffers from the ambiguity of natural
language, neglects the importance of metadata, and does not engage users in the process
of sharing knowledge. Simple searching still returns too many results, which then have to
be filtered somehow. Page-ranking algorithms help with websites but cannot easily be
applied to books or e-learning objects. On the other hand, a look at a friend's bookshelf
can give us a much clearer view of what is worth reading in a particular domain than
digging through a thousand books or websites published this month. The semantic digital
library is an attempt to restore the collaborative approach to sharing knowledge.

The term "Digital Library" is currently used to refer to systems that are heterogeneous
in scope and offer very different functionality. These systems range from digital object
and metadata repositories, reference-linking systems, archives, and content
administration systems (mainly developed by industry) to complex systems that integrate
advanced digital library services (mainly developed in research environments). This
overloading of the term is a consequence of the fact that there is as yet no agreement on
what digital libraries are and what functionality is associated with them. The result is
a lack of interoperability and reuse of both content and technologies. This document
attempts to put some order in the field for the benefit of its future advancement.

Libraries, together with archives, have always been the primary institutions delegated
to collect, manage, preserve and diffuse human knowledge and culture. When advances in
computer science made it possible to work with digital rather than printed
representations of the documents that capture human knowledge and culture, libraries
became particularly involved in exploiting the potential of the digital revolution. Thus
"digital libraries" soon became the term used to indicate the digital counterpart of
traditional libraries. However, digital library systems have evolved greatly since their
early appearance. Today they have become complex networked systems able to support
communication and collaboration among different communities distributed worldwide,
dealing with digital objects that comprise not only the digital counterpart of printed
documents but also images, video, programs and any other kind of multimedia object a
community may define as appropriate to its working and communication needs. The
evolution of digital libraries (DLs) has not been linear, because it has drawn on
contributions from many disciplines. This has created several conceptions of what a DL
is, each one influenced by the perspective of the primary discipline of its conceiver(s)
or by the concrete needs it was designed to satisfy.

As a natural consequence, the history of digital libraries, which is now approximately
twenty years long, is the history of a variety of different types of information systems
that have been called digital libraries. These systems are very heterogeneous in scope
and functionality, and their evolution does not follow a single path. In particular, a
change has not only meant that a better-quality system was conceived to supersede the
preceding ones; it has also meant that a new conception of digital libraries was born in
response to newly raised needs. As will be seen, most of the systems dealt with in this
history are still living in their original conception, even though not in their original
technological solutions.

The rest of this chapter goes back over this history, giving an account of the past and
present understanding of these kinds of systems and of on-going work in the area. The
chapter concludes with a vision of the impact that new DLs are expected to have in the
near future.



The Use of Semantic Web in Digital Libraries

The Semantic Web is a concept that was first introduced by Tim Berners-Lee. Berners-Lee,
Hendler, and Lassila's (2001) vision of the Semantic Web is that it "will bring structure
to the meaningful content of Web pages, creating an environment where software agents
roaming from page to page can readily carry out sophisticated tasks for users." This
vision is supported by the combination of metadata schemes, the use of the Resource
Description Framework (RDF), and the creation of ontologies. The ultimate goal of the
Semantic Web is to allow machines to understand the meaning of digital objects, rather
than just the keywords used to describe them. This will revolutionize the search and
retrieval of digital objects, a key function of digital libraries.

One of the main benefits of the Semantic Web is that it operates within the current Web
environment to add logic and meaning to digital objects. In their review of an experiment
to enhance the semantic interoperability of two digital collections, Angjeli and Isaac
(2009) state: "One of the key ideas of the semantic web approach is to open and
interconnect the meaning capitalized within existing metadata." This idea is accomplished
through a combination of XML, the Resource Description Framework (RDF), and the creation
of ontologies. XML, or eXtensible Markup Language, is a markup language that allows users
to create their own tags to annotate Web pages.

Many digital objects contain metadata, that is, information held within the document
that describes the document itself, such as its author, publisher and date of creation.
This metadata is often written in XML, which gives digital objects a similar framework
regardless of which metadata scheme is used within the object. Although XML can define
structure within a digital object, it does not convey meaning to computers.

Figure 1: Semantic Web stack

Resource Description Framework (RDF)

The next step in the Semantic Web process is to add meaning to the XML data found within
digital objects. This is accomplished through the creation of RDF models and schemas.

RDF models are composed of three parts with the basic form subject-predicate-object, and
are often referred to as triples. The subject is a resource, which can essentially be any
digital object. The predicate is a property, such as "has author", which signifies that
the resource has an author. The object is the value of that property, which can be
another resource or a literal value; in our example this could be "Jones". The complete
RDF statement would be "Object A has author Jones". This tells the computer that the
object was authored by Jones.



Each part of the triple can be assigned a unique Uniform Resource Identifier (URI), which
directs the computer to that specific resource. As pieces of information are added to
this data model, it can be written in XML and transmitted across applications. The RDF
schema is the set of rules that defines how an RDF document must be written; in practice
it declares the classes and properties that the document may use.
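
The "Object A has author Jones" statement can be sketched in a few lines of Python. This
is only an illustration of the subject-predicate-object form, not the output of any RDF
toolkit: the `http://example.org/library/` namespace, the `Triple` type and the
`to_ntriples` helper are all hypothetical names introduced here.

```python
from typing import NamedTuple

# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/library/"


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Write a triple as one line in the style of N-Triples.

    Values that look like URIs go in angle brackets; anything else is
    written as a plain string literal.
    """
    def term(value: str) -> str:
        if value.startswith(("http://", "https://")):
            return "<" + value + ">"
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    return " ".join([term(t.subject), term(t.predicate), term(t.obj), "."])


# "Object A has author Jones", with every part identified by a URI.
statement = Triple(EX + "objectA", EX + "hasAuthor", EX + "person/Jones")
print(to_ntriples(statement))
```

Because each part is a URI, two systems that have never exchanged data can still tell
that they are talking about the same object and the same property.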

The benefits of using RDF are considerable. As McCathieNevile and Mendez state, "If you
give an RDF processor two sets of statements about the same thing (a thing identified by
the same URI), it just treats them as one collection of statements." This means that a
web of meaning can be created about a specific object, and that this web of meaning can
have multiple contributors.
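
The merging behaviour described in the quotation can be illustrated with ordinary Python
sets. This is a minimal sketch under stated assumptions: statements are modelled as
plain tuples, and the URIs, property names and the `describe` helper are hypothetical;
a real RDF processor does considerably more.

```python
# Hypothetical URI; each statement is a (subject, predicate, object) tuple.
BOOK = "http://example.org/library/objectA"

# Statements published by a library catalogue.
catalogue = {
    (BOOK, "http://example.org/terms/hasAuthor", "Jones"),
    (BOOK, "http://example.org/terms/hasTitle", "An Example Title"),
}

# Statements contributed independently by a community of readers.
community = {
    (BOOK, "http://example.org/terms/hasAuthor", "Jones"),  # same statement again
    (BOOK, "http://example.org/terms/hasSubject", "digital libraries"),
}

# Merging is a set union: identical statements collapse into one.
merged = catalogue | community


def describe(graph, subject):
    """Collect every property/value pair known about one subject URI."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, value in describe(merged, BOOK):
    print(predicate, "->", value)
```

The two contributors never coordinated, yet their statements combine into a single
description because both used the same URI for the object.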

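[Figure: RDF Schema example. The classes Student and Researcher are each linked to the
class Person by subClassOf; the instances Jeen and Frank are each linked to a class by
type; the property hasSupervisor is shown with a domain and a range.]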
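
The kind of reasoning that such a class hierarchy enables can be sketched as follows.
The sketch assumes, purely for illustration, that Jeen is typed as a Student and Frank as
a Researcher; the source figure does not state this unambiguously. Only one inference
rule is implemented (type inheritance along subClassOf), and the domain and range of
hasSupervisor are left out.

```python
SUBCLASS = "subClassOf"
TYPE = "type"

triples = {
    ("Student", SUBCLASS, "Person"),
    ("Researcher", SUBCLASS, "Person"),
    ("Jeen", TYPE, "Student"),      # assumed assignment, for illustration only
    ("Frank", TYPE, "Researcher"),  # assumed assignment, for illustration only
}


def infer_types(graph):
    """Repeat one rule until nothing new appears:
    if x has type C and C is a subclass of D, then x also has type D."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != TYPE:
                continue
            for c, q, d in list(inferred):
                if q == SUBCLASS and c == o and (s, TYPE, d) not in inferred:
                    inferred.add((s, TYPE, d))
                    changed = True
    return inferred


closure = infer_types(triples)
print(("Jeen", TYPE, "Person") in closure)
```

Nobody stated that Jeen is a Person; the fact follows from the schema, which is the
sense in which a schema adds meaning beyond the raw statements.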



Web Ontology Language

The final piece of the Semantic Web is the use of ontologies. Ontologies "serve as the
backbone of the Semantic Web by providing vocabularies and formal conceptualization of a
given domain to facilitate information sharing and exchange." Ontologies are essentially
sets of logical rules that define the relationships between sets of concepts. They can
also be used to define the concepts and the XML tags used to designate them within Web
pages. Ontologies are usually domain specific, and attempt to exhaustively define
concepts and map their relationships. The goal of an ontology is to "reinforce the formal
logic and tighten the meaning to the point where it can no longer escape correct computer
interpretation." The combination of XML, RDF, and ontologies makes it possible for
computer programs and agents to interpret digital objects based on their meaning and
their relationship to other concepts.

The benefits of using these technologies in digital libraries are many. Furthermore,
digital librarians already possess a skill set that enables them to work with Semantic
Web technology: they already work with XML to create metadata, and the creation of an
ontology is akin to the development of a thesaurus.

Many digital library projects already make use of these technologies. One of the greatest
benefits of the Semantic Web is the ability to relate resources by their meaning rather
than their form. This allows concepts from different vocabularies to be linked, freeing
users from the confines of any particular vocabulary.



The best example of the use of ontologies to achieve this goal is the Unified Medical
Language System (UMLS) developed by the National Library of Medicine. The UMLS comprises
three parts: a Metathesaurus, which contains over one million biomedical concepts from
over 100 source vocabularies; a Semantic Network, which defines 133 broad categories and
fifty-four relationships between categories for labelling the biomedical domain; and the
SPECIALIST Lexicon and Lexical Tools, which provide lexical information and programs for
language processing (National Library of Medicine, 2010).

The UMLS is distributed with software and tools that enable a user or organization to use
it for research in the biomedical field. The UMLS is essentially a comprehensive ontology
of the biomedical field, and can be used by any digital library that serves users with an
interest in that field. It is a good example of the potential for interoperability
created by using Semantic Web technologies. The National Library of Medicine is arguably
the most respected source of biomedical resources, and its UMLS can be used by any single
user or digital library to enhance the quality of their searching or to build a search
and retrieval interface.

Another example of a collaborative digital library project using Semantic Web technology
is Europeana, an interface for searching the digital resources of Europe's museums,
libraries, archives and audio-visual collections (Europeana). Europeana allows users to
explore these items and share them using social media outlets. Europeana has also
developed a Semantic Searching Prototype, which currently contains items from the
Rijksmuseum, the Louvre, and the Netherlands Institute for Art History. The Semantic
Search interface is a simple search bar that allows users to search for concepts
represented by digital objects from these institutions.



Performing a search for "love" returns 504 items that are listed under the following
headings: works showing concept, works titled, works showing a more specific concept,
works showing a related concept, works from type a related concept, and other. This
breakdown gives the user a unique perspective on how a Semantic Web search differs from a
traditional search. When the user clicks on a resource, a window pops up with information
about the resource, including date, material, relation, subject, title, and type. These
fields correspond to those of many metadata schemes. The user can also follow a link to
the page at the original institution that contains the item, and to a Europeana-created
page for that item. The Europeana page allows users to download RDF files containing
specific data about the item. The Europeana Semantic Search Prototype shows how Semantic
Web technology can be used to integrate resources from multiple sources, to allow users
to access these resources, and to facilitate the sharing of Semantic Web data for each
resource.

There are also many Semantic Web resources that are not directly connected with a
traditional digital library but could potentially be used in digital library interfaces.
One such resource is the Friend of a Friend project, or FOAF. This project provides users
with a way to create an RDF file that identifies the user as a unique individual. It is
similar to the authority files created in traditional libraries to differentiate between
authors with the same name. FOAF files can record a wealth of information about a person.
They can be written in RDF/XML by users or generated through social networking sites.
Digital libraries could allow users to create profiles, and then translate these profiles
into FOAF files for use within the digital library interface.
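
Translating a library profile into a FOAF file might look roughly like the sketch below,
which uses only Python's standard XML library. The profile dictionary, its keys, the
example URIs and the `profile_to_foaf` function are assumptions made for illustration;
`foaf:Person`, `foaf:name` and `foaf:interest` are terms of the FOAF vocabulary.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Ask the serializer to use the conventional prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("foaf", FOAF_NS)


def profile_to_foaf(profile: dict) -> str:
    """Turn a simple profile dictionary into a FOAF RDF/XML string."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    person = ET.SubElement(root, "{%s}Person" % FOAF_NS)
    person.set("{%s}about" % RDF_NS, profile["uri"])
    ET.SubElement(person, "{%s}name" % FOAF_NS).text = profile["name"]
    # foaf:interest points at a page about a topic the person cares about.
    for page in profile.get("interests", []):
        interest = ET.SubElement(person, "{%s}interest" % FOAF_NS)
        interest.set("{%s}resource" % RDF_NS, page)
    return ET.tostring(root, encoding="unicode")


# Hypothetical profile as a digital library might store it.
xml_text = profile_to_foaf({
    "uri": "http://example.org/users/reader42#me",
    "name": "A. Reader",
    "interests": ["http://example.org/topics/semantic-web"],
})
print(xml_text)
```

The person's URI plays the same role as an authority record: two users named "A. Reader"
remain distinct as long as their URIs differ.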

Digital libraries should provide users with services beyond simple search and retrieval
of items, and allowing users to customize their interface and track their search history
is one way to accomplish this goal. A 2009 study by Jiang and Tan used Semantic Web
technology to create user ontologies that help provide personalized information services.
Jiang and Tan (2009) explain that a user ontology is a specialization of a domain
ontology in which each concept and relation of the domain ontology is assigned a specific
value indicating the user's interest.

This user ontology allows retrieved search items to be ranked by user preferences. The
authors required participants to submit a search query to a system loaded with documents
from the ACM digital library. The participants were then asked to browse the top 30
documents retrieved and rank them by relevance.

This data was loaded into that user's ontology based on the concepts included in
preferred and non-preferred documents. After the user ontology had been created, it was
loaded into the search interface. The participants then submitted a new query, and the
search system ranked the documents based on preferences from the user ontology. These
results were compared with those of a search on the same system without the user ontology
loaded. The authors found that the system loaded with the user ontologies returned more
precise results than the system that was not programmed for user preference. The authors
state: "User ontology can be used in many ways to support Semantic Web applications,
including document re-ranking, information filtering, and query expansion." The
implication of this research is that digital libraries can employ Semantic Web technology
to track user preferences in a new way, and can integrate these preferences with current
search capabilities.
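
The idea of re-ranking by user interest can be illustrated with a small sketch. It is not
Jiang and Tan's method: the scoring formula, the interest values, the document records
and the `rerank` function are all assumptions chosen to show the principle that each
concept carries a per-user weight.

```python
def rerank(results, interest):
    """Re-order results by blending the search engine's score with the
    user's interest in the concepts each document is annotated with."""
    def personalised(doc):
        concepts = doc["concepts"]
        if concepts:
            # Average interest over the document's concepts;
            # concepts the user has never rated count as 0.
            boost = sum(interest.get(c, 0.0) for c in concepts) / len(concepts)
        else:
            boost = 0.0
        return doc["score"] * (1.0 + boost)

    return sorted(results, key=personalised, reverse=True)


# Hypothetical interest values attached to concepts of a domain ontology.
user_interest = {"ontology": 0.9, "information retrieval": 0.4, "hardware": -0.5}

# Hypothetical search results with the engine's original scores.
results = [
    {"id": "doc1", "score": 0.80, "concepts": ["hardware"]},
    {"id": "doc2", "score": 0.70, "concepts": ["ontology", "information retrieval"]},
    {"id": "doc3", "score": 0.75, "concepts": []},
]

ranked = [doc["id"] for doc in rerank(results, user_interest)]
print(ranked)
```

Without the interest values the order would be doc1, doc3, doc2; with them, the document
about concepts the user prefers moves to the top and the one about a disliked concept
drops to the bottom.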



Semantic Web technologies have been integrated with digital libraries to meet the needs
of digital library users and the goals of digital library organizations. The greatest
asset of the Semantic Web is that it allows for interoperability between organizations
and information systems, because even agents that were not expressly designed to work
together can transfer data among themselves when the data comes with semantics. Another
asset of the Semantic Web is that users can create information about digital objects
using any language, classification scheme, or metadata scheme. Once this information has
been linked to that object's DOI, the computer will integrate each piece of information
scattered throughout the Web to create a more complete record of that object.

Digital librarians should be leading the charge for the creation of RDF schemas and
ontologies for digital objects because they combine traditional library skills, such as
content representation and cataloguing, with technological skills such as metadata
creation. Digital libraries must use these technologies to improve all aspects of their
organizations.

The ideal digital library would use Semantic Web technology as the backbone of its entire
organization. All digital objects contained within its collections would have RDF
documents associated with them. The collections would all be pre-loaded with ontologies
related to their specific domain. Digital libraries would contain resources from
organizations around the world. Users would be able to log in and create profiles that
would translate into FOAF files and user ontologies to keep track of semantic
preferences. The search and retrieval functions would benefit from the ability to search
semantically alongside the traditional search methods the library currently employs.
Semantic Web technology essentially teaches the computer the relationships between
digital objects, their meaning, and how users would like to interact with them. The
technology is already in place, and the ideal digital library is only a few keystrokes
away.

DIGITAL LIBRARIES: THE EARLY TIMES

The digital library concept can be traced back to the famous papers of visionary
scientists such as Vannevar Bush and J.C.R. Licklider, who identified and pursued the
goal of innovative technologies and approaches to knowledge sharing as fundamental
instruments for progress. Bush (Bush, 1945) devised "a device in which an individual
stores all his books, records, and communications, and which is mechanized so that it may
be consulted with exceeding speed and flexibility." Moreover, on top of it there is "a
transparent platen. On this are placed longhand notes, photographs, memoranda, all sorts
of things." Because of the lack of digital support, he identified improved microfilm as
the means for content storage and exchange: "contents are purchased on microfilm ready
for insertion. Books of all sorts, pictures, current periodicals, newspapers, are thus
obtained and dropped into place." Of course, he also envisaged support for knowledge
discovery ("provision for consultation of the record by the usual scheme of indexing"),
access ("to consult a certain book, he taps its code on the keyboard, and the title page
of the book promptly appears before him") and management ("new forms of encyclopaedias
will appear, ready made with a mesh of associative trails running through them, ready to
be dropped into the memex and there amplified").

Licklider realized that computers were becoming powerful enough to support the type of
automated library systems that Bush had described and, in 1965, wrote his book about how
a computer could provide an automated library with simultaneous remote use by many
different people through access to a common database. Because of this, Licklider is also
considered a pioneer of the Internet, and in his book he established the connection
between the Internet and the digital library. Thus, it is not surprising that research
and development activity on digital libraries started in the early 1990s, with the
proliferation of the Internet, which has created unprecedented possibilities to discover
and deliver human knowledge. The first systems delivering knowledge artefacts in digital
form can essentially be seen as archives of digital texts accessible through a search
service and implemented with a centralized metadata catalogue.

The earliest of these systems were built on a rather simple architecture, with very few
exceptions. This worked to the advantage of their diffusion and adoption by different
scientific communities. Besides arXiv, significant examples of such early systems were
archives of various types, such as Electronic Theses and Dissertations (ETD)
repositories, whose pilot project started in 1996, and archives of cognitive science
papers and of research papers in economics, both launched in 1997. The first of these was
a system offering services for submitting, browsing and searching electronic theses in
PDF format. The availability of this product stimulated the creation of the Networked
Digital Library of Theses and Dissertations, an international organization, still
operational, which registers and keeps track of ETDs. CogPrints was initially conceived
as a repository allowing the cognitive science community to self-archive their papers. It
now contains more than 3,000 artefacts dating from 1950 onwards. In 2000 it was made
compliant with the protocol defined by the Open Archives Initiative, and its software was
then converted into the EPrints Digital Repository Software (EPrints, n.d.), a flexible
platform supporting easy and fast set-up of repositories of open access research outputs.
Because of its simplicity, EPrints is currently widely used: more than 250 repositories
have declared that they rely on it.

Similarly, RePEc was initially conceived as an open repository of electronic papers in a
specific domain. In 1997 Thomas Krichel, principal investigator of the RePEc Project,
illustrated the principles underlying a newly realised version of this system by
affirming: "Distributed archives should offer metadata about digital objects (mainly
working papers); the data from all archives should form one single logical database
despite the fact that it should be held on different servers; users could access the data
through many interfaces; providers of archives should offer their data to all interfaces
at the same time." With these statements Krichel was anticipating a view that would
emerge widely a few years later. These systems, all still living in more recent and
enhanced versions, represent very embryonic forms of digital libraries. In fact, their
functionality is essentially confined to (self-)publishing of simple information objects
and discovery of these objects through rudimentary search and browse facilities.

The Digital Library Initiative (DLI) consisted of two major competitive funding programs.
The first started in 1994 and funded six research projects (chosen from 73 proposals)
over a four-year period (Schatz and Chen, 1996), while the second phase extended the
research carried out during the first by including content providers, so as to guarantee
the availability of a real test bed for validating research outcomes. The DLI-funded
projects were not the only ongoing efforts (CACM, 1995), but they were very innovative
because they focused on future technological problems. The six projects funded by DLI
phase one included: the California Environmental Digital Library (Wilensky, 1995),
focused on developing the technologies to access large, distributed collections of
photographs, satellite images, videos, maps, documents, and multivalent documents, and to
support work-centred digital information services (Wilensky, 1996); the Alexandria
Digital Library (Smith and Frew, 1995), focused on building an online, distributed
digital library for geo-referenced information, including maps, aerial photographs,
satellite imagery, and catalogue records, and on supporting geographically defined
queries (Smith, 1996); and the Informedia Digital Video Library (Christel, Kanade,
Mauldin, et al., 1995), focused on establishing a large, online digital video collection
with full-content and knowledge-based search and retrieval (Wactlar, Kanade, Smith and
Stevens, 1996).

Although none of these systems exists any longer as a running service, the solutions
proposed, the technology developed, and the resources collected and built have been
widely used by more complex DLs developed later. It is well known that one of the most
important success stories resulting from these projects is Google: Page and Brin started
working on their search engine while they were PhD students at Stanford working on the
Stanford Digital Library Project. In fact, the merits of the Digital Library Initiative
go far beyond the specific work that it funded, and we can affirm that it gave shape to
digital libraries as a new research discipline. Research on digital library topics was
not new, but it had been fragmented across many disciplines. This program led to
conferences, publications, and researchers and teams explicitly interested in doing
research in digital libraries. Moreover, it gave direction to the overall movement toward
a practical research field (Arms, 2001).

As anticipated, in Europe the scene was characterised by the DELOS initiatives.

In addition, it was intended to develop new models for intelligent audio-visual
content-based searching and film-sequence retrieval, new video abstracting tools, and
user interfaces specifically tailored to the new functionality. The provision of
multilingual services and cross-language retrieval tools was also addressed. Another
project, An Integrated Art Analysis and Navigation Environment (ARTISTE) (Allen, Vaccari
and Presutti, 2000), focused on giving providers, publishers, distributors, rights
protectors and end users of art image information, as well as the multimedia information
market as a whole, a more efficient system for storing, classifying, linking, matching
and retrieving art images. This environment provided, for example, automatic extraction
of metadata based on iconography, painting style, etc.; content-based navigation for art
documents; distributed linking and searching across multiple archives, allowing ownership
of data to be retained; and storage of art images using large multimedia
object-relational databases.

According to these principles, the functionality of a digital library system is made
available in the form of distinct functional units, each exposing its operational
semantics through an open protocol; digital library systems are compositions of these
functional units, and new functionality can be added by implementing value-added services
that interact with existing ones using established protocols; and the components (and
content) of a digital library can be spread over the global Internet but should be
presented to the user as a single system.



DIGITAL LIBRARIES EVOLUTION: CONTENT SHARING

The construction of digital libraries like those just described was very
resource-consuming since, for each new one, both the content and the software providing
its functionality were built from scratch. At the end of the 1990s, experience with using
distributed architectures to implement proper digital libraries, together with the
proliferation of independent repositories of valuable content, stimulated the idea of
reusing content already collected (and curated) in existing independent repositories so
as to reduce the effort needed to build large-scale digital libraries. However, many
obstacles had to be overcome to implement this solution fully. The major one was
certainly how to implement repository service interoperability, i.e. the capability of
seamlessly accessing and using the content managed in distributed and heterogeneous
repositories. Approaches based on cross-searching multiple archives through a common
protocol, such as Z39.50 (Miller, P., 1999), were considered at the time costly and
hardly scalable. A very important meeting on the interoperability of electronic
repositories was organised in Santa Fe, New Mexico, in October 1999, with the goal of
establishing recommendations and mechanisms to facilitate cross-archive value-added
services.

This meeting led to the Santa Fe Convention, a combination of organizational principles
and technical specifications to facilitate a minimal but potentially highly functional
level of interoperability among scholarly e-print archives, and to the establishment of
the Open Archives Initiative (Van de Sompel, H., Lagoze, C., 2000). The meeting started
by discussing a concrete example of interoperability implemented through the UPS
Prototype (Van de Sompel, H., Krichel, T., Nelson, M.L., 2000) and recognising its
potential. The UPS prototype demonstrated the integrated action of a variety of services
operating over data originating from a set of archives. Each of those services provided a
reasonably rich level of functionality (accessible through a set of protocol methods).
The participants recognised that trying to reach consensus on the full functionality of
the prototype was aiming too high, and that the approach to integration needed a degree
of modesty that would balance the cost of participation against the need for adequate
functionality.

The Santa Fe Convention identified two key roles for participating institutions: data
providers and service providers. Data providers were in charge of handling the depositing
and publishing of resources in a repository and of exposing for harvesting the metadata
(what they called a "record") about resources in the repository. They were the creators
and keepers of the metadata and of the repositories of resources. Service providers were
in charge of harvesting metadata from data providers for the purpose of providing one or
more services over the collected data. The types of services that might be offered
included a search interface, a peer-review system, etc. The cooperation between data and
service providers was regulated by a protocol, initially defined as a subset of the
Dienst protocol and nowadays known as the Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH).
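
The division of labour between the two roles can be sketched from the service provider's
side. The sketch below builds a harvesting request and reads records out of a response
using only Python's standard library. The endpoint `https://archive.example.org/oai` and
the sample response are hypothetical and hand-written, and no network call is made;
`verb=ListRecords` and `metadataPrefix=oai_dc` are request parameters defined by the
harvesting protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request a service provider sends to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_records(response_xml: str) -> list:
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.find("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        title = record.find(".//{%s}title" % DC_NS)
        pairs.append((identifier.text, title.text))
    return pairs


# Trimmed, hand-written response standing in for what a data provider returns.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Working Paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://archive.example.org/oai"))
print(extract_records(SAMPLE))
```

The service provider only ever sees metadata records; the resources themselves stay in
the data provider's repository, which is what keeps the cost of participation low.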

In the US, the National Science Foundation funded the National Science Digital Library
(NSDL) (Zia, L.L., 2001) with the aim of providing organized access to high-quality
resources and tools that support innovations in teaching and learning at all levels of
science, technology, engineering, and mathematics education.

Large-scale initiatives of this kind, which aim to aggregate in a single place
knowledge that is spread across a plethora of archives and systems, will always
exist, for reasons including the existence of various (institutional)
repositories and the ever-growing multidisciplinary nature of our society. In
particular, TEL and DARE anticipated two important initiatives, namely Europeana
and DRIVER respectively, which were launched a few years later. Europeana is a
Thematic Network funded by the European Commission under the eContentplus
programme, as part of the i2010 initiative. Europeana began in July 2007.
Originally known as the European digital library network (EDLnet), it is the
result of a partnership of 100 representatives of heritage and knowledge
organisations and IT experts from throughout Europe. The objective of Europeana
is to provide access to Europe's cultural and scientific heritage through a
cross-domain portal. The first Europeana prototype, launched in November 2008,
provided a simple search and retrieval facility over an information space of
approximately two million digital objects selected from Europe's museums,
libraries, archives and audio-visual collections, harvested through the OAI-PMH
protocol. The first production-quality version of Europeana (called Rhine) was
scheduled to go live in July 2010, to be followed in April 2011 by a more
sophisticated version (Danube), including more content and offering a richer set
of functionality. The intention was that by 2010 the Europeana portal would give
everybody direct access to well over 6 million digital sounds, pictures, books,
archival records and films. Moreover, Europeana's goal is to realise a system
serving very different types of users. It should satisfy the occasional curiosity
of general users as well as the information needs of school children and
students. It
should also provide academic students and teachers with certified information


and the possibility to export information for courses, as well as offer expert
researchers and professionals the possibility of searching, verifying and
annotating information and using ad-hoc services. In the context established by
Europeana, aggregators are a special type of provider, i.e. specialised DLs
that act as collectors of content from other providers. For instance, Culture.fr
is the largest aggregator, providing content from about 480 organisations in
France, including the Louvre and the Musée d'Orsay. The information resources
that populate Europeana's information space are harvested as surrogates of the
original objects, which are located at the content providers' sites. Since
surrogates may also contain elements of the original object (table of contents,
full-text index items, music and video abstractions, etc.), a very interesting
new feature of Europeana is that it will also deliver digital objects besides
metadata. Clearly, heterogeneity and interoperability are the main issues that
such a DL has to deal with, together with scalability, quality of
service and, more generally, the sustainability of the joint portal.

DRIVER is another notable example of a DL that relies on content provided by a
large number of external data providers. It is the result of two consecutive
projects funded by the European Commission in the period 2006-2009. The
main aim of these two projects was to create the organisational and
technological conditions for setting up a European Repository Infrastructure
(Jones, S., Manghi, P., 2009). The main instrument identified by the project to
address organisational issues is the DRIVER Confederation. The Confederation
partners represent European and international repository communities, such as
subject-based communities, repository system providers and service providers, as
well as political, research and funding organisations, who share the DRIVER
vision of allowing all research institutions in Europe and worldwide to make all
their research publications openly accessible through institutional repositories.


In the spirit of this shared goal, the DRIVER Confederation encourages a
combined effort of repository development by setting up guidelines and best
practices that favour the realisation of a shared, trusted, long-term repository
infrastructure. From the technical point of view, DRIVER is based on the D-Net
technology. This enabling technology is quite innovative in the context of
these kinds of aggregative systems because it is oriented to the realisation of a
digital library infrastructure (cf. Sec. 5). D-Net is based on a service-oriented
architecture, where distributed and shared resources are implemented as
standard Web Services and applications consist of sets of interacting services.
It offers services both to data providers, which can use it to share their
content more easily, and to service providers, which are helped in implementing
DLs that exploit the aggregated content. At the time of writing, the
DRIVER service provides access to approximately one million records from
200+ repositories across 27 countries. Moreover, it delivers three DL
applications: the Belgian national repository portal, offering search over the
Belgium Repository Federation subset; the Recolecta national repository portal,
offering search over the Spanish Repository Federation subset; and the main
DRIVER portal, providing access and advanced functionality over the whole
space. The current Europeana and DRIVER services operate on an information
space of metadata records, i.e. they harvest metadata records through the
OAI-PMH protocol from existing repositories and then run their services by
exploiting this content. Because of this, they suffer from the limitations that
OAI-PMH poses when it is used to exchange information objects that are
rich in structure and payload, such as those at the core of the changing nature
of scholarship and scholarly communication (Van de Sompel, H., Payette, S.,
2004) (Van de Sompel, H., Lagoze, C., 2006). In particular, when feasible, they
give access to the content associated with the metadata by exploiting a URL or


some other information contained in the record. This solution for accessing
information objects, however, suffers from two main problems:

(i) access is not always feasible, since there is no standard protocol to access
objects;

(ii) there is no way of accessing compound objects, since the structure of the
object and the relations holding among its different parts are unknown.

A solution to this problem may come from the OAI-ORE standard, whose
version 1.0 was released in October 2008 by the Open Archives Initiative.
This standard, based on Web standards, proposes a way to handle
aggregations of Web resources. These aggregations, sometimes called
compound digital objects, may combine distributed resources of multiple
media types, including text, images, data and video, to form innovative
research outcomes. Both Europeana and DRIVER have already planned to
move very soon to technologies à la OAI-ORE to manage compound objects.
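To make the idea of a compound object more concrete, the following sketch describes an aggregation as a set of subject-predicate-object statements using the ore:Aggregation class and the ore:aggregates property from the OAI-ORE vocabulary. It is only an illustrative sketch: the example URIs and the helper names describe_aggregation and parts_of are assumptions made here, and a real resource map would carry more information (for instance the resource map itself and its metadata).

```python
# Vocabulary terms: the ORE namespace and the RDF "type" property.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_aggregation(aggregation_uri, part_uris):
    """Return triples stating that one aggregation groups several Web resources."""
    triples = [(aggregation_uri, RDF_TYPE, ORE + "Aggregation")]
    for part in part_uris:
        triples.append((aggregation_uri, ORE + "aggregates", part))
    return triples


def parts_of(triples, aggregation_uri):
    """List the resources aggregated by the given aggregation."""
    return [
        obj
        for subj, pred, obj in triples
        if subj == aggregation_uri and pred == ORE + "aggregates"
    ]


if __name__ == "__main__":
    # Hypothetical compound object: an article plus its data set.
    statements = describe_aggregation(
        "http://example.org/aggregation/1",
        ["http://example.org/article.pdf", "http://example.org/dataset.csv"],
    )
    print(parts_of(statements, "http://example.org/aggregation/1"))
```

Because the structure of the compound object is stated explicitly, a harvesting service no longer has to guess which files belong together, which is the second problem listed above.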

All the systems and initiatives described in this section are essentially oriented
to content sharing. Moreover, the majority of them are characterised by a strong
organisational effort, since the model is based on the cooperative
participation of the content providers. Content sharing across digital libraries
is now widely promoted as an important strategy to reduce the digital
library set-up costs, which come largely from selecting, digitising, describing
and digitally curating content resources. However, the realisation of wide and
generalised content sharing is still problematic today, owing to the great
variety of proprietary models and ontologies adopted by existing systems and to
the lack of a systematic approach to interoperability. A recently funded EC
project stemming from the DELOS project is paving the way for the future
interoperability of DL systems, thus making feasible the implementation of
global digital library infrastructures.


The Digital Library Universe

A Three-tier Framework

A Digital Library is an evolving organisation that comes into existence through
a series of development steps that bring together all the necessary constituents.
Figure 2 presents this process and indicates three distinct notions of
'system' developed along the way, forming a three-tier framework: Digital
Library, Digital Library System and Digital Library Management System. These
correspond to three different levels of conceptualisation of the universe of
Digital Libraries.

Figure 2 DL, DLS and DLMS: A Three-tier Framework

These three system notions are often confused and used interchangeably in the
literature. This terminological imprecision has produced a plethora of
heterogeneous entities and contributes to making the description, understanding
and development of digital library systems difficult.


As Figure 2 indicates, all three systems play a central and distinct role in the
digital library development process. The explicit definitions that follow clarify
their differences and their individual characteristics.

Digital Library (DL)

A potentially virtual organisation that comprehensively collects, manages and
preserves for the long depth of time rich digital content, and offers to its
targeted user communities specialised functionality on that content, of defined
quality and according to comprehensive codified policies.

Digital Library System (DLS)

A deployed software system that is based on a possibly distributed architecture
and provides all the facilities required by a particular Digital Library. Users
interact with a Digital Library through the corresponding Digital Library System.

Digital Library Management System (DLMS)

A generic software system that provides the appropriate software infrastructure
both (i) to produce and administer a Digital Library System incorporating the
suite of facilities considered fundamental for Digital Libraries and (ii) to
integrate additional software offering more refined, specialised or advanced
facilities.

Although the concept of Digital Library is intended to capture an
abstract system that consists of both physical and virtual components, the
Digital Library System and the Digital Library Management System capture concrete
software systems. For every Digital Library there is a unique Digital Library
System in operation (possibly consisting of many interconnected smaller Digital
Library Systems), whereas all Digital Library Systems are based on a handful of
Digital Library Management Systems. The Digital Library is thus the abstract


entity that 'lives' thanks to the software system constituting the DLS, and the
DLMS is the software system that is conceived to support the life cycle of one
or more DLSs. It is important to note that all these concepts underlie all types
of information environments and systems, e.g. databases, hospital information
systems, banking systems, the Web, Wikipedia, etc. However, it is the particular
characterisations given in the definitions above that
distinguish digital libraries from the others: the content should be rich,
annotated and preserved for the long depth of time; the user communities should
be targeted; the functionality should be specialised; and the quality should be
measured, all in accordance with comprehensive policies. All of these
characterisations, of course, are abstract and subject to interpretation, so
they cannot lead to a precise formal definition. Nevertheless, they offer
conceptual yardsticks by which systems can be measured and mutually compared,
and by which psychological lower
bounds can be established regarding the nature of digital libraries.
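The relationship between the three tiers can be sketched as a small data model. The sketch below is an illustration only, assuming hypothetical class and attribute names (DLMS, DLS, DigitalLibrary, produce_dls, built_on, system) chosen for this example; it encodes just two statements from the text, namely that each Digital Library is operated by one DLS and that several DLSs can be produced from the same DLMS.

```python
from dataclasses import dataclass, field


@dataclass
class DLMS:
    """Generic software used to produce and administer Digital Library Systems."""
    name: str
    # Back-references are left out of repr/eq so that printing or comparing
    # objects does not recurse through the DLS <-> DLMS cycle.
    produced: list = field(default_factory=list, repr=False, compare=False)

    def produce_dls(self, dls_name):
        """Create a DLS that is based on this DLMS."""
        dls = DLS(name=dls_name, built_on=self)
        self.produced.append(dls)
        return dls


@dataclass
class DLS:
    """Deployed software system providing the facilities of one Digital Library."""
    name: str
    built_on: DLMS


@dataclass
class DigitalLibrary:
    """The organisation itself; users reach it through its DLS."""
    name: str
    system: DLS


if __name__ == "__main__":
    dlms = DLMS("generic-dlms")
    history = DigitalLibrary("History Archive", dlms.produce_dls("history-dls"))
    physics = DigitalLibrary("Physics Preprints", dlms.produce_dls("physics-dls"))
    # Two different libraries, two different DLSs, one shared DLMS.
    print(history.system.built_on is physics.system.built_on)
```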

Main Concepts

Despite the great variety and diversity of existing digital libraries, there is a
small number of fundamental concepts that underlie all systems. These concepts
are identifiable in nearly every digital library currently in use. They serve as a
starting point for any researcher who wants to study and understand the field,
for any system developer intending to construct a digital library, and for any
content provider seeking to expose its content via digital library technologies.
In this section, we identify these concepts and briefly discuss them. Seven core
concepts provide a foundation for digital libraries. One of them appears in the
definition of Digital Library to capture the commonalities between this universe
and other social arrangements: Organisation.


Five of them appear in the definition of Digital Library to capture the features
characterising this kind of organisation and the expected service: Content,
User, Functionality, Quality and Policy. The seventh one emerges in the
definition of the Digital Library System to capture the systemic features
underlying the expected service: Architecture.

All seven concepts influence the digital library three-tier framework, as shown
in Figure 3.

Figure 3 The Digital Library Universe: Main Concepts

Organisation

The Organisation concept surrounds the entire Digital Library universe. A
digital library is a kind of organisation in its own right: it is a social
arrangement pursuing a well-defined goal (the digital library service). This
concept subsumes the mission the digital library has been conceived for and
every other aspect that is needed to define this mission and the operation of
the resulting service.
However, this should not be confused with the organisation/institution that


decided to set up the digital library and drive its development, although there
are overlaps and dependencies between the two. The dependency relationship is
easy to recognise: to some extent the institution sets the
scene for the digital library organisation, since the institution establishes the
Digital Library Organisation and has the power to define the overall service this
organisation is requested to realise. However, the digital library, being an
organisation in its own right, has the power to control its own behaviour and
evolution within the frame defined by the institution. This concept is
fundamental to characterising the digital library universe because it highlights
the commonalities between this universe and the wider universe of organised
bodies of people having a particular purpose.

Content

The Content concept encompasses the data and the information that the Digital
Library handles and makes available to its users. It is composed of a set of
information objects organised in collections. Content is an umbrella concept
used to aggregate all forms of information objects that a digital library
collects, manages and delivers. It encompasses a diverse range of information
objects, including primary objects, annotations and metadata.

This concept is fundamental to characterising the digital library universe
because it captures one of the major resources these organisations are called to
manage, i.e. the data and the information that are made available through them.

User

The User concept covers the various actors (whether human or machine)

entitled to interact with Digital Libraries. Digital Libraries connect actors with

information and support them in their ability to consume and make creative use


of it to generate new information. User is an umbrella concept including all
notions related to the representation and management of actor entities within a
digital library. It encompasses such elements as the rights that actors have
within the system and the profiles of the actors, with characteristics that
personalise the system's behaviour or represent these actors in collaborations.

This concept is fundamental to characterising the Digital Library universe
because it captures the actors of the overall Organisation.

Functionality

The Functionality concept encapsulates the services that a Digital Library offers

to its different users, whether individual users or user groups. While the general

expectation is that Digital Libraries will be rich in functionality, the bare

minimum of functions includes new information object registration, search and

browse. Beyond that, the system seeks to manage the functions of the Digital

Library to ensure that the overall service reflects the particular needs of the

Digital Library's community of users and/or the specific requirements related to its Content. This concept is fundamental to characterise the digital library

universe because it captures the facilities offered by the overall organisation.

Policy

The Policy concept represents the set or sets of conditions, rules, terms and
regulations governing every single aspect of the digital library service,
including acceptable behaviour, digital rights management, privacy and
confidentiality, charges to users, and collection formation. Policies may be
defined within the digital library or be superimposed by the institution
establishing the digital library or by parties outside it (e.g. policies
governing our society). These policies can be
extrinsic or intrinsic.


This concept is fundamental to characterise the Digital Library universe because

it captures the rules and conditions regulating the overall Organisation.

Quality

The Quality concept represents the parameters that can be used to characterise

and evaluate the overall service of a Digital Library including every aspect of it,

i.e. Content, User, Functionality, Policy, Quality, and Architecture.

Quality can be associated not only with each class of content or functionality

but also with specific information objects or services. Some of these parameters

are quantitative and objective in nature and can be measured automatically,

whereas others are qualitative and subjective in nature and can only be

measured through user evaluations (e.g., focus groups). This concept is

fundamental to characterise the Digital Library universe because it captures

qualitative aspects characterising the Organisation.

Architecture

The Architecture concept refers to a Digital Library System and represents a

mapping of the overall service offered by a Digital Library (and characterised

by Content, User, Functionality, Policy and Quality) on to hardware and

software components. There are two primary reasons for having Architecture as a
core concept:

(i) Digital Libraries are often assumed to be among the most complex and
advanced forms of information systems (Fox & Marchionini, 1998); and

(ii) interoperability across Digital Libraries is recognised as a major
challenge.

A clear architectural framework for Digital Library Systems offers ammunition

in addressing both of these issues effectively. This concept is fundamental to

characterise the Digital Library universe because it captures the systemic part of

the service offered by the Organisation. The concepts populating the areas just
introduced (Organisation is a special case since it subsumes all the rest) share
many similar characteristics and all refer to internal entities of a Digital
Library that can be sensed by the external world. Therefore, a higher-level
concept referring to all of these has also been introduced, i.e.
Resource, which enables us to reason about their common characteristics in a
consistent manner.

The figure puts in perspective the main concepts of the Digital Library universe.
The Organisation concept surrounds and subsumes all the other concepts.
Among the remaining six, two are independent of each other and
exist independently of any specific Digital Library. These are User, representing
the external humans or hardware interacting with the Digital Library, and
Content, representing the material handled by the Digital Library. Architecture,
the technological design on which the Digital Library System is
based, represents the underlying technology that is called to implement all the
rest. On top of these concepts comes Functionality, primarily representing
the means for connecting User to Content, i.e. all procedures, transformations,
actions and interactions that bring Content to User or vice versa. Finally,
operation of the Digital Library and activation of its Functionality are based on
Policy and aim to achieve a certain Quality.
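The way these concepts fit together can be illustrated with a toy example in which a piece of Functionality (a search) connects a User to Content while obeying a Policy. This is a minimal sketch under stated assumptions: the class names InformationObject and Actor, the functions may_access and search, and the rule that only librarians see restricted items are all invented for the illustration and are not part of any reference model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationObject:
    """Content: one item handled by the digital library."""
    identifier: str
    title: str
    restricted: bool = False


@dataclass(frozen=True)
class Actor:
    """User: a human or machine interacting with the digital library."""
    name: str
    roles: frozenset = frozenset()


def may_access(actor, obj):
    """Policy (hypothetical rule): restricted items are visible to librarians only."""
    return (not obj.restricted) or ("librarian" in actor.roles)


def search(actor, collection, term):
    """Functionality: bring matching Content to the User, subject to the Policy."""
    return [
        obj
        for obj in collection
        if term.lower() in obj.title.lower() and may_access(actor, obj)
    ]


if __name__ == "__main__":
    collection = [
        InformationObject("1", "Semantic Web Primer"),
        InformationObject("2", "Semantic Annotation Draft", restricted=True),
    ]
    reader = Actor("reader")
    librarian = Actor("curator", frozenset({"librarian"}))
    print([o.identifier for o in search(reader, collection, "semantic")])
    print([o.identifier for o in search(librarian, collection, "semantic")])
```

A Quality parameter such as response time or completeness could then be measured over calls to search, which is one way to read the statement that Functionality aims to achieve a certain Quality.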


The Main Roles of Actors

In order to describe the overall operation of the Digital Library Organisation
and the way it is expected to deliver the service it has been established for, we
envisage actors interacting with the digital library by playing roles in three
different and complementary categories: DL End-users, DL Managers and DL
Software Developers.

Figure 5 The Main Roles of Actors versus the Three-tier Framework

As shown in Figure 5, each role is primarily associated with one of the three
systems in the three-tier framework. The system a role is associated with
is the entity that is expected to provide the actor playing that role
with the facilities needed to accomplish the mandate assigned to the role.
Moreover, every actor, independently of the role he or she is playing, is
expected to deal with all the foundational concepts characterising the Digital
Library universe.


DL End-users

DL End-users exploit the overall Digital Library service for the purpose of
providing, consuming and managing the DL Content. They are the target clients of
the service defined by the DL Organisation in terms of the Content to be managed,
the User(s) to be served, the Functionality to be supported, the Policy(ies) to
be put in place and the Quality to be exposed. They perceive the DL as a stateful
entity serving their needs. This state of the Digital Library is a complex
condition resulting from and influencing the Content, User, Functionality, Policy
and Quality aspects of the DL Organisation. Moreover, the state is expected to
evolve during the lifetime of the Digital Library as a consequence of the
actions and activities performed in the context of the DL Organisation as well as
of external factors influencing the DL Organisation. DL End-users may be
further divided into Content Creators, Content Consumers and Digital Librarians.

Content Creators are the producers of the Digital Library Content, i.e. they
take care of producing new items contributing to the Digital Library Content.
Their activity is performed

(i) through the Functionality the DL is provided with,

(ii) in accordance with the Policies defined in the DL, and

(iii) with the guarantee of Quality the DL declares.

Content Consumers are the clients of the Digital Library Content, i.e. they
access and use the items in the Digital Library Content. Their activity is
performed

(i) through the Functionality the DL is provided with,

(ii) in accordance with the Policies defined in the DL, and

(iii) with the guarantee of Quality the DL declares.

Digital Librarians are the curators of the Digital Library Content, i.e. they
select, organise and look after the items in the Digital Library Content. Their
activity is performed

(i) through the Functionality the DL is provided with,

(ii) in accordance with the Policies defined in the DL, and

(iii) with the guarantee of Quality the DL declares.

Moreover, Digital Librarians might influence the behaviour of the overall
Digital Library service by acting as mediators between its final clients, i.e.
Content Creators and Content Consumers, and those defining and operating this
service, i.e. DL Managers, by distilling and elaborating feedback on the DL.

DL Managers

DL Managers are the actors driving the overall Digital Library service. They are
expected to rely on the facilities offered by the DLMS to define and operate
the Digital Library and the DLS implementing it. DL Managers may be further
divided into DL Designers and DL System Administrators. The former are
called to devise the overall service, while the latter are called to deploy and
operate the DLS implementing the planned service.

DL Designers

DL Designers exploit their knowledge of the application environment that a DL is
called to serve in order to define, customise and maintain the Digital Library so
that it is aligned with the needs of its target DL End-users.


To perform this task, the DL Designers interact with the DLMS to decide upon
the characteristics the Digital Library should have in terms of

(i) Content, e.g. the set of repositories, ontologies, classification schemas,
information object types, metadata formats, authority files and
gazetteers that form the DL Content;

(ii) User, e.g. the allowed actors, the allowed roles, the information
characterising the actors;

(iii) Functionality, e.g. the functional facilities to be offered, the behaviour
these facilities should implement;

(iv) Policy, e.g. the rules and principles governing the evolution of the DL
Content, the allowed actions per actor or family of actors, the
exploitation of a resource;

(v) Quality, e.g. the minimal availability of DL Functionality, the minimal
response time of DL Functionality, the completeness and
authoritativeness of the DL Content, the confidentiality of the User
actions.

These aspects characterise the overall Digital Library service,
that is, the way it is perceived by the DL End-users.

These parameters need not necessarily be fixed for the entire lifetime of the DL;
they may be reconfigured to enable the DL to respond to the evolving
expectations of target users and changes in all aspects.
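The five groups of parameters listed above can be pictured as a configuration that a DL Designer fills in and later revises. The following sketch is purely illustrative: the dictionary layout, the key names and the helper functions missing_aspects and reconfigure are assumptions made for this example and do not correspond to the interface of any particular DLMS.

```python
# The five aspects a DL Designer is expected to decide upon.
REQUIRED_ASPECTS = ("content", "user", "functionality", "policy", "quality")


def missing_aspects(config):
    """Return the aspects that the designer has not configured yet."""
    return [aspect for aspect in REQUIRED_ASPECTS if aspect not in config]


def reconfigure(config, aspect, **changes):
    """Return a new configuration with one aspect updated; the old one is kept."""
    if aspect not in REQUIRED_ASPECTS:
        raise ValueError(f"unknown aspect: {aspect}")
    updated = dict(config)
    updated[aspect] = {**config.get(aspect, {}), **changes}
    return updated


# Hypothetical initial design of a small digital library (all values invented).
INITIAL_DESIGN = {
    "content": {"metadata_formats": ["oai_dc"], "repositories": ["theses"]},
    "user": {"roles": ["creator", "consumer", "librarian"]},
    "functionality": {"facilities": ["register", "search", "browse"]},
    "policy": {"deposit_allowed_for": ["creator", "librarian"]},
}

if __name__ == "__main__":
    print(missing_aspects(INITIAL_DESIGN))
    revised = reconfigure(INITIAL_DESIGN, "quality", min_availability=0.99)
    print(missing_aspects(revised))
```

Returning a new configuration rather than changing the old one in place reflects the remark that the parameters may be reconfigured during the lifetime of the DL while earlier settings remain available for comparison.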

DL System Administrators

DL System Administrators work in tandem with DL Designers to put in place the
Digital Library System implementing the planned Digital Library service.
They select, deploy and manage the set of networked computers and software
modules needed to fulfil the expectations that DL End-users and DL
Designers have for the Digital Library.

DL System Administrators perform their tasks by interacting with the DLMS
and relying on the facilities these systems offer for DLS constituent
identification, linking, allocation, deployment, configuration, tuning,
monitoring, alerting, and any other management facility required to manage
potentially distributed software systems, as DLSs are expected to be. Different
DLMSs are expected to offer diverse management facilities, ranging from
manual installation and configuration of the computers and the software
modules on the target computers to fully autonomic solutions aiming at
reducing human intervention to a few corner cases.

DL Software Developers

DL Software Developers develop and/or customise the software components
that will be used as constituents of the DLSs. They are requested to produce the
software implementing every aspect of the Digital Library service, ranging from
the DL Content and User to Functionality, Policy and Quality. However, DL
Software Developers should not start from scratch; their activity is expected
to be performed by relying on what a DLMS offers. In fact, a DLMS is a
software system that is equipped with a set of off-the-shelf software
modules implementing, to some extent, some Digital Library facilities, e.g.
content repositories, user management systems, cooperative working
environments, information retrieval engines and policy enforcement modules.


DL Software Developers include Software Engineers and Programmers who
are requested to customise and complement the set of software modules
provided by the chosen DLMS so as to obtain the set of software constituents
needed to implement the planned Digital Library.

The three roles described
above encompass the entire spectrum of actors working in the digital libraries
universe. Their conceptual models of this universe are linked together in a
hierarchical way, as shown in Figure 6.

This hierarchy is a direct consequence of the above definitions, since DL
End-users act on the Digital Library, whereas DL Managers and DL Software
Developers operate on the DLS (through the mediation of a DLMS) and,
consequently, on the DL as well. This inclusion relationship ensures that
cooperating actors share a common vocabulary and knowledge. For instance,
the DL End-user expresses requirements in terms of the DL model and,
subsequently, the DL Designer understands these requirements and defines the
DL accordingly.

Figure 6 Hierarchies of Users' Views


Advantages and Disadvantages of Libraries: From Past Till Now

PAST: DIGITAL LIBRARIES (DL)

Computers have made revolutionary changes in every field of life; undoubtedly,

the field of education and information has been no different. Importantly,

conventional libraries moved to the concept of digital libraries, which ultimately

made gaining knowledge more efficient and organised. However, an
important point here is that a digital library should stand for more than a
well-organised, centralised store of information [26]. Digital libraries should
also embody the essence of communication, which originally took the form of
face-to-face interaction between people at a conventional library.

Advantages

People can access required information at any time of the day, as long as they

have access to the internet.

Disadvantages

Searching is not efficient, as it may not return results that are meaningful for
the user's query. In many cases, access to certain information is limited
by copyright law.


Data is static; therefore, no users can contribute their views or share their

knowledge with other participants.

Digital libraries should explore using Semantic Web technologies to meet their
organizational challenges. Five key challenges facing digital libraries have been
identified on the basis of a digital libraries workshop: interoperability;
description of objects and repositories; collection management and organization;
user interfaces and human-computer interaction; and economic, social, and legal
issues. Digital
libraries can use Semantic Web technology to help meet all of these challenges.

FROM DL TO SDL

The advent of digital libraries was followed by another innovative step, which
aimed to make searching more meaningful
and direct. Essentially, it was concerned with moving away from the habit of
searching for everything everywhere.

The growth of Web 2.0 has given rise to new methods of accessing
information and contributing opinions. Notably, semantic digital libraries enable
the user to find the intended information about an object even when the exact
word is not present in the search. This integrated form of information is
based on different kinds of metadata, which provide more meaningful data. These
libraries also tend to provide better and more convenient browsing
interfaces.


Figure 7 Evolution of Semantic Digital Libraries

Advantages

Semantic Digital Libraries make it easier to find information in the vast ocean

of available data. This is facilitated by ontology-based search and faceted
search. Access is not confined to a single digital library; on the contrary,
semantic digital libraries provide a
mechanism for interoperability between different systems.
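The two search styles mentioned above can be illustrated on a handful of metadata records. The sketch below is a toy example under stated assumptions: the tiny RELATED_TERMS table stands in for a real ontology, and the record fields and function names (ontology_search, facet_counts) are invented for the illustration.

```python
# A toy stand-in for an ontology: each term maps to the terms treated as equivalent.
RELATED_TERMS = {
    "car": {"car", "automobile"},
    "automobile": {"car", "automobile"},
}


def ontology_search(records, term):
    """Find records about a concept even if they use a different word for it."""
    terms = RELATED_TERMS.get(term, {term})
    return [record for record in records if terms & set(record["subjects"])]


def facet_counts(records, facet):
    """Count records per value of one metadata field, as a facet browser would."""
    counts = {}
    for record in records:
        value = record[facet]
        counts[value] = counts.get(value, 0) + 1
    return counts


# Hypothetical metadata records (illustration only).
RECORDS = [
    {"title": "History of the Automobile", "subjects": ["automobile"], "type": "book"},
    {"title": "Car Design Today", "subjects": ["car", "design"], "type": "article"},
    {"title": "Collected Poems", "subjects": ["poetry"], "type": "book"},
]

if __name__ == "__main__":
    print([r["title"] for r in ontology_search(RECORDS, "car")])
    print(facet_counts(RECORDS, "type"))
```

A plain keyword search for "car" over the subjects would miss the first record; the expansion through related terms is what lets the user find it without typing the exact word.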


Disadvantages

Existing metadata of the digital libraries have to be lifted to a semantic level.

Not all digital libraries, government agencies etc. maintain metadata.

FROM SDL TO SSDL

Semantic digital libraries tend to focus more on the retrieval of
meaningful information than on giving users the opportunity to share their
knowledge. This need subsequently led to the development of social semantic
digital libraries.

Figure 8 Evolution of Social Semantic Information


This is achieved by combining the Semantic Web with collaboration tools on
the web. Social semantic digital libraries complement the existing features of
semantic digital libraries by giving users the opportunity to contribute to the
information. Web 1.0 evolved into a collaborative platform where people could
interact and share information, i.e. Web 2.0. Web 2.0 was promoted by Tim
O'Reilly around the year 2005; it gives ordinary internet users the opportunity
to interact, meet and share information like never before, and involves concepts
such as blogs, wikis and social networking sites.

Advantages

libraries, and accordingly achieve great things.

of

individuals to the other.

Disadvantages

amateurish data by some users.

SSDL AND THE FUTURE

Social Semantic Digital Libraries (and Web 2.0) have made the web

collaborative and interactive; however, one drawback which has become

apparent as a result of this innovation is information overload. Owing to

the growing number of internet users, and thus their growing participation in forums, it has

become difficult to identify the knowledge-bearing part of the content.



Disadvantages

Another drawback of such libraries is that web pages are dynamic but are not

very structured. Notably, Web 2.0 tools enable the shaping of content on the

pages, but not the content itself. Future developments are expected to address

these aspects and make the web more powerful and structured with the advent of

Web 3.0. The basic concept behind Web 3.0 is that of an ontology, defined by

Thomas Gruber as an "explicit specification of a conceptualisation". Another

enhancement foreseen for the future is digital annotation linked with physical

objects, for example in a museum. One application of this technology would be

combined real and virtual tours of a place: for example, starting with a real

guided tour and then (if desired) browsing the virtual context information or

gathering information about other exhibitions on the premises. A further goal

for social semantic digital libraries is to improve user benefits by

strengthening user interfaces and social networking. User identification and

system automation are key points for future social semantic digital

libraries.



Future Prospects

There are inevitable barriers to the Semantic Web that still need to be addressed.

We have mentioned the slow progress on certain features, particularly ontology

and reasoning support, due to the development community not coming to a

consensus. This does not mean that progress cannot be made immediately using

the simpler tools for RDF and RDF Schema available now.
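As a small illustration of what is already possible with RDF and RDF Schema alone, the sketch below applies the RDFS subclass rule (if a resource has type C and C is a subclass of D, then the resource also has type D) to a handful of triples. The `ex:` names and the function `infer_types` are hypothetical, and the triples are held in a plain Python set rather than in a real RDF store.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Made-up example data using a hypothetical "ex:" vocabulary.
triples = {
    ("ex:Thesis", SUBCLASS, "ex:Document"),
    ("ex:Document", SUBCLASS, "ex:Resource"),
    ("ex:item1", RDF_TYPE, "ex:Thesis"),
}

def infer_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, p2, parent in list(inferred):
                is_new = (s, RDF_TYPE, parent) not in inferred
                if p2 == SUBCLASS and c == o and is_new:
                    inferred.add((s, RDF_TYPE, parent))
                    changed = True
    return inferred

closure = infer_types(triples)
types = sorted(o for s, p, o in closure if s == "ex:item1" and p == RDF_TYPE)
print(types)  # ['ex:Document', 'ex:Resource', 'ex:Thesis']
```

A search for all documents would therefore also return the thesis, even though it was never explicitly annotated as a document.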

Some of the larger IT companies are hanging back, waiting to spot the

opportunity and waiting for the research community to settle on standards. Thus

the main impetus is coming from the communities themselves; it is an opportunity

to profoundly affect the way that the world communicates.

There is a good deal of RDF data giving semantic descriptions already on the

Web, both from website owners publishing their own annotations as RDF files

and from sites such as rdfdata.org which provide portals for RDF data.

However, before the Semantic Web can become globally usable, there does

need to be more, and it needs to be more easily available. There is a distinct

overhead to using the Semantic Web in terms of establishing shared

vocabularies and ontologies, and in providing the appropriate annotations to

resources which make them visible to the Semantic Web. This is a non-trivial

task and often users will either not have the time to include this, or the expertise

to do it well. A missing component of the Semantic Web is a simple means to

support this, similar to the editors and tools for the conventional Web.

Undoubtedly the simplicity of HTML, as used within the current Web,

was a major influence on its success, and in order for the Semantic Web to break

out from narrow communities into universal use it needs to become

easy to use and accessible to all.



Otherwise, the Semantic Web is likely to require particular effort and expertise.

This is expensive, and so it may well be confined to particular domains on the

Web which see a strong advantage in its use, although over time as the expertise

becomes more commonplace it should become cheaper. Also, the 'network

effect' can work as both a barrier and an incentive. One of the main advantages

of supplying Semantic Web annotation is that it can be shared and can benefit

others, so when there is little data to share, there is little

incentive to bear the extra expense of sharing; however, once the ball starts to

roll, there is an exponential advantage in combining your own data with others'.

These problems may be less of a disadvantage in the higher education (HE) and further education (FE) sectors, which

have well-integrated communities with stronger control over their resources.

Information science professionals in libraries are available to help with the task

of cataloguing and publishing annotations. Thus it is likely that this sector will

be in the forefront in the use of this technology.

The impact on digital libraries, combined with the Open Access Initiative and

the rise of open archiving, is likely to be quite profound. Libraries become

'value-added' information annotators and collators rather than the archivists of

externally published literature and the holders of the published output of

institutions. The Semantic Web, although not a prerequisite or a motivator for

this change, is nevertheless likely to smooth its development. The tools are in

place for sharing classification schemes and for allowing the community to

develop, deepen and share such schemes. The information infrastructure tools discussed

above will have particular impact on the way students and researchers find

information, so these tools may typically be provided and adapted by libraries,

which will tailor them to the needs of their own users. The Semantic Web, like

the current Web, can be an overwhelming place; libraries are

well-placed to make sense of this for the HE and FE community.

Existing Semantic Digital Library Systems

SIMILE (Semantic Interoperability of Metadata and Information in

unLike Environments): This system focuses on improving the integration

of metadata, services etc. to increase accessibility.

The SIMILE project proposes to create compelling use case demonstrations and

prototypes at the intersection of community or individual information creation

and management, institutional information and digital collections management,

and the Semantic Web data architecture.

It seeks to enable low-cost, scalable interoperability among digital research

collections, the metadata that describes them, and the services that leverage that

metadata and digital material. The project plans to extend DSpace to greatly enhance

DSpace's ability to provide easy-to-use support for arbitrary schemas and metadata,

primarily through the use of RDF and Semantic Web techniques, and to explore

support for distributed research collections. The project grounds its work in

focused, well-defined, real-world use cases in the domains of libraries and

higher education. It also plans to collaborate with other significant projects that

operate in these domains (e.g. ARTstor, JSTOR, OCW, Fedora, Sakai, and

VUE) to improve understanding and use of Semantic Web technologies to

provide scalable data interoperability within and across those platforms.

JeromeDL (Jerome Digital Library): This system can be considered a social

semantic digital library. It is based on the Semantic Web as well as on social

networking, in order to promote collaborative activities alongside the other

common uses of semantic digital libraries. With JeromeDL's social and semantic

services, every library user can bookmark interesting books, articles or other

materials in semantically annotated directories.

Users can share their knowledge with others within a social network. The

standard SSCF browser has been enriched with the ability to bookmark and browse

community-based data. JeromeDL also has a feature which allows it to treat a

single library resource as a blog post. With SIOC-based annotations, users can

comment on the content of the resource and in this way create new knowledge.

JeromeDL also provides various browsing, filtering and navigation solutions,

such as TagsTreeMaps, MultiBeeBrowse and Exhibit.
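The idea of treating a library resource as a blog post that users can comment on might be represented, very roughly, as in the sketch below. It uses terms from the SIOC vocabulary (`sioc:Post`, `sioc:has_reply`, `sioc:has_creator`, `sioc:content`), but how JeromeDL actually models its annotations is not described here, so the data layout, the example URIs and the functions `add_comment` and `comments_on` are assumptions made purely for illustration.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def add_comment(graph, resource_uri, comment_uri, author_uri, text):
    """Record a user comment as a reply to a resource that is treated as a post."""
    graph.update({
        (resource_uri, RDF_TYPE, SIOC + "Post"),
        (comment_uri, RDF_TYPE, SIOC + "Post"),
        (resource_uri, SIOC + "has_reply", comment_uri),
        (comment_uri, SIOC + "has_creator", author_uri),
        (comment_uri, SIOC + "content", text),
    })

def comments_on(graph, resource_uri):
    """Return the text of every comment attached to a resource, sorted."""
    replies = {o for s, p, o in graph
               if s == resource_uri and p == SIOC + "has_reply"}
    return sorted(o for s, p, o in graph
                  if s in replies and p == SIOC + "content")

# Hypothetical resource, comments and users.
graph = set()
item = "http://example.org/item/1"
add_comment(graph, item, item + "#c1", "http://example.org/user/anna", "A useful overview.")
add_comment(graph, item, item + "#c2", "http://example.org/user/ben", "See also chapter 2.")
print(comments_on(graph, item))
```

Because the comments are stored as triples alongside the bibliographic metadata, they can be queried and shared in the same way as any other description of the resource.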

JeromeDL has been installed in a number of locations; the two most used, the

DERI Galway library and WBSS at Gdansk University of Technology (GUT), serve

their communities of users in everyday activities. The DERI Galway library is used by

researchers as a pre-print server to locate and share publications. WBSS

maintains a set of scans of antique books and a number of books written by

lecturers at GUT; the latter are used as learning material.

BRICKS: This system focuses on the basic infrastructure of a digital library

network so that information can be shared amongst users in the cultural heritage

domain. The BRICKS network infrastructure uses the Internet as a backbone

and is made up of decentralized BRICKS Nodes (BNodes), in order to avoid central

points whose failure or overload could stop or slow down the whole network.

BNodes communicate with each other and use available resources for content

and metadata management.

Every BNode knows directly only a subset of other BNodes in the system.

However, if a BNode wants to reach another member that it does not know

directly, it will forward a request to some of its known neighbour BNodes, which will

deliver the request to the final destination or forward it again. BRICKS users

access the system only through a local BNode available at their institution.

Hence every user request is first sent to the institution's BNode, and

the request is then routed via other BNodes to the final destination.

Search requests behave in the same way: the BNode pre-selects a list of BNodes where

a search request can be fulfilled, and then routes the request there. When the

location of the content is known, e.g. as a result of the query, the relevant BNode is

contacted directly.
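The forwarding behaviour described above can be imitated with a short simulation. This is only a sketch under simplifying assumptions: the topology is made up, requests are forwarded breadth-first to all unvisited neighbours, and a visited set prevents loops. The actual BRICKS routing protocol is not specified in this paper, and the function `route` is hypothetical.

```python
from collections import deque

# Hypothetical topology: each BNode directly knows only a few other BNodes.
NEIGHBOURS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}

def route(start, target, neighbours):
    """Return the chain of BNodes a request passes through, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}  # BNodes that have already received the request
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        # Forward the request to every known neighbour not yet visited.
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# A user at institution A requests content held at BNode E.
print(route("A", "E", NEIGHBOURS))  # ['A', 'B', 'D', 'E']
```

Once the path to E is known, a later request could skip the intermediate hops and contact E directly, which corresponds to the direct contact mentioned above.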

Conclusion

Traditional libraries have evolved into an interactive, accessible and

efficient platform which is available to the user at any time of the day. The new

forms of digital libraries, i.e. semantic digital libraries, have been shown to produce

more meaningful results for the user. Further developments in semantic digital

libraries have introduced the contribution of information by users and social

interaction between the contributors. The future can therefore be expected to bring more

promising and efficient mechanisms for handling information.



