term paper - devansh mathur
Embed Size (px)
TRANSCRIPT
-
7/31/2019 Term Paper - Devansh Mathur
1/56
-
7/31/2019 Term Paper - Devansh Mathur
2/56
2
CERTIFICATE
This is to certify that Devansh Mathur student of B. Tech. in Computer Science
& Technology has carried out the work presented in the project of the Term
paper entitled Semantic Digital library as a part of Second Year programme
of Bachelor of Technology in of B. Tech. in Computer Science & Technology
from Amity School of Engineering and Technology, Amity University
Rajasthan, under my supervision.
DATE: Mr Sanjay Jain
Faculty Guide
ASET, AUR
-
7/31/2019 Term Paper - Devansh Mathur
3/56
-
7/31/2019 Term Paper - Devansh Mathur
4/56
4
Content
Abstract.. 5
List of Figures. 6
1. Introduction.. 72. The Use of semantic web in digital libraries.. 10
2.1Resource Development Framework 112.2Web Ontology Language. 13
3. Digital Libraries: Earlier times.. 184. Digital Libraries Evolution: Content Sharing.. 235. The Digital Library Universe: .. .. 29
5.1.A Three-tier Framework.. 295.2.Main concept... 315.3.The Main Roles of Actors. 38
6. Advantages and Disadvantages of Libraries: From Past Till Now... 446.1.Of Past Digital Libraries.. 446.2. Of DL to SDL. 456.3.Of SDL to SSDL.. 476.4.SSDL and the future. 48
7. Future Prospects 498. Existing Semantic Digital Library Systems. 52
8.1. SMILE.. 528.2.JeromeDL. 528.3.BRICKS53
9. Conclusion. 5410.References..55
-
7/31/2019 Term Paper - Devansh Mathur
5/56
-
7/31/2019 Term Paper - Devansh Mathur
6/56
-
7/31/2019 Term Paper - Devansh Mathur
7/56
7
Introduction
Typical digital libraries usually focus on categorizing and cataloguing
resources. Information retrieval in such libraries relies primarily on text search
engines and free browsing. This approach proved to be useful, however it
suffers from ambiguity of natural language, neglecting the importance of
metadata; it also does not engage users in the process of sharing knowledge.Simple searching still returns too many results which have to be filtered
somehow. Page ranking algorithm helps with websites but cannot be easily
applied to books or e-learning objects. On the other hand, having a look on a
friends bookshelf can give us much clearer view on what is worth reading in a
particular domain than digging through a thousand books or websites published
this month. The semantic digital library is an attempt to restore the collaborative
approach to sharing knowledge.
The term Digital Library is currently used to refer to systems that are
heterogeneous in scope and yield very different functionality. These systems
range from digital object and metadata repositories, reference-linking systems,
archives, and content administration systems (mainly developed by industry) to
complex systems that integrate advanced digital library services (mainly
developed in research environments). This overloading of the term Digital
Library is a consequence of the fact that as yet there is no agreement on what
Digital Libraries are and what functionality is associated with them. This results
in a lack of interoperability and reuse of both content and technologies. This
-
7/31/2019 Term Paper - Devansh Mathur
8/56
8
document attempts to put some order in the field for the benefit of its future
advancement.
Libraries, together with archives, have always been the primary institutions
delegated to manage collect, preserve and diffuse human knowledge and
culture. When advances in computer science allowed dealing with digital
representation of documents dedicated to capture human knowledge and culture
rather than printed ones, libraries were particularly involved in exploiting the
potential of the digital revolution. Thus digital libraries soon became the term
to indicate the digital counterpart of traditional libraries. However, digital
library systems have greatly evolved since their early appearance. Today they
have become complex networked systems able to support communication and
collaboration among different worldwide distributed communities, dealing with
digital objects comprising not only the digital counterpart of printed
documents, but also images, video, programs and any other kind of multimedia
objects a community may define as appropriate to its working andcommunication needs. The evolution of digital libraries (DLs) has not been
linear, coming from the contribution of many disciplines. This has created
several conceptions of what a DL is, each one influenced by the perspective of
the primary discipline of the conceiver(s) or by the concrete needs it was
designed to satisfy.
As a natural consequence, the history of Digital Libraries, which is nowapproximately twenty years long, is the history of a variety of different types of
information systems that have been called digital libraries. These systems are
very heterogeneous in scope and functionality and their evolution does not
follow a single path. In particular, when changes happened this has not only
meant that a better quality system was been conceived superseding the
-
7/31/2019 Term Paper - Devansh Mathur
9/56
9
preceding ones but also meant that a new conception of digital libraries was
born corresponding to new raised needs. As it will be seen, most of the systems
dealt with in this history are still living in their original conception, even though
not in their original technological solutions.
The rest of this chapter goes back over this history, giving an account of past
and present understanding of these kinds of systems and on-going work in the
area. The chapter concludes with a vision of the impact that new DLs are
expected to have in the near future.
-
7/31/2019 Term Paper - Devansh Mathur
10/56
10
The Use of Semantic Web in Digital Libraries
The Semantic Web is a concept that was first introduced by Tim Berners-Lee.
Berners-Lee, Hendler, & Lassilas (2001) vision of the Semantic Web is that it
will bring structure to the meaningful content of Web pages, creating an
environment where software agents roaming from page to page can readily
carry out sophisticated tasks for users . This vision is supported by the
combination of metadata schemes, the use of the Resource Description
Framework (RDF), and the creation of ontologies. The ultimate goal of the
Semantic Web is to allow machines to understand the meaning of digital
objects, rather than just the key words used to describe them. This will
revolutionize the search and retrieval of digital objects, a key function for
digital libraries.
One of the main benefits of the Semantic Web is that it operates within the
current Web environment to add logic and meaning to digital objects. In their
review of an experiment to enhance the semantic interoperability of two digital
collections, Angjeli and Isaac (2009) state One of the key ideas of the semantic
web approach is to open and interconnect the meaning capitalized within
existing metadata. This idea is accomplished through the use of a combination
of XML, the Resource Description Framework (RDF), and the creation of
ontologies. XML or eXtensible Markup Language is a computer language that
allows users to create their own tags to annotate Web pages.
Many digital objects contain metadata, or information contained within the
document that defines information about the document such as the author,
publisher and date created. This metadata is often written using XML which
-
7/31/2019 Term Paper - Devansh Mathur
11/56
11
gives digital objects a similar framework, regardless of which metadata scheme
is used within the object. Although XML can define structure within a digital
object, it does not convey meaning to computers.
Figure 1 semantic web stack
Resource Description Framework (RDF)
The next step in the Semantic Web process is to add meaning to the XML data
found within digital objects. This is accomplished through the creation of RDF
models and schemas.
RDF models are composed of three parts with the basic form of subject-
predicate-object, and are often referred to as triplets. The subject is a resource
which can essentially be any digital object. The predicate is a property, such as
has author which signifies that the resource has an author. The object can be
another property or a class of that property, in our example this could be
Jones. The complete RDF model would be Object A has author Jones. This
designates to the computer that the object was authored by Jones.
-
7/31/2019 Term Paper - Devansh Mathur
12/56
12
Each part of the triplet can be assigned a unique universal resource identifier
(URI), which directs the computer, to that specific digital object. As pieces of
information are added to this data model it can be written in XML andtransmitted across applications. The RDF schema is the set of rules that define
the way RDF document must be written.
The benefits of using RDF are tremendous. As McCarthie Nevile and Mendez
state, If you give an RDF processor two sets of statements about the same
thing (a thing identified by the same URI), it just treats them as one collection
of statements. This means that a web of meaning can be created about aspecific object, and that this web of meaning can have multiple contributors.
Person
Student Researcher
subClassOfsubClassOf
Jeen
Type
hasSuperVisorDomain Range
Frank
Type
Has Supervisor
-
7/31/2019 Term Paper - Devansh Mathur
13/56
13
Web Ontology Language
The final piece of the Semantic Web is the use of ontologies. Ontologies
serve as the backbone of the Semantic Web by providing vocabularies and
formal conceptualization of a given domain to facilitate information sharing and
exchange . Ontologies are basically sets of logical rules that define the
relationships between sets of concepts. Ontologies can also be used to define the
concepts and the XML codes used to designate them within Web pages.
Ontologies are usually domain specific, and attempt to exhaustively define
concepts and map their relationships. The goal of ontology is to reinforce the
formal logic and tighten the meaning to the point where it can no longer
escape correct computer interpretation . The combination of XML, RDF, and
ontologies makes it possible for computer programs and agents to interpret
digital objects based on their meaning and their relationship to other concepts.
The benefits of using these technologies in digital libraries are myriad.
Furthermore, digital librarians already possess a skill set that enables them to
work with Semantic Web technology. Digital librarians are already working
with XML to create metadata, and the creation of ontologies is akin to the
development of a thesaurus.
There are many digital library projects already making use of these
technologies. One of the greatest benefits of the Semantic Web is the ability to
relate resources by their meaning rather than their form. This allows for the
linking of concepts from different vocabularies, freeing users from the confines
of any particular vocabulary.
-
7/31/2019 Term Paper - Devansh Mathur
14/56
14
The best example of the use of ontologies to achieve this goal is the Unified
Medical Language System (UMLS) developed by the National Library of
Medicine. The UMLS is comprised of three parts: a Meta thesaurus which
contains over one million biomedical concepts from over 100 source
vocabularies; a Semantic Network which defines 133 broad categories and
fifty-four relationships between categories for labelling the biomedical
domain; and a SPECIALIST Lexicon & Lexical Tools which provide lexical
information and programs for language processing (National Library of
Medicine, 2010).
The UMLS is distributed with software and tools that enable a user or
organization to use the UMLS to do research in the biomedical field. The
UMLS is essentially a comprehensive ontology of the biomedical field, and can
be used by a digital library who serves users with an interest in the biomedical
field. The UMLS is the perfect example of the potential for interoperability
created by using Semantic Web technologies. The National Library of Medicine
is arguably the well-respected source of biomedical resources, and their UMLScan be used by any single user or digital library to enhance the quality of their
searching or to build a search and retrieval interface.
Another example of a collaborative digital library project using Semantic Web
technology is Europeana , an interface for searching the digital resources of
Europe's museums, libraries, archives and audio-visual collections
(Europeana). Europeana allows users to explore these items and share thoseusing social media outlets. Europeana has also developed a Semantic Searching
Prototype which currently contains items from the Rijksmuseum, the Louvre,
and Netherlands Institute for Art History. The Semantic Search interface is a
simple search bar that allows users to search for concepts represented by digital
objects from these institutions.
-
7/31/2019 Term Paper - Devansh Mathur
15/56
15
Performing a search for love returns 504 items that are listed under the
following headings: works showing concept, works titled, works showing a
more specific concept, works showing a related concept, works from type a
related concept, and other. This breakdown gives the user a unique perspective
on how a Semantic Web search differs from a traditional search. When the user
clicks on a resource, a window pops up that gives the user information about the
resource including date, material, relation, subject, title, and type. These fields
correspond to many metadata schemes. The user can also link to the page from
the original institution that contains the item, and to a Europeana created page
for that item. The Europeana page allows users to download RDF files relating
to specific data about the item. The Europeana Semantic Search Prototype
shows how Semantic Web technology can be used to integrate resources from
multiple sources, to allow users to access these resources, and to facilitate the
sharing of Semantic Web data for each resource.
There are also many Semantic Web resources that are not directly connectedwith a traditional digital library that can potentially be used in digital library
interfaces. One such resource is Friend of a Friend Project or FOAF. This
project provides users with a way to create an RDF file that identifies the user
as a unique individual. This project is similar to the authority files created in
traditional libraries to differentiate between authors with the same name. FOAF
files can designate a wealth of information about a person. They can begenerated using XML/RDF by users or can be generated through social
networking sites. Digital libraries could allow users to create profiles, and then
translate these profiles to FOAF files for use within the digital library interface.
Digital libraries should provide users with services beyond simple search and
retrieval of items, and allowing users to customize their interface and track
-
7/31/2019 Term Paper - Devansh Mathur
16/56
16
search history is one way to accomplish this goal. A 2009 study by Jiang and
Tan attempted to use Semantic Web technology to create user ontologies to help
provide personalized information services. Jiang and Tan (2009) explain that
user ontology is a specialization of domain ontology by assigning each concept
and relation of the domain ontology with a specific value for indicating a users
interest.
This user ontology allows for ranking of retrieved search items by user
preferences. The authors required participants to submit a search query to a
system loaded with documents from the ACM digital library. The participants
then were asked to browse the top 30 documents retrieved and rank them based
on relevance.
This data was loaded into that users ontology based on the concepts included in
preferred and non-preferred documents. After the creation of the user ontology,
the data was loaded into the search interface. The participants then searched for
a new query, and the search system ranked the documents based on preferencesfrom the user ontology. These results were compared to a search query of the
same system not loaded with the user ontology. The authors found that the
system loaded with the user ontologies returned more precise results than the
system that was not programmed for user preference. The authors state User
ontology can be used in many ways to support Semantic Web applications,
including document re-ranking, information filtering, and query expansion.
The implications of this research are that digital libraries can employ Semantic
Web technology to track user preferences in a new way, and can seamlessly
integrate these preferences with current search capabilities.
-
7/31/2019 Term Paper - Devansh Mathur
17/56
17
Semantic Web technologies have been integrated with digital libraries to meet
the needs of digital library users and to meet the goals of digital library
organizations. The greatest asset of the Semantic Web is that it allows for
interoperability between organizations and information systems, for even
agents that were not expressly designed to work together can transfer data
among themselves when the data comes with semantics. Another asset of the
Semantic Web is that users can create information about digital objects using
any language, classification scheme, or metadata scheme. Once this information
has been linked to that objects DOI, the computer will integrate each piece of
information scattered throughout the Web to create a more complete record of
that object. 8 The Use of Semantic Web Technologies in Digital Libraries
Digital librarians should be leading the charge for the creation of RDF schemas
and ontologies for digital objects because they have a combination of traditional
library skills such as content representation and cataloguing along with
technological skills such as metadata creation. Digital libraries must use these
technologies to improve all aspects of their organizations.
The ideal digital library would use Semantic Web technology as the backbone
of its entire organization. All digital objects contained within their collections
would have RDF documents associated with them. The collections would all be
pre-loaded with ontologies related to their specific domain. Digital libraries
would contain resources from organizations around the world. Users would beable to log in and create profiles that would translate to FOAF files and user
ontologies to keep track of semantic preferences. The search and retrieval
functions would benefit from the ability to search semantically along with the
traditional search methods the library currently employs. Semantic Web
technology essentially teaches the computer the relationships between digital
-
7/31/2019 Term Paper - Devansh Mathur
18/56
18
objects, their meaning, and how users would like to interact with them. The
technology is already in place and the ideal digital library is only a few
keystrokes away.
DIGITAL LIBRARIES: THE EARLY TIMES
The digital library concept can be traced back to the famous papers of foreseer
scientists like Vannevar Bush and J.C.R. Licklider identifying and pursuing the
goal of innovative technologies and approaches toward knowledge sharing as
fundamental instruments for progress. Bush (Bush, 1945) devised
A device in which an individual stores all his books, records, and
communications, and which is mechanized so that it may be consulted with
exceeding speed and flexibility.
Moreover, on top of it there is a transparent platen. On this are placed
longhand notes, photographs, memoranda, all sorts of things. Because of the
lack of digital support, he identified in improved microfilm the means for
content storage and exchange: contents are purchased on microfilm ready for
insertion. Books of all sorts, pictures, current periodicals, newspapers, are thus
obtained and dropped into place. Of course, he envisaged also support forknowledge discovery (provision for consultation of the record by the usual
scheme of indexing),access (to consult a certain book, he taps its code on the
keyboard, and the title page of the book promptly appears before him) and
management (new forms of encyclopaedias will appear, ready made with a
mesh of associative trails running through them, ready to be dropped into the
-
7/31/2019 Term Paper - Devansh Mathur
19/56
19
memex and there amplified). Licklider realized thatcomputers were getting to
be powerful enough tosupport the type of automated library systems thatBush
had described and in 1965, wrote his bookabout how a computer could provide
an automated library with simultaneous remote use by many different people
through access to a common database. Because of this, Licklider is also
considered a pioneer of Internet and in its book he established the connection
between Internet and digital library. Thus, it is not surprising that research and
development activity on digital libraries started in the early 1990s, with the
Internet proliferation, and that Internet has created unprecedented possibilities
to discover and deliver human knowledge. The first systems delivering
knowledge artefacts in digital form can essentially be seen as archives of digital
texts accessible through a search service and implemented by a centralized
metadata catalogue.
The early ones of such systems were constructed on a rather simple
architecture, with the exception of very few cases. This worked to the advantage
of their diffusion and adoption by different scientific communities. BesidesarXiv, significant examples of such early systems were archives of various type
like Electronic Thesis and Dissertations repositories (ETDs), whose pilot project
started in 1996 , and archives of cognitive sciences papers and of research
papers in economics both launched in 1997. The former was a system which
was offering services for submitting, browsing and searching electronic thesis in
PDF format. The availability of this product stimulated the creation of theNetworked Digital Library of Theses and Dissertations international
organization, still operational, which registers and keep track of ETDs. Cog
Prints, was initially conceived as repository allowing the cognitive science
community to self-archive their papers. It now contains more than 3,000
artefacts starting from 1950. In 2000 it was made compliant with the protocol
-
7/31/2019 Term Paper - Devansh Mathur
20/56
20
defined by the Open Archives Initiative and then its software was converted
into the EPrints Digital Repository Software (EPrints, (n.d.)), a flexible platform
supporting easy and fast set up of repositories of open access research outputs.
Because of its simplicity, EPrints is currently widely used, more than 250
repositories declared to rely on it.
Similarly, RePEc was initially conceived as an open repository of electronic
papers in a specific domain. Thomas Krichel, principal investigator of the
RePEc Project, in 1997 illustrated the principles underlying a new realised
version of this system by affirming Distributed archivesshould offer metadata
about digital objects (mainlyworking papers); the data from all archives should
form one single logical database despite the fact that it should be held on
different servers; users could access the data through many interfaces;
providers of archives should offer their data to allinterfaces at the same time.
Krichel, with these statements was anticipating a view that would have largely
emerged few years later. These systems all still living in more recent and
enhanced versionsrepresent very embryonic forms of digital libraries. In fact,their functionality is essentially confined to (self) publishing of simple
information objects and discovery of these information objects through
rudimentary search and browse facilities.
The Digital Library Initiative (DLI) consisted of two major competitive fundingprograms, the first of which started in 1994 and funded six research projects
(chosen among 73 proposals) over a four-year period (Schatz and Chen,1996)
while the second phase was dedicated to extend the research carried out during
the previous phase by including content providers thus to guarantee the
availability of real test bed to validate research outcomes. However, the DLI
-
7/31/2019 Term Paper - Devansh Mathur
21/56
21
funded projects have not been the only ongoing efforts (CACM, 1995) even if
they were very innovative because they focused on future technological
problems. The six projects funded by DLI phase one were: the California
Environmental Digital Library (Wilensky, 1995) focused on developing the
technologies to access large, distributed collections of photographs, satellite
images, videos, maps, documents, and multivalent documents and to support
work-centred digital information services (Wilensky, 1996); the Alexandria
Digital Library (Smith and Frew, 1995) focused on building an online,
distributed digital library for geo-referenced1 information, including maps,
aerial photographs, satellite imagery, and catalogue records, and on supporting
geographically defined queries (Smith, 1996); the InformediaDigital Video
Library (Christel, Kanade, Maudlin, et al., 1995) focused on establishing a
large, online digital video collection with full-content and knowledge-based
search and retrieval (Wactlar, Kanade, Smith and Stevens, 1996)
Despite none of these systems exist anymore as a running service2, the
solutions proposed, the technology developed as well as the resources collectedand built have been largely used by more complex DLs developed later. It is
well known that one of the most important success stories resulting from these
projects is Google. Page and Brin started working on their search engine while
being PhD Students at Stanford working on the Stanford Digital Library Project.
Actually, the Digital Library Initiative merits goes far beyond the specific workthat it funded and we can affirm that it gave shape to digital library as a new
research discipline. Research in digital library topics was not new but it had
been fragmented across many disciplines. This program led to conferences,
publications and researcher, teams explicitly interested in doing research in
digital libraries. Moreover, it gave directions to the overall movement toward a
-
7/31/2019 Term Paper - Devansh Mathur
22/56
22
practical research field.(Arms, 2001) As anticipated, in Europe the scene was
characterised by the existence ofDELOSinitiatives.
In addition, it was intended to develop new models for intelligent audio-
visual content-based searching and film-sequence retrieval, new video
abstracting tools, and user interfaces specifically tailored to the new
functionality. The provision of multilingual services and cross language
retrieval tools was also addressed. Another project, i.e. AnIntegrated Art
Analysis and Navigation Environment(ARTISTE) (Allen, Vaccari and Presutti,
2000), focused on giving providers, publishers, distributors, rights protectors
and end users of art images information, as well as the multi-media information
market as a whole, a more efficient system for storing, classifying, linking,
matching and retrieving art images. This environment was providing, for
example, automatic extraction of metadata based on iconography, painting style,
etc; content-based navigation for art documents; distributed linking and
searching across multiple archives allowing ownership of data to be retained;
and storage of art images using large multimedia object relational databases.
According to these principles: the functionality of a digital library system were
available in the form of distinct functional units, each exposing its operational
semantics through an open protocol; digital library systems are compositions of
these functional units and new functionality can be added through the
implementation of value-added services, which interact with existing othersusing established protocols; the components (and content) of a digital library
could be spread over the global Internet, but should be presented to the user as a
single system.
-
7/31/2019 Term Paper - Devansh Mathur
23/56
23
DIGITAL LIBRARIES EVOLUTION:CONTENT
SHARING
The construction of digital libraries similar to those just described was very
resource-consuming since, for each new one, both the content and the software
providing its functionality were built from scratch. At the end of the 1990s, the
experiences of using distributed architectures to implement proper digital
libraries and the proliferation of independent repositories of valuable content
stimulated the idea of reusing content already collected (and curated) in existing
independent repositories so as to reduce the effort to build large-scale digital
libraries. However, many obstacles were to be solved to fully implement this
solution. The major of them was certainly how to implement repository service
interoperability, i.e. the capability of seamlessly accessing and using the content
managed in distributed and heterogeneous repositories. Approaches based on
cross-searching multiple archives based on a common protocol, such as
Z39.5010,, (Miller, P., 1999) were considered at the time costly and hardly
scalable. A very important meeting toward the interoperability of electronic
repositories was organised in Santa Fe, New Mexico, on October 1999, with the
goal to establish recommendations and mechanisms to facilitate cross-archive
value-added services.
This meeting led to the Santa Fe Conventiona combination of organizational
principles and technical specifications to facilitate a minimal but potentially
-
7/31/2019 Term Paper - Devansh Mathur
24/56
24
highly functional level of interoperability among scholarly e-print archives
and to the establishment of the Open Archives Initiative. (Van de Sompel, H.,
Lagoze, C., 2000) The meeting started by discussing a concrete example of
interoperability implemented through the UPS Prototype (Van de Sompel, H.,
Krichel, T., Nelson, M.L., 2000) and recognising its potentialities. The UPS
prototype demonstrated the integrated action of a variety of services operating
over data originating from a set of archives. Each of those services provided a
reasonably rich level of functionality (accessible through a set of protocol
methods). The participants recognised that trying to reach consensus on the full
functionality of the prototype was aiming too high and that a proper degree of
modesty in the approach toward integration capable to balance the cost of
participation with the need for adequate functionality was mandatory. The Santa
Fe Convention identified two key roles in participating institutions: data
providers and service providers. Data providers were in charge to handle the
depositing and publishing of resources in a repository and expose for
harvesting the metadata (what they called record) about resources in therepository. They were the creators and keepers of the metadata and repositories
of resources. Service providers were in charge of harvesting metadata from data
providers for the purpose of providing one or more services over the collected
data. The types of services that might be offered included a search interface,
peer-review system, etc. The cooperation between content and service providers
was regulated by a protocol, initially defined as a subset of the Dienst protocoland nowadays known as the Open Archive Protocol for Metadata
In the US, the National Science Foundation funded the National Science Digital
Library(NSDL) (Zia, L.L., 2001) with the aim to provide organized access to
high quality resources and tools that support innovations in teaching and
-
7/31/2019 Term Paper - Devansh Mathur
25/56
25
learning at all levels of science, technology, engineering, and mathematics
education.
These large-scale initiatives devoted to aggregate in a single place knowledge
that is spread across a plethora of archives and systems will ever exist for a
series of reasons including the existence of various (institutional) repositories
and the ever growing multidisciplinary nature of our society. In particular, TEL
and DARE anticipated important initiatives, namely, Europeana and DRIVER,
respectively, which were launched few years later. Europeana14 is a Thematic
Network funded by the European Commission under the eContentplus
programme, as a part of the i2010 initiative15. Europeana began in July 2007.
Originally known as the European digital library networkEDLnet it is the
result of a partnership of 100 representatives of heritage and knowledge
organisations and IT experts from throughout Europe. Objective of Europeana
is to provide access to Europes cultural and scientific heritage through a cross-
domain portal. The first Europeana prototype, launched in November 2008,
provided simple search and retrieval facility on an information space ofapproximately two millions of digital objects selected from Europes museums,
libraries, archives and audio-visual collections, harvested through the OAI-
PMH protocol. The first production quality version of Europeana (called Rhine)
will go live on July 2010, to be followed in April 2011 by a more sophisticate
version (Danube), including more contents and offering a richer set of
functionality. The intention is that by 2010 the Europeana portal will giveeverybody direct access to well over 6 million digital sounds, pictures, books,
archival records and films. Moreover, Europeanas goal is to realize a system
serving very different type of users. It should meet occasional curiosity of
generic users as well as the information needs of school children and students. It
should also provide academic students and teachers with certified information
-
7/31/2019 Term Paper - Devansh Mathur
26/56
26
and the possibility to export information for courses, as well as offer expert
researchers and professional the possibility of searching, verifying and
annotating information and using ad-hoc services. In the context established by
Europeana, special types of providers are the aggregators, i.e. specialised DLs
that act as collectors of content from other providers. For instance, Culture.fr is
the largest aggregator, providing content from about 480 organizations in
France, including the Louvre and the Muse dOrsay. The information resources
that populate Europeanas information space are harvested as surrogates of the
original objects that are located at content providers sites. Since surrogates may
also contain elements of the original object (table of contents, full text index
items, music and video abstraction etc.), the very interesting new feature
of Europeana is that it will also deliver digital objects besides metadata. Clearly,
heterogeneity and interoperability are main issues that such a
DL is having to deal with, as well as, of course with scalability, quality of
service and, more in general, sustainability of the joint portal.
DRIVER16is another notable example of a DL that relies on content providedby a large number of external data providers. It is the result of two subsequent
projects funded by the European Commission in the period 2006-2009. The
main aim of these two projects is to create the organisational and technological
conditions for the set up of a European Repository Infrastructure (Jones, S.,
Manghi, P., 2009). The main instrument identified by the project to address
organisational issues is the DRIVER Confederation17. The Confederationpartners represent European and international repository communities, like
subject based communities, repository system providers, service providers, as
well as political, research, and funding organisations, who share the DRIVER
vision to allow all research institutions in Europe and worldwide to make all
their research publications openly accessible through institutional repositories.
-
7/31/2019 Term Paper - Devansh Mathur
27/56
27
In the spirit of this shared goal, the DRIVER confederation encourages a
combined effort of repository development by setting up guidelines and best
practices that favour the realization of a shared, trusted, long-term repository
infrastructure. From the technical point of view, DRIVER is based on theD-Net
technology18. This enabling technology is quite innovative in the context of
these kinds of aggregative systems because it is oriented to the realisation of a
digital library infrastructure (cf. Sec. 5). D-Net is based on a Service-oriented
architecture, where distributed and shared resources are implemented as
standard Web Services and applications consist of sets of interacting services. It
offers services to both data providers, that through it can more easily
share their content, and service providers, that are facilitated in implementing
DLs that exploit the aggregated content.19 At the time of this writing, the
DRIVER service provides access to approximately one million records out of
200+ repositories across 27 countries. Moreover, it delivers three DL
applications: the Belgium national repository portal, offering search over the
Belgium Repository Federation subset; Recolecta national repository portal,offering search on the Spanish Repository Federation subset; and the main
DRIVER portal, providing access and advanced functionality over the whole
space. The current Europeana and DRIVER services operate an information
space of metadata records, i.e. they harvest metadata records through the
OAI-PMH protocol from exiting repositories and then they run their services by
exploiting this content. Because of this they suffer from the limitations thatOAI-PMH poses if it has to be used to exchange information objects that are
rich in structure and payload as those at the core of changing nature of
scholarship and scholarly communication.(Van de Sompel, H., Payette, S.,
2004)(Van de Sompel, H., Lagoze, C., 2006) In particular, when feasible, they
give access to the content associated with the metadata by exploiting URL or
-
7/31/2019 Term Paper - Devansh Mathur
28/56
28
some other information contained in the record. This solution to access
information objects, however, suffers of two main problems:
(i)the access is not always feasible since there is no standard protocol toaccess objects;
(ii) There is no way of accessing compound objects since the structure and
the relation holding among the different parts is unknown.
A solution to this problem may come from the OAI-ORE20 standard, whose
version 1.0 has been released in October 2008 by the Open Archives Initiative.
This standard, based on Web standards, proposes a solution to handle
aggregations of Web resources. These aggregations, sometimes called
compound digital objects, may combine distributed resources having multiple
media types including text, images, data, and video as to form innovative
research outcomes. Both Europeana and DRIVER have already planned to
move very soon to technologies la OAI-ORE to manage compound objects.
All the systems and initiatives described in this section are essentially oriented
to contentsharing. Moreover, the majority of them is characterisedby a strong organisational effort since the model is based on a cooperative
participation of the content providers. Content sharing across digital libraries is
now being largely promoted as an important strategy to reduce the digital
library set up costs largely coming from selecting, digitising, describing, and
digitally curating content resources. However, the realisation of wide and
generalised content sharing is today still problematic due to the great variety ofproprietary models and ontologies adopted by existing systems and by the lack
of systematic approach to interoperability a recently funded EC project
stemming from the DELOS project, is paving the way for the future
interoperability of DL systems thus making feasible the implementation of
global digital library infrastructures.
-
7/31/2019 Term Paper - Devansh Mathur
29/56
29
The Digital Library Universe
A Three-tier FrameworkA Digital Library is an evolving organisation that comes into existence through
a series of development steps that bring together all the necessary constituents.
Figure below presents this process and indicates three distinct notion of the
`systems` developed along the way forming a three tier framework: Digital
Library, Digital library system and Digital library management system. These
correspond to three different levels of conceptualisation of the universe of
Digital Libraries.
Figure 2 DL, DLS and DLMS: A Three-tier Framework
These three system notions are often confused and are used interchangeably inthe literature this terminological imprecision has produced a plethora of
heterogeneous entities and contribute to make the description, understanding
and development of digital library systems difficult.
-
7/31/2019 Term Paper - Devansh Mathur
30/56
30
As Figure indicates, all three systems play a central and distinct role in the
digital library development process. To clarify their differences and their
individual characteristics, the explicit definitions that follow may help.
Digital Library (DL)
A potentiallyvirtual organisation,thatcomprehensivelycollects,managesand
preserves for the long depth of time rich digital content, and offers to its
targeted user communitiesspecialised functionality onthatcontent,of Defined
quality and according to comprehensive codified policies.
Digital Library System (DLS)
A deployed software system that is based on a possibly distributed architecture
and provides all facilities required by a particular Digital Library. Users interact
with a Digital Library through the corresponding Digital Library System.
Digital Library Management System (DLMS)A generic software system that provides the appropriate software infrastructure
both (i) to produce and administer a Digital Library System incorporating The
Suite of facilities considered fundamental for Digital Libraries and (ii) to
integrate additional software offer in more refined specialised or advanced
facilities. Although the concept of Digital Library is intended to capture an
abstract system that consists of both physical and virtual components the DigitalLibrary System and the Digital Library Management System capture concrete
software systems. For every Digital library there is a unique digital library
system in operation (possibly consisting of man inter connected smaller digital
library systems) whereas all Digital Library systems are based on a handful of
Digital library management systems. The digital library is thus the abstract
-
7/31/2019 Term Paper - Devansh Mathur
31/56
31
entity that `lives` thanks to the software system constituting the DLS and the
DLMS is the software system that is conceived to support the life cycle of one
or more DLS. It is important to note that all these concepts underlie all types of
information environment and systems, e.g. database, hospital info systems,
Banking systems, the web, Wikipedia,, etc. however it is the particular
characterizations given in the definitions of the previous section that
distinguishes digital libraries from the others: the content should be rich,
annotated and preserved for depth of time the user communities should be
targeted, the functionality should be specialised the quality should be measured
and according to the comprehensive policies . All of these characterizations, of
course are abstract and subject to interpretation, so they cannot lead to a precise
formal definition. Nevertheless they offer conceptual yardsticks by which
system can be measured and mutually compared and psychological lower
bounds can be established regarding the nature of digital libraries.
Main ConceptsDespite the great variety and diversity of existing digital libraries, there is a
small number of fundamental concepts that underlie all systems. These concepts
are identifiable in nearly every digital library currently in use. They serve as a
starting point for any researcher who wants to study and understand the field ,
for any system developer intending to construct a digital library, and for any
content provider seeking to expose its content via digital library technologies. In
this section, we identify these concepts and briefly discuss them. Seven core
concepts provide a foundation for digital libraries. One of them appears in the
definition of digital library to capture the commonalities between this universe
and other social arrangements: Organisation.
-
7/31/2019 Term Paper - Devansh Mathur
32/56
32
Five of them appear in the definition of digital library to capture the features
characterising this kind of organisation and the expected service: Content,
User, Functionality, Quality and Policy. The seventh one emerges in the
definition of the digital library system to capture the systemic features
underlying the expected service:Architecture.
All seven concepts influence the digital library three-tier framework, as shown
in fig.
Figure 3 the Digital Library Universe: Main Concepts
Organisation
The organisation concept is surrounding the entire Digital library universe. A
digital library is a kind of organisation by its own, it is a social arrangement
pursuing a well-defined goal (the digital library service). This concept subsumes
the mission the digital library has been conceived for and every other aspect that
is needed to define this mission and the operation of the resulting service.
However, this should not be confused with the organisation/institution that
-
7/31/2019 Term Paper - Devansh Mathur
33/56
33
decided to set up the digital library and drive its development although there are
overlaps and dependencies between the two. It is quite easy to recognise the
dependency relationship between the two, to some extent the institution sets the
scene for the digital library organisation, the institution is the establisher of the
Digital LibraryOrganisation and has the power to define the overall service this
organisation is requested to realise. However, the digital library, being an
organisation by its own has the power to control its own behaviour and
evolution in the frame defined by the institution. This concept is fundamental to
characterise the digital library universe because it highlights the commonalities
between this universe and the other one dedicated to capture organised body of
people having a particular purpose.
Content
The content concept encompasses the data and the information that the Digital
library handles and makes available to its users. It is composed of a set of
information objects organised in collections. Content is an umbrella conceptused to aggregate all forms of information objects that a digital library collects,
manages and delivers. It encompasses a diverse range of information objects,
including primary objects annotations and metadata.
This concept is fundamental to characterise the digital library universe because
it captures one of the major resource these organisations are called to manage,
i.e. the data and the information that is made available through it.
User
The User concept covers the various actors (whether human or machine)
entitled to interact with Digital Libraries. Digital Libraries connect actors with
information and support them in their ability to consume and make creative use
-
7/31/2019 Term Paper - Devansh Mathur
34/56
34
of it to generate new information. User is an umbrella concept including all
notions related to the representation and management of actors entities within a
digital library. It encompasses such elements as the rights that actors have
within the system and the profiles of the actors with characteristics that
personalise the systems behaviour or represent these actors in collaborations.
This concept is fundamental to characterise the Digital Library universe because
it captures the actors of the overall Organisation.
Functionality
The Functionality concept encapsulates the services that a Digital Library offers
to its different users, whether individual users or user groups. While the general
expectation is that Digital Libraries will be rich in functionality, the bare
minimum of functions includes new information object registration, search and
browse. Beyond that, the system seeks to manage the functions of the Digital
Library to ensure that the overall service reflects the particular needs of the
Digital Librarys community of users and/or the specific requirements related toIts Content. This concept is fundamental to characterise the digital library
universe because it captures the facilities offered by the overall organisation.
Policy
The Policy concept represents the set or sets of conditions, rules, terms, and
governing every single aspect of the digital library service including acceptablebehaviour, digital rights management, privacy and confidentiality charges to
users, and collection formation. Policies may be defined within the digital
library or be superimposed by the institution establishing the digital library or
outside of that (e.g., Policy governing our society). There policies can be
extrinsic or intrinsic policies.
-
7/31/2019 Term Paper - Devansh Mathur
35/56
35
This concept is fundamental to characterise the Digital Library universe because
it captures the rules and conditions regulating the overall Organisation.
Quality
The Quality concept represents the parameters that can be used to characterise
and evaluate the overall service of a Digital Library including every aspect of it,
i.e. Content, User, Functionality, Policy, Quality, and Architecture.
Quality can be associated not only with each class of content or functionality
but also with specific information objects or services. Some of these parameters
are quantitative and objective in nature and can be measured automatically,
whereas others are qualitative and subjective in nature and can only be
measured through user evaluations (e.g., focus groups). This concept is
fundamental to characterise the Digital Library universe because it captures
qualitative aspects characterising the Organisation.
Architecture
The Architecture concept refers to a Digital Library System and represents a
mapping of the overall service offered by a Digital Library (and characterised
by Content, User, Functionality, Policy and Quality) on to hardware and
software components. There are two primary reasons for having Architecture asa core concept:
(i)Digital Libraries are often assumed to be among the most complex andadvanced forms of information systems (Fox & Marchionini, 1998);
And
-
7/31/2019 Term Paper - Devansh Mathur
36/56
36
(ii)Interoperability across Digital Libraries is recognising as a majorchallenge.
A clear architectural framework for Digital Library Systems offers ammunition
in addressing both of these issues effectively. This concept is fundamental to
characterise the Digital Library universe because it captures the systemic part of
the service offered by the Organisation. The concepts populating the areas just
introduced (Organisation Is a special case since it subsumes all the rest) share
many similar characteristics and all refer to internal entities of a Digital Library
that can be sensed by the external world. Therefore, there has also been
introduced a higher--level concept referring to all of these, i.e.,
Resource, which enables us to reason about the common characteristics in a
consistent manner.
Figure puts in perspective the main concepts of the Digital Library universe.
The Organisation concept surrounds and subsumes all the other concepts.
Among the remaining six, two of them are independent each other, i.e., they
exist independently of a specific Digital Library. These are User, Representing
the external humans or hardware interacting with the Digital Library and
Content, representing the material handled by the Digital Library. Architecture,
representing the technological design on which the Digital Library System is
based, represents the underlying technology that is called to implement all the
rest. On top of these concepts there comes Functionality, primarily representing
the means for connecting Userto Content, i.e., all procedures, transformations,actions and interactions that bring Content to User Or vice versa. Finally,
operation of the Digital Library and activation of its Functionality are based on
Policy and aim to achieve certain Quality.
-
7/31/2019 Term Paper - Devansh Mathur
37/56
-
7/31/2019 Term Paper - Devansh Mathur
38/56
38
The Main Roles of Actors
In order to describe the overall operation of the Digital Library Organisation
and the way it is expected to deliver the service it has been established for, we
envisage actors interacting with digital library playing roles in three different
and complementary categories:
DL End--users,
DL Managers
AndDLSoftwareDevelopers.
Figure 5 The Main Roles of Actors versus the Three--tier Framework
As shown in Fig each role is primarily associated with one of the three
systems in the three--tier framework. The system a role is associated with
represents the entity that is expected to provide the actor playing such a role
with the facilities needed to accomplish the mandate assigned to the role.
Moreover, every actor, independently from the role he/she is playing, is
expected to deal with all the foundational concepts characterising the Digital
Library universe.
-
7/31/2019 Term Paper - Devansh Mathur
39/56
39
DLEnd-- user
DL End--users exploit the overall Digital Library service for the purpose of
providing, consuming, and managing the DL. They are the target clients of the
service defined by the DL Organisation in terms of the Content to be managed,
the User(s) to be served, the Functionality to be supported, the Policy (ies) to be
put in place and the Quality to be exposed. They perceive the DL as a stateful
entity serving their needs. This state of the Digital Library is a complex
condition resulting from and influencing Content, User, Functionality, Policy
and Quality aspects of the DL Organisation. Moreover, the state is expected to
evolve during the lifetime of the Digital Library as a consequence of a series of
actions and activities performed in the context of the DL Organisation as well as
of external factors influencing the DL Organisation. DL End--users may be
further divided into
Content Creators, Content Consumers and Digital Librarians.
Content Creators are theproducersoftheDigitalLibraryContent,i.e.,they
take careof producing new items contributing to the Digital Library Content.
Theiractivityisperformed
(i) ThroughtheFunctionalitytheDLisprovidedwith,
(ii) In accordance with the Policies defined in the DL, and(iii)
With the guarantee of Quality the DL declares.
Content Consumers are theclientsof theDigitalLibraryContent, i.e., they
access and use the items in the Digital Library Content. Their activity is
performed
(i) Through the Functionality the DL is provided with,
-
7/31/2019 Term Paper - Devansh Mathur
40/56
40
(ii) In accordance with the Policies defined in the DL, and(iii) With the guarantee of Quality the DL declares.
Digital Librarians are the curators of the Digital Library Content, i.e. they
select,organise and look after the items in the Digital Library Content. Their
activityisperformed
(i) Through the Functionality the DL is provided with,(ii) In accordance with the Policies Defined in the DL, and(iii) With the guarantee of Quality the DL declares. Moreover, they might
influence the behaviour of the overall Digital Library service by acting
as mediators between the final clients of iti.e., Content Creators and
Content Consumers and those defining and operating this service
i.e., DL Managers by distilling and elaborating feedbacks on the DL.
DL ManagersDL Managers are the actors driving the overall Digital Library service. They are
expected to rely on the facilities offered by The DLMS To define and operate
the Digital Library and the DLS implementing it. DL Managers may be further
divided into DLDesigners and DL SystemAdministrators. The former are
called to devise the overall service while the latter are called to deploy and
operate the DLS implementing the planned service.
DLDesigners
exploit their knowledge of the application environment that a DL is called to
serve in order to define, customise, and maintain the Digital Library so that it is
aligned with the needs of its target DL End--users.
-
7/31/2019 Term Paper - Devansh Mathur
41/56
41
To perform this task, the DL Designers interact with the DLMS to decide upon
the characteristics the Digital Library Should have in terms of
(i) Content, e.g., the set of repositories, ontologies, classification schemas,information object types, metadata formats, authority files, and
gazetteers that form the DL Content;
(ii) User, e.g., the allowed actors, the allowed roles, the informationcharacterising the actors;
(iii) Functionality, e.g., the functional facilities to be offered, the behaviourthese facilities should implement;
(iv) Policy e.g., the rule and principles governing the evolution of the DLContent, the allowed actions per actor or family of actors, the
exploitation of a resource;
(v) Quality, e.g. the minimal availability of DL Functionality, the minimalresponse time of DL Functionality, the completeness and
authoritativeness of the DL Content, the confidentiality of the Useractions. These aspects characterise the overall Digital Library service,
actually the way it is perceived by the DL End--users.
These parameters need not necessarily be fixed for the entire lifetime of the DL;
they may be reconfigured to enable the DL to respond to the evolving
expectations of target users and changes in all aspects.
DL System Administrators
DL System administratorworkintandemwithDLDesignerst oputinplacethe
DigitalLibrarySystemimplementingtheplannedDigitalLibraryservice.
-
7/31/2019 Term Paper - Devansh Mathur
42/56
42
They select, deploy and manage a set of networked computers and software
modules needed to fulfil the expectations that DL End--users and DL
Designers have for the Digital Library.
DL System Administrators perform their tasks by interacting with the DLMS
and relying on the facilities these systems offer for DLS constituents
identification, linking, allocation, deployment, configuration, tuning,
monitoring, alerting, and any other management facility requested to manage
potentially distributed software systems as DLSs are expected to be. Different
DLMSs are expected to offer diverse management facilities ranging from
manual installation and configuration of the computers and the software
modules on the target computers to fully autonomic solutions aiming at
reducing human intervention to a few corner cases.
DL Software Developers
DL Software Developers develop and/or customise the software components
that will be used as constituents of the DLSs. They are requested to produce the
software implementing every aspect of the Digital Library service ranging from
the DL Content and User to Functionality, Policy and Quality. However, DL
Software Developers should not start from scratch and their activity is expected
to be performed by relying on the offering of a DLMS. In fact, a DLMS is asoftware system that is equipped with a bunch of off--the--shelf software
modules implementing to some extent some Digital Library facilities, e.g.,
content repositories, users management systems, cooperative working
environments, information retrieval engines, policies enforcement modules.
-
7/31/2019 Term Paper - Devansh Mathur
43/56
43
DL Software Developers include Software Engineers and Programmers that
are requested to customise and complement the set of software modules
provided by the exploited DLMS as to obtain the set of software constituents
needed to implement the planned Digital Library. The three roles described
above encompass the entire spectrum of actors working in the digital libraries
universe. Their conceptual models of such a universe are linked together in a
hierarchical way, as shown in Figure I.4--2.
This hierarchy is a direct consequence of the above definitions, since DL End--
users act on the Digital Library, whereas DL Managers and DL Application
Developers operate on the DLS (through the mediation of a DLMS) and,
consequently, on the DL as well. This inclusion relationship ensures that
cooperating actors share a common vocabulary and knowledge. For instance,
the DL End--user expresses requirements in terms of the DL model and,
subsequently, the DL Designer understands these requirements and defines the
DL accordingly.
Figure 6 Hierarchies of Users' Views
-
7/31/2019 Term Paper - Devansh Mathur
44/56
44
Advantages and Disadvantages of Libraries: From Past
Till Now
PAST: DIGITAL LIBRARIES (DL)
Computers have made revolutionary changes in every field of life; undoubtedly,
the field of education and information has been no different. Importantly,
conventional libraries moved to the concept of digital libraries, which ultimately
made gaining knowledge more efficient and organised. However, a notable
important fact here is that the digital library should stand for more than a well-
organised centralised form of information [26]. Furthermore, they should also
embody the essence of communication, which was originally the aspect of face-
to-face interaction between the people at a conventional library.
Advantages
People can access required information at any time of the day, as long as they
have access to the internet.
Disadvantages
Searching is not efficient, as it may not provide meaningful data to the user as
a result of his command. In many cases, access to certain information is limited
by copyright law.
-
7/31/2019 Term Paper - Devansh Mathur
45/56
45
Data is static; therefore, no users can contribute their views or share their
knowledge with other participants.
Digital libraries should explore using Semantic Web technologies to meet their
organizational challenge. It summarizes the key challenges facing digital
libraries, based on a digital libraries workshop. They identify five key
challenges facing digital libraries: interoperability; description of objects and
repositories; collection management and organization; user interfaces and
human-computer interaction; and economic, social, and legal issues. Digital
libraries can use Semantic Web technology to help meet all of these challenges.
FROMDL TO SDL
Following the advent of digital libraries in our lives, another innovative step
followed. This step was made in relation to making the search more meaningful
and direct. Essentially, it was concerned with refraining from the habit of
searching all the things everywhere.
The growth of Web 2.0 has given way to new methods of accessing
information and contributing opinions. Notably, semantic digital libraries enable
the user to get the intended information concerning an object without thepresence of the exact word in the search. This integrated form of information is
based on different metadata which provides a more meaningful data. These
libraries tend to provide a better and more convenient form of browsing
interfaces.
-
7/31/2019 Term Paper - Devansh Mathur
46/56
46
Figure 7 Evolution of Semantic Digital Libraries
Advantages
Semantic Digital Libraries make it easier to find information in the vast ocean
of available data. This is facilitated by ontology-based search and facet search.
Access is not confined to only one digital library; to the contrary, it provides a
mechanism of interoperability between different systems.
-
7/31/2019 Term Paper - Devansh Mathur
47/56
47
Disadvantages
Existing metadata of the digital libraries have to be lifted to a semantic level.
Not all digital libraries, government agencies etc. maintain metadata.
FROM SDL TO SSDL
Semantic digital libraries tend to focus more keenly on the retrieval of
meaningful information rather than giving the opportunity of sharing user
knowledge. This need subsequently led to the development of social semantic
digital libraries.
Figure 8 Evolution of Social Semantic Information
-
7/31/2019 Term Paper - Devansh Mathur
48/56
48
This is achieved by a combination of Semantic Web with collaboration tools on
the web. Social semantic digital libraries complement the existing features of
the semantic digital libraries by providing the opportunity to contribute to the
information. Web 1.0 evolved into a collaborative platform where people could
interact and share information, i.e. Web 2.0. Web 2.0 was promoted by Tim
OReily around the year 2005; it gives ordinary internet users the opportunity to
interact, meet and share information like never before, and involves concepts
like blogs, wikis, social networking sites etc.
Advantages
libraries, and accordingly achieve great things.
of
individuals to the other.
Disadvantagesamateurish data by some users.
SSDL AND THE FUTURE
Social Semantic Digital Libraries (and Web 2.0) have made the web
collaborative and interactive; however, one drawback which has become
apparent as a result of this innovation is that of information overload. Owing to
the increase in internet users and thus their participation level on forums, it has
become difficult to point to the knowledge part of the content.
-
7/31/2019 Term Paper - Devansh Mathur
49/56
49
Disadvantages
Another drawback of such libraries is that web pages are dynamic but are not
very structured. Notably, Web 2.0 tools enable the shaping of content on the
pages, but not the content itself. Essentially, the future will address these
aspects and make the web more powerful and structured by the advent of Web
3.0 . The basic concept behind Web 3.0 is that of ontologydefined by
Thomas Gruber as explicit specification of a conceptualisation. Another future
enhancement which is foreseen for the future is that there shall be digital
annotation linked with physical objects in life: for example, in a museum. An
application of this technology can be to have real-virtual tours of a certain
place: for example, to start with a real guided tour and then (if desired)
browsing through the virtual context information or otherwise gathering
information about other exhibitions in the premises. The future aspect of the
social semantic digital library is to improve user benefits by empowering the
user interfaces and social networking. The user identification and system
automation are important key points in the future social semantic digital
libraries.
-
7/31/2019 Term Paper - Devansh Mathur
50/56
50
Future Prospects
There are inevitable barriers to the Semantic Web that still need to be addressed.
We have mentioned the slow progress on certain features, particularly ontology
and reasoning support, due to the development community not coming to a
consensus. This does not mean that progress cannot be made immediately using
the simpler tools for RDF and RDF Schema available now.
Some of the larger IT companies are hanging back, waiting to spot the
opportunity and waiting for the research community to settle on standards. Thus
the main impetus is coming from communities themselves it is an opportunity
to profoundly affect the way that the world talks to each other.
There is a good deal of RDF data giving semantic descriptions already on the
Web, both from website owners publishing their own annotations as RDF files
and from sites such as rdfdata.org which provide portals for RDF data.
However, before the Semantic Web can become globally usable, there does
need to be more, and it needs to be more easily available. There is a distinct
overhead to using the Semantic Web in terms of establishing shared
vocabularies and ontologies, and in providing the appropriate annotations to
resources which make them visible to the Semantic Web. This is a non-trivial
task and often users will either not have the time to include this, or the expertise
to do it well. A missing component of the Semantic Web is a simple means to
support this, similar to the editors and tools for the conventional Web.
Undoubtedly the simplicity of the HTML language used within the current Web
was a major influence on its success and in order for the Semantic Web to break
out from narrow communities to universal use it needs to address the issues of
making it easy to use and accessible to all.
-
7/31/2019 Term Paper - Devansh Mathur
51/56
51
Otherwise, the Semantic Web is likely to require particular effort and expertise.
This is expensive, and so it may well be confined to particular domains on the
Web which see a strong advantage in its use, although over time as the expertise
becomes more commonplace it should become cheaper. Also, the 'network
effect' can work as both a barrier and an incentive. One of the main advantages
of supplying Semantic Web annotation is that is can be shared and can gain
advantage to others, so when there is little data to share, then there is little
incentive to take the extra expense in sharing; however, once the ball starts to
roll, there is an exponential advantage in combining your own data with others'.
These problems may be less of a disadvantage in the HE and FE sectors, which
has well-integrated communities with stronger control over their resources.
Information science professionals in libraries are available to help with the task
of cataloguing and publishing annotations. Thus it is likely that this sector will
be in the forefront in the use of this technology.
The impact on digital libraries, combined with the Open Access Initiative andthe rise of open archiving is likely to be quite profound. Libraries become
'value-added' information annotators and collators rather than the archivists of
externally published literature and the holders of the published output of
institutions. The Semantic Web, although not a prerequisite or a motivator for
this change is nevertheless likely to smooth its development. The tools are in
place for sharing classification schemes and to allow the community to develop,deepen and share such schemes. The information infrastructure tools discussed
above will have particular impact on the way students and researchers find
information, so these tools may typically be provided and adapted by libraries
who will tailor them to the needs of their own users. The Semantic Web, like
-
7/31/2019 Term Paper - Devansh Mathur
52/56
52
the current Web, has the capacity of being an overwhelming place; libraries are
well-placed to make sense of this for the HE and FE community.
Existing Semantic Digital Library Systems
SIMILE (Semantic Interoperability of Metadata and Information in
unLike Environments): This system focuses on enhancing the integration
aspect of metadata, services etc. to increase accessibility.
We propose a project that will create compelling use case demonstrations and
prototypes at the intersection of community or individual information creation
and management, institutional information and digital collections management,
and the Semantic Web data architecture.
We seek to enable low-cost, scalable interoperability among digital research
collections, the metadata that describes them, and the services that leverage that
metadata and digital material. We will extend DSpace to greatly enhance its
ability to provide easy-to-use support for arbitrary schemas and metadata,
primarily through the use of RDF and Semantic Web techniques, and explore
support for distributed research collections. The project will ground its work in
focused, well-defined, real-world use cases in the domains of libraries and
higher education. We will also collaborate with other significant projects that
operate in these domains, (e.g. ARTstor, JSTOR, OCW, Fedora, Sakai, and
VUE) to improve understanding and use of Semantic Web technologies to
provide scalable data interoperability within and across those platforms.
Jerome Digital library: Can be considered as a social semantic digital
library. Based on Semantic Web as well as social networking in order to
-
7/31/2019 Term Paper - Devansh Mathur
53/56
53
promote collaborative activities along with other common uses of semantic
digital libraries.With JeromeDL social and semantic services every library usercan bookmark interesting books, articles or other materials in semantically
annotated directories.
Users can share their knowledge with others within a social network. We
enriched the standard SSCF browser with an ability to bookmark and browse
community based data. JeromeDL also has a feature which allows it to treat a
single library resource as a blog post. With SIOC based annotations users can to
comment the content of the resource and in this way create new knowledge.
JeromeDL also provides various browsing, filtering and navigation solutions,
such as TagsTreeMaps, MultiBeeBrowse and Exhibit.
JeromeDL has been installed in a number of locations; the two most used,
DERI Galway library and WBSS8 at Gdansk University of Technology, serve
their community of users in everyday activities. DERI Galway library is used by
researchers as a pre-print server to locate and share publications. WBSS
maintains a set of scans of antique books and a number of books written bylecturers at GUT; the latter ones are used as learning material.
BRICKS: This system focuses on the basic infrastructure of a digital library
network so that information can be shared amongst users in the cultural heritage
domain. The BRICKS network infrastructure uses the Internet as a backbone,
and is made of decentralized BRICKS Nodes (BNode), in order to avoid centralpoints whose failure or overload could stop or slowdown the whole Network.
BNodes communicate among each other and use available resources for content
and metadata management.
Every BNode knows directly only a subset of other BNodes in the system.
However, if a BNode wants to reach another member that is directly unknown
-
7/31/2019 Term Paper - Devansh Mathur
54/56
54
to it, it will forward a request to some of its known neighbour BNodes that will
deliver the request to the final destination or forward it again. BRICKS users
access the system only through a local BNode available at their institution.
Hence every user request is primarily sent to the institution's BNode and then
the request is routed via other BNodes to the final destination.
Search requests behave like that; the BNode pre-selects a list of BNodes where
a search request can be fulfilled, and then the BNode routes it there. When the
location of the content is known, e.g. as a result of the query, the BNode is
directly contacted.
Conclusion
Traditional libraries have taken the shape of an interactive, accessible and
efficient platform which is present for the user at any time of the day. The new
forms of digital libraries, i.e. semantic digital libraries, have proved to produce
more meaningful results for the user. Further developments in semantic digital
libraries have evolved the concept of contribution of information and social
interactivity between the contributors. Therefore, the future holds much more
promising and efficient mechanisms for handling information.
-
7/31/2019 Term Paper - Devansh Mathur
55/56
55
References
[1] Diane, V., 2006, When did the Web start? Developed Traffic,
http://developedtraffic.com/2006/08/04/when-did-the-web-start/.
[2] Berners-Lee, T., Hendler, J., Lassila, O., 2001, the Semantic Web, Scientific
American Magazine (May 17, 2001).http://www.sciam.com/article.cfm?id=the-
semantic-web&print=true.
[3] Kruk, S., R., Haslhofer, B., Kneevic, P., 2007, Tutorial 7- Semantic Digital
Libraries, JCDL 2007
[4] Lagoze, C., Krafft, D., B., Payette, S., Jesuroga, S., 2005, What Is a Digital
Library Anymore, Anyway? D-Lib Magazine November 2005, Volume 11
Number 11,http://dlib.org/dlib/november05/lagoze/11lagoze.html
[5] Borgman, C., L., 2000, Digital libraries and the continuum of scholarly
communication,Journal of Documentation, 56 (4), pp. 412-430
[6] Borgman, C., L., 1999, What are digital libraries? Competing visions,
Information Processing & Management, pp. 227-243,
[7] Celino, I., Turati, A., Della Valle, E., and Cerizza, D. (2006). Squiggle Med:
Semantic search for medical digital library, Technical report, CEFRIEL.
[8] Baruzzo, A., Casoto, P., Challapalli, P., Dattolo, A., Pudota, N., Tasso, C.,
2009, Toward Semantic Digital Libraries: Exploiting Web2.0 and Semantic
Services in Cultural Heritage,Journal of Digital Information, Vol. 10, No 6.
[9] Xu, X., Zhang, F., Ni, Z., 2008, An Ontology-Based Query System For
Digital Libraries, IEEE Pacific-Asia Workshop on Computational Intelligence
and Industrial Application
[10] Kruk, S., R., Decker, S., Haslhofer, B., Kneevic, P., Payette, S., Krafft,
D., 2007, TutorialSemantic Digital Libraries,BANFF 2007.
[11] Thomas, S., 2006, Web 2.0 and the future for library systems, Technical
Report, University of Adelaide,
http://www.sciam.com/article.cfm?id=the-semantic-web&print=truehttp://www.sciam.com/article.cfm?id=the-semantic-web&print=truehttp://www.sciam.com/article.cfm?id=the-semantic-web&print=truehttp://www.sciam.com/article.cfm?id=the-semantic-web&print=truehttp://dlib.org/dlib/november05/lagoze/11lagoze.htmlhttp://dlib.org/dlib/november05/lagoze/11lagoze.htmlhttp://dlib.org/dlib/november05/lagoze/11lagoze.htmlhttp://dlib.org/dlib/november05/lagoze/11lagoze.htmlhttp://www.sciam.com/article.cfm?id=the-semantic-web&print=truehttp://www.sciam.com/article.cfm?id=the-semantic-web&print=true -
7/31/2019 Term Paper - Devansh Mathur
56/56
http://digital.library.adelaide.edu.au/dspace/bitstream/2440/14789/1/Web2.0.pdf
[12] Baruzzo, A., Casoto, P., Dattolo, A., and Tasso, C., 2009, A conceptual
model for digital libraries evolution. In WEBIST 09: Proceedings of 5th
Informational Conference on Web Information Systems and Technologies,
pages 299304, Berlin. Springer-Verlag.
[13] The Spiritus Temporis Web Ring Community, 2005 Digital library,
http://www.spiritus-temporis.com/digital-library/disadvantages.html
[13]D. McGuinness and F van Harmelen (eds) OWL Web Ontology LanguageOverview
http://www.w3.org/TR/2003/WD-owl-features-20030331/
[14] M. Dean, G. Schreiber (eds), F. van Harmelen, J. Hendler, I. Horrocks, D.
McGuinness, P. Patel-Schneider, L. Stein, OWL Web Ontology Language
Reference http://www.w3.org/TR/2003/WD-owl-ref-20030331/
http://www.spiritus-temporis.com/digital-library/disadvantages.htmlhttp://www.spiritus-temporis.com/digital-library/disadvantages.htmlhttp://www.spiritus-temporis.com/digital-library/disadvantages.html