bibliographic relations


Post on 22-Feb-2016

DESCRIPTION

Bibliographic relations. Erik Thorlund Jepsen, Library Advisory Officer, The Danish Library Agency. Outline: bibliographic relations (definition); why relations are important in the bibliographic universe; typologies; relations and FRBR; utilization of relations in OPACs.

TRANSCRIPT

Typologies
One record structures, Links, displays...
Examples
Relations: Definition
A relationship between information entities exists when two entities are somehow associated with each other.
(Vellucci, 1997, p.105).
Though semantically precise, this definition offers little guidance for identifying relationships, since associations rely on subjective judgements about the relevance of connecting or relating two or more entities.
For convenience, the concept "bibliographic" is treated as "being related to bibliographic entities and records".
Erik Thorlund Jepsen
Importance of relations: FRBR-terms
Information about bibliographic relations between two or more bibliographic entities can support the user tasks ‘find’, ‘identify’ and ‘select’. Relations stated in a bibliographic record can potentially:
Improve users' understanding of a given entity, which strengthens the identification and selection/deselection of the entity.
Improve users' options for finding relevant entities by leading the way from a known (found) entity to related entities that are more relevant in a given situation.
Importance of relations
Furthermore, information about bibliographic relationships can strengthen users' understanding of the system (database) at hand, and of the knowledge organization in the system, by:
Creating groups of entities
Free-text 7%
Author 34%
and ”Literature about..”) 15%
Source: Kirsten Larsen, Deputy Head, The Danish Library Centre (DBC)
Importance - GFOD
User Principle: General guidelines for good practice in display design and criteria for effective screen displays as these relate to legibility, clarity, understandability and navigability
Content and Arrangement Principle 7. Support navigation from the displayed information to related information
(This principle is further divided into more specific and, I would add, ambitious principles.)
Sets of documents
Existing information systems
Standards, rule sets and registration formats
Empirical studies of how users identify associations among groups of entities and assess their importance
Derivative Relationships (versions, editions, revisions, translations…)
Descriptive Relationships (annotated editions, commentaries, reviews…)
Whole-Part Relationships (selections from anthologies, collections, series, chapters vs. books…)
Accompanying Relationships (supplements, concordances, indexes…)
Sequential Relationships (sequels of a monograph, parts of a series)
Shared Characteristics Relationships (common author, publisher, title, subject…)
Vellucci and Tillett: Categories of Relations (shortened description from Vellucci, 1997)
FRBR and relations
Other Relationships between Group 1 entities at these levels:
Work-to-work
Expression-to-expression
Expression-to-work
Manifestation-to-manifestation
Manifestation-to-item
Item-to-item
Not meant to be exhaustive!
Yet relationships are mapped to user tasks (alongside attributes)
FRBR - tasks
to find entities that correspond to the user's stated search criteria (i.e., to locate either a single entity or a set of entities in a file or database as the result of a search using an attribute or relationship of the entity);
to identify an entity (i.e., to confirm that the entity described corresponds to the entity sought, or to distinguish between two or more entities with similar characteristics);
to select an entity that is appropriate to the user's needs (i.e., to choose an entity that meets the user's requirements with respect to content, physical format, etc., or to reject an entity as being inappropriate to the user's needs);
to acquire or obtain access to the entity described (i.e., to acquire an entity through purchase, loan, etc., or to access an entity electronically through an online connection to a remote computer).
(Functional Requirements for Bibliographic Records, 1998, p.82)
FRBR – additional tasks?
To relate: a fifth task?
“Even more, FRBR reminds us of the importance of bibliographic relationships, and reminds us that we describe things in the bibliographic universe in order to meet specific user tasks: ‘find,’ ‘identify,’ ‘select,’ ‘obtain,’ and I add ‘relate’” (Tillett, 2005, p. 198).
Yet information about relationships already supports three of the tasks: to find, to identify and to select (e.g. it supports collocation, which is seen as part of “to find”)
In other words, to relate is a subtask of to find, to identify and to select.
Incorporating “to relate” as a fifth task could cause a breakdown of the model
To navigate: a fifth task?
Yes, probably
Three purposes when cataloguing information about relations and setting up system rules:
Identification and understanding of relation
Linking from found entity to related entities
Displaying meaningful/useful sets of records
Relations expressed as links
Relations are expressed as implicit or explicit links, where explicit links are divided into ‘directional’ and ‘mechanical’ links (hyperlinks)
(Vellucci, 1997)
Hyperlinks are constructed by manual or computational means
Manual links are static and are commonly used to structure texts or to connect associatively related entities (by topic) (and to connect bibliographic families – added by etj)
Computational links can be created dynamically at search time and are primarily used to connect ‘similar’ entities (e.g. based on shared characteristics – added by etj)
(Agosti, 1997)
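As an illustration of a computational link built at search time, the following minimal sketch connects a found record to other records that share a characteristic with it. The record layout (plain dicts with `id`, `author`, `subject` and `publisher` keys) and the function name are assumptions made for this example, not part of any system described in the slides.

```python
def shared_characteristic_links(found, catalogue,
                                fields=("author", "subject", "publisher")):
    """Return (record id, shared fields) for records sharing a value with `found`.

    The links are computed when called, i.e. dynamically at search time,
    rather than stored in the records. Field names are hypothetical.
    """
    links = []
    for record in catalogue:
        if record["id"] == found["id"]:
            continue  # do not link a record to itself
        shared = [f for f in fields
                  if record.get(f) and record.get(f) == found.get(f)]
        if shared:
            links.append((record["id"], shared))
    return links


# Invented sample data for illustration only.
catalogue = [
    {"id": "r1", "author": "Brøgger, Suzanne", "publisher": "Gyldendal"},
    {"id": "r2", "author": "Brøgger, Suzanne", "publisher": "Rhodos"},
    {"id": "r3", "author": "Another Author", "publisher": "Gyldendal"},
    {"id": "r4", "author": "Third Author", "publisher": "Other"},
]

print(shared_characteristic_links(catalogue[0], catalogue))
```

Exact string matching is the simplest possible rule; in practice such links depend on authority-controlled headings so that the same author or subject is always spelled the same way.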
Computational links for shared characteristics
Rules and codes (e.g. for derived relations)
Computational solutions for work display
Rules and codes: example “Reuse+”
Widened use of a specific field in MARC formats to handle relations in a uniform way.
787 Non-specific relationship entry (repeatable), with two subfields:
$w Record control number (target to link current record to)
$g Relationship information (textual; optional)
Reuse+ (2)
To distinguish between the various relationships, and to make them specific, our simple model proposes the use of indicator 2 in 787, as yet undefined. This indicator might take on the following values (and here, a full-scale model would not have to differ): (in parentheses: DC Simple terms for relations)
0 Equivalence (facsimile or reproduction) (IsFormatOf)
1 Simultaneous edition (IsVersionOf)
3 Amplification (incl. commentaries, illustrations, criticism etc.) (IsBasedOn)
4 Extraction (abridgements, condensations, excerpts)
5 Recordings of performances
6 Adaptation, modification (change of genre or medium, arrangement) (IsFormatOf)
9 Translations (IsVersionOf)
p Part → whole relationship (IsPartOf)
r Review or other descriptive relationship
s Sequential relationship (like successive title of a serial)
u Unspecific relationship, based on shared characteristics of other kinds
(Eversberg, 1998)
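The proposal above can be sketched as a small lookup from indicator 2 to a relation type, combined with the two subfields of field 787. This is only an illustration: the dict-based field representation, the names `RelationLink` and `parse_787`, and the fallback to "u" for unknown indicator values are assumptions made here, not part of Reuse+.

```python
from dataclasses import dataclass
from typing import Optional

# Indicator-2 values from the list above, with the Dublin Core term
# where the list gives one (None where it gives none).
RELATION_TYPES = {
    "0": ("Equivalence", "IsFormatOf"),
    "1": ("Simultaneous edition", "IsVersionOf"),
    "3": ("Amplification", "IsBasedOn"),
    "4": ("Extraction", None),
    "5": ("Recording of performance", None),
    "6": ("Adaptation, modification", "IsFormatOf"),
    "9": ("Translation", "IsVersionOf"),
    "p": ("Part-to-whole relationship", "IsPartOf"),
    "r": ("Review or other descriptive relationship", None),
    "s": ("Sequential relationship", None),
    "u": ("Unspecific relationship", None),
}


@dataclass
class RelationLink:
    target: str              # $w: control number of the related record
    relation: str            # relation type derived from indicator 2
    dc_term: Optional[str]   # Dublin Core term, if the list names one
    note: Optional[str]      # $g: optional textual relationship information


def parse_787(indicator2: str, subfields: dict) -> RelationLink:
    """Turn one 787 field (hypothetical dict form) into a typed link.

    Unknown indicator values fall back to "u"; that is an assumption
    of this sketch.
    """
    relation, dc_term = RELATION_TYPES.get(indicator2, RELATION_TYPES["u"])
    return RelationLink(
        target=subfields["w"],
        relation=relation,
        dc_term=dc_term,
        note=subfields.get("g"),
    )


# Invented example field: a translation pointing at record "rec-0001".
link = parse_787("9", {"w": "rec-0001", "g": "Danish translation"})
print(link)
```

Keeping the relation type in an indicator rather than in free text is what lets a system treat all 787 links uniformly, for example by following every "9" link to list translations.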
On collocating the work
“Most users seek particular works, not particular editions. Yet works are published in the form of editions; the fundamental duty of descriptive cataloguing is to organize the resulting chaotic bibliographic universe to facilitate user access to works, and to allow them easily to select the edition of the work sought that best meets their needs…” (Yee, 1997, p.64).
FRBR Display Tool:
Library of Congress: FRBR Display Tool was developed to transform bibliographic data found in MARC 21 record files into meaningful displays by grouping them into the work, expression and manifestation FRBR entities. Based on XML technologies, the tool may be altered to meet the needs of individual institutions. It also shows how the theoretical portion of the FRBR model can be used practically to allow librarians to evaluate the consistency of their local bibliographic data
Work-display: Bibliotek.dk (The Danish Union Catalogue)
An example of an almost totally automatic initiative is the display of editions of a work in the Danish Union Catalogue “Bibliotek.dk”
Attributes like author and title are used in a best-match algorithm to identify different editions of the work.
Due to a high level of authority control and the use of original titles, the different expressions of a work will normally be collocated in the search result.
End user version of the Danish Union Catalogue
Sponsored by the Danish Library Agency; maintained and developed by the Danish Library Centre (DBC)
Content:
all titles in public libraries and research libraries in Denmark
Content is not 100% equivalent to the Union Catalogue (availability matters)
Works together with a national transportation system: users can have books from any library delivered for pick-up at their own (chosen) library
The records in bibliotek.dk represent manifestations (AACR2/danMARC2).
The aim is to present these records grouped according to the work they embody
At one point our definition differs from FRBR:
For practical reasons we consider expressions in different languages to be different ‘works’.
You could also say that in this case we prefer grouping according to the expression of the work.
(Paul B. Jensen, Danish Library Center)
Implementing the work concept
The work-level display is based on matching and collocating manifestation records on the fly
This match is based on simple author and title data in normalized form
From the work level you can expand to the manifestations, select one (or more) and make a request
(Paul B. Jensen, Danish Library Center)
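The matching described above can be sketched as grouping manifestation records under a key built from normalized author and title. This is not the bibliotek.dk algorithm, only a minimal illustration: the record layout, the normalization rules and the function names are assumptions, and language is included in the key to mirror the practice of treating expressions in different languages as different works.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, drop combining accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]+", " ", no_marks.lower())
    return " ".join(cleaned.split())


def group_by_work(records):
    """Collocate manifestation ids under a (author, title, language) key."""
    works = defaultdict(list)
    for record in records:
        key = (normalize(record["author"]),
               normalize(record["title"]),
               record["language"])
        works[key].append(record["id"])
    return dict(works)


# Invented manifestation records for illustration only.
records = [
    {"id": "m1", "author": "Brøgger, Suzanne", "title": "Creme fraiche",
     "language": "dan"},
    {"id": "m2", "author": "BRØGGER, SUZANNE", "title": "Creme fraiche.",
     "language": "dan"},
    {"id": "m3", "author": "Brøgger, Suzanne", "title": "Ja",
     "language": "dan"},
]

for key, manifestation_ids in group_by_work(records).items():
    print(key, manifestation_ids)
```

Because the grouping is computed from the records at display time, no work-level records need to be stored; the quality of the result then depends on how consistently author and title data are recorded.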
Accomplishment
A more user-friendly interface (as confirmed by a majority of test-users)
A reduction of unnecessary inter-library loans, because it is easier to locate an edition at your local library (or libraries)
(Paul B. Jensen, Danish Library Center)
In principle, a traditional AACR2/MARC record does not specify which bibliographic information refers to the work level and which to the expression/manifestation level
Many bibliographic items contain more than one work:
Collected plays in one volume (e.g. Shakespeare)
3 novels in one volume
3 symphonies on one cd
Etc.
Use-determined
+ frequencies
Similarity-based
Third-party pointers
Author
Entities cited by, referred by or linked to by other entities
User
4th printing. - [Copenhagen] : Gyldendal, 2002. - 134 pages
The cat Linda is presumed to be a reincarnation of the author's mother, who was herself a cat. And that fits well enough with the life mother and cat lead, and their way of influencing their surroundings ....
Previously: 1st edition. 2001.
ISBN 87-00-48736-8 : paperback : DKK 175.00.
Others who have borrowed Suzanne Brøgger: Linda Evangelista Olsen have also borrowed:
Suzanne Brøgger: Ja
Suzanne Brøgger: En gris som har været oppe at slås kan man ikke stege
Suzanne Brøgger: Creme fraiche
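The borrower-based suggestions in the record above are a use-determined relation: two titles are related because the same users borrowed both. A minimal sketch of how such a list could be derived follows; the loan data, the `also_borrowed` name and the ranking by simple co-loan frequency are assumptions for illustration, not a description of the library system that produced the example.

```python
from collections import Counter


def also_borrowed(loans, title, top_n=3):
    """Rank titles by how many borrowers of `title` also borrowed them."""
    counts = Counter()
    for borrowed in loans.values():
        if title in borrowed:
            # Count every other title this borrower took out.
            counts.update(borrowed - {title})
    return [other for other, _ in counts.most_common(top_n)]


# Invented loan histories: borrower -> set of borrowed titles.
loans = {
    "user_a": {"Linda Evangelista Olsen", "Ja", "Creme fraiche"},
    "user_b": {"Linda Evangelista Olsen", "Ja"},
    "user_c": {"Ja", "Creme fraiche"},
}

print(also_borrowed(loans, "Linda Evangelista Olsen"))
```

Raw frequencies favour titles that are popular overall, so a production system would probably normalize the counts in some way.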
Statistically based: e.g. vector space model using tf*idf weights
Entity 1
Entity 2
Shared elements
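A minimal sketch of the vector space idea: each entity becomes a vector of tf*idf weights over its terms, and two entities are similar to the degree that their vectors point the same way (cosine similarity), which depends on their shared elements. The weighting variant used here (raw term frequency times log(N/df)) is one common choice among several, and the sample texts are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: tf*idf} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: tf * math.log(n / df[term])
         for term, tf in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


docs = [
    "latent semantic indexing for text retrieval".split(),
    "image retrieval using latent semantic indexing".split(),
    "danish union catalogue work display".split(),
]
vecs = tfidf_vectors(docs)

print(cosine(vecs[0], vecs[1]))  # share several terms
print(cosine(vecs[0], vecs[2]))  # share no terms
```

Terms that occur in every document get weight zero under this idf, so only elements that discriminate between entities contribute to the similarity.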
Co-citations
Similar to “others who have borrowed this book have also borrowed these materials”
But from an author/domain perspective
Example from Citeseer ->
Citeseer example
Abstract: Latent Semantic Indexing (LSI) is a technique for representing documents, queries, and terms as vectors in a multidimensional real-valued space. The representations are approximations to the original term space encoding, and are found using the matrix technique of Singular Value Decomposition. In comparison, Multidimensional Scaling (MDS) is a class of data analysis techniques for representing data points as points in a multidimensional real-valued space. The objects are represented so that...
Cited by:
Automated Modeling and Nonlinear Axis Scaling - Leejay Wu (2005)
Similar documents (at the sentence level):
8.5%: Optimizing Ranking Functions: A Connectionist Approach to.. - Bartell (1994)
Active bibliography (related documents):
0.2: A Survey of Information Retrieval and Filtering Methods - Faloutsos, Oard (1996)
0.2: Document Space Models Using Latent Semantic Analysis - Gotoh, Renals (1997)
0.2: Approximating Matrix Multiplication for Pattern Recognition Tasks - Cohen, Lewis (1997)
Similar documents based on text:
0.5: Chapter 15: Getting Better Results With Latent Semantic Indexing - Nakov (2000)
0.4: Image Retrieval using Latent Semantic Indexing - Pecenovic (1997)
0.4: On the Use of Singular Value Decomposition for Text Retrieval - Husbands, Simon, Ding (2000)
Related documents from co-citation:
8: Personalized information delivery: an analysis of information filtering methods (context) - Foltz, Dumais - 1992
6: Indexing by latent semantic analysis - Deerwester, Dumais et al. - 1990
5: Term-Weighting Approaches in Automatic Text Retrieval (context) - Salton, Buckley - 1988
Co-citation + threshold
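Co-citation with a threshold can be sketched as follows: two documents are co-cited whenever they appear together in the reference list of a third document, and a pair is shown as related only if that happens at least a given number of times. The citing papers and their reference lists below are invented, and the function names and default threshold are assumptions of this sketch, not CiteSeer's implementation.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of cited documents, how many papers cite both."""
    counts = Counter()
    for refs in reference_lists.values():
        # Sort so that each unordered pair always gets the same key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


def related_by_cocitation(reference_lists, doc, threshold=2):
    """Documents co-cited with `doc` at least `threshold` times."""
    related = {}
    for (a, b), n in cocitation_counts(reference_lists).items():
        if n >= threshold and doc in (a, b):
            related[b if a == doc else a] = n
    return related


# Invented citing papers -> documents they cite.
reference_lists = {
    "paper_1": ["Deerwester 1990", "Salton 1988", "Foltz 1992"],
    "paper_2": ["Deerwester 1990", "Salton 1988"],
    "paper_3": ["Deerwester 1990", "Foltz 1992"],
}

print(related_by_cocitation(reference_lists, "Deerwester 1990"))
```

The threshold trades recall for precision: a low value links many loosely associated documents, a high value keeps only pairs that authors in the domain repeatedly cite together.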
Perspectives: Designing OPACs and integrated search tools according to relations
A lot of possibilities: many types of relationships to display and utilize in different ways:
Bibliographic families; Shared characteristics; Whole-part and other bibliographic relations
Similarity (statistical); Co-citations; User-defined (co-use)
Among others
Need for careful design of system features/link structures and a lot of testing (measuring not only user satisfaction but, essentially, improved search results)
In other words: pick the functionalities that work for the user, not the ones you like or are familiar with
Yet the data are essential, and the data determine the value of the functionalities