Creating a New Faceted Browsing Function for Millennium WebPAC Pro

Li, Yiu On, Senior Assistant Librarian

Leung, Roger, Information Technology Officer

Hong Kong Baptist University Library

9th HKIUG Meeting, University of Hong Kong Library

8th Dec., 2009

Outline

1. What is Faceted Browsing?

2. Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches

3. Architecture of the New Faceted Browsing Function in WebPAC Pro

4. BU Faceted Browsing Function and Encore: A Comparison

5. Conclusion

1. What is Faceted Browsing?

1.1 Definition of Faceted Browsing

Faceted browsing, also known as faceted searching or faceted navigation, is a navigation interface for record searching and browsing that displays aspects of a result set under multiple classification and categorization schemes (e.g. related authors, titles, subject headings, material types, locations, languages, and publication years).

1.2 Advantages of Faceted Browsing

Unlike a single, pre-determined, hierarchical scheme, faceted browsing gives users the ability:

To find items along multiple dimensions and attributes

To explore new directions in dynamic taxonomies (i.e. divisions into ordered groups or categories)

To refine or narrow down their searches

To switch easily between searching and browsing: users can search with their own terminology while browsing the organization and categories suggested by the faceted classification

To see the number and contents of each suggested category

“For experienced Web users, faceted navigation isn’t something that needs to be explained”

-- Marshall Breeding. "Next-Generation Library Catalogs". Library Technology Reports, vol. 43, no. 4, July-August 2007, p.12.

1.3 Use of Faceted Browsing in Commercial Sites

Faceted browsing has become a well-established user interface convention

A 2003 survey reported that 69% of 75 leading commercial sites made use of faceted browsing. In fact, all sites in the computers, gifts, kitchenware, and music/video categories used faceted browsing

-- Use of Faceted Classification, http://www.webdesignpractices.com/navigation/facets.html

e.g. Amazon, the largest online bookstore

In Amazon, faceted browsing includes:

1. New Releases

2. Department

3. Formats

4. Binding

5. Shipping Options

6. Award Winners

7. Promotion

8. Avg. Customer Review

9. Condition

1.4 Implementation of Faceted Browsing in WebPAC

If librarians can implement this common faceted browsing function in the WebPAC environment, we can change WebPAC from a traditional searching tool into a powerful information discovery tool

2. Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches

2.1 Need for Adding New Web 2.0 Functions to WebPAC

More and more librarians are dissatisfied with the limited functionality of traditional WebPAC interfaces

To win the support of the new generation of web users, we need to add new Web 2.0 technologies such as faceted browsing, interactive cloud tags, federated search, and social networking tools

2.2 Next Generation WebPAC

A WebPAC equipped with new Web 2.0 functions goes by several names, including:

Next Generation Library Catalog, SmartCat, Library Catalog 2.0, …

2.3 Two Different Approaches

In Hong Kong, two different approaches have been adopted to build the Next Generation Library Catalog:

1. New Functions in New WebPAC (NFNW)

2. New Functions in Current WebPAC (NFCW)

(Note: in this presentation, we use the faceted browsing function as a representative example of the Web 2.0 functions)

2.4 NFNW Development Logic

The development logic of the New Functions in New WebPAC approach may be summarized as:

1. We MUST add a faceted browsing function to WebPAC

2. The existing Millennium WebPAC Pro environment is too old and CANNOT accommodate this transformation

3. Thus, we need to develop a new WebPAC to implement new Web 2.0 technology

(NOTE: this argument is invalid; we return to it under NFCW later)

2.5 Two Models of NFNW

Two different development models of New Functions in New WebPAC:

1. Encore: an III product with a relatively high annual subscription fee; purchased by CUHK, HKU, and PolyU

2. Scriblio: open-source software, enhanced and used at HKUST

2.6 Disadvantages of NFNW

Many powerful existing functions of WebPAC Pro are missing from Encore:

1. Exact Author, Title, Subject Searching

2. Scope searching

3. Limit results to items with "Available" status

4. Search History

5. Author/title/subject authority list (e.g. Author search = Strauss, Johann, 1825-1899)

6. Modify/Limit this Search command

7. Advanced Keyword Search Form

Powerful existing WebPAC Pro functions are missing from Encore

2.7 NFNW = Dual WebPAC System

1. As a result, Encore cannot replace the “traditional” WebPAC Pro

2. If patrons want to use those “old” advanced search functions, they have to use the “traditional” WebPAC Pro

3. Thus, the Encore (New Functions in New WebPAC) approach is, in reality, a dual WebPAC system:

Encore (New WebPAC) + WebPAC Pro (Traditional WebPAC)

Encore at the University of Queensland Library: http://encore.library.uq.edu.au/iii/encore/search/C%7CSStrauss%7COrightresult%7CU1?lang=eng&suite=def

Users must click through to WebPAC Pro for the “old/classic” advanced searching capabilities

2.8 Disadvantages of a Dual WebPAC System

1. Patrons have to learn how to use two different WebPAC systems, which may cause inconvenience and confusion

2. Library staff spend more time and effort maintaining two search interfaces, so the maintenance cost is high

3. Systems staff spend their time re-inventing a “new” interface rather than concentrating on the design of the faceted browsing function

2.9 Building the Next Generation Library Catalog: the Second Approach

1. The development logic of the New Functions in New WebPAC approach rests on an invalid argument: “the existing Millennium WebPAC Pro environment is too old and CANNOT accommodate any Web 2.0 functions”

2. But our study shows that WebPAC Pro is a comparatively open and flexible environment, and that we can add in-house developed scripts to the interface

3. Thus, we decided to add the faceted browsing function to the existing WebPAC Pro interface

4. This is a more logical, simple and direct approach, and I call it:

New Functions in Current WebPAC (NFCW)

2.10 Merits of the NFCW Approach

1. Faceted browsing is inserted into WebPAC Pro and becomes an integral part of it

All the powerful existing WebPAC Pro functions are kept

The new add-on faceted browsing functions are fully compatible with the existing WebPAC Pro functions

2. The new add-on faceted browsing functions strengthen the existing WebPAC Pro searching capabilities

3. A single interface avoids the unnecessary inconvenience, inconsistency and confusion caused by a dual WebPAC system

4. Library staff are saved the time and effort of maintaining two different WebPAC systems

5. There is no need to re-invent a new WebPAC interface, so the software development cost and cycle are greatly reduced

2.11 New Faceted Browsing in HKBU WebPAC Pro

1. Based on the NFCW development logic, HKBU has recently installed a new faceted browsing function on the staging port of WebPAC Pro

2. Currently, only some 222,000 records have been uploaded to this database for testing

New In-house Developed Faceted Browsing Function in HKBU WebPAC Pro

3. Architecture of the New Faceted Browsing Function in WebPAC Pro

3.1 Systems Requirements

1. Hardware

x86-based PC/server

Our server configuration: dual quad-core Xeon CPUs, 32 GB memory, 1 TB HDD space

NOTE: 220,000 bib records were uploaded for testing; they use 7.8 GB for MySQL and Sphinx

2. Software

Perl 5 with marc2xml and marc-charset (for MARC to XML conversion)

MySQL 5 (for data storage)

Sphinx (for building the index and searching data)

IIS with ASP 3.0 (for user interface and data conversion)

System requirements are minimal:

No dedicated server is needed

No special high-end programming language is required (Perl, MySQL, and Sphinx are freeware)

3.2 Program Workflow

Two major parts:

1. Construct a bibliographic record database for facets analysis

2. Create a special iFrame in WebPAC Pro for displaying facets

3.3 Construction of a New Bibliographic Database

1. Metadata are required to calculate facets

2. Thus, we built a separate database to store the raw data for creating facets, instead of using the records in the Innopac system

3. All MARC records are exported from the Innopac system and uploaded to our in-house developed bibliographic database

4. An indexing program is designed to extract facets according to the 11 categories below:

Variable Fields:

1. Author

2. Title

3. Subject

4. Publisher

Fixed Fields:

5. Scope

6. Language

7. Material Type

8. Location

9. Publication Year

10. Call No. (browsing only)
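
As a rough illustration of what such an indexing program does, the sketch below pulls facet values out of one simplified record. The record layout, the field keys (100a, 245a, 650a, 260b), the sample data and the function name are all assumptions made for illustration; the presentation does not say which MARC fields the HKBU program reads. The only MARC facts relied on are that field 008 holds the first publication date at positions 07-10 and the language code at positions 35-37.

```python
# Hypothetical 40-character MARC 008 field, built piece by piece so the
# character positions are easy to check.
FIXED_008 = "850101" + "s" + "1976" + "    " + "enk" + " " * 17 + "eng" + " " + "d"

# Hypothetical, pre-parsed record: key -> list of values (variable fields)
# plus the raw 008 string (fixed field).
record = {
    "008": FIXED_008,
    "100a": ["Smith, Adam, 1723-1790"],
    "245a": ["An inquiry into the nature and causes of the wealth of nations"],
    "650a": ["Economics", "Ethics"],
    "260b": ["Clarendon Press"],
}

# Assumed mapping from facet category to the key that supplies its values.
VARIABLE_FIELD_FACETS = {
    "Author": "100a",
    "Title": "245a",
    "Subject": "650a",
    "Publisher": "260b",
}


def extract_facets(rec):
    """Return {facet category: [values]} for one record."""
    facets = {
        name: list(rec.get(key, []))
        for name, key in VARIABLE_FIELD_FACETS.items()
    }
    fixed = rec.get("008", "")
    if len(fixed) >= 38:
        facets["Publication Year"] = [fixed[7:11]]   # 008/07-10
        facets["Language"] = [fixed[35:38]]          # 008/35-37
    return facets


facets = extract_facets(record)
for category, values in facets.items():
    print(category, values)
```

Scope, Material Type and Location are left out because they come from system-specific codes that the presentation does not describe.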

3.4 Facets Variable Fields

Facet variable fields for Author Search = Smith, Adam

3.5 Facets Fixed Fields

Facet fixed fields for Author Search = Smith, Adam

3.6 Insert Facets on WebPAC Pro

1. Because WebPAC Pro is an open environment, we can insert scripts and create an iFrame to display facets on the brief citation browse page and the bib record page

3.7 iFrame Tag on Briefcit.html

An example of the iFrame tag:

<iFrame src="http://lib.hkbu.edu.hk/facet/browse/index.asp?searchterm" width="100%" height="100%" frameborder="0" scrolling="no">

</iFrame>

3.8 Facet Display Program

1. Take the search term as input by extracting it from the WebPAC search URL, e.g. http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=X&searcharg=china&searchscope=11&SORT=DZ&extended=0&SUBMIT=Search&searchlimits=&searchorigarg=Xchina

2. Pass the search term to the bibliographic database and make a SQL search query

3. Extract the search results from the in-house bibliographic database, then calculate and group the facet values

4. Display the faceted categories and values in the iFrame
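
The four steps of the facet display program can be sketched roughly as follows. This is not the HKBU code, which runs on ASP with MySQL and Sphinx: here Python's built-in sqlite3 stands in for the database, a plain SQL LIKE stands in for the Sphinx search, and the table names, column names, sample rows and function names are all hypothetical. Only the search URL comes from the presentation.

```python
import sqlite3
from collections import defaultdict
from urllib.parse import parse_qs, urlparse

# Search URL quoted in the presentation.
WEBPAC_URL = (
    "http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=X&searcharg=china"
    "&searchscope=11&SORT=DZ&extended=0&SUBMIT=Search&searchlimits="
    "&searchorigarg=Xchina"
)


def search_term_from_url(url):
    """Step 1: extract the search term from the WebPAC search URL."""
    params = parse_qs(urlparse(url).query)
    return params.get("searcharg", [""])[0]


def facet_counts(conn, term):
    """Steps 2-3: run a SQL query, then group the facet values by category."""
    rows = conn.execute(
        """
        SELECT f.category, f.value, COUNT(*) AS n
        FROM bib AS b JOIN facet AS f ON f.bib_id = b.id
        WHERE b.title LIKE ?
        GROUP BY f.category, f.value
        ORDER BY f.category, n DESC, f.value
        """,
        (f"%{term}%",),
    ).fetchall()
    grouped = defaultdict(list)
    for category, value, n in rows:
        grouped[category].append((value, n))
    return dict(grouped)


def render(facets):
    """Step 4: plain-text stand-in for what the iFrame would display."""
    lines = []
    for category, values in facets.items():
        lines.append(category)
        lines.extend(f"  {value} ({n})" for value, n in values)
    return "\n".join(lines)


# Hypothetical schema and sample rows (not the HKBU database).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bib (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE facet (bib_id INTEGER, category TEXT, value TEXT);
    INSERT INTO bib VALUES (1, 'China in the world economy'),
                           (2, 'Modern China'),
                           (3, 'Plato');
    INSERT INTO facet VALUES (1, 'Language', 'eng'),
                             (2, 'Language', 'eng'),
                             (1, 'Material Type', 'Book'),
                             (2, 'Material Type', 'E-book'),
                             (3, 'Language', 'eng');
    """
)

term = search_term_from_url(WEBPAC_URL)
print(render(facet_counts(conn, term)))
```

Grouping in SQL keeps the counting next to the data, so the display layer only has to format the rows it receives.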

Facet display workflow (diagram):

1. Create facet iFrame in briefcit.html

2. Extract search term and pass to iFrame

3. SQL to bibliographic db

4. Return facet values

4. BU Faceted Browsing Function and Encore: A Comparison

4.1 Facets Variable Fields

1. Encore cannot provide facet values for variable fields such as Author, Title, and Subject

2. Thus, Encore cannot offer a meaningful way to refine a search by variable field categories

3. An example: Author search = Smith, Adam

Encore Variable Field Facets

No facet values; indeed, only keyword search links are provided: Keyword-Author, Keyword-Title, Keyword-Subject

Fails to provide meaningful alternatives for users to refine or limit their search

Author Facets in HKBU

Names of Chinese translators are provided

Ebook collections

Title Facets in HKBU

List of Adam Smith’s most important works: Wealth of Nations, Theory of Moral Sentiments

Chinese translation titles for Wealth of Nations (原富, 國富論) are provided

Subject Facets in HKBU

Contributions of Adam Smith in subject areas: Economics, Ethics

4.2 New Facets Variable Field -- Publisher

1. BU provides “Publisher” as a new facet variable field

2. Users may choose to refine the search by publisher, e.g. Oxford University Press

4.3 Publication Year Facets

1. In BU, publication years are grouped into 10-year ranges instead of the long list of single years used in Encore

2. A list of 10-year ranges is easier for browsing, searching, and collection analysis
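
Grouping publication years into 10-year ranges can be sketched as below. The function name, the range label format ("1970-1979") and the sample years are assumptions for illustration; the "pre 1900" bucket mirrors the "pub year = pre 1900" demo example in this presentation, but the cut-off logic here is a guess at how it might be produced.

```python
from collections import Counter


def year_range(year, earliest=1900):
    """Map a publication year to its 10-year range label."""
    if year < earliest:
        return f"pre {earliest}"
    start = year - year % 10          # e.g. 1976 -> 1970
    return f"{start}-{start + 9}"


# Hypothetical publication years taken from a result set.
years = [1776, 1904, 1976, 1979, 2003, 2009]
range_counts = Counter(year_range(y) for y in years)

for label, count in range_counts.items():
    print(f"{label} ({count})")
```

Six single-year entries collapse into four ranges here, which is the shortening effect the slide describes.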

4.4 Encore Only Provides Keyword Searching

1. In Encore, keyword searching is the only searching capability

2. Without exact Author, Title, and Subject searches, the searching process becomes more complex and difficult

3. In BU, users can still use exact Author, Title, and Subject searches, and then refine the search by facets

4. It is difficult to do a keyword search for authors with common first and last names, e.g. Adam Smith

A keyword search finds all records containing both “Adam” and “Smith”

NOTE: the first two results were not written by the Adam Smith we are looking for, the British economist

An exact Author search is much easier and more straightforward

Adam Smith was born in the 18th century, so entry #2 is the one we are looking for

Facets are also helpful for refining the search
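
The difference between the two kinds of search can be sketched with a toy example. The records, the matching rules and the function names below are invented for illustration and are much simpler than what Encore or WebPAC Pro actually do: the keyword search matches any record whose author or title contains every query word, while the exact author search matches only author headings that begin with the query.

```python
import re

# Hypothetical (author heading, title) pairs.
records = [
    ("Smith, Adam, 1723-1790", "The wealth of nations"),
    ("Smith, Adam J.", "A made-up title"),
    ("Jones, Mary", "Adam Smith and modern economics"),
    ("Adam, Peter", "Letters to Mr. Smith"),
]


def words(text):
    """Lower-cased word tokens of a string."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_search(recs, query):
    """Records whose author + title contain every word of the query."""
    wanted = words(query)
    return [r for r in recs if wanted <= words(f"{r[0]} {r[1]}")]


def exact_author_search(recs, heading):
    """Records whose author heading starts with the given heading."""
    prefix = heading.lower()
    return [r for r in recs if r[0].lower().startswith(prefix)]


keyword_hits = keyword_search(records, "Adam Smith")
author_hits = exact_author_search(records, "Smith, Adam")
print(len(keyword_hits), "keyword hits;", len(author_hits), "exact author hits")
```

The dates in the author heading then let the user pick the right Adam Smith from the shorter exact list.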

5. It is also difficult to search by keyword for subject headings containing common terms, e.g. Philosophy -- History -- China

In Encore, a keyword search finds all records containing “philosophy”, “history”, and “China”

NOTE: many of these records are irrelevant

An exact Subject search is much easier and more straightforward

Facets are very useful for refining the search

4.5 Call Number Analysis

Unavailable in Encore

e.g. Keyword = Plato

To help users browse the class number list, the program displays both the class number and its scope of content
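
Showing a class number together with its scope of content might look like the sketch below. It assumes Library of Congress Classification call numbers, which the presentation does not state; the lookup table is a tiny illustrative subset of LCC subclass captions, and the call numbers and function names are made up.

```python
import re
from collections import Counter

# Small illustrative subset of LCC subclasses (assumed scheme).
LCC_SCOPE = {
    "B": "Philosophy (General)",
    "JC": "Political theory",
    "PA": "Greek language and literature. Latin language and literature",
}


def class_letters(call_no):
    """Leading class letters of a call number, e.g. 'JC71 .P6' -> 'JC'."""
    match = re.match(r"[A-Z]+", call_no.strip().upper())
    return match.group(0) if match else ""


def call_number_facet(call_numbers):
    """Return (class, scope, count) rows sorted by class letters."""
    counts = Counter(class_letters(c) for c in call_numbers)
    return [
        (cls, LCC_SCOPE.get(cls, "(scope not listed)"), n)
        for cls, n in sorted(counts.items())
    ]


# Made-up call numbers standing in for a Keyword = Plato result set.
sample = ["B395 .A5 1990", "B398 .E8 T3", "JC71 .P6 2000", "PA4279 .R4"]
rows = call_number_facet(sample)
for cls, scope, n in rows:
    print(f"{cls}  {scope} ({n})")
```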

4.6 Fully Compatible with All WebPAC Searching Functions

Unavailable in Encore:

1. Exact Author, Title, Subject Searching

2. Scope searching

3. Limit results to items with "Available" status

4. Search History

5. Author/title/subject authority list (e.g. Author search = Strauss, Johann, 1825-1899)

6. Modify/Limit this Search command

7. Advanced Keyword Search Form

5. Conclusion

5.1 Benefit of NFCW

Creating a new faceted browsing function in the existing WebPAC Pro environment is beneficial:

1. WebPAC Pro can be re-engineered and upgraded to become a Next Generation Library Catalog

2. This upgrade keeps all the advanced searching functionality of WebPAC Pro

3. The upgrade cost is low because there is no need to re-invent a new WebPAC

4. The annual maintenance cost is low because there is no need to maintain two different WebPAC interfaces

5. The development cycle is faster because we can concentrate our work on designing new functions

5.2 Future Development

In the second phase, we will add the following new Web 2.0 functions:

1. Cloud Tagging

2. Newly Added Book List

3. RSS*

4. User Book Rating

5. User Book Review/Comment

6. Adding Google/Amazon table of contents*

7. Most Common Search Terms*

*Not available in Encore

Functions shown on the slide: Cloud Tagging; Recently Added List; RSS; User Book Rating; User Book Comment/Review; Google/Amazon table of contents

Demo

Thank you

Demo examples

BU WebPAC Staging Port

http://hkbulib.hkbu.edu.hk:2082/search/X?SEARCH=plato&SORT=D&l=&m=&p=&b=&Da=&Db=&searchscope=11

KW = Plato (scope = Multimedia)

AU = 張五常 (publisher = 香港經濟日報)

AU = Strauss, Johann, 1825-1899 (subject = Waltzes (Orchestra))

SU = Kant (author = 牟宗三)

KW = China (pub year = pre 1900)