Intelligent catalogue final
To appreciate the paradigm shift involved in the next generation of search systems, one needs to look back at the traditional approach to resource discovery and compare it with the new trends. Here I focus on three aspects:
• Databases versus search engines
• Federated versus integrated search
• Integrated versus modular architecture
The Intelligent/Next Generation/Dynamic Catalogue
Birte Christensen-Dalsgaard
State and University Library
Aarhus, Denmark
TICER 2008
Vision: The Intelligent Information Client
A little thing that follows you, knows you, knows your different profiles and knows where you are, and based on this can find what is relevant and adequate for your situation.
• May push information (concert based on your last searches)
• Quality material based on crab searches
Steps towards the Information Client
• Information available for datamining
• Structure and semantics
• Identity management
• Tools and Services
• Search, present, resolve, “pay”, deliver
• Lots of technology
• Mobile network, GPS, RFID, reading, listening and viewing devices, …
Created by Andreas Rauber, Vienna Technological University
Issues addressed in this presentation
• Difference between federated search and integrated search
• Structured versus self organised
• Database versus Search Engines
• Verificative versus explorative search
• Importance of rank
• Link to behaviour information required
• Introduce quality?
• Importance of user involvement
Outline of presentation
• Start with the users
• Federated <-> Integrated search
• Datamining
• Ranking
• The user interface
• Search strategies
• SOA – from websites to services
• Requirement: standards, standards and standards
Observe Users: The users and their expectations?
Library Enthusiasts Drive-in users
From: Users expectation to the hybrid library
Question Users: E.g. How do you discover resources?
[Bar chart, scale 0 to 80: Library Users versus Library Non-Users; up to three choices]
Field Study by Proquest: Inhibitors for using licensed resources
• Lack of awareness
• Difficulty navigating library website to locate appropriate e-resources
• Authentication barriers, especially considering limited access points
From field study by John Law, Proquest (2007)
The different worlds
Librarian
The customer
EBSCO
Web of Science
Catalogue
Union Catalog
OCLC
LCSH
• Suggest
• Advise
• Supporting information
• User involvement
• Pervasive information
• Persuasive design
• …
Problem, Research topic, Project
Federated Search
OPAC a, OPAC b, E-journal, Institutional Repository
Resource Identification
Resource Delivery
Query and response
SRU/SRW
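As a rough illustration of the query-and-response step, an SRU searchRetrieve request is an HTTP GET that carries a CQL query. The sketch below only builds the URL and sends nothing. The base address `http://opac-a.example.org/sru`, the helper name `build_sru_url` and the example query are assumptions made for illustration.

```python
from urllib.parse import urlencode

def build_sru_url(base_url, cql_query, start=1, maximum=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query (no request is sent)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,          # CQL, e.g. an index name plus a search term
        "startRecord": start,        # 1-based position of the first record wanted
        "maximumRecords": maximum,   # page size
    }
    return base_url + "?" + urlencode(params)

# Hypothetical target address; a real deployment would publish its own.
url = build_sru_url("http://opac-a.example.org/sru", 'dc.title = "China"')
print(url)
```

The same URL pattern would be repeated once per target, which is where the problems listed next come from.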
Problems
• Different databases may respond in different ways (e.g. AND or OR as the default operator, result order)
• No means of ranking the results
• Merging requires that all targets have responded
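The problems above can be sketched with two simulated targets. This is a minimal sketch, not a real connector: `opac_a`, `ejournals` and their canned results are hypothetical stand-ins for remote databases, and the round-robin interleave is just one possible fallback when no common relevance score exists.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical targets: each returns hits in its own, non-comparable order.
def opac_a(query):
    time.sleep(0.01)
    return ["A1 (newest first)", "A2", "A3"]

def ejournals(query):
    time.sleep(0.05)  # the slowest target sets the overall response time
    return ["E1 (alphabetical)", "E2"]

def federated_search(query, targets):
    """Send the query to every target in parallel and wait for all replies."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = [pool.submit(target, query) for target in targets]
        # result() blocks, so merging cannot start before every target answers
        result_lists = [future.result() for future in futures]
    # No shared relevance score exists, so interleave the lists rank by rank.
    merged = []
    for rank in range(max(len(results) for results in result_lists)):
        for results in result_lists:
            if rank < len(results):
                merged.append(results[rank])
    return merged

merged = federated_search("china", [opac_a, ejournals])
print(merged)
```

The merged list alternates between targets simply because there is nothing better to sort on.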
[Architecture diagram. Inputs: Web Content, Files, Documents, Databases, Custom Applications, Multimedia. Connectors: Database Connector, File Traverser, Web Crawler, Content Push. Processing: Document Processing, Index Files, Search, Query & Result Processing, Filter, Tuning, Administration. Outputs: Query, Results, Alert. Clients: Vertical Applications, Portals, Custom Front-Ends, Mobile Devices.]
Slide from Dr. John M. Lervik, CEO FAST 7th International Bielefeld Conference 2004
Open, modular, scalable architecture
World according to ”FAST”
Integrated search
OPAC a, OPAC b, E-journal, Institutional Repository
Resource Identification
Metadata extracted or harvested from different sources
Index – based on metadata
Resource Delivery
Problems
• Need access to all data
• The more information for each “record” the better
• Ranking among heterogeneous information resources
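To make the integrated approach concrete, here is a minimal sketch of one index built over metadata harvested from several sources, with a simple TF-IDF ranking across all of them. The record identifiers and texts are invented, and real systems (for instance those built on Lucene) use far richer analysis and scoring.

```python
import math
from collections import Counter, defaultdict

# Hypothetical harvested records from different sources.
records = {
    "opac:1": "history of china and japan",
    "ejournal:7": "china economic review",
    "repo:3": "thesis on india trade",
}

def tokenize(text):
    return text.lower().split()

# Build one inverted index: term -> {record id: term frequency}
index = defaultdict(dict)
for rec_id, text in records.items():
    for term, tf in Counter(tokenize(text)).items():
        index[term][rec_id] = tf

def search(query):
    """Rank records by a simple TF-IDF sum over the query terms."""
    scores = Counter()
    n = len(records)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n / len(postings)) + 1.0  # rarer terms weigh more
        for rec_id, tf in postings.items():
            scores[rec_id] += tf * idf
    return [rec_id for rec_id, _ in scores.most_common()]

hits = search("china trade")
print(hits)
```

Because all records live in one index, a single score orders catalogue, e-journal and repository hits together, which is exactly the ranking problem noted above.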
Search Paradigm Change
• Federated typically associated with:
– Database approach
– Queries
– Based on Z39.50-like protocol
– Structured
– “Exact” match
• Integrated typically associated with:
– Search engine approach
– Natural language
– Large volume
– Statistical approach
Datamining - examples
• Recommender systems
• Content-based filtering: information about the item itself informs the recommendation
• Collaborative filtering: information drawn from user preferences/ratings informs the recommendation
• Audience level (OCLC)
• Clustering – based on other aspects
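The difference between the two filtering approaches can be sketched on toy data. Everything here is an assumption for illustration: the subject terms, the loan histories, and the choice of Jaccard overlap and co-loan counts as similarity measures.

```python
from collections import Counter

# Hypothetical data: subject terms per item, and loans per user.
subjects = {
    "book1": {"china", "history"},
    "book2": {"china", "economy"},
    "book3": {"japan", "history"},
    "book4": {"cooking"},
}
loans = {
    "ann": {"book1", "book4"},
    "bob": {"book1", "book4", "book2"},
    "cat": {"book3"},
}

def content_based(item):
    """Rank other items by Jaccard overlap of their subject terms."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    scored = [(jaccard(subjects[item], terms), other)
              for other, terms in subjects.items() if other != item]
    return [other for score, other in sorted(scored, reverse=True) if score > 0]

def collaborative(item):
    """Rank other items by how many users borrowed them together with `item`."""
    co_loans = Counter()
    for borrowed in loans.values():
        if item in borrowed:
            co_loans.update(borrowed - {item})
    return [other for other, _ in co_loans.most_common()]

print(content_based("book1"))   # items that share subject terms with book1
print(collaborative("book1"))   # items borrowed by the same users as book1
```

Note that the collaborative variant recommends the cooking book, which shares no subject with the starting item; only user behaviour connects them.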
Relevance and quality?
• Relevance
– A library focussed approach: how well does the record match the search, how good is the quality of the material represented by the record, how well does the material match the needs of the user?
– A community focussed approach: what do others use (circulation), user recommendations
– Individual: e.g. importance of publication date
• Quality
– Some sources are better than others?
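One way to picture how these signals might be combined is a weighted sum. This is only a sketch under stated assumptions: the `Hit` fields, the weights, the 50-year recency window and the sample data are all invented, and a production ranker would need tuning against real user behaviour.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    title: str
    text_match: float      # 0..1, how well the record matches the query
    loans: int             # circulation count, a community signal
    year: int              # publication year, an individual preference signal
    source_quality: float  # 0..1, editorial judgement of the source

def blended_score(hit, max_loans, current_year=2008,
                  w_match=0.5, w_use=0.2, w_recent=0.2, w_quality=0.1):
    """Weighted sum of library, community, individual and quality signals.
    The weights are illustrative assumptions, not tuned values."""
    use = hit.loans / max_loans if max_loans else 0.0
    recency = max(0.0, 1.0 - (current_year - hit.year) / 50.0)
    return (w_match * hit.text_match + w_use * use
            + w_recent * recency + w_quality * hit.source_quality)

hits = [
    Hit("Old exact match", 1.0, 2, 1960, 0.5),
    Hit("Popular recent title", 0.8, 100, 2007, 0.9),
]
max_loans = max(h.loans for h in hits)
ranked = sorted(hits, key=lambda h: blended_score(h, max_loans), reverse=True)
print([h.title for h in ranked])
```

With these weights the popular, recent title outranks the older exact match; changing the weights changes the answer, which is the point of treating rank as a design decision.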
Big challenge
Capitalise on Internet development
• Spam: We think this is spam. Do you agree?
• Help in the search process: Here are more options, which one is correct?
Two Relevant Search Strategies
• Verificative search – look-up
• Exact search terms – ideally few answers
• Can be sent to many databases: Federated Search
• Exploratory Search
• Approximate search terms – where results need refinement
• Tools to support refinement essential
• Need to operate on all available data: Integrated Search
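A common refinement tool for exploratory search is faceting: counting the values of a metadata field across the whole result set and letting the user narrow by one of them. The sketch below assumes a tiny invented result set and hypothetical field names; it shows why the counts need access to all the data at once.

```python
from collections import Counter

# Hypothetical result set from an integrated index.
results = [
    {"title": "Modern China", "type": "book", "language": "en"},
    {"title": "Kina i dag", "type": "book", "language": "da"},
    {"title": "China Quarterly", "type": "journal", "language": "en"},
]

def facet_counts(hits, field):
    """Count how many hits fall under each value of a metadata field."""
    return Counter(hit[field] for hit in hits)

def refine(hits, field, value):
    """Narrow the result set to hits with the chosen facet value."""
    return [hit for hit in hits if hit[field] == value]

print(facet_counts(results, "type"))      # overview before refinement
books = refine(results, "type", "book")   # the user picks the "book" facet
print(facet_counts(books, "language"))    # counts are recomputed on the subset
```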
Exploratory Search
Marchionini, G. (2006). Exploratory search: From finding to understanding. Communications of the ACM, 49(4): 41-46.
Next Generation Search Systems: Google-like search fields and support of “Common” features
Suggest
Did you mean
Basket
Different sorting mechanisms
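A "Did you mean" feature can be approximated by comparing the typed term with terms already known to the system. This is a minimal sketch using Python's standard `difflib`; the vocabulary list and the 0.7 similarity cutoff are assumptions, and real systems usually also weigh how often each term occurs.

```python
import difflib

# Hypothetical vocabulary, e.g. the terms present in the index.
vocabulary = ["china", "japan", "india", "denmark", "catalogue"]

def did_you_mean(term, cutoff=0.7):
    """Return the closest known term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(did_you_mean("catalouge"))  # a transposition typo
print(did_you_mean("xyz"))        # nothing similar in the vocabulary
```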
Search systems could
• Support enrichment of information objects – for indexing purposes
• Support exchange of information – such as tags (I don’t think any library has a large enough user base to generate enough tags for them to be relevant)
• Might take advantage of link collections to group resources
User generated information
Synthesise, Specialise, Mobilise
Robin Murray, 2006, Library Systems: Synthesise, Specialise, Mobilise in Ariadne vol 48
• Mobilise: to ensure that the material can be reached as a matter of course by the user.
• Specialise: to use specific knowledge to select and/or assemble material for use in specific contexts.
• Synthesise: to combine a diversity of material into a whole.
Service oriented architecture
Relevant information
Use of “external information”
Webservices
• Syndetics (Bowker)
• Amazon
• LibraryThing
OPAC a, OPAC b, E-journal, Institutional Repository
Resource Discovery
Different information webservices
Metadata extracted or harvested from different sources
Index based on data
Resource Delivery
Other Information Resources
Idea behind Summa, Primo, VUFinder, eXtensible Catalogue etc.
Example of search: Kina, Japan, Indien
Initiatives
• Endeca – http://www.lib.ncsu.edu/catalog/
• Primo – http://www.exlibrisgroup.com/primo.htm
• Encore - http://www.iii.com/encore/splash.html
• AquaBrowser – http://www.aquabrowser.com
• Meresco – http://meresco.com/meresco
• Summa – http://www.statsbiblioteket.dk/summa
• Worldcat local – http://www.lib.washington.edu/
Search layer – Library system: example Summa
Horizon
Lucene Index
Search system
Service
Browser
XSLT
HTML
XML via AJAX
WS WS WS
User
XML-repository
Webservices
Status, Reservation, Search, Get post
DLF ILS Discovery Interface Task Group
Standards, standards and standards
• Introduce semantics
• Ontologies (OWL)
• Personalised: Strategy for collecting and sharing information
• Identity management (SAML2)
• Tag-, recommendation- etc. services
• Basket – across all information resources
• Reserve, order material
And we have a good system….
• User Interface needs constant adjustment
• User expectation will change
• User environment will change
• Reading devices will change
History of Technological Change
• First stage: New technologies are applied to existing processes (Do more of the same faster/cheaper)
• Second stage: New technologies are integrated into existing processes (Improving existing systems)
• Third stage: New technologies are infused and diffused to create new processes and systems
Apply
Integrate
Infuse & diffuse
from: Mark Lawrence Kornbluh
Questions