1
Interoperability: architectures and connections
John Gilby, M25 Systems Team, LSE
Ashley Sanders, Copac Team, MIMAS
“Hyper Clumps, Mini Clumps and National Catalogues: resource discovery for the 21st century”
11th November 2004, British Library, London
2
Contents
• Overview of technical architecture of union catalogues (Copac & InforM25)
• Introduce Z39.50 to Z39.50 middleware & issues to consider
• CC-interop and JAFER
– installation, configuration & testing
• Results set issues and searching times
3
A reminder, Z39.50 is:
• a standard for information retrieval
• a client/server relationship
– Z-client: stand-alone in PC or associated with web server/user interface
– Z-server: generally a module in library systems
• a method for communication between disparate computer systems (such as a library catalogue and a user’s PC)
4
Copac
• has 26 libraries (including large research, academic and BL, NLS)
• geographically covers whole of UK
• JISC funded, administered by MIMAS
• has “control” over indexes and searching process
• can be searched via Z39.50
• periodic data loads
• live circulation data via Z39.50 = very successful and popular with users
• Copac V3 – experimental Z39.50 searching of Copac and National Library of Wales
5
CURL/Copac database creation
[Flow diagram. Labels: incoming MARC records (MARC21 & UKMARC) from contributing institutions; record pre-processing (standardisation & problem identification); duplicate checks (pass/fail); formation of consolidated and individual records & indexes; Copac database; Z-server, OpenURL & web interface (access via web and Z39.50).]
6
Distributed catalogue
• typically has up to 40 library catalogues (academic – CAIRNS, InforM25, RIDING; public – WiLL)
• regionally based
• funded by regional organisation
• rely on institutional catalogues for record standards, indexing and Z-server configurations
• some control over Z39.50 searching process
• data is as up to date as library OPAC
• ‘clump’ software combines result sets and presents them to user
• generally cannot accept queries outside of user interface
7
Union catalogues
[Diagram. Labels: user; network; Copac (single, large database); distributed catalogue (Z-client software and user interface); network; Z-server/institutional library systems.]
8
Z to Z Middleware
[Diagram. Labels: remote user Z-client; ‘local’ user web interface; Z39.50 to Z39.50 middleware; Z39.50 links to institution Z-server A and institution Z-server B; examples given: M25 libraries, Copac V3.]
9
Connection Issues
• When to make connections?
• Which Z-servers?
– selecting some/all, landscaping
• Access & Authentication
– handled by middleware
• Timing of middleware response
– user’s client is expecting single response
– middleware has to wait for Z-servers to respond before it responds to client
– automatic time-out advisable
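The timing issue above can be sketched in a few lines of Python. This is a minimal illustration, not the JAFER implementation: `search_server`, the server names and the delays are all hypothetical stand-ins, and no real Z39.50 traffic is involved. The middleware sends the query to every Z-server in parallel, waits up to a fixed time-out, and then sends the client one response listing which servers answered and which did not.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical per-server delays in seconds (illustrative, not project data).
SIMULATED_DELAY = {"server_a": 0.01, "server_b": 0.02, "server_c": 1.5}


def search_server(name, query):
    """Stand-in for a real Z39.50 search: sleeps, then returns a fake hit count."""
    time.sleep(SIMULATED_DELAY[name])
    return {"server": name, "query": query, "hits": len(name)}


def federated_search(query, servers, timeout):
    """Query all servers in parallel and build a single response for the client."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    futures = {pool.submit(search_server, s, query): s for s in servers}
    # Block until every server has answered or the time-out expires.
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not hold the client up for slow servers
    answered = sorted((f.result() for f in done), key=lambda r: r["server"])
    timed_out = sorted(futures[f] for f in not_done)
    return {"answered": answered, "timed_out": timed_out}


response = federated_search("austen", list(SIMULATED_DELAY), timeout=0.5)
print(response["timed_out"])
```

With these assumed delays the two fast servers answer and the slow one is reported as timed out, so the client still receives a single, bounded response.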
10
Search & Result Set Issues
• Query transformation
– multiple Z-servers behave differently to an incoming query
– user sends query in their own ‘format’ (attribute set)
– need to avoid failed searches
– middleware transforms query to form suitable for individual Z-servers
• Response aggregation
– user’s client cannot know hits/Z-server
– client must display origin of record
– various options
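A rough Python sketch of both ideas follows. The attribute numbers are standard Bib-1 values (Use 1003 = author; Structure 1 = phrase, 2 = word), but the per-server rewrite rules, the server names and the record values are invented for illustration and do not describe any real catalogue or the JAFER configuration.

```python
# Incoming query: Bib-1 style attributes as {type: value} plus a search term.
# Type 1 = Use (1003 = author), type 4 = Structure (1 = phrase, 2 = word).
incoming = {"attributes": {1: 1003, 4: 1}, "term": "austen"}

# Hypothetical rewrite rules: (type, value) the server rejects -> replacement.
SERVER_RULES = {
    "server_a": {},                 # assumed to accept the query unchanged
    "server_b": {(4, 1): (4, 2)},   # assumed to lack phrase searching
}


def transform(query, rules):
    """Return a copy of the query rewritten into a form the server accepts."""
    attrs = {}
    for attr_type, value in query["attributes"].items():
        new_type, new_value = rules.get((attr_type, value), (attr_type, value))
        attrs[new_type] = new_value
    return {"attributes": attrs, "term": query["term"]}


def aggregate(results_by_server):
    """Merge per-server record lists, tagging every record with its origin."""
    merged = []
    hits = {}
    for server, records in results_by_server.items():
        hits[server] = len(records)
        merged.extend({"origin": server, "record": r} for r in records)
    return {"total": len(merged), "hits_per_server": hits, "records": merged}


per_server_query = {s: transform(incoming, r) for s, r in SERVER_RULES.items()}
combined = aggregate({"server_a": ["rec1", "rec2"], "server_b": ["rec3"]})
print(per_server_query["server_b"]["attributes"], combined["hits_per_server"])
```

Keeping the origin on each record is one of the "various options": the client sees one merged set but can still show where each record came from.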
11
and so to JAFER
• Middleware options for CC-interop:
– graft Z39.50 server onto existing InforM25 software
– develop completely new software
– use existing available software
• JAFER Toolkit Project (JISC 5/99 Programme)
– readily available & supported
– could do most of what was required
12
Working with JAFER
• JAFER: http://www.jafer.org/
– increased the JAFER logging facilities
– established subsets of libraries for searching
– produced XSLT stylesheets
• Created new Copac Interface
– copy of standard Copac web interface tailored for testing JAFER
17
Search tests
• Search set 1 - Copac Z39.50 criteria
– no query transformations
• Search set 2 - M25 ‘best practice’ settings
– query transforms applied
18
Search test results — 1
• Access failed
– variable: always, sometimes, occasional, never
– Talis & Aleph access problems – firewall problems
• Access succeeded
– some searches received no response
19
Search test results — 2
• Response with Copac search settings
– 203 searches carried out
– 95 failed to return a result (0 or more records)
• Response with InforM25 settings
– 199 searches carried out
– 3 failed to return a result (0 or more records)
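To put the two sets of figures on the same scale, the failure rates can be worked out directly. The counts below are the ones quoted above; the function name is just a label for this sketch.

```python
def failure_rate(failed, total):
    """Percentage of searches that failed to return a result, to one decimal place."""
    return round(100 * failed / total, 1)


copac_settings = failure_rate(95, 203)     # searches sent without query transformation
inform25_settings = failure_rate(3, 199)   # searches sent with M25 'best practice' settings

print(copac_settings, inform25_settings)   # 46.8 1.5
```

That is roughly 47% of searches failing with the Copac settings against about 1.5% with the InforM25 settings.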
20
Middleware benefits
• Simplifies access to range of catalogues
• Query transformation improves search success rate
• Virtual catalogue staff can:
– provide centralised development and maintenance
– identify and investigate problems
– act as a central contact point
• Can interconnect the (JISC) Information Environment
• Potentially useful for a National Catalogue
21
Search problems/solutions
• Users lose control of query
• Search consistency
– failure of catalogues to respond
– lowest common denominator or all options?
– catalogues searching different fields
– catalogues searching fields in different ways
• Standardisation
– profiles, e.g. Bath Profile
– work on index standardisation
22
Response times
• Improved access to resources
– benefits end-user and library staff
BUT
– impacts on local catalogue
– over-large result sets
– duplication of material
• Response times
– impact on local catalogue searcher
– impact on virtual catalogue searcher
23
Response time test
• Hourly search for ‘Austen’
– record time taken to obtain search result
– does not include record collection or result processing
• Number of searches responding
– c.90% within 2 seconds
– c.4% within 4-27 seconds
• Overall response time governed by slowest catalogue
– timeouts for slow-to-respond or non-responding catalogues
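The last point can be made concrete with a small calculation. The per-catalogue timings and catalogue names below are hypothetical, chosen only to fall in the ranges mentioned above; they are not measurements from the test. Without a timeout the user waits for the slowest catalogue; with one, the wait is capped and the slow catalogues are dropped from that search.

```python
# Hypothetical response times in seconds for one search (illustrative only).
timings = {"catalogue_a": 0.8, "catalogue_b": 1.6, "catalogue_c": 27.0}


def overall_response_time(times, timeout=None):
    """Time the user waits: the slowest catalogue, capped by the timeout if set."""
    slowest = max(times.values())
    return slowest if timeout is None else min(slowest, timeout)


def dropped(times, timeout):
    """Catalogues that would not have answered before the timeout."""
    return sorted(name for name, t in times.items() if t > timeout)


print(overall_response_time(timings))             # 27.0
print(overall_response_time(timings, timeout=5))  # 5
print(dropped(timings, timeout=5))                # ['catalogue_c']
```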
24
Restricted searches
• Should all searches be sent to all catalogues?
– control where searches are sent initially
– pre-defined search groups - by location/subject?
• Better to deal with large result sets through ranking and/or sorting?
– which brings us back to response times…
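One way to picture pre-defined search groups is as a simple lookup from group name to target catalogues. The group names and catalogue names here are made up for the sketch; how real groups would be defined (by location, by subject, or otherwise) is exactly the open question above.

```python
# Hypothetical search groups; names are placeholders, not real services.
SEARCH_GROUPS = {
    "location:central": ["catalogue_a", "catalogue_b"],
    "subject:medicine": ["catalogue_b", "catalogue_c"],
}

ALL_CATALOGUES = sorted({c for members in SEARCH_GROUPS.values() for c in members})


def select_targets(groups=None):
    """Return the catalogues to search: every catalogue if no group is chosen."""
    if not groups:
        return ALL_CATALOGUES
    selected = []
    for group in groups:
        for catalogue in SEARCH_GROUPS[group]:
            if catalogue not in selected:  # a catalogue may sit in several groups
                selected.append(catalogue)
    return selected


print(select_targets(["subject:medicine"]))
print(select_targets())
```

Restricting the initial search this way reduces both the load on local catalogues and the number of slow responders that can hold up the overall result.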
25
Summary & what next ?
• JAFER tests - middleware works
• Enables distributed catalogues to be ‘plugged into’ the IE
• Dynamic resource selection is technically feasible
• Clump services interested
• Further investigations:
– Response-time tests
– Results processing
26
Further details
• Reports on the project website: http://ccinterop.cdlr.strath.ac.uk/documents.htm
• Copac Team: [email protected]
• M25 Systems Team: [email protected]