UKOLN is supported by:
The JISC Information Environment
Bath Profile Four Years On: what’s being done in the UK?
7th July 2003
Andy Powell, UKOLN, University of Bath
www.bath.ac.uk
A centre of expertise in digital information management
www.ukoln.ac.uk
Contents
• JISC Information Environment technical architecture http://www.ukoln.ac.uk/distributed-systems/jisc-ie/
– putting Z39.50 and the Bath Profile in a national context
• JISC IE service registry http://www.mimas.ac.uk/iesr/
– disclosing the existence of Bath Profile targets
• technical issues
– Z39.50/Bath Profile and other ‘discovery’ technologies
Simple scenario
• consider a researcher searching for material to inform a research paper on HIV and/or AIDS
• he or she searches for ‘hiv aids’ using:
– the RDN, to discover Internet resources
– ZETOC, to discover recent journal articles
• (and, of course, he or she may use a whole range of other search strategies using other services as well)
Issues
• different user interfaces
– look-and-feel
– subject classification, metadata usage
• everything is HTML
– human-oriented
– difficult to merge results, e.g. combine into a list of references
– difficult to build a reading list to pass on to students
– need to manually copy-and-paste search results into HTML page or MS-Word document or desktop reference manager or …
Issues (2)
• difficult to move from discovering journal article to having copy in hand (or on desktop)
• users need to manually join services together
• problem with hardwired links to books and journal articles, e.g.
– lecturer links to university library OPAC but student is distance learner and prefers to buy online at Amazon
– lecturer links to IngentaJournals but student prefers paper copy in library
The problem space…
• from perspective of ‘data consumer’
– need to interact with multiple collections of stuff - bibliographic, full-text, data, image, video, etc.
– delivered thru multiple Web sites
– few cross-collection discovery services (with exception of big search engines like Google, but lots of stuff is not available to Google, i.e. it is part of the ‘invisible Web’)
• from perspective of ‘data provider’
– few agreed mechanisms for disclosing availability of content
UK JISC IE context…
• 206 collections and counting… (Hazel Woodward, e-ICOLC, Helsinki, Nov 2001)
– Books: 10,000 +
– Journals: 5,000 +
– Images: 250,000 +
– Discovery tools: 50 +
• A & I databases, COPAC, RDN, …
– National mapping data & satellite imagery
• plus institutional content (e-prints, research data, library content, learning resources, etc.)
• plus content made available thru projects – 5/99, FAIR, X4L, …
• plus …
The problem(s)…
• portal problem
– how to provide seamless discovery across multiple content providers
• appropriate-copy problem
– how to provide access to the most appropriate copy of a resource (given access rights, preferences, cost, speed of delivery, etc.)
A solution…
• an information environment
• framework of machine-oriented services allowing the end-user to
– discover, access, use and publish resources across a range of content providers
• move away from lots of stand-alone Web sites...
• ...towards more coherent whole
• remove need for user to interact with multiple content providers
– note: ‘remove need’ rather than ‘prevent’
JISC Information Env.
• discover
– finding stuff across multiple content providers
• access
– streamlining access to appropriate copy
• content providers expose metadata about their content for
– searching
– harvesting
– alerting
• develop services that bring stuff together
– portals (subject portals, media-specific portals, geospatial portals, institutional portals, VLEs, …)
A note about ‘portals’
• ‘portal’ word possibly slightly misleading
• the JISC IE architecture supports many different kinds of user-focused services…
– subject portal
– reading list and other tools in VLE
– commercial ‘portals’ (ISI Web of Knowledge, ingenta, Bb Resource Center, etc.)
– library ‘portal’ (e.g. Zportal or MetaLib)
– SFX service component
– personal desktop reference manager (e.g. Endnote)
– increasingly rich browser-based tools: XSLT, Javascript, Java, SOAP, …
Discovery
• technologies that allow providers to disclose metadata to portals
– searching - Z39.50 (Bath Profile Functional Area C), and SRW
– harvesting - OAI-PMH
– alerting - RDF Site Summary (RSS)
• fusion services may sit between provider and portal
– broker (searching)
– aggregator (harvesting and alerting)
– catalogue (manually created records)
– index (machine-generated full-text index)
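The harvesting route can be sketched in a few lines. This is an illustration only, not part of the original slides: the base URL is a hypothetical placeholder, and the sample response is a hand-written fragment in the general shape of an OAI-PMH ListRecords reply carrying simple Dublin Core (oai_dc). The sketch builds the request a harvester would send and pulls identifiers and titles out of the reply.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, for illustration only.
BASE_URL = "http://repository.example.org/oai"

# XML namespaces used by OAI-PMH 2.0 responses carrying simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample reply (invented data), trimmed to the parts used below.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample e-print on HIV and AIDS</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the URL for an OAI-PMH ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # optional selective-harvesting date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier and title from each record in a ListRecords reply."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_records(SAMPLE_RESPONSE))
```

An aggregator in the fusion layer would run the same request against many providers and store the parsed records; a real harvester would also need to handle resumption tokens and errors, which this sketch leaves out.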
Access
• in the case of books, journals, journal articles, end-user wants access to the most appropriate copy
• need to join up discovery services with access/delivery services (local library OPAC, ingentaJournals, Amazon, etc.)
• need localised view of available services
• discovery service uses the OpenURL to pass metadata about the resource to an ‘OpenURL resolver’
• the ‘OpenURL resolver’ provides pointers to the most appropriate copy of the resource, given:
– user and institutional preferences, cost, access rights, location, etc.
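The hand-off from discovery service to resolver can be sketched as below. This is an illustration under stated assumptions, not a description of any real resolver: the resolver address is a hypothetical placeholder, the query keys (genre, issn, volume, spage) follow the early key/value style of OpenURL, and the candidate copies, costs and selection rule are all invented to show the idea of an ‘appropriate copy’.

```python
from urllib.parse import urlencode

# Hypothetical institutional resolver address, for illustration only.
RESOLVER_BASE = "http://resolver.example.ac.uk/openurl"


def make_openurl(resolver_base, **metadata):
    """Encode citation metadata as a query string on the resolver's base URL."""
    query = {key: value for key, value in metadata.items() if value}
    return resolver_base + "?" + urlencode(query)


def choose_copy(candidates, licensed_services, preferred_format):
    """Toy resolver rule (an assumption, not a standard algorithm).

    Keep only copies the user is entitled to reach, then prefer the
    user's chosen format and, within that, the lowest cost.
    """
    accessible = [
        c for c in candidates
        if not c["needs_licence"] or c["service"] in licensed_services
    ]
    if not accessible:
        return None
    return min(
        accessible,
        key=lambda c: (c["format"] != preferred_format, c["cost"]),
    )


# Invented candidate copies, named after the kinds of service on the slide.
CANDIDATES = [
    {"service": "ingentaJournals", "format": "online",
     "needs_licence": True, "cost": 0.0},
    {"service": "library OPAC", "format": "print",
     "needs_licence": False, "cost": 0.0},
    {"service": "Amazon", "format": "print",
     "needs_licence": False, "cost": 25.0},
]

if __name__ == "__main__":
    print(make_openurl(RESOLVER_BASE, genre="article",
                       issn="1234-5678", volume="12", spage="34"))
    print(choose_copy(CANDIDATES, {"ingentaJournals"}, "online"))
```

The point of the design is that the discovery service only describes the resource; the choice of copy is made by a resolver that knows the local user and institution.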
Shared services
• service registry
– information about collections (content) and services (protocol) that make that content available
• authentication and authorisation
• OpenURL and other resolver services
• user preferences and institutional profiles
• terminology services
• metadata registries
• ...
JISC Information Environment
[Diagram: JISC IE architecture]
• provision: JISC-funded content providers, institutional content providers, external content providers
• fusion: brokers, aggregators, catalogues, indexes
• presentation: institutional portals, subject portals, learning management systems, media-specific portals
• end-user desktop/browser
• OpenURL resolvers
• shared infrastructure: authentication/authorisation (Athens), JISC IE service registry, institutional preferences services, terminology services, user preferences services, resolvers, metadata schema registries
Summary
• Z39.50 (Bath Profile), OAI, RSS are key ‘discovery’ technologies...
– … and by implication, XML and simple/unqualified Dublin Core
– anticipate growing requirement to transport ‘qualified DC’ and IEEE LOM metadata
• access to resources via OpenURL and resolvers where appropriate
• Z39.50 and OAI not mutually exclusive
• general need for all services to know what other services are available to them
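One reason searching and harvesting can coexist is that both routes can deliver simple Dublin Core, so results can be merged downstream. The sketch below is an assumption-laden illustration: records are plain dictionaries keyed by DC element names, the data is invented, and how each set was obtained (a Z39.50 search or an OAI harvest) is deliberately out of scope.

```python
def merge_records(*result_sets):
    """Merge simple DC records from several sources.

    Records sharing an identifier are treated as the same resource; the
    first copy seen wins, and later copies only fill in missing fields.
    Records without an identifier fall back to a lower-cased title key.
    """
    merged = {}
    for records in result_sets:
        for rec in records:
            key = rec.get("identifier") or rec.get("title", "").lower()
            if key not in merged:
                merged[key] = dict(rec)
            else:
                for field, value in rec.items():
                    merged[key].setdefault(field, value)
    return list(merged.values())


def format_reference(rec):
    """Render one merged record as a line in a reference list."""
    creator = rec.get("creator", "Unknown")
    date = rec.get("date", "n.d.")
    return f"{creator} ({date}). {rec.get('title', 'Untitled')}."


# Invented sample data standing in for a search result and a harvest.
SEARCHED = [
    {"identifier": "urn:x:1", "title": "Article A", "creator": "Smith"},
]
HARVESTED = [
    {"identifier": "urn:x:1", "title": "Article A", "date": "2002"},
    {"identifier": "urn:x:2", "title": "Report B"},
]

if __name__ == "__main__":
    for record in merge_records(SEARCHED, HARVESTED):
        print(format_reference(record))
```

Matching on identifier alone is the simplest possible rule; real de-duplication across providers is considerably harder.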
IE Service Registry
[Diagram: the JISC IE architecture diagram repeated, with the same provision, fusion, presentation and shared infrastructure components, including the JISC IE service registry]
IE Service Registry
IESR purpose
“to allow service components to discover and interact with other service components within the JISC IE”
• collection descriptions (describing the content of collections)
• service descriptions (protocol-level detail about how to interact with service components)
• Z39.50, SRW, OAI-PMH, RSS, OpenURL resolvers, SOAP services, Web sites, CGI-based services
ZeeRex
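The split between collection descriptions and service descriptions can be pictured with a small data model. This is a hypothetical sketch for illustration, not the IESR schema: the class names, fields, entries and endpoint addresses are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceDescription:
    """Protocol-level detail about how to reach a service (hypothetical)."""
    protocol: str   # e.g. "Z39.50", "OAI-PMH", "RSS"
    endpoint: str   # address of the service


@dataclass
class CollectionDescription:
    """Describes the content of a collection (hypothetical fields)."""
    title: str
    subjects: list
    services: list = field(default_factory=list)


class Registry:
    """In-memory stand-in for a service registry."""

    def __init__(self):
        self._collections = []

    def register(self, collection):
        self._collections.append(collection)

    def find_services(self, protocol, subject=None):
        """Return (collection title, endpoint) pairs offering a protocol,
        optionally restricted to collections covering a subject."""
        hits = []
        for coll in self._collections:
            if subject is not None and subject not in coll.subjects:
                continue
            for svc in coll.services:
                if svc.protocol == protocol:
                    hits.append((coll.title, svc.endpoint))
        return hits


def build_example_registry():
    """Populate a registry with invented entries."""
    registry = Registry()
    registry.register(CollectionDescription(
        title="Example article index",
        subjects=["medicine"],
        services=[
            ServiceDescription("Z39.50", "z3950.example.org:210/articles"),
            ServiceDescription("RSS", "http://example.org/articles/rss"),
        ],
    ))
    registry.register(CollectionDescription(
        title="Example e-print archive",
        subjects=["medicine", "physics"],
        services=[ServiceDescription("OAI-PMH", "http://example.org/oai")],
    ))
    return registry


if __name__ == "__main__":
    print(build_example_registry().find_services("Z39.50", subject="medicine"))
```

A portal could use a lookup like this to find which targets to search for a given subject before issuing any queries.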
Z39.50 – one among many
• in the context of something like the JISC IE…
• Z39.50/Bath Profile is part of a bigger fabric of protocols (SRW, OAI-PMH, SOAP/XQuery, RDF/RDFQuery, …)
• many are based on XML and DC
• many developers will work across all the above
• desirable to have more consistent approaches to use of
– XML, XML schemas vs. DTDs, XML namespaces
e-Learning and Bath Profile
• e-Learning seems to be a significant driving force behind cross-domain activity
• is there an argument that Bath Profile should cater better for e-Learning activities?
– support for qualified DC (DC-Education)
– support for IEEE LOM (as per IMS Digital Repositories Interoperability Spec.)
Conclusions
• Z39.50 and Bath Profile remains a key component in initiatives like the JISC IE
• but… it is only one component among many
• deployment and use is almost always in the context of other available technologies
• future work needs to be mindful of the way the Web is evolving (XML, URI, RDF, client/server, etc.)
• should IMS DRI (e-Learning work) be folded into Bath Profile?