Post on 20-Dec-2015

River Campus Libraries

GUF: Getting Users to Full-Text (With Voyager®, ENCompass™, OpenURL, etc.)

Jeff Suszczynski, Senior Web Developer

jeffs@library.rochester.edu

Library IT Environment

Digital Initiatives Unit
- Software Developers (3)
- Systems Analyst
- Computer Scientist
- Anthropologist
- Art Director / Designer
- Digital Librarian for Public Services
- Usability Team
- Content Groups

Library IT Environment

Usability Testing Lab

Metasearch - Major Issues

Our users fail when they must:

- Deal with Link Resolver menu choices
- Follow long click-paths
- Get stuck at dead-ends
- Resubmit their searches mid-session

Major Issues - Examples

Major Issues - Resolved

GUF (Getting Users to Full-text)

Title links on results screen lead to either:
- Full-text (best)
- Print holdings information with map (2nd best)
- Pre-filled interlibrary loan request form (worst)

Issues Addressed:
- Deal with Link Resolver menus
- Follow long click-paths
- Get stuck at dead-ends
- Resubmit search mid-session

[Diagram: GUF on the library web server. Labels: library website user interface, search, list of results, GUF, subscription database, full text, map to journal, ILL login with request.]

Major Issues - Resolved

In each case, the user only had to click once to get to full-text or the next best option.

Major Issues - Resolved

To resolve user issues:

- Link Resolver menu choices
- Long click-paths
- Dead-ends
- Resubmitting searches

GUF must do the following:

- Improve metadata transfer
- Eliminate/handle errors
- Eliminate clicks
- Check local holdings
- Handle multiple editions

Improved Metadata Transfer

Problem:

Metasearch (ERA) ==> Link Resolver (SFX, LinkFinderPlus)

Metadata hand-offs are extremely important…
…and are generally inadequate:
- inability to handle different sets of metadata for different databases
- inability to handle multiple item types from each database

Improved Metadata Transfer

Example for Database X:

if type (072 $a) = ‘Journal Article’ or ‘Periodical’ then: ISSN will be in the 022 $a or 773 $x or in a section of the 024 $a (parsable SICI)

BUT

if type (072 $a) = ‘Collected Volume Article’ (series) then: ISSN might be in the 490 $x

Metasearch products generally don’t allow for this granularity.
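The Database X rule can be pictured as a small type-dependent lookup. The sketch below is illustrative only: the record shape (MARC tag, then subfield code, then value) and the function names `findIssn` and `issnFromSici` are assumptions, not part of GUF.

```javascript
// Hypothetical record shape: { '072': { a: 'Journal Article' }, '773': { x: '00084360' } }
// Field choices follow the "Database X" rule on the slide.
function findIssn(record) {
  const get = (tag, code) => (record[tag] && record[tag][code]) || '';
  const type = get('072', 'a');
  let candidates = [];
  if (type === 'Journal Article' || type === 'Periodical') {
    candidates = [get('022', 'a'), get('773', 'x'), issnFromSici(get('024', 'a'))];
  } else if (type === 'Collected Volume Article') {
    candidates = [get('490', 'x')];
  }
  for (const value of candidates) {
    // Accept an ISSN with or without the hyphen; normalise to NNNN-NNNX.
    const match = /\b(\d{4})-?(\d{3}[\dXx])\b/.exec(value);
    if (match) return match[1] + '-' + match[2].toUpperCase();
  }
  return '';
}

// A SICI begins with the ISSN, e.g. "0008-4360(2004)...".
function issnFromSici(sici) {
  const match = /^\s*(\d{4}-\d{3}[\dXx])/.exec(sici);
  return match ? match[1] : '';
}
```

Each database would get its own version of this rule, which is the granularity the slide says metasearch products generally lack.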

Improved Metadata Transfer

GUF succeeds where Metasearch => Link Resolver often fails: advanced metadata parsing tailored for each database’s quirks

Improved Metadata Transfer

How does GUF do this?

results.xsl => objectGUF.xsl

Title link on results.xsl sends metadata to objectGUF.xsl

Improved Metadata Transfer

How?

objectGUF.xsl => parserxxxdb.js

- objectGUF.xsl grabs and massages metadata received
- objectGUF.xsl sends cleaned and selected metadata to database-specific JavaScript file

<xsl:when test="$repocode='E_FSARTF'">
  <!-- Shared path to the MARC fields of the current record. -->
  <xsl:variable name="marc" select="/ENCOMPASS/ENCOMPASS_BATCHQUERY/GET_OBJECT_XML/SETTAGS/OXML/EncRepoObject/ObjMetadata/MARC"/>
  <xsl:variable name="sici" select="$marc/MR024/MR024a"/>
  <xsl:variable name="unique_id" select="$marc/MR035/MR035a"/>
  <xsl:variable name="author" select="$marc/MR100/MR100a"/>
  <!-- Strip double quotes so the title cannot end the JavaScript string early.
       (The original declared "atitle" twice, which XSLT 1.0 rejects within one scope.) -->
  <xsl:variable name="atitle" select="translate($marc/MR245/MR245a, '&quot;', '')"/>
  <xsl:variable name="jtitle" select="$marc/MR773/MR773t"/>
  <xsl:variable name="volume" select="$marc/MR949/MR949a"/>
  <xsl:variable name="issue" select="$marc/MR949/MR949b"/>
  <xsl:variable name="spage" select="$marc/MR949/MR949f"/>
  <xsl:variable name="epage" select="$marc/MR949/MR949c"/>
  <xsl:variable name="year" select="$marc/MR949/MR949g"/>

  <!-- Hand the selected metadata to the database-specific JavaScript parser. -->
  <script language="javascript">
    fa2_articlefirst("<xsl:value-of select="$sici"/>", "<xsl:value-of select="$unique_id"/>", "<xsl:value-of select="$author"/>",
      "<xsl:value-of select="$atitle"/>", "<xsl:value-of select="$jtitle"/>", "<xsl:value-of select="$volume"/>",
      "<xsl:value-of select="$issue"/>", "<xsl:value-of select="$spage"/>", "<xsl:value-of select="$epage"/>",
      "<xsl:value-of select="$year"/>")
  </script>
</xsl:when>

Improved Metadata Transfer

How?

parserxxxdb.js

JavaScript file, tailored for particular database, receives the metadata from objectGUF.xsl

It further massages the metadata using regular expressions

Improved Metadata Transfer

<MR973a>The New Republic$bNew Repub$c233$d4$fJuly$gJuly$h07$i25$j2005$k6</MR973a>

// journalinfo holds the contents of the MR973a field shown above.

// Volume: the text between "$c" and the next "$".
var volume = '';
if (/\$c/.test(journalinfo)) {
  volume = journalinfo.replace(/.*\$c/, '').replace(/\$.*/, '');
}

// Issue: the text between "$d" and the next "$", with any letters removed.
var issue = '';
if (/\$d/.test(journalinfo)) {
  issue = journalinfo.replace(/.*\$d/, '').replace(/\$.*/, '');
  issue = issue.replace(/[a-zA-Z]/g, '');
}
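The same idea extends to every subfield at once. The following is a minimal sketch, not GUF's code: `parseSubfields` is a hypothetical helper, and storing the text before the first "$" under the code "a" is an assumption based on the sample record. Unlike the greedy patterns above, it keeps the first occurrence of a repeated code.

```javascript
// Split a "$"-delimited string such as the MR973a value into a map of
// subfield code -> value.
function parseSubfields(journalinfo) {
  const fields = {};
  const parts = journalinfo.split('$');
  fields.a = parts[0]; // assumption: leading text is the $a value
  for (const part of parts.slice(1)) {
    if (part.length === 0) continue;
    const code = part[0];
    // Keep the first occurrence of a repeated code.
    if (!(code in fields)) fields[code] = part.slice(1);
  }
  return fields;
}

const sample = parseSubfields('The New Republic$bNew Repub$c233$d4$fJuly$gJuly$h07$i25$j2005$k6');
// sample.c is the volume ("233") and sample.d the issue ("4").
```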

Improved Metadata Transfer

How?

parserxxxdb.js

JavaScript file forms robust OpenURL using regular expressions and advanced parsing

JavaScript file ships the OpenURL off to GUF

All of this creates a better OpenURL than a typical Metasearch => Link Resolver hand-off, allowing for more and better links to full-text.
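As a rough illustration of the last step, the sketch below assembles a key/value OpenURL from the parsed fields. The function name `buildOpenUrl` and the base URL in the usage note are placeholders; the key names are the ones that appear in the SFX example in this deck, plus `epage`.

```javascript
// Build an OpenURL query string from parsed citation fields.
// Empty or missing fields are skipped so the resolver is not handed blanks.
function buildOpenUrl(baseUrl, citation) {
  const keys = ['genre', 'issn', 'title', 'atitle', 'volume', 'issue',
                'spage', 'epage', 'date', 'aulast', 'aufirst'];
  const pairs = [];
  for (const key of keys) {
    const value = String(citation[key] || '').trim();
    if (value !== '') pairs.push(key + '=' + encodeURIComponent(value));
  }
  return baseUrl + '?' + pairs.join('&');
}
```

For example, `buildOpenUrl('http://example.org/guf', { genre: 'article', issn: '0008-4360', volume: '43', spage: '149' })` produces a URL carrying only those four keys.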

Articles – Eliminating Errors

Problem:

Full-text link on Link Resolver menu yields error page fairly often.

Articles – Eliminating Errors

GUF eliminates these dead-ends by:
- pre-fetching pages
- trying other sources on error

Articles – Eliminating Errors

1. GUF receives OpenURL from ERA server (JavaScript file)

2. GUF formats OpenURL into XML

Example:

http://chico.lib.rochester.edu:8080/SFX_API/sfx_local?XML=
<?xml version="1.0" ?>
<open-url>
  <object_description>
    <object_metadata_zone>
      <genre>article</genre>
      <issn>00084360</issn>
      <volume>43</volume>
      <issue>181</issue>
      <spage>149</spage>
      <title>Canadian Literature</title>
      <atitle>A+glimpse+of+something</atitle>
      <date>2004-08</date>
      <aulast>Beauregard</aulast>
      <aufirst>Guy</aufirst>
      <__service_type>getFullTxt</__service_type>
    </object_metadata_zone>
  </object_description>
</open-url>
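Step 2 could be sketched as follows. This is an assumption-laden illustration rather than GUF's ColdFusion: `openUrlToXml` and `escapeXml` are hypothetical names, and the element names simply mirror the example request, using the `open-url` spelling of its closing tag.

```javascript
// Escape the characters that would otherwise break the XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wrap citation fields in the request envelope used in the example.
function openUrlToXml(fields, serviceType) {
  const lines = ['<?xml version="1.0" ?>', '<open-url>',
                 '<object_description>', '<object_metadata_zone>'];
  for (const name of Object.keys(fields)) {
    lines.push('<' + name + '>' + escapeXml(fields[name]) + '</' + name + '>');
  }
  lines.push('<__service_type>' + escapeXml(serviceType) + '</__service_type>');
  lines.push('</object_metadata_zone>', '</object_description>', '</open-url>');
  return lines.join('\n');
}
```

Escaping matters here because article titles routinely contain ampersands and angle brackets.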

Articles – Eliminating Errors

3. GUF sends XML OpenURL via HTTP Post to SFX API

4. SFX API returns a list of full-text URLs

Example…

Articles – Eliminating Errors

<?xml version="1.0"?>
<openurl_result>
  <record>
    <aulast>Beauregard</aulast>
    <date>2004</date>
    <atitle>A glimpse of something</atitle>
    <spage>149</spage>
    <issn>00084360</issn>
    <__service_type>getFullTxt</__service_type>
    <issue>181</issue>
    <title>Canadian Literature</title>
    <aufirst>Guy</aufirst>
  </record>
  <target>
    <url>http://gateway.proquest.com/openurl?ctx_ver=Z39.88-2003&amp;res_id=xri:pqd&amp;rft_val_fmt=ori:fmt:kev:mtx:journal&amp;genre=article&amp;issn=0008-4360&amp;date=2004&amp;atitle=A+glimpse+of+something&amp;req_dat=xri:pqil:pq_clntid=17941</url>
    <target_name>available via ProQuest Research Library</target_name>
    <service>getFullTxt</service>
  </target>
</openurl_result>
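Reading the target list out of such a response might look like the sketch below. It assumes only the element names visible in the example (`target`, `url`, `target_name`); `extractTargets` and `decodeEntities` are hypothetical helpers, and regular expressions stand in for a real XML parser to keep the example dependency-free.

```javascript
// Return the URL and display name of every <target> in the response.
function extractTargets(xml) {
  const targets = [];
  const targetPattern = /<target>([\s\S]*?)<\/target>/g;
  let match;
  while ((match = targetPattern.exec(xml)) !== null) {
    const url = /<url>([\s\S]*?)<\/url>/.exec(match[1]);
    const name = /<target_name>([\s\S]*?)<\/target_name>/.exec(match[1]);
    if (url) {
      targets.push({
        // URLs contain no whitespace, so stray line breaks can be dropped.
        url: decodeEntities(url[1].replace(/\s+/g, '')),
        name: name ? decodeEntities(name[1].trim()) : ''
      });
    }
  }
  return targets;
}

// Decode the basic XML entities; "&amp;" goes last to avoid double decoding.
function decodeEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&');
}
```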

Articles – Eliminating Errors

5. GUF executes HTTP calls (screen scraping) to the full-text URLs

6. If any error occurs, skip to the next full-text URL until all are exhausted

<CFELSEIF #REFindNoCase("proquest", targets)#>
<cfloop from="1" to="#ArrayLen(ft_link)#" index="i">
  <CFIF #REFindNoCase("proquest", ft_link[i].XmlText)#>
    <!--- Go to link from ProQuest. --->
    <CFSET target_url = #ft_link[i].XmlText#>
    <CFIF REFindNoCase("issn\=\d\d", target_url) AND REFindNoCase("volume\=\d", target_url) AND REFindNoCase("issue\=\d", target_url) AND REFindNoCase("spage\=\d", target_url)>
      <CFSET target_url = #REReplaceNoCase(target_url, "\&date\=(\d)*\&req", "&req")#>
    </CFIF>
    <CFIF REFindNoCase("\&date\=\d\d\d\d\-\d\d\d\&", target_url)>
      <CFSET target_url = REReplaceNoCase(target_url, "\-\d\d\d\&atitle", "&atitle")>
    </CFIF>
    <CFHTTP URL="#target_url#">
    <CFSET new_url = #cfhttp.filecontent#>
    <!--- Need to insert error checking here. --->
    <CFIF REFindNoCase("did not find any documents", new_url) EQ 0>
      <!--- Check for link to PDF. --->
      <CFIF #REFindNoCase("alt\=""Page Image \- PDF", new_url)#>
        <!--- If it exists, parse out the URL and go to it. --->
        <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Page Image \- PDF.*", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
        <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
        <CFSET new_url = "http://proquest.umi.com" & #new_url#>
        <CFLOCATION url="#new_url#" addtoken="no">
      <!--- Otherwise, check for link to Text+Graphics. --->
      <CFELSEIF #REFindNoCase("alt\=""Text\+Graphics", new_url)#>
        <!--- If it exists, parse out the URL and go to it. --->
        <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Text\+Graphics.*", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
        <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
        <CFSET new_url = "http://proquest.umi.com" & #new_url#>
        <CFLOCATION url="#new_url#" addtoken="no">
      <!--- Check for link to Full text (text-only format). --->
      <CFELSEIF #REFind("alt\=""Full text", new_url)#>
        <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Full text.*", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
        <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
        <CFSET new_url = "http://proquest.umi.com" & #new_url# & "##fulltext">
        <CFLOCATION url="#new_url#" addtoken="no">
      <!--- Otherwise, do nothing. --->
      <CFELSE>
      </CFIF>
    </CFIF>
  </CFIF>
</cfloop>
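Stripped of vendor-specific scraping, steps 5 and 6 amount to a try-each-source loop. The sketch below is a generic restatement under stated assumptions, not the ColdFusion above: `fetchPage` is a caller-supplied asynchronous function (hypothetical) resolving to `{ status, body }`, and `errorPatterns` holds non-global regular expressions for text that marks a vendor error page, such as the "did not find any documents" check.

```javascript
// Return the first URL whose page loads and does not look like an error
// page, or null once every source has been tried.
async function firstWorkingUrl(urls, fetchPage, errorPatterns) {
  for (const url of urls) {
    try {
      const page = await fetchPage(url);
      const isErrorPage = errorPatterns.some((pattern) => pattern.test(page.body));
      if (page.status === 200 && !isErrorPage) return url;
    } catch (err) {
      // Network failure: fall through to the next candidate.
    }
  }
  return null; // All sources exhausted.
}
```

When the result is null, the next-best options from earlier in the deck (print holdings, then the interlibrary loan form) would take over.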

Articles – Eliminating Clicks

Problem:

Full-text links in Link Resolver menus often lead the user to a journal- or abstract-level page.

This forces the user to scan the page and click one or more times to actually see the article.

Articles – Eliminating Clicks

GUF:
- drills down to article level for most databases
- screen scrapes for embedded links to PDF or HTML full text

<CFELSEIF #REFindNoCase("muse", targets)#>
<cfloop from="1" to="#ArrayLen(ft_link)#" index="i">
  <CFIF #REFindNoCase("muse", ft_link[i].XmlText)#>
    <!--- If 'muse' was found in data from SFX API, execute CFHTTP call to the URL. --->
    <CFHTTP URL="#ft_link[i].XmlText#">
    <!--- Now screen scrape the data, look to make sure a link to a specific article is returned. --->
    <CFIF (REFindNoCase("Access article in PDF", cfhttp.filecontent)) AND (REFindNoCase("href\=.*\.pdf", cfhttp.filecontent)) AND (REFindNoCase("\<title\>.*Table of Contents\<\/title", cfhttp.filecontent) EQ 0)>
      <!--- If a specific article is returned, parse out just the embedded link to full-text --->
      <CFSET new_url = #cfhttp.filecontent#>
      <CFSET new_url = #REReplaceNoCase(new_url, ".*\<url\>", "")#>
      <CFSET new_url = #REReplaceNoCase(new_url, "\<\/url\>.*", "")#>
      <CFSET new_url = #REReplaceNoCase(new_url, "\.(html|htm)", ".pdf")#>
      <CFSET new_url = "http://muse.uq.edu.au" & new_url>
      <!--- Now go right to the article, without extra clicks. --->
      <CFLOCATION url="#new_url#" addtoken="no">
    <CFELSEIF REFindNoCase("following articles match", cfhttp.filecontent)>
      <CFSET new_url = #cfhttp.filecontent#>
      <CFIF IsDefined('url.atitle') AND url.atitle NEQ "">
        <CFSET new_url = #REReplaceNoCase(new_url, "#url.atitle#.*", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
        <CFSET new_url = #REReplaceNoCase(new_url, """ target\=""\_new""\>.*", "")#>
        <CFSET base_url = "http://muse.uq.edu.au">
        <CFSET new_url = base_url & new_url>
        <CFLOCATION url="#new_url#" addtoken="no">
      </CFIF>
    <CFELSEIF REFindNoCase("Page Not Found\<\/title\>", cfhttp.filecontent) EQ 0>
      <CFSET new_url = #ft_link[i].XmlText#>
      <CFLOCATION url="#new_url#" addtoken="no">
    <CFELSE>
    </CFIF>
  </CFIF>
</cfloop>

Articles – Check Local Holdings

Problem:

Link resolvers do not compare citation info against actual holdings info.

Articles – Check Local Holdings

GUF is able to check your citation against local print holdings

How?
- Extract of entire print journal holdings from Voyager into homegrown SQL table
- Print holdings are parsed using a set of regular expressions on our non-standard holdings info

Thus, the user only sees the ‘Library Holdings’ page when the library actually owns the volume/issue/date for a metasearch result.
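To make the holdings check concrete, here is a minimal sketch. The holdings format is an assumption: the deck says only that the local statements are non-standard, so the pattern "v.12-v.43 (1970-2004)" and the helper names `parseHoldings` and `isHeld` are purely illustrative.

```javascript
// Parse one holdings statement of the assumed form "v.12-v.43 (1970-2004)".
// An open range such as "v.12- (1970-)" is treated as running to the present.
function parseHoldings(statement) {
  const match = /v\.\s*(\d+)\s*-\s*(?:v\.\s*(\d+))?\s*\((\d{4})\s*-\s*(\d{4})?\)/i.exec(statement);
  if (!match) return null;
  return {
    firstVolume: Number(match[1]),
    lastVolume: match[2] ? Number(match[2]) : Infinity,
    firstYear: Number(match[3]),
    lastYear: match[4] ? Number(match[4]) : Infinity
  };
}

// True only when both the cited volume and year fall inside the held range.
function isHeld(statement, volume, year) {
  const range = parseHoldings(statement);
  if (!range) return false;
  return volume >= range.firstVolume && volume <= range.lastVolume &&
         year >= range.firstYear && year <= range.lastYear;
}
```

A statement that cannot be parsed counts as not held, so the user is never sent to a holdings page on a guess.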

Books – Multiple Editions

Problem:

Metasearch can yield book results for particular editions

Link resolvers currently have no way to determine whether local holdings include alternate editions

Dead-end result for users

Books – Multiple Editions

GUF uses OCLC’s xISBN service to find all editions and their ISBNs.

Example:
<CFHTTP url="http://labs.oclc.org/xisbn/#url.isbn#" method="get">

Yields:
<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>0441172717</isbn>
  <isbn>0801950775</isbn>
  <isbn>0399128964</isbn>
  <isbn>044100590x</isbn>
  <isbn>1556909330</isbn>
  <isbn>0425027066</isbn>
  <isbn>0425036987</isbn>
  <isbn>0425046877</isbn>
  <isbn>042507160x</isbn>
  <isbn>042505313x</isbn>
  <isbn>0441172660</isbn>
  <isbn>0736692401</isbn>
  …
</idlist>
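Turning that response into a list of ISBNs to search is a small parsing step. The sketch below assumes only the `isbn` elements shown in the sample and handles 10-character ISBNs, as in that sample; `parseIsbnList` is a hypothetical name.

```javascript
// Collect the distinct ISBNs from an xISBN-style <idlist> response,
// normalising a trailing check character "x" to upper case.
function parseIsbnList(xml) {
  const isbns = [];
  const pattern = /<isbn>\s*([0-9]{9}[0-9Xx])\s*<\/isbn>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    const isbn = match[1].toUpperCase();
    if (!isbns.includes(isbn)) isbns.push(isbn);
  }
  return isbns;
}
```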

Books – Check Local Holdings

GUF searches catalog for complete list of ISBNs to find ‘good enough’ copy

How?
- ColdFusion custom tag that allows multiple, concurrent HTTP requests against Voyager OPAC
- This allows for reasonable performance over large numbers of ISBN searches
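The concurrency idea can be sketched outside ColdFusion as well. In the example below, `searchCatalog` is a caller-supplied asynchronous function (hypothetical) that resolves to true when the catalog holds the given ISBN; the batch size of 5 is an arbitrary choice, not a figure from the deck.

```javascript
// Look up many ISBNs with a bounded number of concurrent catalog requests
// and return the ones the library holds, in their original order.
async function findHeldIsbns(isbns, searchCatalog, batchSize = 5) {
  const held = [];
  for (let start = 0; start < isbns.length; start += batchSize) {
    const batch = isbns.slice(start, start + batchSize);
    // Requests within a batch run concurrently; batches run in sequence
    // so the catalog is not flooded. A failed lookup counts as "not held".
    const results = await Promise.all(
      batch.map((isbn) => searchCatalog(isbn).catch(() => false))
    );
    batch.forEach((isbn, i) => { if (results[i]) held.push(isbn); });
  }
  return held;
}
```

Any ISBN in the returned list is a candidate ‘good enough’ copy to show the user.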

Miscellaneous Features

- ISSN lookup
- Homemade DOI resolver
- Date formatting
- findarticles.com

Future Enhancements

- Item records for print journals
- Journal abbreviation lookup table
- Google spellchecking
- CrossRef DOI help
- Flash-based maps

Other Ideas?

Any ideas you may have are welcome…
