Searching for NZ Information in the Virtual Library
Alastair G Smith
School of Information Management
Victoria University of Wellington
Overview
Search engines: local vs global
Search engines: limitations
Searching for NZ info: effective strategies
Information Quality on the Web
Making NZ info more accessible: the role of librarians
NZ information online
Online access can mean that US and European information is easier to access than NZ information, e.g. Dialog
However, the Internet provides an accessible infrastructure for making NZ information available, e.g. Knowledge Basket
Search tool definitions
Directories: resources categorised by human beings, e.g. Yahoo!, Te Puna Web Directory
Search engines: automatically created databases of web pages, searchable by keyword, e.g. Google, SearchNZ
Role of Search Engines
Convenient, fast, usually find some information (if not most relevant)
Most people turn to a search engine first (GVU user survey: 85%)
For NZ information we have a choice:
Global search engines, e.g. Google
Local search engines, e.g. SearchNZ
Comparing NZ and global search engines
Experiment compared NZ, global and metasearch engines
Test questions on NZ topics
Compared relative recall
Global Search Engines
AlltheWeb/FAST http://www.alltheweb.com/
Google http://www.google.co.nz/
HotBot http://hotbot.lycos.com/
Altavista http://nz.altavista.com/
Local Search Engines
SearchNZ http://www.searchnz.co.nz/
SearchNow http://www.searchnow.co.nz (no longer exists)
NZExplorer http://nzexplorer.co.nz/
Metasearch engines
Excite http://www.excite.com/
Vivisimo http://vivisimo.com/
Surfwax http://www.surfwax.com/
Examples of test questions
A description and image of the Maori flag
Information about the Otago Central Rail Trail
Information on the payment of British pensions in NZ
Recall
Recall: proportion of possible relevant documents found in search, e.g.
100 relevant documents in database
Search finds 20 relevant documents
Recall is 20%
Problems in using recall to evaluate search engines:
Don’t know total number of relevant documents on Web
Ranking: Is document “found” if it appears in first 10, first 20…?
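The recall definition and the ranking question above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the experiment; the function names, document IDs and cutoffs are hypothetical.

```python
def recall(found_relevant, total_relevant):
    """Proportion of all relevant documents that the search found."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return found_relevant / total_relevant


def recall_at_k(ranked_hits, relevant, k):
    """Recall when a document only counts as found if ranked in the first k hits."""
    relevant = set(relevant)
    found = set(ranked_hits[:k]) & relevant
    return recall(len(found), len(relevant))


# The slide's example: 20 of 100 relevant documents found.
slide_example = recall(20, 100)

# Hypothetical ranked list (d* relevant, x* not): the cutoff changes the answer.
hits = ["d1", "x1", "d2", "x2", "d3"]
relevant_docs = {"d1", "d2", "d3", "d4"}
recall_first_2 = recall_at_k(hits, relevant_docs, 2)
recall_first_5 = recall_at_k(hits, relevant_docs, 5)
```

Both functions need the total number of relevant documents, which is exactly what is unknown for the Web as a whole.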
Relative Recall
[Figure: result sets of search engines A, B and C]
Pool results of search engines A, B, C: approximates to all relevant documents
Recall in NZ search engine experiment
“First 20 relative recall”
Noted URLs of relevant documents found in first 20 hits for each search engine
Pooled results for all search engines
Used pooled list as approximation of all relevant documents
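The pooling procedure above can be sketched as follows. This is a minimal sketch that assumes relevance has already been judged by hand and that each engine's relevant URLs from its first 20 hits are held in a set; the engine names and URLs are placeholders, not data from the experiment.

```python
def relative_recall(relevant_by_engine):
    """Map each engine to its share of the pooled relevant URLs.

    relevant_by_engine: engine name -> relevant URLs noted in its first 20 hits.
    """
    # The pool (union over all engines) stands in for "all relevant documents".
    pool = set().union(*relevant_by_engine.values())
    if not pool:
        return {engine: 0.0 for engine in relevant_by_engine}
    return {
        engine: len(set(urls)) / len(pool)
        for engine, urls in relevant_by_engine.items()
    }


# Hypothetical judged results for one test question.
judged = {
    "engine_a": {"u1", "u2"},
    "engine_b": {"u2", "u3"},
    "engine_c": {"u4"},
}
scores = relative_recall(judged)
```

Because the pool only contains documents that at least one engine found, relative recall is an upper bound on true recall.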
Recall results
[Figure: chart of relative recall (%), axis 0 to 45, for AltaVista, AlltheWeb, HotBot, SearchNZ, SearchNow, NZExplorer, Surfwax, Vivisimo, Excite]
Points arising from recall results
Only one local search engine equalled global search engines
No search engine found over half of relevant documents
Metasearch engines did not outperform standalone search engines
Comparison with 2000
[Figure: chart of relative recall for NZ questions, axis 0.00 to 0.45, for NZ Explorer, SearchNZ, WebSearchNZ, ANZWERS, Excite Aus, GoEureka, AllTheWeb, Google]
Factors affecting performance of NZ search engines
Global search engines have similar or larger coverage of .nz sites
NZ search engines have less sophisticated search features
36% of sites relevant to NZ topics were outside .nz domain
Global search engines update more rapidly
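A figure such as the share of relevant sites outside the .nz domain could be computed from a list of relevant URLs roughly as below. This is a sketch under assumptions: the slides do not say how the 36% was derived, and the URLs here are made-up placeholders.

```python
from urllib.parse import urlsplit


def is_nz_domain(url):
    """True if the URL's host name is under the .nz top-level domain."""
    host = urlsplit(url).hostname or ""
    return host == "nz" or host.endswith(".nz")


def share_outside_nz(urls):
    """Fraction of the given URLs whose host is not under .nz."""
    urls = list(urls)
    if not urls:
        return 0.0
    outside = sum(1 for url in urls if not is_nz_domain(url))
    return outside / len(urls)


# Hypothetical relevant URLs for an NZ topic.
relevant_urls = [
    "http://www.example.co.nz/rail-trail",
    "http://example.org.nz/flag",
    "http://example.govt.nz/pensions",
    "http://www.example.com/nz-pensions",
]
outside_share = share_outside_nz(relevant_urls)
```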
Overlap of search engine hits
[Figure: chart of number of hits (axis 0 to 50) against number of search engines (1 to 10)]
Implications of overlap results
Most sites only found by one search engine
Few sites found by 7 or more search engines
Little overlap
Comprehensive searches require several search engines
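An overlap count of this kind can be sketched as below: for each URL, count how many engines returned it, then tally how many URLs were found by exactly 1, 2, 3... engines. The engine names and URLs are hypothetical, and this is not necessarily how the original figures were produced.

```python
from collections import Counter


def overlap_histogram(hits_by_engine):
    """Map 'number of engines' -> 'number of URLs found by exactly that many'."""
    engines_per_url = Counter()
    for urls in hits_by_engine.values():
        # set() ensures a URL counts once per engine even if listed twice.
        engines_per_url.update(set(urls))
    return Counter(engines_per_url.values())


# Hypothetical hits for one query.
hits_by_engine = {
    "engine_a": ["u1", "u2"],
    "engine_b": ["u2", "u3"],
    "engine_c": ["u2", "u4"],
}
histogram = overlap_histogram(hits_by_engine)
```

In this toy data three URLs are found by a single engine and one by all three, the same shape of result the slide describes: most sites found by only one engine.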
Why aren’t metasearch engines better?
Metasearch engines select a few top ranked items from each search engine list
Search engine ranking imperfect
Looking at more results from one search engine may be as useful as looking at a few from each
Metasearch engines use “lowest common denominator” search
But can be useful for specific terminology
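The "few top ranked items from each list" behaviour can be illustrated with a simple round-robin merge. This is only one plausible merging scheme, assumed for illustration; the slides do not describe how Excite, Vivisimo or Surfwax actually combine results, and the data is hypothetical.

```python
def metasearch(ranked_by_engine, per_engine=3):
    """Interleave the top per_engine hits from each engine, dropping duplicates."""
    merged, seen = [], set()
    for rank in range(per_engine):
        for hits in ranked_by_engine.values():
            if rank < len(hits) and hits[rank] not in seen:
                seen.add(hits[rank])
                merged.append(hits[rank])
    return merged


# Hypothetical ranked lists; "u4" sits fourth in engine_a's list.
ranked = {
    "engine_a": ["u1", "u2", "u3", "u4"],
    "engine_b": ["u2", "u5"],
}
merged_top_2 = metasearch(ranked, per_engine=2)
```

With `per_engine=2` the merged list never reaches `u3` or `u4`, so a relevant page ranked lower by a single engine is missed, whereas reading further down that one engine's list would have found it.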
Limitations of Search Engines for finding NZ information
“hidden web”
How does a search engine work?…
Search engine architecture
[Diagram components: Users, Interface, Query Engine, Index, Indexer, Crawler, Web]
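The components in the diagram can be sketched as a toy pipeline: a crawler follows links from a seed page, an indexer builds a keyword index, and a query engine answers searches from that index rather than from the live Web. Everything here is a simplified assumption for illustration; the in-memory `WEB` dictionary, the URLs and the function names are hypothetical, and real search engines are far more elaborate.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example/home": ("rail trail otago", ["a.example/map"]),
    "a.example/map": ("otago map cycling", []),
    "b.example/orphan": ("otago rail history", []),  # no page links to this one
}


def crawl(seed):
    """Crawler: follow links breadth-first from the seed, yielding (url, text)."""
    seen, queue = {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        yield url, text
        for link in links:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)


def build_index(pages):
    """Indexer: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages:
        for word in text.lower().split():
            index[word].add(url)
    return index


def query(index, terms):
    """Query engine: URLs containing every search word (keyword AND)."""
    words = terms.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(crawl("a.example/home"))
otago_hits = query(index, "otago")
rail_hits = query(index, "otago rail")
```

The orphan page mentions "otago" but is never returned, because the crawler can only reach pages that something links to, and the query engine only sees what the crawler collected.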
Search engine limitations:
Spider can’t access some types of pages: database, frames, javascript…
Only 40% of pages are highly linked, others difficult for spider to locate
Search is of database: “some of the pages that once existed on the Web”
Spider may be optimised for popular sites rather than full coverage
Implications for Internet search strategy for NZ topics
Use several search engines
Avoid restricting search to .nz domain
Don’t rely on search engines to find everything
Use directories, subject resource guides
Use as many words as possible to describe your topic: optimise relevance
NZ directory examples
NZ Subject Resource Guides
Searching in practice…
Quality of NZ information on the Web
Like global information, and information in print: variable
NZ Information quality examples
Role of librarians in making NZ internet information available
Sharing our knowledge of web navigation…
…Creating search tools and information resources
…Preserving Internet information
Conclusion
NZ search engines do not offer advantages over global search engines
Comprehensive searches involve several search engines, directories, subject guides
Librarians have a role in creating local search tools, and in improving search skills