Measuring system performance
The library: A system view
Environment
Inputs: energy, money, materials, personnel, information
Transformational process
Outputs: products, services
Users
System performance measures
• Recall
• Precision
• Relevance
Robert Taylor's four levels of question formation
Q1: The actual but unexpressed need for information (the visceral need)
Q2: The conscious, within-brain description of the need (the conscious need)
Q3: The formal statement of the need (the formalized need)
Q4: The question as presented to the information system (the compromised need)
Taylor, Robert S. 1968. Question-negotiation and information seeking in libraries. College & Research Libraries 29(3): 178-194 (May 1968).
System-defined relevance
find health AND feet
The health of the lumber industry in terms of cubic feet of lumber produced
90%
"My feet are killing me."
Information retrieval process
Question formulation
Relevancy determination
• System: Which documents are relevant to the query?
• User: Are these documents relevant to my needs?
Defining relevance
System-defined relevance vs. user-defined relevance
• System-defined relevance: Objective. Often topical. Does it match the query?
• User-defined relevance: Subjective. Situational. Is it useful?
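The system-side test ("Does it match the query?") can be sketched for the earlier query, find health AND feet. This is a minimal illustration, assuming a simple word-level Boolean AND; the function names are hypothetical and no particular retrieval system is implied.

```python
import re

def tokens(text):
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matches_and(query_terms, document):
    """Return True if every query term occurs as a word in the document."""
    return set(query_terms) <= tokens(document)

lumber = ("The health of the lumber industry in terms of "
          "cubic feet of lumber produced")
remedies = "Soothing remedies for aching feet"

# The lumber title satisfies the query even though it is useless to the user.
print(matches_and(["health", "feet"], lumber))    # True
# The title the user would actually want lacks the word "health".
print(matches_and(["health", "feet"], remedies))  # False
```

The match is objective and topical in the narrow sense of shared words, which is why it can disagree with the user's own judgment.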
User-defined relevance
The effect of lysergic acid diethylamide ingestion on toenail fungus in cloned mice
"My feet are killing me."
Soothing remedies for aching feet
Controlling the body by controlling the mind--meditative techniques for dealing with pain
Determining topical relevance
• Analyze work as to what it is about
• Assign to the document one or more terms from a finite list of topics
• Users can then search on those topic indicators
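The three steps above can be sketched as a small subject index. The topic list, document ids, and function name are hypothetical, chosen only to show terms being drawn from a finite list and then used for searching.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary (the "finite list of topics").
TOPICS = {"foot care", "pain management", "forestry"}

def build_index(assignments):
    """Map each topic to the set of document ids indexed under it."""
    index = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            if term not in TOPICS:
                raise ValueError(f"{term!r} is not in the topic list")
            index[term].add(doc_id)
    return index

# Hypothetical indexing decisions made after analyzing each work.
assignments = {
    "doc1": ["foot care", "pain management"],
    "doc2": ["pain management"],
    "doc3": ["forestry"],
}
index = build_index(assignments)

# A user searches on a topic indicator.
print(sorted(index["pain management"]))  # ['doc1', 'doc2']
```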
Recall
Recall = (No. of relevant documents retrieved) / (Total no. of relevant documents in the file)
Precision
Precision = (No. of relevant documents retrieved) / (Total no. of documents retrieved from the file)
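The two ratios can be computed directly from sets of document ids. This is a minimal sketch; the ids and the relevance judgments are made up for illustration.

```python
def recall(retrieved, relevant):
    """Relevant documents retrieved / all relevant documents in the file."""
    if not relevant:
        raise ValueError("recall is undefined when no documents are relevant")
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Relevant documents retrieved / all documents retrieved."""
    if not retrieved:
        raise ValueError("precision is undefined when nothing is retrieved")
    return len(retrieved & relevant) / len(retrieved)

# Hypothetical file: four relevant documents, three retrieved, two of them relevant.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d5"}

print(recall(retrieved, relevant))     # 0.5
print(precision(retrieved, relevant))  # 2 of 3 retrieved documents are relevant
```

Both measures share a numerator; only the denominator changes, from what should have been found to what was actually returned.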
Precision vs. Recall
An inverse relationship
As the level of recall rises, the level of precision generally declines, and vice versa.
The Cranfield experiments (1957 & 1962)
Cyril Cleverdon, principal investigator
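The inverse relationship between recall and precision can be illustrated by walking down a ranked result list and recomputing both measures at each cutoff. The ranking below is invented for illustration and is not data from the Cranfield experiments.

```python
def precision_recall_at_k(ranked_relevance, total_relevant):
    """Return (k, recall, precision) for every cutoff k of a ranked result list."""
    rows = []
    hits = 0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        hits += is_relevant  # True counts as 1, False as 0
        rows.append((k, hits / total_relevant, hits / k))
    return rows

# Hypothetical ranked results: True marks a relevant document.
ranked = [True, True, False, True, False, False, True, False]

for k, r, p in precision_recall_at_k(ranked, total_relevant=4):
    print(f"top {k}: recall={r:.2f} precision={p:.2f}")
```

Retrieving more of the list pushes recall from 0.25 up to 1.00 while precision drifts from 1.00 down to 0.50. The decline is not strictly monotonic, which fits the word "generally" in the statement above.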
Precision vs. Recall
Subject: sexual dimorphism
Word stemming (recall rises, precision falls): sex, sexes, sexual, sexy, sexier, sexiest
Field-specific searches (precision rises, recall falls): DE,TI/sexual()dimorphism
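The two search strategies for sexual dimorphism can be compared on a toy collection. Everything here is an assumption made for illustration: the titles and relevance judgments are invented, stemming is approximated by a prefix match on each word, and the field-specific search is approximated by an exact phrase match on the title.

```python
import re

# Hypothetical titles; the boolean marks whether the document
# is actually about sexual dimorphism.
DOCS = {
    "Sexual dimorphism in finches": True,
    "Size differences between the sexes in spiders": True,
    "The sexiest cars of the decade": False,
    "Sex education in schools": False,
    "Male and female plumage in ducks": True,
}

def search_stem(stem):
    """Retrieve titles containing any word that begins with the stem."""
    return {t for t in DOCS
            if any(w.startswith(stem) for w in re.findall(r"[a-z]+", t.lower()))}

def search_phrase(phrase):
    """Retrieve titles containing the exact phrase."""
    return {t for t in DOCS if phrase in t.lower()}

def recall_precision(retrieved):
    """Return (recall, precision); an empty result is reported as (0.0, 0.0)."""
    if not retrieved:
        return 0.0, 0.0
    relevant = {t for t, is_rel in DOCS.items() if is_rel}
    hits = len(retrieved & relevant)
    return hits / len(relevant), hits / len(retrieved)

print(recall_precision(search_stem("sex")))                 # broader: more recall, less precision
print(recall_precision(search_phrase("sexual dimorphism"))) # narrower: less recall, more precision
```

On this toy data the stem search retrieves four titles, two of them relevant, while the phrase search retrieves one title, which is relevant.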
User-defined relevance
"Relevance appears to be a subjective quality, unique between the individual and a given document supporting the assumption that relevance can only be judged by the information user."
Miranda Pao
Years later
The effect of lysergic acid diethylamide ingestion on toenail fungus in cloned mice
"My feet are still killing me."
Soothing remedies for aching feet
Controlling the body by controlling the mind--meditative techniques for dealing with pain
Factors affecting relevance (1)
• Purpose of the information
• Situation of the user
• Level at which the information source is written
– Journal of the Amer. Med. Assn.
– Healthy times
Factors affecting relevance (2)
• Subject knowledge of the user
– Is the data new to the user?
– Does the information relate to the user's prior knowledge?
• Values - ethical, social, philosophical, political, religious, legal
User-defined relevance
Subjectivity and fluidity make it difficult to use as a measuring tool for system performance
Incorporating user-defined relevance into information retrieval systems (1)
• User performs search
• System retrieves results
...
Incorporating user-defined relevance into information retrieval systems (2)
• System asks user if he/she would like to retrieve similar documents
– Search for other documents with similar word frequencies
– Search for other documents with same subject descriptors
Search for other documents with same subject descriptors
Main Author: Gribbin, John R.
Title: In search of Schrodinger's cat : quantum physics and reality / by John Gribbin.
Subject(s): Schrodinger, Erwin, 1887-1961.
Quantum theory--History.
Reality.
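Both ways of finding similar documents can be sketched in a few lines. The slides do not say how similarity of word frequencies is measured, so cosine similarity is used here as an assumption. The seed descriptors come from the record above; the catalog records and function names are hypothetical.

```python
import math
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each alphabetic word occurs in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(freq_a, freq_b):
    """Cosine of the angle between two word-frequency vectors (0.0 to 1.0)."""
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def shared_descriptors(seed_subjects, catalog):
    """Titles sharing at least one descriptor with the seed, most shared first."""
    seed = set(seed_subjects)
    scored = [(len(seed & set(subjects)), title)
              for title, subjects in catalog.items()]
    return [title for overlap, title in sorted(scored, key=lambda s: (-s[0], s[1]))
            if overlap > 0]

seed = ["Schrodinger, Erwin, 1887-1961", "Quantum theory--History", "Reality"]
catalog = {  # hypothetical records
    "Record A": ["Quantum theory--History", "Reality"],
    "Record B": ["Reality"],
    "Record C": ["Forests and forestry"],
}
print(shared_descriptors(seed, catalog))  # ['Record A', 'Record B']

similarity = cosine_similarity(word_frequencies("cats cats winter"),
                               word_frequencies("cats winter winter"))
print(round(similarity, 2))  # 0.8
```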
Amazon.com
Assisting users in determining relevancy
• Indexing terms
• Title
• Citation data
• Abstract
Source: Barry, Carol L. 1998. Document representations and clues to document relevance. Journal of the American Society for Information Science 49(14):1293-1303.
Document representation research
Titles
Full text
Title: Getting good grades in graduate school
Title: How to impress your advisor in graduate school
Title: Writing a dissertation
Title: The well-written graduate paper
Getting good grades in graduate school
The best way to get good grades is to study hard…
How to impress your advisor in graduate school
Never show up late for a meeting with your advisor…
Writing a dissertation
The first thing to do is to pick a topic that truly interests you…
The well-written graduate paper
Before finalizing your topic do a preliminary search on…
How relevant are these?
Document representation research
Titles
Citation data
Indexing terms
Abstracts
Full text
How relevant are these?
Utility studies - Indications that user found relevant materials
• Citation & abstract databases
– User requests citations be formatted for printing
– User requests citations be sent by e-mail
– User downloads citations
• Full-text databases
– Pull up the full text
– Print the article
– Download the article to their BlackBerry
Utility studies - Indications that user found relevant materials
Search
Short list
If the user stops here, he/she may not have found a relevant article
chocolate
Utility studies - Indications that user found relevant materials
Search
Short list
Modifies search
View full citation data for article
View full text of article
Download or print article
Assume that user found article relevant
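The inference drawn in a utility study can be sketched as a rule applied to a logged session. The action names are hypothetical, and treating every step beyond the short list (other than modifying the search) as a sign of relevance is an assumption about how the slide's flow should be read.

```python
# Hypothetical action names for a logged search session.
RELEVANCE_SIGNALS = {
    "view_full_citation",
    "view_full_text",
    "download",
    "print",
    "email",
}

def found_relevant(session_actions):
    """Assume a relevant article was found if any signal action occurred."""
    return any(action in RELEVANCE_SIGNALS for action in session_actions)

# The user stops at the short list: no evidence of a relevant article.
print(found_relevant(["search", "short_list"]))  # False

# The user refines the search, then opens and prints an article.
print(found_relevant(["search", "short_list", "modify_search",
                      "short_list", "view_full_text", "print"]))  # True
```

The rule measures behavior, not satisfaction, so it can only suggest that the user found something relevant.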
Characteristics of searches that produce relevant materials
• Subject searching
• Utilization of Boolean operators
• Search modification
• Increased time in display activities
• Use of a greater number of databases
Cooper, Michael D. and Hui-Min Chen. 2001. Predicting the relevance of a library catalog search. Journal of the American Society for Information Science and Technology 52(10):813-827.
Importance of abstract (1)
• Indication as to depth/scope of the article
• Delineates methodology--indication of reliability and validity
• Gives indication as to content novelty
Authors studied leg-hair count variations of Drosophila in Kawainui Marsh
Random sampling in 40 sectors during March, June, September & December
Greater variation in June
Importance of abstract (2)
• Basis for research may indicate recency
• Delineation of results indicates "tangibility" (important, useful data)
American housing market was selected because it is always robust.
Authors concluded that American teenagers listen to rock music.
Types of abstracts
• Indicative
• Informative
• Critical (evaluative)
(Not common in library databases)
Indicative abstract
Indicates what the document is about but doesn't report findings
Title: A review of the current literature on relevance.
Abstract: The author reviews the current literature on relevance.
Informative abstract
Acts as a substitute for the document
Title: The effects of library school on the mental health of library students
Abstract: The authors performed longitudinal studies on 32 graduate students in 8 library and information science programs and found a significant increase in aberrant psychological traits over time.
(fictitious title and abstracts)
Abstract creation
• Author-produced
• Vendor-added
• Automated abstracting
Automated abstracting
1. Word counts
2. Remove stop words
3. Weight remaining words according to frequency
4. Search for sentences with highest density of most frequently-occurring words
1. Word count
Title: Seasonal variations in the feral cat population of Fargo
the 81, is 68, a 56, to 42, cats 61, number 45, seasonal 27, winter 11
summer 11, spring 11, fall 11, monthly 10, temperature 61, variation 12, food 10, availability 10
average 9, concept 7, per 8, over 9, immediate 5, implement 3, mortality 8, survival 9
2. Eliminate stop words
Title: Seasonal variations in the feral cat population of Fargo
Stop words eliminated: the 81, is 68, a 56, to 42, per 8, over 9
Remaining words: cats 61, number 45, seasonal 27, winter 11, summer 11, spring 11, fall 11, monthly 10, temperature 61, variation 12, food 10, availability 10, average 9, concept 7, immediate 5, implement 3, mortality 8, survival 9
3. Rank by frequency
Title: Seasonal variations in the feral cat population of Fargo
cats 61, temperature 61, number 45, seasonal 27, variation 12, winter 11
summer 11, spring 11, fall 11, monthly 10, food 10, availability 10
average 9, survival 9, mortality 8, concept 7, immediate 5, implement 3
4. Search for sentences with highest density of high frequency words
Title: Seasonal variations in the feral cat population of Fargo
We found a significant seasonal variation in the number of cats. The highest number of cats are found in the summer, the lowest number of cats in the winter.
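The four steps can be sketched as a short extractive routine. This is an illustration under stated assumptions, not the method of any particular abstracting product: the stop list is deliberately tiny, density is taken as the share of a sentence's words that are high-frequency words, and the two filler sentences in the sample text are invented.

```python
import re
from collections import Counter

# A deliberately tiny stop list for illustration; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "to", "per", "over", "of", "in",
              "we", "and", "are", "were", "was", "by", "at"}

def words(text):
    """Return the lowercase alphabetic words of a text, in order."""
    return re.findall(r"[a-z]+", text.lower())

def auto_abstract(text, top_words=3, num_sentences=2):
    # Steps 1 and 2: count words, leaving out stop words.
    counts = Counter(w for w in words(text) if w not in STOP_WORDS)
    # Step 3: rank by frequency and keep the most frequent words.
    key_words = {w for w, _ in counts.most_common(top_words)}
    # Step 4: score each sentence by its density of high-frequency words.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def density(sentence):
        ws = words(sentence)
        return sum(w in key_words for w in ws) / len(ws) if ws else 0.0

    best = sorted(sentences, key=density, reverse=True)[:num_sentences]
    # Return the chosen sentences in their original order.
    return [s for s in sentences if s in best]

text = ("We found a significant seasonal variation in the number of cats. "
        "Funding was provided by the city. "
        "The highest number of cats are found in the summer, "
        "the lowest number of cats in the winter. "
        "Volunteers were thanked at a dinner.")

for sentence in auto_abstract(text):
    print(sentence)
```

On this sample the routine keeps the two sentences about cat numbers and drops the two filler sentences.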
Automated abstract
... The Children's Internet Protection Act (CIPA) sets conditions on public libraries' receipt of federal financial assistance for Internet access. ... It would not have been possible for the broadcasting station to limit the use of federal funds to all non-editorializing activities. ... The instant Court distinguished Velazquez, restricting its holding to situations in which the grantee is "pit[ted] . . . against the Government. ... " Justice Stevens asserted that the filtering condition was unconstitutional because it distorted the normal usage of library Internet terminals as sources of a wide array of information. ... A condition mandating Internet filters distorts this mission by "deny[ing] patrons access to constitutionally protected speech that libraries would otherwise provide. ...
Relevance and information overload
In this age of information overload, tools to aid the user in determining relevance are increasingly critical.