Post on 20-Jan-2016
Vocabulary & languages in indexing & searching
Connection: indexing and searching
tefkos@rutgers.edu; http://comminfo.rutgers.edu/~tefko/
© Tefko Saracevic
Central idea
Indexing and searching are inexorably connected
– you cannot search for what was not first indexed in some manner or other
• to be searched, everything is, and must be, indexed somehow, even if it is not called “indexed”
– documents or objects are indexed in order to be searchable
• there are a great many ways to do indexing
– to index, one needs an indexing language
• there are a great many indexing languages
– even taking every word in a document is an indexing language
Knowing searching is knowing indexing
ToC
1. Definitions
2. Controlled & uncontrolled vocabularies
3. Inverted indexes
4. Thesaurus
A few concepts from general to specific
1. Definitions
Defined concepts valid for application in indexing & searching
General
– language
– vocabulary
Specific
– index terms
– indexing vocabulary
– indexing language
– descriptors
– keywords
– search terms
– search vocabulary
– query language
General definitions [Encarta Dictionary]
Language
1. communication with words: the human use of spoken or written words as a communication system
2. system of communication: a system of communication with its own set of conventions or special words
Vocabulary
1. words of language: all the words used in a language as a whole
2. words of subject area: the set of words associated with a subject or area of activity, or used by an individual person
Specific definitions
Starting from the most basic concept:
Index term
A word or phrase that denotes (describes) a concept & connotes (implies) a class
• the index term “table” describes a table [picture in the original slide] and implies many kinds of tables [pictures in the original slide], for which, if desired, we may have more specific index terms
More definitions ...
Indexing vocabulary
a set of index terms used in a domain or for a set of documents or objects
• it could even be for a single document or object, e.g. a book
Indexing language
an indexing vocabulary together with rules – syntax, grammar – for the application and use of its terms
Variation on index term
Descriptor
Word or phrase used to identify a topic or idea. Part of a controlled vocabulary, normally listed in a thesaurus (defined later). May be used as a search term.
Keyword
A significant word from the text of a record which can be used as a search term in a free-text search to retrieve all the records containing it
– Could be assigned manually, but now done mostly automatically – key entry in automatic indexing
Searching definitions
Question
request by a user related to the user’s information need, task, or problem at hand
Question analysis
breakdown & elaboration of concepts in a question to be translated into search terms
Query
question or part thereof as stated for searching according to the rules of a given system
more ...
Search term
a counterpart to index term, also denoting a concept and connoting a class for a search
Search vocabulary
a set of search terms in a domain or available in a system
Query language
a search vocabulary together with rules for its use in searching
elaboration …
• Question is what the user asks and what you may then have elaborated
• Query is what is asked of the computer to match – what is put in for searching
• Question is transformed into query
• Example question:
– What are some major historical developments in the area of information retrieval?
• Transformed into query
– history information retrieval (in Google)
– history AND information(w)retrieval (in Dialog), plus you have to select which file(s) to search
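The step from question to query can be sketched in code. This is only an illustration, assuming question analysis has already produced a list of concepts by hand; the function names are hypothetical, and the two output formats simply mimic the Google and Dialog examples on this slide, where (w) ties adjacent words together.

```python
def to_web_query(concepts):
    """Join concepts with spaces, as typed into a web search box."""
    return " ".join(concepts)


def to_dialog_query(concepts):
    """Join concepts with AND; words inside a multi-word concept are tied with (w)."""
    return " AND ".join("(w)".join(concept.split()) for concept in concepts)


# Concepts picked out of the question during question analysis (done by hand here).
concepts = ["history", "information retrieval"]

print(to_web_query(concepts))     # history information retrieval
print(to_dialog_query(concepts))  # history AND information(w)retrieval
```

The same concepts yield different queries because each system has its own query language.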
more …
“An index language is the language used to describe documents and requests.
The elements of the index language are index terms, which may be derived from the text of the document to be described, or may be arrived at independently.
The vocabulary of an index language may be controlled or uncontrolled.”
(van Rijsbergen, 1979)
Approaches, tensions
2. Controlled & uncontrolled vocabularies
Controlled vocabulary
• Predetermined
– indicates what terms are to be used in indexing
– may show definitions of and relations between terms
• examples: thesaurus, subject heading list, classification
• Also indicates terms that may be selected for searching
• An indexing AND a searching tool
• Human constructed
– and costly to construct and use
Example of controlled vocabularies
Medical Subject Headings (MeSH) of the National Library of Medicine
• One of the largest & most comprehensive
– used in indexing & searching
• More than 22,000 descriptors, with more than 106,000 cross-references
• More than 139,000 Supplementary Concept Records
• Approximately 50 publication types (Journal Article, News, Editorial, Review, Randomized Controlled Trial, etc.)
• Done by indexers
• But also experimenting with semi-automatic indexing
Uncontrolled vocabulary
• Derived from texts – natural language – in documents
– nowadays automatically
• using various methods or algorithms
– constantly tested: which algorithm is better?
• Used to construct inverted indexes
• In turn, inverted indexes are used for free text searching
Comparison of vocabularies
Controlled
• The idea of a controlled vocabulary is to reduce the variability of expressions used to characterize documents being indexed & searched for
• Manual, costly, time consuming; also semi-automatic in some systems
• Dynamic – needs constant changing, updating
Uncontrolled or free
• The idea is to follow natural language expressions as they occur in documents
• Could be automatic
– great advantage
– algorithms constantly changing & improving
• e.g. parsing phrases, connections
• Prevailing in many applications
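The contrast can be made concrete with a small sketch. Everything in it is hypothetical: the mini vocabulary, the sample texts, and the function names are made up for illustration, and real controlled indexing is done by people or far richer software.

```python
# Hypothetical mini controlled vocabulary: variant expression -> preferred term.
PREFERRED = {
    "car": "automobiles",
    "cars": "automobiles",
    "auto": "automobiles",
    "automobile": "automobiles",
}


def free_text_terms(text):
    """Uncontrolled: take the words as they occur in the text."""
    return sorted(set(text.lower().split()))


def controlled_terms(text):
    """Controlled: keep only words the vocabulary knows, mapped to preferred terms."""
    return sorted({PREFERRED[w] for w in text.lower().split() if w in PREFERRED})


docs = ["Used cars", "Auto repair", "Automobile safety"]
for doc in docs:
    print(doc, "|", free_text_terms(doc), "|", controlled_terms(doc))
```

The three texts share no free-text term, yet all three receive the same controlled term, which is the reduction in variability that a controlled vocabulary aims for.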
Controlled vs. free text searching
• Endless source of debate & controversy
• But each has its place for a given circumstance & retrieval goal
• Each has strengths & weaknesses
• can you list or find a list comparing them? – this is a good search assignment
• Users mostly use free text searching
• Professional searchers use both as warranted – have to know when
• Professional credo: KNOW THY CONTROLLED VOCABULARY so you can apply it in searching as and when needed
Use in searching
3. Inverted indexes
Inverted indexes & searching
Useful to know how they function to understand search & retrieval. Steps:
1. Each document is indexed
– every word in a document is taken as an index term, with the exception of stop words, if any
– position in text is noted, counting stop words too
2. Indexes for all documents are merged
• index terms are arranged alphabetically in the bowels of the system, so they can be searched
• under each index term are the numbers of the documents in which it appears & its position in the text of each document
So, when you search
for digital AND libraries:
1. computer takes all documents under digital
2. and all documents under libraries
3. compares them to “see” which documents have both terms, and then
4. provides you with the list of those documents that contain both terms, no matter where
• This is also called “coordinate indexing”
– coordination is done at time of searching
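In code, the coordination step amounts to a set intersection. The following is a minimal sketch, assuming the inverted index is held as a mapping from term to the set of document numbers containing it; the postings and the function name are hypothetical.

```python
# Hypothetical postings: index term -> set of document numbers containing it.
postings = {
    "digital": {1, 2, 5, 8},
    "libraries": {2, 3, 8, 9},
}


def search_and(index, *terms):
    """Return the documents that contain every one of the given terms."""
    doc_sets = [index.get(term, set()) for term in terms]
    if not doc_sets:
        return set()
    return set.intersection(*doc_sets)


print(sorted(search_and(postings, "digital", "libraries")))  # [2, 8]
```

Nothing about the combination digital AND libraries was stored at indexing time; the intersection is computed only when the search is run.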
Variation: when you search
for digital (WITH) libraries or “digital libraries”, i.e. as a phrase:
1. computer goes through the same steps as before, but then also
2. “looks” for documents where digital is positioned right before libraries
• remember: computer “knows” the position of each term in each document, each sentence
• So searching for a phrase is a form of searching for terms connected with AND, but in a given sequence
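A phrase search can be sketched as the AND step followed by a position check. This is an illustration under assumptions: the index is taken to store, for each term, the word positions per document, and the postings and function name below are hypothetical.

```python
# Hypothetical positional postings: term -> {document number: [word positions]}.
positions = {
    "digital": {1: [4], 2: [1, 9], 3: [6]},
    "libraries": {2: [2], 3: [2], 4: [5]},
}


def search_phrase(index, first, second):
    """Documents where `second` occurs at the position right after `first`."""
    first_docs = index.get(first, {})
    second_docs = index.get(second, {})
    hits = []
    for doc in sorted(first_docs.keys() & second_docs.keys()):  # the AND step
        following = set(second_docs[doc])
        if any(pos + 1 in following for pos in first_docs[doc]):  # the sequence step
            hits.append(doc)
    return hits


print(search_phrase(positions, "digital", "libraries"))  # [2]
```

Documents 2 and 3 both contain the two terms, but only in document 2 does libraries directly follow digital.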
Example of searches in inverted file

Doc #   Text
1       Slow brown truck arrived
2       Shipment of brownies damaged in a fire
3       Delivery of brownies arrived in a slow truck
4       Shipment of brownies arrived in a truck

For simplicity, each document has one sentence. Stop words: “a”, “of”, “in” – but their positions are counted.

Inverted index

Term        (Doc number:position)
arrived     (1:4), (3:4), (4:4)
brown       (1:2)
brownies    (2:3), (3:3), (4:3)
damaged     (2:4)
delivery    (3:1)
fire        (2:7)
shipment    (2:1), (4:1)
slow        (1:1), (3:7)
truck       (1:3), (3:8), (4:7)

Search for slow AND truck gets as results documents 1 and 3, since both contain slow and truck.
Search for slow (w) truck retrieves only document 3, in which slow is 7th and truck is 8th, so they are right next to each other. Doc 1 has both words, but not next to each other, thus it is not retrieved.
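The table and both searches can be reproduced from the four documents with a short sketch. The function names are hypothetical and the tokenization is deliberately naive (lowercase, split on spaces); it is one possible way to build such an index, not a description of how any particular system does it.

```python
from collections import defaultdict

STOP_WORDS = {"a", "of", "in"}

docs = {
    1: "Slow brown truck arrived",
    2: "Shipment of brownies damaged in a fire",
    3: "Delivery of brownies arrived in a slow truck",
    4: "Shipment of brownies arrived in a truck",
}


def build_index(documents):
    """Map each non-stop word to a list of (doc number, position) pairs."""
    index = defaultdict(list)
    for doc_no, text in sorted(documents.items()):
        for position, word in enumerate(text.lower().split(), start=1):
            if word not in STOP_WORDS:  # stop words are skipped, but still counted
                index[word].append((doc_no, position))
    return dict(sorted(index.items()))  # alphabetical, as in the table


def search_and(index, a, b):
    """Documents that contain both terms, anywhere."""
    return sorted({d for d, _ in index.get(a, [])} & {d for d, _ in index.get(b, [])})


def search_adjacent(index, a, b):
    """Documents where b sits at the position right after a."""
    after_a = {(d, p + 1) for d, p in index.get(a, [])}
    return sorted({d for d, p in index.get(b, []) if (d, p) in after_a})


index = build_index(docs)
print(index["truck"])                           # [(1, 3), (3, 8), (4, 7)]
print(search_and(index, "slow", "truck"))       # [1, 3]
print(search_adjacent(index, "slow", "truck"))  # [3]
```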
Everything is inverted – consequences for searching
• All words in all fields are inverted, no matter if
– in title, full text, descriptor, author …
• Thus all are searchable
• In some systems (but not all) phrases are parsed & thus searchable
– but in most, phrases are searched as A(w)B, or “A B”
• But beware:
– search for libraries as descriptor
• e.g. libraries/DE in Dialog
– will retrieve ALL other descriptors where libraries appears, in addition to the descriptor libraries itself
• e.g. academic libraries, public libraries, special libraries, research libraries …
– but there are search tricks to avoid that
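Why this happens can be sketched with a toy descriptor field. The records and function names below are hypothetical, and the second function only illustrates the general idea of matching the descriptor as a whole; it does not show the syntax of Dialog or any other system.

```python
# Hypothetical records: record number -> descriptors assigned to it.
records = {
    1: ["libraries"],
    2: ["academic libraries"],
    3: ["public libraries", "library services"],
    4: ["information retrieval"],
}


def match_descriptor_word(word):
    """Word-level inversion: hit any descriptor that contains the word."""
    return sorted(
        rec for rec, descriptors in records.items()
        if any(word in d.split() for d in descriptors)
    )


def match_whole_descriptor(term):
    """Whole-descriptor match: hit only the descriptor that equals the term."""
    return sorted(rec for rec, descriptors in records.items() if term in descriptors)


print(match_descriptor_word("libraries"))   # [1, 2, 3]
print(match_whole_descriptor("libraries"))  # [1]
```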
A major tool for controlled vocabularies in information retrieval (IR)
4. Thesaurus
What is a thesaurus?
“For writers, it is a tool like Roget’s one with words grouped and classified to help select the best word to convey a specific nuance of meaning.
For indexers and searchers, it is an information storage and retrieval tool: a listing of words and phrases authorized for use in an indexing system, together with relationships, variants and synonyms, and aids to navigation through the thesaurus.”
(Milstead, 2000)
more…
“A thesaurus to an information scientist is a controlled set of the terms used to index information in a database, and therefore also to search for information in that database so the same concepts are represented by the same term.”
(Batty, 1998)
Thesaurus
• Good old Peter Mark Roget had a most useful idea in the 1850s & did a great job
• Following this idea, the thesaurus became THE major tool for controlled vocabulary in IR
– starting in the 1950s & to this day, a great many IR thesauri have been developed for all kinds of subjects
• including, for instance, in information science
– all have a similar structure & function
– but they are difficult & costly to construct & maintain
Standards, software
• Subject to international standards:
– “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies”, ANSI/NISO Standard Z39.19
– followed by “Construction of Controlled Vocabularies. A Primer”
• A number of software products are available for thesaurus construction and maintenance
– e.g. as listed by American Society for Indexing
Examples of thesauri
• Thesauri have been constructed for a great many domains, from A to Z – here are some lists
• international & multilingual thesauri
• online thesauri
• among them ERIC Thesaurus (we use it as an example)
– BUT: different thesauri may and do treat the same descriptor (index term) differently
• having different, more or fewer narrower, broader, related terms
• thus it is dangerous to use them interchangeably
Basic thesaurus components
• For each entry a thesaurus has a classification grid:
– Descriptor (DE) – an index term that has
• Scope note (SN) – context in which used
• Broader terms (BT) – higher in a hierarchy
• Narrower terms (NT) – lower in a hierarchy
• Related terms (RT) – other connected descriptors
• Used for (UF) – synonyms that are not descriptors
– Note: not all of these may be present for every descriptor
• A searcher or indexer can use these as a guide for selection/rejection & for browsing to get ideas
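As a data structure, one thesaurus entry could look roughly like the sketch below. The class name, field names, and sample values are assumptions made for illustration; the values are not copied from ERIC or any other real thesaurus. Every component except the descriptor is optional, matching the note above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThesaurusEntry:
    descriptor: str                                      # DE
    scope_note: str = ""                                 # SN
    broader: List[str] = field(default_factory=list)     # BT
    narrower: List[str] = field(default_factory=list)    # NT
    related: List[str] = field(default_factory=list)     # RT
    used_for: List[str] = field(default_factory=list)    # UF

    def display(self):
        """Render the entry with the usual two-letter labels, skipping empty parts."""
        lines = ["DE " + self.descriptor]
        if self.scope_note:
            lines.append("SN " + self.scope_note)
        for label, terms in (("BT", self.broader), ("NT", self.narrower),
                             ("RT", self.related), ("UF", self.used_for)):
            lines.extend(f"{label} {term}" for term in terms)
        return "\n".join(lines)


# Illustrative values only; not copied from any real thesaurus.
entry = ThesaurusEntry(
    descriptor="libraries",
    broader=["information sources"],
    narrower=["academic libraries", "public libraries"],
    related=["library services"],
)
print(entry.display())
```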
Standard structure
With variations on the theme, thesauri have a similar conceptual structure to guide the searcher or indexer:
[Diagram: Descriptor (DE) linked to Broader terms (BT), Narrower terms (NT), Related terms (RT), Used for (UF) synonyms, and Scope note (SN)]
Note: not every descriptor has to have all of these
Same thesaurus but …
• Examples of the ERIC (Educational Resources Information Center) thesaurus as used differently in different systems:
1. ERIC own system
2. ERIC file on Dialog (begin 1)
3. ERIC file on OVID (accessible through RUL)
• Notice how each displays & searches the same ERIC thesaurus in its own way, but the principles are still the same
• Oh well…
ERIC online thesaurus on ERIC
• Allows for
– searching for words that are included in descriptors, by category or in all categories
– browsing alphabetically
– browsing in one of about 40 categories
• Search for libraries in all categories found 50 descriptors that have “library” included
• Out of these, selected libraries
ERIC online thesaurus on ERIC: descriptor libraries
[Screenshot; annotations: “Descriptor”, “Other descriptors – one could browse”]
ERIC thesaurus on Dialog
• In a convoluted way, the ERIC thesaurus (and other ones) can be displayed on Dialog (and other vendors, such as OVID)
• How?
– begin in file 1 – ERIC
– then expand a desired term – here we used the term library
– you will see under RT that certain terms have related terms – meaning that these are thesaurus entries
– then expand on one of those to see related terms
– then you can browse & choose which ones to use in search
• And here are printed screens of the process
Note on command expand (E) in Dialog
• Dialog (and some other systems) has a neat way to display all entries in any inverted index alphabetically
– command is Expand or e
– it could be done in any of the indexes – basic and additional
For instance:
e library will provide an alphabetical list around the term library in the basic index, & then after expanding again you can see related terms (see next)
e Au=Saracevic will provide an alphabetical list of all entries in the author additional index around that name
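The idea behind such an alphabetical display can be sketched with a binary search over a sorted term list. This is not Dialog's implementation; the function name, the window sizes, and the sample terms are all assumptions made for illustration.

```python
import bisect


def expand(sorted_terms, term, before=2, after=3):
    """Return the alphabetical neighborhood of `term` in a sorted term list."""
    i = bisect.bisect_left(sorted_terms, term)
    return sorted_terms[max(0, i - before): i + after]


# Hypothetical slice of a basic index, kept in alphabetical order.
basic_index = sorted([
    "librarians", "libraries", "library", "library administration",
    "library services", "licensing", "liberal arts",
])

for entry in expand(basic_index, "library"):
    print(entry)
```

Because the index is already in alphabetical order, the neighbors of a term are found without scanning the whole list, and the same lookup works for a term that is not itself in the index.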
going
[Screenshot: Expand library]
going …
[Screenshot; annotations: “RT indicates related terms”, “46865 items have library”, “This one has 14 related terms”]
going …
We now chose descriptor LIBRARY ADMINISTRATION and expand on that one
Neat trick: you can expand on expand & get related terms out of the ERIC thesaurus
going …
• 14 related terms for this one are listed
• These are now R terms of various types
• Can expand on this one to see other RT
• You can also select any of these to search
going …
We have now selected r15 – library services to search for documents
going …
• And this is the no. of items we got
• Now we can view some items in a chosen format
• or we can further modify this search – add, refine, …
gone
• This is one of the items we got
• Descriptors used for this item
• Additional index terms
Start ERIC search on OVID (accessed through RUL)
[Screenshot; annotation: “Start with”]
Automatically gets you to thesaurus
[Screenshot; annotation: “This one selected to enlarge”]
Allows you to select thesaurus (or not)
[Screenshot; annotation: “This one selected to enlarge”]
Then go to ERIC thesaurus on OVID (accessed through RUL)
[Screenshot; annotations: “Scroll”, “Descriptor”]
gone
• Next go and select additional terms
• Or search for libraries only
• See no. of results
• Select fields and formats by making a check
• and happy going …
• suggestion: repeat this exercise
Point being that the same thesaurus is handled differently by different databases
Relevance feedback – an important search tactic
• Method for using information in items judged relevant to further refine or change the search
– first you find a relevant document (or documents)
– in relevant document(s) you browse titles, descriptors, identifiers, abstracts … to get leads (e.g. keywords) for further search terms & tactics
– then you search for those
• in some advanced systems this may be done automatically
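A very simple automatic version of this tactic can be sketched as follows. It is only one naive possibility, assuming the records carry descriptors and that the searcher has marked some records as relevant; the record numbers, descriptors, and function name are hypothetical.

```python
from collections import Counter

# Hypothetical retrieved records with their assigned descriptors.
records = {
    101: ["library services", "public libraries", "user needs"],
    102: ["library services", "library administration"],
    103: ["online searching", "user needs"],
}


def feedback_terms(relevant_ids, query_terms, top_n=3):
    """Suggest descriptors that recur in relevant records and are not yet in the query."""
    counts = Counter(
        descriptor
        for rec_id in relevant_ids
        for descriptor in records[rec_id]
        if descriptor not in query_terms
    )
    # Most frequent first; ties broken alphabetically so the output is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


# The searcher judged records 101 and 102 relevant to a search on "public libraries".
print(feedback_terms([101, 102], {"public libraries"}))
```

The suggested terms are leads for the next search; deciding which of them to use remains the searcher's call.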
Query expansion – another important search tactic
• Method for adding, modifying, changing search terms in a query
– to broaden, narrow, focus, change … terms
• Many sources can be used
– relevance feedback, thesauri, dictionaries, textbooks, documents, catalogs, & people: users, colleagues, your own mind & experience
• Some systems suggest terms for query expansion
Query expansion tactics
• You can use the same structure for expanding query terms as in a thesaurus
– think of what may be broader, narrower, related terms or synonyms to use as search terms
[Diagram: Query term linked to Broader terms (BT), Narrower terms (NT), Related terms (RT), and Synonyms]
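The tactic can be sketched as a lookup followed by an OR. The thesaurus fragment, the function name, and the generic OR-with-quotes query syntax below are all assumptions for illustration; real systems each have their own query language.

```python
# Hypothetical thesaurus fragment: term -> related terms by relationship type.
thesaurus = {
    "libraries": {
        "BT": ["information sources"],
        "NT": ["academic libraries", "public libraries"],
        "RT": ["library services"],
        "UF": ["library"],
    },
}


def expand_query(term, relations=("NT", "UF")):
    """OR the query term with the chosen kinds of thesaurus terms."""
    entry = thesaurus.get(term, {})
    terms = [term]
    for relation in relations:
        terms.extend(entry.get(relation, []))
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


print(expand_query("libraries"))
print(expand_query("libraries", relations=("BT",)))
```

Here adding narrower terms and synonyms makes sure more specific material is swept in, while adding broader terms widens the search; which relations to follow depends on the retrieval goal.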
Conclusion
• At the base of all searching are
– terms
– vocabularies
– languages
– but a variety exists
• In reality, in searching there is no completely controlled or uncontrolled vocabulary
– matter of degree
– & most importantly, matter of mastery
Symbolically: controlled & free vocabulary [image in the original slide]
thank you!