ch1 intro to information retrieval-lina nemri
8/2/2019 Ch1 intro to information retrieval-Lina Nemri
Information Retrieval
Lebanese University
Faculty of Economics and Business Administration, 1st Branch
Class: M1
Instructor: Dr. Lina A. Nimri
Introduction
Modern Information Retrieval, Chapter 1
Ricardo Baeza-Yates, Berthier Ribeiro-Neto
http://www.sims.berkeley.edu/~hearst/irbook/
Introduction
Examples of information needs in the context of the World Wide Web:
Find all documents containing information on computer courses which:
  (1) are offered by universities in South England, and
  (2) are accredited by the BCS/IEE bodies.
  To be relevant, the document must include information on admission requirements, and an e-mail address and phone number for contact purposes.
Find all documents containing information on college tennis teams which:
  (1) are maintained by a USA university, and
  (2) participate in the NCAA tournament.
Information Retrieval
[Figure labels: User Information Need; Query; Retrieval System (Search Engine); Documents; Set of retrieved documents; Useful or relevant information to the user]
Primary goal of an IR system
Retrieve all the documents which are relevant to a user query,
while retrieving as few non-relevant documents as possible.
Representation, storage, organisation, and access to
information items
(Usually) keyword-based representation
Data Retrieval
Determining which documents contain the keywords in the user query is not always enough to satisfy the user information need.
Data retrieval retrieves objects which satisfy clearly defined conditions, such as regular expressions or relational algebra expressions.
A data retrieval system deals with data that has a well-defined structure and semantics.
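As a rough illustration of data retrieval, the sketch below selects records whose key satisfies a regular expression. The records, the `data_retrieval` helper and the course-code pattern are all hypothetical, chosen only to show that a record either satisfies the condition or it does not; there is no notion of partial relevance.

```python
import re

# Hypothetical records with a well-defined structure: (course code, title).
records = [
    ("CS101", "Introduction to Computer Science"),
    ("CS205", "Databases"),
    ("EC110", "Microeconomics"),
]

def data_retrieval(pattern, rows):
    """Return every row whose code matches the regular expression exactly."""
    compiled = re.compile(pattern)
    return [row for row in rows if compiled.fullmatch(row[0])]

# Clearly defined condition: the code is "CS" followed by three digits.
matches = data_retrieval(r"CS\d{3}", records)
print(matches)
```

A single wrongly returned record would count as an error here, which is the contrast with the ranked, approximate answers of an information retrieval system.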
Information Retrieval System
Retrieving information about a subject
Deals with natural language text which is not well structured and could be semantically ambiguous
It must interpret the contents of documents and rank them according to the degree of relevance to the user need.
Area of interest
Digital Libraries
Information experts
World Wide Web - a very difficult task:
  The hyperspace is vast
  The absence of a well-defined data model (format or representation form)
Effective retrieval
The effective retrieval of relevant information is directly affected by:
  The user task
  The logical view of the documents (document representation) adopted by the retrieval system.
Pulling
The user can browse the documents when his or her main objectives are not clear at the beginning and may change during the interaction with the system.
The combination of retrieval and browsing is not yet a well-established approach.
[Figure labels: Retrieval; Browsing; Database]
Documents
Unit of retrieval
A passage of free text
  composed of text, strings of characters from an alphabet
  composed of natural language
    newspaper article, journal paper, dictionary definition, email messages
  size of documents is arbitrary
    newspaper article vs. journal paper vs. email
What is a document?
Representation of documents
Documents are represented through a set of index terms, keywords, or term descriptors
  extracted directly from the text
  specified by human subjects (information science)
  metadata
  Most concise representation
  Poor quality of retrieval
Full text representation
  Most complete representation
  High computational cost
Large collections
  Reduce the set of representative keywords
    Elimination of stop words
    Stemming
    Identification of noun phrases
  Further compression and indexing
Document term descriptors to access texts
Generation of descriptors for text:
  By hand
  By analysing the text
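Generating descriptors by analysing the text can be sketched roughly as follows. The stop-word list and the suffix-stripping rule are deliberately tiny assumptions made for this illustration; they are not a standard stop list or a real stemming algorithm such as Porter's, and the names `STOP_WORDS`, `naive_stem` and `index_terms` are hypothetical.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}

def naive_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Lowercase, tokenize, drop stop words, stem, and deduplicate."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted({naive_stem(t) for t in tokens if t not in STOP_WORDS})

terms = index_terms("The indexing of documents in large collections")
print(terms)  # ['collection', 'document', 'index', 'large']
```

The seven words of the input shrink to four index terms, which is the reduction of the representative keyword set that matters for large collections.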
Logical View of the documents
[Figure labels: Docs; structure; Accents, spacing; stopwords; Noun groups; stemming; Manual indexing; structure; Full text; Index terms]
The retrieval functions
[Figure labels: Information need; Formulation; Query; Documents; Indexing; Document representation; Retrieval functions; Retrieved documents; Relevance feedback]
Queries
Information need:
Simple queries
  composed of two or three, perhaps even dozens, of keywords
  e.g., as in web retrieval
Boolean queries
  neural networks AND speech recognition
Context queries
  Proximity search, phrase queries
User term descriptors characterising the user need
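A Boolean AND query can be sketched over a small inverted index, as below. The three documents and the helper names `build_inverted_index` and `boolean_and` are made up for the example, and the sketch treats "neural networks" and "speech recognition" as bags of single words rather than phrases, which is a simplification.

```python
from collections import defaultdict

# Hypothetical mini-collection; the document ids and texts are made up.
docs = {
    1: "neural networks for speech recognition",
    2: "speech recognition with hidden markov models",
    3: "neural networks for image classification",
}

def build_inverted_index(collection):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in collection.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, terms):
    """Return the ids of documents containing every query term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()

index = build_inverted_index(docs)
result = boolean_and(index, ["neural", "networks", "speech", "recognition"])
print(result)  # {1}
```

Only document 1 contains all four terms. Documents 2 and 3 each match half of the query but are excluded outright, since a Boolean query does not rank partial matches.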
Best-Match retrieval
Compare the terms in a document and the query
Compute the similarity between each document in the collection and the query, based on the terms that they have in common
Sort the documents in order of decreasing similarity with the query
The output is a ranked list displayed to the user - the top documents are the most relevant as judged by the system
Document term descriptors to access texts
User term descriptors characterising the user need
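The best-match steps above can be sketched with the simplest possible similarity measure: the number of distinct terms shared by the query and a document. This measure, the sample documents and the function names are assumptions for illustration; practical systems typically weight terms (for example by frequency) rather than merely counting overlaps.

```python
def similarity(query_terms, doc_terms):
    """Number of distinct terms the query and the document have in common."""
    return len(set(query_terms) & set(doc_terms))

def best_match(query, collection):
    """Score every document against the query and rank by decreasing similarity."""
    query_terms = query.lower().split()
    scored = [
        (doc_id, similarity(query_terms, text.lower().split()))
        for doc_id, text in collection.items()
    ]
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical documents, made up for this example.
docs = {
    "d1": "information retrieval systems rank documents",
    "d2": "data retrieval uses relational algebra",
    "d3": "tennis teams in the NCAA tournament",
}

ranking = best_match("information retrieval ranking", docs)
print(ranking)  # [('d1', 2), ('d2', 1), ('d3', 0)]
```

Note that "ranking" in the query does not match "rank" in d1 because no stemming is applied here; combining this with a stemming step would raise d1's score.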
Conceptual view of text retrieval system
[Figure labels: Queries; Documents; Similarity Computation; Retrieved Documents]
Expanded view of text retrieval system
[Figure labels: Queries; Documents; Indexing; Indexed Documents; Similarity Computation; Retrieved Documents; Ranked Documents]
Process of retrieving info
[Figure components: User Interface; Text Operations; Query Operations; Indexing; Similarity Computation (Searching); Ranking; Document Repository Manager; Index; Text repository]
[Figure flow labels: User need; Text; Logical view; Query; Inverted file; Retrieved docs; Ranked docs; User feedback]
Key Topics
Indexing text documents
Retrieving text documents
Evaluation
Query reformulations
Search Engines = IR + Link Structure + Name Interpretation
Information Retrieval vs Information Extraction
Information Retrieval
  Given a set of query terms and a set of document terms, select only the most relevant documents [precision], and preferably all the relevant documents [recall].
Information Extraction
  Extract from the text what the document means.
IR systems can FIND documents but need not understand them
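Precision and recall can be made concrete with a small calculation. The sketch uses the standard set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant); the document ids and the function name `precision_recall` are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# The system returns four documents; three documents are actually relevant.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
print(p, r)  # precision is 0.5, recall is 2/3
```

Two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were found (recall 2/3); retrieving more documents tends to raise recall at the cost of precision.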