Post on 13-Apr-2017
Peer-to-Peer Information Retrieval
By Chetan K. Sundarde (@CHETANSUNDARDE)
https://www.linkedin.com/in/chetansundarde
May 3, 2023, P2PIR
Outline
- Peer-to-Peer Network
- Information Retrieval
- Peer-to-Peer Information Retrieval (P2PIR)
- Peer-to-peer IR system architectures
- Techniques used for IR in P2P networks
- Basic algorithms used in P2PIR
- Evaluation techniques used in P2PIR
- Challenges
- Conclusion
- References
Peer-to-Peer Network
- A collection of distributed computers (peers).
- Computers join and leave the network frequently.
- Each computer acts as a server and a client simultaneously.
- Every peer-to-peer network performs three tasks:
  o Searching: querying and getting a list of document references.
  o Locating: resolving a document reference to the concrete location of the full document.
  o Transferring: downloading the document.
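The three tasks above can be sketched as three small functions. This is a minimal illustration only: the `Peer` class, the document references, and the substring matching are assumptions made for the example, not part of any real P2P protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A hypothetical peer that stores documents keyed by a document reference."""
    name: str
    documents: dict = field(default_factory=dict)


def search(peers, term):
    """Searching: return references of documents whose text contains the term."""
    return sorted({ref for p in peers
                   for ref, text in p.documents.items()
                   if term.lower() in text.lower()})


def locate(peers, ref):
    """Locating: resolve a reference to the peers that hold the full document."""
    return [p for p in peers if ref in p.documents]


def transfer(peer, ref):
    """Transferring: download the document from one concrete location."""
    return peer.documents[ref]


peers = [
    Peer("p1", {"doc-a": "books on computer networks"}),
    Peer("p2", {"doc-b": "network routing in P2P networks"}),
]
refs = search(peers, "routing")          # list of document references
holders = locate(peers, refs[0])         # peers that store the document
text = transfer(holders[0], refs[0])     # the full document
```

In a real network each step would involve messages between peers; here all peers live in one process so the separation of the three tasks stays visible.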
Applications of P2P
- Information Retrieval
- File Sharing: Gnutella, Napster, BitTorrent, etc.
Information Retrieval
- Information retrieval is the field dealing with the structure, analysis, organization, storage, searching, and retrieval of information.
- It searches for relevant documents on the basis of user input.
(Figure: IR retrieval. An information need is expressed as a query, which is matched against a document collection to produce an answer list.)
Comparison between File Sharing and Information Retrieval
                         File Sharing       Information Retrieval
Application              Locating           Searching
Index: content           File identifiers   Document content
Index: size              Small              Large
Data exchange: unit      File               Search result
Data exchange: size      Megabyte+          Kilobyte (small)
(Figure: P2PIR in relation to file sharing networks and federated information retrieval.)
Peer-to-Peer Information Retrieval (P2PIR)
- Searching in peer-to-peer networks.
- Each peer shares its information with other peers.
- A peer searches for information by sending queries to its peers; each query is routed to one or many other peers.
- The query result is provided in the form of an index.
Peer-to-Peer IR System Architectures
- Based on the relationship between peers:
  o Cooperative system
  o Uncooperative system
- Based on the network structure:
  o Centralized network
  o Structured architecture
  o Unstructured architecture
- Based on the task performed in the P2P network:
  o Centralized Global Index
  o Distributed Global Index
  o Strict Local Indices
  o Aggregated Local Indices
Peer-to-Peer Architectures Used in IR
(Figure: four panels showing peers labelled G or L: Central Global Index, Distributed Global Index, Aggregated Local Index, Strict Local Index.)
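To make two of these index placements concrete, the sketch below contrasts strict local indices (each peer indexes only its own documents, so every peer must be asked) with a central global index (one party holds a term-to-location map for all peers). The peer names, documents, and whitespace tokenization are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical peers and the documents they share.
peer_docs = {
    "p1": {"a": "books on computer networks"},
    "p2": {"b": "network routing in P2P networks"},
}

# Strict local indices: each peer indexes only its own documents.
local_indices = {peer: build_index(docs) for peer, docs in peer_docs.items()}

# Central global index: one party maps each term to (peer, document) pairs.
global_index = defaultdict(set)
for peer, index in local_indices.items():
    for term, doc_ids in index.items():
        global_index[term].update((peer, d) for d in doc_ids)


def search_global(term):
    """One lookup at the central index answers the query."""
    return sorted(global_index.get(term, set()))


def search_local(term):
    """Without a global index, the query has to visit every peer."""
    hits = []
    for peer, index in sorted(local_indices.items()):
        hits.extend((peer, d) for d in sorted(index.get(term, set())))
    return hits
```

Both functions return the same hits; they differ in where the index lives and how many parties a query has to contact. A distributed global index would split `global_index` across peers, and an aggregated local index would let some peers hold the indices of their neighbours.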
Algorithms Used in P2PIR: Statistical IR Algorithms
Vector Space Model (VSM)
- Document A: "books on computer networks"
- Document B: "network routing in P2P networks"
- Query Q: "computer network"
- Each element of a vector corresponds to the importance of the term in the document.
- Retrieved documents are ranked by the similarity between the document vector and the query vector.

Vocabulary   book   computer   network   routing
VA           0.5    0.5        0.8       0
VB           0      0          0.9       0.6
VQ           0      0.5        0.8       0

Similarity to VQ: VA = 0.89, VB = 0.72
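The similarity values on this slide can be reproduced with a few lines of code. The scores 0.89 and 0.72 coincide with the inner (dot) product of each document vector with the query vector; cosine similarity, a common alternative that also normalizes by vector length, is included for comparison and is an addition to the slide.

```python
import math

vocabulary = ["book", "computer", "network", "routing"]
VA = [0.5, 0.5, 0.8, 0.0]   # "books on computer networks"
VB = [0.0, 0.0, 0.9, 0.6]   # "network routing in P2P networks"
VQ = [0.0, 0.5, 0.8, 0.0]   # "computer network"


def dot(u, v):
    """Inner product of two term-weight vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Inner product normalized by the lengths of both vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


scores = {"A": dot(VA, VQ), "B": dot(VB, VQ)}
ranking = sorted(scores, key=scores.get, reverse=True)

print({doc: round(s, 2) for doc, s in scores.items()})
print(ranking)
```

With these vectors, cosine similarity gives roughly 0.88 for A and 0.71 for B, so both measures rank document A above document B for this query.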
Algorithms Used in P2PIR: Statistical IR Algorithms
Latent Semantic Indexing (LSI)
- SVD: singular value decomposition
  o Reduces dimensionality
  o Discovers word semantics, e.g. Cat <-> Pet, Bus <-> Travel
(Figure: a terms-by-documents matrix with document vectors Va, Vb, ... is mapped by SVD to semantic vectors V'a, V'b, ...)
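The SVD step can be written out as follows. This is the standard textbook formulation of LSI rather than anything stated on the slide; the symbols $A$, $k$, $U_k$, $\Sigma_k$ and $V_k$ are notation assumed for this sketch.

```latex
% A: terms-by-documents matrix (m terms, n documents); k: number of retained dimensions
A \approx A_k = U_k \, \Sigma_k \, V_k^{\top},
\qquad U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k \in \mathbb{R}^{k \times k},\;
V_k \in \mathbb{R}^{n \times k}

% A document or query vector v (length m) is mapped to a k-dimensional semantic vector v'
v' = \Sigma_k^{-1} \, U_k^{\top} \, v
```

Keeping only the $k$ largest singular values is what reduces dimensionality, and terms that co-occur in similar documents end up close together in the reduced space, which is how pairs such as Cat and Pet can be matched.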
Algorithms Used in P2PIR: Distributed Hash Table (DHT)
- A method of hash table lookup over a decentralized distributed network.
- Key-value pairs are stored in the DHT at the node responsible for the key (structured architecture).
- Any node in the DHT can then efficiently retrieve the value by providing its key.
- Example keys: Kd = hash("books on computer networks"), Kq = hash("computer network")
- File-sharing systems such as Napster and BitTorrent motivated DHT research; modern DHTs include CAN, Chord, etc.
- DHTs can be extended with content-based search:
  o Full-text retrieval
  o Content-based image retrieval
  o Content-based music retrieval, etc.
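A toy DHT shows why exact-key lookup alone is not enough for information retrieval. The sketch below places nodes and keys on one identifier ring and stores each key at the first node whose identifier is not smaller than the key's (wrapping around), which is a simplification in the spirit of Chord. The class name `ToyDHT`, the node names, and the stored value are assumptions for illustration.

```python
import hashlib
from bisect import bisect_left


def key_id(text):
    """Map a string to an integer identifier using SHA-1."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class ToyDHT:
    """Hypothetical single-process DHT: nodes and keys share one identifier ring."""

    def __init__(self, node_names):
        self.ring = sorted((key_id(name), name) for name in node_names)
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key):
        # First node clockwise from the key's identifier, wrapping around the ring.
        ids = [node_id for node_id, _ in self.ring]
        pos = bisect_left(ids, key_id(key)) % len(self.ring)
        return self.ring[pos][1]

    def put(self, key, value):
        self.storage[self.responsible_node(key)][key] = value

    def get(self, key):
        return self.storage[self.responsible_node(key)].get(key)


dht = ToyDHT(["node-1", "node-2", "node-3", "node-4"])
dht.put("books on computer networks", "doc-a @ peer p1")

exact = dht.get("books on computer networks")   # same key, so the value is found
partial = dht.get("computer network")           # different key, so nothing is found
```

Because Kd and Kq hash to unrelated identifiers, the query "computer network" cannot find the document through the DHT alone. One common workaround is to store one key per term; this is the kind of gap that the content-based extensions listed above are meant to close.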
P2P Information Retrieval Techniques
- Unstructured
  o Blind search: BFS, RBFS (e.g. Gnutella)
  o Blind search: Random Walk
  o Indexing: Routing Indices
  o Clustering: Semantic Searching (e.g. SON)
- Structured
  o Clustering: pSearch
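The two blind-search strategies for unstructured networks can be sketched on a small overlay graph. BFS flooding forwards the query to all neighbours until a time-to-live (TTL) runs out; a random walk forwards it to one randomly chosen neighbour at a time. The overlay, the `holders` set, and the message counting are assumptions made for this example.

```python
import random
from collections import deque

# Hypothetical unstructured overlay: peer -> list of neighbours.
overlay = {
    "p1": ["p2", "p3"],
    "p2": ["p1", "p4"],
    "p3": ["p1", "p4"],
    "p4": ["p2", "p3", "p5"],
    "p5": ["p4"],
}
# Peers that hold a document matching the query.
holders = {"p5"}


def bfs_flood(start, ttl):
    """Flood the query to all neighbours up to ttl hops; return (hits, messages)."""
    visited = {start}
    queue = deque([(start, ttl)])
    hits, messages = set(), 0
    while queue:
        peer, remaining = queue.popleft()
        if peer in holders:
            hits.add(peer)
        if remaining == 0:
            continue
        for neighbour in overlay[peer]:
            messages += 1                      # every forward costs one message
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, remaining - 1))
    return hits, messages


def random_walk(start, max_steps, rng):
    """Forward the query to one random neighbour per step; return (hits, messages)."""
    peer, messages = start, 0
    for _ in range(max_steps):
        if peer in holders:
            return {peer}, messages
        peer = rng.choice(overlay[peer])
        messages += 1
    return ({peer} if peer in holders else set()), messages


flood_hits, flood_messages = bfs_flood("p1", ttl=3)
walk_hits, walk_messages = random_walk("p1", max_steps=50, rng=random.Random(0))
```

Flooding finds every holder within the TTL but its message count grows with the number of edges it touches; a random walk sends one message per step but may need many steps or miss the holder entirely. RBFS sits between the two by forwarding to a random subset of neighbours.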
Evaluation in P2P IR
- Recall (are all the relevant documents retrieved?): the fraction of the documents relevant to the query that are successfully retrieved.
  Recall = (number of relevant documents retrieved) / (total number of relevant documents in the collection)
- Precision (are the retrieved documents relevant?): the fraction of the retrieved documents that are relevant to the query.
  Precision = (number of relevant documents retrieved) / (number of documents retrieved)
(Figure: overlapping Relevant and Retrieved sets; the intersection is the retrieved relevant documents.)
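Both measures follow directly from set intersection. The document identifiers below are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given retrieved and relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    retrieved_relevant = retrieved & relevant
    precision = len(retrieved_relevant) / len(retrieved) if retrieved else 0.0
    recall = len(retrieved_relevant) / len(relevant) if relevant else 0.0
    return precision, recall


relevant = {"d1", "d2", "d3", "d4"}   # all relevant documents in the collection
retrieved = ["d1", "d2", "d5"]        # the answer list returned for the query

precision, recall = precision_recall(retrieved, relevant)
# Two of the three retrieved documents are relevant: precision = 2/3.
# Two of the four relevant documents were retrieved: recall = 2/4.
```

In a P2P setting the total number of relevant documents in the collection is often unknown, because no peer sees the whole network, so recall is usually measured against a reference run on a test collection.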
Evaluation Techniques in P2P IR (continued)
- F-Score / F-measure: the harmonic mean of precision and recall.
- Hits per Query: the average number of distinct relevant documents discovered per search query.
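A short sketch of both measures, assuming the balanced form of the F-measure (equal weight on precision and recall). The function names and sample data are illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def hits_per_query(results_per_query, relevant_per_query):
    """Average number of distinct relevant documents found per query."""
    hits = [len(set(results) & set(relevant))
            for results, relevant in zip(results_per_query, relevant_per_query)]
    return sum(hits) / len(hits) if hits else 0.0


f = f_measure(0.5, 1.0)

# Two queries: the first finds d1 (twice, counted once) and d2, the second finds nothing relevant.
average_hits = hits_per_query(
    [["d1", "d1", "d2"], ["d9"]],
    [{"d1", "d2"}, {"d3"}],
)
```

The harmonic mean stays low when either precision or recall is low, so a system cannot score well by returning everything or by returning almost nothing.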
Applications of P2P Information Retrieval in the Real World
- YaCy (www.yacy.net)
  o Local index entries are injected into a distributed global index.
  o YaCy uses no centralized servers.
  o The resulting decentralized web search currently has about 1.4 billion documents in its index, and more than 600 peer operators contribute each month.
  o About 130,000 search queries are performed with this network each day (Feb 2015).
- Faroo (www.faroo.com)
  o A proprietary peer-to-peer search engine that uses a distributed global index.
  o It performs distributed crawling and ranking.
  o Faroo encrypts queries and results for privacy protection.
  o 2 million peers.
- Some other P2PIR systems: Sixearch, ODISSEA, MINERVA, Seeks, etc.
Challenges
- Cross-language information retrieval
- Maintaining index freshness
- Security features
- Quality of service
- Efficient use of resources
- Increasing the range of the peer-to-peer network
Conclusion
- P2PIR is one of the applications of peer-to-peer networks.
- P2PIR combines key elements of file sharing and federated information retrieval.
- No single technique suits every P2PIR problem.
- Recall and precision are used to evaluate P2PIR.