Freenet: A Distributed Anonymous Information Storage and Retrieval System
Kim Hun-gyu (김훈규), Ph.D. program, 6th semester
Topics
• Overview
• Related work
• Architecture
• Keys and Searching
• Retrieving data
• Storing data
• Managing data
• Adding nodes
• Protocol Details
• Performance analysis
• Security
• Conclusion
Overview (1/2)
What is Freenet?
• A P2P application designed to guarantee true freedom of communication on the Internet
• Anyone can publish or obtain information under complete anonymity
• Nobody controls Freenet, not even its creators
• Communication between Freenet nodes is encrypted
• Requests are routed through other nodes, making it difficult to determine who is requesting information and what its content is
Who is behind Freenet?
• Originally created by Ian Clarke while he was a student at the University of Edinburgh, Scotland
• Still supervised by Ian Clarke, though many other people contribute to the project
Overview (2/2)
Purpose
• Prevent information censorship
• Maintain personal privacy
Design Goals
• Anonymity for information producers, consumers, and holders
• Deniability for storers of information
• Resistance to information censorship
• High availability and reliability through decentralization
• Efficient, scalable, and adaptive storage and routing
Related work
• File-sharing: Gnutella, FastTrack, Overnet
• Consumer anonymity: Anonymizer, SafeWeb/Triangle Boy
• Producer anonymity: Rewebber, TAZ, Publius
• Shared storage: OceanStore, Cooperative File System, PAST
Architecture (1/2)
Peer-to-peer network
[Figure: a node's dynamic routing table (LRU) beside its local datastore. Entries run from new (top) to old (bottom); the data of old entries is deleted.]
Key   Reference   Local datastore
a     Node7, …    file (a)
b     Node5, …    file (b)
c     Node1, …    file (c)
d     Node2, …    file (d)
e     Node3, …
f     Node5, …
g     Node1, …
Architecture (2/2)
A cooperative distributed filesystem incorporating location independence and transparent lazy replication
Basic Model
• Requests for keys are passed from node to node through a chain of proxy requests
• Each node makes its own local decision about where to send the request next
• Routing algorithm: adjusts routes over time to provide efficient performance
Request
• Hops-to-live (HTL): prevents infinite chains
• Pseudo-unique random identifier: prevents loops, because nodes reject requests they have seen before
• The result is passed back up the chain to the sending node
• No node is privileged: there is no hierarchy or central point of failure
Keys and searching
Files in Freenet are identified by binary file keys, obtained by applying a hash function (160-bit SHA-1)
Key types
• Keyword-signed key (KSK)
  • The simplest type of file key
  • Derived from a short descriptive text string chosen by the user
• Signed-subspace key (SSK)
  • Generated from a public key and (usually) a text description, signed with the private key
  • Can be used as a sort of private namespace
  • Description e.g. politics/us/pentagon-papers
• Content-hash key (CHK)
  • Used primarily for data storage
  • Generated by hashing the content
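Keyword-signed key (KSK)
• The <keyword> is used to deterministically generate a key pair (keypub, keypriv)
• SHA(keypub) = KSK
• Signature: keypriv(File), a minimal integrity check
• Encryption: E(<keyword>, File), the file is encrypted using the keyword as the key
• Example: KSK@plays/shakespeare/Coriolanus
• To retrieve the file, a user need only publish the keyword
• Problem: a flat global namespace. Nothing prevents two users from independently choosing the same keyword for different files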
Signed-subspace key (SSK)
• A key pair (S-keypub, S-keypriv) is randomly generated to identify the subspace
• SHA(<keyword>) and SHA(S-keypub) are computed independently
• SHA(SHA(<keyword>) XOR SHA(S-keypub)) = SSK
• Signature: S-keypriv(File)
• Encryption: E(<keyword>, File)
• Example: SSK@…/TFE//
• To retrieve the file, a user need only publish the keyword together with the subspace's public key
• Storing data requires the private key, so only the owner of the subspace can add files to it
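The SSK derivation on this slide can be sketched with nothing but a hash function. This is a minimal sketch, assuming SHA-1 as the hash and treating the subspace public key as an opaque byte string; the function names and the placeholder key bytes are illustrative and not part of Freenet.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    """Return the 160-bit SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(left, right))


def signed_subspace_key(subspace_public_key: bytes, description: str) -> bytes:
    """SSK = SHA(SHA(description) XOR SHA(public key))."""
    hashed_description = sha1(description.encode("utf-8"))
    hashed_public_key = sha1(subspace_public_key)
    return sha1(xor_bytes(hashed_description, hashed_public_key))


# Placeholder byte strings stand in for real public keys (assumption).
alice_public_key = b"placeholder public key of Alice's subspace"
bob_public_key = b"placeholder public key of Bob's subspace"

alice_key = signed_subspace_key(alice_public_key, "politics/us/pentagon-papers")
bob_key = signed_subspace_key(bob_public_key, "politics/us/pentagon-papers")
```

Because the public key is mixed into the hash, the same description under two different subspaces yields two different file keys, which is what removes the flat-namespace collision problem of KSKs.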
Content-hash key (CHK) (1/2)
• Useful for implementing updating and splitting
• SHA(File) = CHK
• Encryption: E(File) under a randomly generated key (Keyencrypt)
• Example: CHK@…,jMQymYuK
• To retrieve the file, a user publishes the content-hash key itself together with the decryption key
• Most useful in conjunction with signed-subspace keys, using an indirection mechanism
• To store an updateable file:
  • insert the file under its content-hash key
  • insert an indirect file under a signed-subspace key whose contents are the content-hash key
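The indirection mechanism can be illustrated with a dictionary standing in for the network. This is only a sketch under stated assumptions: encryption is omitted, keys are written as hex strings, and the `store` dictionary, the function names, and the SSK string are hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-in for the network: key -> stored bytes.
store = {}


def insert_chk(content: bytes) -> str:
    """Store content under its content-hash key and return that key."""
    key = "CHK@" + hashlib.sha1(content).hexdigest()
    store[key] = content
    return key


def insert_indirect(ssk: str, chk: str) -> None:
    """Store an indirect file under an SSK; its contents are the CHK."""
    store[ssk] = chk.encode("ascii")


def fetch_via_indirect(ssk: str) -> bytes:
    """Follow the indirect file to the CHK, then fetch the real content."""
    chk = store[ssk].decode("ascii")
    return store[chk]


report = b"quarterly report, version 1"
report_chk = insert_chk(report)
insert_indirect("SSK@example-subspace/report", report_chk)  # placeholder SSK
fetched = fetch_via_indirect("SSK@example-subspace/report")
```

Readers only need to know the stable SSK name; the content-hash key it points to can change whenever the file does.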
Content-hash key (CHK) (2/2)
• To update a file:
  • insert a new version (new CHK)
  • insert a new indirect file (original SSK)
• Splitting files
  • Desirable because of storage and bandwidth limitations; it also benefits traffic
  • Each part is inserted separately under a CHK, and an indirect file (or multiple levels of indirect files) is created to point to the parts
• Problem: finding keys in the first place
  • Hypertext spider (as used to search the Web): conflicts with the design goal of decentralization
  • Create a special class of lightweight indirect files: when inserting a file, also insert a series of indirect files containing pointers to it
  • Create compilations of favorite keys and publicize them (e.g. on the WWW)
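File splitting can be sketched in the same spirit. The code below is a minimal illustration, not Freenet's actual format: it assumes the indirect file is a newline-separated list of part keys stored under its own content hash, uses a plain dictionary as the store, and skips encryption. All names are hypothetical.

```python
import hashlib


def content_hash(content: bytes) -> str:
    """Hex SHA-1 of the content, used here as a simplified CHK."""
    return hashlib.sha1(content).hexdigest()


def insert_split(store: dict, data: bytes, part_size: int) -> str:
    """Insert each part under its own CHK plus one indirect file listing them."""
    part_keys = []
    for offset in range(0, len(data), part_size):
        part = data[offset:offset + part_size]
        key = content_hash(part)
        store[key] = part
        part_keys.append(key)
    index = "\n".join(part_keys).encode("ascii")  # the indirect file
    index_key = content_hash(index)
    store[index_key] = index
    return index_key


def fetch_split(store: dict, index_key: str) -> bytes:
    """Read the indirect file, fetch every part, and reassemble the data."""
    part_keys = store[index_key].decode("ascii").splitlines()
    return b"".join(store[key] for key in part_keys)


network = {}
original = bytes(range(256)) * 4  # 1024 bytes of sample data
index_key = insert_split(network, original, part_size=300)
restored = fetch_split(network, index_key)
```

With a part size of 300 bytes the 1024-byte sample is stored as four parts and one indirect file, and only the indirect file's key has to be published.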
Retrieving data (1/3)
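• Data request message: key, hops-to-live value (HTL), unique ID
• On a successful retrieval, each node on the path will
  • pass the data back to the upstream requestor
  • cache the file in its own datastore
  • create a new entry in its routing table (the actual data source, with the requested key)
[Flowchart: handling a Data Request at a node]
1. Search for the file in the datastore. [found] -> send Data Reply
2. [not found] -> check HTL. [not ok] -> send Data Failed
3. [ok] -> look up the nearest key in the routing table. [not found] -> send Data Failed
4. [found] -> send Data Request to that node and wait for the answer
5. [Data Reply] -> send Data Reply upstream. [Request Failed] -> return to step 3 and try the next-nearest key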
Retrieving data (2/3)
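A typical request sequence
[Figure: request sequence across nodes a to f, with messages numbered 1 to 12. Legend: Data Request, Data Reply, Request Failed. The request starts at node a; the data is held at node d.]
• Data return path: d -> e -> b -> a
• Caching nodes: e, b, a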
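The request sequence can be reproduced with a small simulation. This is a toy sketch, not Freenet's implementation: keys are plain integers and "nearest" means smallest numeric distance (both assumptions), and the routing-table contents below are invented so that the search visits c and f before finding the data at d.

```python
class Node:
    """A toy node: a datastore, a routing table, and seen request IDs."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> data
        self.routes = {}   # key -> Node associated with that key
        self.seen = set()  # request IDs already handled (loop detection)


def request(node, key, htl, request_id, sender=None, trace=None):
    """Depth-first search with backtracking; returns (data, source) or None."""
    if request_id in node.seen:
        return None                   # loop detected: reject the request
    node.seen.add(request_id)
    if key in node.store:
        return node.store[key], node  # found in the local datastore
    if htl <= 0:
        return None                   # hops-to-live exhausted
    # Try neighbours in order of how close their key is to the requested key.
    for known_key in sorted(node.routes, key=lambda k: abs(k - key)):
        neighbour = node.routes[known_key]
        if neighbour is sender:
            continue
        if trace is not None:
            trace.append((node.name, neighbour.name))
        result = request(neighbour, key, htl - 1, request_id, node, trace)
        if result is not None:
            data, source = result
            node.store[key] = data     # cache on the return path
            node.routes[key] = source  # direct link to the data source
            return result
    return None                       # every neighbour failed


a, b, c, d, e, f = (Node(name) for name in "abcdef")
a.routes = {10: b}
b.routes = {48: c, 40: e}
c.routes = {30: b}
e.routes = {49: f, 45: d}
f.routes = {47: b}
d.store[50] = "data"

trace = []
result = request(a, key=50, htl=10, request_id="req-1", trace=trace)
```

The trace records the forwarded requests in order: a to b, b to c (dead end), b to e, e to f, f to b (rejected as a loop), and finally e to d. Afterwards e, b, and a each hold a cached copy and a routing entry pointing at d.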
Retrieving data (3/3)
• The quality of routing should improve over time
  • nodes come to specialize in locating sets of similar keys
  • nodes come to specialize in storing clusters of files with similar keys
• Transparent replication
  • popular data is transparently replicated and mirrored closer to requestors
• New routing entries are created for previously unknown nodes, increasing connectivity
  • direct links to data sources are created, bypassing the intermediate nodes used
  • nodes that successfully supply data gain routing table entries and are contacted more often than nodes that do not
Storing data (1/2)
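• Insert message: key, hops-to-live value (HTL), unique ID
• Each node checks its own store to see whether the file already exists before inserting
[Flowchart: handling an insert message at a node]
1. Check if the key already exists. [found] -> return reply; the user tries again using a different key
2. [not found] -> check HTL. [not ok] -> send "all clear"
3. [ok] -> look up the nearest key in the routing table, send the insert message to that node, and wait for the answer
4. [Request Succeed] -> pass the result back upstream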
Storing data (2/2)
• All clear: the successful result
  • the HTL limit is reached without a key collision being detected, and the result propagates back to the original inserter
  • the user then sends the data to insert, which propagates along the same path and is stored in each node along the way
  • each node creates an entry in its routing table for the new key
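The two-phase insert (collision check, then storage along the path) might be sketched as follows. This is a simplified illustration with assumptions: integer keys, numeric distance for "nearest", and a routing entry that points back at the inserter. The class and function names are hypothetical.

```python
class Node:
    """A toy node with a datastore and a routing table."""

    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> data
        self.routes = {}  # key -> Node associated with that key


def insert(start, key, data, htl):
    """Return (True, path names) on "all clear", or (False, existing data)."""
    path = []
    node = start
    while True:
        if key in node.store:
            return False, node.store[key]  # key collision: nothing is stored
        path.append(node)
        candidates = [k for k, n in node.routes.items() if n not in path]
        if htl <= 0 or not candidates:
            break                          # "all clear": no collision found
        nearest = min(candidates, key=lambda k: abs(k - key))
        node = node.routes[nearest]
        htl -= 1
    for hop in path:                       # second phase: send the data
        hop.store[key] = data
        if hop is not start:
            hop.routes[key] = start        # assumption: inserter is the source
    return True, [hop.name for hop in path]


a, b, c, d = (Node(name) for name in "abcd")
a.routes = {10: b}
b.routes = {20: c}
c.routes = {30: d}
d.store[99] = b"old file"

ok_result = insert(a, key=25, data=b"new file", htl=2)
collision_result = insert(a, key=99, data=b"other file", htl=5)
```

With an HTL of 2 the first insert stops at c and stores the file on a, b, and c. The second insert reaches d, finds key 99 already present, and leaves every datastore unchanged.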
Managing data
• Storage: LRU cache
  • If the datastore is full, the least recently used files are evicted
  • There is no permanent copy: once all nodes have decided to drop a particular file, it is no longer available to the network
• Advantage of the expiration mechanism
  • outdated documents fade away naturally after being superseded by newer documents
• Encrypted content
  • node operators do not explicitly know the content they store
  • the encryption is not intended to protect the file, but to keep node operators from knowing the stored contents
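The eviction policy is a standard LRU cache and can be sketched with `collections.OrderedDict`. The class name is hypothetical, and capacity is counted in files rather than bytes, which is a simplifying assumption.

```python
from collections import OrderedDict


class LRUDatastore:
    """Keeps at most `capacity` files, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._files = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the file for key (marking it recently used), or None."""
        if key not in self._files:
            return None
        self._files.move_to_end(key)
        return self._files[key]

    def put(self, key, data):
        """Store a file, evicting old files if the store is over capacity."""
        self._files[key] = data
        self._files.move_to_end(key)
        while len(self._files) > self.capacity:
            self._files.popitem(last=False)  # drop the least recently used

    def __contains__(self, key):
        return key in self._files


datastore = LRUDatastore(capacity=2)
datastore.put("a", b"file a")
datastore.put("b", b"file b")
datastore.get("a")             # "a" is now more recently used than "b"
datastore.put("c", b"file c")  # store is full, so "b" is evicted
```

A file that keeps being requested stays near the fresh end of the store, while a file nobody asks for is eventually dropped.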
Adding nodes
• New nodes must announce their presence
• Two conflicting requirements
  • For routing efficiency, all nodes must be consistent in deciding which keys to send to a new node
  • For security, no single node may choose the routing key, which rules out the most straightforward way of achieving consistency
• Use a cryptographic protocol
  • The new node announces its public key and physical address (e.g. IP) to an existing node
  • The announcement is recursively forwarded to random nodes
  • The nodes in the chain then collectively assign the new node a random GUID
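The slide does not spell out the cryptographic protocol. One way to let several nodes agree on a value that none of them controls is commit-and-reveal: every participant first publishes a hash of a secret seed, then reveals the seed, and the result is the XOR of all seeds. The sketch below illustrates that general idea in a simplified form; it is an assumption for illustration, not a description of Freenet's actual message exchange, and all names are hypothetical.

```python
import hashlib


def commit(seed: bytes) -> bytes:
    """Commitment to a seed: its SHA-1 digest."""
    return hashlib.sha1(seed).digest()


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(left, right))


def combine(revealed_seeds, commitments):
    """Verify each revealed seed against its commitment, then XOR them all."""
    if len(revealed_seeds) != len(commitments):
        raise ValueError("one commitment is required per seed")
    for seed, commitment in zip(revealed_seeds, commitments):
        if commit(seed) != commitment:
            raise ValueError("revealed seed does not match its commitment")
    result = bytes(len(revealed_seeds[0]))  # all-zero starting value
    for seed in revealed_seeds:
        result = xor_bytes(result, seed)
    return result


# Fixed seeds keep the example reproducible; real seeds would be random.
seeds = [bytes([1] * 20), bytes([2] * 20), bytes([4] * 20)]
commitments = [commit(seed) for seed in seeds]  # phase 1: publish hashes
assigned_id = combine(seeds, commitments)       # phase 2: reveal and combine
```

Because every commitment is fixed before any seed is revealed, a participant cannot choose its seed after seeing the others, so the combined value is random as long as at least one participant is honest.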
Protocol details (1/2)
• Packet-oriented, using self-contained messages
• For efficiency, nodes can use a persistent channel (TCP)
• Node address = IP address + port number
• Nodes with frequently changing IPs use ARKs (address-resolution keys): signed-subspace keys updated to contain the current real address
• Transaction messages
  • Request.Handshake: specifies the desired return address of the sending node
  • Reply.Handshake: specifies the protocol version number
  • Handshakes are remembered for a few hours
Protocol details (2/2)
• To request data
  • Request.Data (transaction ID, HTL, depth, search key)
  • Reply.Restart
  • Send.Data: sent when the request succeeds
  • Reply.NotFound: sent when the request fails and the HTL is completely used up
  • These messages terminate the transaction and release any resources held
  • Request.Continue: sent when HTL still remains
• To insert data
  • Request.Insert (random transaction ID, HTL, depth, key)
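The fields of Request.Data can be modelled as a small immutable record. This is a sketch under assumptions: the field types are guesses, and the forwarding rule (decrement HTL, increment depth at each hop) is stated here as an assumption rather than taken from the slide. The class and function names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RequestData:
    """Fields listed for Request.Data; the types are assumptions."""
    transaction_id: int
    hops_to_live: int
    depth: int
    search_key: str


def forward(message: RequestData) -> Optional[RequestData]:
    """Return the message as sent to the next hop, or None if HTL is used up."""
    if message.hops_to_live <= 0:
        return None
    return replace(
        message,
        hops_to_live=message.hops_to_live - 1,  # one hop consumed
        depth=message.depth + 1,                # one hop further from origin
    )


original = RequestData(transaction_id=0x1234, hops_to_live=2, depth=0,
                       search_key="example-key")
first_hop = forward(original)
second_hop = forward(first_hop)
third_hop = forward(second_hop)  # None: no hops remain
```

The transaction ID and search key travel unchanged, so every node on the chain can recognize the request while HTL and depth track how far it has gone.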
Performance analysis (1/4)
• Topology
  • 1,000-node network; datastore size = 500 items; routing table size = 250 addresses
  • keys associated with links are hashes of the destination IPs
• Network convergence
Performance analysis (2/4)
• Scalability
  • keys assigned to new nodes = H(IP)
  • request path length scales as log(n) up to 40,000 nodes
  • at 40,000 nodes, the routing tables (RTs) are full
Performance analysis (3/4)
• Fault tolerance
  • median path length < 20 at 30% node failure
  • the network becomes ineffective at 40% node failure
Performance analysis (4/4)
• Small-world model
  • most nodes form local connections
  • a few highly connected nodes link them together
• The power-law distribution of links provides a high degree of fault tolerance
Security
• Protect the anonymity of requestors and inserters of files
  • Key anonymity (receiver anonymity)
  • Sender anonymity: at risk from a local eavesdropper
• Anonymity of storers: encrypted contents
• Pre-routing: messages are encrypted by public keys, which determine the path of pre-routing
• Protecting the data source using random and probabilistic methods
• Malicious modification: countered by hash keys and signatures
• Denial-of-service: an attacker inserts a large number of junk files to fill up storage
Conclusion
• Provides a network for anonymously storing and requesting files
• Adaptive routing whose efficiency increases with experience
• Deals with privacy and data integrity in various scenarios
• Freenet is an ongoing project that still has plenty of flaws
• There may be a tradeoff between network efficiency on the one hand and anonymity and robustness on the other
• More information at http://freenetproject.org/