Automated P2P Backup
Group 1
Anderson, Bowers, Johnson, Walker
Motivation
• Hard drive space is cheap
• Network connectivity is cheap
• Losing data is expensive
• We’d like to pool our resources and easily collectively maintain backups
Background – Distributed Hashes
• Most academic P2P systems are built on “distributed hash tables”
  – Ask “the system” for a key, and get the content back
• How the distributed hash and its lookup are implemented characterizes the P2P system
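As a rough illustration of the "ask for a key, get the content back" idea, the sketch below models a distributed hash table as a single in-memory dictionary. The class name `ToyHashTable`, its `put`/`get` methods, and the choice of a SHA-1 content hash as the key are all assumptions made for the example; a real system spreads the table across many nodes.

```python
import hashlib


class ToyHashTable:
    """Single-process stand-in for a distributed hash table (illustrative only)."""

    def __init__(self):
        self._store = {}

    def put(self, content: bytes) -> str:
        # Assumption: the key is derived from the content (its SHA-1 digest).
        key = hashlib.sha1(content).hexdigest()
        self._store[key] = content
        return key

    def get(self, key: str) -> bytes:
        # Ask "the system" for a key and get the content back.
        return self._store[key]


table = ToyHashTable()
key = table.put(b"hello backup")
restored = table.get(key)
```

Only the interface matters here: the caller never learns where the content is held, which is what lets different P2P systems implement the lookup differently.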
Related Work – P2P
• Pastry
  – Each node has an ID, messages routed to node nearest the hash key
  – N-order graph used to route
• OceanStore
  – Write-once, has versioning
  – Emphasizes local storage
Related Work – P2P (cont.)
• CHORD
  – Routes similarly to Pastry
  – Circular routing space
• Freenet
  – Write-once
  – Many security and anonymity features
• All resources are encrypted by their hash key
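To make the "circular routing space" idea concrete, here is a minimal sketch of how a key can be assigned to a node on a ring of IDs, using the successor rule commonly associated with Chord. It shows key placement only, not the multi-hop routing a real system performs. The names `ring_id` and `successor` and the 16-bit ID space are assumptions chosen for readability.

```python
import bisect
import hashlib

RING_BITS = 16  # small ID space chosen only to keep the example readable


def ring_id(name: str) -> int:
    """Hash a name onto the circular ID space [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


def successor(node_ids, key_id):
    """Return the first node ID at or after key_id, wrapping around the ring."""
    ordered = sorted(node_ids)
    index = bisect.bisect_left(ordered, key_id)
    # index == len(ordered) means the key lies past the last node: wrap to the first.
    return ordered[index % len(ordered)]


nodes = [100, 20000, 45000]
owner_mid = successor(nodes, 150)     # falls between 100 and 20000
owner_wrap = successor(nodes, 60000)  # past the last node, wraps around
```

A Pastry-style variant would pick the numerically nearest node instead of the next one clockwise; the ring structure is the same.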
Related Work – Distributed Backup
• Distributed Internet Backup System
  – Not P2P; uses direct connections
  – Encrypts user data on other drives
• Pastiche
  – Built on Pastry
• PAST
  – Also built on Pastry
• pStore
  – Uses own P2P architecture
Design
• The User
• Our Project
  – UI
    • UI to mark files/dirs
    • UI to initiate backup and receive retrieval key
    • UI to retrieve backup
  – Engine
    • Store which files were chosen
    • Store last backup date
    • Manipulate the Storer
  – Adapter (P2P system abstraction)
• Existing P2P system (Freenet, Pastry, etc.)
P2P Adapter
• Abstracts interfacing with a P2P system’s distributed hash
• Writing another adapter could make this work with another system
• We used Freenet because it is the only working P2P system publicly available
• When backing up a file, returns a key that the adapter can later use to retrieve the file
  – P2P system specific
  – Freenet example: SSK@iuhgwe78gfq93f73f:/something/path/file.txt
• Done in Python
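One way to picture the adapter abstraction is as a small interface with an insert operation that returns a retrieval key and a retrieve operation that takes one. The sketch below is not the project's actual adapter: `P2PAdapter`, `InMemoryAdapter`, and the `memory:` key format are hypothetical, and the backend is a plain dictionary rather than Freenet.

```python
from abc import ABC, abstractmethod
import hashlib


class P2PAdapter(ABC):
    """Hypothetical interface the engine could program against."""

    @abstractmethod
    def insert(self, path: str, data: bytes) -> str:
        """Store data and return a system-specific retrieval key."""

    @abstractmethod
    def retrieve(self, key: str) -> bytes:
        """Return the data previously stored under key."""


class InMemoryAdapter(P2PAdapter):
    """Stand-in backend for testing; it does not talk to any real network."""

    def __init__(self):
        self._blobs = {}

    def insert(self, path: str, data: bytes) -> str:
        # Made-up key format, prefixed so it cannot be mistaken for a Freenet key.
        key = "memory:" + hashlib.sha1(data).hexdigest() + "/" + path
        self._blobs[key] = data
        return key

    def retrieve(self, key: str) -> bytes:
        return self._blobs[key]


adapter = InMemoryAdapter()
key = adapter.insert("dir/one", b"file contents")
```

Supporting another P2P system would then mean writing one more subclass, while the code that calls `insert` and `retrieve` stays unchanged.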
Engine
• Implements backup and retrieval logic
• Drives the P2P adapter to insert and retrieve files
• Stores keys from P2P adapter for retrieval
• Done in Java
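The engine's responsibilities listed above can be sketched as two small functions: one that inserts every marked file and records the returned key, and one that uses the stored keys to fetch the files back. This is written in Python for consistency with the other examples (the slides state the real engine is in Java), and `run_backup`, `run_restore`, and the fake storage dictionaries are hypothetical names used only for the demo.

```python
import datetime


def run_backup(marked_files, read_file, insert):
    """Insert every marked file and return {filename: (date, retrieval_key)}.

    read_file and insert are injected callables so the sketch stays
    independent of any particular file system or P2P adapter.
    """
    today = datetime.date.today().isoformat()
    records = {}
    for name in marked_files:
        key = insert(name, read_file(name))
        records[name] = (today, key)
    return records


def run_restore(records, retrieve):
    """Fetch every backed-up file again using its stored retrieval key."""
    return {name: retrieve(key) for name, (_date, key) in records.items()}


# Fake storage standing in for the disk and the P2P adapter (demo assumption).
fake_disk = {"dir/one": b"1", "example": b"hello"}
fake_network = {}


def fake_insert(name, data):
    key = "fake-key/%d" % len(fake_network)
    fake_network[key] = data
    return key


records = run_backup(["dir/one", "example"], fake_disk.__getitem__, fake_insert)
restored = run_restore(records, fake_network.__getitem__)
```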
UI
• Allows the user to select files or directories for backup
% mark_dir
> backup
> done
• Allows user to initiate backup or retrieval based on selection
• Stores selections in “backup.txt” as a comma-delimited file storing the filename, date of last backup, and retrieval key (if any)
• Done in Java
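A comma-delimited file with the three fields described above (filename, date of last backup, retrieval key) could be read and written as in the following sketch. The field order follows the slide; the ISO date format, the use of an empty field for "not backed up yet", the placeholder values, and the function names are assumptions.

```python
import csv
import io


def write_selections(stream, rows):
    """Write (filename, last_backup_date, retrieval_key) rows as CSV."""
    writer = csv.writer(stream)
    for filename, last_backup, key in rows:
        # An empty string stands for "no date yet" or "no key yet".
        writer.writerow([filename, last_backup or "", key or ""])


def read_selections(stream):
    """Read rows back, turning empty fields into None."""
    return [
        (filename, last_backup or None, key or None)
        for filename, last_backup, key in csv.reader(stream)
    ]


rows = [
    ("dir/one", "2000-01-01", "placeholder-key-1"),  # placeholder values
    ("example", None, None),                         # marked, not yet backed up
]
buffer = io.StringIO()  # stands in for an open backup.txt
write_selections(buffer, rows)
buffer.seek(0)
loaded = read_selections(buffer)
```

Using a CSV library rather than splitting on commas by hand keeps the file readable even if a filename contains a comma.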
Example Run
bowersj2@arctic> ls -l
total 8
drwx------ 2 bowersj2 student 4096 Apr 16 13:58 dir
-rw------- 1 bowersj2 student 27 Apr 16 13:58 example
bowersj2@arctic> java Backup_UI
% mark_dir
> dir
> done
% mark_file
> example
> done
% backup
Backing up dir/one... retrieval key freenet:SSK@iuhgwe78gfq93f73f/3
Backing up dir/two... retrieval key freenet:SSK@iuhgwe78gfq93f73f/4
Backing up example... retrieval key freenet:SSK@iuhgwe78gfq93f73f/5
% exit
Results
• One limit of the Freenet library used was that files must be no larger than 64KB
  – Not fundamental to Freenet
• A Freenet file takes approximately 4-5 seconds to insert into the system
• Retrieval was very fast since it was always from the local drive cache
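The slides do not say how larger files were handled. One possible workaround for a 64 KB per-file limit, shown purely as an assumption and not as something the project did, is to split a file into pieces that each fit under the limit and insert them separately. `split_into_chunks` is a hypothetical helper.

```python
CHUNK_LIMIT = 64 * 1024  # the 64 KB per-file limit reported above


def split_into_chunks(data: bytes, limit: int = CHUNK_LIMIT):
    """Split data into consecutive pieces of at most limit bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [data[i:i + limit] for i in range(0, len(data), limit)]


payload = bytes(150 * 1024)          # 150 KB of zero bytes as sample data
chunks = split_into_chunks(payload)  # pieces of 64 KB, 64 KB and 22 KB
```

Each piece would need its own retrieval key, so the engine would have to store a list of keys per file and join the pieces in order on retrieval.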
Design Benefits
• Careful design allows each component to be implemented in any language
  – UI and Engine communicate through backup.txt
  – Engine and P2P adapter communicate through command lines
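Communication over command lines can be sketched as one program launching the other as a child process and reading the key it prints. The slides do not give the adapter's real command or arguments, so the example below uses a stand-in one-line Python program (`FAKE_ADAPTER`) that prints a made-up key; `insert_via_command_line` is likewise a hypothetical name.

```python
import subprocess
import sys

# Stand-in for the adapter program: reads bytes on stdin and prints a made-up
# key. A real adapter would insert the data into the P2P system instead.
FAKE_ADAPTER = (
    "import sys, hashlib; "
    "data = sys.stdin.buffer.read(); "
    "print('fake:' + hashlib.sha1(data).hexdigest())"
)


def insert_via_command_line(data: bytes) -> str:
    """Run the adapter as a child process and return the key it prints."""
    result = subprocess.run(
        [sys.executable, "-c", FAKE_ADAPTER],
        input=data,
        stdout=subprocess.PIPE,
        check=True,  # raise if the adapter exits with a non-zero status
    )
    return result.stdout.decode("ascii").strip()


key = insert_via_command_line(b"file contents")
```

Because the only contract is text passed between processes, the caller could just as well be written in Java and the adapter in Python, which is the language independence the slide describes.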
• Problems getting other P2P systems running
  – Most not publicly available yet
  – PASTRY could not be compiled
  – Shipped source had Java exception handling errors
Conclusion
• Modern P2P systems will provide a good substrate for this sort of application
  – When they are released and working!
• Writing a basic version of this kind of application is fairly easy
• Effectiveness depends on the underlying P2P system
  – Freenet doesn’t chunk files, some P2P systems do
  – Freenet has no retention guarantees, some P2P systems do
  – Freenet natively prevents snooping by other users, some P2P systems don’t