peekatorrent: leveraging p2p hash values for digital forensics
TRANSCRIPT
PeekaTorrentLeveraging P2P Hash Values for
Digital ForensicsSebas�an Neuner,Mar�n Schmiedecker,
Edgar Weippl
Problem Descrip�on
We are drowning in data:• processes and best-prac�ces do not scalewell
• 12TB hard drives recently presented• sector hashing, unziping and unpacking, ...
2/21
Problem Descrip�on
Hashing is prevalent:• DHTs, P2P file-sharing (SHA-1)• Dropbox (4MB, SHA-256)• file whitelis�ng (NSRL):
◦ full file (SHA256, SHA1 & MD5)◦ fuzzy files (ssdeep, sdhash)◦ blocks (MD5b4096, MD5b8192)
5/21
Problem Descrip�on
Hashing is prevalent:• DHTs, P2P file-sharing (SHA-1)• Dropbox (4MB, SHA-256)• file whitelis�ng (NSRL):
◦ full file (SHA256, SHA1 & MD5)◦ fuzzy files (ssdeep, sdhash)◦ blocks (MD5b4096, MD5b8192)
6/21
Problem Descrip�on
Couldn’t we add to this:• exclude commonly found files• mostly totally irrelevant for inves�ga�on• even before looking at files manually
7/21
peekaTorrent
General idea:• leverage publicly shared hash values• more granular than files, but less thansectors
• it’s all in the .torrent• copyright-free!
9/21
peekaTorrent
BitTorrent uses chunking:• all files are concatenated• then split in chunks (=pieces)• most o�en 256kb, (observed 16kb-16mb)• depending on implementa�on and userpreference
10/21
peekaTorrent
Instead of hashing sectors, or files:• variable hash windows (2n)• iterate over each sector• build on bulk extractor
Then pipe it all into hashdb, see what drops out
11/21
peekaTorrent
Benefits:• also deleted & par�ally overwri�en files• fast!• less false-posi�ves• hashdb files can be easily shared
12/21
peekaTorrent
Use cases:• file whitelis�ng (torrents)• file blacklis�ng• custom hashsets: source code, emaila�achements, sharepoint, ...
13/21
peekaTorrent
Simplis�c use:• create torrent with files of interest• don’t publish/announce it• pipe into hashdb, done
14/21
Evalua�on
Collected data:• in total: 2.65 million torrent files• crawling Piratebay & KAT• mul�ple data dumps
• 3.3 billion unique chunk hashes• up to 2.6 PB of data
16/21
Evalua�on
Some numbers:• 1 GB filesystem, Ubuntu Desktop= 1158chunks
• running bulk extractor: 220s (Notebook),23s (Server)
• running hashdb: few seconds
17/21
Future Work
What’s needed:• more .torrents!• more data• inves�gate data set more closely(duplicates)
• get feedback
19/21
Ques�ons?
Thank you for youra�en�on
URL: h�ps://peekatorrent.orgEmail: [email protected]�er: @fr333k
21/21