Post on 21-Dec-2015
Automated Worm Fingerprinting
Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage
Manan Sanghi
The menace
Context
Worm detection: scan detection, honeypots, host-based behavioral detection
Payload-based ???
Context
Characterization
A priori vulnerability signatures: generally manual
Honeycomb: host based, longest common subsequences
Autograph: network-level automatic signature generation
Context
Containment: host quarantine, string matching, connection throttling
Address Blacklisting
Content Filtering
Internet Quarantine
Worm behavior
Content invariance: limited polymorphism (e.g. encryption); key portions are invariant (e.g. the decryption routine)
Content prevalence: the invariant portions appear frequently
Address dispersion: the number of distinct infected hosts grows over time, which shows up as many different source and destination addresses
Key Idea
Detect unknown worms on the basis of
A common exploit sequence
Range of unique sources and destinations
Content Sifting
For each string w, maintain:
prevalence(w): number of times it is found in the network traffic
sources(w): number of unique sources corresponding to it
destinations(w): number of unique destinations corresponding to it
If the thresholds are exceeded, then block(w)
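The bookkeeping above can be sketched naively with exact tables. This is only an illustration of the definitions, not the scalable system: the class name `NaiveContentSifter` is hypothetical, the default thresholds (3 and 30) are borrowed from the parameter slide, and "exceeded" is assumed to mean strictly greater than.

```python
from collections import defaultdict


class NaiveContentSifter:
    """Exact per-string tables; far too expensive for a real link."""

    def __init__(self, prevalence_threshold=3, dispersion_threshold=30):
        self.prevalence_threshold = prevalence_threshold
        self.dispersion_threshold = dispersion_threshold
        self.prevalence = defaultdict(int)     # w -> times seen
        self.sources = defaultdict(set)        # w -> distinct source addresses
        self.destinations = defaultdict(set)   # w -> distinct destination addresses
        self.blocked = set()

    def observe(self, w, src, dst):
        """Record one sighting of string w; return True if w is now blocked."""
        self.prevalence[w] += 1
        self.sources[w].add(src)
        self.destinations[w].add(dst)
        if (self.prevalence[w] > self.prevalence_threshold
                and len(self.sources[w]) > self.dispersion_threshold
                and len(self.destinations[w]) > self.dispersion_threshold):
            self.blocked.add(w)
        return w in self.blocked
```

A string repeated between the same two hosts never trips the dispersion test, which is what separates a worm from ordinary popular content in this model.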
Issues
How to compute prevalence(w), sources(w) and destinations(w) efficiently?
Scalable
Low memory and CPU requirements
Real-time deployment over a gigabit-scale link
prevalence(w)
w = entire packet: use multi-stage filters (k-ary sketches?)
w = small fixed-length substring of length b: Rabin fingerprints, value sampling
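A minimal sketch of a multi-stage filter for the prevalence test. Each stage is an array of counters indexed by an independent hash, and a string counts as prevalent only when its counter exceeds the threshold in every stage. The stage count, table size, and the use of salted BLAKE2 as the hash family are assumptions made for illustration.

```python
import hashlib


class MultistageFilter:
    """Counts repeated content without storing one entry per string."""

    def __init__(self, stages=4, counters_per_stage=1024, threshold=3):
        self.counters_per_stage = counters_per_stage
        self.threshold = threshold
        self.tables = [[0] * counters_per_stage for _ in range(stages)]

    def _index(self, stage, key):
        # A different salt per stage gives (approximately) independent hashes.
        digest = hashlib.blake2b(
            key, digest_size=8, salt=stage.to_bytes(2, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.counters_per_stage

    def add(self, key):
        """Count one sighting; return True if every stage exceeds the threshold."""
        passed = True
        for stage, table in enumerate(self.tables):
            i = self._index(stage, key)
            table[i] += 1
            if table[i] <= self.threshold:
                passed = False
        return passed
```

Collisions can only inflate counters, so the filter may report a false positive but never misses a string that really did exceed the threshold.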
Value Sampling
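The problem: a packet of s bytes has s-b+1 substrings of length b
Solution: sample
But: random sampling is not good enough
Trick: sample only those substrings for which the fingerprint matches a certain pattern
Since Rabin fingerprints are randomly distributed, the probability of tracking an invariant string of length x, when a fraction f of fingerprints is sampled, is
$P_{\text{track}}(x) = 1 - e^{-f(x-b+1)}$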
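A rough sketch of value sampling. A simple polynomial rolling hash stands in for a true Rabin fingerprint (an assumption; the constants `BASE` and `MOD` are arbitrary choices), and a substring is sampled when the low `sample_bits` bits of its fingerprint are zero, so f = 1/2^sample_bits. Because the decision depends only on the substring's content, every sensor samples the same substrings of the same worm.

```python
import math

BASE = 257               # assumed rolling-hash base
MOD = (1 << 61) - 1      # assumed modulus (a Mersenne prime)


def rolling_fingerprints(payload, b):
    """Yield (offset, fingerprint) for all len(payload)-b+1 windows of length b."""
    if len(payload) < b:
        return
    top = pow(BASE, b - 1, MOD)   # weight of the byte leaving the window
    fp = 0
    for byte in payload[:b]:
        fp = (fp * BASE + byte) % MOD
    yield 0, fp
    for i in range(b, len(payload)):
        fp = ((fp - payload[i - b] * top) * BASE + payload[i]) % MOD
        yield i - b + 1, fp


def sampled_substrings(payload, b, sample_bits):
    """Keep only windows whose fingerprint ends in sample_bits zero bits."""
    mask = (1 << sample_bits) - 1
    return [(off, fp) for off, fp in rolling_fingerprints(payload, b)
            if fp & mask == 0]


def p_track(x, b, f):
    """Probability that at least one window of an x-byte invariant is sampled."""
    return 1.0 - math.exp(-f * (x - b + 1))
```

For instance, `p_track(100, 40, 1/64)` evaluates the formula for a 100-byte invariant with 40-byte windows and one fingerprint in 64 sampled; longer invariants push the probability toward 1.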
sources(w) & destinations(w)
Address dispersion: counting distinct elements vs. repeating elements
A simple list or hash table is too expensive
Key idea: bitmaps
Trick: scaled bitmaps
Direct Bitmap
Each content source is hashed into a bitmap, the corresponding bit is set, and an alarm is raised when the number of bits set exceeds a threshold
Drawback: the actual value of each counter can no longer be estimated
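A small sketch of the direct bitmap described above, kept per content string. The bitmap width, the alarm level, and the BLAKE2 hash are illustrative assumptions; the class name `DirectBitmap` is hypothetical.

```python
import hashlib


class DirectBitmap:
    """Tracks distinct addresses for one content string using a few bits."""

    def __init__(self, num_bits=32, alarm_bits=20):
        self.num_bits = num_bits
        self.alarm_bits = alarm_bits
        self.bits = 0   # Python int used as a bit vector

    def bits_set(self):
        return bin(self.bits).count("1")

    def add(self, address):
        """Hash the address to one bit, set it, and report whether to alarm."""
        digest = hashlib.blake2b(address.encode(), digest_size=8).digest()
        position = int.from_bytes(digest, "big") % self.num_bits
        self.bits |= 1 << position
        return self.bits_set() > self.alarm_bits
```

Repeats of the same address always hit the same bit, so only distinct addresses can push the bitmap toward the alarm level.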
Scaled Bitmap
Idea: subsample the range of the hash space
How does it work?
Multiple bitmaps, each mapped to a progressively smaller portion of the hash space.
A bitmap is recycled if necessary.
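The subsampling idea can be illustrated with a simplified multi-resolution variant. This is not the recycling scheme from the slides: here level i simply sees the 1/2^i fraction of addresses whose hash falls in the lowest part of the hash space, and the estimate is read from the first level that is not too full, using the standard linear-counting formula scaled back up. All names, sizes, and the fill limit are assumptions.

```python
import hashlib
import math


class SampledBitmaps:
    """Several small bitmaps, each covering half the hash space of the last."""

    def __init__(self, levels=8, bits_per_level=256, max_fill=0.7):
        self.levels = levels
        self.m = bits_per_level
        self.max_fill = max_fill
        self.bitmaps = [0] * levels

    def add(self, address):
        digest = hashlib.blake2b(address.encode(), digest_size=16).digest()
        selector = int.from_bytes(digest[:8], "big")      # picks the levels
        position = int.from_bytes(digest[8:], "big") % self.m
        for level in range(self.levels):
            # Level i accepts only the lowest 1/2**i of the 64-bit hash space.
            if selector >= (1 << (64 - level)):
                break
            self.bitmaps[level] |= 1 << position

    def estimate(self):
        """Estimate the number of distinct addresses added so far."""
        for level, bitmap in enumerate(self.bitmaps):
            set_bits = bin(bitmap).count("1")
            last = level == self.levels - 1
            if set_bits <= self.max_fill * self.m or last:
                zeros = max(self.m - set_bits, 1)
                # Linear counting on the sample, scaled by the sampling rate.
                return (2 ** level) * self.m * math.log(self.m / zeros)
```

Unlike the direct bitmap, this keeps producing a usable count after the first bitmap saturates, at the cost of a coarser estimate at higher levels.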
Result
Roughly 5 times less memory, plus an actual estimate of address dispersion
Putting it together
Experience
System design: sensors and aggregators
Sensors sift through traffic on configurable address-space zones of responsibility
The aggregator coordinates real-time updates from the sensors, coalesces related signatures, and so on
Parameters:
content prevalence threshold: 3
address dispersion threshold: 30
garbage collection time: several hours
prevalence(w) threshold
Address Dispersion threshold
Garbage Collection threshold
Trace-based False Positives
Performance
Processing time:
Memory consumption: 4 MB
Live Experience
Detect known worms: CodeRed,
Detect new worms: MyDoom, Sasser, Kibvu.B
Limitation & Extension
Variant content
Network evasion
Extension: Dealing with slow worms
Comparison
EarlyBird vs. Autograph
Infect the system with Network Data (real traces)
Rabin fingerprint
White-list/blacklist
No prefiltering (EarlyBird) vs. flow reassembly (Autograph)
Single-sensor algorithmics + centralized aggregators (EarlyBird) vs. distributed deployment + active cooperation between multiple sensors (Autograph)
On-line (EarlyBird) vs. off-line (Autograph)
Overlapping, fixed-length chunks (EarlyBird) vs. non-overlapping, variable-length chunks (Autograph)
Qinghua Zhang
Breather
Polygraph: Automatically Generating Signatures For Polymorphic Worms
James Newsome, Brad Karp, Dawn Song
The case for polymorphic worms
Single Substring Insufficient
Sensitive: should exist in all payloads of a worm
Specific: should be long enough that it does not exist in any non-worm payload
Examples
Signature Classes
Signature – set of tokens
Conjunction Signatures
Token-subsequence Signatures
Bayes Signatures
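The three signature classes can be read as three ways of matching a set of tokens against a payload. The sketch below is one plausible reading, with assumptions labeled: a conjunction signature requires every token in any order, a token-subsequence signature requires them in the given order without overlap, and a Bayes signature sums per-token scores and compares against a threshold (the strict comparison and the function names are illustrative choices).

```python
def matches_conjunction(payload, tokens):
    """True if every token occurs somewhere in the payload, in any order."""
    return all(token in payload for token in tokens)


def matches_subsequence(payload, tokens):
    """True if the tokens occur in the given order, without overlapping."""
    position = 0
    for token in tokens:
        found = payload.find(token, position)
        if found < 0:
            return False
        position = found + len(token)   # next token must start after this one
    return True


def matches_bayes(payload, token_scores, threshold):
    """True if the summed scores of the tokens present exceed the threshold."""
    score = sum(s for token, s in token_scores.items() if token in payload)
    return score > threshold
```

A payload that contains all tokens in the wrong order matches the conjunction form but not the subsequence form, which is why the ordered form is the more specific of the two.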
Problem Formulation
Algorithms
Preprocessing: distinct substrings of a minimum length l that occur in at least k samples in the suspicious pool
Generating signatures: conjunction signatures, token-subsequence signatures, Bayes signatures
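A brute-force sketch of the preprocessing step, assuming small samples. It enumerates every substring of at least `min_len` bytes, keeps those seen in at least `min_samples` samples, and then drops any token contained in a longer kept token. That last step is a simplification of "distinct" (a shorter token that also occurs on its own in enough samples would be lost), and the quadratic enumeration is only for illustration.

```python
from collections import Counter


def extract_tokens(samples, min_len, min_samples):
    """Return maximal substrings (>= min_len bytes) found in >= min_samples samples."""
    counts = Counter()
    for sample in samples:
        # Use a set so each sample contributes at most once per substring.
        seen = {sample[i:j]
                for i in range(len(sample))
                for j in range(i + min_len, len(sample) + 1)}
        counts.update(seen)

    frequent = [t for t, c in counts.items() if c >= min_samples]
    frequent.sort(key=len, reverse=True)

    maximal = []
    for token in frequent:
        # Skip tokens already covered by a longer kept token.
        if not any(token in longer for longer in maximal):
            maximal.append(token)
    return maximal
```

The returned tokens are the raw material for any of the three signature classes.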
Wrap Up
Automated Worm Fingerprinting (OSDI 2004)
Polygraph: Automatically Generating Signatures For Polymorphic Worms
(IEEE Symposium on Security and Privacy 2005)
Manan Sanghi