The Spammer, the Botmaster, and the Researcher: On the Arms Race in Spamming Botnet Mitigation

Gianluca Stringhini, Major Area Exam, December 5, 2011

DESCRIPTION

Unsolicited bulk email, or spam, accounts for more than 90% of worldwide email traffic. The underground economy behind email spam is prosperous and involves parties located in many parts of the world. Nowadays, most spam is sent by botnets, which are large networks of compromised computers that act under the control of a single entity, called a botmaster. Security researchers have entered an arms race with spammers and botmasters. The goal of researchers is to secure networks and prevent malicious operations from happening, while the goal of cybercriminals is to keep their business up and running. In this talk I will analyze the outcome of this arms race. On one side, I will talk about the different levels of sophistication that botmasters have developed to make their networks resilient to takedown attempts. On the research side, I will analyze the approaches proposed to prevent machines from being infected, identify compromised ones, and disrupt command and control structures. In particular, I will focus on the shortcomings of previous approaches, as well as on open problems and areas that have not been studied yet.

TRANSCRIPT

Page 1

The Spammer, the Botmaster, and the Researcher: On the Arms Race in Spamming Botnet Mitigation

Gianluca Stringhini

Major Area Exam

December 5, 2011

Page 2

What is spam?

Spam is a big problem

Everyone receives spam
90-95% of emails are spam

Organic vs. Junk food

Spam vs. Ham

We need a definition a computer can understand

Unsolicited Bulk Email

Page 3

Early days spam

Spam as a hobby

Businesses run from home basements

CAN-SPAM Act (2003)

Does not forbid spamming, but the spammer has to be nice.
$16k fine per violating email

The world is big

Not every country prosecutes spammers

Page 4

Modern spam

Page 5

Modern spam

• Affiliate programs [Samosseiko 2009]

• Are banks the weak link? [Levchenko 2011]

Source: Levchenko et al., Click Trajectories: End-to-End Analysis of the Spam Value Chain

Page 6

Is Spam Profitable?

Yes, it is
Estimates between $300k and $1M a month for large affiliate programs [Kanich 2008, Kanich 2011]

Relatively low risk

• Small fish are the ones who get caught

• The geographic dispersion makes coordinated actions difficult

Page 7

How is Spam Delivered?

Botnets
Botnets are networks of compromised computers that act under the control of a single entity (Botmaster)

What are botnets used for?

• Running DoS

• Stealing Information

• Solving Captchas

• Sending spam

Botnets are responsible for 85% of worldwide spam

Why botnets?

Botnets combine the best of two worlds: worms and IRC bots

Researchers and Botmasters are involved in an arms race

Page 8

Botnet Evolution

Page 9

Botnet Evolution - Structure

SDBot 2002

Page 10

Botnet Evolution - Structure

IRC botnets
The C&C is an IRC server
Bots join a channel and get orders

Problems

• Researchers can join the channel too

• DNS sinkholing is possible

Page 11

Botnet Evolution - Structure

MyDoom 2004

Page 12

Botnet Evolution - Structure

Proprietary protocol botnets

The C&C uses a proprietary encrypted protocol
Two architectures:

• Pull architecture

• Push architecture

Problems

• Researchers can reverse engineer the protocol

• DNS sinkholing is still possible

Page 13

Botnet Evolution - Structure

Lethic 2007

Page 14

Botnet Evolution - Structure

Multi-tier botnets

The bots don’t connect directly to the C&C
The domains used by the proxies use Fast Flux

Fast Flux
Technique similar to Round-robin DNS and CDNs
Gives high reliability to the botnet backbone

• Many IP addresses associated with a domain

• Low TTL, the record changes all the time

Page 15

Botnet Evolution - Structure

Problem
The domains used can still be sinkholed / blacklisted

The solution
Domain Generation Algorithms
Bots contact a domain according to a time-dependent algorithm
Used by Torpig (2008)

Problems
The algorithm can be reverse engineered [StoneGross 2009a]
Botmasters can add non-determinism (e.g., Twitter trends)
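
To make the time-dependent idea concrete, here is a minimal sketch of a domain generation algorithm. It is a hypothetical toy, not Torpig's actual algorithm: the seed string, the SHA-256 construction, and the `.example` TLD are all assumptions chosen for illustration. It shows why reverse engineering hurts the botmaster: anyone who knows the algorithm can compute future domains in advance.

```python
import hashlib
from datetime import date


def candidate_domains(day: date, seed: str = "example-seed",
                      count: int = 3, tld: str = ".example") -> list[str]:
    """Derive `count` deterministic domain names from a calendar date.

    Hypothetical scheme: hash (seed, date, index) and map hex digits to letters.
    """
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Each hex digit (0..15) becomes one letter in the range a..p.
        label = "".join(chr(ord("a") + int(char, 16)) for char in digest[:12])
        domains.append(label + tld)
    return domains


# Bot and defender compute the same list for the same day.
todays_domains = candidate_domains(date(2011, 12, 5))
```

A defender who recovers the seed and the algorithm can call `candidate_domains` for upcoming dates and register or sinkhole the results before the bots look them up. Mixing in an input that cannot be known ahead of time, as the slide notes, removes that head start.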

Page 16

Botnet Evolution - Structure

Storm 2007

Page 17

Botnet Evolution - Structure

Peer-to-peer botnets

Bots with private IPs act as workers
Bots with public IPs act as proxies
Workers find proxies based on some Overnet protocol

Problem
Proxies are not under the control of the botmaster
Researchers can impersonate a proxy and infiltrate the botnet

Page 18

Botnet Evolution - Infection model

Worm-like spread

The bot scans the network for vulnerabilities and propagates

Non-spreading bots

Infections are propagated through

• Drive-by-download websites [Provos 2008, StoneGross 2011]

• Email attachments

Pay-per-Install

The new trend is paying third parties for “installing” a certain number of bots [Caballero 2011]

Page 19

Botnet and Spam Mitigation

Page 20

Botnet and Spam Mitigation

Many Possible Vantage Points

Page 21

Host-based detection

Page 22

Host-based detection

Traditional anti-virus approach

Look for the presence of virus-specific instructions in the binaries
Antiviruses can be fooled by simple obfuscations [Christodorescu 2003, Christodorescu 2004]

Obfuscations
NOP insertion and code transposition are usually enough

• Metamorphic malware

• Polymorphic malware
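
A small sketch can show why NOP insertion is enough against a purely syntactic scanner. The scanner below is a deliberately naive assumption (a verbatim byte-substring match), not the engine of any real antivirus product, and the three-instruction "program" is a harmless toy in 32-bit x86 encoding.

```python
NOP = b"\x90"  # single-byte x86 no-op


def assemble(instructions: list[bytes]) -> bytes:
    """Concatenate per-instruction byte strings into one code blob."""
    return b"".join(instructions)


def insert_nops(instructions: list[bytes]) -> list[bytes]:
    """Return a behaviorally equivalent program with a NOP after every instruction."""
    padded: list[bytes] = []
    for instruction in instructions:
        padded.append(instruction)
        padded.append(NOP)
    return padded


def matches_signature(binary: bytes, signature: bytes) -> bool:
    """Naive scanner: flag the binary if the signature bytes appear verbatim."""
    return signature in binary


# Toy program: xor eax, eax; inc eax; ret (32-bit x86 encodings).
program = [b"\x31\xc0", b"\x40", b"\xc3"]
# Hypothetical signature: the first two instructions, back to back.
signature = assemble(program[:2])

original = assemble(program)
obfuscated = assemble(insert_nops(program))
```

Here `matches_signature(original, signature)` is true while `matches_signature(obfuscated, signature)` is false, even though both blobs compute the same result. NOPs are inserted at instruction boundaries so the code stays valid, which is the step a byte-level signature cannot see through.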

Page 23

Host-based detection

Static analysis

Take program semantics into account [Christodorescu 2003, Christodorescu 2005]

Dynamic analysis

Model the behavior of a program (e.g., using system calls) [Kolbitsch 2009]
Monitor access to sensitive information [Yin 2007]
Reverse engineer the C&C protocol [Caballero 2009]

Problems
Program equivalence is undecidable!
Analysis of samples takes time and resources

Page 24

Malicious Web Pages Detection

Page 25

Malicious Web Pages Detection

Infections happening through browser exploits are a big problem

Detecting Drive-by-Download pages

Malicious JavaScript can be detected by:

• Emulation [Cova 2010]

• Monitoring system changes [Provos 2008]

• Hooking the runtime [Curtsinger 2011, Heiderich 2011]

• Looking for common attack patterns (e.g., heap spray) [Ratanaworabhan 2009]

Problems

• The analysis could be detected

• These systems might not detect newer attacks

Page 26

Command and Control based Detection

Page 27

Command and Control-based Detection

IRC server infiltration [AbuRajab 2006]

Protocol Reverse Engineering

Protocol reverse engineering by active probing [Cho 2010a]
This enables botnet infiltration [Stock 2009, Kreibich 2009, Cho 2010b]

Botnet Takeovers
Reverse engineering of DGAs [StoneGross 2009a]
This enables C&C impersonation [StoneGross 2009a]

Page 28

Honeypots

Running bots in virtual machines allows researchers to learn important botnet features [John 2009]

This can be used for

• Blacklisting the domains that host C&C servers [StoneGross 2009b]

• Performing botnet takedowns [StoneGross 2011]

Problems

• Bots might detect virtualization [Balzarotti 2010]

• Containment problems arise [Kreibich 2011]

Page 29

DNS Based Detection

Page 30

DNS Based Detection

Detecting infected IPs

DNS sinkhole [Dagon 2006]
Look for DNS cached results [AbuRajab 2006]

Detect Fast-Flux Domains
Fast Flux domains present very different characteristics from legitimate ones [Holz 2008, Passerini 2008, Hu 2009]

• IPs belong to different networks

• TTL is low

• Results change very frequently
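
The three characteristics listed above can be turned into a simple heuristic. The sketch below is an assumption-laden illustration, not the method of any cited paper: the `/16` grouping used to approximate "different networks" and all thresholds are hypothetical values picked for the example, and the addresses come from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class DnsAnswer:
    """One observed answer for a domain: record TTL and the returned A records."""
    ttl: int
    addresses: frozenset[str]


def flux_features(answers: list[DnsAnswer]) -> dict[str, float]:
    """Compute the three slide features from repeated lookups of one domain."""
    if not answers:
        raise ValueError("need at least one DNS answer")
    all_ips = set().union(*(a.addresses for a in answers))
    # Approximate "different networks" by distinct /16 prefixes (assumption).
    networks = {ip_network(f"{ip}/16", strict=False) for ip in all_ips}
    changes = sum(1 for prev, cur in zip(answers, answers[1:])
                  if prev.addresses != cur.addresses)
    return {
        "distinct_networks": len(networks),
        "min_ttl": min(a.ttl for a in answers),
        "change_rate": changes / max(len(answers) - 1, 1),
    }


def looks_fast_flux(answers: list[DnsAnswer], max_ttl: int = 300,
                    min_networks: int = 3, min_change_rate: float = 0.5) -> bool:
    """Flag a domain only when all three indicators agree (illustrative thresholds)."""
    f = flux_features(answers)
    return (f["min_ttl"] <= max_ttl
            and f["distinct_networks"] >= min_networks
            and f["change_rate"] >= min_change_rate)


flux_like = [
    DnsAnswer(60, frozenset({"192.0.2.10", "198.51.100.7"})),
    DnsAnswer(60, frozenset({"203.0.113.5", "198.51.100.9"})),
    DnsAnswer(30, frozenset({"192.0.2.44", "203.0.113.80"})),
]
stable = [DnsAnswer(3600, frozenset({"192.0.2.1"}))] * 3
```

With these inputs `looks_fast_flux(flux_like)` is true and `looks_fast_flux(stable)` is false. Requiring all three indicators together matters because legitimate CDNs also use low TTLs and rotating answers, so any single feature on its own would produce false positives.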

Page 31

DNS Based Detection

Detecting Malicious Domains

It is possible to build classifiers to detect malicious domains

• Passive analysis of RDNS queries [Antonakakis 2010, Bilge 2011]
Limitation: only local view

• Analysis at the authoritative server level or TLDs [Antonakakis 2011]
Limitation: it can be evaded using diverse DNS servers

Page 32

SMTP based Detection

Page 33

SMTP based Detection: Content Analysis

Rule-based Spam Detection

• The nature of spam changes over time

• Having a binary decision introduces problems

Machine Learning

• Bayesian Filtering: uses naïve Bayes [Sahami 1998, Androutsopoulos 2000]

• Support Vector Machines [Drucker 1999]

Problems

• Feature selection has to be performed

• “Good word” attacks are possible [Lowd 2005, Karlberger 2007]
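
As an illustration of Bayesian filtering and of the “good word” attack, here is a minimal naive Bayes sketch. The whitespace tokenizer, the add-one smoothing, and the four training messages are assumptions made for the example; real filters use far larger corpora and more careful feature selection.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny multinomial naive Bayes spam filter with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_log_odds(self, text: str) -> float:
        """Return log P(spam | text) - log P(ham | text); positive means spam.

        Requires at least one training message per class.
        """
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        score = math.log(self.message_counts["spam"] / self.message_counts["ham"])
        for label, sign in (("spam", 1.0), ("ham", -1.0)):
            total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                likelihood = (self.word_counts[label][word] + 1) / (total + len(vocabulary))
                score += sign * math.log(likelihood)
        return score


spam_filter = NaiveBayesFilter()
for text in ("cheap pills buy now", "buy cheap watches now"):
    spam_filter.train(text, "spam")
for text in ("meeting agenda for monday", "lunch on monday with the team"):
    spam_filter.train(text, "ham")

plain = spam_filter.spam_log_odds("buy cheap pills")
# "Good word" attack: pad the same spam with words that are common in ham.
padded = spam_filter.spam_log_odds("buy cheap pills meeting agenda monday lunch team")
```

On this toy data `plain` is positive, while `padded` is lower and falls below zero: every appended ham-typical word pulls the log odds toward ham without removing the spam payload. That is the weakness the attack exploits.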

Page 34

SMTP based Detection: Content Analysis

Assign a Reputation to Received Emails

Different features between spam and ham [Hao 2009]

Building Signatures from Spam

[Pitsillidis 2010] ran bots and assigned templates to different botnets

Detect Spam by Looking at URLs

• Study the URL structure [Xie 2008, Ma 2009]

• Learn features from the landing page [Thomas 2011]

Problem

• In general, content analysis is expensive

Page 35

SMTP based Detection: IP Blacklisting

DNS-based blacklists
Mail servers can query the service to know whether an individual IP is a known spammer
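
The query itself follows a simple convention: the IPv4 octets are reversed and prepended to the blacklist zone, and the mail server then resolves that name. A minimal sketch, assuming a placeholder zone `dnsbl.example` (not a real service):

```python
import socket
from ipaddress import IPv4Address


def dnsbl_query_name(ip: str, zone: str = "dnsbl.example") -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(IPv4Address(ip)).split(".")  # IPv4Address validates the input
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str = "dnsbl.example") -> bool:
    """Return True if the name resolves, which DNSBLs use to signal a listing.

    Simplification: any resolution failure is treated as "not listed",
    so a DNS outage is indistinguishable from a clean IP here.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.99")` yields `99.2.0.192.dnsbl.example`. Because the lookup key is a single address, a bot that changes its dynamic IP immediately falls outside the list.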

Problems

• Low coverage [Ramachandran 2006a, Sinha 2008]

• Bot machines have dynamic IPs

• What happens when IPv6 takes over?

Better Approaches

• IP reputation [Ramachandran 2006b, Sinha 2010, Qian 2010]

• Behavioral blacklisting [Ramachandran 2007, Stringhini 2011]

Page 36

SMTP based Detection: Policies

Greylisting

If a delivery temporarily fails, spambots will not try again
Easy to bypass and prone to false positives [Levine 2005]
Multi-level greylisting [Janecek 2008]
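
The greylisting policy can be sketched in a few lines. This is a simplified model under stated assumptions: the triplet key (client IP, sender, recipient), the 300-second minimum delay, and the in-memory dictionary are illustrative choices, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
class Greylist:
    """Temporarily reject the first delivery attempt for each unseen triplet."""

    def __init__(self, min_delay: float = 300.0) -> None:
        self.min_delay = min_delay
        # Maps (client_ip, sender, recipient) to the time of the first attempt.
        self.first_seen: dict[tuple[str, str, str], float] = {}

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        triplet = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return "accept"
        # A real server would answer with a temporary (4xx) SMTP error here.
        return "tempfail"


greylist = Greylist(min_delay=300.0)
first_try = greylist.check("192.0.2.1", "a@example.org", "b@example.com", now=0.0)
too_soon = greylist.check("192.0.2.1", "a@example.org", "b@example.com", now=10.0)
retry = greylist.check("192.0.2.1", "a@example.org", "b@example.com", now=600.0)
```

Here `first_try` and `too_soon` are `"tempfail"` and `retry` is `"accept"`. A standards-compliant mail server retries after the delay and gets through, whereas a fire-and-forget spambot never does. The same logic explains both weaknesses on the slide: a bot only has to retry to bypass it, and legitimate mail from senders that retry poorly is delayed or lost.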

Sender Validation
Spam pretends to come from legitimate addresses
SPF, DomainKeys, DKIM [Leiba 2007]

The solution chosen by Google

User voting on spam and ham [Taylor 2006]

Main problem: Spam hurts server performance!

Mail prioritization systems [Twining 2004, Venkataraman 2007]

Page 37

Social Network Detection

Page 38

Social Network Detection

Online Social Networks are very successful

Users are not as risk aware as they are with email spam

Miscreants create fake profiles to spread spam

Systems to detect fake profiles have been developed [Benevenuto 2010, Lee 2010, Stringhini 2010, Yang 2011a, Yang 2011b]

Real accounts that get compromised are more valuable

45% of social network users click on any link posted by their friends [Bilge 2009]
89% of profiles sending malicious content on Facebook are compromised [Gao 2010]

Page 39

Network Edge Detection

Page 40

Intrusion Detection

Signature-based intrusion detection

Snort, Bro [Paxson 1998]

Problems

• Constant need for new rules

• Problems with encrypted traffic

Anomaly-based intrusion detection

The system learns the “normal” behavior of a network and flags anomalies [Portnoy 2001, Kruegel 2002, Wang 2004]

Problems

• What is “normal” behavior?

• It is hard to get traffic that is free of infections
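
A toy example makes the second problem tangible. The sketch below is not any cited system: it assumes a single hypothetical feature (outbound SMTP connections per minute from one host) and a plain z-score threshold, and the numbers are invented for illustration.

```python
from statistics import mean, stdev


def fit_baseline(samples: list[float]) -> tuple[float, float]:
    """Learn "normal" as the mean and sample standard deviation of the feature."""
    return mean(samples), stdev(samples)


def is_anomalous(value: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical per-minute outbound SMTP connection counts.
clean_training = [2, 3, 1, 2, 4, 3, 2, 3]
# Same host, but the training window already contains two spam bursts.
polluted_training = [2, 3, 1, 2, 250, 3, 240, 3]

burst = 250
flagged_with_clean = is_anomalous(burst, fit_baseline(clean_training))
flagged_with_polluted = is_anomalous(burst, fit_baseline(polluted_training))
```

With the clean baseline the burst is flagged; with the polluted baseline it is not, because the infected traffic has inflated both the mean and the spread. The detector has learned the bot's behavior as part of “normal”.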

Page 41

Network Edge Detection

Detecting Successful Infections

A botnet infection can be modeled as a set of communication flows [Gu 2007]
Problem: what’s the infection model of a botnet?

Detecting Malicious Activity

Correlation between C&C commands and malicious activity [Gu 2008a]
How to identify C&C traffic?

• Well-known protocols (e.g., IRC, HTTP) [Gu 2008b]

• Look for malicious activity first [Wurzinger 2010]

Leverage Previous Knowledge

Detect hosts that contact the same IPs as infected machines [Coskun 2010]
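
The last idea can be illustrated with a much simplified sketch. This is not the algorithm from [Coskun 2010]; it only captures the intuition of shared contacts. The `popular_fraction` filter, the `min_shared` threshold, and the sample flows are assumptions made for the example.

```python
from collections import Counter


def suspicious_hosts(contacts: dict[str, set[str]], infected: set[str],
                     min_shared: int = 2,
                     popular_fraction: float = 0.5) -> dict[str, int]:
    """Score non-infected hosts by how many rare destinations they share
    with hosts already known to be infected."""
    popularity = Counter(dst for dsts in contacts.values() for dst in dsts)
    limit = popular_fraction * len(contacts)
    # Ignore destinations that many hosts contact (e.g., popular services).
    infected_dsts = {dst
                     for host in infected
                     for dst in contacts.get(host, set())
                     if popularity[dst] <= limit}
    scores = {}
    for host, dsts in contacts.items():
        if host in infected:
            continue
        shared = len(dsts & infected_dsts)
        if shared >= min_shared:
            scores[host] = shared
    return scores


# Hypothetical flow summary: internal host -> external IPs it contacted.
contacts = {
    "10.0.0.1": {"203.0.113.5", "203.0.113.9", "198.51.100.1"},
    "10.0.0.2": {"203.0.113.5", "203.0.113.9", "198.51.100.1"},
    "10.0.0.3": {"198.51.100.1", "192.0.2.80"},
    "10.0.0.4": {"198.51.100.1"},
}
result = suspicious_hosts(contacts, infected={"10.0.0.1"})
```

Only `10.0.0.2` is returned, with a score of 2: it shares two rarely contacted destinations with the known-infected host, while the destination everyone talks to is ignored. Filtering out popular destinations is what keeps ordinary shared services from implicating every machine on the network.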

Page 42

Conclusions

Page 43

How About the Future?

The arms race between researchers and cybercriminals is far from being over

Is security research like fighting the Hydra?

Page 44

Future Directions

Botmasters will keep developing more sophisticated techniques

However, a functional botnet has to interact with legitimate services

• DNS servers

• SMTP servers

• Web servers

• Social Networks

This interaction cannot be obfuscated!

Page 45

My Research

In my research, I focus on analyzing how bots interact with legitimate, third party services

Bots can be distinguished from real users in the way they use such services

The main reason is that bots have a different goal than real users:

Fast interaction vs. Good user experience

Page 46

My Research

So far, I have been looking at:

Social Networks

• How fake accounts differ from legitimate ones [ACSAC 2010]

• How user behavior changes once an account is compromised [In submission]

SMTP servers
Distinguishing bots:

• based on the destinations they target [USENIX 2011]

• based on the (wrong) way in which they implement SMTP [Work in progress]

Page 47

My Research

Other interesting areas:

• Login patterns on Social Networks

• Interaction with search engines (e.g., SEO)

What if bots started behaving like legitimate users / programs?

This conflicts with their goal!

Page 48

Thanks!

email: [email protected]
twitter: @gianlucaSB