2014.7.9 detecting p2 p botnets through network behavior analysis and machine learning

Detecting P2P Botnets through Network Behavior Analysis and Machine Learning Sherif Saad, Issa Traore et al. 2011 PST (Ninth Annual International Conference on Privacy, Security and Trust)

Upload: ericsuboy

Post on 09-Jun-2015

DESCRIPTION

lab presentation

TRANSCRIPT

Detecting P2P Botnets through Network Behavior Analysis and Machine Learning

Sherif Saad, Issa Traore et al., 2011 PST (Ninth Annual International Conference on Privacy, Security and Trust)

Outline

• Introduction
• Related Work
• Network Behavior Analysis
• Experiment and Evaluation
• Conclusion

Introduction

• IRC- and HTTP-based botnets are vulnerable because they rely on highly centralized architectures.
• The current trend in botnet communication is toward peer-to-peer (P2P) architectures.
• The bot master can inject commands into any part of a P2P botnet.

Centralized architecture

Decentralized architecture

Botnet Lifecycle

• Leonard et al. divided the botnet lifecycle into three phases: formation, C&C communication, and attack.
• Most recent research detects botnets during the formation or the attack phase.
• This paper focuses on detecting bots during the C&C phase.

Formation Phase

[Diagram labels: web browsing, etc.; injection, unwanted download; binary; compromised; binary server]

C&C Communication Phase

[Diagram labels: propagate instructions; periodic connection, update status; compromised; C&C server]

Attack Phase

[Diagram labels: DDoS attack, spread spam, or steal personal user information; compromised computers; victim]

Related Work

• Several studies have shown that network traffic identification can effectively distinguish between different classes of network applications.
• Much of the recent literature in this field focuses on analyzing P2P botnets.

Using Network Behavior Analysis to Detect Botnets

• Bots can be detected during any phase of their lifecycle.
• It is less expensive than other approaches, such as deep payload analysis or capturing and studying live bots with honeynets.

Detecting Bots During C&C Phase

• Detection during the C&C phase catches bots that were missed during the formation phase, before they launch an attack and cause damage.

Network Behavior Analysis

• In general, there are three categories of network traffic identification methods:
  • Port-based analysis
  • Protocol-based analysis
  • Behavior-based analysis
• Network traffic information can usually be retrieved easily from various network devices without significantly affecting network performance or service availability.
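
As a minimal illustration of the first category, port-based analysis guesses the application from the destination port alone. The port table and the function name `classify_by_port` are assumptions made for this sketch and do not come from the paper. The sketch also hints at why the method is weak for P2P traffic, which often runs on arbitrary ports.

```python
# Hypothetical port table, for illustration only.
WELL_KNOWN_PORTS = {
    25: "smtp",
    53: "dns",
    80: "http",
    443: "https",
    6667: "irc",
}


def classify_by_port(dst_port: int) -> str:
    """Guess the application from the destination port alone."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


if __name__ == "__main__":
    # A P2P client listening on an arbitrary high port is not recognized.
    for port in (80, 6667, 51413):
        print(port, classify_by_port(port))
```

Any traffic on a port missing from the table falls through to "unknown", which is one reason behavior-based analysis is attractive for P2P botnets.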

Network Behavior Analysis

• Each major existing botnet (for instance, Storm and Zeus) implements its own specific C&C architecture.
• Such architectures tend to exhibit distinguishing behaviors that can be captured by analyzing network traffic characteristics.
• Specific traffic characteristics can therefore be used to distinguish botnet traffic from other network application traffic.

Traffic Characteristics

• Payload size
• Number of packets
• Duplicated packet length
• Concurrent active ports

Feature Selection

• Flow-based features
  • Used to link flows to a specific class of network traffic, such as P2P or non-P2P traffic.
• Host-based features
  • Occur in the communications between hosts.
  • Identify hosts with shared communication patterns.
• 17 features extracted.

Flow-Based Features

• Source IP, Source Port, Destination IP, Destination Port, Protocol.
• Packet Length, Average Packet Length, Length of First Packet.
• Total Number of Packets per Flow.
• Total Number of Bytes per Flow.
• Incoming Packets over Outgoing Packets.
• Packets of Same Length over Total Number of Packets in Same Flow.
• Total Bytes of All Packets over Total Number of Packets in Same Flow.
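
The flow-based features listed above could be computed roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the `Packet` record and the function name are hypothetical, "outgoing" is taken to mean the direction of the first packet seen in a flow, and "packets of same length" is interpreted as the count of the most frequent packet length.

```python
from collections import Counter
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; length is in bytes."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    length: int


def extract_flow_features(packets):
    """Group packets into bidirectional flows and compute per-flow features."""
    flows = {}  # 5-tuple of the first packet -> list of (packet, is_outgoing)
    for p in packets:
        fwd = (p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.protocol)
        rev = (p.dst_ip, p.dst_port, p.src_ip, p.src_port, p.protocol)
        if fwd in flows:
            flows[fwd].append((p, True))
        elif rev in flows:
            flows[rev].append((p, False))
        else:
            flows[fwd] = [(p, True)]  # first packet defines "outgoing"

    features = {}
    for key, entries in flows.items():
        lengths = [p.length for p, _ in entries]
        total = len(lengths)
        outgoing = sum(1 for _, is_out in entries if is_out)  # always >= 1
        incoming = total - outgoing
        # Assumption: "same length" means the most frequent packet length.
        same_length = Counter(lengths).most_common(1)[0][1]
        features[key] = {
            "first_packet_length": lengths[0],
            "total_packets": total,
            "total_bytes": sum(lengths),
            "avg_packet_length": sum(lengths) / total,
            "in_out_ratio": incoming / outgoing,
            "same_length_ratio": same_length / total,
        }
    return features


if __name__ == "__main__":
    demo = [
        Packet("10.0.0.1", 1234, "10.0.0.2", 80, "TCP", 100),
        Packet("10.0.0.2", 80, "10.0.0.1", 1234, "TCP", 300),
    ]
    for key, feats in extract_flow_features(demo).items():
        print(key, feats)
```

Note that total bytes over total packets is the same quantity as the average packet length, so the sketch reports it once.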

Host-Based Features

• Ratio of the Number of Source Ports to the Number of Destination Ports.
• The Number of Connections over the Number of Destination IPs.
• The Sum of Different Transmission Protocols Used per Destination IP over the Total Number of Destination IPs.
• The Number of Destination IPs Connected to the Same Open Port in the Monitored Host over the Total Number of Open Ports in the Monitored Host.
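
The four host-based ratios above might be computed along these lines. The flow tuple layout, the function name `host_features`, and the dictionary keys are assumptions for this sketch. In particular, the last feature is interpreted here as the number of distinct remote IPs per local port, summed and divided by the number of local ports in use; the slides do not define it more precisely.

```python
from collections import defaultdict


def host_features(host_ip, flows):
    """Compute host-based ratios for one monitored host.

    flows: iterable of (src_ip, src_port, dst_ip, dst_port, protocol).
    Only flows that originate from host_ip are considered.
    """
    src_ports, dst_ports, dst_ips = set(), set(), set()
    protocols_per_dst = defaultdict(set)     # dst_ip -> protocols used
    peers_per_local_port = defaultdict(set)  # local port -> remote IPs
    connections = 0

    for src_ip, src_port, dst_ip, dst_port, protocol in flows:
        if src_ip != host_ip:
            continue
        connections += 1
        src_ports.add(src_port)
        dst_ports.add(dst_port)
        dst_ips.add(dst_ip)
        protocols_per_dst[dst_ip].add(protocol)
        peers_per_local_port[src_port].add(dst_ip)

    if connections == 0:
        raise ValueError("no flows originate from the monitored host")

    n_dst = len(dst_ips)
    return {
        "src_to_dst_port_ratio": len(src_ports) / len(dst_ports),
        "connections_per_dst_ip": connections / n_dst,
        "protocols_per_dst_ip":
            sum(len(s) for s in protocols_per_dst.values()) / n_dst,
        "peers_per_open_port":
            sum(len(s) for s in peers_per_local_port.values())
            / len(peers_per_local_port),
    }


if __name__ == "__main__":
    demo = [
        ("10.0.0.5", 4000, "1.1.1.1", 80, "TCP"),
        ("10.0.0.5", 4001, "1.1.1.1", 53, "UDP"),
    ]
    print(host_features("10.0.0.5", demo))
```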

Experiment

• Datasets
  • Malware traffic
    • From the French chapter of the Honeynet Project, involving the Storm and Waledac botnets.
    • These traces contain none of the regular benign traffic that would typically occur in a real-world scenario.
  • Non-malicious traffic
    • Labeled dataset from the Traffic Lab at Ericsson Research in Hungary.
    • User-generated normal traffic.
• The traffic in the dataset should be intermixed, as if both kinds of traffic were generated at the same time by the same machines.

Malware Network Traffic

• The trace file corresponds to the C&C and attack phases of the Storm and Waledac botnets, as the bot master used this machine to spread spam.

Non-Malicious Traffic

• Contains over a million packets of general traffic, ranging from web browsing to P2P traffic and gaming (such as World of Warcraft).
• Every packet is labeled with the originating or target process running on the test machines.

Datasets Merging

• Map the IP addresses of the infected machines to two of the machines in the benign dataset.
• Replay all of the trace files with the Tcpreplay tool on the same network interface card.
• Use a capture tool, such as Wireshark, to listen on the network interface and write the output to a file.
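
The merging steps above rely on live replay and capture. As a simplified offline analogue, and explicitly not the replay-based procedure the authors used, the sketch below remaps infected-host addresses onto benign hosts and interleaves two traces by timestamp. The record layout `(timestamp, src_ip, dst_ip, label)`, the function names, and all addresses are hypothetical.

```python
import heapq


def remap_ips(records, ip_map):
    """Rewrite source and destination addresses according to ip_map."""
    for ts, src, dst, label in records:
        yield (ts, ip_map.get(src, src), ip_map.get(dst, dst), label)


def merge_traces(benign, malicious, ip_map):
    """Interleave two timestamp-sorted traces into one.

    Both inputs must already be sorted by timestamp. Addresses in the
    malicious trace are remapped first so that the infected hosts appear
    to be machines from the benign trace.
    """
    remapped = remap_ips(malicious, ip_map)
    return list(heapq.merge(benign, remapped, key=lambda rec: rec[0]))


if __name__ == "__main__":
    benign = [
        (0.0, "192.168.1.10", "8.8.8.8", "benign"),
        (2.0, "192.168.1.11", "8.8.4.4", "benign"),
    ]
    malicious = [(1.0, "172.16.0.2", "5.5.5.5", "storm")]
    # Hypothetical mapping: the infected honeypot becomes a benign machine.
    for rec in merge_traces(benign, malicious, {"172.16.0.2": "192.168.1.10"}):
        print(rec)
```

Unlike a real replay, this keeps the original timestamps and does not reproduce timing effects of the network interface.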

Evaluation

• Parse the network traffic dataset and extract 129,453 feature vectors, labeled into three classes: botnet C&C, non-P2P traffic, and normal P2P traffic.
• Use 10-fold cross-validation and machine learning tools such as Weka to evaluate the approach.
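
The 10-fold cross-validation procedure can be sketched in plain Python as follows. The authors used Weka; this is only an illustration of the procedure itself. The function names, the stratified fold assignment, and the nearest-centroid classifier used as a stand-in model are all assumptions, and the data in the demo is synthetic.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds with similar class proportions."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def train_centroids(X, y):
    """Stand-in classifier: one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] += 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest to x."""
    return min(
        model,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(model[lab], x)),
    )


def cross_validate(X, y, train_fn, predict_fn, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, return accuracy."""
    correct = 0
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict_fn(model, X[i]) == y[i] for i in test_idx)
    return correct / len(y)


if __name__ == "__main__":
    # Synthetic one-dimensional data for the three classes named above.
    X = [[float(i)] for i in range(30)]
    X += [[100.0 + i] for i in range(30)]
    X += [[1000.0 + i] for i in range(30)]
    y = ["botnet_cc"] * 30 + ["non_p2p"] * 30 + ["normal_p2p"] * 30
    print(cross_validate(X, y, train_centroids, predict_centroid))
```

Every sample is used for testing exactly once, so the returned accuracy is computed over the whole dataset.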

Conclusion

• The authors design a model that uses network traffic characteristics to detect P2P botnets (Storm and Waledac).
• They evaluate five popular machine learning algorithms for classifying malicious traffic.