Botnet Detection Algorithms and Techniques
What Is a Botnet?
! Bot: a malware instance that runs autonomously on a compromised computer without the owner's consent.
! Botnet (bot army): a network of bots controlled by a botmaster.
! Botmaster: the individual who architects and controls the botnet.
! C&C (command-and-control) server: the computer used to coordinate the actions of bot-infected computers.
Motivations
! Botnets enable the following threats:
  ! DDoS
  ! Spam
  ! Click fraud
  ! Information theft
  ! Phishing attacks
  ! Malware distribution
! Up to 25% of the world's Internet-connected computers may be part of a botnet (Vint Cerf)
! Reports are alarming:
  ! McAfee Q4 report (2012)
  ! Internet Security Threat Report 2014
! Traditional signature-based detection approaches have failed.
! Signature-based approach:
  ! Collect botnet samples.
  ! Analyze the samples.
  ! Extract their behavior.
  ! Generate and deploy a detection model.
! The problem is hard:
  ! There is no general definition of botnet behavior.
  ! Attackers have a great deal of freedom.
! A new approach is needed: machine learning to the rescue.
Challenges
! Selection of a network monitoring tool
! Feature selection
! Machine learning algorithm selection
! False positives
! Fast flux (rapidly rotating DNS records that hide the C&C servers)
C&C Detection Approaches
! Statistical approach:
  ! Monitor traffic to extract sample data, or use an existing data set.
  ! Extract relevant features.
  ! Choose the machine learning algorithms and tools.
  ! Train the system on the data (some heuristics can be applied during training).
  ! Generate and deploy the detection model.
! Behavioral approach:
  ! Monitor an individual host to identify host-specific behavior, e.g. surfing habits.
  ! Apply heuristics or machine learning algorithms.
  ! Generate the host's behavior pattern.
! Correlation approach:
  ! Flow-based correlation: correlate traffic flows.
  ! Flow and behavioral correlation: correlate traffic flows and host activities.
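The "extract relevant features" step of the statistical approach can be illustrated with a small sketch. It assumes packets are already available as simple records; the `Packet` tuple, its field names, and the chosen features are hypothetical and do not come from any particular monitoring tool. Packets are grouped into flows by 5-tuple and a few per-flow features are computed.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical packet record; field names are assumptions for illustration.
    ts: float    # capture timestamp in seconds
    src: str
    sport: int
    dst: str
    dport: int
    proto: str
    size: int    # packet size in bytes


def flow_features(packets):
    """Group packets by 5-tuple and compute simple per-flow features."""
    flows = defaultdict(list)
    for p in packets:
        flows[(p.src, p.sport, p.dst, p.dport, p.proto)].append(p)

    features = {}
    for key, pkts in flows.items():
        pkts.sort(key=lambda p: p.ts)
        total_bytes = sum(p.size for p in pkts)
        features[key] = {
            "packets": len(pkts),
            "bytes": total_bytes,
            "duration": pkts[-1].ts - pkts[0].ts,
            "mean_packet_size": total_bytes / len(pkts),
        }
    return features


# Made-up traffic: one three-packet flow and one single-packet flow.
packets = [
    Packet(0.0, "10.0.0.5", 40000, "198.51.100.7", 6667, "tcp", 60),
    Packet(1.0, "10.0.0.5", 40000, "198.51.100.7", 6667, "tcp", 80),
    Packet(2.0, "10.0.0.5", 40000, "198.51.100.7", 6667, "tcp", 100),
    Packet(0.5, "10.0.0.9", 51000, "203.0.113.2", 80, "tcp", 500),
]
feats = flow_features(packets)
```

Each resulting feature dictionary would become one row of the training data for the chosen learning algorithm.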
Machine Learning Algorithms
! C4.5
! Random Forest
! Support Vector Machine (SVM)
! Artificial Neural Network (ANN)
! K Nearest Neighbors
Statistical approach: C4.5 Algorithm
! Decision tree algorithm.
! Developed by Ross Quinlan (1993).
! Uses entropy (a measure of impurity) and information gain, normalized as the gain ratio, to build the tree.
! Used for C&C detection: references [1], [2], [5].
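A minimal sketch of the two quantities behind C4.5's split selection: entropy and information gain, plus the gain ratio (information gain divided by the split information). The toy attribute and labels are made up for illustration, and only categorical attributes are handled here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_stats(values, labels):
    """Return (information gain, gain ratio) for a categorical attribute."""
    n = len(labels)
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Weighted entropy that remains after splitting on the attribute.
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Toy data (made up): protocol of each flow and its class label.
protocol = ["irc", "irc", "http", "http"]
label = ["bot", "bot", "benign", "benign"]
gain, ratio = split_stats(protocol, label)
```

Here the attribute separates the two classes perfectly, so the split removes all of the label entropy. An attribute unrelated to the label would give a gain of zero.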
Statistical approach: Random Forest
! Ensemble classifier using many decision tree models
! Can be used for classification or regression
! Given a data set, it builds K decision trees, each trained on a random subset of the data.
! During prediction, the random forest takes a majority vote: the class predicted most often wins.
! Used in botnet detection: reference [2].
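The two ideas above, random subsets and majority voting, can be sketched in plain Python. This is a deliberate simplification: each "tree" is a one-level decision stump on one randomly chosen feature, whereas a real random forest grows full trees. The function names and the toy flow features are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature):
    """Fit a one-level tree: a threshold on one feature, majority label per side."""
    values = sorted({r[feature] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # midpoint between consecutive distinct values
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, l_lab, r_lab)
    if best is None:  # feature is constant in this sample
        only = majority(labels)
        return (feature, 0.0, only, only)
    return (feature, best[1], best[2], best[3])


def train_forest(rows, labels, k, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, with replacement
        feature = rng.randrange(n_features)          # random feature for this stump
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Made-up flow features: (packets per flow, mean packet size).
rows = [(1, 10), (2, 12), (3, 11), (20, 100), (22, 110), (25, 105)]
labels = ["benign", "benign", "benign", "bot", "bot", "bot"]
forest = train_forest(rows, labels, k=25)
```

An odd K avoids tied votes in the two-class case.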
Statistical approach: Support Vector Machine
! Binary machine learning classifier.
! Builds a hyperplane that optimally separates the data samples with maximal margin.
! Can be extended to K-class classification by constructing K two-class classifiers.
! Uses a non-linear mapping (kernel) to solve non-linear classification problems.
! Used for detection: references [3], [4].
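A minimal sketch of the linear, two-class case only. It is trained by plain subgradient descent on the regularized hinge loss rather than by the quadratic-programming solvers that SVM libraries use, and it omits the non-linear mapping. The hyperparameters and the toy data, assumed to be centered around zero, are illustrative choices.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + hinge loss, one sample at a time.

    Labels in y must be -1 or +1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Sample is inside the margin: move the hyperplane toward it.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regularizer contributes: shrink w slightly.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Made-up, centered features; -1 = benign flow, +1 = C&C flow.
X = [(-2.0, -1.0), (-1.0, -2.0), (-2.0, -2.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

A K-class version would train K such classifiers, one per class against the rest, and pick the class whose classifier gives the largest score.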
Statistical approach: Artificial Neural Network
! Machine learning algorithm used for classification.
! Modeled on the brain and the nervous system.
! Composed of many “neurons” that cooperate to perform the desired function.
! Used for detection: reference [2].
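To show how several neurons cooperate, here is a toy forward pass through a two-layer network that computes XOR, a function no single neuron of this kind can compute. The weights are hand-picked for illustration; in practice they would be learned from training data, typically by backpropagation.

```python
def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum followed by a step activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def xor_network(x1, x2):
    # Hidden layer: one neuron fires on "x1 OR x2", the other on "x1 AND x2".
    h_or = neuron((x1, x2), (1.0, 1.0), -0.5)
    h_and = neuron((x1, x2), (1.0, 1.0), -1.5)
    # Output neuron: fires when OR is on but AND is off, i.e. XOR.
    return neuron((h_or, h_and), (1.0, -1.0), -0.5)


# Inputs in the order (0,0), (0,1), (1,0), (1,1).
outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
# outputs == [0, 1, 1, 0]
```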
Statistical approach: K Nearest Neighbors
! Machine learning classifier.
! Classifies an instance by finding its k nearest neighbors.
! Picks the most common class among those neighbors.
! Used for detection: reference [2].
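The two steps above fit in a few lines. This sketch assumes Euclidean distance and numeric feature vectors; the training points are made up.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs.

    Returns the most common label among the k training points nearest to query.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Made-up flow features with known labels.
train = [
    ((1.0, 1.0), "benign"),
    ((1.5, 2.0), "benign"),
    ((2.0, 1.0), "benign"),
    ((8.0, 8.0), "bot"),
    ((9.0, 8.5), "bot"),
    ((8.5, 9.0), "bot"),
]
prediction = knn_predict(train, (8.2, 8.4), k=3)
```

Because the distance is computed on raw values, features on very different scales would normally be normalized first.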
Behavioral approaches
! Monitor an individual host to identify host-specific behavior, e.g. surfing habits.
! Apply heuristics or machine learning algorithms.
! Generate the host's behavior pattern.
! Used for detection: reference [6].
Correlation Approach
! Flow correlation
  ! Correlation-based approach that correlates network traffic flows with C&C flows.
  ! Reference [7].
! Flow and activity correlation
  ! Combines flow correlation with activity-based correlation of hosts.
  ! Reference [8].
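As a rough illustration of correlating flows, and not the method of [7] or [8], the sketch below computes the Pearson correlation between the per-interval byte counts of two hosts. Hosts whose traffic rises and falls together may be responding to the same C&C commands. The host names and traffic series are made up.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_a * var_b)


# Bytes sent per time interval by three hypothetical hosts.
host_a = [10, 200, 12, 210, 9, 205]
host_b = [12, 190, 11, 220, 10, 198]      # bursts at the same times as host_a
host_c = [100, 90, 110, 95, 105, 100]     # unrelated pattern

similar = pearson(host_a, host_b)
different = pearson(host_a, host_c)
```

A detection system would apply a threshold to such scores, or cluster hosts by them, and then combine the result with host activity for the second correlation variant.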
References
[1] F. Haddadi, D. Runkel, A. N. Zincir-Heywood, and M. I. Heywood, “On Botnet Behaviour Analysis using GP and C4.5,” Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada.
[2] M. Stevanovic and J. M. Pedersen, “An efficient flow-based botnet detection using supervised machine learning,” Networking and Security Section, Department of Electronic Systems, Aalborg University, Fredrik Bajers Vej 7, DK-9220 Aalborg, Denmark.
[3] F. Haddadi and A. N. Zincir-Heywood, “Analyzing String Format-Based Classifiers For Botnet Detection: GP and SVM,” Computer Science, Dalhousie University, Halifax, NS, Canada.
[4] K. Yamauchi, Y. Hori, and K. Sakurai, “Detecting HTTP-based Botnet based on Characteristic of the C&C session using by SVM.”
[5] P. Narang, J. M. Reddy, and C. Hota, “Feature Selection for Detection of Peer-to-Peer Botnet Traffic,” Department of Computer Science & Engineering, Birla Institute of Technology and Science-Pilani, Hyderabad Campus, Shameerpet, R.R. District, A.P., India 500078.
[6] P. Wurzinger, L. Bilge, T. Holz, J. Gobel, C. Kruegel, and E. Kirda, “Automatically generating models for botnet detection,” in European Symposium on Research in Computer Security (ESORICS), 2009.
[7] T. Strayer, D. Lapsley, R. Walsh, and C. Livadas, “Botnet detection based on network behavior,” ser. Advances in Information Security. Springer, 2008, vol. 36, pp. 1–24.
[8] G. Gu, R. Perdisci, J. Zhang, and W. Lee, “BotMiner: Clustering analysis of network traffic for protocol- and structure-independent botnet detection,” in USENIX Security Symposium, 2008.
[9] S. Khattak, N. R. Ramay, K. R. Khan, A. A. Syed, and S. A. Khayam, “A Taxonomy of Botnet Behavior, Detection, and Defense.”