malware detection using machine learning

Upload: cysinfo-cyber-security-community

Post on 16-Apr-2017

Page 1: Malware Detection using Machine Learning

MALWARE DETECTION USING MACHINE LEARNING

ABHIJIT MOHANTA

Page 2: Malware Detection using Machine Learning

ABOUT PRESENTER

• Worked as a security researcher for Symantec, McAfee, and Cyphort

• Experience in reverse engineering, malware analysis, and detection

• Worked on antivirus engines and sandbox engines

Page 3: Malware Detection using Machine Learning

DISCLAIMER

I have used some content from the following sites.

References:
• analyticsvidhya.com
• datadrivensecurity.info
• home.agh.edu.pl
• neuralnetworksanddeeplearning.com
• http://www.astroml.org
• YouTube
• Google Images

Page 4: Malware Detection using Machine Learning

Malware Detection in Antivirus

How do antiviruses detect malware?
• Traditional AVs use pattern matching on static files
• They partially decrypt samples using techniques such as emulation

How does malware evade antivirus?
• It uses polymorphic packers, which evade static pattern matching

Why Machine Learning?
• Too many types of malware: bots, viruses
• Also grouped by target: stealers, POS malware, banking malware
• Too much data for humans to process
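The static pattern matching and packer evasion described on this slide can be sketched as follows. This is a toy illustration, not how any particular antivirus works: the `SIGNATURE` bytes are hypothetical, and a one-byte XOR stands in for a real polymorphic packer, which is far more elaborate.

```python
# Hypothetical byte signature; real AV signatures are far more elaborate.
SIGNATURE = b"EVIL_PAYLOAD"


def matches_signature(data: bytes, signature: bytes = SIGNATURE) -> bool:
    """Static pattern matching: look for the raw byte pattern in the file."""
    return signature in data


def xor_pack(data: bytes, key: int) -> bytes:
    """Toy stand-in for a packer: XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


original = b"MZ...header..." + SIGNATURE + b"...rest of file"
packed_a = xor_pack(original, 0x5A)   # same payload, different key ...
packed_b = xor_pack(original, 0xC3)   # ... gives different bytes on disk

print(matches_signature(original))                   # True
print(matches_signature(packed_a))                   # False: pattern is hidden
print(packed_a != packed_b)                          # True: each copy looks new
print(matches_signature(xor_pack(packed_a, 0x5A)))   # True once unpacked
```

The last line is the idea behind emulation: once the sample has been unpacked, the same static pattern matches again.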

Page 5: Malware Detection using Machine Learning

MACHINE LEARNING INTRO

• Some prerequisites: statistics, calculus, vectors, algebra

• Problems solved: classification / regression

• Types: supervised, semi-supervised, unsupervised

• What is our problem? Classification

Page 6: Malware Detection using Machine Learning

Supervised Learning
• What is it?
• Steps:
– Feature selection
– Training (provide labelled data)
– Prediction

Page 7: Malware Detection using Machine Learning

FEATURE SELECTION
• How are features selected in classification?
• A feature is some property with which you can distinguish two classes
• A feature can be represented as a vector, a Boolean, etc.
• Apple vs. orange classes:
– Features: colour, weight, shape
– Labels: apple, orange
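The three supervised-learning steps (feature selection, training on labelled data, prediction) can be sketched on the apple vs. orange example. This is a minimal illustration under stated assumptions: the feature values are made up, the features are reduced to two numbers (weight in grams and a "redness" score), and a simple nearest-centroid rule stands in for a real model. It needs Python 3.8+ for `math.dist`.

```python
from math import dist  # Python 3.8+

# Step 1: feature selection. Each fruit becomes a vector (weight_g, redness).
# All values are made up for illustration.
training_data = [
    ((150.0, 0.9), "apple"),
    ((170.0, 0.8), "apple"),
    ((140.0, 0.2), "orange"),
    ((130.0, 0.3), "orange"),
]


def train(samples):
    """Step 2: training. Average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(total / counts[label] for total in acc)
            for label, acc in sums.items()}


def predict(model, features):
    """Step 3: prediction. Pick the label whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(training_data)
print(predict(model, (160.0, 0.85)))  # apple
print(predict(model, (135.0, 0.25)))  # orange
```

In this sketch the weight dominates the distance because the features are not scaled; a real pipeline would normally normalise them first.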

Page 8: Malware Detection using Machine Learning
Page 9: Malware Detection using Machine Learning

MODEL SELECTION

Models for supervised learning:
• K-Nearest Neighbours (KNN): classification
• K-Means clustering (note: this one is unsupervised)
• SVM
• Decision Tree
• Random Forest
• Naive Bayes algorithm

Page 10: Malware Detection using Machine Learning

K-Nearest Neighbours (KNN)
• Supervised learning
• Classification algorithm
• Similarity to neighbours is measured by a distance (Euclidean, Manhattan, Minkowski)
• Euclidean distance
• Picture a circle around the point to be classified that contains its k nearest points; the majority label inside the circle wins
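A minimal KNN sketch, using only the standard library. The Minkowski distance covers the other two metrics named on the slide (p=1 is Manhattan, p=2 is Euclidean). The training points and the "clean"/"malware" labels are invented for illustration.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(training, query, k=3, p=2):
    """Label the query by majority vote among its k nearest training points."""
    neighbours = sorted(training, key=lambda item: minkowski(item[0], query, p))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-dimensional feature vectors with labels.
training = [
    ((1.0, 1.0), "clean"), ((1.5, 2.0), "clean"), ((2.0, 1.5), "clean"),
    ((6.0, 6.0), "malware"), ((6.5, 7.0), "malware"), ((7.0, 6.5), "malware"),
]

print(minkowski((0, 0), (3, 4), p=2))      # 5.0 (Euclidean)
print(minkowski((0, 0), (3, 4), p=1))      # 7.0 (Manhattan)
print(knn_predict(training, (2.0, 2.0)))   # clean
print(knn_predict(training, (6.0, 6.5)))   # malware
```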

Page 11: Malware Detection using Machine Learning

K-Means
• Unsupervised learning
• Clustering algorithm
• Given some data, we cluster the data into K groups
• In each iteration the mean value of each cluster is updated
• Each point is assigned to the nearest centre using Euclidean distance
• Ref video: https://www.youtube.com/watch?v=aiJ8II94qck
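The K-Means loop can be sketched in a few lines of plain Python: assign every point to its nearest centre, move each centre to the mean of its points, and repeat until nothing changes. The data points are invented, and the starting centres are fixed by hand here (an assumption made for reproducibility; implementations usually pick them randomly). Requires Python 3.8+ for `math.dist`.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centres, iterations=10):
    """Plain k-means: assign each point to its nearest centre, then move
    each centre to the mean of its assigned points, and repeat."""
    centres = [tuple(c) for c in centres]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for point in points:
            nearest = min(range(len(centres)), key=lambda i: dist(point, centres[i]))
            clusters[nearest].append(point)
        new_centres = []
        for centre, members in zip(centres, clusters):
            if members:
                new_centres.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centres.append(centre)  # keep the centre of an empty cluster
        if new_centres == centres:  # converged: no centre moved
            break
        centres = new_centres
    return centres, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centres, clusters = kmeans(points, centres=[(0.0, 0.0), (10.0, 10.0)])
print(centres)    # one centre near (1.33, 1.33), the other near (8.33, 8.33)
print(clusters)   # the three low points and the three high points
```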

Page 17: Malware Detection using Machine Learning

Support Vector Machines
• Classifier
• What are support vectors?
• Linearly separating hyperplane
• Margins with maximum separation
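To make "margin" and "support vectors" concrete, the sketch below does not train an SVM. It only measures the margin of two hand-picked candidate hyperplanes on made-up data and reports which samples sit closest to each one. An SVM searches for the separating hyperplane whose margin is largest; here the two candidates are simply compared.

```python
from math import sqrt


def signed_distance(w, b, x, y):
    """Distance of sample x from the hyperplane w.x + b = 0, positive when
    the sample is on the side matching its label y (+1 or -1)."""
    norm = sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def geometric_margin(w, b, samples):
    """Smallest signed distance over all samples; negative if any sample
    is on the wrong side of the hyperplane."""
    return min(signed_distance(w, b, x, y) for x, y in samples)


def support_vectors(w, b, samples, tol=1e-9):
    """Samples lying on the margin, i.e. closest to the hyperplane."""
    margin = geometric_margin(w, b, samples)
    return [x for x, y in samples if abs(signed_distance(w, b, x, y) - margin) < tol]


# Made-up, linearly separable data with labels -1 and +1.
samples = [((1.0, 1.0), -1), ((0.0, 1.0), -1), ((4.0, 4.0), +1), ((6.0, 3.0), +1)]

candidate_a = ((1.0, 1.0), -5.0)   # hyperplane x1 + x2 = 5
candidate_b = ((1.0, 0.0), -2.5)   # hyperplane x1 = 2.5

print(geometric_margin(*candidate_a, samples))   # about 2.12
print(geometric_margin(*candidate_b, samples))   # 1.5
print(support_vectors(*candidate_a, samples))    # [(1.0, 1.0), (4.0, 4.0)]
```

Both candidates separate the classes, but candidate A leaves more room on either side, which is the property the SVM optimises.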

Page 18: Malware Detection using Machine Learning

Support Vector Machines

• Ref: http://www.saedsayad.com/support_vector_machine.htm
• Videos:
– https://www.youtube.com/watch?v=1NxnPkZM9bc
– https://www.youtube.com/watch?v=5zRmhOUjjGY

Page 19: Malware Detection using Machine Learning

Decision Tree

Ref: https://databricks.com/blog/2014/09/29/scalable-decision-trees-in-mllib.html

Page 20: Malware Detection using Machine Learning

Random Forest
• Ensemble learning method
• Uses the output of multiple decision trees

Ref: https://citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics/
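The "uses the output of multiple decision trees" step can be illustrated with a majority vote. This sketch shows only the voting: the three trees are hand-written stand-ins with hypothetical feature names and thresholds, whereas a real random forest learns each tree from a random sample of the training data and a random subset of the features.

```python
from collections import Counter

# Three hand-written "trees" stand in for trained decision trees.
# Feature names and thresholds are hypothetical.


def tree_signature(sample):
    if sample["signed"]:
        return "clean"
    return "malware" if sample["entropy"] > 7.0 else "clean"


def tree_location(sample):
    return "malware" if sample["runs_from_temp"] else "clean"


def tree_size(sample):
    if sample["size_kb"] < 100 and not sample["has_icon"]:
        return "malware"
    return "clean"


FOREST = [tree_signature, tree_location, tree_size]


def forest_predict(sample, forest=FOREST):
    """Ask every tree for a label and return the majority plus the tally."""
    votes = Counter(tree(sample) for tree in forest)
    return votes.most_common(1)[0][0], dict(votes)


suspicious = {"signed": False, "entropy": 7.6, "runs_from_temp": True,
              "size_kb": 250, "has_icon": True}
benign = {"signed": True, "entropy": 5.1, "runs_from_temp": False,
          "size_kb": 900, "has_icon": True}

print(forest_predict(suspicious))  # ('malware', {'malware': 2, 'clean': 1})
print(forest_predict(benign))      # ('clean', {'clean': 3})
```

The suspicious sample is flagged even though one tree disagrees, which is the point of an ensemble: individual trees can be wrong as long as the majority is right.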

Page 21: Malware Detection using Machine Learning

Features for Malware Detection
• Static:
– Size
– Signed/unsigned
– Icon: an .exe file without an icon
– Entropy

• Behaviour:
– Process executed from %appdata% or %temp%
– Dropped file has a random name, e.g. xszsde.exe
– Process creating Run entries (autostart registry keys)
– Code injection
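Of the static features listed above, entropy is the easiest to compute directly. A minimal sketch of Shannon entropy over a file's bytes follows; packed or encrypted content tends to score close to the maximum of 8 bits per byte. The `static_features` helper and its dictionary keys are hypothetical names chosen for this example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 when all
    256 byte values are equally frequent."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


def static_features(data: bytes) -> dict:
    """Collect two of the static features from the slide into one record."""
    return {"size": len(data), "entropy": shannon_entropy(data)}


print(shannon_entropy(b"A" * 1024))         # 0.0
print(shannon_entropy(bytes(range(256))))   # 8.0
print(static_features(b"abcd"))             # {'size': 4, 'entropy': 2.0}
```

To use it on a real sample, read the file in binary mode and pass the bytes to `static_features`.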

Page 22: Malware Detection using Machine Learning

Training Sets for malware

Page 23: Malware Detection using Machine Learning

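Some Applications for Malware Traffic Detection
• DGA algorithm detection
• DGA: what is DGA? A domain generation algorithm, which malware uses to generate many candidate domain names for reaching its servers

• Features:
– N-grams
– Entropy
– Dictionary

Reference: http://datadrivensecurity.info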
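The three DGA features named on this slide can be extracted with the standard library alone. This is a sketch under stated assumptions, not the method from the referenced site: the word list is a tiny hypothetical stand-in for a real dictionary, only the leftmost label of the domain is analysed, and the dictionary feature is a simple greedy match.

```python
import math
from collections import Counter

# Tiny hypothetical word list; a real system would use a full dictionary.
DICTIONARY = {"mail", "google", "face", "book", "shop", "news", "cloud"}


def char_ngrams(text, n=2):
    """All overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def entropy(text):
    """Shannon entropy of the characters, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return sum((c / total) * math.log2(total / c) for c in Counter(text).values())


def dictionary_coverage(text, words=DICTIONARY):
    """Fraction of characters covered by dictionary words (greedy, longest first)."""
    covered, i = 0, 0
    while i < len(text):
        match = max((w for w in words if text.startswith(w, i)), key=len, default="")
        if match:
            covered += len(match)
            i += len(match)
        else:
            i += 1
    return covered / len(text) if text else 0.0


def dga_features(domain):
    name = domain.lower().split(".")[0]   # leftmost label only (an assumption)
    return {
        "length": len(name),
        "entropy": entropy(name),
        "bigrams": char_ngrams(name, 2),
        "dictionary_coverage": dictionary_coverage(name),
    }


print(dga_features("facebook.com"))   # entropy 2.75, dictionary_coverage 1.0
print(dga_features("xkqzjvwp.com"))   # entropy 3.0, dictionary_coverage 0.0
```

A human-chosen name scores lower entropy and higher dictionary coverage than the random-looking one; these numbers would then be fed to a classifier as features.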

Page 24: Malware Detection using Machine Learning

ADVANCED TOPICS
• NEURAL NETWORKS
• DEEP NEURAL NETWORKS

Page 25: Malware Detection using Machine Learning

PYTHON LIBRARIES
• Scikit-Learn
• NumPy
• Pandas
