DESCRIPTION

Machine Learning-based Malicious Adversaries Detection in an Enterprise Environment by Using Open Source Tools - talk for the Malaysian Open Source Conference 2012, 9th July 2012, Berjaya Times Square, Kuala Lumpur, Malaysia

TRANSCRIPT

Page 1: Machine Learning-based Malicious Adversaries Detection in an Enterprise Environment by Using Open Source Tools

Intro | The issues in general | Motivation | Solution | Experiments | Tools | eof()

Machine Learning-based Malicious Adversaries Detection in an Enterprise Environment by Using Open Source Tools

Muhammad Najmi Ahmad Zabidi
International Islamic University Malaysia

MOSC 2012
Berjaya Times Square, Kuala Lumpur

9th July 2012

Muhammad Najmi Ahmad Zabidi MOSC 2012 1/34


About

• I am a research graduate student at Universiti Teknologi Malaysia, Skudai, Johor Bahru, Malaysia

• My current employer is International Islamic University Malaysia, Kuala Lumpur

• Research area: malware detection, narrowing down to Windows executables

• Since 2003, I have been a Subversion (SVN) committer for the KDE localization project for the Malay language (but I rarely commit now.. need a new intern to replace me :) )


Computing world as we knew it

• Interconnected machines

• Previously less connected, now "socialized" machines

• This has brought real problems to the cyberworld


Risks

• Financial loss

• Company/government level espionage

• Privacy breach


Types of adversaries

• Spam

• Scam

• Phishing

• Malware, botnets, rootkits, etc.

• Anything else?


Spam

• Annoying

• Productivity is wasted on needless deletion of unwanted messages

• In extreme cases, it becomes difficult to find important email


Scam

• Preying on naive victims

• Sounds too good to be true, yet some people still believe it

• Organized crime/syndicates, with mules cooperating


Phishing

• Similar to a scam, but with a different tactic

• More sophisticated, and does not need a mule or a physical meetup

• Main purpose is to obtain important details (online banking login name and password) and hence access to the victim's account

• Safer for the criminal


Malware

• Safe to say, the term covers trojans, viruses, dialers, rabbits, worms and rootkits (often bundled together nowadays)

• Has been infecting computers since the 1980s; the threat became more obvious with the arrival of the Internet

• Attacks any operating system: Linux, Windows, Mac... even Android phones


Problems with adversaries detection

• Some are manually crafted, some automated

• They react relatively fast and are difficult to trace

• There are too many (spam, for example), so manual work is too time-consuming


In house analysis

• Given enough expertise, in-house analysis can be useful

• Helps maintain reputation by having an in-house group of analysts to handle incidents

• Try to minimize costs: use open source tools whenever possible


Categories

Machine Learning

• Associated with Artificial Intelligence

• Mimics human (brain) learning

• Learns through experience

• Deals with known and unknown patterns

• Overlaps with (or in some ways originated from) Data Mining and Pattern Recognition


Table 1: Differences between clustering and classification

Classification | Clustering

Deals with known data | Deals with unknown data

Supervised learning | Unsupervised learning

Popular algorithms include: Random Forest, Neural Networks, k-Nearest Neighbor, Decision Trees | Popular algorithms include: K-means, Fuzzy C-means, Gaussian mixture models

Predictive [Tan et al., 2005] | Descriptive [Tan et al., 2005]
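The contrast in Table 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two-dimensional points and their labels are made up, and the helper names `knn_predict` and `kmeans` are hypothetical, not taken from any of the tools discussed in this talk. The first function needs labels (supervised), the second works on the points alone (unsupervised).

```python
import math
from collections import Counter

def knn_predict(train, labels, point, k=3):
    """Classification (supervised): vote among the k nearest labelled points."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], point))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def kmeans(points, centroids, iterations=10):
    """Clustering (unsupervised): group unlabelled points around the centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its current members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids

# Toy data (made up): two well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

predicted = knn_predict(points, labels, (4.9, 5.0))                     # uses the labels
clusters, centers = kmeans(points, centroids=[points[0], points[3]])   # ignores the labels
print(predicted)
print(clusters)
```

Note that the clustering step only returns group numbers; deciding what each group means is still left to the analyst.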


What to look for?

• We look for patterns

• In some cases, a corpus of spam or phishing mails is already available

• We call these patterns "features"


Spam/scam

• The language being used

• For example, notifications sent through email with wording like "You have won GBP100,000,000"

• Spam bombards inboxes; some of it may come from genuine businesses, but the volume is overwhelming to handle

• Scams ask people to bank in money for untruthful reasons
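As a rough sketch of how such wording could be turned into features, the snippet below counts a handful of keywords and flags large currency amounts. The keyword list, the feature names and the sample message are all assumptions made up for illustration; in practice the keywords would come from a real corpus.

```python
import re
from collections import Counter

# Hypothetical keyword list for illustration only.
KEYWORDS = ("won", "prize", "urgent", "bank", "password")

def extract_features(text):
    """Map raw email text to a small dictionary of numeric features."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    features = {f"kw_{word}": counts[word] for word in KEYWORDS}
    # Currency amounts such as "GBP100,000,000".
    features["has_money"] = int(bool(re.search(r"\b(?:GBP|USD|EUR)\s?\d[\d,]*", text)))
    features["num_exclaim"] = text.count("!")
    features["num_tokens"] = len(tokens)
    return features

sample = "You have won GBP100,000,000! Reply with your bank details. URGENT!"
features = extract_features(sample)
print(features)
```

Each email then becomes one row of numbers, which is the form that the classification and clustering algorithms expect.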


Phishing mails

• Look at the URLs

• Current efforts, for example by PhishTank, rely on public submission and (I believe) manual verification
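A minimal sketch of "looking at the URLs", assuming we only want a few simple lexical features per link. The choice of features (IP address as host, number of dots, presence of "@") is an assumption for illustration and is not PhishTank's method; the message body is made up and uses reserved documentation addresses.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text):
    """Return every http(s) URL found in the message body."""
    return URL_PATTERN.findall(text)

def url_features(url):
    """Compute a few simple lexical features of a URL."""
    host = urlparse(url).hostname or ""
    return {
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(?:\.\d{1,3}){3}", host)),
        "num_dots_in_host": host.count("."),
        "has_at_sign": "@" in url,
        "url_length": len(url),
    }

# Made-up message body for illustration.
body = ("Please verify your account at http://192.0.2.10/login.php "
        "or visit https://www.example.com/help")
urls = extract_urls(body)
report = {u: url_features(u) for u in urls}
for u, f in report.items():
    print(u, f)
```

A raw IP address in place of a host name is a classic warning sign, but on its own it proves nothing; features like these are inputs to a classifier, not verdicts.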


Malware

• Researchers tend to look at Application Programming Interface (API) calls, and some at opcodes

• Analysis is done using either static or dynamic analysis
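One common way to turn a sequence of API calls (or opcodes) into features is to count n-grams, i.e. short runs of consecutive calls. The sketch below assumes the call trace has already been extracted by some static or dynamic analysis step; the trace itself is hypothetical, and the helper names are made up for this example.

```python
from collections import Counter

def ngrams(sequence, n=2):
    """Return consecutive n-grams (as tuples) from a sequence of call names."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

def ngram_profile(sequence, n=2):
    """Count how often each n-gram occurs; the counts serve as features."""
    return Counter(ngrams(sequence, n))

# Hypothetical trace of Windows API calls from one sample.
trace = ["CreateFileW", "WriteFile", "CloseHandle",
         "CreateFileW", "WriteFile", "RegSetValueExW"]

profile = ngram_profile(trace, n=2)
for gram, count in profile.most_common():
    print(gram, count)
```

The same functions work unchanged on a list of opcode mnemonics instead of API names.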


An example

Figure 1: Automated classification proposed by [Rieck et al., 2009]


The datasets

• Spam email research has been around for quite some time compared to the others (e.g. phishing)

• Sample datasets:
  • http://csmining.org/index.php/spam-email-datasets-.html
  • http://archive.ics.uci.edu/ml/datasets/Spambase

• Scam email is closely associated with spam, since it is also unwanted email. It might as well be categorized as "sub-spam"

• Phishing email samples:
  • Sample dataset:
    • http://phishtank.com
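Before any experiment, a dataset has to be loaded and split into training and test portions. The sketch below assumes a comma-separated file of numeric features with a 0/1 class label in the last column, which is how I understand the Spambase data file to be laid out (check the dataset's own documentation). The rows shown are made up, and the function names are hypothetical.

```python
import csv
import io
import random

def load_labelled_rows(handle):
    """Parse comma-separated rows whose last column is the class label."""
    features, labels = [], []
    for row in csv.reader(handle):
        if not row:
            continue
        features.append([float(value) for value in row[:-1]])
        labels.append(int(row[-1]))
    return features, labels

def train_test_split(features, labels, test_ratio=0.25, seed=0):
    """Shuffle the indices reproducibly and hold out a fraction for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train, test = indices[:cut], indices[cut:]
    return ([features[i] for i in train], [labels[i] for i in train],
            [features[i] for i in test], [labels[i] for i in test])

# Made-up rows in the assumed shape: numeric features, then a 0/1 label.
sample = io.StringIO("0.0,0.64,0\n0.21,0.28,1\n0.06,0.0,1\n0.0,0.0,0\n")
X, y = load_labelled_rows(sample)
X_train, y_train, X_test, y_test = train_test_split(X, y)
print(len(X_train), len(X_test))
```

With a real file, `sample` would simply be replaced by an opened file handle.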


Feature Selection/Extraction

• When analyzing, we are interested in features

• What kind of features?
  • Important keywords, strong features
  • Unimportant features are phased out as unnecessary
  • Some features might be redundant


• There are algorithms meant for this:
  • Information Gain
  • Support Vector Machine (SVM)
  • others... some may be hybrid algorithms (combining several algorithms together), also known as ensembles
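To show what Information Gain measures, here is a small worked sketch in plain Python. It uses the usual definition: the entropy of the class labels minus the weighted entropy that remains after splitting on a feature. The eight labelled emails and the two binary features are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Made-up data: one entry per email.
labels    = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "spam"]
has_won   = [1, 1, 1, 0, 0, 0, 0, 1]   # appears only in the spam emails
has_hello = [1, 0, 1, 0, 1, 0, 1, 0]   # spread evenly over both classes

gain_won = information_gain(has_won, labels)
gain_hello = information_gain(has_hello, labels)
print(gain_won, gain_hello)
```

A feature that separates the classes perfectly gets the full one bit of gain, while a feature unrelated to the class gets none, so ranking features by this score is one way to decide which ones to phase out.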


Weka | R language | Octave | Python Scipy

List of tools

• Weka

• R language

• Octave (as replacement for Matlab)

• Python SciPy with Matplotlib


Figure 2: Weka


Weka

• The results come as numbers and visualizations

• Some reading is needed to learn how to interpret them

• Test with different algorithms to find the best results
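Weka reads its data from ARFF files, so extracted features need to be written in that format before different algorithms can be tried on them. The sketch below is a minimal writer for numeric attributes plus one nominal class attribute; the function name `to_arff`, the attribute names and the two rows are assumptions for illustration, and it does not handle quoting of names or values that contain spaces.

```python
def to_arff(relation, attribute_names, rows, class_values):
    """Render numeric feature rows plus a nominal class as ARFF text."""
    lines = [f"@relation {relation}", ""]
    for name in attribute_names:
        lines.append(f"@attribute {name} numeric")
    lines.append("@attribute class {" + ",".join(class_values) + "}")
    lines += ["", "@data"]
    for values, label in rows:
        lines.append(",".join(str(v) for v in values) + "," + label)
    return "\n".join(lines) + "\n"

# Made-up feature rows: (feature values, class label).
rows = [([1, 0, 2], "spam"), ([0, 1, 0], "ham")]
arff_text = to_arff("emails", ["kw_won", "kw_bank", "num_exclaim"], rows, ["ham", "spam"])
print(arff_text)
```

Saving `arff_text` to a file with the `.arff` extension should give something that can be opened from the Weka Explorer.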


R language

• Not merely a tool, but a language in its own right

• Commonly used by data analysts


Figure 3: These books use the R language for their analyses


Octave

• Octave is an open source alternative to Matlab (MATrix LABoratory)

• Works much like Matlab does


Figure 4: Octave also has a GUI, QtOctave (discontinued)


Python Scipy

#!/usr/bin/env python
"""
Example: simple line plot.
Show how to make and save a simple line plot with labels, title and grid
"""
import numpy
import pylab

# Sample a 2 Hz cosine over one second in 0.01 s steps
t = numpy.arange(0.0, 1.0+0.01, 0.01)
s = numpy.cos(2*2*numpy.pi*t)
pylab.plot(t, s)

pylab.xlabel('time (s)')
pylab.ylabel('voltage (mV)')
pylab.title('About as simple as it gets, folks')
pylab.grid(True)
pylab.savefig('simple_plot')  # saved as simple_plot.png by default

pylab.show()


Flowchart | Conclusion

The flow

[Flowchart: Feature Selection, Feature Categorization (Clustering or Classification), Visualization. Tool labels on the stages: Weka, Octave, R; SciPy, Octave, R.]


Conclusion

• Dealing with malicious/unwanted threats from spam, scams, phishing and malware is not easy

• One sample could perhaps be handled by hand, but thousands per day is tedious

• Machine learning assists in automation

• Open source provides an alternative (free as in minimal cost) for the analysis

• In-house analysis helps protect an organization's/enterprise's reputation


Get in touch!

najmi.zabidi @ gmail.com
http://mypacketstream.blogspot.com

These slides were created with LaTeX Beamer


Bibliography

Rieck, K., Trinius, P., Willems, C., and Holz, T. (2009). Automatic analysis of malware behavior using machine learning. TU, Professoren der Fak. IV.

Tan, P.-N., Steinbach, M., and Kumar, V. (2005). Introduction to Data Mining (First Edition). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
