Metamorphic Malware 1 Metamorphic Malware Research

Post on 15-Jan-2016

TRANSCRIPT

Page 1: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Malware 1

Metamorphic Malware Research

Page 2: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Malware

Metamorphic software changes “shape”

o But each instance has the same function

o In contrast, most software is “cloned”

Metamorphism used by virus writers to evade signature detection

Lots of interesting research problems

We look at some here…

Page 3: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Research

How metamorphic are hacker-produced generators?

How to detect metamorphic viruses?

The “ultimate” metamorphic generator?

How to make metamorphic code that “carries its own generator”?

Related questions/issues?

Page 4: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Generators

To analyze metamorphic generators…

First problem is, how to compare code?

We developed a “similarity index”

o Based on extracted opcodes

o Can be represented graphically

o Also gives a numerical score

Page 5: Metamorphic Malware 1 Metamorphic Malware Research

Similarity

Suppose we want to compare exe files

o Say, file X and file Y

Extract opcodes from each

o x0, x1, …, xn and y0, y1, …, ym

Compare all 3-opcode subsequences

o If they agree (in any order) plot a point on the axes at the appropriate point

Filter noise with window of length 5

Page 6: Metamorphic Malware 1 Metamorphic Malware Research

Similarity

That is, matches of length 5 or greater are added to the score

o Lengths were determined experimentally

o Scores range from 0 to 1, where 0 == no match, 1 == perfect match

Gives us a graphical view and a score

In graph, what is a perfect match?

o Main diagonal, or segments parallel to it
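
The comparison described on the two Similarity slides can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' tool. The function name similarity is hypothetical. "Agree in any order" is read as equality of sorted 3-opcode windows, the window of length 5 is read as a minimum diagonal run length, and the final normalisation (fraction of positions covered, averaged over both files) is an assumption chosen only so that the score lands in the range 0 to 1.

```python
def similarity(x, y, n=3, window=5):
    """Return (score, points) for two opcode lists x and y."""
    # Sort each n-opcode window so subsequences that agree "in any order" compare equal.
    gx = [tuple(sorted(x[i:i + n])) for i in range(len(x) - n + 1)]
    gy = [tuple(sorted(y[j:j + n])) for j in range(len(y) - n + 1)]

    # Index the windows of y so matches can be looked up directly.
    where = {}
    for j, g in enumerate(gy):
        where.setdefault(g, []).append(j)
    raw = {(i, j) for i, g in enumerate(gx) for j in where.get(g, [])}

    # Noise filter: keep only diagonal runs of at least `window` points.
    kept = set()
    for (i, j) in raw:
        if (i - 1, j - 1) in raw:
            continue  # not the start of a diagonal run
        run = []
        while (i, j) in raw:
            run.append((i, j))
            i, j = i + 1, j + 1
        if len(run) >= window:
            kept.update(run)

    if not gx or not gy:
        return 0.0, kept
    # Assumed normalisation: covered fraction of each file, averaged.
    cover_x = len({i for i, _ in kept}) / len(gx)
    cover_y = len({j for _, j in kept}) / len(gy)
    return (cover_x + cover_y) / 2, kept


# Toy opcode sequences, for illustration only.
ops = ["mov", "push", "call", "add", "sub", "jmp", "pop", "ret", "xor", "cmp"]
same_score, same_points = similarity(ops, ops)
diff_score, _ = similarity(ops, ["nop"] * 10)
```

The set of kept points is what would be plotted. Comparing a file with itself leaves only the main diagonal, and shifted or reordered blocks of shared code show up as segments parallel to it.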

Page 7: Metamorphic Malware 1 Metamorphic Malware Research

Normal Files

Similarity of typical “normal” files

Page 8: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Generators

A typical “metamorphic” generator

Page 9: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Generators

Highly metamorphic generator

Page 10: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Generators

We measured metamorphism of metamorphic generators

What did we find?

Generally, not very metamorphic…

We did find one exception:

o Next Generation Virus Creation Kit (NGVCK)

Can we detect NGVCK viruses?

Page 11: Metamorphic Malware 1 Metamorphic Malware Research

Metamorphic Detection

We “trained” a hidden Markov model

o Based on a bunch of “family” viruses

o Using extracted opcode sequences

Then used the trained model for detection

Next, we discuss HMMs

o Other techniques could be used

o Neural nets, data mining, etc.

Page 12: Metamorphic Malware 1 Metamorphic Malware Research

Hidden Markov Models

HMMs: a machine learning technique

Widely used in speech recognition, bioinformatics, and other areas

We can train an HMM

Then use the resulting trained model to score unknown data

o High score? Data matches training data

o Low score? Does not match training data

Page 13: Metamorphic Malware 1 Metamorphic Malware Research

Hidden Markov Models

What are HMMs? Consider an example…

Suppose we want to know average annual temperature in the past

We cannot go back in time

o So what to do?

Suppose we know that tree ring size is related to temperature

Page 14: Metamorphic Malware 1 Metamorphic Malware Research

Hidden Markov Models

We consider 2 possible temperatures

o Hot (H) and cold (C)

We consider 3 tree ring sizes

o Small (S), medium (M), large (L)

Based on measurements, we find:

Page 15: Metamorphic Malware 1 Metamorphic Malware Research

HMM

Also, based on historical record:

Then transitions between hot and cold years form a Markov process (order 1)

For the past, we cannot observe temp

But, we can measure tree ring sizes

Page 16: Metamorphic Malware 1 Metamorphic Malware Research

HMMs

HMMs give us efficient algorithms to solve problems like:

o Given a series of tree ring sizes, can we say anything about temperatures?

Page 17: Metamorphic Malware 1 Metamorphic Malware Research

HMMs

The generic picture is like this…

Note, there is a Markov process

And a series of observations

Page 18: Metamorphic Malware 1 Metamorphic Malware Research

HMMs

HMM model denoted as: λ=(A,B,π)

o A is state transition matrix

o B gives probabilities of observations, depending on state of Markov process

o π contains initial state probabilities

For HMMs there are efficient algorithms to solve 3 problems

o Next slide…
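
To make the notation λ=(A,B,π) concrete, the tree ring example can be written down as plain Python lists. This is only a sketch. The numeric values of A, B and PI below are assumed for illustration and are not taken from the slides, and the helper names is_row_stochastic and sample are hypothetical.

```python
import random

STATES = ["H", "C"]        # hidden states: hot, cold
SYMBOLS = ["S", "M", "L"]  # observations: small, medium, large tree rings

# Illustrative values only (assumed, not from the slides).
A = [[0.7, 0.3],           # A[i][j] = P(next state j | current state i)
     [0.4, 0.6]]
B = [[0.1, 0.4, 0.5],      # B[i][k] = P(observation k | state i)
     [0.7, 0.2, 0.1]]
PI = [0.6, 0.4]            # PI[i] = P(initial state i)


def is_row_stochastic(matrix, tol=1e-9):
    """Each row of A, B and pi must be a probability distribution."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) < tol for row in matrix)


def sample(length, rng):
    """Generate (hidden states, observations) by running the Markov process."""
    states, observations = [], []
    s = rng.choices(range(len(PI)), weights=PI)[0]
    for _ in range(length):
        states.append(STATES[s])
        k = rng.choices(range(len(SYMBOLS)), weights=B[s])[0]
        observations.append(SYMBOLS[k])
        s = rng.choices(range(len(A)), weights=A[s])[0]
    return states, observations


hidden, rings = sample(10, random.Random(0))
```

In the tree ring setting only the rings list would be available. The hidden list is what the HMM algorithms try to recover.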

Page 19: Metamorphic Malware 1 Metamorphic Malware Research

The 3 HMM Problems

1. Given a model and observations, we can score the sequence of observations

o How well does observed data fit model?

2. Given model and observations, we can find optimal state sequence

o Here, we uncover the hidden states

3. Given observation sequence, we can train a model to best fit the data

o Only assumption is size of the A matrix
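
Problems 1 and 2 can be sketched for the tree ring example as follows. The matrices are assumed illustrative values, not values from the slides. Problem 1 is solved with the forward algorithm. For problem 2 the sketch uses the Viterbi algorithm, which finds the single most likely state path. That is one common notion of "optimal" and it is an assumption that it matches the notion intended here. The function names are hypothetical.

```python
import math

# Illustrative model (assumed values): states 0 = hot, 1 = cold.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]
PI = [0.6, 0.4]
SYM = {"S": 0, "M": 1, "L": 2}


def forward_prob(obs, A, B, pi):
    """Problem 1: P(observations | model) via the forward algorithm."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(obs, A, B, pi):
    """Problem 2 (one notion of optimal): most likely single state path.

    Works in log space; assumes all model probabilities are nonzero.
    """
    n = len(pi)
    delta = [math.log(pi[i] * B[i][obs[0]]) for i in range(n)]
    back = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor of each state j at this step.
        ptr = [max(range(n), key=lambda i: prev[i] + math.log(A[i][j]))
               for j in range(n)]
        delta = [prev[ptr[j]] + math.log(A[ptr[j]][j]) + math.log(B[j][o])
                 for j in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(back):  # follow back-pointers to the first step
        state = ptr[state]
        path.append(state)
    return path[::-1]


rings = [SYM[s] for s in "SMSL"]
likelihood = forward_prob(rings, A, B, PI)
best_path = viterbi(rings, A, B, PI)
```

For long observation sequences the raw product in forward_prob underflows, so practical implementations rescale at each step and work with log-likelihoods.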

Page 20: Metamorphic Malware 1 Metamorphic Malware Research

HMM Training: English Text Example

Assuming 2 hidden states

Here, we show the B matrix…

Page 21: Metamorphic Malware 1 Metamorphic Malware Research

HMMs and Metamorphic Generators

So, what’s the game plan?

a) Extract opcodes from several metamorphic viruses from same family

b) Train HMM model on these opcodes (problem 3 from the earlier slide)

c) Given unknown file, score extracted opcodes using the trained HMM model (problem 1)
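
Steps b) and c) can be sketched in pure Python as Baum-Welch training followed by forward scoring. This is a toy sketch under stated assumptions, not the authors' implementation. The opcode alphabet and sequences are made up, the family's opcodes are treated as one sequence, and the small floor added to B (so that opcodes never seen in training keep a nonzero probability) and the per-opcode normalisation of the score are both assumptions. The function names are hypothetical.

```python
import math
import random


def forward_scaled(obs, A, B, pi):
    """Scaled forward pass: normalised alphas and the per-step scale factors."""
    n = len(pi)
    alphas, scales, prev = [], [], None
    for t, o in enumerate(obs):
        if t == 0:
            a = [pi[i] * B[i][o] for i in range(n)]
        else:
            a = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
        c = sum(a)
        prev = [v / c for v in a]
        alphas.append(prev)
        scales.append(c)
    return alphas, scales


def score(obs, model):
    """Problem 1: log-likelihood per opcode, so files of different length compare."""
    A, B, pi = model
    _, scales = forward_scaled(obs, A, B, pi)
    return sum(math.log(c) for c in scales) / len(obs)


def normalise(row):
    total = sum(row)
    return [v / total for v in row]


def train(obs, n_states, n_symbols, iters=30, seed=0, floor=1e-6):
    """Problem 3: Baum-Welch re-estimation on one opcode sequence."""
    rng = random.Random(seed)

    def rand_row(k):  # near-uniform start; exact uniform would never separate states
        return normalise([1.0 + 0.1 * rng.random() for _ in range(k)])

    A = [rand_row(n_states) for _ in range(n_states)]
    B = [rand_row(n_symbols) for _ in range(n_states)]
    pi = rand_row(n_states)
    T, N = len(obs), range(n_states)
    for _ in range(iters):
        alphas, c = forward_scaled(obs, A, B, pi)
        # Backward pass, using the same scale factors as the forward pass.
        betas = [[1.0] * n_states for _ in range(T)]
        for t in range(T - 2, -1, -1):
            betas[t] = [sum(A[i][j] * B[j][obs[t + 1]] * betas[t + 1][j]
                            for j in N) / c[t + 1] for i in N]
        gam = [[alphas[t][i] * betas[t][i] for i in N] for t in range(T)]
        new_A = [[0.0] * n_states for _ in N]
        for t in range(T - 1):
            for i in N:
                for j in N:
                    new_A[i][j] += (alphas[t][i] * A[i][j] * B[j][obs[t + 1]]
                                    * betas[t + 1][j] / c[t + 1])
        new_B = [[floor] * n_symbols for _ in N]  # assumed smoothing floor
        for t in range(T):
            for j in N:
                new_B[j][obs[t]] += gam[t][j]
        A = [normalise(r) for r in new_A]
        B = [normalise(r) for r in new_B]
        pi = normalise(gam[0])
    return A, B, pi


# Toy data, for illustration only.
OPCODES = {"mov": 0, "push": 1, "call": 2, "nop": 3}
family = [OPCODES[o] for o in ["mov", "push", "call"] * 60]
model = train(family, n_states=2, n_symbols=len(OPCODES))
family_like = score([0, 1, 2] * 20, model)
unrelated = score([3, 3, 0, 3] * 15, model)
```

A file would be flagged as a family member when its score exceeds a threshold chosen from held-out family and normal files. The threshold itself has to be determined experimentally.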

Page 22: Metamorphic Malware 1 Metamorphic Malware Research

HMM Detection of NGVCK

Trained model works for detection

Effective to the point of practical…

Page 23: Metamorphic Malware 1 Metamorphic Malware Research

Why Does this Work?

NGVCK viruses are highly metamorphic

But they have some common statistical properties

o These are automatically extracted by HMM

NGVCK differs from normal code

o So HMM can distinguish between the two

How to make a “better” metamorphic generator?

Hold that thought…

Page 24: Metamorphic Malware 1 Metamorphic Malware Research

What Next?

Can we extract opcodes (or approximation) efficiently?

Are “profile hidden Markov models” better?

Similarity index for detection?

o Better ways to measure similarity?

o Statistical tests versus similarity?

HMMs to detect the “undetectable”?

HMM compared to other proposed methods?

Metamorphism for software watermarking?

Page 25: Metamorphic Malware 1 Metamorphic Malware Research

Ultimate Metamorphic?

How to evade signature detection and HMM detection?

o Metamorphic code evades signature detection

o But how to also evade HMM detection?

Make the code highly metamorphic and similar to normal code

o Then trained HMM will confuse the two

Page 26: Metamorphic Malware 1 Metamorphic Malware Research

Ultimate Metamorphic?

Insert dead code from normal programs

Before After

Page 27: Metamorphic Malware 1 Metamorphic Malware Research

What Now?

How to detect the “ultimate” metamorphic generator?

o Remove the dead code

How to remove dead code?

o Emulation can help, but…

Can we “improve” the generator?

Can we improve the detection?

Can we say something more general?

Page 28: Metamorphic Malware 1 Metamorphic Malware Research

References

Revealing introduction to HMMs

Hunting for metamorphic engines

Profile hidden Markov models

Approximate disassembly

Detecting “undetectable” metamorphic viruses

Hunting for undetectable metamorphic viruses

And lots more work in progress…