CS 510 MALWARE

GHOST TURNS ZOMBIE: EXPLORING THE LIFE CYCLE OF WEB-BASED MALWARE

MICHALIS POLYCHRONAKIS, PANAYIOTIS MAVROMMATIS, NIELS PROVOS

Upload: georgina-bruce

Post on 23-Dec-2015



2

Introduction

• The underground Internet economy
• Web-based malware
• The system analyzing the post-infection network behavior of web-based malware
• How do malware’s behaviors, taken together, provide a compelling perspective on the life cycle of web-based malware?

3

System Architecture

The goal of the system: detect harmful URLs on the web

Brief overview of the overall system used in the authors' prior work: machine learning techniques find suspicious URLs among a large number of web pages, and those URLs are then verified in a virtual machine

The new extended system: responders

System Architecture

4

Overall system architecture

o Virtual machine used for verification
o Observed features:
  • Links to known malware distribution sites
  • Suspicious HTML elements
  • The presence of code obfuscation
o Machine learning system
  • Scores each URL; a URL with a high score is sent for verification
o Verification results are used to retrain the machine learning system
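The scoring step on this slide can be sketched roughly as follows. This is only an illustration, not the authors' classifier: the weights and the threshold are hypothetical, and a plain weighted sum stands in for the machine learning model, which the slides do not describe.

```python
from dataclasses import dataclass

# Hypothetical integer weights (out of 100); not taken from the slides.
WEIGHTS = {
    "links_to_known_malware_site": 60,
    "suspicious_html_element": 25,
    "obfuscated_code": 15,
}
THRESHOLD = 50  # assumed cut-off for sending a URL to the virtual machine


@dataclass
class PageFeatures:
    """The three observed features listed on the slide."""
    links_to_known_malware_site: bool
    suspicious_html_element: bool
    obfuscated_code: bool


def score(features: PageFeatures) -> int:
    """Sum the weights of the features that are present (0 to 100)."""
    return sum(w for name, w in WEIGHTS.items() if getattr(features, name))


def needs_verification(features: PageFeatures) -> bool:
    """High-scoring URLs are queued for verification in a virtual machine."""
    return score(features) >= THRESHOLD
```

In the real system the verification verdicts feed back into training; in this sketch that would correspond to adjusting WEIGHTS from the labeled results.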

5

System Architecture

The authors extended the system by adding light-weight responders to the verification component

Responders provide fabricated responses for protocols such as SMTP, FTP, and IRC

An HTTP proxy records all HTTP requests and scans all HTTP responses

A generic responder hands off connections made over nonstandard ports and identifies connections that use unknown protocols
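One way a generic responder could decide where to hand off a connection is to look at the first bytes the client sends. The sketch below is an assumption about how such a check might look, not the authors' implementation. The command lists are illustrative and incomplete, and FTP is left out because its USER command also occurs in IRC. A real SMTP client also normally waits for the server banner before sending anything.

```python
# Illustrative command prefixes; real responders would need fuller lists.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"OPTIONS ")
SMTP_COMMANDS = (b"HELO ", b"EHLO ")
IRC_COMMANDS = (b"NICK ", b"JOIN ", b"PRIVMSG ")


def guess_protocol(first_bytes: bytes) -> str:
    """Guess the protocol from the first bytes sent by the client."""
    head = first_bytes.lstrip().upper()
    if head.startswith(HTTP_METHODS):
        return "http"
    if head.startswith(SMTP_COMMANDS):
        return "smtp"
    if head.startswith(IRC_COMMANDS):
        return "irc"
    # Anything else is recorded as an unknown protocol.
    return "unknown"
```

Connections classified as "unknown" are the ones the slide describes as using unknown protocols.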

Responders

6

Responders: network flow in the verification component

7

Life cycle of web-based malware

o Malware’s interactions with other hosts and responders are organized into 3 categories:
  1. Propagation
  2. Data exfiltration
  3. Remote control
o The authors analyzed the post-infection activity and the results of these behaviors to characterize the life cycle of web-based malware

8

Life cycle of web-based malware

Data Set

Over 2 months, the virtual machines analyzed URLs from 5,756,000 unique hostnames; results are reported per unique hostname
At least one harmful URL was found on 307,000 hostnames
49% of these sites had URLs that resulted in HTTP requests initiated by a process other than the web browser
5% of the sites had URLs that activated a responder session
The total number of responder sessions with transmitted data was more than 448,000
In many more cases, malware made network connections without transmitting any data
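The percentages above translate into approximate absolute numbers as follows. The inputs are the rounded figures from the slide, and the sketch assumes that the 5% figure refers to the same 307,000 harmful hostnames as the 49% figure.

```python
total_hostnames = 5_756_000    # unique hostnames analyzed
harmful_hostnames = 307_000    # hostnames with at least one harmful URL

# Share of all analyzed hostnames that served at least one harmful URL.
harmful_share = harmful_hostnames / total_hostnames

# 49% of harmful sites triggered HTTP requests from a non-browser process.
non_browser_http = harmful_hostnames * 49 // 100

# Assumption: the 5% figure is also relative to the 307,000 harmful sites.
responder_sites = harmful_hostnames * 5 // 100

print(f"{harmful_share:.1%} of analyzed hostnames were harmful")
print(f"about {non_browser_http:,} sites with non-browser HTTP requests")
print(f"about {responder_sites:,} sites that activated a responder session")
```

So roughly one analyzed hostname in nineteen was harmful, and on the order of 150,000 of those led to HTTP requests from outside the browser.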

9

Life cycle of web-based malware

Network characteristics

Figure: the destination ports of all outgoing connections from the virtual machine upon infection

10

Life cycle of web-based malware

Network characteristics

For each port, the authors report the number of unique hostnames on which at least one URL installed malware that transmitted data to that port

More than 400 different destination ports were contacted

This shows the diverse nature of malware’s post-infection network behavior
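The per-port count described above can be computed along these lines. The session tuples and hostnames are made up for illustration; the slides do not show the actual record format.

```python
from collections import defaultdict


def hostnames_per_port(sessions):
    """Map each destination port to the number of unique hostnames
    whose URLs led to data being transmitted to that port."""
    hosts = defaultdict(set)
    for hostname, port, bytes_sent in sessions:
        if bytes_sent > 0:  # skip connections that carried no data
            hosts[port].add(hostname)
    return {port: len(names) for port, names in hosts.items()}


# Hypothetical sessions: (infecting hostname, destination port, bytes sent)
sessions = [
    ("a.example", 80, 512),
    ("a.example", 80, 128),
    ("b.example", 80, 64),
    ("b.example", 6667, 40),
    ("c.example", 25, 0),
]
counts = hostnames_per_port(sessions)
```

Counting hostnames rather than sessions keeps a single very active infection from dominating the totals for a port.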

11

The exact distribution of HTTP connections destined to nonstandard ports according to the destination port number

12

Life cycle of web-based malware

Discovery and Propagation

Malware usually scans for other vulnerable systems, either on the same LAN or on the Internet, in order to propagate

The figure on this slide shows the distribution of network protocols used by malware

13

Life cycle of web-based malware

Reporting Home

To observe this activity, SMTP responders are employed to capture emails

Each captured email has a subject and a body

14

Life cycle of web-based malware

Reporting Home

Table 1 shows the most common email subjects

TABLE 1
Subject                              # Messages
XP Hacked                            390
ProRat [...]                         162
Vip Passw0rds                        98
Log file from ...                    82
Installation report                  76
Perfect Keylogger [...]              47
Installation on XP succeeded         12
E g y S p y KeyLogger [...]          12
INFECTADO                            6
Mais 1: XP                           3
AVSXP                                3
C-h-e-c-k-i-n-g:XP                   2
...:Noticia quentinha de:... XP      2

Table 2 shows the most common SMTP servers used by malware to send installation reports

TABLE 2
SMTP Server      # Messages
yahoo.com        436
google.com       118
tvm.com.tr       98
aol.com          82
hotmail.com      19
outblaze.com     8
globo.com        6
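A tally like the one in Table 1 can be produced from the raw messages captured by an SMTP responder. The sketch below uses Python's standard email parser; the three sample messages are invented, and only their subjects echo entries from the table.

```python
from collections import Counter
from email import message_from_string


def tally_subjects(raw_messages):
    """Count captured emails by Subject header, most frequent first."""
    subjects = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        subjects[msg.get("Subject", "(no subject)")] += 1
    return subjects.most_common()


# Invented messages standing in for captured installation reports.
captured = [
    "Subject: XP Hacked\n\nbody one",
    "Subject: Installation report\n\nbody two",
    "Subject: XP Hacked\n\nbody three",
]
top = tally_subjects(captured)
```

The same approach applied to the destination mail server of each session would give a tally like Table 2.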

15

Life cycle of web-based malware

Reporting Home

The HTTP protocol is also used to report successful installations back to malware authors

An example from a trojan:

    GET /geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00&lversion=&wversion=&day=0&name=dodolook&recent=0 HTTP/1.1
    Accept: */*
    User-Agent: Mozilla/4.0 (compatible; )
    Host: loader.51edm.net:1207
    Cache-Control: no-cache
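The fields reported in such a request can be pulled out with the standard URL parser. The request line below is the one from the slide, joined back into a single line; the slides do not document what each parameter means, so the code only extracts the names and values.

```python
from urllib.parse import parse_qs, urlsplit

request_line = (
    "GET /geturl.php?version=1.1.2&fid=7493&mac=00-00-00-00-00-00"
    "&lversion=&wversion=&day=0&name=dodolook&recent=0 HTTP/1.1"
)


def report_fields(line):
    """Return the query parameters of an HTTP request line as a dict."""
    _method, target, _version = line.split(" ", 2)
    query = urlsplit(target).query
    # keep_blank_values preserves empty fields such as lversion and wversion
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


fields = report_fields(request_line)
```

The parsed dictionary contains eight parameters, including an all-zero `mac` value and the empty `lversion` and `wversion` fields.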

16

Life cycle of web-based malware

Reporting Home

Malware also reported infections using a custom XML-like format

HGZ5.<FT>2008-01-28 12:55:30</FT><IM>80</IM><GR>_&</GR>

<SYS>Windows XP 5.1</SYS>

<NE>XP</NE><pid>488</PID><VER>Ver1.22-0624</VER>

<BZ></BZ><P>1</P><V>0</V><IP>0.0.0.0</IP>

000......<LC></LC><GR>-</GR><IM>25</IM><NA>XP</NA>

<CS>English (United States)</CS><OS>Windows XP</OS>

<MEM>1024MB</MEM><CPU>2200 MHz</CPU>

<NET>LAN</NET><video>0</video><BZ>-</BZ>
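Because the format is only XML-like (note the mismatched case in the pid tag), a strict XML parser is a poor fit. A tolerant regular expression is one possible way to read such a record. The sketch below is an assumption about how an analyst might do it, applied to a shortened sample built from fields in the record above.

```python
import re

# Match <TAG>value</TAG>; IGNORECASE lets <pid> pair with </PID>.
TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def parse_report(record):
    """Return (tag, value) pairs in order of appearance, tags upper-cased."""
    return [(name.upper(), value) for name, value in TAG.findall(record)]


# Shortened sample assembled from fields shown in the slide's record.
sample = (
    "<SYS>Windows XP 5.1</SYS><NE>XP</NE><pid>488</PID>"
    "<VER>Ver1.22-0624</VER><BZ></BZ><MEM>1024MB</MEM>"
)
pairs = parse_report(sample)
```

Returning a list of pairs rather than a dictionary keeps repeated tags, which matters here because tags such as GR, IM, and BZ appear more than once in the record.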

17

Life cycle of web-based malware

Data exfiltration

Responder sessions show indications of data exfiltration, such as browser history files and stored passwords

o Some of the captured emails send stored passwords back from a compromised machine

o HTTP is also used to send sensitive information back to data collection servers (note the large number of POST requests in the graph on slide #11)

18

Life cycle of web-based malware

Data exfiltration

Within 2 days, one server held 4,729 files containing more than 250,000 valid email addresses

The authors found more sensitive information in extensive logs that malware uploaded continuously

The logs contain the victim’s IP address, DNS server, gateway, MAC address, and username, along with the URL and the intercepted form and password fields of HTTP requests

o In 250 MB of logs, 500 usernames and passwords were found for over 250 web sites, including banking sites, google.com, and yahoo.com
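The log fields listed above suggest a record structure like the one below, from which counts such as "sites affected" can be derived. The class, its field names, and the sample values are hypothetical; the slides do not show the actual log format.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class LogEntry:
    """One uploaded log record, using the fields named on the slide."""
    victim_ip: str
    dns_server: str
    gateway: str
    mac_address: str
    username: str
    url: str             # page whose HTTP request was intercepted
    form_fields: tuple   # intercepted (field name, value) pairs


def summarize(entries):
    """Return (number of distinct sites, entries with a password field)."""
    sites = {urlsplit(e.url).hostname for e in entries}
    with_password = sum(
        1
        for e in entries
        if any("pass" in name.lower() for name, _ in e.form_fields)
    )
    return len(sites), with_password


# Placeholder records using documentation addresses and example domains.
entries = [
    LogEntry("192.0.2.10", "192.0.2.1", "192.0.2.1", "00-00-00-00-00-00",
             "user1", "https://bank.example/login",
             (("user", "u"), ("password", "x"))),
    LogEntry("192.0.2.10", "192.0.2.1", "192.0.2.1", "00-00-00-00-00-00",
             "user1", "https://mail.example/search",
             (("q", "weather"),)),
]
summary = summarize(entries)
```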

19

Life cycle of web-based malware

Joining Botnets

The authors encountered 2 types of botnets in their work:
1. IRC Botnets
2. HTTP Botnets

20

Life cycle of web-based malware

IRC Botnets

IRC is used for C&C communication

IRC sessions to 90 servers were observed, using 1,587 different nicknames in 95 channels

21

Life cycle of web-based malware

IRC Botnets

Some malware uses ordinary-looking nicknames and channels, while other malware uses artificial nicknames such as [0]USA|XP[P]152102 or Inject-2l087876
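Both example nicknames end in a long run of digits, which suggests a simple heuristic for flagging generated names. This rule is an assumption based only on the two examples above, not a method from the authors, and it would miss bots that use ordinary-looking nicknames.

```python
import re

# Heuristic: five or more consecutive digits anywhere in the nickname.
LONG_DIGIT_RUN = re.compile(r"\d{5,}")


def looks_generated(nickname: str) -> bool:
    """Flag nicknames that contain a long run of digits."""
    return bool(LONG_DIGIT_RUN.search(nickname))
```

For instance, `looks_generated("[0]USA|XP[P]152102")` is true, while a short human-style nickname such as `"bob42"` is not flagged.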

22

Life cycle of web-based malware

HTTP Botnets

Used to organize large-scale spam campaigns

To participate in a spam campaign, each bot repeatedly downloaded ZIP archives using HTTP requests

Each response contained a ZIP archive with instructions on how to participate in the campaign

23

Life cycle of web-based malware

HTTP Botnets

Example contents of an instruction archive:
• 000_data22 - a list of domains and their authoritative name servers used to form the sender's email address
• 001_ncommall - a list of common first names used as part of the sender's email address
• 002_otkogo_r - a list of possible "from" names related to the subject of the spam campaign
• 003_subj_rep - a list of possible email subjects
• 004_outlook - the template of the spam email
• config - a configuration file that instructs the bot how to construct emails from the data files, how many emails to send in total, and how many connections are allowed at a given time
• message - the message body of the spam campaign
• mlist - a list of email addresses to which to send the spam
• mxdata - a binary file containing information about the mail-exchange servers for the email addresses in mlist
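When analyzing a captured archive of this kind, a first step could be to list its members and label them with the roles described above. The sketch below uses Python's zipfile module on a small stand-in archive built in memory; the helper name and the stand-in contents are hypothetical.

```python
import io
import zipfile

# Roles as described on the slide, keyed by file name.
ROLES = {
    "000_data22": "sender domains and name servers",
    "001_ncommall": "common first names",
    "002_otkogo_r": "possible from names",
    "003_subj_rep": "possible subjects",
    "004_outlook": "email template",
    "config": "construction and rate settings",
    "message": "message body",
    "mlist": "recipient addresses",
    "mxdata": "mail-exchange server data",
}


def describe_archive(data: bytes):
    """Map each member of a ZIP archive to its known role, if any."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: ROLES.get(name, "unknown") for name in archive.namelist()}


# Stand-in archive with placeholder contents, built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config", "placeholder")
    archive.writestr("mlist", "placeholder")
    archive.writestr("readme", "placeholder")
listing = describe_archive(buffer.getvalue())
```

Members that do not match a known name are labeled "unknown" so that changes in the campaign format stand out.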

24

Life cycle of web-based malware

HTTP Botnets

Top domains out of 700,000 email addresses collected from a spam-sending botnet

Email Domain             Frequency
yahoo.com                28899
sbcglobal.net            14417
yahoo.co.uk              8939
shaw.ca                  8321
hotmail.com              6985
korea.com                6041
yahoo.co.jp              5215
striker.ottawa.on.ca     4415
web.de                   4276
yahoo.co.in              4200

o The most frequent domains captured in an hour did not entirely overlap with those in the larger data set
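A domain ranking like the one above can be derived from a collected address list in a few lines. The addresses in the usage example are placeholders, not data from the study.

```python
from collections import Counter


def top_domains(addresses, n=10):
    """Return the n most frequent email domains, most frequent first."""
    domains = Counter(
        address.rsplit("@", 1)[1].lower()
        for address in addresses
        if "@" in address  # skip malformed entries
    )
    return domains.most_common(n)


# Placeholder addresses for illustration only.
ranking = top_domains(
    ["a@Example.com", "b@example.com", "c@example.org", "malformed"], n=2
)
```

Running the same function over an hourly sample and over the full list would allow the overlap noted above to be checked directly.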

25

Summary and Conclusion