an information-theoretic view of network-aware malware attacks

12
530 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009 An Information-Theoretic View of Network-Aware Malware Attacks Zesheng Chen, Member, IEEE, and Chuanyi Ji, Senior Member, IEEE Abstract—This work provides an information-theoretic view to better understand the relationships between aggregated vulnera- bility information viewed by attackers and a class of randomized epidemic scanning algorithms. In particular, this work investigates three aspects: 1) a network vulnerability as the nonuniform vulner- able-host distribution, 2) threats, i.e., intelligent malwares that ex- ploit such a vulnerability, and 3) defense, i.e., challenges for fighting the threats. We first study five large data sets and observe con- sistent clustered vulnerable-host distributions. We then present a new metric, referred to as the nonuniformity factor, that quantifies the unevenness of a vulnerable-host distribution. This metric is es- sentially the Renyi information entropy that unifies the nonunifor- mity of a vulnerable-host distribution with different malware-scan- ning methods. Next, we draw a relationship between Renyi en- tropies and randomized epidemic scanning algorithms. We find that the infection rates of malware-scanning methods are charac- terized by the Renyi entropies that relate to the information bits in a nonunform vulnerable-host distribution extracted by a ran- domized scanning algorithm. Meanwhile, we show that a represen- tative network-aware malware can increase the spreading speed by exactly or nearly a nonuniformity factor when compared to a random-scanning malware at an early stage of malware propaga- tion. This quantifies that how much more rapidly the Internet can be infected at the early stage when a malware exploits an uneven vulnerable-host distribution as a network-wide vulnerability. Fur- thermore, we analyze the effectiveness of defense strategies on the spread of network-aware malwares. Our results demonstrate that counteracting network-aware malwares is a significant challenge for the strategies that include host-based defenses and IPv6. Index Terms—Attack models, network security, performance metrics. I. INTRODUCTION M ALWARE attacks are a significant threat to networks. Malwares are malicious software that include worms, viruses, bots, and spyware. A fundamental characteristic of mal- wares is self-propagation, i.e., a malware can infect vulnerable hosts and use infected hosts to self-disseminate. A key compo- nent of malware propagation is malware-scanning methods, i.e., how effectively the malware finds vulnerable targets. Most of Manuscript received January 06, 2009; revised May 18, 2009. First pub- lished June 30, 2009; current version published August 14, 2009. This work was supported in part by NSF ECS 0300605. The associate editor coordi- nating the review of this manuscript and approving it for publication was Prof. Robert H. Deng. Z. Chen is with the Department of Electrical and Computer Engineering, Florida International University, Miami, FL 33174 USA (e-mail: zchen@fiu. edu). C. Ji is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIFS.2009.2025847 the real, especially old worms, such as Code Red [19], Slammer [18], and latter Witty [25], use naive random scanning (RS) [6]. RS chooses target IP addresses uniformly and does not take any information on network structures into consideration. Advanced scanning methods, however, have been developed that exploit the IP address structure. For example, Code Red II [34] and Nimda [33] worms have used localized scanning (LS) [5]. LS preferentially searches for vulnerable hosts in the local subnet- work. The Blaster worm [36] has used sequential scanning in addition to LS [31]. Sequential scanning searches for vulnerable hosts through their closeness in the IP address space. Moreover, the AgoBot has employed a blacklist of the well-known moni- tored IP address space and avoided scanning these addresses to be stealthy [39]. The Samy worm has developed to make use of “friendship” in social networks to propagate across the MyS- pace site [40]. A common characteristic of these malwares is that they scan for vulnerable hosts by exploiting a certain struc- ture in the IP address space. Such a structure, as we shall soon show, exhibits network vulnerabilities to defenders and advan- tages to attackers. In this paper, we study the perspective of attackers who at- tempt to collect the information on network vulnerabilities and design intelligent malwares. By studying this perspective, we hope to help defenders better understand malware spreading, e.g., the worst as well as practical malwares that utilize a cer- tain class of randomized scanning algorithms, and better defend against malware attacks. For attackers, an open question is how certain information can help them design fast-spreading malwares. The informa- tion can be from coarse to fine, including the number of vul- nerable hosts, a distribution of vulnerable hosts, the locations of detection systems, and individual vulnerable hosts. This work focuses on aggregated information, i.e., vulnerable-host distri- butions. The vulnerable-host distributions have been observed to be bursty and spatially inhomogeneous by Barford et al. [1]. A nonuniform distribution of Witty-worm victims has been re- ported by Rajab et al. [21]. A web-server distribution has been found to be nonuniform in the IP address space in our prior work [8]. These discoveries suggest that vulnerable hosts and web servers may be “clustered” (i.e., nonuniform). The clus- tering/nonuniformity makes the network vulnerable since if one host is compromised in a cluster, the rest may be compromised rather quickly. Therefore, the information on vulnerable-host distributions can be critical for attackers to develop intelligent malwares. We refer the malwares that exploit the information on the highly uneven distributions of vulnerable hosts as net- work-aware malwares. Such malwares include aforementioned localized-scanning and sequential-scanning malwares. In our 1556-6013/$26.00 © 2009 IEEE

Upload: zesheng-chen

Post on 22-Sep-2016

215 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: An Information-Theoretic View of Network-Aware Malware Attacks

530 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

An Information-Theoretic View of Network-AwareMalware Attacks

Zesheng Chen, Member, IEEE, and Chuanyi Ji, Senior Member, IEEE

Abstract—This work provides an information-theoretic view tobetter understand the relationships between aggregated vulnera-bility information viewed by attackers and a class of randomizedepidemic scanning algorithms. In particular, this work investigatesthree aspects: 1) a network vulnerability as the nonuniform vulner-able-host distribution, 2) threats, i.e., intelligent malwares that ex-ploit such a vulnerability, and 3) defense, i.e., challenges for fightingthe threats. We first study five large data sets and observe con-sistent clustered vulnerable-host distributions. We then present anew metric, referred to as the nonuniformity factor, that quantifiesthe unevenness of a vulnerable-host distribution. This metric is es-sentially the Renyi information entropy that unifies the nonunifor-mity of a vulnerable-host distribution with different malware-scan-ning methods. Next, we draw a relationship between Renyi en-tropies and randomized epidemic scanning algorithms. We findthat the infection rates of malware-scanning methods are charac-terized by the Renyi entropies that relate to the information bitsin a nonunform vulnerable-host distribution extracted by a ran-domized scanning algorithm. Meanwhile, we show that a represen-tative network-aware malware can increase the spreading speedby exactly or nearly a nonuniformity factor when compared to arandom-scanning malware at an early stage of malware propaga-tion. This quantifies that how much more rapidly the Internet canbe infected at the early stage when a malware exploits an unevenvulnerable-host distribution as a network-wide vulnerability. Fur-thermore, we analyze the effectiveness of defense strategies on thespread of network-aware malwares. Our results demonstrate thatcounteracting network-aware malwares is a significant challengefor the strategies that include host-based defenses and IPv6.

Index Terms—Attack models, network security, performancemetrics.

I. INTRODUCTION

M ALWARE attacks are a significant threat to networks.Malwares are malicious software that include worms,

viruses, bots, and spyware. A fundamental characteristic of mal-wares is self-propagation, i.e., a malware can infect vulnerablehosts and use infected hosts to self-disseminate. A key compo-nent of malware propagation is malware-scanning methods, i.e.,how effectively the malware finds vulnerable targets. Most of

Manuscript received January 06, 2009; revised May 18, 2009. First pub-lished June 30, 2009; current version published August 14, 2009. This workwas supported in part by NSF ECS 0300605. The associate editor coordi-nating the review of this manuscript and approving it for publication wasProf. Robert H. Deng.

Z. Chen is with the Department of Electrical and Computer Engineering,Florida International University, Miami, FL 33174 USA (e-mail: [email protected]).

C. Ji is with the School of Electrical and Computer Engineering, GeorgiaInstitute of Technology, Atlanta, GA 30332 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIFS.2009.2025847

the real, especially old worms, such as Code Red [19], Slammer[18], and latter Witty [25], use naive random scanning (RS) [6].RS chooses target IP addresses uniformly and does not take anyinformation on network structures into consideration. Advancedscanning methods, however, have been developed that exploitthe IP address structure. For example, Code Red II [34] andNimda [33] worms have used localized scanning (LS) [5]. LSpreferentially searches for vulnerable hosts in the local subnet-work. The Blaster worm [36] has used sequential scanning inaddition to LS [31]. Sequential scanning searches for vulnerablehosts through their closeness in the IP address space. Moreover,the AgoBot has employed a blacklist of the well-known moni-tored IP address space and avoided scanning these addresses tobe stealthy [39]. The Samy worm has developed to make useof “friendship” in social networks to propagate across the MyS-pace site [40]. A common characteristic of these malwares isthat they scan for vulnerable hosts by exploiting a certain struc-ture in the IP address space. Such a structure, as we shall soonshow, exhibits network vulnerabilities to defenders and advan-tages to attackers.

In this paper, we study the perspective of attackers who at-tempt to collect the information on network vulnerabilities anddesign intelligent malwares. By studying this perspective, wehope to help defenders better understand malware spreading,e.g., the worst as well as practical malwares that utilize a cer-tain class of randomized scanning algorithms, and better defendagainst malware attacks.

For attackers, an open question is how certain informationcan help them design fast-spreading malwares. The informa-tion can be from coarse to fine, including the number of vul-nerable hosts, a distribution of vulnerable hosts, the locations ofdetection systems, and individual vulnerable hosts. This workfocuses on aggregated information, i.e., vulnerable-host distri-butions. The vulnerable-host distributions have been observedto be bursty and spatially inhomogeneous by Barford et al. [1].A nonuniform distribution of Witty-worm victims has been re-ported by Rajab et al. [21]. A web-server distribution has beenfound to be nonuniform in the IP address space in our priorwork [8]. These discoveries suggest that vulnerable hosts andweb servers may be “clustered” (i.e., nonuniform). The clus-tering/nonuniformity makes the network vulnerable since if onehost is compromised in a cluster, the rest may be compromisedrather quickly. Therefore, the information on vulnerable-hostdistributions can be critical for attackers to develop intelligentmalwares.

We refer the malwares that exploit the information onthe highly uneven distributions of vulnerable hosts as net-work-aware malwares. Such malwares include aforementionedlocalized-scanning and sequential-scanning malwares. In our

1556-6013/$26.00 © 2009 IEEE

Page 2: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 531

prior work, we have studied importance-scanning malwares[8], [10], [15]. Specifically, importance scanning (IS) providesa worst-case (“what-if”) scenario: When there are many waysfor network-aware malwares to exploit the information on vul-nerable hosts, IS is a worst-case threat-model and can serve asa benchmark for studying real malwares. Indeed, what has beenobserved is that real network-aware and importance-scanningmalwares spread much faster than random-scanning malwares[8], [21]. However, it is not well understood how to characterizethe relationship between the information on vulnerable-hostdistributions possessed by attackers and the propagation speedof network-aware malwares.

Questions arise. Does a generic characteristic exist acrossdifferent vulnerable-host distributions? If so, how do network-aware malwares exploit such a vulnerability? How can we de-fend against such malwares? Our goal is to investigate such ageneric characteristic in vulnerable-host distributions, to quan-tify its relationship with network-aware malwares, and to un-derstand the effectiveness of defense strategies. To achieve thisgoal, we investigate network-aware malware attacks in view ofinformation theory, focusing on both the worst-case and real net-work-aware malwares.

A fundamental concept of information theory is the entropythat measures the uncertainty of outcomes of a random event.The reduction of uncertainty is measured by the amount of ac-quired information. We apply the Renyi entropy, a generalizedentropy [23], to analyze the uncertainty of finding vulnerablehosts for different malware-scanning methods. As we shallsoon show, the Renyi entropy is unique in relating three factors:malware-scanning methods, the information bits extracted bymalwares from the vulnerable-host distribution, and malwarespreading speed.

As the first step, we observe, from five large-scale mea-surement sets, the common characteristics of nonuniformvulnerable-host distributions. We then derive a new metric asthe nonuniformity factor to characterize the nonuniformity ofa vulnerable-host distribution. A larger nonuniformity factorreflects a more nonuniform distribution of vulnerable hosts. Weobtain the nonuniformity factors from the data sets on vulner-able-host distributions and show that all data sets have largenonuniformity factors. Moreover, the nonuniformity factor isa function of the Renyi entropies of order two and zero [23].We show that the nonuniformity factor better characterizes theunevenness of a distribution than the Shannon entropy. There-fore, in view of information theory, the nonuniformity factorprovides a quantitative measure of the unevenness/uncertaintyof a vulnerable-host distribution.

Next, we relate the generalized entropy with network-awarescanning methods. The class of network-aware malwares thatwe study all utilizes randomized epidemic algorithms. Hencethe importance of applying the generalized entropy is that theRenyi entropy characterizes the bits of information extractableby the randomized epidemic algorithms. We develop explicitrelations among the Renyi entropy, the randomized epidemicscanning methods, and the spreading speed of network-awaremalwares at an early stage of propagation. A malware thatspreads faster at the early stage can in general infect most of thevulnerable hosts in a shorter time. The propagation ability of a

malware at the early stage is characterized by the infection rate[32]. We derive the infection rates of a class of network-awaremalwares. We find that the infection rates of random-scanningand network-aware malwares are determined by the uncertaintyof the vulnerable-host distribution or the Renyi entropies ofdifferent orders. Specifically, a random-scanning malware hasthe largest uncertainty (i.e., Renyi entropy of order zero), andan optimal importance-scanning malware has the smallestuncertainty (i.e., Renyi entropy with order infinity). Moreover,the infection rates of some real network-aware malwares de-pend on the nonuniformity factors or the Renyi entropy oforder two. For example, compared with RS, LS can increasethe infection rate by nearly a nonuniformity factor. There-fore, the infection rates of malware-scanning algorithms arecharacterized by the Renyi entropies, relating the efficiencyof a randomized scanning algorithm with the uncertainty on anonuniform vulnerable-host distribution. These analytical re-sults on the relationships between vulnerable-host distributionsand network-aware malware spreading ability are validated bysimulation.

Finally, we study new challenges to malware defenses posedby network-aware malwares. Using the nonuniformity factor,we show quantitatively that the host-based defense strategies,such as proactive protection (PP) [3] and virus throttling (VT)[27], should be deployed at almost all hosts to slow down net-work-aware malwares at the early stage. A partial deploymentmay invalidate such host-based defenses. Moreover, we demon-strate that the infection rate of a network-aware malware in theIPv6 Internet can be comparable to that of the Code Red v2worm in the IPv4 Internet. This shows that having a much largerIP-address space would not alleviate malware spreading.

The remainder of this paper is structured as follows. Section IIprovides the background on information theory. Section IIIpresents our collected data sets. Section IV introduces a newmetric called the nonuniformity factor and compares this metricto the Shannon entropy. Sections V and VI characterize thespreading ability of network-aware malwares through theoret-ical analyses and simulations. Section VII further studies theeffectiveness of defense strategies on network-aware malwares.Section VIII concludes this paper.

II. RENYI ENTROPY

An entropy is a measure of the average information uncer-tainty of a random variable [13]. A general entropy, called theRenyi entropy [4], [23], is defined as

for (1)

where the random variable is with probability distributionand alphabet . The well-known Shannon entropy is a specialcase of the Renyi entropy, i.e.,

(2)

It is noted that

(3)

Page 3: An Information-Theoretic View of Network-Aware Malware Attacks

532 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

where is the alphabet size, and

(4)

where is a result from and is calledthe min-entropy of . In this paper, moreover, we are also in-terested in the Renyi entropy of order two, i.e.,

(5)

Comparing , , , and , we have thefollowing theorem that has been proved in [4] and [22].

Theorem 1:

(6)

with equality iff is uniformly distributed over .

III. MEASUREMENTS AND VULNERABLE-HOST DISTRIBUTIONS

We begin our study by considering how significant the un-evenness of vulnerable-host distributions is. We use five largedata sets to obtain empirical vulnerable-host distributions.

A. Measurements

DShield (D1): DShield provides the information of vulner-able hosts by aggregating logs from more than 1600 intrusiondetection systems (IDSes) distributed throughout the In-ternet [35]. We further focus on the following ports that wereattacked by worms: 80 (HTTP), 135 (DCE/RPC), 445 (Net-BIOS/SMB), 1023 (FTP servers and the remote shell attackedby W32.Sasser.E.Worm), and 6129 (DameWare).

iSinks (P1 and C1): Two unused address space monitors runthe iSink system [29]. The monitors record the unwanted trafficarriving at the unused address spaces that include a Class A net-work (referred to as “Provider” or P1) and two Class B net-works at the campus of the University of Wisconsin (referredto as “Campus” or C1) [1].

Witty-worm victims (W1): A list of Witty-worm victims isprovided by CAIDA [25]. CAIDA used a network telescopewith approximate IP addresses to log the traffic of Witty-worm victims that are Internet security systems (ISS) products.

Web-server list (W2): The IP addresses of web servers werecollected through UROULETTE [38]. UROULETTE providesa random uniform resource locator (URL) generator to obtain alist of IP addresses of web servers.

The first three data sets (D1, P1, and C1) were collected over aseven-day period from 12/10/2004 to 12/16/2004 and have beenstudied in [1] to demonstrate the bursty and spatially inhomoge-neous distribution of malicious source IP addresses. The last twodata sets (W1 and W2) have been used in our prior work [8] toshow the virulence of importance-scanning malwares. The sum-mary of our data sets is given in Table I.

B. Vulnerable-Host Distributions

To obtain vulnerable-host group distributions, we use theclassless interdomain routing (CIDR) notation [17]. The In-ternet is partitioned into subnets according to the first bitsof IP addresses, i.e., prefixes or subnets. In this division,there are subnets, and each subnet contains addresses,

TABLE ISUMMARY OF THE DATA SETS

Fig. 1. CCDF of collected data sets.

where . For example, when , theInternet is grouped into Class A subnets ( i.e., /8 subnets); when

, the Internet is partitioned into Class B subnets (i.e., /16subnets).

We plot the complementary cumulative distribution functions(CCDFs) of our collected data sets in /8 and /16 subnets in Fig. 1in log-log scales. CCDF is defined as the fraction of the subnetswith the number of hosts greater than a given value. Fig. 1(a)shows population distributions in /8 subnets for D1, P1, C1,W1, and W2, whereas Fig. 1(b) exhibits host distributions in

Page 4: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 533

/16 subnets for D1 with different ports (80, 135, 445, 1023, and6129). Fig. 1 demonstrates a wide range of populations, indi-cating highly inhomogeneous address structures. Specifically,the relatively straight lines, such as W2 and D1-135, imply thatvulnerable hosts follow a power law distribution. Similar ob-servations were given in [1], [8], [9], [11], [18]–[21], [28], and[42]–[44].

IV. NONUNIFORMITY FACTOR

In this section, we derive a simple metric, called the nonuni-formity factor, to quantify the vulnerability, i.e., the nonunifor-mity of a vulnerable-host distribution. We show that the nonuni-formity factor is a function of Renyi entropies. We then comparethe nonuniformity factor with the well-known Shannon entropy.

A. Definition and Property

We consider aggregated vulnerable-host distributions. Letbe an aggregation level of IP addresses as

defined in Section III-B. For a given , let be the numberof vulnerable hosts in subnet , where . Let be

the total number of vulnerable hosts, where .Let be the probability that a ran-domly chosen vulnerable host is in the th subnet. Then

; and . Thus, ’s

denote the group distribution of vulnerable hosts in subnets.Definition: The nonuniformity factor in subnets is defined

as

(7)

Note that such a definition is not arbitrary. In Section V,is used to unify the analytical results on the infection speeds ofdifferent malware-scanning strategies.

One property of is that

(8)

The above inequality is derived by the Cauchy–Schwarz in-equality. The equality holds if and only if , for

. In other words, when the vulnerable-host dis-tribution is uniform, achieves the minimum value 1. On theother hand, since ,

(9)

The equality holds when for some and ,, i.e., all vulnerable hosts concentrate on one subnet.

This means that when the vulnerable-host distribution is ex-tremely nonuniform, obtains the maximum value . More-over, assuming that vulnerable hosts are uniformly distributedin the first subnets, i.e., ,

; and , , we have. Therefore, characterizes the nonuniformity

of a vulnerable-host distribution. A larger nonuniformity factorreflects a more nonuniform distribution of vulnerable hosts.

The nonuniformity factor is indeed related to a distance be-tween a vulnerable-host distribution and a uniform distribution.Consider distance between and the uniform distribu-tion for , where

(10)

which leads to

(11)

For a given , is a constant and is the size of the sample spaceof subnets. Hence, essentially measures the deviation ofa vulnerable-host group distribution from a uniform distributionfor subnets.

How does vary with ? When , . In theother extreme, where ,

address is vulnerable to the malwareotherwise

(12)which results in . More importantly, the ratioof to lies between 1 and 2, as shown below.

Theorem 2:

(13)

where .The proof of Theorem 2 can be found in [7]. An intuitive ex-

planation of this theorem is as follows. For and sub-nets, group of subnets is parti-tioned into groups and of subnets. If vulnerablehosts in each group of subnets are equally dividedinto the groups of subnets (i.e.,

), then . If the division of vul-nerable hosts is extremely uneven for all groups (i.e.,

or ), then . Excludingthese two extreme cases, . Therefore,

is a nondecreasing function of . Moreover, the ratio ofto reflects how unevenly vulnerable hosts in eachsubnet distribute between two groups of subnets. This ratiois indicated by the slope of in a figure.

B. Estimated Nonuniformity Factor

Fig. 2 shows the nonuniformity factors estimated from ourdata sets. The nonuniformity factors increase with the prefixlength for all data sets. Note that the -axis is in a scale.Thus, increases almost exponentially with a wide range of. To gain intuition on how large can be, and

are summarized for all data sets in Table II. It can be observedthat and have large values, indicating the significantunevenness of collected distributions.

C. Shannon Entropy

To further understand the importance of the nonuniformityfactor, we now turn our attention on the Shannon entropy for

Page 5: An Information-Theoretic View of Network-Aware Malware Attacks

534 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

Fig. 2. Nonuniformity factors of collected data sets. Note that the �-axis usesa ��� scale.

TABLE II� AND � OF COLLECTED DISTRIBUTIONS

comparison. It is well-known that the Shannon entropy can beused to measure the nonuniformity of a probability distribution[13]. The Shannon entropy in subnets is defined as

(14)

where .

Fig. 3. Shannon entropies of collected data sets.

It is noted that

(15)

If a distribution is uniform, achieves the maximumvalue . On the other hand, if a distribution is extremely nonuni-form, e.g., all vulnerable hosts concentrate on one subnet,

obtains the minimum value 0.Furthermore, we compare with and

find that their difference is between 0 and 1, as shown in thefollowing theorem.

Theorem 3:

(16)

where .The proof of Theorem 3 can be found in [7]. Fig. 3 shows

the Shannon entropies of our empirical distributions from thedata sets. is denoted by the diagonal line in the

Page 6: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 535

figure. It can be seen that the curves for our collected data setsare similar.

D. Nonuniformity Factor, Renyi Entropy, and Shannon Entropy

To quantify the difference between the nonuniformity factorand the Shannon entropy, we note that the nonuniformity factordirectly relates to the Renyi entropies of order two and zero, asshown in the following equation:

(17)

where . Therefore, thenonuniformity factor is essentially a Renyi entropy. Hence, thenonuniformity factor corresponds to a generalized entropy oforder 2, whereas the Shannon entropy is the generalized entropyof order 1.

Why do we choose the nonuniformity factor rather than theShannon entropy? We compare these two metrics in terms ofcharacterizing a vulnerable-host distribution and find the fol-lowing fundamental differences. First, in Fig. 2, when a distri-bution is uniform, . Hence, the distance betweenand the horizontal access 1 measures the degree of unevenness.Similarly, the distance between and 0 in Fig. 3 reflectshow uniform a distribution is. A larger corresponds toa more even distribution, whereas a larger corresponds to amore nonuniform distribution. In addition, if two distributionshave different prefix lengths, we can directly apply the nonuni-formity factor (or the Shannon entropy) to compare the uneven-ness (or evenness) between them. Therefore, the Shanon entropyprovides a better measure for describing the evenness of a dis-tribution, while the nonuniformity factor gives a better metricfor characterizing the nonuniformity of a distribution. Second,from Theorem 1 and (17), we have whenthe nonzero ’s are not all equal. Meanwhile, evidenced byFigs. 2 and 3, the nonuniformity factor magnifies the uneven-ness of a distribution. Therefore, depends more on the large

’s. Finally, a more important aspect of using the nonunifor-mity factor is its relation to some real randomized epidemic al-gorithms (e.g., LS and sequential scanning). Such a relationshipcannot be drawn using the Shannon entropy but can be relatedto Renyi entropies and thus information bits.

V. INFORMATION BITS AND NETWORK-AWARE

MALWARE SPREADING

In this section, we explicitly relate the speed of malware prop-agation with the information bits extracted by random-scanningand network-aware malwares.

A. Collision Probability, Uncertainty, and Information Bits

Now consider a malware or an adversary that searches forvulnerable hosts. An adversary often does not have the com-plete knowledge on the locations of vulnerable hosts. Hence,malwares make a random guess on which subnets are likelyto have most vulnerable hosts. This results in a class of ran-domized epidemic algorithms for malwares to scan subnets. Let

be the probability that a malware scansthe th subnet. Thus, ’s characterize the virulence ofrandomized epidemic scanning algorithms.

We then define three important quantities: the collision prob-ability, uncertainty, and information bits. Consider a randomlychosen vulnerable host . The probability that this host is inthe subnet is . Imagine that a malware guesses whichsubnet host belongs to and chooses a target subnet withthe probability . Thus, the probability for the

malware to make a correct guess is . Thisprobability is called the collision probability and is defined in[4]. Such a probability of success is reflected in our designednonuniformity factor and corresponds to a scenario that the mal-ware knows the underlying vulnerable-host group distribution.Intuitively, the more nonuniform a vulnerable-host distributionis, the larger the probability of success is, i.e., the easier it is fora scan to hit a vulnerable host, the more vulnerable the networkis, and the less uncertainty there is in a vulnerable-host distribu-tion.

We now extend the concept of the collision probability anddefine as the probability that a malware scan hits a subnetwhere a randomly chosen vulnerable host locates, i.e.,

(18)

Then two important quantities can be defined:• as the uncertainty exhibited by the vulnerable-

host distribution ’s.• as the number of information bits

extracted by a randomized epidemic scanning algorithmusing ’s.

Here is regarded as the uncertainty on the vulner-able-host distribution in view of the malware, similar to self-in-formation [41]. For example, if a malware has no informationabout a vulnerable-host distribution and has to use RS, it has thelargest uncertainty and extracts zero informationbits from the distribution. Likewise, the number of informationbits extracted by a network-aware malware can be measured asthe reduction of the uncertainty and thus equal to

. For example,is the information bits extractable by an adversary that chooses

.

B. Infection Rate

We characterize the spread of a network-aware malware atan early stage by deriving the infection rate. The infection rate,denoted by , is defined as the average number of vulnerablehosts that can be infected per unit time by one infected hostduring the early stage of malware propagation [32]. The infec-tion rate is an important metric for studying network-aware mal-ware spreading ability for two reasons. First, since the number ofinfected hosts increases exponentially with the rate duringthe early stage, a malware with a higher infection rate can spreadmuch faster at the beginning and thus infect a large number ofhosts in a shorter time [8]. Second, while it is generally difficultto derive a close-form solution for dynamic malware propaga-tion, we can obtain a close-form expression of the infection ratefor different malware scanning methods.

Page 7: An Information-Theoretic View of Network-Aware Malware Attacks

536 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

Let denote the (random) number of vulnerable hosts thatcan be infected per unit time by one infected host during theearly stage of malware propagation. The infection rate is theexpected value of , i.e., . Let be the scanning rateor the number of scans sent by an infected host per unit time,be the number of vulnerable hosts, and be the scanning space(i.e., ).

C. Random Scanning

RS has been used by most real worms and is the simplestrandomized epidemic algorithm. For RS, an infected host sendsout random scans per unit time, and the probability that onescan hits a vulnerable host is . Thus, follows a Binomialdistribution ,1 resulting in

(19)

Another way to derive the infection rate of RS is as follows.Consider a randomly chosen vulnerable host . The probabilitythat this host is in the subnet is . An RS malwarecan make a successful guess on which subnet host belongsto with collision probability . Ascan from the RS malware can be regarded as first selecting the

subnet randomly and then choosing the host in the subnetat random. Hence the probability for the malware to hit host

is . Since thereare vulnerable hosts, the probability for a malware to hit avulnerable host is . Thus, follows aBinomial distribution , resultingin

(20)

Therefore, for the RS malware, the uncertainty on the vul-nerable-host distribution is , i.e., thenumber of information bits on vulnerable hosts extracted by RSis .

D. Optimal IS

IS exploits the nonuniform distribution of vulnerable hosts.In our prior work, we showed that IS corresponds to theworst-case malware attacks given a vulnerable-host distribution[8]. In this work, we derive the infection rate of IS and relate thatto information bits. An infected host scans subnet with theprobability . Consider a randomly chosen vulnerable host

. The probability for this host being in subnet is . AnIS malware can make a successful guess on which subnet hostbelongs to with collision probability .Thus, the probability for the malware to hit the host is

. Similar to RS, of IS follows

1In our derivation, we ignore the dependency of the events that different scanshit the same target at the early stage of malware propagation.

a Binomial distribution ,which leads to2

(21)

Therefore, the uncertainty of the vulnerable-host distribu-tion for an IS malware is , and thenumber of information bits on vulnerable hosts extracted by ISis .

Note that IS can choose ’s to maximize the infectionrate, resulting in a “worst-case” scenario for defenders or op-timal IS ( OPT_IS) for attackers [8], i.e.,

(22)

That is,

(23)Therefore, the uncertainty on the vulnerable-host distributionfor OPT_IS is ; and the number of informationbits on vulnerable hosts extracted by this scanning method is

.

E. Suboptimal IS

As shown in our prior work [8], the optimal IS is difficult toimplement in reality. Hence we consider a special case of IS,where the group scanning distribution is chosen to beproportional to the number of vulnerable hosts in group , i.e.,

. This results in suboptimal IS [8], called IS.Thus, the infection rate derived in this work is

(24)Therefore, the uncertainty on the vulnerable-host distributionfor IS is ; and the corresponding number of in-formation bits extracted is or .Moreover, compared with RS, this IS can increase the infec-tion rate by a factor of . On the other hand, RS can be re-garded as a special case of suboptimal IS when .

F. Localized Scanning

LS has been used by such real worms as Code Red II andNimda. LS is a simpler randomized algorithm that utilizes only afew parameters rather than an underlying vulnerable-host groupdistribution. We first consider a simplified version of LS, called

LS, which scans the Internet as follows:• of the time, an address with the same first

bits is chosen as the target;• of the time, an address is chosen randomly from an

entire IP address space.

2The same result was derived in [8] but by a different approach.

Page 8: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 537

Hence, LS is an oblivious yet local randomized algorithm wherethe locality is characterized by parameter . Assume that aninitially infected host is randomly chosen from the vulnerablehosts. Let denote the subnet where an initially infected hostlocates. Thus, , where .For an infected host located in subnet , a scan from this hostprobes globally with the probability of and hits subnet

with the likelihood of . Thus, the groupscanning distribution for this host is

ifotherwise

(25)

where . Given the subnet location of an ini-tially infected host (i.e., subnet ), the conditional collisionprobability or the probability for a malware scan to hit a subnetwhere a randomly chosen vulnerable host locates can be calcu-lated based on (18), i.e.,

(26)

Therefore, we can compute the collision probability as

(27)

resulting in

(28)

Therefore, the number of information bits extracted from thevulnerable-host distribution by LS is ,which is between 0 and .

Moreover, since ( is for a uniform distri-bution and is excluded here), increases with respect to .Specifically, when , . Thus,LS has an infection rate comparable to that of IS. In reality,

cannot be 1. This is because an LS malware begins spreadingfrom one infected host that is specifically in a subnet; and if

, the malware can never spread out of this subnet. There-fore, we expect that the optimal value of should be large butnot 1.

Next, we further consider another LS, called two-level LS(2LLS), which has been used by the Code Red II and Nimdaworms [33], [34]. 2LLS scans the Internet as follows:

• of the time, an address with the same firstbyte is chosen as the target;

• of the time, an address with the samefirst two bytes is chosen as the target;

• of the time, a random address is chosen.For example, for the Code Red II worm, and

[34]; for the Nimda worm, and [33].Using the similar analysis for LS, we can derive the infectionrate of 2LLS

(29)

Similarly, the number of information bits extracted fromthe vulnerable-host distribution by the 2LLS malware is

, which is between 0and .

Since from Theorem 2, holds orincreases when both and increase. Specially, when

, . Thus, 2LLS has an infectionrate comparable to that of /16 IS. Moreover, is much largerthan as shown in Table II for the collected distributions.Hence, is more significant than for 2LLS.

G. Modified Sequential Scanning

The Blaster worm is a real malware that exploits sequentialscanning in combination with LS. A sequential-scanning mal-ware studied in [16], [31] begins to scan addresses sequentiallyfrom a randomly chosen starting IP address and has a similarpropagation speed as a random-scanning malware. The Blasterworm selects its starting point locally as the first address of itsClass C subnet with probability 0.4 [31], [36]. To analyze theeffect of sequential scanning, we do not incorporate LS. Specif-ically, we consider our modified sequential-scanning (MSS)malware, which scans the Internet as follows:

• Newly infected host begins with RS until finding a vul-nerable host with address .

• After infecting the target , host continues to sequen-tially scan IP addresses

in the subnet where locates.Such a sequential malware-scanning strategy is in a similar spiritto the nearest neighbor rule, which is widely used in patternclassification [12]. The basic idea is that if the vulnerable hostsare clustered, the neighbor of a vulnerable host is likely to bevulnerable also.

Such a MSS malware has two stages. In the first stage(called MSS_1), the malware uses RS and has an infectionrate of , i.e., . In the second stage (calledMSS_2), the malware scans sequentially in a subnet. The fist

bits of a target address are fixed, whereas the last bitsof the address are generated additively or subtractively and aremodulated by . Let denote the sunbet where locates.Thus, , where . Since anMSS_2 malware scans only the subnet where locates, theconditional collision probability , which leads

to . Thus, the infection rate is

(30)

Therefore, the uncertainty on vulnerable hosts for MSS isbetween and . Moreover, the infection rateof MSS is between and .

H. Summary

The information bits extractable by the network-aware mal-wares relates the entropy on a vulnerable-host distribution andthe malware propagation speed, as shown in the following equa-tion:

(31)

where is the collision probability and is the infection rateof the malware.

Page 9: An Information-Theoretic View of Network-Aware Malware Attacks

538 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

TABLE IIIUNCERTAINTY ON THE VULNERABLE-HOST DISTRIBUTION, INFORMATION BITS, AND INFECTION RATES OF DIFFERENT SCANNING METHODS

When subnets are considered, RS has the largest un-certainty , while optimal IS has the smallest un-certainty . Moreover, infection rates of all threenetwork-aware malwares (suboptimal IS, LS, and MSS) canbe far larger than that of an RS malware, depending on thenonuniformity factors (i.e., ) or the Renyi entropy in theorder of two (i.e., ). The infection rates of all thesescanning algorithms are characterized by the Renyi entropies,relating the efficiency of a randomized scanning algorithm withthe information bits in a vulnerable-host distribution.

Hence, we relate the information theory with the net-work-aware malware propagation through the Renyi entropy.The uncertainty of a vulnerable-host group probability distri-bution, which is quantified by the Renyi entropy, is importantfor an attacker to design a network-aware malware. If there isno uncertainty about the distribution of vulnerable hosts (e.g.,either all vulnerable hosts are concentrated on a subnet or allinformation about vulnerable hosts is known), the entropy isminimum, and the malware that uses the information on thedistribution can spread fastest by employing the optimal IS. Onthe other hand, if there is maximum uncertainty (e.g., vulner-able hosts are uniformly distributed), the entropy is maximum.For this case, the best strategy that a malware can take is touse RS. In general, when an attacker obtains more informationabout a nonuniform vulnerable-host distribution (e.g., larger ),the resulting malware can spread faster.

VI. SIMULATION AND VALIDATION

We now validate our analytical results through simulationsusing the collected data sets.

A. Infection Rate

We first focus on validating infection rates. We apply thediscrete event simulation to our experiments [24]. Specifically,we simulate the searching process of a malware using differentscanning methods at the early stage. We use the C1 data set forthe vulnerable-host distribution. Note that the C1 distributionhas the nonuniformity factors and ,and . The malware spreads over theC1 distribution with and has a scanning rate

. The uncertainty on the vulnerable-host distributionand the information bits extractable for different scanningmethods are shown in Table III. The simulation stops whenthe malware has sent out scans for RS, /16 OPT_IS,/16 IS, /16 LS, and 2LLS, and 65 535 scans for /16 MSS_2.Then, we count the number of vulnerable hosts hit by themalware scans and compute the infection rate. The results areaveraged over runs. Table III compares the simulationresults (i.e., sample mean) with the analytical results [i.e.,(20), (22), (24), (28), (29), and (30)]. Here, a /16 LS malware

TABLE IVINFECTION RATES OF A /16 MSS MALWARE

uses , whereas a 2LLS malware employsand . We observe that the sample means and theanalytical results are almost identical.

We observe that network-aware malwares have much largerinfection rates than random-scanning malwares. LS indeed in-creases the infection rate with nearly a nonuniformity factor andapproaches the capacity of suboptimal IS. This is significantas LS only depends on one or two parameters (i.e., forLS and , for 2LLS), while IS requires the information ofthe vulnerable-host distribution. On the other hand, LS has alarger sample variance than IS as indicated by Table III. Thisimplies that the infection speed of an LS malware depends onthe location of initially infected hosts. If the LS malware beginsspreading from a subnet containing densely populated vulner-able hosts, the malware would spread rapidly. Furthermore, wenotice that the MSS malware also has a large infection rate at thesecond stage, indicating that MSS can indeed exploit the clus-tering pattern of the distribution. Meanwhile, the large samplevariance of the infection rate of MSS_2 reflects that an MSSmalware strongly depends on the initially infected hosts. Wefurther compute the infection rate of a /16 MSS malware thatincludes both random-scanning and sequential-scanning stages.Simulation results are averaged over runs and are summa-rized in Table IV. These results strongly depend on the totalnumber of malware scans. When the number of malware scansis small, an MSS malware behaves similar to a random-scan-ning malware. When the number of malware scans increases,the MSS malware spends more scans on the second stage andthus has a larger infection rate.

B. Dynamic Malware Propagation

An infection rate only characterizes the early stage of mal-ware propagation. We now employ the analytical active wormpropagation (AAWP) model and its extensions to characterizethe entire spreading process of malwares [6]. Specifically, thespread of RS and IS malwares is implemented as described in[8], whereas the propagation of LS malwares is modeled ac-cording to [21]. The parameters that we use to simulate a mal-ware are comparable to those of the Code Red v2 worm. CodeRed v2 has a vulnerable population and a scan-ning rate per minute [30]. We assume that the malwarebegins spreading from an initially infected host that is locatedin the subnet containing the largest number of vulnerable hosts.We show the propagation speeds of network-aware malwares for

Page 10: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 539

Fig. 4. Network-aware malware spreads over the D1-80 distribution.

the same vulnerable-host distribution from data set D1-80. FromSection V, we expect that a network-aware malware can spreadmuch faster than an RS malware. Fig. 4 demonstrates such anexample on a malware that uses different scanning methods. Ittakes an RS malware 10 h to infect 99% of vulnerable hosts,whereas a /8 LS malware with or a /8 IS malwaretakes only about 3.5 h. A /16 LS malware with or a2LLS malware with and can further reducethe time to 1 h. A /16 IS malware spreads fastest and takes only0.5 h.

VII. EFFECTIVENESS OF DEFENSE STRATEGIES

What are new requirements and challenges for a defensesystem to slow down the spread of a network-aware malware?We conduct an initial study on the effectiveness of defensestrategies through nonuniformity factors.

A. Host-Based Defenses

Host-based defenses have been widely used for random-scan-ning malwares. PP and VT are examples of host-based defensestrategies. A PP strategy proactively hardens a system, making itdifficult for a malware to exploit vulnerabilities [3]. Techniquesused by PP include address-space randomization, pointer en-cryption, instruction-set randomization, and password protec-tion. Thus, a malware requires multiple trials to compromise ahost that implements PP. Specifically, let denotethe protection probability or the probability that a single mal-ware attempt succeeds in infecting a vulnerable host that imple-ments PP. On the average, a malware should make exploitattempts to compromise the target. We assume that hosts withPP are uniformly deployed in the Internet. Letdenote the deployment ratio of the number of hosts with PP tothe total number of hosts.

To show the effectiveness of the PP strategy, we consider theinfection rate of a IS malware. Since now some of the vulner-

able hosts implement PP, (24) changes to

(32)

To slow down the spread of a suboptimal IS malware to that ofa random-scanning malware, , resulting in

(33)

When PP is fully deployed, i.e., , can be at most .On the other hand, if PP provides perfect protection, i.e., ,

should be at least . Therefore, when is large,inequality (33) presents high requirements for the PP strategy.For example, if (most of ’s in Table II are largerthan this value), and . That is, a PP strategyshould be almost fully deployed and provide a nearly perfectprotection for a vulnerable host.

We next consider the VT strategy [27]. VT constrains thenumber of outgoing connections of a host and can thus reducethe scanning rate of an infected host. We find that (32) alsoholds for this strategy, except that is the ratio of the scanningrate of infected hosts with VT to that of infected hosts withoutVT. Therefore, VT also requires to be almost fully deployed forfighting network-aware malwares effectively.

From these two strategies, we have learned that an effectivestrategy should reduce either or . Host-based defenses,however, are limited in such capabilities.

B. IPv6

IPv6 can decrease significantly by increasing the scan-ning space [32]. But the nonuniformity factor would increase theinfection rate if the vulnerable-host distribution is still nonuni-form. Hence, an important question is whether IPv6 can coun-teract network-aware malwares when both and aretaken into consideration.

We study this issue by computing the infection rate of a net-work-aware malware in the IPv6 Internet. As pointed out by [2],a smart malware can first detect some vulnerable hosts in /64subnets containing many vulnerable hosts, then release to thehosts on the hitlist, and finally spread inside these subnets. Sucha malware only scans the local /64 subnet. Thus, we focus on thespreading speed of a network-aware malware in a /64 subnet.From Fig. 2, we extrapolate that in the IPv6 Internet canbe in the order of if hosts are still distributed in a clusteredfashion. Using the parameters proposed by [14] and

used by the Slammer worm [18], we derive the infec-tion rate of a /32 IS malware in a /64 subnet of the IPv6 Internet:

. is larger thanthe infection rate of the Code Red v2 worm in the IPv4 Internet,where . There-fore, IPv6 can only slow down the spread of a network-awaremalware to that of a random-scanning malware in IPv4. To de-fend against the malware effectively, we should further considerhow to slow down the increase rate of as increases whenIPv4 is updated to IPv6.

Page 11: An Information-Theoretic View of Network-Aware Malware Attacks

540 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 3, SEPTEMBER 2009

C. Discussions

Defending against network-aware malwares is challengingalso due to the unknown causes of vulnerable-host distributions.That is why the vulnerable-host distribution is highly nonuni-form in the IPv4 address space. An answer to this questionwould involve other studies that are beyond the scope of thiswork. Nevertheless, we hypothesize several possible reasons.First, no vulnerable hosts can exist in reserved or multicast ad-dress ranges [37]. Second, different subnet administrators maymake different use of their own IP address space. For example,an administrator may intend to use NATs extensively, whereasanother administrator would consider to distribute the global IPaddresses first. Third, a subnet intends to have many computerswith the same operating systems and applications for easy man-agement [6], [26]. Last, some subnets are more protected thanothers [1], [21]. Studying these issues may provide insights onnew defense strategies, which can be considered as a future di-rection.

VIII. CONCLUSION

In this paper, we have derived a simple metric, known as thenonuniformity factor, to quantify an uneven distribution of vul-nerable hosts. The nonuniformity factor, shown as a function ofthe Renyi entropies of order two and zero, better characterizesthe uneven feature of a distribution than the Shannon entropy.Moreover, we have drawn a relationship between Renyi en-tropies and randomized epidemic scanning algorithms. Specif-ically, we have related the information bits extracted by mal-wares from a vulnerable-host distribution with the propagationspeed of network-aware malwares. Furthermore, we have evalu-ated the effectiveness of several commonly used defense strate-gies on network-aware malwares. The host-based defenses, suchas PP or VT, are required to be almost fully deployed to slowdown malware spreading at the early stage. This implies thathost-based defenses would be weakened significantly by net-work-aware scanning. More surprisingly, different from pre-vious findings, we have shown that network-aware malwarescan be zero-day malwares in the IPv6 Internet if vulnerable hostsare still clustered. These findings present a significant challengeto malware defenses: Entirely different strategies may be neededfor fighting network-aware malwares.

As part of our ongoing work, we plan to study in more depthrelationships between information theory and dynamic malwareattacks and develop effective detection and defense systems thatexploit vulnerable-host distributions.

REFERENCES

[1] P. Barford, R. Nowak, R. Willett, and V. Yegneswaran, “Toward amodel for sources of Internet background radiation,” in Proc. Passiveand Active Measurement Conf. (PAM’06), Adelaide, Australia, Mar.2006.

[2] S. M. Bellovin, B. Cheswick, and A. Keromytis, “Worm propagationstrategies in an IPv6 Internet,” Login, vol. 31, no. 1, pp. 70–76, Feb.2006.

[3] D. Brumley, L. Liu, P. Poosankam, and D. Song, “Design space andanalysis of worm defense strategies,” in ACM Symp. InformAtion, Com-puter and Communications Security (ASIACCS), Taipei, Taiwan, Mar.2006.

[4] C. Cachin, “Entropy Measures and Unconditional Security in Cryp-tography,” Ph.D. thesis, Swiss Federal Institute of Technology, Zurich,1997.

[5] Z. Chen, C. Chen, and C. Ji, “Understanding localized-scanningworms,” in Proc. 26th IEEE Int. Performance Computing and Com-munications Conf. (IPCCC’07), New Orleans, LA, Apr. 2007, pp.186–193.

[6] Z. Chen, L. Gao, and K. Kwiat, “Modeling the spread of active worms,”in Proc. INFOCOM’03, San Francisco, CA, Apr. 2003, vol. 3, pp.1890–1900.

[7] Z. Chen and C. Ji, An Information-Theoretical View of Net-work-Aware Malware Attacks, Tech. Rep. arXiv:0805.0802v2, 2008[Online]. Available: http://arxiv.org/abs/0805.0802

[8] Z. Chen and C. Ji, “Optimal worm-scanning method using vulnerable-host distributions,” Int. J. Security Netw., Special Issue on Comput.Netw. Security, vol. 2, no. 1/2, pp. 71–80, 2007.

[9] Z. Chen and C. Ji, “Measuring network-aware worm spreading ability,”in Proc. INFOCOM’07, Anchorage, AK, May 2007.

[10] Z. Chen and C. Ji, “A self-learning worm using importance scanning,”in Proc. ACM/CCS Workshop Rapid Malcode (WORM’05), Fairfax,VA, Nov. 2005, pp. 22–29.

[11] Z. Chen, C. Ji, and P. Barford, “Spatial-temporal characteristics ofmalicious sources,” in Proc. INFOCOM’08 Mini-Conf., Phoenix, AZ,Apr. 2008.

[12] T. M. Cover and P. E. Hart, “Nearest neighbor pattern classification,”IEEE Trans. Inf. Theory, vol. IT-13, no. 1, pp. 21–27, Jan. 1967.

[13] T. M. Cover and J. A. Thomas, Elements of Information Theory. NewYork: Wiley, 1991.

[14] H. Feng, A. Kamra, V. Misra, and A. D. Keromytis, “The effect ofDNS delays on worm propagation in an IPv6 Internet,” in Proc. IN-FOCOM’05, Miami, FL, Mar. 2005, vol. 4, pp. 2405–2414.

[15] G. Gu, Z. Chen, P. Porras, and W. Lee, “Misleading and defeating im-portance-scanning malware propagation,” in Proc. 3rd Int. Conf. Se-curity Privacy in Communication Networks (SecureComm’07), Nice,France, Sep. 2007.

[16] G. Gu, M. Sharif, X. Qin, D. Dagon, W. Lee, and G. Riley, “Worm de-tection, early warning and response based on local victim information,”in Proc. 20th Ann. Computer Security Applications Conf. (ACSAC’04),Tucson, AZ, Dec. 2004.

[17] E. Kohler, J. Li, V. Paxson, and S. Shenker, “Observed structure of ad-dresses in IP traffic,” in ACM SIGCOMM Internet Measurement Work-shop, Marseille, France, Nov. 2002.

[18] D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, and N.Weaver, “Inside the Slammer worm,” IEEE Security Privacy, vol. 1,no. 4, pp. 33–39, Jul. 2003.

[19] D. Moore, C. Shannon, and J. Brown, “Code-Red: A case study on thespread and victims of an Internet worm,” in ACM SIGCOMM InternetMeasurement Workshop, Marseille, France, Nov. 2002.

[20] Y. Pryadkin, R. Lindell, J. Bannister, and R. Govindan, An Empir-ical Evaluation of IP Address Space Occupancy, USC/Information Sci-ences Inst., Tech. Rep. ISI-TR-2004-598, Nov. 2004.

[21] M. A. Rajab, F. Monrose, and A. Terzis, “On the effectiveness of dis-tributed worm monitoring,” in Proc. 14th USENIX Security Symp. (Se-curity’05), Baltimore, MD, Aug. 2005, pp. 225–237.

[22] A. Renyi, “Some fundamental questions of information theory,” in Se-lected Papers of Alfred Renyi. Budapest: Akademiai Kiado, 1976,vol. 2, pp. 526–552.

[23] A. Renyi, Probability Theory. New York: Elsevier, 1970.[24] S. M. Ross, Simulation, 3rd ed. New York: Academic, 2002.[25] C. Shannon and D. Moore, “The spread of the Witty worm,” IEEE Se-

curity Privacy, vol. 2, no. 4, pp. 46–50, Jul./Aug. 2004.[26] S. Staniford, V. Paxson, and N. Weaver, “How to 0wn the Internet in

your spare time,” in Proc. 11th USENIX Security Symp. (Security’02),San Francisco, CA, Aug. 2002.

[27] J. Twycross and M. M. Williamson, “Implementing and testing a virusthrottle,” in Proc. 12th USENIX Security Symp. (Security’03), Wash-ington, DC, Aug. 2003, pp. 285–294.

[28] M. Vojnovic, V. Gupta, T. Karagiannis, and C. Gkantsidis, “Samplingstrategies for epidemic-style information dissemination,” in Proc. IN-FOCOM’08, Phoenix, AZ, Apr. 2008.

[29] V. Yegneswaran, P. Barford, and D. Plonka, “On the design and utilityof Internet sinks for network abuse monitoring,” in Proc. Symp. RecentAdvances in Intrusion Detection (RAID’04), Sophia Antipolis, France,2004.

[30] C. C. Zou, L. Gao, W. Gong, and D. Towsley, “Monitoring and earlywarning for Internet worms,” in 10th ACM Conf. Computer and Com-munication Security (CCS’03), Washington, DC, Oct. 2003.

[31] C. C. Zou, D. Towsley, and W. Gong, “On the performance of Internetworm scanning strategies,” Elsevier J. Performance Evaluation, vol.63, no. 7, pp. 700–723, Jul. 2006.

Page 12: An Information-Theoretic View of Network-Aware Malware Attacks

CHEN AND JI: INFORMATION-THEORETIC VIEW OF NETWORK-AWARE MALWARE ATTACKS 541

[32] C. C. Zou, D. Towsley, W. Gong, and S. Cai, “Advanced routing wormand its security challenges,” Simulation: Trans. Soci. Modeling Simu-lation Int., vol. 82, no. 1, pp. 75–85, 2006.

[33] CERT Advisory CA-2001-26 Nimda Worm CERT CoordinationCenter, Dec. 2008 [Online]. Available: http://www.cert.org/advi-sories/CA-2001-26.html

[34] “Code Red II:” Another Worm Exploiting Buffer Overflow in IIS In-dexing Service DLL CERT Coordination Center, CERT Incident NoteIN-2001-09, Dec. 2008 [Online]. Available: http://www.cert.org/inci-dent_notes/IN-2001-09.html

[35] Distributed Intrusion Detection System (DShield) Dec. 2008 [Online].Available: http://www.dshield.org/

[36] ANALYSIS: Blaster Worm eEye Digital Security, Dec. 2008 [On-line]. Available: http://research.eeye.com/html/advisories/pub-lished/AL20030811.html

[37] Internet Protocol V4 Address Space Dec. 2008 [Online]. Available:http://www.iana.org/assignments/ipv4-address-space

[38] UROULETTE Dec. 2008 [Online]. Available: http://www.uroulette.com/

[39] Agobot (Computer Worm) Wikipedia, Dec. 2008 [Online]. Available:http://en.wikipedia.org/wiki/Agobot_(computer_worm)

[40] Samy (XSS) Wikipedia, Dec. 2008 [Online]. Available: http://en.wikipedia.org/wiki/Samy_(XSS)

[41] Self-Information Wikipedia, Dec. 2008 [Online]. Available: http://en.wikipedia.org/wiki/Self-information

[42] S. Venkataraman, S. Sen, O. Spatscheck, P. Haffner, and D. Song, “Ex-ploiting network structure for proactive spam mitigation,” in Proc. 16thUSENIX Security Symp. (Security ’07), Boston, MA, Aug. 2007.

[43] S. Shakkottai and R. Srikant, “Peer to peer networks for defenseagainst Internet worms,” IEEE J. Sel. Areas Commun., vol. 25, no. 9,pp. 1745–1752, Dec. 2007.

[44] M. Vojnovic and A. J. Ganesh, “On the race of worms, alerts, andpatches,” IEEE Trans. Netw., vol. 16, no. 5, pp. 1066–1079, Oct. 2008.

Zesheng Chen (S’02–M’07) received the M.S.and Ph.D. degrees from the School of Electricaland Computer Engineering, Georgia Institute ofTechnology, in 2005 and 2007, respectively. He alsoreceived B.E. and M.E. degrees from the Departmentof Electronic Engineering, Shanghai Jiao TongUniversity, Shanghai, China, in 1998 and 2001,respectively.

He is an Assistant Professor at the Department ofElectrical and Computer Engineering, Florida Inter-national University, Miami, FL. His research interests

include network security and the performance evaluation of computer networks.

Chuanyi Ji (S’85–M’91–SM’07) received the B.S.(Honors) degree from Tsinghua University, Beijing,China, in 1983, the M.S. degree from the Universityof Pennsylvania, Philadelphia, in 1986, and the Ph.D.degree from the California Institute of Technology,Pasadena, in 1992, all in electrical engineering.

She is an Associate Professor in the Department ofElectrical and Computer Engineering, Georgia Insti-tute of Technology, Atlanta. She was on the facultyat Rensselaer Polytechnic Institute, Troy, NY, from1991 to 2001. She spent her sabbatical at Bell Labo-

ratories, Lucent Technologies, in 1999, and was a visiting faculty at the Massa-chusetts Institute of Technology, Cambridge, in Fall 2000. Her research areas arein network management and security, machine learning, and large-scale mea-surements. Her research interests lie in understanding and managing complexnetworks, machine learning theory and algorithms, and information theory.

Dr. Ji received the NSF Career award in 1995, and an Early Career awardfrom Rensselaer Polytechnic Institute in 2000.