Modeling and Detection of Camouflaging Worm 2012
CHAPTER 4
SYSTEM DESIGN
Systems design is the process of defining the architecture, components, modules,
interfaces, and data for a system to satisfy specified requirements. It implies a systematic
and rigorous approach to design—an approach demanded by the scale and complexity of
many systems problems.
The purpose of System Design is to create a technical solution that satisfies the
functional requirements for the system. At this point in the project lifecycle there should be
a Functional Specification, written primarily in business terminology, containing a complete
description of the operational needs of the various organizational entities that will use the
new system. The challenge is to translate all of this information into Technical
Specifications that accurately describe the design of the system, and that can be used as
input to System Construction.
The Functional Specification produced during System Requirements Analysis is
transformed into a physical architecture. System components are distributed across the
physical architecture, usable interfaces are designed and prototyped, and Technical
Specifications are created for the Application Developers, enabling them to build and test
the system.
System design comprises logical design and physical design. Logical design
describes the structure and characteristics of the system, such as outputs, inputs, files,
databases, and procedures. Physical design, which follows logical design, produces the
actual software and a working system. Design is subject to constraints such as hardware,
software, cost, time, and interfaces.
Department of Computer Science & Engg, SaIT Page 1

System Design involves the analysis, design, and configuration of the necessary
hardware and software components to support the solution's architecture. The major
components of System Design include the Information Model, Community Model,
Security/Permission Model, System Integration, Workflow, and Technical Architecture.
A System Design typically provides the following benefits:
- Improved system performance; individually tailored configuration advice
  demonstrates where improvement is necessary, and how to improve the system to
  regain lost performance.
- Customers gain a detailed understanding of how their users use their system. This
  Usage Profile can be leveraged to develop future architecture changes.
- Potential to learn of future concerns, allowing customers to take proactive measures
  to avoid problems.
- A baseline performance level is established against which benefits can be compared
  and changes to the system predicted or foreseen.
System design is the process of working out the overall functionality and approach
that the system will include. It starts at a high level and then drills down into great detail,
and normally ends up with the production of a technical specification.
Design is the process of determining exactly how the specifications are to be
implemented. Analysis and design are very important in the whole development cycle: any
fault in the design could affect the product or be very expensive to fix in later
stages of software development.
System Design is the activity of proceeding from an identified set of requirements
for a system to a design that meets those requirements. A distinction is sometimes drawn
between high-level or architectural design, which is concerned with the main components of
the system and their roles and interrelationships, and detailed design, which is concerned
with the internal structure and operation of individual components. The term system design
is sometimes used to cover just the high-level design activity.
System design is broadly classified into two categories: high-level design and
low-level design.
High Level Design
A high-level design provides an overview of a solution, platform, system, product,
service, or process. Such an overview is important in a multi-project development to make
sure that each supporting component design will be compatible with its neighboring designs
and with the big picture. The highest level solution design should briefly describe all
platforms, systems, products, services and processes that it depends upon and include any
important changes that need to be made to them. A high-level design document will usually
include a high-level architecture diagram depicting the components, interfaces and networks
that need to be further specified or developed.
The high-level design defines the project level architecture of the system. This
architecture defines the sub-systems to be built, internal and external interfaces to be
developed, and interface standards identified. The high level design is where the sub-
system requirements are developed. The high-level design also identifies the major
candidate off-the-shelf products that might be used in the system.
High-level design is the transitional step between what [requirements for sub-
systems] the system does, and how [architecture and interfaces] the system will be
implemented to meet the system requirements. This process includes the decomposition of
system requirements into alternative project architectures and then the evaluation of these
project architectures for optimum performance, functionality, cost, and other issues
[technical and non-technical]. Stakeholder involvement is critical for this activity. In this
step, internal and external interfaces are identified along with the needed industry standards.
These interfaces are then managed throughout the development process. The following uses
ramp metering as an example for the two key decomposition activities:
Functional decomposition is breaking a function down into its smallest parts. [E.g.,
ramp metering includes the sub-functions of detection, meter rate control, main line
metering, ramp queuing, time of day, and communications].
Physical decomposition defines the physical elements needed to carry out the
function. [E.g., ramp metering decomposition includes loops, controller clock, fiber or
twisted pair for communications, 2070 controllers, host computers, cabinets, and conduits].
Finally, allocating these sub-functions to the physical elements of the system will
form the complete project architecture. This step also defines the integration and
verification activities needed when the system elements are developed.
The high-level design of a software system is a collection of module and subroutine
interfaces related to each other by means of USES and IS_COMPONENT_OF
relationships. The High Level Design Document is an important document for a
project, covering at a high level the overall design of the solution. A succinct summary of
the contents of a High Level Design Document could be:
Detailed use case scenarios of key process flows of the application
The class model and relationships
The sequence diagrams which outline key use case scenarios
The data/object model with relational table design
User interface style and design
After the requirements definition the high level design is the most important document
and provides the blueprint for the further stages of a project including the detailed design
and implementation stages. By not getting the high level design right, organisations run the
risk of creating problems which could be extremely expensive to remedy at a later stage.
The purpose of this High Level Design (HLD) Document is to add the necessary
detail to the current project description to represent a suitable model for coding. This
document is also intended to help detect contradictions prior to coding, and can be used as a
reference manual for how the modules interact at a high level. The HLD documentation
presents the structure of the system, such as the database architecture, application
architecture (layers), application flow (Navigation), and technology architecture. The HLD
uses non-technical to mildly-technical terms which should be understandable to the
administrators of the system.
The document may also depict or otherwise refer to work flows and/or data flows
between component systems. In addition, there should be brief consideration of all
significant commercial, legal, environmental, security, safety and technical risks, issues and
assumptions. The idea is to mention every work area briefly, clearly delegating the
ownership of more detailed design activity whilst also encouraging effective collaboration
between the various project teams.
Today, most high-level designs require contributions from a number of experts,
representing many distinct professional disciplines. Finally, every type of end-user should
be identified in the high-level design and each contributing design should give due
consideration to customer experience.
The high-level design can be explained with the help of an architecture diagram,
a class diagram, and a sequence diagram.
Architecture Diagram
An architecture diagram in system architecture typically depicts a technological
set-up: either various computer components working together, or the steps in a software
process working towards a specific end result.
FIG. 4.1 Architecture diagram of camouflaging worm
Fig. 4.1 shows a centralized C-Worm detection system along with its
modules. The components include a pure random scan, a worm
detection list, and a system scan. The system scan is performed by selecting system
volume information.
Class Diagram
In software engineering, a class diagram in the Unified Modeling Language (UML)
is a type of static structure diagram that describes the structure of a system by showing the
system's classes, their attributes, and the relationships between the classes. The class
diagram is the main building block of object oriented modeling. It is used both for
general conceptual modeling of the systematics of the application, and for detailed
modeling, translating the models into programming code. Class diagrams can also be
used for data modeling. The classes in a class diagram represent both the main
objects and/or interactions in the application and the objects to be programmed. In
the class diagram these classes are represented with boxes which contain three parts:
The upper part holds the name of the class
The middle part contains the attributes of the class
The bottom part gives the methods or operations the class can take or
undertake
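The three compartments map directly onto code. The sketch below is purely illustrative: the class name `WormDetector` and its members are hypothetical and are not taken from Fig. 4.2.

```python
class WormDetector:  # upper part: the name of the class
    """Hypothetical class used only to illustrate the three compartments."""

    def __init__(self, threshold):
        # Middle part: the attributes of the class.
        self.threshold = threshold
        self.scan_counts = []

    # Bottom part: the methods (operations) of the class.
    def record(self, count):
        """Store the scan count observed in one monitoring interval."""
        self.scan_counts.append(count)

    def is_suspicious(self):
        """Report whether the latest scan count exceeds the threshold."""
        return bool(self.scan_counts) and self.scan_counts[-1] > self.threshold
```

In a class diagram, `WormDetector` would appear in the top compartment, `threshold` and `scan_counts` in the middle one, and `record()` and `is_suspicious()` in the bottom one.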
In the design of a system, a number of classes are identified and
grouped together in a class diagram, which helps to determine the static relations
between those objects. With detailed modeling, the classes of the conceptual design
are often split into a number of subclasses.
FIG. 4.2 Class diagram of camouflaging worm
Sequence Diagram
A sequence diagram in the Unified Modeling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in what
order. It is a construct of a Message Sequence Chart. A sequence diagram shows
object interactions arranged in time sequence. It depicts the objects and classes
involved in the scenario and the sequence of messages exchanged between the
objects needed to carry out the functionality of the scenario. Sequence diagrams
typically are associated with use case realizations in the Logical View of the system
under development.
Sequence diagrams are sometimes called event diagrams, event scenarios,
and timing diagrams. A sequence diagram shows, as parallel vertical lines (lifelines),
different processes or objects that live simultaneously, and, as horizontal arrows, the
messages exchanged between them, in the order in which they occur. This allows
the specification of simple runtime scenarios in a graphical manner.
FIG. 4.3 Sequence diagram of camouflaging worm
Main Modules
The different modules included in this project are:
1. C-Worm detection Module
The C-Worm has a self-propagating behavior similar to traditional worms, i.e., it
intends to rapidly infect as many vulnerable computers as possible. However, the C-Worm
differs from traditional worms in that it camouflages any noticeable trends in
the number of infected computers over time. The camouflage is achieved by manipulating
the scan traffic volume of worm-infected computers. Such a manipulation of the scan traffic
volume prevents exhibition of any exponentially increasing trends or even crossing of
thresholds that are tracked by existing detection schemes.
This worm attempts to remain hidden by sleeping (suspending scans) when it
suspects it is under detection. Worms that adopt such smart attack strategies could exhibit
overall scan traffic patterns different from those of traditional worms. Since the existing
worm detection schemes will not be able to detect such scan traffic patterns, it is very
important to understand such smart-worms and develop new countermeasures to defend
against them.
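To make the point about thresholds concrete, the following minimal sketch applies a fixed-threshold rule to two scan-traffic traces. The rule is an assumed stand-in for a volume-based detector, not any specific existing scheme, and both traces are made-up numbers chosen for illustration.

```python
def threshold_alarm(scan_counts, threshold):
    """Return the index of the first interval whose scan count exceeds the
    threshold, or None if the threshold is never crossed."""
    for i, count in enumerate(scan_counts):
        if count > threshold:
            return i
    return None


# Illustrative, made-up scan counts per monitoring interval.
traditional = [10, 20, 40, 80, 160, 320]  # roughly exponential growth
camouflaged = [10, 12, 11, 12, 10, 11]    # overall volume held nearly flat

print(threshold_alarm(traditional, threshold=100))  # 4
print(threshold_alarm(camouflaged, threshold=100))  # None
```

The growing trace trips the alarm in its fifth interval, while the flat trace never does, which is why detection that relies only on traffic volume is insufficient against this kind of worm.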
2. Worm Detection Module or Anomaly Detection
Worm detection schemes are generally host-based or network-based. Since worms are
malicious programs that execute on end-hosts, analyzing the
behavior of worm executables plays an important role in host-based detection systems.
In contrast, network-based detection
systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic
(messages to identify vulnerable computers) generated by worm attacks. Many detection
schemes fall under each category. Ideally, security vulnerabilities would be prevented to begin
with, a problem which must be addressed by the programming language community. However,
while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus
on network-based detection, as this work does, to detect widely spreading worms.
Anomaly detection, also referred to as outlier detection, refers to detecting patterns in
a given data set that do not conform to an established normal behavior.[2] The patterns thus
detected are called anomalies and often translate to critical and actionable information in
several application domains. Anomalies are also referred to as outliers, change, deviation,
surprise, aberrant, peculiarity, intrusion, etc.
In particular in the context of abuse and network intrusion detection, the interesting
objects are often not rare objects, but unexpected bursts in activity. This pattern does not
adhere to the common statistical definition of an outlier as a rare object, and many outlier
detection methods (in particular unsupervised methods) will fail on such data, unless it has
been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the
micro clusters formed by these patterns.
Three broad categories of anomaly detection techniques exist. Unsupervised
anomaly detection techniques detect anomalies in an unlabeled test data set, under the
assumption that the majority of the instances in the data set are normal, by looking for
instances that seem to fit least to the remainder of the data set. Supervised anomaly
detection techniques require a data set that has been labeled as "normal" and "abnormal"
and involve training a classifier (the key difference from many other statistical classification
problems is the inherently unbalanced nature of outlier detection). Semi-supervised anomaly
detection techniques construct a model representing normal behavior from a given normal
training data set, and then test the likelihood that a test instance was generated by the
learnt model.
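As a minimal sketch of the semi-supervised case, the code below builds a model of normal behavior from clean training data and then tests new observations against it. The choice of a mean and standard deviation model with a cut-off of `k` standard deviations is an assumption made for brevity, and the traffic values are made up.

```python
from statistics import mean, stdev


def fit_normal_model(normal_samples):
    """Model normal behavior as the mean and standard deviation of clean data."""
    return mean(normal_samples), stdev(normal_samples)


def is_anomalous(value, model, k=3.0):
    """Flag a value lying more than k standard deviations from the normal mean."""
    mu, sigma = model
    return abs(value - mu) > k * sigma


# Made-up per-interval traffic counts assumed to be free of attacks.
normal_traffic = [100, 104, 98, 101, 97, 103, 99, 102]
model = fit_normal_model(normal_traffic)

print(is_anomalous(102, model))  # False
print(is_anomalous(150, model))  # True
```

A value close to the training data is accepted, while one far outside the learnt range is reported as an anomaly.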
3. Pure Random Scan (PRS) Module
The C-Worm can be extended to defeat other newly developed detection schemes, such
as destination distribution-based detection. Attack target
distribution-based schemes analyze the distribution of attack targets (the scanned destination
IP addresses) as basic detection data to capture a fundamental feature of worm
propagation, i.e., that worms continuously scan different targets.
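The kind of detection data that a destination distribution-based scheme works from can be sketched as follows. This is an assumed simplification: it only counts distinct destinations per source, and the function names, the cut-off `max_distinct`, and the addresses (taken from documentation ranges) are all hypothetical.

```python
from collections import defaultdict


def distinct_destinations(scan_records):
    """Map each source address to the set of destination addresses it contacted."""
    targets = defaultdict(set)
    for src, dst in scan_records:
        targets[src].add(dst)
    return targets


def suspicious_sources(scan_records, max_distinct):
    """Return the sources that contacted more than max_distinct destinations."""
    targets = distinct_destinations(scan_records)
    return sorted(src for src, dsts in targets.items() if len(dsts) > max_distinct)


# Made-up (source, destination) pairs observed in one monitoring window.
records = [
    ("10.0.0.5", "192.0.2.1"), ("10.0.0.5", "192.0.2.1"), ("10.0.0.5", "192.0.2.2"),
    ("10.0.0.9", "198.51.100.7"), ("10.0.0.9", "203.0.113.4"),
    ("10.0.0.9", "192.0.2.77"), ("10.0.0.9", "198.51.100.200"),
]

print(suspicious_sources(records, max_distinct=3))  # ['10.0.0.9']
```

A host that keeps contacting new destinations stands out, whereas repeated traffic to the same few addresses does not.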
4. Worm propagation Module
In an open-loop control system, worm scan traffic volume has a much
higher probability of showing an increasing trend as worm propagation progresses: as
more and more computers get infected, they, in turn, take part in scanning other computers.
Hence, we consider the C-Worm as a worst-case attack scenario that uses closed-loop
control to regulate the propagation speed based on feedback about the propagation status.
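The difference between the two control styles can be illustrated with a toy numerical model. This is a sketch under stated assumptions, not the model used in the project: it uses a simple discrete-time epidemic update, all parameter names and values are hypothetical, and closed-loop control is reduced to capping the overall scan volume at `target_volume`.

```python
def simulate(total_hosts, infected0, scans_per_host, steps, address_space,
             target_volume=None):
    """Return the overall scan volume per time step.

    With target_volume=None every infected host scans at full rate (open loop).
    Otherwise the overall volume is capped at target_volume, using the current
    number of infected hosts as feedback (closed loop).
    """
    infected = float(infected0)
    volumes = []
    for _ in range(steps):
        volume = infected * scans_per_host
        if target_volume is not None:
            volume = min(volume, target_volume)
        volumes.append(volume)
        # Expected new infections: each scan reaches a not-yet-infected
        # vulnerable host with probability (total_hosts - infected) / address_space.
        new_infections = volume * (total_hosts - infected) / address_space
        infected = min(total_hosts, infected + new_infections)
    return volumes


open_loop = simulate(1000, 1, 10, 20, 10000)
closed_loop = simulate(1000, 1, 10, 20, 10000, target_volume=50)

print(open_loop[-1] > open_loop[0])  # True: volume grows with the infection
print(max(closed_loop) <= 50)        # True: volume stays under the cap
```

In the open-loop run the scan volume rises with the number of infected hosts, which is the increasing trend a monitor can observe. In the closed-loop run the volume flattens once the cap is reached.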
Low Level Design
Low Level Design (LLD) details the HLD. It defines the actual logic for
each and every component of the system. Class diagrams with all the methods and relations
between classes come under LLD, as do program specifications. The LLD describes
each and every module in an elaborate manner so that the programmer can directly code the
program based on it. There will be at least one document for each module, and there may be
more. The LLD will contain:
- detailed functional logic of the module in pseudocode
- database tables with all elements, including their type and size
- all interface details with complete API references (both requests and responses)
- all dependency issues
- error message listings
- complete inputs and outputs for a module
The low level design document for a project should provide a complete and detailed
specification of the design for the software that will be developed in the project, including
the classes, member and non-member functions, and associations between classes that are
involved. By the end of the Low Level Design stage, the code should be "all but written".
The low level design document should contain a listing of the declarations of all the classes,
non-member-functions, and class member functions that will be defined during the
implementation stage, along with the associations between those classes and any other
details of those classes (such as member variables) that are firmly determined by the low
level design stage. The low level design document should also describe the classes, function
signatures, associations, and any other appropriate details, which will be involved in testing
and evaluating the project according to the evaluation plan defined in the project's
requirements document.
More importantly, each project's low level design document should provide a narrative
describing (and comments in your declaration and definition files should point out) how the
high level design is mapped into its detailed low-level design, which is just a step away
from the implementation itself. This should be an English description of how you converted
the technical diagrams (and text descriptions) found in your high level design into
appropriate class and function declarations in your low level design. You should be
especially careful to explain how the class roles and their methods were combined in your
low level design, and any changes that you decided to make in combining and refining
them.
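As an illustration of what such a listing of declarations might look like for one module, consider the sketch below. Every name in it (`ScanRecord`, `TrafficMonitor`, `load_records`) is hypothetical and is not taken from the project's design. Function bodies that belong to the implementation stage are deliberately left unimplemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScanRecord:
    """Member variables that are fixed at the low level design stage."""
    source_ip: str
    destination_ip: str
    timestamp: float


class TrafficMonitor:
    """Hypothetical class that collects scan records for later analysis."""

    def __init__(self) -> None:
        self.records: List[ScanRecord] = []

    def add(self, record: ScanRecord) -> None:
        """Store one observed scan record."""
        self.records.append(record)

    def count_in_window(self, start: float, end: float) -> int:
        """Return the number of records with start <= timestamp < end."""
        raise NotImplementedError  # body deferred to the implementation stage


def load_records(path: str) -> List[ScanRecord]:
    """Non-member function: parse a capture file into scan records."""
    raise NotImplementedError  # body deferred to the implementation stage
```

The signatures, docstrings, and associations (a `TrafficMonitor` holds many `ScanRecord` objects) are settled here, so the code is "all but written" once the bodies are filled in.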
During the detailed design phase, the view of the application developed during the high level
design is broken down into modules and programs. Logic design is done for every program
and then documented as program specifications. For every program, a unit test plan is
created. The entry criterion for this phase is the HLD document, and the exit criteria are the
program specification and unit test plan (LLD). The Low Level Design Document gives the
design of the actual program code, based on the High Level Design
Document. It defines the internal logic of each sub-module; designers prepare
individual LLDs and map them to every module. A good Low Level Design Document
makes the program easy to develop: if
proper analysis is done and the Low Level Design Document is prepared, the code can
be developed directly from the Low Level Design Document with minimal
debugging and testing effort. The Low Level Design is explained by a Data Flow Diagram and an
Activity Diagram.
Data Flow Diagram
A Data flow diagram (DFD) is a graphical representation of the "flow" of data
through an information system, modeling its process aspects. Often they are a
preliminary step used to create an overview of the system which can later be elaborated.
[2] DFDs can also be used for the visualization of data processing (structured design).
Data flow diagrams (DFDs) are one of the three essential perspectives of the
structured-systems analysis and design method (SSADM). The sponsor of a project and
the end users will need to be briefed and consulted throughout all stages of a system's
evolution. With a data flow diagram, users are able to visualize how the system will
operate, what the system will accomplish, and how the system will be implemented. The
old system's dataflow diagrams can be drawn up and compared with the new system's
data flow diagrams to draw comparisons to implement a more efficient system. Flow
diagrams can be used to provide the end user with a physical idea of where the data they
input ultimately has an effect upon the structure of the whole system from order to
dispatch to report. How any system is developed can be determined through a data flow
diagram.
A DFD shows what kinds of data will be input to and output from the system,
where the data will come from and go to, and where the data will be stored. It does not
show information about the timing of processes, or information about whether processes
will operate in sequence or in parallel (which is shown on a flowchart).
FIG. 4.4 Data Flow diagram of camouflaging worm
Activity Diagram
Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In the
Unified Modeling Language, activity diagrams can be used to describe the business
and operational step-by-step workflows of components in a system. An activity
diagram shows the overall flow of control.
Activity diagrams are constructed from a limited number of shapes,
connected with arrows. The most important shape types:
Rounded rectangles represent activities.
Diamonds represent decisions.
Bars represent the start (split) or end (join) of concurrent activities.
A black circle represents the start (initial state) of the workflow.
An encircled black circle represents the end (final state).
Arrows run from the start towards the end and represent the order in which
activities happen. Hence they can be regarded as a form of flowchart. Typical
flowchart techniques lack constructs for expressing concurrency. However, the join
and split symbols in activity diagrams only resolve this for simple cases. The
meaning of the model is not clear when they are arbitrarily combined with decisions
or loops.
An activity diagram resembles a flow chart that represents the flow from one
activity to another. An activity is a particular operation of the
system, so the control flow is drawn from one operation to another. This flow can
be sequential, branched, or concurrent. Activity diagrams deal with all types of flow
control by using different elements such as fork and join.
Activity diagrams are not
only used for visualizing the dynamic nature of a system; they are also used to
construct the executable system by using forward and reverse engineering
techniques. The only thing missing from an activity diagram is the message part: it does
not show any message flow from one activity to another. An activity diagram is
sometimes considered a flow chart, but although it looks like one, it is not strictly a
flow chart, because it can show parallel, branched, concurrent, and single flows.
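The fork, join, and decision elements correspond to familiar constructs in code. The sketch below is one possible reading, not a rendering of Fig. 4.5: the activity names are hypothetical, and a thread pool is used as an assumed way of expressing concurrent activities.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_scan_traffic():
    """Hypothetical activity."""
    return "scan traffic collected"


def load_detection_list():
    """Hypothetical activity."""
    return "detection list loaded"


def run_workflow():
    # Fork (bar): start two activities that may run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        traffic = pool.submit(collect_scan_traffic)
        detection_list = pool.submit(load_detection_list)
        # Join (bar): wait for both activities before continuing.
        results = [traffic.result(), detection_list.result()]
    # Decision (diamond): branch on the joined results.
    if all(results):
        return "analyze"
    return "stop"


print(run_workflow())  # analyze
```

The calls to `submit` play the role of the fork bar, the two `result()` calls form the join, and the `if` statement is the decision diamond.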
FIG. 4.5 Activity diagram of camouflaging worm
Use Case Diagram
In software and systems engineering, a use case is a list of steps, typically
defining interactions between a role (known in UML as an "actor") and a system, to
achieve a goal. The actor can be a human or an external system. In systems
engineering, use cases are used at a higher level than within software engineering,
often representing missions or stakeholder goals. The detailed requirements may
then be captured in SysML or as contractual statements.
A use case defines the interactions between external actors and the system
under consideration to accomplish a goal. Actors must be able to make decisions,
but need not be human: "An actor might be a person, a company or organization, a
computer program, or a computer system — hardware, software, or both." Actors
are always stakeholders, but many stakeholders are not actors, since they "never
interact directly with the system, even though they have the right to care how the
system behaves."
For example, "the owners of the system, the company's board of directors,
and regulatory bodies such as the Internal Revenue Service and the Department of
Insurance" could all be stakeholders but are unlikely to be actors.
Similarly, a person using a system may be represented as different actors because he
is playing different roles. For example, user "Joe" could be playing the role of a
Customer when using an Automated Teller Machine to withdraw cash from his own
account, or playing the role of a Bank Teller when using the system to restock the
cash drawer on behalf of the bank.
Actors are often working on behalf of someone else. A stakeholder may play
both an active and an inactive role: for example, a Consumer is both a "mass-market
purchaser" (not interacting with the system) and a User (an actor, actively
interacting with the purchased product).[13] In turn, a User is both a "normal
operator" (an actor using the system for its intended purpose) and a "functional
beneficiary" (a stakeholder who benefits from the use of the system).[13] For
example, when user "Joe" withdraws cash from his account, he is operating the
Automated Teller Machine and obtaining a result on his own behalf.
Conceptual modelling refers to specifying, visualizing, and documenting
models of, for instance, the context of use, a business model, or a software system.
The perspective of the terms in this category is rather technical.
Context of use refers to the characteristics of the users, tasks, and the organizational
and physical environment. Context of use may also describe the cognitive,
motivational and emotional characteristics of the different users, tasks, cooperative
behavior, articulation work and the organizational and physical environment. It is
derived from observations of real work and from interviews, including the reflexive point
of view of actors on their context of use. The analysis examines possible conflicts of interest
or need between different types of actors, and tries to anticipate different ways in which
a new tool or method could affect the content of the observed tasks and activities,
including the network and collaborative behavior. It covers both norms and
practices.
FIG 4.6 Use Case Diagram of Camouflaging Worm Detection System
Low Level Design Of The Modules
1. C-Worm detection Module
The C-Worm has a self-propagating behavior similar to traditional worms, i.e., it
intends to rapidly infect as many vulnerable computers as possible. However, the C-Worm
is quite different from traditional worms in that it camouflages any noticeable trends in
the number of infected computers over time. The camouflage is achieved by manipulating
the scan traffic volume of worm-infected computers. Such a manipulation of the scan traffic
volume prevents exhibition of any exponentially increasing trends or even crossing of
thresholds that are tracked by existing detection schemes.
Worm detection has been intensively studied in the past and can be generally
classified into two categories: “host-based” detection and “network-based” detection.
Host-based detection systems detect worms by monitoring, collecting, and analyzing worm
behaviors on end hosts. Since worms are malicious programs that execute on these
computers, analyzing the behavior of worm executables plays an important role in
host-based detection systems. Many detection schemes fall under this category [37]. In contrast,
network-based detection systems detect worms primarily by monitoring, collecting, and
analyzing the scan traffic (messages to identify vulnerable computers) generated by worm
attacks.
Many detection schemes fall under this category [19]. Ideally, security
vulnerabilities must be prevented to begin with, a problem that must be addressed by the
programming language community. However, while vulnerabilities exist and pose threats of
large-scale damage, it is critical to also focus on network-based detection, as this paper
does, to detect widely spreading worms. In order to rapidly and accurately detect Internet-wide
large-scale propagation of active worms, it is imperative to monitor and analyze the traffic
in multiple locations over the Internet to detect suspicious traffic generated by worms.
The widely adopted worm detection framework consists of multiple distributed
monitors and a worm detection center that controls the former [41]. This framework is
similar to other existing worm detection systems, such as the Cyber Center for
Disease Controller [11], Internet Motion Sensor [42], SANS ISC [23], Internet Sink [41], and
network telescope [43].
The monitors are distributed across the Internet and can be deployed at end hosts,
routers, or firewalls. Each monitor passively records irregular port-scan traffic, such as
connection attempts to a range of void IP addresses (IP addresses not being used) and
restricted service ports. Periodically, the monitors send traffic logs to the detection center.
The detection center analyzes the traffic logs and determines whether or not there are
suspicious scans to restricted ports or to invalid IP addresses.
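The monitor-to-center flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the deployed framework: the log record layout `(src_ip, dst_ip, dst_port)`, the `UNUSED_NETWORKS` and `RESTRICTED_PORTS` values, and the function names are all hypothetical and introduced here for the example.

```python
import ipaddress
from collections import Counter

# Hypothetical configuration: address blocks assumed to be unused ("void")
# and service ports treated as restricted. Both are illustrative values.
UNUSED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
RESTRICTED_PORTS = {135, 445}


def is_suspicious(dst_ip, dst_port):
    """Return True if a connection attempt targets a void IP or a restricted port."""
    addr = ipaddress.ip_address(dst_ip)
    to_void = any(addr in net for net in UNUSED_NETWORKS)
    return to_void or dst_port in RESTRICTED_PORTS


def suspicious_scans(traffic_logs):
    """Count suspicious connection attempts per source across all monitor logs.

    traffic_logs: iterable of (src_ip, dst_ip, dst_port) tuples, as a
    monitor might report them to the detection center (assumed format).
    """
    counts = Counter()
    for src_ip, dst_ip, dst_port in traffic_logs:
        if is_suspicious(dst_ip, dst_port):
            counts[src_ip] += 1
    return counts


logs = [
    ("10.0.0.5", "192.0.2.17", 80),     # void destination address
    ("10.0.0.5", "198.51.100.4", 445),  # restricted port
    ("10.0.0.9", "198.51.100.4", 80),   # ordinary traffic
]
print(suspicious_scans(logs))
```

A real detection center would apply a decision rule to these counts over time rather than inspect a single batch.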
Network-based detection schemes commonly analyze the collected scanning traffic
data by applying certain decision rules for detecting the worm propagation. For example,
Venkataraman et al. [20] and Wu et al. [21] proposed schemes to examine statistics of scan
traffic volume, Zou et al. presented a trend-based detection scheme to examine the
exponential increase pattern of scan traffic [19], and Lakhina et al. [40] proposed schemes to
examine other features of scan traffic, such as the distribution of destination addresses.
Other works study worms that attempt to take on new patterns to avoid detection
[39]. Besides the above detection schemes, which rely on global scan traffic monitoring
to detect anomalous traffic behavior, there are other worm detection and defense
schemes, such as sequential hypothesis testing for detecting worm-infected computers [44]
and payload-based worm signature detection [45]. In addition, Cai et al. [46] presented both
theoretical modeling and experimental results on a collaborative worm signature generation
system that employs distributed fingerprint filtering and aggregation and multiple edge
networks.
Dantu et al. [47] presented a state-space feedback control model that detects and
controls the spread of viruses or worms by measuring the velocity of the number of
new connections an infected computer makes. Despite the different approaches described
above, we believe that detecting widespread anomalous scanning behavior continues to be a useful
weapon against worms, and that, in practice, multifaceted defense has advantages.
2. Anomaly Detection Module
Since worms are malicious programs that execute on end hosts, analyzing the
behavior of worm executables plays an important role in host-based detection systems.
Many detection schemes fall under this category. In contrast, network-based detection
systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic
(messages to identify vulnerable computers) generated by worm attacks. Many detection
schemes fall under this category as well. Ideally, security vulnerabilities must be prevented to begin
with, a problem that must be addressed by the programming language community. However,
while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus
on network-based detection, as this paper does, to detect widely spreading worms.
In this section, we develop a novel spectrum-based detection scheme. Recall that the
C-Worm goes undetected by detection schemes that try to determine the worm propagation
only in the time domain. Our detection scheme captures the distinct pattern of the C-Worm
in the frequency domain, and thereby has the potential of effectively detecting the C-Worm
propagation. In order to identify the C-Worm propagation in the frequency domain, we use
the distribution of the power spectral density (PSD) of the scan traffic and its
corresponding spectral flatness measure (SFM).
Particularly, PSD describes how the power of a time series is distributed in the
frequency domain. Mathematically, it is defined as the Fourier transform of the
autocorrelation of a time series. In our case, the time series corresponds to the changes in
the number of worm instances that actively conduct scans over time. The SFM of PSD is
defined as the ratio of geometric mean to arithmetic mean of the coefficients of PSD. The
range of SFM values is [0, 1], and a larger SFM value implies a flatter PSD distribution and
vice versa.
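The two definitions above can be illustrated with a short sketch. It assumes a periodogram estimate of the PSD (the squared magnitude of the discrete Fourier transform divided by the series length, which by the Wiener-Khinchin relation equals the transform of the circular sample autocorrelation) and uses synthetic series rather than real scan traffic; the function names `psd` and `sfm` and the small `eps` guard are choices made for this example only.

```python
import cmath
import math


def psd(series):
    """Periodogram estimate of the PSD of a real-valued time series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency component
    coeffs = []
    for k in range(1, n // 2 + 1):  # positive frequencies only
        s = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        coeffs.append(abs(s) ** 2 / n)
    return coeffs


def sfm(psd_coeffs, eps=1e-12):
    """Spectral flatness: geometric mean over arithmetic mean of PSD coefficients."""
    vals = [p + eps for p in psd_coeffs]  # eps guards against log(0)
    geo = math.exp(sum(math.log(v) for v in vals) / len(vals))
    arith = sum(vals) / len(vals)
    return geo / arith


n = 64
# A single impulse has equal power at every frequency (flat PSD).
impulse = [1.0] + [0.0] * (n - 1)
# A pure sinusoid concentrates its power in one frequency bin (peaked PSD).
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]

print("SFM of impulse:", sfm(psd(impulse)))
print("SFM of sinusoid:", sfm(psd(sine)))
```

The impulse yields an SFM close to 1 and the sinusoid an SFM close to 0, matching the statement that a larger SFM corresponds to a flatter PSD.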
Notice that the frequency-domain analysis will require more samples in comparison
with the time-domain analysis, since the frequency-domain analysis technique, such as the
Fourier transform, needs to derive power spectrum amplitude for different frequencies. In
order to generate the accurate spectrum amplitude for relatively high frequencies, a high
granularity of data sampling will be required. In our case, we rely on ITM systems to collect
traffic traces from monitors (motion sensors) in a timely manner. As a matter of fact, other
existing detection schemes based on the scan traffic rate [20], variance [21], or trend [19]
will also demand a high-sampling frequency for ITM systems in order to accurately detect
worm attacks. Enabling the ITM system with timely data collection will benefit worm
detection in real time.
3. Pure Random Scan (PRS) Module
The C-Worm can be extended to defeat other newly developed detection schemes, such
as destination distribution-based detection. Recall that attack target
distribution-based schemes analyze the distribution of attack targets (the scanned destination
IP addresses) as basic detection data to capture a fundamental feature of worm
propagation, i.e., that worms continuously scan different targets.
Pure Random Scan Strategy: The worm propagator can randomly select computers
in cyberspace to identify whether a computer is vulnerable. For example, the pure random
scan (PRS) worm randomly scans the entire IPv4 address space [1, 19]. In this
model, worm-infected hosts do not have any prior vulnerability knowledge or
active/inactive information of other hosts. The worm-infected host randomly selects IP
addresses of victims from the global network IP address space and launches the attack on
those addresses. When a new host is infected, it continues the attack on the network via the
same method.
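For modeling purposes, the PRS behavior can be approximated by uniform sampling from an address space. The sketch below is a toy Monte Carlo estimate only: addresses are plain integers, the address space size and vulnerable set are made-up values, and `prs_hit_fraction` is a hypothetical helper name, not part of the original design.

```python
import random


def prs_hit_fraction(address_space_size, vulnerable, num_scans, seed=0):
    """Fraction of uniformly random scans that land on a vulnerable address."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    hits = sum(1 for _ in range(num_scans)
               if rng.randrange(address_space_size) in vulnerable)
    return hits / num_scans


# Toy address space of 10,000 addresses, 100 of which are vulnerable.
vulnerable_hosts = set(range(0, 10_000, 100))
fraction = prs_hit_fraction(10_000, vulnerable_hosts, num_scans=100_000)
print(fraction)
```

The estimate approaches the ratio of vulnerable addresses to all addresses (0.01 in this toy setup), which shows why most pure random scans find no target.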
The main shortcoming of this approach is that many IP addresses in the network are
not being used by any valid host. Thus, many scans are wasted when targeting nonexistent
hosts. To address this issue, improvements on random scan have been proposed to launch
selective scans by using the knowledge of network address allocation. For example, some
blocks of IP addresses are used by organizations or enterprises, and thus are more likely to
be well-maintained and less vulnerable.
Some other IP addresses are more likely to be occupied by personal computers, and
thus have a higher probability of being vulnerable [33]. Also, computers in the same subnetwork
are more likely to use similar system settings and may share the same vulnerabilities. Such
network topology-related information can be obtained through routing tables and DNS and
can improve the probability of successful identification by up to three times [34].
We describe a generic random scan algorithm by a sequence of iterates {X_k} on
iteration k = 0, 1, . . . which may depend on previous points and algorithmic parameters.
The current iterate X_k may represent a single point, or a collection of points, to include
population-based algorithms. The iterates are also capitalized to denote that they are random
variables, reflecting the probabilistic nature of the random search algorithm.
Generic Random Scan Algorithm
Step 0. Initialize algorithm parameters Θ_0, initial points X_0 ⊂ S and iteration index k = 0.
Step 1. Generate a collection of candidate points V_{k+1} ⊂ S according to a specific
generator and associated sampling distribution.
Step 2. Update X_{k+1} based on the candidate points V_{k+1}, previous iterates and algorithmic
parameters. Also update algorithm parameters Θ_{k+1}.
Step 3. If a stopping criterion is met, stop. Otherwise increment k and return to Step 1.
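One simple instance of the generic steps above is pure random search for minimizing a function over an interval. The sketch below is an assumption-laden illustration: the search space S is taken to be `[lower, upper]`, the sampling distribution is uniform and never updated, the stopping criterion is a fixed iteration budget, and the name `random_search` is hypothetical.

```python
import random


def random_search(objective, lower, upper, iterations=1000, seed=0):
    """Pure random search: minimize `objective` over the interval [lower, upper]."""
    rng = random.Random(seed)
    # Step 0: choose the initial point X_0 (iteration index starts at 0).
    best_x = rng.uniform(lower, upper)
    best_val = objective(best_x)
    for _ in range(iterations):  # Step 3: stop after a fixed budget
        # Step 1: generate one candidate point from the uniform distribution.
        candidate = rng.uniform(lower, upper)
        # Step 2: update the iterate; keep the candidate only if it improves.
        val = objective(candidate)
        if val < best_val:
            best_x, best_val = candidate, val
    return best_x, best_val


x, fx = random_search(lambda v: (v - 2.0) ** 2, -10.0, 10.0)
print(x, fx)
```

Here the algorithm parameters stay fixed between iterations; adaptive variants would also change the sampling distribution in Step 2.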
4. Worm propagation Module
In an open-loop control system, worm scan traffic volume is much more likely
to show an increasing trend as worm propagation progresses. As
more and more computers get infected, they, in turn, take part in scanning other computers.
Hence, we consider the C-Worm as a worst-case attacking scenario that uses closed-loop
control for regulating the propagation speed based on the feedback propagation status.
To analyze the C-Worm, we adopt the epidemic dynamic model for disease
propagation, which has been extensively used for worm propagation modeling [2]. Based on
existing results [12], this model matches the dynamics of real-worm propagation over the
Internet quite well. For this reason, similar to other publications, we adopt this model in our
paper as well.
Since our investigated C-Worm is a novel attack, we modified the original epidemic
dynamic formula to model the propagation of the C-Worm by introducing P(t), the
attack probability that a worm-infected computer participates in worm propagation at time t.
We note that there is wide scope to improve our modified model in the future to
reflect several characteristics that are relevant in real-world practice.
Particularly, the epidemic dynamic model assumes that any given computer is in one
of the following states: immune, vulnerable, or infected. An immune computer is one that
cannot be infected by a worm; a vulnerable computer is one that has the potential of being
infected by a worm; an infected computer is one that has been infected by a worm.
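The section does not state the model formula, so the following sketch rests on an assumption: it uses the classic simple epidemic form dM(t)/dt = β · M(t) · [N − M(t)], where M(t) is the number of infected computers, N the number of vulnerable computers and β the pairwise infection rate, and it multiplies the infection term by an attack probability P(t). The function name, the parameter values and the forward Euler integration are illustrative choices, and immune computers are simply left out of N.

```python
def simulate_epidemic(n_vulnerable, beta, attack_prob, m0=1.0, dt=1.0, steps=500):
    """Forward Euler integration of dM/dt = beta * M * P(t) * (N - M).

    n_vulnerable: N, total number of vulnerable computers (assumed constant)
    beta: pairwise infection rate (assumed constant)
    attack_prob: function mapping time t to P(t) in [0, 1]
    Returns the list of M values, one per time step (including the start).
    """
    m = m0
    history = [m]
    for step in range(steps):
        t = step * dt
        dm = beta * m * attack_prob(t) * (n_vulnerable - m) * dt
        m = min(n_vulnerable, m + dm)  # M cannot exceed N
        history.append(m)
    return history


# P(t) = 1 corresponds to the unmodified model; a smaller P(t) slows growth.
full = simulate_epidemic(10_000, 2e-6, lambda t: 1.0)
throttled = simulate_epidemic(10_000, 2e-6, lambda t: 0.2)
print(round(full[-1]), round(throttled[-1]))
```

With a constant P(t) below 1 the infection count grows more slowly over the same horizon; a closed-loop worm would instead vary P(t) over time based on feedback.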
Algorithm for detecting worm propagation:
Step 1. Collect traffic in the local network
Step 2. Create a suspicious list from outbound traffic
Step 3. foreach (record in suspicious list) do
Step 4. if (destination addresses have a sequential distribution)
Step 5. then ‘worm alert’
Step 6. else if (destination addresses contain unused IP addresses)
Step 7. then ‘worm alert’
Step 8. else if (the number of distinct addresses of inbound traffic with the related port
is large)
Step 9. then ‘worm alert’
Step 10. else ‘the record is normal activity’
Step 11. End For.
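The per-record checks in the steps above might be sketched as follows. The record layout (a dictionary with `destinations` and `inbound_sources`), the use of integers for addresses, the run length that counts as "sequential" and the inbound threshold are all assumptions made for this example; the source does not specify them.

```python
def is_sequential(addresses, min_run=5):
    """True if the distinct addresses contain a run of consecutive values."""
    ordered = sorted(set(addresses))
    run = 1
    for prev, cur in zip(ordered, ordered[1:]):
        run = run + 1 if cur == prev + 1 else 1
        if run >= min_run:
            return True
    return False


def classify(record, unused_addresses, inbound_threshold=100):
    """Apply the three checks (Steps 4 to 10) to one suspicious-list record."""
    dests = record["destinations"]
    if is_sequential(dests):                        # Step 4
        return "worm alert"
    if any(d in unused_addresses for d in dests):   # Step 6
        return "worm alert"
    if len(set(record["inbound_sources"])) > inbound_threshold:  # Step 8
        return "worm alert"
    return "normal"                                 # Step 10


unused = {5000, 5001}  # hypothetical unused addresses
suspicious_list = [
    {"host": "A", "destinations": [100, 101, 102, 103, 104], "inbound_sources": []},
    {"host": "B", "destinations": [7, 5000, 93], "inbound_sources": []},
    {"host": "C", "destinations": [7, 42, 93], "inbound_sources": list(range(150))},
    {"host": "D", "destinations": [7, 42, 93], "inbound_sources": [1, 2]},
]
results = {r["host"]: classify(r, unused) for r in suspicious_list}
print(results)
```

Hosts A, B and C each trigger one of the three conditions, while host D is classified as normal activity.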
We think our algorithm can effectively detect random, sequential and other
intelligent worms, such as selective-random scan worms. It also lets us identify infected hosts in
the local network and take proper actions against those hosts. In addition, our algorithm can be
applied to a real network containing many worms that have not been removed. It detects not only the
appearance of a new worm but also already existing worms.