independent study report # 1

8/4/2019 Independent Study Report # 1


Independent Study Report

Artificial Immune Systems

1. Introduction:

The biological immune system is a complex, adaptive system that defends the body against attack by antigens and pathogens. It distinguishes between self cells and non-self cells. It does so through a distributed, parallel organization that can take appropriate action from both a local and a global view, using a network of chemical messengers for communication.

There are two major branches of the immune system:

1. The innate immune system is a static system that identifies and destroys antigens;
2. The adaptive immune system reacts to previously unknown antigen patterns and develops a response to the encountered antigens that can remain in the body for a long time.

This remarkable information-processing capability of the biological immune system has caught the attention of computer engineers around the world for applications in computer security, anomaly detection, fault tolerance, pattern recognition, etc.

The field has also found applications in robotics and, in some cases, in optimization tasks.

2. Overview of Biological Immune Systems:

The biological immune system has evolved over millions of years into an elaborate defense system. It employs multilevel, overlapping defenses that operate in a parallel and distributed way, although the immune mechanisms (innate and adaptive) and processes (humoral and cellular) are not yet completely understood.



The biological immune system responds to an attack either by neutralizing the antigenic effect or by destroying the antigen. The response depends on the type of antigen and on the way it enters the body.

The crucial features of the biological immune system are:

a. Affinity (matching)
b. Diversity
c. Distributed operation (no central mechanism)

Affinity, or matching degree, refers to the strength of binding between an antibody and an antigen.

Diversity means that there must be many different antibody types that can act as keys to antigen locks.

Distributed control means that no central mechanism governs the immune response when an antigen attacks; there are only local interactions between immune cells and antigens.

Two types of immune cells play an important role in the immune response:

1. B-cells (bone marrow),
2. T-cells (thymus).

Both types originate in the bone marrow, but T-cells migrate to the thymus to mature before circulating through the body in the blood. There are three types of T-cells:

a. Helper T-cells
These cells are important for the activation of B-cells.

b. Killer T-cells
These cells attach to foreign invaders and inject destructive chemical molecules into them, thereby destroying them.

c. Suppressor T-cells
This type of T-cell suppresses autoimmune interactions between cells and thereby contributes to the stabilization of the network.

B-cells, on the other hand, are responsible for producing antibodies that bind to antigens and cause them to be eliminated. Each B-cell generates only one type of antibody (of which there are millions).

In the figure below, I-II show the invader entering the body and activating the T-cells, which in IV activate the B-cells; V is the antigen matching, VI the antibody production, and VII the destruction of the antigen.

Figure (1) Immune system Cells [6]



From the above description, one can say that the innate immune system is responsible for the primary response and the adaptive immune system for the secondary response.

Hence, the human body is protected against foreign invaders by a multilevel system.

The biological immune system also includes physical and chemical barriers such as the skin, the respiratory system, destructive enzymes, and stomach acids. The immune system is divided into two branches:

1. Innate immunity (non-specific);
2. Adaptive immunity (specific).

These two systems are linked and influence each other.

There are in turn two types of adaptive immunity:

a. Humoral immunity,
b. Cell-mediated immunity.

1. Innate immunity:

This immunity is congenital. pH, temperature, and chemical conditions create an unfavorable living environment for foreign organisms. Extracellular molecules are ingested by macrophages, and this ingestion is influenced by chemical messengers called lymphokines. Because foreign surfaces typically lack the sialic acid found on host cells, the complement fragment C3b stays bound to them for a longer time. This leads to the formation of the membrane attack complex (MAC), which penetrates the cell surface and kills the foreign cell.

2. Adaptive immunity:

It is crucial for learning and memory.

a. Humoral Immunity
This kind of immunity is mediated by antibody molecules contained in body fluids, historically termed humors. It involves the interaction between B-cells and antigens and the subsequent proliferation and formation of memory cells. When an antibody interacts with an antigen, the antigen can be destroyed in several ways. For instance, antibodies can cross-link antigens, forming clusters that are more readily ingested by macrophages.

b. Cell-Mediated Immunity
As the name indicates, this immunity is mediated by cells, namely T-cells. Cytotoxic T-cells participate in cell-mediated immune reactions by killing altered self cells. Cytokines secreted by TDH cells can also mediate this kind of cellular immunity.

3. Artificial Immune Systems Basic Concepts

3.1 Initialization and Encoding:

To implement an artificial immune system, four design decisions must be considered:

1. Encoding
2. Similarity measure
3. Selection
4. Mutation

Once an encoding has been chosen, a similarity measure is defined to calculate the degree of matching; selection and mutation are then performed repeatedly until a stopping criterion is reached.



The choice of encoding scheme is very important for the success of the algorithm. As in genetic algorithms, there is a close relationship between the encoding and the fitness function; in artificial immune systems the fitness function is simply the matching or affinity measure.

Two terms now need to be defined: antigen and antibody. An antigen is the target or solution for a given problem, for example the data item to be checked or an intrusion into a system. The antibodies are the remaining data, e.g., the other users in the data set or the network traffic.

Antigens and antibodies are encoded in the same way. The most common way is a string representation, where the length is the number of variables, the position identifies the variable, and the entry at that position is the value of the variable.

For data mining and intrusion detection, a five-variable binary problem can be represented as: (10010)

Example:

Data Mining: The problem of recommending movies.

The encoding represents a user's profile in terms of the movies seen and how much each was liked or disliked. A list of numbers representing the votes can serve as the encoding. The votes can be binary, or they can be integers in a range such as [0, 5], where 0 indicates that the movie was not liked and 1 to 5 rate how much the movie was appreciated.

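A possible encoding scheme for movie recommendation:

\[ \mathrm{User} = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\} \tag{1} \]

id = identifier of a movie,

score = score given to that movie by the user.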



Intrusion Detection:

The encoding looks like:

[ ], example: [

which represents an incoming data packet sent to port 25. In such scenarios, wildcards such as "any port" are also often used [2,4].

3.2 Similarity or Affinity Measure

The matching degree is one of the most important design choices in developing an artificial immune system algorithm. Two matching measures for the binary representation are described below.

Consider the two strings:

(0 0 0 0 0) and (0 0 0 1 1)

A bit-by-bit comparison shows that the strings differ in the last two bits, so three bits match and the matching score is 3. This measure is the complement of the Hamming distance, which counts the bits that differ, i.e., the bits that would have to be changed to make the strings identical.

Now consider the strings (00000) and (01010). Once again the score is 3, even though the matching bits are distributed quite differently, which can be a problem. To avoid this anomaly, we can instead take the length of the longest run of contiguous matching bits as the similarity measure. The score is then 3 for the first example and 1 for the second. If a binary representation is not desired, a real-valued representation can be used, in which case the Euclidean distance between the two vectors can serve as the measure.
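The two binary measures above can be sketched in a few lines of Python. This is only an illustration: the function names matching_bits and longest_contiguous_match are chosen here and do not come from the cited references.

```python
def matching_bits(a, b):
    """Count the positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x == y for x, y in zip(a, b))


def longest_contiguous_match(a, b):
    """Length of the longest run of consecutive agreeing positions."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0  # extend the run or reset it
        best = max(best, run)
    return best


# The two examples from the text.
print(matching_bits("00000", "00011"), longest_contiguous_match("00000", "00011"))  # 3 3
print(matching_bits("00000", "01010"), longest_contiguous_match("00000", "01010"))  # 3 1
```

Both pairs score 3 under the plain bit count, while the contiguous measure separates them (3 versus 1), which is exactly the distinction the text draws.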



For data mining, the matching degree is referred to as correlation. In the movie recommendation example, we look for the users in the data set whose profiles are similar to the profile of the target user. For this we can use the Pearson correlation coefficient between two users.

Let u and v be two users:

\[ r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \, \sum_{i=1}^{n} (v_i - \bar{v})^2}} \tag{2} \]

Here n is the number of movies for which both u and v have voted, u_i is the vote of user u for movie i, and \bar{u} is the average vote of user u over all the movies; v_i and \bar{v} are defined likewise for user v. The measure is amended to default to a value of 0 if the two users have no films in common.

The output ranges from -1, indicating strong disagreement, to 1, indicating strong agreement; 0 means no correlation. For data mining, values near 1 and -1 are the most informative.
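A minimal Python sketch of this correlation measure is given below. It assumes that each user profile is a dictionary mapping a movie identifier to a vote, that the sums run over the movies both users have voted on, and that each user's average is taken over all of that user's votes, as described above. The dictionary layout and the function name pearson are illustrative assumptions, as is returning 0 when one of the users shows no variation.

```python
from math import sqrt


def pearson(u, v):
    """Pearson-style correlation between two user profiles.

    u and v map a movie id to a vote. Returns 0.0 when the users have no
    movie in common or when one of them shows no variation.
    """
    common = set(u) & set(v)
    if not common:
        return 0.0  # default value for users with no films in common
    mean_u = sum(u.values()) / len(u)  # average over all of u's votes
    mean_v = sum(v.values()) / len(v)
    num = sum((u[m] - mean_u) * (v[m] - mean_v) for m in common)
    den_u = sum((u[m] - mean_u) ** 2 for m in common)
    den_v = sum((v[m] - mean_v) ** 2 for m in common)
    if den_u == 0 or den_v == 0:
        return 0.0  # assumption: treat a flat profile as uncorrelated
    return num / sqrt(den_u * den_v)


alice = {"m1": 1, "m2": 2, "m3": 3}
bob = {"m1": 2, "m2": 4, "m3": 6}    # same taste as alice
carol = {"m1": 3, "m2": 2, "m3": 1}  # opposite taste
print(pearson(alice, bob), pearson(alice, carol))
```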

In the negative selection algorithm, the elements that match are eliminated, mirroring B-cell maturation, in which cells that match self molecules or cells are removed.

The question now arises of where negative selection is applied in the implementation of artificial immune systems.

Consider intrusion detection. One approach is to define a self set S and to randomly initialize a set of detectors. The detectors are then compared against the self set using the matching algorithm. Any detector that matches self is rejected, so only the elements that do not match self remain. These non-matching elements form the final detector set.

This detector set is then used to continually monitor the network. If a detector matches, this is a sign of danger and an alert is raised.

Artificial immune systems, a branch of computational intelligence that emerged in the 1990s, are used in computer security, pattern recognition, etc. [2,4,6].

4. Biological Immune System Models

4.1 Negative Selection Principle

The thymus is responsible for the maturation of T-cells and is shielded by a blood barrier that excludes non-self antigens. Hence, the majority of the cells present in the thymic environment are self rather than non-self. As a consequence, T-cells whose receptors recognize self cells are eliminated in the thymus through the biological process termed negative selection. All mature cells that leave the thymic environment are self-tolerant, i.e., they do not react to self cells.

From an information-processing point of view, negative selection performs pattern recognition by storing information about the non-self complement of the patterns to be protected. Taking inspiration from biology, the negative selection algorithm has been proposed for anomaly detection and fault tolerance.

Define the set to be protected as the self set (P). Generate a set of detectors (M) that detect all elements not belonging to P. The negative selection algorithm proceeds as follows:

1. Produce random candidate elements (C);
2. Compare the elements of C with those of P. If an element of C matches an element of P, discard it; otherwise store it in M.

Once the set M has been created, the next step is to monitor the system for non-self patterns. Consider a set to be monitored, call it P*. It may consist of the elements of P plus some new patterns, or it may be an entirely new set. Each element of M, which corresponds to a non-self pattern, is checked against the elements of P*; if it matches one, a non-self pattern has been recognized and an action is taken [12].

Figure (2) Negative Selection Principle [12]
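The censoring and monitoring phases described above can be sketched as follows. This is a minimal illustration under stated assumptions: patterns are bit strings of equal length, and two strings are taken to match when they agree on at least r contiguous positions. The matching rule, the parameter r, and all function names are assumptions made for this sketch and are not prescribed by [12].

```python
import random


def matches(detector, pattern, r):
    """True if the two strings agree on at least r contiguous positions."""
    run = 0
    for x, y in zip(detector, pattern):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, n_detectors, length, r, rng, max_tries=100_000):
    """Censoring phase: keep random candidates that match no self string."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, r) for s in self_set):
            detectors.append(candidate)  # candidate is non-self: store it in M
    return detectors


def is_nonself(pattern, detectors, r):
    """Monitoring phase: flag a pattern if any detector matches it."""
    return any(matches(d, pattern, r) for d in detectors)


rng = random.Random(0)
self_set = ["00000000", "00001111"]
detectors = generate_detectors(self_set, n_detectors=10, length=8, r=4, rng=rng)
print(detectors)
print([is_nonself(s, detectors, 4) for s in self_set])  # self is never flagged
```

By construction no detector matches a self string, so a match during monitoring can only come from a pattern outside the self set. How much of the non-self space is covered depends on the number of detectors and on r.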

4.2 Clonal Selection

Clonal selection theory describes how an immune response is mounted when a non-self pattern is recognized by a B-cell; it is complementary to negative selection. Figure (3) illustrates clonal selection, proliferation, and affinity maturation. When a B-cell recognizes an antigen with a certain degree of affinity, it is selected to proliferate and to produce a large volume of antibodies, which bind to the antigens and lead to their elimination with the aid of other immune cells. Proliferation is asexual: it is a mitotic process in which the cells divide. The B-cell clones undergo hypermutation, which results in B-cells with a higher affinity for the antigen. Some B-cells also become memory cells.

From the computational point of view:

1. An antigen selects immune cells to proliferate. The rate of proliferation is directly proportional to the affinity: the higher the affinity, the higher the proliferation.

2. The mutation rate is inversely proportional to the affinity.

Figure (3) Clonal selection [12]



Genetic algorithms without a crossover operator are similar to clonal selection. However, the standard genetic algorithm has neither affinity-proportional reproduction nor affinity-dependent mutation. The CLONALG algorithm was proposed to include these properties. It was originally proposed for pattern recognition and later modified for optimization tasks.

Given a set of patterns P to be recognized, the steps of the CLONALG algorithm are as follows:

1. Randomly generate a population of individuals (M).

2. Present each pattern of P to the population M and determine its affinity with every element of M.

3. Select the individuals of M with the best affinity. Produce copies of these elements in proportion to their affinity with the antigen: the higher the affinity, the larger the number of copies.

4. Mutate all the copies at a rate that depends inversely on their affinity to the input pattern: the higher the affinity, the lower the mutation rate.

5. Add the mutated elements to M and reselect the best-matured elements as the memories of the system.

6. Repeat steps 2 to 5 until a stopping criterion is met, such as a minimum pattern recognition or classification error.

This algorithm enables artificial immune systems to become good at pattern recognition: CLONALG learns to recognize patterns through evolutionary-like behavior [12].
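The steps above can be illustrated with a simplified Python sketch. It is not the reference CLONALG implementation: it assumes bit-string patterns, uses the number of agreeing bits as affinity, allots clones by rank as a stand-in for affinity-proportional cloning, and stops after a fixed number of generations. All names and parameter values are illustrative.

```python
import random


def affinity(cell, antigen):
    """Number of agreeing bits; higher means a better match."""
    return sum(c == a for c, a in zip(cell, antigen))


def mutate(cell, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if rng.random() < rate else b for b in cell)


def clonalg(antigens, length, pop_size=20, n_select=5, clone_factor=2,
            generations=30, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population M.
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(pop_size)]
    memory = {}
    for _ in range(generations):  # step 6: fixed iteration budget
        for ag in antigens:
            # Steps 2-3: rank M by affinity to the antigen and select the best.
            ranked = sorted(population, key=lambda c: affinity(c, ag), reverse=True)
            clones = []
            for rank, cell in enumerate(ranked[:n_select]):
                n_clones = clone_factor * (n_select - rank)  # better cells: more clones
                rate = 1.0 - affinity(cell, ag) / length     # step 4: better cells mutate less
                clones.extend(mutate(cell, rate, rng) for _ in range(n_clones))
            # Step 5: merge the clones into M, keep the best cells, update memory.
            population = sorted(population + clones,
                                key=lambda c: affinity(c, ag), reverse=True)[:pop_size]
            best = population[0]
            if ag not in memory or affinity(best, ag) > affinity(memory[ag], ag):
                memory[ag] = best
    return memory


patterns = ["10101010", "11110000"]
memory = clonalg(patterns, length=8)
for ag, cell in memory.items():
    print(ag, cell, affinity(cell, ag))
```

Because the population is truncated by affinity to the antigen currently presented, this sketch keeps one memory cell per antigen separately; a fuller implementation would also inject new random cells to maintain diversity.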



4.3 Immune Network

Immune network theory states that the immune system exhibits dynamic behavior even in the absence of antigens. How does this happen? It is proposed that immune cells and molecules are able to recognize each other. The theory has been criticized by many immunologists, but the computational features of immune networks are very important in robotics.

According to this theory, the parts on the surface of an antibody that are recognized by other antibodies are called idiotopes.

To explain the theory, assume that antibody Ab1 recognizes antigen Ag, and that Ab1 also recognizes an idiotope of antibody Ab2. Ab1 thus recognizes both Ab2 and Ag, and we say that Ab2 is the internal image of Ag. Such recognition of idiotopes between molecules gives rise to a network of connected cells, i.e., a network of affinities. Within this network, antibody-antibody recognition results in network suppression, while antibody-antigen recognition results in network activation and cell proliferation.

Computationally, the suppression caused by one antibody recognizing another is modeled by eliminating all but one of the self-recognizing cells.

Figure (4) Immune Network [12]



Set (P) contains the patterns to be recognized.

1. Randomly generate the initial network population.

2. For every element of P, apply CLONALG, which returns a set M* of memory cells and their coordinates for the current antigen.

3. Calculate the affinity between all elements of M*.

4. Eliminate all but one of the elements of M* whose mutual affinity exceeds a prescribed threshold. The intent is to eliminate redundancy in the network by suppressing self-recognizing elements.

5. Combine the remaining elements of step 4 with the elements retained for the antigens presented previously. This gives the set M.

6. Calculate the matching degree between every pair of elements of M and again suppress all but one of the self-recognizing elements.

7. Repeat steps 2 to 6 until the desired result is attained [12].
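The suppression used in steps 4 and 6 can be sketched as a greedy filter that keeps one representative of every group of mutually similar cells. The sketch below assumes bit-string cells and measures similarity as the number of agreeing bits; the threshold value and the function names are illustrative assumptions.

```python
def similarity(a, b):
    """Number of agreeing bits between two equal-length bit strings."""
    return sum(x == y for x, y in zip(a, b))


def suppress(cells, threshold):
    """Keep a cell only if it is not too similar to a cell already kept."""
    kept = []
    for cell in cells:
        if all(similarity(cell, k) < threshold for k in kept):
            kept.append(cell)
    return kept


memory_cells = ["0000", "0001", "1111", "1110"]
print(suppress(memory_cells, threshold=3))  # ['0000', '1111']
```

The result depends on the order in which the cells are visited, so in practice the cells would usually be sorted by affinity to the antigen first, so that the best representative of each group survives.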

5. Modeling the Biological Immune Systems

5.1 Shape-Space Model:

The interaction between antibody and antigen is central to the immune system. The concept of shape-space was introduced by Perelson and Oster in 1979 to describe the interactions between immune cell molecules and antigens quantitatively.

According to this concept, an antigen can be recognized if it lies within a region, known as the recognition region, around an antibody. Binding between an antibody and an attacking antigen usually involves short-range non-covalent interactions based on electrostatic charge, hydrogen bonding, van der Waals forces, etc. The molecules must interact with each other over a sufficient portion of their respective surfaces, so there must be extensive regions of complementarity.

The presence of chemical groups, together with the shape and charge distributions, are the characteristic properties of antigens and antibodies that determine the interactions between these molecules. This set of features is called the generalized shape of a molecule [1].

Suppose that the generalized shape of an antibody combining site can be described by L parameters: the length, height, and width of any bump or groove in the combining site, its charge, etc. The exact number of parameters and their values need not be known. A point in the L-dimensional space, called the shape-space, then specifies the generalized shape of an antigen-binding region with regard to its antigen-binding properties.

If an organism has a repertoire of size N, the shape-space contains N points. These points lie in a finite volume V of the space, because there is only a limited range of lengths, widths, charges, etc. that an antibody combining site can assume. Since Ag-Ab interactions are measured via regions of complementarity, antigenic determinants (epitopes) are also characterized by generalized shapes whose complements lie within V.

An antigen and an antibody need not match exactly; they may match with lower affinity. Each paratope therefore interacts with all the epitopes that lie within a small surrounding region of radius e, its recognition region. Because each antibody can recognize every epitope within its recognition region, and because an antigen can present several different types of epitope, a finite number of antibodies can recognize an almost infinite number of points in the volume V. This is related to the cross-reactivity phenomenon in biological immune systems.

Figure (5) Shape-Space Model [6]

In the shape-space model, similar patterns thus occupy adjacent regions of the shape-space and may be recognized by the same antibody shape, provided that e is large enough [6].

5.2 Ag-Ab Representations and Affinities:

The Ag-Ab representation determines which distance measure can be used to calculate the degree of interaction between these molecules.

Mathematically, there are three ways to represent antibody-antigen pairs and to determine their matching strength:

1. Euclidean shape-space
2. Manhattan shape-space
3. Hamming shape-space [4]



The generalized shape of a molecule (m), either an antibody or an antigen, can be represented by a set of real-valued coordinates m = <m1, m2, ..., mL>, so that m is a point in an L-dimensional real-valued shape-space.

The affinity between an antibody and an antigen is related to the distance between their strings or vectors, for example the Euclidean or the Manhattan distance. If the coordinates of an antibody are given by Ab = <ab1, ab2, ..., abL> and the coordinates of an antigen by Ag = <ag1, ag2, ..., agL>, then the distance (D) between them is:

\[ D = \sqrt{\sum_{i=1}^{L} (ab_i - ag_i)^2} \tag{3} \]

\[ D = \sum_{i=1}^{L} \left| ab_i - ag_i \right| \tag{4} \]

Eqn (3) gives the Euclidean distance and Eqn (4) the Manhattan distance.

Shape-spaces that use real-valued coordinates and measure distance in the form of Eqn (3) are called Euclidean shape-spaces, and those that use the form of Eqn (4) are called Manhattan shape-spaces.

Another shape-space is the Hamming shape-space, in which antigens and antibodies are represented as sequences of symbols over an alphabet of size k. Such sequences can be interpreted as peptides, and the different symbols as the characteristic properties of amino acids. In the context of artificial immune systems, shape and sequence are treated as equivalent.

\[ D = \sum_{i=1}^{L} \delta_i, \qquad \delta_i = \begin{cases} 1 & \text{if } ab_i \neq ag_i \\ 0 & \text{otherwise} \end{cases} \tag{5} \]

Equation (5) gives the Hamming distance.
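The three distance measures can be written directly in Python. The sketch below assumes the standard definitions of the Euclidean, Manhattan, and Hamming distances; the function names are chosen for illustration only.

```python
from math import sqrt


def euclidean(ab, ag):
    """Euclidean distance between two real-valued coordinate vectors."""
    return sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def manhattan(ab, ag):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - g) for a, g in zip(ab, ag))


def hamming(ab, ag):
    """Hamming distance: number of positions whose symbols differ."""
    return sum(a != g for a, g in zip(ab, ag))


print(euclidean([0, 0], [3, 4]))  # 5.0
print(manhattan([0, 0], [3, 4]))  # 7
print(hamming("10010", "10100"))  # 2
```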

Equations (3) to (5) show how to determine the affinity between molecules in Euclidean, Manhattan, and Hamming shape-spaces, respectively. To study cross-reactivity, it is important to establish the relation between the distance D, the recognition region, and the matching threshold.

When the distance between two sequences is maximal, the molecules are exact complements of each other and their affinity is also maximal. When the match is not perfect, real-valued spaces have to be treated differently from Hamming spaces in measuring Ag-Ab interactions.

In Euclidean and Manhattan shape-spaces, a limit on the magnitude of each shape-space parameter can be imposed. Moreover, the distance can be normalized, for example to the interval [0, 1], so that the matching strength also lies in that range.

If a binary representation is assumed, the Ag-Ab interaction has a clear graphical interpretation in Hamming shape-space. With bitstrings, molecular binding takes place only when the bitstrings are complementary to each other. For example,

ab =

ag =



Figure (6) Antigen- Antibody perfect matching using bit-string representation [6]

The affinity between an antibody and an antigen is the number of complementary bits in their strings, which can be computed with the XOR operator. The expected matching strength between two randomly chosen bitstrings of equal length is half their length.

A binding value indicates whether the molecules are bound, i.e., whether the antigen is recognized by the antibody. Several activation functions can be used to map the distance between the Ab and Ag molecules to a binding value. With a threshold function, a bond is established only when the match score is at least (L - e).

In the continuous case, a sigmoid function is appropriate, with e located at the inflection point of the curve.
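The XOR-based affinity and the threshold activation can be sketched as follows. The sketch assumes equal-length bitstrings and assumes that a bond forms when at least L - e bits are complementary, so that e = 0 demands a perfect complement; the function names are illustrative.

```python
def complementarity(ab, ag):
    """Number of complementary bits, i.e. the popcount of ab XOR ag."""
    if len(ab) != len(ag):
        raise ValueError("bitstrings must have the same length")
    return bin(int(ab, 2) ^ int(ag, 2)).count("1")


def binds(ab, ag, e):
    """Threshold activation: bind if at least L - e bits are complementary."""
    return complementarity(ab, ag) >= len(ab) - e


print(complementarity("0011", "1100"))  # 4: exact complements
print(complementarity("0011", "1101"))  # 3
print(binds("0011", "1101", e=0), binds("0011", "1101", e=1))  # False True
```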

In the Hamming shape-space, the set of all possible antigens is considered as a space of points, in which antigenic molecules with similar shapes occupy adjacent points. The total number of unique antibodies and antigens is k^L, where k is the size of the alphabet and L is the length of the string.

A given antibody covers some portion of the shape-space, namely the set of antigens it recognizes. The matching threshold e determines the coverage provided by a single antibody; when e = 0, a perfect match is necessary, meaning that the antibody and the antigen must be exact complements of each other.

The number of antigens covered within a region of radius e is given by:

\[ C = \sum_{i=0}^{e} \binom{L}{i} = \sum_{i=0}^{e} \frac{L!}{i!\,(L-i)!} \tag{6.1} \]

C = coverage of the antibody,

L = length of the bitstring,

e = matching threshold.

On the basis of Eqn (6.1), for a bitstring of length L and a matching threshold e, the minimum number of antibody molecules (N) necessary to cover the whole shape-space is

\[ N = \mathrm{ceil}\left( \frac{2^{L}}{C} \right) \tag{6.2} \]

where ceil is the operator that rounds the value in parentheses up to the nearest integer [2,4,6].
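As a small worked example, the coverage and the minimum repertoire size can be computed in Python. The sketch assumes a binary alphabet, so the shape-space contains 2^L points, with coverage C equal to the sum of the binomial coefficients from i = 0 to e; the function names are illustrative.

```python
from math import comb


def coverage(L, e):
    """Number of bitstrings within e mismatches of a perfect complement."""
    return sum(comb(L, i) for i in range(e + 1))


def min_antibodies(L, e):
    """Smallest N with N * C >= 2**L (integer ceiling division)."""
    c = coverage(L, e)
    return -(-(2 ** L) // c)


for e in (0, 1, 2):
    print(e, coverage(8, e), min_antibodies(8, e))
# e = 0: C = 1,  N = 256
# e = 1: C = 9,  N = 29
# e = 2: C = 37, N = 7
```

This value of N is a lower bound: it assumes that the recognition regions of the antibodies tile the shape-space without overlapping.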



6. The AIS Model

The artificial immune system model proposed by J.D. Farmer and N.H. Packard is simple enough to simulate on a computer, yet it contains enough realism to embody the characteristic properties of the network. The model leaves out many important features, such as T-cells and macrophages, but is intended to retain the essence of the idiotypic network.

The sequences of amino acids that specify the chemical properties of the epitope and the paratope are represented as binary strings. In this view, the antibodies are composed of just two kinds of amino acid, 0 and 1. Alternatively, a sequence of five bits can be taken to correspond to one amino acid, which is enough to represent the twenty amino acids. A further simplification is that each antigen and antibody has only one epitope, whereas in reality an antigen or antibody has many different epitopes [5].

Thus, an antibody is represented as (p, e), where p is the paratope string and e is the epitope string. The reactions allowed between different antibodies, and between antibodies and antigens, are found by searching for complementary matches between strings.

Exact matching of the strings is not required. The strings are allowed to match in any possible alignment, in order to model the fact that two molecules may react in more than one way. Let l_e be the length of the epitope string and l_p the length of the paratope string. A matching threshold s, with s <= min(l_e, l_p), is defined, below which two antibodies do not react at all. Let e_i(n) denote the value of the n-th bit of the i-th epitope string and p_j(n) the value of the n-th bit of the j-th paratope string [1,2].



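Now, the matrix of matching specificities m_ij is given by:

\[ m_{ij} = \sum_{k} G\left( \sum_{n} e_i(n+k) \oplus p_j(n) - s + 1 \right) \tag{7} \]

In equation (7), \oplus represents the exclusive-or operation used for complementary matching, e_i(n) is the n-th bit of the i-th epitope string, and p_j(n) is the n-th bit of the j-th paratope string.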

6.1 Procedure Used for Computing Partial Matches:

Figure (7) Epitope and Paratope string matching [5]

In this example, l_e = l_p = 8 and s = 6. Alignments with -2 <= k <= 2 are possible. The figure shows k = -1, so that e_i(n - 1) is compared with p_j(n). For this example, G = 1 for k = -1 and G = 0 for all other values of k, hence m_ij = 1.

Here G(x) = x for x > 0 and G(x) = 0 otherwise. The sum over n ranges over all possible positions on the epitope and paratope; the sum over k allows the epitope to be shifted with respect to the paratope. G measures the strength of a possible reaction between the epitope and the paratope. For a given alignment, i.e., a given value of k, G is 0 if fewer than s bits are complementary and G = 1 + δ when s or more bits are complementary, where δ is the number of complementary bits in excess of the threshold s. If matches occur at more than one alignment, their strengths are summed, reflecting the assumption that molecules that can interact in more than one way react more strongly, because they spend more time together than molecules that can interact in only one alignment [5].

In this model, free antibodies and antibodies attached to cells are lumped together, and only the total number of antibodies of a given type i is tracked, through the concentration variable xi.
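The partial-matching procedure can be sketched in Python as follows. The sketch assumes that the match specificity has the form m_ij = sum over k of G(sum over n of [e_i(n+k) XOR p_j(n)] - s + 1), with G(x) = x for x > 0 and 0 otherwise, as described in the text. Bit strings are given as lists of 0/1 integers, and the function names are illustrative.

```python
def G(x):
    """G(x) = x for x > 0 and 0 otherwise."""
    return x if x > 0 else 0


def match_specificity(epitope, paratope, s):
    """Sum, over all alignments k, of the match strength G(complementary - s + 1)."""
    le, lp = len(epitope), len(paratope)
    total = 0
    # Try every shift k that leaves at least one overlapping position.
    # Shifts whose overlap is shorter than s contribute 0 automatically.
    for k in range(-(lp - 1), le):
        complementary = sum(
            epitope[n + k] ^ paratope[n]
            for n in range(lp)
            if 0 <= n + k < le
        )
        total += G(complementary - s + 1)
    return total


# A fully complementary pair of length 8 with threshold s = 6:
# k = 0 gives 8 matches (G = 3), k = +-1 gives 7 (G = 2), k = +-2 gives 6 (G = 1).
print(match_specificity([0] * 8, [1] * 8, s=6))  # 9
```

Summing over the alignments, rather than taking the best one, mirrors the assumption in the text that molecules able to interact in several ways react more strongly.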

What happens when two different antibodies interact? Farmer and Packard assume that the paratope on one antibody recognizes the epitope on the other. They further assume that, as a result of this interaction, the antibody with the paratope reproduces a fixed number of times, while the antibody with the epitope is eliminated with some fixed probability. The degree to which one antibody reproduces and the other dies is controlled by the degree of complementarity between the paratope and the epitope. So, the model is symmetric with regard to antibody interaction.

Suppose there are N antibody types with concentrations {x1, x2, ..., xN} and n antigens with concentrations {y1, y2, ..., yn}. The microscopic dynamics need not be simulated directly; instead, differential equations for the concentrations can be used. This is possible only when the system is well mixed and sufficiently large that the number of interactions needed to produce a significant change in the concentration of any particular type of antibody is huge.



On the basis of these assumptions, the concentrations evolve according to:

\[ \frac{dx_i}{dt} = c \left[ \sum_{j=1}^{N} m_{ji} x_i x_j - k_1 \sum_{j=1}^{N} m_{ij} x_i x_j + \sum_{j=1}^{n} m_{ji} x_i y_j \right] - k_2 x_i \tag{8} \]

In equation (8), the first term represents the stimulation of the paratope of the i-th antibody by the epitope of the j-th antibody. The second term represents the suppression of the i-th antibody by the j-th antibody. The probability of a collision between an antibody of type i and an antibody of type j is represented by the product x_i x_j, and the parameter c is a rate constant that depends on the number of collisions per unit time and on the rate of antibody production stimulated by a collision.

The match specificities m_ij determine which reactions occur and how strongly. The constant k_1 represents a possible inequality between stimulation and suppression. When m_ij = m_ji, the interactions between paratopes and epitopes are symmetrical and the model is similar to the one proposed by Hoffman.

To model the entire immune response, the concentrations of the antigens must also be included (the third term); these may change as the number of antigens increases or decreases. The last term models the death rate. The best approach is to adjust k_2 so that the total concentration of the system is held at a fixed value [5].
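To make the dynamics concrete, the following sketch integrates the concentration equation with a simple explicit Euler step. It assumes that Eqn (8) has the form dx_i/dt = c [ sum_j m_ji x_i x_j - k1 sum_j m_ij x_i x_j + sum_j m_ji x_i y_j ] - k2 x_i. The separate antigen matrix m_ag, the clamping of concentrations at zero, the step size, and all names are assumptions made for illustration and are not taken from [5].

```python
def euler_step(x, y, m, m_ag, c, k1, k2, dt):
    """One explicit Euler step of the antibody concentration dynamics.

    x: antibody concentrations (length N); y: antigen concentrations (length n)
    m[i][j]: match specificity between epitope i and paratope j (N x N)
    m_ag[j][i]: match between the epitope of antigen j and paratope i (n x N)
    """
    N = len(x)
    new_x = []
    for i in range(N):
        stimulation = sum(m[j][i] * x[i] * x[j] for j in range(N))
        suppression = sum(m[i][j] * x[i] * x[j] for j in range(N))
        antigen_drive = sum(m_ag[j][i] * x[i] * y[j] for j in range(len(y)))
        dx = c * (stimulation - k1 * suppression + antigen_drive) - k2 * x[i]
        new_x.append(max(0.0, x[i] + dt * dx))  # concentrations cannot go negative
    return new_x


# Two antibody types, no antigen: the paratope of type 1 recognizes the
# epitope of type 0 (m[0][1] = 1), so type 1 grows and type 0 is suppressed.
m = [[0, 1], [0, 0]]
x = euler_step([1.0, 1.0], [], m, [], c=1.0, k1=1.0, k2=0.0, dt=0.1)
print(x)  # approximately [0.9, 1.1]
```

A forward Euler step is the simplest possible integrator and is used here only to show the structure of the terms; a stiff or long simulation would call for a smaller step or a more robust scheme.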

The list of antibody and antigen types is dynamic: types are added and removed over time. N and n therefore change with time, but on a time scale that is slow compared with the changes in the concentrations xi. Eqn (8) is integrated over a period of time, after which the composition of the system is examined and updated as needed. For the update, a minimum threshold is imposed on all concentrations, so that a variable and all of its reactions are eliminated when its concentration falls below the threshold.



New antibody types are generated by applying genetic operators such as crossover, inversion, and point mutation to the paratope and epitope strings. In crossover, two antibody types are selected at random, a position within the two strings is chosen at random, and the pieces on one side of that position are interchanged to produce two new types. Epitopes and paratopes are crossed over separately. Point mutation is implemented by randomly changing one of the bits in a given string, and inversion by inverting a randomly chosen segment of the string.

Antigens can be generated by a variety of mechanisms, either randomly or by design. The same antigen type can be presented to the system repeatedly to see whether the system can eliminate it. Once the system has learned to eliminate it, other antigens can be presented to see whether the system forgets or remembers how to eliminate the original antigen. The number of antigens presented to the system can be varied [5].
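The three genetic operators can be sketched on bit strings as follows. Inversion is interpreted here as reversing the order of the bits in a randomly chosen segment, which is the usual meaning in genetic algorithms; that interpretation, like the function names, is an assumption of this sketch.

```python
import random


def crossover(a, b, rng):
    """Single-point crossover: swap the tails after a random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def point_mutation(s, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(s))
    flipped = "1" if s[i] == "0" else "0"
    return s[:i] + flipped + s[i + 1:]


def inversion(s, rng):
    """Reverse the order of a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(s) + 1), 2))
    return s[:i] + s[i:j][::-1] + s[j:]


rng = random.Random(0)
child1, child2 = crossover("00000000", "11111111", rng)
print(child1, child2)
print(point_mutation("00000000", rng))
print(inversion("00001111", rng))
```

Since epitopes and paratopes are crossed over separately, crossover would be applied once to the two epitope strings and once to the two paratope strings of the selected antibody types.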

The antibodies whose paratopes match epitopes are amplified at the expense of other antibodies. If the suppression constant k_1 = 1 (equal suppression and stimulation) and the death-rate constant k_2 > 0, every antibody type eventually dies out because of the damping term. Letting k_1 < 1 favors the formation of reaction loops, since all the members of a reaction loop gain concentration and can thereby overcome the damping term. As N increases, the number of loops and their lengths also increase.

Even when the system is disturbed by the introduction of new types, it can remember certain states thanks to the robustness of the reaction loops. Antibodies that recognize other molecules, whether internal or external, are retained in the system and their concentration increases. Antibodies that do not recognize other molecules are eliminated. Hence, together with immunological memory, the system possesses immunological forgetting [5].



In the biological immune system, antigens are sometimes remembered by the system for a long time, comparable to the lifespan of the organism. The exact reason for this is not yet known. One theory states that the antigen remains in degraded form in the lymph nodes and that periodic exposure of the immune system to it maintains the memory. However, since antigens are potentially dangerous, this mechanism would be highly risky. Another theory is that B-cells that have reacted to an antigen enter a dormant state and resurface when the same or a similar antigen occurs again. Such a dormant state can last for weeks or perhaps months [1].

Another hypothesis, based on the idiotypic network, is proposed by Farmer and Packard.

6.2 Hypothesis:

Let ab1 be the concentration of the antibodies that recognize the antigen, and let ab2 be the concentration of the antibodies that recognize the epitopes of the ab1 antibodies. Continuing in this way, let abn be the concentration of the antibodies that recognize the epitopes of the ab(n-1) antibodies. If abn resembles the original antigen, the chain closes into a loop, because ab1 also recognizes abn [3].

Figure (8) The formation of a cycle allows the antigen with epitope e0 to be remembered.[5]



Arrows denote recognition through the string-matching algorithm. Paratope p(i) recognizes epitope e(i-1) for i = 1, 2, ..., n. To form a cycle, we assume that by chance p(1) recognizes e(n) in addition to e(0). Thus, e(n) must resemble the antigen e(0). If the antigen is eliminated, the existence of the cycle can maintain the concentration of ab1, an antibody that specifically recognizes the antigen [5].

If the paratopes are assumed to also function as epitopes, then for certain values of n the antibody resembles the antigen [5].

7. String Matching Rules

A matching rule, which defines matching or recognition, and the distance measure on which it is based are the cornerstones of any detection, classification, or recognition algorithm. For categorical data, a string representation may be more suitable, and a matching rule such as r-contiguous bits (rcb) is useful [7].

Several string-matching rules are described below:

7.1 Hamming Distance:

It is defined as the number of positions at which two strings have different characters. The Hamming distance between strings x and y is expressed as:

$$D_H(x, y) = \sum_{i=1}^{N} (x_i \oplus y_i) \qquad (9)$$

where N is the length of the strings, $x_i$ and $y_i$ are the i-th bits of the respective strings, and the operation within the brackets, $\oplus$, is the XOR operation [7].
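As a small illustration, the Hamming distance of two equal-length strings can be computed as below. The function name is hypothetical; the two bit strings are the antigen and antibody used in the match-value example of Section 8.3.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(x, y))

antigen = "011000011110110"
antibody = "100111000101101"
distance = hamming_distance(antigen, antibody)  # number of 1s in the XOR
```

For bit strings, comparing characters for inequality is the same as summing the bitwise XOR.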



7.2 Binary Distance:

(10)

Several extensions of the Hamming distance have been proposed, based on the number of bits that match or differ.

(11)

(12)

(13)

Here a counts the positions at which both strings have a 1; d counts the positions at which both strings have a 0; b counts the 1s in string x that do not match string y; and c counts the 0s in string x that do not match string y [7].



Different similarity measures have been developed from these counts:

1. Russell and Rao (13)

2. Jaccard and Needham (14)

3. Kulczynski (15)

4. Sokal and Michener (16)

5. Rogers and Tanimoto (17)

6. Yule (18)
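The counts a, b, c and d and several of the measures listed above can be sketched in Python. The formulas in the code are the forms in which these coefficients are commonly stated; they are an assumption with respect to equations (13) to (18) of this report. The Kulczynski measure is left out because several variants of it are in use. The function names are hypothetical.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length bit strings."""
    pairs = list(zip(x, y))
    a = sum(p == ("1", "1") for p in pairs)  # 1s matching in both strings
    b = sum(p == ("1", "0") for p in pairs)  # 1s in x not matched in y
    c = sum(p == ("0", "1") for p in pairs)  # 0s in x not matched in y
    d = sum(p == ("0", "0") for p in pairs)  # 0s matching in both strings
    return a, b, c, d

def similarity_measures(x, y):
    """Commonly stated forms; denominators are assumed to be non-zero."""
    a, b, c, d = match_counts(x, y)
    n = a + b + c + d
    return {
        "russell_rao": a / n,
        "jaccard_needham": a / (a + b + c),
        "sokal_michener": (a + d) / n,
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
        "yule": (a * d - b * c) / (a * d + b * c),
    }

scores = similarity_measures("1100", "1010")
```

Note that a + b + c + d always equals the string length, and b + c equals the Hamming distance.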

7.3 Edit Distance:

It is defined as the minimum number of string transformations required to change string s1 into string s2, where the possible transformations are (i) changing a character, (ii) inserting a character, and (iii) deleting a character.

It is also termed the Levenshtein distance, and it is a generalization of the Hamming distance [7].
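The edit distance can be computed with the standard dynamic-programming recurrence. The sketch below is a textbook formulation with a hypothetical function name, given unit cost for each of the three transformations.

```python
def edit_distance(s1, s2):
    """Levenshtein distance with unit-cost change, insert and delete."""
    prev = list(range(len(s2) + 1))  # distances from "" to prefixes of s2
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # distance from s1[:i] to ""
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # delete c1
                            curr[j - 1] + 1,      # insert c2
                            prev[j - 1] + cost))  # change c1 into c2
        prev = curr
    return prev[-1]

steps = edit_distance("kitten", "sitting")
```

For two strings of equal length the edit distance is never larger than the Hamming distance, because changing every differing character is one valid sequence of transformations.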

Value Difference Metric:

(19)

where

( )

and

denotes the probability that xi equals the character c in the alphabet C [7].



7.4 Landscape-Affinity Matching:

This type of matching captures the notion of matching between biochemical and physical structures and approximates the matching that takes place in the immune system. The input string and the antibody string are split into bytes, and each byte is converted into a positive integer to create a landscape. The two strings are then compared using a sliding window [7]. Three different similarity measures are defined:

Difference Matching Rule:

| | (20)

Slope-Matching Rule:

| | (21)

Physical matching:

(22)
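A rough sketch of the first two rules follows. The formulas are an assumption based on the names of the rules (a sum of absolute differences of the landscape values, and a sum of absolute differences of consecutive slopes); they are not taken from equations (20) and (21). The physical matching rule is not sketched. All function names are hypothetical.

```python
def to_landscape(data):
    """Turn a byte string into a list of non-negative integers (0..255)."""
    return list(data)

def difference_match(x, y):
    """Assumed form: sum of absolute differences between the two landscapes."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

def slope_match(x, y):
    """Assumed form: sum of absolute differences between consecutive slopes."""
    return sum(abs((x[i + 1] - x[i]) - (y[i + 1] - y[i]))
               for i in range(len(x) - 1))

antigen = to_landscape(b"\x01\x03\x02")
antibody = to_landscape(b"\x02\x01\x02")
diff_score = difference_match(antigen, antibody)
slope_score = slope_match(antigen, antibody)
```

With a sliding window, these functions would be applied to each window position in turn.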

7.5 R-Contiguous Bits Matching:

The rcb matching rule is defined as follows:

If x and y are strings of equal length, they are said to match if they agree in at least r contiguous locations, in which case match(x,y) is true.
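The rule can be sketched by finding the longest run of positions at which the two strings agree and comparing it with r. The function names are hypothetical; the strings are the pair ABADCBAB and CAGDCBBA.

```python
def longest_common_run(x, y):
    """Length of the longest run of contiguous positions where x and y agree."""
    best = run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best

def rcb_match(x, y, r):
    """True if x and y agree in at least r contiguous positions."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return longest_common_run(x, y) >= r

x, y = "ABADCBAB", "CAGDCBBA"
run = longest_common_run(x, y)  # positions 4 to 6 agree ("DCB")
```

A larger r makes the rule more specific, so each detector matches fewer strings.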



Example:

If x=ABADCBAB and y=CAGDCBBA, then we can say that match (x,y) is true for r



8.1 The Bone-Marrow Object

It decides where in the network an antigen is inserted, which B-cells die, and which cells beneficial to the network have their concentration increased. The bone marrow object holds the main algorithm, which starts the immune response by inserting an antigen into the B-cell network. The algorithm is as follows:

Randomly initialize the B-cell population
Load the antigen population
Until the end condition is reached DO
    Select an antigen randomly from the antigen population
    Insert the selected antigen at a random point in the B-cell network
    Select an approximate percentage of the B-cells around the insertion point
    For every B-cell selected
        Perform the interaction between the antigen and the B-cell for the immune response
    Arrange these B-cells by their level of avidity
    Delete the worst 5% of the B-cell population
    Create n new B-cells (n = 25% of the B-cell population)
    Out of these n, select m cells to join the immune network (m = 5% of the population) [9]

B-cell Object

The B-cell object possesses a pattern matching element. It records the affinity level of the B-cell and maintains the links to any other B-cell objects to which it is connected within the network of B-cells.

Antibodies

When an antigen meets an antibody, an immune response is elicited and a match score is recorded. If this score is greater than or equal to a threshold, the antibody binds to the antigen.



Antigens

Each potential antigen is represented by an antigen object possessing a single epitope. The antigens are defined in external ASCII files and are inserted into the AIS by the antigen population object. This object reads a series of lists from the files and instantiates them as antigen objects.

B-cell Stimulation

[ () () ] -

Above equation represents the stimulation of B-cell.

8.2 Applying AIS to Pattern Recognition Problem

1. B-cell Objects: The antibody's paratope is created from the mRNA list. The AIS copies the bit string in a complementary manner.

2. Antibodies: A bit-string representation is used for the pattern recognition problem, so each antibody is represented as a string of 0s and 1s.

3. Antigens: The AIS is tested with two different antigen populations whose antigens are binary lists of 20 elements. The antigen population used to immunize the AIS consists of three pattern types, each forming 33% of the population. The population contains the original bit strings as well as modified bit strings that introduce noise into the data.



Antigen Population Representation:

11111111110000000000 33%

00000000001111111111 33%

00000111111111100000 33%

4. Antigen/Antibody: To determine the match between antigen and antibody (Ag-Ab), a circular approach is followed rather than letting the match start at an arbitrary point on the antigen. Hence, if the pattern described by the antibody starts halfway along the antigen, the antibody is shifted halfway along its length, so that the entire match is noted.

Bit Shifted Antibody:

Antibody 0 0 1 0 1 0 1 1 1 0

Antigen 1 0 0 0 1 1 1 0 1 0

Bit Shifted Antibody 0 1 1 1 0 0 0 1 0 1

8.3 The match algorithm:

Repeat

For each region consisting of 2 or more 1s note their length if

then

=

Shift Ab right 1 bit

Until Ab shift complete
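The match algorithm can be sketched as follows. The scoring (number of 1s in the XOR plus 2 to the power of the length of every run of two or more 1s) follows the worked example of this section. Taking the best score over all circular shifts is an assumption, since the comparison step of the algorithm is not fully legible. Function names are hypothetical.

```python
def rotate_right(bits, shift):
    """Circularly shift a list of bits to the right."""
    n = len(bits)
    shift %= n
    return bits[n - shift:] + bits[:n - shift]

def match_value(antigen, antibody):
    """Count the 1s of the XOR and add 2**length for each run of >= 2 ones."""
    xor = [a ^ b for a, b in zip(antigen, antibody)]
    score = sum(xor)
    run = 0
    for bit in xor + [0]:  # trailing 0 closes a run at the end of the string
        if bit:
            run += 1
        else:
            if run >= 2:
                score += 2 ** run
            run = 0
    return score

def best_circular_match(antigen, antibody):
    """Assumed reading: keep the highest match value over all shifts."""
    return max(match_value(antigen, rotate_right(antibody, s))
               for s in range(len(antibody)))

antigen = [0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
antibody = [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
value = match_value(antigen, antibody)
```

Shifting the ten-bit antibody 0 0 1 0 1 0 1 1 1 0 by five positions with `rotate_right` gives the bit-shifted antibody shown in Section 8.2.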



Calculating Match Value:

Antigen: 0 1 1 0 0 0 0 1 1 1 1 0 1 1 0

Antibody: 1 0 0 1 1 1 0 0 0 1 0 1 1 0 1

XOR: 1 1 1 1 1 1 0 1 1 0 1 1 0 1 1 (number of 1s = 12)

Length: 6 2 2 2

MatchValue: 12 + 2^6 + 2^2 + 2^2 + 2^2 = 88

Hypermutation:

In multi-point mutation, each selected bit is flipped; in sub-string regeneration, all the elements between the two chosen points are flipped.
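Both hypermutation operators can be sketched directly from this description. The function names are hypothetical, and treating the two points as a half-open range (start included, end excluded) is an assumption.

```python
def multi_point_mutation(bits, positions):
    """Flip every bit whose index is listed in positions."""
    chosen = set(positions)
    return [1 - b if i in chosen else b for i, b in enumerate(bits)]

def substring_regeneration(bits, start, end):
    """Flip all elements between two points (start inclusive, end exclusive)."""
    return bits[:start] + [1 - b for b in bits[start:end]] + bits[end:]

mutated = multi_point_mutation([0, 0, 1, 1], [0, 3])
regenerated = substring_regeneration([0, 0, 1, 1, 0], 1, 4)
```

In a running system the positions and the two points would be drawn at random for each clone.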

8.4 Running the System

99 binary antigens were used to immunize the system. The test population was then presented to the AIS. Learning was turned off during the testing phase, so the system shows only its secondary immune response. In other words, the test determines whether the antibodies recognize the antigens or not.

The immunization process ran for 50 iterations, during which the antibody population increased from 10 to 28. The secondary response was then tested by presenting the antigens shown below.



1111111110000000000 TEST 1 *

0000111000110010001 TEST 2

1110010010010010010 TEST 3

0000000001111111111 TEST 4*

1010101000101001110 TEST 5

1111001010100110100 TEST 6

0000011111111110000 TEST 7*

TEST 1, 4 and 7 are the original antigens used in the primary response. TEST 2 and 3 are modified versions of TEST 1. Along the same lines, TEST 5 and 6 are noisy versions of TEST 4.

The AIS should be able to identify TEST 2, 3, 5 and 6 without any difficulty [9].

9. Dynamic Behavior Arbitration using AIS

Akio Ishiguro et al. proposed an inference-making system inspired by the immune system of living organisms and applied it to behavior arbitration for an autonomous mobile robot, because conventional AI systems are brittle in dynamically changing environments. They also tried to evolve the affinities among antibodies using genetic operators.

Because functional decomposition in conventional AI has limitations, much attention has been focused on behavioral decomposition approaches. In behavior-based systems, however, the arbitration among competence modules is difficult.



To overcome such difficulties, Maes proposed the behavior network system, in which an action suitable for the current situation and the given goals emerges from the interaction between different competence modules. Akio Ishiguro et al. approached this problem from an immunological point of view, as shown in Figure (9).

Figure (9) Architecture of Algorithm [9]

As shown in the figure, the current situation (for example, the distance and direction to a detected obstacle) plays the role of the antigen, the competence modules play the role of antibodies, and the interactions between modules play the role of stimulation and suppression between antibodies. The basic idea of this approach is that the best possible antibody is selected for the antigen.



Figure (10) Immune Networks [8, 9]

To verify the ability of their proposed method, the authors simulated it. There are three kinds of objects in the simulated environment: a] predators, b] obstacles and c] food. For quantitative evaluation, the following assumptions are made:

1. For movement, the immunobot consumes energy Em.
2. If the immunobot is captured by a predator, it loses energy Ep.
3. If the immunobot collides with an obstacle, it loses energy Eo.
4. If the immunobot obtains food, it gains energy Ef.
5. To avoid over-charging, the obtain-food behavior does not emerge once sufficient food has been obtained.

The predators attack the immunobot only if it is within a predefined range. So, to survive, the immunobot must select the best possible antibody.

The figure below shows the structure of the immunobot used in the simulations. It is equipped with external and internal detectors. The external detectors are sensors in eight directions that detect predators, obstacles and food. Each detector also reports the distance as near, mid or far. The internal detector detects the energy level.

Figure (11) Structure of Robot [8]

9.1 Description of Antibodies

Each prepared competence module is an antibody. The important thing for the immunobot is to select the best antibody for the antigen, and this depends on how the antibodies are described. The selection should be made in a bottom-up manner with proper communication between the modules. The structure of the paratope and epitope is crucial for the specificity, that is, the identity, of any particular antibody.

The paratope describes the desirable condition and the epitope the disallowed condition. The paratope and idiotope are divided into three fields: obstacles, direction and distance. A typical inference/consensus system adopts a condition-action description, as in fuzzy inference, whereas the proposed system uses a condition-action-condition description. This provides decentralized dynamic inference in a bottom-up manner.

Figure (12) Antibody Description [9]

A prepared antibody can look like the following: the antibody is activated if the immunobot detects food in the front direction at mid range, and it makes the immunobot move forward to pick the food up.

Figure (13) Prepared Antibody [9]



However, if a predator is in front at near or mid range, or if food is at near range, the prepared antibody may hesitate to be activated.

The other antibodies are designed along similar lines.

9.2 Dynamics

In this model, the authors allow only one antibody to be activated, once it surpasses a prespecified threshold. The concentration of each antibody is introduced as a state variable.

{ } (23)

= concentration of antibody that varies with time. =matching ratio between antibody i and j.

9.3 Basic mechanism of the proposed inference making network

The figure lists four antigens and the five antibodies that mainly participate in the inference/consensus making. For instance, antibody 1 states that when the immunobot detects food at far range in the front direction, it is allowed to move forward. In other situations, for example when the immunobot detects food at near range, a predator in front, or a high energy level, this antibody stimulates the other antibodies whose paratopes describe those conditions.



Figure (14) Antibody Selection [7,9]

Consider the case in which the current energy level is high. Antibodies 1, 2, 3 and 5 are stimulated by the antigens, and the concentration of each of these antibodies is increased according to its antigen. The interaction among the antibodies of the immune network is important. In the end, antibody 5 is selected in Figure (14).

If the current energy level is low, antibody 3 is selected [9].

10. Latest Immune Models and Hybrid Approaches

10.1 Danger Theory based algorithms

In 2002, Aickelin and Cayzer suggested that the following aspects of danger theory (DT) be considered in an AIS:

1. An appropriate number of APCs to display danger signals needs to be modeled.
2. A danger signal is either positive or negative, representing the presence or absence of the signal.
3. In biology the danger zone is spatial, but in a computational model other notions such as temporal proximity can be used.
4. Killer cells sometimes cause the death of self cells; this should not generate further danger signals.
5. The priming of killer cells via APCs should be considered in AIS models.
6. The antibody migration rule should specify the concentration of antibodies receiving signal 1 and signal 2 from a given APC.

DT depends on the concentrations of different immune cells. These aspects are used to build a better AIS for anomaly detection, in which non-self does not trigger an immune response without a danger signal [7].

Figure 15 (a) One Signal Model [7]

Figure 15 (b) Two Signal Model [7]

Figure 15 (c) APC controlling IR [7]

Figure 15 (d) INS with third signal [7]

Figure 15 (e) Danger in control through zoning [7]

Figure 15 (f) Control through INS and zoning [7]

In 2010, danger theory was applied to the online supervised two-class classification problem. The algorithm of the proposed method is as follows:

Algorithm 1

Danger theory based immune algorithm.

1. Initialize the antibody population and the memory
2. While stopping conditions are not met do
3.     For i = 0 to antigen population do
4.         Present the antigen to the system
5.         The presented antigen creates the danger
6.         General antibody population receives signal 0 from the presented antigen
7.         General antibody population receives signal 1 from the danger zone
8.         Antibodies that receive both signal 0 and signal 1 are selected
9.         For all antibodies belonging to the stimulated antibodies
10.            Change the status of the antibody
11.            Calculate the interaction between antibody and antigen
12.        End for
13.        Suppress the antibody population
14.        Decrease the danger from the antigen that has already been considered
15.        For all antibodies belonging to the stimulated antibodies
16.            If the antibody's stimulation reaches a certain threshold value then
17.                Apply the clonal selection algorithm
18.            End if
19.        End for
20.    End for
21.    Check the stopping criteria
22. End while
23. Output is the memory of antibodies that were selected via clonal selection and met the threshold value

When the learning algorithm has ended, the output antibodies are used to classify unknown antigens. The process is simple: an unknown antigen is assigned to the same class as the antibody with which it has the lowest affinity.

Learning Algorithm explained:

1. Initialization: The algorithm starts with a random antibody population whose members are assigned labels. Their status is set to zero and the memory is set to the empty set.

2. Two kinds of signals: The detection of a danger signal is the co-stimulation signal, termed signal 1, while the other signal is termed signal 0. The antibody population is divided into two parts: a] general and b] memory. The memory antibodies do not take part in reactions with antigens. They are the fixed memory of antigens and are changed only when they are suppressed. The general antibodies receive signal 0 when presented with an antigen, so an antibody can detect the stimulus of the current antigen, whereas signal 1 is perceived only when a danger zone is created. The antibodies that receive both signals are stimulated and can change their status.

Algorithm 2

1. Antibody stimulated = antibody stimulated + 1
2. If antibody label == antigen label then
3.     Antibody-antigen reaction = 1
4. Else
5.     Antibody-antigen reaction = -1
6. End if
7. Antibody relevance = antibody relevance + antibody-antigen reaction
8. Variable danger zone (var) = affinity between antigen and antibody
9. Calculate the antibody stimulation = antibody + antibody-antigen reaction * var
10. Var = stimulated antibody population
11. Antigen danger = Var * var * antibody stimulation



Algorithm 3

1. If antibody stimulation (as) < threshold value (t) then
2.     Delete antibody population that are less than threshold
3. Else if as
5.     Delete the antibody with high interactivity
6. End if
7. End for
8. Group the memory antibodies into pairs
9. For all pairs do
10.    Calculate probability p2
11.    If random < p2 then
12.        Remove the memory antibody with high affinity
13.    End if
14. End for

10.2 Combining Dendritic Cells and Danger Theory

In 2007, Yeom used an approach that mixes DT and dendritic cells (DCs) to form a model for signal pre-categorization. Its principles are the following:

1. Pathogen-associated molecular patterns (PAMPs) are expressed by bacteria and can be identified by DCs, which change their behavior in response.

2. Danger signals are generated by the unplanned, necrotic death of cells. The sudden and chaotic breakdown of the internal components of a cell causes danger signals to be released. DCs are sensitive to the concentration of danger signals. The presence of a danger signal may or may not cause a change, but the probability of change is higher than in normal situations.

3. Safe signals arise from the normal death of a cell for regulatory reasons; this tightly controlled process releases various signals into the tissue. Such safe signals give rise to suppression signals.

4. Inflammatory cytokines can be released as a result of injury, although inflammation alone is not enough to stimulate DCs.

DCs can stimulate naive T cells and have a number of functional properties (Yeom, 2007). The first function of DCs is to inform the immune system to respond when there is an attack.

DCs perform different functions depending on their state of maturation. Modulation between these states is driven by the signals identified in the tissue, namely danger signals, apoptotic signals and inflammatory signals.

In tissue, DCs collect antigen and experience danger signals from necrosing cells and safe signals from apoptotic cells. Maturation of DCs occurs in response to the receipt of these signals. According to Yeom (2007), if danger signals are concentrated in the tissue at the time the antigen is picked up, the DC matures fully. Conversely, if safe signals are present, the DC matures differently [7].

10.3 Multilevel Immune Learning Algorithm (MILA)

Both T-level and B-level recognition mechanisms are used in this algorithm. It is inspired by the communication and processes of the T-cell-dependent humoral immune response. In the biological immune system, B cells recognize antigens through immunoglobulin receptors on their surfaces, but they do not proliferate and differentiate until helper T (Th) cells give the go-ahead signal.

For Th cells to allow B cells to proliferate and differentiate, the Th cells must themselves be stimulated, which happens only when they recognize antigens in the context of the major histocompatibility complex (MHC).

B cells are also suppressed by suppressor T (Ts) cells. The activated B and T cells move to the lymph nodes, where proliferation, mutation, selection, differentiation and death of B cells take place in germinal centres (GCs).

MILA incorporates an abstraction of the above events to build a detection algorithm. The algorithm consists of initialization, recognition, evolutionary and response phases.

In the initialization phase, the detection system is trained to recognize the self. The result of initialization is used to produce detectors, analogous to the populations of Th, Ts and B cells that participate in the humoral immune response. There are three levels:

1. APC level, the highest one.
2. B-cell level, the intermediate one.
3. Th-cell level, the bit level for local patterns.

MILA uses the rcb matching rule for real-valued representations. A Th cell uses a sliding window to obtain w elements, whereas a B cell uses w randomly chosen elements. The concepts of prematuration and crossover operators can also be used.

Another feature of MILA is positive selection by Ts cells, which is based on self samples.

The evolutionary phase in MILA refines the detector set when the earlier detection rates can be evaluated. This phase involves cloning, mutation and selection; however, cloning in MILA is targeted: only those detectors that were activated in the recognition phase can be cloned [7].



10.4 Combining Negative Selection and Classification technique

In anomaly detection, only positive samples (self samples) are available at the training stage. However, most conventional classification algorithms need both self and non-self samples.

To allow a conventional algorithm to be used when only self samples are available, Gonzalez (2002) proposed a hybrid algorithm that creates synthetic samples from a set of self samples. The algorithm uses negative selection (NS) to develop a detector set that covers the non-self space, and points from it are then used to generate the samples of the non-self class, which makes the conventional algorithm usable.
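The idea of producing synthetic non-self samples from self samples can be sketched as below. This is a deliberately simplified stand-in, not the algorithm of Gonzalez (2002): candidate points are drawn uniformly in the unit hypercube and kept only if they lie farther than a hypothetical `radius` from every self sample, and the function name, the parameters and the Euclidean distance are all assumptions.

```python
import math
import random

def generate_nonself(self_samples, n_samples, radius, rng, max_tries=10000):
    """Negative-selection style sampling: reject candidates that match self."""
    dim = len(self_samples[0])
    nonself = []
    tries = 0
    while len(nonself) < n_samples and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        # Keep the candidate only if it is outside the radius of all self points.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            nonself.append(candidate)
    return nonself

rng = random.Random(1)  # fixed seed so the sketch is repeatable
self_samples = [(0.5, 0.5), (0.55, 0.45)]
nonself_samples = generate_nonself(self_samples, 20, 0.2, rng)

# Labeled two-class data set for a conventional classifier: 0 = self, 1 = non-self.
dataset = [(s, 0) for s in self_samples] + [(p, 1) for p in nonself_samples]
```

The resulting labeled data set can then be given to any two-class learner.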

Figure (16) NS-SOM in generation classifier dataset [7]

In particular, negative samples are generated from the positive samples. Samples from both classes are then used to train a neural network, a self-organizing map (SOM). An SOM, composed of nodes or neurons (which are able to identify the input type), is a type of artificial neural network (ANN) that is trained to produce a low-dimensional representation, called a map, of the input space, here the self/non-self feature space of the training samples [7,8].


The three phases of NS-SOM are shown in figure below:

Figure (17) NS-SOM Model Structure [7,8]



11. Immune Networks and Negative Selection Based Algorithm

An algorithm that combines negative selection with antibody-antibody (Ab-Ab) communication was developed by Prashant Rao (2008) for navigation control and path mapping of an autonomous mobile robot, the Khepera II.

The following is the step-by-step formulation of the algorithm:

1. Initialization: First initialize a network of immune cells (there is a superset of 64 antibodies, numbered 0 to 63). The initial concentrations of the antibodies are set and the robot is reset. A subset of 20 antibodies is chosen randomly. The stimulation and suppression between antibodies are defined using a basic matching function. The first two sensors of the Khepera II robot are not switched on.

2. Population Loop:

i) Antigenic Recognition: The information from the sensors is collected and an antigen is formed from it. The matching between the antigen and the randomly selected antibodies is determined and affinities are allotted. Each antigen stimulates many antibodies, but only the one that matches best is selected for processing.

ii) Self-Nonself Determination: The antigen is checked against the self set. If it matches, innate memory takes over, the system is allotted a standard solution and the loop executes again; otherwise the system moves on to the next step.

iii) Network Communications: The interactions between the randomly selected antibodies are calculated.

iv) Dynamics: The stimulation minus the suppression, added to the affinity, minus the natural death coefficient gives the overall stimulation. The product of this stimulation and the concentration of the antibody gives the rate of change of the concentration with time. The antibody with the highest concentration is sent to a critic that rewards or penalizes it, and the affinities are modified accordingly.

3. Feedback: When a penalty is allotted, the helper T-cell is activated, and its value is calculated at each step. The adaptation function is determined by the interaction between the T-cell and the other cells in the network, and it modifies the affinities between antibodies using a suitable learning rate.

4. Steps 2 and 3 are repeated until the convergence criterion is met.
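The dynamics step (2.iv) can be sketched as one explicit Euler update. This is an illustration under stated assumptions, not the formulation of Rao (2008): `m[j][i]` is taken to be how strongly antibody j stimulates antibody i, `m[i][k]` how strongly antibody k suppresses antibody i, and the names `antigen_affinity`, `death_rate` and `dt` are hypothetical.

```python
def overall_stimulation(i, conc, m, antigen_affinity, death_rate):
    """Stimulation minus suppression, plus affinity, minus natural death."""
    n = len(conc)
    stimulation = sum(m[j][i] * conc[j] for j in range(n))
    suppression = sum(m[i][k] * conc[k] for k in range(n))
    return stimulation - suppression + antigen_affinity[i] - death_rate

def euler_step(conc, m, antigen_affinity, death_rate, dt):
    """Rate of change = overall stimulation * concentration (one Euler step)."""
    return [max(0.0, c + dt * c *
                overall_stimulation(i, conc, m, antigen_affinity, death_rate))
            for i, c in enumerate(conc)]

conc = [1.0, 1.0]
m = [[0.0, 0.0], [0.0, 0.0]]          # no Ab-Ab interaction in this toy case
new_conc = euler_step(conc, m, [1.0, 0.0], death_rate=0.1, dt=0.1)
winner = max(range(len(new_conc)), key=new_conc.__getitem__)
```

The index `winner` corresponds to the antibody that would be passed to the critic.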

Figure (18) Algorithm based on Negative selection and Ab-Ab interaction [6]



Figure (19) Algorithm based on Negative selection and Ab-Ab interaction [6]

11.1 Latest Dendritic Cell Algorithm Inspired from Danger Theory

Danger theory states that danger signals are generated to activate APCs. The APCs stimulate T-helper cells, which finally gives rise to the adaptive immune response. The danger signals are detected by dendritic cells, which act in three modes: immature, semi-mature and mature. If the detected signal is safe, the dendritic cell becomes semi-mature upon presenting antigen to the T-cell. If a danger signal is found, the dendritic cell becomes mature and the T-cell becomes antigen reactive.

The dendritic cell algorithm takes into consideration safe, danger and PAMP signals. [11]

ALGORITHM:

input: S = set of data items to be labeled safe or dangerous

output: D = set of data items labeled as safe or dangerous

Start
Generate initial population of dendritic cells (DCs), D
Create a set to include the migrated DCs, M
for all items in set S do
    Select a set of DCs by randomly selecting from D, P
    for all DCs in set P do
        Add data item to the DC's collected list
        Update safe, danger and PAMP concentrations
        Update cytokine concentration
        Move DC from D to M and generate a new DC in set D if the concentration is above threshold
    end for
end for
for all data items in S do
    Count the number of times the data item is presented by a mature and by a semi-mature DC
    Label the item as safe if it is presented by more semi-mature DCs than mature DCs, otherwise as dangerous
    Add data item to labeled set M
end for
Stop [11]
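A much simplified sketch of the algorithm above is given below. The signal bookkeeping is an assumption made for illustration (the sum of all three signals drives migration, PAMP plus danger is compared with safe to decide maturity), and the function name, parameters and default values are hypothetical rather than taken from [11].

```python
import random

def dendritic_cell_algorithm(items, n_cells=10, sample_size=3,
                             threshold=2.0, seed=0):
    """items: list of (item_id, pamp, danger, safe). Returns {item_id: label}."""
    rng = random.Random(seed)

    def new_cell():
        return {"collected": [], "total": 0.0, "mature": 0.0, "semi": 0.0}

    cells = [new_cell() for _ in range(n_cells)]
    migrated = []
    for item_id, pamp, danger, safe in items:
        for idx in rng.sample(range(n_cells), sample_size):
            cell = cells[idx]
            cell["collected"].append(item_id)
            cell["total"] += pamp + danger + safe   # drives migration
            cell["mature"] += pamp + danger          # evidence of danger
            cell["semi"] += safe                     # evidence of safety
            if cell["total"] > threshold:
                migrated.append(cell)
                cells[idx] = new_cell()              # replace the migrated DC
    # Assumption: cells still holding items at the end also present them.
    migrated.extend(c for c in cells if c["collected"])

    votes = {}
    for cell in migrated:
        is_mature = cell["mature"] > cell["semi"]
        for item_id in cell["collected"]:
            mature_n, semi_n = votes.get(item_id, (0, 0))
            votes[item_id] = ((mature_n + 1, semi_n) if is_mature
                              else (mature_n, semi_n + 1))
    return {item_id: ("safe" if semi_n > mature_n else "dangerous")
            for item_id, (mature_n, semi_n) in votes.items()}

items = [("a", 0.0, 0.0, 1.0)] * 3 + [("b", 1.0, 1.0, 0.0)] * 3
labels = dendritic_cell_algorithm(items, threshold=0.5)
```

With the low threshold used here every cell migrates right after sampling a single item, so the labels do not depend on which cells were sampled.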



11.2 Latest TLR (toll-like receptor) Algorithm

The steps of the TLR algorithm as described by Aickelin and Greensmith (2007), which is designed for anomaly detection in computer networks, are as follows:

1. Collect the set of system calls that are made in the training data.
2. Collect the corresponding signal values.
3. Determine the complement sets of the sets from step 1 and step 2.

Figure (20) Systematic Overview of TLR algorithm [7]


4. Generate a set of immature DCs (iDCs) with signal receptors selected randomly from the complement signal set and with antigen receptors selected randomly from the complement system call set.

5. Similarly, generate naive T-cells (nTCs) with antigen receptors drawn randomly from the complement system call set.

6. Expose the immature DCs to sample signals and antigens, respectively.

7. If an iDC matches a signal, it matures (mDC) and migrates.

8. If an iDC does not migrate within its lifetime, it becomes a semi-mature DC (smDC) and then migrates.

9. Migrated smDCs and mDCs present their antigens and try to match the nTCs.

10. If an mDC presenting an antigen matches a naive T cell, the nTC is activated and an anomaly is reported.

11. If an smDC presenting an antigen matches an nTC, it kills the nTC to lower the number of false positives.

12. Migrated smDCs and mDCs and killed nTCs are replaced by new cells as per steps 4 and 5. [7]



12. Recent Developments and Real World Applications

Solving problems using Immunological Computation

To apply the knowledge of the biological immune system to real world problems, one must first select the immune algorithm according to the type of problem. The first step is to identify the elements involved in the problem and how they can be represented in terms of a particular AIS.

To encode such entities, a representation such as bit strings or real values can be chosen. Then an affinity measure is selected that fits the matching rules employed. The next step is to decide which AIS is best suited to create a set of entities that can provide a good solution to the problem at hand [7].

Figure (21) Problem Solving Using AIS [7]



12.1 Virus Detection

Kephart (1994) proposed an immunologically inspired approach to detect viruses in computer systems. In this approach, known viruses are identified by their computer code sequences and unknown viruses are detected by their unusual behavior in the system. The virus detection software continuously scans the system to detect changes. These changes trigger the release of decoy programs whose sole purpose is to become infected by the virus [7].

Figure (22) Flow Diagram for Kephart's approach to virus detection [7]



A diverse suite of decoy programs is kept at different locations in the system's memory to detect viruses. If one or more decoy programs are modified, it is certain that a virus has entered the system, and each modified decoy program contains a sample of the virus. The infected decoy programs are processed by the signature extractor to generate a recognizer for the respective virus.

The signature extractor also extracts the pattern by which the virus attaches to its host, so that the host can be repaired if needed. In addition, the signature extractor must select the virus signature so that both false positives and false negatives are avoided. The signature must be found in each sample of the virus and must be very unlikely to be found in uninfected programs in the computer system. Once the best possible signature has been found from the infected programs, it is compared with a half-gigabyte corpus of legitimate programs to make sure that there are no false positives. The repair information is checked by testing on samples of the virus and again by a human expert [7].

12.2 Immunogenetic Approaches in Intrusion detection

Gonzalez (2002) proposed negative selection with detector rules to detect attacks by monitoring network traffic. A real-valued representation is used to evolve hyper-rectangular detectors, which are interpreted as if-then rules over high-level characteristics of the self/non-self space. The experiments used data from the 1999 Defense Advanced Research Projects Agency (DARPA) intrusion detection evaluation dataset. The AIS approach produced detectors that gave a good estimate of the amount of deviation from normal behavior [7].
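To make the idea of a hyper-rectangular detector read as an if-then rule more concrete, the following sketch shows only the rule representation and the negative-selection filter. The evolutionary search that produces the detectors is not reproduced; the class name, the feature values and the candidate rectangles are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RectDetector:
    low: tuple   # lower bound for each feature
    high: tuple  # upper bound for each feature

    def matches(self, x):
        # Rule: IF low[i] <= x[i] <= high[i] for every feature i THEN non-self.
        return all(l <= v <= h for l, v, h in zip(self.low, x, self.high))


def covers_self(detector, self_samples):
    """True if the detector fires on any normal (self) sample."""
    return any(detector.matches(s) for s in self_samples)


def is_anomalous(x, detectors):
    """A sample is flagged when at least one detector rule fires."""
    return any(d.matches(x) for d in detectors)


# Hypothetical normalized traffic features for normal behavior.
self_samples = [(0.20, 0.30), (0.25, 0.35), (0.22, 0.28)]
candidates = [
    RectDetector((0.6, 0.0), (1.0, 1.0)),  # high values of feature 0
    RectDetector((0.0, 0.7), (1.0, 1.0)),  # high values of feature 1
    RectDetector((0.1, 0.2), (0.3, 0.4)),  # overlaps self, so it is rejected
]
# Negative selection: keep only the detectors that match no self sample.
detectors = [d for d in candidates if not covers_self(d, self_samples)]
```

A point such as (0.9, 0.5) falls inside the first surviving rectangle and is flagged, while a point close to the self samples matches no rule.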

12.3 Danger Theory in Network Security

Aickelin (2002) first proposed applying danger theory to network security. Their system behaves like dendritic cells (DCs), looking for danger signals such as a sudden increase in network traffic or an abnormally high rate of error messages. If such signals rise above a threshold, an alarm is raised [7].
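The threshold behavior described above can be sketched in a few lines. The way the two signals are combined (relative excess over a baseline, then summed) and the threshold value are assumptions chosen for illustration, not details of the cited system.

```python
def danger_level(traffic_rate, error_rate, baseline_traffic, baseline_errors):
    """Combine two danger signals, each measured as the relative
    excess over its normal baseline (never below zero)."""
    traffic_signal = max(0.0, traffic_rate / baseline_traffic - 1.0)
    error_signal = max(0.0, error_rate / baseline_errors - 1.0)
    return traffic_signal + error_signal


def raise_alarm(level, threshold=2.0):
    """An alarm is raised only when the combined signal exceeds the threshold."""
    return level > threshold


# Hypothetical readings: normal load, then a burst of traffic and errors.
normal = danger_level(100, 5, baseline_traffic=100, baseline_errors=5)
burst = danger_level(400, 20, baseline_traffic=100, baseline_errors=5)
```

With these numbers the normal reading gives a level of zero and no alarm, while the burst quadruples both signals and crosses the threshold.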



12.4 Robotics and Control

The robot control work of Ishiguro et al. (1996, 1998), Watanabe et al. (1998, 1999) and Lee et al. (1999) focused on developing a dynamic, decentralized consensus-making mechanism based on immune network theory. In a dynamic environment, the robot, called an immunoid, is able to collect garbage. In this metaphor, antibodies are the potential behaviors of the immunoid, and antigens are environmental inputs such as garbage, walls and the home base. To decide on the best behavior, the immunoid matches antigens to antibodies [7].
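The antigen-to-antibody matching can be pictured with a very small sketch. It covers only the matching step: the stimulation and suppression between antibodies that immune network theory adds is left out, and the antibody table, the concentration values and the function name are hypothetical.

```python
# Each antibody pairs an environmental condition with a candidate behavior.
# The concentration values are made up for illustration.
antibodies = [
    {"detects": "garbage", "action": "collect", "concentration": 0.8},
    {"detects": "garbage", "action": "ignore", "concentration": 0.1},
    {"detects": "wall", "action": "turn", "concentration": 0.6},
    {"detects": "home_base", "action": "unload", "concentration": 0.7},
]


def select_action(antigen, antibodies):
    """Pick the behavior of the matching antibody with the highest
    concentration, or None when no antibody recognizes the antigen."""
    matching = [ab for ab in antibodies if ab["detects"] == antigen]
    if not matching:
        return None
    return max(matching, key=lambda ab: ab["concentration"])["action"]


action_for_garbage = select_action("garbage", antibodies)
```

In a full immune network controller the concentrations would change over time as antibodies stimulate and suppress one another, so the winning behavior can shift with the situation.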

Vertebrate immune systems have inspired computer scientists and engineers to create new algorithms for solving real-world problems. The four main AIS algorithms are:

1. Negative selection algorithms
2. Artificial immune networks
3. Clonal selection algorithm
4. Danger theory and dendritic cell algorithm

Recent developments include AIS applications in computer security, optimization, data mining, fault detection, etc. Several authors have reviewed these developments. Garrett (2005) surveyed the work done before 2005 and attempted to evaluate AIS against the criteria of distinctiveness and effectiveness. Hart and Timmis (2010) discussed the applications of AIS and proposed a set of problem features for which AIS are well suited.

Some of the recently developed models and hybrid approaches are explained below:

12.5 Conserved Self Pattern Recognition Algorithm (CSPRA)

This is a recent algorithm in the AIS area, inspired by the Pattern Recognition Receptors (PRR) model. According to the PRR model, self/non-self discrimination requires stimulation from antigen-presenting cells (APCs). The APCs, in turn, are not stimulated unless they are activated via PRRs that identify molecular patterns on bacteria. The PRR model thus adds a further layer of molecular pattern recognition. CSPRA (2010) incorporates the negative selection algorithm: anomaly detection in CSPRA is performed by combining the results of APC self pattern recognition and T-cell negative selection. Self pattern recognition by the APCs is performed only when the antigen has not been detected by T-cell negative selection. The generation of an APC detector includes two major steps:

1. Depending on the function between the antigen and its feature space, define the conserved self pattern, which can be pre-defined from the data. The pattern may come from empirical laboratory data, or it can be calculated automatically using Pearson's correlation coefficient between the column of each attribute and the corresponding label.

2. By evaluating the minimum, maximum and mean of all the values in the feature space at loc1, loc2, ..., generate the APC detector R = {(loc1, min, max, mean), (loc2, min, max, mean), ...} within the conserved self pattern of the features located at loc1, loc2, ...
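The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the CSPRA implementation: the correlation threshold of 0.8, the range check used to flag non-self samples, the function names and the toy data are all hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def conserved_locations(rows, labels, threshold=0.8):
    """Step 1: keep the feature columns that correlate strongly with the label."""
    return [j for j in range(len(rows[0]))
            if abs(pearson([r[j] for r in rows], labels)) >= threshold]


def apc_detector(self_rows, locations):
    """Step 2: build R = {(loc, min, max, mean), ...} from the self samples."""
    detector = []
    for loc in locations:
        column = [r[loc] for r in self_rows]
        detector.append((loc, min(column), max(column), mean(column)))
    return detector


def apc_flags_nonself(sample, detector):
    """Assumed check: flag a sample whose conserved feature leaves the self range."""
    return any(not (lo <= sample[loc] <= hi) for loc, lo, hi, _ in detector)


# Toy data: feature 0 separates the classes, feature 1 is unrelated to the label.
rows = [(1.0, 5.0), (1.2, 3.0), (0.9, 4.0), (3.0, 5.0), (3.2, 3.0), (2.9, 4.0)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = self, 1 = non-self
locations = conserved_locations(rows, labels)
detector = apc_detector([r for r, y in zip(rows, labels) if y == 0], locations)
```

Here only feature 0 is retained as the conserved self pattern. The stored mean is not used by the simple range check in this sketch, but it is kept so that the detector has the four-field form given in step 2.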

Compared with the classical negative selection algorithm, the proposed and tested CSPRA algorithm shows better, promising results, reducing the number of false detections without increasing the complexity [3, 4, 13].

12.6 Recent Complex Artificial Immune Systems (CAIS)

CAIS consists of five layers: the encounter layer, the preprocessing layer, the MHC layer, the competitive layer and the stimulation layer. The antigen is the input and the antibody is the output. When the system encounters an antigen, it can recognize it in two ways: directly through the B cells, or through the APC layer. In the second route, the input is given to the APC layer, and the molecular complex pattern it forms is passed to the MHC layer for processing. The information coming from the APC layer is transformed and translated in the MHC layer and fed to the Th layer. In the Th layer, the cells receive different responses from the MHC layer and form a set of Th cells that respond best to the input antigen. B cells are activated both by stimulation from the Th layer and by the input pattern. An antibody is the difference between an input and the weights associated with the B cells. Ts cells modulate the weights of the immune cells located in the neighborhood set. Compared with binary immune systems, CAIS can recognize patterns invariantly under translation, rotation and scaling. It can be applied to handwriting pattern recognition problems [11, 13].

12.7 Hybrid Approaches

BAIS (Bayesian Artificial Immune System) replaces the mutation and cloning operators with a probabilistic model for solving optimization problems, including multiobjective optimization. BAIS is capable of capturing the most relevant interactions between the problem variables. The algorithm adopts a population-based search strategy and uses a Bayesian network to implement the probabilistic model.

Once the population is initialized, the algorithm enters a loop that runs until a stopping condition is met. Each iteration performs the following steps:

a. Using a suitable selection technique, select the best solutions from the current population.
b. Build the Bayesian network that best fits the selected solutions.
c. Sample new antibodies from the network.
d. Remove the antibodies with lower fitness, as well as those that are too similar to one another.
e. Insert randomly generated antibodies among the selected ones to maintain diversity [13].
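The loop in steps a to e can be sketched as below. To keep the example short, the Bayesian network is replaced by a much simpler model with one independent probability per bit, so this sketch does not capture interactions between variables as BAIS does. The OneMax fitness function, the parameter values and all names are assumptions for illustration.

```python
import random


def onemax(bits):
    """Toy fitness: the number of ones in the bit string."""
    return sum(bits)


def bais_sketch(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)

    def random_antibody():
        return tuple(rng.randint(0, 1) for _ in range(n_bits))

    population = [random_antibody() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # (a) Select the best half of the population.
        ranked = sorted(population, key=onemax, reverse=True)
        selected = ranked[: pop_size // 2]
        # (b) Fit the probabilistic model. Simplification: independent per-bit
        #     probabilities stand in for the Bayesian network.
        probs = [sum(ab[j] for ab in selected) / len(selected)
                 for j in range(n_bits)]
        probs = [min(0.95, max(0.05, p)) for p in probs]
        # (c) Sample new antibodies from the model.
        sampled = [tuple(int(rng.random() < p) for p in probs)
                   for _ in range(pop_size)]
        # (d) Drop duplicates (the "similar" antibodies) and the least fit.
        pool = sorted(set(selected + sampled), key=onemax, reverse=True)
        survivors = pool[: pop_size - pop_size // 5]
        # (e) Insert random antibodies to maintain diversity.
        newcomers = [random_antibody()
                     for _ in range(pop_size - len(survivors))]
        population = survivors + newcomers
        history.append(max(onemax(ab) for ab in population))
    return max(population, key=onemax), history


best, history = bais_sketch()
```

Because the selected antibodies are carried into the pool before truncation, the best fitness recorded in `history` never decreases from one iteration to the next.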

BAIS can be applied to feature selection using a wrapper approach. It is able to handle the building blocks in the optimization of the Trap-5 function, whether those building blocks are non-overlapping or overlapping. The multiobjective knapsack problem can also be solved very efficiently by a BAIS-based algorithm. This approach is termed the Multiobjective Bayesian Artificial Immune System (MOBAIS), and it can also be applied to classification problems. It identifies and preserves the building blocks effectively while searching for and finding diverse, high-quality local optima. Practical applications show that it produces parsimonious and accurate results. Furthermore, the learning of the Bayesian network was enhanced to avoid rebuilding the network at each iteration: only the crucial parameters, namely the conditional and marginal probabilities, are updated at each iteration [13].

An unstructured damage classification algorithm can be built on data clustering and AIS pattern recognition. The technique clusters the training data into a specified number of clusters to generate the initial memory cell set, and then applies AIS pattern recognition algorithms to evolve the memory cells.

AIS combined with SVM can be used for fault diagnosis of induction motors. The AIS tunes the kernel and penalty parameters to improve classification accuracy.

In the immune multiagent recognizer, each agent recognizer is an immune RBF neural network model. In this model, the antigens are the input and the antibodies are the compression cluster mapping that forms the hidden layer. The output weights can be determined using the least squares algorithm. Each level of the recognition system contains recognizers, each of which can recognize one sort of antigen.

A multiple-valued immune network classifier (MVINC), based on immune network theory, was applied to remote sensing images; it implements immune memory for classification using logic theory and immune theory.

EaiNET combines AIS with particle swarm optimization (PSO). It uses the learning mechanism of PSO, in which each individual learns from the best individuals in the population, and this increases the convergence rate.

AIS have also been combined with Radial Basis Function (RBF) artificial neural networks. The immune tool, called aiNET, compresses the data set, and it can also be used to determine the number of radial basis functions in the resulting network (RBFNN).



A fault diagnosis model was proposed based on the immune evolution algorithm. The design includes diversity evaluation, which is very complex, and fault detection, which is hard; a fault calculation technique integrating the induction and static approaches was designed [13].

In particular, the computational properties of degenerate recognition systems have been investigated by combining agent-based modeling and UML. This work shows that degenerate receptors can be identified and that, compared with a non-degenerate system, recognition occurs more quickly.

In the resource limited AIS, the Network Affinity Threshold (NAT) is not recalculated during the network evolution process: the NAT determines the network granularity, and its initial value is calculated from the distances between the antigens. Pure clonal selection and random mutation operations can impair the convergence and stability of the population.

The gene immune detection algorithm with a complement operator effectively decreases the false positives that surfaced in the previous gene immune detection algorithm. A vaccine operator and the complement operator are introduced. The number of detectors is reduced and the detection efficiency is increased. The complement operator overcomes the defects of the gene immune algorithm, and the detection time improves drastically.

ICAIS, an incremental clustering algorithm based on the principles of AIS, was also introduced. It uses the primary immune response to assign data to novel clusters and the secondary immune response to assign data to existing patterns [13].

A model based on Learning Vector Quantization (LVQ) and the immune network, which extends the basic Jerne model, was proposed for pattern recognition [13]. A new classification method, the Hybrid Fuzzy Neuro-Immune Network, is based on the multi-epitope approach. The performance of the proposed method shows promising results in pattern recognition.



APPENDIX A

Pattern Recognition in the Immune System using a Growing SOM

[ The following project is taken from the Ph.D. thesis of Leandro N. de Castro ]

function [w,win,cwin,D] = abnet(ag,eps,comp,alfa,beta,pc,pm),

% Pattern Recognition in the Immune System using a Growing SOM
% Bipolar Splitting/Pruning Self-Organizing Feature Map (GSOM)
% with Evolutionary Phase
% Main features: bipolar weights, Hamming Distance, Winner takes all
% PHASE I: Growing followed by Pruning
% PHASE II: Supervised Evolution
%
% function [w,win,cwin,D] = hybrid(ag,eps,comp,alfa,beta,pc,pm),

% w    -> weight matrix (Ab population)
% win  -> winner for each Ag (v)
% cwin -> amount of winning of each individual (tau)
% D    -> hamming distance of each Ag with relation to its mapped class
% ag   -> antigen population to be recognized (n2xs2)
% eps  -> ball of stimulation
% comp -> comparison: 1 for comparing complementary chains
%                     0 for comparing identical chains (Hamm. dist.)
% alfa -> amount of bits to be changed
% beta -> number of iterations for reducing the learning rate
%
% Auxiliary functions: COVER, UPDATE, SPLIT, PRUNE, MATCH, CADEIA, TESTGSOM
% The columns of w must be similar to each Ag

if nargin == 2,
    [n2,s2] = size(ag);
    comp = 0;
    alfa = 3;
    beta = 3;
    pc = 0.6;
    pm = 0.1;
end;

% Network parameters
ep = 0; alfa0 = alfa; TD = 1;
[np,ni] = size(ag); no = 1; vep = [0];
[C,maxno] = cover(ni,eps); vno = [1:1:no];
disp(sprintf('Coverage of each Ab: %d',C));
disp(sprintf('Initial number of classes: %d',no));
disp(sprintf('Possible number of classes: %d',maxno));
if maxno > np,
    maxno = np; disp(sprintf('Maximum number of classes (N): %d',np));
end;
% disp(sprintf('Affinity threshold: %d',eps));
disp(sprintf('Press any key to continue...'));



pause;
[w] = cadeia(ni,no,0,0,1);
max_ep = (beta + 1) * maxno;

% Network Definition
while (ep < max_ep & TD > 0)% & no < maxno),
    cwin = zeros(1,no); k = 0;
    vet = randperm(np); % Asynchronous
    while k < np,
        k = k+1; i = vet(k); D = [];
        [D,mXOR] = match(w',ag(i,:),comp);
        [v(k),ind] = min(D);
        cwin(ind) = cwin(ind) + 1;
        win(i) = ind;
        w = update(w,ind,alfa,mXOR(ind,:)');
    end;
    TD = sum(v);
    ep = ep + 1;
    % Growing Phase
    if (rem(ep,beta)==0),
        [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0);
        vno = [vno no]; vep = [vep ep];
    end;
    % Pruning Phase
    [aux,indmin] = min(cwin);
    if aux == 0,
        [w,no,alfa] = prune(w,indmin,alfa0);
        vno = [vno no];
    end;
    % Learning rate decreasing
    if (ep > 0.05*max_ep & rem(ep,0.05*max_ep)==0),
        if alfa > 1,
            alfa = alfa - 1;
        end;
    end;
    disp(sprintf('IT: %4.0d no: %d LR: %d TD: %d',ep,no,alfa,TD));
end;
[v,win,cwin,perc] = testgsom(w,ag,eps);
disp(sprintf('Percentage of misclassified Ag: %3.2f%%',perc));
disp('Minimal Antigenic Affinity (HD)'); disp(v);
disp('Concentration Level: '); disp(cwin);
disp(sprintf('Final Architecture: [%d,%d].',ni,no));
figure(1); plot(vep,vno); hold on; plot(vep,vno,'or'); axis([0 ep+1 0 no+1]);
title('Growing Evolution'); xlabel('Iteration'); hold off;

% --------------------------- %
% INTERNAL SUBFUNCTIONS %
% --------------------------- %

% Function CADEIA
function [ab,ag] = cadeia(n1,s1,n2,s2,bip)
if nargin == 2,
    n2 = n1; s2 = s1; bip = 1;
elseif nargin == 4,
    bip = 1;
end;



% Antibody (Ab) chains
ab = 2 .* rand(n1,s1) - 1;
if bip == 1,
    ab = hardlims(ab);
else,
    ab = hardlim(ab);
end;
% Antigen (Ag) chains
ag = 2 .* rand(n2,s2) - 1;
if bip == 1,
    ag = hardlims(ag);
else,
    ag = hardlim(ag);
end;
% End Function CADEIA

% Function SPLIT
function [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0)
[ni,no] = size(w);
[ind] = find(cwin > 1); % which outputs map more than one Ag
if ~isempty(ind),
    [val,out] = max(cwin);
    % out = ind(1);
    v = find(win==out);
    Mag = ag(v,:); % matrix of ag mapped in the same output
    D = match(Mag,w(:,out)',0);
    [aux,new] = max(D);
    if aux > eps,
        disp('** Growing **');
        if out == 1,
            w = [Mag(new,:)',w];
        elseif out == no,
            w = [w,Mag(new,:)'];
        else,
            w = [w(:,1:out),Mag(new,:)',w(:,out+1:end)];
        end;
        no = no + 1;
        alfa = alfa0;
    end;
end;
% End Function SPLIT

% Function TESTGSOM
function [v,win,cwin,k] = testgsom(w,ag,eps),
% disp('** Running the trained network **');
[np,ni] = size(ag); k = 0;
cwin = zeros(1,size(w,2));
for i=1:np,
    [D] = match(w',ag(i,:),0);
    [v(i),ind] = min(D);
    win(i) = ind;
    cwin(ind) = cwin(ind) + 1;
end;
k = 100 * (sum(v > eps) / np);
% End Function TESTGSOM



% Function PRUNE
function [w,no,alfa] = prune(w,ind,alfa0),
[ni,no] = size(w);
disp('** Pruning **');
if ind == 1,
    w = w(:,2:no);
elseif ind == no,
    w = w(:,1:no-1);
else,
    w = [w(:,1:ind-1) w(:,ind+1:no)];
end;
no = no - 1;
alfa = alfa0;
% End Function PRUNE

% Function COVER
function [C,no,eps] = cover(len,eps),
fat = fatorial(len);
C = 0;
while eps > len,
    disp(sprintf('Ball of stimulation bigger than chain length %d',len));
    eps = input('Enter a new ball of stimulation: ');
end;
for i=0:eps,
    C = C + (fat/(fatorial(i) * fatorial(len-i)));
end;
no = ceil((2^len)/C);
% End Function COVER

% Function FATORIAL
function fat = fatorial(m);
if m == 0,
    fat = 1;
elseif m < 0,
    disp('Negative value');
else,
    fat = prod(1:1:m);
end;
% End Function FATORIAL

% Function UPDATE
function [w] = update(w,ind,alfa,vXOR),
[ni,no] = size(w);
for j = 1:alfa,
    [val,pto] = max(vXOR);
    if val == 0,
        break; % exit loop if vectors are equal
    end;
    w
