independent study report # 1

Upload: sunny-dubey

Post on 07-Apr-2018


  • 8/4/2019 Independent Study Report # 1

    1/75

    1

    Independent Study Report

    Artificial Immune Systems

    1. Introduction:

    The biological immune system is a complex, adaptive system that defends the body against attack by antigens or pathogens. It is able to differentiate between self cells and non-self cells. This is made possible by a distributed, parallel defense force that has the intelligence to take appropriate action from both a local and a global view, using its network of chemical messengers for communication.

    There are two major branches of the immune system:

    1. The innate immune system is a static system that identifies and destroys antigens;
    2. The adaptive immune system reacts to unknown antigen patterns and develops a response to the encountered antigens that can remain within the body for a long time.

    This remarkable information-processing capability of the biological immune system has caught the attention of computer engineers around the world, with applications in computer security, anomaly detection, fault tolerance, pattern recognition, etc.

    The field has also found applications in robotics and, in some cases, in optimization tasks.

    2. Overview of Biological Immune Systems:

    The biological immune system has evolved over millions of years and is an elaborate defense system. It employs multilevel, overlapping defenses in a parallel and distributed way, although the immune mechanisms (innate and adaptive) and processes (humoral and cellular) are not yet completely understood.

    The biological immune system responds to an attack either by neutralizing the antigenic effect or by destroying the antigen. The response depends on the type of antigen and on the way it enters the body.

    The crucial features of the biological immune system are:

    a. Affinity (matching)
    b. Diversity
    c. Distributed operation (no central mechanism)

    Affinity, or matching degree, refers to the strength of binding between an antibody and an antigen.

    Diversity means that there must be many different antibody types, each of which can act as a key to a particular antigen lock.

    Distributed control means that no central mechanism governs the immune response when an antigen attacks; the response arises from local interactions between immune cells and antigens.

    Two types of immune cell play an important role in the immune response:

    1. B-cells (Bone Marrow),
    2. T-cells (Thymus).

    Both types originate in the bone marrow, but T-cells migrate to the thymus to mature and then circulate through the body in the blood. There are three types of T-cells:

    a. Helper T-cells
    These cells are important for the activation of B-cells.

    b. Killer T-cells
    These cells attach to foreign invaders and inject destructive chemical molecules into them, thereby destroying them.

    c. Suppressor T-cells
    This type of T-cell suppresses autoimmune interactions between cells and thereby contributes to the stabilization of the network.

    The B-cells, on the other hand, are responsible for the production of antibodies, which bind to antigens and cause them to be destroyed. Each B-cell produces only one type of antibody (of which there are millions).

    In the figure below, I-II show the invader entering the body and activating T-cells, which then in IV activate the B-cells; V is the antigen matching, VI the antibody production and VII the destruction of the antigen.

    Figure (1) Immune system Cells [6]

    From the above description one can say that the innate immune system is responsible for the primary response and the adaptive immune system for the secondary response.

    Hence, the human body is protected against foreign invaders by a multilevel system.

    The body's defenses include the skin, the respiratory system, destructive enzymes and stomach acids. The immune system itself is divided into two parts:

    1. Innate immunity (non-specific);
    2. Adaptive immunity (specific).

    These two systems are linked and influence each other.

    There are in turn two types of adaptive immunity:

    a. Humoral immunity,
    b. Cell-mediated immunity.

    1. Innate immunity:
    This immunity is congenital. pH, temperature and chemical conditions create an unfavorable living environment for foreign organisms. Extracellular molecules are ingested by macrophages, and this ingestion is influenced by chemical messengers called lymphokines. Foreign surfaces typically lack sialic acid, which allows the complement fragment C3b to remain bound to them for longer. As a result, the membrane attack complex (MAC) forms, penetrates the cell surface and kills the foreign cell.

    2. Adaptive Immunity:
    It is crucial for learning and memory.

    a. Humoral Immunity
    This kind of immunity is mediated by antibody molecules contained in the body fluids, known as humors. It involves the interaction between B-cells and antigens and the subsequent proliferation and formation of memory cells. When an antibody interacts with an antigen, the antigen can be destroyed in several ways. For instance, antibodies can cross-link the antigen, forming clusters that are more readily ingested by macrophages.

    b. Cell-Mediated Immunity
    As the name indicates, this kind of immunity is mediated by cells, and T-cells are responsible for it. Cytotoxic T-cells participate in cell-mediated immune reactions by killing altered self cells. Cytokines secreted by TDH cells can also mediate this kind of cellular immunity.

    3. Artificial Immune Systems Basic Concepts

    3.1 Initialization and Encoding:

    In order to implement an Artificial Immune System, four elements need to be considered:

    1. Encoding
    2. Similarity Measure
    3. Selection
    4. Mutation

    Once an encoding has been chosen, a similarity measure is defined to calculate the matching degree; selection and mutation are then applied repeatedly until the stopping criterion is reached.

    The choice of encoding scheme is very important for the algorithm's success. As in genetic algorithms, there is a close relationship between the encoding and the fitness function; in artificial immune systems, the fitness function is the matching or affinity measure.

    Two terms need to be considered: antigen and antibody. An antigen is the target or solution for a given problem, for example the data item to be checked or an intrusion into the system. The antibodies are the remaining data, e.g., the other users in the data set or the network traffic.

    Antigens and antibodies are encoded in the same way. The most common representation is a string, where the length is the number of variables, the position identifies the variable and the entry is the value of that variable.

    For data mining and intrusion detection, an instance of a five-variable binary problem can be written as: (10010)

    Example:

    Data Mining: The problem of recommending movies.

    The encoding has to represent a user's profile with respect to the movies seen and how much each was liked or disliked. A list of numbers representing the votes can serve as the encoding. The votes can be binary or integers in a range such as [0, 5], where 0 indicates that the user did not like the movie and 1 to 5 rate how much the movie was appreciated.

    A possible encoding scheme for movie recommendation:

    $$\text{User} = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\} \qquad (1)$$

    id = identifier of the movie,
    score = score given to that movie by the user.

    Intrusion Detection:

    The encoding looks like:

    [ ], example: [

    which represents an incoming data packet sent to port 25. In these scenarios, wildcards such as "any port" are also often used [2,4].

    3.2 Similarity or Affinity Measure

    Matching degree is one of the most important aspects of developing an Artificial Immune System algorithm. Two matching algorithms for binary representations are described below.

    Consider the two strings:

    (0 0 0 0 0) and (0 0 0 1 1)

    A bit-by-bit comparison shows that the last two bits differ, so three bits match and the matching score is 3. This score is the opposite of the Hamming distance, which counts the bits that differ, i.e., that would have to be changed to make the strings identical.

    Now consider the strings (00000) and (01010). The score is again 3, even though the matching bits are distributed quite differently. This can be a problem. To avoid such an anomaly, we can instead take the length of the longest run of consecutive matching bits as the similarity measure. For the first example the score is then 3 and for the second example it is 1. If a binary representation is not wanted, a real-valued representation is available, and the Euclidean distance between the two vectors can be used.
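
    The two binary matching rules above can be sketched in a few lines of Python. This is only an illustration; the function names matching_bits and longest_matching_run are hypothetical and do not come from the cited sources.

```python
def matching_bits(a, b):
    """Number of positions at which two equal-length bit strings agree."""
    return sum(x == y for x, y in zip(a, b))


def longest_matching_run(a, b):
    """Length of the longest run of consecutive positions that agree."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


# Both pairs share three matching bits, but the runs differ.
print(matching_bits("00000", "00011"), longest_matching_run("00000", "00011"))
print(matching_bits("00000", "01010"), longest_matching_run("00000", "01010"))
```

    The Hamming distance is simply the string length minus matching_bits, which is why the text describes the two measures as opposites.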

    For data mining, the matching degree is often referred to as a correlation. In the movie recommendation example, suppose we are looking for the users in the data set that are similar to the profile of the main user. What we need is a measure of similarity, and for this we can use the Pearson correlation coefficient between two users.

    For two users u and v:

    $$r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2}} \qquad (2)$$

    Here n is the number of movies for which both u and v have voted, $u_i$ is the vote of user u for movie i and $\bar{u}$ is the average vote of user u over all movies ($v_i$ and $\bar{v}$ are defined likewise for user v). The measure is amended to default to a value of 0 if the two users have no films in common.

    The output ranges from -1 (strong disagreement) to 1 (strong agreement); 0 means no correlation. For data mining, values close to 1 and -1 are the most informative.
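
    A minimal sketch of this correlation measure follows. It assumes each user profile is a dictionary from movie identifier to vote, and it takes each user's mean over all of that user's votes, as the text states; the function name pearson and the sample data are illustrative only.

```python
import math


def pearson(u, v):
    """Pearson correlation over the movies both users have voted on.

    u, v: dicts mapping movie id -> vote (an assumed representation).
    Returns 0.0 when the users have no movies in common.
    """
    common = sorted(set(u) & set(v))
    if not common:
        return 0.0
    mean_u = sum(u.values()) / len(u)
    mean_v = sum(v.values()) / len(v)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = math.sqrt(sum((u[i] - mean_u) ** 2 for i in common)
                    * sum((v[i] - mean_v) ** 2 for i in common))
    # A zero denominator (no variation in the votes) is treated as no correlation.
    return num / den if den else 0.0


alice = {"m1": 1, "m2": 2, "m3": 3}
bob = {"m1": 2, "m2": 4, "m3": 6}
carol = {"m1": 3, "m2": 2, "m3": 1}
print(pearson(alice, bob), pearson(alice, carol), pearson(alice, {"m9": 5}))
```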

    In the negative selection algorithm, the elements that match are eliminated. This mirrors B-cell maturation, in which cells that match self molecules or cells do not survive.

    The question now is where negative selection is applied in artificial immune system implementations.

    Consider intrusion detection. One approach to such problems is to define a self set S and then randomly initialize a set of detectors. The detectors are passed through a matching algorithm that compares them with the self set. Any detector that matches is rejected, so that only the elements that do not match self remain. These non-matching elements make up the resulting detector set.

    The detector set is then used to monitor the network continually. A match is a sign of danger and raises an alert.

    Artificial Immune Systems, a branch of computational intelligence that emerged in the 1990s, are used in computer security, pattern recognition, etc. [2,4,6].

    4. Biological Immune System Models

    4.1 Negative Selection Principle

    The thymus is responsible for the maturation of T-cells and is shielded by a blood barrier that excludes non-self antigens from the thymus. Hence, the majority of the cells present in the thymic environment are self rather than non-self. As a consequence, T-cells whose receptors recognize self cells are eliminated in the thymus through the biological process termed negative selection. All the mature cells that leave the thymic environment are self-tolerant, i.e., they do not respond to self cells.

    From an information-processing point of view, negative selection performs pattern recognition by collecting the crucial information about the complement (the non-self) of a known set of patterns. Taking inspiration from biology, a negative selection algorithm has been put forward for anomaly detection and fault tolerance.

    Define the set to be protected and call it the self set (P). The goal is to generate a set of detectors (M) that detect all elements not belonging to P. The negative selection algorithm proceeds as follows:

    1. Produce random candidate elements (C);
    2. Compare the elements of C with P. If an element of C matches an element of P, discard it; otherwise store it in set M.

    Once the set M has been created, the next step is to monitor the system for non-self patterns. Consider a set P* to be monitored. P* may consist of the elements of P plus some new patterns, or it can be an entirely new set. For each detector in M, which corresponds to a non-self pattern, check whether it matches an element of P*; if it does, a non-self pattern has been recognized and an action is taken. [12]
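
    The two phases described above (censoring and monitoring) can be sketched as follows. The matching rule used here, a Hamming distance of at most radius, is an assumption made for the example; the text does not prescribe one, and the names generate_detectors and monitor are hypothetical.

```python
import random


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def matches(detector, pattern, radius):
    """Assumed matching rule: the strings differ in at most `radius` bits."""
    return hamming(detector, pattern) <= radius


def generate_detectors(self_set, n_detectors, length, radius, rng,
                       max_tries=100_000):
    """Censoring phase: keep random candidates that match no self element."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.randint(0, 1) for _ in range(length))
        if not any(matches(candidate, s, radius) for s in self_set):
            detectors.append(candidate)
    return detectors


def monitor(detectors, observed, radius):
    """Monitoring phase: return the observed patterns flagged as non-self."""
    return [p for p in observed
            if any(matches(d, p, radius) for d in detectors)]


rng = random.Random(0)
self_set = [(0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 1, 1, 1, 1)]
detectors = generate_detectors(self_set, 20, 8, radius=1, rng=rng)
observed = self_set + [(1, 1, 1, 1, 1, 1, 1, 1)]
print(monitor(detectors, observed, radius=1))
```

    By construction, no self pattern can raise an alarm; whether a given non-self pattern is detected depends on how well the random detectors happen to cover the space.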

    Figure (2) Negative Selection Principle [12]

    4.2 Clonal Selection

    Clonal selection is the theory used to describe how an immune response is mounted when a non-self pattern is identified by a B-cell; it is complementary to negative selection. Figure (3) shows clonal selection, proliferation and affinity maturation. When a B-cell recognizes an antigen with a certain degree of affinity, it is selected to proliferate and to produce a high volume of antibodies, which bind to the antigens and lead to their elimination with the aid of other immune cells. Proliferation is asexual: it is a mitotic process in which the cells divide themselves. The B-cell clones undergo hypermutation, which results in B-cells with higher affinity towards the antigen. Some B-cells also become memory cells.

    From a computational point of view:

    1. An antigen selects immune cells to proliferate. The rate of proliferation is directly proportional to affinity: the higher the affinity, the higher the proliferation.
    2. The mutation rate is inversely proportional to affinity.

    Figure (3) Clonal selection [12]

    Genetic algorithms without the crossover operator are similar to clonal selection. However, the genetic algorithm has no affinity-proportional reproduction or mutation. The CLONALG algorithm was therefore proposed to include these properties. It was first proposed for pattern recognition and later modified for optimization tasks.

    Given a set of patterns P to be recognized, the steps of the CLONALG algorithm are:

    1. Generate a population of patterns (M) randomly.
    2. Present each pattern of P to the population M and determine its affinity with every element of M.
    3. Select the elements of M that have the best affinity and produce copies of them in proportion to their affinity with the antigen: the higher the affinity, the larger the number of copies.
    4. Mutate all the copies at a rate inversely related to their affinity with the input pattern: the higher the affinity, the lower the mutation rate.
    5. Add the mutated elements to set M and re-select the best of the matured elements; these are kept as the memories of the system.
    6. Repeat steps 2 to 5 until a stopping criterion is met, for example a minimum pattern recognition or classification error.

    This algorithm is what makes Artificial Immune Systems good at pattern recognition: CLONALG learns to recognize patterns through evolutionary-like behavior. [12]
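
    The steps above can be sketched as follows for binary patterns. This is a simplified illustration rather than the reference CLONALG implementation: the affinity measure (number of agreeing bits), the clone count rule, the mutation rate formula and all parameter names are assumptions chosen for the example.

```python
import random


def affinity(cell, pattern):
    """Assumed affinity: number of positions where the bits agree."""
    return sum(c == p for c, p in zip(cell, pattern))


def mutate(cell, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return tuple(1 - b if rng.random() < rate else b for b in cell)


def clonalg(patterns, pop_size=20, n_best=5, clone_factor=2,
            generations=30, seed=0):
    rng = random.Random(seed)
    length = len(patterns[0])
    # Step 1: random initial population M.
    population = [tuple(rng.randint(0, 1) for _ in range(length))
                  for _ in range(pop_size)]
    memory = {}  # best cell found so far for each pattern
    for _ in range(generations):
        for pattern in patterns:
            def score(cell):
                return affinity(cell, pattern)
            # Steps 2-3: rank by affinity and select the best cells.
            best = sorted(population, key=score, reverse=True)[:n_best]
            clones = []
            for cell in best:
                aff = score(cell)
                n_clones = max(1, clone_factor * aff)  # more affinity, more clones
                rate = 0.5 * (1.0 - aff / length)      # more affinity, less mutation
                clones.extend(mutate(cell, rate, rng) for _ in range(n_clones))
            # Step 5: add the clones and keep the best cells.
            population = sorted(population + clones, key=score,
                                reverse=True)[:pop_size]
            champion = population[0]
            if pattern not in memory or score(champion) > score(memory[pattern]):
                memory[pattern] = champion
    return memory


patterns = [(1, 0, 1, 0, 1, 0, 1, 0), (1, 1, 1, 1, 0, 0, 0, 0)]
for pattern, cell in clonalg(patterns).items():
    print(pattern, cell, affinity(cell, pattern))
```

    The memory cell for a pattern is replaced only by a strictly better candidate, so its affinity never decreases from one generation to the next.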

    4.3 Immune Network

    The immune network theory states that the immune system exhibits dynamic behavior even when no antigen is present. How does this happen? It is proposed that immune cells and molecules are able to recognize each other. The theory has been criticized by many immunologists, but the computational features of immune networks are very useful, for instance in robotics. According to this theory, the molecules on the surface of an antibody that can be recognized by other antibodies are called idiotopes.

    To illustrate the theory, assume that antibody Ab1 recognizes antigen Ag, and that Ab1 also recognizes an idiotope of antibody Ab2. Ab1 thus recognizes both Ab2 and Ag, and we say that Ab2 is an internal image of Ag. This recognition of idiotopes between molecules gives rise to a network of connected cells, which is a network of affinities. Within it, antibody-antibody recognition results in network suppression, while antibody-antigen recognition results in network activation and cell proliferation.

    In computational models, the suppression caused by one antibody recognizing another is implemented by eliminating all but one of the self-recognizing cells.

    Figure (4) Immune Network [12]

    The resulting immune network algorithm, for a set P of patterns to be recognized, is:

    1. Generate the network population randomly.
    2. For every element of P, apply CLONALG, which returns a set M* of memory cells and their coordinates for the current antigen.
    3. Calculate the affinity between all elements of M*.
    4. Eliminate all but one of the elements of M* whose mutual affinity exceeds a prescribed threshold. The intent is to remove redundancy in the network by suppressing self-recognizing elements.
    5. Combine the elements remaining after step 4 with the elements that remained for each previously presented antigen. This gives set M.
    6. Calculate the matching degree between every pair of elements of M and suppress all but one of the self-recognizing elements.
    7. Repeat steps 2 to 6 until the desired result is attained. [12]
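
    The suppression used in steps 4 and 6 can be sketched as below. It assumes cells are bit tuples, that affinity between two cells is the number of agreeing bits, and that a cell "recognizes" another when this affinity reaches the threshold; the function name suppress is hypothetical.

```python
def affinity(a, b):
    """Assumed cell-to-cell affinity: number of positions that agree."""
    return sum(x == y for x, y in zip(a, b))


def suppress(cells, threshold):
    """Keep one representative of each group of mutually recognizing cells.

    A cell is dropped when its affinity with an already kept cell
    reaches `threshold`.
    """
    kept = []
    for cell in cells:
        if all(affinity(cell, k) < threshold for k in kept):
            kept.append(cell)
    return kept


cells = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1)]
print(suppress(cells, threshold=3))
```

    The result depends on the order in which cells are visited; a fuller implementation would typically visit them in order of decreasing antigen affinity so that the best representative survives.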

    5. Modeling the Biological Immune Systems

    5.1 Shape-Space Model:

    The interaction between antibody and antigen is of central importance in the immune system. The concept of shape-space was introduced by Perelson and Oster in 1979 to describe the interactions between immune cell molecules and antigens quantitatively.

    According to this concept, an antibody can recognize the antigens that lie within a region around it, known as its recognition region. Binding between an antibody and an attacking antigen usually involves short-range non-covalent interactions based on electrostatic charge, hydrogen bonding, van der Waals forces, etc. The molecules must therefore come close to each other over a sufficient portion of their surfaces, which requires extensive regions of complementarity.

    The presence of chemical groups, together with the shape and charge distribution, are characteristic properties of antigens and antibodies that are crucial in determining the interactions between these molecules. This set of features is called the generalized shape of a molecule [1].

    Suppose that the generalized shape of an antibody combining site can be described by L parameters: the length, height and width of any bump or groove in the combining site, its charge, etc. The exact number of parameters and their values are not important here. A point in an L-dimensional space, called shape-space, then specifies the generalized shape of an antigen-binding region with regard to its antigen-binding properties.

    If an organism has a repertoire of size N, the shape-space contains N points. These points lie within a finite volume V of the space, because there is only a restricted range of lengths, widths, charges, etc. that an antibody combining site can assume. Since Ag-Ab interactions are measured via regions of complementarity, antigenic determinants (epitopes) are also characterized by generalized shapes whose complements lie within V.

    An antigen and an antibody need not match exactly; they may match with lower affinity. Each paratope is therefore assumed to interact with all epitopes whose complements lie within a small surrounding region, characterized by a radius ε.

    Since each antibody can recognize every epitope within its recognition region, and an antigen can present different types of epitopes, a finite number of antibodies can recognize an almost infinite number of points within their recognition regions. This is related to the cross-reactivity phenomenon in biological immune systems. In the shape-space model, similar patterns occupy adjacent regions of the shape-space and may be recognized by the same antibody, as long as they lie within the radius ε [6].

    Figure (5) Shape-Space Model [6]

    5.2 Ag - Ab Representations and Affinities:

    The Ag-Ab representation determines which distance measure can be used to calculate the degree of interaction between these molecules.

    Mathematically, there are three ways to represent antibody-antigen pairs and to determine their matching strength:

    1. Euclidean shape-space
    2. Manhattan shape-space
    3. Hamming shape-space [4]

    The generalized shape of a molecule m, either antibody or antigen, can be represented by a set of real-valued coordinates $m = \langle m_1, m_2, \ldots, m_L \rangle$, so that m is a point in an L-dimensional real-valued shape-space.

    The affinity between an antibody and an antigen is related to the distance between their strings or vectors, for example the Euclidean or the Manhattan distance. If the coordinates of an antibody are given by $\langle ab_1, ab_2, \ldots, ab_L \rangle$ and the coordinates of an antigen by $\langle ag_1, ag_2, \ldots, ag_L \rangle$, then the distance D between them is:

    $$D = \sqrt{\sum_{i=1}^{L} (ab_i - ag_i)^2} \qquad (3)$$

    $$D = \sum_{i=1}^{L} |ab_i - ag_i| \qquad (4)$$

    Eqn (3) gives the Euclidean distance and Eqn (4) the Manhattan distance. Shape-spaces that use real-valued coordinates and measure distance in the form of Eqn (3) are called Euclidean shape-spaces, and those that use the form of Eqn (4) are called Manhattan shape-spaces.

    Another shape-space is the Hamming shape-space, in which antigens and antibodies are represented as sequences of symbols over an alphabet of size k. Such sequences can be interpreted as peptides, and the different symbols as characteristic properties of amino acids. In the context of artificial immune systems, sequence and shape are treated as equivalent.

    $$D = \sum_{i=1}^{L} \delta_i, \quad \delta_i = \begin{cases} 1 & \text{if } ab_i \neq ag_i \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

    Equation (5) gives the Hamming distance measure.
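
    The three distance measures of the Euclidean, Manhattan and Hamming shape-spaces are straightforward to compute. The sketch below assumes antibodies and antigens are given as equal-length sequences; the function names are illustrative.

```python
import math


def euclidean(ab, ag):
    """Euclidean distance between two real-valued vectors."""
    return math.sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def manhattan(ab, ag):
    """Manhattan distance between two real-valued vectors."""
    return sum(abs(a - g) for a, g in zip(ab, ag))


def hamming(ab, ag):
    """Hamming distance between two symbol sequences over any alphabet."""
    return sum(a != g for a, g in zip(ab, ag))


print(euclidean([0.0, 0.0], [3.0, 4.0]))
print(manhattan([0.0, 0.0], [3.0, 4.0]))
print(hamming("ACGT", "ACCA"))
```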

    Equations (3) to (5) show how to determine the affinity between molecules in Euclidean, Manhattan and Hamming shape-spaces, respectively. To study cross-reactivity, it is important to establish the relation between the distance D, the recognition region and the matching threshold.

    When the distance between two sequences is maximal, the molecules are exact complements of each other and their affinity is also maximal. When the match is not perfect, real-valued spaces have to be treated differently from Hamming spaces in measuring Ag-Ab interactions.

    In Euclidean and Manhattan shape-spaces, a limit on the magnitude of each shape-space parameter can be imposed. Moreover, the distance can be normalized, for example to the interval [0, 1], so that the matching strength also lies in that range.

    If we assume a binary representation of the Ag-Ab interaction, the Hamming shape-space has a clear graphical interpretation. In a bitstring universe, molecular binding takes place when the bitstrings are complementary to each other, as illustrated for an antibody ab and an antigen ag in Figure (6).

    Figure (6) Antigen-Antibody perfect matching using bit-string representation [6]

    The affinity between an antibody and an antigen is the number of complementary bits in their representation strings. It can be measured with the XOR operator. The expected matching strength between two randomly chosen bitstrings equals half of their length (if they have the same length).
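
    A short sketch of the XOR-based affinity follows. It assumes bitstrings are stored as Python integers, so that a single XOR marks every complementary position; the function name xor_affinity is illustrative. The second part enumerates all pairs of 4-bit strings to check the half-length average.

```python
def xor_affinity(ab, ag):
    """Number of complementary bits between two bitstrings stored as ints."""
    return bin(ab ^ ag).count("1")


# Exact complements of length 8 give the maximum affinity.
print(xor_affinity(0b00101100, 0b11010011))

# Average affinity over all pairs of 4-bit strings.
L = 4
pairs = [(a, b) for a in range(2 ** L) for b in range(2 ** L)]
print(sum(xor_affinity(a, b) for a, b in pairs) / len(pairs))
```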

    A binding value indicates whether the molecules are bound or not, in other words whether the antigen is recognized by the antibody. Several activation functions can be used to derive the binding value from the distance between the Ab and Ag molecules.

    With a threshold function, a bond is established only when the match score is at least (L - ε), where L is the string length and ε is the matching threshold.

    In the continuous case a sigmoid function can be used instead, with ε determining the inflection point of the curve.
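
    The two activation functions can be sketched as below. Two assumptions are made: the threshold test uses "greater than or equal to", so that a threshold of zero corresponds to requiring a perfect match, and the sigmoid is centred on L minus the threshold with an illustrative steepness parameter. The threshold is called epsilon in the code.

```python
import math


def threshold_binding(match_score, length, epsilon):
    """Return 1 if the molecules bind under the step rule, else 0."""
    return 1 if match_score >= length - epsilon else 0


def sigmoid_binding(match_score, length, epsilon, steepness=1.0):
    """Smooth binding value in (0, 1); equals 0.5 at match_score = length - epsilon."""
    return 1.0 / (1.0 + math.exp(-(match_score - (length - epsilon)) / steepness))


for score in range(4, 9):
    print(score, threshold_binding(score, 8, 2), round(sigmoid_binding(score, 8, 2), 3))
```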

    In the Hamming shape-space, the set of all possible antigens is regarded as a space of points, in which antigenic molecules with similar shapes occupy adjacent points. The total number of unique antibodies and antigens is $k^L$, where k is the size of the alphabet and L is the length of the string.

    A given antibody recognizes some set of antigens and therefore covers some portion of the shape-space. The matching threshold ε determines the coverage provided by a single antibody. When ε = 0, a perfect match is necessary, meaning that the antibody and the antigen must be exact complements of each other.

    The number of antigens covered within a region of radius ε is given by:

    $$C = \sum_{i=0}^{\varepsilon} \binom{L}{i} = \sum_{i=0}^{\varepsilon} \frac{L!}{i!\,(L-i)!} \qquad (6.1)$$

    C = coverage of the antibody,
    L = length of the bitstring,
    ε = matching threshold.

    On the basis of Eqn (6.1), for bitstrings of length L and a matching threshold ε, the minimum number of antibody molecules (N) necessary to cover the whole shape-space can be defined as

    $$N = \operatorname{ceil}\left(\frac{2^L}{C}\right) \qquad (6.2)$$

    where $2^L$ is the size of the binary shape-space ($k^L$ with k = 2) and ceil is the operator that rounds the value in parentheses up to the nearest integer [2,4,6].
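
    As a worked example, the coverage and the minimum repertoire size can be computed directly. The sketch assumes a binary alphabet and that the coverage is the number of strings within the threshold (called epsilon here) of a perfect match; the function names are illustrative.

```python
import math


def coverage(length, epsilon):
    """Number of bitstrings within `epsilon` mismatches of a perfect match."""
    return sum(math.comb(length, i) for i in range(epsilon + 1))


def min_antibodies(length, epsilon):
    """Smallest number of antibodies that could cover all 2**length strings."""
    c = coverage(length, epsilon)
    return -(-(2 ** length) // c)  # integer ceiling division


for eps in range(0, 4):
    print(eps, coverage(8, eps), min_antibodies(8, eps))
```

    For L = 8 this gives a coverage of 1, 9 and 37 for thresholds 0, 1 and 2, so at least 256, 29 and 7 antibodies are needed, respectively. This is a lower bound, since it assumes the recognition regions do not overlap.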

    6. The AIS Model

    The artificial immune system model proposed by J.D. Farmer and N.H. Packard is simple enough to simulate on a computer, yet still contains enough realism to embody the characteristic properties of the network. The model leaves out many important features, such as T-cells and macrophages, but aims to retain the essence of the idiotypic network.

    The sequences of amino acids that specify the chemical properties of the epitope and the paratope are represented as binary strings. In this view, the antibodies are composed of just two "amino acids", 0 and 1. Alternatively, a sequence of five binary digits can be taken to correspond to one amino acid, which is enough to represent the twenty amino acids. A simplification made here is that each antigen and antibody has only one epitope, whereas in reality an antigen or antibody has many different epitopes [5].

    Thus, an antibody is represented as a pair (p, e), where p is the paratope string and e is the epitope string. The reactions allowed between different antibodies, and between antibodies and antigens, are found by searching for complementary matches between strings.

    Exact string matching is not required. The strings are allowed to match in any possible alignment, to model the fact that two molecules may react in more than one way. Let $l_e$ be the length of the epitope string and $l_p$ the length of the paratope string. A matching threshold $s \le \min(l_e, l_p)$ is defined, below which two antibodies will not react at all. Let $e_i(n)$ denote the value of the n-th bit of the i-th epitope string and $p_j(n)$ the value of the n-th bit of the j-th paratope string [1,2].

    Now, the matrix of matching specificities is given by:

    $$m_{ij} = \sum_{k} G\left( \sum_{n} e_i(n+k) \oplus p_j(n) - s + 1 \right) \qquad (7)$$

    In equation (7), $\oplus$ represents the exclusive-or operation, used for complementary matching.

    6.1 Procedure Used for Computing Partial Matches:

    Figure (7) Epitope and Paratope string matching [5]

    In this example, $l_e = l_p = 8$ and s = 6, so alignments with $-2 \le k \le 2$ are possible. Here k = -1, so that $e_i(n-1)$ is compared with $p_j(n)$. For this example, G = 1 for k = -1 and G = 0 for all other values of k, hence $m_{ij} = 1$.

    Here G(x) = x for x > 0 and G(x) = 0 otherwise. The sum over n ranges over all possible positions on the epitope and paratope; the sum over k allows the epitope to be shifted with respect to the paratope. G measures the strength of a possible reaction between the epitope and the paratope. For a given alignment, i.e., a given value of k, G is 0 if fewer than s bits are complementary and G = 1 + δ when s or more bits are complementary, where δ is the number of complementary bits in excess of s. If matches occur at more than one alignment, their strengths are summed, reflecting that the molecules can interact in more than one way and thus react more strongly, because they spend more time together than molecules that can interact in only one alignment [5].
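
    The partial-matching procedure can be sketched as follows. The code assumes that strings are lists of 0/1 integers and that, for a shift k, epitope bit n + k is compared with paratope bit n wherever both positions exist; the function name match_specificity is illustrative.

```python
def G(x):
    """G(x) = x for x > 0 and 0 otherwise."""
    return x if x > 0 else 0


def match_specificity(epitope, paratope, s):
    """Sum of reaction strengths over all alignments of epitope and paratope."""
    l_e, l_p = len(epitope), len(paratope)
    total = 0
    for k in range(-(l_p - 1), l_e):
        # Count complementary bits where epitope[n + k] and paratope[n] overlap.
        complementary = sum(
            epitope[n + k] ^ paratope[n]
            for n in range(l_p)
            if 0 <= n + k < l_e
        )
        total += G(complementary - s + 1)
    return total


# Perfectly complementary strings match at several alignments.
print(match_specificity([0] * 8, [1] * 8, s=6))
# Identical strings have no complementary bits at all.
print(match_specificity([0] * 8, [0] * 8, s=6))
```

    For the all-zeros epitope and all-ones paratope with s = 6, the alignments k = 0, ±1 and ±2 have 8, 7 and 6 complementary bits and contribute 3, 2 and 1 each, for a total of 9. Alignments whose overlap is shorter than s contribute nothing.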

    In this model, free antibodies and antibodies attached to cells are lumped together, and only the total number of antibodies of a given type i is tracked, through the concentration variable $x_i$.

    What happens when two different antibodies interact? Farmer and Packard assume that the paratope on one antibody recognizes the epitope on the other. They further assume that, as a result of this interaction, the antibody with the paratope reproduces some fixed number of times, while, with some fixed probability, the antibody with the epitope is eliminated. The degree to which one antibody reproduces and the other dies is controlled by the degree of complementarity between the paratope and the epitope. So, the model is symmetric with regard to antibody interaction.

    Let there be N antibody types with concentrations $\{x_1, x_2, \ldots, x_N\}$ and n antigen types with concentrations $\{y_1, y_2, \ldots, y_n\}$. Simulating the microscopic dynamics can be avoided by writing differential equations for the concentrations. This is possible only when the system is well mixed and sufficiently large that the number of interactions needed to produce a significant change in the concentration of any particular antibody type is huge.

    On the basis of these assumptions:

    $$\dot{x}_i = c\left[\sum_{j=1}^{N} m_{ji}\, x_i x_j \;-\; k_1 \sum_{j=1}^{N} m_{ij}\, x_i x_j \;+\; \sum_{j=1}^{n} m_{ji}\, x_i y_j\right] - k_2 x_i \qquad (8)$$

    In equation (8), $x_i$ is the concentration of the i-th antibody and $y_j$ the concentration of the j-th antigen. The first term represents the stimulation of the paratope of the i-th antibody by the epitope of the j-th antibody. The second term represents the suppression of the i-th antibody by the j-th antibody. The probability of a collision between an antibody of type i and an antibody of type j is given by the product $x_i x_j$, and the parameter c is a rate constant that depends on the number of collisions per unit time and the rate of antibody production stimulated by a collision.

    The match specificities $m_{ij}$ indicate which reactions occur and how strongly. The constant $k_1$ represents a possible inequality between stimulation and suppression. When $k_1 = 1$, the interactions between paratopes and epitopes are symmetric and the model is similar to the one proposed by Hoffman.

    To model the entire immune response, the antigen concentrations must also be included (third term); they may change as the number of antigens increases or decreases. The last term is the death rate. The best approach is to adjust $k_2$ so that the total concentration of the system is held at a fixed value [5].

    The list of antibody and antigen types is dynamic: it changes as new types are added or removed. The values of N and n therefore change with time, but on a time scale that is slow compared with changes in the concentrations. Equation (8) is integrated over a period of time, and the composition of the system is then examined and updated as needed. To update, a minimum threshold is placed on all concentrations, so that a variable and all of its reactions are eliminated when its concentration falls below the threshold.
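    The integrate-then-prune procedure described above can be sketched as follows. This is only a minimal illustration, not the implementation from [5]: it assumes an explicit Euler step, antigen concentrations held fixed during the step, and a rate law built from the terms the text describes (stimulation, suppression weighted by k1, stimulation by antigens, and a death rate k2). The names `euler_step` and `prune` and the matrix layouts `m[j][i]` and `m_ag[j][i]` are hypothetical.

```python
def euler_step(x, y, m, m_ag, c, k1, k2, dt):
    """Advance the antibody concentrations x by one explicit Euler step.

    x    -- antibody concentrations x_i
    y    -- antigen concentrations y_j (assumed constant during the step)
    m    -- m[j][i]: strength with which the epitope of antibody j
            matches the paratope of antibody i
    m_ag -- m_ag[j][i]: strength with which the epitope of antigen j
            matches the paratope of antibody i
    """
    n = len(x)
    new_x = []
    for i in range(n):
        # Paratope of i recognizes epitopes of the other antibodies.
        stimulation = sum(m[j][i] * x[i] * x[j] for j in range(n))
        # Epitope of i is recognized by the paratopes of other antibodies.
        suppression = sum(m[i][j] * x[i] * x[j] for j in range(n))
        # Paratope of i recognizes antigen epitopes.
        antigen_term = sum(m_ag[j][i] * x[i] * y[j] for j in range(len(y)))
        dx = c * (stimulation - k1 * suppression + antigen_term) - k2 * x[i]
        # Concentrations are clamped at zero (an assumption of this sketch).
        new_x.append(max(x[i] + dt * dx, 0.0))
    return new_x


def prune(x, threshold):
    """Return the indices of the types that stay at or above the threshold."""
    return [i for i, xi in enumerate(x) if xi >= threshold]
```

    In use, `euler_step` would be called repeatedly for some period, after which `prune` selects the surviving types and the matching matrices are rebuilt for them.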


    New antibody types are generated by applying genetic operators (crossover, inversion and point mutation) to the paratope and epitope strings. In crossover, two antibody types are selected at random, a position within the two strings is chosen at random, and the pieces on one side of the chosen position are interchanged to produce two new types. Epitopes and paratopes are crossed over separately. Point mutation is implemented by randomly changing one of the bits in a given string, and inversion by inverting a randomly chosen segment of the string.
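    The three operators can be sketched on bit strings as below. This is an illustrative sketch rather than the code used in [5]; the function names are hypothetical, and "inverting a segment" is assumed here to mean reversing the order of its bits, as in classical genetic algorithms.

```python
import random


def crossover(s1, s2, rng):
    """Single-point crossover: swap the tails of two equal-length strings."""
    point = rng.randrange(1, len(s1))  # cut somewhere inside the string
    return s1[:point] + s2[point:], s2[:point] + s1[point:]


def point_mutation(s, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(s))
    flipped = "1" if s[i] == "0" else "0"
    return s[:i] + flipped + s[i + 1:]


def inversion(s, rng):
    """Reverse a randomly chosen segment (assumed meaning of inversion)."""
    i, j = sorted(rng.sample(range(len(s) + 1), 2))
    return s[:i] + s[i:j][::-1] + s[j:]


rng = random.Random(0)
child_a, child_b = crossover("00000000", "11111111", rng)
mutant = point_mutation("0000", rng)
inverted = inversion("0011", rng)
```

    Because epitopes and paratopes are crossed over separately, `crossover` would be applied once to the two paratope strings and once to the two epitope strings.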

    Antigens can be generated by a variety of mechanisms, either randomly or by design. The same antigen type can be presented to the system repeatedly to see whether the system can eliminate it. Once the system has learned to eliminate it, a number of antigens can be presented to see whether the system forgets or remembers how to eliminate the original antigen. The number of antigens presented to the system can be varied [5].

    Antibodies whose paratopes match epitopes are amplified at the expense of other antibodies. If the suppression constant $k_1 = 1$ (equal suppression and stimulation) and the death-rate constant $k_2 > 0$, then every antibody type eventually dies out because of the damping term. Letting $k_1 < 1$ favors the formation of reaction loops, since all the members of a reaction loop gain concentration and can thereby counteract the damping term. As N increases, the number of loops and their lengths also increase.

    Even when the system is disturbed by the introduction of new types, it can remember certain states because of the robustness of the reaction loops. Antibodies that recognize other molecules, whether internal or external, are retained in the system and their concentrations increase. Antibodies that do not recognize other molecules are eliminated. Hence, together with immunological memory, the system possesses immunological forgetting [5].


    In the biological immune system, the memory of an antigen is sometimes retained for a long time, comparable to the lifespan of the organism. The exact reason for this is not yet known. One theory states that the antigen remains in degraded form in the lymph nodes and that its periodic exposure to the immune system maintains the memory. But as antigens are potentially dangerous, such a mechanism would be highly risky. Another theory is that the B-cells that have reacted to an antigen enter a dormant state and resurface when the same or a similar antigen occurs again. Such a dormant state can last for weeks or perhaps months [1].

    Another hypothesis, based on the idiotypic network, is proposed by Farmer and Packard.

    6.2 Hypothesis:

    Let ab1 be the concentration of antibodies that recognize the antigen, and let ab2 be the concentration of antibodies that recognize the epitopes of the ab1 antibodies. Continuing in this way, let abn be the concentration of antibodies that recognize the epitopes of the ab(n-1) antibodies. If abn resembles the original antigen, a loop is formed, because ab1 will recognize abn [3].

    Figure (8) The formation of a cycle allows the antigen with epitope e0 to be remembered.[5]


    Arrows denote recognition through the string-matching algorithm. Paratope p(i) recognizes epitope e(i-1) for i = 1, 2, ..., n. To form a cycle, we assume that by chance p(1) recognizes e(n) in addition to e(0). Thus, e(n) must resemble the antigen epitope e(0). If the antigen is eliminated, the existence of the cycle can maintain the concentration of ab1, an antibody that specifically recognizes the antigen [5].

    If the paratopes are assumed to function as epitopes as well, then for even values of n, abn resembles the antigen [5].

    7. String Matching Rules

    A matching rule, which defines matching or recognition, and the distance measure on which it is based are the cornerstones of any detection, classification, or recognition algorithm. For categorical data, a string representation may be more suitable, and a matching rule such as rcb (r-contiguous bits) is useful [7].

    Several string-matching rules are described below:

    7.1 Hamming Distance:

    It is defined as the number of positions at which two strings differ. The Hamming distance between strings x and y is expressed as:

    $$D(x, y) = \sum_{i=1}^{N} \left(x_i \oplus y_i\right) \qquad (9)$$

    where N is the length of the strings, $x_i$ and $y_i$ are the i-th bits of the respective strings, and $\oplus$ (the operation within the brackets) is the XOR operation [7].
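    A minimal sketch of this measure is given below. The function name `hamming_distance` is illustrative; the comparison `a != b` plays the role of XOR and also works for non-binary alphabets.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(x, y))


print(hamming_distance("10110", "11100"))  # differs at positions 2 and 4
```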


    7.2 Binary Distance:

    Based on the number of bits that match or differ, extensions of the Hamming distance have been proposed. They use the following four counts:

    $$a = \sum_{i=1}^{N} x_i \wedge y_i \qquad (10)$$

    $$b = \sum_{i=1}^{N} x_i \wedge \bar{y}_i \qquad (11)$$

    $$c = \sum_{i=1}^{N} \bar{x}_i \wedge y_i \qquad (12)$$

    $$d = \sum_{i=1}^{N} \bar{x}_i \wedge \bar{y}_i \qquad (13)$$

    Here a counts the positions at which both strings have a 1; d counts the positions at which both strings have a 0; b counts the 1s in string x that do not match string y; and c counts the 0s in string x that do not match string y [7].


    Different similarity measures have been developed from these counts, as follows:

    1. Russel and Rao (13)

    2. Jaccard and Needham (14)

    3. Kulzinski (15)

    4. Sokal and Michener (16)

    5. Rogers and Tanimoto (17)

    6. Yule (18)
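    As an illustration, the counts a, b, c and d and four of the listed measures can be computed as below. The formulas are the commonly quoted textbook forms of these coefficients, which is an assumption on my part: the exact expressions used in [7] are not reproduced here and may differ in detail. All function names are hypothetical.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length binary strings."""
    pairs = list(zip(x, y))
    a = sum(p == "1" and q == "1" for p, q in pairs)  # 1s that match
    b = sum(p == "1" and q == "0" for p, q in pairs)  # 1s in x, unmatched
    c = sum(p == "0" and q == "1" for p, q in pairs)  # 0s in x, unmatched
    d = sum(p == "0" and q == "0" for p, q in pairs)  # 0s that match
    return a, b, c, d


def russel_rao(a, b, c, d):
    return a / (a + b + c + d)


def jaccard_needham(a, b, c, d):
    return a / (a + b + c)          # undefined when a + b + c == 0


def sokal_michener(a, b, c, d):
    return (a + d) / (a + b + c + d)


def yule(a, b, c, d):
    return (a * d - b * c) / (a * d + b * c)  # undefined when ad + bc == 0


counts = match_counts("1100", "1010")
print(counts, russel_rao(*counts), sokal_michener(*counts))
```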

    7.3 Edit Distance:

    It is defined as the minimum number of string transformations required to change string s1 into string s2, where the possible transformations are (i) changing a character, (ii) inserting a character and (iii) deleting a character.

    It is also known as the Levenshtein distance and is a generalization of the Hamming distance [7].
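    The edit distance can be computed with the standard dynamic-programming recurrence, sketched below. The function name `edit_distance` is illustrative; each of the three transformations is assumed to cost 1.

```python
def edit_distance(s1, s2):
    """Levenshtein distance with unit cost per change, insertion or deletion."""
    # prev[j] holds the distance between the processed prefix of s1 and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        curr = [i]  # turning s1[:i] into the empty string takes i deletions
        for j, ch2 in enumerate(s2, start=1):
            cost = 0 if ch1 == ch2 else 1
            curr.append(min(prev[j] + 1,          # delete ch1
                            curr[j - 1] + 1,      # insert ch2
                            prev[j - 1] + cost))  # change ch1 into ch2
        prev = curr
    return prev[-1]


print(edit_distance("1010", "0101"))
```

    For the equal-length strings 1010 and 0101 the Hamming distance is 4, but the edit distance is only 2 (delete the leading 1, then append a 1), which shows how insertions and deletions make the measure more general.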

    Value Difference Metric:

    (19)

    where the probability term denotes the probability that $x_i$ equals the character c in the alphabet C [7].


    7.4 Landscape-Affinity Matching:

    This type of matching is used to capture the notion of matching biochemical and physical structures and to approximate the matching that occurs in the immune system. The input string and the antibody string are converted to bytes and then into positive integers to create a landscape. The two strings are then compared using a sliding window [7]. Three different similarity measures are defined:

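    Difference Matching Rule:

    $$f_{\text{difference}} = \sum_{i=1}^{N} \left| X_i - Y_i \right| \qquad (20)$$

    Slope-Matching Rule:

    $$f_{\text{slope}} = \sum_{i=1}^{N-1} \left| (X_{i+1} - X_i) - (Y_{i+1} - Y_i) \right| \qquad (21)$$

    Here $X_i$ and $Y_i$ are the i-th integer values of the input landscape and the antibody landscape.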

    Physical matching:

    (22)

    7.5 R-Contiguous Bits Matching:

    The rcb matching rule is defined as follows:

    If x and y are strings of equal length, they are said to match if they agree in at least r contiguous positions; in that case match(x, y) is true.
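    The rule can be sketched by tracking the longest run of agreeing positions. The function names below are hypothetical and the sketch is not taken from [7].

```python
def longest_common_run(x, y):
    """Length of the longest run of contiguous positions where x and y agree."""
    best = run = 0
    for a, b in zip(x, y):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


def rcb_match(x, y, r):
    """r-contiguous matching for equal-length strings."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return longest_common_run(x, y) >= r


print(longest_common_run("ABADCBAB", "CAGDCBBA"))
```

    For ABADCBAB and CAGDCBBA the two strings agree only on the contiguous block DCB, so the longest common run has length 3.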


    Example:

    If x=ABADCBAB and y=CAGDCBBA, then we can say that match (x,y) is true for r


    8.1 The Bone-Marrow Object

    It decides where in the network an antigen is to be inserted and which B-cells die, and it causes the concentration of cells beneficial to the network to increase. The bone marrow object contains the main algorithm, which starts the immune response by inserting an antigen into the B-cell network. The algorithm is as follows:

    Randomly initialize the B-cell population
    Load the antigen population
    Until the end condition is reached DO
        Select an antigen at random from the antigen population
        Insert the selected antigen at a random point in the B-cell network
        Select a percentage of the B-cells around the insertion point
        For every B-cell selected
            Let the antigen interact with the B-cell to elicit an immune response
        Order these B-cells by their level of avidity
        Delete the worst 5% of the B-cell population
        Create n new B-cells (n = 25% of the B-cell population)
        Out of these n, select m cells to join the immune network (m = 5% of the population) [9]

    B-cell Object

    The B-cell object possesses a pattern-matching element. It records the affinity level of the B-cell and maintains the links to the other B-cell objects to which it is connected in the B-cell network.

    Antibodies

    When an antigen meets an antibody, an immune response is elicited and a match score is recorded. If this score is greater than or equal to a threshold, the antibody binds to the antigen.


    Antigens

    Each potential antigen is represented by an antigen object possessing a single epitope. The antigens are defined in external ASCII files and are inserted into the AIS by the antigen population object. This object reads a series of lists from the files and instantiates them as antigen objects.

    B-cell Stimulation

    [ () () ] -

    Above equation represents the stimulation of B-cell.

    8.2 Applying AIS to Pattern Recognition Problem

    1. B-cell Objects: The antibody's paratope is created from the mRNA list. The AIS copies the bit string in a complementary manner.

    2. Antibodies: A bit-string representation is used for the pattern recognition problem, so an antibody is represented as a string of 0s and 1s.

    3. Antigens: The AIS is tested with two different antigen populations, each consisting of binary lists of 20 elements.

    The antigen population used to immunize the AIS contains three pattern types, each forming 33% of the population. The population consists of the original bit strings as well as modified bit strings that introduce noise into the data.


    Antigen Population Representation:

    11111111110000000000 33%

    00000000001111111111 33%

    00000111111111100000 33%

    4. Antigen/Antibody: To determine the match between antigen and antibody, the match is allowed to start at any point on the antigen by following a circular approach. Hence, if the pattern described by the antibody starts halfway along the antigen, the antibody is shifted halfway along its length and an entire match is noted.

    Bit Shifted Antibody:

    Antibody 0 0 1 0 1 0 1 1 1 0

    Antigen 1 0 0 0 1 1 1 0 1 0

    Bit Shifted Antibody 0 1 1 1 0 0 0 1 0 1

    8.3 The match algorithm:

    Repeat
        For each region consisting of 2 or more 1s, note its length
        If the match value exceeds the best match value found so far then
            best match value = match value
        Shift Ab right 1 bit
    Until Ab shift complete
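    The shift-and-score loop above can be sketched as follows. The scoring rule used here (the number of 1s in the XOR of antigen and antibody, plus 2 raised to the length of every run of two or more 1s) is inferred from the worked example in this section, so it should be read as an assumption; the function names are hypothetical.

```python
def match_value(antigen, antibody):
    """Score one alignment: count of complementary bits plus 2**length
    for every run of two or more complementary bits."""
    xor = [g ^ b for g, b in zip(antigen, antibody)]
    score = sum(xor)
    run = 0
    for bit in xor + [0]:  # trailing 0 closes a run that reaches the end
        if bit:
            run += 1
        else:
            if run >= 2:
                score += 2 ** run
            run = 0
    return score


def rotate_right(bits, k):
    """Circularly shift a list of bits k places to the right."""
    k %= len(bits)
    return bits[-k:] + bits[:-k]


def best_match(antigen, antibody):
    """Try every circular shift of the antibody and keep the best score."""
    return max(match_value(antigen, rotate_right(antibody, k))
               for k in range(len(antibody)))


antigen = [1, 0, 0, 0, 1, 1, 1, 0, 1, 0]
antibody = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0]
print(rotate_right(antibody, 5), best_match(antigen, antibody))
```

    With the antibody and antigen from the bit-shift illustration, a shift of five places makes every bit complementary, which is the "entire match" described in the text.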


    Calculating Match Value:

    Antigen:    0 1 1 0 0 0 0 1 1 1 1 0 1 1 0

    Antibody:   1 0 0 1 1 1 0 0 0 1 0 1 1 0 1

    XOR:        1 1 1 1 1 1 0 1 1 0 1 1 0 1 1   (number of 1s = 12)

    Length:     6, 2, 2, 2

    MatchValue: 12 + 2^6 + 2^2 + 2^2 + 2^2 = 88

    Hypermutation:

    In multi-point mutation, each selected bit is flipped; in sub-string regeneration, all the elements between two chosen points are flipped.

    8.4 Running the System

    99 binary antigens were used to immunize the system. The test population was then presented to the AIS. Learning was turned off during the testing phase, so the system shows only its secondary immune response. In other words, the test determines whether the antibodies recognize the antigens or not.

    50 iterations were performed for the immunization process, during which the antibody population increased from 10 to 28. The secondary response was then tested by presenting the antigens shown below.


    1111111110000000000 TEST 1 *

    0000111000110010001 TEST 2

    1110010010010010010 TEST 3

    0000000001111111111 TEST 4*

    1010101000101001110 TEST 5

    1111001010100110100 TEST 6

    0000011111111110000 TEST 7*

    TEST 1, 4 and 7 (marked *) are the original antigens used in the primary response. TEST 2 and 3 are modified versions of TEST 1. Similarly, TEST 5 and 6 are noisy versions of TEST 4.

    The AIS should be able to identify TEST 2, 3, 5 and 6 without any difficulty [9].

    9. Dynamic Behavior Arbitration using AIS

    Akio Ishiguro et al. proposed an inference-making system inspired by the immune system of living organisms and applied it to behavior arbitration for an autonomous mobile robot, since conventional AI systems are brittle in dynamically changing environments. They also try to evolve the affinities among antibodies using genetic operators.

    Because functional decomposition in conventional AI has its limitations, much attention has been focused on behavioral decomposition approaches. In behavior-based approaches, however, arbitration among the competence modules is difficult.


    To overcome such difficulties, Maes proposed the behavior network system, in which an action suitable for the current situation and the given goals emerges from the interaction between different competence modules. Akio Ishiguro et al. approached this problem from an immunological point of view, as shown in fig. 6.

    Figure (9) Architecture of Algorithm [9]

    As shown in the figure, the current situation (for example, the distance and direction to a detected obstacle) acts as the antigen, the competence modules act as antibodies, and the interactions between modules correspond to stimulation and suppression between antibodies. The basic idea of this approach is that the antibody best suited to the antigen is selected.


    Figure (10) Immune Networks [8, 9]

    To verify the ability of the proposed method, they carried out simulations. The simulated environment contains three kinds of objects: a] predators, b] obstacles and c] food. For quantitative evaluation, the following assumptions are made:

    1. The immunobot consumes energy Em for every movement.
    2. If the immunobot is captured by a predator, it loses energy Ep.
    3. If the immunobot collides with an obstacle, it loses energy Eo.
    4. If the immunobot obtains food, it gains energy Ef.
    5. To avoid over-charging, the obtain-food behavior does not emerge once sufficient food has been obtained.

    Predators attack the immunobot only when it is within a predefined range. So, to survive, the immunobot must select the best possible antibody.

    The figure below shows the structure of the immunobot used in the simulations. It is equipped with external and internal detectors. The external detectors are sensors in eight directions that detect predators, obstacles and food. Each detector also reports the distance as near, mid or far. The internal detector senses the energy level.

    Figure (11) Structure of Robot [8]

    9.1 Description of Antibodies

    Each prepared competence module is an antibody. The important task for the immunobot is to select the best antibody for the antigen, and this depends on how the antibodies are described. The selection should be made in a bottom-up manner through proper communication between the modules. The structure of the paratope and epitope is crucial for the specificity, that is, the identity, of any particular antibody.

    The paratope is the desirable condition and the epitope is the disallowed condition. The paratope and idiotope are divided into three fields: obstacles, direction and distance. A typical inference/consensus system adopts a condition-action description, as in fuzzy inference, whereas the proposed system uses a condition-action-condition description. This description provides decentralized dynamic inference in a bottom-up manner.

    Figure (12) Antibody Description [9]

    A prepared antibody might look like the following: it is activated if the immunobot detects food in the front direction at mid range, and it makes the immunobot move forward to pick the food up.

    Figure (13) Prepared Antibody [9]


    However, if a predator is in front at near or mid range, or if food is in near range, the prepared antibody may hesitate to be activated.

    The other antibodies are designed along similar lines.

    9.2 Dynamics

    In this model, the authors allow only one antibody to get activated when it surpasses the

    prespecified threshold. One state variable is introduced in terms of concentration of each

    antibody.

    { } (23)

    $a_i$ = concentration of antibody i, which varies with time; $m_{ij}$ = matching ratio between antibodies i and j.

    9.3 Basic mechanism of the proposed inference making network

    Four antigens are listed in the figure, and the five listed antibodies are the main participants in the inference/consensus making. For instance, antibody 1 means that when the immunobot detects food in the far range in the front direction, it is allowed to move forward. However, if the immunobot detects food in near range, a predator in front, or a high energy level, this antibody stimulates the other antibodies whose paratopes describe those conditions.


    Figure (14) Antibody Selection [7,9]

    Suppose the current energy level is high. Antibodies 1, 2, 3 and 5 are then stimulated by the antigens, and their concentrations are incremented accordingly. The interaction among the antibodies within the immune network is important. In the end, antibody 5 is selected in figure 9.

    When the current energy level is low, antibody 3 is selected [9].

    10. Latest Immune Models and Hybrid Approaches

    10.1 Danger Theory based algorithms

    In 2002, Aickelin and Cayzer proposed incorporating the following aspects of danger theory into AIS:

    1. An appropriate number of APCs to display danger signals needs to be modeled.
    2. A danger signal is either positive or negative, representing the presence or absence of the signal.
    3. In biology the danger zone is spatial, but in a computational model other notions such as temporal proximity can be used.
    4. When killer cells cause the death of a self cell, this should not generate further danger signals.
    5. The priming of killer cells via APCs should be considered in AIS models.
    6. An antibody migration rule should specify the concentration of antibodies receiving signal 1 and signal 2 from a given APC.

    DT depends on the concentrations of different immune cells. These aspects can be used to build better AIS for anomaly detection, in which non-self does not trigger an immune response without a danger signal [7].

    Figure 15 (a) One Signal Model [7]

    Figure 15 (b) Two Signal Model [7]

    Figure 15 (c) APC controlling IR [7]

    Figure 15 (d) INS with third signal [7]

    Figure 15 (e) Danger in control through zoning [7]

    Figure 15 (f) Control through INS and zoning [7]


    In 2010, a danger-theory-based approach to the online supervised two-class classification problem was proposed. The algorithm of the proposed method is as follows:

    Algorithm 1

    Danger theory based immune algorithm.

    1. Initialize the antibody population and the memory
    2. While the stopping conditions are not met do
    3.   For i = 0 to antigen population do
    4.     Present the antigen to the system
    5.     The presented antigen creates the danger
    6.     The general antibody population receives signal 0 from the presented antigen
    7.     The general antibody population receives signal 1 from the danger zone
    8.     Antibodies that receive both signal 0 and signal 1 are selected
    9.     For all antibodies belonging to the stimulated antibodies do
    10.      Change the status of the antibody
    11.      Calculate the interaction between the antibody and the antigen
    12.    End for
    13.    Suppress the antibody population
    14.    Decrease the danger from the antigen that has already been considered
    15.    For all antibodies belonging to the stimulated antibodies do
    16.      If the antibody's stimulation reaches a certain threshold value then
    17.        Apply the clonal selection algorithm
    18.      End if
    19.    End for
    20.  End for
    21.  Check the stopping criteria
    22. End while
    23. Output: the memory of antibodies that were selected via clonal selection and met the threshold value

    When the learning algorithm has ended, the output antibodies are used to classify unknown antigens. The process is simple: an unknown antigen is assigned to the same class as the antibody with which it has the lowest affinity.

    Learning Algorithm explained:

    1. Initialization: The algorithm above starts with a random antibody population whose members are assigned labels. Their status values are set to zero and the memory is set to the empty set.

    2. Two kinds of signals: The co-stimulation signal that arises from the detection of danger is termed signal 1, while the other signal is termed signal 0. The antibody population is divided into two parts: a] general and b] memory. The memory antibodies do not take part in reactions with antigens; they are the fixed memory of antigens and are changed only when they are suppressed. The general antibodies receive signal 0 when an antigen is presented, so an antibody can detect the stimulus of the current antigen, whereas signal 1 is perceived only when a danger zone has been created. The antibodies that receive both signals are stimulated and can change their status.

    Algorithm 2

    1. Antibody stimulated = antibody stimulated + 1
    2. If antibody label == antigen label then
    3.   Antibody-antigen reaction = 1
    4. Else
    5.   Antibody-antigen reaction = -1
    6. End if
    7. Antibody relevance = antibody relevance + antibody-antigen reaction
    8. Variable danger zone (var) = affinity between antigen and antibody
    9. Antibody stimulation = antibody stimulation + antibody-antigen reaction * var
    10. Var = stimulated antibody population
    11. Antigen danger = Var * var * antibody stimulation


    Algorithm 3

    1. If antibody stimulation (as) < threshold value (t) then
    2. Delete the antibodies that are below the threshold
    3. Else if as
    5. Delete the antibody with high interactivity
    6. End if
    7. End for
    8. Group the memory antibodies into pairs
    9. For all pairs do
    10.   Calculate probability p2
    11.   If random < p2 then
    12.     Remove the memory antibody with high affinity
    13.   End if
    14. End for

    10.2 Combining Dendritic Cells and Danger Theory

    In 2007, Yeom combined DT and DCs to form a model for signal pre-categorization. The principles are as follows:

    1. Pathogen-associated molecular patterns (PAMPs) are expressed by bacteria and can be identified by DCs, causing a change in their behavior.
    2. Danger signals are generated by the unplanned (necrotic) death of cells. The sudden, chaotic breakdown of the internal components of a cell causes danger signals to surface. DCs are sensitive to the concentration of danger signals. The presence of a danger signal may or may not cause a change, but the probability of a change is higher than in normal situations.
    3. Safe signals result from the normal death of cells for regulatory reasons; this tightly controlled process releases various signals into the tissue. Such safe signals give rise to suppression signals.
    4. Inflammatory cytokines can be released as a result of injury, although inflammation alone is not enough to stimulate DCs.

    DCs can stimulate naive T cells and have a number of functional properties (Yeom, 2007). The first function of DCs is to inform the immune system to respond when there is an attack.

    DCs perform different functions depending on their state of maturation. Modulation between these states is driven by the signals detected in the tissue, namely danger signals, apoptotic signals and inflammatory signals.

    In tissue, DCs collect antigen and experience danger signals from necrosing cells and safe signals from apoptotic cells. Maturation of DCs occurs in response to the receipt of these signals. According to Yeom (2007), if danger signals are concentrated in the tissue at the time the antigen is picked up, the DC matures fully. Conversely, if safe signals are present, the DC matures differently [7].

    10.3 Multilevel Immune Learning Algorithm (MILA)

    Both T-level and B-level recognition mechanisms are used in this algorithm. It is inspired by the communication and processes of the T-cell-dependent humoral immune response. In the biological immune system, B cells recognize antigens through immunoglobulin receptors on their surfaces, but they do not proliferate and differentiate until they receive a signal from Th cells.

    For Th cells to allow B cells to proliferate and differentiate, the Th cells must themselves be stimulated, which happens only when they recognize antigens in the context of the major histocompatibility complex (MHC).

    B cells can also be suppressed by suppressor T cells. The activated B and T cells move to lymph nodes, where proliferation, mutation, selection, differentiation and death of B cells take place in germinal centres (GCs).

    MILA incorporates an abstraction of the above events to build a detection algorithm. The algorithm consists of initialization, recognition, evolutionary and response phases.

    In the initialization phase, the detection system is trained to recognize self. The result of initialization is used to produce detectors, analogous to the populations of Th, Ts and B cells that participate in the humoral immune response. There are three levels:

    1. APC level, which corresponds to the highest one.
    2. B-cell level, the intermediate one.
    3. Th-cell level, the bit level for local patterns.

    MILA uses the rcb matching rule for real-valued representations. A Th cell uses a sliding window to get w elements, whereas a B cell uses w randomly chosen elements. The concept of prematuration and crossover operators can be used.

    Another feature of MILA is positive selection by Ts cells, based on self samples.

    The evolutionary phase in MILA is a process of refining the detector set, provided that the earlier detection rates can be evaluated. This phase involves cloning, mutation and selection; however, cloning in MILA is targeted: only those detectors that were activated in the recognition phase can be cloned [7].


    10.4 Combining Negative Selection and Classification technique

    In anomaly detection, only positive (self) samples are available at the training stage. However, most conventional classification algorithms need both self and non-self samples.

    To allow conventional algorithms to be used when only self samples are available, Gonzalez (2002) proposed a hybrid algorithm that creates synthetic samples from a set of self samples. The algorithm uses NS to develop a detector set that covers the non-self space; points from this set are then used as samples of the non-self class, which makes it possible to use conventional classification algorithms.
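    The idea of producing synthetic non-self samples can be sketched as below. This is a deliberately simplified stand-in for the hybrid algorithm, not Gonzalez's method itself: it assumes real-valued samples in the unit hypercube, Euclidean distance, and a fixed self radius, and it keeps a random candidate only if it lies farther than that radius from every self sample. The function name and all parameters are hypothetical.

```python
import math
import random


def generate_nonself_samples(self_samples, n_samples, self_radius, rng,
                             max_tries=10000):
    """Negative-selection style sampling: keep random points that do not
    fall within self_radius of any self sample."""
    dim = len(self_samples[0])
    samples = []
    tries = 0
    while len(samples) < n_samples and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        if all(math.dist(candidate, s) > self_radius for s in self_samples):
            samples.append(candidate)
    return samples


rng = random.Random(1)
self_set = [(0.5, 0.5)]
nonself = generate_nonself_samples(self_set, 20, 0.2, rng)
```

    The self samples (label "self") and the generated points (label "non-self") together form a two-class training set that an ordinary classifier can consume.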

    Figure (16) NS-SOM in generation classifier dataset [7]

    In particular, negative samples are generated from the positive samples. Samples from both classes are then used to train a neural network, here a self-organizing map (SOM). An SOM, composed of nodes or neurons (each able to identify a type of input), is a type of ANN that is trained to produce a low-dimensional representation, called a map, of the input space, that is, of the self/non-self feature space of the training samples [7,8].


    The three phases of NS-SOM are shown in figure below:

    Figure (17) NS-SOM Model Structure [7,8]


    11.Immune Networks and Negative Selection Based algorithmThe mixture of Negative selection and Ab-Ab communications algorithm was developed for

    navigation control and path mapping of autonomous mobile robot by Prashant Rao (2008) for

    Khepera II robot.

    The following is the step by step formulation of the algorithm:

    1. Initialization: First initialize a network of immune cells (there is superset of 64 antibodiesfrom 0 to 63). The initial concentrations of antibodies are initialized and the robot is

    reset. The subset of 20 antibodies is chosen randomly. The stimulation and suppression

    between antibodies using basic matching function is defined. The first two sensors are not

    ON in their Khepera II robot

    2. Population Loop:i) Antigenic Recognition: The information from the sensors is collected and an

    antigen is formed based on that information. The matching is determined between

    antigen and randomly selected antibodies and affinities are allotted. Each antigen

    stimulates many antibodies but only one is perfectly matched and so selected for

    process.

    ii) Self-Nonself Determination: The antigen is seen for matching to self set in caseinnate memory takes over and system is allotted standard solution and the loop

    executes again OR the system moves on to next step.

    iii) Network Communications: The interactions between different selected randomlyantibodies is calculated.

    iv) Dynamics: The stimulation minus suppression added to affinity betweenantibodies subtracted from the natural death co-efficient gives over all stimulation

  • 8/4/2019 Independent Study Report # 1

    56/75

    56

          of the system. The product of this stimulation and the antibody concentration gives
          the rate of change of concentration with time. The antibody with the highest
          concentration is sent to a critic that rewards or penalizes it, and the affinities
          are modified accordingly.

    3. Feedback: When a penalty is allotted, the T-helper cell is activated and its
       contribution is calculated at each step. The adaptation function is determined by the
       interaction between the T-cell and the other cells in the network, modifying the
       affinities between antibodies with a suitable learning rate.

    4. Steps 2 and 3 are repeated until the convergence criterion is met.
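
    The dynamics of step 2(iv) can be sketched as follows. This is only a minimal
    illustration, assuming the usual immune-network form in which the overall stimulation is
    stimulation minus suppression plus affinity minus the death coefficient, and the rate of
    change is that stimulation times the concentration. The function names, the matrix layout
    of m, the Euler step size dt and the default death value are assumptions made for the
    example; they are not taken from Rao's implementation.

```python
def step_concentrations(conc, m, affinity, death=0.1, dt=0.1):
    """One Euler step of the antibody concentration dynamics.

    conc     -- current concentration of each antibody
    m        -- m[j][i] is the (assumed) stimulation antibody j exerts on
                antibody i; m[i][k] is then the suppression i receives from k
    affinity -- affinity term of each antibody for the current antigen
    death    -- natural death coefficient (illustrative value)
    """
    n = len(conc)
    new_conc = []
    for i in range(n):
        stimulation = sum(m[j][i] * conc[j] for j in range(n))
        suppression = sum(m[i][k] * conc[k] for k in range(n))
        overall = stimulation - suppression + affinity[i] - death
        # rate of change = overall stimulation * concentration
        new_conc.append(max(0.0, conc[i] + dt * overall * conc[i]))
    return new_conc


def select_antibody(conc):
    """Return the index of the antibody with the highest concentration."""
    return max(range(len(conc)), key=lambda i: conc[i])
```

    In a full controller the selected antibody would be passed to the critic, and the
    affinities would be adjusted from its reward or penalty before the next step.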

    Figure (18) Algorithm based on Negative selection and Ab-Ab interaction [6]

    Figure (19) Algorithm based on Negative selection and Ab-Ab interaction [6]

    11.1 Latest Dendritic Cell Algorithm Inspired by Danger Theory

    Danger theory states that danger signals are generated to activate antigen-presenting cells
    (APCs). The APCs stimulate T-helper cells, which finally give rise to the adaptive immune
    response. The danger signals are detected by dendritic cells, which exist in three states:
    immature, semi-mature and mature. If the detected signal is safe, the dendritic cell
    becomes semi-mature upon presenting

    antigen to the T-cell. If a danger signal is found, the dendritic cell matures and the
    T-cell becomes antigen-reactive.

    The dendritic cell algorithm takes safe, danger and PAMP (pathogen-associated molecular
    pattern) signals into account [11].

    ALGORITHM:

    input : S = set of data items to be labeled safe or dangerous
    output: L = set of data items labeled as safe or dangerous

    Start
        Generate initial population of dendritic cells (DCs), D
        Create a set to include the migrated DCs, M
        forall items in set S do
            Select a set of DCs by randomly selecting from D, P
            forall DCs in set P do
                Add data item to the DC's collected list
                Update safe, danger and PAMP concentrations
                Update cytokine concentrations
                Move DC from D to M and generate a new DC in set D if the
                concentration is above threshold
            stop
        stop
        forall data items in S do
            Count the number of times the data item is presented by a mature and by a
            semi-mature DC
            Label the item safe if it is presented by more semi-mature DCs than mature DCs,
            otherwise label it dangerous
            Add data item to labeled set L
        stop
    Stop [11]
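
    The pseudocode above can be sketched in Python as follows. This is a simplified
    illustration rather than the published implementation: the signal sums are unweighted,
    each data item carries its own (pamp, danger, safe) values, and a DC counts as mature
    when its accumulated danger context exceeds its safe context. The class name, parameter
    names and default values are assumptions made for the example. DCs that have not
    migrated when the stream ends are ignored.

```python
import random
from collections import Counter


class DendriticCell:
    def __init__(self):
        self.collected = []   # data items (antigens) sampled so far
        self.costim = 0.0     # cumulative costimulation; triggers migration
        self.semi = 0.0       # cumulative safe context
        self.mature = 0.0     # cumulative danger context (PAMP + danger)

    def sample(self, item, pamp, danger, safe):
        self.collected.append(item)
        self.costim += pamp + danger + safe
        self.mature += pamp + danger
        self.semi += safe


def dendritic_cell_algorithm(stream, pool_size=10, sample_size=3,
                             threshold=5.0, seed=0):
    """stream: list of (item, pamp, danger, safe) tuples."""
    rng = random.Random(seed)
    pool = [DendriticCell() for _ in range(pool_size)]   # set D
    migrated = []                                        # set M
    for item, pamp, danger, safe in stream:
        for cell in rng.sample(pool, sample_size):       # set P
            cell.sample(item, pamp, danger, safe)
            if cell.costim >= threshold:
                # move the DC from D to M and replace it with a fresh one
                migrated.append(cell)
                pool[pool.index(cell)] = DendriticCell()
    # count how often each item is presented by mature / semi-mature DCs
    mature_count, semi_count = Counter(), Counter()
    for cell in migrated:
        counts = mature_count if cell.mature > cell.semi else semi_count
        for item in cell.collected:
            counts[item] += 1
    items = {item for item, _, _, _ in stream}
    return {item: "safe" if semi_count[item] > mature_count[item]
            else "dangerous" for item in items}
```

    With a low threshold every DC migrates after a single sample, so each item is judged
    only by its own signals; a higher threshold lets a DC accumulate context over several
    items before it presents them.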

    11.2 Latest TLR (Toll-Like Receptor) Algorithm

    The TLR algorithm, as described by Aickelin and Greensmith (2007), is designed for anomaly
    detection in computer networks. Its steps are as follows:

    1. Collect the set of system calls that are made in the training data.
    2. Collect the corresponding signal values.
    3. Determine the complement of each of the sets from steps 1 and 2.

    Figure (20) Systematic Overview of TLR algorithm [7]

    4. Generate a set of immature DCs (iDCs) with signal receptors selected randomly from the
       complement signal set and antigen receptors selected randomly from the complement
       system call set.
    5. Similarly, generate naive T-cells (nTCs) with antigen receptors drawn randomly from the
       complement system call set.
    6. Expose the immature DCs to sample signals and antigens.
    7. If an iDC matches a signal, it matures (mDC) and migrates.
    8. If an iDC does not migrate within its lifetime, it becomes a semi-mature DC (smDC) and
       then migrates.
    9. Migrated smDCs and mDCs present their antigens and try to match nTCs.
    10. If an antigen presented by an mDC matches a naive T-cell, the nTC is activated and an
        anomaly is reported.
    11. If an antigen presented by an smDC matches an nTC, the nTC is killed to lower false
        positives.
    12. Migrated smDCs and mDCs and killed nTCs are replaced by new cells as per steps 4 and
        5 [7].

    12. Recent Developments and Real World Applications

    Solving Problems Using Immunological Computation

    To apply knowledge of the biological immune system to real-world problems, one must first
    select an immune algorithm suited to the type of problem. The first step is to identify the
    elements involved in the problem and how they can be represented in terms of a particular
    AIS.

    To encode these entities, a representation such as bit strings or real-valued vectors is
    chosen. An affinity measure is then selected in line with the matching rules employed. The
    next step is to decide which AIS can generate a set of suitable entities that provides a
    good solution to the problem at hand [7].
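
    As an illustration of the representation and affinity choices described above, the
    following sketch shows three common affinity measures: Hamming agreement and
    r-contiguous matching for bit strings, and a distance-based affinity for real-valued
    vectors. The function names and the 1/(1+d) scaling are illustrative choices, not
    prescribed by any particular AIS.

```python
import math


def hamming_affinity(a, b):
    """Number of positions at which two equal-length bit strings agree."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x == y for x, y in zip(a, b))


def r_contiguous_match(a, b, r):
    """True if a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def euclidean_affinity(u, v):
    """Affinity for real-valued encodings: the smaller the distance,
    the higher the affinity (1.0 for identical vectors)."""
    return 1.0 / (1.0 + math.dist(u, v))
```

    Which measure is appropriate depends on the encoding chosen in the first step: the two
    bit-string measures suit binary encodings, the last one real-valued encodings.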

    Figure (21) Problem Solving Using AIS [7]

    12.1 Virus Detection

    Kephart (1994) proposed an immunologically inspired approach to detecting viruses in
    computer systems. Known viruses are identified by their code sequences, and unknown viruses
    are detected by their unusual behavior in the system. The virus detection software
    continuously scans the system for changes. Such changes trigger the release of decoy
    programs whose sole purpose is to become infected by the virus [7].

    Figure (22) Flow diagram of Kephart's approach to virus detection [7]

    A diverse suite of decoy programs is kept at different locations in the system's memory to
    detect viruses. If one or more decoy programs are modified, this is taken as evidence that
    a virus has entered the system, and each infected decoy contains a sample of the virus.
    The infected decoy programs are processed by a signature extractor, which generates a
    recognizer for the virus.

    The signature extractor also extracts the way the virus attaches to its host, so that
    infected hosts can be repaired. It must select the virus signature so as to avoid both
    false positives and false negatives: the signature must be found in every sample of the
    virus and must be very unlikely to occur in uninfected programs. Once the best candidate
    signature has been found in the infected programs, it is compared against a half-gigabyte
    corpus of legitimate programs to make sure that it produces no false positives. The repair
    information is checked by testing it on samples of the virus and again by a human
    expert [7].
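
    The signature selection criterion described above can be illustrated with a deliberately
    naive sketch: find a byte sequence that occurs in every infected sample and in none of
    the legitimate programs. This is not Kephart's actual extraction method, which the text
    does not detail; the function name, the fixed signature length and the brute-force
    search are assumptions made for the example.

```python
def extract_signature(infected, clean, length):
    """Return the first byte string of the given length that occurs in
    every infected sample and in no clean program, or None."""
    if not infected:
        return None
    first = infected[0]
    for start in range(len(first) - length + 1):
        candidate = first[start:start + length]
        in_all_infected = all(candidate in sample for sample in infected)
        in_any_clean = any(candidate in program for program in clean)
        if in_all_infected and not in_any_clean:
            return candidate
    return None
```

    Requiring presence in every infected sample guards against false negatives, and the
    check against the clean corpus guards against false positives.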

    12.2 Immunogenetic Approaches in Intrusion Detection

    Gonzalez (2002) proposed negative selection with detector rules to detect attacks by
    monitoring network traffic. A real-valued representation is used to evolve
    hyper-rectangular detectors, interpreted as if-then rules, that give a high-level
    characterization of the self/non-self space. The experiments used data from the 1999
    Defense Advanced Research Projects Agency (DARPA) intrusion detection evaluation dataset.
    The AIS approach produced detectors that gave a good estimate of the amount of deviation
    from normal [7].
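
    To make the idea of a hyper-rectangular detector read as an if-then rule concrete, here
    is a small sketch. It covers only how such a detector is applied and rendered as a
    rule; the evolutionary generation of detectors is not shown. The representation as a
    list of (low, high) bounds and all function and feature names are assumptions made for
    the example.

```python
def detector_covers(detector, x):
    """detector: list of (low, high) bounds, one pair per feature."""
    return all(low <= value <= high
               for (low, high), value in zip(detector, x))


def as_rule(detector, names):
    """Render a detector as a readable if-then rule."""
    conditions = " and ".join(
        f"{low} <= {name} <= {high}"
        for (low, high), name in zip(detector, names))
    return f"if {conditions} then non-self"


def classify(detectors, x):
    """A sample is non-self if any detector covers it, otherwise self."""
    covered = any(detector_covers(d, x) for d in detectors)
    return "non-self" if covered else "self"
```

    For example, the detector [(0.8, 1.0), (0.0, 0.2)] over two normalized traffic features
    flags any sample whose first feature is high while the second is low.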

    12.3 Danger Theory in Network Security

    Aickelin (2002) first proposed applying danger theory to network security. The system
    behaves like DCs looking for danger signals, such as a sudden increase in network traffic
    or an abnormally high rate of error messages. If such signals rise above a threshold, an
    alarm is raised [7].

    12.4 Robotics and Control

    Robot control work by Ishiguro et al. (1996, 1998), Watanabe et al. (1998, 1999) and Lee
    et al. (1999) focused on developing a dynamic, decentralized consensus-making mechanism
    based on immune network theory. In a dynamic environment, the robot (an "immunoid") is
    able to collect garbage. In this metaphor, antibodies are the potential behaviors of the
    immunoid and antigens are environmental inputs such as garbage, walls and the home base.
    To decide on the best behavior, the immunoid matches antigens to antibodies [7].

    Vertebrate immune systems inspire computer scientists and engineers to create new
    algorithms for solving real-world problems. The four main AIS algorithms are:

    1. Negative selection algorithms
    2. Artificial immune networks
    3. Clonal selection algorithm
    4. Danger theory and dendritic cell algorithm

    Recent developments include AIS applications in computer security, optimization, data
    mining, fault detection, etc. Several authors have surveyed these developments. Garrett
    (2005) reviewed the developments before 2005 and attempted to evaluate AIS in terms of
    distinctiveness and effectiveness. Hart and Timmis (2010) discussed applications of AIS and
    proposed a set of problem features for which AIS are well suited.

    Some recently developed models and hybrid approaches are explained below:

    12.5 Conserved Self Pattern Recognition Algorithm (CSPRA)

    CSPRA is a recent algorithm in the AIS area, inspired by the Pattern Recognition Receptors
    (PRR) model. According to the PRR model, self/nonself discrimination requires

    stimulation from APCs. However, APCs are not stimulated unless they are activated via PRRs
    that identify molecular patterns on bacteria. The PRR model thus adds an additional layer
    of molecular patterns. CSPRA (2010) incorporates the negative selection algorithm, and
    anomaly detection in CSPRA combines the results of APC self pattern recognition and T-cell
    negative selection. Self pattern recognition by APCs is performed only when the antigen
    has not been detected by the T-cell negative selection algorithm. The generation of an APC
    detector includes two major steps:

    1. Define the conserved self pattern, depending on the function that relates the antigen to
       its feature space. The pattern can be pre-defined from the data, either empirically from
       laboratory knowledge or computed automatically from the Pearson correlation coefficients
       between each attribute column and the label column.

    2. By evaluating the minimum, maximum and mean of all values in the feature space of loc1,
       loc2, ..., generate the APC detector R = {(loc1, min, max, mean), (loc2, min, max,
       mean), ...} within the conserved self pattern of the features located at loc1, loc2, ...

    Compared with the classical negative selection algorithm, the proposed and tested CSPRA
    shows better and more promising results, reducing the number of detection errors without
    increasing complexity [3, 4, 13].
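
    Step 2 and the two-stage detection can be sketched as follows. This is a minimal
    illustration under stated assumptions: the conserved feature locations are taken as
    given (step 1 is not shown), T detectors are modeled as points with a fixed radius, and
    the APC detector flags a sample whose conserved feature falls outside the [min, max]
    range observed in self data. The text does not say how the stored mean is used, so it
    is computed but not used in the decision. All names are illustrative.

```python
import math
from statistics import mean


def build_apc_detector(self_samples, locations):
    """R = [(loc, min, max, mean), ...] over the conserved feature columns."""
    detector = []
    for loc in locations:
        column = [row[loc] for row in self_samples]
        detector.append((loc, min(column), max(column), mean(column)))
    return detector


def apc_flags(detector, sample):
    """True if any conserved feature falls outside its self range."""
    return any(not (low <= sample[loc] <= high)
               for loc, low, high, _ in detector)


def is_anomalous(sample, t_detectors, radius, apc_detector):
    # T-cell negative selection runs first ...
    if any(math.dist(sample, d) <= radius for d in t_detectors):
        return True
    # ... and the APC check is consulted only if no T detector fired.
    return apc_flags(apc_detector, sample)
```

    The ordering mirrors the description above: the APC self pattern check acts as a second
    opinion on samples that negative selection alone would have let through.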

    12.6 Recent Complex Artificial Immune Systems (CAIS)

    CAIS consists of five layers, namely the encounter layer, preprocessing layer, MHC layer,
    competitive layer and stimulation layer. The antigen and the antibody are the input and the
    output, respectively. When an antigen is encountered by the system, it can be recognized in
    two ways: directly by B cells, or through the APC layer. The input is given to the APC
    layer, and the molecular complex pattern it forms is passed to the MHC layer for
    processing. The information coming from the APC layer is transformed and translated in the
    MHC layer and fed to the Th layer. In the Th layer, the cells receive different responses
    from the MHC layer and form a set consisting of the Th cells that respond best to the
    input antigens. B-cells

    become activated by stimulation from the Th layer and also by the input pattern. An
    antibody is the difference between an input and the weights associated with the B cells.
    Ts cells modulate the weights associated with immune cells located in the neighborhood set.
    Compared with binary immune systems, CAIS recognizes patterns invariantly under
    translation, rotation and scaling. It can be applied to handwriting recognition
    problems [11, 13].

    12.7 Hybrid Approaches

    BAIS (Bayesian Artificial Immune System) was developed by replacing the mutation and
    cloning operators with a probabilistic model, for solving single-objective and
    multiobjective optimization problems. BAIS is capable of capturing the most relevant
    interactions between the problem variables. The algorithm adopts a population-based search
    strategy and uses a Bayesian network to implement the probabilistic model.

    Once the population is initialized, the algorithm enters a loop that runs until a stopping
    condition is met. Each iteration performs the following steps:

    a. Using a suitable selection technique, select the best solutions from the population.
    b. Build the Bayesian network that best fits the selected solutions.
    c. Sample new antibodies from the network.
    d. Remove the antibodies with lower fitness, as well as those that are too similar to one
       another.
    e. Insert randomly generated antibodies among the selected ones to maintain diversity [13].

    BAIS can be applied to feature selection using a wrapper approach. It can handle building
    blocks, both non-overlapping and overlapping, in the optimization of Trap-5 functions. The
    multiobjective knapsack problem can also be solved very efficiently by BAIS. This variant
    is termed the Multiobjective Bayesian Artificial Immune System (MOBAIS), and it can also
    be applied to classification problems. It is capable of identifying

    and preserving the building blocks effectively while finding diverse, high-quality local
    optima. Practical applications show that it produces parsimonious and accurate results.
    Furthermore, the Bayesian network learning was enhanced to avoid synthesizing the network
    at every iteration: only the parameters, i.e. the conditional and marginal probabilities,
    are updated at each iteration [13].

    Unstructured damage classification can be performed with an algorithm based on data
    clustering and AIS pattern recognition. The technique clusters the training data into a
    specified number of clusters to generate the initial memory cell set, and then combines
    this with AIS pattern recognition algorithms to evolve the memory cells.

    AIS combined with an SVM can be used for fault diagnosis of induction motors. The AIS
    tunes the kernel and penalty parameters to improve classification accuracy.

    In the immune multiagent recognizer, each agent recognizer is an immune RBF neural network
    model. In this model, the antigens are the inputs and the antibodies are the compression
    cluster mapping, that is, the hidden layer. The output weights can be determined using the
    least squares algorithm. Each level of the recognition system contains recognizers that
    can each recognize one sort of antigen.

    A multiple-valued immune network classifier (MVINC), based on immune network theory, was
    applied to remote sensing images; it implements immune memory and classification using
    logic theory and immune theory.

    EaiNET combines AIS with Particle Swarm Optimization (PSO). It uses the learning mechanism
    of PSO, in which each individual learns from the best of the social population, and this
    increases the convergence rate.

    A Radial Basis Function (RBF) artificial neural network and AIS have been combined for
    compressing the data in a set; the tool used is called aiNET. It can also be used to
    determine the number of RBFs in the ANN, which is then termed an RBFNN.

    A fault diagnosis model was proposed based on the immune evolution algorithm. The design
    includes diversity evaluation, which is very complex, and fault detection, which is hard; a
    fault calculation technique integrating the induction and static approaches was
    designed [13].

    By combining agent-based modeling and UML, the computational properties of degenerate
    recognition systems have been investigated. It is possible to identify the degenerate
    receptors, and recognition occurs more quickly than in a non-degenerate system.

    In the resource-limited AIS, the Network Affinity Threshold (NAT) is not recalculated
    during the network evolution process, even though it determines the network granularity;
    its initial value is calculated from the distances between the antigens. The convergence
    and stability of the population can be impaired by pure clonal selection and random
    change operations.

    The gene immune detection algorithm with a complement operator effectively decreases the
    false positives that arose in the earlier gene immune detection algorithm. A vaccine
    operator is introduced along with the complement. The number of detectors is reduced and
    the detection efficiency is increased. The complement operator overcomes the defect of the
    gene immune algorithm, and the detection time can be reduced drastically.

    ICAIS, an incremental clustering method based on the principles of AIS, was introduced. It
    uses the primary immune response to identify data belonging to novel clusters and the
    secondary immune response to identify data belonging to old patterns [13].

    A model based on Learning Vector Quantization (LVQ) and immune networks [13], an extension
    of the basic Jerne model, was proposed for pattern recognition. A new hybrid fuzzy
    neuro-immune network classification method based on a multi-epitope approach was also
    proposed. Its performance shows promising results in pattern recognition.

    APPENDIX A

    Pattern Recognition in the Immune System using a Growing SOM

    [ The following project is taken from the Ph.D. thesis of Leandro de Castro ]

    function [w,win,cwin,D] = abnet(ag,eps,comp,alfa,beta,pc,pm),

    % Pattern Recognition in the Immune System using a Growing SOM
    % Bipolar Splitting/Pruning Self-Organizing Feature Map (GSOM)
    % with Evolutionary Phase
    % Main features: bipolar weights, Hamming Distance, Winner takes all
    % PHASE I: Growing followed by Pruning
    % PHASE II: Supervised Evolution
    %
    % function [w,win,cwin,D] = hybrid(ag,eps,comp,alfa,beta,pc,pm),
    % w    -> weight matrix (Ab population)
    % win  -> winner for each Ag (v)
    % cwin -> amount of winning of each individual (tau)
    % D    -> hamming distance of each Ag with relation to its mapped class
    % ag   -> antigen population to be recognized (n2xs2)
    % eps  -> ball of stimulation
    % comp -> comparison: 1 for comparing complementary chains
    %                     0 for comparing identical chains (Hamm. dist.)
    % alfa -> amount of bits to be changed
    % beta -> number of iterations for reducing the learning rate
    %
    % Auxiliar functions: COVER, UPDATE, SPLIT, PRUNE, MATCH, CADEIA, TESTGSOM
    % The columns of w must be similar to each Ag

    if nargin == 2,
        [n2,s2] = size(ag);
        comp = 0; alfa = 3; beta = 3; pc = 0.6; pm = 0.1;
    end;

    % Network parameters
    ep = 0; alfa0 = alfa; TD = 1;
    [np,ni] = size(ag); no = 1; vep = [0];
    [C,maxno] = cover(ni,eps); vno = [1:1:no];
    disp(sprintf('Coverage of each Ab: %d',C));
    disp(sprintf('Initial number of classes: %d',no));
    disp(sprintf('Possible number of classes: %d',maxno));
    if maxno > np,
        maxno = np; disp(sprintf('Maximum number of classes (N): %d',np));
    end;
    % disp(sprintf('Affinity threshold: %d',eps));
    disp(sprintf('Press any key to continue...'));

    pause;
    [w] = cadeia(ni,no,0,0,1);
    max_ep = (beta + 1) * maxno;

    % Network Definition
    while (ep < max_ep & TD > 0)% & no < maxno),
        cwin = zeros(1,no); k = 0;
        vet = randperm(np); % Assincronous
        while k < np,
            k = k+1; i = vet(k); D = [];
            [D,mXOR] = match(w',ag(i,:),comp);
            [v(k),ind] = min(D);
            cwin(ind) = cwin(ind) + 1;
            win(i) = ind;
            w = update(w,ind,alfa,mXOR(ind,:)');
        end;
        TD = sum(v);
        ep = ep + 1;
        % Growing Phase
        if (rem(ep,beta)==0),
            [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0);
            vno = [vno no]; vep = [vep ep];
        end;
        % Pruning Phase
        [aux,indmin] = min(cwin);
        if aux == 0,
            [w,no,alfa] = prune(w,indmin,alfa0);
            vno = [vno no];
        end;
        % Learning rate decreasing
        if (ep > 0.05*max_ep & rem(ep,0.05*max_ep)==0),
            if alfa > 1,
                alfa = alfa - 1;
            end;
        end;
        disp(sprintf('IT: %4.0d no: %d LR: %d TD: %d',ep,no,alfa,TD));
    end;
    [v,win,cwin,perc] = testgsom(w,ag,eps);
    disp(sprintf('Percentage of misclassified Ag: %3.2f%%',perc));
    disp('Minimal Antigenic Affinity (HD)'); disp(v);
    disp('Concentration Level: '); disp(cwin);
    disp(sprintf('Final Architecture: [%d,%d].',ni,no));
    figure(1); plot(vep,vno); hold on; plot(vep,vno,'or'); axis([0 ep+1 0 no+1]);
    title('Growing Evolution');
    xlabel('Iteration'); hold off;

    % --------------------------- %
    %    INTERNAL SUBFUNCTIONS    %
    % --------------------------- %

    % Function CADEIA
    function [ab,ag] = cadeia(n1,s1,n2,s2,bip)
    if nargin == 2,
        n2 = n1; s2 = s1; bip = 1;
    elseif nargin == 4,
        bip = 1;
    end;

    % Antibody (Ab) chains
    ab = 2 .* rand(n1,s1) - 1;
    if bip == 1,
        ab = hardlims(ab);
    else,
        ab = hardlim(ab);
    end;
    % Antigen (Ag) chains
    ag = 2 .* rand(n2,s2) - 1;
    if bip == 1,
        ag = hardlims(ag);
    else,
        ag = hardlim(ag);
    end;
    % End Function CADEIA

    % Function SPLIT
    function [w,no,alfa] = split(cwin,win,w,ag,eps,alfa,alfa0)
    [ni,no] = size(w);
    [ind] = find(cwin > 1); % which outputs map more than one Ag
    if ~isempty(ind),
        [val,out] = max(cwin);
        % out = ind(1);
        v = find(win==out);
        Mag = ag(v,:); % matrix of ag mapped in the same output
        D = match(Mag,w(:,out)',0);
        [aux,new] = max(D);
        if aux > eps,
            disp('** Growing **');
            if out == 1,
                w = [Mag(new,:)',w];
            elseif out == no,
                w = [w,Mag(new,:)'];
            else,
                w = [w(:,1:out),Mag(new,:)',w(:,out+1:end)];
            end;
            no = no + 1;
            alfa = alfa0;
        end;
    end;
    % End Function SPLIT

    % Function TESTGSOM
    function [v,win,cwin,k] = testgsom(w,ag,eps),
    % disp('** Running the trained network **');
    [np,ni] = size(ag); k = 0;
    cwin = zeros(1,size(w,2));
    for i=1:np,
        [D] = match(w',ag(i,:),0);
        [v(i),ind] = min(D);
        win(i) = ind;
        cwin(ind) = cwin(ind) + 1;
    end;
    k = 100 * (sum(v > eps) / np);
    % End Function TESTGSOM

    % Function PRUNE
    function [w,no,alfa] = prune(w,ind,alfa0),
    [ni,no] = size(w);
    disp('** Pruning **');
    if ind == 1,
        w = w(:,2:no);
    elseif ind == no,
        w = w(:,1:no-1);
    else,
        w = [w(:,1:ind-1) w(:,ind+1:no)];
    end;
    no = no - 1;
    alfa = alfa0;
    % End Function PRUNE

    % Function COVER
    function [C,no,eps] = cover(len,eps),
    fat = fatorial(len);
    C = 0;
    while eps > len,
        disp(sprintf('Ball of stimulation bigger than chain length %d',len));
        eps = input('Enter a new ball of stimulation: ');
    end;
    for i=0:eps,
        C = C + (fat/(fatorial(i) * fatorial(len-i)));
    end;
    no = ceil((2^len)/C);
    % End Function COVER

    % Function FATORIAL
    function fat = fatorial(m);
    if m == 0,
        fat = 1;
    elseif m < 0,
        disp('Negative value');
    else,
        fat = prod(1:1:m);
    end;
    % End Function FATORIAL

    % Function UPDATE
    function [w] = update(w,ind,alfa,vXOR),
    [ni,no] = size(w);
    for j = 1:alfa,
        [val,pto] = max(vXOR);
        if val == 0,
            break; % exit loop if vectors are equal
        end;
        w