Cristian Urs and Ben Riveira

Upload: toby-barrett

Post on 29-Dec-2015


Introduction

The article we chose focuses on improving the performance of Genetic Algorithms by:

Using predictive models to efficiently perform repetitive test case executions.

Directly improving the efficiency of the internal workings of the Genetic Algorithm itself.

The Genetic Algorithm, Defined

A GA is a search algorithm with the following key features:

A population of individuals, where each individual represents a possible solution to the problem.

A fitness function, which scores individuals so that they can be selected for reproduction based on their fitness.

Genetic operators, which cross over or mutate selected individuals, creating new individuals for testing.

Example GA Pseudocode

1. Choose the initial population of individuals.
2. Evaluate the fitness of each individual in that population.
3. Repeat on this generation until termination:
   1. Select the best-fit individuals for reproduction.
   2. Breed new individuals through crossover and mutation operations to give birth to offspring.
   3. Evaluate the individual fitness of new individuals.
   4. Replace least-fit population with new individuals.
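
The pseudocode steps can be sketched in Python as follows. This is only an illustration under our own assumptions, not the article's implementation: the bit-string encoding, the truncation selection of the best half, one-point crossover, the parameter defaults, and the toy fitness function count_ones are all hypothetical choices.

```python
import random

def run_ga(fitness, chrom_len=12, pop_size=20, generations=50,
           mutation_rate=0.05, seed=0):
    """Minimal generational GA over bit strings (illustrative only)."""
    rng = random.Random(seed)
    # Step 1: choose the initial population of individuals.
    population = [[rng.randint(0, 1) for _ in range(chrom_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Steps 2 and 3.3: evaluate fitness and rank best-first.
        ranked = sorted(population, key=fitness, reverse=True)
        # Step 3.1: select the best-fit half as parents (assumed scheme).
        parents = ranked[:pop_size // 2]
        # Step 3.2: breed offspring by one-point crossover and bit-flip mutation.
        offspring = []
        while len(offspring) < pop_size - len(parents):
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, chrom_len)
            child = mother[:cut] + father[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            offspring.append(child)
        # Step 3.4: the least-fit half is replaced by the new individuals.
        population = parents + offspring
    return max(population, key=fitness)

def count_ones(chromosome):
    """Toy fitness: the number of 1 genes (hypothetical stand-in)."""
    return sum(chromosome)

best = run_ga(count_ones)
print(best, count_ones(best))
```

Because the parents are carried over unchanged, the best individual found so far is never lost between generations, which matches the point about not losing partial solutions.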

Advantages of Genetic Algorithms

The population of a GA allows it to:

Explore a search space without completely losing partial solutions that have already been found.

Perform parallel searches into multiple regions of the solution space.

In the area of software verification and validation, GAs have become useful for automatically generating large volumes of software test cases.

Two Approaches

Neural Network-Based Oracles

Use of a system oracle:

Avoids expensive execution costs for evaluating test cases.

Provides efficient execution of repetitive testing tasks after deployment.

Dramatically reduces the burden of evaluating test cases in each genetic algorithm generation.

[Figure: Use of a System Oracle. Components shown: Input Domain Data, Genetic Algorithm, Tester, Test Oracle, Software Under Test, Failure Intensity Evaluation. Labelled flows: Input, Output, Result, Selected Individual, Failed Test Cases.]

Neural Network-Based Oracles

A neural network is an algorithm for optimization and learning based loosely on the nature of the brain. It consists of:

A directed graph known as the network topology, whose arcs we refer to as links.

A state variable and a real-valued bias associated with each node.

A real-valued weight associated with each link.

A transfer function for each node.
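
To make these components concrete, here is a minimal sketch of a forward pass through a feed-forward network. The 2-2-1 topology, the weights and biases, and the names forward, sigmoid and step are made up for illustration and are not taken from the article.

```python
import math

def sigmoid(x):
    """A common smooth transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def step(x):
    """A hard-threshold transfer function."""
    return 1.0 if x > 0 else 0.0

def forward(inputs, layers, transfer=sigmoid):
    """Propagate node states through the layers of a feed-forward network.

    Each layer is a list of nodes; each node is (weights, bias), where
    weights holds one real-valued weight per incoming link.
    """
    states = list(inputs)
    for layer in layers:
        states = [transfer(sum(w * s for w, s in zip(weights, states)) + bias)
                  for weights, bias in layer]
    return states

# Hypothetical 2-2-1 topology with made-up weights and biases.
hidden = [([1.0, 1.0], -0.5), ([1.0, 1.0], -1.5)]
output = [([1.0, -2.0], -0.5)]

outputs = [forward([a, b], [hidden, output], transfer=step)[0]
           for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

With the step transfer function, these particular weights make the network compute exclusive-or of its two inputs. Training an oracle would mean adjusting the weights and biases rather than fixing them by hand.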

[Figure: A simple Feed Forward Neural Network, with nodes labelled x, y and z between Input and Output.]

[Figure: Neural Network-Based Oracles. Components shown: Input Domain Data, Genetic Algorithm, Tester, GA Trained Oracle, Random Trained Oracle, Failure Intensity Evaluation. Labelled flows: Input, Output, Result, Selected Individual, Failed Test Cases.]

Neural Network-Based Oracles

Comparison of Accuracy for Random Versus GA-Based Test Cases

                             GA     Random
Overall Accuracy             81%    96%
Error Accuracy               96%    76%
Average per Error Accuracy   83%    29%

Improving the Fitness Function Calculation

The second strategy for improving the performance of genetic algorithms in automated test case generation is to improve the fitness function calculation.

Fitness Function Calculations

What is fitness?

The probability that an individual chromosome survives into the next generation.

What is a chromosome?

Chromosome = string of digits

Gene = each digit that makes up the chromosome

Ex. of chromosome: 111001110101 100101100110 001010111000, listed with the values 1363, 801, 299

Ex. of utilization: this chromosome encodes the triangle side values x, y, z.

Fitness Function Calculations

How do we calculate the overall fitness? Based on:

Likelihood of occurrence

Failure intensity

Similarity to other individuals from the population

A. Likelihood of Occurrence

Highly fit individuals = high probability to be used

Poorly fit individuals = low probability to be used

How to calculate the likelihood of input data?

By multiplying the probabilities of occurrence.

Ex: the likelihood that the user would select input value 1 and input value 3 is: 0.75 x 0.005 = 0.00375
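
The multiplication rule can be checked with a short sketch. The function name likelihood is ours, and the code assumes the individual input selections are independent, which the slides do not state explicitly.

```python
import math

def likelihood(probabilities):
    """Likelihood of a combination of inputs as the product of the
    per-input probabilities of occurrence (independence assumed)."""
    return math.prod(probabilities)

# Probabilities from the example: input value 1 and input value 3.
combined = likelihood([0.75, 0.005])
print(f"{combined:.5f}")
```

Multiplying 0.75 by 0.005 gives 0.00375, so a rarely chosen input pulls the likelihood of the whole test case down sharply.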

B. Failure Intensity

A combination of failure density and severity.

Ex:

Low density, high severity: a single failure that resulted in a crash of the software

High density, low severity: the system doesn’t crash, but gives erroneous output

C. Niche Size

What is a niche?

The number of individuals in the population that have common attributes.

A situation very likely to occur and result in high failure intensity

Situations which are similar, but different
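
One way to picture niche size is the sketch below. It is an assumption on our part, not the article's formula: individuals are treated as similar when they differ in at most radius genes, and the raw fitness is divided by the niche size, which is the classic fitness-sharing technique. The names hamming, niche_size and shared_fitness are hypothetical.

```python
def hamming(a, b):
    """Number of gene positions at which two chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def niche_size(individual, population, radius=2):
    """Count population members within `radius` differing genes of `individual`."""
    return sum(1 for other in population if hamming(individual, other) <= radius)

def shared_fitness(raw_fitness, individual, population, radius=2):
    """Classic fitness sharing: crowded niches are penalised.

    max(1, ...) avoids division by zero when the individual has no
    neighbours in the population at all.
    """
    return raw_fitness / max(1, niche_size(individual, population, radius))

# Made-up population: three near-duplicates and one distinct individual.
population = ["110010", "110011", "110111", "001100"]
sizes = [niche_size(ind, population) for ind in population]
print(sizes)  # [3, 3, 3, 1]
```

The lone individual keeps its full raw fitness, while each of the three near-duplicates has its fitness divided by three, which favours test cases that are similar, but different.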

How to improve fitness function calculation?

1. Use a sample of fossil record

2. Summarize the fossil record

1. Use a sample of fossil record

Fossil record = data warehouse

Advantages:

Large reduction in computation time

Makes the process predictable with fixed-size samples

Easy to implement

Example:

Sample A of size 500 (6% of the fossil record size)

Sample B of size 5000 (17% of the fossil record size)
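
A minimal sketch of the sampling idea, under our own assumptions: the fossil record is a list of (x, y, z) test cases, similarity means every field lies within a tolerance, and the record here is random data generated only for the demonstration. None of these details come from the article.

```python
import random

def sample_fossil_record(fossil_record, sample_size, rng):
    """Return a fixed-size random sample (or everything, if the record is small)."""
    if sample_size >= len(fossil_record):
        return list(fossil_record)
    return rng.sample(fossil_record, sample_size)

def count_similar(candidate, records, tolerance=50):
    """Count records whose every field is within `tolerance` of the candidate."""
    return sum(1 for r in records
               if all(abs(a - b) <= tolerance for a, b in zip(candidate, r)))

rng = random.Random(42)
# Hypothetical fossil record: 10,000 previously generated (x, y, z) test cases.
fossil_record = [tuple(rng.randint(0, 4095) for _ in range(3))
                 for _ in range(10_000)]

sample = sample_fossil_record(fossil_record, 500, rng)
candidate = (1363, 801, 299)
print(len(sample), count_similar(candidate, sample))
```

The similarity count now costs the same for every candidate no matter how large the fossil record grows, which is where the predictable computation time comes from.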

2. Summarize the fossil record

Adopt a higher level of abstraction.

Advantages:

Reduced and predictable computation time

Disadvantages:

The strategy is complex and requires frequent re-calculation
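
One possible reading of "a higher level of abstraction" is to replace individual records with counts per region of the input space. The grid-cell scheme below is our own assumption for illustration; the slides do not say how the article builds its summary, and the names summarize and niche_estimate are hypothetical.

```python
from collections import Counter

def cell_of(record, cell_size):
    """Map a record to the grid cell that contains it."""
    return tuple(v // cell_size for v in record)

def summarize(fossil_record, cell_size=512):
    """Collapse individual records into counts per grid cell."""
    return Counter(cell_of(record, cell_size) for record in fossil_record)

def niche_estimate(candidate, summary, cell_size=512):
    """Estimate how crowded the candidate's region is with one lookup."""
    return summary[cell_of(candidate, cell_size)]

# Made-up fossil record of (x, y, z) test cases.
record = [(1363, 801, 299), (1400, 900, 100), (3000, 10, 10)]
summary = summarize(record)
print(niche_estimate((1300, 700, 50), summary))  # 2
```

Each lookup is constant time, but the summary has to be rebuilt or updated whenever new individuals join the fossil record, which reflects the re-calculation cost listed as a disadvantage.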

[Figure: Sample of fossil record]

Conclusion (1)

GA-based software test case generation can be improved by using oracles or models, and by changing the way the fitness function is calculated.

Conclusion (2)

Though the methods for improving the performance of GAs discussed in this paper sound feasible, not enough evidence was presented to corroborate any of the authors’ claims. Much of the information that was presented here was actually discovered in other articles, like:

“Breeding Software Test Cases with Genetic Algorithms” by D. Berndt (2002) http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1174917