Misc Topics in Testing

McCabe’s Cyclomatic Complexity

• Number of “linearly independent paths”
  – useful in defining test coverage (see later)
  – counts the number of closed loops in the graph

• F_A() = 0 (atomic statement)
• F_s(m1,m2) = m1 + m2 (sequence)
• F_C(m1,m2) = m1 + m2 + 1 (conditional)
• F_l(m1) = m1 + 1 (loop)
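Reading the four rules above as atomic statement, sequence, conditional and loop (an interpretation of the subscripts, not something the slides state), the measure can be computed recursively over a program's structure. A minimal Python sketch, using a hypothetical nested-tuple encoding of programs:

```python
def closed_loops(prog):
    """Apply the four F rules recursively to a nested-tuple program structure.

    Hypothetical encoding (an assumption for this sketch):
    ("atomic",), ("seq", a, b), ("cond", then_part, else_part), ("loop", body).
    """
    kind = prog[0]
    if kind == "atomic":
        return 0
    if kind == "seq":
        return closed_loops(prog[1]) + closed_loops(prog[2])
    if kind == "cond":
        return closed_loops(prog[1]) + closed_loops(prog[2]) + 1
    if kind == "loop":
        return closed_loops(prog[1]) + 1
    raise ValueError(f"unknown construct: {kind!r}")


ATOM = ("atomic",)
# A loop followed by a two-way conditional, each with atomic bodies.
example = ("seq", ("loop", ATOM), ("cond", ATOM, ATOM))
print(closed_loops(example))  # 2
```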

v(P) = #edges - #nodes +2 (Familiar?)

McCabe: Example

Edges = 12

Nodes = 10

v = 12 - 10 + 2 = 4

4 linearly independent paths
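The same formula can be applied mechanically to an edge list. The sketch below uses a small hypothetical flowgraph (not the graph from the slide, whose figure is not reproduced here) and assumes the graph is connected with a single entry and exit.

```python
def cyclomatic_complexity(edges):
    """v(P) = #edges - #nodes + 2 for a connected flowgraph given as (src, dst) pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical flowgraph: a loop (3 -> 4 -> 3) followed by a two-way branch at node 5.
edges = [(1, 2), (2, 3), (3, 4), (4, 3), (3, 5), (5, 6), (5, 7), (6, 8), (7, 8)]
print(cyclomatic_complexity(edges))  # 9 edges - 8 nodes + 2 = 3
```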

More generally...

• Can define a set of prime flowgraphs
  – those which cannot be broken down by nesting
  – corresponding to the statements of the language

• And a measure for each

• Yields a Prime Decomposition Theorem:
  – “The decomposition of a flowgraph into primes is unique”

A more general approach to CFGs

• For any language, a Prime Flowgraph is one which cannot be broken down by sequencing or nesting

[Figure: example prime flowgraphs: if then, repeat until, ..., cases, ??]

Hierarchical measures (again)

• Define measure for each prime flowgraph

• Define measure for sequencing

• Define measure for nesting

Eg. number of nodes:

nd(P) = #nodes in P, for each prime P

$nd(F_1; \ldots; F_n) = \sum_{i=1}^{n} nd(F_i) - n + 1$

$nd(F(F_1, \ldots, F_n)) = nd(F) + \sum_{i=1}^{n} nd(F_i)$

Example: Structuredness

• Whether a program is structured can be seen as a measure as follows:

str(P) = 1 if P is one of the allowed primes
         0 otherwise

str(F1;...;Fn) = min(str(F1),...,str(Fn))

str(F(F1,...,Fn)) = min(str(F),str(F1),...,str(Fn))
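The three rules for str translate directly into a recursive function over a decomposition tree. This is a minimal sketch: the class names and the set of allowed prime names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed set of "allowed" primes; the slides do not list them.
ALLOWED_PRIMES = {"stmt", "if_then", "if_then_else", "while", "repeat_until"}


@dataclass
class Prime:
    name: str


@dataclass
class Seq:            # F1; ...; Fn
    parts: List[object]


@dataclass
class Nest:           # F(F1, ..., Fn)
    outer: Prime
    inner: List[object]


def structured(g):
    """Return 1 if every prime in the decomposition is allowed, else 0."""
    if isinstance(g, Prime):
        return 1 if g.name in ALLOWED_PRIMES else 0
    if isinstance(g, Seq):
        return min(structured(part) for part in g.parts)
    if isinstance(g, Nest):
        return min([structured(g.outer)] + [structured(part) for part in g.inner])
    raise TypeError(f"not a flowgraph: {g!r}")


good = Seq([Prime("stmt"), Nest(Prime("while"), [Prime("stmt")])])
bad = Nest(Prime("if_then"), [Prime("two_exit_loop")])
print(structured(good), structured(bad))  # 1 0
```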

Linearly Independent Paths

• The vector representation of a path is a vector which counts the number of occurrences of each edge.

• A set of paths is l.i. if none can be represented as a linear combination of the others (in the vector representation).

First number each edge

[Figure: flowgraph with its edges numbered 1 to 12]

A path can be represented as a vector counting edges visited:

A = (1,0,1,0,1,1,0,1,0,0,0,1)
B = (1,0,1,0,1,0,1,0,0,0,1,1)
C = (1,0,1,0,1,0,1,0,1,1,1,1)
D = (0,1,0,1,1,1,0,1,0,0,0,1)

Now can add and subtract vectors:

Eg. D-A = (-1,1,-1,1,0,0,0,0,0,0,0,0)

So E = B+D-A

[Figure: path E]
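The vector arithmetic can be checked mechanically. The sketch below assumes the labels A to D attach to the four vectors in the order they are listed, and uses a small exact Gaussian elimination (written for this example, not taken from the slides) to test linear independence.

```python
from fractions import Fraction

A = (1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1)
B = (1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1)
C = (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1)
D = (0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


E = sub(add(B, D), A)
print(sub(D, A))              # (-1, 1, -1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
print(E)                      # (0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1)
print(rank([A, B, C, D]))     # 4: A, B, C, D are linearly independent
print(rank([A, B, C, D, E]))  # 4: E adds nothing new
```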

How do we find test sets?

• Given a test strategy it is not easy to find test cases that exercise the required paths
  – Even for Statement Coverage some parts of the code may be unreachable
  – A single path can achieve Branch Coverage for: while(...) do “some complex program” but unlikely to be possible in practice

Domain Partitioning

What have we been doing?

• Partitioning input space according to some property

• Selecting Test case inputs which are representatives of each partition – Eg to ensure different paths executed

• Assuming behaviour similar for all values of partition

Boundary Value Analysis

• Also important to test software at the boundaries of the partitions.
  – Less than (or equal)?
  – length of list (or n-1)?
  – closure reversal (“not <” is not “>”)?

• How do we identify boundaries?

Single variable case

• Open and closed intervals
  – Both ends closed: [min, max]
  – Half open: [min, max) or (min, max]
  – Both ends open: (min, max)

[Figure: number line divided into partitions P1, P2, P3]

Multiple variables

• Input domains are multi-dimensional
• Boundaries are hyperplanes
• Can be open or closed at each intersection

[Figure: two-dimensional domain showing an open boundary, a closed boundary, an on point, an off point and an extreme point]
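For the single-variable case, on and off points can be generated directly from the interval ends. The sketch below assumes one common convention, which the slides do not spell out: an on point lies on the boundary, and an off point lies one step away from it, outside the domain when the boundary is closed and inside it when the boundary is open. The function name and the `step` parameter are hypothetical.

```python
def boundary_points(lo, hi, lo_closed=True, hi_closed=True, step=1):
    """Return (on_points, off_points) for a one-variable interval.

    Assumed convention: off points sit outside a closed boundary
    and inside an open one, `step` away from the boundary.
    """
    on_points = [lo, hi]
    off_lo = lo - step if lo_closed else lo + step
    off_hi = hi + step if hi_closed else hi - step
    return on_points, [off_lo, off_hi]


print(boundary_points(1, 10))                                  # ([1, 10], [0, 11])
print(boundary_points(1, 10, lo_closed=False, hi_closed=False))  # ([1, 10], [2, 9])
```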

Finding Test Cases

• CFGs model software

• Test strategy to select paths to test

• Data flow Analysis to choose “best” test paths

• Now need to find test inputs which exercise those paths

Example

• Find All DU paths for example program

• Find test cases which execute the paths

smallest(int p) (* p > 2 *)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}

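Program, CFG and ADUP

Definitions (d) and uses (u) of each variable, by CFG node:

Node  Statement                              p    q
1     smallest(int p)                        d
2     int q = 2                                   d
3     while (p mod q > 0 AND q < sqrt p)     u    u
4     q := q+1                                    ud
5     if (p mod q = 0)                       u    u
6     print(q,’is factor’)                        u
7     print(p,’is prime’)                    u
8     exit

All DU paths (ADUP):

p: 123, 12343, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356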

ADUP, subsumed subpaths and test cases

p: 123, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356

DU paths left after       100% coverage        Test input                    Test output
subpaths are subsumed
12357, 1234357            123578, 12343578     p=3, p=5                      3 is prime, 5 is prime
2356                      123568               p=4,6,8,...                   2 is sm fact
434                       12343435_8           p=11 (candidates:             11 is prime
                                               4,8,12,... or 9,10,...,15)
4356                      12343568             p=9,15,21,...                 3 is sm fact
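The paths claimed for each test input can be confirmed by running an instrumented version of the example program. In this Python sketch the node numbering (1 entry, 2 `q = 2`, 3 loop predicate, 4 `q := q+1`, 5 if predicate, 6 and 7 the two prints, 8 exit) is inferred from the paths listed above, and `q * q < p` stands in for `q < sqrt p` to stay in integer arithmetic.

```python
def smallest_path(p):
    """Run the example program on p > 2; return (nodes visited, output)."""
    path = [1]               # node 1: entry, p defined
    q = 2
    path.append(2)           # node 2: q = 2
    while True:
        path.append(3)       # node 3: loop predicate
        if p % q > 0 and q * q < p:
            q += 1
            path.append(4)   # node 4: q := q+1
        else:
            break
    path.append(5)           # node 5: if predicate
    if p % q == 0:
        path.append(6)       # node 6: print factor
        output = f"{q} is factor"
    else:
        path.append(7)       # node 7: print prime
        output = f"{p} is prime"
    path.append(8)           # node 8: exit
    return "".join(map(str, path)), output


for p in (3, 5, 4, 11, 9):
    print(p, *smallest_path(p))
```

With these five inputs every path in the coverage column is exercised once.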

How were test cases found?

• Required outcome at each predicate node

• Consider all requirements together

• Guess a value that will satisfy them

• Can we improve on this?

Symbolic Execution

• How to find test inputs to exercise a path?

– Need certain choice at each predicate node

– Give a symbolic value to each variable

– Walk the path collecting requirements on symbolic input

• Then have a set of inequalities to solve

• Example: Find test cases for each path by symbolic execution:

smallest(p)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}

Path 123578

Step        p   q   Conditions                      Candidates
            X   Y
q = 2       X   2
while (F)   X   2   X mod 2 = 0 OR 2 >= sqrt X      X=4,6,8,... or X=3,4
if (F)      X   2   X mod 2 > 0                     X=3,5,7,...

Solutions: X=3

Path 12343578

Step        p   q   Conditions                      Candidates
            X   Y
q = 2       X   2
while (T)   X   2   X mod 2 > 0                     X=3,5,7,...
                    2 < sqrt X                      X=5,6,7,...
q := q+1    X   3
while (F)   X   3   X mod 3 = 0 OR 3 >= sqrt X      X=3,6,9,... or X=3,4,...,9
if (F)      X   3   X mod 3 > 0                     X=4,5,7,8,...
print               X is prime

Solutions: X=5,7

Output: 5 is prime, 7 is prime

Path 123568

Step        p   q   Conditions                      Candidates
            X   Y
q = 2       X   2
while (F)   X   2   X mod 2 = 0 OR 2 >= sqrt X      X=4,6,8,... or X=3,4
if (T)      X   2   X mod 2 = 0                     X=4,6,8,...
print               Y is sm fact

Solutions: X=4,6,8,...

Output: 2 is sm fact

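Path 12343568

Step        p   q   Conditions                      Candidates
            X   Y
q = 2       X   2
while (T)   X   2   X mod 2 > 0                     X=3,5,7,...
                    2 < sqrt X                      X=5,6,7,...
q := q+1    X   3
while (F)   X   3   X mod 3 = 0 OR 3 >= sqrt X      X=3,6,9,... or X=3,4,...,9
if (T)      X   3   X mod 3 = 0                     X=3,6,9,...
print               Y is sm fact

Solutions: X=9,15,21,...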

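Path 12343435_8

Step        p   q   Conditions          Candidates         Solutions so far
q = 2       X   2
while (T)   X   2   X mod 2 > 0         X=3,5,7,...
                    2 < sqrt X          X=5,6,7,...        5,7,9,11,13,...
q := q+1    X   3
while (T)   X   3   X mod 3 > 0         X=4,5,7,8,...      5,7,11,13,17,...
                    3 < sqrt X          X=10,11,12,...     11,13,17,19,...
q := q+1    X   4
while (F)   X   4   X mod 4 = 0         X=4,8,12,...       none from this
                    OR 4 >= sqrt X      X=3,4,...,16       11,13
if (_)      X   4   X mod 4 ? 0         X=...              must be false

Solutions: X=11,13

Output: 11 is prime, 13 is prime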
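The walk-and-collect procedure used for these paths can be sketched in a few lines. Here `q` is tracked concretely (it never depends on the input), `p` is the symbolic value X, and the collected conditions are solved by brute force over a small candidate range rather than by a real constraint solver. The function names, node numbering and candidate range are assumptions made for the sketch.

```python
def path_conditions(path):
    """Walk a node path of the example program and collect conditions on X.

    Returns a list of (description, predicate) pairs; q is tracked concretely.
    """
    conds = []
    q = None
    for node, nxt in zip(path, path[1:]):
        if node == 2:
            q = 2
        elif node == 3 and nxt == 4:   # loop predicate true
            conds.append((f"X mod {q} > 0 AND {q} < sqrt X",
                          lambda x, q=q: x % q > 0 and q * q < x))
        elif node == 3:                # loop predicate false
            conds.append((f"X mod {q} = 0 OR {q} >= sqrt X",
                          lambda x, q=q: x % q == 0 or q * q >= x))
        elif node == 4:
            q += 1
        elif node == 5 and nxt == 6:   # if predicate true
            conds.append((f"X mod {q} = 0", lambda x, q=q: x % q == 0))
        elif node == 5:                # if predicate false
            conds.append((f"X mod {q} > 0", lambda x, q=q: x % q > 0))
    return conds


def solve(path, candidates=range(3, 31)):
    """Brute-force the inputs in `candidates` that satisfy every path condition."""
    conds = path_conditions(path)
    return [x for x in candidates if all(pred(x) for _, pred in conds)]


print(solve([1, 2, 3, 5, 7, 8]))              # [3]
print(solve([1, 2, 3, 4, 3, 5, 7, 8]))        # [5, 7]
print(solve([1, 2, 3, 4, 3, 5, 6, 8]))        # [9, 15, 21, 27]
print(solve([1, 2, 3, 4, 3, 4, 3, 5, 7, 8]))  # [11, 13]
print(solve([1, 2, 3, 4, 3, 4, 3, 5, 6, 8]))  # []  (infeasible in this range)
```

The empty result for the last path matches the "must be false" entry: an odd X can never satisfy X mod 4 = 0.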

Difficulties with Symbolic Execution

• Generally, many paths are not feasible

• Conditions can become complex:
  – when complex expressions on rhs of assignments
  – then program variables are complex expressions in terms of the symbolic vars

• Sets of conditions can be computationally complex to solve

Possible Solutions

• Computational Complexity:
  – Use numerical methods to calculate the tests
    • Straight line equivalents
    • Program Instrumentation
  – Adaptive testing (later)

• Complex predicates
  – Condition/Decision strategies (later)

• Many Infeasible paths
  – Adaptive testing (later)

Straight Line equivalents

• Construct the “straight line” program corresponding to the path required.

• replace predicates with path constraints
  – a real valued expression which records the requirement as a minimisation

• Solve the path constraints using numerical methods

Path Constraints

• Eg. if(x = y) is replaced by
    c1 := abs(x-y)
• and if(x>y) is replaced by
    c2 := x-y
• Then we must minimise the ci
• Can use numerical methods to do this
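As an illustration of the equality case only, the sketch below drives c1 = abs(x-y) to zero with a simple integer hill climb. The hill climb is a stand-in chosen for brevity, not the numerical method used in the cited work, and the function and parameter names are hypothetical.

```python
def minimise(cost, start, max_steps=1000):
    """Greedy integer hill climb: change one input by +/-1 whenever that lowers the cost."""
    point = list(start)
    best = cost(point)
    for _ in range(max_steps):
        if best == 0:
            break
        improved = False
        for i in range(len(point)):
            for delta in (-1, 1):
                trial = point.copy()
                trial[i] += delta
                trial_cost = cost(trial)
                if trial_cost < best:
                    point, best, improved = trial, trial_cost, True
        if not improved:
            break  # local minimum: no single step helps
    return point, best


# Path constraint for if (x = y): c1 := abs(x - y)
c1 = lambda v: abs(v[0] - v[1])
point, value = minimise(c1, start=(10, 3))
print(point, value)  # an input with x == y, and c1 == 0
```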

Program instrumentation

• generally - a method to allow testing of a unit in place by augmenting program

• Here - add function calls which record value of key variables

• replace predicates with calls which guarantee correct path is taken

• run program to generate conditions

• Again use numerical methods to solve

Conditions and Decisions

• Above strategies do not take account of predicates with more than one conjunct

• There are more strategies which distinguish
  – Conditions - the individual clauses of predicate, from
  – Decisions - the outcome of evaluating the whole predicate

Condition Coverage

• Achieve all possible combinations of simple Boolean conditions at each decision node

• In critical real-time applications over half of statements may be Boolean expressions

• Several variants of strategies which account for individual conditions

Example Condition Strategies

• Decision coverage (DC)
  – every decision tested in each possible outcome
• Condition/Decision coverage (C/DC)
  – as above plus, every condition in each decision tested in each possible outcome
• Modified Condition/Decision (MC/DC)
  – as above plus, every condition shown to independently affect a decision outcome (by varying that condition only)
• Multiple-condition coverage (M-CC)
  – all possible combinations of conditions within each decision taken

Modified Condition/Decision Coverage

• Multiple-condition coverage is strongest but grows exponentially in # conditions

• Modified C/D is linear like C/D
• Eg. For A and B
  – (T,T) required to exercise decision true
  – (F,T) required for independence of A
  – (T,F) required for independence of B
  – (F,F) not required

• MC/DC (among others) is required for flight-critical commercial avionics software
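The independence requirement can be checked by enumeration for small decisions. The sketch below finds, for each condition, the pairs of test vectors that differ only in that condition and change the decision outcome, then takes the first pair per condition. That greedy choice is an assumption of the sketch and need not give a minimal set for larger decisions.

```python
from itertools import product


def independence_pairs(decision, n):
    """For each condition index, list pairs of test vectors that differ only
    in that condition and give different decision outcomes."""
    pairs = {i: [] for i in range(n)}
    for values in product([True, False], repeat=n):
        for i in range(n):
            if not values[i]:
                continue  # visit each pair once, from its True side
            flipped = values[:i] + (False,) + values[i + 1:]
            if decision(*values) != decision(*flipped):
                pairs[i].append((values, flipped))
    return pairs


def mcdc_tests(decision, n):
    """Pick the first independence pair found for each condition."""
    tests = set()
    for i, found in independence_pairs(decision, n).items():
        if not found:
            raise ValueError(f"condition {i} cannot affect the decision")
        tests.update(found[0])
    return tests


tests = mcdc_tests(lambda a, b: a and b, 2)
print(sorted(tests))  # [(False, True), (True, False), (True, True)]
```

For `A and B` this reproduces the three required cases and leaves out (F,F).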

Further Problems with Symb. Ex.

• When loop conditions are input dependent

• When array indices are input dependent

• When external functions are called

Adaptive Testing

The above approach has been in 4 stages:

1) Construct the control flow graph
  – a parsing problem - automatable
  – can also add “instrumentation” here

2) Choose the test paths
  – According to some test strategy
  – CFG - possibly with data flow considerations

Four stages (cont.)

3) Choose the test cases
  – by symbolic execution and simultaneous inequalities
    • or by backwards substitution
  – can reveal infeasible paths, requiring a return to stage 2

4) Execute the test cases
  – Only now do we execute the program

• Adaptive testing merges stages 2), 3) and 4)

Problems with 4-stage approach

• Infeasible paths (stage 3) require selection of new paths (return to stage 2)

• Computational complexity of test case selection

Adaptive testing develops test cases one at a time and uses result of previous test case execution to help select next test case

Inductive Strategies

• Choose first test input x1 (perhaps at random)
• Execute test and record path taken, p1
• Say k-1 tests have been done giving {(x1,p1),...,(xk-1,pk-1)}
• use some strategy to select xk

Several such strategies exist.

Diagonalisation

Important “method” in Mathematics:

• Cantor’s uncountability of Reals

• Godel’s Incompleteness

• Undecidability of Halting problem

For list of lists, find a new list by choosing an element different from each on the diagonal

A11, A12, A13, ...
A21, A22, A23, ...
A31, A32, A33, ...
...

New = B1, B2, B3, ...
where B1 ≠ A11, B2 ≠ A22, B3 ≠ A33, ...
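For lists of Booleans, "different from each on the diagonal" simply means negating the diagonal. A minimal sketch, with a hypothetical function name and a finite square list standing in for the infinite case:

```python
def diagonal_complement(rows):
    """Build a new row that differs from row i in position i."""
    return [not rows[i][i] for i in range(len(rows))]


rows = [
    [True, False, True],
    [False, False, True],
    [True, True, True],
]
new = diagonal_complement(rows)
print(new)  # [False, True, False]
```

By construction the new row cannot equal any existing row, since it disagrees with row i at position i.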

Diagonalisation (2)

• Each path pi gives a conjunctive predicate Pi

• These predicates characterise a set of non-overlapping subdomains of the input space

• We must find a new input xk not in any Pi

• Let Pi be conjunction of Ci,1,Ci,2,...Ci,ki

• For each i, choose xk to violate some Ci,j – eg. xk not in Ci,i

Path Prefix Strategy
[Prather and Myers, IEEE Trans. SE-13(7) 1987]

For Branch coverage

• For a path p, define its reversible prefix q
  – the initial portion of p to the first decision node where the branches are not yet fully covered

• A reversal of p is then any path with same reversible prefix but then a different continuation

Path Prefix Strategy (2)

• Choose first input in some way and execute to give first path, p1

• Given p1,...,pk-1, let pi be path with shortest reversible prefix

• Choose next input to give a reversal of pi

• Execute and add the new path to set of paths

Path Prefix: earlier example

• Choose first input p = 3 (say)
  – execution gives path p1 = 123578
  – Reversible prefix = 123, Reversal = 1234...
• Deduce second input, p = 5
  – execution gives path p2 = 12343578
  – reversible prefix 123435
  – path p1 also now has reversible prefix 1235
  – choose the shorter (that of p1), Reversal = 12356
• Deduce 3rd input, p = 4
  – execution gives path p3 = 123568
• All branches covered
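The walkthrough above can be reproduced with a short sketch of the strategy. It is a simplification under stated assumptions: the node numbering is inferred from the listed paths, `q * q < p` replaces `q < sqrt p`, and the inversion step (finding an input for the reversal) is done by brute-force search over a candidate range instead of back substitution.

```python
DECISIONS = {3: (4, 5), 5: (6, 7)}  # decision node -> its two successors


def trace(p):
    """Nodes visited by the example program for input p > 2."""
    path, q = [1, 2], 2
    while True:
        path.append(3)
        if p % q > 0 and q * q < p:
            path.append(4)
            q += 1
        else:
            break
    path += [5, 6 if p % q == 0 else 7, 8]
    return path


def branches(path):
    return {(a, b) for a, b in zip(path, path[1:]) if a in DECISIONS}


def reversible_prefix(path, covered):
    """Prefix ending at the first decision node with an uncovered branch, else None."""
    for i, node in enumerate(path):
        if node in DECISIONS and any((node, s) not in covered for s in DECISIONS[node]):
            return path[:i + 1]
    return None


def path_prefix(first_input, candidates):
    tests = [(first_input, trace(first_input))]
    covered = branches(tests[0][1])
    while True:
        prefixes = [reversible_prefix(path, covered) for _, path in tests]
        prefixes = [pre for pre in prefixes if pre is not None]
        if not prefixes:
            return tests                      # all branches covered
        prefix = min(prefixes, key=len)       # shortest reversible prefix
        node = prefix[-1]
        wanted = [s for s in DECISIONS[node] if (node, s) not in covered]
        for x in candidates:                  # brute-force inversion
            t = trace(x)
            if t[:len(prefix)] == prefix and t[len(prefix)] in wanted:
                tests.append((x, t))
                covered |= branches(t)
                break
        else:
            return tests                      # no reversal found: give up


for x, path in path_prefix(3, range(3, 50)):
    print(x, "".join(map(str, path)))
```

Starting from p = 3, the search picks p = 5 and then p = 4, the same three inputs as in the example.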

Problems with Path prefix

• Still need to deduce input for new path
  – the inversion problem (later)

• Still may get infeasible paths
  – absolute infeasibility - a path can never be executed
  – relative infeasibility - a path cannot be the continuation of any of the current reversible prefixes

Example of relative infeasibility

simple(bool x, y)
  if(x = true) then S1 else S2;
  if(x xor y = true) then S3 else S4;
  if(x and y = true) then S5 else S6;

in1 = (false,false)
p1 = F,F,F
reverse at 1 gives:
in2 = (true,false)
p2 = T,T,F
reverse p1 at 2 gives F,F,T - infeasible
reverse p2 at 2 gives T,T,T - infeasible
but T,F,T is feasible, eg in3 = (true,true)

[Figure: three conditionals in sequence, with points 1, 2 and 3 marked]

Conditionals in sequence:
- # paths to node grows exponentially
- # previous nodes grows linearly
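Because `simple` has only four possible inputs, the feasible outcome sequences can be enumerated outright. This sketch (function name hypothetical) records the outcome of the three decisions for every input:

```python
from itertools import product


def decisions(x, y):
    """Outcomes of the three decisions in simple(x, y), in order."""
    return (x, x != y, x and y)   # x != y is xor for Booleans


# Map each feasible outcome sequence to an input that produces it.
feasible = {decisions(x, y): (x, y) for x, y in product([False, True], repeat=2)}

for outcome, inputs in sorted(feasible.items()):
    print(outcome, "<-", inputs)

print((False, False, True) in feasible)  # False: F,F,T is infeasible
print((True, True, True) in feasible)    # False: T,T,T is infeasible
print(feasible[(True, False, True)])     # (True, True)
```

Only four of the eight outcome sequences are reachable, which is why reversing just the last decision of F,F,F or T,T,F fails.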

The Inversion Problem

• How do we find the input which reverses the decision at Pk ?

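[Figure: input domain D; the region where P1&...&Pk-1 holds is split by Pk and not Pk, with inputs x and x’ on opposite sides of that boundary]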

The Inversion Problem (2)

• Need to find x’ given x
• Done by Back Substitution
• execute with x recording all states for prefix
• pick change of a variable to change Pk
• substitute back through program logic to calculate required input
  – same as for 4 step approach but with actual values
  – For real-valued conditions can use grad(Pk) to cross boundary via normal

Advantages of adaptive approach

• Informal common sense tells us:
  – Change only one thing at a time
  – Exploit nearness of previous test cases to the required path

• Formal analysis gives us:
  – overall complexity of adaptive approach is less than 4 stage approach [Myers, SEJ 7(1) 1992]

References

• ADTEST, Gallagher and Narasimhan, IEEE Trans. SE-23(8), 1997.

• Symbolic Execution, Girgis, SEJ 7(4), 1992.

• Instrumentation, Luo, Probert and Ural, Software Engineering Journal (SEJ) 10(6), 1995.

• Path Prefix, Prather and Myers, IEEE Trans. SE-13(7), 1987.

• Complexity of adaptive, Myers, SEJ 7(1), 1992.

• MC/DC, Chilenski and Miller, SEJ 9(5) 1994.
