Misc Topics in Testing
McCabe’s Cyclomatic Complexity
Number of “linearly independent paths”
– useful in defining test coverage (see later)
– counts the number of closed loops in the graph
• FA() = 0
• Fs(m1,m2) = m1 + m2
• FC(m1,m2) = m1 + m2 + 1
• Fl(m1) = m1 + 1
v(P) = #edges - #nodes + 2 (Familiar?)
McCabe: Example
Edges = 12
Nodes = 10
v = 12 - 10 + 2 = 4
4 Lin. Indep. Paths
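The edge and node counts above come from a figure that is not reproduced here, so the following is only a minimal sketch of the formula v(P) = #edges - #nodes + 2 applied to a small hypothetical flowgraph. The function name and the example edge list are assumptions made for illustration.

```python
def cyclomatic_complexity(edges):
    """v(P) = #edges - #nodes + 2 for a single-entry, single-exit flowgraph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical flowgraph (not the one on the slide):
# an if-then-else on nodes 1-4, then a loop between nodes 4 and 5, exit at 6.
example_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 4), (4, 6)]

# 7 edges, 6 nodes: v = 7 - 6 + 2 = 3
print(cyclomatic_complexity(example_edges))
```

The slide's own example would give 12 - 10 + 2 = 4 by the same arithmetic.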
More generally...
• Can define a set of prime flowgraphs
– those which cannot be broken down by nesting
– corresponding to the statements of the language
• And a measure for each
• Yields a Prime Decomposition Theorem:
– “The decomposition of a flowgraph into primes is unique”
A more general approach to CFGs
• For any language, a Prime Flowgraph is one which cannot be broken down by sequencing or nesting
[Figure: example prime flowgraphs: if-then, repeat-until, ..., cases, and one marked “??”]
Hierarchical measures (again)
• Define measure for each prime flowgraph
• Define measure for sequencing
• Define measure for nesting
Eg. number of nodes:
nd(P) = #nodes in P, for each prime P
$nd(F_1; \ldots; F_n) = \sum_{i=1}^{n} nd(F_i) - n + 1$
$nd(F(F_1, \ldots, F_n)) = nd(F) + \sum_{i=1}^{n} nd(F_i)$
Example: Structuredness
• Whether a program is structured can be seen as a measure as follows:
str(P) = 1 if P is one of the allowed primes
0 otherwise
str(F1;...;Fn) = min(str(F1),...,str(Fn))
str(F(F1,...,Fn)) = min(str(F),str(F1),...,str(Fn))
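The three rules for str can be read as a recursive walk over a decomposition tree. The sketch below is one possible encoding, not something given in the slides: the classes `Prime`, `Seq` and `Nest`, the function name `structured`, and the contents of `ALLOWED_PRIMES` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Union

# Assumed set of allowed primes; the slides do not list them.
ALLOWED_PRIMES = {"statement", "if-then", "if-then-else", "while-do", "repeat-until"}


@dataclass
class Prime:
    name: str


@dataclass
class Seq:            # F1; ...; Fn
    parts: List["Flowgraph"]


@dataclass
class Nest:           # F(F1, ..., Fn)
    outer: Prime
    inner: List["Flowgraph"]


Flowgraph = Union[Prime, Seq, Nest]


def structured(g):
    """Return 1 if every prime in the decomposition is allowed, else 0."""
    if isinstance(g, Prime):
        return 1 if g.name in ALLOWED_PRIMES else 0
    if isinstance(g, Seq):
        return min(structured(part) for part in g.parts)
    # Nesting: the outer prime and every nested flowgraph must be structured.
    return min([structured(g.outer)] + [structured(part) for part in g.inner])


good = Seq([Prime("statement"), Nest(Prime("while-do"), [Prime("if-then")])])
bad = Nest(Prime("while-do"), [Prime("two-exit-loop")])  # hypothetical disallowed prime
print(structured(good), structured(bad))
```

Because min is used at every level, a single disallowed prime anywhere in the tree forces the whole measure to 0.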
Linearly Independent Paths
• The vector representation of a path is a vector which counts the number of occurrences of each edge.
• A set of paths is l.i. if none can be represented as a linear combination of the others (in the vector representation).
First number each edge
[Figure: flowgraph with its edges numbered 1 to 12]
A path can be represented as a vector counting edges visited
A = (1,0,1,0,1,1,0,1,0,0,0,1)
B = (1,0,1,0,1,0,1,0,0,0,1,1)
C = (1,0,1,0,1,0,1,0,1,1,1,1)
D = (0,1,0,1,1,1,0,1,0,0,0,1)
Now can add and subtract vectors:
Eg. D-A = (-1,1,-1,1,0,0,0,0,0,0,0,0)
So E = B+D-A = (0,1,0,1,1,0,1,0,0,0,1,1)
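Linear independence of path vectors can be checked mechanically by computing the rank of the matrix whose rows are the vectors. The sketch below uses exact rational Gaussian elimination from the standard library; the function name `rank` is an assumption, and the four vectors are the ones listed above, taken in order as A, B, C and D.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


A = (1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1)
B = (1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1)
C = (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1)
D = (0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)
E = tuple(b + d - a for a, b, d in zip(A, B, D))

print(rank([A, B, C, D]))     # 4: A, B, C, D are linearly independent
print(rank([A, B, D, E]))     # 3: E is a combination of A, B and D
```

A set of k paths is linearly independent exactly when the rank equals k.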
How do we find test sets?
• Given a test strategy it is not easy to find test cases that exercise the required paths
– Even for Statement Coverage some parts of the code may be unreachable
– A single path can achieve Branch Coverage for: while(...) do “some complex program” but unlikely to be possible in practice
Domain Partitioning
What have we been doing?
• Partitioning input space according to some property
• Selecting Test case inputs which are representatives of each partition – Eg to ensure different paths executed
• Assuming behaviour similar for all values of partition
Boundary Value Analysis
• Also important to test software at the boundaries of the partitions.
– Less than (or equal)?
– length of list (or n-1)?
– closure reversal (“not <” is not “>”)?
• How do we identify boundaries?
Single variable case
• Open and closed intervals
[min, max]                  Both ends closed
[min, max) or (min, max]    Half open
(min, max)                  Both ends open
[Figure: number line divided into partitions P1, P2, P3]
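For a single integer variable, one way to probe a partition's boundaries is to test each end point and its immediate neighbours, recording whether each value is expected to fall inside the partition. This is a minimal sketch under that assumption; the function names and the step size of 1 are illustrative choices, not part of the slides.

```python
def in_interval(x, lo, hi, lo_closed=True, hi_closed=True):
    """Membership test for an interval that may be open or closed at each end."""
    above = x >= lo if lo_closed else x > lo
    below = x <= hi if hi_closed else x < hi
    return above and below


def boundary_cases(lo, hi, lo_closed=True, hi_closed=True):
    """Values at and next to each boundary, paired with expected membership."""
    values = sorted({lo - 1, lo, lo + 1, hi - 1, hi, hi + 1})
    return [(v, in_interval(v, lo, hi, lo_closed, hi_closed)) for v in values]


# Half-open interval [1, 10): 1 is inside, 10 is not.
for value, inside in boundary_cases(1, 10, lo_closed=True, hi_closed=False):
    print(value, inside)
```

A "less than" written where "less than or equal" was intended shows up as a wrong answer at exactly one of these values.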
Multiple variables
• Input domains are multi-dimensional
• Boundaries are hyperplanes
• Can be open or closed at each intersection
[Figure: two-dimensional input domain showing an open boundary, a closed boundary, an on point, an off point and an extreme point]
Finding Test Cases
• CFGs model software
• Test strategy to select paths to test
• Data flow Analysis to choose “best” test paths
• Now need to find test inputs which exercise those paths
Example
• Find All DU paths for example program
• Find test cases which execute the paths
smallest(int p) (*p>2*)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}
[Figure: program CFG with nodes 1 to 8, annotated with definitions (d) and uses (u) of p and q]

All DU paths (ADUP)
p: 123, 12343, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356

ADUP (second listing)
p: 123, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356

Subpaths (after        100% coverage         Test input                Test output
subsumption)
12357, 1234357         123578, 12343578      p=3, p=5                  3 is prime, 5 is prime
2356                   123568                p=4,6,8...                2 is sm fact
434                    12343435_8            p=4,8,12... 9,10,..15     11 is prime
4356                   12343568              p=9,15,21..               3 is sm fact
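To check which path a given input actually exercises, the example program can be instrumented to record the CFG nodes it visits. The sketch below is a Python transcription of smallest under two assumptions: the node numbering (1 entry, 2 `q = 2`, 3 while test, 4 `q := q+1`, 5 if test, 6 and 7 the two prints, 8 exit) is inferred from the paths listed above, and `q < sqrt p` is written as `q * q < p` to stay in integer arithmetic.

```python
def smallest_trace(p):
    """Run smallest(p) and return (path through the CFG, printed output)."""
    path = [1, 2]
    q = 2
    path.append(3)                      # first evaluation of the while test
    while p % q > 0 and q * q < p:
        path.append(4)                  # loop body: q := q + 1
        q += 1
        path.append(3)                  # re-evaluate the while test
    path.append(5)                      # the if test
    if p % q == 0:
        path.append(6)
        output = f"{q} is factor"
    else:
        path.append(7)
        output = f"{p} is prime"
    path.append(8)
    return "".join(str(n) for n in path), output


for p in (3, 5, 4, 9, 11):
    print(p, *smallest_trace(p))
```

Running a candidate input through such a trace confirms whether it really follows the path it was chosen for.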
How were test cases found?
• Required outcome at each predicate node
• Consider all requirements together
• Guess a value that will satisfy them
• Can we improve on this?
Symbolic Execution
• How to find test inputs to exercise a path?
– Need certain choice at each predicate node
– Give a symbolic value to each variable
– Walk the path collecting requirements on symbolic input
• Then have a set of inequalities to solve
• Example: Find test cases for each path by symbolic execution:
smallest(p)
{
  int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q,’is factor’)
    else print(p,’is prime’);
}
Path 123578
p = X throughout; q = Y initially, then 2

Step        Conditions                      Candidates
while (F)   X mod 2 = 0 OR 2 >= sqrt X      X=4,6,8,... or X=3,4
if (F)      X mod 2 > 0                     X=3,5,7,...

Solutions: X=3
Path 12343578
p = X throughout; q = Y initially, then 2, then 3

Step        Conditions                      Candidates
while (T)   X mod 2 > 0                     X=3,5,7,...
            2 < sqrt X                      X=5,6,7..
while (F)   X mod 3 = 0 OR 3 >= sqrt(X)     X=3,6,9.. or X=3,4..9
if (F)      X mod 3 > 0                     X=4,5,7,8,..

Solutions: X=5,7
Output (X is prime): 5 is prime, 7 is prime
Path 123568
p = X throughout; q = Y initially, then 2

Step        Conditions                      Candidates
while (F)   X mod 2 = 0 OR 2 >= sqrt X      X=4,6,8,.. or X=3,4
if (T)      X mod 2 = 0                     X=4,6,8,..

Solutions: X=4,6,8..
Output: 2 is sm fact
Path 12343568
p = X throughout; q = Y initially, then 2, then 3

Step        Conditions                      Candidates
while (T)   X mod 2 > 0                     X=3,5,7..
            2 < sqrt X                      X=5,6,7..
while (F)   X mod 3 = 0 OR 3 >= sqrt(X)     X=3,6,9.. or X=3,4..9
if (T)      X mod 3 = 0                     X=3,6,9..

Solutions: X=9,15,21..
Output: 3 is sm fact
Path 12343435_8
p = X throughout; q = 2, then 3, then 4

Step        Conditions                Candidates        Solutions so far
while (T)   X mod 2 > 0               X=3,5,7..
            2 < sqrt X                X=5,6,7..         5,7,9,11,13..
while (T)   X mod 3 > 0               X=4,5,7,8..       5,7,11,13,17..
            3 < sqrt X                X=10,11,12..      11,13,17,19..
while (F)   X mod 4 = 0               X=4,8,12..        none from this
            OR 4 >= sqrt(X)           X=3,4..16         11,13
if (_)      X mod 4 ? 0               X=.....           must be false

Solutions: X=11,13
Output: 11 is prime, 13 is prime
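The hand derivations above can be mimicked by collecting one predicate over the symbolic input X per decision on the path and then searching for values that satisfy all of them. In the sketch below the search is a brute-force scan over a small range, which stands in for a real constraint solver; `path_conditions`, `solve` and the `limit` parameter are hypothetical names, and `q < sqrt X` is again written as `q * q < X`.

```python
def path_conditions(while_outcomes, if_outcome):
    """Predicates on X for a path given by the outcome at each decision.

    while_outcomes lists the outcome of each evaluation of the loop test,
    e.g. [True, False] for path 12343...; if_outcome is the outcome at node 5.
    q does not depend on the input here, so it stays concrete.
    """
    conditions = []
    q = 2
    for taken in while_outcomes:
        def loop_test(x, q=q):
            return x % q > 0 and q * q < x
        if taken:
            conditions.append(loop_test)
            q += 1
        else:
            conditions.append(lambda x, test=loop_test: not test(x))
    if if_outcome:
        conditions.append(lambda x, q=q: x % q == 0)
    else:
        conditions.append(lambda x, q=q: x % q != 0)
    return conditions


def solve(while_outcomes, if_outcome, limit=50):
    """Inputs X with 2 < X < limit that satisfy every condition on the path."""
    conditions = path_conditions(while_outcomes, if_outcome)
    return [x for x in range(3, limit) if all(c(x) for c in conditions)]


print(solve([False], False))              # path 123578
print(solve([True, False], False))        # path 12343578
print(solve([True, True, False], True))   # path 1234343568: infeasible
print(solve([True, True, False], False))  # path 1234343578
```

An empty result within the scanned range only suggests infeasibility; for the third call the contradiction (X odd and X mod 4 = 0) makes it certain.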
Difficulties with Symbolic Execution
• Generally, many paths are not feasible
• Conditions can become complex:
– when complex expressions on rhs of assignments
– then program variables are complex expressions in terms of the symbolic vars
• Sets of conditions can be computationally complex to solve
Possible Solutions
• Computational Complexity:
– Use numerical methods to calculate the tests
  • Straight line equivalents
  • Program Instrumentation
– Adaptive testing (later)
• Complex predicates
– Condition/Decision strategies (later)
• Many Infeasible paths
– Adaptive testing (later)
Straight Line equivalents
• Construct the “straight line” program corresponding to the path required.
• replace predicates with path constraints
– a real valued expression which records the requirement as a minimisation
• Solve the path constraints using numerical methods
Path Constraints
• Eg. if(x = y) is replaced by
c1 := abs(x-y)
• and if(x>y) is replaced by
c2 := x-y
• Then we must minimise the ci
• Can use numerical methods to do this
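As an illustration of the equality case only, the sketch below builds the constraint c1 = abs(x-y) for a hypothetical straight-line program (x := 3a - 7; y := a + 5, with a the input) and minimises it with a very simple integer local search. Both the program and the search routine are assumptions made for the example; a practical tool would use a proper numerical method.

```python
def c1(a):
    """Path constraint for `if (x = y)` in a hypothetical straight-line program."""
    x = 3 * a - 7
    y = a + 5
    return abs(x - y)


def minimise(cost, start, max_steps=1000):
    """Move to the better neighbouring integer until no neighbour improves."""
    a = start
    for _ in range(max_steps):
        best = min((a - 1, a, a + 1), key=cost)
        if cost(best) >= cost(a):
            break
        a = best
    return a


a = minimise(c1, start=0)
print(a, c1(a))   # c1 reaches 0, so this input satisfies x = y
```

When the minimum found is 0 the input drives execution down the required branch; a non-zero minimum means either the search stalled or the path is infeasible.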
Program instrumentation
• generally - a method to allow testing of a unit in place by augmenting program
• Here - add function calls which record value of key variables
• replace predicates with calls which guarantee correct path is taken
• run program to generate conditions
• Again use numerical methods to solve
Conditions and Decisions
• Above strategies do not take account of predicates with more than one conjunct
• There are more strategies which distinguish
– Conditions - the individual clauses of predicate, from
– Decisions - the outcome of evaluating the whole predicate
Condition Coverage
• Achieve all possible combinations of simple Boolean conditions at each decision node
• In critical real-time applications over half of statements may be Boolean expressions
• Several variants of strategies which account for individual conditions
Example Condition Strategies
• Decision coverage (DC)
– every decision tested in each possible outcome
• Condition/Decision coverage (C/DC)
– as above plus, every condition in each decision tested in each possible outcome
• Modified Condition/Decision (MC/DC)
– as above plus, every condition shown to independently affect a decision outcome (by varying that condition only)
• Multiple-condition coverage (M-CC)
– all possible combinations of conditions within each decision taken
Modified Condition/Decision Coverage
• Multiple-condition coverage is strongest but grows exponentially in # conditions
• Modified C/D is linear like C/D
• Eg. For A and B
– (T,T) required to exercise decision true
– (F,T) required for independence of A
– (T,F) required for independence of B
– (F,F) not required
• MC/DC (among others) is required for flight-critical commercial avionics software
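The independence requirement can be checked by enumeration: for each condition, look for two assignments that differ only in that condition and give different decision outcomes. The sketch below does this for a decision passed in as a Python function; `independence_pairs` is a hypothetical helper name, and exhaustive enumeration is used only because the example has two conditions.

```python
from itertools import product


def independence_pairs(decision, n):
    """For each condition index, the pairs of assignments that differ only in
    that condition and flip the decision outcome."""
    pairs = {i: [] for i in range(n)}
    for values in product([True, False], repeat=n):
        for i in range(n):
            if not values[i]:
                continue  # visit each unordered pair once, from its True side
            flipped = values[:i] + (False,) + values[i + 1:]
            if decision(*values) != decision(*flipped):
                pairs[i].append((values, flipped))
    return pairs


pairs = independence_pairs(lambda a, b: a and b, 2)
needed = {assignment for plist in pairs.values() for pair in plist for assignment in pair}
print(pairs)
print(sorted(needed, reverse=True))
```

For `A and B` the only pairs are (T,T)/(F,T) for A and (T,T)/(T,F) for B, so three tests suffice and (F,F) is not needed.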
Further Problems with Symb. Ex.
• When loop conditions are input dependent
• When array indices are input dependent
• When external functions are called
Adaptive Testing
The above approach has been in 4 stages:
1) Construct the control flow graph
– a parsing problem - automatable
– can also add “instrumentation” here
2) Choose the test paths
– According to some test strategy
– CFG - possibly with data flow considerations
Four stages (cont.)
3) Choose the test cases
– by symbolic execution and simultaneous inequalities
  • or by backwards substitution
– can reveal infeasible paths, requiring reverting to stage 2
4) Execute the test cases
– Only now do we execute the program
• Adaptive testing merges stages 2), 3) and 4)
Problems with 4-stage approach
• Infeasible paths (stage 3) require selection of new paths (return to stage 2)
• Computational complexity of test case selection
Adaptive testing develops test cases one at a time and uses result of previous test case execution to help select next test case
Inductive Strategies
• Choose first test input x1 (perhaps at random)
• Execute test and record path taken, p1
• Say k-1 tests have been done giving {(x1,p1),...,(xk-1,pk-1)}
• Use some strategy to select xk
Several such strategies exist.
Diagonalisation
Important “method” in Mathematics:
• Cantor’s uncountability of Reals
• Gödel’s Incompleteness
• Undecidability of Halting problem
For list of lists, find a new list by choosing an element different from each on the diagonal
A11, A12, A13, ...
A21, A22, A23, ...
A31, A32, A33, ...
...
New = B1, B2, B3, ...
where B1 ≠ A11, B2 ≠ A22, B3 ≠ A33, ...
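For finite lists of bits the construction is a one-liner: flip the i-th element of the i-th list. The sketch below assumes a square table of 0/1 values; the function name `diagonal` and the sample table are illustrative.

```python
def diagonal(lists):
    """Build a list that differs from the i-th list in position i."""
    return [1 - row[i] for i, row in enumerate(lists)]


table = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
new = diagonal(table)
print(new)                                 # [1, 0, 1]
print(all(new != row for row in table))    # True: new is not in the table
```

The new list cannot equal any row, because it disagrees with row i at position i by construction.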
Diagonalisation (2)
• Each path pi gives a conjunctive predicate Pi
• These predicates characterise a set of non-overlapping subdomains of the input space
• We must find a new input xk not in any Pi
• Let Pi be conjunction of Ci,1,Ci,2,...Ci,ki
• For each i, choose xk to violate some Ci,j – eg. xk not in Ci,i
Path Prefix Strategy
[Prather and Myers, IEEE Trans. SE-13(7) 1987]
For Branch coverage
• For a path p, define its reversible prefix q
– the initial portion of p to the first decision node where the branches are not yet fully covered
• A reversal of p is then any path with same reversible prefix but then a different continuation
Path Prefix Strategy (2)
• Choose first input in some way and execute to give first path, p1
• Given p1,...,pk-1, let pi be path with shortest reversible prefix
• Choose next input to give a reversal of pi
• Execute and add the new path to set of paths
Path Prefix: earlier example
• Choose first input p = 3 (say)
– execution gives path p1 = 123578
– Reversible prefix = 123, Reversal = 1234....
• Deduce second input, p = 5
– execution gives path p2 = 12343578
– reversible prefix 123435
– path p1 also now has reversible prefix 1235
– choose the shorter, that of p1; Reversal = 12356
• Deduce 3rd input, p = 4
– execution gives path p3 = 123568
• All branches covered
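The reversible prefixes in this example can be recomputed from the set of paths executed so far. The sketch below assumes the decision nodes of the smallest CFG are 3 (branches to 4 and 5) and 5 (branches to 6 and 7), as inferred from the paths above; the `DECISIONS` table and both function names are hypothetical.

```python
# Decision nodes of the example CFG and their successor nodes (assumed).
DECISIONS = {"3": {"4", "5"}, "5": {"6", "7"}}


def covered_branches(paths):
    """All (node, successor) pairs exercised by the executed paths."""
    return {(a, b) for path in paths for a, b in zip(path, path[1:])}


def reversible_prefix(path, paths):
    """Initial portion of path up to the first decision node that still has an
    uncovered branch, or None if every decision on the path is fully covered."""
    covered = covered_branches(paths)
    for i, node in enumerate(path):
        targets = DECISIONS.get(node)
        if targets and any((node, t) not in covered for t in targets):
            return path[: i + 1]
    return None


p1, p2, p3 = "123578", "12343578", "123568"
print(reversible_prefix(p1, [p1]))          # 123
print(reversible_prefix(p1, [p1, p2]))      # 1235
print(reversible_prefix(p2, [p1, p2]))      # 123435
print(reversible_prefix(p1, [p1, p2, p3]))  # None: all branches covered
```

Choosing the path with the shortest prefix at each step keeps the part of the program that must be re-derived as small as possible.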
Problems with Path prefix
• Still need to deduce input for new path
– the inversion problem (later)
• Still may get infeasible paths
– absolute infeasibility - a path can never be executed
– relative infeasibility - a path cannot be the continuation of any of the current reversible prefixes
Example of relative infeasibility
simple(bool x, y)
if(x = true) then S1 else S2;
if(x xor y = true) then S3 else S4;
if(x and y = true) then S5 else S6;

in1 = (false,false)   p1 = F,F,F
reverse at 1 gives: in2 = (true,false)   p2 = T,T,F
reverse p1 at 2 gives F,F,T - infeasible
reverse p2 at 2 gives T,T,T - infeasible
but T,F,T is feasible, eg in3 = (true,true)

[Figure: three conditionals in sequence, with points 1, 2, 3 marked]
Conditionals in sequence:
- # paths to node grows exponentially
- # previous nodes grows linearly
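With only two Boolean inputs, the feasible decision sequences of simple can be listed exhaustively, which confirms which reversals are infeasible. The sketch below returns the outcome of each of the three tests as a tuple; representing a path by that tuple is an assumption that matches the F,F,F notation used above.

```python
from itertools import product


def simple(x, y):
    """Outcomes of the three decisions in simple(x, y)."""
    return (x, x != y, x and y)   # x != y is xor for Booleans


# Map each feasible decision sequence to an input that produces it.
feasible = {simple(x, y): (x, y) for x, y in product([False, True], repeat=2)}

for outcome, inputs in feasible.items():
    print(outcome, "from", inputs)

print((False, False, True) in feasible)   # False: reversal of p1 is infeasible
print((True, True, True) in feasible)     # False: reversal of p2 is infeasible
print(feasible[(True, False, True)])      # (True, True)
```

Only four of the eight decision sequences are feasible, so the T branch of the third test is reachable, but not as a continuation of either current prefix.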
The Inversion Problem
• How do we find the input which reverses the decision at Pk ?
[Figure: input domain D; within the region P1&...&Pk-1, the boundary between Pk and not Pk separates input x from input x’]
The Inversion Problem (2)
• Need to find x’ given x
• Done by Back Substitution
• execute with x recording all states for prefix
• pick change of a variable to change Pk
• substitute back through program logic to calculate required input
– same as for 4 step approach but with actual values
– For real-valued conditions can use grad(Pk) to cross boundary via normal
Advantages of adaptive approach
• Informal common sense tells us:
– Change only one thing at a time
– Exploit nearness of previous test cases to the required path
• Formal analysis gives us:
– overall complexity of adaptive approach is less than 4 stage approach [Myers, SEJ 7(1) 1992]
References
• ADTEST, Gallagher and Narasimhan, IEEE Trans. SE-23(8), 1997.
• Symbolic Execution, Girgis, SEJ 7(4), 1992.
• Instrumentation, Luo, Probert and Ural, Software Engineering Journal (SEJ) 10(6), 1995.
• Path Prefix, Prather and Myers, IEEE Trans. SE-13(7), 1987.
• Complexity of adaptive, Myers, SEJ 7(1), 1992.
• MC/DC, Chilenski and Miller, SEJ 9(5), 1994.