Testing and Debugging

Some Terminology

Validation: The process of increasing confidence that a program functions the way we intend it to. Done through:
  Testing
  Formal reasoning about the program
Debugging: Figuring out why a program is not functioning properly.
Defensive programming: Writing programs to make validation and debugging easy.

Verification vs. Validation

Verification: Careful, formal reasoning about the program.
  Often done with machine aids:
    Theorem provers
    Model checkers
    Static checkers
    …
  Still at the research stage: tools can handle only limited, small programs.
Testing: Apply inputs, see if the output is as expected.
  Exhaustive testing is impossible.
  Must choose the set of inputs carefully.

Black-Box Testing

Black box: Don’t look inside the program.
  Generate tests based on the spec alone.
  No regard for the internal structure of the program.
Advantages:
  Test writer is not influenced by the program implementation.
  Test suite need not be modified if the program implementation is changed.
  Test results can be interpreted by people who don’t understand the program (e.g., QA analysts).

Testing Paths through the Specification

A good way to do black-box testing: explore paths (combinations of entry and exit conditions) through the requires and effects clauses.

static float sqrt (float x, float epsilon)
// REQUIRES: x >= 0 && .00001 < epsilon < .001
// EFFECTS: Returns sq such that
//   x - epsilon <= sq*sq <= x + epsilon.

Two paths:
  1. x = 0 && .00001 < epsilon < .001
  2. x > 0 && .00001 < epsilon < .001
Must test both.
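The two paths through the sqrt spec can be turned directly into test cases. The sketch below is illustrative only: the Newton's-method body of sqrt and the helper meetsSpec are assumptions made so the tests have something to run against, and any implementation satisfying the spec could be substituted.

```java
class SqrtSpecTest {
    // Hypothetical implementation under test (Newton's method, computed in double).
    static float sqrt(float x, float epsilon) {
        if (x == 0) return 0;
        double guess = Math.max(x, 1.0);           // start at or above the true root
        while (Math.abs(guess * guess - x) > epsilon / 2) {
            guess = (guess + x / guess) / 2;        // Newton step
        }
        return (float) guess;
    }

    // Checks the EFFECTS clause: x - epsilon <= sq*sq <= x + epsilon.
    static boolean meetsSpec(float x, float epsilon) {
        double sq = sqrt(x, epsilon);
        return x - epsilon <= sq * sq && sq * sq <= x + epsilon;
    }

    public static void main(String[] args) {
        System.out.println("x = 0 path ok: " + meetsSpec(0f, 0.0001f));
        System.out.println("x > 0 path ok: " + meetsSpec(2f, 0.0001f));
    }
}
```

The check is written against the EFFECTS clause only, so it stays valid if the implementation is replaced.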

Testing Paths through the Specification

Must test both cases:

static boolean isPrime (int x)
// EFFECTS: If x is a prime returns true else returns false.

Must test correct error handling: exercise all exceptions.

static int search (int[ ] a, int x) throws NotFoundException, NullPointerException
// EFFECTS: If a is null throws NullPointerException
//   else if x is in a, returns i such that a[i] = x,
//   else throws NotFoundException.
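A minimal sketch of tests that exercise every outcome named in the search spec. The linear-search body, the nested NotFoundException class, and the outcome helper are assumptions introduced for illustration; only the method signature and the EFFECTS clause come from the spec.

```java
class SearchSpecTest {
    // Hypothetical checked exception, named after the one in the spec.
    static class NotFoundException extends Exception {}

    // Hypothetical linear-search implementation of the spec.
    static int search(int[] a, int x) throws NotFoundException {
        if (a == null) throw new NullPointerException("search");
        for (int i = 0; i < a.length; i++) {
            if (a[i] == x) return i;
        }
        throw new NotFoundException();
    }

    // Runs one test case and reports which path through the spec was taken.
    static String outcome(int[] a, int x) {
        try {
            int i = search(a, x);
            return a[i] == x ? "found" : "wrong index";
        } catch (NullPointerException e) {
            return "NullPointerException";
        } catch (NotFoundException e) {
            return "NotFoundException";
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(null, 1));
        System.out.println(outcome(new int[] {4, 7}, 7));
        System.out.println(outcome(new int[] {4, 7}, 5));
    }
}
```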

Testing Paths through the Specification

Programs must be tested for typical values: values between the smallest and largest expected by a program.
Corner cases: testing boundary conditions
  Smallest, largest values
  Empty sets, empty Vectors
  Null pointers
These tests catch:
  Logical errors: special-case handling is omitted by mistake.
  Conditions that cause the language or hardware system to raise an exception. Examples:
    Arithmetic overflow
    Divide by zero
    …

Testing Paths through the Specification

static void appendVector (Vector v1, Vector v2) throws NullPointerException
// MODIFIES: v1 and v2
// EFFECTS: If v1 or v2 is null throws NullPointerException else
//   removes all elements of v2 and appends them in reverse
//   order to the end of v1.

static void appendVector (Vector v1, Vector v2) {
    if (v1 == null || v2 == null)
        throw new NullPointerException("Vectors.appendVector");
    while (v2.size( ) > 0) {
        v1.addElement(v2.lastElement( ));
        v2.removeElementAt(v2.size( ) - 1);
    }
}

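Aliasing Errors

What if v1 and v2 are the same?

appendVector(v,v)

When v1 and v2 refer to the same non-empty Vector, each iteration appends a copy of the last element and then removes it again, so the Vector never shrinks and the loop never terminates.

static void appendVector (Vector v1, Vector v2) {
    if (v1 == null)
        throw new NullPointerException("Vectors.appendVector");
    while (v2.size( ) > 0) {
        v1.addElement(v2.lastElement( ));
        v2.removeElementAt(v2.size( ) - 1);
    }
}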
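The aliasing case can be tested without hanging the test run. The sketch below repeats the appendVector loop but adds a step limit; the limit, the boolean return value, and the name appendVectorBounded are assumptions added for the demonstration and are not part of the original procedure.

```java
import java.util.Arrays;
import java.util.Vector;

class AliasingDemo {
    // Same loop as appendVector, plus a hypothetical step limit so the demo always stops.
    // Returns true if the loop finished on its own, false if the limit was hit.
    static boolean appendVectorBounded(Vector<Integer> v1, Vector<Integer> v2, int maxSteps) {
        if (v1 == null || v2 == null) throw new NullPointerException("Vectors.appendVector");
        int steps = 0;
        while (v2.size() > 0) {
            if (steps++ >= maxSteps) return false;
            v1.addElement(v2.lastElement());
            v2.removeElementAt(v2.size() - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        Vector<Integer> a = new Vector<>(Arrays.asList(1));
        Vector<Integer> b = new Vector<>(Arrays.asList(2, 3));
        System.out.println("distinct vectors finished: " + appendVectorBounded(a, b, 100));

        Vector<Integer> v = new Vector<>(Arrays.asList(1, 2));
        System.out.println("aliased vectors finished: " + appendVectorBounded(v, v, 100));
    }
}
```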

Glass-Box Testing

Black-box testing is the best place to start, but it is not enough:
  Difficult to know which tests give new information without looking at the program’s internal structure.
  Implementation may distinguish between cases that the spec doesn’t.
Important concept: Coverage
  Line coverage
  Branch coverage
  Path coverage
  …
Ideal: Full path coverage
  Often impossible
  False paths

static int maxOfThree (int x, int y, int z) {
    if (x > y)
        if (x > z) return x; else return z;
    if (y > z) return y; else return z;
}

Glass-Box Testing

Ideal: Full path coverage
  Often impossible
  And not enough!!

static int maxOfThree (int x, int y, int z) {
    return x;
}

Example:
  Input: x=3, y=2, z=1
  Output looks correct
  Gets 100% path coverage
Problem: Missing paths are not revealed. A common error!

So, no testing based purely on the program text is good enough. Spec-based testing and glass-box testing must complement each other.
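The point can be checked concretely. In the sketch below, maxOfThree is the four-path version and brokenMaxOfThree is the one-line version; the class name and the name brokenMaxOfThree are labels chosen for this illustration. One input per path covers the correct version, while the single input that gives the broken version full path coverage fails to expose it.

```java
class MaxOfThreeCoverage {
    // Four paths: (x>y, x>z), (x>y, x<=z), (x<=y, y>z), (x<=y, y<=z).
    static int maxOfThree(int x, int y, int z) {
        if (x > y) {
            if (x > z) return x; else return z;
        }
        if (y > z) return y; else return z;
    }

    // The faulty version: a single path, so one test gives 100% path coverage.
    static int brokenMaxOfThree(int x, int y, int z) {
        return x;
    }

    public static void main(String[] args) {
        // Glass-box view: full path coverage of the broken version, and the test passes.
        System.out.println(brokenMaxOfThree(3, 2, 1) == 3);
        // Spec-based view: a case the broken program text never hints at.
        System.out.println(brokenMaxOfThree(1, 2, 3) == 3);
    }
}
```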

Path coverage

j = k;
for (int i = 1; i <= 100; i++)
    if (Tests.pred(i*j)) j++;

2^100 paths!

What to do? Select a small, representative subset:

j = k;
for (int i = 1; i <= 2; i++)
    if (Tests.pred(i*j)) j++;

Shoot for branch coverage.
  Two iterations is a good rule of thumb.
  For recursive procedures, select tests with
    No recursive calls
    One recursive call
Possible exceptions are like branches:

int x = a[0];

Must try to exercise cases where
  Exception is raised
  Exception is not raised

The palindrome procedure

static boolean palindrome (String s) throws NullPointerException {
    // EFFECTS: If s is null throws NullPointerException, else
    //   returns true if s reads the same forward and backward; else
    //   returns false.
    //   E.g., "deed" and "" are both palindromes.
    int low = 0;
    int high = s.length( ) - 1;
    while (high > low) {
        if (s.charAt(low) != s.charAt(high)) return false;
        low++;
        high--;
    }
    return true;
}

What to test?
  NullPointerException
  Not executing the loop
  Return true in first iteration
  Return false in first iteration
  Return true in second iteration
  Return false in second iteration

Example test set: "", "d", "deed", "ceed"
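A sketch of the example test set as executable checks. The class name and the throwsOnNull helper are assumptions; the input "dead" is an addition to the listed set, included here only to reach the return-false-in-second-iteration case.

```java
class PalindromeTest {
    static boolean palindrome(String s) {
        int low = 0;
        int high = s.length() - 1;       // throws NullPointerException if s is null
        while (high > low) {
            if (s.charAt(low) != s.charAt(high)) return false;
            low++;
            high--;
        }
        return true;
    }

    // Hypothetical helper: true if palindrome(null) raises NullPointerException.
    static boolean throwsOnNull() {
        try {
            palindrome(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"", "d", "deed", "ceed", "dead"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> " + palindrome(s));
        }
        System.out.println("null raises NullPointerException: " + throwsOnNull());
    }
}
```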

Testing Data Abstractions

Similar to testing procedures. Consider:
  Paths through specs of operations
  Paths through implementations of operations
Must test operations as a group: the output or result of one is the input to another.
repOK is very important. Must call it after each operation:
  Constructors
  Other methods
Do case analysis on:
  Specs
  Implementations

Partial Specification of the IntSet data abstraction

public class IntSet {
    // OVERVIEW: IntSets are mutable, unbounded sets of integers.
    //   A typical IntSet is {x1, . . . , xn}.

    // constructors
    public IntSet ( )
    // EFFECTS: Initializes this to be empty.

    // methods
    public void insert (int x)
    // MODIFIES: this
    // EFFECTS: Adds x to the elements of this, i.e.,
    //   this_post = this + { x }.

    public void remove (int x)
    // MODIFIES: this
    // EFFECTS: Removes x from this, i.e., this_post = this - { x }.

    public boolean isIN (int x)
    // EFFECTS: If x is in this returns true else returns false.

    public int size ( )
    // EFFECTS: Returns the cardinality of this.

    public Iterator elements ( )
    // EFFECTS: Returns a generator that produces all the elements of
    //   this (as Integers), each exactly once, in arbitrary order.
    // REQUIRES: this must not be modified while the generator is in use.

Partial Implementation of the IntSet data abstraction

public class IntSet {
    private Vector els; // the rep

    public IntSet ( ) { els = new Vector ( ); }

    public void insert (int x) {
        Integer y = new Integer(x);
        if (getIndex(y) < 0) els.add(y);
    }

    public void remove (int x) {
        int i = getIndex(new Integer(x));
        if (i < 0) return;
        // overwrite the removed slot with the last element, then drop the last slot
        els.set(i, els.lastElement( ));
        els.remove(els.size( ) - 1);
    }

    public boolean isIN (int x) {
        return getIndex(new Integer(x)) >= 0;
    }

    private int getIndex (Integer x) {
        // EFFECTS: If x is in this returns index where x appears else returns -1.
        for (int i = 0; i < els.size( ); i++)
            if (x.equals(els.get(i))) return i;
        return -1;
    }

    public int size ( ) { return els.size( ); }
}
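A sketch of how repOK can be checked after each operation when testing IntSet as a group of operations. The rep invariant used here (els is non-null and holds no duplicates) is an assumption, since the partial implementation does not state one, and the generic Vector<Integer> and indexOf-based getIndex are modernizations of the rep rather than the original code.

```java
import java.util.Vector;

class IntSet {
    private final Vector<Integer> els = new Vector<>();   // the rep

    // Assumed rep invariant: els is non-null and contains no duplicates.
    boolean repOk() {
        if (els == null) return false;
        for (int i = 0; i < els.size(); i++)
            for (int j = i + 1; j < els.size(); j++)
                if (els.get(i).equals(els.get(j))) return false;
        return true;
    }

    void insert(int x) { if (getIndex(x) < 0) els.add(x); }

    void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;
        els.set(i, els.lastElement());    // overwrite the removed slot
        els.remove(els.size() - 1);       // drop the last slot (remove by index)
    }

    boolean isIN(int x) { return getIndex(x) >= 0; }

    int size() { return els.size(); }

    private int getIndex(int x) { return els.indexOf(x); }

    public static void main(String[] args) {
        IntSet s = new IntSet();
        System.out.println("after constructor: " + s.repOk());
        s.insert(3);
        s.insert(3);                       // duplicate insert must not break the invariant
        System.out.println("after inserts: " + s.repOk() + ", size " + s.size());
        s.remove(3);
        System.out.println("after remove: " + s.repOk() + ", size " + s.size());
    }
}
```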

Testing Type Hierarchies

Black-box tests for a subtype must include the black-box tests for its supertype.
General approach to testing a subtype:
  BB tests for supertype
    Calls to subtype constructor
  BB tests for subtype
  Glass-box tests for subtype
Tests for abstract classes are templates: actual calls must be filled in for the subtype.
If method specs have changed, must exercise the new specs, and the corner cases between old specs and new specs.
Need not use the supertype’s glass-box tests for the subtype:
  Either a method is inherited, in which case it’s already tested
  Or it is overridden

Meaning of Subtypes

Substitution principle: Subtypes must support all functionality of supertypes.
More precisely:
  Signature rule: Must have all methods, with compatible signatures.
  Methods rule: Calls of subtype methods must “behave like” calls to the corresponding supertype methods.
  Properties rule: Subtype must preserve all properties that can be proved about supertype objects.
All three rules concern specifications!

Compatible Signatures

Signature rule:
  Subtype must have the same methods with the same signatures.
  Fewer exceptions are OK.
Example:

Object clone ()
Foo x = (Foo) y.clone();

Not OK to return “Foo” instead (this restriction applies to Java before version 5; later versions allow covariant return types):

Foo clone();

The Methods Rule

Cannot be checked by the compiler.
Methods rule: Can reason about a method using the supertype’s spec for it, even though the actual subtype method is running.
Example: for IntSet y, we call y.insert(x)
  x must be in set y afterwards, for all IntSet implementations.
So far, subtype and supertype specs were the same. But a subtype can:
  Weaken the precondition: subtype method works OK in more cases.
  Strengthen the post-condition: subtype method gives more guarantees after completion.
Typically:
  Supertype spec is non-deterministic (more than one OK result).
  Subtype picks one possibility.

Methods Rule Examples

Supertype:

public void addZero ()
// REQUIRES: this is not empty
// EFFECTS: Adds 0 to this

Subtype:

public void addZero ()
// EFFECTS: Adds 0 to this

Example:
  Supertype iterator: Elements in random order
  Subtype iterator: Elements in a particular sorted order

Methods Rule Example

Legal subtype spec for IntSet:

// OVERVIEW: A LogIntSet is an IntSet plus a log. The
//   log is also a set; it contains all the integers that
//   have ever been members of the set.

public void insert (int x)
// MODIFIES: this
// EFFECTS: Adds x to the set and also to the log.

Illegal subtype spec:

public void insert (int x)
// MODIFIES: this
// EFFECTS: If x is odd adds it to this else does
//   nothing.
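One way to see the methods rule at work is to run a black-box test derived from the supertype spec against each subtype. Everything below is a sketch: the HashSet rep, the class names MethodsRuleDemo and OddOnlyIntSet, and the observer wasEverIn are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class MethodsRuleDemo {
    // Minimal stand-in for the supertype; the HashSet rep is chosen only for brevity.
    static class IntSet {
        protected final Set<Integer> els = new HashSet<>();
        public void insert(int x) { els.add(x); }
        public boolean isIN(int x) { return els.contains(x); }
    }

    // Legal subtype: still adds x to the set, and additionally records it in the log.
    static class LogIntSet extends IntSet {
        private final Set<Integer> log = new HashSet<>();
        @Override public void insert(int x) { super.insert(x); log.add(x); }
        public boolean wasEverIn(int x) { return log.contains(x); }   // hypothetical observer
    }

    // Illegal subtype: ignores even values, so the supertype postcondition can fail.
    static class OddOnlyIntSet extends IntSet {
        @Override public void insert(int x) { if (x % 2 != 0) super.insert(x); }
    }

    // Black-box test from the supertype spec: after insert(x), x is in the set.
    static boolean insertPostconditionHolds(IntSet s, int x) {
        s.insert(x);
        return s.isIN(x);
    }

    public static void main(String[] args) {
        System.out.println("IntSet: " + insertPostconditionHolds(new IntSet(), 4));
        System.out.println("LogIntSet: " + insertPostconditionHolds(new LogIntSet(), 4));
        System.out.println("OddOnlyIntSet: " + insertPostconditionHolds(new OddOnlyIntSet(), 4));
    }
}
```

The test is written once against IntSet and reused unchanged, which is also how black-box tests for a supertype carry over to its subtypes.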

Methods Rule Example

Supertype spec:

public void addEl (int x) throws DuplicateException
// MODIFIES: this
// EFFECTS: If x is in “this” throws DuplicateException
//   else adds x to “this”.

Illegal subtype spec:

public void addEl (int x)
// MODIFIES: this
// EFFECTS: If x is not in “this” adds it to “this”.

Satisfies the signature rule, but not the methods rule.

The Properties Rule

Subtype must satisfy all supertype properties.
Example: The rep invariant
  Must reason by induction that the subtype preserves the representation invariant.
  Constructors and methods preserve the invariant.
  This must hold for additional methods and subtype constructors also.

Properties Rule Example

// OVERVIEW: A FatSet is a mutable set of integers whose
//   size is always at least 1.

OK method:

public void removeNonEmpty (int x)
// MODIFIES: this
// EFFECTS: If x is in this and this contains other
//   elements, removes x from this.

Illegal method:

public void remove (int x)
// MODIFIES: this
// EFFECTS: removes x from this

Unit vs. Integration Testing

Unit testing: Individual modules work properly.
Integration testing: Modules put together work properly.
Integration testing is more difficult:
  Intended behavior of the program is more difficult to characterize than that of its parts.
  Errors of specification come up during integration testing.
    Each unit does its part, but the parts put together don’t do the overall job.
    Parts were erroneously specified.

Unit vs. Integration Testing

Why the separation?
Program P calls module Q.
During unit testing, Q and P are tested individually.
  While testing P, we put in a module that “behaves like” Q, maybe giving values by hand.
  Similarly for Q.
When testing P integrated with Q, we use P’s test cases.
If a test fails:
  Either Q is being exercised on behavior not covered earlier,
  Or Q does not behave as was assumed in testing P.
  It is easier to isolate these possibilities.
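A sketch of the arrangement described above: P is first unit-tested against a stand-in that “behaves like” Q with values given by hand, and then the same test case is reused with the real Q. The interface, the pricing example, and all names below are hypothetical and chosen only to make the idea concrete.

```java
import java.util.HashMap;
import java.util.Map;

class StubDemo {
    // What P assumes about Q.
    interface Q {
        int priceOf(String item);
    }

    // Module P: depends on Q only through the interface.
    static class P {
        private final Q q;
        P(Q q) { this.q = q; }
        int total(String... items) {
            int sum = 0;
            for (String item : items) sum += q.priceOf(item);
            return sum;
        }
    }

    // Stand-in for Q during unit testing of P: answers are given by hand.
    static class StubQ implements Q {
        public int priceOf(String item) { return item.equals("apple") ? 3 : 5; }
    }

    // The real Q, used for integration testing.
    static class RealQ implements Q {
        private final Map<String, Integer> prices;
        RealQ(Map<String, Integer> prices) { this.prices = prices; }
        public int priceOf(String item) { return prices.get(item); }
    }

    public static void main(String[] args) {
        Map<String, Integer> prices = new HashMap<>();
        prices.put("apple", 3);
        prices.put("pear", 5);
        System.out.println("unit test of P: " + new P(new StubQ()).total("apple", "pear"));
        System.out.println("integration test: " + new P(new RealQ(prices)).total("apple", "pear"));
    }
}
```

If the integration run disagrees with the unit run, either the real Q is being exercised in a way its own unit tests missed, or the stub encoded a wrong assumption about Q.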

Defensive Programming

Good practice: Check @requires conditions.

static boolean inRange (int[ ] a, int x, int y, int e) throws NullPointerException
// REQUIRES: x <= y
// EFFECTS: If a is null throws NullPointerException
//   else returns true if e is an element of
//   a[x], . . ., a[y].

Good practice: Make conditionals “complete”, i.e., have code for each case.

s = Comm.receive( );
if (s.equals("deliver")) {
    // carry out the deliver request
}
else if (s.equals("examine")) {
    // carry out the examine request
}
else {
    // handle error case
}
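Both practices can be sketched together. The choice of IllegalArgumentException for a violated REQUIRES clause, the handle method that takes the received string as a parameter (in place of Comm.receive), and the returned labels are all assumptions made for illustration.

```java
class DefensiveDemo {
    // Checks the REQUIRES clause explicitly instead of trusting the caller.
    static boolean inRange(int[] a, int x, int y, int e) {
        if (a == null) throw new NullPointerException("inRange");
        if (x > y) throw new IllegalArgumentException("REQUIRES violated: x <= y");
        for (int i = x; i <= y; i++) {
            if (a[i] == e) return true;
        }
        return false;
    }

    // A "complete" conditional: every case, including the unexpected one, has code.
    static String handle(String s) {
        if (s.equals("deliver")) {
            return "delivered";
        } else if (s.equals("examine")) {
            return "examined";
        } else {
            throw new IllegalArgumentException("unknown request: " + s);
        }
    }

    public static void main(String[] args) {
        System.out.println(inRange(new int[] {1, 2, 3}, 0, 1, 2));
        System.out.println(handle("deliver"));
        try {
            handle("bogus");
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```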

Nonexecution-based Testing

Definitions
  Execution-based testing: running test cases against executing code
  Nonexecution-based testing: reviewing code or documents carefully
    Why should we also do this for code?

Nonexecution-based Testing

Underlying principles
  cannot review own work – why?
  team of reviewers – why?
Two types
  walkthroughs
  inspections
Key difference?
  walkthroughs have fewer steps & are less formal
  inspections record detailed data & use it in later phases & projects

Inspections

Five-stage process
  overview
  preparation
  inspection
  rework
  follow-up

How to Conduct a Code Inspection

Preparing for Inspection
To get ready for the inspection, print separate hard copies of the source code for each inspector.
A single code inspector should cover no more than 250 source code lines, including comments.

Inspection overview.
The code author spends 20-40 minutes explaining the general layout of the code to the inspectors.
The inspectors are not allowed to ask questions; the code is supposed to answer them, but this overview is designed to speed up the process.
The author's goal is to stick to the major, important points, and keep it as close to 20 minutes as possible without undercutting the explanation.


Individual inspections.
Each inspector uses a checklist to try to put forward a maximum number of discovered possible defects. This should be done in a single, uninterrupted sitting.
The inspector should have a goal of covering 70-120 source lines of code per hour.
To do the inspection, go through the code line by line, attempting to fully understand what you are reading.
At each line or block of code, skim through the inspection checklist, looking for questions that apply.
For each applicable question, find whether the answer is "yes." A yes answer means a probable defect. Write it down.
You will notice that some of the questions are very low-level and concern themselves with syntactical details, while others are high-level and require an understanding of what a block of code does. Be prepared to change your mental focus.


The meeting.
The meeting is attended by all the code inspectors for that chunk of code.
To be more like a formal inspection, the meeting should have a moderator who is well experienced in Java and in conducting code inspections, and the author of the code should not be present.
To be more like a walkthrough, the moderator may be omitted, or the author may be present, or both.
If the author is present, it is for the purpose of collecting feedback, not for defending or explaining the code.
Each meeting is strictly limited to two hours' duration, including interruptions. This is because inspection ability generally drops off after this amount of time. Strive to stay on task, and do not allow interruptions.
If the group is not finished at the end of two hours, quit. Do not attempt to push ahead. The moderator or note taker should submit the existing notes to the author or maintainer, and the remaining material should be covered in a subsequent meeting.

Rework
The defects list is submitted to the author, or to another assigned individual, for "rework." This can consist of changing code, adding or deleting comments, restructuring or relocating things, etc.
Note that solutions are not discussed at the inspection meeting! Such discussions are neither productive nor necessary in that setting.
If the author/maintainer desires feedback on solutions or improvements, he or she may hold a short meeting with any or all of the inspectors, following the code inspection meeting.


Follow up.
It is the moderator's personal responsibility to ensure all defects have been satisfactorily reworked. An individual is selected for this role at the inspection meeting. The correctness of the rework will be verified either at a short review meeting, or during later inspection stages during the project.

Record keeping.
In order to objectively track success in detecting and correcting defects, one of the by-products of the meeting will be a count of the total number of different types of potential defects noted.

Fault Statistics

Recorded by severity & fault type
Metrics
  fault density (faults / page or faults / KLOC), by severity (major vs. minor) or phase
  fault detection rate (faults detected / hour)
Several uses
  help inspectors focus their inspections
  help warn of problems
    compare to previous products at same stage
    disproportionate # of faults in particular module
    disproportionate # of faults of certain type
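The two metrics are simple ratios. The sketch below shows the arithmetic; the class and method names are hypothetical, and the figures in main are made-up inputs used only to exercise the formulas.

```java
class FaultMetrics {
    // Fault density in faults per KLOC (thousand lines of code).
    static double faultDensityPerKloc(int faults, int linesOfCode) {
        return faults / (linesOfCode / 1000.0);
    }

    // Fault detection rate in faults detected per hour of inspection.
    static double detectionRatePerHour(int faults, double hours) {
        return faults / hours;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 18 faults found in 1500 lines during a 2-hour meeting.
        System.out.println("faults/KLOC: " + faultDensityPerKloc(18, 1500));
        System.out.println("faults/hour: " + detectionRatePerHour(18, 2.0));
    }
}
```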

Success of Inspections

Inspections can find many of the faults
  82%, 70%, 93% of all detected faults (IBM, ’76, ’78, ’86)
Inspections decrease the cost of finding faults
  90% decrease in cost of detecting (switching system, ’86)
  4 major faults, 14 minor faults per 2 hours (JPL, 1990)
    savings of $25,000 per inspection
Inspections can decrease cost of fixing
  # of faults decreased exponentially by phase (JPL, 1992)
Warning: fault statistics must not be used for performance appraisal

Walkthroughs

Team of 4-6 members, chaired by SQA
  spec writer & manager, client, designers, SQA
  more experienced & senior members are better
Distribute info in advance; each reviewer creates
  list of confusing items
  list of items in error
Process
  detect suspected faults, don’t correct them
  document-driven & interactive
  verbalization leads to fault finding, often spontaneously by the presenter
  no performance appraisal

Strengths & Weaknesses of Reviews

Strengths
  effective way of detecting a fault
  faults are detected early in the process
Weaknesses
  if the system has a poor architecture, it is unwieldy to review
    OO paradigm makes this easier – small, independent modules
  must have updated documentation from the previous phase


Types of testing

Defect testing
• Tests designed to discover system defects.
• A successful defect test is one which reveals the presence of defects in a system.
Validation testing
• Intended to show that the software meets its requirements.
• A successful test is one that shows that a requirement has been properly implemented.


Automated static analysis

Static analysers are software tools for source text processing.

They parse the program text and try to discover potentially erroneous conditions and bring these to the attention of the V & V team.

They are very effective as an aid to inspections - they are a supplement to but not a replacement for inspections.


Static analysis checks


LINT static analysis


Use of static analysis

Particularly valuable when a language such as C is used, which has weak typing, so many errors go undetected by the compiler.

Less cost-effective for languages like Java that have strong type checking and can therefore detect many errors during compilation.


Software testing


Use cases

Use cases can be a basis for deriving the tests for a system. They help identify operations to be tested and help design the required test cases.

From an associated sequence diagram, the inputs and outputs to be created for the tests can be identified.


Collect weather data sequence chart

[Figure: sequence diagram. Participants: :CommsController, :WeatherStation, :WeatherData. Messages shown: request (report), acknowledge (), report (), summarise (), send (report), reply (report), acknowledge ().]


Performance testing

Part of release testing may involve testing the emergent properties of a system, such as performance and reliability.

Performance tests usually involve planning a series of tests where the load is steadily increased until the system performance becomes unacceptable.


Stress testing

Exercises the system beyond its maximum design load. Stressing the system often causes defects to come to light.

Stressing the system tests failure behaviour.
• Systems should not fail catastrophically.
• Stress testing checks for unacceptable loss of service or data.

Stress testing is particularly relevant to distributed systems that can exhibit severe degradation as a network becomes overloaded.
