3.5 Acceptance testing — cs.tut.fi/tie21201/s2011/luennot/ohj-3066_2011_110-170.pdf
TRANSCRIPT
Department of Software Systems — slides 110–170
3.5 Acceptance testing
• Acceptance testing can be used to determine whether the product conforms to the agreements or not
• When development is outsourced, the acceptance test plan and even the related test cases can be attached to the agreement between the organizations
• Acceptance testing is based on customer requirements
• According to the V-model it is the first phase in test design and the last phase in test execution
• The acceptance test plan has to be kept up to date
  – When new features are requested in the middle of the project (which is common), they must also be tested in the acceptance testing phase
Mika Katara: Software Testing, 2011
• In acceptance testing the entire finished product is tested
  – End users of the product should be used as testers
  – The test environment should be as close as possible to the real end user environment
• Usually the time is short, as the purpose is only to show that the requirements have been satisfied
  – Finding errors should no longer be the primary objective
  – Acceptance testing of a larger system should not take more than a few weeks
  – Otherwise it might be that system testing is just being called acceptance testing (which is OK, as long as it is acknowledged); then the primary objective is to find bugs
• An acceptance test should mostly be a demonstration
  – If errors are found, fixing them might be very expensive
  – Subcontracting can alter the situation, if the purpose is to check whether the terms of the agreement have been met
  – The client might want to postpone the end of acceptance testing, for example when the guarantee period starts from the end of this phase
• If the end users get to use the product only when it is supposed to be done, errors will be found with high probability
  – Ideally, users should participate in all phases of the process in some way
  – Iterative development: the customers comment on unfinished versions (demos) during the development
    • Everyone gets a better view of the requirements
• Because making big changes at the end of the project might lead to chaos, they have to be considered carefully
  – After prioritizing the change requests it might make the most sense to defer fixing some errors to the next releases
  – If changes are decided to be implemented, the risks may have to be re-evaluated
    • Maybe some feature should be left out of this version, or its testing reduced to save resources
• Differences between the test environment and the end user environment are issues that are not tested
  – Are real databases, or other related systems, used, or are they simulated somehow?
  – Is the test data generated or real?
  – Is there other software in the end user environment?
  – Is the hardware configuration realistic?
• There can be many different kinds of end user environments
  – Creating profiles that correspond to the most common environments can be beneficial
• In an ideal case, the end users or their representatives write the acceptance test plan
• The users understand
  – The requirements
  – The risks related to their own business
• The users can
  – Supply realistic test data
  – Supply use cases
  – Define passing conditions for acceptance testing
  – Run tests
  – Inspect and review
    • Test reports
    • Other documents created in the project
• For every requirement, there should be at least one test case
• One test case can cover several (valid/legal) requirements
• A requirements coverage matrix can help, if the information is not available from the test management tool
• Even if every requirement has been tested, still not everything has been tested
  – Actually, not a single requirement has been tested completely
  – A program has a lot of states that have not been exercised by any test case
  – In the acceptance test phase the technology used in the implementation is often overlooked
• Test cases should be prioritized by analyzing the risks of the corresponding requirements
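A requirements coverage matrix of this kind can be sketched in a few lines; the requirement and test case names below are made up for illustration:

```python
# Minimal sketch of a requirements coverage matrix (illustrative names only).
# Each test case lists the requirement IDs it covers; the matrix reveals
# requirements with no test case at all.

test_cases = {
    "TC-01": ["REQ-1", "REQ-2"],   # one test case may cover several requirements
    "TC-02": ["REQ-2"],
    "TC-03": ["REQ-4"],
}
requirements = ["REQ-1", "REQ-2", "REQ-3", "REQ-4"]

coverage = {req: [tc for tc, reqs in test_cases.items() if req in reqs]
            for req in requirements}

uncovered = [req for req, tcs in coverage.items() if not tcs]
print(coverage)   # which test cases touch each requirement
print(uncovered)  # ['REQ-3'] — needs at least one test case
```

Note that, as the slide says, an empty `uncovered` list only means every requirement is touched by some test case, not that any requirement is tested completely.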
Alpha and Beta tests
• Done by users
• Alpha tests are executed by the user at the supplier's premises in an environment that is as realistic as possible
  – A development or test environment is not good enough
• Beta tests are done by the user at the user's own premises and environment
• In both alpha and beta testing it must be remembered that ad-hoc testing does not replace systematic testing using a predefined test strategy and test cases
  – Distributing an unfinished version to selected users is not enough by itself
  – Ad-hoc testing can also be useful, but only after the more systematic tests (unless used as part of a smoke test)
Excursion: Risk analysis
• Risk analysis is performed
  – Unknowingly: if I skip studying for an exam and read only on the last night, what is the probability that I'll fail, and what will the consequences be?
  – Knowingly: when the key persons of the organization have to travel on the same day to the same place, is it a good idea to put them on the same flight?
• Unfortunately, out of all possible tests, only a small fraction can actually be executed
• By using risk analysis we can choose the most useful tests
• The results of the risk analysis can also be used in communication with upper management
• Outline of a risk analysis process (according to [Craig&Jaskiel 02]):
  1. Gather a brainstorming team
  2. List the features and attributes of the system
  3. Estimate the probability of a fault
  4. Estimate the effect of a fault on the users
  5. Calculate the risk priority
  6. Evaluate the results and make changes if necessary
  7. Order the features and attributes
  8. Decide the lowest priority to be tested
  9. Consider how the risks could be reduced
• Risk analysis should be performed as early as possible during the project
• Gathering a brainstorming team: include for example users, developers, testers, and persons from marketing, customer service and technical support
• The objective is to gather as much information as possible about the product and the related business
• Listing all features and attributes of the system: all the available documentation etc. can be used as a source
  – Start from a higher abstraction level, concentrate on the details later
  – The analysis can concentrate on only a part of the system, as long as the necessary interfaces are considered
  – Typical attributes: functionality, reliability, compatibility, usability, maintainability, performance, scalability, security
• Estimating the probability of a fault: go through the previous list and for each attribute and feature estimate the probability of a fault
  – A relative, not an absolute, scale
  – For example, scale 1–3 (low, medium, high)
  – In this phase it is important to get something decided, not to reach complete unanimity
    • Values can be changed afterwards
  – For example: if team T implements feature F in environment E, what is the probability of a fault?
• Estimating the effect of a fault: again the previous list is used, and the effect on the user is evaluated
  – The scale could be the same as before
  – The user's view, not the impact on the project, the work load, etc.
  – Sometimes users want to assign the highest possible value to every case
    • In that case, the number of top values per user can be limited, for instance
• Calculating the risk priority: add up the values estimated in the two previous phases
  – As a result we get a five-step risk priority on a scale 2–6
  – Instead of adding the values up, they can be multiplied together
    • If the scale includes a zero value, that has to be taken into consideration when choosing the mathematical operation
  – Problems:
    • Because the values of effect and probability are based on subjective evaluation, arithmetic operations are not allowed, at least in principle
    • Non-linearity
  – The values of risk priority are just estimates; they are not to be taken as hard facts!
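The calculation above can be sketched as follows; the feature names and the 1–3 estimates are invented for illustration:

```python
# Sketch of the risk-priority calculation described above (hypothetical data).
# Probability and effect are both estimated on a relative 1-3 scale
# (low/medium/high); priority = probability + effect gives a 2-6 scale.

features = [
    # (feature, probability of fault, effect on users)
    ("login", 2, 3),
    ("report printing", 1, 1),
    ("account transfer", 3, 3),
    ("help texts", 1, 2),
]

prioritized = sorted(
    ((name, p + e) for name, p, e in features),
    key=lambda item: item[1],
    reverse=True,  # highest risk first
)
print(prioritized)
# [('account transfer', 6), ('login', 5), ('help texts', 3), ('report printing', 2)]
```

Multiplying instead of adding (`p * e`) spreads the scale out more, but as the slide notes, either way the numbers are subjective estimates, not measurements.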
• Evaluating the results and changing them if necessary: the probabilities produced by the brainstorming can be re-evaluated at this phase (with new information)
  – For example, if it is known that a feature will be created by an inexperienced development team, the probability of a fault could be increased
  – Changes can also be based on a value produced by a complexity metric, for instance
• Ordering the features and attributes: organize the list again in the order of risk priority
  – The ordered list can help to concentrate resources where they are needed the most
  – Dependencies between test cases should not be taken into consideration at this phase
    • In practice, some dependencies can dictate that a lower priority feature has to be tested before a higher priority one
• Determining the lowest priority to be tested: the attributes and features that have a lower priority will not be tested at all, or only minimal testing is done
  – The available resources determine where the line is drawn
  – As the project continues, the line can be moved in both directions
• Try to lower the risks by lowering the probabilities of faults
  – Concentrate on high priority features and attributes
  – There can be several ways to lower the probabilities: inspections, reviews, prototype development, applying more efficient testing techniques, etc.
• Just like the other documents, the risk analysis results must be maintained
  – For example, whenever requirements change
• When the implementation of a newer version of the system is started, the results of the previous risk analysis can be reused
  – Risks will increase for those parts that are to be changed
  – Usually the probabilities of errors change rather than the effects
• Exercise: risk analysis for the courseware used on a course on Software Testing, filling in a table with the columns:

  Feature | Attribute | Probability | Effect | Priority | Reducing of probability
• Another prioritization method is the so-called MoSCoW method, which defines the following four priority values (in descending priority order):
  – Must test
  – Should test
  – Could test
  – Won't test

See "Successful Test Management: An Integral Approach", Iris Pinkster, Bob van de Burgt, Dennis Janssen & Erik van Veenendaal, Springer, 2004.
An example of MoSCoW analysis

• The priority of a requirement is derived from the consequences of not meeting it:
  – Do people die, do we lose money, do clients lose money, or does no one lose money?
  – Do all clients suffer, or does only one client suffer?
  – Does a detour (workaround) exist or not?
• The eight resulting outcomes map, from most to least severe, to: Must test, Must test, Must test, Should test, Should test, Could test, Could test, Won't test
Measuring progress, one example

[Chart: number of tested requirements plotted against time, with an estimated curve and an actual curve rising toward the "Must test", "Should test" and "Could test" levels; the planned delivery date is marked on the time axis]
• In some projects a more detailed prioritization can be needed
  – Test suites are collections of test cases that inherit the MoSCoW priority of the test suite
    • For example, a test suite can contain the test cases that correspond to a single requirement
  – Inside a test suite, test cases are prioritized on the scale High, Medium and Low
  – Now a decision might be made that in the "Should test" suite only those test cases that have a High priority are executed
  – Note! The priorities of test cases between different test suites are not necessarily comparable
    • Is "Should test"/Low more important than "Could test"/High?
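The two-level selection described above can be sketched as follows; the suite contents and selection rules are invented for illustration:

```python
# Two-level prioritization sketch: test suites carry a MoSCoW value, test
# cases inside a suite carry High/Medium/Low. All names are illustrative.

CASE_RANK = {"High": 0, "Medium": 1, "Low": 2}

suites = {
    "Must test":   [("TC-1", "Low"), ("TC-2", "High")],
    "Should test": [("TC-3", "High"), ("TC-4", "Medium")],
    "Could test":  [("TC-5", "High")],
}

def select(suites, rules):
    """rules maps a MoSCoW value to the lowest case priority still executed."""
    chosen = []
    for moscow, cases in suites.items():
        cutoff = rules.get(moscow)
        if cutoff is None:
            continue  # suite is not executed at all
        chosen += [tc for tc, prio in cases
                   if CASE_RANK[prio] <= CASE_RANK[cutoff]]
    return chosen

# Run everything in "Must test", only High-priority cases in "Should test":
print(select(suites, {"Must test": "Low", "Should test": "High"}))
# ['TC-1', 'TC-2', 'TC-3']
```

Note how the slide's caveat shows up here: the rules compare priorities only within a suite, never a "Should test"/Low case against a "Could test"/High one.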
• Matching the requirements and risks of the system under test:
  – If there is a risk but no corresponding requirement, either the risk is unnecessary or the corresponding requirement has to be added
  – If there is a requirement but no corresponding risk, the risk should be added, or the requirement removed if it is not needed
• Risk analysis can also improve the quality of the requirements specification
• The analysis can also be done from a financial point of view
  – Instead of evaluating the effect of errors on the users, the effect on the amount of money that the software will produce is evaluated
  – For example, if a feature is extremely important to some high-paying customer, this can be taken into consideration when the effect of a possible failure is evaluated
3.6 Are all test phases needed?
• Systematic phasing has clear benefits compared to an unplanned approach
• Especially in small projects, however, going through all the test phases might cause extra work
• When planning the phases, the following should be taken into consideration
  – The complexity of the system to be developed
  – The budget of the project
  – The test organization
• The goal is not to maximize the number of phases but to choose the right phases for each project
• Note! The same phase can be called by different names
  – Unit testing can also be called module testing, component testing, developer testing, thread testing, etc.
    • Don't let the terms fool you
• Previously, when presenting integration testing, its relation to unit testing was considered: does it need its own phase or should it be a part of unit testing?
• Could system testing be combined with integration testing?
  – Craig and Jaskiel [Craig&Jaskiel 02] list attributes that are common among their customer organizations in which combining these two phases has been successful:
    • Good product management
    • A relatively large number of automated tests
    • Interaction between testers and designers works well
– A strategy for combining system and integration testing [Craig&Jaskiel 02]: the test team tests every build that is noticeably different from the previous one
  • The last level of integration testing, where all units are involved, becomes the system test
  • Compared to a situation where developers would be responsible for integration testing, this approach transfers some of the responsibility for code quality from the developers to the testers
• What about combining acceptance testing with system testing?
  – Might be reasonable when end users participate in system testing
  – Tests should then be executed in an environment that is as close as possible to that of the end users
3.7 When to move to the next phase?
• Two essential questions when considering the different phases
  – How do the phases relate to each other?
    • A clear separation helps when moving from one phase to another
  – How does one know when to move from one phase to another?
    • Clear start and passing criteria
• Different phases should not overlap, and there should not be great gaps between them
• The motivation for overlap is usually to speed up the lifecycle
• Problems caused by overlap
  – Bugs that were found (and fixed) earlier can be found again
    • Bugs found in unit and interface testing are much cheaper to fix than bugs found later on
  – When testing is transferred away from the developers to a separate testing team, the price of errors gets higher
  – A challenge for error management
  – Configuration management becomes more complex
• Well-defined and carefully applied starting and passing criteria for the different phases create common rules for moving the project forward
• Some of the starting and passing criteria depend on the project; some can be standardized within the organization
• Those working on the previous phase are interested in the next phase too, as their efforts affect not only reaching the passing criteria of their own phase but also reaching the start criteria of the next phase
• An example of integration testing passing criteria:
  – All unit and integration tests have been executed and their results have been documented
  – All serious errors have been fixed and regression tests have been executed
  – The rate of finding errors has dropped below an agreed limit (for example, at most five medium-level bugs during the last three days)
  – An adequate statement coverage has been reached (for example 90%)
  – Tests have covered enough of the program's specification
  – The results of the code review have been documented and they are acceptable
• The starting criteria of a phase should contain at least some part of the passing criteria of the previous phase
  – Additionally, one can add conditions regarding for example the creation of the test environment, the test tools, and even getting the work force for testing
• When there is a danger that the version to be tested is untestable, the starting criteria should include the requirement that a smoke test has to be passed
• The passing criteria of acceptance testing are very important, as they are the ending criteria of the entire testing effort
• Examples of ending criteria of acceptance testing (according to [Craig&Jaskiel 02]):
  – There must not be medium or more serious errors open
  – No feature may have two or more open errors, and altogether there must not be over 50 open errors
  – One test case for each requirement must have been passed
  – Test cases TT23, TT25 and TT38-52 must have been passed
  – Eight out of ten skilled banking clerks can open an account in ten minutes using the on-line instructions
  – The system can open 1000 accounts in an hour
– The transition from one screen to another shall take on average at most one second when there are 100 users using the system
– The users must accept the results of the tests with their signatures
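Some of the quantitative ending criteria above can be checked mechanically against bug-tracker data. A sketch, with a made-up error-list format and the slide's severity scale:

```python
# Hypothetical sketch: checking two of the ending criteria above against
# a bug-tracker export. The list format and feature names are assumptions.

open_errors = [
    # (feature, severity)  — severity scale: "mild", "moderate", "serious"
    ("login", "mild"),
    ("reports", "mild"),
    ("reports", "mild"),
]

def acceptance_can_end(errors, total_limit=50):
    # No medium ("moderate") or more serious errors may be open
    if any(sev in ("moderate", "serious") for _, sev in errors):
        return False
    # No feature may have two or more open errors
    per_feature = {}
    for feature, _ in errors:
        per_feature[feature] = per_feature.get(feature, 0) + 1
    if any(count >= 2 for count in per_feature.values()):
        return False
    # Altogether there must not be over `total_limit` open errors
    return len(errors) <= total_limit

print(acceptance_can_end(open_errors))  # False: "reports" has two open errors
```

The qualitative criteria (clerks opening accounts, signed approval) naturally stay manual.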
3.8 About documentation
• As usual in software engineering, documentation has a significant role in testing
  – A lot of documentation is created
  – A considerable amount of time is spent on writing it
  – Keeping the documents and the actual tests in sync can become a problem
• The purpose of documentation is to get the information from the minds of the designers/testers into a form that others can use too
  – When things are written down, misunderstandings and flaws are usually found
  – Dilemma: how to document everything that is necessary but nothing useless
• The purpose of test planning is to make the execution of tests and the reporting of the results as easy as possible
  – Project plan:
    • What documents are produced
    • Who produces them
    • When
• A general plan of testing (Master Test Plan) can be useful in big projects:
  – Schedules for testing
  – Start and passing criteria of testing
  – Who is responsible and for what (for example the setup of the test environment, etc.)
• Test planning [adapted from Haikala&Märijärvi 06]:

[Diagram: the functional specification, design specification, previous test documentation, experiences and check lists, and the quality system's code of practice all feed into test planning & design; this produces the test plan & design, and test execution then produces the test reports]
• Because the acceptance test plan is read by the users, its readability should be emphasized
  – In an ideal case the users write the acceptance test plan themselves, if they have enough expertise
  – Sources needed:
    • The project plan, which shows the big picture
    • The master test plan
    • The requirements, as the basis for the tests
    • In addition, the user manual is useful, if it is available
• The system test plan can be included in the functional specification
• The integration test plan can be included in the architecture design document
• The definitions of the corresponding test environments and test cases can be attached to each module's design document
• When tests are designed in detail, there might be change requests on the master test plan
  – Naturally, updating it has to be taken care of
  – Documentation that is not up to date can often do more harm than good
• Sometimes documentation might have hidden purposes
  – Supporting the project?
  – Following the process or the quality system?
  – If there are hidden purposes, why are they not written down?
• Good documentation helps to concentrate on the essential issues when changes occur
• Test tools might help with the documentation
  – (Partial) generation of some types of documentation
• Similarly to phasing, the documentation has to be designed based on the size of the project and the size of the system under test
• If the project in question is small, one test plan is usually enough, covering both integration and system testing
  – The plan is made in the functional specification phase and updated in the architecture design phase
Example Table of Contents I [Haikala&Märijärvi 06]:
1. Introduction
2. Test target and objectives
3. Environment
4. Organization and reporting
5. Test strategy and integration plan
6. Features to be tested
7. Test cases and acceptance criteria
8. Testing for non-functional features
9. Exceptional cases
10. Features not to be tested
Example Table of Contents II:
– Introduction
  • Purpose and coverage
  • Product and environment
  • Definitions, terms and abbreviations
  • References
  • Overview of the document
– Test definition
  • Requirements to be tested
  • Focusing of the approach
  • Naming of the test cases
  • Acceptance and rejection criteria
– Dependencies between test cases
– Definition of test cases
– Errors found in the functional / technical specification
• The biggest difference between example TOC I and TOC II is that TOC II assumes the existence of a separate Master Test Plan:

[Diagram: the Master test plan is refined into a System test plan, an Integration test plan and a Unit test plan]
• Test report:
  – Introduction
  – Conflicts and deviations
  – Coverage evaluation
  – Results
  – Evaluation
  – Functionality (actions)
  – Acceptance
IEEE Std 829-1998 (new version 2008)
• IEEE (Institute of Electrical and Electronics Engineers) Standard for Software Test Documentation
• Defines the following document templates
  – For test planning
    • Test Plan: master test plan, plans for the different phases
  – For test design
    • Test Design Specification: test designs for the different phases, dependencies between test cases, coverage requirements, etc.
    • Test Case Specification: used when necessary to define test cases and test automation scripts
    • Test Procedure Specification: describes how a test suite is to be executed
  – For reporting
    • Test Item Transmittal Report: used when it is necessary to report a transition from development to testing
      – For example from a development team to a separate testing team
    • Test Log: used to document the results of executed test cases and test suites
    • Test Incident Report: used to report the bugs etc. that were found
      – Can relate to any bug, including the test case definitions themselves
    • Test Summary Report: reports the passing of the entire testing or a phase, or the reaching of some other significant goal
3.9 Follow up
• When test cases have been executed, the results are reported in the test reports
• Customer complaints have to be documented carefully
• To improve the quality system, errors are analyzed and filed
  – Dates and times of finding, creating and fixing the errors
  – Coverage metrics of testing
  – Classification of errors (for example mild, moderate, serious, catastrophic)
    • Too many levels make classification difficult
• It might be useful to keep a diary just for testing purposes
4. Dynamic testing techniques
In the traditional sense, testing is considered to consist of execution of test cases. In this section, we present different kinds of dynamic testing techniques.
• There exists a huge number of different techniques
  – The techniques to be used have to be selected case by case
  – Generally speaking, it's really hard to find Best Practices™
• There are some rules of thumb for choosing the right technique in each situation
• Techniques can also be classified in many ways
• Whichever dynamic testing technique is used, the following six questions need an answer:
  – Is the code of the program used in test design or not?
  – Who does the testing?
  – What part of the program is to be tested?
  – What types of problems are searched for?
  – What needs to be done?
  – How do you know whether a test run was successful or not?
• The following classification divides the techniques based on which of the previous questions they try to answer
  – The classification and descriptions are mostly based on [Kaner et al. 02]
• Different techniques have to be combined when necessary
• The techniques form a kind of toolbox
  – A screwdriver and a hammer will get you far, but sometimes you need a saw
4.1 Techniques – whether to use source code or not?
• In white box testing, test cases are chosen based on information about the internals of the system, such as the source code (also called glass box or clear box testing)
  – White box testing cannot detect issues that are specified but not implemented
    • In the diagram of slide 20 this would mean that the circle that describes the behavior covered by the test cases is entirely inside the circle that describes the implemented behavior of the program
• In black box testing the program under test is seen as a black box, ignoring the implementation details
  – When the size of the unit to be tested grows, there is a transition from white box testing to black box testing
    • Unit testing is often white box testing
    • System testing is often black box testing
    • On the other hand, when a required coverage needs to be achieved, black box tests are executed first, and then new white box tests are designed and executed until the required coverage is met
– Test cases are designed using the specifications, not the implementation
– Tests can be designed even though the implementation does not exist yet
– Test cases preserve their value when the implementation changes, as long as the specification remains the same
– Downsides of the method:
  • Test cases might be redundant with each other
  • It is easy to leave parts of the program untested
– Black box testing cannot detect behavior that has been implemented but not specified
  • In the diagram of slide 20 this would mean that the circle that describes the test cases is entirely inside the circle that describes the specified behavior of the program
– The method also works for other parts of the system than just the code, so the ideas generalize
– Mostly cheaper than white box testing, which requires a person (if not the programmer) who first gets to know the code
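As a tiny illustration of the black box approach, the test cases below are derived purely from a hypothetical specification, without looking at the implementation:

```python
# Black-box illustration: test cases derived from a (hypothetical)
# specification — "grade(score) returns 'fail' for 0-49 and 'pass' for
# 50-100" — without inspecting the implementation.

def grade(score):
    # Implementation under test; it could be rewritten freely, and the
    # black-box tests below would stay valid as long as the spec holds.
    return "pass" if score >= 50 else "fail"

# Test cases chosen from the specification's boundaries, not from the code:
spec_cases = [(0, "fail"), (49, "fail"), (50, "pass"), (100, "pass")]

for score, expected in spec_cases:
    assert grade(score) == expected
print("all specification-based cases pass")
```

Note the downside mentioned above: these tests say nothing about unspecified behavior the implementation might contain, for example for negative scores.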
• The techniques complement each other: white box tests find low-level errors (closer to the code) and black box tests find higher-level errors (closer to the requirements)
  – Specifications are at a higher level of abstraction than the implementation
• If some knowledge of the program's implementation is used in testing, but it is not pure white box testing, this can be called gray box testing