M.Sc Information Technology
SOFTWARE QUALITY ASSURANCE AND TESTING
Unit I
Principles of Testing – Software Development Life Cycle Models
Unit II
Testing Fundamentals – 1: White box Testing – Integration testing – System and
acceptance testing.
Unit III
Testing Fundamentals- 2 & Specialized testing: Performance testing – Regression
Testing – Testing of object oriented systems – Usability and accessibility testing.
Unit IV
Test planning, Management, Execution and Reporting
Unit V
Software Test Automation – Test Metrics and Measurements.
Text Book(s):
1. Software Testing – Srinivasan Desikan, Gopalaswamy Ramesh – Pearson
Education, 2006
References
1. Introducing Software Testing – Louis Tamres, Addison Wesley Publications,
First Edition.
2. Software Testing – Ron Patton, SAMS Techmedia, Indian Edition, 2001
3. Software Quality – Producing Practical, Consistent Software – Mordechai
Ben-Menachem, Garry S. Marliss, Thomson Learning, 2003.
UNIT - I
Structure
1.1 Objectives
1.2 Introduction
1.3 Software Testing Fundamentals
1.3.1. Software Chaos
1.3.2. Criteria for Project Success
1.4 Testing Principles
1.5 Software Development Life Cycle Models
1.5.1 Big-Bang
1.5.2 Code and fix
1.5.3 Waterfall
1.5.4 Prototype Model
1.5.5 The RAD Model
1.6. Evolutionary Software Process Models
1.6.1 The incremental model
1.6.2 The Spiral Model
1.6.3 The WIN-WIN spiral model
1.6.4 The Concurrent development model
1.7. Summary
1.8. Check your progress
1.1. Objectives
• To know the testing fundamentals and objectives
• To learn the principles of testing
• To understand the various life cycle models for software
1.2 Introduction
Testing presents an interesting anomaly for the software engineer. During earlier
software engineering activities, the engineer attempts to build software from an
abstract concept to a tangible product. Now comes testing. The engineer creates a
series of test cases that are intended to "demolish" the software that has been built. In
fact, testing is the one step in the software process that could be viewed
(psychologically, at least) as destructive rather than constructive. Software engineers
are by their nature constructive people. Testing requires that the developer discard
preconceived notions of the "correctness" of software just developed and overcome a
conflict of interest that occurs when errors are uncovered.
Beizer describes this situation effectively when he states:
There's a myth that if we were really good at programming, there would be no bugs
to catch. If only we could really concentrate, if only everyone used structured
programming, top down design, decision tables, if programs were written in SQUISH,
if we had the right silver bullets, then there would be no bugs. Therefore, testing and
test case design is an admission of failure, which instills a goodly dose of guilt. And
the tedium of testing is just punishment for our errors.
Software is tested from two perspectives: internal program logic is exercised using
"white box" test case design techniques, and software requirements are exercised
using "black box" test case design techniques. In both cases, the intent is to find the
maximum number of errors with the minimum amount of effort and time.
What is the work product? A set of test cases designed to exercise both internal logic
and external requirements is designed and documented, expected results are defined,
and actual results are recorded.
How do I ensure that I have done it right? When you begin testing, change your point
of view. Try hard to "break" the software!
1.3 Software Testing Fundamentals
The fundamental principles of testing are as follows:
1. The goal of testing is to find defects before customers find them.
2. Exhaustive testing is not possible; program testing can only show the presence
of defects, never their absence.
3. Testing applies all through the software life cycle and is not an end-of-cycle
activity.
4. Understand the reason behind the test.
5. Test the test first.
6. Tests develop immunity and have to be revised constantly.
7. Defects occur in convoys or clusters, and testing should focus on these
convoys.
8. Testing encompasses defect prevention.
9. Testing is a fine balance of defect prevention and defect detection.
10. Intelligent and well-planned automation is key to realizing the benefits of
testing.
11. Testing requires talented, committed people who believe in themselves and
work in teams.
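Principle 2, that exhaustive testing is not possible, can be made concrete with a
back-of-the-envelope calculation. The sketch below assumes a hypothetical test
harness throughput; the point is the order of magnitude, not the exact figure.

```python
# Why exhaustive testing is impossible: a function taking just two
# 32-bit integer inputs already has 2**64 distinct input combinations.
combinations = 2 ** 64

# Assume an (optimistic) harness running a billion tests per second.
tests_per_second = 10 ** 9
seconds_per_year = 60 * 60 * 24 * 365

years = combinations / (tests_per_second * seconds_per_year)
print(round(years))  # roughly 585 years of nonstop execution
```

Even under this generous assumption, covering every input pair of one small function
would take centuries, which is why testers must select cases rather than enumerate them.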
Software Testing Techniques
Should testing instill guilt? Is testing really destructive? The answer to these
questions is "No!" However, the objectives of testing are somewhat different than we
might expect.
Testing Objectives
In an excellent book on software testing, Glen Myers states a number of rules that can
serve well as testing objectives:
1. Testing is a process of executing a program with the intent of finding an error.
2. A good test case is one that has a high probability of finding an as-yet undiscovered
error.
3. A successful test is one that uncovers an as-yet-undiscovered error.
These objectives imply a dramatic change in viewpoint. They move counter to the
commonly held view that a successful test is one in which no errors are found. Our
objective is to design tests that systematically uncover different classes of errors and
to do so with a minimum amount of time and effort. If testing is conducted
successfully (according to the objectives stated previously), it will uncover errors in
the software. As a secondary benefit, testing demonstrates that software functions
appear to be working according to specification, that behavioral and performance
requirements appear to have been met. In addition, data collected as testing is
conducted provides a good indication of software reliability and some indication of
software quality as a whole. But testing cannot show the absence of errors and
defects; it can show only that software errors and defects are present. It is important
to keep this (rather
gloomy) statement in mind as testing is being conducted.
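Myers illustrates these objectives with his classic triangle-classification program.
The sketch below shows error-seeking tests in that spirit; the function and the
specific cases are illustrative stand-ins, not taken from the text.

```python
# A sketch of Myers' idea: tests written to expose errors,
# not merely to confirm expected behavior.

def classify_triangle(a, b, c):
    """Classify a triangle by its three side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("sides must be positive")
    if a + b <= c or b + c <= a or a + c <= b:
        raise ValueError("violates the triangle inequality")
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

def run_error_seeking_tests():
    """Probe boundaries where defects cluster, not just the happy path."""
    results = []
    results.append(classify_triangle(3, 4, 5) == "scalene")
    results.append(classify_triangle(2, 2, 3) == "isosceles")
    # Degenerate triangle: 1 + 2 == 3 must be rejected, not classified.
    try:
        classify_triangle(1, 2, 3)
        results.append(False)
    except ValueError:
        results.append(True)
    # A zero-length side must also be rejected.
    try:
        classify_triangle(0, 4, 5)
        results.append(False)
    except ValueError:
        results.append(True)
    return all(results)
```

A test such as the degenerate-triangle case has a high probability of exposing an
as-yet-undiscovered error, because it targets a boundary the programmer may have
overlooked.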
It is easy to take software for granted and not really appreciate how it has infiltrated
our daily lives. Most of us now can’t go a day without logging on to the internet and
checking our email. We rely on overnight packages, long distance phone service, and
cutting-edge medical treatments.
1.3.1 Software Chaos
Software is everywhere. However, it is written by people. So it is not perfect, as the
following examples show:
Disney’s Lion King, 1994-1995
In the fall of 1994, the Disney Company released its first multimedia CD-ROM game
for children. On December 26, customer support engineers were swamped with calls
from angry parents who could not get the software to work. It turns out that Disney
failed to properly test the software on the many different PC models available on the
market.
The other infamous software error case studies are listed below:
• Intel Pentium Floating Point Division Bug, 1994
• NASA Mars Polar Lander, 1999
• Patriot Missile Defense system, 1991
• The Y2K Bug, circa 1974
What is a Bug?
We have just read examples of what happens when software fails. In these instances,
it was obvious that the software did not operate as intended. Problem, error, and bug
are probably the most generic terms used.
Why do bugs occur?
The number one cause of software bugs is the specification. There are several reasons
why specifications are the largest bug producers. In many cases specifications are not
written. Other reasons may be that the specification is not thorough enough, it is
constantly changing, or it is not communicated well to the entire development team.
Planning software is vitally important. If it is not done correctly, bugs will be created.
The next largest source of bugs is the design. Coding errors may be more familiar to
you if you are a programmer.
The Cost of Bugs
Software does not just magically appear. There is usually a planned, methodical
development process used to create it. From its inception, through the planning,
programming and testing, to its use by the public, there is the potential for bugs to be
found. The cost to fix bugs increases dramatically over time.
What exactly does a software tester do?
The goals of a software tester are
• To find bugs
• To find them as early as possible
• To make sure they get fixed.
It has been said, “If you do not know where you are going, all roads lead there.”
Traditionally, many IT organizations annually develop a list of improvements to
incorporate into their operations without establishing a goal. Using this approach, the
IT organization can declare “victory” any time it wants. This lesson will help you
understand the importance of following a well-defined process for becoming a
world-class software testing organization. This lesson will help you define your
strengths and deficiencies, your staff competencies and deficiencies, and areas of user
dissatisfaction.
1.3.2 Criteria for Project Success
The Three-Step Process to Becoming a World-Class Testing Organization
The roadmap to become a world-class software testing organization is a simple
three-step process, as follows:
1. Define or adopt a world-class software testing model.
2. Determine your organization’s current level of software testing capabilities,
competencies, and user satisfaction.
3. Develop and implement a plan to upgrade from your current capabilities, competencies, and user satisfaction to those in the world-class software testing model.
This three-step process requires you to compare your current capabilities,
competencies, and user satisfaction against those of the world-class software testing
model. This assessment will enable you to develop a baseline of your organization’s
performance. The plan that you develop will, over time, move that baseline from its
current level of performance to a world-class level. Understanding the model for a
world-class software testing organization and then comparing your organization will
provide you with a plan for using the remainder of the material in this book.
Software testing is an integral part of the software-development process, which
comprises the following four components (see Figure 1):
1. Plan (P): Devise a plan. Define your objective and determine the strategy and
supporting methods to achieve it. You should base the plan on an assessment of your
current situation, and the strategy should clearly focus on the strategic initiatives/key
units that will drive your improvement plan.
2. Do (D): Execute the plan. Create the conditions and perform the necessary
training to execute the plan. Make sure everyone thoroughly understands the
objectives and the plan. Teach workers the procedures and skills they need to fulfill
the plan and thoroughly understand the job. Then perform the work according to these
procedures.
3. Check (C): Check the results. Check to determine whether work is progressing
according to the plan and whether the expected results are being obtained. Check for
performance of the set procedures, changes in conditions, or abnormalities that may
appear. As often as possible, compare the results of the work with the objectives.
4. Act (A): Take the necessary action. If your checkup reveals that the work is not
being performed according to the plan or that results are not what you anticipated,
devise measures to take appropriate actions.
Fig. 1 The four components of a software development process.
Testing involves only the “check” component of the plan-do-check-act (PDCA)
cycle. The software development team is responsible for the three remaining
components. The development team plans the project and builds the software (the
“do” component); the testers check to determine that the software meets the needs of
the customers and users. If it does not, the testers report defects to the development
team. It is the development team that makes the determination as to whether the
uncovered defects are to be corrected. The role of testing is to fulfill the check
responsibilities assigned to the testers; it is not to determine whether the software can
be placed into production. That is the responsibility of the customers, users, and
development team.
1.4 Testing Principles
Before applying methods to design effective test cases, a software engineer must
understand the basic principles that guide software testing. Davis suggests a set of
testing principles that have been adapted for use in this book:
• All tests should be traceable to customer requirements. As we have seen, the
objective of software testing is to uncover errors. It follows that the most severe
defects (from the customer’s point of view) are those that cause the program to fail to
meet its requirements.
• Tests should be planned long before testing begins. Test planning can begin as
soon as the requirements model is complete. All tests can be planned and designed
before any code has been generated.
• The Pareto principle applies to software testing. Stated simply, the Pareto
principle implies that 80 percent of all errors uncovered during testing will likely be
traceable to 20 percent of all program components. The problem, of course, is to
isolate these suspect components and to thoroughly test them.
• Testing should begin “in the small” and progress toward testing “in the large.”
The first tests planned and executed generally focus on individual components. As
testing progresses, focus shifts in an attempt to find errors in integrated clusters of
components and ultimately in the entire system.
• Exhaustive testing is not possible. The number of path permutations for even a
moderately sized program is exceptionally large. For this reason, it is impossible to
execute every combination of paths during testing. It is possible, however, to
adequately cover program logic and to ensure that all conditions in the
component-level design have been exercised.
• To be most effective, testing should be conducted by an independent third
party. By most effective, we mean testing that has the highest probability of finding
errors (the primary objective of testing).
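The Pareto principle above can be put to work directly on defect data. The sketch
below uses a hypothetical defect log (the module names and counts are invented for
illustration) to find the smallest set of components accounting for about 80 percent
of the errors.

```python
# A minimal sketch of Pareto analysis on a defect log.
# Module names and counts are hypothetical illustration data.
from collections import Counter

defect_log = (
    ["parser"] * 34 + ["scheduler"] * 22 + ["ui"] * 5 +
    ["report"] * 4 + ["auth"] * 3 + ["config"] * 2
)

counts = Counter(defect_log).most_common()
total = sum(n for _, n in counts)

# Accumulate the worst offenders until ~80% of defects are explained.
cumulative, suspects = 0, []
for module, n in counts:
    cumulative += n
    suspects.append(module)
    if cumulative / total >= 0.8:
        break

print(suspects)            # the components to test most thoroughly
print(cumulative / total)  # fraction of defects they explain
```

Here two of six modules account for 80 percent of the logged defects, so those two
are the suspect components that deserve the most thorough testing.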
Testability
In ideal circumstances, a software engineer designs a computer program, a system, or
a product with “testability” in mind. This enables the individuals charged with testing
to design effective test cases more easily. But what is testability? James Bach
describes testability in the following manner. Software testability is simply how easily
[a computer program] can be tested. Since testing is so profoundly difficult, it pays to
know what can be done to streamline it. Sometimes programmers are willing to do
things that will help the testing process and a checklist of possible design points,
features, etc., can be useful in negotiating with them. There are certainly metrics that
could be used to measure testability in most of its aspects.
Operability. "The better it works, the more efficiently it can be tested."
• The system has few bugs (bugs add analysis and reporting overhead to the test
process).
• No bugs block the execution of tests.
• The product evolves in functional stages (allows simultaneous development and
testing).
Observability. "What you see is what you test."
• Distinct output is generated for each input.
• System states and variables are visible or queriable during execution.
• Past system states and variables are visible or queriable (e.g., transaction logs).
• All factors affecting the output are visible.
• Incorrect output is easily identified.
• Internal errors are automatically detected through self-testing mechanisms.
• Internal errors are automatically reported.
• Source code is accessible.
Controllability. "The better we can control the software, the more the testing can be
automated and optimized."
• All possible outputs can be generated through some combination of input.
• All code is executable through some combination of input.
• Software and hardware states and variables can be controlled directly by the test
engineer.
• Input and output formats are consistent and structured.
• Tests can be conveniently specified, automated, and reproduced.
Decomposability. "By controlling the scope of testing, we can more quickly isolate
problems and perform smarter retesting."
• The software system is built from independent modules.
• Software modules can be tested independently.
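Testing a module independently usually means replacing its collaborators with stubs.
The sketch below is a hypothetical illustration of that idea, with the dependency
injected so the module under test needs nothing outside itself.

```python
# A sketch of decomposability: the module under test takes its
# collaborator as a parameter, so a stub can stand in for it.

def apply_discount(price, rate_lookup):
    """Discount a price using an injected rate-lookup function."""
    rate = rate_lookup(price)
    return round(price * (1 - rate), 2)

# In production, rate_lookup might query a pricing service. For an
# independent unit test, a stub with known answers is enough.
def stub_lookup(price):
    return 0.10 if price >= 100 else 0.0

assert apply_discount(200.0, stub_lookup) == 180.0
assert apply_discount(50.0, stub_lookup) == 50.0
```

Because the module's only dependency is passed in, a failure observed here can be
isolated to the module itself, which is exactly the retesting benefit decomposability
promises.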
Simplicity. "The less there is to test, the more quickly we can test it."
• Functional simplicity (e.g., the feature set is the minimum necessary to meet
requirements).
• Structural simplicity (e.g., architecture is modularized to limit the propagation of
faults).
• Code simplicity (e.g., a coding standard is adopted for ease of inspection and
maintenance).
Stability. "The fewer the changes, the fewer the disruptions to testing."
• Changes to the software are infrequent.
• Changes to the software are controlled.
• Changes to the software do not invalidate existing tests.
• The software recovers well from failures.
Understandability. "The more information we have, the smarter we will test."
• The design is well understood.
• Dependencies between internal, external, and shared components are well
understood.
• Changes to the design are communicated.
• Technical documentation is instantly accessible.
• Technical documentation is well organized.
• Technical documentation is specific and detailed.
• Technical documentation is accurate.
The attributes suggested by Bach can be used by a software engineer to develop a
software configuration (i.e., programs, data, and documents) that is amenable to
testing. And what about the tests themselves? Kaner, Falk, and Nguyen suggest the
following attributes of a “good” test:
1. A good test has a high probability of finding an error. To achieve this goal, the
tester must understand the software and attempt to develop a mental picture of how
the software might fail. Ideally, the classes of failure are probed. For example, one
class of potential failure in a GUI (Graphical User Interface) is a failure to recognize
proper mouse position. A set of tests would be designed to exercise the mouse in an
attempt to demonstrate an error in mouse position recognition.
2. A good test is not redundant. Testing time and resources are limited. There is no
point in conducting a test that has the same purpose as another test. Every test should
have a different purpose (even if it is subtly different). For example, a module of the
Safe Home software is designed to recognize a user password to activate and
deactivate the system. In an effort to uncover an error in password input, the tester
designs a series of tests that input a sequence of passwords. Valid and invalid
passwords (four numeral sequences) are input as separate tests.
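The non-redundancy idea can be sketched in code. Below, a hypothetical stand-in for
the SafeHome password validator is exercised by a table of cases, each annotated with
the distinct failure it targets, so no test duplicates another's purpose.

```python
# A sketch of non-redundant test design. The validator is a
# hypothetical stand-in for the SafeHome password module.

def is_valid_password(pw):
    """Accept exactly four numerals, per the example specification."""
    return len(pw) == 4 and pw.isdigit()

# Each case probes a *different* potential failure.
cases = [
    ("1234",  True,  "nominal valid password"),
    ("123",   False, "too short by one"),
    ("12345", False, "too long by one"),
    ("12a4",  False, "non-numeral character"),
    ("",      False, "empty input"),
]

for pw, expected, purpose in cases:
    assert is_valid_password(pw) == expected, purpose
```

Adding, say, a second nominal case such as "5678" would buy almost nothing: it has
the same purpose as "1234" and would consume test time without probing a new class
of failure.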
1.5 Software Development Life Cycle Models
To solve actual problems in an industry setting, a software engineer or a team of
engineers must incorporate a development strategy that encompasses the process,
methods, and tools layers and the generic phases. This strategy is often referred to as a
process model or a software engineering paradigm. A process model for software
engineering is chosen based on the nature of the project and application, the methods
and tools to be used, and the controls and deliverables that are required. In an
intriguing paper on the nature of the software process, L. B. S. Raccoon [RAC95]
uses fractals as the basis for a discussion of the true nature of the software process.
“Too often, software work follows the first law of bicycling: No matter where you're
going, it's uphill and against the wind.”
In the sections that follow, a variety of different process models for software
engineering are discussed. Each represents an attempt to bring order to an inherently
chaotic activity. It is important to remember that each of the models has been
characterized in a way that (ideally) assists in the control and coordination of a real
software project.
A life cycle model describes how the phases combine to form a complete
project or life cycle. Such a model is characterized by the following attributes:
The activities performed
The deliverables from each activity
Methods of validation of the deliverables
The sequence of activities
Methods of verification of each activity, including the mechanism of
communication amongst the activities.
The process used to create a software product from its initial conception to its release
is known as the software development life cycle model.
1.5.1 Big-Bang Model
One theory of the creation of the universe is the big-bang theory. It states that billions
of years ago, the universe was created in a single huge explosion of nearly infinite
energy. Everything that exists is the result of this energy. A big-bang model for
software development follows the same principle. A huge amount of matter (people
and money) is put together, a lot of energy is expended, often violently, and out
comes the perfect software product. The beauty of the big-bang method is that it's
simple.
There is little if any planning, scheduling, or formal development process. All the
effort is spent developing the software and writing the code. It is an ideal process if
the product requirements are not well understood and the final release date is flexible.
It is also important to have very flexible customers, because they won't know
what they are getting until the very end.
Fig.2 Big-Bang Model
Notice that testing is not shown in the figure. In most cases, there is little to no formal
testing done under the big-bang model. If testing does occur, it is squeezed in just
before the product is released. If you are called in to test a product under the big-bang
model, you have both an easy and a difficult task. Because the software is already
complete, you have the perfect specification: the product itself. And, because it's
impossible to go back and fix things that are broken, your job is really just to report
what you find so the customers can be told about the problems. The downside is that,
in the eyes of project management, the product is ready to go, so your work is holding
up delivery to the customer. The longer you take to do your job and the more bugs
you find, the more contentious the situation will become. Try to stay away from
testing in this model.
1.5.2 Code and Fix Model
The code and fix model is usually the one that project teams fall into by default if
they don’t consciously attempt to use something else. It is a step up, procedurally,
from the big-bang model in that it at least requires some idea of what the product
requirements are.
Fig. 3 Code and Fix model
A team using this approach usually starts with a rough idea of what they want, does
some simple design, and then proceeds into a long repeating cycle of coding, testing
and fixing bugs. At some point they decide that it is enough and release the product.
As there is very little overhead for planning and documenting, a project team can
show results immediately. For this reason the code and fix model works very well for
some projects intended to be created quickly and then thrown out shortly after they
are done, such as prototypes and demos. Even so, code and fix has been used on many
large and well-known software products. If your word processor or spreadsheet
software has lots of little bugs or it just doesn’t seem quite finished, it was likely
created with the code and fix model.
As a tester on a code and fix project, you need to be aware that you, along with the
programmers, will be in a constant state of cycling. As often as every day you will be
given new or updated releases of the software and will set off to test it. You will run
your tests, report the bugs, and then get a new software release. You may not have
finished testing the previous release when the new one arrives, and the new one may
have new or changed features. Eventually, you will get a chance to test most of the
features, find fewer and fewer bugs, and then someone will decide that it is time to
release the product.
1.5.3 Waterfall Model
Sometimes called the classic life cycle or the waterfall model, the linear sequential
model suggests a systematic, sequential approach to software development that begins
at the system level and progresses through analysis, design, coding, testing, and
support. Modeled after a conventional engineering cycle, the linear sequential model
encompasses the following activities:
System/information engineering and modeling. Because software is always part of
a larger system (or business), work begins by establishing requirements for all system
elements and then allocating some subset of these requirements to software. This
system view is essential when software must interact with other elements such as
hardware, people, and databases. System engineering and analysis encompass
requirements gathering at the system level with a small amount of top level design
and analysis. Information engineering encompasses requirements gathering at the
strategic business level and at the business area level.
Software requirements analysis. The requirements gathering process is intensified
and focused specifically on software. To understand the nature of the program(s) to be
built, the software engineer ("analyst") must understand the information domain for
the software, as well as required function, behavior, performance, and interface.
Requirements for both the system and the software are documented and reviewed with
the customer.
Design. Software design is actually a multistep process that focuses on four distinct
attributes of a program: data structure, software architecture, interface representations,
and procedural (algorithmic) detail. The design process translates requirements into a
representation of the software that can be assessed for quality before coding begins.
Like requirements, the design is documented and becomes part of the software
configuration.
Code generation. The design must be translated into a machine-readable form. The
code generation step performs this task. If design is performed in a detailed manner,
code generation can be accomplished mechanistically.
Testing. Once code has been generated, program testing begins. The testing process
focuses on the logical internals of the software, ensuring that all statements have been
tested, and on the functional externals; that is, conducting tests to uncover errors and
ensure that defined input will produce actual results that agree with required results.
Support. Software will undoubtedly undergo change after it is delivered to the
customer (a possible exception is embedded software). Change will occur because
errors have been encountered, because the software must be adapted to accommodate
changes in its external environment (e.g., a change required because of a new
operating system or peripheral device), or because the customer requires functional or
performance enhancements. Software support/maintenance reapplies each of the
preceding phases to an existing program rather than a new one.
The waterfall model is usually the first one taught in programming school.
Fig. 4 Waterfall model.
The above Figure.4 shows the steps involved in this model. A project using the
waterfall model moves down a series of steps starting from an initial idea to a final
product. At the end of each step, the project team holds a review to determine if they
are ready to move to the next step. Notice three important things about the waterfall
model:
• There is a large emphasis on specifying what the product will be.
• The steps are discrete; there is no overlap.
• There is no way to back up. As soon as you are on a step, you need to
complete the tasks for that step and then move on; you can't go back.
The advantage is that everything is carefully and thoroughly specified. But with this
advantage comes a large disadvantage. Because testing occurs only at the end, a
fundamental problem could creep in early on and not be detected until days before the
scheduled product release.
The linear sequential model is the oldest and the most widely used paradigm for
software engineering. However, criticism of the paradigm has caused even active
supporters to question its efficacy. Among the problems that are sometimes
encountered when the linear sequential model is applied are:
1. Real projects rarely follow the sequential flow that the model proposes. Although
the linear model can accommodate iteration, it does so indirectly. As a result, changes
can cause confusion as the project team proceeds.
2. It is often difficult for the customer to state all requirements explicitly. The linear
sequential model requires this and has difficulty accommodating the natural
uncertainty that exists at the beginning of many projects.
3. The customer must have patience. A working version of the program(s) will not be
available until late in the project time-span. A major blunder, if undetected until the
working program is reviewed, can be disastrous.
In an interesting analysis of actual projects, Bradac [BRA94] found that the linear
nature of the classic life cycle leads to “blocking states” in which some project team
members must wait for other members of the team to complete dependent tasks. In
fact, the time spent waiting can exceed the time spent on productive work! The
blocking state tends to be more prevalent at the beginning and end of a linear
sequential process.
Each of these problems is real. However, the classic life cycle paradigm has a
definite and important place in software engineering work. It provides a template into
which methods for analysis, design, coding, testing, and support can be placed. The
classic life cycle remains a widely used procedural model for software engineering.
While it does have weaknesses, it is significantly better than a haphazard approach to
software development.
1.5.4 The Prototyping Model
Often, a customer defines a set of general objectives for software but does not
identify detailed input, processing, or output requirements. In other cases, the
developer may be unsure of the efficiency of an algorithm, the adaptability of an
operating system, or the form that human/machine interaction should take. In these,
and many other situations, a prototyping paradigm may offer the best approach.
The prototyping paradigm (Figure 5) begins with requirements gathering. Developer
and customer meet and define the overall objectives for the software, identify
whatever requirements are known, and outline areas where further definition is
mandatory. A "quick design" then occurs. The quick design focuses on a
representation of those aspects of the software that will be visible to the customer/user
(e.g., input approaches and output formats). The quick design leads to the construction
of a prototype. The prototype is evaluated by the customer/user and used to refine
requirements for the software to be developed. Iteration occurs as the prototype is
tuned to satisfy the needs of the customer, while at the same time enabling the
developer to better understand what needs to be done.
Fig. 5 Prototyping Paradigm
Ideally, the prototype serves as a mechanism for identifying software requirements.
If a working prototype is built, the developer attempts to use existing program
fragments or applies tools (e.g., report generators, window managers) that enable
working programs to be generated quickly. But what do we do with the prototype
when it has served the purpose just described? Brooks [BRO75] provides an answer:
In most projects, the first system built is barely usable. It may be too slow, too big,
awkward in use or all three. There is no alternative but to start again, smarting but
smarter, and build a redesigned version in which these problems are solved . . . When
a new system concept or new technology is used, one has to build a system to throw
away, for even the best planning is not so omniscient as to get it right the first time.
The management question, therefore, is not whether to build a pilot system and throw
it away. You will do that. The only question is whether to plan in advance to build a
throwaway, or to promise to deliver the throwaway to customers.
The prototype can serve as "the first system." The one that Brooks recommends we
throw away. But this may be an idealized view. It is true that both customers and
developers like the prototyping paradigm. Users get a feel for the actual system and
developers get to build something immediately. Yet, prototyping can also be
problematic for the following reasons:
1. The customer sees what appears to be a working version of the software, unaware
that the prototype is held together “with chewing gum and baling wire,” unaware that
in the rush to get it working no one has considered overall software quality or
long-term maintainability. When informed that the product must be rebuilt so that
high levels of quality can be maintained, the customer cries foul and demands that "a
few fixes" be applied to make the prototype a working product. Too often, software
development management relents.
2. The developer often makes implementation compromises in order to get a
prototype working quickly. An inappropriate operating system or programming
language may be used simply because it is available and known; an inefficient
algorithm may be implemented simply to demonstrate capability.
After a time, the developer may become familiar with these choices and forget all the
reasons why they were inappropriate. The less-than-ideal choice has now become an
integral part of the system.
Although problems can occur, prototyping can be an effective paradigm for software
engineering. The key is to define the rules of the game at the beginning; that is, the
customer and developer must both agree that the prototype is built to serve as a
mechanism for defining requirements. It is then discarded (at least in part) and the
actual software is engineered with an eye toward quality and maintainability.
1.5.5 The RAD Model
Rapid application development (RAD) is an incremental software development
process model that emphasizes an extremely short development cycle. The RAD
model is a “high-speed” adaptation of the linear sequential model in which rapid
development is achieved by using component-based construction. If requirements are
well understood and project scope is constrained, the RAD process enables a
development team to create a “fully functional system” within very short time periods
(e.g., 60 to 90 days). Used primarily for information systems applications, the RAD
approach encompasses the following phases:
Fig. 6 The RAD Model
Business modeling. The information flow among business functions is modeled in a
way that answers the following questions: What information drives the business
process? What information is generated? Who generates it? Where does the
information go? Who processes it?
Data modeling. The information flow defined as part of the business modeling phase
is refined into a set of data objects that are needed to support the business. The
characteristics (called attributes) of each object are identified and the relationships
between these objects defined.
Process modeling. The data objects defined in the data modeling phase are
transformed to achieve the information flow necessary to implement a business
function. Processing descriptions are created for adding, modifying, deleting, or
retrieving a data object.
Application generation. RAD assumes the use of fourth generation techniques.
Rather than creating software using conventional third generation programming
languages the RAD process works to reuse existing program components (when
possible) or create reusable components (when necessary). In all cases, automated
tools are used to facilitate construction of the software.
Testing and turnover. Since the RAD process emphasizes reuse, many of the
program components have already been tested. This reduces overall testing time.
However, new components must be tested and all interfaces must be fully exercised.
The RAD process model is illustrated in Figure 6. Obviously, the time constraints
imposed on a RAD project demand “scalable scope” [KER94]. If a business
application can be modularized in a way that enables each major function to be
completed in less than three months (using the approach described previously), it is a
candidate for RAD. Each major function can be addressed by a separate RAD team
and then integrated to form a whole.
Like all process models, the RAD approach has drawbacks:
• For large but scalable projects, RAD requires sufficient human resources to create
the right number of RAD teams.
• RAD requires developers and customers who are committed to the rapid-fire
activities necessary to get a system complete in a much abbreviated time frame. If
commitment is lacking from either constituency, RAD projects will fail.
• Not all types of applications are appropriate for RAD. If a system cannot be properly
modularized, building the components necessary for RAD will be problematic. If high
performance is an issue and performance is to be achieved through tuning the
interfaces to system components, the RAD approach may not work.
• RAD is not appropriate when technical risks are high. This occurs when a new
application makes heavy use of new technology or when the new software requires a
high degree of interoperability with existing computer programs.
1.6 Evolutionary Software Process Models
There is growing recognition that software, like all complex systems, evolves over a
period of time. Business and product requirements often change as development
proceeds, making a straight path to an end product unrealistic; tight market deadlines
make completion of a comprehensive software product impossible, but a limited
version must be introduced to meet competitive or business pressure; a set of core
product or system requirements is well understood, but the details of product or
system extensions have yet to be defined. In these and similar situations, software
engineers need a process model that has been explicitly designed to accommodate a
product that evolves over time.
The linear sequential model is designed for straight-line development. In essence, this
waterfall approach assumes that a complete system will be delivered after the linear
sequence is completed. The prototyping model is designed to assist the customer (or
developer) in understanding requirements. In general, it is not designed to deliver a
production system. The evolutionary nature of software is not considered in either of
these classic software engineering paradigms.
Evolutionary models are iterative. They are characterized in a manner that enables
software engineers to develop increasingly more complete versions of the software.
1.6.1 The Incremental Model
The incremental model combines elements of the linear sequential model (applied
repetitively) with the iterative philosophy of prototyping. Referring to Figure 7, the
incremental model applies linear sequences in a staggered fashion as calendar time
progresses. Each linear sequence produces a deliverable “increment” of the software
[MDE93]. For example, word-processing software developed using the incremental
paradigm might deliver basic file management, editing, and document production
functions in the first increment; more sophisticated editing and document production
capabilities in the second increment; spelling and grammar checking in the third
increment; and advanced page layout capability in the fourth increment. It should be
noted that the process flow for any increment can incorporate the prototyping
paradigm.
When an incremental model is used, the first increment is often a core product.
That is, basic requirements are addressed, but many supplementary features (some
known, others unknown) remain undelivered. The core product is used by the
customer (or undergoes detailed review). As a result of use and/or evaluation, a plan
is developed for the next increment. The plan addresses the modification of the core
product to better meet the needs of the customer and the delivery of additional
features and functionality. This process is repeated following the delivery of each
increment, until the complete product is produced.
Fig. 7 The Incremental Model
The incremental process model, like prototyping and other evolutionary approaches,
is iterative in nature. But unlike prototyping, the incremental model focuses on the
delivery of an operational product with each increment. Early increments are stripped
down versions of the final product, but they do provide capability that serves the user
and also provide a platform for evaluation by the user.
Incremental development is particularly useful when staffing is unavailable for a
complete implementation by the business deadline that has been established for the
project. Early increments can be implemented with fewer people. If the core product
is well received, then additional staff (if required) can be added to implement the next
increment. In addition, increments can be planned to manage technical risks. For
example, a major system might require the availability of new hardware that is under
development and whose delivery date is uncertain. It might be possible to plan early
increments in a way that avoids the use of this hardware, thereby enabling partial
functionality to be delivered to end-users without inordinate delay.
1.6.2 The Spiral Model
Fig. 8 The spiral model
The spiral model, originally proposed by Boehm, is an evolutionary software process
model that couples the iterative nature of prototyping with the controlled and
systematic aspects of the linear sequential model. It provides the potential for rapid
development of incremental versions of the software. Using the spiral model, software
is developed in a series of incremental releases. During early iterations, the
incremental release might be a paper model or prototype. During later iterations,
increasingly more complete versions of the engineered system are produced. A spiral
model is divided into a number of framework activities, also called task regions.
Figure 8 depicts a spiral model that contains six task regions:
• Customer communication—tasks required to establish effective
communication between developer and customer.
• Planning—tasks required to define resources, timelines, and other project
related information.
• Risk analysis—tasks required to assess both technical and management risks.
• Engineering—tasks required to build one or more representations of the
application.
• Construction and release—tasks required to construct, test, install, and
provide user support (e.g., documentation and training).
• Customer evaluation – tasks required to evaluate the project.
The spiral model is a realistic approach to the development of large-scale systems
and software. Because software evolves as the process progresses, the developer and
customer better understand and react to risks at each evolutionary level. The spiral
model uses prototyping as a risk reduction mechanism but, more important, enables
the developer to apply the prototyping approach at any stage in the evolution of the
product. It maintains the systematic stepwise approach suggested by the classic life
cycle but incorporates it into an iterative framework that more realistically reflects the
real world. The spiral model demands a direct consideration of technical risks at all
stages of the project and, if properly applied, should reduce risks before they become
problematic.
But like other paradigms, the spiral model is not a panacea. It may be difficult to
convince customers (particularly in contract situations) that the evolutionary approach
is controllable. It demands considerable risk assessment expertise and relies on this
expertise for success. If a major risk is not uncovered and managed, problems will
undoubtedly occur. Finally, the model has not been used as widely as the linear
sequential or prototyping paradigms. It will take a number of years before the efficacy of
this important paradigm can be determined with absolute certainty.
1.6.3 The WIN-WIN Spiral Model
The spiral model discussed in the previous Section suggests a framework activity that
addresses customer communication. The objective of this activity is to elicit project
requirements from the customer. In an ideal context, the developer simply asks the
customer what is required and the customer provides sufficient detail to proceed.
Unfortunately, this rarely happens. In reality, the customer and the developer enter
into a process of negotiation, where the customer may be asked to balance
functionality, performance, and other product or system characteristics against cost
and time to market.
The best negotiations strive for a “win-win” result. That is, the customer wins by
getting the system or product that satisfies the majority of the customer’s needs and
the developer wins by working to realistic and achievable budgets and deadlines.
Boehm’s WINWIN spiral model defines a set of negotiation activities at the
beginning of each pass around the spiral. Rather than a single customer
communication activity, the following activities are defined:
1. Identification of the system or subsystem’s key “stakeholders.”
2. Determination of the stakeholders’ “win conditions.”
3. Negotiation of the stakeholders’ win conditions to reconcile them into a set of
win-win conditions for all concerned (including the software project team).
Successful completion of these initial steps achieves a win-win result, which becomes
the key criterion for proceeding to software and system definition. The WINWIN
spiral model is illustrated in the following Figure.
Fig. 9 The WIN WIN Spiral Model
In addition to the emphasis placed on early negotiation, the WINWIN spiral model
introduces three process milestones, called anchor points that help establish the
completion of one cycle around the spiral and provide decision milestones before the
software project proceeds.
In essence, the anchor points represent three different views of progress as the project
traverses the spiral. The first anchor point, life cycle objectives (LCO), defines a set of
objectives for each major software engineering activity. For example, as part of LCO,
a set of objectives establishes the definition of top-level system/product requirements.
The second anchor point, life cycle architecture (LCA), establishes objectives that
must be met as the system and software architecture is defined. For example, as part
of LCA, the software project team must demonstrate that it has evaluated the
applicability of off-the-shelf and reusable software components and considered their
impact on architectural decisions. Initial operational capability (IOC) is the third
anchor point and represents a set of objectives associated with the preparation of the
software for installation/distribution, site preparation prior to installation, and
assistance required by all parties that will use or support the software.
1.6.4 The Concurrent Development Model
The concurrent process model can be represented schematically as a series of major
technical activities, tasks, and their associated states. For example, the engineering
activity defined for the spiral model is accomplished by invoking the following tasks:
prototyping and/or analysis modeling, requirements specification, and design.
Fig. 10 The Concurrent Development Model
Figure 10 provides a schematic representation of one activity within the concurrent
process model. The activity—analysis—may be in any one of the states noted at any
given time. Similarly, other activities (e.g., design or customer communication) can
be represented in an analogous manner. All activities exist concurrently but reside in
different states. For example, early in a project the customer communication activity
(not shown in the figure) has completed its first iteration and exists in the awaiting
changes state. The analysis activity (which existed in the none state while initial
customer communication was completed) now makes a transition into the under
development state. If, however, the customer indicates that changes in requirements
must be made, the analysis activity moves from the under development
State into the awaiting changes state.
The concurrent process model defines a series of events that will trigger transitions
from state to state for each of the software engineering activities. For example, during
early stages of design, an inconsistency in the analysis model is uncovered. This
generates the event analysis model correction which will trigger the analysis activity
from the done state into the awaiting changes state.
The concurrent process model is often used as the paradigm for the development of
components. When applied to client/server, the concurrent process model defines
components. When applied to client/server, the concurrent process model defines
activities in two dimensions: a system dimension and a component dimension. System
level issues are addressed using three activities: design, assembly, and use. The
component dimension is addressed with two activities: design and realization.
Concurrency is achieved in two ways: (1) system and component activities occur
simultaneously and can be modeled using the state-oriented approach described
previously; (2) a typical client/server application is implemented with many
components, each of which can be designed and realized concurrently.
In reality, the concurrent process model is applicable to all types of software
development and provides an accurate picture of the current state of a project. Rather
than confining software engineering activities to a sequence of events, it defines a
network of activities. Each activity on the network exists simultaneously with other
activities. Events generated within a given activity or at some other place in the
activity network trigger transitions among the states of an activity.
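The event-triggered state transitions described above can be sketched as a small state machine. The states and events below are illustrative, drawn loosely from the discussion (only a few of the model's states are shown), not a complete rendering of the concurrent process model:

```python
class Activity:
    """One software engineering activity (e.g., analysis or design)
    in the concurrent process model; every activity exists at all
    times, but resides in some state."""

    # Events trigger transitions from state to state; e.g., an
    # analysis-model correction moves the activity from "done"
    # back into "awaiting changes".
    TRANSITIONS = {
        ("none", "start"): "under development",
        ("under development", "complete"): "done",
        ("done", "correction needed"): "awaiting changes",
        ("awaiting changes", "resume"): "under development",
    }

    def __init__(self, name):
        self.name = name
        self.state = "none"

    def on_event(self, event):
        # Unknown (state, event) pairs leave the state unchanged.
        self.state = self.TRANSITIONS.get((self.state, event), self.state)

analysis = Activity("analysis")
analysis.on_event("start")
analysis.on_event("complete")
analysis.on_event("correction needed")
print(analysis.state)  # awaiting changes
```

Other activities (design, customer communication, and so on) would be further `Activity` instances, each concurrently in its own state.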
1.7 Summary
Software engineering is a discipline that integrates process, methods, and tools for the
development of computer software. A number of different process models for
software engineering have been proposed, each exhibiting strengths and weaknesses,
but all having a series of generic phases in common. As can be seen from the above
discussion, each of the models has its advantages and disadvantages. Each of them
has applicability in a specific scenario. Each of them also provides different issues,
challenges, and opportunities for verification and validation.
1.8 Check your Progress
1. What are the objectives of testing?
2. Write down the principles of testing.
3. Discuss the criteria for project success.
4. Write short notes on various software development lifecycle models.
5. What are evolutionary life cycle models? How do they differ from
the older models?
6. Explain the Spiral and WIN - WIN spiral model with a neat diagram.
7. What are the phases involved in a Waterfall model? Explain
8. Explain the Prototyping and RAD model.
9. Write short notes on Code and Fix model & Big Bang Model.
10. Write down the advantages of the concurrent development model.
Unit II
Structure
2.0 Objectives
2.1. Introduction
2.2. White Box Testing
2.2.1 Static Testing
2.2.2 Structural Testing
2.2.3 Code Complexity Testing
2.3 Integration Testing
2.3.1 Top-Down Integration Testing
2.3.2 Bottom – Up Integration testing
2.3.3 Integration testing Documentation
2.3.4 Alpha and Beta Testing
2.4 System and Acceptance Testing
2.4.1 System Testing
2.4.2 Acceptance Testing
2.5. Summary
2.6 Check your progress
2.0 Objectives
To know the various types of testing and their importance
To learn the White box testing methods and its features
To understand the necessity of performing integration testing in the top-down and
bottom-up sequences
To learn system and acceptance testing, through which the software is finally accepted for use
2.1 Introduction
Testing requires asking about and understanding what you are trying to test, knowing
what the correct outcome is, and why you are performing the test. Why we test is as
important as what to test and how to test. Understanding the rationale for testing
certain functionality leads to different types of tests, which we will see in the
following sections.
We do white box testing to check the various paths in the code and make sure they
are exercised correctly. Knowing which code paths should be exercised for a given
test enables making necessary changes to ensure that appropriate paths are covered.
Knowing the external functionality of what the product should do, we design black
box tests. Integration tests are used to make sure that the different components fit
together. Regression testing is done to ensure that changes work as designed and do
not have any unintended side-effects. So test the test first: a defective test is more
dangerous than a defective product.
TYPES OF TESTING
The various types of testing which are often used are listed below:
White Box Testing
Black Box Testing
Integration Testing
System and Acceptance Testing
Performance Testing
Regression testing
Testing of Object Oriented Systems
Usability and Accessibility Testing
2.2 WHITE-BOX TESTING
White box testing is a way of testing the external functionality of the code by
examining and testing the program code that realizes the external functionality. This
is also known as clear box, glass box, or open box testing. White box testing takes
into account the program code, code structure, and internal design flow. White box
testing is classified into static and structural testing.
White-box testing, sometimes called glass-box testing, is a test case design method
that uses the control structure of the procedural design to derive test cases. Using
white-box testing methods, the software engineer can derive test cases that
(1) Guarantee that all independent paths within a module have been exercised at least
once
(2) Exercise all logical decisions on their true and false sides
(3) Execute all loops at their boundaries and within their operational bounds and
(4) Exercise internal data structures to ensure their validity.
It is not possible to exhaustively test every program path because the number of paths
is simply too large. White-box tests can be designed only after a component-level
design (or source code) exists. The logical details of the program must be available.
Fig. 11 Classification of white box testing
2.2.1 Static testing
Static testing requires only the source code of the product, not the binaries or
executables. Static testing does not involve executing the programs on computers but
involves selected people going through the code to find out whether
• The code works according to the functional requirements
• The code has been written in accordance with the design developed earlier in the project life cycle
• The code for any functionality has been missed out
• The code handles errors properly
Static testing can be done by humans or with the help of specialized tools.
Figure 11 classifies white box testing into two branches: static testing (desk
checking, code walkthrough, and code inspection) and structural testing (unit/code
functional testing; code coverage, comprising statement, path, condition, and
function coverage; and code complexity testing, including cyclomatic complexity).
Static testing by human
These methods rely on the principle of humans reading the program code to detect
errors rather than computers executing the code to find errors. This process has
several advantages.
1. Sometimes humans can find errors that computers cannot. For example, when
there are two variables with similar names and the programmer used a wrong
variable by mistake in an expression, the computer will not detect the error but
execute the statement and produce incorrect results, whereas a human being
can spot such an error.
2. By making multiple humans read and evaluate the program, we can get
multiple perspectives and therefore have more problems identified upfront
than a computer could.
3. A human evaluation of the code can compare it against the specifications or
design and thus ensure that it does what it is intended to do. This may not always
be possible when a computer runs a test.
4. A human evaluation can detect many problems at one go and can even try to
identify the root causes of the problems.
5. By making humans test the code before execution, computer resources can be
saved. Of course, this comes at the expense of human resources.
6. A proactive method of testing like static testing minimizes the delay in
identification of the problems.
7. From a psychological point of view, finding defects later in the cycle creates
immense pressure on programmers. They have to fix defects with less time to
spare. With this kind of pressure, there are higher chances of other defects
creeping in.
There are multiple methods to achieve static testing by humans. They are
1. Desk checking of the code
2. Code walk through
3. Code review
4. Code inspection
Desk checking
Desk checking is normally done manually by the author of the code to verify portions
of the code for correctness. Such verification is done by comparing the code
with the design or specifications to make sure that the code does what it is supposed
to do, and does so effectively. Whenever errors are found, the author applies the
correction on the spot. This method of catching and correcting errors is characterized by
1. No structured method or formulation to ensure completeness and
2. No maintaining of a log or check list.
Some of the disadvantages of this method of testing are as follows:
1. A developer is not the best person to detect problems in his own
code.
2. Developers generally prefer to write new code rather than any form
of testing.
3. This method is essentially person dependent and informal.
Code walkthrough
Walkthroughs are less formal than inspections. The advantage that a walkthrough
has over desk checking is that it brings in multiple perspectives. In walkthroughs, a set of
people look at the program code and raise questions for the author. The author
explains the logic of the code and answers the questions.
Formal inspection
Code inspection, also called Fagan inspection, is a method normally carried out
with a high degree of formalism. The focus of this method is to detect all faults, violations,
and other side effects.
Combining various methods
The methods discussed above are not mutually exclusive. They need to be
used in a judicious combination to be effective in achieving the goal of finding defects
early.
Static analysis tools
There are several static analysis tools available in the market that can reduce the
manual work and perform analysis of the code to find out errors such as
1. Whether there is unreachable code
2. Variables declared but not used
3. Mismatch in definition and assignment of values to variables etc.
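A minimal sketch of what such a tool does, using Python's standard `ast` module to flag variables that are assigned but never read. The function name and the single check are illustrative; real static analysis tools perform many more analyses than this:

```python
import ast

def unused_assignments(source):
    """Flag names assigned somewhere in the source but never read:
    a classic "declared but not used" static-analysis check."""
    tree = ast.parse(source)
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):   # name being written
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):  # name being read
                used.add(node.id)
    return sorted(assigned - used)

# 'y' is assigned but never read, so the checker reports it.
print(unused_assignments("x = 1\ny = 2\nprint(x)"))  # ['y']
```

Note that no code is executed: the tool reasons purely about the source text, which is exactly what makes it a static testing aid.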
While following any of the methods of human checking – desk checking,
walkthroughs, or formal inspection- it is useful to have a code review check list.
Code review checklist
- Data item declaration related
- Data usage related
- Control flow related
- Standards related
- Style related.
Why test internal program logic rather than concentrate on ensuring that program
requirements have been met? Stated another way, why don't we spend all of our
energy on black-box tests? The answer lies in the nature of software defects.
• Logic errors and incorrect assumptions are inversely proportional to the probability
that a program path will be executed. Errors tend to creep into our work when we
design and implement function, conditions, or control that is out of the mainstream.
Everyday processing tends to be well understood (and well scrutinized), while
"special case" processing tends to fall into the cracks.
• We often believe that a logical path is not likely to be executed when, in fact, it may
be executed on a regular basis. The logical flow of a program is sometimes
counterintuitive, meaning that our unconscious assumptions about flow of control and
data may lead us to make design errors that are uncovered only once path testing
commences.
• Typographical errors are random. When a program is translated into programming
language source code, it is likely that some typing errors will occur. Many will be
uncovered by syntax and type checking mechanisms, but others may go undetected
until testing begins. It is as likely that a typo will exist on an obscure logical path as
on a mainstream path. Each of these reasons provides an argument for conducting
white-box tests. Black box testing, no matter how thorough, may miss the kinds of
errors noted here. White box testing is far more likely to uncover them.
2.2.2 Structural testing
Structural testing takes into account the code, code structure, internal design, and how
they are coded. In structural testing tests are actually run by the computer on the built
product, whereas in static testing the product is tested by humans using just the source
code and not the executables or binaries. Structural testing can be further classified
into
• Unit/code functional testing,
• Code coverage and
• Code complexity testing.
Unit/Code functional testing
This initial part of structural testing corresponds to some quick checks that a
developer performs before subjecting the code to more extensive code coverage
testing or code complexity testing.
Initially the developer can perform certain obvious tests, knowing the input
variables and the corresponding expected output variables. This can be a quick test
that checks out any obvious mistakes. By repeating these tests for multiple values of
input variables, the confidence level of the developer to go to the next level increases.
This can even be done prior to formal reviews of static testing so that the review
mechanism does not waste time catching obvious errors.
For modules with complex logic or conditions, the developer can build a
“debug version” of the product by putting intermediate print statements and making
sure the program is passing through the right loops and iterations the right number of
times. It is important to remove the intermediate print statements after the defects are
fixed.
Another approach to do the initial test is to run the product under a debugger
or an Integrated Development Environment (IDE). These tools allow single stepping
of instructions, stepping break points at any function or instruction, and viewing the
various system parameters or program variable values.
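Such quick checks amount to exercising the unit with a few known input/expected-output pairs before submitting it to more extensive coverage testing. The `gcd` function below is only a stand-in for the developer's own code:

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm."""
    while b:
        a, b = b, a % b
    return a

# Quick developer checks: known input variables paired with the
# corresponding expected output values, as described in the text.
checks = {(12, 8): 4, (7, 3): 1, (0, 5): 5}
for (a, b), expected in checks.items():
    assert gcd(a, b) == expected, (a, b)
```

Repeating such checks for multiple input values raises the developer's confidence before moving to the next level of testing.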
Code coverage testing
Code coverage testing involves designing and executing test cases and finding out the
percentage of code that is covered by testing. The percentage of code covered by a
test is found by adopting a technique called instrumentation of code. There are
specialized tools available to achieve instrumentation.
The tools also allow reporting on the portions of the code that are covered frequently,
so that the critical or most often used portions of code can be identified.
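The idea behind instrumentation can be sketched with Python's `sys.settrace` hook, which reports each line as it executes. Real coverage tools instrument code far more efficiently; this is only an illustration of the principle:

```python
import sys

def trace_lines(func, *args):
    """Record which line numbers of func actually execute: a
    minimal sketch of the instrumentation behind coverage tools."""
    executed = set()

    def tracer(frame, event, arg):
        # A "line" event fires just before each source line runs.
        if event == "line" and frame.f_code is func.__code__:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

# Only the lines on the executed path are recorded; comparing this
# set against all executable lines yields a coverage percentage.
covered = trace_lines(classify, 7)
```

Here the call with `n = 7` never reaches the `return "negative"` line, so that line is absent from the recorded set, which is exactly the kind of gap coverage reports expose.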
Code coverage testing is made up of the following types of coverage.
1. Statement coverage
2. Path coverage
3. Condition coverage
4. Function coverage
Statement coverage
Program constructs in most conventional programming languages can be classified as
1. Sequential control flow
2. Two-way decision statements like if then else
3. Multi-way decision statements like switch
4. Loops like while do, repeat until and for
Object-oriented languages have all of the above and, in addition, a number of other
constructs and concepts. Statement coverage refers to writing test cases that execute
each of the program statements.
Path coverage
In path coverage, we split a program into a number of distinct paths. A program can
start from the beginning and take any of the paths to its completion. Path coverage
provides a stronger condition of coverage than statement coverage as it relates to the
various logical paths in the program rather than just program statements.
Condition coverage
Even when path coverage testing has covered all the possible paths, it does not
mean that the program is fully covered; the individual conditions within each
decision must also be exercised.
Condition coverage = (Total decisions exercised / Total number of decisions in
program)*100
The condition coverage as defined by the formula above gives an indication of the
percentage of conditions covered by a set of test cases. Condition coverage is a much
stronger criterion than statement coverage.
Function coverage
This is a newer addition to structural testing, used to identify how many program
functions are covered by test cases. The requirements of a product are mapped into
functions during the design phase, and each of the functions forms a logical unit. The
advantages
that function coverage provides over the other types of coverage are as follows:
• Functions are easier to identify in a program and hence it is easier to write test
cases to provide function coverage.
• Since functions are at a much higher level of abstraction than code, it is easier to
achieve 100 percent function coverage than 100 percent coverage in any of the
earlier methods.
• Functions have a more logical mapping to requirements and hence can provide a
more direct correlation to the test coverage of the product.
• Function coverage provides a natural transition to black box testing.
Basis path testing
Basis path testing is a white-box testing technique first proposed by Tom McCabe.
The basis path method enables the test case designer to derive a logical complexity
measure of a procedural design and use this measure as a guide for defining a basis set
of execution paths. Test cases derived to exercise the basis set are guaranteed to
execute every statement in the program at least one time during testing. Before the
basis path method can be introduced, a simple notation for the representation of
control flow, called a flow graph (or program graph) must be introduced. In actuality,
the basis path method can be conducted without the use of flow graphs. However,
they serve as a useful tool for understanding control flow and illustrating the
approach.
Figure 12 - Flow graph notation
Figure 12 above maps the flowchart into a corresponding flow graph (assuming
that no compound conditions are contained in the decision diamonds of the
flowchart). Referring to Figure 12, each circle, called a flow graph node, represents one
or more procedural statements. A sequence of process boxes and a decision diamond
can map into a single node. The arrows on the flow graph, called edges or links,
represent flow of control and are analogous to flowchart arrows. An edge must
terminate at a node, even if the node does not represent any procedural statements
(e.g., see the symbol for the if-then-else construct). Areas bounded by edges and
nodes are called regions. When counting regions, we include the area outside the
graph as a region.
When compound conditions are encountered in a procedural design, the generation of
a flow graph becomes slightly more complicated. A compound condition occurs when
one or more Boolean operators (logical OR, AND, NAND, NOR) are present in a
conditional statement. Referring to Figure 12, the PDL segment translates into the
flow graph shown. Note that a separate node is created for each of the conditions a
and b in the statement IF a OR b. Each node that contains a condition is called a
predicate node and is characterized by two or more edges emanating from it.
2.2.3 Code complexity testing
Cyclomatic Complexity
Cyclomatic complexity is a software metric that provides a quantitative measure of the
logical complexity of a program. When used in the context of the basis path testing
method, the value computed for cyclomatic complexity defines the number of
independent paths in the basis set of a program and provides us with an upper bound
for the number of tests that must be conducted to ensure that all statements have been
executed at least once. An independent path is any path through the program that
introduces at least one new set of processing statements or a new condition.
Figure 13 - Flowchart, (A) and flow graph (B)
Figure 14 - Compound logic
In a flow graph, an independent path must move along at least one edge that has not
been traversed before the path is defined. For example, a set of independent paths for
the flow graph illustrated above is
path 1: 1-11
path 2: 1-2-3-4-5-10-1-11
path 3: 1-2-3-6-8-9-10-1-11
path 4: 1-2-3-6-7-9-10-1-11
Note that each new path introduces a new edge. The path
1-2-3-4-5-10-1-2-3-6-8-9-10-1-11 is not considered to be an independent path
because it is simply a combination of already specified paths and does not traverse
any new edges.
Paths 1, 2, 3, and 4 constitute a basis set for the flow graph in the Figure given above.
That is, if tests can be designed to force execution of these paths (a basis set), every
statement in the program will have been guaranteed to be executed at least one time
and every condition will have been executed on its true and false sides. It should be
noted that the basis set is not unique. In fact, a number of different basis sets can be
derived for a given procedural design.
Cyclomatic complexity is a useful metric for predicting those modules that are likely
to be error prone. It can be used for test planning as well as test case design.
How do we know how many paths to look for? The computation of cyclomatic
complexity provides the answer. Cyclomatic complexity has a foundation in graph
theory and provides us with an extremely useful software metric. Complexity is
computed in one of three ways:
1. The number of regions of the flow graph corresponds to the cyclomatic complexity.
2. Cyclomatic complexity, V(G), for a flow graph, G, is defined as V(G) = E - N + 2,
where E is the number of flow graph edges and N is the number of flow graph nodes.
3. Cyclomatic complexity, V(G), for a flow graph, G, is also defined as V(G) = P + 1,
where P is the number of predicate nodes contained in the flow graph G.
Referring once more to the flow graph in Figure 14, the cyclomatic complexity can be
computed using each of the algorithms just noted:
1. The flow graph has four regions.
2. V(G) = 11 edges - 9 nodes + 2 = 4.
3. V(G) = 3 predicate nodes + 1 = 4.
Therefore, the cyclomatic complexity of the flow graph in Figure 14 is 4. More
important, the value for V(G) provides us with an upper bound for the number of
independent paths that form the basis set and, by implication, an upper bound on the
number of tests that must be designed and executed to guarantee coverage of all
program statements.
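As a rough Python sketch (the if_then_else edge list is a hypothetical graph, not one of the figures), the second and third formulas can be computed directly:

```python
def cyclomatic(edge_list):
    """V(G) = E - N + 2, computed from a list of directed edges."""
    nodes = {n for edge in edge_list for n in edge}
    return len(edge_list) - len(nodes) + 2

def cyclomatic_from_predicates(p):
    """V(G) = P + 1, where P is the number of predicate nodes."""
    return p + 1

# A hypothetical if-then-else flow graph: node 1 is the predicate,
# nodes 2 and 3 are the two branches, node 4 is the join point.
if_then_else = [(1, 2), (1, 3), (2, 4), (3, 4)]
print(cyclomatic(if_then_else))           # 2
print(cyclomatic_from_predicates(1))      # 2

# The counts quoted in the text for the Figure 14 flow graph:
print(11 - 9 + 2, 3 + 1)                  # 4 4
```

Both formulas agree on the same graph, which is a useful sanity check when deriving V(G) by hand.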
Deriving Test Cases
The basis path testing method can be applied to a procedural design or to source
code. In this section, we present basis path testing as a series of steps. The procedure
average, depicted in PDL below, will be used as an example to illustrate each
step in the test case design method. Note that average, although an extremely simple
algorithm, contains compound conditions and loops. The following steps can be
applied to derive the basis set:
applied to derive the basis set:
Figure 15 - Flow graph for the procedure average
1. Using the design or code as a foundation, draw a corresponding flow graph. A
flow graph is created using the symbols and construction rules. The corresponding
flow graph is in the figure given above.
2. Determine the cyclomatic complexity of the resultant flow graph. The
cyclomatic complexity, V(G), is determined by applying the algorithms given earlier.
It should be noted that V(G) can be determined without developing a flow graph by
counting all
conditional statements in the PDL (for the procedure average, compound conditions
count as two) and adding 1.
Referring to Figure,
V(G) = 6 regions
V(G) = 17 edges - 13 nodes + 2 = 6
V(G) = 5 predicate nodes + 1 = 6
3. Determine a basis set of linearly independent paths. The value of V(G) provides
the number of linearly independent paths through the program control structure. In the
case of procedure average, we expect to specify six paths:
path 1: 1-2-10-11-13
path 2: 1-2-10-12-13
path 3: 1-2-3-10-11-13
path 4: 1-2-3-4-5-8-9-2-. . .
path 5: 1-2-3-4-5-6-8-9-2-. . .
path 6: 1-2-3-4-5-6-7-8-9-2-. . .
The ellipsis (. . .) following paths 4, 5, and 6 indicates that any path through the
remainder of the control structure is acceptable. It is often worthwhile to identify
predicate nodes as an aid in the derivation of test cases. In this case, nodes 2, 3, 5, 6,
and 10 are predicate nodes.
4. Prepare test cases that will force execution of each path in the basis set. Data
should be chosen so that conditions at the predicate nodes are appropriately set as
each path is tested. Test cases that satisfy the basis set just described are
PROCEDURE average;
* This procedure computes the average of 100 or fewer numbers that lie between
  bounding values; it also computes the sum and the total number valid.
INTERFACE RETURNS average, total.input, total.valid;
INTERFACE ACCEPTS value, minimum, maximum;
TYPE value[1:100] IS SCALAR ARRAY;
TYPE average, total.input, total.valid,
     minimum, maximum, sum IS SCALAR;
TYPE i IS INTEGER;
i = 1;
total.input = total.valid = 0;
sum = 0;
DO WHILE value[i] <> -999 AND total.input < 100
    increment total.input by 1;
    IF value[i] >= minimum AND value[i] <= maximum
        THEN increment total.valid by 1;
             sum = sum + value[i]
        ELSE skip
    ENDIF
    increment i by 1;
ENDDO
IF total.valid > 0
    THEN average = sum / total.valid;
    ELSE average = -999;
ENDIF
END average
Path 1 test case:
value(k) = valid input, where k < i for 2 ≤ i ≤ 100
value(i) = -999 where 2 ≤ i ≤ 100
Expected results: Correct average based on k values and proper totals.
Note: Path 1 cannot be tested stand-alone but must be tested as part of path 4, 5, and 6
tests.
Path 2 test case:
value(1) = -999
Expected results: Average = -999; other totals at initial values.
Path 3 test case:
Attempt to process 101 or more values.
First 100 values should be valid.
Expected results: Same as test case 1.
Path 4 test case:
value(i) = valid input where i < 100
value(k) < minimum where k < i
Expected results: Correct average based on k values and proper totals.
Path 5 test case:
value(i) = valid input where i < 100
value(k) > maximum where k <= i
Expected results: Correct average based on n values and proper totals.
Path 6 test case:
value(i) = valid input where i < 100
Expected results: Correct average based on n values and proper totals.
Each test case is executed and compared to expected results. Once all test cases have
been completed, the tester can be sure that all statements in the program have been
executed at least once. It is important to note that some independent paths (e.g., path 1
in our example) cannot be tested in stand-alone fashion. That is, the combination of
data required to traverse the path cannot be achieved in the normal flow of the
program. In such cases, these paths are tested as part of another path test.
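For readers who prefer running code, the PDL procedure average can be sketched in Python roughly as follows (Python lists are 0-indexed, and a bounds check on the list length is added, so this is an approximation of the PDL rather than a literal transcription):

```python
def average(values, minimum, maximum):
    """Sketch of the PDL procedure `average`: processes up to 100 values,
    stopping at the sentinel -999, and averages those lying within
    [minimum, maximum]. Returns (average, total_input, total_valid)."""
    i = 0
    total_input = total_valid = 0
    total = 0
    while i < len(values) and values[i] != -999 and total_input < 100:
        total_input += 1
        if minimum <= values[i] <= maximum:
            total_valid += 1
            total += values[i]
        i += 1
    if total_valid > 0:
        avg = total / total_valid
    else:
        avg = -999          # no valid input at all
    return avg, total_input, total_valid

# Path 2 (immediate sentinel) and a nominal run through paths 4-6:
print(average([-999], 0, 100))              # (-999, 0, 0)
print(average([10, 20, 30, -999], 0, 100))  # (20.0, 3, 3)
```

Feeding this function the inputs described in the path test cases above is a quick way to confirm that each predicate outcome is actually reachable.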
Graph Matrices
The procedure for deriving the flow graph and even determining a set of basis paths
is amenable to mechanization. To develop a software tool that assists in basis path
testing, a data structure, called a graph matrix, can be quite useful. A graph matrix is
a square matrix whose size (i.e., number of rows and columns) is equal to the number
of nodes on the flow graph. Each row and column corresponds to an identified node,
and matrix entries correspond to connections (an edge) between nodes. Each node on
the flow graph is identified by numbers, while each edge is identified by letters. A
letter entry is made in the matrix to correspond to a connection between two nodes.
For example, node 3 is connected to node 4 by edge b.
To this point, the graph matrix is nothing more than a tabular representation of a
flow graph. However, by adding a link weight to each matrix entry, the graph matrix
can become a powerful tool for evaluating program control structure during testing.
The link weight provides additional information about control flow. In its simplest
form, the link weight is 1 (a connection exists) or 0 (a connection does not exist). But
link weights can be assigned other, more interesting properties:
• The probability that a link (edge) will be executed.
• The processing time expended during traversal of a link.
• The memory required during traversal of a link.
• The resources required during traversal of a link.
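A minimal Python sketch of a graph matrix with 0/1 link weights, for a hypothetical four-node if-then-else flow graph (not one of the figures):

```python
# A graph matrix for a 4-node flow graph; entry [i][j] = 1 means an edge
# from node i+1 to node j+1 (link weight 1 = connection exists, 0 = none).
matrix = [
    [0, 1, 1, 0],   # node 1 -> nodes 2 and 3 (a predicate node)
    [0, 0, 0, 1],   # node 2 -> node 4
    [0, 0, 0, 1],   # node 3 -> node 4
    [0, 0, 0, 0],   # node 4 (exit)
]

# With 0/1 link weights, any node whose row sums to 2 or more has two
# or more edges emanating from it, i.e. it is a predicate node, so the
# matrix also yields V(G) = P + 1 mechanically.
predicates = sum(1 for row in matrix if sum(row) >= 2)
print(predicates + 1)   # 2
```

Replacing the 1s with probabilities, processing times, or resource costs turns the same structure into the weighted graph matrix described above.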
Control structure testing
The basis path testing technique is one of a number of techniques for control
structure testing. Although basis path testing is simple and highly effective, it is not
sufficient in itself. In this section, other variations on control structure testing are
discussed. These broaden testing coverage and improve quality of white-box testing.
Condition Testing
Condition testing is a test case design method that exercises the logical conditions
contained in a program module. A simple condition is a Boolean variable or a
relational expression, possibly preceded with one NOT (¬) operator. A relational
expression takes the form
E1 <relational-operator> E2
where E1 and E2 are arithmetic expressions and <relational-operator> is one of the
following: <, ≤, =, ≠(nonequality), >, or ≥. A compound condition is composed of two
or more simple conditions, Boolean operators, and parentheses. We assume that
Boolean operators allowed in a compound condition include OR (|), AND (&) and
NOT (¬). A condition without relational expressions is referred to as a Boolean
expression.
Therefore, the possible types of elements in a condition include a Boolean operator, a
Boolean variable, a pair of Boolean parentheses (surrounding a simple or compound
condition), a relational operator, or an arithmetic expression. If a condition is
incorrect, then at least one component of the condition is incorrect. Therefore, types
of errors in a condition include the following:
• Boolean operator error (incorrect/missing/extra Boolean operators).
• Boolean variable error.
• Boolean parenthesis error.
• Relational operator error.
• Arithmetic expression error.
The condition testing method focuses on testing each condition in the program.
Condition testing strategies (discussed later in this section) generally have two
advantages. First, measurement of test coverage of a condition is simple. Second, the
test coverage of conditions in a program provides guidance for the generation of
additional tests for the program. The purpose of condition testing is to detect not only
errors in the conditions of a program but also other errors in the program.
If a test set for a program P is effective for detecting errors in the conditions
contained in P, it is likely that this test set is also effective for detecting other
errors in P. In addition, if a testing strategy is effective for detecting errors in a
condition, then it is likely that this strategy will also be effective for detecting
errors in a program. A number of condition testing strategies
have been proposed. Branch testing is probably the simplest condition testing
strategy. For a compound condition C, the true and false branches of C and every
simple condition in C need to be executed at least once.
Domain testing requires three or four tests to be derived for a relational expression.
For a relational expression of the form E1 <relational-operator> E2 three tests are
required to make the value of E1 greater than, equal to, or less than that of E2. If
<relational-operator> is incorrect and E1 and E2 are correct, then these three tests
guarantee the detection of the relational operator error. To detect errors in E1 and E2,
a test that makes the value of E1 greater or less than that of E2 should make the
difference between these two values as small as possible.
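A sketch of how the three domain tests might be derived for a relational expression (the helper domain_tests and the step size of 1 are illustrative assumptions):

```python
def domain_tests(e2, delta=1):
    """For a relational expression E1 <op> E2, derive the three domain
    tests: E1 greater than, equal to, and less than E2, keeping the
    difference between the two values as small as practical (delta)."""
    return [e2 + delta, e2, e2 - delta]

# For a condition such as `x > 10`, the derived E1 values are:
print(domain_tests(10))   # [11, 10, 9]
```

Any single wrong relational operator must give a different verdict than the correct one on at least one of these three values, which is why three tests suffice.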
For a Boolean expression with n variables, all of the 2^n possible tests are required (n >
0). This strategy can detect Boolean operator, variable, and parenthesis errors, but it is
practical only if n is small. Error-sensitive tests for Boolean expressions can also be
derived. For a singular Boolean expression (a Boolean expression in which each
Boolean variable occurs only once) with n Boolean variables (n > 0), we can easily
generate a test set with fewer than 2^n tests such that this test set guarantees the
detection of multiple Boolean operator errors and is also effective for detecting other
errors.
Tai suggests a condition testing strategy that builds on the techniques just outlined.
Called BRO (branch and relational operator) testing, the technique guarantees the
detection of branch and relational operator errors in a condition provided that all
Boolean variables and relational operators in the condition occur only once and have
no common variables. The BRO strategy uses condition constraints for a condition C.
A condition constraint for C with n simple conditions is defined as (D1, D2, . . ., Dn),
where Di (0 < i ≤ n) is a symbol specifying a constraint on the outcome of the ith
simple condition in condition C. A condition constraint D for condition C is said to be
covered by an execution of C if, during this execution of C, the outcome of each
simple condition in C satisfies the corresponding constraint in D.
For a Boolean variable, B, we specify a constraint on the outcome of B that states that
B must be either true (t) or false (f). Similarly, for a relational expression, the symbols
>, =, < are used to specify constraints on the outcome of the expression.
As an example, consider the condition
C1: B1 & B2
where B1 and B2 are Boolean variables. The condition constraint for C1 is of the form
(D1, D2), where each of D1 and D2 is t or f. The value (t, f) is a condition constraint
for C1 and is covered by the test that makes the value of B1 to be true and the value of
B2 to be false. The BRO testing strategy requires that the constraint set {(t, t), (f, t), (t,
f)} be covered by the executions of C1. If C1 is incorrect due to one or more Boolean
operator errors, at least one of the constraint set will force C1 to fail.
As a second example, consider a condition of the form
C2: B1 & (E3 = E4)
where B1 is a Boolean expression and E3 and E4 are arithmetic expressions. A
condition constraint for C2 is of the form (D1, D2), where each of D1 is t or f and D2
is >, =, <. Since C2 is the same as C1 except that the second simple condition in C2 is
a relational expression, we can construct a constraint set for C2 by modifying the
constraint set {(t, t), (f, t), (t, f)} defined for C1. Note that t for (E3 = E4) implies =
and that f for (E3 = E4) implies either < or >. By replacing (t, t) and (f, t) with (t, =)
and (f, =), respectively, and by replacing (t, f) with (t, <) and (t, >), the resulting
constraint set for C2 is {(t, =), (f, =), (t, <), (t, >)}. Coverage of the preceding
constraint set will guarantee detection of Boolean and relational operator errors in C2.
As a third example, we consider a condition of the form
C3: (E1 > E2) & (E3 = E4)
where E1, E2, E3 and E4 are arithmetic expressions. A condition constraint for C3 is
of the form (D1, D2), where each of D1 and D2 is >, =, <. Since C3 is the same as C2
except that the first simple condition in C3 is a relational expression, we can construct
a constraint set for C3 by modifying the constraint set for C2, obtaining {(>, =), (=,
=), (<, =), (>, >), (>, <)}Coverage of this constraint set will guarantee detection of
relational operator errors in C3.
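The first example can be checked mechanically. The sketch below (hypothetical, using Python lambdas for the correct condition and a mutant containing a Boolean operator error) shows that the BRO constraint set for C1 exposes an & mistakenly written as |:

```python
# The BRO constraint set {(t, t), (f, t), (t, f)} for C1: B1 & B2, and a
# check that it distinguishes C1 from a mutant with a Boolean operator error.
constraints = [(True, True), (False, True), (True, False)]

correct = lambda b1, b2: b1 and b2
mutant  = lambda b1, b2: b1 or b2   # single Boolean operator error (& -> |)

detected = any(correct(b1, b2) != mutant(b1, b2) for b1, b2 in constraints)
print(detected)   # True: (f, t) and (t, f) both expose the faulty operator
```

Note that the constraint (t, t) alone would not reveal this error, which is why the full constraint set must be covered.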
Data Flow Testing
The data flow testing method selects test paths of a program according to the
locations of definitions and uses of variables in the program. To illustrate the data
flow testing approach, assume that each statement in a program is assigned a unique
statement number and that each function does not modify its parameters or global
variables.
For a statement with S as its statement number,
DEF(S) = {X | statement S contains a definition of X}
USE(S) = {X | statement S contains a use of X}
If statement S is an if or loop statement, its DEF set is empty and its USE set is based
on the condition of statement S. The definition of variable X at statement S is said to
be live at statement S' if there exists a path from statement S to statement S' that
contains no other definition of X.
A definition-use (DU) chain of variable X is of the form [X, S, S'], where S and S' are
statement numbers, X is in DEF(S) and USE(S'), and the definition of X in statement S
is live at statement S'.
One simple data flow testing strategy is to require that every DU chain be covered at
least once. We refer to this strategy as the DU testing strategy. It has been shown that
DU testing does not guarantee the coverage of all branches of a program. However, a
branch is not guaranteed to be covered by DU testing only in rare situations such as
if-then-else constructs in which the then part has no definition of any variable and the
else part does not exist. In this situation, the else branch of the if statement is not
necessarily covered by DU testing.
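A minimal sketch of DU-chain derivation for straight-line code (the four-statement program and the du_chains helper are hypothetical; real data flow analysis must also follow branches):

```python
# DEF/USE sets for a small straight-line listing (statement number -> set).
# Hypothetical program:
#   1: x = input()    2: y = x + 1    3: x = y * 2    4: print(x, y)
DEF = {1: {"x"}, 2: {"y"}, 3: {"x"}, 4: set()}
USE = {1: set(), 2: {"x"}, 3: {"y"}, 4: {"x", "y"}}

def du_chains(DEF, USE):
    """All [X, S, S'] with X defined at S, used at S', and no
    intervening redefinition of X (straight-line code only)."""
    chains = []
    stmts = sorted(DEF)
    for s in stmts:
        for x in DEF[s]:
            for s2 in stmts:
                if s2 <= s:
                    continue
                if x in USE[s2]:
                    chains.append([x, s, s2])
                if x in DEF[s2]:   # a redefinition kills the chain
                    break
    return chains

print(du_chains(DEF, USE))
```

Notice that no chain [x, 1, 4] is produced: the definition of x at statement 1 is not live at statement 4 because statement 3 redefines it.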
Loop Testing
Figure 16 - Classes of loops
Loops are the cornerstone for the vast majority of all algorithms implemented in
software. And yet, we often pay them little heed while conducting software tests.
Loop testing is a white-box testing technique that focuses exclusively on the validity
of loop constructs. Four different classes of loops can be defined: simple loops,
concatenated loops, nested loops, and unstructured loops.
Simple loops. The following set of tests can be applied to simple loops, where n is the
maximum number of allowable passes through the loop.
1. Skip the loop entirely.
2. Only one pass through the loop.
3. Two passes through the loop.
4. m passes through the loop where m < n.
5. n - 1, n, and n + 1 passes through the loop.
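This test set can be sketched for a hypothetical simple loop in Python (here n, the maximum number of passes, is 5, and sum_first is an illustrative function, not from the text):

```python
def sum_first(values, limit):
    """Sum at most `limit` values -- a simple loop with a maximum of
    `limit` passes."""
    total = 0
    for i, v in enumerate(values):
        if i >= limit:
            break
        total += v
    return total

n = 5                 # maximum allowable passes
data = [1] * 10       # all ones, so the result equals the pass count
# The simple-loop test set: 0, 1, 2, m (< n), n-1, n, n+1 passes.
for passes in (0, 1, 2, 3, n - 1, n, n + 1):
    assert sum_first(data[:passes], n) == min(passes, n)
print("simple-loop test set passed")
```

The n + 1 case is the interesting one: it attempts one pass more than the loop allows and confirms that the bound actually holds.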
Nested loops. If we were to extend the test approach for simple loops to nested loops,
the number of possible tests would grow geometrically as the level of nesting
increases. This would result in an impractical number of tests. Beizer suggests an
approach that will help to reduce the number of tests:
1. Start at the innermost loop. Set all other loops to minimum values.
2. Conduct simple loop tests for the innermost loop while holding the outer loops at
their minimum iteration parameter (e.g., loop counter) values. Add other tests for
out-of-range or excluded values.
3. Work outward, conducting tests for the next loop, but keeping all other outer loops
at minimum values and other nested loops to "typical" values.
4. Continue until all loops have been tested.
Complex loop structures are another hiding place for bugs. It is well worth spending
time designing tests that fully exercise loop structures.
Concatenated loops. Concatenated loops can be tested using the approach defined for
simple loops, if each of the loops is independent of the other. However, if two loops
are concatenated and the loop counter for loop 1 is used as the initial value for loop 2,
then the loops are not independent. When the loops are not independent, the approach
applied to nested loops is recommended.
Unstructured loops. Whenever possible, this class of loops should be redesigned to
reflect the use of the structured programming constructs.
Challenges in white box testing
White box testing requires a sound knowledge of the program code and the
programming language. This means that the developers should get intimately
involved in white box testing. Developers, in general, do not like to perform testing
functions. This applies to structural testing as well as static testing methods such as
reviews. In addition, because of the timeline pressures, the programmers may not find
time for reviews.
• The human tendency of a developer being unable to find the defects in his or her
own code.
• Fully tested code may not correspond to realistic scenarios.
These challenges do not mean that white box testing is ineffective. But when
white-box testing is carried out and these challenges are addressed by other means of
testing, there is a higher likelihood of more effective testing.
2.3 INTEGRATION TESTING
A neophyte in the software world might ask a seemingly legitimate question once all
modules have been unit tested: "If they all work individually, why do you doubt that
they'll work when we put them together?" The problem, of course, is "putting them
together"—interfacing. Data can be lost across an interface; one module can have an
inadvertent, adverse effect on another; subfunctions, when combined, may not
produce the desired major function; individually acceptable imprecision may be
magnified to unacceptable levels; global data structures can present problems. Sadly,
the list goes on and on. Integration testing is a systematic technique for constructing
the program structure while at the same time conducting tests to uncover errors
associated with interfacing.
The objective is to take unit tested components and build a program structure that has
been dictated by design. There is often a tendency to attempt nonincremental
integration; that is, to construct the program using a "big bang" approach. All
components are combined in advance. The entire program is tested as a whole. And
chaos usually results! A set of errors is encountered. Correction is difficult because
isolation of causes is complicated by the vast expanse of the entire program. Once
these errors are corrected, new ones appear and the process continues in a seemingly
endless loop.
Incremental integration is the antithesis of the big bang approach. The program is
constructed and tested in small increments, where errors are easier to isolate and
correct; interfaces are more likely to be tested completely; and a systematic test
approach may be applied. In the sections that follow, a number of different
incremental integration strategies are discussed.
2.3.1 Top-down Integration
Figure 17 - Top- Down integration testing
Top-down integration testing is an incremental approach to construction of program
structure. Modules are integrated by moving downward through the control hierarchy,
beginning with the main control module (main program). Modules subordinate (and
ultimately subordinate) to the main control module are incorporated into the structure
in either a depth-first or breadth-first manner. Referring to Figure.17, depth-first
integration would integrate all components on a major control path of the structure.
Selection of a major path is somewhat arbitrary and depends on application-specific
characteristics. For example, selecting the left-hand path, components M1, M2, and
M5 would be integrated first. Next, M8 or (if necessary for proper functioning of M2)
M6 would be integrated. Then, the central and right-hand control paths are built.
Breadth-first integration incorporates all components directly subordinate at each
level, moving across the structure horizontally. From the figure, components M2, M3,
and M4 (a replacement for stub S4) would be integrated first. The next control level,
M5, M6, and so on, follows.
The integration process is performed in a series of five steps:
1. The main control module is used as a test driver and stubs are substituted for all
components directly subordinate to the main control module.
2. Depending on the integration approach selected (i.e., depth or breadth first),
subordinate stubs are replaced one at a time with actual components.
3. Tests are conducted as each component is integrated.
4. On completion of each set of tests, another stub is replaced with the real
component.
5. Regression testing may be conducted to ensure that new errors have not been
introduced.
The process continues from step 2 until the entire program structure is built. The
top-down integration strategy verifies major control or decision points early in the test
process. In a well-factored program structure, decision making occurs at upper levels
in the hierarchy and is therefore encountered first. If major control problems do exist,
early recognition is essential. If depth-first integration is selected, a complete function
of the software may be implemented and demonstrated.
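The stub-and-replace cycle in the five steps above can be sketched in Python (the report module and its subordinate fetch_total component are hypothetical):

```python
# Hypothetical main control module that depends on a subordinate component,
# passed in as a parameter so a stub can substitute for it during testing.
def report(fetch_total):
    """Main control module: formats a total obtained from a subordinate
    component."""
    return f"total={fetch_total()}"

# Step 1: a stub substitutes for the not-yet-integrated component.
def fetch_total_stub():
    return 42   # canned answer; the real component would compute it

assert report(fetch_total_stub) == "total=42"

# Steps 2-4: the stub is replaced with the actual component and the
# same test is conducted again (regression, step 5).
def fetch_total_real():
    return sum([10, 12, 20])

assert report(fetch_total_real) == "total=42"
print("top-down integration sketch passed")
```

Passing the dependency as a parameter is one simple way to make the stub substitution of step 2 possible; mocking libraries automate the same idea.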
The incoming path may be integrated in a top-down manner. All input processing
(for subsequent transaction dispatching) may be demonstrated before other elements
of the structure have been integrated. Early demonstration of functional capability is a
confidence builder for both the developer and the customer.
Top-down strategy sounds relatively uncomplicated, but in practice, logistical
problems can arise. The most common of these problems occurs when processing at
low levels in the hierarchy is required to adequately test upper levels. Stubs replace
low level modules at the beginning of top-down testing; therefore, no significant data
can flow upward in the program structure. The tester is left with three choices:
(1) Delay many tests until stubs are replaced with actual modules,
(2) Develop stubs that perform limited functions that simulate the actual module, or
(3) Integrate the software from the bottom of the hierarchy upward.
The first approach (delay tests until stubs are replaced by actual modules) causes us
to lose some control over correspondence between specific tests and incorporation of
specific modules. This can lead to difficulty in determining the cause of errors and
tends to violate the highly constrained nature of the top-down approach. The second
approach is workable but can lead to significant overhead, as stubs become more and
more complex. The third approach, called bottom-up testing, is discussed in the next
section.
2.3.2 Bottom-up Integration
Bottom-up integration testing, as its name implies, begins construction and testing
with atomic modules (i.e., components at the lowest levels in the program structure).
Because components are integrated from the bottom up, processing required for
components subordinate to a given level is always available and the need for stubs is
eliminated.
A bottom-up integration strategy may be implemented with the following steps:
1. Low-level components are combined into clusters (sometimes called builds) that
perform a specific software subfunction.
2. A driver (a control program for testing) is written to coordinate test case input and
output.
3. The cluster is tested.
4. Drivers are removed and clusters are combined moving upward in the program
structure.
Figure 18 - Bottom- Up Integration
Integration follows the pattern illustrated in Figure.18. Components are combined to
form clusters 1, 2, and 3. Each of the clusters is tested using a driver (shown as a
dashed block). Components in clusters 1 and 2 are subordinate to Ma. Drivers D1 and
D2 are removed and the clusters are interfaced directly to Ma. Similarly, driver D3 for
cluster 3 is removed prior to integration with module Mb. Both Ma and Mb will
ultimately be integrated with component Mc, and so forth. Bottom-up integration
eliminates the need for complex stubs.
As integration moves upward, the need for separate test drivers lessens. In fact, if the
top two levels of program structure are integrated top down, the number of drivers can
be reduced substantially and integration of clusters is greatly simplified.
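Steps 1-3 of the bottom-up strategy can be sketched in Python (the two atomic modules and the driver are hypothetical):

```python
# Two low-level components combined into a cluster (step 1).
def parse_price(text):           # atomic module 1
    return float(text)

def apply_tax(amount, rate):     # atomic module 2
    return round(amount * (1 + rate), 2)

def driver():
    """Test driver (step 2): feeds test case input to the cluster and
    checks the output (step 3); it is removed (step 4) once the cluster
    is combined with its parent module."""
    cases = [("100.0", 0.1, 110.0), ("19.99", 0.0, 19.99)]
    for text, rate, expected in cases:
        assert apply_tax(parse_price(text), rate) == expected
    return True

print(driver())   # True
```

Because both components already exist when the driver runs, no stubs are needed, which is exactly the advantage claimed for bottom-up integration.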
Comments on Integration Testing
There has been much discussion of the relative advantages and disadvantages of
top-down versus bottom-up integration testing. In general, the advantages of one
strategy tend to result in disadvantages for the other strategy. The major disadvantage
of the top-down approach is the need for stubs and the attendant testing difficulties
that can be associated with them. Problems associated with stubs may be offset by the
advantage of testing major control functions early. The major disadvantage of
bottom-up integration is that "the program as an entity does not exist until the last
module is added". This drawback is tempered by easier test case design and a lack of
stubs.
Selection of an integration strategy depends upon software characteristics and,
sometimes, project schedule. In general, a combined approach (sometimes called
sandwich testing) that uses top-down tests for upper levels of the program structure,
coupled with bottom-up tests for subordinate levels may be the best compromise. As
integration testing is conducted, the tester should identify critical modules. A critical
module has one or more of the following characteristics:
(1) Addresses several software requirements,
(2) Has a high level of control (resides relatively high in the program structure),
(3) Is complex or error prone (cyclomatic complexity may be used as an indicator), or
(4) Has definite performance requirements.
Critical modules should be tested as early as possible. In addition, regression tests
should focus on critical module function.
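Cyclomatic complexity, named in item (3) as an indicator, can be computed from a module's control-flow graph as V(G) = E - N + 2 (E edges, N nodes, one connected component assumed). A sketch with invented flow-graph sizes and a commonly cited rule-of-thumb threshold:

```python
# Sketch: flagging critical modules by cyclomatic complexity V(G) = E - N + 2.
# The module names and flow-graph sizes below are invented for illustration.

def cyclomatic_complexity(num_edges, num_nodes):
    """V(G) = E - N + 2 for a single connected control-flow graph."""
    return num_edges - num_nodes + 2

def critical_modules(modules, threshold=10):
    """Return names of modules whose V(G) exceeds a chosen threshold.
    A limit of 10 is a widely quoted rule of thumb, not a fixed standard."""
    return [name for name, (edges, nodes) in modules.items()
            if cyclomatic_complexity(edges, nodes) > threshold]

if __name__ == "__main__":
    flow_graphs = {"billing": (24, 12), "display": (9, 8), "auth": (30, 15)}
    # billing: V(G)=14, display: V(G)=3, auth: V(G)=17
    print("Test these early:", critical_modules(flow_graphs))
```

Modules flagged this way would be scheduled early in the integration order and covered by regression tests.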
2.3.3 Integration Test Documentation
An overall plan for integration of the software and a description of specific tests are
documented in a Test Specification. This document contains a test plan and a test
procedure, is a work product of the software process, and becomes part of the
software configuration. The test plan describes the overall strategy for integration.
Testing is divided into phases and builds that address specific functional and
behavioral characteristics of the software. For example, integration testing for a CAD
system might be divided into the following test phases:
• User interaction (command selection, drawing creation, display representation, error
processing and representation).
• Data manipulation and analysis (symbol creation, dimensioning, rotation,
computation of physical properties).
• Display processing and generation (two-dimensional displays, three-dimensional
displays, graphs and charts).
• Database management (access, update, integrity, performance).
Each of these phases and sub phases (denoted in parentheses) delineates a broad
functional category within the software and can generally be related to a specific
domain of the program structure. Therefore, program builds (groups of modules) are
created to correspond to each phase. The following criteria and corresponding tests
are applied for all test phases:
Interface integrity. Internal and external interfaces are tested as each module (or
cluster) is incorporated into the structure.
Functional validity. Tests designed to uncover functional errors are conducted.
Information content. Tests designed to uncover errors associated with local or global
data structures are conducted.
Performance. Tests designed to verify performance bounds established during
software design are conducted.
A schedule for integration, the development of overhead software, and related topics
are also discussed as part of the test plan. Start and end dates for each phase are
established and "availability windows" for unit tested modules are defined. A brief
description of overhead software (stubs and drivers) concentrates on characteristics
that might require special effort. Finally, test environment and resources are
described.
Unusual hardware configurations, exotic simulators, and special test tools or
techniques are a few of many topics that may also be discussed.
The order of integration and corresponding tests at each integration step are
described. A listing of all test cases (annotated for subsequent reference) and expected
results is also included. A history of actual test results, problems, or peculiarities is
recorded in the Test Specification. Information contained in this section can be vital
during software maintenance. Like all other elements of a software configuration, the
test specification format may be tailored to the local needs of a software engineering
organization. It is important to note, however, that an integration strategy (contained
in a test plan) and testing details (described in a test procedure) are essential
ingredients and must appear.
Validation testing
At the culmination of integration testing, software is completely assembled as a
package, interfacing errors have been uncovered and corrected, and a final series of
software tests—validation testing—may begin. Validation can be defined in many
ways, but a simple (albeit harsh) definition is that validation succeeds when software
functions in a manner that can be reasonably expected by the customer. At this point a
battle-hardened software developer might protest: "Who or what is the arbiter of
reasonable expectations?" Reasonable expectations are defined in the Software
Requirements Specification— a document that describes all user-visible attributes of
the software. The specification contains a section called Validation Criteria.
Information contained in that section forms the basis for a validation testing approach.
Validation Test Criteria
Software validation is achieved through a series of black-box tests that demonstrate
conformity with requirements. A test plan outlines the classes of tests to be conducted
and a test procedure defines specific test cases that will be used to demonstrate
conformity with requirements. Both the plan and procedure are designed to ensure
that all functional requirements are satisfied, all behavioral characteristics are
achieved, all performance requirements are attained, documentation is correct, and
human-engineered and other requirements are met (e.g., transportability,
compatibility, error recovery, maintainability).
After each validation test case has been conducted, one of two possible conditions
exists:
(1) The function or performance characteristics conform to specification and are
accepted or
(2) A deviation from specification is uncovered and a deficiency list is created.
A deviation or error discovered at this stage in a project can rarely be corrected prior to
scheduled delivery. It is often necessary to negotiate with the customer to establish a
method for resolving deficiencies.
Configuration Review
An important element of the validation process is a configuration review. The intent
of the review is to ensure that all elements of the software configuration have been
properly developed, are cataloged, and have the necessary detail to bolster the support
phase of the software life cycle.
2.3.4 Alpha and Beta Testing
It is virtually impossible for a software developer to foresee how the customer will
really use a program. Instructions for use may be misinterpreted; strange
combinations of data may be regularly used; output that seemed clear to the tester
may be unintelligible to a user in the field. When custom software is built for one
customer, a series of acceptance tests are conducted to enable the customer to validate
all requirements. Conducted by the end user rather than software engineers, an
acceptance test can range from an informal "test drive" to a planned and
systematically executed series of tests. In fact, acceptance testing can be conducted
over a period of weeks or months, thereby uncovering cumulative errors that might
degrade the system over time.
If software is developed as a product to be used by many customers, it is impractical
to perform formal acceptance tests with each one. Most software product builders use
a process called alpha and beta testing to uncover errors that only the end-user seems
able to find.
The alpha test is conducted at the developer's site by a customer. The software is used
in a natural setting with the developer "looking over the shoulder" of the user and
recording errors and usage problems. Alpha tests are conducted in a controlled
environment.
The beta test is conducted at one or more customer sites by the end-user of the
software. Unlike alpha testing, the developer is generally not present. Therefore, the
beta test is a "live" application of the software in an environment that cannot be
controlled by the developer. The customer records all problems (real or imagined) that
are encountered during beta testing and reports these to the developer at regular
intervals. As a result of problems reported during beta tests, software engineers make
modifications and then prepare for release of the software product to the entire
customer base.
2.4 SYSTEM AND ACCEPTANCE TESTING
The testing conducted on the complete integrated products and solutions to evaluate
system compliance with specified requirements on functional and non-functional
aspects is called system testing. System testing is conducted with the objective of
finding product-level defects and building confidence before the product is released to
the customer. Since system testing is the last phase of testing before the release, not
all defects can be fixed in code in time due to time and effort needed in development
and testing and due to the potential risk involved in any last-minute changes.
Hence, an impact analysis is done for those defects to reduce the risk of releasing a
product with defects. The analysis of defects and their classification into various
categories also gives an idea about the kind of defects that will be found by the
customer after release. This information helps in planning some activities such as
providing workarounds, documentation on alternative approaches, and so on. Hence,
system testing helps in reducing the risk of releasing a product.
2.4.1 System testing
System testing is defined as a testing phase conducted on the complete integrated
system, to evaluate the system's compliance with its specified requirements. It is done
after unit, component and integration testing phases. System testing is the only phase
of testing which tests both the functional and non-functional aspects of the product.
On the functional side, system testing focuses on real-life customer usage of the
product and solutions. System testing simulates customer deployments.
On the non-functional side, system testing brings in different testing types, some of
which are as follows.
1. Performance/Load testing
2. Scalability testing
3. Reliability testing
4. Stress testing
5. Interoperability testing
6. Localization testing
Software is only one element of a larger computer-based system. Ultimately,
software is incorporated with other system elements (e.g., hardware, people,
information), and a series of system integration and validation tests are conducted.
These tests fall outside the scope of the software process and are not conducted solely
by software engineers. However, steps taken during software design and testing can
greatly improve the probability of successful software integration in the larger system.
A classic system testing problem is "finger-pointing." This occurs when an error is
uncovered, and each system element developer blames the other for the problem.
Rather than indulging in such nonsense, the software engineer should anticipate
potential interfacing problems and
(1) Design error-handling paths that test all information coming from other elements
of the system,
(2) Conduct a series of tests that simulate bad data or other potential errors at the
software interface,
(3) Record the results of tests to use as "evidence" if finger-pointing does occur, and
(4) Participate in planning and design of system tests to ensure that software is
adequately tested.
System testing is actually a series of different tests whose primary purpose is to fully
exercise the computer-based system. Although each test has a different purpose, all
work to verify that system elements have been properly integrated and perform
allocated functions. In the sections that follow, we discuss the types of system tests
that are worthwhile for software-based systems.
To summarize, system testing is done for the following reasons.
• Provide independent perspective in testing
• Bring in customer perspective in testing
• Provide a “fresh pair of eyes” to discover defects not found earlier by testing
• Test product behavior in a holistic, complete and realistic environment
• Test both functional and non-functional aspects of the product
• Build confidence in the product
• Analyze and reduce the risk of releasing the product
• Ensure all requirements are met and ready the product for acceptance testing.
Functional system testing
Functional testing is performed at different phases and the focus is on product level
features. As functional testing is performed at various testing phases, there are two
obvious problems. One is duplication and the other is gray areas. Duplication refers to
the same tests being performed multiple times, and gray area refers to certain tests
being missed out in all the phases. Gray areas in testing happen due to lack of product
knowledge, lack of knowledge of customer usage, and lack of co-ordination across
test teams. There are multiple ways system functional testing is performed. There are
also many ways product level test cases are derived for functional testing.
Functional vs non-functional testing

Testing aspect            Functional testing                      Non-functional testing
--------------------------------------------------------------------------------------
Involves                  Product features and functionality      Quality factors
Tests                     Product behavior                        Behavior and experience
Result conclusion         Simple steps written to check           Huge data collected and
                          expected results                        analyzed
Result varies due to      Product implementation                  Product implementation,
                                                                  resources, and configurations
Testing focus             Defect detection                        Qualification of product
Knowledge required        Product and domain                      Product, domain, design,
                                                                  architecture, statistical skills
Failures normally due to  Code                                    Architecture, design, and code
Testing phase             Unit, component, integration, system    System
Test case repeatability   Repeated many times                     Repeated only in case of
                                                                  failures and for different
                                                                  configurations
Configuration             One-time setup for a set of              Configuration changes for each
                          test cases                              test case
Some of the common techniques are given below:
• Design and architecture verification
• Business vertical testing
• Deployment testing
• Beta testing
• Certification, standards, and testing for compliance.
Non-functional testing
The process followed by non-functional testing is similar to that of functional testing
but differs in the aspects of complexity, knowledge requirement, effort needed, and
number of times the test cases are repeated. Since repeating non-functional test cases
involves more time, effort, and resources, the process for non-functional testing has to
be more robust than that for functional testing to minimize the need for repetition.
This is achieved by having more stringent entry/exit criteria, better planning, and by
setting up the configuration with data population in advance for test execution.
Recovery Testing
Many computer-based systems must recover from faults and resume processing
within a prespecified time. In some cases, a system must be fault tolerant; that is,
processing faults must not cause overall system function to cease. In other cases, a
system failure must be corrected within a specified period of time or severe economic
damage will occur.
Recovery testing is a system test that forces the software to fail in a variety of ways
and verifies that recovery is properly performed. If recovery is automatic (performed
by the system itself), reinitialization, checkpointing mechanisms, data recovery, and
restart are evaluated for correctness. If recovery requires human intervention, the
mean-time-to-repair (MTTR) is evaluated to determine whether it is within acceptable
limits.
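For the manual-recovery case, MTTR can be estimated directly from the fault and recovery timestamps recorded during the test. A sketch, with invented timestamps and a hypothetical acceptable limit:

```python
# Sketch: evaluating mean-time-to-repair (MTTR) from recovery-test logs.
# Timestamps and the acceptable limit are invented for illustration; a real
# recovery test would record them automatically.
from datetime import datetime

def mttr_minutes(incidents):
    """incidents: list of (fault_time, recovered_time) pairs.
    Returns the mean repair time in minutes."""
    total = sum((end - start).total_seconds() for start, end in incidents)
    return total / len(incidents) / 60

if __name__ == "__main__":
    log = [
        (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 12)),    # 12 min
        (datetime(2024, 1, 1, 14, 30), datetime(2024, 1, 1, 14, 48)), # 18 min
    ]
    limit = 20  # minutes, taken from a hypothetical requirement
    mttr = mttr_minutes(log)
    print(f"MTTR = {mttr:.1f} min ({'within' if mttr <= limit else 'exceeds'} limit)")
```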
Scalability testing
The objective of scalability testing is to find out the maximum capability of the
product parameters. As the exercise involves finding the maximum, the resources that
are needed for this kind of testing are normally very high. At the beginning of the
scalability exercise, there may not be an obvious clue about the maximum capability
of the system. Hence a high-end configuration is selected and the scalability
parameter is increased step by step to reach the maximum capability.
Failures during scalability test include the system not responding, or the system
crashing and so on. Scalability tests help in identifying the major bottlenecks in a
product. When resources are found to be the bottleneck, they are increased after
validating the assumptions mentioned. Scalability tests are performed on different
configurations to check the product’s behavior.
There can be some bottlenecks during scalability testing, which will require certain
OS parameters and product parameters to be tuned. “Number of open files” and
“Number of product threads” are some examples of parameters that may need tuning.
When such tuning is performed, it should be appropriately documented. A document
containing such tuning parameters and the recommended values of other product and
environmental parameters for attaining the scalability numbers is called a sizing
guide. This guide is one of the mandatory deliverables from scalability testing.
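The step-by-step increase of a scalability parameter until the maximum capability is found can be sketched as a simple loop. The load function here only simulates a product with an arbitrary capacity, so the loop has something to discover; in a real exercise it would drive the actual system:

```python
# Sketch: stepping a scalability parameter up until the system fails.
# apply_load is a stand-in for driving the real product; it is simulated
# with an invented capacity so the loop has something to find.

SIMULATED_CAPACITY = 750  # pretend the product falls over above this load

def apply_load(concurrent_users):
    """Return True if the system handled the load, False on failure
    (no response, crash, and so on). Simulated for illustration."""
    return concurrent_users <= SIMULATED_CAPACITY

def find_max_capability(start=100, step=100):
    """Increase the parameter step by step; report the last passing value,
    which feeds the sizing guide."""
    load, last_good = start, 0
    while apply_load(load):
        last_good = load
        load += step
    return last_good

if __name__ == "__main__":
    print("Maximum capability:", find_max_capability())
```

Any OS or product parameters tuned to reach the reported number would be documented alongside it in the sizing guide.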
Reliability testing
Reliability testing is done to evaluate the product’s ability to perform its required
functions under stated conditions for a specified period of time or for a large number
of iterations. Examples of reliability include querying a database continuously for 48
hours and performing login operations 10,000 times.
The reliability of a product should not be confused with reliability testing. Reliability
here is an all-encompassing term used to mean all the quality factors and functionality
aspects of the product. This product reliability is achieved by focusing on the
following activities:
Defined engineering processes
Review of work products at each stage
Change management procedures
Review of testing coverage
Ongoing monitoring of the product
Reliability testing, on the other hand, refers to testing the product for a continuous
period of time. Reliability testing only delivers a “reliability tested product” but not a
reliable product. The main factor that is taken into account for reliability testing is
defects.
To summarize, a “reliability tested product” will have the following characteristics:
• No errors or very few errors from repeated transactions
• Zero downtime
• Optimum utilization of resources.
• Consistent performance and response time of the product for repeated
transactions for a specified time duration
• No side-effects after the repeated transactions are executed.
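A reliability run of the "repeat the operation many times" kind can be sketched as follows. The transaction is a trivial stand-in for a real login or database query, and the iteration count is far smaller than the 10,000 operations or 48-hour loops mentioned above:

```python
# Sketch: reliability testing by repeating a transaction many times, counting
# errors and checking that response time stays consistent. The transaction is
# a trivial stand-in for a real operation such as a login.
import time

def transaction():
    """Stand-in for the operation under test; always succeeds in this sketch."""
    return sum(range(1000)) == 499500

def reliability_run(iterations=10_000):
    """Repeat the transaction, returning (error count, worst/mean time ratio).
    A large ratio would indicate inconsistent response times."""
    errors, timings = 0, []
    for _ in range(iterations):
        start = time.perf_counter()
        if not transaction():
            errors += 1
        timings.append(time.perf_counter() - start)
    return errors, max(timings) / (sum(timings) / len(timings))

if __name__ == "__main__":
    errors, worst_vs_mean = reliability_run()
    print(f"errors={errors}, worst/mean response-time ratio={worst_vs_mean:.1f}")
```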
Security Testing
Any computer-based system that manages sensitive information or causes actions
that can improperly harm (or benefit) individuals is a target for improper or illegal
penetration. Penetration spans a broad range of activities: hackers who attempt to
penetrate systems for sport; disgruntled employees who attempt to penetrate for
revenge; dishonest individuals who attempt to penetrate for illicit personal gain.
Security testing attempts to verify that protection mechanisms built into a system will,
in fact, protect it from improper penetration. To quote Beizer: "The system's security
must, of course, be tested for invulnerability from frontal attack—but must also be
tested for invulnerability from flank or rear attack."
During security testing, the tester plays the role(s) of the individual who desires to
penetrate the system. Anything goes! The tester may attempt to acquire passwords
through external clerical means; may attack the system with custom software
designed to break down any defenses that have been constructed; may overwhelm the
system, thereby denying service to others; may purposely cause system errors, hoping
to penetrate during recovery; may browse through insecure data, hoping to find the
key to system entry.
Given enough time and resources, good security testing will ultimately penetrate a
system. The role of the system designer is to make penetration cost more than the
value of the information that will be obtained.
Stress Testing
During earlier software testing steps, white-box and black-box techniques resulted in
thorough evaluation of normal program functions and performance. Stress tests are
designed to confront programs with abnormal situations. In essence, the tester who
performs stress testing asks: "How high can we crank this up before it fails?" Stress
testing executes a system in a manner that demands resources in abnormal quantity,
frequency, or volume. For example,
(1) special tests may be designed that generate ten interrupts per second, when one or
two is the average rate,
(2) input data rates may be increased by an order of magnitude to determine how
input functions will respond,
(3) test cases that require maximum memory or other resources are executed,
(4) test cases that may cause thrashing in a virtual operating system are designed,
(5) test cases that may cause excessive hunting for disk-resident data are created.
Essentially, the tester attempts to break the program.
A variation of stress testing is a technique called sensitivity testing. In some situations
(the most common occur in mathematical algorithms), a very small range of data
contained within the bounds of valid data for a program may cause extreme and even
erroneous processing or profound performance degradation. Sensitivity testing
attempts to uncover data combinations within valid input classes that may cause
instability or improper processing.
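Sensitivity testing can be sketched as a sweep over the valid input range, flagging the narrow sub-range where processing degrades. The routine under test is contrived for illustration: it divides by (x - 1), so inputs near 1.0, although still valid, blow up:

```python
# Sketch: sensitivity testing - sweep a valid input range and flag the narrow
# sub-range where a mathematical routine degrades. The routine is contrived:
# it divides by (x - 1), so valid inputs near 1.0 produce extreme results.

def routine(x):
    """Hypothetical algorithm under test; valid input is 0 <= x <= 2, x != 1."""
    return 1.0 / (x - 1.0)

def sensitive_inputs(samples=2000, blowup=100.0):
    """Return valid inputs whose result magnitude exceeds a sanity bound."""
    flagged = []
    for i in range(samples + 1):
        x = 2.0 * i / samples
        if abs(x - 1.0) < 1e-12:   # skip the explicitly excluded point
            continue
        if abs(routine(x)) > blowup:
            flagged.append(x)      # extreme processing inside the valid range
    return flagged

if __name__ == "__main__":
    hits = sensitive_inputs()
    print(f"{len(hits)} sensitive inputs found, clustered around x = 1.0")
```

The output shows exactly the pattern sensitivity testing looks for: a small cluster of valid inputs causing extreme results while the rest of the range behaves normally.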
Interoperability testing
Interoperability testing is done to ensure that two or more products can exchange
information, use information, and work properly together. Systems can be
interoperable unidirectionally or bi-directionally. Unless two or more products are
designed for exchanging information, interoperability cannot be achieved. The
following are some guidelines that help in improving interoperability.
1. Consistency of information flow across systems
2. Changes to data representation as per the system requirements
3. Correlated interchange of messages and receiving appropriate responses
4. Communication and messages
5. Meeting quality factors.
2.4.2 ACCEPTANCE TESTING
Acceptance testing is a phase after system testing that is normally done by the
customers or representatives of the customers. The customer defines a set of test cases
that will be executed to qualify and accept the product. These test cases are executed
by the customers themselves to quickly judge the quality of the product before
deciding to buy the product. Acceptance test cases are normally small in number and
are not written with the intention of finding defects. Acceptance tests are written to
execute near real-life scenarios. Apart from verifying the functional requirements,
acceptance tests are run to verify the non-functional aspects of the system also.
Acceptance test cases failing at a customer site may cause the product to be rejected,
which may mean financial loss or rework of the product involving effort and time.
Acceptance criteria
Acceptance criteria-product acceptance
During the requirements phase, each requirement is associated with acceptance
criteria. It is possible that one or more requirements may be mapped to form
acceptance criteria. Whenever there are changes to requirements, the acceptance
criteria are accordingly modified and maintained. Acceptance testing is not meant for
executing test cases that have not been executed before. Hence, the existing test cases
are looked at and certain categories of test cases can be grouped to form acceptance
criteria.
Acceptance criteria – procedure acceptance
Acceptance criteria can be defined based on the procedures followed for delivery. An
example of procedure acceptance could be documentation and release media. Some
examples of acceptance criteria of this nature are as follows:
• User, administration and troubleshooting documentation should be part of the
release.
• Along with the binary code, the source code of the product with build scripts is
to be delivered on a CD.
• A minimum of 20 employees are trained on the product usage prior to
deployment.
These procedural acceptance criteria are verified/tested as part of acceptance testing.
Acceptance criteria – service level agreements
Service level agreements are generally part of a contract signed by the customer and
the product organization. The important contract items are taken and verified as part
of acceptance testing.
Selecting test cases for acceptance testing
This section gives some guideline on what test cases can be included for acceptance
testing:
• End-to-end functionality verification
• Domain tests
• User scenario tests
• Basic sanity tests
• New functionality
• A few non-functional tests
• Tests pertaining to legal obligations and service level agreements
• Acceptance test data
Executing acceptance tests
Sometimes the customers themselves do the acceptance tests. In such cases, the job of
the product organization is to assist the customers in acceptance testing and resolve
the issues that come out of it. If the acceptance testing is done by the product
organization, forming the acceptance test team becomes an important activity. An
acceptance test team usually comprises members who are involved in the day-to-day
activities of the product usage or are familiar with such scenarios. The product
management, support, and consulting team, who have good knowledge of the
customers, contribute to the acceptance testing definition and execution. They may
not be familiar with the testing process or the technical aspect of the software. But
they know whether the product does what it is intended to do. An acceptance test team
may be formed with 90% of them possessing the required business process knowledge
of the product and 10% being representatives of the technical testing team. The
number of test team members needed to perform acceptance testing is small when
compared to other phases of testing.
The role of the testing team members during and prior to acceptance test is crucial
since they may constantly interact with the acceptance team members. Test team
members help the acceptance members to get the required test data, select and identify
test cases, and analyze the acceptance test results. During test execution, the
acceptance test team reports its progress regularly. The defect reports are generated on
a periodic basis.
Defects reported during acceptance tests could be of different priorities. Test teams
help acceptance test team report defects. Showstopper and high-priority defects are
necessarily fixed before software is released. In case major defects are identified
during acceptance testing, then there is a risk of missing the release date. When the
defect fixes point to scope or requirement changes, then it may either result in the
extension of the release date to include the feature in the current release or get
postponed to subsequent releases. All resolutions of those defects are discussed with
the acceptance test team and their approval is obtained for concluding the completion
of acceptance testing.
2.5 Summary
White box testing requires a sound knowledge of the program code and the
programming language. This means that the developers should get intimately
involved in white box testing.
All testing activities that are conducted from the point where two components are
integrated to the point where all system components work together, are considered a
part of the integration testing phase. The integration testing phase involves developing
and executing test cases that cover multiple components and functionality. This
testing is both a type of testing and a phase of testing. Integration testing, if done
properly, can reduce the number of defects that will be found in the system testing
phase.
System testing is conducted with the objective of finding product-level defects and
building confidence before the product is released to the customer. System testing
is done to provide independent perspective in testing, bring in customer perspective in
testing, provide a fresh pair of eyes to discover defects not found earlier by testing,
test product in a holistic, complete, and realistic environment, test both functional and
non-functional aspects of a product and analyze and reduce the risk of releasing the
product. It ensures all requirements are met and readies the product for acceptance
testing.
Acceptance testing is a phase after system testing that is normally done by the
customers or representatives of the customer. Acceptance test cases failing at a
customer site may cause the product to be rejected, which may mean financial loss or
rework of the product involving effort and time.
2.6 Check Your Progress
1. Explain the types of testing and their importance.
2. What is white-box testing? Explain static testing.
3. What are the phases involved in structural testing? Explain.
4. What is code complexity testing? Explain with an example.
5. Why is integration testing needed? What are the two types of integration testing?
6. Why is system testing done? Explain.
7. Define beta testing and its importance.
8. Explain acceptance testing.
9. How will you select test cases for acceptance testing?
10. Explain the concepts of system and acceptance testing as a whole.
Unit III Testing Fundamental – 2 & Specialized Testing
Structure
3.0. Objectives
3.1 Introduction
3.2 Performance Testing
3.3 Regression Testing
3.4 Testing of Object Oriented System
3.5 Usability and Accessibility Testing
3.6 Summary
3.7 Check Your Progress
3.0 Objectives
To learn the types of specialized testing and their importance
To understand the importance of performance testing and its uses
To know how regression testing helps deliver software with improved quality
To understand how to test object-oriented systems and their features
To understand the importance of usability and accessibility testing as a
specialized testing methodology
3.1 Introduction
The testing performed to evaluate the response time, throughput, and utilization of
the system, to execute its required functions in comparison with different versions of
the same product or a different competitive product is called performance testing. In
this internet era, when more and more business is transacted online, there is a big and
understandable expectation that all applications run as fast as possible. When
applications run fast, a system can fulfill the business requirements quickly, and this
puts it in a position to expand its business and handle future needs as well. A system or a
product that is not able to service business transactions due to its slow performance is
a big loss for the product organization, and its customers. Hence performance is a
basic requirement for any product and is fast becoming a subject of great interest in
the testing community.
3.2 PERFORMANCE TESTING
For real-time and embedded systems, software that provides required function but
does not conform to performance requirements is unacceptable. Performance testing
is designed to test the run-time performance of software within the context of an
integrated system. Performance testing occurs throughout all steps in the testing
process. Even at the unit level, the performance of an individual module may be
assessed as white-box tests are conducted. However, it is not until all system elements
are fully integrated that the true performance of a system can be ascertained.
Performance tests are often coupled with stress testing and usually require both
hardware and software instrumentation. That is, it is often necessary to measure
resource utilization (e.g., processor cycles) in an exacting fashion. External
instrumentation can monitor execution intervals, log events (e.g., interrupts) as they
occur, and sample machine states on a regular basis. By instrumenting a system, the
tester can uncover situations that lead to degradation and possible system failure.
Figure 19 - The debugging process
There are many factors that govern performance testing. It is critical to understand
the definition and purpose of these factors prior to understanding the methodology for
performance testing and for analyzing the results. The capability of the system or the
product in handling multiple transactions is determined by a factor called throughput.
Throughput represents the number of request/business transactions processed by
the product in specified time duration. It is important to understand that the
throughput varies according to the load the product is subjected to. The “optimum
throughput” is represented by the saturation point and is the one that represents the
maximum throughput for the product.
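As a small sketch (the function name and measurement values below are illustrative, not from the text), the optimum throughput can be read off a series of load-level measurements as the maximum value observed, since beyond the saturation point adding load no longer increases throughput:

```c
#include <stddef.h>

/* tput[i] holds the throughput (transactions per second) measured
 * at the i-th load level.  The saturation point corresponds to the
 * maximum observed throughput: past it, increasing the load does
 * not increase throughput any further. */
double optimum_throughput(const double tput[], size_t n)
{
    double best = 0.0;
    for (size_t i = 0; i < n; i++)
        if (tput[i] > best)
            best = tput[i];
    return best;
}
```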
Response time can be defined as the delay between the point of request and the first
response from the product. In a typical client-server environment, throughput
represents the number of transactions that can be handled by the server and response
time represents the delay between the request and response.
In reality, not all the delay that happens between the request and the response is
caused by the product. In the networking scenario, the network or other products
which share the network resources can cause delays. This brings up yet another
factor for performance – latency. Latency is the delay caused by the application, the
operating system, and the environment; each of these can be calculated separately.
Fig. 20 - Example of latencies at various levels: network and application
To explain latency, let us take the example of a web application that provides a service
by talking to a web server and a database server connected over the network, as shown
in the figure above. From the figure, latency and response time can be calculated as
Network latency = N1 + N2 + N3 + N4
Product latency = A1 + A2 + A3
Actual response time = network latency + product latency
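The formulas above can be checked directly with some hypothetical millisecond figures (the values and function names are invented for illustration):

```c
/* N1..N4 are the four network hops and A1..A3 the three application
 * segments of Fig. 20; all values are in milliseconds. */
double network_latency(const double n[4])
{
    return n[0] + n[1] + n[2] + n[3];
}

double product_latency(const double a[3])
{
    return a[0] + a[1] + a[2];
}

/* Actual response time = network latency + product latency. */
double response_time(const double n[4], const double a[3])
{
    return network_latency(n) + product_latency(a);
}
```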
The next factor that governs the performance testing is tuning. Tuning is a procedure
by which the product performance is enhanced by setting different values to the
parameters of the product, operating system and other components.
Yet another factor that needs to be considered for performance testing is performance
of competitive products. This type of performance testing wherein competitive
products are compared is called benchmarking.
To summarize, performance testing is done to ensure that a product
• Processes the required number of transactions in any given interval
(throughput).
• Is available and running under different load conditions (availability).
• Responds fast enough under different load conditions (response time).
• Delivers a worthwhile return on investment for the resources spent – hardware
and software.
• Is comparable to, and better than, competing products on different
parameters.
Methodology for performance testing involves the following steps.
1. Collecting requirements
2. Writing test cases
3. Automating performance test cases
4. Executing performance test cases
5. Analyzing performance test results
6. Performance tuning
7. Performance benchmarking
8. Recommending right configuration for the customers
Tools for performance testing
There are two types of tools that can be used for performance testing – functional
performance tools and load testing tools.
• Functional performance tools help in recording and playing back the
transactions and obtaining performance numbers. This test involves very few
machines.
• Load testing tools simulate the load condition for performance testing without
requiring that many real users or machines.
Some popular performance tools are listed below:
Functional performance tools
• WinRunner from Mercury
• QA Partner from Compuware
• SilkTest from Segue
Load testing tools
• Load Runner from Mercury
• QA Load from Compuware
• Silk Performer from Segue
Process for performance testing
Performance testing follows the same process as any other type of testing. The only
difference lies in the greater level of detail and analysis required.
Ever-changing requirements for performance are a serious threat to the product as
performance can only be improved marginally by fixing it in code. Making the
requirements testable and measurable is the first activity needed for the success of
performance testing.
The next step in the performance testing process is to create a performance test plan.
This test plan needs to have the following details.
1. Resource requirements
2. Test bed (simulated and real life), test-lab setup
3. Responsibilities
4. Setting up product traces and audits
5. Entry and exit criteria
The Art of Debugging
Software testing is a process that can be systematically planned and specified. Test
case design can be conducted, a strategy can be defined, and results can be evaluated
against prescribed expectations.
Debugging occurs as a consequence of successful testing. That is, when a test case
uncovers an error, debugging is the process that results in the removal of the error.
Although debugging can and should be an orderly process, it is still very much an art.
A software engineer, evaluating the results of a test, is often confronted with a
"symptomatic" indication of a software problem. That is, the external manifestation of
the error and the internal cause of the error may have no obvious relationship to one
another. The poorly understood mental process that connects a symptom to a cause is
debugging.
The Debugging Process
Debugging is not testing but always occurs as a consequence of testing. The
debugging process begins with the execution of a test case. Results are assessed and a
lack of correspondence between expected and actual performance is encountered.
The debugging process will always have one of two outcomes:
(1) The cause will be found and corrected, or
(2) The cause will not be found. In the latter case, the person performing debugging
may suspect a cause, design a test case to help validate that suspicion, and work
toward error correction in an iterative fashion.
Why is debugging so difficult? In all likelihood, human psychology has more to do
with an answer than software technology. However, a few characteristics of bugs
provide some clues:
1. The symptom and the cause may be geographically remote. That is, the symptom
may appear in one part of a program, while the cause may actually be located at a site
that is far removed.
2. The symptom may disappear (temporarily) when another error is corrected.
3. The symptom may actually be caused by nonerrors (e.g., round-off inaccuracies).
4. The symptom may be caused by human error that is not easily traced.
5. The symptom may be a result of timing problems, rather than processing problems.
6. It may be difficult to accurately reproduce input conditions (e.g., a real-time
application in which input ordering is indeterminate).
7. The symptom may be intermittent. This is particularly common in embedded
systems that couple hardware and software inextricably.
8. The symptom may be due to causes that are distributed across a number of tasks
running on different processors.
3.3 REGRESSION TESTING
Each time a new module is added as part of integration testing, the software changes.
New data flow paths are established, new I/O may occur, and new control logic is
invoked. These changes may cause problems with functions that previously worked
flawlessly. In the context of an integration test strategy, regression testing is the
re-execution of some subset of tests that have already been conducted to ensure that
changes have not propagated unintended side effects. In a broader context, successful
tests (of any kind) result in the discovery of errors, and errors must be corrected.
Whenever software is corrected, some aspect of the software configuration (the
program, its documentation, or the data that support it) is changed. Regression testing
is the activity that helps to ensure that changes (due to testing or for other reasons) do
not introduce unintended behavior or additional errors.
Regression testing may be conducted manually, by re-executing a subset of all test
cases or using automated capture/playback tools. Capture/playback tools enable the
software engineer to capture test cases and results for subsequent playback and
comparison.
The regression test suite (the subset of tests to be executed) contains three different
classes of test cases:
• A representative sample of tests that will exercise all software functions.
• Additional tests that focus on software functions that are likely to be affected by the
change.
• Tests that focus on the software components that have been changed.
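A test case belongs in the regression suite if it falls into any of the three classes. A minimal sketch of that selection rule (the struct and field names are hypothetical, chosen only to mirror the three classes above):

```c
#include <stdbool.h>

/* One flag per class of regression test case described above. */
struct test_case {
    bool representative;   /* representative sample of all functions */
    bool likely_affected;  /* focuses on functions likely affected   */
    bool exercises_change; /* focuses on changed components          */
};

/* A test enters the regression suite if it matches any class. */
bool select_for_regression(const struct test_case *t)
{
    return t->representative || t->likely_affected || t->exercises_change;
}
```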
As integration testing proceeds, the number of regression tests can grow quite large.
Therefore, the regression test suite should be designed to include only those tests that
address one or more classes of errors in each of the major program functions. It is
impractical and inefficient to re-execute every test for every program function once a
change has occurred.
Types of regression testing
There are two types of regression testing in practice.
• Regular regression testing
• Final regression testing
Regular regression testing is done between test cycles to ensure that the defect fixes
that have been made and the functionality that was working in earlier test cycles
continue to work. Regular regression testing can use more than one product build for
the test cases to be executed. A build is an aggregation of all the defect fixes and
features that are present in the product.
Final regression testing is done to validate the final build before release.
It is necessary to perform regression testing when
• A reasonable amount of initial testing is already carried out.
• A good number of defects have been fixed.
• Defect fixes that can produce side-effects are taken care of.
How to do regression testing
A well-defined methodology for regression testing is very important, as this is among
the final types of testing normally performed just before release. The
methodology here is made up of the following steps:
• Performing an initial “smoke” or “sanity” test
• Understanding the criteria for selecting the test cases
• Classifying the test cases
• Methodology for selecting test cases
• Resetting the test cases for regression testing
Smoke Testing
Smoke testing is an integration testing approach that is commonly used when
“shrink-wrapped” software products are being developed. It is designed as a pacing
mechanism for time-critical projects, allowing the software team to assess its project
on a frequent basis. In essence, the smoke testing approach encompasses the
following activities:
1. Software components that have been translated into code are integrated into a
“build”. A build includes all data files, libraries, reusable modules, and engineered
components that are required to implement one or more product functions.
2. A series of tests is designed to expose errors that will keep the build from properly
performing its function. The intent should be to uncover “show stopper” errors that
have the highest likelihood of throwing the software project behind schedule.
3. The build is integrated with other builds and the entire product (in its current form)
is smoke tested daily. The integration approach may be top down or bottom up.
The daily frequency of testing the entire product may surprise some readers.
However, frequent tests give both managers and practitioners a realistic assessment of
integration testing progress. McConnell describes the smoke test in the following
manner:
The smoke test should exercise the entire system from end to end. It does not have to
be exhaustive, but it should be capable of exposing major problems. The smoke test
should be thorough enough that if the build passes, you can assume that it is stable
enough to be tested more thoroughly. Smoke testing provides a number of benefits
when it is applied on complex, time critical software engineering projects:
• Integration risk is minimized. Because smoke tests are conducted daily,
incompatibilities and other show-stopper errors are uncovered early, thereby reducing
the likelihood of serious schedule impact when errors are uncovered.
• The quality of the end-product is improved. Because the approach is construction
(integration) oriented, smoke testing is likely to uncover both functional errors and
architectural and component-level design defects. If these defects are corrected early,
better product quality will result.
• Error diagnosis and correction are simplified. Like all integration testing
approaches, errors uncovered during smoke testing are likely to be associated with
“new software increments”—that is, the software that has just been added to the
build(s) is a probable cause of a newly discovered error.
• Progress is easier to assess. With each passing day, more of the software has been
integrated and more has been demonstrated to work. This improves team morale and
gives managers a good indication that progress is being made.
Best practices in regression testing
Regression methodology can be applied when
1. We need to assess the quality of product between test cycles
2. We are doing a major release of a product, have executed all test cycles, and are
planning a regression test cycle for defect fixes and
3. We are doing a minor release of a product having only defect fixes, and we can
plan for regression test cycles to take care of those defect fixes.
The best practices are listed below:
• Regression can be used for all types of releases.
• Mapping defect identifiers with test cases improves regression quality
• Create and execute regression test bed daily
• Ask your best test engineer to select the test cases
• Detect defects, and protect your product from defects and defect fixes.
3.4 Testing of Object Oriented Systems
The objective of testing, stated simply, is to find the greatest possible number of
errors with a manageable amount of effort applied over a realistic time span. Although
this fundamental objective remains unchanged for object-oriented software, the nature
of OO programs changes both testing strategy and testing tactics. It might be argued
that, as OOA and OOD mature, greater reuse of design patterns will mitigate the need
for heavy testing of OO systems. Exactly the opposite is true. Binder discusses this
when he states:
Each reuse is a new context of usage and retesting is prudent. It seems likely that
more, not less, testing will be needed to obtain high reliability in object-oriented
systems.
The testing of OO systems presents a new set of challenges to the software engineer.
The definition of testing must be broadened to include error discovery techniques
(formal technical reviews) applied to OOA and OOD models. The completeness and
consistency of OO representations must be assessed as they are built. Unit testing
loses much of its meaning, and integration strategies change significantly. In
summary, both testing strategies and testing tactics must account for the unique
characteristics of OO software.
The architecture of object-oriented software results in a series of layered subsystems
that encapsulate collaborating classes. Each of these system elements (subsystems and
classes) performs functions that help to achieve system requirements. It is necessary
to test an OO system at a variety of different levels in an effort to uncover errors that
may occur as classes collaborate with one another and subsystems communicate
across architectural layers.
Who does it? Object-oriented testing is performed by software engineers and testing
specialists.
Why is it important? You have to execute the program before it gets to the customer
with the specific intent of removing all errors, so that the customer will not experience
the frustration associated with a poor-quality product. In order to find the highest
possible number of errors, tests must be conducted systematically and test cases must
be designed using disciplined techniques.
What are the steps? OO testing is strategically similar to the testing of conventional
systems, but it is tactically different. Because the OO analysis and design models are
similar in structure and content to the resultant OO program, “testing” begins with the
review of these models. Once code has been generated, OO testing begins “in the
small” with class testing. Problems that could occur during design (and would have
been avoided because of the earlier review) include:
1. Improper allocation of the class to subsystem and/or tasks may occur during system
design.
2. Unnecessary design work may be expended to create the procedural design for the
operations that address the extraneous attribute.
3. The messaging model will be incorrect (because messages must be designed for the
operations that are extraneous).
If the error remains undetected during design and passes into the coding activity,
considerable effort will be expended to generate code that implements an unnecessary
attribute, two unnecessary operations, messages that drive inter object
communication, and many other related issues. In addition, testing of the class will
absorb more time than necessary. Once the problem is finally uncovered, modification
of the system must be carried out with the ever-present potential for side effects that
are caused by change.
During later stages of their development, OOA and OOD models provide substantial
information about the structure and behavior of the system. For this reason, these
models should be subjected to rigorous review prior to the generation of code. All
object-oriented models should be tested (in this context, the term testing is used to
incorporate formal technical reviews) for correctness, completeness, and consistency
within the context of the model’s syntax, semantics, and pragmatics.
Testing OOA and OOD Models
Analysis and design models cannot be tested in the conventional sense, because they
cannot be executed. However, formal technical reviews can be used to examine the
correctness and consistency of both analysis and design models.
Correctness of OOA and OOD Models
The notation and syntax used to represent analysis and design models will be tied to
the specific analysis and design method that is chosen for the project. Hence, syntactic
correctness is judged on proper use of the symbology; each model is reviewed to
ensure that proper modeling conventions have been maintained. During analysis and
design, semantic correctness must be judged based on the model’s conformance to the
real world problem domain.
If the model accurately reflects the real world (to a level of detail that is appropriate
to the stage of development at which the model is reviewed), then it is semantically
correct. To determine whether the model does, in fact, reflect the real world, it should
be presented to problem domain experts, who will examine the class definitions and
hierarchy for omissions and ambiguity. Class relationships (instance connections) are
evaluated to determine whether they accurately reflect real world object connections.
Consistency of OOA and OOD Models
The consistency of OOA and OOD models may be judged by “considering the
relationships among entities in the model. An inconsistent model has representations
in one part that are not correctly reflected in other portions of the model”. To assess
consistency, each class and its connections to other classes should be examined. The
class-responsibility-collaboration model and an object-relationship diagram can be
used to facilitate this activity. The CRC model is composed of CRC index cards.
Each CRC card lists the class name, its responsibilities (operations), and its
collaborators (other classes to which it sends messages and on which it depends for
the accomplishment of its responsibilities). The collaborations imply a series of
relationships (i.e., connections) between classes of the OO system. The
object-relationship model provides a graphic representation of the connections
between classes. All of this information can be obtained from the OOA model.
To evaluate the class model the following steps have been recommended:
1. Revisit the CRC model and the object-relationship model. Cross check to
ensure that all collaborations implied by the OOA model are properly represented.
2. Inspect the description of each CRC index card to determine if a delegated
responsibility is part of the collaborator’s definition. For example, consider a class
defined for a point-of-sale checkout system, called credit sale. This class has a CRC
index card illustrated in Figure 21. For this collection of classes and collaborations,
we ask whether a responsibility (e.g., read credit card) is accomplished if delegated to
the named collaborator (credit card). That is, does the class credit card have an
operation that enables it to be read? In this case the answer is, “Yes.” The
object-relationship is traversed to ensure that all such connections are valid.
3. Invert the connection to ensure that each collaborator that is asked for service
is receiving requests from a reasonable source. For example, if the credit card
class receives a request for purchase amount from the credit sale class, there would
be a problem. Credit card does not know the purchase amount.
4. Using the inverted connections examined in step 3, determine whether other
classes might be required and whether responsibilities are properly grouped
among the classes.
5. Determine whether widely requested responsibilities might be combined into a
single responsibility. For example, read credit card and get authorization occur in
every situation. They might be combined into a validate credit request responsibility
that incorporates getting the credit card number and gaining authorization.
6. Steps 1 through 5 are applied iteratively to each class and through each
evolution of the OOA model.
Once the OOD model is created, reviews of the system design and the object design
should also be conducted. The system design depicts the overall product architecture,
the subsystems that compose the product, the manner in which subsystems are
allocated to processors, the allocation of classes to subsystems, and the design of the
user interface. The object model presents the details of each class and the messaging
activities that are necessary to implement collaborations between classes.
The system design is reviewed by examining the object-behavior model developed
during OOA and mapping required system behavior against the subsystems designed
to accomplish this behavior. Concurrency and task allocation are also reviewed within
the context of system behavior. The behavioral states of the system are evaluated to
determine which exist concurrently. Use-case scenarios are used to exercise the user
interface design.
Figure 21 - An example CRC index card used for reviewing the OOD model
Object Oriented Testing Strategies
The classical strategy for testing computer software begins with “testing in the small”
and works outward toward “testing in the large.” Stated in the jargon of software
testing, we begin with unit testing, then progress toward integration testing, and
culminate with validation and system testing. In conventional applications, unit
testing focuses on the smallest compilable program unit—the subprogram (e.g.,
module, subroutine, procedure, component). Once each of these units has been tested
individually, it is integrated into a program structure while a series of regression tests
are run to uncover errors due to interfacing between the modules and side effects
caused by the addition of new units. Finally, the system as a whole is tested to ensure
that errors in requirements are uncovered.
Unit Testing in the OO Context
When object-oriented software is considered, the concept of the unit changes.
Encapsulation drives the definition of classes and objects. This means that each class
and each instance of a class (object) packages attributes (data) and the operations
(also known as methods or services) that manipulate these data. Rather than testing an
individual module, the smallest testable unit is the encapsulated class or object.
Because a class can contain a number of different operations and a particular
operation may exist as part of a number of different classes, the meaning of unit
testing changes dramatically. We can no longer test a single operation in isolation (the
conventional view of unit testing) but rather as part of a class.
To illustrate, consider a class hierarchy in which an operation X is defined for the
super class and is inherited by a number of subclasses. Each subclass uses operation
X, but it is applied within the context of the private attributes and operations that have
been defined for the subclass. Because the context in which operation X is used varies
in subtle ways, it is necessary to test operation X in the context of each of the
subclasses. This means that testing operation X in a vacuum (the traditional unit
testing approach) is ineffective in the object-oriented context.
Class testing for OO software is the equivalent of unit testing for conventional
software. Unlike unit testing of conventional software, which tends to focus on the
algorithmic detail of a module and the data that flow across the module interface,
class testing for OO software is driven by the operations encapsulated by the class and
the state behavior of the class.
Integration Testing in the OO Context
Because object-oriented software does not have a hierarchical control structure,
conventional top-down and bottom-up integration strategies have little meaning. In
addition, integrating operations one at a time into a class (the conventional
incremental integration approach) is often impossible because of the “direct and
indirect interactions of the components that make up the class”.
There are two different strategies for integration testing of OO systems. The first,
thread-based testing, integrates the set of classes required to respond to one input or
event for the system. Each thread is integrated and tested individually. Regression
testing is applied to ensure that no side effects occur. The second integration
approach, use-based testing, begins the construction of the system by testing those
classes (called independent classes) that use very few (if any) server classes. After
the independent classes are tested, the next layer of classes, called dependent classes,
that use the independent classes are tested. This sequence of testing layers of
dependent classes continues until the entire system is constructed. Unlike
conventional integration, the use of drivers and stubs as replacement operations is to
be avoided, when possible.
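Use-based integration can be sketched as an ordering problem: among the classes not yet tested, the one with the fewest dependencies is integrated next, so independent classes come first (the function below is illustrative, not from the text):

```c
#include <stddef.h>

/* deps[i] = number of other classes that class i uses.
 * tested[i] is nonzero once class i has been integrated and tested.
 * Picks the untested class with the fewest dependencies, so
 * independent classes (deps == 0) go first; returns n when every
 * class has already been tested. */
size_t next_class_to_test(const size_t deps[], const int tested[], size_t n)
{
    size_t best = n;
    for (size_t i = 0; i < n; i++)
        if (!tested[i] && (best == n || deps[i] < deps[best]))
            best = i;
    return best;
}
```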
Cluster testing is one step in the integration testing of OO software. Here, a cluster of
collaborating classes (determined by examining the CRC and object-relationship
model) is exercised by designing test cases that attempt to uncover errors in the
collaborations.
Validation Testing in an OO Context
At the validation or system level, the details of class connections disappear. Like
conventional validation, the validation of OO software focuses on user-visible actions
and user-recognizable output from the system. To assist in the derivation of validation
tests, the tester should draw upon the use-cases that are part of the analysis model.
The use-case provides a scenario that has a high likelihood of uncovering errors in user
interaction requirements. Conventional black-box testing methods can be used to
drive validation tests. In addition, test cases may be derived from the object-behavior
model and from event flow diagrams created as part of OOA.
Test case Design for OO Software
Test case design methods for OO software are still evolving. However, an overall
approach to OO test case design has been defined by Berard :
1. Each test case should be uniquely identified and explicitly associated with the class
to be tested.
2. The purpose of the test should be stated.
3. A list of testing steps should be developed for each test and should contain:
a. A list of specified states for the object that is to be tested.
b. A list of messages and operations that will be exercised as a consequence of the
test.
c. A list of exceptions that may occur as the object is tested.
d. A list of external conditions (i.e., changes in the environment external to the
software that must exist in order to properly conduct the test).
e. Supplementary information that will aid in understanding or implementing the test.
Unlike conventional test case design, which is driven by an input-process-output view
of software or the algorithmic detail of individual modules, object-oriented testing
focuses on designing appropriate sequences of operations to exercise the states of a
class.
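To make this concrete, a "class" can be emulated in C as a struct plus its operations, and a test case then drives a sequence of operations through the object's states (the account example here is entirely hypothetical):

```c
#include <stdbool.h>

/* State of the emulated class: a balance and an open/closed flag. */
struct account {
    double balance;
    bool   open;
};

void account_open(struct account *a)
{
    a->balance = 0.0;
    a->open = true;
}

/* Deposits are only legal while the object is in the open state. */
bool account_deposit(struct account *a, double amount)
{
    if (!a->open || amount <= 0.0)
        return false;
    a->balance += amount;
    return true;
}

void account_close(struct account *a)
{
    a->open = false;
}
```

A state-based test case exercises a sequence such as open → deposit → close → deposit, checking that the final operation is rejected in the closed state rather than testing each operation in isolation.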
The Test Case Design Implications of OO Concepts
As we have already seen, the OO class is the target for test case design. Because
attributes and operations are encapsulated, testing operations outside of the class is
generally unproductive. Although encapsulation is an essential design concept for
OO, it can create a minor obstacle when testing. As Binder notes, “Testing requires
reporting on the concrete and abstract state of an object.” Yet, encapsulation can make
this information somewhat difficult to obtain. Unless built-in operations are provided
to report the values for class attributes, a snapshot of the state of an object may be
difficult to acquire. Inheritance also leads to additional challenges for the test case
designer. We have already noted that each new context of usage requires retesting,
even though reuse has been achieved.
In addition, multiple inheritance complicates testing further by increasing the
number of contexts for which testing is required. If subclasses instantiated from a
super class are used within the same problem domain, it is likely that the set of test
cases derived for the super class can be used when testing the subclass. However, if
the subclass is used in an entirely different context, the super class test cases will have
little applicability and a new set of tests must be designed.
Applicability of Conventional Test Case Design Methods
The white-box testing methods can be applied to the operations defined for a class.
Basis path, loop testing, or data flow techniques can help to ensure that every
statement in an operation has been tested. However, the concise structure of many
class operations causes some to argue that the effort applied to white-box testing
might be better redirected to tests at a class level.
Black-box testing methods are as appropriate for OO systems as they are for systems
developed using conventional software engineering methods. As we noted earlier in
this chapter, use-cases can provide useful input in the design of black-box and
state-based tests.
Fault-Based Testing
The object of fault-based testing within an OO system is to design tests that have a
high likelihood of uncovering plausible faults. Because the product or system must
conform to customer requirements, the preliminary planning required to perform fault
based testing begins with the analysis model. The tester looks for plausible faults (i.e.,
aspects of the implementation of the system that may result in defects). To determine
whether these faults exist, test cases are designed to exercise the design or code.
Consider a simple example; Software engineers often make errors at the boundaries of
a problem. For example, when testing a SQRT operation that returns errors for
negative numbers, we know to try the boundaries: a negative number close to zero
and zero itself. "Zero itself" checks whether the programmer made a mistake like
if (x > 0) calculate_the_square_root();
instead of the correct
if (x >= 0) calculate_the_square_root();
As another example, consider a Boolean expression:
if (a && !b || c)
Multicondition testing and related techniques probe for certain plausible faults in this
expression, such as
• && should be ||
• ! was left out where it was needed
• There should be parentheses around !b || c
For each plausible fault, we design test cases that will force the incorrect expression
to fail. In the previous expression, (a=0, b=0, c=0) will make the expression as given
evaluate false. If the && should have been ||, the code has done the wrong thing and
might branch to the wrong path.
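This distinguishing input can be checked mechanically by coding the expression as written next to its && → || mutant (the helper names are invented for illustration):

```c
/* The Boolean expression exactly as written in the text. */
int as_written(int a, int b, int c)
{
    return (a && !b || c) ? 1 : 0;
}

/* The plausible fault: && replaced by ||. */
int mutant(int a, int b, int c)
{
    return (a || !b || c) ? 1 : 0;
}
```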
Of course, the effectiveness of these techniques depends on how testers perceive a
"plausible fault." If real faults in an OO system are perceived to be "implausible,"
then this approach is really no better than any random testing technique. However, if
the analysis and design models can provide insight into what is likely to go wrong,
then fault-based testing can find significant numbers of errors with relatively low
expenditures of effort. Integration testing looks for plausible faults in operation calls
or message connections. Three types of faults are encountered in this context:
unexpected result, wrong operation/message used, incorrect invocation. To determine
plausible faults as functions (operations) are invoked, the behavior of the operation
must be examined. Integration testing applies to attributes as well as to operations.
The "behaviors" of an object are defined by the values that its attributes are assigned.
Testing should exercise the attributes to determine whether proper values occur for
distinct types of object behavior. It is important to note that integration testing
attempts to find errors in the client object, not the server. Stated in conventional
terms, the focus of integration testing is to determine whether errors exist in the
calling code, not the called code. The operation call is used as a clue, a way to find
test requirements that exercise the calling code.
The Impact of OO Programming on Testing
There are several ways object-oriented programming can have an impact on testing.
Depending on the approach to OOP,
• Some types of faults become less plausible (not worth testing for).
• Some types of faults become more plausible (worth testing now).
• Some new types of faults appear.
When an operation is invoked, it may be hard to tell exactly what code gets exercised.
That is, the operation may belong to one of many classes. Also, it can be hard to
determine the exact type or class of a parameter. When the code accesses it, it may get
an unexpected value. The difference can be understood by considering a conventional
function call:
x = func (y);
For conventional software, the tester need consider all behaviors attributed to func
and nothing more. In an OO context, the tester must consider the behaviors of
base::func(), of derived::func(), and so on. Each time func is invoked, the tester must
consider the union of all distinct behaviors. This is easier if good OO design practices
are followed and the difference between super classes and subclasses (in C++ jargon,
these are called base classes and derived classes) are limited. The testing approach for
base and derived classes is essentially the same. The difference is one of
bookkeeping.
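A minimal sketch of this point, using hypothetical classes of our own: the same call site obj.func(y) binds to a different method body depending on the dynamic class of obj, so the tester must consider the union of base and derived behaviors:

```python
# Minimal sketch with hypothetical classes: one call site, several bindings.
class Base:
    def func(self, y):
        return y + 1       # one behavior the call site may bind to

class Derived(Base):
    def func(self, y):
        return y * 2       # a different behavior behind the same call site

def client(obj, y):
    return obj.func(y)     # which func runs depends on type(obj)

# A conventional tester would test a single function; here each possible
# binding needs its own test case.
print(client(Base(), 5), client(Derived(), 5))
```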
Testing OO class operations is analogous to testing code that takes a function
parameter and then invokes it. Inheritance is a convenient way of producing
polymorphic operations. At the call site, what matters is not the inheritance, but the
polymorphism. Inheritance does make the search for test requirements more
straightforward. By virtue of OO software architecture and construction, are some
types of faults more plausible for an OO system and others less plausible? The answer
is, “Yes.” For example, because OO operations are generally smaller, more time tends
to be spent on integration because there are more opportunities for integration faults.
Therefore, integration faults become more plausible.
Test Cases and the Class Hierarchy
As noted earlier in this chapter, inheritance does not obviate the need for thorough
testing of all derived classes. In fact, it can actually complicate the testing process.
Consider the following situation. A class base contains the operations inherited() and
redefined(). A class derived redefines redefined() to serve in a local context. There is
little doubt that derived::redefined() has to be tested because it represents a new design
and new code. But does derived::inherited() have to be retested? If
derived::inherited() calls redefined and the behavior of redefined has changed,
derived::inherited() may mishandle the new behavior. Therefore, it needs new tests
even though the design and code have not changed. It is important to note, however,
that only a subset of all tests for derived::inherited() may have to be conducted. If part
of the design and code for inherited does not depend on redefined (i.e., it neither
calls it nor calls any code that indirectly calls it), that code need not be retested in the
derived class.
Base::redefined() and derived::redefined() are two different operations with different
specifications and implementations. Each would have a set of test requirements
derived from the specification and implementation. Those test requirements probe for
plausible faults: integration faults, condition faults, boundary faults, and so forth. But
the operations are likely to be similar. Their sets of test requirements will overlap.
The better the OO design, the greater is the overlap. New tests need to be derived only
for those derived::redefined() requirements that are not satisfied by the
base::redefined() tests.
To summarize, the base::redefined() tests are applied to objects of class derived.
Test inputs may be appropriate for both base and derived classes, but the expected
results may differ in the derived class.
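A small illustrative sketch (the operation names mirror the discussion; the bodies are hypothetical) shows why derived::inherited() needs retesting when it depends on a redefined operation, and why the same test inputs can yield different expected results in the derived class:

```python
# Illustrative sketch: inherited() is unchanged code, yet its behavior
# depends on redefined(), which Derived overrides.
class Base:
    def redefined(self):
        return 10

    def inherited(self):
        # unchanged design and code, but it calls redefined()
        return self.redefined() + 1

class Derived(Base):
    def redefined(self):
        return 20          # new design and new code: certainly needs tests

# The base-class test of inherited() ...
assert Base().inherited() == 11
# ... must be rerun against Derived, where the expected result differs even
# though inherited() itself did not change.
assert Derived().inherited() == 21
```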
Scenario-Based Test Design
Fault-based testing misses two main types of errors:
(1) Incorrect specifications and
(2) Interactions among subsystems.
When errors associated with incorrect specification occur, the product doesn't do
what the customer wants. It might do the wrong thing or it might omit important
functionality. But in either circumstance, quality (conformance to requirements)
suffers. Errors associated with subsystem interaction occur when the behavior of one
subsystem creates circumstances (e.g., events, data flow) that cause another
subsystem to fail.
Scenario-based testing concentrates on what the user does, not what the product
does. This means capturing the tasks (via use-cases) that the user has to perform, then
applying them and their variants as tests. Scenarios uncover interaction errors. But to
accomplish this, test cases must be more complex and more realistic than fault-based
tests. Scenario-based testing tends to exercise multiple subsystems in a single test
(users do not limit themselves to the use of one subsystem at a time).
As an example, consider the design of scenario-based tests for a text editor. Use cases
follow:
Use-Case: Fix the Final Draft
Background: It's not unusual to print the "final" draft, read it, and discover some
annoying errors that weren't obvious from the on-screen image. This use-case
describes the sequence of events that occurs when this happens.
1. Print the entire document.
2. Move around in the document, changing certain pages.
3. As each page is changed, it's printed.
4. Sometimes a series of pages is printed.
This scenario describes two things: a test and specific user needs. The user needs are
obvious: (1) a method for printing single pages and (2) a method for printing a range
of pages. As far as testing goes, there is a need to test editing after printing (as well as
the reverse). The tester hopes to discover that the printing function causes errors in
the editing function; that is, that the two software functions are not properly
independent.
Use-Case: Print a New Copy
Background: Someone asks the user for a fresh copy of the document. It must be
printed.
1. Open the document.
2. Print it.
3. Close the document.
Again, the testing approach is relatively obvious. Except that this document didn't
appear out of nowhere. It was created in an earlier task. Does that task affect this one?
In many modern editors, documents remember how they were last printed. By default,
they print the same way next time. After the Fix the Final Draft scenario, just
selecting "Print" in the menu and clicking the "Print" button in the dialog box will
cause the last corrected page to print again. So, according to the editor, the correct
scenario should look like this:
Use-Case: Print a New Copy
1. Open the document.
2. Select "Print" in the menu.
3. Check if you're printing a page range; if so, click to print the entire document.
4. Click on the Print button.
5. Close the document.
But this scenario indicates a potential specification error. The editor does not do what
the user reasonably expects it to do. Customers will often overlook the check noted in
step 3 above. They will then be annoyed when they trot off to the printer and find one
page when they wanted 100. Annoyed customers signal specification bugs. A test case
designer might miss this dependency in test design, but it is likely that the problem
would surface during testing. The tester would then have to contend with the probable
response, "That's the way it's supposed to work!"
Testing Surface Structure and Deep Structure
Surface structure refers to the externally observable structure of an OO program. That
is, the structure that is immediately obvious to an end-user. Rather than performing
functions, the users of many OO systems may be given objects to manipulate in some
way. But whatever the interface, tests are still based on user tasks. Capturing these
tasks involves understanding, watching, and talking with representative users (and as
many non representative users as are worth considering).
There will surely be some difference in detail. For example, in a conventional system
with a command-oriented interface, the user might use the list of all commands as a
testing checklist. If no test scenarios existed to exercise a command, testing has likely
overlooked some user tasks (or the interface has useless commands). In an object-based
interface, the tester might use the list of all objects as a testing checklist. The best tests
are derived when the designer looks at the system in a new or unconventional way.
For example, if the system or product has a command-based interface, more thorough
tests will be derived if the test case designer pretends that operations are independent
of objects. Ask questions like, "Might the user want to use this operation, which
applies only to the Scanner object, while working with the printer?" Whatever the
interface style, test case design that exercises the surface structure should use both
objects and operations as clues leading to overlooked tasks.
Deep structure refers to the internal technical details of an OO program. That is, the
structure that is understood by examining the design and/or code. Deep structure
testing is designed to exercise dependencies, behaviors, and communication
mechanisms that have been established as part of the system and object design of OO
software. The analysis and design models are used as the basis for deep structure
testing.
For example, the object-relationship diagram or the subsystem collaboration diagram
depicts collaborations between objects and subsystems that may not be externally
visible. The test case design then asks: “Have we captured (as a test) some task that
exercises the collaboration noted on the object-relationship diagram or the subsystem
collaboration diagram? If not, why not?”
Design representations of class hierarchy provide insight into inheritance structure.
Inheritance structure is used in fault-based testing. Consider a situation in which an
operation named caller has only one argument and that argument is a reference to a
base class. What might happen when caller is passed a derived class? What are the
differences in behavior that could affect caller? The answers to these questions might
lead to the design of specialized tests.
Testing Methods Applicable At the Class Level
Software testing begins “in the small” and slowly progresses toward testing “in the
large.” Testing in the small focuses on a single class and the methods that are
encapsulated by the class. Random testing and partitioning are methods that can be
used to exercise a class during OO testing.
Random Testing for OO Classes
To provide brief illustrations of these methods, consider a banking application in
which an account class has the following operations: open, setup, deposit, withdraw,
balance, summarize, creditLimit, and close. Each of these operations may be applied
to account, but certain constraints (e.g., the account must be opened before other
operations can be applied and closed after all operations are completed) are implied
by the nature of the problem. Even with these constraints, there are many
permutations of the operations. The minimum behavioral life history of an instance of
account includes the following operations:
• Open
• Setup
• Deposit
• Withdraw
• Close
This represents the minimum test sequence for account. However, a wide variety of
other behaviors may occur within this sequence:
open•setup•deposit•[deposit|withdraw|balance|summarize|creditLimit]n•withdraw•close
A variety of different operation sequences can be generated randomly. For example:
Test case r1: open•setup•deposit•deposit•balance•summarize•withdraw•close
Test case r2:
open•setup•deposit•withdraw•deposit•balance•creditLimit•withdraw•close
These and other random order tests are conducted to exercise different class instance
life histories.
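The random-sequence idea can be sketched as follows. The Account implementation here is a hypothetical stand-in (only the operation names come from the text); the generator always preserves the minimum life history and fills the middle with randomly chosen operations, as in test cases r1 and r2:

```python
# Illustrative sketch: the Account class is a hypothetical stand-in that only
# enforces the "open before use, close at the end" constraint from the text.
import random

class Account:
    def __init__(self):
        self.balance = 0
        self.is_open = False

    def open(self):
        self.is_open = True

    def setup(self):
        assert self.is_open

    def deposit(self, amount=10):
        assert self.is_open
        self.balance += amount

    def withdraw(self, amount=5):
        assert self.is_open
        self.balance -= amount

    def summarize(self):
        assert self.is_open
        return self.balance

    def close(self):
        self.is_open = False

def random_test_sequence(n, seed=None):
    """Minimum life history open-setup-deposit-...-withdraw-close, with n
    randomly chosen operations in the middle (cf. test cases r1 and r2)."""
    rng = random.Random(seed)
    middle = [rng.choice(["deposit", "withdraw", "summarize"]) for _ in range(n)]
    return ["open", "setup", "deposit"] + middle + ["withdraw", "close"]

def run(sequence):
    acct = Account()
    for op in sequence:
        getattr(acct, op)()  # replay one step of the life history
    return acct

print(random_test_sequence(4, seed=1))
```

Replaying many such randomly generated sequences exercises different class instance life histories; any assertion failure flags a violated constraint.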
Partition Testing at the Class Level
Partition testing reduces the number of test cases required to exercise the class in
much the same manner as equivalence partitioning for conventional software. Input
and output are categorized and test cases are designed to exercise each category. But
how are the partitioning categories derived?
State-based partitioning categorizes class operations based on their ability to change
the state of the class. Again considering the account class, state operations include
deposit and withdraw, whereas nonstate operations include balance, summarize, and
creditLimit. Tests are designed in a way that exercises operations that change state
and those that do not change state separately. Therefore,
Test case p1: open•setup•deposit•deposit•withdraw•withdraw•close
Test case p2: open•setup•deposit•summarize•creditLimit•withdraw•close
Test case p1 changes state, while test case p2 exercises operations that do not change
state (other than those in the minimum test sequence).
Attribute-based partitioning categorizes class operations based on the attributes that
they use. For the account class, the attributes balance and creditLimit can be used to
define partitions. Operations are divided into three partitions: (1) operations that use
creditLimit, (2) operations that modify creditLimit, and (3) operations that do not use
or modify creditLimit. Test sequences are then designed for each partition.
Category-based partitioning categorizes class operations based on the generic
function that each performs. For example, operations in the account class can be
categorized into initialization operations (open, setup), computational operations
(deposit, withdraw), queries (balance, summarize, creditLimit), and termination
operations (close).
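The state-based partitioning rule can be expressed as a small sketch (the partition sets come from the text; the helper that splits a test sequence by partition is our own):

```python
# Illustrative sketch: split a test sequence by state-based partition.
STATE_OPS = {"deposit", "withdraw"}                       # operations that change state
NONSTATE_OPS = {"balance", "summarize", "creditLimit"}    # query-only operations

def partition(sequence):
    """Split a sequence (ignoring the open/setup/close frame) into the
    state-changing and non-state operations it exercises."""
    frame = {"open", "setup", "close"}
    body = [op for op in sequence if op not in frame]
    state = [op for op in body if op in STATE_OPS]
    nonstate = [op for op in body if op in NONSTATE_OPS]
    return state, nonstate

p1 = ["open", "setup", "deposit", "deposit", "withdraw", "withdraw", "close"]
p2 = ["open", "setup", "deposit", "summarize", "creditLimit", "withdraw", "close"]
print(partition(p1))  # only state-changing operations
print(partition(p2))  # exercises the query partition as well
```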
Interclass Test Case Design
Test case design becomes more complicated as integration of the OO system begins.
It is at this stage that testing of collaborations between classes must begin. To
illustrate “interclass test case generation”, we expand the banking example to include
the classes and collaborations noted in Figure.
The direction of the arrows in the figure indicates the direction of messages and the
labeling indicates the operations that are invoked as a consequence of the
collaborations implied by the messages. Like the testing of individual classes, class
collaboration testing can be accomplished by applying random and partitioning
methods, as well as scenario-based testing and behavioral testing.
Multiple Class Testing
Kirani and Tsai suggest the following sequence of steps to generate multiple class
random test cases:
1. For each client class, use the list of class operations to generate a series of random
test sequences. The operations will send messages to other server classes.
2. For each message that is generated, determine the collaborator class and the
corresponding operation in the server object.
3. For each operation in the server object (that has been invoked by messages sent
from the client object), determine the messages that it transmits.
4. For each of the messages, determine the next level of operations that are invoked
and incorporate these into the test sequence.
To illustrate, consider a sequence of operations for the bank class relative to an ATM
class:
verifyAcct•verifyPIN•[[verifyPolicy•withdrawReq]|depositReq|acctInfoREQ]n
A random test case for the bank class might be
test case r3 = verifyAcct•verifyPIN•depositReq
In order to consider the collaborators involved in this test, the messages associated
with each of the operations noted in test case r3 are considered. Bank must collaborate
with ValidationInfo to execute the verifyAcct and verifyPIN. Bank must collaborate
with account to execute depositReq. Hence, a new test case that exercises these
collaborations is
test case r4 = verifyAcct(Bank)•[validAcct(ValidationInfo)]•verifyPIN(Bank)•
[validPin(ValidationInfo)]•depositReq(Bank)•[deposit(account)]
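Steps 1 through 4 above can be sketched as a simple expansion over a message map. The map below is a hypothetical encoding of the collaborations described for test case r3:

```python
# Illustrative sketch: a hypothetical encoding of the bank class collaborations,
# used to expand a client test sequence (steps 1-4) into an interclass sequence.
COLLABORATIONS = {
    # client (Bank) operation -> (collaborator class, operation it invokes)
    "verifyAcct": ("ValidationInfo", "validAcct"),
    "verifyPIN":  ("ValidationInfo", "validPin"),
    "depositReq": ("Account", "deposit"),
}

def expand(sequence):
    """Interleave each Bank operation with the collaborator operation
    invoked by the message it sends."""
    expanded = []
    for op in sequence:
        expanded.append(("Bank", op))
        if op in COLLABORATIONS:
            expanded.append(COLLABORATIONS[op])
    return expanded

r3 = ["verifyAcct", "verifyPIN", "depositReq"]  # test case r3 from the text
for cls, op in expand(r3):
    print(f"{op} on {cls}")
```

In a deeper system the expansion would recurse: each collaborator operation would in turn be looked up for the messages it transmits (step 3).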
[Figure: a class collaboration diagram showing the classes ATM user interface, ATM,
Cashier, Bank, Account, and Validation info. The message arrows are labeled with the
operations they invoke, including cardInserted, password, deposit, withdraw,
accntStatus, terminate, verifyStatus, depositStatus, dispenseCash, printAccntStat,
readCardInfo, getCashAmnt, verifyAcct, verifyPIN, verifyPolicy, withdrawReq,
depositReq, acctInfo, openAcct, initialDeposit, authorizeCard, deauthorize, closeAcct,
validPIN, validAcct, creditLimit, accntType, balance, and close.]
Figure 22- Class collaboration diagram for banking application
The approach for multiple class partition testing is similar to the approach used for
partition testing of individual classes. However, the test sequence is expanded to
include those operations that are invoked via messages to collaborating classes. An
alternative approach partitions tests based on the interfaces to a particular class.
Referring to the above figure, the bank class receives messages from the ATM and
cashier classes. The methods within bank can therefore be tested by partitioning
them into those that serve ATM and those that serve cashier. State-based partitioning
can be used to refine the partitions further.
Tests Derived from Behavior Models
The state transition diagram is a model that represents the dynamic behavior of a
class. The STD for a class can be used to help derive a sequence of tests that will
exercise the dynamic behavior of the class (and those classes that collaborate with it).
The state model can be traversed in a “breadth-first” manner. In this context, breadth
first implies that a test case exercises a single transition and that when a new
transition is to be tested only previously tested transitions are used.
Consider the credit card object discussed in the previous section. The initial state of
credit card is undefined (i.e., no credit card number has been provided). Upon
reading the credit card during a sale, the object takes on a defined state; that is, the
attributes card number and expiration date, along with bank specific identifiers are
defined. The credit card is submitted when it is sent for authorization and it is
approved when authorization is received. The transition of credit card from one state
to another can be tested by deriving test cases that cause the transition to occur. A
breadth-first approach to this type of testing would not exercise submitted before it
exercised undefined and defined. If it did, it would make use of transitions that had
not been previously tested and would therefore violate the breadth-first criterion.
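As an illustrative sketch (the event names are paraphrased from the description above), a breadth-first traversal of the credit card's state model yields test sequences in which every new transition is reached only through previously tested transitions:

```python
# Illustrative sketch: breadth-first test derivation over the credit card
# state model; each new transition is preceded only by tested transitions.
from collections import deque

TRANSITIONS = {
    "undefined": {"read card": "defined"},
    "defined":   {"submit for authorization": "submitted"},
    "submitted": {"authorization received": "approved"},
    "approved":  {},
}

def breadth_first_tests(start="undefined"):
    """Return test sequences; each ends with one new transition preceded
    only by transitions already covered by earlier (shorter) tests."""
    tests = []
    queue = deque([(start, [])])
    while queue:
        state, path = queue.popleft()
        for event, target in TRANSITIONS[state].items():
            step = path + [(state, event, target)]
            tests.append(step)
            queue.append((target, step))
    return tests

for test in breadth_first_tests():
    print(" -> ".join(f"{src} [{event}] {dst}" for src, event, dst in test))
```

Note that the sequence exercising submitted necessarily replays the already-tested undefined and defined transitions first, which is exactly the breadth-first criterion.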
Tools for Testing Object-Oriented Systems
There are several tools that aid in testing OO systems. Some of these are
1. Use cases
2. Class diagrams
3. Sequence diagrams
4. State charts
3.5 USABILITY AND ACCESSIBILITY TESTING
Usability testing attempts to characterize the "look and feel" and usage aspects of a
product, from the point of view of users. Most types of testing are objective in
nature; usability testing, in contrast, is largely subjective.
Some of the characteristics of usability testing are as follows:
• Usability testing tests the product from the user's point of view. It
encompasses a range of techniques for identifying how users actually interact
with and use the product.
• Usability testing is for checking the product to see if it is easy to use for the
various categories of users.
• Usability testing is a process to identify discrepancies between the user
interface of the product and the human user requirements, in terms of the
pleasantness and aesthetics aspects.
If we combine all the above characterizations of the various factors that determine
usability testing, then the common threads are
1. Ease of use
2. Speed
3. Pleasantness and aesthetics
Approach to usability
When doing usability testing, certain human factors can be represented in a
quantifiable way and can be tested objectively. Generally, the people best suited to
perform usability testing are
1. Typical representatives of the actual user segments who would be using the
product, so that the typical user patterns can be captured, and
2. People who are new to the product, so that they can start without any bias and
be able to identify usability problems.
When to do usability testing
The most appropriate way of ensuring usability is by performing the usability testing
in two phases. First is design validation and the second is usability testing done as a
part of component and integration testing phases of a test cycle. A product has to be
designed for usability. A product designed only for functionality may not get user
acceptance. A product designed for functionality may also involve a high degree of
training, which can be minimized if it is designed for both functionality and usability.
Usability design is verified through several means. Some of them are as follows:
• Style sheets
• Screen prototypes
• Paper designs
• Layout design
A “usable product” is always the result of mutual collaboration from all the
stakeholders, for the entire duration of the project. Usability is a habit and a behavior.
Just like humans, products are expected to behave correctly with different users,
adapting to their differing expectations.
Quality factors for usability
Some quality factors are very important when performing usability testing. Focusing
on the quality factors given below helps in improving objectivity in usability
testing.
• Comprehensibility
• Consistency
• Navigation
• Responsiveness
Aesthetics testing
Another important aspect in usability is making the product “beautiful”. Performing
aesthetics testing helps in improving usability further. It is not possible for all
products to measure up to the Taj Mahal in beauty, but testing for aesthetics can at
least ensure the product is pleasing to the eye. Aesthetics testing can be performed by
anyone who appreciates beauty. Beauticians, artists, and architects, whose regular
roles involve making different aspects of life beautiful, can serve as experts in
aesthetics testing. Involving them during the design and testing phases and incorporating
their inputs may improve the aesthetics of the product. For example, the icons used in
the product may look more appealing if they are designed by an artist, as they are not
meant only for conveying messages but also help in making the product beautiful.
ACCESSIBILITY TESTING
There are a large number of people who are challenged with vision, hearing, and
mobility-related problems, partial or complete. A product whose usability does not
take their requirements into account would face a lack of acceptance. There are
several tools that
are available to help them with alternatives. These tools are generally referred to as
accessibility tools or assistive technologies. Verifying the product usability for
physically challenged users is called accessibility testing. Accessibility is a subset of
usability and should be included as part of usability test planning.
Accessibility of the product can be provided by two means.
• Making use of accessibility features provided by the underlying infrastructure
(for example, the operating system), called basic accessibility, and
• Providing accessibility in the product through standards and guidelines, called
product accessibility.
Basic accessibility
Basic accessibility is provided by the hardware and operating system. All the input
and output devices of the computer and their accessibility options are categorized
under basic accessibility. The keyboard accessibility and screen accessibility are some
of the basic accessibility features.
Product accessibility
A good understanding of the basic accessibility features is needed while providing
accessibility to the product. A product should do everything possible to ensure that the
basic accessibility features are utilized by it. A good understanding of basic
accessibility features and of the requirements of different types of users with
special needs helps in creating certain guidelines on how the product's user
interface has to be designed.
These guidelines explain the importance of providing text equivalents for picture
messages and providing captions for audio portions. When an audio file is played,
providing captions for the audio improves accessibility for the hearing impaired.
Providing audio descriptions improves accessibility for visually impaired users who
cannot see the video streams and pictures. Hence, text equivalents for audio, and
audio descriptions for pictures and visuals, become important requirements for
accessibility.
Tools for usability
There are not many tools that help in usability because of the high degree of
subjectivity involved in evaluating this aspect. A sample list of usability and
accessibility tools is given below:
• JAWS
• HTML validator
• Style sheet validator
• Magnifier
• Narrator
• Soft keyboard
Test roles for usability
Usability testing is not as formal as other types of testing in several companies and is
not performed with a pre-written set of test cases/checklists. Various methods adopted
by companies for usability testing are as follows.
• Performing usability testing as a separate cycle of testing
• Hiring external consultants to do usability validation
• Setting up a separate group for usability to institutionalize the practices across
various product development teams and to set up organization-wide standards
for usability.
3.6 Summary
The overall objective of object-oriented testing—to find the maximum number of
errors with a minimum amount of effort—is identical to the objective of conventional
software testing. But the strategy and tactics for OO testing differ significantly. The
view of testing broadens to include the review of both the analysis and design model.
In addition, the focus of testing moves away from the procedural component (the
module) and toward the class. Because the OO analysis and design models and the
resulting source code are semantically coupled, testing (in the form of formal
technical reviews) begins during these engineering activities. For this reason, the
review of CRC, object-relationship, and object-behavior models can be viewed as first
stage testing.
Once OOP has been accomplished, unit testing is applied for each class. The design of
tests for a class uses a variety of methods: fault-based testing, random testing, and
partition testing. Each of these methods exercises the operations encapsulated by the
class. Test sequences are designed to ensure that relevant operations are exercised.
The state of the class, represented by the values of its attributes, is examined to
determine if errors exist. Integration testing can be accomplished using a thread-based
or use-based strategy. Thread-based testing integrates the set of classes that
collaborate to respond to one input or event. Use-based testing constructs the system
in layers, beginning with those classes that do not use server classes.
There is an increasing awareness of usability testing in the industry. Soon, usability
testing will become an engineering discipline, a life cycle activity, and a profession.
Several companies plan for usability testing in the beginning of the product life cycle
and track them to completion. Usability is not achieved only by testing. Usability is
more in the design and in the minds of the people who contribute to the product.
Usability is all about user experiences. Thinking from the perspective of the user all
the time during the project will go a long way in ensuring usability.
3.7 Check Your Progress
1. Explain the purpose of performance testing and the factors governing
performance testing.
2. How will you collect the requirements and test cases for performance testing?
3. How will you automate performance test cases? Explain with an example.
4. Define the terms performance tuning and benchmarking.
5. Mention some of the performance testing tools.
6. What is regression testing? Mention its types.
7. When should regression testing be done?
8. How will you perform an initial "smoke" or "sanity" test?
9. How will you select the test cases for regression testing?
10. What are the best practices to be followed in regression testing?
UNIT - IV
Structure
4.0 Objectives
4.1 Introduction
4.2 Test Planning
4.3 Test management
4.4 Test Execution and Reporting
4.5 Summary
4.6 Check Your Progress
4.0 Objectives
• To learn how to prepare a test plan for the whole testing process and the steps involved
• To understand what test management is and how it is carried out
• To learn about test execution and how to report results after every test
4.1 Introduction
In this chapter, we will look at some of the project management aspects of testing.
The Project Management Institute defines a project formally as a temporary endeavor
to create a unique product or service. This means that every project or service is
different in some distinguishing way from all similar products or services. Testing is
integrated into the endeavor of creating a given product or service; each phase and
each type of testing has different characteristics and what is tested in each version
could be different. Hence, testing satisfies this definition of a project fully. Given that
testing can be considered as a project on its own, it has to be planned, executed,
tracked, and periodically reported on.
4.2 TEST PLANNING
Preparing a test plan
Testing, like any project, should be driven by a plan. The test plan covers the
following:
• What needs to be tested: the scope of testing, including clear identification of
what will be tested and what will not be tested.
• How the testing is going to be performed.
• What resources are needed for testing: computer as well as human resources.
• The time lines by which the testing activities will be performed.
• Risks that may be faced in all the above, with appropriate mitigation and
contingency plans.
Scope management:
One single plan can be prepared to cover all phases or there can be separate plans for
each phase. In situations where there are multiple test plans, there should be one test
plan, which covers the activities common for all plans. This is called the master test
plan.
Scope management pertains to specifying the scope of a project. For testing, scope
management entails
1. Understanding what constitutes a release of a product
2. Breaking down the release into features
3. Prioritizing the features for testing
4. Deciding which features will be tested and which will not be, and
5. Gathering details to prepare for estimation of resources for testing.
Knowing the features and understanding them from the usage perspective will enable
the testing team to prioritize the features for testing. The following factors drive
choice and prioritization of features to be tested.
Features that are new and critical for the release
The new features of a release set the expectations of the customers and must perform
properly. These new features result in new program code and thus have a higher
susceptibility and exposure to defects.
Features whose failures can be catastrophic
Regardless of whether a feature is new or not, any feature the failure of which can be
catastrophic has to be high on the list of features to be tested. For example, recovery
mechanisms in a database will always have to be among the most important features
to be tested.
Features that are expected to be complex to test
Early participation by the testing team can help identify features that are difficult to
test. This can help in starting the work on these features early and lining up
appropriate resources in time.
Features which are extensions of earlier features that have been defect prone
Defect prone areas need very thorough testing so that old defects do not creep in
again.
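The four factors above can be turned into a rough prioritization aid. The following Python sketch is a hypothetical scoring scheme; the weights and feature names are invented for illustration.

```python
# Hypothetical weights for the four prioritization factors discussed above.
WEIGHTS = {"new": 3, "catastrophic": 4, "complex": 2, "defect_prone": 3}

def priority_score(feature):
    """Score a feature dict whose keys flag which factors apply."""
    return sum(w for factor, w in WEIGHTS.items() if feature.get(factor))

features = [
    {"name": "db recovery", "catastrophic": True, "defect_prone": True},
    {"name": "new report UI", "new": True},
    {"name": "help pages"},
]
ranked = sorted(features, key=priority_score, reverse=True)
print([f["name"] for f in ranked])
```

A feature that is both catastrophic on failure and historically defect-prone (score 7) outranks a merely new feature (score 3), which matches the reasoning in the text.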
Deciding test approach/strategy
Once we have this prioritized feature list, the next step is to drill down into some
more details of what needs to be tested, to enable estimation of size, effort, and
schedule. This includes identifying
1. What type of testing would you use for testing the functionality?
2. What are the configurations or scenarios for testing the features?
3. What integration testing is followed to ensure these features work
together?
4. What localization validations would be needed?
5. What non-functional tests would you need to do?
The test approach should result in identifying the right type of test for each of the
features or combinations.
Setting up criteria for testing
There must be clear entry and exit criteria for different phases of testing.
Ideally, tests must be run as early as possible so that the last minute pressure of
running tests after development delays is minimized. The entry criteria for a test
specify threshold criteria for each phase or type of test. The completion/exit criteria
specify when a test cycle or a testing activity can be deemed complete.
Suspension criteria specify when a test cycle or a test activity can be
suspended.
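As an illustration, entry, exit, and suspension criteria can be expressed as simple predicates over test-cycle metrics. The metric names and thresholds below are invented for the example.

```python
# Illustrative entry/exit/suspension criteria for a system test phase.
def entry_ok(metrics):
    """Entry: the build passed smoke tests and open critical defects are capped."""
    return metrics["smoke_passed"] and metrics["open_critical"] <= 2

def exit_ok(metrics):
    """Exit: all planned tests have run and no critical defect is outstanding."""
    return (metrics["tests_run"] >= metrics["tests_planned"]
            and metrics["open_critical"] == 0)

def suspend(metrics):
    """Suspend: too many blocking defects to make further runs meaningful."""
    return metrics["open_blockers"] > 5

m = {"smoke_passed": True, "open_critical": 1, "tests_run": 40,
     "tests_planned": 100, "open_blockers": 0}
print(entry_ok(m), exit_ok(m), suspend(m))
```

Encoding the criteria as predicates makes the threshold decision explicit and checkable at the start and end of each cycle.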
Identifying responsibilities, staffing, and training needs
A testing project requires different people to play different roles. There are
roles for test engineers, test leads, and test managers. The different role definitions
should
1. Ensure there is clear accountability for a given task, so that each person
knows what he has to do;
2. Clearly list the responsibilities for various functions to various people
3. Complement each other, ensuring no one steps on one another’s toes; and
4. Supplement each other, so that no task is left unassigned.
Staffing is done based on the estimation of effort involved and the availability of
time for release. In order to ensure that the right tasks get executed, the features and
tasks are prioritized on the basis of effort, time, and importance.
It may not be possible to find a perfect fit between the requirements and the
available skills; such gaps should be addressed with appropriate training programs.
Identifying resource requirements
As a part of planning for a testing project, the project manager should provide
estimates for the various hardware and software resources required. Some of the
following factors need to be considered.
1. Machine configuration needed to run the product under test
2. Overheads required by the test automation tool, if any
3. Supporting tools such as compilers, test data generators, configuration
management tools, and so on
4. The different configurations of the supporting software that must be
present
5. Special requirements for running machine-intensive tests such as load tests
and performance tests
6. Appropriate number of licenses of all the software
Identifying test deliverables
The test plan also identifies the deliverables that should come out of the test
cycle/testing activity. The deliverables include the following,
1. The test plan itself
2. Test case design specifications
3. Test cases, including any automation that is specified in the plan
4. Test logs produced by running the tests
5. Test summary reports
Testing tasks: size and effort estimation
The scope identified above gives a broad overview of what needs to be tested.
This understanding is quantified in the estimation step. Estimation happens broadly in
three phases.
1. Size estimation
2. Effort estimation
3. Schedule estimation
Size estimation
A size estimate quantifies the actual amount of testing that needs to be done. The
factors that contribute to the size estimate of a testing project are as follows:
Size of the product under test – Lines of Code (LOC) and Function Points (FP) are
popular methods to estimate the size of an application. A somewhat simpler
representation of application size is the number of screens, reports, or transactions.
Extent of automation required
Number of platforms and inter-operability environments to be tested
Productivity data
Reuse opportunities
Robustness of processes
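As a rough illustration of how size feeds effort estimation, the sketch below assumes a productivity figure (manual test cases executed per person-day) and a speed-up factor for the automated share; both numbers are invented.

```python
def effort_person_days(test_cases, productivity, automation_fraction=0.0,
                       automation_speedup=4.0):
    """Effort = size / productivity, with the automated share running faster.

    productivity: manual test cases executable per person-day (assumed figure).
    automation_speedup: how many times faster automated runs are (assumed).
    """
    manual = test_cases * (1 - automation_fraction) / productivity
    automated = test_cases * automation_fraction / (productivity * automation_speedup)
    return manual + automated

# 400 test cases, 10 per person-day manually, half of them automated:
print(round(effort_person_days(400, 10, automation_fraction=0.5), 1))
```

The extent of automation enters the estimate directly: the same 400 test cases cost 40 person-days fully manual but 25 with half of them automated under these assumed figures.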
Activity breakdown and scheduling
Activity breakdown and schedule estimation entail translating the effort required into
specific time frames. The following steps make up this translation.
• Identifying external and internal dependencies among the activities
• Sequencing the activities, based on the expected duration as well as on
the dependencies
• Monitoring the progress in terms of time and effort
• Rebalancing schedules and resources as necessary
Communications management
Communications management consists of evolving and following procedures for
communication that ensure that everyone is kept in sync with the right level of detail.
Risk management
Like every project, testing projects also face risks. Risks are events that could
potentially affect a project’s outcome. Risk management entails
• Identifying the possible risks;
• Quantifying the risks;
• Planning how to mitigate the risks; and
• Responding to risks when they become a reality.
Fig.23 - Aspects of risk management
i) Risk identification consists of identifying the possible risks that may
hit a project. Use of checklists, Use of organizational history and
metrics and informal networking across the industry are the common
ways to identify risks in testing.
ii) Risk quantification deals with expressing the risk in numerical terms.
The probability of the risk happening and the impact of the risk are the
two components to the quantification of risk.
iii) Risk mitigation planning deals with identifying alternative strategies
to combat a risk event. To handle the effects of a risk, it is advisable to
have multiple mitigation strategies.
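The two components of risk quantification, probability and impact, are commonly multiplied into a risk exposure number that can be used to order the mitigation effort. A minimal sketch, with invented probabilities and impact ratings:

```python
# Risk exposure = probability x impact, the two components named above.
def exposure(risk):
    return risk["probability"] * risk["impact"]

risks = [
    {"name": "insufficient time for testing", "probability": 0.6, "impact": 8},
    {"name": "automation tool unavailable",   "probability": 0.2, "impact": 5},
    {"name": "show-stopper defects late",     "probability": 0.3, "impact": 9},
]
# Plan mitigation for the highest-exposure risks first.
for r in sorted(risks, key=exposure, reverse=True):
    print(f'{r["name"]}: exposure {exposure(r):.1f}')
```

A likely, high-impact risk (0.6 × 8 = 4.8) correctly outranks a severe but unlikely one (0.3 × 9 = 2.7) under this scheme.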
The following are some of the common risks encountered in testing projects:
• Unclear requirements
• Schedule dependence
• Insufficient time for testing
• Show-stopper defects
• Availability of skilled and motivated people for testing
• Inability to get a test automation tool
4.3 TEST MANAGEMENT
Choice of standards
Standards comprise an important part of planning in any organization. There are two
types of standards – external standards and internal standards.
External standards are standards that a product should comply with, are externally
visible, and are usually stipulated by external consortia. Compliance to external
standards is usually mandated by external parties.
Internal standards are standards formulated by a testing organization to bring in
consistency and predictability. They standardize the processes and methods of
working within the organization. Some of the internal standards include
Naming and storage conventions for test artifacts – Every test artifact has to be
named appropriately and meaningfully. Such naming conventions should enable easy
identification of the product functionality that a set of tests is intended for, and
reverse mapping to identify the functionality corresponding to a given set of tests.
Document standards
Most of the discussion on documentation and coding standards pertains to automated
testing. Documentation standards specify how to capture information about the tests
within the test scripts themselves. Internal documentation of test scripts is similar to
internal documentation of program code and should include the following:
• Appropriate header-level comments at the beginning of the file that outline the
functions to be served by the test.
• Sufficient in-line comments spread throughout the file, explaining the
functions served by the various parts of a test script.
• Up-to-date change history information, recording all the changes made to the
test file.
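A test script that follows these documentation standards might look like the sketch below; the test identifier, change history entries, and stand-in authenticator are all invented for illustration.

```python
"""Test: login_basic_001
Purpose : Verify that a valid user can log in (functionality test).
Item    : login module, release 2.1 (illustrative identifiers)
Change history:
    2006-03-01  initial version
    2006-04-12  updated expected prompt string after UI change
"""

def run_login_test(authenticate):
    # Set up the predefined input data for the test.
    user, password = "qa_user", "secret"
    # Execute the step under test.
    result = authenticate(user, password)
    # Compare the actual result with the expected result.
    assert result is True, "valid credentials should be accepted"
    return "PASS"

# A stand-in authenticator so the script is runnable on its own.
print(run_login_test(lambda u, p: u == "qa_user" and p == "secret"))
```

The header docstring carries the purpose and change history, while the in-line comments explain the function of each part of the script, as the standard requires.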
Test coding standards
Test coding standards go one level deeper into the tests and enforce standards on how
the tests themselves are written. The standards may
1. Enforce the right type of initialization
2. Stipulate ways of naming variables within the scripts to make sure
that a reader understands consistently the purpose of a variable.
3. Encourage reusability of test artifacts.
4. Provide standard interfaces to external entities like the operating
system, hardware, and so on.
Test reporting standards
Since testing is tightly interlinked with product quality, all the stakeholders must get a
consistent and timely view of the progress of tests. The test reporting provides
guidelines on the level of detail that should be present in the test reports, their
standard formats and contents, recipients of the report, and so on.
Test infrastructure management
Testing requires a robust infrastructure to be planned upfront. This infrastructure is
made up of three essential elements.
1. A test case database (TCDB)
2. A defect repository (DR)
3. A configuration management repository and tool
A test case database captures all the relevant information about the test cases in an
organization.
A defect repository captures all the relevant details of defects reported for a product.
Most of the metrics classified as testing defect metrics and development defect
metrics are derived from the data in the defect repository.
Yet another infrastructure that is required for a software product organization is a
Software Configuration Management (SCM) repository. An SCM repository keeps
track of change control and version control of all the files that make up a software
product. Change control ensures that
• Changes to test files are made in a controlled fashion and only with proper approvals.
• Changes made by one test engineer are not accidentally lost or overwritten by other changes.
• Each change produces a distinct version of the file that is recreatable at any point of time.
• At any point of time, everyone gets access to only the most recent version of the test files.
Version control ensures that the test scripts associated with a given release of a
product are baselined along with the product files.
TCDB, Defect Repository, and SCM repository should complement each other and
work together in an integrated fashion.
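A minimal sketch of how a TCDB and a defect repository can be cross-referenced, using in-memory dictionaries as stand-ins for the real repositories; the IDs and field names are invented.

```python
# Minimal in-memory stand-ins for a TCDB and a defect repository (DR),
# cross-referenced the way the integrated infrastructure requires.
test_case_db = {
    "TC-001": {"feature": "login", "spec": "valid user can log in"},
}
defect_repo = {}

def report_defect(defect_id, test_case_id, details):
    """File a defect and keep the test-case/defect cross-reference current."""
    defect_repo[defect_id] = {"test_case": test_case_id, "details": details,
                              "status": "open"}

report_defect("DEF-042", "TC-001", "login fails for mixed-case user names")
# The cross-reference lets us walk from a defect back to the test that found it.
print(test_case_db[defect_repo["DEF-042"]["test_case"]]["feature"])
```

In a real organization these would be backed by databases and an SCM tool, but the cross-reference idea is the same: every defect record points at the test case that uncovered it.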
Figure 24 – Relationship between SCM, DR, and TCDB
Test people management
People management is an integral part of any project management. It requires the
ability to hire, motivate and retain the right people. These skills are seldom formally
taught. Testing projects present several additional challenges. We believe that the
success of a testing organization depends on judicious people management skills.
The important point is that the common goals and the spirit of teamwork have
to be internalized by all the stakeholders. Such an internalization and upfront team
building has to be part of the planning process for the team to succeed.
Integrating with product release
Ultimately, the success of a product depends on the effectiveness of integration of the
development and testing activities. These job functions have to work in tight unison
between themselves and with other groups such as product support, product
management, and so on. The schedules of testing have to be linked directly to product
release. The following are some of the points to be decided for this planning.
• Sync points between development and testing as to when different types of
testing can commence.
• Service level agreements between development and testing as to how long it
would take for the testing team to complete the testing. This will ensure that
testing focuses on finding relevant and important defects only.
• Consistent definitions of the various priorities and severities of the defects.
• Communication mechanisms to the documentation group to ensure that the
documentation is kept in sync with the product in terms of known defects,
workarounds and so on.
The purpose of the testing team is to identify the defects in the product and the risks
that could be faced by releasing the product with the existing defects.
4.4 TEST PROCESS
Putting together and baselining a test plan
A test plan combines all the points discussed above into a single document that acts
as an anchor point for the entire testing project. An organization normally arrives at a
template that is to be used across the board. Each testing project puts together a test
plan based on the template. The test plan is reviewed by a designated set of competent
people in the organization. It then is approved by a competent authority, who is
independent of the project manager directly responsible for testing. After this, the test
plan is baselined into the configuration management repository. From then on, the
baselined test plan becomes the basis for running the testing project. In addition, any
changes needed to the test plan template are periodically discussed among the
different stakeholders, so that the template is kept current and applicable to the
testing teams.
Test case specification
Using the test plan as the basis, the testing team designs test case specifications,
which then becomes the basis for preparing individual test cases. A test case is a
series of steps executed on a product, using a pre-defined set of input data, expected
to produce a pre-defined set of outputs, in a given environment. Hence, a test case
specification should clearly identify,
• The purpose of the test: this lists what feature or part the test is intended for.
• Items being tested, along with their version/release numbers as appropriate.
• Environment that needs to be set up for running the test case.
• Input data to be used for the test case.
• Steps to be followed to execute the test
• The expected results that are considered to be correct results
• A step to compare the actual results produced with the expected results
• Any relationship between this and other tests
Update of traceability matrix
A traceability matrix is a tool to validate that every requirement is tested. This matrix
is created during the requirements gathering phase itself by filling up the unique
identifier for each requirement. When a test case specification is complete, the row
corresponding to the requirement which is being tested by the test case is updated
with the test case specification identifier. This ensures that there is a two-way
mapping between requirements and test cases.
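A traceability matrix can be kept as a simple mapping from requirement IDs to the test cases that cover them. The sketch below uses invented identifiers; an empty row immediately exposes an untested requirement.

```python
# Two-way requirement <-> test-case mapping, as described above.
# Requirement IDs and test-case IDs are invented for the example.
traceability = {"REQ-1": set(), "REQ-2": set()}   # rows created at requirements time

def register_test_case(req_id, tc_id):
    """Update the matrix row when a test case specification is complete."""
    traceability[req_id].add(tc_id)

register_test_case("REQ-1", "TC-101")
register_test_case("REQ-1", "TC-102")

# Validation: any requirement with an empty row has no test covering it.
untested = [req for req, tcs in traceability.items() if not tcs]
print(untested)
```

Because the rows are created during requirements gathering and filled during test design, the matrix validates coverage in both directions: every requirement maps to tests, and every test maps back to a requirement.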
Identifying possible candidates for automation
Before writing the test cases, a decision should be taken as to which tests are to be
automated and which should be run manually. Some of the criteria that will be used in
deciding which scripts to automate include
• Repetitive nature of the test
• Effort involved in automation
• Amount of manual intervention required for the test, and
• Cost of automation tool.
Developing and baselining test cases
Based on the test case specifications and the choice of candidates for automation, test
cases have to be developed. The test cases should also have change history
documentation, which specifies
• What was the change
• Why the change was necessitated
• Who made the change
• When was the change made
• A brief description of how the change has been implemented and
• Other files affected by the change
All the artifacts of test cases – the test scripts, input data, expected outputs, and
so on – should be stored in the test case database and SCM.
Executing test cases and keeping traceability matrix current
The prepared test cases have to be executed at the appropriate times during a
project. For example, test cases corresponding to smoke tests may be run on a daily
basis. System testing test cases will be run during system testing.
As the test cases are executed during a test cycle, the defect repository is updated with
1. Defects from the earlier test cycles that are fixed in the current build and
2. New defects that get uncovered in the current run of the tests.
During test design and execution, the traceability matrix should be kept current. When
tests get designed and executed successfully, the traceability matrix should be
updated.
Collecting and analyzing metrics
When tests are executed, information about the test execution gets collected in test
logs and other files. The basic measurements from running the tests are then
converted to meaningful metrics by the use of appropriate transformations and
formulae.
Preparing test summary report
At the completion of a test cycle, a test summary report is produced. This report gives
insights to the senior management about the fitness of the product for release.
Recommending product release criteria
One of the purposes of testing is to decide the fitness of a product for release.
Testing can never conclusively prove the absence of defects in a software product.
What it provides is an evidence of what defects exist in the product, their severity, and
impact. The job of the testing team is to articulate to the senior management and the
product release team
1. What defects the product has
2. What is the impact/severity of each of the defects
3. What would be the risks of releasing the product with the existing
defects?
The senior management can then take a meaningful business decision on whether to
release a given version or not.
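The articulation of defects, severities, and release risk can be sketched as a simple decision rule. The severity scale (1 = critical) and the thresholds below are invented for illustration; a real decision rests with senior management.

```python
# Summarize release risk from the open-defect list; thresholds are invented.
def release_recommendation(open_defects):
    """Return a recommendation string from defect severities (1 = worst)."""
    critical = sum(1 for d in open_defects if d["severity"] == 1)
    major = sum(1 for d in open_defects if d["severity"] == 2)
    if critical:
        return "do not release: critical defects open"
    if major > 3:
        return "defer: too many major defects"
    return "release candidate: residual risk documented"

defects = [{"id": "DEF-7", "severity": 2}, {"id": "DEF-9", "severity": 3}]
print(release_recommendation(defects))
```

The rule mirrors the text: testing does not prove the absence of defects, it articulates which defects remain and how severe they are, so the business can weigh the residual risk.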
4.5 TEST EXECUTION AND REPORTING
Testing requires constant communication between the test team and other
teams. Test reporting is a means of achieving this communication. There are two
types of reports or communication that are required; test incident reports and test
summary reports.
Test incident report
A test incident report is a communication that happens through the testing
cycle as and when defects are encountered. A test incident report is an entry made in
the defect repository. Each defect has a unique ID and this is used to identify the
incident. The high impact test incidents are highlighted in the test summary report.
Test cycle report
Test projects take place in units of test cycles. A test cycle entails planning
and running certain tests in cycles, each cycle using a different build of a product. A
test cycle report, at the end of each cycle, gives
1. A summary of the activities carried out during that cycle;
2. Defects that were uncovered during that cycle, based on their severity and
impact.
3. Progress from the previous cycle to the current cycle in terms of defects
fixed;
4. Outstanding defects that are yet to be fixed in this cycle; and
5. Any variations observed in effort or schedule.
Test summary report
The final step in a test cycle is to recommend the suitability of a product for release.
A report that summarizes the results of a test cycle is the test summary report.
There are two types of test summary reports:
1. Phase-wise test summary, which is produced at the end of every phase
2. Final test summary reports.
A summary report should present
• A summary of the activities carried out during the test cycle or phase
• Variance of the activities carried out from the activities planned
• Summary of results, which includes tests that failed, with any root-cause
descriptions and the severity of impact of the defects uncovered by the tests.
• Comprehensive assessment and recommendation for release, including a
fitness-for-release assessment and a release recommendation.
Recommending product release
Based on the test summary report, an organization can take a decision on whether to
release the product or not. Ideally an organization would like to release a product with
zero defects. However, market pressures may cause the product to be released with
the defects provided that the senior management is convinced that there is no major
risk of customer dissatisfaction. Such a decision should be taken by the senior
manager only after consultation with the customer support team, development team
and testing team so that the overall workload for all parts of the organization can be
evaluated.
Best Practices
Best practices in testing can be classified into three categories.
1. Process related
2. People related
3. Technology related
Process related best practices
A strong process infrastructure and process culture is required to achieve better
predictability and consistency. A process database, a federation of information about
the definition and execution of various processes, can be a valuable addition to the
tools in an organization.
People related best practices
While individual goals are required for the development and testing teams, it is very
important to understand the overall goals that define the success of the product as a
whole. Job rotation among support, development and testing can also increase the
gelling among the teams. Such job rotation can help the different teams develop better
empathy and appreciation of the challenges faced in each other’s roles and thus result
in better teamwork.
Technology related best practices
A fully integrated TCDB–SCM–DR setup can help in better automation of testing
activities. When test automation tools are used, it is useful to integrate the tool with
the TCDB, the defect repository, and the SCM tool.
A final remark on best practices: the three dimensions of best practices cannot be
carried out in isolation. A good technology infrastructure should be aptly supported
by effective process infrastructure and be executed by competent people. These best
practices are inter-dependent, self-supporting, and mutually enhancing. Thus, the
organization needs to take a holistic view of these practices and keep a fine balance
among the three dimensions.
4.6 Summary
Failing to plan is planning to fail. Testing – like any project – should be driven by a
plan. The scope management for deciding the features to be tested/ not tested,
deciding a test approach, setting up criteria for testing and identifying responsibilities,
staffing, and training needs are included in the test planning.
The test management includes the test infrastructure management and test people
management. The test infrastructure consists of a test case database, a defect
repository and a configuration management repository and tool.
The test process includes the test case specification, baselining the test plan, and
updating the traceability matrix. The test process also has to identify possible
candidates for automation.
4.7 Check Your Progress
1. How will you prepare a test plan? Explain the strategy.
2. Explain the concept of identifying responsibilities, staffing, and
training needs.
3. How will you make the size and effort estimation of the product?
4. Explain the aspects of Risk management.
5. Explain the relationship between SCM, DR and TCDB.
6. Explain the test process with an example.
7. What is called ‘test reporting’?
8. How will you make a test report? Explain with a sample report.
9. Explain the best practices to be followed in test process.
10. Differentiate between a test cycle report and test summary report.
UNIT - V
Structure
5.0 Objectives
5.1 Introduction
5.2 Software Test Automation
5.3 Test metrics and measurements
5.4 Summary
5.5 Check Your Progress
5.0 Objectives
• To know the basic concepts of software test automation and their benefits
• To understand the test metrics and measurements and the methods
5.1 Introduction
Developing software to test the software is called test automation. Test automation
can help address several problems.
• Automation saves time, as software can execute test cases faster than humans do.
• Test automation can free the test engineers from mundane tasks and let them focus on more creative tasks.
• Automated tests can be more reliable.
• Automation helps in immediate testing.
• Automation can protect an organization against attrition of test engineers.
• Test automation opens up opportunities for better utilization of global resources.
• Certain types of testing cannot be executed without automation.
• Automation means end-to-end, not test execution alone.
Automation should have scripts that produce test data to maximize coverage of
permutations and combinations of inputs and expected output for result comparison.
They are called test data generators. The automation script should be able to map the
error patterns dynamically to conclude the result. The error pattern mapping is done
not only to conclude the result of a test, but also to point out the root cause.
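A test data generator of the kind described can be as simple as enumerating the combinations of inputs together with the expected output for result comparison. The login rule below is invented for the example.

```python
import itertools

# A tiny test data generator: enumerate combinations of inputs together
# with the expected output for result comparison.
# The login rule used here is invented for the example.
def expected_login(user, password):
    return user == "admin" and password == "secret"

users = ["admin", "guest", ""]
passwords = ["secret", "wrong"]

cases = [
    {"user": u, "password": p, "expected": expected_login(u, p)}
    for u, p in itertools.product(users, passwords)
]
print(len(cases), sum(c["expected"] for c in cases))
```

Generating the expected output alongside each input is what lets the automation compare results and conclude pass or fail without manual intervention.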
5.2 SOFTWARE TEST AUTOMATION
Terms used in automation
A test case is a set of sequential steps to execute a test operating on a set of
predefined inputs to produce certain expected outputs. There are two types of test
cases namely automated and manual. Test case in this chapter refers to automated test
cases. A test case can be documented as a set of simple steps, or it could be an
assertion statement or a set of assertions. An example of an assertion is “Opening a
file which is already open should fail.” The following table describes some test cases
for the login example, showing how login can be tested for different types of testing.
S.No.  Test cases for testing                                 Belongs to what type of testing
1.     Check whether login works                              Functionality
2.     Repeat login operation in a loop for 48 hours          Reliability
3.     Perform login from 10,000 clients                      Load/stress testing
4.     Measure time taken for login operations in
       different conditions                                   Performance
5.     Run login operation from a machine running the
       Japanese language                                      Internationalization

Table – Same test case being used for different types of testing
In the above table, the “how” portion of the test case is called a scenario. What an
operation has to do is a product-specific feature; how it is to be run is a
framework-specific requirement. When a set of test cases is combined and associated
with a set of scenarios, they are called a “test suite”.
Fig. 25 Framework for test automation
Skills Needed for Automation
The automation of testing is broadly classified into three generations.
First generation – record and playback
Record and playback avoids the repetitive nature of executing tests. Almost all
the test tools available in the market have the record and playback feature. A test
engineer records the sequence of actions by keyboard characters or mouse clicks and
those recorded scripts are played back later, in the same order as they were recorded.
When there is frequent change, the record and playback generation of test automation
tools may not be very effective.
Second generation – data – driven
This method helps in developing test scripts that generate the set of input conditions
and the corresponding expected output. This enables the tests to be repeated for
different input and output conditions. This generation of automation focuses on input
and output conditions using the black box testing approach.
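A data-driven test keeps one script and feeds it rows of inputs and expected outputs. The function under test and the data rows below are invented for illustration.

```python
# Data-driven testing: one script, many input/expected-output rows.
# The function under test and the rows are invented for the example.
def discount(amount):
    return amount * 0.9 if amount >= 100 else amount

rows = [
    (50, 50),       # below threshold: no discount
    (100, 90.0),    # at threshold: 10% off
    (200, 180.0),
]

failures = [(inp, exp, discount(inp)) for inp, exp in rows if discount(inp) != exp]
print("PASS" if not failures else failures)
```

Adding a new input/output condition is just adding a row; the script itself never changes, which is the point of this generation of automation.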
Third generation – action driven
This technique enables a layman to create automated tests; no input and expected
output conditions are required for running the tests. All actions that appear on the
application are automatically tested, based on a generic set of controls defined for
automation. The input and output conditions are automatically generated and used,
and the scenarios for test execution can be dynamically changed using the test
framework available in this approach. Hence, automation in the third generation
involves two major aspects: “test case automation” and “framework design”.
What to Automate, Scope of Automation
The specific requirements can vary from product to product, from situation to
situation, from time to time. The following gives some generic tips for identifying the
scope of automation.
Identifying the types of testing amenable to automation
Stress, reliability, scalability, and performance testing
These types of testing require the test cases to be run from a large number of
different machines for an extended period of time, such as 24 hours, 48 hours, and so
on. Test cases belonging to these testing types become the first candidates for
automation.
Regression tests
Regression tests are repetitive in nature. Given the repetitive nature of the test cases,
automation will save significant time and effort in the long run.
Functional tests
These kinds of tests may require a complex set-up and thus require specialized skills,
which may not be available on an ongoing basis. Automating these tests once, using
the available expert skills, can enable less-skilled people to run them on an ongoing
basis.
Automating areas less prone to change
User interfaces normally go through significant changes during a project. To avoid
rework on automated test cases, proper analysis has to be done to find out the areas of
changes to user interfaces, and automate only those areas that will go through
relatively less change. The non-user interface portions of the product can be
automated first. This enables the non-GUI portions of the automation to be reused
even when GUI goes through changes.
Automate tests that pertain to standards
One of the tests that products may have to undergo is compliance to standards. For
example, a product providing a JDBC interface should satisfy the standard JDBC
tests. Automating for standards provides a dual advantage. Test suites developed for
standards are not only used for product testing but can also be sold as test tools for the
market. Testing for standards has certain legal requirements. To certify the software, a
test suite is developed and handed over to different companies. This is called
“certification testing” and requires perfectly compliant results every time the tests are
executed.
Management aspects in automation
Prior to starting automation, adequate effort has to be spent to obtain management
commitment. The automated test cases need to be maintained till the product reaches
obsolescence. Since automation involves effort over an extended period of time,
management permissions are only given in phases and part by part. It is important to
automate the critical and basic functionalities of a product first. To achieve this, all
test cases need to be prioritized as high, medium, and low, based on customer
expectations. Automation should start with the high-priority requirements and then
move on to the medium- and low-priority ones.
Design and Architecture for Automation
Design and architecture is an important aspect of automation. As in product
development, the design has to represent all requirements in modules and in the
interactions between modules.
In integration testing both internal interfaces and external interfaces have to be
captured by design and architecture. Architecture for test automation involves two
major heads: a test infrastructure that covers a test case database and a defect database
or defect repository. Using this infrastructure, the test framework provides a backbone
that ties the selection and execution of test cases.
External modules
There are two modules that are external modules to automation – TCDB and defect
DB. Manual test cases do not need any interaction between the framework and
TCDB. Test engineers submit the defects for manual test cases. For automated test
cases, the framework can automatically submit the defects to the defect DB during
execution. These external modules can be accessed by any module in automation
framework.
Scenario and configuration file modules
Scenarios are information on “how to execute a particular test case”. A configuration
file contains a set of variables that are used in automation. A configuration file is
important for running the test cases for various execution conditions and for running
the tests for various input and output conditions and states. The values of variables in
this configuration file can be changed dynamically to achieve different execution
input, output and state conditions.
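As a minimal sketch of this idea (the variable names and values below are illustrative assumptions), a configuration can be a set of defaults that each run overrides:

```python
# Default configuration variables for a test run; overriding them lets the
# same test cases run under different execution, input/output, and state
# conditions. All names and values here are hypothetical.
DEFAULTS = {"platform": "linux", "locale": "en_US", "db_size": "small"}

def load_config(overrides=None):
    """Merge run-specific overrides over the default configuration."""
    config = dict(DEFAULTS)
    config.update(overrides or {})
    return config

run1 = load_config()                                             # default condition
run2 = load_config({"platform": "windows", "db_size": "large"})  # changed condition
print(run1["platform"], run2["platform"])  # linux windows
```

The same test case can then be driven through different conditions just by supplying a different override set.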
Test cases and test framework modules
A test case is an object of execution for the other modules in the architecture and
does not represent any interaction by itself. A test framework is a module that
combines “what to execute” and “how it has to be executed.” The test framework is
considered the core of automation design. It can be developed by the organization
internally or bought from a vendor.
Tools and results modules
When a test framework performs its operations, there are a set of tools that may be
required. For example, when test cases are stored as source code files in TCDB, they
need to be extracted and compiled by build tools. In order to run the compiled code,
certain runtime tools and utilities may be required.
The results that come out of the tests must be stored for future analysis. The history
of all previous test runs should be recorded and kept as archives. These results help
the test engineer compare the current test run with previous runs. An audit of all tests
that are run and the related information is stored in this module of the automation
framework. This can also help in selecting test cases for regression runs.
Report generator and reports /metrics modules
Once the results of a test run are available, the next step is to prepare the test reports
and metrics. Preparing reports is complex work and hence it should be part of the
automation design. Reports are needed at different periodicities, such as daily,
weekly, monthly, and milestone reports. Having reports at different levels of detail
can address the needs of multiple constituents and thus provide significant returns.
The module that takes the necessary inputs and prepares a formatted report is called a
report generator. Once the results are available, the report generator can generate
metrics. All the reports and metrics that are generated are stored in the reports/metrics
module of automation for future use and analysis.
Generic Requirements for Test Tool/Framework
In the previous section, we described a generic framework for test automation. This
section presents detailed criteria that such a framework and its usage should satisfy.
• No hard coding in the test suite.
• Test case/suite expandability.
• Reuse of code for different types of testing and test cases.
• Automatic setup and cleanup.
• Independent test cases.
• Test case dependency.
• Insulating test cases during execution.
• Coding standards and directory structure.
• Selective execution of test cases.
• Random execution of test cases.
• Parallel execution of test cases.
• Looping of test cases.
• Grouping of test scenarios.
• Test case execution based on previous results.
• Remote execution of test cases.
• Automatic archival of test data.
• Reporting scheme.
• Independence from languages.
• Portability to different platforms.
Process Model for Automation
The work on automation can go simultaneously with product development and can
overlap with multiple releases of the product. One specific requirement for
automation is that the delivery of the automated tests should be done before the test
execution phase so that the deliverables from automation effort can be utilized for the
current release of the product.
Test automation life cycle activities bear a strong similarity to product development
activities. Just as product requirements need to be gathered on the product side,
automation requirements too need to be gathered. Similarly, just as product planning,
design and coding are done, so also during test automation are automation planning,
design and coding.
When testing activities are introduced for both the product and the automation, there
are two parallel sets of development and testing activities; put together, they form a
“W” model. Hence, for product development involving automation, it is a good
choice to follow the W model to ensure that the quality of the product as well as of
the test suite meets the expected quality norms.
Selecting a test tool
Having identified the requirements of what to automate, a related question is the
choice of an appropriate tool for automation. Selecting the test tool is an important
aspect of test automation for several reasons given below:
1. Free tools are not well supported and get phased out soon.
2. Developing in-house tools takes time.
3. Test tools sold by vendors are expensive.
4. Test tools require considerable training.
5. Test tools generally do not meet all the requirements for
automation.
6. Not all test tools run on all platforms.
For all the above reasons, adequate focus needs to be given to selecting the right tool
for automation.
Criteria for selecting test tools
In the previous section, we looked at some reasons for evaluating test tools
and how requirements gathering will help. These criteria change according to context
and differ across companies and products. We will now look into the broad
categories for classifying the criteria. The categories are
1. Meeting requirements
2. Technology expectations
3. Training/skills and
4. Management aspects.
Meeting requirements
Firstly, there are plenty of tools available in the market, but they do not meet all the
requirements of a given product. Evaluating different tools for different requirements
involves significant effort, money and time.
Secondly, test tools are usually one generation behind and may not provide backward
or forward compatibility. Thirdly, test tools may not go through the same amount of
evaluation for new requirements.
Finally, a number of test tools cannot differentiate between a product failure and a test
failure. So the test tool must have some intelligence to proactively find out the
changes that happened in the product and accordingly analyze the results.
Technology expectations
• Extensibility and customization are important expectations of a test tool.
• A good number of test tools require their libraries to be linked with product
binaries.
• Test tools are not 100% cross-platform. When the product’s impact on the
network is analyzed, the first suspect is the test tool, and it is uninstalled when
such analysis starts.
Training/skills
While test tools require plenty of training, very few vendors provide training to the
required level. Test tools expect users to learn new languages/scripts and may not use
standard languages/scripts. This increases the skill requirements for automation and
lengthens the learning curve inside the organization.
Management aspects
• Test tools require system upgrades.
• Migration to other test tools is difficult.
• Deploying a test tool requires considerable planning and effort.
Steps for tool selection and deployment
1. Identify your test suite requirements among the generic requirements
discussed. Add other requirements, if any.
2. Make sure the experiences discussed in the previous sections are taken
care of.
3. Collect the experiences of other organizations which used similar test
tools.
4. Keep a checklist of questions to be asked of the vendors on
cost/effort/support.
5. Identify a list of tools that meet the above requirements.
6. Evaluate and shortlist one tool or a set of tools, and train all test
developers on the tool.
7. Deploy the tool across the teams after training all potential users of
the tool.
Challenges in Automation
The most important challenge in automation is management commitment.
Automation takes time and effort and pays off in the long run. Management should
have patience and persist with automation. Successful test automation endeavors are
characterized by unflinching management commitment and a clear vision of goals,
with progress tracked against that long-term vision.
5.3 TEST METRICS AND MEASUREMENTS
What are metrics and measurements
Metrics derive information from raw data with a view to helping in decision making.
Some of the areas that such information would shed light on are
• The relationship between the data points.
• Any cause-and-effect correlation between the observed data points.
• Any pointers to how the data can be used for future planning and continuous
improvements.
Metrics are thus derived from measurements using appropriate formulae or
calculations. Obviously, the same set of measurements can help produce different
sets of metrics of interest to different people.
From the above discussion it is obvious that in order for a project’s performance to be
tracked and its progress monitored effectively,
• The right parameters must be measured; the parameters may pertain to the
product or to the process.
• The right analysis must be done on the data measured, to draw the right
conclusions within a project or organization.
• The results of the analysis must be presented in an appropriate form to the
stakeholders to enable them to make the right decisions on improving product
or process quality.
Effort is the actual time that is spent on a particular activity or a phase. Elapsed days
are the difference between the start of an activity and the completion of the activity.
Collecting and analyzing metrics involves effort and several steps, as depicted in
Figure 26.
Figure. 26 Steps in a metrics program
The first step involved in a metrics program is to decide what measurements are
important and to collect data accordingly. The effort spent on testing, the number of
defects, and the number of test cases are some examples of measurements. Depending
on what the data is used for, the granularity of measurement will vary.
While deciding what to measure, the following aspects need to be kept in mind.
1. What is measured should be of relevance to what we are trying to achieve.
For testing functions, we would obviously be interested in the effort spent on
testing, the number of test cases, the number of defects reported from test
cases, and so on.
2. The entities measured should be natural and should not involve too many
overheads for measurement. If there are too many overheads in making the
measurements, or if the measurements do not follow naturally from the actual
work being done, then the people who supply the data may resist giving the
measurement data.
(Figure 26 shows the steps: identify what to measure; transform measurements to
metrics; decide operational requirements; perform metrics analysis; take actions and
follow up; refine measurements and metrics.)
3. What is measured should be at the right level of granularity to satisfy the
objective for which the measurement is being made.
An approach to getting this granular detail is called data drilling.
It is important to provide as much granularity in measurement as possible. A set of
measurements can be combined to generate metrics. An example question involving
multiple measurements is “How many test cases produced the 40 defects in data
migration involving different schemas?” There are two measurements involved in
this question: the number of test cases and the number of defects. Hence, the second
step involved in metrics collection is defining how to combine data points or
measurements to provide meaningful metrics.
A particular metric can use one or more measurements. The operational requirement
for a metrics plan should lay down not only the periodicity but also other operational
issues such as who should collect measurements, who should receive the analysis, and
so on. The final step involved in a metrics plan is to take necessary action and follow
up on the action.
Why Metrics in Testing
Metrics are needed to know test case execution productivity and to estimate test
completion date.
Days needed to complete testing = (Total test cases yet to be executed) /
(Test case execution productivity)
The defect fixing trend collected over a period of time gives another estimate of the
defect-fixing capability of the team.
Total days needed for defect fixes = (Outstanding defects yet to be fixed +
Defects that can be found in future test cycles) / (Defect-fixing capability)
Hence, metrics helps in estimating the total days needed for fixing defects.
Once the time needed for testing and the time for defect fixing are known, the release
date can be estimated.
Days needed for release = Max (Days needed for testing, Days needed for
defect fixes)
The defect fixes may arrive after the regular test cycles are completed. These defect
fixes have to be verified by regression testing before the product can be released.
Hence the formula for days needed for release is to be modified as follows:
Days needed for release = Max [Days needed for testing, (Days needed for defect
fixes + Days needed for regressing outstanding defect fixes)]
The idea of discussing the formula here is to explain that metrics are important and
help in arriving at the release date for the product. Metrics are not used only for
reactive activities. Metrics and their analysis help in preventing defects
proactively, thereby saving cost and effort. Metrics are used in resource management
to identify the right size of product development teams.
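The release-date formulas above are straightforward arithmetic; the following sketch works them through, with all input figures made up purely for illustration:

```python
# Estimate release-date inputs from the formulas above. All numbers are
# hypothetical illustrations.
def days_needed_for_testing(tests_remaining, tests_per_day):
    # Total test cases yet to be executed / test case execution productivity
    return tests_remaining / tests_per_day

def days_needed_for_defect_fixes(outstanding, expected_future, fixes_per_day):
    # (Outstanding defects + defects expected in future cycles) / fixing capability
    return (outstanding + expected_future) / fixes_per_day

def days_needed_for_release(testing_days, fix_days, regression_days):
    # Release waits for the longer of testing and (fixing + regressing the fixes)
    return max(testing_days, fix_days + regression_days)

testing = days_needed_for_testing(300, 25)        # 12.0 days
fixing = days_needed_for_defect_fixes(40, 20, 6)  # 10.0 days
release = days_needed_for_release(testing, fixing, regression_days=5)
print(release)  # 15.0 -> fixing plus regression dominates testing here
```

Note how the regression term can make defect fixing, not test execution, the release bottleneck.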
To summarize, metrics in testing help in identifying
• When to make the release.
• What to release.
• Whether the product is being released with known quality.
5.4 Types of Metrics
Metrics can be classified into different types based on what they measure and what
area they focus on. Metrics can be classified as product metrics and process metrics.
Product metrics can be further classified as
Project metrics – a set of metrics that indicates how the project is planned and
executed.
Progress metrics – a set of metrics that tracks how the activities of the project are
progressing.
Productivity metrics – a set of metrics that takes into account various productivity
numbers that can be collected and used for planning and tracking testing activities.
Project Metrics
A typical project starts with requirements gathering and ends with product release. All
the phases that fall in between these points need to be planned and tracked. The
project scope gets translated to size estimates, which specify the quantum of work to
be done. This size estimate gets translated to effort estimate for each of the phases and
activities by using the available productivity data available. This initial effort is called
baselined effort.
As the project progresses and if the scope of the project changes then the effort
estimates are re-evaluated again and this re-evaluated effort estimate is called revised
effort.
Effort variance (planned vs actual)
If there is substantial difference between the baselined and revised effort, it points to
incorrect initial estimation. Calculating effort variance for each of the phases provides
a quantitative measure of the relative difference between the revised and actual
efforts.
Phase:       Req    Design   Coding   Testing   Doc    Defect fixing
Variance %:  7.1    8.7      5        0         40     15
Table. Sample variance percentage by phase.
Variance % = [( Actual effort – Revised estimate) / Revised estimate] * 100
A variance of more than 5% in any of the SDLC phases indicates scope for
improvement in the estimation. The variance can also be negative; a negative
variance is an indication of an overestimate. These variance numbers, along with
analysis, can help in better estimation for the next release or the next revised
estimation cycle.
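The variance formula above can be applied phase by phase; a minimal sketch, with hypothetical phase names and effort figures:

```python
# Effort variance % per SDLC phase, per the formula above.
# The phase names and (actual, revised) effort figures are hypothetical.
def variance_pct(actual_effort, revised_estimate):
    return (actual_effort - revised_estimate) / revised_estimate * 100

phases = {"Requirements": (45, 42), "Design": (87, 80), "Doc": (140, 100)}
for phase, (actual, revised) in phases.items():
    pct = variance_pct(actual, revised)
    flag = "over-estimate" if pct < 0 else ""
    print(phase, round(pct, 1), flag)
```

A negative result flags an overestimate, matching the interpretation in the paragraph above.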
Fig. 27 Types of metrics
(Figure 27 shows the classification:
Metrics → Product metrics and Process metrics.
Product metrics → Project metrics, Progress metrics, and Productivity metrics.
Project metrics: effort variance; schedule variance; effort distribution.
Progress metrics → Testing defect metrics and Development defect metrics.
Testing defect metrics: defect find rate; defect fix rate; outstanding defects rate;
priority outstanding rate; defects trend; defect classification trend; weighted defects
trend; defect cause distribution; test phase effectiveness; closed defects distribution.
Development defect metrics: component-wise defect distribution; defect density and
defect removal rate; age analysis of outstanding defects; introduced and reopened
defects rate.
Productivity metrics: defects per 100 hours of testing; test cases executed per 100
hours of testing; test cases developed per 100 hours; defects per 100 test cases;
defects per 100 failed test cases.)
Schedule variance (planned vs actual)
Schedule variance, like effort variance, is the deviation of the actual schedule from
the estimated schedule. There is one difference though. Depending on the SDLC
model used by the project, several phases could be active at the same time. Further,
the different phases in the SDLC are interrelated and could share the same set of
individuals. Because of all these complexities, schedule variance is calculated only at
the overall project level and at specific milestones, not with respect to each of the
SDLC phases.
Effort and schedule variance have to be analyzed in totality, not in isolation. This is
because while effort is a major driver of the cost, schedule determines how best a
product can exploit market opportunities. Variance can be classified into negative
variance, zero variance, acceptable variance and unacceptable variance. Generally
0-5% is considered as acceptable variance.
Effort distribution across phases
Variance calculation helps in finding out whether commitments are met on time and
whether the estimation method works well. The distribution percentage across the
different phases can be estimated at the time of planning and these can be compared
with the actual at the time of release for getting a comfort feeling on the release and
estimation methods.
Mature organizations spend at least 10-15% of the total effort in requirements and
approximately the same effort in the design phase. The effort percentage for testing
depends on the type of release and amount of change to the existing code base and
functionality. Typically, organizations spend about 20 – 50% of their total effort in
testing.
Progress Metrics
The number of defects that are found in the product is one of the main indicators of
quality. Hence, we will look at progress metrics that reflect the defects of a product.
Defects get detected by the testing team and get fixed by the development team.
Based on this, defect metrics are further classified into test defect metrics and
development defect metrics.
14
14
How many defects have already been found and how many more may get unearthed
are two parameters that determine product quality and its assessment; to assess them,
the progress of testing has to be understood. If only 50% of testing is complete and
100 defects have been found, then, assuming that the defects are uniformly
distributed over the product, another 80–100 defects can be estimated as residual
defects.
1. Test defect metrics
The next set of metrics helps us understand how the defects that are found can be
used to improve testing and product quality. Not all defects are equal in impact or
importance. Some organizations classify defects by assigning a defect priority. The
priority of a defect provides a management perspective for the order of defect fixes.
Some organizations use defect severity levels, which provide the test team a
perspective of the impact of that defect on product functionality. Since different
organizations use different methods of defining priorities and severities, a common
set of defect definitions and classifications is provided in the table given below.
Defect find rate
When the total number of defects found in the product is tracked and plotted at
regular intervals, from the beginning to the end of a product development cycle, it
may show a pattern of defect arrival. For a product to be fit for release, not only
should such a pattern of defect arrival be seen, but the number of defects arriving in
the final stretch should also be kept to a bare minimum. A bell curve, along with a
minimum number of defects found in the last few days, indicates that the release
quality of the product is likely to be good.
Defect classification   What it means
Extreme                 • Product crashes or is unusable
                        • Needs to be fixed immediately
Critical                • Basic functionality of the product not working
                        • Needs to be fixed before the next test cycle starts
Important               • Extended functionality of the product not working
                        • Does not affect the progress of testing
Minor                   • Product behaves differently
                        • No impact on the test team or customers
                        • Fix it when time permits
Cosmetic                • Minor irritant
                        • Need not be fixed for this release
Table. A common defect definition and classification
Defect fix rate
If the goal of testing is to find defects as early as possible, it is natural to expect that
the goal of development should be to fix defects as soon as they arrive. If the
defect-fixing curve is in line with defect arrival, a “bell curve” will again be the
result. There is a reason why the defect-fixing rate should be the same as the defect
arrival rate. As discussed under regression testing, when defects are fixed in the
product, the door is opened for the introduction of new defects. Hence, it is a good
idea to fix defects early and test those defect fixes thoroughly to find out all
introduced defects. If this principle is not followed, defects introduced by the defect
fixes may come up for testing just before the release and end up surfacing new
defects.
Outstanding defects rate
The number of defects outstanding in the product is calculated by subtracting the
total defects fixed from the total defects found in the product. If the defect-fixing
pattern is constant, like a straight line, the outstanding defects will again form a bell
curve. If the defect-fixing pattern matches the arrival rate, then the outstanding
defects curve will look like a straight line. However, it is not possible to fix all
defects when the arrival rate is at the top of the bell curve. Hence, the outstanding
defects rate results in a bell curve in many projects. When testing is in progress, the
number of outstanding defects should be kept very close to zero so that the
development team’s bandwidth is available to analyze and fix issues soon after they
arrive.
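The outstanding-defects computation is a running subtraction of fixed from found; a minimal sketch, with hypothetical weekly cumulative counts:

```python
# Outstanding defects = total defects found so far - total defects fixed so far.
# The weekly cumulative counts below are made up for illustration.
found_cum = [10, 35, 70, 95, 100]   # cumulative defects found, week by week
fixed_cum = [5, 25, 55, 90, 100]    # cumulative defects fixed, week by week

outstanding = [found - fixed for found, fixed in zip(found_cum, fixed_cum)]
print(outstanding)  # [5, 10, 15, 5, 0]
```

The series rises mid-cycle, when arrivals outpace fixes, and falls back toward zero as fixes catch up, which is the bell shape described above.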
Priority outstanding rate
Sometimes the defects that are coming out of testing may be very critical and may
take enormous effort to fix and to test. Hence, it is important to look at how many
serious issues are being uncovered in the product. The modification to the outstanding
defects rate curve by plotting only the high priority defects is called priority
outstanding defects. The priority outstanding defects correspond to extreme and
critical classification of defects. Some organizations include important defects also in
priority outstanding defects.
The effectiveness of analysis increases when the perspectives of find rate, fix rate,
outstanding defects, and priority outstanding defects are combined. Related trend
metrics include the defects trend, the defect classification trend, and the weighted
defects trend.
Development defect metrics
In this section we will look at how metrics can be used to improve development
activities. The defect metrics that directly help in improving development activities
are discussed in this section and are termed development defect metrics. While
defect metrics focus on the number of defects, development defect metrics try to map
those defects to different components of the product and to some of the parameters of
development such as lines of code.
Component-wise defect distribution
While it is important to count the number of defects in the product, for development
it is important to map them to different components of the product so that they can be
assigned to the appropriate developer to fix those defects. The project manager in
charge of development maintains a module ownership list where all product modules
and owners are listed. Based on the number of defects existing in each of the modules,
the project manager assigns resources accordingly.
Defect density and defect removal rate
A good quality product can have a long lifetime before becoming obsolete. The
lifetime of the product depends on its quality, over the different releases it goes
through. One of the metrics that correlates source code and defects is defect density.
This metric maps the defects in the product with the volume of code that is produced
for the product.
There are several standard formulae for calculating defect density. Of these, defects
per KLOC is the most practical and easy metric to calculate and plot. KLOC stands
for kilo lines of code; every 1000 lines of executable statements in the product are
counted as one KLOC.
The metric compares the defects per KLOC of the current release with previous
releases of the product. There are several variants of this metric to make it relevant to
releases, and one of them is calculated over AMD (added, modified, and deleted)
code to find out how a particular release affects product quality.
Defects per KLOC = (Total defects found in the product) / (Total executable
AMD lines of code, in KLOC)
Defects per KLOC can be used as a release criterion as well as a product quality
indicator with respect to code and defects. Defects found by the testing team have to
be fixed by the development team. The ultimate quality of the product depends on
both development and testing activities, and there is a need for a metric to analyze the
development and testing phases together and map them to the release. The defect
removal rate is used for this purpose.
The formula for calculating the defect removal rate is
Defect removal rate = (Defects found by verification activities + Defects found
in unit testing) / (Defects found by test teams) * 100
The above formula helps in finding the efficiency of verification activities and unit
testing, which are normally the responsibilities of the development team, and in
comparing them with the defects found by the testing teams. These metrics are
tracked over various releases to study release-on-release trends in the
verification/quality assurance activities.
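Both formulas above reduce to simple ratios; a sketch with invented counts (none of the figures come from the text):

```python
# Defect density (defects per KLOC of AMD code) and defect removal rate,
# per the two formulas above. All counts are hypothetical.
def defects_per_kloc(total_defects, amd_loc):
    # amd_loc: executable added/modified/deleted lines of code
    return total_defects / (amd_loc / 1000)

def defect_removal_rate(verification_defects, unit_test_defects,
                        test_team_defects):
    return (verification_defects + unit_test_defects) / test_team_defects * 100

print(defects_per_kloc(120, 40_000))    # 3.0 defects per KLOC
print(defect_removal_rate(30, 50, 40))  # 200.0
```

A removal rate above 100 means the development team's own verification and unit testing caught more defects than the test teams did, which is the desired direction for the release-on-release trend.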
Age analysis of outstanding defects
Age here refers to how long a defect has been waiting to be fixed. Some defects that
are difficult to fix, or that require significant effort, may get postponed for a longer
duration. Hence, the age of a defect in a way represents the complexity of the defect
fix needed. Given the complexity and time involved in fixing those defects, they need
to be tracked closely; otherwise they may get postponed until close to the release,
which may even delay the release. A method to track such defects is called age
analysis of outstanding defects.
Productivity Metrics
Productivity metrics combine several measurements and parameters with the effort
spent on the product. They help in finding out the capability of the team and serve
other purposes, such as
a. Estimating for the new release.
b. Finding out how well the team is progressing, and understanding the
reasons for variation in results.
c. Estimating the number of defects that can be found.
d. Estimating the release date and quality.
e. Estimating the cost involved in the release.
Defects per 100 hours of testing
Program testing can only prove the presence of defects, never their absence. Hence, it
is reasonable to conclude that there is no end to testing, and more testing may reveal
more new defects. But there may be a point of diminishing returns when further
testing reveals no new defects. If the number of incoming defects in the product is
reducing, it may mean various things.
1. Testing is not effective.
2. The quality of the product is improving.
3. The effort spent in testing is falling.
Defects per 100 hours of testing = (Total defects found in the product for a
period / Total hours spent to get those defects) * 100
Test cases executed per 100 hours of testing
The number of test cases executed by the test team for a particular duration depends
on team productivity and quality of product. The team productivity has to be
calculated accurately so that it can be tracked for the current release and be used to
estimate the next release of the product.
Test cases executed per 100 hours of testing = (Total test cases executed for a
period / Total hours spent in test execution) * 100
Test cases developed per 100 hours of testing
Both manual and automated test cases require estimating and tracking of
productivity numbers. In a product scenario, not all test cases are written afresh for
every release: new test cases are added to address new functionality and to test
features that were not tested earlier; existing test cases are modified to reflect changes
in the product; and some test cases are deleted if they are no longer useful or if the
corresponding features are removed from the product. Hence, the formula for test
cases developed uses the count corresponding to added, modified, and deleted test
cases.
Test cases developed per 100 hours of testing = (Total test cases developed for a
period / Total hours spent in test case development) * 100
Defects per 100 Test Cases
Since the goal of testing is to find as many defects as possible, it is appropriate to
measure the defect yield of tests, that is, how many defects get uncovered during
testing. This is a function of two parameters: the effectiveness of the test cases in
uncovering defects, and the quality of the product. The ability of a test case to
uncover defects depends on how well the test cases are designed and developed. But
in a typical product scenario, not all test cases are executed in every test cycle; hence
it is better to select the test cases that are likely to produce defects. If product quality
is poor, it produces more defects per 100 test cases compared to a good-quality
product. A measure that quantifies these two parameters is defects per 100 test cases.
The formula used for calculating this metric is
Defects per 100 test cases = (Total defects found for a period / Total test cases
executed for the same period) * 100
Defects per 100 Failed Test Cases
Defects per 100 failed test cases is a good measure of how granular the test cases
are. It indicates
• How many test cases need to be executed when a defect is fixed.
• Which defects need to be fixed so that an acceptable number of test cases
reach the pass state.
• How the fail rate of test cases and defects affect each other for release
readiness analysis.
Defects per 100 failed test cases = (Total defects found for a period / Total test
cases failed due to those defects) * 100
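The productivity metrics in this section all share one shape, (count / base) * 100; the following sketch computes them with invented figures (every number here is a hypothetical illustration):

```python
# One helper covers all the "per 100" productivity metrics above.
# Every input figure here is a hypothetical illustration.
def per_100(count, base):
    return count / base * 100

defects_per_100_hours = per_100(30, 600)           # 30 defects in 600 test hours
tests_executed_per_100_hours = per_100(450, 600)
tests_developed_per_100_hours = per_100(120, 400)  # added+modified+deleted cases
defects_per_100_test_cases = per_100(30, 450)
defects_per_100_failed_tests = per_100(30, 60)

print(defects_per_100_hours, defects_per_100_failed_tests)  # 5.0 50.0
```

Factoring out the common ratio makes it easy to track all five metrics from the same measurement records.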
Test phase effectiveness
As per the principles of testing, testing is not the job of testers alone. Developers
perform unit testing and there could be multiple testing teams performing component,
integration and system testing phases. The idea of testing is to find defects early in the
cycle and in the early phases of testing. As testing is performed by various teams with
the objective of finding defects early at various phases, a metric is needed to compare
the defects found in each of the phases of testing. The defects found in various phases
such as unit testing (UT), component testing (CT), integration testing (IT), and
system testing (ST) are plotted and analyzed.
Fig. 28 Test phase effectiveness
In the chart given (Figure 28), the total defects found by each test phase are plotted.
The following observations can be made.
1. A good proportion of defects were found in the early phases of testing (UT
and CT).
2. Product quality improved from phase to phase.
Closed defect distribution
The objective of testing is not only to find defects. The testing team also has
the objective of ensuring that all defects found through testing are fixed, so that the
customer gets the benefit of testing and the product quality improves. The testing
team has to track the defects and analyze how they are closed.
Release Metrics
The decision to release a product would need to consider several perspectives and
metrics. All the metrics that were discussed in the previous section need to be
considered in totality for making the release decision. The following table gives some
of the perspectives and some sample guidelines needed for release analysis.
(Figure 28 distribution of defects by phase: UT 39%, CT 32%, IT 17%, ST 12%.)
Metric               Perspectives to be considered     Guidelines
Test cases executed  Execution %, Pass %               100% of test cases to be executed;
                                                       test cases passed should be a
                                                       minimum of 98%
Effort distribution  Adequate effort has been spent    15-20% effort spent each on
                     on all phases                     requirements, design, and testing
                                                       phases
Defect find rate     Defect trend                      Defect arrival trend showing a bell
                                                       curve; incoming defects close to
                                                       zero in the last week
Defect fix rate      Defect fix trend                  Defect-fixing trend matching the
                                                       arrival trend
Table. Guidelines for release analysis
5.5 Summary
Automation uses software to test software, enabling human effort to be spent on
creative testing. Automation bridges the gap in skill requirements between testing
and development; at times it demands more skills of test teams. What to automate
takes into account the technical and management aspects, as well as the long-term
vision. The product and its automation are like the two rails of a railway track; they
run parallel in the same direction with similar expectations.
Test metrics are needed to know test case execution productivity and to estimate the
test completion date. To summarize, metrics in testing help in identifying when to
make the release, what to release, and whether the product is being released with
known quality.
5.6 Check your Progress
1. What is test automation? Why is it important?
2. Explain the scope of automation.
3. How will you make the design and architecture for automation?
4. Explain the generic requirements for test tool/framework.
5. How will you select a test tool?
6. What are the challenges involved in test automation?
7. What are test metrics and measurements?
8. Why are metrics needed in testing?
9. Explain the Project metrics.
10. Explain the Progress and Productivity metrics.