M.Sc Information Technology
SOFTWARE QUALITY ASSURANCE AND TESTING
Unit I
Principles of Testing – Software Development Life Cycle Models
Unit II
Testing Fundamentals – 1: White box Testing – Integration testing – System and
acceptance testing.
Unit III
Testing Fundamentals- 2 & Specialized testing: Performance testing – Regression
Testing – Testing of object oriented systems – Usability and accessibility testing.
Unit IV
Test planning, Management, Execution and Reporting
Unit V
Software Test Automation – Test Metrics and Measurements.
Text Book(s):
1. Software Testing – Srinivasan Desikan, Gopalaswamy Ramesh – Pearson
Education, 2006
References
1. Introducing Software Testing – Louis Tamres, Addison Wesley Publications,
First Edition.
2. Software Testing – Ron Patton, SAMS Techmedia, Indian Edition, 2001
3. Software Quality – Producing Practical, Consistent Software – Mordechai
Ben-Menachem, Garry S. Marliss, Thomson Learning, 2003.
UNIT - I
Structure
1.1 Objectives
1.2 Introduction
1.3 Software Testing Fundamentals
1.3.1. Software Chaos
1.3.2. Criteria for Project Success
1.4 Testing Principles
1.5 Software Development Life Cycle Models
1.5.1 Big-Bang
1.5.2 Code and fix
1.5.3 Waterfall
1.5.4 Prototype Model
1.5.5 The RAD Model
1.6. Evolutionary Software Process Models
1.6.1 The incremental model
1.6.2 The Spiral Model
1.6.3 The WIN-WIN spiral model
1.6.4 The Concurrent development model
1.7. Summary
1.8. Check your progress
1.1. Objectives
• To know the testing fundamentals and objectives
• To learn the principles of testing
• To understand the various life cycle models for software
1.2 Introduction
Testing presents an interesting anomaly for the software engineer. During earlier
software engineering activities, the engineer attempts to build software from an
abstract concept to a tangible product. Now comes testing. The engineer creates a
series of test cases that are intended to "demolish" the software that has been built. In
fact, testing is the one step in the software process that could be viewed
(psychologically, at least) as destructive rather than constructive. Software engineers
are by their nature constructive people. Testing requires that the developer discard
preconceived notions of the "correctness" of software just developed and overcome a
conflict of interest that occurs when errors are uncovered.
Beizer describes this situation effectively when he states:
There's a myth that if we were really good at programming, there would be no bugs
to catch. If only we could really concentrate, if only everyone used structured
programming, top down design, decision tables, if programs were written in SQUISH,
if we had the right silver bullets, then there would be no bugs. Therefore, testing and
test case design is an admission of failure, which instills a goodly dose of guilt. And
the tedium of testing is just punishment for our errors.
Software is tested from two perspectives: internal program logic is exercised using
"white box" test case design techniques, and software requirements are exercised
using "black box" test case design techniques. In both cases, the intent is to find the
maximum number of errors with the minimum amount of effort and time.
What is the work product? A set of test cases designed to exercise both internal logic
and external requirements is designed and documented, expected results are defined,
and actual results are recorded.
How do I ensure that I have done it right? When you begin testing, change your point
of view. Try hard to "break" the software!
1.3 Software Testing Fundamentals
The fundamental principles of testing are as follows:
1. The goal of testing is to find defects before customers find them.
2. Exhaustive testing is not possible; program testing can only show the presence
of defects, never their absence.
3. Testing applies all through the software life cycle and is not an end-of-cycle
activity.
4. Understand the reason behind the test.
5. Test the test first.
6. Tests develop immunity and have to be revised constantly.
7. Defects occur in convoys or clusters, and testing should focus on these
convoys.
8. Testing encompasses defect prevention.
9. Testing is a fine balance of defect prevention and defect detection.
10. Intelligent and well-planned automation is key to realizing the benefits of
testing.
11. Testing requires talented, committed people who believe in themselves and
work in teams.
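Principle 2, that exhaustive testing is not possible, can be made concrete with a
back-of-the-envelope calculation. The sketch below assumes a hypothetical test
harness throughput; the point is the order of magnitude, not the exact figure.

```python
# Why exhaustive testing is impossible: a function taking just two
# 32-bit integer inputs already has 2**64 distinct input combinations.
combinations = 2 ** 64

# Assume an (optimistic) harness running a billion tests per second.
tests_per_second = 10 ** 9
seconds_per_year = 60 * 60 * 24 * 365

years = combinations / (tests_per_second * seconds_per_year)
print(round(years))  # roughly 585 years of nonstop execution
```

Even under this generous assumption, covering every input pair of one small function
would take centuries, which is why testers must select cases rather than enumerate them.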
Software Testing Techniques
Should testing instill guilt? Is testing really destructive? The answer to these
questions is "No!" However, the objectives of testing are somewhat different than we
might expect.
Testing Objectives
In an excellent book on software testing, Glen Myers states a number of rules that can
serve well as testing objectives:
1. Testing is a process of executing a program with the intent of finding an error.
2. A good test case is one that has a high probability of finding an as-yet undiscovered
error.
3. A successful test is one that uncovers an as-yet-undiscovered error.
These objectives imply a dramatic change in viewpoint. They move counter to the
commonly held view that a successful test is one in which no errors are found. Our
objective is to design tests that systematically uncover different classes of errors and
to do so with a minimum amount of time and effort. If testing is conducted
successfully (according to the objectives stated previously), it will uncover errors in
the software. As a secondary benefit, testing demonstrates that software functions
appear to be working according to specification, that behavioral and performance
requirements appear to have been met. In addition, data collected as testing is
conducted provides a good indication of software reliability and some indication of
software quality as a whole. But testing cannot show the absence of errors and
defects; it can show only that software errors and defects are present. It is important
to keep this (rather
gloomy) statement in mind as testing is being conducted.
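Myers illustrates these objectives with his classic triangle-classification program.
The sketch below shows error-seeking tests in that spirit; the function and the
specific cases are illustrative stand-ins, not taken from the text.

```python
# A sketch of Myers' idea: tests written to expose errors,
# not merely to confirm expected behavior.

def classify_triangle(a, b, c):
    """Classify a triangle by its three side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("sides must be positive")
    if a + b <= c or b + c <= a or a + c <= b:
        raise ValueError("violates the triangle inequality")
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

def run_error_seeking_tests():
    """Probe boundaries where defects cluster, not just the happy path."""
    results = []
    results.append(classify_triangle(3, 4, 5) == "scalene")
    results.append(classify_triangle(2, 2, 3) == "isosceles")
    # Degenerate triangle: 1 + 2 == 3 must be rejected, not classified.
    try:
        classify_triangle(1, 2, 3)
        results.append(False)
    except ValueError:
        results.append(True)
    # A zero-length side must also be rejected.
    try:
        classify_triangle(0, 4, 5)
        results.append(False)
    except ValueError:
        results.append(True)
    return all(results)
```

A test such as the degenerate-triangle case has a high probability of exposing an
as-yet-undiscovered error, because it targets a boundary the programmer may have
overlooked.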
It is easy to take software for granted and not really appreciate how it has infiltrated
our daily lives. Most of us now can’t go a day without logging on to the internet and
checking our email. We rely on overnight packages, long distance phone service, and
cutting-edge medical treatments.
1.3.1 Software Chaos
Software is everywhere. However, it is written by people. So it is not perfect, as the
following examples show:
Disney’s Lion King, 1994-1995
In the fall of 1994, the Disney Company released its first multimedia CD-ROM game
for children. On December 26, customer support engineers were swamped with calls
from angry parents who could not get the software to work. It turns out that Disney
failed to properly test the software on the many different PC models available on the
market.
The other infamous software error case studies are listed below:
• Intel Pentium Floating Point Division Bug, 1994
• NASA Mars Polar Lander, 1999
• Patriot Missile Defense system, 1991
• The Y2K Bug, circa 1974
What is a Bug?
We have just read examples of what happens when software fails. In these instances,
it was obvious that the software did not operate as intended. Problem, error, and bug
are probably the most generic terms used.
Why do bugs occur?
The number one cause of software bugs is the specification. There are several reasons
why specifications are the largest bug producers. In many cases specifications are not
written. Other reasons may be that the specification is not thorough enough, it is
constantly changing, or it is not communicated well to the entire development team.
Planning software is vitally important. If it is not done correctly, bugs will be created.
The next largest source of bugs is the design. Coding errors may be more familiar to
you if you are a programmer.
The Cost of Bugs
Software does not just magically appear. There is usually a planned, methodical
development process used to create it. From its inception, through the planning,
programming and testing, to its use by the public, there is the potential for bugs to be
found. The cost to fix bugs increases dramatically over time.
What exactly does a software tester do?
The goals of a software tester are
• To find bugs
• To find them as early as possible
• To make sure they get fixed.
It has been said, “If you do not know where you are going, all roads lead there.”
Traditionally, many IT organizations annually develop a list of improvements to
incorporate into their operations without establishing a goal. Using this approach, the
IT organization can declare “victory” any time it wants. This lesson will help you
understand the importance of following a well-defined process for becoming a
world-class software testing organization. This lesson will help you define your
strengths and deficiencies, your staff competencies and deficiencies, and areas of user
dissatisfaction.
1.3.2 Criteria for Project Success
The Three-Step Process to Becoming a World-Class Testing Organization
The roadmap to become a world-class software testing organization is a simple
three-step process, as follows:
1. Define or adopt a world-class software testing model.
2. Determine your organization’s current level of software testing capabilities,
competencies, and user satisfaction.
3. Develop and implement a plan to upgrade from your current capabilities, competencies, and user satisfaction to those in the world-class software testing model.
This three-step process requires you to compare your current capabilities,
competencies, and user satisfaction against those of the world-class software testing
model. This assessment will enable you to develop a baseline of your organization’s
performance. The plan that you develop will, over time, move that baseline from its
current level of performance to a world-class level. Understanding the model for a
world-class software testing organization and then comparing your organization will
provide you with a plan for using the remainder of the material in this book.
Software testing is an integral part of the software-development process, which
comprises the following four components (see Figure 1):
1. Plan (P): Devise a plan. Define your objective and determine the strategy and
supporting methods to achieve it. You should base the plan on an assessment of your
current situation, and the strategy should clearly focus on the strategic initiatives/key
units that will drive your improvement plan.
2. Do (D): Execute the plan. Create the conditions and perform the necessary
training to execute the plan. Make sure everyone thoroughly understands the
objectives and the plan. Teach workers the procedures and skills they need to fulfill
the plan and thoroughly understand the job. Then perform the work according to these
procedures.
3. Check (C): Check the results. Check to determine whether work is progressing
according to the plan and whether the expected results are being obtained. Check for
performance of the set procedures, changes in conditions, or abnormalities that may
appear. As often as possible, compare the results of the work with the objectives.
4. Act (A): Take the necessary action. If your checkup reveals that the work is not
being performed according to the plan or that results are not what you anticipated,
devise measures to take appropriate actions.
Fig. 1 The four components of a software development process.
Testing involves only the “check” component of the plan-do-check-act (PDCA)
cycle. The software development team is responsible for the three remaining
components. The development team plans the project and builds the software (the
“do” component); the testers check to determine that the software meets the needs of
the customers and users. If it does not, the testers report defects to the development
team. It is the development team that makes the determination as to whether the
uncovered defects are to be corrected. The role of testing is to fulfill the check
responsibilities assigned to the testers; it is not to determine whether the software can
be placed into production. That is the responsibility of the customers, users, and
development team.
1.4 Testing Principles
Before applying methods to design effective test cases, a software engineer must
understand the basic principles that guide software testing. Davis suggests a set of
testing principles that have been adapted for use in this book:
• All tests should be traceable to customer requirements. As we have seen, the
objective of software testing is to uncover errors. It follows that the most severe
defects (from the customer’s point of view) are those that cause the program to fail to
meet its requirements.
• Tests should be planned long before testing begins. Test planning can begin as
soon as the requirements model is complete. All tests can be planned and designed
before any code has been generated.
• The Pareto principle applies to software testing. Stated simply, the Pareto
principle implies that 80 percent of all errors uncovered during testing will likely be
traceable to 20 percent of all program components. The problem, of course, is to
isolate these suspect components and to thoroughly test them.
• Testing should begin “in the small” and progress toward testing “in the large.”
The first tests planned and executed generally focus on individual components. As
testing progresses, focus shifts in an attempt to find errors in integrated clusters of
components and ultimately in the entire system.
• Exhaustive testing is not possible. The number of path permutations for even a
moderately sized program is exceptionally large. For this reason, it is impossible to
execute every combination of paths during testing. It is possible, however, to
adequately cover program logic and to ensure that all conditions in the
component-level design have been exercised.
• To be most effective, testing should be conducted by an independent third
party. By most effective, we mean testing that has the highest probability of finding
errors (the primary objective of testing).
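The Pareto principle above can be put to work directly on defect data. The sketch
below uses a hypothetical defect log (the module names and counts are invented for
illustration) to find the smallest set of components accounting for about 80 percent
of the errors.

```python
# A minimal sketch of Pareto analysis on a defect log.
# Module names and counts are hypothetical illustration data.
from collections import Counter

defect_log = (
    ["parser"] * 34 + ["scheduler"] * 22 + ["ui"] * 5 +
    ["report"] * 4 + ["auth"] * 3 + ["config"] * 2
)

counts = Counter(defect_log).most_common()
total = sum(n for _, n in counts)

# Accumulate the worst offenders until ~80% of defects are explained.
cumulative, suspects = 0, []
for module, n in counts:
    cumulative += n
    suspects.append(module)
    if cumulative / total >= 0.8:
        break

print(suspects)            # the components to test most thoroughly
print(cumulative / total)  # fraction of defects they explain
```

Here two of six modules account for 80 percent of the logged defects, so those two
are the suspect components that deserve the most thorough testing.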
Testability
In ideal circumstances, a software engineer designs a computer program, a system, or
a product with “testability” in mind. This enables the individuals charged with testing
to design effective test cases more easily. But what is testability? James Bach
describes testability in the following manner. Software testability is simply how easily
[a computer program] can be tested. Since testing is so profoundly difficult, it pays to
know what can be done to streamline it. Sometimes programmers are willing to do
things that will help the testing process and a checklist of possible design points,
features, etc., can be useful in negotiating with them. There are certainly metrics that
could be used to measure testability in most of its aspects.
Operability. "The better it works, the more efficiently it can be tested."
• The system has few bugs (bugs add analysis and reporting overhead to the test
process).
• No bugs block the execution of tests.
• The product evolves in functional stages (allows simultaneous development and
testing).
Observability. "What you see is what you test."
• Distinct output is generated for each input.
• System states and variables are visible or queriable during execution.
• Past system states and variables are visible or queriable (e.g., transaction logs).
• All factors affecting the output are visible.
• Incorrect output is easily identified.
• Internal errors are automatically detected through self-testing mechanisms.
• Internal errors are automatically reported.
• Source code is accessible.
Controllability. "The better we can control the software, the more the testing can be
automated and optimized."
• All possible outputs can be generated through some combination of input.
• All code is executable through some combination of input.
• Software and hardware states and variables can be controlled directly by the test
engineer.
• Input and output formats are consistent and structured.
• Tests can be conveniently specified, automated, and reproduced.
Decomposability. "By controlling the scope of testing, we can more quickly isolate
problems and perform smarter retesting."
• The software system is built from independent modules.
• Software modules can be tested independently.
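Testing a module independently usually means replacing its collaborators with stubs.
The sketch below is a hypothetical illustration of that idea, with the dependency
injected so the module under test needs nothing outside itself.

```python
# A sketch of decomposability: the module under test takes its
# collaborator as a parameter, so a stub can stand in for it.

def apply_discount(price, rate_lookup):
    """Discount a price using an injected rate-lookup function."""
    rate = rate_lookup(price)
    return round(price * (1 - rate), 2)

# In production, rate_lookup might query a pricing service. For an
# independent unit test, a stub with known answers is enough.
def stub_lookup(price):
    return 0.10 if price >= 100 else 0.0

assert apply_discount(200.0, stub_lookup) == 180.0
assert apply_discount(50.0, stub_lookup) == 50.0
```

Because the module's only dependency is passed in, a failure observed here can be
isolated to the module itself, which is exactly the retesting benefit decomposability
promises.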
Simplicity. "The less there is to test, the more quickly we can test it."
• Functional simplicity (e.g., the feature set is the minimum necessary to meet
requirements).
• Structural simplicity (e.g., architecture is modularized to limit the propagation of
faults).
• Code simplicity (e.g., a coding standard is adopted for ease of inspection and
maintenance).
Stability. "The fewer the changes, the fewer the disruptions to testing."
• Changes to the software are infrequent.
• Changes to the software are controlled.
• Changes to the software do not invalidate existing tests.
• The software recovers well from failures.
Understandability. "The more information we have, the smarter we will test."
• The design is well understood.
• Dependencies between internal, external, and shared components are well
understood.
• Changes to the design are communicated.
• Technical documentation is instantly accessible.
• Technical documentation is well organized.
• Technical documentation is specific and detailed.
• Technical documentation is accurate.
The attributes suggested by Bach can be used by a software engineer to develop a
software configuration (i.e., programs, data, and documents) that is amenable to
testing. And what about the tests themselves? Kaner, Falk, and Nguyen suggest the
following attributes of a “good” test:
1. A good test has a high probability of finding an error. To achieve this goal, the
tester must understand the software and attempt to develop a mental picture of how
the software might fail. Ideally, the classes of failure are probed. For example, one
class of potential failure in a GUI (Graphical User Interface) is a failure to recognize
proper mouse position. A set of tests would be designed to exercise the mouse in an
attempt to demonstrate an error in mouse position recognition.
2. A good test is not redundant. Testing time and resources are limited. There is no
point in conducting a test that has the same purpose as another test. Every test should
have a different purpose (even if it is subtly different). For example, a module of the
Safe Home software is designed to recognize a user password to activate and
deactivate the system. In an effort to uncover an error in password input, the tester
designs a series of tests that input a sequence of passwords. Valid and invalid
passwords (four numeral sequences) are input as separate tests.
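The non-redundancy idea can be sketched in code. Below, a hypothetical stand-in for
the SafeHome password validator is exercised by a table of cases, each annotated with
the distinct failure it targets, so no test duplicates another's purpose.

```python
# A sketch of non-redundant test design. The validator is a
# hypothetical stand-in for the SafeHome password module.

def is_valid_password(pw):
    """Accept exactly four numerals, per the example specification."""
    return len(pw) == 4 and pw.isdigit()

# Each case probes a *different* potential failure.
cases = [
    ("1234",  True,  "nominal valid password"),
    ("123",   False, "too short by one"),
    ("12345", False, "too long by one"),
    ("12a4",  False, "non-numeral character"),
    ("",      False, "empty input"),
]

for pw, expected, purpose in cases:
    assert is_valid_password(pw) == expected, purpose
```

Adding, say, a second nominal case such as "5678" would buy almost nothing: it has
the same purpose as "1234" and would consume test time without probing a new class
of failure.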
1.5 Software Development Life Cycle Models
To solve actual problems in an industry setting, a software engineer or a team of
engineers must incorporate a development strategy that encompasses the process,
methods, and tools layers and the generic phases. This strategy is often referred to as a
process model or a software engineering paradigm. A process model for software
engineering is chosen based on the nature of the project and application, the methods
and tools to be used, and the controls and deliverables that are required. In an
intriguing paper on the nature of the software process, L. B. S. Raccoon [RAC95]
uses fractals as the basis for a discussion of the true nature of the software process.
“Too often, software work follows the first law of bicycling: No matter where you're
going, it's uphill and against the wind.”
In the sections that follow, a variety of different process models for software
engineering are discussed. Each represents an attempt to bring order to an inherently
chaotic activity. It is important to remember that each of the models has been
characterized in a way that (ideally) assists in the control and coordination of a real
software project.
A life cycle model describes how the phases combine to form a complete
project or life cycle. Such a model is characterized by the following attributes:
The activities performed
The deliverables from each activity
Methods of validation of the deliverables
The sequence of activities
Methods of verification of each activity, including the mechanism of
communication amongst the activities.
The process used to create a software product from its initial conception to its release
is known as the software development life cycle model.
1.5.1 Big-Bang Model
One theory of the creation of the universe is the big-bang theory. It states that billions
of years ago, the universe was created in a single huge explosion of nearly infinite
energy. Everything that exists is the result of this energy. A big-bang model for
software development follows the same principle. A huge amount of matter (people
and money) is put together, a lot of energy is expended, often violently, and out
comes the perfect software product. The beauty of the big-bang method is that it's
simple.
There is little if any planning, scheduling, or formal development process. All the
effort is spent developing the software and writing the code. It is an ideal process if
the product requirements are not well understood and the final release date is flexible.
It is also important to have very flexible customers, because they won't know
what they are getting until the very end.
Fig.2 Big-Bang Model
Notice that testing is not shown in the figure. In most cases, there is little to no formal
testing done under the big-bang model. If testing does occur, it is squeezed in just
before the product is released. If you are called in to test a product under the big-bang
model, you have both an easy and a difficult task. Because the software is already
complete, you have the perfect specification: the product itself. And, because it's
impossible to go back and fix things that are broken, your job is really just to report
what you find so the customers can be told about the problems. The downside is that,
in the eyes of project management, the product is ready to go, so your work is holding
up delivery to the customer. The longer you take to do your job and the more bugs
you find, the more contentious the situation will become. Try to stay away from
testing in this model.
1.5.2 Code and Fix Model
The code and fix model is usually the one that project teams fall into by default if
they don’t consciously attempt to use something else. It is a step up, procedurally,
from the big-bang model in that it at least requires some idea of what the product
requirements are.
Fig. 3 Code and Fix model
A team using this approach usually starts with a rough idea of what they want, does
some simple design, and then proceeds into a long repeating cycle of coding, testing
and fixing bugs. At some point they decide that it is enough and release the product.
As there is very little overhead for planning and documenting, a project team can
show results immediately. For this reason the code and fix model works very well for
some projects intended to be created quickly and then thrown out shortly after they
are done, such as prototypes and demos. Even so, code and fix has been used on many
large and well-known software products. If your word processor or spreadsheet
software has lots of little bugs or it just doesn’t seem quite finished, it was likely
created with the code and fix model.
As a tester on a code and fix project, you need to be aware that you, along with the
programmers, will be in a constant state of cycling. As often as every day you will be
given new or updated releases of the software and will set off to test it. You will run
your tests, report the bugs, and then get a new software release. You may not have
finished testing the previous release when the new one arrives, and the new one may
have new or changed features. Eventually, you will get a chance to test most of the
features, find fewer and fewer bugs, and then someone will decide that it is time to
release the product.
1.5.3 Waterfall Model
Sometimes called the classic life cycle or the waterfall model, the linear sequential
model suggests a systematic, sequential approach to software development that begins
at the system level and progresses through analysis, design, coding, testing, and
support. Modeled after a conventional engineering cycle, the linear sequential model
encompasses the following activities:
System/information engineering and modeling. Because software is always part of
a larger system (or business), work begins by establishing requirements for all system
elements and then allocating some subset of these requirements to software. This
system view is essential when software must interact with other elements such as
hardware, people, and databases. System engineering and analysis encompass
requirements gathering at the system level with a small amount of top level design
and analysis. Information engineering encompasses requirements gathering at the
strategic business level and at the business area level.
Software requirements analysis. The requirements gathering process is intensified
and focused specifically on software. To understand the nature of the program(s) to be
built, the software engineer ("analyst") must understand the information domain for
the software, as well as required function, behavior, performance, and interface.
Requirements for both the system and the software are documented and reviewed with
the customer.
Design. Software design is actually a multistep process that focuses on four distinct
attributes of a program: data structure, software architecture, interface representations,
and procedural (algorithmic) detail. The design process translates requirements into a
representation of the software that can be assessed for quality before coding begins.
Like requirements, the design is documented and becomes part of the software
configuration.
Code generation. The design must be translated into a machine-readable form. The
code generation step performs this task. If design is performed in a detailed manner,
code generation can be accomplished mechanistically.
Testing. Once code has been generated, program testing begins. The testing process
focuses on the logical internals of the software, ensuring that all statements have been
tested, and on the functional externals; that is, conducting tests to uncover errors and
ensure that defined input will produce actual results that agree with required results.
Support. Software will undoubtedly undergo change after it is delivered to the
customer (a possible exception is embedded software). Change will occur because
errors have been encountered, because the software must be adapted to accommodate
changes in its external environment (e.g., a change required because of a new
operating system or peripheral device), or because the customer requires functional or
performance enhancements. Software support/maintenance reapplies each of the
preceding phases to an existing program rather than a new one.
The waterfall model is usually the first one taught in programming school.
Fig. 4 Waterfall model.
The above Figure.4 shows the steps involved in this model. A project using the
waterfall model moves down a series of steps starting from an initial idea to a final
product. At the end of each step, the project team holds a review to determine if they
are ready to move to the next step. Notice three important things about the waterfall
model:
• There is a large emphasis on specifying what the product will be.
• The steps are discrete; there is no overlap.
• There is no way to back up. As soon as you are on a step, you need to
complete the tasks for that step and then move on; you can't go back.
The advantage is that everything is carefully and thoroughly specified. But with this
advantage comes a large disadvantage. Because testing occurs only at the end, a
fundamental problem could creep in early on and not be detected until days before the
scheduled product release.
The linear sequential model is the oldest and the most widely used paradigm for
software engineering. However, criticism of the paradigm has caused even active
supporters to question its efficacy. Among the problems that are sometimes
encountered when the linear sequential model is applied are:
1. Real projects rarely follow the sequential flow that the model proposes. Although
the linear model can accommodate iteration, it does so indirectly. As a result, changes
can cause confusion as the project team proceeds.
2. It is often difficult for the customer to state all requirements explicitly. The linear
sequential model requires this and has difficulty accommodating the natural
uncertainty that exists at the beginning of many projects.
3. The customer must have patience. A working version of the program(s) will not be
available until late in the project time-span. A major blunder, if undetected until the
working program is reviewed, can be disastrous.
In an interesting analysis of actual projects, Bradac [BRA94] found that the linear
nature of the classic life cycle leads to “blocking states” in which some project team
members must wait for other members of the team to complete dependent tasks. In
fact, the time spent waiting can exceed the time spent on productive work! The
blocking state tends to be more prevalent at the beginning and end of a linear
sequential process.
Each of these problems is real. However, the classic life cycle paradigm has a
definite and important place in software engineering work. It provides a template into
which methods for analysis, design, coding, testing, and support can be placed. The
classic life cycle remains a widely used procedural model for software engineering.
While it does have weaknesses, it is significantly better than a haphazard approach to
software development.
1.5.4 The Prototyping Model
Often, a customer defines a set of general objectives for software but does not
identify detailed input, processing, or output requirements. In other cases, the
developer may be unsure of the efficiency of an algorithm, the adaptability of an
operating system, or the form that human/machine interaction should take. In these,
and many other situations, a prototyping paradigm may offer the best approach.
The prototyping paradigm (Figure 5) begins with requirements gathering. Developer
and customer meet and define the overall objectives for the software, identify
whatever requirements are known, and outline areas where further definition is
mandatory. A "quick design" then occurs. The quick design focuses on a
representation of those aspects of the software that will be visible to the customer/user
(e.g., input approaches and output formats). The quick design leads to the construction
of a prototype. The prototype is evaluated by the customer/user and used to refine
requirements for the software to be developed. Iteration occurs as the prototype is
tuned to satisfy the needs of the customer, while at the same time enabling the
developer to better understand what needs to be done.
Fig. 5 Prototyping Paradigm
Ideally, the prototype serves as a mechanism for identifying software requirements.
If a working prototype is built, the developer attempts to use existing program
fragments or applies tools (e.g., report generators, window managers) that enable
working programs to be generated quickly. But what do we do with the prototype
when it has served the purpose just described? Brooks [BRO75] provides an answer:
In most projects, the first system built is barely usable. It may be too slow, too big,
awkward in use or all three. There is no alternative but to start again, smarting but
smarter, and build a redesigned version in which these problems are solved . . . When
a new system concept or new technology is used, one has to build a system to throw
away, for even the best planning is not so omniscient as to get it right the first time.
The management question, therefore, is not whether to build a pilot system and throw
it away. You will do that. The only question is whether to plan in advance to build a
throwaway, or to promise to deliver the throwaway to customers.
The prototype can serve as "the first system." The one that Brooks recommends we
throw away. But this may be an idealized view. It is true that both customers and
developers like the prototyping paradigm. Users get a feel for the actual system and
developers get to build something immediately. Yet, prototyping can also be
problematic for the following reasons:
1. The customer sees what appears to be a working version of the software, unaware
that the prototype is held together “with chewing gum and baling wire,” unaware that
in the rush to get it working no one has considered overall software quality or
long-term maintainability. When informed that the product must be rebuilt so that
high levels of quality can be maintained, the customer cries foul and demands that "a
few fixes" be applied to make the prototype a working product. Too often, software
development management relents.
2. The developer often makes implementation compromises in order to get a
prototype working quickly. An inappropriate operating system or programming
language may be used simply because it is available and known; an inefficient
algorithm may be implemented simply to demonstrate capability.
After a time, the developer may become familiar with these choices and forget all the
reasons why they were inappropriate. The less-than-ideal choice has now become an
integral part of the system.
Although problems can occur, prototyping can be an effective paradigm for software
engineering. The key is to define the rules of the game at the beginning; that is, the
customer and developer must both agree that the prototype is built to serve as a
mechanism for defining requirements. It is then discarded (at least in part) and the
actual software is engineered with an eye toward quality and maintainability.
1.5.5 The RAD Model
Rapid application development (RAD) is an incremental software development
process model that emphasizes an extremely short development cycle. The RAD
model is a “high-speed” adaptation of the linear sequential model in which rapid
development is achieved by using component-based construction. If requirements are
well understood and project scope is constrained, the RAD process enables a
development team to create a “fully functional system” within very short time periods
(e.g., 60 to 90 days). Used primarily for information systems applications, the RAD
approach encompasses the following phases:
Fig. 6 The RAD Model
Business modeling. The information flow among business functions is modeled in a
way that answers the following questions: What information drives the business
process? What information is generated? Who generates it? Where does the
information go? Who processes it?
Data modeling. The information flow defined as part of the business modeling phase
is refined into a set of data objects that are needed to support the business. The
characteristics (called attributes) of each object are identified and the relationships
between these objects defined.
Process modeling. The data objects defined in the data modeling phase are
transformed to achieve the information flow necessary to implement a business
function. Processing descriptions are created for adding, modifying, deleting, or
retrieving a data object.
Application generation. RAD assumes the use of fourth generation techniques.
Rather than creating software using conventional third generation programming
languages the RAD process works to reuse existing program components (when
possible) or create reusable components (when necessary). In all cases, automated
tools are used to facilitate construction of the software.
Testing and turnover. Since the RAD process emphasizes reuse, many of the
program components have already been tested. This reduces overall testing time.
However, new components must be tested and all interfaces must be fully exercised.
The RAD process model is illustrated in Figure 6. Obviously, the time constraints
imposed on a RAD project demand “scalable scope” [KER94]. If a business
application can be modularized in a way that enables each major function to be
completed in less than three months (using the approach described previously), it is a
candidate for RAD. Each major function can be addressed by a separate RAD team
and then integrated to form a whole.
Like all process models, the RAD approach has drawbacks:
• For large but scalable projects, RAD requires sufficient human resources to create
the right number of RAD teams.
• RAD requires developers and customers who are committed to the rapid-fire
activities necessary to get a system complete in a much abbreviated time frame. If
commitment is lacking from either constituency, RAD projects will fail.
• Not all types of applications are appropriate for RAD. If a system cannot be properly
modularized, building the components necessary for RAD will be problematic. If high
performance is an issue and performance is to be achieved through tuning the
interfaces to system components, the RAD approach may not work.
• RAD is not appropriate when technical risks are high. This occurs when a new
application makes heavy use of new technology or when the new software requires a
high degree of interoperability with existing computer programs.
1.6 Evolutionary Software Process Models
There is growing recognition that software, like all complex systems, evolves over a
period of time. Business and product requirements often change as development
proceeds, making a straight path to an end product unrealistic; tight market deadlines
make completion of a comprehensive software product impossible, but a limited
version must be introduced to meet competitive or business pressure; a set of core
product or system requirements is well understood, but the details of product or
system extensions have yet to be defined. In these and similar situations, software
engineers need a process model that has been explicitly designed to accommodate a
product that evolves over time.
The linear sequential model is designed for straight-line development. In essence, this
waterfall approach assumes that a complete system will be delivered after the linear
sequence is completed. The prototyping model is designed to assist the customer (or
developer) in understanding requirements. In general, it is not designed to deliver a
production system. The evolutionary nature of software is not considered in either of
these classic software engineering paradigms.
Evolutionary models are iterative. They are characterized in a manner that enables
software engineers to develop increasingly more complete versions of the software.
1.6.1 The Incremental Model
The incremental model combines elements of the linear sequential model (applied
repetitively) with the iterative philosophy of prototyping. Referring to Figure 7, the
incremental model applies linear sequences in a staggered fashion as calendar time
progresses. Each linear sequence produces a deliverable “increment” of the software
[MDE93]. For example, word-processing software developed using the incremental
paradigm might deliver basic file management, editing, and document production
functions in the first increment; more sophisticated editing and document production
capabilities in the second increment; spelling and grammar checking in the third
increment; and advanced page layout capability in the fourth increment. It should be
noted that the process flow for any increment can incorporate the prototyping
paradigm.
When an incremental model is used, the first increment is often a core product.
That is, basic requirements are addressed, but many supplementary features (some
known, others unknown) remain undelivered. The core product is used by the
customer (or undergoes detailed review). As a result of use and/or evaluation, a plan
is developed for the next increment. The plan addresses the modification of the core
product to better meet the needs of the customer and the delivery of additional
features and functionality. This process is repeated following the delivery of each
increment, until the complete product is produced.
Fig. 7 The Incremental Model
The incremental process model, like prototyping and other evolutionary approaches,
is iterative in nature. But unlike prototyping, the incremental model focuses on the
delivery of an operational product with each increment. Early increments are stripped
down versions of the final product, but they do provide capability that serves the user
and also provide a platform for evaluation by the user.
Incremental development is particularly useful when staffing is unavailable for a
complete implementation by the business deadline that has been established for the
project. Early increments can be implemented with fewer people. If the core product
is well received, then additional staff (if required) can be added to implement the next
increment. In addition, increments can be planned to manage technical risks. For
example, a major system might require the availability of new hardware that is under
development and whose delivery date is uncertain. It might be possible to plan early
increments in a way that avoids the use of this hardware, thereby enabling partial
functionality to be delivered to end-users without inordinate delay.
1.6.2 The Spiral Model
Fig. 8 The spiral model
The spiral model, originally proposed by Boehm, is an evolutionary software process
model that couples the iterative nature of prototyping with the controlled and
systematic aspects of the linear sequential model. It provides the potential for rapid
development of incremental versions of the software. Using the spiral model, software
is developed in a series of incremental releases. During early iterations, the
incremental release might be a paper model or prototype. During later iterations,
increasingly more complete versions of the engineered system are produced. A spiral
model is divided into a number of framework activities, also called task regions.
Figure 8 depicts a spiral model that contains six task regions:
• Customer communication—tasks required to establish effective
communication between developer and customer.
• Planning—tasks required to define resources, timelines, and other project
related information.
• Risk analysis—tasks required to assess both technical and management risks.
• Engineering—tasks required to build one or more representations of the
application.
• Construction and release—tasks required to construct, test, install, and
provide user support (e.g., documentation and training).
• Customer evaluation – tasks required to evaluate the project.
The spiral model is a realistic approach to the development of large-scale systems
and software. Because software evolves as the process progresses, the developer and
customer better understand and react to risks at each evolutionary level. The spiral
model uses prototyping as a risk reduction mechanism but, more important, enables
the developer to apply the prototyping approach at any stage in the evolution of the
product. It maintains the systematic stepwise approach suggested by the classic life
cycle but incorporates it into an iterative framework that more realistically reflects the
real world. The spiral model demands a direct consideration of technical risks at all
stages of the project and, if properly applied, should reduce risks before they become
problematic.
But like other paradigms, the spiral model is not a panacea. It may be difficult to
convince customers (particularly in contract situations) that the evolutionary approach
is controllable. It demands considerable risk assessment expertise and relies on this
expertise for success. If a major risk is not uncovered and managed, problems will
undoubtedly occur. Finally, the model has not been used as widely as the linear
sequential or prototyping paradigms. It will take a number of years before the efficacy of
this important paradigm can be determined with absolute certainty.
1.6.3 The WIN-WIN Spiral Model
The spiral model discussed in the previous Section suggests a framework activity that
addresses customer communication. The objective of this activity is to elicit project
requirements from the customer. In an ideal context, the developer simply asks the
customer what is required and the customer provides sufficient detail to proceed.
Unfortunately, this rarely happens. In reality, the customer and the developer enter
into a process of negotiation, where the customer may be asked to balance
functionality, performance, and other product or system characteristics against cost
and time to market.
The best negotiations strive for a “win-win” result. That is, the customer wins by
getting the system or product that satisfies the majority of the customer’s needs and
the developer wins by working to realistic and achievable budgets and deadlines.
Boehm’s WINWIN spiral model defines a set of negotiation activities at the
beginning of each pass around the spiral. Rather than a single customer
communication activity, the following activities are defined:
1. Identification of the system or subsystem’s key “stakeholders.”
2. Determination of the stakeholders’ “win conditions.”
3. Negotiation of the stakeholders’ win conditions to reconcile them into a set of
win-win conditions for all concerned (including the software project team).
Successful completion of these initial steps achieves a win-win result, which becomes
the key criterion for proceeding to software and system definition. The WINWIN
spiral model is illustrated in the following Figure.
Fig. 9 The WIN WIN Spiral Model
In addition to the emphasis placed on early negotiation, the WINWIN spiral model
introduces three process milestones, called anchor points that help establish the
completion of one cycle around the spiral and provide decision milestones before the
software project proceeds.
In essence, the anchor points represent three different views of progress as the project
traverses the spiral. The first anchor point, life cycle objectives (LCO), defines a set of
objectives for each major software engineering activity. For example, as part of LCO,
a set of objectives establishes the definition of top-level system/product requirements.
The second anchor point, life cycle architecture (LCA), establishes objectives that
must be met as the system and software architecture is defined. For example, as part
of LCA, the software project team must demonstrate that it has evaluated the
applicability of off-the-shelf and reusable software components and considered their
impact on architectural decisions. Initial operational capability (IOC) is the third
anchor point and represents a set of objectives associated with the preparation of the
software for installation/distribution, site preparation prior to installation, and
assistance required by all parties that will use or support the software.
1.6.4 The Concurrent Development Model
The concurrent process model can be represented schematically as a series of major
technical activities, tasks, and their associated states. For example, the engineering
activity defined for the spiral model is accomplished by invoking the following tasks:
prototyping and/or analysis modeling, requirements specification, and design.
Fig. 10 The Concurrent Development Model
Figure 10 provides a schematic representation of one activity within the concurrent
process model. The activity—analysis—may be in any one of the states noted at any
given time. Similarly, other activities (e.g., design or customer communication) can
be represented in an analogous manner. All activities exist concurrently but reside in
different states. For example, early in a project the customer communication activity
(not shown in the figure) has completed its first iteration and exists in the awaiting
changes state. The analysis activity (which existed in the none state while initial
customer communication was completed) now makes a transition into the under
development state. If, however, the customer indicates that changes in requirements
must be made, the analysis activity moves from the under development
State into the awaiting changes state.
The concurrent process model defines a series of events that will trigger transitions
from state to state for each of the software engineering activities. For example, during
early stages of design, an inconsistency in the analysis model is uncovered. This
generates the event analysis model correction which will trigger the analysis activity
from the done state into the awaiting changes state.
The concurrent process model is often used as the paradigm for the development of
components. When applied to client/server, the concurrent process model defines
components. When applied to client/server, the concurrent process model defines
activities in two dimensions: a system dimension and a component dimension. System
level issues are addressed using three activities: design, assembly, and use. The
component dimension is addressed with two activities: design and realization.
Concurrency is achieved in two ways: (1) system and component activities occur
simultaneously and can be modeled using the state-oriented approach described
previously; (2) a typical client/server application is implemented with many
components, each of which can be designed and realized concurrently.
In reality, the concurrent process model is applicable to all types of software
development and provides an accurate picture of the current state of a project. Rather
than confining software engineering activities to a sequence of events, it defines a
network of activities. Each activity on the network exists simultaneously with other
activities. Events generated within a given activity or at some other place in the
activity network trigger transitions among the states of an activity.
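The event-triggered state transitions described above can be sketched as a small state machine. The states and events below are illustrative, drawn loosely from the discussion (only a few of the model's states are shown), not a complete rendering of the concurrent process model:

```python
class Activity:
    """One software engineering activity (e.g., analysis or design)
    in the concurrent process model; every activity exists at all
    times, but resides in some state."""

    # Events trigger transitions from state to state; e.g., an
    # analysis-model correction moves the activity from "done"
    # back into "awaiting changes".
    TRANSITIONS = {
        ("none", "start"): "under development",
        ("under development", "complete"): "done",
        ("done", "correction needed"): "awaiting changes",
        ("awaiting changes", "resume"): "under development",
    }

    def __init__(self, name):
        self.name = name
        self.state = "none"

    def on_event(self, event):
        # Unknown (state, event) pairs leave the state unchanged.
        self.state = self.TRANSITIONS.get((self.state, event), self.state)

analysis = Activity("analysis")
analysis.on_event("start")
analysis.on_event("complete")
analysis.on_event("correction needed")
print(analysis.state)  # awaiting changes
```

Other activities (design, customer communication, and so on) would be further `Activity` instances, each concurrently in its own state.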
1.7 Summary
Software engineering is a discipline that integrates process, methods, and tools for the
development of computer software. A number of different process models for
software engineering have been proposed, each exhibiting strengths and weaknesses,
but all having a series of generic phases in common. As can be seen from the above
discussion, each of the models has its advantages and disadvantages. Each of them
has applicability in a specific scenario. Each of them also provides different issues,
challenges, and opportunities for verification and validation.
1.8 Check your Progress
1. What are the objectives of testing?
2. Write down the principles of testing.
3. Discuss the criteria for project success.
4. Write short notes on various software development lifecycle models.
5. What are evolutionary life cycle models? How do they differ from
the older models?
6. Explain the Spiral and WIN - WIN spiral model with a neat diagram.
7. What are the phases involved in a Waterfall model? Explain
8. Explain the Prototyping and RAD model.
9. Write short notes on Code and Fix model & Big Bang Model.
10. Write down the advantages of the concurrent development model.
Unit II
Structure
2.0 Objectives
2.1. Introduction
2.2. White Box Testing
2.2.1 Static Testing
2.2.2 Structural Testing
2.2.3 Code Complexity Testing
2.3 Integration Testing
2.3.1 Top-Down Integration Testing
2.3.2 Bottom – Up Integration testing
2.3.3 Integration testing Documentation
2.3.4 Alpha and Beta Testing
2.4 System and Acceptance Testing
2.4.1 System Testing
2.4.2 Acceptance Testing
2.5. Summary
2.6 Check your progress
2.0 Objectives
To know the various types of testing and their importance
To learn the White box testing methods and its features
To understand the necessity of performing integration testing in the top-down and
bottom-up sequences
To learn system and acceptance testing, through which the software is finally accepted for use
2.1 Introduction
Testing requires asking about and understanding what you are trying to test, knowing
what the correct outcome is, and why you are performing the test. Why we test is as
important as what to test and how to test. Understanding the rationale for testing
certain functionality leads to different types of tests, which we will see in the
following sections.
We do white box testing to check the various paths in the code and make sure they
are exercised correctly. Knowing which code paths should be exercised for a given
test enables making necessary changes to ensure that appropriate paths are covered.
Knowing the external functionality of what the product should do, we design black
box tests. Integration tests are used to make sure that the different components fit
together. Regression testing is done to ensure that changes work as designed and do
not have any unintended side-effects. So test the test first: a defective test is more
dangerous than a defective product.
TYPES OF TESTING
The various types of testing which are often used are listed below:
White Box Testing
Black Box Testing
Integration Testing
System and Acceptance Testing
Performance Testing
Regression testing
Testing of Object Oriented Systems
Usability and Accessibility Testing
2.2 WHITE-BOX TESTING
White box testing is a way of testing the external functionality of the code by
examining and testing the program code that realizes the external functionality. This
is also known as clear box, glass box, or open box testing. White box testing takes
into account the program code, code structure, and internal design flow. White box
testing is classified into static and structural testing.
White-box testing, sometimes called glass-box testing, is a test case design method
that uses the control structure of the procedural design to derive test cases. Using
white-box testing methods, the software engineer can derive test cases that
(1) Guarantee that all independent paths within a module have been exercised at least
once
(2) Exercise all logical decisions on their true and false sides
(3) Execute all loops at their boundaries and within their operational bounds and
(4) Exercise internal data structures to ensure their validity.
It is not possible to exhaustively test every program path because the number of paths
is simply too large. White-box tests can be designed only after a component-level
design (or source code) exists. The logical details of the program must be available.
Fig. 11 Classification of white box testing
2.2.1 Static testing
Static testing requires only the source code of the product, not the binaries or
executables. Static testing does not involve executing the programs on computers but
involves selected people going through the code to find out whether
• The code works according to the functional requirements
• The code has been written in accordance with the design developed earlier in the project life cycle
• The code for any functionality has been missed out
• The code handles errors properly
Static testing can be done by humans or with the help of specialized tools.
Figure 11 classifies white box testing into two branches: static testing (desk
checking, code walkthrough, and code inspection) and structural testing (unit/code
functional testing; code coverage, comprising statement, path, condition, and
function coverage; and code complexity testing, including cyclomatic complexity).
Static testing by human
These methods rely on the principle of humans reading the program code to detect
errors rather than computers executing the code to find errors. This process has
several advantages.
1. Sometimes humans can find errors that computers cannot. For example, when
there are two variables with similar names and the programmer used a wrong
variable by mistake in an expression, the computer will not detect the error but
execute the statement and produce incorrect results, whereas a human being
can spot such an error.
2. By making multiple humans read and evaluate the program, we can get
multiple perspectives and therefore have more problems identified upfront
than a computer could.
3. A human evaluation of the code can compare it against the specifications or
design and thus ensure that it does what it is intended to do. This may not always
be possible when a computer runs a test.
4. A human evaluation can detect many problems at one go and can even try to
identify the root causes of the problems.
5. By making humans test the code before execution, computer resources can be
saved. Of course, this comes at the expense of human resources.
6. A proactive method of testing like static testing minimizes the delay in
identification of the problems.
7. From a psychological point of view, finding defects later in the cycle creates
immense pressure on programmers. They have to fix defects with less time to
spare. With this kind of pressure, there are higher chances of other defects
creeping in.
There are multiple methods to achieve static testing by humans. They are
1. Desk checking of the code
2. Code walk through
3. Code review
4. Code inspection
Desk checking
Desk checking is normally done manually by the author of the code to verify portions
of the code for correctness. Such verification is done by comparing the code
with the design or specifications to make sure that the code does what it is supposed
to do, and does so effectively. Whenever errors are found, the author applies the
correction on the spot. This method of catching and correcting errors is characterized by
1. No structured method or formulation to ensure completeness and
2. No maintaining of a log or check list.
Some of the disadvantages of this method of testing are as follows:
1. A developer is not the best person to detect problems in his own
code.
2. Developers generally prefer to write new code rather than any form
of testing.
3. This method is essentially person dependent and informal.
Code walkthrough
Walkthroughs are less formal than inspections. The advantage that a walkthrough
has over desk checking is that it brings in multiple perspectives. In walkthroughs, a set of
people look at the program code and raise questions for the author. The author
explains the logic of the code and answers the questions.
Formal inspection
Code inspection, also called Fagan inspection, is a method normally carried out
with a high degree of formalism. The focus of this method is to detect all faults, violations,
and other side effects.
Combining various methods
The methods discussed above are not mutually exclusive. They need to be
used in a judicious combination to be effective in achieving the goal of finding defects
early.
Static analysis tools
There are several static analysis tools available in the market that can reduce the
manual work and perform analysis of the code to find out errors such as
1. Whether there is unreachable code
2. Variables declared but not used
3. Mismatch in definition and assignment of values to variables etc.
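A minimal sketch of what such a tool does, using Python's standard `ast` module to flag variables that are assigned but never read. The function name and the single check are illustrative; real static analysis tools perform many more analyses than this:

```python
import ast

def unused_assignments(source):
    """Flag names assigned somewhere in the source but never read:
    a classic "declared but not used" static-analysis check."""
    tree = ast.parse(source)
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):   # name being written
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):  # name being read
                used.add(node.id)
    return sorted(assigned - used)

# 'y' is assigned but never read, so the checker reports it.
print(unused_assignments("x = 1\ny = 2\nprint(x)"))  # ['y']
```

Note that no code is executed: the tool reasons purely about the source text, which is exactly what makes it a static testing aid.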
While following any of the methods of human checking – desk checking,
walkthroughs, or formal inspection- it is useful to have a code review check list.
Code review checklist
- Data item declaration related
- Data usage related
- Control flow related
- Standards related
- Style related.
Why test internal program logic rather than concentrate on ensuring that program
requirements have been met? Stated another way, why don't we spend all of our
energy on black-box tests? The answer lies in the nature of software defects.
• Logic errors and incorrect assumptions are inversely proportional to the probability
that a program path will be executed. Errors tend to creep into our work when we
design and implement function, conditions, or control that is out of the mainstream.
Everyday processing tends to be well understood (and well scrutinized), while
"special case" processing tends to fall into the cracks.
• We often believe that a logical path is not likely to be executed when, in fact, it may
be executed on a regular basis. The logical flow of a program is sometimes
counterintuitive, meaning that our unconscious assumptions about flow of control and
data may lead us to make design errors that are uncovered only once path testing
commences.
• Typographical errors are random. When a program is translated into programming
language source code, it is likely that some typing errors will occur. Many will be
uncovered by syntax and type checking mechanisms, but others may go undetected
until testing begins. It is as likely that a typo will exist on an obscure logical path as
on a mainstream path. Each of these reasons provides an argument for conducting
white-box tests. Black box testing, no matter how thorough, may miss the kinds of
errors noted here. White box testing is far more likely to uncover them.
2.2.2 Structural testing
Structural testing takes into account the code, code structure, internal design, and how
they are coded. In structural testing tests are actually run by the computer on the built
product, whereas in static testing the product is tested by humans using just the source
code and not the executables or binaries. Structural testing can be further classified
into
• Unit/code functional testing,
• Code coverage and
• Code complexity testing.
Unit/Code functional testing
This initial part of structural testing corresponds to some quick checks that a
developer performs before subjecting the code to more extensive code coverage
testing or code complexity testing.
Initially the developer can perform certain obvious tests, knowing the input
variables and the corresponding expected output variables. This can be a quick test
that checks out any obvious mistakes. By repeating these tests for multiple values of
input variables, the confidence level of the developer to go to the next level increases.
This can even be done prior to formal reviews of static testing so that the review
mechanism does not waste time catching obvious errors.
For modules with complex logic or conditions, the developer can build a
“debug version” of the product by putting intermediate print statements and making
sure the program is passing through the right loops and iterations the right number of
times. It is important to remove the intermediate print statements after the defects are
fixed.
Another approach to do the initial test is to run the product under a debugger
or an Integrated Development Environment (IDE). These tools allow single stepping
of instructions, stepping break points at any function or instruction, and viewing the
various system parameters or program variable values.
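Such quick checks amount to exercising the unit with a few known input/expected-output pairs before submitting it to more extensive coverage testing. The `gcd` function below is only a stand-in for the developer's own code:

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm."""
    while b:
        a, b = b, a % b
    return a

# Quick developer checks: known input variables paired with the
# corresponding expected output values, as described in the text.
checks = {(12, 8): 4, (7, 3): 1, (0, 5): 5}
for (a, b), expected in checks.items():
    assert gcd(a, b) == expected, (a, b)
```

Repeating such checks for multiple input values raises the developer's confidence before moving to the next level of testing.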
Code coverage testing
Code coverage testing involves designing and executing test cases and finding out the
percentage of code that is covered by testing. The percentage of code covered by a
test is found by adopting a technique called instrumentation of code. There are
specialized tools available to achieve instrumentation.
The tools also allow reporting on the portions of the code that are covered frequently,
so that the critical or most often used portions of code can be identified.
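The idea behind instrumentation can be sketched with Python's `sys.settrace` hook, which reports each line as it executes. Real coverage tools instrument code far more efficiently; this is only an illustration of the principle:

```python
import sys

def trace_lines(func, *args):
    """Record which line numbers of func actually execute: a
    minimal sketch of the instrumentation behind coverage tools."""
    executed = set()

    def tracer(frame, event, arg):
        # A "line" event fires just before each source line runs.
        if event == "line" and frame.f_code is func.__code__:
            executed.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

# Only the lines on the executed path are recorded; comparing this
# set against all executable lines yields a coverage percentage.
covered = trace_lines(classify, 7)
```

Here the call with `n = 7` never reaches the `return "negative"` line, so that line is absent from the recorded set, which is exactly the kind of gap coverage reports expose.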
Code coverage testing is made up of the following types of coverage.
1. Statement coverage
2. Path coverage
3. Condition coverage
4. Function coverage
Statement coverage
Program constructs in most conventional programming languages can be classified as
1. Sequential control flow
2. Two-way decision statements like if then else
3. Multi-way decision statements like switch
4. Loops like while do, repeat until and for
Object-oriented languages have all of the above and, in addition, a number of other
constructs and concepts. Statement coverage refers to writing test cases that execute
each of the program statements.
Path coverage
In path coverage, we split a program into a number of distinct paths. A program can
start from the beginning and take any of the paths to its completion. Path coverage
provides a stronger condition of coverage than statement coverage as it relates to the
various logical paths in the program rather than just program statements.
Condition coverage
Even when path coverage testing has covered all the possible paths, it does not
mean that the program is fully covered; the individual conditions within each
decision must also be exercised.
Condition coverage = (Total decisions exercised / Total number of decisions in
program)*100
The condition coverage as defined by the formula above gives an indication of the
percentage of conditions covered by a set of test cases. Condition coverage is a much
stronger criterion than statement coverage.
Function coverage
This is a newer addition to structural testing, used to identify how many program
functions are covered by test cases. The requirements of a product are mapped into
functions during the design phase, and each of the functions forms a logical unit. The
advantages
that function coverage provides over the other types of coverage are as follows:
• Functions are easier to identify in a program and hence it is easier to write test
cases to provide function coverage.
• Since functions are at a much higher level of abstraction than code, it is easier to
achieve 100 percent function coverage than 100 percent coverage in any of the
earlier methods.
• Functions have a more logical mapping to requirements and hence can provide a
more direct correlation to the test coverage of the product.
• Function coverage provides a natural transition to black box testing.
Basis path testing
Basis path testing is a white-box testing technique first proposed by Tom McCabe.
The basis path method enables the test case designer to derive a logical complexity
measure of a procedural design and use this measure as a guide for defining a basis set
of execution paths. Test cases derived to exercise the basis set are guaranteed to
execute every statement in the program at least one time during testing. Before the
basis path method can be introduced, a simple notation for the representation of
control flow, called a flow graph (or program graph) must be introduced. In actuality,
the basis path method can be conducted without the use of flow graphs. However,
they serve as a useful tool for understanding control flow and illustrating the
approach.
Figure 12 - Flow graph notation
Figure 12 above maps the flowchart into a corresponding flow graph (assuming
that no compound conditions are contained in the decision diamonds of the
flowchart). Referring to Figure 12, each circle, called a flow graph node, represents one
or more procedural statements. A sequence of process boxes and a decision diamond
can map into a single node. The arrows on the flow graph, called edges or links,
represent flow of control and are analogous to flowchart arrows. An edge must
terminate at a node, even if the node does not represent any procedural statements
(e.g., see the symbol for the if-then-else construct). Areas bounded by edges and
nodes are called regions. When counting regions, we include the area outside the
graph as a region.
When compound conditions are encountered in a procedural design, the generation of
a flow graph becomes slightly more complicated. A compound condition occurs when
one or more Boolean operators (logical OR, AND, NAND, NOR) are present in a
conditional statement. Referring to Figure 12, the PDL segment translates into the
flow graph shown. Note that a separate node is created for each of the conditions a
and b in the statement IF a OR b. Each node that contains a condition is called a
predicate node and is characterized by two or more edges emanating from it.
2.2.3 Code complexity testing
Cyclomatic Complexity
Cyclomatic complexity is a software metric that provides a quantitative measure of the
logical complexity of a program. When used in the context of the basis path testing
method, the value computed for cyclomatic complexity defines the number of
independent paths in the basis set of a program and provides us with an upper bound
for the number of tests that must be conducted to ensure that all statements have been
executed at least once. An independent path is any path through the program that
introduces at least one new set of processing statements or a new condition.
Figure 13 - Flowchart, (A) and flow graph (B)
Figure 14 - Compound logic
In a flow graph, an independent path must move along at least one edge that has not
been traversed before the path is defined. For example, a set of independent paths for
the flow graph illustrated above is
path 1: 1-11
path 2: 1-2-3-4-5-10-1-11
path 3: 1-2-3-6-8-9-10-1-11
path 4: 1-2-3-6-7-9-10-1-11
Note that each new path introduces a new edge. The path
1-2-3-4-5-10-1-2-3-6-8-9-10-1-11 is not considered to be an independent path
because it is simply a combination of already specified paths and does not traverse
any new edges.
Paths 1, 2, 3, and 4 constitute a basis set for the flow graph in the Figure given above.
That is, if tests can be designed to force execution of these paths (a basis set), every
statement in the program will have been guaranteed to be executed at least one time
and every condition will have been executed on its true and false sides. It should be
noted that the basis set is not unique. In fact, a number of different basis sets can be
derived for a given procedural design.
Cyclomatic complexity is a useful metric for predicting those modules that are likely
to be error prone. It can be used for test planning as well as test case design.
How do we know how many paths to look for? The computation of cyclomatic
complexity provides the answer. Cyclomatic complexity has a foundation in graph
theory and provides us with an extremely useful software metric. Complexity is
computed in one of three ways:
1. The number of regions of the flow graph corresponds to the cyclomatic complexity.
2. Cyclomatic complexity, V(G), for a flow graph, G, is defined as V(G) = E - N + 2,
where E is the number of flow graph edges and N is the number of flow graph nodes.
3. Cyclomatic complexity, V(G), for a flow graph, G, is also defined as V(G) = P + 1,
where P is the number of predicate nodes contained in the flow graph G.
Referring once more to the flow graph in Figure 14, the cyclomatic complexity can be
computed using each of the algorithms just noted:
1. The flow graph has four regions.
2. V(G) = 11 edges - 9 nodes + 2 = 4.
3. V(G) = 3 predicate nodes + 1 = 4.
Therefore, the cyclomatic complexity of the flow graph in Figure 14 is 4. More
important, the value for V(G) provides us with an upper bound for the number of
independent paths that form the basis set and, by implication, an upper bound on the
number of tests that must be designed and executed to guarantee coverage of all
program statements.
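As a rough Python sketch (the if_then_else edge list is a hypothetical graph, not one of the figures), the second and third formulas can be computed directly:

```python
def cyclomatic(edge_list):
    """V(G) = E - N + 2, computed from a list of directed edges."""
    nodes = {n for edge in edge_list for n in edge}
    return len(edge_list) - len(nodes) + 2

def cyclomatic_from_predicates(p):
    """V(G) = P + 1, where P is the number of predicate nodes."""
    return p + 1

# A hypothetical if-then-else flow graph: node 1 is the predicate,
# nodes 2 and 3 are the two branches, node 4 is the join point.
if_then_else = [(1, 2), (1, 3), (2, 4), (3, 4)]
print(cyclomatic(if_then_else))           # 2
print(cyclomatic_from_predicates(1))      # 2

# The counts quoted in the text for the Figure 14 flow graph:
print(11 - 9 + 2, 3 + 1)                  # 4 4
```

Both formulas agree on the same graph, which is a useful sanity check when deriving V(G) by hand.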
Deriving Test Cases
The basis path testing method can be applied to a procedural design or to source
code. In this section, we present basis path testing as a series of steps. The procedure
average, depicted in PDL below, will be used as an example to illustrate each
step in the test case design method. Note that average, although an extremely simple
algorithm, contains compound conditions and loops. The following steps can be
applied to derive the basis set:
applied to derive the basis set:
Figure 15 - Flow graph for the procedure average
1. Using the design or code as a foundation, draw a corresponding flow graph. A
flow graph is created using the symbols and construction rules. The corresponding
flow graph is in the figure given above.
2. Determine the cyclomatic complexity of the resultant flow graph. The
cyclomatic complexity, V(G), is determined by applying the algorithms given earlier.
It should be noted that V(G) can be determined without developing a flow graph by
counting all
conditional statements in the PDL (for the procedure average, compound conditions
count as two) and adding 1.
Referring to Figure,
V(G) = 6 regions
V(G) = 17 edges - 13 nodes + 2 = 6
V(G) = 5 predicate nodes + 1 = 6
3. Determine a basis set of linearly independent paths. The value of V(G) provides
the number of linearly independent paths through the program control structure. In the
case of procedure average, we expect to specify six paths:
path 1: 1-2-10-11-13
path 2: 1-2-10-12-13
path 3: 1-2-3-10-11-13
path 4: 1-2-3-4-5-8-9-2-. . .
path 5: 1-2-3-4-5-6-8-9-2-. . .
path 6: 1-2-3-4-5-6-7-8-9-2-. . .
The ellipsis (. . .) following paths 4, 5, and 6 indicates that any path through the
remainder of the control structure is acceptable. It is often worthwhile to identify
predicate nodes as an aid in the derivation of test cases. In this case, nodes 2, 3, 5, 6,
and 10 are predicate nodes.
4. Prepare test cases that will force execution of each path in the basis set. Data
should be chosen so that conditions at the predicate nodes are appropriately set as
each path is tested. Test cases that satisfy the basis set just described are
PROCEDURE average;
* This procedure computes the average of 100 or fewer numbers that lie between
  bounding values; it also computes the sum and the total number valid.
INTERFACE RETURNS average, total.input, total.valid;
INTERFACE ACCEPTS value, minimum, maximum;
TYPE value[1:100] IS SCALAR ARRAY;
TYPE average, total.input, total.valid,
     minimum, maximum, sum IS SCALAR;
TYPE i IS INTEGER;
i = 1;
total.input = total.valid = 0;
sum = 0;
DO WHILE value[i] <> -999 AND total.input < 100
    increment total.input by 1;
    IF value[i] >= minimum AND value[i] <= maximum
        THEN increment total.valid by 1;
             sum = sum + value[i]
        ELSE skip
    ENDIF
    increment i by 1;
ENDDO
IF total.valid > 0
    THEN average = sum / total.valid;
    ELSE average = -999;
ENDIF
END average
Path 1 test case:
value(k) = valid input, where k < i for 2 ≤ i ≤ 100
value(i) = -999 where 2 ≤ i ≤ 100
Expected results: Correct average based on k values and proper totals.
Note: Path 1 cannot be tested stand-alone but must be tested as part of path 4, 5, and 6
tests.
Path 2 test case:
value(1) = -999
Expected results: Average = -999; other totals at initial values.
Path 3 test case:
Attempt to process 101 or more values.
First 100 values should be valid.
Expected results: Same as test case 1.
Path 4 test case:
value(i) = valid input where i < 100
value(k) < minimum where k < i
Expected results: Correct average based on k values and proper totals.
Path 5 test case:
value(i) = valid input where i < 100
value(k) > maximum where k <= i
Expected results: Correct average based on n values and proper totals.
Path 6 test case:
value(i) = valid input where i < 100
Expected results: Correct average based on n values and proper totals.
Each test case is executed and compared to expected results. Once all test cases have
been completed, the tester can be sure that all statements in the program have been
executed at least once. It is important to note that some independent paths (e.g., path 1
in our example) cannot be tested in stand-alone fashion. That is, the combination of
data required to traverse the path cannot be achieved in the normal flow of the
program. In such cases, these paths are tested as part of another path test.
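For readers who prefer running code, the PDL procedure average can be sketched in Python roughly as follows (Python lists are 0-indexed, and a bounds check on the list length is added, so this is an approximation of the PDL rather than a literal transcription):

```python
def average(values, minimum, maximum):
    """Sketch of the PDL procedure `average`: processes up to 100 values,
    stopping at the sentinel -999, and averages those lying within
    [minimum, maximum]. Returns (average, total_input, total_valid)."""
    i = 0
    total_input = total_valid = 0
    total = 0
    while i < len(values) and values[i] != -999 and total_input < 100:
        total_input += 1
        if minimum <= values[i] <= maximum:
            total_valid += 1
            total += values[i]
        i += 1
    if total_valid > 0:
        avg = total / total_valid
    else:
        avg = -999          # no valid input at all
    return avg, total_input, total_valid

# Path 2 (immediate sentinel) and a nominal run through paths 4-6:
print(average([-999], 0, 100))              # (-999, 0, 0)
print(average([10, 20, 30, -999], 0, 100))  # (20.0, 3, 3)
```

Feeding this function the inputs described in the path test cases above is a quick way to confirm that each predicate outcome is actually reachable.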
Graph Matrices
The procedure for deriving the flow graph and even determining a set of basis paths
is amenable to mechanization. To develop a software tool that assists in basis path
testing, a data structure, called a graph matrix, can be quite useful. A graph matrix is
a square matrix whose size (i.e., number of rows and columns) is equal to the number
of nodes on the flow graph. Each row and column corresponds to an identified node,
and matrix entries correspond to connections (an edge) between nodes. Each node on
the flow graph is identified by numbers, while each edge is identified by letters. A
letter entry is made in the matrix to correspond to a connection between two nodes.
For example, node 3 is connected to node 4 by edge b.
To this point, the graph matrix is nothing more than a tabular representation of a
flow graph. However, by adding a link weight to each matrix entry, the graph matrix
can become a powerful tool for evaluating program control structure during testing.
The link weight provides additional information about control flow. In its simplest
form, the link weight is 1 (a connection exists) or 0 (a connection does not exist). But
link weights can be assigned other, more interesting properties:
• The probability that a link (edge) will be executed.
• The processing time expended during traversal of a link.
• The memory required during traversal of a link.
• The resources required during traversal of a link.
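A minimal Python sketch of a graph matrix with 0/1 link weights, for a hypothetical four-node if-then-else flow graph (not one of the figures):

```python
# A graph matrix for a 4-node flow graph; entry [i][j] = 1 means an edge
# from node i+1 to node j+1 (link weight 1 = connection exists, 0 = none).
matrix = [
    [0, 1, 1, 0],   # node 1 -> nodes 2 and 3 (a predicate node)
    [0, 0, 0, 1],   # node 2 -> node 4
    [0, 0, 0, 1],   # node 3 -> node 4
    [0, 0, 0, 0],   # node 4 (exit)
]

# With 0/1 link weights, any node whose row sums to 2 or more has two
# or more edges emanating from it, i.e. it is a predicate node, so the
# matrix also yields V(G) = P + 1 mechanically.
predicates = sum(1 for row in matrix if sum(row) >= 2)
print(predicates + 1)   # 2
```

Replacing the 1s with probabilities, processing times, or resource costs turns the same structure into the weighted graph matrix described above.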
Control structure testing
The basis path testing technique is one of a number of techniques for control
structure testing. Although basis path testing is simple and highly effective, it is not
sufficient in itself. In this section, other variations on control structure testing are
discussed. These broaden testing coverage and improve quality of white-box testing.
Condition Testing
Condition testing is a test case design method that exercises the logical conditions
contained in a program module. A simple condition is a Boolean variable or a
relational expression, possibly preceded with one NOT (¬) operator. A relational
expression takes the form
E1 <relational-operator> E2
where E1 and E2 are arithmetic expressions and <relational-operator> is one of the
following: <, ≤, =, ≠(nonequality), >, or ≥. A compound condition is composed of two
or more simple conditions, Boolean operators, and parentheses. We assume that
Boolean operators allowed in a compound condition include OR (|), AND (&) and
NOT (¬). A condition without relational expressions is referred to as a Boolean
expression.
Therefore, the possible types of elements in a condition include a Boolean operator, a
Boolean variable, a pair of Boolean parentheses (surrounding a simple or compound
condition), a relational operator, or an arithmetic expression. If a condition is
incorrect, then at least one component of the condition is incorrect. Therefore, types
of errors in a condition include the following:
• Boolean operator error (incorrect/missing/extra Boolean operators).
• Boolean variable error.
• Boolean parenthesis error.
• Relational operator error.
• Arithmetic expression error.
The condition testing method focuses on testing each condition in the program.
Condition testing strategies (discussed later in this section) generally have two
advantages. First, measurement of test coverage of a condition is simple. Second, the
test coverage of conditions in a program provides guidance for the generation of
additional tests for the program. The purpose of condition testing is to detect not only
errors in the conditions of a program but also other errors in the program.
If a test set for a program P is effective for detecting errors in the conditions
contained in P, it is likely that this test set is also effective for detecting other
errors in P. In addition, if a testing strategy is effective for detecting errors in a
condition, then it is likely that this strategy will also be effective for detecting
errors in a program. A number of condition testing strategies
have been proposed. Branch testing is probably the simplest condition testing
strategy. For a compound condition C, the true and false branches of C and every
simple condition in C need to be executed at least once.
Domain testing requires three or four tests to be derived for a relational expression.
For a relational expression of the form E1 <relational-operator> E2 three tests are
required to make the value of E1 greater than, equal to, or less than that of E2. If
<relational-operator> is incorrect and E1 and E2 are correct, then these three tests
guarantee the detection of the relational operator error. To detect errors in E1 and E2,
a test that makes the value of E1 greater or less than that of E2 should make the
difference between these two values as small as possible.
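A sketch of how the three domain tests might be derived for a relational expression (the helper domain_tests and the step size of 1 are illustrative assumptions):

```python
def domain_tests(e2, delta=1):
    """For a relational expression E1 <op> E2, derive the three domain
    tests: E1 greater than, equal to, and less than E2, keeping the
    difference between the two values as small as practical (delta)."""
    return [e2 + delta, e2, e2 - delta]

# For a condition such as `x > 10`, the derived E1 values are:
print(domain_tests(10))   # [11, 10, 9]
```

Any single wrong relational operator must give a different verdict than the correct one on at least one of these three values, which is why three tests suffice.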
For a Boolean expression with n variables, all of the 2^n possible tests are required (n >
0). This strategy can detect Boolean operator, variable, and parenthesis errors, but it is
practical only if n is small. Error-sensitive tests for Boolean expressions can also be
derived. For a singular Boolean expression (a Boolean expression in which each
Boolean variable occurs only once) with n Boolean variables (n > 0), we can easily
generate a test set with fewer than 2^n tests such that this test set guarantees the
detection of multiple Boolean operator errors and is also effective for detecting other
errors.
Tai suggests a condition testing strategy that builds on the techniques just outlined.
Called BRO (branch and relational operator) testing, the technique guarantees the
detection of branch and relational operator errors in a condition provided that all
Boolean variables and relational operators in the condition occur only once and have
no common variables. The BRO strategy uses condition constraints for a condition C.
A condition constraint for C with n simple conditions is defined as (D1, D2, . . ., Dn),
where Di (0 < i ≤ n) is a symbol specifying a constraint on the outcome of the ith
simple condition in condition C. A condition constraint D for condition C is said to be
covered by an execution of C if, during this execution of C, the outcome of each
simple condition in C satisfies the corresponding constraint in D.
For a Boolean variable, B, we specify a constraint on the outcome of B that states that
B must be either true (t) or false (f). Similarly, for a relational expression, the symbols
>, =, < are used to specify constraints on the outcome of the expression.
As an example, consider the condition
C1: B1 & B2
where B1 and B2 are Boolean variables. The condition constraint for C1 is of the form
(D1, D2), where each of D1 and D2 is t or f. The value (t, f) is a condition constraint
for C1 and is covered by the test that makes the value of B1 to be true and the value of
B2 to be false. The BRO testing strategy requires that the constraint set {(t, t), (f, t), (t,
f)} be covered by the executions of C1. If C1 is incorrect due to one or more Boolean
operator errors, at least one of the constraint set will force C1 to fail.
As a second example, consider a condition of the form
C2: B1 & (E3 = E4)
where B1 is a Boolean expression and E3 and E4 are arithmetic expressions. A
condition constraint for C2 is of the form (D1, D2), where each of D1 is t or f and D2
is >, =, <. Since C2 is the same as C1 except that the second simple condition in C2 is
a relational expression, we can construct a constraint set for C2 by modifying the
constraint set {(t, t), (f, t), (t, f)} defined for C1. Note that t for (E3 = E4) implies =
and that f for (E3 = E4) implies either < or >. By replacing (t, t) and (f, t) with (t, =)
and (f, =), respectively, and by replacing (t, f) with (t, <) and (t, >), the resulting
constraint set for C2 is {(t, =), (f, =), (t, <), (t, >)}. Coverage of the preceding
constraint set will guarantee detection of Boolean and relational operator errors in C2.
As a third example, we consider a condition of the form
C3: (E1 > E2) & (E3 = E4)
where E1, E2, E3 and E4 are arithmetic expressions. A condition constraint for C3 is
of the form (D1, D2), where each of D1 and D2 is >, =, <. Since C3 is the same as C2
except that the first simple condition in C3 is a relational expression, we can construct
a constraint set for C3 by modifying the constraint set for C2, obtaining {(>, =), (=,
=), (<, =), (>, >), (>, <)}Coverage of this constraint set will guarantee detection of
relational operator errors in C3.
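The first example can be checked mechanically. The sketch below (hypothetical, using Python lambdas for the correct condition and a mutant containing a Boolean operator error) shows that the BRO constraint set for C1 exposes an & mistakenly written as |:

```python
# The BRO constraint set {(t, t), (f, t), (t, f)} for C1: B1 & B2, and a
# check that it distinguishes C1 from a mutant with a Boolean operator error.
constraints = [(True, True), (False, True), (True, False)]

correct = lambda b1, b2: b1 and b2
mutant  = lambda b1, b2: b1 or b2   # single Boolean operator error (& -> |)

detected = any(correct(b1, b2) != mutant(b1, b2) for b1, b2 in constraints)
print(detected)   # True: (f, t) and (t, f) both expose the faulty operator
```

Note that the constraint (t, t) alone would not reveal this error, which is why the full constraint set must be covered.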
Data Flow Testing
The data flow testing method selects test paths of a program according to the
locations of definitions and uses of variables in the program. To illustrate the data
flow testing approach, assume that each statement in a program is assigned a unique
statement number and that each function does not modify its parameters or global
variables.
For a statement with S as its statement number,
DEF(S) = {X | statement S contains a definition of X}
USE(S) = {X | statement S contains a use of X}
If statement S is an if or loop statement, its DEF set is empty and its USE set is based
on the condition of statement S. The definition of variable X at statement S is said to
be live at statement S' if there exists a path from statement S to statement S' that
contains no other definition of X.
A definition-use (DU) chain of variable X is of the form [X, S, S'], where S and S' are
statement numbers, X is in DEF(S) and USE(S'), and the definition of X in statement S
is live at statement S'.
One simple data flow testing strategy is to require that every DU chain be covered at
least once. We refer to this strategy as the DU testing strategy. It has been shown that
DU testing does not guarantee the coverage of all branches of a program. However, a
branch is not guaranteed to be covered by DU testing only in rare situations such as
if-then-else constructs in which the then part has no definition of any variable and the
else part does not exist. In this situation, the else branch of the if statement is not
necessarily covered by DU testing.
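A minimal sketch of DU-chain derivation for straight-line code (the four-statement program and the du_chains helper are hypothetical; real data flow analysis must also follow branches):

```python
# DEF/USE sets for a small straight-line listing (statement number -> set).
# Hypothetical program:
#   1: x = input()    2: y = x + 1    3: x = y * 2    4: print(x, y)
DEF = {1: {"x"}, 2: {"y"}, 3: {"x"}, 4: set()}
USE = {1: set(), 2: {"x"}, 3: {"y"}, 4: {"x", "y"}}

def du_chains(DEF, USE):
    """All [X, S, S'] with X defined at S, used at S', and no
    intervening redefinition of X (straight-line code only)."""
    chains = []
    stmts = sorted(DEF)
    for s in stmts:
        for x in DEF[s]:
            for s2 in stmts:
                if s2 <= s:
                    continue
                if x in USE[s2]:
                    chains.append([x, s, s2])
                if x in DEF[s2]:   # a redefinition kills the chain
                    break
    return chains

print(du_chains(DEF, USE))
```

Notice that no chain [x, 1, 4] is produced: the definition of x at statement 1 is not live at statement 4 because statement 3 redefines it.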
Loop Testing
Figure 16 - Classes of loops
Loops are the cornerstone for the vast majority of all algorithms implemented in
software. And yet, we often pay them little heed while conducting software tests.
Loop testing is a white-box testing technique that focuses exclusively on the validity
of loop constructs. Four different classes of loops can be defined: simple loops,
concatenated loops, nested loops, and unstructured loops.
Simple loops. The following set of tests can be applied to simple loops, where n is the
maximum number of allowable passes through the loop.
1. Skip the loop entirely.
2. Only one pass through the loop.
3. Two passes through the loop.
4. m passes through the loop where m < n.
5. n - 1, n, and n + 1 passes through the loop.
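This test set can be sketched for a hypothetical simple loop in Python (here n, the maximum number of passes, is 5, and sum_first is an illustrative function, not from the text):

```python
def sum_first(values, limit):
    """Sum at most `limit` values -- a simple loop with a maximum of
    `limit` passes."""
    total = 0
    for i, v in enumerate(values):
        if i >= limit:
            break
        total += v
    return total

n = 5                 # maximum allowable passes
data = [1] * 10       # all ones, so the result equals the pass count
# The simple-loop test set: 0, 1, 2, m (< n), n-1, n, n+1 passes.
for passes in (0, 1, 2, 3, n - 1, n, n + 1):
    assert sum_first(data[:passes], n) == min(passes, n)
print("simple-loop test set passed")
```

The n + 1 case is the interesting one: it attempts one pass more than the loop allows and confirms that the bound actually holds.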
Nested loops. If we were to extend the test approach for simple loops to nested loops,
the number of possible tests would grow geometrically as the level of nesting
increases. This would result in an impractical number of tests. Beizer suggests an
approach that will help to reduce the number of tests:
1. Start at the innermost loop. Set all other loops to minimum values.
2. Conduct simple loop tests for the innermost loop while holding the outer loops at
their minimum iteration parameter (e.g., loop counter) values. Add other tests for
out-of-range or excluded values.
3. Work outward, conducting tests for the next loop, but keeping all other outer loops
at minimum values and other nested loops to "typical" values.
4. Continue until all loops have been tested.
Complex loop structures are another hiding place for bugs. It is well worth spending
time designing tests that fully exercise loop structures.
Concatenated loops. Concatenated loops can be tested using the approach defined for
simple loops, if each of the loops is independent of the other. However, if two loops
are concatenated and the loop counter for loop 1 is used as the initial value for loop 2,
then the loops are not independent. When the loops are not independent, the approach
applied to nested loops is recommended.
Unstructured loops. Whenever possible, this class of loops should be redesigned to
reflect the use of the structured programming constructs.
Challenges in white box testing
White box testing requires a sound knowledge of the program code and the
programming language. This means that the developers should get intimately
involved in white box testing. Developers, in general, do not like to perform testing
functions. This applies to structural testing as well as static testing methods such as
reviews. In addition, because of the timeline pressures, the programmers may not find
time for reviews.
• The human tendency of a developer being unable to find the defects in his or her
own code.
• Fully tested code may not correspond to realistic scenarios.
These challenges do not mean that white box testing is ineffective. But when
white-box testing is carried out and these challenges are addressed by other means of
testing, there is a higher likelihood of more effective testing.
2.3 INTEGRATION TESTING
A neophyte in the software world might ask a seemingly legitimate question once all
modules have been unit tested: "If they all work individually, why do you doubt that
they'll work when we put them together?" The problem, of course, is "putting them
together"—interfacing. Data can be lost across an interface; one module can have an
inadvertent, adverse effect on another; subfunctions, when combined, may not
produce the desired major function; individually acceptable imprecision may be
magnified to unacceptable levels; global data structures can present problems. Sadly,
the list goes on and on. Integration testing is a systematic technique for constructing
the program structure while at the same time conducting tests to uncover errors
associated with interfacing.
The objective is to take unit tested components and build a program structure that has
been dictated by design. There is often a tendency to attempt nonincremental
integration; that is, to construct the program using a "big bang" approach. All
components are combined in advance. The entire program is tested as a whole. And
chaos usually results! A set of errors is encountered. Correction is difficult because
isolation of causes is complicated by the vast expanse of the entire program. Once
these errors are corrected, new ones appear and the process continues in a seemingly
endless loop.
Incremental integration is the antithesis of the big bang approach. The program is
constructed and tested in small increments, where errors are easier to isolate and
correct; interfaces are more likely to be tested completely; and a systematic test
approach may be applied. In the sections that follow, a number of different
incremental integration strategies are discussed.
2.3.1 Top-down Integration
Figure 17 - Top- Down integration testing
Top-down integration testing is an incremental approach to construction of program
structure. Modules are integrated by moving downward through the control hierarchy,
beginning with the main control module (main program). Modules subordinate (and
ultimately subordinate) to the main control module are incorporated into the structure
in either a depth-first or breadth-first manner. Referring to Figure.17, depth-first
integration would integrate all components on a major control path of the structure.
Selection of a major path is somewhat arbitrary and depends on application-specific
characteristics. For example, selecting the left-hand path, components M1, M2, and
M5 would be integrated first. Next, M8 or (if necessary for proper functioning of M2)
M6 would be integrated. Then, the central and right-hand control paths are built.
Breadth-first integration incorporates all components directly subordinate at each
level, moving across the structure horizontally. From the figure, components M2, M3,
and M4 (a replacement for stub S4) would be integrated first. The next control level,
M5, M6, and so on, follows.
The integration process is performed in a series of five steps:
1. The main control module is used as a test driver and stubs are substituted for all
components directly subordinate to the main control module.
2. Depending on the integration approach selected (i.e., depth or breadth first),
subordinate stubs are replaced one at a time with actual components.
3. Tests are conducted as each component is integrated.
4. On completion of each set of tests, another stub is replaced with the real
component.
5. Regression testing may be conducted to ensure that new errors have not been
introduced.
The process continues from step 2 until the entire program structure is built. The
top-down integration strategy verifies major control or decision points early in the test
process. In a well-factored program structure, decision making occurs at upper levels
in the hierarchy and is therefore encountered first. If major control problems do exist,
early recognition is essential. If depth-first integration is selected, a complete function
of the software may be implemented and demonstrated.
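The stub-and-replace cycle in the five steps above can be sketched in Python (the report module and its subordinate fetch_total component are hypothetical):

```python
# Hypothetical main control module that depends on a subordinate component,
# passed in as a parameter so a stub can substitute for it during testing.
def report(fetch_total):
    """Main control module: formats a total obtained from a subordinate
    component."""
    return f"total={fetch_total()}"

# Step 1: a stub substitutes for the not-yet-integrated component.
def fetch_total_stub():
    return 42   # canned answer; the real component would compute it

assert report(fetch_total_stub) == "total=42"

# Steps 2-4: the stub is replaced with the actual component and the
# same test is conducted again (regression, step 5).
def fetch_total_real():
    return sum([10, 12, 20])

assert report(fetch_total_real) == "total=42"
print("top-down integration sketch passed")
```

Passing the dependency as a parameter is one simple way to make the stub substitution of step 2 possible; mocking libraries automate the same idea.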
The incoming path may be integrated in a top-down manner. All input processing
(for subsequent transaction dispatching) may be demonstrated before other elements
of the structure have been integrated. Early demonstration of functional capability is a
confidence builder for both the developer and the customer.
Top-down strategy sounds relatively uncomplicated, but in practice, logistical
problems can arise. The most common of these problems occurs when processing at
low levels in the hierarchy is required to adequately test upper levels. Stubs replace
low level modules at the beginning of top-down testing; therefore, no significant data
can flow upward in the program structure. The tester is left with three choices:
(1) Delay many tests until stubs are replaced with actual modules,
(2) Develop stubs that perform limited functions that simulate the actual module, or
(3) Integrate the software from the bottom of the hierarchy upward.
The first approach (delay tests until stubs are replaced by actual modules) causes us
to lose some control over correspondence between specific tests and incorporation of
specific modules. This can lead to difficulty in determining the cause of errors and
tends to violate the highly constrained nature of the top-down approach. The second
approach is workable but can lead to significant overhead, as stubs become more and
more complex. The third approach, called bottom-up testing, is discussed in the next
section.
2.3.2 Bottom-up Integration
Bottom-up integration testing, as its name implies, begins construction and testing
with atomic modules (i.e., components at the lowest levels in the program structure).
Because components are integrated from the bottom up, processing required for
components subordinate to a given level is always available and the need for stubs is
eliminated.
A bottom-up integration strategy may be implemented with the following steps:
1. Low-level components are combined into clusters (sometimes called builds) that
perform a specific software subfunction.
2. A driver (a control program for testing) is written to coordinate test case input and
output.
3. The cluster is tested.
4. Drivers are removed and clusters are combined moving upward in the program
structure.
Figure 18 - Bottom- Up Integration
Integration follows the pattern illustrated in Figure.18. Components are combined to
form clusters 1, 2, and 3. Each of the clusters is tested using a driver (shown as a
dashed block). Components in clusters 1 and 2 are subordinate to Ma. Drivers D1 and
D2 are removed and the clusters are interfaced directly to Ma. Similarly, driver D3 for
cluster 3 is removed prior to integration with module Mb. Both Ma and Mb will
ultimately be integrated with component Mc, and so forth. Bottom-up integration
eliminates the need for complex stubs.
As integration moves upward, the need for separate test drivers lessens. In fact, if the
top two levels of program structure are integrated top down, the number of drivers can
be reduced substantially and integration of clusters is greatly simplified.
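Steps 1-3 of the bottom-up strategy can be sketched in Python (the two atomic modules and the driver are hypothetical):

```python
# Two low-level components combined into a cluster (step 1).
def parse_price(text):           # atomic module 1
    return float(text)

def apply_tax(amount, rate):     # atomic module 2
    return round(amount * (1 + rate), 2)

def driver():
    """Test driver (step 2): feeds test case input to the cluster and
    checks the output (step 3); it is removed (step 4) once the cluster
    is combined with its parent module."""
    cases = [("100.0", 0.1, 110.0), ("19.99", 0.0, 19.99)]
    for text, rate, expected in cases:
        assert apply_tax(parse_price(text), rate) == expected
    return True

print(driver())   # True
```

Because both components already exist when the driver runs, no stubs are needed, which is exactly the advantage claimed for bottom-up integration.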
Comments on Integration Testing
There has been much discussion of the relative advantages and disadvantages of
top-down versus bottom-up integration testing. In general, the advantages of one
strategy tend to result in disadvantages for the other strategy. The major disadvantage
of the top-down approach is the need for stubs and the attendant testing difficulties
that can be associated with them. Problems associated with stubs may be offset by the
advantage of testing major control functions early. The major disadvantage of
bottom-up integration is that "the program as an entity does not exist until the last
module is added". This drawback is tempered by easier test case design and a lack of
stubs.
Selection of an integration strategy depends upon software characteristics and,
sometimes, project schedule. In general, a combined approach (sometimes called
sandwich testing) that uses top-down tests for upper levels of the program structure,
coupled with bottom-up tests for subordinate levels may be the best compromise. As
integration testing is conducted, the tester should identify critical modules. A critical
module has one or more of the following characteristics:
(1) Addresses several software requirements,
(2) Has a high level of control (resides relatively high in the program structure),
(3) Is complex or error prone (cyclomatic complexity may be used as an indicator), or
(4) Has definite performance requirements.
Critical modules should be tested as early as possible. In addition, regression tests
should focus on critical module function.
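Cyclomatic complexity, named in item (3) as an indicator, can be computed from a module's control-flow graph as V(G) = E - N + 2 (E edges, N nodes, one connected component assumed). A sketch with invented flow-graph sizes and a commonly cited rule-of-thumb threshold:

```python
# Sketch: flagging critical modules by cyclomatic complexity V(G) = E - N + 2.
# The module names and flow-graph sizes below are invented for illustration.

def cyclomatic_complexity(num_edges, num_nodes):
    """V(G) = E - N + 2 for a single connected control-flow graph."""
    return num_edges - num_nodes + 2

def critical_modules(modules, threshold=10):
    """Return names of modules whose V(G) exceeds a chosen threshold.
    A limit of 10 is a widely quoted rule of thumb, not a fixed standard."""
    return [name for name, (edges, nodes) in modules.items()
            if cyclomatic_complexity(edges, nodes) > threshold]

if __name__ == "__main__":
    flow_graphs = {"billing": (24, 12), "display": (9, 8), "auth": (30, 15)}
    # billing: V(G)=14, display: V(G)=3, auth: V(G)=17
    print("Test these early:", critical_modules(flow_graphs))
```

Modules flagged this way would be scheduled early in the integration order and covered by regression tests.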
2.3.3 Integration Test Documentation
An overall plan for integration of the software and a description of specific tests are
documented in a Test Specification. This document contains a test plan and a test
procedure, is a work product of the software process, and becomes part of the
software configuration. The test plan describes the overall strategy for integration.
Testing is divided into phases and builds that address specific functional and
behavioral characteristics of the software. For example, integration testing for a CAD
system might be divided into the following test phases:
• User interaction (command selection, drawing creation, display representation, error
processing and representation).
• Data manipulation and analysis (symbol creation, dimensioning, rotation,
computation of physical properties).
• Display processing and generation (two-dimensional displays, three-dimensional
displays, graphs and charts).
• Database management (access, update, integrity, performance).
Each of these phases and sub phases (denoted in parentheses) delineates a broad
functional category within the software and can generally be related to a specific
domain of the program structure. Therefore, program builds (groups of modules) are
created to correspond to each phase. The following criteria and corresponding tests
are applied for all test phases:
Interface integrity. Internal and external interfaces are tested as each module (or
cluster) is incorporated into the structure.
Functional validity. Tests designed to uncover functional errors are conducted.
Information content. Tests designed to uncover errors associated with local or global
data structures are conducted.
Performance. Tests designed to verify performance bounds established during
software design are conducted.
A schedule for integration, the development of overhead software, and related topics
are also discussed as part of the test plan. Start and end dates for each phase are
established and "availability windows" for unit tested modules are defined. A brief
description of overhead software (stubs and drivers) concentrates on characteristics
that might require special effort. Finally, test environment and resources are
described.
Unusual hardware configurations, exotic simulators, and special test tools or
techniques are a few of many topics that may also be discussed.
The order of integration and corresponding tests at each integration step are
described. A listing of all test cases (annotated for subsequent reference) and expected
results is also included. A history of actual test results, problems, or peculiarities is
recorded in the Test Specification. Information contained in this section can be vital
during software maintenance. Like all other elements of a software configuration, the
test specification format may be tailored to the local needs of a software engineering
organization. It is important to note, however, that an integration strategy (contained
in a test plan) and testing details (described in a test procedure) are essential
ingredients and must appear.
Validation testing
At the culmination of integration testing, software is completely assembled as a
package, interfacing errors have been uncovered and corrected, and a final series of
software tests—validation testing—may begin. Validation can be defined in many
ways, but a simple (albeit harsh) definition is that validation succeeds when software
functions in a manner that can be reasonably expected by the customer. At this point a
battle-hardened software developer might protest: "Who or what is the arbiter of
reasonable expectations?" Reasonable expectations are defined in the Software
Requirements Specification— a document that describes all user-visible attributes of
the software. The specification contains a section called Validation Criteria.
Information contained in that section forms the basis for a validation testing approach.
Validation Test Criteria
Software validation is achieved through a series of black-box tests that demonstrate
conformity with requirements. A test plan outlines the classes of tests to be conducted
and a test procedure defines specific test cases that will be used to demonstrate
conformity with requirements. Both the plan and procedure are designed to ensure
that all functional requirements are satisfied, all behavioral characteristics are
achieved, all performance requirements are attained, documentation is correct, and
human-engineered and other requirements are met (e.g., transportability,
compatibility, error recovery, maintainability).
After each validation test case has been conducted, one of two possible conditions
exists:
(1) The function or performance characteristics conform to specification and are
accepted or
(2) A deviation from specification is uncovered and a deficiency list is created.
A deviation or error discovered at this stage in a project can rarely be corrected prior to
scheduled delivery. It is often necessary to negotiate with the customer to establish a
method for resolving deficiencies.
Configuration Review
An important element of the validation process is a configuration review. The intent
of the review is to ensure that all elements of the software configuration have been
properly developed, are cataloged, and have the necessary detail to bolster the support
phase of the software life cycle.
2.3.4 Alpha and Beta Testing
It is virtually impossible for a software developer to foresee how the customer will
really use a program. Instructions for use may be misinterpreted; strange
combinations of data may be regularly used; output that seemed clear to the tester
may be unintelligible to a user in the field. When custom software is built for one
customer, a series of acceptance tests are conducted to enable the customer to validate
all requirements. Conducted by the end user rather than software engineers, an
acceptance test can range from an informal "test drive" to a planned and
systematically executed series of tests. In fact, acceptance testing can be conducted
over a period of weeks or months, thereby uncovering cumulative errors that might
degrade the system over time.
If software is developed as a product to be used by many customers, it is impractical
to perform formal acceptance tests with each one. Most software product builders use
a process called alpha and beta testing to uncover errors that only the end-user seems
able to find.
The alpha test is conducted at the developer's site by a customer. The software is used
in a natural setting with the developer "looking over the shoulder" of the user and
recording errors and usage problems. Alpha tests are conducted in a controlled
environment.
The beta test is conducted at one or more customer sites by the end-user of the
software. Unlike alpha testing, the developer is generally not present. Therefore, the
beta test is a "live" application of the software in an environment that cannot be
controlled by the developer. The customer records all problems (real or imagined) that
are encountered during beta testing and reports these to the developer at regular
intervals. As a result of problems reported during beta tests, software engineers make
modifications and then prepare for release of the software product to the entire
customer base.
2.4 SYSTEM AND ACCEPTANCE TESTING
The testing conducted on the complete integrated products and solutions to evaluate
system compliance with specified requirements on functional and non-functional
aspects is called system testing. System testing is conducted with the objective of
finding product-level defects and building confidence before the product is released to
the customer. Since system testing is the last phase of testing before the release, not
all defects can be fixed in code in time due to time and effort needed in development
and testing and due to the potential risk involved in any last-minute changes.
Hence, an impact analysis is done for those defects to reduce the risk of releasing a
product with defects. The analysis of defects and their classification into various
categories also gives an idea about the kind of defects that will be found by the
customer after release. This information helps in planning some activities such as
providing workarounds, documentation on alternative approaches, and so on. Hence,
system testing helps in reducing the risk of releasing a product.
2.4.1 System testing
System testing is defined as a testing phase conducted on the complete integrated
system, to evaluate the system's compliance with its specified requirements. It is done
after unit, component and integration testing phases. System testing is the only phase
of testing which tests both the functional and non-functional aspects of the product.
On the functional side, system testing focuses on real-life customer usage of the
product and solutions. System testing simulates customer deployments.
On the non-functional side, system testing brings in different testing types, some of
which are as follows.
1. Performance/Load testing
2. Scalability testing
3. Reliability testing
4. Stress testing
5. Interoperability testing
6. Localization testing
Software is only one element of a larger computer-based system. Ultimately,
software is incorporated with other system elements (e.g., hardware, people,
information), and a series of system integration and validation tests are conducted.
These tests fall outside the scope of the software process and are not conducted solely
by software engineers. However, steps taken during software design and testing can
greatly improve the probability of successful software integration in the larger system.
A classic system testing problem is "finger-pointing." This occurs when an error is
uncovered, and each system element developer blames the other for the problem.
Rather than indulging in such nonsense, the software engineer should anticipate
potential interfacing problems and
(1) Design error-handling paths that test all information coming from other elements
of the system,
(2) Conduct a series of tests that simulate bad data or other potential errors at the
software interface,
(3) Record the results of tests to use as "evidence" if finger-pointing does occur, and
(4) Participate in planning and design of system tests to ensure that software is
adequately tested.
System testing is actually a series of different tests whose primary purpose is to fully
exercise the computer-based system. Although each test has a different purpose, all
work to verify that system elements have been properly integrated and perform
allocated functions. In the sections that follow, we discuss the types of system tests
that are worthwhile for software-based systems.
To summarize, system testing is done for the following reasons.
• Provide independent perspective in testing
• Bring in customer perspective in testing
• Provide a “fresh pair of eyes” to discover defects not found earlier by testing
• Test product behavior in a holistic, complete and realistic environment
• Test both functional and non-functional aspects of the product
• Build confidence in the product
• Analyze and reduce the risk of releasing the product
• Ensure all requirements are met and ready the product for acceptance testing.
Functional system testing
Functional testing is performed at different phases and the focus is on product level
features. As functional testing is performed at various testing phases, there are two
obvious problems. One is duplication and the other is gray areas. Duplication refers to
the same tests being performed multiple times, and gray area refers to certain tests
being missed out in all the phases. Gray areas in testing happen due to lack of product
knowledge, lack of knowledge of customer usage, and lack of co-ordination across
test teams. There are multiple ways system functional testing is performed. There are
also many ways product level test cases are derived for functional testing.
Functional vs non-functional testing

Testing aspect            Functional testing                      Non-functional testing
--------------------------------------------------------------------------------------
Involves                  Product features and functionality      Quality factors
Tests                     Product behavior                        Behavior and experience
Result conclusion         Simple steps written to check           Huge data collected and
                          expected results                        analyzed
Result varies due to      Product implementation                  Product implementation,
                                                                  resources, and configurations
Testing focus             Defect detection                        Qualification of product
Knowledge required        Product and domain                      Product, domain, design,
                                                                  architecture, statistical skills
Failures normally due to  Code                                    Architecture, design, and code
Testing phase             Unit, component, integration, system    System
Test case repeatability   Repeated many times                     Repeated only in case of
                                                                  failures and for different
                                                                  configurations
Configuration             One-time setup for a set of              Configuration changes for each
                          test cases                              test case
Some of the common techniques are given below:
• Design and architecture verification
• Business vertical testing
• Deployment testing
• Beta testing
• Certification, standards, and testing for compliance.
Non-functional testing
The process followed by non-functional testing is similar to that of functional testing
but differs in the aspects of complexity, knowledge requirement, effort needed, and
number of times the test cases are repeated. Since repeating non-functional test cases
involves more time, effort, and resources, the process for non-functional testing has to
be more robust than that for functional testing to minimize the need for repetition.
This is achieved by having more stringent entry/exit criteria, better planning, and by
setting up the configuration with data population in advance for test execution.
Recovery Testing
Many computer-based systems must recover from faults and resume processing
within a prespecified time. In some cases, a system must be fault tolerant; that is,
processing faults must not cause overall system function to cease. In other cases, a
system failure must be corrected within a specified period of time or severe economic
damage will occur.
Recovery testing is a system test that forces the software to fail in a variety of ways
and verifies that recovery is properly performed. If recovery is automatic (performed
by the system itself), reinitialization, checkpointing mechanisms, data recovery, and
restart are evaluated for correctness. If recovery requires human intervention, the
mean-time-to-repair (MTTR) is evaluated to determine whether it is within acceptable
limits.
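For the manual-recovery case, MTTR can be estimated directly from the fault and recovery timestamps recorded during the test. A sketch, with invented timestamps and a hypothetical acceptable limit:

```python
# Sketch: evaluating mean-time-to-repair (MTTR) from recovery-test logs.
# Timestamps and the acceptable limit are invented for illustration; a real
# recovery test would record them automatically.
from datetime import datetime

def mttr_minutes(incidents):
    """incidents: list of (fault_time, recovered_time) pairs.
    Returns the mean repair time in minutes."""
    total = sum((end - start).total_seconds() for start, end in incidents)
    return total / len(incidents) / 60

if __name__ == "__main__":
    log = [
        (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 12)),    # 12 min
        (datetime(2024, 1, 1, 14, 30), datetime(2024, 1, 1, 14, 48)), # 18 min
    ]
    limit = 20  # minutes, taken from a hypothetical requirement
    mttr = mttr_minutes(log)
    print(f"MTTR = {mttr:.1f} min ({'within' if mttr <= limit else 'exceeds'} limit)")
```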
Scalability testing
The objective of scalability testing is to find out the maximum capability of the
product parameters. As the exercise involves finding the maximum, the resources that
are needed for this kind of testing are normally very high. At the beginning of the
scalability exercise, there may not be an obvious clue about the maximum capability
of the system. Hence a high-end configuration is selected and the scalability
parameter is increased step by step to reach the maximum capability.
Failures during scalability test include the system not responding, or the system
crashing and so on. Scalability tests help in identifying the major bottlenecks in a
product. When resources are found to be the bottleneck, they are increased after
validating the assumptions mentioned. Scalability tests are performed on different
configurations to check the product’s behavior.
There can be some bottlenecks during scalability testing, which will require certain
OS parameters and product parameters to be tuned. “Number of open files” and
“Number of product threads” are some examples of parameters that may need tuning.
When such tuning is performed, it should be appropriately documented. A document
containing such tuning parameters and the recommended values of other product and
environmental parameters for attaining the scalability numbers is called a sizing
guide. This guide is one of the mandatory deliverables from scalability testing.
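The step-by-step increase of a scalability parameter until the maximum capability is found can be sketched as a simple loop. The load function here only simulates a product with an arbitrary capacity, so the loop has something to discover; in a real exercise it would drive the actual system:

```python
# Sketch: stepping a scalability parameter up until the system fails.
# apply_load is a stand-in for driving the real product; it is simulated
# with an invented capacity so the loop has something to find.

SIMULATED_CAPACITY = 750  # pretend the product falls over above this load

def apply_load(concurrent_users):
    """Return True if the system handled the load, False on failure
    (no response, crash, and so on). Simulated for illustration."""
    return concurrent_users <= SIMULATED_CAPACITY

def find_max_capability(start=100, step=100):
    """Increase the parameter step by step; report the last passing value,
    which feeds the sizing guide."""
    load, last_good = start, 0
    while apply_load(load):
        last_good = load
        load += step
    return last_good

if __name__ == "__main__":
    print("Maximum capability:", find_max_capability())
```

Any OS or product parameters tuned to reach the reported number would be documented alongside it in the sizing guide.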
Reliability testing
Reliability testing is done to evaluate the product’s ability to perform its required
functions under stated conditions for a specified period of time or for a large number
of iterations. Examples of reliability include querying a database continuously for 48
hours and performing login operations 10,000 times.
The reliability of a product should not be confused with reliability testing. Reliability
here is an all-encompassing term used to mean all the quality factors and functionality
aspects of the product. This product reliability is achieved by focusing on the
following activities:
Defined engineering processes
Review of work products at each stage
Change management procedures
Review of testing coverage
Ongoing monitoring of the product
Reliability testing, on the other hand, refers to testing the product for a continuous
period of time. Reliability testing only delivers a “reliability tested product” but not a
reliable product. The main factor that is taken into account for reliability testing is
defects.
To summarize, a “reliability tested product” will have the following characteristics:
• No errors or very few errors from repeated transactions
• Zero downtime
• Optimum utilization of resources.
• Consistent performance and response time of the product for repeated
transactions for a specified time duration
• No side-effects after the repeated transactions are executed.
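A reliability run of the "repeat the operation many times" kind can be sketched as follows. The transaction is a trivial stand-in for a real login or database query, and the iteration count is far smaller than the 10,000 operations or 48-hour loops mentioned above:

```python
# Sketch: reliability testing by repeating a transaction many times, counting
# errors and checking that response time stays consistent. The transaction is
# a trivial stand-in for a real operation such as a login.
import time

def transaction():
    """Stand-in for the operation under test; always succeeds in this sketch."""
    return sum(range(1000)) == 499500

def reliability_run(iterations=10_000):
    """Repeat the transaction, returning (error count, worst/mean time ratio).
    A large ratio would indicate inconsistent response times."""
    errors, timings = 0, []
    for _ in range(iterations):
        start = time.perf_counter()
        if not transaction():
            errors += 1
        timings.append(time.perf_counter() - start)
    return errors, max(timings) / (sum(timings) / len(timings))

if __name__ == "__main__":
    errors, worst_vs_mean = reliability_run()
    print(f"errors={errors}, worst/mean response-time ratio={worst_vs_mean:.1f}")
```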
Security Testing
Any computer-based system that manages sensitive information or causes actions
that can improperly harm (or benefit) individuals is a target for improper or illegal
penetration. Penetration spans a broad range of activities: hackers who attempt to
penetrate systems for sport; disgruntled employees who attempt to penetrate for
revenge; dishonest individuals who attempt to penetrate for illicit personal gain.
Security testing attempts to verify that protection mechanisms built into a system will,
in fact, protect it from improper penetration. To quote Beizer: "The system's security
must, of course, be tested for invulnerability from frontal attack—but must also be
tested for invulnerability from flank or rear attack."
During security testing, the tester plays the role(s) of the individual who desires to
penetrate the system. Anything goes! The tester may attempt to acquire passwords
through external clerical means; may attack the system with custom software
designed to break down any defenses that have been constructed; may overwhelm the
system, thereby denying service to others; may purposely cause system errors, hoping
to penetrate during recovery; may browse through insecure data, hoping to find the
key to system entry.
Given enough time and resources, good security testing will ultimately penetrate a
system. The role of the system designer is to make penetration cost more than the
value of the information that will be obtained.
Stress Testing
During earlier software testing steps, white-box and black-box techniques resulted in
thorough evaluation of normal program functions and performance. Stress tests are
designed to confront programs with abnormal situations. In essence, the tester who
performs stress testing asks: "How high can we crank this up before it fails?" Stress
testing executes a system in a manner that demands resources in abnormal quantity,
frequency, or volume. For example,
(1) special tests may be designed that generate ten interrupts per second, when one or
two is the average rate,
(2) input data rates may be increased by an order of magnitude to determine how
input functions will respond,
(3) test cases that require maximum memory or other resources are executed,
(4) test cases that may cause thrashing in a virtual operating system are designed,
(5) test cases that may cause excessive hunting for disk-resident data are created.
Essentially, the tester attempts to break the program.
A variation of stress testing is a technique called sensitivity testing. In some situations
(the most common occur in mathematical algorithms), a very small range of data
contained within the bounds of valid data for a program may cause extreme and even
erroneous processing or profound performance degradation. Sensitivity testing
attempts to uncover data combinations within valid input classes that may cause
instability or improper processing.
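Sensitivity testing can be sketched as a sweep over the valid input range, flagging the narrow sub-range where processing degrades. The routine under test is contrived for illustration: it divides by (x - 1), so inputs near 1.0, although still valid, blow up:

```python
# Sketch: sensitivity testing - sweep a valid input range and flag the narrow
# sub-range where a mathematical routine degrades. The routine is contrived:
# it divides by (x - 1), so valid inputs near 1.0 produce extreme results.

def routine(x):
    """Hypothetical algorithm under test; valid input is 0 <= x <= 2, x != 1."""
    return 1.0 / (x - 1.0)

def sensitive_inputs(samples=2000, blowup=100.0):
    """Return valid inputs whose result magnitude exceeds a sanity bound."""
    flagged = []
    for i in range(samples + 1):
        x = 2.0 * i / samples
        if abs(x - 1.0) < 1e-12:   # skip the explicitly excluded point
            continue
        if abs(routine(x)) > blowup:
            flagged.append(x)      # extreme processing inside the valid range
    return flagged

if __name__ == "__main__":
    hits = sensitive_inputs()
    print(f"{len(hits)} sensitive inputs found, clustered around x = 1.0")
```

The output shows exactly the pattern sensitivity testing looks for: a small cluster of valid inputs causing extreme results while the rest of the range behaves normally.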
Interoperability testing
Interoperability testing is done to ensure that two or more products can exchange
information, use information, and work properly together. Systems can be
interoperable unidirectionally or bi-directionally. Unless two or more products are
designed for exchanging information, interoperability cannot be achieved. The
following are some guidelines that help in improving interoperability.
1. Consistency of information flow across systems
2. Changes to data representation as per the system requirements
3. Correlated interchange of messages and receiving appropriate responses
4. Communication and messages
5. Meeting quality factors.
2.4.2 ACCEPTANCE TESTING
Acceptance testing is a phase after system testing that is normally done by the
customers or representatives of the customers. The customer defines a set of test cases
that will be executed to qualify and accept the product. These test cases are executed
by the customers themselves to quickly judge the quality of the product before
deciding to buy the product. Acceptance test cases are normally small in number and
are not written with the intention of finding defects. Acceptance tests are written to
execute near real-life scenarios. Apart from verifying the functional requirements,
acceptance tests are run to verify the non-functional aspects of the system also.
Acceptance test cases failing at a customer site may cause the product to be rejected,
which may mean financial loss or rework of the product involving effort and time.
Acceptance criteria
Acceptance criteria-product acceptance
During the requirements phase, each requirement is associated with acceptance
criteria. It is possible that one or more requirements may be mapped to form
acceptance criteria. Whenever there are changes to requirements, the acceptance
criteria are accordingly modified and maintained. Acceptance testing is not meant for
executing test cases that have not been executed before. Hence, the existing test cases
are looked at and certain categories of test cases can be grouped to form acceptance
criteria.
Acceptance criteria – procedure acceptance
Acceptance criteria can be defined based on the procedures followed for delivery. An
example of procedure acceptance could be documentation and release media. Some
examples of acceptance criteria of this nature are as follows:
• User, administration and troubleshooting documentation should be part of the
release.
• Along with the binary code, the source code of the product with build scripts is
to be delivered on a CD.
• A minimum of 20 employees are trained on the product usage prior to
deployment.
These procedural acceptance criteria are verified/tested as part of acceptance testing.
Acceptance criteria – service level agreements
Service level agreements are generally part of a contract signed by the customer and
the product organization. The important contract items are taken and verified as part
of acceptance testing.
Selecting test cases for acceptance testing
This section gives some guideline on what test cases can be included for acceptance
testing:
• End-to-end functionality verification
• Domain tests
• User scenario tests
• Basic sanity tests
• New functionality
• A few non-functional tests
• Tests pertaining to legal obligations and service level agreements
• Acceptance test data
Executing acceptance tests
Sometimes the customers themselves do the acceptance tests. In such cases, the job of
the product organization is to assist the customers in acceptance testing and resolve
the issues that come out of it. If the acceptance testing is done by the product
organization, forming the acceptance test team becomes an important activity. An
acceptance test team usually comprises members who are involved in the day-to-day
activities of the product usage or are familiar with such scenarios. The product
management, support, and consulting team, who have good knowledge of the
customers, contribute to the acceptance testing definition and execution. They may
not be familiar with the testing process or the technical aspect of the software. But
they know whether the product does what it is intended to do. An acceptance test team
may be formed with 90% of them possessing the required business process knowledge
of the product and 10% being representatives of the technical testing team. The
number of test team members needed to perform acceptance testing is small when
compared to other phases of testing.
The role of the testing team members during and prior to acceptance test is crucial
since they may constantly interact with the acceptance team members. Test team
members help the acceptance members to get the required test data, select and identify
test cases, and analyze the acceptance test results. During test execution, the
acceptance test team reports its progress regularly. The defect reports are generated on
a periodic basis.
Defects reported during acceptance tests could be of different priorities. Test teams
help acceptance test team report defects. Showstopper and high-priority defects are
necessarily fixed before software is released. In case major defects are identified
during acceptance testing, then there is a risk of missing the release date. When the
defect fixes point to scope or requirement changes, then it may either result in the
extension of the release date to include the feature in the current release or get
postponed to subsequent releases. All resolutions of those defects are discussed with
the acceptance test team and their approval is obtained for concluding the completion
of acceptance testing.
2.5 Summary
White box testing requires a sound knowledge of the program code and the
programming language. This means that the developers should get intimately
involved in white box testing.
All testing activities that are conducted from the point where two components are
integrated to the point where all system components work together, are considered a
part of the integration testing phase. The integration testing phase involves developing
and executing test cases that cover multiple components and functionality. This
testing is both a type of testing and a phase of testing. Integration testing, if done
properly, can reduce the number of defects that will be found in the system testing
phase.
System testing is conducted with the objective of finding product-level defects and
building confidence before the product is released to the customer. System testing
is done to provide independent perspective in testing, bring in customer perspective in
testing, provide a fresh pair of eyes to discover defects not found earlier by testing,
test product in a holistic, complete, and realistic environment, test both functional and
non-functional aspects of a product and analyze and reduce the risk of releasing the
product. It ensures all requirements are met and readies the product for acceptance
testing.
Acceptance testing is a phase after system testing that is normally done by the
customers or representatives of the customer. Acceptance test cases failing at a
customer site may cause the product to be rejected, which may mean financial loss or
rework of the product involving effort and time.
2.6 Check Your Progress
1. Explain the types of testing and their importance.
2. What is white-box testing? Explain static testing.
3. What are the phases involved in structural testing? Explain.
4. What is code complexity testing? Explain with an example.
5. Why is integration testing needed? What are the two types of integration testing?
6. Why is system testing done? Explain.
7. Define beta testing and its importance.
8. Explain acceptance testing.
9. How will you select test cases for acceptance testing?
10. Explain the concepts of system and acceptance testing as a whole.
Unit III Testing Fundamental – 2 & Specialized Testing
Structure
3.0. Objectives
3.1 Introduction
3.2 Performance Testing
3.3 Regression Testing
3.4 Testing of Object Oriented System
3.5 Usability and Accessibility Testing
3.6 Summary
3.7 Check Your Progress
3.0 Objectives
To learn the types of specialized testing and their importance
To understand the importance of performance testing and its uses
To know how regression testing helps deliver software with improved quality
To understand how to test object-oriented systems and their features
To understand the importance of usability and accessibility testing as a
specialized testing methodology
3.1 Introduction
The testing performed to evaluate the response time, throughput, and utilization of
the system, to execute its required functions in comparison with different versions of
the same product or a different competitive product is called performance testing. In
this internet era, when more and more business is transacted online, there is a big and
understandable expectation that all applications run as fast as possible. When
applications run fast, a system can fulfill the business requirements quickly, and this
puts it in a position to expand its business and handle future needs as well. A system or a
product that is not able to service business transactions due to its slow performance is
a big loss for the product organization, and its customers. Hence performance is a
basic requirement for any product and is fast becoming a subject of great interest in
the testing community.
3.2 PERFORMANCE TESTING
For real-time and embedded systems, software that provides required function but
does not conform to performance requirements is unacceptable. Performance testing
is designed to test the run-time performance of software within the context of an
integrated system. Performance testing occurs throughout all steps in the testing
process. Even at the unit level, the performance of an individual module may be
assessed as white-box tests are conducted. However, it is not until all system elements
are fully integrated that the true performance of a system can be ascertained.
Performance tests are often coupled with stress testing and usually require both
hardware and software instrumentation. That is, it is often necessary to measure
resource utilization (e.g., processor cycles) in an exacting fashion. External
instrumentation can monitor execution intervals, log events (e.g., interrupts) as they
occur, and sample machine states on a regular basis. By instrumenting a system, the
tester can uncover situations that lead to degradation and possible system failure.
Figure 19 - The debugging process
There are many factors that govern performance testing. It is critical to understand
the definition and purpose of these factors prior to understanding the methodology for
performance testing and for analyzing the results. The capability of the system or the
product in handling multiple transactions is determined by a factor called throughput.
Throughput represents the number of request/business transactions processed by
the product in specified time duration. It is important to understand that the
throughput varies according to the load the product is subjected to. The “optimum
throughput” is represented by the saturation point and is the one that represents the
maximum throughput for the product.
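As a small sketch (the function name and measurement values below are illustrative, not from the text), the optimum throughput can be read off a series of load-level measurements as the maximum value observed, since beyond the saturation point adding load no longer increases throughput:

```c
#include <stddef.h>

/* tput[i] holds the throughput (transactions per second) measured
 * at the i-th load level.  The saturation point corresponds to the
 * maximum observed throughput: past it, increasing the load does
 * not increase throughput any further. */
double optimum_throughput(const double tput[], size_t n)
{
    double best = 0.0;
    for (size_t i = 0; i < n; i++)
        if (tput[i] > best)
            best = tput[i];
    return best;
}
```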
Response time can be defined as the delay between the point of request and the first
response from the product. In a typical client-server environment, throughput
represents the number of transactions that can be handled by the server and response
time represents the delay between the request and response.
In reality, not all the delay that happens between the request and the response is
caused by the product. In the networking scenario, the network or other products
which share the network resources can cause delays. This brings up yet another
factor for performance – latency. Latency is the delay caused by the application, the
operating system, and the environment; each of these can be calculated separately.
Fig. 20 - Example of latencies at various levels: network and application
To explain latency, let us take the example of a web application that provides a service
by talking to a web server and a database server connected over the network, as shown
in the figure above. From the figure, latency and response time can be calculated as
Network latency = N1 + N2 + N3 + N4
Product latency = A1 + A2 + A3
Actual response time = network latency + product latency
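The formulas above can be checked directly with some hypothetical millisecond figures (the values and function names are invented for illustration):

```c
/* N1..N4 are the four network hops and A1..A3 the three application
 * segments of Fig. 20; all values are in milliseconds. */
double network_latency(const double n[4])
{
    return n[0] + n[1] + n[2] + n[3];
}

double product_latency(const double a[3])
{
    return a[0] + a[1] + a[2];
}

/* Actual response time = network latency + product latency. */
double response_time(const double n[4], const double a[3])
{
    return network_latency(n) + product_latency(a);
}
```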
The next factor that governs the performance testing is tuning. Tuning is a procedure
by which the product performance is enhanced by setting different values to the
parameters of the product, operating system and other components.
Yet another factor that needs to be considered for performance testing is performance
of competitive products. This type of performance testing wherein competitive
products are compared is called benchmarking.
To summarize, performance testing is done to ensure that a product
• Processes the required number of transactions in any given interval
(throughput).
• Is available and running under different load conditions (availability).
• Responds fast enough under different load conditions (response time).
• Delivers a worthwhile return on investment for the resources spent – hardware
and software.
• Is comparable to, and better than, competing products on different
parameters.
Methodology for performance testing involves the following steps.
1. Collecting requirements
2. Writing test cases
3. Automating performance test cases
4. Executing performance test cases
5. Analyzing performance test results
6. Performance tuning
7. Performance benchmarking
8. Recommending right configuration for the customers
Tools for performance testing
There are two types of tools that can be used for performance testing – functional
performance tools and load testing tools.
• Functional performance tools help in recording and playing back the
transactions and obtaining performance numbers. This test involves very few
machines.
• Load testing tools simulate the load condition for performance testing without
requiring that many real users or machines.
Some popular performance tools are listed below:
Functional performance tools
• WinRunner from Mercury
• QA Partner from Compuware
• SilkTest from Segue
Load testing tools
• Load Runner from Mercury
• QA Load from Compuware
• Silk Performer from Segue
Process for performance testing
Performance testing follows the same process as any other type of testing. The only
difference lies in the greater level of detail and analysis required.
Ever-changing requirements for performance are a serious threat to the product as
performance can only be improved marginally by fixing it in code. Making the
requirements testable and measurable is the first activity needed for the success of
performance testing.
The next step in the performance testing process is to create a performance test plan.
This test plan needs to have the following details.
1. Resource requirements
2. Test bed (simulated and real life), test-lab setup
3. Responsibilities
4. Setting up product traces and audits
5. Entry and exit criteria
The Art of Debugging
Software testing is a process that can be systematically planned and specified. Test
case design can be conducted, a strategy can be defined, and results can be evaluated
against prescribed expectations.
Debugging occurs as a consequence of successful testing. That is, when a test case
uncovers an error, debugging is the process that results in the removal of the error.
Although debugging can and should be an orderly process, it is still very much an art.
A software engineer, evaluating the results of a test, is often confronted with a
"symptomatic" indication of a software problem. That is, the external manifestation of
the error and the internal cause of the error may have no obvious relationship to one
another. The poorly understood mental process that connects a symptom to a cause is
debugging.
The Debugging Process
Debugging is not testing but always occurs as a consequence of testing. The
debugging process begins with the execution of a test case. Results are assessed and a
lack of correspondence between expected and actual performance is encountered.
The debugging process will always have one of two outcomes:
(1) The cause will be found and corrected, or
(2) The cause will not be found. In the latter case, the person performing debugging
may suspect a cause, design a test case to help validate that suspicion, and work
toward error correction in an iterative fashion.
Why is debugging so difficult? In all likelihood, human psychology has more to do
with an answer than software technology. However, a few characteristics of bugs
provide some clues:
1. The symptom and the cause may be geographically remote. That is, the symptom
may appear in one part of a program, while the cause may actually be located at a site
that is far removed.
2. The symptom may disappear (temporarily) when another error is corrected.
3. The symptom may actually be caused by nonerrors (e.g., round-off inaccuracies).
4. The symptom may be caused by human error that is not easily traced.
5. The symptom may be a result of timing problems, rather than processing problems.
6. It may be difficult to accurately reproduce input conditions (e.g., a real-time
application in which input ordering is indeterminate).
7. The symptom may be intermittent. This is particularly common in embedded
systems that couple hardware and software inextricably.
8. The symptom may be due to causes that are distributed across a number of tasks
running on different processors.
3.3 REGRESSION TESTING
Each time a new module is added as part of integration testing, the software changes.
New data flow paths are established, new I/O may occur, and new control logic is
invoked. These changes may cause problems with functions that previously worked
flawlessly. In the context of an integration test strategy, regression testing is the
re-execution of some subset of tests that have already been conducted to ensure that
changes have not propagated unintended side effects. In a broader context, successful
tests (of any kind) result in the discovery of errors, and errors must be corrected.
Whenever software is corrected, some aspect of the software configuration (the
program, its documentation, or the data that support it) is changed. Regression testing
is the activity that helps to ensure that changes (due to testing or for other reasons) do
not introduce unintended behavior or additional errors.
Regression testing may be conducted manually, by re-executing a subset of all test
cases or using automated capture/playback tools. Capture/playback tools enable the
software engineer to capture test cases and results for subsequent playback and
comparison.
The regression test suite (the subset of tests to be executed) contains three different
classes of test cases:
• A representative sample of tests that will exercise all software functions.
• Additional tests that focus on software functions that are likely to be affected by the
change.
• Tests that focus on the software components that have been changed.
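A test case belongs in the regression suite if it falls into any of the three classes. A minimal sketch of that selection rule (the struct and field names are hypothetical, chosen only to mirror the three classes above):

```c
#include <stdbool.h>

/* One flag per class of regression test case described above. */
struct test_case {
    bool representative;   /* representative sample of all functions */
    bool likely_affected;  /* focuses on functions likely affected   */
    bool exercises_change; /* focuses on changed components          */
};

/* A test enters the regression suite if it matches any class. */
bool select_for_regression(const struct test_case *t)
{
    return t->representative || t->likely_affected || t->exercises_change;
}
```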
As integration testing proceeds, the number of regression tests can grow quite large.
Therefore, the regression test suite should be designed to include only those tests that
address one or more classes of errors in each of the major program functions. It is
impractical and inefficient to re-execute every test for every program function once a
change has occurred.
Types of regression testing
There are two types of regression testing in practice.
• Regular regression testing
• Final regression testing
Regular regression testing is done between test cycles to ensure that the defect fixes
that have been made and the functionality that was working in earlier test cycles
continue to work. Regular regression testing can use more than one product build for
the test cases to be executed. A build is an aggregation of all the defect fixes and
features that are present in the product.
Final regression testing is done to validate the final build before release.
It is necessary to perform regression testing when
• A reasonable amount of initial testing is already carried out.
• A good number of defects have been fixed.
• Defect fixes that can produce side-effects are taken care of.
How to do regression testing
A well-defined methodology for regression testing is very important, as this is among
the final types of testing normally performed just before release. The
methodology here is made up of the following steps:
• Performing an initial “smoke” or “sanity” test
• Understanding the criteria for selecting the test cases
• Classifying the test cases
• Methodology for selecting test cases
• Resetting the test cases for regression testing
Smoke Testing
Smoke testing is an integration testing approach that is commonly used when
“shrink-wrapped” software products are being developed. It is designed as a pacing
mechanism for time-critical projects, allowing the software team to assess its project
on a frequent basis. In essence, the smoke testing approach encompasses the
following activities:
1. Software components that have been translated into code are integrated into a
“build”. A build includes all data files, libraries, reusable modules, and engineered
components that are required to implement one or more product functions.
2. A series of tests is designed to expose errors that will keep the build from properly
performing its function. The intent should be to uncover “show stopper” errors that
have the highest likelihood of throwing the software project behind schedule.
3. The build is integrated with other builds and the entire product (in its current form)
is smoke tested daily. The integration approach may be top down or bottom up.
The daily frequency of testing the entire product may surprise some readers.
However, frequent tests give both managers and practitioners a realistic assessment of
integration testing progress. McConnell describes the smoke test in the following
manner:
The smoke test should exercise the entire system from end to end. It does not have to
be exhaustive, but it should be capable of exposing major problems. The smoke test
should be thorough enough that if the build passes, you can assume that it is stable
enough to be tested more thoroughly. Smoke testing provides a number of benefits
when it is applied on complex, time critical software engineering projects:
• Integration risk is minimized. Because smoke tests are conducted daily,
incompatibilities and other show-stopper errors are uncovered early, thereby reducing
the likelihood of serious schedule impact when errors are uncovered.
• The quality of the end-product is improved. Because the approach is construction
(integration) oriented, smoke testing is likely to uncover both functional errors and
architectural and component-level design defects. If these defects are corrected early,
better product quality will result.
• Error diagnosis and correction are simplified. Like all integration testing
approaches, errors uncovered during smoke testing are likely to be associated with
“new software increments”—that is, the software that has just been added to the
build(s) is a probable cause of a newly discovered error.
• Progress is easier to assess. With each passing day, more of the software has been
integrated and more has been demonstrated to work. This improves team morale and
gives managers a good indication that progress is being made.
Best practices in regression testing
Regression methodology can be applied when
1. We need to assess the quality of product between test cycles
2. We are doing a major release of a product, have executed all test cycles, and are
planning a regression test cycle for defect fixes and
3. We are doing a minor release of a product having only defect fixes, and we can
plan for regression test cycles to take care of those defect fixes.
The best practices are listed below:
• Regression can be used for all types of releases.
• Mapping defect identifiers with test cases improves regression quality
• Create and execute regression test bed daily
• Ask your best test engineer to select the test cases
• Detect defects, and protect your product from defects and defect fixes.
3.4 Testing of Object Oriented Systems
The objective of testing, stated simply, is to find the greatest possible number of
errors with a manageable amount of effort applied over a realistic time span. Although
this fundamental objective remains unchanged for object-oriented software, the nature
of OO programs changes both testing strategy and testing tactics. It might be argued
that, as OOA and OOD mature, greater reuse of design patterns will mitigate the need
for heavy testing of OO systems. Exactly the opposite is true. Binder discusses this
when he states:
Each reuse is a new context of usage and retesting is prudent. It seems likely that
more, not less, testing will be needed to obtain high reliability in object-oriented
systems.
The testing of OO systems presents a new set of challenges to the software engineer.
The definition of testing must be broadened to include error discovery techniques
(formal technical reviews) applied to OOA and OOD models. The completeness and
consistency of OO representations must be assessed as they are built. Unit testing
loses much of its meaning, and integration strategies change significantly. In
summary, both testing strategies and testing tactics must account for the unique
characteristics of OO software.
The architecture of object-oriented software results in a series of layered subsystems
that encapsulate collaborating classes. Each of these system elements (subsystems and
classes) performs functions that help to achieve system requirements. It is necessary
to test an OO system at a variety of different levels in an effort to uncover errors that
may occur as classes collaborate with one another and subsystems communicate
across architectural layers.
Who does it? Object-oriented testing is performed by software engineers and testing
specialists.
Why is it important? You have to execute the program before it gets to the customer
with the specific intent of removing all errors, so that the customer will not experience
the frustration associated with a poor-quality product. In order to find the highest
possible number of errors, tests must be conducted systematically and test cases must
be designed using disciplined techniques.
What are the steps? OO testing is strategically similar to the testing of conventional
systems, but it is tactically different. Because the OO analysis and design models are
similar in structure and content to the resultant OO program, “testing” begins with the
review of these models. Once code has been generated, OO testing begins “in the
small” with class testing. Problems that could occur during design (and would have
been avoided because of the earlier review) include:
1. Improper allocation of the class to subsystem and/or tasks may occur during system
design.
2. Unnecessary design work may be expended to create the procedural design for the
operations that address the extraneous attribute.
3. The messaging model will be incorrect (because messages must be designed for the
operations that are extraneous).
If the error remains undetected during design and passes into the coding activity,
considerable effort will be expended to generate code that implements an unnecessary
attribute, two unnecessary operations, messages that drive inter object
communication, and many other related issues. In addition, testing of the class will
absorb more time than necessary. Once the problem is finally uncovered, modification
of the system must be carried out with the ever-present potential for side effects that
are caused by change.
During later stages of their development, OOA and OOD models provide substantial
information about the structure and behavior of the system. For this reason, these
models should be subjected to rigorous review prior to the generation of code. All
object-oriented models should be tested (in this context, the term testing is used to
incorporate formal technical reviews) for correctness, completeness, and consistency
within the context of the model’s syntax, semantics, and pragmatics.
Testing OOA and OOD Models
Analysis and design models cannot be tested in the conventional sense, because they
cannot be executed. However, formal technical reviews can be used to examine the
correctness and consistency of both analysis and design models.
Correctness of OOA and OOD Models
The notation and syntax used to represent analysis and design models will be tied to
the specific analysis and design method that is chosen for the project. Hence, syntactic
correctness is judged on proper use of the symbology; each model is reviewed to
ensure that proper modeling conventions have been maintained. During analysis and
design, semantic correctness must be judged based on the model’s conformance to the
real world problem domain.
If the model accurately reflects the real world (to a level of detail that is appropriate
to the stage of development at which the model is reviewed), then it is semantically
correct. To determine whether the model does, in fact, reflect the real world, it should
be presented to problem domain experts, who will examine the class definitions and
hierarchy for omissions and ambiguity. Class relationships (instance connections) are
evaluated to determine whether they accurately reflect real world object connections.
Consistency of OOA and OOD Models
The consistency of OOA and OOD models may be judged by “considering the
relationships among entities in the model. An inconsistent model has representations
in one part that are not correctly reflected in other portions of the model”. To assess
consistency, each class and its connections to other classes should be examined. The
class-responsibility-collaboration model and an object-relationship diagram can be
used to facilitate this activity. The CRC model is composed of CRC index cards.
Each CRC card lists the class name, its responsibilities (operations), and its
collaborators (other classes to which it sends messages and on which it depends for
the accomplishment of its responsibilities). The collaborations imply a series of
relationships (i.e., connections) between classes of the OO system. The
object-relationship model provides a graphic representation of the connections
between classes. All of this information can be obtained from the OOA model.
To evaluate the class model the following steps have been recommended:
1. Revisit the CRC model and the object-relationship model. Cross check to
ensure that all collaborations implied by the OOA model are properly represented.
2. Inspect the description of each CRC index card to determine if a delegated
responsibility is part of the collaborator’s definition. For example, consider a class
defined for a point-of-sale checkout system, called credit sale. This class has a CRC
index card illustrated in Figure 21. For this collection of classes and collaborations,
we ask whether a responsibility (e.g., read credit card) is accomplished if delegated to
the named collaborator (credit card). That is, does the class credit card have an
operation that enables it to be read? In this case the answer is, “Yes.” The
object-relationship is traversed to ensure that all such connections are valid.
3. Invert the connection to ensure that each collaborator that is asked for service
is receiving requests from a reasonable source. For example, if the credit card
class receives a request for purchase amount from the credit sale class, there would
be a problem. Credit card does not know the purchase amount.
4. Using the inverted connections examined in step 3, determine whether other
classes might be required and whether responsibilities are properly grouped
among the classes.
5. Determine whether widely requested responsibilities might be combined into a
single responsibility. For example, read credit card and get authorization occur in
every situation. They might be combined into a validate credit request responsibility
that incorporates getting the credit card number and gaining authorization.
6. Steps 1 through 5 are applied iteratively to each class and through each
evolution of the OOA model.
Once the OOD model is created, reviews of the system design and the object design
should also be conducted. The system design depicts the overall product architecture,
the subsystems that compose the product, the manner in which subsystems are
allocated to processors, the allocation of classes to subsystems, and the design of the
user interface. The object model presents the details of each class and the messaging
activities that are necessary to implement collaborations between classes.
The system design is reviewed by examining the object-behavior model developed
during OOA and mapping required system behavior against the subsystems designed
to accomplish this behavior. Concurrency and task allocation are also reviewed within
the context of system behavior. The behavioral states of the system are evaluated to
determine which exist concurrently. Use-case scenarios are used to exercise the user
interface design.
Figure 21 - An example CRC index card used for reviewing the OOD model
Object Oriented Testing Strategies
The classical strategy for testing computer software begins with “testing in the small”
and works outward toward “testing in the large.” Stated in the jargon of software
testing, we begin with unit testing, then progress toward integration testing, and
culminate with validation and system testing. In conventional applications, unit
testing focuses on the smallest compilable program unit—the subprogram (e.g.,
module, subroutine, procedure, component). Once each of these units has been tested
individually, it is integrated into a program structure while a series of regression tests
are run to uncover errors due to interfacing between the modules and side effects
caused by the addition of new units. Finally, the system as a whole is tested to ensure
that errors in requirements are uncovered.
Unit Testing in the OO Context
When object-oriented software is considered, the concept of the unit changes.
Encapsulation drives the definition of classes and objects. This means that each class
and each instance of a class (object) packages attributes (data) and the operations
(also known as methods or services) that manipulate these data. Rather than testing an
individual module, the smallest testable unit is the encapsulated class or object.
Because a class can contain a number of different operations and a particular
operation may exist as part of a number of different classes, the meaning of unit
testing changes dramatically. We can no longer test a single operation in isolation (the
conventional view of unit testing) but rather as part of a class.
To illustrate, consider a class hierarchy in which an operation X is defined for the
super class and is inherited by a number of subclasses. Each subclass uses operation
X, but it is applied within the context of the private attributes and operations that have
been defined for the subclass. Because the context in which operation X is used varies
in subtle ways, it is necessary to test operation X in the context of each of the
subclasses. This means that testing operation X in a vacuum (the traditional unit
testing approach) is ineffective in the object-oriented context.
Class testing for OO software is the equivalent of unit testing for conventional
software. Unlike unit testing of conventional software, which tends to focus on the
algorithmic detail of a module and the data that flow across the module interface,
class testing for OO software is driven by the operations encapsulated by the class and
the state behavior of the class.
Integration Testing in the OO Context
Because object-oriented software does not have a hierarchical control structure,
conventional top-down and bottom-up integration strategies have little meaning. In
addition, integrating operations one at a time into a class (the conventional
incremental integration approach) is often impossible because of the “direct and
indirect interactions of the components that make up the class”.
There are two different strategies for integration testing of OO systems. The first,
thread-based testing, integrates the set of classes required to respond to one input or
event for the system. Each thread is integrated and tested individually. Regression
testing is applied to ensure that no side effects occur. The second integration
approach, use-based testing, begins the construction of the system by testing those
classes (called independent classes) that use very few (if any) server classes. After
the independent classes are tested, the next layer of classes, called dependent classes,
that use the independent classes are tested. This sequence of testing layers of
dependent classes continues until the entire system is constructed. Unlike
conventional integration, the use of drivers and stubs as replacement operations is to
be avoided, when possible.
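Use-based integration can be sketched as an ordering problem: among the classes not yet tested, the one with the fewest dependencies is integrated next, so independent classes come first (the function below is illustrative, not from the text):

```c
#include <stddef.h>

/* deps[i] = number of other classes that class i uses.
 * tested[i] is nonzero once class i has been integrated and tested.
 * Picks the untested class with the fewest dependencies, so
 * independent classes (deps == 0) go first; returns n when every
 * class has already been tested. */
size_t next_class_to_test(const size_t deps[], const int tested[], size_t n)
{
    size_t best = n;
    for (size_t i = 0; i < n; i++)
        if (!tested[i] && (best == n || deps[i] < deps[best]))
            best = i;
    return best;
}
```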
Cluster testing is one step in the integration testing of OO software. Here, a cluster of
collaborating classes (determined by examining the CRC and object-relationship
model) is exercised by designing test cases that attempt to uncover errors in the
collaborations.
Validation Testing in an OO Context
At the validation or system level, the details of class connections disappear. Like
conventional validation, the validation of OO software focuses on user-visible actions
and user-recognizable output from the system. To assist in the derivation of validation
tests, the tester should draw upon the use-cases that are part of the analysis model.
The use-case provides a scenario that has a high likelihood of uncovering errors in user
interaction requirements. Conventional black-box testing methods can be used to
drive validation tests. In addition, test cases may be derived from the object-behavior
model and from event flow diagrams created as part of OOA.
Test case Design for OO Software
Test case design methods for OO software are still evolving. However, an overall
approach to OO test case design has been defined by Berard :
1. Each test case should be uniquely identified and explicitly associated with the class
to be tested.
2. The purpose of the test should be stated.
3. A list of testing steps should be developed for each test and should contain:
a. A list of specified states for the object that is to be tested.
b. A list of messages and operations that will be exercised as a consequence of the
test.
c. A list of exceptions that may occur as the object is tested.
d. A list of external conditions (i.e., changes in the environment external to the
software that must exist in order to properly conduct the test).
e. Supplementary information that will aid in understanding or implementing the test.
Unlike conventional test case design, which is driven by an input-process-output view
of software or the algorithmic detail of individual modules, object-oriented testing
focuses on designing appropriate sequences of operations to exercise the states of a
class.
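To make this concrete, a "class" can be emulated in C as a struct plus its operations, and a test case then drives a sequence of operations through the object's states (the account example here is entirely hypothetical):

```c
#include <stdbool.h>

/* State of the emulated class: a balance and an open/closed flag. */
struct account {
    double balance;
    bool   open;
};

void account_open(struct account *a)
{
    a->balance = 0.0;
    a->open = true;
}

/* Deposits are only legal while the object is in the open state. */
bool account_deposit(struct account *a, double amount)
{
    if (!a->open || amount <= 0.0)
        return false;
    a->balance += amount;
    return true;
}

void account_close(struct account *a)
{
    a->open = false;
}
```

A state-based test case exercises a sequence such as open → deposit → close → deposit, checking that the final operation is rejected in the closed state rather than testing each operation in isolation.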
The Test Case Design Implications of OO Concepts
As we have already seen, the OO class is the target for test case design. Because
attributes and operations are encapsulated, testing operations outside of the class is
generally unproductive. Although encapsulation is an essential design concept for
OO, it can create a minor obstacle when testing. As Binder notes, “Testing requires
reporting on the concrete and abstract state of an object.” Yet, encapsulation can make
this information somewhat difficult to obtain. Unless built-in operations are provided
to report the values for class attributes, a snapshot of the state of an object may be
difficult to acquire. Inheritance also leads to additional challenges for the test case
designer. We have already noted that each new context of usage requires retesting,
even though reuse has been achieved.
In addition, multiple inheritance complicates testing further by increasing the
number of contexts for which testing is required. If subclasses instantiated from a
super class are used within the same problem domain, it is likely that the set of test
cases derived for the super class can be used when testing the subclass. However, if
the subclass is used in an entirely different context, the super class test cases will have
little applicability and a new set of tests must be designed.
Applicability of Conventional Test Case Design Methods
The white-box testing methods can be applied to the operations defined for a class.
Basis path, loop testing, or data flow techniques can help to ensure that every
statement in an operation has been tested. However, the concise structure of many
class operations causes some to argue that the effort applied to white-box testing
might be better redirected to tests at a class level.
Black-box testing methods are as appropriate for OO systems as they are for systems
developed using conventional software engineering methods. As we noted earlier in
this chapter, use-cases can provide useful input in the design of black-box and
state-based tests.
Fault-Based Testing
The object of fault-based testing within an OO system is to design tests that have a
high likelihood of uncovering plausible faults. Because the product or system must
conform to customer requirements, the preliminary planning required to perform fault
based testing begins with the analysis model. The tester looks for plausible faults (i.e.,
aspects of the implementation of the system that may result in defects). To determine
whether these faults exist, test cases are designed to exercise the design or code.
Consider a simple example; Software engineers often make errors at the boundaries of
a problem. For example, when testing a SQRT operation that returns errors for
negative numbers, we know to try the boundaries: a negative number close to zero
and zero itself. "Zero itself" checks whether the programmer made a mistake like
if (x > 0) calculate_the_square_root();
instead of the correct
if (x >= 0) calculate_the_square_root();
As another example, consider a Boolean expression:
if (a && !b || c)
Multicondition testing and related techniques probe for certain plausible faults in this
expression, such as
• && should be ||
• ! was left out where it was needed
• There should be parentheses around !b || c
For each plausible fault, we design test cases that will force the incorrect expression
to fail. In the previous expression, (a=0, b=0, c=0) will make the expression as given
evaluate false. If the && should have been ||, the code has done the wrong thing and
might branch to the wrong path.
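This distinguishing input can be checked mechanically by coding the expression as written next to its && → || mutant (the helper names are invented for illustration):

```c
/* The Boolean expression exactly as written in the text. */
int as_written(int a, int b, int c)
{
    return (a && !b || c) ? 1 : 0;
}

/* The plausible fault: && replaced by ||. */
int mutant(int a, int b, int c)
{
    return (a || !b || c) ? 1 : 0;
}
```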
Of course, the effectiveness of these techniques depends on how testers perceive a
"plausible fault." If real faults in an OO system are perceived to be "implausible,"
then this approach is really no better than any random testing technique. However, if
the analysis and design models can provide insight into what is likely to go wrong,
then fault-based testing can find significant numbers of errors with relatively low
expenditures of effort. Integration testing looks for plausible faults in operation calls
or message connections. Three types of faults are encountered in this context:
unexpected result, wrong operation/message used, incorrect invocation. To determine
plausible faults as functions (operations) are invoked, the behavior of the operation
must be examined. Integration testing applies to attributes as well as to operations.
The "behaviors" of an object are defined by the values that its attributes are assigned.
Testing should exercise the attributes to determine whether proper values occur for
distinct types of object behavior. It is important to note that integration testing
attempts to find errors in the client object, not the server. Stated in conventional
terms, the focus of integration testing is to determine whether errors exist in the
calling code, not the called code. The operation call is used as a clue, a way to find
test requirements that exercise the calling code.
The Impact of OO Programming on Testing
There are several ways object-oriented programming can have an impact on testing.
Depending on the approach to OOP,
• Some types of faults become less plausible (not worth testing for).
• Some types of faults become more plausible (worth testing now).
• Some new types of faults appear.
When an operation is invoked, it may be hard to tell exactly what code gets exercised.
That is, the operation may belong to one of many classes. Also, it can be hard to
determine the exact type or class of a parameter. When the code accesses it, it may get
an unexpected value. The difference can be understood by considering a conventional
function call:
x = func (y);
For conventional software, the tester need consider all behaviors attributed to func
and nothing more. In an OO context, the tester must consider the behaviors of
base::func(), of derived::func(), and so on. Each time func is invoked, the tester must
consider the union of all distinct behaviors. This is easier if good OO design practices
are followed and the difference between super classes and subclasses (in C++ jargon,
these are called base classes and derived classes) are limited. The testing approach for
base and derived classes is essentially the same. The difference is one of
bookkeeping.
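A minimal sketch of this point, using hypothetical classes of our own: the same call site obj.func(y) binds to a different method body depending on the dynamic class of obj, so the tester must consider the union of base and derived behaviors:

```python
# Minimal sketch with hypothetical classes: one call site, several bindings.
class Base:
    def func(self, y):
        return y + 1       # one behavior the call site may bind to

class Derived(Base):
    def func(self, y):
        return y * 2       # a different behavior behind the same call site

def client(obj, y):
    return obj.func(y)     # which func runs depends on type(obj)

# A conventional tester would test a single function; here each possible
# binding needs its own test case.
print(client(Base(), 5), client(Derived(), 5))
```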
Testing OO class operations is analogous to testing code that takes a function
parameter and then invokes it. Inheritance is a convenient way of producing
polymorphic operations. At the call site, what matters is not the inheritance, but the
polymorphism. Inheritance does make the search for test requirements more
straightforward. By virtue of OO software architecture and construction, are some
types of faults more plausible for an OO system and others less plausible? The answer
is, “Yes.” For example, because OO operations are generally smaller, more time tends
to be spent on integration because there are more opportunities for integration faults.
Therefore, integration faults become more plausible.
Test Cases and the Class Hierarchy
As noted earlier in this chapter, inheritance does not obviate the need for thorough
testing of all derived classes. In fact, it can actually complicate the testing process.
Consider the following situation. A class base contains the operations inherited() and
redefined(). A class derived redefines redefined() to serve in a local context. There is
little doubt that derived::redefined() has to be tested because it represents a new design
and new code. But does derived::inherited() have to be retested? If
derived::inherited() calls redefined and the behavior of redefined has changed,
derived::inherited() may mishandle the new behavior. Therefore, it needs new tests
even though the design and code have not changed. It is important to note, however,
that only a subset of all tests for derived::inherited() may have to be conducted. If part
of the design and code for inherited does not depend on redefined (i.e., it neither
calls it nor calls any code that indirectly calls it), that code need not be retested in the
derived class.
Base::redefined() and derived::redefined() are two different operations with different
specifications and implementations. Each would have a set of test requirements
derived from the specification and implementation. Those test requirements probe for
plausible faults: integration faults, condition faults, boundary faults, and so forth. But
the operations are likely to be similar. Their sets of test requirements will overlap.
The better the OO design, the greater is the overlap. New tests need to be derived only
for those derived::redefined() requirements that are not satisfied by the
base::redefined() tests.
To summarize, the base::redefined() tests are applied to objects of class derived.
Test inputs may be appropriate for both base and derived classes, but the expected
results may differ in the derived class.
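A small illustrative sketch (the operation names mirror the discussion; the bodies are hypothetical) shows why derived::inherited() needs retesting when it depends on a redefined operation, and why the same test inputs can yield different expected results in the derived class:

```python
# Illustrative sketch: inherited() is unchanged code, yet its behavior
# depends on redefined(), which Derived overrides.
class Base:
    def redefined(self):
        return 10

    def inherited(self):
        # unchanged design and code, but it calls redefined()
        return self.redefined() + 1

class Derived(Base):
    def redefined(self):
        return 20          # new design and new code: certainly needs tests

# The base-class test of inherited() ...
assert Base().inherited() == 11
# ... must be rerun against Derived, where the expected result differs even
# though inherited() itself did not change.
assert Derived().inherited() == 21
```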
Scenario-Based Test Design
Fault-based testing misses two main types of errors:
(1) Incorrect specifications and
(2) Interactions among subsystems.
When errors associated with incorrect specification occur, the product doesn't do
what the customer wants. It might do the wrong thing or it might omit important
functionality. But in either circumstance, quality (conformance to requirements)
suffers. Errors associated with subsystem interaction occur when the behavior of one
subsystem creates circumstances (e.g., events, data flow) that cause another
subsystem to fail.
Scenario-based testing concentrates on what the user does, not what the product
does. This means capturing the tasks (via use-cases) that the user has to perform, then
applying them and their variants as tests. Scenarios uncover interaction errors. But to
accomplish this, test cases must be more complex and more realistic than fault-based
tests. Scenario-based testing tends to exercise multiple subsystems in a single test
(users do not limit themselves to the use of one subsystem at a time).
As an example, consider the design of scenario-based tests for a text editor. Use cases
follow:
Use-Case: Fix the Final Draft
Background: It's not unusual to print the "final" draft, read it, and discover some
annoying errors that weren't obvious from the on-screen image. This use-case
describes the sequence of events that occurs when this happens.
1. Print the entire document.
2. Move around in the document, changing certain pages.
3. As each page is changed, it's printed.
4. Sometimes a series of pages is printed.
This scenario describes two things: a test and specific user needs. The user needs are
obvious: (1) a method for printing single pages and (2) a method for printing a range
of pages. As far as testing goes, there is a need to test editing after printing (as well as
the reverse). The tester hopes to discover that the printing function causes errors in
the editing function; that is, that the two software functions are not properly
independent.
Use-Case: Print a New Copy
Background: Someone asks the user for a fresh copy of the document. It must be
printed.
1. Open the document.
2. Print it.
3. Close the document.
Again, the testing approach is relatively obvious. Except that this document didn't
appear out of nowhere. It was created in an earlier task. Does that task affect this one?
In many modern editors, documents remember how they were last printed. By default,
they print the same way next time. After the Fix the Final Draft scenario, just
selecting "Print" in the menu and clicking the "Print" button in the dialog box will
cause the last corrected page to print again. So, according to the editor, the correct
scenario should look like this:
Use-Case: Print a New Copy
1. Open the document.
2. Select "Print" in the menu.
3. Check if you're printing a page range; if so, click to print the entire document.
4. Click on the Print button.
5. Close the document.
But this scenario indicates a potential specification error. The editor does not do what
the user reasonably expects it to do. Customers will often overlook the check noted in
step 3 above. They will then be annoyed when they trot off to the printer and find one
page when they wanted 100. Annoyed customers signal specification bugs. A test case
designer might miss this dependency in test design, but it is likely that the problem
would surface during testing. The tester would then have to contend with the probable
response, "That's the way it's supposed to work!"
Testing Surface Structure and Deep Structure
Surface structure refers to the externally observable structure of an OO program. That
is, the structure that is immediately obvious to an end-user. Rather than performing
functions, the users of many OO systems may be given objects to manipulate in some
way. But whatever the interface, tests are still based on user tasks. Capturing these
tasks involves understanding, watching, and talking with representative users (and as
many non representative users as are worth considering).
There will surely be some difference in detail. For example, in a conventional system
with a command-oriented interface, the user might use the list of all commands as a
testing checklist. If no test scenarios existed to exercise a command, testing has likely
overlooked some user tasks (or the interface has useless commands). In an object-based
interface, the tester might use the list of all objects as a testing checklist. The best tests
are derived when the designer looks at the system in a new or unconventional way.
For example, if the system or product has a command-based interface, more thorough
tests will be derived if the test case designer pretends that operations are independent
of objects. Ask questions like, "Might the user want to use this operation, which
applies only to the Scanner object, while working with the printer?" Whatever the
interface style, test case design that exercises the surface structure should use both
objects and operations as clues leading to overlooked tasks.
Deep structure refers to the internal technical details of an OO program. That is, the
structure that is understood by examining the design and/or code. Deep structure
testing is designed to exercise dependencies, behaviors, and communication
mechanisms that have been established as part of the system and object design of OO
software. The analysis and design models are used as the basis for deep structure
testing.
For example, the object-relationship diagram or the subsystem collaboration diagram
depicts collaborations between objects and subsystems that may not be externally
visible. The test case design then asks: “Have we captured (as a test) some task that
exercises the collaboration noted on the object-relationship diagram or the subsystem
collaboration diagram? If not, why not?”
Design representations of class hierarchy provide insight into inheritance structure.
Inheritance structure is used in fault-based testing. Consider a situation in which an
operation named caller has only one argument and that argument is a reference to a
base class. What might happen when caller is passed a derived class? What are the
differences in behavior that could affect caller? The answers to these questions might
lead to the design of specialized tests.
Testing Methods Applicable At the Class Level
Software testing begins “in the small” and slowly progresses toward testing “in the
large.” Testing in the small focuses on a single class and the methods that are
encapsulated by the class. Random testing and partitioning are methods that can be
used to exercise a class during OO testing.
Random Testing for OO Classes
To provide brief illustrations of these methods, consider a banking application in
which an account class has the following operations: open, setup, deposit, withdraw,
balance, summarize, creditLimit, and close. Each of these operations may be applied
to account, but certain constraints (e.g., the account must be opened before other
operations can be applied and closed after all operations are completed) are implied
by the nature of the problem. Even with these constraints, there are many
permutations of the operations. The minimum behavioral life history of an instance of
account includes the following operations:
• Open
• Setup
• Deposit
• Withdraw
• Close
This represents the minimum test sequence for account. However, a wide variety of
other behaviors may occur within this sequence:
open•setup•deposit•[deposit|withdraw|balance|summarize|creditLimit]n•withdraw•close
A variety of different operation sequences can be generated randomly. For example:
Test case r1: open•setup•deposit•deposit•balance•summarize•withdraw•close
Test case r2:
open•setup•deposit•withdraw•deposit•balance•creditLimit•withdraw•close
These and other random order tests are conducted to exercise different class instance
life histories.
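The random-sequence idea can be sketched as follows. The Account implementation here is a hypothetical stand-in (only the operation names come from the text); the generator always preserves the minimum life history and fills the middle with randomly chosen operations, as in test cases r1 and r2:

```python
# Illustrative sketch: the Account class is a hypothetical stand-in that only
# enforces the "open before use, close at the end" constraint from the text.
import random

class Account:
    def __init__(self):
        self.balance = 0
        self.is_open = False

    def open(self):
        self.is_open = True

    def setup(self):
        assert self.is_open

    def deposit(self, amount=10):
        assert self.is_open
        self.balance += amount

    def withdraw(self, amount=5):
        assert self.is_open
        self.balance -= amount

    def summarize(self):
        assert self.is_open
        return self.balance

    def close(self):
        self.is_open = False

def random_test_sequence(n, seed=None):
    """Minimum life history open-setup-deposit-...-withdraw-close, with n
    randomly chosen operations in the middle (cf. test cases r1 and r2)."""
    rng = random.Random(seed)
    middle = [rng.choice(["deposit", "withdraw", "summarize"]) for _ in range(n)]
    return ["open", "setup", "deposit"] + middle + ["withdraw", "close"]

def run(sequence):
    acct = Account()
    for op in sequence:
        getattr(acct, op)()  # replay one step of the life history
    return acct

print(random_test_sequence(4, seed=1))
```

Replaying many such randomly generated sequences exercises different class instance life histories; any assertion failure flags a violated constraint.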
Partition Testing at the Class Level
Partition testing reduces the number of test cases required to exercise the class in
much the same manner as equivalence partitioning for conventional software. Input
and output are categorized and test cases are designed to exercise each category. But
how are the partitioning categories derived?
State-based partitioning categorizes class operations based on their ability to change
the state of the class. Again considering the account class, state operations include
deposit and withdraw, whereas nonstate operations include balance, summarize, and
creditLimit. Tests are designed in a way that exercises operations that change state
and those that do not change state separately. Therefore,
Test case p1: open•setup•deposit•deposit•withdraw•withdraw•close
Test case p2: open•setup•deposit•summarize•creditLimit•withdraw•close
Test case p1 changes state, while test case p2 exercises operations that do not change
state (other than those in the minimum test sequence).
Attribute-based partitioning categorizes class operations based on the attributes that
they use. For the account class, the attributes balance and creditLimit can be used to
define partitions. Operations are divided into three partitions: (1) operations that use
creditLimit, (2) operations that modify creditLimit, and (3) operations that do not use
or modify creditLimit. Test sequences are then designed for each partition.
Category-based partitioning categorizes class operations based on the generic
function that each performs. For example, operations in the account class can be
categorized into initialization operations (open, setup), computational operations
(deposit, withdraw), queries (balance, summarize, creditLimit), and termination
operations (close).
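The state-based partitioning rule can be expressed as a small sketch (the partition sets come from the text; the helper that splits a test sequence by partition is our own):

```python
# Illustrative sketch: split a test sequence by state-based partition.
STATE_OPS = {"deposit", "withdraw"}                       # operations that change state
NONSTATE_OPS = {"balance", "summarize", "creditLimit"}    # query-only operations

def partition(sequence):
    """Split a sequence (ignoring the open/setup/close frame) into the
    state-changing and non-state operations it exercises."""
    frame = {"open", "setup", "close"}
    body = [op for op in sequence if op not in frame]
    state = [op for op in body if op in STATE_OPS]
    nonstate = [op for op in body if op in NONSTATE_OPS]
    return state, nonstate

p1 = ["open", "setup", "deposit", "deposit", "withdraw", "withdraw", "close"]
p2 = ["open", "setup", "deposit", "summarize", "creditLimit", "withdraw", "close"]
print(partition(p1))  # only state-changing operations
print(partition(p2))  # exercises the query partition as well
```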
Interclass Test Case Design
Test case design becomes more complicated as integration of the OO system begins.
It is at this stage that testing of collaborations between classes must begin. To
illustrate “interclass test case generation”, we expand the banking example to include
the classes and collaborations noted in Figure.
The direction of the arrows in the figure indicates the direction of messages and the
labeling indicates the operations that are invoked as a consequence of the
collaborations implied by the messages. Like the testing of individual classes, class
collaboration testing can be accomplished by applying random and partitioning
methods, as well as scenario-based testing and behavioral testing.
Multiple Class Testing
Kirani and Tsai suggest the following sequence of steps to generate multiple class
random test cases:
1. For each client class, use the list of class operations to generate a series of random
test sequences. The operations will send messages to other server classes.
2. For each message that is generated, determine the collaborator class and the
corresponding operation in the server object.
3. For each operation in the server object (that has been invoked by messages sent
from the client object), determine the messages that it transmits.
4. For each of the messages, determine the next level of operations that are invoked
and incorporate these into the test sequence.
To illustrate, consider a sequence of operations for the bank class relative to an ATM
class:
verifyAcct•verifyPIN•[[verifyPolicy•withdrawReq]|depositReq|acctInfoREQ]n
A random test case for the bank class might be
test case r3 = verifyAcct•verifyPIN•depositReq
In order to consider the collaborators involved in this test, the messages associated
with each of the operations noted in test case r3 are considered. Bank must collaborate
with ValidationInfo to execute the verifyAcct and verifyPIN. Bank must collaborate
with account to execute depositReq. Hence, a new test case that exercises these
collaborations is
test case r4 = verifyAcct(Bank)•[validAcct(ValidationInfo)]•verifyPIN(Bank)•
[validPin(ValidationInfo)]•depositReq(Bank)•[deposit(account)]
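Steps 1 through 4 above can be sketched as a simple expansion over a message map. The map below is a hypothetical encoding of the collaborations described for test case r3:

```python
# Illustrative sketch: a hypothetical encoding of the bank class collaborations,
# used to expand a client test sequence (steps 1-4) into an interclass sequence.
COLLABORATIONS = {
    # client (Bank) operation -> (collaborator class, operation it invokes)
    "verifyAcct": ("ValidationInfo", "validAcct"),
    "verifyPIN":  ("ValidationInfo", "validPin"),
    "depositReq": ("Account", "deposit"),
}

def expand(sequence):
    """Interleave each Bank operation with the collaborator operation
    invoked by the message it sends."""
    expanded = []
    for op in sequence:
        expanded.append(("Bank", op))
        if op in COLLABORATIONS:
            expanded.append(COLLABORATIONS[op])
    return expanded

r3 = ["verifyAcct", "verifyPIN", "depositReq"]  # test case r3 from the text
for cls, op in expand(r3):
    print(f"{op} on {cls}")
```

In a deeper system the expansion would recurse: each collaborator operation would in turn be looked up for the messages it transmits (step 3).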
[Figure: a class collaboration diagram showing the classes ATM user interface, ATM,
Cashier, Bank, Account, and Validation info. The message arrows are labeled with the
operations they invoke, including cardInserted, password, deposit, withdraw,
accntStatus, terminate, verifyStatus, depositStatus, dispenseCash, printAccntStat,
readCardInfo, getCashAmnt, verifyAcct, verifyPIN, verifyPolicy, withdrawReq,
depositReq, acctInfo, openAcct, initialDeposit, authorizeCard, deauthorize, closeAcct,
validPIN, validAcct, creditLimit, accntType, balance, and close.]
Figure 22- Class collaboration diagram for banking application
The approach for multiple class partition testing is similar to the approach used for
partition testing of individual classes. However, the test sequence is expanded to
include those operations that are invoked via messages to collaborating classes. An
alternative approach partitions tests based on the interfaces to a particular class.
Referring to the above figure, the bank class receives messages from the ATM and
cashier classes. The methods within bank can therefore be tested by partitioning
them into those that serve ATM and those that serve cashier. State-based partitioning
can be used to refine the partitions further.
Tests Derived from Behavior Models
The state transition diagram is a model that represents the dynamic behavior of a
class. The STD for a class can be used to help derive a sequence of tests that will
exercise the dynamic behavior of the class (and those classes that collaborate with it).
The state model can be traversed in a “breadth-first” manner. In this context, breadth
first implies that a test case exercises a single transition and that when a new
transition is to be tested only previously tested transitions are used.
Consider the credit card object discussed in the previous section. The initial state of
credit card is undefined (i.e., no credit card number has been provided). Upon
reading the credit card during a sale, the object takes on a defined state; that is, the
attributes card number and expiration date, along with bank specific identifiers are
defined. The credit card is submitted when it is sent for authorization and it is
approved when authorization is received. The transition of credit card from one state
to another can be tested by deriving test cases that cause the transition to occur. A
breadth-first approach to this type of testing would not exercise submitted before it
exercised undefined and defined. If it did, it would make use of transitions that had
not been previously tested and would therefore violate the breadth-first criterion.
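As an illustrative sketch (the event names are paraphrased from the description above), a breadth-first traversal of the credit card's state model yields test sequences in which every new transition is reached only through previously tested transitions:

```python
# Illustrative sketch: breadth-first test derivation over the credit card
# state model; each new transition is preceded only by tested transitions.
from collections import deque

TRANSITIONS = {
    "undefined": {"read card": "defined"},
    "defined":   {"submit for authorization": "submitted"},
    "submitted": {"authorization received": "approved"},
    "approved":  {},
}

def breadth_first_tests(start="undefined"):
    """Return test sequences; each ends with one new transition preceded
    only by transitions already covered by earlier (shorter) tests."""
    tests = []
    queue = deque([(start, [])])
    while queue:
        state, path = queue.popleft()
        for event, target in TRANSITIONS[state].items():
            step = path + [(state, event, target)]
            tests.append(step)
            queue.append((target, step))
    return tests

for test in breadth_first_tests():
    print(" -> ".join(f"{src} [{event}] {dst}" for src, event, dst in test))
```

Note that the sequence exercising submitted necessarily replays the already-tested undefined and defined transitions first, which is exactly the breadth-first criterion.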
Tools for Testing Object-Oriented Systems
There are several tools that aid in testing OO systems. Some of these are
1. Use cases
2. Class diagrams
3. Sequence diagrams
4. State charts
3.5 USABILITY AND ACCESSIBILITY TESTING
Usability testing attempts to characterize the "look and feel" and usage aspects of a
product, from the point of view of users. Most types of testing are objective in
nature; usability testing, in contrast, is largely subjective.
Some of the characteristics of usability testing are as follows:
• Usability testing tests the product from the user's point of view. It
encompasses a range of techniques for identifying how users actually interact
with and use the product.
• Usability testing is for checking the product to see if it is easy to use for the
various categories of users.
• Usability testing is a process to identify discrepancies between the user
interface of the product and the human user requirements, in terms of the
pleasantness and aesthetics aspects.
If we combine all the above characterizations of the various factors that determine
usability testing, then the common threads are
1. Ease of use
2. Speed
3. Pleasantness and aesthetics
Approach to usability
When doing usability testing, certain human factors can be represented in a
quantifiable way and can be tested objectively. Generally, the people best suited to
perform usability testing are
1. Typical representatives of the actual user segments who would be using the
product, so that the typical user patterns can be captured, and
2. People who are new to the product, so that they can start without any bias and
be able to identify usability problems.
When to do usability testing
The most appropriate way of ensuring usability is by performing the usability testing
in two phases. First is design validation and the second is usability testing done as a
part of component and integration testing phases of a test cycle. A product has to be
designed for usability. A product designed only for functionality may not get user
acceptance. A product designed for functionality may also involve a high degree of
training, which can be minimized if it is designed for both functionality and usability.
Usability design is verified through several means. Some of them are as follows:
• Style sheets
• Screen prototypes
• Paper designs
• Layout design
A “usable product” is always the result of mutual collaboration from all the
stakeholders, for the entire duration of the project. Usability is a habit and a behavior.
Just like humans, products are expected to behave correctly with different users,
adapting to their differing expectations.
Quality factors for usability
Some quality factors are very important when performing usability testing. Focusing
on the quality factors given below helps in improving objectivity in usability
testing.
• Comprehensibility
• Consistency
• Navigation
• Responsiveness
Aesthetics testing
Another important aspect in usability is making the product “beautiful”. Performing
aesthetics testing helps in improving usability further. It is not possible for all
products to measure up to the Taj Mahal in beauty, but testing for aesthetics can at
least ensure the product is pleasing to the eye. Aesthetics testing can be performed by
anyone who appreciates beauty. Beauticians, artists, and architects, whose regular
roles involve making different aspects of life beautiful, can serve as experts in
aesthetics testing. Involving them during the design and testing phases and incorporating
their inputs may improve the aesthetics of the product. For example, the icons used in
the product may look more appealing if they are designed by an artist, as they are not
meant only for conveying messages but also help in making the product beautiful.
ACCESSIBILITY TESTING
There are a large number of people who are challenged with vision, hearing, and
mobility-related problems, partial or complete. A product whose usability does not
take their requirements into account would face a lack of acceptance. There are
several tools that
are available to help them with alternatives. These tools are generally referred to as
accessibility tools or assistive technologies. Verifying the product usability for
physically challenged users is called accessibility testing. Accessibility is a subset of
usability and should be included as part of usability test planning.
Accessibility of the product can be provided by two means.
• Making use of accessibility features provided by the underlying infrastructure
(for example, the operating system), called basic accessibility, and
• Providing accessibility in the product through standards and guidelines, called
product accessibility.
Basic accessibility
Basic accessibility is provided by the hardware and operating system. All the input
and output devices of the computer and their accessibility options are categorized
under basic accessibility. The keyboard accessibility and screen accessibility are some
of the basic accessibility features.
Product accessibility
A good understanding of the basic accessibility features is needed while providing
accessibility to the product. A product should do everything possible to ensure that the
basic accessibility features are utilized by it. A good understanding of basic
accessibility features and of the requirements of different types of users with
special needs helps in creating certain guidelines on how the product's user
interface has to be designed.
These guidelines explain the importance of providing text equivalents for picture
messages and providing captions for audio portions. When an audio file is played,
providing captions for the audio improves accessibility for the hearing impaired.
Providing audio descriptions improves accessibility for visually impaired users who
cannot see the video streams and pictures. Hence, text equivalents for audio, and
audio descriptions for pictures and visuals, become important requirements for
accessibility.
Tools for usability
There are not many tools that help in usability because of the high degree of
subjectivity involved in evaluating this aspect. A sample list of usability and
accessibility tools is given below:
• JAWS
• HTML validator
• Style sheet validator
• Magnifier
• Narrator
• Soft keyboard
Test roles for usability
Usability testing is not as formal as other types of testing in several companies and is
not performed with a pre-written set of test cases/checklists. Various methods adopted
by companies for usability testing are as follows.
• Performing usability testing as a separate cycle of testing
• Hiring external consultants to do usability validation
• Setting up a separate group for usability to institutionalize the practices across
various product development teams and to set up organization-wide standards
for usability.
3.6 Summary
The overall objective of object-oriented testing—to find the maximum number of
errors with a minimum amount of effort—is identical to the objective of conventional
software testing. But the strategy and tactics for OO testing differ significantly. The
view of testing broadens to include the review of both the analysis and design model.
In addition, the focus of testing moves away from the procedural component (the
module) and toward the class. Because the OO analysis and design models and the
resulting source code are semantically coupled, testing (in the form of formal
technical reviews) begins during these engineering activities. For this reason, the
review of CRC, object-relationship, and object-behavior models can be viewed as first
stage testing.
Once OOP has been accomplished, unit testing is applied for each class. The design of
tests for a class uses a variety of methods: fault-based testing, random testing, and
partition testing. Each of these methods exercises the operations encapsulated by the
class. Test sequences are designed to ensure that relevant operations are exercised.
The state of the class, represented by the values of its attributes, is examined to
determine if errors exist. Integration testing can be accomplished using a thread-based
or use-based strategy. Thread-based testing integrates the set of classes that
collaborate to respond to one input or event. Use-based testing constructs the system
in layers, beginning with those classes that do not use server classes.
There is an increasing awareness of usability testing in the industry. Soon, usability
testing will become an engineering discipline, a life cycle activity, and a profession.
Several companies plan for usability testing in the beginning of the product life cycle
and track them to completion. Usability is not achieved only by testing. Usability is
more in the design and in the minds of the people who contribute to the product.
Usability is all about user experiences. Thinking from the perspective of the user all
the time during the project will go a long way in ensuring usability.
3.7 Check Your Progress
1. Explain the purpose of performance testing and the factors governing
performance testing.
2. How will you collect the requirements and test cases for performance testing?
3. How will you automate performance test cases? Explain with an example.
4. Define the terms performance tuning and benchmarking.
5. Mention some of the performance testing tools.
6. What is regression testing? Mention its types.
7. When should regression testing be done?
8. How will you perform an initial "smoke" or "sanity" test?
9. How will you select the test cases for regression testing?
10. What are the best practices to be followed in regression testing?
UNIT - IV
Structure
4.0 Objectives
4.1 Introduction
4.2 Test Planning
4.3 Test management
4.4 Test Execution and Reporting
4.5 Summary
4.6 Check Your Progress
4.0 Objectives
• To learn how to prepare a test plan for the whole testing process and the steps involved
• To understand what test management is and how it is carried out
• To learn about test execution and how to report results after every test
4.1 Introduction
In this chapter, we will look at some of the project management aspects of testing.
The Project Management Institute defines a project formally as a temporary endeavor
to create a unique product or service. This means that every project or service is
different in some distinguishing way from all similar products or services. Testing is
integrated into the endeavor of creating a given product or service; each phase and
each type of testing has different characteristics and what is tested in each version
could be different. Hence, testing satisfies this definition of a project fully. Given that
testing can be considered as a project on its own, it has to be planned, executed,
tracked, and periodically reported on.
4.2 TEST PLANNING
Preparing a test plan
Testing, like any project, should be driven by a plan. The test plan covers the
following:
• What needs to be tested: the scope of testing, including clear identification of
what will be tested and what will not be tested.
• How the testing is going to be performed.
• What resources are needed for testing: computer as well as human resources.
• The time lines by which the testing activities will be performed.
• Risks that may be faced in all the above, with appropriate mitigation and
contingency plans.
Scope management:
One single plan can be prepared to cover all phases or there can be separate plans for
each phase. In situations where there are multiple test plans, there should be one test
plan, which covers the activities common for all plans. This is called the master test
plan.
Scope management pertains to specifying the scope of a project. For testing, scope
management entails
1. Understanding what constitutes a release of a product
2. Breaking down the release into features
3. Prioritizing the features for testing
4. Deciding which features will be tested and which will not be, and
5. Gathering details to prepare for estimation of resources for testing.
Knowing the features and understanding them from the usage perspective will enable
the testing team to prioritize the features for testing. The following factors drive
choice and prioritization of features to be tested.
Features that are new and critical for the release
The new features of a release set the expectations of the customers and must perform
properly. These new features result in new program code and thus have a higher
susceptibility and exposure to defects.
Features whose failures can be catastrophic
Regardless of whether a feature is new or not, any feature the failure of which can be
catastrophic has to be high on the list of features to be tested. For example, recovery
mechanisms in a database will always have to be among the most important features
to be tested.
Features that are expected to be complex to test
Early participation by the testing team can help identify features that are difficult to
test. This can help in starting the work on these features early and lining up
appropriate resources in time.
Features which are extensions of earlier features that have been defect prone
Defect prone areas need very thorough testing so that old defects do not creep in
again.
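The four factors above can be turned into a rough prioritization aid. The following Python sketch is a hypothetical scoring scheme; the weights and feature names are invented for illustration.

```python
# Hypothetical weights for the four prioritization factors discussed above.
WEIGHTS = {"new": 3, "catastrophic": 4, "complex": 2, "defect_prone": 3}

def priority_score(feature):
    """Score a feature dict whose keys flag which factors apply."""
    return sum(w for factor, w in WEIGHTS.items() if feature.get(factor))

features = [
    {"name": "db recovery", "catastrophic": True, "defect_prone": True},
    {"name": "new report UI", "new": True},
    {"name": "help pages"},
]
ranked = sorted(features, key=priority_score, reverse=True)
print([f["name"] for f in ranked])
```

A feature that is both catastrophic on failure and historically defect-prone (score 7) outranks a merely new feature (score 3), which matches the reasoning in the text.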
Deciding test approach/strategy
Once we have this prioritized feature list, the next step is to drill down into some
more details of what needs to be tested, to enable estimation of size, effort, and
schedule. This includes identifying
1. What type of testing would you use for testing the functionality?
2. What are the configurations or scenarios for testing the features?
3. What integration testing is followed to ensure these features work
together?
4. What localization validations would be needed?
5. What non-functional tests would you need to do?
The test approach should result in identifying the right type of test for each of the
features or combinations.
Setting up criteria for testing
There must be clear entry and exit criteria for different phases of testing.
Ideally, tests must be run as early as possible so that the last minute pressure of
running tests after development delays is minimized. The entry criteria for a test
specify threshold criteria for each phase or type of test. The completion/exit criteria
specify when a test cycle or a testing activity can be deemed complete.
Suspension criteria specify when a test cycle or a test activity can be
suspended.
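As an illustration, entry, exit, and suspension criteria can be expressed as simple predicates over test-cycle metrics. The metric names and thresholds below are invented for the example.

```python
# Illustrative entry/exit/suspension criteria for a system test phase.
def entry_ok(metrics):
    """Entry: the build passed smoke tests and open critical defects are capped."""
    return metrics["smoke_passed"] and metrics["open_critical"] <= 2

def exit_ok(metrics):
    """Exit: all planned tests have run and no critical defect is outstanding."""
    return (metrics["tests_run"] >= metrics["tests_planned"]
            and metrics["open_critical"] == 0)

def suspend(metrics):
    """Suspend: too many blocking defects to make further runs meaningful."""
    return metrics["open_blockers"] > 5

m = {"smoke_passed": True, "open_critical": 1, "tests_run": 40,
     "tests_planned": 100, "open_blockers": 0}
print(entry_ok(m), exit_ok(m), suspend(m))
```

Encoding the criteria as predicates makes the threshold decision explicit and checkable at the start and end of each cycle.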
Identifying responsibilities, staffing, and training needs
A testing project requires different people to play different roles. There are
roles for test engineers, test leads, and test managers. The different role definitions
should
1. Ensure there is clear accountability for a given task, so that each person
knows what he has to do;
2. Clearly list the responsibilities for various functions to various people
3. Complement each other, ensuring no one steps on one another’s toes; and
4. Supplement each other, so that no task is left unassigned.
Staffing is done based on the estimation of effort involved and the availability of
time for release. In order to ensure that the right tasks get executed, the features and
tasks are prioritized on the basis of effort, time, and importance.
It may not be possible to find a perfect fit between the requirements and the
available skills; such gaps should be addressed with appropriate training programs.
Identifying resource requirements
As a part of planning for a testing project, the project manager should provide
estimates for the various hardware and software resources required. Some of the
following factors need to be considered.
1. Machine configuration needed to run the product under test
2. Overheads required by the test automation tool, if any
3. Supporting tools such as compilers, test data generators, configuration
management tools, and so on
4. The different configurations of the supporting software that must be
present
5. Special requirements for running machine-intensive tests such as load tests
and performance tests
6. Appropriate number of licenses of all the software
Identifying test deliverables
The test plan also identifies the deliverables that should come out of the test
cycle/testing activity. The deliverables include the following,
1. The test plan itself
2. Test case design specifications
3. Test cases, including any automation that is specified in the plan
4. Test logs produced by running the tests
5. Test summary reports
Testing tasks: size and effort estimation
The scope identified above gives a broad overview of what needs to be tested.
This understanding is quantified in the estimation step. Estimation happens broadly in
three phases.
1. Size estimation
2. Effort estimation
3. Schedule estimation
Size estimation
A size estimate quantifies the actual amount of testing that needs to be done. The
factors that contribute to the size estimate of a testing project are as follows:
Size of the product under test – Lines of Code (LOC) and Function Points (FP) are
popular methods to estimate the size of an application. A somewhat simpler
representation of application size is the number of screens, reports, or transactions.
Extent of automation required
Number of platforms and inter-operability environments to be tested
Productivity data
Reuse opportunities
Robustness of processes
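As a rough illustration of how size feeds effort estimation, the sketch below assumes a productivity figure (manual test cases executed per person-day) and a speed-up factor for the automated share; both numbers are invented.

```python
def effort_person_days(test_cases, productivity, automation_fraction=0.0,
                       automation_speedup=4.0):
    """Effort = size / productivity, with the automated share running faster.

    productivity: manual test cases executable per person-day (assumed figure).
    automation_speedup: how many times faster automated runs are (assumed).
    """
    manual = test_cases * (1 - automation_fraction) / productivity
    automated = test_cases * automation_fraction / (productivity * automation_speedup)
    return manual + automated

# 400 test cases, 10 per person-day manually, half of them automated:
print(round(effort_person_days(400, 10, automation_fraction=0.5), 1))
```

The extent of automation enters the estimate directly: the same 400 test cases cost 40 person-days fully manual but 25 with half of them automated under these assumed figures.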
Activity breakdown and scheduling
Activity breakdown and schedule estimation entail translating the effort required into
specific time frames. The following steps make up this translation.
• Identifying external and internal dependencies among the activities
• Sequencing the activities, based on the expected duration as well as on
the dependencies
• Monitoring the progress in terms of time and effort
• Rebalancing schedules and resources as necessary
Communications management
Communications management consists of evolving and following procedures for
communication that ensure that everyone is kept in sync with the right level of detail.
Risk management
Like every project, testing projects also face risks. Risks are events that could
potentially affect a project’s outcome. Risk management entails
• Identifying the possible risks;
• Quantifying the risks;
• Planning how to mitigate the risks; and
• Responding to risks when they become a reality.
Fig.23 - Aspects of risk management
i) Risk identification consists of identifying the possible risks that may
hit a project. Use of checklists, Use of organizational history and
metrics and informal networking across the industry are the common
ways to identify risks in testing.
ii) Risk quantification deals with expressing the risk in numerical terms.
The probability of the risk happening and the impact of the risk are the
two components to the quantification of risk.
iii) Risk mitigation planning deals with identifying alternative strategies
to combat a risk event. To handle the effects of a risk, it is advisable to
have multiple mitigation strategies.
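The two components of risk quantification, probability and impact, are commonly multiplied into a risk exposure number that can be used to order the mitigation effort. A minimal sketch, with invented probabilities and impact ratings:

```python
# Risk exposure = probability x impact, the two components named above.
def exposure(risk):
    return risk["probability"] * risk["impact"]

risks = [
    {"name": "insufficient time for testing", "probability": 0.6, "impact": 8},
    {"name": "automation tool unavailable",   "probability": 0.2, "impact": 5},
    {"name": "show-stopper defects late",     "probability": 0.3, "impact": 9},
]
# Plan mitigation for the highest-exposure risks first.
for r in sorted(risks, key=exposure, reverse=True):
    print(f'{r["name"]}: exposure {exposure(r):.1f}')
```

A likely, high-impact risk (0.6 × 8 = 4.8) correctly outranks a severe but unlikely one (0.3 × 9 = 2.7) under this scheme.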
The following are some of the common risks encountered in testing projects:
• Unclear requirements
• Schedule dependence
• Insufficient time for testing
• Show-stopper defects
• Availability of skilled and motivated people for testing
• Inability to get a test automation tool
4.3 TEST MANAGEMENT
Choice of standards
Standards comprise an important part of planning in any organization. There are two
types of standards – external standards and internal standards.
External standards are standards that a product should comply with, are externally
visible, and are usually stipulated by external consortia. Compliance to external
standards is usually mandated by external parties.
Internal standards are standards formulated by a testing organization to bring in
consistency and predictability. They standardize the processes and methods of
working within the organization. Some of the internal standards include
Naming and storage conventions for test artifacts – Every test artifact has to be
named appropriately and meaningfully. Such naming conventions should enable easy
identification of the product functionality that a set of tests is intended for, and
reverse mapping to identify the functionality corresponding to a given set of tests.
Document standards
Most of the discussion on documentation and coding standards pertains to automated
testing. Documentation standards specify how to capture information about the tests
within the test scripts themselves. Internal documentation of test scripts is similar to
internal documentation of program code and should include the following:
• Appropriate header-level comments at the beginning of the file that outline the
functions to be served by the test.
• Sufficient in-line comments spread throughout the file, explaining the
functions served by the various parts of a test script.
• Up-to-date change history information, recording all the changes made to the
test file.
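A test script that follows these documentation standards might look like the sketch below; the test identifier, change history entries, and stand-in authenticator are all invented for illustration.

```python
"""Test: login_basic_001
Purpose : Verify that a valid user can log in (functionality test).
Item    : login module, release 2.1 (illustrative identifiers)
Change history:
    2006-03-01  initial version
    2006-04-12  updated expected prompt string after UI change
"""

def run_login_test(authenticate):
    # Set up the predefined input data for the test.
    user, password = "qa_user", "secret"
    # Execute the step under test.
    result = authenticate(user, password)
    # Compare the actual result with the expected result.
    assert result is True, "valid credentials should be accepted"
    return "PASS"

# A stand-in authenticator so the script is runnable on its own.
print(run_login_test(lambda u, p: u == "qa_user" and p == "secret"))
```

The header docstring carries the purpose and change history, while the in-line comments explain the function of each part of the script, as the standard requires.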
Test coding standards
Test coding standards go one level deeper into the tests and enforce standards on how
the tests themselves are written. The standards may
1. Enforce the right type of initialization
2. Stipulate ways of naming variables within the scripts to make sure
that a reader understands consistently the purpose of a variable.
3. Encourage reusability of test artifacts.
4. Provide standard interfaces to external entities like the operating
system, hardware, and so on.
Test reporting standards
Since testing is tightly interlinked with product quality, all the stakeholders must get a
consistent and timely view of the progress of tests. The test reporting provides
guidelines on the level of detail that should be present in the test reports, their
standard formats and contents, recipients of the report, and so on.
Test infrastructure management
Testing requires a robust infrastructure to be planned upfront. This infrastructure is
made up of three essential elements.
1. A test case database (TCDB)
2. A defect repository (DR)
3. A configuration management repository and tool
A test case database captures all the relevant information about the test cases in an
organization.
A defect repository captures all the relevant details of defects reported for a product.
Most of the metrics classified as testing defect metrics and development defect
metrics are derived from the data in the defect repository.
Yet another infrastructure that is required for a software product organization is a
Software Configuration Management (SCM) repository. An SCM repository keeps
track of change control and version control of all the files that make up a software
product. Change control ensures that
• Changes to test files are made in a controlled fashion and only with proper approvals.
• Changes made by one test engineer are not accidentally lost or overwritten by other changes.
• Each change produces a distinct version of the file that is recreatable at any point of time.
• At any point of time, everyone gets access to only the most recent version of the test files.
Version control ensures that the test scripts associated with a given release of a
product are baselined along with the product files.
TCDB, Defect Repository, and SCM repository should complement each other and
work together in an integrated fashion.
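A minimal sketch of how a TCDB and a defect repository can be cross-referenced, using in-memory dictionaries as stand-ins for the real repositories; the IDs and field names are invented.

```python
# Minimal in-memory stand-ins for a TCDB and a defect repository (DR),
# cross-referenced the way the integrated infrastructure requires.
test_case_db = {
    "TC-001": {"feature": "login", "spec": "valid user can log in"},
}
defect_repo = {}

def report_defect(defect_id, test_case_id, details):
    """File a defect and keep the test-case/defect cross-reference current."""
    defect_repo[defect_id] = {"test_case": test_case_id, "details": details,
                              "status": "open"}

report_defect("DEF-042", "TC-001", "login fails for mixed-case user names")
# The cross-reference lets us walk from a defect back to the test that found it.
print(test_case_db[defect_repo["DEF-042"]["test_case"]]["feature"])
```

In a real organization these would be backed by databases and an SCM tool, but the cross-reference idea is the same: every defect record points at the test case that uncovered it.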
Figure 24 – Relationship between SCM, DR, and TCDB
Test people management
People management is an integral part of any project management. It requires the
ability to hire, motivate and retain the right people. These skills are seldom formally
taught. Testing projects present several additional challenges. We believe that the
success of a testing organization depends on judicious people management skills.
The important point is that the common goals and the spirit of teamwork have
to be internalized by all the stakeholders. Such an internalization and upfront team
building has to be part of the planning process for the team to succeed.
Integrating with product release
Ultimately, the success of a product depends on the effectiveness of integration of the
development and testing activities. These job functions have to work in tight unison
between themselves and with other groups such as product support, product
management, and so on. The schedules of testing have to be linked directly to product
release. The following are some of the points to be decided for this planning.
• Sync points between development and testing as to when different types of
testing can commence.
• Service level agreements between development and testing as to how long it
would take for the testing team to complete the testing. This will ensure that
testing focuses on finding relevant and important defects only.
• Consistent definitions of the various priorities and severities of the defects.
• Communication mechanisms to the documentation group to ensure that the
documentation is kept in sync with the product in terms of known defects,
workarounds and so on.
The purpose of the testing team is to identify the defects in the product and the risks
that could be faced by releasing the product with the existing defects.
4.4 TEST PROCESS
Putting together and baselining a test plan
A test plan combines all the points discussed above into a single document that acts
as an anchor point for the entire testing project. An organization normally arrives at a
template that is to be used across the board. Each testing project puts together a test
plan based on the template. The test plan is reviewed by a designated set of competent
people in the organization. It then is approved by a competent authority, who is
independent of the project manager directly responsible for testing. After this, the test
plan is baselined into the configuration management repository. From then on, the
baselined test plan becomes the basis for running the testing project. In addition, any
changes needed to the test plan template are periodically discussed among the
different stakeholders, so that the template is kept current and applicable to the
testing teams.
Test case specification
Using the test plan as the basis, the testing team designs test case specifications,
which then becomes the basis for preparing individual test cases. A test case is a
series of steps executed on a product, using a pre-defined set of input data, expected
to produce a pre-defined set of outputs, in a given environment. Hence, a test case
specification should clearly identify,
• The purpose of the test: this lists what feature or part the test is intended for.
• Items being tested, along with their version/release numbers as appropriate.
• Environment that needs to be set up for running the test case.
• Input data to be used for the test case.
• Steps to be followed to execute the test
• The expected results that are considered to be correct results
• A step to compare the actual results produced with the expected results
• Any relationship between this and other tests
Update of traceability matrix
A traceability matrix is a tool to validate that every requirement is tested. This matrix
is created during the requirements gathering phase itself by filling up the unique
identifier for each requirement. When a test case specification is complete, the row
corresponding to the requirement which is being tested by the test case is updated
with the test case specification identifier. This ensures that there is a two-way
mapping between requirements and test cases.
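A traceability matrix can be kept as a simple mapping from requirement IDs to the test cases that cover them. The sketch below uses invented identifiers; an empty row immediately exposes an untested requirement.

```python
# Two-way requirement <-> test-case mapping, as described above.
# Requirement IDs and test-case IDs are invented for the example.
traceability = {"REQ-1": set(), "REQ-2": set()}   # rows created at requirements time

def register_test_case(req_id, tc_id):
    """Update the matrix row when a test case specification is complete."""
    traceability[req_id].add(tc_id)

register_test_case("REQ-1", "TC-101")
register_test_case("REQ-1", "TC-102")

# Validation: any requirement with an empty row has no test covering it.
untested = [req for req, tcs in traceability.items() if not tcs]
print(untested)
```

Because the rows are created during requirements gathering and filled during test design, the matrix validates coverage in both directions: every requirement maps to tests, and every test maps back to a requirement.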
Identifying possible candidates for automation
Before writing the test cases, a decision should be taken as to which tests are to be
automated and which should be run manually. Some of the criteria that will be used in
deciding which scripts to automate include
• Repetitive nature of the test
• Effort involved in automation
• Amount of manual intervention required for the test, and
• Cost of automation tool.
Developing and baselining test cases
Based on the test case specifications and the choice of candidates for automation, test
cases have to be developed. The test cases should also have change history
documentation, which specifies
• What was the change
• Why the change was necessitated
• Who made the change
• When was the change made
• A brief description of how the change has been implemented and
• Other files affected by the change
All the artifacts of test cases – the test scripts, input data, expected outputs, and
so on – should be stored in the test case database and SCM.
Executing test cases and keeping traceability matrix current
The prepared test cases have to be executed at the appropriate times during a
project. For example, test cases corresponding to smoke tests may be run on a daily
basis. System testing test cases will be run during system testing.
As the test cases are executed during a test cycle, the defect repository is updated with
1. Defects from the earlier test cycles that are fixed in the current build and
2. New defects that get uncovered in the current run of the tests.
During test design and execution, the traceability matrix should be kept current. When
tests get designed and executed successfully, the traceability matrix should be
updated.
Collecting and analyzing metrics
When tests are executed, information about the test execution gets collected in test
logs and other files. The basic measurements from running the tests are then
converted to meaningful metrics by the use of appropriate transformations and
formulae.
Preparing test summary report
At the completion of a test cycle, a test summary report is produced. This report gives
insights to the senior management about the fitness of the product for release.
Recommending product release criteria
One of the purposes of testing is to decide the fitness of a product for release.
Testing can never conclusively prove the absence of defects in a software product.
What it provides is an evidence of what defects exist in the product, their severity, and
impact. The job of the testing team is to articulate to the senior management and the
product release team
1. What defects the product has
2. What is the impact/severity of each of the defects
3. What would be the risks of releasing the product with the existing
defects?
The senior management can then take a meaningful business decision on whether to
release a given version or not.
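The articulation of defects, severities, and release risk can be sketched as a simple decision rule. The severity scale (1 = critical) and the thresholds below are invented for illustration; a real decision rests with senior management.

```python
# Summarize release risk from the open-defect list; thresholds are invented.
def release_recommendation(open_defects):
    """Return a recommendation string from defect severities (1 = worst)."""
    critical = sum(1 for d in open_defects if d["severity"] == 1)
    major = sum(1 for d in open_defects if d["severity"] == 2)
    if critical:
        return "do not release: critical defects open"
    if major > 3:
        return "defer: too many major defects"
    return "release candidate: residual risk documented"

defects = [{"id": "DEF-7", "severity": 2}, {"id": "DEF-9", "severity": 3}]
print(release_recommendation(defects))
```

The rule mirrors the text: testing does not prove the absence of defects, it articulates which defects remain and how severe they are, so the business can weigh the residual risk.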
4.5 TEST EXECUTION AND REPORTING
Testing requires constant communication between the test team and other
teams. Test reporting is a means of achieving this communication. There are two
types of reports or communication that are required; test incident reports and test
summary reports.
Test incident report
A test incident report is a communication that happens through the testing
cycle as and when defects are encountered. A test incident report is an entry made in
the defect repository. Each defect has a unique ID and this is used to identify the
incident. The high impact test incidents are highlighted in the test summary report.
Test cycle report
Test projects take place in units of test cycles. A test cycle entails planning
and running certain tests in cycles, each cycle using a different build of a product. A
test cycle report, at the end of each cycle, gives
1. A summary of the activities carried out during that cycle;
2. Defects that were uncovered during that cycle, based on their severity and
impact.
3. Progress from the previous cycle to the current cycle in terms of defects
fixed;
4. Outstanding defects that are yet to be fixed in this cycle; and
5. Any variations observed in effort or schedule.
Test summary report
The final step in a test cycle is to recommend the suitability of a product for release.
A report that summarizes the results of a test cycle is the test summary report.
There are two types of test summary reports:
1. Phase-wise test summary, which is produced at the end of every phase
2. Final test summary reports.
A summary report should present
• A summary of the activities carried out during the test cycle or phase
• Variance of the activities carried out from the activities planned
• Summary of results, which includes tests that failed, with any root-cause
descriptions and the severity of impact of the defects uncovered by the tests.
• Comprehensive assessment and recommendation for release, including a
fitness-for-release assessment and a release recommendation.
Recommending product release
Based on the test summary report, an organization can take a decision on whether to
release the product or not. Ideally an organization would like to release a product with
zero defects. However, market pressures may cause the product to be released with
the defects provided that the senior management is convinced that there is no major
risk of customer dissatisfaction. Such a decision should be taken by the senior
manager only after consultation with the customer support team, development team
and testing team so that the overall workload for all parts of the organization can be
evaluated.
Best Practices
Best practices in testing can be classified into three categories.
1. Process related
2. People related
3. Technology related
Process related best practices
A strong process infrastructure and process culture is required to achieve better
predictability and consistency. A process database, a federation of information about
the definition and execution of various processes, can be a valuable addition to the
tools in an organization.
People related best practices
While individual goals are required for the development and testing teams, it is very
important to understand the overall goals that define the success of the product as a
whole. Job rotation among support, development and testing can also increase the
gelling among the teams. Such job rotation can help the different teams develop better
empathy and appreciation of the challenges faced in each other’s roles and thus result
in better teamwork.
Technology related best practices
A fully integrated TCDB–SCM–DR setup can help in better automation of testing
activities. When test automation tools are used, it is useful to integrate the tool with
the TCDB, the defect repository, and the SCM tool.
A final remark on best practices: the three dimensions of best practices cannot be
carried out in isolation. A good technology infrastructure should be aptly supported
by effective process infrastructure and be executed by competent people. These best
practices are inter-dependent, self-supporting, and mutually enhancing. Thus, the
organization needs to take a holistic view of these practices and keep a fine balance
among the three dimensions.
4.6 Summary
Failing to plan is planning to fail. Testing – like any project – should be driven by a
plan. The scope management for deciding the features to be tested/ not tested,
deciding a test approach, setting up criteria for testing and identifying responsibilities,
staffing, and training needs are included in the test planning.
The test management includes the test infrastructure management and test people
management. The test infrastructure consists of a test case database, a defect
repository and a configuration management repository and tool.
The test process includes the test case specification, baselining the test plan, and
updating the traceability matrix. The test process also has to identify possible
candidates for automation.
4.7 Check Your Progress
1. How will you prepare a test plan? Explain the strategy.
2. Explain the concept of identifying responsibilities, staffing, and
training needs.
3. How will you make the size and effort estimation of the product?
4. Explain the aspects of Risk management.
5. Explain the relationship between SCM, DR and TCDB.
6. Explain the test process with an example.
7. What is called ‘test reporting’?
8. How will you make a test report? Explain with a sample report.
9. Explain the best practices to be followed in test process.
10. Differentiate between a test cycle report and test summary report.
UNIT - V
Structure
5.0 Objectives
5.1 Introduction
5.2 Software Test Automation
5.3 Test metrics and measurements
5.4 Summary
5.5 Check Your Progress
5.0 Objectives
• To know the basic concepts of software test automation and their benefits
• To understand the test metrics and measurements and the methods
5.1 Introduction
Developing software to test the software is called test automation. Test automation
can help address several problems.
• Automation saves time, as software can execute test cases faster than humans do.
• Test automation can free the test engineers from mundane tasks and let them focus on more creative tasks.
• Automated tests can be more reliable.
• Automation helps in immediate testing.
• Automation can protect an organization against attrition of test engineers.
• Test automation opens up opportunities for better utilization of global resources.
• Certain types of testing cannot be executed without automation.
• Automation means end-to-end, not test execution alone.
Automation should have scripts that produce test data to maximize coverage of
permutations and combinations of inputs and expected output for result comparison.
They are called test data generators. The automation script should be able to map the
error patterns dynamically to conclude the result. The error pattern mapping is done
not only to conclude the result of a test, but also to point out the root cause.
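A test data generator of the kind described can be as simple as enumerating the combinations of inputs together with the expected output for result comparison. The login rule below is invented for the example.

```python
import itertools

# A tiny test data generator: enumerate combinations of inputs together
# with the expected output for result comparison.
# The login rule used here is invented for the example.
def expected_login(user, password):
    return user == "admin" and password == "secret"

users = ["admin", "guest", ""]
passwords = ["secret", "wrong"]

cases = [
    {"user": u, "password": p, "expected": expected_login(u, p)}
    for u, p in itertools.product(users, passwords)
]
print(len(cases), sum(c["expected"] for c in cases))
```

Generating the expected output alongside each input is what lets the automation compare results and conclude pass or fail without manual intervention.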
5.2 SOFTWARE TEST AUTOMATION
Terms used in automation
A test case is a set of sequential steps to execute a test operating on a set of
predefined inputs to produce certain expected outputs. There are two types of test
cases namely automated and manual. Test case in this chapter refers to automated test
cases. A test case can be documented as a set of simple steps, or it could be an
assertion statement or a set of assertions. An example of an assertion is “Opening a
file which is already open should fail.” The following table describes some test cases
for the login example, showing how login can be tested for different types of testing.
S.No.  Test cases for testing                                 Belongs to what type of testing
1.     Check whether login works                              Functionality
2.     Repeat login operation in a loop for 48 hours          Reliability
3.     Perform login from 10,000 clients                      Load/stress testing
4.     Measure time taken for login operations in
       different conditions                                   Performance
5.     Run login operation from a machine running the
       Japanese language                                      Internationalization

Table – Same test case being used for different types of testing
In the above table, the “how” portion of the test case is called a scenario. What an
operation has to do is a product-specific feature; how it is to be run is a
framework-specific requirement. When a set of test cases is combined and associated
with a set of scenarios, they are called a “test suite”.
Fig. 25 Framework for test automation
Skills Needed for Automation
The automation of testing is broadly classified into three generations.
First generation – record and playback
Record and playback avoids the repetitive nature of executing tests. Almost all
the test tools available in the market have the record and playback feature. A test
engineer records the sequence of actions by keyboard characters or mouse clicks and
those recorded scripts are played back later, in the same order as they were recorded.
When there is frequent change, the record and playback generation of test automation
tools may not be very effective.
Second generation – data – driven
This method helps in developing test scripts that generate the set of input conditions
and the corresponding expected output. This enables the tests to be repeated for
different input and output conditions. This generation of automation focuses on input
and output conditions using the black box testing approach.
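A data-driven test keeps one script and feeds it rows of inputs and expected outputs. The function under test and the data rows below are invented for illustration.

```python
# Data-driven testing: one script, many input/expected-output rows.
# The function under test and the rows are invented for the example.
def discount(amount):
    return amount * 0.9 if amount >= 100 else amount

rows = [
    (50, 50),       # below threshold: no discount
    (100, 90.0),    # at threshold: 10% off
    (200, 180.0),
]

failures = [(inp, exp, discount(inp)) for inp, exp in rows if discount(inp) != exp]
print("PASS" if not failures else failures)
```

Adding a new input/output condition is just adding a row; the script itself never changes, which is the point of this generation of automation.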
Third generation – action driven
This technique enables a layman to create automated tests; no input and expected
output conditions are required for running the tests. All actions that appear on the
application are automatically tested, based on a generic set of controls defined for
automation. The input and output conditions are automatically generated and used,
and the scenarios for test execution can be dynamically changed using the test
framework available in this approach. Hence, automation in the third generation
involves two major aspects: “test case automation” and “framework design”.
What to Automate, Scope of Automation
The specific requirements can vary from product to product, from situation to
situation, from time to time. The following gives some generic tips for identifying the
scope of automation.
Identifying the types of testing amenable to automation
Stress, reliability, scalability, and performance testing
These types of testing require the test cases to be run from a large number of
different machines for an extended period of time, such as 24 hours, 48 hours, and so
on. Test cases belonging to these testing types become the first candidates for
automation.
Regression tests
Regression tests are repetitive in nature. Given the repetitive nature of the test cases,
automation will save significant time and effort in the long run.
Functional tests
These kinds of tests may require a complex set-up and thus require specialized skills,
which may not be available on an ongoing basis. Automating these tests once, using
the available expert skills, can enable less-skilled people to run them on an ongoing
basis.
Automating areas less prone to change
User interfaces normally go through significant changes during a project. To avoid
rework on automated test cases, proper analysis has to be done to find out the areas of
changes to user interfaces, and automate only those areas that will go through
relatively less change. The non-user interface portions of the product can be
automated first. This enables the non-GUI portions of the automation to be reused
even when GUI goes through changes.
Automate tests that pertain to standards
One of the tests that products may have to undergo is compliance to standards. For
example, a product providing a JDBC interface should satisfy the standard JDBC
tests. Automating for standards provides a dual advantage. Test suites developed for
standards are not only used for product testing but can also be sold as test tools for the
market. Testing for standards has certain legal requirements. To certify the software, a
test suite is developed and handed over to different companies. This is called
“certification testing” and requires perfectly compliant results every time the tests are
executed.
Management aspects in automation
Prior to starting automation, adequate effort has to be spent to obtain management
commitment. The automated test cases need to be maintained till the product reaches
obsolescence. Since automation involves effort over an extended period of time,
management permissions are only given in phases and part by part. It is important to
automate the critical and basic functionalities of a product first. To achieve this, all
test cases need to be prioritized as high, medium, and low, based on customer
expectations. Automation should start with the high-priority requirements and then
move on to the medium- and low-priority ones.
Design and Architecture for Automation
Design and architecture is an important aspect of automation. As in product
development, the design has to represent all requirements in modules and in the
interactions between modules.
In integration testing both internal interfaces and external interfaces have to be
captured by design and architecture. Architecture for test automation involves two
major heads: a test infrastructure that covers a test case database and a defect database
or defect repository. Using this infrastructure, the test framework provides a backbone
that ties the selection and execution of test cases.
External modules
There are two modules that are external modules to automation – TCDB and defect
DB. Manual test cases do not need any interaction between the framework and
TCDB. Test engineers submit the defects for manual test cases. For automated test
cases, the framework can automatically submit the defects to the defect DB during
execution. These external modules can be accessed by any module in automation
framework.
Scenario and configuration file modules
Scenarios are information on “how to execute a particular test case”. A configuration
file contains a set of variables that are used in automation. A configuration file is
important for running the test cases for various execution conditions and for running
the tests for various input and output conditions and states. The values of variables in
this configuration file can be changed dynamically to achieve different execution
input, output and state conditions.
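As a minimal sketch of this idea (the variable names and values below are illustrative assumptions), a configuration can be a set of defaults that each run overrides:

```python
# Default configuration variables for a test run; overriding them lets the
# same test cases run under different execution, input/output, and state
# conditions. All names and values here are hypothetical.
DEFAULTS = {"platform": "linux", "locale": "en_US", "db_size": "small"}

def load_config(overrides=None):
    """Merge run-specific overrides over the default configuration."""
    config = dict(DEFAULTS)
    config.update(overrides or {})
    return config

run1 = load_config()                                             # default condition
run2 = load_config({"platform": "windows", "db_size": "large"})  # changed condition
print(run1["platform"], run2["platform"])  # linux windows
```

The same test case can then be driven through different conditions just by supplying a different override set.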
Test cases and test framework modules
A test case is an object of execution for the other modules in the architecture and
does not represent any interaction by itself. A test framework is a module that
combines “what to execute” and “how it has to be executed.” The test framework is
considered the core of automation design. It can be developed by the organization
internally or bought from a vendor.
Tools and results modules
When a test framework performs its operations, there are a set of tools that may be
required. For example, when test cases are stored as source code files in TCDB, they
need to be extracted and compiled by build tools. In order to run the compiled code,
certain runtime tools and utilities may be required.
The results that come out of the tests must be stored for future analysis. The history
of all previous test runs should be recorded and kept as archives. These results help
the test engineer compare the current test run with previous runs. An audit of all tests
that are run and the related information is stored in this module of the automation
framework. This can also help in selecting test cases for regression runs.
Report generator and reports /metrics modules
Once the results of a test run are available, the next step is to prepare the test reports
and metrics. Preparing reports is complex work and hence it should be part of the
automation design. Reports are needed at different periodicities, such as daily,
weekly, monthly, and milestone reports. Having reports at different levels of detail
can address the needs of multiple constituents and thus provide significant returns.
The module that takes the necessary inputs and prepares a formatted report is called a
report generator. Once the results are available, the report generator can generate
metrics. All the reports and metrics that are generated are stored in the reports/metrics
module of automation for future use and analysis.
Generic Requirements for Test Tool/Framework
In the previous section, we described a generic framework for test automation. This
section presents detailed criteria that such a framework and its usage should satisfy.
• No hard coding in the test suite.
• Test case/suite expandability.
• Reuse of code for different types of testing and test cases.
• Automatic setup and cleanup.
• Independent test cases.
• Test case dependency.
• Insulating test cases during execution.
• Coding standards and directory structure.
• Selective execution of test cases.
• Random execution of test cases.
• Parallel execution of test cases.
• Looping of test cases.
• Grouping of test scenarios.
• Test case execution based on previous results.
• Remote execution of test cases.
• Automatic archival of test data.
• Reporting scheme.
• Independence from languages.
• Portability to different platforms.
Process Model for Automation
The work on automation can go simultaneously with product development and can
overlap with multiple releases of the product. One specific requirement for
automation is that the delivery of the automated tests should be done before the test
execution phase so that the deliverables from automation effort can be utilized for the
current release of the product.
Test automation life cycle activities bear a strong similarity to product development
activities. Just as product requirements need to be gathered on the product side,
automation requirements too need to be gathered. Similarly, just as product planning,
design and coding are done, so also during test automation are automation planning,
design and coding.
When testing activities are introduced for both the product and the automation, there
are two parallel sets of development and testing activities; put together, they form a
“W” model. Hence, for product development involving automation, it is a good
choice to follow the W model to ensure that the quality of the product as well as of
the test suite meets the expected quality norms.
Selecting a test tool
Having identified the requirements of what to automate, a related question is the
choice of an appropriate tool for automation. Selecting the test tool is an important
aspect of test automation for several reasons given below:
1. Free tools are not well supported and get phased out soon.
2. Developing in-house tools takes time.
3. Test tools sold by vendors are expensive.
4. Test tools require considerable training.
5. Test tools generally do not meet all the requirements for
automation.
6. Not all test tools run on all platforms.
For all the above reasons, adequate focus needs to be given to selecting the right tool
for automation.
Criteria for selecting test tools
In the previous section, we looked at some reasons for evaluating test tools
and how requirements gathering will help. These criteria change according to context
and differ across companies and products. We will now look into the broad
categories for classifying the criteria. The categories are
1. Meeting requirements
2. Technology expectations
3. Training/skills and
4. Management aspects.
Meeting requirements
Firstly, there are plenty of tools available in the market, but they do not meet all the
requirements of a given product. Evaluating different tools for different requirements
involves significant effort, money and time.
Secondly, test tools are usually one generation behind and may not provide backward
or forward compatibility. Thirdly, test tools may not go through the same amount of
evaluation for new requirements.
Finally, a number of test tools cannot differentiate between a product failure and a test
failure. So the test tool must have some intelligence to proactively find out the
changes that happened in the product and accordingly analyze the results.
Technology expectations
• Extensibility and customization are important expectations of a test tool.
• A good number of test tools require their libraries to be linked with product
binaries.
• Test tools are not 100% cross-platform. When the product’s impact on the
network is analyzed, the first suspect is the test tool, and it is uninstalled when
such analysis starts.
Training/skills
While test tools require plenty of training, very few vendors provide training to the
required level. Test tools expect users to learn new languages/scripts and may not use
standard languages/scripts. This increases the skill requirements for automation and
lengthens the learning curve inside the organization.
Management aspects
• Test tools require system upgrades.
• Migration to other test tools is difficult.
• Deploying a test tool requires considerable planning and effort.
Steps for tool selection and deployment
1. Identify your test suite requirements among the generic requirements
discussed. Add other requirements, if any.
2. Make sure the experiences discussed in the previous sections are taken
care of.
3. Collect the experiences of other organizations which used similar test
tools.
4. Keep a checklist of questions to be asked of the vendors on
cost/effort/support.
5. Identify a list of tools that meet the above requirements.
6. Evaluate and shortlist one tool or a set of tools, and train all test
developers on the tool.
7. Deploy the tool across the teams after training all potential users of
the tool.
Challenges in Automation
The most important challenge in automation is management commitment.
Automation takes time and effort and pays off in the long run. Management should
have patience and persist with automation. Successful test automation endeavors are
characterized by unflinching management commitment and a clear vision of goals,
with progress tracked against that long-term vision.
5.3 TEST METRICS AND MEASUREMENTS
What are metrics and measurements
Metrics derive information from raw data with a view to helping in decision making.
Some of the areas that such information would shed light on are
• The relationship between the data points.
• Any cause-and-effect correlation between the observed data points.
• Any pointers to how the data can be used for future planning and continuous
improvements.
Metrics are thus derived from measurements using appropriate formulae or
calculations. Obviously, the same set of measurements can help produce different
sets of metrics of interest to different people.
From the above discussion it is obvious that in order for a project’s performance to be
tracked and its progress monitored effectively,
• The right parameters must be measured; the parameters may pertain to the
product or to the process.
• The right analysis must be done on the data measured, to draw the right
conclusions within a project or organization.
• The results of the analysis must be presented in an appropriate form to the
stakeholders to enable them to make the right decisions on improving product
or process quality.
Effort is the actual time that is spent on a particular activity or a phase. Elapsed days
are the difference between the start of an activity and the completion of the activity.
Collecting and analyzing metrics involves effort and several steps, as depicted in
Figure 26.
Figure. 26 Steps in a metrics program
The first step involved in a metrics program is to decide what measurements are
important and to collect data accordingly. The effort spent on testing, the number of
defects, and the number of test cases are some examples of measurements. Depending
on what the data is used for, the granularity of measurement will vary.
While deciding what to measure, the following aspects need to be kept in mind.
1. What is measured should be of relevance to what we are trying to achieve.
For testing functions, we would obviously be interested in the effort spent on
testing, the number of test cases, the number of defects reported from test
cases, and so on.
2. The entities measured should be natural and should not involve too many
overheads for measurement. If there are too many overheads in making the
measurements, or if the measurements do not follow naturally from the actual
work being done, then the people who supply the data may resist giving the
measurement data.
(Figure 26 shows the steps: identify what to measure; transform measurements to
metrics; decide operational requirements; perform metrics analysis; take actions and
follow up; refine measurements and metrics.)
3. What is measured should be at the right level of granularity to satisfy the
objective for which the measurement is being made.
An approach to getting this granular detail is called data drilling.
It is important to provide as much granularity in measurement as possible. A set of
measurements can be combined to generate metrics. An example question involving
multiple measurements is “How many test cases produced the 40 defects in data
migration involving different schemas?” There are two measurements involved in
this question: the number of test cases and the number of defects. Hence, the second
step involved in metrics collection is defining how to combine data points or
measurements to provide meaningful metrics.
A particular metric can use one or more measurements. The operational requirement
for a metrics plan should lay down not only the periodicity but also other operational
issues such as who should collect measurements, who should receive the analysis, and
so on. The final step involved in a metrics plan is to take necessary action and follow
up on the action.
Why Metrics in Testing
Metrics are needed to know test case execution productivity and to estimate test
completion date.
Days needed to complete testing = (Total test cases yet to be executed) /
(Test case execution productivity)
The defect fixing trend collected over a period of time gives another estimate of the
defect-fixing capability of the team.
Total days needed for defect fixes = (Outstanding defects yet to be fixed +
Defects that can be found in future test cycles) / (Defect-fixing capability)
Hence, metrics helps in estimating the total days needed for fixing defects.
Once the time needed for testing and the time for defect fixing are known, the release
date can be estimated.
Days needed for release = Max (Days needed for testing, Days needed for
defect fixes)
The defect fixes may arrive after the regular test cycles are completed. These defect
fixes have to be verified by regression testing before the product can be released.
Hence the formula for days needed for release is to be modified as follows:
Days needed for release = Max [Days needed for testing, (Days needed for defect
fixes + Days needed for regressing outstanding defect fixes)]
The idea of discussing the formula here is to explain that metrics are important and
help in arriving at the release date for the product. Metrics are not used only for
reactive activities. Metrics and their analysis help in preventing defects
proactively, thereby saving cost and effort. Metrics are used in resource management
to identify the right size of product development teams.
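The release-date formulas above are straightforward arithmetic; the following sketch works them through, with all input figures made up purely for illustration:

```python
# Estimate release-date inputs from the formulas above. All numbers are
# hypothetical illustrations.
def days_needed_for_testing(tests_remaining, tests_per_day):
    # Total test cases yet to be executed / test case execution productivity
    return tests_remaining / tests_per_day

def days_needed_for_defect_fixes(outstanding, expected_future, fixes_per_day):
    # (Outstanding defects + defects expected in future cycles) / fixing capability
    return (outstanding + expected_future) / fixes_per_day

def days_needed_for_release(testing_days, fix_days, regression_days):
    # Release waits for the longer of testing and (fixing + regressing the fixes)
    return max(testing_days, fix_days + regression_days)

testing = days_needed_for_testing(300, 25)        # 12.0 days
fixing = days_needed_for_defect_fixes(40, 20, 6)  # 10.0 days
release = days_needed_for_release(testing, fixing, regression_days=5)
print(release)  # 15.0 -> fixing plus regression dominates testing here
```

Note how the regression term can make defect fixing, not test execution, the release bottleneck.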
To summarize, metrics in testing help in identifying
• When to make the release.
• What to release.
• Whether the product is being released with known quality.
5.4 Types of Metrics
Metrics can be classified into different types based on what they measure and what
area they focus on. Metrics can be classified as product metrics and process metrics.
Product metrics can be further classified as
Project metrics – a set of metrics that indicates how the project is planned and
executed.
Progress metrics – a set of metrics that tracks how the activities of the project are
progressing.
Productivity metrics – a set of metrics that takes into account various productivity
numbers that can be collected and used for planning and tracking testing activities.
Project Metrics
A typical project starts with requirements gathering and ends with product release. All
the phases that fall in between these points need to be planned and tracked. The
project scope gets translated to size estimates, which specify the quantum of work to
be done. This size estimate gets translated to effort estimate for each of the phases and
activities by using the available productivity data available. This initial effort is called
baselined effort.
As the project progresses and if the scope of the project changes then the effort
estimates are re-evaluated again and this re-evaluated effort estimate is called revised
effort.
Effort variance (planned vs actual)
If there is substantial difference between the baselined and revised effort, it points to
incorrect initial estimation. Calculating effort variance for each of the phases provides
a quantitative measure of the relative difference between the revised and actual
efforts.
Phase:       Req    Design   Coding   Testing   Doc    Defect fixing
Variance %:  7.1    8.7      5        0         40     15
Table. Sample variance percentage by phase.
Variance % = [( Actual effort – Revised estimate) / Revised estimate] * 100
A variance of more than 5% in any of the SDLC phases indicates scope for
improvement in the estimation. The variance can also be negative; a negative
variance is an indication of an overestimate. These variance numbers, along with
analysis, can help in better estimation for the next release or the next revised
estimation cycle.
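The variance formula above can be applied phase by phase; a minimal sketch, with hypothetical phase names and effort figures:

```python
# Effort variance % per SDLC phase, per the formula above.
# The phase names and (actual, revised) effort figures are hypothetical.
def variance_pct(actual_effort, revised_estimate):
    return (actual_effort - revised_estimate) / revised_estimate * 100

phases = {"Requirements": (45, 42), "Design": (87, 80), "Doc": (140, 100)}
for phase, (actual, revised) in phases.items():
    pct = variance_pct(actual, revised)
    flag = "over-estimate" if pct < 0 else ""
    print(phase, round(pct, 1), flag)
```

A negative result flags an overestimate, matching the interpretation in the paragraph above.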
Fig. 27 Types of metrics
(Figure 27 shows the classification:
Metrics → Product metrics and Process metrics.
Product metrics → Project metrics, Progress metrics, and Productivity metrics.
Project metrics: effort variance; schedule variance; effort distribution.
Progress metrics → Testing defect metrics and Development defect metrics.
Testing defect metrics: defect find rate; defect fix rate; outstanding defects rate;
priority outstanding rate; defects trend; defect classification trend; weighted defects
trend; defect cause distribution; test phase effectiveness; closed defects distribution.
Development defect metrics: component-wise defect distribution; defect density and
defect removal rate; age analysis of outstanding defects; introduced and reopened
defects rate.
Productivity metrics: defects per 100 hours of testing; test cases executed per 100
hours of testing; test cases developed per 100 hours; defects per 100 test cases;
defects per 100 failed test cases.)
Schedule variance (planned vs actual)
Schedule variance, like effort variance, is the deviation of the actual schedule from
the estimated schedule. There is one difference though. Depending on the SDLC
model used by the project, several phases could be active at the same time. Further,
the different phases in the SDLC are interrelated and could share the same set of
individuals. Because of all these complexities, schedule variance is calculated only at
the overall project level and at specific milestones, not with respect to each of the
SDLC phases.
Effort and schedule variance have to be analyzed in totality, not in isolation. This is
because while effort is a major driver of the cost, schedule determines how best a
product can exploit market opportunities. Variance can be classified into negative
variance, zero variance, acceptable variance and unacceptable variance. Generally
0-5% is considered as acceptable variance.
Effort distribution across phases
Variance calculation helps in finding out whether commitments are met on time and
whether the estimation method works well. The distribution percentage across the
different phases can be estimated at the time of planning and these can be compared
with the actual at the time of release for getting a comfort feeling on the release and
estimation methods.
Mature organizations spend at least 10-15% of the total effort in requirements and
approximately the same effort in the design phase. The effort percentage for testing
depends on the type of release and amount of change to the existing code base and
functionality. Typically, organizations spend about 20 – 50% of their total effort in
testing.
Progress Metrics
The number of defects that are found in the product is one of the main indicators of
quality. Hence, we will look at progress metrics that reflect the defects of a product.
Defects get detected by the testing team and get fixed by the development team.
Based on this, defect metrics are further classified into test defect metrics and
development defect metrics.
14
14
How many defects have already been found and how many more may get unearthed
are two parameters that determine product quality and its assessment; to assess them,
the progress of testing has to be understood. If only 50% of testing is complete and
100 defects have been found, then, assuming that the defects are uniformly
distributed over the product, another 80–100 defects can be estimated as residual
defects.
1. Test defect metrics
The next set of metrics helps us understand how the defects that are found can be
used to improve testing and product quality. Not all defects are equal in impact or
importance. Some organizations classify defects by assigning a defect priority. The
priority of a defect provides a management perspective for the order of defect fixes.
Some organizations use defect severity levels, which provide the test team a
perspective of the impact of that defect on product functionality. Since different
organizations use different methods of defining priorities and severities, a common
set of defect definitions and classifications is provided in the table given below.
Defect find rate
When the total number of defects found in the product is tracked and plotted at
regular intervals, from the beginning to the end of a product development cycle, it
may show a pattern of defect arrival. For a product to be fit for release, not only
should such a pattern of defect arrival be seen, but the number of defects arriving in
the final stretch should also be kept to a bare minimum. A bell curve, along with a
minimum number of defects found in the last few days, indicates that the release
quality of the product is likely to be good.
Defect classification   What it means
Extreme                 • Product crashes or is unusable
                        • Needs to be fixed immediately
Critical                • Basic functionality of the product not working
                        • Needs to be fixed before the next test cycle starts
Important               • Extended functionality of the product not working
                        • Does not affect the progress of testing
Minor                   • Product behaves differently
                        • No impact on the test team or customers
                        • Fix it when time permits
Cosmetic                • Minor irritant
                        • Need not be fixed for this release
Table. A common defect definition and classification
Defect fix rate
If the goal of testing is to find defects as early as possible, it is natural to expect that
the goal of development should be to fix defects as soon as they arrive. If the
defect-fixing curve is in line with defect arrival, a “bell curve” will again be the
result. There is a reason why the defect-fixing rate should be the same as the defect
arrival rate. As discussed under regression testing, when defects are fixed in the
product, the door is opened for the introduction of new defects. Hence, it is a good
idea to fix defects early and test those defect fixes thoroughly to find out all
introduced defects. If this principle is not followed, defects introduced by the defect
fixes may come up for testing just before the release and end up surfacing new
defects.
Outstanding defects rate
The number of defects outstanding in the product is calculated by subtracting the
total defects fixed from the total defects found in the product. If the defect-fixing
pattern is constant, like a straight line, the outstanding defects will again form a bell
curve. If the defect-fixing pattern matches the arrival rate, then the outstanding
defects curve will look like a straight line. However, it is not possible to fix all
defects when the arrival rate is at the top of the bell curve. Hence, the outstanding
defects rate results in a bell curve in many projects. When testing is in progress, the
number of outstanding defects should be kept very close to zero so that the
development team’s bandwidth is available to analyze and fix issues soon after they
arrive.
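The outstanding-defects computation is a running subtraction of fixed from found; a minimal sketch, with hypothetical weekly cumulative counts:

```python
# Outstanding defects = total defects found so far - total defects fixed so far.
# The weekly cumulative counts below are made up for illustration.
found_cum = [10, 35, 70, 95, 100]   # cumulative defects found, week by week
fixed_cum = [5, 25, 55, 90, 100]    # cumulative defects fixed, week by week

outstanding = [found - fixed for found, fixed in zip(found_cum, fixed_cum)]
print(outstanding)  # [5, 10, 15, 5, 0]
```

The series rises mid-cycle, when arrivals outpace fixes, and falls back toward zero as fixes catch up, which is the bell shape described above.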
Priority outstanding rate
Sometimes the defects that are coming out of testing may be very critical and may
take enormous effort to fix and to test. Hence, it is important to look at how many
serious issues are being uncovered in the product. The modification to the outstanding
defects rate curve by plotting only the high priority defects is called priority
outstanding defects. The priority outstanding defects correspond to extreme and
critical classification of defects. Some organizations include important defects also in
priority outstanding defects.
The effectiveness of analysis increases when the perspectives of find rate, fix rate,
outstanding defects, and priority outstanding defects are combined. Related trend
metrics include the defects trend, the defect classification trend, and the weighted
defects trend.
Development defect metrics
In this section we will look at how metrics can be used to improve development
activities. The defect metrics that directly help in improving development activities
are discussed in this section and are termed development defect metrics. While
defect metrics focus on the number of defects, development defect metrics try to map
those defects to different components of the product and to some of the parameters of
development such as lines of code.
Component-wise defect distribution
While it is important to count the number of defects in the product, for development
it is important to map them to different components of the product so that they can be
assigned to the appropriate developer to fix those defects. The project manager in
charge of development maintains a module ownership list where all product modules
and owners are listed. Based on the number of defects existing in each of the modules,
the project manager assigns resources accordingly.
Defect density and defect removal rate
A good quality product can have a long lifetime before becoming obsolete. The
lifetime of the product depends on its quality, over the different releases it goes
through. One of the metrics that correlates source code and defects is defect density.
This metric maps the defects in the product with the volume of code that is produced
for the product.
There are several standard formulae for calculating defect density. Of these, defects
per KLOC is the most practical and easy metric to calculate and plot. KLOC stands
for kilo lines of code; every 1000 lines of executable statements in the product are
counted as one KLOC.
The metric compares the defects per KLOC of the current release with previous
releases of the product. There are several variants of this metric to make it relevant to
releases, and one of them is calculated over AMD (added, modified, and deleted)
code to find out how a particular release affects product quality.
Defects per KLOC = (Total defects found in the product) / (Total executable
AMD lines of code, in KLOC)
Defects per KLOC can be used as a release criterion as well as a product quality
indicator with respect to code and defects. Defects found by the testing team have to
be fixed by the development team. The ultimate quality of the product depends on
both development and testing activities, and there is a need for a metric to analyze the
development and testing phases together and map them to the release. The defect
removal rate is used for this purpose.
The formula for calculating the defect removal rate is
Defect removal rate = (Defects found by verification activities + Defects found
in unit testing) / (Defects found by test teams) * 100
The above formula helps in finding the efficiency of verification activities and unit
testing, which are normally the responsibilities of the development team, and in
comparing them with the defects found by the testing teams. These metrics are
tracked over various releases to study release-on-release trends in the
verification/quality assurance activities.
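Both formulas above reduce to simple ratios; a sketch with invented counts (none of the figures come from the text):

```python
# Defect density (defects per KLOC of AMD code) and defect removal rate,
# per the two formulas above. All counts are hypothetical.
def defects_per_kloc(total_defects, amd_loc):
    # amd_loc: executable added/modified/deleted lines of code
    return total_defects / (amd_loc / 1000)

def defect_removal_rate(verification_defects, unit_test_defects,
                        test_team_defects):
    return (verification_defects + unit_test_defects) / test_team_defects * 100

print(defects_per_kloc(120, 40_000))    # 3.0 defects per KLOC
print(defect_removal_rate(30, 50, 40))  # 200.0
```

A removal rate above 100 means the development team's own verification and unit testing caught more defects than the test teams did, which is the desired direction for the release-on-release trend.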
Age analysis of outstanding defects
Age here refers to how long a defect has been waiting to be fixed. Some defects that
are difficult to fix, or that require significant effort, may get postponed for a longer
duration. Hence, the age of a defect in a way represents the complexity of the defect
fix needed. Given the complexity and time involved in fixing those defects, they need
to be tracked closely; otherwise they may get postponed until close to the release,
which may even delay the release. A method to track such defects is called age
analysis of outstanding defects.
Productivity Metrics
Productivity metrics combine several measurements and parameters with the effort
spent on the product. They help in finding out the capability of the team and serve
other purposes, such as
a. Estimating for the new release.
b. Finding out how well the team is progressing, and understanding the
reasons for variation in results.
c. Estimating the number of defects that can be found.
d. Estimating the release date and quality.
e. Estimating the cost involved in the release.
Defects per 100 hours of testing
Program testing can only prove the presence of defects, never their absence. Hence, it
is reasonable to conclude that there is no end to testing, and more testing may reveal
more new defects. But there may be a point of diminishing returns when further
testing reveals no new defects. If the number of incoming defects in the product is
reducing, it may mean various things.
1. Testing is not effective.
2. The quality of the product is improving.
3. The effort spent in testing is falling.
Defects per 100 hours of testing = (Total defects found in the product for a
period / Total hours spent to get those defects) * 100
Test cases executed per 100 hours of testing
The number of test cases executed by the test team for a particular duration depends
on team productivity and quality of product. The team productivity has to be
calculated accurately so that it can be tracked for the current release and be used to
estimate the next release of the product.
Test cases executed per 100 hours of testing = (Total test cases executed for a
period / Total hours spent in test execution) * 100
Test cases developed per 100 hours of testing
Both manual and automated test cases require estimating and tracking of
productivity numbers. In a product scenario, not all test cases are written afresh for
every release: new test cases are added to address new functionality and to test
features that were not tested earlier; existing test cases are modified to reflect changes
in the product; and some test cases are deleted if they are no longer useful or if the
corresponding features are removed from the product. Hence, the formula for test
cases developed uses the count corresponding to added, modified, and deleted test
cases.
Test cases developed per 100 hours of testing = (Total test cases developed for a
period / Total hours spent in test case development) * 100
Defects per 100 Test Cases
Since the goal of testing is to find as many defects as possible, it is appropriate to
measure the defect yield of tests, that is, how many defects get uncovered during
testing. This is a function of two parameters: the effectiveness of the test cases in
uncovering defects, and the quality of the product. The ability of a test case to
uncover defects depends on how well the test cases are designed and developed. But
in a typical product scenario, not all test cases are executed in every test cycle; hence
it is better to select the test cases that are likely to produce defects. If product quality
is poor, it produces more defects per 100 test cases compared to a good-quality
product. A measure that quantifies these two parameters is defects per 100 test cases.
The formula used for calculating this metric is
Defects per 100 test cases = (Total defects found for a period / Total test cases
executed for the same period) * 100
Defects per 100 Failed Test Cases
Defects per 100 failed test cases is a good measure of how granular the test cases
are. It indicates
• How many test cases need to be executed when a defect is fixed.
• Which defects need to be fixed so that an acceptable number of test cases
reach the pass state.
• How the fail rate of test cases and defects affect each other for release
readiness analysis.
Defects per 100 failed test cases = (Total defects found for a period / Total test
cases failed due to those defects) * 100
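The productivity metrics in this section all share one shape, (count / base) * 100; the following sketch computes them with invented figures (every number here is a hypothetical illustration):

```python
# One helper covers all the "per 100" productivity metrics above.
# Every input figure here is a hypothetical illustration.
def per_100(count, base):
    return count / base * 100

defects_per_100_hours = per_100(30, 600)           # 30 defects in 600 test hours
tests_executed_per_100_hours = per_100(450, 600)
tests_developed_per_100_hours = per_100(120, 400)  # added+modified+deleted cases
defects_per_100_test_cases = per_100(30, 450)
defects_per_100_failed_tests = per_100(30, 60)

print(defects_per_100_hours, defects_per_100_failed_tests)  # 5.0 50.0
```

Factoring out the common ratio makes it easy to track all five metrics from the same measurement records.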
Test phase effectiveness
As per the principles of testing, testing is not the job of testers alone. Developers
perform unit testing and there could be multiple testing teams performing component,
integration and system testing phases. The idea of testing is to find defects early in the
cycle and in the early phases of testing. As testing is performed by various teams with
the objective of finding defects early at various phases, a metric is needed to compare
the defects found in each of the phases of testing. The defects found in various phases
such as unit testing (UT), component testing (CT), integration testing (IT), and
system testing (ST) are plotted and analyzed.
Fig. 28 Test phase effectiveness
In the chart given (Figure 28), the total defects found by each test phase are plotted.
The following observations can be made.
1. A good proportion of defects were found in the early phases of testing (UT
and CT).
2. Product quality improved from phase to phase.
Closed defect distribution
The objective of testing is not only to find defects. The testing team also has
the objective of ensuring that all defects found through testing are fixed, so that the
customer gets the benefit of testing and the product quality improves. The testing
team has to track the defects and analyze how they are closed.
Release Metrics
The decision to release a product would need to consider several perspectives and
metrics. All the metrics that were discussed in the previous section need to be
considered in totality for making the release decision. The following table gives some
of the perspectives and some sample guidelines needed for release analysis.
(Figure 28 distribution of defects by phase: UT 39%, CT 32%, IT 17%, ST 12%.)
Metric               Perspectives to be considered     Guidelines
Test cases executed  Execution %, Pass %               100% of test cases to be executed;
                                                       test cases passed should be a
                                                       minimum of 98%
Effort distribution  Adequate effort has been spent    15-20% effort spent each on
                     on all phases                     requirements, design, and testing
                                                       phases
Defect find rate     Defect trend                      Defect arrival trend showing a bell
                                                       curve; incoming defects close to
                                                       zero in the last week
Defect fix rate      Defect fix trend                  Defect-fixing trend matching the
                                                       arrival trend
Table. Guidelines for release analysis
5.5 Summary
Automation uses software to test software, enabling human effort to be spent on
creative testing. Automation bridges the gap in skill requirements between testing
and development; at times it demands more skills of test teams. What to automate
takes into account the technical and management aspects, as well as the long-term
vision. The product and its automation are like the two rails of a railway track; they
run parallel in the same direction with similar expectations.
Test metrics are needed to know test case execution productivity and to estimate the
test completion date. To summarize, metrics in testing help in identifying when to
make the release, what to release, and whether the product is being released with
known quality.
5.6 Check your Progress
1. What is test automation? Why is it important?
2. Explain the scope of automation.
3. How will you make the design and architecture for automation?
4. Explain the generic requirements for test tool/framework.
5. How will you select a test tool?
6. What are the challenges involved in test automation?
7. What are test metrics and measurements?
8. Why are metrics needed in testing?
9. Explain the Project metrics.
10. Explain the Progress and Productivity metrics.