Non-Functional Performance Requirements v2.2
DESCRIPTION
How to write and structure non-functional requirements, focusing on performance requirements. This is a quick guide to getting going: how to avoid writing untestable requirements and make sure what you want is delivered.
TRANSCRIPT
1
Ian McDonald
Non-Functional Performance Requirements
© 2010, 2013, 2014 Ian McDonald
July 2010 v1
January 2014 v2.2
2
Purpose
These slides are aimed at those writing non-functional, performance-related requirements.
They aim to:
- Show how to create testable, verifiable requirements.
- Demonstrate how to create data that is often missing, the absence of which can lead to improper validation.
Overview
These slides are to aid those writing non-functional performance and volumetric requirements. They specifically address the issues of:
- Writing requirements that are testable.
- Avoiding self-defeat through creating requirements that can never pass.
- Avoiding disappointment through creating requirements that are too easy to test and pass, but then fail quickly in an operational setting. This happens when the requirement is poorly structured.
- Avoiding being ignored, through developers throwing away your work and re-writing the requirements to deliver what they want to code, not what you want delivered.
- Equipping your champion with the information required to ensure what you want is actually delivered. Your champion is the test manager – make the most of them!
The reader is taken through the process by example, showing at first how not to write requirements and then how to structure them correctly.
By the end of the presentation, the reader should be able to take an existing requirement and re-shape it to produce a testable requirement that makes a meaningful contribution to the final product.
3
Verifying Requirements
When verifying a set of requirements, test teams need to know the following:
- Given that time is limited, where should the greatest focus be placed to reduce risk? This is obtained through:
  - Knowing the importance for delivery order – the Priority Score.
  - Knowing the impact on the business if the delivered functionality were to fail – the Failure Impact Score.
  - Information from the development team that identifies the technical likelihood of failure.
- What is to be tested – presented as single, atomic, logical points, one point per requirement. Larger parent requirements can be broken down into daughter sub-requirements.
- Under what conditions verifications are to be made – this is very important for non-functional requirements, as we shall see in a moment.
4
An Example of a Poor Non-Functional Requirement
“A user is to be given access to the system instantaneously after submitting a request to sign on.”
Clearly it is impossible to take a user input, convert it to a digital signal, convert that to an analogue signal, send it over miles of cable to a server where the message is converted back to digital and processed, and finally send a response back. This is not going to happen in 0.00000000000000 micro-seconds. The requirement has failed even before it is tested.
Poor requirements such as this are very common. The words “instant”, “instantaneously”, “immediate” and “immediately” should always be avoided.
There are clear steps to improving a requirement such as this, as follows…
5
Cost Implications #1
IT projects have costs built in to cover risk. The more risk perceived in the requirements, the higher the cost, since the more risk there is for the development team.
So getting requirements right, clear and testable is important for driving down project risk and costs.
A further advantage is that clear testable requirements also directly reduce development time and test time, so this brings forward project delivery and cuts development costs significantly.
6
Cost Implications #2
Question – Why is this not normally done?
Answer – Training for BAs does not normally include the full lifecycle of the requirement through the verification phase. BAs in large consultancies usually do not stay around to see the full delivery through the test process. So the test manager will often need to oversee the restructuring and correcting of requirements.
Organisations are now beginning to insist that Test managers review and approve requirements before they are accepted by the business.
Some even use metric measurements to drive up requirement standards.
The business often does not appreciate the cost implications and is not interested in fine detail – yet two man-days of requirement review can save two team-months of work on a 12-month project.
7
Set Realistic Levels
An improvement on the previous requirement might be:
“A user is to be given access to the system within 3 seconds after submitting a request to sign on.”
However there are two immediate problems with this:
- If 30,000 users log in within 3.000 seconds, but one user logs in after 3.001 seconds, the requirement has failed.
- What if the login takes 2 seconds (i.e. not 3 seconds)? There are no grey areas in testing – it is either pass or fail, and taken literally here we have a fail. Is this reasonable?
What is the allowable tolerance of acceptability? We could improve this by saying:
“A user is to be given access to the system in 3.0 seconds or less after submitting a request to sign on.”
Here we give an indication of the tolerance for the measurement.
There are still however problems…
8
Setting the Conditions
The requirement:
“A user is to be given access to the system in 3.0 seconds or less after submitting a request to sign on.”
has a very specific problem: it can be tested, however the test is not repeatable.
Where is the measurement taken from?
If taken from the terminal itself, then data is being queued and routed over a network and the response time will depend upon the network traffic.
In nearly all cases response times are measured from the server, not the terminal. So an improvement would be:
“The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server.”
This now identifies: a realistic time, the tolerance level and where the measurement is made. Yet there is still a significant problem, where the spirit of the business analyst can be totally disregarded. So a further improvement is required.
9
Setting the Conditions
The requirement: “The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server.” can be tested for a single user sign on, and a pass or fail result can be awarded.
However signing onto a multimillion pound piece of kit as a single user, with no background tasks running and no other users on the system, is not going to be a fair implementation of what the business analyst had in mind!
It is therefore important to provide the test team with at least an idea of an expected daily load during normal operation and during special periods e.g. end of year accounting. We can therefore define a normal and peak load for the system and cross reference to this in the non-functional requirements:
“The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
We will come onto defining a normal load in a moment. However there is still a problem over the duration and how we decide what is acceptable.
10
Using the 90th Percentile
What if we have 1,000 logins within an hour and 999 are within the specified limit, but one takes 4 seconds – and this only happens occasionally? Do we fail the system? What is really acceptable?
The requirement might be better written as:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
We now have a repeatable, testable requirement. Specifically note that:
- We are setting the conditions for the test to be valid.
- We are setting specific limits for the test.
- We are setting the duration for the test.
- We are setting an expectation as to what is acceptable as a trend across test results.
NOTE: As a general principle, any response time should be written as a 90th percentile figure. If need be, this can be further clarified with an absolute maximum that is not acceptable.
11
90 Percentile & Absolute Maximum
NOTE: As a general principle, any response time should be written as a 90th percentile figure. If need be, this can be further clarified with an absolute maximum that is not acceptable, e.g.:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz). Under no circumstances is an access request to take longer than 8.0 seconds.”
Avoid trying to second-guess response times. Better still is to classify transactions as simply:
- Light
- Medium
- Heavy
Then set a limit for these transaction types…
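As an illustration only (the 3.0 second limit and 8 second absolute maximum are the example figures above; the sample data and function names are invented), a test team might evaluate such a percentile-plus-maximum requirement along these lines:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value such that at least
    p% of the samples are at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

def verify_access_times(times_sec, p90_limit=3.0, abs_limit=8.0):
    """Pass only if the 90th percentile is within the limit AND no
    single request exceeds the absolute maximum."""
    return percentile(times_sec, 90) <= p90_limit and max(times_sec) <= abs_limit

# 1,000 logins: 999 fast, one occasional 4.0 s outlier.
times = [1.2] * 999 + [4.0]
print(verify_access_times(times))  # True: the single outlier no longer fails the test
```

Note that a strict “all requests in 3.0 seconds or less” check would have failed on the single 4.0 second outlier; the percentile form passes it, while the absolute maximum still guards against gross failures.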
12
Transaction Types
Classify transactions (within requirements) simply as:
- Light resource use
- Medium resource use
- Heavy resource use
Then define the transaction types (units in seconds):

Process Type | Average (50th percentile)* | 90th percentile** | Maximum limit**
Light        | 1.0                        | 2.0               | 6.0
Medium       | 2.0                        | 3.5               | 8.0
Heavy        | 3.5                        | 6.0               | 12.0

* For guidance only.  ** Used as a test limit.
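A minimal sketch of how this table might be held as a single lookup, so that individual requirements need only name a class rather than quote raw figures (the values are taken from the table above; the data structure itself is just an illustration):

```python
# Transaction-type limits, in seconds, from the table above:
# (50th percentile guide, 90th percentile test limit, absolute maximum test limit)
LIMITS = {
    "light":  (1.0, 2.0, 6.0),
    "medium": (2.0, 3.5, 8.0),
    "heavy":  (3.5, 6.0, 12.0),
}

def limits_for(transaction_class):
    """Return (guide, p90_limit, max_limit) for a named transaction class."""
    return LIMITS[transaction_class.lower()]

guide, p90_limit, max_limit = limits_for("Light")
print(p90_limit, max_limit)  # 2.0 6.0
```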
13
Non-Functional Requirement with Cross-Reference
So our requirement:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
This could simply be restructured as:
“The user access transaction is defined as a light transaction (see table abc). This is to be verified at the server, under a one hour test simulating a normal load as defined within reference xyz.”
14
Error Margin
NOTE: It is normal to apply / interpret an error of ±1 in the lowest quoted significant digit:
- 3 would mean between 2 and 4.
- 3.0 would mean 2.9 to 3.1.
- 3.00 would mean 2.99 to 3.01.
So when quoting accuracy, think about significant digits and at what point you really want a product to be rejected. Remember the test team will be quoting the accuracy of their readings within a margin of error.
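The significant-digit rule can be made mechanical. This sketch (a hypothetical helper, shown only to make the rule concrete) derives the implied acceptance band from the precision of the quoted figure:

```python
def tolerance_band(quoted):
    """Return the (low, high) band implied by +/- 1 in the last quoted digit."""
    decimals = len(quoted.split(".")[1]) if "." in quoted else 0
    step = 10 ** -decimals          # size of one unit in the last quoted digit
    value = float(quoted)
    return (round(value - step, decimals), round(value + step, decimals))

print(tolerance_band("3"))     # (2.0, 4.0)
print(tolerance_band("3.0"))   # (2.9, 3.1)
print(tolerance_band("3.00"))  # (2.99, 3.01)
```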
15
Reflection
We can see that we have now created a requirement that:
- Is testable.
- Has a real potential of passing in a properly constructed test.
- Reflects the true intention of the business analyst, avoiding the requirement being treated literally and so providing a pass where a fail would be more appropriate.
16
Creating a Predicted User Load
For testing, the test team will create an automated scenario to simulate user traffic. It is therefore helpful to understand what a typical day’s load is.
What you are doing here is providing information to help the test team define a load that is reasonable and in accordance with what you, the BA, would expect. Against this load, test times can be measured.
It is important not to just test lots of people logging on with no other tasks running, since those tasks might influence the performance being tested, and we need to be as realistic as possible.
17
User Activity
Start by listing the types of users and how many of each type are likely to be logged onto the system during a 24 hour period, band by band for a typical day:

User Type    | 0:00–3:00 | 3:01–6:00 | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
System Admin | 1         | 1         | 2         | 4          | 6           | 4           | 3           | 1
User Type A  | 0         | 0         | 1,000     | 900        | 200         | 700         | 1,000       | 100
User Type B  | 1         | 2         | 10        | 50         | 30          | 25          | 60          | 80
This can be obtained by looking at predicted business demand, looking at current usage, etc.
18
Background Tasks
Next identify any background tasks and processes that may be running during a typical day:
Task              | 0:00–3:00   | 3:01–6:00   | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
Finance Report    | 5           | 5           | 0         | 1          | 0           | 1           | 0           | 0
Sales Report      | 0           | 0           | 1,000     | 900        | 200         | 700         | 1,000       | 100
Salesman Activity | 1           | 2           | 10        | 50         | 30          | 25          | 60          | 80
Backup            | Full System | Full System |           |            |             |             |             |
19
User Activity
Finally, think about the activity on the system – what are the users doing?

Activity        | User Type         | 0:00–3:00 | 3:01–6:00 | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
Login           | All               | 2         | 3         | 500       | 512        | 100         | 30          | 2           | 1
Logout          | All               | 0         | 0         | 2         | 100        | 300         | 400         | 500         | 1
Create user     | System Admin      | 0         | 0         | 5         | 10         | 0           | 0           | 0           | 0
Delete user     | System Admin      | 0         | 0         | 0         | 0          | 0           | 4           | 0           | 0
Search for work | User A and User B | 20        | 10        | 10        | 30         | 20          | 10          | 5           | 4
Amend record    | User A            | 1         | 2         | 10        | 50         | 30          | 25          | 60          | 80
Raise Order     | User B            | 0         | 0         | 50        | 100        | 200         | 40          | 0           | 0
Raise Invoice   | User B            | 0         | 0         | 20        | 70         | 80          | 22          | 0           | 0
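One way (purely illustrative; the names and counts come from the activity table above, but the data structure is an assumption) to hand the activity profile to the test team in machine-readable form:

```python
# Expected transaction counts per three-hour band, from the activity table above
# (a representative subset of rows shown).
BANDS = ["0:00-3:00", "3:01-6:00", "6:01-9:00", "9:01-12:00",
         "12:01-15:00", "15:01-18:00", "18:01-21:00", "21:01-24:00"]

ACTIVITY = {
    "login":       [2, 3, 500, 512, 100, 30, 2, 1],
    "logout":      [0, 0, 2, 100, 300, 400, 500, 1],
    "raise_order": [0, 0, 50, 100, 200, 40, 0, 0],
}

def load_in_band(band):
    """Total expected transactions across all listed activities in one band."""
    i = BANDS.index(band)
    return sum(counts[i] for counts in ACTIVITY.values())

print(load_in_band("9:01-12:00"))  # 712 transactions across the listed activities
```

A load-test tool can then replay each band's mix of activities, rather than an unrealistic single-activity burst.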
20
Special Days
Are there any special days that have additional load? We do not want the system to grind to a halt as soon as, for example, the end of year accounts are run. So identify what those special days are and indicate the extra load.
21
Avoid Meaningless Figures that are Not Derived
Some non-functional requirements creep in (much like the case of instant time) which add cost to the system, have no validity in business terms and usually cannot be fully delivered, if at all. These made-up numbers need to be avoided. Typical culprits are:
- Screen refreshes that are significantly faster than the human eye can resolve – an unnecessary overhead and expense. Be realistic.
- Screen refreshes with reported data do not have to be very quick; users are often happy to wait a short while, so avoid unnecessary speed just for the sake of it.
- Service levels impacted by external systems, outside of the system under test.
22
Availability Requirements #1
Very rarely is it truly necessary to have 100% availability 24 hours a day, 7 days a week, 365 days a year. There may also be a need to build in maintenance periods.
So avoid 100% availability. The best that is ever likely to be achieved is 99.999%, which allows a down time of only about 5.3 minutes over a year (~26 seconds/month). Take into account scheduled system administration tasks, and unscheduled down time could be less than a minute each year. Can the significant expense for those few seconds be justified to the business?
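The arithmetic is easy to check. This sketch converts an availability percentage into its yearly down-time allowance:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_per_year_minutes(availability_pct):
    """Minutes of down time permitted per year at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(downtime_per_year_minutes(99.999), 1))  # 5.3 minutes/year ("five nines")
print(round(downtime_per_year_minutes(99.99), 1))   # 52.6 minutes/year ("four nines")
```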
23
Availability Requirements #2
Another way to state what is acceptable over a given time span is to set standards over time periods, where the supplier has the opportunity to use grace periods within set limits and where the slate is wiped clean after a given period of time. E.g.:
- Over a 24 hour period, corresponding to one calendar day measured at GMT, the total down time is not to be more than 3 minutes.
- Over a 7 day period, measured from the start of Sunday to the close of Saturday using GMT, the total down time is not to be more than 3.5 minutes.
- Over any 28 day (GMT) reporting period (or rolling period), the total down time is not to be more than 4 minutes.
- Over any 6 month (GMT) reporting period (or rolling period), the total down time is not to be more than 15 minutes.
- Over any 12 month (GMT) reporting period (or rolling period), the total down time is to be no more than 25 minutes.
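A sketch of how such tiered budgets might be checked (the windows and limits are the example figures above; the function and its inputs are assumptions for illustration):

```python
# Down-time budgets in minutes for each reporting window, from the example above.
BUDGETS_MINUTES = {
    "day": 3.0,
    "week": 3.5,
    "28_days": 4.0,
    "6_months": 15.0,
    "12_months": 25.0,
}

def breached(total_downtime_minutes, window):
    """True if the total down time recorded in the window exceeds its budget."""
    return total_downtime_minutes > BUDGETS_MINUTES[window]

print(breached(2.5, "day"))      # False: within the 3-minute daily budget
print(breached(5.0, "28_days"))  # True: over the 4-minute 28-day budget
```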
24
Finally define Severity, Priority and Impact Scores
These scores should always be agreed with the test manager before being specified within the requirements. SLAs will have limits on the number of defects permitted in the provision of live services, and it is important to agree what a specific level of severity means. Similarly, it will be necessary to define what is required in terms of fix turnaround, allowing more complex fixes reasonably more time to identify and repair. Defects may be sorted by priority, and often priority does not match severity. For example, a minor defect may prevent further testing, so its priority may be high.
25
Defect Severity

Level 1 – Critical: There is no work around for the following issues: total failure of the software / system, unrecoverable data loss, or loss of required core system functionality. The defect prevents the product from being released, or its presence makes it impossible for testing to proceed. Examples: defects that cause the system to crash, corrupt data files, or cause significant disruption to services.

Level 2 – High: A work around is available, however it is unsatisfactory for daily services, for the following issues: severely impaired functionality, or a non-critical defect to the core system. The defect's presence makes it difficult to improve system test coverage and so prevents uncovering of potentially more serious issues. Examples: database queries fail – the work around is to reboot the system; the printer crashes – the work around is not to use the printer.

Level 3 – Normal: A clear, reasonable, alternative work around is available for the following issue types: the defect impacts only non-critical aspects of the system, or functions that enhance usability. The product could be released if the defect is documented, but its presence may cause user dissatisfaction, so training or user notification may be required. The defect's presence causes issues for the test team. Examples: the tape backup program doesn't work – the work around is to use a different tape backup programme; Search and View functions fail; different but semantically identical text in button labels.

Level 4 – Minor (and Cosmetic*): The defect is of minor significance. A work around exists or, if not, the impact is not significant. The product could be released with the defect, as most users would be unaware of its existence or only slightly dissatisfied. In the case of very minor (cosmetic) defects the problem can be ignored. Examples: out of date documentation or a formatting error in printed output; spelling errors in manuals or error messages that could be clearer.

Level 5 – Cosmetic*: It is sometimes useful to take out cosmetic defects as a separate category from minor defects and make the distinction between these. Examples: spelling errors in manuals or error messages that could be clearer.
26
Defect Priority
Level 1 – Urgent: Must be corrected in the next build.
Level 2 – High: Must be fixed in one of the upcoming builds, but shall be included in the release.
Level 3 – Medium: May be fixed in a future release, not necessarily the next release.
Level 4 – Low: Potentially may not get fixed, but can be a candidate for future releases.
NOTE: Defect priority allows for prioritisation of defects to assist the development and test teams. For example, while a defect may exhibit a lower severity level, it may be strategically important to fix the problem sooner. Prioritisation also allows for defining priorities within a severity level, especially where not all defects would be scheduled to be fixed.
27
Purpose of Impact Score
The impact score is an input to calculating a risk score for the requirement. This in turn helps to refocus test priority towards the areas of greatest risk, and so sets the level of support for assuring delivery of each requirement.
An individual requirement will therefore need an allocated business impact score.
Impact scores should not be skewed towards higher values, as this does not help when sacrificing low-risk testing to give high-risk areas greater test coverage. Instead, ideally a normal distribution should be assumed.
• Risk = (Business Impact if requirement fails) x (Technical Likelihood of failure)
[Chart: spread of impact scores across the requirement set]
28
Requirement Failure Impact
Requirement failure impact (assessed by the business) and assigned to each requirement:

Level 5 – Catastrophic: A failure stops business and means a major loss of income. Work stops. Such a loss would need to be fixed or patched within 24 hours.
Level 4 – Major: Business can continue, but there is a significant impact on volume of work and profits; some non-crucial work might have to be put on hold. Work is only sustainable over a few days. Such a loss would need to be fixed or patched within a working week.
Level 3 – High: Business can continue, although at significant inconvenience. A paper workaround might be possible. Such a loss would need to be fixed or patched within 30 calendar days.
Level 2 – Normal: Business can continue with some inconvenience; a fix can be left until the next planned release.
Level 1 – Low: Business can continue with nominal inconvenience; a fix would be a lower priority, perhaps waiting for more than 2 releases. Potentially this might reflect a fault in code that has a lower priority for delivery, with no impact upon highly prioritised functionality.
NOTE: This defines the business impact if a delivered requirement should fail in the live system
29
Development Assigned Failure Likelihood Score
Failure Likelihood (Technical Assessment by Development) and assigned to each requirement
Level Likelihood Description
5 Daily A failure potentially is expected once per day
4 Weekly A failure potentially is expected once per week
3 Monthly A failure potentially is expected once per month
2 Quarterly A failure potentially is expected once per quarter
1 Annually A failure potentially is expected once per year
NOTE: This defines the technical likelihood of a delivered requirement failing in the live system, based upon the difficulty of implementation and the type of technology used. It is an engineer's/developer's assessment. Some variance across the system is expected and, unlike impact, it may not necessarily follow a normal distribution. This is an input to calculating a risk score for the requirement, which in turn helps to refocus test priority towards the areas of greatest risk.
30
Risk Level Calculation
Risk = Business Impact if requirement fails x Technical Likelihood of failure
So this provides a grid of risk scores, used for test team risk-mitigation prioritisation:

Impact \ Likelihood |  1 |  2 |  3 |  4 |  5
        1           |  1 |  2 |  3 |  4 |  5
        2           |  2 |  4 |  6 |  8 | 10
        3           |  3 |  6 |  9 | 12 | 15
        4           |  4 |  8 | 12 | 16 | 20
        5           |  5 | 10 | 15 | 20 | 25
Key – Risk Level:
R ≥ 16: Critical risk (high priority for test effort – finer-grained testing and greater variation in test conditions required)
9 < R < 16: High risk
3 < R ≤ 9: Medium risk
R ≤ 3: Low risk (low priority for test effort)
Note: It is possible to use weighted calculations; the example shown here is the simplest level, to help explain how requirement verification is targeted.
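The calculation and key above can be sketched as follows (reading the key as the contiguous bands R ≤ 3, 3 < R ≤ 9, 9 < R < 16 and R ≥ 16, which covers every score in the grid):

```python
def risk_level(impact, likelihood):
    """Band the risk score (impact x likelihood, each 1-5) per the key above."""
    r = impact * likelihood
    if r >= 16:
        return "critical"
    if r > 9:
        return "high"
    if r > 3:
        return "medium"
    return "low"

print(risk_level(5, 5))  # critical (25)
print(risk_level(3, 4))  # high (12)
print(risk_level(3, 3))  # medium (9)
print(risk_level(1, 2))  # low (2)
```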
31
Final Form of Requirement
The earlier requirement could finally be restructured as:
“The user access transaction is defined as a light transaction, as defined within table abc (e.g. slide 10). This is to be verified under a one hour test simulating a normal load as defined within reference xyz (e.g. slides 15, 16 and 17), with time measurements made at the server.”
Requirement Property             | Score
Impact if failure                | 5 – staff could not do any work
Likelihood of failure            | To be completed by development
Risk level (feeds test priority) | To be completed by test
32