Non-Functional Performance Requirements v2.2
DESCRIPTION
How to write and structure non-functional requirements, focusing on performance requirements. This is a quick guide to getting going: how to avoid writing untestable requirements and make sure what you want is delivered.
TRANSCRIPT
1
Ian McDonald
Non-Functional Performance Requirements
© 2010, 2013, 2014 Ian McDonald
July 2010 v1
January 2014 v2.2
2
Purpose
These slides are aimed at those writing non-functional, performance-related requirements.
They aim to:
- Show how to create testable, verifiable requirements.
- Demonstrate how to create data that is often missing, the absence of which can lead to improper validation.
Overview
These slides are to aid those writing non-functional performance and volumetric requirements. They specifically address the issues of:
- Writing requirements that are testable.
- Avoiding self-defeat through creating requirements that can never pass.
- Avoiding disappointment through creating requirements that are too easy to test and pass, but then fail quickly in an operational setting. This happens when the requirement is poorly structured.
- Avoiding being ignored, through developers throwing away your work and re-writing the requirements to deliver what they want to code, not what you want delivered.
- Equipping your champion with the information required to ensure what you want is actually delivered. Your champion is the test manager – make the most of them!
The reader is taken through the process by example, showing at first how not to write requirements and then how to structure them correctly.
By the end of the presentation, the reader should be able to take an existing requirement and re-shape it to produce a testable requirement that makes a meaningful contribution to the final product.
3
Verifying Requirements
When verifying a set of requirements, test teams need to know the following:
- Given that time is limited, where should the greatest focus be placed to reduce risk? This is obtained through:
  - Knowing the importance for delivery order – the Priority Score.
  - Knowing the impact on the business if the delivered functionality were to fail – the Failure Impact Score.
  - Information from the development team that identifies the technical likelihood of failure.
- What is to be tested – presented as single, atomic, logical points, one point per requirement. Larger parent requirements can be broken down into daughter sub-requirements.
- Under what conditions verifications are to be made – this is very important for non-functional requirements, as we shall see in a moment.
4
An Example of a Poor Non-Functional Requirement
“A user is to be given access to the system instantaneously after submitting a request to sign on.”
Clearly it is impossible to take a user input, convert it to a digital signal, convert that to an analogue signal, send it over miles of cable to a server where the message is converted back to digital and processed, and finally send a response back. This is not going to happen in 0.00000000000000 micro-seconds. The requirement has failed even before it is tested.
Poor requirements such as this are very common. The words “instant”, “instantaneously”, “immediate” and “immediately” should always be avoided.
There are clear steps to improving a requirement such as this, as follows…
5
Cost Implications #1
IT projects have costs built in to cover risk. The more risk perceived in the requirements, the higher the cost, since the more risk there is for the development team.
So getting requirements right, clear and testable is important for driving down project risk and costs.
A further advantage is that clear testable requirements also directly reduce development time and test time, so this brings forward project delivery and cuts development costs significantly.
6
Cost Implications #2
Question – Why is this not normally done?
Answer – Training for BAs does not normally include the full lifecycle of the requirement through the verification phase. BAs in large consultancies usually do not stay around to see the full delivery through the test process. So the test manager will often need to oversee the restructuring and correcting of requirements.
Organisations are now beginning to insist that Test managers review and approve requirements before they are accepted by the business.
Some even use metric measurements to drive up requirement standards.
The business often does not appreciate the cost implications and is not interested in fine detail – yet two man-days of requirement review can save two team-months of work on a 12-month project.
7
Set Realistic Levels
An improvement on the previous requirement might be:
“A user is to be given access to the system within 3 seconds after submitting a request to sign on.”
However there are two immediate problems with this:
- If 30,000 users log in within 3.000 seconds, but one user logs in after 3.001 seconds, the requirement has failed.
- What if the login takes 2 seconds (i.e. not 3 seconds)? There are no grey areas in testing – it is either pass or fail, and taken literally here we have a fail. Is this reasonable?
What is the allowable tolerance of acceptability? We could improve this by saying:
“A user is to be given access to the system in 3.0 seconds or less after submitting a request to sign on.”
Here we give an indication of the tolerance for the measurement.
There are still however problems…
8
Setting the Conditions
The requirement:
“A user is to be given access to the system in 3.0 seconds or less after submitting a request to sign on.”
has a very specific problem: it can be tested, however the test is not repeatable.
Where is the measurement taken from?
If taken from the terminal itself, then data is being queued and routed over a network and the response time will depend upon the network traffic.
In nearly all cases response times are measured from the server, not the terminal. So an improvement would be:
“The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server.”
This now identifies: a realistic time, the tolerance level and where the measurement is made. Yet there is still a significant problem, where the spirit of the business analyst can be totally disregarded. So a further improvement is required.
9
Setting the Conditions
The requirement: “The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server.” can be tested for a single user sign on, and a pass or fail result can be awarded.
However signing onto a multimillion pound piece of kit as a single user, with no background tasks running and no other users on the system, is not going to be a fair implementation of what the business analyst had in mind!
It is therefore important to provide the test team with at least an idea of an expected daily load during normal operation and during special periods e.g. end of year accounting. We can therefore define a normal and peak load for the system and cross reference to this in the non-functional requirements:
“The server will grant user access to the system in 3.0 seconds or less after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
We will come onto defining a normal load in a moment. However there is still a problem over the duration and how we decide what is acceptable.
10
Using the 90th Percentile
What if we have 1,000 logins within an hour and 999 are within the specified limit, but one takes 4 seconds – and this only happens occasionally? Do we fail the system? What is really acceptable?
The requirement might be better written as:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
We now have a repeatable, testable requirement. Specifically note that:
- We are setting the conditions for the test to be valid.
- We are setting specific limits for the test.
- We are setting the duration for the test.
- We are setting an expectation as to what is acceptable as a trend across test results.
NOTE: As a general principle, any response time should be written as a 90th percentile figure. If need be, this can be further clarified with an absolute maximum that is not acceptable.
11
90 Percentile & Absolute Maximum
NOTE: As a general principle, any response time should be written as a 90th percentile figure. If need be, this can be further clarified with an absolute maximum that is not acceptable, e.g.:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz). Under no circumstances is an access request to take longer than 8.0 seconds.”
Avoid trying to second-guess response times. Better still is to classify transactions as simply:
- Light
- Medium
- Heavy
Then set a limit for these transaction types…
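As an illustration only (the 3.0 second limit and 8 second absolute maximum are the example figures above; the sample data and function names are invented), a test team might evaluate such a percentile-plus-maximum requirement along these lines:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value such that at least
    p% of the samples are at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

def verify_access_times(times_sec, p90_limit=3.0, abs_limit=8.0):
    """Pass only if the 90th percentile is within the limit AND no
    single request exceeds the absolute maximum."""
    return percentile(times_sec, 90) <= p90_limit and max(times_sec) <= abs_limit

# 1,000 logins: 999 fast, one occasional 4.0 s outlier.
times = [1.2] * 999 + [4.0]
print(verify_access_times(times))  # True: the single outlier no longer fails the test
```

Note that a strict “all requests in 3.0 seconds or less” check would have failed on the single 4.0 second outlier; the percentile form passes it, while the absolute maximum still guards against gross failures.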
12
Transaction Types
Classify transactions (within requirements) simply as:
- Light resource use
- Medium resource use
- Heavy resource use
Then define the transaction types (units in seconds):

Process Type | Average (50th percentile)* | 90th percentile** | Maximum limit**
Light        | 1.0                        | 2.0               | 6.0
Medium       | 2.0                        | 3.5               | 8.0
Heavy        | 3.5                        | 6.0               | 12.0

* For guidance only.  ** Used as a test limit.
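A minimal sketch of how this table might be held as a single lookup, so that individual requirements need only name a class rather than quote raw figures (the values are taken from the table above; the data structure itself is just an illustration):

```python
# Transaction-type limits, in seconds, from the table above:
# (50th percentile guide, 90th percentile test limit, absolute maximum test limit)
LIMITS = {
    "light":  (1.0, 2.0, 6.0),
    "medium": (2.0, 3.5, 8.0),
    "heavy":  (3.5, 6.0, 12.0),
}

def limits_for(transaction_class):
    """Return (guide, p90_limit, max_limit) for a named transaction class."""
    return LIMITS[transaction_class.lower()]

guide, p90_limit, max_limit = limits_for("Light")
print(p90_limit, max_limit)  # 2.0 6.0
```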
13
Non-Functional Requirement with Cross-Reference
So our requirement:
“For the 90th percentile of user access requests over a 1 hour period, the server will grant user access to the system in no more than 3.0 seconds after a sign on request is received at the server, during a normal system load (as defined in reference xyz).”
This could simply be restructured as:
“The user access transaction is defined as a light transaction (see table abc). This is to be verified at the server, under a one hour test simulating a normal load as defined within reference xyz.”
14
Error Margin
NOTE: It is normal to apply / interpret an error of ±1 in the lowest quoted significant digit:
- 3 would mean between 2 and 4.
- 3.0 would mean 2.9 to 3.1.
- 3.00 would mean 2.99 to 3.01.
So when quoting accuracy, think about significant digits and at what point you really want a product to be rejected. Remember the test team will be quoting the accuracy of their readings within a margin of error.
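The significant-digit rule can be made mechanical. This sketch (a hypothetical helper, shown only to make the rule concrete) derives the implied acceptance band from the precision of the quoted figure:

```python
def tolerance_band(quoted):
    """Return the (low, high) band implied by +/- 1 in the last quoted digit."""
    decimals = len(quoted.split(".")[1]) if "." in quoted else 0
    step = 10 ** -decimals          # size of one unit in the last quoted digit
    value = float(quoted)
    return (round(value - step, decimals), round(value + step, decimals))

print(tolerance_band("3"))     # (2.0, 4.0)
print(tolerance_band("3.0"))   # (2.9, 3.1)
print(tolerance_band("3.00"))  # (2.99, 3.01)
```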
15
Reflection
We can see that we have now created a requirement that:
- Is testable.
- Has a real potential of passing in a properly constructed test.
- Reflects the true intention of the business analyst, avoiding the requirement being treated literally and so providing a pass where a fail would be more appropriate.
16
Creating a Predicted User Load
For testing, the test team will create an automated scenario to simulate user traffic. It is therefore helpful to understand what a typical day’s load is.
What you are doing here is providing information to help the test team define a load that is reasonable and in accordance with what you, the BA, would expect. Against this load, test times can be measured.
It is important not to just test lots of people logging on with no other tasks running, since those tasks might influence the performance being tested, and we need to be as realistic as possible.
17
User Activity
Start by listing the types of users and how many of each type are likely to be logged onto the system during a 24 hour period, band by band for a typical day:

User Type    | 0:00–3:00 | 3:01–6:00 | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
System Admin | 1         | 1         | 2         | 4          | 6           | 4           | 3           | 1
User Type A  | 0         | 0         | 1,000     | 900        | 200         | 700         | 1,000       | 100
User Type B  | 1         | 2         | 10        | 50         | 30          | 25          | 60          | 80
This can be obtained by looking at predicted business demand, looking at current usage, etc.
18
Background Tasks
Next identify any background tasks and processes that may be running during a typical day:
Task              | 0:00–3:00   | 3:01–6:00   | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
Finance Report    | 5           | 5           | 0         | 1          | 0           | 1           | 0           | 0
Sales Report      | 0           | 0           | 1,000     | 900        | 200         | 700         | 1,000       | 100
Salesman Activity | 1           | 2           | 10        | 50         | 30          | 25          | 60          | 80
Backup            | Full System | Full System |           |            |             |             |             |
19
User Activity
Finally, think about the activity on the system – what are the users doing?

Activity        | User Type         | 0:00–3:00 | 3:01–6:00 | 6:01–9:00 | 9:01–12:00 | 12:01–15:00 | 15:01–18:00 | 18:01–21:00 | 21:01–24:00
Login           | All               | 2         | 3         | 500       | 512        | 100         | 30          | 2           | 1
Logout          | All               | 0         | 0         | 2         | 100        | 300         | 400         | 500         | 1
Create user     | System Admin      | 0         | 0         | 5         | 10         | 0           | 0           | 0           | 0
Delete user     | System Admin      | 0         | 0         | 0         | 0          | 0           | 4           | 0           | 0
Search for work | User A and User B | 20        | 10        | 10        | 30         | 20          | 10          | 5           | 4
Amend record    | User A            | 1         | 2         | 10        | 50         | 30          | 25          | 60          | 80
Raise Order     | User B            | 0         | 0         | 50        | 100        | 200         | 40          | 0           | 0
Raise Invoice   | User B            | 0         | 0         | 20        | 70         | 80          | 22          | 0           | 0
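One way (purely illustrative; the names and counts come from the activity table above, but the data structure is an assumption) to hand the activity profile to the test team in machine-readable form:

```python
# Expected transaction counts per three-hour band, from the activity table above
# (a representative subset of rows shown).
BANDS = ["0:00-3:00", "3:01-6:00", "6:01-9:00", "9:01-12:00",
         "12:01-15:00", "15:01-18:00", "18:01-21:00", "21:01-24:00"]

ACTIVITY = {
    "login":       [2, 3, 500, 512, 100, 30, 2, 1],
    "logout":      [0, 0, 2, 100, 300, 400, 500, 1],
    "raise_order": [0, 0, 50, 100, 200, 40, 0, 0],
}

def load_in_band(band):
    """Total expected transactions across all listed activities in one band."""
    i = BANDS.index(band)
    return sum(counts[i] for counts in ACTIVITY.values())

print(load_in_band("9:01-12:00"))  # 712 transactions across the listed activities
```

A load-test tool can then replay each band's mix of activities, rather than an unrealistic single-activity burst.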
20
Special Days
Are there any special days that have additional load? We do not want the system to grind to a halt as soon as, for example, the end of year accounts are run. So identify what those special days are and indicate the extra load.
21
Avoid Meaningless Figures that are Not Derived
Some non-functional requirements creep in (much like the case of instant time) which add cost to the system, have no validity in business terms and usually cannot be fully delivered, if at all. These made-up numbers need to be avoided. Typical culprits are:
- Screen refreshes that are significantly faster than the human eye can resolve – an unnecessary overhead and expense. Be realistic.
- Screen refreshes with reported data do not have to be very quick; users are often happy to wait a short while, so avoid unnecessary speed just for the sake of it.
- Service levels impacted by external systems, outside of the system under test.
22
Availability Requirements #1
Very rarely is it truly necessary to have 100% availability 24 hours a day, 7 days a week, 365 days a year. There may also be a need to build in maintenance periods.
So avoid 100% availability. The best that is ever likely to be achieved is 99.999%, which allows a down time of only about 5.3 minutes over a year (~26 seconds/month). Take into account scheduled system administration tasks, and unscheduled down time could be less than a minute each year. Can the significant expense for those few seconds be justified to the business?
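The arithmetic is easy to check. This sketch converts an availability percentage into its yearly down-time allowance:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_per_year_minutes(availability_pct):
    """Minutes of down time permitted per year at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(downtime_per_year_minutes(99.999), 1))  # 5.3 minutes/year ("five nines")
print(round(downtime_per_year_minutes(99.99), 1))   # 52.6 minutes/year ("four nines")
```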
23
Availability Requirements #2
Another way to state what is acceptable over a given time span is to set standards over time periods, where the supplier has the opportunity to use grace periods within set limits and where the slate is wiped clean after a given period of time. E.g.:
- Over a 24 hour period, corresponding to one calendar day measured at GMT, the total down time is not to be more than 3 minutes.
- Over a 7 day period, measured from the start of Sunday to the close of Saturday using GMT, the total down time is not to be more than 3.5 minutes.
- Over any 28 day (GMT) reporting period (or rolling period), the total down time is not to be more than 4 minutes.
- Over any 6 month (GMT) reporting period (or rolling period), the total down time is not to be more than 15 minutes.
- Over any 12 month (GMT) reporting period (or rolling period), the total down time is to be no more than 25 minutes.
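A sketch of how such tiered budgets might be checked (the windows and limits are the example figures above; the function and its inputs are assumptions for illustration):

```python
# Down-time budgets in minutes for each reporting window, from the example above.
BUDGETS_MINUTES = {
    "day": 3.0,
    "week": 3.5,
    "28_days": 4.0,
    "6_months": 15.0,
    "12_months": 25.0,
}

def breached(total_downtime_minutes, window):
    """True if the total down time recorded in the window exceeds its budget."""
    return total_downtime_minutes > BUDGETS_MINUTES[window]

print(breached(2.5, "day"))      # False: within the 3-minute daily budget
print(breached(5.0, "28_days"))  # True: over the 4-minute 28-day budget
```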
24
Finally define Severity, Priority and Impact Scores
These scores should always be agreed with the test manager before being specified within the requirements. SLAs will have limits on the number of defects permitted in the provision of live services, and it is important to agree what a specific level of severity means. Similarly, it will be necessary to define what is required in terms of fix turnaround, allowing more complex fixes reasonably more time to identify and repair. Defects may be sorted by priority, and often priority does not match severity. For example, a minor defect may prevent further testing, so its priority may be high.
25
Defect Severity

Level 1 – Critical: There is no work around for the following issues: total failure of the software / system, unrecoverable data loss, or loss of required core system functionality. The defect prevents the product from being released, or its presence makes it impossible for testing to proceed. Examples: defects that cause the system to crash, corrupt data files, or cause significant disruption to services.

Level 2 – High: A work around is available, however it is unsatisfactory for daily services, for the following issues: severely impaired functionality, or a non-critical defect to the core system. The defect's presence makes it difficult to improve system test coverage and so prevents uncovering of potentially more serious issues. Examples: database queries fail – the work around is to reboot the system; the printer crashes – the work around is not to use the printer.

Level 3 – Normal: A clear, reasonable, alternative work around is available for the following issue types: the defect impacts only non-critical aspects of the system, or functions that enhance usability. The product could be released if the defect is documented, but its presence may cause user dissatisfaction, so training or user notification may be required. The defect's presence causes issues for the test team. Examples: the tape backup program doesn't work – the work around is to use a different tape backup programme; Search and View functions fail; different but semantically identical text in button labels.

Level 4 – Minor (and Cosmetic*): The defect is of minor significance. A work around exists or, if not, the impact is not significant. The product could be released with the defect, as most users would be unaware of its existence or only slightly dissatisfied. In the case of very minor (cosmetic) defects the problem can be ignored. Examples: out of date documentation or a formatting error in printed output; spelling errors in manuals or error messages that could be clearer.

Level 5 – Cosmetic*: It is sometimes useful to take out cosmetic defects as a separate category from minor defects and make the distinction between these. Examples: spelling errors in manuals or error messages that could be clearer.
26
Defect Priority
Level 1 – Urgent: Must be corrected in the next build.
Level 2 – High: Must be fixed in one of the upcoming builds, but shall be included in the release.
Level 3 – Medium: May be fixed in a future release, not necessarily the next release.
Level 4 – Low: Potentially may not get fixed, but can be a candidate for future releases.
NOTE: Defect priority allows for prioritisation of defects to assist the development and test teams. For example, while a defect may exhibit a lower severity level, it may be strategically important to fix the problem sooner. Prioritisation also allows for defining priorities within a severity level, especially where not all defects would be scheduled to be fixed.
27
Purpose of Impact Score
The impact score is an input to calculating a risk score for the requirement. This in turn helps to refocus test priority towards the areas of greatest risk, and so sets the level of support for assuring delivery of each requirement.
An individual requirement will therefore need an allocated business impact score.
Impact scores should not be skewed towards higher values, as this does not help when sacrificing low-risk testing to give high-risk areas greater test coverage. Instead, ideally a normal distribution should be assumed.
• Risk = (Business Impact if requirement fails) x (Technical Likelihood of failure)
[Chart: spread of impact scores across the requirement set]
28
Requirement Failure Impact
Requirement failure impact (assessed by the business) and assigned to each requirement:

Level 5 – Catastrophic: A failure stops business and means a major loss of income. Work stops. Such a loss would need to be fixed or patched within 24 hours.
Level 4 – Major: Business can continue, but there is a significant impact on volume of work and profits; some non-crucial work might have to be put on hold. Work is only sustainable over a few days. Such a loss would need to be fixed or patched within a working week.
Level 3 – High: Business can continue, although at significant inconvenience. A paper workaround might be possible. Such a loss would need to be fixed or patched within 30 calendar days.
Level 2 – Normal: Business can continue with some inconvenience; a fix can be left until the next planned release.
Level 1 – Low: Business can continue with nominal inconvenience; a fix would be a lower priority, perhaps waiting for more than 2 releases. Potentially this might reflect a fault in code that has a lower priority for delivery, with no impact upon highly prioritised functionality.
NOTE: This defines the business impact if a delivered requirement should fail in the live system
29
Development Assigned Failure Likelihood Score
Failure Likelihood (Technical Assessment by Development) and assigned to each requirement
Level Likelihood Description
5 Daily A failure potentially is expected once per day
4 Weekly A failure potentially is expected once per week
3 Monthly A failure potentially is expected once per month
2 Quarterly A failure potentially is expected once per quarter
1 Annually A failure potentially is expected once per year
NOTE: This defines the technical likelihood of a delivered requirement failing in the live system, based upon the difficulty of implementation and the type of technology used. It is an engineer's/developer's assessment. Some variance across the system is expected and, unlike impact, it may not necessarily follow a normal distribution. This is an input to calculating a risk score for the requirement, which in turn helps to refocus test priority towards the areas of greatest risk.
30
Risk Level Calculation
Risk = Business Impact if requirement fails x Technical Likelihood of failure
So this provides a grid of risk scores, used for test team risk-mitigation prioritisation:

Impact \ Likelihood |  1 |  2 |  3 |  4 |  5
        1           |  1 |  2 |  3 |  4 |  5
        2           |  2 |  4 |  6 |  8 | 10
        3           |  3 |  6 |  9 | 12 | 15
        4           |  4 |  8 | 12 | 16 | 20
        5           |  5 | 10 | 15 | 20 | 25
Key – Risk Level:
R ≥ 16: Critical risk (high priority for test effort – finer-grained testing and greater variation in test conditions required)
9 < R < 16: High risk
3 < R ≤ 9: Medium risk
R ≤ 3: Low risk (low priority for test effort)
Note: It is possible to use weighted calculations; the example shown here is the simplest level, to help explain how requirement verification is targeted.
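The calculation and key above can be sketched as follows (reading the key as the contiguous bands R ≤ 3, 3 < R ≤ 9, 9 < R < 16 and R ≥ 16, which covers every score in the grid):

```python
def risk_level(impact, likelihood):
    """Band the risk score (impact x likelihood, each 1-5) per the key above."""
    r = impact * likelihood
    if r >= 16:
        return "critical"
    if r > 9:
        return "high"
    if r > 3:
        return "medium"
    return "low"

print(risk_level(5, 5))  # critical (25)
print(risk_level(3, 4))  # high (12)
print(risk_level(3, 3))  # medium (9)
print(risk_level(1, 2))  # low (2)
```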
31
Final Form of Requirement
The earlier requirement could finally be restructured as:
“The user access transaction is defined as a light transaction, as defined within table abc (e.g. slide 10). This is to be verified under a one hour test simulating a normal load as defined within reference xyz (e.g. slides 15, 16 and 17), with time measurements made at the server.”
Requirement Property             | Score
Impact if failure                | 5 – staff could not do any work
Likelihood of failure            | To be completed by development
Risk level (feeds test priority) | To be completed by test
32