Root Cause Analysis in Testing: "Dealing with problems, not symptoms!"
© copyrights to Alon Linetzki, Best-Testing, 2015
Root Cause Analysis in Testing
Alon Linetzki
CEO and Managing director of Best-Testing
Co-founder and Vice President of ITCB, ISTQB® Partner
Program leader, ISTQB® Agile Tester Certification
co-author, Founder and Chairman of SIGiST Israel
32 years in IT: in Dev, System architecture, Testing, Quality
Assurance
Certified Scrum Master, Scrum Alliance, 2008
Specializes in: Software process improvement, Agile
transition, Risk Management, Risk Based Testing, Root Cause
Analysis, Test Strategy & Optimization, Test Management,
Test Design, Test Automation, Building Smart Teams
International Speaker worldwide, since 1995
We shall cover…
How to Use RCA for analyzing critical
problems?
Introduction to Root Cause Analysis
5whys technique & Cause-Effect diagram
(technique variation)
Technique description
Case study example
Wrap-up
November 2015
What is Root Cause Analysis?
• RCA definition
• From the resources
• My interpretation
Root Cause Analysis definition
From wiki:
Root cause analysis (RCA) is a class of problem solving
methods aimed at identifying the root causes of problems
or events.
The practice of RCA is predicated on the belief that
problems are best solved by attempting to correct or
eliminate root causes, as opposed to merely addressing
the immediately obvious symptoms.
By directing corrective measures at root causes, it is
hoped that the likelihood of problem recurrence will be
minimized.
Root Cause Analysis (My interpretation)
A problem-solving method/process designed to:
“search for the root causes of a problem using a
predefined structured thinking process,
identifying the underlying issues, with the expectation
that
dealing with these issues will dramatically reduce the
likelihood of the problem recurring.”
The process involves data collection, cause charting,
root cause identification and recommendation
generation and implementation.
“Cows falling on a road from a mountain”
– is it a problem or a symptom?
Should we eliminate all cows in that area?
Should we dig out the mountain?
Should we rotate the sign?
Should we divert the road elsewhere?
It seems that sometimes eliminating the causes is not an easy task, and finding the problems is even harder!
How to Use RCA for analyzing critical
problems?
Challenges the current method could
not solve – using 5whys
The ‘5Ys’ or ‘5 whys’ technique, and the cause-effect diagram.
1. Present a problem.
2. Ask “why?” it happens, and find the cause behind it (one cause).
3. Present that cause on the diagram.
4. Ask “why?” that cause happens… [back to step 2, unless we have already asked five times].
5. Done.
Presenting the 5 whys Technique
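The loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `five_whys`, the `why` lookup table, and the sample answers are all hypothetical and are not taken from the presentation. The sketch follows a single cause per "why", which is exactly the limitation discussed on the next slides.

```python
def five_whys(problem, why, max_depth=5):
    """Follow one chain of causes from `problem`, asking "why?" at most max_depth times.

    `why` is a hypothetical lookup that maps an effect to the single cause
    given as the answer to "why does this happen?".
    """
    chain = [problem]
    current = problem
    for _ in range(max_depth):
        cause = why.get(current)
        if cause is None:  # nobody can answer "why?" any more
            break
        chain.append(cause)
        current = cause
    return chain


# Illustrative answers only; they are not taken from the case study.
answers = {
    "Release is late": "Testing started late",
    "Testing started late": "Build was delivered late",
    "Build was delivered late": "Requirements changed during development",
}

chain = five_whys("Release is late", answers)
print(" <- ".join(chain))
```

The last element of `chain` is the candidate root cause; the loop stops either when no further answer exists or after five questions.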
The ‘5Ys’ or ‘5 whys’ technique, and the cause-effect diagram.
Presenting the RCA Technique
[Diagram: the Problem linked to a chain of five Causes; the links are labeled Why #1 to Why #5 and follow the “Thinking path…”.]
The ‘5Ys’ or ‘5 whys’ technique, and the cause-effect diagram.
1. It assumes that a single cause, at each level of "why", is sufficient to explain the effect in question.
2. What if one of the ‘whys’ is answered wrongly? Maybe our answer is possible, but what if the actual cause is something else entirely?
3. When we have found the problem and drawn the route, how ‘strong’ is this solution? If there are several routes, should we prefer one over another?
Challenges: what the method cannot solve
Enhancing the method – case study
Short structured interviews with representatives of the
management, development, release, system,
testing, and product teams.
Step 1: Draw a cause-effect diagram and exercise the 5 whys.
Step 2: Investigate the arrows/causes for:
Relevancy – High, Medium, Low
Strength – Strong, Weak
Impact – Direct, Indirect
Enhancing the Method:
Example project
Enhancing the Method:
Example project
Type | Question | Cause-Effect
Relevancy | What evidence do you have that the cause exists? | H/M/L
Strength (S or W) | What evidence do you have that the cause leads to the effect? | H/M/L
Strength (S or W) | Is anything else needed, together with the cause, for the effect to occur? | Yes/No
Impact (D or I) | Is there evidence that the cause is contributing to the problem I’m looking at? | Yes/No
Impact (D or I) | How much is this cause contributing to a possible resolution? | Direct/Indirect
Mark | |
Enhancing the Method:
Example project
Type | Question | Cause-Effect
Relevancy | What evidence do you have that the cause exists? | High (3)
Strength (S or W) | What evidence do you have that the cause leads to the effect? | Medium (2)
Strength (S or W) | Is anything else needed, together with the cause, for the effect to occur? | No (1)
Impact (D or I) | Is there evidence that the cause is contributing to the problem I’m looking at? | Yes (1)
Impact (D or I) | How much is this cause contributing to a possible resolution? | Direct (2)
Mark | | 9
You should mark each arrow using this table.
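The mark of 9 in the worked example is the sum of the five answers (3 + 2 + 1 + 1 + 2). A minimal Python sketch of that arithmetic follows. The point values for High, Medium, "No" (nothing else needed), "Yes" (contributes), and Direct come from the example; the values for Low, the opposite Yes/No answers, and Indirect are not given on the slides and are assumed here purely for illustration, as are all the names.

```python
# Points per answer. Values marked "slide" appear in the worked example;
# the others are assumptions made for this sketch.
LEVEL_POINTS = {"High": 3, "Medium": 2, "Low": 1}       # High, Medium: slide; Low: assumed
NOTHING_ELSE_NEEDED_POINTS = {"No": 1, "Yes": 0}        # No: slide; Yes: assumed
CONTRIBUTES_POINTS = {"Yes": 1, "No": 0}                # Yes: slide; No: assumed
RESOLUTION_POINTS = {"Direct": 2, "Indirect": 1}        # Direct: slide; Indirect: assumed


def arrow_mark(cause_exists, leads_to_effect, anything_else_needed,
               contributes_to_problem, contribution_to_resolution):
    """Return the mark of one arrow as the sum of its five answers."""
    return (
        LEVEL_POINTS[cause_exists]
        + LEVEL_POINTS[leads_to_effect]
        + NOTHING_ELSE_NEEDED_POINTS[anything_else_needed]
        + CONTRIBUTES_POINTS[contributes_to_problem]
        + RESOLUTION_POINTS[contribution_to_resolution]
    )


# The answers from the worked example: High, Medium, No, Yes, Direct.
mark = arrow_mark("High", "Medium", "No", "Yes", "Direct")
print(mark)  # 9
```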
Step 3: Identify the routes leading to the problem(s).
Step 4: Identify the strength and direction (impact) they have, calculating the mark for each arrow.
Step 5: Choose the best route to focus on [improve it, and go to the next one].
Enhancing the Method:
Example project
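Steps 3 to 5 can be sketched as a small graph walk in Python. Everything here is an assumption made for illustration: the diagram is stored as a dictionary of (cause, effect) arrows with their marks, a route is graded by summing its arrow marks (the slides do not state how arrow marks are combined), and the node names and marks are invented.

```python
def find_routes(arrows, problem):
    """Return every route (list of nodes, root cause first) that ends at `problem`.

    `arrows` maps (cause, effect) pairs to the mark of that arrow.
    Assumes the diagram has no cycles.
    """
    causes_of = {}
    for cause, effect in arrows:
        causes_of.setdefault(effect, []).append(cause)

    def walk(node):
        if node not in causes_of:  # nothing points at it: a root cause
            return [[node]]
        return [route + [node]
                for cause in causes_of[node]
                for route in walk(cause)]

    return walk(problem)


def route_grade(route, arrows):
    """Grade a route as the sum of its arrow marks (an assumed aggregation)."""
    return sum(arrows[(a, b)] for a, b in zip(route, route[1:]))


# Hypothetical diagram and marks, for illustration only.
arrows = {
    ("Weak requirements management", "Code does not match requirements"): 9,
    ("Code does not match requirements", "Low quality product"): 8,
    ("Lack of test techniques", "Low test coverage"): 6,
    ("Low test coverage", "Low quality product"): 5,
}

routes = find_routes(arrows, "Low quality product")
best = max(routes, key=lambda r: route_grade(r, arrows))
print(best, route_grade(best, arrows))
```

Summing favors longer routes; averaging the marks or taking the weakest arrow on a route are equally plausible choices, and which one fits depends on how the team wants to read "strength".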
Case Study - implementation
Background:
The company was using very advanced technology and had a complex product line.
The product is complex: it uses mechanics, electronics, hardware, software, devices, and a cooling device; it is water resistant and heat resistant, and accurate up to 1:1,000,000 cm.
In the last half year, 50% of released machines were returned from the floor (clients) for fixing.
Example project – Hi-Tech Company
The SQA manager attended a course I gave, and liked one of the tools.
He thought automation could solve many of his problems, because:
A lot more tests running,
Identifying more defects before the clients do,
Fewer products coming back,
Clients are happy!
Example project – Hi-Tech Company
I investigated their automation needs,
followed the steps of the enhanced method,
and found out their problems might be elsewhere…
Example project – Hi-Tech Company
Let's see the drawing board from that meeting…
1st drawing – RCA meeting
[Figure: the drawing board from the RCA meeting, annotated “Our way of thinking” with markers 1 and 2.]
The RCA meeting (company execs and directors):
At first, the belief was that the primary problem was:
Partial Test Planning (fewer tests are executed)
Example project – Hi-Tech Company
Let's see an illustrative diagram…
1st drawing – RCA meeting
[Cause-effect diagram, annotated “Our way of thinking” with markers 1 and 2. The arrows are not recoverable from the transcript. Nodes:]
- Many clients ask for different SW of the product
- Many versions open in parallel
- Complexity of version control management is very high
- Defining req’ not good enough by client
- Spec Lvl 0
- No specs in lvl 1
- Spec Lvl 1 not complete or does not fit
- Spec Lvl 2 not written
- Good definition of Spec Lvl 0
- Spec Lvl 1 fits
- Spec Lvl 2 fit
- Spec Lvl 2 does not fit/complete
- Code written with low match to client req’
- Only partial test planning and not full coverage
- Partial test case planning and coverage
- Partial test execution and low coverage
2nd drawing – RCA meeting
[Cause-effect diagram with the same nodes as the 1st drawing, annotated “Our way of thinking” with markers 1 and 2.]
After a while, we shifted the focus and agreed that
the real problem was actually:
Poor Product Quality
Because that was the reason the clients returned
their product.
And we started RCA from there.
After a while, we started to see the light – real
problems started to crystallize, problems that
involved people and processes
Example project – Hi-Tech Company
3rd drawing – RCA meeting
[Cause-effect diagram, annotated “Our way of thinking” with markers 1 and 2. The arrows are not recoverable from the transcript. Nodes:]
- Many clients ask for different SW of the product
- Many versions open in parallel
- SCM: complexity of version control management is very high
- Defining req’ not good enough by client
- Spec Lvl 0
- No specs in lvl 1
- Spec Lvl 1 not complete or does not fit
- Spec Lvl 2 not written
- Good definition of Spec Lvl 0
- Spec Lvl 1 fits
- Spec Lvl 2 fit
- Spec Lvl 2 does not fit/complete
- Code written with low match to client req’
- Only partial test planning and not full coverage
- Partial test case planning and coverage
- Partial test execution and low coverage
- Tight schedule project
- Prioritization and compromise on scope to clients
- Low Quality Product
- Req’ management not good enough
- Lack of methods and techniques in testing
- Low lvl of test identification
We then defined the relevancy, strength and impact of each arrow (cause),
and calculated the grades for the arrows (which are not shown here).
Example project – Hi-Tech Company
Back to the board…
5th drawing – RCA meeting
[Cause-effect diagram with the same nodes as the 3rd drawing, annotated “Our way of thinking” with markers 1 and 2. Arrows now carry strength/impact labels: S/D, W/D, W/I, S/I (S = Strong, W = Weak, D = Direct, I = Indirect).]
We went back to double-check the RCA of the routes leading to the primary problem, marking the arrows with their grades (from the table, remember?).
We ended up circling the main causes: those that initiate the strongest routes directly impacting our problem.
Example project – Hi-Tech Company
Last drawing – RCA meeting
[Cause-effect diagram with the same nodes and the same S/D, W/D, W/I, S/I arrow labels as the 5th drawing.]
Last drawing – RCA meeting
Let's see the routes…
[Same cause-effect diagram, with the additional annotations 4/5 and 3/4.]
Unique patterns
Better Grades/score
5 major Root Topics were identified, explained and prioritized:
1. Produce requirements from client definitions
2. Requirements management
3. Either ‘No Spec Level 1’, or ‘Spec Level 1 not matching requirements’
4. Lack of methods and techniques in testing for development and testing teams
5. A lot of clients define slightly different requirements for the SW – a lot of specials
We defined a pragmatic corrective action plan, with priority items.
Example project – Hi-Tech Company
Major Areas of Concern identified and prioritized:
1. Requirements Management
2. Configuration Management
3. Design Documentation and Flow
4. Testing Methodologies, techniques and tools
Not discussed:
- Release Management
- Risk Management + Risk Based Testing
- Requirements Definition
- Project Management
- Professional Development
Example project – Hi-Tech Company
Organization Language!
5 whys & Cause-effect diagram possible solution – example analyzing arrows
[Cause-effect diagram. The arrows are not recoverable from the transcript. Nodes:]
- Bad Test Planning
- Low lvl of knowledge in estimation
- High lvl of uncertainty when planning
- Late R&D deliverables
- Low lvl of details for R&D deliverables
- Inexperienced team leader
- Product req’ arrive late
- Time pressure on R&D
- Frequent changes in R&D deliverables
- Prod management frequent changes
- ???
- ???
Numeric labels on the diagram: 1, 2, 3, 4, 5, 64
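Analyzing Routes
Routes (topics) | A | B | C | D | E
Grade | 17 | 23 | 18 | 17 | 25
Factor | 19 | | 20 | 19 |
Total | 19 | 23 | 20 | 19 | 25
The remaining rows list three values each; the transcript does not preserve which routes they belong to:
Cost/Benefit: 144k, 1510k, 205k
Resistance: M, H, L
Have control: H, H, M
Decision?: 2, 3, 1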
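The Grade, Factor, and Total figures can be reproduced with a short Python sketch. It rests on an assumption inferred from the numbers rather than stated in the slides: the three Factor values (19, 20, 19) belong to routes A, C, and D, and a route's Total equals its Factor when one is given and its Grade otherwise. That reading yields the Total row 19, 23, 20, 19, 25. The variable names are hypothetical.

```python
# Grades per route, as listed in the Analyzing Routes figures.
grades = {"A": 17, "B": 23, "C": 18, "D": 17, "E": 25}

# Assumed placement of the three Factor values, inferred from the Total row.
factors = {"A": 19, "C": 20, "D": 19}

# Assumed rule: Total is the Factor when present, otherwise the Grade.
totals = {route: factors.get(route, grade) for route, grade in grades.items()}

# Rank routes from highest to lowest total (ties keep their A-E order).
ranking = sorted(totals, key=totals.get, reverse=True)

print(totals)
print(ranking)
```

The ranking by total is only one input to the decision; the Cost/Benefit, Resistance, and Have control rows also feed into it, and this sketch does not model them.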
Cause-effect & RCA can assist in improvement path
determination
We should pilot it, and make adjustments where
necessary
Integrate it in our life-cycle and processes
Measure to make sure we made the right decisions!
Summary
To further enhance the method, we must think about the following:
What about the junction points (inbound and outbound): direct impact of routes with those? Indirect? Impact on speed of performance (bottlenecks)?
What is the ROI of this method within context?
Can we validate a route? Can we tie it to being a successful problem eliminator?
How much is the method [domain] context dependent?
Can we hook it to Test Process Improvement methods or other Key Performance/Area Indicators?
Other?
Food for Thought…
Time for discussion…
“It is not the strongest of the species that survives, nor the most intelligent but the one that is most responsive to change”
Charles Darwin
A changing world…
Or perhaps . . . the one who had anticipated all possible requirements!
Root Cause Analysis in Testing