Quantitative Methods – Unit QUAN
MSc SQM\QUAN Page 1 of 48 Session 5 ©University of Portsmouth
Unit QUAN - Session 5
Design of Experiments
MSc Strategic Quality Management
Quantitative Methods Unit - QUAN
DESIGN OF EXPERIMENTS
Aims of Session
To define the purposes of experiments and show how to design and analyse simple
experiments.
To discuss Taguchi’s approach to quality improvement.
Learning Approach
Study Notes, Required and Recommended Reading, Questions to Stimulate Your Thinking,
Self-Appraisal Exercises.
Content
Traditional One-Factor-at-a-time Approach to Experimentation. Factorial Experiments at
two levels. Fractional factorial experiments at two levels. Introduction to ANOVA and
statistical significance. Taguchi's Approach to Quality and Quality Engineering System.
Reading
There is a good overview in Dale (2003), and more detail in Hoerl and Snee (2002). Further
reading from Antony and Kaye (1996 and 2000), Antony and Preece (2002), Ayres (2007)*,
Mitra (1993 or later edition), Taguchi (1986), Wood (2003).
* Ayres, Ian (2007). Super crunchers: how anything can be predicted. London: John Murray.
Revised December 2009
DESIGN OF EXPERIMENTS
Introduction
Scientists perform experiments to increase their understanding of a particular phenomenon.
Having performed the experiment, the scientist will then attempt to draw inferences from
experimental results to test a hypothesis, or measure something, about the phenomenon being
studied. Thus an experiment is a series of tests performed to discover an unknown effect or
establish a hypothesis. But why do we need to perform experiments at all?
As an example, consider a plastic injection moulding process which may create parts that
shrink too much. We cannot directly observe what is occurring to cause the shrinkage.
Experience and reference works may tell us that several factors may be responsible, such as
mould temperature, injection speed, type of plastic resin used, and so on. An experiment will
determine which of these factors affect shrinkage the most.
What is Experimental Design?
Experimental design (or DoE) is an advanced statistical approach for studying the effect of
various factors (variables) on the product/process performance. It helps one to determine
which factors (variables) in a process are most important, how they interact, and what the
optimum settings are for those factors (variables).
Design of experiments can be used:
To determine the optimal conditions of a process.
To reduce excessive variability in manufacturing processes.
To rapidly understand the process behaviour.
To improve the customer satisfaction by producing high quality products at low costs.
To improve performance characteristics of products.
To see if a new medical procedure or teaching approach is an improvement.
Designed experiments are useful for helping to improve manufacturing and service processes.
This session will focus on manufacturing because it is possible to design relatively complex,
but very useful, experiments in this environment. The underlying reason for this is that it is
relatively easy to manipulate many of the details of a manufacturing process to see what has a
beneficial impact. This is typically more difficult for processes producing a service for a
human customer. The Taguchi method, to which we turn at the end of this unit, was designed
primarily for manufacturing processes.
However, designed experiments are also very useful in many other areas – trials to evaluate
the effectiveness of drugs, experiments to evaluate the impact of educational reforms or of
different social policies, the effectiveness of different web designs, and so on. Ayres (2007)
gives a non-technical, and very enthusiastic, account of several examples where designed
experiments have proved their worth. Some of the issues relevant to non-manufacturing
examples are explored in Tasks 4 and 5 in the final Self-Appraisal Exercise of this session.
Traditional One-Factor-at-a-time Approach to Experimentation
This is the simplest type of experiment. It involves varying one factor or variable, keeping all
other factors (or variables) in the experiment fixed. For instance, consider three factors (say,
A, B and C) each of which can be at one of two levels: level 1 and level 2 (a level is a value
or setting of a factor – in the example below the factor "mixer speed" can be set at two levels,
either 60 rpm or 80 rpm) as shown in Table 1. In the first experimental trial, we keep all
factors at level 1. In the second trial, only the level of factor A is changed to
level 2, keeping the levels of factors B and C constant (i.e. at level 1).
The difference in the results between these two experimental runs in Table 1 provides an
estimate of the effect of A. An effect here refers to the change in output (e.g., thickness,
weight, efficiency, strength, etc.) which we measure during the experiment due to the change
in factor levels (i.e., level 1 to level 2). This effect has been estimated when factors B and C
were at levels B (1) and C (1) and therefore there is no guarantee whatsoever that A will have
the same effect when the conditions of other factors change. We can do just the same thing
with Factor B (trial 3) and Factor C (trial 4).
The one-factor-at-a-time approach of experimentation can be misleading and often leads to
wrong conclusions. To obtain a more realistic answer, we need to find out the effect of each
factor in conjunction with different levels of the other factors. We can achieve this by
designing an experiment where we change the levels of factors simultaneously to study their
effect on output. Sir Ronald A Fisher in the early 1920s developed some methods for
effective experimentation which were a fundamental break from the old scientific tradition of
varying only one-factor-at-a-time. His initial experiments were concerned with determining
the effects of various fertilisers on plots of ground. Fisher used methods of statistical
experimental design and analysis to draw conclusions about the effect of each fertiliser on the
final condition of the crop. More recently, these experimental design techniques have been
widely accepted in manufacturing organisations for improving product and process
performance. We will explain the basic ideas using the scenario below as an illustration.
Experimental trial or run    Factor A    Factor B    Factor C
1                            1           1           1
2                            2           1           1
3                            1           2           1
4                            1           1           2
Table 1 – One-factor-at-a-time method
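The layout of Table 1 follows a simple rule: start with every factor at level 1, then raise one factor at a time to level 2. A minimal Python sketch of that rule is given here for illustration only; the function name `one_factor_at_a_time` is our own and is not part of the unit materials.

```python
def one_factor_at_a_time(n_factors):
    """Return the trials of a one-factor-at-a-time plan, levels coded 1 and 2."""
    baseline = [1] * n_factors
    trials = [tuple(baseline)]          # trial 1: every factor at level 1
    for i in range(n_factors):
        trial = baseline.copy()
        trial[i] = 2                    # change only factor i to level 2
        trials.append(tuple(trial))
    return trials


for number, levels in enumerate(one_factor_at_a_time(3), start=1):
    print(number, *levels)
```

For three factors this reproduces the four trials of Table 1. Only four of the eight possible combinations of levels are ever tried, which is why this approach cannot show how the factors behave in combination.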
Scenario
Suppose we are interested in finding the yield of a chemical process at two temperatures, say
T0 and T1, and at two pressures, P0 and P1.
We will first use the one-factor-at-a-time approach. The first step is to keep the temperature
constant (T0) and vary the pressure from P0 to P1. The experiment was repeated twice and
the average yield was calculated.
The following results were obtained.
Temperature Pressure Average yield (%)
T0 P0 51
T0 P1 61
The next step is to vary the temperature from T0 to T1, keeping the pressure constant (P0).
The following results were recorded.
Temperature Pressure Average yield (%)
T0 P0 51
T1 P0 58
Here we have the average yield values corresponding to only three combinations of
temperature and pressure: (T0, P0), (T1, P0) and (T0, P1). The experimenter concluded from
the above data that the maximum yield of the chemical process is attained at
(T0, P1). But the question then arises: what would the average yield be at
(T1, P1)?
This experiment fails to answer this question. The difficulty is that there may be an
interaction between the two factors. Two factors are said to interact with each other, if the
effect of one factor on the output depends on the levels of the other factor. In the present
example, if there is no interaction between temperature and pressure, then the output graphs
(Figure 1) at different levels of pressure will be parallel. Non-parallel lines show the
presence of interaction between two factors. Suppose the yield corresponding to the untested
combination (T1, P1) is about 80%. The interaction graph for the present example is shown in
Figure 1. Here the lines are non-parallel and therefore we can conclude that there is an interaction
between temperature and pressure: the effect of temperature on the yield of the chemical process
depends on the level of pressure. To be more precise, raising the temperature increases the yield
by 7 percentage points (58% - 51%) at the lower pressure, but by 19 percentage points
(80% - 61%) at the higher pressure.
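The comparison just made can be written out as a short Python sketch. It uses the three measured yields from the scenario plus the supposed 80% for the untested combination; the dictionary layout and the function name `temperature_effect` are illustrative choices of ours.

```python
# Average yields (%) from the scenario; the (T1, P1) value is the supposed 80%.
yields = {
    ("T0", "P0"): 51,
    ("T0", "P1"): 61,
    ("T1", "P0"): 58,
    ("T1", "P1"): 80,   # untested combination, value assumed in the text
}


def temperature_effect(pressure):
    """Change in yield when temperature goes from T0 to T1 at a fixed pressure."""
    return yields[("T1", pressure)] - yields[("T0", pressure)]


effect_at_p0 = temperature_effect("P0")
effect_at_p1 = temperature_effect("P1")
print("Effect of temperature at P0:", effect_at_p0)   # 7
print("Effect of temperature at P1:", effect_at_p1)   # 19
print("Effects differ (interaction):", effect_at_p0 != effect_at_p1)
```

If the two effects were equal, the lines in the interaction graph would be parallel.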
Figure 1: Example of an Interaction Graph
For complex manufacturing processes in today’s industrial environment, interactions play an
important role and therefore should be studied for achieving sound experimental conclusions.
Therefore one may use the factorial experiments recommended by Fisher so that both main
factor effects (i.e., the effects of temperature and pressure on the yield) and interaction effects can
be studied.
Factorial experiments can be of two types: full factorial experiments and fractional factorial
experiments.
Full Factorial Experiments
A full factorial experiment is an experiment which enables one to study all possible
combinations of factor levels. In a full factorial experiment the experimenter varies all
factors simultaneously, which permits the evaluation of interaction effects. Two level
experiments are the most widely used factorial experiments in industry.
Full Factorial Experiments at two levels
The full factorial experiment at two levels is generally represented by 2^k, where 2 stands for
the number of levels (a high level and a low level) and k for the number of factors to be studied.
For example, if the number of factors to be studied is 3, then one may select an eight run
experiment (based on 2^3 = 8). When the number of factors (k) in an experiment is five or
more, it is recommended to select a fractional factorial design where we study only a
fraction of the full factorial experiment. But there are some limitations in using fractional
factorial designs as explained below.
Design matrix for a full factorial experiment: A design matrix shows all the
possible factor level combinations to be studied. The simplest is the four run (i.e. 2^2)
experiment, followed by eight run, sixteen run, etc. The design matrix for a four run experiment is
shown in Table 2. The lower level is denoted by 1 and the higher level by 2. Using a four
run experiment, we can study two factors at two levels and the interaction between
them.
Experimental run A B
1 1 1
2 2 1
3 1 2
4 2 2
Table 2 – A 2^2 Full Factorial Experiment
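A design matrix of this kind can be generated mechanically for any number of factors. The following Python sketch is one way to do it, assuming levels coded 1 and 2; the function name `full_factorial` is ours, and the run order is chosen to match Table 2 (the first factor changes on every run).

```python
from itertools import product


def full_factorial(n_factors, levels=(1, 2)):
    """List every combination of levels, with the first factor changing fastest."""
    runs = []
    for combo in product(levels, repeat=n_factors):
        # product() varies the last position fastest, so reverse each tuple
        # to get the run order used in Table 2 (A changes on every run).
        runs.append(tuple(reversed(combo)))
    return runs


for run, levels in enumerate(full_factorial(2), start=1):
    print(run, *levels)

print(len(full_factorial(3)))   # 8 runs for three factors
```

The number of runs doubles with each extra factor, which is the practical reason for turning to fractional designs when there are many factors.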
Now the question is how do we calculate the effects of A and B, and see if there is an
interaction between them? This can be demonstrated by the following scenario. But before
we look at this we will briefly recap the ideas of factors, levels and how the levels are coded.
Factors, levels and coding, and response (performance) variables
The response (performance) variable is the outcome that we are interested in changing or
controlling (yield in the example below).
Factors are things (variables) which we think might influence the response variable.
Each factor is set at two or more levels (in the example below one factor which is thought to
have an influence on yield is the mixer speed – which is set at two levels: 60 and 80 rpm).
It is often convenient to code these levels. With two levels there are three standard coding
schemes:
1 and 2 (usually 1 for the lower and 2 for the higher if they are numerical levels)
0 and 1
-1 and +1
For non-mathematical work (i.e. if we don’t multiply or subtract the codes) it does not matter
which coding scheme is used. The first (1 and 2) has the advantage that it can be extended to
3 or 4 or more levels. For more mathematical work (which goes beyond the scope of this
unit), 0 and 1 are a convenient choice if we are not interested in interactions, and -1 and +1
are more convenient if we want to analyse interactions.
Scenario
The yield of a chemical reaction was thought to be a function of mixer speed and
formulation. A four run, two-level experiment was conducted based on these two factors. The
factors and their levels are listed in Table 3.
Factors Level 1 Level 2
Formulation A B
Mixer speed 60 rpm 80 rpm
Table 3 – List of Factors and their Levels
The response (or output) values are recorded in the response table (Table 4) as shown below.
Here there are two factors, each kept at two levels. Therefore we can select a 2^2 full factorial
experiment to study the two main effects and the interaction between them. The main effects
can be determined by the following equation:
Main Effect of a factor = Average response at high level of the factor – Average
response at low level of the factor
Experimental condition    F     M     Yield (output)
1                         -1    -1    82
2                         +1    -1    93
3                         -1    +1    80
4                         +1    +1    88
The “higher level” of each factor is coded as +1 and the lower level as -1 in this table,
instead of 2 for the higher level and 1 for the lower level as in Table 2.
Table 4 – Response Table
Having obtained the response values, the next immediate step is to analyse and interpret the
data.
Calculation of Main effects:
Average response (i.e. yield) at low level of Formulation = (82 + 80)/2 = 81
Average response at high level of Formulation = (93 + 88)/2 = 90.5
Therefore effect of factor F = 90.5 – 81 = 9.5
Similarly, effect of factor M = 84 – 87.5 = -3.5 (negative effect implies that the average
response is higher at the lower level of the factor)
This, of course, assumes that the effect of each factor does not depend substantially on the
level of the other factor. The way to check if this condition is reasonable, or whether there is
an interaction between the factors, is to look at the interaction plot below.
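The main-effect arithmetic for Table 4 can be sketched in Python as follows. The data are the four yields from Table 4; the function name `main_effect` and the tuple layout are illustrative choices of ours.

```python
# Table 4: (F level, M level, yield), with levels coded -1 (low) and +1 (high).
runs = [
    (-1, -1, 82),
    (+1, -1, 93),
    (-1, +1, 80),
    (+1, +1, 88),
]


def main_effect(runs, factor_index):
    """Average response at the high level minus average response at the low level."""
    high = [r[-1] for r in runs if r[factor_index] == +1]
    low = [r[-1] for r in runs if r[factor_index] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


effect_f = main_effect(runs, 0)
effect_m = main_effect(runs, 1)
print("Main effect of F:", effect_f)   # 9.5
print("Main effect of M:", effect_m)   # -3.5
```

Each main effect averages over both levels of the other factor, which is why it is only a fair summary when the interaction is small.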
Main effects plot
Figure 2 illustrates the main effects plot for the experiment. A main effects plot simply shows
the average response values at high and low levels of a factor. It is a powerful graphical tool
to determine the importance of a factor effect. The power of this tool increases when we have
to study a large number of factors for an experiment. The strength of the effect depends on
the slope of the line.
Figure 2: Main Effects Plot for Yield Experiment (average yield, on a scale from 82 to 90,
plotted against the -1 and +1 levels of Formulation and of Mixer speed)
Interaction plot
An interaction plot is a powerful graphical tool to determine whether there is any interaction
between the factors. It plots the average response values corresponding to each combination
of factor levels. In the above example, it plots the average yield values at the four factor level
combinations. Figure 3 illustrates the interaction plot between Formulation and Mixer speed.
As the lines in the plot are almost parallel, there is not a large interaction between the factors.
The effect of Factor F is similar at both levels of Factor M.
This graph is produced by the spreadsheet at
http://userweb.port.ac.uk/~woodm/interaction.xls . This is interactive and you should be able
to change the response variable in the green cells and see what the interaction graph looks
like.
Figure 3 Interaction Plot Between Mixer Speed and Formulation (response variable plotted
against Factor F at -1 and +1, with one line for Factor M = -1 and one for Factor M = +1)
To see what a strong interaction looks like, consider Table 4a. This leads to the interaction
graph in Figure 4. The two lines here are not parallel: for the high level of Factor M (+1) the
high level of Factor F has a lower yield, but the opposite is true for the low level of Factor M.
This interaction between the two factors means that the effect of one depends on the level of
the other, so the main effects are of less interest because they average across the two levels of
the other factor (although you can still work them out – the main effect of F is +2.5 and of M
is -6.5).
Experimental condition    Factor F    Factor M    Response variable: yield (output)
1                         -1          -1          82
2                         +1          -1          93
3                         -1          +1          84
4                         +1          +1          78
Table 4a – Another response table for mixer speed and formulation
Figure 4 Interaction Plot Between Mixer Speed and Formulation for Table 4a (response variable
plotted against Factor F at -1 and +1, with one line for Factor M = -1 and one for Factor M = +1)
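What the two interaction plots show can also be checked numerically by working out the effect of F separately at each level of M. The Python sketch below does this for Tables 4 and 4a; the function name `effect_of_f_at` is our own label, not a term used elsewhere in the unit.

```python
# Yields keyed by (F level, M level), coded -1 (low) and +1 (high).
table_4 = {(-1, -1): 82, (+1, -1): 93, (-1, +1): 80, (+1, +1): 88}
table_4a = {(-1, -1): 82, (+1, -1): 93, (-1, +1): 84, (+1, +1): 78}


def effect_of_f_at(table, m_level):
    """Change in yield when F goes from -1 to +1 with M held at m_level."""
    return table[(+1, m_level)] - table[(-1, m_level)]


for name, table in [("Table 4", table_4), ("Table 4a", table_4a)]:
    at_low_m = effect_of_f_at(table, -1)
    at_high_m = effect_of_f_at(table, +1)
    print(name, "- effect of F at M = -1:", at_low_m, "and at M = +1:", at_high_m)
```

For Table 4 the two figures are 11 and 8, which are similar, so the lines are almost parallel. For Table 4a they are 11 and -6, which have opposite signs, so the lines cross.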
Replication or Repetition
This is the process of repeating the experimental trials to improve the accuracy of
experimentation. The analysis procedure is similar to experiments without replication. As an
example, suppose that each of the runs in Table 4 was replicated an additional two times
giving the results in Table 5. Note that the three figures in the final column correspond to the
yield from three runs with the same factor levels.
Experimental condition    F    M    Yield (output)
1                         1    1    82, 84, 82
2                         2    1    93, 92, 93
3                         1    2    80, 80, 81
4                         2    2    88, 86, 87
Table 5 – Response Table (with three replications)
The main effects and the interactions can now be analysed in just the same way as before,
except that we use the average of the three yields at each combination of factor levels in place
of a single value. For example, the average yield for the first combination of factor levels is
82.7.
Statistical significance
One difficulty with experiments like this is that the effects and interactions observed may be
due to essentially random factors. Are the results real? Or would the next set of trials yield
very different results?
Intuitively the answers to these questions often seem fairly clear – although this is an area
where intuition may be an unreliable guide. For example, in Table 5, the three
experimental runs under each experimental condition are very similar, whereas the runs under
different conditions are very different. This all suggests that further replications would lead to
the same pattern.
As a contrast, if the three replications for the first experimental condition had been 83, 42,
123, and the remaining nine runs had been similarly varied, we would then have had much
less confidence in conclusions based on the average yield from three runs. Table 5a below
gives an example of this. The numbers are arranged so that the means of the three replications for
each experimental condition are essentially the same as those for Table 5. This means that the main
effects and the interaction will be much the same as for Table 5. However, the fact that the data
are so much more variable means that the conclusions about the main effects and interaction
are obviously far less certain.
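The contrast between the two tables can be made concrete by computing, for each experimental condition, the mean and the standard deviation of its three replications. The Python sketch below does this with the figures as printed in Tables 5 and 5a; the function name `summarise` and the dictionary layout are our own, and the standard deviation is used here simply as a convenient measure of spread.

```python
from statistics import mean, stdev

# Yields for each (F, M) condition, three replications each, as printed in the tables.
table_5 = {
    ("F1", "M1"): [82, 84, 82],
    ("F2", "M1"): [93, 92, 93],
    ("F1", "M2"): [80, 80, 81],
    ("F2", "M2"): [88, 86, 87],
}
table_5a = {
    ("F1", "M1"): [83, 42, 123],
    ("F2", "M1"): [100, 93, 84],
    ("F1", "M2"): [80, 90, 71],
    ("F2", "M2"): [128, 46, 87],
}


def summarise(table):
    """Return {condition: (mean, sample standard deviation)} for each condition."""
    return {cond: (mean(ys), stdev(ys)) for cond, ys in table.items()}


for name, table in [("Table 5", table_5), ("Table 5a", table_5a)]:
    print(name)
    for cond, (m, s) in summarise(table).items():
        print(f"  {cond}: mean = {m:.1f}, standard deviation = {s:.1f}")
```

With the figures as printed, the condition means of the two tables agree to within about 0.3, but the standard deviations are about 1 or less for Table 5 and range from about 8 to about 41 for Table 5a. An analysis of variance formalises exactly this comparison between the size of the effects and the variation within each condition.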
Table 5a. Response table with the same main effects and interactions as Table 5, but
with more variable data
The usual way this difference between Tables 5 and 5a is assessed statistically is by means of
an analysis of variance – the abbreviation ANOVA is widely used for this. This is a very
technical area of statistics, and we will only give a very brief account here.
Table 5a is in a different format from Table 5 because the format of Table 5a is the format
Excel needs to perform a two factor ANOVA with replication (which you will find under Data
Analysis in the Tools menu). The output from this procedure is detailed and technical, and
will not be covered here with the exception of the p-values. These are the main answers
produced by an analysis of variance. They are:
Table 5: p-value for main effect F = 0.000
p-value for main effect M = 0.000
p-value for interaction = 0.017
Table 5a p-value for main effect F = 0.644
p-value for main effect M = 0.827
p-value for interaction = 0.931
These p-values are the probabilities of getting results "like" the actual results if there were no
effect and the pattern in the data is just due to chance. So, for Table 5, if there really were no
effect, and the results were just due to chance, the results obtained would be very unlikely
indeed – and this is reflected in the low p-values. On the other hand, for Table 5a, if there
really were no effect and the results were due to chance, the results obtained are entirely
plausible – which is reflected in the much higher p-values.
Notice that the interpretation of p-values is the opposite of what you might expect. The lower
the p-value, the stronger the evidence that the effect is real (and not simply a matter of
chance). Conventionally, 5% is often taken as the dividing line: p-values less than 5% are
described as significant (i.e. signifying a genuine effect), and those more than 5% are not
significant (i.e. the evidence is not strong enough). All three p-values for Table 5 are
significant (p<0.05 for all three), but none of the p-values for Table 5a are significant.
Table 5a data (three replications in each cell):
        M 1    M 2
F 1      83     80
         42     90
        123     71
F 2     100    128
         93     46
         84     87
Other significance tests
There are many other methods for working out these p-values in a variety of different
situations. The general approach is known as null hypothesis testing, or significance testing
(significance levels are another term for p-values). (The null hypothesis in the example above
is that there are no real differences and that any observed differences are just due to chance.)
The result is always a p-value, and its interpretation is as described in the example above.
Significance (null hypothesis) tests are important in many other contexts besides designing
experiments. The underlying concept is very similar to the idea underlying control charts (see
Session 2), with the null hypothesis being the "in control" state of the process.
As a second example, consider Table 5b below.
Table 5b. Correlations between 5 variables in the drink data
AGE SATUNITS SUNUNITS MONUNITS DAYCIGS
AGE Pearson Correlation 1 -0.288 -0.130 -0.348 -0.097
Sig. level (p-value) 0.005 0.218 0.001 0.359
N 92 92 92 92 92
SATUNITS Pearson Correlation -0.288 1.000 0.591 0.729 0.554
Sig. level (p-value) 0.005 0.000 0.000 0.000
N 92 92 92 92 92
SUNUNITS Pearson Correlation -0.130 0.591 1.000 0.649 0.759
Sig. level (p-value) 0.218 0.000 0.000 0.000
N 92 92 92 92 92
MONUNITS Pearson Correlation -0.348 0.729 0.649 1.000 0.480
Sig. level (p-value) 0.001 0.000 0.000 0.000
N 92 92 92 92 92
DAYCIGS Pearson Correlation -0.097 0.554 0.759 0.480 1.000
Sig. level (p-value) 0.359 0.000 0.000 0.000
N 92 92 92 92 92
The data on which this is based (at http://userweb.port.ac.uk/~woodm/nms/drink.xls) includes
answers from 92 students to questions about their age, how many units of alcohol they had
drunk the previous Saturday, Sunday and Monday, and the average number of cigarettes a
day they smoked. The (Pearson) correlations indicate the relationship between the variables:
the interpretation of these is explained in Session 2.
The significance levels, or p-values, are based on the null hypotheses that all the correlations
are actually zero and there are no relationships between the data. Low p-values mean that this
hypothesis is not plausible, so the null hypothesis should be rejected and we should conclude
that the correlation is genuine. For example, the correlation between SATUNITS and
DAYCIGS is 0.554. This suggests a reasonably strong tendency for people who drink a lot on
Saturday night to be among the heavier smokers. The p-value for this is 0.000, which is very
low indicating a significant result – this means that this is not likely to be a chance effect but
that the correlation is genuine. On the other hand the correlation between AGE and
DAYCIGS is -0.097. This suggests a slight tendency for older people to smoke less – but the
high value of the significance level (0.359) suggests that the null hypothesis is plausible and
this could well be a chance effect.
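The unit does not say how the p-values in Table 5b were calculated. One standard route, assumed here purely for illustration, is to convert each correlation r and the sample size n into a t statistic with n - 2 degrees of freedom; the further from zero the t statistic, the lower the p-value. The Python sketch below applies this to the two correlations discussed above; the function name `correlation_t_statistic` is our own.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for testing the null hypothesis of zero correlation."""
    return r * sqrt((n - 2) / (1 - r * r))


n = 92  # students in the drink data
for label, r in [("SATUNITS vs DAYCIGS", 0.554), ("AGE vs DAYCIGS", -0.097)]:
    t = correlation_t_statistic(r, n)
    print(f"{label}: r = {r}, t = {t:.2f}")
```

The first statistic comes out at about 6.3 and the second at about -0.9. As a rough guide for a sample of this size, a t statistic further from zero than about 2 corresponds to a p-value below 5%, which is consistent with the first correlation being significant and the second not. Turning a t statistic into an exact p-value needs the t distribution from a statistics package and is not attempted here.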
As a final example, let’s see how significance testing applies to some research on customer
service in two different kinds of financial institution: banks and building societies
(McGoldrick and Greenland, Competition between banks and building societies. British
Journal of Management, 3, 169-172, 1992). The data in Table 5c were obtained from a sample
of customers who rated each institution on a scale ranging from 1 (very bad) to 9 (very good).
The six dimensions shown are a selection from the 22 reported in the paper. NS means not
significant – which in this table means that the p value is greater than 0.1. (This is a bit unusual:
as mentioned above this level would normally be 5%.)
Table 5c. Customer service ratings from McGoldrick and Greenland (1992)
Aspect of service             Banks' mean    Building societies'    Level of
                              rating         mean rating            significance (p)
Sympathetic/understanding     6.046          6.389                  0.000
Helpful/friendly staff        6.495          6.978                  0.000
Not too pushy                 6.397          6.644                  0.003
Time for decisions            6.734          6.865                  0.028
Confidentiality of details    7.834          7.778                  NS
Branch manager available      5.928          6.097                  0.090
Remember: the lower the p value, the more convincing the evidence is against the null
hypothesis. Unfortunately, this is rather counter-intuitive and easily misinterpreted.
Take care!
We will not go into more detail in this unit. If you want to read further on hypothesis testing
there are some more detailed notes at http://userweb.port.ac.uk/~woodm/stats/StatNotes3.pdf,
and you will find more on analysis of variance and other techniques in Hoerl and Snee
(2002), Antony and Kaye (2000), Wood (2003), and on the website at
http://www.statsoft.com/textbook/stathome.html.
QUESTIONS TO STIMULATE YOUR THINKING
Now consider and answer the following questions, the purpose of which is to
test your retention of the information given in the course notes.
Questions
1 Calculate the main effects from the data in Table 5.
2 Now suppose that the yields from the fourth experimental condition in Table 5
were 128, 126, 127 (instead of 88, 86, 87).
Calculate the main effects from this data, and also draw an interaction plot like
Figure 3.
Do the results show an interaction? If so, describe it.
What do you think the p-values for the main effects and the interaction are? (It
is not normally possible to work this out in your head, but in this example the
situation is so clear that you should be able to come up with a good guess.)
3 Can you think of any problems in your home or work life where a factorial
experiment might be useful?
4 What are the null hypotheses being tested in Table 5c? Which of the results are significant
at the 5% level? For which aspects of service is evidence strongest for a difference in
service levels between banks and building societies?
Suggested Answers
1 The average yields for each condition come to 82.7, 92.7, 80.3, 87. The main effect
for F is 8.3 and for M is -4.
2 The main effects are now 28.3 for F, and 16 for M. The lines on the graph are not
parallel, so there is a definite interaction. The effect of one factor depends on the
level of the other.
If we start from the first experimental condition, changing the level of F will
increase the yield by 10. Starting again from this condition, if we change the level
of M, the yield goes down by 2.3. This means we might expect that making both changes
together would lead to an increase of 10 - 2.3 = 7.7. However, this is not so:
changing the levels of both factors together leads to a massive increase of 44.3. The
two factors interact, so we cannot calculate their combined effects by adding the
effects of each factor.
All three p-values are 0.0000 (i.e. zero to four decimal places). The reason is that
the main effects and interaction are large compared with the variation between the
three replications in each experiment condition (the difference between biggest and
smallest is only 1 or 2 for all four experimental conditions). This means that the
results would have been very, very unlikely to be the result of chance, so the
evidence for the main effects and interaction is statistically significant.
3 There are many possible uses—e.g. experiments to find the best recipe for cooking
something, or to find the factors that have an impact on blood pressure (e.g.
exercise, salt consumption, stress level)
4 The null hypotheses are that there are no overall differences in means between
banks and building societies. The first four results are significant at the 5% level.
The last two are not significant – we can’t be sure they are not due to chance. The
evidence for a difference is strongest for the first two (sympathetic/understanding
and helpful/friendly).
Fractional Factorial Experiments at Two Levels
The difficulty with full factorial experiments with a large number of factors is that the
number of experimental runs may become too large. For example, if you want to study seven
factors for a certain experiment and you choose a full factorial experiment, then you have to
perform 2^7 = 128 experimental runs.
The solution is to ignore some of the possible combinations of factors and study only a
fraction of them. This is known as a fractional factorial experiment. For example if you study
one sixteenth of the possible combinations you will only have to study 8 combinations of
factor levels - this is a possibility illustrated by one of the case studies below. The difficulty,
of course, with such highly fractional factorial designs is that you will not be able to
investigate many of the interaction effects. The next scenario illustrates some of the issues.
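The run counts quoted above follow directly from the 2^k rule. A minimal Python sketch of the arithmetic is shown here; the function name `runs_needed` is hypothetical and simply multiplies the size of the full factorial by the fraction chosen.

```python
from fractions import Fraction


def runs_needed(n_factors, fraction=Fraction(1)):
    """Number of runs in a two-level design using the given fraction of 2^k."""
    full = 2 ** n_factors
    runs = full * fraction
    if runs.denominator != 1:
        raise ValueError("fraction does not give a whole number of runs")
    return int(runs)


print(runs_needed(7))                    # 128 runs for the full factorial
print(runs_needed(7, Fraction(1, 16)))   # 8 runs for a one-sixteenth fraction
```

The saving in runs is paid for in information: with only 8 runs for seven factors there are far too few results to estimate all the possible interactions separately.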
Scenario
Baker's paradise is a newly established baking school. Despite continuous efforts, the bakery
had failed to produce cakes which the customers liked. The management was looking for the
combination of ingredients which would produce the nicest cakes. A project was initiated to
study this problem. After a brainstorming session of 2 hours, it was decided that the
experiment would include six factors. The factors which were considered for the experiment
are shown in Table 6. Each factor was kept at two levels and the goal was to determine the
factor-level combination yielding the nicest cakes. A full factorial experiment would require
2^6 = 64 experimental runs. Because of limited time and resources, a fractional factorial
experiment with eight runs was selected. Each run was evaluated by asking a panel of
customers to rate each set of cakes on a 1 (very nasty) to 10 (very nice) scale.
Factors/variables Notation Level 1 Level 2
Milk (cups) M 1/4 1/2
Sugar (cups) S 1/2 3/4
Eggs E 2 3
Flour (cups) F 3/4 1
Oven Temperature (°C) O 200 225
Butter (cups) B 1/4 1/2
Table 6 - List of factors for the Cake Baking Experiment
The mean customer ratings from each run are shown in the response table, Table 7.
(The design in terms of which combinations of factor levels are run is discussed in more
detail in the section on orthogonal arrays below.)
Experimental run M S E F O B Mean customer rating
1 1 1 1 2 2 1 5.5
2 2 1 1 1 2 2 5.8
3 1 2 1 2 1 2 6.5
4 2 2 1 1 1 1 6.0
5 1 1 2 1 1 2 6.2
6 2 1 2 2 1 1 7.2
7 1 2 2 1 2 1 6.2
8 2 2 2 2 2 2 7.3
Table 7 - Response Table for the Cake Baking Experiment
This example is a simplified, artificial one to demonstrate how experimental design works. In
a real situation, you would need to carry out research (involving, perhaps, a brainstorming
session with the people involved with the process) to make sure that:
the list of factors is appropriate, and
the levels are reasonable ones to try (some levels may be obviously
inappropriate – these do not need to be considered in the experiment), and
the response variable – mean customer rating in this example – is a sensible one. It is
possible, for example, that there may be different groups of customers with different
tastes, and it may be more helpful to do a separate analysis for each of these groups.
Calculation of Main effects
These are shown in Table 8. The calculation of the figure for M goes as follows. First we find
the average of the mean customer rating for the high (2) level - i.e. the average of runs 2, 4, 6,
8 (since these have 2 in the first, M, column). Then we do the same for the low (1) level runs
(runs 1, 3, 5 and 7) and subtract the second average from the first. This comes to
0.25(5.8+6.0+7.2+7.3) - 0.25(5.5+6.5+6.2+6.2) = 6.575 - 6.1 = 0.475
This means that, on average, the higher level of M gets a rating 0.475 higher than the low
level. The design has the advantage that the four high level runs include two high and two
low level runs for each of the other factors, which means that the comparison should be fair
with regard to the other factors. This is discussed in more detail in the section on orthogonal
arrays below.
Main effects
Estimate
M +0.475
S +0.325
E +0.775
F +0.575
O -0.275
B +0.225
Table 8 - Main Effects
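The arithmetic behind Table 8 can be reproduced with a short script. The following is a minimal Python sketch, not part of the original unit material: the design and ratings are copied from Table 7, and the name main_effect is chosen here purely for illustration.

```python
# Table 7: levels of M, S, E, F, O, B for runs 1 to 8, and the mean customer rating.
FACTORS = ["M", "S", "E", "F", "O", "B"]
DESIGN = [
    (1, 1, 1, 2, 2, 1),
    (2, 1, 1, 1, 2, 2),
    (1, 2, 1, 2, 1, 2),
    (2, 2, 1, 1, 1, 1),
    (1, 1, 2, 1, 1, 2),
    (2, 1, 2, 2, 1, 1),
    (1, 2, 2, 1, 2, 1),
    (2, 2, 2, 2, 2, 2),
]
RATINGS = [5.5, 5.8, 6.5, 6.0, 6.2, 7.2, 6.2, 7.3]


def main_effect(column):
    """Average rating at level 2 minus average rating at level 1 for one factor."""
    high = [r for row, r in zip(DESIGN, RATINGS) if row[column] == 2]
    low = [r for row, r in zip(DESIGN, RATINGS) if row[column] == 1]
    return sum(high) / len(high) - sum(low) / len(low)


effects = {name: main_effect(i) for i, name in enumerate(FACTORS)}
for name, effect in effects.items():
    print(f"{name}: {effect:+.3f}")
```

Because every factor has four runs at each level, the same function works for all six columns without any adjustment.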
Table 8 suggests that the higher level produces the higher average rating for all factors except
O. For O the lower temperature produces the best results. The biggest effect is achieved by
the Eggs factor: including an extra egg makes a bigger difference than increasing the
quantities of the other ingredients.
This suggests that the best results would be achieved by setting all the factors at the higher
level, except for O, which should be set at the lower level. This combination, however, is not
included in the experimental runs which have been performed. (Remember this is a fractional
factorial experiment, so we have not tried all possible combinations.)
Despite this it is possible to make a crude prediction of the rating that would result from this
optimum combination. The closest of the actual runs is Run 8 which produced a mean rating
of 7.3. This run differs just in the level of O, so, as the effect of O is -0.275, a crude
prediction for the optimum combination would be 7.3 + 0.275 = 7.575. A slightly more
robust estimate can be produced by starting from the average of all the mean ratings, and then
adding or subtracting the effect of each factor - this is explained in the section on Taguchi’s
approach below (in the subsection on Prediction of the mean and sd for the optimum factor
settings). However, either method will produce a prediction which may well be wrong: the
problems, and what we can do about these problems, are discussed in the next two sections.
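The two predictions described above can be sketched in Python as follows. This is an illustration only: it assumes that each factor moves the rating by half of its Table 8 effect either side of the overall average (the approach used in the tile example later in these notes), and the variable names are invented for the sketch.

```python
# Main effects from Table 8 (level 2 average minus level 1 average).
EFFECTS = {"M": 0.475, "S": 0.325, "E": 0.775, "F": 0.575, "O": -0.275, "B": 0.225}
RATINGS = [5.5, 5.8, 6.5, 6.0, 6.2, 7.2, 6.2, 7.3]  # Table 7, runs 1 to 8

# Preferred level for each factor: the level with the higher average rating.
best_level = {name: (2 if effect > 0 else 1) for name, effect in EFFECTS.items()}

# Crude prediction: start from Run 8 (every factor at level 2) and
# remove the effect of O, which is the only factor that needs to change.
crude = RATINGS[7] - EFFECTS["O"]

# Additive prediction: overall average plus or minus half of each effect,
# depending on which level is chosen (assumption stated in the lead-in).
grand_mean = sum(RATINGS) / len(RATINGS)
additive = grand_mean + sum(
    (0.5 if best_level[name] == 2 else -0.5) * effect
    for name, effect in EFFECTS.items()
)

print(best_level)
print(f"crude prediction:    {crude:.3f}")
print(f"additive prediction: {additive:.3f}")
```

Both figures are only predictions: neither allows for interactions or chance variation, which is why the checks discussed next matter.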
Confirmation run
It is possible that increasing the quantities of all ingredients may not be as effective as our
results suggest. The obvious way of finding out is to do a confirmation run to verify that this
combination does in fact produce a mean rating close to the prediction (or at least better than
any of the combinations which have been tried in the experiment).
If this confirmation run were to be done, and the mean rating was 7.8, this would support the
idea that this combination is the best. If, on the other hand, the mean rating were only 6.0,
this would suggest that this conclusion is wrong and that we need to perform a more detailed
experiment in which we try more combinations of factor levels.
Interactions and statistical significance
It is tempting to try to get some idea of two way interactions from the results in Table 7.
However, this is not usually a good idea. To see why not, consider the factors S and O.
There are four runs with low levels of S, and of these two have low levels of O, and two have
high levels. The mean rating from the two with low levels of O (runs 5 and 6) is 6.7, and the
mean from two with high levels of O (1 and 2) is 5.65. This suggests that if the level of S is
low, O has a negative effect of -1.05 (i.e. the higher rating is obtained from the lower level).
Similar arithmetic with the four runs with the higher levels of S produces a positive effect of
+0.5.
This means that the effect of O seems to depend on the value of S. There seems to be an
interaction between the two factors. A diagram like Figure 3 would show two lines which are
not parallel; in fact one will slope upwards and the other downwards.
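The figures in this comparison can be checked with a few lines of Python. This is a sketch using the S and O columns and the ratings from Table 7; the helper name mean_rating is illustrative.

```python
# S and O columns and mean ratings from Table 7, runs 1 to 8.
S = [1, 1, 2, 2, 1, 1, 2, 2]
O = [2, 2, 1, 1, 1, 1, 2, 2]
RATINGS = [5.5, 5.8, 6.5, 6.0, 6.2, 7.2, 6.2, 7.3]


def mean_rating(s_level, o_level):
    """Average rating over the runs with the given levels of S and O."""
    values = [r for s, o, r in zip(S, O, RATINGS) if s == s_level and o == o_level]
    return sum(values) / len(values)


# Effect of O (level 2 minus level 1), worked out separately for each level of S.
effect_of_o = {s: mean_rating(s, 2) - mean_rating(s, 1) for s in (1, 2)}
for s_level, effect in effect_of_o.items():
    print(f"S at level {s_level}: effect of O = {effect:+.2f}")
```

Each of the four averages rests on only two runs, which is one reason for treating the apparent interaction with caution.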
However, we should be very cautious about this result. The difficulty is that what appears to
be an interaction could simply be the effect of Factor E. This factor appears to have a strong
positive effect because the average of the last four experimental runs is higher than the
average of the first four. We cannot distinguish between the main effect of E and an
interaction between S and O. (If we did want to find out about the interaction we should
avoid having a Factor E, and use this column for the interaction – see, for example, Mitra,
1993, p. 531.)
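One way to see why the two cannot be distinguished is to code level 1 as -1 and level 2 as +1 (a common convention, assumed here rather than taken from these notes) and multiply the S and O columns together. The sketch below shows that the resulting interaction column is identical to the E column in Table 7.

```python
# S, O and E columns from Table 7, runs 1 to 8.
S = [1, 1, 2, 2, 1, 1, 2, 2]
O = [2, 2, 1, 1, 1, 1, 2, 2]
E = [1, 1, 1, 1, 2, 2, 2, 2]


def coded(levels):
    """Recode levels 1 and 2 as -1 and +1."""
    return [1 if level == 2 else -1 for level in levels]


# The S x O interaction column is the run-by-run product of the coded columns.
s_by_o = [s * o for s, o in zip(coded(S), coded(O))]

print("S x O:", s_by_o)
print("E    :", coded(E))
print("Identical columns:", s_by_o == coded(E))
```

Since the two columns match run for run, any contrast computed from one is automatically the contrast computed from the other.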
Fractional experiments like this are mainly for assessing the main effects; we must be very
careful about drawing conclusions about interactions from them. To investigate interactions
properly we need a full factorial experiment.
However, it is possible that interactions like this do exist, so we must be cautious about
drawing conclusions from the main effects, because these are an overall average which may
not be very useful if the effect varies according to the values of the other variables. This is one
of the reasons why the confirmation run is so important.
The other reason why we should be cautious of the result and should do a confirmation run to
check it is that the amount of data is very limited, and the differences we have found are
fairly small, so there is a possibility that the patterns we found in the experiment are due to
chance. Perhaps if we took another sample the results would be different? The formal way of
checking whether this is plausible is to use an analysis of variance to estimate the statistical
significance of the results – see the section on Statistical significance above. A less formal
method of checking would be to do one or more confirmation runs. If the results are a matter
of chance they are unlikely to recur in the confirmation runs.
Taguchi Methods for Quality Improvement
Introduction
Traditionally the role of quality control has been that of eliminating defective products by
means of statistical sampling and inspection rules, and improving the process by means of
SPC. As a result of an increased emphasis on quality products and cost effectiveness in
modern complex manufacturing systems, the scope of the quality control process was
significantly extended during the 1980s. One of the most outstanding contributions in this
area can be attributed to the Japanese engineer Dr Genichi Taguchi, an international
consultant in the field of quality management. Taguchi has formulated both a philosophy and
a methodology for the process of quality improvement that depends on statistical concepts,
especially the application of statistical experiments to the process of designing products (see
Taguchi, 1986).
The Design Process
The goal of experimentation in manufacturing is to devise ways of minimizing the deviation
of a quality characteristic from its target value. This can be done by identifying those factors
which impact the quality characteristic in question and by changing the appropriate factor
levels so that the deviations are minimized and the quality characteristic is on target. In other
words, from a quality perspective, experimentation seeks to determine the best material, the
best pressure, the best temperature, chemical formulation, cycle time, etc. which will operate
together within a process to produce a desired quality characteristic such as length, durability,
etc.
Taguchi’s approach to the design of experiments utilizes robust design, which can be applied
to a wide variety of problems. Robust Design adds a new dimension to statistical
experimental design by explicitly addressing the concerns faced by all process and product
designers, such as:
how to reduce economically the variation of a product’s function in the customer’s
environment, and
how to ensure that decisions found to be optimum during laboratory experiments will
prove to be so in manufacturing and in customer environments.
The objective of engineering design
The objective of engineering design, a major part of research and development, is to produce
drawings, specifications, and other relevant information needed to manufacture products that
consistently meet customer requirements. Knowledge of scientific phenomena and past
engineering experience with similar product designs and manufacturing processes form the
basis of the engineering design activity. However, when a number of new decisions related to
a product must be made regarding process and product architecture, parameters of the
manufacturing process and the functional characteristics of a product, it becomes necessary to
engineer the whole research and development in a concurrent way. Additionally, a large
amount of engineering effort is always expended in conducting research and development
(either with hardware or software, by experimentation or simulation) to generate the
information needed to guide these decisions. Therefore, efficiency in generating such
information is the key to meeting market requirements, keeping product development and
manufacturing costs low while attaining high-quality products. Robust design is an
engineering methodology for improving productivity during research and development so
that high-quality products can be produced quickly and at low cost.
Variability due to noise factors
The factors that cause variability in a product’s proper functioning are called noise factors.
Such factors cause, for example, the brightness of a fluorescent lamp to vary with power
supply and voltage, to deteriorate over time, as well as to vary between different lamps. There
are three main types of noise:
external noise,
internal noise, and
unit-to-unit noise.
1. External noise (Ambient noise)
External noise refers to factors in the environment or conditions of use that influence the
ideal functioning of a product. Examples of environmental noise factors are ambient
temperature, humidity, dust, supply voltage, electromagnetic interference, vibrations and
human error in operating a product.
2. Internal noise (Deterioration noise)
Internal noise refers to factors that cause a product to deteriorate during storage or to wear out
during use so that it can no longer achieve its target functions. Examples of internal noise
factors are the wear of parts and the deterioration of components with age.
3. Unit-to-unit noise (Variational noise)
Unit-to-unit noise refers to factors that cause differences between individual products that
have been manufactured to the same specifications. This variation is inevitable in a
manufacturing process and leads to variations in the product parameters from unit to unit. For
example, the value of a resistor may be specified to be 100 ohms, but the resistance value
may be 101 ohms in one particular unit and 98 ohms in another.
Examples of noise
1. Colour television power circuit
The function of a power circuit in a colour television set is to convert alternating current (AC)
input into direct current (DC) output. If the power circuits in all sets manufactured
maintained a constant direct current output under perfect conditions, their voltage would be
perfect. However, it is likely that the following noise factors may cause the output to deviate
from its target voltage.
i. External noise
All variations in environmental conditions such as temperature, humidity, dust, and
input voltage.
ii Internal noise
Changes in the component and material characteristics. For example, after 10 years
the resistance of a resistor may have increased by 10%.
iii Unit-to-unit noise
Differences between individual manufactured units, causing different output voltages
from the same input voltage.
2. Refrigerator
Some of the important noise factors related to the temperature control inside a refrigerator are
given below:
i External noise
The number of times the door is opened and closed, the amount of food kept and the
initial temperature of the food, variation in the ambient temperature and the
fluctuation in power supply voltage.
ii Internal noise
The leakage of refrigerant and mechanical wear of compressor parts and door seals.
iii Unit-to-unit noise
The tightness of the door closure and the amount of refrigerant used.
3. Automobile
The following noise factors are important for the braking distance of an automobile:
i External noise
Wet or dry roads, concrete or asphalt surfaces and the number of passengers in the
car.
ii Internal noise
The wear of the drums and brake pads, and leakage of brake fluid.
iii Unit-to-unit noise
Variations in the friction coefficient of the pads and drums, and the amount of brake
fluid.
Taguchi's Quality Engineering System
Quality engineering is an engineering approach to producing high quality products at low cost,
where cost includes product lifetime costs, manufacturing costs, etc. Taguchi divides quality
engineering into two systems - off-line and on-line quality engineering. Off-line quality
control methods are those technical aids for both quality and cost control in both product and
process design. On-line quality methods are those technical aids for both quality and cost
control during actual production. Here we will discuss only the off-line quality engineering
system proposed by Taguchi.
Taguchi's Off-line Quality Engineering System
Taguchi divides the Off-line quality engineering system into two stages - product design and
process design optimisation. Taguchi developed a three stage approach for assuring quality
within each of the two stages of off-line quality engineering system. These are system
design, parameter design and tolerance design.
System Design
System design is the phase which involves generating a basic prototype design that performs
the function of the product with minimum deviation from its target performance. In this
phase, new ideas and concepts are developed to provide improved products to consumers.
Parameter Design
This is the most important phase of Taguchi's quality system and is used to make products
and processes less sensitive to external disturbances. The objective of parameter design is to
determine the optimal settings of process parameters (identified from brainstorming) that will
dampen the effect of external disturbances. These external disturbances are responsible for
excessive variation in the product's functional performance from its target value.
Tolerance Design
This phase involves looking at each parameter to see if it is useful to trade off quality loss and
cost. For example, the narrower the tolerance band, the more costly it becomes to
manufacture the product. On the other hand, the wider the tolerance band, the lower the
quality is likely to be and therefore the greater the risk of product non-uniformity.
Experimental design and orthogonal arrays
Taguchi advocates the use of formal experimental designs to assist in the process of
parameter design. Unless the number of factors is very small, he suggests the use of fractional
factorial designs instead of full factorial designs, in order to save time and cost. In particular
he suggests the use of certain orthogonal arrays to plan the experiment.
An orthogonal array is a matrix of numbers arranged in columns and rows. Each column
represents a specific factor or condition that can be changed from experiment to experiment.
Each row represents the state of the factors in a given experiment. So-called orthogonal
arrays have the property that the levels of the various factors are arranged in such a way that
the effect of one factor can be separated from the effects of the other factors (assuming no
interactions). Table 9 shows one of these orthogonal arrays: L8 (2^7). Here 8 refers to the number of
experimental conditions tried out, 2 is the number of factor levels, and 7 is the number of
factors. A full factorial experiment would involve 2^7 = 128 experimental conditions, so this
design is far more economical.
Experiment Factors
1 2 3 4 5 6 7
1 1 1 1 1 1 1 1
2 1 1 1 2 2 2 2
3 1 2 2 1 1 2 2
4 1 2 2 2 2 1 1
5 2 1 2 1 2 1 2
6 2 1 2 2 1 2 1
7 2 2 1 1 2 2 1
8 2 2 1 2 1 1 2
Table 9: The L8 (2^7) orthogonal array.
The main advantage of orthogonal arrays is that they allow us to compare the effect of low
and high levels of (for example) Factor 1 because the comparison is "fair" with respect to the
other factors. For example the high level (of Factor 1) is combined with low levels of Factor
2 in rows (experiments) 5 and 6, and with high levels of Factor 2 in rows 7 and 8. Exactly the
same is true of the low levels of Factor 1. This means the comparison is fair in the sense that
any difference between the two levels of Factor 1 cannot be attributed to Factor 2. The same is
true of any other pair of factors.
It is possible to delete one or more of the columns of an orthogonal array and still have an
orthogonal array. For example, if we delete the seventh column in Table 9, we get an array
for investigating the effects of six factors. This could have been used instead of the array in
Table 7. There are more orthogonal arrays in Appendix A - by deleting some of the columns
they can be used for a wide variety of situations. (The array in Table 7 is orthogonal,
although it is not one of those listed in the appendix.)
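The "fairness" property can be tested mechanically: for every pair of columns, each of the four level combinations should occur the same number of times. The following Python sketch checks this for the array in Table 9, for the same array with the seventh column deleted, and for the design in Table 7. The function name is_orthogonal is illustrative, and the check as written only covers two-level arrays.

```python
from collections import Counter
from itertools import combinations

# Table 9: the L8 array, one tuple per experiment (Factors 1 to 7).
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2, 2, 2),
    (1, 2, 2, 1, 1, 2, 2),
    (1, 2, 2, 2, 2, 1, 1),
    (2, 1, 2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1, 2, 1),
    (2, 2, 1, 1, 2, 2, 1),
    (2, 2, 1, 2, 1, 1, 2),
]

# Table 7: the cake design (columns M, S, E, F, O, B).
CAKE_DESIGN = [
    (1, 1, 1, 2, 2, 1),
    (2, 1, 1, 1, 2, 2),
    (1, 2, 1, 2, 1, 2),
    (2, 2, 1, 1, 1, 1),
    (1, 1, 2, 1, 1, 2),
    (2, 1, 2, 2, 1, 1),
    (1, 2, 2, 1, 2, 1),
    (2, 2, 2, 2, 2, 2),
]


def is_orthogonal(array):
    """True if every pair of columns contains all four level pairs equally often."""
    n_columns = len(array[0])
    for i, j in combinations(range(n_columns), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


print(is_orthogonal(L8))                         # full array from Table 9
print(is_orthogonal([row[:6] for row in L8]))    # seventh column deleted
print(is_orthogonal(CAKE_DESIGN))                # design used in Table 7
```

Deleting a column can never break the property, because the check only ever looks at pairs of the columns that remain.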
Type of Factors in Taguchi's Experimental Design Techniques
In performing a Taguchi experiment, one should be aware of two types of factors:
a Control factors
These are factors which can be easily controlled during actual production conditions. The
levels of these factors are generally selected by design engineers. For example, in a cake
baking experiment, eggs, sugar, oven temperature, etc. are control factors.
b Noise factors
These are factors which cannot be controlled or are very expensive to control during the
normal production conditions. These noise factors are sources of variation in products and
processes and hence the cause of poor quality. Examples of noise factors are: relative
humidity, ambient temperature, etc.
Different types of response variables
The response variable used to assess quality levels will obviously be different in different
situations. There are three possible types. It is very important to bear this in mind when
designing and analysing an experiment.
a Smaller-the-Better quality characteristics
This type of characteristic includes measures such as porosity, tool wear, number of
defects, etc. A smaller-the-better quality characteristic has an ideal value of zero.
b Larger-the-Better quality characteristics
This type of characteristic is typically a measure of yield, efficiency, customer rating, etc. and
has an ideal value as large as possible.
c Target-is-the-Best quality characteristics
For this type of characteristic, one measures dimensions such as the diameter, length, etc. A
target value is always specified for the characteristic and a minimal variability around the
target is desired. There are two issues here: is it on target, and is the variability around the
target acceptable?
Industrial Case Study: An Application of Parameter Design for a Tile
Manufacturing Process
This case study is an application of Taguchi's parameter design to minimise the percentage of
defective tiles from a calcining (baking) process. During the late 1950s, the Ina Tile company
in Japan faced the problem of high variability in the dimensions of the tiles it produced. A
team of engineers was set up to investigate the cause of the problem. The team found that
the uneven temperature distribution in the kiln caused a variation in size of the tiles produced.
The team reported that it would cost approximately £250,000 to redesign and build a kiln in
which all the tiles would receive uniform temperature. Here the temperature distribution in
the kiln is a noise factor which cannot be controlled or would be very expensive to control for
normal production conditions. However the company wanted to reduce the tile size variation
without increasing costs. This case study will illustrate how the company achieved a
significant improvement in the tile quality by reducing the variation of the tile dimensions.
The schematic diagram of the kiln is shown below.
The company decided to conduct an experiment to investigate the effects of various factors
(or parameters) in the tile manufacturing process, with a view to making the process more
robust, so that it can produce a consistent tile size despite the temperature variation.
Figure 9 Schematic Diagram of the Kiln (diagram labels: Burner, Kiln Wall, Burner)
Selection of factors and Levels
Control factors: Level 1 Level 2
Amount of Limestone - Factor A 5% (new) 1% (existing)
Fineness of additive - Factor B coarser (existing) finer (new)
Amount of Agalmatolite - Factor C 43% (new) 53% (existing)
Type of Agalmatolite - Factor D Existing New
Raw material charging quantity - Factor E 1300 kg (new) 1200 kg (existing)
Amount of waste return - Factor F 0% (new) 4% (existing)
Amount of feldspar - Factor G 0% (new) 5% (existing)
The noise factor in this experiment was related to the position of the tile on the cart as it went
through the calcining process. Six different positions for tiles on the cart were to be
considered. Let 'P' be the position of the tile on the cart. The six different positions are:
P1 - Outside top P2 - Outside middle P3 - Outside bottom
P4 - Inside top P5 - Inside middle P6 - Inside bottom
The effects of these control factors were studied using an L8 orthogonal array (see the
Appendix) and the response was measured. The characteristic or response measured in this
experiment was the tile width, with the target value being 150. The response table for the
experiment is shown in Table 10.
L8  A  B  C  D  E  F  G     P1     P2     P3     P4     P5     P6
1   1  1  1  1  1  1  1  151.9  151.4  150.4  150.2  149.6  149.5
2   1  1  1  2  2  2  2  151.5  150.8  150.0  149.8  149.4  149.1
3   1  2  2  1  1  2  2  153.1  151.8  151.8  151.4  150.6  150.3
4   1  2  2  2  2  1  1  152.2  151.3  151.1  150.6  150.1  150.0
5   2  1  2  1  2  1  2  151.5  150.8  150.6  150.2  149.7  149.5
6   2  1  2  2  1  2  1  156.5  152.1  150.3  148.5  146.3  144.6
7   2  2  1  1  2  2  1  154.5  153.3  151.8  150.9  150.4  149.6
8   2  2  1  2  1  1  2  153.0  152.5  152.0  151.9  151.3  149.5
Table 10 Response Table for the Tile Experiment
Statistical Analysis and Interpretation
The purpose of parameter design is to determine the control factor levels that will make the
process insensitive to noise. The width is obviously a target-is-best quality measurement, so
we need to:
Reduce variability caused by noise.
Bring the mean response or quality characteristic on to the target by using adjustment
factors which affect the mean response only.
There are various measures of variability we can use. The obvious, easy, ones are the
standard deviation (see Session 2), or the range (simply the largest value - smallest). Taguchi
suggests a more complex measure - the signal to noise ratio. This is not covered in this unit,
and there are strong objections to it (see Mitra, 1993: 529-530). Here we will use the
standard deviation (sd). From the standpoint of theoretical statistics it is better to use a more
complex function than this, log(sd^2), as explained in Mitra (1993: 529), but the standard
deviation itself will be good enough for our purposes here.
Having obtained the response table, the next step is to obtain the sd and mean response table
corresponding to each experimental point. Table 11 gives the means and sds of the widths for
the six positions in each of the eight experimental runs.
Experimental run Mean width Sd of widths
1 150.5 0.97
2 150.1 0.90
3 151.5 1.00
4 150.9 0.83
5 150.4 0.74
6 149.7 4.27
7 151.8 1.85
8 151.7 1.22
Table 11 Means and Sds of tile widths for each run
The mean width for each experimental run is obtained by adding up all six widths and then
dividing the total by 6 (the number of observations in each experimental run). The standard
deviation is simply the standard deviation of the six numbers.
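As a check on Table 11, the means and standard deviations can be recomputed from the widths in Table 10. The sketch below uses Python's statistics module; it assumes the sample standard deviation (divisor n - 1), which is what statistics.stdev calculates and which appears to match the figures in Table 11.

```python
from statistics import mean, stdev

# Table 10: tile widths at positions P1 to P6 for each experimental run.
WIDTHS = {
    1: [151.9, 151.4, 150.4, 150.2, 149.6, 149.5],
    2: [151.5, 150.8, 150.0, 149.8, 149.4, 149.1],
    3: [153.1, 151.8, 151.8, 151.4, 150.6, 150.3],
    4: [152.2, 151.3, 151.1, 150.6, 150.1, 150.0],
    5: [151.5, 150.8, 150.6, 150.2, 149.7, 149.5],
    6: [156.5, 152.1, 150.3, 148.5, 146.3, 144.6],
    7: [154.5, 153.3, 151.8, 150.9, 150.4, 149.6],
    8: [153.0, 152.5, 152.0, 151.9, 151.3, 149.5],
}

# For each run: (mean width, sample sd of the six widths).
summary = {run: (mean(widths), stdev(widths)) for run, widths in WIDTHS.items()}

for run, (run_mean, run_sd) in summary.items():
    print(f"Run {run}: mean = {run_mean:.1f}, sd = {run_sd:.2f}")
```

Run 6 stands out immediately: its mean is close to the target, but its widths vary far more across the six positions than in any other run.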
Once the sd and mean response have been calculated for each experimental trial, the next step is to
identify which factors impact response variability, and which factors can be used to adjust the
mean response on to the required target. In order to achieve the above two objectives, we
simply calculate the average sd and the average of the means at each level of the factors
under consideration from which their effects can be easily computed. This is just like the
calculation of the main effects above, except that one of the response variables is a standard
deviation.
For example, for Factor A at the Level 1, the average mean is ¼ (150.5 + 150.1 + 151.5 +
150.9)=150.75. Similarly the average for Level 2 is 150.89, so the effect of Factor A is the
difference between these two - i.e. +0.14 (the + indicating that the average for level 2 is
higher than for 1). Just the same calculation can be done for the standard deviation.
A B C D E F G
Mean
Level 1 150.75 150.18 151.01 151.03 150.85 150.87 150.71
Level 2 150.89 151.46 150.62 150.60 150.78 150.77 150.92
Difference 0.14 1.28 -0.39 -0.43 -0.08 -0.10 0.21
Standard deviation
Level 1 0.92 1.72 1.23 1.14 1.87 0.94 1.98
Level 2 2.02 1.23 1.71 1.81 1.08 2.01 0.97
Difference 1.10 -0.50 0.48 0.67 -0.79 1.07 -1.01
Table 12 Average Mean and Sd
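The level averages in Table 12 can be generated from the L8 array and the run summaries in Table 11. The Python sketch below is illustrative only (the function name level_averages is not from the notes). It starts from the rounded figures in Table 11, so some results may differ from Table 12 in the second decimal place.

```python
# L8 array (Table 9) with columns assigned to Factors A to G, as in Table 10.
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2, 2, 2),
    (1, 2, 2, 1, 1, 2, 2),
    (1, 2, 2, 2, 2, 1, 1),
    (2, 1, 2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1, 2, 1),
    (2, 2, 1, 1, 2, 2, 1),
    (2, 2, 1, 2, 1, 1, 2),
]
FACTORS = "ABCDEFG"

# Table 11: mean width and sd of widths for runs 1 to 8 (rounded values).
MEANS = [150.5, 150.1, 151.5, 150.9, 150.4, 149.7, 151.8, 151.7]
SDS = [0.97, 0.90, 1.00, 0.83, 0.74, 4.27, 1.85, 1.22]


def level_averages(response):
    """For each factor: (average at level 1, average at level 2, level 2 - level 1)."""
    table = {}
    for column, name in enumerate(FACTORS):
        level_1 = [y for row, y in zip(L8, response) if row[column] == 1]
        level_2 = [y for row, y in zip(L8, response) if row[column] == 2]
        avg_1 = sum(level_1) / len(level_1)
        avg_2 = sum(level_2) / len(level_2)
        table[name] = (avg_1, avg_2, avg_2 - avg_1)
    return table


mean_table = level_averages(MEANS)
sd_table = level_averages(SDS)
for name in FACTORS:
    print(name, [round(v, 2) for v in mean_table[name]],
          [round(v, 2) for v in sd_table[name]])
```

The same function is used for both responses, which reflects the point made above: the sd is simply treated as a second response variable.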
Table 12 shows that factors A, D, E, F and G have the strongest influence on variability. (B
and C have a smaller influence since the difference between the standard deviations at the two
levels is smaller.) Their preferred levels (the one with the smallest sd) are:
A1, D1, E2, F1 and G2.
The decision to use these factors to adjust for robustness is a matter of judgment. There are
two other things the experiment is trying to achieve. The first of these is to identify an
adjustment factor to achieve the required nominal size of 150 mm. Factor B was chosen as an
adjustment factor as it has the largest impact on the mean response and has relatively little
impact on variability. For factor B, level 1 is closer to the target value of 150 mm.
Unfortunately Level 1 also has a worse (higher) standard deviation than Level 2, but this has
to be accepted if this Factor is used to adjust the mean. For these reasons, it was decided to
choose level 1 for factor B in the final combination of factor settings.
Factor C has relatively little influence on either mean response or response variability. This
type of factor can be treated as a cost reduction factor: the choice of level depends on the cost
of setting and convenience. It was decided to select a low percentage of Agalmatolite (because
this was cheaper), and therefore C1 was selected. This is also the setting with the lower sd,
which is obviously convenient.
So the final optimum combination of factor settings is:
A1, B1, C1, D1, E2, F1 and G2.
(The existing combination of factor settings for the process is A2, B1, C2, D1, E2, F2 and G2.)
Prediction of the mean for the optimum factor settings
The overall average mean in Table 11 is 150.82. This is also the average of the means for the
low level of Factor A and the high level (150.75 and 150.89). This suggests that Factor A has
the effect of decreasing the width by 150.82 - 150.75 = 0.07 if we choose the low level, and
increasing it by the same amount (150.89 - 150.82) if we choose the higher level. This means
that, starting from the overall mean, the effect of each factor is to raise or lower the mean by
half of the difference shown in Table 12.
Therefore, using the optimum combination, the predicted width is
150.82 - 0.07 - 0.64 + 0.20 + 0.22 - 0.04 + 0.05 + 0.10 = 150.64
(The plus or minus sign in this equation depends on whether the chosen level has a mean
above or below 150.82.)
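The prediction can be written as a small function that starts from the overall mean and moves up or down by half of each Table 12 difference, depending on the level chosen. This is a sketch of that arithmetic only; the names are invented for illustration, and the result agrees with the figure above once rounded to two decimal places.

```python
OVERALL_MEAN = 150.82

# Table 12, "Mean" block: level 2 average minus level 1 average for each factor.
DIFFERENCE = {"A": 0.14, "B": 1.28, "C": -0.39, "D": -0.43,
              "E": -0.08, "F": -0.10, "G": 0.21}

# Final combination of factor settings chosen in the text.
OPTIMUM = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 2, "F": 1, "G": 2}


def predicted_mean(overall, difference, chosen):
    """Overall mean plus half the difference for level 2, minus half for level 1."""
    total = overall
    for factor, level in chosen.items():
        half = difference[factor] / 2
        total += half if level == 2 else -half
    return total


prediction = predicted_mean(OVERALL_MEAN, DIFFERENCE, OPTIMUM)
print(f"Predicted width at the chosen settings: {prediction:.3f}")
```

Writing it this way makes the sign rule explicit: the sign of each term comes from the chosen level together with the sign of the difference in Table 12.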
A similar analysis for the standard deviations gives a prediction of -0.8! This is clearly a silly
prediction since the sd cannot be negative. This method obviously does not give an accurate
answer! This is not, however, a real problem because we are not trying to hit a target with the
sd, but just reduce it as far as possible. (If we did want a more realistic result, it would be a
good idea to analyse log(sd^2) instead of the sd itself, as mentioned above.)
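To illustrate the point, the sketch below applies the same additive prediction twice: once to the sds in Table 11, and once to the log of the squared sd, converting the answer back to an sd at the end. It is a rough illustration under the same additive assumption, not a method set out in these notes, and the function name additive_prediction is invented here.

```python
import math

# L8 array (Table 9) with columns assigned to Factors A to G.
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2, 2, 2),
    (1, 2, 2, 1, 1, 2, 2),
    (1, 2, 2, 2, 2, 1, 1),
    (2, 1, 2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1, 2, 1),
    (2, 2, 1, 1, 2, 2, 1),
    (2, 2, 1, 2, 1, 1, 2),
]
FACTORS = "ABCDEFG"
SDS = [0.97, 0.90, 1.00, 0.83, 0.74, 4.27, 1.85, 1.22]  # Table 11
OPTIMUM = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 2, "F": 1, "G": 2}


def additive_prediction(response):
    """Overall average plus, for each factor, (chosen-level average - overall average)."""
    overall = sum(response) / len(response)
    prediction = overall
    for column, name in enumerate(FACTORS):
        chosen = [y for row, y in zip(L8, response) if row[column] == OPTIMUM[name]]
        prediction += sum(chosen) / len(chosen) - overall
    return prediction


# Direct analysis of the sd: the additive prediction can fall below zero.
raw_sd = additive_prediction(SDS)

# Analysis on the log scale: predict log(sd squared), then convert back to an sd.
log_variance = additive_prediction([math.log(sd ** 2) for sd in SDS])
sd_from_log = math.exp(log_variance / 2)

print(f"Prediction from the sds directly:  {raw_sd:.2f}")
print(f"Prediction via log of squared sd:  {sd_from_log:.2f}")
```

Whatever the data, the second figure cannot be negative, because the exponential function only returns positive values. It is still only a prediction and needs confirming in the same way as the mean.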
Confirmation Trial
A confirmation trial was conducted using the optimal condition for maximising robustness.
The standard deviation based on the confirmation trial was found to be 0.50 and the mean was
149.7. The standard deviation is lower than in any of the experimental runs (the lowest of which
was 0.74), and the mean is within 0.3 of the target of 150. However, in other cases, the
confirmation trial may suggest that the "optimum" factor levels are not really optimum, in
which case a more extensive experiment is called for.
Summary of Parameter Design Experiment
The company has substantially reduced the tile size variation due to an uncontrollable factor -
uneven temperature distribution in the kiln - by utilising Taguchi's parameter design. This is
achieved by finding the best possible settings for the seven controllable factors, A – G.
SELF-APPRAISAL EXERCISE
You will have done the reading for session 5 of this unit, and absorbed the messages
which the session's material contains. Here follow some exercises to give you
further information and food for thought.
The assigned tasks are:
1 Is Table 1 an orthogonal array? Could it be used to calculate the main effects for the
three factors?
2 The variability of each run in the tile experiment is assessed above by means of the
standard deviation. Do the analysis in Tables 11 and 12 using the range instead. Do
you get similar results?
3 A Taguchi experiment was designed to investigate five two-level factors: A, B, C, D
and E. The results are shown below.
Run  A  B  C  D  E  Response
1    1  1  1  1  1  42
2    1  1  1  2  2  50
3    1  2  2  1  1  36
4    1  2  2  2  2  45
5    2  1  2  1  2  35
6    2  1  2  2  1  55
7    2  2  1  1  2  30
8    2  2  1  2  1  54
Is this design based on an orthogonal array? If so, how can it be derived from the
orthogonal arrays in the appendix?
Determine the optimum design parameters (control factors) based on the
assumption that the experimenter wanted to minimise the response. Which variable
has the largest effect?
Draw a graph to show the interaction between A and E. What does this show? Does
this have any impact on your assessment of the optimum design parameters?
Suppose the experimenter also wants to minimise the variation of the response
variable due to noise factors. How does the experiment need to be extended to
achieve this?
4 A medical researcher wants to assess the effects of aspirin, diet and smoking on the
incidence of heart disease. She decides that an experiment is impractical and that
she will have to get her data by monitoring a large number of people over several
years and getting data on how much aspirin they take each week, what sort of diet
they have, how much they smoke, and any signs of heart disease. She then analyses
each of the first three variables (aspirin, diet and smoking) to see if it is related to
the incidence of heart disease.
What do you think of this approach to the research? What are the problems?
5 A second researcher working on the effects of aspirin, diet and smoking on the
incidence of heart disease recognises the difficulties of survey research (as described
in Question 4 above) and decides to do a controlled experiment over a period of five
years. He chooses a full factorial design with each factor at three levels:
Aspirin: 0, 60 or 300 mg per day
Diet: (high fish diet, vegetarian, junk food)
Smoking: 0, 10, 100 cigarettes per day
2700 volunteers are randomly assigned among the 27 different combinations of
factor levels.
Discuss whether this experiment is
Possible
Useful
Ethical.
What do you think is the best approach to this problem?
6 Are there any aspects of your work or personal life for which you feel a designed
experiment may be helpful?
Suggested Answers
1 No, it is not an orthogonal array and can’t be used to calculate main effects. (For
example, Level 2 of Factor A only occurs with Level 1 of Factor B, whereas Level
1 occurs with both levels of Factor B. This means that the comparison of the two
levels of Factor A cannot be fair.)
2 The results are very similar. The differences between the mean ranges at the two
levels of each variable (the bottom row of Table 12) are
3.13, -1.33, 1.42, 1.98, -2.28, 2.98, -2.68
The order of size of these (A largest, G smallest, etc) is exactly the same as for the
standard deviations, so, in this case, using the range would lead to very similar
conclusions.
3 It is an orthogonal array. It is simply the first five columns of the array L8 (2^7) in
the Appendix.
The analysis of the main effects suggests that the best factor levels for minimising
the response are:
A(1), B(2), C(2), D(1), E(2)
D has the largest effect on the response (15.25), so this is the most important
variable to control.
Your interaction graph should have two lines that cross and are far from parallel.
This indicates a strong interaction: the effect of each of these variables depends on
the level of the other. The lowest mean response (32.5) is achieved by taking the
high level of both variables. The best factor levels worked out above are A(1) and
E(2) - this combination has a higher average response (47.5). This suggests that the
factor levels above may not be the best. Another possibility would be
A(2), B(2), C(2), D(1), E(2)
because this includes the A(2), E(2) combination. It would obviously be a good
idea to do a confirmation run of both possibilities.
To analyse any noise factors it would be necessary to measure the response
variable several times with each combination of control factors (as in the tile experiment).
These replications should be arranged to allow the noise factors to vary.
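The calculations behind the answer to Question 3 can be sketched in Python as follows. The design and responses are copied from the table in Question 3. The helper function and variable names are illustrative, and this is only one simple way of organising the arithmetic.

```python
# Design and responses from Question 3 (levels coded 1 and 2).
runs = [
    # A  B  C  D  E  response
    (1, 1, 1, 1, 1, 42),
    (1, 1, 1, 2, 2, 50),
    (1, 2, 2, 1, 1, 36),
    (1, 2, 2, 2, 2, 45),
    (2, 1, 2, 1, 2, 35),
    (2, 1, 2, 2, 1, 55),
    (2, 2, 1, 1, 2, 30),
    (2, 2, 1, 2, 1, 54),
]
factors = "ABCDE"


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Main effects: mean response at each level of each factor.
level_means = {
    name: {
        level: mean([run[-1] for run in runs if run[i] == level])
        for level in (1, 2)
    }
    for i, name in enumerate(factors)
}

# The best level is the one with the lower mean, since the aim is to minimise.
best_levels = {name: min(m, key=m.get) for name, m in level_means.items()}

# Size of each main effect: difference between the two level means.
effect_sizes = {name: abs(m[2] - m[1]) for name, m in level_means.items()}
largest_effect = max(effect_sizes, key=effect_sizes.get)

# A x E interaction: mean response for each combination of A and E levels.
a, e = factors.index("A"), factors.index("E")
ae_means = {
    (level_a, level_e): mean(
        [run[-1] for run in runs if run[a] == level_a and run[e] == level_e]
    )
    for level_a in (1, 2)
    for level_e in (1, 2)
}

print("Best levels:", best_levels)
print("Largest effect:", largest_effect, effect_sizes[largest_effect])
print("A x E cell means:", ae_means)
```

Plotting the four A x E cell means, with one line for each level of A, gives the interaction graph asked for in the question.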
4 There are two main problems with this type of survey. The first is that variables
like diet are extremely complex and not easily summarised in a form that is helpful
for statistical analysis.
The second, even more important, problem is that many variables are not
controlled. To see why this matters, consider the aspirin analysis. The idea here
would be to compare people on different doses of aspirin to see if their propensity
to heart disease varies in any systematic way. However, to be useful, this
comparison needs to be "fair" in the sense that the groups being compared are
similar apart from their consumption of aspirin - i.e. all the other important
variables need to be controlled. In a survey, this is unlikely to be so, perhaps
because the group taking a lot of aspirin may be doing so because they are less
healthy than average. Or, with the diet comparison, it is likely that people on
supposedly healthy diets may also have other healthy habits like taking a lot of
exercise. It is impossible to be sure from a survey of this kind, and so we can never
be sure that the variable we are looking at is really the one having the effect. There
are always far too many other possibilities to check!
5 In theory, an experiment like this has the advantage that, because people are put in
groups at random, the groups should be similar except for the variables which are
controlled by the experiment. This avoids the second problem of the survey, and
means that any comparisons should be fair.
In practice, of course, the experiment is not possible. It would obviously not be
possible to persuade 2700 volunteers to take part, and even if it were possible it would
not be ethical - especially telling people to smoke for the purposes of the
experiment. In addition, the idea of just three types of diet is clearly so unrealistic
that the results would be of little value.
Despite this, on occasions, there is good justification for doing experiments like
this with people - examples are drug trials, and experiments to analyse the
effectiveness of various aspects of a marketing campaign on the behaviour of
potential customers. In such experiments the main noise factors are likely to be due
to the people involved. Experiments with people tend to be rather more difficult to
design and organise than industrial experiments!
In practice, research on the factors causing heart disease has to use a mix of survey and
experiment. Experiments are possible for factors (like new drugs) which can be controlled,
and where there are no ethical bars to one of the treatments (e.g. where there is no strong
reason to believe that one treatment is better than the others). And in surveys, it may be
possible to use a statistical analysis to make allowances for some of the interfering
variables.
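The design described in Question 5 (a full factorial with random assignment) can be sketched as below. The factor levels are those given in the question. The equal group size of 100, the fixed random seed and the variable names are assumptions made for illustration.

```python
import itertools
import random

# Factor levels from Question 5.
aspirin = [0, 60, 300]                            # mg per day
diet = ["high fish", "vegetarian", "junk food"]
smoking = [0, 10, 100]                            # cigarettes per day

# Full factorial: every combination of the three factors (3 x 3 x 3 = 27).
treatments = list(itertools.product(aspirin, diet, smoking))

# Random assignment: shuffle the volunteers, then deal them out in equal
# groups (equal group sizes are an assumption, not stated in the question).
volunteers = list(range(2700))
rng = random.Random(0)        # fixed seed so that the sketch is repeatable
rng.shuffle(volunteers)
group_size = len(volunteers) // len(treatments)
assignment = {
    treatment: volunteers[i * group_size:(i + 1) * group_size]
    for i, treatment in enumerate(treatments)
}

print(len(treatments), "treatment combinations,", group_size, "volunteers each")
```

Because the volunteers are shuffled before being dealt out, no characteristic of a volunteer can influence which combination they receive. This is what makes the comparisons between groups fair.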
Appendix: Orthogonal Array Design Tables
I Design Matrix of L8 (2^7) Orthogonal Array
           Factor
Trial no.  1  2  3  4  5  6  7
1          1  1  1  1  1  1  1
2          1  1  1  2  2  2  2
3          1  2  2  1  1  2  2
4          1  2  2  2  2  1  1
5          2  1  2  1  2  1  2
6          2  1  2  2  1  2  1
7          2  2  1  1  2  2  1
8          2  2  1  2  1  1  2
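Whether an array is orthogonal can be checked mechanically: in every pair of columns, each combination of levels must occur equally often. The sketch below applies this check to the L8 array above. The function name is illustrative, and the small three-run array is a hypothetical example of a design that fails the check.

```python
from collections import Counter
from itertools import combinations

# The L8 array from the table above (rows are trials, columns are factors).
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]


def is_orthogonal(array):
    """True if every pair of columns contains each level pair equally often."""
    for i, j in combinations(range(len(array[0])), 2):
        counts = Counter((row[i], row[j]) for row in array)
        levels_i = {row[i] for row in array}
        levels_j = {row[j] for row in array}
        expected = len(array) / (len(levels_i) * len(levels_j))
        if any(counts[(a, b)] != expected for a in levels_i for b in levels_j):
            return False
    return True


# Any subset of the columns of an orthogonal array is itself orthogonal.
first_five_columns = [row[:5] for row in L8]

# Hypothetical unbalanced design: level 2 of the first factor
# only ever occurs with level 1 of the second factor.
unbalanced = [[1, 1], [1, 2], [2, 1]]

print(is_orthogonal(L8))
print(is_orthogonal(first_five_columns))
print(is_orthogonal(unbalanced))
```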
II Design Matrix of L12 (2^11) Orthogonal Array
           Factor
Trial no.  1  2  3  4  5  6  7  8  9 10 11
1          1  1  1  1  1  1  1  1  1  1  1
2          1  1  1  1  1  2  2  2  2  2  2
3          1  1  2  2  2  1  1  1  2  2  2
4          1  2  1  2  2  1  2  2  1  1  2
5          1  2  2  1  2  2  1  2  1  2  1
6          1  2  2  2  1  2  2  1  2  1  1
7          2  1  2  2  1  1  2  2  1  2  1
8          2  1  2  1  2  2  2  1  1  1  2
9          2  1  1  2  2  2  1  2  2  1  1
10         2  2  2  1  1  1  1  2  2  1  2
11         2  2  1  2  1  2  1  1  1  2  2
12         2  2  1  1  2  1  2  1  2  2  1
III Design Matrix of L16 (2^15) Orthogonal Array
           Factor
Trial no.  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
1          1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
2          1  1  1  1  1  1  1  2  2  2  2  2  2  2  2
3          1  1  1  2  2  2  2  1  1  1  1  2  2  2  2
4          1  1  1  2  2  2  2  2  2  2  2  1  1  1  1
5          1  2  2  1  1  2  2  1  1  2  2  1  1  2  2
6          1  2  2  1  1  2  2  2  2  1  1  2  2  1  1
7          1  2  2  2  2  1  1  1  1  2  2  2  2  1  1
8          1  2  2  2  2  1  1  2  2  1  1  1  1  2  2
9          2  1  2  1  2  1  2  1  2  1  2  1  2  1  2
10         2  1  2  1  2  1  2  2  1  2  1  2  1  2  1
11         2  1  2  2  1  2  1  1  2  1  2  2  1  2  1
12         2  1  2  2  1  2  1  2  1  2  1  1  2  1  2
13         2  2  1  1  2  2  1  1  2  2  1  1  2  2  1
14         2  2  1  1  2  2  1  2  1  1  2  2  1  1  2
15         2  2  1  2  1  1  2  1  2  2  1  2  1  1  2
16         2  2  1  2  1  1  2  2  1  1  2  1  2  2  1