unit quan - session 5 design of experiments

Quantitative Methods – Unit QUAN

MSc SQM\QUAN of 48 Session 5 ©University of Portsmouth

Unit QUAN - Session 5

Design of Experiments





MSc Strategic Quality Management

Quantitative Methods Unit - QUAN

DESIGN OF EXPERIMENTS

Aims of Session

To define the purposes of experiments and show how to design and analyse simple

experiments.

To discuss Taguchi’s approach to quality improvement.

Learning Approach

Study Notes, Required and Recommended Reading, Questions to Stimulate Your Thinking,

Self-Appraisal Exercises.

Content

Traditional One-Factor-at-a-time Approach to Experimentation. Factorial Experiments at

two levels. Fractional factorial experiments at two levels. Introduction to ANOVA and

statistical significance. Taguchi's Approach to Quality and Quality Engineering System.

Reading

There is a good overview in Dale (2003), and more detail in Hoerl and Snee (2002). Further

reading from Antony and Kaye (1996 and 2000), Antony and Preece (2002), Ayres (2007)*,

Mitra (1993 or later edition), Taguchi (1986), Wood (2003).

* Ayres, Ian (2007). Super crunchers: how anything can be predicted. London: John Murray.

Revised December 2009



DESIGN OF EXPERIMENTS

Introduction

Scientists perform experiments to increase their understanding of a particular phenomenon.

Having performed the experiment, the scientist will then attempt to draw inferences from

experimental results to test a hypothesis, or measure something, about the phenomenon being

studied. Thus an experiment is a series of tests performed to discover an unknown effect or

establish a hypothesis. But why do we need to perform experiments at all?

As an example, consider a plastic injection moulding process which may create parts that

shrink too much. We cannot directly observe what is occurring to cause the shrinkage.

Experience and reference works may tell us that several factors may be responsible, such as

mould temperature, injection speed, type of plastic resin used, and so on. An experiment will

determine which of these factors affect shrinkage the most.

What is Experimental Design?

Experimental design (or DoE) is an advanced statistical approach for studying the effect of

various factors (variables) on the product/process performance. It helps one to determine

which factors (variables) in a process are most important, how they interact, and what the

optimum settings are for those factors (variables).

Design of experiments can be used:

To determine the optimal conditions of a process.

To reduce excessive variability in manufacturing processes.

To rapidly understand the process behaviour.

To improve customer satisfaction by producing high quality products at low cost.



To improve performance characteristics of products.

To see if a new medical procedure or teaching approach is an improvement.

Designed experiments are useful for helping to improve manufacturing and service processes.

This session will focus on manufacturing because it is possible to design relatively complex,

but very useful, experiments in this environment. The underlying reason for this is that it is

relatively easy to manipulate many of the details of a manufacturing process to see what has a

beneficial impact. This is typically more difficult for processes producing a service for a

human customer. The Taguchi method, to which we turn at the end of this unit, was designed

primarily for manufacturing processes.

However, designed experiments are also very useful in many other areas – trials to evaluate

the effectiveness of drugs, experiments to evaluate the impact of educational reforms or of

different social policies, the effectiveness of different web designs, and so on. Ayres (2007)

gives a non-technical, and very enthusiastic, account of several examples where designed

experiments have proved their worth. Some of the issues relevant to non-manufacturing

examples are explored in Tasks 4 and 5 in the final Self-Appraisal Exercise of this session.

Traditional One-Factor-at-a-time Approach to Experimentation

This is the simplest type of experiment. It involves varying one factor or variable, keeping all

other factors (or variables) in the experiment fixed. For instance, consider three factors (say,

A, B and C) each of which can be at one of two levels: level 1 and level 2 (a level is a value

or setting of a factor – in the example below the factor “mixer speed” can be set at two levels,

either 60 rpm or 80 rpm) as shown in Table 1. In the first experimental trial, it is obvious that

we keep all factors at level 1. In the second trial, only the level of factor A has changed to

level 2, keeping the levels of factors B and C constant (i.e. at level 1).

The difference in the results between these two experimental runs in Table 1 provides an

estimate of the effect of A. An effect here refers to the change in output (e.g., thickness,

weight, efficiency, strength, etc.) which we measure during the experiment due to the change

in factor levels (i.e., level 1 to level 2). This effect has been estimated when factors B and C

were at levels B (1) and C (1) and therefore there is no guarantee whatsoever that A will have

the same effect when the conditions of other factors change. We can do just the same thing

with Factor B (trial 3) and Factor C (trial 4).



The one-factor-at-a-time approach to experimentation can be misleading and often leads to

wrong conclusions. To obtain a more realistic answer, we need to find out the effect of each

factor in conjunction with different levels of the other factors. We can achieve this by

designing an experiment where we change the levels of factors simultaneously to study their

effect on output. Sir Ronald A Fisher in the early 1920s developed some methods for

effective experimentation which were a fundamental break from the old scientific tradition of

varying only one-factor-at-a-time. His initial experiments were concerned with determining

the effects of various fertilisers on plots of ground. Fisher used methods of statistical

experimental design and analysis to draw conclusions about the effect of each fertiliser on the

final condition of the crop. More recently, these experimental design techniques have been

widely accepted in manufacturing organisations for improving product and process

performance. We will explain the basic ideas using the scenario below as an illustration.

Experimental Trial or Run   Factor A   Factor B   Factor C
1                           1          1          1
2                           2          1          1
3                           1          2          1
4                           1          1          2

Table 1 One-factor-at-a-time method

Scenario

Suppose we are interested in finding the yield of a chemical process at two temperatures, say

T0 and T1, and at two pressures, P0 and P1.

We will first use the one-factor-at-a-time approach. The first step is to keep the temperature

constant (T0) and vary the pressure from P0 to P1. The experiment was repeated twice and

the average yield was calculated.



The following results were obtained.

Temperature Pressure Average yield (%)

T0 P0 51

T0 P1 61

The next step is to vary the temperature from T0 to T1, keeping the pressure constant (P0).

The following results were recorded.

Temperature Pressure Average yield (%)

T0 P0 51

T1 P0 58

Here we have the average yield values corresponding to only three combinations of

temperature and pressure: (T0, P0), (T1, P0) and (T0, P1). The experimenter concluded from

these data that the maximum yield of the chemical process is attained at

(T0, P1). But the question then arises: what would the average yield be at

(T1, P1)?

This experiment fails to answer this question. The difficulty is that there may be an

interaction between the two factors. Two factors are said to interact with each other, if the

effect of one factor on the output depends on the levels of the other factor. In the present

example, if there is no interaction between temperature and pressure, then the output graphs

(Figure 1) at different levels of pressure will be parallel. Non-parallel lines show the

presence of interaction between two factors. Suppose the yield corresponding to the untested

combination is about 80%. The interaction graph for the present example is shown in Figure

1. Here the lines are non-parallel and therefore we can conclude that there is an interaction

between temperature and pressure: the effect of temperature on the yield of the chemical process

depends on the level of pressure. To be more precise, temperature increases the yield by 7

percentage points (58% - 51%) at the lower pressure, but by 19 (80% - 61%) at the higher pressure.



Figure 1: Example of an Interaction Graph

For complex manufacturing processes in today’s industrial environment, interactions play an

important role and therefore should be studied for achieving sound experimental conclusions.

Therefore one may go for the factorial experiments recommended by Fisher so that both main

factor effects (i.e., effect of temperature and pressure on the yield) and interaction effects can

be studied.

Factorial experiments can be of two types: full factorial experiments and fractional factorial

experiments.

Full Factorial Experiments

A full factorial experiment is an experiment which enables one to study all possible

combinations of factor levels. In a full factorial experiment, the experimenter varies all

factors simultaneously, which permits the evaluation of interaction effects. Two-level

experiments are the most widely used factorial experiments in industry.

Full Factorial Experiments at two levels

The full factorial experiment at two levels is generally represented by 2^k, where 2 stands for

the number of levels (a high level and a low level) and k, the number of factors to be studied.

For example, if the number of factors to be studied is 3, then one may select an eight run

experiment (based on 2^3). When the number of factors (k) in an experiment is more than or

equal to five, it is recommended to select a fractional factorial design where we study only a

fraction of the full factorial experiment. But there are some limitations in using fractional

factorial designs as explained below.

Figure 1 Example of an Interaction Graph (yield, on a scale from 50 to 80, plotted against temperature T0 and T1, with one line for pressure P0 and one for pressure P1)



Design matrix for a full factorial experiment: A design matrix basically shows all the

possible factor level combinations to be studied. The simplest is the four-run (i.e. 2^2)

experiment, followed by eight-run, sixteen-run, etc. The design matrix for a four-run experiment is

shown in Table 2. Level 1 (or low level) is denoted by 1 and higher level by 2. Using a four

run experiment, we may be able to study two factors at two levels and the interaction between

them.

Experimental run A B

1 1 1

2 2 1

3 1 2

4 2 2

Table 2 – A 2^2 Full Factorial Experiment
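The rows of a design matrix can also be listed mechanically. The following is a minimal Python sketch, not part of the original notes; the function name full_factorial is our own, and the levels are coded 1 and 2 as in Table 2.

```python
from itertools import product


def full_factorial(k, levels=(1, 2)):
    """Return every run of a full factorial design for k factors.

    Each run is a tuple of levels, one per factor. The first factor
    changes fastest, which matches the run order used in Table 2.
    """
    # product() varies the LAST position fastest, so reverse each tuple.
    return [tuple(reversed(run)) for run in product(levels, repeat=k)]


# The four runs of the 2^2 design in Table 2.
for run_number, run in enumerate(full_factorial(2), start=1):
    print(run_number, *run)

# The number of runs doubles with every extra factor.
for k in (2, 3, 4, 5):
    print(k, "factors:", len(full_factorial(k)), "runs")
```

With k = 2 this reproduces the four rows of Table 2, and the run counts show why full factorial designs quickly become expensive as factors are added.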

Now the question is how do we calculate the effects of A and B, and see if there is an

interaction between them? This can be demonstrated by the following scenario. But before

we look at this we will briefly recap the ideas of factors, levels and how the levels are coded.

Factors, levels and coding, and response (performance) variables

The response (performance) variable is the outcome that we are interested in changing or

controlling (yield in the example below).

Factors are things (variables) which we think might influence the response variable.

Each factor is set at two or more levels (in the example below one factor which is thought to

have an influence on yield is the mixer speed – which is set at two levels: 60 and 80 rpm).

It is often convenient to code these levels. With two levels there are three standard coding

schemes:

1 and 2 (usually 1 for the lower and 2 for the higher if they are numerical levels)

0 and 1

-1 and +1

For non-mathematical work (i.e. if we don’t multiply or subtract the codes) it does not matter

which coding scheme is used. The first (1 and 2) has the advantage that it can be extended to



3 or 4 or more levels. For more mathematical work (which goes beyond the scope of this

unit), 0 and 1 are a convenient choice if we are not interested in interactions, and -1 and +1

are more convenient if we want to analyse interactions.
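As a brief illustration of why the -1 and +1 coding suits interactions, the sketch below recodes the four runs of a 2^2 design and multiplies the codes of the two factors together. This is our own illustration rather than part of the unit; the helper names are hypothetical.

```python
# Levels of the two factors for the four runs of a 2^2 design, coded 1 and 2.
f_levels = [1, 2, 1, 2]
m_levels = [1, 1, 2, 2]


def to_zero_one(level):
    """Recode a 1/2 level as 0/1."""
    return level - 1


def to_minus_plus(level):
    """Recode a 1/2 level as -1/+1."""
    return 2 * level - 3


f01 = [to_zero_one(x) for x in f_levels]
m01 = [to_zero_one(x) for x in m_levels]
f_pm = [to_minus_plus(x) for x in f_levels]
m_pm = [to_minus_plus(x) for x in m_levels]

# Multiply the two codes run by run.
product_01 = [a * b for a, b in zip(f01, m01)]
product_pm = [a * b for a, b in zip(f_pm, m_pm)]

print("0/1 product:  ", product_01)
print("-1/+1 product:", product_pm)
```

With the -1 and +1 codes the product is +1 when both factors are at the same level and -1 when they differ, so it splits the runs into two equal groups; with the 0 and 1 codes the product only picks out the single run where both factors are high.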

Scenario

The yield of a chemical reaction was thought to be a function of mixer speed and

formulation. A four-run, two-level experiment was conducted based on these two factors. The

factors and their levels are shown in Table 3.

Factors Level 1 Level 2

Formulation A B

Mixer speed 60 rpm 80 rpm

Table 3 – List of Factors and their Levels

The response (or output) values are recorded in the response table (Table 4) as shown below.

Here there are two factors, each kept at two levels. Therefore we can select a 2^2 full factorial

experiment to study the two main effects and the interaction between them. The main effects

can be determined by the following equation:

Main Effect of a factor = Average response at high level of the factor – Average

response at low level of the factor

Experimental condition   F    M    Yield (output)
1                        -1   -1   82
2                        +1   -1   93
3                        -1   +1   80
4                        +1   +1   88

The “higher level” of each factor is coded as +1 and the lower level as -1 in this table,

instead of 2 for the higher level and 1 for the lower level as in Table 2.

Table 4 – Response Table



Having obtained the response values, the next immediate step is to analyse and interpret the

data.

Calculation of Main effects:

Average response (i.e. yield) at low level of Formulation = (82 + 80)/2 = 81

Average response at high level of Formulation = (93 + 88)/2 = 90.5

Therefore effect of factor F = 90.5 – 81 = 9.5

Similarly, effect of factor M = (80 + 88)/2 – (82 + 93)/2 = 84 – 87.5 = -3.5 (a negative effect

implies that the average response is higher at the lower level of the factor)

This, of course, assumes that the effect of each factor does not depend substantially on the

level of the other factor. The way to check if this condition is reasonable, or whether there is

an interaction between the factors, is to look at the interaction plot below.
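The main effect calculation above can be sketched in a few lines of Python. This is an illustration only; the function name main_effect is our own, and the data are the four runs of Table 4.

```python
# Table 4: (F code, M code, yield) for each experimental condition.
runs = [(-1, -1, 82), (+1, -1, 93), (-1, +1, 80), (+1, +1, 88)]


def mean(values):
    return sum(values) / len(values)


def main_effect(runs, factor_index):
    """Average response at the high level minus average at the low level."""
    high = [run[-1] for run in runs if run[factor_index] == +1]
    low = [run[-1] for run in runs if run[factor_index] == -1]
    return mean(high) - mean(low)


effect_f = main_effect(runs, 0)  # Formulation: 90.5 - 81
effect_m = main_effect(runs, 1)  # Mixer speed: 84 - 87.5
print("Effect of F:", effect_f)
print("Effect of M:", effect_m)
```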

Main effects plot

Figure 2 illustrates the main effects plot for the experiment. A main effects plot simply shows

the average response values at high and low levels of a factor. It is a powerful graphical tool

to determine the importance of a factor effect. The power of this tool increases when we have

to study a large number of factors for an experiment. The strength of the effect depends on

the slope of the line.

Figure 2: Main Effects Plot for Yield Experiment (average yield, on a scale from 82 to 90, at the -1 and +1 levels of Formulation and of Mixer speed)



Interaction plot

An interaction plot is a powerful graphical tool to determine whether there is any interaction

between the factors. It plots the average response values corresponding to each combination

of factor levels. In the above example, it plots the average yield values at the four factor level

combinations. Figure 3 illustrates the interaction plot between Formulation and Mixer speed.

As the lines in the plot are almost parallel, there is not a large interaction between the factors.

The effect of Factor F is similar at both levels of Factor M.

This graph is produced by the spreadsheet at

http://userweb.port.ac.uk/~woodm/interaction.xls . This is interactive and you should be able

to change the response variable in the green cells and see what the interaction graph looks

like.

Figure 3 Interaction Plot Between Mixer Speed and Formulation (response variable, on a scale from 78 to 94, plotted against Factor F from -1 to +1, with one line for Factor M = -1 and one for Factor M = +1)

To see what a strong interaction looks like, consider Table 4a. This leads to the interaction

graph in Figure 4. The two lines here are not parallel: for the high level of Factor M (+1) the

high level of Factor F has a lower yield, but the opposite is true for the low level of Factor M.

This interaction between the two factors means that the effect of one depends on the level of

the other, so the main effects are of less interest because they average across the two levels of

the other factor (although you can still work them out – the main effect of F is +2.5 and of M

is -6.5).
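The difference between Tables 4 and 4a can also be put into numbers. The sketch below works out the effect of F separately at each level of M; it also reports half the difference between these two effects, which is a conventional numerical measure of a two-factor interaction (this measure is our addition and is not defined in these notes). The function names are hypothetical.

```python
def effects_of_f_by_m(yields):
    """yields maps (F code, M code) to the yield for that condition.

    Returns the effect of F at the low level of M and at the high level of M.
    """
    at_low_m = yields[(+1, -1)] - yields[(-1, -1)]
    at_high_m = yields[(+1, +1)] - yields[(-1, +1)]
    return at_low_m, at_high_m


def interaction_effect(yields):
    """Half the difference between the effect of F at high M and at low M."""
    at_low_m, at_high_m = effects_of_f_by_m(yields)
    return (at_high_m - at_low_m) / 2


table_4 = {(-1, -1): 82, (+1, -1): 93, (-1, +1): 80, (+1, +1): 88}
table_4a = {(-1, -1): 82, (+1, -1): 93, (-1, +1): 84, (+1, +1): 78}

for name, table in (("Table 4", table_4), ("Table 4a", table_4a)):
    print(name, effects_of_f_by_m(table), interaction_effect(table))
```

For Table 4 the effect of F is 11 at the low level of M and 8 at the high level, so the lines are close to parallel. For Table 4a the effect of F changes sign (11 against -6), which is the crossing pattern described above.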




                         Factors     Response variable
Experimental condition   F    M      Yield (output)
1                        -1   -1     82
2                        +1   -1     93
3                        -1   +1     84
4                        +1   +1     78

Table 4a – Another response table for mixer speed and formulation

Figure 4 Interaction Plot Between Mixer Speed and Formulation for Table 4a (response variable, on a scale from 76 to 94, plotted against Factor F from -1 to +1, with one line for Factor M = -1 and one for Factor M = +1)

Replication or Repetition

This is the process of repeating the experimental trials to improve the accuracy of

experimentation. The analysis procedure is similar to experiments without replication. As an

example, suppose that each of the runs in Table 4 were replicated an additional two times

giving the results in Table 5. Note that the three figures in the final column correspond to the

yield from three runs with the same factor levels.



Experimental condition   F   M   Yield (output)
1                        1   1   82, 84, 82
2                        2   1   93, 92, 93
3                        1   2   80, 80, 81
4                        2   2   88, 86, 87

Table 5 – Response Table (with three replications)

The main effects and the interactions can now be analysed in just the same way as before,

except that we use the average of the three yields at each combination of factor levels in place

of a single value. For example, the average yield for the first combination of factor levels is

82.7.
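The averaging step can be sketched as follows. This is a minimal illustration using the Table 5 data, with the levels coded 1 and 2 as in that table; the variable names are our own.

```python
# Table 5: (F level, M level) -> yields from the three replications.
table_5 = {
    (1, 1): [82, 84, 82],
    (2, 1): [93, 92, 93],
    (1, 2): [80, 80, 81],
    (2, 2): [88, 86, 87],
}

# Replace the three replications by their average for each condition.
cell_means = {cond: sum(ys) / len(ys) for cond, ys in table_5.items()}

for (f, m), average in cell_means.items():
    print(f"F={f} M={m}: average yield {average:.1f}")
```

These averages then take the place of the single yields in the main effect and interaction calculations.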

Statistical significance

One difficulty with experiments like this is that the effects and interactions observed may be

due to essentially random factors. Are the results real? Or would the next set of trials yield

very different results?

Intuitively the answers to these questions often seem fairly clear – although this is an area

where intuition may be an unreliable guide. For example, in Table 5, the three

experimental runs under each experimental condition are very similar, whereas the runs under

different conditions are very different. This all suggests that further replications would lead to

the same pattern.

As a contrast, if the three replications for the first experimental condition had been 83, 42,

123, and the remaining nine runs had been similarly varied, we would then have had much

less confidence in conclusions based on the average yield from three runs. Table 5a below

gives an example of this. The numbers are arranged so that means of the three replications for

each experimental condition are identical to those for Table 5. This means that the main

effects and the interaction will be just the same as for Table 5. However, the fact that the data

are so much more variable means that the conclusions about the main effects and interaction

are obviously far less certain.



Table 5a. Response table with the same main effects and interactions as Table 5, but

with more variable data

The usual way this difference between Tables 5 and 5a is assessed statistically is by means of

an analysis of variance – the abbreviation ANOVA is widely used for this. This is a very

technical area of statistics, and we will only give a very brief account here.

Table 5a is in a different format from Table 5 because the format of Table 5a is the format

Excel needs to perform a two factor ANOVA with replication (which you will find under Data

Analysis in the Tools menu). The output from this procedure is detailed and technical, and

will not be covered here with the exception of the p-values. These are the main answers

produced by an analysis of variance. They are:

Table 5:    p-value for main effect F = 0.000
            p-value for main effect M = 0.000
            p-value for interaction = 0.017

Table 5a:   p-value for main effect F = 0.644
            p-value for main effect M = 0.827
            p-value for interaction = 0.931

These p-values are the probabilities of getting results “like” the actual results if there were no

effect and the pattern in the data is just due to chance. So, for Table 5, if there really were no

effect, and the results were just due to chance, the results obtained would be very unlikely

indeed – and this is reflected in the low p-values. On the other hand, for Table 5a, if there

really were no effect and the results were due to chance, the results obtained are entirely

plausible – which is reflected in the much higher p-values.

Notice that the interpretation of p-values is the opposite of what you might expect. The lower

the p-value, the stronger the evidence that the effect is real (and not simply a matter of

chance). Conventionally, 5% is often taken as the dividing line: p-values less than 5% are

described as significant (i.e. signifying a genuine effect), and those more than 5% are not

significant (i.e. the evidence is not strong enough). All three p-values for Table 5 are

significant (p<0.05 for all three), but none of the p-values for Table 5a are significant.

        M 1    M 2
F 1     83     80
        42     90
        123    71
F 2     100    128
        93     46
        84     87
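For readers who want to see what lies behind these p-values, the sketch below computes the F ratios of a two factor analysis of variance with replication for the data in Tables 5 and 5a. It is a simplified illustration assuming a balanced design (the same number of replications in every cell), and the function name two_factor_anova is our own. Each F ratio compares the variation attributed to an effect with the variation between replications; turning a ratio into a p-value needs the F distribution, which Excel or a statistics package supplies, so that step is left out here.

```python
def two_factor_anova(cells):
    """cells maps (F level, M level) to a list of replicated yields.

    Returns the F ratios for the two main effects and the interaction.
    Assumes every cell has the same number of replications.
    """
    f_levels = sorted({f for f, _ in cells})
    m_levels = sorted({m for _, m in cells})
    n = len(next(iter(cells.values())))  # replications per cell
    all_values = [y for ys in cells.values() for y in ys]
    grand = sum(all_values) / len(all_values)

    cell_mean = {key: sum(ys) / n for key, ys in cells.items()}
    f_mean = {f: sum(cell_mean[(f, m)] for m in m_levels) / len(m_levels)
              for f in f_levels}
    m_mean = {m: sum(cell_mean[(f, m)] for f in f_levels) / len(f_levels)
              for m in m_levels}

    # Sums of squares for each source of variation.
    ss_f = n * len(m_levels) * sum((f_mean[f] - grand) ** 2 for f in f_levels)
    ss_m = n * len(f_levels) * sum((m_mean[m] - grand) ** 2 for m in m_levels)
    ss_cells = n * sum((cell_mean[key] - grand) ** 2 for key in cells)
    ss_interaction = ss_cells - ss_f - ss_m
    ss_error = sum((y - cell_mean[key]) ** 2
                   for key, ys in cells.items() for y in ys)

    # Degrees of freedom and the error mean square.
    df_f = len(f_levels) - 1
    df_m = len(m_levels) - 1
    df_interaction = df_f * df_m
    df_error = len(f_levels) * len(m_levels) * (n - 1)
    ms_error = ss_error / df_error

    return {
        "F": (ss_f / df_f) / ms_error,
        "M": (ss_m / df_m) / ms_error,
        "interaction": (ss_interaction / df_interaction) / ms_error,
    }


table_5 = {(1, 1): [82, 84, 82], (2, 1): [93, 92, 93],
           (1, 2): [80, 80, 81], (2, 2): [88, 86, 87]}
table_5a = {(1, 1): [83, 42, 123], (2, 1): [100, 93, 84],
            (1, 2): [80, 90, 71], (2, 2): [128, 46, 87]}

for name, table in (("Table 5", table_5), ("Table 5a", table_5a)):
    ratios = two_factor_anova(table)
    print(name, {effect: round(value, 2) for effect, value in ratios.items()})
```

For Table 5 all three ratios are far above 1, whereas for Table 5a they are all well below 1: the effects there are small compared with the scatter between replications, which is why the p-values are so high.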



Other significance tests

There are many other methods for working out these p-values in a variety of different

situations. The general approach is known as null hypothesis testing, or significance testing

(significance levels are another term for p-values). (The null hypothesis in the example above

is that there are no real differences and that any observed differences are just due to chance.)

The result is always a p-value, and its interpretation is as described in the example above.

Significance (null hypothesis) tests are important in many other contexts besides designing

experiments. The underlying concept is very similar to the idea underlying control charts (see

Session 2), with the null hypothesis being the “in control” state of the process.

As a second example, consider Table 5b below.

Table 5b. Correlations between 5 variables in the drink data

                                  AGE      SATUNITS   SUNUNITS   MONUNITS   DAYCIGS
AGE        Pearson Correlation    1        -0.288     -0.130     -0.348     -0.097
           Sig. level (p-value)            0.005      0.218      0.001      0.359
           N                      92       92         92         92         92
SATUNITS   Pearson Correlation    -0.288   1.000      0.591      0.729      0.554
           Sig. level (p-value)   0.005               0.000      0.000      0.000
           N                      92       92         92         92         92
SUNUNITS   Pearson Correlation    -0.130   0.591      1.000      0.649      0.759
           Sig. level (p-value)   0.218    0.000                 0.000      0.000
           N                      92       92         92         92         92
MONUNITS   Pearson Correlation    -0.348   0.729      0.649      1.000      0.480
           Sig. level (p-value)   0.001    0.000      0.000                 0.000
           N                      92       92         92         92         92
DAYCIGS    Pearson Correlation    -0.097   0.554      0.759      0.480      1.000
           Sig. level (p-value)   0.359    0.000      0.000      0.000
           N                      92       92         92         92         92

The data on which this is based (at http://userweb.port.ac.uk/~woodm/nms/drink.xls) includes

answers from 92 students to questions about their age, how many units of alcohol they had

drunk the previous Saturday, Sunday and Monday, and the average number of cigarettes a

day they smoked. The (Pearson) correlations indicate the relationship between the variables:

the interpretation of these is explained in Session 2.




The significance levels, or p-values, are based on the null hypotheses that all the correlations

are actually zero and there are no relationships between the variables. Low p-values mean that this

hypothesis is not plausible, so the null hypothesis should be rejected and we should conclude

that the correlation is genuine. For example, the correlation between SATUNITS and

DAYCIGS is 0.554. This suggests a reasonably strong tendency for people who drink a lot on

Saturday night to be among the heavier smokers. The p-value for this is 0.000, which is very

low indicating a significant result – this means that this is not likely to be a chance effect but

that the correlation is genuine. On the other hand the correlation between AGE and

DAYCIGS is -0.097. This suggests a slight tendency for older people to smoke less – but the

high value of the significance level (0.359) suggests that the null hypothesis is plausible and

this could well be a chance effect.
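To give a feel for where such p-values come from, the sketch below uses the standard test statistic for a Pearson correlation, t = r × √(n - 2) / √(1 - r²), and converts it to an approximate two-sided p-value. Two assumptions should be noted: the notes do not say how the figures in Table 5b were calculated, and we use a normal approximation in place of the t distribution (reasonable with 92 cases), so the results will be close to, but not exactly, the tabulated values. The function name is our own.

```python
from math import sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n cases."""
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    # Normal approximation to the t distribution with n - 2 degrees of freedom.
    return 2 * (1 - NormalDist().cdf(abs(t)))


# SATUNITS with DAYCIGS, and AGE with DAYCIGS, from Table 5b (N = 92).
for r in (0.554, -0.097):
    print(r, round(correlation_p_value(r, 92), 3))
```

The first correlation gives a p-value that is zero to three decimal places, and the second gives a value in the region of the 0.359 shown in Table 5b.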

As a final example, let’s see how significance testing applies to some research on customer

service in two different kinds of financial institution: banks and building societies

(McGoldrick and Greenland, Competition between banks and building societies. British

Journal of Management, 3, 169-172, 1992). The data in Table 5c was obtained from a sample

of customers who rated each institution on a scale ranging from 1 (very bad) to 9 (very good).

The six dimensions shown in the table are a selection from the 22 reported in the paper. NS means not

significant – which in this table means that the p value is greater than 0.1. (This is a bit unusual:

as mentioned above this level would normally be 5%.)

Table 5c. Customer service ratings from McGoldrick and Greenland (1992)

Aspect of service            Banks’ mean rating   Building Society’s mean rating   Level of significance (p)
Sympathetic/understanding    6.046                6.389                            0.000
Helpful/friendly staff       6.495                6.978                            0.000
Not too pushy                6.397                6.644                            0.003
Time for decisions           6.734                6.865                            0.028
Confidentiality of details   7.834                7.778                            NS
Branch manager available     5.928                6.097                            0.090

Remember: the lower the p value, the more convincing the evidence is against the null

hypothesis. Unfortunately, this is rather counter-intuitive and easily misinterpreted.

Take care!



We will not go into more detail in this unit. If you want to read further on hypothesis testing

there are some more detailed notes at http://userweb.port.ac.uk/~woodm/stats/StatNotes3.pdf,

and you will find more on analysis of variance and other techniques in Hoerl and Snee

(2002), Antony and Kaye (2000), Wood (2003), and on the website at

http://www.statsoft.com/textbook/stathome.html.




QUESTIONS TO STIMULATE YOUR THINKING

Now consider and answer the following questions, the purpose of which is to

test your retention of the information given in the course notes.

Questions

1 Calculate the main effects from the data in Table 5.

2 Now suppose that the yields from the fourth experimental condition in Table 5

were 128, 126, 127 (instead of 88, 86, 87).

Calculate the main effects from this data, and also draw an interaction plot like

Figure 3.

Do the results show an interaction? If so, describe it.

What do you think the p-values for the main effects and the interaction are? (It

is not normally possible to work this out in your head, but in this example the

situation is so clear that you should be able to come up with a good guess.)

3 Can you think of any problems in your home or work life where a factorial

experiment might be useful?

4 What are the null hypotheses being tested in Table 5c? Which of the results are significant

at the 5% level? For which aspects of service is evidence strongest for a difference in

service levels between banks and building societies?


Suggested Answers

1 The average yields for each condition come to 82.7, 92.7, 80.3, 87. The main effect

for F is 8.3 and for M is -4.

2 The main effects are now 28.3 for F, and 16 for M. The lines on the graph are not

parallel, so there is a definite interaction. The effect of one factor depends on the

level of the other.

If we start from the first experimental condition, changing the level of F will

increase the yield by 10. Starting again from this condition, if we change the level

of M, the yield goes down by 2.4. This means we might expect doing both changes

together would lead to an increase of 10-2.4 or 7.6. However, this is not so:

changing the levels of both factors together leads to a massive increase of 44.3. The

two factors interact, so we cannot calculate their combined effects by adding the

effects of each factor.

All three p-values are 0.0000 (i.e. zero to four decimal places). The reason is that

the main effects and interaction are large compared with the variation between the

three replications in each experimental condition (the difference between biggest and

smallest is only 1 or 2 for all four experimental conditions). This means that the

results would have been very, very unlikely to be the result of chance, so the

evidence for the main effects and interaction is statistically significant.

3 There are many possible uses—e.g. experiments to find the best recipe for cooking

something, or to find the factors that have an impact on blood pressure (e.g.

exercise, salt consumption, stress level).

4 The null hypotheses are that there are no overall differences in means between

banks and building societies. The first four results are significant at the 5% level.

The last two are not significant – we can’t be sure they are not due to chance. The

evidence for a difference is strongest for the first two (sympathetic/understanding

and helpful/friendly).



Fractional Factorial Experiments at Two Levels

The difficulty with full factorial experiments with a large number of factors is that the

number of experimental runs may become too large. For example, if you want to study seven

factors for a certain experiment and you choose a full factorial experiment, then you have to

perform 2^7 = 128 experimental runs.

The solution is to ignore some of the possible combinations of factors and study only a

fraction of them. This is known as a fractional factorial experiment. For example if you study

one sixteenth of the possible combinations you will only have to study 8 combinations of

factor levels - this is a possibility illustrated by one of the case studies below. The difficulty,

of course, with such highly fractional factorial designs is that you will not be able to

investigate many of the interaction effects. The next scenario illustrates some of the issues.

Scenario

Baker's paradise is a newly established baking school. Despite continuous efforts, the bakery

had failed to produce cakes which the customers liked. The management was looking for the

combination of ingredients which would produce the nicest cakes. A project was initiated to

study this problem. After a brainstorming session of 2 hours, it was decided that the

experiment would include six factors. The factors which were considered for the experiment

are shown in Table 6. Each factor was kept at two levels and the goal was to determine the

factor-level combination yielding the nicest cakes. A full factorial experiment would require

2^6 = 64 experimental runs. Because of limited time and resources, a fractional factorial

experiment with eight runs was selected. Each run was evaluated by asking a panel of

customers to rate each set of cakes on a 1 (very nasty) to 10 (very nice) scale.



Factors/variables       Notation   Level 1   Level 2
Milk (cups)             M          1/4       1/2
Sugar (cups)            S          1/2       3/4
Eggs                    E          2         3
Flour (cups)            F          3/4       1
Oven Temperature (°C)   O          200       225
Butter (cups)           B          1/4       1/2

Table 6 - List of factors for the Cake Baking Experiment

The mean customer ratings from each run are shown in the response table shown in Table 7.

(The design in terms of which combinations of factor levels are run is discussed in more

detail in the section on orthogonal arrays below.)

Experimental run   M   S   E   F   O   B   Mean customer rating
1                  1   1   1   2   2   1   5.5
2                  2   1   1   1   2   2   5.8
3                  1   2   1   2   1   2   6.5
4                  2   2   1   1   1   1   6.0
5                  1   1   2   1   1   2   6.2
6                  2   1   2   2   1   1   7.2
7                  1   2   2   1   2   1   6.2
8                  2   2   2   2   2   2   7.3

Table 7 - Response Table for the Cake Baking Experiment

This example is a simplified, artificial one to demonstrate how experimental design works. In

a real situation, you would need to carry out research (involving, perhaps, a brainstorming

session with the people involved with the process) to make sure that:

the list of factors is appropriate, and

the levels are reasonable ones to try (some levels may be obviously

inappropriate – these do not need to be considered in the experiment), and

the response variable – mean customer rating in this example – is a sensible one. It is

possible, for example, that there may be different groups of customers with different

tastes, and it may be more helpful to do a separate analysis for each of these groups.



Calculation of Main effects

These are shown in Table 8. The calculation of the figure for M goes as follows. First we find

the average of the mean customer rating for the high (2) level - i.e. the average of runs 2, 4, 6,

8 (since these have 2 in the first, M, column). Then we do the same for the low (1) level runs.

The difference between the two averages comes to

0.25(5.8+6.0+7.2+7.3) - 0.25(5.5+6.5+6.2+6.2) = 6.575 - 6.1 = 0.475

This means that, on average, the higher level of M gets a rating 0.475 higher than the low

level. The design has the advantage that the four high level runs include two high and two

low level runs for each of the other factors, which means that the comparison should be fair

with regard to the other factors. This is discussed in more detail in the section on orthogonal

arrays below.

Main effects

Estimate

M +0.475

S +0.325

E +0.775

F +0.575

O -0.275

B +0.225

Table 8 - Main Effects
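The arithmetic behind Table 8 can be checked with a short script. The following is a minimal Python sketch, not part of the original notes: the names FACTORS, RUNS and main_effect are illustrative choices, and the data are copied from Table 7.

```python
# Table 7: factor levels (1 = low, 2 = high) and mean customer rating for each run.
FACTORS = ["M", "S", "E", "F", "O", "B"]
RUNS = [
    ((1, 1, 1, 2, 2, 1), 5.5),
    ((2, 1, 1, 1, 2, 2), 5.8),
    ((1, 2, 1, 2, 1, 2), 6.5),
    ((2, 2, 1, 1, 1, 1), 6.0),
    ((1, 1, 2, 1, 1, 2), 6.2),
    ((2, 1, 2, 2, 1, 1), 7.2),
    ((1, 2, 2, 1, 2, 1), 6.2),
    ((2, 2, 2, 2, 2, 2), 7.3),
]


def main_effect(column):
    """Average rating at the high level minus average rating at the low level."""
    high = [rating for levels, rating in RUNS if levels[column] == 2]
    low = [rating for levels, rating in RUNS if levels[column] == 1]
    return sum(high) / len(high) - sum(low) / len(low)


effects = {name: main_effect(i) for i, name in enumerate(FACTORS)}
for name, value in effects.items():
    print(f"{name}: {value:+.3f}")
```

Each effect is based on four runs at each level, so every figure in Table 8 uses all eight ratings.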

Table 8 suggests that the higher level produces the higher average rating for all factors except

O. For O the lower temperature produces the best results. The biggest effect is achieved by

the Eggs factor: including an extra egg makes a bigger difference than increasing the

quantities of the other ingredients.

This suggests that the best results would be achieved by setting all the factors at the higher

level, except for O, which should be set at the lower level. This combination, however, is not

included in the experimental runs which have been performed. (Remember this is a fractional

factorial experiment, so we have not tried all possible combinations.)

Despite this it is possible to make a crude prediction of the rating that would result from this

optimum combination. The closest of the actual runs is Run 8 which produced a mean rating



of 7.3. This run differs just in the level of O, so, as the effect of O is -0.275, a crude

prediction for the optimum combination would be 7.3 + 0.275 = 7.575. A slightly more

robust estimate can be produced by starting from the average of all the mean ratings, and then

adding or subtracting the effect of each factor - this is explained in the section on Taguchi’s

approach below (in the subsection on Prediction of the mean and sd for the optimum factor

settings). However, either method will produce a prediction which may well be wrong: the

problems, and what we can do about these problems, are discussed in the next two sections.
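The two prediction methods described above can be compared directly. This is an illustrative Python sketch only; it assumes the effects in Table 8 and the ratings in Table 7, and the variable names (crude, additive) are hypothetical.

```python
# Main effects from Table 8 (high-level average minus low-level average).
effects = {"M": 0.475, "S": 0.325, "E": 0.775, "F": 0.575, "O": -0.275, "B": 0.225}
# Mean customer ratings from Table 7, runs 1 to 8.
ratings = [5.5, 5.8, 6.5, 6.0, 6.2, 7.2, 6.2, 7.3]

# Crude method: start from Run 8 (all factors high) and switch O to its low level,
# which removes the negative effect of the higher oven temperature.
crude = ratings[7] - effects["O"]

# Alternative method: start from the overall average and, for each factor, move
# half of its effect in the favourable direction (the better level is chosen
# for every factor, so each contribution is positive).
grand_mean = sum(ratings) / len(ratings)
additive = grand_mean + sum(abs(effect) / 2 for effect in effects.values())

print(f"Crude prediction:    {crude:.3f}")
print(f"Additive prediction: {additive:.4f}")
```

The two figures are close but not identical, which is a reminder that both are rough estimates rather than guarantees.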

Confirmation run

It is possible that increasing the quantities of all ingredients may not be as effective as our

results suggest. The obvious way of finding out is to do a confirmation run to verify that this

combination does in fact produce a mean rating close to the prediction (or at least better than

any of the combinations which have been tried in the experiment).

If this confirmation run were to be done, and the mean rating was 7.8, this would support the

idea that this combination is the best. If, on the other hand, the mean rating were only 6.0,

this would suggest that this conclusion is wrong and that we need to perform a more detailed

experiment in which we try more combinations of factor levels.

Interactions and statistical significance

It is tempting to try to get some idea of two way interactions from the results in Table 7.

However, this is not usually a good idea. To see why not, consider the factors S and O.

There are four runs with low levels of S, and of these two have low levels of O, and two have

high levels. The mean rating from the two with low levels of O (runs 5 and 6) is 6.7, and the

mean from two with high levels of O (1 and 2) is 5.65. This suggests that if the level of S is

low, O has a negative effect of -1.05 (i.e. the higher rating is obtained from the lower level).

Similar arithmetic with the four runs with the higher levels of S produces a positive effect of

+0.5.

This means that the effect of O seems to depend on the value of S. There seems to be an

interaction between the two factors. A diagram like Figure 3 would show two lines which are

not parallel; in fact one will slope upwards and the other downwards.

However, we should be very cautious about this result. The difficulty is that what appears to



be an interaction could simply be the effect of Factor E. This factor appears to have a strong

positive effect because the average of the last four experimental runs is higher than the

average of the first four. We cannot distinguish between the main effect of E and an

interaction between S and O. (If we did want to find out about the interaction we should

avoid having a Factor E, and use this column for the interaction – see, for example, Mitra,

1993, p. 531.)
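The confounding described here can be demonstrated numerically. In the sketch below (illustrative Python, with levels recoded as -1 for low and +1 for high, a common convention that the notes themselves do not use), the product of the S and O columns of Table 7 turns out to be identical to the E column.

```python
# Columns of Table 7 for S, O and E (1 = low, 2 = high), runs 1 to 8.
S = [1, 1, 2, 2, 1, 1, 2, 2]
O = [2, 2, 1, 1, 1, 1, 2, 2]
E = [1, 1, 1, 1, 2, 2, 2, 2]
ratings = [5.5, 5.8, 6.5, 6.0, 6.2, 7.2, 6.2, 7.3]


def coded(levels):
    """Recode levels 1/2 as -1/+1."""
    return [-1 if level == 1 else 1 for level in levels]


# The interaction column is the run-by-run product of the two coded columns.
so_product = [s * o for s, o in zip(coded(S), coded(O))]
aliased = so_product == coded(E)

# Contrast for the S x O column: average of the +1 runs minus average of the -1 runs.
so_contrast = sum(c * r for c, r in zip(so_product, ratings)) / 4

print("S x O column identical to E column:", aliased)
print(f"S x O contrast: {so_contrast:+.3f}")
```

Because the two columns are the same, the contrast is the same number as the main effect of E in Table 8: the data cannot tell us which of the two explanations is correct.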

Fractional experiments like this are mainly for assessing the main effects; we must be very

careful about drawing conclusions about interactions from them. To study interactions reliably

we need a full factorial experiment.

However, it is possible that interactions like this do exist, so we must be cautious about

drawing conclusions from the main effects, because these are an overall average which may

not be very useful if the effect varies according to the values of the other variables. This is one

of the reasons why the confirmation run is so important.

The other reason why we should be cautious of the result and should do a confirmation run to

check it is that the amount of data is very limited, and the differences we have found are

fairly small, so there is a possibility that the patterns we found in the experiment are due to

chance. Perhaps if we took another sample the results would be different? The formal way of

checking whether this is plausible is to use an analysis of variance to estimate the statistical

significance of the results – see the section on Statistical significance above. A less formal

method of checking would be to do one or more confirmation runs. If the results are a matter

of chance they are unlikely to recur in the confirmation runs.



Taguchi Methods for Quality Improvement

Introduction

Traditionally the role of quality control has been that of eliminating defective products by

means of statistical sampling and inspection rules, and improving the process by means of

SPC. As a result of an increased emphasis on quality products and cost effectiveness in

modern complex manufacturing systems, the scope of the quality control process was

significantly extended during the 1980s. One of the most outstanding contributions in this

area can be attributed to the Japanese engineer Dr Genichi Taguchi, an international

consultant in the field of quality management. Taguchi has formulated both a philosophy and

a methodology for the process of quality improvement that depends on statistical concepts,

especially the application of statistical experiments to the process of designing products (see

Taguchi, 1986).

The Design Process

The goal of experimentation in manufacturing is to devise ways of minimizing the deviation

of a quality characteristic from its target value. This can be done by identifying those factors

which impact the quality characteristic in question and by changing the appropriate factor

levels so that the deviations are minimized and the quality characteristic is on target. In other

words, from a quality perspective, experimentation seeks to determine the best material, the

best pressure, the best temperature, chemical formulation, cycle time, etc. which will operate

together within a process to produce a desired quality characteristic such as length, durability,

etc.

Taguchi’s approach to the design of experiments utilizes robust design, which can be applied

to a wide variety of problems. Robust Design adds a new dimension to statistical

experimental design by explicitly addressing the concerns faced by all process and product

designers, such as:

how to reduce economically the variation of a product’s function in the customer’s

environment, and

how to ensure that decisions found to be optimum during laboratory experiments will

prove to be so in manufacturing and in customer environments.



The objective of engineering design

The objective of engineering design, a major part of research and development, is to produce

drawings, specifications, and other relevant information needed to manufacture products that

consistently meet customer requirements. Knowledge of scientific phenomena and past

engineering experience with similar product designs and manufacturing processes form the

basis of the engineering design activity. However, when a number of new decisions related to

a product must be made regarding process and product architecture, parameters of the

manufacturing process and the functional characteristics of a product, it becomes necessary to

engineer the whole research and development in a concurrent way. Additionally, a large

amount of engineering effort is always expended in conducting research and development

(either with hardware or software, by experimentation or simulation) to generate the

information needed to guide these decisions. Therefore, efficiency in generating such

information is the key to meeting market requirements, keeping product development and

manufacturing costs low while attaining high-quality products. Robust design is an

engineering methodology for improving productivity during research and development so

that high-quality products can be produced quickly and at low cost.

Variability due to noise factors

The factors that cause variability in a product’s proper functioning are called noise factors.

Such factors cause, for example, the brightness of a fluorescent lamp to vary with power

supply and voltage, to deteriorate over time, as well as to vary between different lamps. There

are three main types of noise:

external noise,

internal noise, and

unit-to-unit noise.

1. External noise (Ambient noise)

External noise refers to factors in the environment or conditions of use that influence the

ideal functioning of a product. Examples of environmental noise factors are ambient

temperature, humidity, dust, supply voltage, electromagnetic interference, vibrations and

human error in operating a product.

2. Internal noise (Deterioration noise)

Internal noise refers to factors that cause a product to deteriorate during storage or to wear out

during use so that it can no longer achieve its target functions. Examples of internal noise

factors are the wear of parts and the deterioration of components with age.



3. Unit-to-unit noise (Variational noise)

Unit-to-unit noise refers to factors that cause differences between individual products that

have been manufactured to the same specifications. This variation is inevitable in a

manufacturing process and leads to variations in the product parameters from unit to unit. For

example, the value of a resistor may be specified to be 100 ohms, but the resistance value

may be 101 ohms in one particular unit and 98 ohms in another.

Examples of noise

1. Colour television power circuit

The function of a power circuit in a colour television set is to convert alternating current (AC)

input into direct current (DC) output. If the power circuits in all sets manufactured

maintained a constant direct current output under perfect conditions, their voltage would be

perfect. However, it is likely that the following noise factors may cause the output to deviate

from its target voltage.

i. External noise

All variations in environmental conditions such as temperature, humidity, dust, and

input voltage.

ii Internal noise

Changes in the component and material characteristics. For example, after 10 years

the resistance of a resistor may have increased by 10%.

iii Unit-to-unit noise

Differences between individual manufactured units, causing different output voltages

from the same input voltage.

2. Refrigerator

Some of the important noise factors related to the temperature control inside a refrigerator are

given below:

i External noise

The number of times the door is opened and closed, the amount of food kept and the

initial temperature of the food, variation in the ambient temperature and the

fluctuation in power supply voltage.

ii Internal noise



The leakage of refrigerant and mechanical wear of compressor parts and door seals.

iii Unit-to-unit noise

The tightness of the door closure and the amount of refrigerant used.

3. Automobile

The following noise factors are important for the braking distance of an automobile:

i External noise

Wet or dry roads, concrete or asphalt surfaces and the number of passengers in the

car.

ii Internal noise

The wear of the drums and brake pads, and leakage of brake fluid.

iii Unit-to-unit noise

Variations in the friction coefficient of the pads and drums, and the amount of brake

fluid.



Taguchi's Quality Engineering System

Quality engineering is an engineering approach to producing high quality products at low cost,

where cost includes product lifetime costs, manufacturing costs, etc. Taguchi divides quality

engineering into two stages - off-line and on-line quality engineering. Off-line quality

control methods are those technical aids for both quality and cost control in both product and

process design. On-line quality methods are those technical aids for both quality and cost

control during actual production. Here we will discuss only the off-line quality engineering

system proposed by Taguchi.

Taguchi's Off-line Quality Engineering System

Taguchi divides the Off-line quality engineering system into two stages - product design and

process design optimisation. Taguchi developed a three stage approach for assuring quality

within each of the two stages of off-line quality engineering system. These are system

design, parameter design and tolerance design.

System Design

System design is the phase which involves generating a basic prototype design that performs

the function of the product with minimum deviation from its target performance. In this

phase, new ideas and concepts are developed to provide improved products to consumers.

Parameter Design

This is the most important phase of Taguchi's quality system and is used to make products

and processes less sensitive to external disturbances. The objective of parameter design is to

determine the optimal settings of process parameters (identified from brainstorming) that will

dampen the effect of external disturbances. These external disturbances are responsible for

excessive variation in the product's functional performance from its target value.

Tolerance Design

This phase involves looking at each parameter to see if it is useful to trade off quality loss and

cost. For example, the narrower the tolerance band, the more costly it becomes to

manufacture the product. On the other hand, the wider the tolerance band, the lower the

quality is likely to be and therefore the greater the risk of product non-uniformity.



Experimental design and orthogonal arrays

Taguchi advocates the use of formal experimental designs to assist in the process of

parameter design. Unless the number of factors is very small, he suggests the use of fractional

factorial designs instead of full factorial designs, in order to save time and cost. In particular

he suggests the use of certain orthogonal arrays to plan the experiment.

An orthogonal array is a matrix of numbers arranged in columns and rows. Each column

represents a specific factor or condition that can be changed from experiment to experiment.

Each row represents the state of the factors in a given experiment. So-called orthogonal

arrays have the property that the levels of the various factors are arranged in such a way that

the effect of one factor can be separated from the effects of the other factors (assuming no

interactions). Table 9 shows one of these orthogonal arrays: L8 (2^7). Here 8 refers to the number of

experimental conditions tried out, 2 is the number of factor levels, and 7 is the number of

factors. A full factorial experiment would involve 2^7 = 128 experimental conditions, so this

design is far more economical.

Experiment Factors

1 2 3 4 5 6 7

1 1 1 1 1 1 1 1

2 1 1 1 2 2 2 2

3 1 2 2 1 1 2 2

4 1 2 2 2 2 1 1

5 2 1 2 1 2 1 2

6 2 1 2 2 1 2 1

7 2 2 1 1 2 2 1

8 2 2 1 2 1 1 2

Table 9: The L8 (2^7) orthogonal array.

The main advantage of orthogonal arrays is that they allow us to compare the effect of low

and high levels of (for example) Factor 1 because the comparison is “fair” with respect to the



other factors. For example the high level (of Factor 1) is combined with low levels of Factor

2 in rows (experiments) 5 and 6, and with high levels of Factor 2 in rows 7 and 8. Exactly the

same is true of the low levels of Factor 1. This means the comparison is fair in the sense that

it cannot be attributed to any of the other factors. The same is true of any other pair of factors.

It is possible to delete one or more of the columns of an orthogonal array and still have an

orthogonal array. For example, if we delete the seventh column in Table 9, we get an array

for investigating the effects of six factors. This could have been used instead of the array in

Table 7. There are more orthogonal arrays in Appendix A - by deleting some of the columns

they can be used for a wide variety of situations. (The array in Table 7 is orthogonal,

although it is not one of those listed in the appendix.)
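The “fairness” property can be tested mechanically: in a two-level orthogonal array, every pair of columns contains each of the four level combinations (1,1), (1,2), (2,1) and (2,2) equally often. The following Python sketch applies this check to Table 9 and to the design in Table 7; the function name is_orthogonal is an illustrative choice, not a term from the notes.

```python
from collections import Counter
from itertools import combinations

# Table 9: the L8 (2^7) orthogonal array, one tuple per experiment.
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2, 2, 2),
    (1, 2, 2, 1, 1, 2, 2),
    (1, 2, 2, 2, 2, 1, 1),
    (2, 1, 2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1, 2, 1),
    (2, 2, 1, 1, 2, 2, 1),
    (2, 2, 1, 2, 1, 1, 2),
]

# Table 7: the design used in the cake baking experiment (columns M, S, E, F, O, B).
TABLE_7 = [
    (1, 1, 1, 2, 2, 1),
    (2, 1, 1, 1, 2, 2),
    (1, 2, 1, 2, 1, 2),
    (2, 2, 1, 1, 1, 1),
    (1, 1, 2, 1, 1, 2),
    (2, 1, 2, 2, 1, 1),
    (1, 2, 2, 1, 2, 1),
    (2, 2, 2, 2, 2, 2),
]


def is_orthogonal(array):
    """True if every pair of columns shows all four level pairs equally often."""
    columns = list(zip(*array))
    for first, second in combinations(columns, 2):
        counts = Counter(zip(first, second))
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


print("L8 orthogonal:", is_orthogonal(L8))
print("L8 without column 7 orthogonal:", is_orthogonal([row[:6] for row in L8]))
print("Table 7 orthogonal:", is_orthogonal(TABLE_7))
```

The same function can be applied to any two-level design to see whether a fair comparison of main effects is possible.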

Type of Factors in Taguchi's Experimental Design Techniques

In performing a Taguchi experiment, one should be aware of two types of factors:

a Control factors

These are factors which can be easily controlled during actual production conditions. The

levels of these factors are generally selected by design engineers. For example, in a cake

baking experiment, eggs, sugar, oven temperature, etc. are control factors.

b Noise factors

These are factors which cannot be controlled or are very expensive to control during the

normal production conditions. These noise factors are sources of variation in products and

processes and hence the cause of poor quality. Examples of noise factors are: relative

humidity, ambient temperature, etc.

Different types of response variables

The response variable used to assess quality levels will obviously be different in different

situations. There are three possible types. It is very important to bear this in mind when

designing and analysing an experiment.



a Smaller-the-Better quality characteristics

This type of characteristic is typically a measure of porosity, tool wear, number of

defects, etc. A smaller-the-better quality characteristic has an ideal value of zero.

b Larger-the-Better quality characteristics

This type of characteristic is typically a measure of yield, efficiency, customer rating, etc. and

has an ideal value as large as possible.

c Target-is-the-Best quality characteristics

For this type of characteristic, one measures dimensions such as the diameter, length, etc. A

target value is always specified for the characteristic and a minimal variability around the

target is desired. There are two issues here: is it on target, and is the variability around the

target acceptable?



Industrial Case Study: An Application of Parameter Design for a Tile

Manufacturing Process

This case study is an application of Taguchi's parameter design to minimise the percentage of

defective tiles from a calcining (baking) process. During the late 1950s, the Ina Tile company

in Japan faced the problem of high variability in the dimensions of the tiles it produced. A

team of engineers was assigned to investigate the cause of the problem. The team found that

the uneven temperature distribution in the kiln caused a variation in size of the tiles produced.

The team reported that it would cost approximately £250,000 to redesign and build a kiln in

which all the tiles would receive uniform temperature. Here the temperature distribution in

the kiln is a noise factor which cannot be controlled or would be very expensive to control for

normal production conditions. However the company wanted to reduce the tile size variation

without increasing costs. This case study will illustrate how the company achieved a

significant improvement in the tile quality by reducing the variation of the tile dimensions.

The schematic diagram of the kiln is shown below.

The company decided to conduct an experiment to investigate the effects of various factors

(or parameters) in the tile manufacturing process, with a view to making the process more

robust, so that it can produce a consistent tile size despite the temperature variation.

Figure 9 Schematic Diagram of the Kiln (figure not reproduced; labelled parts: burners and kiln wall)



Selection of factors and Levels

Control factors: Level 1 Level 2

Amount of Limestone - Factor A 5% (new) 1% (existing)

Fineness of additive - Factor B coarser (existing) finer (new)

Amount of Agalmatolite - Factor C 43% (new) 53% (existing)

Type of Agalmatolite - Factor D Existing New

Raw material charging quantity - Factor E 1300 kg (new) 1200 kg (existing)

Amount of waste return - Factor F 0% (new) 4% (existing)

Amount of feldspar - Factor G 0% (new) 5% (existing)

The noise factor in this experiment was related to the position of the tile on the cart as it went

through the calcining process. Six different positions for tiles on the cart were to be

considered. Let 'P' be the position of the tile on the cart. The six different positions are:

P1 - Outside top P2 - Outside middle P3 - Outside bottom

P4 - Inside top P5 - Inside middle P6 - Inside bottom

The effects of these control factors were studied using an L8 orthogonal array (see the

Appendix) and the response was measured. The characteristic or response measured in this

experiment was the tile width, with the target value being 150 mm. The response table for the

experiment is shown in Table 10.

Run A B C D E F G P1 P2 P3 P4 P5 P6

1 1 1 1 1 1 1 1 151.9 151.4 150.4 150.2 149.6 149.5

2 1 1 1 2 2 2 2 151.5 150.8 150.0 149.8 149.4 149.1

3 1 2 2 1 1 2 2 153.1 151.8 151.8 151.4 150.6 150.3

4 1 2 2 2 2 1 1 152.2 151.3 151.1 150.6 150.1 150.0

5 2 1 2 1 2 1 2 151.5 150.8 150.6 150.2 149.7 149.5

6 2 1 2 2 1 2 1 156.5 152.1 150.3 148.5 146.3 144.6

7 2 2 1 1 2 2 1 154.5 153.3 151.8 150.9 150.4 149.6

8 2 2 1 2 1 1 2 153.0 152.5 152.0 151.9 151.3 149.5

Table 10 Response Table for the Tile Experiment



Statistical Analysis and Interpretation

The purpose of parameter design is to determine the control factor levels that will make the

process insensitive to noise. The width is obviously a target-is-best quality measurement, so

we need to:

Reduce variability caused by noise.

Bring the mean response or quality characteristic on to the target by using adjustment

factors which affect the mean response only.

There are various measures of variability we can use. The obvious, easy, ones are the

standard deviation (see Session 2), or the range (simply the largest value - smallest). Taguchi

suggests a more complex measure - the signal to noise ratio. This is not covered in this unit,

and there are strong objections to it (see Mitra, 1993: 529-530). Here we will use the

standard deviation (sd). From the standpoint of theoretical statistics it is better to use a more

complex function than this, namely log(sd^2), as explained in Mitra (1993: 529), but the standard

deviation itself will be good enough for our purposes here.

Having obtained the response table, the next step is to obtain the sd and mean response table

corresponding to each experimental point. Table 11 gives the means and sds of the widths for

the six positions in each of the eight experimental runs.

Experimental run Mean width Sd of widths

1 150.5 0.97

2 150.1 0.90

3 151.5 1.00

4 150.9 0.83

5 150.4 0.74

6 149.7 4.27

7 151.8 1.85

8 151.7 1.22

Table 11 Means and Sds of tile widths for each run

The mean width for each experimental run is obtained by adding up all six widths and then

dividing the total by 6 (the number of observations in each experimental run). The standard

deviation is simply the standard deviation of the six numbers.
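The figures in Table 11 can be reproduced from the widths in Table 10. The sketch below is illustrative Python rather than part of the original analysis; it assumes that “standard deviation” means the sample standard deviation (divisor n - 1), which is what Python's statistics.stdev computes and which matches the values in Table 11.

```python
from statistics import mean, stdev

# Table 10: tile widths at positions P1 to P6 for each experimental run.
WIDTHS = {
    1: [151.9, 151.4, 150.4, 150.2, 149.6, 149.5],
    2: [151.5, 150.8, 150.0, 149.8, 149.4, 149.1],
    3: [153.1, 151.8, 151.8, 151.4, 150.6, 150.3],
    4: [152.2, 151.3, 151.1, 150.6, 150.1, 150.0],
    5: [151.5, 150.8, 150.6, 150.2, 149.7, 149.5],
    6: [156.5, 152.1, 150.3, 148.5, 146.3, 144.6],
    7: [154.5, 153.3, 151.8, 150.9, 150.4, 149.6],
    8: [153.0, 152.5, 152.0, 151.9, 151.3, 149.5],
}

# Mean and sample standard deviation of the six widths in each run.
summary = {run: (mean(widths), stdev(widths)) for run, widths in WIDTHS.items()}

for run, (run_mean, run_sd) in summary.items():
    print(f"Run {run}: mean = {run_mean:.1f}, sd = {run_sd:.2f}")
```

Run 6 stands out: its mean is close to the target, but its sd is far larger than that of any other run.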



Once the sd and mean response are calculated for each experimental trial, the next step is to

identify which factors impact response variability, and which factors can be used to adjust the

mean response on to the required target. In order to achieve the above two objectives, we

simply calculate the average sd and the average of the means at each level of the factors

under consideration from which their effects can be easily computed. This is just like the

calculation of the main effects above, except that one of the response variables is a standard

deviation.

For example, for Factor A at the Level 1, the average mean is ¼ (150.5 + 150.1 + 151.5 +

150.9)=150.75. Similarly the average for Level 2 is 150.89, so the effect of Factor A is the

difference between these two - i.e. +0.14 (the + indicating that the average for level 2 is

higher than for 1). Just the same calculation can be done for the standard deviation.



A B C D E F G

Mean

Level 1 150.75 150.18 151.01 151.03 150.85 150.87 150.71

Level 2 150.89 151.46 150.62 150.60 150.78 150.77 150.92

Difference 0.14 1.28 -0.39 -0.43 -0.08 -0.10 0.21

Standard deviation

Level 1 0.92 1.72 1.23 1.14 1.87 0.94 1.98

Level 2 2.02 1.23 1.71 1.81 1.08 2.01 0.97

Difference 1.10 -0.50 0.48 0.67 -0.79 1.07 -1.01

Table 12 Average Mean and Sd
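The “Difference” rows of Table 12 can be reproduced as follows. This is a sketch in Python under the same assumptions as the text (L8 array, widths from Table 10, sample standard deviation); the helper name level_difference is hypothetical. Small discrepancies in the second decimal place are due to rounding.

```python
from statistics import mean, stdev

FACTORS = "ABCDEFG"

# L8 array: levels of factors A to G in each of the eight runs.
L8 = [
    (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2, 2, 2),
    (1, 2, 2, 1, 1, 2, 2),
    (1, 2, 2, 2, 2, 1, 1),
    (2, 1, 2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1, 2, 1),
    (2, 2, 1, 1, 2, 2, 1),
    (2, 2, 1, 2, 1, 1, 2),
]

# Table 10: widths at positions P1 to P6 for runs 1 to 8.
WIDTHS = [
    [151.9, 151.4, 150.4, 150.2, 149.6, 149.5],
    [151.5, 150.8, 150.0, 149.8, 149.4, 149.1],
    [153.1, 151.8, 151.8, 151.4, 150.6, 150.3],
    [152.2, 151.3, 151.1, 150.6, 150.1, 150.0],
    [151.5, 150.8, 150.6, 150.2, 149.7, 149.5],
    [156.5, 152.1, 150.3, 148.5, 146.3, 144.6],
    [154.5, 153.3, 151.8, 150.9, 150.4, 149.6],
    [153.0, 152.5, 152.0, 151.9, 151.3, 149.5],
]

run_means = [mean(widths) for widths in WIDTHS]
run_sds = [stdev(widths) for widths in WIDTHS]


def level_difference(values, column):
    """Average of the run values at Level 2 minus the average at Level 1."""
    level1 = [v for row, v in zip(L8, values) if row[column] == 1]
    level2 = [v for row, v in zip(L8, values) if row[column] == 2]
    return mean(level2) - mean(level1)


mean_diff = {f: level_difference(run_means, i) for i, f in enumerate(FACTORS)}
sd_diff = {f: level_difference(run_sds, i) for i, f in enumerate(FACTORS)}

for f in FACTORS:
    print(f"{f}: mean difference = {mean_diff[f]:+.2f}, sd difference = {sd_diff[f]:+.2f}")
```

A negative sd difference means Level 2 gives the lower variability; a positive one favours Level 1.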

Table 12 shows that factors A, D, E, F and G have the strongest influence on variability. (B

and C have a smaller influence since the difference between the standard deviations at the two

levels is smaller.) Their preferred levels (the one with the smallest sd) are:

A1, D1, E2, F1 and G2.

The decision to use these factors to adjust for robustness is a matter of judgment. There are

two other things the experiment is trying to achieve. The first of these is to identify an

adjustment factor to achieve the required nominal size of 150 mm. Factor B was chosen as an

adjustment factor as it has the largest impact on the mean response and has relatively little

impact on variability. For factor B, level 1 is closer to the target value of 150 mm.

Unfortunately Level 1 also has a worse (higher) standard deviation than Level 2, but this has

to be accepted if this Factor is used to adjust the mean. For these reasons, it was decided to

choose level 1 for factor B in the final combination of factor settings.

Factor C has relatively little influence on either mean response or response variability. This

type of factor can be treated as a cost reduction factor: the choice of level depends on the cost

of setting and convenience. It was decided to select a low percentage of Agalmatolite (because

this was cheaper), and therefore C1 was selected. This is also the setting with the lower sd,

which is obviously convenient.

So the final optimum combination of factor settings is:

A1, B1, C1, D1, E2, F1 and G2.

(The existing combination of factor settings for the process is A2, B1, C2, D1, E2, F2 and G2.)



Prediction of the mean for the optimum factor settings

The overall average mean in Table 11 is 150.82. This is also the average of the means for the

low level of Factor A and the high level (150.75 and 150.89). This suggests that Factor A has

the effect of decreasing the width by 150.82-150.75=0.07 if we choose the low level, and

increasing it by the same amount (150.89-150.82) if we choose the higher level. This means

that, starting from the overall mean, the effect of each factor is to raise or lower the mean by

half of the difference shown in Table 12.

Therefore, using the optimum combination, the predicted width is

150.82 - 0.07 - 0.64 + 0.20 + 0.22 - 0.04 + 0.05 + 0.10 = 150.64

(The plus or minus sign in this equation depends on whether the chosen level has a mean

above or below 150.82.)

A similar analysis for the standard deviations gives a prediction of -0.8! This is clearly a silly

prediction since the sd cannot be negative. This method obviously does not give an accurate

answer! This is not, however, a real problem because we are not trying to hit a target with the

sd, but just reduce it as far as possible. (If we did want a more realistic result, it would be a

good idea to analyse log(sd^2) instead of the sd itself, as mentioned above.)
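Both predictions can be reproduced from Table 12. The Python sketch below is an illustration under the stated method (overall average plus or minus half of each factor's difference); the names MEAN, SD, CHOSEN and predict are hypothetical, and the overall average is taken, as in the text, from the two level averages of Factor A.

```python
# Level averages from Table 12: factor -> (Level 1, Level 2).
MEAN = {"A": (150.75, 150.89), "B": (150.18, 151.46), "C": (151.01, 150.62),
        "D": (151.03, 150.60), "E": (150.85, 150.78), "F": (150.87, 150.77),
        "G": (150.71, 150.92)}
SD = {"A": (0.92, 2.02), "B": (1.72, 1.23), "C": (1.23, 1.71), "D": (1.14, 1.81),
      "E": (1.87, 1.08), "F": (0.94, 2.01), "G": (1.98, 0.97)}

# The optimum combination chosen in the text: A1, B1, C1, D1, E2, F1, G2.
CHOSEN = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 2, "F": 1, "G": 2}


def predict(table, chosen):
    """Overall average plus, for each factor, half its difference towards the chosen level."""
    overall = sum(table["A"]) / 2
    prediction = overall
    for factor, (level1, level2) in table.items():
        half_difference = (level2 - level1) / 2
        # Level 2 adds half the difference; Level 1 subtracts it.
        prediction += half_difference if chosen[factor] == 2 else -half_difference
    return prediction


predicted_mean = predict(MEAN, CHOSEN)
predicted_sd = predict(SD, CHOSEN)

print(f"Predicted mean width: {predicted_mean:.2f}")
print(f"Predicted sd:         {predicted_sd:.2f}")
```

The negative sd shows the limitation of this additive method when several large reductions are combined.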

Confirmation Trial

A confirmation trial was conducted using the optimal condition for maximising robustness.

The standard deviation based on the confirmation trial was found to be 0.50 and the mean was

149.7. The standard deviation is lower than in any of the experimental runs, and the mean is

within 0.3 mm of the target. However, in other cases, the

confirmation trial may suggest that the “optimum” factor levels are not really optimum, in

which case a more extensive experiment is called for.

Summary of Parameter Design Experiment

The company has substantially reduced the tile size variation due to an uncontrollable factor -

uneven temperature distribution in the kiln - by utilising Taguchi's parameter design. This is

achieved by finding the best possible settings for the seven controllable factors, A – G.



SELF-APPRAISAL EXERCISE

You will have done the reading for session 5 of this unit, and absorbed the messages

which the session's material contains. Here follow some exercises to give you

further information and food for thought.

The assigned tasks are:

1 Is Table 1 an orthogonal array? Could it be used to calculate the main effects for the

three factors?

2 The variability of each run in the tile experiment is assessed above by means of the

standard deviation. Do the analysis in Tables 11 and 12 using the range instead. Do

you get similar results?

3 A Taguchi experiment was designed to investigate five two level factors; A, B, C, D

and E. The results are shown below.

Run A B C D E Response

1 1 1 1 1 1 42

2 1 1 1 2 2 50

3 1 2 2 1 1 36

4 1 2 2 2 2 45

5 2 1 2 1 2 35

6 2 1 2 2 1 55

7 2 2 1 1 2 30

8 2 2 1 2 1 54

Is this design based on an orthogonal array? If so, how can it be derived from the

orthogonal arrays in the appendix?

Determine the optimum design parameters (control factors) based on the

assumption that the experimenter wanted to minimise the response. Which variable

has the largest effect?

Draw a graph to show the interaction between A and E. What does this show? Does

this have any impact on your assessment of the optimum design parameters?

Suppose the experimenter also wants to minimise the variation of the response

variable due to noise factors. How does the experiment need to be extended to

achieve this?


4 A medical researcher wants to assess the effects of aspirin, diet and smoking on the

incidence of heart disease. She decides that an experiment is impractical and that

she will have to get her data by monitoring a large number of people over several

years and getting data on how much aspirin they take each week, what sort of diet

they have, how much they smoke, and any signs of heart disease. She then analyses

each of the first three variables (aspirin, diet and smoking) to see if it is related to

the incidence of heart disease.

What do you think of this approach to the research? What are the problems?

5 A second researcher working on the effects of aspirin, diet and smoking on the

incidence of heart disease recognises the difficulties of survey research (as described

in Question 4 above) and decides to do a controlled experiment over a period of five

years. He chooses a full factorial design with each factor at three levels:

Aspirin: 0, 60 or 300 mg per day

Diet: (high fish diet, vegetarian, junk food)

Smoking: 0, 10, 100 cigarettes per day

2700 volunteers are randomly assigned across the 27 different combinations of

factor levels.

Discuss whether this experiment is

Possible

Useful

Ethical.

What do you think is the best approach to this problem?

6 Are there any aspects of your work or personal life for which you feel a designed

experiment may be helpful?




Suggested Answers

1 No, it is not an orthogonal array and can’t be used to calculate main effects. (For

example, Level 2 of Factor A only occurs with Level 1 of Factor B, whereas Level

1 occurs with both levels of Factor B. This means that the comparison of the two

levels of Factor A cannot be fair.)

2 The results are very similar. The differences between the mean ranges at the two

levels of each variable (the bottom row of Table 12) are

3.13, -1.33, 1.42, 1.98, -2.28, 2.98, -2.68

The order of size of these (A largest, G smallest, etc) is exactly the same as for the

standard deviations, so, in this case, using the range would lead to very similar

conclusions.

3 It is an orthogonal array. It is simply the first five columns of the array L8 (2^7) in

the Appendix.
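
The orthogonality claim can be checked mechanically. The following is a minimal sketch in Python, assuming the usual working definition that an array is orthogonal when every pair of columns contains each combination of levels equally often. The function name is_orthogonal is our own and is not taken from the notes.

```python
from itertools import combinations
from collections import Counter

# Question 3 design: one row per run, columns are factors A, B, C, D, E.
design = [
    (1, 1, 1, 1, 1),
    (1, 1, 1, 2, 2),
    (1, 2, 2, 1, 1),
    (1, 2, 2, 2, 2),
    (2, 1, 2, 1, 2),
    (2, 1, 2, 2, 1),
    (2, 2, 1, 1, 2),
    (2, 2, 1, 2, 1),
]

def is_orthogonal(rows):
    """True if every pair of columns holds each level pair equally often."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in rows)
        levels_i = {row[i] for row in rows}
        levels_j = {row[j] for row in rows}
        expected = len(rows) / (len(levels_i) * len(levels_j))
        if any(counts[(a, b)] != expected for a in levels_i for b in levels_j):
            return False
    return True

print(is_orthogonal(design))
```

With eight runs and two levels per factor, each of the four level pairs should appear exactly twice in every pair of columns, which is what the function tests.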

The analysis of the main effects suggests that the best factor levels for minimising

the response are:

A(1), B(2), C(2), D(1), E(2)

D has the largest effect on the response (15.25), so this is the most important

variable to control.
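
The main effects analysis for Question 3 can be reproduced with a short calculation. This is a sketch only: the variable and function names are our own, and an effect is taken here to mean the mean response at level 2 minus the mean response at level 1.

```python
# Question 3 data: factor levels (A, B, C, D, E) and the response for each run.
runs = [
    ((1, 1, 1, 1, 1), 42),
    ((1, 1, 1, 2, 2), 50),
    ((1, 2, 2, 1, 1), 36),
    ((1, 2, 2, 2, 2), 45),
    ((2, 1, 2, 1, 2), 35),
    ((2, 1, 2, 2, 1), 55),
    ((2, 2, 1, 1, 2), 30),
    ((2, 2, 1, 2, 1), 54),
]
factors = "ABCDE"

def level_means(runs, col):
    """Mean response at level 1 and at level 2 of the factor in column col."""
    means = {}
    for level in (1, 2):
        values = [y for levels, y in runs if levels[col] == level]
        means[level] = sum(values) / len(values)
    return means

effects = {}  # mean at level 2 minus mean at level 1
best = {}     # level giving the lower mean response (we want to minimise)
for col, name in enumerate(factors):
    m = level_means(runs, col)
    effects[name] = m[2] - m[1]
    best[name] = min(m, key=m.get)
    print(f"{name}: level 1 = {m[1]:.2f}, level 2 = {m[2]:.2f}, "
          f"effect = {effects[name]:+.2f}")

largest = max(effects, key=lambda name: abs(effects[name]))
print("Best levels:", best)
print("Largest effect:", largest)
```

The calculation gives the levels listed above and an effect of 15.25 for D. It also shows that the main effect of A is only 0.25, so the main effects analysis gives very little basis for preferring A(1) to A(2).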

Your interaction graph should have two lines that cross and are far from parallel.

This indicates a strong interaction: the effect of each of these variables depends on

the level of the other. The lowest mean response (32.5) is achieved by taking the

high level of both variables, A(2) and E(2). The best factor levels worked out above are A(1) and

E(2) - this combination has a higher mean response (47.5). This suggests that the

factor levels above may not be the best. Another possibility would be

A(2), B(2), C(2), D(1), E(2)

because this includes the A(2), E(2) combination. It would obviously be a good

idea to do a confirmation run of both possibilities.
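
The four points on the interaction graph are the mean responses for each combination of A and E. A minimal sketch of the calculation follows, using the Question 3 table; the names are our own. Plotting the mean response against the level of E, with one line for each level of A, gives the graph.

```python
from collections import defaultdict

# (level of A, level of E, response) for runs 1 to 8 of Question 3.
data = [
    (1, 1, 42), (1, 2, 50), (1, 1, 36), (1, 2, 45),
    (2, 2, 35), (2, 1, 55), (2, 2, 30), (2, 1, 54),
]

# Collect the responses falling in each (A, E) cell, then average them.
cells = defaultdict(list)
for a, e, y in data:
    cells[(a, e)].append(y)
cell_means = {cell: sum(ys) / len(ys) for cell, ys in cells.items()}

for (a, e), mean in sorted(cell_means.items()):
    print(f"A({a}), E({e}): mean response = {mean}")

# The lines cross if A(1) is below A(2) at one level of E and above it at the other.
lines_cross = ((cell_means[(1, 1)] - cell_means[(2, 1)])
               * (cell_means[(1, 2)] - cell_means[(2, 2)])) < 0
print("Lines cross:", lines_cross)
```

Each cell mean is based on only two runs, which is one reason a confirmation run is worthwhile.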

To analyse any noise factors it would be necessary to measure the response

variable several times with each combination of control factors (as in the tile experiment).

These replications should be arranged to allow the noise factors to vary.




4 There are two main problems with this type of survey. The first is that variables

like diet are extremely complex and not easily summarised in a form that is useful for

statistical analysis.

The second, even more important, problem is that many variables are not

controlled. To see why this matters, consider the aspirin analysis. The idea here

would be to compare people on different doses of aspirin to see if their propensity

to heart disease varies in any systematic way. However, to be useful, this

comparison needs to be “fair” in the sense that the groups being compared are

similar apart from their consumption of aspirin - i.e. all the other important

variables need to be controlled. In a survey, this is unlikely to be so, perhaps

because the group taking a lot of aspirin may be doing so because they are less

healthy than average. Or, with the diet comparison, it is likely that people on

supposedly healthy diets may also have other healthy habits like taking a lot of

exercise. A survey of this kind cannot rule out such alternative explanations, so we can never

be sure that the variable we are looking at is really the one having the effect. There

are always far too many other possibilities to check!

5 In theory, an experiment like this has the advantage that, because people are put in

groups at random, the groups should be similar except for the variables which are

controlled by the experiment. This avoids the second problem of the survey, and

means that any comparisons should be fair.
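
As an illustration of what random assignment in a full factorial design involves, the sketch below lists the 27 treatment combinations from Question 5 and allocates 2700 volunteers to them at random. Equal group sizes (100 per combination), the volunteer numbering and the function name assign are assumptions made for the example; they are not stated in the question.

```python
import random
from itertools import product

aspirin = [0, 60, 300]                           # mg per day
diet = ["high fish", "vegetarian", "junk food"]
smoking = [0, 10, 100]                           # cigarettes per day

# Full factorial: every combination of the three factors, 3 x 3 x 3 = 27.
treatments = list(product(aspirin, diet, smoking))

def assign(volunteer_ids, treatments, seed=None):
    """Shuffle the volunteers, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    ids = list(volunteer_ids)
    rng.shuffle(ids)
    return {vid: treatments[i % len(treatments)] for i, vid in enumerate(ids)}

# Hypothetical volunteer numbers 1 to 2700; the seed only makes the run repeatable.
allocation = assign(range(1, 2701), treatments, seed=1)

print(len(treatments), "treatment combinations")
print("Volunteer 1 receives:", allocation[1])
```

Because the allocation depends only on the shuffle, no characteristic of a volunteer can influence which group he or she joins, which is what makes the comparison between groups fair.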

In practice, of course, the experiment is not possible. It would obviously not be

possible to persuade 2700 volunteers to take part, and even if it were possible it would

not be ethical - especially telling people to smoke for the purposes of the

experiment. In addition, the idea of just three types of diet is clearly so unrealistic

that the results would be of little value.

Despite this, on occasions, there is good justification for doing experiments like

this with people - examples are drug trials, and experiments to analyse the


effectiveness of various aspects of a marketing campaign on the behaviour of

potential customers. In such experiments the main noise factors are likely to be due

to the people involved. Experiments with people tend to be rather more difficult to

design and organise than industrial experiments!

In practice, research on the factors causing heart disease has to use a mix of survey and

experiment. Experiments are possible for factors (like new drugs) which can be controlled,

and where there are no ethical bars to one of the treatments (e.g. where there is no strong

reason to believe that one treatment is better than the others). And in surveys, it may be

possible to use a statistical analysis to make allowances for some of the interfering

variables.



Appendix: Orthogonal Array Design Tables

I Design Matrix of L8 (2^7) Orthogonal Array

Trial no.   Factor
            1  2  3  4  5  6  7
1           1  1  1  1  1  1  1
2           1  1  1  2  2  2  2
3           1  2  2  1  1  2  2
4           1  2  2  2  2  1  1
5           2  1  2  1  2  1  2
6           2  1  2  2  1  2  1
7           2  2  1  1  2  2  1
8           2  2  1  2  1  1  2
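
One way to see the structure of the L8 table is to note that columns 1, 2 and 4 form a full factorial in three two-level factors, and that the remaining columns can be generated from them. The sketch below is our own construction rather than a method given in these notes. It codes the levels as 0 and 1, combines columns with exclusive-or, and then converts back to levels 1 and 2.

```python
from itertools import product

def l8():
    """Build the eight rows of the L8 array from three basis columns a, b, c."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        # Columns 1, 2 and 4 are a, b and c; the other columns combine them.
        cols = [a, b, a ^ b, c, a ^ c, b ^ c, a ^ b ^ c]
        rows.append(tuple(level + 1 for level in cols))
    return rows

for trial, row in enumerate(l8(), start=1):
    print(trial, *row)
```

The rows printed agree with the table above. The same construction shows why, for example, column 3 carries the interaction between whatever factors are placed in columns 1 and 2.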

II Design Matrix of L12 (2^11) Orthogonal Array

Trial no.   Factor
             1  2  3  4  5  6  7  8  9 10 11
 1           1  1  1  1  1  1  1  1  1  1  1
 2           1  1  1  1  1  2  2  2  2  2  2
 3           1  1  2  2  2  1  1  1  2  2  2
 4           1  2  1  2  2  1  2  2  1  1  2
 5           1  2  2  1  2  2  1  2  1  2  1
 6           1  2  2  2  1  2  2  1  2  1  1
 7           2  1  2  2  1  1  2  2  1  2  1
 8           2  1  2  1  2  2  2  1  1  1  2
 9           2  1  1  2  2  2  1  2  2  1  1
10           2  2  2  1  1  1  1  2  2  1  2
11           2  2  1  2  1  2  1  1  1  2  2
12           2  2  1  1  2  1  2  1  2  2  1



III Design Matrix of L16 (2^15) Orthogonal Array

Trial no.   Factor
             1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1           1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 2           1  1  1  1  1  1  1  2  2  2  2  2  2  2  2
 3           1  1  1  2  2  2  2  1  1  1  1  2  2  2  2
 4           1  1  1  2  2  2  2  2  2  2  2  1  1  1  1
 5           1  2  2  1  1  2  2  1  1  2  2  1  1  2  2
 6           1  2  2  1  1  2  2  2  2  1  1  2  2  1  1
 7           1  2  2  2  2  1  1  1  1  2  2  2  2  1  1
 8           1  2  2  2  2  1  1  2  2  1  1  1  1  2  2
 9           2  1  2  1  2  1  2  1  2  1  2  1  2  1  2
10           2  1  2  1  2  1  2  2  1  2  1  2  1  2  1
11           2  1  2  2  1  2  1  1  2  1  2  2  1  2  1
12           2  1  2  2  1  2  1  2  1  2  1  1  2  1  2
13           2  2  1  1  2  2  1  1  2  2  1  1  2  2  1
14           2  2  1  1  2  2  1  2  1  1  2  2  1  1  2
15           2  2  1  2  1  1  2  1  2  2  1  2  1  1  2
16           2  2  1  2  1  1  2  2  1  1  2  1  2  2  1
