ge 210 lecture 33 (experimental design)

GE 210—Probability and Statistics December 3, 2012 Lecture 33

Upload: adamgarth

Post on 25-Sep-2015


DESCRIPTION

Lecture on 2^k factorial experimentation. Includes hypothesis testing, fractional optimization of hydraulic fracturing tower column design. From the University of Saskatchewan's general engineering statistics course.

TRANSCRIPT

  • GE 210: Probability and Statistics

    December 3, 2012

    Lecture 33

  • Today

    Experimental Design

  • Purpose of Experimental Design

    Engineers are always conducting experiments to:

    Improve process yield

    Reduce variability of output

    Reduce design and development time

    Reduce cost of operation

    Proper statistical methods are essential to good experimentation

    Poorly designed experiments produce inconclusive or incorrect results and use valuable resources ineffectively

  • Experimental Design

    You have been asked to determine the biogas production potential of several types of feedstocks (slaughterhouse waste, dairy manure, distillers mash)

    Want to know if the type of feedstock has a significant effect on biogas volumes

    If there is a significant effect, which feedstock has the highest biogas volume?

    How do you do it?

  • Feedstock biogas production

    L of biogas/L of feedstock per day

    Feedstock 1   Feedstock 2   Feedstock 3
    1.51          2.29          1.65

    L of biogas/L of feedstock per day

               Feedstock 1   Feedstock 2   Feedstock 3
               1.5           2.1           0.95
               1.1           3.9           1.8
               1.2           0.8           1.7
               1.8           3.4           2.1
               1.7           0.8           1.9
               1.4           1.4           0.8
               1.9           3.6           2.3
    average    1.51          2.29          1.65
    stdev      0.30          1.34          0.57

    Replication allows the degree of variability in the data to be assessed
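The average and stdev rows of the replicate table can be recomputed from the seven measurements per feedstock. The following is a minimal sketch using only the Python standard library; the dictionary and function names are illustrative and not part of the lecture.

```python
import statistics

# Replicate measurements from the table above (L of biogas / L of feedstock per day).
biogas = {
    "Feedstock 1": [1.5, 1.1, 1.2, 1.8, 1.7, 1.4, 1.9],
    "Feedstock 2": [2.1, 3.9, 0.8, 3.4, 0.8, 1.4, 3.6],
    "Feedstock 3": [0.95, 1.8, 1.7, 2.1, 1.9, 0.8, 2.3],
}


def summarize(samples):
    """Return (mean, sample standard deviation) for one treatment."""
    return statistics.mean(samples), statistics.stdev(samples)


summary = {name: summarize(values) for name, values in biogas.items()}

for name, (average, stdev) in summary.items():
    print(f"{name}: average = {average:.2f}, stdev = {stdev:.2f}")
```

Without the replicates only the three averages would be available, and there would be no way to judge whether a difference of 0.8 between feedstocks is large or small relative to the scatter.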

  • Experimental Design

    Use what we know about analyzing the data to work backwards and collect the data properly

    Can select the number of replicates required to satisfy the desired power of the experiment (also depends on variability of data)

    Quite often, the number of replicates is constrained by time/cost

    Using t-tests to test for differences is very cumbersome!

    Need 3 sets of hypotheses: Ho: μ1 = μ2, Ho: μ1 = μ3, Ho: μ2 = μ3
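To see why pairwise t-tests become cumbersome, the sketch below lists every pair of treatments that would need its own null hypothesis. The helper name `pairwise_hypotheses` is made up for this illustration.

```python
from itertools import combinations


def pairwise_hypotheses(treatments):
    """List the null hypothesis 'mu_i = mu_j' for every pair of treatments."""
    return [f"Ho: mu_{a} = mu_{b}" for a, b in combinations(treatments, 2)]


three = pairwise_hypotheses([1, 2, 3])
print(three)

# The number of pairs grows as k(k-1)/2, so the workload climbs quickly.
for k in (3, 5, 10):
    count = len(pairwise_hypotheses(range(1, k + 1)))
    print(f"{k} treatments -> {count} pairwise t-tests")
```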

  • Experimental Design

    Instead, use ANOVA to test significance of treatment

    In this case, the treatment is the type of feedstock

    If the effect of the treatment is significant (P value is less than α), then a means separation will tell you exactly which treatment is different from the others

    Ho: there are no differences

    HA: there are differences

  • ANOVA Output (Minitab)

    General Linear Model: Biogas versus Feedstock

    Factor Type Levels Values

    Feedstock fixed 3 1, 2, 3

    Analysis of Variance for Volume, using Adjusted SS for Tests

    Source     DF   Seq SS    Adj SS    Adj MS   F     P
    Feedstock   2    2.3745    2.3745   1.1873   1.61  0.228
    Error      18   13.2821   13.2821   0.7379
    Total      20   15.6567

    Since P > α, do not reject Ho; there is no evidence that the different feedstocks produce different volumes of biogas.
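The Minitab table can be reproduced by hand from the replicate data. This is a sketch in plain Python, not the Minitab procedure itself. The P value uses a closed form that holds only when the treatment has 2 degrees of freedom (as it does here); any other case needs the F distribution from a statistics package. Function names are illustrative.

```python
# Seven replicates per feedstock, as in the replicate table earlier in the lecture.
groups = [
    [1.5, 1.1, 1.2, 1.8, 1.7, 1.4, 1.9],   # Feedstock 1
    [2.1, 3.9, 0.8, 3.4, 0.8, 1.4, 3.6],   # Feedstock 2
    [0.95, 1.8, 1.7, 2.1, 1.9, 0.8, 2.3],  # Feedstock 3
]


def one_way_anova(groups):
    """Return (df_treatment, df_error, ss_treatment, ss_error, F)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment and within-treatment sums of squares.
    ss_treatment = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_treatment = len(groups) - 1
    df_error = n_total - len(groups)
    f_stat = (ss_treatment / df_treatment) / (ss_error / df_error)
    return df_treatment, df_error, ss_treatment, ss_error, f_stat


def p_value_two_numerator_df(f_stat, df_error):
    """Upper-tail probability of F(2, df_error); valid only for 2 numerator DF."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


df_t, df_e, ss_t, ss_e, f_stat = one_way_anova(groups)
p_value = p_value_two_numerator_df(f_stat, df_e)
print(f"Feedstock  DF={df_t}  SS={ss_t:.4f}  F={f_stat:.2f}  P={p_value:.3f}")
print(f"Error      DF={df_e}  SS={ss_e:.4f}")
```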

  • Experimental Design

    You have been asked to optimize the operating parameters (temperature and dry matter content) to improve biogas production from a specific feedstock

    Temperature can range from 35 °C to 55 °C

    Dry matter content can range from 3 to 10%

    How do you do it?

  • Biogas Production by Varying Parameters

    Change factors one at a time

    Set the temperature to 40 °C, vary dry matter content and measure biogas production

    10% dry matter produces highest volume

    Set the dry matter to 10%, vary temperature and measure biogas production

    35 °C produces the highest volume

    Conclude that 10% dry matter content and 35C are the optimum operating conditions

    Very poor statistical practice and experimental design

  • Biogas production by varying parameters

    Feedstock   Temperature (°C)   DM content (%)   L of biogas/L of feedstock per day
    1           35                  5               1.8
                40                 10               2.1
                50                 15               0.7
    2           30                  7               2.4
                45                 12               2.4
                55                 18               2.4
    3           35                  4               1.1
                40                  7               3.6
                55                  9               4.4

    Very poor statistical practice and experimental design

  • Biogas production by varying parameters

    By haphazardly varying temperature and dry matter contents, it is difficult to assess whether differences in biogas production are due to temperature or dry matter content or both

    A better solution is to use a factorial experimental design

  • Factorial Experimental Design

    When several factors are of interest in an experiment, a factorial experiment should be used

    In each complete replicate of the experiment, all possible combinations of the levels of the factors are investigated

    Each feedstock should be digested at each temperature and each DM content for valid comparisons

    3 feedstocks x 3 DM contents x 3 temperatures equals 27 experiments (x 3 reps = 81 trials)

    The effect of each factor can then be separately assessed

    Potential interactions of factors can also be assessed

    Factorial experiments are the only way to discover interactions between variables
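The 27 combinations and 81 trials can be listed mechanically. This sketch assumes the factor levels that appear in the lecture's later ANOVA output (temperatures 25, 30, 40 and DM contents 3, 5, 10); the variable names are illustrative.

```python
from itertools import product

feedstocks = [1, 2, 3]
temperatures_c = [25, 30, 40]  # levels taken from the later ANOVA output
dm_percent = [3, 5, 10]        # levels taken from the later ANOVA output
replicates = 3

# One complete replicate: every combination of every level of every factor.
combinations = list(product(feedstocks, temperatures_c, dm_percent))

# Full experiment: each combination repeated once per replicate.
trials = [
    (feedstock, temperature, dm, rep)
    for (feedstock, temperature, dm) in combinations
    for rep in range(1, replicates + 1)
]

print(len(combinations), "combinations per replicate")
print(len(trials), "trials in total")
```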

  • Interactions

    When the difference in response between the levels of one factor is not the same at all levels of the other factors, there is an interaction between factors

    When an interaction is significant, the corresponding main effects have very little practical meaning

    [Figure: two plots, "No interaction between factors" and "Interaction between factors"]

  • Biogas production by varying parameters

    Feedstock   Temperature (°C)   DM content (%)   L of biogas/L of feedstock per day
    1           25                  3               1.8
    1           25                  5               2.1
    1           25                 10               1.5
    1           30                  3               2.1
    1           30                  5               2.1
    1           30                 10               0.75
    1           40                  3               1.9
    1           40                  5               2.3
    1           40                 10               1.7

    Each value is the average of 3 reps

    Feedstocks 2 and 3 will have the same tables of data.

    Now we can analyze the effect of three treatments: feedstock type, temperature and dry matter content.

    We can also determine if there are significant interactions among feedstock, temperature and dry matter content.
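As an illustration of how level averages are formed from a factorial table, the sketch below averages the nine Feedstock 1 cells by temperature and by DM content. Only the Feedstock 1 rows are reproduced in the slides, so these are averages within Feedstock 1, not the all-feedstock averages the lecture reports next. The names are illustrative.

```python
from statistics import mean, stdev

# (temperature in degrees C, DM content in %) -> biogas, Feedstock 1 rows only.
feedstock1 = {
    (25, 3): 1.8, (25, 5): 2.1, (25, 10): 1.5,
    (30, 3): 2.1, (30, 5): 2.1, (30, 10): 0.75,
    (40, 3): 1.9, (40, 5): 2.3, (40, 10): 1.7,
}


def marginal_means(cells, position):
    """Average the response over all cells that share a level of one factor."""
    levels = sorted({key[position] for key in cells})
    return {
        level: mean(value for key, value in cells.items() if key[position] == level)
        for level in levels
    }


by_temperature = marginal_means(feedstock1, 0)
by_dm = marginal_means(feedstock1, 1)
overall = mean(feedstock1.values())
spread = stdev(feedstock1.values())

print("by temperature:", {k: round(v, 2) for k, v in by_temperature.items()})
print("by DM content: ", {k: round(v, 2) for k, v in by_dm.items()})
print(f"Feedstock 1 overall: average = {overall:.1f}, stdev = {spread:.2f}")
```

The overall figures (1.8 and 0.47) agree with the Feedstock 1 row of the lecture's summary of averages and standard deviations.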

  • Factorial Experimental Design

    Are the results significant?

    Cannot make that assessment without a measure of the variability of data

    Level         Average   StdDev
    Feedstock 1   1.8       0.47
    Feedstock 2   2.9       0.62
    Feedstock 3   2.1       0.79
    25 degrees    2.0       0.49
    30 degrees    2.4       1.00
    40 degrees    2.4       0.75
    3% DM         2.3       0.69
    5% DM         2.6       0.62
    10% DM        1.8       0.85

  • ANOVA Output (Minitab)

    General Linear Model: Biogas versus Feedstock, Temp, DM

    Factor      Type    Levels   Values
    Feedstock   fixed   3        1, 2, 3
    Temp        fixed   3        25, 30, 40
    DM          fixed   3        3, 5, 10

    Analysis of Variance for Biogas, using Adjusted SS for Tests

    Source           DF   Seq SS    Adj SS    Adj MS   F      P
    Feedstock         2    5.7535    5.7535   2.8768   9.13   0.009
    Temp              2    1.0591    1.0591   0.5295   1.68   0.246
    DM                2    2.8846    2.8846   1.4423   4.58   0.047
    Feedstock*DM      4    0.3415    0.3415   0.0854   0.27   0.889
    Feedstock*Temp    4    0.9670    0.9670   0.2418   0.77   0.575
    Temp*DM           4    2.0293    2.0293   0.5073   1.61   0.262
    Error             8    2.5196    2.5196   0.3150
    Total            26   15.5546

    Since P < α for both feedstock and DM, conclude that feedstock type and dry matter content have a significant effect on biogas production.

    Interactions not significant
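Each F value in the table is the source's mean square (Adj SS / DF) divided by the error mean square. The sketch below redoes that arithmetic from the printed sums of squares. P values are computed only for the main effects, using a closed form that is valid only for 2 numerator degrees of freedom; the interaction rows (4 DF) would need the F distribution from a statistics package. Names are illustrative.

```python
# (source, DF, Adj SS) copied from the Minitab output above.
rows = [
    ("Feedstock", 2, 5.7535),
    ("Temp", 2, 1.0591),
    ("DM", 2, 2.8846),
    ("Feedstock*DM", 4, 0.3415),
    ("Feedstock*Temp", 4, 0.9670),
    ("Temp*DM", 4, 2.0293),
]
error_df, error_ss = 8, 2.5196
mse = error_ss / error_df  # error mean square


def f_ratio(df, ss):
    """F = (SS / DF) / MSE."""
    return (ss / df) / mse


def p_value_two_numerator_df(f_stat, df_error):
    """Upper-tail probability of F(2, df_error); valid only for 2 numerator DF."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


table = {}
for source, df, ss in rows:
    f_stat = f_ratio(df, ss)
    p = p_value_two_numerator_df(f_stat, error_df) if df == 2 else None
    table[source] = (f_stat, p)
    p_text = f"{p:.3f}" if p is not None else "(not computed)"
    print(f"{source:15s} F = {f_stat:5.2f}  P = {p_text}")
```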

  • Main Effects Plots

    [Figure: Main Effects Plot for Biogas (data means). Panels: Feedstock (levels 1, 2, 3) and DM (levels 3, 5, 10). Vertical axis: Mean]

  • Interactions Plot

    [Figure: Interaction Plot for Biogas (data means). Horizontal axis: DM (levels 3, 5, 10). Vertical axis: Mean. One line per Feedstock (1, 2, 3)]

  • Means Separation for Feedstock

    Tukey Simultaneous Tests
    Response Variable Biogas
    All Pairwise Comparisons among Levels of Feedstock

    Feedstock = 1 subtracted from:

                Difference   SE of                   Adjusted
    Feedstock   of Means     Difference   T-Value    P-Value
    2            1.0833      0.2646        4.0949    0.0086
    3            0.2611      0.2646        0.9870    0.6048

    Feedstock = 2 subtracted from:

                Difference   SE of                   Adjusted
    Feedstock   of Means     Difference   T-Value    P-Value
    3           -0.8222      0.2646       -3.108     0.0347

    Pairwise t-tests!

    Feedstock 1 is different from 2, but not from 3. Feedstock 2 is different from 3.
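The "SE of Difference" and "T-Value" columns follow from the error mean square in the ANOVA table. The sketch below assumes each feedstock mean is an average over 9 cells (27 cell averages split over 3 feedstocks). The adjusted P values require the studentized range distribution and are not recomputed here.

```python
import math

mse = 0.3150     # Adj MS for Error from the ANOVA output
n_per_level = 9  # assumption: 27 cell averages / 3 feedstock levels

# Standard error of the difference between two level means.
se_difference = math.sqrt(2 * mse / n_per_level)

# Differences of means as printed in the Tukey output (later level minus earlier).
differences = {
    ("2", "1"): 1.0833,
    ("3", "1"): 0.2611,
    ("3", "2"): -0.8222,
}

t_values = {pair: diff / se_difference for pair, diff in differences.items()}

print(f"SE of difference = {se_difference:.4f}")
for (later, earlier), t in t_values.items():
    print(f"Feedstock {later} - Feedstock {earlier}: T = {t:.3f}")
```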

  • Means Separation for Dry Matter Content

    Tukey Simultaneous Tests
    Response Variable Biogas
    All Pairwise Comparisons among Levels of DM

    DM = 3 subtracted from:

          Difference   SE of                   Adjusted
    DM    of Means     Difference   T-Value    P-Value
    5      0.3111      0.2646        1.176     0.4986
    10    -0.4833      0.2646       -1.827     0.2217

    DM = 5 subtracted from:

          Difference   SE of                   Adjusted
    DM    of Means     Difference   T-Value    P-Value
    10    -0.7944      0.2646       -3.003     0.0404

  • Systematic Error and Randomization

    You are testing a total of 18 o-rings (half from the old production line and half from the new production line) to determine if the new line produces o-rings of higher strength

    Would you take 9 consecutive o-rings from the old production line, test them, then take 9 consecutive o-rings from the new production line and test them?

    Systematic error can affect results

    O-rings produced at the beginning of the day may be different from o-rings produced later in the day

    For a chemical experiment, you will need to use 2 bottles of chemical to complete the experiment, but the pH of one bottle may be slightly different

    Randomize use of the two bottles among treatments to avoid systematic error

    For a bench-scale digester experiment, you have 12 reactors in a circulating warm water bath, but one end of the bath may have a slightly different temperature

    Randomize the ordering of the treatments in the reactors to avoid systematic error
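A random assignment of treatments to the 12 reactor positions could be generated as follows. This is only a sketch: the lecture does not say how many treatments the bath holds, so the 4 treatments x 3 replicates layout, the treatment labels, and the seed are all hypothetical.

```python
import random


def randomize_positions(treatments, replicates, seed=None):
    """Return a shuffled list with one treatment run per reactor position."""
    runs = [treatment for treatment in treatments for _ in range(replicates)]
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    rng.shuffle(runs)
    return runs


# Hypothetical layout: 4 treatments x 3 replicates fill the 12 reactors.
layout = randomize_positions(["A", "B", "C", "D"], replicates=3, seed=210)

for position, treatment in enumerate(layout, start=1):
    print(f"reactor {position:2d}: treatment {treatment}")
```

Because every treatment is equally likely to land at either end of the bath, a temperature gradient adds noise to all treatments rather than biasing one of them.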

  • Systematic Error and Blocking

    Another way to avoid systematic error is by blocking the treatments together

    Commonly used in experiments involving plots of soil

    Even a small area of land (1 ha) can have variations in the soil that may affect results (growth rate, yield, nutrient uptake, etc.)

    Area is broken up into smaller blocks and 1 replicate of each combination of treatments is placed in each block

    The block number can be treated as a main effect during analysis to see if the location of the block had a significant effect on the response variable(s)

  • Factorial Experiments

    Pros

    Simple and robust analysis

    Can analyze effects of interactions

    Cons

    Requires excessive time and resources

    Experiment can get impossibly large very quickly

  • 2^k Factorial Design

    To study the effect of several factors on a response

    Can also investigate effect of interactions of factors

    Each factor has 2 levels

    Quantitative (continuous) measures of temperature, pressure, time

    Qualitative (discrete) measures like high and low or old and new, etc.

    A complete replicate of such a design requires 2 x 2 x 2 x ... x 2 = 2^k experiments and is called a 2^k factorial design

    If the effect of the treatment is significant, do not need a means separation to determine which level is different because there are only 2 levels and they must be different from each other!
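The runs of a two-level factorial design can be enumerated directly. The sketch below uses the common convention of coding the low level as -1 and the high level as +1; the coding and the function name are assumptions for illustration, and the factor names are taken from the examples above.

```python
from itertools import product


def two_level_design(factor_names):
    """Return all 2^k runs, coding each factor's low level as -1 and high as +1."""
    return [
        dict(zip(factor_names, levels))
        for levels in product((-1, +1), repeat=len(factor_names))
    ]


design = two_level_design(["temperature", "pressure", "time"])

print(len(design), "runs for k = 3 factors")
for run in design:
    print(run)
```

Adding a fourth factor doubles the run count to 16, and a fifth doubles it again to 32.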

  • Experimental Design Considerations

    Control

    Allows the effect of the treatment itself to be distinguished from differences that are not caused by the treatment

    Placebo

    A type of control used extensively in drug studies since the psychological effect of taking a drug may impact your response to it

    Blinding

    Removes bias of subjects and experimenters

    Confounding

    When potential interactions are not specifically identified, effects can become confounded (and results are not as conclusive as they could be)

    Ex: if a new cure for the common cold was administered to men only, and a placebo to women only, there is no way to investigate the interaction between gender and drug, so gender and drug are confounded

  • Experimental Design Considerations

    Randomization

    Helps remove some of the bias due to systematic error

    Blocking

    A specific type of randomization that accounts for a known source of variation

    Replication vs duplication

    Replication is required to get a measure of variation

    In true replication, the factors are simulated and repeated x times

    In duplication, x samples are taken from the same simulation of factors (not truly replicated!)

  • Example: Blocking

    A researcher is carrying out a study of the effectiveness of four different skin creams for the treatment of a certain skin disease. He has eighty subjects and plans to divide them into 4 treatment groups of twenty subjects each. Using a randomized block design, the subjects are assessed and put in blocks of four according to how severe their skin condition is; the four most severe cases are the first block, the next four most severe cases are the second block, and so on to the twentieth block. The four members of each block are then randomly assigned, one to each of the four treatment groups.
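The randomized block assignment described above can be sketched as follows. The severity scores, subject labels, cream names, and seeds are hypothetical placeholders; only the procedure (rank by severity, cut into blocks of four, randomize within each block) comes from the example.

```python
import random


def randomized_block_assignment(severity_by_subject, treatments, seed=None):
    """Map each subject to (block number, treatment), randomizing within blocks."""
    rng = random.Random(seed)
    # Most severe cases first, so block 1 holds the most severe subjects.
    ranked = sorted(severity_by_subject, key=severity_by_subject.get, reverse=True)
    block_size = len(treatments)
    assignment = {}
    for start in range(0, len(ranked), block_size):
        block = ranked[start:start + block_size]
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # random order of treatments inside this block
        for subject, treatment in zip(block, shuffled):
            assignment[subject] = (start // block_size + 1, treatment)
    return assignment


# Hypothetical severity scores for 80 subjects.
score_rng = random.Random(1)
severity = {f"S{i:02d}": score_rng.uniform(0, 10) for i in range(1, 81)}
creams = ["cream A", "cream B", "cream C", "cream D"]

plan = randomized_block_assignment(severity, creams, seed=2)
print(len(plan), "subjects assigned")
```

Each treatment group ends up with one subject from every severity block, so the groups are balanced on severity by construction.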

  • Example: Controls

    My research: emissions from manure spreading

    Factors: Type of manure (solid, liquid)

    Species of manure (cow, pig)

    Application method (surface, subsurface)

    Application rate (0X, 1X, 2X, 3X)

    Controls for application rate

    Measuring emissions from bare soil (0X) can separate the emissions from the manure from the emissions from bare soil

    Controls for application method

    Disturbed and undisturbed

    Controls for manure type

    Water applied at 2X application rate

  • Example

    Set up an experiment to test the effectiveness of store brand vs brand name laundry detergent

    Replication vs duplication

    Blinding

    Some standard "dirty" to be tested

    Some qualitative measure of "clean"

  • The trickiness and ambiguity of experimental design and analysis

    Warning Signs in Experimental Design and Interpretation

    An online essay by Peter Norvig

    http://norvig.com/experiment-design.html

    Focused on experimental design and analysis for medical research, but the ideas are applicable to all research

  • Warning Signs in Misinterpretation of Results

    Lack of repeatability and reproducibility

    Ignoring publication bias

    Experiments that don't work out are often not published!

    Ignoring other sources of bias

    Not understanding conditional probability!

    Pr(cancer | positive) is not the same as Pr(positive | cancer)

    Confusing correlation with causation

    Being too clever!
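The gap between Pr(cancer | positive) and Pr(positive | cancer) can be made concrete with Bayes' rule. The prevalence, sensitivity, and false positive rate below are hypothetical numbers chosen for illustration; they do not come from the lecture or the essay.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Pr(condition | positive test) from Bayes' rule."""
    # Total probability of a positive test, from true and false positives.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical test: 1% prevalence, Pr(positive | cancer) = 0.90,
# Pr(positive | no cancer) = 0.05.
pr_positive_given_cancer = 0.90
pr_cancer_given_positive = posterior(
    prior=0.01,
    sensitivity=pr_positive_given_cancer,
    false_positive_rate=0.05,
)

print(f"Pr(positive | cancer) = {pr_positive_given_cancer:.2f}")
print(f"Pr(cancer | positive) = {pr_cancer_given_positive:.3f}")
```

With these assumed numbers a positive result implies only about a 15% chance of cancer, even though the test detects 90% of true cases.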

  • Next Day

    Review for final exam