ruggedness testing made easier with inc. graphical effects … · 2019. 3. 29. · ifpac® annual...

Ruggedness Testing Made Easier with Graphical Effects Analysis*, by Mark J. Anderson, PE, CQE, Stat-Ease, Inc., Minneapolis, MN, [email protected]. *Encored from presentation to: Smart Data: Design of Experiments Session, IFPAC® Annual Meeting, March 2, 2017, North Bethesda, Maryland (Washington, D.C.). ©2018 Stat-Ease, Inc.


  • 1

    by Mark J. Anderson, PE, CQE, Stat-Ease, Inc., Minneapolis, MN

    [email protected]

    *Encored from presentation to: Smart Data: Design of Experiments Session, IFPAC® Annual Meeting, March 2, 2017, North Bethesda, Maryland (Washington, D.C.)

    Ruggedness Testing Made Easier with Graphical Effects Analysis*

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 0.49); y-axis: Half-Normal % Probability; labeled effect: A-Reagent]

  • Agenda

    Ruggedness testing—a briefing

    Graphical methods for assessing effects: What’s in it for you

    Ruggedness testing made easier with graphical effects analysis

    Case study—moisture assay

    Conclusion


  • Ruggedness Testing

    “The robustness/ruggedness of an analytical procedure is a measure of its capacity to remain unaffected by small, but deliberate variations in method parameters and provides an indication of its reliability during normal usage.” (ICH¹)

    Ruggedness testing is a “special application of a statistically designed experiment” that examines a “large number of possible factors” to determine which “might have the greatest effect on the outcome” of a test method. “Two levels for each factor are chosen to use moderate separations between the high and low settings.” (ASTM²)

    ¹(Tripartite Guideline, Third International Conference on Harmonisation of Technical Requirements for the Registration of Pharmaceuticals for Human Use (ICH), Text on Validation of Analytical Procedures, 1994.)

    ²(E1169 – 14: Standard Practice for Conducting Ruggedness Tests, 5.1-5.2.)


  • The problem: Methods Going Wild

    Method developers take great care to achieve high precision and accuracy. So how can field analyzers get wild results? The problem is that the developer has been “unrealistically consistent”.

    For example, temperature might be set at 60° C but it may really be 64.2° C. This bias does not affect precision, only the accuracy. Other constant errors will, likewise, not affect precision. In regard to accuracy, these additional errors may partially cancel each other.
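    The point about constant errors can be checked numerically. The following is a minimal sketch using made-up readings and made-up bias values (none of these numbers come from the talk): shifting every reading by a constant leaves the standard deviation (precision) unchanged while moving the mean (accuracy), and two constant errors of opposite sign partially cancel.

```python
from statistics import mean, stdev

# Hypothetical replicate readings (illustrative values only).
readings = [19.9, 20.1, 20.0, 19.8, 20.2]

# Hypothetical constant errors, e.g. a temperature that is set to one value
# but really runs at another, plus a second error of opposite sign.
biases = {"temperature offset": +0.6, "second constant error": -0.4}
net_bias = sum(biases.values())  # the two errors partially cancel

shifted = [r + net_bias for r in readings]

precision_before = stdev(readings)
precision_after = stdev(shifted)
mean_shift = mean(shifted) - mean(readings)

print(f"std dev before/after: {precision_before:.4f} / {precision_after:.4f}")
print(f"shift in mean (net bias): {mean_shift:+.4f}")
```

    A developer who always works under the same conditions sees only the unchanged standard deviation, which is why the method can look precise in development and still give different answers in another lab.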

    “It is the nature of protocol development that work will continue until the errors do cancel, and the ‘right’ answer is obtained.”*

    *(“Ruggedness Testing-Part I: Ignoring Interactions”, Journal of Research of the National Bureau of Standards, V91, #1, Jan-Feb 1986. http://nvlpubs.nist.gov/nistpubs/jres/091/jresv91n1p3_A1b.pdf)

  • The solution: Plackett-Burman Designs

    Plackett and Burman (1946) developed designs with the number of runs (N) being a multiple of 4 (vs. the classical 2^(k-p) fractional factorials, where N is a power of two). P-B designs, being resolution III, work well for pass-or-fail ruggedness testing. They had best be run “saturated” with k factors, i.e., k = N – 1 (or filled out with “dummies”).

    If the ruggedness test reveals possibly important effects, then the P-B can be simply folded over, i.e., a second block of runs done with all levels opposite of the first. This produces a design that resolves the main effects clear of two-factor interactions (i.e., res IV).
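    As a rough illustration of the two ideas on this slide, the sketch below builds an 8-run, 7-factor P-B design by cyclically shifting a commonly tabulated generator row and appending a row of all low levels, then folds it over. The function names are hypothetical, and this is only one way to construct such a design; it is not taken from any particular software package.

```python
from itertools import combinations

# Commonly tabulated first row for the 8-run Plackett-Burman design.
GENERATOR = [+1, +1, +1, -1, +1, -1, -1]

def plackett_burman_8():
    """Seven cyclic shifts of the generator row plus a final row of all -1."""
    n = len(GENERATOR)
    rows = [[GENERATOR[(j - i) % n] for j in range(n)] for i in range(n)]
    rows.append([-1] * n)
    return rows

def fold_over(design):
    """Complete foldover: a second block with every level reversed."""
    return [[-x for x in row] for row in design]

def column(design, j):
    return [row[j] for row in design]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def worst_2fi_overlap(design, target=0):
    """Largest |dot product| between one main-effect column and any 2FI column."""
    t = column(design, target)
    k = len(design[0])
    return max(
        abs(dot(t, [a * b for a, b in zip(column(design, i), column(design, j))]))
        for i, j in combinations(range(k), 2)
    )

design = plackett_burman_8()
main_effects_orthogonal = all(
    dot(column(design, i), column(design, j)) == 0
    for i, j in combinations(range(7), 2)
)
combined = design + fold_over(design)

print("main effects orthogonal:", main_effects_orthogonal)
print("worst 2FI overlap in 8 runs:", worst_2fi_overlap(design))
print("worst 2FI overlap after foldover:", worst_2fi_overlap(combined))
```

    In the 8-run block the first factor's column coincides (up to sign) with at least one two-factor-interaction column, which is what resolution III means in practice. In the 16 combined runs that overlap drops to zero, consistent with the resolution IV claim above.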

    Presentation Notes: Plackett and Burman worked on proximity fuses for bombs during World War 2.


  • The Half-Normal (aka “Daniel”) Plot

    In 1959, Cuthbert Daniel introduced an innovative way to graphically judge the “reality of the observed effects” from two-level factorial designs. This proved to be especially helpful for unreplicated experiments such as the one illustrated here.

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 21.63); y-axis: Half-Normal % Probability; labeled effects: A-Temperature, C-Concentration, D-Stir Rate, AC, AD]

  • Concerns About Graphical Methods

    IMO a graph is worth 1000 numbers (apologies to Confucius). However, many statisticians, such as one who argued against my introducing the half-normal plot to ASTM E1169-14 Standard Practice for Conducting Ruggedness Tests, worry over misinterpretation.

    Russell V. Lenth brought this to a head recently in “The Case Against Normal Plots of Effects”.* A number of discussants came to the defense.

    Cartoon by John Landers, Nov ’16 Cause Web captioning contest**

    *(Journal of Quality Technology, V47, N2, April 2015, pp 91-97)

    **(www.causeweb.org/cause/caption-contest/november/2016/results)

  • Keeping to the positive plane: How Half-Normal Plots Help Experimenters

    Half-Normal plots (aka “Daniel”) provide many advantages as enumerated* by David Steinberg, Prof Stats, Tel Aviv U, who achieved his PhD at U Wisconsin under supervision of George Box:

    - Easy to produce
    - Gives a quick visual of important effects
    - Effects can be compared to a reference: the near-zero contrasts that emanate from the origin
    - Adapts to split-plot experiments

    “The greatest single advantage of the Daniel plot is its ability to stimulate discussion by encapsulating, in a single display, such a variety of information.”

    * (“Discussion: On Daniel Plots,” Journal of Quality Technology, V47, N2, April 2015, p110.)

  • Improvements to Half-Normal

    - Preset the line to the smallest 50% of effects (very helpful for full-normal).*
    - Reset the line based on terms picked (an interactive feature that provides a major advantage over manual graphs).
    - Standardize the effects to contend with missing data and/or altered independent factors, which can cause the variance associated with the estimated effects to differ (also handles non-orthogonal two-level designs).
    - Select effects by order (i.e., main effects, then two-factor interactions, etc.), and apply forward selection when degrees of freedom do not suffice to estimate an entire order of terms (provides robustness to missing data).
    - Add points for pure error (an answer to those who say these plots are only useful for unreplicated factorials).

    *(Suggested by Doug Montgomery in his rebuttal to “The Case Against Normal Plots of Effects”, Journal of Quality Technology, Vol. 47, No. 2, April 2015, “Discussion 3”, pp 105-106.)

    Presentation Notes: For designs with missing data, or with lost orthogonality due to edited factor levels, Design-Expert uses the method of least squares in a hierarchical fashion to compute the coefficients, building up from a base that contains the intercept plus block effects (if any). Next the main effects are estimated and corrected for the intercept, block effects (if any) and other main effects. Then the coefficients for the two-factor interactions are generated from least squares estimates of the model containing the intercept, block effects (if any), all main effects and all two-factor interactions. If you edit the design to the degree that all the original factorial effects cannot be estimated, the program will estimate as many as possible. When all terms of a given order cannot be estimated, a subset is selected using forward stepwise regression.


  • Possible Outcomes of Ruggedness Test

    No significant effects over acceptable range.

    No significant effects but range unacceptable (large).

    Significant effect(s) not practically important. (qualified)

    Significant effect that is practically important; the worst case.
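    The four outcomes amount to a two-way lookup on "statistically significant?" and "practically important?". A minimal sketch of that lookup follows; the function name and the verdict strings are illustrative choices, not wording from any standard.

```python
def classify_outcome(significant: bool, important: bool) -> str:
    """Map the significant/important combination to a ruggedness verdict.

    `significant` means an effect stands out statistically; `important`
    means the observed change or range exceeds what is acceptable.
    """
    if not significant and not important:
        return "pass"
    if not significant and important:
        return "fail: range unacceptable"
    if significant and not important:
        return "qualified pass"
    return "fail: worst case"

# Print the full 2x2 grid of outcomes.
for significant in (False, True):
    for important in (False, True):
        verdict = classify_outcome(significant, important)
        print(f"significant={significant!s:5} important={important!s:5} -> {verdict}")
```

    Keeping the two questions as separate inputs makes the distinction explicit: a small p-value alone never decides the verdict.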

    Many are unclear on this difference!

    [Figure: 2×2 grid of outcomes, with Significant (No/Yes) across and Important (No/Yes) down]

  • Not Significant, Not Important

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 28.50); y-axis: Half-Normal % Probability]

    This ruggedness test* hoped to generate less than a 35-unit effect from 7 critical factors varied over normal ranges out in the field. Given a 20-unit std dev, the experimenters chose a 16-run design to achieve 90% power. Here is an outcome that would give them a pass.

    *(Kraber & Whitcomb, “Best Practices in Planning and Analyzing a Verification DOE”, 2009 World Conference on Quality and Improvement.)

  • Not Significant but Range Unacceptable

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 42.75); y-axis: Half-Normal % Probability]

    The results vary more than the testers had hoped for (35 units), which is important. This cannot be given a pass.

  • Significant but Within Acceptable Range

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 22.00); y-axis: Half-Normal % Probability; labeled effect: F]

    The results varied less than the testers had expected (35 units). Therefore, although the effect is statistically significant (p

  • Significant Effects Beyond Acceptable

    [Figure: Half-Normal Plot. x-axis: |Standardized Effect| (0.00 to 44.63); y-axis: Half-Normal % Probability; labeled effect: F]

    The worst-case scenario. This test fails, going far beyond the 35-unit expectation and significantly so. After first confirming the effect via foldover, work out a way to make the method more rugged.


  • Moisture-Assay Ruggedness Test

    An adhesives manufacturer began producing a new product. During startup all of the polyamide-resin pellets failed inspection due to high moisture analyzed by Karl-Fischer titration. A team from R&D flew in to fix the process. They generated more samples than the on-site QC lab could analyze. When they sent the excess samples to the R&D analytical lab, there were no indications of high moisture in any of the product!

    This created quite a commotion (aka “kerfuffle”)!

    Consequently, an 8-run P-B ruggedness test was run on 7 key factors thought to affect the moisture determination.


  • The Experiment (a P-B)

    Std  Run  A: Reagent  B: Rxn time (min)  C: n-Heptane (ml)  D: Dist time (min)  E: Dist rate (drop/sec)  F: Aniline (ml)  G: Hydration  Water (%)
    1    5    new         15                 190                90                  2                        8                ca 2          20.04
    2    4    used        15                 190                45                  6                        12               ca 2          20.50
    3    3    used        0                  190                45                  2                        8                ca 5          20.12
    4    6    new         0                  210                45                  2                        12               ca 2          19.74
    5    7    used        15                 210                90                  2                        12               ca 5          20.49
    6    2    new         0                  190                90                  6                        12               ca 5          19.86
    7    1    new         15                 210                45                  6                        8                ca 5          19.81
    8    8    used        0                  210                90                  6                        8                ca 2          20.29
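    The main effects can be estimated directly from this table as the difference between the average response at each factor's two levels. The sketch below does that in plain Python. Which level counts as "high" (+1) is an assumption made here for illustration (for example, "new" reagent is coded +1), so the signs of the effects depend on that choice while their magnitudes do not.

```python
FACTORS = ["A: Reagent", "B: Rxn time", "C: n-Heptane", "D: Dist time",
           "E: Dist rate", "F: Aniline", "G: Hydration"]

# Rows in standard order: seven factor levels followed by Water (%).
RUNS = [
    ("new",  15, 190, 90, 2,  8, "ca 2", 20.04),
    ("used", 15, 190, 45, 6, 12, "ca 2", 20.50),
    ("used",  0, 190, 45, 2,  8, "ca 5", 20.12),
    ("new",   0, 210, 45, 2, 12, "ca 2", 19.74),
    ("used", 15, 210, 90, 2, 12, "ca 5", 20.49),
    ("new",   0, 190, 90, 6, 12, "ca 5", 19.86),
    ("new",  15, 210, 45, 6,  8, "ca 5", 19.81),
    ("used",  0, 210, 90, 6,  8, "ca 2", 20.29),
]

# Assumed coding: these levels are +1, the other level of each factor is -1.
HIGH = ("new", 15, 210, 90, 6, 12, "ca 5")

def coded(run):
    """Convert the seven factor levels of one run to -1/+1."""
    return [1 if level == high else -1 for level, high in zip(run[:7], HIGH)]

def main_effects(runs):
    """Effect = mean response at +1 minus mean response at -1."""
    half = len(runs) / 2
    effects = {}
    for j, name in enumerate(FACTORS):
        contrast = sum(coded(run)[j] * run[7] for run in runs)
        effects[name] = contrast / half
    return effects

effects = main_effects(RUNS)
largest = max(effects, key=lambda name: abs(effects[name]))

for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:14s} {value:+.4f}")
print("largest effect:", largest)
```

    With only 8 runs and 7 factors there are no degrees of freedom left for an error estimate, which is why the effects are judged graphically on a half-normal plot rather than by a conventional significance test.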


  • The Results (1 of 2: Half-normal with 50% preset)

    [Figure: Half-Normal Plot for response Water. x-axis: |Standardized Effect| (0.00 to 0.49); y-axis: Half-Normal % Probability; legend: Positive Effects, Negative Effects; terms: A: Reagent, B: Reac time, C: n-Heptane, D: Dis time, E: Dis rate, F: Aniline, G: Hydration; Shapiro-Wilk test: W-value = 0.853, p-value = 0.131]

    Select significant terms - see Tips

    Line defaults to fit smallest 50% of effects
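    To make the "smallest 50%" preset concrete, here is a minimal sketch of how half-normal plotting positions and a reference line through the origin might be computed. The effect magnitudes are the absolute main effects calculated from the moisture-assay table; the plotting-position formula, the rounding of "50%" to four of seven effects, and the least-squares line through the origin are assumptions for illustration, not a description of what Design-Expert does internally.

```python
from math import ceil
from statistics import NormalDist

# Absolute main effects computed from the moisture-assay table (Water, %).
ABS_EFFECTS = {"A": 0.4875, "B": 0.2075, "C": 0.0475, "D": 0.1275,
               "E": 0.0175, "F": 0.0825, "G": 0.0725}

def half_normal_points(abs_effects):
    """Return (name, |effect|, % probability, half-normal quantile), smallest first."""
    ordered = sorted(abs_effects.items(), key=lambda kv: kv[1])
    m = len(ordered)
    points = []
    for i, (name, size) in enumerate(ordered, start=1):
        p = (i - 0.5) / m                      # one common plotting position
        q = NormalDist().inv_cdf(0.5 + p / 2)  # quantile of |Z| for Z ~ N(0, 1)
        points.append((name, size, 100 * p, q))
    return points

def slope_through_origin(points, fraction=0.5):
    """Least-squares slope of |effect| on quantile using the smallest effects only."""
    n_small = ceil(len(points) * fraction)
    small = points[:n_small]
    numerator = sum(q * size for _, size, _, q in small)
    denominator = sum(q * q for _, _, _, q in small)
    return numerator / denominator

points = half_normal_points(ABS_EFFECTS)
slope = slope_through_origin(points)

# How far each effect falls to the right of the reference line.
excess = {name: size - slope * q for name, size, _, q in points}
standout = max(excess, key=excess.get)

for name, size, pct, q in points:
    print(f"{name}: |effect|={size:.4f}  prob={pct:5.1f}%  excess={excess[name]:+.4f}")
print("effect farthest from the line:", standout)
```

    Fitting the line only to the smaller effects keeps a large effect from dragging the reference line toward itself, so a genuine outlier remains visibly off the line.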

  • Results (2 of 2: Effect selected with aliases shown)

    To save money, the plant’s QC distilled the water from the costly reagent, and recycled it (factor A “used”). But, per R&D, an azeotrope retained moisture, creating a bias.

    However, to be sure, follow-up work was needed to clear up A from being ‘smeared’ by 2FIs BF and/or CD and/or EG.

    A = A - BF - CD + EG
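    The alias string above can be reproduced from the design itself: a two-factor interaction is aliased with A when the product of its two coded columns equals the A column or its negative. The sketch below checks every pair of columns in the 8-run design from the experiment table. The -1/+1 coding is an assumption ("new" reagent and the larger numeric level of each factor are taken as +1); a different coding can flip the signs.

```python
from itertools import combinations

NAMES = "ABCDEFG"

# Factor levels from the experiment table, in standard order.
LEVELS = [
    ("new",  15, 190, 90, 2,  8, "ca 2"),
    ("used", 15, 190, 45, 6, 12, "ca 2"),
    ("used",  0, 190, 45, 2,  8, "ca 5"),
    ("new",   0, 210, 45, 2, 12, "ca 2"),
    ("used", 15, 210, 90, 2, 12, "ca 5"),
    ("new",   0, 190, 90, 6, 12, "ca 5"),
    ("new",  15, 210, 45, 6,  8, "ca 5"),
    ("used",  0, 210, 90, 6,  8, "ca 2"),
]

# Assumed coding: these levels are +1, the other level of each factor is -1.
HIGH = ("new", 15, 210, 90, 6, 12, "ca 5")
X = [[1 if level == high else -1 for level, high in zip(row, HIGH)]
     for row in LEVELS]

def col(j):
    return [row[j] for row in X]

def aliases_of(target):
    """Return {2FI name: sign} for interactions fully aliased with `target`."""
    t = col(NAMES.index(target))
    found = {}
    for j, k in combinations(range(len(NAMES)), 2):
        product = [a * b for a, b in zip(col(j), col(k))]
        overlap = sum(a * b for a, b in zip(t, product))
        if abs(overlap) == len(X):  # identical column, up to sign
            found[NAMES[j] + NAMES[k]] = overlap // len(X)
    return found

aliases_A = aliases_of("A")
print("A is aliased with:", aliases_A)
```

    Under this coding the check returns BF and CD with negative sign and EG with positive sign, matching the alias string on the slide.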


  • Foldover Options

    1. Complete foldover: Changing all of the signs on all the original runs.
       - Resolution III designs: Separates the main effects from the two-factor interactions (2FIs), producing a resolution IV design.
       - Resolution IV: May produce a replicate of the original fraction, in which case it will not improve the resolution.

    2. Single factor: Changing signs on only one factor. (Option: semifold.)
       - Resolution III and IV designs: The folded factor and all of its 2FIs will be clear of any other main effect or 2FI.

    A complete foldover on the initial 8-run P-B ruggedness test resolved the main effect of A (reagent being recycled) from any 2FIs.
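    The case study used the complete foldover. For comparison, option 2 (single-factor foldover) can be sketched on the same 8-run design: a second block is added with only factor A reversed, and the code then checks that A and every 2FI involving A are orthogonal to all other main effects and 2FIs. The -1/+1 coding and the helper names are assumptions made for illustration.

```python
from itertools import combinations

# Factor levels from the experiment table, in standard order.
LEVELS = [
    ("new",  15, 190, 90, 2,  8, "ca 2"),
    ("used", 15, 190, 45, 6, 12, "ca 2"),
    ("used",  0, 190, 45, 2,  8, "ca 5"),
    ("new",   0, 210, 45, 2, 12, "ca 2"),
    ("used", 15, 210, 90, 2, 12, "ca 5"),
    ("new",   0, 190, 90, 6, 12, "ca 5"),
    ("new",  15, 210, 45, 6,  8, "ca 5"),
    ("used",  0, 210, 90, 6,  8, "ca 2"),
]
HIGH = ("new", 15, 210, 90, 6, 12, "ca 5")  # assumed +1 levels
BASE = [[1 if level == high else -1 for level, high in zip(row, HIGH)]
        for row in LEVELS]

def fold_on(design, j):
    """Single-factor foldover: reverse the sign of column j only."""
    return [[-x if i == j else x for i, x in enumerate(row)] for row in design]

def term_column(design, term):
    """Column for a main effect (j,) or a 2FI (j, k): product of coded levels."""
    values = []
    for row in design:
        v = 1
        for i in term:
            v *= row[i]
        values.append(v)
    return values

def is_clear(design, factor=0, k=7):
    """True if `factor` and its 2FIs are orthogonal to every other main effect or 2FI."""
    terms = [(j,) for j in range(k)] + list(combinations(range(k), 2))
    with_factor = [t for t in terms if factor in t]
    without_factor = [t for t in terms if factor not in t]
    return all(
        sum(a * b for a, b in zip(term_column(design, s), term_column(design, t))) == 0
        for s in with_factor
        for t in without_factor
    )

combined = BASE + fold_on(BASE, 0)  # 16 runs: original block plus A reversed

print("A clear in the original 8 runs:", is_clear(BASE))
print("A clear after folding on A:", is_clear(combined))
```

    Folding on A alone clears A and its interactions but leaves the other main effects aliased with 2FIs, whereas the complete foldover clears every main effect at once.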

    The end.



  • Conclusion

    The use of graphical effects analysis, in particular the half-normal plot (especially now with smart data tools of modern software):

    Makes factorial DOE (even multilevel categorics) far easier,

    Provides at-a-glance (go vs no-go) results for ruggedness testing.

    “The greatest value of a picture is when it forces us to notice what we never expected to see.”

    - John Tukey, Exploratory Data Analysis, 1977, p. vi.

    “Remember the three rules of data analysis: Plot the data, plot the data and plot the data.”

    - Arved Harding, “Uncovering the Truth”, Quality Progress, Feb 2017, p. 64.

  • Where to Get Statistical Details

    1. Youden, W. and Steiner, E. (1975). Statistical Manual of the AOAC, Association of Official Analytical Chemists, Washington D.C.

    2. Larntz, K. and Whitcomb, P. (1992). “The Role of Pure Error on Normal Probability Plots” Transactions of Annual Quality Congress of the American Society of Quality, Milwaukee, WI. www.statease.com/pubs/role-of-pure-error.pdf

    3. Larntz, K. and Whitcomb, P. (1998). “Use of Replication in Almost Unreplicated Factorials,” Fall Technical Conference, Corning, NY. www.statease.com/pubs/use-of-rep.pdf

    4. Oehlert, G. (2009). “Graphical Methods for Selecting Effects in Factorial Models,” Statistics Workshop at Yale University. www.stat.yale.edu/Conferences/Stats2009/GOSLIDES.pdf

    5. Analytical Methods Committee. (2013). “Experimental design and optimisation (4): Plackett–Burman designs”, AMCTB No 55, Royal Society of Chemistry. www.rsc.org/images/Experimental-design-and-optimisation-4-Plackett-Burman-designs-55_tcm18-232212.pdf


  • Statistics Made Easy®


    Best of luck for your experimenting!

    Thanks for listening!

    -- Mark

    [email protected]
