Warranty Analysis of Repairable Systems. David Trindade and Jeff Glosup, Sun Microsystems; Bill Heavlin, Google. Quality and Productivity Research Conference, June 2007, Santa Fe, New Mexico.

Warranty Analysis of Repairable Systems

David Trindade and Jeff Glosup, Sun Microsystems
Bill Heavlin, Google

Quality and Productivity Research Conference, June 2007, Santa Fe, New Mexico

Page 2

Repairable Systems

• Main focus is repairable systems
> A system is repairable if, by replacing or repairing system components, the system can be returned to service.
> After the repair, the system may be
  > As good as new (renewal process)
  > Worse than new ☹
  > Better than new ☺
> Behavior of the system after repair depends on the repair actions and the effects of the replaced components.
> A key characteristic of a repairable system is that an individual system may fail multiple times for the same or different causes.

Page 3

Warranty and Reliability Analysis

• Warranty analysis and reliability analysis of field data are closely related areas.
• The authors have used graphical Time Dependent Reliability (TDR) methods extensively for analyzing product field reliability data.
> TDR is for use with repairable systems.
• TDR methods are also very useful for the analysis of warranty data.
> Can isolate specific failure modes and associated cost data easily

Page 4

Time Dependent Reliability

• Graphical technique based on studying the cumulative occurrence of events of interest over time on individual systems
> The overall behavior is represented by the average of events across systems at fixed times: the mean cumulative function (MCF).
> Can change definition of events of interest to suit study
  > Can be all field failures for field reliability
  > Can be warranty claims for warranty data analysis
  > Can be costs of repairs or claims
> TDR approach is consistent with warranty analysis techniques such as those suggested by Kalbfleisch, Lawless and Robinson.

Page 5

Estimating the MCF

• Consider a collection of $M$ observed times of events $t_k$ for a set of units sold since a start time $t_0$
> Assume the event times are set in increasing order so $t_0 \le t_1 \le t_2 \le \dots \le t_M$
> Note times may be calendar dates or may represent time in operation, e.g., system age
> The number of units active at any time is variable due to manufacture and sale at different times.
  > Let $N(t_k)$ be the number of units active at the time represented by $t_k$

Page 6

Estimating the MCF - 2

• The Mean Cumulative Function (MCF) is defined successively as

$$\mathrm{MCF}(t_k) = \mathrm{MCF}(t_{k-1}) + \frac{w(t_k)}{N(t_k)}, \qquad \mathrm{MCF}(t_0) = 0$$

where $w(t_k)$ is an incremental weight

• Incremental weight $w(t_k)$ may be
> Number of failures at time $t_k$
> Costs of failures at time $t_k$
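The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `estimate_mcf`, the input layout (a list of `(time, weight)` pairs plus a dictionary of active-unit counts), and the sample numbers are all assumptions made for the example.

```python
from collections import defaultdict


def estimate_mcf(events, active_counts):
    """Return [(t_k, MCF(t_k))] using MCF(t_k) = MCF(t_{k-1}) + w(t_k) / N(t_k).

    events: iterable of (time, weight) pairs. Use weight 1 per failure for a
            count MCF, or the repair cost for a cost MCF.
    active_counts: dict mapping each event time t_k to N(t_k), the number of
            units active at that time (hypothetical input format).
    """
    # Accumulate the incremental weight w(t_k) over events sharing a time.
    weight_at = defaultdict(float)
    for time, weight in events:
        weight_at[time] += weight

    mcf = 0.0  # MCF(t_0) = 0
    curve = []
    for time in sorted(weight_at):
        mcf += weight_at[time] / active_counts[time]
        curve.append((time, mcf))
    return curve


# Illustrative data only: four failures while the active population grows.
events = [(10, 1), (20, 1), (20, 1), (35, 1)]
active = {10: 4, 20: 5, 35: 8}
curve = estimate_mcf(events, active)
```

Dividing each increment by the number of units active at that time is what lets the estimate cope with units entering service on different dates.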

Page 7

Extending the Analysis

• The TDR technique can be used to graphically explore the modes of failure by re-defining the events of interest.
> For example, we can produce an MCF for each failure mode or failing component.
> The resulting chart is a dynamic Pareto chart of failing components.
  > Each of these curves should fall beneath the overall MCF curve.
  > These component curves should add up to produce the overall MCF curve.
  > Again, cost data can be incorporated by weighting the individual component failures by their repair cost, scaled in dollars.
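As a rough sketch of the per-mode idea, the following Python computes one MCF per failure mode alongside the overall MCF on a common time grid, so the additivity of the component curves can be checked directly. The function name `mcf_by_mode`, the `(time, mode)` event layout, and the sample data are assumptions for illustration; it also assumes no mode is literally named "Total".

```python
def mcf_by_mode(events, active_counts):
    """Compute mode-specific MCFs and the overall MCF on shared event times.

    events: list of (time, mode) pairs, one entry per failure.
    active_counts: dict mapping event time t_k to N(t_k) (hypothetical format).
    Returns (times, curves) where curves maps each mode, plus "Total",
    to a list of MCF values aligned with times.
    """
    times = sorted({t for t, _ in events})
    modes = sorted({m for _, m in events})
    curves = {m: [] for m in modes}
    curves["Total"] = []
    running = {name: 0.0 for name in curves}

    for t in times:
        modes_at_t = [m for event_time, m in events if event_time == t]
        for m in modes:
            # The event of interest is redefined as "a failure of mode m".
            running[m] += modes_at_t.count(m) / active_counts[t]
        # The overall MCF counts every failure, regardless of mode.
        running["Total"] += len(modes_at_t) / active_counts[t]
        for name in curves:
            curves[name].append(running[name])
    return times, curves


# Illustrative data only.
events = [(10, "A"), (20, "B"), (20, "A"), (35, "C")]
active = {10: 4, 20: 5, 35: 8}
times, curves = mcf_by_mode(events, active)
```

Replacing the count of failures with a sum of repair costs per mode would give the cost-weighted version of the same curves.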

Page 8

Example

• This example contains roughly 1500 systems manufactured and installed over approximately a two-year period.

• The data represent all systems manufactured over this period.

• Each system consists of many components, but four of these components are responsible for a large fraction of the failures.

• Adjustments were made for missing data. (See Glosup, 2002)

Page 9

Age Versus Date Dependency

• In TDR analysis, we typically view the data in two forms.

• Age dependency looks at failure causes related to the age of the system.

• Date dependency shows causes related to events associated with a specific date. Possible causes include software updates, general maintenance actions, physical relocation of systems, sudden environmental changes, and so on.

Page 10

Active Systems Plots

[Figure: Active Systems Versus Date. x-axis: Date, 11/5/2001 to 10/6/2003; y-axis: Number of Systems, 0 to 1800.]

[Figure: Active Systems Versus Age. x-axis: Age (Days), 0 to 700; y-axis: Number of Systems, 0 to 1800.]

Page 11

MCF Plots by Age

[Figure: Total MCF and Failure Mode Specific MCF Plots. x-axis: Age (Days), 0 to 700; y-axis: Average Number of Failures, 0 to 14; series: MCF.Total, MCF.A, MCF.B, MCF.C, MCF.D.]

[Figure: Total MCF and Failure Mode Specific MCF Plots. x-axis: Age (Days), 0 to 700; y-axis: Average Number of Failures, 0 to 2; series: MCF.Total, MCF.A, MCF.B, MCF.C, MCF.D.]

Page 12

MCF Plots by Date

[Figure: MCF by date. x-axis: Date, ticks at 04/02, 11/02, 05/03; y-axis: Average Number of Events per System, 0 to 20; series: Overall Average, Component A, Component B, Component C, Component D.]

Page 13

MCF Plot with Reversed Time

• The MCF plot can also be redrawn with the time axis reversed and the MCF curves shifted.
> Shows the most recent events on the left side of the graph.
> Graph emphasizes to a greater extent the differences in failure modes in the later ages associated with the most recent events.
> Reference time zero point used is the oldest age.

Page 14

Reverse Time Plots

[Figure: Reverse-time MCF plot. x-axis: Time Since End of Observation Period (days), 0 to 600; y-axis: Average Number of Events per System, 0 to 12; series: Overall Average, Component A, Component B, Component C, Component D.]

[Figure: Reverse-time MCF plot. x-axis: Time Since End of Observation Period (days), 0 to 100; y-axis: Average Number of Events per System, 0.0 to 1.0; series: Overall Average, Component A, Component B, Component C, Component D.]

Page 15

NHPP Power Law Model for MCF Vs. Age

• Power law model for MCF: $M(t) = \alpha t^{\beta}$

• Intensity function (theoretical recurrence rate): $\lambda(t) = dM(t)/dt = \alpha \beta t^{\beta - 1}$
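A small Python sketch of the two formulas may help make the relationship concrete. The function names and the parameter values are illustrative assumptions; the check at the end simply confirms numerically that the intensity is the derivative of the MCF. Under this model the intensity decreases with age when β < 1, is constant when β = 1, and increases when β > 1.

```python
def power_law_mcf(t, alpha, beta):
    """Power law MCF: M(t) = alpha * t**beta."""
    return alpha * t ** beta


def power_law_intensity(t, alpha, beta):
    """Intensity (recurrence rate): lambda(t) = alpha * beta * t**(beta - 1)."""
    return alpha * beta * t ** (beta - 1)


# Illustrative parameter values, not taken from the slides.
alpha, beta = 0.02, 0.9
t, h = 100.0, 1e-4

# Central-difference approximation to dM/dt at t.
numeric_slope = (power_law_mcf(t + h, alpha, beta)
                 - power_law_mcf(t - h, alpha, beta)) / (2 * h)
analytic_slope = power_law_intensity(t, alpha, beta)
```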

Page 16

Parameter Estimation for Power Law Model

• Although MLE methods exist for estimating the parameters of the Power Law model (see Crow, 1974), a simple and direct approach is to estimate the parameters by minimizing the sum of squares of the residuals between the MCF and the model.

• The non-linear least squares Solver routine in the spreadsheet program Excel readily facilitates the estimation.
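For readers without a spreadsheet solver, the same least-squares criterion can be sketched in plain Python. This is not the Solver routine itself but a substitute technique: for any fixed β the best α has a closed form, so the sketch runs a golden-section search over β alone. The function name `fit_power_law`, the search bounds, and the synthetic data are assumptions, and the search presumes the sum of squares has a single minimum in β over the bracket.

```python
import math


def fit_power_law(times, mcf_values, beta_lo=0.1, beta_hi=3.0, tol=1e-10):
    """Least-squares fit of M(t) = alpha * t**beta to MCF points (all t > 0)."""

    def best_alpha(beta):
        # For fixed beta, setting d(SSE)/d(alpha) = 0 gives alpha in closed form.
        num = sum(y * t ** beta for t, y in zip(times, mcf_values))
        den = sum(t ** (2 * beta) for t in times)
        return num / den

    def sse(beta):
        a = best_alpha(beta)
        return sum((y - a * t ** beta) ** 2 for t, y in zip(times, mcf_values))

    # Golden-section search for the beta that minimizes the profiled SSE.
    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = beta_lo, beta_hi
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if sse(x1) < sse(x2):
            hi = x2  # minimum lies in [lo, x2]
        else:
            lo = x1  # minimum lies in [x1, hi]
    beta = (lo + hi) / 2
    return best_alpha(beta), beta


# Synthetic, noise-free MCF points generated from known parameters.
ages = [10 * i for i in range(1, 71)]
mcf_points = [0.02 * t ** 0.9 for t in ages]
alpha_hat, beta_hat = fit_power_law(ages, mcf_points)
```

Because the residuals are measured on the MCF scale rather than on a log scale, the later, larger MCF values carry most of the weight in the fit.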

Page 17

Power Law Model Fits

[Figure: Total MCF and Failure Mode Specific MCFs with Model Fits. x-axis: Age (Days), 0 to 700; y-axis: Average Number of Failures, 0 to 14; series: MCF.Total, MCF.A, MCF.B, MCF.C, MCF.D, Model.Total, Model.A, Model.B, Model.C, Model.D.]

Page 18

Parameters of Power Law Model

Curve          α         β
Overall MCF    0.0207    0.991
Component A    0.00175   1.23
Component B    0.00887   0.910
Component C    0.00455   0.868
Component D    0.00148   0.940

Page 19

Projections Using the Power Law Model

• With the estimated parameters, it is possible to use the models to make overall or failure mode specific projections for future ages or dates.

• Such projections are a key component of warranty analysis, for example, to estimate spare parts inventories and staffing levels for field support.
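A projection of this kind can be sketched as the difference of the fitted MCF at two ages. The parameter values below are the ones reported on the Parameters of Power Law Model slide; the function name `projected_events`, the assumption that ages are in days (as on the plot axes), and the choice of interval are illustrative. Extrapolating beyond the roughly 700 days of observed ages assumes the fitted trend continues to hold.

```python
# (alpha, beta) per curve, from the Parameters of Power Law Model slide.
PARAMS = {
    "Overall": (0.0207, 0.991),
    "A": (0.00175, 1.23),
    "B": (0.00887, 0.910),
    "C": (0.00455, 0.868),
    "D": (0.00148, 0.940),
}


def projected_events(curve, age_from, age_to):
    """Expected events per system between two ages: M(age_to) - M(age_from)."""
    alpha, beta = PARAMS[curve]
    return alpha * age_to ** beta - alpha * age_from ** beta


# Expected events per system during the second year of life (ages in days).
second_year_overall = projected_events("Overall", 365, 730)
second_year_by_mode = {m: projected_events(m, 365, 730) for m in "ABCD"}
```

Multiplying a per-system projection by the number of systems expected to be active over the interval gives a first estimate of repair volume for spare parts and staffing.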

Page 20

Empirical Recurrence Rates (RR)

• It is possible to numerically differentiate the MCF curve and determine recurrence rates as a function of the system age or calendar date.

• A simple approach to the recurrence rate calculations is possible using the built-in spreadsheet SLOPE function.

• A slope is found over a window containing an odd number of consecutive MCF points and plotted at the median age of those points. Then the first point is dropped and the next consecutive point is added, and the process is repeated. (See Trindade, 1975)
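The sliding-window procedure can be sketched in Python as follows. The `slope` helper computes the ordinary least-squares slope, which is the quantity a spreadsheet SLOPE function returns; the function names, the default window length, and the sample data are assumptions. The factor of 365 used to annualize a per-day rate is also an assumption about units.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def recurrence_rates(ages, mcf_values, window=5):
    """Slide an odd-length window over the MCF points (ages sorted ascending).

    Returns (median age of window, slope over window) pairs.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd number of points, at least 3")
    half = window // 2
    rates = []
    for start in range(len(ages) - window + 1):
        xs = ages[start:start + window]
        ys = mcf_values[start:start + window]
        # With sorted ages and an odd window, the median is the middle point.
        rates.append((xs[half], slope(xs, ys)))
    return rates


# Illustrative data: an MCF rising at a constant 0.02 events per day.
ages = list(range(0, 100, 10))
mcf_points = [0.02 * a for a in ages]
rates = recurrence_rates(ages, mcf_points, window=5)
annualized = [(age, 365 * rate) for age, rate in rates]
```

A wider window smooths the rate curve at the cost of losing points at both ends and blurring short-lived changes.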

Page 21

ARR (Empirical) Vs. Age

[Figure: Annualized Recurrence Rate (ARR) Vs. Age (31 Day Window). x-axis: Age (Days), 0 to 700; y-axis: ARR (Events per Year), 0 to 12; series: RR Total, RR.A, RR.B, RR.C, RR.D.]

Page 22

Annualized Empirical Recurrence Rates (ARR) Vs. Date

[Figure: Annualized Recurrence Rate (ARR) Vs. Date (31 Day Window). x-axis: Date, 11/5/2001 to 10/6/2003; y-axis: ARR, 0 to 45; series: RR.Total, RR.A, RR.B, RR.C, RR.D.]

Page 23

Annualized Recurrence Rates and Power Law Model Fits

[Figure: Total ARR and Failure Mode Specific ARRs with Model Fits. x-axis: Age (Days), 0 to 700; y-axis: Annualized Recurrence Rate (Events per Year), -2 to 12; series: RR Total, RR.A, RR.B, RR.C, RR.D, Model.Total, Model.A, Model.B, Model.C, Model.D.]

Page 24

Total RR Vs. Failure Mode RR Sum

[Figure: Recurrence Rate Total and Sum of Four Modes. x-axis: Age (Days), 0 to 700; y-axis: Annualized Recurrence Rate (Events per Year), 0 to 12; series: RR Total, Sum.]

Page 25

Dynamic Pareto

• The idea of the dynamic Pareto can be made easier to interpret by rescaling the curves.
• For the components, take the ratio of the component MCF to the overall MCF, e.g., for component A

$$R_A(t_k) = \frac{\mathrm{MCF}_A(t_k)}{\mathrm{MCF}_{\mathrm{Total}}(t_k)}$$
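The rescaling is a pointwise ratio, which might be sketched in Python as below. The function name `pareto_ratios` and the sample values are assumptions; returning `None` where the overall MCF is still zero is one possible convention for the undefined ratio, not something the slides specify.

```python
def pareto_ratios(component_mcf, total_mcf):
    """R(t_k) = MCF_component(t_k) / MCF_Total(t_k) at each shared time t_k.

    Both lists must be evaluated on the same time grid. The ratio is left
    undefined (None) wherever the overall MCF is still zero.
    """
    return [c / t if t > 0 else None
            for c, t in zip(component_mcf, total_mcf)]


# Illustrative values for one component and the overall MCF.
mcf_a = [0.0, 0.1, 0.3]
mcf_total = [0.0, 0.4, 0.6]
ratio_a = pareto_ratios(mcf_a, mcf_total)
```

Plotting these ratios against age or date shows each component's share of the cumulative failures changing over time.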

Page 26

Dynamic MCF Pareto by Age

[Figure: Dynamic MCF Pareto by age. x-axis: System Age (Days), 0 to 700; y-axis: Proportion of Total MCF, 0 to 0.45; series: Comp.A, Comp.B, Comp.C, Comp.D.]

Page 27

Dynamic MCF Pareto by Date

[Figure: Dynamic MCF Pareto by date. x-axis: Date, 11/5/2001 to 10/6/2003; y-axis: Proportion of Total MCF, 0 to 0.3; series: A, B, C, D.]

Page 28

Dynamic RR Pareto by Age

[Figure: Dynamic Pareto Empirical Recurrence Rate Plots. x-axis: Age (Days), 0 to 700; y-axis: Proportion of Total RR, 0 to 0.8; series: comp.A, comp.B, comp.C, comp.D.]

Page 29

Dynamic RR Pareto by Date

[Figure: Dynamic RR Pareto by date. x-axis: Date, 11/5/2001 to 10/6/2003; y-axis: Proportion of Total RR, 0 to 0.7; series: A, B, C, D.]

Page 30

Average Cost per Unit Time

• The Cost per Unit Time computed from the MCF:

$$\mathrm{MCF}(t_k) = \mathrm{MCF}(t_{k-1}) + \frac{\sum_m c_m \, n(m, t_k)}{N(t_k)}, \qquad \mathrm{MCF}(t_0) = 0$$

$$\text{Cost per Unit Time} = \frac{\mathrm{MCF}(T)}{T - t_0},$$

where
$n(m, t_k)$ = number of fails of failure mode $m$ at time $t_k$,
$c_m$ = (marginal) cost of repairing failure mode $m$.
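The cost calculation can be sketched in Python by weighting each failure with its mode's repair cost and dividing the running cost MCF by the elapsed time. The function name `cost_per_unit_time`, the input layout, and the sample failures are assumptions; the cost weights A(10), B(1), C(1), D(1) are the ones listed on the Cost Per Unit Time Plots slide.

```python
def cost_per_unit_time(failures, active_counts, costs, t0=0.0):
    """Return [(T, cost MCF(T) / (T - t0))] at each event time T > t0.

    failures: list of (time, mode) pairs, one entry per failure.
    active_counts: dict mapping event time t_k to N(t_k) (hypothetical format).
    costs: dict mapping failure mode m to its marginal repair cost c_m.
    """
    times = sorted({t for t, _ in failures})
    cost_mcf = 0.0  # MCF(t_0) = 0
    result = []
    for t in times:
        # Sum of c_m * n(m, t_k) over modes, done one failure at a time.
        increment = sum(costs[m] for event_time, m in failures
                        if event_time == t)
        cost_mcf += increment / active_counts[t]
        result.append((t, cost_mcf / (t - t0)))
    return result


# Cost weights from the slides; failure data are illustrative only.
costs = {"A": 10, "B": 1, "C": 1, "D": 1}
failures = [(10, "A"), (20, "B"), (20, "A"), (35, "C")]
active = {10: 4, 20: 5, 35: 8}
cost_curve = cost_per_unit_time(failures, active, costs)
```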

Page 31

Cost Per Unit Time Plots

[Figure: Costs Per Unit Time. x-axis: Age (Days), 0 to 800; y-axis: Cost/Unit Time, 0 to 2.5; series: MCF Cost/Time, Model Cost/Time, DOA Adder.]

[Figure: Costs Per Unit Time. x-axis: Age (Days), 0 to 800; y-axis: Cost/Unit Time, 0 to 0.2; series: MCF Cost/Time, Model Cost/Time, DOA Adder.]

Assigned Cost Weights: A(10), B(1), C(1), D(1)

Page 32

MCF Analysis Benefits

• MCFs are close to the data and so one can do data-sensitive calculations.
• Benefits show up with:
> (a) the failure-mode-specific MCFs
> (b) calendar MCFs
> (c) reverse-time/recent failure plotting
> (d) the dynamic Pareto plots
• MCFs work with large populations and with small populations
> The empirical approach is revealing with cost-per-unit-time plots, too, and goodness-of-fit plots of power law models.

Page 33

Dynamic Pareto and Visualization

• The dynamic Pareto concept is very useful in the analysis of warranty claims.
• Both MCFs and RRs can reveal interesting aspects of the data using the dynamic Pareto approach.
• These visualization methods are easy to grasp and provide excellent insight into specific failure causes that vary with age or date.

• In most cases, the analysis can be easily accomplished using simple spreadsheet programs.

• Power law model complements MCF & RR analysis.

Page 34

References

● Ascher and Feingold (1984), Repairable Systems Reliability: Modeling, Inference, Misconceptions and Their Causes, M. Dekker
● Crow (1974), "Reliability Analysis for Complex, Repairable Systems", in Reliability and Biometry, ed. by Proschan & Serfling, SIAM
● Glosup (2002), "Estimation and Inference for Failure Rate in the Presence of Incomplete Failure Data", JSM Proceedings
● Lawless (1997), "Statistical Analysis of Product Warranty Data", IIQP Technical Report RR-97-05, University of Waterloo
● Kalbfleisch and Lawless (1993), "Statistical Analysis of Warranty Data", IIQP Technical Report RR-93-03, University of Waterloo
● Kalbfleisch, Lawless and Robinson (1991), "Methods for the Analysis and Prediction of Warranty Claims", Technometrics, Vol 33
● Nelson (2004), Recurrent Events Data: Analysis for Product Repairs, Disease Recurrences and Other Applications, SIAM
● Tobias and Trindade (1995), Applied Reliability, Chapman & Hall
● Trindade (1975), "An APL Program to Differentiate Data by the Ruler Method", IBM Technical Report, TR19.0361