javier garcia - verdugo sanchez - six sigma training - w2 non normal data


Upload: j-garcia-verdugo

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Non Normal Data

Evaluation of Non-Normal Distributions

[Figure: Histogram (with Normal Curve) of C3. x-axis: C3 (0 to 15); y-axis: Frequency (0 to 18). Mean 4,924; StDev 3,092; N 100]

Week 2

Knorr-Bremse Group

About this Module

How can we handle non-normal data?

This module shows ways to process data sets that are not normally distributed.

Content

• The Box-Cox transformation of data

• The Johnson transformation of data

• Capability analysis of non-normal data

• Examination of the mean

• Central limit theorem

Knorr-Bremse Group 08 BB W2 non normal data 08, D. Szemkus/H. Winkler Page 2/39


When do we Need Normal Distribution?

The following evaluations require normally distributed data:

• Description of the process spread with mean and standard deviation

• Calculation of confidence intervals

• Creation of control charts (SPC)

• Calculation of process capability metrics

Recommendation:

First try to transform the data. This is relatively fast and simple. If that does not succeed, use the central limit theorem. That requires an increased number of samples, which costs time and money.

Data Transformation

Which type of transformation?

File: Transformation.mtw (The upper specification limit is 12)

First review the raw data. Which transformation will be successful?

Try different transformation methods:

− Square root

− Natural logarithm, ln

− Logarithm base 10, log10

− Square

Review each transformation with histograms and normal distribution curves.

Which of the transformations is close to a normal distribution?
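The review of the four candidate transformations can be sketched in Python. This is only an illustration under stated assumptions: the worksheet Transformation.mtw is not available here, so `raw` is a hypothetical right-skewed sample, and the sample skewness (0 for a symmetric sample) stands in for the visual check with histograms.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: third standardized moment (0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical stand-in for the worksheet column: right-skewed, strictly positive.
rng = random.Random(1)
raw = [rng.gammavariate(1.6, 1.8) for _ in range(500)]

transforms = {
    "raw": lambda y: y,
    "square root": math.sqrt,
    "ln": math.log,
    "log10": math.log10,
    "square": lambda y: y * y,
}

# One skewness value per candidate transformation.
skew_by_transform = {
    name: skewness([f(y) for y in raw]) for name, f in transforms.items()
}

for name, value in skew_by_transform.items():
    print(f"{name:12s} skewness = {value:+.2f}")
```

The two logarithms always give the same skewness because they differ only by a constant factor, so only one of them needs to be reviewed.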

Transformation with Minitab

With the transformation function of Minitab you can search for the best exponent to bring the data set to a normal distribution. The exponent is named lambda.

$Y^* = Y^{\lambda}$

Here are some examples of lambda values for the transformation function.

Lambda value      Transformation

$\lambda = 2$       $y^2$

$\lambda = 0,5$     $\sqrt{y}$

$\lambda = 0$       $\ln y$

$\lambda = -0,5$    $1/\sqrt{y}$

$\lambda = -1$      $1/y$

The Box Cox Transformation

Stat>Control Charts>Box-Cox Transformation…

[Figure: Box-Cox Plot of Raw data. x-axis: Lambda (-1 to 3); y-axis: StDev; with Limit line. Lambda (using 95,0% confidence): Estimate 0,23; Lower CL 0,13; Upper CL 0,32; Rounded Value 0,23]

Minitab calculates the lambda value for the best transformation and the 95 % confidence interval.
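How such a lambda estimate with an interval can come about is sketched below. This is an assumption-laden illustration, not Minitab's algorithm: it maximizes the Box-Cox profile log-likelihood on a grid and derives an approximate 95 % interval from the likelihood ratio. The sample `raw` is hypothetical, so the result is not expected to reproduce the value 0,23.

```python
import math
import random

def boxcox_loglik(data, lam):
    """Profile log-likelihood of the Box-Cox parameter lam for positive data."""
    n = len(data)
    logs = [math.log(y) for y in data]
    if abs(lam) < 1e-12:
        z = logs                                   # lambda = 0 means ln(y)
    else:
        z = [(y ** lam - 1.0) / lam for y in data]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(logs)

def boxcox_lambda(data, lo=-2.0, hi=2.0, step=0.01):
    """Grid-search estimate of lambda plus an approximate 95 % interval."""
    steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(steps + 1)]
    ll = [boxcox_loglik(data, lam) for lam in grid]
    best = max(range(len(grid)), key=ll.__getitem__)
    # Likelihood-ratio interval: half the 95 % chi-square quantile (1 df) is 1.92.
    inside = [lam for lam, v in zip(grid, ll) if v >= ll[best] - 1.92]
    return grid[best], min(inside), max(inside)

# Hypothetical stand-in for the raw data column.
rng = random.Random(7)
raw = [rng.gammavariate(1.6, 1.8) for _ in range(500)]

lam_hat, lam_low, lam_high = boxcox_lambda(raw)
print(f"lambda = {lam_hat:.2f}, 95 % interval = [{lam_low:.2f}, {lam_high:.2f}]")
```

If the interval contains a "round" value such as 0 or 0,5, that value is often preferred because the transformation is then easier to explain.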

Process Capability with Lambda Values

Stat>Quality Tools>Capability Analysis>Normal…>Box Cox

[Figure: Process Capability of Raw data, Using Box-Cox Transformation With Lambda = 0,23. Histogram of the transformed data (0,6 to 1,8) with USL*]

Process Data: LSL *; Target *; USL 12; Sample Mean 2,83266; Sample N 500; StDev (Overall) 2,23467

After Transformation: LSL* *; Target* *; USL* 1,77097; Sample Mean* 1,19844; StDev (Overall)* 0,239086

Overall Capability: Pp *; PPL *; PPU 0,80; Ppk 0,80; Cpm *

Observed Performance: PPM < LSL *; PPM > USL 4000,00; PPM Total 4000,00

Exp. Overall Performance: PPM < LSL* *; PPM > USL* 8317,96; PPM Total 8317,96

Next, evaluate the data with the capability analysis. Under Options, activate the Box-Cox power transformation and enter the previously calculated value of the exponent.
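The reported capability figures can be retraced from the values on this page. The following sketch assumes that the specification limit is transformed with the same exponent as the data and that the transformed data are treated as normally distributed; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

lam = 0.23            # Box-Cox exponent used on the slide
usl = 12.0            # upper specification limit in original units
mean_t = 1.19844      # Sample Mean* after transformation (from the slide)
sd_t = 0.239086       # StDev (Overall)* after transformation (from the slide)

usl_t = usl ** lam                       # the limit is transformed like the data
ppu = (usl_t - mean_t) / (3 * sd_t)      # one-sided capability index
ppm_above = (1 - NormalDist(mean_t, sd_t).cdf(usl_t)) * 1e6

print(f"USL* = {usl_t:.5f}, PPU = {ppu:.2f}, expected PPM > USL = {ppm_above:.0f}")
```

The results agree with USL* 1,77097, PPU 0,80 and roughly 8318 expected PPM above the limit.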

Control Chart with Lambda Values

Stat>Control Charts>Variable Charts for Individuals>Individuals… I Chart Options>Box-Cox

[Figure: I Chart of Raw data. x-axis: Observation (1 to 500); y-axis: Individual Value. UCL=9,39; X̄=2,83; LCL=-3,72]

[Figure: I Chart of Raw data, Using Box-Cox Transformation With Lambda = 0,23. x-axis: Observation (1 to 500); y-axis: Individual Value. UCL=1,918; X̄=1,193; LCL=0,469]
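Why the transformation matters for the individuals chart can be illustrated with a short sketch. It assumes the common moving-range method for the control limits (center line ± 3 · average moving range / 1,128) and uses a hypothetical skewed sample instead of the worksheet data.

```python
import random
from statistics import fmean

def i_chart_limits(values):
    """Individuals chart: (LCL, center line, UCL) from the average moving range."""
    center = fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_hat = fmean(moving_ranges) / 1.128   # d2 constant for ranges of size 2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

# Hypothetical skewed, strictly positive process data.
rng = random.Random(3)
raw = [rng.gammavariate(1.6, 1.8) for _ in range(500)]
transformed = [y ** 0.23 for y in raw]         # Box-Cox power from the slide

lcl_raw, center_raw, ucl_raw = i_chart_limits(raw)
lcl_t, center_t, ucl_t = i_chart_limits(transformed)

# Count the points that would be flagged above the upper control limit.
above_raw = sum(y > ucl_raw for y in raw)
above_t = sum(y > ucl_t for y in transformed)

print(f"raw:         LCL={lcl_raw:.2f}  UCL={ucl_raw:.2f}  points above UCL={above_raw}")
print(f"transformed: LCL={lcl_t:.2f}  UCL={ucl_t:.2f}  points above UCL={above_t}")
```

On the raw data the lower limit falls below zero, which is impossible for this kind of data, and the long right tail produces false alarms. After the transformation both effects largely disappear.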

Comparison Box Cox / Johnson Transformation

With the Box-Cox transformation we try to transform non-normally distributed data into normally distributed data to simplify the analysis.

Johnson went the reverse way:

He was aware that the normal distribution has many advantages (easy to understand, well known, usable with many good and familiar tools).

He tried to “distort” a normal distribution with a combination of shape-keeping and shape-changing transformations. He evaluated which shapes could later be transformed back into a normal distribution.

With MINITAB 14 this transformation is now available.

Let's take a detailed look at this transformation:

Johnson Transformation in General

Johnson distinguishes three fundamentally different transformation approaches (MINITAB determines the most suitable one):

− Johnson distribution SB: SB stands for the system of two-sided limited data, e.g. a U-distribution. It therefore covers data that even the Box-Cox transformation cannot handle.

− Johnson distribution SL: SL stands for the system of one-sided limited data, as is the case with e.g. a log-normal distribution or an F-distribution. These data can be transformed with Box-Cox only to a limited extent.

− Johnson distribution SU: SU stands for two-sided unlimited data. These data should also be transformable with the Box-Cox transformation without problems.

The next pages show which shapes these three types of distribution may have and which equations result from them.

There we can see that up to four unknowns have to be determined. This can only be solved with computer support.

Johnson Distribution SB

SB => System two sided limited

Function: $x' = \lambda + \delta \cdot \ln\left[\frac{x - \varepsilon}{\gamma + \varepsilon - x}\right]$

Where:

x => Original value &

x' => transformed value

[Figure: f(x) over x for several combinations of λ (0; 1,0) and δ (0,5; 0,6; 1,0); the x-axis marks ε and γ]

Hint:

In MINITAB use „Natural log“ for „ln“! This results in „LOGE()“ in the formula.

In Excel: LN()

Johnson Distribution SL

SL => System one sided limited (log-normal distributed data)

Function: $x' = \lambda + \delta \cdot \ln(x - \varepsilon)$

Where:

x => Original value &

x' => transformed value

[Figure: f(x) over x; the x-axis marks ε]

Hint:

In MINITAB use „Natural log“ for „ln“! This results in „LOGE()“ in the formula.

In Excel: LN()

Johnson Distribution SU

SU => System two sided unlimited

Function: $x' = \lambda + \delta \cdot \sinh^{-1}\left[\frac{x - \varepsilon}{\gamma}\right]$

Where:

x => Original value &

x' => transformed value

γ => scale parameter, as in the SB system

[Figure: f(x) over x for λ = 1,0 with δ = 0,5 and for λ = 2 with δ = 0,5]

Hint:

In Excel: ARCSINHYP() (ASINH() in English-language versions)
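The three transformation functions can be written down compactly. This sketch follows the notation of these pages (λ and δ as shape parameters, ε as location, γ as scale); the function names are chosen here for illustration and the parameter values in the example call are made up.

```python
import math

def johnson_sb(x, lam, delta, eps, gamma):
    """SB system (two-sided limited): valid for eps < x < eps + gamma."""
    return lam + delta * math.log((x - eps) / (gamma + eps - x))

def johnson_sl(x, lam, delta, eps):
    """SL system (one-sided limited): valid for x > eps."""
    return lam + delta * math.log(x - eps)

def johnson_su(x, lam, delta, eps, gamma):
    """SU system (two-sided unlimited): valid for any real x, gamma > 0."""
    return lam + delta * math.asinh((x - eps) / gamma)

# Example call with made-up parameter values (not taken from any worksheet):
print(johnson_sb(3.0, lam=0.0, delta=0.5, eps=2.0, gamma=2.0))
```

Fitting the up to four parameters is the hard part and is left to the software; the functions above only apply an already fitted transformation.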

Example Johnson 1 (1)

Let's start with a transformation of a uniformly distributed set of data.

In the file Johnson1.mtw the column „Output1“ has 700 values.

The histogram and the normal distribution test show a significant deviation from a normal distribution.

File: JOHNSON1.MTW

[Figure: Histogram of Output1 (Normal). x-axis: Output1; y-axis: Frequency. Mean 3,010; StDev 0,5707; N 700]

[Figure: Probability Plot of Output1 (Normal - 95% CI). x-axis: Output1; y-axis: Percent. Mean 3,010; StDev 0,5707; N 700; AD 8,033; P-Value <0,005]

Example Johnson 1 (2)

Let's try a first Johnson transformation. According to the theory, MINITAB should use the system SB. After the transformation of Output1 we store the results in the column „trans“.

Stat>Quality Tools>Johnson Transformation…

Example Johnson 1 (3)

The graphical analysis:

[Figure: Johnson Transformation for Output1, three panels]

− Probability Plot for Original Data (normal distribution test without transformation): N 700; AD 8,033; P-Value <0,005

− Select a Transformation: P-Value for AD test versus Z Value, with reference line Ref P (P-Value = 0.005 means <= 0.005)

− Probability Plot for Transformed Data (normal distribution test with transformation): N 700; AD 0,241; P-Value 0,774

P-Value for Best Fit: 0,773999

Z for Best Fit: 0,76

Best Transformation Type: SB

Equation for the transformation: Transformation function equals -0,0223806 + 0,645292 * Log( ( X - 1,96884 ) / ( 4,04224 - X ) )

The transformation was successful.

Example Johnson 1 (4)

If we want to determine the portion outside the specification limit we use the capability analysis for non normal distribution.

Stat>Quality Tools>Capability Analysis>Nonnormal…

Example Johnson 1 (5)

We receive:

[Figure: Process Capability of Output1, Johnson Transformation with SB Distribution Type. Histogram of the transformed data (-2,4 to 2,4) with USL*]

Equation for the transformation: -0,022 + 0,645 * Log( ( X - 1,969 ) / ( 4,042 - X ) )

Process Data: LSL *; Target *; USL 3,5; Sample Mean 3,0096; Sample N 700; StDev 0,570739; Shape1 -0,0223806; Shape2 0,645292; Location 1,96884; Scale 2,0734

After Transformation: LSL* *; Target* *; USL* 0,647477; Sample Mean* -0,0166779; StDev* 0,990882

Overall Capability: Pp *; PPL *; PPU 0,22; Ppk 0,22

Observed Performance: PPM < LSL *; PPM > USL 258571; PPM Total 258571

Exp. Overall Performance: PPM < LSL *; PPM > USL 251344; PPM Total 251344
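The figures of this capability report can be retraced by hand. The sketch below assumes that Shape1, Shape2, Location and Scale map onto the SB equation as shown, and that the transformed data are evaluated as a normal distribution with the reported Sample Mean* and StDev*.

```python
import math
from statistics import NormalDist

def johnson_sb(x, shape1, shape2, location, scale):
    """SB transformation with the parameter names of the capability report."""
    return shape1 + shape2 * math.log((x - location) / (scale + location - x))

# Values copied from the report above.
usl_t = johnson_sb(3.5, -0.0223806, 0.645292, 1.96884, 2.0734)
mean_t, sd_t = -0.0166779, 0.990882

ppu = (usl_t - mean_t) / (3 * sd_t)
ppm_above = (1 - NormalDist(mean_t, sd_t).cdf(usl_t)) * 1e6

print(f"USL* = {usl_t:.4f}, PPU = {ppu:.2f}, expected PPM > USL = {ppm_above:.0f}")
```

The transformed limit, PPU 0,22 and an expected share of about 25 % above the limit agree with the report.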


Exercise Johnson 2

How does the Johnson transformation react to outliers?

Try the transformation of the data „Output2“ in the same worksheet.

Does it work without the outliers?

What is the big difference between the two equations?

Can you explain the difference?

Example Johnson 3 (1)

Let's try with Output3:

[Figure: Histogram of Output3 (Normal). x-axis: Output3 (0 to 2100); y-axis: Frequency. Mean 21,16; StDev 104,8; N 500]

[Figure: Probability Plot of Output3 (Normal - 95% CI). x-axis: Output3; y-axis: Percent. Mean 21,16; StDev 104,8; N 500; AD 136,954; P-Value <0,005]

No normal distribution!

It seems that the data cannot be below 0 (System SL?).

Example Johnson 3 (2)

Stat>Quality Tools>Johnson Transformation…

Example Johnson 3 (3)

We receive:

[Figure: Johnson Transformation for Output3, three panels]

− Probability Plot for Original Data: N 500; AD 136,954; P-Value <0,005

− Select a Transformation: P-Value for AD test versus Z Value, with reference line Ref P (P-Value = 0.005 means <= 0.005)

− Probability Plot for Transformed Data: N 500; AD 0,314; P-Value 0,545

P-Value for Best Fit: 0,544750

Z for Best Fit: 0,68

Best Transformation Type: SL

Transformation function equals -0,532048 + 0,502955 * Log( X - 0,986507 )

The transformation was successful! Minitab used the system SL. The equation is shorter than with the system SB. Calculate the proportion above the limit of 200.
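One possible way to approach the closing question is sketched here. It rests on a simplifying assumption: the transformed values are treated as exactly standard normal, whereas a capability analysis would use the actual mean and standard deviation of the transformed column, so the result is only an approximation.

```python
import math
from statistics import NormalDist

def transform_output3(x):
    """Fitted SL equation reported for Output3 (Log is the natural logarithm)."""
    return -0.532048 + 0.502955 * math.log(x - 0.986507)

limit_t = transform_output3(200.0)

# Assumption: the transformed values follow N(0, 1) exactly.
share_above = 1 - NormalDist().cdf(limit_t)

print(f"transformed limit = {limit_t:.3f}, share above = {share_above:.2%}")
```

Under this assumption the share above the limit of 200 comes out below 2 %.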

Example Johnson 4 (1)

In the last example we use data that are unlimited on both sides.

The column Output includes 1000 values.

File: JOHNSON2SU.MTW

Since we have positive and negative values, a Box-Cox transformation would not work without preparation.

The normality check results in a non-normal distribution!

[Figure: Histogram of Output (Normal). x-axis: Output; y-axis: Frequency. Mean 38,06; StDev 1011; N 1000]

[Figure: Probability Plot of Output (Normal - 95% CI). x-axis: Output; y-axis: Percent. Mean 38,06; StDev 1011; N 1000; AD 7,298; P-Value <0,005]

Example Johnson 4 (2)

The result of the transformation:

[Figure: Johnson Transformation for Output, three panels]

− Probability Plot for Original Data: N 1000; AD 7,298; P-Value <0,005

− Select a Transformation: P-Value for AD test versus Z Value, with reference line Ref P (P-Value = 0.005 means <= 0.005)

− Probability Plot for Transformed Data: N 1000; AD 0,238; P-Value 0,784

P-Value for Best Fit: 0,783934

Z for Best Fit: 1

Best Transformation Type: SU

Transformation function equals -4,33230 + 3,22900 * Asinh( ( X + 2788,42 ) / 1519,24 )

Example Johnson 4 (3)

What changes if we shift the mean by adding 2352?

Shifted data:

P-Value for Best Fit: 0,783934; Z for Best Fit: 1; Best Transformation Type: SU

Transformation function equals -4,33230 + 3,22900 * Asinh( ( X + 436,415 ) / 1519,24 )

Original data:

P-Value for Best Fit: 0,783934; Z for Best Fit: 1; Best Transformation Type: SU

Transformation function equals -4,33230 + 3,22900 * Asinh( ( X + 2788,42 ) / 1519,24 )

Only one parameter of the transformation equation has changed.

[Figure: Probability Plot of transoutput2352; trans (Normal - 95% CI). x-axis: Data (0 to 90); y-axis: Percent]

Variable           Mean       StDev    N      AD      P

transoutput2352    47,78      10,38    1000   0,613   0,111

trans              -0,01120   0,9922   1000   0,238   0,784

In fact, the Box-Cox transformation also works. But it seems that the model fit in this example is more effective with the Johnson transformation (higher P-value).
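That only the location term changes can be checked numerically. The sketch compares the two reported equations: a value x in the original data must receive the same transformed value as x + 2352 in the shifted data. The test points are arbitrary and the function names are chosen here for illustration.

```python
import math

def su_original(x):
    """Reported SU equation for the original Output column."""
    return -4.33230 + 3.22900 * math.asinh((x + 2788.42) / 1519.24)

def su_shifted(x_shifted):
    """Reported SU equation for the column shifted by 2352."""
    return -4.33230 + 3.22900 * math.asinh((x_shifted + 436.415) / 1519.24)

shift = 2352.0
samples = [-3000.0, -500.0, 0.0, 38.06, 1500.0, 4000.0]   # arbitrary test points

# Largest difference between the two equations over the test points.
max_gap = max(abs(su_original(x) - su_shifted(x + shift)) for x in samples)
print(f"largest difference: {max_gap:.2e}")
```

The remaining difference only reflects the rounding of the reported location value (2788,42 - 2352 = 436,42 versus 436,415).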

Identify the Individual Distribution

Stat>Quality Tools>Individual Distribution Identification…

[Figure: Probability Plot for Raw data, four panels: Normal - 95% CI, Exponential - 95% CI, Weibull - 95% CI, Gamma - 95% CI. x-axis: Raw data; y-axis: Percent]

Goodness of Fit Test

Distribution    AD        P-Value

Normal          14,076    < 0,005

Exponential     10,394    < 0,003

Weibull         0,529     0,195

Gamma           0,349     > 0,250

Minitab runs the goodness of fit test for all theoretical distributions. Together with the Anderson-Darling value, Minitab calculates the p-value.
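The idea of comparing several candidate distributions by their Anderson-Darling value can be sketched with NumPy and SciPy (both assumed to be installed). This is not Minitab's procedure: the statistic below is the plain, unadjusted Anderson-Darling value, no p-values are derived, and `raw` is a hypothetical gamma-distributed sample.

```python
import numpy as np
from scipy import stats

def anderson_darling(sample, dist):
    """Anderson-Darling statistic of a sample against a fitted (frozen) distribution."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    # Sum over (2i - 1) * [ln F(x_i) + ln(1 - F(x_(n+1-i)))]
    s = np.sum((2 * i - 1) * (dist.logcdf(x) + dist.logsf(x[::-1])))
    return -n - s / n

# Hypothetical stand-in for the raw data: gamma-distributed, N = 500.
rng = np.random.default_rng(5)
raw = rng.gamma(shape=1.6, scale=1.8, size=500)

# Fit each candidate by maximum likelihood; the lower bound is fixed at 0 where it applies.
fitted = {
    "normal": stats.norm(*stats.norm.fit(raw)),
    "exponential": stats.expon(*stats.expon.fit(raw, floc=0)),
    "weibull": stats.weibull_min(*stats.weibull_min.fit(raw, floc=0)),
    "gamma": stats.gamma(*stats.gamma.fit(raw, floc=0)),
}

ad = {name: anderson_darling(raw, dist) for name, dist in fitted.items()}
best = min(ad, key=ad.get)

for name, value in ad.items():
    print(f"{name:12s} AD = {value:.3f}")
print("lowest AD:", best)
```

As on the slide, the candidate with the lowest AD value is the most plausible model for the data.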

Capability Analysis of Non Normal Data

Based on the best fit (lowest AD value and highest p-value), Minitab is able to run a capability analysis. In our example we get the best fit with a Gamma distribution.

Stat>Quality Tools>Capability Analysis>Nonnormal…

[Figure: Process Capability of Raw data, Calculations Based on Gamma Distribution Model. Histogram (0 to 14) with USL]

Process Data: LSL *; Target *; USL 12; Sample Mean 2,83266; Sample N 500; Shape 1,57919; Scale 1,79374

Overall Capability: Pp *; PPL *; PPU 0,81; Ppk 0,81

Observed Performance: PPM < LSL *; PPM > USL 4000,00; PPM Total 4000,00

Exp. Overall Performance: PPM < LSL *; PPM > USL 4537,96; PPM Total 4537,96
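The expected share above the specification limit follows directly from the fitted Gamma distribution. The sketch below uses only the Python standard library and evaluates the Gamma distribution function through its power series; the helper name `gamma_cdf` is chosen here for illustration.

```python
import math

def gamma_cdf(x, shape, scale, max_terms=500, tol=1e-15):
    """Gamma distribution function via the series of the regularized incomplete gamma."""
    if x <= 0:
        return 0.0
    t = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, max_terms):
        term *= t / (shape + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-t + shape * math.log(t) - math.lgamma(shape))

# Shape, scale and USL as reported in the process data above.
ppm_above = (1 - gamma_cdf(12.0, shape=1.57919, scale=1.79374)) * 1e6
print(f"expected PPM > USL = {ppm_above:.0f}")
```

The result is close to the 4537,96 PPM reported under Exp. Overall Performance.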

The Quality of Means

How precisely will a sample average predict the center of the population?

A mean value also has a standard deviation, which is called the standard error of the mean (SE Mean).

$SE_{Mean} = s_{\bar{x}} = \frac{s}{\sqrt{n}}$

Using the standard error of the mean we can estimate the quality of the mean (see confidence interval).

The equation indicates that a mean is more robust than an individual value by a factor of the square root of the sample size.

In practice this means, for example, that averaging 4 readings halves the measurement error.
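The relation can be turned around to ask how many readings are needed for a desired precision. A small sketch with illustrative function names:

```python
import math

def standard_error(s, n):
    """Standard error of the mean for standard deviation s and sample size n."""
    return s / math.sqrt(n)

def readings_needed(s, target_se):
    """Smallest number of readings whose mean reaches the target standard error."""
    return math.ceil((s / target_se) ** 2)

for n in (1, 4, 9, 16):
    print(f"n = {n:2d}: standard error = {standard_error(1.0, n):.3f} x s")

print("readings needed to halve the error:", readings_needed(1.0, 0.5))
```

Because of the square root, each further halving of the error requires four times as many readings.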

Think about the Following Exercise

• Let's assume you have a bucket filled with a large number of white sheets. On each sheet you wrote a number drawn from a normal distribution with a given mean and standard deviation.

• Randomly pull 9 sheets, calculate the mean of these numbers and note this number on a green paper. Then put the white papers back in the bucket.

• Put the green paper in another bucket.

• Repeat that until the other bucket is full of green sheets.

• The contents of the bucket with the white papers represent the population of the individual values.

• The contents of the bucket with the green papers represent the distribution of the means of the samples.

• Let's do this exercise with Minitab.

Let's Verify the Theory

We generate simulated data to verify the theory.

Generate 9 columns with numbers from a normal distribution with a mean of 70 and a standard deviation of 9.

The columns C1 to C9 represent the white sheets, column C10 the green sheets.

What standard deviation do we expect in column C10? What standard deviation do we actually have?

Calc>Random Data>Normal…

Calc>Row Statistics…

The Evaluation in Minitab

Variable    N      Mean      Median    TrMean    StDev    SE Mean

C1          250    70,022    69,755    70,004    9,214    0,583

C2          250    69,683    69,743    69,672    8,998    0,569

C3          250    71,287    71,928    71,324    9,072    0,574

C4          250    69,158    68,719    69,072    8,807    0,557

C5          250    71,088    70,894    71,167    9,381    0,593

C6          250    70,343    70,585    70,416    9,068    0,574

C7          250    70,630    70,991    70,697    9,208    0,582

C8          250    70,222    70,057    70,203    8,650    0,547

C9          250    69,990    70,076    69,953    8,776    0,555

C10         250    70,269    70,219    70,267    3,055    0,193

Enter the equation:

$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{9}{\sqrt{9}} = 3$

How can we describe these values?
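The bucket exercise can also be repeated outside Minitab. The following sketch uses the same settings (250 rows, 9 values per row, mean 70, standard deviation 9); the seed and the variable names are arbitrary choices for this illustration.

```python
import random
from statistics import fmean, stdev

rng = random.Random(2024)
rows, n = 250, 9

# "White sheets": 9 columns (C1..C9) of normally distributed values.
white = [[rng.gauss(70, 9) for _ in range(n)] for _ in range(rows)]

# "Green sheets": the row means (C10).
green = [fmean(row) for row in white]

expected_sd = 9 / n ** 0.5      # s / sqrt(n)
actual_sd = stdev(green)

print(f"expected StDev of the means: {expected_sd:.2f}")
print(f"actual StDev of the means:   {actual_sd:.2f}")
```

As in the Minitab table, the standard deviation of the means scatters around 3 while the individual columns stay near 9.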

The Graphical Evaluation

Single values vs. mean values

[Figure: I Chart of Values by Subs. x-axis: Observation (1 to 500); y-axis: Individual Value (40 to 100). Stage Mean: UCL=79,41; X̄=69,91; LCL=60,42]

Spread of single values: 45 – 95

Spread of means (n=9): 60 - 80

View of Mean Values

[Figure: Standard error vs. sample size n and Stdv. x-axis: Sample Size (10 to 50); y-axis: Standard Error; one curve each for Stdv 10, Stdv 5 and Stdv 1]

The Central Limit Theorem

The central limit theorem gives the condition for the convergence of a sequence of distributions to a normal distribution.

Let's look at a population with an arbitrary distribution, the mean value µ and the variance σ². Then we can make the following statement: with growing sample size n, the distribution of the mean values approaches a normal distribution with the expected mean = µ and the variance = σ²/n.

Z-transformation of individual values of a sample:

$Z = \frac{x - \mu}{\sigma}$

Z-transformation of mean values:

$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

Exercise

Let's use the central limit theorem to investigate non-normally distributed data.

File: Central Limit.mtw

Here we have 10 columns with random data which follow a chi-square distribution. From the columns 1 – 5 and 1 – 10 we have calculated mean values.

Investigate the distribution of the columns. How many observations do we need until a nearly normal distribution appears?

Rule:

The more strongly the distribution of the single values deviates from a normal distribution, the larger the sample size has to be (minimal = 3; maximal = 30).
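The exercise can be imitated with simulated data. Several points are assumptions of this sketch: the degrees of freedom (5) are a guess, more rows than in the worksheet are used to keep the estimates stable, and the sample skewness (0 for a symmetric distribution) serves as a simple measure of the distance to a normal shape.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Sample skewness: third standardized moment (0 for a symmetric sample)."""
    mean = fmean(values)
    sd = pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

rng = random.Random(11)
df = 5          # assumed degrees of freedom; not stated for the worksheet
rows = 2000     # more rows than the worksheet, to keep the estimates stable

# A chi-square variable with df degrees of freedom is Gamma(df / 2, scale 2).
table = [[rng.gammavariate(df / 2, 2.0) for _ in range(10)] for _ in range(rows)]

single = [row[0] for row in table]           # individual values (column 1)
mean_5 = [fmean(row[:5]) for row in table]   # means of columns 1 to 5
mean_10 = [fmean(row) for row in table]      # means of columns 1 to 10

skew = {
    "single": skewness(single),
    "n=5": skewness(mean_5),
    "n=10": skewness(mean_10),
}

for name, value in skew.items():
    print(f"{name:7s} skewness = {value:+.2f}")
```

The skewness of the means shrinks as more values are averaged, which is the convergence towards a normal distribution described by the central limit theorem.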

The Individual Values

[Figure: I Chart of 1. x-axis: Observation (1 to 100); y-axis: Individual Value. UCL=14,75; X̄=5,25; LCL=-4,24]

[Figure: Histogram of 1 (Normal). y-axis: Frequency. Mean 5,255; StDev 3,394; N 100]

[Figure: Probability Plot of 1 (Normal). y-axis: Percent. Mean 5,255; StDev 3,394; N 100; AD 3,235; P-Value <0,005]

Mean Values; n = 5

[Figure: Histogram of n=5 (Normal). x-axis: n=5; y-axis: Frequency. Mean 4,886; StDev 1,426; N 100]

[Figure: Probability Plot of n=5 (Normal). x-axis: n=5; y-axis: Percent. Mean 4,886; StDev 1,426; N 100; AD 0,839; P-Value 0,030]

Mean Values; n = 10

[Figure: Histogram of n=10 (Normal). x-axis: n=10; y-axis: Frequency. Mean 4,870; StDev 1,078; N 100]

[Figure: Probability Plot of n=10 (Normal). x-axis: n=10; y-axis: Percent. Mean 4,870; StDev 1,078; N 100; AD 0,413; P-Value 0,332]

Example with Uniformly Distributed Data

[Figure: Histogram of 1 (Normal). y-axis: Frequency. Mean 94,62; StDev 2,663; N 100]

[Figure: Probability Plot of 1 (Normal). y-axis: Percent. Mean 94,62; StDev 2,663; N 100; AD 0,631; P-Value 0,098]

[Figure: Histogram of Mean 5 (Normal). x-axis: Mean 5; y-axis: Frequency. Mean 94,83; StDev 1,301; N 100]

[Figure: Probability Plot of Mean 5 (Normal). x-axis: Mean 5; y-axis: Percent. Mean 94,83; StDev 1,301; N 100; AD 0,299; P-Value 0,579]