2007-06-17 tpc dd basicstatistics
TRANSCRIPT
-
7/30/2019 2007-06-17 TPC DD BasicStatistics
1/91
Tech-Pro Consultants
Six Sigma Basic Statistics
March 2005, Dr. K.S. Ravichandran
-
Basic Statistics
-
Objectives
Review & Enhance The Basic Statistical & Quality Terms Needed
For Six Sigma Process Improvement
Begin To Enhance Minitab Operating Skills
A politician's promise: "If elected, I'll make certain that everybody gets an above-average income."
-
What is Statistics?
Statistics is the science that develops methods to effectively derive information from numerical data.
Statistics is a collection of scientific methods for collecting, organizing and interpreting data, usually with the goal of inferring certain properties of the population from a representative sample of the population.
Statistics is the science of collecting and classifying a group of facts according to their relative number and determining certain values that represent characteristics of the group.
"There are three kinds of lies: lies, damned lies, and statistics." (Mark Twain)
-
Types of data
Measures of the Center of the data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Ask a statistician for her phone number... and get an estimate with 95% confidence
-
What sorts of data do you see being collected around your area?
(List them below)
___________________________________________________
In God we trust. All others must bring data.
-
2 General Kinds of Data (but 3 families)
ATTRIBUTE DATA - The data is discrete (counted). Results from using go/no-go gages, or from the inspection of visual defects, visual problems, missing parts, or from pass/fail or yes/no decisions.
VARIABLE DATA - The data is continuous (measured). Results from the actual measuring of a characteristic such as impedance of a motor winding, tensile strength of steel, diameter of a pipe, flow rate of a pump, etc.
Statisticians do it discretely and continuously.
-
2 General Kinds of Data (but 3 families)
Different Types Of Data Require Different Analysis Tools
ATTRIBUTE DATA (Count Data)
Just ask yourself, "Am I counting things here?" If yes, you have attributes data.
(#1) Number of Items in a Category (Count-Based Proportions): Type-I Attributes Data (Binomial)
Heads / Tails (i.e., counting # of Heads and # of Tails)
Yes / No (Order Form Filled Out Accurately or Not)
Pass / Fail; Good / Bad (Accurate Billing / Overcharged)
(#2) Counts of Discrete Event Occurrences: Type-II Attributes Data (Poisson)
# of Scratches on a Car Hood
# of Errors on a Form
# of Insulation Breaks in a Spool of Wire
# of times customer hangs up before receiving response
VARIABLE DATA (Continuous Measurement Scale)
(#3) Continuous Data
Decimal subdivisions are meaningful
Ex: Time to answer the telephone (Exact # of secs. per call)
-
3 Families of Data: "Am I Counting Things?"
Manufacturing Process: Making Sheets of Glass
ATTRIBUTES DATA (Discrete Data)
TYPE-I: Any Bubbles? (accept / reject the entire item) - Binomial Distribution
Sample #1: Reject, Sample #2: Reject, Sample #3: Accept, Sample #4: Reject
TYPE-II: Number of Bubbles? - Poisson Distribution
Sample #1: 3, Sample #2: 2, Sample #3: 0, Sample #4: 4
VARIABLES DATA (Continuous Data / Measurement Data)
Glass Weight - Normal Distribution or Other
Sample #1: 12.2, Sample #2: 12.4, Sample #3: 12.1, Sample #4: 11.9
-
3 Families of Data: "Am I Counting Things?"
Transactional Process: Converting an expense account form into a reimbursement check
ATTRIBUTES DATA (Discrete Data)
TYPE-I: Any Errors? (accept / reject the entire item) - Binomial Distribution
Form #1: Reject, Form #2: Reject, Form #3: Accept, Form #4: Reject
TYPE-II: Number of Errors on Form? - Poisson Distribution
Form #1: 3, Form #2: 2, Form #3: 0, Form #4: 4
VARIABLES DATA (Continuous Data / Measurement Data)
Time to Reimburse Employee - Normal Distribution or Other
Form #1: 36.1 hrs, Form #2: 24.6 hrs, Form #3: 21.0 hrs, Form #4: 29.2 hrs
-
Pass/Fail Data
[Figure: samples taken at 8:00am, 9:00am, and 10:00am are recorded in a data table with columns Sample (n), Number (np, c), Proportion (p, u), and Date (Shift, Time, etc.); proportions shown: 30%, 20%, 10%, 40%]
She tells you that you are just average? Never mind, she is just being mean.
-
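Number of Blemishes Data
[Figure: samples taken every ten minutes (8:00am, 8:10am, 8:20am, 8:30am, 8:40am, 8:50am, 9:00am, 9:10am, etc.) are recorded in a data table with columns Sample (n), Number (np, c), Proportion (p, u), and Date (Shift, Time, etc.); blemish counts shown: 3, 2, 1, 4]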
-
Exercise: Which Type of Data Is It?
DIRECTIONS: For each of the following applications, identify the type of data you would be investigating (Attributes Type-I, Attributes Type-II, or Variables Data) ... AND EXPLAIN YOUR CHOICE
(1) Percent defective parts in hourly production
(2) Percent cream content in milk bottles (comes in four-bottle container sets)
(3) Amount of time it takes to respond to a request
(4) Number of blemishes per square yard of cloth, where pieces of cloth may be of variable size
(5) Daily test of water acidity (pH)
(6) Number of raisins per box of Raisin Bran
(7) Number of defective parts in lots of size 100
(8) Length of screws in samples of size ten from production lots
(9) Number of errors on a purchase order
-
The Probability Test
What is the largest probability possible? _______ What does this mean?
What is the smallest probability possible? _______ What does this mean?
What does a probability of 0.50 mean? _______
What is the probability you will be struck by lightning during your lifetime? _______
What are your chances of appearing on The Tonight Show? _______
What is the probability of being killed by terrorists overseas? _______
What are your chances of being killed by an American in Baltimore? _______
-
The Probability Test
Instructor Page: Answers
What is the largest probability possible? 1.0 = 100%. What does this mean?
What is the smallest probability possible? 0.0 = 0%. What does this mean?
What does a probability of 0.50 mean? 50%. Just flip a coin.
What is the probability you will be struck by lightning during your lifetime? 0.000001667 = 1/600,000
What are your chances of appearing on The Tonight Show? 0.00000204 = 1/490,000
What is the probability of being killed by terrorists overseas? 0.000001538 = 1/650,000
What are your chances of being killed by an American in Baltimore? 0.00025 = 1/4,000
-
The Probability Test (cont.)
Roll a fair die once, what is Prob(a six)? ______
Roll a fair die twice, what is Prob(a six on the second roll)? ______
Roll two fair dice, what is Prob(get two sixes)? ______
What do you think of the recent headline, "Education research shows 49.5% of all American high school students fall below the national average!"?
-
Probability
The Practical Problem Statement ...
The Customer Requirements
Suppose a certain customer permits only those combinations which yield 3, 4, 5, . . . , or 11.
What is the process capability?
What is the probability of meeting the requirements?
Are capability and probability related?
Used With Permission
6 Sigma Academy Inc. 1995
-
Computing the Risks - The Statistical Problem Statement
Total of the two dice for each combination (rows: one die, columns: the other die):
      1   2   3   4   5   6
1     2   3   4   5   6   7
2     3   4   5   6   7   8
3     4   5   6   7   8   9
4     5   6   7   8   9  10
5     6   7   8   9  10  11
6     7   8   9  10  11  12
Ways to form a 2 = ____
Ways to form a 12 = ____
Probability of Defect = ____
Used With Permission
6 Sigma Academy Inc. 1995
-
Deeper Insight Into Probability
What is the probability of rolling a 5 using a fair pair of dice?
Die 1   Die 2   Probability
1       4       .0278
2       3       .0278
3       2       .0278
4       1       .0278
Total           .1111
[Figure: 6 x 6 grid of the two dice (1 to 6 on each axis) in which every one of the 36 combinations has probability .0278]
Used With Permission
6 Sigma Academy Inc. 1995
-
Establishing the Odds
Value Combinations Probability
2 1 .0278
3 2 .0556
4 3 .0833
5 4 .1111
6 5 .1389
7 6 .1667
8 5 .1389
9 4 .1111
10 3 .0833
11 2 .0556
12 1 .0278
Total 36 1.0000
Probability of any given value on Die 1 = 1/6 = .1667
Probability of any given value on Die 2 = 1/6 = .1667
Probability of any given combination = 1/6 x 1/6 = 1/36 = .0278
Used With Permission
6 Sigma Academy Inc. 1995
-
Graphing the Results
Suppose a certain customer permits only those combinations which yield 3, 4, 5, . . . , or 11.
Value Combinations Probability
2 1 .0278
3 2 .0556
4 3 .0833
5 4 .1111
6 5 .1389
7 6 .1667
8 5 .1389
9 4 .1111
10 3 .0833
11 2 .0556
12 1 .0278
Total 36 1.0000
[Figure: bar chart of probability (0 to 18%) versus Total of Dice Values (0 to 14); LSL and USL lines bound the Zone of Customer Satisfaction (94.4%), with 2.8% in each tail outside the limits]
. . . Hence, the probability of Customer Satisfaction is 94.4%
Used With Permission
6 Sigma Academy Inc. 1995
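The table of combinations and the 94.4% figure can be checked by enumerating all 36 outcomes. The following is a minimal sketch in Python (not part of the original deck); the variable names are illustrative only, and the acceptable range of 3 through 11 is taken from the customer requirement stated on the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die 1, die 2) combinations and count
# how many ways each total can be formed.
ways = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))

# Probability of each total = number of ways / 36.
prob = {total: Fraction(n, 36) for total, n in ways.items()}

# The customer accepts totals 3 through 11 (LSL = 3, USL = 11).
p_satisfied = sum(prob[t] for t in range(3, 12))
p_defect = 1 - p_satisfied

for total in sorted(prob):
    print(total, ways[total], round(float(prob[total]), 4))
print("P(customer satisfaction) =", round(float(p_satisfied), 4))
print("P(defect)                =", round(float(p_defect), 4))
```

Using exact fractions avoids rounding drift: the two excluded totals (2 and 12) each have one combination, so the defect probability is 2/36.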
-
Statistical Distributions
We can describe the behavior of any process or
system by plotting multiple data points for the
same variable
Over time
Across products or business
By different people, machines, etc...
The accumulation of these data can be viewed as
a distribution of values
Represented by: Dot plots
Histograms
Normal curve or other smoothed distribution
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Histogram is ... a pile of individual values
Process = Hose
1 Drop = 1 Unit of Output
[Figure: individual drops piling up along the axis Y = Weight (lbs), marked 100, 160, 220; below it, a character dotplot of column C1 on an axis from 100 to 225]
-
Dot Plots
[Figure: dot plot scale labeled Diameter, from 1.0 to 1.4 in steps of 0.05, with the 1st Observation and 2nd Observation marked]
Suppose we have a manufacturing line that is producing shafts. Diameters range from 1.0 to 1.4 inches. As we make a measurement of a shaft, we record the value with a dot on the above scale.
Ex:
1st Observation = 1.4 inches
2nd Observation = 1.1 inches
-
Dot Plots
And suppose we continue sampling until 150 shafts have been measured.
[Figure: dot plot of the 150 shaft diameters on the scale from 1.0 to 1.4 in steps of 0.05]
What Statements Can You Make About Our Process?
-
Dot Plots
[Figure: dot plot of the 150 shaft diameters on the scale from 1.0 to 1.4 in steps of 0.05]
Now imagine the same data, grouped into intervals with bars used to represent how the data looks.
-
Histogram Distribution
[Figure: histogram of the shaft diameters; x-axis from 1.0 to 1.4 in steps of 0.05; y-axis: Frequency, from 0 to 35]
Data represented just with the dots is called a Dot Plot.
Data represented in the bar format above is called a Histogram.
-
Histogram
Now we've combined the Histogram with our Lower and Upper Specifications.
[Figure: histogram of the shaft diameters on the scale from 1.0 to 1.4, with Lower Specification and Upper Specification lines; values shown: .001 and 2.0]
Question #1: What are Specifications? Where do they come from?
Question #2: What can you say about our process now?
-
Histogram
Suppose the customer has given us new specifications!
[Figure: histogram of the shaft diameters on the scale from 1.0 to 1.4, with Lower Specification = 1.1 and Upper Specification = 1.3]
Question: What can you say about our process now?
-
Dotplot Distribution
[Figure: dotplot of 150 response times on an axis from 28.0 to 32.0]
Imagine a customer service help line in which the business knows that, to stay competitive, it must return the customer's telephone calls in less than 30 minutes. The actual response time was measured 150 times and plotted above.
-
Smoothed (Normal) Distribution
[Figure: response-time data with a smoothed normal curve (red line); x-axis: Time, from 26 to 34; Lower Spec and Upper Spec lines marked]
Finally, we can view the data as a smoothed distribution (red line), in this example using the normal distribution assumption. It provides an approximation of how the data might look if we were to collect an infinite number of data points.
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Forming the Normal Curve
[Figure: histogram over Units of Measurement with a smooth curve interconnecting the center of each bar; a Performance Limit at x = a separates the Area of Yield from the Probability of a Defect in the tail]
$$p(x > a) = \int_{a}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left[(x-\mu)/\sigma\right]^{2}}\, dx$$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the distribution.
Given that 100% of the area under the normal curve lies between -infinity and +infinity, we may calculate that area which lies beyond the performance limit. Doing so would reveal the random chance probability of creating a defect.
Note: The tails of the normal curve will touch the baseline at infinity.
Used With Permission
6 Sigma Academy Inc. 1995
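The tail area beyond a performance limit does not have to be integrated by hand. The following is a minimal sketch using Python's standard library (not part of the original deck); the function name `prob_defect` and the numbers used (mean 10, standard deviation 2, limit 13) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def prob_defect(a, mu, sigma):
    """Return p(x > a): the area of a normal curve with mean mu and
    standard deviation sigma that lies beyond the performance limit a."""
    return 1.0 - NormalDist(mu, sigma).cdf(a)

# Hypothetical process: mean 10, standard deviation 2, performance limit 13,
# i.e. the limit sits 1.5 standard deviations above the mean.
p = prob_defect(a=13.0, mu=10.0, sigma=2.0)
print(round(p, 4))  # about 0.0668
```

A limit placed exactly at the mean gives 0.5, since half the area lies on each side.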
-
Types of data
Measures of the Center of the Data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Shape: Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Data Example (Actual # of Days from Order to Ship)
140 170 215 130 136 130 150
145 175 150 155 123 120 110
160 175 145 150 155 130 116
190 170 155 148 140 131 108
155 180 155 155 120 120 95
165 135 150 150 130 118 125
150 170 155 140 138 125 133
190 157 150 180 121 135 110
195 130 180 190 125 125 150
138 185 160 145 116 118 108
160 190 135 150 145 122
155 155 160 164 150 115
153 170 140 112 102
145 155 142 125 115
Where is the Center of the Data?
-
Where is the Center of the Data?
Described in 2 ways:
Mean = The average value (the "Center of Gravity")
$\bar{X} = \frac{\text{Sum of the data points}}{\text{Number of data points}}$
- Uses all data points
- Heavily influenced by extreme values
Median = the 50% point (or the middle number)
To find the median of a data set,
(1) arrange data in order from smallest to largest
(2) the middle number is the median!
Example: 1, 2, 3, 14, 85. The median is 3.
- Not heavily influenced by extreme values
-
Where is the Center?
As head of the university's Communications Dept., you are asked to summarize the average starting salaries of Communications graduates.
$10, 20, 30, 40, 50 ($ in thousands)
What is the average income (or center of gravity)?
What is the median income?
However, under the advice of the Public Relations Dept., you consider including one of your former Communications majors: Shaquille O'Neal (a rather wealthy rookie basketball star).
$10, 20, 30, 40, 5000 ($ in thousands)
What is the average income (or center of gravity)?
What is the median income?
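The effect of one extreme value on the two measures of center can be checked directly. This is a small sketch using Python's standard `statistics` module (not part of the original deck); the list names are illustrative, and the figures are the two salary lists from the slide, in thousands of dollars.

```python
from statistics import mean, median

# Starting salaries from the slide, in thousands of dollars.
salaries = [10, 20, 30, 40, 50]
salaries_with_star = [10, 20, 30, 40, 5000]

for label, data in (("without the star", salaries),
                    ("with the star", salaries_with_star)):
    print(label, "mean =", mean(data), "median =", median(data))
```

The median is unchanged by the single extreme salary, while the mean moves far away from every typical value in the list.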
-
Where is the Center?
Mode (not used as much): The value that occurs most often.
The Mode may not exist; and if it does exist, it may not be unique.
- Can be used with categorical/attribute data
What is the mode for the following set of defect data?
# of change notices issued:
- Price change: 13
- Spec change: 112
- Ship to address change: 40
- Delivery date changed: 79
What does "Bimodal" mean?
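For categorical data like the change notices, the mode is simply the category with the largest count. The sketch below (Python, not part of the original deck) uses the counts from the slide; the second list is a hypothetical data set added only to show what a bimodal result looks like.

```python
from statistics import multimode

# Counts of change notices issued, as listed on the slide.
change_notices = {
    "Price change": 13,
    "Spec change": 112,
    "Ship to address change": 40,
    "Delivery date changed": 79,
}

# The mode of categorical data is the category that occurs most often.
mode_category = max(change_notices, key=change_notices.get)
print(mode_category)

# Hypothetical values (not from the slides) with two equally common values:
# multimode returns every value that ties for the highest frequency.
hypothetical_values = [18, 18, 19, 20, 20]
modes = multimode(hypothetical_values)
print(modes)
```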
-
Example
Suppose your son or daughter is considering going to work for a small, family owned business after graduation. The owner of the business proudly states that, of the last 7 college graduates hired, the mean salary was $25,000; the salaries were bimodal, with modes of $18,000 and $20,000; and the median salary was $19,000. He refuses to identify the individual salaries.
From Introductory Statistics, William D. Ergle
-
Exercise
Minitab can easily calculate the Mean and Median
1. Open up Minitab
2. Open file: Distskew.mtw
3. Perform the following: Stat > Basic Statistics > Descriptive Statistics
4. Enter the variable names
5. Evaluate results
-
Descriptive Statistics For 3 Distributions
TABULAR FORM
Variable   N    Mean    Median  TrMean  StDev
Normal     500  70.000  69.977  70.014  10.000
Pos Skew   500  70.000  65.695  68.554  10.000
Neg Skew   500  70.000  73.783  71.368  10.000
Look For This In Your Session Window!
-
Graphical Form
Different Distributions
-
Different Distributions
Sketch in the Means and Medians on each Distribution.
[Figure: three histograms, each titled "Comparison of Distributions." and plotting Frequency: C1, Symmetric Distribution (x-axis 20 to 110); C2, Positive Skew with its tail marked (x-axis 60 to 130); C3, Negative Skew with its tail marked (x-axis 0 to 80)]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Graphical Reminder
* The 3 charts on the previous page were created under the Minitab Histogram option: Graph > Histogram
-
Relationship Of The Mean & Median
[Figure: three histograms plotting Frequency, with the Mean and Median marked on each: Normal (x-axis 20 to 110), where the Mean and Median coincide; Neg Skew (x-axis 0 to 80), where the Mean lies below the Median; Pos Skew (x-axis 60 to 130), where the Mean lies above the Median]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Types of data
Measures of the Center of the Data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Population Parameters vs Sample Statistics
Population parameters: $\mu$ = Population Mean; $\sigma$ = Population Standard Deviation
Examples of POPULATION:
Entire United States
Yrs. Worth of Acct. Payable
Every Grain of Sand On The Beach
Sample statistics: $\bar{X}$ = Sample Mean; $s = \hat{\sigma}$ = Sample Standard Deviation
Examples of SAMPLE:
1000 US Citizens
Hrs. Worth of Acct. Payable
Handful of Sand
-
3 ways to describe how far the data is spread:
Range = R: the difference between largest and smallest observations
Standard Deviation = $s$
Variance = $s^2$ (just the square of the std dev!)
-
CLASS EXERCISE
Calculate manually the Variance and Standard Deviation of These 5 Data Points
$\bar{X} = \frac{\text{Sum of the data points}}{\text{Number of data points}}$
$X$    $X - \bar{X}$    $(X - \bar{X})^2$
5      2                4
4      ___              ___
3      ___              ___
1      ___              ___
2      -1               1
Avg = ___
Sum of the last column = _______
Divide the Sum by (n-1): Variance = $S^2$ = __________
Square Root of the Variance: Std. Dev. = $S = \sqrt{S^2}$ = _________
-
CLASS EXERCISE
Instructor Page
Calculate manually the Variance and Standard Deviation of These 5 Data Points
$\bar{X} = \frac{\text{Sum of the data points}}{\text{Number of data points}}$
$X$    $X - \bar{X}$    $(X - \bar{X})^2$
5      2                4
4      1                1
3      0                0
1      -2               4
2      -1               1
Avg = 3
Sum of the last column = 10
Divide the Sum by (n-1): Variance = $S^2$ = 2.5
Square Root of the Variance: Std. Dev. = $S = \sqrt{S^2}$ = 1.58
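The manual calculation can be mirrored step by step in code. This is a minimal sketch in Python (not part of the original deck) that follows the worksheet columns for the five data points; the variable names are illustrative.

```python
from math import sqrt

data = [5, 4, 3, 1, 2]                      # the 5 data points from the exercise
n = len(data)

avg = sum(data) / n                          # Avg = sum of data / number of points
deviations = [x - avg for x in data]         # column: X - X-bar
squared = [d ** 2 for d in deviations]       # column: (X - X-bar)^2
sum_of_squares = sum(squared)                # sum of the last column

variance = sum_of_squares / (n - 1)          # divide the sum by (n - 1)
std_dev = sqrt(variance)                     # square root of the variance

print(avg, sum_of_squares, variance, round(std_dev, 2))
```

Dividing by (n - 1) rather than n gives the sample variance, which is what the worksheet asks for.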
-
Computational Equations
Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Sample Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Sample Standard Deviation
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Used With Permission
6 Sigma Academy Inc. 1995
-
The Standard Deviation
[Figure: normal curve centered on the mean, with the Point of Inflection marked at 1σ from the mean, the target T, the USL at 3σ, and the tail area p(d) beyond the USL]
Upper Specification Limit (USL)
Target Specification (T)
Lower Specification Limit (LSL)
Mean of the distribution (μ)
Standard Deviation of the distribution (σ)
The distance between the point of inflection and the mean constitutes the size of a standard deviation. If three such deviations can be fit between the target value and the specification limit, we would say the process has three sigma capability.
Used With Permission
6 Sigma Academy Inc. 1995
-
Types of data
Measures of the Center of the Data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Basic Statistics
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
The Normal Distribution
The Normal Distribution is a distribution of data which has certain consistent properties
These properties are very useful in our understanding of the characteristics of the underlying process from which the data were obtained
Most natural phenomena and man-made processes are distributed normally, or can be represented as normally distributed
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
The Normal Distribution
Property 1: A normal distribution can be described completely by knowing only the:
mean, and
standard deviation
[Figure: three normal curves labeled Distribution One, Distribution Two, and Distribution Three]
What is the difference among these three normal distributions?
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Statistical Number Line
X Axis:   μ-3σ  μ-2σ  μ-1σ  μ    μ+1σ  μ+2σ  μ+3σ
(pounds): ___   ___   ___   300  ___   ___   ___   (add 10 for each step to the right)
Exercise
Suppose the weights of players on a football team had μ = 300 lbs and σ = 10 lbs
You fill in the X-axis values (weights) above
-
Statistical Number Line
Instructor Page
X Axis:   μ-3σ  μ-2σ  μ-1σ  μ    μ+1σ  μ+2σ  μ+3σ
(pounds): 270   280   290   300  310   320   330
Exercise
Suppose the weights of a football team had μ = 300 lbs and σ = 10 lbs
You fill in the X-axis values (weights)
-
X Axis:   μ-3σ  μ-2σ  μ-1σ  μ    μ+1σ  μ+2σ  μ+3σ
(pounds): 270   280   290   300  310   320   330
μ +/- 1σ = 68% of the individuals
Instructor Page
-
X Axis:   μ-3σ  μ-2σ  μ-1σ  μ    μ+1σ  μ+2σ  μ+3σ
(pounds): 270   280   290   300  310   320   330
μ +/- 2σ = 95% of the individuals
Instructor Page
-
X Axis:   μ-3σ  μ-2σ  μ-1σ  μ    μ+1σ  μ+2σ  μ+3σ
(pounds): 270   280   290   300  310   320   330
μ +/- 3σ = 99.7% of the individuals
Instructor Page
-
The Normal Curve and Probability Areas Associated with the Standard Deviation
Property 2: The area under sections of the curve can be used to estimate the cumulative probability of a certain event occurring
[Figure: normal curve; x-axis: Number of standard deviations from the mean (-4 to 4); y-axis: Probability of sample value (0% to 40%); areas marked: 68% within +/- 1, 95% within +/- 2, 99.73% within +/- 3; annotation: Cumulative probability of obtaining a value between two values]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
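The 68%, 95%, and 99.7% areas can be reproduced from the normal curve itself. The sketch below (Python standard library, not part of the original deck) uses the football-team figures from the earlier exercise, a mean of 300 lbs and a standard deviation of 10 lbs; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 300, 10            # football-team weights from the exercise
weights = NormalDist(mu, sigma)

# Area under the curve between mu - k*sigma and mu + k*sigma for k = 1, 2, 3.
areas = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    areas[k] = weights.cdf(high) - weights.cdf(low)
    print(f"{low} to {high} lbs: {areas[k]:.4f}")
```

The same subtraction of two cumulative probabilities gives the chance of a value falling between any two limits, not only symmetric ones.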
-
Empirical Rule of Standard Deviation
The previous rules of cumulative probability apply approximately even when a set of data is not perfectly normally distributed. Let's compare the values for a theoretical (perfect) normal distribution to empirical (real-world) distributions.
Number of Standard Deviations   Theoretical Normal   Empirical Normal
+/- 1σ                          68%                  60-75%
+/- 2σ                          95%                  90-98%
+/- 3σ                          99.7%                99-100%
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
How can I tell if my data is bell-shaped? (i.e., Normally Distributed)
-
Normal Probability Plots
We can test whether a given data set can be described as "normal" with a test called a Normal Probability Plot
If a distribution is close to normal, the normal probability plot will be a straight line.
Minitab makes the normal probability plot easy. Using Distskew.mtw, choose: Stat > Basic Stats > Normality Tests
Produce a normal plot of each of the first 3 columns. Which appear to be normal?
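The idea behind the plot can be sketched without Minitab: sort the data, pair each value with the z-score a normal distribution would put at that rank, and see how close the pairs are to a straight line. The Python below is an illustration only, not Minitab's procedure and not the Anderson-Darling test. The helper names, the plotting position (i - 0.5)/n, the use of a correlation coefficient as a "straightness" score, and the randomly generated data are all assumptions made for this example.

```python
import random
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Return (theoretical z-score, observed value) pairs for a normal
    probability plot. The plotting position (i - 0.5) / n is one common
    convention, assumed here."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

def straightness(points):
    """Pearson correlation of the plotted points; values near 1.0 mean the
    plot is close to a straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    mz, mx = mean(zs), mean(xs)
    num = sum((z - mz) * (x - mx) for z, x in points)
    den = (sum((z - mz) ** 2 for z in zs) * sum((x - mx) ** 2 for x in xs)) ** 0.5
    return num / den

# Hypothetical data: 500 roughly normal values and 500 right-skewed values,
# both with mean near 70 and standard deviation near 10.
rng = random.Random(1)
normal_like = [rng.gauss(70, 10) for _ in range(500)]
skewed = [70 + 10 * (rng.expovariate(1.0) - 1.0) for _ in range(500)]

r_normal = straightness(normal_plot_points(normal_like))
r_skewed = straightness(normal_plot_points(skewed))
print(round(r_normal, 3), round(r_skewed, 3))
```

The skewed data bends away from the line in one tail, which pulls its score well below that of the normal-like data.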
-
3 Ways To See If Your Data Is Normally Distributed
[Figure: three histograms plotting Frequency: C1 (x-axis 20 to 110), C2 (x-axis 60 to 130), C3 (x-axis 0 to 80)]
[Figure: three normal probability plots, each with a Probability axis from .001 to .999: Normal Distribution, Positive Skewed Distribution, Negative Skewed Distribution]
Anderson-Darling Normality Test results:
Variable   Average  Std Dev  N of data  A-Squared  p-value
Normal     70       10       500        0.418      0.328
Pos Skew   70       10       500        46.447     0.000
Neg Skew   70       10       500        43.953     0.000
If the Normality Test shows a P-value that is less than 0.05, then the data is NOT represented well by a normal distribution
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
P Value for Normality Test
If your P value is less than .05, then the data is NOT approximately normal.
-
Mystery Distribution
Generate a Normal Probability Plot for the Mystery variable in Mystery.mtw
What is your conclusion? Is this a normal distribution?
[Figure: normal probability plot of Mystery; x-axis from 50 to 150; Probability axis from .001 to .999]
Anderson-Darling Normality Test: A-Squared: 27.108, p-value: 0.000
Average: 100, Std Dev: 32.3849, N of data: 500
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
-
Central Limit Theorem
The central limit theorem states that the distribution of the sample means, our estimate of μ, can be approximated with a normal distribution even though the original population may be non-normal.
Given this, we may say that the grand average (resulting from averaging sets of samples) approaches the universe mean as the number of sample sets approaches infinity. This property is at the core of many statistical tests and is very important for resolving a wide array of industrial problems.
[Figure: various sampling distributions of individual measurements; a random sample of g sets with n measurements assigned to each set; the resulting distribution of the sample averages]
For more detail, see the next few pages.
Used With Permission
6 Sigma Academy Inc. 1995
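A small simulation makes the theorem concrete. The sketch below (Python, not part of the original deck) draws g subgroups of n values from a strongly skewed population and looks at the subgroup averages. The exponential population, the subgroup size of 30, the 2000 subgroups, and the seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)

def sample_means(n, g):
    """Draw g subgroups of n values each from a skewed (exponential)
    population with mean 1 and standard deviation 1; return the g averages."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(g)]

n, g = 30, 2000
averages = sample_means(n, g)

grand_average = mean(averages)          # should be close to the population mean, 1
spread_of_averages = pstdev(averages)   # should be close to 1 / sqrt(n)

print(round(grand_average, 3), round(spread_of_averages, 3), round(1 / sqrt(n), 3))
```

Plotting a histogram of `averages` would show a roughly bell-shaped pile even though the individual values are far from normal.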
-
Important Distinctions:
The Distribution of Averages
VS
The Distribution of Individuals
-
What would the Distribution of Individuals look like?
[Figure: Flashlight battery lifetimes; axis Y = Lifetime (Hrs), marked 74, 85, 96; legend: Individual Measurement, Average of the Subgroup; the shape of The Distribution of Individuals is left as a question (? ?)]
-
What would the Distribution of Individuals look like?
[Figure: Flashlight battery lifetimes; axis Y = Lifetime (Hrs), marked 74, 85, 96; legend: Individual Measurement, Average of the Subgroup; The Distribution of Individuals is drawn in]
-
What would the Distribution of Averages look like?
[Figure: axis Y = Weight (lbs), marked 9.5, 10, 10.5; legend: Individual Measurement, Average of the Subgroup; caption: The Distribution of Averages?]
-
What would the Distribution of Averages look like?
[Figure: axis Y = Weight (lbs), marked 9.5, 10, 10.5; legend: Individual Measurement, Average of the Subgroup; The Distribution of Averages is drawn in]
-
Distribution of Individuals vs Distribution of Averages
1 point is ...
Individuals: 1 Individual
Averages: 1 Avg (i.e., 1 X-Bar)
Histogram is ...
Individuals: A Pile of Individuals
Averages: A Pile of X-Bars
Spread is ...
Individuals: $\sigma_X$
Averages: $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$ = SE(Mean)
The question might be ...
Individuals: What is the probability that an individual battery will last beyond 87 hours?
Averages: What is the probability that the average lifetime of an n=20 sample will exceed 87 hours?
Graphically ...
[Figure: two curves on the same axis (74, 85, 96); the distribution of averages is compressed by $\sqrt{n}$ relative to the distribution of individuals]
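The two questions about 87 hours have very different answers because the averages are spread over a narrower range. The sketch below (Python standard library, not part of the original deck) takes the mean of 85 hours from the axis label and n = 20 from the question; the standard deviation of 4 hours is a hypothetical value, since the slides do not state it.

```python
from math import sqrt
from statistics import NormalDist

mu = 85.0       # center of the lifetime axis on the slide (hours)
sigma = 4.0     # hypothetical standard deviation of individual lifetimes
n = 20          # subgroup size from the question on the slide

se_mean = sigma / sqrt(n)   # spread of the averages: sigma / sqrt(n)

# Probability that one individual battery lasts beyond 87 hours.
p_individual = 1.0 - NormalDist(mu, sigma).cdf(87.0)

# Probability that the average of n = 20 batteries exceeds 87 hours.
p_average = 1.0 - NormalDist(mu, se_mean).cdf(87.0)

print(round(se_mean, 3), round(p_individual, 4), round(p_average, 4))
```

With these assumed numbers, an individual battery beats 87 hours fairly often, while an average of 20 batteries almost never does.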
-
Dist. of Avgs spread compresses by factor of $\sqrt{n}$
$\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$
[Figure: spread of the distribution on a vertical axis from 73 to 97 for Individuals (n=1) and for averages of subgroup sizes n=2, n=4, n=12, n=20, and n=50; the spread narrows as n increases]
-
Types of data
Measures of the Center of the Data
Mean
Median
Mode
Measures of the Spread of Data
Range
Variance
Standard Deviation
Normal Distribution and Normal Probabilities
Process Stability and Process Capability
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Basic Statistics
-
Variability
Is the process on target with minimum variability?
We use the mean to determine if the process is on target.
We use the standard deviation to determine variability.
Stability
How does the process perform over time? Represented by a constant mean and predictable variability over time.
Which process is the best process?
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
[Figure: X-Bar Chart for Process B. Sample Mean vs. Sample Number (0 to 25); center line = 70.98, UCL = 77.27, LCL = 64.70]
[Figure: X-Bar Chart for Process A. Sample Mean vs. Sample Number (0 to 25); center line = 70.91, UCL = 77.20, LCL = 64.62]
Variation
-
While every process displays variation, some processes display controlled variation, while other processes display uncontrolled variation (Walter Shewhart).
Controlled variation is characterized by a stable and consistent pattern of variation over time. Associated with common causes.
Uncontrolled variation is characterized by variation that changes over time. Associated with special causes.
Process A shows controlled variation.
Process B shows uncontrolled variation.
[Figure annotation: Special Causes]
[Figure: X-Bar Chart for Process A. Sample Mean vs. Sample Number (0 to 25); center line = 70.91, UCL = 77.20, LCL = 64.62]
[Figure: X-Bar Chart for Process B. Sample Mean vs. Sample Number (0 to 25); center line = 70.98, UCL = 77.27, LCL = 64.70]
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
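One way to separate controlled from uncontrolled variation is to compare subgroup means against X-bar chart limits. The slides do not say how the limits on the charts were computed; the sketch below assumes the common average-range method (center line +/- A2 times the average range), with A2 taken from standard control-chart tables. All data and function names are hypothetical.

```python
import statistics

# Standard control-chart constants A2 for subgroup sizes 2 to 5.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}

def xbar_limits(subgroups):
    """Return (LCL, center line, UCL) for an X-bar chart using the
    average-range method: center +/- A2 * R-bar."""
    n = len(subgroups[0])
    means = [statistics.fmean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    center = statistics.fmean(means)
    half_width = A2[n] * statistics.fmean(ranges)
    return center - half_width, center, center + half_width

def special_cause_points(subgroups, limits):
    """Indices of subgroups whose mean falls outside the control limits."""
    lcl, _, ucl = limits
    return [i for i, g in enumerate(subgroups)
            if not lcl <= statistics.fmean(g) <= ucl]

# Hypothetical data: limits are set from a stable baseline period ...
baseline = [[70, 72, 69, 71], [71, 70, 73, 68], [69, 71, 70, 72],
            [72, 70, 71, 69], [70, 69, 72, 71]]
limits = xbar_limits(baseline)

# ... and later subgroups are judged against them.
new_data = [[71, 70, 72, 69], [55, 57, 54, 56], [70, 71, 69, 72]]
flagged = special_cause_points(new_data, limits)
print("LCL, CL, UCL:", [round(v, 2) for v in limits])
print("Subgroups signalling a special cause:", flagged)
```

A point outside the limits is only the simplest signal of a special cause; run rules and other tests are not covered here.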
Can We Tolerate Variability ?
-
There will always be variability present in any process.
We can tolerate variability if:
The total variability of the output is relatively small compared to the process specifications and the process is on target
The process is stable over time
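[Figure: two panels plotting Cost against output, with LSL, Nom, and USL marked. OLD panel: Traditional Goal Post Mentality, with the region between the limits labeled Acceptable. New panel.]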
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
Expanding On The Goal Post Mentality
-
UNDER THE OLD RULES,
The field goal kicker gets 3 points for his team as long as the ball falls between the LSL and USL.
[Figure: goal posts marked LSL, Nom, USL; 3 Points anywhere between the posts]
Expanding On The Goal Post Mentality
-
UNDER THE NEW RULES,
The Field Goal Kicker Might Get...
3 points: Target & +/-1
2 points: Between +/-1 & +/-2
1 point: > +/-2 Out To The LSL & USL
[Figure: goal posts marked LSL, Nom, USL; bands between the posts score 1, 2, 3, 2, 1 Points]
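The old and new scoring rules can be written as two small functions. This is a sketch only: the slides give no units for the +/-1 and +/-2 bands and do not say what a miss scores, so the limits below are hypothetical and a result outside LSL..USL is assumed to score 0 under both rules.

```python
def old_rules_score(x, lsl, usl):
    """Goal-post scoring: full credit anywhere inside the limits."""
    return 3 if lsl <= x <= usl else 0   # assumption: a miss scores 0

def new_rules_score(x, target, lsl, usl):
    """Graded scoring: credit drops as the result moves away from target."""
    if not lsl <= x <= usl:
        return 0                          # assumption: a miss scores 0
    distance = abs(x - target)
    if distance <= 1:
        return 3
    if distance <= 2:
        return 2
    return 1

# Hypothetical limits: target 0, posts at -3 and +3 (same units as the bands).
TARGET, LSL, USL = 0.0, -3.0, 3.0
for kick in (0.2, -1.5, 2.8, 3.5):
    print(kick, old_rules_score(kick, LSL, USL),
          new_rules_score(kick, TARGET, LSL, USL))
```

Under the old rules a kick that barely clears the post earns the same 3 points as one on target; under the new rules it earns 1.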
Data Analysis Tasks For Improvement
-
Determine If The Process Is Stable
If the process is not stable, identify and remove the causes of instability
Determine The Location Of The Process Mean. Is It On Target?
If not, identify the variables which affect the mean and determine optimal settings to achieve the target value
Estimate The Magnitude Of The Total Variability. Is it acceptable with respect to the customer requirements (spec limits)?
If not, identify the sources of the variability and eliminate or reduce their influence on the process
Used With Permission
AlliedSignal 1995 - Dr. Steve Zinkgraf
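The three tasks above can be sketched as one function. The checks used here are assumptions, not the deck's prescribed method: stability is judged with individuals-chart limits (mean +/- 2.66 times the average moving range), and variability is compared with the spec limits through the standard Cp and Cpk indices, which the slides do not name. The data, target, and limits are hypothetical.

```python
import statistics

def assess_process(values, target, lsl, usl):
    """Answer the three questions with deliberately simple checks."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)

    # 1. Stability: individuals-chart limits from the average moving range.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    stable = all(abs(v - mean) <= 2.66 * mr_bar for v in values)

    # 2. Location: how far is the mean from the target?
    offset = mean - target

    # 3. Variability against the spec limits.
    cp = (usl - lsl) / (6 * sd)
    cpk = min(usl - mean, mean - lsl) / (3 * sd)

    return {"stable": stable, "mean": mean, "offset": offset,
            "sd": sd, "Cp": cp, "Cpk": cpk}

# Hypothetical measurements with target 10.0 and spec limits 9.4 to 10.6.
data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.1, 9.9, 10.0]
report = assess_process(data, target=10.0, lsl=9.4, usl=10.6)
for name, value in report.items():
    print(name, value)
```

The order matters: the mean and standard deviation only describe future output if the stability check passes first.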
Visualizing the Process Dynamics - Is The Process Stable?
-
Inherent Capability of the Process . . . also called short-term capability
General Assumptions:
Over time, a typical process will shift and drift by approx. 1.5σ
[Figure: process distribution at Time 1, Time 2, Time 3, and Time 4, with LSL, T, and USL marked]
Sustained Capability of the Process . . . also called long-term capability
Used With Permission
6 Sigma Academy Inc. 1995
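The effect of the 1.5σ shift assumption on defect rates can be computed from normal tail areas. A minimal sketch, assuming a normally distributed output with spec limits placed symmetrically around the target; the function names are illustrative.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def defects_per_million(sigma_level, shift=0.0):
    """Defects per million for symmetric spec limits at +/- sigma_level
    standard deviations, with the mean shifted by `shift` standard deviations."""
    beyond_near_limit = upper_tail(sigma_level - shift)
    beyond_far_limit = upper_tail(sigma_level + shift)
    return 1e6 * (beyond_near_limit + beyond_far_limit)

for level in (3, 4, 5, 6):
    short_term = defects_per_million(level)
    long_term = defects_per_million(level, shift=1.5)
    print(f"{level} sigma: short-term {short_term:.4g} DPM, "
          f"long-term {long_term:.4g} DPM")
```

With the shift applied, a six sigma process gives the familiar figure of about 3.4 defects per million.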
The Goal Is ...
-
Variables Data: on Target
Attributes Data: 0% Rejected
How We Progress Toward The Goal
-
PHASE ONE - Unpredictable Performance
- VARIATION (SPECIAL / NATURAL CAUSES)
- UNPREDICTABLE (HOURLY, DAILY)
- DETECT AND ELIMINATE SPECIAL CAUSES
PHASE TWO - Stability
- IN CONTROL
- NATURAL VARIATION ONLY
Not capable of getting all the water output into the clown's mouth?
[Figure labels: Lower Specification, Upper Specification]
How We Progress Toward The Goal
-
IN CONTROL, BUT NOT CAPABLE
(Variation from common causes excessive)
IN CONTROL AND CAPABLE
(Variation from common causes reduced)
[Figure: distribution of SIZE between the LOWER SPECIFICATION LIMIT and the UPPER SPECIFICATION LIMIT; axis ticks from 1.0 to 1.4]
Now it is capable of getting all the water output into the clown's mouth
Is The Process on Target ? - Accurate ?
-
Recognize that the process center (μ) is independent of the design center (T). In other words, the ability of a process to repeat any given centering condition is independent of the design specifications.
[Figure: Manufacturing Distribution of the Widget Part, with LSL, T, USL, and the process center μ marked; axis ticks from 1.233 to 1.247. Annotation: Increase in nonconformance due to shift in process centering]
Used With Permission
6 Sigma Academy Inc. 1995
Is The Process on Target ? - Precise?
-
Recognize that the process width is independent of the design width. In other words, the inherent precision of a process is not determined by the design specifications.
[Figure: Manufacturing Distribution of the Widget Part, with LSL, T, and USL marked; axis ticks from 1.235 to 1.247]
Used With Permission
6 Sigma Academy Inc. 1995
Is The Variability Acceptable To Customer Requirements?
-
$Y = f(X_1, \ldots, X_N)$
The variation inherent to any dependent variable (Y) is determined by the variations inherent to each of the independent variables.
[Figure: two distributions between LSL and USL. Poor Process Capability: Very High Probability of Defects in both tails. Excellent Process Capability: Very Low Probability of Defects in both tails.]
Used With Permission
6 Sigma Academy Inc. 1995
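How the variation in the Xs feeds the variation in Y can be illustrated for the simplest case. The sketch below assumes Y is a linear combination of independent inputs, so that the variances add; the slide states only the general Y = f(X1 . . . XN) relationship, and the coefficients and standard deviations here are hypothetical.

```python
import math

def output_sd(coefficients, input_sds):
    """Standard deviation of Y = sum(a_i * X_i) for independent inputs."""
    variance = sum((a * s) ** 2 for a, s in zip(coefficients, input_sds))
    return math.sqrt(variance)

def variance_shares(coefficients, input_sds):
    """Fraction of Var(Y) contributed by each input."""
    parts = [(a * s) ** 2 for a, s in zip(coefficients, input_sds)]
    total = sum(parts)
    return [p / total for p in parts]

# Hypothetical model: Y = 2*X1 + 1*X2 + 0.5*X3
coeffs = [2.0, 1.0, 0.5]
sds = [0.3, 0.4, 0.2]
print("sd of Y:", round(output_sd(coeffs, sds), 4))
print("variance shares:", [round(s, 3) for s in variance_shares(coeffs, sds)])
```

In this example X1 contributes most of the variance in Y even though X2 has the larger standard deviation, which is why reducing variability starts with identifying the dominant sources.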
Summary
-
Reviewed & Enhanced The Basic Statistical & Quality Terms Needed For Six Sigma Process Improvement
Began To Build Up Minitab Operating Skills
-
Six Sigma
Q&A
-
Six Sigma
Thank You