data collection and sample size considerations measure kaizen facilitation

Post on 12-Jan-2016


Data Collection and Sample Size Considerations

• Measure
• Kaizen Facilitation

Objectives

• Review Data Collection Plan Definition and Goals
• Discuss Sampling Principles
• Identify Key Elements of a Data Collection Plan

Data Collection Must Be Planned

• A Data Collection Plan is an organized, written strategy for gathering information for your Project

• Goals
  • Data are representative of the process
  • Data are reliable, every time
  • Only relevant data are collected
  • All the necessary data are collected
  • Resources are used effectively

What to Measure?

• Outputs (Y’s)
  • Product or Service produced or delivered by the process
  • Measures include cycle time, customer satisfaction, cost
• Process Variables (X’s)
  • Those variables that influence the output and are generally controllable by those who operate the process
• Input Variables (X’s)
  • Materials and information used by the process to create the outputs. (Inputs are often outside the control of the process owner)

Remember the Data Types?

• Qualitative
  • Judgment / Feeling (porridge too hot, job takes too long)
• Quantitative
  • Attribute Data
    • Discrete
    • Binary, Ordinal, Categorical, Individual Count
    • e.g. accuracy, defects, etc.
  • Continuous Data
    • Variables
    • Measured
    • e.g. time, physical measures, etc.

Attribute Data Characteristics

• Use for sorting, primarily
• Less informative than Continuous (Variables) data
• Need large sample sizes to predict capability
• Need to define opportunities for defects to be meaningful
  • Definition of exact Quality Characteristics in a “taste test” (Pepsi vs. Coke)
• Examples: misspelled words on page, mis-loaded containers

Continuous Data Characteristics

• Best type if available or can be gathered
• Most information for a given sample size
• Information on capability and shape of distribution
• Graphical and Statistical Analysis available
• Examples: LTI’s, service cycle times

Figure: I-MR Chart of Cycle Time - Hrs. Individual Value panel: X-bar = 1.935, UCL = 5.763, LCL = -1.893. Moving Range panel: MR-bar = 1.439, UCL = 4.702, LCL = 0. Horizontal axis: Observation (ticks from 1 to 73).
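The limits printed on the I-MR chart are consistent with the usual I-MR formulas for moving ranges of two consecutive points: 1.935 ± 2.66 × 1.439 gives approximately 5.763 and -1.893, and 3.267 × 1.439 gives approximately 4.70. The sketch below applies those formulas in Python. The function name `imr_limits` and the cycle times are hypothetical, chosen only for illustration; they are not the data behind the chart.

```python
from statistics import mean

# Standard I-MR chart constants for moving ranges of span 2:
# 2.66 = 3 / d2 (with d2 = 1.128) and 3.267 = D4.
E2 = 2.66
D4 = 3.267


def imr_limits(values):
    """Return centre lines and control limits for an I-MR chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_bar": x_bar,
        "x_ucl": x_bar + E2 * mr_bar,
        "x_lcl": x_bar - E2 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": D4 * mr_bar,
        "mr_lcl": 0.0,
    }


# Hypothetical cycle times in hours (illustrative only).
cycle_times = [2.1, 1.4, 3.0, 0.9, 2.6, 1.8, 2.2]
for name, value in imr_limits(cycle_times).items():
    print(f"{name}: {value:.3f}")
```

A negative lower limit on the Individuals chart, as in the figure, simply means the limit falls below zero; for a quantity such as cycle time that cannot be negative, it has no practical effect.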


Continuous Types of Measures

• There are two primary types of (Y) output metrics
  • Effectiveness (VoC)
  • Efficiency (VoB)
• A third type of metric could be around Quality
  • Which may be attribute or continuous data

Effectiveness Measures

• Degree to which Customers’ needs/requirements are met or exceeded
  • On-time Delivery
  • Accuracy (e.g. Billing Process)
  • Ease of use
  • Performance
  • Serviceability
  • Price
  • Value

Efficiency Measures

• Amount of Business resources allocated in meeting or exceeding Customer needs/requirements
  • Total Cycle Time
  • Machine Time
  • Processing Time
  • Waiting Time
  • Per Unit Costs
  • Rework Costs
  • Inspection/Audit Costs

How Much Data is Needed?

• It is often impractical to collect all the data from every aspect of your process
  • When there is too much data
  • When too much time is required to sample all the data
  • When measurement is costly
• In these cases data sampling is used
  • Sound conclusions can be made from a relatively small amount of data

Purpose and Advantages of Using Samples

• Sampling refers to the practice of evaluating (inspecting) a portion (sample) of a lot (population) for the purpose of inferring information about the entire lot
• Statistically speaking, the properties of the sample distribution are used to infer the properties of the population distribution
• Sampling makes possible the study of a large population
• Sampling is for economy, speed, and accuracy

Considerations in Data Sampling

Factor       Example
What type    Complaints, Defects, Problems
When         Year, Month, Week, Day, Hour
Where        Region, City, Site, Quadrant
Who          BU, Department, Individual

NOTE: These questions should be answered within the data collection plan

Principles of Data Sampling

Samples must be:
• Representative
• Adequate
• Random

Population: A set which includes all data measurements of interest to the project leader (the collection of all responses or counts that are of interest)

Sample: A subset of the population

Sampling Unit: An individual unit of a sample

Requirements of Data Sampling

• Based on the ‘operational definition’ of the output (Y) and other factors (Xs) to be recorded, determine a sampling plan
• Sampling plan must be:
  • Representative: all occurring conditions, locations and times
  • Adequate: statistically significant conclusions can be drawn about long term and short term performance
  • Random: data gathered free from bias

Requirements of a Representative Sample

• Sample data must represent all segments
  • Physical locations
  • Shifts
  • Days of Week
  • Months
  • Seasons
• Avoid bias
  • Collecting only when convenient (omitting night/weekend shifts)
  • Collecting only from responsive individuals

Requirements of an Adequate Sample

• Sample sizes must be adequate to achieve statistical significance
• Sample size to achieve statistical significance varies with each analytical tool
• Statistical significance may or may not be the same as practical significance
• Larger sample sizes increase confidence
  – refer to guidelines

Several Ways to Ensure Random Samples

• Randomization helps ensure data is representative
• Randomization helps ensure data is free from bias
• Sampling approaches include:
  • Pure Random Sampling (each unit has equal chance)
  • Stratified Sampling (select from different groups/classes)
  • Systematic or Interval Sampling (every 15 minutes, every 4th unit, sweep across a location from left to right, etc.)
  • Cluster Sampling (large geographic areas to deal with)
  • Sub-grouping (sample output of a step or activity with some frequency, usually a time increment, e.g. pull 5 samples at 10 a.m., 12 p.m., 2 p.m., and 4 p.m.)
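Three of the sampling approaches listed here (pure random, systematic, and stratified) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names, the population of 60 tagged transactions, and the fixed seed are all assumptions made for the example.

```python
import random


def simple_random_sample(units, n, rng):
    """Pure random sampling: every unit has an equal chance."""
    return rng.sample(units, n)


def systematic_sample(units, interval, rng):
    """Systematic sampling: random start, then every `interval`-th unit."""
    start = rng.randrange(interval)
    return units[start::interval]


def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Stratified sampling: a random sample drawn within each group."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical population: 60 transactions, each tagged with a shift.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [(i, "day" if i % 3 else "night") for i in range(60)]

print(simple_random_sample(population, 5, rng))
print(systematic_sample(population, 15, rng))
print(stratified_sample(population, lambda u: u[1], 3, rng))
```

The stratified version guarantees that the smaller "night" group is sampled, which a pure random draw of five units might miss entirely.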


Dealing with Differences – ‘Within, Between’

• Random samples are ideally selected from a “homogeneous group” or “lot”, but the group is sometimes not homogeneous because different machines, people, locations, or shifts are involved
• With stratified sampling, random samples are drawn from each “group” of processes that are different
• Stratify data collection efforts by:
  • Shift / Time of Day
  • Item or Type of Service
  • Location (Gate, Yard)
  • Equipment utilized (Top Pick vs. Strad)

Sampling Must Catch The Variation

Sudden Change

Hugging or Bunching

Cycling

Trending


Patterns of Sample Data

• When we collect sample output data (Y), we want to know not only what its patterns are, but also what other factors are related to the pattern in Y
• Two concepts are used to describe the factors related to these patterns:
  • Segmentation – external factors
  • Stratification – internal and process measures
• If easy, gather other process data as well:
  • shift, time, operator, part number, material type, etc.

Examine External Factors through Segmentation

• Used to identify differences between different factors or processes
  • Components from different suppliers
  • Example below (l to r): in-person, fax, telephone, on-line
• Segmentation may point out major drivers of defects or correlation to the output Y

Stratification for Internal and Process Measures

• Stratification is a data analysis technique by which project Y data is sorted according to relevant subgroups called levels or strata
• Understanding differences between levels may lead to the root cause, which in turn leads to improvement of the process/project

Data Collection Plan - Stratification Example

Project Y’s (what the team must change to influence Customer Satisfaction):
• Loan Application Cycle Time

Stratified X’s (what the team must study and/or change to influence Project Y’s):
• Location: Phoenix, Houston, New Orleans, Jacksonville
• Size of Loan: Small, Medium, Big

This strategy tries to identify variation within locations and, perhaps, whether there is a relationship between loan size and location.
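Once stratified data have been collected, the comparison described here amounts to grouping the Project Y values by each stratum and summarizing them. The sketch below shows one way to do that in Python. The records are invented for illustration (the slides give no cycle-time values), and `mean_by_strata` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (location, loan size, cycle time in days).
records = [
    ("Phoenix", "Small", 4.0), ("Phoenix", "Big", 9.0),
    ("Houston", "Small", 5.0), ("Houston", "Big", 8.0),
    ("New Orleans", "Medium", 7.0), ("Jacksonville", "Medium", 6.0),
    ("Phoenix", "Small", 6.0), ("Houston", "Big", 10.0),
]


def mean_by_strata(rows, key):
    """Group cycle times by a stratum key and return the mean per level."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {level: mean(times) for level, times in groups.items()}


by_location = mean_by_strata(records, key=lambda r: r[0])
by_size = mean_by_strata(records, key=lambda r: r[1])
# Crossing both strata hints at a location-by-size relationship.
by_both = mean_by_strata(records, key=lambda r: (r[0], r[1]))

for label, table in [("location", by_location), ("size", by_size), ("both", by_both)]:
    print(label, table)
```

With only a handful of points per cell, differences between strata are suggestions to investigate rather than conclusions.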


How Much Data Do I Need?

• The amount of data required depends greatly on:
  • The process you’re collecting it from
  • What you’re trying to represent
  • The difference you’re trying to detect
  • Your confidence levels (discussed later)
• As a general rule of thumb it should be enough to cover the sampling aspect we discussed
• More is always better…

Sample Sizes for Data Displays

Rules of Thumb

Tool or Statistic           Minimum Sample Size
Mean                        10 - 15
Standard Deviation          20
Proportion Defective (P)    30
Histogram or Pareto         25 - 50
Control Chart               20 - 30
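As a quick check against these rules of thumb, the table can be encoded as a lookup that reports which tools a given number of data points supports. The sketch below uses the lower bound of each range; the dictionary and function names are hypothetical.

```python
# Minimum sample sizes taken from the rules-of-thumb table
# (lower bound of each range).
MIN_SAMPLE_SIZE = {
    "mean": 10,
    "standard deviation": 20,
    "proportion defective": 30,
    "histogram or pareto": 25,
    "control chart": 20,
}


def tools_supported(n):
    """Return the tools whose rule-of-thumb minimum is met by n points."""
    return [tool for tool, minimum in MIN_SAMPLE_SIZE.items() if n >= minimum]


for n in (5, 15, 30):
    print(n, tools_supported(n))
```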


But I Only Have 5 Data Points?

• In some situations you may not have enough data to draw an adequate sample
• If you run into this situation it’s important to understand:
  • How to deal with these small sample sizes
  • How they affect your ability to use statistics

Risks With Small Samples – Margin of Error

• It is all about precision, tolerance for risk and cost.

• For samples smaller than 1000, we always have to think about how confident we want to be that estimates are within a particular range (level of confidence and risk), and how small we want that range to be (level of precision). Unfortunately, they go in opposite directions. Higher levels of confidence require greater ranges (margins of error) in small sample sizes.
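The trade-off between confidence and precision can be made concrete with the normal-approximation margin of error for a proportion, z × sqrt(p(1 - p) / n). The sketch below is an illustration under that assumption. The slides do not specify a formula, and for very small samples an exact or t-based interval would be more appropriate. The function name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p, n, confidence):
    """Normal-approximation margin of error for a proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


# p = 0.5 is the most conservative (widest) assumption.
for n in (30, 100, 1000):
    for confidence in (0.90, 0.95, 0.99):
        moe = margin_of_error(0.5, n, confidence)
        print(f"n={n:5d}  confidence={confidence:.0%}  margin=±{moe:.1%}")
```

For a fixed sample size the margin widens as the confidence level rises, and for a fixed confidence level it narrows only with the square root of the sample size.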


Dealing With Small Samples

• Expected effects may not be fully accurate, so be upfront about the limitations and document your sampling strategies, decisions, and criteria
• See it as an opportunity to keep evaluation costs low, recognizing that a large study without sufficient resources can yield under-powered results
• Hesitate to report percentages, or do not report them at all; report fractions instead, as percentages can be misleading and may overstate results

Questions to Consider

• What type of data analysis will be conducted?
• Will subgroups be compared?
• What is the probability of the event occurring?
• How much error is tolerable (confidence interval)?
• How much precision do we need?
• How confident do we need to be that the true population value falls within the confidence interval?
• What is the budget? Can we afford the desired sample?
• What is the population size? Large? Small/Finite?
  • If unknown, assume it to be large (>100,000)
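Several of these questions (tolerable error, confidence, event probability, population size) are the inputs to a standard sample-size formula for estimating a proportion: n0 = z² p(1 - p) / e², with a finite population correction n = n0 / (1 + (n0 - 1) / N) when the population is small. The slides do not name a formula, so the sketch below is one common choice rather than the deck's own method. The function and parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5, population=None):
    """Sample size to estimate a proportion within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2, then applies the finite
    population correction when a population size is given.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return ceil(n0)


# Unknown (assumed large) population versus a small, finite one.
print(sample_size_for_proportion(0.05))
print(sample_size_for_proportion(0.05, population=500))
```

Leaving `population` unset corresponds to the "assume it to be large" guidance; supplying a small population reduces the required sample.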


Before Collecting Data

• One task remains before collecting data
• Validation of the Measurement System…
  • Systems Audit – Data Validation
  • Measurement System Analysis (MSA)
  • Gage calibration
  • Gage repeatability and reproducibility (GR&R) for Variables Data
• MSA must be completed before Data Collection!

Validation helps understand total variation attributed to operator methods, bias in effort, gage discrepancies …

Data Collection Plan Example

Customize Your Form and Format – Each process and project has unique requirements


Collecting and Recording Data

3 Elements:
• A procedure
  • What will be measured (Y and X’s)
  • What segmentation or stratification will be recorded
  • Sampling Plan (what, where, when, how much)
  • Who will record and with what instrument
  • Measurement System Validation method
• A checklist
  • Assure all factors, segments, and strata are included
• A form
  • Collect data

Data Collection ‘Procedure’ Example

What data are you going to need? What are you going to do with it?


Data Collection ‘Checklist’ Example

Develop Your Form and Format – Each process and project has unique requirements


Data Collection ‘Form’ Example

Develop Your Form and Format – Each process and project has unique requirements


Review

• Review Data Collection Plan Definition and Goals
• Discuss Sampling Principles
• Identify Key Elements of a Data Collection Plan

Exercise

• Objectives:
  • Collect a data sample
  • Calculate the sample mean and standard deviation of the total distribution
  • Plot a histogram of the data total distribution
  • Test the data for normality
  • Do some data mining
• Procedure:
  • Set up ‘helicopter’ and keep all conditions fixed
    • Except: Wing Length & Body Width
    • Change Those Randomly
  • Record flight time values in Minitab for 30 launches
  • Perform appropriate analysis
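The exercise is meant to be run in Minitab, but the same analysis steps can be sketched in Python for readers without it. The example below computes the sample mean and standard deviation, prints a text histogram, and applies a Jarque-Bera test as a stand-in normality check; this is a swapped-in technique, not necessarily the test Minitab would report. The flight times are simulated placeholders rather than real launch data, and the asymptotic p-value is only rough at n = 30.

```python
import random
from math import exp
from statistics import mean, stdev


def jarque_bera(data):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Chi-square with 2 degrees of freedom has survival function exp(-x / 2).
    return jb, exp(-jb / 2)


def text_histogram(data, bins=6):
    """Return one text line per equal-width bin, with '*' per observation."""
    low, high = min(data), max(data)
    width = (high - low) / bins or 1.0  # avoid zero width if all values match
    counts = [0] * bins
    for x in data:
        counts[min(int((x - low) / width), bins - 1)] += 1
    return [f"{low + i * width:6.2f}  {'*' * c}" for i, c in enumerate(counts)]


# Simulated stand-in for 30 recorded flight times (hypothetical, not real data).
rng = random.Random(1)
flight_times = [rng.gauss(2.0, 0.3) for _ in range(30)]

print(f"mean = {mean(flight_times):.3f}, std dev = {stdev(flight_times):.3f}")
print("\n".join(text_histogram(flight_times)))
jb, p_value = jarque_bera(flight_times)
print(f"Jarque-Bera = {jb:.3f}, p = {p_value:.3f}")
```

A small p-value would suggest the flight times depart from a normal distribution, which is plausible in the exercise because wing length and body width are changed between launches.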
