Outlier Management

Upload: siddhartha-harit

Post on 07-Feb-2017

BASIC STATISTICS

Discussing Quality Control implies the use of several terms and concepts with a specific (and sometimes confusing) meaning.

Some of the most important concepts are:

Error:- Any analytical result that deviates from the true value is an error.

Or: any data point that does not conform to the customer's requirement, or any product that deviates from the actual requirement, is a defect.

Accuracy:- The "trueness", or the closeness of the analytical result to the "true" value.

Precision:- The closeness with which the results of replicate analyses of a sample agree. It is a measure of dispersion or scattering around the mean value and is usually expressed as a standard deviation, a standard error or a range (the difference between the highest and the lowest result).

Bias:- The consistent deviation of analytical results from the "true" value caused by systematic errors in a procedure. Bias is the inverse of "trueness", yet it is the most commonly used measure of it. Trueness is the agreement of the mean of analytical results with the true value, i.e. excluding the contribution of randomness represented in precision.
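As a rough illustration of the difference between bias and precision, the sketch below computes both for one set of replicate results. The replicate values and the "true" value of 10.0 are hypothetical, chosen only for the example, and the function name bias_and_precision is not from the slides.

```python
from statistics import mean, stdev

def bias_and_precision(results, true_value):
    """Return (bias, precision) for replicate results of a single sample."""
    avg = mean(results)
    bias = avg - true_value     # systematic deviation of the mean from the true value
    precision = stdev(results)  # sample standard deviation: scatter around the mean
    return bias, precision

# Hypothetical replicate results; the true value is assumed to be 10.0
replicates = [10.2, 10.4, 10.3, 10.1, 10.5]
bias, precision = bias_and_precision(replicates, true_value=10.0)
print(f"bias = {bias:.2f}, precision (s) = {precision:.2f}")
```

In this made-up case the results agree closely with each other (good precision) but all sit above the true value, which shows up as a positive bias.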

Mean:- The sample mean is the average. It is computed as the sum of all the observed values in the sample divided by the number of observations. We use x̄ (x-bar) as the symbol for the sample mean.

Mean = Sum of all the set elements / Number of elements, i.e. x̄ = (x1 + x2 + ... + xn) / n, where n is the sample size and x1, ..., xn are the observed values.

Mode:- The mode of a set of data is the value with the highest frequency.

The mode of the data set S = 1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7 is 3.

Median:- The median is the middle value of a sorted data set. The median of the data set 1, 2, 3, 4, 10000 is 3.

Variance:- A measure of how far a set of data deviates from its mean value. The variance is obtained by: finding the difference between the mean value and each value in the set; squaring those differences; adding the squared differences; and dividing that sum by the number of values (or by one less than the number of values when the data are a sample).

Standard Deviation:- The Standard Deviation is a measure of how spread out numbers are. It is the square root of the Variance.
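The definitions above can be checked with a short sketch using Python's standard statistics module. It reuses the two data sets from the Mode and Median examples; the variable names are only illustrative. pvariance and pstdev divide by n (population), while variance divides by n - 1 (sample).

```python
from statistics import mean, median, mode, pvariance, pstdev, variance

data = [1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7]  # data set S from the Mode example
skewed = [1, 2, 3, 4, 10000]                        # data set from the Median example

data_mean = mean(data)          # sum of the values / number of values
data_mode = mode(data)          # most frequent value
skewed_median = median(skewed)  # middle value of the sorted data
skewed_mean = mean(skewed)      # pulled far upward by the single value 10000

pop_var = pvariance(data)       # mean of squared deviations (divide by n)
samp_var = variance(data)       # sum of squared deviations / (n - 1)
pop_sd = pstdev(data)           # square root of the population variance

print(data_mean, data_mode, skewed_median, skewed_mean)
print(pop_var, samp_var, pop_sd)
```

The second data set shows why the median is often preferred when extreme values are present: one very large value moves the mean a long way but leaves the median unchanged.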

Outliers

In a set of numbers, a number that is much larger or much smaller than the rest of the numbers is called an Outlier.

To find any outliers in a set of data, we first need to find the 5 Number Summary of the data.

Find the 5 Number Summary of the following numbers:

3 12 7 40 9 14 18 15 17

Step 1: Sort the numbers from lowest to highest.

3 7 9 12 14 15 17 18 40

Step 2: Identify the Median (the middle value): 14.

Step 3: Identify the Smallest and Largest numbers: 3 and 40.

Step 4: Identify the Median between the smallest number and the Median for the entire set of data (9), and the Median between that Median and the largest number in the set (17).

These are the five numbers in the 5 Number Summary: 3, 9, 14, 17, 40.
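The four steps can be sketched in Python as follows. This is a minimal sketch, not a standard library routine: the function name five_number_summary is made up for the example, and it assumes a non-empty list of numbers. It follows the slide's rule that the lower and upper halves both include the overall median when the number of values is odd; other quartile conventions can give slightly different results.

```python
from statistics import median

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a non-empty list of numbers."""
    data = sorted(values)              # Step 1: sort from lowest to highest
    n = len(data)
    half = (n + 1) // 2                # each half shares the median when n is odd
    lower = data[:half]                # smallest number up to the median
    upper = data[n - half:]            # median up to the largest number
    return (
        data[0],                       # Step 3: smallest
        median(lower),                 # Step 4: median of the lower half (Q1)
        median(data),                  # Step 2: median of the whole set
        median(upper),                 # Step 4: median of the upper half (Q3)
        data[-1],                      # Step 3: largest
    )

print(five_number_summary([3, 12, 7, 40, 9, 14, 18, 15, 17]))
```

For the nine numbers in the example this gives 3, 9, 14, 17 and 40.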

A 5 Number Summary divides your data into four quarters.

3 7 9 12 14 15 17 18 40

1st Quarter: 3 to 9
2nd Quarter: 9 to 14
3rd Quarter: 14 to 17
4th Quarter: 17 to 40

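The distance between the first quartile (Q1) and the third quartile (Q3) is called the Inter-Quartile Range (IQR): IQR = Q3 - Q1

IQR = 17 - 9 = 8

To determine if a number is an outlier, multiply the IQR by 1.5: 8 * 1.5 = 12

An outlier is any number that is more than 12 below Q1 (less than 9 - 12 = -3) or more than 12 above Q3 (greater than 17 + 12 = 29).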

A quartile is a type of quantile. The first quartile (Q1) is defined as the middle number between the smallest number and the median of the data set. The second quartile (Q2) is the median of the data. The third quartile (Q3) is the middle value between the median and the highest value of the data set.

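3 7 9 12 14 15 17 18 40

IQR = 8

Lower limit: Q1 - 12 = 9 - 12 = -3
Upper limit: Q3 + 12 = 17 + 12 = 29

40 is greater than the upper limit, so 40 is an OUTLIER.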
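The 1.5 * IQR rule used in this example can be sketched as below. The function name iqr_outliers and the parameter k are illustrative, not from the slides. The quartiles are computed as the medians of the lower and upper halves (both including the overall median when the count is odd), matching the quartile definition given earlier, and the input is assumed to be non-empty.

```python
from statistics import median

def iqr_outliers(values, k=1.5):
    """Return (lower_limit, upper_limit, outliers) using the k * IQR rule."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2                # halves share the median when n is odd
    q1 = median(data[:half])           # first quartile
    q3 = median(data[n - half:])       # third quartile
    iqr = q3 - q1                      # inter-quartile range
    lower_limit = q1 - k * iqr
    upper_limit = q3 + k * iqr
    # Anything beyond either limit is flagged as an outlier
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return lower_limit, upper_limit, outliers

low, high, found = iqr_outliers([3, 12, 7, 40, 9, 14, 18, 15, 17])
print(low, high, found)
```

For the example data the limits come out as 9 - 12 = -3 and 17 + 12 = 29, so 40 is the only value flagged.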

Outlier Management

Outlier Management is the identification and treatment of outliers.

Outliers are individuals or observations that are statistically different from the group they are being compared to. Outliers can be “good” or “bad”. Management of “good” outliers can help businesses improve performance and maximize profits. Management of “bad” outliers can help businesses improve performance and minimize risks. Effective Outlier Management programs include the following components:
