Dealing with outliers is a difficult task in the spinning industry. Many questions arise when we plan to deal with outliers. This article tries to answer these questions.
Dealing with outliers 2012
Produced By Mr. Sunil Kumar Sharma,
Published in Spinning Textiles Magazine, Vol.7, Issue-2, March – April 2013 Edition Page 1
Q.1. What is an outlier and how is it defined?
Ans. An outlier can be understood through the following definitions : -
Outlier is a scientific term to describe things or phenomena that lie outside
normal experience.
In statistics, an outlier is an observation that is numerically distant from the rest
of the data.
An outlying observation, or outlier, is one that appears to deviate markedly from
other members of the sample in which it occurs.
An extreme deviation from the mean.
An outlier is an observation that lies an abnormal distance from other values in a
random sample from a population.
From the above definitions we can understand that an outlier does not
belong to the normal population and differs from the other members or
readings of a distribution. In a sense, these definitions leave it up to the analyst (or a
consensus process) to decide what will be considered abnormal. Before abnormal
observations can be singled out, it is necessary to characterize normal observations.
Q. 2. Why should outliers be detected and removed from the process?
Ans. Outliers arise due to changes in system behaviour, fraudulent behaviour,
human error, instrument error or simply through natural deviations in populations. A
sample may have been contaminated with elements from outside the population being
examined. Outlier detection is used to detect and, where appropriate, remove
anomalous observations from data. Outlier detection can identify system faults before
they escalate with potentially catastrophic consequences. Outliers should be
investigated carefully. Often they contain valuable information about the process
under investigation or the data gathering and recording process. Before considering
the possible elimination of these points from the data, one should try to understand
why they appeared and whether it is likely similar values will continue to appear. Of
course, outliers are often bad data points. An outlier is an abnormal reading which is
significantly different from most of the population of a normal distribution. This
significantly different characteristic of an outlier creates major variation in the process or in
the ultimate product characteristics. In the spinning process many factors create
variation. These variations are considered spinning abnormalities and
cause defects in yarn or fabric. One defective yarn package may spoil a thousand
metres of fabric. A defect in the spinning preparatory process may disturb the
working of the whole spinning mill. Hence it is better to identify & remove such
abnormalities in the early stages, before they create problems. Systematic outlier detection,
detailed analysis up to the root cause and finally elimination of the origin of the outlier
significantly reduce variation downstream and give better yarn & fabric quality.
Q. 3. How to identify or calculate outliers?
Ans. : There is no rigid mathematical definition of what constitutes an outlier;
determining whether or not an observation is an outlier is ultimately a subjective
exercise. There are several methods for detecting outliers; however, in spinning
mills the commonly used method for identifying & detecting outliers is based on the
mean and standard deviation. Hence we explain only this method here.
A standard deviation is a measuring stick used to describe how data are dispersed
around their average. In a normal distribution, which takes the shape of a “bell
curve”, the range within one standard deviation of the mean encompasses about 68.27 % of all
observations, represented with dark blue colour in Fig. 1. Two standard deviations include about
95.45 % of all observations, represented with dark blue & medium blue colours. Three
standard deviations encompass nearly all values, i.e. 99.73 % of all observations,
represented with dark blue, medium blue & light blue colours. A graphical representation
of a normal distribution is shown below in Fig. 1 : -
Fig. 1 : Graphical Representation of a Normal Distribution with different Standard deviation level.
Where x is an observation from a normally distributed random variable, μ is the mean of the
distribution, and σ is its standard deviation.
Thus readings outside the two sigma (i.e. 2 S.D.) or three sigma (i.e. 3 S.D.) limits may
be considered as outliers, depending upon the no. of occurrences outside the range and the
total number of readings observed.
Chart : 1 : % Population & expected frequency of outliers for different standard deviation range.
Range         % Population in range    Expected frequency outside range
μ ± 1σ        68.27                    1 in 3
μ ± 1.5σ      86.64                    1 in 7
μ ± 2σ        95.45                    1 in 22
μ ± 2.5σ      98.76                    1 in 81
μ ± 3σ        99.73                    1 in 370
It is clear from the above table that the minimum no. of observations for identifying outliers
should be 22 in case of the two sigma limit (i.e. 2 SD) & 370 for the three sigma (i.e. 3 SD)
limit. Hence for a spinning mill two sigma limits are more appropriate & practical for
outlier detection instead of the three sigma limit. Two sigma limits give us the opportunity to
review & analyse approx. 4.55 % of the total readings for further
improvement.
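The percentages and "1 in N" frequencies of Chart 1 can be reproduced from the normal distribution. The short Python sketch below is an illustration only: it assumes the readings follow a normal distribution, and the function names `coverage` and `one_in` are our own and do not come from any testing instrument.

```python
import math

def coverage(k):
    """Fraction of a normal population lying within mean ± k standard deviations."""
    return math.erf(k / math.sqrt(2))

def one_in(k):
    """Expected frequency outside mean ± k SD, expressed as '1 in N' (N rounded)."""
    return round(1 / (1 - coverage(k)))

# Print one line per range listed in Chart 1.
for k in (1, 1.5, 2, 2.5, 3):
    print(f"mean ± {k} SD : {coverage(k) * 100:.2f} % in range, 1 in {one_in(k)} outside")
```

For the 2 SD range this gives 95.45 % in range and 1 in 22 outside, which is where the minimum of 22 readings comes from.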
Q. 4 : In spinning quality control, how can we utilize these principles and
implement the 2 σ theory for detection of outliers?
Ans. : - In spinning quality control we generate a lot of data during daily testing of
in-process & finished material. For outlier detection through two sigma analysis we
require a minimum of 22 readings; hence this analysis is practicable & beneficial for the
speed frame & ring frame sections, where we obtain the maximum number of test readings on a daily
basis through spindle wise testing. The following critical parameters & test results may be
analyzed for outlier detection with two sigma limits : -
1. Spindle wise roving hank measurement.
2. Spindle wise roving U %.
3. Spindle wise count measurement of ring frame.
4. Spindle wise U %, imperfection level & hairiness index of ring frame.
5. Spindle wise single yarn strength & elongation %.
6. Any other report may be analyzed for outliers, provided the no. of
readings is 22 or more.
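The two sigma check described above can be sketched in a few lines of Python. This is a minimal sketch and not the procedure of any particular instrument: the function name `two_sigma_outliers` is hypothetical, the sample standard deviation (n - 1 divisor) is an assumption, and the readings are invented for illustration.

```python
from statistics import mean, stdev

MIN_READINGS = 22  # minimum no. of readings the article requires for 2 sigma analysis

def two_sigma_outliers(readings, k=2.0):
    """Return (lower, upper, flagged); flagged holds (test no., value) outside mean ± k*SD."""
    if len(readings) < MIN_READINGS:
        raise ValueError(f"need at least {MIN_READINGS} readings, got {len(readings)}")
    avg = mean(readings)
    sd = stdev(readings)  # assumption: sample standard deviation (n - 1 divisor)
    lower, upper = avg - k * sd, avg + k * sd
    flagged = [(no, x) for no, x in enumerate(readings, start=1) if x < lower or x > upper]
    return lower, upper, flagged

# Hypothetical spindle wise count readings, invented for illustration only.
readings = [30.0] * 29 + [32.0]
lower, upper, flagged = two_sigma_outliers(readings)
print(f"limits {lower:.2f} to {upper:.2f}, outliers: {flagged}")
```

Test numbers are counted from 1 so that a flagged reading can be traced back to its frame and spindle in the test report.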
Chart – 2 : A reference UT-5 test report for outlier detection through 2 sigma limit.
Methodology : - Chart – 2 illustrates a reference UT-5 test report in which outliers
are identified for U % & Hairiness index by applying two sigma limits. In the given
report a total of 30 samples were tested from identified ring frame spindles. Frame
no. & spindle numbers are mentioned in the first column of the test report.
1. Detection of outliers for U % : -
The average U % of the 30 readings is 8.88 & the standard deviation is 0.20.
Hence the 2 σ limit will be 0.20 x 2 = 0.40, i.e. 8.88 ± 0.40 = 8.48 to 9.28.
Among the U % readings, test no.- 29 lies beyond this
limit. It belongs to RF No.- 14 RHS, Spdl No.- 612, and is an outlier
reading for U %. It is highlighted with yellow colour in the test report.
2. Detection of outliers for Hairiness Index i.e. H : -
The average of all readings for H is 5.09 & the standard deviation is 0.16.
Hence the 2 σ limit will be 0.16 x 2 = 0.32, i.e. 5.09 ± 0.32 = 4.77 to 5.41.
Test No.-2 lies beyond this range. It belongs to RF No.-13 RHS, Spdl.
No.-108, and is an outlier reading for hairiness index. It is highlighted in orange
colour in the test report.
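When the instrument already reports the average and standard deviation, the limits above need only one line of arithmetic. The Python sketch below repeats the two calculations with the figures quoted from the report; the helper name `sigma_limits` is our own, and rounding to two decimals is an assumption made to match the report.

```python
def sigma_limits(average, sd, k=2):
    """Lower and upper limits average ± k*sd, rounded to two decimals."""
    return round(average - k * sd, 2), round(average + k * sd, 2)

# Average and standard deviation as quoted from the reference UT-5 report.
print("U % limits:", sigma_limits(8.88, 0.20))
print("H limits  :", sigma_limits(5.09, 0.16))
```

Any reading below the lower limit or above the upper limit is then treated as an outlier for that parameter.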
Fig. 2 : - Graphical representation of outliers for U % (U % plotted against No. of Readings, with the outlier marked).
Fig. 3 : - Graphical representation of outliers for Hairiness Index (Hairiness Index plotted against No. of Readings, with the outlier marked).
Conclusion : - An outlier is an observation distinct from the mean value, which
shows different characteristics from most other members of a normal
distribution. Hence these observations should be detected, removed, analyzed in
detail up to the root cause of the abnormality, and corrected. A defect in the spinning
preparatory process may disturb the working of the whole spinning mill, and defects in the
spinning process may cause huge losses in downstream processes; hence detecting the
prominent outliers at the spinning stage itself and eliminating their root cause will
significantly reduce rejection in the next processes. There are several methods for
detection of outliers, but for the spinning process outlier detection through the two sigma
limit is the more practicable and easy method. With two sigma limits, approx. 4.55 % of
observations are expected to fall outside the limits when at least 22 readings are available,
which provides the opportunity to analyze & correct these observations and
reduce the variation. However, if the no. of readings is more than 370 and there is huge
variation in the process, then the 3 σ limit may be applied, which covers 99.73 % of
readings; approx. 0.27 % of readings will be outside the limit and will be
considered outliers. The methodology of outlier detection through 2 σ or 3 σ is very
simple, as nowadays most reports are generated through PC based instruments
which themselves provide the standard deviation.
Produced by : Mr. Sunil Kumar Sharma, Manager – QAD,
Mobile No. : – 09552596742, 09921417107
E_mail : - sunil_ku67@yahoo.com
Loknayak Jayprakash Narayan Shetkari Sahakari Soot Girni Ltd.
Kamalnagar, Untawad – Hol, Shahada,
Tal. : - Shahada, Dist. : - Nandurbar (MS)
Pin : - 425409