SlideShare uma empresa Scribd logo
1 de 8
Baixar para ler offline
IOSR Journal of Mathematics (IOSR-JM)
e-ISSN: 2278-5728, p-ISSN: 2319-765X. Volume 11, Issue 6 Ver. V (Nov. - Dec. 2015), PP 60-67
www.iosrjournals.org
DOI: 10.9790/5728-11656067 www.iosrjournals.org 60 | Page
Identification of Outliersin Time Series Data via
Simulation Study
1
ADEWALE Asiata. O,2
Nurul Sima MOHAMAD SHARIFF
1,2,
Facultyof Science&TechnologyUniversitySainsIslamMalaysia (USIM)Bandar Baru Nilai, 71800 Nilai
Negeri SembilanMalaysia
Abstract:This paper compares the performance of regression diagnostics techniques based on
Ordinary LeastSquares (OLS) estimators and four types of robust regression based on robust estimators
todetect and identify outliers. It is known that OLS is not robust in the presence of multiple high leverage
points.Thus severalrobust regression models are used as alternative and its approach is more reliable and
appropriate method for solving this problem. The comparisonsaremadeviasimulationstudies.Our
resultshaveshownthatinsomecases diagnostics basedonthe OLSandsome
robustestimatorsgivesimilaroutcomes,they detectthesame percentage of correct outlier detection. And under
small sample size OLS and M-estimation perform best for innovative outliers. The results also shows that Least
Trimmed Square is the best among all its counterparts under large sample size.
Keywords: Outliers, Ordinary Least Squares (OLS), Regression diagnostics, Robust regression,
Simulation Studies.
I. Introduction
Outliers are usually encountered in time series data analysis. The presence of outliers in time series
analysis can seriously has negative impact in the analysis because they may stimulate substantial biasness in
parameter estimation, model misspecification and incorrect inference, [1]. Outliers has been defined by Abd.
Mutalib and Ibrahim [2] as data points or observations that deviate distinctly from other observations or data
points which are abnormally large or small from the other observations. The relevancies of outlier detection and
identification in time series have been used in fraud detection, financial institute, public health and
Telecommunication Company. According to Lopez-de-Lacalle[3], there are five types of outliers that are
commonly found, namely, innovation outlier (IO), additive outlier (AO), level shift (LS), temporary change
(TC)and seasonal level shift (SLS).
The most popular way to analyse time series regression model is to use Ordinary Least Square (OLS)
method. It is the best technique if all the statistical assumptions are valid but when the data or the series are
contaminated with outliers, these statistical assumptions are invalid. There arise the needs of regression
diagnostics tools or techniques to detect and identify the outliers or influential observations. There are many
types of regression diagnostics tools in the literature, among them are: The welsch-kuh distances, Cook’s
Distance and Hadis influence Measure. However, these methods are not robust in presence of multiple high
leverage point, which can cause masking and swamping effects [4]. According to Widodo et al [5], robust
regression approach is more reliable and appropriate method for solving this problem. The robust estimators are
relatively unaltered by large changes in a small series of data and also small changes in a large part of the series.
Yafee [6] discusses that there are several kinds of robust estimators in the literature among which are Least
Absolute Deviations (LAD or L1), Least Median Squares (LMS), Least Trimmed Squares (LTS) and Huber M-
Estimation. These robust estimators will be used in this research work by S plus statistical software through
simulation study. Their performance will be compared to one another and the best technique among them will
be figured determined.
II. Materials and Method
OLS Estimation
Consider the time series model of simple autoregressive, AR(p)
tptptt yyy    ...11 (1)
The time series model of simple autoregressive, AR(p) as the sameform as (1), can be written as
,...210
  pii
xy ni ,...,2,1 (1)
Model (2) can be written in matrix form as,
  XY (3)
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 61 | Page
WhereY= vector of nx1 response,Xis the n x p matrix of explanatory variables,  is the vector of parameter
(regression coefficient),  is the random error distributed as normal distribution with mean zero and
2
 . Here
n is number of observations and p is number of regressors.
The final estimates of  is then,
YXXX TT 1
)(ˆ 
 (4)
The residual isobtained as follows:
yHYXXXXYXYYYe TT
)1()(ˆˆ 1
 
 (5)
Where
TT
XXXXH 1
)( 
 is leverage / hat matrix.
The i-th elements of H is,
T
i
T
iii xXXxh 1
)( 
 (6)
Regression Diagnostics Methods
There is numerous regression diagnostics methods used to identify outliers. However, this study will only
consider five methods that are listed below.
Hadi’s influence measure
,
111 2
2
2
i
i
iiii
ii
i
d
d
h
p
h
h
H



 ni ,...,3,2,1 (7)
Where
SSE
e
d i
i 2
the normalized, SSE is sum squares residual residuals, It identify influential observations.
The cut-off point for
2
iH measure is  )var()( 22
ii HCHmean  C is constant which take the value of 2.
The welsch-kuh distances
,
)1(ˆ
*
)( iii
iii
h
he
DFFits



ni ,...,3,2,1 (8)
has the cuff-off value of
n
p
*2
Cook’s Distance
,
)1)(1(
*2
ii
iii
i
hp
hr
D

 ni ,...,3,2,1 (9)
Where
i
i
i
h
e
r


1ˆ
, Any observation that exceeds the cut-off of
pn 
4
is considered as influential
observation.
Robust Regression
Robust regression methods are designed not to be wholly affected by violations of assumptions by the core data
generating process. A robust regression is performed on a high breakdown point and high efficiency regression,
[9].
Huber M-Estimation
Huber-M estimation uses Huber weight function as its weight function. The Huber M-Estimator scale estimate
of mˆ and the Huber M-Estimator error me are usedinstead ofˆ , and ie which are based on OLS method.
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 62 | Page
Least absolute deviations (LAD or L1)
LAD obtains a higher effectiveness than OLS through minimizing the sum of the absolute errors, [7]. It scale
estimate denoted by 1
ˆL and its error 1Le are usedinstead ofˆ , and ie which are based on OLS.
Least median squares (LMS)
LMS is a robust estimator that has been hypothetically has breakdown point of 50%, [8]. This means the LMS
provides reliable outcomes even if up to 50% of contaminated data or series exist. It has the characteristics of
solving the liner model by minimising the median of the weighted squared. LMS scale estimate denoted by
LMSˆ and its error LMSe are usedinstead ofˆ , and ie which are based on OLS.
Least trimmed squares (LTS)
This robust regression techniques minimizes the sum of squared residuals over n observations and subset k of
those observations, thus the n-k observations which are excluded does not has effect on the fit.The LAD scale
estimate is denoted by LADˆ and its error LADe are usedinstead ofˆ , and ie which are based on OLS.
Note that their cuff-value remains the same as the OLS.
Incorporating outliers into the series.
From the original series, model1:
)( ttt Iyq  , the series is contaminated with outliers.
 is the magnitude of the outliers, is the time that the outliers occur and )(tI is a dummy variable which has
zero value at all lags except when time Tt 



 




t
t
It
,0
,1
tatoccursioncontaminatwhen , 1tI otherwise 0.
III. Result and Discussion
The sample size used is n=30 & 200, parameter is set to be 7.0 , 5 , standard deviation 1 ,300
replications for n=30 and 500 replications for n = 200. To assess the power of the procedure, the following case
will be considered;
i. Single outlier of AO / IO
ii. Multiple outliers AOs / IOs
iii. Both multiple outliers AOs and IOs
Fifthteen measures will be run for each cases,
Under n=30, the location for single outlier is set to be 12 and for the multiple outlier, the location is set to
be k1 =12, and k= 20. Forn=200, the location for single outlier is set to be 26 , and multiple outliers; k1
=26, k2 =62 and k3 =99.
Table 1. A simulation study on the power of the outlier detection in regression diagnostics tools based on
ols.
12 12 (n=30,ai=0.7,nsimul=300,p=1,k1=12,k2=20)
Case Single
AO
Single
IO
2 AOs 2IOs Both AO and IO
1st
outlier
2nd
outlier
1st
outlier
2nd
outlier
1st
outlier
(IO)
2nd
outlier
(AO)
Ordinary Least Square (OLS)
2
iH 94.6 99.3 80 80 86.3 87.3 99.3 0.3
Dffits 87.7 99.7 70.7 79 97.7 98 99.6 0.3
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 63 | Page
Di 82.7 99 54 61 92.7 93 99 0.3
Summary of the outliers detection performance. Note that the numbers are in percentage.
Table 1. Present the result of the performance of 300 replications on outlier-detection method for each single
and multiple outlier specifications using the cut-off C value of each classical diagnostic techniques as stated
earlier respectively. The numbers and percentage of the correct detection and identification are given under each
location of the outliers. Under multiple outlier and “Both AO and IO”, the label k1 and k2, tells us the location
of 1st outlier, outlier and 2nd outlier respectively, and  for single outlier.
The result shows that OLS based diagnostic technique, 2
iH detect 94.6% of correct number of outlier, follow by
Dffits (87.7%) and iD , (82.7%) under single AO while in single IO, the percentage detection for Dffits , 2
iH and
iD is very powerful i.e (99.7%, 99.3% and 99%). For multiple IO (2IOs), there seems to be a better correct
detection percentage ranging from 86.3% to 99.6%, and for AO (2AOs), 54% to 80%. The result outcome of
“Both AO and IO” seems to favour additive outliers (IO) and perform woefully under the additive outlier (AO),
this may be that OLS based diagnostic technique can only detect correctly the first outlier that comes on it way
and swamp the rest of the outlier.
Table 2. A simulation study on the power of the outlier detection based on robust versions of
regression diagnostics tools.
12 12 (n=30,ai=0.7,nsimul=300,p=1,k1=12,k2=20)
case Single
AO
Single
IO
2 AOs 2IOs Both AO and IO
1st
outlier
2nd
outlier
1st
outlier
2nd
outlier
1st
outlier
(IO)
2nd
outlier
(AO)
M estimation
2
iH 94.6 99 80.3 82.3 88.3 91.7 99 0.3
Dffits 86.3 99.7 71.2 78.3 97.7 98.7 99.6 0.3
Di 84 99.3 53.3 61 92.3 92.7 99.3 0.3
L1/Least Absolute Deviation / Least Absolute value
2
iH 84.3 98.3 76.7 77.3 88.3 92 98.3 0.3
Dffits 77.7 99.3 58 64.3 95.3 95 96.3 0.3
Di 71 95.7 45 52 90 90.3 95.7 0.3
Least Median Square
2
iH 87.3 96 82.7 86 84 85 96 0.3
Dffits 75.3 83 63.3 68.3 83 84.3 82 0.3
Di 69 79.7 56.3 56 79.3 81 79.7 0.3
Least Trimmed Square
2
iH 87.3 98 73.7 73 89.7 90.7 98 0.3
Dffits 79.3 91.7 63 65.7 91.3 92.7 91.7 0.3
Di 71 85.7 43 48.3 84.7 85.7 85.7 0.3
Summary of the outliers detection performance. Note that the numbers are in percentage.
Table 2. present the results of the proposed methods based on robust version which contains four different kinds
of robust regression namely M-estimation, LAD/L1, LMS and LTS respectively, the table shows the comparison
between their power of performance accordingly in the correctly detection and identification in percentage.
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 64 | Page
Which one has the most powerful performance among the robust regressions, however their outliers’ locations
and cut-off values C are set as the same in Table 1. The interpretations of the table are as follows:
a. M-estimation
The power of correct detection and identification percentage for single AO under 2
iH is 94.6% and, Dffits
(86.3%) and iD (84%). There is a very powerful correct percentage detection for single IO i.e. 99% to 99.7%.
Under multiple IO, the power of correct percentage detection and identification is between 88.3% to 97.7% for
2
iH , Dffits and iD . And for multiple AO, 2
iH , Dffits and iD has 99% to 99.6% power of correctly detection
and identification. The 1st
outlier in “both AO and IO”, which is IO has 99% to 99.6% and nothing was detected
correctly in 2nd
outlier which is AO.
b. Least Absolute Deviation (L1/LAD)
All the method has a low percentage detection in multiple AO and little powerful percentage detection on
multiple IO of 88.3% to 95.3%.In single IO, correct outlier detection percentage for 2
iH , Dffits and iD is 95.7%
to 99.3%. And single AO is 71% to 84.3%.The 1st
outlier which is IO in “both AO and IO” has 95.7% to 98.3%
of correct detection in 2
iH , Dffits and iD . And the 2nd
outlier has approximately 0% all through the methods.
c. Least Median Square (LMS).
2
iH has 96% power of correct outlier detection in single IO and the rest method has power of 79.7% to 83%
while in single AO, all method has power of 69% to 87.3%.For multiple AO, 2
iH has 82.7% to 86% of correct
outlier detection and the rest method has a relatively small percentage of correct detection. 2
iH , Dffits and iD
has a percentage of correct detection of 79.3% to 85% in multiple IO. In “both AO and IO”, 2
iH , Dffits and iD
has a percentage of 96%, 82% and 79.7% on the 1st
outlier which is IO and 0.3% on 2nd
outlier which are IO.
d. Least Trimmed Square (LTS)
2
iH , Dffits and iD has 85.7%, 91.7% and 98% power of correct outlier detection in single IO while in single
AO, all method has less percentage power of 69% to 87.3%.For multiple AO, all the methods has percentage
detection of 43% to 73.7% and for multiple IO 84.7% to 91.3% of correct percentage detection.In “both AO and
IO”, 2
iH , Dffits and iD has a powerful percentage of correct outlier detection of 85.7% to 98% under the 1st
outlier which is IO, and 0.3% on 2nd
outlier which is AO.
Table 3. A simulation study on the power of the outlier detection in regression diagnostics tools
based on ordinary least square (ols)
Summary of the outliers detection performance. Note that the numbers are in percentage.
Table 3. Present the result of the performance of 500 replications on outlier-detection method for each single
and multiple outlier specifications using the cut-off C value of each classical diagnostic techniques as stated
earlier respectively. The numbers and percentage of the correct detection and identification are given under each
location of the outliers. Under multiple outlier and “Both AO and IO”, the label k1, k2, k3 tells us the location
of 1st outlier, 2nd outlier and 3rd
outlier respectively.
26 26 (n=100,ai=0.7,nsimul=500,p=1,k1=26,k2=62,k3=99)
case
Single
AO
Single
IO
3 AOs 3 IOs
Both AO and IO
1st
outlier
2nd
outlier
3rd
outlier
1st
outlier
2nd
outlier
3rd
outlier
1st
outlier
(AO)
2nd
outlier
(IO)
3rd
outlier
(IO)
Ordinary Least Square (OLS)
2
iH 97.6 99.4 88.4 88.8 87.2 95 94.4 93.4 99.4 0.2 0.2
Dffits 67.8 99.4 56 53.8 53.4 98.6 99.2 98.2 99.4 0 0.2
iD 55 99.2 38.6 36.6 35.6 96 96.6 95.6 99.2 0 0.2
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 65 | Page
The result shows that OLS based diagnostic technique, 2
iH detect 97.6% of correct number of outlier under
Single AO. The percentage detection for single IO under 2
iH , Dffits and iD is very powerful i.e (99.4%, 99.4%
and 99.2%).For multiple IO (3IOs), there seems to be a better correct detection percentage ranging from 93.4%
to 99.2% under the method 2
iH , Dffits and iD . The result outcome of “both AO and IO” seems to favour
additive outliers (AO) and perform woefully under the innovative outlier (IO), this may be that OLS based
diagnostic technique can only detect correctly the first outlier that comes on it way and swamp the rest of the
outlier.
Table 4. A simulation study on the power of the outlier detection based on robust versions of regression
diagnostics tools.
26 26 (n=100,ai=0.7,nsimul=500,p=1,k1=26,k2=62,k3=99)
Case
Single
AO
Single
IO
3 AOs 3 IOs
Both AO and IO
1st
outlie
r
2nd
outlie
r
3rd
Outlie
r
1st
outlie
r
2nd
outlie
r
3rd
outlie
r
1st
outlie
r
(AO)
2nd
outlie
r
(IO)
3rd
outlier
(IO)
M estimation
2
iH 97.6 99.6 88.6 88.8 87 94.6 94.6 94 99.6 0.2 0.2
Dffits 64.8 99.4 52.4 49.8 48.2 99.2 99 98.4 99.4 0.2 0.2
Di 53 99.2 37.6 34.4 34 96.6 95.6 95.6 99.2 0.2% 0.2
L1/Least Absolute Deviation / Least Absolute value
2
iH 98 99.6 90 89.4 87.2 93.8 94 92.8 99.4 0.2 98.4
Dffits 56.4 98.4 49.8 46.2 47 98.8 98.6 98 98.4 0 0.2
iD 47 98.4 36.2 32.2 0.8 96 96 45.8 98.4 0 0.2
Least Median Square
2
iH 97.8 99.4 90.4 90.6 87.8 94.4 94.2 92.2 99.4 0.2 0.2
Dffits 70.4 96.2 55 53 50.6 93.8 92.4 94.2 96.2 0 0.2
iD 56.2 94.6 42.6 38.8 40.8 91 88 88.8 94.6 0 0.2
Least Trimmed Square
2
iH 97.6 99.8 89.2 89.2 87.8 94.6 95 93.8 99.8 0.2 0.2
Dffits 64.6 98.8 53.6 47.2 47.8 98.8 98.4 97.2 98.8 0.2 0.2
iD 49.6 98.4 36.2 33.6 32.8 97.4 94.8 93.8 98.4 0.2 0.2
Summary of the outliers detection performance. Note that the numbers are in percentage.
Table 4. present the results of the proposed methods based on robust version which contains four different kinds
of robust regression namely M-estimation, LAD/L1, LMS and LTS respectively, the table shows the comparison
between their power of performance accordingly in the correctly detection and identification in percentage.
Which one has the most powerful performance among the robust regressions, however their outliers’ locations
and cut-off values C are set as the same in Table 1. The interpretations of the table are as follows:
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 66 | Page
a. M-estimation
The power of correct detection and identification percentage for multiple and single AO is very poor under the
M-estimation except for 2
iH single AO that has 99.6% correct detection.Under multiple IO, the power of
correct percentage detection and identification is between 94% to 99.6% for 2
iH , Dffits and iD which is quiet
powerful. 2
iH , Dffits and iD has 99.2% to 99.6% power of correctly detection and identification under the 1st
outlier which is AO in “Both AO and IO” and nothing was detected correctly in 2nd
outlier and 3rd
outlier which
were IOs.
b. Least Absolute Deviation (L1/LAD)
Dffits and iD has 0.8% to 56.4% power of correct outlier detection which is relatively low, meanwhile
2
iH has
correct power detection of 87.2% to 90% in multiple AO and 98% in single AO.In single and multiple IO,
correct outlier detection percentage for 2
iH , Dffits and iD which has the percentage of 92.8% to 99.6%.The 1st
outlier which is AO in “both AO and IO” has 98.4% to 99.4% of correct detection in 2
iH , Dffits and iD . For 2nd
and 3rd
outlier, they have approximately 0% correct detection expect the 2
iH 3rd
outlier that has 98.4% correct
detection.
c. Least Median Square (LMS).
Only 2
iH attain 97.8% correct percentage detection in single AO and 87.8% to 90.6% correct detection from the
1st
outlier to 3rd
outlier in multiple AO. 2
iH also have 87.8% to 90.6% in multiple IO and 99.4% in single IO,
however Dffits and iD has correct percentage detection of 96.2% and 94.6% in single IO and 88% to 94.4% in
multiple IO.In “both AO and IO”, 2
iH , Dffits and iD has a powerful percentage of 99.4%,96.2% and 94.6% on
the 1st
outlier which is AO and 0% to 0.2% on 2nd
and 3rd
outlier which are IO.
d. Least Trimmed Square (LTS)
2
iH has 97.6% power of correct outlier detection in single AO and the rest method has power of 1% to 64.6%
while in single IO, all method has power of 98.4% to 99.8%.For multiple AO, 2
iH has 87.8% to 89.2% of
correct outlier detection and the rest method has a relatively small percentage of correct detection. 2
iH , Dffits
and iD has a percentage of correct detection of 93.8% to 98.8% in multiple IO.In “both AO and IO”, 2
iH ,
Dffits and iD has a powerful percentage of correct outlier detection of 98.4% to 99.8% under the 1st
outlier
which is AO, and 0.2% on 2nd
and 3rd
outlier which are IO.
IV. SUMMARY
It is seen from the result that under small sample size, OLS performance is good under innovative outlier for
single, and multiple outliers, and also in “both AO and IO”, the percentage correction outlier detection is only
innovative outlier. However, generally, the power of percentage correct outlier detection only give best result
for both OLS based and robust version based under innovative outliers. Robust version base on M-estimation
and OLS based perform similar way and the rest robust version method perform less.
Under large sample size, the result also indicate that regression diagnostics tools based on OLS perform similar
way to other various kind of robust versions that are based on robust regression. However, in this part of the
power of correct outlier detection, LTS performance is the best with the probability percentage of 87.8% to
99.8% followed by L1/LAD (87.2% to 99.6%), LMS (87.8% to 99.4%) and M-estimation (87% to 99.6%). In
spite of this Hadi’s influence measure 𝐻𝑖
2
perform the best in both OLS based and robust based version.
Meanwhile, all diagnostics measure didn’t detect any outlier in “both AO and IO” except for the first outlier
which is AO, no outlier is detected in second and third outlier under IO.
V. Conclusion
In a small sample size, OLS and M-estimation is suggest to be use for the detection and identification of outlier
under innovative outliers (IO). However both method fail to detect any number of correct outliers detection
when the sample were mixed by both innovative outliers and additive outliers. Other robust estimation methods
Identification of Outliersin Time Series Data via...
DOI: 10.9790/5728-11656067 www.iosrjournals.org 67 | Page
performed less. On the other side, large sample size, LTS perform best in the simulation study compared to
other measures, however it is not robust to a series that is contaminated with both AO and IO. It can only detect
the AO in the series. Also Cook’s Distance and The welsch-kuh distance are not robust to multiple AO.
Reference
[1] R. S. Tsay, Analysis Of Financial Time Series, pp. 1-448, 2002.
[2] S. S. S. Abd Mutalib, and K. Ibrahim, Identification Of Outliers: A Simulation Study, ARPN Journal of Engineering and Applied
Sciences, Vol. 10, pp. 326-330, 2015.
[3] J. Lopez-de-Lacalle, tsoutliers R Package for Detection of Outliers in Time Series, pp. 1- 28, 2014.
[4] J. Xu,. B. Abraham and S. H. Steiner, Outlier Detection Methods in MultivariateRegression Models, pp. 1-27.
[5] E. Widodo, S. Guritno ands.Haryatmi, Response Surface Models with Data Outliersthrough a Case Study, Applied Mathematical
Sciences, Vol. 9, pp. 1803 – 1812, 2015.
[6] R. A. Yaffee, Regression Analysis: Some Popular Statistical Package Options, Social Science, and Mapping Group Academic
Computing Services Information Technology Services, pp. 1-12. 2002.
[7] F. H. Thanoon, Robust Regression by Least Absolute Deviations Method, International Journal of Statistics and Applications, Vol.
5(3), pp. 109-112, 2015.
[8] S. Bachmann and M. Losler, IVS Combination Center at BKG - Robust Outlier Detection Weighting Strategies, IVS General
Meeting Proceedings, pp. 266 – 270, 2012.
[9] S-PLUS Language Reference.

Mais conteúdo relacionado

Mais procurados

Robust Design And Variation Reduction Using DiscoverSim
Robust Design And Variation Reduction Using DiscoverSimRobust Design And Variation Reduction Using DiscoverSim
Robust Design And Variation Reduction Using DiscoverSim
JohnNoguera
 
State and disturbance estimation of a linear systems using proportional integ...
State and disturbance estimation of a linear systems using proportional integ...State and disturbance estimation of a linear systems using proportional integ...
State and disturbance estimation of a linear systems using proportional integ...
Mustefa Jibril
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 

Mais procurados (12)

161783709 chapter-04-answers
161783709 chapter-04-answers161783709 chapter-04-answers
161783709 chapter-04-answers
 
Chapter 4 estimation of parameters
Chapter 4   estimation of parametersChapter 4   estimation of parameters
Chapter 4 estimation of parameters
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
 
Robust Design And Variation Reduction Using DiscoverSim
Robust Design And Variation Reduction Using DiscoverSimRobust Design And Variation Reduction Using DiscoverSim
Robust Design And Variation Reduction Using DiscoverSim
 
Testing a claim about a mean
Testing a claim about a mean  Testing a claim about a mean
Testing a claim about a mean
 
Presentation of the unbalanced R package
Presentation of the unbalanced R packagePresentation of the unbalanced R package
Presentation of the unbalanced R package
 
11
1111
11
 
State and disturbance estimation of a linear systems using proportional integ...
State and disturbance estimation of a linear systems using proportional integ...State and disturbance estimation of a linear systems using proportional integ...
State and disturbance estimation of a linear systems using proportional integ...
 
hypothesis test
 hypothesis test hypothesis test
hypothesis test
 
Dem 7263 fall 2015 spatially autoregressive models 1
Dem 7263 fall 2015   spatially autoregressive models 1Dem 7263 fall 2015   spatially autoregressive models 1
Dem 7263 fall 2015 spatially autoregressive models 1
 
THE EFFECT OF SEGREGATION IN NONREPEATED PRISONER'S DILEMMA
THE EFFECT OF SEGREGATION IN NONREPEATED PRISONER'S DILEMMA THE EFFECT OF SEGREGATION IN NONREPEATED PRISONER'S DILEMMA
THE EFFECT OF SEGREGATION IN NONREPEATED PRISONER'S DILEMMA
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 

Destaque

Assure assignment Wednesday
Assure assignment WednesdayAssure assignment Wednesday
Assure assignment Wednesday
zheaver
 
Exercise reading
Exercise readingExercise reading
Exercise reading
'Niq Hdyn
 
BIOMASS AS FUEL FOR REHEATING FURNACES FOCUSING ON IMPURITIES. (Emilia, Sir...
BIOMASS AS FUEL FOR REHEATING FURNACES   FOCUSING ON IMPURITIES. (Emilia, Sir...BIOMASS AS FUEL FOR REHEATING FURNACES   FOCUSING ON IMPURITIES. (Emilia, Sir...
BIOMASS AS FUEL FOR REHEATING FURNACES FOCUSING ON IMPURITIES. (Emilia, Sir...
Siri Andersson
 
Clustering large probabilistic graphs
Clustering large probabilistic graphsClustering large probabilistic graphs
Clustering large probabilistic graphs
Ecwayt
 
CV - Ettienne van der Merwe - Apr 2016
CV - Ettienne van der Merwe - Apr 2016CV - Ettienne van der Merwe - Apr 2016
CV - Ettienne van der Merwe - Apr 2016
Ettienne van der Merwe
 

Destaque (13)

Assure assignment Wednesday
Assure assignment WednesdayAssure assignment Wednesday
Assure assignment Wednesday
 
El tdah de los dk
El tdah de los dkEl tdah de los dk
El tdah de los dk
 
Exercise reading
Exercise readingExercise reading
Exercise reading
 
BIOMASS AS FUEL FOR REHEATING FURNACES FOCUSING ON IMPURITIES. (Emilia, Sir...
BIOMASS AS FUEL FOR REHEATING FURNACES   FOCUSING ON IMPURITIES. (Emilia, Sir...BIOMASS AS FUEL FOR REHEATING FURNACES   FOCUSING ON IMPURITIES. (Emilia, Sir...
BIOMASS AS FUEL FOR REHEATING FURNACES FOCUSING ON IMPURITIES. (Emilia, Sir...
 
Clustering large probabilistic graphs
Clustering large probabilistic graphsClustering large probabilistic graphs
Clustering large probabilistic graphs
 
Diploma
DiplomaDiploma
Diploma
 
Nationalpublishingcompany
NationalpublishingcompanyNationalpublishingcompany
Nationalpublishingcompany
 
Design like you stand for something
Design like you stand for something Design like you stand for something
Design like you stand for something
 
Credit Rating Process with Respect to Corporate Debt
Credit Rating Process with Respect to Corporate DebtCredit Rating Process with Respect to Corporate Debt
Credit Rating Process with Respect to Corporate Debt
 
CV - Ettienne van der Merwe - Apr 2016
CV - Ettienne van der Merwe - Apr 2016CV - Ettienne van der Merwe - Apr 2016
CV - Ettienne van der Merwe - Apr 2016
 
Maratones de lectura - Guia secundaria
Maratones de lectura - Guia secundariaMaratones de lectura - Guia secundaria
Maratones de lectura - Guia secundaria
 
Suelos y plantas
Suelos y plantasSuelos y plantas
Suelos y plantas
 
Relación agua suelo planta atmosfera (raspa) ingenieria agric
Relación agua suelo planta atmosfera (raspa)   ingenieria agricRelación agua suelo planta atmosfera (raspa)   ingenieria agric
Relación agua suelo planta atmosfera (raspa) ingenieria agric
 

Semelhante a Identification of Outliersin Time Series Data via Simulation Study

Bel ventutorial hetero
Bel ventutorial heteroBel ventutorial hetero
Bel ventutorial hetero
Edda Kang
 
2014-mo444-final-project
2014-mo444-final-project2014-mo444-final-project
2014-mo444-final-project
Paulo Faria
 
Fast and Bootstrap Robust for LTS
Fast and Bootstrap Robust for LTSFast and Bootstrap Robust for LTS
Fast and Bootstrap Robust for LTS
dessybudiyanti
 

Semelhante a Identification of Outliersin Time Series Data via Simulation Study (20)

DETECTION OF RELIABLE SOFTWARE USING SPRT ON TIME DOMAIN DATA
DETECTION OF RELIABLE SOFTWARE USING SPRT ON TIME DOMAIN DATADETECTION OF RELIABLE SOFTWARE USING SPRT ON TIME DOMAIN DATA
DETECTION OF RELIABLE SOFTWARE USING SPRT ON TIME DOMAIN DATA
 
50120130405032
5012013040503250120130405032
50120130405032
 
report
reportreport
report
 
Exponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLEExponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLE
 
Fuzzy Regression Model for Knee Osteoarthritis Disease Diagnosis
Fuzzy Regression Model for Knee Osteoarthritis Disease DiagnosisFuzzy Regression Model for Knee Osteoarthritis Disease Diagnosis
Fuzzy Regression Model for Knee Osteoarthritis Disease Diagnosis
 
Design of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierDesign of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic Multiplier
 
Bel ventutorial hetero
Bel ventutorial heteroBel ventutorial hetero
Bel ventutorial hetero
 
An Efficient Unsupervised AdaptiveAntihub Technique for Outlier Detection in ...
An Efficient Unsupervised AdaptiveAntihub Technique for Outlier Detection in ...An Efficient Unsupervised AdaptiveAntihub Technique for Outlier Detection in ...
An Efficient Unsupervised AdaptiveAntihub Technique for Outlier Detection in ...
 
Model of robust regression with parametric and nonparametric methods
Model of robust regression with parametric and nonparametric methodsModel of robust regression with parametric and nonparametric methods
Model of robust regression with parametric and nonparametric methods
 
Research Assignment INAR(1)
Research Assignment INAR(1)Research Assignment INAR(1)
Research Assignment INAR(1)
 
An SPRT Procedure for an Ungrouped Data using MMLE Approach
An SPRT Procedure for an Ungrouped Data using MMLE ApproachAn SPRT Procedure for an Ungrouped Data using MMLE Approach
An SPRT Procedure for an Ungrouped Data using MMLE Approach
 
Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)Concept of Limit of Detection (LOD)
Concept of Limit of Detection (LOD)
 
2014-mo444-final-project
2014-mo444-final-project2014-mo444-final-project
2014-mo444-final-project
 
SURE Model_Panel data.pptx
SURE Model_Panel data.pptxSURE Model_Panel data.pptx
SURE Model_Panel data.pptx
 
Outlier detection by Ueda's method
Outlier detection by Ueda's methodOutlier detection by Ueda's method
Outlier detection by Ueda's method
 
A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...A Study on Performance Analysis of Different Prediction Techniques in Predict...
A Study on Performance Analysis of Different Prediction Techniques in Predict...
 
IRJET-Debarred Objects Recognition by PFL Operator
IRJET-Debarred Objects Recognition by PFL OperatorIRJET-Debarred Objects Recognition by PFL Operator
IRJET-Debarred Objects Recognition by PFL Operator
 
Fast and Bootstrap Robust for LTS
Fast and Bootstrap Robust for LTSFast and Bootstrap Robust for LTS
Fast and Bootstrap Robust for LTS
 
Statistical techniques used in measurement
Statistical techniques used in measurementStatistical techniques used in measurement
Statistical techniques used in measurement
 
Neural Network based Supervised Self Organizing Maps for Face Recognition
Neural Network based Supervised Self Organizing Maps for Face Recognition  Neural Network based Supervised Self Organizing Maps for Face Recognition
Neural Network based Supervised Self Organizing Maps for Face Recognition
 

Mais de iosrjce

Mais de iosrjce (20)

An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...
 
Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?
 
Childhood Factors that influence success in later life
Childhood Factors that influence success in later lifeChildhood Factors that influence success in later life
Childhood Factors that influence success in later life
 
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
 
Customer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in DubaiCustomer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in Dubai
 
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
 
Consumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model ApproachConsumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model Approach
 
Student`S Approach towards Social Network Sites
Student`S Approach towards Social Network SitesStudent`S Approach towards Social Network Sites
Student`S Approach towards Social Network Sites
 
Broadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperativeBroadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperative
 
A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...
 
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
 
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on BangladeshConsumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
 
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
 
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
 
Media Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & ConsiderationMedia Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & Consideration
 
Customer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative studyCustomer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative study
 
Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...
 
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
 
Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...
 
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
 

Último

Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
gindu3009
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
RohitNehra6
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 

Último (20)

Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Creating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening DesignsCreating and Analyzing Definitive Screening Designs
Creating and Analyzing Definitive Screening Designs
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 

Identification of Outliersin Time Series Data via Simulation Study

  • 1. IOSR Journal of Mathematics (IOSR-JM) e-ISSN: 2278-5728, p-ISSN: 2319-765X. Volume 11, Issue 6 Ver. V (Nov. - Dec. 2015), PP 60-67 www.iosrjournals.org DOI: 10.9790/5728-11656067 www.iosrjournals.org 60 | Page Identification of Outliersin Time Series Data via Simulation Study 1 ADEWALE Asiata. O,2 Nurul Sima MOHAMAD SHARIFF 1,2, Facultyof Science&TechnologyUniversitySainsIslamMalaysia (USIM)Bandar Baru Nilai, 71800 Nilai Negeri SembilanMalaysia Abstract:This paper compares the performance of regression diagnostics techniques based on Ordinary LeastSquares (OLS) estimators and four types of robust regression based on robust estimators todetect and identify outliers. It is known that OLS is not robust in the presence of multiple high leverage points.Thus severalrobust regression models are used as alternative and its approach is more reliable and appropriate method for solving this problem. The comparisonsaremadeviasimulationstudies.Our resultshaveshownthatinsomecases diagnostics basedonthe OLSandsome robustestimatorsgivesimilaroutcomes,they detectthesame percentage of correct outlier detection. And under small sample size OLS and M-estimation perform best for innovative outliers. The results also shows that Least Trimmed Square is the best among all its counterparts under large sample size. Keywords: Outliers, Ordinary Least Squares (OLS), Regression diagnostics, Robust regression, Simulation Studies. I. Introduction Outliers are usually encountered in time series data analysis. The presence of outliers in time series analysis can seriously has negative impact in the analysis because they may stimulate substantial biasness in parameter estimation, model misspecification and incorrect inference, [1]. Outliers has been defined by Abd. Mutalib and Ibrahim [2] as data points or observations that deviate distinctly from other observations or data points which are abnormally large or small from the other observations. The relevancies of outlier detection and identification in time series have been used in fraud detection, financial institute, public health and Telecommunication Company. According to Lopez-de-Lacalle[3], there are five types of outliers that are commonly found, namely, innovation outlier (IO), additive outlier (AO), level shift (LS), temporary change (TC)and seasonal level shift (SLS). The most popular way to analyse time series regression model is to use Ordinary Least Square (OLS) method. It is the best technique if all the statistical assumptions are valid but when the data or the series are contaminated with outliers, these statistical assumptions are invalid. There arise the needs of regression diagnostics tools or techniques to detect and identify the outliers or influential observations. There are many types of regression diagnostics tools in the literature, among them are: The welsch-kuh distances, Cook’s Distance and Hadis influence Measure. However, these methods are not robust in presence of multiple high leverage point, which can cause masking and swamping effects [4]. According to Widodo et al [5], robust regression approach is more reliable and appropriate method for solving this problem. The robust estimators are relatively unaltered by large changes in a small series of data and also small changes in a large part of the series. Yafee [6] discusses that there are several kinds of robust estimators in the literature among which are Least Absolute Deviations (LAD or L1), Least Median Squares (LMS), Least Trimmed Squares (LTS) and Huber M- Estimation. These robust estimators will be used in this research work by S plus statistical software through simulation study. Their performance will be compared to one another and the best technique among them will be figured determined. II. Materials and Method OLS Estimation Consider the time series model of simple autoregressive, AR(p) tptptt yyy    ...11 (1) The time series model of simple autoregressive, AR(p) as the sameform as (1), can be written as ,...210   pii xy ni ,...,2,1 (1) Model (2) can be written in matrix form as,   XY (3)
  • 2. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 61 | Page WhereY= vector of nx1 response,Xis the n x p matrix of explanatory variables,  is the vector of parameter (regression coefficient),  is the random error distributed as normal distribution with mean zero and 2  . Here n is number of observations and p is number of regressors. The final estimates of  is then, YXXX TT 1 )(ˆ   (4) The residual isobtained as follows: yHYXXXXYXYYYe TT )1()(ˆˆ 1    (5) Where TT XXXXH 1 )(   is leverage / hat matrix. The i-th elements of H is, T i T iii xXXxh 1 )(   (6) Regression Diagnostics Methods There is numerous regression diagnostics methods used to identify outliers. However, this study will only consider five methods that are listed below. Hadi’s influence measure , 111 2 2 2 i i iiii ii i d d h p h h H     ni ,...,3,2,1 (7) Where SSE e d i i 2 the normalized, SSE is sum squares residual residuals, It identify influential observations. The cut-off point for 2 iH measure is  )var()( 22 ii HCHmean  C is constant which take the value of 2. The welsch-kuh distances , )1(ˆ * )( iii iii h he DFFits    ni ,...,3,2,1 (8) has the cuff-off value of n p *2 Cook’s Distance , )1)(1( *2 ii iii i hp hr D   ni ,...,3,2,1 (9) Where i i i h e r   1ˆ , Any observation that exceeds the cut-off of pn  4 is considered as influential observation. Robust Regression Robust regression methods are designed not to be wholly affected by violations of assumptions by the core data generating process. A robust regression is performed on a high breakdown point and high efficiency regression, [9]. Huber M-Estimation Huber-M estimation uses Huber weight function as its weight function. The Huber M-Estimator scale estimate of mˆ and the Huber M-Estimator error me are usedinstead ofˆ , and ie which are based on OLS method.
  • 3. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 62 | Page Least absolute deviations (LAD or L1) LAD obtains a higher effectiveness than OLS through minimizing the sum of the absolute errors, [7]. It scale estimate denoted by 1 ˆL and its error 1Le are usedinstead ofˆ , and ie which are based on OLS. Least median squares (LMS) LMS is a robust estimator that has been hypothetically has breakdown point of 50%, [8]. This means the LMS provides reliable outcomes even if up to 50% of contaminated data or series exist. It has the characteristics of solving the liner model by minimising the median of the weighted squared. LMS scale estimate denoted by LMSˆ and its error LMSe are usedinstead ofˆ , and ie which are based on OLS. Least trimmed squares (LTS) This robust regression techniques minimizes the sum of squared residuals over n observations and subset k of those observations, thus the n-k observations which are excluded does not has effect on the fit.The LAD scale estimate is denoted by LADˆ and its error LADe are usedinstead ofˆ , and ie which are based on OLS. Note that their cuff-value remains the same as the OLS. Incorporating outliers into the series. From the original series, model1: )( ttt Iyq  , the series is contaminated with outliers.  is the magnitude of the outliers, is the time that the outliers occur and )(tI is a dummy variable which has zero value at all lags except when time Tt           t t It ,0 ,1 tatoccursioncontaminatwhen , 1tI otherwise 0. III. Result and Discussion The sample size used is n=30 & 200, parameter is set to be 7.0 , 5 , standard deviation 1 ,300 replications for n=30 and 500 replications for n = 200. To assess the power of the procedure, the following case will be considered; i. Single outlier of AO / IO ii. Multiple outliers AOs / IOs iii. Both multiple outliers AOs and IOs Fifthteen measures will be run for each cases, Under n=30, the location for single outlier is set to be 12 and for the multiple outlier, the location is set to be k1 =12, and k= 20. Forn=200, the location for single outlier is set to be 26 , and multiple outliers; k1 =26, k2 =62 and k3 =99. Table 1. A simulation study on the power of the outlier detection in regression diagnostics tools based on ols. 12 12 (n=30,ai=0.7,nsimul=300,p=1,k1=12,k2=20) Case Single AO Single IO 2 AOs 2IOs Both AO and IO 1st outlier 2nd outlier 1st outlier 2nd outlier 1st outlier (IO) 2nd outlier (AO) Ordinary Least Square (OLS) 2 iH 94.6 99.3 80 80 86.3 87.3 99.3 0.3 Dffits 87.7 99.7 70.7 79 97.7 98 99.6 0.3
  • 4. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 63 | Page Di 82.7 99 54 61 92.7 93 99 0.3 Summary of the outliers detection performance. Note that the numbers are in percentage. Table 1. Present the result of the performance of 300 replications on outlier-detection method for each single and multiple outlier specifications using the cut-off C value of each classical diagnostic techniques as stated earlier respectively. The numbers and percentage of the correct detection and identification are given under each location of the outliers. Under multiple outlier and “Both AO and IO”, the label k1 and k2, tells us the location of 1st outlier, outlier and 2nd outlier respectively, and  for single outlier. The result shows that OLS based diagnostic technique, 2 iH detect 94.6% of correct number of outlier, follow by Dffits (87.7%) and iD , (82.7%) under single AO while in single IO, the percentage detection for Dffits , 2 iH and iD is very powerful i.e (99.7%, 99.3% and 99%). For multiple IO (2IOs), there seems to be a better correct detection percentage ranging from 86.3% to 99.6%, and for AO (2AOs), 54% to 80%. The result outcome of “Both AO and IO” seems to favour additive outliers (IO) and perform woefully under the additive outlier (AO), this may be that OLS based diagnostic technique can only detect correctly the first outlier that comes on it way and swamp the rest of the outlier. Table 2. A simulation study on the power of the outlier detection based on robust versions of regression diagnostics tools. 12 12 (n=30,ai=0.7,nsimul=300,p=1,k1=12,k2=20) case Single AO Single IO 2 AOs 2IOs Both AO and IO 1st outlier 2nd outlier 1st outlier 2nd outlier 1st outlier (IO) 2nd outlier (AO) M estimation 2 iH 94.6 99 80.3 82.3 88.3 91.7 99 0.3 Dffits 86.3 99.7 71.2 78.3 97.7 98.7 99.6 0.3 Di 84 99.3 53.3 61 92.3 92.7 99.3 0.3 L1/Least Absolute Deviation / Least Absolute value 2 iH 84.3 98.3 76.7 77.3 88.3 92 98.3 0.3 Dffits 77.7 99.3 58 64.3 95.3 95 96.3 0.3 Di 71 95.7 45 52 90 90.3 95.7 0.3 Least Median Square 2 iH 87.3 96 82.7 86 84 85 96 0.3 Dffits 75.3 83 63.3 68.3 83 84.3 82 0.3 Di 69 79.7 56.3 56 79.3 81 79.7 0.3 Least Trimmed Square 2 iH 87.3 98 73.7 73 89.7 90.7 98 0.3 Dffits 79.3 91.7 63 65.7 91.3 92.7 91.7 0.3 Di 71 85.7 43 48.3 84.7 85.7 85.7 0.3 Summary of the outliers detection performance. Note that the numbers are in percentage. Table 2. present the results of the proposed methods based on robust version which contains four different kinds of robust regression namely M-estimation, LAD/L1, LMS and LTS respectively, the table shows the comparison between their power of performance accordingly in the correctly detection and identification in percentage.
  • 5. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 64 | Page Which one has the most powerful performance among the robust regressions, however their outliers’ locations and cut-off values C are set as the same in Table 1. The interpretations of the table are as follows: a. M-estimation The power of correct detection and identification percentage for single AO under 2 iH is 94.6% and, Dffits (86.3%) and iD (84%). There is a very powerful correct percentage detection for single IO i.e. 99% to 99.7%. Under multiple IO, the power of correct percentage detection and identification is between 88.3% to 97.7% for 2 iH , Dffits and iD . And for multiple AO, 2 iH , Dffits and iD has 99% to 99.6% power of correctly detection and identification. The 1st outlier in “both AO and IO”, which is IO has 99% to 99.6% and nothing was detected correctly in 2nd outlier which is AO. b. Least Absolute Deviation (L1/LAD) All the method has a low percentage detection in multiple AO and little powerful percentage detection on multiple IO of 88.3% to 95.3%.In single IO, correct outlier detection percentage for 2 iH , Dffits and iD is 95.7% to 99.3%. And single AO is 71% to 84.3%.The 1st outlier which is IO in “both AO and IO” has 95.7% to 98.3% of correct detection in 2 iH , Dffits and iD . And the 2nd outlier has approximately 0% all through the methods. c. Least Median Square (LMS). 2 iH has 96% power of correct outlier detection in single IO and the rest method has power of 79.7% to 83% while in single AO, all method has power of 69% to 87.3%.For multiple AO, 2 iH has 82.7% to 86% of correct outlier detection and the rest method has a relatively small percentage of correct detection. 2 iH , Dffits and iD has a percentage of correct detection of 79.3% to 85% in multiple IO. In “both AO and IO”, 2 iH , Dffits and iD has a percentage of 96%, 82% and 79.7% on the 1st outlier which is IO and 0.3% on 2nd outlier which are IO. d. Least Trimmed Square (LTS) 2 iH , Dffits and iD has 85.7%, 91.7% and 98% power of correct outlier detection in single IO while in single AO, all method has less percentage power of 69% to 87.3%.For multiple AO, all the methods has percentage detection of 43% to 73.7% and for multiple IO 84.7% to 91.3% of correct percentage detection.In “both AO and IO”, 2 iH , Dffits and iD has a powerful percentage of correct outlier detection of 85.7% to 98% under the 1st outlier which is IO, and 0.3% on 2nd outlier which is AO. Table 3. A simulation study on the power of the outlier detection in regression diagnostics tools based on ordinary least square (ols) Summary of the outliers detection performance. Note that the numbers are in percentage. Table 3. Present the result of the performance of 500 replications on outlier-detection method for each single and multiple outlier specifications using the cut-off C value of each classical diagnostic techniques as stated earlier respectively. The numbers and percentage of the correct detection and identification are given under each location of the outliers. Under multiple outlier and “Both AO and IO”, the label k1, k2, k3 tells us the location of 1st outlier, 2nd outlier and 3rd outlier respectively. 26 26 (n=100,ai=0.7,nsimul=500,p=1,k1=26,k2=62,k3=99) case Single AO Single IO 3 AOs 3 IOs Both AO and IO 1st outlier 2nd outlier 3rd outlier 1st outlier 2nd outlier 3rd outlier 1st outlier (AO) 2nd outlier (IO) 3rd outlier (IO) Ordinary Least Square (OLS) 2 iH 97.6 99.4 88.4 88.8 87.2 95 94.4 93.4 99.4 0.2 0.2 Dffits 67.8 99.4 56 53.8 53.4 98.6 99.2 98.2 99.4 0 0.2 iD 55 99.2 38.6 36.6 35.6 96 96.6 95.6 99.2 0 0.2
  • 6. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 65 | Page The result shows that OLS based diagnostic technique, 2 iH detect 97.6% of correct number of outlier under Single AO. The percentage detection for single IO under 2 iH , Dffits and iD is very powerful i.e (99.4%, 99.4% and 99.2%).For multiple IO (3IOs), there seems to be a better correct detection percentage ranging from 93.4% to 99.2% under the method 2 iH , Dffits and iD . The result outcome of “both AO and IO” seems to favour additive outliers (AO) and perform woefully under the innovative outlier (IO), this may be that OLS based diagnostic technique can only detect correctly the first outlier that comes on it way and swamp the rest of the outlier. Table 4. A simulation study on the power of the outlier detection based on robust versions of regression diagnostics tools. 26 26 (n=100,ai=0.7,nsimul=500,p=1,k1=26,k2=62,k3=99) Case Single AO Single IO 3 AOs 3 IOs Both AO and IO 1st outlie r 2nd outlie r 3rd Outlie r 1st outlie r 2nd outlie r 3rd outlie r 1st outlie r (AO) 2nd outlie r (IO) 3rd outlier (IO) M estimation 2 iH 97.6 99.6 88.6 88.8 87 94.6 94.6 94 99.6 0.2 0.2 Dffits 64.8 99.4 52.4 49.8 48.2 99.2 99 98.4 99.4 0.2 0.2 Di 53 99.2 37.6 34.4 34 96.6 95.6 95.6 99.2 0.2% 0.2 L1/Least Absolute Deviation / Least Absolute value 2 iH 98 99.6 90 89.4 87.2 93.8 94 92.8 99.4 0.2 98.4 Dffits 56.4 98.4 49.8 46.2 47 98.8 98.6 98 98.4 0 0.2 iD 47 98.4 36.2 32.2 0.8 96 96 45.8 98.4 0 0.2 Least Median Square 2 iH 97.8 99.4 90.4 90.6 87.8 94.4 94.2 92.2 99.4 0.2 0.2 Dffits 70.4 96.2 55 53 50.6 93.8 92.4 94.2 96.2 0 0.2 iD 56.2 94.6 42.6 38.8 40.8 91 88 88.8 94.6 0 0.2 Least Trimmed Square 2 iH 97.6 99.8 89.2 89.2 87.8 94.6 95 93.8 99.8 0.2 0.2 Dffits 64.6 98.8 53.6 47.2 47.8 98.8 98.4 97.2 98.8 0.2 0.2 iD 49.6 98.4 36.2 33.6 32.8 97.4 94.8 93.8 98.4 0.2 0.2 Summary of the outliers detection performance. Note that the numbers are in percentage. Table 4. present the results of the proposed methods based on robust version which contains four different kinds of robust regression namely M-estimation, LAD/L1, LMS and LTS respectively, the table shows the comparison between their power of performance accordingly in the correctly detection and identification in percentage. Which one has the most powerful performance among the robust regressions, however their outliers’ locations and cut-off values C are set as the same in Table 1. The interpretations of the table are as follows:
  • 7. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 66 | Page a. M-estimation The power of correct detection and identification percentage for multiple and single AO is very poor under the M-estimation except for 2 iH single AO that has 99.6% correct detection.Under multiple IO, the power of correct percentage detection and identification is between 94% to 99.6% for 2 iH , Dffits and iD which is quiet powerful. 2 iH , Dffits and iD has 99.2% to 99.6% power of correctly detection and identification under the 1st outlier which is AO in “Both AO and IO” and nothing was detected correctly in 2nd outlier and 3rd outlier which were IOs. b. Least Absolute Deviation (L1/LAD) Dffits and iD has 0.8% to 56.4% power of correct outlier detection which is relatively low, meanwhile 2 iH has correct power detection of 87.2% to 90% in multiple AO and 98% in single AO.In single and multiple IO, correct outlier detection percentage for 2 iH , Dffits and iD which has the percentage of 92.8% to 99.6%.The 1st outlier which is AO in “both AO and IO” has 98.4% to 99.4% of correct detection in 2 iH , Dffits and iD . For 2nd and 3rd outlier, they have approximately 0% correct detection expect the 2 iH 3rd outlier that has 98.4% correct detection. c. Least Median Square (LMS). Only 2 iH attain 97.8% correct percentage detection in single AO and 87.8% to 90.6% correct detection from the 1st outlier to 3rd outlier in multiple AO. 2 iH also have 87.8% to 90.6% in multiple IO and 99.4% in single IO, however Dffits and iD has correct percentage detection of 96.2% and 94.6% in single IO and 88% to 94.4% in multiple IO.In “both AO and IO”, 2 iH , Dffits and iD has a powerful percentage of 99.4%,96.2% and 94.6% on the 1st outlier which is AO and 0% to 0.2% on 2nd and 3rd outlier which are IO. d. Least Trimmed Square (LTS) 2 iH has 97.6% power of correct outlier detection in single AO and the rest method has power of 1% to 64.6% while in single IO, all method has power of 98.4% to 99.8%.For multiple AO, 2 iH has 87.8% to 89.2% of correct outlier detection and the rest method has a relatively small percentage of correct detection. 2 iH , Dffits and iD has a percentage of correct detection of 93.8% to 98.8% in multiple IO.In “both AO and IO”, 2 iH , Dffits and iD has a powerful percentage of correct outlier detection of 98.4% to 99.8% under the 1st outlier which is AO, and 0.2% on 2nd and 3rd outlier which are IO. IV. SUMMARY It is seen from the result that under small sample size, OLS performance is good under innovative outlier for single, and multiple outliers, and also in “both AO and IO”, the percentage correction outlier detection is only innovative outlier. However, generally, the power of percentage correct outlier detection only give best result for both OLS based and robust version based under innovative outliers. Robust version base on M-estimation and OLS based perform similar way and the rest robust version method perform less. Under large sample size, the result also indicate that regression diagnostics tools based on OLS perform similar way to other various kind of robust versions that are based on robust regression. However, in this part of the power of correct outlier detection, LTS performance is the best with the probability percentage of 87.8% to 99.8% followed by L1/LAD (87.2% to 99.6%), LMS (87.8% to 99.4%) and M-estimation (87% to 99.6%). In spite of this Hadi’s influence measure 𝐻𝑖 2 perform the best in both OLS based and robust based version. Meanwhile, all diagnostics measure didn’t detect any outlier in “both AO and IO” except for the first outlier which is AO, no outlier is detected in second and third outlier under IO. V. Conclusion In a small sample size, OLS and M-estimation is suggest to be use for the detection and identification of outlier under innovative outliers (IO). However both method fail to detect any number of correct outliers detection when the sample were mixed by both innovative outliers and additive outliers. Other robust estimation methods
  • 8. Identification of Outliersin Time Series Data via... DOI: 10.9790/5728-11656067 www.iosrjournals.org 67 | Page performed less. On the other side, large sample size, LTS perform best in the simulation study compared to other measures, however it is not robust to a series that is contaminated with both AO and IO. It can only detect the AO in the series. Also Cook’s Distance and The welsch-kuh distance are not robust to multiple AO. Reference [1] R. S. Tsay, Analysis Of Financial Time Series, pp. 1-448, 2002. [2] S. S. S. Abd Mutalib, and K. Ibrahim, Identification Of Outliers: A Simulation Study, ARPN Journal of Engineering and Applied Sciences, Vol. 10, pp. 326-330, 2015. [3] J. Lopez-de-Lacalle, tsoutliers R Package for Detection of Outliers in Time Series, pp. 1- 28, 2014. [4] J. Xu,. B. Abraham and S. H. Steiner, Outlier Detection Methods in MultivariateRegression Models, pp. 1-27. [5] E. Widodo, S. Guritno ands.Haryatmi, Response Surface Models with Data Outliersthrough a Case Study, Applied Mathematical Sciences, Vol. 9, pp. 1803 – 1812, 2015. [6] R. A. Yaffee, Regression Analysis: Some Popular Statistical Package Options, Social Science, and Mapping Group Academic Computing Services Information Technology Services, pp. 1-12. 2002. [7] F. H. Thanoon, Robust Regression by Least Absolute Deviations Method, International Journal of Statistics and Applications, Vol. 5(3), pp. 109-112, 2015. [8] S. Bachmann and M. Losler, IVS Combination Center at BKG - Robust Outlier Detection Weighting Strategies, IVS General Meeting Proceedings, pp. 266 – 270, 2012. [9] S-PLUS Language Reference.