SlideShare uma empresa Scribd logo
1 de 6
Baixar para ler offline
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010



  A Novel Classification via Clustering Method
     for Anomaly Based Network Intrusion
               Detection System
                                   Mrutyunjaya Panda1 and Manas Ranjan Patra2
      1
          Department of ECE, Gandhi Institute of Engineering and Technology, Gunupur, Orissa-765022, India
                                    Email: mrutyunjaya.2007@rediffmail.com
                  2
                    Department of Computer Science, Berhampur University, Orissa-760007, India
                                           Email: mrpatra12@gmail.com

Abstract- Intrusion detection in the internet is an active           intrusions by scanning for pre-classified signatures in
area of research. Intruders can be classified into two               network packets. On the other hand, anomaly detection
types, namely; external intruders who are unauthorized               can detect new intrusions while misuse detection may
users of the computers they attack, and internal                     not. The idea behind anomaly detection is that if we
intruders, who have permission to access the system but              can establish a normal activity profile for a system, in
with some restrictions. The aim of this paper is to present
a methodology to recognize attacks during the normal
                                                                     theory we can flag all system states varying from the
activities in a system. A novel classification via sequential        established profile as intrusion attempts. However, if a
information bottleneck (sIB) clustering algorithm has                set of intrusive activities is not identical to the set of
been proposed to build an efficient anomaly based                    known anomalous activities, the situation becomes
network intrusion detection model. We have compared                  more interesting as we can find new interesting
our proposed method with other clustering algorithms                 possibilities. Anomalous activities that are not intrusive
like X-Means, Farthest First, Filtered clusters, DBSCAN,             but flagged as intrusive are referred as false positives.
K-Means,        and     EM     (Expectation-Maximization)            Actual intrusive activities that go undetected are called
clustering in order to find the suitability of our proposed
                                                                     false negatives. This is a serious problem and far more
algorithm. A subset of KDDCup 1999 intrusion detection
benchmark dataset has been used for the experiment.                  serious than false positives. Anomaly detection is
Results show that the proposed method is efficient in                computationally expensive because of the overhead of
terms of detection accuracy, low false positive rate in              keeping track of and possibly updating several system
comparison to the other existing methods.                            profiles. Recently proposed system learning rules for
                                                                     anomaly detection (LERAD) discovers relationships
Index Terms: Intrusion Detection, Meta Classifier,                   among attributes in order to model application
Unsupervised      clustering, Sequential Information                 protocols [3, 4].
bottleneck clustering.                                               In fact, intrusion detection can be regarded as a
                                                                     classification problem, namely; identifying normal and
                   I INTRODUCTION                                    other types of intrusive behavior. Hence, the key point
Identifying unauthorized use, misuse and attacks on                  here is to choose an effective classification approach to
information systems is defined as intrusion detection                build accurate intrusion detection models. A number of
[1, 2]. The most popular way to detect intrusions has                machine learning algorithms have been applied to
been done by analyzing audit data generated by                       intrusion detection to learn intrusion rules or build
operating systems and by networks. Since almost all                  normal usage patterns. In this paper, we investigate and
activities are logged on a system, it is possible that a             evaluate the performance of classifiers using different
manual inspection of these logs would allow intrusions               cluster algorithms like Sequential Information
to be detected. It is important to analyze the audit data            Bottleneck (sIB), Farthest First Traversal (FFT), K-
even an attack has occurred, for determining the extent              Means, DBSCAN and EM. The motivation behind
of damage occurred, this analysis helps in attack trace              using the classification using clustering algorithms is to
back and also in recording the attack patterns for future            improve the accuracy of the intrusion detection system
prevention of such attacks. An intrusion detection                   in comparison to the individual clustering algorithms.
system (IDS) can be used to analyze audit data for                   The rest of the paper is organized as follows. A
such insights. This makes IDS a valuable real-time                   literature review is presented in Section 2 followed by
detection and prevention tool as well as a forensic                  a short theoretical background on the clustering
analysis tool.                                                       algorithms used in this research in Section 3. Section 4
Traditionally, intrusion detection techniques fall into              provides a brief idea about the KDDCup 1999 dataset
two categories: signature detection and anomaly                      used. The proposed methodology is described in
detection. Signature detection, or misuse detection,                 Section 5 followed by the experimental setup in
searches for well known patterns of attacks and                      Section 6. Section 7 describes the results and


                                                                17
© 2010 ACEEE
DOI: 01.ijns.01.02.04
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

discussion. Finally, the paper concludes with future            In our experiments, some clustering algorithms are
research directions in Section 8.                               selected for performance comparison. Some of them
                                                                are discussed as follows.
             II. RELATED RESEARCH
                                                                A. K-Means:
In [5], the authors have designed an Intrusion response         We first describe the K-means algorithm for producing
(IR) system cooperating with IDS using mobile agents            a clustering of the points in the input into K clusters. It
distributed throughout the network, based on                    partitions the data-points into K subsets such that all
stigmergic properties. In [6], the authors introduced a         points in a given subset “belong” to some centre. The
self-organized ant colony based intrusion detection             algorithm keeps track of the centroids of the subsets
system (ANTIDS) to detect intrusions and compares               and proceeds in iterations. Before the first iteration the
its performance with linear genetic programming                 centroids are initialized to random values. The
(LGP) [7], Support vector machines (SVM) [8] and                algorithm terminates when the centroid locations stay
Decision Trees (DT) [9]. Other works have made use              fixed during iteration. During each iteration the
of Multiple Adaptive Regression Splines (MARS)                  following is performed:
[10]. In [11], the authors have compared various data            For each point x, find the centroid which is closest
mining algorithms for detecting network intrusions.                   to x. Associate x with this centroid.
The authors have used Naïve Bayes algorithm in                   Re-estimate centroid locations by taking, for each
building a network intrusion detection model [12]. In                 centroid, the canter of mass of points associated
[13], the authors proposed Bayesian Belief network                    with it.
(BBN) with genetic and simulated annealing local                The K-Means algorithm is known to converge to a
search in order to build an efficient network intrusion         local minimum of the distortion measure (that is,
detection model. The authors have compared various              average squared distance from points to their class
ensemble algorithms in detecting the intrusion                  centroids). It is also known to be very simple and can
detection in [14]. Modeling intrusion detection system          be easily implemented in solving many practical
using hybrid intelligent systems is proposed in [15]. In        problems. It can work very well for compact and hyper
this, DT and SVM are combined as a hierarchical                 spherical clusters. The time complexity of the K-means
hybrid intelligent system model (DT_SVM) and an                 is O (NKd). Since, K and d are usually much less than
ensemble approach combining the base classifiers. In            N, K-means can be used to cluster large data sets.
[16], the authors propose support vector learning
approach to classify network requests. The authors in           B. Expectation Maximization (EM)
[17] used K-Means and DBSCAN to demonstrate how                 We follow the procedure as in [20]. The steps for our
cluster analysis can be used to effectively identify            implementation of EM are as follows. We initiate the
groups of traffic that are similar using only transport         process by an initial estimate of mean and standard
layer statistics. The authors propose hierarchical              deviation. The EM algorithm then searches for a
Gaussian Mixture Model (HGMM) a novel type of                   maximum likelihood hypothesis through the following
Gaussian Mixture which detects network based attacks            iterative scheme.
as anomalies using statistical processing classification             • Initialization Step: Initialize the hypothesis
                                                                         θ 0 = ( µ10 , µ2 ,...., µ k0 ) , θ k0 = µk0
in [18]. In [19], the authors use automated feature                                     0
weighting for network anomaly detection. They
conclude that their proposed method not only increases                                       (1)
the detection rate but also reduces false alarm rate as             Where k is the current number of Gaussians, σ is
well.                                                               the standard deviation, θ 0 is the estimate at 0th
We believe that an unsupervised clustering approach
                                                                    iteration, µ is the mean.
offers some advantages over supervised learning
approaches. One of the main benefits is that new                    • Expectation Step: Estimate the expected values
applications can be identified by examining the                         of the hidden variables zij (mean and standard
connections that are grouped to form a new cluster.                     deviation)      using     the     current      hypothesis
                                                                         θ = ( µ , µ ,...., µ         )
The supervised approach can not discover new                               t        t   t         t
applications and can only classify network traffic for                              1   2         k
which it has labeled training data. This prompted us to                                   � x −µt ) �
                                                                                            −( i
                                                                                                          2

use unsupervised clustering approach in building an                                  exp �
                                                                                                  k
                                                                                                      �
efficient classifier for detecting anomaly based                                          � 2σ2       �
                                                                     E ( zik )   =        �           �
network intrusions.                                                                                                          (2)
                                                                                    k       � x −µt ) 2 �
                                                                                             −( i
                                                                                   ¥ � 2σ2 �
                                                                                                    j
                                                                                        exp
                                                                                            �           �
        I. CLUSTERING ALGORITHMS                                                   j= 1
                                                                                            �           �

                                                                    where t is the number of iteration, E zij       ( )    is the
                                                                    expected value for the hidden variables (namely


                                                           18
© 2010 ACEEE
DOI: 01.ijns.01.02.04
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

     mean and standard deviation), k is the dimension,             The farthest first traversal k-center algorithm (FFT) is a
     σ is the standard deviation.                                  fast, greedy algorithm that minimizes the maximum
     • Maximization Step: provides a new estimate                  cluster radius [23]. This is also treated as an efficient
          of the parameters.                                       algorithm which always returns the right answer. The
                             n                                     pseudo code for the farthest first traversal algorithm is
                             ¥( z ) x
                              E          ik    i                   as follows:
                   µt +1 =
                    k
                             i=1
                                n                  (3)
                              ¥( zik )
                               E
                               i=1                                      • Pick any z ᆪ S and set T = { z}
                                                                        • While
          Convergence Step: if θ − θ < e , stop
                                t +1 t                                                  T <k :
     •
                                                                                       z = arg max ρ( x, T )
          (finish iteration); otherwise, go to step 2.                                         xᆪS

The hidden variables are the parameters of the model.                                  T =T ᆪ z}
                                                                                             {
In this case, we use mixture of Gaussians; hence the
hidden variables are the mean and standard deviation
for each Gaussian distribution. We start with an initial           Here,   ρ ( x, T ) is the distance from point ( x ) to the
estimate of those parameters and iteratively run the               closest point in set T. This builds a solution T one point
algorithm to find the maximum likelihood (ML) for                  at a time. It starts with any point, and then iteratively
our estimates.                                                     adds in the point farthest from the ones chosen so far.
The reason we are using EM is to fit the data better, so
that clusters are compact and far from other clusters,             The farthest point ( x ) from a set ( S ) is obtained from
since we initially estimate the parameters and iterate to          ρ ( x, T ) . Farthest first traversal takes time O ( k S ) ,
find the ML for those parameters
                                                                   which is fairly efficient and is always close to optimal
C. DBSCAN                                                          solution, in the sense that if T is the solution returned
Density-based methods are based on a simple                        by the farthest first traversal, and T * is the optimal
assumption: clusters are dense regions in the data
space that are separated by regions of lower density.                                                  ( )
                                                                   solution, then cos t ( T ) ᆪ2 cos t T
                                                                                                           *
                                                                                                               .
Their general idea is to continue growing the given
                                                                   E. Filtered Clusterer
cluster as long as the density in the neighborhood
exceeds some threshold. In other words, for each data              The Filtered Clusterer is Meta-Clusterer which offers
point within a given cluster, the neighborhood of a                the possibility to apply filters directly before the
given radius has to contain at least a minimum number              Clusterer is learnt. The structure of the filter is based
of points. These methods are good at altering out                  exclusively on the training data and test instances are
outliers and discovering clusters of arbitrary shapes.             processed by the filter without changing their structure
Density based spatial clustering of applications with              [24].
noise (DBSCAN) is an example of density based                      F. X-Means
methods [21].
                                                                   X-means is a new algorithm that quickly estimates the
The algorithm DBSCAN (Density Based Spatial
                                                                   number of clusters K. It goes into action after each run
Clustering of Applications with Noise) targeting low-
                                                                   of K-means, making local decisions about which subset
dimensional spatial data is the major representative in
                                                                   of the current centroids should split themselves in order
this category. Two input parameters ε and MinPts are
                                                                   to better fit the data. The splitting decision is done by
used to define:
                                        ε -neighborhood            computing the Bayesian Information Criterion (BIC).
1)An
                                                                   X-means have been experimented against a more
Nε ( x ) = y ΣX | d ( x, y )         ε ) of the point x,           traditional method that estimates the number of clusters
2) A core object (a point with a neighborhood                      by guessing K. X-means consistently produced better
consisting of more than MinPts points)                             clustering on both synthetic and real-life data, with
3) A concept of a point y density-reachable from a                 respect to BIC. It also runs much faster, which
core object x (a finite sequence of core objects                   prompted us to select this algorithm for our intrusion
between x and y exists such that each next belongs to              detection model, which has not been considered so far
an ε -neighborhood of its predecessor)                             for intrusion detection. A detailed description on X-
4) A density-connectivity of two points x, y (they                 Means operations can be seen in [25].
should be density-reachable from a common core
object).                                                                       IV INTRUSION DETECTION DATASET
More details about the DBSCAN can be found in [22].                The KDD Cup 1999 Intrusion detection contest data is
D.    Farthest First                                               used in our experiments. This data was prepared by
                                                                   DARPA Intrusion detection evaluation program by


                                                              19
© 2010 ACEEE
DOI: 01.ijns.01.02.04
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

MIT Lincoln Laboratory. Lincoln labs acquired nine               as the number of class labels in the dataset in order to
weeks of raw TCP dump data. The raw data was                     obtain a useful model. However, in case of sIB
processed into connection records, which contains                clustering algorithm, we need not specify the number
about 5 million connection records. The data set                 of clusters to generate a model, as it fixes the number
contains 24 attack types. These attacks fall into four           of clusters automatically.
main categories such as Denial of Service (DoS),
                                                                      B. Proposed Sequential Information Bottleneck
Probing, User to Root (U2R) and Remote to Local
                                                                      Clustering(sIB)
(R2L) attacks.
Besides the four different types of attacks mentioned            The steps for the Sequential Information Bottleneck
above, normal class needs to be detected. The data set           (sIB) Clustering are as follows.
for our experiments contained 1000 connection                         Finding the global minimum for given number
records, which is a subset of 10% KDD Cup’99                              of clusters (W)
intrusion detection benchmark dataset. These were                     Initialized with a (possibly random) partition
randomly generated from the MIT data set. Random                          of W clusters
generation of data include the number of data from                    sIB draws a sample x at random, treats it as a
each class proportional to its size, except that the                      singleton cluster and merges it to a new
smallest class is completely included. All the intrusion                  cluster cnew such that,
detection models are trained and tested with the same
data set. Although, the data set contains five different              cnew = arg min cᆪC JS ( x, c )           (4)
classes, we perform 2-class classification (either
normal or attack) as our proposed method could not                   Where JS ( .,.) is the Jensen-Shannon distance.
handle multiple class labels for building an efficient                 At each step the objective function improves
intrusion detection model.                                                 or remains un-changed.
                                                                 More details about this algorithm can be found in [26].
          V PROPOSED METHODOLOGY
The proposed methodology for classification Via                              VI EXPERIMENTAL SETUP
Clustering using Sequential Information Bottleneck               All our experiments were performed using a Pentium 4,
(sIB) principle is shown in Fig. 1 below.                        2.8 GHz CPU with 512 MB RAM. Full data set is used
                                                                 for the purpose of training in order to build an intrusion
             Featu         Clustering           Clust            detection model. Then, 10-fold cross validation method
                re         Algorithm             ers             is used in order to test the efficacy of the model built
             Select          (sIB)                               during the training phase.
   Data Samples
               ion/         Design/
             Extra         Selection                             A. Cross Validation Methods
   Knowl         Resul
              ction        Classi         Cluster
    edge
                                                                 Cross-validation (CV) tests exist in a number of
                   ts      ficatio        validatio              variants but the general idea is to divide the training
                 Interp       n               n                  data into a number of partitions or folds. The classifier
                retatio                                          is evaluated by its classification accuracy over one
                   n                                             partition after having learned from the other. This
Figure1.Proposed methodology of Classification Via               procedure is then repeated until all partitions have been
sIB clustering                                                   used for evaluation. Some of the most common types
                                                                 are 10-fold, n-fold and bootstrap CV [27]. The
A. Meta Classifier                                               difference between these three types of CV lies in the
Meta Classifier which comes under Decision                       way data is partitioned. Leave-one-out is similar to n-
committee learning has demonstrated spectacular                  fold CV, where n stands for the number of instances in
success in reducing classification error from learned            the data set. Leave–one-out or n-fold CV is performed
classifiers. These techniques develop a classifier in the        by leaving one instance out for testing and training on
form of a committee of subsidiary classifiers. The               the other instances. This procedure is then performed
committee members are applied to a classification task           until all instances have been left out once. It has been
and their individual outputs combined to create a                argued that the design of 10-fold cross-validation
single classification from the committee as a whole.             introduces a source of correlation since one is used for
These are recent methods for improving the predictive            training in one trial and then the same is used for
power of classifier learning systems.                            testing in another [28].
In our proposed methodology, we design a simple
meta-classifier that uses a clusterer for classification.                 VII RESULTS AND DISCUSSION
For cluster algorithms that use a fixed number of
                                                                 Here, we present the behavior of different unsupervised
Clusterer, like Simple K-Means, the user has to make
                                                                 approaches in building an efficient anomaly based
sure that the number of clusters to generate is the same
                                                                 network intrusion detection model. In Table 1, we


                                                            20
© 2010 ACEEE
DOI: 01.ijns.01.02.04
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

present a comparative study of our proposed sIB                    Classificati   Average        Average      Unclassified
clustering method with respect to the other existing               on       via   Detection      False        instances
approaches.                                                        Clustering     Rate (%)       positive     (%)
                                                                   Algorithms                    Rate (%)
  Table 1. Comparative Study of Existing Clustering and            K-Means        80.1           35           NIL
               Classification Algorithms                           X-Means        88.3           26.8         12.6
                                                                   Filtered       80.1           19.8         NIL
     Algorithm          Average          False Positive            Clusterer
                      Detection Rate       Rate (%)                +K-Means
                           (%)
   K-Means [29]          46.986              0.875                 Farthest       86.6           89.6         0.8
   KM-VQ [19]              48.4               10                   First (FFT)
     FES-KM-               60.1               10
      VQ[19]                                                       DBSCAN         86.7           100          18.66
  X-Means (ours)             78               36                   EM             89.1           52.3         2.2
     KDDCup                 48.57            0.225                 sIB            86.3           34           NIL
    Winner [29]                                                    (Proposed)
    GMIX [30]              53.725            29.715
     SOM [30]                47              25.685
  Nearest Cluster          47.875             0.2
        [31]                                                         VIII. CONCLUSION AND FUTURE SCOPE
    Incremental         44.57 to 84.78   15.99 to 76.83         In this paper, we have illustrated the use of
  Clustering [32]                                               classification via clustering methodology using sIB
    Cluster [32]             66                2                clustering algorithm for modeling intrusion detection
     K-NN [32]               23                6
                                                                systems. The proposed approach provides better
     FR-1 [33]              74.82            73.07
                                                                detection accuracy with comparatively low false
     FR-2 [33]              79.68            85.66
  Pure SVM [34]             57.6              35.5
                                                                positive rate in comparison to other existing
  SVM + Rocchio             51.6              44.2              unsupervised clustering algorithms. This makes the
   bundling [34]                                                approach suitable for building an efficient anomaly
  SVM +DGSOT                69.8              37.8              based network intrusion detection model. As evident
        [34]                                                    from the results none of the algorithms provide the best
  sIB (Proposed)            85.5               34               detection with zero false positive rates. Therefore, in
   ANTIDS-a [6]             69.9              ----              our future research we shall investigate other data
                                                                mining techniques with a view to enhance the detection
From, the above comparison, it is clear that our                accuracy as close as possible to 100% while
proposed method is efficient in detecting anomalous             maintaining a low false positive rate.
activities with high detection rate and relatively low
false positive rate.                                                                     REFERENCES
It is also quite evident from Table 1 and Table 2 that
the Meta classifier using sIB clustering for                     [1] D. Denning. An Intrusion detection model. IEEE
classification enhances the intrusion detection rate                  Transaction on software Engineering.SE-13(2), pp.222-
while maintaining the same false positive rate in                     232, 1998.
                                                                 [2] S. Kumar, E. H. Spafford. An application of pattern
comparison to unsupervised sIB clustering method.                     matching in Intrusion Detection. Technical report,CSD-
It can be seen from Table 2 that our proposed method                  TR-94-013. Purdue University, 1994.
is able to detect intrusions with highest accuracy in            [3] M.Mahoney, P.K.Chan. An analysis of the 199
comparison to K-Means and Filtered Clusterer with K-                  DARPA/Lincoln Laboratory evaluation data for network
Means classification via clustering methods. It can also              anomaly detection. In 6th Intl. symposium on RAID,
be observed that the proposed methodology with X-                     2003. pp. 220-237.
Means, FFT, DBSCAN and EM clustering provides                    [4] P. K. Chan, M. mahoney and M. Arshad. Learning rules
high average detection rate in comparison to our sIB                  and clusters for anomaly detection in network traffic”, in
                                                                      managing       cyber   threats,issues,approaches      and
clustering, but they are not able to classify all the
                                                                      challenges, V.Kumar,J.Srivastava, and A.Lazarevic
instances successfully. At the same time, their false                 (Edns.),Kluwer.2004.
positive rates except in case of X-Means are more than           [5] N. Foukia, S. Hassas, S. Fenet and J. Hulaas, “An
sIB clustering. These comparisons show that our                       Intrusion response scheme:Tracking the source using the
proposed classification via sIB clustering is more                    sigmery paradigm”, in Proc. Of security of mobile multi
suitable in building an efficient anomaly based                       agent system workshop (SEMAS-2002).Italy, July16,
network intrusion detection model.                                    2002.
                                                                [6] V. Ramos and Ajith Abraham, “ANTIDS-Self organised
Table 2.Comparison of our proposed Classification via                 Ant based clustering model for intrusion detection
                                                                      system. WSTST, 2005, pp.103-112.
Clustering Methods
                                                                      http://www.softcomputing.net/WSTST-ra.pdf


                                                           21
© 2010 ACEEE
DOI: 01.ijns.01.02.04
ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010

 [7] M. Brameier and W. Banzhaf, “A comparison of linear              [20]      The      Expectation    Maximization       Algorithm.
     genetic programming and neural network in medical data                  http://www.cs.unr.edu/~bebis/mathmethods/EM/lecture
     mining”, in IEEE Transaction on Evolutionary                            .pdf. Sept. 25, 2004.
     computation.5 (1), pp.17-26, 2001.                               [21] M. Ester, H.-P Kriegel, J. Sander and X. Xu, „A density-
 [8] V.N. Vapnik, The Nature of Statistical Learning Theory.                 based algorithm for discovering clusters in large spatial
     Springer, 1995.                                                         database with noise“, in Proc. of KDD-96,pp.226-
 [9] J. Denebourg ,et al. “The dynamic of collective sorting                 231,1996.
     Robot like ants and Ant like Robots”, in 1st conf. on            [22] Pavel Berkhin, “Survey of clustering data mining
     simulation of Adaptive behaviour ;from animals to                     techniques”, Technical report. Accrue software, San
     animats,cambridg,MA,MIT Press,pp.356-365,1991.                        Jose,CA,USA,pp.1-56.2002.
 [10] S. Mukkamala, A.H. Sung, A. Abraham, V. Ramos,                       http://www.ee.ucr.edu/~barth/EE242/clustering_survey.
     “Intrusion Detection Systems using Adaptive Regression                pdf
     Splines”, in ICEIS-04, 6th Int. Conf. on Enterprise              [23] A.Willium, “Clustering algorithm for categorical
     Information Systems, to appear at Kluwer Academic                       data”.2006.
     Press, 2005, Porto, 14-17 April 2004.                                   http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopo
 [11] M. Panda and M. R. Patra, “A comparative study of data                 ulous_PhD_thesis_web.pdf
      mining algorithms for network intrusion detection”,             [24] A. William, “Clustering Algorithms for categorical
      proc. of ICETET, India, 2008.pp.504-507. IEEE                        data”,                   September                   2006.
      Xplore.                                                              http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopou
 [12] M. Panda and M. R. Patra, “Network intrusion detection               los_PhD_thesis_web.pdf
      using Naïve Bayes”, International journal of computer           [25] D. Pellag and A. Moore, “X-Means: Extending K-Means
      science and network security, vol.7, No.12, 2007,                    with efficient estimation of the number of clusters”. Int.
      pp.258-263.                                                          Conf. on Machine Learning (ICML), pp.727-734, San
 [13] M. Panda and M. R. Patra. “Bayesian belief Network                   Francisco, 2000.
      using genetic local search for network intrusion”,              [26] N. Slonim, N. Friedman and N. Tishby, “Unsupervised
      International journal of secure digital information                  document classification using sIB maximization: SIGIR,
      age.Vol.1, issue.1, June 2009. In Press.                             Tempere, Finland, pp.129-136, ACM Press. ISBN: 1-
 [14] M. Panda and M. R. Patra, “Network intrusion detection               58113-561-0102/0008.
      using boosting support vector classifiers”, In 2009 IEEE        [27] N. Lavesson and P. Davidson, “Multi dimensional
      Intl.Advance computing Conference, ptaila, Punjab,                   measures function for classifier performance”, in 2 nd
      pp.926-931. IEEE Press.USA.                                          IEEE Intl. Conf. on Intelligent systems, June
 [15] S. Peddabachigari, A. Abraham, C. Grosan and J.                      2004,pp.508-513.
      Thomas, “Modelling intrusion detection system using             [28] M. A. Maloof, “On machine learning ROC analysis and
      hybrid intelligent systems”, Journal of Network and                  statistical tests of significance”, in 16 th Intl.Conf. on
      computer applications.Elsevier, 30(1), pp.114-132,                   pattern recognition, IEEE,Vol.2,2002,pp.204-207.
      2007.                                                           [29] I. Levin,”KDD-99 classifier learning context”, in
 [16] John Mill and A. Inoue, “Support vector classifier and               LLSoft’s results overview, SIGKDD explorations,
      network intrusion detection”, In proc. of 2004 IEEE                  ACMSIGKDD, Vol.1, No.2, pp.67-75, 2000.
      international conference on fuzzy system, WA, USA,              [30] V. Venkatchalam and S. Selvan, “Performance
      2004, Vol.1, pp.407-410. ISBN: 1098-7584.                             comparison of intrusion detection system classifier
[17] J. Erman, M. Arlitt and A. Mahanti, “Traffic                           using     various feature       reduction    techniques”,
       classification using clustering algorithms”, in                      International journal of simulation, Vol.9, No.1, 2008.
       SIGCOMM-06 workshops, sept.11-15, 2006, Pisa,                  [31] M. K. Sabhani and G. Sarpen, “Application of machine
       Italy.pp.281-286.ACM Press.                                         learning algorithm on KDD intrusion detection dataset
                                                                           with misuse detection context”, in Proc. of Intl.conf. on
[18] M. Bahrololum and M. Khaleghi, “Anomaly intrusion                     machine learning models,technologies and applications
     detection system using hierarchical gausian mixture                   (MLMTA 2005), Los Vegas,NV,2003,pp.209-215.
     model”, International journal of computer science and            [32] T. Hassan, M. Hashaem and A. Fahmy,” Unsupervised
     network security, vol.8, no.8, pp.264-271, August                     anomaly detection using an incremental clustering
     2008.                                                                 algorithm”, International journal of intelligent
[19] Dat Tran, W. Ma and Dharmendra Sharma,                                computing and information sciences, Vol.5, No.1,
     “Automatedfeature weighting for network anomaly                       pp.253-268, 2005.
     detection”, in International journal of computer science
     and network security, Vol.8, no.2, pp.173-178,
     Feb.2008.




                                                                 22
© 2010 ACEEE
DOI: 01.ijns.01.02.04

Mais conteúdo relacionado

Mais procurados

An approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmAn approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmeSAT Journals
 
An approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmAn approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmeSAT Publishing House
 
Survey of network anomaly detection using markov chain
Survey of network anomaly detection using markov chainSurvey of network anomaly detection using markov chain
Survey of network anomaly detection using markov chainijcseit
 
A Stacked Generalization Ensemble Approach for Improved Intrusion Detection
A Stacked Generalization Ensemble Approach for Improved Intrusion DetectionA Stacked Generalization Ensemble Approach for Improved Intrusion Detection
A Stacked Generalization Ensemble Approach for Improved Intrusion DetectionIJCSIS Research Publications
 
Intrusion Detection System Based on K-Star Classifier and Feature Set Reduction
Intrusion Detection System Based on K-Star Classifier and Feature Set ReductionIntrusion Detection System Based on K-Star Classifier and Feature Set Reduction
Intrusion Detection System Based on K-Star Classifier and Feature Set ReductionIOSR Journals
 
Yolinda chiramba Survey Paper
Yolinda chiramba Survey PaperYolinda chiramba Survey Paper
Yolinda chiramba Survey PaperYolinda Chiramba
 
Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...researchinventy
 
DB-OLS: An Approach for IDS1
DB-OLS: An Approach for IDS1DB-OLS: An Approach for IDS1
DB-OLS: An Approach for IDS1IJITE
 
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...IJMER
 
IDS IN TELECOMMUNICATION NETWORK USING PCA
IDS IN TELECOMMUNICATION NETWORK USING PCAIDS IN TELECOMMUNICATION NETWORK USING PCA
IDS IN TELECOMMUNICATION NETWORK USING PCAIJCNCJournal
 
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORK
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORKNOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORK
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORKIJNSA Journal
 
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...ijwmn
 

Mais procurados (16)

An approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmAn approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithm
 
An approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithmAn approach for ids by combining svm and ant colony algorithm
An approach for ids by combining svm and ant colony algorithm
 
Survey of network anomaly detection using markov chain
Survey of network anomaly detection using markov chainSurvey of network anomaly detection using markov chain
Survey of network anomaly detection using markov chain
 
Ij2514951500
Ij2514951500Ij2514951500
Ij2514951500
 
A Stacked Generalization Ensemble Approach for Improved Intrusion Detection
A Stacked Generalization Ensemble Approach for Improved Intrusion DetectionA Stacked Generalization Ensemble Approach for Improved Intrusion Detection
A Stacked Generalization Ensemble Approach for Improved Intrusion Detection
 
Intrusion Detection System Based on K-Star Classifier and Feature Set Reduction
Intrusion Detection System Based on K-Star Classifier and Feature Set ReductionIntrusion Detection System Based on K-Star Classifier and Feature Set Reduction
Intrusion Detection System Based on K-Star Classifier and Feature Set Reduction
 
Yolinda chiramba Survey Paper
Yolinda chiramba Survey PaperYolinda chiramba Survey Paper
Yolinda chiramba Survey Paper
 
Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...
 
DB-OLS: An Approach for IDS1
DB-OLS: An Approach for IDS1DB-OLS: An Approach for IDS1
DB-OLS: An Approach for IDS1
 
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...
A Technique by using Neuro-Fuzzy Inference System for Intrusion Detection and...
 
IDS IN TELECOMMUNICATION NETWORK USING PCA
IDS IN TELECOMMUNICATION NETWORK USING PCAIDS IN TELECOMMUNICATION NETWORK USING PCA
IDS IN TELECOMMUNICATION NETWORK USING PCA
 
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORK
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORKNOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORK
NOVEL HYBRID INTRUSION DETECTION SYSTEM FOR CLUSTERED WIRELESS SENSOR NETWORK
 
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...
A HYBRID FUZZY SYSTEM BASED COOPERATIVE SCALABLE AND SECURED LOCALIZATION SCH...
 
A45010107
A45010107A45010107
A45010107
 
J017446568
J017446568J017446568
J017446568
 
06558266
0655826606558266
06558266
 

Semelhante a A Novel Classification via Clustering Method for Anomaly Based Network Intrusion Detection System

ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERCSEIJJournal
 
Attack Detection Availing Feature Discretion using Random Forest Classifier
Attack Detection Availing Feature Discretion using Random Forest ClassifierAttack Detection Availing Feature Discretion using Random Forest Classifier
Attack Detection Availing Feature Discretion using Random Forest ClassifierCSEIJJournal
 
COPYRIGHTThis thesis is copyright materials protected under the .docx
COPYRIGHTThis thesis is copyright materials protected under the .docxCOPYRIGHTThis thesis is copyright materials protected under the .docx
COPYRIGHTThis thesis is copyright materials protected under the .docxvoversbyobersby
 
Visualize network anomaly detection by using k means clustering algorithm
Visualize network anomaly detection by using k means clustering algorithmVisualize network anomaly detection by using k means clustering algorithm
Visualize network anomaly detection by using k means clustering algorithmIJCNCJournal
 
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Editor IJARCET
 
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Editor IJARCET
 
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...
FORTIFICATION OF HYBRID INTRUSION  DETECTION SYSTEM USING VARIANTS OF NEURAL ...FORTIFICATION OF HYBRID INTRUSION  DETECTION SYSTEM USING VARIANTS OF NEURAL ...
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...IJNSA Journal
 
A combined approach to search for evasion techniques in network intrusion det...
A combined approach to search for evasion techniques in network intrusion det...A combined approach to search for evasion techniques in network intrusion det...
A combined approach to search for evasion techniques in network intrusion det...eSAT Journals
 
International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...ijcseit
 
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAIN
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAINSURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAIN
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAINijcseit
 
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...
Classification Rule Discovery Using Ant-Miner Algorithm: An  Application Of N...Classification Rule Discovery Using Ant-Miner Algorithm: An  Application Of N...
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...IJMER
 
Wmn06MODERNIZED INTRUSION DETECTION USING ENHANCED APRIORI ALGORITHM
Wmn06MODERNIZED INTRUSION DETECTION USING  ENHANCED APRIORI ALGORITHM Wmn06MODERNIZED INTRUSION DETECTION USING  ENHANCED APRIORI ALGORITHM
Wmn06MODERNIZED INTRUSION DETECTION USING ENHANCED APRIORI ALGORITHM ijwmn
 
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...IJMER
 
A novel ensemble modeling for intrusion detection system
A novel ensemble modeling for intrusion detection system A novel ensemble modeling for intrusion detection system
A novel ensemble modeling for intrusion detection system IJECEIAES
 
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...gerogepatton
 
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...IJCSIS Research Publications
 
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...ijceronline
 

Semelhante a A Novel Classification via Clustering Method for Anomaly Based Network Intrusion Detection System (20)

A6
A6A6
A6
 
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIER
 
Attack Detection Availing Feature Discretion using Random Forest Classifier
Attack Detection Availing Feature Discretion using Random Forest ClassifierAttack Detection Availing Feature Discretion using Random Forest Classifier
Attack Detection Availing Feature Discretion using Random Forest Classifier
 
COPYRIGHTThis thesis is copyright materials protected under the .docx
COPYRIGHTThis thesis is copyright materials protected under the .docxCOPYRIGHTThis thesis is copyright materials protected under the .docx
COPYRIGHTThis thesis is copyright materials protected under the .docx
 
Visualize network anomaly detection by using k means clustering algorithm
Visualize network anomaly detection by using k means clustering algorithmVisualize network anomaly detection by using k means clustering algorithm
Visualize network anomaly detection by using k means clustering algorithm
 
A45010107
A45010107A45010107
A45010107
 
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194
 
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194
 
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...
FORTIFICATION OF HYBRID INTRUSION  DETECTION SYSTEM USING VARIANTS OF NEURAL ...FORTIFICATION OF HYBRID INTRUSION  DETECTION SYSTEM USING VARIANTS OF NEURAL ...
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...
 
A combined approach to search for evasion techniques in network intrusion det...
A combined approach to search for evasion techniques in network intrusion det...A combined approach to search for evasion techniques in network intrusion det...
A combined approach to search for evasion techniques in network intrusion det...
 
International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...
 
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAIN
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAINSURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAIN
SURVEY OF NETWORK ANOMALY DETECTION USING MARKOV CHAIN
 
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...
Classification Rule Discovery Using Ant-Miner Algorithm: An  Application Of N...Classification Rule Discovery Using Ant-Miner Algorithm: An  Application Of N...
Classification Rule Discovery Using Ant-Miner Algorithm: An Application Of N...
 
Wmn06MODERNIZED INTRUSION DETECTION USING ENHANCED APRIORI ALGORITHM
Wmn06MODERNIZED INTRUSION DETECTION USING  ENHANCED APRIORI ALGORITHM Wmn06MODERNIZED INTRUSION DETECTION USING  ENHANCED APRIORI ALGORITHM
Wmn06MODERNIZED INTRUSION DETECTION USING ENHANCED APRIORI ALGORITHM
 
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
Minkowski Distance based Feature Selection Algorithm for Effective Intrusion ...
 
Layered approach
Layered approachLayered approach
Layered approach
 
A novel ensemble modeling for intrusion detection system
A novel ensemble modeling for intrusion detection system A novel ensemble modeling for intrusion detection system
A novel ensemble modeling for intrusion detection system
 
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...
A Survey on Different Machine Learning Algorithms and Weak Classifiers Based ...
 
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...
Comparative Analysis of K-Means Data Mining and Outlier Detection Approach fo...
 
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...
New Hybrid Intrusion Detection System Based On Data Mining Technique to Enhan...
 

Mais de IDES Editor

Power System State Estimation - A Review
Power System State Estimation - A ReviewPower System State Estimation - A Review
Power System State Estimation - A ReviewIDES Editor
 
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...IDES Editor
 
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...IDES Editor
 
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...IDES Editor
 
Line Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCLine Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCIDES Editor
 
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...IDES Editor
 
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingAssessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingIDES Editor
 
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...IDES Editor
 
Selfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsSelfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsIDES Editor
 
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...IDES Editor
 
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...IDES Editor
 
Cloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkCloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkIDES Editor
 
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetGenetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetIDES Editor
 
Enhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyEnhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyIDES Editor
 
Low Energy Routing for WSN’s
Low Energy Routing for WSN’sLow Energy Routing for WSN’s
Low Energy Routing for WSN’sIDES Editor
 
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...IDES Editor
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance AnalysisIDES Editor
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesIDES Editor
 
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...IDES Editor
 
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...IDES Editor
 

Mais de IDES Editor (20)

Power System State Estimation - A Review
Power System State Estimation - A ReviewPower System State Estimation - A Review
Power System State Estimation - A Review
 
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...Artificial Intelligence Technique based Reactive Power Planning Incorporating...
Artificial Intelligence Technique based Reactive Power Planning Incorporating...
 
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
Design and Performance Analysis of Genetic based PID-PSS with SVC in a Multi-...
 
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
Optimal Placement of DG for Loss Reduction and Voltage Sag Mitigation in Radi...
 
Line Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFCLine Losses in the 14-Bus Power System Network using UPFC
Line Losses in the 14-Bus Power System Network using UPFC
 
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
Study of Structural Behaviour of Gravity Dam with Various Features of Gallery...
 
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric ModelingAssessing Uncertainty of Pushover Analysis to Geometric Modeling
Assessing Uncertainty of Pushover Analysis to Geometric Modeling
 
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
Secure Multi-Party Negotiation: An Analysis for Electronic Payments in Mobile...
 
Selfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive ThresholdsSelfish Node Isolation & Incentivation using Progressive Thresholds
Selfish Node Isolation & Incentivation using Progressive Thresholds
 
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
Various OSI Layer Attacks and Countermeasure to Enhance the Performance of WS...
 
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
Responsive Parameter based an AntiWorm Approach to Prevent Wormhole Attack in...
 
Cloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability FrameworkCloud Security and Data Integrity with Client Accountability Framework
Cloud Security and Data Integrity with Client Accountability Framework
 
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP BotnetGenetic Algorithm based Layered Detection and Defense of HTTP Botnet
Genetic Algorithm based Layered Detection and Defense of HTTP Botnet
 
Enhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through SteganographyEnhancing Data Storage Security in Cloud Computing Through Steganography
Enhancing Data Storage Security in Cloud Computing Through Steganography
 
Low Energy Routing for WSN’s
Low Energy Routing for WSN’sLow Energy Routing for WSN’s
Low Energy Routing for WSN’s
 
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
Permutation of Pixels within the Shares of Visual Cryptography using KBRP for...
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance Analysis
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
 
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
Microelectronic Circuit Analogous to Hydrogen Bonding Network in Active Site ...
 
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
Texture Unit based Monocular Real-world Scene Classification using SOM and KN...
 

Último

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Último (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

A Novel Classification via Clustering Method for Anomaly Based Network Intrusion Detection System

  • 1. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 A Novel Classification via Clustering Method for Anomaly Based Network Intrusion Detection System Mrutyunjaya Panda1 and Manas Ranjan Patra2 1 Department of ECE, Gandhi Institute of Engineering and Technology, Gunupur, Orissa-765022, India Email: mrutyunjaya.2007@rediffmail.com 2 Department of Computer Science, Berhampur University, Orissa-760007, India Email: mrpatra12@gmail.com Abstract- Intrusion detection in the internet is an active intrusions by scanning for pre-classified signatures in area of research. Intruders can be classified into two network packets. On the other hand, anomaly detection types, namely; external intruders who are unauthorized can detect new intrusions while misuse detection may users of the computers they attack, and internal not. The idea behind anomaly detection is that if we intruders, who have permission to access the system but can establish a normal activity profile for a system, in with some restrictions. The aim of this paper is to present a methodology to recognize attacks during the normal theory we can flag all system states varying from the activities in a system. A novel classification via sequential established profile as intrusion attempts. However, if a information bottleneck (sIB) clustering algorithm has set of intrusive activities is not identical to the set of been proposed to build an efficient anomaly based known anomalous activities, the situation becomes network intrusion detection model. We have compared more interesting as we can find new interesting our proposed method with other clustering algorithms possibilities. Anomalous activities that are not intrusive like X-Means, Farthest First, Filtered clusters, DBSCAN, but flagged as intrusive are referred as false positives. K-Means, and EM (Expectation-Maximization) Actual intrusive activities that go undetected are called clustering in order to find the suitability of our proposed false negatives. This is a serious problem and far more algorithm. A subset of KDDCup 1999 intrusion detection benchmark dataset has been used for the experiment. serious than false positives. Anomaly detection is Results show that the proposed method is efficient in computationally expensive because of the overhead of terms of detection accuracy, low false positive rate in keeping track of and possibly updating several system comparison to the other existing methods. profiles. Recently proposed system learning rules for anomaly detection (LERAD) discovers relationships Index Terms: Intrusion Detection, Meta Classifier, among attributes in order to model application Unsupervised clustering, Sequential Information protocols [3, 4]. bottleneck clustering. In fact, intrusion detection can be regarded as a classification problem, namely; identifying normal and I INTRODUCTION other types of intrusive behavior. Hence, the key point Identifying unauthorized use, misuse and attacks on here is to choose an effective classification approach to information systems is defined as intrusion detection build accurate intrusion detection models. A number of [1, 2]. The most popular way to detect intrusions has machine learning algorithms have been applied to been done by analyzing audit data generated by intrusion detection to learn intrusion rules or build operating systems and by networks. Since almost all normal usage patterns. In this paper, we investigate and activities are logged on a system, it is possible that a evaluate the performance of classifiers using different manual inspection of these logs would allow intrusions cluster algorithms like Sequential Information to be detected. It is important to analyze the audit data Bottleneck (sIB), Farthest First Traversal (FFT), K- even an attack has occurred, for determining the extent Means, DBSCAN and EM. The motivation behind of damage occurred, this analysis helps in attack trace using the classification using clustering algorithms is to back and also in recording the attack patterns for future improve the accuracy of the intrusion detection system prevention of such attacks. An intrusion detection in comparison to the individual clustering algorithms. system (IDS) can be used to analyze audit data for The rest of the paper is organized as follows. A such insights. This makes IDS a valuable real-time literature review is presented in Section 2 followed by detection and prevention tool as well as a forensic a short theoretical background on the clustering analysis tool. algorithms used in this research in Section 3. Section 4 Traditionally, intrusion detection techniques fall into provides a brief idea about the KDDCup 1999 dataset two categories: signature detection and anomaly used. The proposed methodology is described in detection. Signature detection, or misuse detection, Section 5 followed by the experimental setup in searches for well known patterns of attacks and Section 6. Section 7 describes the results and 17 © 2010 ACEEE DOI: 01.ijns.01.02.04
  • 2. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 discussion. Finally, the paper concludes with future In our experiments, some clustering algorithms are research directions in Section 8. selected for performance comparison. Some of them are discussed as follows. II. RELATED RESEARCH A. K-Means: In [5], the authors have designed an Intrusion response We first describe the K-means algorithm for producing (IR) system cooperating with IDS using mobile agents a clustering of the points in the input into K clusters. It distributed throughout the network, based on partitions the data-points into K subsets such that all stigmergic properties. In [6], the authors introduced a points in a given subset “belong” to some centre. The self-organized ant colony based intrusion detection algorithm keeps track of the centroids of the subsets system (ANTIDS) to detect intrusions and compares and proceeds in iterations. Before the first iteration the its performance with linear genetic programming centroids are initialized to random values. The (LGP) [7], Support vector machines (SVM) [8] and algorithm terminates when the centroid locations stay Decision Trees (DT) [9]. Other works have made use fixed during iteration. During each iteration the of Multiple Adaptive Regression Splines (MARS) following is performed: [10]. In [11], the authors have compared various data  For each point x, find the centroid which is closest mining algorithms for detecting network intrusions. to x. Associate x with this centroid. The authors have used Naïve Bayes algorithm in  Re-estimate centroid locations by taking, for each building a network intrusion detection model [12]. In centroid, the canter of mass of points associated [13], the authors proposed Bayesian Belief network with it. (BBN) with genetic and simulated annealing local The K-Means algorithm is known to converge to a search in order to build an efficient network intrusion local minimum of the distortion measure (that is, detection model. The authors have compared various average squared distance from points to their class ensemble algorithms in detecting the intrusion centroids). It is also known to be very simple and can detection in [14]. Modeling intrusion detection system be easily implemented in solving many practical using hybrid intelligent systems is proposed in [15]. In problems. It can work very well for compact and hyper this, DT and SVM are combined as a hierarchical spherical clusters. The time complexity of the K-means hybrid intelligent system model (DT_SVM) and an is O (NKd). Since, K and d are usually much less than ensemble approach combining the base classifiers. In N, K-means can be used to cluster large data sets. [16], the authors propose support vector learning approach to classify network requests. The authors in B. Expectation Maximization (EM) [17] used K-Means and DBSCAN to demonstrate how We follow the procedure as in [20]. The steps for our cluster analysis can be used to effectively identify implementation of EM are as follows. We initiate the groups of traffic that are similar using only transport process by an initial estimate of mean and standard layer statistics. The authors propose hierarchical deviation. The EM algorithm then searches for a Gaussian Mixture Model (HGMM) a novel type of maximum likelihood hypothesis through the following Gaussian Mixture which detects network based attacks iterative scheme. as anomalies using statistical processing classification • Initialization Step: Initialize the hypothesis θ 0 = ( µ10 , µ2 ,...., µ k0 ) , θ k0 = µk0 in [18]. In [19], the authors use automated feature 0 weighting for network anomaly detection. They conclude that their proposed method not only increases (1) the detection rate but also reduces false alarm rate as Where k is the current number of Gaussians, σ is well. the standard deviation, θ 0 is the estimate at 0th We believe that an unsupervised clustering approach iteration, µ is the mean. offers some advantages over supervised learning approaches. One of the main benefits is that new • Expectation Step: Estimate the expected values applications can be identified by examining the of the hidden variables zij (mean and standard connections that are grouped to form a new cluster. deviation) using the current hypothesis θ = ( µ , µ ,...., µ ) The supervised approach can not discover new t t t t applications and can only classify network traffic for 1 2 k which it has labeled training data. This prompted us to � x −µt ) � −( i 2 use unsupervised clustering approach in building an exp � k � efficient classifier for detecting anomaly based � 2σ2 � E ( zik ) = � � network intrusions. (2) k � x −µt ) 2 � −( i ¥ � 2σ2 � j exp � � I. CLUSTERING ALGORITHMS j= 1 � � where t is the number of iteration, E zij ( ) is the expected value for the hidden variables (namely 18 © 2010 ACEEE DOI: 01.ijns.01.02.04
  • 3. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 mean and standard deviation), k is the dimension, The farthest first traversal k-center algorithm (FFT) is a σ is the standard deviation. fast, greedy algorithm that minimizes the maximum • Maximization Step: provides a new estimate cluster radius [23]. This is also treated as an efficient of the parameters. algorithm which always returns the right answer. The n pseudo code for the farthest first traversal algorithm is ¥( z ) x E ik i as follows: µt +1 = k i=1 n (3) ¥( zik ) E i=1 • Pick any z ᆪ S and set T = { z} • While Convergence Step: if θ − θ < e , stop t +1 t T <k : • z = arg max ρ( x, T ) (finish iteration); otherwise, go to step 2. xᆪS The hidden variables are the parameters of the model. T =T ᆪ z} { In this case, we use mixture of Gaussians; hence the hidden variables are the mean and standard deviation for each Gaussian distribution. We start with an initial Here, ρ ( x, T ) is the distance from point ( x ) to the estimate of those parameters and iteratively run the closest point in set T. This builds a solution T one point algorithm to find the maximum likelihood (ML) for at a time. It starts with any point, and then iteratively our estimates. adds in the point farthest from the ones chosen so far. The reason we are using EM is to fit the data better, so that clusters are compact and far from other clusters, The farthest point ( x ) from a set ( S ) is obtained from since we initially estimate the parameters and iterate to ρ ( x, T ) . Farthest first traversal takes time O ( k S ) , find the ML for those parameters which is fairly efficient and is always close to optimal C. DBSCAN solution, in the sense that if T is the solution returned Density-based methods are based on a simple by the farthest first traversal, and T * is the optimal assumption: clusters are dense regions in the data space that are separated by regions of lower density. ( ) solution, then cos t ( T ) ᆪ2 cos t T * . Their general idea is to continue growing the given E. Filtered Clusterer cluster as long as the density in the neighborhood exceeds some threshold. In other words, for each data The Filtered Clusterer is Meta-Clusterer which offers point within a given cluster, the neighborhood of a the possibility to apply filters directly before the given radius has to contain at least a minimum number Clusterer is learnt. The structure of the filter is based of points. These methods are good at altering out exclusively on the training data and test instances are outliers and discovering clusters of arbitrary shapes. processed by the filter without changing their structure Density based spatial clustering of applications with [24]. noise (DBSCAN) is an example of density based F. X-Means methods [21]. X-means is a new algorithm that quickly estimates the The algorithm DBSCAN (Density Based Spatial number of clusters K. It goes into action after each run Clustering of Applications with Noise) targeting low- of K-means, making local decisions about which subset dimensional spatial data is the major representative in of the current centroids should split themselves in order this category. Two input parameters ε and MinPts are to better fit the data. The splitting decision is done by used to define: ε -neighborhood computing the Bayesian Information Criterion (BIC). 1)An X-means have been experimented against a more Nε ( x ) = y ΣX | d ( x, y ) ε ) of the point x, traditional method that estimates the number of clusters 2) A core object (a point with a neighborhood by guessing K. X-means consistently produced better consisting of more than MinPts points) clustering on both synthetic and real-life data, with 3) A concept of a point y density-reachable from a respect to BIC. It also runs much faster, which core object x (a finite sequence of core objects prompted us to select this algorithm for our intrusion between x and y exists such that each next belongs to detection model, which has not been considered so far an ε -neighborhood of its predecessor) for intrusion detection. A detailed description on X- 4) A density-connectivity of two points x, y (they Means operations can be seen in [25]. should be density-reachable from a common core object). IV INTRUSION DETECTION DATASET More details about the DBSCAN can be found in [22]. The KDD Cup 1999 Intrusion detection contest data is D. Farthest First used in our experiments. This data was prepared by DARPA Intrusion detection evaluation program by 19 © 2010 ACEEE DOI: 01.ijns.01.02.04
  • 4. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 MIT Lincoln Laboratory. Lincoln labs acquired nine as the number of class labels in the dataset in order to weeks of raw TCP dump data. The raw data was obtain a useful model. However, in case of sIB processed into connection records, which contains clustering algorithm, we need not specify the number about 5 million connection records. The data set of clusters to generate a model, as it fixes the number contains 24 attack types. These attacks fall into four of clusters automatically. main categories such as Denial of Service (DoS), B. Proposed Sequential Information Bottleneck Probing, User to Root (U2R) and Remote to Local Clustering(sIB) (R2L) attacks. Besides the four different types of attacks mentioned The steps for the Sequential Information Bottleneck above, normal class needs to be detected. The data set (sIB) Clustering are as follows. for our experiments contained 1000 connection  Finding the global minimum for given number records, which is a subset of 10% KDD Cup’99 of clusters (W) intrusion detection benchmark dataset. These were  Initialized with a (possibly random) partition randomly generated from the MIT data set. Random of W clusters generation of data include the number of data from  sIB draws a sample x at random, treats it as a each class proportional to its size, except that the singleton cluster and merges it to a new smallest class is completely included. All the intrusion cluster cnew such that, detection models are trained and tested with the same data set. Although, the data set contains five different cnew = arg min cᆪC JS ( x, c ) (4) classes, we perform 2-class classification (either normal or attack) as our proposed method could not Where JS ( .,.) is the Jensen-Shannon distance. handle multiple class labels for building an efficient  At each step the objective function improves intrusion detection model. or remains un-changed. More details about this algorithm can be found in [26]. V PROPOSED METHODOLOGY The proposed methodology for classification Via VI EXPERIMENTAL SETUP Clustering using Sequential Information Bottleneck All our experiments were performed using a Pentium 4, (sIB) principle is shown in Fig. 1 below. 2.8 GHz CPU with 512 MB RAM. Full data set is used for the purpose of training in order to build an intrusion Featu Clustering Clust detection model. Then, 10-fold cross validation method re Algorithm ers is used in order to test the efficacy of the model built Select (sIB) during the training phase. Data Samples ion/ Design/ Extra Selection A. Cross Validation Methods Knowl Resul ction Classi Cluster edge Cross-validation (CV) tests exist in a number of ts ficatio validatio variants but the general idea is to divide the training Interp n n data into a number of partitions or folds. The classifier retatio is evaluated by its classification accuracy over one n partition after having learned from the other. This Figure1.Proposed methodology of Classification Via procedure is then repeated until all partitions have been sIB clustering used for evaluation. Some of the most common types are 10-fold, n-fold and bootstrap CV [27]. The A. Meta Classifier difference between these three types of CV lies in the Meta Classifier which comes under Decision way data is partitioned. Leave-one-out is similar to n- committee learning has demonstrated spectacular fold CV, where n stands for the number of instances in success in reducing classification error from learned the data set. Leave–one-out or n-fold CV is performed classifiers. These techniques develop a classifier in the by leaving one instance out for testing and training on form of a committee of subsidiary classifiers. The the other instances. This procedure is then performed committee members are applied to a classification task until all instances have been left out once. It has been and their individual outputs combined to create a argued that the design of 10-fold cross-validation single classification from the committee as a whole. introduces a source of correlation since one is used for These are recent methods for improving the predictive training in one trial and then the same is used for power of classifier learning systems. testing in another [28]. In our proposed methodology, we design a simple meta-classifier that uses a clusterer for classification. VII RESULTS AND DISCUSSION For cluster algorithms that use a fixed number of Here, we present the behavior of different unsupervised Clusterer, like Simple K-Means, the user has to make approaches in building an efficient anomaly based sure that the number of clusters to generate is the same network intrusion detection model. In Table 1, we 20 © 2010 ACEEE DOI: 01.ijns.01.02.04
  • 5. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 present a comparative study of our proposed sIB Classificati Average Average Unclassified clustering method with respect to the other existing on via Detection False instances approaches. Clustering Rate (%) positive (%) Algorithms Rate (%) Table 1. Comparative Study of Existing Clustering and K-Means 80.1 35 NIL Classification Algorithms X-Means 88.3 26.8 12.6 Filtered 80.1 19.8 NIL Algorithm Average False Positive Clusterer Detection Rate Rate (%) +K-Means (%) K-Means [29] 46.986 0.875 Farthest 86.6 89.6 0.8 KM-VQ [19] 48.4 10 First (FFT) FES-KM- 60.1 10 VQ[19] DBSCAN 86.7 100 18.66 X-Means (ours) 78 36 EM 89.1 52.3 2.2 KDDCup 48.57 0.225 sIB 86.3 34 NIL Winner [29] (Proposed) GMIX [30] 53.725 29.715 SOM [30] 47 25.685 Nearest Cluster 47.875 0.2 [31] VIII. CONCLUSION AND FUTURE SCOPE Incremental 44.57 to 84.78 15.99 to 76.83 In this paper, we have illustrated the use of Clustering [32] classification via clustering methodology using sIB Cluster [32] 66 2 clustering algorithm for modeling intrusion detection K-NN [32] 23 6 systems. The proposed approach provides better FR-1 [33] 74.82 73.07 detection accuracy with comparatively low false FR-2 [33] 79.68 85.66 Pure SVM [34] 57.6 35.5 positive rate in comparison to other existing SVM + Rocchio 51.6 44.2 unsupervised clustering algorithms. This makes the bundling [34] approach suitable for building an efficient anomaly SVM +DGSOT 69.8 37.8 based network intrusion detection model. As evident [34] from the results none of the algorithms provide the best sIB (Proposed) 85.5 34 detection with zero false positive rates. Therefore, in ANTIDS-a [6] 69.9 ---- our future research we shall investigate other data mining techniques with a view to enhance the detection From, the above comparison, it is clear that our accuracy as close as possible to 100% while proposed method is efficient in detecting anomalous maintaining a low false positive rate. activities with high detection rate and relatively low false positive rate. REFERENCES It is also quite evident from Table 1 and Table 2 that the Meta classifier using sIB clustering for [1] D. Denning. An Intrusion detection model. IEEE classification enhances the intrusion detection rate Transaction on software Engineering.SE-13(2), pp.222- while maintaining the same false positive rate in 232, 1998. [2] S. Kumar, E. H. Spafford. An application of pattern comparison to unsupervised sIB clustering method. matching in Intrusion Detection. Technical report,CSD- It can be seen from Table 2 that our proposed method TR-94-013. Purdue University, 1994. is able to detect intrusions with highest accuracy in [3] M.Mahoney, P.K.Chan. An analysis of the 199 comparison to K-Means and Filtered Clusterer with K- DARPA/Lincoln Laboratory evaluation data for network Means classification via clustering methods. It can also anomaly detection. In 6th Intl. symposium on RAID, be observed that the proposed methodology with X- 2003. pp. 220-237. Means, FFT, DBSCAN and EM clustering provides [4] P. K. Chan, M. mahoney and M. Arshad. Learning rules high average detection rate in comparison to our sIB and clusters for anomaly detection in network traffic”, in managing cyber threats,issues,approaches and clustering, but they are not able to classify all the challenges, V.Kumar,J.Srivastava, and A.Lazarevic instances successfully. At the same time, their false (Edns.),Kluwer.2004. positive rates except in case of X-Means are more than [5] N. Foukia, S. Hassas, S. Fenet and J. Hulaas, “An sIB clustering. These comparisons show that our Intrusion response scheme:Tracking the source using the proposed classification via sIB clustering is more sigmery paradigm”, in Proc. Of security of mobile multi suitable in building an efficient anomaly based agent system workshop (SEMAS-2002).Italy, July16, network intrusion detection model. 2002. [6] V. Ramos and Ajith Abraham, “ANTIDS-Self organised Table 2.Comparison of our proposed Classification via Ant based clustering model for intrusion detection system. WSTST, 2005, pp.103-112. Clustering Methods http://www.softcomputing.net/WSTST-ra.pdf 21 © 2010 ACEEE DOI: 01.ijns.01.02.04
  • 6. ACEEE International Journal on Network Security, Vol 1, No. 2, July 2010 [7] M. Brameier and W. Banzhaf, “A comparison of linear [20] The Expectation Maximization Algorithm. genetic programming and neural network in medical data http://www.cs.unr.edu/~bebis/mathmethods/EM/lecture mining”, in IEEE Transaction on Evolutionary .pdf. Sept. 25, 2004. computation.5 (1), pp.17-26, 2001. [21] M. Ester, H.-P Kriegel, J. Sander and X. Xu, „A density- [8] V.N. Vapnik, The Nature of Statistical Learning Theory. based algorithm for discovering clusters in large spatial Springer, 1995. database with noise“, in Proc. of KDD-96,pp.226- [9] J. Denebourg ,et al. “The dynamic of collective sorting 231,1996. Robot like ants and Ant like Robots”, in 1st conf. on [22] Pavel Berkhin, “Survey of clustering data mining simulation of Adaptive behaviour ;from animals to techniques”, Technical report. Accrue software, San animats,cambridg,MA,MIT Press,pp.356-365,1991. Jose,CA,USA,pp.1-56.2002. [10] S. Mukkamala, A.H. Sung, A. Abraham, V. Ramos, http://www.ee.ucr.edu/~barth/EE242/clustering_survey. “Intrusion Detection Systems using Adaptive Regression pdf Splines”, in ICEIS-04, 6th Int. Conf. on Enterprise [23] A.Willium, “Clustering algorithm for categorical Information Systems, to appear at Kluwer Academic data”.2006. Press, 2005, Porto, 14-17 April 2004. http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopo [11] M. Panda and M. R. Patra, “A comparative study of data ulous_PhD_thesis_web.pdf mining algorithms for network intrusion detection”, [24] A. William, “Clustering Algorithms for categorical proc. of ICETET, India, 2008.pp.504-507. IEEE data”, September 2006. Xplore. http://www.cse.yorku.ca/~billa/MULIC/Bill_Andreopou [12] M. Panda and M. R. Patra, “Network intrusion detection los_PhD_thesis_web.pdf using Naïve Bayes”, International journal of computer [25] D. Pellag and A. Moore, “X-Means: Extending K-Means science and network security, vol.7, No.12, 2007, with efficient estimation of the number of clusters”. Int. pp.258-263. Conf. on Machine Learning (ICML), pp.727-734, San [13] M. Panda and M. R. Patra. “Bayesian belief Network Francisco, 2000. using genetic local search for network intrusion”, [26] N. Slonim, N. Friedman and N. Tishby, “Unsupervised International journal of secure digital information document classification using sIB maximization: SIGIR, age.Vol.1, issue.1, June 2009. In Press. Tempere, Finland, pp.129-136, ACM Press. ISBN: 1- [14] M. Panda and M. R. Patra, “Network intrusion detection 58113-561-0102/0008. using boosting support vector classifiers”, In 2009 IEEE [27] N. Lavesson and P. Davidson, “Multi dimensional Intl.Advance computing Conference, ptaila, Punjab, measures function for classifier performance”, in 2 nd pp.926-931. IEEE Press.USA. IEEE Intl. Conf. on Intelligent systems, June [15] S. Peddabachigari, A. Abraham, C. Grosan and J. 2004,pp.508-513. Thomas, “Modelling intrusion detection system using [28] M. A. Maloof, “On machine learning ROC analysis and hybrid intelligent systems”, Journal of Network and statistical tests of significance”, in 16 th Intl.Conf. on computer applications.Elsevier, 30(1), pp.114-132, pattern recognition, IEEE,Vol.2,2002,pp.204-207. 2007. [29] I. Levin,”KDD-99 classifier learning context”, in [16] John Mill and A. Inoue, “Support vector classifier and LLSoft’s results overview, SIGKDD explorations, network intrusion detection”, In proc. of 2004 IEEE ACMSIGKDD, Vol.1, No.2, pp.67-75, 2000. international conference on fuzzy system, WA, USA, [30] V. Venkatchalam and S. Selvan, “Performance 2004, Vol.1, pp.407-410. ISBN: 1098-7584. comparison of intrusion detection system classifier [17] J. Erman, M. Arlitt and A. Mahanti, “Traffic using various feature reduction techniques”, classification using clustering algorithms”, in International journal of simulation, Vol.9, No.1, 2008. SIGCOMM-06 workshops, sept.11-15, 2006, Pisa, [31] M. K. Sabhani and G. Sarpen, “Application of machine Italy.pp.281-286.ACM Press. learning algorithm on KDD intrusion detection dataset with misuse detection context”, in Proc. of Intl.conf. on [18] M. Bahrololum and M. Khaleghi, “Anomaly intrusion machine learning models,technologies and applications detection system using hierarchical gausian mixture (MLMTA 2005), Los Vegas,NV,2003,pp.209-215. model”, International journal of computer science and [32] T. Hassan, M. Hashaem and A. Fahmy,” Unsupervised network security, vol.8, no.8, pp.264-271, August anomaly detection using an incremental clustering 2008. algorithm”, International journal of intelligent [19] Dat Tran, W. Ma and Dharmendra Sharma, computing and information sciences, Vol.5, No.1, “Automatedfeature weighting for network anomaly pp.253-268, 2005. detection”, in International journal of computer science and network security, Vol.8, no.2, pp.173-178, Feb.2008. 22 © 2010 ACEEE DOI: 01.ijns.01.02.04