Applying Genetic Algorithm in Intrusion Detection
System: A Comprehensive Review
Shaveta (1), Er. Abhinav Bhandari (2) and Dr. Krishan Kumar Saluja (3)
(1) Research Scholar, Department of Computer Engineering, UCOE, Punjabi University, Patiala, India
er.shaveta89@gmail.com
(2) Assistant Professor, Department of Computer Engineering, UCOE, Punjabi University, Patiala, India
bhandarinitj@gmail.com
(3) Associate Professor, Department of Computer Science and Engineering, S.B.S.C.E.T, Ferozepur, India
k.saluja@rediffmail.com
Abstract— Information Systems and Networks are subjected to electronic attacks. When
network attacks hit, organizations are thrown into crisis mode. From the IT department to
call centers, to the board room and beyond, all are fraught with danger until the situation is
under control. Traditional safeguards against these threats (e.g. firewalls, antivirus
software, password protection) do not provide complete security to the system.
This motivates researchers to develop Intrusion Detection Systems that are capable
of detecting and responding to such events. This review paper presents a comprehensive
study of Genetic Algorithm (GA) based Intrusion Detection System (IDS). It provides a
brief overview of rule-based IDS, elaborates the implementation issues of Genetic Algorithm
and also presents a comparative analysis of existing studies.
Index Terms— False Positive, Fitness Function, Genetic Algorithm (GA), Intrusion,
Intrusion Detection System (IDS)
I. INTRODUCTION
The Internet was originally designed with functionality, not security, in mind. The TCP/IP protocol
suite, the most widely used protocol suite for data communication, works on the assumption that all the hosts
participating in the communication have no malicious intention. Such design flaws open up the internet to
many opportunities for intrusion. Intrusion is a set of actions aimed at compromising the security goals
(confidentiality, integrity, availability) of a computing/networking resource [1]. Intrusion techniques may
include exploiting software bugs and system misconfigurations, password cracking, sniffing unsecured
traffic, or exploiting the design flaw of specific protocols [5]. An intruder is any user or group of users who
initiate such intrusive actions. Intruders can be divided into two groups, external and internal. The former
refers to those who do not have authorized access to the system and who attack by using various penetration
techniques. The latter refers to those with access permission who wish to perform unauthorized activities [6].
Attacks are growing in number and becoming more sophisticated. Attempts to breach information
security rise every day, helped by vulnerability assessment tools that are widely available on the
Internet, both free and commercial. Tools such as SubSeven, Back Orifice, Nmap and L0phtCrack
can be used to scan, identify, probe and penetrate systems. With the help of
such tools even the best security measures can be breached. The key targets of the attackers include banks,
DOI: 02.ITC.2014.5.46
© Association of Computer Electronics and Electrical Engineers, 2014
Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC
law firms and corporations. According to the report published by Symantec Corporation [14] for
November 2013, the number of targeted attacks increased, 438 new vulnerabilities were discovered
(bringing the total for the year to 5965), two zero-day vulnerabilities were discovered and 42 million
identities were exposed. A successful targeted attack on a large company can cost it $2.4 million in
direct financial losses and additional costs. For a small or medium-sized company, a targeted attack can mean
about $92,000 in damages, almost twice as much as an average attack [10]. Attention therefore turns to
Intrusion Detection Systems, which monitor network traffic to identify misuse, unauthorized use and
abuse of resources, and which act as defined by security policies. Intrusion detection systems
perform the following functions:
• Monitoring and analysis of user and system activity
• Auditing of system configurations and vulnerabilities
• Assessing the integrity of critical system and data files
• Statistical analysis of activity patterns based on matching to known attacks
• Abnormal activity analysis and operating system audit
The majority of existing IDSs suffer from low detection rates and high false alarm rates,
and can therefore obstruct legitimate users from accessing network resources. These
problems stem from the sophistication of attacks and their deliberate similarity to normal behavior. To
overcome these problems, a Genetic Algorithm based Intrusion Detection System is
employed to improve the detection of rare and complicated attacks.
The rest of the paper is organized as follows: Section 2 provides a brief introduction to Intrusion Detection
System. Section 3 describes the implementation issues of Genetic Algorithm. Section 4 describes the
technique of applying Genetic Algorithm to Intrusion Detection System. Section 5 presents the related work
and a comparative analysis of existing studies. Finally, the discussion is concluded.
II. INTRUSION DETECTION SYSTEM
Intrusion detection is the process of identifying and responding to events which violate computer
security policies, acceptable use policies or standard security practices. An Intrusion Detection System (IDS)
is a security system which implements the process of intrusion detection and reports intrusions accurately
to the appropriate authority. The IDS monitors packets from various network connections in order to detect
intrusive activity [1]. If an intrusion is detected, the IDS either logs a message to the system audit file to
be analyzed later by network security experts, stops the offending connections to end the intruder's attack, or
performs some other action defined by the organization’s rules and practices to provide security, handle
the intrusion and recover from the damage caused by security breaches [1]. These systems are not always
accurate, and false alarms can occur.
A. Components of IDS
The basic architecture of an intrusion detection system is explained below [2] [16] and presented in figure 1:
• Data Source: Data sources fall into four categories, namely Host-based monitors, Network-
based monitors, Application-based monitors and Target-based monitors.
• Data gathering device (sensor): It is responsible for collecting data from the monitored system.
• Analysis Engine (detector): This component takes information from the sensors and examines the data in
order to detect attacks. The analysis engine can use various analysis approaches, e.g. misuse/signature
based detection or anomaly/statistical detection.
• Knowledge base: It is a database which contains information collected by the sensors, but in preprocessed
format (e.g. knowledge base of attacks and their signatures, filtered data, data profiles, etc.). This
information is usually provided by network and security experts.
• Configuration device: It provides information about the current state of the intrusion detection system
(IDS).
• Response Manager: The response manager acts only when an intrusion is detected and performs the
necessary action as defined by the security policies of the organization. These actions can be either
automated (active) or involve human interaction (inactive).
Figure1. Basic Architecture of Intrusion Detection System
B. Characteristics of IDS
An IDS must have the following characteristics [2]:
• Prediction performance: Typical measures for evaluating the predictive performance of an IDS are the detection
rate and the false alarm rate. The detection rate is the ratio of the number of correctly detected attacks
to the total number of attacks. The false alarm (or false positive) rate is the ratio of the number of normal
connections that are incorrectly classified as attacks to the total number of normal connections. Therefore,
a good IDS must have a high detection rate and a low false positive rate.
• Time performance: The total time taken by an IDS to generate an alarm should be as short as possible. It
consists of a processing time and a propagation time. The processing time depends upon the processing speed
of the IDS, which is the rate at which the IDS processes audit events. If this rate is not sufficiently high,
then real-time processing of security events may not be feasible. The propagation time is the time needed for
processed information to propagate to the security analyst. Both times need to be as short as possible in
order to allow the security analyst sufficient time to react to an attack before much damage has been done,
as well as to stop an attacker from modifying audit information or altering the IDS itself.
• Fault tolerance: An IDS should be robust, dependable and resistant to attacks and should be able to
recover quickly. This characteristic is very important for the proper functioning of IDSs, since most
commercial IDSs run on operating systems and networks that are vulnerable to different types of attacks.
In addition, an IDS should withstand scenarios in which an adversary causes it to generate a
large number of false or misleading alarms. Such alarms may easily have a negative impact on the
availability of the system, and the IDS should be able to overcome these obstacles quickly.
• Dynamic reconfiguration: An IDS must be dynamically reconfigurable so that the time spent on reconfiguring
the system is as short as possible.
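The two prediction-performance measures defined above can be computed directly from labelled connections. The following is a minimal sketch, assuming boolean labels (True for attack, False for normal); the function name and the sample data are hypothetical and serve only to illustrate the definitions.

```python
def detection_metrics(actual, predicted):
    """Return (detection_rate, false_alarm_rate) for two equal-length label lists.

    Labels are booleans: True means "attack", False means "normal".
    """
    if len(actual) != len(predicted):
        raise ValueError("label lists must have equal length")
    # Attacks that the IDS correctly flagged.
    true_pos = sum(1 for a, p in zip(actual, predicted) if a and p)
    # Normal connections that the IDS wrongly flagged.
    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    attacks = sum(1 for a in actual if a)
    normals = len(actual) - attacks
    detection_rate = true_pos / attacks if attacks else 0.0
    false_alarm_rate = false_pos / normals if normals else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical sample: 4 attacks followed by 6 normal connections.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
dr, far = detection_metrics(actual, predicted)
print(f"detection rate = {dr:.2f}, false alarm rate = {far:.2f}")
```

In this sample three of the four attacks are detected and one of the six normal connections raises a false alarm.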
C. Taxonomy of IDS’s
The IDSs are generally classified [9] as shown in the figure 2:
Figure2: Taxonomy of IDS’s
[Figure 1 diagram labels: Data Source (Monitored System); Data gathering (sensors); Analysis Engine; Knowledge base; Configuration; Response Component. Flow labels: Raw data; Events; System state; Actions.]
[Figure 2 diagram labels: IDS Classification. By location: Host-based IDS, Network-based IDS. By detection model: Misuse Detection, Anomaly Detection.]
By location (or by scope of protection):
Intrusion Detection Systems can be divided into the following two types depending on the location where they
look for intrusive actions:
• Host-based IDS (HIDS): A host-based IDS loads a piece of software on the system to be monitored. This
software evaluates the information associated with the system, including the contents of operating system,
system and application files. If any critical file is deleted or modified, an alert message is sent to the
administrator for further investigation.
• Network-based IDS (NIDS): A network-based IDS identifies intrusive activities by analyzing the stream of
packets which travel across the network.
By detection model:
Intrusion Detection Systems can also be classified into the following categories on the basis of their detection
approach:
• Misuse detection (or signature based detection): These systems work by matching user activity with stored
signatures of known attacks. Such detection systems use a predefined knowledge base to check whether
a new network connection matches an entry in it. If it does, the IDS considers the connection a
possible attack and blocks it.
• Anomaly detection (or behavior detection): In this case, the system learns the characteristics of normal
user activities and then uses these characteristics to judge whether a new user's activity is normal or not.
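The difference between the two detection models can be made concrete with a small sketch. Everything here is an assumption made for illustration: the signature format (destination port plus a payload marker), the use of connection duration as the only profiled feature, and the three-standard-deviation threshold are not taken from the surveyed systems.

```python
from statistics import mean, stdev

# Hypothetical signature base: (destination port, payload marker) pairs of known attacks.
KNOWN_SIGNATURES = {(21, "overflow"), (79, "flood")}


def misuse_detect(connection):
    """Misuse detection: flag a connection that matches a stored signature."""
    return (connection["dst_port"], connection["marker"]) in KNOWN_SIGNATURES


def build_profile(normal_durations):
    """Anomaly detection, step 1: learn a profile (mean, spread) of normal behaviour."""
    return mean(normal_durations), stdev(normal_durations)


def anomaly_detect(duration, profile, threshold=3.0):
    """Anomaly detection, step 2: flag values that lie far from the learned profile."""
    mu, sigma = profile
    return abs(duration - mu) > threshold * sigma


# Misuse detection only recognises what is already in the signature base.
known = {"dst_port": 21, "marker": "overflow"}
unknown = {"dst_port": 8080, "marker": "novel"}
print(misuse_detect(known), misuse_detect(unknown))

# Anomaly detection needs no signatures, only examples of normal activity.
profile = build_profile([1.0, 1.2, 0.9, 1.1, 1.0])
print(anomaly_detect(10.1, profile), anomaly_detect(1.05, profile))
```

The sketch also shows the usual trade-off: the signature check misses the unknown connection, while the profile check can flag unseen behaviour but depends entirely on how well the normal profile was learned.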
III. GENETIC ALGORITHM
The Genetic Algorithm is a probabilistic search algorithm that iteratively transforms a set (called population)
of mathematical objects (typically fixed-length binary character strings called chromosomes), each with an
associated fitness value, into a new population of offspring objects using operations that are patterned after
naturally occurring genetic operations, such as crossover and mutation [8]. The Genetic Algorithm is inspired
by the natural selection processes that lead to the survival of the fittest [13]. In the last few years,
genetic algorithms have emerged as practical, robust optimization and search methods. Genetic Algorithms
represent an intelligent exploitation of random search to solve optimization problems. GAs, although
randomized, exploit historical information to direct the search into regions of better performance within
the search space.
A. Working Principle of GA:
The working principle of GA is explained as follows [17]. A Genetic Algorithm begins with a set of candidate
solutions to the problem. Each solution is represented by a chromosome-like data structure. Solutions from
one population are selected and used to generate a new population, in the expectation that
the new population will be better than the old one. Solutions are selected according to their fitness:
the more suitable they are, the more chances they have to reproduce. This is repeated until some
condition (e.g. a fixed number of generations is reached, or the best solution stops improving) is satisfied. The
pseudo-code for GA is as shown below.
Pseudo-code:
BEGIN
  INITIALISE population with random candidate solutions;
  EVALUATE each candidate;
  REPEAT UNTIL (termination condition) is satisfied DO
    1. SELECT parents;
    2. RECOMBINE pairs of parents;
    3. MUTATE the resulting offspring;
    4. EVALUATE the new candidates;
    5. SELECT individuals for the next generation;
  OD
END
B. Encoding of solutions as chromosomes:
Before using genetic algorithm to solve any problem it is necessary to encode the potential solutions to that
problem in a form which can be processed by a computer [17]. One common approach is to encode the
solutions as binary strings: sequences of 1’s and 0’s, where each digit represents the value of some aspect of
the solution. Each solution is represented in the form of a chromosome. Different positions in a chromosome
are referred to as genes and are changed randomly within a range during the process of evolution.
Example:
A Gene may look like: 1101
A chromosome may look like: Gene1 Gene2 Gene3 Gene4
                            1101  1001  1111  1011
Binary string representation of above chromosome: 1101100111111011
Other methods of encoding include encoding values as integers or real numbers or any element (E11 E3
E7…E1 E15) or list of rules (R1 R2 R3…R22 R23) or any data structure. The selection of the encoding
method depends upon the attributes of the problem to be solved.
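The binary encoding in the example above can be reproduced with a few lines of code. This is a minimal sketch that assumes every gene is a non-negative integer stored in a fixed number of bits (four here, as in the example); the function names are hypothetical.

```python
def encode(genes, bits_per_gene=4):
    """Concatenate integer genes into one fixed-length binary string."""
    for g in genes:
        if not 0 <= g < 2 ** bits_per_gene:
            raise ValueError(f"gene {g} does not fit in {bits_per_gene} bits")
    return "".join(format(g, f"0{bits_per_gene}b") for g in genes)


def decode(chromosome, bits_per_gene=4):
    """Split a binary string back into its integer genes."""
    if len(chromosome) % bits_per_gene:
        raise ValueError("chromosome length is not a multiple of the gene width")
    return [int(chromosome[i:i + bits_per_gene], 2)
            for i in range(0, len(chromosome), bits_per_gene)]


genes = [0b1101, 0b1001, 0b1111, 0b1011]  # the four genes from the example
chromosome = encode(genes)
print(chromosome)          # 1101100111111011
print(decode(chromosome))  # [13, 9, 15, 11]
```

A fixed gene width keeps gene boundaries at known positions, so crossover and mutation can operate on the string without any extra bookkeeping.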
C. Steps involved in basic Genetic Algorithm:
The various steps involved in GA are explained below [17] and the overall flow chart is presented in figure 3:
Step 1: [Start] Generate random population of ‘n’ chromosomes each representing a different solution to the
problem.
Step 2: [Fitness] Evaluate fitness f(x) of each chromosome ‘x’ in the population.
Step 3: [New population] Generate a new population by repeating the following steps until the new population is
complete:
a. [Selection] Select two parent chromosomes from the population according to their fitness (the higher the
fitness, the greater the chance of selection).
b. [Crossover] With a crossover probability, cross over the parents to generate new offspring.
Crossover can be one-point or multi-point. If no crossover is performed, the offspring are
exact copies of the parents.
c. [Mutation] With a mutation probability, mutate the new offspring (i.e. randomly flip some bits).
d. [Accepting] Place the new offspring in the new population.
Step 4: [Replace] Use the new population for the next run of the algorithm.
Step 5: [Test] If the end condition is satisfied, stop and return the best solution in the current population.
Step 6: [Loop] Go to Step 2.
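The steps above can be sketched as a short program. This is an illustrative sketch only: the toy fitness function (the count of 1 bits), the roulette-wheel selection, and all parameter values are assumptions chosen to keep the example small, not settings recommended by the surveyed papers.

```python
import random


def fitness(chromosome):
    """Toy fitness (an assumption for illustration): the number of 1 bits."""
    return sum(chromosome)


def select(population, scores, rng):
    """Roulette-wheel selection: pick one parent with probability proportional to fitness."""
    # A small constant keeps the weights valid even if every score is zero.
    weights = [s + 1e-9 for s in scores]
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rate, rng):
    """One-point crossover; if no crossover happens the offspring are copies of the parents."""
    if rng.random() < rate:
        point = rng.randrange(1, len(a))
        return a[:point] + b[point:], b[:point] + a[point:]
    return a[:], b[:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with the given mutation probability."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


def run_ga(n=20, length=16, generations=50, cx_rate=0.8, mut_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of n chromosomes.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    for _ in range(generations):
        # Step 2: evaluate the fitness of every chromosome.
        scores = [fitness(c) for c in population]
        # Step 5: stop once a chromosome of all ones has been found.
        if max(scores) == length:
            break
        # Step 3: build the new population by selection, crossover and mutation.
        new_population = []
        while len(new_population) < n:
            p1 = select(population, scores, rng)
            p2 = select(population, scores, rng)
            c1, c2 = crossover(p1, p2, cx_rate, rng)
            new_population.append(mutate(c1, mut_rate, rng))
            if len(new_population) < n:
                new_population.append(mutate(c2, mut_rate, rng))
        # Step 4: the new population replaces the old one.
        population = new_population
    return max(population, key=fitness)


best = run_ga()
print(best, fitness(best))
```

The fixed seed only makes runs repeatable. In practice the population size, crossover rate and mutation rate have to be tuned to the problem.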
Figure3: Overall flow of GA
[Flow chart labels: Start; Generate random population; Apply Fitness Function; Optimization criteria met? (Yes: Result; No: Selection, Crossover, Mutation)]
The basic genetic algorithm is straightforward, but applying it can be complex in practice. The values of
various parameters (for example, mutation rate, crossover rate, population size, chromosome size, number of
generations, and selection process) need to be chosen by considering the attributes of the
problem being solved. A Genetic Algorithm is typically used when alternative solutions are too slow or
too complicated, when an exploratory tool is required to examine new approaches, or when the benefits of GA
meet key problem requirements. The advantages of using a Genetic Algorithm are [8]:
• Always gives an answer
• Answer gets better with time
• Inherently parallel
• Easily re-trainable
• Multiple ways to speed up and improve a GA-based application as knowledge about the problem domain is
gained
• Easy to exploit previous or alternate solutions
• The different operators used in a genetic algorithm help it avoid getting stuck in local maxima
D. Limitations of Genetic Algorithm
Genetic algorithms are efficient, but in practice they have certain limitations:
• It is not always easy to find a fitness function.
• Representing a problem space in genetic algorithms is very complex.
• It is a tough task to choose the optimal parameters for a genetic algorithm.
• Genetic algorithms need a large number of fitness function evaluations.
• It is not easy to configure a genetic algorithm based system.
IV. GENETIC ALGORITHM BASED SYSTEM MODEL
Genetic Algorithm can be used in different ways in intrusion detection systems. If the Intrusion Detection
System is modelled as a rule-based system, then GA can be used as a tool to generate rules for the rule-
based IDS. The goal of the system is not to evolve a single best rule (a global optimum), but to create a set of
rules which is good enough to detect attacks. The system works by analysing network connections.
Figure 4 describes the overall flow of a GA based IDS. The system works in two phases: a training phase and
a testing phase.
A. Training phase
In this phase, a set of classification rules is generated from network audit data using Genetic Algorithm in an
offline environment. The training data set contains analysed logs of connections which clearly distinguish
between normal connections and attacks. Examples of such data sets include KDD Cup99 and
DARPA. The records from the training data set are represented in the form of chromosomes. Each
chromosome is a rule within which certain features of a connection are encoded as a fixed-length
vector. A fitness function is then applied to each chromosome in order to evaluate its goodness. If a
chromosome helps to identify an attack correctly, it is considered good (or fit); otherwise it is considered bad.
Crossover and mutation operations are applied to the good chromosomes in order to produce a new generation.
The entire process is then repeated on the newly generated population. This process of evolution continues
until a solution is reached (i.e. a set of rules capable of detecting attacks is generated). The generated rules
are stored in a rule base in the following form:
if { condition } then { act }
For example, a rule can be defined as [1]:
if {the connection has following information: source IP address 124.12.5.18; destination IP address:
130.18.206.55; destination port number: 21; connection time: 10.1 seconds} then {stop the connection}
Explanation: if there is a network connection request with source IP address 124.12.5.18, destination IP
address 130.18.206.55, destination port number 21, and connection time 10.1 seconds, then stop the
connection establishment, since IP address 124.12.5.18 is recognized by the IDS as a blacklisted IP address.
Thus, any service request initiated from it is rejected.
The various steps involved in training phase are [1]:
1. Encoding of connections – Consider the following case [13], where six features of a network connection
are used to identify an attack. The dataset used in this case is the DARPA dataset, which contains 7 features
of a connection including the attack name. Normal connections contain no attack name. Each
chromosome is a rule within which the 7 features are encoded as a fixed-length vector, and each feature is
encoded as one or more genes of different types, as shown in the table below.
TABLE I. CHROMOSOME REPRESENTATION OF A RULE
Sr. no | Feature              | Feature Explanation                            | Format  | Number of Genes
1.     | Duration             | Time period of the connection                  | H:M:S   | 3
2.     | Protocol             | Protocol used for making connection            | Numeric | 1
3.     | Source Port          | Application that the attacker system is running | Numeric | 1
4.     | Destination Port     | Application that the target system is running  | Numeric | 1
5.     | Source IP            | Attacker system’s IP address                   | a.b.c.d | 4
6.     | Destination IP       | Target system’s IP address                     | a.b.c.d | 4
7.     | Attack name and type | Name of the attack                             | string  | 1
Each rule uses an if-then clause with a “condition” part and an “outcome” part. The first six features are
connected via logical AND to form the “condition” part, while the attack name is the “outcome”, which gives
the classification of a network record (during training) or of a connection (during intrusion detection) if the
rule is matched. For example, consider the following rule [13]:
if (duration=“0:0:1” and protocol=“finger” and source_port=18982 and destination_port=79 and
source_ip=“9.9.9.9” and destination_ip=“172.16.112.50”) then (attack_name=“neptune”)
The above rule expresses that if a network packet originates from IP address 9.9.9.9 and port 18982, is
sent to IP address 172.16.112.50 and port 79 using the protocol finger, and the connection duration is 1
second, then it is most likely a network attack of type neptune that may eventually put the destination host
out of service. The above rule can be represented as follows:
{0, 0, 1, 2, 18982, 79, 9, 9, 9, 9, 172, 16, 112, 50, 1}
2. Evaluating each chromosome using fitness function – During the training phase, chromosomes are
evaluated in order to determine their goodness. If a chromosome correctly classifies an
attack, it is considered good; otherwise it is bad and is not selected for crossover to produce offspring. Thus, a
chromosome which detects more attacks has a higher fitness value and a higher chance of selection. The
fitness models proposed by various researchers include the support and confidence model, the reward-penalty
model and the weighted sum model.
3. Selection – Different selection methods are used to choose the chromosomes, e.g. Fitness-
proportion selection, Roulette-wheel selection, Rank selection, Local selection, Tournament selection and
Steady state selection [6].
4. Crossover – With a crossover probability, cross over the parents to generate new offspring. Crossover can
be one-point or multi-point. If no crossover is performed, the offspring are exact copies of the parents.
5. Mutation – Each gene in a chromosome may or may not change, depending on the mutation
rate. Mutation maintains the population diversity that the search needs.
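The encoding and fitness-evaluation steps above can be sketched together. The sketch assumes the 15-gene layout of Table I and the support and confidence model listed in Table II (F = w1 * support + w2 * confidence). The numeric code tables, the label 0 for normal records, the weight values, and the exact-equality matching are all assumptions made for illustration; the surveyed systems may match rules differently.

```python
# Assumed code tables, chosen so that the example rule reproduces the vector in the text.
PROTOCOL_CODES = {"finger": 2}
ATTACK_CODES = {"neptune": 1}  # 0 is assumed to label a normal record


def encode_rule(duration, protocol, src_port, dst_port, src_ip, dst_ip, attack):
    """Encode one rule as a 15-gene vector following the layout of Table I."""
    hours, minutes, seconds = (int(x) for x in duration.split(":"))
    genes = [hours, minutes, seconds, PROTOCOL_CODES[protocol], src_port, dst_port]
    genes += [int(x) for x in src_ip.split(".")]
    genes += [int(x) for x in dst_ip.split(".")]
    genes.append(ATTACK_CODES[attack])
    return genes


def support_confidence_fitness(rule, records, w1=0.2, w2=0.8):
    """Return F = w1 * support + w2 * confidence for the rule "if A then B".

    A is the 14-gene condition and B is the final attack gene. Training
    records are encoded in the same way as rules. The weights are assumed values.
    """
    condition, outcome = rule[:-1], rule[-1]
    n = len(records)
    matches_a = [r for r in records if r[:-1] == condition]  # |A|
    matches_ab = [r for r in matches_a if r[-1] == outcome]  # |A and B|
    support = len(matches_ab) / n if n else 0.0
    confidence = len(matches_ab) / len(matches_a) if matches_a else 0.0
    return w1 * support + w2 * confidence


rule = encode_rule("0:0:1", "finger", 18982, 79, "9.9.9.9", "172.16.112.50", "neptune")
print(rule)

# Hypothetical training records: two attacks, one normal record with the same
# condition, and one unrelated normal record.
records = [
    rule,
    rule[:-1] + [0],
    [0, 0, 5, 2, 1024, 80, 10, 0, 0, 1, 172, 16, 112, 50, 0],
    rule,
]
print(round(support_confidence_fitness(rule, records), 4))
```

With these records the rule has support 2/4 and confidence 2/3, so a larger w2 rewards rules that rarely fire on normal traffic, while a larger w1 rewards rules that cover many training records.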
B. Testing phase
In this phase, the rules stored in the rule base are used to detect whether a real-time network connection is a
normal connection or an intrusive attack. If the characteristics of new connection match with the ‘condition’
section of some pre-defined rule in the rule-base then the connection is considered as an attack else it is
considered as a normal connection. If an attack is detected then IDS performs the necessary actions defined
by the security policies of the organization. The algorithm for GA-based IDS is presented below.
Algorithm: Intrusion Detection [1]
Input: Inflowing network connection
Output: Decision if connection is intrusive or not
Loop forever {fetch incoming packet}
    for each rule in rule-base
        Match rule with network connection (analysis console)
        if rule matches then
            Mark current connection as an intrusion (and generate an alarm as per security policies)
        end if
    end for each
end loop forever
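The detection loop above can be sketched as follows. This is a simplified illustration rather than the implementation from [1]: the Rule class, the dictionary representation of a connection, and the finite list standing in for the endless packet stream are all assumptions. The single rule in the rule base is the blacklisted-address example from the training phase description.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: dict  # feature name -> value the connection must have
    action: str      # what the IDS does when the rule matches


def matches(rule, connection):
    """A rule matches when every feature in its condition equals the connection's value."""
    return all(connection.get(name) == value for name, value in rule.condition.items())


def detect(connection, rule_base):
    """Return the first matching rule, or None if the connection looks normal."""
    for rule in rule_base:
        if matches(rule, connection):
            return rule
    return None


rule_base = [
    Rule(
        condition={
            "src_ip": "124.12.5.18",
            "dst_ip": "130.18.206.55",
            "dst_port": 21,
            "duration": 10.1,
        },
        action="stop the connection",
    )
]

# A finite list stands in for the stream of incoming connections.
incoming = [
    {"src_ip": "124.12.5.18", "dst_ip": "130.18.206.55", "dst_port": 21, "duration": 10.1},
    {"src_ip": "10.0.0.7", "dst_ip": "130.18.206.55", "dst_port": 80, "duration": 0.4},
]
for connection in incoming:
    hit = detect(connection, rule_base)
    if hit is not None:
        print("intrusion:", connection["src_ip"], "->", hit.action)
    else:
        print("normal:", connection["src_ip"])
```

Returning the matching rule rather than a plain flag lets the caller apply whatever action the security policy attaches to that rule.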
Figure4: Overall flow of GA based IDS
V. RELATED WORK
The Intrusion Detection System has undergone rapid changes and is using new evolved techniques to
generate better results. Genetic Algorithm can be used in different ways in Intrusion Detection Systems.
Genetic Algorithm based intrusion detection approach discussed in this review paper is focused on a rule
based Intrusion Detection System which uses only Genetic Algorithm to generate knowledge. For this
purpose network connections are analysed to describe the normal and abnormal behaviour in the network.
This section briefly summarizes some of the GA based IDSs and presents a comparative analysis of various
existing studies in table 2.
An early effort to use GAs for intrusion detection dates back to 1995, when Crosbie and Spafford
[12] applied multiple agent technology and GP (Genetic Programming) to detect network anomalies.
Each agent monitors one parameter of the network audit data, and GP is used to find the set of agents that
collectively determine anomalous network behaviors. This method has the advantage of using many small
autonomous agents, but communication among them remains a problem. The training process can also be
time consuming if the agents are not appropriately initialized.
Wei Li [11] proposes a GA-based method to detect anomalous network behaviors. This implementation of
genetic algorithm is unique as it considers both temporal and spatial information of network connections in
encoding the network connection information into rules in IDS. This may lead to increased detection rates.
However, no experimental results are available yet.
Ren Hui Gong, Mohammad Zulkernine and Purang Abolmaesumi [13] present a method of applying Genetic
Algorithm for intrusion detection. Seven network features including both categorical and quantitative data
fields are used when encoding and deriving the rules. A simple but efficient and flexible fitness function, i.e.
the support-confidence framework, is used to judge the quality of each rule. Depending on the selection of
fitness function weight values, the generated rules can be used either to detect network intrusions in general or
to classify the types of intrusions precisely. The method has been implemented using Java and the third-party
package ECJ. The implementation has been tested using subsets of the 1998 DARPA dataset. Experimental
results show that the proposed method works efficiently and is flexible enough to be used in different ways.
[Figure 4 flow chart labels: Start; Training Dataset; Evolution of rules using Genetic Algorithm; Testing Dataset; Analysis of new connections using rules from rule-base; Attack Detected? (Yes: Alert; No)]
However, some limitations of the method are also observed. First, the generated rules are biased towards the
training dataset. This issue may be resolved by carefully selecting either the number of generations in the
training phase or the number of top best-fit rules in the intrusion detection phase. Second, while the support-
confidence framework is simple to implement and improves the accuracy of the final rules, it requires the
whole training data set to be loaded into memory before any computation. For large training datasets, this is
neither efficient nor feasible. Some form of caching may solve the problem.
Anup Goyal and Chetan Kumar [3] describe a GA based IDS that classifies different types of network attacks
with a very low false positive rate (0.2%) and an almost 100% detection rate. The algorithm takes into
consideration different features of network connections, such as type of protocol, network service on the
destination and status of the connection, to generate a classification rule set. Each rule in the rule set
identifies a particular attack. The fitness function is designed to be biased towards individuals that
correctly classify only the attack connections. The experiments are performed on the KDDCup99 data set.
The generated rule set consists of six rules that can be applied to the IDS to identify and classify six different
types of attack connections that fall into two classes, namely Denial of Service (DoS) and Probing attacks.
The GALIB C++ library, which is designed for developing GAs, is used to implement the proposed system.
Bader and Nasereddin [5] discuss a technique of using Genetic Algorithm for Intrusion Detection System.
This implementation considers both temporal and spatial information of network connections in encoding the
network connection information into rules in IDS. The network traffic used for implementing GA is a pre-
classified data set that differentiates normal network connections from anomalous ones. This data set is
gathered using network sniffers (programs that record network traffic without interfering with it)
such as Tcpdump or Snort. The data set is manually classified based on the knowledge of experts. The rules
generated are good enough for filtering new network traffic. The attributes of network connections
used for generating rules are: source IP address, destination IP address, source port number,
destination port number, duration, state, protocol, number of bytes sent by originator and number of bytes
sent by responder.
B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj and T. Bharat [4] suggest an intrusion detection system that
uses a genetic algorithm to generate a rule set for eight types of attacks belonging to four categories. The
proposed architecture uses the KDDCUP99 dataset. The dataset contains 41 features, of which only 3
have been used to specify each entry of the dataset. The architecture of the system and the software
implementation of the proposed technique are also discussed. The system created a specified set of rules and
achieved high detection rates for DoS (Denial of Service), R2L (Remote to Local), U2R (User to Root) and
Probe attacks. The average success rate achieved during experiments is 83.65%. The proposed system is
flexible enough for use in different application areas. It is implemented using C# on the .NET platform.
Firas Alabsi and Reyadh Naoum [7] recommend a new fitness function that uses a Reward-Penalty technique to
evaluate the chromosomes efficiently. A 5% subset of the KDDCUP’99 data has been used for the proposed
system. The proposed fitness function works on the principle that reward and penalty are proportionate to the
strength and weakness of chromosomes. To validate the new fitness function, the results
of the reward-penalty based fitness function are compared with the results of the support-confidence
based fitness function. The results closely match each other. The system has been built using
VB.NET 2010 and SQL Server 2008.
A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs) and F.O. Aghware [1] present a genetic algorithm
based approach which uses rules derived from network audit data for network intrusion detection. The fitness
function is based on the support-confidence framework and is simple, efficient and
flexible. The training and testing data set is the 1998 DARPA data set from MIT Lincoln Laboratory. The study
implemented a GA based IDS in the C programming language on the Linux operating system. However,
the same limitations as noted for [13] are also observed. First, the generated rules are biased towards the
training dataset. This issue may be resolved by carefully selecting either the number of generations in the
training phase or the number of top best-fit rules in the intrusion detection phase. Second, while the
support-confidence framework is simple to implement and improves the accuracy of the final rules, it requires
the whole training data set to be loaded into memory before any computation. For large training datasets,
this is neither efficient nor feasible. Some form of caching may solve the problem.
V. Moraveji Hashmei, Z. Muda and W. Yassin [15] present a genetic algorithm based intrusion detection
system together with its software implementation. The system is flexible enough to be used in different
application environments, provided that a proper attack taxonomy and a proper training dataset exist. A high
detection rate and a low false positive rate are the highlights of the proposed system. The proposed system
can be applied for intrusion detection without any of the complementary techniques that are commonly used
with other soft-computing techniques. The KDDCUP’99 dataset is used for the training phase.
Bharat S. Dhak and Shrikant Lade [6] present a genetic algorithm based intrusion detection technique to
detect malicious packets on the network and ultimately help to block the respective IP addresses. The genetic
algorithm process is discussed in detail. Training is done on predefined data rules. Testing is done on the
entries that the machine's firewall system writes to the pfirewall.log file. The proposed system can be
integrated with any IDS to improve its efficiency and performance.
M. Sadiq Ali Khan [18] designed a rule-based Intrusion Detection System to detect DoS (Denial of Service)
or probing attacks by formulating the contributing parameters as rules. A genetic algorithm is used to
devise these rules. The study uses the KDD-99 dataset with a reduced set of attributes, obtained through
Principal Component Analysis. By running the GA more than 2000 times, the proposed system achieved 91%
accuracy in detecting network attacks.
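The studies reviewed above all rely on the same basic loop of selection, crossover and mutation. The following is a generic sketch of that loop, not the implementation of any cited study. It assumes bit-string chromosomes, tournament selection, single-point crossover and elitism, and it uses the number of 1 bits as a toy stand-in for a rule fitness function. The function name `evolve` and all parameter values are hypothetical.

```python
import random


def evolve(fitness, length, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve bit-string chromosomes and return (best, best-fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = [ranked[0][:]]          # elitism: keep the best unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randint(1, length - 1)   # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy usage: the fitness of a chromosome is its number of 1 bits.
best, history = evolve(fitness=sum, length=16)
print(best, history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases. In an IDS setting the `fitness` argument would be replaced by a rule-quality measure such as those compared in Table II.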
TABLE II. COMPARISON OF EXISTING STUDIES ON GA BASED IDS

Reference: A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs), F.O. Aghware [1]
Detection Approach: Misuse analysis
Fitness Function (F): Support and confidence model; F = w1*support + w2*confidence
Explanation of Fitness Function Used: For a rule "if A then B", support = |A and B| / N and
confidence = |A and B| / |A|, where N = number of connections in the training data, |A| = number of
connections matching condition A, |A and B| = number of connections matching both A and B, and
w1, w2 = weights that balance the two terms.
Remarks: Uses 7 network features, so detecting millions of connections requires high processing speed and
sufficient cache; 97% of the attacks are detected correctly by this system.

Reference: B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj, T. Bharat [4]
Detection Approach: Misuse analysis
Fitness Function (F): Fitness = f(x) / f(sum)
Explanation of Fitness Function Used: f(x) is the fitness of entity x and f(sum) is the total fitness of
all entities.
Remarks: Uses only 3 network features; 83.65% average success rate; the process is faster and can be
applied to high-speed networks.

Reference: Bharat S. Dhak, Shrikant Lade [6]
Detection Approach: Misuse analysis
Fitness Function (F): F = weight*packet_size
Explanation of Fitness Function Used: packet_size is the actual packet data size prescribed by the incoming
packet data stream, and weight is the vector that is applied to each chromosome.
Remarks: The scope of the experiment is focused on generating a list of vulnerable IP addresses; 96%
accuracy.

Reference: Firas Alabsi, Reyadh Naoum [7]
Detection Approach: Misuse analysis
Fitness Function (F): Reward-penalty model; F = 2 + ((AB-A)/(AB+A)) + (AB/X) - (A/Y)
Explanation of Fitness Function Used: For a rule "if A then B", (AB-A)/(AB+A) = strength of a record;
AB/X = ratio of the strength of the record to the strength of the strongest record; A/Y = ratio of the
weakness of the record to the weakness of the weakest record.
Remarks: Uses 5 network features; the fitness function rewards good chromosomes and penalizes bad
chromosomes; a comparison between the newly proposed and other existing fitness functions is presented.

Reference: Wei Li [11]
Detection Approach: Anomaly detection
Fitness Function (F): Weighted sum model; F = 1 - penalty
Explanation of Fitness Function Used: The fitness function is determined by calculating the general
outcome, the absolute difference and the penalty values.
Remarks: Considers both temporal and spatial features of a network connection to detect an attack; no
experimental results.

Reference: V. Moraveji Hashmei, Z. Muda, W. Yassin [15]
Detection Approach: Misuse analysis
Fitness Function (F): F = (a/A) - (b/B)
Explanation of Fitness Function Used: a = number of correctly detected attacks; A = total number of attacks
in the training dataset; b = number of normal connections that are falsely detected as attacks; B = total
number of normal connections.
Remarks: Uses only 3 network features; fast processing, so it can be applied to high-speed networks; high
detection rate and low false positives (95.62% detection rate and 4.37% false alarm rate); can be used
without any complementary technique.
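The fitness function listed for [15] in Table II, F = (a/A) - (b/B), is the detection rate minus the false alarm rate. A minimal sketch follows, assuming predictions and ground-truth labels are given as parallel lists of booleans in which True means "attack". The function name and the toy data are hypothetical.

```python
def detection_fitness(predictions, labels):
    """Return F = a/A - b/B for parallel lists of booleans (True = attack)."""
    total_attacks = sum(1 for y in labels if y)            # A
    total_normal = len(labels) - total_attacks             # B
    if total_attacks == 0 or total_normal == 0:
        raise ValueError("need at least one attack and one normal connection")
    detected = sum(1 for p, y in zip(predictions, labels) if p and y)          # a
    false_alarms = sum(1 for p, y in zip(predictions, labels) if p and not y)  # b
    return detected / total_attacks - false_alarms / total_normal


# Toy data: 4 attacks and 6 normal connections.
labels = [True] * 4 + [False] * 6
# The rule flags 3 of the 4 attacks and raises 1 false alarm.
predictions = [True, True, True, False, True, False, False, False, False, False]

score = detection_fitness(predictions, labels)
print(score)
```

The value lies between -1 and 1: a rule that flags every attack and no normal connection scores 1, while a rule that flags every connection scores 0.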
VI. CONCLUSION
The three factors that affect the effectiveness of a genetic algorithm are the selection of the fitness
function, the representation of individuals and the values of the GA parameters. The determination of these
factors often depends on the application. Designing an accurate fitness function is the major challenge in
solving a particular problem. Different models for designing the fitness function have been discussed in
this paper. Using GA for intrusion detection has proven to be a cost-effective approach. One of its major
advantages is that, in the real world, the types of intrusions change and become more complicated very
rapidly, and a GA based detection system can upload and update new rules as new intrusions become known.
Therefore, it is cost effective and adaptive.
REFERENCES
[1] A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs), F.O. Aghware, “Genetic Algorithm Rule-Based Intrusion Detection System (GAIDS)”, Journal of Emerging Trends in Computing and Information Sciences, Vol. 3, pp. 1182-1194, Aug. 2012.
[2] Aleksandar Lazarevic, Vipin Kumar, Jaideep Srivastava, “Intrusion Detection: A Survey”, unpublished.
[3] Anup Goyal, Chetan Kumar, “GA-NIDS: A Genetic Algorithm based Network Intrusion Detection System”, 2008.
[4] B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj, T. Bharat, “Genetic Algorithm Approach to Intrusion Detection System”, IJCST, Vol. 3, Issue 1, Jan.-March 2012.
[5] Bader and Nasereddin, “Using Genetic Algorithm in Network Security”, IJRRAS, Vol. 5, pp. 148-154, Nov. 2010.
[6] Bharat S. Dhak, Shrikant Lade, “An Evolutionary Approach to Intrusion Detection System using Genetic Algorithm”, ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 2, Issue 12, Dec. 2012.
[7] Firas Alabsi and Reyadh Naoum, “Fitness Function for Genetic Algorithm used in Intrusion Detection System”, International Journal of Applied Science and Technology, Vol. 2, pp. 632-637, April 2012.
[8] GA tutorial, available at: http://www.vit.ac.in/academicresearch/res701/RES701DUMP/Evolutionary%20Algorithms/GATutorial.pdf
[9] Kamal Kishore Prasad, Samarjeet Borah, “Use of Genetic Algorithms in Intrusion Detection Systems: An Analysis”, International Journal of Applied Research and Studies (iJARS), ISSN: 2278-9480, Volume 2, Issue 8, Aug. 2013.
[10] Kaspersky Lab, Global Corporate IT Security Risks: 2013, May 2013.
[11] Wei Li, “Using Genetic Algorithm for Network Intrusion Detection”, 2004.
[12] M. Crosbie, E. Spafford, “Applying Genetic Programming to Intrusion Detection”, Proceedings of the AAAI Fall Symposium, 1995.
[13] Ren Hui Gong, Mohammad Zulkernine, Purang Abolmaesumi, “A Software Implementation of a Genetic Algorithm Based Approach to Network Intrusion Detection”, Proceedings of the Sixth International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing and First ACIS International Workshop on Self-Assembling Wireless Networks (SNPD/SAWN’05), IEEE, 2005.
[14] Ben Nahorney, Symantec Intelligence Report: November 2013.
[15] V. Moraveji Hashmei, Z. Muda and W. Yassin, “Improving Intrusion Detection using Genetic Algorithm”, International Technology journal 12(11), pp. 2167-2173, 2013.
[16] Mohammad Sazzadul Hoque, Md. Abdul Mukit and Md. Abu Naser Bikas, “An Implementation of Intrusion Detection System using Genetic Algorithm”, International Journal of Network Security & Its Applications (IJNSA), Vol. 4, No. 2, March 2012.
[17] RC Chakraborty, Fundamentals of Genetic Algorithm: AI Course, June 2010, available at: http://www.myreaders.info/09-Genetic_Algorithms.pdf
[18] M. Sadiq Ali Khan, “Rule based Network Intrusion Detection using Genetic Algorithm”, International Journal of Computer Applications (0975 – 8887), Volume 18, No. 8, March 2011.

Intrusion is a set of actions aimed at compromising the security goals (confidentiality, integrity, availability) of a computing/networking resource [1]. Intrusion techniques may include exploiting software bugs and system misconfigurations, password cracking, sniffing unsecured traffic, or exploiting the design flaws of specific protocols [5]. An intruder is any user or group of users who initiates such intrusive actions. Intruders can be divided into two groups, external and internal. The former are those who do not have authorized access to the system and who attack by using various penetration techniques. The latter are those with access permission who wish to perform unauthorized activities [6]. Attacks are growing exponentially and are getting more sophisticated. Attempts to breach information security are rising every day, helped by vulnerability assessment tools that are widely available on the Internet, both for free and commercially. Tools such as SubSeven, Back Orifice, Nmap and L0phtCrack can all be used to scan, identify, probe, and penetrate systems. With the help of such tools even the best security measures can be breached. The key targets of the attackers include banks, law firms and corporates.
DOI: 02.ITC.2014.5.46 © Association of Computer Electronics and Electrical Engineers, 2014 Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC
  • 2. According to the report published by Symantec Corporation [14] for the month of November 2013, the number of targeted attacks increased, 438 new vulnerabilities were discovered (bringing the total for the year to 5965), two zero-day vulnerabilities were discovered and 42 million identities were exposed. A successful targeted attack on a large company can cost it $2.4 million in direct financial losses and additional costs. For a medium-sized or small company, a targeted attack can mean about $92,000 in damages, almost twice as much as an average attack [10]. Attention therefore turns to Intrusion Detection Systems, which monitor network traffic to identify misuse, unauthorized use and abuse of resources, and which perform actions as defined by security policies. Intrusion detection systems perform the following functions:
- Monitoring and analysis of user and system activity
- Auditing of system configurations and vulnerabilities
- Assessing the integrity of critical system and data files
- Statistical analysis of activity patterns based on matching to known attacks
- Abnormal activity analysis and operating system audit
The majority of existing IDSs face challenges such as low detection rates and high false alarm rates, and can therefore obstruct legitimate users from accessing network resources. These problems are due to the sophistication of the attacks and their intended similarity to normal behavior. To overcome these problems, a Genetic Algorithm based Intrusion Detection System is employed to enhance the performance of intrusion detection for rare and complicated attacks. The rest of the paper is organized as follows: Section 2 provides a brief introduction to Intrusion Detection Systems. Section 3 describes the implementation issues of Genetic Algorithm. 
Section 4 describes the technique of applying Genetic Algorithm to Intrusion Detection Systems. Section 5 presents the related work and a comparative analysis of existing studies. Finally, the discussion is concluded.
II. INTRUSION DETECTION SYSTEM
Intrusion detection is the process of identifying and responding to events which violate computer security policies, acceptable use policies or standard security practices. An Intrusion Detection System (IDS) is a security system which implements the process of intrusion detection and reports the intrusion accurately to the appropriate authority. The IDS monitors packets from various network connections in order to detect intrusive activity [1]. If an intrusion is detected, the IDS logs a message in the system audit file to be analyzed later by network security experts, stops the offending connections to end the intruder's attack, or performs some other action defined by the organization’s rules and practices to provide security, handle the intrusion and recover from the damage caused by the breach [1]. These systems do not always react correctly; false alarms can occur.
A. Components of IDS
The basic architecture of an intrusion detection system is explained below [2] [16] and presented in figure 1:
- Data Source: Data sources fall into four categories, namely Host-based monitors, Network-based monitors, Application-based monitors and Target-based monitors.
- Data gathering device (sensor): It is responsible for collecting data from the monitored system.
- Analysis Engine (detector): This component takes information from the sensors and examines the data in order to detect attacks. The analysis engine can use various analysis approaches, e.g. misuse/signature based detection or anomaly/statistical detection.
- Knowledge base: It is a database which contains information collected by the sensors, but in preprocessed format (e.g. knowledge base of attacks and their signatures, filtered data, data profiles, etc.). This information is usually provided by network and security experts.
- Configuration device: It provides information about the current state of the intrusion detection system (IDS).
- Response Manager: The response manager acts only when an intrusion is detected and performs the necessary action as defined by the security policies of the organization. These actions can be either automated (active) or involve human interaction (inactive).
  • 3. Figure 1. Basic Architecture of Intrusion Detection System (block diagram: Data Source (Monitored System), Data gathering (sensors), Analysis Engine, Knowledge base, Configuration, Response Component; labelled flows: raw data, events, system state, actions)
B. Characteristics of IDS
An IDS must have the following characteristics [2]:
- Prediction performance: Typical measures for evaluating the predictive performance of an IDS include detection rate and false alarm rate. Detection rate is defined as the ratio of the number of correctly detected attacks to the total number of attacks. The false alarm (or false positive) rate is the ratio of the number of normal connections that are incorrectly classified as attacks to the total number of normal connections. Therefore, a good IDS must have a high detection rate and a low false positive rate.
- Time performance: The total time taken by the IDS to generate an alarm should be as short as possible. The processing time depends upon the processing speed of the IDS, which is the rate at which the IDS processes audit events. If this rate is not sufficiently high, then real-time processing of security events may not be feasible. The propagation time is the time needed for processed information to propagate to the security analyst. Both times need to be as short as possible in order to allow the security analyst sufficient time to react to an attack before much damage has been done, as well as to stop an attacker from modifying audit information or altering the IDS itself.
- Fault tolerance: An IDS should be robust, dependable and resistant to attacks and should be able to recover quickly. This characteristic is very important for the proper functioning of IDSs, since most commercial IDSs run on operating systems and networks that are vulnerable to different types of attacks. In addition, an IDS should also be resistant to scenarios in which an adversary causes it to generate a large number of false or misleading alarms. Such alarms may easily have a negative impact on the availability of the system, and the IDS should be able to quickly overcome these obstacles.
- Dynamic reconfiguration: An IDS must be dynamically reconfigurable so that the time spent on reconfiguration of the system is as short as possible.
C. Taxonomy of IDS’s
IDSs are generally classified [9] as shown in figure 2:
Figure 2: Taxonomy of IDS’s (tree: IDS Classification; By location: Host-based IDS, Network-based IDS; By detection model: Misuse Detection, Anomaly Detection)
By location (or by scope of protection):
  • 4. Intrusion Detection Systems can be divided into the following two types depending on the location where they look for intrusive actions:
- Host-based IDS (HIDS): A host-based IDS loads a piece of software on the system to be monitored. This software evaluates the information associated with the system, including the contents of the operating system, system files and application files. If any critical file is deleted or modified then an alert message is sent to the administrator for further investigation.
- Network-based IDS (NIDS): A network-based IDS identifies intrusive activities by analyzing the stream of packets which travel across the network.
By detection model:
Intrusion Detection Systems can also be classified into the following categories on the basis of the detection approach:
- Misuse detection (or signature based detection): These systems work by matching user activity with stored signatures of known attacks. Such detection systems use a predefined knowledge base to check whether a new network connection matches an entry in that knowledge base. If it does, the IDS considers this connection a possible attack and blocks it.
- Anomaly detection (or behavior detection): In this case, the system learns the characteristics of normal user activities and then uses those characteristics to judge whether a new user's activity is normal or not.
III. GENETIC ALGORITHM
The Genetic Algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings called chromosomes), each with an associated fitness value, into a new population of offspring objects using operations that are patterned after naturally occurring genetic operations, such as crossover and mutation [8]. Genetic Algorithm is inspired by the natural search and selection processes leading to the survival of the fittest [13]. In the last few years, genetic algorithms have emerged as practical, robust optimization and search methods. Genetic Algorithms represent an intelligent exploitation of a random search used to solve optimization problems. GAs, although randomized, exploit historical information to direct the search into the region of better performance within the search space.
A. Working Principle of GA:
The working principle of GA is explained as follows [17]. Genetic Algorithm begins with a set of candidate solutions for the problem. Each solution is represented by a chromosome-like data structure. Solutions from one population are selected and used to generate a new population, in the hope that the new population will be better than the old one. Solutions are selected according to their fitness: the more suitable they are, the more chances they have to reproduce. This is repeated until some condition (e.g. a fixed number of generations is reached or the best solution stops improving) is satisfied. The pseudo-code for GA is shown below.
Pseudo-code:
BEGIN
  INITIALISE population with random candidate solutions;
  EVALUATE each candidate;
  REPEAT UNTIL (termination condition) is satisfied DO
    1. SELECT parents;
    2. RECOMBINE pairs of parents;
    3. MUTATE the resulting offspring;
    4. SELECT individuals for the next generation;
END
B. Encoding of solutions as chromosomes:
Before using a genetic algorithm to solve any problem it is necessary to encode the potential solutions to that problem in a form which can be processed by a computer [17]. One common approach is to encode the solutions as binary strings: sequences of 1’s and 0’s, where each digit represents the value of some aspect of the solution. Each solution is represented in the form of a chromosome. Different positions in a chromosome are referred to as genes and are changed randomly within a range during the process of evolution. Example:
  • 5. A gene may look like: 1101
A chromosome may look like:
Gene1 | Gene2 | Gene3 | Gene4
1101 | 1001 | 1111 | 1011
Binary string representation of the above chromosome: 1101100111111011
Other methods of encoding include encoding values as integers or real numbers, as arbitrary elements (E11 E3 E7…E1 E15), as a list of rules (R1 R2 R3…R22 R23) or as any other data structure. The selection of the encoding method depends upon the attributes of the problem to be solved.
C. Steps involved in basic Genetic Algorithm:
The various steps involved in GA are explained below [17] and the overall flow chart is presented in figure 3:
Step 1: [Start] Generate a random population of ‘n’ chromosomes, each representing a different solution to the problem.
Step 2: [Fitness] Evaluate the fitness f(x) of each chromosome ‘x’ in the population.
Step 3: [New population] Generate a new population by repeating the following steps until the new population is complete:
  a. [Selection] Select two parent chromosomes from the population according to their fitness (the higher the fitness, the greater the chance of selection).
  b. [Crossover] With a crossover probability, cross over the parents to generate new offspring. Crossover can be one-point or multi-point. If no crossover is performed then the offspring are exact copies of the parents.
  c. [Mutation] With a mutation probability, mutate the new offspring (i.e. randomly flip some bits).
  d. [Accepting] Place the new offspring in the new population.
Step 4: [Replace] Use the new population for a further run of the algorithm.
Step 5: [Test] If the end condition is satisfied, stop and return the best solution in the current population.
Step 6: [Loop] Go to Step 2.
Figure 3: Overall flow of GA (flowchart: Start; generate random population; apply fitness function; optimization criteria met? Yes: result; No: selection, crossover, mutation)
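The steps listed above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used in any of the reviewed studies: the toy fitness function (the number of 1-bits in the chromosome), the parameter values and all function names are assumptions chosen for the example.

```python
import random

def fitness(chromosome):
    # Toy fitness (an assumption for this sketch): count of 1-bits.
    return chromosome.count("1")

def select(population, scores, rng):
    # Step 3a: fitness-proportionate selection; +1 keeps every weight positive.
    return rng.choices(population, weights=[s + 1 for s in scores], k=1)[0]

def crossover(a, b, rate, rng):
    # Step 3b: one-point crossover with probability `rate`,
    # otherwise the offspring are exact copies of the parents.
    if rng.random() < rate:
        point = rng.randrange(1, len(a))
        return a[:point] + b[point:], b[:point] + a[point:]
    return a, b

def mutate(chromosome, rate, rng):
    # Step 3c: flip each bit independently with probability `rate`.
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )

def run_ga(n=20, length=16, generations=50,
           crossover_rate=0.8, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Step 1: random population of n binary chromosomes.
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(n)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Step 2: evaluate fitness of every chromosome.
        scores = [fitness(c) for c in population]
        # Step 3: build the new population.
        new_population = []
        while len(new_population) < n:
            p1 = select(population, scores, rng)
            p2 = select(population, scores, rng)
            c1, c2 = crossover(p1, p2, crossover_rate, rng)
            new_population.append(mutate(c1, mutation_rate, rng))
            new_population.append(mutate(c2, mutation_rate, rng))
        # Step 4: replace the old population.
        population = new_population[:n]
        best = max(population + [best], key=fitness)
        # Step 5: stop early once the toy optimum is reached.
        if fitness(best) == length:
            break
    return best

if __name__ == "__main__":
    print(run_ga())
```

The sketch keeps track of the best chromosome seen so far, which is a small convenience beyond the listed steps; a real application would replace `fitness` with a problem-specific function.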
  • 6. A genetic algorithm is straightforward in principle, but applying it can be complex in practice. The values of various parameters (for example, mutation rate, crossover rate, population size, chromosome size, number of generations, and selection process) need to be chosen by considering the attributes of the problem being solved. A Genetic Algorithm is typically used when alternative solutions are too slow or too complicated, when an exploratory tool is required to examine new approaches, or when the benefits of GA meet key problem requirements. The advantages of using a Genetic Algorithm are [8]:
- Always gives an answer
- The answer gets better with time
- Inherently parallel
- Easily re-trainable
- Multiple ways to speed up and improve a GA-based application as knowledge about the problem domain is gained
- Easy to exploit previous or alternate solutions
- The different operators used in a genetic algorithm help avoid getting stuck in local maxima
D. Limitations of Genetic Algorithm
Genetic algorithms are efficient, but in practice they have certain limitations:
- It is not always easy to find a fitness function.
- Representing a problem space in genetic algorithms is very complex.
- It is a tough task to choose the optimal parameters for a genetic algorithm.
- Genetic algorithms need a large number of fitness function evaluations.
- It is not easy to configure a genetic algorithm based system.
IV. GENETIC ALGORITHM BASED SYSTEM MODEL
Genetic Algorithm can be used in different ways in intrusion detection systems. If the Intrusion Detection System is modelled as a rule-based system, then GA can be considered a tool to generate rules for the rule-based IDS. The goal of the system is not to evolve a single best rule (global optimum), but to create a set of rules which is good enough to detect attacks. The system works by analysing network connections. Figure 4 shows the overall flow of a GA-based IDS. 
The system works in two phases: a training phase and a testing phase.
A. Training phase
In this phase, a set of classification rules is generated from network audit data using Genetic Algorithm in an offline environment. The training data set contains analysed logs of connections which clearly distinguish between normal connections and attacks. Examples of such data sets include KDD Cup99 and DARPA. The records from the training data set are represented in the form of chromosomes. Each chromosome is a rule within which certain features of a connection are encoded in the form of a fixed-length vector. A fitness function is then applied to each chromosome in order to evaluate its goodness. If a chromosome helps to identify an attack correctly, it is considered good (or fit); otherwise it is considered bad. Crossover and mutation operations are applied to the good chromosomes in order to produce a new generation. This entire process is repeated using the newly generated population. The process of evolution continues until a solution is reached (i.e. a set of rules capable of detecting attacks is generated). The generated rules are stored in a rule base in the following form:
if { condition } then { act }
For example, a rule can be defined as [1]:
if {the connection has following information: source IP address 124.12.5.18; destination IP address: 130.18.206.55; destination port number: 21; connection time: 10.1 seconds} then {stop the connection}
Explanation: if there exists a network connection request with source IP address 124.12.5.18, destination IP address 130.18.206.55, destination port number 21, and connection time 10.1 seconds, then stop the connection establishment, since IP address 124.12.5.18 is recognized by the IDS as a blacklisted IP address. Thus, any service request initiated from it is rejected.
The various steps involved in the training phase are [1]:
1. Encoding of connections: Consider the following case [13] where six features of a network connection are used to identify an attack. The dataset used in this case is the DARPA dataset, which contains 7 features of a connection including the attack name. The normal connections contain no attack names. Each
  • 7. chromosome is a rule within which the 7 features are encoded via a fixed-length vector, and each feature is encoded as one or more genes of different types as shown in the table below.
TABLE I. CHROMOSOME REPRESENTATION OF A RULE
Sr. no | Feature | Feature Explanation | Format | Number of Genes
1 | Duration | Time period of the connection | H:M:S | 3
2 | Protocol | Protocol used for making connection | Numeric | 1
3 | Source Port | Application that the attacker system is running | Numeric | 1
4 | Destination Port | Application that the target system is running | Numeric | 1
5 | Source IP | Attacker system’s IP address | a.b.c.d | 4
6 | Destination IP | Target system’s IP address | a.b.c.d | 4
7 | Attack name and type | Name of the attack | string | 1
Each rule uses an if-then clause with a “condition” and an “outcome” part. The first 6 features are connected via logical AND to form the “condition” part, while the attack name is the “outcome”, which gives the classification of a network record (during training) or of a connection (during intrusion detection) if the rule is matched. For example, consider the following rule [13]:
if (duration=“0:0:1” and protocol=“finger” and source_port=18982 and destination_port=79 and source_ip=“9.9.9.9” and destination_ip=“172.16.112.50”) then (attack_name=“neptune”)
The above rule expresses that if a network packet originates from IP address 9.9.9.9 and port 18982, is sent to IP address 172.16.112.50 and port 79 using the protocol finger, and the connection duration is 1 second, then most likely it is a network attack of type neptune that may eventually put the destination host out of service. The above rule can be represented as follows:
{0, 0, 1, 2, 18982, 79, 9, 9, 9, 9, 172, 16, 112, 50, 1}
2. Evaluating each chromosome using fitness function: During the training phase, chromosomes are evaluated in order to determine their goodness. If a chromosome correctly classifies an attack, it is considered good; otherwise it is bad and is not selected for crossover to produce offspring. 
Thus, a chromosome which detects more attacks has a higher fitness value and a higher chance of selection. The fitness models proposed by various researchers include the support and confidence model, the reward-penalty model and the weighted sum model.
3. Selection: Different selection methods are used to choose the chromosomes, e.g. Fitness-proportion selection, Roulette-wheel selection, Rank selection, Local selection, Tournament selection, Steady state selection [6].
4. Crossover: With a crossover probability, cross over the parents to generate new offspring. Crossover can be one-point or multi-point. If no crossover is performed then the offspring are exact copies of the parents.
5. Mutation: Each gene in a chromosome may or may not change, depending on the mutation rate. Mutation maintains the population diversity that the search needs.
B. Testing phase
In this phase, the rules stored in the rule base are used to detect whether a real-time network connection is a normal connection or an intrusive attack. If the characteristics of a new connection match the ‘condition’ section of some pre-defined rule in the rule base then the connection is considered an attack; otherwise it is considered a normal connection. If an attack is detected then the IDS performs the necessary actions defined by the security policies of the organization. The algorithm for GA-based IDS is presented below.
Algorithm: Intrusion Detection [1]
Input: Inflowing network connection
Output: Decision if connection is intrusive or not
1: Loop Forever {fetch incoming packet}
2:   for each rule in rule-base
3:     Match rule with network connection (analysis console)
4:     if rules match then
5:       Mark current connection as an intrusion (and generate an alarm as per security policies)
6:     end if
7:   end for each
8: end loop forever.
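The chromosome encoding of Table I and the matching loop of the detection algorithm can be illustrated together with a short Python sketch. It is a hedged example, not the code of the cited studies: the function names are hypothetical, and the numeric code tables contain only the two values implied by the example rule above (finger encoded as 2, neptune encoded as 1); any further codes would have to be assigned by the implementer.

```python
# Numeric codes for categorical features. Only these two values are implied
# by the example rule in the text; the tables themselves are an assumption.
PROTOCOL_CODES = {"finger": 2}
ATTACK_CODES = {"neptune": 1}

def encode_condition(duration, protocol, src_port, dst_port, src_ip, dst_ip):
    """Encode the six condition features as 14 genes (3+1+1+1+4+4)."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return [
        hours, minutes, seconds,
        PROTOCOL_CODES[protocol],
        src_port,
        dst_port,
        *(int(octet) for octet in src_ip.split(".")),
        *(int(octet) for octet in dst_ip.split(".")),
    ]

def encode_rule(duration, protocol, src_port, dst_port, src_ip, dst_ip, attack):
    """A rule chromosome: 14 condition genes plus one outcome gene."""
    condition = encode_condition(duration, protocol, src_port, dst_port,
                                 src_ip, dst_ip)
    return condition + [ATTACK_CODES[attack]]

def detect(connection_genes, rule_base):
    """Return the attack code of the first matching rule, or None if normal.

    All condition genes must be equal, which corresponds to joining the
    six features with logical AND.
    """
    for rule in rule_base:
        if rule[:14] == connection_genes:
            return rule[14]
    return None

if __name__ == "__main__":
    rule = encode_rule("0:0:1", "finger", 18982, 79,
                       "9.9.9.9", "172.16.112.50", "neptune")
    print(rule)
    incoming = encode_condition("0:0:1", "finger", 18982, 79,
                                "9.9.9.9", "172.16.112.50")
    print(detect(incoming, [rule]))
```

Exact equality on every gene is the simplest possible matching policy; an evolved rule base might instead allow wildcard genes or ranges, which this sketch does not attempt to model.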
  • 8. Figure 4: Overall flow of GA based IDS (flowchart: Start; Training Dataset; evolution of rules using Genetic Algorithm; Testing Dataset; analysis of new connections using rules from rule-base; Attack Detected? Yes: Alert; No)
V. RELATED WORK
Intrusion Detection Systems have undergone rapid changes and now use newly evolved techniques to generate better results. Genetic Algorithm can be used in different ways in Intrusion Detection Systems. The Genetic Algorithm based intrusion detection approach discussed in this review paper focuses on a rule-based Intrusion Detection System which uses only Genetic Algorithm to generate knowledge. For this purpose network connections are analysed to describe the normal and abnormal behaviour in the network. This section briefly summarizes some of the GA based IDSs and presents a comparative analysis of various existing studies in table 2.
The earliest effort to use GAs for intrusion detection dates back to 1995, when Crosbie and Spafford [12] applied multiple agent technology and GP (Genetic Programming) to detect network anomalies. Each agent monitors one parameter of the network audit data and GP is used to find the set of agents that collectively determine anomalous network behaviors. This method has the advantage of using many small autonomous agents, but the communication among them is still a problem. Also, the training process can be time consuming if the agents are not appropriately initialized.
Wei Li [11] proposes a GA-based method to detect anomalous network behaviors. This implementation of genetic algorithm is unique in that it considers both temporal and spatial information of network connections when encoding the network connection information into rules in the IDS. This may lead to increased detection rates. However, no experimental results are available yet.
Ren Hui Gong, Mohammad Zulkernine and Purang Abolmaesumi [13] present a method of applying Genetic Algorithm for intrusion detection. Seven network features, including both categorical and quantitative data fields, are used when encoding and deriving the rules. A simple but efficient and flexible fitness function, i.e. the support-confidence framework, is used to judge the quality of each rule. Depending on the selection of fitness function weight values, the generated rules can be used either to detect network intrusions in general or to classify the types of intrusions precisely. The method has been implemented using Java and the third-party package ECJ. The implementation has been tested using subsets of the 1998 DARPA dataset. Experimental results show that the proposed method worked efficiently and has the flexibility to be used in different ways.
  • 9. However, some limitations of the method are also observed. First, the generated rules are biased towards the training dataset. This issue may be resolved by carefully selecting either the number of generations in the training phase or the number of top best-fit rules in the intrusion detection phase. Second, while the support-confidence framework is simple to implement and provides improved accuracy to the final rules, it requires the whole training data to be loaded into memory before any computation. For large training datasets, this is neither efficient nor feasible. The use of some form of caching may solve the problem.
Anup Goyal and Chetan Kumar [3] describe a GA based IDS to classify different types of network attacks with a very low false positive rate (0.2%) and an almost 100% detection rate. The algorithm takes into consideration different features of network connections, such as type of protocol, network service on the destination and status of the connection, to generate a classification rule set. Each rule in the rule set identifies a particular attack. The fitness function is designed to be biased towards individuals that correctly classify only the attack connections. The experiments are performed on the KDDCup99 data set. The generated rule set consists of six rules that can be applied to the IDS to identify and classify six different types of attack connections that fall into two classes, namely Denial of Service (DoS) and Probing attacks. GAlib, a C++ library especially suited to developing GAs, is used to implement the proposed system.
Bader and Nasereddin [5] discuss a technique of using Genetic Algorithm for Intrusion Detection Systems. This implementation considers both temporal and spatial information of network connections when encoding the network connection information into rules in the IDS. The network traffic used for implementing GA is a pre-classified data set that differentiates normal network connections from anomalous ones. 
This data set is gathered using network sniffers (programs used to record network traffic without doing anything harmful) such as Tcpdump or Snort. The data set is manually classified based on the knowledge of experts. The rules generated are good enough for filtering new network traffic. The attributes of network connections used for generating rules are: source IP address, destination IP address, source port number, destination port number, duration, state, protocol, number of bytes sent by originator, number of bytes sent by responder.
B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj and T. Bharat [4] suggest an intrusion detection system using genetic algorithm to generate a rule set for eight types of attacks belonging to four categories. The proposed architecture uses the KDDCUP99 dataset. The dataset contains 41 features, out of which only 3 have been used to specify each entry of the dataset. The architecture of the system and the software implementation of the proposed technique are also discussed. The system created a specified set of rules and achieved high detection rates for DoS (Denial of Service), R2L (Remote to Local), U2R (User to Root) and Probe attacks. The average success rate achieved during experiments is 83.65%. The proposed system is flexible enough for use in different application areas. It is implemented using C# in the .NET suite.
Firas Alabsi and Reyadh Naoum [7] recommend a new fitness function using the Reward-Penalty technique to evaluate the chromosomes efficiently. A 5% subset of KDDCUP’99 has been used for the proposed system. The proposed fitness function works on the principle that reward and penalty are proportionate to the strength and weakness of chromosomes. In order to prove the validity of the new fitness function, the results of the reward-penalty model based fitness function are compared with the results of the support-confidence model based fitness function. The results closely match each other. 
The system has been built by using Vb.Net 2010 and SQL server 2008. A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs) and F.O. Aghware [1] present a genetic algorithm based approach which uses rules derived from network audit data for network intrusion detection. The fitness function utilized is based on the support-confidence framework. The fitness function is simple, efficient and flexible. The training and testing data set used is the DARPA 1998 MIT Lincoln laboratory. The study implemented GA based IDS using C (programming language) in Linux operating system platform. However, some limitations of the method are also witnessed. First, the generated rules are biased to the training dataset. This issue may be resolved by carefully selecting either the number of generations in the training phase or the number of top best-fit rules in the intrusion detection phase. Second, while the support-confidence framework is simple to implement and provides improved accuracy to final rules, it requires the whole training data to be loaded into memory before any computation. For large training datasets, it is neither efficient nor feasible. The use of some sorts of cache technologies may solve the problem. V. Moraveji Hashmei, Z. Muda and W. Yassin [15] present a genetic algorithm based intrusion detection system. Software implementation of the proposed system is presented. The system is flexible enough to be used in different application environments, if proper attack taxonomy and proper training dataset exist. High detection rate and low false positive rates are the highlights of the proposed system. The proposed system can
be applied for intrusion detection without any of the complementary techniques that are commonly used with other soft-computing techniques. The KDDCUP’99 dataset is used for the training phase.

Bharat S. Dhak and Shrikant Lade [6] present a genetic algorithm based intrusion detection technique to detect malicious packets on the network and ultimately help to block the respective IP addresses. The genetic algorithm process is discussed in detail. Training is done on predefined data rules. Testing is done on the entries that the machine's firewall writes to the pfirewall.log file. The proposed system can be integrated with any IDS to improve its efficiency and performance.

M. Sadiq Ali Khan [18] designed a rule-based Intrusion Detection System to detect DoS (Denial of Service) or Probing attacks by formulating the contributing parameters in terms of rules. A genetic algorithm is used to devise these rules. In this study, the KDD-99 data set is used with a reduced set of attributes. Principal Component Analysis is used to reduce the data set. After running the GA more than 2000 times, the proposed system achieved 91% accuracy in detecting network attacks.

TABLE II. COMPARISON OF EXISTING STUDIES ON GA BASED IDS

Reference: A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs), F.O. Aghware [1]
Detection approach: Misuse analysis
Fitness function (F): Support and confidence model; F = w1*support + w2*confidence
Explanation of fitness function used: For a rule "if A then B": support = |A and B| / N; confidence = |A and B| / |A|; N = number of connections in the training data; |A| = number of connections matching condition A; |A and B| = number of connections matching both A and B; w1, w2 = weights to balance/control the two terms.
Remarks: Uses 7 network features, so high processing speed and sufficient cache are required to handle millions of connections; 97% of the attacks are detected correctly by this system.

Reference: B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj, T. Bharat [4]
Detection approach: Misuse analysis
Fitness function (F): Fitness = f(x) / f(sum)
Explanation of fitness function used: f(x) is the fitness of entity x and f(sum) is the total fitness of all entities.
Remarks: Uses only 3 network features; 83.65% average success rate; the process is faster and can be applied to high speed networks.

Reference: Bharat S. Dhak, Shrikant Lade [6]
Detection approach: Misuse analysis
Fitness function (F): F = weight*packet_size
Explanation of fitness function used: packet_size is the actual packet data size prescribed by the incoming packet data stream and weight is the vector which is applied to each chromosome.
Remarks: The scope of the experiment is focused on generating a list of vulnerable IP addresses; achieved 96% accuracy.

Reference: Firas Alabsi, Reyadh Naoum [7]
Detection approach: Misuse analysis
Fitness function (F): Reward-Penalty model based; F = 2 + ((AB-A)/(AB+A)) + (AB/X) - (A/Y)
Explanation of fitness function used: For a rule "if A then B": (AB-A)/(AB+A) = strength of a record; AB/X = ratio of the strength of the record to the strength of the strongest record; A/Y = ratio of the weakness of the record to the weakness of the weakest record.
Remarks: Uses 5 network features; the fitness function rewards good chromosomes and penalizes bad chromosomes; a comparison between the newly proposed and other existing fitness functions is presented.

Reference: Wei Li [11]
Detection approach: Anomaly detection
Fitness function (F): Weighted sum model based; F = 1 - penalty
Explanation of fitness function used: The fitness function is determined by calculating the general outcome, absolute difference and penalty values.
Remarks: Considers both temporal and spatial features of a network connection to detect an attack; no experimental results.

Reference: V. Moraveji Hashmei, Z. Muda, W. Yassin [15]
Detection approach: Misuse analysis
Fitness function (F): F = (a/A) - (b/B)
Explanation of fitness function used: a = number of correctly detected attacks; A = total number of attacks in the training dataset; b = number of normal connections that are falsely detected as attacks; B = total number of normal connections.
Remarks: Uses only 3 network features; fast processing, so it can be applied to high speed networks; high detection rate and low false positives (95.62% detection rate and 4.37% false alarm rate); can be used without any complementary technique.
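To make two of the fitness functions compared in Table II more concrete, the following Python sketch evaluates the support-confidence model of [1] and the detection-rate model of [15] on labelled connection records. It is only an illustration under stated assumptions, not the implementation used in either study: the `Rule` class, the dictionary-based connection records, the field names and the default weights `w1 = 0.2` and `w2 = 0.8` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A connection record is modelled as a plain dictionary (assumption).
Connection = Dict[str, object]


@dataclass
class Rule:
    """A rule of the form 'if A then B' (hypothetical representation)."""
    condition: Callable[[Connection], bool]  # the "A" part of the rule
    outcome: Callable[[Connection], bool]    # the "B" part of the rule


def support_confidence_fitness(rule: Rule,
                               connections: List[Connection],
                               w1: float = 0.2,
                               w2: float = 0.8) -> float:
    """F = w1*support + w2*confidence; the weight values are assumptions."""
    n = len(connections)
    # |A|: connections matching the condition.
    count_a = sum(1 for c in connections if rule.condition(c))
    # |A and B|: connections matching both condition and outcome.
    count_ab = sum(1 for c in connections
                   if rule.condition(c) and rule.outcome(c))
    support = count_ab / n if n else 0.0
    confidence = count_ab / count_a if count_a else 0.0
    return w1 * support + w2 * confidence


def detection_fitness(a: int, A: int, b: int, B: int) -> float:
    """F = (a/A) - (b/B): detection rate minus false positive rate."""
    if A <= 0 or B <= 0:
        raise ValueError("A and B must be positive")
    return a / A - b / B


if __name__ == "__main__":
    # Toy training data; the field names are illustrative only.
    data = [
        {"dst_port": 23, "is_attack": True},
        {"dst_port": 23, "is_attack": True},
        {"dst_port": 23, "is_attack": False},
        {"dst_port": 80, "is_attack": False},
    ]
    telnet_rule = Rule(condition=lambda c: c["dst_port"] == 23,
                       outcome=lambda c: bool(c["is_attack"]))
    print(support_confidence_fitness(telnet_rule, data))
    print(detection_fitness(a=90, A=100, b=5, B=200))
```

In the toy data, three of the four connections match the condition and two of those are attacks, so support is 2/4 and confidence is 2/3. Giving confidence the larger weight favours rules that are rarely wrong over rules that merely match often; a different balance may suit other datasets.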
VI. CONCLUSION
The three factors that affect the effectiveness of the genetic algorithm are the selection of the fitness function, the representation of individuals and the values of the GA parameters. The determination of these factors often depends on the application. Designing an accurate fitness function is the major challenge in solving a particular problem. Different models for designing the fitness function have been discussed in the paper. Using GA for intrusion detection has proven to be a cost-effective approach. One of the major advantages of this technique arises from the fact that, in the real world, the types of intrusions change and become complicated very rapidly. The GA based detection system can upload and update new rules to the systems as new intrusions become known. Therefore, it is cost effective and adaptive.
REFERENCES
[1] A.A. Ojugo, A.O. Eboka, O.E. Okonta, R.E Yoro (Mrs), F.O. Aghware, “Genetic Algorithm Rule-Based Intrusion Detection System (GAIDS)”, Journal of Emerging Trends in Computing and Information Sciences, Vol. 3, pp. 1182-1194, Aug 2012
[2] Aleksandar Lazarevic, Vipin Kumar, Jaideep Srivastava, “Intrusion Detection a survey”, unpublished.
[3] Anup Goyal, Chetan Kumar, “GA-NIDS: A Genetic Algorithm based Network Intrusion Detection System”, 2008.
[4] B. Uppalaiah, K. Anand, B. Narsimha, S. Swaraj, T. Bharat, “Genetic Algorithm Approach to Intrusion Detection System”, IJCST Vol. 3, Issue 1, Jan-March 2012
[5] Bader and Nasereddin, “Using Genetic Algorithm in Network Security”, IJRRAS, Vol. 5, pp. 148-154, Nov. 2010
[6] Bharat S. Dhak, Shrikant Lade, “An Evolutionary Approach to Intrusion Detection System using Genetic Algorithm”, ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 2, Issue 12, Dec. 2012
[7] Firas Alabsi and Reyadh Naoum, “Fitness Function for Genetic Algorithm used in Intrusion Detection System”, International Journal of Applied Science and Technology, Vol. 2, pp. 632-637, April 2012
[8] GA tutorial, available at: http://www.vit.ac.in/academicresearch/res701/RES701DUMP/Evolutionary%20Algorithms/GATutorial.pdf
[9] Kamal Kishore Prasad, Samarjeet Borah, “Use of Genetic Algorithms in Intrusion Detection Systems: An Analysis”, International Journal of Applied Research and Studies (iJARS), ISSN: 2278-9480, Volume 2, Issue 8, Aug 2013
[10] Kaspersky Lab, Global Corporate IT Security Risks: 2013, May 2013
[11] Wei Li, “Using Genetic Algorithm for Network Intrusion Detection”, 2004
[12] M. Crosbie, E. Spafford, “Applying Genetic Programming to Intrusion Detection”, Proceedings of the AAAI Fall Symposium, 1995
[13] Ren Hui Gong, Mohammad Zulkernine, Purang Abolmaesumi, “A Software Implementation of a Genetic Algorithm Based Approach to Network Intrusion Detection”, Proceedings of the Sixth International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing and First ACIS International Workshop on Self-Assembling Wireless Networks (SNPD/SAWN’05), IEEE, 2005
[14] Ben Nahorney, Symantec Intelligence Report: November 2013
[15] V. Moraveji Hashmei, Z. Muda and W. Yassin, “Improving Intrusion Detection using Genetic Algorithm”, International Technology Journal 12(11), pp. 2167-2173, 2013
[16] Mohammad Sazzadul Hoque, Md. Abdul Mukit and Md. Abu Naser Bikas, “An Implementation of Intrusion Detection System using Genetic Algorithm”, International Journal of Network Security & Its Applications (IJNSA), Vol. 4, No. 2, March 2012
[17] RC Chakraborty, Fundamentals of Genetic Algorithm: AI Course, June 2010, available at http://www.myreaders.info/09-Genetic_Algorithms.pdf
[18] M. Sadiq Ali Khan, “Rule based Network Intrusion Detection using Genetic Algorithm”, International Journal of Computer Applications (0975 – 8887), Volume 18, No. 8, March 2011