An Implementational approach to genetic algorithms for TSP
International Journal of Information Technology and Management Research
3 (2), July-December 2011, pp. 109–115
AN IMPLEMENTATIONAL APPROACH TO THE GENETIC ALGORITHM
FOR SOLVING TRAVELLING SALESMAN PROBLEM
Ayan Mukherjee*, Sougata Das**
ABSTRACT: This paper studies evolutionary computation and Genetic Algorithms for solving real-life optimization problems. Genetic Algorithms belong to the class of algorithms known as evolutionary algorithms.
The Travelling Salesman Problem is a well-studied problem, and many real-life problems can be modelled as a Travelling Salesman Problem. In this paper we attempt to find a solution to the water distribution network problem.
Keywords: Fitness Function, Crossover, Mutation, MOGA (Multi Objective Genetic Algorithm), POGA, Pareto Optimality,
Pseudo Random Number Generator
* Assistant Professor, Dept. of MCA, Brainware Group of Institutions, Barasat, Kolkata, India. E-mail: ayanmca@gmail.com
** Student, M.S., Department of Computer Science, BITS Pilani, Pilani, India. E-mail: sougatadas10@gmail.com
1. INTRODUCTION
Water Distribution Network Analysis: Given a collection of houses in a city and the cost of distributing water between each pair of them, the Water Distribution Network Problem is to find the optimal way of distributing water among all of the houses in the city. This problem can be reduced to the Travelling Salesman Problem, or TSP for short. Let us look at a simple example to see how difficult it is to find the solution by enumeration:
It is easy to calculate the number of different tours through n cities: given a starting city, we have n – 1 choices for the second city, n – 2 choices for the third city, etc. Multiplying these together we get (n – 1)! = (n – 1) × (n – 2) × (n – 3) × ... × 3 × 2 × 1. Now since our travel costs do not depend on the direction we take around the tour, we should divide this number by 2 to get (n – 1)!/2.
This is a very large number even for moderate n, and it is often cited as the reason the Travelling Salesman Problem seems to be so difficult to solve. It is true that the rapidly growing value of (n – 1)!/2 rules out the possibility of checking all tours one by one, but there are other problems that are easy to solve (such as the minimum spanning tree) where the number of solutions for n points grows even more quickly.
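The growth of these counts can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the spanning-tree count uses Cayley's formula n^(n – 2) for a complete graph on n labelled points, which the paper itself does not state.

```python
import math


def count_tours(n: int) -> int:
    """Number of distinct undirected tours through n cities: (n - 1)! / 2."""
    if n < 3:
        raise ValueError("need at least 3 cities")
    return math.factorial(n - 1) // 2


def count_spanning_trees(n: int) -> int:
    """Cayley's formula: number of spanning trees on n labelled points."""
    return n ** (n - 2)


for n in (5, 10, 15):
    print(n, count_tours(n), count_spanning_trees(n))
```

For n = 10 there are 181440 tours but 100000000 spanning trees, so the size of the search space alone does not explain why the TSP is hard.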
The first mention of the TSP in the literature was made in 1832 in a German book entitled “Der Handlungsreisende, wie er sein soll und was er zu thun hat, um Aufträge zu erhalten und eines glücklichen Erfolgs in seinen Geschäften gewiss zu sein. Von einem alten Commis-Voyageur” (“The Travelling Salesman[10][13][14], how he should be and what he should do to get Commissions and to be Successful in his Business”). The last chapter makes the first explicit reference to the TSP: ‘By a proper choice and scheduling of the tour, one can often gain so much time that we have to make some suggestions... The most important aspect is to cover as many locations as possible without visiting a location twice...’ (Voigt, 1831; Müller-Merbach, 1983).
Although it is not certain who brought the TSP
into mathematical scrutiny, it is supposed that
Merrill Flood is responsible for bringing it into
focus having heard about it from A.W. Tucker in
1937. Flood writes that Tucker heard about the
TSP from Hassler Whitney at Princeton University.
This seems to make Whitney the founder of the
mathematical TSP. At the time of writing, the problem has been studied for just over 60 years, and there is still no known polynomial-time solution.
NP-Complete: Mathematically, we have to find a minimum-cost cycle that visits every vertex of a graph exactly once. In other words, we have to find a minimum-cost Hamiltonian cycle. It is well known that deciding whether a graph has a Hamiltonian cycle is NP-complete. Exhaustive search takes time proportional to (n – 1)!, and no polynomial-time algorithm for the TSP is known. A greedy algorithm therefore does not guarantee an optimal tour, which motivates the heuristic approach used here.
Evolutionary Algorithm: An evolutionary algorithm[4][5] is a part of evolutionary computation, which is inspired by biological evolution[2]. Evolutionary Algorithms are population-based metaheuristic optimization algorithms[2]. They use mechanisms inspired by biological evolution, such as reproduction[2], mutation[2][3], recombination[2], natural selection[2][8] and survival of the fittest[2]. Candidate solutions to the optimization problem play the role of individuals in a population, and the cost function (also known as the fitness function) determines the environment within which the candidate solutions live. Evolution of the population then takes place through the repeated application of the above operators.
Specific examples of Evolutionary Algorithms
are given below. Most of these techniques are
similar in spirit, but differ in the details of their
implementation and the nature of the particular
problem to which they have been applied.
• Genetic algorithm[1][7][9][12]: We shall discuss this in the following sections.
• Evolutionary programming[2]: Like genetic programming, but the structure of the program is fixed and only its numerical parameters are allowed to evolve.
• Evolution strategy[2]: Works with vectors of real numbers as representations of solutions, and typically uses self-adaptive mutation rates.
• Genetic programming[2]: Here the solutions are in the form of computer programs, and their fitness is determined by their ability to solve a computational problem.
• Learning classifier system[2][3][11]: Instead of using a fitness function, rule utility is decided by a reinforcement learning technique.
Genetic Algorithms
A Genetic Algorithm (GA) is a search technique
used in computer science to find approximate
solutions to optimization and search problems.
Specifically it falls into the category of local search
techniques and is therefore generally an
incomplete search. Genetic algorithms are a
particular class of evolutionary algorithms that
use techniques inspired by evolutionary biology
such as inheritance, selection, crossover, mating
and mutation.
The evolution starts from a population of
completely random individuals and happens in
generations. In each generation, the fitness of the
whole population is evaluated, multiple individuals
are stochastically selected from the current
population (based on their fitness), and modified
(mutated or recombined) to form a new
population. The new population is then used in
the next iteration of the algorithm.
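The generational loop described above can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: it assumes bit-string individuals, fitness-proportional selection, one-point crossover and bit-flip mutation, and the function name and parameter defaults are hypothetical.

```python
import random


def run_ga(fitness, genome_len, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Evolve bit strings for a fixed number of generations; return the fittest."""
    rng = random.Random(seed)
    # Start from a population of completely random individuals.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate the whole population; the small offset keeps weights positive.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # Stochastic selection based on fitness.
            a, b = rng.choices(pop, weights=weights, k=2)
            # Recombination: one-point crossover.
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            new_pop.append(child)
        pop = new_pop  # the new population feeds the next iteration
    return max(pop, key=fitness)


best = run_ga(fitness=sum, genome_len=16)  # toy fitness: number of 1 bits
print(best, sum(best))
```

The toy fitness (counting 1 bits) is only there to make the sketch runnable; for the TSP the individuals are tours and the operators are the ones described later in the paper.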
2. PROPOSED ALGORITHM
Input: An undirected graph with n nodes in which all the nodes are connected.
The cost of traversing the edge between two nodes is a vector [c_ij, p_ij], where
c_ij = cost of traversing from city i to city j.
p_ij = profit of traversing from city i to city j.
The profit function is based on many factors, such as the topology and the pressure difference between two cities, that may have an effect on the flow of water. The edge weight vector is symmetric.
Output: A set of Pareto-optimal solutions.
Step 1: Generate the initial population
Step 1a: Generate the initial population for
objective function 1, say init_pop1.
Step 1b: Generate initial population for
objective function 2, say init_pop2.
Step 2: Calculate fitness of the populations so as to preserve the genetic diversity.
Step 2a: Calculate Average-Fitness for init_pop1 and init_pop2.
Step 2b: Calculate Niche-Count for init_pop1 and init_pop2.
Step 2c: Calculate Shared-Fitness for init_pop1 and init_pop2.
Step 2d: Calculate Scaled-Fitness for init_pop1 and init_pop2.
Step 2e: Rank the population according to their fitness.
Step 3: Calculate fitness of the populations so as to preserve the genetic diversity.
Step 3a: Calculate Average-Fitness for init_pop2.
Step 3b: Calculate Niche-Count for init_pop2.
Step 3c: Calculate Shared-Fitness for init_pop2.
Step 3d: Calculate Scaled-Fitness for init_pop2.
Step 3e: Rank the population according to their fitness.
Step 4: Generate N/2 offspring for objective function 1, where N is the number of initial parents generated in Step 1a. Combine the N/2 offspring with the N/2 fittest parents. This ensures survival of the fittest.
Step 4a: Select two chromosomes randomly from init_pop1.
Step 4b: Apply the operators: Greedy Crossover and Mutation2opt.
Step 4c: Replace N/2 of the parents by the newly generated N/2 offspring. We call this the new population, or new_pop_obj1.
Step 4d: Calculate Average-Fitness for new_pop_obj1.
Step 4e: Calculate Niche-Count for new_pop_obj1.
Step 4f: Calculate Shared-Fitness for new_pop_obj1.
Step 4g: Calculate Scaled-Fitness for new_pop_obj1.
Step 4h: Rank the population according to their fitness.
Step 5: Generate M/2 offspring for objective function 2, where M is the number of initial parents generated in Step 1b. Combine the M/2 offspring with the M/2 fittest parents. This ensures survival of the fittest.
Step 5a: Select two chromosomes randomly from init_pop2.
Step 5b: Apply the operators: Greedy Crossover and Mutation2opt.
Step 5c: Replace M/2 of the parents by the newly generated M/2 offspring. We call this the new population, or new_pop_obj2.
Step 5d: Calculate Average-Fitness for new_pop_obj2.
Step 5e: Calculate Niche-Count for new_pop_obj2.
Step 5f: Calculate Shared-Fitness for new_pop_obj2.
Step 5g: Calculate Scaled-Fitness for new_pop_obj2.
Step 5h: Rank the population according to their fitness.
Step 6: Repeat Steps 4 and 5 a fixed number of times, as given by the user.
Step 7: Combine the result sets of Steps 4 and 5 to generate the Pareto-optimal solution set.
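Two of the steps above lend themselves to a short sketch: the replacement of half of the parents by offspring (Steps 4c and 5c) and the extraction of a Pareto-optimal set from the combined results (Step 7). The code below is an illustration under stated assumptions, not the paper's Visual C++ code: solutions are represented only by hypothetical (cost, profit) pairs, cost is minimized, profit is maximized, and the function names are invented for the example.

```python
def next_generation(parents, offspring, fitness):
    """Keep the fittest half of the parents and add the new offspring."""
    half = len(parents) // 2
    if len(offspring) != half:
        raise ValueError("expected exactly len(parents) // 2 offspring")
    survivors = sorted(parents, key=fitness, reverse=True)[:half]
    return survivors + list(offspring)


def dominates(a, b):
    """a dominates b: no worse in cost and profit, strictly better in one."""
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])


def pareto_front(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]


# Hypothetical (cost, profit) pairs from the two result sets.
combined = [(10, 5), (8, 5), (12, 9), (9, 4)]
print(pareto_front(combined))
```

In this example (8, 5) dominates both (10, 5) and (9, 4), so only (8, 5) and (12, 9) remain in the front.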
3. EXPLANATION OF THE ALGORITHM
Encoding: The cities are listed in the order in which they are visited. For example, the tour 3-5-8-1-4-2-6-7 can be represented as [3 5 8 1 4 2 6 7] (Gen and Cheng, 1997). This representation is also known as path or order representation.
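As a small illustration of the path representation, the sketch below evaluates the cost of a tour stored as a list of cities. It assumes cities are numbered from 0 and that costs are held in a symmetric matrix; the matrix values and function names are made up for the example.

```python
def is_valid_tour(tour, n):
    """A path-encoded tour must contain each of the n cities exactly once."""
    return sorted(tour) == list(range(n))


def tour_cost(tour, cost):
    """Sum the edge costs along the tour, including the edge back to the start."""
    return sum(cost[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))


# Hypothetical symmetric cost matrix for 4 cities.
cost = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

print(tour_cost([0, 1, 2, 3], cost))  # 2 + 6 + 3 + 10 = 21
print(tour_cost([0, 1, 3, 2], cost))  # 2 + 4 + 3 + 9 = 18
```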
Objective functions: Since we are dealing with multi-objective solutions, we have 2 objective functions.
Cost Function:
$c = \text{Minimize} \sum_{j=1}^{n} \sum_{i=1}^{n} \mathrm{Cost}_{ij} \cdot k$ for any two cities i and j
Subject to the following constraints
(i) $\mathrm{Cost}_{ij} = \mathrm{Cost}_{ji}$
(ii) $\mathrm{Cost}_{ij} \geq 0$
(iii) $\mathrm{Cost}_{ik} + \mathrm{Cost}_{kj} > \mathrm{Cost}_{ij}$
where $\mathrm{Cost}_{ij}$ = cost of travelling between city i and j.
k = 1, if the edge is selected
  = 0, if the edge is not selected
Profit Function:
$p = \text{Maximize} \sum_{j=1}^{n} \sum_{i=1}^{n} \mathrm{Profit}_{ij} \cdot m$ for any two cities i and j
Subject to the following constraints
(i) $\mathrm{Profit}_{ij} = \mathrm{Profit}_{ji}$
(ii) $\mathrm{Profit}_{ij} \geq 0$
(iii) $\mathrm{Profit}_{ik} + \mathrm{Profit}_{kj} > \mathrm{Profit}_{ij}$
where $\mathrm{Profit}_{ij}$ = profit of constructing a pipeline between city i and j.
m = 1, if the edge is selected
  = 0, if the edge is not selected
Fitness Calculations:
(a) Average Fitness calculations
(b) Rank(i) or $r_i = 1 + n_i$, where $n_i$ = number of solutions that dominate the ith solution.
$\mu(i)$ = number of solutions of rank i.
Average Fitness of the ith solution,
$F^{AVG}_i = N - \sum_{k=1}^{r_i - 1} \mu(k) - 0.5\,[\mu(r_i) - 1]$
where N = total population size
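The rank and average-fitness rules can be sketched in Python as follows. The sketch assumes the average-fitness formula in the form F_AVG = N minus the number of solutions in better ranks minus 0.5 × (μ(r_i) – 1), and it assumes each solution is summarized by a hypothetical (cost, profit) pair with cost minimized and profit maximized.

```python
def dominates(a, b):
    """a dominates b: no worse in cost and profit, strictly better in one."""
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])


def ranks(objs):
    """Rank of a solution = 1 + number of solutions that dominate it."""
    return [1 + sum(dominates(other, s) for other in objs) for s in objs]


def average_fitness(objs):
    """Average fitness per solution, derived from the rank counts mu."""
    r = ranks(objs)
    n = len(objs)
    mu = {k: r.count(k) for k in set(r)}  # mu[k] = number of solutions of rank k
    return [
        n - sum(mu.get(k, 0) for k in range(1, ri)) - 0.5 * (mu[ri] - 1)
        for ri in r
    ]


objs = [(8, 5), (12, 9), (10, 5), (9, 4)]  # hypothetical (cost, profit) values
print(ranks(objs))            # [1, 1, 2, 2]
print(average_fitness(objs))  # [3.5, 3.5, 1.5, 1.5]
```

Solutions of the same rank receive the same average fitness; the sharing steps that follow are what differentiate them.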
(c) Normalized Distance
The normalized distance,
$d_{ij} = \sqrt{\sum_{k=1}^{M} \left( \frac{f_k^{i} - f_k^{j}}{f_k^{max} - f_k^{min}} \right)^2}$
where,
M = number of objective functions.
$f_k^{i}$ = kth objective function value for the ith solution.
$f_k^{max}$ and $f_k^{min}$ are the maximum and minimum values of the kth objective function.
(d) Sharing Distance
Sharing distance between solutions i and k,
$Sh(d_{ik}) = 1 - d_{ik}/\phi_{share}$, when $d_{ik} < \phi_{share}$
$Sh(d_{ik}) = 0$, otherwise
$d_{ik}$ = normalized distance between solution i and solution k
$\phi_{share}$ = a value randomly calculated for every iteration, $\phi_{share} \in (0, 1)$
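The normalized distance and the sharing function translate directly into code. The following is a sketch with hypothetical names and example values; it assumes that every objective has a non-zero range (f_max differs from f_min), since the normalization divides by that range.

```python
import math


def normalized_distance(f_i, f_j, f_min, f_max):
    """Euclidean distance between two objective vectors, each axis range-scaled."""
    return math.sqrt(sum(
        ((a - b) / (hi - lo)) ** 2
        for a, b, lo, hi in zip(f_i, f_j, f_min, f_max)
    ))


def sharing(d, phi_share):
    """Sharing function: decays linearly to 0 at distance phi_share."""
    return 1.0 - d / phi_share if d < phi_share else 0.0


# Hypothetical (cost, profit) values and their population-wide ranges.
d = normalized_distance((8, 5), (10, 5), f_min=(8, 4), f_max=(12, 9))
print(d)                 # 0.5
print(sharing(d, 0.8))   # close neighbours share a niche
print(sharing(0.9, 0.8)) # 0.0, too far apart to share
```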
(e) Niche Count
Niche count is calculated as:
$nc_i = \sum_{j=1}^{\mu(r_i)} Sh(d_{ij})$, summed over the sharing distances of all chromosomes j having the same rank as solution i.
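(f) Shared Fitness
Shared fitness, $F'_i = F^{AVG}_i / nc_i$
(g) Scaled Fitness
Scaled fitness, $F''_j = F'_j \cdot F^{AVG}_j \, \mu(r_j) / \sum F'$, where the sum runs over the shared fitness values of all solutions having the same rank as solution j.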
Figure 1: Snapshot of Fitness Calculation of a Sample run
The scaling factor here is 50 * 3/(4.65 + 4.673
+4.975) = 10.491. So, the factor 10.491 is multiplied
with column 5, to get the scaled fitness. This scaled
fitness helps to preserve the genetic diversity and
this method of fitness calculations is proposed by
Deb[4] and is known as MOGA.
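The scaling step can be reproduced numerically. The sketch below reads the three values in the denominator above as the shared fitness values of the three solutions in one rank, with average fitness 50; the function names are hypothetical.

```python
def niche_count(sharing_values):
    """Niche count: sum of the sharing-function values within one rank."""
    return sum(sharing_values)


def scale_rank(avg_fitness, shared):
    """Return the scaling factor and the scaled fitness values for one rank."""
    factor = avg_fitness * len(shared) / sum(shared)
    return factor, [factor * s for s in shared]


factor, scaled = scale_rank(50, [4.65, 4.673, 4.975])
print(round(factor, 3))  # 10.491
print(scaled)
```

After scaling, the fitness values of the rank sum to 50 × 3 = 150, so the mean fitness of the rank is back at its average fitness while the individual values still reflect crowding.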
Genetic Operators
Crossover operator: Greedy crossover. It is generally used in solving problems such as the Travelling Salesman Problem. Greedy Crossover was designed by J. Grefenstette. In the words of Sushil J. Louis: “Greedy Crossover selects the first city of one parent, compares the cities leaving that city in both the parents, and chooses the closer one to extend the tour. If one city has already appeared in the tour, we choose the other city. If both the cities have already appeared, we randomly select a non-selected city”.
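The quoted description can be turned into a short sketch. This is one possible reading of it, not the authors' code: "the cities leaving that city" is taken to mean the successor of the current city in each parent (wrapping around at the end of the tour), and the distance matrix and names are hypothetical.

```python
import random


def greedy_crossover(p1, p2, dist, rng=None):
    """Build a child tour by repeatedly following the closer parental successor."""
    rng = rng or random.Random()
    n = len(p1)
    # Successor of each city in each parent, treating the tours as cycles.
    succ1 = {p1[i]: p1[(i + 1) % n] for i in range(n)}
    succ2 = {p2[i]: p2[(i + 1) % n] for i in range(n)}
    child = [p1[0]]
    visited = {p1[0]}
    while len(child) < n:
        cur = child[-1]
        candidates = [c for c in (succ1[cur], succ2[cur]) if c not in visited]
        if candidates:
            # Choose the closer of the unvisited successors.
            nxt = min(candidates, key=lambda c: dist[cur][c])
        else:
            # Both successors already used: pick a random unvisited city.
            nxt = rng.choice([c for c in p1 if c not in visited])
        child.append(nxt)
        visited.add(nxt)
    return child


dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(greedy_crossover([0, 1, 2, 3], [0, 2, 1, 3], dist))  # [0, 1, 3, 2]
```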
Mutation: Mutation by 2-opt. The 2-opt method[6] is one of the best-known local search algorithms for the Travelling Salesman Problem. It improves the tour edge by edge by reversing the order of a sub-tour. For example, imagine a tour as shown in the upper part of Figure 2 below. Remove the two edges ab and cd, reverse the order of the sub-tour (from b to c), and add the two edges ac and
bd. This gives us a tour as shown in the lower part of Figure 2 below. The lower tour is shorter than the upper one because ab + cd > ac + bd. We check every pair of edges, for example, ab and cd. If ab + cd > ac + bd holds, we improve the tour in the same way as shown. In fact, if both ac > ab and bd > cd hold, the inequality cannot hold and it is not necessary to check the pair. Therefore we can skip the pairs whose edges are far away from each other. We repeat the procedure described above until no further improvement can be made.
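A plain version of this procedure is sketched below. It is a minimal illustration that checks every pair of non-adjacent edges and omits the skipping of far-apart pairs; it assumes a symmetric distance matrix, and the matrix and function names are hypothetical.

```python
def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))


def two_opt(tour, dist):
    """Repeat 2-opt moves until no pair of edges can be improved."""
    best = list(tour)
    n = len(best)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city
                a, b = best[i], best[i + 1]
                c, d = best[j], best[(j + 1) % n]
                # Replace edges ab and cd by ac and bd if that is shorter.
                if dist[a][b] + dist[c][d] > dist[a][c] + dist[b][d]:
                    best[i + 1:j + 1] = best[i + 1:j + 1][::-1]
                    improved = True
    return best


dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
start = [0, 2, 1, 3]
result = two_opt(start, dist)
print(tour_length(start, dist), tour_length(result, dist))  # 29 18
```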
Figure 2: Application of Mutation by 2 opt
In this paper we have used our own Random Number Generator algorithm, based on a linear feedback shift register (LFSR). Let us describe how it works:
An LFSR is a shift register (see Figure 3) which, when clocked, advances the signal through the register from one bit to the next most significant bit (see Figure 4). Some of the outputs are combined in an exclusive-OR configuration to form a feedback mechanism.
Figure 3: Shift Register
A linear feedback shift register can be formed by performing exclusive-OR on the outputs of two or more of the flip-flops and feeding the result back into the input of one of the flip-flops (see Figure 4).
Figure 4: Linear Feedback Shift Register
A maximal-length LFSR produces the maximum number of patterns possible and has a pattern count of $2^n - 1$, where n is the number of register elements in the LFSR.
The initial value of the LFSR is called the seed. The operation of the register is deterministic: the nth state is completely determined by the (n – 1)th state. Likewise, because the register has a finite number of possible states, it must eventually enter a repeating cycle.
However, an LFSR with a well-chosen feedback function can produce a sequence of bits which has a very long cycle and hence appears to be random. The sequence of numbers generated can be considered a numeral system very similar to Gray code.
The output bits that influence the input are
called taps.
The tap sequence of an LFSR can be represented as a polynomial mod 2, which means that each coefficient of the polynomial is either 0 or 1. This is called the feedback or characteristic polynomial. For example, if the taps are the 11th, 13th, 14th and 16th bits, the resulting characteristic polynomial is
$F(x) = x^{11} + x^{13} + x^{14} + x^{16} + 1$
Note: The 1 in the above equation does not represent a tap.
The LFSR for the above polynomial is shown below:
Figure 5: LFSR of Characteristic polynomial $x^{11} + x^{13} + x^{14} + x^{16} + 1$
Algorithm for Pseudo Random Generator using LFSR
Step 1: Convert the initial seed to a hex number. For each hex digit generate the 4-bit binary string.
Step 2: For j = 1 to j = 10
Begin loop
• Generate a new bit by XORing the bits in the tap positions. Shift the LFSR bits. Clock the LFSR to generate 1 output bit (say y).
• val = val XOR y.
End loop
Step 3: val = val/1024.
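A possible Python rendering of these steps is sketched below. Several details are assumptions rather than facts from the paper: the register is 16 bits wide with taps at bits 16, 14, 13 and 11 (the example polynomial given earlier), it shifts to the right, the output bit y is the bit shifted out, and the 10 output bits are concatenated into val (shift and OR) so that val/1024 lies in [0, 1). The function names and the seed 0xACE1 are hypothetical.

```python
def lfsr_step(state, taps, nbits):
    """Clock the register once; return (new_state, output_bit)."""
    feedback = 0
    for t in taps:
        # Tap t (counted from 1 at the input end) sits at bit nbits - t.
        feedback ^= (state >> (nbits - t)) & 1
    out = state & 1  # bit that is shifted out of the register
    return (state >> 1) | (feedback << (nbits - 1)), out


def lfsr_random(seed, taps=(16, 14, 13, 11), nbits=16, out_bits=10):
    """Return (value in [0, 1), new register state) after out_bits clocks."""
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero: the all-zero state never changes")
    val = 0
    for _ in range(out_bits):
        state, y = lfsr_step(state, taps, nbits)
        val = (val << 1) | y  # assumption: append y as the next bit of val
    return val / (1 << out_bits), state


x, state = lfsr_random(0xACE1)
print(x)  # 0.52734375
```

Passing the returned state back in as the next seed continues the sequence, so successive calls yield a stream of pseudo-random numbers.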
4. RESULTS AND DISCUSSION
For simplicity, let us consider the characteristic polynomial $x^4 + x + 1$. The above algorithm generated the following bit sequence (see Figure 6):
Figure 6: Output for Characteristic Polynomial $x^4 + x + 1$
The periodicity is $2^4 - 1 = 15$. We take the last 10 bits and divide the resulting value by 1024 to get the number.
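The period of 15 can be confirmed by clocking a 4-bit register until it returns to its seed. The sketch assumes a right-shifting register in which tap t reads bit (nbits – t), so the polynomial x^4 + x + 1 corresponds to taps 4 and 1; the function name is hypothetical.

```python
def lfsr_period(taps, nbits, seed=1):
    """Count the clocks needed for the register to return to its seed state."""
    state = seed
    period = 0
    while True:
        feedback = 0
        for t in taps:
            feedback ^= (state >> (nbits - t)) & 1
        state = (state >> 1) | (feedback << (nbits - 1))
        period += 1
        if state == seed:
            return period


print(lfsr_period(taps=(4, 1), nbits=4))                 # 15
print(lfsr_period(taps=(16, 14, 13, 11), nbits=16))      # 65535
```

Both registers are maximal-length: they visit every non-zero state once before repeating, giving 2^4 – 1 = 15 and 2^16 – 1 = 65535 states respectively.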
In this implementation the convergence
criterion is fixed at the number of generations i.e.
the user will have to input the number of iterations
which the algorithm has to perform before
converging.
Elitism is a remedy for the problem of losing better solutions during evolution. The fitness calculations used here are non-elitist in nature.
The algorithm is implemented in Visual C++. The output is a text file containing a set of Pareto-optimal solutions in ascending order of fitness.
Figure 7: Part I of the Output File
Figure 8: Part II of the Output File
5. CONCLUSION
In this implementation we have used the MOGA fitness assignment algorithm. It ensures that no two vectors in a population have the same fitness value. This helps in exploring the solution space. However, the choice of the objective functions remains critical.
REFERENCES
[1] Chia-Hsuan Yeh: Graduate Course: An Introduction to Genetic Algorithms.
[2] Fonseca and Fleming: Genetic Algorithms for Multiobjective Optimization: Formulation, Discussion and Generalization.
[3] David E. Goldberg: Genetic Algorithms in Search, Optimization and Machine Learning.
[4] Kalyanmoy Deb: Multi-Objective Optimization using Evolutionary Algorithms.
[5] Fonseca and Fleming: An Overview of Evolutionary Algorithms in Multi Objective Optimization.
[6] Mitsunki Matayoshi: A Genetic Algorithm with the Improved 2-opt Method.
[7] K.F. Man, K.S. Tang, S. Kwong: Genetic Algorithms: Concepts and Applications.
[8] Peter J.B. Hancock: An Empirical Comparison of Selection Methods in Evolutionary Algorithms.
[9] S.J. van Vuuren: Application of Genetic Algorithms: Determination of Optimal Pipe Diameters.
[10] Hiroaki Sengoku, Ikuo Yoshihara: A Fast TSP Solver in Java.
[11] Carlos A. Coello: An Updated Survey of GA Based Multi Objective Optimization.
[12] Francesco di Pierro, Soon Thiam Khu, Slobodan Djordjevic and Dragan A. Savic: A New Genetic Algorithm to Solve Effectively High Multiobjective Optimization.
[13] Luis Paquete, Marco Chiarandini and Thomas Stützle: A Study of Local Optima in the Bi-Objective Travelling Salesman Problem.
[14] Jorge Riera-Ledesma, Juan Jose Salazar-Gonzalez: The Biobjective Travelling Salesman Problem.