1. Soft Computing
Lecture 06: Introduction to Genetic Algorithms
“Genetic Algorithms are good at taking large, potentially huge
search spaces and navigating them, looking for optimal combinations
of things, solutions you might not otherwise find in a lifetime.”
- Salvatore Mangano, Computer Design, May 1995
2. GENETIC ALGORITHM
A biologically inspired model of intelligence: the
principles of biological evolution are applied to find solutions
to difficult problems
The problems are not solved by reasoning logically about
them; rather populations of competing candidate solutions
are spawned and then evolved to become better solutions
through a process patterned after biological evolution
Less worthy candidate solutions tend to die out, while those
that show promise of solving a problem survive and
reproduce by constructing new solutions out of their
components
3. GENETIC ALGORITHM
GAs begin with a population of candidate problem solutions
Candidate solutions are evaluated according to their ability
to solve problem instances: only the fittest survive and
combine with each other to produce the next generation of
possible solutions
Thus increasingly powerful solutions emerge in a Darwinian
universe
Learning is viewed as a competition among a population of
evolving candidate problem solutions
This method is heuristic in nature and it was introduced by
John Holland in 1975
4. GENETIC ALGORITHM
Basic Algorithm
begin
    set time t = 0;
    initialise population P(t) = {x_1^t, x_2^t, …, x_n^t} of solutions;
    while the termination condition is not met do
    begin
        evaluate fitness of each member of P(t);
        select some members of P(t) for creating offspring;
        produce offspring by genetic operators;
        replace some members with the new offspring;
        set time t = t + 1;
    end
end
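The loop above can be sketched in Python roughly as follows. This is only an illustrative skeleton, not the lecture's implementation: the function name run_ga, its parameters, the full generational replacement, and the toy operators inside demo are all assumptions made for the example.

```python
import random

def run_ga(init_population, fitness, select, crossover, mutate,
           max_generations=100, target_fitness=None, rng=None):
    """Generic GA loop: evaluate, select, produce offspring, replace."""
    rng = rng or random.Random()
    population = init_population(rng)              # P(0)
    for t in range(max_generations):               # termination: generation budget
        scores = [fitness(x) for x in population]  # evaluate each member of P(t)
        if target_fitness is not None and max(scores) >= target_fitness:
            break                                  # termination: fitness threshold
        offspring = []
        while len(offspring) < len(population):
            p1 = select(population, scores, rng)   # select parents
            p2 = select(population, scores, rng)
            c1, c2 = crossover(p1, p2, rng)        # genetic operators
            offspring.extend([mutate(c1, rng), mutate(c2, rng)])
        # Assumption: the whole population is replaced by the offspring.
        population = offspring[:len(population)]
    return max(population, key=fitness)

def demo(seed=0):
    """Toy run: maximise the number of ones in a 10-bit list (hypothetical operators)."""
    length, size = 10, 6
    return run_ga(
        init_population=lambda r: [[r.randint(0, 1) for _ in range(length)]
                                   for _ in range(size)],
        fitness=sum,
        select=lambda pop, scores, r: max(r.sample(pop, 2), key=sum),
        crossover=lambda a, b, r: (a[:5] + b[5:], b[:5] + a[5:]),
        mutate=lambda c, r: [bit ^ (r.random() < 0.05) for bit in c],
        max_generations=50, target_fitness=length, rng=random.Random(seed))
```

Passing the operators in as functions keeps the loop independent of the chromosome representation, so the same skeleton can be reused with any selection, crossover, or mutation operator.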
5. GENETIC ALGORITHM
The Evolutionary Cycle
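[Figure: the evolutionary cycle. The population is initiated and evaluated; selection picks parents; modification produces modified offspring; evaluation produces evaluated offspring, which join the population; deleted members are discarded.]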
6. GENETIC ALGORITHM
Representation of Solutions: The Chromosome
Gene: A basic unit, which represents one characteristic of the
individual. The value of each gene is called an allele
Chromosome: A string of genes; it represents an individual
i.e. a possible solution of a problem. Each chromosome
represents a point in the search space
Population: A collection of chromosomes
An appropriate chromosome representation is important for
the efficiency and complexity of the GA
7. GENETIC ALGORITHM
Representation of Solutions: The Chromosome
The classical representation scheme for chromosomes is
binary vectors of fixed length
In the case of an I-dimensional search space, each
chromosome consists of I variables with each variable
encoded as a bit string
8. GENETIC ALGORITHM
Example: Cookies Problem
Two parameters: sugar and flour (in kg). The range for both
is 0 to 9 kg. Therefore a chromosome will comprise two
genes, called sugar and flour
Chromosome # 01: 5 1
Chromosome # 02: 2 4
9. GENETIC ALGORITHM
Example: Expression satisfaction Problem
F = (¬a ∨ c) ∧ (¬a ∨ c ∨ ¬e)
∧ (¬b ∨ c ∨ d ∨ ¬e) ∧ (a ∨ ¬b ∨ c)
∧ (¬e ∨ f)
Chromosome: Six binary genes abcdef e.g. 100111
10. GENETIC ALGORITHM
Representation of Solutions: The Chromosome
Chromosomes have either binary or real valued genes
In binary coded chromosomes, every gene has two alleles
In real coded chromosomes, a gene can be assigned any value
from a domain of values
12. GENETIC ALGORITHM
Chromosomes Encoding
A potential model of the data can be represented as a
chromosome with the genetic representation:
Gene # 1 Gene # 2 Gene # 3 Gene # 4
Restaurant Meal Day Cost
The alleles of genes are:
Restaurant gene: Sam, Lobdell, Sarah, X
Meal gene: breakfast, lunch, X
Day gene: Friday, Saturday, Sunday, X
Cost gene: cheap, expensive, X
13. GENETIC ALGORITHM
Chromosomes Encoding (Hypotheses Representation)
Hypotheses are often represented by bit strings (because they
can be easily manipulated by genetic operators), but other
numerical and symbolic representations are also possible
Set of if-then rules:
Specific sub-strings are allocated for encoding each
rule pre-condition and post-condition
Example: Suppose we have an attribute “Outlook”
which can take on values: Sunny, Overcast or Rain
14. GENETIC ALGORITHM
Chromosomes Encoding (Hypotheses Representation)
We can represent it with 3 bits:
100 would mean the value Sunny,
010 would mean Overcast &
001 would mean Rain
110 would mean Sunny or Overcast
111 would mean that we don’t care about its value
The pre-conditions and post-conditions of a rule are
encoded by concatenating the individual representations of
the attributes
15. GENETIC ALGORITHM
Chromosomes Encoding (Hypotheses Representation)
Example:
If (Outlook = Overcast or Rain) and Wind = Strong
then PlayTennis = No
can be encoded as 0111001 (Outlook = 011, Wind = 10, PlayTennis = 01)
Another rule
If Wind = Strong
then PlayTennis = Yes
can be encoded as 1111010 (Outlook = 111, Wind = 10, PlayTennis = 10)
16. GENETIC ALGORITHM
Chromosomes Encoding (Hypotheses Representation)
A hypothesis comprising both of these rules can be
encoded as the chromosome
01110011111010
Note that even if an attribute does not appear in a rule, we
reserve its place in the chromosome, so that we can have
fixed length chromosomes
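A minimal sketch of this encoding follows. The helper names are hypothetical, and the bit order for Wind (Strong, Weak) and PlayTennis (Yes, No) is an assumption inferred from the two encoded rules; the slides do not list the Wind values, so "Weak" is assumed.

```python
OUTLOOK = ["Sunny", "Overcast", "Rain"]
WIND = ["Strong", "Weak"]   # assumed value set and order
PLAY = ["Yes", "No"]        # assumed order: Yes = 10, No = 01

def encode_attribute(values, allowed=None):
    """One bit per possible value; None means "don't care" (all ones)."""
    if allowed is None:
        return "1" * len(values)
    return "".join("1" if v in allowed else "0" for v in values)

def encode_rule(outlook=None, wind=None, play=None):
    """Concatenate the attribute sub-strings into one fixed-length rule."""
    return (encode_attribute(OUTLOOK, outlook)
            + encode_attribute(WIND, wind)
            + encode_attribute(PLAY, play))

rule1 = encode_rule(outlook={"Overcast", "Rain"}, wind={"Strong"}, play={"No"})
rule2 = encode_rule(wind={"Strong"}, play={"Yes"})
hypothesis = rule1 + rule2   # chromosome holding both rules
print(rule1, rule2, hypothesis)
```

Because an attribute that is missing from a rule is written as all ones, every rule occupies the same 7 bits and the chromosome length stays fixed.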
17. GENETIC ALGORITHM
Variable size chromosomes
Sometimes we need a variable size chromosome; e.g. to
represent a set of rules
Example:
Suppose we are representing a set of rules by a chromosome
If a1 = T and a2 = F then c = T
If a2 = T then c = F
The chromosome would be 10 01 1 11 10 0
where a1 = T is represented by 10,
a2 = F by 01,
and so on
19. GENETIC ALGORITHM
Example: Cookies Problem
Two parameters: sugar and flour (in kg). The range for both
is 0 to 9 kg. Therefore a chromosome will comprise two
genes, called sugar and flour
Chromosome # 01: 5 1
Chromosome # 02: 2 4
The fitness function for a chromosome is the taste of the
resulting cookies; range of 1 to 9
20. GENETIC ALGORITHM
Example: Expression satisfaction Problem
F = (¬a ∨ c) ∧ (¬a ∨ c ∨ ¬e)
∧ (¬b ∨ c ∨ d ∨ ¬e) ∧ (a ∨ ¬b ∨ c)
∧ (¬e ∨ f)
Chromosome: Six binary genes abcdef e.g. 100111
Fitness function: number of clauses having a truth value of 1
e.g. 010010 has fitness 2 (only the first two clauses are true)
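The clause-counting fitness can be checked with a short sketch. The function name clause_count is an assumption; the five clauses are copied from the expression F above.

```python
def clause_count(chromosome: str) -> int:
    """Fitness = number of clauses of F that evaluate to true."""
    a, b, c, d, e, f = (bit == "1" for bit in chromosome)
    clauses = [
        (not a) or c,
        (not a) or c or (not e),
        (not b) or c or d or (not e),
        a or (not b) or c,
        (not e) or f,
    ]
    return sum(clauses)

print(clause_count("010010"))  # 2
print(clause_count("100111"))  # 3
```

A chromosome satisfies F exactly when its fitness reaches 5, the number of clauses.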
21. GENETIC ALGORITHM
Model Learning
Use GA to learn the concept Yes Reaction from the Food
Allergy problem’s data
The fitness function can be the number of training samples
correctly classified by a chromosome (model)
22. GENETIC ALGORITHM
Population Size
Number of individuals present and competing in an iteration
(generation)
If the population size is too large, the processing time is high
and the GA tends to take longer to converge upon a
solution (because less fit members have to be selected to
make up the required population)
If the population size is too small, the GA is in danger of
premature convergence upon a sub-optimal solution (all
chromosomes will soon have identical traits). This is
primarily because there may not be enough diversity in
the population to allow the GA to escape local optima
23. GENETIC ALGORITHM
Selection Operators (Algorithms)
They are used to select parents from the current population
The selection is primarily based on the fitness. The better the
fitness of a chromosome, the greater its chance of being
selected to be a parent
24. GENETIC ALGORITHM
Selection Operators: Random Selection
Individuals are selected randomly with no reference to fitness
at all
All the individuals, good or bad, have an equal chance of
being selected
25. GENETIC ALGORITHM
Selection Operators: Proportional Selection
Chromosomes are selected based on their fitness relative to
the fitness of all other chromosomes
For this, all the fitness values are added to form a sum S and each
chromosome is assigned a relative fitness (which is its fitness
divided by the total fitness S)
A process similar to spinning a roulette wheel is adopted to
choose a parent; the better a chromosome’s relative fitness,
the higher its chances of selection
26. GENETIC ALGORITHM
Selection Operators: Proportional Selection
The selection of only the fittest chromosomes may result
in the loss of a correct gene value which may be present in a
less fit member (and then the only chance of getting it back is
by mutation)
One way to overcome this risk is to assign probability of
selection to each chromosome based on its fitness
In this way even the less fit members have some chance of
surviving into the next generation
28. GENETIC ALGORITHM
Selection Operators: Proportional Selection
The probability of selection of a chromosome “i” may be
calculated as
$p_i = \dfrac{\text{fitness}_i}{\sum_j \text{fitness}_j}$
Example
Chromosome   Fitness   Selection Probability
1            7         7/14
2            4         4/14
3            2         2/14
4            1         1/14
30. GENETIC ALGORITHM
Selection Operators: Proportional Selection Algorithm
1. [Sum] Calculate the sum S of all chromosome fitnesses in the population.
2. [Assign] Assign each chromosome a range, proportional to its fitness, on a line from 0 to S.
3. [Select] Generate a random number r from the interval (0, S).
4. [Select] Select the chromosome whose range contains r.
Of course, step 1 is performed only once for each population.
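The four steps can be sketched as below. The function name roulette_select is hypothetical; the cumulative sums play the role of the range boundaries on the line from 0 to S, and a binary search finds the range containing r.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_select(fitnesses, rng=random):
    """Return the index of the chromosome chosen by one spin of the wheel."""
    cumulative = list(accumulate(fitnesses))   # steps 1-2: S and range boundaries
    r = rng.uniform(0, cumulative[-1])         # step 3: random point on 0..S
    idx = bisect_right(cumulative, r)          # step 4: range that contains r
    return min(idx, len(fitnesses) - 1)        # guard against r == S

# Example with the fitness values 7, 4, 2, 1 from the table above.
rng = random.Random(0)
picks = [roulette_select([7, 4, 2, 1], rng) for _ in range(10)]
print(picks)
```

In a real run the cumulative list would be built once per population and reused for every spin, which is the point of performing step 1 only once.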
31. GENETIC ALGORITHM
Selection Operators: Rank based selection
Rank based selection uses the rank ordering of the fitness
values to determine the probability of selection and not the
fitness values themselves
This means that the selection probability is independent of
the actual fitness value
Ranking therefore has the advantage that a highly fit
individual will not dominate in the selection process as a
function of the magnitude of its fitness
32. GENETIC ALGORITHM
Selection Operators: Rank based selection
Proportional selection has problems when the fitness values differ very much. For
example, if the best chromosome's fitness takes up 90% of the roulette wheel, then
the other chromosomes will have very few chances to be selected.
Rank selection first ranks the population, and then every chromosome receives
its fitness from this ranking. The worst will have fitness 1, the second worst 2, etc., and
the best will have fitness N (the number of chromosomes in the population)
[Figure: roulette wheel before ranking and after ranking, showing how the situation
changes when fitness is replaced by order number]
33. GENETIC ALGORITHM
Selection Operators: Rank based selection
The population is sorted from best to worst according to the
fitness
Each chromosome is then assigned a new
fitness based on a linear ranking function
New Fitness = (P – r) + 1
where P = population size, r = fitness rank of the chromosome
If P = 11, then a chromosome of rank 1 will have a New
Fitness of 10 + 1 = 11, and a chromosome of rank 6 will have 5 + 1 = 6
34. GENETIC ALGORITHM
Selection Operators: Rank based selection
A user adjusted slope can also be incorporated
New Fitness = {(P – r) (max - min)/(P – 1)} + min
where max and min are set by the user to determine the slope
(max - min)/(P – 1) of the function
Let P = 11, max = 8, min = 3,
then a chromosome of rank 1 will have a New fitness of
10*5/10 + 3 = 8
& a chromosome of rank 6 will have 5*5/10 + 3 = 5.5
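Both ranking formulas can be checked with a small sketch. The function names are hypothetical; rank 1 denotes the best chromosome, as in the examples above.

```python
def rank_population(fitnesses):
    """Return the rank of each chromosome (1 = best fitness)."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i], reverse=True)
    ranks = [0] * len(fitnesses)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

def linear_rank_fitness(rank, pop_size, max_fit=None, min_fit=None):
    """New fitness from rank; with max_fit/min_fit the slope is user adjusted."""
    if max_fit is None or min_fit is None:
        return (pop_size - rank) + 1
    return (pop_size - rank) * (max_fit - min_fit) / (pop_size - 1) + min_fit

print(linear_rank_fitness(1, 11), linear_rank_fitness(6, 11))              # 11 6
print(linear_rank_fitness(1, 11, 8, 3), linear_rank_fitness(6, 11, 8, 3))  # 8.0 5.5
```

The new fitness depends only on the rank, so a chromosome whose raw fitness is far larger than the rest no longer dominates the wheel.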
35. GENETIC ALGORITHM
Selection Operators: Rank based selection
Once the new fitness is assigned, parents are selected by the
same roulette wheel procedure used in proportionate
selection
36. GENETIC ALGORITHM
Selection Operators: Tournament Selection
Extracts k individuals from the population with uniform
probability (without re-insertion) and makes them play a
“tournament”, where the probability for an individual to
win is generally proportional to its fitness
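One possible reading of this description is sketched below. The function name tournament_select is hypothetical, and the sketch assumes that all fitness values are positive so they can be used directly as winning weights.

```python
import random

def tournament_select(population, fitnesses, k, rng=random):
    """Draw k distinct individuals uniformly, then pick a winner with
    probability proportional to fitness among the contestants."""
    contestants = rng.sample(range(len(population)), k)   # without re-insertion
    weights = [fitnesses[i] for i in contestants]         # assumed positive
    winner = rng.choices(contestants, weights=weights, k=1)[0]
    return population[winner]

rng = random.Random(1)
print(tournament_select(["a", "b", "c", "d"], [7, 4, 2, 1], k=2, rng=rng))
```

A common deterministic variant simply returns the fittest of the k contestants; larger k then means stronger selection pressure.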
37. GENETIC ALGORITHM
Reproduction Operators
Genetic operators are applied to chromosomes that are
selected to be parents, to create offspring
Basically of two types: Crossover and Mutation
Crossover operators create offspring by recombining the
chromosomes of selected parents
Mutation is used to make small random changes to a
chromosome in an effort to add diversity to the population
38. GENETIC ALGORITHM
Reproduction Operators: Crossover
Crossover operation takes two candidate solutions and
divides them, swapping components to produce two new
candidates
39. GENETIC ALGORITHM
Reproduction Operators: Crossover
Figure illustrates crossover on bit string patterns of length 8
The operator splits them and forms two children whose initial
segment comes from one parent and whose tail comes from
the other
Input bit strings: 11#0101# #110#0#1
Resulting strings: 11#0#0#1 #110101#
41. GENETIC ALGORITHM
Reproduction Operators: Crossover
The place of split in the candidate solution is an arbitrary
choice. This split may be at any point in the solution
This splitting point may be randomly chosen or changed
systematically during the solution process
Crossover can unite an individual that is doing well in one
dimension with another individual that is doing well in the
other dimension
42. GENETIC ALGORITHM
Reproduction Operators: Crossover
Two types: Single point crossover & Uniform crossover
Single point crossover
This operator takes two parents and randomly selects a
single point between two genes to cut both chromosomes
into two parts (this point is called cut point)
The first part of the first parent is combined with the
second part of the second parent to create the first child
The first part of the second parent is combined with the
second part of first parent to create the second child
Parents: 1000010 and 1110001 (cut point after gene 5)
Children: 1000001 and 1110010
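A minimal sketch of single point crossover on bit strings follows. The function name and the optional cut argument are assumptions for illustration; when cut is omitted, a cut point is drawn at random between two genes.

```python
import random

def single_point_crossover(p1, p2, cut=None, rng=random):
    """Cut both parents at the same point and swap their tails."""
    if len(p1) != len(p2):
        raise ValueError("parents must have the same length")
    if cut is None:
        cut = rng.randint(1, len(p1) - 1)   # a point strictly between two genes
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

# Parents from the slide, cut after gene 5.
print(single_point_crossover("1000010", "1110001", cut=5))
# ('1000001', '1110010')
```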
43. GENETIC ALGORITHM
Reproduction Operators: Crossover
Uniform crossover
The value of each gene of an offspring’s chromosome is
randomly taken from either parent
This is equivalent to multiple point crossover
Parents: 1000010 and 1110001
Child: 1010010
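Uniform crossover can be sketched in a few lines. The function name is hypothetical, and the sketch assumes each gene is taken from either parent with equal probability, which the slide does not specify.

```python
import random

def uniform_crossover(p1, p2, rng=random):
    """Build one child by taking each gene from a randomly chosen parent."""
    return "".join(rng.choice(pair) for pair in zip(p1, p2))

child = uniform_crossover("1000010", "1110001", rng=random.Random(0))
print(child)
```

Genes on which both parents agree are always inherited unchanged; only the positions where the parents differ are actually random.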
44. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
Sometimes we need a variable size chromosome; e.g. to
represent a set of rules
Example:
Suppose we are representing a set of rules by a chromosome
If a1 = T and a2 = F then c = T
If a2 = T then c = F
The chromosome would be 10 01 1 11 10 0
where a1 = T is represented by 10,
a2 = F by 01,
and so on
45. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
The sub-strings can be considered as a single entity during
cross-over (i.e. crossover point is not allowed in the middle of
the sub-string)
Another way can be to allow all possible crossovers, but assign
a very low fitness to resulting chromosomes which have
undesirable sub-string meaning(s)
e.g. 01110011111011
would mean that we do not care whether we play tennis or
not (the last two bits, 11, allow both Yes and No)
46. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
For such chromosomes we use a modified cross-over operator
To perform a crossover operation on two parents, two
crossover points are first chosen at random in one of the
parents
Example: Let the two parents be
10 01 1 11 10 0
and 01 11 0 10 01 0
47. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
Suppose the crossover points chosen randomly for the first
parent are after bit positions 1 and 8
1st parent 1[0 01 1 11 1]0 0
2nd parent 01 11 0 10 01 0
Let d1 denote the distance from the leftmost of these crossover
points to the rule boundary immediately to the left
d1 = 1
Let d2 denote the distance from the rightmost of these
crossover points to the rule boundary immediately to the left
d2 = 3
48. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
The crossover points in the second parent are now randomly
chosen, subject to the constraint that they must have the same
d1 and d2 values
1st parent 1[0 01 1 11 1]0 0 d1 = 1, d2 = 3
2nd parent 01 11 0 10 01 0
The possible crossover points for the 2nd parent are at bit
positions (1, 3), (1, 8), and (6, 8)
2nd parent, with d1 = 1, d2 = 3:
0[1 1]1 0 10 01 0 (1, 3)
0[1 11 0 10 0]1 0 (1, 8)
01 11 0 1[0 0]1 0 (6, 8)
49. GENETIC ALGORITHM
Reproduction Operators: Crossover (Variable size chromosomes)
Suppose crossover points (1, 3) happen to be chosen for the 2nd
parent
1st parent 1[0 01 1 11 1]0 0
2nd parent 0[1 1]1 0 10 01 0
The resulting two offspring would be
11 10 0
and
00 01 1 11 11 0 10 01 0
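The worked example can be reproduced with the sketch below. The function names are hypothetical; chromosomes are written without spaces, each rule is 5 bits long, and a crossover point "after bit position p" corresponds to the Python slice index p.

```python
def compatible_points(length, rule_len, d1, d2):
    """All point pairs (a, b), a < b, whose distances to the rule boundary
    immediately to the left are d1 and d2 respectively."""
    lefts = [p for p in range(1, length) if p % rule_len == d1]
    rights = [p for p in range(1, length) if p % rule_len == d2]
    return [(a, b) for a in lefts for b in rights if a < b]

def swap_segments(p1, p2, points1, points2):
    """Exchange the segments between the crossover points of the two parents."""
    (a1, b1), (a2, b2) = points1, points2
    child1 = p1[:a1] + p2[a2:b2] + p1[b1:]
    child2 = p2[:a2] + p1[a1:b1] + p2[b2:]
    return child1, child2

parent1 = "1001111100"   # 10 01 1 11 10 0
parent2 = "0111010010"   # 01 11 0 10 01 0
print(compatible_points(len(parent2), 5, d1=1, d2=3))   # [(1, 3), (1, 8), (6, 8)]
print(swap_segments(parent1, parent2, (1, 8), (1, 3)))
# ('11100', '000111111010010')
```

The offspring have 5 and 15 bits, that is one rule and three rules, so every child still consists of whole rules even though the chromosome length changes.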
50. GENETIC ALGORITHM
Reproduction Operators: Mutation
Mutation is another important genetic operator
Mutation takes a single candidate and randomly changes
some aspect (gene) of it
For example, mutation may randomly select a bit in the
pattern and change it, switching a 1 to a 0 or to # (don’t care)
51. GENETIC ALGORITHM
Reproduction Operators: Mutation
Mutation is important in that the initial population may
exclude an essential component of a solution
For example, if no member of the initial population has a 1 in
the first position, then crossover alone cannot produce a child
with a 1 in that position, so a solution that needs it can never
be reached
52. GENETIC ALGORITHM
Reproduction Operators: Mutation
Each gene of each offspring is mutated with a given
mutation rate pµ (say 0.01)
It is hence possible that no gene may be mutated for many
generations. On the other hand more than one gene may
be mutated in the same generation (or even in the same
chromosome)
For real valued genes, the new value is selected randomly
from the gene's alleles (its domain of values)
If the rate is too low, new traits will appear too slowly in
the population. If the rate is too high, each generation will
be unrelated to the previous generation
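For binary chromosomes, per-gene mutation can be sketched as a bit flip. The function name mutate_bits is hypothetical, and flipping the bit is an assumed choice of mutation for binary genes; rate stands for the mutation rate mentioned above.

```python
import random

def mutate_bits(chromosome, rate=0.01, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )

print(mutate_bits("1110110101", rate=0.0))  # 1110110101 (no gene mutated)
print(mutate_bits("1110110101", rate=1.0))  # 0001001010 (every gene mutated)
```

With a rate of 0.01 and 10-bit chromosomes, about one bit in a hundred changes, so most offspring pass through unmodified, which matches the remark that no gene may be mutated for many generations.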
53. GENETIC ALGORITHM
Q: Is it some kind of learning technique like neural networks?
A: No
Q: Then, what is it?
54. GENETIC ALGORITHM
Search Techniques
• Calculus Based Techniques
– Fibonacci
– Sort
• Guided random search techniques
– Tabu Search
– Hill Climbing
– Simulated Annealing
– Evolutionary Algorithms: Genetic Programming, Genetic Algorithms
• Enumerative Techniques
– DFS
– Dynamic Programming
– BFS
Figure: Taxonomy of searching techniques
55. GENETIC ALGORITHM
GA Quick Overview
• Developed: USA in the 1970’s
• Early names: J. Holland, K. DeJong, D. Goldberg
• Typically applied to:
– discrete optimization
• Attributed features:
– not too fast
– good heuristic for combinatorial problems
• Special Features:
– Traditionally emphasizes combining information from good
parents (crossover)
– many variants, e.g., reproduction models, operators
56. GENETIC ALGORITHM
The MAXONE problem
Suppose we want to maximize the number
of ones in a string of l binary digits
Is it a trivial problem?
It may seem so because we know the answer
in advance
However, we can think of it as maximizing
the number of correct answers, each
encoded by 1, to l difficult yes/no questions
57. GENETIC ALGORITHM
The MAXONE problem
• An individual is encoded (naturally) as a
string of l binary digits
• The fitness f of a candidate solution to the
MAXONE problem is the number of ones in
its genetic code
• We start with a population of n random
strings. Suppose that l = 10 and n = 6
58. GENETIC ALGORITHM
The MAXONE problem (initialization step)
We toss a fair coin 60 times and get the
following initial population:
s1 = 1111010101 f (s1) = 7
s2 = 0111000101 f (s2) = 5
s3 = 1110110101 f (s3) = 7
s4 = 0100010011 f (s4) = 4
s5 = 1110111101 f (s5) = 8
s6 = 0100110000 f (s6) = 3
59. GENETIC ALGORITHM
The MAXONE problem (selection step)
Next we apply fitness proportionate selection with
the roulette wheel method: individual i has
the probability of being chosen $\dfrac{f(i)}{\sum_j f(j)}$
We repeat the extraction as many times as the
number of individuals we need to have the same
parent population size (6 in our case)
[Figure: roulette wheel with sectors 1, 2, 3, 4, …, n; the area of each sector is proportional to its fitness value]
60. GENETIC ALGORITHM
The MAXONE problem (selection step)
Suppose that, after performing selection, we
get the following population:
s1` = 1111010101 (s1)
s2` = 1110110101 (s3)
s3` = 1110111101 (s5)
s4` = 0111000101 (s2)
s5` = 0100010011 (s4)
s6` = 1110111101 (s5)
61. GENETIC ALGORITHM
The MAXONE problem (crossover step)
Next we mate strings for crossover. For each
couple we decide according to crossover
probability (for instance 0.6) whether to
actually perform crossover or not
Suppose that we decide to actually perform
crossover only for couples (s1`, s2`) and (s5`,
s6`). For each couple, we randomly extract a
crossover point, for instance 2 for the first
and 5 for the second
62. GENETIC ALGORITHM
The MAXONE problem (crossover step)
Before crossover:
s1` = 1111010101 s5` = 0100010011
s2` = 1110110101 s6` = 1110111101
After crossover:
s1`` = 1110110101 s5`` = 0100011101
s2`` = 1111010101 s6`` = 1110110011
63. GENETIC ALGORITHM
The MAXONE problem (mutation step)
The final step is to apply random mutation: for each bit
that we are to copy to the new population we allow a
small probability of error (for instance 0.1)
Before applying mutation:
s1`` = 1110110101
s2`` = 1111010101
s3`` = 1110111101
s4`` = 0111000101
s5`` = 0100011101
s6`` = 1110110011
64. GENETIC ALGORITHM
The MAXONE problem (mutation step)
After applying mutation:
s1``` = 1110100101 f (s1``` ) = 6
s2``` = 1111110100 f (s2``` ) = 7
s3``` = 1110101111 f (s3``` ) = 8
s4``` = 0111000101 f (s4``` ) = 5
s5``` = 0100011101 f (s5``` ) = 5
s6``` = 1110110001 f (s6``` ) = 6
65. GENETIC ALGORITHM
The MAXONE problem (example end)
In one generation, the total population
fitness changed from 34 to 37, thus
improved by ~9%
At this point, we go through the same
process all over again, until a stopping
criterion is met
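The numbers in this worked example can be re-derived with a short script. Selection and mutation are random, so their outcomes are simply copied from the slides; only crossover and the fitness totals are recomputed. The function and variable names are assumptions for illustration.

```python
def fitness(s: str) -> int:
    """MAXONE fitness: the number of ones in the string."""
    return s.count("1")

def crossover(a: str, b: str, point: int):
    """Single point crossover at the given point."""
    return a[:point] + b[point:], b[:point] + a[point:]

initial = ["1111010101", "0111000101", "1110110101",
           "0100010011", "1110111101", "0100110000"]
# Outcome of selection as given on the slide: s1, s3, s5, s2, s4, s5.
selected = [initial[0], initial[2], initial[4],
            initial[1], initial[3], initial[4]]
c1, c2 = crossover(selected[0], selected[1], 2)   # first couple, point 2
c5, c6 = crossover(selected[4], selected[5], 5)   # last couple, point 5
after_crossover = [c1, c2, selected[2], selected[3], c5, c6]
# Outcome of mutation as given on the slide.
after_mutation = ["1110100101", "1111110100", "1110101111",
                  "0111000101", "0100011101", "1110110001"]

before = sum(map(fitness, initial))
after = sum(map(fitness, after_mutation))
print(before, after, round(100 * (after - before) / before, 1))  # 34 37 8.8
```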
66. GENETIC ALGORITHM
Short Assignment:
Design a genetic algorithm to learn conjunctive
classification rules for the Play-Tennis problem.
Describe precisely the bit-string encoding of hypotheses
and a set of crossover operators.
Due Date:
24-05-2012
Hint: see Chapter 09 of the Machine Learning book by Tom Mitchell
68. GENETIC ALGORITHM
Major Assignment (10 marks) :
Task: you have to analyse some research paper and give
your critical analysis in the form of a report. The report
will be critically reviewed and accordingly marked.
Deliverables: A report + Presentation
Relevant Information: The assignment will be prepared
within groups. However, the marks will be assigned based
on individual performance.
Research Topics: the topics will be uploaded on the group
69. GENETIC ALGORITHM
The New Generation
The new offspring can replace the old population without any
fitness comparison
or
only the better ones from the new and old make it to the new
generation (more processing needed)
70. GENETIC ALGORITHM
New Generation: Elitism
Elitism is a value between 0 and 1, which represents the
fraction of the individuals of a population that will be
duplicated to the next generation
Example: if P = 20 and elitism = 0.1, then 2 individuals of
the current population do not get replaced
The elite chromosome selection may be based on highest
fitness values
If highest fitness valued chromosomes are carried over to the
next generation, we ensure that the maximum fitness does not
decrease from one generation to next
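Elitist replacement can be sketched as follows. The function name next_generation is hypothetical, and filling the remaining places with the first offspring in the list is an assumption; the slides only state that the elite fraction is carried over.

```python
def next_generation(population, offspring, fitness, elitism=0.1):
    """Carry the fittest `elitism` fraction over unchanged and fill the
    rest of the new generation with offspring."""
    n_elite = int(round(elitism * len(population)))
    elite = sorted(population, key=fitness, reverse=True)[:n_elite]
    # Assumption: the remaining places go to the first offspring produced.
    return elite + offspring[:len(population) - n_elite]

# P = 20 and elitism = 0.1: the 2 fittest individuals survive.
old = list(range(20))   # toy individuals whose fitness is their own value
new = next_generation(old, [0] * 20, fitness=lambda x: x, elitism=0.1)
print(new[:2], len(new))   # [19, 18] 20
```

Because the best individual of the old population is always present in the new one, the maximum fitness cannot decrease, even when every offspring is worse than its parents.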
71. GENETIC ALGORITHM
New Generation: Generation Gap
The number of individuals replaced in the next generation is
called generation gap
A generation gap of 100% means that the whole
population consists of new chromosomes
We may have a generation gap which is not fixed. In this
method, the fittest P chromosomes will be selected from the
set of current population plus new children, and will form the
new generation
72. GENETIC ALGORITHM
New Generation: Number of Duplicates Allowed
Duplicates are chromosomes that are identical
If they are allowed, then a duplicated chromosome has a higher
probability of producing offspring, and may
create many offspring
Eliminating duplicates increases the efficiency of the genetic
search and reduces the danger of premature convergence
Eliminating duplicates means that if an offspring is created
which is a duplicate of a chromosome of the current
population, we terminate it immediately and create a new
one. It increases the processing time in large populations
73. GENETIC ALGORITHM
Termination Requirement
The GA continues until some termination requirement is met,
such as
- having a solution whose fitness exceeds some threshold
- the fitness of solutions becomes stable & stops improving