Introduction to
Genetic Algorithms




Karthik S
Undergraduate Student (Final Year)
Department of Computer Science and Engineering
National Institute of Technology, Tiruchirappalli
What is GA?
DARWINIAN SELECTION:
From a group of individuals the best will survive

Understanding a GA means understanding the simple, iterative processes that
underpin evolutionary change

A GA is a search algorithm that makes it practical to search a large search space

EXAMPLE: finding the largest divisor of a big number

By applying Darwinian selection to the problem, only the best
solutions remain, thus narrowing the search space.



EVOLUTIONARY COMPUTING – BIOLOGY PERSPECTIVE
Origin of species from a common descent and descent of species, as well as
their change, multiplication and diversity over time.



                                              Data Mining                     2
Where can GAs be used?
OPTIMIZATION:
Where a problem has a large number of possible solutions and we have
to find the best one.
 best moves in chess
 mathematical problems
 financial problems


DISADVANTAGES
 GAs can be very slow.
 They cannot always find the exact solution, but they usually find a
  good one.




Biological Background
   Chromosome: A set of genes. A chromosome holds a candidate solution in
    the form of genes.
   Gene: A part of a chromosome. A gene holds part of the solution.
    E.g. 16743 is a chromosome and 1, 6, 7, 4 and 3 are its genes.
   Individual: Same as chromosome.
   Population: The set of individuals present, all with the same
    chromosome length.
   Fitness: The value assigned to an individual, based on how far from or
    close to the solution it is. The greater the fitness value, the better
    the solution it contains.
   Fitness function: A function that assigns a fitness value to an
    individual. It is problem specific.
   Selection: Choosing individuals to create the next generation.
   Recombination (or crossover): Genes from the parents are combined in
    some way to form a whole new chromosome.
   Mutation: Changing a random gene in an individual.



General Algorithm of GA

START

Generate initial population.
Compute the fitness of all individuals.

DO UNTIL best solution is found

      Select individuals from current generation
      Create new offspring with mutation and/or breeding
      Compute new fitness for all individuals
      Kill all unfit individuals to give space to new offspring
      Check if best solution is found

LOOP

END
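
The loop above can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: the function name `run_ga`, the default parameter values, the binary tournament selection, the elitism step and the MaxOne-style fitness (`sum`) used for the demo are all assumptions chosen to keep the example short.

```python
import random

def run_ga(fitness, length, pop_size=30, generations=200,
           crossover_rate=0.9, mutation_rate=0.01, target=None, seed=0):
    """Minimal generational GA over lists of bits (illustrative sketch)."""
    rng = random.Random(seed)
    # Generate initial population.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Stop as soon as the best solution is found (only if a target is known).
        if target is not None and fitness(best) >= target:
            break
        children = [best[:]]  # elitism (assumed): keep the best individual unchanged
        while len(children) < pop_size:
            # Select parents with a binary tournament (assumed selection scheme).
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Breed: single-point crossover with probability crossover_rate.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutate: flip each bit with a small probability.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        # The new generation replaces the old one.
        pop = children
        best = max(pop, key=fitness)
    return best

best = run_ga(fitness=sum, length=20, target=20)
```

Because the best individual is always copied into the next generation, the best fitness in this sketch never decreases from one generation to the next.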



Selection
   Darwinian Survival of The Fittest
   Give more preference to fitter individuals
   Ways to do:
    ◦ Roulette Wheel
    ◦ Tournament
    ◦ Truncation
   By itself, selection only picks the best
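
The three selection schemes named above could look roughly like this in Python. The function names, the tournament size `k` and the truncation `fraction` are assumptions made for illustration; roulette-wheel selection as written here also assumes non-negative fitness values with a positive total.

```python
import random

def roulette_select(pop, fits, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(pop, weights=fits, k=1)[0]

def tournament_select(pop, fits, rng, k=2):
    """Pick k individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]

def truncation_select(pop, fits, fraction=0.5):
    """Keep only the top fraction of the population, ranked by fitness."""
    ranked = sorted(zip(fits, pop), key=lambda fp: fp[0], reverse=True)
    keep = max(1, int(len(pop) * fraction))
    return [ind for _, ind in ranked[:keep]]

rng = random.Random(1)
pop = ["A", "B", "C", "D"]
fits = [1.0, 2.0, 3.0, 4.0]
survivors = truncation_select(pop, fits, fraction=0.5)
print(survivors)  # ['D', 'C']
```

Roulette and tournament selection still give weaker individuals some chance of being chosen, while truncation discards them outright.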




Recombination (crossover)
   Combine bits and pieces of good parents
   Speculate on new, possibly better children
   By itself, a random shuffle

Given two chromosomes

10001001110010010
01010001001000011

Choose a random bit along the length, say at position 9, and swap all the
bits after that point

so the above become:

10001001101000011
01010001010010010
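
A short Python sketch of this single-point crossover, reproducing the two chromosomes above. The function name `single_point_crossover` is an assumption; `point` counts how many leading bits each child keeps from its first parent.

```python
def single_point_crossover(a, b, point):
    """Swap the tails of two equal-length strings after `point` bits."""
    assert len(a) == len(b) and 0 < point < len(a)
    return a[:point] + b[point:], b[:point] + a[point:]

c1, c2 = single_point_crossover("10001001110010010", "01010001001000011", 9)
print(c1)  # 10001001101000011
print(c2)  # 01010001010010010
```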


Mutation
   Mutation is random alteration of a string
   Change a gene, small movement in the neighbourhood
   By itself, a random walk



Before:         10001001110010010

After:          10000001110110010
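
As a sketch, bit-flip mutation can be written as below. The helper names `flip_bits` and `mutate` are assumptions. In the example above two bits differ between the "before" and "after" strings (0-based positions 4 and 11), so those positions are flipped explicitly here; in a real run the positions would be drawn at random, as in `mutate`.

```python
import random

def flip_bits(chromosome, positions):
    """Flip the bits at the given 0-based positions."""
    bits = list(chromosome)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    hits = [i for i in range(len(chromosome)) if rng.random() < rate]
    return flip_bits(chromosome, hits)

print(flip_bits("10001001110010010", [4, 11]))  # 10000001110110010
```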




Improvement / Innovation

IMPROVEMENT:

         Selection + Mutation

Local changes - hill climbing

INNOVATION:

         Selection + Recombination

Combine notions - invent



Encoding
“Coding of the population for evolution process”

BINARY ENCODING:

 Chromosome A                011010110110110101
 Chromosome B                101001010100101001

PERMUTATION ENCODING:

 Chromosome A                12345678
 Chromosome B                83456127




Example
The travelling salesman problem

Find a tour of a given set of cities so that:
 each city is visited exactly once
 the total distance travelled is minimized
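
The objective can be expressed as a fitness function. The sketch below assumes a closed tour (the salesman returns to the starting city), straight-line distances, and made-up coordinates; `tour_length` and `fitness` are hypothetical names.

```python
import math

def tour_length(tour, coords):
    """Total length of a closed tour: visit each city once, then return to the start."""
    total = 0.0
    for i, city in enumerate(tour):
        nxt = tour[(i + 1) % len(tour)]
        total += math.dist(coords[city], coords[nxt])
    return total

def fitness(tour, coords):
    """Shorter tours are fitter, so use the reciprocal of the length."""
    return 1.0 / tour_length(tour, coords)

# Hypothetical coordinates: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
print(tour_length([1, 2, 3, 4], coords))  # 4.0
```

A tour that crosses the square diagonally, such as `[1, 3, 2, 4]`, is longer and therefore gets a lower fitness.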




TSP – Coding for 8 cities
Encoding using permutation encoding
1. Chennai      2. Trichy       3. Thanjavur       4. Madurai
5. Bangalore    6. Hyderabad    7. Coimbatore      8. Cochin

City Route 1:   (12347856)
City Route 2:   (65872134)

CROSSOVER:
(12347856) + (31246587)  ->  (12346587)

MUTATION:
(12346587)  ->  (12846537)
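
The crossover and mutation shown on this slide can be reproduced with the sketch below. The function names are assumptions. The crossover keeps the first four cities of one parent and the last four of the other, which happens to give a valid tour for this particular pair of parents; the mutation swaps two cities.

```python
def crossover(p1, p2, point):
    """Head of p1 followed by tail of p2 (valid only if the result is a permutation)."""
    return p1[:point] + p2[point:]

def swap_mutation(tour, i, j):
    """Exchange the cities at 0-based positions i and j."""
    t = list(tour)
    t[i], t[j] = t[j], t[i]
    return "".join(t)

def is_valid_tour(tour, n=8):
    """A tour is valid when it visits each of the n cities exactly once."""
    return sorted(tour) == [str(c) for c in range(1, n + 1)]

child = crossover("12347856", "31246587", 4)
print(child)                       # 12346587
print(swap_mutation(child, 2, 6))  # 12846537
```

Swapping two cities always keeps a permutation valid, whereas this simple crossover does not in general, so a validity check like `is_valid_tour` is useful.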

TSP – GA Process
   First, create a group of many random tours in what is called a
    population. This algorithm uses a greedy initial population that
    gives preference to linking cities that are close to each other.
   Second, pick 2 of the better (shorter) tours in the population as
    parents and combine them to make 2 new child tours.
    Hopefully, these child tours will be better than either parent.
   A small percentage of the time, the child tours are mutated. This is
    done to prevent all tours in the population from looking identical.
   The new child tours are inserted into the population, replacing two
    of the longer tours. The size of the population remains the same.
   New child tours are repeatedly created until the desired goal is
    reached.


                        Survival of the Fittest


TSP – GA Process – Issues (1)
The two complex issues with using a Genetic Algorithm to solve the
Traveling Salesman Problem are the encoding of the tour and the crossover
algorithm that is used to combine the two parent tours to make the child
tours.

In this example, the crossover point is between the 3rd and 4th item in the
list. To create the children, every item in the parent's sequence after the
crossover point is swapped.

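Parent 1                               F A B | E C G D
Parent 2                               D E A | C G B F
Child 1                                F A B | C G B F
Child 2                                D E A | E C G D

What is the issue here?

We get invalid sequences as children: Child 1 visits B and F twice and
never visits D or E, while Child 2 visits D and E twice and never visits
B or F.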
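
The problem is easy to confirm in code. This sketch applies the naive tail swap to the two parents above and reports which cities each child repeats and which it misses; `naive_crossover` and `diagnose` are hypothetical helper names.

```python
from collections import Counter

def naive_crossover(p1, p2, point):
    """Swap everything after the crossover point, ignoring tour validity."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def diagnose(child, cities):
    """Return (cities visited more than once, cities never visited)."""
    counts = Counter(child)
    repeated = sorted(c for c in cities if counts[c] > 1)
    missing = sorted(c for c in cities if counts[c] == 0)
    return repeated, missing

p1, p2 = list("FABECGD"), list("DEACGBF")
c1, c2 = naive_crossover(p1, p2, 3)
print("".join(c1), diagnose(c1, p1))  # FABCGBF (['B', 'F'], ['D', 'E'])
print("".join(c2), diagnose(c2, p1))  # DEAECGD (['D', 'E'], ['B', 'F'])
```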

TSP – GA Process – Issues (2)
The encoding cannot simply be the list of cities in the order they are
travelled. Other encoding methods have been created that solve the
crossover problem. Although these methods will not create invalid tours,
they do not take into account the fact that the tour "A B C D E F G" is the
same as "G F E D C B A". To solve the problem properly the crossover
algorithm will have to get much more complicated.
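
One well-known way to keep children valid is a permutation-preserving operator such as order crossover. The sketch below is a simplified variant, shown only as an example of the idea; it is not necessarily the method the slide has in mind, and it does not address the equivalence of reversed tours. The name `order_crossover` and the cut positions are assumptions.

```python
def order_crossover(p1, p2, start, end):
    """Copy p1[start:end] into the child, then fill the remaining
    positions left to right with the missing cities in p2's order."""
    segment = p1[start:end]
    rest = [c for c in p2 if c not in segment]
    return rest[:start] + segment + rest[start:]

p1, p2 = list("FABECGD"), list("DEACGBF")
child = order_crossover(p1, p2, 3, 6)
print("".join(child))  # DABECGF
```

Every city appears exactly once in the child because the cities taken from the second parent are exactly those absent from the copied segment.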




Other Examples
THE MAXONE PROBLEM

• Suppose we want to maximize the number of ones in a string of l binary
  digits

• We can think of it as maximizing the number of correct answers, each
  encoded by 1, to l difficult yes/no questions

THE TARGET NUMBER PROBLEM

• Given the digits 0 through 9 and the operators +, -, * and /, find a
  sequence that will represent a given target number. The operators will
  be applied sequentially from left to right as you read.
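
For the target number problem, a chromosome can be decoded and scored as sketched below. The sketch assumes a well-formed string that alternates single digits and operators, skips any division by zero, and uses the reciprocal of the error as fitness; these choices and the names `evaluate` and `fitness` are illustrative assumptions. For MaxOne the fitness is simply the number of ones in the string.

```python
def evaluate(sequence):
    """Apply operators strictly left to right, ignoring usual precedence."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    total = float(sequence[0])
    for op, digit in zip(sequence[1::2], sequence[2::2]):
        if op == "/" and digit == "0":
            continue  # assumption: skip division by zero instead of failing
        total = ops[op](total, float(digit))
    return total

def fitness(sequence, target):
    """Higher is better; an exact hit gets infinite fitness."""
    error = abs(target - evaluate(sequence))
    return float("inf") if error == 0 else 1.0 / error

print(evaluate("6+5*4/2"))  # 22.0
```

With normal operator precedence the same expression would give 16, so the left-to-right rule matters when scoring candidates.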




GA in Data Mining
• Used in Classification

EXAMPLE:

• Two Boolean attributes, A1 and A2, and two classes, C1 and C2

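• IF A1 AND NOT A2 THEN C2
  is encoded as 100
• IF NOT A1 AND NOT A2 THEN C1
  is encoded as 001
  (the two leftmost bits represent A1 and A2; the rightmost bit represents the class)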

• If an attribute has k values, where k > 2, then k bits may be used
  to encode the attribute’s values.
• Classes can be encoded in a similar fashion.
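
The rule encoding in the example can be sketched in a few lines of Python. The class-bit mapping (C1 as 1, C2 as 0) is inferred from the two example strings, and the function names are assumptions.

```python
def encode_rule(a1, a2, cls):
    """Encode IF [NOT] A1 AND [NOT] A2 THEN class as a 3-bit string."""
    class_bit = {"C1": "1", "C2": "0"}[cls]  # mapping inferred from the examples
    return f"{int(a1)}{int(a2)}{class_bit}"

def decode_rule(bits):
    """Turn a 3-bit string back into a readable rule."""
    a1 = "A1" if bits[0] == "1" else "NOT A1"
    a2 = "A2" if bits[1] == "1" else "NOT A2"
    cls = "C1" if bits[2] == "1" else "C2"
    return f"IF {a1} AND {a2} THEN {cls}"

print(encode_rule(True, False, "C2"))   # 100
print(encode_rule(False, False, "C1"))  # 001
print(decode_rule("100"))               # IF A1 AND NOT A2 THEN C2
```

Because every 3-bit string decodes to some rule, crossover and mutation on these strings always produce valid rules.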



Classification Problem
• Associating a given input pattern with one of the distinct classes
• Patterns are specified by a number of features (representing
  some measurements made on the objects that are being
  classified) so it is natural to think of them as d-dimensional
  vectors, where d is the number of different features
• This representation gives rise to a concept of feature space
• Classification - determining which of the regions a given pattern
  falls into
• A decision rule determines a decision boundary which partitions
  the feature space into regions associated with each class
• The goal is to design a decision rule which is easy to compute and
  yields the smallest possible probability of misclassification of
  input patterns from the feature space.




Classification Problem - samples



[Figure: classification of sample patterns, and an overly classified decision boundary]




Discriminant Function
• Training set - finite sample of patterns with known class affiliations
• Use training sets to create decision boundaries
• Avoid over-fitting the training set with overly complex decision
  boundaries
• Simplifying the shape of the decision boundary sacrifices some
  performance on the training samples but can improve the
  performance on new patterns
• Different classifiers can be implemented by constructing an
  appropriate discriminant function gi(x), where i is the class index. A
  pattern x is associated with the class j such that gj(x)>gi(x) for every
  i not equal to j
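
The decision rule in the last bullet amounts to picking the class with the largest discriminant value. A minimal sketch, with made-up discriminant functions based on distance to a class centre (the centres and the name `classify` are assumptions):

```python
def classify(x, discriminants):
    """Return the class index j whose discriminant g_j(x) is largest."""
    scores = [g(x) for g in discriminants]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical discriminants: negative squared distance to a class centre.
centres = [(0.0, 0.0), (4.0, 4.0)]
discriminants = [
    (lambda x, c=c: -sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
    for c in centres
]
print(classify((1.0, 0.5), discriminants))  # 0
print(classify((3.0, 3.5), discriminants))  # 1
```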




A Linear Discriminant Function
 • A linear discriminant function is limited to two distinct classes
 • $f(x) = \sum_{i=1}^{d} \omega_i x_i + \omega_{d+1}$
     where $x_i$ are the components of the feature vector and the
     weights $\omega_i$ need to be adjusted to optimize the performance of
     the classifier
HOW TO USE GA FOR CLASSIFICATION AND FINDING THE OPTIMAL
WEIGHTS $\omega_i$

• With genetic algorithms, the classification problem reduces to finding
  the parameters of the optimum discriminant function defining the
  boundary between classes
• Each chromosome has a number of genes equal to the number of
  parameters used in the discriminant function
• The fitness function is the fraction of patterns properly classified by
  applying the discriminant function parameterized by the
  chromosome to a given testing set
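
The fitness function described above could be sketched like this for a linear discriminant. The chromosome is taken to be the list of weights with the bias as its last gene; the decision threshold at zero, the 0/1 labels and the sample data are assumptions made for illustration.

```python
def linear_discriminant(weights, x):
    """f(x) = sum_i w_i * x_i + w_last, where w_last is the bias term."""
    *w, bias = weights
    return sum(wi * xi for wi, xi in zip(w, x)) + bias

def fitness(chromosome, samples):
    """Fraction of (x, label) samples classified correctly.
    Assumption: label 1 when f(x) > 0, otherwise label 0."""
    correct = sum(
        (linear_discriminant(chromosome, x) > 0) == bool(label)
        for x, label in samples
    )
    return correct / len(samples)

# Hypothetical two-feature data set, separable by the line x1 + x2 = 3.
samples = [((0, 0), 0), ((1, 1), 0), ((2, 2), 1), ((3, 1), 1)]
print(fitness([1.0, 1.0, -3.0], samples))  # 1.0
print(fitness([1.0, 1.0, 0.0], samples))   # 0.75
```

A GA would evolve a population of such weight lists, using this fraction to decide which chromosomes survive.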
Advantages of GA
•   Concepts are easy to understand
•   Inherently parallel; easily distributed
•   Always an answer; the answer gets better with time
•   Less time required for some special applications
•   Better chances of finding the optimal solution




Limitations of GA
• The population size should be moderate and suited to the problem
  (normally 20-30 or 50-100)
• The crossover rate should be 80%-95%
• The mutation rate should be low; 0.5%-1% is assumed to be best
• The selection method should be appropriate
• The fitness function must be written accurately




Conclusion
• Genetic algorithms are rich in application across a large and
  growing number of disciplines.
• Genetic Algorithms are used in Optimization and in Classification
  in Data Mining
• Genetic algorithms have changed the way we do computer
  programming.




Finology Group – Insurtech Innovation Award 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Genetic Algorithm Introduction

  • 1. Introduction to Genetic Algorithms. Karthik S, Undergraduate Student (Final Year), Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli
  • 2. What is GA?
    DARWINIAN SELECTION: From a group of individuals, the best will survive.
    ◦ Understanding a GA means understanding the simple, iterative processes that underpin evolutionary change.
    ◦ A GA is a search algorithm that makes it practical to explore a large search space.
    ◦ EXAMPLE: finding the largest divisor of a big number.
    ◦ Applying Darwinian selection to the problem keeps only the best candidate solutions, thus narrowing the search space.
    EVOLUTIONARY COMPUTING – BIOLOGY PERSPECTIVE: The origin of species from a common descent, and the change, multiplication and diversity of species over time.
  • 3. Where can GAs be used?
    OPTIMIZATION: Problems with a very large number of candidate solutions, of which we have to find the best one.
    ◦ best moves in chess
    ◦ mathematical problems
    ◦ financial problems
    DISADVANTAGES
    ◦ GAs can be very slow.
    ◦ They cannot always find the exact (optimal) solution, but they always return the best solution found so far.
  • 4. Biological Background
    ◦ Chromosome: A set of genes. A chromosome contains the solution in the form of genes.
    ◦ Gene: A part of a chromosome. A gene contains a part of the solution. E.g. 16743 is a chromosome and 1, 6, 7, 4 and 3 are its genes.
    ◦ Individual: Same as chromosome.
    ◦ Population: The set of individuals present, all with the same chromosome length.
    ◦ Fitness: The value assigned to an individual, based on how far from or close to the solution the individual is. The greater the fitness value, the better the solution it contains.
    ◦ Fitness function: A function which assigns a fitness value to an individual. It is problem specific.
    ◦ Selection: Selecting individuals for creating the next generation.
    ◦ Recombination (or crossover): Genes from the parents are combined in some way to form a whole new chromosome.
    ◦ Mutation: Changing a random gene in an individual.
  • 5. General Algorithm of GA
    START
    Generate initial population.
    Compute fitness of all individuals.
    DO UNTIL best solution is found
        Select individuals from current generation
        Create new offspring with mutation and/or breeding
        Compute new fitness for all individuals
        Kill all unfit individuals to give space to new offspring
        Check if best solution is found
    LOOP
    END
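The loop on this slide can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the name `run_ga`, the bit-list chromosomes, truncation selection, and the default parameter values are all choices made for the example, and counting ones (`sum`) stands in for a problem-specific fitness function.

```python
import random

def run_ga(fitness, length=20, pop_size=30, generations=200,
           crossover_rate=0.9, mutation_rate=0.01, target=None, seed=0):
    """Evolve bit-list chromosomes and return the best individual found."""
    rng = random.Random(seed)
    # Generate the initial population at random.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Compute fitness and rank the current generation, fittest first.
        population.sort(key=fitness, reverse=True)
        # Stop as soon as the best individual reaches the target (if given).
        if target is not None and fitness(population[0]) >= target:
            break
        # Select: keep the fitter half as parents (truncation selection).
        parents = population[:pop_size // 2]
        # Breed: fill the space left by the unfit half with new offspring.
        offspring = []
        while len(parents) + len(offspring) < pop_size:
            mum, dad = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, length - 1)
                child = mum[:cut] + dad[cut:]
            else:
                child = mum[:]
            # Mutate: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        population = parents + offspring
    return max(population, key=fitness)

# Stand-in fitness: the number of ones in the chromosome.
best = run_ga(fitness=sum, target=20)
print(best, sum(best))
```

Because the parents survive into the next generation unchanged, the best fitness in this sketch never decreases from one generation to the next.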
  • 6. Selection
    ◦ Darwinian survival of the fittest
    ◦ Gives more preference to fitter individuals
    ◦ Ways to do it: roulette wheel, tournament, truncation
    ◦ By itself, selection just picks the best
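The three selection schemes named on this slide could look roughly like this in Python. The function names, signatures, and the tiny example population are assumptions made for illustration; the roulette wheel version also assumes non-negative fitness values with a positive total.

```python
import random

def roulette_wheel(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]

def tournament(population, fitnesses, rng, k=3):
    """Draw k distinct contestants at random and return the fittest one."""
    contestants = rng.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]

def truncation(population, fitnesses, fraction=0.5):
    """Keep only the top fraction of the population, fittest first."""
    ranked = sorted(zip(fitnesses, population),
                    key=lambda pair: pair[0], reverse=True)
    keep = max(1, int(len(population) * fraction))
    return [individual for _, individual in ranked[:keep]]

rng = random.Random(42)
population = ["A", "B", "C", "D"]      # hypothetical individuals
fitnesses = [1.0, 4.0, 2.0, 3.0]       # hypothetical fitness values
print(roulette_wheel(population, fitnesses, rng))
print(tournament(population, fitnesses, rng, k=2))
print(truncation(population, fitnesses))   # ['B', 'D']
```

Roulette wheel and tournament selection still give weak individuals some chance of being picked, while truncation discards them outright.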
  • 7. Recombination (crossover)
    ◦ Combine bits and pieces of good parents
    ◦ Speculate on new, possibly better children
    ◦ By itself, a random shuffle
    Given two chromosomes
        10001001110010010
        01010001001000011
    choose a random bit along the length, say at position 9, and swap all the bits after that point, so the above become:
        10001001101000011
        01010001010010010
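The single-point crossover described on this slide can be reproduced with a few lines of Python. The function name and its optional `point` and `rng` parameters are assumptions for the sketch; chromosomes are treated as plain strings of equal length.

```python
import random

def single_point_crossover(parent_a, parent_b, point=None, rng=random):
    """Swap everything after `point` between two equal-length chromosomes."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if point is None:
        # Choose a random cut that leaves at least one gene on each side.
        point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# The two chromosomes from the slide, cut after position 9.
children = single_point_crossover("10001001110010010",
                                  "01010001001000011", point=9)
print(children)  # ('10001001101000011', '01010001010010010')
```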
  • 8. Mutation
    ◦ Mutation is a random alteration of a string
    ◦ Change a gene: a small movement in the neighbourhood
    ◦ By itself, a random walk
    Before: 10001001110010010
    After:  10000001110110010
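As a rough sketch, bit-flip mutation on a binary string can be written as below. The helper names and the default rate are assumptions; the fixed positions in the usage line are simply the two bits (0-based indices 4 and 11) that differ between the "Before" and "After" strings on the slide.

```python
import random

def flip_bits(chromosome, positions):
    """Return a copy of the bit string with the given positions inverted."""
    bits = list(chromosome)
    for i in positions:
        bits[i] = "1" if bits[i] == "0" else "0"
    return "".join(bits)

def mutate(chromosome, rate=0.01, rng=random):
    """Flip each bit independently with probability `rate`."""
    positions = [i for i in range(len(chromosome)) if rng.random() < rate]
    return flip_bits(chromosome, positions)

before = "10001001110010010"
print(flip_bits(before, [4, 11]))  # 10000001110110010
print(mutate(before, rate=0.05, rng=random.Random(1)))
```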
  • 10. Improvement / Innovation
    IMPROVEMENT: Selection + Mutation. Local changes (hill climbing).
    INNOVATION: Selection + Recombination. Combining notions (invention).
  • 11. Encoding
    "Coding of the population for the evolution process"
    BINARY ENCODING:
        Chromosome A 011010110110110101
        Chromosome B 101001010100101001
    PERMUTATION ENCODING:
        Chromosome A 12345678
        Chromosome B 83456127
  • 12. Example: the travelling salesman problem
    Find a tour of a given set of cities so that:
    ◦ each city is visited only once
    ◦ the total distance travelled is minimized
  • 13. TSP – Coding for 8 cities
    Encoding using permutation encoding:
    1. Chennai 2. Trichy 3. Thanjavur 4. Madurai 5. Bangalore 6. Hyderabad 7. Coimbatore 8. Cochin
    City Route 1: (12347856)
    City Route 2: (65872134)
    CROSSOVER: (12347856) (12346587) (31246587)
    MUTATION: (12346587) (12846537)
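The mutation shown on this slide swaps two cities in the route (3 and 8 change places), which keeps the chromosome a valid permutation. A minimal sketch, assuming routes are stored as strings of city digits and using hypothetical helper names:

```python
import random

def swap_mutation(tour, i=None, j=None, rng=random):
    """Exchange the cities at positions i and j (chosen at random if omitted)."""
    if i is None or j is None:
        i, j = rng.sample(range(len(tour)), 2)
    cities = list(tour)
    cities[i], cities[j] = cities[j], cities[i]
    return "".join(cities)

def is_valid_tour(tour, cities="12345678"):
    """A tour is valid when it visits every city exactly once."""
    return sorted(tour) == sorted(cities)

mutated = swap_mutation("12346587", 2, 6)
print(mutated, is_valid_tour(mutated))  # 12846537 True
```

Unlike bit-flip mutation, a swap can never introduce a duplicate or drop a city, which is why it suits permutation encoding.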
  • 14. TSP – GA Process
    ◦ First, create a group of many random tours in what is called a population. This algorithm uses a greedy initial population that gives preference to linking cities that are close to each other.
    ◦ Second, pick 2 of the better (shorter) tours in the population as parents and combine them to make 2 new child tours. Hopefully, these child tours will be better than either parent.
    ◦ A small percentage of the time, the child tours are mutated. This is done to prevent all tours in the population from looking identical.
    ◦ The new child tours are inserted into the population, replacing two of the longer tours. The size of the population remains the same.
    ◦ New child tours are repeatedly created until the desired goal is reached.
    Survival of the Fittest
  • 15. TSP – GA Process – Issues (1)
    The two complex issues with using a Genetic Algorithm to solve the Traveling Salesman Problem are the encoding of the tour and the crossover algorithm that is used to combine the two parent tours to make the child tours. In this example, the crossover point is between the 3rd and 4th items in the list. To create the children, every item in the parents' sequences after the crossover point is swapped.
        Parent 1  F A B | E C G D
        Parent 2  D E A | C G B F
        Child 1   F A B | C G B F
        Child 2   D E A | E C G D
    What is the issue here? We get invalid sequences as children: each child visits some cities twice and misses others.
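The problem on this slide is easy to check mechanically. The sketch below applies the same naive tail swap to the two parent tours and reports which cities each child repeats and which it misses; the function names are assumptions for the example.

```python
from collections import Counter

def naive_crossover(parent_1, parent_2, cut):
    """Swap the tails after `cut`, ignoring the permutation constraint."""
    return (parent_1[:cut] + parent_2[cut:],
            parent_2[:cut] + parent_1[cut:])

def tour_defects(child, cities):
    """Return (cities visited more than once, cities never visited)."""
    counts = Counter(child)
    repeated = sorted(c for c, n in counts.items() if n > 1)
    missing = sorted(set(cities) - set(child))
    return repeated, missing

parent_1 = list("FABECGD")
parent_2 = list("DEACGBF")
child_1, child_2 = naive_crossover(parent_1, parent_2, cut=3)
print("".join(child_1), tour_defects(child_1, parent_1))  # FABCGBF (['B', 'F'], ['D', 'E'])
print("".join(child_2), tour_defects(child_2, parent_1))  # DEAECGD (['D', 'E'], ['B', 'F'])
```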
  • 16. TSP – GA Process – Issues (2)
    The encoding cannot simply be the list of cities in the order they are travelled. Other encoding methods have been created that solve the crossover problem. Although these methods will not create invalid tours, they do not take into account the fact that the tour "A B C D E F G" is the same as "G F E D C B A" (the same tour travelled in reverse). To solve the problem properly, the crossover algorithm has to get much more complicated.
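One common way around the invalid-children problem, which the slides do not spell out, is to make the crossover itself respect the permutation. The sketch below uses a one-point order-preserving crossover (a simplified relative of order crossover, swapped in here for illustration): the child keeps the head of one parent and then visits the remaining cities in the order they appear in the other parent. The `canonical` helper is likewise an assumption; it treats a tour and its reverse as the same tour.

```python
def order_preserving_crossover(parent_1, parent_2, cut):
    """Head of parent_1, then the unused cities in parent_2's order."""
    head = parent_1[:cut]
    tail = [city for city in parent_2 if city not in head]
    return head + tail

def canonical(tour):
    """Pick one representative for a tour and its reversed twin."""
    forward, backward = tuple(tour), tuple(reversed(tour))
    return min(forward, backward)

parent_1 = list("FABECGD")
parent_2 = list("DEACGBF")
child_1 = order_preserving_crossover(parent_1, parent_2, cut=3)
child_2 = order_preserving_crossover(parent_2, parent_1, cut=3)
print("".join(child_1), "".join(child_2))  # FABDECG DEAFBCG
print(canonical("ABCDEFG") == canonical("GFEDCBA"))  # True
```

Every child produced this way visits each city exactly once, so no repair step is needed after crossover.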
  • 17. Other Examples
    THE MAXONE PROBLEM
    ◦ Suppose we want to maximize the number of ones in a string of l binary digits.
    ◦ We can think of it as maximizing the number of correct answers, each encoded by 1, to l difficult yes/no questions.
    THE TARGET NUMBER PROBLEM
    ◦ Given the digits 0 through 9 and the operators +, -, * and /, find a sequence that will represent a given target number. The operators are applied sequentially from left to right as you read.
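Fitness functions for these two problems might look like the sketch below. The function names, the choice to score a candidate by the negative distance to the target, and the handling of division by zero are all assumptions for illustration; candidate expressions are assumed to be well-formed strings that alternate single digits and operators.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def maxone_fitness(bits):
    """MaxOne: the fitness is simply the number of ones in the string."""
    return bits.count("1")

def evaluate_left_to_right(expr):
    """Evaluate e.g. '6+5*4/2+1' strictly left to right, ignoring precedence."""
    value = float(expr[0])
    for i in range(1, len(expr) - 1, 2):
        op, digit = expr[i], float(expr[i + 1])
        if op == "/" and digit == 0:
            return None  # assumption: division by zero marks an invalid candidate
        value = OPS[op](value, digit)
    return value

def target_fitness(expr, target):
    """Closer to the target is better; 0 means the target is hit exactly."""
    value = evaluate_left_to_right(expr)
    if value is None:
        return float("-inf")
    return -abs(target - value)

print(maxone_fitness("10110"))              # 3
print(evaluate_left_to_right("6+5*4/2+1"))  # 23.0
print(target_fitness("6+5*4/2+1", 23))
```

Under left-to-right evaluation, `6+5*4/2+1` is computed as (((6+5)*4)/2)+1 = 23, not the 17 that normal operator precedence would give.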
  • 18. GA in Data Mining
    ◦ Used in classification
    EXAMPLE:
    ◦ Two Boolean attributes, A1 and A2, and two classes, C1 and C2
    ◦ The rule "IF A1 AND NOT A2 THEN C2" is encoded as the bit string 100
    ◦ The rule "IF NOT A1 AND NOT A2 THEN C1" is encoded as the bit string 001
    ◦ Here the two leftmost bits represent A1 and A2, and the rightmost bit represents the class.
    ◦ If an attribute has k values, where k > 2, then k bits may be used to encode the attribute's values.
    ◦ Classes can be encoded in a similar fashion.
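The rule encoding in this example can be sketched as a pair of helper functions. The names `encode_rule` and `decode_rule` are hypothetical, and the class-bit mapping (C1 as 1, C2 as 0) is inferred from the two bit strings given on the slide.

```python
CLASS_BITS = {"C1": "1", "C2": "0"}   # inferred from the slide's examples
BIT_CLASSES = {bit: name for name, bit in CLASS_BITS.items()}

def encode_rule(a1, a2, cls):
    """Encode 'IF [NOT] A1 AND [NOT] A2 THEN cls' as a 3-bit string."""
    return f"{int(a1)}{int(a2)}{CLASS_BITS[cls]}"

def decode_rule(bits):
    """Turn a 3-bit chromosome back into a readable IF-THEN rule."""
    a1 = "A1" if bits[0] == "1" else "NOT A1"
    a2 = "A2" if bits[1] == "1" else "NOT A2"
    return f"IF {a1} AND {a2} THEN {BIT_CLASSES[bits[2]]}"

print(encode_rule(True, False, "C2"))   # 100
print(encode_rule(False, False, "C1"))  # 001
print(decode_rule("100"))               # IF A1 AND NOT A2 THEN C2
```

With rules stored as bit strings, the crossover and mutation operators for binary chromosomes apply to them without modification.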
  • 19. Classification Problem
    ◦ Associating a given input pattern with one of the distinct classes.
    ◦ Patterns are specified by a number of features (representing some measurements made on the objects that are being classified), so it is natural to think of them as d-dimensional vectors, where d is the number of different features.
    ◦ This representation gives rise to the concept of a feature space.
    ◦ Classification: determining which of the regions a given pattern falls into.
    ◦ A decision rule determines a decision boundary which partitions the feature space into regions associated with each class.
    ◦ The goal is to design a decision rule which is easy to compute and yields the smallest possible probability of misclassification of input patterns from the feature space.
  • 20. Classification Problem: sample classification
    [Figure: sample classification with an overly complex (over-fitted) decision boundary]
  • 21. Discriminant Function
    ◦ Training set: a finite sample of patterns with known class affiliations.
    ◦ Use training sets to create decision boundaries.
    ◦ Avoid over-fitting the training set with overly complex decision boundaries.
    ◦ Simplifying the shape of the decision boundary sacrifices some performance on the training samples but improves the performance on new patterns.
    ◦ Different classifiers can be implemented by constructing an appropriate discriminant function $g_i(x)$, where i is the class index. A pattern x is associated with the class j such that $g_j(x) > g_i(x)$ for every $i \neq j$.
  • 22. A Linear Discriminant Function
    ◦ A linear discriminant function is limited to two distinct classes.
    ◦ $f(x) = \sum_{i=1}^{d} \omega_i x_i + \omega_{d+1}$, where $x_i$ are the components of the feature vector and the weights $\omega_i$ need to be adjusted to optimize the performance of the classifier.
    HOW TO USE GA FOR CLASSIFICATION AND FINDING THE OPTIMAL WEIGHTS $\omega_i$
    ◦ In genetic algorithms, the classification problem reduces to finding the parameters of the optimum discriminant function defining the boundary between classes.
    ◦ Each chromosome has a number of genes equal to the number of parameters used in the discriminant function.
    ◦ The fitness function is the fraction of patterns properly classified by applying the discriminant function parameterized by the chromosome to a given testing set.
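The fitness function described on this slide can be sketched as follows. The chromosome is taken to be a tuple of d weights followed by the bias weight; the decision rule (class 1 when f(x) > 0, class 0 otherwise), the function names, and the four sample points are assumptions made for the example.

```python
def discriminant(weights, x):
    """f(x) = sum_i w_i * x_i + w_{d+1}; the last weight is the bias term."""
    *w, bias = weights
    return sum(wi * xi for wi, xi in zip(w, x)) + bias

def fitness(weights, samples):
    """Fraction of (features, label) samples that are classified correctly."""
    correct = 0
    for x, label in samples:
        predicted = 1 if discriminant(weights, x) > 0 else 0  # assumed rule
        correct += predicted == label
    return correct / len(samples)

# Hypothetical two-feature testing set with labels 1 and 0.
samples = [((2.0, 1.0), 1), ((1.5, 2.5), 1),
           ((-1.0, -0.5), 0), ((-2.0, 0.5), 0)]
print(fitness((1.0, 0.0, 0.0), samples))   # 1.0: splits on the sign of x1
print(fitness((0.0, 1.0, 0.0), samples))   # 0.75
print(fitness((-1.0, 0.0, 0.0), samples))  # 0.0
```

A GA would then evolve the weight tuples, using this fraction as the value to maximize.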
  • 23. Advantages of GA
    ◦ Concepts are easy to understand
    ◦ Genetic algorithms are intrinsically parallel and easily distributed
    ◦ There is always an answer, and the answer gets better with time
    ◦ Less time is required for some special applications
    ◦ Good chance of finding an optimal or near-optimal solution
  • 24. Limitations of GA
    ◦ The population size should be moderate and suited to the problem (normally 20-30 or 50-100)
    ◦ The crossover rate should be 80%-95%
    ◦ The mutation rate should be low; 0.5%-1% is generally assumed to be best
    ◦ The method of selection should be appropriate
    ◦ The fitness function must be written accurately
  • 25. Conclusion
    ◦ Genetic algorithms are rich in application across a large and growing number of disciplines.
    ◦ Genetic algorithms are used in optimization and in classification in data mining.
    ◦ Genetic algorithms have changed the way we do computer programming.