SlideShare a Scribd company logo
1 of 23
Download to read offline
Do not Match, Inherit: Fitness Surrogates for
Genetics-Based Machine Learning Techniques


       Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2

1 National   Center for Supercomputing Applications. University of Illinois at Urbana-Champaign
     2 Illinois   Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign
                  3 Department   of Electrical Engineering, National Taiwan University




                          Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199
GECCO 2007 HUMIES                                                                             1
Motivation
• Competent GBML
      – Use competent GAs to approach GBML problems
      – Take advantage of competent GA scalability
      – Provide insight about problem structure
      – χeCCS by Llorà, Sastry, Goldberg & de la Ossa (2006)

• Rule matching may thread practical applications
      – Even for small dimensional problems (MUX 20), rule matching
        may take more than 85% of the execution time in XCS
      – As dimensionality or cardinality of training sets increase, rule
        matching rules the overall execution time
      – Efficient implementation (Llora & Sastry, 2006) still require
        matching rules

GECCO 2007                     Llorà, Sastry, Yu & Goldberg                2
Motivation
• Competent GAs
      – Byproduct: Models and problem structure insight
      – Revision of the fitness relaxation for expensive fitness
             evaluations
      – Idea: Build a cheap surrogate fitness accurate enough
      – Successfully applied to GA (Sastry, Lima & Goldberg, 2006)
      – Help cut down the number of fitness evaluations

• GBML
      – Can we transfer the same ideas to GBML approaches?
      – What are the requirements needed for competent GBML to
             benefit from fitness relaxation?

GECCO 2007                         Llorà, Sastry, Yu & Goldberg      3
Outline
• Overview χeCCS
• Evaluation relaxation
• Fitness inheritance using least squares fitting
• Fitness inheritance and χeCCS
• Results
• Conclusions




GECCO 2007            Llorà, Sastry, Yu & Goldberg   4
χ-ari Extended Compact Classifier System

• No reinforcement learning is used
• A competent GA is in charge of the learning
• The idea:
      – A population of single rules
      – For each rule we compute its fitness
      – The χ-ari extended compact genetic algorithm
      – Niching to maintain different accurate rules (restricted
        tournament replacement)




GECCO 2007                  Llorà, Sastry, Yu & Goldberg           5
Maximally Accurate and General Rules
• Accuracy and generality can be compute as

                    n t + (r) + n t# (r)                        n t + (r)
             quot;(r) =                                      quot;(r) =
                             nt                                   nm
 • Fitness should combine accuracy and generality
                           f (r) = quot;(r) # $(r)%
  !                              !
 • Such measure can be either applied to rules or rule sets

             !



GECCO 2007                    Llorà, Sastry, Yu & Goldberg                  6
Maximally Accurate and General Rules




GECCO 2007         Llorà, Sastry, Yu & Goldberg   7
Extended Compact Genetic Algorithm
  • A Probabilistic model building GA (Harik, 1999)
        – Builds models of good solutions as linkage groups
  • Key idea:
        – Good probability distribution → Linkage learning

  • Key components:
        – Representation: Marginal product model (MPM)
             • Marginal distribution of a gene partition

        – Quality: Minimum description length (MDL)
             • Occam’s razor principle
             • All things being equal, simpler models are better

        – Search Method: Greedy heuristic search

GECCO 2007                               Llorà, Sastry, Yu & Goldberg   8
Marginal Product Model (MPM)
  • Partition variables into disjoint sets
  • Product of marginal distributions on a partition of
       genes
  • Gene partition maps to linkage groups
               MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l]


                                               ...            xl-2 xl-1 xl
                x1 x2 x3    x4 x5 x6



               {p000, p001, p00#, p010, p011, p01#, p100, p101,
               p10#, p110, p111, p11# … }(27 probabilities)
GECCO 2007                     Llorà, Sastry, Yu & Goldberg                  9
Minimum Description Length Metric
  • Hypothesis: For an optimal model
        – Model size and error is minimum

  • Model complexity, Cm
        – # of bits required to store all marginal probabilities



  • Compressed population complexity, Cp
        – Entropy of the marginal distribution over all partitions




  • MDL metric, Cc = Cm + Cp
GECCO 2007                      Llorà, Sastry, Yu & Goldberg         10
Building an Optimal MPM
  •     Assume independent genes ([1],[2],…,[l])

  •     Compute MDL metric, Cc

  •     All combinations of two subset merges
             Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])}
        •

  •     Compute MDL metric for all model candidates

  •     Select the set with minimum MDL,

  •     If           , accept the model and go to step 2.

  •     Else, the current model is optimal




GECCO 2007                           Llorà, Sastry, Yu & Goldberg               11
χeCCS Models for Different Multiplexers
Building Block Size Increases
Fitness Inheritance using Least Squares
• Proposed by Sastry, Lima & Goldberg (2006)
• Surrogate is a regression using basis identified by BBs
• A simple example: [1,3] [2] [4]
• The schemas represented are
      – {0*0*, 0*1*, 0*#*, 1*0*, 1*1*, 1*#*, #*0*, #*1*, #*#*,
        *0**, *1**, *#**, ***0 , ***1 , ***#}
• Recode the individuals by




GECCO 2007                  Llorà, Sastry, Yu & Goldberg         13
Fitness Inheritance using Least Squares
• Recoding defines matrix A




• Normalize the fitness




GECCO 2007                Llorà, Sastry, Yu & Goldberg   14
Fitness Inheritance using Least Squares
• Solve using least squares




• Once solved, the fitness surrogate take the following form




GECCO 2007                Llorà, Sastry, Yu & Goldberg         15
Fitness Inheritance and χeCCS
• Two different problems




               Hidden XOR                            6-input multiplexer


GECCO 2007                  Llorà, Sastry, Yu & Goldberg                   16
Hidden XOR
• Evolved rules and model




• Surrogate accuracy




GECCO 2007             Llorà, Sastry, Yu & Goldberg   17
6-input Multiplexer
• The evolved solution and model




• The surrogate is totally off



GECCO 2007            Llorà, Sastry, Yu & Goldberg   18
6-input Multiplexer
• The key = missing basis




• χeCCS is able to solve the problem quickly, reliably,
    and accurately
• However, the model basis are not accurate enough
    to build a proper surrogate
GECCO 2007             Llorà, Sastry, Yu & Goldberg   19
Overlapping BBs using DSMGA
• Proposed by Yu, Yassine, Goldberg and Chen (2003)
• Based on organizational theory
• Main property = DSMGA model builder (DSMcluster)
    deals with overlapping building blocks
• The main issue = translate a populations or rules
    into a dependency structure matrix (DSM)
• The intuition = specific bits are the ones responsible
    for the kind of linkage we seek


GECCO 2007             Llorà, Sastry, Yu & Goldberg    20
Jumping to the Results
• DSMcluster model for the hidden XOR
      – [i0 i1 i2] [i3] [i4] [i5]


• DSMcluster model for the 6-input multiplexer
      – [i0 i1] <i2 i3 i4 i5>
      – It identifies a BB [i0 i1] of variables interacting with a
        bus <i2 i3 i4 i5>
      – Translated into χeCCS language:
                 [i0 i1 i2] [i0 i1 i3] [i0 i1 i4] [i0 i1 i5]
      – The right model which provides the right set of basis

GECCO 2007                      Llorà, Sastry, Yu & Goldberg     21
Conclusions
• The matching process is crucial and expensive
• Efficient implementations can take us far to a point
• Relaxation can get rid of the need of matching
• For some types of problems overlapping BBs are
    required
• DSMGA provides the proper machinery to identify
    the proper basis for such a surrogate




GECCO 2007              Llorà, Sastry, Yu & Goldberg     22
Do not Match, Inherit: Fitness Surrogates for
Genetics-Based Machine Learning Techniques


       Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2

1 National   Center for Supercomputing Applications. University of Illinois at Urbana-Champaign
     2 Illinois   Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign
                  3 Department   of Electrical Engineering, National Taiwan University




                          Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199
GECCO 2007 HUMIES                                                                             23

More Related Content

Similar to Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques

Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...Xavier Llorà
 
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...Ahmed Gamal Abdel Gawad
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET Journal
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET Journal
 
Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)Global R & D Services
 
Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)Global R & D Services
 
Second Order Heuristics in ACGP
Second Order Heuristics in ACGPSecond Order Heuristics in ACGP
Second Order Heuristics in ACGPhauschildm
 
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow ProblemA Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow ProblemShubhashis Shil
 
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...IRJET Journal
 
Higgs Boson Challenge
Higgs Boson ChallengeHiggs Boson Challenge
Higgs Boson ChallengeRaouf KESKES
 
SpecAugment review
SpecAugment reviewSpecAugment review
SpecAugment reviewJune-Woo Kim
 
Genetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale ModelingGenetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale Modelingkknsastry
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...Vahid Taslimitehrani
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problemsUsing particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problemsriyaniaes
 
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptArtificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptAnonymous9etQKwW
 

Similar to Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques (20)

Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
Structural Optimization using Genetic Algorithms - Artificial Intelligence Fu...
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
 
IRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic AlorithmIRJET- Optimization of Riser through Genetic Alorithm
IRJET- Optimization of Riser through Genetic Alorithm
 
Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)Grds international conference on pure and applied science (5)
Grds international conference on pure and applied science (5)
 
Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)Grds international conference on pure and applied science (6)
Grds international conference on pure and applied science (6)
 
Second Order Heuristics in ACGP
Second Order Heuristics in ACGPSecond Order Heuristics in ACGP
Second Order Heuristics in ACGP
 
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow ProblemA Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
A Genetic Algorithm Based Approach for Solving Optimal Power Flow Problem
 
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
IRJET- Comprehensive Analysis on Optimal Allocation and Sizing of Distributed...
 
Cz24655657
Cz24655657Cz24655657
Cz24655657
 
Higgs Boson Challenge
Higgs Boson ChallengeHiggs Boson Challenge
Higgs Boson Challenge
 
50120130406046
5012013040604650120130406046
50120130406046
 
SpecAugment review
SpecAugment reviewSpecAugment review
SpecAugment review
 
Genetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale ModelingGenetic Algorithms and Genetic Programming for Multiscale Modeling
Genetic Algorithms and Genetic Programming for Multiscale Modeling
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
 
40220140502004
4022014050200440220140502004
40220140502004
 
Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1Recent Advances in CPLEX 12.6.1
Recent Advances in CPLEX 12.6.1
 
Using particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problemsUsing particle swarm optimization to solve test functions problems
Using particle swarm optimization to solve test functions problems
 
I045046066
I045046066I045046066
I045046066
 
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.pptArtificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
Artificial Neural Networks_Bioinsspired_Algorithms_Nov 20.ppt
 

More from Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with MeandreXavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsXavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance ProblemsXavier Llorà
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...Xavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challengesXavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionXavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier SystemsXavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?Xavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsXavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring LanguageXavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Xavier Llorà
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata storesXavier Llorà
 

More from Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 
The DISCUS project
The DISCUS projectThe DISCUS project
The DISCUS project
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata stores
 

Recently uploaded

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 

Recently uploaded (20)

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 

Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques

  • 1. Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2 1 National Center for Supercomputing Applications. University of Illinois at Urbana-Champaign 2 Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign 3 Department of Electrical Engineering, National Taiwan University Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199 GECCO 2007 HUMIES 1
  • 2. Motivation • Competent GBML – Use competent GAs to approach GBML problems – Take advantage of competent GA scalability – Provide insight about problem structure – χeCCS by Llorà, Sastry, Goldberg & de la Ossa (2006) • Rule matching may thread practical applications – Even for small dimensional problems (MUX 20), rule matching may take more than 85% of the execution time in XCS – As dimensionality or cardinality of training sets increase, rule matching rules the overall execution time – Efficient implementation (Llora & Sastry, 2006) still require matching rules GECCO 2007 Llorà, Sastry, Yu & Goldberg 2
  • 3. Motivation • Competent GAs – Byproduct: Models and problem structure insight – Revision of the fitness relaxation for expensive fitness evaluations – Idea: Build a cheap surrogate fitness accurate enough – Successfully applied to GA (Sastry, Lima & Goldberg, 2006) – Help cut down the number of fitness evaluations • GBML – Can we transfer the same ideas to GBML approaches? – What are the requirements needed for competent GBML to benefit from fitness relaxation? GECCO 2007 Llorà, Sastry, Yu & Goldberg 3
  • 4. Outline • Overview χeCCS • Evaluation relaxation • Fitness inheritance using least squares fitting • Fitness inheritance and χeCCS • Results • Conclusions GECCO 2007 Llorà, Sastry, Yu & Goldberg 4
  • 5. χ-ari Extended Compact Classifier System • No reinforcement learning is used • A competent GA is in charge of the learning • The idea: – A population of single rules – For each rule we compute its fitness – The χ-ari extended compact genetic algorithm – Niching to maintain different accurate rules (restricted tournament replacement) GECCO 2007 Llorà, Sastry, Yu & Goldberg 5
  • 6. Maximally Accurate and General Rules • Accuracy and generality can be compute as n t + (r) + n t# (r) n t + (r) quot;(r) = quot;(r) = nt nm • Fitness should combine accuracy and generality f (r) = quot;(r) # $(r)% ! ! • Such measure can be either applied to rules or rule sets ! GECCO 2007 Llorà, Sastry, Yu & Goldberg 6
  • 7. Maximally Accurate and General Rules GECCO 2007 Llorà, Sastry, Yu & Goldberg 7
  • 8. Extended Compact Genetic Algorithm • A Probabilistic model building GA (Harik, 1999) – Builds models of good solutions as linkage groups • Key idea: – Good probability distribution → Linkage learning • Key components: – Representation: Marginal product model (MPM) • Marginal distribution of a gene partition – Quality: Minimum description length (MDL) • Occam’s razor principle • All things being equal, simpler models are better – Search Method: Greedy heuristic search GECCO 2007 Llorà, Sastry, Yu & Goldberg 8
  • 9. Marginal Product Model (MPM) • Partition variables into disjoint sets • Product of marginal distributions on a partition of genes • Gene partition maps to linkage groups MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l] ... xl-2 xl-1 xl x1 x2 x3 x4 x5 x6 {p000, p001, p00#, p010, p011, p01#, p100, p101, p10#, p110, p111, p11# … }(27 probabilities) GECCO 2007 Llorà, Sastry, Yu & Goldberg 9
  • 10. Minimum Description Length Metric • Hypothesis: For an optimal model – Model size and error is minimum • Model complexity, Cm – # of bits required to store all marginal probabilities • Compressed population complexity, Cp – Entropy of the marginal distribution over all partitions • MDL metric, Cc = Cm + Cp GECCO 2007 Llorà, Sastry, Yu & Goldberg 10
  • 11. Building an Optimal MPM • Assume independent genes ([1],[2],…,[l]) • Compute MDL metric, Cc • All combinations of two subset merges Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])} • • Compute MDL metric for all model candidates • Select the set with minimum MDL, • If , accept the model and go to step 2. • Else, the current model is optimal GECCO 2007 Llorà, Sastry, Yu & Goldberg 11
  • 12. χeCCS Models for Different Multiplexers Building Block Size Increases
  • 13. Fitness Inheritance using Least Squares • Proposed by Sastry, Lima & Goldberg (2006) • Surrogate is a regression using basis identified by BBs • A simple example: [1,3] [2] [4] • The schemas represented are – {0*0*, 0*1*, 0*#*, 1*0*, 1*1*, 1*#*, #*0*, #*1*, #*#*, *0**, *1**, *#**, ***0 , ***1 , ***#} • Recode the individuals by GECCO 2007 Llorà, Sastry, Yu & Goldberg 13
  • 14. Fitness Inheritance using Least Squares • Recoding defines matrix A • Normalize the fitness GECCO 2007 Llorà, Sastry, Yu & Goldberg 14
  • 15. Fitness Inheritance using Least Squares • Solve using least squares • Once solved, the fitness surrogate take the following form GECCO 2007 Llorà, Sastry, Yu & Goldberg 15
  • 16. Fitness Inheritance and χeCCS • Two different problems Hidden XOR 6-input multiplexer GECCO 2007 Llorà, Sastry, Yu & Goldberg 16
  • 17. Hidden XOR • Evolved rules and model • Surrogate accuracy GECCO 2007 Llorà, Sastry, Yu & Goldberg 17
  • 18. 6-input Multiplexer • The evolved solution and model • The surrogate is totally off GECCO 2007 Llorà, Sastry, Yu & Goldberg 18
  • 19. 6-input Multiplexer • The key = missing basis • χeCCS is able to solve the problem quickly, reliably, and accurately • However, the model basis are not accurate enough to build a proper surrogate GECCO 2007 Llorà, Sastry, Yu & Goldberg 19
  • 20. Overlapping BBs using DSMGA • Proposed by Yu, Yassine, Goldberg and Chen (2003) • Based on organizational theory • Main property = DSMGA model builder (DSMcluster) deals with overlapping building blocks • The main issue = translate a populations or rules into a dependency structure matrix (DSM) • The intuition = specific bits are the ones responsible for the kind of linkage we seek GECCO 2007 Llorà, Sastry, Yu & Goldberg 20
  • 21. Jumping to the Results • DSMcluster model for the hidden XOR – [i0 i1 i2] [i3] [i4] [i5] • DSMcluster model for the 6-input multiplexer – [i0 i1] <i2 i3 i4 i5> – It identifies a BB [i0 i1] of variables interacting with a bus <i2 i3 i4 i5> – Translated into χeCCS language: [i0 i1 i2] [i0 i1 i3] [i0 i1 i4] [i0 i1 i5] – The right model which provides the right set of basis GECCO 2007 Llorà, Sastry, Yu & Goldberg 21
  • 22. Conclusions • The matching process is crucial and expensive • Efficient implementations can take us far to a point • Relaxation can get rid of the need of matching • For some types of problems overlapping BBs are required • DSMGA provides the proper machinery to identify the proper basis for such a surrogate GECCO 2007 Llorà, Sastry, Yu & Goldberg 22
  • 23. Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning Techniques Xavier Llorà1,2, Kumara Sastry2, Tian-Li Yu3, David E. Goldberg2 1 National Center for Supercomputing Applications. University of Illinois at Urbana-Champaign 2 Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign 3 Department of Electrical Engineering, National Taiwan University Supported by AFOSR FA9550-06-1-0370, NSF at ISS-02-09199 GECCO 2007 HUMIES 23