SlideShare uma empresa Scribd logo
1 de 22
Baixar para ler offline
Linkage Learning for Pittsburgh LCS:
    Making Problems Tractable


    Xavier Llorà, Kumara Sastry, & David E. Goldberg

                Illinois Genetic Algorithms Lab
           University of Illinois at Urbana-Champaign


           {xllora,kumara,deg}@illigal.ge.uiuc.edu
Motivation and Early Work

    • Can we apply Wilson’s ideas for evolving rule sets
      formed only by maximally accurate and general rules in
      Pittsburgh LCS?
    • Previous Multi-objective approaches:
              Bottom up (Bernadó, 2002)
                 • Panmictic populations
                 • Multimodal optimization (sharing/crowding for niche formation)
              Top down (Llorà, Goldberg, Traus, Bernadó, 2003)
                 • Explicitly address accuracy and generality
                 • Use it to push and product compact rule sets
    • The compact classifier system (CCS) roots on the bottom
      up approach.

NIGEL 2006                           Llorà, X., Sastry, K., and Goldberg, D.        2
Maximally Accurate and General Rules

    • Accuracy and generality can be compute as
                        n t + (r) + n t# (r)                           n t + (r)
                 quot;(r) =                                         quot;(r) =
                                 nt                                      nm
     • Fitness should combine accuracy and generality
                               f (r) = quot;(r) # $(r)%
       !                             !
     • Such measure can be either applied to rules or rule sets.
     • The CCS uses this fitness and a compact genetic algorithm
               !
       (cGA) to evolve such rules.
     • One cGA run provides one rule.
     • Multiple rules are required to form a rule set.
NIGEL 2006                     Llorà, X., Sastry, K., and Goldberg, D.             3
The cGA Can Make It

    • Rules may be obtained optimizing

                                    f (r) = quot;(r) # $(r)%

             The basic CGA scheme
    •
                                  0
             1. Initialization   px i = 0.5
                       !
             2. Model sampling (two individuals are generated)
             3. Evaluation (f(r))
             4. Selection (tournament selection)
                   !
             5. Probabilistic model updation
             6. Repeat steps 2-5 until termination criteria are met



NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.   4
cGA Model Perturbation

    • Facilitate the evolution of different rules
    • Explore the frequency of appearance of each optimal
      rule
    • Initial model perturbation
                         0
                        px i = 0.5 + U(quot;0.4,0.4)

     • Experiments using the 3-input multiplexer
     • 1,000 independent runs
             !
     • Visualize the pair-wise relations of the genes



NIGEL 2006                Llorà, X., Sastry, K., and Goldberg, D.   5
But One Rule Is Not Enough

    • Model perturbation in cGA evolve different rules
    • The goal: evolve population of rules that solve the
      problem together
    • The fitness measure (f(r)) can be also be applied to rule
      sets
             Two mechanism:
    •
              Spawn a population until the solution is meet
              Fusing populations when they represent the same rule




NIGEL 2006                        Llorà, X., Sastry, K., and Goldberg, D.   6
Spawning and Fusing Populations




NIGEL 2006        Llorà, X., Sastry, K., and Goldberg, D.   7
Experiments & Scalability
    • Analysis using multiplexer problems (3-, 6-, and 11-input)
    • The number of rules in [O] grow exponentially.
              It grows as 2i, where i is the number of inputs.
              Assume equal probability of hitting a rule (binomial model).
              The number or runs to achieve all the rules in [O] grows
               exponentially.
    • The cGA success as a function of the problem size!
              3-input: 97%
              6-input: 73.93%
              11-input: 43.03%
    • Scalability over 10,000 independent runs


NIGEL 2006                         Llorà, X., Sastry, K., and Goldberg, D.    8
Scalability of CCS




NIGEL 2006   Llorà, X., Sastry, K., and Goldberg, D.            9
So?
             Open questions:
    •
              Multiple runs is not an option.
              Could the poor cGA scalability be the result of the existence of linkage?
             The χ-ary extended compact classifier system (χeCCS) needs to
    •
             provide answers to:
              Perform linkage learning to improve the scalability of the rule learning
               process.
              Evolve [O] in a single run (rule niching?).
             The χeCCS answer:
    •
              Use the extended compact genetic algorithm (Harik, 1999)
              Rule niching via restricted tournament replacement (Harik, 1995)




NIGEL 2006                            Llorà, X., Sastry, K., and Goldberg, D.              10
Extended Compact Genetic Algorithm
             A Probabilistic model building GA (Harik, 1999)
       •
              Builds models of good solutions as linkage groups

             Key idea:
       •
              Good probability distribution → Linkage learning

             Key components:
       •
              Representation: Marginal product model (MPM)
                 • Marginal distribution of a gene partition

              Quality: Minimum description length (MDL)
                 • Occam’s razor principle
                 • All things being equal, simpler models are better
              Search Method: Greedy heuristic search



NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.   11
Marginal Product Model (MPM)
       • Partition variables into clusters
       • Product of marginal distributions on a partition of genes
       • Gene partition maps to linkage groups
                 MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l]


                                                     ...                 xl-2 xl-1 xl
                  x1 x2 x3   x4 x5 x6


                 {p000, p001, p010, p100, p011, p101, p110, p111}



NIGEL 2006                     Llorà, X., Sastry, K., and Goldberg, D.                  12
Minimum Description Length Metric
             Hypothesis: For an optimal model
       •
              Model size and error is minimum

             Model complexity, Cm
       •
              # of bits required to store all marginal probabilities



             Compressed population complexity, Cp
       •
              Entropy of the marginal distribution over all partitions




             MDL metric, Cc = Cm + Cp
       •


NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.   13
Building an Optimal MPM
             Assume independent genes ([1],[2],…,[l])
       •

             Compute MDL metric, Cc
       •

             All combinations of two subset merges
       •
                  Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])}
             •

             Compute MDL metric for all model candidates
       •

             Select the set with minimum MDL,
       •

             If            , accept the model and go to step 2.
       •

             Else, the current model is optimal
       •


NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.          14
Extended Compact Genetic Algorithm
             Initialize the population (usually random initialization)
   •

             Evaluate the fitness of individuals
   •

             Select promising solutions (e.g., tournament selection)
   •

             Build the probabilistic model
   •
                 Optimize structure & parameters to best fit selected individuals
             •
                 Automatic identification of sub-structures
             •

             Sample the model to create new candidate solutions
   •
                 Effective exchange of building blocks
             •

             Repeat steps 2–7 till some convergence criteria are met
   •



NIGEL 2006                            Llorà, X., Sastry, K., and Goldberg, D.       15
Models built by eCGA
   • Use model-building procedure of extended compact GA
              Partition genes into (mutually) independent groups
              Start with the lowest complexity model
              Search for a least-complex, most-accurate model


                              Model Structure                                          Metric
             [X0] [X1] [X2] [X3] [X4] [X5] [X6] [X7] [X8] [X9] [X10] [X11]             1.0000
             [X0] [X1] [X2] [X3] [X4X5] [X6] [X7] [X8] [X9] [X10] [X11]                0.9933
             [X0] [X1] [X2] [X3] [X4X5X7] [X6] [X8] [X9] [X10] [X11]                    0.9819
             [X0] [X1] [X2] [X3] [X4X5X6X7] [X8] [X9] [X10] [X11]                       0.9644
                                       M                                                  M
             [X0] [X1] [X2] [X3] [X4X5X6X7] [X8X9X10X11]                               0.9273
                                      M                                                  M
             [X0X1X2X3] [X4X5X6X7] [X8X9X10X11]                                        0.8895



NIGEL 2006                                   Llorà, X., Sastry, K., and Goldberg, D.             16
Modifying ecGA for Rule Learning
    • Rules are described using χ-ary alphabets {0, 1, #}.
    • χeCCS uses a χ-ary version of ecGA (Sastry and Goldberg,
      2003; de la Osa, Sastry, and Lobo, 2006).
    • Maximally general and maximally accurate rules may be
      obtained using:
                               f (r) = quot;(r) # $(r)%

    • Needs to maintain multiple rules in a run → niching
              We need an efficient niching method, that does not adversely
                    !
               affect the quality of the probabilistic models.
              Restricted tournament replacement (Harik, 1995)

NIGEL 2006                        Llorà, X., Sastry, K., and Goldberg, D.     17
Experiments

             Goals
    •
             1. Is linkage learning useful to solve the multiplexer problem using
                Pittsburgh LCS?
             2. How far can we push it?
             Multiplexer problems
    •
                 Address bits determine what input to use
             
                 There is un underlying structure, isn’t it?
             
             The larger solved using Pittsburgh approaches (11-input)
    •
                 Match all the examples
             
                 No linkage learning available
             
             We borrowed the population sizing theory for ecGA.
    •


NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.             18
χeCCS Models for Different Multiplexers
        Building Block Size Increases




NIGEL 2006                                       Llorà, X., Sastry, K., and Goldberg, D.   19
χeCCS Scalability




             Follows facet-wise theory:
     •
             1. Grows exponential with the number of address bits (building block size)
             2. Quadratically with the problem size


NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.                   20
Conclusions
             The χeCCS builds on competent GAs
    •
             The facetwise models from GA theory hold
    •
             The χeCCS is able to:
    •
             1. Perform linkage learning to improve the scalability of the rule
                learning process.
             2. Evolve [O] in a single run.
             The χeCCS show the need for linkage learning in
    •
             Pittsburgh LCS to effectively solve multiplexer
             problems.
             χeCCS solved 20-input, 37-input, and 70-input
    •
             multiplexers problems for the first time using Pittsburgh
             LCS.
NIGEL 2006                          Llorà, X., Sastry, K., and Goldberg, D.             21
Linkage Learning for Pittsburgh LCS:
    Making Problems Tractable


    Xavier Llorà, Kumara Sastry, & David E. Goldberg

                Illinois Genetic Algorithms Lab
           University of Illinois at Urbana-Champaign


           {xllora,kumara,deg}@illigal.ge.uiuc.edu

Mais conteúdo relacionado

Semelhante a Linkage Learning for Pittsburgh LCS: Making Problems Tractable

The compact classifier system: Motivation, analysis and first results
The compact classifier system: Motivation, analysis and first results The compact classifier system: Motivation, analysis and first results
The compact classifier system: Motivation, analysis and first results Xavier Llorà
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksMLReview
 
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...Albert Orriols-Puig
 
Towards billion bit optimization via parallel estimation of distribution algo...
Towards billion bit optimization via parallel estimation of distribution algo...Towards billion bit optimization via parallel estimation of distribution algo...
Towards billion bit optimization via parallel estimation of distribution algo...kknsastry
 
The Robust Optimization of Non-Linear Requirements Models
The Robust Optimization of Non-Linear Requirements ModelsThe Robust Optimization of Non-Linear Requirements Models
The Robust Optimization of Non-Linear Requirements Modelsgregoryg
 
Generative Adversarial Networks 2
Generative Adversarial Networks 2Generative Adversarial Networks 2
Generative Adversarial Networks 2Alireza Shafaei
 
Empirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemsEmpirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemskknsastry
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint febimu409
 
Arbonne's Results Presentation
Arbonne's Results PresentationArbonne's Results Presentation
Arbonne's Results Presentationguest06b488
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...Vahid Taslimitehrani
 
adversarial robustness lecture
adversarial robustness lectureadversarial robustness lecture
adversarial robustness lectureMuhammadAhmedShah2
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraJason Riedy
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...cvpaper. challenge
 
Talwalkar mlconf (1)
Talwalkar mlconf (1)Talwalkar mlconf (1)
Talwalkar mlconf (1)MLconf
 

Semelhante a Linkage Learning for Pittsburgh LCS: Making Problems Tractable (20)

The compact classifier system: Motivation, analysis and first results
The compact classifier system: Motivation, analysis and first results The compact classifier system: Motivation, analysis and first results
The compact classifier system: Motivation, analysis and first results
 
Cz24655657
Cz24655657Cz24655657
Cz24655657
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
 
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
IWLCS'2008: First Approach toward Online Evolution of Association Rules wit...
 
Towards billion bit optimization via parallel estimation of distribution algo...
Towards billion bit optimization via parallel estimation of distribution algo...Towards billion bit optimization via parallel estimation of distribution algo...
Towards billion bit optimization via parallel estimation of distribution algo...
 
The Robust Optimization of Non-Linear Requirements Models
The Robust Optimization of Non-Linear Requirements ModelsThe Robust Optimization of Non-Linear Requirements Models
The Robust Optimization of Non-Linear Requirements Models
 
Generative Adversarial Networks 2
Generative Adversarial Networks 2Generative Adversarial Networks 2
Generative Adversarial Networks 2
 
23AFMC_Beamer.pdf
23AFMC_Beamer.pdf23AFMC_Beamer.pdf
23AFMC_Beamer.pdf
 
Empirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problemsEmpirical Analysis of ideal recombination on random decomposable problems
Empirical Analysis of ideal recombination on random decomposable problems
 
GARCH
GARCHGARCH
GARCH
 
powerpoint feb
powerpoint febpowerpoint feb
powerpoint feb
 
Neural modeling of verbal consciousness based on the results of the associati...
Neural modeling of verbal consciousness based on the results of the associati...Neural modeling of verbal consciousness based on the results of the associati...
Neural modeling of verbal consciousness based on the results of the associati...
 
Genetic Algorithms
Genetic AlgorithmsGenetic Algorithms
Genetic Algorithms
 
Arbonne's Results Presentation
Arbonne's Results PresentationArbonne's Results Presentation
Arbonne's Results Presentation
 
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
A new CPXR Based Logistic Regression Method and Clinical Prognostic Modeling ...
 
adversarial robustness lecture
adversarial robustness lectureadversarial robustness lecture
adversarial robustness lecture
 
Graph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear AlgebraGraph Analysis Beyond Linear Algebra
Graph Analysis Beyond Linear Algebra
 
riken-RBlur-slides.pptx
riken-RBlur-slides.pptxriken-RBlur-slides.pptx
riken-RBlur-slides.pptx
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
 
Talwalkar mlconf (1)
Talwalkar mlconf (1)Talwalkar mlconf (1)
Talwalkar mlconf (1)
 

Mais de Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with MeandreXavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsXavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance ProblemsXavier Llorà
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...Xavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challengesXavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionXavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier SystemsXavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?Xavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsXavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring LanguageXavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Xavier Llorà
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata storesXavier Llorà
 
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue Xavier Llorà
 
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101Xavier Llorà
 

Mais de Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 
The DISCUS project
The DISCUS projectThe DISCUS project
The DISCUS project
 
Visualizing content in metadata stores
Visualizing content in metadata storesVisualizing content in metadata stores
Visualizing content in metadata stores
 
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue
GE498-ECI, Lecture 9:The Unstructured Data Contains a Clue
 
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101
GE498-ECI, Lecture 8: Connectivity Everywhere; Graph Theory 101
 

Último

Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 

Último (20)

Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 

Linkage Learning for Pittsburgh LCS: Making Problems Tractable

  • 1. Linkage Learning for Pittsburgh LCS: Making Problems Tractable Xavier Llorà, Kumara Sastry, & David E. Goldberg Illinois Genetic Algorithms Lab University of Illinois at Urbana-Champaign {xllora,kumara,deg}@illigal.ge.uiuc.edu
  • 2. Motivation and Early Work • Can we apply Wilson’s ideas for evolving rule sets formed only by maximally accurate and general rules in Pittsburgh LCS? • Previous Multi-objective approaches:  Bottom up (Bernadó, 2002) • Panmictic populations • Multimodal optimization (sharing/crowding for niche formation)  Top down (Llorà, Goldberg, Traus, Bernadó, 2003) • Explicitly address accuracy and generality • Use it to push and product compact rule sets • The compact classifier system (CCS) roots on the bottom up approach. NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 2
  • 3. Maximally Accurate and General Rules • Accuracy and generality can be compute as n t + (r) + n t# (r) n t + (r) quot;(r) = quot;(r) = nt nm • Fitness should combine accuracy and generality f (r) = quot;(r) # $(r)% ! ! • Such measure can be either applied to rules or rule sets. • The CCS uses this fitness and a compact genetic algorithm ! (cGA) to evolve such rules. • One cGA run provides one rule. • Multiple rules are required to form a rule set. NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 3
  • 4. The cGA Can Make It • Rules may be obtained optimizing f (r) = quot;(r) # $(r)% The basic CGA scheme • 0 1. Initialization px i = 0.5 ! 2. Model sampling (two individuals are generated) 3. Evaluation (f(r)) 4. Selection (tournament selection) ! 5. Probabilistic model updation 6. Repeat steps 2-5 until termination criteria are met NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 4
  • 5. cGA Model Perturbation • Facilitate the evolution of different rules • Explore the frequency of appearance of each optimal rule • Initial model perturbation 0 px i = 0.5 + U(quot;0.4,0.4) • Experiments using the 3-input multiplexer • 1,000 independent runs ! • Visualize the pair-wise relations of the genes NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 5
  • 6. But One Rule Is Not Enough • Model perturbation in cGA evolve different rules • The goal: evolve population of rules that solve the problem together • The fitness measure (f(r)) can be also be applied to rule sets Two mechanism: •  Spawn a population until the solution is meet  Fusing populations when they represent the same rule NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 6
  • 7. Spawning and Fusing Populations NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 7
  • 8. Experiments & Scalability • Analysis using multiplexer problems (3-, 6-, and 11-input) • The number of rules in [O] grow exponentially.  It grows as 2i, where i is the number of inputs.  Assume equal probability of hitting a rule (binomial model).  The number or runs to achieve all the rules in [O] grows exponentially. • The cGA success as a function of the problem size!  3-input: 97%  6-input: 73.93%  11-input: 43.03% • Scalability over 10,000 independent runs NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 8
  • 9. Scalability of CCS NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 9
  • 10. So? Open questions: •  Multiple runs is not an option.  Could the poor cGA scalability be the result of the existence of linkage? The χ-ary extended compact classifier system (χeCCS) needs to • provide answers to:  Perform linkage learning to improve the scalability of the rule learning process.  Evolve [O] in a single run (rule niching?). The χeCCS answer: •  Use the extended compact genetic algorithm (Harik, 1999)  Rule niching via restricted tournament replacement (Harik, 1995) NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 10
  • 11. Extended Compact Genetic Algorithm A Probabilistic model building GA (Harik, 1999) •  Builds models of good solutions as linkage groups Key idea: •  Good probability distribution → Linkage learning Key components: •  Representation: Marginal product model (MPM) • Marginal distribution of a gene partition  Quality: Minimum description length (MDL) • Occam’s razor principle • All things being equal, simpler models are better  Search Method: Greedy heuristic search NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 11
  • 12. Marginal Product Model (MPM) • Partition variables into clusters • Product of marginal distributions on a partition of genes • Gene partition maps to linkage groups MPM: [1, 2, 3], [4, 5, 6], … [l-2, l -1, l] ... xl-2 xl-1 xl x1 x2 x3 x4 x5 x6 {p000, p001, p010, p100, p011, p101, p110, p111} NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 12
  • 13. Minimum Description Length Metric Hypothesis: For an optimal model •  Model size and error is minimum Model complexity, Cm •  # of bits required to store all marginal probabilities Compressed population complexity, Cp •  Entropy of the marginal distribution over all partitions MDL metric, Cc = Cm + Cp • NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 13
  • 14. Building an Optimal MPM Assume independent genes ([1],[2],…,[l]) • Compute MDL metric, Cc • All combinations of two subset merges • Eg., {([1,2],[3],…,[l]), ([1,3],[2],…,[l]), ([1],[2],…,[l-1,l])} • Compute MDL metric for all model candidates • Select the set with minimum MDL, • If , accept the model and go to step 2. • Else, the current model is optimal • NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 14
  • 15. Extended Compact Genetic Algorithm Initialize the population (usually random initialization) • Evaluate the fitness of individuals • Select promising solutions (e.g., tournament selection) • Build the probabilistic model • Optimize structure & parameters to best fit selected individuals • Automatic identification of sub-structures • Sample the model to create new candidate solutions • Effective exchange of building blocks • Repeat steps 2–7 till some convergence criteria are met • NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 15
  • 16. Models built by eCGA • Use model-building procedure of extended compact GA  Partition genes into (mutually) independent groups  Start with the lowest complexity model  Search for a least-complex, most-accurate model Model Structure Metric [X0] [X1] [X2] [X3] [X4] [X5] [X6] [X7] [X8] [X9] [X10] [X11] 1.0000 [X0] [X1] [X2] [X3] [X4X5] [X6] [X7] [X8] [X9] [X10] [X11] 0.9933 [X0] [X1] [X2] [X3] [X4X5X7] [X6] [X8] [X9] [X10] [X11] 0.9819 [X0] [X1] [X2] [X3] [X4X5X6X7] [X8] [X9] [X10] [X11] 0.9644 M M [X0] [X1] [X2] [X3] [X4X5X6X7] [X8X9X10X11] 0.9273 M M [X0X1X2X3] [X4X5X6X7] [X8X9X10X11] 0.8895 NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 16
  • 17. Modifying ecGA for Rule Learning • Rules are described using χ-ary alphabets {0, 1, #}. • χeCCS uses a χ-ary version of ecGA (Sastry and Goldberg, 2003; de la Osa, Sastry, and Lobo, 2006). • Maximally general and maximally accurate rules may be obtained using: f (r) = quot;(r) # $(r)% • Needs to maintain multiple rules in a run → niching  We need an efficient niching method, that does not adversely ! affect the quality of the probabilistic models.  Restricted tournament replacement (Harik, 1995) NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 17
  • 18. Experiments Goals • 1. Is linkage learning useful to solve the multiplexer problem using Pittsburgh LCS? 2. How far can we push it? Multiplexer problems • Address bits determine what input to use  There is un underlying structure, isn’t it?  The larger solved using Pittsburgh approaches (11-input) • Match all the examples  No linkage learning available  We borrowed the population sizing theory for ecGA. • NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 18
  • 19. χeCCS Models for Different Multiplexers Building Block Size Increases NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 19
  • 20. χeCCS Scalability Follows facet-wise theory: • 1. Grows exponential with the number of address bits (building block size) 2. Quadratically with the problem size NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 20
  • 21. Conclusions The χeCCS builds on competent GAs • The facetwise models from GA theory hold • The χeCCS is able to: • 1. Perform linkage learning to improve the scalability of the rule learning process. 2. Evolve [O] in a single run. The χeCCS show the need for linkage learning in • Pittsburgh LCS to effectively solve multiplexer problems. χeCCS solved 20-input, 37-input, and 70-input • multiplexers problems for the first time using Pittsburgh LCS. NIGEL 2006 Llorà, X., Sastry, K., and Goldberg, D. 21
  • 22. Linkage Learning for Pittsburgh LCS: Making Problems Tractable Xavier Llorà, Kumara Sastry, & David E. Goldberg Illinois Genetic Algorithms Lab University of Illinois at Urbana-Champaign {xllora,kumara,deg}@illigal.ge.uiuc.edu