A Multi-Objective Genetic Algorithm for
Pruning Support Vector Machines
Mohamed Abdel Hady, Wessam Herbawi,
Friedhelm Schwenker
Institute of Neural Information Processing
University of Ulm, Germany
{mohamed.abdel-hady}@uni-ulm.de
November 4, 2011
Support Vector Machine
[Figure: two classes (+/−) separated by the maximum-margin hyperplane {x | ⟨w, φ(x)⟩ + b = 0} with normal vector w, flanked by the margin hyperplanes {x | ⟨w, φ(x)⟩ + b = −1} on the y = −1 side and {x | ⟨w, φ(x)⟩ + b = +1} on the y = +1 side; examples violating the margin are marked with slack variables ξ1–ξ4.]
Support Vector Machine
To obtain the optimal hyperplane, one solves the following convex quadratic optimization problem with respect to the weight vector w and bias b:

$$\min_{w,b}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \qquad (1)$$

subject to the constraints:

$$y_i\bigl(\langle w, \varphi(x_i)\rangle + b\bigr) \ge 1 - \xi_i,\quad \xi_i \ge 0 \quad \text{for } i = 1, \ldots, n \qquad (2)$$

The regularization parameter C controls the trade-off between maximizing the margin $1/\|w\|$ and minimizing the sum of the slack variables of the training examples:

$$\xi_i = \max\bigl(0,\ 1 - y_i(\langle w, \varphi(x_i)\rangle + b)\bigr) \quad \text{for } i = 1, \ldots, n. \qquad (3)$$

The training example x_i is correctly classified if 0 ≤ ξ_i < 1 and is misclassified when ξ_i ≥ 1.
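To make Eq. (3) concrete, here is a minimal numerical sketch, assuming a linear feature map φ(x) = x so that ⟨w, φ(x)⟩ is a plain dot product; the weight vector, bias, and data points are invented purely for illustration:

```python
import numpy as np

# Toy data with a linear feature map phi(x) = x, so <w, phi(x)> is a plain
# dot product; w, b and the points are invented for illustration.
w, b = np.array([1.0, -0.5]), 0.2
X = np.array([[2.0, 1.0], [0.1, 0.3], [1.0, 0.0]])
y = np.array([+1, +1, -1])

margins = y * (X @ w + b)            # y_i (<w, phi(x_i)> + b)
xi = np.maximum(0.0, 1.0 - margins)  # slack variables xi_i, Eq. (3)

for i, s in enumerate(xi):
    if s == 0:
        status = "correct, on or beyond the margin"
    elif s < 1:
        status = "correct, inside the margin"
    else:
        status = "misclassified"
    print(f"x_{i}: xi = {s:.2f} -> {status}")
```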
Support Vector Machine
Using standard Lagrangian techniques, the problem is converted into its equivalent dual problem, whose number of variables equals the number of training examples:

$$\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{n}\alpha_i\alpha_j y_i y_j\, k(x_i, x_j) \qquad (4)$$

subject to the constraints

$$\sum_{i=1}^{n}\alpha_i y_i = 0 \quad \text{and} \quad 0 \le \alpha_i \le C \quad \text{for } i = 1, \ldots, n, \qquad (5)$$

where the coefficients α*_i are the optimal solution of the dual problem and k is the kernel function. Hence, the decision function to classify an unseen example x can be written as:

$$f(x) = \sum_{i=1}^{n_{sv}} \alpha_i^*\, y_i\, k(x, x_i) + b^*. \qquad (6)$$

The training examples x_i with α*_i > 0 are called support vectors, and the number of support vectors is denoted by n_sv ≤ n.
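As a sketch of how Eq. (6) is evaluated in practice (not the paper's implementation; the function and variable names are illustrative, with alpha_y[i] standing for α*_i y_i), the decision value is a kernel-weighted sum over the support vectors only, which is where the O(n_sv) classification cost discussed next comes from:

```python
import numpy as np

def rbf_kernel(x, xi, gamma=1.0):
    # Gaussian kernel k(x, x_i) = exp(-gamma * ||x - x_i||^2)
    return np.exp(-gamma * np.sum((x - xi) ** 2))

def decision_value(x, support_vectors, alpha_y, b_star, gamma=1.0):
    # Eq. (6): f(x) = sum_i alpha*_i y_i k(x, x_i) + b*, with
    # alpha_y[i] = alpha*_i * y_i. The sum runs over support vectors only,
    # which is why the classification cost is O(n_sv).
    return sum(a * rbf_kernel(x, sv, gamma)
               for a, sv in zip(alpha_y, support_vectors)) + b_star

# Predicted label: +1 if decision_value(x, ...) >= 0, else -1.
```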
SVM Pruning
The classification time complexity of the SVM classifier scales with the number of support vectors, O(n_sv). Hence:
To reduce the run-time complexity of the SVM, the number of support vectors should be reduced.
To reduce overfitting (over-training) of the SVM, the number of support vectors should also be reduced.
Indirect methods reduce the number of training examples {(x_i, y_i) : i = 1, ..., n} [Pedrajas, IEEE TNN 2009].
Direct methods: the multi-objective evolutionary SVM proposed in this paper is the first evolutionary algorithm that reformulates SVM pruning as a combinatorial multi-objective optimization problem.
Genetic Algorithm for Support Vector Selection
[Figure: flow of the genetic algorithm. The GA operators (selection, crossover, and mutation) produce individuals encoding support vector indices; for each individual, the simplified SVM decision function is evaluated, and its fitness is measured by two values: the number of support vectors and the training error.]
Representation (Encoding)
For support vector selection a binary encoding is appropriate. Here, the t-th candidate solution in a population is an n_sv-dimensional bit vector s_t ∈ {0, 1}^{n_sv}. The j-th support vector is included in the decision function if s_tj = 1 and excluded when s_tj = 0. For instance, in a problem with 7 support vectors, the t-th individual of the population can be represented as s_t = (1, 0, 0, 1, 1, 1, 0) or s_t = (0, 1, 0, 1, 1, 0, 1).

Then, for each solution with bit vector s_t, only the selected support vectors are summed to define the reduced decision function f_reduced, which is used in Eq. (9) to evaluate the fitness of solution s_t (a sketch follows below):

$$f_{reduced}(x_i, s_t) = \sum_{j=1}^{n_{sv}} s_{tj}\,\alpha_j^*\, y_j\, K_{ij} + b^*, \qquad (7)$$

where K_{ij} = k(x_i, x_j).
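A minimal sketch of the encoding and of Eq. (7), with hypothetical names (alpha_y[j] stands for α*_j y_j, K_row for the i-th row of the precomputed kernel matrix):

```python
import numpy as np

n_sv = 7
# One candidate solution: a bit vector s_t over the support vectors, e.g.
# (1, 0, 0, 1, 1, 1, 0) keeps support vectors 0, 3, 4, 5 and drops the rest.
s_t = np.array([1, 0, 0, 1, 1, 1, 0])

def f_reduced(K_row, s_t, alpha_y, b_star):
    # Eq. (7): only the selected support vectors contribute to the sum.
    # K_row[j] = K_ij = k(x_i, x_j); alpha_y[j] = alpha*_j * y_j.
    return float(np.sum(s_t * alpha_y * K_row) + b_star)
```

Since the GA only toggles bits, the kernel matrix K_ij can be computed once before the run; every fitness evaluation then reduces to a masked dot product.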
Selection Criteria (Objectives)
The selection criteria determine the quality of each candidate solution in the population. We want to design classifiers with high generalization ability, and there is a trade-off between SVM complexity and its training error (the number of misclassified examples on the set of n training examples). Hence, the following two objective functions are used to measure the fitness of a solution s_t:

$$f_1(s_t) = \sum_{j=1}^{n_{sv}} s_{tj}, \qquad (8)$$

the number of selected support vectors, and

$$f_2(s_t) = \sum_{i=1}^{n} \mathbf{1}\bigl(y_i \ne \operatorname{sgn}(f_{reduced}(x_i, s_t))\bigr), \qquad (9)$$

where f_reduced is the reduced decision function defined in Eq. (7), sgn is the sign function with values −1 and +1, and 1(·) is the indicator function. It is easy to achieve zero training error when all training examples are support vectors, but such a solution is unlikely to generalize well (it is prone to overfitting).
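A vectorized sketch of both objectives, under the same assumptions as the encoding sketch above (K is the n × n_sv matrix with K[i, j] = k(x_i, x_j); names are illustrative):

```python
import numpy as np

def objectives(s_t, K, alpha_y, y_train, b_star):
    # f1, Eq. (8): the number of support vectors kept by s_t.
    f1 = int(np.sum(s_t))
    # f2, Eq. (9): training examples misclassified by the reduced decision
    # function; all n values of f_reduced are obtained with one
    # matrix-vector product against the precomputed kernel matrix K.
    scores = K @ (s_t * alpha_y) + b_star
    f2 = int(np.sum(np.sign(scores) != y_train))
    return f1, f2
```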
Experimental Setup
We use soft-margin L1-SVMs with the Gaussian kernel function

$$k(x, x_i) = \exp\bigl(-\gamma\,\|x - x_i\|^2\bigr) \qquad (10)$$

with γ = 1/d and the regularization term C = 1.

We use four benchmark datasets from the UCI Benchmark Repository — ionosphere, diabetes, sick, and german credit — where the number of features (d) is 34, 8, 29, and 20, respectively.

All features are normalized to have zero mean and unit variance.

Each dataset is divided randomly into two subsets: 10% is used as the test set D_test, while the remaining 90% is used as the training set D_train. Thus, the sizes of the training sets (n) are 315, 691, 3394, and 900, and the sizes of the test sets (m) are 36, 77, 378, and 100, respectively.

At the beginning of the experiment, a soft-margin L1-norm SVM is constructed on D_train using the SMO algorithm.

The training error f_2(s_t) of each individual solution s_t (support vector subset) is evaluated on D_train, where CE(train) = f_2(s_t)/n. After each run of the MOGA, we evaluate the average test set error CE(test) of each solution in the final set of Pareto-optimal solutions using D_test.
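The slides do not specify an implementation; as one possible reproduction sketch, scikit-learn's SVC (libsvm, an SMO-type solver) can supply the unpruned SVM and the quantities the GA needs. The synthetic dataset below is a stand-in for the UCI data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one of the UCI datasets; labels mapped to {-1, +1}.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y = 2 * y - 1
X = StandardScaler().fit_transform(X)  # zero mean, unit variance (as in the slides)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=0)

d = X.shape[1]
svm = SVC(kernel="rbf", gamma=1.0 / d, C=1.0)  # Eq. (10) with gamma = 1/d, C = 1
svm.fit(X_train, y_train)

# Quantities the GA needs from the unpruned SVM:
alpha_y = svm.dual_coef_[0]            # alpha*_j * y_j for each support vector
support_vectors = svm.support_vectors_
b_star = svm.intercept_[0]
print("n_sv =", len(support_vectors))
```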
Experimental Results
For the application of NSGA-II we choose a population size of 100; the other NSGA-II parameters are p_c = 0.9, p_mut = 1/n_sv, η_c = 20, and η_mut = 20. The two objectives given in Eq. (8) and Eq. (9) are optimized.

For each dataset, ten optimization runs of the MOGA are carried out, each lasting for 10000 generations. A configuration sketch follows below.
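A configuration sketch with the DEAP library. One assumption to flag: the slides quote SBX/polynomial-mutation parameters (η_c, η_mut), which apply to real-valued encodings; for the binary encoding used here, two-point crossover and bit-flip mutation are the usual stand-ins, keeping p_c = 0.9 and p_mut = 1/n_sv. The SVM quantities are random placeholders so the sketch runs on its own:

```python
import random
import numpy as np
from deap import base, creator, tools

rng = np.random.default_rng(0)
N_SV, N_TRAIN, POP, NGEN = 101, 315, 100, 200  # ionosphere-sized toy run

# Random placeholders for the quantities produced by the unpruned SVM:
# K[i, j] = k(x_i, x_j) and alpha_y[j] = alpha*_j * y_j.
K = rng.standard_normal((N_TRAIN, N_SV))
alpha_y = rng.standard_normal(N_SV)
y_train = rng.choice([-1, 1], size=N_TRAIN)
b_star = 0.0

def evaluate(ind):
    s_t = np.asarray(ind)
    f1 = int(s_t.sum())                           # Eq. (8)
    scores = K @ (s_t * alpha_y) + b_star         # f_reduced for all examples
    f2 = int(np.sum(np.sign(scores) != y_train))  # Eq. (9)
    return f1, f2

creator.create("FitnessMin", base.Fitness, weights=(-1.0, -1.0))
creator.create("Individual", list, fitness=creator.FitnessMin)
toolbox = base.Toolbox()
toolbox.register("attr_bit", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.attr_bit, N_SV)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", evaluate)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=1.0 / N_SV)  # p_mut = 1/n_sv
toolbox.register("select", tools.selNSGA2)

pop = toolbox.population(n=POP)
for ind in pop:
    ind.fitness.values = toolbox.evaluate(ind)
pop = toolbox.select(pop, POP)  # initial non-dominated sort + crowding distance

for gen in range(NGEN):
    offspring = [toolbox.clone(ind) for ind in tools.selTournamentDCD(pop, POP)]
    for c1, c2 in zip(offspring[::2], offspring[1::2]):
        if random.random() < 0.9:                 # p_c = 0.9
            toolbox.mate(c1, c2)
        toolbox.mutate(c1)
        toolbox.mutate(c2)
        del c1.fitness.values, c2.fitness.values
    for ind in offspring:
        if not ind.fitness.valid:
            ind.fitness.values = toolbox.evaluate(ind)
    pop = toolbox.select(pop + offspring, POP)    # elitist environmental selection

front = tools.sortNondominated(pop, POP, first_front_only=True)[0]
print("Pareto front size:", len(front))
```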
Pareto-optimal solutions after pruning compared to the unpruned SVM; each solution is written as a triple [n_sv, n·CE(train), m·CE(test)]:

dataset        | ionosphere    | diabetes           | sick              | german credit
before         | [101, 4, 10]  | [399, 126, 14]     | [503, 88, 12]     | [820, 20, 27]
after (range)  | [0, 202, 23]  | [0, 450, 50]       | [0, 208, 23]      | [8, 259, 26]
               | to [15, 3, 5] | to [101, 125, 18]  | to [92, 83, 13]   | to [283, 57, 22]
Pareto Fronts
[Figure: Pareto fronts for the four datasets (ionosphere, diabetes, sick, german credit). Each panel plots classification error against the number of support vectors, showing CE(train) and CE(test) after pruning together with the before-pruning CE(train) and CE(test) reference levels.]
Experimental Results
For many solutions on ionosphere and german credit, we can see the effect of overfitting: the generalization ability of the SVM classifier improved after pruning while the training error got worse.

A typical MOO heuristic is to select a solution (support vector subset) that corresponds to an interesting part of the Pareto front.
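One such heuristic (our illustration, not prescribed by the slides) is to pick the knee point: normalize both objectives to [0, 1] and take the solution closest to the ideal point:

```python
import numpy as np

def pick_knee(front):
    # front: (k, 2) array of (n_sv, training-error) pairs from the Pareto front.
    # Normalize each objective to [0, 1] and return the index of the solution
    # closest to the ideal point (0, 0).
    f = np.asarray(front, dtype=float)
    span = f.max(axis=0) - f.min(axis=0)
    span[span == 0] = 1.0  # guard against a constant objective
    norm = (f - f.min(axis=0)) / span
    return int(np.argmin(np.linalg.norm(norm, axis=1)))

# Endpoints taken from the ionosphere row of the results table; the middle
# points are invented to complete the example.
front = [(15, 3), (8, 40), (4, 90), (0, 202)]
print("knee solution:", front[pick_knee(front)])
```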
Attainment Surfaces
[Figure: empirical attainment surfaces (1st, 5th, and 10th) over the ten runs for ionosphere, diabetes, sick, and german credit, plotted against the before-pruning operating point; axes as in the Pareto-front figure.]
Experimental Results
The attainment curves have a maximum complexity of 22, 132, 171, and 300 support vectors for ionosphere, diabetes, sick, and german credit, respectively. That is, the evolutionary pruning approach achieved complexity reductions of 78.2%, 66.9%, 66%, and 63.4% for the four datasets, respectively (e.g., 1 − 22/101 ≈ 78.2% for ionosphere), without sacrificing the training error.
Conclusion
Support vector selection is a multi-objective optimization problem. We have described a genetic algorithm that reduces the computational complexity of support vector machines by reducing the number of support vectors comprised in their decision functions.

The resulting Pareto fronts visualize the trade-off between SVM complexity and its training error, guiding the support vector selection.

For some datasets, the experimental results show that the test set classification accuracy improves after pruning without sacrificing the training set accuracy. Thus, post-pruning of SVMs achieves the same effect as post-pruning of decision trees: it reduces overfitting.
Future Work
We plan to extend the proposed approach to regression tasks, which suffer from the same problem of a large number of support vectors in the decision functions of support vector regression machines.

In addition, we will conduct further experiments using other types of kernel functions, as only Gaussian kernels were used in the presented experiments. We expect the percentage of complexity reduction to be kernel-dependent.
Thanks for your attention
Questions?