Introduction to Support Vector Machine

                       Lucas Xu


                September 4, 2012




Lucas Xu    Introduction to Support Vector Machine   September 4, 2012   1 / 20
1   Classifier


2   Hyper-Plane


3   Convex Optimization


4   Kernel


5   Application




Classifier




   Attributes and Class Labels
   Training Data
   $$S = \{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}, \quad x^{(i)} \in \mathbb{R}^d, \quad y^{(i)} \in \{-1, 1\}$$

Classifier



   Umeng Gender Classification Data

      user     app_1   app_2   ...   app_d   gender
      user_1   1       0       ...   0       male
      user_2   0       1       ...   1       female
      ...      ...     ...     ...   ...     ...
      user_n   1       1       ...   1       female

   Each app belongs to exactly one category; there are about 20 categories.
   Categories are mutually exclusive.

Classifier



    Umeng Gender Classification Data
   $$S = \{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}, \quad x^{(i)} \in \mathbb{R}^d, \quad y^{(i)} \in \{-1, 1\}$$

   $x_k^{(i)} \in \{0, 1\}$: 0 means app $k$ is not installed on the device, 1 means it is installed.
   $1 \le k \le d$, with $d \approx 30{,}000$ (about 30,000 apps).
   $y^{(i)} \in \{\text{male}, \text{female}\}$, encoded as the two labels in $\{-1, 1\}$.

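The encoding on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the app names, the `encode_user` helper, and the choice of mapping male to +1 and female to -1 are hypothetical. The slides only say that the labels live in {-1, 1}.

```python
# Hypothetical app vocabulary; the real data set has about 30,000 apps.
APP_INDEX = {"app_1": 0, "app_2": 1, "app_3": 2}

# Assumed label encoding; the slides do not say which gender maps to +1.
LABEL = {"male": 1, "female": -1}


def encode_user(installed_apps, app_index):
    """Return the binary vector x with x[k] = 1 if app k is installed."""
    x = [0] * len(app_index)
    for app in installed_apps:
        k = app_index.get(app)
        if k is not None:  # ignore apps outside the vocabulary
            x[k] = 1
    return x


# Two toy users: (installed apps, gender).
raw_users = [({"app_1"}, "male"), ({"app_2", "app_3"}, "female")]
S = [(encode_user(apps, APP_INDEX), LABEL[gender]) for apps, gender in raw_users]
```

With d around 30,000 and only a handful of apps per user, a real implementation would probably store each x as a sparse set of indices rather than a dense list.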
Hyper-Plane




   Figure: Hyper-plane

The hyper-plane: $w^T x + b = 0$
Classification function: $h_{w,b}(x) = g(w^T x + b)$, where

   $$g(z) = \begin{cases} 1 & \text{if } z \ge 0 \\ -1 & \text{otherwise} \end{cases}$$

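A minimal Python sketch of the classification function on this slide. The parameter values for `w` and `b` are made up for illustration and are not learned from any data.

```python
def g(z):
    """Threshold function from the slide: +1 if z >= 0, otherwise -1."""
    return 1 if z >= 0 else -1


def h(w, b, x):
    """Classify x by which side of the hyper-plane w^T x + b = 0 it lies on."""
    z = sum(wk * xk for wk, xk in zip(w, x)) + b
    return g(z)


# Illustrative parameters (not learned from data).
w, b = [1.0, -1.0], 0.5
labels = [h(w, b, x) for x in ([2.0, 0.0], [0.0, 2.0], [0.0, 0.5])]
```

The third point lies exactly on the hyper-plane (z = 0), so the convention z >= 0 assigns it to the positive class.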
Hyper-Plane



Functional margin:

   $$\hat{\gamma}^{(i)} = y^{(i)} (w^T x^{(i)} + b)$$

Scaling: rescaling $(w, b)$ changes the functional margin but not the hyper-plane, so impose the normalization condition $\|w\| = 1$.
Geometric margin:

   $$\gamma^{(i)} = y^{(i)} \left( \left( \frac{w}{\|w\|} \right)^T x^{(i)} + \frac{b}{\|w\|} \right)$$

$\gamma^{(i)}$ should be a large positive number to increase the prediction confidence.

Hyper-Plane




Definition
The geometric margin of $(w, b)$ with respect to the training dataset $S$ is the smallest geometric margin over the training examples:

   $$\gamma = \min_{i = 1, \dots, m} \gamma^{(i)}$$

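The two margins and the dataset-level definition can be checked numerically. The sketch below uses toy data and hypothetical helper names; it also shows that multiplying (w, b) by 10 changes the functional margin but leaves the geometric margin unchanged.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def functional_margin(w, b, x, y):
    """y * (w^T x + b); changes when (w, b) is rescaled."""
    return y * (dot(w, x) + b)


def geometric_margin(w, b, x, y):
    """Functional margin divided by ||w||; invariant to rescaling (w, b)."""
    return functional_margin(w, b, x, y) / math.sqrt(dot(w, w))


def dataset_margin(w, b, S):
    """Geometric margin of (w, b) with respect to S: the minimum over examples."""
    return min(geometric_margin(w, b, x, y) for x, y in S)


# Toy data and parameters, chosen only for illustration.
S = [([3.0, 4.0], 1), ([-1.0, -2.0], -1)]
w, b = [3.0, 4.0], 0.0

gamma = dataset_margin(w, b, S)
gamma_scaled = dataset_margin([10 * wk for wk in w], 10 * b, S)
```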
Hyper-Plane
The optimal margin classifier (intuitively):
find the decision boundary that maximizes the margin.

   $$\max_{\gamma, w, b} \ \gamma$$
   $$\text{s.t.} \quad y^{(i)} (w^T x^{(i)} + b) \ge \gamma, \quad i = 1, \dots, m$$
   $$\|w\| = 1.$$

Hyper-Plane
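Normalization constraint: fix the functional margin at $\hat{\gamma} = 1$. Since $\gamma = \hat{\gamma} / \|w\|$, the geometric margin becomes $1 / \|w\|$.

                                          ⇓

   $$\max_{w, b} \ \frac{1}{\|w\|}$$
   $$\text{s.t.} \quad y^{(i)} (w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m$$

                                          ⇓

Maximizing $1 / \|w\|$ is equivalent to minimizing $\frac{1}{2}\|w\|^2$:

   $$\min_{w, b} \ \frac{1}{2}\|w\|^2$$
   $$\text{s.t.} \quad y^{(i)} (w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m$$
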
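To make the link between the norm of w and the margin concrete, the sketch below compares two feasible candidates on a two-point toy problem. The data, the candidate values, and the helper names are assumptions chosen for illustration. Nothing here solves the optimization; it only evaluates the constraints and the objective.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_feasible(w, b, S):
    """Check the constraints y_i * (w^T x_i + b) >= 1 for every example."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in S)


def objective(w):
    """The quantity (1/2) * ||w||^2."""
    return 0.5 * dot(w, w)


def margin(w):
    """Geometric margin when the functional margin is fixed at 1: 1 / ||w||."""
    return 1.0 / math.sqrt(dot(w, w))


# Toy separable data, chosen only for illustration.
S = [([1.0, 1.0], 1), ([-1.0, -1.0], -1)]

candidates = {"loose": ([1.0, 1.0], 0.0), "tight": ([0.5, 0.5], 0.0)}
report = {
    name: (is_feasible(w, b, S), objective(w), margin(w))
    for name, (w, b) in candidates.items()
}
```

Both candidates satisfy the constraints, but the one with the smaller norm has the larger margin. Shrinking w further, for example to (0.25, 0.25), violates the constraints.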
Hyper-Plane




   Convex function: the objective $\frac{1}{2}\|w\|^2$ is convex.
   Convex set: the linear constraints define a convex feasible region.
   This is a so-called quadratic programming (QP) problem. There are many software
   packages that solve it.
   Basic ideas of the Support Vector Machine: done!
   Is there a more efficient solution?

Convex Optimization




Primal problem:

   $$\min_{w, b} \ \frac{1}{2}\|w\|^2$$
   $$\text{s.t.} \quad y^{(i)} (w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m$$

Convex Optimization
Lagrangian for the original problem:

   $$\min_{w, b} \ \max_{\alpha : \alpha_i \ge 0} \ L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{m} \alpha_i \left[ y^{(i)} (w^T x^{(i)} + b) - 1 \right]$$

                                           ⇓
Under the KKT conditions, this transforms into its dual problem:

   $$\max_{\alpha} \ W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle$$

   $$\text{s.t.} \quad \alpha_i \ge 0, \quad i = 1, \dots, m$$
   $$\sum_{i=1}^{m} \alpha_i y^{(i)} = 0$$

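The dual objective W(α) can be evaluated directly from inner products. The sketch below uses a two-point toy problem and hypothetical helper names. It only evaluates W at a few feasible points and is not a QP solver.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha, X, y):
    """W(alpha) = sum_i alpha_i - 1/2 * sum_{i,j} y_i y_j alpha_i alpha_j <x_i, x_j>."""
    m = len(X)
    pairwise = sum(
        y[i] * y[j] * alpha[i] * alpha[j] * dot(X[i], X[j])
        for i in range(m)
        for j in range(m)
    )
    return sum(alpha) - 0.5 * pairwise


def is_dual_feasible(alpha, y, tol=1e-12):
    """Check alpha_i >= 0 and sum_i alpha_i y_i = 0."""
    balanced = abs(sum(a * yi for a, yi in zip(alpha, y))) <= tol
    return balanced and all(a >= 0 for a in alpha)


# Toy separable data, chosen only for illustration.
X = [[1.0, 1.0], [-1.0, -1.0]]
y = [1, -1]

# On this data the equality constraint forces alpha_1 = alpha_2 = a,
# so W reduces to 2a - 4a^2.
values = {a: dual_objective([a, a], X, y) for a in (0.0, 0.125, 0.25, 0.5)}
```

On this toy problem W peaks at a = 0.25 with value 0.25, which matches the primal objective $\frac{1}{2}\|w\|^2$ at w = (0.5, 0.5).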
Convex Optimization
Solutions:

   $$w^* = \sum_{i=1}^{m} \alpha_i y^{(i)} x^{(i)}$$

   $$b^* = -\frac{\max_{i : y^{(i)} = -1} w^{*T} x^{(i)} + \min_{i : y^{(i)} = 1} w^{*T} x^{(i)}}{2}$$

Predict:

   $$g(x) = w^T x + b = \left( \sum_{i=1}^{m} \alpha_i y^{(i)} x^{(i)} \right)^T x + b = \sum_{i=1}^{m} \alpha_i y^{(i)} \langle x^{(i)}, x \rangle + b$$

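A small Python sketch of the recovery and prediction formulas on this slide. The data is a two-point toy problem, the helper names are hypothetical, and the multipliers α = (0.25, 0.25) are the hand-computed dual optimum for that toy data, not the output of a solver.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def recover_w(alpha, X, y):
    """w* = sum_i alpha_i y_i x_i."""
    d = len(X[0])
    return [sum(alpha[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(d)]


def recover_b(w, X, y):
    """b* = -(max over negatives of w^T x + min over positives of w^T x) / 2."""
    max_neg = max(dot(w, x) for x, yi in zip(X, y) if yi == -1)
    min_pos = min(dot(w, x) for x, yi in zip(X, y) if yi == 1)
    return -(max_neg + min_pos) / 2.0


def decision_value(alpha, b, X, y, x_new):
    """g(x) = sum_i alpha_i y_i <x_i, x> + b, using inner products only."""
    return sum(a * yi * dot(xi, x_new) for a, yi, xi in zip(alpha, y, X)) + b


# Toy data with its hand-computed optimal multipliers.
X = [[1.0, 1.0], [-1.0, -1.0]]
y = [1, -1]
alpha = [0.25, 0.25]

w_star = recover_w(alpha, X, y)
b_star = recover_b(w_star, X, y)
score = decision_value(alpha, b_star, X, y, [2.0, 0.0])
```

The last function never forms w explicitly: it touches the training points only through inner products with the new point.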
Kernel

   For most $\alpha_i$, $\alpha_i = 0$.
   The examples $(x^{(i)}, y^{(i)})$ with $\alpha_i > 0$ are called support vectors.
   Prediction only needs the inner products $\langle x^{(i)}, x \rangle$.
   Suppose we map the feature space $(x_1^{(i)}, x_2^{(i)}, \dots, x_k^{(i)})$ to another, higher-dimensional
   space $(z_1^{(i)}, z_2^{(i)}, \dots, z_l^{(i)})$ with $z = \phi(x)$,
   so the inner product becomes $\langle \phi(x^{(i)}), \phi(x) \rangle$.
   We can easily compute $\langle z^{(i)}, z \rangle = K(x^{(i)}, x)$ without forming $z$ explicitly.
   In slightly different notation:

   $$K(x, y) = \langle \phi(x), \phi(y) \rangle$$

   Intuitive explanation: a kernel is a measure of similarity.

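The identity K(x, y) = ⟨φ(x), φ(y)⟩ can be checked numerically for one concrete case. The sketch below assumes 2-dimensional inputs and the degree-2 polynomial kernel; the explicit map `phi` is the standard one for that kernel and is not taken from the slides.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map for a 2-dimensional input."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]


def poly2_kernel(x, y):
    """K(x, y) = (<x, y> + 1)^2, computed without building phi(x)."""
    return (dot(x, y) + 1.0) ** 2


x, y = [1.0, 2.0], [3.0, -1.0]
explicit = dot(phi(x), phi(y))  # inner product in the 6-dimensional space
implicit = poly2_kernel(x, y)   # same value from the 2-dimensional inputs
```

The kernel needs one inner product in the original space, while the explicit route needs a feature vector whose length grows quickly with the input dimension and the degree.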
Kernel




Definition
Mercer kernel: $K$ is a valid kernel if it is symmetric and positive semi-definite, that is, for any finite set of points $x_1, \dots, x_n$ the Gram matrix $K_{ij} = K(x_i, x_j)$ is positive semi-definite.

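Positive semi-definiteness means that z^T K z >= 0 for every vector z. The sketch below spot-checks that condition on random vectors. It is a necessary-condition check under stated assumptions (toy points, a fixed random seed, hypothetical helper names), not a proof that a kernel is valid.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def negated_linear(x, y):
    """Not a valid kernel: the negative of the plain inner product."""
    return -sum(a * b for a, b in zip(x, y))


def gram_matrix(kernel, points):
    return [[kernel(p, q) for q in points] for p in points]


def looks_psd(K, trials=200, tol=1e-9, seed=0):
    """Return False as soon as some random z gives z^T K z < -tol."""
    rng = random.Random(seed)
    n = len(K)
    for _ in range(trials):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        quad = sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))
        if quad < -tol:
            return False
    return True


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
rbf_ok = looks_psd(gram_matrix(rbf, points))
negated_ok = looks_psd(gram_matrix(negated_linear, points))
```

A full check would examine the eigenvalues of the Gram matrix; the random test can only reject a candidate, never certify it.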
Kernel

   Primitive (linear): $\langle x, y \rangle$
   Polynomial: $(\langle x, y \rangle + 1)^d$
   RBF: $\exp(-\gamma \|x - y\|^2)$
   Sigmoid: $\tanh(\kappa \langle x, y \rangle + c)$
   String
   Tree

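The four closed-form kernels in the list can be written as short Python functions. The default parameter values (`degree`, `gamma`, `kappa`, `c`) are assumptions picked for the example and are not recommended settings. String and tree kernels operate on structured inputs and are not sketched here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, y):
    """Primitive kernel: the plain inner product."""
    return dot(x, y)


def polynomial_kernel(x, y, degree=2):
    """(<x, y> + 1)^degree."""
    return (dot(x, y) + 1.0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, kappa=0.1, c=0.0):
    """tanh(kappa * <x, y> + c)."""
    return math.tanh(kappa * dot(x, y) + c)


x, y = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(x, y),
    "polynomial": polynomial_kernel(x, y),
    "rbf": rbf_kernel(x, y),
    "sigmoid": sigmoid_kernel(x, y),
}
```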
Apply to Umeng Gender Classification
   Problem Description
   Classify the gender of a user based on the apps (s)he has installed and
   the categories of those apps.
   Kernel Design

   $$K(x, y) = \sum_{i,j=0}^{m} \phi(x_i, y_j)$$

   $$\phi(x_i, y_j) = \begin{cases} (1 + w)\, x_i y_j & \text{if } i = j \\ x_i y_j & \text{if } i \ne j \text{ but apps } i \text{ and } j \text{ are in the same category} \\ 0 & \text{if they are not in the same category} \end{cases}$$

   $w \ge 0$ is the extra weight given when two users have installed the same app; it
   defaults to 1.0. (Here $w$ is a scalar weight, not the hyper-plane normal vector.)
   Experiment Result
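The kernel on this slide can be sketched in two equivalent ways: a direct double sum that follows the case definition, and a faster form that uses per-category install counts. The function names, the toy category assignment, and the choice to iterate over all app indices are assumptions made for illustration.

```python
def umeng_kernel_naive(x, y, category, w=1.0):
    """Direct double sum over app pairs, following the case definition."""
    total = 0.0
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            if i == j:
                total += (1.0 + w) * xi * yj
            elif category[i] == category[j]:
                total += xi * yj
            # pairs from different categories contribute 0
    return total


def umeng_kernel_fast(x, y, category, w=1.0):
    """Equivalent form: w * <x, y> plus the product of per-category counts."""
    shared = sum(xi * yi for xi, yi in zip(x, y))
    counts_x, counts_y = {}, {}
    for k, cat in enumerate(category):
        counts_x[cat] = counts_x.get(cat, 0) + x[k]
        counts_y[cat] = counts_y.get(cat, 0) + y[k]
    by_category = sum(counts_x[c] * counts_y[c] for c in counts_x)
    return w * shared + by_category


# Toy setup: apps 0 and 1 share a category, app 2 is in another one.
category = [0, 0, 1]
x = [1, 0, 1]
y = [1, 1, 0]
k_naive = umeng_kernel_naive(x, y, category, w=2.0)
k_fast = umeng_kernel_fast(x, y, category, w=2.0)
```

The fast form works because every same-category pair, including i = j, contributes x_i y_j once, and the i = j pairs contribute an extra w x_i y_i. With about 30,000 apps the double sum is quadratic in the number of apps, while the count-based form is linear.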
Apply to Umeng Gender Classification

   $$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} \quad \Downarrow \quad \begin{pmatrix} w \cdot x_1 \\ w \cdot x_2 \\ \vdots \\ w \cdot x_m \\ c_1 \\ c_2 \\ \vdots \\ c_{20} \end{pmatrix}$$

$c_i$ counts the number of apps belonging to category $i$.
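The augmented vector on this slide can be built as below. The slide writes the scaling factor as w; this sketch takes it as a parameter named `scale`, and the example passes the square root of w, because that choice makes the plain inner product of two augmented vectors equal w⟨x, y⟩ plus the sum over categories of the products of counts, which is what the case definition of the kernel sums to. The helper name and toy data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def augment(x, category, n_categories, scale):
    """Return [scale * x_1, ..., scale * x_m, c_1, ..., c_n]."""
    counts = [0] * n_categories
    for k, cat in enumerate(category):
        counts[cat] += x[k]  # c_i: installed apps of this user in category i
    return [scale * xk for xk in x] + counts


# Toy setup: apps 0 and 1 share a category, app 2 is in another one.
category = [0, 0, 1]
x = [1, 0, 1]
y = [1, 1, 0]
w = 2.0

x_aug = augment(x, category, n_categories=2, scale=math.sqrt(w))
y_aug = augment(y, category, n_categories=2, scale=math.sqrt(w))
k_explicit = dot(x_aug, y_aug)
```

With an explicit map like this, a linear SVM on the augmented vectors behaves like a kernel SVM on the original ones, which can matter when the number of users is large.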
References


    Book: Christopher Bishop – PRML, Chapter 7, Section 7.1
    Slides: Andrew Moore – Support Vector Machines
    Video: Bernhard Schölkopf – Kernel Methods
    Video: Liva Ralaivola – Introduction to Kernel Methods
    Video: Colin Campbell – Introduction to Support Vector Machines
    Video: Alex Smola – Kernel Methods and Support Vector Machines
    Video: Partha Niyogi – Introduction to Kernel Methods
    Many more videos on kernel-related topics: http://www.seas.harvard.edu/courses/cs281/


