Identification of Relevant Sections in Web Pages Using a
               Machine Learning Approach




                                  Jerrin Shaji George

                                      NIT Calicut


                                  November 8, 2012
Introduction

  There is a massive amount of data available on the internet.
  Extracting only the relevant content has become very important.
  A Machine Learning approach is suitable as it can adapt to the
  rapidly changing dynamics of the internet.




2 of 28
Machine Learning

  The science of getting computers to act without being explicitly
  programmed.
  A method of teaching computers to make and improve predictions
  or behaviors based on some data.
  Machine Learning Algorithms :
          Supervised Machine Learning
          Unsupervised Machine Learning




3 of 28
Supervised Learning

  Machine learning task of inferring a function from labeled training
  data.




           Figure: Supervised Learning Model (courtesy scikit-learn)
4 of 28
Supervised Learning

  Example of a classification problem - discrete valued output.




                   Figure: Copyright © Victor Lavrenko

5 of 28
Supervised Learning

  Example of a regression problem - continuous valued output.




                   Figure: Copyright © Victor Lavrenko

6 of 28
Unsupervised Learning

  The data has no labels. The algorithm tries to find similarities
  between the objects in question.




          Figure: Unsupervised Learning Model (courtesy scikit-learn)
7 of 28
Unsupervised Learning

  Example of a clustering problem




                   Figure: Copyright © Victor Lavrenko
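A clustering problem like the one pictured can be sketched with k-means. The slides do not name a specific clustering algorithm, so k-means is an assumption here, and the points and starting centers are made up for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Tiny 1-D k-means: assign each point to its nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled data with two visible groups (illustrative values).
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]
centers, clusters = kmeans_1d(points, centers=[0.0, 10.0])
```

No labels are given to the algorithm; the two groups emerge only from the distances between points.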
8 of 28
Support Vector Machines (SVM)

  A supervised learning model.
  Used for classification and regression analysis.
  The basic SVM:
          A non-probabilistic binary linear classifier.
          Classifies each input into one of two possible classes; the
          predicted class is the output.




9 of 28
The SVM Algorithm

   Inputs are formulated as feature vectors.
   The feature vectors are mapped into a feature space by using a
   kernel function.
   A division is computed in the feature space to optimally separate
   the classes of training vectors.
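The steps above can be sketched for the linear case, where the feature space is the input space itself. This is a minimal illustration that trains by sub-gradient descent on the regularized hinge loss, which is a swapped-in training method and not necessarily what any particular SVM library does. The data and the hyperparameters `lam`, `lr` and `epochs` are assumptions.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """X: list of feature vectors, y: labels in {-1, +1}. Returns (w, b)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Inside the margin or misclassified: the hinge loss is active.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regularizer contributes to the gradient.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Binary decision: which side of the separating hyperplane x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Two well-separated classes (illustrative feature vectors).
X = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [-1.0, -1.0], [-2.0, -1.5], [-1.5, -2.0]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```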




10 of 28
The SVM Algorithm

               φ: the feature map induced by the kernel function,
                  k(x, z) = ⟨φ(x), φ(z)⟩




11 of 28
Formal Definition of SVM

   An SVM constructs a hyperplane or set of hyperplanes in a high-
   or infinite-dimensional space.
   It can be used for classification and regression.
   A good separation is achieved by the hyperplane that has the
   largest distance to the nearest training data point of any class
   (called the functional margin).
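As a small worked example, the distance from a hyperplane w·x + b = 0 to the nearest training point can be computed directly. This distance is often called the geometric margin. The two candidate hyperplanes and the points below are made up for illustration.

```python
import math

def geometric_margin(w, b, X, y):
    """Smallest signed distance y_i * (w . x_i + b) / ||w|| over the training points."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return min(
        yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) / norm
        for xi, yi in zip(X, y)
    )

X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
y = [1, 1, -1, -1]

margin_a = geometric_margin([1.0, 1.0], -2.0, X, y)  # hyperplane x1 + x2 = 2
margin_b = geometric_margin([1.0, 0.0], -1.5, X, y)  # hyperplane x1 = 1.5
```

Both hyperplanes separate the classes, but the first leaves a wider gap (about 1.41 versus 0.5), so it is the better separator under the largest-margin criterion.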




12 of 28
Optimal Separating Hyperplane




                 Figure: Courtesy Steve Gunn

13 of 28
Functional Margin

   The vectors (points) that constrain the width of the margin are the
   support vectors.




                       Figure: Image from scikit-learn
14 of 28
Mapping to Higher Dimensions

   Sometimes data is not linearly separable.
   If the original finite-dimensional space is mapped into a much
   higher-dimensional space, the separation is made easier in that
   space.
   This is achieved by the SVM using the Kernel Trick.
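A minimal sketch of the kernel trick, using the degree-2 polynomial kernel as an assumed example: the kernel value computed in the original 2-D space equals the inner product in the 3-D feature space, so the mapping never has to be built explicitly.

```python
import math

def phi(x):
    """Explicit feature map for the kernel k(x, z) = (x . z)^2 in two dimensions."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Same quantity as phi(x) . phi(z), computed without constructing phi."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))  # inner product in 3-D
implicit = poly_kernel(x, z)                           # computed in 2-D
```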




15 of 28
Mapping to Higher Dimensions

   Mapping from 1D to 2D




   Mapping from 2D to 3D




                     Figure: Courtesy Steve Gunn
16 of 28
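The 1D to 2D case can be illustrated with a few made-up points. The map x → (x, x²) and the threshold 2.5 are assumptions chosen for this example.

```python
# 1-D points: the positive class sits between the negatives, so no single
# threshold on x separates the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [-1, -1, 1, 1, 1, -1, -1]

def lift(x):
    """Map a 1-D point into 2-D by adding its square as a second coordinate."""
    return (x, x * x)

# In the lifted space the horizontal line x2 = 2.5 separates the classes.
lifted = [lift(x) for x in xs]
predicted = [1 if x2 < 2.5 else -1 for (_, x2) in lifted]
```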
Identification of Relevant Sections in a Web Page for
Web Search

   Shallow techniques like keyword matching give unsatisfactory
   results.
   Search methodologies must focus more on contextual information
   than just keyword occurrences.
           The search term might not be a very differentiating term.
           It might not appear in a relevant section at all.

   SQUINT : an SVM based approach to identify sections of a Web
   page relevant to a Web Search.



17 of 28
Overall Architecture




18 of 28
Feature Generation

   Word Rank Based Features
   Bigram Rank Based Features
   Coverage of Top Ranked Tokens
   Query Word Frequency
   Distance from the Query




19 of 28
Word Rank Based Features

   The rank of a word is its position in the list of all words
   ordered by frequency of occurrence across all search results.
   The value of this feature is the frequency of the particular word in
   the given section.
   Bucketing can be used to reduce dimensionality.
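One possible reading of this feature, as a sketch: words are ranked by total frequency, ranks are grouped into fixed-size buckets, and each feature counts the section's tokens that fall in a bucket. The bucket size, the whitespace tokenization and the function names are assumptions, not details from the slides.

```python
from collections import Counter

def word_ranks(all_sections):
    """Rank words by total frequency across all sections (rank 0 = most frequent)."""
    counts = Counter(w for s in all_sections for w in s.lower().split())
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered)}

def word_rank_features(section, ranks, num_buckets, bucket_size):
    """Feature i = number of tokens in the section whose rank falls in bucket i."""
    features = [0] * num_buckets
    for word in section.lower().split():
        rank = ranks.get(word)
        if rank is not None and rank // bucket_size < num_buckets:
            features[rank // bucket_size] += 1
    return features

sections = ["svm margin svm kernel", "kernel trick svm", "cooking recipes pasta"]
ranks = word_ranks(sections)
features = [word_rank_features(s, ranks, num_buckets=2, bucket_size=2) for s in sections]
```

Without bucketing there would be one feature per distinct word; here every section is described by just two numbers.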




20 of 28
Bigram Rank Based Features

   A bigram is defined to be two consecutive words occurring in a
   section.
   E.g., the bigram "machine learning" may be more important than
   "machine" and "learning" separately.
   The value of the feature is calculated in the same way as for Word
   Rank Based Features.
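Bigram extraction can be sketched as follows, assuming simple whitespace tokenization. The example sections are invented.

```python
from collections import Counter

def bigrams(text):
    """Pairs of consecutive tokens in the text."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))

sections = ["machine learning is fun", "machine learning with svm", "learning a machine"]
bigram_counts = Counter(bg for s in sections for bg in bigrams(s))
top_bigram, top_count = bigram_counts.most_common(1)[0]
```

The third section contains both words but not the bigram, which is the distinction this feature is meant to capture.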




21 of 28
Coverage of Top Ranked Tokens

   Relevance may also be determined by the number of top ranked
   words which occur in the section.
   The value of this feature is the coverage of top ranked words per
   bucket.
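A sketch of one way to compute coverage per bucket, assuming that coverage means the fraction of a bucket's words that appear in the section at least once. The ranked word list and the bucket size are illustrative.

```python
def coverage_per_bucket(section, ranked_words, bucket_size):
    """Fraction of each bucket's words that occur in the section."""
    present = set(section.lower().split())
    coverage = []
    for start in range(0, len(ranked_words), bucket_size):
        bucket = ranked_words[start:start + bucket_size]
        coverage.append(sum(word in present for word in bucket) / len(bucket))
    return coverage

# Words already ordered by rank (most frequent first).
ranked_words = ["svm", "kernel", "margin", "hyperplane"]
coverage = coverage_per_bucket(
    "the svm finds a hyperplane with a wide margin", ranked_words, bucket_size=2
)
```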




22 of 28
Distance from the Query

   The intuition here is that the closer a section is to the query in the
   Web page, the more likely it is to be relevant.
   The value of this feature is the section-wise distance between the
   section in question and the nearest section which contains the
   query.
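The section-wise distance can be sketched as the difference in section indices. Substring matching on the query and returning `None` when no section contains it are assumptions made for this example.

```python
def distance_from_query(sections, query):
    """For each section, the index distance to the nearest section containing the query."""
    query = query.lower()
    hits = [i for i, s in enumerate(sections) if query in s.lower()]
    if not hits:
        return [None] * len(sections)
    return [min(abs(i - h) for h in hits) for i in range(len(sections))]

sections = [
    "navigation links",
    "what is an svm",
    "kernel functions",
    "history",
    "training an svm",
    "contact",
]
distances = distance_from_query(sections, "svm")
```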




23 of 28
Query Word Frequency

   The value of this feature is the frequency of the query word in the
   section.
   The value is normalized by the number of words in the section.
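A minimal sketch of the normalized frequency, assuming whitespace tokenization and case-insensitive matching:

```python
def query_word_frequency(section, query_word):
    """Occurrences of the query word divided by the number of words in the section."""
    tokens = section.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(query_word.lower()) / len(tokens)

freq = query_word_frequency("SVM training makes the SVM margin wide", "svm")
```

Normalizing keeps long sections from scoring higher simply because they contain more words.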




24 of 28
Training Set Generation

   Query Google to get a set of pages.
   Clean each page: remove scripts, pictures, links, etc.
   Break each page into sections.
   Label each section of every page.
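The cleaning and sectioning steps might look like the sketch below, which uses Python's standard `html.parser`. The choice of tags that start a new section is an assumption, link text is kept while the link target is dropped, and fetching pages and labeling sections are left out.

```python
from html.parser import HTMLParser

class SectionExtractor(HTMLParser):
    """Drops script/style content and starts a new section at each block-level tag."""
    SKIP = {"script", "style"}
    BREAK = {"h1", "h2", "h3", "p", "div"}  # assumed section boundaries

    def __init__(self):
        super().__init__()
        self.sections = [[]]
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BREAK and self.sections[-1]:
            self.sections.append([])

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.sections[-1].append(text)

def split_into_sections(html):
    """Return the visible text of the page as a list of section strings."""
    parser = SectionExtractor()
    parser.feed(html)
    parser.close()
    return [" ".join(parts) for parts in parser.sections if parts]

page = (
    "<h1>SVM</h1><script>track();</script>"
    "<p>Support vector machines are <a href='x'>classifiers</a>.</p>"
    "<img src='a.png'><p>Unrelated footer.</p>"
)
sections = split_into_sections(page)
```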




25 of 28
Learning Algorithm

   A Support Vector Machine with a linear kernel is used.
   Given the relatively high dimensionality of the feature vector, it is a
   reasonable choice to use an SVM.
   The predicted margin of each sample is used as a non-binary
   metric of how relevant each section is.
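Using the margin as a relevance score can be sketched as follows. For a linear SVM the score is w·x + b: its sign gives the class and its size gives a graded measure of relevance. The weights and feature vectors are hypothetical values, not taken from the SQUINT evaluation.

```python
def decision_value(w, b, x):
    """Signed score w . x + b of a feature vector under a linear model."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical trained weights and per-section feature vectors.
w, b = [0.8, 0.5, -0.2], -0.5
section_features = {
    "section_a": [2.0, 1.0, 0.0],
    "section_b": [0.0, 0.0, 3.0],
    "section_c": [1.0, 0.0, 1.0],
}

scores = {name: decision_value(w, b, x) for name, x in section_features.items()}
# Most relevant section first.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A binary prediction would only say that sections a and c are relevant; the scores also say that a is far more clearly so than c.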




26 of 28
Conclusion

   Support Vector Machines are an attractive approach to data
   modelling.
   Evaluations suggest that using information-retrieval-inspired
   features and some basic hints from summarization gives respectable
   accuracy in detecting the most relevant section in a page.
   Thus SQUINT can have a large impact on the user’s overall search
   experience.




27 of 28
References

   Cristianini, Nello and Shawe-Taylor, John. An Introduction to
   Support Vector Machines and Other Kernel-based Learning Methods.
   Cambridge University Press, 2000.
   Siddharth Jonathan J.B., Riku Inoue and Jyotika Prasad. SQUINT:
   SVM for Identification of Relevant Sections in Web Pages for Web
   Search.
   Wikipedia article on Support Vector Machines,
   http://en.wikipedia.org/wiki/Support_vector_machine
   Machine Learning Course on Coursera,
   https://class.coursera.org/ml-2012-002/class/index



28 of 28
