Kernel Methods: the Emergence of a Well-founded Machine Learning John Shawe-Taylor Centre for Computational Statistics and Machine Learning University College London
Overview
Surprising fact: there are now 13,444 citations of Support Vectors reported by Google Scholar, the vast majority from papers not about machine learning! Ratio NNs/SVs in Google Scholar: 1987-1996: 220; 1997-2006: 11
Caveats
Motivation behind kernel methods
Historical perspective
Kernel methods approach
Example
Capacity of feature spaces
Form of the functions
Problems of high dimensions
Overview
Bayesian approach
Bayesian approach
Surprising fact: the covariance (kernel) function that arises from infinitely many sigmoidal hidden units is not a sigmoidal kernel. Indeed, the sigmoidal kernel is not positive semi-definite and so cannot arise as a covariance function!
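The claim that the sigmoidal kernel is not positive semi-definite can be checked on a tiny example. The sketch below assumes the common form tanh(a·xy + b) on scalar inputs; the parameters `a`, `b` and the two points are illustrative choices, not values from the slides. A single Gram matrix with a negative eigenvalue is enough to rule out positive semi-definiteness.

```python
import math

def sigmoid_kernel(x, y, a=1.0, b=0.0):
    """Sigmoidal (tanh) kernel on scalars: tanh(a * x * y + b). Assumed form."""
    return math.tanh(a * x * y + b)

def smallest_eigenvalue_2x2(k11, k12, k22):
    """Smallest eigenvalue of the symmetric matrix [[k11, k12], [k12, k22]]."""
    mean = 0.5 * (k11 + k22)
    radius = math.hypot(0.5 * (k11 - k22), k12)
    return mean - radius

# Two illustrative points (hypothetical, chosen to expose the problem).
x, y = 0.1, 10.0
k11 = sigmoid_kernel(x, x)
k12 = sigmoid_kernel(x, y)
k22 = sigmoid_kernel(y, y)

# A valid covariance function would give a non-negative value here.
lam_min = smallest_eigenvalue_2x2(k11, k12, k22)
print("smallest eigenvalue of the Gram matrix:", lam_min)
```

A negative `lam_min` means this 2x2 Gram matrix could not be the covariance matrix of any pair of random variables.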
Bayesian approach
Frequentist approach
Capacity problem
Generalisation of a learner
Example of Generalisation
Example of Generalisation
Example of Generalisation
Error distribution: full dataset
Error distribution: dataset size: 342
Error distribution: dataset size: 273
Error distribution: dataset size: 205
Error distribution: dataset size: 137
Error distribution: dataset size: 68
Error distribution: dataset size: 34
Error distribution: dataset size: 27
Error distribution: dataset size: 20
Error distribution: dataset size: 14
Error distribution: dataset size: 7
Observations
Controlling generalisation
Intuitive and rigorous explanations
Surprising fact: structural risk minimisation over VC classes does not provide a bound on the generalisation of SVMs, except in the transductive setting, and then only if the margin is measured on training and test data! Indeed, SVMs were a wake-up call that classical PAC analysis was not capturing critical factors in real-world applications!
Learning framework
Error distribution: dataset size: 205
Error distribution: dataset size: 137
Error distribution: dataset size: 68
Error distribution: dataset size: 34
Error distribution: dataset size: 27
Error distribution: dataset size: 20
Error distribution: dataset size: 14
Error distribution: dataset size: 7
Handling training errors
Support Vector Machines
Complexity problem
Dual representation
Learning the dual variables
Dual form of SVM
Using kernels
Kernel example
Efficiency
Using Gaussian kernel for Breast: 273
Data size 342
Surprising fact: kernel methods are invariant to rotations of the coordinate system, so any special information encoded in the choice of coordinates will be lost! Surprisingly, one of the most successful applications of SVMs has been for text classification, in which one would expect the encoding to be very informative!
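The rotation invariance can be illustrated for a kernel that depends on the inputs only through distances, such as the Gaussian kernel named on the previous slide. In this minimal sketch the points, the rotation angle and the width `sigma` are hypothetical; it rotates the data and compares the two Gram matrices, which are all a kernel method ever sees.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def rotate(p, theta):
    """Rotate a 2-D point about the origin by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def gram(points, kernel):
    """Gram (kernel) matrix of a list of points."""
    return [[kernel(p, q) for q in points] for p in points]

# Illustrative data and rotation angle (assumptions, not from the slides).
points = [(1.0, 0.0), (0.5, 2.0), (-1.5, 0.3)]
rotated = [rotate(p, 0.7) for p in points]

K = gram(points, gaussian_kernel)
K_rot = gram(rotated, gaussian_kernel)

# Largest entrywise difference between the two Gram matrices.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(K, K_rot)
               for a, b in zip(row_a, row_b))
print("max difference:", max_diff)
```

Because the Gram matrix is unchanged up to rounding, any learner that accesses the data only through it returns the same predictions in either coordinate system.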
Constraints on the kernel
Surprising fact: the property that guarantees the existence of a feature space is precisely the property required to ensure convexity of the resulting optimisation problem for SVMs, etc!
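The link between positive semi-definiteness and convexity can be made explicit with the standard soft-margin SVM dual. The notation below (dual variables \alpha, labels y_i, kernel \kappa, box constraint C) is assumed for illustration and is not taken from the slides.

```latex
\max_{\alpha}\; W(\alpha) = \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i,j=1}^{m} \alpha_i \alpha_j y_i y_j \,\kappa(x_i, x_j)
\quad \text{s.t.} \quad \sum_{i=1}^{m} \alpha_i y_i = 0,\quad 0 \le \alpha_i \le C .

\nabla^2 W(\alpha) = -\operatorname{diag}(y)\, K \operatorname{diag}(y),
\qquad K_{ij} = \kappa(x_i, x_j).

v^{\top} \operatorname{diag}(y)\, K \operatorname{diag}(y)\, v
  = \bigl(\operatorname{diag}(y) v\bigr)^{\top} K \bigl(\operatorname{diag}(y) v\bigr) \ge 0
\quad \text{for all } v \text{ whenever } K \succeq 0 .
```

So if every Gram matrix K is positive semi-definite, the Hessian of W is negative semi-definite, W is concave, and maximising it over the convex feasible set is a convex problem with no spurious local optima.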
What have we achieved?
Historical note
Surprising fact: good covering number bounds on the generalisation of SVMs use the bound on updates of the perceptron algorithm to bound the covering numbers and hence generalisation. But the same bound can be applied directly to the perceptron without margin using a sparsity argument, so the bound is tighter for the classical perceptron!
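The bound on perceptron updates can be checked numerically. The sketch below assumes the slide refers to Novikoff's classical result: for data of radius R that a unit vector separates through the origin with margin gamma, the perceptron makes at most (R / gamma)^2 updates. The toy data and the reference vector `u` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def perceptron(data, max_epochs=100):
    """Classical perceptron without bias; returns weights and update count."""
    w = [0.0] * len(data[0][0])
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            if y * dot(w, x) <= 0:           # misclassified (or on the boundary)
                w = [wi + y * xi for wi, xi in zip(w, x)]
                updates += 1
                mistakes += 1
        if mistakes == 0:                    # a full pass with no update: converged
            break
    return w, updates

# Hypothetical linearly separable data: (input, label) pairs.
data = [((2.0, 1.0), 1), ((1.0, -1.5), 1), ((-1.0, 0.5), -1), ((-2.5, -1.0), -1)]

u = (1.0, 0.0)                                         # unit vector that separates the data
gamma = min(y * dot(u, x) for x, y in data)            # margin achieved by u
radius = max(math.sqrt(dot(x, x)) for x, _ in data)    # radius of the data

w, updates = perceptron(data)
bound = (radius / gamma) ** 2
print("updates:", updates, "bound:", bound)
```

The number of updates also bounds how many training points the final weight vector depends on, which is the kind of sparsity the slide's argument appeals to.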
PAC-Bayes: General perspectives
Evidence and generalisation
PAC-Bayes Theorem
Examples of bound evaluation
Problem | Test Error | PAC-Bayes prior | PAC-Bayes
Wdbc | 0.073 ±.021 | 0.306 ±.018 | 0.334 ±.005
Image | 0.074 ±.014 (0.056 ±.01) | 0.184 ±.010 | 0.254 ±.003
Waveform | 0.089 ±.005 | 0.151 ±.005 | 0.198 ±.002
Ringnorm | 0.026 ±.005 | 0.109 ±.024 | 0.212 ±.002
Surprising fact: optimising the bound does not typically improve the test error, despite significant improvements in the bound itself! Training an SVM on half the data to learn a prior and then using the rest to learn relative to this prior further improves the bound with almost no effect on the test error!
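How such a bound is evaluated can be sketched under an explicit assumption: that the theorem has the familiar Langford-Seeger form, KL(Q_hat || Q_true) <= (KL(Q || P) + ln((m + 1) / delta)) / m, where the left-hand side is the binary KL divergence between empirical and true error. The function names and the example numbers below are hypothetical and do not reproduce the table.

```python
import math

def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total

def kl_inverse_upper(q_hat, budget, tol=1e-10):
    """Largest p >= q_hat with binary_kl(q_hat, p) <= budget, found by bisection."""
    lo, hi = q_hat, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_kl(q_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

def pac_bayes_bound(emp_error, kl_posterior_prior, m, delta=0.05):
    """Upper bound on the true error under the assumed form of the theorem."""
    budget = (kl_posterior_prior + math.log((m + 1) / delta)) / m
    return kl_inverse_upper(emp_error, budget)

# Hypothetical inputs: 5% empirical error on m = 500 examples.
loose = pac_bayes_bound(0.05, kl_posterior_prior=5.0, m=500)
tight = pac_bayes_bound(0.05, kl_posterior_prior=1.0, m=500)
print("bound with KL = 5:", loose, "bound with KL = 1:", tight)
```

A prior closer to the posterior gives a smaller KL term and hence a tighter bound while the empirical error stays the same, which is consistent with the observation that a data-dependent prior improves the bound without changing the test error.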
Principal Components Analysis (PCA)
Dual representation of PCA
Kernel PCA
Generalisation of k-PCA
Surprising fact: no frequentist bounds apply for learning in the representation given by k-PCA: the data is used twice, once to find the subspace and then to learn the classifier/regressor! Intuitively, this shouldn’t be a problem since the labels are not used by k-PCA!
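Kernel PCA in its dual form can be sketched in a few lines. This is a minimal illustration rather than the construction from the slides: it centres the Gram matrix in feature space, finds the leading eigenvector by power iteration, and returns the projections of the training points onto the first component. The helper names and the toy data are hypothetical, and a linear kernel is used only so the result is easy to check by hand.

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def centred_gram(points, kernel):
    """Gram matrix of the data after centring in feature space."""
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def top_eigenpair(K, iterations=200):
    """Leading eigenpair of a symmetric PSD matrix by power iteration."""
    n = len(K)
    v = [1.0 / (i + 1.0) for i in range(n)]   # fixed, non-constant start vector
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(K[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v

def first_component_projections(points, kernel):
    """Projections of the training points onto the first kernel principal component."""
    Kc = centred_gram(points, kernel)
    eigenvalue, v = top_eigenpair(Kc)
    scale = math.sqrt(max(eigenvalue, 0.0))   # projection_i = sqrt(lambda) * v_i
    return eigenvalue, [scale * vi for vi in v]

# Toy data lying on a line, so one component captures all the variance.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
eigenvalue, projections = first_component_projections(points, linear_kernel)
print(eigenvalue, projections)
```

Only the inputs are used to build the representation; a classifier trained afterwards on `projections` would see the same data a second time, which is the double use the slide refers to.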
Latent Semantic Indexing
Lower dimensional representation
Latent Semantic Kernels
Related techniques
Different criteria
Other subspace methods
Paired corpora
Aligned text (figure: English documents E_1, E_2, ..., E_i, ..., E_N paired with French documents F_1, F_2, ..., F_i, ..., F_N)
Canadian parliament corpus
E_12: LAND MINES. Ms. Beth Phinney (Hamilton Mountain, Lib.): Mr. Speaker, we are pleased that the Nobel peace prize has been given to those working to ban land mines worldwide. We hope this award will encourage the United States to join the over 100 countries planning to come to …
F_12: LES MINES ANTIPERSONNEL. Mme Beth Phinney (Hamilton Mountain, Lib.): Monsieur le Président, nous nous réjouissons du fait que le prix Nobel ait été attribué à ceux qui oeuvrent en faveur de l'interdiction des mines antipersonnel dans le monde entier. Nous espérons que cela incitera les Américains à se joindre aux représentants de plus de 100 pays qui ont l'intention de venir à …
Cross-lingual LSI via SVD: M. L. Littman, S. T. Dumais, and T. K. Landauer. Automatic cross-language information retrieval using latent semantic indexing. In G. Grefenstette, editor, Cross-language information retrieval. Kluwer, 1998.
Cross-lingual kernel canonical correlation analysis (figure: the input “English” space and input “French” space are each mapped by Φ(x) into a feature “English” space and a feature “French” space, with projection directions f_E1, f_E2 and f_F1, f_F2)
kernel canonical correlation analysis
Regularization, with a regularization parameter
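One common way to write regularized kernel CCA, given here as an assumption about what this slide covers rather than a reproduction of it, is as the maximisation of a penalised correlation between the two views. The symbols K_E and K_F (Gram matrices of the English and French documents), \alpha and \beta (dual coefficient vectors) and \kappa (regularization parameter) are notation chosen for this sketch.

```latex
\rho = \max_{\alpha, \beta}\;
\frac{\alpha^{\top} K_E K_F \beta}
{\sqrt{\bigl(\alpha^{\top} K_E^{2} \alpha + \kappa\, \alpha^{\top} K_E \alpha\bigr)
       \bigl(\beta^{\top} K_F^{2} \beta + \kappa\, \beta^{\top} K_F \beta\bigr)}}
```

With \kappa = 0 and invertible Gram matrices, any \alpha can be matched by \beta = K_F^{-1} K_E \alpha to give \rho = 1 on the training data, so the unregularized problem is degenerate; the penalty terms \kappa\,\alpha^{\top} K_E \alpha and \kappa\,\beta^{\top} K_F \beta are the squared norms of the corresponding weight vectors and rule out such trivial solutions.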
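Pseudo query test (figure: a query q_i generated from the English test document E_i is matched against the French documents F_1, F_2, ..., F_i, ..., F_N)
Queries were generated from each test document by extracting 5 words with the highest TFIDF weights and using them as a query.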
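The query generation step can be sketched as follows. The exact TFIDF weighting used in the experiments is not stated, so this example assumes raw term frequency times log(N / document frequency), and it breaks ties alphabetically; the function name `tfidf_query` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tfidf_query(doc_tokens, corpus_tokens, k=5):
    """Return the k terms of doc_tokens with the highest TF-IDF weight."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))          # count each term once per document
    term_freq = Counter(doc_tokens)
    weights = {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in term_freq.items()
        if doc_freq[term] > 0                 # skip terms unseen in the corpus
    }
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(weights, key=lambda term: (-weights[term], term))
    return ranked[:k]

# Toy corpus (illustrative only; the first document is the test document).
corpus = [
    "the nobel peace prize for work to ban land mines".split(),
    "the house debated the budget for the year".split(),
    "the minister spoke to the house about the budget".split(),
]
query = tfidf_query(corpus[0], corpus)
print(query)
```

Terms that occur in every document, such as "the", receive weight zero and never enter the query, so the query keeps the words that distinguish the test document from the rest of the corpus.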
Experimental Results
English-French retrieval accuracy, %
Applying to different data types
Classification error of baseline method
Colour: 22.9% | Texture: 18.3% | Colour + Texture: 13.6% | Text: 9.0% | All: 3.0%
Classification rates using KCCA
K | Error rate
Plain SVM | 2.13% ± 0.23%
KCCA-SVM (150) | 1.36% ± 0.15%
KCCA-SVM (200) | 1.21% ± 0.27%
Classification with multi-views
Results
   | pSVM | kcca_SVM | SVM | SVM_2k_j | Concat | SVM_2k
   | Dual variables | KCCA + SVM | Direct SVM | X-ling. SVM-2k | Concat + SVM | Co-ling. SVM-2k
1  | 59.4±3.9 | 60.3±2.8 | 66.6±2.8 | 66.1±2.6 | 67.5±2.3 | 67.5±2.1
2  | 71.1±4.5 | 68.4±4.4 | 73.0±4.0 | 74.8±4.7 | 73.9±4.0 | 75.1±4.1
3  | 16.7±1.2 | 13.1±1.0 | 18.8±1.6 | 20.8±1.9 | 21.5±1.9 | 22.5±1.7
7  | 74.9±1.8 | 76.0±1.2 | 76.7±1.3 | 77.5±1.4 | 79.0±1.2 | 80.7±1.5
12 | 75.0±0.8 | 73.6±0.8 | 76.8±1.0 | 77.6±0.7 | 76.8±0.6 | 78.4±0.6
14 | 76.0±1.6 | 71.5±1.5 | 80.9±1.3 | 82.2±1.3 | 81.4±1.4 | 82.7±1.3
Targeted Density Learning
Example plot
Idealised view of progress: study the problem to develop a theoretical model; derive an analysis that indicates the factors that affect solution quality; translate this into an optimisation that maximises those factors, relaxing to ensure convexity; develop efficient solutions using specifics of the task.
Role of theory
Conclusions and future directions

November, 2006 CCKM'06 1

Editor's Notes

  1. The combined image and text database was obtained from the Internet by searching for images and downloading the adjacent text. Images smaller than 72x72 were discarded. Features: 192 image texture features, 768 image colour features and 3591 text features (terms). [Retrieved from www.yahoo.com and www.warpig.com]