Methods for Meta-Learning in AutoML
Learning how to Learn
Mohamed Maher - University of Tartu - 2019 - mohamed.abdelrahman@ut.ee
Motivational Example
● Alex takes a Maths course with 10 tests, for the first time in his life. (Problem)
● Alex wants to get grade A in most of the course tests. (Target)
● Alex thought that attending all lectures would easily get him grade A, as it always does in his History courses. (Approach 1)
● Alex got grade D in his first test. (Result 1)
● Alex decided to switch to reading the reference book instead of only attending lectures. (Approach 2)
● Alex got grade C in his second test. (Result 2)
● After that, Alex decided to switch to solving practice
problems instead. (Approach 3)
● Alex got grade B in his third test. (Result 3)
● So, Alex decided to summarize each lesson and
teach it to his colleagues too. (Approach 4)
● Alex got grade A in his fourth test. (Result 4)
● Now, the question is: how will Alex study for his 5th test in the course?
Alex has already learnt how to learn.
Back to Machine Learning
Motivation
Typical Supervised Machine Learning Pipeline:
Real-World Data → Data Collection → 1. Data Preprocessing → Feature Engineering (2. Feature Extraction → 3. Feature Selection) → Model Building (4. Algorithm Selection → 5. Parameter Tuning) → Prediction → Deployment
Motivation: Model Building
Model Building covers 4. Algorithm Selection and 5. Parameter Tuning.
Examples:
- Linear Classification: (Simple Linear Classification, Ridge, Lasso, Simple Perceptron, ….)
- Support Vector Machines
- Decision Tree (ID3, C4.5, C5.0, CART, ….)
- Nearest Neighbors
- Gaussian Processes
- Naive Bayes (Gaussian, Bernoulli, Complement, ….)
- Ensembling: (Random Forest, GBM, AdaBoost, ….)
Motivation: Model Building
Example: Support Vector Machine hyperparameter space
- Kernel: Linear, RBF, Polynomial, ...
- Gamma: [2^-15, 2^3]
- Degree: 2, 3, ...
- C (Penalty): [2^-5, 2^15]
- ...
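As an illustration, here is a minimal sketch (assuming scikit-learn is available) of exploring this SVM hyperparameter space with a simple randomized search:

```python
# Sketch: the SVM search space above, explored with a randomized search.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

param_distributions = {
    "kernel": ["linear", "rbf", "poly"],
    "gamma": (2.0 ** np.arange(-15, 4)).tolist(),   # [2^-15, ..., 2^3]
    "degree": [2, 3, 4],                            # used only by the poly kernel
    "C": (2.0 ** np.arange(-5, 16)).tolist(),       # [2^-5, ..., 2^15]
}

X, y = load_iris(return_X_y=True)
search = RandomizedSearchCV(SVC(), param_distributions, n_iter=20, cv=3, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```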
Motivation: Dimensionality Reduction (2. Feature Extraction, 3. Feature Selection)
Examples of Feature Extraction:
1. Principal Component Analysis
2. Linear Discriminant Analysis
3. Multiple Discriminant Analysis
4. Independent Component Analysis
Examples of Multivariate Feature Selection:
1. Relief
2. Correlation Feature Selection
3. Branch and Bound
4. Sequential Forward Selection
5. Plus L - Minus R
Examples of Univariate Feature Selection:
1. Information Gain
2. Fisher Score
3. Correlation with Target
Motivation: Data Preprocessing (1. Data Preprocessing)
Examples of Data Preprocessors:
1. Scaling
2. Normalization
3. Standardization
4. Binarization
5. Imputation
6. Deletion
7. One-Hot-Encoding
8. Hashing
9. Discretization
Solution: Meta-Learning
1. Science of systematically observing how different machine learning
approaches perform on a wide range of learning tasks and then
learning from this experience.
2. It also allows us to replace hand-engineered rules and algorithms with novel, data-driven approaches.
HOW? Collect Meta-Data
1. Model Configurations:
- Pipeline Composition: (Normalization → PCA → SVM)
- Hyperparameter Settings: (PCA = 2 components, SVM = gamma: 1e-9, C = 1e2)
- Network Architectures: (2 Hidden Layers, 100 Neurons per layer)
2. Resulting Model Evaluations:
- Different Metrics: Accuracy, error rate, F1-Score.
- Training Time.
3. Task Itself (Meta-Features):
- Description of the data
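To make this concrete, here is a hypothetical sketch of one such meta-data record; all field names are illustrative and not taken from any specific meta-learning system:

```python
# Hypothetical meta-data record for one experiment (illustrative field names).
from dataclasses import dataclass, field

@dataclass
class MetaDataRecord:
    pipeline: list                   # e.g. ["Normalization", "PCA", "SVM"]
    hyperparameters: dict            # e.g. {"PCA__n_components": 2, "SVM__gamma": 1e-9, "SVM__C": 1e2}
    evaluations: dict                # e.g. {"accuracy": 0.93, "f1": 0.91, "train_time_s": 4.2}
    meta_features: dict = field(default_factory=dict)   # description of the task/data itself

record = MetaDataRecord(
    pipeline=["Normalization", "PCA", "SVM"],
    hyperparameters={"PCA__n_components": 2, "SVM__gamma": 1e-9, "SVM__C": 1e2},
    evaluations={"accuracy": 0.93, "f1": 0.91, "train_time_s": 4.2},
    meta_features={"n_instances": 1500, "n_features": 20, "n_classes": 3},
)
```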
HOW? Use Meta-Data
1. Knowledge Transfer.
Use the same model as an initial point and start tuning from there.
2. Guided Search.
If Classifier X is worse than Classifier Y by 10%, then there is no need to tune Classifier X.
Remember that Alex started with the same approach that succeeded in his History courses.
Meta-learning will not be effective, and may even hurt performance, for:
- Tasks with random noise and unrelated phenomena.
“Tasks that have never been seen before”
Meta-Learning Methodologies:
1. Learning from Task Properties.
2. Learning from Model Evaluations.
3. Learning from Prior Models.
1-Learning from Task Properties:
● Represent each task as a meta-feature vector.
● Studies show that the optimal set of meta-features depends on the application type. [2]
● Different studies have used various feature selection and extraction techniques to reduce the set of meta-features. [2][3]
● What are Task Properties? = Types of Meta-features:
1. Simple
2. Statistical
3. Information Theoretic
4. Complexity
5. Landmarkers
Meta-Features Types: (Simple)
● Examples:
1. Number of Instances
2. Number of Features
3. Number of Classes
4. Number of Missing Values
5. Number of Outliers
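A minimal sketch (assuming pandas) of computing such simple meta-features; the outlier rule used here is one common choice, not a standard:

```python
import pandas as pd

def simple_meta_features(X: pd.DataFrame, y: pd.Series) -> dict:
    numeric = X.select_dtypes("number")
    z = (numeric - numeric.mean()) / numeric.std()
    return {
        "n_instances": len(X),
        "n_features": X.shape[1],
        "n_classes": int(y.nunique()),
        "n_missing_values": int(X.isna().sum().sum()),
        # a common (but not unique) outlier rule: |z-score| > 3 on any numeric column
        "n_outliers": int((z.abs() > 3).any(axis=1).sum()),
    }
```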
Meta-Features Types: (Statistical)
● Examples:
1. Skewness of Numerical Features.
2. Kurtosis of Numerical Features.
3. Correlation / covariance between features.
4. Variance explained by the first principal component.
5. Skewness and kurtosis of the first principal component.
6. Class probability distribution.
7. Concentration, Sparsity, Gravity of Features
(Measurements of independence and
dispersion of values.)
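A sketch of a few of these statistical meta-features, assuming a numeric, fully imputed dataset:

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.decomposition import PCA

def statistical_meta_features(X: pd.DataFrame) -> dict:
    Z = X.select_dtypes("number").to_numpy(dtype=float)   # assumes no missing values
    pca = PCA(n_components=1).fit(Z)
    pc1 = pca.transform(Z).ravel()                        # scores on the first principal component
    return {
        "mean_skewness": float(np.mean(stats.skew(Z, axis=0))),
        "mean_kurtosis": float(np.mean(stats.kurtosis(Z, axis=0))),
        "mean_abs_correlation": float(np.abs(np.corrcoef(Z, rowvar=False)).mean()),
        "pca1_explained_variance_ratio": float(pca.explained_variance_ratio_[0]),
        "pca1_skewness": float(stats.skew(pc1)),
        "pca1_kurtosis": float(stats.kurtosis(pc1)),
    }
```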
Meta-Features Types: (Information theoretic)
● Examples:
1. Class Entropy.
2. Mutual Information between feature and
Class.
3. Equivalent number of features (class entropy divided by mean mutual information).
4. Noise to Signal ratio.
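A sketch of these information-theoretic meta-features; mutual information is estimated here with scikit-learn's mutual_info_classif, which is only one of several possible estimators:

```python
import numpy as np
from scipy.stats import entropy
from sklearn.feature_selection import mutual_info_classif

def information_theoretic_meta_features(X, y) -> dict:
    # class entropy (in nats) from the empirical class distribution
    _, counts = np.unique(y, return_counts=True)
    class_entropy = float(entropy(counts))
    # estimated mutual information between each feature and the class (also in nats)
    mi = mutual_info_classif(X, y, random_state=0)
    mean_mi = float(np.mean(mi))
    return {
        "class_entropy": class_entropy,
        "mean_mutual_information": mean_mi,
        # "equivalent number of features" = class entropy / mean mutual information
        "equivalent_n_features": class_entropy / mean_mi if mean_mi > 0 else float("inf"),
    }
```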
Meta-Features Types: (Task Complexity)
● Examples:
1. Fisher discriminant ratio (measures the separability between classes).
Meta-Features Types: (Landmarkers)
● Examples:
1. Landmarker 1NN.
2. Landmarker Decision Tree.
3. Landmarker Naive Bayes.
4. Landmarker Linear Discriminant Analysis.
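Landmarkers are simply the cross-validated scores of fast, simple learners on the task; a minimal sketch:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def landmarker_meta_features(X, y) -> dict:
    landmarkers = {
        "landmarker_1nn": KNeighborsClassifier(n_neighbors=1),
        "landmarker_decision_tree": DecisionTreeClassifier(max_depth=1, random_state=0),  # decision stump
        "landmarker_naive_bayes": GaussianNB(),
        "landmarker_lda": LinearDiscriminantAnalysis(),
    }
    return {name: float(cross_val_score(model, X, y, cv=3).mean())
            for name, model in landmarkers.items()}
```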
How to use Meta-Features?
● Unsupervised similarity measurements between tasks, and warm-starting optimization from similar tasks, to recommend candidate configurations:
Examples:
1. Rank of different configurations.
- Tasks A, B are twin tasks.
- SVM and KNN are the best for Task A.
- Then, SVM and KNN are the best for Task B.
2. Collaborative Filtering
Use the results of a few configurations on Task A to predict the results of all other configurations, based on how those configurations performed on a similar Task B.
Caveat: the knowledge base needs results for almost all configurations in order to be updated.
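A minimal sketch of example 1 above (warm-starting from the most similar prior tasks), assuming a knowledge base of task meta-feature vectors and their best known configurations:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

def warm_start_candidates(task_meta_features, best_configs, new_task_meta_features, k=3):
    """task_meta_features: (n_tasks, n_meta_features) array from the knowledge base;
    best_configs[i]: best known configuration for task i."""
    scaler = StandardScaler().fit(task_meta_features)
    nn = NearestNeighbors(n_neighbors=k).fit(scaler.transform(task_meta_features))
    _, idx = nn.kneighbors(scaler.transform([new_task_meta_features]))
    return [best_configs[i] for i in idx[0]]   # configurations to try first on the new task
```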
How to use Meta-Features?
● Learning high-level meta-features: map low-level meta-features to more informative high-level ones.
NEEDS A BIG KNOWLEDGE BASE
How to use Meta-Features?
● Meta-Models (Supervised): Learn the complex relationship between
meta-features and useful configurations in this large space.
Example:
- Ranking of Top N Promising Configurations:
The literature suggests boosting and bagging models [4][5], plus Approximate Ranking Tree Forests [6] (automatic meta-feature selection based on some initial results).
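A minimal sketch of such a meta-model, here simplified to predicting (and ranking) the best algorithm per task with a random forest; the knowledge-base arrays are assumed inputs:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rank_algorithms(task_meta_features, best_algorithm, new_task_meta_features, top_n=3):
    """task_meta_features: (n_tasks, n_meta_features); best_algorithm: name of the
    best-performing algorithm per prior task (both from the knowledge base)."""
    meta_model = RandomForestClassifier(n_estimators=200, random_state=0)
    meta_model.fit(task_meta_features, best_algorithm)
    proba = meta_model.predict_proba([new_task_meta_features])[0]
    order = np.argsort(proba)[::-1][:top_n]
    return list(meta_model.classes_[order])   # top-N promising algorithms for the new task
```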
How to use Meta-Features?
● Pipeline Synthesis:
1. A meta-model to predict which preprocessor will improve the performance of a specific classifier on that particular task. [7][8]
2. Reinforcement learning to construct a pipeline by addition, deletion, or replacement of pipeline blocks. [9] (AlphaD3M)
How to use Meta-Features?
● To Tune or Not to Tune:
Meta-models to predict:
1. How much improvement we can expect from tuning this particular classifier on that particular task [10].
2. How much improvement versus the additional time investment [11].
2-Learning from Model Evaluations:
● Use the evaluations of configurations tried so far as a prior for suggesting, in an iterative way, the next candidate configuration that is likely to outperform them.
Example:
1. Evaluate configuration Px on Task 1.
2. Suggest new candidate configurations P.
3. Select the candidate P most likely to outperform Px.
4. Set Px = P.
5. Go to 1.
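A toy sketch of this evaluate / suggest / select loop on a made-up one-dimensional configuration space; real systems would use Bayesian optimization with a proper surrogate model:

```python
import random

def evaluate(config):                     # placeholder objective (assumption): higher is better
    return -(config - 0.3) ** 2

def predicted_score(config, history):     # crude "surrogate": score of the nearest evaluated config
    return min(history, key=lambda h: abs(h[0] - config))[1]

px = random.random()                      # initial configuration Px
history = [(px, evaluate(px))]            # 1. evaluate Px on the task

for _ in range(20):
    candidates = [random.random() for _ in range(10)]               # 2. suggest new Ps
    p = max(candidates, key=lambda c: predicted_score(c, history))  # 3. select the most promising P
    history.append((p, evaluate(p)))                                # 4. set Px = P and evaluate it
    px = max(history, key=lambda h: h[1])[0]                        # keep the best configuration so far

print(px)
```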
How it is used?
● Task Independent Recommendation:
1. Discretize the search space into a set of configurations.
2. Apply over many datasets.
3. Aggregate single task rankings into a global ranking.
● Example: the scikit-learn algorithm cheat sheet.
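A minimal sketch of aggregating per-task rankings into a global, task-independent ranking by average rank; the error values below are made up for illustration:

```python
import pandas as pd

# rows = datasets, columns = configurations, values = error rate (lower is better)
errors = pd.DataFrame(
    {"svm_rbf": [0.10, 0.20, 0.15], "random_forest": [0.12, 0.18, 0.11], "knn": [0.20, 0.25, 0.19]},
    index=["dataset_1", "dataset_2", "dataset_3"],
)
per_task_ranks = errors.rank(axis=1)              # rank configurations within each dataset
global_ranking = per_task_ranks.mean().sort_values()
print(global_ranking)                             # configurations ordered by average rank (best first)
```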
How it is used?
● Search Space Design:
1. Learn hyperparameter default values (the best configuration over all tasks).
2. Learn the importance of different hyperparameters:
- Measure the variance of algorithm performance while keeping all other hyperparameters fixed and changing only one.
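A simplified sketch of the second idea: estimate a hyperparameter's importance from the variance in cross-validated score when only that hyperparameter is varied (real systems use functional-ANOVA-style analyses):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
fixed = {"kernel": "rbf", "gamma": "scale", "C": 1.0}   # all other hyperparameters kept fixed

def importance(name, values):
    scores = [cross_val_score(SVC(**{**fixed, name: v}), X, y, cv=3).mean() for v in values]
    return float(np.var(scores))          # larger variance -> hyperparameter matters more

print("C importance:", importance("C", [0.01, 0.1, 1.0, 10.0, 100.0]))
print("gamma importance:", importance("gamma", [1e-4, 1e-3, 1e-2, 1e-1, 1.0]))
```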
How it is used?
● Learning Curves (Example):
1. Apply SVM over 100 training datasets and store their learning curves.
2. Apply SVM over the new dataset and record its partial learning curve.
3. Measure the similarity between the stored curves and the new curve, and use the most similar prior datasets as a guide.
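A minimal sketch of this learning-curve idea, with made-up stored curves and scikit-learn's learning_curve on a small dataset standing in for the new task:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
sizes = np.linspace(0.2, 1.0, 5)
_, _, test_scores = learning_curve(SVC(), X, y, train_sizes=sizes, cv=3)
new_curve = test_scores.mean(axis=1)            # (partial) learning curve on the "new" dataset

# curves stored in the knowledge base (made-up values, same train-size grid assumed)
stored_curves = {
    "dataset_A": np.array([0.70, 0.80, 0.85, 0.88, 0.90]),
    "dataset_B": np.array([0.50, 0.55, 0.58, 0.60, 0.61]),
}
most_similar = min(stored_curves, key=lambda name: np.linalg.norm(stored_curves[name] - new_curve))
print(most_similar)                             # reuse what worked best on this most similar prior dataset
```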
How it is used?
● Configuration Transfer:
- Surrogate Models: usually combined with Bayesian optimization, e.g. with Gaussian-process surrogates or the random-forest surrogates used in the SMAC algorithm.
- Task similarity can be defined based on the similarity of the learning (performance) distributions between tasks.
- Alternatively, task similarity can be defined by how accurately a surrogate model trained on a prior task predicts configuration performance on the new task.
- Multi-armed bandits:
1. Start with a small portion of the data and evaluate multiple configurations on it.
2. Drop the lowest-performing configurations and increase the data portion for the remaining ones.
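A minimal sketch of this successive-halving idea; the configuration space and budget schedule below are illustrative only:

```python
import random
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
random.seed(0)
configs = [{"C": 10 ** random.uniform(-2, 2), "gamma": 10 ** random.uniform(-4, 0)}
           for _ in range(8)]

portion = len(X) // 8                                    # start with a small data portion
while len(configs) > 1 and portion <= len(X):
    scores = [cross_val_score(SVC(**c), X[:portion], y[:portion], cv=3).mean() for c in configs]
    ranked = [c for _, c in sorted(zip(scores, configs), key=lambda p: p[0], reverse=True)]
    configs = ranked[: len(configs) // 2]                # drop the lowest-performing half
    portion *= 2                                         # grow the portion for the survivors

print(configs[0])                                        # the surviving configuration
```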
3-Learning from Prior Models:
● Take already-trained models (a model hub) and reuse them for similar tasks.
● Suitable for only a few classifier families (e.g. kernel classifiers, Bayesian networks), BUT very good with neural networks. WHY?
Both the architecture and the network parameters can serve as a good initialization for the target model.
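A minimal sketch of such transfer, assuming PyTorch/torchvision are available: reuse a pretrained network's architecture and parameters and retrain only a new head (the 10-class head is illustrative):

```python
import torch.nn as nn
from torchvision import models

# load a network trained on a prior task (ImageNet weights from torchvision's model hub)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                   # keep the transferred parameters frozen
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for a 10-class target task
# ...then train model.fc (and optionally unfreeze and fine-tune earlier layers) on the target data
```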
References:
[1] Hutter, F., Kotthoff, L., Vanschoren, J. (eds.): Automated Machine Learning: Methods, Systems, Challenges. Springer (2019)
[2] Bilalli, B., Abelló, A., Aluja-Banet, T.: On the predictive power of metafeatures in OpenML. International Journal of Applied Mathematics and Computer Science 27(4), 697–712 (2017)
[3] Todorovski, L., Brazdil, P., Soares, C.: Report on the experiments with feature selection in meta-level learning.
PKDD 2000 Workshop on Data mining, Decision support, Meta-learning and ILP pp. 27–39 (2000)
[4] Pinto, F., Cerqueira, V., Soares, C., Mendes-Moreira, J.: autoBagging: Learning to rank bagging workflows with
metalearning. arXiv 1706.09367 (2017)
[5] Lorena, A.C., Maciel, A.I., de Miranda, P.B.C., Costa, I.G., Prudêncio, R.B.C.: Data complexity meta-features for regression problems. Machine Learning 107(1), 209–246 (2018)
[6] Sun, Q., Pfahringer, B.: Pairwise meta-rules for better meta-learning based algorithm ranking. Machine Learning
93(1), 141–161 (2013)
[7] Bilalli, B., Abelló, A., Aluja-Banet, T., Wrembel, R.: Intelligent assistance for data pre-processing. Computer Standards & Interfaces 57, 101–109 (2018)
[8] Schoenfeld, B., Giraud-Carrier, C., Poggeman, M., Christensen, J., Seppi, K.: Feature selection for high-dimensional
data: A fast correlation-based filter solution. In: AutoML Workshop at ICML (2018)
[9] Drori, I., Krishnamurthy, Y., Rampin, R., de Paula Lourenco, R., Ono, J.P., Cho, K., Silva, C., Freire, J.: AlphaD3M:
Machine learning pipeline synthesis. In: AutoML Workshop at ICML (2018)
[10] Ridd, P., Giraud-Carrier, C.: Using metalearning to predict when parameter optimization is likely to improve
classification accuracy. In: ECAI Workshop on Meta-learning and Algorithm Selection. pp. 18–23 (2014)
[11] Sanders, S., Giraud-Carrier, C.: Informing the use of hyperparameter optimization through metalearning. In: Proc. ICDM. pp. 1051–1056 (2017)