Lecture8 - From CBR to IBk
1. Introduction to Machine Learning
Lecture 8
Instance Based Learning and Case-Based Reasoning
Albert Orriols i Puig
aorriols@salle.url.edu
Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
2. Recap of Lecture 7
kNN
15-NN 1-NN
Key aspects
Value of k
Distance functions
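The two key aspects above can be illustrated with a minimal kNN sketch. It is only an illustration: the function names (`knn_predict`, `euclidean`, `manhattan`) and the toy cases are assumptions made for this example and do not come from the lecture.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # City-block distance, shown as an alternative distance function.
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(cases, query, k=1, distance=euclidean):
    # cases is a list of (features, label) pairs.
    neighbours = sorted(cases, key=lambda c: distance(c[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one "b" case sits inside the "a" region.
cases = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"),
         ((0.9, 1.1), "b"), ((0.3, 0.1), "b")]
query = (0.25, 0.1)

print(knn_predict(cases, query, k=1))  # follows the single nearest case
print(knn_predict(cases, query, k=3))  # majority vote among three cases
```

With this toy data the prediction changes between k=1 and k=3, which is why the value of k and the distance function are treated as the key design choices.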
3. Recap of Lecture 7
Where is learning in kNN?
Retrieval system
No global model
No generalization
…
No learning!
But still, it is able to create accurate classification models
4. Today’s Agenda
Formalizing the framework: From kNN to CBR
Incorporating learning in different phases:
Learn prototypes
Organize the memory in clusters
Learn the best distance function
Provide explanations
5. From kNN to CBR
kNN provides a retrieving system
Much work on different phases of kNN
Prototype selection
Distance function selection
…
CBR provides a general framework based on kNN
6. Schema of CBR
CBR cycle (Aamodt & Plaza, 1994)
[Figure: the CBR cycle. A Problem enters at Retrieve; the cycle continues through Reuse, Revise, and Retain around the Case Memory and yields a Solution.]
Figure labels:
Similarity function
Coherence and relevance of the attributes
Select a solution
Revise the solution
Retain the new knowledge
Structure and grouping of the cases
7. Phases of CBR
Five key phases
Preprocess the training instance
So that it meets the requirements of the system
Retrieve
Use kNN with the selected distance function
Reuse
Vote-based scheme
Revise
Adapt the solution if necessary
Retain
Remove examples from or add examples to
the case memory
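The five phases above can be sketched as one small class. This is a minimal sketch under stated assumptions, not the lecture's implementation: the class name `CaseBasedReasoner`, the numeric-tuple preprocessing, and the retain policy (store a case only when the proposed solution was wrong) are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class CaseBasedReasoner:
    def __init__(self, memory, k=1, distance=euclidean):
        self.memory = list(memory)  # case memory: (features, solution) pairs
        self.k = k
        self.distance = distance

    def preprocess(self, features):
        # Assumed requirement of this sketch: features are a tuple of floats.
        return tuple(float(x) for x in features)

    def retrieve(self, query):
        # kNN with the selected distance function.
        ranked = sorted(self.memory, key=lambda c: self.distance(c[0], query))
        return ranked[:self.k]

    def reuse(self, retrieved):
        # Vote-based scheme over the retrieved cases.
        votes = Counter(solution for _, solution in retrieved)
        return votes.most_common(1)[0][0]

    def revise(self, proposed, feedback=None):
        # Adapt the solution if external feedback is available.
        return proposed if feedback is None else feedback

    def retain(self, query, proposed, confirmed):
        # Assumed policy: only add cases the memory got wrong.
        if proposed != confirmed:
            self.memory.append((query, confirmed))

    def solve(self, features, feedback=None):
        query = self.preprocess(features)
        proposed = self.reuse(self.retrieve(query))
        confirmed = self.revise(proposed, feedback)
        self.retain(query, proposed, confirmed)
        return proposed, confirmed

cbr = CaseBasedReasoner([((0.0, 0.0), "a"), ((1.0, 1.0), "b")], k=1)
first = cbr.solve([0.4, 0.4], feedback="b")  # wrong proposal, case is retained
second = cbr.solve([0.45, 0.45])             # the retained case is now nearest
print(first, second, len(cbr.memory))
```

Other retain policies (for example, also removing redundant or noisy cases) fit the same interface; the one shown here is just the simplest to demonstrate.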
8. Challenges in CBR
Hot areas
Reduce the cost of matching
Reduce the total number of examples in the case memory
Organize the case memory in clusters and only consult examples of some clusters
Automatically create distance functions that are suited to your
problem
Extraction of explanations:
CBR does not extract legible models (actually, it does not learn any model)
9. Prototype Selection
Training data sets contain a large number of instances
Increase the prediction time
May contain noisy instances
Prototype selection
Select the representative examples to form the case base
Remove all the other examples
How?
Learn which examples are the ones that maximize CBR
accuracy
10. Prototype Selection
Possible sets of prototypes
[Figure: the training data set is split (training set and validation set); candidate selections of prototypes (Sel. Proto 1, Sel. Proto 2, Sel. Proto 3, …) are evaluated with kNN; a separate test data set is also shown.]
How do we know which is the best selection of prototypes?
Does it sound familiar to you?
Problem: Search for the best selection of prototypes (SP)
It’s just an optimization problem
For robustness, use cross-validation or similar validation procedures
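As a rough sketch of the search described above, the following code scores candidate prototype subsets with 1-NN on a validation set and keeps the best one. Plain random search is used here only because it is the simplest optimizer to show; the function names, the synthetic two-class data, and the fixed subset size are assumptions of this example.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def accuracy(prototypes, data):
    # 1-NN accuracy on data, using only the chosen prototypes as case base.
    hits = sum(min(prototypes, key=lambda c: euclidean(c[0], x))[1] == y
               for x, y in data)
    return hits / len(data)

def select_prototypes(train, validation, n_prototypes, n_trials=200, seed=0):
    # Random search: sample subsets of the training set, keep the best scorer.
    rng = random.Random(seed)
    best_subset, best_acc = None, -1.0
    for _ in range(n_trials):
        subset = rng.sample(train, n_prototypes)
        acc = accuracy(subset, validation)
        if acc > best_acc:
            best_subset, best_acc = subset, acc
    return best_subset, best_acc

# Hypothetical data: two well-separated classes around (0, 0) and (5, 5).
data_rng = random.Random(1)
def make_cases(n, centre, label):
    return [((centre + data_rng.uniform(-1, 1),
              centre + data_rng.uniform(-1, 1)), label) for _ in range(n)]

data = make_cases(30, 0.0, "a") + make_cases(30, 5.0, "b")
data_rng.shuffle(data)
train, validation = data[:40], data[40:]

prototypes, val_acc = select_prototypes(train, validation, n_prototypes=2)
print(len(prototypes), val_acc)
```

A single train/validation split is used to keep the sketch short; replacing `accuracy` with a cross-validated estimate would follow the robustness advice above.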
11. Prototype Selection
Optimization methods used so far
Genetic algorithms (Holland, 1975)
Genetic Programming (Koza et al., 1989)
Grammatical Evolution (Ryan & O’Neill, 1998)
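To make the first of these methods concrete, here is a minimal genetic-algorithm sketch for prototype selection. Each individual is a boolean mask over the training set, and its fitness is 1-NN validation accuracy minus a small penalty for keeping many prototypes. The operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the parameter values, and the synthetic data are illustrative assumptions, not the setup used in the cited work.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fitness(mask, train, validation, size_penalty=0.01):
    # Validation accuracy of 1-NN over the kept cases, minus a size penalty.
    prototypes = [case for case, keep in zip(train, mask) if keep]
    if not prototypes:
        return -1.0  # an empty case base cannot classify anything
    hits = sum(min(prototypes, key=lambda c: euclidean(c[0], x))[1] == y
               for x, y in validation)
    return hits / len(validation) - size_penalty * len(prototypes) / len(train)

def evolve(train, validation, pop_size=20, generations=20, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    n = len(train)
    score = lambda mask: fitness(mask, train, validation)
    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)

# Hypothetical data: two well-separated classes around (0, 0) and (5, 5).
data_rng = random.Random(1)
def make_cases(n, centre, label):
    return [((centre + data_rng.uniform(-1, 1),
              centre + data_rng.uniform(-1, 1)), label) for _ in range(n)]

data = make_cases(30, 0.0, "a") + make_cases(30, 5.0, "b")
data_rng.shuffle(data)
train, validation = data[:40], data[40:]

best = evolve(train, validation)
print(sum(best), "of", len(train), "cases kept")
```

The size penalty is what pushes the search toward small case bases; without it, any mask that reaches the best validation accuracy would be considered equally good.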
12. Case-Based Memory Clustering
Training data sets contain a large number of instances
Clustering: Place instances in different clusters
Only retrieve from the same cluster or clusters that are close to you
13. Case-Based Memory Clustering
Retrieve phase
1. Compare with all the prototypes
2. Compare only with the examples of the closest cluster
Reuse phase
Propose a solution with the retrieved cases
Revise phase
Revise if the solution is potentially valid
Retain phase
Update the organization. It may imply the update of the clusters
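The two-step retrieval above can be sketched as follows. The clusters are built here with a plain k-means loop and each cluster centroid plays the role of its prototype; both choices, along with the function names and the toy cases, are assumptions of this example rather than the method prescribed in the lecture.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    return tuple(sum(col) / len(points) for col in zip(*points))

def assign(cases, centroids):
    # Put every case in the group of its nearest centroid.
    groups = [[] for _ in centroids]
    for case in cases:
        i = min(range(len(centroids)),
                key=lambda j: euclidean(centroids[j], case[0]))
        groups[i].append(case)
    return groups

def build_clusters(cases, centroids, iterations=10):
    # Plain k-means over the case features, starting from given centroids.
    for _ in range(iterations):
        groups = assign(cases, centroids)
        centroids = [mean([f for f, _ in g]) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids, assign(cases, centroids)

def retrieve(query, centroids, groups, k=1):
    # Step 1: compare the query with all the prototypes (centroids).
    closest = min(range(len(centroids)),
                  key=lambda j: euclidean(centroids[j], query))
    # Step 2: compare only with the examples of the closest cluster.
    return sorted(groups[closest], key=lambda c: euclidean(c[0], query))[:k]

def reuse(retrieved):
    # Propose a solution by voting among the retrieved cases.
    return Counter(label for _, label in retrieved).most_common(1)[0][0]

cases = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
centroids, groups = build_clusters(cases, [(0, 0), (5, 5)])

retrieved = retrieve((0.9, 0.2), centroids, groups, k=2)
print(retrieved, reuse(retrieved))
```

In this toy run the query is compared with two centroids and three cases instead of all six cases; the saving grows with the size of the case memory, at the cost of possibly missing a nearer case in a neighbouring cluster.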
14. Generation of Distance Functions
How does the distance function influence learning?
It may be the key between success and failure!
15. Generation of Distance Functions
Can I find a distance function that makes kNN perform the best in all cases?
No way. Actually, the No Free Lunch theorem (NFL) states it (Wolpert, 1992)
Different distances suited for different domains
May I try to create a new distance function for each
specific problem?
Of course. Again, an optimization problem
16. Generation of Distance Functions
Split the training data set into
Training set'
Validation set
Optimization problem
Assume a parametric form
Optimize the parameters of the underlying function
Being more ambitious?
Do not assume any parametric form
Optimize both the function structure and the parameters
Examples:
(Fornells et al., 2005)
(Camps et al., 2003)
[Figure: candidate distance functions (Dist. function 1, Dist. function 2, …, Dist. function n) are each used by kNN with Training data set' and the Validation set, giving error 1, error 2, …, error n.]
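The parametric case can be sketched with a weighted Euclidean distance whose per-feature weights are the parameters. Each candidate weight vector defines one distance function; 1-NN over the reduced training set is scored on the validation set, and the candidate with the lowest error wins. The weighted form, the three hand-picked candidates, and the toy data (feature 0 informative, feature 1 large-scale noise) are assumptions made for this example.

```python
import math

def weighted_distance(weights):
    # Returns a distance function parameterized by per-feature weights.
    def distance(a, b):
        return math.sqrt(sum(w * (x - y) ** 2
                             for w, x, y in zip(weights, a, b)))
    return distance

def validation_error(distance, train, validation):
    # Error rate of 1-NN over train, measured on the validation set.
    hits = sum(min(train, key=lambda c: distance(c[0], x))[1] == y
               for x, y in validation)
    return 1 - hits / len(validation)

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
train = [((0.0, 10.0), "a"), ((0.1, -10.0), "a"),
         ((0.2, 9.0), "b"), ((0.3, -9.0), "b")]
validation = [((0.05, -9.2), "a"), ((0.25, 9.8), "b"),
              ((0.02, 9.1), "a"), ((0.28, -9.9), "b")]

# Candidate parameter settings: unweighted, feature 0 only, feature 1 only.
candidates = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
errors = {w: validation_error(weighted_distance(w), train, validation)
          for w in candidates}
best_weights = min(errors, key=errors.get)
print(errors, best_weights)
```

A fixed candidate list stands in for the optimizer here; the same `validation_error` function could serve as the objective for a genetic algorithm or any other search over the weights.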
17. Extraction of Explanations
One of the main drawbacks of CBR is that it does not provide any explanation
Prediction based on nearest neighbors
New techniques to provide explanations
Based on used instances
Building of partial models
Not studied in more detail here
18. Next Class
Probabilistic-based learning