This document proposes methods for classifying pollen with less training data. It combines transfer learning and active learning so that classification rules learned from existing labeled samples can be applied to new, unlabeled samples, reducing the need for experts to label large training sets by hand. The methods are tested on a dataset of 9 pollen types, with 6 types serving as the source domain and 3 as the target, and reach 92% accuracy while cutting the labeling effort by up to a factor of 5.
5. Challenges
Manual labeling: tedious and expensive
Automatic classification: large training samples
Labeling effort for skilled specialists
6. Transfer Learning
[Diagram: source samples, existing rule]
1. Learn rules from existing (source) samples using AdaBoost
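
As a rough illustration of step 1, the sketch below implements plain discrete AdaBoost over one-feature threshold rules (decision stumps) in pure Python. The slides do not say which weak learner or feature representation was used, so the stumps, the binary labels in {-1, +1}, and all function names here are assumptions made for the example.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                # Weighted error: total weight of the samples this stump gets wrong.
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[j] >= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, x):
    _, j, thr, pol = stump
    return pol if x[j] >= thr else -pol

def adaboost(X, y, rounds=10):
    """Learn a weighted list of stumps ("rules") from labeled source samples."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy source samples: one feature, two classes.
X_source = [[1.0], [2.0], [3.0], [6.0], [7.0], [8.0]]
y_source = [-1, -1, -1, 1, 1, 1]
source_rules = adaboost(X_source, y_source, rounds=3)
```

Each entry of `source_rules` pairs a stump with its vote weight; a multi-class pollen classifier would need a one-vs-rest or similar wrapper around this binary version.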
7. Transfer Learning
[Diagram: source samples, existing rule, target sample, transfer rule, unlabeled sample]
2. Apply rules to new (target) samples by modifying TaskTrAdaBoost
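
The slides do not describe how TaskTrAdaBoost was modified, so the following is only a sketch of the standard second phase of that algorithm as it is usually presented: rules trained on the source tasks form a candidate pool, and boosting over a small labeled target set picks, in each round, the candidate with the lowest weighted target error. The rule functions, the toy data, and the names `task_tradaboost_phase2` and `transfer_predict` are hypothetical.

```python
import math

def task_tradaboost_phase2(candidates, X_t, y_t, rounds=3):
    """Build a target classifier by boosting over a pool of source-trained rules."""
    n = len(X_t)
    w = [1.0 / n] * n
    pool = list(candidates)
    ensemble = []
    for _ in range(rounds):
        if not pool:
            break
        scored = []
        for h in pool:
            # Weighted error of this source rule on the labeled target samples.
            err = sum(wi for xi, yi, wi in zip(X_t, y_t, w) if h(xi) != yi)
            sign = 1
            if err > 0.5:
                # A rule that is mostly wrong is still informative when negated.
                err, sign = 1 - err, -1
            scored.append((err, sign, h))
        err, sign, h = min(scored, key=lambda t: t[0])
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = sign * 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        pool.remove(h)  # each source rule is transferred at most once
        # Re-weight the target samples as in ordinary AdaBoost.
        w = [wi * math.exp(-alpha * yi * h(xi)) for xi, yi, wi in zip(X_t, y_t, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def transfer_predict(ensemble, x):
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical rules learned earlier on source data (labels are -1 / +1).
source_pool = [
    lambda x: 1 if x[0] >= 5 else -1,
    lambda x: 1 if x[1] >= 0 else -1,
    lambda x: -1,
]
# A handful of labeled target samples.
X_target = [[4.0, 1.0], [6.0, -1.0], [7.0, 2.0], [3.0, -2.0]]
y_target = [-1, 1, 1, -1]
transfer_rule = task_tradaboost_phase2(source_pool, X_target, y_target)
```

Because only the vote weights and the choice of rules are fitted to the target data, very few target labels are needed, which is the motivation for transferring rules instead of retraining from scratch.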
8. Active Learning
[Diagram: source samples, existing rule, target sample, transfer rule, unlabeled sample, selected sample]
3. Develop a new selection criterion to focus on target samples
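
The new selection criterion itself is not spelled out on the slide. For orientation only, the sketch below shows a common baseline it could be compared against: margin-based uncertainty sampling, which picks the unlabeled samples on which a weighted ensemble of rules is least decided. The ensemble format (a list of `(weight, rule)` pairs), the example rules, and the function name are assumptions, not the authors' criterion.

```python
def select_most_uncertain(ensemble, unlabeled, k=1):
    """Return indices of the k unlabeled samples with the smallest vote margin."""
    total = sum(abs(a) for a, _ in ensemble) or 1.0

    def margin(x):
        # Normalized absolute vote: 0 means a tie, 1 means unanimous.
        return abs(sum(a * h(x) for a, h in ensemble)) / total

    ranked = sorted(range(len(unlabeled)), key=lambda i: margin(unlabeled[i]))
    return ranked[:k]

# Hypothetical two-rule ensemble with vote weights 1.0 and 0.8.
rules = [
    (1.0, lambda x: 1 if x[0] >= 5 else -1),
    (0.8, lambda x: 1 if x[1] >= 0 else -1),
]
unlabeled_pool = [[6.0, 1.0], [6.0, -1.0], [2.0, -3.0]]
chosen = select_most_uncertain(rules, unlabeled_pool, k=1)
```

In this toy pool the two rules disagree only on the second sample, so it is the one selected for labeling.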
9. Active Learning
[Diagram: source samples, existing rule, target sample, transfer rule, unlabeled sample, selected sample, updated rule]
4. Ask expert for label, then update rules based on the new training set
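
The query-and-update cycle in step 4 can be sketched as a generic pool-based active learning loop. To keep the example short, the "rule" is a toy one-dimensional threshold instead of a boosted ensemble, and the expert is simulated by a function; `fit_threshold`, `active_learning_loop`, and `simulated_expert` are hypothetical names, and the data is assumed to be separable.

```python
def fit_threshold(labeled):
    """Toy rule: threshold halfway between the largest negative and smallest positive."""
    neg = max(x for x, y in labeled if y == -1)
    pos = min(x for x, y in labeled if y == 1)
    return (neg + pos) / 2.0

def active_learning_loop(labeled, pool, oracle, budget):
    """Repeatedly query the expert for the least certain sample and refit the rule."""
    labeled, pool = list(labeled), list(pool)
    thr = fit_threshold(labeled)
    for _ in range(budget):
        if not pool:
            break
        # Select the unlabeled sample closest to the current decision boundary.
        x = min(pool, key=lambda v: abs(v - thr))
        pool.remove(x)
        # Ask the expert for its label and add it to the training set.
        labeled.append((x, oracle(x)))
        # Update the rule on the enlarged training set.
        thr = fit_threshold(labeled)
    return thr, labeled

def simulated_expert(x):
    # Stand-in for the human specialist; the true boundary is placed at 5.0.
    return 1 if x >= 5.0 else -1

initial = [(0.0, -1), (10.0, 1)]
unlabeled = [1.0, 3.0, 4.5, 5.5, 7.0, 9.0]
threshold, training_set = active_learning_loop(initial, unlabeled, simulated_expert, budget=3)
```

With a budget of three queries the loop asks only about samples near the boundary, which is how active learning keeps the expert's labeling effort small.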
10. Results
1. Select more target samples
2. Select better target samples
Training effort reduced by up to a factor of 5 at 92% accuracy