5. Before do Something Fancy
http://blog.kaggle.com/2017/01/05/your-year-on-kaggle-most-memorable-
community-stats-from-2016/
on deep learning
1. Traditional Model
6. Before do Something Fancy
SVM 은 MDA, Logit, CBR 과 비교해서 우수한 예측력을 보였으며 .... 인공신경망과 비
슷한 수준의 높은 예측력을 나타낼 뿐만 아니라 인공신경망의 한계점으로 지적되었던
과대 적합, 국소 최적화와 같은 한계점들을 완화하는 장점을 가진다. (2003 한인구)
http://www.aistudy.co.kr/pattern/support_vector_machine.htm
https://www.researchgate.net/post/Which_classifier_is_the_better_in_case_of_small_data_s
amples
그냥 일단 먼저 다른 모델(SVM) 돌려보세요...
on deep learning
1. Traditional Model
7. Before do Something Fancy on deep learning
2. Data Augmentation
For Image
https://github.com/aleju/imgaug
For Audio
https://github.com/bmcfee/muda
For others
put some money
8. Wait... Why Deep Learning?
https://www.quora.com/Why-is-xgboost-given-so-much-less-attention-than-deep-
learning-despite-its-ubiquity-in-winning-Kaggle-solutions
When you do have "enough" training data, and when somehow you manage to
find the matching magical deep architecture, deep learning blows away any other
method by a large margin.
Will you still do it?
9.
10. How can we train with
few data
Dive with Example, no math required!
ETRI 두스 2018 _ 2차 스터디
davinnovation@gmail.com
In deep learning perspective
12. 후에 소개되는 방법론들이
앞에서 설명한 결과보다 좋아진다는 보장은 없음!!!
It’s just the other tools. Not magic wand
13. Approaches
if (data size is small) and not (satisfied SVM):
if (sufficient label):
Fine Tuning
elif (few labels):
N-shot Learning
elif (no labels):
Zero-shot Learning/Domain Adaptation
elif (skewed labels):
Anomaly Detection if special else Training Tricks!
else:
Hire Alba for labeling!
else:
if !(sufficient label):
semi-supervised learning, unsupervised learning
else:
JUST DO DEEP LEARNING!
14. Approaches
if (data size is small) and not (satisfied SVM):
if (sufficient label):
Fine Tuning
elif (few labels):
N-shot Learning
elif (no labels):
Zero-shot Learning/Domain Adaptation
elif (skewed labels):
Anomaly Detection if special else Training Tricks!
else:
Hire Alba for labeling!
else:
if !(sufficient label):
semi-supervised learning, unsupervised learning
else:
JUST DO DEEP LEARNING!
Transfer
Learning
Uncertainty
Learning Method
15. Transfer Learning
== Knowledge transfer
Pan, Sinno Jialin, and Qiang Yang. "A survey on transfer learning." IEEE Transactions on knowledge and data
engineering 22.10 (2010): 1345-1359.
Before we dive into models...
20. Transfer Learning
== Knowledge transfer
Transfer Types
Instance-transfer re-weight (source-data trained)model by target-data
== min(model(source)) -> min(model(target))
Feature-representation
transfer
find good feature representation for source & target
== min ( model_feature(source).variation - model_feature(target).variation )
Parameter-transfer discover share parameter between source & target
== 1/2 ( model_feature(source).weight + model_feature(target).weight )
Relational-knowledge-
transfer
build mapping of relational knowledge between source & target
learn source - > learn target
learn source & target same time
model weight perspective...
learn some relational info
21. FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
10M > datasets
Fixed Feature Extractor
Fine-tuning
In deep learning perspective with small data
http://cs231n.github.io/transfer-learning/
22. FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
SC : Sparse Coding ( Scratch )
TF : Transfer Learning
CTL : Complete TL
PTL : Partial TL
MTL : Multi-task TL
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7480825 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7064414
16220 images
23. FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
10M > datasets
https://arxiv.org/pdf/1311.2901v3.pdf
each layer can extract Features
Why fine-tune can work on deep learning
In deep learning perspective with small data
24. FINE TUNING
if (sufficient label)
if (data size is small) and not (satisfied SVM):
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
10M > datasets
https://arxiv.org/pdf/1411.1792.pdf
How much we fine tune?
In deep learning perspective with small data
25. Multi-task learning
if (sufficient label)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
https://openreview.net/pdf?id=S1PWi_lC-
70,000
70,000
70,000
26. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar )
Human Deep Learning
27. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar )
single one picture book 60,000 train data (MNIST)
( Actually is NOT! but just for fun...)
28. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Meta-Learning
Meta Learning
Learning to Learn
http://bair.berkeley.edu/blog/2017/07/18/lear
ning-to-learn/
29. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Memory model ( Neural Turing Machine )
RNN Memory
https://www.slideshare.net/ssuserafc864/one-shot-learning-deep-learning-
meta-learn
30. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Memory model ( Neural Turing Machine )
One-shot Learn with Meta
https://arxiv.org/pdf/1605.06065.pdf
Omniglot Dataset : 1600 > classes
1200 class train, 423 class test ( downscale to 20x20 )
+ plus rotate augmentation
31. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Metric Learning Perspective
Metric Learning : learns Feature Extract + Features Manifold
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
Metric Learning
32. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Metric Learning Perspective
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
Metric Learning
33. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Metric Learning Perspective
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
One-Shot Learn with metric
60, 000 color images of size 84 × 84 with 100 classes
NOT LEARN CLASS
LEARNS metrics
https://arxiv.org/pdf/1606.04080.pdf
34. N-Shot
elif (few labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
N-Shot ( One shot is more familiar ) Metric Learning Perspective
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
One-Shot Learn with metric
35. Zero-Shot
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Zero-shot Metric Learning Perspective
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
Zero-shot Learn with metric
36. Zero-Shot
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Zero-shot Metric Learning Perspective
https://drive.google.com/file/d/1kDedrnO4N2l9RATSXRS0FuAZqW1mHPWu/view
Zero-shot Learn with metric
37. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
No Domain
Adaptation
38. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Class = Backpack
Amazon
DSLR
Webcam
Caltech
Training
39. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Domain-Adversarial Neural Network
Classifier의 성능을 유지하면서
(Classifier)
source || target feature 분포도 고려,
Source에서 왔는지 Target에서 왔는지 알
수 없도록 방해
(GAN)
40. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Domain-Adversarial Neural Network
41. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Domain Separation Networks
이미지 복원값
shared encoder와 차이classification loss
2개 difference를 비슷하게 만들어줌
42. Domain Adaptation
elif (no labels)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
Domain Separation Networks
Source-only : Training with only source data
Target-only : Training with only Target data Testing on target data
SVHN
GTSRBMNIST
43. Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
https://www.youtube.com/watch?v=hHHmWmJG9Rw
It’s look like Zero – shot learn
44. Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
https://www.datascience.com/blog/python-
anomaly-detection
Gaussian Distribution -> Check Uncertainty!!!
45. Anomaly Detection (Novelty Detection)
elif (skewed labels) and (special case)
if (data size is small) and not (satisfied SVM):
In deep learning perspective with small data
safe visual navigation via deep learning https://www.slideshare.net/samchoi7/modeling-uncertainty-in-deep-learning
46. Training Skills
elif (skewed labels)
if (data size is small) and not (satisfied SVM):
Model weight update with balance
https://arxiv.org/pdf/1710.05381.pdf : imbalance class effect
Stratified Sampling / Bootstrapping... / K-Fold...