This is a machine learning / deep learning tutorial for beginners and non-majors.
Contents
1. Introduction to machine learning & deep learning
2. DL methods:
Convolutional neural networks (CNN)
Recurrent neural networks (RNN)
Variational autoencoder (VAE)
Generative adversarial networks (GAN)
3. Can we believe deep neural networks?
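These slides were prepared for a 2-hour special lecture at Dong-A University in Busan on July 16, 2018. They are aimed at non-majors and focus on conceptual understanding rather than equations. I will try to add a walkthrough later on Terry's Deep Learning Talk: https://www.facebook.com/deeplearningtalk/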
https://www.youtube.com/playlist?list=PL0oFI08O71gKEXITQ7OG2SCCXkrtid7Fq
Deep Learning Tutorial in 100 Mins
1. Terry Taewoong Um (terry.t.um@gmail.com)
University of Waterloo
Department of Electrical & Computer Engineering
DEEP LEARNING TUTORIAL IN 100 MINS
2. Terry Taewoong Um (terry.t.um@gmail.com)
WHO AM I
2
2008 – 2010 M.S. at Seoul National University, Korea
c.f. “Tangent Space RRT with Lazy Projection: An Efficient Planning
Algorithm for Constrained Motions”, T. T. Um et al., ARK2010.
2010 – 2014 Robotics researcher at LIG Nex1 / KIST, Korea
c.f. “Independent Joint Learning: A Novel Task-to-Task Transfer Learning
Scheme for Robot Models”, T. T. Um et al., ICRA2014.
• I am a robotics researcher
3. Terry Taewoong Um (terry.t.um@gmail.com)
WHO AM I
3
2014 – Present PhD candidate at U.Waterloo, Canada
c.f. “Exercise Motion Classification from Large-Scale Wearable Sensor
Data Using Convolutional Neural Networks”, T. T. Um et al., IROS2017.
• I am a deep learning researcher
http://hookedoneverything.com/parkinsons/
https://www.trainwithpush.com/
PUSH project Parkinson’s disease (PD) project
4. WHO AM I
• But I am better known for ...
- Facebook communities: 로봇공학을 위한 열린 모임 (Open Meetup for Robotics), Tensorflow Korea, etc.
- Blog / YouTube: 테리의 딥러닝 토크 (Terry's Deep Learning Talk), T-Robotics, 대학원생 때 알았더라면 좋았을 이야기들 (Things I Wish I Had Known as a Grad Student)
- Etc.: Most-cited DL papers (GitHub)
5. Terry Taewoong Um (terry.t.um@gmail.com)
CONTENTS
5
1. Introduction to ML & DL 50min
2. DL methods: CNN, RNN, VAE, GAN 35min
3. Can we believe DNNs? 15min
4. Q & A 15min
Break 10min
7. STUDY MATERIALS
• Andrew Ng, Deeplearning.ai / Coursera
• Stanford Univ., CS231n (CNNs) / CS224d (RNNs)
• Various tutorials presented in NIPS, ICML, etc.
• Deep learning summer school: http://videolectures.net/deeplearning2017_montreal/
• https://github.com/sjchoi86/dl_tutorials_10weeks
• https://github.com/terryum/awesome-deep-learning-papers
8. Terry Taewoong Um (terry.t.um@gmail.com)
8
AI, ML, DL, NN
https://medium.com/zeroth-ai/understanding-artificial-intelligence-b9b58f9b25c2
9. Terry Taewoong Um (terry.t.um@gmail.com)
9
RECOGNITION - IMAGE
Google photos
Object recognition (image retrieval)
10. Terry Taewoong Um (terry.t.um@gmail.com)
10
YOLO v2, https://www.youtube.com/watch?v=VOC3huqHrss
RECOGNITION - IMAGE
Object detection
11. Terry Taewoong Um (terry.t.um@gmail.com)
11
RECOGNITION - NATURAL LANGUAGE
Sentiment classification
SAD Joyful
12. Terry Taewoong Um (terry.t.um@gmail.com)
12
Speech recognition
RECOGNITION - SPEECH
14. Terry Taewoong Um (terry.t.um@gmail.com)
14
SUPERVISED LEARNING
Train : X (image, text, speech, wearable data, etc.) → Y (labels)
Test : X → ? (real practice)
* Never use the test dataset while developing (training) a model
16. Terry Taewoong Um (terry.t.um@gmail.com)
16
OVERFITTING
• Model complexity vs. Error
[Figure: error vs. model complexity, with a training error curve and a test error curve]
Overfitting: good performance on training data, bad performance on test data
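The gap between training error and test error can be reproduced with a small experiment. The sketch below is only an illustration under assumptions that are not in the slides: the data follow y = 2x plus noise, the low-complexity model is a least-squares line, and the high-complexity model is a 1-nearest-neighbour lookup that memorises every training point.

```python
import random

random.seed(0)

def make_data(n):
    """Noisy samples of the (assumed) ground truth y = 2x."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Low-complexity model: least-squares line y = a*x + b."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def fit_1nn(xs, ys):
    """High-complexity model: memorise the training set (1-nearest neighbour)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

x_train, y_train = make_data(50)
x_test, y_test = make_data(50)

line = fit_line(x_train, y_train)
nn = fit_1nn(x_train, y_train)

# (training error, test error) for each model
errors = {
    "line": (mse(line, x_train, y_train), mse(line, x_test, y_test)),
    "1-NN": (mse(nn, x_train, y_train), mse(nn, x_test, y_test)),
}
for name, (train_err, test_err) in errors.items():
    print(f"{name}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The memorising model reaches zero training error by construction, yet its test error stays above zero, which is the overfitting pattern described on this slide.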
18. Terry Taewoong Um (terry.t.um@gmail.com)
18
VALIDATION SET
Train : X (image, text, speech, wearable data, etc.) → Y (labels)
Validation : X → ? (real-practice indicator)
Test : X → ? (real practice)
19. Terry Taewoong Um (terry.t.um@gmail.com)
19
PREVENTING OVERFITTING
• Training / Validation / Test datasets
- Training set: for training (parameter optimization)
- Validation set: for early stopping (avoid overfitting)
- Test set: for evaluation (measure the performance)
[Figure: error vs. training time, with a training error curve and a test error curve; the annotation "we should stop here" marks the stopping point]
Keep watching the validation error
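The three-way split and the "keep watching the validation error" rule can be sketched in a few lines. This is a minimal illustration, not the lecture's code: the 60/20/20 proportions, the `patience` parameter, and the validation curve are all made-up assumptions.

```python
import random

def split_dataset(samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into training / validation / test subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

def early_stopping_epoch(val_errors, patience=2):
    """Return the epoch with the lowest validation error, giving up once the
    error has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_epoch, best_error, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

train, val, test = split_dataset(range(100))

# Hypothetical validation curve: improves, then rises as the model overfits.
val_curve = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55]
stop_at = early_stopping_epoch(val_curve)
print(len(train), len(val), len(test), stop_at)
```

The test subset is never consulted when choosing the stopping epoch; it is touched only once, for the final evaluation.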
20. Terry Taewoong Um (terry.t.um@gmail.com)
20
PREVENTING OVERFITTING
• N-fold cross validation
[Figure: data partitioned into training, validation, and test portions]
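A rough sketch of how N-fold cross validation partitions the data is given below. The helper name `k_fold_indices` is hypothetical, and the sketch assumes the test set has already been held out, so only the remaining samples are rotated between training and validation.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 5))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```

A model would be trained once per fold, and the N validation scores averaged to compare model choices.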
21. Terry Taewoong Um (terry.t.um@gmail.com)
21
BOLTS & NUTS OF BUILDING DL
http://www.computervisionblog.com/2016/12/nuts-and-bolts-of-building-deep.html
Andrew Ng at NIPS2016
23. Terry Taewoong Um (terry.t.um@gmail.com)
23
GENERAL PROCEDURE OF ML
[Figure: input (IMAGE, SPEECH) → Feature extraction → Representation (Features) → Machine learning → Task; the feature extraction stage is labeled "Feature engineering"]
24. Terry Taewoong Um (terry.t.um@gmail.com)
24
WHAT ARE THE GOOD FEATURES?
http://twistedsifter.com/2016/03/puppy-or-bagel-meme-gallery/
25. Terry Taewoong Um (terry.t.um@gmail.com)
25
GENERAL PROCEDURE OF DL
[Figure: classical ML: Feature extraction → Representation (Features) → Machine learning → Task; DL: Deep learning (end-to-end, feature extraction included) → Task]
26. Terry Taewoong Um (terry.t.um@gmail.com)
26
DEEP LEARNING
• What is Deep Learning (DL)?
- Learning methods that have a deep (not shallow) architecture
- It usually allows end-to-end learning
- It automatically learns intermediate representations, so it can be regarded as representation learning
- It often contains stacked “neural networks”, so deep learning usually means “deep neural networks”
“Deep Gaussian Process” (2013)
https://youtu.be/NwoGqYsQifg
http://goo.gl/fxmmPE
http://goo.gl/5Ry08S
27. Terry Taewoong Um (terry.t.um@gmail.com)
27
BIOLOGICAL EVIDENCE
Yann LeCun, https://goo.gl/VVQXJG
• The ventral pathway in the visual cortex has multiple stages
• There exist a lot of intermediate representations
28. Terry Taewoong Um (terry.t.um@gmail.com)
28
IMAGENET CHALLENGE (ILSVRC)
http://image-net.org/challenges/talks/2016/ILSVRC2016_10_09_clsloc.pdf
• 1000 classes, 1.4 million images
• The first “large-scale” ML challenge
• Labeled by Amazon Mechanical Turk
(Fei-Fei Li, Stanford Univ.)
• Need large-scale data → ImageNet
• Need a scalable method → DL
• Need computation power → GPU
• Convolutional Neural Networks (CNNs)
AlexNet (2012), VGG (2014), GoogLeNet
(2015), ResNet (2016), DenseNet (2017)...
29. NEURAL NETWORKS
(H. Larochelle, DLSS2017)
• A large parametric model (like high-order polynomials)
• Learn the parameters using the gradient descent (GD) method
• Local minima problem? → Stochastic GD (SGD)
• Overfitting problem? → Large-scale data
30. NEURAL NETWORKS
(H. Larochelle, DLSS2017)
• Neural networks = Composition of functions
Forward pass:
- Linear combination: Wx + b
- Activation (ReLU or tanh): σ(Wx + b)
- (…repeat…)
- Linear combination: W σ(W(…) + b) + b
- Output activation (Softmax or Linear): σ_out(…)
Calculate the loss (cross-entropy or MSE): Loss(y_true, y_pred)
Backward pass:
- Gradient of the loss
- Gradient of the activation
- Gradient of the weights
- (…repeat…)
- Update the weights (Optimization: SGD, RMSProp, or Adam)
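The forward and backward passes listed above can be written out by hand for a very small network. The sketch below is an illustration under assumptions that are not in the slides: one hidden layer of three tanh units, a linear output, squared-error loss, plain SGD, and the XOR toy dataset.

```python
import math
import random

random.seed(0)
H = 3  # number of hidden units (arbitrary choice for this sketch)
params = {
    "W1": [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)],
    "b1": [0.0] * H,
    "W2": [random.uniform(-1, 1) for _ in range(H)],
    "b2": 0.0,
}

# XOR toy data (an assumption of this sketch)
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

def forward(x):
    """Forward pass: linear -> tanh -> linear (linear output activation)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["W1"], params["b1"])]
    y_pred = sum(w * hi for w, hi in zip(params["W2"], h)) + params["b2"]
    return h, y_pred

def mean_loss():
    """Mean squared error over the whole toy dataset."""
    return sum((forward(x)[1] - y) ** 2 for x, y in data) / len(data)

def sgd_step(x, y_true, lr=0.05):
    """Backward pass for one sample, followed by an in-place SGD update."""
    h, y_pred = forward(x)
    d_out = 2.0 * (y_pred - y_true)            # gradient of the loss w.r.t. the output
    for j in range(H):
        d_h = d_out * params["W2"][j]          # gradient w.r.t. the hidden activation
        d_pre = d_h * (1.0 - h[j] ** 2)        # back through tanh
        params["W2"][j] -= lr * d_out * h[j]   # gradient of the output weights
        for i in range(len(x)):
            params["W1"][j][i] -= lr * d_pre * x[i]
        params["b1"][j] -= lr * d_pre
    params["b2"] -= lr * d_out

loss_before = mean_loss()
for epoch in range(3000):
    for x, y in random.sample(data, len(data)):  # visit samples in random order
        sgd_step(x, y)
loss_after = mean_loss()
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Deep learning libraries derive the backward pass automatically; writing it out once mainly shows that each "gradient of ..." step is an application of the chain rule.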
31. Terry Taewoong Um (terry.t.um@gmail.com)
31
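from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist
from keras.utils import to_categorical

# Load MNIST, flatten each 28x28 image to a 784-vector, and scale to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255
# One-hot encode the digit labels, as categorical cross-entropy expects
y_train = to_categorical(y_train, 10)

model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=784))
model.add(Dense(units=10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='sgd')
model.fit(x_train, y_train, epochs=5)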
NN IN KERAS
32. SUMMARY – PART1
• Recognition & Supervised learning
• Model selection & Overfitting
• Training set split & Cross validation
• Regularization
• Deep learning : End-to-end learning
• Neural Network Basics
33. Terry Taewoong Um (terry.t.um@gmail.com)
CONTENTS
33
1. Introduction to ML & DL 50min
2. DL methods: CNN, RNN, VAE, GAN 35min
3. Can we believe DNNs? 15min
4. Q & A 15min
Break 10min
35. POPULAR DL METHODS
Supervised learning: labels (O), (mostly) discriminative model
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN*)
Unsupervised learning: labels (X), (mostly) generative model
- Variational Autoencoder (VAE)
- Generative Adversarial Network (GAN)
* RNNs can also be used in an unsupervised manner
36. POPULAR DL METHODS
Supervised learning:
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN*)
Unsupervised learning:
- Variational Autoencoder (VAE): explicit density
- Generative Adversarial Network (GAN): implicit density (try to generate realistic samples)
* RNNs can also be used in an unsupervised manner
37. POPULAR DL METHODS
Supervised learning:
- Convolutional Neural Network (CNN): static data (e.g. image)
- Recurrent Neural Network (RNN*): sequence data (e.g. natural language)
Unsupervised learning:
- Variational Autoencoder (VAE): explicit density
- Generative Adversarial Network (GAN): implicit density (try to generate realistic samples)
* RNNs can also be used in an unsupervised manner
[Callout on the figure: "The area that I am most familiar with"]
38. POPULAR DL METHODS
[Figure: the same overview of CNN, RNN, VAE, and GAN as on the previous POPULAR DL METHODS slide]
39. CONVOLUTIONAL NN (CNN)
[Figure: fully-connected layers vs. convolutional layers; labels w, h, n, p × q]
e.g.) Fully-connected: (1k × 1k) image × 1k nodes = 1 billion parameters
      Convolutional: (3 × 3) kernel size × 64 kernels = 576 parameters
https://github.com/vdumoulin/conv_arithmetic
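The parameter counts in this example are simple products and can be checked directly. The helper names below are hypothetical, and the sketch follows the slide's estimate by counting weights only (no bias terms) and assuming a single input channel.

```python
def dense_params(n_inputs, n_nodes):
    """Weights of a fully-connected layer (biases omitted, as in the slide's estimate)."""
    return n_inputs * n_nodes

def conv_params(kernel_h, kernel_w, n_kernels, in_channels=1):
    """Weights of a convolutional layer; the kernels are shared across all image positions."""
    return kernel_h * kernel_w * in_channels * n_kernels

fc = dense_params(1000 * 1000, 1000)   # (1k x 1k) image, 1k nodes
conv = conv_params(3, 3, 64)           # (3 x 3) kernels, 64 of them
print(f"fully-connected: {fc:,} parameters")
print(f"convolutional:   {conv:,} parameters")
```

The convolutional count does not depend on the image size at all, which is why weight sharing scales to large images.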
40. Terry Taewoong Um (terry.t.um@gmail.com)
40
• How can we deal with real images, which are much bigger than MNIST digit images?
- Use a locally-connected NN instead of a fully-connected one
- Use convolutions to get various feature maps
- Abstract the results into higher layers by using pooling
- Fine-tune with a fully-connected NN
https://goo.gl/G7kBjI
https://goo.gl/Xswsbd
http://goo.gl/5OR5oH
CONVOLUTIONAL NN (CNN)
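To make "convolutions to get feature maps" and "pooling" concrete, here is a minimal pure-Python sketch. The 5 × 5 image and the vertical-edge kernel are invented for illustration, and the convolution is the "valid" cross-correlation that deep learning libraries usually call convolution.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over every position where it fits."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hand-made kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4 x 4, non-zero only along the edge
pooled = max_pool(feature_map)        # 2 x 2 summary of the feature map
print(feature_map)
print(pooled)
```

In a trained CNN the kernel values are learned rather than hand-made, and many kernels run in parallel to produce many feature maps.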
41. CNN FEATURES
Terry Taewoong Um (terry.t.um@gmail.com)
http://yosinski.com/deepvis
43. 43
APPLICATIONS
Terry Taewoong Um (terry.t.um@gmail.com)
https://goo.gl/1SjmTp
A. Karpathy @ Bay area DL school 2016
https://docs.google.com/presentation/d/1Q1CmVVnjVJM_9CDk3B8Y6MWCavZOtiKmOLQ0XB7s9Vg/edit
45. POPULAR DL METHODS
[Figure: the same overview of CNN, RNN, VAE, and GAN as on the earlier POPULAR DL METHODS slides]
47. LONG SHORT-TERM MEMORY (LSTM)
[Figure: LSTM cell]
[S. Hochreiter & J. Schmidhuber 1997]
48. RNN APPLICATIONS
Terry Taewoong Um (terry.t.um@gmail.com)
(Andrej Karpathy, http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
49. 49
RNN APPLICATIONS
Terry Taewoong Um (terry.t.um@gmail.com)
• Sequence generation
• Classification
Speech recognition, Sentence/document classification,
Video classification, Activity recognition, …
𝑥
ℎ
51. RNN APPLICATIONS
• Machine translation with attention mechanism
https://research.googleblog.com/2016/09/a-neural-network-for-machine.html
Terry Taewoong Um (terry.t.um@gmail.com)
52. POPULAR DL METHODS
[Figure: the same overview of CNN, RNN, VAE, and GAN as on the earlier POPULAR DL METHODS slides]
53. Terry Taewoong Um (terry.t.um@gmail.com)
53
Task 2:
emotion estimation
Task 1:
person identification
TASK-SPECIFIC FEATURES
54. Terry Taewoong Um (terry.t.um@gmail.com)
54
WHY UNSUPERVISED LEARNING?
- Labeled data are difficult to collect
- Is this the right way to obtain a good representation?
(Lack of generalizability / transferability)
[Figure: Deep learning (end-to-end, feature extraction included) → Task]
57. 57
Terry Taewoong Um (terry.t.um@gmail.com)
• Attempt to learn a good representation without labels
• Unsupervised learning is far more difficult than supervised learning
• Turn unsupervised learning into supervised learning!
UNSUPERVISED LEARNING
58. AUTOENCODER
• Objective : Minimize reconstruction error
“오토엔코더의 모든것” (All About Autoencoders), https://www.slideshare.net/NaverEngineering/ss-96581209
59. 59
“All about VAE”, H. Lee, https://www.slideshare.net/NaverEngineering/ss-96581209
VARIATIONAL AUTOENCODER (VAE)
• Objective : Minimize reconstruction error + regularization loss
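The two terms of the VAE objective can be computed explicitly. The sketch below rests on assumptions that are not stated in the slides: the reconstruction error is a sum of squared errors, the encoder outputs a diagonal Gaussian (`mu`, `log_var`), and the regularization loss is the closed-form KL divergence from that Gaussian to a standard normal prior.

```python
import math

def reconstruction_error(x, x_hat):
    """Sum of squared errors between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return -0.5 * sum(1.0 + lv - m ** 2 - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def vae_loss(x, x_hat, mu, log_var):
    """Reconstruction error plus regularization loss."""
    return reconstruction_error(x, x_hat) + kl_to_standard_normal(mu, log_var)

# Hypothetical values for one sample with a 2-D latent code.
x, x_hat = [1.0, 0.0, 1.0], [0.9, 0.1, 0.8]
mu, log_var = [1.0, 0.0], [0.0, 0.0]
print(vae_loss(x, x_hat, mu, log_var))
```

Dropping the KL term recovers the plain autoencoder objective; the KL term is what pulls the latent codes toward the prior so that sampling from it produces sensible outputs.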
60. 60
Terry Taewoong Um (terry.t.um@gmail.com)
OVERFITTING & REGULARIZATION
• Objective : Minimize reconstruction error + regularization loss
61. 61
Terry Taewoong Um (terry.t.um@gmail.com)
http://blog.fastforwardlabs.com/2016/08/12/introducing-variational-autoencoders-in-prose-and.html
VARIATIONAL AUTOENCODER (VAE)
62. Terry Taewoong Um (terry.t.um@gmail.com)
62
GENERATED IMAGES BY VAE
https://github.com/davidsandberg/facenet/wiki/Variational-autoencoder
63. Terry Taewoong Um (terry.t.um@gmail.com)
63
GENERATED IMAGES BY VAE
https://github.com/davidsandberg/facenet/wiki/Variational-autoencoder
64. 64 / 39
Terry Taewoong Um (terry.t.um@gmail.com)
[X. Yan et al. 2016]
65. CONDITIONAL VAE
65 / 39
Terry Taewoong Um (terry.t.um@gmail.com)
[X. Yan et al. 2016]
67. POPULAR DL METHODS
[Figure: the same overview of CNN, RNN, VAE, and GAN as on the earlier POPULAR DL METHODS slides]
69. 69
NOT OPTIMIZATION, BUT GAME
H. Lee (이활석), “그림 그리는 AI” (AI That Draws),
https://www.slideshare.net/NaverEngineering/ai-83896428
http://bzit.donga.com/List/3/all/50/1202090/1
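The "game" can be made concrete with the value function from the original GAN paper (Goodfellow et al., 2014), which is not written on this slide: V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]. The discriminator D tries to maximize it, while the generator G tries to minimize it. The sketch below only evaluates V for made-up discriminator outputs; it does not train anything.

```python
import math

def gan_value(d_real, d_fake):
    """Estimate V(D, G) from discriminator outputs in (0, 1).

    d_real: D(x) on real samples
    d_fake: D(G(z)) on generated samples
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical outputs: D separates real from fake well ...
confident_d = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# ... versus D cannot tell them apart (every output is 0.5).
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident_d, fooled_d)
```

A better discriminator pushes V up and a better generator pushes it down, so training looks for an equilibrium between two players rather than the minimum of a single loss.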
70. 70 / 39
Terry Taewoong Um (terry.t.um@gmail.com)
DCGAN EBGAN LSGAN
WGAN BEGAN DRAGAN
GAN
74. GAN VARIANTS
GAN zoo, https://deephunt.in/the-gan-zoo-79597dc8c347
Most of them have been developed within the last year
75. Terry Taewoong Um (terry.t.um@gmail.com)
VOICE GENERATION (AUTOREGRESSIVE)
Taehoon Kim (김태훈, OpenAI), Naver Deview2017, “책읽는 딥러닝” (Deep Learning That Reads Books)
https://www.youtube.com/watch?v=klnfWhPGPRs&t=1992s
76. Terry Taewoong Um (terry.t.um@gmail.com)
76
Google Duplex
https://www.youtube.com/watch?v=D5VN56jQMWM&t=2m47s
RECOGNITION + GENERATION
77. POPULAR METHODS
[Figure: the same overview of CNN, RNN, VAE, and GAN as on the earlier POPULAR DL METHODS slides]
78. Terry Taewoong Um (terry.t.um@gmail.com)
CONTENTS
78
1. Introduction to ML & DL 50min
2. DL methods: CNN, RNN, VAE, GAN 35min
3. Can we believe DNNs? 15min
4. Q & A 15min
Break 10min
79. Terry Taewoong Um (terry.t.um@gmail.com)
BELIEVE OR NOT
79
green?
enemy?
1. Adversarial attacks
2. Uncertainty
3. Interpretability
80. Terry Taewoong Um (terry.t.um@gmail.com)
BELIEVE OR NOT
80
[Keyword]
= NOISE
(perturbation)
1. Adversarial attacks
2. Uncertainty
3. Interpretability
81. 81
Terry Taewoong Um (terry.t.um@gmail.com)
[Wang & Bovik, 2002]
ERRORS IN INPUT SPACE
82. Terry Taewoong Um (terry.t.um@gmail.com)
ADVERSARIAL ATTACKS
82
Gradient ascent method:
Perturb the input in the direction that increases the loss, i.e. follow the gradient of the loss w.r.t. the input
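One common instance of this gradient-ascent idea is the fast gradient sign method (FGSM, Goodfellow et al., 2015), which the slide does not name. The sketch below applies it to a logistic-regression "network" so that the input gradient has a closed form; the weights, the input, and `eps` are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy on a single sample."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    """d(loss)/d(x) = (p - y) * w for logistic regression."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def fgsm(w, b, x, y, eps):
    """One signed gradient-ascent step on the input."""
    grad = input_gradient(w, b, x, y)
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad)]

w, b = [2.0, -3.0], 0.5     # hypothetical "trained" weights
x, y = [1.0, 0.2], 1        # a sample that is classified correctly
x_adv = fgsm(w, b, x, y, eps=0.5)

p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
print(f"clean: p = {p_clean:.3f}, adversarial: p = {p_adv:.3f}")
```

The weights are left untouched; only the input moves, and each coordinate moves by at most `eps`, yet the prediction crosses the decision boundary.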
83. Terry Taewoong Um (terry.t.um@gmail.com)
83
ADVERSARIAL ATTACKS
• Adversarial examples in the physical world (Kurakin et al. 2016)
84. Terry Taewoong Um (terry.t.um@gmail.com)
ADVERSARIAL ATTACKS
84
• Adversarial patch (Brown et al. 2017)
85. Terry Taewoong Um (terry.t.um@gmail.com)
ADVERSARIAL TRAINING
85
https://www.spsc.tugraz.at/research/roM/virtual-adversarial-training-applied-neural-higher-order-factors-phone-classification
• Virtual adversarial training (Miyato et al. 2016)
https://youtu.be/kvPmArtVoFE
86. Terry Taewoong Um (terry.t.um@gmail.com)
BELIEVE OR NOT
86
green?
enemy?
1. Adversarial attacks
2. Uncertainty
3. Interpretability
87. Terry Taewoong Um (terry.t.um@gmail.com)
BAYESIAN APPROACHES
87
https://youtu.be/kvPmArtVoFE
• Posterior ∝ Prior * Likelihood
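The proportionality becomes an equality once the product is normalized. A minimal sketch on a discrete set of hypotheses follows; the coin example and its numbers are invented for illustration.

```python
def posterior(prior, likelihood):
    """Bayes' rule on a discrete hypothesis set: normalize prior * likelihood."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: v / evidence for h, v in unnormalized.items()}

# Two hypotheses about a coin, equally plausible before seeing any data.
prior = {"fair": 0.5, "biased": 0.5}
# Probability of observing 3 heads in a row under each hypothesis
# (the biased coin is assumed to land heads 90% of the time).
likelihood = {"fair": 0.5 ** 3, "biased": 0.9 ** 3}

post = posterior(prior, likelihood)
print(post)
```

The same update applies to network weights in Bayesian deep learning, except that the hypothesis space is continuous and the normalization is intractable, which is why approximations are needed.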
88. Terry Taewoong Um (terry.t.um@gmail.com)
GAUSSIAN PROCESS
88
https://youtu.be/kvPmArtVoFE
Beautiful, but not scalable!
89. Terry Taewoong Um (terry.t.um@gmail.com)
DROPOUT AS BAYESIAN
89
• Dropout: Randomly drop nodes
→ regularization
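One reading of "dropout as Bayesian" is Monte Carlo dropout (Gal & Ghahramani, 2016): keep dropout switched on at prediction time, run many stochastic forward passes, and treat the spread of the outputs as an uncertainty estimate. The sketch below applies that idea to a single linear unit; the weights, the input, and the sample count are hypothetical.

```python
import random
import statistics

def dropout_forward(x, weights, keep_prob, rng):
    """One stochastic forward pass of a linear unit with inverted dropout on its inputs."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep_prob:          # keep this input with probability keep_prob
            total += wi * xi / keep_prob      # rescale so the expected output is unchanged
    return total

def mc_dropout_predict(x, weights, keep_prob=0.5, n_samples=200, seed=0):
    """Mean and standard deviation over repeated stochastic forward passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, keep_prob, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)

x = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]    # hypothetical trained weights
mean, std = mc_dropout_predict(x, weights)
print(f"prediction = {mean:.2f} +/- {std:.2f}")
```

With `keep_prob=1.0` nothing is dropped, every pass returns the same value, and the estimated uncertainty collapses to zero.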
90. Terry Taewoong Um (terry.t.um@gmail.com)
BELIEVE OR NOT
90
green?
enemy?
1. Adversarial attacks
2. Uncertainty
3. Interpretability