Transfer learning
Keras & TensorFlow
Hichem Felouat
hichemfel@gmail.com
Transfer learning
Transfer learning (TL) is a research problem in machine learning (ML) that
focuses on storing knowledge gained while solving one problem and applying it
to a different but related problem. For example, knowledge gained while
learning to recognize cats could be applied when trying to recognize tigers.
Building Complex Models - Functional API
• The Keras functional API is a way to create models that is more flexible than the tf.keras.Sequential API.
• The functional API can handle models with non-linear topology, models with shared layers, and models with multiple inputs or outputs.
• A Wide & Deep architecture, for example, makes it possible for the neural network to learn both deep patterns (using the deep path) and simple rules (through the short path).
from tensorflow import keras

# Wide & Deep model: the input flows through two hidden layers (the deep
# path) and is also concatenated directly with their output (the short path).
input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.Concatenate()([input_, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.Model(inputs=[input_], outputs=[output])
Once you have built the Keras model, everything works exactly as before: you
must compile the model, train it, evaluate it, and use it to make predictions.
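A minimal sketch of those four steps (assuming X_train, y_train, X_valid, y_valid, X_test, and X_new are already prepared; the loss and optimizer choices are illustrative):

model.compile(loss="mse", optimizer="sgd")
history = model.fit(X_train, y_train, epochs=20,
                    validation_data=(X_valid, y_valid))
mse_test = model.evaluate(X_test, y_test)
y_pred = model.predict(X_new)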
What if you want to send a subset of the features through the wide path and a different subset (possibly overlapping) through the deep path?
• In this case, one solution is to use multiple inputs.
For example, suppose we want to send five features through the wide path (features 0 to 4) and six features through the deep path (features 2 to 7):

input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])

• You should name at least the most important layers.
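A minimal sketch of how the two input matrices could be sliced out of the full feature matrix (the slicing assumes at least 8 feature columns; the variable names match the fit() call below):

X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:8]  # features 0-4 and 2-7
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:8]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:8]
X_new_A, X_new_B = X_test_A[:3], X_test_B[:3]  # e.g., treat a few test rows as new instances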
We can compile the model as usual, but when we call the fit()
method, instead of passing a single input matrix X_train, we must
pass a pair of matrices (X_train_A, X_train_B): one per input.
The same is true for X_valid, and also for X_test and X_new when
you call evaluate() or predict():
history = model.fit((X_train_A, X_train_B), y_train, epochs=20,
                    validation_data=((X_valid_A, X_valid_B), y_valid))
model_evaluate = model.evaluate((X_test_A, X_test_B), y_test)
y_pred = model.predict((X_new_A, X_new_B))
Another use case for multiple outputs is regularization (i.e., a training
constraint whose objective is to reduce overfitting and thus improve the
model's ability to generalize). For example, you may want to add some
auxiliary outputs to a neural network architecture to ensure that the
underlying part of the network learns something useful on its own, without
relying on the rest of the network.
[...] # Same as before, up to the main output layer
output = keras.layers.Dense(1, name="main_output")(concat)
aux_output = keras.layers.Dense(1, name="aux_output")(hidden2)
model = keras.Model(inputs=[input_A, input_B],
                    outputs=[output, aux_output])
• Each output will need its own loss function. Therefore, when we compile the
model, we should pass a list of losses.
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")
• We care much more about the main output than about the auxiliary output,
so we want to give the main output’s loss a much greater weight.
• We need to provide labels for each output (in this example, we use the
same labels).
history = model.fit([X_train_A, X_train_B], [y_train, y_train], epochs=20,
                    validation_data=([X_valid_A, X_valid_B], [y_valid, y_valid]))

• When we evaluate the model, Keras will return the total loss, as well as all
the individual losses:

total_loss, main_loss, aux_loss = model.evaluate([X_test_A, X_test_B],
                                                 [y_test, y_test])
y_pred_main, y_pred_aux = model.predict([X_new_A, X_new_B])
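Since the output layers are named, the losses and loss weights can also be passed as dicts keyed by layer name, which is less error-prone than relying on list order (a minimal sketch):

model.compile(loss={"main_output": "mse", "aux_output": "mse"},
              loss_weights={"main_output": 0.9, "aux_output": 0.1},
              optimizer="sgd")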
Reusing Pretrained Layers
• It is generally not a good idea to train a very large DNN from scratch:
instead, you should always try to find an existing neural network that
accomplishes a task similar to the one you are trying to tackle, then reuse
the lower layers of this network.
• This technique is called transfer learning.
• It will not only speed up training considerably but also require significantly
less training data.
• The output layer of the original model should usually be replaced
because it is most likely not useful at all for the new task, and it may not
even have the right number of outputs for the new task.
• Transfer learning will work best when the inputs have similar low-level features.
• Try freezing all the reused layers first (i.e., make their weights non-trainable
so that Gradient Descent won’t modify them), then train your model and see
how it performs. Then try unfreezing one or two of the top hidden layers to
let backpropagation tweak them and see if performance improves.
# Load model A and reuse every layer except its output layer:
model_A = keras.models.load_model("my_model_A.h5")
model_B_on_A = keras.models.Sequential(model_A.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))

# Note: model_B_on_A now shares layers with model_A, so training it also
# affects model_A. If you want to avoid that, clone model_A first
# (clone_model() copies the architecture only, so copy the weights too):
model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())
• Because the new output layer was initialized randomly, it will make large errors at first.
• Freeze the reused layers during the first few epochs, giving the new layer
some time to learn reasonable weights.
# Freeze the reused layers and train for a few epochs:
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False
model_B_on_A.compile( ... )  # always recompile after (un)freezing layers
history = model_B_on_A.fit( ..., epochs=5, ... )

# Unfreeze the reused layers and keep training to fine-tune them:
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = True
model_B_on_A.compile( ... )
history = model_B_on_A.fit( ... )
model_B_on_A.evaluate( ... )
Pretraining on an Auxiliary Task
If you do not have much labeled training data, one last option is
to train a first neural network on an auxiliary task for which you
can easily obtain or generate labeled training data, then reuse the
lower layers of that network for your actual task. The first neural
network’s lower layers will learn feature detectors that will likely be
reusable by the second neural network.
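A minimal sketch of the idea (the auxiliary task, layer sizes, and variable names are illustrative assumptions):

# 1) Pretrain on an auxiliary task with plentiful labels (X_aux, y_aux):
aux_model = keras.models.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=[n_features]),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
aux_model.compile(loss="binary_crossentropy", optimizer="sgd")
aux_model.fit(X_aux, y_aux, epochs=10)

# 2) Reuse the pretrained lower layers for the actual task:
main_model = keras.models.Sequential(aux_model.layers[:-1])
main_model.add(keras.layers.Dense(1, activation="sigmoid"))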
CNN Variations

Over the years, variants of the CNN architecture have been developed,
leading to amazing advances in the field. A good measure of this progress is
the error rate in competitions such as the ILSVRC ImageNet challenge.

1. LeNet-5 (1998)
2. AlexNet (2012)
3. VGG-16 (2014)
4. Inception-v1
5. Inception-v3
6. ResNet-50
7. Xception (2016)
8. Inception-v4 (2016)
9. Inception-ResNets
10. ResNeXt-50 (2017)

https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d
CNN Variations - LeNet-5
• It was created by Yann LeCun in 1998 and has been widely used for
handwritten digit recognition (MNIST).
• This architecture has become the standard "template": stacking
convolutional and pooling layers and ending the network with one or more
fully connected layers.
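A minimal LeNet-style sketch in Keras, following that template (a modern approximation: ReLU and max pooling instead of the original tanh and average pooling):

model = keras.models.Sequential([
    keras.layers.Conv2D(6, 5, activation="relu", padding="same",
                        input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(16, 5, activation="relu"),
    keras.layers.MaxPooling2D(2),
    keras.layers.Flatten(),
    keras.layers.Dense(120, activation="relu"),
    keras.layers.Dense(84, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])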
CNN Variations - AlexNet
• The AlexNet CNN architecture won the 2012 ImageNet ILSVRC challenge.
• To reduce overfitting, the authors used two regularization techniques. First,
they applied dropout with a 50% dropout rate during training to the outputs
of layers F1 and F2. Second, they performed data augmentation.
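Both techniques are readily available in Keras; a minimal sketch (the parameter values here are illustrative, not AlexNet's exact settings):

# Dropout: randomly drops 50% of the previous layer's outputs during training.
model.add(keras.layers.Dropout(0.5))

# Data augmentation: generate randomly shifted/flipped variants of the images.
datagen = keras.preprocessing.image.ImageDataGenerator(
    width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)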
CNN Variations - VGG
• The runner-up in the ILSVRC 2014 challenge was VGG, developed by
Karen Simonyan and Andrew Zisserman from the Visual Geometry Group
(VGG) research lab at Oxford University.
• Two main variants: the VGG-16 and VGG-19 architectures.
CNN Variations - GoogLeNet (Inception)
• Inception modules allow GoogLeNet to use parameters much more
efficiently than previous architectures.
• This 22-layer architecture with 5M parameters is called Inception-v1.
• Having parallel towers of convolutions with different filters, followed by
concatenation, captures different features at the 1×1, 3×3, and 5×5 scales,
thereby "clustering" them.
• 1×1 convolutions are used for dimensionality reduction, removing
computational bottlenecks.
• Since each 1×1 convolution is followed by an activation function, it also
adds nonlinearity.
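A minimal sketch of one inception-style module in the Keras functional API (the filter counts are illustrative assumptions, not GoogLeNet's exact configuration):

def inception_module(x):
    # Four parallel towers; 1×1 convolutions reduce depth before the larger filters.
    b1 = keras.layers.Conv2D(64, 1, padding="same", activation="relu")(x)
    b2 = keras.layers.Conv2D(96, 1, padding="same", activation="relu")(x)
    b2 = keras.layers.Conv2D(128, 3, padding="same", activation="relu")(b2)
    b3 = keras.layers.Conv2D(16, 1, padding="same", activation="relu")(x)
    b3 = keras.layers.Conv2D(32, 5, padding="same", activation="relu")(b3)
    b4 = keras.layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = keras.layers.Conv2D(32, 1, padding="same", activation="relu")(b4)
    # Concatenate the towers along the channel axis:
    return keras.layers.Concatenate()([b1, b2, b3, b4])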
CNN Variations - GoogLeNet (Inception V3)
CNN Variations - ResNet-50
• The basic building blocks of ResNets are the convolutional and identity blocks.
• It uses skip connections (also called shortcut connections).
• If you add many skip connections, the network can start making progress
even if several layers have not started learning yet.
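A minimal sketch of an identity block built around a skip connection (simplified: the real ResNet-50 blocks use 1×1/3×3/1×1 bottlenecks with batch normalization; the filter counts are illustrative):

def identity_block(x, filters):
    shortcut = x  # the skip connection
    y = keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = keras.layers.Add()([y, shortcut])  # add the input back to the output
    return keras.layers.Activation("relu")(y)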
CNN Variations - Xception (2016)

CNN Variations - Inception-v4 (2016)

CNN Variations - Inception-ResNet-V2 (2016)

CNN Variations - ResNeXt-50 (2017)
Using Pretrained Models from Keras

In general, you won't have to implement standard models like GoogLeNet or
ResNet manually, since pretrained networks are readily available with a
single line of code in the keras.applications package.

For example, you can load the ResNet-50 model, pretrained on ImageNet,
with the following line of code:

model = keras.applications.resnet50.ResNet50(weights="imagenet")
Available models for image classification, with weights trained on ImageNet:
1) Xception
2) VGG16
3) VGG19
4) ResNet, ResNetV2
5) InceptionV3
6) InceptionResNetV2
7) MobileNet
8) MobileNetV2
9) DenseNet
10) NASNet
https://keras.io/applications/
model = keras.applications.resnet50.ResNet50(weights="imagenet")

# You first need to ensure that the images have the right size;
# ResNet-50 expects 224 × 224 images:
images_resized = tf.image.resize(images, [224, 224])

# Each model provides a preprocess_input() function. It assumes here that
# the pixel values have been scaled to the 0-1 range, hence the * 255:
inputs = keras.applications.resnet50.preprocess_input(images_resized * 255)

# Now we can use the pretrained model to make predictions:
Y_proba = model.predict(inputs)
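The 1,000-dimensional probability vectors can be mapped to human-readable class names with the decode_predictions() helper:

top_k = keras.applications.resnet50.decode_predictions(Y_proba, top=3)
for image_index in range(len(images)):
    print("Image #{}".format(image_index))
    for class_id, name, y_proba in top_k[image_index]:
        print("  {} - {}: {:.1f}%".format(class_id, name, y_proba * 100))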
import tensorflow as tf
from tensorflow import keras

model = keras.applications.vgg16.VGG16(weights=None)
# The model's summary() method displays all the model's layers:
model.summary()
include_top: whether or not to include the top layers of the network (True or False).
weights: one of None (random initialization) or 'imagenet' (pretrained on ImageNet).
Pretrained Models for Transfer Learning
• If you want to build an image classifier but you do not have
enough training data, then it is often a good idea to reuse the
lower layers of a pretrained model.
• For example, with the Xception model we exclude the top of the
network by setting include_top=False: this excludes the global
average pooling layer and the dense output layer. We then add
our own layers and finally create the Keras Model:
base_model = keras.applications.xception.Xception(weights="imagenet", include_top=False)
avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
output = keras.layers.Dense(n_classes, activation="softmax")(avg)
model = keras.Model(inputs=base_model.input, outputs=output)

# First, freeze the base model's layers and train only the new top layers:
for layer in base_model.layers:
    layer.trainable = False
optimizer = keras.optimizers.SGD(lr=0.2, momentum=0.9, decay=0.01)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
history = model.fit(train_set, epochs=5, validation_data=valid_set)

# Then unfreeze the base layers and fine-tune the whole model
# with a much lower learning rate:
for layer in base_model.layers:
    layer.trainable = True
optimizer = keras.optimizers.SGD(lr=0.01, momentum=0.9, decay=0.001)
model.compile(...)
history = model.fit(...)
Thank you for your attention

Link to the code:
https://github.com/hichemfelouat/my-codes-of-machine-learning/blob/master/Transfer_learning.py