Siamese Networks
for One-shot Learning
Masa Kato
Contents
An introduction to methods for one-shot learning that use siamese neural networks:
• Signature Verification using a "Siamese" Time Delay Neural Network (1993), NIPS
• Siamese Neural Networks for One-shot Image Recognition (2015), ICML
• Matching Networks for One Shot Learning (2016), NIPS
A proposal of my own idea for matching.
History of One-shot Learning
One-shot learning was first proposed by Fei-Fei et al. (2003) and Fei-Fei et al. (2006), who developed a
variational Bayesian framework.
Lake et al. (2013) proposed an algorithm based on a method called Hierarchical
Bayesian Program Learning.
Methods based on metric learning were proposed (Koch et al. (2015); Vinyals et al.
(2016)).
Methods based on neural networks with memory were proposed (Graves et al. (2014);
Santoro et al. (2016)).
There are also other general formulations and domain-specific studies.
One-shot object detection was proposed by Schwartz et al. (2018).
Recent Methods for One-shot Learning
using Neural Networks
1. Metric learning
2. Memory networks
Papers:
1. Koch et al. (2015)
2. Graves et al. (2014)
1+2. Vinyals et al. (2016)
The siamese network is often used.
• Siamese nets were first introduced by Bromley et al.
(1993) to solve signature verification as an image
matching problem.
• Koch et al. (2015) proposed Deep Siamese Networks
for one-shot image recognition.
• Vinyals et al. (2016) proposed Matching Nets, a model
that adds a memory network to Deep Siamese Networks
and formulates the task as a classification problem.
• Schwartz et al. (2018) applied existing methods to
one-shot object detection.
Siamese Network
A siamese network consists of two identical sub-networks joined at their outputs.
[Figure: Image A and Image B each pass through a stack of layers; the joined output computes a metric between A and B.]
More Detail of Basic Structure
[Figure: Image A and Image B are fed to two sub-networks with the same structure and weights.]
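A minimal sketch of the weight sharing described above. The single fully connected layer, the ReLU, the L1 distance and all function names are illustrative assumptions, not the architecture of any of the cited papers; the point is only that both inputs go through the same weights before a metric is computed.

```python
import random


def make_weights(n_in, n_out, seed=0):
    """Create one weight matrix; both twins reuse this same object."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def embed(x, weights):
    """One fully connected layer with a ReLU, standing in for a sub-network."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def siamese_distance(a, b, weights):
    """Pass both inputs through the same sub-network and compare the outputs."""
    h_a = embed(a, weights)  # first twin
    h_b = embed(b, weights)  # second twin, same weights
    return sum(abs(u - v) for u, v in zip(h_a, h_b))  # L1 distance (assumption)


weights = make_weights(n_in=4, n_out=3)
image_a = [0.2, 0.9, 0.1, 0.5]  # toy flattened "images"
image_b = [0.8, 0.1, 0.7, 0.3]

print(siamese_distance(image_a, image_a, weights))  # identical inputs give 0.0
print(siamese_distance(image_a, image_b, weights))
```

Because the weights are shared, the distance is symmetric in its two arguments and is zero whenever both inputs are the same.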
Signature Verification
using a "Siamese" Time Delay
Neural Network
Bromley et al. (1993)
• The aim of the project was to build a signature verification system based on the
NCR 5990 Signature Capture Device.
• A signature is represented as 800 sets of $x$, $y$ and pen up/down points, indexed by time $t$.
• The data are preprocessed before the network is trained.
Performance
GA: genuine signature pairs
• Correct pairs.
FR: forgeries
• Written to deceive.
The network classified the signatures and detected forgeries with good performance.
Siamese Neural Networks
for One-shot Image Recognition
Koch et al. (2015)
• Siamese nets were first introduced by Bromley et al. (1993) to solve signature
verification as an image matching problem.
• Koch et al. (2015) used a deep convolutional neural network to extract features from
the images before computing the distance between them.
Deep Siamese Networks
• The model is a siamese convolutional network with $L$ layers, each with $N_l$ units, where
$h_{1,l}$ represents the hidden vector in layer $l$ for the first twin, and $h_{2,l}$ denotes the same
for the second twin.
• ReLU units are used in the first $L - 2$ layers and sigmoidal units in the remaining layers.
[Figure: Image A and Image B pass through the twin networks, followed by a distance metric.]
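Learning
$M$: minibatch size.
$i$: indexes the $i$-th minibatch.
$\mathbf{y}(x_1^{(i)}, x_2^{(i)})$: length-$M$ vector which contains the labels for the minibatch.
• $\mathbf{y}(x_1^{(i)}, x_2^{(i)}) = 1$ whenever $x_1$ and $x_2$ are from the same class.
• $\mathbf{y}(x_1^{(i)}, x_2^{(i)}) = 0$ otherwise.
Regularized cross-entropy objective on a binary classifier:
$$\mathcal{L}(x_1^{(i)}, x_2^{(i)}) = \mathbf{y}(x_1^{(i)}, x_2^{(i)}) \log \mathbf{p}(x_1^{(i)}, x_2^{(i)}) + \left(1 - \mathbf{y}(x_1^{(i)}, x_2^{(i)})\right) \log\left(1 - \mathbf{p}(x_1^{(i)}, x_2^{(i)})\right) + \boldsymbol{\lambda}^T |\mathbf{w}|^2$$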
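A small sketch of how this objective might be evaluated for one minibatch. Several choices here are assumptions rather than anything stated on the slide: the cross-entropy term is negated so that the result is a loss to minimize, the per-pair terms are averaged over the minibatch, and a single scalar `lam` replaces the per-layer vector $\boldsymbol{\lambda}$. The function name is hypothetical.

```python
import math


def regularized_cross_entropy(y, p, w, lam):
    """Mean binary cross-entropy over a minibatch plus an L2 weight penalty.

    y: labels (1 = same class, 0 = different class), one per pair
    p: predicted same-class probabilities, one per pair
    w: flat list of weights; lam: scalar penalty (assumption: one shared value)
    """
    eps = 1e-12  # keeps log() finite when p is exactly 0 or 1
    ce = 0.0
    for y_i, p_i in zip(y, p):
        p_i = min(max(p_i, eps), 1.0 - eps)
        ce -= y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i)
    penalty = lam * sum(w_j * w_j for w_j in w)
    return ce / len(y) + penalty


y = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]  # predictions that agree with the labels
bad = [0.1, 0.9, 0.2, 0.8]   # predictions that disagree with the labels
w = [0.5, -0.25]

print(regularized_cross_entropy(y, good, w, lam=0.01))
print(regularized_cross_entropy(y, bad, w, lam=0.01))
```

Predictions that agree with the labels give the smaller loss, and the penalty term grows with the squared size of the weights.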
Dataset
Dataset: Omniglot
1623 characters from 50 different alphabets (40 train, 10 test).
Each character was drawn by hand by 20 different people.
The number of letters in each alphabet varies considerably, from about 15 to
upwards of 40 characters.
N-way k-shot learning
This problem setting is often used in one-shot learning.
• Pick $N$ classes.
• Use $k$ training examples per class.
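The setting above can be sketched as a sampling routine. The function name, the `n_query` parameter for held-out test examples, and the toy data are all illustrative assumptions; the slides only specify picking $N$ classes and using $k$ training examples.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, seed=None):
    """Draw one N-way k-shot episode from a {class_name: [examples]} dict."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(data_by_class), n_way)  # pick N classes
    support, query = [], []
    for label in classes:
        picked = rng.sample(data_by_class[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]  # k labelled examples
        query += [(x, label) for x in picked[k_shot:]]    # held out for testing
    return support, query


# Toy stand-in for a dataset: 5 classes with 4 example ids each.
toy = {c: [f"{c}_{j}" for j in range(4)] for c in "abcde"}
support, query = sample_episode(toy, n_way=3, k_shot=1, n_query=2, seed=0)
print(support)
print(query)
```

With `k_shot=1` this is the one-shot case: the classifier sees a single labelled example of each of the $N$ classes before it is tested.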
Experiments
[Results table not recovered; the slide's annotations follow.]
The number of samples
Data augmentation
• Use 20 alphabets from 50 (excluding the previous 30 alphabets) and 1 data from 20.
• Use 30 alphabets from 50 and 12 data from 20.
For fine tuning
Matching Networks
for One Shot Learning
Vinyals et al. (2016)
[Figure: comparison of two architectures.
Siamese Neural Networks for One-shot Image Recognition: Image A and Image B each pass through a layer, and a metric is computed between them.
Matching Networks for One Shot Learning: Image A is compared with Images A, B and C through an attention layer, followed by classification.]
Concepts
Parametric models (Deep Learning)
➕ Excellent generalization.
➖ Learning is slow and based on large datasets, requiring many weight updates using SGD.
Non-parametric models
➕ Allow novel examples to be assimilated rapidly.
➖ Some models in this family do not require any training, but performance depends on the chosen metric.
Matching Nets incorporate the characteristics of both parametric and non-parametric models:
rapid acquisition of new examples while providing excellent generalization from common examples.
1. Propose Matching Nets, a neural network which uses recent advances in attention and
memory that enable rapid learning.
2. The training procedure is based on a simple machine learning principle: test and train
conditions must match.
Model Architecture
• A neural attention mechanism is defined to access a memory matrix which
stores useful information to solve the task at hand.
$k$ examples of image-label pairs $S = \{(x_i, y_i)\}_{i=1}^{k}$.
A classifier $c_S(x)$ which defines a probability distribution over outputs $y$ given a
test example $x$.
Define the mapping $S \to c_S(x)$ to be $P(y \mid x, S)$,
where $P$ is parametrized by a neural network.
Model Architecture
• The model computes $y$ as follows:
$$y = \sum_{i=1}^{k} a(x, x_i)\, y_i$$
where $x_i, y_i$ are the samples and labels from the support set $S = \{(x_i, y_i)\}_{i=1}^{k}$, and $a$ is
an attention mechanism, which is discussed on the next slide.
If there is only one image per class, it is one-shot learning.
[Figure: the support set $S = \{(x_i, y_i)\}_{i=1}^{k}$ and a test example $x$ are combined through $a(x, x_i)$ to produce the output $y$.]
Formulation and Learning
The Attention Kernel
The algorithm relies on choosing $a(\cdot, \cdot)$, the attention mechanism.
The simplest form is to use a softmax over the cosine distance $c$, i.e.,
$$a(x, x_i) = \frac{e^{c(f(x), g(x_i))}}{\sum_{j=1}^{k} e^{c(f(x), g(x_j))}}$$
with embedding functions $f$ and $g$ being appropriate neural networks to embed
$x$ and $x_i$.
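A sketch of the attention kernel and the prediction rule built on it. It takes $c$ to be the cosine similarity (an assumption, chosen so that closer support examples receive larger weights), treats the outputs of $f$ and $g$ as already computed vectors, and encodes labels as class indices that are summed as one-hot vectors. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def attention(x_emb, support_embs):
    """Softmax over cosine similarities between the query and each support item."""
    scores = [cosine(x_emb, s) for s in support_embs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x_emb, support_embs, support_labels, n_classes):
    """Attention-weighted sum of one-hot support labels."""
    probs = [0.0] * n_classes
    for a_i, y_i in zip(attention(x_emb, support_embs), support_labels):
        probs[y_i] += a_i
    return probs


# Embeddings are given directly here; in the model they come from f and g.
support_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
support_labels = [0, 1, 2]
probs = predict([0.9, 0.1], support_embs, support_labels, n_classes=3)
print(probs)
```

The attention weights sum to one, so the output is a probability distribution over the classes present in the support set, and the class of the most similar support example receives the largest share.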
Definition
$L$: a possible label set.
• $L$ could be the label set $\{cats, dogs\}$.
$T$: a distribution over $L$. This is the training data.
Learning Step
1. Sample $L$ from $T$.
2. Sample $S$ and $B$ from $L$.
3. Minimize the error in predicting the labels in the batch $B$, conditioned on the
support set $S$.
Objective Function
$$\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[ \mathbb{E}_{S \sim L,\, B \sim L}\left[ \sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S) \right] \right]$$
This simulates the task of one-shot learning using only the training data.
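The learning steps can be sketched as a Monte Carlo estimate of the objective. Everything model-specific here is a stand-in: `toy_prob` is a hypothetical replacement for $P_\theta(y \mid x, S)$ that works on scalar features and has no trainable parameters, and the episode sizes (two classes, one support and one batch example per class) are arbitrary assumptions. A real implementation would adjust $\theta$ to increase this quantity.

```python
import math
import random


def toy_prob(y, x, support):
    """Hypothetical stand-in for P_theta(y | x, S): softmax over -|x - x_i|."""
    weights = [math.exp(-abs(x - x_i)) for x_i, _ in support]
    total = sum(weights)
    return sum(w for w, (_, y_i) in zip(weights, support) if y_i == y) / total


def episode_log_likelihood(support, batch, prob=toy_prob):
    """Inner term of the objective: sum of log P(y | x, S) over the batch B."""
    return sum(math.log(prob(y, x, support)) for x, y in batch)


def estimate_objective(task_data, n_episodes, rng):
    """Monte Carlo estimate of the expectation over L ~ T and S, B ~ L."""
    total = 0.0
    for _ in range(n_episodes):
        label_set = rng.sample(sorted(task_data), 2)     # 1. sample L from T
        support, batch = [], []
        for label in label_set:                          # 2. sample S and B from L
            xs = rng.sample(task_data[label], 2)
            support.append((xs[0], label))
            batch.append((xs[1], label))
        total += episode_log_likelihood(support, batch)  # 3. score B given S
    return total / n_episodes


# Toy training data: each class is a cluster of scalar features.
task_data = {"cats": [0.0, 0.1, 0.2], "dogs": [5.0, 5.1, 5.2], "birds": [9.0, 9.1, 9.2]}
print(estimate_objective(task_data, n_episodes=100, rng=random.Random(0)))
```

The estimate is a sum of log-probabilities, so it is never positive; it approaches zero as the classifier assigns more probability to the correct labels in each batch.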
Experiments
N-way k-shot learning
• Pick $N$ unseen character classes, independent of alphabet, as $L$.
• Provide the model with one drawing of each of the $N$
characters as $S \sim L$ and a batch $B \sim L$.
Objective Function
$$\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[ \mathbb{E}_{S \sim L,\, B \sim L}\left[ \sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S) \right] \right]$$
Experiments
Compared methods:
• Pixels: nearest neighbor on raw pixels.
• Baseline: nearest neighbor on features computed with a CNN.
• Convolutional siamese net: "Siamese Neural Networks for One-shot
Image Recognition".
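The nearest-neighbor comparison methods can be sketched as follows. The squared Euclidean distance and the function name are assumptions, since the slides do not say which distance was used; for the CNN baseline, the pixel lists would be replaced by feature vectors.

```python
def nearest_neighbor(query, support):
    """Return the label of the support image closest to the query in L2 distance."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    best_image, best_label = min(support, key=lambda pair: sq_dist(query, pair[0]))
    return best_label


# Flattened 2x2 "images" as pixel lists; one support example per class.
support = [([0, 0, 1, 1], "A"), ([1, 1, 0, 0], "B"), ([1, 0, 1, 0], "C")]
print(nearest_neighbor([0.1, 0.0, 0.9, 1.0], support))  # prints A
```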
[Results table not recovered; one of its axes is the number of classes.]
References
Slides: https://www.slideshare.net/masa_s/dlmatching-networks-for-one-shot-learning-71539566
Blog: https://sorenbouma.github.io/blog/oneshot/
Papers:
• Signature Verification using a "Siamese" Time Delay Neural Network (1993), NIPS
• DeepFace: Closing the Gap to Human-Level Performance in Face Verification (2014), IEEE
• Siamese Neural Networks for One-shot Image Recognition (2015), ICML
• Matching Networks for One Shot Learning (2016), NIPS
• RepMet: Representative-based metric learning for classification and one-shot object detection (2018), arXiv
Siamese Nets for One-shot Learning

  • 1. Siamese Networks for One-shot Learning. Masa Kato
  • 2. Contents: Introduction to methods for one-shot learning using siamese neural networks. • Signature Verification using a “Siamese” Time Delay Neural Network (1993), NIPS • Siamese Neural Networks for One-shot Image Recognition (2015), ICML • Matching Networks for One Shot Learning (2016), NIPS. Finally, I propose my own idea for matching.
  • 3. History of One-shot Learning: One-shot learning was first proposed by Fei-Fei et al. (2003, 2006), who developed a variational Bayesian framework. Lake et al. (2013) proposed an algorithm based on a method called Hierarchical Bayesian Program Learning. Methods based on metric learning were proposed (Koch et al. (2015); Vinyals et al. (2016)). Methods based on neural networks with memory were proposed (Graves et al. (2014); Santoro et al. (2016)). There are also other general formulations and domain-specific studies. One-shot object detection was proposed in Schwartz et al. (2018).
  • 4. Recent Methods for One-shot Learning using Neural Networks: 1. Metric learning 2. Memory networks. Papers: 1. Koch et al. (2015) 2. Graves et al. (2014) 1+2. Vinyals et al. (2016). The siamese network is often used. • Siamese nets were first introduced by Bromley et al. (1993) to solve signature verification as an image matching problem. • Koch et al. (2015) proposed Deep Siamese Networks for one-shot image recognition. • Vinyals et al. (2016) proposed Matching Nets, a model that incorporates a memory network into Deep Siamese Networks and formulates the task as a classification problem. • Schwartz et al. (2018) applied existing methods to one-shot object detection.
  • 5. Siamese Network: A siamese network consists of two identical sub-networks joined at their outputs. [Figure: Image A and Image B each pass through a stack of layers; the joined output computes a metric between A and B.]
  • 6. More Detail of Basic Structure: [Figure: Image A and Image B are processed by two sub-networks with the same structure and weights.]
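The weight sharing described in slides 5 and 6 can be sketched in a few lines. This is only an illustration, not the architecture of any of the cited papers: the one-layer `embed` function, the `tanh` activation, the Euclidean metric, and the toy `weights` and inputs are all assumptions made for the example.

```python
import math

def embed(x, weights):
    """Hypothetical one-layer sub-network: tanh of a weighted sum per unit."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def siamese_distance(a, b, weights):
    """Run both inputs through the SAME weights, then compare the outputs."""
    h_a = embed(a, weights)  # first twin
    h_b = embed(b, weights)  # second twin, identical structure and weights
    # Euclidean distance is an assumed choice of metric for this sketch.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(h_a, h_b)))

# Toy values, chosen only for illustration.
weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
image_a = [1.0, 0.0, 2.0]
image_b = [1.0, 0.0, 2.0]   # identical to image_a
image_c = [0.0, 3.0, -1.0]  # a different input

print(siamese_distance(image_a, image_b, weights))
print(siamese_distance(image_a, image_c, weights))
```

Because both twins use the same `weights`, identical inputs always map to the same point and the metric is symmetric in its two arguments.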
  • 7. Signature Verification using a “Siamese” Time Delay Neural Network (Bromley et al. (1993)) • The aim of the project was to build a signature verification system based on the NCR 5990 Signature Capture Device. • A signature is 800 sets of 𝑥, 𝑦 and pen up-down points with time 𝑡. • The data are preprocessed before training the network.
  • 8. Performance: GA: genuine signature pairs (correct pairs). FR: forgeries (written to deceive). The network classified the signatures and detected the forgeries with good performance.
  • 9. Siamese Neural Networks for One-shot Image Recognition (Koch et al. (2015)) • Siamese nets were first introduced by Bromley et al. (1993) to solve signature verification as an image matching problem. • Koch et al. (2015) used a deep convolutional neural network to extract features from the images before calculating the distance between them.
  • 10. Deep Siamese Networks • The model is a siamese convolutional network with $L$ layers, each with $N_l$ units, where $h_{1,l}$ represents the hidden vector in layer $l$ for the first twin, and $h_{2,l}$ denotes the same for the second twin. • ReLU units are used in the first $L - 2$ layers and sigmoidal units in the remaining layers. [Figure: Image A and Image B pass through the twin networks into a distance metric.]
  • 11. Learning: $M$: minibatch size. $i$: index of the $i$-th minibatch. $y(x_1^{(i)}, x_2^{(i)})$: length-$M$ vector which contains the labels for the minibatch. • $y(x_1^{(i)}, x_2^{(i)}) = 1$ whenever $x_1$ and $x_2$ are from the same class. • $y(x_1^{(i)}, x_2^{(i)}) = 0$ otherwise. Regularized cross-entropy objective on a binary classifier: $\mathcal{L}(x_1^{(i)}, x_2^{(i)}) = y(x_1^{(i)}, x_2^{(i)}) \log p(x_1^{(i)}, x_2^{(i)}) + (1 - y(x_1^{(i)}, x_2^{(i)})) \log(1 - p(x_1^{(i)}, x_2^{(i)})) + \lambda^T |w|^2$
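The regularized cross-entropy on slide 11 can be illustrated for a single pair. This is a minimal sketch under stated assumptions, not the authors' training code: the function name `pair_loss` is hypothetical, the cross-entropy term is written with the conventional negative sign so that the quantity can be minimized, and the per-layer vector λ is simplified to a single scalar `lam`.

```python
import math

def pair_loss(p, y, weights, lam):
    """Cross-entropy for one pair plus an L2 penalty on the weights.

    p: predicted probability that the two inputs share a class
    y: 1 if they are from the same class, 0 otherwise
    weights: flat list of model weights (hypothetical, for the penalty term)
    lam: scalar regularization strength (the slide uses a per-layer vector)
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite at p = 0 or p = 1
    # Negated log-likelihood: an assumed sign convention for minimization.
    cross_entropy = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    l2_penalty = lam * sum(w * w for w in weights)
    return cross_entropy + l2_penalty

# A confident correct prediction costs less than a confident wrong one.
print(pair_loss(0.9, 1, [], 0.0))
print(pair_loss(0.1, 1, [], 0.0))
```

For a minibatch, the same quantity would be averaged or summed over the $M$ pairs.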
  • 12. Dataset: Omniglot. 1623 characters from 50 different alphabets (40 train, 10 test). Each of these was hand-drawn by 20 different people. The number of letters in each alphabet varies considerably, from about 15 to upwards of 40 characters.
  • 13. N-way k-shot learning: a problem setting which is often used in one-shot learning. • Pick 𝑁 classes. • Use 𝑘 training examples per class.
  • 14. Experiments: [Table: results by number of samples and data augmentation.] Annotations: • Use 20 alphabets out of 50 (excluding the previous 30 alphabets) and 1 sample out of 20. • Use 30 alphabets out of 50 and 12 samples out of 20. • For fine-tuning.
  • 15. Matching Networks for One Shot Learning (Vinyals et al. (2016)): [Figure: architecture comparison of Siamese Neural Networks for One-shot Image Recognition and Matching Networks for One Shot Learning. Labels: Image A, Image B, Image C, Layer, Computes metric, Attention, Classification Layer.]
  • 16. Concepts: Parametric models (deep learning): ➕ Excellent generalization. ➖ Learning is slow and based on large datasets, requiring many weight updates using SGD. Non-parametric models: ➕ Novel examples can be assimilated. ➖ Some models in this family do not require any training, but performance depends on the chosen metric. Goal: incorporate the characteristics of both parametric and non-parametric models, giving rapid acquisition of new examples while providing excellent generalization from common examples. 1. Propose Matching Nets, a neural network which uses recent advances in attention and memory that enable rapid learning. 2. The training procedure is based on a simple machine learning principle: test and train conditions must match.
  • 17. Model Architecture • A neural attention mechanism is defined to access a memory matrix which stores useful information to solve the task at hand. • The support set is $k$ examples of image-label pairs, $S = \{(x_i, y_i)\}_{i=1}^{k}$. • A classifier $c_S(x)$ defines a probability distribution over outputs $y$ given a test example $x$. • The mapping $S \to c_S(x)$ is defined to be $P(y \mid x, S)$, where $P$ is parametrized by a neural network.
  • 18. Model Architecture • The model computes $y$ as follows: $y = \sum_{i=1}^{k} a(x, x_i)\, y_i$, where $x_i, y_i$ are the samples and labels from the support set $S = \{(x_i, y_i)\}_{i=1}^{k}$, and $a$ is an attention mechanism which is discussed in the next slide. If there is only one image per class, it is one-shot learning. [Figure: the support set $S$ and the test example $x$ are combined through $a(x, x_i)$ to produce $y$.]
  • 19. Formulation and Learning: The Attention Kernel. The algorithm relies on choosing $a(\cdot, \cdot)$, the attention mechanism. The simplest form is to use softmax over the cosine distance $c$, i.e., $a(x, x_i) = e^{c(f(x), g(x_i))} / \sum_{j=1}^{k} e^{c(f(x), g(x_j))}$, with embedding functions $f$ and $g$ being appropriate neural networks to embed $x$ and $x_i$.
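The prediction rule $y = \sum_i a(x, x_i)\, y_i$ and the softmax attention kernel can be sketched together. This is a minimal illustration with several assumptions: the embedding functions $f$ and $g$ are taken to be the identity (inputs are treated as already embedded), $c$ is computed as cosine similarity, labels are one-hot lists, and the function names and toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attention(x, support_xs):
    """Softmax over cosine scores between the query x and each support x_i."""
    scores = [cosine(x, x_i) for x_i in support_xs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, support_xs, support_ys):
    """Attention-weighted sum of the one-hot support labels."""
    a = attention(x, support_xs)
    n_classes = len(support_ys[0])
    return [sum(a_i * y_i[c] for a_i, y_i in zip(a, support_ys))
            for c in range(n_classes)]

# Toy 2-way 1-shot example: one support vector per class.
support_xs = [[1.0, 0.0], [0.0, 1.0]]
support_ys = [[1, 0], [0, 1]]
query = [0.9, 0.1]
y_hat = predict(query, support_xs, support_ys)
print(y_hat)
```

The output is a distribution over the classes in the support set, so the predicted class is simply the index with the largest value.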
  • 20. Definition: $L$: a possible label set, e.g., $L$ could be the label set {cats, dogs}. $T$: distribution over $L$; this is the training data. Learning steps: 1. Sample $L$ from $T$. 2. Sample $S$ and $B$ from $L$. 3. Minimize the error predicting the labels in the batch $B$ conditioned on the support set $S$. Objective function: $\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[\mathbb{E}_{S \sim L, B \sim L}\left[\sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S)\right]\right]$. This simulates the task of one-shot learning using only the training data.
  • 21. Experiments: N-way k-shot learning • Pick $N$ unseen character classes, independent of alphabet, as $L$. • Provide the model with one drawing of each of the $N$ characters as $S \sim L$ and a batch $B \sim L$. Objective function: $\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[\mathbb{E}_{S \sim L, B \sim L}\left[\sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S)\right]\right]$
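The sampling steps behind the objective (sample a label set $L$, then a support set $S$ and a batch $B$ from it) can be sketched as follows. This is an illustration only: the function `sample_episode`, its parameters, and the toy dataset of string placeholders are hypothetical, and uniform sampling of classes stands in for the distribution $T$.

```python
import random

def sample_episode(data, n_way, k_shot, batch_per_class, rng):
    """Sample one N-way k-shot episode from data (dict: label -> examples)."""
    # Step 1: sample the label set L (uniform sampling is an assumption).
    classes = rng.sample(sorted(data), n_way)
    support, batch = [], []
    for label in classes:
        # Step 2: draw disjoint examples for the support set S and batch B.
        examples = rng.sample(data[label], k_shot + batch_per_class)
        support += [(x, label) for x in examples[:k_shot]]
        batch += [(x, label) for x in examples[k_shot:]]
    return classes, support, batch

# Toy dataset: 5 character classes with 4 drawings each (placeholder strings).
toy_data = {
    f"char_{c}": [f"char_{c}_drawing_{i}" for i in range(4)] for c in range(5)
}
rng = random.Random(0)
classes, support, batch = sample_episode(
    toy_data, n_way=3, k_shot=1, batch_per_class=2, rng=rng
)
print(classes)
print(support)
print(batch)
```

Step 3 would then score each `(x, y)` in `batch` with $\log P_{\theta}(y \mid x, S)$ using `support` as $S$, and update $\theta$ to increase the sum.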
  • 22. Experiments: compared methods. • Pixels: nearest neighbor on raw pixels. • Baseline: nearest neighbor on features calculated with a CNN. • Convolutional siamese net: the model from “Siamese Neural Networks for One-shot Image Recognition”. [Table: results by number of classes.]
  • 23. References: Slides: https://www.slideshare.net/masa_s/dlmatching-networks-for-one-shot-learning-71539566 Blog: https://sorenbouma.github.io/blog/oneshot/ Papers: • Signature Verification using a “Siamese” Time Delay Neural Network (1993), NIPS • DeepFace: Closing the Gap to Human-Level Performance in Face Verification (2014), IEEE • Siamese Neural Networks for One-shot Image Recognition (2015), ICML • Matching Networks for One Shot Learning (2016), NIPS • RepMet: Representative-based metric learning for classification and one-shot object detection (2018), arXiv