This document discusses methods for one-shot learning using siamese neural networks. It provides an overview of several key papers in this area: signature verification with a siamese time delay neural network (1993), siamese networks for one-shot image recognition (2015), and matching networks for one-shot learning (2016). Matching networks incorporate an attention mechanism into a neural network to learn rapidly from small datasets by matching training and test conditions. The document also reviews experiments demonstrating one-shot and few-shot learning on datasets such as Omniglot using these siamese and matching network approaches.
2. Contents
Introduction to methods for one-shot learning using siamese neural networks.
• Signature Verification using a "Siamese” Time Delay Neural Network (1993), NIPS
• Siamese Neural Networks for One-shot Image Recognition (2015), ICML
• Matching Networks for One Shot Learning (2016), NIPS
Proposal of my own idea for matching.
3. History of One-shot Learning
One-shot learning was first proposed by Fei-Fei et al. (2003); Fei-Fei et al. (2006), who developed a variational Bayesian framework.
Lake et al. (2013) proposed an algorithm based on a method called Hierarchical Bayesian Program Learning.
Methods based on metric learning were proposed (Koch et al. (2015); Vinyals et al. (2016)).
Methods based on neural networks with memory were proposed (Graves et al. (2014); Santoro et al. (2016)).
There are also other general formulations and domain-specific studies.
One-shot object detection was proposed in Schwartz et al. (2018).
4. Recent Methods for One-shot Learning
using Neural Networks
1. Metric learning: Koch et al. (2015)
2. Memory networks: Graves et al. (2014)
1+2. Both combined: Vinyals et al. (2016)
The siamese network is often used.
• Siamese nets were first introduced by Bromley et al. (1993) to solve signature verification as an image matching problem.
• Koch et al. (2015) proposed Deep Siamese Networks for one-shot image recognition.
• Vinyals et al. (2016) proposed Matching Nets, a model that incorporates a memory network into Deep Siamese Networks and formulates the task as a classification problem.
• Schwartz et al. (2018) applied existing methods to one-shot object detection.
5. Siamese Network
A siamese network consists of two identical sub-networks joined at their outputs.
[Figure: Image A and Image B each pass through identical layers; the network computes a metric between A and B.]
6. More Detail of Basic Structure
[Figure: Image A and Image B are processed by twin sub-networks with the same structure and weights.]
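To make the shared structure concrete, here is a minimal PyTorch sketch. The embedding layers and the 105x105 input size (Omniglot's image size) are illustrative assumptions, not taken from any of the papers; the point is that one module applied to both images gives twins that share structure and weights by construction.

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Two identical sub-networks joined at their outputs."""
    def __init__(self):
        super().__init__()
        # A single embedding module: applying it to both inputs yields
        # two twins that share structure and weights by construction.
        self.embed = nn.Sequential(
            nn.Flatten(),
            nn.Linear(105 * 105, 256), nn.ReLU(),
            nn.Linear(256, 64),
        )

    def forward(self, image_a, image_b):
        h_a = self.embed(image_a)
        h_b = self.embed(image_b)
        # Metric between A and B: component-wise L1 distance, summed.
        return torch.abs(h_a - h_b).sum(dim=1)
```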
7. Signature Verification
using a “Siamese” Time Delay
Neural Network
• The aim of the project was to make a signature verification system based on the
NCR 5990 Signature Capture Device.
• A signature is 800 sets of $x$, $y$ and pen up-down points sampled over time $t$.
• Preprocess the data before training the network.
Bromley et al. (1993)
8. Performance
GA: genuine signature pairs
• Correct pairs.
FR: forgery pairs
• Written to deceive.
The network classified signatures and detected forgeries with good performance.
9. Siamese Neural Networks
for One-shot Image Recognition
• Siamese nets were first introduced by Bromley et al. (1993) to solve signature verification as an image matching problem.
• Koch et al. (2015) used a deep convolutional neural network to extract image features before computing the distance between them.
Koch et al. (2015)
10. Deep Siamese Networks
• The model is a siamese convolutional network with $L$ layers, each with $N_l$ units, where $h_{1,l}$ represents the hidden vector in layer $l$ for the first twin, and $h_{2,l}$ denotes the same for the second twin.
• ReLU units in the first $L-2$ layers and sigmoidal units in the remaining layers.
[Figure: Image A and Image B pass through the twins; a distance metric is computed between their outputs.]
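A sketch of that activation pattern, using fully connected layers with hypothetical widths purely for illustration (the actual model is convolutional):

```python
import torch.nn as nn

# Hypothetical layer widths N_l; the activation pattern follows the slide:
# ReLU units in the first L-2 layers, sigmoidal units in the remaining layers.
def build_twin(sizes=(105 * 105, 1024, 1024, 4096)):
    layers = [nn.Flatten()]
    for l in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[l], sizes[l + 1]))
        is_last = l == len(sizes) - 2
        layers.append(nn.Sigmoid() if is_last else nn.ReLU())
    return nn.Sequential(*layers)
```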
11. Learning
$M$: minibatch size.
$i$: indexes the $i$-th minibatch.
$y(x_1^{(i)}, x_2^{(i)})$: length-$M$ vector which contains the labels for the minibatch.
• $y(x_1^{(i)}, x_2^{(i)}) = 1$ whenever $x_1$ and $x_2$ are from the same class.
• $y(x_1^{(i)}, x_2^{(i)}) = 0$ otherwise.
Regularized cross-entropy objective on a binary classifier:
$\mathcal{L}(x_1^{(i)}, x_2^{(i)}) = y(x_1^{(i)}, x_2^{(i)}) \log p(x_1^{(i)}, x_2^{(i)}) + (1 - y(x_1^{(i)}, x_2^{(i)})) \log(1 - p(x_1^{(i)}, x_2^{(i)})) + \lambda^\top |w|^2$
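A minimal sketch of this objective in PyTorch, assuming p is the network's predicted same-class probability for each pair; the cross-entropy terms are negated so the value can be minimized, and a single scalar lam stands in for the per-weight $\lambda$ vector:

```python
import torch

def pair_loss(p, y, weights, lam=1e-4):
    """Regularized cross-entropy for one minibatch of image pairs.

    p: predicted probability that each pair is from the same class, shape (M,).
    y: 1.0 for same-class pairs, 0.0 otherwise, shape (M,).
    weights: iterable of weight tensors for the lambda |w|^2 penalty.
    """
    eps = 1e-7  # numerical safety for the logs
    # The slide states the log-likelihood; it is negated here so that
    # minimizing this value maximizes that objective.
    ce = -(y * torch.log(p + eps) + (1 - y) * torch.log(1 - p + eps)).mean()
    l2 = sum((w ** 2).sum() for w in weights)
    return ce + lam * l2
```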
12. Dataset
Dataset: Omniglot
1623 characters from 50 different alphabets (40 train, 10 test).
Each character was drawn by hand by 20 different people.
The number of letters in each alphabet varies considerably, from about 15 to upwards of 40 characters.
13. N-way k-shot learning
This is a problem setting often used in one-shot learning; a sampling sketch follows below.
• Pick $N$ classes.
• Use $k$ training examples per class.
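A minimal sketch of sampling one N-way k-shot task; data_by_class is a hypothetical dict mapping each class label to its list of examples:

```python
import random

def sample_task(data_by_class, n_way=5, k_shot=1):
    """Pick N classes, then k training examples from each."""
    classes = random.sample(sorted(data_by_class), n_way)
    return {c: random.sample(data_by_class[c], k_shot) for c in classes}
```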
14. Experiments
• Training: 30 of the 50 alphabets and 12 of the 20 drawings per character, with data augmentation to increase the number of samples.
• Fine-tuning: 20 of the 50 alphabets (excluding the previous 30 alphabets) and 1 of the 20 drawings per character.
15. Matching Networks for One Shot Learning
[Figure: the siamese net of "Siamese Neural Networks for One-shot Image Recognition" passes Image A and Image B through twin layers and computes a metric; Matching Networks pass a query image and several support images (A, B, C) through layers with attention and output a classification.]
Vinyals et al. (2016)
16. Concepts
Parametric models (Deep Learning):
➕ Excellent generalization.
➖ Learning is slow and based on large datasets, requiring many weight updates using SGD.
Non-parametric models:
➕ Novel examples are rapidly assimilated.
➖ Some models in this family do not require any training, but performance depends on the chosen metric.
Matching Nets incorporate the characteristics of both parametric and non-parametric models: rapid acquisition of new examples while providing excellent generalization from common examples.
1. Propose Matching Nets, a neural network which uses recent advances in attention and memory that enable rapid learning.
2. The training procedure is based on a simple machine learning principle: test and train conditions must match.
17. Model Architecture
• A neural attention mechanism is defined to access a memory matrix which stores useful information to solve the task at hand.
Given $k$ examples of image-label pairs $S = \{(x_i, y_i)\}_{i=1}^{k}$, define a classifier $c_S(\hat{x})$ which gives a probability distribution over outputs $\hat{y}$ for a test example $\hat{x}$.
Define the mapping $S \to c_S(\hat{x})$ to be $P(\hat{y} \mid \hat{x}, S)$, where $P$ is parametrized by a neural network.
18. Model Architecture
• The model computes the output $\hat{y}$ as follows:
$\hat{y} = \sum_{i=1}^{k} a(\hat{x}, x_i)\, y_i$
where $x_i, y_i$ are the samples and labels from the support set $S = \{(x_i, y_i)\}_{i=1}^{k}$, and $a$ is an attention mechanism which is discussed on the next slide.
• If the support set contains only one image per class, this is one-shot learning.
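In code, the prediction is just an attention-weighted sum of the support labels. A sketch assuming one-hot support labels and attention weights computed as on the next slide:

```python
import torch

def predict(attention_weights, support_labels):
    """y_hat = sum_i a(x_hat, x_i) * y_i over the support set S.

    attention_weights: shape (k,), the values a(x_hat, x_i).
    support_labels: shape (k, n_classes), one-hot labels y_i.
    """
    return attention_weights @ support_labels  # shape (n_classes,)
```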
19. Formulation and Learning
The Attention Kernel
The algorithm relies on choosing $a(\cdot, \cdot)$, the attention mechanism.
The simplest form is a softmax over the cosine distance $c$, i.e.,
$a(\hat{x}, x_i) = \dfrac{e^{c(f(\hat{x}), g(x_i))}}{\sum_{j=1}^{k} e^{c(f(\hat{x}), g(x_j))}}$
with embedding functions $f$ and $g$ being appropriate neural networks to embed $\hat{x}$ and $x_i$.
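A sketch of this kernel, taking already-computed embeddings $f(\hat{x})$ and $g(x_i)$ as inputs:

```python
import torch.nn.functional as F

def attention(f_xhat, g_support):
    """Softmax over cosine similarities c(f(x_hat), g(x_i)).

    f_xhat: shape (d,), embedding of the test example.
    g_support: shape (k, d), embeddings of the k support examples.
    """
    c = F.cosine_similarity(f_xhat.unsqueeze(0), g_support, dim=1)  # (k,)
    return F.softmax(c, dim=0)
```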
20. Definition
• $L$: possible label sets. For example, $L$ could be the label set $\{\text{cats}, \text{dogs}\}$.
• $T$: a distribution over possible label sets $L$. This represents the training data.
Learning Step
1. Sample $L$ from $T$.
2. Sample $S$ and $B$ from $L$.
3. Minimize the error predicting the labels in the batch $B$ conditioned on the support set $S$.
Objective Function
$\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[\mathbb{E}_{S \sim L,\, B \sim L}\left[\sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S)\right]\right]$
This simulates the task of one-shot learning using only the training data.
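One training episode under this objective might look like the following sketch; sample_episode and model.log_prob are hypothetical stand-ins for the episode sampler (steps 1 and 2) and for a method returning $\log P_\theta(y \mid x, S)$:

```python
def train_step(model, optimizer, sample_episode):
    """One SGD step on the episodic objective above.

    sample_episode: returns (support, batch), i.e. S ~ L and B ~ L
    for an L ~ T. model.log_prob(y, x, support) is assumed to return
    log P_theta(y | x, S).
    """
    support, batch = sample_episode()
    loss = -sum(model.log_prob(y, x, support) for x, y in batch) / len(batch)
    optimizer.zero_grad()
    loss.backward()   # step 3: minimize the prediction error on B given S
    optimizer.step()
    return loss.item()
```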
21. Experiments
N-way k-shot learning
• Pick $N$ unseen character classes, independent of alphabet, as $L$.
• Provide the model with one drawing of each of the $N$ characters as $S \sim L$ and a batch $B \sim L$.
Objective Function
$\theta = \arg\max_{\theta} \mathbb{E}_{L \sim T}\left[\mathbb{E}_{S \sim L,\, B \sim L}\left[\sum_{(x, y) \in B} \log P_{\theta}(y \mid x, S)\right]\right]$
22. Experiments
• Pixels: nearest neighbor on raw pixels.
• Baseline: nearest neighbor using features computed with a CNN.
• Convolutional siamese net: "Siamese Neural Networks for One-shot Image Recognition".
[Results table: accuracy as a function of the number of classes.]