1. What Makes Transfer Learning Work for Medical Images: Feature Reuse and Other Factors
Christos Matsoukas1,2,3, Johan Fredin Haslum1,2,3, Moein Sorkhei1,2, Magnus Söderberg3, Kevin Smith1,2
1 KTH Royal Institute of Technology, Stockholm, Sweden
2 Science for Life Laboratory, Stockholm, Sweden
3 AstraZeneca, Gothenburg, Sweden
Presenter: Mithunjha Anandakumar
2. What is Transfer Learning?
[Diagram: knowledge gained by a model trained on the source domain is transferred to a model for the target domain.]
Transfer learning: reuse knowledge gained in one domain, the source domain, to improve performance in another, the target domain.
3. Source domain vs Target domain
Source Domain (ImageNet) vs. Target Domain (Medical Images):
• Natural images with a clear global subject vs. large images of a bodily region of interest, where variations in local textures identify pathologies
• Millions of images vs. fewer images*, which are often larger
• 1000 classes vs. fewer classes
Image credits : https://www.researchgate.net/figure/Examples-of-pictures-randomly-sampled-from-the-Tiny-ImageNet-dataset_fig1_354590544
Content credits: Raghu, M., Zhang, C., Kleinberg, J., & Bengio, S. (2019). Transfusion: Understanding transfer learning for medical imaging. Advances in neural information processing systems, 32.
* Rareness of disease, ethical concerns, expense of acquisition
4. Contribution of the paper
• Shows that the benefits of TL increase with:
• reduced data size
• smaller distance between the source and target domains
• models with fewer inductive biases
• models with more capacity (to a lesser extent)
• Shows that the benefits of TL correlate with feature reuse.
• Shows that there are feature-independent benefits of pretraining, such as faster training.
5. Related work
• Raghu et al. (2019), Transfusion – summary and contributions:
- 2 datasets:
- a large dataset: CheXpert
- a private dataset: retinal fundus images
- Architectures:
- ResNet
- Inception
- Contributions:
- TL gives little benefit (attributed to over-parameterization and weight statistics, not to feature reuse)
- TL speeds up training
6. Methodology
• Datasets
• N = 3,662 – high-resolution diabetic retinopathy images; classification, 5 classes
• N = 10,239 – a mammography dataset; task: detect the presence of masses
• N = 25,331 – dermoscopic images (ISIC); classification, 9 classes
• N = 224,316 – chest X-rays (CheXpert); classification, 14 classes
• N = 327,680 – patches of H&E-stained WSIs of lymph node sections (PatchCamelyon); classification, 2 classes
8. Methodology
• Initialization – to isolate the contribution of feature reuse from that of the weight statistics
1. Random Initialization (RI): Kaiming initialization
2. Weight statistics transfer (ST): weights are sampled from a normal distribution whose mean and std are taken from an IMAGENET-pretrained model
3. Weight Transfer (WT): the IMAGENET-pretrained weights are transferred directly
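A minimal sketch of the three schemes in PyTorch is given below; the helper names and the per-tensor statistics are illustrative assumptions, not the paper's exact code.

import torch
import torch.nn as nn
from torchvision import models

def random_init(model):
    # RI: Kaiming initialization of conv and linear layers.
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)
    return model

def stats_transfer_init(model, pretrained):
    # ST: redraw every parameter from N(mu, sigma^2), where mu and sigma are
    # the mean and std of the corresponding IMAGENET-pretrained tensor.
    with torch.no_grad():
        for p, p_pre in zip(model.parameters(), pretrained.parameters()):
            p.normal_(mean=p_pre.mean().item(), std=p_pre.std().item())
    return model

# WT: weight transfer is simply loading the IMAGENET-pretrained weights.
pretrained = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model_ri = random_init(models.resnet50(weights=None))
model_st = stats_transfer_init(models.resnet50(weights=None), pretrained)
model_wt = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)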
10. When is TL to the medical domain beneficial and how important is feature reuse?
Relative increase in performance: WT / RI
Relative gain attributed to feature reuse: (WT − ST) / RI
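As a toy numerical example of how these two quantities are read (the scores below are made up, not from the paper):

# Hypothetical test scores for the three initializations of the same architecture.
ri, st, wt = 0.80, 0.82, 0.88
relative_increase = wt / ri           # overall gain from transfer learning: 1.10
feature_reuse_gain = (wt - st) / ri   # gain attributable to feature reuse: 0.075
print(relative_increase, feature_reuse_gain)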
11. Which layers benefit from feature reuse?
Transferring weights (WT) up to block n and initializing the remaining m blocks with ST.
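A minimal sketch of this partial-transfer (WT-ST) initialization, assuming the blocks of interest are addressable by name (the block names and helper are illustrative, not the paper's code):

import torch

def wt_st_init(model, pretrained, block_names, n_wt_blocks):
    # Copy pretrained weights into the first n blocks (WT) and re-sample the
    # remaining blocks from the pretrained layer statistics (ST).
    modules = dict(model.named_modules())
    modules_pre = dict(pretrained.named_modules())
    with torch.no_grad():
        for i, name in enumerate(block_names):
            for p, p_pre in zip(modules[name].parameters(),
                                modules_pre[name].parameters()):
                if i < n_wt_blocks:
                    p.copy_(p_pre)                        # WT
                else:
                    p.normal_(mean=p_pre.mean().item(),   # ST
                              std=p_pre.std().item())
    return model

# e.g. for a torchvision RESNET50, treating the stem and the four stages as blocks:
# wt_st_init(model, pretrained, ["conv1", "layer1", "layer2", "layer3", "layer4"], n_wt_blocks=2)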
12. What properties of TL are revealed via feature similarity?
Feature similarity resulting from transfer learning (WT) before and after fine-tuning.
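Feature similarity between layers is commonly measured with centered kernel alignment (CKA); whether the paper uses exactly this variant is an assumption here. A minimal linear-CKA sketch:

import torch

def linear_cka(X, Y):
    # Linear CKA between two activation matrices of shape (n_samples, dim).
    X = X - X.mean(dim=0, keepdim=True)
    Y = Y - Y.mean(dim=0, keepdim=True)
    hsic = (X.T @ Y).norm() ** 2     # ||X^T Y||_F^2
    return (hsic / ((X.T @ X).norm() * (Y.T @ Y).norm())).item()

# e.g. compare one layer's activations on the same batch before vs. after fine-tuning:
# similarity = linear_cka(feats_before, feats_after)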
13. What properties of TL are revealed via feature similarity?
Feature similarity between ST and WT initialized models after fine-tuning.
14. Which transferred weights change?
L2 distance between the initial weights of each network and the weights after fine-tuning.
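A minimal sketch of this measurement, assuming a snapshot of the state dict was taken at initialization (names are placeholders):

import copy
import torch

def per_layer_l2_shift(init_state, tuned_state):
    # L2 distance between each parameter at initialization and after fine-tuning.
    return {name: torch.norm(tuned_state[name].float() - p0.float()).item()
            for name, p0 in init_state.items()
            if p0.dtype.is_floating_point}

# init_state = copy.deepcopy(model.state_dict())   # before training
# ... fine-tune the model ...
# shifts = per_layer_l2_shift(init_state, model.state_dict())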
15. Which transferred weights change?
Impact of resetting a layer's weights to their initial values: re-initialization robustness
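A minimal sketch of re-initialization robustness: reset one layer of the fine-tuned model back to its initial values and measure the drop in performance (evaluate() and the layer naming are placeholders, not the paper's code):

import copy

def reinit_robustness(model, init_state, layer_name, evaluate):
    # Large performance drops mark "critical" layers; small drops indicate the
    # layer's weights barely needed to change during fine-tuning.
    probe = copy.deepcopy(model)
    state = probe.state_dict()
    for name, p0 in init_state.items():
        if name.startswith(layer_name):
            state[name] = p0.clone()
    probe.load_state_dict(state)
    return evaluate(model) - evaluate(probe)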
16. What is the impact of TL for different model capacities?
17. What is the impact of TL on convergence speed?
18. Contribution of the paper
• Shows that the benefits of TL increase with:
• reduced data size
• smaller distance between the source and target domains
• models with fewer inductive biases
• models with more capacity (to a lesser extent)
• Shows that the benefits of TL correlate with feature reuse.
• Shows that there are feature-independent benefits of pretraining, such as faster training.
What is transfer learning?
The feature reuse hypothesis assumes that weights learned in the source domain yield features that can readily be used in the target domain.
The lack of large public datasets has led to the widespread adoption of transfer learning from IMAGENET to improve performance on medical tasks.
Transfer learning is typically performed by taking an architecture, along with its IMAGENET pretrained weights, and then fine-tuning it on the target task.
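A minimal sketch of this standard recipe in PyTorch; the class count and hyperparameters are placeholders, not the paper's settings.

import torch.nn as nn
import torch.optim as optim
from torchvision import models

num_target_classes = 5  # placeholder, e.g. diabetic retinopathy grades

# Take an IMAGENET-pretrained backbone, replace its classifier head with one
# sized for the medical target task, then fine-tune end to end.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# ... standard training loop on the target dataset ...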
Why does TL improve performance? Good initialization, faster training, or feature reuse?
NeurIPS 2019
Raghu et al. showed that the actual values of the weights are not always necessary for good transfer learning performance. One can achieve similar performance by initializing the network using its weight statistics. In this setting, transfer amounts to providing a good range of values for randomly initializing the network, eliminating feature reuse as a factor.
Many other works have shown that transfer learning does not significantly help with medical images.
DEIT (Data-efficient Image Transformer) – a pure transformer architecture
Swin – self-attention with a hierarchical structure
Inductive biases – locality, translational equivariance, hierarchical scale
Inception – processes the signal in parallel at multiple scales before propagating it to the next layer
TL is least beneficial for CNN architectures on large datasets.
DEIT (which lacks inductive biases) sees a boost from TL even on large datasets, more so than SWIN.
All models show gains from TL on small datasets.
ISIC closely resembles IMAGENET: higher gains, even for CNN models.
SWIN falls in between DEIT and CNNs.
Because DEIT lacks inductive biases, even a large dataset is insufficient to learn better features than those transferred from IMAGENET.
For large datasets, CNNs exhibit a relatively flat line throughout the network – no significant benefit over stats transfer.
For smaller datasets, a linearly increasing trend implies that every layer benefits from feature reuse.
DEIT shows sharp jumps in the early layers – local attention is learned in the early layers, and learning local features requires a large amount of data.
SWIN shows properties of both DEIT and CNNs – on small and IMAGENET-like data it behaves like DEIT, but with large datasets it resembles CNNs.
On average, ViTs (which lack inductive biases) benefit from feature reuse, but mostly in the early layers.
CNNs benefit from feature reuse to a lesser extent, but consistently throughout the network layers – reflecting the hierarchical nature of the architecture.
Red indicates high feature reuse – no change in the features after fine-tuning.
For DEIT, we see that feature similarity is strongest in the early- to mid-layers. In later layers, the trained model adapts to the new task and drifts away from the IMAGENET features.
RESNET50 after transfer learning shows broader feature similarity – with the exception of the final layers, which must adapt to the new task.
A common trend shared by both ViTs and CNNs is that when more data is available, the transition point from feature reuse to feature adaptation shifts towards earlier layers because the network has sufficient data to adapt more of the transferred IMAGENET features to the new task.
ViTs
Here, we find that early layers of ST-initialized models are similar to features from the first half of the WT-initialized models. We see that if the network is denied these essential pre-trained weights, it attempts to learn them rapidly using only a few layers (due to lack of data), resulting in poor performance.
CNNs
From the bottom row of Figure 3 we further observe that CNNs seem to learn similar features from different initializations, suggesting that their inductive biases may naturally lead to these features (although the final layers used for classification diverge). We also observe a trend where, given more data, the ST initialization is able to learn some novel mid- to high-level features not found in IMAGENET.
The general trend is that transferred weights (WT) remain in the same vicinity after fine-tuning, more so when transfer learning gains are strongest
As the network is progressively initialized more with ST, the transferred weights tend to “stick” less well.
Certain layers, however, undergo substantial changes regardless – early layers in ViTs (the patchifier) and INCEPTION, and the first block at each scale in RESNET50. These are the first layers to encounter the data, or a scale change.
Our main finding is that networks with weight transfer (WT) undergo few critical changes, indicating feature reuse.
When transfer learning is least effective (RESNET on CHEXPERT and PATCHCAMELYON) the gap in robustness between WT and ST is at its smallest. Interestingly, in ViTs with partial weight transfer (WT-ST), critical layers often appear at the transition between WT and ST. Rather than change the transferred weights, the network quickly adapts. But following this adaptation, no critical layers appear. As the data size increases, ViTs make more substantial early changes to adapt to the raw input (or partial WT). Transferred weights in CNNs, on the other hand, tend to be less “sticky” than ViTs. We see the same general trend where WT is the most robust, but unlike ViTs where WT was robust throughout the network, RESNET50 exhibits poor robustness at the final layers responsible for classification, and also periodically within the network at critical layers where the scale changes, as observed by [44].
We can observe a slight increase in TL performance as model size increases – the red curve dominates the other curves when the WT fraction is close to 1.
We observe that convergence speed monotonically increases with the number of WT layers.
Furthermore, we observe that CNNs converge faster at a roughly linear rate as we include more WT layers, while vision transformers see a rapid increase in convergence speed for the first half of the network but diminishing returns after that.
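One simple way to quantify convergence speed (an assumption; the paper's exact metric may differ) is the first epoch at which the validation score reaches a fixed fraction of its best value:

def epochs_to_converge(val_scores, fraction=0.95):
    # Return the first epoch whose validation score reaches `fraction` of the best score.
    target = fraction * max(val_scores)
    for epoch, score in enumerate(val_scores, start=1):
        if score >= target:
            return epoch
    return len(val_scores)

# epochs_to_converge([0.60, 0.72, 0.80, 0.83, 0.84]) -> 3  (0.80 >= 0.95 * 0.84 = 0.798)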
Why does TL improve performance? Good initialization, faster training, or feature reuse?