2. Introduction
Inductive Transfer: 10 Years Later (NIPS 2005 Workshop)
Inductive transfer or transfer learning refers to the problem of retaining and applying the knowledge learned in one or more tasks to efficiently develop an effective hypothesis for a new task.
Goals of this talk
• Give a systematic overview of transfer learning
• Explain the problem settings of transfer learning and their concrete formulations
• Introduce examples of concrete transfer learning methods
Note) Slides and sections marked with ∗ are omitted from the presentation due to time constraints
14. When to Transfer: Negative Transfer
Negative transfer
Consider the following two models for the target task:
1. a model trained on only one of the domains
2. a model trained using both domains
Negative transfer is said to occur when (task performance of 2) ≤ (task performance of 1), as in panel (b) of the figure below.
[Figure: AUC vs. the number of target training cases, panels (a) and (b); each panel shows curves for source only, transfer, and target only. Panel (b) is the negative-transfer case.]
• The more the two domains diverge, the more likely negative transfer is to occur
• Preventing negative transfer is an important problem in transfer learning (a minimal check of the comparison above is sketched below)
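As a rough illustration of the comparison defining negative transfer, the sketch below trains (1) a target-only logistic regression and (2) one on naively pooled source + target data, and compares their AUC on held-out target data. This is a minimal sketch, not from the original slides: scikit-learn is assumed, the data is synthetic, and the generator make_domain and its shift parameter are hypothetical.

```python
# Minimal sketch (assumptions: scikit-learn available; data is synthetic).
# Compares (1) a target-only model with (2) a model trained on pooled
# source + target data, both scored by AUC on held-out target data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_domain(n, shift):
    # Hypothetical generator: binary labels with Gaussian features whose
    # class means are moved by `shift` (a sign flip makes a divergent domain).
    y = rng.integers(0, 2, size=n)
    X = rng.normal(loc=(shift * (2 * y - 1))[:, None], scale=1.0, size=(n, 5))
    return X, y

X_src, y_src = make_domain(1000, shift=-0.5)   # source domain, diverged from target
X_tgt, y_tgt = make_domain(50, shift=0.5)      # scarce labeled target data
X_test, y_test = make_domain(1000, shift=0.5)  # held-out target data

# (1) Model trained on the target domain only.
p1 = LogisticRegression().fit(X_tgt, y_tgt).predict_proba(X_test)[:, 1]

# (2) Model trained on both domains (naive pooling as the transfer baseline).
X_both, y_both = np.vstack([X_src, X_tgt]), np.concatenate([y_src, y_tgt])
p2 = LogisticRegression().fit(X_both, y_both).predict_proba(X_test)[:, 1]

auc1, auc2 = roc_auc_score(y_test, p1), roc_auc_score(y_test, p2)
print(f"target only AUC = {auc1:.3f}, transfer (pooled) AUC = {auc2:.3f}")
# Negative transfer: adding the source domain does not help the target task.
print("negative transfer" if auc2 <= auc1 else "positive transfer")
```

Because the source labels here are inverted relative to the target, the pooled model typically scores below the target-only model, reproducing the pattern of panel (b).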
47. References
[1] Hal Daumé III. Frustratingly easy domain adaptation. ACL, 2007.
[2] A. Krizhevsky et al. Imagenet classification with deep convolutional neural networks. NeurIPS, 2012.
[3] A. Radford et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
[4] A. Ramesh et al. Zero-shot text-to-image generation. arXiv preprint arXiv:2102.12092, 2021.
[5] A. Soltoggio et al. Born to learn: the inspiration, progress, and future of evolved plastic artificial neural
networks. Neural Networks, 108:48–67, 2018.
[6] B. K. Sriperumbudur et al. On the empirical estimation of integral probability metrics. Electronic Journal of
Statistics, 6:1550–1599, 2012.
[7] C. Finn et al. Model-agnostic meta-learning for fast adaptation of deep networks. ICML, 2017.
[8] F. Zhuang et al. Supervised representation learning: Transfer learning with deep autoencoders. IJCAI, 2015.
[9] H. Liu et al. Transferable adversarial training: A general approach to adapting deep classifiers. ICML, 2019.
[10] H. Zhao et al. On learning invariant representations for domain adaptation, 2019.
[11] I. Redko et al. Optimal transport for multi-source domain adaptation under target shift. AISTATS, 2019.
[12] I. Sato et al. Managing computer-assisted detection system based on transfer learning with negative transfer
inhibition. KDD, 2018.
[13] J. Devlin et al. Bert: Pre-training of deep bidirectional transformers for language understanding. NAACL, 2018.
[14] J. Gou et al. Knowledge distillation: A survey. International Journal of Computer Vision, pages 1–31, 2021.
[15] J. Quionero-Candela et al. Dataset shift in machine learning. The MIT Press, 2009.
[16] L. Duan et al. Learning with augmented features for heterogeneous domain adaptation. ICML, 2012.
[17] L. Franceschi et al. Forward and reverse gradient-based hyperparameter optimization. 2017.
松井 (名古屋大) 転移学習の基礎 まとめ 40 / 41
48. [18] M. Sugiyama et al. Density ratio estimation in machine learning. Cambridge University Press, 2012.
[19] N. Courty et al. Optimal transport for domain adaptation. IEEE transactions on pattern analysis and machine
intelligence, 39(9):1853–1865, 2016.
[20] S. Ben-David et al. A theory of learning from different domains. Machine learning, 79(1):151–175, 2010.
[21] S. Kuroki et al. Unsupervised domain adaptation based on source-guided discrepancy. 2019.
[22] T. Brown et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.
[23] T. Teshima et al. Few-shot domain adaptation by causal mechanism transfer. 2020.
[24] V. Veeriah et al. Discovery of useful questions as auxiliary tasks. NeurIPS, 2019.
[25] Y. Chen et al. Learning to learn without gradient descent by gradient descent. 2017.
[26] Y. Duan et al. Rl ˆ2: Fast reinforcement learning via slow reinforcement learning. arXiv preprint
arXiv:1611.02779, 2016.
[27] Y. Ganin et al. Domain-adversarial training of neural networks. JMLR, 17(1):2096–2030, 2016.
[28] Y. Li et al. Feature-critic networks for heterogeneous domain generalization. 2019.
[29] T. Iwata and M. Yamada. Multi-view anomaly detection via robust probabilistic latent variable models.
NeurIPS, 2016.
[30] S. Ravi and H. Larochelle. Optimization as a model for few-shot learning. 2017.
[31] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. 2014.
松井 (名古屋大) 転移学習の基礎 まとめ 41 / 41