Deep Learning with Implicit Gradients
Shohei Taniguchi, Matsuo Lab (M1)
•
•
• 2
- Meta-Learning with Implicit Gradients
‣ MAML inner update 

iMAML
- RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?
‣ ERNN
2
Outline
1.
-
-
2.
- 1
‣ Implicit Reparameterization Gradients
3. Meta-Learning with Implicit Gradients
4. RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?
3
4
• (e.g. 2 )
-
-
- NN
• (e.g. )
-
-
- 

( )
$$y = f(x), \qquad y = ax^2 + bx + c$$
$$f(x, y) = 0, \qquad x^2 + y^2 = r^2$$
5
•
•
, ,


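$$\frac{dy}{dx} = -\frac{\partial f / \partial x}{\partial f / \partial y} = -\frac{f_x}{f_y}$$
Implicit function theorem: if $f(x_0, y_0) = 0$ and $f_y(x_0, y_0) \neq 0$, there exist neighborhoods $x_0 \in U$, $y_0 \in V$ and a function $g : U \to V$ such that
$$\{(x, g(x)) \mid x \in U\} = \{(x, y) \in U \times V \mid f(x, y) = 0\}$$
6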
• 1
-
- A
- B 

( )
• 2 Jacobian
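For $f(x, y) = 0$ at a point $(x_0, y_0)$, the condition is $f_y(x_0, y_0) \neq 0$.
$$x^2 + y^2 - r^2 = 0$$
$$y = \sqrt{r^2 - x^2}$$
At $(r, 0)$, $f_y(r, 0) = 2 \times 0 = 0$, so near that point $y = \pm\sqrt{r^2 - x^2}$ is not a single-valued function of $x$.
7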
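The circle example can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the test point $(3, 4)$ on a circle of radius 5 are assumptions chosen for illustration. It compares the implicit slope $-f_x / f_y$ with a finite-difference slope of the explicit upper branch.

```python
import math

def implicit_slope(x, y):
    """dy/dx for f(x, y) = x^2 + y^2 - r^2 = 0, computed as -f_x / f_y."""
    f_x = 2.0 * x
    f_y = 2.0 * y
    if f_y == 0.0:
        # At (r, 0) the theorem's condition f_y != 0 fails.
        raise ZeroDivisionError("f_y = 0: y is not locally a function of x")
    return -f_x / f_y

def explicit_slope(x, r, h=1e-6):
    """Central finite difference of the explicit upper branch y = sqrt(r^2 - x^2)."""
    g = lambda t: math.sqrt(r * r - t * t)
    return (g(x + h) - g(x - h)) / (2.0 * h)

r, x = 5.0, 3.0                     # hypothetical test point
y = math.sqrt(r * r - x * x)        # (3, 4) lies on the circle
slope_implicit = implicit_slope(x, y)
slope_explicit = explicit_slope(x, r)
```

Calling `implicit_slope(r, 0.0)` raises an error, which mirrors the failure of the condition $f_y \neq 0$ at $(r, 0)$.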
( )
1.
- ( )
- iMAML
2.
-
- ERNN
8
Implicit Reparameterization Gradients
10
• NeurIPS 2018 accepted
•
- Michael Figurnov, Shakir Mohamed, Andriy Mnih
- DeepMind
• reparameterization trick
•
iMAML ERNN
11
Reparameterization Trick
• VAE
•
reparameterization trick
•
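$$\mathbb{E}_{q(z; \phi)}\left[\log p(x \mid z)\right] - \mathrm{KL}\left(q(z; \phi) \,\|\, p(z)\right)$$
$$\epsilon = f(z; \phi) = \frac{z - \mu_\phi}{\sigma_\phi}, \qquad \epsilon \sim \mathcal{N}(0, 1)$$
$$\nabla_\phi \mathbb{E}_{q(z; \phi)}\left[\log p(x \mid z)\right] = \mathbb{E}_{p(\epsilon)}\left[\nabla_\phi \log p(x \mid z)\big|_{z = f^{-1}(\epsilon; \phi)}\right]$$
12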
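A small Monte Carlo sketch of the reparameterization trick for a Gaussian $q(z; \phi) = \mathcal{N}(\mu, \sigma^2)$. This is an illustration only: the integrand $z^2$ stands in for $\log p(x \mid z)$ and is an assumption, chosen because $\mathbb{E}[z^2] = \mu^2 + \sigma^2$ gives closed-form gradients $(2\mu, 2\sigma)$ to compare against. The function name and sample count are hypothetical.

```python
import random

def pathwise_gradients(mu, sigma, num_samples=200_000, seed=0):
    """Estimate d/dmu and d/dsigma of E_{z ~ N(mu, sigma^2)}[z^2] by reparameterization."""
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)   # eps ~ N(0, 1), independent of (mu, sigma)
        z = mu + sigma * eps        # z = f^{-1}(eps; phi)
        dh_dz = 2.0 * z             # derivative of the toy integrand z^2 (assumed)
        grad_mu += dh_dz * 1.0      # dz/dmu = 1
        grad_sigma += dh_dz * eps   # dz/dsigma = eps
    return grad_mu / num_samples, grad_sigma / num_samples

mu, sigma = 1.0, 1.5
g_mu, g_sigma = pathwise_gradients(mu, sigma)
# Closed form for comparison: gradients of mu^2 + sigma^2 are (2 * mu, 2 * sigma).
```

The estimates are noisy, so they should only be expected to match the closed form up to Monte Carlo error.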
Implicit Reparameterization Gradients
• 1 →
-
-
-
f
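$$\epsilon \sim U(0, 1), \qquad z = f^{-1}(\epsilon; \phi)$$
$$\nabla_\phi \mathbb{E}_{q(z; \phi)}\left[\log p(x \mid z)\right] = \mathbb{E}_{p(\epsilon)}\left[\nabla_\phi \log p(x \mid z)\right] = \mathbb{E}_{p(\epsilon)}\left[\nabla_z \log p(x \mid z)\, \nabla_\phi z\right]$$
13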
Implicit Reparameterization Gradients
•
-
-
•
$$\epsilon = f(z; \phi) \iff f(z; \phi) - \epsilon = 0$$
$$\nabla_\phi z = -\frac{\nabla_\phi f(z; \phi)}{\nabla_z f(z; \phi)} = -\frac{\nabla_\phi f(z; \phi)}{q(z; \phi)}$$
14
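The formula $\nabla_\phi z = -\nabla_\phi f / q$ only needs the CDF $f$ and the density $q$, not the inverse CDF. The sketch below checks this on a Gaussian, where the answer is known from the explicit reparameterization $z = \mu + \sigma\epsilon$ (so $\partial z / \partial \mu = 1$ and $\partial z / \partial \sigma = (z - \mu) / \sigma$). Differentiating the CDF by finite differences is an assumption made to keep the example short; the helper names are hypothetical.

```python
import math

def normal_cdf(z, mu, sigma):
    """CDF f(z; phi) of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(z, mu, sigma):
    """Density q(z; phi) of N(mu, sigma^2), equal to d f / d z."""
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))

def implicit_grad(cdf, pdf, z, params, h=1e-6):
    """Return dz/dphi = -(d cdf / d phi) / pdf for each parameter in params."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        dn = list(params)
        up[i] += h
        dn[i] -= h
        dcdf_dphi = (cdf(z, *up) - cdf(z, *dn)) / (2.0 * h)  # central difference
        grads.append(-dcdf_dphi / pdf(z, *params))
    return grads

mu, sigma, z = 0.5, 2.0, 1.7        # hypothetical sample and parameters
dz_dmu, dz_dsigma = implicit_grad(normal_cdf, normal_pdf, z, [mu, sigma])
```

The same `implicit_grad` call would work for any distribution with a computable CDF and density, which is the point of the implicit approach.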
Meta-Learning with Implicit Gradients
15
• NeurIPS 2019 accepted
•
- Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine
- MAML
• MAML
16
Model-Agnostic Meta-Learning (MAML)
•
-
- 1 (one-step adaptation)
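$$\theta^{*}_{\mathrm{ML}} := \operatorname*{argmin}_{\theta \in \Theta} F(\theta), \quad \text{where } F(\theta) = \frac{1}{M} \sum_{i=1}^{M} \mathcal{L}\left(\mathcal{A}lg_i(\theta), \mathcal{D}^{\mathrm{test}}_i\right)$$
$$\mathcal{A}lg_i(\theta) = \theta - \alpha \nabla_\theta \mathcal{L}\left(\theta, \mathcal{D}^{\mathrm{tr}}_i\right)$$
17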
MAML
•
-
• MAML
• 1 FOMAML
- FOMAML 

https://www.slideshare.net/DeepLearningJP2016/dl1maml
• iMAML
∇θF (θ) 𝒜lgi (θ)
18
Inner Loop
•
•
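$$\mathcal{A}lg^{\star}_i(\theta) = \operatorname*{argmin}_{\phi' \in \Phi} G_i(\phi', \theta)$$
$$G_i(\phi', \theta) = \hat{\mathcal{L}}(\phi') + \frac{\lambda}{2} \left\| \phi' - \theta \right\|^2$$
19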
Outer Loop
• MAML outer loop
• inner loop
➡
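$$\theta \leftarrow \theta - \eta\, d_\theta F(\theta) = \theta - \eta \frac{1}{M} \sum_{i=1}^{M} \frac{d \mathcal{A}lg_i(\theta)}{d\theta} \nabla_\phi \mathcal{L}_i\left(\mathcal{A}lg_i(\theta)\right) \qquad \left(\phi = \mathcal{A}lg_i(\theta)\right)$$
$$\frac{d \mathcal{A}lg_i(\theta)}{d\theta}$$
20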
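A scalar sketch of the MAML outer-loop gradient with one-step adaptation. Everything specific here is an assumption for illustration: the quadratic train and test losses, the task list, and the step sizes. With $\mathcal{L}(\theta, \mathcal{D}^{\mathrm{tr}}_i) = \frac{1}{2}(\theta - c_i)^2$, the Jacobian $d\mathcal{A}lg_i / d\theta$ is simply $1 - \alpha$, so the chain rule can be checked against finite differences of $F$.

```python
def inner_loss_grad(theta, c):
    """Gradient of the toy training loss 0.5 * (theta - c)^2 (assumed)."""
    return theta - c

def adapt(theta, c, alpha):
    """One-step adaptation Alg_i(theta) = theta - alpha * grad L(theta, D_tr_i)."""
    return theta - alpha * inner_loss_grad(theta, c)

def meta_objective(theta, tasks, alpha):
    """F(theta) = (1/M) * sum_i 0.5 * (Alg_i(theta) - t_i)^2 with toy test targets t_i."""
    return sum(0.5 * (adapt(theta, c, alpha) - t) ** 2 for c, t in tasks) / len(tasks)

def meta_gradient(theta, tasks, alpha):
    """Chain rule: dF/dtheta = (1/M) * sum_i (dAlg_i/dtheta) * (dL_i/dphi)."""
    total = 0.0
    for c, t in tasks:
        phi = adapt(theta, c, alpha)
        dalg_dtheta = 1.0 - alpha   # 1 - alpha * (second derivative of the train loss)
        dtest_dphi = phi - t        # gradient of the test loss at phi
        total += dalg_dtheta * dtest_dphi
    return total / len(tasks)

tasks = [(1.0, 1.2), (-0.5, -0.3), (2.0, 1.7)]  # hypothetical (c_i, t_i) pairs
theta, alpha, eta = 0.3, 0.1, 0.5
g = meta_gradient(theta, tasks, alpha)
h = 1e-6
g_fd = (meta_objective(theta + h, tasks, alpha)
        - meta_objective(theta - h, tasks, alpha)) / (2.0 * h)
theta_new = theta - eta * g                     # one outer-loop update
```

For a neural network the term `dalg_dtheta` becomes a matrix involving the Hessian of the training loss, which is the expensive part that the implicit approach avoids.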
Outer Loop
• inner loop
•
•
- adapt
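$$\phi_i \equiv \mathcal{A}lg^{\star}_i(\theta) = \operatorname*{argmin}_{\phi' \in \Phi} G_i(\phi', \theta)$$
$$\nabla_{\phi'} G_i(\phi', \theta)\big|_{\phi' = \phi_i} = 0$$
$$\nabla \hat{\mathcal{L}}(\phi_i) + \lambda\left(\mathcal{A}lg^{\star}_i(\theta) - \theta\right) = 0$$
Differentiating both sides with respect to $\theta$:
$$\frac{d \mathcal{A}lg^{\star}_i(\theta)}{d\theta} = \left(I + \frac{1}{\lambda} \nabla^2 \hat{\mathcal{L}}(\phi_i)\right)^{-1}$$
21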
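The implicit Jacobian can be sanity-checked in one dimension. The sketch below is an illustration under stated assumptions: the toy loss $\hat{\mathcal{L}}(\phi) = \log(1 + e^{\phi}) - 0.3\phi$, the Newton inner solver, and the values of $\theta$ and $\lambda$ are all hypothetical. It solves the regularized inner problem, evaluates $\left(1 + \hat{\mathcal{L}}''(\phi_i) / \lambda\right)^{-1}$ at the solution, and compares with a finite difference through the solver.

```python
import math

def loss_grad(phi):
    """Gradient of the toy loss log(1 + exp(phi)) - 0.3 * phi (assumed)."""
    return 1.0 / (1.0 + math.exp(-phi)) - 0.3

def loss_hess(phi):
    """Second derivative of the toy loss."""
    s = 1.0 / (1.0 + math.exp(-phi))
    return s * (1.0 - s)

def inner_solve(theta, lam, steps=50):
    """Newton's method on grad L_hat(phi) + lam * (phi - theta) = 0."""
    phi = theta
    for _ in range(steps):
        residual = loss_grad(phi) + lam * (phi - theta)
        phi -= residual / (loss_hess(phi) + lam)
    return phi

theta, lam = 0.8, 2.0
phi_star = inner_solve(theta, lam)
# Implicit Jacobian: depends only on the solution phi_star, not on the solver path.
jac_implicit = 1.0 / (1.0 + loss_hess(phi_star) / lam)
h = 1e-5
jac_fd = (inner_solve(theta + h, lam) - inner_solve(theta - h, lam)) / (2.0 * h)
```

Swapping the Newton solver for gradient descent would leave `jac_implicit` unchanged as long as the solver converges, since the formula uses only the final $\phi_i$.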
Outer Loop
• 2
① inner loop adapt
(SGD )
② 3
•
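$$\left(I + \frac{1}{\lambda} \nabla^2 \hat{\mathcal{L}}(\phi_i)\right)^{-1}$$
$$\left(I + \frac{1}{\lambda} \nabla^2 \hat{\mathcal{L}}(\phi_i)\right)^{-1} \nabla_\phi \mathcal{L}_i\left(\mathcal{A}lg_i(\theta)\right)$$
22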
Conjugate Gradient (CG) Method
•
•
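$$Ax = b \quad \cdots (1)$$
Solving (1) is equivalent to minimizing the following (for symmetric positive definite $A$):
$$f(x) = \frac{1}{2} x^{T} A x - b^{T} x$$
$$x_0 = 0, \quad r_0 = b - A x_0, \quad p_0 = r_0$$
$$\alpha_k = \frac{r_k^{T} p_k}{p_k^{T} A p_k}$$
$$x_{k+1} = x_k + \alpha_k p_k$$
$$r_{k+1} = r_k - \alpha_k A p_k$$
$$p_{k+1} = r_{k+1} + \frac{r_{k+1}^{T} r_{k+1}}{r_k^{T} r_k} p_k$$
23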
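The update rules above translate almost line by line into code. This is a minimal pure-Python sketch for small dense systems; the helper names, the stopping tolerance, and the 2×2 test system are assumptions added for illustration.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, num_iters=None, tol=1e-12):
    """Solve A x = b for symmetric positive definite A, starting from x_0 = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                 # r_0 = b - A x_0 with x_0 = 0
    p = list(r)                 # p_0 = r_0
    rs_old = dot(r, r)
    for _ in range(num_iters or n):
        if rs_old < tol:        # residual already negligible (assumed tolerance)
            break
        Ap = matvec(A, p)
        alpha = dot(r, p) / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]    # hypothetical SPD test matrix
b = [1.0, 2.0]
x = conjugate_gradient(A, b)    # exact solution is (1/11, 7/11)
```

In exact arithmetic CG terminates in at most $n$ iterations for an $n \times n$ system, so the default of `n` iterations suffices for this example.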
Conjugate Gradient (CG) Method
•
•
( 5 )
- (p22 ① )
‣ Appendix E
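$$g_i = \left(I + \frac{1}{\lambda} \nabla^2 \hat{\mathcal{L}}(\phi_i)\right)^{-1} \nabla_\phi \mathcal{L}_i\left(\mathcal{A}lg_i(\theta)\right)$$
$$\left(I + \frac{1}{\lambda} \nabla^2 \hat{\mathcal{L}}(\phi_i)\right) g_i = \nabla_\phi \mathcal{L}_i\left(\mathcal{A}lg_i(\theta)\right)$$
24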
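Because CG only needs products of the matrix with a vector, $g_i$ can be computed without ever forming the Hessian. The sketch below illustrates this under several assumptions: a toy quadratic inner loss with a known 2×2 Hessian, Hessian-vector products obtained by finite differences of the gradient (an autodiff library would normally supply them), and hypothetical values for $\phi_i$, $\lambda$, and the test gradient.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loss_grad(phi):
    """Gradient of a toy inner loss 0.5 * phi^T H phi with H = [[2, 1], [1, 3]] (assumed)."""
    return [2.0 * phi[0] + 1.0 * phi[1], 1.0 * phi[0] + 3.0 * phi[1]]

def hvp(phi, v, eps=1e-5):
    """Hessian-vector product by central differences of the gradient (no explicit Hessian)."""
    plus = loss_grad([p + eps * vi for p, vi in zip(phi, v)])
    minus = loss_grad([p - eps * vi for p, vi in zip(phi, v)])
    return [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]

def solve_meta_gradient(phi, test_grad, lam, cg_steps=5):
    """Approximately solve (I + H / lam) g = test_grad with a few CG steps."""
    apply_A = lambda v: [vi + hv / lam for vi, hv in zip(v, hvp(phi, v))]
    g = [0.0] * len(test_grad)
    r = list(test_grad)         # residual for the initial guess g = 0
    p = list(r)
    rs_old = dot(r, r)
    for _ in range(cg_steps):
        if rs_old < 1e-16:      # converged early (assumed tolerance)
            break
        Ap = apply_A(p)
        alpha = dot(r, p) / dot(p, Ap)
        g = [gi + alpha * pi for gi, pi in zip(g, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return g

phi_i = [0.5, -0.2]             # hypothetical adapted parameters
test_grad = [1.0, -1.0]         # hypothetical test-loss gradient at phi_i
g_i = solve_meta_gradient(phi_i, test_grad, lam=2.0)
```

Each CG step costs one Hessian-vector product, so a fixed small number of steps keeps the cost of the meta-gradient independent of how many inner-loop steps were taken.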
iMAML
• inner loop
➡adapt
• outer loop inner loop
➡inner loop
‣ MAML 1
‣ iMAML Hessian-Free 2
adapt
25
•
- iMAML inner loop ( )
- FOMAML (CG )
- MAML 

(FOMAML ??)
O(1)
26
• Omniglot
- inner loop Hessian-Free iMAML
- iMAML way ( )
- FOMAML
27
• Mini-ImageNet
- Reptile (FOMAML )
-
??
28
iMAML
•
iMAML
• MAML
•
•
•
-
29
RNNs Evolving on an Equilibrium Manifold:
A Panacea for Vanishing and Exploding Gradients?
31
•
- Anil Kag, Ziming Zhang, Venkatesh Saligrama
- Boston University, MERL
• NeurIPS 2019 reject
•
RNN
•
•
32
Vanishing / Exploding Gradients in RNNs
• RNN
- sigmoid tanh
• RNN /
- LSTM GRU
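$$h_k = \phi(U h_{k-1} + W x_k + b)$$
$$\frac{\partial h_m}{\partial h_n} = \prod_{m \ge k > n} \frac{\partial h_k}{\partial h_{k-1}} = \prod_{m \ge k > n} \nabla\phi(U h_{k-1} + W x_k + b)\, U$$
33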
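The product form makes the vanishing behaviour easy to see in a scalar RNN. The sketch below is illustrative only: the scalar weights, the tanh activation, and the sine-wave inputs are assumptions. Since $|\tanh'| \le 1$, each factor has magnitude at most $|u|$, so with $|u| < 1$ the product over $T$ steps is bounded by $|u|^T$.

```python
import math

def jacobian_product(u, w, b, xs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_k = tanh(u * h_{k-1} + w * x_k + b)."""
    h = h0
    prod = 1.0
    for x in xs:
        h = math.tanh(u * h + w * x + b)
        prod *= (1.0 - h * h) * u   # tanh'(pre-activation) * u
    return prod

xs = [math.sin(0.3 * k) for k in range(1, 31)]   # 30 arbitrary inputs (assumed)
grad_small_u = jacobian_product(u=0.5, w=1.0, b=0.0, xs=xs)
```

With a recurrent weight of 0.5 and 30 steps, the gradient magnitude cannot exceed $0.5^{30} \approx 10^{-9}$, regardless of the inputs.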
RNN ODE
• RNN skip connection
(ODE)
• Neural ODE
- 

https://www.slideshare.net/DeepLearningJP2016/dlneural-ordinary-differential-equations
$$\frac{dh(t)}{dt} \triangleq h'(t) = \phi(U h(t) + W x_k + b)$$
$$\Longrightarrow h_k = h_{k-1} + \eta\, \phi(U h_{k-1} + W x_k + b)$$
34
ODE
• ODE
• 1
➡ ( )
• ERNN
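$$\frac{dh}{dt} = f(h, x), \qquad f(h, x) = 0 \quad \cdots (1)$$
If $(h_0, x_0)$ satisfies (1) and $f_h(h_0, x_0)$ is nonsingular, then near $(h_0, x_0)$ equation (1) can be written as
$$h = g(x)$$
35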
ERNN
• ERNN 



ODE
• 



➡ 

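$$h'(t) = \phi\left(U(h(t) + h_{k-1}) + W x_k + b\right) - \gamma\left(h(t) + h_{k-1}\right)$$
The equilibrium $h'(t) = 0$ defines $h_k$, i.e. $h_k$ is the solution $h$ of
$$f(h_{k-1}, h) = \phi\left(U(h + h_{k-1}) + W x_k + b\right) - \gamma\left(h + h_{k-1}\right) = 0$$
$$\frac{\partial h}{\partial h_{k-1}} = -\frac{\partial f / \partial h_{k-1}}{\partial f / \partial h} = -I \qquad \text{(when } \partial f / \partial h \text{ is nonsingular)}$$
36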
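The step to $-I$ can be spelled out from the definition of $f$. The following is a short derivation sketch, with the shorthand $s$ introduced here for convenience (it is not in the original slides), and assuming $\partial f / \partial h$ is nonsingular. Because $f$ depends on $h$ and $h_{k-1}$ only through their sum, the two partial derivatives coincide.

```latex
\begin{aligned}
s &:= h + h_{k-1}, \qquad f(h_{k-1}, h) = \phi(U s + W x_k + b) - \gamma s \\
\frac{\partial f}{\partial h} &= \nabla\phi(U s + W x_k + b)\, U - \gamma I = \frac{\partial f}{\partial h_{k-1}} \\
\frac{\partial h}{\partial h_{k-1}} &= -\left(\frac{\partial f}{\partial h}\right)^{-1} \frac{\partial f}{\partial h_{k-1}} = -I
\end{aligned}
```

The Jacobian therefore has norm exactly 1 at every step, so products of such Jacobians neither vanish nor explode.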
∂f/∂h
•
1. (sigmoid tanh OK)
2.
- 

( )
$$\frac{\partial f}{\partial h} = \nabla\phi\left(U(h + h_{k-1}) + W x_k + b\right) U - \gamma I$$
ϕ
U
37
•
• 5
•
-
$$h_k^{(0)} = 0$$
$$h_k^{(i+1)} = h_k^{(i)} + \eta_k^{(i)} \left[\phi\left(U(h_k^{(i)} + h_{k-1}) + W x_k + b\right) - \gamma\left(h_k^{(i)} + h_{k-1}\right)\right]$$
38
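A scalar sketch of this fixed-point iteration. The specifics are assumptions for illustration: tanh as $\phi$, a constant step size $\eta = 1$ rather than a per-iteration $\eta_k^{(i)}$, many more iterations than the handful a real model would use (so that the equilibrium is reached to machine precision), and hypothetical weights. It then checks by finite differences that $\partial h_k / \partial h_{k-1} = -1$ at the equilibrium.

```python
import math

def ernn_step(h_prev, x, u, w, b, gamma, eta=1.0, num_iters=200):
    """Iterate h <- h + eta * [tanh(u * (h + h_prev) + w * x + b) - gamma * (h + h_prev)]."""
    h = 0.0                                  # h_k^(0) = 0
    for _ in range(num_iters):
        s = h + h_prev
        h = h + eta * (math.tanh(u * s + w * x + b) - gamma * s)
    return h

u, w, b, gamma = 0.5, 1.0, 0.1, 1.0          # hypothetical scalar parameters
h_prev, x = 0.4, 0.7
h_k = ernn_step(h_prev, x, u, w, b, gamma)

# Equilibrium residual f(h_prev, h_k); close to zero once the iteration has converged.
residual = math.tanh(u * (h_k + h_prev) + w * x + b) - gamma * (h_k + h_prev)

# Finite-difference estimate of d h_k / d h_prev through the whole iteration.
d = 1e-5
dh_dhprev = (ernn_step(h_prev + d, x, u, w, b, gamma)
             - ernn_step(h_prev - d, x, u, w, b, gamma)) / (2.0 * d)
```

With these values the update is a contraction (its derivative with respect to `h` lies between 0 and 0.5), which is why a constant step size converges here; other parameter choices may need a smaller `eta`.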
HAR-2 RNN ERNN (log scale)
• RNN
• ERNN 1
$\partial h_T / \partial h_1$
39
• RNN 

ERNN
40
• SoTA
•
•
41
ERNN
• NN
1
RNN
•
• SoTA
• RNN
• accept
42
&
•
• iMAML ERNN
•
43

Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

[DL Reading Group] Deep Learning Using Implicit Function Differentiation

  • 1. Deep Learning with Implicit Gradients. Shohei Taniguchi, Matsuo Lab (M1)
  • 2. Papers covered (2):
 - Meta-Learning with Implicit Gradients ‣ MAML inner update, iMAML
 - RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients? ‣ ERNN
  • 3. Outline
 1.
 2. ‣ Implicit Reparameterization Gradients
 3. Meta-Learning with Implicit Gradients
 4. RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?
  • 5. • Explicit form $y = f(x)$ (e.g. the quadratic $y = ax^2 + bx + c$)
 - NN
 • Implicit form $f(x, y) = 0$ (e.g. the circle $x^2 + y^2 = r^2$), a relation between $x$ and $y$
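  • 6. • Implicit differentiation: for $f(x, y) = 0$,
 $$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{f_x}{f_y}$$
 • Implicit function theorem: if $f(x, y) = 0$ holds at $(x_0, y_0)$ and $f_y(x_0, y_0) \neq 0$, there are neighborhoods $U$ with $x_0 \in U$ and $V$ with $y_0 \in V$, and a function $g : U \to V$, such that
 $$\{(x, g(x)) \mid x \in U\} = \{(x, y) \in U \times V \mid f(x, y) = 0\}$$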
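  • 7. • Single-variable example: the circle $x^2 + y^2 - r^2 = 0$
 - Point A: near a point $(x_0, y_0)$ with $y_0 > 0$, where $f_y(x_0, y_0) \neq 0$, the curve is the graph of $y = \sqrt{r^2 - x^2}$
 - Point B: at $(r, 0)$, $f_y(r, 0) = 2 \times 0 = 0$, and $y = \pm\sqrt{r^2 - x^2}$ is not a single-valued function of $x$ there
 • With 2 or more variables, $f_y$ is a Jacobian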
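The circle example on slide 7 can be checked numerically. The sketch below is illustrative only: the helper name `implicit_slope` and the values `r = 5`, `x0 = 3` are assumptions, not part of the slides. It compares $-f_x/f_y$ with a finite-difference slope of the explicit branch $y = \sqrt{r^2 - x^2}$, and refuses to evaluate where $f_y = 0$.

```python
import math

def implicit_slope(fx, fy, x, y):
    """Return dy/dx = -f_x / f_y; only valid where f_y != 0."""
    denom = fy(x, y)
    if denom == 0.0:
        raise ZeroDivisionError("f_y vanishes: no single-valued y = g(x) here")
    return -fx(x, y) / denom

# f(x, y) = x^2 + y^2 - r^2, so f_x = 2x and f_y = 2y
r = 5.0
fx = lambda x, y: 2.0 * x
fy = lambda x, y: 2.0 * y

x0 = 3.0
y0 = math.sqrt(r * r - x0 * x0)          # upper branch of the circle
slope_implicit = implicit_slope(fx, fy, x0, y0)

# Central finite difference on the explicit branch y = sqrt(r^2 - x^2)
eps = 1e-6
explicit = lambda x: math.sqrt(r * r - x * x)
slope_numeric = (explicit(x0 + eps) - explicit(x0 - eps)) / (2 * eps)

print(slope_implicit, slope_numeric)
```

At $(r, 0)$ the helper raises instead of returning a slope, which mirrors the failure of the theorem's condition at that point.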
  • 8. 1. - iMAML 2. - ERNN
  • 11. • Accepted at NeurIPS 2018
 • Authors
 - Michael Figurnov, Shakir Mohamed, Andriy Mnih
 - DeepMind
 • reparameterization trick
 • iMAML, ERNN
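  • 12. Reparameterization Trick
 • VAE objective:
 $$\mathbb{E}_{q(z;\phi)}[\log p(x|z)] - \mathrm{KL}(q(z;\phi)\,\|\,p(z))$$
 • Reparameterization trick: map a sample of $q$ to noise whose distribution does not depend on $\phi$,
 $$\epsilon = f(z;\phi) = \frac{z - \mu_\phi}{\sigma_\phi}, \quad \epsilon \sim \mathcal{N}(0, 1)$$
 • Since $\epsilon$ is independent of $\phi$, the gradient moves inside the expectation:
 $$\nabla_\phi \mathbb{E}_{q(z;\phi)}[\log p(x|z)] = \mathbb{E}_{p(\epsilon)}\left[\nabla_\phi \log p(x|z)\big|_{z = f^{-1}(\epsilon;\phi)}\right]$$
 • This requires $f$ to be invertible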
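A minimal sketch of the Gaussian reparameterization on slide 12, under stated assumptions: the integrand $z^2$ stands in for $\log p(x|z)$ so that the true gradient $\nabla_\mu \mathbb{E}[z^2] = 2\mu$ is known, and the function name, sample count, and seed are hypothetical choices for illustration.

```python
import random

def reparam_grad_mu(mu, sigma, n_samples=50_000, seed=0):
    """Monte Carlo estimate of d/dmu E_{z ~ N(mu, sigma^2)}[z^2].

    Samples eps ~ N(0, 1), sets z = f^{-1}(eps; mu, sigma) = mu + sigma * eps,
    and averages d(z^2)/dz * dz/dmu.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise independent of the parameters
        z = mu + sigma * eps        # explicit inverse of eps = (z - mu) / sigma
        dz_dmu = 1.0                # derivative of the inverse w.r.t. mu
        total += 2.0 * z * dz_dmu   # chain rule through the sample
    return total / n_samples

estimate = reparam_grad_mu(mu=1.5, sigma=1.0)
print(estimate)  # should be close to 2 * mu = 3.0, up to Monte Carlo noise
```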
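  • 13. Implicit Reparameterization Gradients
 • Take $f$ to be the CDF of $q(z;\phi)$, so that $\epsilon \sim U(0, 1)$ does not depend on $\phi$ and $z = f^{-1}(\epsilon;\phi)$
 $$\nabla_\phi \mathbb{E}_{q(z;\phi)}[\log p(x|z)] = \mathbb{E}_{p(\epsilon)}[\nabla_\phi \log p(x|z)] = \mathbb{E}_{p(\epsilon)}[\nabla_z \log p(x|z)\,\nabla_\phi z]$$
 • What remains is to compute $\nabla_\phi z$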
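  • 14. Implicit Reparameterization Gradients
 • $\epsilon = f(z;\phi) \Leftrightarrow f(z;\phi) - \epsilon = 0$ defines $z$ implicitly as a function of $\phi$, so by implicit differentiation
 $$\nabla_\phi z = -\frac{\nabla_\phi f(z;\phi)}{\nabla_z f(z;\phi)} = -\frac{\nabla_\phi f(z;\phi)}{q(z;\phi)}$$
 • Only a sample $z$ and the density $q(z;\phi)$ are needed; $f^{-1}$ is not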
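The formula $\nabla_\phi z = -\nabla_\phi f(z;\phi) / q(z;\phi)$ on slide 14 can be sanity-checked on a distribution whose inverse CDF happens to be known. The sketch below assumes an exponential distribution with rate `rate` (chosen here only for illustration; it does not appear in the slides) and compares the implicit gradient with the derivative of the explicit inverse $z = -\log(1 - \epsilon)/\text{rate}$.

```python
import math

def cdf(z, rate):
    """f(z; rate): CDF of the exponential distribution."""
    return 1.0 - math.exp(-rate * z)

def pdf(z, rate):
    """q(z; rate): density, i.e. the derivative of the CDF in z."""
    return rate * math.exp(-rate * z)

def implicit_dz_drate(z, rate, h=1e-6):
    """Implicit gradient: -(d f / d rate) / q, using only the CDF and density."""
    df_drate = (cdf(z, rate + h) - cdf(z, rate - h)) / (2 * h)
    return -df_drate / pdf(z, rate)

def explicit_dz_drate(eps, rate):
    """Derivative of the explicit inverse z = -log(1 - eps) / rate."""
    return math.log(1.0 - eps) / rate ** 2

rate, eps = 2.0, 0.7
z = -math.log(1.0 - eps) / rate      # the sample that corresponds to eps

grad_implicit = implicit_dz_drate(z, rate)
grad_explicit = explicit_dz_drate(eps, rate)
print(grad_implicit, grad_explicit)
```

The implicit version never calls the inverse CDF, which is the point of the method for distributions where that inverse is not available.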
  • 16. • Accepted at NeurIPS 2019
 • Authors
 - Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine
 - MAML
 • MAML
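  • 17. Model-Agnostic Meta-Learning (MAML)
 • Meta-objective:
 $$\theta^*_{ML} := \operatorname*{argmin}_{\theta \in \Theta} F(\theta), \quad \text{where } F(\theta) = \frac{1}{M}\sum_{i=1}^{M} \mathcal{L}(\mathcal{A}lg_i(\theta), \mathcal{D}^{test}_i)$$
 - Inner update with 1 gradient step (one-step adaptation):
 $$\mathcal{A}lg_i(\theta) = \theta - \alpha\nabla_\theta \mathcal{L}(\theta, \mathcal{D}^{tr}_i)$$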
  • 18. MAML
 • Computing $\nabla_\theta F(\theta)$ requires differentiating through $\mathcal{A}lg_i(\theta)$
 • First-order approximation: FOMAML
 - https://www.slideshare.net/DeepLearningJP2016/dl1maml
 • iMAML
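To make the difference between the MAML gradient and its first-order approximation concrete, here is a scalar sketch. Everything in it is an assumption for illustration: the quadratic losses $\frac{1}{2}(\theta - c)^2$, the targets `c_train` and `c_query`, and the function names. With one inner step, the exact meta-gradient carries the factor $d\mathcal{A}lg(\theta)/d\theta = 1 - \alpha$, which FOMAML drops.

```python
def adapt(theta, c_train, alpha):
    """One-step adaptation: theta - alpha * grad of 0.5 * (theta - c_train)^2."""
    return theta - alpha * (theta - c_train)

def query_loss(phi, c_query):
    """Held-out loss evaluated at the adapted parameter."""
    return 0.5 * (phi - c_query) ** 2

def maml_grad(theta, c_train, c_query, alpha):
    """Exact meta-gradient: (d adapt / d theta) * d loss / d phi."""
    phi = adapt(theta, c_train, alpha)
    dphi_dtheta = 1.0 - alpha
    return dphi_dtheta * (phi - c_query)

def fomaml_grad(theta, c_train, c_query, alpha):
    """First-order approximation: treats d adapt / d theta as the identity."""
    phi = adapt(theta, c_train, alpha)
    return phi - c_query

theta, c_train, c_query, alpha = 1.0, 3.0, 2.0, 0.1

# Finite-difference check of the exact meta-gradient
h = 1e-6
numeric = (query_loss(adapt(theta + h, c_train, alpha), c_query)
           - query_loss(adapt(theta - h, c_train, alpha), c_query)) / (2 * h)

print(maml_grad(theta, c_train, c_query, alpha),
      fomaml_grad(theta, c_train, c_query, alpha),
      numeric)
```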
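  • 19. Inner Loop
 • The inner loop is a regularized minimization:
 $$\mathcal{A}lg^\star(\theta) = \operatorname*{argmin}_{\phi' \in \Phi} G_i(\phi', \theta), \quad G_i(\phi', \theta) = \hat{\mathcal{L}}(\phi') + \frac{\lambda}{2}\|\phi' - \theta\|^2$$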
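  • 20. Outer Loop
 • MAML outer loop:
 $$\theta \leftarrow \theta - \eta\, d_\theta F(\theta) = \theta - \eta\frac{1}{M}\sum_{i=1}^{M} \frac{d\mathcal{A}lg_i(\theta)}{d\theta}\nabla_\phi \mathcal{L}_i(\mathcal{A}lg_i(\theta)) \quad (\phi = \mathcal{A}lg_i(\theta))$$
 • The factor $\frac{d\mathcal{A}lg_i(\theta)}{d\theta}$ requires differentiating through the inner loop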
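  • 21. Outer Loop
 • The inner loop solution $\phi_i \equiv \mathcal{A}lg^\star_i(\theta) = \operatorname*{argmin}_{\phi' \in \Phi} G_i(\phi', \theta)$ is a stationary point:
 $$\nabla_{\phi'} G_i(\phi', \theta)\big|_{\phi' = \phi_i} = 0 \;\Rightarrow\; \nabla\hat{\mathcal{L}}(\phi_i) + \lambda(\mathcal{A}lg^\star_i(\theta) - \theta) = 0$$
 • Differentiating both sides with respect to $\theta$ and solving for the Jacobian of $\mathcal{A}lg^\star(\theta)$:
 $$\frac{d\mathcal{A}lg^\star(\theta)}{d\theta} = \left(I + \frac{1}{\lambda}\nabla^2\hat{\mathcal{L}}(\phi_i)\right)^{-1}$$
 - This depends only on the adapted parameters $\phi_i$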
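The Jacobian formula on slide 21 can be verified on a scalar toy problem. The sketch assumes a quadratic training loss $\hat{\mathcal{L}}(\phi) = \frac{1}{2}a(\phi - c)^2$ (an illustrative choice, with hypothetical names `a`, `c`, `lam`), for which the regularized inner problem has a closed-form solution and the Hessian is simply `a`.

```python
def inner_solution(theta, a, c, lam):
    """Closed-form argmin over phi of 0.5*a*(phi - c)^2 + 0.5*lam*(phi - theta)^2."""
    return (a * c + lam * theta) / (a + lam)

def implicit_jacobian(a, lam):
    """(I + (1/lam) * Hessian)^{-1} in the scalar case, with Hessian = a."""
    return 1.0 / (1.0 + a / lam)

a, c, lam, theta = 3.0, 1.0, 2.0, 0.5

# Finite-difference derivative of the inner solution with respect to theta
h = 1e-6
numeric = (inner_solution(theta + h, a, c, lam)
           - inner_solution(theta - h, a, c, lam)) / (2 * h)

print(implicit_jacobian(a, lam), numeric)
```

Both values come out as $\lambda / (a + \lambda)$, and neither required unrolling an optimizer.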
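  • 22. Outer Loop
 • 2 issues
 ① The formula assumes the inner loop adapts to the exact minimizer $\phi_i$ (SGD gives only an approximation)
 ② Inverting $\left(I + \frac{1}{\lambda}\nabla^2\hat{\mathcal{L}}(\phi_i)\right)$ costs on the order of the cube of the number of parameters
 • Only the product with the test gradient is needed:
 $$\left(I + \frac{1}{\lambda}\nabla^2\hat{\mathcal{L}}(\phi_i)\right)^{-1}\nabla_\phi \mathcal{L}_i(\mathcal{A}lg_i(\theta))$$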
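  • 23. Conjugate gradient method (CG)
 • Solves the linear system $Ax = b \;\cdots(1)$
 • For symmetric positive definite $A$, solving (1) is equivalent to minimizing $f(x) = \frac{1}{2}x^T A x - b^T x$
 $$x_0 = 0,\quad r_0 = b - Ax_0,\quad p_0 = r_0$$
 $$\alpha_k = \frac{r_k^T p_k}{p_k^T A p_k},\quad x_{k+1} = x_k + \alpha_k p_k,\quad r_{k+1} = r_k - \alpha_k A p_k,\quad p_{k+1} = r_{k+1} + \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k} p_k$$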
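The CG recurrences on slide 23 translate almost line by line into code. The following is a minimal pure-Python sketch, assuming a symmetric positive definite system; the function names, the stopping tolerance, and the 2×2 example are illustrative and not taken from the slides.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(matvec, b, max_iters=50, tol=1e-20):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)            # x_0 = 0
    r = list(b)                   # r_0 = b - A x_0 = b
    p = list(r)                   # p_0 = r_0
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if rs_old <= tol:         # squared residual norm small enough: stop
            break
        Ap = matvec(p)
        alpha = dot(r, p) / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Small symmetric positive definite example; exact solution is (1/11, 7/11)
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
matvec = lambda v: [dot(row, v) for row in A]

x = conjugate_gradient(matvec, b)
print(x)
```

Note that the solver only touches $A$ through `matvec`, so the matrix never has to be formed or inverted explicitly.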
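  • 24. Conjugate gradient method (CG)
 • The meta-gradient term
 $$g_i = \left(I + \frac{1}{\lambda}\nabla^2\hat{\mathcal{L}}(\phi_i)\right)^{-1}\nabla_\phi \mathcal{L}_i(\mathcal{A}lg_i(\theta))$$
 is the solution $g_i$ of the linear system
 $$\left(I + \frac{1}{\lambda}\nabla^2\hat{\mathcal{L}}(\phi_i)\right) g_i = \nabla_\phi \mathcal{L}_i(\mathcal{A}lg_i(\theta))$$
 • CG solves it approximately in a few iterations (5)
 - The approximation errors (the CG residual $r_k$ and the inexact $\mathcal{A}lg_i(\theta)$, issue ① on p22) ‣ Appendix E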
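The linear system for $g_i$ on slide 24 only needs Hessian-vector products, never the Hessian itself. The sketch below illustrates this under loud assumptions: the training loss is a toy quadratic with diagonal Hessian `a`, the Hessian-vector product is approximated by a finite difference of gradients (a real implementation would typically use automatic differentiation), and all names are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cg_solve(matvec, b, iters=5):
    """A few CG steps for a symmetric positive definite operator."""
    x, r = [0.0] * len(b), list(b)
    p, rs = list(r), dot(r, r)
    for _ in range(iters):
        if rs < 1e-16:                      # already converged
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)             # equals r.p / p.Ap for CG directions
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

a = [1.0, 3.0]                              # toy loss: 0.5 * sum(a_j * phi_j^2)

def grad_train(phi):
    return [aj * pj for aj, pj in zip(a, phi)]

def hvp(phi, v, eps=1e-4):
    """Hessian-vector product via a finite difference of gradients."""
    g0 = grad_train(phi)
    g1 = grad_train([pj + eps * vj for pj, vj in zip(phi, v)])
    return [(x1 - x0) / eps for x1, x0 in zip(g1, g0)]

def imaml_term(phi, test_grad, lam, iters=5):
    """Solve (I + H / lam) g = test_grad without forming H."""
    def matvec(v):
        return [vj + hj / lam for vj, hj in zip(v, hvp(phi, v))]
    return cg_solve(matvec, test_grad, iters)

phi, lam, test_grad = [0.2, -0.4], 2.0, [3.0, 5.0]
g = imaml_term(phi, test_grad, lam)
print(g)  # analytically g_j = test_grad_j / (1 + a_j / lam)
```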
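  • 25. iMAML
 • inner loop ➡ adapt
 • The outer loop does not depend on the path taken by the inner loop ➡ the inner loop can be chosen freely
 ‣ MAML: 1st-order
 ‣ iMAML: can adapt with 2nd-order methods such as Hessian-Free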
  • 26. • - iMAML, inner loop - FOMAML (CG) - MAML (FOMAML ??), O(1)
  • 27. • Omniglot - inner loop: Hessian-Free iMAML - iMAML, way - FOMAML
  • 28. • Mini-ImageNet - Reptile (FOMAML) - ??
  • 30. 1. - iMAML 2. - ERNN
  • 31. RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?
  • 32. • Authors
 - Anil Kag, Ziming Zhang, Venkatesh Saligrama
 - affiliations include MERL
 • Rejected at NeurIPS 2019
 • RNN
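  • 33. Vanishing / exploding gradients in RNNs
 • RNN: $h_k = \phi(Uh_{k-1} + Wx_k + b)$
 - $\phi$: sigmoid, tanh
 • RNNs suffer from vanishing / exploding gradients, because the Jacobian across time steps is a product of per-step Jacobians:
 $$\frac{\partial h_m}{\partial h_n} = \prod_{m \ge k > n} \frac{\partial h_k}{\partial h_{k-1}} = \prod_{m \ge k > n} \nabla\phi(Uh_{k-1} + Wx_k + b)\,U$$
 - LSTM, GRU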
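The product of per-step Jacobians on slide 33 is easy to observe in a scalar RNN. The sketch below is a toy under stated assumptions: a one-dimensional hidden state, constant inputs, and a `linear` switch (not in the slides) that replaces tanh by the identity so that the exploding case can be shown as well.

```python
import math

def jacobian_product(u, w, b, xs, h0=0.0, linear=False):
    """Return d h_T / d h_0 for the scalar RNN h_k = phi(u*h_{k-1} + w*x_k + b)."""
    h, prod = h0, 1.0
    for x in xs:
        pre = u * h + w * x + b
        if linear:
            h, dphi = pre, 1.0              # identity activation
        else:
            h = math.tanh(pre)
            dphi = 1.0 - h * h              # tanh'(pre)
        prod *= dphi * u                    # one factor per time step
    return prod

xs = [0.1] * 50
vanishing = jacobian_product(u=0.5, w=1.0, b=0.0, xs=xs)
exploding = jacobian_product(u=1.5, w=1.0, b=0.0, xs=xs, linear=True)
print(vanishing, exploding)
```

With `u = 0.5` every factor has magnitude at most 0.5, so 50 steps shrink the gradient below $0.5^{50}$; with `u = 1.5` and a linear activation the product grows as $1.5^{50}$.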
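  • 34. RNN and ODE
 • An RNN with a skip connection is a discretization of an ordinary differential equation (ODE):
 $$\frac{dh(t)}{dt} \triangleq h'(t) = \phi(Uh(t) + Wx_k + b) \;\Longrightarrow\; h_k = h_{k-1} + \eta\phi(Uh_{k-1} + Wx_k + b)$$
 • Neural ODE
 - https://www.slideshare.net/DeepLearningJP2016/dlneural-ordinary-differential-equations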
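  • 35. ODE equilibrium
 • At an equilibrium point of the ODE $\frac{dh}{dt} = f(h, x)$, the state stops changing: $f(h, x) = 0 \;\cdots(1)$
 • (1) is an implicit relation between $h$ and $x$ ➡ if (1) holds at $(h_0, x_0)$ and $f_h(h_0, x_0)$ is nonsingular, then (1) can be written as $h = g(x)$ around $(h_0, x_0)$ (implicit function theorem)
 • ERNN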
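  • 36. ERNN
 • ERNN considers the ODE
 $$h'(t) = \phi(U(h(t) + h_{k-1}) + Wx_k + b) - \gamma(h(t) + h_{k-1})$$
 • The next hidden state $h_k$ is its equilibrium point $h'(t) = 0$, i.e. the $h$ satisfying
 $$f(h_{k-1}, h) = \phi(U(h + h_{k-1}) + Wx_k + b) - \gamma(h + h_{k-1}) = 0$$
 ➡ Since $f$ depends on $h$ and $h_{k-1}$ only through $h + h_{k-1}$, $\partial f/\partial h_{k-1} = \partial f/\partial h$, so whenever $\partial f/\partial h$ is nonsingular
 $$\frac{\partial h}{\partial h_{k-1}} = -\frac{\partial f/\partial h_{k-1}}{\partial f/\partial h} = -I$$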
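  • 37. When is $\partial f/\partial h$ nonsingular?
 • For $\gamma = 0$ (in general there is an additional $-\gamma I$ term),
 $$\frac{\partial f}{\partial h} = \nabla\phi(U(h + h_{k-1}) + Wx_k + b)\,U$$
 is nonsingular when
 1. the derivative of $\phi$ is nonzero (sigmoid and tanh are OK)
 2. $U$ is nonsingular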
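  • 38. • The equilibrium point $h_k$ is computed by iterating from $h_k^{(0)} = 0$:
 $$h_k^{(i+1)} = h_k^{(i)} + \eta_k^{(i)}\left[\phi\left(U(h_k^{(i)} + h_{k-1}) + Wx_k + b\right) - \gamma(h_k^{(i)} + h_{k-1})\right]$$
 • 5 iterations
 - step sizes $\eta_k^{(i)}$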
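The iteration on slide 38 and the identity $\partial h_k / \partial h_{k-1} = -I$ can both be checked in a scalar toy version. The sketch below makes several assumptions that are not in the slides: a one-dimensional state with $\phi = \tanh$, a fixed step size `eta` instead of per-iteration step sizes, and 200 iterations instead of 5 so that the fixed point is reached to machine precision.

```python
import math

def residual(h, h_prev, x, u, w, b, gamma):
    """f(h_prev, h) = phi(u*(h + h_prev) + w*x + b) - gamma*(h + h_prev)."""
    s = h + h_prev
    return math.tanh(u * s + w * x + b) - gamma * s

def ernn_step(h_prev, x, u, w, b, gamma, eta=0.5, iters=200):
    """Find the equilibrium h_k by iterating h <- h + eta * f(h_prev, h) from h = 0."""
    h = 0.0
    for _ in range(iters):
        h = h + eta * residual(h, h_prev, x, u, w, b, gamma)
    return h

params = dict(x=1.0, u=0.5, w=0.3, b=0.1, gamma=1.0)
h_prev = 0.2
h_k = ernn_step(h_prev, **params)

# Finite-difference estimate of d h_k / d h_prev; the implicit formula gives -1
delta = 1e-3
slope = (ernn_step(h_prev + delta, **params) - h_k) / delta

print(residual(h_k, h_prev, **params), slope)
```

The slope is $-1$ regardless of the weights, which is the scalar counterpart of the per-step Jacobian having unit magnitude.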
  • 39. [Figure: $\partial h_T / \partial h_1$ on HAR-2 for RNN and ERNN (log scale)]
 • RNN
 • ERNN: 1